What is Metadata Management?

By adding metadata such as project numbers or client names to documents, you can securely store and retrieve all documents and emails in Teams, SharePoint, and OneDrive. This is achieved by linking Docubird to Microsoft CRM, Exact, or other databases that contain project, client, or supplier information. Docubird then prompts users, as either a mandatory or an optional step, to add the correct metadata from the linked databases to their documents. Because the values come from the linked source rather than from free text, this prevents metadata pollution and simplifies retrieval and security.
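As a rough illustration of how values taken from a linked database can keep metadata clean, the Python sketch below validates a document's metadata against sets of known values. The field names, the in-memory value sets, and the validate_metadata helper are hypothetical and do not represent Docubird's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataField:
    name: str
    allowed_values: frozenset  # values pulled from the linked database
    mandatory: bool = False


def validate_metadata(metadata: dict, fields: list) -> list:
    """Return a list of problems; an empty list means the metadata is clean."""
    problems = []
    for field in fields:
        value = metadata.get(field.name)
        if value is None:
            # A missing value is only a problem for mandatory fields.
            if field.mandatory:
                problems.append(f"missing mandatory field: {field.name}")
            continue
        if value not in field.allowed_values:
            problems.append(f"unknown {field.name}: {value}")
    return problems


# Hypothetical values, as they might arrive from a CRM or ERP link.
FIELDS = [
    MetadataField("project_number", frozenset({"P-1001", "P-1002"}), mandatory=True),
    MetadataField("client_name", frozenset({"Contoso", "Fabrikam"})),
]

print(validate_metadata({"project_number": "P-1001", "client_name": "Contoso"}, FIELDS))
print(validate_metadata({"client_name": "Contso"}, FIELDS))
```

The second call reports both the missing project number and the misspelled client name, which is the kind of pollution that a lookup against the source system is meant to catch.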

The main areas for metadata are:

  • Data Retention
  • Records Management
  • Data Loss Prevention (DLP)
  • eDiscovery
  • PII and Privacy Classification

What is Data Retention?

A data retention policy is a set of rules that describes which data is stored and for how long. It is part of data governance, which covers all aspects of data management, including, for example, access rights to the data.

Essentially, it concerns the following points:

  • Which data (documents and email)?
  • Who manages the data (documents and email)?
  • What is the retention period of documents and email?
  • Are there any legal requirements?
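The four questions above map naturally onto a small rule record. The following Python sketch is one way to model such a rule and check whether a document has passed its retention period. The RetentionRule fields, the example category, and the seven-year period are assumptions chosen for illustration, not legal guidance or Docubird's data model.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class RetentionRule:
    category: str                      # which data the rule covers
    owner: str                         # who manages the data
    retention_years: int               # how long the data is kept
    legal_basis: Optional[str] = None  # legal requirement, if any


def expiry_date(created: date, rule: RetentionRule) -> date:
    """Date on which the retention period ends."""
    target_year = created.year + rule.retention_years
    try:
        return created.replace(year=target_year)
    except ValueError:
        # 29 February with a non-leap target year: fall back to 28 February.
        return created.replace(year=target_year, day=28)


def is_expired(created: date, rule: RetentionRule, today: date) -> bool:
    """True once the retention period has ended."""
    return today >= expiry_date(created, rule)


# Hypothetical rule: invoices kept for seven years, managed by Finance.
invoice_rule = RetentionRule("invoice", "Finance", 7, "hypothetical legal requirement")

print(expiry_date(date(2015, 3, 1), invoice_rule))
print(is_expired(date(2015, 3, 1), invoice_rule, today=date(2021, 12, 31)))
```

Passing today as a parameter instead of reading the system clock keeps the check easy to test and to rerun for audits.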

What is Records Management?

Records Management encompasses all documents (and possibly multimedia files) that are archived within a company. In an IT context, the term usually refers to digital archiving in a document management system (DMS) such as SharePoint, Teams, or OneDrive.

What is Data Loss Prevention (DLP)?

Data loss prevention is the practice of detecting and preventing data breaches, exfiltration (leaks), or unwanted destruction of sensitive data. Organizations use DLP to protect and secure their data and to comply with regulations. 

What is eDiscovery?

eDiscovery is the process, triggered for example by a request from a regulator, a request for inspection, or a legal dispute, in which enormous amounts of information must be searched, structured, and presented within a short period. This must be done in a responsible and transparent manner so that all parties involved can use the results of the investigation.

Detecting Personally Identifiable Information (PII) with Azure AI via Docubird

To comply with the GDPR, organizations are required to protect sensitive information and prevent it from being unintentionally disclosed. Docubird uses Azure Cognitive Search to detect this sensitive information and apply the correct classification.

What is Personally Identifiable Information?

Personally Identifiable Information (PII) is any data that can be used to identify a person, such as names, driver's license numbers, social security numbers (SSNs), bank account numbers, passport numbers, and email addresses. Regulations such as the GDPR require strict protection of user privacy.

Detecting PII with Azure Cognitive Search

Docubird uses Azure Cognitive Search (since renamed Azure AI Search) to detect PII. This cloud service provides Docubird developers with APIs and tools for adding a comprehensive search experience to their data, content, and applications. Cognitive skills can be attached to the indexing pipeline so that AI processing runs while documents are indexed. This enriches the index with new information and structure that is useful for search and other scenarios.

The Azure PII Detection skill detects Personally Identifiable Information in a document and allows you to classify it in various ways. This skill uses the machine learning models provided by Text Analytics in Cognitive Services.
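To make the skill more concrete, the sketch below builds a minimal skillset definition containing one PII Detection skill, expressed as a Python dictionary that serializes to the JSON shape the search service expects. The skillset name, the source path /document/content, the target name, and the 0.5 precision threshold are assumptions for illustration; how Docubird actually configures the skill is not described here.

```python
import json

# One PII Detection skill. The keys follow the published skill definition;
# the concrete values are illustrative assumptions.
pii_skill = {
    "@odata.type": "#Microsoft.Skills.Text.PIIDetectionSkill",
    "context": "/document",
    "defaultLanguageCode": "en",
    "minimumPrecision": 0.5,   # entities scoring below this are not returned
    "maskingMode": "none",     # detect only; do not mask the text
    "inputs": [
        {"name": "text", "source": "/document/content"},
    ],
    "outputs": [
        {"name": "piiEntities", "targetName": "pii_entities"},
    ],
}

# Hypothetical skillset wrapper that an indexer could reference by name.
skillset = {
    "name": "pii-detection-skillset",
    "description": "Detects PII during indexing (illustrative example)",
    "skills": [pii_skill],
}

print(json.dumps(skillset, indent=2))
```

A production skillset typically also references a billable AI services resource and is attached to an indexer; both are left out to keep the example short.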

When a document in SharePoint, Teams, or OneDrive is requested via Docubird, Azure Cognitive Search scans it for PII. Based on the information found, the document is categorized according to the following classification scheme:

  • Highly confidential: Share the most critical data only with named recipients.
  • Confidential: Limited distribution, on a ‘need-to-know’ basis.
  • General: Daily work, internal sharing throughout the organization.
  • Public: Unrestricted; can be shared with the outside world.

The classifications ‘Highly confidential’ and ‘Confidential’ can be automatically added to the documents. If PII Detection does not provide a conclusive result, a Privacy Officer can classify the document. This can be automated via Microsoft Power Automate.
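The decision described above can be sketched as a small function that turns detected PII entities into either an automatic label or a request for manual review. Everything specific here is an assumption: the entity type labels are placeholders rather than the names the Azure service returns, and the 0.8 confidence threshold and the set of high-risk types are invented for the example.

```python
# Placeholder type labels; the detection service may use different names.
HIGH_RISK_TYPES = {"national_id", "bank_account", "passport_number"}
CONFIDENCE_THRESHOLD = 0.8  # hypothetical cut-off for a conclusive result


def classify(entities: list) -> tuple:
    """Return (label, needs_review) for a list of detected PII entities.

    Each entity is a dict with a "type" and a confidence "score".
    label is None when no automatic classification is applied.
    """
    confident = [e for e in entities if e["score"] >= CONFIDENCE_THRESHOLD]
    if any(e["type"] in HIGH_RISK_TYPES for e in confident):
        return ("Highly confidential", False)
    if confident:
        return ("Confidential", False)
    if entities:
        # PII was found, but only with low confidence: inconclusive,
        # so hand the document to a Privacy Officer.
        return (None, True)
    # No PII found: no automatic label is applied.
    return (None, False)


print(classify([{"type": "passport_number", "score": 0.95}]))
print(classify([{"type": "email", "score": 0.90}]))
print(classify([{"type": "email", "score": 0.40}]))
```

Returning a separate needs_review flag keeps inconclusive results distinct from documents with no PII at all, so only the former would be routed to a Privacy Officer.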

Want to know more? Schedule a demo to discuss the possibilities for your company.