Data Classification

One of the key ways organizations can stay in compliance with data privacy laws—thus gaining a competitive edge—is to properly classify data.

Digital Guardian defines data classification as the process of automatically organizing data collected or generated by your organization into set categories, or “tags,” so that the data can be accessed, used, tracked, and managed more efficiently. According to Imperva, data classification tags should be based on the data’s type, its sensitivity, and its value to the organization if it is disclosed, changed, or deleted. Proper tagging allows organizations to mitigate risks and maintain compliance with data security requirements.

As TechTarget points out, these tags are especially beneficial when an organization suffers a cyberattack or malware infection that compromises its information systems, or when employees attempt to improperly disclose sensitive consumer data.

Imperva notes that data can be classified by identifying its “sensitivity level” as one of the following:

  • High Sensitivity: Data which, if disclosed, changed, or deleted, “would have a catastrophic impact on the organization or individuals.” Examples include financial records, intellectual property, and authentication data.
  • Medium Sensitivity: Data which is intended only for “internal use,” but which, if disclosed, changed, or deleted, would not have the catastrophic impact associated with high sensitivity data. Examples include emails and documents containing no confidential data.
  • Low Sensitivity: Data which is intended for public use. Examples include public website content and press releases.
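
To make the three levels concrete, here is a minimal Python sketch of how they might be represented so that tools can compare them. The `Sensitivity` enum, the `EXAMPLE_LEVELS` mapping, and the `strictest` helper are hypothetical names chosen for illustration and do not come from Imperva; the example assignments simply mirror the examples in the list above.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    """Ordered sensitivity levels; a higher value means more sensitive."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical assignments mirroring the examples in the list above.
EXAMPLE_LEVELS = {
    "press_release": Sensitivity.LOW,
    "internal_email": Sensitivity.MEDIUM,
    "financial_record": Sensitivity.HIGH,
}


def strictest(levels):
    """Return the highest level present, e.g. for a folder of mixed files."""
    return max(levels, default=Sensitivity.LOW)
```

Treating the levels as ordered values is one possible design choice: it lets a collection of files inherit the level of its most sensitive member.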

In performing data classification, your organization can utilize the following data classification methodologies:

  • Content-Based Classification: Your organization would need to inspect the contents of files and documents for sensitive information and classify them according to their sensitivity level.
  • Context-Based Classification: Your organization would need to review the metadata of data files, such as the software that was used to generate the data, the data creator, and the location of the files when they were originally drafted or modified, to determine whether a file is inherently sensitive. A file received from your organization’s finance or legal department is one example.
  • User-Based Classification: Your organization would need to empower the creator of a document to classify the sensitivity of the data it contains before distributing the document to others.
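
The three methodologies above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Document` class, the regular expressions, the `SENSITIVE_DEPARTMENTS` set, and the order in which the checks run are all hypothetical, and a real policy would define its own rules.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

# Illustrative patterns only; a real policy would define its own.
HIGH_SENSITIVITY_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),           # looks like a US Social Security number
    re.compile(r"password\s*[:=]", re.IGNORECASE),  # looks like a stored credential
]

# Assumption: files from these departments are treated as inherently sensitive.
SENSITIVE_DEPARTMENTS = {"finance", "legal"}


@dataclass
class Document:
    text: str                         # content of the file
    department: str                   # context: where the file came from
    user_label: Optional[str] = None  # label chosen by the creator, if any


def classify(doc: Document) -> Tuple[str, str]:
    """Return (sensitivity label, method that produced it)."""
    # User-based: trust the creator's own label when one was given.
    if doc.user_label is not None:
        return doc.user_label, "user"
    # Content-based: inspect the text itself for sensitive patterns.
    if any(p.search(doc.text) for p in HIGH_SENSITIVITY_PATTERNS):
        return "high", "content"
    # Context-based: fall back to metadata about the file's origin.
    if doc.department.lower() in SENSITIVE_DEPARTMENTS:
        return "high", "context"
    # Nothing matched, so flag the document for a human to review.
    return "needs-review", "none"
```

Letting the creator's label take priority is a design choice made for this sketch; an organization could just as reasonably let a content match override a user label.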

Additionally, your organization can further classify data with the following tags:

  • Data States: Data is either at rest, in processing, or in transit.
  • Data Format: Data can be structured—human-readable and able to be categorized—or unstructured. Unstructured data includes coded data and binaries.
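
As a rough sketch, these additional tags could be attached to a file record as follows. The class names, the label strings, and the extension-based `guess_format` heuristic are assumptions made for illustration; a file extension is only a weak hint about whether data is structured.

```python
from dataclasses import dataclass
from enum import Enum
from pathlib import PurePath


class DataState(Enum):
    AT_REST = "at rest"
    IN_PROCESSING = "in processing"
    IN_TRANSIT = "in transit"


class DataFormat(Enum):
    STRUCTURED = "structured"
    UNSTRUCTURED = "unstructured"


# Assumption: a small set of extensions treated as structured data.
STRUCTURED_EXTENSIONS = {".csv", ".xlsx", ".db"}


def guess_format(filename: str) -> DataFormat:
    """Guess the format tag from the file extension (a rough heuristic)."""
    suffix = PurePath(filename).suffix.lower()
    if suffix in STRUCTURED_EXTENSIONS:
        return DataFormat.STRUCTURED
    return DataFormat.UNSTRUCTURED


@dataclass(frozen=True)
class DataTags:
    state: DataState
    fmt: DataFormat

    def labels(self):
        """Render the tags as plain strings for storage alongside a file."""
        return [f"state:{self.state.value}", f"format:{self.fmt.value}"]
```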

Imperva points out that before your organization can classify data, it must complete an “accurate and comprehensive data discovery” process. To assist in this cumbersome process, your organization can rely on automated tools to sort through the data. However, before an automated process can be utilized, your organization must create a comprehensive policy governing the categorization process.

Your organization’s policy should identify the following:

  • What data is currently being generated or circulated throughout your organization—and who owns or utilizes that data.
  • Whether the person who owns or utilizes the data is the most knowledgeable about the content and the context of the information.
  • Whether the person who owns or utilizes the data the most is the person responsible for the integrity and accuracy of the data.
  • Where data is stored.
  • Whether the data generated or circulated throughout an organization is subject to any regulations or compliance requirements.
    • If yes, what are the consequences of violating these regulations or requirements?

Your policy should also:

  • Define the person responsible for classifying data and for the integrity and accuracy of that classification;
  • Ensure compliance with relevant regulatory and industry-established mandates;
  • Identify the frequency with which data classification should take place; and
  • Specify what data types should be included in data classification.
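
One way to make such a policy usable by automated tools is to record it as structured data and check it for gaps. The sketch below is hypothetical: the `ClassificationPolicy` fields and the checks in `problems` are assumptions that loosely follow the list above, not a standard schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClassificationPolicy:
    responsible_person: str         # who answers for classification accuracy
    regulations: List[str]          # mandates the data is subject to, if any
    review_interval_days: int       # how often classification is repeated
    included_data_types: List[str]  # which data types are in scope

    def problems(self) -> List[str]:
        """List the gaps that should be fixed before automation is used."""
        issues = []
        if not self.responsible_person.strip():
            issues.append("no responsible person defined")
        if self.review_interval_days <= 0:
            issues.append("no classification frequency defined")
        if not self.included_data_types:
            issues.append("no data types in scope")
        # An empty regulations list is allowed: some data has no mandates.
        return issues
```

A policy whose `problems()` list is empty has at least named an owner, a review frequency, and a scope, which is the minimum the list above asks for.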
