With the new General Data Protection Regulation (GDPR), effective May 2018, approaching, companies based in Europe or holding personal data of individuals residing in Europe are struggling to locate the most valuable assets in their organization: their sensitive data.
The new regulation requires organizations to prevent any breach of personally identifiable information (PII) and to delete a person’s data if that person requests it. After removing all of that PII, companies will need to prove to that person and to the authorities that it has been completely removed.
Most companies today understand their obligation to demonstrate responsibility and compliance and have therefore started preparing for the new regulation.
There is so much information available about ways to protect sensitive data that it is easy to feel overwhelmed and to start moving in several directions at once, hoping that one of them hits the target. If you plan your data governance ahead of time, you can still meet the deadline and avoid penalties.
Some organizations, mostly banks, insurance companies, and manufacturers, hold huge amounts of data: they produce, change, save, and share files at a rapid rate, creating terabytes and even petabytes of data. The difficulty for this type of company is finding its sensitive data among millions of files, in both structured and unstructured data, which in most cases is unfortunately an impossible mission to accomplish by hand.
The following personally identifiable data is classified as PII according to the definition used by the National Institute of Standards and Technology (NIST):
o Full name
o Email address
o National identification number
o Passport number
o IP address (when linked, but not PII itself in the US)
o Vehicle registration number
o Driver’s license number
o Face, fingerprints or handwriting
o Credit card numbers
o Digital identity
o Date of birth
o Place of birth
o Genetic information
o Phone number
o Login name, screen name, nickname or identifier
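A few of the entity types listed above have a regular enough shape to be found with pattern matching. The following is a minimal sketch, assuming plain text input and covering only email addresses, IPv4 addresses, and credit card numbers; the function names and patterns are illustrative, not a complete or authoritative PII detector. Names, faces, handwriting, and genetic information need very different techniques.

```python
import re

# Illustrative patterns only; real PII detection needs far broader coverage.
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 13:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_pii(text: str) -> dict[str, list[str]]:
    """Return candidate PII values found in `text`, grouped by entity type."""
    found = {"email": EMAIL_RE.findall(text), "ip_address": [], "credit_card": []}
    for candidate in IPV4_RE.findall(text):
        # Keep only dotted quads whose octets are in the valid 0-255 range.
        if all(int(part) <= 255 for part in candidate.split(".")):
            found["ip_address"].append(candidate)
    for candidate in CARD_RE.findall(text):
        # The checksum filters out most digit runs that are not card numbers.
        if luhn_valid(candidate):
            found["credit_card"].append(re.sub(r"[ -]", "", candidate))
    return found
```

Calling `find_pii` on the text of each extracted document gives a per-file list of candidates that later steps can categorize. The Luhn check is a design choice to reduce false positives from order numbers and similar digit runs.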
Most organizations that hold PII of European citizens are required to detect and protect against any PII data breach and to remove PII from company data on request (often referred to as the right to be forgotten). Regulation (EU) 2016/679 of the European Parliament and of the Council of April 27, 2016, published in the Official Journal of the European Union, states:
“The supervisory authorities should supervise the application of the provisions of this Regulation and contribute to its consistent application throughout the Union, in order to protect natural persons in relation to the processing of their personal data and facilitate the free movement of personal data within the internal market”.
To facilitate a free flow of PII within the European market, companies holding PII of European citizens must be able to identify their data and categorize it according to the sensitivity levels defined in their organizational policy.
The Regulation describes the flow of data and the challenges facing the markets as follows:
“Rapid technological advances and globalization have posed new challenges for the protection of personal data. The scale of the collection and sharing of personal data has increased significantly. Technology allows both private companies and public authorities to make use of personal data on an unprecedented scale in order to carry out their activities. Individuals make more and more personal information available to the public and around the world. Technology has transformed both the economy and social life, and should further facilitate the free movement of personal data within the Union and the transfer to third countries and international organisations, while ensuring a high level of personal data protection.”
Phase 1 – Data Discovery
Therefore, the first step is to create a data lineage that shows where PII data ends up throughout your organization and helps decision makers spot specific types of data. The EU recommends using automated technology that can scan large amounts of data automatically. No matter how big your team is, this is not a project that can be handled manually when you are faced with millions of files of different types, hidden in various places: in the cloud, on storage systems, and on local desktops.
The main concern for these types of organizations is that if they are not able to prevent data breaches, they will not comply with the new EU GDPR regulation and may face heavy penalties.
They must designate specific employees who will be responsible for the entire process, such as a Data Protection Officer (DPO) who mainly handles technology solutions, a Chief Information Governance Officer (CIGO), usually a lawyer who is responsible for compliance, and/or a Compliance Risk Officer (CRO). Whoever is designated must be able to control the entire process from start to finish and to provide full transparency to management and authorities.
“The data controller must pay particular attention to the nature of the personal data, the purpose and duration of the proposed processing operation(s), as well as the situation in the country of origin, the third country and the country of final destination, and must provide adequate safeguards to protect the fundamental rights and freedoms of natural persons with respect to the processing of their personal data”.
PII data can be found in all types of files: not only PDFs and text documents, but also image documents (for example, a scanned check), a CAD/CAM file that may contain the intellectual property (IP) of a product, a confidential sketch, a code or binary file, etc. Today’s common technologies can extract data from text-based files, making data hidden in text easy to find, but in some organizations, such as manufacturers, the most sensitive data may be held in image files and other non-text formats. Common tools cannot detect PII in these types of files accurately, and without technology capable of detecting PII data in file formats other than text, it is easy for this important information to be lost and to cause substantial harm to the organization.
Phase 2 – Data Categorization
This stage consists of behind-the-scenes data mining actions, created by an automated system. The DPO/controller or information security decision maker must decide whether to track certain data, block the data, or send alerts of a data breach. To perform these actions, you need to view your data in separate categories.
Categorizing structured and unstructured data requires complete data identification while maintaining scalability, effectively scanning all databases without “boiling the ocean.”
The DPO must also maintain data visibility across multiple sources and be able to quickly present all files related to a given person according to specific entities such as name, date of birth, credit card number, social security number, telephone number, email address, etc.
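One way to picture this per-person lookup is an inverted index from detected entity values to the files that contain them. The sketch below assumes a scanner has already produced, for each file, a list of `(entity_type, value)` pairs; the function names and the sample data shape are hypothetical.

```python
from collections import defaultdict


def build_entity_index(scan_results):
    """Map each (entity_type, normalized value) pair to the files containing it.

    `scan_results` is assumed to be {file_path: [(entity_type, value), ...]}.
    """
    index = defaultdict(set)
    for path, entities in scan_results.items():
        for entity_type, value in entities:
            # Normalize so that casing and stray whitespace do not split matches.
            index[(entity_type, value.strip().lower())].add(path)
    return index


def files_for_person(index, identifiers):
    """Return every file that mentions any of the person's known identifiers."""
    files = set()
    for entity_type, value in identifiers:
        files |= index.get((entity_type, value.strip().lower()), set())
    return sorted(files)
```

Given a person's known identifiers, `files_for_person` returns the list of files a DPO would need to review for an access or erasure request. Exact matching after normalization is a simplification; real data usually needs fuzzy matching on names.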
In the event of a data breach, the DPO will report directly to the highest level of management of the controller or processor, or to the information security officer who will be responsible for reporting this breach to the relevant authorities.
Article 33 of the EU GDPR requires the controller to report the breach to the supervisory authority without undue delay and, where feasible, within 72 hours of becoming aware of it.
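The 72-hour window is simple date arithmetic, but it is worth computing explicitly so that an incident record always carries its deadline. A minimal sketch, assuming the clock starts when the organization becomes aware of the breach and that timestamps are timezone-aware; the function names are illustrative.

```python
from datetime import datetime, timedelta, timezone

NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(aware_at: datetime) -> datetime:
    """Return the latest notification time for a breach noticed at `aware_at`."""
    if aware_at.tzinfo is None:
        # Naive timestamps are ambiguous across offices in different time zones.
        raise ValueError("aware_at must be timezone-aware")
    return aware_at + NOTIFICATION_WINDOW


def hours_remaining(aware_at: datetime, now: datetime) -> float:
    """Return the hours left before the deadline (negative once it has passed)."""
    return (notification_deadline(aware_at) - now).total_seconds() / 3600


noticed = datetime(2018, 5, 28, 9, 0, tzinfo=timezone.utc)
print(notification_deadline(noticed).isoformat())
```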
Once the DPO identifies the data, the next step should be to tag/label the files according to the sensitivity level defined by the organization.
As part of regulatory compliance, the organization’s files must be accurately labeled so that these files can be tracked on-premises and even when shared outside the organization.
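The labeling step can be sketched as a lookup from detected entity types to a sensitivity level, with each file taking the highest level among the entities it contains. The levels and the policy table below are hypothetical placeholders; each organization defines its own.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


# Hypothetical policy: the sensitivity each entity type implies on its own.
POLICY = {
    "email": Sensitivity.INTERNAL,
    "phone_number": Sensitivity.INTERNAL,
    "date_of_birth": Sensitivity.CONFIDENTIAL,
    "passport_number": Sensitivity.RESTRICTED,
    "credit_card": Sensitivity.RESTRICTED,
    "genetic_information": Sensitivity.RESTRICTED,
}

# Assumption: entity types missing from the policy are treated conservatively.
DEFAULT_LEVEL = Sensitivity.CONFIDENTIAL


def label_file(entity_types):
    """Return the highest sensitivity implied by the entity types in a file."""
    levels = (POLICY.get(entity_type, DEFAULT_LEVEL) for entity_type in entity_types)
    # A file with no detected entities falls back to PUBLIC.
    return max(levels, default=Sensitivity.PUBLIC)
```

Taking the maximum means a single credit card number is enough to make an otherwise harmless file `RESTRICTED`, which matches the idea that the label should follow the most sensitive content.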
Phase 3 – Knowledge
Once data is tagged, personal information can be mapped across networks and systems, both structured and unstructured, and easily traced, allowing organizations to protect their sensitive data and let their end users use and share files securely, thereby improving data loss prevention.
Another aspect to consider is protecting sensitive information from insider threats: employees trying to steal sensitive data, such as credit cards, contact lists, etc. or manipulate the data to obtain some benefit. These types of actions are difficult to detect in time without automatic monitoring.
These time-consuming tasks apply to most organizations, prompting them to look for efficient ways to gain insights from their business data on which to base their decisions.
The ability to analyze intrinsic data patterns helps the organization gain better insight into its business data and pinpoint specific threats.
The integration of an encryption technology allows the controller to effectively track and monitor data. By implementing an internal physical segregation system, the controller can create a geo-fence around data using personal data segregation definitions across domains and geographies, and can report sharing violations once a rule is broken. Using this combination of technologies, the controller can enable employees to securely send messages across the organization, between the right departments, and outside the organization without being overly restricted.
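The reporting side of such a geo-fence can be sketched as a rule check over sharing events. The example below assumes each event already carries the file's sensitivity label, the region the data is bound to, and the recipient's region; the policy table, label names, and field names are all hypothetical, and encryption and physical segregation are outside its scope.

```python
from dataclasses import dataclass

# Hypothetical policy: recipient regions each data region may flow to.
ALLOWED_FLOWS = {
    "EU": {"EU"},
    "US": {"US", "EU"},
}
# Only files carrying these labels are fenced in this sketch.
FENCED_LABELS = {"CONFIDENTIAL", "RESTRICTED"}


@dataclass(frozen=True)
class ShareEvent:
    file_path: str
    label: str             # sensitivity tag applied during categorization
    data_region: str       # region the data is bound to
    recipient_region: str  # region of the recipient


def find_violations(events):
    """Return the share events that break the geo-fence policy."""
    flagged = []
    for event in events:
        if event.label not in FENCED_LABELS:
            continue
        # Regions missing from the policy allow no flows at all.
        allowed = ALLOWED_FLOWS.get(event.data_region, set())
        if event.recipient_region not in allowed:
            flagged.append(event)
    return flagged
```

Defaulting unknown regions to an empty allow-list is a deliberate choice: a missing rule blocks sharing rather than silently permitting it.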
Phase 4 – Artificial Intelligence (AI)
After data is scanned, tagged, and tracked, a further value to the organization is the ability to automatically detect atypical behavior around sensitive data and activate protective measures to prevent these events from becoming a data breach incident. This advanced technology is known as “Artificial Intelligence” (AI). Here, the AI function is generally understood as having a strong pattern recognition component and a learning mechanism that enable the machine to make these decisions, or at least to recommend the preferred course of action to the data protection officer. This intelligence is measured by its ability to gain insight from every scan, user input, or change in data mapping. Eventually, the AI function creates the digital footprint of the organization, which becomes the essential layer between raw data and business flows around data protection, compliance, and data management.
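As a much simpler stand-in for the pattern recognition described here, atypical behavior can be approximated with a statistical baseline: compare today's count of sensitive-file accesses for a user against that user's own history. This is a plain z-score heuristic, not a learning system, and the function name, threshold, and input shape are assumptions for illustration.

```python
from statistics import mean, pstdev


def is_atypical(history, today, threshold=3.0):
    """Flag `today` if it is far above the user's historical daily access counts.

    `history` is assumed to be a list of past daily counts of sensitive-file
    accesses for one user; `threshold` is the number of standard deviations
    above the mean that counts as atypical.
    """
    if len(history) < 2:
        return False  # not enough data to form a baseline
    baseline = mean(history)
    spread = pstdev(history)
    if spread == 0:
        # A perfectly flat history: any increase is unusual.
        return today > baseline
    return (today - baseline) / spread > threshold
```

A flagged result would feed the protective measures mentioned above, for example an alert to the data protection officer. The check is one-sided on purpose, since a sudden spike in access is the pattern of interest for breach prevention.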