Big Data Needs BigPrivacy: How to Legally Process, Use and Share Data Around the Globe to Avoid the Facebook Backlash
The Facebook-Cambridge Analytica data scandal, which involves the alleged harvesting and misuse of personal data, is a watershed moment. It leaves many organizations wondering whether they can continue to collect, use and share personally identifying data for academic research and for commercial data analysis.
The answer is “Yes.” Organizations can collect personally identifying data and use protected versions of that data for research and analysis if they embrace principles underlying the new EU General Data Protection Regulation (GDPR) - the most advanced regulation to protect data by default to avoid these types of disasters. Whether or not directly subject to the GDPR, organizations must now use the state of the art in data protection to retain and re-establish confidence with customers, partners, regulators and the general public to avoid a Facebook-like backlash.
Facebook gave Cambridge Analytica access to personally identifying data to perform analysis. This is where Facebook went wrong: the processing that Facebook authorized Cambridge Analytica to perform did not require personally identifying data. Had Facebook provided Cambridge Analytica only with non-identifying cohorts, groups or classes of Facebook-related data, the authorized processing could have been completed without access to identifying data.
The GDPR recognizes the value of the correlations and relationships that data analysis may reveal, but not at the expense of fundamental privacy rights. To balance data-driven innovation against the privacy rights of individuals, the GDPR introduces two new concepts: “Pseudonymisation” and “Data Protection by Design and by Default.”
- Pseudonymisation - as newly defined under the GDPR, “Pseudonymisation” requires separating the information value of data from the means of determining the identity of individuals. Pseudonymising data in compliance with GDPR requirements enables probabilistic analysis - generally the goal of data analytics - without having to reveal personally identifying data. The results of probabilistic analysis can still be re-linked to identifying data but only under controlled conditions. See Pseudonymisation FAQ for more information on GDPR compliant Pseudonymisation.
- Data Protection by Design and by Default - as newly defined under the GDPR, “Data Protection by Design and by Default” requires that when personal data is processed, only the data necessary for a specific authorized use, for a specific purpose, at a specific time be provided. Often, identifying data is not required for analysis - all that is needed is information describing a cohort, group or class of people in sufficiently granular detail to communicate the desired information value in a non-identifying manner. This can be done by Pseudonymising data and grouping it into cohorts, groups or classes. Under the GDPR, Data Protection by Design and by Default requires that privacy-respectful design solutions be embedded into operations. This reverses the current situation, in which data is identifiable and vulnerable by default and actions are required to protect it, to one in which data is protected by default and actions are required to make use of it. As a result, data provided for processing by a specific party, at a specific time, for a specific purpose should include no more identifying data than is minimally required for the authorized use. If a different dataset is required for a different authorized purpose, different actions should generate just the data required for that new purpose.
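The two concepts above can be illustrated with a minimal Python sketch. It is a simplified illustration only, not a description of Anonos' BigPrivacy technology and not a statement of what the GDPR requires in any particular case. The function names (`pseudonymise`, `cohort_counts`), the field names and the sample records are all hypothetical. Direct identifiers are replaced with keyed tokens, the re-linking table is returned separately so that it can be held under controlled conditions, and the analyst receives only cohort-level counts, with small cohorts suppressed.

```python
import hashlib
import hmac
import secrets
from collections import Counter


def pseudonymise(records, key, id_field="user_id"):
    """Replace the direct identifier in each record with a keyed token.

    Returns (protected_records, lookup). The lookup maps token -> identifier
    and is meant to be stored separately, under access control, so that
    re-linking is only possible under controlled conditions.
    """
    protected, lookup = [], {}
    for record in records:
        identifier = str(record[id_field])
        # HMAC-SHA256 keyed token; without the key the token cannot be recomputed.
        token = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()[:16]
        lookup[token] = identifier
        row = {k: v for k, v in record.items() if k != id_field}
        row["token"] = token
        protected.append(row)
    return protected, lookup


def cohort_counts(protected, fields, min_size=2):
    """Count records per cohort and drop cohorts smaller than min_size."""
    counts = Counter(tuple(row[f] for f in fields) for row in protected)
    return {cohort: n for cohort, n in counts.items() if n >= min_size}


# Hypothetical sample data, for illustration only.
records = [
    {"user_id": "alice@example.com", "age_band": "25-34", "region": "EU"},
    {"user_id": "bob@example.com", "age_band": "25-34", "region": "EU"},
    {"user_id": "carol@example.com", "age_band": "35-44", "region": "US"},
]

key = secrets.token_bytes(32)  # kept by the data controller, never shared
protected, lookup = pseudonymise(records, key)

# Only cohort-level data is released for this specific purpose.
cohorts = cohort_counts(protected, ["age_band", "region"])
# cohorts == {("25-34", "EU"): 2}
```

In this sketch the attributes left in the protected records (age band, region) could still act as indirect identifiers in a real dataset, which is why only the aggregated cohorts, and not the protected rows themselves, are treated as the output handed to the analyst. A different authorized purpose would call `cohort_counts` with different fields rather than reuse a broader extract.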
The Facebook data scandal highlights the fact that data uses are getting bigger and so are privacy concerns. Organizations must adopt the new state-of-the-art data protection measures introduced under the GDPR. Pseudonymisation and Data Protection by Design and by Default represent the state of the art for balancing the goals of data-driven innovation and preservation of privacy. For information on how Anonos’ patented BigPrivacy technology uniquely enforces GDPR state-of-the-art data protection principles to enable compliant collection, use and sharing of personally identifying data for research and analytics, see the recent blog article: 5 Steps To Enable Compliant Analytics.
The European Commission Directorate‑General for Communications Networks, Content and Technology (DG CONNECT) is the department responsible for both data innovation and data protection. A DG CONNECT official summarised a March 23, 2018 meeting with Anonos as follows:
I had previously seen three avenues to doing data-driven innovation with personal data in a privacy compliant manner. First, you may secure the consent of a person. Second, you may find a use case where anonymized/generalized data are "just good enough," and you can prove that your method of anonymization or generalization works to prevent re-linking of data to individuals. Third, you may hope that emerging technologies for privacy-preserving analytics like multi-party computing or homomorphic encryption could help you achieve your goal. But, we have learned today that there is a new way - Anonos has introduced us to a distinctly different fourth approach, which is BigPrivacy dynamic pseudonymisation technology.