In the light of the General Data Protection Regulation (GDPR), the proper application of pseudonymisation to personal data is gradually becoming a highly debated topic in many different communities, ranging from research and academia to justice and law enforcement and to compliance management in several organisations in Europe. Building on ENISA's previous work in the field, this report further explores the basic notions of pseudonymisation, as well as technical solutions that can support its implementation in practice.
In particular, starting from a number of pseudonymisation scenarios, the report first defines the main actors that can be involved in the process of pseudonymisation, along with their possible roles. It then analyses the different adversarial models and attack techniques against pseudonymisation, such as brute force attack, dictionary search and guesswork. Moreover, it presents the main pseudonymisation techniques (e.g. counter, random number generator, cryptographic hash function, message authentication code and encryption) and pseudonymisation policies (e.g. deterministic, document-randomized and fully randomized pseudonymisation) available today. It pays particular attention to the parameters that may influence the choice of pseudonymisation technique or policy in practice, such as data protection, utility, scalability and recovery. Some more advanced pseudonymisation techniques are also briefly referenced. On the basis of these descriptions, the report then works through two use cases, the pseudonymisation of IP addresses and of email addresses, analysing the particularities arising from these specific types of identifiers. It also examines a more complex use case, the pseudonymisation of multiple data records, discussing the possibilities of re-identification.
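As a rough illustration of the techniques, policies and attacks listed above, the following Python sketch contrasts an unkeyed cryptographic hash with a message authentication code, and shows a counter and a fully randomized pseudonym. It is not taken from the report: all function and class names, the sample identifiers and the choice of SHA-256 and HMAC are assumptions made for the example, and key management is deliberately left out.

```python
import hashlib
import hmac
import secrets


def hash_pseudonym(identifier):
    """Unkeyed cryptographic hash: deterministic, but anyone can recompute it."""
    return hashlib.sha256(identifier.encode("utf-8")).hexdigest()


def mac_pseudonym(identifier, key):
    """Message authentication code (HMAC-SHA256): deterministic per secret key."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def random_pseudonym():
    """Fully randomized policy: a fresh random value for every occurrence."""
    return secrets.token_hex(16)


class CounterPseudonymiser:
    """Counter technique with a mapping table (hypothetical helper class)."""

    def __init__(self):
        self._table = {}

    def pseudonym(self, identifier):
        # Deterministic policy: a known identifier keeps its number,
        # a new identifier receives the next counter value.
        if identifier not in self._table:
            self._table[identifier] = len(self._table) + 1
        return self._table[identifier]


def dictionary_search(target, candidates):
    """Adversary without any key: hash every candidate and compare."""
    for candidate in candidates:
        if hash_pseudonym(candidate) == target:
            return candidate
    return None


if __name__ == "__main__":
    key = secrets.token_bytes(32)  # assumed to be held only by the pseudonymising entity
    subnet = [f"192.0.2.{i}" for i in range(256)]  # small, enumerable identifier domain

    # The unkeyed hash of an IP address is reversed by simple enumeration.
    print(dictionary_search(hash_pseudonym("192.0.2.7"), subnet))

    # The same search fails against the keyed pseudonym.
    print(dictionary_search(mac_pseudonym("192.0.2.7", key), subnet))
```

Under these assumptions, a document-randomized policy could be approximated by calling `mac_pseudonym` with a different key for each document, so that an identifier is pseudonymised consistently within one document but differently across documents.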
One of the main outcomes of the report is that there is no single easy solution to pseudonymisation that works for all approaches in all possible scenarios. On the contrary, applying a robust pseudonymisation process requires a high level of competence, so as to reduce the threat of discrimination or re-identification attacks while maintaining the degree of utility necessary for the processing of the pseudonymised data.
To this end, the report draws the following conclusions and recommendations for all relevant stakeholders as regards the practical adoption and implementation of data pseudonymisation.
A RISK-BASED APPROACH TOWARDS PSEUDONYMISATION
Although all known pseudonymisation techniques have their own, well-understood, intrinsic properties, choosing the proper technique is not a trivial task in practice. A risk-based approach therefore needs to be adopted, assessing the required protection level while considering the relevant utility and scalability needs.
Data controllers and processors should carefully consider the implementation of pseudonymisation following a risk-based approach, taking into account the purpose and overall context of the personal data processing, as well as the utility and scalability levels they wish to achieve.
Producers of products, services and applications should provide adequate information to controllers and processors regarding their use of pseudonymisation techniques and the security and data protection levels that these provide.
Regulators (e.g. Data Protection Authorities and the European Data Protection Board) should provide practical guidance to data controllers and processors with regard to the assessment of the risk, while promoting best practices in the field of pseudonymisation.
DEFINING THE STATE-OF-THE-ART
In order to support a risk-based approach for pseudonymisation, the definition of the state-of-the-art in the field is essential. To this end, it is important to work towards specific use cases and examples, providing more details and possible options regarding technical implementation.
The European Commission and the relevant EU institutions should support the definition and dissemination of the state-of-the-art in pseudonymisation, in co-operation with the research community and industry in the field.
Regulators (e.g. Data Protection Authorities and the European Data Protection Board) should promote the publication of best practices in the field of pseudonymisation.
ADVANCING THE STATE-OF-THE-ART
While the focus of the report was on basic pseudonymisation techniques that are available today, the use of more advanced (and robust) techniques, such as those arising from the area of anonymisation, is very important for addressing the increasingly complex scenarios in practice.
The research community should work on extending the current pseudonymisation techniques to more advanced solutions effectively addressing special challenges appearing in the big data era. The European Commission and the relevant EU institutions should support and disseminate these efforts.