Chapter 2

Terminology

This Chapter presents a number of terms that are used throughout the report and are essential for the reader's understanding. Some of these terms are based on the GDPR, whereas others refer to technical standards or are explicitly defined for the purpose of this report.

In particular, the following terms are utilised:

Personal data refers to any information relating to an identified or identifiable natural person (data subject); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person (GDPR, art. 4(1)).

Data controller or controller is the natural or legal person, public authority, agency or other body which, alone or jointly with others, determines the purposes and means of the processing of personal data (GDPR, art. 4(7)).

Data processor or processor is the natural or legal person, public authority, agency or other body which processes personal data on behalf of the controller (GDPR, art. 4(8)).

Pseudonymisation is the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person (GDPR, art. 4(5)).

Anonymisation is a process by which personal data is irreversibly altered in such a way that a data subject can no longer be identified directly or indirectly, either by the data controller alone or in collaboration with any other party (ISO/TS 25237:2017).

Identifier is a value that identifies an element within an identification scheme. A unique identifier is associated with only one element. It is often assumed in this report that unique identifiers are used, which are associated with personal data.

Pseudonym, also known as cryptonym or just nym, is a piece of information associated with an identifier of an individual or any other kind of personal data (e.g. location data). Pseudonyms may have different degrees of linkability (to the original identifiers). The degree of linkability of different pseudonym types is important to consider when evaluating the strength of pseudonyms, but also when designing pseudonymous systems where a certain degree of linkability may be desired (e.g. when analysing pseudonymous log files or for reputation systems).

Pseudonymisation function, denoted 𝑃, is a function that substitutes an identifier 𝐼𝑑 by a pseudonym 𝑝𝑠𝑒𝑢𝑑𝑜.

Pseudonymisation secret, denoted 𝑠, is an (optional) parameter of a pseudonymisation function 𝑃. When 𝑃 takes such a parameter, 𝑃 cannot be evaluated/computed if 𝑠 is unknown.
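As an illustration of how a secret can parameterise a pseudonymisation function, the following minimal Python sketch assumes that 𝑃 is realised as a keyed hash (HMAC-SHA256) with 𝑠 as the key. The choice of HMAC, the function name pseudonymise and the example values are assumptions made here for illustration only; the definitions above do not prescribe any particular construction.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret: bytes) -> str:
    """Hypothetical P: maps an identifier Id to a pseudonym using secret s.

    Assumption: P is a keyed hash (HMAC-SHA256). Without `secret`,
    the same pseudonym cannot be recomputed from the identifier.
    """
    return hmac.new(secret, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Illustrative values only; a real secret would be generated randomly
# and stored separately from the pseudonymised data.
s = b"example-secret-not-for-production"
pseudo = pseudonymise("alice@example.com", s)
```

With this construction the same identifier always yields the same pseudonym under a given 𝑠, while a different secret yields an unrelated pseudonym.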

Recovery function, denoted 𝑅, is a function that substitutes a pseudonym 𝑝𝑠𝑒𝑢𝑑𝑜 by the identifier 𝐼𝑑 using the pseudonymisation secret 𝑠. It inverts the pseudonymisation function 𝑃.

Pseudonymisation mapping table is a representation of the action of the pseudonymisation function. It associates each identifier with its corresponding pseudonym. Depending on the pseudonymisation function 𝑃, the pseudonymisation mapping table may be the pseudonymisation secret or part of it.
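The relationship between the pseudonymisation function, the recovery function and the mapping table can be sketched as follows. This minimal Python example assumes that pseudonyms are drawn at random, so that the mapping table itself plays the role of the pseudonymisation secret; the class name MappingTablePseudonymiser, its methods and the sample identifier are hypothetical and serve only as an illustration.

```python
import secrets


class MappingTablePseudonymiser:
    """Hypothetical sketch: random pseudonyms backed by a mapping table.

    Assumption: the table is the pseudonymisation secret. Whoever holds
    it can evaluate both P (pseudonymise) and R (recover).
    """

    def __init__(self) -> None:
        self._forward: dict[str, str] = {}  # identifier -> pseudonym
        self._reverse: dict[str, str] = {}  # pseudonym -> identifier

    def pseudonymise(self, identifier: str) -> str:
        """P: return the pseudonym of an identifier, creating it if needed."""
        if identifier not in self._forward:
            pseudo = secrets.token_hex(8)
            while pseudo in self._reverse:  # avoid reusing a pseudonym
                pseudo = secrets.token_hex(8)
            self._forward[identifier] = pseudo
            self._reverse[pseudo] = identifier
        return self._forward[identifier]

    def recover(self, pseudo: str) -> str:
        """R: invert P by looking the pseudonym up in the table."""
        return self._reverse[pseudo]


entity = MappingTablePseudonymiser()
pseudo = entity.pseudonymise("patient-0042")
original = entity.recover(pseudo)
```

Here recovery is possible only for a party holding the table, which is why the table would need to be kept separately from the pseudonymised data.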

Pseudonymisation entity is the entity responsible for processing identifiers into pseudonyms using the pseudonymisation function. It can be a data controller, a data processor (performing pseudonymisation on behalf of a controller), a trusted third party or a data subject, depending on the pseudonymisation scenario. It should be stressed that, following this definition, the role of the pseudonymisation entity is strictly relevant to the practical implementation of pseudonymisation under a specific scenario. However, in the context of this report, the responsibility for the whole pseudonymisation process (and for the whole data processing operation in general) always rests with the controller.

Identifier domain / pseudonym domain refer to the domains from which the identifier and the pseudonym are drawn. They can be different or the same domains. They can be finite or infinite domains.

Adversary is an entity that tries to break pseudonymisation and link a pseudonym (or a pseudonymised dataset) back to the pseudonym holder(s).

Re-identification attack is an attack on pseudonymisation, performed by an adversary, that aims to re-identify the holder of a pseudonym.
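To make the notion of a re-identification attack concrete, the following Python sketch shows one possible attack under stated assumptions: the pseudonymisation function is an unkeyed SHA-256 hash (no pseudonymisation secret), and the identifier domain is small and known to the adversary (here, four-digit numbers). The function names and the sample identifier are hypothetical; this is an illustration of the concept, not a description of any particular system.

```python
import hashlib
from typing import Iterable, Optional


def unkeyed_pseudonymise(identifier: str) -> str:
    """Assumed P without a secret: a plain SHA-256 hash of the identifier."""
    return hashlib.sha256(identifier.encode("utf-8")).hexdigest()


def dictionary_attack(pseudo: str, candidates: Iterable[str]) -> Optional[str]:
    """Adversary: recompute P over the identifier domain and compare."""
    for candidate in candidates:
        if unkeyed_pseudonymise(candidate) == pseudo:
            return candidate
    return None  # the identifier is not in the tested domain


# Assumed finite identifier domain: all four-digit numbers, 0000 to 9999.
identifier_domain = [f"{n:04d}" for n in range(10_000)]

observed_pseudonym = unkeyed_pseudonymise("0427")
recovered = dictionary_attack(observed_pseudonym, identifier_domain)
```

Because the adversary can evaluate the function over the whole identifier domain, the pseudonym is linked back to its identifier. A pseudonymisation secret unknown to the adversary would prevent this particular search.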