Pseudonymisation before GDPR, or as it is still understood much of the time, is mostly based on static de-identifiers. With the technology that exists today, using static de-identifiers makes it very easy to re-identify a person.
For example, let's take three situations. In AdTech, a lot of players will not even use pseudonymisation or encryption. They would just use your entire profile. For example, you're John McKee, living in an apartment in London, you have a cat, you bike to work, and in the morning before you go to work you stop at a coffee shop for a double latte.
Then you might know that in AdTech, that entire digital profile made of us is actually sold and used to enable real-time bidding. So you see, that's really not applying data minimization, because, for example, why does a coffee shop owner or a coffee brand need to know that you own a cat in order to advertise their coffee?
But that's the way AdTech works most of the time. Now, if you use a static identifier or de-identifier, then you could say, let's replace John McKee by, for example, ABCD, and then you would have: ABCD lives in London in an apartment with a cat. ABCD rides to work on his bicycle and stops for a coffee. ABCD likes this type of coffee.
If you use static de-identifiers, then you do indeed make sure that you cannot be identified easily through direct identifiers like your name, for example. But we can all agree that if you have ABCD everywhere (ABCD lives in an apartment with a cat, ABCD this, ABCD that), then by using all that information together and combining the different data sets, it is very easy, definitely with the techniques of today, to re-identify John McKee using direct and even indirect identifiers.
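To make the ABCD example concrete, here is a minimal sketch of why a static token still lets separate data sets be combined into one profile. The three data sets, their field names and the `merge_by_token` helper are all hypothetical and only illustrate the idea.

```python
# Hypothetical data sets from three different sources. Each one replaced
# the name "John McKee" with the same static token "ABCD".
coffee_shop = [{"token": "ABCD", "city": "London", "order": "double latte"}]
pet_store = [{"token": "ABCD", "home": "apartment", "pet": "cat"}]
bike_app = [{"token": "ABCD", "commute": "bicycle"}]


def merge_by_token(*datasets):
    """Combine records from several data sets that share the same token."""
    profiles = {}
    for dataset in datasets:
        for record in dataset:
            profile = profiles.setdefault(record["token"], {})
            # Copy every attribute except the token itself into the profile.
            profile.update({k: v for k, v in record.items() if k != "token"})
    return profiles


profiles = merge_by_token(coffee_shop, pet_store, bike_app)
print(profiles["ABCD"])
```

Because the token never changes, the name is gone but the full profile is rebuilt with a simple merge.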
And that is also what we saw in the Mosaic Effect Study of Harvard. I hear people say a lot of the time that pseudonymisation does not work because, for example, you have the Harvard Study, and there they used static tokens to replace the personal data of the data subject. But based on three indirect identifiers, your zip code, your birth date and your gender, they were able to relink it to 87% of the data subjects.
So you see, if you use static de-identifiers, static tokens, then by using indirect identifiers you can re-link the personal data and attribute it back to the data subject.
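As an illustration of that kind of re-linking, the sketch below matches tokenised records against a second data set that still contains names, using only zip code, birth date and gender. All records, values and the `relink` helper are made up for the example and do not reproduce the study's actual data or method.

```python
# Hypothetical tokenised records: the name is replaced by a static token,
# but the three indirect identifiers are left untouched.
pseudonymised = [
    {"token": "ABCD", "zip": "02138", "birth_date": "1975-07-31",
     "gender": "M", "order": "double latte"},
    {"token": "EFGH", "zip": "02139", "birth_date": "1980-01-15",
     "gender": "F", "order": "espresso"},
]

# Hypothetical second data set that contains names next to the same fields.
public_register = [
    {"name": "John McKee", "zip": "02138", "birth_date": "1975-07-31", "gender": "M"},
    {"name": "Jane Doe", "zip": "02139", "birth_date": "1980-01-15", "gender": "F"},
    {"name": "Ann Roe", "zip": "02139", "birth_date": "1980-01-15", "gender": "F"},
]

INDIRECT_IDENTIFIERS = ("zip", "birth_date", "gender")


def relink(records, register):
    """Return {token: name} where the indirect identifiers match one person."""
    names_by_key = {}
    for person in register:
        key = tuple(person[field] for field in INDIRECT_IDENTIFIERS)
        names_by_key.setdefault(key, []).append(person["name"])

    relinked = {}
    for record in records:
        key = tuple(record[field] for field in INDIRECT_IDENTIFIERS)
        candidates = names_by_key.get(key, [])
        if len(candidates) == 1:  # a unique match means re-identification
            relinked[record["token"]] = candidates[0]
    return relinked


relinked = relink(pseudonymised, public_register)
print(relinked)
```

In this toy data, ABCD is uniquely matched to one name, while EFGH stays ambiguous because two people share the same three values.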
For example, the Belgian social security number is made of your birth date followed by what seems like a completely random number, but apparently it is not that random. It comes from the notification that your parents made at the office, and whether it's an even or an odd number depends on the gender. Even are the girls. Odd are the boys.
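A small sketch of how much such a structured number gives away. The digit layout used here (six birth-date digits, then a three-digit sequence number) is an assumption made for illustration, the sample number is made up, and the trailing check digits are ignored rather than validated.

```python
def read_structured_number(number: str) -> dict:
    """Read birth date digits and gender parity from a structured number.

    Assumed layout (illustrative only): YYMMDD, then a 3-digit sequence
    number whose parity encodes gender (even = female, odd = male).
    """
    digits = "".join(ch for ch in number if ch.isdigit())
    if len(digits) < 9:
        raise ValueError("expected at least 9 digits")
    sequence = int(digits[6:9])
    return {
        "birth_yy": digits[0:2],
        "birth_mm": digits[2:4],
        "birth_dd": digits[4:6],
        "gender": "female" if sequence % 2 == 0 else "male",
    }


# Made-up number, not a real person's identifier.
info = read_structured_number("85.07.30-033.00")
print(info)
```

So an identifier that looks opaque already carries two of the three indirect identifiers mentioned earlier.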
It was said at a certain point that if you hash the data, using an encryption key, it would become anonymous data and you would not be able to relink it again and attribute that data to the data subject.
Studies found that this is not the case and that, for example, you really also need to apply a salt. If you hash and then encrypt the data, it needs that extra randomness in order to make it infeasible for people who are not authorised to access the data or relink it to the data subject.
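The difference can be sketched as follows. A plain hash of an identifier can be reversed by simply hashing a list of candidate values, while a hash mixed with a random secret cannot be matched without that secret. This sketch uses HMAC-SHA256 with a random key as one possible way to add that secret randomness. It is an assumption for illustration, not the specific method the studies refer to, and the helper names and candidate list are hypothetical.

```python
import hashlib
import hmac
import secrets


def plain_hash(identifier: str) -> str:
    """Unsalted SHA-256: the same input always gives the same digest."""
    return hashlib.sha256(identifier.encode("utf-8")).hexdigest()


def keyed_hash(identifier: str, secret_key: bytes) -> str:
    """HMAC-SHA256: the digest also depends on a secret random key."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def dictionary_attack(target_digest: str, candidates):
    """Try to reverse a digest by hashing each candidate without any key."""
    for candidate in candidates:
        if plain_hash(candidate) == target_digest:
            return candidate
    return None


secret_key = secrets.token_bytes(32)  # must be stored separately from the data
candidates = ["Jane Doe", "John McKee", "Ann Roe"]  # hypothetical guess list

recovered_plain = dictionary_attack(plain_hash("John McKee"), candidates)
recovered_keyed = dictionary_attack(keyed_hash("John McKee", secret_key), candidates)
print(recovered_plain, recovered_keyed)
```

The protection then depends on how well the secret is kept, which is exactly the kind of detail you need to know before judging whether a given pseudonymisation works.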
Just saying that pseudonymisation in itself does not work is not a statement you can make without really knowing the technology behind it and knowing what was done. Were static tokens used? What type of encryption was used?
So just saying that pseudonymisation does not work can't be said anymore under GDPR.