Code of Conduct on the use of GDPR compliant Pseudonymisation

General Data Protection Regulation (GDPR)
Draft for a Code of Conduct on the use of GDPR compliant pseudonymisation
CHAPTER 3

Application examples of pseudonymisation

3.1 Pseudonymisation Magenta TV (DTAG)

3.1.1 Introduction

Deutsche Telekom generates anonymous statistics based on the use of the Magenta TV product. To this end, personal data is first pseudonymised and then converted into anonymous statistics. In particular, certain usage data, so-called events, which carry an identifier (ID), are pseudonymised. This makes it possible, for example, to carry out a distinct count, i.e. to answer the question of how many households or how many set-top boxes watched a particular channel at a certain time. Every user can object to this processing (opt-out) at any time.

The abovementioned IDs are ultimately no longer present in the anonymous statistics, so the pure numbers cannot be traced back to the encrypted IDs.

3.1.2 Description of responsibilities

Telekom Deutschland GmbH is responsible for the personal data generated when using the Magenta TV product. The pseudonymisation is provided by T-Systems GmbH as an IT service provider.

T-Systems is integrated into this process by Telekom Deutschland via a controller-processor agreement. Another legal unit of T-Systems, Tel-IT, provides an automatically generated key for pseudonymisation. Tel-IT is also involved in development and operation.

The pseudonymisation is commissioned by the "Private Customers Germany" segment. In other words, this division commissions the IT service provider to carry out the pseudonymisation, after consultation with Deutsche Telekom's Corporate Data Protection Department. The Data Protection Department is also responsible for the legal conformity of the pseudonymisation process as such.

3.1.3 Criteria for determining the appropriate pseudonymisation method

The data to be pseudonymised are usage data from Magenta TV, cf. Art. 4 No. 1 GDPR. In addition, metadata also flows into the pseudonymisation. These data are pseudonymised for the creation of user profiles, cf. Art. 6 para. 1 lit. f) in conjunction with Art. 32 para. 1 lit. a) GDPR. More than 10 million data records are pseudonymised per day. Person and device pseudonyms are created in the process.

Data field NAME       IDENTIFICATION   Risk Class   Remarks
Subscriber_ID         ACCOUNT_ID       1            Pseudonym Subscriber
Physical_Device_ID    DEVICE_ID        2            Pseudonym Device identifier

Fig.: Data fields and risk classes

3.1.4 Rights and role concept as well as key management

The authorisations are clearly distributed by both role assignment and technical purpose assignment, as laid down in the organisation and authorisation concept. The division of Telekom Deutschland GmbH (TDG) responsible for the Magenta TV product has no influence on the pseudonymisation. It only has access to the anonymous statistics generated at the end of the process. T-Systems performs the pseudonymisation by automatically encrypting the usage data via the AcL (Acquisition Layer).

An independent technical instance (Tel-IT) supplies the key. This system can only be accessed by Tel-IT and the administrators of T-Systems. The crypto material (keys/salts) required for pseudonymisation is separately encapsulated in a so-called Trust Center (Tel-IT). During configuration, the employee has no way of gaining knowledge of it. Only technical users and a small group of persons (3-4 persons) have access to the crypto material. However, they have no admin rights. This constitutes the organisational separation.

In an additional agreement, TDG also waives the authority to issue instructions regarding the crypto material that it would otherwise have under the controller-processor agreement, i.e. TDG may not request this information. Tel-IT is not allowed to hand it over, not even to third parties. The data is only transferred from the AcL to the BDMP (Big Data Management Platform), where it is available to TDG for analyses, once the pseudonymisation has been completed. The pseudonymised usage profiles are aggregated on the BDMP. Access to the information in the AcL and to the technical instance is excluded.

3.1.5 Data generation

When a Magenta TV set-top box (STB) is used, i.e. when the user presses a button on the remote control, different events are generated depending on which buttons have been pressed and the context the user is in. These STB events form the basis of the evaluations. Examples of such events are switching the box on or off, channel switching, information about the channels watched, and information about activities relating to recording or watching recordings.

These event data records contain, for example, information about the set-top box (DeviceID), date/time, and other specific data fields. The personally identifiable information of these events is encrypted by means of an AES-128 cipher.

3.1.6 Pseudonymisation

The underlying pseudonymisation process produces pseudonyms that are linkable but cannot be reversed without the key material. These are generated using deterministic, cryptographically strong ciphers. Since deterministic processes map identical plaintexts to identical result values (pseudonyms), linkability is ensured. The secure administration of the key material and the organisational separation of access to the keys prevent the inadmissible reversal of the pseudonymisation, i.e. the disclosure of the plaintext data.
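
The linkability property described above can be illustrated with a minimal sketch. It is not the production method: HMAC-SHA256 from the Python standard library is used as a stand-in for the deterministic AES cipher, and unlike a cipher it cannot be reversed even with the key. The function name pseudonymise, the demo key, and the device identifiers are assumptions made for illustration only.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, key: bytes) -> str:
    """Return a deterministic keyed pseudonym for an identifier.

    HMAC-SHA256 is a stand-in for the deterministic cipher named in the
    text; it only demonstrates that equal inputs yield equal pseudonyms.
    """
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical key; in the setup described, the key is held by the Trust Center.
demo_key = b"demo-key-not-for-production"

ps_a = pseudonymise("device-0001", demo_key)
ps_b = pseudonymise("device-0001", demo_key)
ps_c = pseudonymise("device-0002", demo_key)

print(ps_a == ps_b)  # same identifier and key: same pseudonym (linkable)
print(ps_a == ps_c)  # different identifier: different pseudonym
```

Without access to the key, the pseudonym cannot be recomputed from a known identifier, which is why the separation of key custody matters for this approach.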

The pseudonyms created for the AccountID (ID for the customer) and the DeviceID (ID for the respective set-top box) are used for further evaluations. The event files necessary for the evaluation and the references ACCOUNT_PS and DEVICE_PS do not contain any attributes that directly contain personal data. These references (ACCOUNT_PS and DEVICE_PS) are the person and device pseudonyms.

The pseudonyms are used to record the usage information of Magenta TV in order to generate anonymous statistics. Here, it is important to be able to recognise which events originate from the same device or user. Pseudonymisation ensures that employees cannot draw any conclusions about the actual devices or users. The resulting statistics are completely free of pseudonymised identifiers and are therefore anonymous.
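
How pseudonyms can feed a count of distinct households and devices, without appearing in the result, might be sketched as follows. The event layout, the example values, and the function name distinct_counts are assumptions for illustration; the actual aggregation on the BDMP is not described in this document.

```python
from collections import defaultdict

# Hypothetical pseudonymised events: (ACCOUNT_PS, DEVICE_PS, channel, hour)
events = [
    ("acc_a", "dev_1", "Channel 1", 20),
    ("acc_a", "dev_1", "Channel 1", 20),  # repeated event from the same device
    ("acc_a", "dev_2", "Channel 1", 20),
    ("acc_b", "dev_3", "Channel 1", 20),
    ("acc_b", "dev_3", "Channel 2", 21),
]


def distinct_counts(rows):
    """Count distinct account and device pseudonyms per (channel, hour)."""
    households = defaultdict(set)
    devices = defaultdict(set)
    for account_ps, device_ps, channel, hour in rows:
        households[(channel, hour)].add(account_ps)
        devices[(channel, hour)].add(device_ps)
    # Only the counts are returned; the pseudonyms themselves are dropped.
    return {
        slot: {"households": len(households[slot]), "devices": len(devices[slot])}
        for slot in households
    }


statistics = distinct_counts(events)
print(statistics[("Channel 1", 20)])
```

The repeated event from dev_1 is counted only once, which is the reason a linkable identifier is needed before aggregation.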

3.2 Pseudonymisation for the optimisation of online platform advertising

3.2.1 Introduction

Targeting advertising at a desired audience via online platforms such as social media, e-commerce shops or online publishers minimises advertising dispersion loss. At the same time, targeted advertising saves platform users unnecessary irritation caused by irrelevant video adverts. The use of commercially available consumer information such as sociodemographic or lifestyle data helps to reach relevant audiences. Acxiom licenses target audiences formed using selection criteria and uses multiple pseudonymisation methods so that, on the one hand, the data can be linked for the purpose of presenting individualised online advertising and, on the other hand, data subjects are protected from direct identification.

3.2.2 Preparation: Creation of a pseudonym-to-pseudonym reference table with the platform partner

In order to reach the desired audience on an online platform, a two-stage process is required. Firstly, and independently of carrying out campaigns for a customer, the databases of Acxiom and the platform operator are compared. In the first step, the plain text name-and-address database is loaded onto Acxiom's proprietary Privacy Enhancement Tool (PET). There, each data record of the name-and-address database receives a pseudonymous personal key. This personal key is in turn hashed with a salt. In addition, Acxiom pseudonymises the plaintext data of the name-and-address database by hashing it. The result is a file with two fields: the hashed personal key and the hashed name-and-address data (Acxiom's match file). The platform operator, for its part, pseudonymises its user data in a similar way and saves the pseudonymised user contact data together with the platform's own user ID in a file (the platform's match file). After the two match files have been compared using the pseudonyms or hash values, a cross-reference table is created between the platform user ID and the hashed Acxiom personal key. The platform operator only stores the mapping of the platform user ID to the hashed Acxiom personal key in the cross-reference table. All other information is deleted immediately after the comparison.
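
The matching procedure described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Acxiom's PET: SHA-256 as the hash function, the normalise helper, the salt value, and all names and records are hypothetical.

```python
import hashlib


def sha256_hex(value: str) -> str:
    """Hex-encoded SHA-256 of a string (the hash function is an assumption)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def normalise(contact: str) -> str:
    """Hypothetical normalisation so both parties hash identical strings."""
    return " ".join(contact.lower().split())


def acxiom_match_file(records: dict, salt: str) -> dict:
    """Map hashed name-and-address data to the salted, hashed personal key."""
    return {
        sha256_hex(normalise(contact)): sha256_hex(salt + personal_key)
        for personal_key, contact in records.items()
    }


def platform_match_file(users: dict) -> dict:
    """Map hashed user contact data to the platform's own user ID."""
    return {sha256_hex(normalise(c)): user_id for user_id, c in users.items()}


def cross_reference(acxiom_file: dict, platform_file: dict) -> dict:
    """Keep only 'platform user ID -> hashed personal key' for matching hashes."""
    common = acxiom_file.keys() & platform_file.keys()
    return {platform_file[h]: acxiom_file[h] for h in common}


# Fictional example data.
acxiom_records = {
    "PK-1": "Erika Mustermann, Hauptstr. 1, 12345 Musterstadt",
    "PK-2": "Max Beispiel, Nebenweg 2, 54321 Beispielort",
}
platform_users = {
    "user-77": "erika mustermann,  hauptstr. 1, 12345 musterstadt",
    "user-88": "Someone Else, Elsewhere 3, 11111 Nowhere",
}

table = cross_reference(
    acxiom_match_file(acxiom_records, salt="demo-salt"),
    platform_match_file(platform_users),
)
print(table)
```

Only the matched pair survives in the table; the hashed contact data used for the comparison is not part of the result, mirroring the deletion step described in the text.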

3.2.3 Audience selection

For the selection of the relevant audience on a platform, Acxiom creates its own data product which contains the pseudonym (an Acxiom personal key associated with a certain salt) as a key variable for matching, but neither names nor addresses.

Audiences can be selected on the basis of sociodemographic data, calculated affinities for certain products or services, but also on the basis of purely geographical information (e.g. advertising for high-speed Internet only in regions where it is available).

Acxiom has a wide range of microgeographic variables that are calculated on a fine-spatial neighborhood level using official data, surveys, market research studies, etc. (i.e. all households in a geographical cell or neighborhood are assigned the same values). For example, the assumption "has a cat" is assigned to all households of this microgeographic cell, regardless of the individual situation of the different families in the neighborhood. The neighborhood must always comprise at least 4 households. Using these cell-level characteristics prevents a natural person from being identified or detected by means of these pseudonymous data sets.
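
A minimal sketch of the cell-level assignment described above is given below. The threshold of 4 households comes from the text; the function name, the data layout, and the choice to leave households in undersized cells without values are assumptions for illustration.

```python
from collections import Counter

MIN_HOUSEHOLDS_PER_CELL = 4  # minimum cell size stated in the text


def assign_cell_values(household_cells, cell_values):
    """Give every household the values of its cell.

    Households in cells below the minimum size receive no values
    (an assumed handling; the text only states the minimum size).
    """
    cell_sizes = Counter(household_cells.values())
    assigned = {}
    for household_id, cell_id in household_cells.items():
        if cell_sizes[cell_id] < MIN_HOUSEHOLDS_PER_CELL:
            continue
        assigned[household_id] = dict(cell_values[cell_id])
    return assigned


# Hypothetical households and cells.
household_cells = {
    "hh1": "cell_A", "hh2": "cell_A", "hh3": "cell_A", "hh4": "cell_A",
    "hh5": "cell_B",
}
cell_values = {"cell_A": {"has_cat": True}, "cell_B": {"has_cat": False}}

assigned = assign_cell_values(household_cells, cell_values)
print(sorted(assigned))
```

Every household in cell_A carries the same value, so the attribute says something about the cell and not about an individual household.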

The resulting audience that is selected (e.g. "has a cat" and "lives in an apartment") is always a list of hashed Acxiom personal keys. Acxiom uploads this list to its own advertising account at the platform operator, where it can then be shared with the advertiser or its agency so that they can use the audience.
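
The selection step can be pictured as a filter over the data product that returns nothing but hashed personal keys. The attribute names, key values, and the function select_audience below are hypothetical and serve only as a sketch.

```python
def select_audience(data_product, **criteria):
    """Return the hashed personal keys whose attributes match all criteria."""
    return sorted(
        hashed_key
        for hashed_key, attributes in data_product.items()
        if all(attributes.get(name) == value for name, value in criteria.items())
    )


# Hypothetical data product: hashed personal key -> cell-level attributes.
data_product = {
    "hk_01": {"has_cat": True, "dwelling": "apartment"},
    "hk_02": {"has_cat": True, "dwelling": "house"},
    "hk_03": {"has_cat": False, "dwelling": "apartment"},
    "hk_04": {"has_cat": True, "dwelling": "apartment"},
}

audience = select_audience(data_product, has_cat=True, dwelling="apartment")
print(audience)
```

The returned list contains no attributes, names or addresses, only the keys that are later uploaded.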

3.2.4 Placing advertisements

Based on the reference created through data comparison between the platform user ID and the hashed Acxiom personal key, the platform operator displays the advertisement to the audience uploaded to the account.
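
On the platform side, the delivery step amounts to looking up the uploaded hashed keys in the cross-reference table. The sketch below assumes a simple dictionary for that table; the function name resolve_audience and all identifiers are hypothetical.

```python
def resolve_audience(audience, cross_reference_table):
    """Return the platform user IDs whose hashed personal key is in the audience."""
    uploaded = set(audience)
    return sorted(
        user_id
        for user_id, hashed_key in cross_reference_table.items()
        if hashed_key in uploaded
    )


# Hypothetical cross-reference table: platform user ID -> hashed personal key.
cross_reference_table = {"user-77": "hk_01", "user-88": "hk_02", "user-99": "hk_04"}
uploaded_audience = ["hk_01", "hk_04", "hk_09"]  # hk_09 has no platform match

targets = resolve_audience(uploaded_audience, cross_reference_table)
print(targets)
```

Keys without a counterpart in the table are simply ignored, so the platform learns nothing about persons who are not among its users.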

3.2.5 Technical and organisational measures

The platform operator has contractually committed to Acxiom to keep the reference data between the platform user ID and the hashed Acxiom personal key separated, and physically set apart from its CRM system. The advertisement is displayed through a separate advertising delivery system. The corresponding contractual obligations and the processing of the data, physically separated from its own user database, ensure that there is no possibility for the platform operator to identify a person on the basis of these pseudonymous IDs.

At Acxiom, access to the salt used to hash the personal key is only granted to a few selected employees. The Acxiom employees who select the audience cannot identify the persons behind the selected ID numbers, since the process described above does not allow them to assign the hashed Acxiom personal keys to a person.

In addition, the twice pseudonymised and hashed Acxiom personal keys are anonymous for the advertisers who license the audience from Acxiom, for their agencies, and for any other third party, since they have neither the means of identifying a person from the twice pseudonymised personal data nor the ability to assign this data to an individual. Moreover, advertisers have no access whatsoever to the hashed Acxiom personal keys in the selected audience, because the selection of the audience and the upload to the platform take place exclusively at Acxiom. Viewing the uploaded audience group is technically not possible for the advertiser.