
G. Problem areas under labour law

V. Data protection

1. Basic terms

Article 1 of the GDPR specifies the subject matter and objectives of the Regulation.

A distinction must be made between two equally important objectives: the protection of the fundamental rights of natural persons with regard to the processing of personal data (paragraphs 1 and 2) on the one hand and, on the other, the free movement of data within the EU (paragraphs 1 and 3).595

Article 4 GDPR contains definitions of the main terms. These definitions alone are put to a serious test by AI and Big Data.

a) Personal data

The linchpin of the GDPR is the protection of "personal data". Accordingly, it is not surprising that the list of definitions in Article 4 GDPR starts with this term.

According to Article 4 No. 1 GDPR, personal data are "any information relating to an identified or identifiable natural person [...]; an identifiable natural person is one who can be identified, directly or indirectly". Overall, the definition of "personal data" was deliberately kept extraordinarily open and thus flexible.596 Equally deliberately, the Union legislature accepted the resulting legal uncertainty.597


595 See also, for example, Spindler/Dalby, in: Spindler/Schuster, Recht der elektronischen Medien, 4th ed. 2019, Art. 1 DSG-VO marginal no. 1.

596 According to the case law of the ECJ, the term must also be interpreted broadly; cf. on this only Karg, in: Simitis/Hornung/Spiecker gen. Döhmann (ed.), Datenschutzrecht, 1st ed. 2019, Art. 4 No. 1 marginal no. 3 (and footnote 10) with further references.

597 See only Tosoni/Bygrave, in: Kuner/Bygrave/Docksey/Drechsler, The EU General Data Protection Regulation - A Commentary, 2020, Art. 6 note 7 with further references.


If information is not, and cannot be, assigned to a person, the GDPR does not apply: anonymous data are not protected by it.598 However, this is precisely where the problems begin in connection with AI considered together with Big Data, whose added value consists in the fact that data can be statistically correlated with one another where this was previously not possible, or not feasible for reasons of time or cost.599 Indeed, Big Data analytics and AI regularly draw non-intuitive and unverifiable conclusions and make predictions about, for example, people's behaviour or certain inclinations. The GDPR undoubtedly applies to Big Data analytics based exclusively on personal data.

However, such analytics can also use exclusively non-personal data.600 If, for example, analyses of the behaviour of certain groups are then applied to individual persons belonging to those groups, the GDPR might not be taken into account, even though the risk to these persons is obvious. Against this background, it is understandable that some authors claim that the data being processed here are in reality also personal (derived) data – which, however, arise "not at the beginning but at the end" of the data processing, as is the case with the classic personality profile.601

According to Article 4 No. 1, personal data means any information relating to an identified or identifiable natural person. A reference to a person therefore exists if a person is directly identified by the information.602 However, it also exists if a person is identifiable through the addition of further information or intermediate steps. An "identifiable" person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that person.

To answer the question of what is meant by "identifiability", one must also refer to Recital 26.

598 Cf. only Karg, in Simitis/Hornung/Spiecker gen. Döhmann (ed.), Datenschutzrecht, 1st ed. 2019, Art. 4 No. 1 marginal no. 19.

599 See only Roßnagel/Geminn/Jandt/Richtert, Datenschutzrecht 2016 "Smart" genug für die Zukunft? Ubiquitous Computing und Big Data als Herausforderungen des Datenschutzrechts, 2016, p. 21 f.

600 For more details, see Roßnagel/Geminn/Jandt/Richtert, Datenschutzrecht 2016 "Smart" genug für die Zukunft? Ubiquitous Computing und Big Data als Herausforderungen des Datenschutzrechts, 2016, p. 29 ff. with examples.

601 Thus Roßnagel/Geminn/Jandt/Richtert, Datenschutzrecht 2016 "Smart" genug für die Zukunft? Ubiquitous Computing und Big Data als Herausforderungen des Datenschutzrechts, 2016, p. 26: „Es wird, um es bildlich auszudrücken, keine Akte über eine bestimmte Person geführt, sondern es gibt eine Vielzahl dynamischer anonymer Akten, die in einem Augenblick auf eine bestimmte Person konkretisiert werden können" ("To put it figuratively, there is no file kept on a specific person, but there are a multitude of dynamic anonymous files that can be concretised to a specific person in an instant").

In conclusion, likewise Zuiderveen Borgesius, Singling out people without knowing their names - Behavioural targeting, pseudonymous data, and the new Data Protection Regulation, Computer Law & Security Review 2016, 256; also Wachter/Mittelstadt, A Right to Reasonable Inferences: Re-Thinking Data Protection Law in the Age of Big Data and AI (October 5, 2018), Columbia Business Law Review 2019: https://ssrn.com/abstract=3248829, who call for the establishment of a "right to reasonable inferences". (Problems of a different kind arise when, as is often the case with AI and Big Data, there are mixed data sets, i.e. those containing personal and non-personal data; see Tosoni/Bygrave, in: Kuner/Bygrave/Docksey/Drechsler, The EU General Data Protection Regulation - A Commentary, 2020, Art. 4 note 6.)

602 Cf. only Karg, in Simitis/Hornung/Spiecker gen. Döhmann (eds.), Datenschutzrecht, 1st ed. 2019, Art. 4 No. 1 para. 46, 54 et seq.; on the requirements for sufficient "identification" ibid., para. 48 et seq.


Recital 26 states: “To determine whether a natural person is identifiable, account should be taken of all the means reasonably likely to be used, such as singling out, either by the controller or by another person to identify the natural person directly or indirectly. To ascertain whether means are reasonably likely to be used to identify the natural person, account should be taken of all objective factors, such as the costs of and the amount of time required for identification, taking into consideration the available technology at the time of the processing and technological developments”. This is significant mainly because the legislature has thus decided in principle in favour of the so-called objective or absolute theory (and thus against the so-called subjective or relative theory). According to the former, identifiability is already given if either the responsible body or any third party is able to connect the information to a person. Under the subjective or relative theory, in contrast, only those means are to be taken into account that are actually available to the respective responsible body in the concrete individual case in order to establish the personal reference.603

It is clear from what has just been said that the question of identifiability is burdened with considerable uncertainties from the outset, and data processing in the context of AI does not make it any easier to answer. Illustrative of this is the necessity, arising from Recital 26, to take into account the means which are “reasonably likely to be used to identify the natural person directly or indirectly”. This is a dynamic test which, in addition to objective factors (such as time and costs), must also take into account the technology available at the time of the processing. In other words, whether identifiability is given depends largely on the state of the art at the time of the legal assessment of the facts. In view of the increasing capabilities of AI to assign information to individuals, however, this means that a processing of data that is still anonymous today may very well be a processing of personal data at a later point in time. Accordingly, data controllers are obliged to conduct a continuous review and risk analysis to ensure that originally anonymous data can continue to be considered as such.604

However, the fact that AI noticeably increases the possibility of linking (initially) anonymous data with concrete persons should be beyond question.605

603 Cf. Karg, in Simitis/Hornung/Spiecker gen. Döhmann (eds.), Datenschutzrecht, 1st ed. 2019, Art. 4 No. 1, para. 58 et seq., who believes that "case law and the GDPR" have "now arguably answered the question in favour of the relative theory, albeit with strong limitations and adoption of some elements of the absolute theory".

604 Cf. Karg, in Simitis/Hornung/Spiecker gen. Döhmann (eds.), Datenschutzrecht, 1st ed. 2019, Art. 4 No. 1, para. 63.

605 Cf. also Holthausen, RdA 2021, 19 (25) with the conclusion that "anonymity-preserving data mining in the context of Big Data [...] thus (remains) a challenge for data protection as well as data security and a task for research".


Here, the possibility of identification is based on statistical correlations between unidentified data and personal data concerning the same person. To put it another way, a data element that is anonymous at first glance is placed, through the application of AI, in the context of further data, which then enables a personal attribution.606 In the meantime, the constantly expanding linking possibilities have even led to calls for a restrictive interpretation of Article 4 No. 1 GDPR, on the argument that an excessive application of the GDPR must be counteracted: the technical possibilities now allow the linking of almost any data with a person,607 to which one could add that the use of AI systems can noticeably reduce not only the "costs of identification" but also the "time required" for it. If this view were followed, however, there would be a risk of curtailing the scope of protection of Article 8 of the Charter of Fundamental Rights.608
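The linkage mechanism underlying this debate can be made concrete with a small sketch. The following Python fragment – with entirely invented names, columns and values – shows how a seemingly anonymous usage record becomes personal data the moment it is placed in the context of an auxiliary source that shares a few attributes with it:

```python
# A minimal sketch of the linkage described above: an "anonymous" record is
# matched against an auxiliary source (e.g. a public register) via shared
# attributes. All names, columns and values are hypothetical.
import pandas as pd

anonymous_log = pd.DataFrame({
    "birth_year": [1984],
    "postcode":   ["10115"],
    "gender":     ["f"],
    "activity":   ["visited careers page at 03:12 am"],
})

public_register = pd.DataFrame({
    "name":       ["A. Schmidt", "B. Meyer", "C. Weber"],
    "birth_year": [1984, 1984, 1962],
    "postcode":   ["10115", "20095", "10115"],
    "gender":     ["f", "f", "f"],
})

# The shared attributes act as a join key; a unique match turns the
# "anonymous" record into information about an identified person.
matches = anonymous_log.merge(public_register, on=["birth_year", "postcode", "gender"])
if len(matches) == 1:
    print(matches[["name", "activity"]].to_string(index=False))
```

In this toy example, the combination of three unremarkable attributes is already unique – precisely the effect to which the calls for a restrictive interpretation react.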

Conversely, however, it cannot be overlooked that the GDPR could degenerate into a "law of everything" in view of the extremely open definition of "personal data".609 In this context, some gloomy forecasts state that the GDPR's "system of legal protection based on such an all-encompassing notion and high intensity of positive compliance obligations is not going to be sustainable in the long run".610

According to Recital 26, the Regulation does not apply to the processing of personal data "which have been rendered anonymous in such a way that the data subject is not or no longer identifiable". Anonymisation procedures (as well as pseudonymisation procedures) are among the methods that can contribute to the implementation of data protection requirements through technology design.611

606 For more details, see Sartor, The impact of the General Data Protection Regulation (GDPR) on artificial intelligence, 2020, p. 36 ff.

607 Cf. in particular Forgó/Krügel, MMR 2010, 17 (using the example of geodata).

608 Thus Karg, in: Simitis/Hornung/Spiecker gen. Döhmann (ed.), Datenschutzrecht.

1st ed. 2019, Art. 4 No. 1 DSGVO marginal no. 65, who additionally opines that "the expansion of the scope of application of the DSGVO [...] is not caused by an extensive interpretation of the concept of personal data, but by the constantly increasing analytical capabilities of information and communication technology and the associated gain in knowledge about the personality".

609 See Purtova, The Law of Everything. Broad Concept of Personal Data and Future of EU Data Protection Law, Law, Innovation and Technology 2018, 40.

610 Thus Purtova, The Law of Everything. Broad Concept of Personal Data and Future of EU Data Protection Law, Law, Innovation and Technology 2018, 33. The author calls for abandoning the concept of 'personal' data as the cornerstone of data protection altogether and instead providing remedies for 'information-related harm' in the broadest sense.

611 Thus Hansen in: Simitis/Hornung/Spiecker gen. Döhmann (ed.), Datenschutzrecht, 1st ed. 2019, Art. 4 No. 5 marginal no. 50.


Effective anonymisation prevents "all parties from singling out an individual in a dataset, from linking two records within a dataset (or between two separate datasets) and from inferring any information in such dataset".612 However, it is unclear how specific the reference to a natural person must be and to what extent a sufficient reference to a person is also given if general statistical statements allow certain conclusions to be drawn about the training data used.613 Technically, it is probably true anyway that completely anonymous data do not exist.614 In particular, the literature calls for the establishment of criteria that can be used to verify beyond doubt whether data are personal or anonymous; in the absence of such verifiability, there is "no guarantee that a dataset anonymised according to the state of the art is actually anonymous".615
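One candidate for such a verifiable criterion, discussed in the technical anonymisation literature, is k-anonymity: a record whose combination of quasi-identifiers is unique within the dataset can be "singled out" in the sense just quoted. The following sketch – column names and data are assumptions for illustration – computes k for a small dataset:

```python
# A minimal "singling out" test based on k-anonymity: k is the size of the
# smallest group of records sharing the same quasi-identifier combination.
# Records in groups of size 1 can be singled out. Data are invented.
import pandas as pd

df = pd.DataFrame({
    "birth_year": [1984, 1984, 1990, 1990, 1975],
    "postcode":   ["10115", "10115", "20095", "20095", "80331"],
    "gender":     ["f", "f", "m", "m", "f"],
    "diagnosis":  ["A", "B", "A", "C", "B"],
})

QUASI_IDENTIFIERS = ["birth_year", "postcode", "gender"]

group_sizes = df.groupby(QUASI_IDENTIFIERS)[QUASI_IDENTIFIERS[0]].transform("size")
k = int(group_sizes.min())
singled_out = df[group_sizes == 1]

print(f"dataset is {k}-anonymous")
print(f"records that can be singled out: {len(singled_out)}")
```

Even a formally satisfactory k, however, says nothing about linkage with external datasets – which is why such tests alone cannot supply the guarantee the literature finds missing.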

In any case, however, there are considerable anonymity risks, especially in machine learning.616 For example, there are findings that certain ML techniques can "remember" the data used to train the model unexpectedly clearly, and that this "memory" may be so strong that a faithful image of the training data can be reconstructed.617 In the meantime, the European legislature has also explicitly recognised that the possibility of converting anonymised data back into personal data must increasingly be expected in the future.618 There are proposals in the literature on how such "de-anonymisation" by AI could be prevented or at least sanctioned more effectively.619

612 Cf. Article 29 Working Party, WP 216, p. 9. The Working Party (Article 29 Data Protection Working Party) was an independent advisory body to the European Commission on data protection issues, established on the basis of Article 29 of Directive 95/46/EC (Data Protection Directive) of 24 October 1995. The statements of the group - now replaced by the European Data Protection Board (cf. Art. 68 GDPR) - still carry weight in the interpretation of the GDPR. General on anonymisation methods Karg, in Simitis/Hornung/Spiecker gen. Döhmann (ed.), Datenschutzrecht, 1st ed. 2019, Art. 4 No. 5, para. 50 et seq.

613 Cf. Winter/Battis/Halvani, ZD 2019, 489 (489 f.) with reference to Opinion 05/2014 of the Art. 29 Group, which was based on a very broad definition of inference; cf. also Meents, in: Kaulartz/Braegelmann (eds.), Artificial Intelligence and Machine Learning, 2020, p. 465 marginal no. 1; cf. also Gierschmann, ZD 2021, 482 (482): "Science for determining and assessing anonymity still under development". "Privacy-friendly" methods for training AI models are described by Puschky, ZD-Aktuell 2022, 00019.

614 See Kolain/Grafenauer/Ebers, Anonymity Assessment - A Universal Tool for Measuring Anonymity of Data Sets Under the GDPR with a Special Focus on Smart Robotics, November 24, 2021, Rutgers University Computer & Technology Law Journal 2022: https://ssrn.com/abstract=3971139, 29.

615 Winter/Battis/Halvani, ZD 2019, 489 (490).

616 Cf. in this respect also Thieltges, ZfP 2020, 3 (13 ff.) with the additional reference to the fact that "especially in the mixing of private and professional contexts [keyword: "bring your own device"] the personal reference is immanent".

617 The trained network reacted noticeably differently to information that had already been used for training than to previously unseen test data; for more details, see Winter/Battis/Halvani, ZD 2019, 489 (492).

618 However, this is not in the GDPR, but in Regulation (EU) 2018/1807 of the European Parliament and of the Council of 14 November 2018 on a framework for the free flow of non-personal data in the European Union, OJ L 303/59, which not only explicitly recognises that "the growing Internet of Things, artificial intelligence and machine learning [...] are significant sources of non-personal data". It also states that when "technological developments make it possible to transform anonymised data back into personal data, these data must be treated as personal data".

619 Cf. Roßnagel/Geminn, ZD 2021, 487 with a consideration of Japanese law.


b) Pseudonymisation

Article 4 No. 5 GDPR contains a definition of "pseudonymisation". This means “the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person”. In this respect, Recital 26 clarifies that "[p]ersonal data which have undergone pseudonymisation […] should be considered to be information on an identifiable natural person".620 With regard to the question of identifiability, Recital 26 requires, as noted above, that account be taken of "all objective factors, such as the cost of identification and the time required for identification", "taking into consideration the available technology at the time of the processing and technological developments".
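One common technical realisation of this definition is keyed hashing: direct identifiers are replaced by pseudonyms that can only be recomputed, and thus re-linked, with a secret key – the "additional information" to be kept separately. The following sketch is a simplified illustration under that assumption; real deployments require proper key management:

```python
# Pseudonymisation sketch in the sense of Art. 4 No. 5 GDPR: identifiers are
# replaced by keyed hashes; the key must be stored separately under its own
# technical and organisational measures. Key handling here is simplified.
import hmac
import hashlib

SECRET_KEY = b"kept-separately-under-TOMs"  # assumption: not stored with the data

def pseudonymise(identifier: str) -> str:
    """Replace a direct identifier with a stable, key-dependent pseudonym."""
    return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()[:16]

record = {"employee": "Jane Doe", "sick_days": 4}
record["employee"] = pseudonymise(record["employee"])
print(record)
# Per Recital 26, the result is still personal data: whoever holds the key
# (or can otherwise re-link the pseudonym) can identify the person.
```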

Again, it should be noted that the use of AI significantly increases the chances of "re-identification" - and thus of "overcoming" pseudonymisation.621 For example, it is relatively easy to create profiles, or to augment existing ones, by linking pseudonymised data records with other (possibly also pseudonymised) data. Such an "overlapping"622 of two data sets, neither of which in itself allows conclusions to be drawn about the persons concerned, can therefore quickly lead to a re-identification of these persons.623

620 Critical of the existing regulation Schleipfer, ZD 2020, 284 (291), according to which the GDPR overestimates the potential of pseudonymisation and thereby overlooks even more effective possibilities, which is why it would be desirable if the topic of pseudonymity, including all differentiations, were "intensively discussed" in the context of the upcoming evaluation of the GDPR.

621 See, for example, Sweeney, Simple Demographics Often Identify People Uniquely, Carnegie Mellon University, Data Privacy Working Paper 3, 2000: If data omits, for example, names, social security numbers and addresses, but includes date of birth, gender and postcode, then 87% of the US population can nevertheless be uniquely identified; cf. on the whole also Russell, Stuart/Norvig, Peter: Artificial Intelligence - A Modern Approach, 4th ed., 2022, p. 1166 with further references.

622 Thus Hansen in: Simitis/Hornung/Spiecker gen. Döhmann (ed.), Datenschutzrecht, 1st ed. 2019, Art. 4 No. 5 marginal no. 48.

623 Cf. Hansen in: Simitis/Hornung/Spiecker gen. Döhmann (ed.), Datenschutzrecht, 1st ed. 2019, Art. 4 No. 5 marginal no. 48.


c) Profiling

To protect data subjects, the GDPR addresses a series of specific rules to what is termed profiling. In particular, the aim is to ensure greater transparency in processing.624

Article 4 No. 4 GDPR defines profiling as “any form of automated processing of personal data consisting of the use of personal data to evaluate certain personal aspects relating to a natural person, in particular to analyse or predict aspects concerning that natural person's performance at work, economic situation, health, personal preferences, interests, reliability, behaviour, location or movements”.

Generally speaking, profiling means "gathering information about an individual (or group of individuals) and evaluating their characteristics or behaviour patterns in order to place them into a certain category or group, in particular to analyse and/or make predictions about, for example, their: ability to perform a task; interests; or likely behaviour".625 Particular importance is attached to the characteristic of automated personality assessment alluded to in Article 4 No. 4 GDPR, which in the present context is based on correlations and probabilities without there having to be a causal relationship.626 Profiling is accordingly characterised by the fact that new information and further insights into the personality of the data subject are generated by collecting, linking and analysing individual characteristics.627 The data basis for profiling can be, for example, communication and usage habits (activity in social networks, websites visited, etc.). The instruments of profiling include, in particular, tracking.628
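What Article 4 No. 4 GDPR captures can be made tangible with a deliberately crude sketch: an automated evaluation of tracked usage habits that places an individual employee in a category. Features, weights and the category label are invented for illustration and have no empirical basis:

```python
# Profiling sketch: automated evaluation of personal aspects based on
# correlations, without any causal relationship - the hallmark of profiling.
from dataclasses import dataclass

@dataclass
class UsageProfile:
    late_night_logins: int       # tracked activity, hypothetical feature
    social_media_minutes: int    # tracked activity, hypothetical feature
    job_board_visits: int        # tracked activity, hypothetical feature

def reliability_category(p: UsageProfile) -> str:
    # Invented correlation-based score; no causal claim is made or needed.
    score = (100 - 2 * p.late_night_logins
                 - 0.1 * p.social_media_minutes
                 - 5 * p.job_board_visits)
    return "low flight risk" if score >= 70 else "high flight risk"

print(reliability_category(UsageProfile(3, 120, 4)))  # categorises an individual
```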

It is obvious that AI systems enable profiling: the existence of these systems and the potential availability of Big Data have significantly increased the possibilities for profiling and also allow real-time analysis.

624 Cf. only Scholz, in: Simitis/Hornung/Spiecker gen. Döhmann (ed.), Datenschutzrecht, 1st ed. 2019, Art. 4 No. 4 DSGVO marginal no. 1.

625 Sartor, The impact of the General Data Protection Regulation (GDPR) on artificial intelligence, 2020, p. 39.

626 Scholz, in: Simitis/Hornung/Spiecker gen. Döhmann (ed.), Datenschutzrecht, 1st ed. 2019, Art. 4 No. 4 DSGVO marginal no. 9 with further references.

627 Cf. only Scholz, in: Simitis/Hornung/Spiecker gen. Döhmann (ed.), Datenschutzrecht, 1st ed. 2019, Art. 4 No. 4 DSGVO marginal no. 6.

628 Cf. only Scholz, in: Simitis/Hornung/Spiecker gen. Döhmann (ed.), Datenschutzrecht, 1st ed. 2019, Art. 4 No. 4 DSGVO marginal no. 7 f.


For example, the literature points out that AI systems in the service of insurers are able to determine the likelihood of illness of applicants based on their health records, but also on their habits (e.g. diet or exercise) or social conditions.629 In the present context, it should be noted in particular that the use of people analytics qualifies as profiling if it does not operate at an aggregate level but makes assessments or predictions regarding individual employees.630 AI-based text or speech analysis intended to reveal certain personal or character traits of applicants likewise constitutes profiling.631
