Under the GDPR, controllers are required to provide information about what personal information they process and how that processing takes place.[1] Training and fine-tuning modern artificial intelligence models typically requires data. If that training data contains personal information, an organization is required to include a description of that processing in its privacy notice.
Related to the question of whether the use of personal data to train an AI must be disclosed in a privacy notice is the question of whether the privacy notice must be distributed to each individual whose information will be included in the training data. Under the GDPR, if the personal information that the organization will use as training data has been collected directly from individuals, those individuals should be provided with a copy of the organization’s privacy notice “at the time when personal data are obtained.”[2] If, on the other hand, the personal information has been collected from a third-party source (e.g., scraped from the internet or received from another controller), the GDPR generally permits the controller to provide a copy of its privacy notice “within a reasonable period” after the data is collected.[3] Furthermore, in the following situations the GDPR does not mandate that a privacy notice be directly provided to individuals:
- Individuals already know the organization’s privacy practices. If a “data subject already has the information” that would be contained within a privacy notice, the company is not required to provide one to them.[4]
- Impossibility. If providing a privacy notice directly to individuals is “impossible,” a company is relieved of the requirement. That said, the GDPR requires that the company “take appropriate measures to protect the data subject’s rights and freedoms and legitimate interests, including making the information publicly available.”[5] In a recent enforcement action, a supervisory authority mandated that a company that used publicly scraped data to train an AI engage in a “non-marketing oriented information campaign,” which included publicizing on “all the main . . . mass media [channels] (including radio, television, newspapers and the Internet)” information about the company’s privacy practices, including where individuals could find the company’s privacy notice.[6] The supervisory authority concluded that publicizing the activities of the AI provider was necessary to protect individuals’ rights and freedoms given the impossibility of directly distributing the organization’s privacy notice.
- Disproportionate effort. If providing a privacy notice “would involve a disproportionate effort,” a company is not required to provide the notice.[7] That said, the GDPR requires that the company “take appropriate measures to protect the data subject’s rights and freedoms and legitimate interests, including making the information publicly available.”[8] As indicated in the previous paragraph, a supervisory authority has determined on at least one occasion that a large information campaign telling individuals where to find the information that must be included in a privacy notice regarding the use of their data to train an AI does not itself involve disproportionate effort.
- Processing cannot be disclosed pursuant to European Union or Member State law. If European Union or Member State law imposes an obligation of secrecy that would prohibit an organization from disclosing the fact that it has processed an individual’s information, the organization is not required to provide individuals with its privacy notice.[9] It is unlikely that this exception would apply to most organizations’ use of personal information as part of training data.
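For teams that track training-data sources in an internal compliance tool, the rules described above can be sketched as a small decision function. This is only an illustrative sketch, not legal advice: the names `Source`, `TrainingDataFacts`, `NoticeOutcome`, and `assess_notice` are hypothetical, each flag stands in for a legal judgment that must be made separately, and the sketch assumes the listed exceptions are evaluated only for data obtained from a third-party source, since the provisions cited for them are in Article 14.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Source(Enum):
    DIRECT = "collected directly from the individual"
    THIRD_PARTY = "obtained from a third-party source"


@dataclass
class TrainingDataFacts:
    """Hypothetical inputs; each flag represents a separate legal conclusion."""
    source: Source
    subject_already_informed: bool = False
    direct_notice_impossible: bool = False
    disproportionate_effort: bool = False
    secrecy_obligation: bool = False


@dataclass
class NoticeOutcome:
    direct_notice_required: bool
    timing: Optional[str]           # quoted timing language, if notice is required
    public_measures_required: bool  # e.g., making the information publicly available
    basis: str                      # GDPR provision cited in the text above


def assess_notice(facts: TrainingDataFacts) -> NoticeOutcome:
    # Data collected directly: notice is due when the data are obtained.
    if facts.source is Source.DIRECT:
        return NoticeOutcome(True, "at the time when personal data are obtained",
                             False, "Article 13(1)")
    # Assumption: the exceptions below are checked only for third-party data.
    if facts.subject_already_informed:
        return NoticeOutcome(False, None, False, "Article 14(5)(a)")
    if facts.secrecy_obligation:
        return NoticeOutcome(False, None, False, "Article 14(5)(d)")
    if facts.direct_notice_impossible or facts.disproportionate_effort:
        # Direct notice is excused, but appropriate protective measures remain.
        return NoticeOutcome(False, None, True, "Article 14(5)(b)")
    # Default for third-party data: notice within a reasonable period.
    return NoticeOutcome(True, "within a reasonable period", False,
                         "Article 14(3)(a)")


if __name__ == "__main__":
    scraped = TrainingDataFacts(Source.THIRD_PARTY, direct_notice_impossible=True)
    print(assess_notice(scraped))
```

The ordering of the checks is a design choice for readability only; it does not imply any legal priority among the exceptions.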
[1] EDPB-EDPS Joint Opinion 5/2021 on the proposal for a Regulation of the European Parliament and of the Council laying down harmonized rules on artificial intelligence (Artificial Intelligence Act) at para. 60 (June 18, 2021) (stating that data subjects should be informed when their data is used for AI training).
[2] GDPR, Article 13(1).
[3] GDPR, Article 14(3)(a).
[4] GDPR, Article 14(5)(a).
[5] GDPR, Article 14(5)(b).
[6] Garante Per La Protezione Dei Dati Personali, Provision of April 11, 2023 [9874702] (English translation).
[7] GDPR, Article 14(5)(b).
[8] GDPR, Article 14(5)(b).
[9] GDPR, Article 14(5)(d).