Italian OpenAI: May (A)I?

by: Diletta De Cicco, James Downes, Lucia Hartnett Cabeza, Malcolm Dowden of Squire Patton Boggs (US) LLP - Privacy World

Tuesday, April 11, 2023

Artificial intelligence (AI) depends on the use of “big data” to create and refine the training models from which the AI “learns”. Although concerns have tended to focus on questions such as inherent bias within the training data, or a lack of information in relation to the way in which the AI’s algorithms operate, the Italian data protection authority (the Garante) has made an order that indicates an even more fundamental difficulty for AI developers – lawfully acquiring the data required to build training models in the first place. If training models cannot be lawfully built and expanded, then the viability of AI might be called into question.

On 30 March 2023, the Garante reached for perhaps the most draconian element of its regulatory toolkit – an order temporarily banning OpenAI LLC (OpenAI), provider of the generative AI service ChatGPT, from processing personal data of individuals who are within Italy. This “stop order” reflects the Garante’s view that urgent measures were required in light of the risks posed both to users of ChatGPT and to individuals whose personal data had been collected and used to build its training models.

Key concerns underpinning the Garante’s finding were that:

OpenAI did not properly inform users, or the individuals whose personal data was collected to train the AI models driving ChatGPT, that their data was being collected;
OpenAI did not identify and communicate a valid lawful basis for collecting personal data to train its algorithm;
ChatGPT processes personal data inaccurately, as the output it provides may not correspond to real facts; and
OpenAI did not implement age verification mechanisms for users, even though, based on its terms, the content ChatGPT generates is intended for users over the age of 13.

Taking those factors together, the Garante found that the processing of personal data to train the AI models constituted a breach of the transparency and fair processing obligations in EU GDPR Articles 5, 6, 8, 13 and 25. The temporary “stop order” was imposed with immediate effect, with the Garante reserving the right either to make the ban permanent or to impose other sanctions depending on the outcome of its full investigation.

Responding to the order, OpenAI blocked people in Italy from accessing ChatGPT while it worked on providing responses within the 20-day deadline set by the Garante.

A fundamental challenge for “big data”?

AI has a voracious appetite for data. The sophistication and reliability of AI depend on the quality and extent of its training data. That data must be obtained from somewhere, and developers often resort to measures such as web scraping, web crawling or text mining to obtain it in large enough quantities.

The Garante’s findings in relation to ChatGPT indicate that where personal data is regarded as being collected directly from individuals, the developer must ensure that they meet their obligations as data controller to identify an appropriate lawful basis, and to provide the information required by GDPR Article 13.

The Garante’s focus on GDPR Article 13 suggests that it considered the personal data to have been collected directly from individual data subjects. It is also possible that techniques such as web scraping would involve the collection of data from sources other than the data subjects themselves. In those cases, the relevant transparency obligations would include provision of the information required by GDPR Article 14. Article 14(5)(b) provides an exception where “the provision of such information proves impossible or would involve a disproportionate effort”. However, other data protection authorities (including Poland’s and the UK’s) have emphasised that “impossible” means “impossible”, and not just extremely difficult or expensive. They have also taken the view that it would not be a “disproportionate effort” to provide information to millions of individuals even if, as in the Polish decision, the costs of doing so would outweigh the revenue and profits hoped for from the processing activities.

Data protection authorities are not inclined to treat the acquisition of personal data for AI training models as a special case, meriting less protection than any other form of personal data acquisition. Acquiring data for training models might attract particularly close scrutiny and even more resolute protection. It is not merely a question of administrative fines or exposure to potential compensation claims; it is a question of whether the processing can go ahead at all.

What now?

In the days since Italy announced its probe, supervisory authorities in France, Germany and Ireland have contacted the Garante to ask for more information on its findings. Other supervisory authorities around the world, such as those in Canada and South Korea, have also launched their own investigations. Privacy activists have joined the pack, with two complaints lodged with the French supervisory authority in relation to OpenAI. Jean-Noël Barrot, the French Digital Minister, has publicly stated that the platform does not respect privacy laws. He did not, however, go as far as suggesting that France should ban it.

In the meantime, in Italy, the Garante has opened a dialogue with OpenAI which, after a meeting with the authority last week, has submitted measures intended to address the issues identified in the stop order. The Garante is now reviewing the documents and information provided, with a new meeting scheduled for 12 April 2023.

May (A)I? Still to be seen …