Recent Trends in Generative Artificial Intelligence Litigation in the United States
Tuesday, September 5, 2023

Although still in their infancy, a growing number of recently filed lawsuits concerning generative artificial intelligence (AI) training practices, products, and services have provided a meaningful first look into how US courts may address the privacy, consumer safety, and intellectual property concerns raised by this new and continually evolving technology. The legal theories underlying these claims vary widely, often overlap, and include invasion of privacy and property rights; patent, trademark, and copyright infringement; libel and defamation; and violations of state consumer protection laws, among others.

To date, courts have appeared reluctant to impose liability on AI developers and have expressed skepticism of plaintiffs’ rhetoric around AI’s purported world-ending potential. Courts have also found a number of recent complaints to be lacking in the specific, factual, and technical details necessary to proceed beyond the pleadings stage. This alert aims to provide a snapshot of the current litigation landscape in the rapidly growing field of generative-AI law in the United States.

On 11 July 2023, in J.L. v. Alphabet Inc., a putative class action was filed in the US District Court for the Northern District of California alleging that Google scraped vast amounts of publicly available Internet data, including personal information and copyrighted works, to train its generative AI products without consent.

Similar claims are at issue in Andersen v. Stability AI Ltd., a putative class action brought by visual artists who allege that their copyrighted works were used without permission to train AI image-generation models. At a hearing on the defendants’ motion to dismiss on 19 July 2023, Judge William Orrick expressed skepticism regarding the plaintiffs’ claims, indicating he would tentatively dismiss them. Specifically, he explained that: (1) the images produced by the models are not “substantially similar” to the plaintiffs’ art; and (2) because the models had been trained on “five billion compressed images,” it is “implausible that [plaintiffs’] works are involved” in the creation of those images. Judge Orrick did, however, provide the plaintiffs with an opportunity to amend their complaint “to provide more facts” proving otherwise.6

GitHub, Inc. (GitHub), the well-known online code repository, is also the subject of a putative class action filed in November 2022 in the Northern District of California under the caption Doe v. GitHub, Inc.7

In the motion to dismiss, the defendants argued, among other things, that the plaintiffs could not plausibly allege that any code they individually authored was impermissibly used by Copilot, because the model generates its own unique coding suggestions based on what it learned from reviewing open source code, without copying or reproducing any portion of that code. GitHub also addressed head-on the plaintiffs’ allegation (based on an internal study) that “about 1% of the time” Copilot generated snippets of code similar to the publicly available code from which it learned. Even if this were true, GitHub argued, the plaintiffs could not allege that their own code fell within that 1%. Said differently, the plaintiffs could not connect the dots between their own code and any generated code. Therefore, GitHub explained, none of the plaintiffs in Doe could ever have the requisite standing to pursue any legal claim, copyright-based or otherwise. As detailed below, this may prove a viable defense for generative AI developers in nearly all of the legal challenges they presently face relating to data collection and generative AI model training.

It is interesting to note that the plaintiffs in Doe did not assert any direct claims for copyright infringement, instead relying on a theory premised on the improper handling of copyright management information; namely, that the defendants violated the Digital Millennium Copyright Act by failing to provide the appropriate attribution, copyright notice, or license terms when making use of the plaintiffs’ code. In contrast, on 28 June 2023, two authors, Paul Tremblay and Mona Awad, filed a class action lawsuit against OpenAI in Tremblay v. OpenAI, Inc.,8 asserting direct claims for copyright infringement based on the alleged use of their books to train OpenAI’s large language models.

The New York Times is also reportedly considering legal action against OpenAI over the use of its articles to train generative AI models.

Right of Publicity and Facial Recognition Cases

In contrast to some of the headline-grabbing lawsuits broadly challenging the legality or safety of AI technology as a whole, a few narrower cases involving the use of AI in connection with facial recognition software may end up being better received by courts. The reason, in part, is that these cases can more clearly identify specific individuals who have suffered harm, namely those who have had their faces scanned and analyzed without permission.

For example, in April 2023, in Young v. NeoCortext, Inc., a reality-television personality filed a putative class action against the developer of the Reface face-swapping application, alleging that the app violated California’s right-of-publicity statute by allowing users to swap their faces with those of celebrities in photos and videos without the celebrities’ consent.

Similar claims were asserted in Flora v. Prisma Labs, Inc., in which users of the Lensa photo-editing application alleged that Prisma Labs collected, stored, and used their facial geometry data without the notice and consent required by the Illinois Biometric Information Privacy Act.

Tort Cases

Generative AI developers have also begun to face traditional tort claims, including claims for libel and defamation arising from allegedly false statements generated by their models about real individuals.

POSSIBLE DEFENSES

A variety of defenses have already been effectively asserted by defendants in generative AI litigation. Common themes include lack of standing, reliance on the “fair use” doctrine, and the legality of so-called “data scraping.” The following is a brief summary of the key principles underlying each of these possible defenses, which AI developers may rely on in future litigation.

Lack of Standing

In July 2023, the Seventh Circuit Court of Appeals in Dinerstein v. Google LLC19 affirmed the dismissal of a putative class action brought against Google and the University of Chicago Medical Center (UCMC), holding that the plaintiff lacked standing.

In Dinerstein, the plaintiff alleged that UCMC breached its contractual privacy arrangements with its patients, invaded their privacy, and violated Illinois’ consumer protection statute by using several years of anonymized patient medical records to train an AI model designed to anticipate patients’ future healthcare needs. The US District Court for the Northern District of Illinois dismissed the claims for lack of standing and for failure to state a claim, noting that the plaintiff failed to establish damages associated with the disclosure of the anonymized patient data or the defendants’ tortious intent.

“Fair Use”

Another broad defense that might be successfully pursued by AI developers against any copyright claim is the well-recognized doctrine of “fair use.” Fair use is a defense to claims of infringement when copyrighted material is used in a “transformative” way. Transformative use can occur when copyrighted material is used to serve different market “functions” or expand the “utility” of the copyrighted work. The doctrine appears particularly appropriate for the AI training process, which does not involve the traditionally impermissible copying and commercial reproduction of copyrighted work and, instead, only analyzes copyrighted material to detect patterns in an effort to develop a new “function” or “application”, namely, a large language model or other generative AI product.

To date, no US court has squarely addressed the application of the “fair use” doctrine in the context of generative AI models or AI-generated materials. However, the doctrine has provided a complete defense in similar situations. For example, in Authors Guild v. Google, Inc., the Second Circuit held that Google’s unauthorized digitization of millions of copyrighted books to create a searchable database was a transformative fair use because it served a different purpose and function than the original works.

In response to lawsuits alleging copyright infringement, some AI developers have already suggested the fair use doctrine’s applicability. In its copyright dispute with Thomson Reuters, for example, ROSS expressly asserted this defense, arguing that it used Thomson Reuters’ legal database for an entirely different purpose: to train an AI-powered legal research tool. Some have suggested that the Supreme Court’s decision in Google v. Oracle, which held that Google’s copying of portions of the Java API to create a new platform for smartphone developers was a transformative fair use, may lend further support to the defense in the generative AI context.

GitHub and Microsoft have also argued that the plaintiffs in Doe v. GitHub, Inc. framed their claims under the DMCA, rather than as direct copyright infringement, precisely because the fair use doctrine would likely defeat any infringement claim based on the use of publicly available open source code to train Copilot.

Legality of So-Called “Data Scraping”

Finally, while generative AI developers may have relied on scraping the Internet to develop training datasets for their products, they are far from the first companies to “scrape” the Internet for commercially useful information. In fact, it is a common practice among data science and technology companies. One such company, hiQ Labs, Inc., famously “scraped” information from the publicly available profiles of users of the business-networking site LinkedIn in order to provide employers with data and analysis regarding potential recruits and their job-seeking behaviors. In hiQ Labs, Inc. v. LinkedIn Corp., the Ninth Circuit held that hiQ’s scraping of data from public LinkedIn profiles likely did not violate the federal Computer Fraud and Abuse Act, reasoning that the statute’s prohibition on accessing a computer “without authorization” does not extend to information made freely available to the general public.

AI developers will likely be able to take advantage of the precedent established in hiQ Labs to defend their data collection practices, and can expect the hiQ Labs decision to feature prominently in the forthcoming motions to dismiss in the P.M. and J.L. cases pending in the US District Court for the Northern District of California.

FUTURE TRAJECTORY

While the outcomes of these early generative AI cases are far from certain, preliminary indications suggest that courts are not succumbing to the hype and rhetoric, and are approaching generative AI claims with a healthy level of skepticism. Yet many of the potential defenses remain untested in the context of generative AI. The coming months will be pivotal in setting the tone for generative AI litigation moving forward.

In addition, while the current wave of generative AI litigation continues to work its way through the courts, recent trends suggest that plaintiffs’ attorneys may be eager to expand beyond generative AI developers to target companies that adopt or use generative AI products or solutions. As such, businesses exploring or using generative AI products and services would do well to ensure they understand both the technology they are adopting and the risks associated with its use.

Copyright 2024 K&L Gates