The Copyright Office released a “Pre-publication” version of Part 3 of its Report on Copyright and AI. Coincidentally (?), Shira Perlmutter, the Register of Copyrights, was fired amid a shakeup at the Copyright Office. The Report was also supposed to address infringement issues, but did not. Those issues will now be addressed in a Part 4 of the Report.
A more detailed summary will be provided, but some high-level takeaways are as follows.
- There is no per se rule on whether training AI on copyrighted content is infringement or fair use. It will be a case-by-case analysis.
- Each “use” of copyrighted content needs to be considered. Several stages in the development of generative AI involve using copyrighted works in different ways that implicate the owners’ exclusive rights. Each needs to be considered separately. Different end uses may yield different results.
- The fair use determination requires balancing the multiple statutory factors in light of all relevant circumstances. But the first and fourth factors will often be most significant. Nothing new here.
- Various uses of copyrighted works in AI training are likely to be transformative. The extent to which they are fair, however, will depend on what works were used, from what source, for what purpose, and with what controls on the outputs—all of which can affect the market. Nothing really new here. Whether a work was pirated or legally obtained can be considered.
- The Report indicates that voluntary licensing and extended collective licensing should be considered, with no compulsory schemes or new legislation, at least for now.
- Perhaps the most controversial part of the Report, in my opinion, is the breadth of the fourth fair use factor – the effect of the use upon the potential market for or value of the copyrighted work.
- The Report's section on this factor evaluates different ways in which the use of copyrighted works for generative AI can affect the market for or value of those works, including through lost sales, market dilution, and lost licensing opportunities. It also addresses broader claims that the public benefits of unlicensed training might shift the fair use balance.
- The Office’s view is that the statute on its face encompasses any “effect” upon the potential market. The speed and scale at which AI systems generate content pose a serious risk of diluting markets for works of the same kind as in their training data – not just competition for sales of an author’s works. Market harm can also stem from AI models’ generation of material stylistically similar to works in their training data, even though the Report notes that copyright does not protect style.
- The report makes clear that using certain “guardrails” could reduce the likelihood of a finding of infringement.
- The Report assesses various infringement considerations with the use of Retrieval Augmented Generation (RAG).
- This is not even close to a complete summary of the 113-page Report, but I am working on a more detailed summary.