The rise of large language models (LLMs) such as ChatGPT has created novel legal questions surrounding the development and use of such artificial intelligence (AI) systems. One of the most closely watched AI cases today is New York Times Co. v. Microsoft Corp., No. 1:23-cv-11195 (S.D.N.Y. filed Dec. 27, 2023), in which the New York Times (NYT) alleges that OpenAI, the developer of ChatGPT, impermissibly used NYT-copyrighted works to train the ChatGPT LLM. Though the case centers on questions of intellectual property, a recent development has raised significant data privacy concerns as well.
The Preservation Order
In the course of the ongoing litigation, NYT asserted that ChatGPT user data, if preserved, could contain evidence supporting its position. In a May 13, 2025, preservation order, U.S. Magistrate Judge Wang of the Southern District of New York agreed with NYT and instructed OpenAI “to preserve and segregate all output log data that would otherwise be deleted on a going forward basis until further order of the Court.”
This is a sweeping demand, as ChatGPT receives and hosts a vast volume of data. ChatGPT receives over 4.5 billion visits per month, and the LLM fields approximately 2.5 billion prompts each day. In its opposition, OpenAI asserted that the order would require the retention of 60 billion conversations that would be nearly impossible to search, adding that less than 0.010% of the data would be relevant to NYT’s copyright claims.
Moreover, OpenAI explained that ChatGPT users expect their deleted chats to be unavailable. Indeed, prior to the preservation order, OpenAI had touted its data retention policies: when a user deletes a chat, it is removed from the user’s account immediately and scheduled for permanent deletion from OpenAI’s systems within 30 days (absent a legal or security reason to preserve it). However, after a late June hearing, U.S. District Judge Stein denied OpenAI’s objections to Magistrate Judge Wang’s preservation order, concluding that OpenAI’s terms of use allow data to be preserved for legal requests, a category into which this case falls.
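To illustrate how such a policy interacts with a legal hold, consider a minimal sketch of a deletion pipeline. The schema and function names below are hypothetical, not OpenAI’s actual implementation: a chat is purged only after the retention window elapses, and any active hold suspends deletion until it is released.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

RETENTION_WINDOW = timedelta(days=30)  # e.g., a 30-day window before permanent deletion


@dataclass
class ChatRecord:
    chat_id: str
    deleted_at: datetime | None = None  # when the user deleted the chat, if ever
    legal_holds: set[str] = field(default_factory=set)  # IDs of active holds


def is_purgeable(record: ChatRecord, now: datetime) -> bool:
    """Eligible for permanent deletion only if the user deleted the chat,
    the retention window has elapsed, and no legal or security hold applies."""
    if record.deleted_at is None or record.legal_holds:
        return False
    return now - record.deleted_at >= RETENTION_WINDOW


def apply_hold(records: list[ChatRecord], hold_id: str) -> None:
    """A preservation order is modeled as a hold on every record in scope;
    deletion is suspended until the hold is released."""
    for record in records:
        record.legal_holds.add(hold_id)


# Example: a chat deleted 45 days ago would normally be purged,
# but an active hold keeps it retained.
chat = ChatRecord("c-1", deleted_at=datetime.now(timezone.utc) - timedelta(days=45))
assert is_purgeable(chat, datetime.now(timezone.utc))
apply_hold([chat], "preservation-order-2025")
assert not is_purgeable(chat, datetime.now(timezone.utc))
```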
The preservation order applies to ChatGPT Free (along with some other versions) but not to ChatGPT Enterprise. This means that the data entered by individuals using the free version of ChatGPT might be subject to NYT’s review, while data from organizations using ChatGPT Enterprise falls outside the order’s scope. Still, if an employee has used a personal version of ChatGPT to input any employer information, that information could now be subject to the order as well.
What’s Next?
OpenAI continues to raise privacy concerns in response to the preservation order. In a June tweet, OpenAI CEO Sam Altman floated the concept of an “AI privilege,” suggesting that AI conversations should receive the same protections as conversations with lawyers and medical providers. Of course, this is not a legally recognized privilege, and OpenAI has not raised it in any legal briefs. Even if the company did, it is unlikely any court would be willing to create a new category of privilege for generative AI interactions.
While many people following the case are alarmed by broader constitutional privacy concerns, it is unlikely that all 60 billion conversations subject to the preservation order will become available to the public at large. For now, NYT’s lawyers are expected to gain access to and begin searching OpenAI’s logs in support of the copyright case.
OpenAI will surely bolster its security practices in response to the preservation order, but the retention of data that would otherwise have been deleted is a heightened risk in itself.
Considerations for Organizations
From a data governance perspective, this preservation order raises several considerations for organizations:
- Review your own internal data retention clauses – OpenAI’s terms of use state: “We may share your Personal Data, including information about your interaction with our Services, with government authorities, industry peers, or other third parties in compliance with the law…if required to do so to comply with a legal obligation…”
Such language is common in businesses’ privacy policies and terms of use. Companies often need data retention clauses that carve out legal exceptions for certain situations, but this case demonstrates how such clauses can be turned on their head.
Organizations should be aware that “maintaining data for legal purposes” can also include court orders such as the OpenAI preservation order, which in OpenAI’s case ended up running counter to its privacy promises to its users.
- Segregate data where technically feasible – Organizations should segregate their data into distinct buckets based on the data’s sensitivity and/or purpose (see the sketch after this list). For OpenAI, one hurdle in opposing the preservation order was that its vast volume of output log data was not flagged in any manner, so the company could not determine which logs were relevant to the matter.
It is quite possible that OpenAI’s assertion is correct and only a small fraction of the data is relevant to NYT’s case, yet the court had no viable means of making that determination. If organizations separate or flag data based on sensitivity and usage, they can isolate the data relevant to a specific issue rather than having to sweep in all organizational data for every issue that arises.
- Evaluate your vendor contracts – Though ChatGPT Enterprise user data is not affected by this preservation order, the matter is a reminder for all organizations to review their vendor contracts. Businesses might consider zero-data-retention agreements for certain vendors so that those vendors do not store data – even for a legal purpose – after it has served its originally intended purpose. As a general matter, data minimization limits the likelihood of information exposure.
- Further raise employee awareness – Organizations should remind their employees that, because of the preservation order, deleted personal ChatGPT conversations are being retained rather than permanently purged after 30 days, and that even “temporary chats,” which would ordinarily not be saved, are being preserved.
That means any personal ChatGPT input could become part of discovery in this matter. Employees should exercise heightened caution before entering any proprietary, confidential, or otherwise sensitive business information into tools such as ChatGPT.
- Establish an AI governance program – It is also prudent for organizations’ legal and IT teams to understand AI use across the organization. Often, individuals or departments use AI tools without legal or IT knowing. Usually this is not nefarious; people are simply unaware of the risks of AI tools and of the need for organizational visibility into their usage.
Once businesses understand which departments are using which AI tools, they can circulate an AI use questionnaire to encourage responsible and informed use of those tools. Employee AI use is all but inevitable today, but a strong AI governance program and institutionalized policies can increase employee awareness and serve as an organizational safeguard against AI-related risks.
The reality is that no business can entirely prevent unauthorized AI use, but with a robust governance program and related AI policies, an organization can at least train its staff at the individual level and manage the risk holistically at the organizational level.
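As a concrete illustration of the segregation point above, output logs can be tagged at write time with sensitivity and purpose labels so that a preservation order or discovery request can be scoped to a narrow bucket rather than the entire corpus. The labels, bucket layout, and function names below are hypothetical, a sketch rather than any vendor’s actual design:

```python
from collections import defaultdict
from enum import Enum


class Sensitivity(Enum):
    PUBLIC = "public"
    INTERNAL = "internal"
    CONFIDENTIAL = "confidential"


# Buckets keyed by (sensitivity, purpose). In production these might be
# separate tables, indices, or storage accounts rather than in-memory lists.
_buckets: dict[tuple[Sensitivity, str], list[dict]] = defaultdict(list)


def store_log(entry: dict, sensitivity: Sensitivity, purpose: str) -> None:
    """Tag and route each log entry at ingestion time, so later legal or
    security review can target a small, well-defined slice of the data."""
    _buckets[(sensitivity, purpose)].append(entry)


def records_in_scope(sensitivity: Sensitivity, purpose: str) -> list[dict]:
    """Return only the bucket relevant to a given matter, instead of
    forcing a search across all retained data."""
    return list(_buckets[(sensitivity, purpose)])


# Example: only the confidential legal bucket is pulled for review,
# leaving unrelated logs out of scope.
store_log({"prompt": "draft a press release"}, Sensitivity.PUBLIC, "marketing")
store_log({"prompt": "summarize this contract"}, Sensitivity.CONFIDENTIAL, "legal")
relevant = records_in_scope(Sensitivity.CONFIDENTIAL, "legal")  # one record, not two
```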