Recently, two federal judges -- Henry Wingate in Mississippi and Julien Neals in New Jersey -- quietly withdrew major rulings. Counsel had flagged glaring factual errors: misnamed parties, phantom declarations, fabricated quotations.
The courts offered no explanation. But the fingerprints of generative AI are hard to miss.
For the past two years, headlines have focused on lawyers who copy-pasted ChatGPT outputs into their filings, often with comical or catastrophic results.
But this time, the hallucinations didn’t come from a brief. They came from the bench. Which means the integrity of the judicial record is now at risk not just from what lawyers submit, but from what judges sign.
So what should the courts do?
There are only three options: abstinence, bureaucracy, and better tools.
The first two are tempting, but dangerous. The only viable path is building better tools.
First, abstinence: prohibit AI entirely. Critics warn that AI risks corrupting judicial legitimacy. Setting aside that “just say no” rarely works, these critics make a more fundamental error: they’ve mistaken the symptom for the cause.
Hallucinated citations aren’t the crisis -- they’re evidence of one. Judges aren’t turning to ChatGPT out of laziness. They’re turning to it because they’re drowning: 66 million filings a year -- about 125 filings every minute, around the clock -- shrinking staff, unrelenting deadlines, and dockets that demand both expertise and speed.
In this system, backlogs grow, hearings are delayed, and litigants lose faith that anyone is truly listening.
That breakdown -- not AI -- is part of what is driving the collapse in public trust. Since 2020, confidence in the courts has dropped from 59% to 35% -- one of the steepest declines Gallup has ever measured, steeper even than in some authoritarian states.
The scandal isn’t a fake quote. It’s the system that made a judge rely on a chatbot in the first place.
If we care about legitimacy, we must care about capacity. And if we care about capacity, abstaining from the technology that gives judges their best chance of catching their breath is not an option at all.
The second option is bureaucracy, which offers the comforting illusion of control. Policies, guidelines, and disclaimers satisfy a familiar instinct: if the tool is risky, regulate the user.
But this approach rests on the incorrect assumption that governance can compensate for unfit tools. It can’t.
Consumer chatbots like ChatGPT are not just error-prone. They’re deceptively error-prone. They don’t spew nonsense; they generate citations that sound plausible, quotes that feel familiar, authorities that glide seamlessly into the legal argument -- until they collapse under scrutiny.
That’s not a misuse problem. That’s a design problem. Bureaucracy offers false reassurance, papering over errors until it’s too late.
Worse, bureaucracy shifts the burden onto judges to verify quotes, trace sources, and double-check their law clerks’ work. But for a system already underwater, more paperwork doesn’t mean safer use of AI. It means more pressure at a moment when judges can least absorb it.
That leaves the third option: building better tools. Bans and rules do not solve the underlying problem, and may even exacerbate it. The good news is that better tools already exist -- and they’re spreading.
Across courts nationwide, judges and clerks are quietly adopting systems that function less like chatbots and more like junior clerks, mapping claims to legal elements, linking elements to facts, and grounding every inference in controlling law.
What sets these tools apart is discipline. They build the key aspects of the rule of law into their software: they are designed to be neutral, reasoning like a judge rather than an advocate; correct, with no hallucinated cases or doctrinal misstatements; faithful, surfacing the right law and framing the real issue; and transparent, with every step traceable and open to challenge. A tool meant to serve the rule of law should reflect it.
Recent AI failures in the courtroom shouldn’t trigger retreat, but reform. The problem isn’t that judges used AI; it’s that they used the wrong kind. We don’t need to ban machines from the courthouse. We need to bring in the machines that belong there. The sooner courts begin testing and adopting AI built for judges, the sooner those tools will be equal to the tasks we ask of them.
If we get legal AI right, the payoff is profound: faster triage, earlier error detection, less delay, and more human attention for what machines can’t do -- scrutinizing testimony, weighing equities, rendering judgment.
The rule of law does not forbid the use of AI. It constrains it. And it is only by submitting our tools to those constraints that we can justify their presence in the courtroom. The goal is not to automate judgment, but to protect it.
All of the views and opinions expressed in this article are those of the author and not necessarily those of The National Law Review.