AI Companies Can Use Copyrighted Books to Train Language Models, Judge Rules in Anthropic Case

Anthropic’s Claude AI model gave users answers that were “exceedingly transformative” and did not violate fair-use law, a California judge rules

In a major court victory for artificial intelligence companies on Tuesday, a California federal judge sided with Anthropic, the parent company of the Claude AI chatbot, ruling that copyrighted books can be used to train AI models without consent under fair use law.

Still, Anthropic is not completely cleared — Judge William Alsup said the company will have to face a separate trial over its “piracy” of millions of copyrighted books.

On the first issue of fair use, Judge Alsup ruled that Section 107 of the Copyright Act allowed Anthropic to train Claude using copyrighted books. That is because Claude’s answers, based on the copyrighted material, were “exceedingly transformative”; in other words, they differed substantially from the source material and could not be considered rip-offs.

The judge also ruled fair use law allowed Anthropic to take purchased physical books and scan them into a digital “research library” that can be used to train its models.

“However, Anthropic had no entitlement to use pirated copies” of books for its library, the judge ruled. Altogether, Anthropic pirated “over seven million copies of books,” the judge said in his ruling, by illegally downloading digital copies that company executives knew were “unauthorized copies.”

Anthropic, the judge added, “could have purchased books, but it preferred to steal them to avoid” what cofounder and CEO Dario Amodei said was a legal “slog.”

The lawsuit against Anthropic was filed last year by three authors: Andrea Bartz, Kirk Wallace Johnson and Charles Graeber. Anthropic, the judge said in his ruling on Tuesday, pirated copies of at least two works from each author.

The ruling comes as the issue of fair use and how AI models are trained has come to the forefront in the media and entertainment worlds in recent years. OpenAI, the parent company of ChatGPT, is currently facing a lawsuit from The New York Times, which claims the AI company stole its content to train its model. Hundreds of Hollywood creators, including Ben Stiller, Aubrey Plaza, and Joseph Gordon-Levitt, also recently called on the Trump Administration to push back against OpenAI and other companies that have been lobbying for looser copyright laws.

And earlier this month, Disney sued AI company Midjourney for copyright infringement, saying its model was illegally creating images that were exact copies of characters like Homer Simpson and Elsa from “Frozen.”

