The Threat Of Extreme Statutory Damages For Copyright Almost Certainly Made Anthropic Settle With Authors

from the can't-risk-it dept

In what may be the least surprising news in the world of copyright and the internet, Anthropic just agreed to settle the copyright lawsuit that everyone’s been watching, but not for the reasons most people think. This isn’t about AI training being found to infringe copyright—in fact, Anthropic won on that issue. Instead, it’s about how copyright’s broken statutory damages system can turn a narrow legal loss into a company-ending threat, forcing settlements even when the core dispute goes your way.

Anthropic had done something remarkably stupid beyond just training: they downloaded unauthorized copies of works and stored them in an internal “pirate library” for future reference. Judge Alsup was crystal clear that while the training itself was fair use, building and maintaining this library of unauthorized copies was straightforward infringement. This wasn’t some edge case—it was basic copyright violation that Anthropic should have known better than to engage in.

And while there were some defenses to this, it would likely be tough to succeed at trial with the position Judge Alsup had put them in.

The question then was about liability. Because of copyright’s absolutely ridiculous statutory damages (up to $150k per work if the infringement was found to be “willful”), which need not bear any relationship to the actual damages, Anthropic could have been on the hook for more than a trillion dollars in damages just in this one case. That’s not something any company is going to roll the dice on, and I’m sure that the conversation was more or less: if you win and we get hit with statutory damages, the company will shut down and you will get nothing. Instead, let’s come to some sort of deal and get the lawyers (and the named author plaintiffs) paid.

While the amount of the settlement hasn’t been revealed yet, the amount authors get paid is going to come out eventually, and… I guarantee that it will not be much. The expected “class” in this case included about 7 million works. If you went absolutely crazy and said that the settlement was $1 billion (which I highly doubt it would be), and the lawyers take half of that, that means authors would receive… about $70 per work? And if I had to guess, the amount will be significantly below that.
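For anyone who wants to check the back-of-the-envelope numbers, here is a rough sketch. It uses only the figures mentioned in this post (roughly 7 million works, the $150k willful maximum, a hypothetical $1 billion settlement, and an assumed 50% share for the lawyers). The variable and function names are just illustrative, and none of these inputs are confirmed settlement terms.

```python
# Rough arithmetic only. Every input is either a figure quoted in the post
# or an explicitly hypothetical assumption, not a confirmed settlement term.
WORKS_IN_CLASS = 7_000_000              # approximate class size from the post
MAX_WILLFUL_DAMAGES_PER_WORK = 150_000  # statutory ceiling for willful infringement

HYPOTHETICAL_SETTLEMENT = 1_000_000_000  # the "absolutely crazy" $1 billion guess
ASSUMED_LAWYER_SHARE = 0.5               # assumption: lawyers take half


def max_statutory_exposure(works: int, per_work: int) -> int:
    """Worst case: every work in the class is awarded the willful maximum."""
    return works * per_work


def payout_per_work(settlement: int, lawyer_share: float, works: int) -> float:
    """What is left per work after the lawyers' cut, split evenly across works."""
    return settlement * (1 - lawyer_share) / works


exposure = max_statutory_exposure(WORKS_IN_CLASS, MAX_WILLFUL_DAMAGES_PER_WORK)
per_work = payout_per_work(HYPOTHETICAL_SETTLEMENT, ASSUMED_LAWYER_SHARE, WORKS_IN_CLASS)

print(f"Maximum statutory exposure: ${exposure:,}")
print(f"Hypothetical payout per work: ${per_work:,.2f}")
```

The gap between those two numbers is the whole story: the worst-case exposure is about four orders of magnitude larger than what any individual work would plausibly receive, even under a very generous settlement guess.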

This pathetic payout perfectly illustrates why the whole “we need licensing for AI training” argument is economically illiterate. Even setting aside that Judge Alsup already said training is fair use, the math just doesn’t work. When you’re dealing with millions of works, each individual piece contributes essentially nothing to the final model. The transaction costs alone would eat whatever pennies authors might receive. The only winners in any licensing scheme would be the collection societies and lawyers skimming off the top.

Instead what will happen, as always happens with these collective licensing deals, is that a few of the bigger names will get wealthy, but mainly the middlemen will get wealthy. These kinds of schemes only tend to enrich the middlemen (often leading to corruption).

So this result is hardly surprising. Anthropic had to settle rather than face shutting down. But my guess is that authors are going to be incredibly disappointed by how much they end up getting from the settlement. Judge Alsup still has to approve the settlement, and some people may protest it, but it would be a much bigger surprise if he somehow rejected it.

Companies: anthropic


Comments on “The Threat Of Extreme Statutory Damages For Copyright Almost Certainly Made Anthropic Settle With Authors”

39 Comments
Anonymous Coward says:

When you’re dealing with millions of works, each individual piece contributes essentially nothing to the final model.

I don’t think this is actually true, because some works appear many times over. When ChatGPT was first made available to the public, I tried asking it to write a parody of Hallelujah about an evil capitalist. Its response was the lyrics of Hallelujah, the original song, word for word. Most likely, those copyrighted lyrics are written in thousands of places in the training set: there are so many websites for song lyrics, many of which are themselves infringing. If there weren’t, the model wouldn’t have them memorised.

I think it’s beyond argument that a website which, upon request, can serve an exact copy of a copyrighted work, must be subject to copyright law. Fair use can still be a defense within the law, but there is no defense of “but it’s a Rube Goldberg machine”, and no defense of “but we copied millions of other works too!”

Anonymous Coward says:

Re:

When ChatGPT was first made available to the public, I tried asking it to write a parody of Hallelujah about an evil capitalist. Its response was the lyrics of Hallelujah, the original song, word for word.

Really? That’s not the response I just got. Instead, I got a result that was about cash grabs, destruction of the environment and evil CEOs that could be sung to the exact same tune as the original, and the only reason I haven’t posted the result is that I couldn’t copy it (I couldn’t even highlight it). Are you sure you’re not shilling for an evil copyright maximalist?

Anonymous Coward says:

Re:

When ChatGPT was first made available to the public, I tried asking it to write a parody of Hallelujah about an evil capitalist. Its response was the lyrics of Hallelujah, the original song, word for word.

How many years ago was that? Because I just got a very different result, which is as follows:

Write a parody of Hallelujah about an evil capitalist

Sure! Here’s a parody of “Hallelujah” by Leonard Cohen, rewritten to satirize a greedy, evil capitalist. The tone is tongue-in-cheek, blending dark humor and social commentary. It maintains the original song’s cadence and structure, so you could technically sing it to the same tune.

“Cash-elujah” (Parody of Hallelujah)
Lyrics by ChatGPT

[Verse 1]
I heard there was a secret scheme
To build a mall and kill the stream
But you don’t really care for nature, do ya?
It goes like this — the fourth, the fifth
The pink slip falls, the rich man lifts
The baffled clerk just mumbles, “Cash-elujah”

[Chorus]
Cash-elujah, Cash-elujah
Cash-elujah, Cash-eluuuuuujah

[Verse 2]
Your stocks were strong, but you needed more
So you fired folks by the office door
Their tears just upped your quarterly value
You broke their backs for a bonus check
Then bought a yacht and broke its deck
And whispered to your goldfish, “Cash-elujah”

[Chorus]
Cash-elujah, Cash-elujah
Cash-elujah, Cash-eluuuuuujah

[Verse 3]
You say, “The market’s free, so let it reign!”
While lobbyists grease up your gravy train
You turned compassion into dry accounting
And every time a worker pleads,
You plug your ears and raise your fees
Your heart is just a ledger, Cash-elujah

[Chorus]
Cash-elujah, Cash-elujah
Cash-elujah, Cash-eluuuuuujah

[Verse 4]
Maybe there’s a wealth above
But all you preach is “Eat or shove”
You tithe to banks and not to hallelujah
And in the boardroom, cold and slick
You sell the world, then raise the rent
And crown yourself the king of Cash-elujah

[Final Chorus]
Cash-elujah, Cash-elujah
Cash-elujah, Cash-eluuuuuujah
(Repeat with ever-increasing smugness)

Let me know if you’d like a version geared more toward a specific industry (tech bro, oil tycoon, etc.) or set to a different tone (more absurdist, darker, or more comedic).

Anonymous Coward says:

Re: Re:

Nevertheless, there is evidence that Shakespeare would have appreciated copyright, in that he gave each of his actors only the lines they spoke rather than full scripts so his works couldn’t be played in other theaters, and other works were either artworks produced solely for patrons or limited print runs of books. Then copyright laws were enacted, and books, art, plays and music became available for consumption by everybody. So without defending the current regime of copyright maximalism, I will say that (reasonable) copyright laws helped with education of the masses as opposed to just the elite having access to careers in banking, law, etc.

Anonymous Coward says:

Re:

Copyright exists to protect tax revenue and jobs

No, not actually. “To promote the progress of science and useful arts…”

as well as the hard work of people whose work is downloaded, indicating it has value.

The amount or difficulty of work doesn’t translate to copyrightability. I can take a picture in a split second and I own the copyright. You can paint a complex painting for three years and we both get copyright protection. Amount of effort or time doesn’t factor in. This has literally been rejected as an argument in US courts.

Don’t like it? Don’t steal it.

You can’t steal a copyrighted work unless you’re stealing a physical object and depriving the owner of it. Downloading a copy is copyright violation, not theft.

Without an incentive to create you’ll wind up with AI slop and little more online.

Definitely not. The human drive to create existed for millennia before copyright was a thing. And many people create works and release them in the public domain or through creative commons zero licenses or free commercial licenses without expectation of compensation.

Copyright is now mostly a tool for large corporations to pool valuable IP and exploit it as much as possible, charging rent for works which the people who profit the most from it didn’t contribute to, depriving the actual creators of the bulk of the compensation for the bulk of their work.

Arianity (profile) says:

Re: Re:

Companies who profit off copyright have said essentially the same thing about culture in general for years, and it’s never been true. Creative people don’t need an incentive to create; they just want to.

It’s never black and white, it won’t be just slop, but the incentives do matter. Incentives matter for the health, quality, and scale. Also kind of depends on what you’re creating; there’s stuff under copyright that is utilitarian, not an expressive outlet.

It’s also not just the companies, but the artists themselves, too. Successful ones (some with copyright companies they fight with), and unsuccessful ones.

Strawb (profile) says:

Re: Re: Re:

It’s never black and white, it won’t be just slop, but the incentives do matter.

For livelihoods, sure, but I have yet to hear an artist, of any kind, say that they would stop creating things if they couldn’t make money from it.

Humans made art and culture thousands and thousands of years before currency was a thing.

MrWilson (profile) says:

Re: Re: Re:4

I realized I couldn’t make money off my writing, but I enjoy it more because I’m not desperately trying to output quantity to survive, so I can have a day job and output quality over a longer period of time at my own pace. Doing what you love as a job that you need to succeed at in order to eat can ruin what you love. And success seems to have ruined a lot of artists. You get people like James Patterson who write outlines and then outsource the actual writing to other writers.

Anonymous Coward says:

Re: Re: Re:2

“For livelihoods” isn’t nothing.

An artist who has to work 40 hours a week in a non-creative job is necessarily going to have less time to create; one who can survive off the proceeds of their creative work can actually make that their career and devote more time and attention to it.

MrWilson (profile) says:

Re: Re: Re:2

And honestly I have seen a lot of art that has been made by people who are primarily interested in making money off of it and a lot of it ends up being low effort that they just bullshit meaning for in unengaging ways and are just desperate to find some sucker to buy into it or a wealthy person to launder money through its purchase.

Arianity (profile) says:

Re: Re: Re:2

but I have yet to hear an artist, of any kind, say that they would stop creating things if they couldn’t make money from it.

Anecdotal, but I’ve seen it happen quite a bit. It’s very common for, say, actors. It’s especially public with creators on platforms like YouTube, Twitch, etc.: they try to make it work, and when it doesn’t, they eventually stop posting, never to be heard from again. Same for indie game devs who never get to finish their hobby game (vice versa, when the financials work out to go full time, their output improves significantly).

But more commonly, it’s a reduction. Creating things for a few hours after work a few nights a week isn’t the same as someone who can devote their 9-5 (or more) to it. It’s not no creation, but it’s still a big reduction in potential, even for an artist of the same intrinsic talent. That’s hundreds to thousands of hours of practice/output, throwing ideas around, etc that are lost.

Humans made art and culture thousands and thousands of years before currency was a thing.

Sure, but the scale is different. Like I said, not black and white. But that doesn’t mean nothing is lost, either.

Strawb (profile) says:

Re: Re: Re:3

Anecdotal, but I’ve seen it happen quite a bit. It’s very common for say, actors.

Fair point. Acting is a creative pursuit that tends to require more time than other ones, given that you can’t just do it “off the clock”.

It’s especially public with creators on platforms like YouTube, Twitch, etc.: they try to make it work, and when it doesn’t, they eventually stop posting, never to be heard from again. Same for indie game devs who never get to finish their hobby game

Again, not distributing != not creating.

This comment has been deemed insightful by the community.
Anonymous Coward says:

Re: Re: Re:

Incentives matter for the health, quality, and scale.

But incentives aren’t only financial. I write song parodies not because I make anything off them (I actually don’t make a bean), but because other people find them funny, so my incentive is audience engagement. Don’t forget that type of incentive.

Crafty Coyote says:

And these high damages could be bad for the plaintiff- because there’s a good chance that you charge them more than they have or could reasonably be expected to make in a lifetime, if there’s multiple charges.

I don’t think courts would allow for someone to agree to indenture to repay someone; the 13th Amendment forbids that. The only way a guilty defendant could pay is to set up a prison sentence for those who fall into arrears for copyright violations, similar to how nonpayment of income tax is sometimes an arrestable offense.

Anonymous Coward says:

If you went absolutely crazy and said that the settlement was $1 billion (which I highly doubt it would be), and the lawyers take half of that, that means authors would receive… about $70 per work?

$70 per work is around 2-5 times the value I’d expect to see for the vast majority of the works in question (anecdotally as someone who buys a lot of books). That is a very sane amount for copyright violation damages, and frankly something which I think the law should be changed to reflect as a default.

While I dislike much of the path that led to this point, such a conclusion would be very fair if it were to happen.

I do agree that the middlemen here will likely ensure that the actual agreement is significantly lower than that.

MrWilson (profile) says:

This was something I predicted a few months ago. There will be settlements. There will be licensing deals. The sea of authors whose works got copied into an LLM trained on millions of sources (or more), including mine, will not benefit from any settlement or license. And there will be no ethical victory in showing LLM developers that they have to license the works or face consequences. It will just mean that wealthy LLM developers will license works for pennies each and poor LLM developers will have to pirate and risk it or only train on public domain works, which means only wealthy LLM developers (backed by billionaires, vulture capitalists, techbros, and authoritarian governments) will be highly functional and those are the ones that will be used to control your medical records, your bank accounting, your election systems, your grandchild’s education, your news exposure, your social media algorithm, etc.

n00bdragon (profile) says:

I dunno. If someone hit me with a lawsuit arguing for a trillion dollars in damages I might just say “okay” and write them a check with a lot of zeroes on it or pay them in literal Monopoly money or something. It’s a request so completely ridiculous that it deserves a ridiculous response. They obviously can’t collect it. They can’t even collect a fraction of 1% of it because it doesn’t exist, even if you dissolved the company and sold the office furniture for firewood. And like… who would they even sell it to? They can’t sell the company because its debts are worth more than the GDP of half the world’s nations put together. Just demand that they hit you with the full force of those statutory damages because reality will cause their demand to fall apart and force them to reckon with the actual harm done.

Arianity (profile) says:

Because of copyright’s absolutely ridiculous statutory damages (up to $150k per work if the infringement was found to be “willful”), which need not bear any relationship to the actual damages,

$150k for a company knowingly breaking the law is pretty reasonable. We often complain about laws lacking teeth, and companies treating fines as a cost of doing business. A company deliberately building a ‘pirate library’ should be a company-ending threat. Especially when your CEO is running around saying he preferred to steal them to avoid “legal/practice/business slog,” as cofounder and chief executive officer Dario Amodei put it.

Even setting aside that Judge Alsup already said training is fair use, the math just doesn’t work.

The onus of that shouldn’t fall on the authors. Sometimes the cost of something doesn’t pencil out. That’s fine, and it happens all the time.

If you went absolutely crazy and said that the settlement was $1 billion (which I highly doubt it would be), and the lawyers take half of that,

You could get those transaction costs significantly lower if you didn’t force authors to go through court to get it. Like order of magnitude lower.

$1 billion also seems quite arbitrary for a company that is valued at $170b, and raised more than $3b this year alone. It probably won’t be higher, because of norms, but you could go significantly ‘crazier’. Although, it is a bit funny, since $70… works out to roughly the cost of just paying for a few copies of an actual book. It literally could’ve just paid for them, instead of moving fast and breaking things.

Instead what will happen—what always happens with these collective licensing deals—is that a few of the bigger names

Which is more than the zero Anthropic was hoping for, or any other reasonable proposed alternative has come up with. The alternative right now is multi-billion dollar AI companies sitting on the money.

These kinds of schemes only tend to enrich the middlemen (often leading to corruption).

This isn’t sufficient by itself, but that doesn’t mean it isn’t a small step in the right direction. A better world is possible.

Arianity (profile) says:

Re: Re:

You forgot that copyfraud can be a crime too, didn’t you?

Nope, it just wasn’t relevant to mention.

You’ll be happy to know that I’m for both much stricter penalties and enforcement for copyright abuse and/or fraud, as well. (As well as reforms that make things like false DMCA claims harder to do in the first place).

Arianity (profile) says:

Re: Re: Re:2

Except that temporary copies have an implied license under US law, so any copyright claim in respect of these is copyfraud. Why is that irrelevant, maximalist shill?

a) These weren’t temporary copies. To quote the judge’s ruling: All the foregoing was done to amass a central library of “all the books in the world” to retain “forever.”

b) No they don’t, except under very limited circumstances that wouldn’t apply here even if they weren’t forever copies.

There’s a reason it’s not mentioned in the lawsuit (including as a defense by Anthropic), by the judge, or by Mike. It’s not even mentioned in other comments (including by you). But yes, you’re clearly a very clever boy that solved this problem in a way that no one else saw.

Dister (profile) says:

Re:

$150k for a company knowingly breaking the law is pretty reasonable. We often complain about laws lacking teeth, and companies treating fines as a cost of doing business. A company deliberately building a ‘pirate library’ should be a company-ending threat. Especially when your CEO is running around saying he preferred to steal them to avoid “legal/practice/business slog,” as cofounder and chief executive officer Dario Amodei put it.

One hundred percent. I get that Anthropic does a lot of R&D developing the models themselves and the framework for implementing them, but at the end of the day, the differentiator between AI solutions is, in very large part (not all, but certainly a very significant part), the training data. So while Mike wants to spin this as “turning a narrow legal loss into a company-ending threat,” the offense is a significant driver of the company value in the first place. A ruling can be narrow in its legal applicability, as it was here, but extremely significant in its market applicability.

A pirate library of unlicensed works is indeed extremely significant in the market. Training data is very expensive and very valuable. Access to quality data is almost the entirety of Google’s business model. So stealing that data from copyrighted works is not a “narrow loss.” It is fundamental, both to the copyright holder who derived value from the integrity of the copyrighted work, and to the infringing entity that builds an entire business on the back of that data.

The works themselves, both in the aggregate and individually, are what give these models their value. Claude would be useless without training, and would be barely better than useless unless trained on quality data. And these companies absolutely pick works of authorship that contribute to the quality of the training set. So even if one book is of minor value to the training set, it is still at least some value. We cannot just be running around saying because some particular author would not get what we would consider enough money in return for their work, that there should be no guardrails at all on taking their work and taking its value. Especially when that value is a significant driver of the value for the infringer, as it is here. In fact, to say otherwise presents this somewhat bizarre scenario where a potential infringer should infringe a lot to decrease the incremental value of each infringement, thereby escaping responsibility for any infringement. I am not sure we want to be incentivizing large scale pirating with such a framework.
