Anonymous Coward

June 26, 2025 at 9:56 am

Down with copyright, and down with the slop machines.

Anonymous Coward

June 26, 2025 at 1:28 pm

Re:

That’s a very rude way to describe authors, even litigious hacks.

Anonymous Coward

June 28, 2025 at 2:58 am

Re: Re:

…litigious hacks.

Like George R.R. Martin.

Anonymous Coward

June 26, 2025 at 10:16 am

You know, I really miss the times when judges made nuanced and thought-out rulings like this.

Instead, SCOTUS is probably gonna give the US free rein to gate off the web tomorrow, with the rest of the world already following in their tracks.

The internet will become unusable.

Anonymous Coward

June 26, 2025 at 10:39 am

I think the judge came to the right decision that Anthropic did do something wrong.

Taking from pirated sources should not be an option (for a legitimate company). However, I don’t think it should matter if the company copied the data or just scanned it.

On a personal note, I am more open to piracy by individuals because it is a more nuanced gray area than for companies.

Anonymous Coward

June 26, 2025 at 11:51 am

Re:

A poor person downloading $1 million “worth” of content isn’t $1 million worth of lost sales.

A corporation downloading the same is ripping off the poor artists/creators who are already getting shafted by the media companies/publishers.

Anonymous Coward

June 26, 2025 at 1:58 pm

Re:

I am open to piracy by anyone by nature of the fact that I think the Copyright Act must be repealed.

I think the critical mistake Anthropic made was retaining the pirated copies, which implies a use other than simply training. After a dataset is constructed, you should discard original data sources.

Anonymous Coward

June 26, 2025 at 10:40 am

It’s always such a pleasant surprise when a US court issues a halfway sane ruling on tech and copyright.

Scott Craver

June 26, 2025 at 2:00 pm

This is the same issue that befell MP3.com

MP3.com was taken to court for a “CD beaming” service that they argued was fair use (streaming CDs to people who verified that they owned the physical CD), but it likewise involved building a library of ripped CDs, and they were busted on that act of direct infringement.

I always suspected that could become a model for AI copyright lawsuits, where the computation and processing is too transformative and weird to be easily established as infringement, but the copying of online works into training sets could be the act targeted by a lawsuit.

Arianity (profile)

June 27, 2025 at 2:35 am

I think your summary is missing a few crucial points that are worth mentioning:

1) the authors didn’t argue that Claude regurgitated parts of the book(s). This lawsuit is specifically focused on inputs only, and that shapes the ruling a lot. The judge is also making a very big distinction for AI that writes new content itself (in contrast to the Reuters decision).

2) the authors conceded that training was similar to human learning: Authors argue that using works to train Claude’s underlying LLMs was like using works to train any person to read and write

These make the fourth factor particularly weak. It’s not necessarily that Alsup is putting less emphasis on it.

And since that was effectively the same as what Anthropic did here, it gets another vote towards fair use:

You’re misreading that portion a bit. The authors explicitly separated the format shift as a separate element: Authors argue it was a distinguishable step requiring independent justification. (That said, this is actually a nice win for property rights with regard to format shifting, although I’m a little worried about the judge’s reasoning. As noted, § 106 restricts reproduction, and it doesn’t say anything about one-to-one copies. So he’s freestyling a bit there.)

Organizations like Google and the Internet Archive and many others copy all the content they can find online and store it in giant databases/indexes/libraries. And those have been found to be fair use in the past. So what makes this different?

Those passed fair use because of other factors, not the copying part. The ruling (and past rulings) explicitly goes into this.

though I’d still quibble that under the exact text of copyright law it only counts as a “copy” if it’s a “material object,” and purely digital content isn’t covered

Eh, if you want to get pedantic about it, every “purely digital” copy resides on some physical media, be it RAM or other forms of storage. It very much is fixed by any method now known or later developed, and from which the work can be perceived, reproduced, or otherwise communicated, either directly or with the aid of a machine or device.

Also left open, to me, is the question of what would happen if a model figured out a way to train on those works like Books3/LibGen just by scanning them when found elsewhere online, and not creating the internal library.

That’s still making a copy, as far as copyright goes, although there’s still some wiggle room in precedent if it’s streamed quickly enough. But practically, it’s potentially much more wasteful if companies start having to rescreen material. AI companies are already putting significant load on sites. If every new model had to regrab ephemeral data, that would get much worse. That would actually kind of suck, from a practical point of view.

And maybe that’s the proper balance? Alsup has created a framework that distinguishes between legitimate, transformative innovation practices and what amounts to direct infringement with a corporate veneer.

That seems like it might potentially lead to a bad equilibrium. If any sale of a book can be turned into an AI input, it’s going to have to be priced accordingly. I could also see this leading to more “licenses” or other workarounds, where you end up not actually owning the thing, similar to how software commonly works now. But maybe it’s a start of something workable.

Anonymous Coward

June 27, 2025 at 5:09 pm

Re:

That’s still making a copy, as far as copyright goes

No, it’s not.

cls

June 27, 2025 at 3:01 am

bad idea

Arianity, Re:

I could also see this leading to more “licenses” or other workarounds, where you end up not actually owning the thing, similar to how software commonly works now.

Are you unfamiliar with the horror that college textbook leasing has become!?

Arianity (profile)

June 28, 2025 at 1:54 am

Re:

Are you unfamiliar with the horror that college textbook leasing has become!?

I’d managed to suppress those memories 😬 That situation is utterly unconscionable.


Judge Alsup: Training AI On Copyrighted Works? Fair Use. Building Pirate Libraries? Not So Much

from the right-to-read dept

Comments on “Judge Alsup: Training AI On Copyrighted Works? Fair Use. Building Pirate Libraries? Not So Much”
