Crafty Coyote

November 13, 2024 at 11:15 am

After decades of invoking theft equivalence (“you wouldn’t steal a car”), it doesn’t come as any surprise that artists are using amoral computers who can’t know right from wrong to infringe copyright. You can’t jail a computer so the leverage is out the window.

Anonymous Coward

November 13, 2024 at 12:31 pm

Re:

… artists are using …computers … to infringe copyright.

Thank you for telling me you haven’t read the post without telling me you haven’t read the post.

Diogenes (profile)

November 13, 2024 at 1:42 pm

Re: that’s not the point though

“You can’t jail a computer so the leverage is out the window.”

That’s irrelevant. This has never been about computers doing stuff. It’s about people doing stuff “using a computer”. Somehow that point keeps being missed!

Paul

November 13, 2024 at 7:17 pm

Re: Re: What if they did not use a computer

They read lots of things, then they thought about things, then they wrote something.

Lock them up!!! How can they possibly write something that’s not a copy of my work after reading my work?

MrWilson (profile)

November 13, 2024 at 9:10 pm

Re: Re: Re:

Like this. Here’s a sentence using all of the words you wrote but it’s not a copy of your work:

After reading lots of things, they thought about how they could possibly write something that was not a copy of my work, then wrote something and locked it up.

MrWilson (profile)

November 13, 2024 at 2:18 pm

Re:

It’s literally not copyright infringement to train or use these tools. That’s the entire fucking point.

You can argue that it isn’t “art” or that it’s a bad practice or that it’s decreasing business for human artists, but pretending it’s copyright infringement just indicates you’re either ignorant or you have an agenda.

Anonymous Coward

November 14, 2024 at 7:52 am

Re:

…artists are using […] computers […] to infringe copyright.

Thanks for telling everyone here that you didn’t read the article without saying you didn’t read the article.

Anonymous Coward

November 13, 2024 at 11:34 am

The fuck?

All the companies scraping the open web are the massive companies.

OpenAI, Microsoft, Google, etc.

Sure, some of the companies suing over AI usage are also massive companies.

But none of the companies scraping the entire web to train AI are small companies.

Anonymous Coward

November 13, 2024 at 12:40 pm

Re:

All the [AI] companies scraping the open web are the massive companies.

FTFY. And even then, I am not convinced that ALL of them are massive, though I will concede the three examples you name. A trivial search yielded articles like “The Best 19 AI Website Scrapers You Haven’t Heard Of”. I’m pretty sure there aren’t 19 distinct massive companies making the scrapers, and I’m even more sure that the folks who make the tools aren’t the only ones using them.

Paul Alan Levy (profile)

November 13, 2024 at 11:36 am

This is NOT a ruling on whether training infringes copyright

The plaintiff’s ONLY theory was that removal of CMI violated the DMCA. The judge intimates no view about whether training on copyrighted works infringes copyright.

“Whether there is another statute or legal theory that does elevate this type of harm remains to be seen. But that question is not before the Court today.”

And “Other provisions of the Copyright Act afford such protections [against non-consented use], see 17 U.S.C. § 106, but not Section 1202.”

Those questions remain to be decided in other cases.

Anonymous Coward

November 14, 2024 at 7:59 am

Re: If this was a case about humans…

The plaintiff’s ONLY theory was that removal of CMI violated the DMCA. The judge intimates no view about whether reading copyrighted works for training purposes infringes copyright.

“Whether there is another statute or legal theory that does elevate this type of harm remains to be seen. But that question is not before the Court today.”

And “Other provisions of the Copyright Act afford such protections [against non-consented use], see 17 U.S.C. § 106, but not Section 1202.”

Those questions remain to be decided in other cases.

Get it yet, maximalist shill?

That One Guy (profile)

November 13, 2024 at 1:07 pm

Sure hope none of those suing learned their craft from anyone else...

‘It was trained on content that might or even did include my stuff, therefore its output from that point forward infringes upon my copyright(s)’ is just the digital equivalent of ‘They learned to read thanks to my books, therefore anything they wrote from that point onwards infringes upon my copyright(s)’.

Anonymous Coward

November 13, 2024 at 1:12 pm

Re:

For-profit machines owned by massive companies designed to scrape the entire Internet many, many times over to where it can have the same effect as a Denial of Service attack on some sites =/= human learning and inspiration.

Anonymous Coward

November 13, 2024 at 4:14 pm

Re: Re:

And if it’s not the same as human learning and inspiration, good luck trying to get it nailed under copyright law, chumley.

The point y’all trying to make is that a human can be found guilty under copyright infringement while trying to call it inspiration or influence. If inspiration can’t be done by a machine… this argument might not work out the way you want it to.

Crafty Coyote

November 14, 2024 at 7:18 pm

Re: Re: Re:

If human learning and inspiration is enough to get into legal trouble, then I can understand why everyone from large corps to artists would be interested in the future of AI to combat copyright. No one wants to get arrested or sued for making art, let’s get these computers involved who can’t be thrown in jail to do this dangerous work for us. Of course, it doesn’t necessarily mean that the people who use these machines are good folk, either.

That One Guy (profile)

November 13, 2024 at 6:26 pm

Re: Re:

Whether or not a profit is being made does not change non-infringing activity to infringing activity, otherwise an author/artist would have to be very careful if they ever tried to charge for their works.

Whether or not a large company is doing something does not change non-infringing activity to infringing activity, because the company isn’t doing squat, the people running it are.

How much is being ‘scraped’ does not change non-infringing activity to infringing activity, otherwise again how many books an author read could be an influence as to whether or not their works were infringing.

If the process of ‘scraping’ a site is resource-intensive enough to cause actual problems to the site’s stability that’s not a copyright issue.

Anonymous Coward

November 14, 2024 at 8:43 am

Re: Re:

For-profit machines owned by massive companies designed to scrape the entire Internet many, many times over to where it can have the same effect as a Denial of Service attack on some sites…

TIL: Browsing the Internet with more than one tab open is the same as DDoS.

Anonymous Coward

November 14, 2024 at 9:19 am

Re: Re: Re:

The Game UI Database, among other sites, has faced slowdowns because of the scrapers. Are you reloading a webpage 200 times a second?

https://www.reddit.com/r/gamernews/comments/1fcmq1g/this_was_essentially_a_twoweek_long_ddos_attack/

Arianity

November 13, 2024 at 1:44 pm

But copyright is the wrong tool for the job.

Any sort of rights (property or otherwise) are going to have similar issues. The underlying issue is companies having exploitative leverage to pay you pennies for that right. No tool is going to be able to do the job as long as the playing field is tilted towards large companies that have so much market power they can force you to accept underpayment. But the only way you’re really going to fix that is to address the power large companies have.

Really what this tells you is that property rights like copyright aren’t sufficient on their own. If you’re trying to plow a field full of rocks, the issue isn’t that the plow isn’t good at its job. It’s a singular puzzle piece in a bigger picture.

it will do real damage to the open web by further entrenching the largest companies.

There’s an inherent tension between compensation and the group best able to pay said compensation, unless you’re going to hit them with anti-trust or something to keep them out of the sector. Not compensating lowers the barrier to entry, but is fundamentally pyrrhic. You can lower the barrier to entry to manufacturing by not paying your workers, too.

It is much more akin to reading all these works and then being able to make suggestions based on an understanding of how similar things kinda look, though from memory, not from having access to the originals.

The thing is, even if we accept it as reading (which is playing a bit fast and loose with things like ‘understanding’), the underlying concerns by content creators are still there. It just means that existing copyright isn’t designed for the distinction. And “You’re not covered under existing law, too bad” doesn’t really seem tenable in either the short or long term, not least because it further entrenches those market power problems, or because it’s likely to lead to extending copyright to reading in some fashion.

Anonymous Coward

November 13, 2024 at 1:52 pm

Re:

Agreed. I will never understand why people equate reading with scraping the Internet many times over in ways that are impossible for humans to do.

Anonymous Coward

November 13, 2024 at 3:49 pm

Re: Re: Scraping = Reading

“I will never understand why people equate reading with scraping”

Because it is?? The mechanisms are different but it’s the same fundamental activity. I don’t understand why (well…I kinda do) people seem so insistent on not understanding it.

Imagine if someone had a super power that allowed them to read and remember the entire Library of Congress in a day; then that person wrote a bunch of books and essays influenced by the information they consumed.

Nobody would be accusing that person of “violating copyright” because that would be stupid. Yet people, like almost everyone commenting under this article, continue to claim that’s what would be happening. Preposterous.

Anonymous Coward

November 13, 2024 at 4:01 pm

Re: Re: Re:

Imagine if someone had a super power that allowed them to read and remember the entire Library of Congress in a day; then that person wrote a bunch of books and essays influenced by the information they consumed.

Yeah, I’d be fine with that and I’m sure a lot of other people would be fine with that because it would be a human applying his writing skills.

The learning models aren’t humans; they’re products meant to bring profit to corporations and pumping up stock prices with lofty promises. The learning models don’t deserve the same protections as humans.

Please go watch this Jimquisition video about AI.

Diogenes (profile)

November 13, 2024 at 4:57 pm

Re: Re: Re:² "The learning models aren’t humans;" is irrelevant

Again, this is humans using computers. Whenever anyone implies that the computer is violating copyright they are wrong. Computers cannot violate the law. Only the humans using the computer can. So the question is whether it is a violation of copyright to use a computer to mass read the internet.

MrWilson (profile)

November 13, 2024 at 5:55 pm

Re: Re: Re:²

Yeah, I’d be fine with that and I’m sure a lot of other people would be fine with that because it would be a human applying his writing skills.

And people using an LLM is a human applying a tool that a human made. It’s just a complicated tool. Do you think photographers are artists? A lot of people didn’t consider photography to be art when cameras became more popular and accessible. It was “cheating” to point and click to produce an image. But now we don’t bat an eye at photography as an art form.

The learning models aren’t humans;

Neither are cameras, paint brushes, printing presses, or typewriters. Would you suggest true artists only do finger paintings and true writers never write down their work or would you concede that tools can be used for legitimate purposes?

they’re products meant to bring profit to corporations and pumping up stock prices with lofty promises.

This right here is the problem with your approach. You have a bias that is tainting your understanding of the issue. At its core, your argument is actually against the abuses of technology and money by wealthy exploitative tech bros. The problem is that LLMs and image generators aren’t only used by those people, but you’re arguing against the tools when it’s the humans using them in ways you don’t like that are the problem. And exploitative, greedy tech bros are a problem regardless of what they’re doing or what tools they’re using. You’re likely intentionally blind to the benefits of LLMs because you only see them as tools of oppression, despite the fact that some people have found useful and productive and positive uses for them – including uses that benefit women, minorities, LGBTQ, and neurodivergent individuals.

The learning models don’t deserve the same protections as humans.

This is actually true, but not the way you mean it. They don’t deserve protections, but the humans using these tools do.

Please go watch this Jimquisition video about AI.

That video has the same problem you have. It is an argument against bad people using technology in bad ways. It doesn’t actually provide any useful arguments against the technology itself. It also doesn’t provide any legal or technical critique. It just keeps calling it theft and violation of consent (where consent isn’t always legally required) and saying tech bros can’t do anything good. It’s also full of admittedly intentional hyperbole which makes the video just sound angry rather than well-reasoned.

But you and Sterling are either ignoring or ignorant of the fact that these tools have other uses and are being used by smaller companies and individuals who aren’t exploitative tech bros.

Tanner Andrews (profile)

November 15, 2024 at 5:49 am

Re: Re: Re:² profit motive

The learning models aren’t humans; they’re products meant to bring profit to corporations

Yes, but some humans have the same intent. I read stuff and write based on what I have read, and you may be assured that I intend to obtain some advantage, such as money, from doing so. That advantage may come through my corporation, because people give money to the corp and then it gives money to me.

Anonymous Coward

November 13, 2024 at 1:45 pm

I think that LLMs are plagiarism machines. Not necessarily in the legal sense, definitely in the artistic sense.
I recommend this video
https://www.youtube.com/watch?v=5qoOYrTzOfM
(which probably should have been featured here)
and then suggest thinking of LLMs as a black box that gives you one of the levels of plagiarism, and you can’t know which. Also, this is why I think the outputs should not be copyrightable.

Anonymous Coward

November 13, 2024 at 2:22 pm

Re:

That video definitely shouldn’t have been featured here, Tom. And the outputs aren’t copyrightable if identified as such according to Copyright Law and the Copyright Office. Plagiarism is not a legal concept and has no relevance here.

TKnarr (profile)

November 13, 2024 at 3:12 pm

I think the judge is going to run into some issues on appeal surrounding copying. Whether or not the training violates copyright, the AI system had to make a copy of the material on its servers before it could use that material for training. That particular thing comes up elsewhere, where you have to copy something to your system before you can use it and that copying needs permission from the copyright holder separate from any permission to use what you’re copying. While 1201 is supposed to cover that, there have been so many work-arounds established to enforce shrink-wrap licenses, DRM schemes, anti-cheat provisions and so on that I can’t imagine a competent lawyer not being able to establish a loophole big enough to drive a tractor-trailer through.

Though I’d like to see the appeals courts see reason and rule that yes, 1201 does make that copying legal and you can’t make it illegal again just by coming at it from a different angle.

James Burkhardt (profile)

November 14, 2024 at 6:15 am

Re:

Dear lord, you have no idea how the internet works, huh?

That’s how the internet works. ALL uses of the internet copy data into local storage. Courts have ruled in multiple cases that the copying necessary for the internet to work is legal. There is no difference between temporary transitory storage for a human to view content and temporary transitory storage for a computer to view content.

James Burkhardt (profile)

November 14, 2024 at 6:21 am

Re:

Okay, rereading, I better understand your claim, but you seem to think a lot more of the text of the internet is protected against copying with DRM than I’d imagine is really true.

Anonymous Coward

November 14, 2024 at 8:04 am

Re:

As a savant, whenever I read a book, I copy it to my brain for years, which means it isn’t there transiently as it is for most people. With this fact in mind, please explain how people feeding input into LLMs are committing copyright infringement and I’m not? Go on, I’ll wait.

Tanner Andrews (profile)

November 15, 2024 at 5:52 am

Re: eyeballs

the AI system had to make a copy of the material on its servers before it could use that material

The lens of my eye had to make a copy of the material which I read before I could use the material.

For people with vision impairments, there may need to be a camera converting images to tactile form. And if I read it on the computer, it is necessary for the computer to have a copy in memory in order to throw it up on the screen.

I give little weight to the argument that a copy must be made as part of the process of using material.

Anonymous Coward

November 15, 2024 at 8:32 am

Re: Re:

As a blind person (not “visually impaired”, for the love of God), I’ve yet to come across a camera that can convert anything to Braille without extra hardware, so I instead have it convert the text to audio through TTS software.

Anonymous Coward

November 13, 2024 at 3:55 pm

A study in confirmation bias

These comments are absolutely fascinating. Almost none of them engage with Mike’s point that scraping/training doesn’t infringe. The one or two that do completely contort the words to fit their own biases.

I suppose this is the hallmark of our times though: have an opinion, then work backwards to make up facts that support it.

karthikpiercebody (profile)

November 14, 2024 at 3:26 am

Best Labret Piercing Hoops

These comments are incredibly intriguing. Nearly all of them overlook Mike’s argument that scraping and training do not constitute infringement.

William Christian Bonner

November 14, 2024 at 9:37 am

I'm happy that reading a copyrighted book doesn't make me a felon.

I’ve been arguing that if AI is only trained on non-copyrighted information, it’s not going to learn much beyond the Bible.

I’m happy to see a case of sanity in the AI debate.

Anonymous Coward

November 14, 2024 at 10:32 am

Re:

TIL: The works of Shakespeare are under copyright.

Alice

November 16, 2024 at 9:57 am

Ha ha ha ha. If your AI model does reproduce enough, or the whole, of anything covered by copyright, good luck getting out of this in front of a proper court.

Judge: Just Because AI Trains On Your Publication, Doesn’t Mean It Infringes On Your Copyright

from the that's-not-how-any-of-this-works dept

Comments on “Judge: Just Because AI Trains On Your Publication, Doesn’t Mean It Infringes On Your Copyright”
