Anonymous Coward

May 21, 2025 at 1:32 pm

The EFF and other tech groups are banging the table and yelling phrases from the same “Revolutionary technology! Democratization! Won’t somebody please think of the Innovation?!?” playbook that they used in embarrassing attempts to legitimize cryptocurrency and NFTs in the eyes of the public. Only this time, the yelling is in defense of these machine models.

MrWilson (profile)

May 21, 2025 at 10:07 pm

Re:

Cryptocurrencies were initially a means of preventing governments from controlling currencies. They turned into techbro pump and dump schemes. EFTs have always been a grift. LLMs can actually do some things that humans actually value and benefit from.

Anonymous Coward

May 21, 2025 at 10:31 pm

Re: Re:

Were they, really? Every alt-scrip scheme is a scam.

Anonymous Coward

May 22, 2025 at 8:44 am

Re: Re:

EFTs have always been a grift.

Oh? In what way are electronic funds transfers a grift?

MrWilson (profile)

May 22, 2025 at 9:43 am

Re: Re: Re:

Doh. I wish Techdirt comments had EFTs—editable fucking typos.

Anonymous Coward

May 21, 2025 at 1:50 pm

Err

Bungle

These words indicate a lack of intent to deceive. I like to imagine we all know better by now.

Anonymous Coward

May 21, 2025 at 3:29 pm

“To work effectively, today’s GenAI systems need to be trained on very large collections of human-created works—probably millions of them. At this scale, locating copyright holders and getting their permission is daunting for even the biggest and wealthiest AI companies, and impossible for smaller competitors.”

If you can’t compensate people adequately for feeding their stuff into your regurgitation engines, you shouldn’t be doing it.

Anonymous Coward

May 21, 2025 at 5:27 pm

Re:

Fully agreed. Maybe the best thing that can happen is for development to slow down so we can properly assess the risks and benefits of this technology, rather than moving fast and breaking things.

But then orgs like the EFF and Techdirt will take lines from Trump and talk about how we’re ceding the [insert technology here] race to China.

MrWilson (profile)

May 21, 2025 at 10:03 pm

Re:

If you can’t compensate people adequately for feeding their stuff into your regurgitation engines, you shouldn’t be doing it.

I’m just going to make up fictional rights and argue for compensation. If you’re reading this sentence, you owe me one million dollars!

Anonymous Coward

May 22, 2025 at 12:14 am

Re: Re:

The right to be compensated for your work is pretty fundamental to this whole “capitalism” thing we’ve got going here.

MrWilson (profile)

May 22, 2025 at 9:47 am

Re: Re: Re:

No, it’s not. Capitalism involves the right to most of the value of someone else’s work if you’re the owner of the means of production. For workers, it means getting fucked. For copyright creators who aren’t big corporations, it means having significantly less income from your work than if only humans could own copyrights.

But even so, that’s still not an articulated right under the law. And it’s uselessly vague. The doctrine of first sale and fair use doctrine in copyright law cover the usage.

Anonymous Coward

May 22, 2025 at 9:53 am

Re: Re:

“Pay people? For working? Could you imagine what that would cost? You’d never be able to make money growing cotton! What an imaginary right.”

-You, apparently.

MrWilson (profile)

May 23, 2025 at 10:12 am

Re: Re: Re:

I didn’t say people shouldn’t be paid for their work. That you have to make up a straw man to argue with me is telling.

You’re making an argument like a record company claiming that format shifting or making a mix tape is copyright infringement because you want to maximize profits.

Do you think libraries are copyright infringement hubs?

Anonymous Coward

May 23, 2025 at 11:25 am

Re: Re: Re:²

If you think regurgitation engines are analogous to libraries you’re not worth further conversation with.

MrWilson (profile)

May 23, 2025 at 6:28 pm

Re: Re: Re:³

Wow. Another straw man. You apparently can’t engage with anything I’m actually saying.

Libraries enjoy the same rights based on the first sale doctrine that anyone else does. It was a single example. Why do you think rights should be exclusive to a library and not to a patron of a library? Why should independent researchers trying to cure cancer have to pay millions just to conduct research with the same material the library paid significantly less to access?

MrWilson (profile)

May 23, 2025 at 11:40 am

Re: Re: Re:²

My frustration with the arguments of people claiming it’s not fair use and that all training must be licensed is that many people seem to think they’re championing the little guy when they’re inadvertently advocating for the benefit of the wealthy and corporations.

First, it is fair use whether or not you like the idea of corporations profiting off of the largely unpaid work of poor and creative people. That’s just capitalism in general. Argue against that instead of licensing costs for AI training if that’s what you want.

Second, the ability of a large corporation to train AI on publicly available data without paying every single copyright owner is the same ability you have to do the same. It’s the same ability an independent university researcher has when trying to study potential treatments for rare and unprofitable diseases. You’re arguing, ultimately, that only wealthy corporations with large treasure troves of corporate profits should be allowed to build LLMs. You’re opposing the democratization of the technology. It’s like saying you don’t like Microsoft’s business practices so Linux should also be outlawed. Or because a CEO drunk drove in his sports car and killed someone then ambulances should be illegal.

We don’t have individual power to control the use of AI. It will be used in systems that you will be forced to interact with. It is absolutely important to advocate for regulations and oversight and ethical laws that control how this happens. But arguing for licensed training is ensuring only the wealthiest, most profit-driven organizations will be able to afford to develop AI and that’s who will win government contracts for software systems that will be used against you.

Actual creators, such as myself, will get virtually nothing even if licensing is required. Other corporations hold the rights to the most profitable copyrighted content. You’re just arguing over which big pot of money gets smaller or bigger, neither of which you will ever have access to. And licensed training will do nothing to prevent ethical violations in the use of AI.

Explorer09 (profile)

May 23, 2025 at 12:34 pm

Re: Re: Re:²

You’re making an argument like a record company claiming that format shifting or making a mix tape is copyright infringement because you want to maximize profits.

Because yes it is infringement. I know there are exceptions, which are known as “fair use”, but it is the defendants that have a burden to prove their uses are “fair”. Fair use is never granted as a rule. Exceptions are exceptions.

Do you think libraries are copyright infringement hubs?

U.S. Copyright Act, section 108, “Limitations on exclusive rights: Reproduction by libraries and archives”. Please read the law.

MrWilson (profile)

May 23, 2025 at 8:10 pm

Re: Re: Re:³

Because yes it is infringement.

No, it’s not. Just saying it is doesn’t make it true.

I know there are exceptions, which are known as “fair use”, but it is the defendants that have a burden to prove their uses are “fair”. Fair use is never granted as a rule. Exceptions are exceptions.

Ah, yes, the “fair use is only an affirmative defense” lie again. You are wrong. Fair use is written into Section 107 of the Copyright Act. It is the law. It is granted as a rule. It is a limitation on copyrights and it is something copyright owners must consider before claiming violations, before issuing DMCA takedowns, before filing a lawsuit. This is literally stated in many cases where courts chide copyright owners for suing over blatantly obvious fair use scenarios.

U.S. Copyright Act, section 108, “Limitations on exclusive rights: Reproduction by libraries and archives”. Please read the law.

You should read the law. Section 107. Also case law, because the law as written isn’t the only functional aspect of the law. You should also look into the Doctrine of First Sale, which also covers what libraries do when they lend out copies, regardless of Section 108. It also covers media rental companies and loaning your vinyl collection to a friend and selling your old 8 track on Ebay.

Explorer09 (profile)

May 24, 2025 at 5:12 am

Re: Re: Re:⁴

Ah, yes, the “fair use is only an affirmative defense” lie again. You are wrong. Fair use is written into Section 107 of the Copyright Act. It is the law. It is granted as a rule.

Do you think I had no idea about the Section 107 specifying fair use? You are the one that should read it, because the U.S. Copyright Law doesn’t explicitly say which use in particular is fair and which is not.

Instead, it mandates “four factors” for courts to evaluate what is fair use and what is not. And there have been cases where higher courts ruled differently from lower courts even when the mandated criteria are the same.

it is something copyright owners must consider before claiming violations, before issuing DMCA takedowns, before filing a lawsuit.

No. In the U.S., fair use is a defence raised by the defendant only. The copyright owner does not have the burden of proving fair use. And so your claim of “this can be fair use, you can’t sue me” is blatantly false.

It’s actually “I can sue you, but you tell the judge to dismiss it by convincing the judge it’s fair use”.

MrWilson (profile)

May 24, 2025 at 11:28 am

Re: Re: Re:⁵

Do you think I had no idea about the Section 107 specifying fair use?

You apparently don’t understand it. I’m also saying you should read case law as well because the statute isn’t the full law.

You are the one that should read it, because the U.S. Copyright Law doesn’t explicitly say which use in particular is fair and which is not.

It provides four factors which are used to determine which particular uses are fair use.

Instead, it mandates “four factors” for courts to evaluate what is fair use and what is not.

Four factors are decided by courts because that’s how lawsuits work, but four factor analysis should and can be done by anyone, including users and copyright holders.

And there have been cases where higher courts ruled differently from lower courts even when the mandated criteria are the same.

Yes, different humans interpret the same thing differently. We learned this in kindergarten.

No. In the U.S., fair use is a defence raised by the defendant only.

No, it’s not. Fair use isn’t only a defense. It is a positive, legal use.

Review the Lenz decision from the Ninth Circuit.

Lenz held that copyright holders must do a fair use analysis prior to issuing a DMCA takedown notice.

The copyright owner does not have the burden of proving fair use.

Of course not. Why would they prove something that goes against their claim? They do have a burden to use a four factor analysis prior to issuing a takedown though.

And so your claim of “this can be fair use, you can’t sue me” is blatantly false.

Quote me where I said you can’t be sued because of fair use. You can be sued for just about anything.

It’s actually “I can sue you, but you tell the judge to dismiss it by convincing the judge it’s fair use”.

It’s actually “I can sue you, but the judge might rule in your favor and admonish me for not considering fair use in advance.”

Explorer09 (profile)

May 24, 2025 at 12:18 pm

Re: Re: Re:⁶

Fair use isn’t only a defense. It is a positive, legal use.

Review the Lenz decision from the Ninth Circuit.

Lenz held that copyright holders must do a fair use analysis prior to issuing a DMCA takedown notice.

Good point for citing the Lenz case. I suggest it’s Lenz v. Universal Music you are talking about.

The problem: The Plantiffs have no obligations to prove fair use on behalf of the defendant. The decision was only to address false takedown notices as the judges warn the copyright owners not to abuse it. It had nothing to do with the plantiffs’ ability to sue (for copyright infringement).

The case didn’t say plantiffs have burden of proof on fair use, especially in court. It said that if there is a chance of the use being fair, don’t issue a takedown – because that’s the wrong tool – instead, sue.

By the way, I found several criticisms of the decision on the Web, and they’re worth linking here just for information:

(1) https://www.aei.org/technology-and-innovation/intellectual-property/splitting-dancing-baby-9th-circuits-lenz-decision-may-mostly-meaningless/
(2) https://truthonthemarket.com/2015/09/23/a-takedown-of-common-sense-the-9th-circuit-overturns-the-supreme-court-in-a-transparent-effort-to-gut-the-dmca/

MrWilson (profile)

May 25, 2025 at 9:10 pm

Re: Re: Re:⁷

I wasn’t going to continue to respond but I’m a sucker for people who keep doubling down on their ignorance with arrogance.

The Plantiffs have no obligations to prove fair use on behalf of the defendant.

It’s plaintiffs, not plantiffs. Spell check isn’t hard.

Nobody asserted that plaintiffs have any obligation to prove fair use. You’re arguing with a straw man. I said plaintiffs are obligated to consider fair use prior to issuing a DMCA takedown because that’s what the Lenz decision said.

The decision was only to address false takedown notices as the judges warn the copyright owners not to abuse it.

It was a false takedown notice because the use was fair use and the plaintiffs hadn’t considered it first! You either didn’t read it or didn’t understand it.

It had nothing to do with the plantiffs’ ability to sue (for copyright infringement).

Nobody said the plaintiffs couldn’t sue. I literally said “Quote me where I said you can’t be sued because of fair use. You can be sued for just about anything.”

You aren’t reading or understanding anything you respond to.

Anonymous Coward

May 27, 2025 at 4:43 am

Re: Re: Re:³

This article is specifically about US law, which holds that format shifting of copies of works owned by the format shifter for their personal use is not copyright infringement even if no licensing fees are paid. You were saying?

Explorer09 (profile)

May 22, 2025 at 11:46 pm

Re: Re:

I’m just going to make up fictional rights and argue for compensation. If you’re reading this sentence, you owe me one million dollars!

There’s no fictional right here. AI “reading” or “training” or whatever you called it involves copying the works in the digital form from one computer memory to another, and that’s the “prima facie” copyright infringement as the law would call it.

Analogizing machine reading things with humans reading is useless in analysing the copyright issue with AI training.

MrWilson (profile)

May 23, 2025 at 10:07 am

Re: Re: Re:

AI “reading” or “training” or whatever you called it involves copying the works in the digital form from one computer memory to another, and that’s the “prima facie” copyright infringement as the law would call it.

Hell no. You just declared the most basic function of the Internet to be copyright infringement. You reading these words on your computer would be the result of copyright infringement according to your analysis.

Analogizing machine reading things with humans reading is useless in analysing the copyright issue with AI training.

Just saying that without justification doesn’t make it true.

Anonymous Coward

May 23, 2025 at 11:23 am

Re: Re: Re:²

“You just declared the most basic function of the Internet to be copyright infringement”.

That was an actual legal discussion at the time and I’m old enough to’ve been around for it. It had to get resolved and we were able to draw a distinction between the necessary processes to transmit and display something as people intended and copying and distributing it in a way that wasn’t.

And then people tried to argue that ‘you can’t copyright a number’ so’s to justify piracy because all data is fundamentally a number, and they got their asses kicked too, because the law is in fact capable of drawing distinctions.

People trying to treat legal concepts like they’re immutable mathematical theorems is a peeve of mine.

MrWilson (profile)

May 23, 2025 at 8:22 pm

Re: Re: Re:³

That was an actual legal discussion at the time and I’m old enough to’ve been around for it.

As am I. Pulling the old person card doesn’t work against other old people.

It had to get resolved and we were able to draw a distinction between the necessary processes to transmit and display something as people intended and copying and distributing it in a way that wasn’t.

I’d love to see your case law citations rather than this vague summary based on trust me bro. In many cases, it didn’t get resolved because some cases were settled and other cases were only addressed in a district court and not taken to the Supreme Court and actually decided as a precedent. In many cases, those making bullshit copyright claims simply changed tactics because it wasn’t profitable to continue suing children.

And then people tried to argue that ‘you can’t copyright a number’ so’s to justify piracy because all data is fundamentally a number, and they got their asses kicked too, because the law is in fact capable of drawing distinctions.

You appear to be misremembering the DeCSS case. Separately, you actually can’t copyright a number.

Anonymous Coward

May 23, 2025 at 10:04 pm

Re: Re: Re:⁴

“Separately, you actually can’t copyright a number.”

And yet, digital works are copyrighted. Because laws can draw distinctions that apparently completely elude people who think they’ve found One Weird Trick.

“I’d love to see your case law citations rather than this vague summary based on trust me bro.”

Direct from statute:

“Temporary Reproductions for Technological Processes

30.71 It is not an infringement of copyright to make a reproduction of a work or other subject-matter if

(a) the reproduction forms an essential part of a technological process;

(b) the reproduction’s only purpose is to facilitate a use that is not an infringement of copyright; and

(c) the reproduction exists only for the duration of the technological process."

Copyright Act (R.S.C., 1985, c. C-42)

MrWilson (profile)

May 24, 2025 at 11:16 am

Re: Re: Re:⁵

You missed the point. You claimed a history and I’m asking for citations of that history. Which cases?

Your statutory citation doesn’t prove or explain the claim that “It had to get resolved and we were able to draw a distinction between the necessary processes to transmit and display something as people intended and copying and distributing it in a way that wasn’t.”

Anonymous Coward

May 27, 2025 at 9:33 am

Re: Re: Re:⁶

A direct statutory citation of the statute that unambiguously resolved the issue isn’t enough for you to grasp that there was an issue to be resolved and that without that statute ‘the most basic functions of the internet’ would be copyright infringement?

Like, that’s why that statute was passed. You can look it up yourself. I’m not a big fan of the “You have no proof” brings proof “Not that kind of proof!” dance; I’ve seen it far too often for it to appeal to novelty.

MrWilson (profile)

May 27, 2025 at 6:39 pm

Re: Re: Re:⁷

I didn’t even google it before because I was looking for case law, not statutes, but I just noticed that your statute is Canadian law, not US law.

I’m not a big fan of the “I have proof!” “What kind of proof is it?” “proof in a different country” dance. I haven’t seen it often, but it’s both hilarious and exhausting.

Once was perhaps forgivable. But citing Canadian law twice in an article explicitly about US law is beyond sloppy. This undermines everything you can say on the topic.

Explorer09 (profile)

May 23, 2025 at 10:39 pm

Re: Re: Re:⁴

You appear to be misremembering the DeCSS case. Separately, you actually can’t copyright a number.

Nope. It’s numbers that lack human originality that are uncopyrightable. Numbers generated in a mostly random manner are uncopyrightable (such as crypto hashes and encryption/decryption keys). But a number representing an ASCII-encoded chapter of a Harry Potter novel is copyrightable.

MrWilson (profile)

May 24, 2025 at 1:01 am

Re: Re: Re:⁵

But a number representing an ASCII-encoded chapter of a Harry Potter novel is copyrightable.

Show me a copyright registration that includes an ASCII encoded numeric representation of a Harry Potter book chapter.

Anonymous Coward

May 24, 2025 at 9:05 am

Re: Re: Re:⁶

Copyright vests whether ‘registered’ or not. Registered copyright used to be a thing, but it was replaced by automatic copyright a long time ago and for good reasons.

More to the point, for digital stuff:

“3 (1) For the purposes of this Act, copyright, in relation to a work, means the sole right to produce or reproduce the work or any substantial part thereof in any material form whatever”

Copyright Act (R.S.C., 1985, c. C-42)

ASCII encoding is a material form. If you think you can distribute Harry Potter online by virtue of it being ‘a number’, feel free to test your sovereign-citizen style theory in a court.

MrWilson (profile)

May 25, 2025 at 3:51 pm

Re: Re: Re:⁷

Copyright vests whether ‘registered’ or not. Registered copyright used to be a thing, but it was replaced by automatic copyright a long time ago and for good reasons.

I wish people would research before pretending to correct others.

Registered copyright is still a thing. You must register a copyright in order to get statutory damages and attorney’s fees. Large corporations register their copyrights.

Since we’re discussing Harry Potter, here’s the search page for copyright registrations:

https://cocatalog.loc.gov/cgi-bin/Pwebrecon.cgi?DB=local&PAGE=First

Find me that copyrighted number.

ASCII encoding is a material form. If you think you can distribute Harry Potter online by virtue of it being ‘a number’, feel free to test your sovereign-citizen style theory in a court.

I never said you could distribute Harry Potter online as a number legally. I said you can’t copyright a number, which you can’t. Just because you can convert data into numbers doesn’t make numbers copyrightable.

You’re also ignoring fair use. Depending on how you use the numbers, it could very well pass the four factors test.

But beyond all that, you need to cite the case law. The statute is interpreted by the courts. What have the courts said?

Explorer09 (profile)

May 25, 2025 at 5:12 pm

Re: Re: Re:⁸

Registered copyright is still a thing. You must register a copyright in order to get statutory damages and attorney’s fees. Large corporations register their copyrights.

Since we’re discussing Harry Potter, here’s the search page for copyright registrations: [link omitted]

Find me that copyrighted number.

Idiot. If that registration page shows which “number” (here I mean the full text of the novel, not the registration number) is copyrighted, then you can copy it fully, defeating the purpose of copyright which is meant to prevent you from copying the whole thing.

You surely had no idea about number being copyrightable. I suggest that you read the Wikipedia article about “infinite monkey theorem” and correct yourself.

MrWilson (profile)

May 25, 2025 at 9:05 pm

Re: Re: Re:⁹

Idiot. If that registration page shows which “number” (here I mean the full text of the novel, not the registration number) is copyrighted, then you can copy it fully,

These responses just keep getting dumber.

The point is that you can’t find a registration for it BECAUSE YOU CAN’T COPYRIGHT A NUMBER!!! And just because you’re going to continue to be obtuse, I’ll also play along with your literal interpretation and say that even if you could copyright a number, you wouldn’t have to list the entire number in the registration. For instance, we generally abbreviate pi as 3.14 or 3.14159… The abbreviation could be sufficient for registration. So you’re even wrong while being very wrong.

defeating the purpose of copyright which is meant to prevent you from copying the whole thing.

The purpose of copyright isn’t to prevent you from copying a text. Copyright is meant to secure for a limited time the right for creators to control their work. You can own the copyright on a work and release it under a permissive license that allows someone to copy the entirety of the text. Copyright doesn’t prevent that. It just empowers the creator or their assignees to determine in what circumstances it can be copied, with the exception of scenarios where their rights of control are limited, such as in the case of fair use.

You are admitting here that you don’t even understand what copyright is. If your bootlicking of wealthy corporations, lack of understanding of the doctrine of first sale, fair use, and roughly every other issue relating to copyright weren’t already disqualifying, this alone would render your assertions laughably dismissible.

You surely had no idea about number being copyrightable.

I surely did and still do know that you can’t copyright a number.

I suggest that you read the Wikipedia article about “infinite monkey theorem” and correct yourself.

Why would I need to read an article about a concept I’m already familiar with and that has no bearing on the topic? Randomization has not been shown to produce the exact character patterns of large text works such as Hamlet. The theorem is stipulated on having infinite time, which we do not have. Non sequitur!

Explorer09 (profile)

May 25, 2025 at 10:38 pm

Re: Re: Re:¹⁰

The point is that you can’t find a registration for it BECAUSE YOU CAN’T COPYRIGHT A NUMBER!!! And just because you’re going to continue to be obtuse, I’ll also play along with your literal interpretation and say that even if you could copyright a number, you wouldn’t have to list the entire number in the registration. For instance, we generally abbreviate pi as 3.14 or 3.14159… The abbreviation could be sufficient for registration. So you’re even wrong while being very wrong.

Do you have any idea what you are saying? In the registration website you linked, I can find about 25 entries when I search the title “Harry Potter” there.

The purpose of copyright isn’t to prevent you from copying a text. Copyright is meant to secure for a limited time the right for creators to control their work. You can own the copyright on a work and release it under a permissive license that allows someone to copy the entirety of the text.

And why the fuck must I release my copyrighted work in a permissive license?
You are now making an assumption that not every creator would agree: That the works must be released for free.

Look, even Techdirt didn’t promote that idea. Techdirt encourages voluntarily releasing content for free, which can make fans want to fund new content in a new kind of business model, and yet it was never mandatory to release content for free.

doctrine of first sale, fair use

First sale doctrine doesn’t cover the reproduction (i.e. copying) of the work even if you own a legal copy of it.

Fair use is a court ruling, not automatic when you say it is fair use.

Randomization has not been shown to produce the exact character patterns of large text works such as Hamlet. The theorem is stipulated on having infinite time, which we do not have.

Well, it’s not about infinite time. It’s about your denial of the fact that all digitized data is a number. You wanna prove it? hexdump some_random_document.pdf and see it!
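For anyone who wants to see the "digitized data is a number" point concretely without reaching for hexdump, here is a minimal Python sketch. The helper names bytes_to_number and number_to_bytes are made up for illustration, and the sample string is arbitrary. It only shows that an encoded text and a single integer are interchangeable representations; it says nothing either way about what is or isn't copyrightable.

```python
def bytes_to_number(data: bytes) -> int:
    """Interpret a byte string as one big-endian unsigned integer."""
    return int.from_bytes(data, byteorder="big")


def number_to_bytes(number: int, length: int) -> bytes:
    """Turn the integer back into `length` bytes (inverse of bytes_to_number)."""
    return number.to_bytes(length, byteorder="big")


# Arbitrary sample text, encoded as ASCII bytes.
sample = "Fair use".encode("ascii")

# The same bytes, viewed as hexadecimal digits (what hexdump would show).
as_hex = sample.hex()  # "4661697220757365"

# The same bytes, viewed as a single integer.
as_number = bytes_to_number(sample)

# Round trip: the integer plus the original length recovers the text.
restored = number_to_bytes(as_number, len(sample)).decode("ascii")

print(as_hex)
print(as_number)
print(restored)
```

The length has to be carried alongside the integer, because leading zero bytes would otherwise be lost in the conversion.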

MrWilson (profile)

May 26, 2025 at 12:16 am

Re: Re: Re:¹¹

Do you have any idea what you are saying? In the registration website you linked, I can find about 25 entries when I search the title “Harry Potter” there.

AND NONE OF THEM ARE A REGISTRATION FOR A NUMBER THAT REPRESENTS THE TEXT!!! This ignorance seems willful at this point. At least, that’s the most graceful interpretation.

And why the fuck must I release my copyrighted work in a permissive license?

You don’t have to. I didn’t suggest you did. I just pointed out that copyright isn’t what you think it is. You seem to have a very myopic perspective on a complex topic. You seem to think copyright is just a security to make sure you make money off of your content. That’s not what it actually is.

You are now making an assumption that not every creator would agree: That the works must be released for free.

I did not say that. Again, you are not reading or understanding what I am writing. I am correcting your misunderstandings and you’re just misunderstanding more.

Look, even Techdirt didn’t promote that idea. Techdirt encourages voluntarily releasing content for free, which can make fans want to fund new content in a new kind of business model, and yet it was never mandatory to release content for free.

Nobody said it should be mandatory to release content for free! Quote me where I claimed that. Except don’t bother because you can’t because I didn’t.

Fair use is a court ruling, not automatic when you say it is fair use.

I didn’t say it was automatic when you say it is. It is fair use whether you say it is or not when it is actually fair use and copyright owners have an obligation to consider it before taking action. Yes, you can just sue, but you can also lose and have to pay legal fees for your loss. That doesn’t mean court is the only place fair use comes up.

Well, it’s not about infinite time.

Yes, actually the infinite monkey theorem is about infinite time. You literally referenced it and suggested I should look it up, but you’re demonstrating you don’t even understand it…which, at this point, seems expected.

It’s about your denial of the fact that every digitized data is a number.

I have never denied that data can be depicted as a number. I have only ever pointed out, correctly, that you cannot copyright a number as a creative work. Copyrighted works require human authorship and creativity. Converting text into numerical data is not creative. Nobody writes whole novels in binary.

You wanna prove it? hexdump some_random_document.pdf and see it!

You keep trying to fight straw men so hard. Are you a crow or something?

Anonymous Coward

May 27, 2025 at 9:03 am

Re: Re: Re:⁸

“Registered copyright is still a thing. You must register a copyright in order to get statutory damages and attorney’s fees.”

That is incorrect.

“Remedies
Civil Remedies
Infringement of Copyright and Moral Rights

34 (1) Where copyright has been infringed, the owner of the copyright is, subject to this Act, entitled to all remedies by way of injunction, damages, accounts, delivery up and otherwise that are or may be conferred by law for the infringement of a right.


(2) In any proceedings for an infringement of moral rights, the court may grant to the holder of those rights all remedies by way of injunction, damages, accounts, delivery up and otherwise that are or may be conferred by law for the infringement of a right.

Marginal note:Costs

(3) The costs of all parties in any proceedings in respect of the infringement of a right conferred by this Act shall be in the discretion of the court.

(4) The following proceedings may be commenced or proceeded with by way of application or action and shall, in the case of an application, be heard and determined without delay and in a summary way:

    (a) proceedings for infringement of copyright or moral rights;

    (b) proceedings taken under section 44.12, 44.2 or 44.4; and

    (c) proceedings taken in respect of

        (i) a tariff approved by the Board under Part VII.1 or VIII, or

        (ii) agreements referred to in subsection 67(3).

(5) The rules of practice and procedure, in civil matters, of the court in which proceedings are commenced by way of application apply to those proceedings, but where those rules do not provide for the proceedings to be heard and determined without delay and in a summary way, the court may give such directions as it considers necessary in order to so provide.

(6) The court in which proceedings are instituted by way of application may, where it considers it appropriate, direct that the proceeding be proceeded with as an action.

(7) In this section, application means a proceeding that is commenced other than by way of a writ or statement of claim.

34.1 (1) In any civil proceedings taken under this Act in which the defendant puts in issue either the existence of the copyright or the title of the plaintiff to it,

    (a) copyright shall be presumed, unless the contrary is proved, to subsist in the work, performer’s performance, sound recording or communication signal, as the case may be; and

    (b) the author, performer, maker or broadcaster, as the case may be, shall, unless the contrary is proved, be presumed to be the owner of the copyright.

(2) Where any matter referred to in subsection (1) is at issue and no assignment of the copyright, or licence granting an interest in the copyright, has been registered under this Act,

    (a) if a name purporting to be that of

        (i) the author of the work,

        (ii) the performer of the performer’s performance,

        (iii) the maker of the sound recording, or

        (iv) the broadcaster of the communication signal

    is printed or otherwise indicated thereon in the usual manner, the person whose name is so printed or indicated shall, unless the contrary is proved, be presumed to be the author, performer, maker or broadcaster;

    (b) if

        (i) no name is so printed or indicated, or if the name so printed or indicated is not the true name of the author, performer, maker or broadcaster or the name by which that person is commonly known, and

        (ii) a name purporting to be that of the publisher or owner of the work, performer’s performance, sound recording or communication signal is printed or otherwise indicated thereon in the usual manner,

    the person whose name is printed or indicated as described in subparagraph (ii) shall, unless the contrary is proved, be presumed to be the owner of the copyright in question; and

    (c) if, on a cinematographic work, a name purporting to be that of the maker of the cinematographic work appears in the usual manner, the person so named shall, unless the contrary is proved, be presumed to be the maker of the cinematographic work."

Turning to the question of ‘but you can’t find a number in a registered copyright database’, you’re missing the point. All digital works are necessarily numbers, but they can be different numbers depending on encoding, etc. The intent of the copyright law isn’t that you can copyright the number but that you can’t copy the work that number represents. Thus what you’d find in a list of registered works are works.

Which is why the argument that ‘well since numbers can’t be copyrighted, I can copy this number as much as I want, and if that number happens to be a way to represent this copyrighted work, I’ve found the One Neat Trick to not be hit for copyright infringement’ was so bloody stupid.

You’re really quick to accuse people of not understanding stuff, Mr. Regurgitation Engines Are Like A Library. Maybe slow down and try reading.

MrWilson (profile)

May 27, 2025 at 10:09 am

Re: Re: Re:⁹

Remedies
Civil Remedies
Infringement of Copyright and Moral Rights

You do realize that you quoted Canadian, not US law, right? This is the greatest own-goal I’ve seen in a long while.

And just so you don’t waste time trying to find something to support your claim in US law, here’s the relevant statute:

17 U.S. Code § 412 – Registration as prerequisite to certain remedies for infringement

In any action under this title, other than an action brought for a violation of the rights of the author under section 106A(a), an action for infringement of the copyright of a work that has been preregistered under section 408(f) before the commencement of the infringement and that has an effective date of registration not later than the earlier of 3 months after the first publication of the work or 1 month after the copyright owner has learned of the infringement, or an action instituted under section 411(c), no award of statutory damages or of attorney’s fees, as provided by sections 504 and 505, shall be made for—
(1)any infringement of copyright in an unpublished work commenced before the effective date of its registration; or
(2)any infringement of copyright commenced after first publication of the work and before the effective date of its registration, unless such registration is made within three months after the first publication of the work.

Turning to the question of ‘but you can’t find a number in a registered copyright database’, you’re missing the point. All digital works are necessarily numbers, but they can be different numbers depending on encoding, etc. The intent of the copyright law isn’t that you can copyright the number but that you can’t copy the work that number represents. Thus what you’d find in a list of registered works are works.

That’s exactly what I’m saying. Encoding is not creative effort. It is not able to be copyrighted. You seem to be missing the point.

Which is why the argument that ‘well since numbers can’t be copyrighted, I can copy this number as much as I want, and if that number happens to be a way to represent this copyrighted work, I’ve found the One Neat Trick to not be hit for copyright infringement’ was so bloody stupid.

And I never supported that argument, so I don’t know why you’re arguing with that straw man.

You’re really quick to accuse people of not understanding stuff,

Yes, because you didn’t understand at all. You didn’t even understand that you were quoting Canadian law when you thought you had a one-up on me.

Mr. Regurgitation Engines Are Like A Library.

And you prove you don’t understand because I’ve never said that an LLM is like a library. I said reading is like reading. A library is an example of where you can read. It was a pretty straightforward analogy. That you misunderstood it, along with the other points, is a pattern.

Maybe slow down and try reading.

Sage advice. Heal thyself, doctor!

Explorer09 (profile)

May 27, 2025 at 1:01 pm

Re: Re: Re:¹⁰

That’s exactly what I’m saying. Encoding is not creative effort. It is not able to be copyrighted. You seem to be missing the point.

Let me rephrase that anonymous user’s comment a bit:

Copyright protects not the numbers themselves, but what the numbers represent as creative works. This in turn also makes that number protected. The claim about “numbers not being copyrightable” is an oversimplification of the facts in a way that becomes misleading and useless in copyright debates.

If you still don’t get it, you can try testing out your theory in court and see who wins.

MrWilson (profile)

May 27, 2025 at 4:38 pm

Re: Re: Re:¹¹

Copyright protects not the numbers themselves, but what the numbers represent as creative works.

The thing that the number represents is the copyrighted work. The number is just an incidental result of encoding by a particular method.
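To make the "incidental result of encoding" point concrete, here is a minimal Python sketch. It is only an illustration: the sample sentence (the public-domain opening of Moby-Dick) and the helper names `text_to_int` and `int_to_text` are assumptions made up for this example, not anything from the thread or from any statute. It encodes the same text two ways and shows that the resulting integers differ even though both decode back to the identical sentence.

```python
def text_to_int(text: str, encoding: str) -> int:
    """Interpret the encoded bytes of `text` as one big-endian integer."""
    return int.from_bytes(text.encode(encoding), "big")


def int_to_text(number: int, byte_length: int, encoding: str) -> str:
    """Reverse text_to_int; byte_length restores any leading zero bytes."""
    return number.to_bytes(byte_length, "big").decode(encoding)


# Hypothetical sample text: a public-domain sentence used only for illustration.
sentence = "Call me Ishmael."

utf8_length = len(sentence.encode("utf-8"))
utf16_length = len(sentence.encode("utf-16-be"))

as_utf8 = text_to_int(sentence, "utf-8")
as_utf16 = text_to_int(sentence, "utf-16-be")

# Same work, two different numbers, depending only on the encoding chosen.
print(as_utf8 == as_utf16)
print(int_to_text(as_utf8, utf8_length, "utf-8") == sentence)
print(int_to_text(as_utf16, utf16_length, "utf-16-be") == sentence)
```

The byte length is passed explicitly because an integer alone does not remember leading zero bytes, which is one more way the number by itself underdetermines the work.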

This in turn also makes that number protected.

“Protected” is doing a lot of heavy lifting in this sentence. It doesn’t mean the number is copyrighted. If you randomly wrote a string of numbers and posted it and it just so happened in a random encryption scheme to correspond to a copyrighted work, that doesn’t mean that posting that number would be a copyright violation.

The claim about “numbers not being copyrightable” is an oversimplification of the fact in a way that becomes misleading and useless in copyright debates.

It’s an accurate statement. The irony is that this was the AC’s own straw man they made up. I never even endorsed it. I’m just arguing with the finer points because it shows lack of knowledge on the topic.

If you still don’t get it, you can try testing out your theory in court and see who wins.

It’s not my theory! Reread the thread. I didn’t bring this up at all.

You’re so desperate to argue and assume I’m wrong about something that you’re arguing with other people’s straw men instead of just the ones you’ve invented. You are desperate.

Anonymous Coward

May 27, 2025 at 1:36 pm

Re: Re: Re:¹⁰

I am well aware I quoted Canadian law. I’ve been quoting Canadian law the whole time, because a. I’m Canadian, b. it’s actually drafted clearly unlike the godawful U.S. equivalents, and c. it has been a test of whether you are actually at all interested in the citations you kept demanding.

Which you’ve spent a whole-ass week failing. Thanks for playing.

MrWilson (profile)

May 27, 2025 at 6:50 pm

Re: Re: Re:¹¹

Aha! My ineptitude was just a ruse, by george, and you fell for it! I really intended on arguing about things I knew were completely different to prove the point that you weren’t interested in the things you didn’t ask for!

Which you’ve spent a whole-ass week failing. Thanks for playing.

This is rule of goats territory. It doesn’t matter why you did it or why you pretend you did it. You have undermined all your arguments. They are useless here. That’s not a gotcha.

But also, if you’ll notice, I asked for citations of a type and you responded with a different type.

For example:

I said: “Show me a copyright registration that includes an ASCII encoded numeric representation of a Harry Potter book chapter.”

You quoted a statute. Even if you had quoted a US law statute, it would still not be a copyright registration that includes an ASCII encoded numeric representation of a Harry Potter book chapter.

So even your excuse doesn’t hold water.

Thanks for playing indeed. You’ve spent a whole-ass week being provably wrong. I guess that’s worth something to you?

Anonymous Coward

May 21, 2025 at 10:38 pm

Re:

How much you want for me reading your comment? Cuz you ain’t gettin’ it.

How much you want for an LLM reading it? Because guess what.

[In no way do I endorse the hot garbage that is commercial “AI”, and as to “AI” being as important and central to life as electricity, fuck off, dime store Kurzweils.]

Anonymous Coward

May 22, 2025 at 12:11 am

“at the expense of creativity and innovation?”

Yeah, because nothing says creativity like thirty million shrimp Jesus flight attendants.

Arianity (profile)

May 22, 2025 at 12:18 am

At this scale, locating copyright holders and getting their permission is daunting for even the biggest and wealthiest AI companies, and impossible for smaller competitors.

“It would be hard” is not a factor in fair use. (Never minding that licensing companies/solutions are popping up)

It repeatedly conflates the use of works for training models—a necessary step in the process of building a GenAI model—with the use of the model to create substantially similar works.

It pretty explicitly considers them separately. To quote:

“The use of a work in initial pre-training, for instance, may be distinct from its use in subsequent training or RAG. A number of commenters opined that the fair use analysis requires treating these different uses separately.”

Similarly: “Because generative AI models may simultaneously serve transformative and non-transformative purposes, restrictions on their outputs can shape the assessment of the purpose and character of the use.”

As well as: “some uses of copyrighted works for generative AI training will qualify as fair use, and some will not. On one end of the spectrum, uses for purposes of noncommercial research or analysis that do not enable portions of the works to be reproduced in the outputs are likely to be fair.” etc.

And rightsholders don’t have the right to control fair uses

The report explicitly covers this: “One might argue that although copyright owners do not have a right to charge for fair uses as such, they do have a right to charge for access to their works.” Which is true. While an author can’t control how you use their book, you do generally have to actually buy the book (or go to a library or equivalent).

Third, it’s based entirely on speculation that Bridgerton fans will buy random “romance novels” instead of works produced by a bestselling author they know and love.

There are literally already examples of people turning to AI work. This is also a bit of a red herring, by only considering existing fans (or bestselling authors- people who aren’t bestsellers are also covered by copyright). New potential fans would have no reason to know/love/be loyal to that author. (It also affects things like licensing). The only speculation is how far it will go, which will be highly dependent on how good it gets as it continues to improve.

Second, it’s not supported by any relevant precedent.

The situation is literally unprecedented. While it’s nice to have precedent to look back at, novel situations do sometimes require new precedent. As the report notes, it does get back to the fundamental reasons why copyright law exists in the first place.

This relies on breathtaking assumptions that lack evidence, including that all works in the same genre are good substitutes for each other—regardless of their quality, originality, or acclaim.

This is a massive strawman. It’s not claiming that all works are substitutes.

It’s a policy judgment about the value of GenAI technology for future creativity, by an office that has no business making new, free-floating policy decisions.

“[a]dvise Congress on national and international issues relating to copyright,”

However, we are equally as interested in what the law should be in the future.

Anonymous Coward

May 22, 2025 at 12:26 am

Love that the Electronic Frontier Foundation is choosing to have conniptions about hypothetical threats to creativity, innovation and freedom in order to defend technology and corporations that are currently causing actual harms to people’s ability to use the internet to communicate freely.

Anonymous Coward

May 22, 2025 at 9:20 am

This is also causing zero-click searches which don’t profit any content creators.

Content quality will suffer and we’ll return to a patronage model.

PB&J (profile)

May 22, 2025 at 9:54 am

The report attempts to sidestep that conclusion by repeatedly ignoring the actual use in question—training —and focusing instead on how the model may be ultimately used.

This suggests that the trained model itself is the thing of value — but that is demonstrably not true. No one wants the model — the trillions of parameters and weights are not meaningful or useful by themselves. What they want is the ability to produce outputs from that model.

So I might grant you that the training is fair use, but the moment you use the model, it sure feels like that’s infringement.

Explorer09 (profile)

May 22, 2025 at 11:58 pm

Re:

The model would have been useful if it were an aggregation of all human knowledge of the kind Wikipedia has been building. And yet no AI companies insist their models are such a thing. I would have granted them fair use if those AI companies really made the models for the benefit of the general public, but that’s not the case. Big Tech as we see it made AI for private profit. They are not charities; they are not nonprofits that can argue about fair use in this manner.

urza9814

May 22, 2025 at 7:21 pm

These CEOs ought to be in prison

Aaron Swartz died facing prison for “stealing” tens of gigabytes of copyrighted data to give it freely to the world for the benefit of us all. These executives “steal” petabytes purely to fill their own bulging wallets and rather than throwing them in prison everyone is bending over backwards to justify and excuse it!

CEOs are not gods. They do not deserve such worship.

Explorer09 (profile)

May 22, 2025 at 11:28 pm

EFF on fair use, they are wrong this time

This is one of the times I disagree with EFF on the fair use argument, seriously. Generative AI isn’t just an “innovative” tech but also a tech that exploits the creative labor of other people. The evidence is clear enough that Meta torrented books (read: pirated) to train their Llama AI models, and EFF can’t explain why Meta couldn’t just buy legal copies of books and train the AI with just those. That’s one of the failures in EFF’s argument.

I do agree that BitTorrent and other P2P platforms should not be illegal by themselves (as EFF argued), but as Meta used that pirated content in commercial applications, that is a strong factor weighing against fair use for AI training, even though many AI advocates suggest the opposite.

And note that the arguments that generative AI could promote free speech for minority groups are nonsense. AI-generated content is not protected speech in many countries that deal with issues with AI (US and EU included). Therefore it could be legal for web platforms to force users to disclose whether the content they upload is AI-generated without violating, e.g., the First Amendment in the US.

As for generative AI training (the main topic here), EFF’s fair use argument largely ignores the ruling in Warhol Foundation v. Goldsmith. EFF’s suggestion that the fair use analysis of AI training be decoupled from the “ultimate use” of the AI is erroneous. The Warhol ruling says the opposite of what EFF is suggesting right now, and EFF didn’t learn.

Explorer09 (profile)

May 23, 2025 at 12:26 pm

My frustration with the arguments of people claiming it’s not fair use and that all training must be licensed is that many people seem to think they’re championing the little guy when they’re inadvertently advocating for the benefit of the wealthy and corporations.

Yes, inevitably this would be a corporations vs. corporations fight. Specifically, Big Hollywood vs. Big Tech. And like it or not, I have to pick a side.

And I admit I picked the Hollywood side, not because I like them or they are all moral, but they simply respect creative workers and their unions, while the Big Tech ignored them mostly.

First, it is fair use whether you like the idea of corporations profiting off of the largely unpaid work of poor and creative people. That’s just capitalism in general. Argue against that instead of licensing costs for AI training if that’s what you want.

Warhol v. Goldsmith can question your claim about fair use, and before you reply, let me tell you I have read all the amici curiae briefs in the Kadrey v. Meta regarding the summary judgement motion. In other words, I know all arguments of both sides.

[T]he ability of a large corporation to train AI on publicly available data without paying every single copyright owner is the same ability you have to do the same. It’s the same that an independent university researcher trying to study potential treatments for rare and unprofitable diseases.

Nay. The fair use analyses in US courts don’t work that way.

(1) University researchers doing research on copyrighted data would more likely be found to be making fair use than for-profit corporations, as university research serves more purposes, including profit and non-profit (educational) ones.

(2) An AI for researching medical treatments is very unlikely to need data about fiction books, music and visual arts. Yet general-purpose AIs like ChatGPT are trained on those artistic works without justification. Saying that it is fair use undermines common sense.

You’re arguing, ultimately, that only wealthy corporations with large treasure troves of corporate profits should be allowed to build LLMs. You’re opposing the democratization of the technology.

This argument is flawed for two reasons. (1) It is Big Tech we are fighting and they are already “wealthy corporations with large treasure troves of corporate profits” and yet they don’t pay content creators a single penny. You assume that only Hollywood is wealthy, which is far from reality. (2) There is no “democratization” at all with AI. You still have the Big Tech monopoly, like it or not, and the rest of the “democratization” argument is a straw man.

It’s like saying you don’t like Microsoft’s business practices so Linux should also be outlawed.

Linux didn’t copy creative works of others without consent. False analogy.

It is absolutely important to advocate for regulations and oversight and ethical laws that control how this happens.

Does the Trump administration ever want to regulate AI when they are trying to pass a bill that restricts states from regulating AI for 10 years?

Actual creators, such as myself, will get virtually nothing even if licensing is required.

At least they would no longer train on your work without consent.

Anonymous Coward

May 23, 2025 at 6:35 pm

Re:

And like it or not, I have to pick a side.

You don’t.

This is a case where both sides deserve to lose.

MrWilson (profile)

May 23, 2025 at 9:20 pm

Re:

Specifically, Big Hollywood vs. Big Tech.

Hollywood is movies. We’re talking about content/media companies that are more than just film. It’s film, music, video games, books, podcasts, audiobooks, et al.

And like it or not, I have to pick a side.

Sounds like you have a personal bias that clouds your judgement. Also, you absolutely don’t have to pick either of those sides. That’s a false dilemma. If you pick either of those sides, they will both continue to win and you will always lose.

And I admit I picked the Hollywood side, not because I like them or they are all moral, but they simply respect creative workers and their unions, while the Big Tech ignored them mostly.

Holy fuck! Have you been awake for the last thirty years? Writers’ strike mean anything to you?

Hollywood is the industry that has violated and exploited my copyrights as a creator more than anyone else. You sound like you’re saying you like one abusive boyfriend more than another because your preferred boyfriend beats you less often. That’s fucked up.

Warhol v. Goldsmith can question your claim about fair use,

No, not really.

and before you reply,

Oh, this is definitely going to be a non sequitur…

let me tell you I have read all the amici curiae briefs in the Kadrey v. Meta regarding the summary judgement motion. In other words, I know all arguments of both sides.

“Knowing” all arguments of both sides doesn’t mean you’re right. Also, there are more sides than just the amicus briefs.

Nay. The fair use analyses in US courts don’t work that way.

You missed the point of the statement. Rights are universal. Meta’s right to moderate its website is the same right I enjoy to moderate my website. If it’s fair use for me to copy public data (e.g. use a web browser), it’s fair use for a corporation and vice versa. And that act is functionally the same as a deaf person downloading the same data and having a screen reader read it out loud or a blind person using a device to convert it to braille or a person with ADHD copying a long article and having an LLM summarize it for them or a person with a good memory being able to read and recite an entire work.

(1) University researchers doing research on copyrighted data would more likely be found to be making fair use than for-profit corporations, as university research serves more purposes, including profit and non-profit (educational) ones.

Except university researchers have smaller legal funds to defend against corporate lawsuits, so this analysis is useless in the face of punishment-by-lawsuits. Also, some corporate LLMs can be used by researchers, so the issue is more gray than you’re pretending. Just look at the case law surrounding copyright issues with academic journals.

(2) An AI for researching medical treatments is very unlikely to need data about fiction books, music and visual arts. Yet general-purpose AIs like ChatGPT are trained on those artistic works without justification.

Except the arguments against corporations training on data aren’t against the purpose but rather the process. The arguments include the claim that the copying is necessarily copyright infringement, so it doesn’t matter what the content is that is being copied and it doesn’t matter what purpose it is being copied for. You’re saying there are exceptions, but I’m specifically critiquing the people who aren’t allowing for any exceptions in their analyses.

Saying that it is fair use undermines common sense.

Saying it’s not fair use undermines the law.

Common sense is often claimed by myopic people who think everyone does or should think as they do.

This argument is flawed for two reasons. (1) It is Big Tech we are fighting and they are already “wealthy corporations with large treasure troves of corporate profits” and yet they don’t pay content creators a single penny. You assume that only Hollywood is wealthy, which is far from reality.

I don’t assume only Hollywood is wealthy. Not sure where that came from. You’re also not only fighting big tech. You’re fighting anyone who would make the same fair use argument, which includes researchers, libraries, college kids, and precocious middle schoolers playing with technology in their mom’s basement. You’re using a nuclear argument to eliminate an enemy city but also every member of innocent wildlife living in the adjacent rural areas. Big tech will be fine regardless of how the lawsuits play out. The little guys will not. Research will suffer. Some kid is going to get sued for training an LLM the way his dad was sued for downloading MP3s on Limewire. Poor people will be further crushed because you’re handing corporations a cudgel.

(2) There is no “democratization” at all with AI. You still have the Big Tech monopoly, like it or not, and the rest of the “democratization” argument is a straw man.

That just shows you don’t know anything about the AI field. There are a lot of open source and independent projects. There are people training specifically anti-corporate LLMs and public advocacy LLMs. That you are proudly ignorant of them speaks to the limited basis of your argument.

Linux didn’t copy creative works of others without consent. False analogy.

First, you don’t seem to understand what analogies are. If analogues were exactly the same as what they’re compared to, they wouldn’t need to be analogized. Second, fair use doesn’t require consent, so this argument is pointless. But also, the fact that Linux didn’t improperly copy the works of others (hello SCO Group!) is why the analogy is meaningful. I’m saying you’re targeting innocent people because you are blindly attacking them while thinking you’re attacking large corporations.

Does the Trump administration ever want to regulate AI when they are trying to pass a bill that restricts states from regulating AI for 10 years?

I would never suggest that the Trump Administration would ever produce anything ethical. Also, the Trump Administration isn’t the legislature actually responsible for passing laws.

At least they would no longer train on your work without consent.

My copyrighted works get infringed all the time. LLM training isn’t my problem. In ten years when a middle schooler asks their AI teacher about a topic I’ve produced content in relation to, I’d love for my contributions to the field to come up rather than be completely forgotten because I was butthurt I didn’t get a five dollar settlement from a useless lawsuit that only made corporations and lawyers wealthier.

You seem ignorant about the inevitable nature of this. Read op-eds about smart phones from twenty years ago or rants about the internet or the telephone or the newspaper and how they were going to ruin civilization. It’s all just tools. The wealthy will always use available tools to make themselves wealthy. You can’t stop that. You can avail yourself of your own tools though. Killing your own tools to spite the wealthy who won’t feel the minor prick is shortsighted and arrogantly stupid.

Explorer09 (profile)

May 23, 2025 at 11:34 pm

Re: Re:

Hollywood is the industry that has violated and exploited my copyrights as a creator more than anyone else. You sound like you’re saying you like one abusive boyfriend more than another because your preferred boyfriend beats you less often. That’s fucked up.

Unfortunately that’s the situation. It’s indeed f__ked up whether you like it or not.

If it’s fair use for me to copy public data (e.g. use a web browser), it’s fair use for a corporation and vice versa.

That’s the big IF there. I can explain the case of Warhol Foundation v. Goldsmith if you are willing to listen. Your fair use assumption no longer holds after the Supreme Court decision in that case.

Also, some corporate LLMs can be used by researchers, so the issue is more gray than you’re pretending.

Copyright groups already call this “Data Laundering” and specifically oppose it. It’s not my position, yet I agree with their reasoning. Academics can make their own LLMs from scratch that are transparent about which data they have been trained on and credit the copyright owners appropriately. There is no justification for using a commercial LLM that might be illegal in the first place to do it.

You’re fighting anyone who would make the same fair use argument, which includes researchers, libraries, college kids, and precocious middle schoolers playing with technology in their mom’s basement.

Fair use depends on the ultimate purposes of the AI models and it would not be “all yes” or “all no”. My position on this is the same as the USCO’s. It’s dangerous to greenlight all of them because some of the training is really unethical to begin with.

You’re using a nuclear argument to eliminate an enemy city but also every member of innocent wildlife living in the adjacent rural areas.

Why not? Nuke all LLMs until one is built that respects people’s rights. What’s the problem with that?

Because LLMs are quite a new tech, I don’t see any problem with forcing everyone back to the pre-LLM ways of working and lifestyle. It’s your problem if you made the unethical tech your life.

The little guys will not. Research will suffer.

Unfounded argument.

Some kid is going to get sued for training an LLM the way his dad was sued for downloading MP3s on Limewire.

In other words, the Napster era.

I have no compassion for people pirating music, even though tech like Limewire had other legal uses. This argument is exactly the Napster case. It’s not the VCR (Betamax) case. And I have had enough of such arguments that ignore the court rulings. Sorry, don’t try to change my mind.

There are a lot of open source and independent projects.

The software being open source is independent of the data model being legal.

There are people training specifically anti-corporate LLMs and public advocacy LLMs.

I don’t advocate for blocking them. Why do you think that I do? As long as the models are all trained from scratch.

Saying you can buy a commercial pre-trained model (which might be copyright infringement in the first place) and augment that model and claim it’s all yours is dishonest. I specifically condemn this kind of lying.

Second, fair use doesn’t require consent, so this argument is pointless. But also, the fact that Linux didn’t improperly copy the works of others (hello SCO Group!) is why the analogy is meaningful.

The SCO v. IBM case, no?

That case didn’t rule on anything about fair use, according to Wikipedia. The question was whether SCO was entitled to sue over Linux, as SCO’s copyright claim over the so-called Unix code was unclear. Nothing to do with AI training or fair use.

My copyrighted works get infringed all the time. LLM training isn’t my problem.

You have no idea what kind of rights we are fighting for, for you. You are so “innocent” that you got tricked into throwing away rights that should be yours to hold. Pathetic.

In ten years when a middle schooler asks their AI teacher about a topic I’ve produced content in relation to, I’d love for my contributions to the field to come up rather than be completely forgotten

Feel free to license your works under a free license such as CC-BY. There’s nothing stopping you from doing so. Also you can give explicit permission for AI training if you want to. You are simply not forced to do it.

Read op-eds about smart phones from twenty years ago or rants about the internet or the telephone or the newspaper and how they were going to ruin civilization.

Off-topic. But look at the issue of young children getting addicted to phones and the internet in general. And the modern internet landscape is loaded with tons of misinformation and disinformation; the fear of smartphones ruining civilization isn’t without merit.

MrWilson (profile)

May 24, 2025 at 1:58 am

Re: Re: Re:

Unfortunately that’s the situation. It’s indeed f__ked up whether you like it or not.

Except that’s not the situation. You don’t have to side with any wealthy people at all. That you think you do is what’s fucked up.

That’s the big IF there. I can explain the case of Warhol Foundation v. Goldsmith if you are willing to listen. Your fair use assumption no longer holds after the Supreme Court decision in that case.

Except Warhol is about a photograph, not text. Warhol is about licensing a derivative work, not LLMs producing non-derivative works. Warhol is about two entities attempting to use the same content for the same commercial purpose. If you can cite an author who wrote their text with the intent to train an LLM, I’m interested in seeing that.

Copyright groups already call this “Data Laundering” and specifically oppose this. It’s not my position. Yet I agree with their reasoning.

That doesn’t make you or them correct. The training is fair use regardless of who does it. Making up a term for it doesn’t make it illegal.

The academics can make their own LLMs from scratch that are transparent about which data they have been trained on and credit the copyright owners appropriately.

Not according to the people whose arguments I’m referring to. They are claiming that all training is a de facto copyright violation. If you’re not arguing that, then there’s no reason for you to respond to my post.

Fair use depends on the ultimate purposes of the AI models

No, that’s only one of the four factors. You’re being myopic.

and it would not be “all yes” or “all no”.

That’s for a court to decide.

My position on this is the same as the USCO’s. It’s dangerous to greenlight all of them because some of the training is really unethical to begin with.

Unethical isn’t a legal argument. We can talk about ethics all day but it has little to do with our laws or government. We live in an unethical system where laws are bought by corporations. If you’re only making an ethical argument, then copyright laws are themselves unethical and the whole discussion is moot.

Why not? Nuke all LLMs until one is built up that respect people’s rights. What’s the problem with that?

Nuking all LLMs means there’s no until. If you successfully argue that LLMs require paid licensing for training, then no ethical LLMs can ever be trained by anyone because only unethical corporations will be able to afford the entry fee of licensing costs.

Because LLMs are quite a new tech, I don’t see any problem with forcing everyone back to the pre-LLM ways of working and living. Making the unethical tech part of your life is your own issue.

Except it’s not me at all. I’m not even using LLMs much. I’ve played with them to see what they’re capable of, and their failures in performance have given me confidence that they won’t yet be able to replicate my creativity. I’m not afraid to test them to find out their limitations and there are many limitations to document. But that doesn’t mean that LLMs can’t improve our collective performance in areas we actually value. I’m saying ignore the cute but absurd LLMs generating twelve-finger art and look at the progress of the LLMs diagnosing medical patients better than experienced doctors. You’re obsessed with getting one over on Big Hollywood so much that you’re willing to fuck over a cancer patient.

Unfounded argument.

[citation needed]

In other words, the Napster era.

Yes, an era where big corporations fucked over poor people. Why are you cheering on a return to that?

I do not share compassion for people pirating music, even though tech like LimeWire had other legal uses. This argument is exactly the Napster case. It’s not the VCR (Betamax) case. And I have had enough of such arguments that ignore the court rulings. Sorry, don’t try to change my mind.

You don’t share compassion with poor people who can’t afford to purchase overpriced luxury commodities in a capitalist system that undervalues worker contributions such that they resort to illegal means that then subject them to disproportionate penalties based entirely on legislation passed by corrupt legislators bribed by the very democracy-undermining corporations who benefit from said lawsuits? Color me completely surprised and definitely sarcastic in this particular sentence. I feel like this says everything I’ve been trying to say. You don’t mind the little guy getting crushed because you’ll defend to the death corrupt corporate-bribed laws that magically change when they want another unethical payday for work they never performed themselves.

The software being open source is independent of the data model being legal.

The open source projects benefit from the same fair use analysis. LLMs are significantly less useful without large data sets. Independent development will never be able to afford arbitrary licensing fees.

I don’t advocate for blocking them. Why do you think that I do? As long as the models are all trained from scratch.

What the fuck does “trained from scratch” mean? Are you suggesting that an LLM only be trained from the 40,000 words a researcher uses to define parameters? Do you not understand at all the technology we’re talking about?

Saying you can buy a commercial pre-trained model (which might be copyright infringement in the first place) and augment that model and claim it’s all yours is dishonest. I specifically condemn this kind of lying.

Straw man. Who said anything like this?

That case didn’t rule anything about fair use according to Wikipedia. The question was whether SCO is entitled to sue Linux as the copyright claim from SCO over so called Unix code was unclear. Nothing to do with AI training or fair use.

So you’re saying you weren’t previously aware that SCO was claiming that Linux did in fact contain “stolen” code from Unix? So you’re saying you don’t understand the point and your analysis is limited by your limited understanding? I’ll accept that.

You have no idea what kind of rights we are fighting for, for you. You are so “innocent” that you got tricked into throwing away rights that should be yours to hold. Pathetic.

You don’t get to represent me without my consent. The sheer arrogance of this statement is absurd. I’m not throwing away any rights I’m able to retain.

Feel free to license your works under a free license such as CC-BY. There’s nothing stopping you from doing so. Also you can give explicit permission for AI training if you want to. You are simply not forced to do it.

I have already licensed many of my works using a Creative Commons Attribution license. I love that you think you’re educating me on this topic. A CC-BY license doesn’t mean your works aren’t subject to copyright violations. If there’s no attribution, it’s a violation. Also, big media corporations don’t care about that. I have first hand experience.

Off-topic.

Not off-topic at all. Highly relevant. There’s a moral panic with every new technology that pops up. Everything gets labeled as bad for children. I learned math from playing DnD as a kid despite being told it would cost me my soul. I learned programming from editing computer games as a teenager despite being told video games would rot my brain. Telephones meant people would no longer visit each other in person and we’d all become disconnected entirely. Newspapers meant people would read in public instead of engaging socially with the people around them. Clutch those pearls!

But look at the issue of young children getting addicted to phones and the internet in general.

This isn’t a technology issue. Before phones it was video games. Before video games it was TV. Before TV it was radio. Before radio, it was marbles and jacks. Before marbles and jacks, it was billiards and pool and that rhymes with T and that stands for trouble right here in River City. Anyone, especially kids, looking for a distraction or an obsession, will find it, regardless of whether it’s a complex machine or a stick that they can pretend is a sword or a gun. That you think phones and the internet are the problem rather than the symptom is, again, telling about your own myopic perspective.

And the modern internet landscape is loaded with tons of misinformation and disinformation; the fear of smartphones ruining civilization isn’t without merit.

You should be more concerned with the lack of education on how to be skeptical of uncited claims (including yours). There will always be mis- and disinformation regardless of the medium of communication. Critical thinking and analysis is important, but you’re advocating for a dumbed down logic of “copying bad.” Human behavior will always be the problem regardless of the medium of communication or any tools or technology. You’re abetting human behavior by pretending technology is the problem.

Explorer09 (profile)

May 24, 2025 at 4:36 am

Re: Re: Re:²

Except Warhol is about a photograph, not text. Warhol is about licensing a derivative work, not LLMs producing non-derivative works.

Your assumption is that an LLM is not a derivative work. I disagree. And there’s no court ruling on that yet. We can wait and see how the AI copyright lawsuits go.

The training is fair use regardless of who does it. Making up a term for it doesn’t make it illegal.

Court case citation needed. (I have cited mine, this is your burden of proof.)

Nuking all LLMs means there’s no until. If you successfully argue that LLMs require paid licensing for training, then no ethical LLMs can ever be trained by anyone because only unethical corporations will be able to afford the entry fee of licensing costs.

Look at Fairly Trained, an organisation proving the opposite of your claim.

You don’t share compassion with poor people who can’t afford to purchase overpriced luxury commodities in a capitalist system that undervalues worker contributions such that they resort to illegal means that then subject them to disproportionate penalties based entirely on legislation passed by corrupt legislators bribed by the very democracy-undermining corporations who benefit from said lawsuits?

What the heck does this have to do with AI? The AI companies are not poor people. They made millions or billions of dollars exploiting the creative works of others.

LLMs are significantly less useful without large data sets. Independent development will never be able to afford arbitrary licensing fees.

The first of what you said is disputed. The second, as I replied in another post, fair use is a bad solution for AI startups because it still exploits works of artists, big or small.

What you were advocating is the legalisation of exploitation by Big Tech companies under the shields of small companies. You don’t know what you are defending against.

What the fuck does “trained from scratch” mean? Are you suggesting that an LLM only be trained from the 40,000 words a researcher uses to define parameters? Do you not understand at all the technology we’re talking about?

There is an initial state for the neural network parameters before the network starts getting fed with human content for the parameters to automatically adjust themselves. That is what they call “pre-training” and I know that pretty well. When Adobe can train a model from scratch, without using a pre-trained model from others (which might be accused of copyright infringement), there is no reason small companies can’t do it.

So you’re saying you weren’t previously aware that SCO was claiming that Linux did in fact contain “stolen” code from Unix?

What does this have to do with AI? Hell, if this case mattered at all, the AI companies would have been citing it as a defense. This SCO v. IBM case has even lower importance than, say, Google v. Oracle (the accusation of Google copying Java code in its Android operating system).

You don’t get to represent me without my consent. The sheer arrogance of this statement is absurd. I’m not throwing away any rights I’m able to retain.

I say pathetic. You don’t know we are fighting for your rights too, and you mistook us for enemies out of that ignorance.

MrWilson (profile)

May 24, 2025 at 12:27 pm

Re: Re: Re:³

The training is fair use regardless of who does it. Making up a term for it doesn’t make it illegal.

Court case citation needed. (I have cited mine, this is your burden of proof.)

You literally just said there wasn’t a case yet, so why are you asking for one?

Look at Fairly Trained, an organisation proving the opposite of your claim.

Except they’re not. It’s run by a guy who thinks training is stealing. They claim to certify ethical AI training. That’s just a claim. I can form an organization that claims the opposite. That doesn’t make the claim true.

Also, organization is typically spelled with a Z in the US. Are you an American?

You don’t share compassion with poor people who can’t afford to purchase overpriced luxury commodities in a capitalist system that undervalues worker contributions such that they resort to illegal means that then subject them to disproportionate penalties based entirely on legislation passed by corrupt legislators bribed by the very democracy-undermining corporations who benefit from said lawsuits?

What the heck does this have to do with AI? The AI companies are not poor people. They made millions or billions of dollars exploiting the creative works of others.

You’re admitting here that you haven’t understood anything about my position. This has everything to do with AI.

Literally the first thing I said to which you responded was: “My frustration with the arguments of people claiming it’s not fair use and that all training must be licensed is that many people seem to think they’re championing the little guy when they’re inadvertently advocating for the benefit of the wealthy and corporations.”

I’m not defending AI companies at all. I’m defending the little people. The AI companies have enough money or can get enough money to license their content and the little people will still be fucked. What you’re advocating for is making it difficult for the little guy to do things by putting a massive price tag on the activity that only a wealthy corporation will be able to afford. Have you not read anything I’ve written?

The first of what you said is disputed.

Experiment yourself. Test the performance of an LLM trained on an extremely small dataset and one on a large dataset. The difference should be obvious. “Disputed” here is like “vaccine skeptic.”

The second, as I replied in another post, fair use is a bad solution for AI startups because it still exploits works of artists, big or small.

That’s an entirely subjective claim. Do you consider it exploitation of an author if someone reads their work and the reader becomes a better writer because of the experience? Should the reader ask for permission to learn from their experience?

What you were advocating is the legalisation of exploitation by Big Tech companies under the shields of small companies. You don’t know what you are defending against.

You don’t know what I’m fighting for, even when I explain it to you. I’m fighting for the little guy who needs open source and independent AI to provide alternatives to the inevitable corporate AI dominance. And your position is that both the little guy and the corporation should pay millions of dollars to develop their models, therefore, only the corporation will be able to offer anything useful and then AI will be dictated by profiteering corporations rather than democratized and more useful for securing and enhancing rights and freedoms.

Also, there’s the Z = S conversion again in “legalisation.”

There is an initial state for the neural network parameters before the network starts getting fed with human content for the parameters to automatically adjust themselves. That is what they call “pre-training” and I know that pretty well.

Yeah, so you don’t understand the technology. Pre-training is the point at which the large dataset of “human content” is fed into the LLM.

When Adobe can train a model from scratch, without using a pre-trained model from others (which might be accused of copyright infringement), there is no reason small companies can’t do it.

“Train from scratch” in the context of an LLM just means that you’re providing your own chosen dataset rather than copying from someone else’s. That doesn’t have any effect on the legality of the process. You can “train from scratch” with copyrighted material.

What does this have to do with AI?

You don’t seem to be able to follow the conversation.

You said “Linux didn’t copy creative works of others without consent. False analogy.” I pointed out that SCO actually accused Linux of copying the work of others without consent. I was pointing out that you didn’t understand the history of what you were referring to, so your analysis that it was a “false analogy” is useless. If I make a knitting metaphor and you don’t know what a knitting bobbin is, your analysis of my metaphor is useless. But you still missed the entire point of the analogy. It was to say that you are attacking big corporations but hurting innocent independent non-profits, researchers, students, and poor individuals. That’s the whole point here.

I say pathetic.

Yes, trying to represent someone without their consent is pretty pathetic. Telling that person that they don’t understand their own interests when you clearly don’t understand their interests is patronizing.

You don’t know we are fighting for your rights too, and you mistook us for enemies out of that ignorance.

You aren’t fighting for my rights. You are fighting against my right to train an LLM based on content I can find in the world, the entire breadth of human knowledge that is available online. You’re saying I should have to pay millions of dollars to a big media company in order to scan copies of works I already own a copy of (hello, first sale doctrine). You’re saying I should be stuck with only having access to LLMs that profitable corporations develop, that future authoritarian administrations will try to adjust with “official” takes that erase actual history and human rights violations. Your inadvertent position is that only large corporations will be able to shape the future.

Did you know you can sue AI companies when they output your paper without citing you as the author? That’s your right.

Did you know that I’ve tried to get LLMs to output my work and they haven’t gotten close to it at all? That’s a useless right if the output isn’t a derivative work.

The Doe v. GitHub case, now pending on appeal in the Ninth Circuit, is fighting for exactly this. And in case you didn’t know, the Doe v. GitHub plaintiffs are open source software developers.

You seem to think that means something contrary to my position. Again, you’re arguing against a straw man because you don’t understand what I’m saying, despite my explicit statements.

Explorer09 (profile)

May 24, 2025 at 12:43 pm

Re: Re: Re:⁴

The training is fair use regardless of who does it.
You literally just said there wasn’t a case yet, so why are you asking for one?

Thomson Reuters v. ROSS Intelligence: District court ruled that Ross’s use is not fair use. The decision is pending appeal.

Kadrey v. Meta: This is the case closest to receiving a decision on whether generative AI training is fair use. I’m keeping an eye on this one.

If you successfully argue that LLMs require paid licensing for training, then no ethical LLMs can ever be trained by anyone because only unethical corporations will be able to afford the entry fee of licensing costs.
[Fairly Trained is] run by a guy who thinks training is stealing. They claim to certify ethical AI training. That’s just a claim. I can form an organization that claims the opposite. That doesn’t make the claim true.

It is your argument that there can be no “ethical” training, using “ethical” by my definition. When I gave you an example of the opposite, you then denied the concept of “ethical”. WHAT THE F*CK is wrong with you?

(Before you reply, let me tell you Meta made the same arguments in their defense in Kadrey v. Meta. So I know how f*cking evil it is.)

MrWilson (profile)

May 24, 2025 at 2:16 pm

Re: Re: Re:⁵

Thomson Reuters v. ROSS Intelligence

That wasn’t fair use because of the four factor analysis finding that it was for the purpose of building a competing service so it failed the market effect factor. That’s not universal to all LLM training.

When I gave you an example of the opposite

You haven’t given me an example of the opposite. You referenced an organization apparently headed by biased individuals who operate on unproven bases for their approach. You take their claims at face value. I do not. I will wait for actual evidence, not claims.

(Before you reply, let me tell you Meta made the same arguments in their defense in Kadrey v. Meta. So I know how f*cking evil it is.)

Before you reply, let me tell you that everything you keep saying after “before you reply” hasn’t changed how incorrect you are. Your claimed knowledge apparently isn’t helping you make sound arguments. Trying to pre-empt my arguments when you don’t even understand my position doesn’t seem fruitful.

It’s just terminology difference, but the same concept.

Except you said: “I don’t advocate for blocking them. Why do you think that I do? As long as the models are all trained from scratch.” Since it’s possible to “train from scratch” with copyrighted material, this distinction is useless. You’re saying it’s both copyright infringement to train an LLM on copyrighted material, but also ethical if you do the copying yourself rather than copying someone else’s copied dataset. That doesn’t make any sense.

And it’s your liability for training it with copyrighted material then.

Yes, and large corporations can afford to either license content or pay lawsuit settlements. Independent researchers and students and precocious middle schoolers can’t. You’re saying the little guy should be financially crushed should he train an effective LLM on a large dataset.

It is you that insist that is “fair use” and keep denying it.

Yes, because it is fair use, the same as if I read a book at the library and remember every word in it and that memory informs my ability to write, but not just regurgitate the exact same text. It is fair use for children to learn to write by reading what others have written. For your position to be consistent, you would have to insist that children should pay to learn to read.

And I didn’t deny that I help the big corporations here.

And yet you pretend to judge the ethics of others. This is really all you have to say to prove you have no moral stance here.

Because there is no “little guy” that you claimed to be defending.

I’ve literally been referencing little guys. You are proving you haven’t actually understood what I’ve said.

It’s corporations vs. corporations, like it or not.

Again, false dilemma. You don’t have to pick either side. You can oppose both.

So your “frustration” with the issue was based on misunderstanding of it. Not my fault.

Here you’re just admitting you don’t understand how case law and precedents work. I have already explained this, but you still miss this point: If corporations lose cases and the result is a legal precedent that all training requires financial compensation, poor people will not be able to afford to train LLMs and therefore only wealthy corporations will be able to. Full stop.

You thought the Big Tech side were the good guys.

No, I didn’t. I literally said: “Except that’s not the situation. You don’t have to side with any wealthy people at all. That you think you do is what’s fucked up.”

I say no they aren’t. It’s better to admit they are both evil, but I have my own value to protect.

I’m saying they’re both “evil,” though it’s clearer to say greedy and unethical. You aren’t protecting your own value. You’re selling your soul to the company store while pretending corporations are just opposing sports teams you have to pick from to align yourself with. You don’t have to align with any of them. It’s fucked up that you think they’re evil but you’re still picking one of them. That really kills any pretense of a moral argument from you.

You should pay tuitions to your teacher. Everyone pays someone when they learn.

This is where I don’t think you’re American again, or maybe not a native speaker of American English. In the US, “pay tuitions to your teacher” isn’t a thing. You pay tuition at college or a private school, and not directly to an individual teacher who is more commonly called a professor or instructor at that level, but most K-12 schools where students actually learn to read are public schools that don’t involve tuition. Also, the plural is tuition, not tuitions, unless you’re talking about different types of tuition. And you completely avoided answering my question of whether you’re an American or not, which likely indicates you think the answer would weaken your argument.

There may be people who can self-teach, but when they read a book they must either buy or rent a legal copy of it. Does this not make sense?

Not in the US where public libraries are tax funded and little free libraries are giving away books on every other street corner. Not on the internet where most websites are free to read.

I’m going to stop responding at this point. You don’t appear to be an American or have an understanding of American institutions, so arguing about American laws isn’t useful.

Explorer09 (profile)

May 24, 2025 at 5:18 pm

Re: Re: Re:⁶

Thomson Reuters v. ROSS Intelligence

That wasn’t fair use because of the four factor analysis finding that it was for the purpose of building a competing service so it failed the market effect factor. That’s not universal to all LLM training.

Thumbs up for pointing out a fact, and I agree with it, too.

And, is it not true that some of the LLMs would compete with the original author on the written works? Asking this in another way: How can an LLM not compete with the original authors in the market of written works?

You’re saying it’s both copyright infringement to train an LLM on copyrighted material, but also ethical if you do the copying yourself rather than copying someone else’s copied dataset. That doesn’t make any sense.

By “copying yourself” I mean when you own the copyright of the contents. Did I not explain it clearly enough?

Since I assume the LLMs are “derivative works” of the training data in terms of copyright, this is all logical to me.

[L]arge corporations can afford to either license content or pay lawsuit settlements.

Why not make the large corporations pay? That’s the whole point of the authors suing!

Independent researchers and students and precocious middle schoolers can’t. You’re saying the little guy should be financially crushed should he train an effective LLM on a large dataset.

Why shouldn’t pirates be “financially crushed”? Look, being poor is never an excuse for doing illegal things. Except when you are advocating to legalize things that should have been illegal.

if I read a book at the library and remember every word in it and that memory informs my ability to write, but not just regurgitate the exact same text

Two issues: (1) Did you legally buy the book or rent one? (2) Writing down a thing that you remembered word for word does not make it free from infringement.

It sounds like you are trying to “read” a chapter of Harry Potter, English version, and remember it word by word, and write down a whole chapter in French, or Spanish. Yeah, you didn’t “regurgitate” technically, and yet you still reproduce the creative expressions J.K. Rowling made in her novels.

Again, false dilemma. You don’t have to pick either side. You can oppose both.

My position is to protect small authors (writers, painters and musicians) from AIs quickly generating works “in their style”, potentially copying the original authors’ expressions, which is infringement and the very thing that copyright law was originally designed to protect against.

In the US, “pay tuitions to your teacher” isn’t a thing. You pay tuition at college or a private school, and not directly to an individual teacher who is more commonly called a professor or instructor at that level, but most K-12 schools where students actually learn to read are public schools that don’t involve tuition.

Not in the US where public libraries are tax funded and little free libraries are giving away books on every other street corner.

No public school teacher would teach for free. The question of whether I’m an American is not relevant. The point is that even in public schools, the teachers get paid with the government’s money, which in turn comes from your pockets through “taxes”. That’s enough. Questioning this argument is a straw man and wastes my time.

Not on the internet where most websites are free to read.

Do you really think the knowledge people put on the internet for you to read for free is truly “free”? There Is No Free Lunch, as the economists always say.

Most websites get revenue through advertising, a few others put content behind paywalls if their advertising revenue can’t cover their costs, and yet others, like Wikipedia, rely on donations.

Imagine what could happen when AI takes the knowledge from Wikipedia and generates content without citing Wikipedia as the source. Fewer people would read or edit Wikipedia. There would be fewer contributions from volunteers, and eventually fewer donations to cover server and staff costs.

MrWilson (profile)

May 25, 2025 at 9:54 pm

Re: Re: Re:⁷

And, is it not true that some of the LLMs would compete with the original author on the written works?

That’s for a court to decide in individual cases. Plaintiffs should make that claim if it’s actually true. I’m arguing against the broad generalization claim that all training must be licensed.

Asking this in another way: How can an LLM not compete with the original authors in the market of written works?

Easily. It doesn’t. I’m not sure why you need clarification on that point. Do you think LLMs are only meant to replace the authors of the content used to train them? That might be your problem. You’re imagining the worst case scenario and ignoring or are just unimaginative regarding all the other possible uses. LLMs are useful not to replace artists or human creativity. They’re useful for doing the shit work we don’t want to do, like writing an email to your boss that summarizes recent progress on a project you don’t even want to be working on. Again, again, I will say, again, you don’t seem to understand the facets of the technology you’re blindly frothing at the mouth over.

That said, I’d actually recommend you type these conversations into an LLM. A machine that hallucinates random data would likely understand it better than you.

By “copying yourself” I mean when you own the copyright of the contents. Did I not explain it clearly enough?

No, because “train from scratch” doesn’t imply you own the copyrights of what you use to train the LLM with, only that you’re doing the training yourself. But this is absurd. Perhaps only someone as prolific as Stephen King would have enough self-generated text to train a decent LLM.

You’re also saying that researchers shouldn’t be able to train LLMs since they didn’t do all the research themselves. That would cripple scientific research. You’re fussy over which billionaire corporations are profiting off of your content while you kick actual human progress in the face. You’re trying to kill a technology in its infancy because, as happens with all technology, it inevitably gets used for profit. Automobiles put so many horse-related businesses out of business and created so many new jobs around manufacturing and maintenance and customization and delivery.

Since I assume the LLMs are “derivative works” of the training data in terms of copyright, this is all logical to me.

And since your assumption is wrong, this is a useless claim.

Why not make the large corporations pay? That’s the whole point of the authors suing!

Again, you don’t understand legal precedents. If a precedent is set that you must pay to license copyrighted material for training an LLM, then the wealthy corporations will just license it. But if any poor person, non-profit, middle schooler, etc. wants to do the same thing, they will not be able to afford it. Therefore, the only effective LLMs will be owned and controlled by wealthy corporations. And they will win government contracts to teach your children and rewrite your history curriculum and censor things authoritarian governments want censored.

Why shouldn’t pirates be “financially crushed”?

Fair use doesn’t make one a pirate.

Look, being poor is never an excuse of doing illegal things.

Actually, it’s quite often the only available alternative to starving for many people because we live in a corrupt system where the wealthy have seized the means of production and the executive, legislature, and judiciary, whereby laws are written for the benefit of the wealthy.

Two issues: (1) Did you legally buy the book or rent one?

No, because the library is tax funded! It is free at the point of service and people who don’t pay taxes like homeless people are still able to read in the library. I’m sorry this isn’t a thing in whatever place you live, but in the US, the libraries are free to read in.

(2) Writing a thing that you remembered in every word does not make it free from infringement.

You’re changing the scenario. I didn’t say you remember every word and rewrite every word. I’m saying you use your knowledge and experience of the text to become a better writer, to write your own words. This is an analogy about how human beings learn to read. It’s odd I have to say that.

It sounds like you are trying to “read” a chapter of Harry Potter, English version, and remember it word by word, and write down a whole chapter in French, or Spanish. Yeah, you didn’t “regurgitate” technically, and yet you still reproduce the creative expressions J.K. Rowling made in her novels.

That your immediate assumption is that any scenario is an intention to commit a copyright violation is telling. You’re paranoid about everything being “piracy.” It informs your misguided assertions.

My position is protecting small authors (writers, painters and musicians) from AIs quickly generating works “in their style”, potentially copying original authors’ expressions that is infringement, the things that copyright law was originally designed to protect.

Copyright law is currently meant to protect the profits of the wealthy, especially the corporations that can retain copyright ownership after a creator dies. It’s extended long after the creator’s death now such that few people living during the author’s lifetime will be alive when it hits the public domain. Those same corporations, the ones you are siding with, violate the copyrights of small authors all the time. Your stance is a moral contradiction.

No public school teacher would teach for free.

Actually, many of them do teach for free. They often spend their own money on classroom supplies. They’re often underpaid for what they do. They often spend extra time before or after class that isn’t paid. They grade papers at home after hours. Your ignorance of the American education system makes all of your assertions on this topic useless.

The question of whether I’m an American is not relevant.

It’s the most relevant of anything anymore. It speaks to your ignorance of our laws and how they work, of our courts and how they work. It speaks to your ignorance of how our libraries and schools work. It speaks to your inability to vote for representatives who vote on these laws, who vet the nominated justices who analyze the constitutionality of these laws, etc, etc, et al. Your opinion is too uninformed to matter.

The point is even for public schools, the teachers got paid with government’s money, which in turn comes from your pockets through “taxes”. That’s enough. A question on this argument is a straw man and wastes my time.

Poor and homeless children are entitled to a free public education even if their parents can’t afford to pay taxes.

Do you really think people put knowledge on the internet for you to read for free are truly “free”? There Is No Free Lunch as the economists always say.

You missed the point. It’s free to the reader. You didn’t directly pay money to Mike or the EFF to read this article (though at this point I doubt you actually read the article). Free has multiple meanings. Libre. Gratis. Free as in beer. Free as in kitten. Don’t be obtuse.

Imagine what could happen when AI took the knowledge from Wikipedia, and generate content without citing Wikipedia as the source.

If you want to suggest the US Congress pass a law stating that LLMs providing encyclopedic knowledge must cite a source, go for it. But are you suggesting that human beings haven’t already been copying Wikipedia without citation? I’ve found plenty of people who have just copied and pasted Wikipedia text without citation. Why would an LLM be any different?

Fewer people would read or edit Wikipedia.

You do realize that Wikipedia editors are mostly unpaid volunteers, right? And uncited copying already happens. You don’t understand this at all.

There would be less contributions from volunteers, and less donations to cover server and staff costs, eventually.

I value Wikipedia’s significant contribution to humanity. You apparently don’t because it’s a non-profit and you’re all about locking up copyrighted content for profit, so your mock concern is transparent. But also, the proliferation of LLMs will make human verification of LLM content more important, not less. There might even be an increase in paid content reviewers once major hallucinations lead to major disasters. We’re probably less than 15 years away from some notable crisis occurring because someone trusted LLM content at the wrong time. The Hollywood script about the hapless nuclear power plant programmer using ChatGPT to fix an error and cause a meltdown writes itself. I’m guessing it’ll actually be something more banal like a massive internet outage because a junior developer trusted generated code to fix a server. Any such crisis will prompt further distrust in unverified content.

Explorer09 (profile)

May 24, 2025 at 1:04 pm

Re: Re: Re:⁴

Pre-training is the point at which the large dataset of “human content” is fed into the LLM.

It’s just terminology difference, but the same concept.

“Train from scratch” in the context of an LLM just means that you’re providing your own chosen dataset rather than copying from someone else’s. That doesn’t have any effect on the legality of the process. You can “train from scratch” with copyrighted material.

And it’s your liability for training it with copyrighted material then. My idea was clear: Training with copyrighted materials = infringement. It is you that insist that is “fair use” and keep denying it.

Literally the first thing I said to which you responded was: “My frustration with the arguments of people claiming it’s not fair use and that all training must be licensed is that many people seem to think they’re championing the little guy when they’re inadvertently advocating for the benefit of the wealthy and corporations.”

And I didn’t deny that I help the big corporations here. Because there is no “little guy” that you claimed to be defending. It’s corporations vs. corporations, like it or not. So your “frustration” with the issue was based on misunderstanding of it. Not my fault.

You thought the Big Tech side were the good guys. I say no they aren’t. It’s better to admit they are both evil, but I have my own value to protect.

Do you consider it exploitation of an author if someone reads their work and the reader becomes a better writer because of the experience? Should the reader ask for permission to learn from their experience?

You should pay tuition to your teacher. Everyone pays someone when they learn. There may be people who can self-teach, but when they read a book they must either buy or rent a legal copy of it. Does this not make sense?

Explorer09 (profile)

May 24, 2025 at 4:43 am

Re: Re: Re:²

(continued)

A CC-BY license doesn’t mean your works aren’t subject to copyright violations. If there’s no attribution, it’s a violation. Also, big media corporations don’t care about that. I have first hand experience.

Did you know you can sue AI companies when they output your paper without citing you as the author? That’s your right.

The Doe v. GitHub case, now pending on appeal in the Ninth Circuit, is defending exactly this. And in case you didn’t know, the Doe v. GitHub plaintiffs are open source software developers.

terop (profile)

June 9, 2025 at 7:44 pm

Re: Re:

If it’s fair use for me to copy public data (e.g. use a web browser), it’s fair use for a corporation and vice versa.

The authors of web browsers have worked very hard to make this possible. It’s not based on fair use. Instead, they got a new law passed, where temporary copies of the internet transferred data is allowed by the law. So it does not rely on fair use any longer.

But when you consider what work web browsers needed to do to get this passed:
1) browser security sandbox prevents unauthorised large scale copying of web site data
2) internet downloaded data is stored in two places: computer memory, and persistently in user’s local encrypted cache files.
3) internet downloaded data is never stored in “plaintext” in the user’s computer
4) the download bar, which gives users the impression that unrestricted downloads are allowed, in reality heavily limits the amount of data transferred, the number of files given to users, and what user operations are needed to start and execute the downloads.

So when you’re wearing your torrent leeching hat, you need to think of web browsers as limiting your download habits.

MrWilson (profile)

June 9, 2025 at 11:40 pm

Re: Re: Re:

The authors of web browsers have worked very hard to make this possible. It’s not based on fair use. Instead, they got a new law passed, where temporary copies of the internet transferred data is allowed by the law. So it does not rely on fair use any longer.

Hey, you made a claim that purports to be a fact. Which “new law” are you referring to? Surely such a law has a citation available. I eagerly await proof that you’re actually talking about reality and not just making up bullshit.

But when you consider what work web browsers needed to do to get this passed:

So here you demonstrate that you don’t understand US law or how internet protocols work. Huzzah!

terop (profile)

June 9, 2025 at 11:49 pm

Re: Re: Re:²

Which “new law” are you referring to? Surely such a law has a citation available.

https://www.eff.org/files/filenode/temporary_copies_fnl.pdf says relevant cases you should look at are CoStar v. LoopNet and Cablevision remote DVR case.

MrWilson (profile)

June 10, 2025 at 12:42 am

Re: Re: Re:³

You said new law, not new case.

But that citation does reference a law, or specifically an update to the 1976 Copyright Act that was made in 1998, which makes it hardly “new.”

You continue to demonstrate that you don’t understand what you’re talking about.

terop (profile)

June 10, 2025 at 2:24 am

Re: Re: Re:⁴

But that citation does reference a law, or specifically an update to the 1976 Copyright Act that was made in 1998, which makes it hardly “new.”

So the year number is the only bit of information you managed to find from the case law references? Shouldn’t you be examining the “temporary” keyword/the actual limits of what browsers are allowed to do under these cases?

Basically important things you should check are the limits of the decision, i.e. scraping probably is outside of the scope, while browser is able to download the material, the browser is not allowed to give the downloaded material to the user. Browsers obviously display the web page to user, but the files stay locked inside the browser sandbox.

Persistent caching has been significant issue/since it saves the data to persistent storage and makes a copy that doesn’t disappear. Browsers have implemented timeouts for caching and allowing reloading from the original source.

I kinda expected this level information from the case law reference, but guess the year number is good enough find.

MrWilson (profile)

June 10, 2025 at 10:09 am

Re: Re: Re:⁵

Shouldn’t you be examining the “temporary” keyword/the actual limits of what browsers are allowed to do under these cases?

No, you should be providing citations that prove the claim you made. But the claim you made is “new law” not cases.

while browser is able to download the material, the browser is not allowed to give the downloaded material to the user. Browsers obviously display the web page to user, but the files stay locked inside the browser sandbox.

You’re describing technical functions, not legal requirements.

Persistent caching has been significant issue/since it saves the data to persistent storage and makes a copy that doesn’t disappear. Browsers have implemented timeouts for caching and allowing reloading from the original source.

Browsers don’t have agency. They haven’t implemented anything. Also, it’s not illegal to use an older browser that does permanently cache what it loads from a server. Netscape Navigator/Communicator used to do this specifically. It’s not illegal.

I kinda expected this level information from the case law reference, but guess the year number is good enough find.

No, you’re just confused about what I said earlier about how I’m not doing your legwork for you. You made a claim, you offer the proof. But it’s a trick question anyway because there is no proof. You’re just going to continue to make false claims about copyright. You’ve admitted to not wanting to actually understand US copyright law. That says all you need to say and undermines any claims you’ve made.

terop (profile)

June 18, 2025 at 4:35 pm

Re: Re: Re:⁶

That says all you need to say and undermines any claims you’ve made.

There’s nothing that can undermine all my claims. Basically my info is based on international treaties and established copyright cases as reported by news media.

MrWilson (profile)

June 18, 2025 at 10:44 pm

Re: Re: Re:⁷

No, your info is based on your misunderstanding about applicability of international treaties to US copyright law and your misunderstanding of what copyright cases actually established as precedent.

terop (profile)

June 19, 2025 at 5:02 pm

Re: Re: Re:⁸

what copyright cases actually established as precedent.

I don’t care if the case is a precedent or not. If some poor soul is subjected to the ruling, then the same could happen to anyone, and thus it is the law as established by the courts.

My pattern is such that I listen to all players in the marketplace. This gives me the widest possible exposition to the rules that govern our world. Closing out some players (like RIAA) from the analysis is not my way. I instead use the information to my advantage, even if I don’t agree with anyone’s position.

MrWilson (profile)

June 19, 2025 at 10:34 pm

Re: Re: Re:⁹

I don’t care if the case is a precedent or not. If some poor soul is subjected to the ruling, then the same could happen to anyone, and thus it is the law as established by the courts.

That’s what a precedent is!!! That’s the whole point! A precedent is a judicial case ruling that does apply to other instances and creates case law that courts will consider in later cases. You’ve admitted to not paying attention while pretending you absorb pertinent information. You’re demonstrating that your casually ignorant stance is not informed.

My pattern is such that I listen to all players in the marketplace. This gives me the widest possible exposition to the rules that govern our world. Closing out some players (like RIAA) from the analysis is not my way. I instead use the information to my advantage, even if I don’t agree with anyone’s position.

Not everyone is correct. Not everything said is a rule that actually dictates policy or practice or law. You can’t use some random person’s uninformed opinion in lieu of an expert’s learned perspective.

terop (profile)

June 19, 2025 at 10:46 pm

Re: Re: Re:¹⁰

You can’t use some random person’s uninformed opinion in lieu of an expert’s learned perspective.

Of course you can. You just need to be a rule expert like myself, who can follow thousands of rules simultaneously and find logical inconsistencies in the rules. If you only follow the experts, you never get the information about what is actually happening in the marketplace. Experts have an idealistic view of the situation, and the ground level information is also needed.

Explorer09 (profile)

June 20, 2025 at 2:08 am

Re: Re: Re:¹⁰

@terop @MrWilson

Just a reminder I don’t like RIAA as they had a bad reputation for pushing an anti-copying technology (SCMS) that didn’t work (to stop illegal copying), and hurt independent musicians that use consumer equipment for legal copying. (See this: https://en.wikipedia.org/wiki/Audio_Home_Recording_Act)

This copy restriction went beyond the Sony (Betamax case) safe harbor, restricting a function on consumer equipment that has perfectly legal uses.

terop (profile)

June 20, 2025 at 12:43 pm

Re: Re: Re:¹¹

don’t like RIAA as they had a bad reputation for pushing an anti-copying technology

RIAA’s position in the marketplace is significantly better than position of random pirates. Mainly because RIAA and the music publishers worked hard to get working products to the consumers on large scale. Pirates have no such defense.

While I don’t like RIAA’s sue-grandmother-for-swapping-music-files-on-kazaa lawsuits, RIAA’s position is still significantly better.

MrWilson (profile)

June 20, 2025 at 10:42 pm

Re: Re: Re:¹²

The RIAA is not a music publisher. It doesn’t work hard to get working products to consumers. It’s a lobbying organization. That you conflate them is telling.

terop (profile)

June 21, 2025 at 8:26 am

Re: Re: Re:¹³

The RIAA is not a music publisher. It doesn’t work hard to get working products to consumers. It’s a lobbying organization.

RIAA’s large music collection and contacts to top level artists means they did the work that was expected from them. The only reason they are able to speak for the artists is because they have contracts with many artists. And if RIAA didn’t do their job, those contracts (and thus RIAA’s position in the marketplace) would not exist.

MrWilson (profile)

June 21, 2025 at 5:52 pm

Re: Re: Re:¹⁴

RIAA’s large music collection and contacts to top level artist

You don’t understand what the RIAA is. It doesn’t have a large collection of music. It is not a music publisher. It is not a record company.

terop (profile)

June 21, 2025 at 9:15 pm

Re: Re: Re:¹⁵

You don’t understand what the RIAA is. It doesn’t have a large collection of music. It is not a music publisher. It is not a record company.

The lawsuits the RIAA has filed in court say otherwise. They had no problems claiming copyright ownership of songs from top-level artists in the court paperwork and they used those copyright bits to harass single mothers and elderly people and some pirates. See recordingindustryvspeople for more info.

MrWilson (profile)

June 22, 2025 at 6:12 pm

Re: Re: Re:¹⁶

Suing on behalf of artists isn’t the same thing as being the owner of the copyrights. RIAA is a membership organization composed of big record companies. The companies sign the artists, not the RIAA.

Explorer09 (profile)

June 21, 2025 at 1:23 am

Re: Re: Re:¹²

RIAA’s position in the marketplace is significantly better than position of random pirates. Mainly because RIAA and the music publishers worked hard to get working products to the consumers on large scale. Pirates have no such defense.

No. The point is RIAA only works for the best interests of the big record labels, and doesn’t care about independent artists. RIAA can lobby for laws that make independent artists’ lives harder.

terop (profile)

June 21, 2025 at 12:47 pm

Re: Re: Re:¹³

The point is RIAA only works for the best interests of the big record labels, and doesn’t care about independent artists.

How is RIAA able to get contracts to top-level artists, if they’re doing nothing to the benefit of those artists? Copyright gives copyright ownership to the artists when the product is created, so riaa had to do something to get access to the copyright ownership.

Explorer09 (profile)

June 22, 2025 at 3:02 am

Re: Re: Re:¹⁴

How is RIAA able to get contracts to top-level artists, if they’re doing nothing to the benefit of those artists? Copyright gives copyright ownership to the artists when the product is created, so riaa had to do something to get access to the copyright ownership.

Only top-level artists. RIAA doesn’t care about small artists along the way. So small artists have to file separate lawsuits against Suno and Udio (AI music generators) in order to demand a share from them. (And they have filed suits, Justice v. Suno and Justice v. Uncharted Labs.)

terop (profile)

June 22, 2025 at 7:50 am

Re: Re: Re:¹⁵

Only top-level artists. RIAA doesn’t care about small artists along the way.

When these small artists are rejected early in their career, how long do you think these artists remember this treatment? If RIAA doesn’t support small artists, when the artists are further in their career, I bet many of them don’t want to take RIAA’s contract simply because of how they were treated when they were starting their career.

This is what I do with Steam. They let my beginner’s game rot in Greenlight for 2 years, which made it completely outdated. Thus I have bad experiences with Steam, and now that I’m more experienced, I’m not giving my products to Steam at all. Instead, the people who supported me in the early days and let my product get published (this would be itch.io) get my business. This way companies are just digging their hole deeper if they treat starting artists/developers badly, and it takes significant perks before that hole is filled.

MrWilson (profile)

June 20, 2025 at 10:40 pm

Re: Re: Re:¹¹

Just a reminder I don’t like RIAA

Yet you link to their propaganda ministry, the Copyright Alliance…

Explorer09 (profile)

June 21, 2025 at 1:30 am

Re: Re: Re:¹²

Yet you link to their propaganda ministry, the Copyright Alliance…

Copyright Alliance isn’t just RIAA, mind you. You shouldn’t treat their opinions as mere propaganda before you actually go read them and understand what they are talking about. Only if you are entirely anti-copyright can you disregard them.

MrWilson (profile)

June 21, 2025 at 10:26 am

Re: Re: Re:¹³

It was founded by Jack Valenti for dogs sake!

I’m not anti-copyright. I’m against large corporations that abuse and exploit creators and workers and customers for profit.

That you turn a blind eye to them while decrying AI companies is a monumental hypocrisy.

Explorer09 (profile)

June 21, 2025 at 11:22 am

Re: Re: Re:¹⁴

I’m not anti-copyright. I’m against large corporations that abuse and exploit creators and workers and customers for profit.

As if the AI companies don’t exploit creators or put workers out of their jobs…

MrWilson (profile)

June 21, 2025 at 1:16 pm

Re: Re: Re:¹⁵

Sure, and…?

You keep pretending like it’s okay when big media exploits people because you can find examples of AI companies exploiting people. Maybe big corporations exploiting people is wrong on principle regardless of the products or services they profit from…

Explorer09 (profile)

May 24, 2025 at 6:11 pm

[On the discussion of Linux and SCO case]

It was to say that you are attacking big corporations but hurting innocent independent non-profits, researchers, students, and poor individuals. That’s the whole point here.

Please name an “innocent independent non-profit, researcher, student, or poor individual” you are talking about.

Or is it just me that I sense no one but some “bad students” who just want to freeload and use ChatGPT to complete their homework, ignoring academic ethics?

You are fighting against my right to train an LLM based on content I can find in the world, the entire breadth of human knowledge that is available online.

TRUE. Because you are not a registered non-profit that is entitled to an exemption on infringement. Libraries have that exemption, but not you.

You’re saying I should have to pay millions of dollars to a big media company in order to scan copies of works I already own a copy of (hello, first sale doctrine).

Personal copy of the book does not imply a license for commercial use of it. The “first sale doctrine” does not permit reproduction of a work (precisely speaking, the exception is only given for personal archival in the US Copyright law as far as I remember).

You’re saying I should be stuck with only having access to LLMs that profitable corporations develop[…]

Sigh. Another misunderstanding. I’ve said that granting fair use isn’t the solution.

The solution is allowing rental of licensed, legally-trained LLMs to smaller businesses, and allowing small businesses to tweak the models without the need to negotiate licenses from original authors. The details of this is to be discussed in the future. You can’t say authors’ works should be exploited by big corps when you claim to protect small businesses in the way. ChatGPT and Google are f*cking big. That’s the reality.

Did you know that I’ve tried to get LLMs to output my work and they haven’t gotten close to it at all. That’s a useless right if the output isn’t a derivative work.

Derivative work isn’t judged by similarity alone. It’s irony, because all the debates about AI “regurgitating” was really about proving the works are contained in the model (which is copyright infringement), and yet they all argue that regurgitation is a bug, while the true goal of such was to avoid copyright infringement claims.

When GitHub Copilot sucked most of the open source code it found on the internet. It didn’t obey the license of free and open source code by crediting the authors or release the entire model under GPL. That was the reason of the lawsuit.

MrWilson (profile)

May 25, 2025 at 10:20 pm

Re:

Please name an “innocent independent non-profit, researcher, student, or poor individual” you are talking about.

I’d hit the character limit before I got beyond just the names of people I know personally. I am literally talking about every American who isn’t a wealthy corporation. That’s how rights work in the US. Everyone (theoretically) has the same rights.

Or is it just me that I sense no one but some “bad students” who just want to freeload and use ChatGPT to complete their homework, ignoring academic ethics?

Your lack of imagination is your problem, not anyone else’s. But it’s also your lack of research. Do some research into academic use of LLMs and get back to me. And by research, I don’t mean just googling your confirmation bias.

TRUE. Because you are not a registered non-profit that is entitled to an exemption on infringement. Libraries have that exemption, but not you.

It’s fair use. Everyone has that right.

Personal copy of the book does not imply a license for commercial use of it.

Training an LLM isn’t necessarily a commercial use. You’re not arguing against commercial use elsewhere. You’re broadbrushing all training as a copyright violation without regard to the purpose of the LLM. I’ve literally cited non-profit research.

Sigh. Another misunderstanding. I’ve said that granting fair use isn’t the solution.

It’s not granting. It is fair use. This isn’t a negotiation. This is interpretation of the law as it is.

The solution is allowing rental of licensed, legally-trained LLMs to smaller businesses,

Exactly. You’re saying only wealthy corporations can develop LLMs. Any attempt you make to pretend that you care about “small authors” is bullshit. You’re advocating for enriching the already wealthy. Full stop.

and allowing small businesses to tweak the models without the need to negotiate licenses from original authors.

At this point, I’m guessing you work for a big media company.

The details of this is to be discussed in the future.

You aren’t an American. You don’t have a say in US law. And if you actually do, that’s bribery and corruption and you should be prosecuted.

You can’t say authors’ works should be exploited by big corps when you claim to protect small businesses in the way. ChatGPT and Google are f*cking big. That’s the reality.

I’m saying author’s works shouldn’t be exploited by big corporations, whether it’s big media or big tech. You’re advocating for big media to continue to fuck over the little guy.

Derivative work isn’t judged by similarity alone.

Again, you don’t understand the technology. The original trained text is not included in the model. It can’t be. The size of the model is too small to contain all the text that it was trained on. You don’t understand tokenization.

When GitHub Copilot sucked most of the open source code it found on the internet. It didn’t obey the license of free and open source code by crediting the authors or release the entire model under GPL. That was the reason of the lawsuit.

You seem to assume that I’ve somehow claimed that every lawsuit against an LLM company is the same or that every lawsuit against an LLM company is without merit. There are some scenarios where an LLM can be trained illegally. Reread my first post to which you responded. Note what I didn’t say that you’ve pretended I’ve said. Stop arguing with straw men.

Explorer09 (profile)

May 25, 2025 at 11:04 pm

Re: Re:

I’d hit the character limit before I got beyond just the names of people I know personally. I am literally talking about every American who isn’t a wealthy corporation. That’s how rights work in the US. Everyone (theoretically) has the same rights.

I can type a “John Doe” name here. And I requested you to name just one person or organization. Your claim that you would hit the character limit is a lie. You simply don’t want to, because you just want to “freeload”.

Do some research into academic use of LLMs and get back to me. And by research, I don’t mean just googling your confirmation bias.

No, I won’t. The burden of proof on this part is yours, and while I can suspect that such academic uses exist, that doesn’t mean I should let commercial uses of these LLMs go. Fair Use Factor One (in the U.S. Copyright law), you know that.

Training an LLM isn’t necessarily a commercial use. You’re not arguing against commercial use elsewhere. You’re broadbrushing all training as a copyright violation without regard to the purpose of the LLM.

When an LLM has both commercial and non-commercial uses, it’s the commercial part that would be argued in court in the aspect of fair use. You can’t shield commercial LLMs from liability simply because they have non-commercial benefits. (And I have argued this before. Same position as in that USCO draft report.)

You’re saying only wealthy corporations can develop LLMs. Any attempt you make to pretend that you care about “small authors” is bullshit. You’re advocating for enriching the already wealthy. Full stop.

At least the “wealthy corporations” can be forced to pay me when they use my work for training; do you like that? I simply don’t want big corporations to take my works for free! Even if I would hurt what you called “poor people” who are lazy and just want to freeload.

I’m saying author’s works shouldn’t be exploited by big corporations, whether it’s big media or big tech. You’re advocating for big media to continue to f*ck over the little guy.

Because they are already f*cking, you get it? And how would it be any better if the Big Tech f*cked you rather than the Big Media do it?

The original trained text is not included in the model. It can’t be. The size of the model is too small to contain all the text that it was trained on. You don’t understand tokenization.

Data compression. Information entropy. There is no such thing as free intelligence. I know all of these.

You seem to assume that I’ve somehow claimed that every lawsuit against an LLM company is the same or that every lawsuit against an LLM company is without merit.

I didn’t assume that. But you seem to have no idea who you are standing with. You are blinded by the idea that AI can give you free energy, or free intelligence. There’s no such thing, and that’s why the creators are fighting: to protect the sources of intelligence from unlawful stealing (read: copying, piracy).

Note: I’m even against AI training with Creative Commons-licensed content because, AFAIK, no LLM bothers to attribute the original sources of what the AI has been trained with.

MrWilson (profile)

May 26, 2025 at 12:45 am

Re: Re: Re:

I can type a “John Doe” name here. And I requested you to name just one person or organization. Your claim that you would hit the character limit is a lie. You simply don’t want to, because you just want to “freeload”.

What the hell are you even talking about? First, your request for names doesn’t entitle you to names. But, as I said, “I am literally talking about every American who isn’t a wealthy corporation.” Do you know any non-wealthy Americans? Add them to the list yourself. Google “Joe Smith” and an American city and you’ll find random people with names and ages and addresses. They’re all included. It would be easier to name the people who I’m not talking about, such as Elon Musk, Jeff Bezos, Mark Zuckerberg, and other wealthy assholes who can afford to license as much content as they want.

No, I won’t. The burden of proof on this part is yours,

But the burden of education is on you. You’re ignorant on the topic. And as you said, teachers should be paid. You’re not paying me to educate you.

and while I can suspect that one such academic use exists, this doesn’t mean I should let commercial uses of these LLMs go. Fair Use Factor One (in U.S. copyright law), you know that.

Except, as you would know if you actually understood fair use and the case law relating to it, commercial use doesn’t bar a fair use determination because a single factor in favor of the copyright owner doesn’t necessarily negate a court finding a use to be fair use.

When an LLM has both commercial and non-commercial uses, it’s the commercial part that would be argued in court in the aspect of fair use.

False. Copyright owners have opposed training as not fair use regardless of the purpose. It’s just that the most prominent uses have been commercial. But again, the arguments are that training is not fair use. The arguments have not been that non-profit, personal use is fair use but commercial use is not. You’re moving the goalposts here.

Literally the first line of the first comment I made on this topic to which you responded was: “people claiming it’s not fair use and that all training must be licensed…”

If you think non-commercial training is fine, you haven’t said so until now and you’ve been responding to my assertions about non-commercial training as if they are also copyright violations, so at best you’re backtracking because you know I’ve caught you in a contradiction, but at worst, and more likely, you don’t have a clue what you’re talking about.

You can’t shield commercial LLMs from liability simply because they have non-commercial benefits. (And I have argued this before. Same position as in that USCO draft report.)

Actually, you could. That has actually been a determination in some courts in copyright cases. Your ignorance harms your assertions.

At least the “wealthy corporations” can be forced to pay me when they use my work for training; do you like that? I simply don’t want big corporations to take my works for free!

They don’t have to take it for free. They can license it. What is your content worth? You’ll get pennies.

Even if that would hurt what you called “poor people” who are lazy and just want to freeload.

Keep showing that contempt for poor people and making assumptions about their character and intentions. It says so much about your position.

Because they are already f*cking, you get it?

Yes, that’s literally what I’m saying. You don’t seem to get it. And you’re cheering them on while they make it worse.

And how would it be any better if the Big Tech f*cked you rather than the Big Media do it?

The answer to your question is literally in the line you’re responding to: “I’m saying author’s works shouldn’t be exploited by big corporations, whether it’s big media or big tech.” It wouldn’t be better if Big Tech exploited it over Big Media. Neither is good. But you’re advocating for one. I’m advocating for neither.

Data compression. Information entropy. There is no such thing as free intelligence. I know all of these.

So you admit you don’t understand how LLMs are trained. You could just have said that from the beginning.

I didn’t assume that. But you seem to have no idea who you are standing with.

You seem to have no idea what my position is, even after I correct your misperception.

You are blinded by the idea that AI can give you free energy, or free intelligence.

Huh what? Quote me where I said anything like that. Intelligence isn’t something a machine is capable of giving you, unless you’re using it in the sense of information like an “intelligence report” that a military unit might receive from scouting units.

There’s no such thing, and that’s why the creators are fighting: to protect the sources of intelligence from unlawful stealing (read: copying, piracy).

Copyright violations aren’t stealing or piracy. Stealing involves a rivalrous, scarce commodity and the deprivation of the owner of that commodity. Copying data only creates more data. Piracy is something that happened a lot in the late 1600s and in poorer parts of the world like the coasts of Somalia. Your scare words and moral equivocating are just biased propaganda that shows your intellectual dishonesty.

Note: I’m even against AI training with Creative Commons-licensed content because, AFAIK, no LLM bothers to attribute the original sources of what the AI has been trained with.

Not all Creative Commons licenses require attribution. Holy fuck you don’t understand what you’re talking about.

Explorer09 (profile)

May 26, 2025 at 11:38 pm

Re: Re: Re:²

I am literally talking about every American who isn’t a wealthy corporation. Do you know any non-wealthy Americans? Add them to the list yourself. Google “Joe Smith” and an American city and you’ll find random people with names and ages and addresses. They’re all included.

In other words, the right to freeload. I got it. This has never been a right before. As if, the right to a “free lunch” where in economic reality there is no such thing.

Note: It is a different matter to advocate public access to knowledge with taxpayers’ money. And yet you don’t seem to be doing that. You just want to enjoy the copyrighted works for free without paying the creators. Advocacy for legalizing things that were illegal.

[C]ommercial use doesn’t bar a fair use determination because a single factor in favor of the copyright owner doesn’t necessarily negate a court finding a use to be fair use.

“Google Books” case and Perfect 10 v. Amazon. Did you think I have no idea about the courts finding fair use?

Copyright owners have opposed training as not fair use regardless of the purpose.

While I’m in this position, it’s their right to argue it’s not fair use.

But again, the arguments are that training is not fair use.

Training cannot be granted as a fair use regardless of purpose, including commercial and non-commercial AI. What’s the problem here? Because you are suggesting an extreme end in this spectrum: that ALL training must be fair use regardless of the AI being commercial or not.

The arguments have not been that non-profit, personal use is fair use but commercial use is not. You’re moving the goalposts here.

I didn’t. I’ve said my position on AI training is same as USCO: that the “fairness” of AI training with copyrighted content depends on the ultimate purpose of AI.

In case you are still confused, I can name (hypothetical) examples:

(1) An AI-powered article summary system: Generated content mostly depends on the article the user provides as input. Almost no “regurgitation” of the training data is possible. Training this AI with copyrighted materials could be fair.
(2) A machine translation system (Google Translate and like): Generated content again depends on user input (i.e. text to translate) and not training data. In this case the training with copyrighted materials has little effect on the books’ market and thus could be fair use.
(3) AI image upscaling: Again generated content mostly depends on user input. Almost no regurgitation possible. Training could be fair use.

But (4) general-purpose generative AI systems, including ChatGPT, Gemini and Grok, are NOT in these categories.

If you think non-commercial training is fine, you haven’t said so until now and you’ve been responding to my assertions about non-commercial training as if they are also copyright violations[…]

No, because “non-commercial” training could be unfair when the model had a use that could be commercial. Like, how Internet Archive (a non-profit) lost the Hachette v. Internet Archive case about the “digital lending” of books. Being “non-commercial” isn’t a sufficient criterion to rule for fair use.

They [wealthy corporations] don’t have to take it for free.

In fact they took it for free. You didn’t read the case of Meta, and are making a wrong assumption.

So you admit you don’t understand how LLMs are trained. You could just have said that from the beginning.

In Andersen v. Stability AI, the judge denied the defendant’s motion to dismiss the copyright infringement claim; the plaintiffs (many visual artists) cited a statement about compression by Stability AI’s CEO. (Emad Mostaque: “We took 100,000 gigabytes of images and compressed it to a two-gigabyte file that can recreate any of those [images] and iterations of those.”) And so the judge ruled in favor of the plaintiffs.

It’s useless trying to accuse me of not knowing how an LLM is trained, because you simply have no clue about it either and suggest it’s still magic. There’s no magic. By aggregating lots of pictures of apples and compressing the aggregation aggressively and in a very lossy manner, a “ridiculous” size decrease can be achieved.
This technology is incredible by itself, but cannot rule out the claim of copyright infringement.
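
To put the figures in the Mostaque quote in proportion, here is a minimal sketch of the arithmetic. It assumes decimal gigabytes and takes the quoted 100,000 GB and 2 GB at face value; the variable and function names are illustrative only, and the calculation says nothing about what a model can or cannot reproduce.

```python
# Figures come from the Mostaque quote above; decimal gigabytes are an assumption.
GB = 10**9

training_bytes = 100_000 * GB  # "100,000 gigabytes of images"
model_bytes = 2 * GB           # "a two-gigabyte file"


def compression_ratio(original_bytes: int, compressed_bytes: int) -> float:
    """Return how many times smaller the compressed size is than the original."""
    return original_bytes / compressed_bytes


ratio = compression_ratio(training_bytes, model_bytes)
retained_fraction = model_bytes / training_bytes

print(f"Size ratio: {ratio:,.0f} to 1")
print(f"Model size as a share of the training data: {retained_fraction:.4%}")
```

The ratio only describes relative size. Whether any individual work is recoverable from the smaller file is a separate question that the arithmetic does not answer.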

Intelligence isn’t something a machine is capable of giving you

Intelligence as in “intellectual property” and the “artificial intelligence” word itself. So do you agree there is no intelligence in AI?

Copying data only creates more data.

So you don’t believe the personal data leak is an issue, including password leaks, cracking of people’s secrets?

Not all Creative Commons licenses require attribution.

CC0, the public domain dedication, does not require attribution. Other CC licenses have the BY clause.

MrWilson (profile)

May 27, 2025 at 1:37 am

Re: Re: Re:³

In other words, the right to freeload. I got it. This has never been a right before.

You’re calling everyone who isn’t a wealthy billionaire a freeloader? What is wrong with you?

As if, the right to a “free lunch” where in economic reality there is no such thing.

There are plenty of rights that allow people to do things for free. Gratis isn’t the only kind of “free.”

Note: It is a different matter to advocate public access to knowledge with taxpayers’ money. And yet you don’t seem to be doing that.

I’m literally saying we already do that. They’re called libraries! I don’t have to advocate for what is already happening and has been for hundreds of years!

You just want to enjoy the copyrighted works for free without paying the creators.

Quote me where I said that. You keep arguing with straw men. You seem to think anyone who disagrees with you is only interested in getting things for free. You are paranoid and obsessed.

Advocacy for legalizing things that were illegal.

Fair use is legal, as I’ve been saying. But even then, changing laws is also a thing that happens in societies.

“Google Books” case and Perfect 10 v. Amazon. Did you think I have no idea about the courts finding fair use?

I think you citing cases or claiming to have knowledge of issues relating to the topic doesn’t do a damn bit of good in your flawed analysis because, as you’ve pointed out, you’re just spewing a biased, paranoid, profit-driven perspective in which apparently everyone owes you something.

While I’m in this position, it’s their right to argue it’s not fair use.

They can argue whatever they want.

Training cannot be granted as a fair use regardless of purpose, including commercial and non-commercial AI.

Another contradiction. Earlier, you stated: “Because you are not a registered non-profit that is entitled to an exemption on infringement.”

What’s the problem here? Because you are suggesting an extreme end in this spectrum: that ALL training must be fair use regardless of the AI being commercial or not.

If you were keeping up, you would have noticed that I noted that Thomson Reuters was decided based on the competing service aspect. I said that people who claim all training is a copyright violation and requires licensing are wrong.

I didn’t. I’ve said my position on AI training is same as USCO: that the “fairness” of AI training with copyrighted content depends on the ultimate purpose of AI.

Holy fuck dude. Again, “Literally the first line of the first comment I made on this topic to which you responded was: “people claiming it’s not fair use and that all training must be licensed…””

You are literally changing your position here. If you think not all training must be licensed because some of it is fair use, then you shouldn’t have started arguing with me! I have said multiple times you haven’t understood anything I’m saying and you just keep proving it.

In case you are still confused, I can name (hypothetical) examples:

You’ve literally conceded my point here. If there are examples, then not all training is a copyright violation or requires licensing. That was my whole point!

In fact they took it for free. You didn’t read the case of Meta, and are making a wrong assumption.

You missed the point (again). I’m saying that if it is determined that training is not fair use and all training must be licensed, the wealthy corporations can pay for the licensing, but poor people can’t. I didn’t make a claim about what happened in a particular scenario. I’m talking about the future implications of these cases. You know- legal precedents, something you don’t seem to understand.

In Andersen v. Stability AI, the judge denied the defendant’s motion to dismiss the copyright infringement claim; the plaintiffs (many visual artists) cited a statement about compression by Stability AI’s CEO. (Emad Mostaque: “We took 100,000 gigabytes of images and compressed it to a two-gigabyte file that can recreate any of those [images] and iterations of those.”) And so the judge ruled in favor of the plaintiffs.

We’ve been discussing text LLMs, not art generators. You’re shifting the goalposts again. Also, Andersen v. Stability AI hasn’t been decided yet.

It’s useless trying to accuse me of not knowing how an LLM is trained, because you simply have no clue about it either and suggest it’s still magic.

I haven’t said anything about magic. You cannot stop yourself from making up straw men to argue with.

Intelligence as in “intellectual property” and the “artificial intelligence” word itself. So do you agree there is no intelligence in AI?

You missed the point again. An AI cannot give a person intelligence. An AI can be intelligent or a human can be intelligent, but unless you’re doing some kind of science fiction mindlink between the AI and the human, the AI cannot give you intelligence.

So you don’t believe the personal data leak is an issue, including password leaks, cracking of people’s secrets?

Another straw man. I said a copyright violation isn’t theft and I explained why. I didn’t say leaking personal data isn’t a problem. That is a non sequitur that has nothing to do with it. Hacking data is a separate crime from copyright violations. But you don’t understand US law, so here we are again.

CC0, the public domain dedication, does not require attribution. Other CC licenses have the BY clause.

So you concede the point.

Explorer09 (profile)

May 27, 2025 at 5:25 am

You’re calling everyone who isn’t a wealthy billionaire a freeloader? What is wrong with you?

What’s wrong with you for wanting the content for free?

Gratis isn’t the only kind of “free.”

Then why not pay for the content you use for AI training, then?

[Y]ou’re just spewing a biased, paranoid, profit-driven perspective in which apparently everyone owes you something.

Not everyone, but someone, and that’s enough.

[Y]ou stated: “Because you are not a registered non-profit that is entitled to an exemption on infringement.”

Because YOU are not, period. A non-profit can weigh in favor of fair use, but it’s just one factor out of the four. You’re not the non-profit that can even argue on the Factor One. Is it not clear enough?

I think I no longer need to reply on this straw man, because you simply don’t want to pay any dime to the authors, and simply want to use the copyrighted content for free. And that’s why you are dodging questions and try to impose your own “fair use” theory to others, ignoring the USCO that had debunked it.

If you were keeping up, you would have noticed that I noted that Thomson Reuters was decided based on the competing service aspect. I said that people who claim all training is a copyright violation and requires licensing are wrong.

Core question: Isn’t generative AI able to create content that competes with the authors who created the works that made up the training data?

You’re not answering this question, and while you argued that “not all AI trainings are copyright violation”, you suggested instead that “all AI trainings are not copyright violation”. F*ck off with the logic trick.

people claiming it’s not fair use and that all training must be licensed…

There is no contradiction. All training must be licensed. Because fair use is legislated as an exception, not as a rule. If you run a company that uses someone else’s copyrighted works for profit, you must seek license for them first. Only if the licensing deal fails could you seek for fair use arguments in court. Not the other way around.

Stop playing with the law.

[T]he wealthy corporations can pay for the licensing, but poor people can’t. […] I’m talking about the future implications of these cases. You know- legal precedents, something you don’t seem to understand.

Yes, set up a legal precedent that all AI training must seek license! Your assumptions of the “poor people” are nonsense, so f*ck!

“Future” + “poor people” -> cases that are not happening now and are moot to discuss.

We’ve been discussing text LLMs, not art generators.

Same. Text LLMs do regurgitate and those are proofs that the copyrighted works are in the data. Don’t ask me for concrete proofs, because those are parts of the legal discovery processes. It would be the AI companies that disclose the training data and training process, not me.

As for the discovery of particular cases, I know one fact, that Meta did torrent books through “shadow libraries”, i.e. pirate sources. Whether the pirated book content would end up in Meta’s Llama model is irrelevant, as the plaintiffs have already moved for a summary judgement that Meta infringes copyright.

I said a copyright violation isn’t theft and I explained why.

Except no authors are buying your theory. The discussion of this part is moot because there is no definition of “theft” in copyright laws; it’s laypeople’s term for copyright infringement outside the legal context.

And you didn’t seem to want to know why they call copyright infringement “theft”, so be it.

MrWilson (profile)

May 27, 2025 at 10:36 am

Re:

What’s wrong with you for wanting the content for free?

I have never claimed to want content for free. Why are you arguing with a straw man? Further, why have you extended this straw man to every one in the US who isn’t wealthy?

Then why not pay for the content you use for AI training, then?

You missed the point, I think because you don’t understand the concept of gratis versus other forms of “free,” such as libre. I’m referring to libre.

Not everyone, but someone, and that’s enough.

That’s not justification for creating a new right for copyright owners just because you’re paranoid and greedy.

Because YOU are not, period. A non-profit can weigh in favor of fair use, but it’s just one factor out of the four. You’re not the non-profit that can even argue on the Factor One. Is it not clear enough?

You have admitted to contradicting yourself here. Is that not clear enough? Non-profits are included in the non-wealthy people you have called freeloaders.

I think I no longer need to reply on this straw man, because you simply don’t want to pay any dime to the authors, and simply want to use the copyrighted content for free.

That is indeed a straw man. I have never claimed this at all. I am an author. I’m also a designer. The most egregious violation of my copyrights has been perpetrated by the Big Media companies you are sided with. Accusing me of wanting stuff for free when Big Media exists to profit off of poor creators is rich. You’re propping up your own abuse at the hands of wealthy corporations and taking it out on others.

And that’s why you are dodging questions and try to impose your own “fair use” theory to others,

I’m not dodging questions. I’m directly addressing all of your bullshit.

ignoring the USCO that had debunked it.

The USCO hasn’t debunked it. They issued a non-binding opinion. This is something that gets decided in courts or the legislature.

Core question: Isn’t generative AI able to create content that competes with the authors who created the works that made up the training data?

Compete is a subjective term. I don’t personally think so. As I already said, I literally attempted to get an LLM to read a vast amount of my work (I am a published author, dude, not some rando who wants free shit), just to test it, to see what everyone is afraid of. Not only could it not replicate my writing style, but it was also full of boring prose. So no, I don’t think it can compete with authors.

You’re not answering this question, and while you argued that “not all AI trainings are copyright violation”, you suggested instead that “all AI trainings are not copyright violation”. F*ck off with the logic trick.

You literally misquoted me there. Search the page for the phrase “all AI trainings are not copyright violation”. You are the only one to have said that in this discussion. You are creating a straw man here.

There is no contradiction. All training must be licensed. Because fair use is legislated as an exception, not as a rule.

That is a contradiction. All training doesn’t have to be licensed. You’ve said so yourself.

The exception of fair use is a rule. It is built into the law. It is not just a defense in court but an actual part of the law itself. You seem to be running into a confusion about the concept of exception vs rule, which is a common English linguistic juxtaposition, but that doesn’t apply to this scenario.

If you run a company that uses someone else’s copyrighted works for profit, you must seek license for them first.

This is not true in many cases!!! Plenty of companies use someone else’s copyrighted works for profit without needing a license. Fair use allows many uses that don’t require a license. Parody doesn’t generally require a license. Commentary doesn’t generally require a license. There’s a world of content out there generated using other copyrighted content that doesn’t require a license. Again, this claim just demonstrates how your bias is limiting you to a myopic viewpoint.

Only if the licensing deal fails could you seek for fair use arguments in court. Not the other way around.

Not at all. I’ve already cited the case law that proves this wrong. You aren’t even arguing against me at this point. You’re arguing with reality. People have used fair use prior to and in lieu of going to court.

Stop playing with the law.

Start reading the law and the case law so you’re not so wrong.

Yes, set up a legal precedent that all AI training must seek license!

You’re admitting here that it isn’t yet. Which means you’re admitting you’re wrong.

Your assumptions of the “poor people” are nonsense, so f*ck!

There are no assumptions. It’s pattern recognition. “So fuck” isn’t a complete sentence.

“Future” + “poor people” -> cases that are not happening now and are moot to discuss.

No, they aren’t. Current actions have future repercussions. That’s how laws and legal precedents work.

Don’t ask me for concrete proofs,

Sure, you don’t have to actually prove any claims you make. I bet you’re a Nigerian prince too!

because those are parts of the legal discovery processes. It would be the AI companies that disclose the training data and training process, not me.

So you’re admitting you haven’t seen the proof, but you believe it anyway. That’s magical thinking.

As for the discovery of particular cases, I know one fact, that Meta did torrent books through “shadow libraries”, i.e. pirate sources.

You just changed the scenario again. We’re talking about results, not training. And I don’t support Meta. Zuck can get fucked for all I care (as I have already said).

Whether the pirated book content would end up in Meta’s Llama model is irrelevant, as the plaintiffs have already moved for a summary judgement that Meta infringes copyright.

It is relevant to the discussion we’re having.

Except no authors are buying your theory.

I am an author. Cory Doctorow also thinks the same. Plenty of others too.

The discussion of this part is moot because there is no definition of “theft” in copyright laws; it’s laypeople’s term for copyright infringement outside the legal context. And you didn’t seem to want to know why they call copyright infringement “theft”, so be it.

I know why they call copyright infringement “theft.” It’s because they want to make a moral equivocation to charge the discussion and depict copyright violators as petty thieves. You seem to be in the same boat.

Explorer09 (profile)

May 27, 2025 at 1:40 pm

Re: Re:

I have never claimed to want content for free. Why are you arguing with a straw man? Further, why have you extended this straw man to every one in the US who isn’t wealthy?

If you didn’t want content for free, then you should please STFU in these AI lawsuits because you really had no idea what those AI companies have done.

I think because you don’t understand the concept of gratis versus other forms of “free,” such as libre. I’m referring to libre.

Bullsh-t. (1) Those literary works AI companies have taken have no “libre” things to talk about. (2) The “libre” idea, advocated by the Free Software Foundation, Creative Commons and similar groups, has nothing to do with AI scraping works; the works have been “non-libre” from the start. (I am talking about LLM scraping here, not the GitHub case that scraped the open source software, but even with the open source software, attribution is a minimum requirement before the licensee receives any freedom to distribute the software.)

Compete is a subjective term. I don’t personally think so. As I already said, I literally attempted to get an LLM to read a vast amount of my work (I am a published author, dude, not some rando who wants free shit), just to test it, to see what everyone is afraid of. Not only could it not replicate my writing style, but it was also full of boring prose. So no, I don’t think it can compete with authors.

Whether the AI can compete with YOU is not important in the lawsuits.
What’s important is that you are against the authors who want to bring a suit because you are selfish and disregarding their works and creative labor.

Plenty of companies use someone else’s copyrighted works for profit without needing a license. Fair use allows many uses that don’t require a license. Parody doesn’t generally require a license. Commentary doesn’t generally require a license. There’s a world of content out there generated using other copyrighted content that doesn’t require a license.

I think I need to remind you one important point: Fair use was not enacted to protect technological innovations. Fair use was enacted to protect free speech.

Therefore fair use is traditionally granted for parodies and commentaries. Technological innovations themselves are not reasons for fair use. Saying that AI is innovative enough so it can be “fair use” is clearly a misunderstanding of fair use.

Generative AI does not fit the cases of parodies or commentaries, therefore the fair use argument of this part is useless. (I didn’t say this. This is mentioned in an amicus brief of the Kadrey v. Meta case, by “copyright law professors”.)

So you’re admitting you haven’t seen the proof, but you believe it anyway.

I didn’t say I’m in the position of a judge.

We’re talking about results, not training. And I don’t support Meta. Zuck can get fucked for all I care (as I have already said).

The fairness of the AI training depends on the ultimate uses of the model. And you admitted that it’s all f-cked up when you pirate books for training. Anthropic (the company behind Claude AI) is also f-cked up because they also pirate.

I know why they call copyright infringement “theft.” It’s because they want to make a moral equivocation to charge the discussion and depict copyright violators as petty thieves. You seem to be in the same boat.

True. And it’s even truer as the AI companies never attribute the authors when they use the data.

MrWilson (profile)

May 27, 2025 at 5:10 pm

Re: Re: Re:

If you didn’t want content for free, then you should please STFU in these AI lawsuits because you really had no idea what those AI companies have done.

Again, you’ve missed the entire point of everything I’ve said, including the first post that you responded to. If you didn’t want to argue about this, you didn’t have to respond, so you can STFU yourself.

As I said from the very beginning, I am not defending the AI companies. But in the course of broadbrush claiming that all training is a copyright violation, you are actively trying to deprive Americans of rights. You are advocating for making the big media companies wealthier and more powerful at the expense of corrupting my democracy. You’re not even an American. Stop lobbying for changes to our laws. If anyone should butt out, it’s you!

Bullsh-t. (1) Those literary works AI companies have taken have no “libre” things to talk about.

[citation needed] They have trained on vast amounts of content. Some of that content was libre. Hell, some of the content is already in the public domain, and you can’t get more libre than that. This is a weird claim.

(I am talking about LLM scraping here, not the GitHub case that scraped the open source software, but even with the open source software, attribution is a minimum requirement before the licensee receives any freedom to distribute the software.)

So you just admitted that there was some libre content and you’re just pretending it doesn’t exist for the sake of your argument. That’s some intellectual dishonesty right there.

Whether the AI can compete with YOU is not important in the lawsuits.

It is important to me and to other creators. You are arguing entirely from a personal bias, but you think my opinion shouldn’t matter because I’m arguing from my own perspective. That’s some hypocrisy right there.

What’s important is that you are against the authors who want to bring a suit because you are selfish and disregarding their works and creative labor.

No, not at all. That you think that is your straw man in action again. You keep thinking I’m arguing against the people who filed the lawsuits. I’m arguing against people like you who make these broad statements while being ignorant of the implications and being contradictory in the intentions and effects of your support for wealthy corporations while blindly thinking you’re helping the little guy. You keep citing the lawsuits as if I’ve claimed everything in all the lawsuits should be decided in favor of the AI companies. I have said no such thing, but you keep making up straw men like that.

I think I need to remind you one important point: Fair use was not enacted to protect technological innovations.

Of course not. Fair use as a concept originated in the 18th century. Quote me where I claimed that.

Fair use was enacted to protect free speech.

The origin of the concept in the UK was to protect the right of a publisher to make an edit to a treatise that it had published and had nothing to do with free speech.

Therefore fair use is traditionally granted for parodies and commentaries. Technological innovations themselves are not reasons for fair use.

Sometimes the technological innovation is indistinguishable from a transformative use, so a technological innovation can be a reason for fair use.

Saying that AI is innovative enough so it can be “fair use” is clearly a misunderstanding of fair use.

I haven’t claimed that. You continue to argue with straw men.

Generative AI does not fit the cases of parodies or commentaries, therefore the fair use argument of this part is useless. (I didn’t say this. This is mentioned in an amicus brief of the Kadrey v. Meta case, by “copyright law professors”.)

Parody and commentaries aren’t the only aspects of a fair use analysis.

I didn’t say I’m on the position of a judge.

So you are admitting that you’re just taking it on faith. You want it to be true, so you believe it. That’s a terrible basis for a belief.

The fairness of the AI training depends on the ultimate uses of the model.

Not necessarily. That can be a factor in the analysis of fair use.

And you admitted that it’s all f-cked up when you pirate books for training.

I didn’t actually say that. I also don’t support it either. But you keep claiming I’ve said things I haven’t said. You’re not arguing with me at all. You’re arguing with some shadow you think represents my position. And no matter how many times I try to tell you or point out that you’re attributing things to me that I have not said or supported, you just ignore those statements and make up new straw men.

Explorer09 (profile)

May 28, 2025 at 7:51 am

Re: Re: Re:²

But in the course of broadbrush claiming that all training is a copyright violation, you are actively trying to deprive Americans of rights.

Which rights?

MrWilson (profile)

May 28, 2025 at 10:35 am

Re: Re: Re:³

Reread everything I have said on this post and try to understand it at least once.

terop (profile)

May 27, 2025 at 7:04 pm

Two problems with the draft

1) training infringes reproduction right
2) fair use is not available if the AI outputs substitute the original

These two rules together pretty much mean that if you want to follow the cutting edge of AI research, you need to move your focus from western societies to communist China.

Explorer09 (profile)

May 28, 2025 at 9:55 am

Re:

@terop

These two rules together pretty much mean that if you want to follow the cutting edge of AI research, you need to move your focus from western societies to communist China.

China is a bad excuse for AI companies attempting to legalize the exploitation of creative labor. Nothing in the USCO draft says that you can’t train AI for purely research purposes. What it says is that much commercial AI training is not likely to pass the fair use test, because what these AIs generate has the potential to compete with the original works, which puts the training at a disadvantage in the Factor 4 analysis of the fair use section of US copyright law.

MrWilson (profile)

May 28, 2025 at 10:12 am

Re: Re:

That’s rich considering you yourself have called non-profit researchers freeloaders and pirates who just want free content.

terop (profile)

May 29, 2025 at 12:19 pm

Re: Re: Re:

Since the draft explicitly says that AI practices are illegal according to copyright office, I’ve decided to disable AI features in the gameapi builder tool and meshpage.org. Listening to every entity in the marketplace is necessary to make correct decisions, but the Copyright Office’s opinion matters more than the opinion of random pirates. Thus AI features are disabled for the future of gameapi.

Mike Masnick (profile)

May 29, 2025 at 4:00 pm

Re: Re: Re:²

Since the draft explicitly says that AI practices are illegal according to copyright office

In no way does it actually say that.

Explorer09 (profile)

May 29, 2025 at 5:50 pm

Re: Re: Re:³

Since the draft explicitly says that AI practices are illegal according to copyright office

In no way does it actually say that.

To be precise, the training of the AI models with copyrighted data is prima facie copyright infringement. Only after such prima facie infringement has been established can the court evaluate whether the accused infringement is fair use. “Fair use” is evaluated on several factors together, and there is no blanket rule that AI training is or is not fair use. EFF has been misleading the general public (and the AI users and AI companies) that AI training has always been fair use. The U.S. Copyright Office warned that it’s not. In other words, don’t expect courts to rule in favor of AI.

MrWilson (profile)

May 30, 2025 at 12:09 am

Re: Re: Re:⁴

To be precise, the training of the AI models with copyrighted data is prima facie copyright infringement.

If training is fair use, then it is not copyright infringement. A fair use is de jure not infringement.

EFF has been misleading the general public (and the AI users and AI companies) that AI training has always been fair use.

You should reread the article.

The U.S. Copyright Office warned that it’s not. In other words, don’t expect courts to rule in favor of AI.

No really, you should reread the article. “Though the report is non-binding, it could influence courts…”

Explorer09 (profile)

May 30, 2025 at 12:44 am

Re: Re: Re:⁵

A fair use is de jure not infringement.

There is a distinction between “prima facie” infringement and infringement as found by a court’s judgment.

“Prima facie” infringement refers to the actions that constitute infringement before the court considers whether fair use applies to the defendant. The defendant only needs fair use arguments after the plaintiff has successfully alleged the “prima facie” infringing action.

You are deliberately confusing the court ruling process. When the plaintiff has alleged the infringing action, it is only “prima facie” infringement, and the courts need that distinction, or else the definition of “copyright infringement” in the court ruling process becomes circular.

MrWilson (profile)

May 30, 2025 at 9:43 am

Re: Re: Re:⁶

You’re just parroting the Copyright Office Report, which multiple people disagree with, which is the point of this article.

Fair use isn’t just an affirmative defense.

You are deliberately confusing the court ruling process.

I’m guessing at this point that your confusion isn’t deliberate and you genuinely just don’t understand what you’re talking about, or what anyone else is talking about for that matter. Honestly, an LLM could do a better job of arguing.

Explorer09 (profile)

May 30, 2025 at 1:17 pm

Re: Re: Re:⁷

I’m now treating MrWilson’s argument more like spam and will not reply any further.

MrWilson (profile)

May 30, 2025 at 4:50 pm

Re: Re: Re:⁸

It’s weird to refer to me in the third person when you’re responding to my comment.

terop (profile)

June 9, 2025 at 5:41 am

Re: Re: Re:⁷

Fair use isn’t just an affirmative defense.

1) Courts only consider fair use if you
a) raise the issue beforehand in your court paperwork, and
b) were found violating copyrights and have no other place to go than damages calculation.
If you haven’t followed the Google v. Oracle paperwork, Google tried the fair use argument but needed to spend millions on lawyers’ fees before the court considered the fair use arguments.
2) A court’s decision is required before you can consider something fair use. This second rule kinda negates all your bullshit that you can violate copyright first and then claim fair use. You need to obtain a court’s decision before it can be declared fair use.

MrWilson (profile)

June 9, 2025 at 12:08 pm

Re: Re: Re:⁸

Incorrect. A rightsholder can determine fair use and not sue or issue a takedown. The court doesn’t make it fair use, it determines that it already has been fair use.

Explorer09 (profile)

June 9, 2025 at 4:30 pm

Re: Re: Re:⁹

A rightsholder can determine fair use and not sue or issue a takedown. The court doesn’t make it fair use, it determines that it already has been fair use.

There is no automatic-fair-use-just-because-you-say-so. The courts and judicial authorities are there for a reason.

For DMCA takedowns, what you are referring to is “potentially fair use” but being “potentially fair use” does not mean it always is. Especially it’s the courts who have the authority to declare fair use. Not you.

MrWilson (profile)

June 9, 2025 at 7:34 pm

Re: Re: Re:¹⁰

There is no automatic-fair-use-just-because-you-say-so. The courts and judicial authorities are there for a reason.

Yes, the court is there to chide you for not considering fair use in the first place before filing and rule in favor of fair use when you just thought you could squeeze money out of someone for a right you don’t actually possess.

For DMCA takedowns, what you are referring to is “potentially fair use”

“Potentially fair use” is not a legal term.

Especially it’s the courts who have the authority to declare fair use. Not you.

Courts have the authority to declare fair use in a litigated determination. But I can still say something is fair use based on an analysis of the four factors and a reading of related case law and that can hold up in court if the court agrees. You’re pretending like everything goes through the court. Plenty of lawyers advise their “potentially” litigious clients that some instances are fair use and don’t suggest suing unless you want to lose and “potentially” be liable for court costs. Plenty of copyright holders determine fair use on their own and don’t sue. I’ve literally determined that some of the use of my works is fair use and I haven’t sued. Oh look, I do have the authority to declare something to be fair use!

You seem to think every copyright holder should sue every user to find out if the court agrees with a fair use analysis. That’s not realistic or sensible.

Explorer09 (profile)

June 9, 2025 at 11:03 pm

Re: Re: Re:¹¹

Courts have the authority to declare fair use in a litigated determination. But I can still say something is fair use based on an analysis of the four factors and a reading of related case law and that can hold up in court if the court agrees.

Yes. Four factors. Then why the hell you disagree with the Copyright Office’s analysis and instead pick only the case laws that you think that would rule for fair use for your particular scenarios?

It isn’t that I don’t understand case laws, but you have not refuted any point about how USCO is wrong, and so such an argument of you is a waste of my time.

No really. I’ve read the common defenses, Authors Guild v. Google a.k.a. Google Books case (which is about book search engines, not generative AI), Sony v. Universal a.k.a. Betamax case (which ruled for fair use only for personal video copying and notably does not apply to cases like Napster and Grokster; the Grokster case is important here as the Supreme Court pretty much denied there can be fair use for P2P copying), Google v. Oracle (which is limited to software code copying only and does not apply to other kind of works such as books). You guys who tried to defend fair use on AI pretty much need to notice the Warhol Foundation v. Goldsmith case, because that one is the closest to generative AI on fair use. That the copyright holders would cite to rule against you. Rather than I explain what that is about, you should study yourself. Make your “fair use” arguments able to win on that case, or else you won’t win.

Explorer09 (profile)

June 9, 2025 at 11:06 pm

Re: Re: Re:¹²

No really. I’ve read the common defenses

Sorry I missed another one. Campbell v. Acuff-Rose Music (often cited by AI companies for fair use but that’s limited to parodies, and generative AIs are obviously not parodies for the case to apply).

MrWilson (profile)

June 9, 2025 at 11:59 pm

Re: Re: Re:¹²

Then why the hell you disagree with the Copyright Office’s analysis and instead pick only the case laws that you think that would rule for fair use for your particular scenarios?

Because that’s how human thought processes work. We recognize patterns, such as the fact that courts have declared that the four factor analysis is not a numbers game where each factor is weighed the same, but rather different estimations of different factors can tip the balance one way or another. I guarantee you that I have read more case law relating to copyright than you have.

It’s also funny that you’re asking me why I favor case law that agrees with me when you’ve deliberately cited unsettled cases that may possibly indicate one court might potentially agree with your assertions and have chosen to ignore cases where fair use was determined.

It isn’t that I don’t understand case laws,

It’s not “case laws,” plural. It’s case law. And it is definitely that you don’t understand it at all. If you did understand it, you wouldn’t be making claims that contradict case law.

but you have not refuted any point about how USCO is wrong,

The EFF pointed out how the US Copyright Office got their analysis wrong. I mostly agree with their perspective on this topic. I also offered my own perspectives, which you seemingly intentionally or just clumsily interpreted completely differently than anything I actually wrote.

and so such an argument of you is a waste of my time.

And yet you continue to respond, thus admitting that you are voluntarily wasting your time. Why are you wasting your own time?

No really. I’ve read the common defenses, Authors Guild v. Google a.k.a. Google Books case (which is about book search engines, not generative AI), Sony v. Universal a.k.a. Betamax case (which ruled for fair use only for personal video copying and notably does not apply to cases like Napster and Grokster; the Grokster case is important here as the Supreme Court pretty much denied there can be fair use for P2P copying), Google v. Oracle (which is limited to software code copying only and does not apply to other kind of works such as books).

This is literally you interpreting case law to support your prejudiced conclusion. These are notable cases, and not all relevant, but they aren’t all of case law relating to fair use.

You guys who tried to defend fair use on AI

I’m only one person. Here you’re admitting that you’re grouping me with other people. This is perhaps one explanation why you make up straw men I haven’t uttered. You think multiple people who you disagree with all think the same. This is lazy thinking. You refuse to actually engage with what I’ve said and you just fight false positions you imagine. You’re really wasting your own time here and looking silly while doing it.

pretty much need to notice the Warhol Foundation v. Goldsmith case, because that one is the closest to generative AI on fair use.

Not close enough to set a relevant precedent. The Warhol work was definitely derivative of the Goldsmith work. An LLM trained on a work among millions of others can’t necessarily reproduce that one work and in the vastness of its training data a single work can’t be significantly influential on the results without intentional human intervention. This, again, demonstrates that you don’t understand how LLMs work.

That the copyright holders would cite to rule against you.

This isn’t a complete or coherent sentence.

Rather than I explain what that is about, you should study yourself. Make your “fair use” arguments able to win on that case, or else you won’t win.

I’m not litigating anything so I’m not going to win or lose a case. Do you understand that I’m not a defendant in these lawsuits? Have you lost all sense of reality here?

terop (profile)

May 30, 2025 at 9:40 pm

Re: Re: Re:³

if it does not say that, what does the following quote mean:

“The steps required to produce a training dataset containing copyrighted works clearly implicate the right of reproduction. Developers make multiple copies of works by downloading them; transferring them across storage mediums; converting them to different formats; and creating modified versions or including them in filtered subsets. In many cases, the first step is downloading data from publicly available locations, but whatever the source, copies are made—often repeatedly”

To me, this explicitly states that reproduction right is infringed when AI companies prepare training data for the training process.

Mike Masnick (profile)

May 30, 2025 at 10:41 pm

Re: Re: Re:⁴

For all of your comments on copyright, it appears you have no fucking clue how copyright works. “Implicating the right of reproduction” ≠ “AI practices are illegal”

How stupid are you?

Explorer09 (profile)

May 31, 2025 at 1:39 am

Re: Re: Re:⁵

For all of your comments on copyright, it appears you have no fucking clue how copyright works. “Implicating the right of reproduction” ≠ “AI practices are illegal”

How stupid are you?

It’s you that need to explain why it’s not, and it won’t help by accusing other people are stupid.

If those commercial AI models use only licensed data (or public domain data) for training, then we have no problem. Otherwise, they are infringing either the author’s right on reproduction or derivative works or both. The only possible defense here is fair use, but, as Thomson Reuters v. Ross case has shown, the AI companies are not likely to win.

MrWilson (profile)

May 31, 2025 at 12:08 pm

Re: Re: Re:⁶

It’s you that need to explain why it’s not,

It’s been explained to you. You just have a profit-motive that prevents you from understanding.

and it won’t help by accusing other people are stupid.

You are new here. Terop is stupid. He has a history of showing up and hallucinating new aspects of US laws that he wants to be true but have no basis in legislated or case law or reality. He comments here in an attempt to promote software no one is interested in.

If those commercial AI models use only licensed data (or public domain data) for training, then we have no problem.

Who’s we? You’re just one person. Are you a party to a lawsuit? Which one? What group of people have you been designated as the representative of?

Otherwise, they are infringing either the author’s right on reproduction or derivative works or both.

Or they’re not because copyright isn’t unlimited and results that don’t include source material aren’t derivative. If I read Harry Potter and write a story about a wizard that contains nothing from Harry Potter, my story isn’t a derivative work. Otherwise Harry Potter is a derivative work of Tolkien’s. You’re creating rights that don’t exist. Copyright protects expression, not ideas.

The only possible defense here is fair use, but, as Thomson Reuters v. Ross case has shown, the AI companies are not likely to win.

You are losing the nuance again. Reuters hasn’t been decided yet. And the issue is greater than just these lawsuits.

terop (profile)

June 7, 2025 at 8:37 pm

Re: Re: Re:⁷

You are new here. Terop is stupid. He has a history of showing up and hallucinating new aspects of US laws that he wants to be true but have no basis in legislated or case law or reality. He comments here in an attempt to promote software no one is interested in.

So your awesome analysis of the message I wrote consists of “proof by authority”. Even Wikipedia says that it’s a bad idea, see https://en.wikipedia.org/wiki/Argument_from_authority

“Scientific knowledge is best established by evidence and experiment rather than argued through authority”

So basically if you wanted to dismiss my bullshit, you should examine the contents of the messages, instead of just looking at who wrote it.

MrWilson (profile)

June 8, 2025 at 9:23 pm

Re: Re: Re:⁸

No, that’s not an argument from authority. You didn’t understand the wikipedia entry at all. There was no appeal to any authority figure. But even if there was and it was actually such a fallacy, you would be suffering from the fallacy fallacy.

https://en.wikipedia.org/wiki/Argument_from_fallacy

There’s also no need to prove you’re wrong. You’re making claims (that are wrong), therefore you need to provide proof that those claims are correct, but you can’t because you’re wrong. This is pattern recognition. You’ve been proven wrong in the past, with citations proving you wrong. You don’t get to just keep making new incorrect claims and expect everyone else to do the legwork to prove you wrong every time. If you prove yourself a person who can’t be bothered to learn about a topic before spewing your incorrect magical thinking, then nobody else has to waste their time on it.

terop (profile)

June 9, 2025 at 5:22 am

Re: Re: Re:⁹

You don’t get to just keep making new incorrect claims and expect everyone else to do the legwork to prove you wrong every time.

If I cannot make incorrect claims and expect legwork from you, why you can do the same? You have also been proven wrong, when you support pirate area and illegal options and practices.

In fact, this illegality is what I’m trying to save you from. The original reason why I entered techdirt in the first place was because it was clear that the site is full of propaganda/illegal practices/wrong statements about copyrights. It was so bad that we had to declare you the worst violators of copyright we could find from the internet. Even 4chan is not that bad, but you were focused on copyright issues but you took the wrong side in the argument.

MrWilson (profile)

June 9, 2025 at 12:14 pm

Re: Re: Re:¹⁰

If I cannot make incorrect claims and expect legwork from you, why you can do the same?

I’m not doing the same. I’m not making wild counter-factual claims. I’m not making up fake parts of the law. I have provided actual citations.

You have also been proven wrong, when you support pirate area and illegal options and practices.

This is another unsupported claim from you. [citation needed]

In fact, this illegality is what I’m trying to save you from.

You’re not trying to save me from anything. You don’t know me. You don’t know if I engage in illegal activity or not. Also, your self-interest paints everything you talk about. You clearly don’t care about anyone else.

The original reason why I entered techdirt in the first place was because it was clear that the site is full of propaganda/illegal practices/wrong statements about copyrights.

Our savior has come! Except you don’t understand copyright law, as has been proven several times. So you can’t teach us anything. You’ve actually come to be humiliated and to demonstrate your wishful ignorance.

It was so bad that we had to declare you the worst violators of copyright we could find from the internet.

Do you suffer from multiple personalities? Who’s we? Also, if you have proof of copyright violations on Techdirt, feel free to point them out. If they’re the worst, they should be quite easy to prove and link to.

Even 4chan is not that bad, but you were focused on copyright issues but you took the wrong side in the argument.

Apparently you’ve never been to 4chan or any of the worse sites out there if you’re going to make this silly claim. You haven’t even taken the wrong side of the argument because you don’t understand US copyright law well enough to make a coherent argument.

Again, you’re making a bunch of claims without proof. Provide evidence or else prove that you are as ignorant as you appear.

terop (profile)

June 9, 2025 at 4:48 pm

Re: Re: Re:¹¹

Our savior has come! Except you don’t understand copyright law, as has been proven several times.

This is the real problem. You think that understanding copyright is required to stay legal when interacting with existing products. This isn’t the case. You just need to avoid clearly illegal areas. Things like AI (where training is broken), or torrenting (where lots of pirate material is available), or swapping pirate movies (where money doesn’t go to the authors but some illegal middlemen)…

My position is that none of the “understanding” the fine details about copyright is required, if you even do the minimal stuff and avoid clearly illegal areas.

But your position is always that it’s necessary for you to go to the illegal area and swap the damn movies…. then you’re in big trouble and need to scream “fair use will save my ass” when you forgot to obtain the licenses…

None of the fair use bullshit is even required, if you followed the default behaviour expected from you. Find the authorised vendor, and purchase the products you need. Don’t go to the pirate area.

MrWilson (profile)

June 9, 2025 at 7:43 pm

Re: Re: Re:¹²

This is the real problem. You think that understanding copyright is required to stay legal when interacting with existing products.

No, I’m asserting that you don’t understand US copyright law, and I’ve proven it several times, which renders all of your assertions about its nature unreliable. Some people can accidentally remain legal in their uses of copyrighted material. But you’re purporting to have knowledge you don’t have and, worse, you’re purporting to educate others on a topic you yourself need to understand better.

You just need to avoid clearly illegal areas.

Nothing’s actually clear when you’re ignorant. You’re just so simplistic in your thinking that you see everything (incorrectly) in black and white.

Things like AI (where training is broken), or torrenting (where lots of pirate material is available), or swapping pirate movies (where money doesn’t go to the authors but some illegal middlemen)…

Not all AI training is illegal, clearly or otherwise. Not all torrenting is illegal. Some software developers release torrents of their free software. You can torrent Linux flavors legally. There are public domain works available via torrents.

My position is that none of the “understanding” the fine details about copyright is required, if you even do the minimal stuff and avoid clearly illegal areas.

Your position is incorrect and explains why you can’t argue coherently.

But your position is always that it’s necessary for you to go to the illegal area and swap the damn movies…. then you’re in big trouble and need to scream “fair use will save my ass” when you forgot to obtain the licenses…

I have literally never said this. Not only do you hallucinate false parts of copyright law, you’re now hallucinating claims you pretend I’ve made. I will say again, quote me where I said that.

None of the fair use bullshit is even required, if you followed the default behaviour expected from you.

Fair use is legal behavior. That’s the whole point!

Find the authorised vendor, and purchase the products you need. Don’t go to the pirate area.

Who is the authorized vendor for free software?

terop (profile)

June 9, 2025 at 7:54 pm

Re: Re: Re:¹³

Who is the authorized vendor for free software?

Authorised vendor is always the person or group of persons who has permission to decide the license for the user. So for free software, authorised vendor is all the contributors who have collectively decided to use LGPL or GPL license. Any one of them can publish the material in their web page, and pass license to use and prepare derived works further to the next guy who then becomes a contributor (and obtains copyright) for their own contributions.

MrWilson (profile)

June 10, 2025 at 12:01 am

Re: Re: Re:¹⁴

No, you said you have to purchase the products you need from an authorized vendor. If the software is free, you don’t have to find an authorized vendor. It was a rhetorical question. You didn’t understand that I was pointing out that your position is nonsensical. It’s the entire issue about fair use. You don’t have to ask for permission or pay a copyright holder if your use is a fair use.

terop (profile)

June 10, 2025 at 12:19 am

Re: Re: Re:¹⁵

If the software is free, you don’t have to find an authorized vendor.

That’s wrong. You still need to find authorized vendor. It’s explicitly stated in the copyright law that 1) some operations are exclusive to author 2) you need explicit license to do those operations 3) only way to properly obtain the license is to find authorized vendor and ask a permission to do the stuff you want to do.

Free software is no different in this respect. They just make it easier to find authorized vendor by looking at some text files distributed together with the software.

Mike Masnick (profile)

June 10, 2025 at 9:30 am

Re: Re: Re:¹⁶

It’s explicitly stated in the copyright law that 1) some operations are exclusive to author 2) you need explicit license to do those operations 3) only way to properly obtain the license is to find authorized vendor and ask a permission to do the stuff you want to do.

Um. You are wrong. Number 3 is not in copyright law at all.

Why are you lying?

Explorer09 (profile)

June 10, 2025 at 9:47 am

Re: Re: Re:¹⁷

It’s explicitly stated in the copyright law that 1) some operations are exclusive to author 2) you need explicit license to do those operations 3) only way to properly obtain the license is to find authorized vendor and ask a permission to do the stuff you want to do.

Um. You are wrong. Number 3 is not in copyright law at all.

Why are you lying?

Precisely speaking, the other way is so called “fair use” defense, but it requires court decisions before you are greenlit.

Rather than wasting time arguing whether AI is “fair use”, my best way is to wait for court decisions to come up and see you guys lose horribly.

And even if the AI companies win the “fair use” defense (which I highly doubt), there is still DMCA section 1202(b), which requires users – including AI companies – to preserve the copyright management information (CMI). That is pending appeal in the Doe v. GitHub case.

MrWilson (profile)

June 10, 2025 at 10:14 am

Re: Re: Re:¹⁸

Precisely speaking, the other way is so called “fair use” defense, but it requires court decisions before you are greenlit.

No, you are absolutely wrong. You don’t have to ask permission for all uses and you don’t have to go to court for them to be legal.

If this were true, you’d still have to contact every single person who releases their works under a permissive license. FOSS developers would be up to their ears in emails asking for permission and they’d never get around to formally approving each individual request.

You really don’t understand US copyright law and neither does Terop.

This is why I call you a copyright maximalist. You invent copyright holder rights that don’t exist and insist they have more power than they legally have.

terop (profile)

June 10, 2025 at 11:13 am

Re: Re: Re:¹⁹

FOSS developers would be up to their ears in emails asking for permission and they’d never get around to formally approving each individual request.

This is the biggest bullshit I’ve heard in a long time. We already declared “With enough eyeballs, Bugs are shallow” as bullshit simply because we received no emails about any problems in our software. In my whole life, I’ve received exactly one email asking for permission. So if people are asking for permission, it’s definitely not via email. Other permission requests (where I received a magnificent $6) came via itch.io web pages. That’s about it. I think the amount of requests is too low rather than too much.

So I would say the number of people who are doing things properly and asking for permission is very small. Now you all will be bashing my product as not useful and I should die horrible death simply because passing money downstream normally works as a permission request and getting money from the society is declared necessary. But creating copyrighted works is not the way to go.

Maybe you have better idea how to fulfill the requirements, since copyrighted works are not working?

MrWilson (profile)

June 10, 2025 at 6:24 pm

Re: Re: Re:²⁰

This is the biggest bullshit I’ve heard in a long time.

This is how copyright works. You don’t have to ask for permission for many uses, especially not open and free licenses. That’s the whole fucking point. And the worst part of all of this is that you yourself have released your software under permissive licenses that don’t require asking for permission!

In my whole life, I’ve received exactly one email asking for permission.

So you’re claiming that you think everyone has to ask you permission, but you admit only one person ever has. So either you’re saying only one person has ever used your software or else you’re not enforcing your copyrights. But the reality is, under the license you’ve chosen, they don’t have to ask for permission. That defeats one of the purposes of the license!

So if people are asking for permission, it’s definitely not via email. Other permission requests (where I received a magnificent $6) came via itch.io web pages. That’s about it. I think the number of requests is too low rather than too high.

I’m talking about FOSS developers whose software gets used a lot, like Linus Torvalds and Richard Stallman.

So I would say the number of people who are doing things properly and asking for permission is very small.

Except you’re completely wrong about “properly.” And “properly” isn’t even the same as legally because it’s not required to ask for permission with such licenses.

Maybe you have better idea how to fulfill the requirements, since copyrighted works are not working?

I’m not here to provide you with career advice.

terop (profile)

June 10, 2025 at 9:57 pm

Re: Re: Re:²¹

So either you’re saying only one person has ever used your software or else you’re not enforcing your copyrights. But the reality is, under the license you’ve chosen, they don’t have to ask for permission.

Well, I have tried multiple different licensing systems, including proprietary, open source, free software, custom, creative commons, eat your own dogfood etc..

But the one person who asked for permission was doing it for solely commercial game software.

Mike Masnick (profile)

June 10, 2025 at 11:06 am

Re: Re: Re:¹⁸

Precisely speaking, the other way is so called “fair use” defense, but it requires court decisions before you are greenlit.

I think you’ve lost track of the conversation. I was responding to terop’s false claim that to use free software, you still have to get it from an “authorized vendor.”

That has fuck all to do with fair use.

I am beginning to think that you are not a good faith debater.

Rather than wasting time arguing whether AI is “fair use”, my best way is to wait for court decisions to come up and see you guys lose horribly.

This is about free software, not AI.

Explorer09 (profile)

June 10, 2025 at 12:30 pm

Re: Re: Re:¹⁹

I think you’ve lost track of the conversation. I was responding to terop’s false claim that to use free software, you still have to get it from an “authorized vendor.”

I didn’t argue out of context. You just mistook terop’s argument about “authorized vendor”. For free software you technically still need to get it from what terop called an “authorized vendor”, except this “authorized vendor” is “everyone that can distribute this software legally”.

Note that in jurisdictions where GPL cannot be fully enforced, distributing GPL software would also be illegal. This is the “liberty or death” clause since GPLv2.

This is about free software, not AI.

Free software does not always mean public domain. For example, training an AI on GPLed code and releasing the model under a non-GPL license is still a copyright violation. Free software doesn’t mean an automatic green light for AI training (it’s “mostly free”, except when you release it as proprietary or combine it with proprietary code).

MrWilson (profile)

June 10, 2025 at 6:42 pm

Re: Re: Re:²⁰

I didn’t argue out of context. Except that you mistook terop’s argument about “authorized vendor”. For free software you technically still need to get from what terop called an “authorized vendor” except this “authorized vendor” is “everyone that can distribute this software legally”.

Bullshit. You’re omitting that Terop claims “only way to properly obtain the license is to find authorized vendor and ask a permission to do the stuff you want to do.”

You don’t have to ask for permission and your classification of the unofficial term “authorized vendor” as “everyone that can legally distribute the software” is still incorrect because you wouldn’t have to ask for permission from a person who doesn’t own the copyright and didn’t decide to release the work under the open license but merely passed it on via the permission automatically granted by the license itself.

terop (profile)

June 11, 2025 at 5:17 pm

Re: Re: Re:²¹

You wouldn’t have to ask for permission from a person who doesn’t own the copyright and didn’t decide to release the work under the open license but merely passed it on via the permission automatically granted by the license itself.

This sounds very wrong. The law does not work this way. Authorized vendor requirement in the law is there because the default behavior is that you need to be able to pass some money to the author, and not everyone in the world is authorized to sell you permission to use the software. Authors have various ways to pass the authorization forward in their sales organisations, but none of those authorization passing techniques allow you to skip the part where users find authorized vendor. None of the sales organizations can reach user’s home, so it is user’s responsibility to travel to the authorized vendor who is able to take your money and give a permission to use the software in return.

MrWilson (profile)

June 12, 2025 at 9:51 am

Re: Re: Re:²²

This is the weirdest theory of copyright ever. You have to be trolling at this point. It’s impossible to be this dense. You’d be violating your own claims every day. You haven’t paid Mike to use Techdirt or asked for his permission to use the website, so by your own determination, you’re constantly violating copyright laws.

Or… You’re an idiot who doesn’t understand what you’re talking about.

terop (profile)

June 12, 2025 at 10:41 am

Re: Re: Re:²³

You haven’t paid Mike to use Techdirt or asked for his permission to use the website, so by your own determination, you’re constantly violating copyright laws.

This is why my messages are being delayed or rejected outright, when Mike wants to forward all information about my products to the advertisement department and try to extort money from me.

MrWilson (profile)

June 12, 2025 at 2:27 pm

Re: Re: Re:²⁴

This claim is probably rising to the level of defamation.

terop (profile)

June 13, 2025 at 7:04 am

Re: Re: Re:²⁵

This claim is probably rising to the level of defamation.

You can’t have defamation, if the information is the truth. And there’s no indication that techdirt suddenly stopped running ads on the site or that they stopped delaying the messages.

MrWilson (profile)

June 13, 2025 at 11:10 am

Re: Re: Re:²⁶

Yet there is no proof that there’s a connection between your messages getting caught in the spam filter entirely because you’re spamming and any intent for Mike to extort or even ask you for advertising dollars. Claiming your unfounded claims are the truth doesn’t make them true. You’re just doubling down on false claims without proof.

terop (profile)

June 13, 2025 at 1:56 pm

Re: Re: Re:²⁷

any intent for Mike to extort or even ask you for advertising dollars.

This is only because I’ve explicitly stated that I’ve received $6 from my 10 year software project/don’t have extra money to pay for the advertisements. But since the $6 is coming from 10 years of work, those dollars should be more valuable currency than your ordinary dollars where you spend less time obtaining them.

MrWilson (profile)

June 13, 2025 at 5:09 pm

Re: Re: Re:²⁸

You’re a paranoid conspiracy theorist at this point.

terop (profile)

June 13, 2025 at 7:34 pm

Re: Re: Re:²⁹

You’re a paranoid conspiracy theorist at this point.

So this is the best reason you can think of, why my technology is not worth exploring. Sounds like you’re running out of reasons and have to invent some bullshit that doesn’t make sense.

MrWilson (profile)

June 14, 2025 at 12:46 pm

Re: Re: Re:³⁰

This has nothing to do with your software. You keep bringing it up as if anyone cares. You’re a nut.

terop (profile)

June 17, 2025 at 2:02 pm

Re: Re: Re:³¹

This has nothing to do with your software.

Let’s look at it this way:
1) copyright office published some paperwork
2) the paperwork contains info about how AI should be handled by software authors
3) the conclusion is that training is violation of copyright
4) thus every software developer worth their salt will examine their copyright bullshit and modify it to match the changing legal environment

But my software has the following aspects:
a) I publish some computer source code/binaries
b) it has AI included in it
c) thus copyright office paperwork is relevant to the AI aspects of the software
d) it turns out that the AI training area is a black box that is bound to explode once a copyright office opinion marks it as copyright infringement
e) thus some changes are needed in the software, and the consequences will be passed to the usa copyright office
f) but in either case, the copyright office paperwork is relevant to my software
QED.

MrWilson (profile)

June 17, 2025 at 6:18 pm

Re: Re: Re:³²

the copyright office paperwork is relevant to my software QED.

You’re confused. The topic may be relevant to your software. Your software, however, isn’t relevant to the topic for anyone else except you.

terop (profile)

June 17, 2025 at 6:53 pm

Re: Re: Re:³³

The topic may be relevant to your software. Your software, however, isn’t relevant to the topic for anyone else except you.

Everyone else who creates software will need to follow the same copyright office rules… that makes it interesting to everyone.

MrWilson (profile)

June 18, 2025 at 11:56 am

Re: Re: Re:³⁴

Again, that’s the topic, not your software. If you hadn’t commented on this, nothing would be lost. You have added no insight and in fact have spewed mistruths and ignorance. You have contributed only confusion and bluster and bullshit.

terop (profile)

June 18, 2025 at 4:33 pm

Re: Re: Re:³⁵

You have contributed only confusion and bluster and bullshit.

that’s only because you didn’t read the full story. in short, copyright office opinion matters more than opinion of fair use pirates.

MrWilson (profile)

June 18, 2025 at 10:42 pm

Re: Re: Re:³⁶

Except it doesn’t if the Copyright Office is wrong. And you wouldn’t know who’s right or wrong because you admit and demonstrate ignorance about US copyright law.

terop (profile)

June 19, 2025 at 2:53 pm

Re: Re: Re:³⁷

Except it doesn’t if the Copyright Office is wrong.

No one in the marketplace is wrong. They just have different views of the same problem. Authors cannot get their products to the market. Publishers don’t have money for extensive sales activity. Customers cannot find the product that solves their problems. And the copyright office cannot make pirates stop pirating.

MrWilson (profile)

June 19, 2025 at 10:36 pm

Re: Re: Re:³⁸

Noone in marketplace is wrong.

If no one is wrong, then contradictory perspectives would both be right, which means truth doesn’t matter and aardvark smartphones postulate Freudian underpants in the cold void of your mom’s basement.

terop (profile)

June 19, 2025 at 10:54 pm

Re: Re: Re:³⁹

If no one is wrong, then contradictory perspectives would both be right, which means truth doesn’t matter

This can be resolved by attaching context to the perspective. Then both can be simultaneously true, even if they’re logically contradictory. The context resolves the contradictions.

MrWilson (profile)

June 20, 2025 at 10:43 pm

Re: Re: Re:⁴⁰

Except both being true isn’t the only possibility. They could both be false. They could both be so wrong that they don’t make any sense whatsoever. And you clearly aren’t qualified to judge what’s true or false or even the context in which they might be either.

terop (profile)

June 21, 2025 at 8:21 am

Re: Re: Re:⁴¹

Except both being true isn’t the only possibility. They could both be false.

Sure. Pigs can also fly.

MrWilson (profile)

June 21, 2025 at 5:53 pm

Re: Re: Re:⁴²

Yeah, you’ve devolved into complete nonsense.

terop (profile)

June 21, 2025 at 9:05 pm

Re: Re: Re:⁴³

Yeah, you’ve devolved into complete nonsense.

You haven’t even seen what level of abstract nonsense I’m capable of. I’ll give you some reading to do, we’ll return to this once you’ve read the following books:
1) Sets for Mathematics, Lawvere
2) Category Theory, Awodey
3) Categories for the Working Mathematician, Mac Lane

The real nonsense is significantly worse than you think. I have significant trouble finding other people who can understand the bullshit, so there’s some ivory tower problems with the material. But hope you read it, so we can talk real bullshit and nonsense in 2 years.

terop (profile)

June 9, 2025 at 10:28 pm

Re: Re: Re:¹³

But you’re purported to have actual knowledge you don’t and worse, you’re purporting to educate others on the topic you yourself require understanding in.

You know how I get information about how copyright works?

By creating copyrighted works.

Computer games, user interface libraries, intros and demos for demoscene, phone user interfaces and 3d engines. Things that are very common in today’s world as software products.

Creating copyrighted works and watching your products fail in the marketplace one after another is good way to learn the fine details about copyright.

I think you fail in copyright, because you rely on lawbooks and bullshit from the internet to base your knowledge and you have no idea how software is being written.

It’s the experience of watching it get invented, designed, written to software source code, submitted to version control, tested and bugfixed and then sales droids will try to turn it to money and failing miserably.

This whole process is such a revelation of why copyright is actually important part of societies and how much damage pirates are doing to the bottom line of companies and individual developers.

MrWilson (profile)

June 10, 2025 at 12:50 am

Re: Re: Re:¹⁴

You know how I get information about how copyright works? By creating copyrighted works.

Creating and publishing any creative work just generates a copyright in the US. It doesn’t give you any understanding of how copyright works at all.

I appreciate you admitting that you really have no understanding of copyright because you have some kind of folklore/cargo cult kind of belief about it. I had pretty much assumed this, but I appreciate you acknowledging that you’re neither well versed in nor even interested in understanding the law that you make assertions about.

I think you fail in copyright, because you rely on lawbooks and bullshit from the internet to base your knowledge

So I don’t understand US copyright law because I base my knowledge on the law? Should I instead consult a ouija board? Cast bones to divine some magical understanding?

and you have no idea how software is being written.

I have a good understanding of how software is being written. Also, software isn’t the only copyrightable content. Your focus on it only is a further demonstration of your ignorance. I produce copyrighted works also.

It’s the experience of watching it get invented, designed, written to software source code, submitted to version control, tested and bugfixed and then sales droids will try to turn it to money and failing miserably.

You’re talking about producing and selling a product. That’s not specific to US copyright law. That’s not even specific to software development.

This whole process is such a revelation of why copyright is actually important part of societies and how much damage pirates are doing to the bottom line of companies and individual developers.

You are just articulating how myopic your perspective is on US copyright law here.

You’re in Plato’s cave describing your deep understanding of a single shadow on the wall and admitting you’ve never been outside to see what’s actually going on.

terop (profile)

June 12, 2025 at 7:39 am

Re: Re: Re:¹⁵

Creating and publishing any creative work [..] doesn’t give you any understanding of how copyright works at all.

I think the above claim is blatantly wrong. Copyright was created to support authors whose work was ripped off by publishers who did not have authorization to sell the author’s product. As such, copyright law recognizes what kind of activity is detrimental to the success of product development. It’s the activities like creating and publishing copyrighted works that must be continued even when money from the effort goes to some unrelated copycats.

MrWilson (profile)

June 12, 2025 at 9:53 am

Re: Re: Re:¹⁶

You have admitted to not studying US copyright law, so your claims about it aren’t just wrong, but completely ignorant.

terop (profile)

June 12, 2025 at 10:23 am

Re: Re: Re:¹⁷

You have admitted to not studying US copyright law, so your claims about it aren’t just wrong, but completely ignorant.

Copyright laws are supposed to work the same everywhere in western world, so you cannot hide behind your usa pond, when the same rules apply to larger area of the world. This is why we can sue pirate sites operated from usa, if they decide to infringe our copyright.

MrWilson (profile)

June 12, 2025 at 2:29 pm

Re: Re: Re:¹⁸

Copyright laws are supposed to work the same everywhere in western world,

No, they aren’t. That isn’t the law at all.

terop (profile)

June 10, 2025 at 10:11 am

Re: Re: Re:¹³

Not all AI training is illegal, clearly or otherwise. Not all torrenting is illegal.

If you give this BS to the courts, they will laugh you out from the courtroom. I bet you can’t even download the damn torrent client without violating copyrights.

MrWilson (profile)

June 10, 2025 at 11:23 am

Re: Re: Re:¹⁴

If you give this BS to the courts, they will laugh you out from the courtroom.

No, they won’t. It won’t even get to court because a FOSS developer won’t sue you for legally torrenting their free software that they themselves will often seed or offer to mirror sites to seed because they are interested in distributing their free software. That you aren’t aware of this fact is damning to you as a software developer. This is a giant gaping hole in your understanding of your own proclaimed field of expertise.

I bet you can’t even download the damn torrent client without violating copyrights.

I’d ask for a citation, but I know you’re not good for one. And it’s not necessary because the claim is false on its face. Torrenting is just a file transfer protocol. It’s not itself illegal. Game platforms have used torrenting to release content. FOSS developers have used it to release their software. Public domain content is perfectly legal to transfer over the internet. The mythology you’ve invented about copyright law is absurd.

terop (profile)

June 10, 2025 at 2:00 pm

Re: Re: Re:¹⁵

Torrenting is just a file transfer protocol. It’s not itself illegal.

User interfaces are needed before your file transfer protocol is useful to anyone and their mother. And there are strict rules that the user interface must not display pirated material in its user interface. Popular user interfaces like web browsers are struggling to meet the strict requirements set by the law. Basically, you can check this information by looking at the lawsuits brought by the RIAA and the copyright lobby against software vendors who provide user interfaces (things like napster)… The lawsuits always have things like “our trademarks are infringed when (napster) displays the name of the song in its user interface”…

Explorer09 (profile)

June 10, 2025 at 3:34 pm

Re: Re: Re:¹⁶

Torrenting is just a file transfer protocol. It’s not itself illegal.

@terop

User interfaces are needed before your file transfer protocol is useful to anyone and their mother. And there are strict rules that the user interface must not display pirated material in its user interface.

Which rule? From what I’ve seen of the Napster and Grokster cases, I don’t remember any rule saying that user interfaces must not display pirated material. The Napster and Grokster cases were not about that.

The problem with both P2P platforms is that the companies making the P2P software benefitted from the illegal copying done by their customers, and that piracy was the primary use of the P2P software.

Of course it doesn’t make BitTorrent-as-a-protocol illegal. The real catch of it is, if you’re developing a technology that can be used for illegal purposes, make sure you don’t contribute to those activities or profit from them (or else you will be liable).

terop (profile)

June 10, 2025 at 4:05 pm

Re: Re: Re:¹⁷

Which rule? For what I’ve seen in the Napster and Grokster cases, I didn’t remember there’s a rule saying the user interfaces must not display pirate materials.

https://copyrightalliance.org/wp-content/uploads/2016/09/AM-Records-v.-Napster.pdf

“””Napster provides technical support for the indexing and searching of MP3 files, as well as for its other functions, including a “chat room,” where users can meet to discuss music, and a directory where participating artists can provide information about their music.”””

This “technical support for the indexing and searching mp3 files” explicitly points towards the user interface.

MrWilson (profile)

June 10, 2025 at 6:50 pm

Re: Re: Re:¹⁶

And there are strict rules that the user interface must not display pirated material in its user interface.

This isn’t a thing at all. There is no law that says you can’t list non-copyrightable names and titles in a display. And the names themselves aren’t the material itself. You have no idea how US copyright law works at all.

Popular user interfaces like web browsers are struggling to meet the strict requirements set by the law.

No, they aren’t.

Basically, how you can check this information is by checking lawsuit done by RIAA and copyright lobby against software vendors who provide user interfaces (things like napster)…

Those lawsuits didn’t set any kind of precedent that makes displaying the names of content a copyright violation. You didn’t actually research this topic.

The lawsuits always have things like “our trademarks are infringed when (napster) displays the name of the song in its user interface”…

Trademark and copyright are two distinctly different types of “intellectual property.” You made a claim that copyrights were violated merely by the display of words in an interface, but now you’re shifting the goalposts to trademark.

The absurdity is that this entire argument is useless because it’s perfectly legal to torrent legal torrents, so therefore there wouldn’t be any “pirated material” displayed in order to run afoul of this entirely made-up false aspect of copyright law you just conjured. So you’re even wrong inside your own wrong fantasy.

terop (profile)

June 11, 2025 at 6:26 am

Re: Re: Re:¹⁷

The absurdity is that this entire argument is useless because it’s perfectly legal to torrent legal torrents, so therefore there wouldn’t be any “pirated material” displayed in order to run afoul of this entirely made-up false aspect of copyright law you just conjured.

Good luck with that. The day you implement support for torrents, the users will demand that they can add their own protocol entry points to the system, so that their pirated file collections can be included in the system. Then you as an author can’t see anything wrong in your software, but users are leeching and pirating material like crazy. When you finally figure out that the users’ only interest in your software is that it allows pirate data to be used, it’s too late and RIAA/MPAA is just a few weeks away from sending a DMCA notice about your software, or if it’s a blatant enough violation, giving you paperwork for a lawsuit.

This is significant issue for copyrights. Pirates will find a way to insert their bullshit to software projects. Things like requiring software to be open source so that pirate can then modify it and disable all the protections against copyright infringements.

MrWilson (profile)

June 11, 2025 at 10:09 am

Re: Re: Re:¹⁸

The day you implement support for torrents,

Many developers have implemented support for their own torrents. Notably, Blizzard has used the Blizzard Downloader that included BitTorrent protocols as a means of downloading legal game updates.

the users will demand that they can add their own protocol entry points to the system, so that their pirated file collections can be included to the system.

First, this isn’t the case. The Blizzard Downloader was only used for authorized Blizzard downloads. Users didn’t demand it be able to include copyright infringing uses. There are real world examples proving your random speculation is completely incorrect.

Then you as an author can’t see anything wrong in your software, but users are leeching and pirating material like crazy.

Software is typically agnostic as to its uses. Legal software being used for illegal means is on the user, not the developer, unless you can prove in court that the developer intended for it to be used for copyright infringement.

When you finally figure out that user’s only interest in your software is because it allows pirate data to be used,

Blizzard was quite certain that its own customers wanted to use the downloader to…download Blizzard’s games, that they legally paid for. Blizzard was only too happy to provide the service people were paying it for.

Your ignorance of the legal uses of torrenting is appalling, but not surprising.

it’s too late and RIAA/MPAA is just a few weeks away from sending a DMCA notice about your software, or if it’s a blatant enough violation, giving you paperwork for a lawsuit.

Cite the lawsuit from these organizations or their member corporations against Blizzard’s torrenting software. Oh wait, it doesn’t exist. Also, the MPAA is now the MPA and has been for 6 years, again demonstrating that you don’t know what you’re talking about. Your talking points are stale.

This is significant issue for copyrights. Pirates will find a way to insert their bullshit to software projects. Things like requiring software to be open source so that pirate can then modify it and disable all the protections against copyright infringements.

Your weird fever dreams and hallucinations don’t reflect reality. You should get your blood pressure checked.

terop (profile)

June 11, 2025 at 3:36 pm

Re: Re: Re:¹⁹

Cite the lawsuit from these organizations or their member corporations against torrenting software.

check this article about meta’s AI branch to fight publishers about meta leeching their AI data via torrent from pirate sites, and meta lost the fight:

https://www.wired.com/story/new-documents-unredacted-meta-copyright-ai-lawsuit/

Explorer09 (profile)

June 12, 2025 at 3:32 am

Re: Re: Re:²⁰

check this article about meta’s AI branch to fight publishers about meta leeching their AI data via torrent from pirate sites, and meta lost the fight [URL omitted; Kadrey v. Meta case]

@terop

It’s not completely lost. The summary judgement is still pending, but it’s unlikely that Meta will win. Meta’s last bet is claiming that torrenting to train AI is fair use, but the judge has expressed doubt on that.

I’m also waiting to see the judgement coming out. It would be the first case regarding generative AI training and fair use (and more authoritative than USCO).

terop (profile)

July 2, 2025 at 10:15 pm

Re: Re: Re:²¹

I’m also waiting to see the judgement coming out.

Looks like Meta won their lawsuit because plaintiffs focused on wrong aspect.

Explorer09 (profile)

June 11, 2025 at 11:25 am

Re: Re: Re:¹⁸

@terop

Pirates will find a way to insert their bullshit to software projects. Things like requiring software to be open source so that pirate can then modify it and disable all the protections against copyright infringements.

Open source software does not need protections against copyright infringements, since it’s part of the license to permit distribution almost anywhere and to anyone (except in jurisdictions where an open source license cannot be enforced, which is a rare case).

This comment shows your misunderstanding of open source software. Open source software ≠ piracy. And BitTorrent has legal uses. What can get you in trouble is when you permit users to pirate material through your platform (software) AND you benefit from those illegal uses. Both conditions must be satisfied for you to become liable. Cases where there is no liability include:

(1) You develop an open source BitTorrent client (e.g. Transmission). But you do not profit from the users using it (through subscriptions or other means) nor secondary benefits such as ad revenues.
(2) You develop a video game that supports BitTorrent protocol as a way to download game updates, and your video game client does not allow users to torrent arbitrary files on it. (That is, it allows only game updates.)

So make it clear. BitTorrent has legal uses.

MrWilson (profile)

June 11, 2025 at 12:40 pm

Re: Re: Re:¹⁹

Holy shit. You actually got one thing right! You deserve a gold star!

terop (profile)

June 11, 2025 at 3:49 pm

Re: Re: Re:¹⁹

Open source software does not need protections against copyright infringements, since it’s part of the license to permit distribution almost anywhere and to anyone

This isn’t true. Open source still needs to respect copyright of other people, even if they allow their own code to be copied freely.

This means that if users are giving urls to your software and the software loads some data from the web, the software author needs to ensure that the data loaded was not pirated. Since users provided the url/internet location, checking if the material is pirated becomes slightly more difficult/currently impossible to do programmatically. For this reason, open source software that loads data from user-defined urls needs to have a section in its terms of service stating that pirated material must not be used in any urls typed into the software’s input slots. Basically the software’s legality fails only if users break the conditions described in the TOS.

terop (profile)

June 12, 2025 at 9:46 am

Re: Re: Re:²⁰

This TOS solution is only used because no other solution is available. There are indications in court paperworks that the TOS solution is simply not enough to prevent large scale piracy happening through software you write. And thus courts are unwilling to accept it as a solution to the piracy problem. In my software, there’s additional tricks that need to be used as defense in addition to the TOS trick: namely, copy-pasting url to the software is designed to be burdensome enough that “manual steps are required” before piracy can happen, thus limiting significantly how large scale piracy users are able to do through the software.

But these same steps were tried by court cases where the defendant was paying significant damage amounts to the content owners. Thus the solutions we have for preventing user’s piracy might not be enough for the lawyers and they will just declare the software illegal.

It takes a significant amount of research and effort to figure out these solutions that allow the software to operate properly, but still prevent the piracy that users are trying to do. But if the research is not being done, the situation is significantly worse. The people who create software, but do not care about respecting other people’s copyright, will be blinking targets for copyright infringement lawsuits in the content owner’s copyright tracking system…

MrWilson (profile)

June 12, 2025 at 11:19 am

Re: Re: Re:²⁰

Basically software’s legality fails only if users break the conditions described in the TOS.

The TOS isn’t necessarily binding legally if the terms of the TOS aren’t legal. A TOS, for example, could require someone to break the law in order to have permission to use the software, and under US contract law, that isn’t a binding contract.

That said, a TOS can explicitly state that verbal or written permission is not required, meaning that your entire theory that users must contact copyright owners is completely false.

I’ve literally seen copyright holders who release content under permissive licenses complain that people contact them to ask for permission when they specifically used permissive licenses so that no one would need to contact them and waste their time asking permission for something already granted in the permissive license.

terop (profile)

June 13, 2025 at 5:32 am

Re: Re: Re:²¹

a TOS can explicitly state

But this requires that author explicitly decides to do this. I’m usually talking about the default behaviour, i.e. what happens when authors decide nothing…

MrWilson (profile)

June 13, 2025 at 11:11 am

Re: Re: Re:²²

When the authors decided nothing then they don’t make a TOS so your point is entirely moot.

terop (profile)

June 16, 2025 at 8:32 pm

Re: Re: Re:²³

When the authors decided nothing then they don’t make a TOS so your point is entirely moot.

Sadly for you, when some copyright infringement cannot be detected reliably by software algorithms, the only solution that makes the software legal is a section in the TOS that declares it an illegal area for the user. Thus it is a legal requirement that the TOS contains this section. Authors do not need to make an explicit decision to include it, since the law forces their hand.

terop (profile)

June 9, 2025 at 10:44 pm

Re: Re: Re:¹¹

This is another unsupported claim from you. [citation needed]

I posted citations, but it was filtered out by the spam filter, and seems the maintainers are not willing to give you the information you need to check the facts.

Mike Masnick (profile)

June 10, 2025 at 12:22 am

Re: Re: Re:¹²

I posted citations, but it was filtered out by the spam filter, and seems the maintainers are not willing to give you the information you need to check the facts.

This is a flat out lie.

The only comment of yours that we blocked was in reply to me calling you out for lying. Let’s be clear on this: you FALSELY claimed that we mention your software all the time.

I responded, pointing out that this was an outright lie, and saying I don’t know your software, have never mentioned it…

And YOU responded with a spam ad about your software saying “Oh, I can fix that…” and went off on some nonsense about your software.

You provided no “citations” to the things that MrWilson was requesting.

So, not only did you lie first of all, but you’re lying again now, and trying to use the fact that you tried to spam my comments with an ad to your software (which looks stupid, derivative, and pretty fucking useless), as an excuse for failing to provide actual citations.

Dude: fuck off.

terop (profile)

June 10, 2025 at 12:38 am

Re: Re: Re:¹³

You provided no “citations” to the things that MrWilson was requesting.

The damn software is the citation I’m relying on. If using my own copyrighted work is not enough for your standards to work as a citation for copyright issues, then I don’t know what is.

Relying on case law means just copy-pasting some keywords which refer to stuff we have no access to. It’s just guesswork if some paperwork hidden in some lawyer’s office matches the technological environment the software relies on. Probably these case law paperworks have some cool limitations of when the stuff is valid and when it’s not valid, but since the actual text is not available (and no one would bother reading them anyway), it’s just useless.

My own software on the other hand is significantly better to rely on, since I know every detail of it after spending 10 years writing the copyrighted work. But if I rely on stuff that I know, other people obviously have problems following it.

So it’s either spouting bullshit about case law that I know nothing about. Or I spout stuff that I know well, but other people have trouble understanding. It’s your call. Which would you prefer?

Mike Masnick (profile)

June 10, 2025 at 9:32 am

Re: Re: Re:¹⁴

The damn software is the citation I’m relying on.

No it’s not. Your post was literally an advertisement for your software. That’s not what MrWilson (or anyone) was asking for.

That’s not a citation.

I’m beginning to think your problem in life is not copyright, but that you may be the dumbest fucking idiot ever to visit this site.

So it’s either spouting bullshit about case law that I know nothing about. Or I spout stuff that I know well, but other people have trouble understanding. It’s your call. Which would you prefer?

That makes no sense. Again, the one post of yours we didn’t let through was not a citation. It was you promoting your software.

Why do you lie so fucking much?

terop (profile)

June 10, 2025 at 10:21 am

Re: Re: Re:¹⁵

I’m beginning to think your problem in life is not copyright, but that you may be the dumbest fucking idiot ever to visit this site.

Yeah, that’s why I have to create these copyrighted works and receive no compensation for my efforts. Some would call it dumb to repeatedly do the same failure of creating a product when the previous one didn’t make me richer than Bill Gates. But compensation issues aside, I think it benefits the society if people are spending their time creating something useful, instead of spray-painting the neighbour’s garage or government buildings. If I was real dumb, I would do damage to the environment. But no, I’d rather create software that looks cool and would be useful if someone would take the time and actually use it.

MrWilson (profile)

June 10, 2025 at 11:26 am

Re: Re: Re:¹⁶

But compensation issues aside, I think it benefits the society if people are spending their time creating something useful, instead of spray-painting the neighbour’s garage or government buildings.

This is a false dilemma. It’s entirely possible to create something useful and not be completely wrong about how US copyright law works. Plenty of people do it every day. You’re literally chatting in the comments section of a website you seem to find valuable to spend time on that Mike has created.

terop (profile)

June 10, 2025 at 1:00 pm

Re: Re: Re:¹⁷

It’s entirely possible to create something useful and not be completely wrong about how US copyright law works.

Have you noticed that this only works in the land of the free? 95% of the world’s population doesn’t work that way. The world long ago moved away from the USA-centric platform you’re promoting, and looked the other way. When you bow to the east, your ass points to the west. This is what you need to learn. The world isn’t in your small pond.

MrWilson (profile)

June 10, 2025 at 6:53 pm

Re: Re: Re:¹⁸

Have you noticed that this only works in the land of the free?

Dude, we’re talking about US copyright law! That’s all we’ve ever been talking about here. This article is literally about the US Copyright Office issuing an opinion on US copyright law. Of course this only works in the US! You are so obtuse.

terop (profile)

June 11, 2025 at 1:25 pm

Re: Re: Re:¹⁹

we’re talking about US copyright law! That’s all we’ve ever been talking about here.

You really don’t get it. The laws and regulations made by the USA copyright office are being copied by companies all over the world. The countries outside of the USA must allow those stupid rules whether they wanted it or not, simply because global companies are following the USA rules. Even when the USA rules are completely ridiculous, the horror is being copied all the time all over the world and then we have to suffer the consequences.

This is why USA rules do not just apply to the USA area, but have wider impact. We should just require that the USA respects its dominant position and keeps the rules stable and does not bring in stupid stuff that doesn’t work everywhere in the known universe.

MrWilson (profile)

June 11, 2025 at 5:29 pm

Re: Re: Re:²⁰

You really don’t get it. I can vote for legislators who will change US law. I have no influence over what companies outside of the US do. Talking to me about that is as useless as asking a fish at the bottom of the ocean to help address climate change.

terop (profile)

June 11, 2025 at 6:05 pm

Re: Re: Re:²¹

asking a fish at the bottom of the ocean to help address climate change.

I can help with that. I know where the fish we eat comes from, i.e. Norwegians are responsible for 60% of all fish eaten in Finland, so we just need to pass the information through our retailers to the Norwegian fish farms and they can then start fixing the climate change as requested.

MrWilson (profile)

June 12, 2025 at 9:56 am

Re: Re: Re:²²

The fish themselves aren’t capable of consciously helping with climate change. You didn’t understand the simile at all.

terop (profile)

June 18, 2025 at 2:39 pm

Re: Re: Re:²³

The fish themselves aren’t capable of consciously helping with climate change.

We can just delegate our copyright issues to the fish. What better confirms our software legality than getting our copyright bullshit from the sea.

MrWilson (profile)

June 18, 2025 at 10:45 pm

Re: Re: Re:²⁴

Well, you seem to conjure your understanding of copyright out of nothing, rather than actual study of the law and case law, so sure, why not?

terop (profile)

June 19, 2025 at 2:50 pm

Re: Re: Re:²⁵

you seem to conjure your understanding of copyright out of nothing,

Now you’re saying my experiences as an author are “nothing”. Can’t I base my copyright bullshit on my own experiences dealing with publishers? Things like companies insisting on exclusive licenses so that authors are forbidden from publishing the same material via other channels? And then publishers stopping selling the product after a month of sales activity?

Mike Masnick (profile)

June 9, 2025 at 5:11 pm

Re: Re: Re:⁶

It’s you that need to explain why it’s not, and it won’t help by accusing other people are stupid.

If you post stupid shit, I call you stupid. Don’t like it? Don’t post stupid shit.

If those commercial AI models use only licensed data (or public domain data) for training, then we have no problem.

Even leaving aside the still undecided question of whether or not training on licensed work infringes, that wouldn’t make any AI system “illegal” as the original comment suggested.

Otherwise, they are infringing either the author’s right on reproduction or derivative works or both.

If you want to go down that path, you’re still wrong. An infringing use does not make an entire product illegal.

But also, no, even if one is using unlicensed materials, it’s difficult to see how either of those rights are impacted.

The only possible defense here is fair use

Also wrong. An alternative is that no copyright protected uses are implicated. No reproduction is made, and no derivative works are created. So fair use is a defense, but hardly the only one.

as Thomson Reuters v. Ross case has shown, the AI companies are not likely to win.

One ruling, generally seen as ridiculous among the copyright bar, and being appealed. Wouldn’t hang your hat on that just yet.

Explorer09 (profile)

June 9, 2025 at 11:34 pm

Re: Re: Re:⁷

If those commercial AI models use only licensed data (or public domain data) for training, then we have no problem.

Even leaving aside the still undecided question of whether or not training on licensed work infringes, that wouldn’t make any AI system “illegal” as the original comment suggested.

It’s disgusting for you and Techdirt to keep the attitude of “steal it first, and ask for forgiveness later”. That’s what the Big Tech companies are thinking. They gamble on they have more money to win in courts than the authors that would sue them.

Look, I know of AI uses that could win the fair use arguments, but for many generic, big AI models, they probably won’t. There are already evidences that ChatGPT generate work that compete on the same market as the book authors, and that should give you a warning sign when you make your AI application based on that.

An infringing use does not make an entire product illegal.

Say that to Napster and Grokster, please. You guys ignoring the important ruling of MGM v. Grokster make me feel you guys are intentionally deceiving the public.

An alternative is that no copyright protected uses are implicated.

Even when this is true, how can the AI companies defend for fair use on this?

No reproduction is made, and no derivative works are created. So fair use is a defense, but hardly the only one.

AI pre-training involves reproduction. So the first argument is false for generative AI already. The next is fair use, which is the only defense for Meta for its Llama AI models, currently.

MrWilson (profile)

June 10, 2025 at 12:08 am

Re: Re: Re:⁸

It’s disgusting for you and Techdirt to keep the attitude of “steal it first, and ask for forgiveness later”.

Quote Mike where he’s advocated for this.

That’s what the Big Tech companies are thinking.

What about the Big Media companies you’ve admitted to allying with?

They gamble on they have more money to win in courts than the authors that would sue them.

They do have more money than the authors that would sue them. And licensing works won’t pay out to smaller creators much at all. Lawyers are going to be the biggest winners of any lawsuit on this issue, regardless of who wins.

Look, I know of AI uses that could win the fair use arguments,

This contradicts your previous assertions.

but for many generic, big AI models, they probably won’t.

Based on your flawed analysis…

There are already evidences that ChatGPT generate work that compete on the same market as the book authors,

[citation needed]

Say that to Napster and Grokster, please. You guys ignoring the important ruling of MGM v. Grokster make me feel you guys are intentionally deceiving the public.

Define LLMs in such a manner that they are comparable to a peer-to-peer network.

Even when this is true, how can the AI companies defend for fair use on this?

You don’t understand legal strategies.

AI pre-training involves reproduction.

Not in the model, not in the results. You’re demonstrating, again, that you don’t understand the technology.

Explorer09 (profile)

June 10, 2025 at 12:39 am

Re: Re: Re:⁹

There are already evidences that ChatGPT generate work that compete on the same market as the book authors,

[citation needed]

Youtube video “How To Create And Sell E-books Using ChatGPT | How TO Earn Money Using ChatGPT”

AI pre-training involves reproduction.

Not in the model, not in the results. You’re demonstrating, again, that you don’t understand the technology.

Page 28 of the USCO report part 3:

‘As discussed in the Technological Background, the extent to which models memorize training examples is disputed. When, however, a specific model can generate verbatim or substantially similar copies of a training example, without that expression being provided externally in the form of a prompt or other input, it must exist in some form in the model’s weights. When a model takes the prompt “Ann Graham Lotz” and outputs an image that is nearly identical to a portrait found in the training data, the expression in that image clearly comes from the model.’

(Emphasis added)

Refute this one, please, and stop saying bullshit.

MrWilson (profile)

June 10, 2025 at 1:18 am

Re: Re: Re:¹⁰

Youtube video “How To Create And Sell E-books Using ChatGPT | How TO Earn Money Using ChatGPT”

You cited a video creator claiming how to tell people how to do something. That doesn’t prove that it can actually compete or that human audiences will pay for it in favor over human-authored works. Notably, the video demonstrates the limitation of ChatGPT to produce a large amount of content in a single prompt. The video creator asked for a “book” that was 8 pages long. I googled more examples and found someone who spent 3 hours coming up with a “book” that was 8000 words long. You’re refuting your own arguments with this citation. And that’s not even addressing issues of quality in the rendered content.

As I’ve said, I’ve asked LLMs to compete with my own works and it can’t. It writes like a child repeating concepts it’s read but not understood.

‘As discussed in the Technological Background, the extent to which models memorize training examples is disputed.

This statement alone refutes your claim. It’s not definitively decided. It’s disputed.

When, however, a specific model can generate verbatim or substantially similar copies of a training example, without that expression being provided externally in the form of a prompt or other input, it must exist in some form in the model’s weights. When a model takes the prompt “Ann Graham Lotz” and outputs an image that is nearly identical to a portrait found in the training data, the expression in that image clearly comes from the model.’ Refute this one, please, and stop saying bullshit.

This example was taken from research that hasn’t been reproduced by peers. And if you look at the details of the “research,” they used an older model trained on relatively few images, which included duplicates of the image that was supposedly (but imperfectly) reproduced. And they had to try millions of times to get something that looked close enough to make the claim. They were trying to achieve this result and they put biased efforts towards it. And this is a single example. If it were reproducing copyrighted works (and perfectly, which isn’t even being alleged), they’d have a million examples instead of one.

But you’re just quoting the copyright office report instead of having researched this particular example yourself before citing it, proving that you are only looking for claims you agree with rather than researching all the nuances of the topic. You’re not interested in learning or changing your mind. You’re just spewing your magical thinking and pretending like your confidence is a sufficient substitute for knowledge or morality.

Explorer09 (profile)

June 10, 2025 at 3:01 am

Re: Re: Re:¹¹

You cited a video creator claiming how to tell people how to do something. That doesn’t prove that it can actually compete or that human audiences will pay for it in favor over human-authored works.

Yes. And the only reason that humans would buy AI generated works is when they can’t tell it’s AI generated. In other words, when the AI can deceive human audience. And also, in other words, Turing Test.

So your idea would be to let AI flood the book market with AI generated “slop” and force the potential human buyers to participate in this giant Turing Test, and I haven’t even argued whether it’s at all ethical to begin with. (An ethical Turing Test requires informed consent, that human participants are aware they are being tested and the content they see during the test may be AI made.)

I googled more examples and found someone who spent 3 hours coming up with a “book” that was 8000 words long.

And that supports my position that someone can make a book very quickly with AI, which in turn competes with the book authors that the AI was being trained with.

It doesn’t have to be a single prompt, but the fact the people can sell AI generated books is sufficient for this claim.

And they had to try millions of times to get something that looked close enough to make the claim.

One in a million chance is greater than zero, and that’s sufficient for the claim.

A purely coincidental resemblance of a copyrighted work would be less than a quintillionth chance to make it. By quintillionth I mean 2^(-64). Or even less, because modern cryptographic hashes have been at least 160 bits long. And the chance of a random monkey making a Shakespeare chapter is much less than that (taking the infinite monkey theorem into account).

If it were reproducing copyrighted works (and perfectly, which isn’t even being alleged)

Infringement doesn’t require perfect reproduction. Imperfect copies can also constitute infringement.

you’re just quoting the copyright office report instead of having researched this particular example yourself before citing it, proving that you are only looking for claims you agree with rather than researching all the nuances of the topic.

You are the one that should cite the counter-claim, not me. If you are more professional than the USCO, then show us your papers.

MrWilson (profile)

June 10, 2025 at 10:53 am

Re: Re: Re:¹²

Yes. And the only reason that humans would buy AI generated works is when they can’t tell it’s AI generated.

So advocate for laws that require LLM-generated content to be labeled. I’m fine with that.

In other words, when the AI can deceive human audience.

That doesn’t require outlawing LLMs by expanding copyright.

And also, in other words, Turing Test.

You’re generalizing here. The Turing Test involves a machine convincing a human that it’s also human through direct interaction. The human has the opportunity to interrogate the machine. Just reading text generated by an LLM doesn’t qualify as the Turing Test.

So your idea would be to let AI flood the book market with AI generated “slop” and force the potential human buyers to participate in this giant Turing Test, and I haven’t even argued whether it’s at all ethical to begin with.

No, not at all. I have not advocated for this at all. Again, QUOTE ME WHERE I SAID THAT! I said you’re just wrong about what you claim your citation proves. I didn’t advocate for anything in refuting your bad example.

(An ethical Turing Test requires informed consent, that human participants are aware they are being tested and the content they see during the test may be AI made.)

Humans aren’t being tested in the Turing Test. You continue to show your lack of understanding for nuance.

And that supports my position that someone can make a book very quickly with AI, which in turn competes with the book authors that the AI was being trained with.

Except it explicitly doesn’t compete with the books that the LLM was trained on. The LLM is trained on a wide variety of books. If an LLM is trained on a fantasy novel but is prompted to write a self-help book, the result doesn’t compete at all with the fantasy novel. And, here’s the kicker: it doesn’t even compete with other self-help books.

You’re A) conflating training with copyrighted content by the developer with the results prompted by a user, which are two different things B) you don’t seem to understand what compete means. You seem to think offering a product in the same general market as another product is competition but that’s not it.

I can ask Christie’s or Sotheby’s to offer a child’s five second crayon drawing in the same auction as a $500 million Picasso, but the child’s work doesn’t compete with the Picasso. The audience is different. The ability to fetch a similar price is different. The ability to affect the market for the Picasso is non-existent.

You’d have to prove that humans knew it was LLM-generated, that they wanted to purchase LLM-generated content, and that it replaced their interest in buying a human-authored work for it to actually compete.

What you’re also missing is that LLM-generated content isn’t copyrightable under US copyright law, so the “market” for the content goes away when one person purchases it and then legally releases it for free online.

It doesn’t have to be a single prompt, but the fact the people can sell AI generated books is sufficient for this claim.

That’s a function of what a publishing platform allows, not a legal claim. You should take it up with Amazon since the video referenced the KDP platform.

One in a million chance is greater than zero, and that’s sufficient for the claim.

It’s not just one in a million. It’s one in a million attempts to get that specific content from an unused model disproportionately trained on the targeted content. You’d have to argue that people who could otherwise do a google search for the “copied” content and find it easily would rather intentionally download an obsolete LLM model and spend hours and hours trying to get a reproduction of a widely available image that you can download easily from the internet for free.

You’re hanging your entire argument on the possibility that some people will waste their lives trying to get LLMs to reproduce already available content that isn’t even for sale and that will somehow lead to the downfall of human-authored content. That’s absurd.

A purely coincidental resemblance of a copyrighted work would be less than a quintillionth chance to make it. By quintillionth I mean 2^(-64). Or even less, because modern cryptographic hashes have been at least 160 bits long.

[citation needed] More specifically, legal citation needed.

And the chance of a random monkey making a Shakespeare chapter is much less than that (taking the infinite monkey theorem into account).

Not this again. The infinite monkey theorem is irrelevant here. In reality, a monkey doesn’t have infinite time to type. It’s not a realistic measure of anything. It would never produce an entire work of Shakespeare (you said “chapter,” but Shakespeare wrote plays divided into acts and wrote poems, and the theorem is about whole works, not subsections of works). That you even mention this proves, again, that you cannot be taken seriously. You’re a dilettante pretending to know something about this topic.

Infringement doesn’t require perfect reproduction. Imperfect copies can also constitute infringement.

If it’s not discernible as a reproduction, it can’t constitute infringement.

You are the one that should cite the counter-claim, not me.

You haven’t cited a credible source yet for your claims. I’m not going to do the legwork to prove you wrong when you haven’t even supported a claim.

Anonymous Coward

June 10, 2025 at 1:22 pm

Re: Re: Re:¹³

So advocate for laws that require LLM-generated content to be labeled. I’m fine with that.

“China Releases New Labeling Requirements for AI-Generated Content”
https://www.insideprivacy.com/international/china/china-releases-new-labeling-requirements-for-ai-generated-content/

That doesn’t require outlawing LLMs by expanding copyright.

There is no outlawing. But training AI with copyrighted data is already illegal (most of the time). So it’s actually about enforcing existing copyright laws, not adding new ones.

And DMCA included. AI companies must not remove copyright management information (CMI) during training. (If keeping the CMI would make AI “regurgitate” lots of copyright information in outputs, that’s the AI’s problem, not the problem with copyright law.)

The Turing Test involves a machine convincing a human that it’s also human through direct interaction. The human has the opportunity to interrogate the machine. Just reading text generated by an LLM doesn’t qualify as the Turing Test.

Does this even matter? Because the human “interrogators” aren’t told they are to interrogate, and the same ethical problem occurs. (By the way, if I can tell MrWilson is a machine or human when I’m not told that I am to interrogate in this Turing Test…)

Humans aren’t being tested in the Turing Test.

Yes they are. https://en.wikipedia.org/wiki/Turing_test#Should_the_interrogator_know_about_the_computer?

If humans aren’t told they are interrogators, they become part of the test and it’s the ethical problem.

If an LLM is trained on a fantasy novel but is prompted to write a self-help book, the result doesn’t compete at all with the fantasy novel.

Bullshit. Unless the LLMs have significantly limited on the type of outputs, end users will go to write a fantasy novel with that.

A direct proof: https://www.reddit.com/r/GPT3/comments/zgg3y9/how_do_prompt_chatgpt_to_write_a_fantasy_novel/

To discuss this in detail: it’s the precedent of the Grokster case (and perhaps Napster, too). Even when P2P has many legal, non-infringing uses, as long as (1) users primarily use it for infringement, and (2) the company making the technology benefits from illegal uses, then the company is liable (contributory copyright infringement).

It’s the ruling, so don’t blame me.

You’re A) conflating training with copyrighted content by the developer with the results prompted by a user, which are two different things

See the contributory copyright infringement liability.

B) you don’t seem to understand what compete means. You seem to think offering a product in the same general market as another product is competition but that’s not it.

Warhol v. Goldsmith. Like it or not. I don’t want to argue on this anymore.

I can ask Christie’s or Sotheby’s to offer a child’s five second crayon drawing in the same auction as a $500 million Picasso, but the child’s work doesn’t compete with the Picasso. The audience is different. The ability to fetch a similar price is different. The ability to affect the market for the Picasso is non-existent.

Bad analogy. A better one would be: An imitation artist at an auction of Hayao Miyazaki who says he can replicate someone’s portrait or photo “in the style of Miyazaki” and collects money out of those imitations.

And I use Miyazaki instead of Picasso because Picasso’s copyrights have long been expired, making all infringement claims invalid and “fair use” arguments moot.

You’d have to prove that humans knew it was LLM-generated, that they wanted to purchase LLM-generated content, and that it replaced their interest in buying a human-authored work for it to actually compete.

No. My assumption was humans don’t know they are LLM-generated when they purchase. So no “[replacement of] interests in buying a human-authored work” nonsense.

You are thinking of factor 4 of fair use too narrowly. That factor was made to be broad, covering not only existing markets but also potential markets that authors deserve exclusive rights on. And that’s how USCO mentioned the “market dilution” (that people called controversial) and why that’s an important factor to consider.

You’d have to argue that people who could otherwise do a google search for the “copied” content and find it easily would rather intentionally download an obsolete LLM model and spend hours and hours trying to get a reproduction of a widely available image that you can download easily from the internet for free.

Emphasis added. You showed your true colors! So you advocate piracy after all. All your arguments about “fair use” of AI are just decoys to cover your true intention, which is encouraging people to pirate.

Enough said.

Explorer09 (profile)

June 10, 2025 at 3:23 pm

Re: Re: Re:¹⁴

Wait. This post is mine (Explorer09). Due to some mistake in the comment system I posted it as an anonymous coward by accident.

MrWilson (profile)

June 10, 2025 at 7:39 pm

Re: Re: Re:¹⁴

“China Releases New Labeling Requirements for AI-Generated Content”

I meant in the US. We’re discussing US copyright law here. You and Terop are both non-Americans trying to lecture Americans about US copyright law. You’d think you could stay on that topic.

But training AI with copyrighted data is already illegal (most of the time).

This hasn’t been determined at all.

So it’s actually about enforcing existing copyright laws, not adding new ones.

You’re inventing a new right that hasn’t existed.

And DMCA included. AI companies must not remove copyright management information (CMI) during training.

That’s not actually what the DMCA says.

Does this even matter?

Yes, you’re misunderstanding an unrelated computer science concept relating to machine intelligence while arguing about a copyright issue.

Because the human “interrogators” aren’t told they are to interrogate, and the same ethical problem occurs.

This isn’t always the case with the Turing test. Your citation literally states: “Turing never makes clear whether the interrogator in his tests is aware that one of the participants is a computer.”

(By the way, if I can tell MrWilson is a machine or human when I’m not told that I am to interrogate in this Turing Test…)

This isn’t a complete or coherent sentence.

Yes they are.

Per that article: “The Turing test, originally called the imitation game by Alan Turing in 1949,[2] is a test of a machine’s ability to exhibit intelligent behaviour equivalent to that of a human. In the test, a human evaluator judges a text transcript of a natural-language conversation between a human and a machine.”

Humans are the interrogators and evaluators. They are not the test subject. A human cannot pass or fail a Turing test. You don’t understand what you’re talking about. You even provide citations that prove you don’t understand.

Bullshit. Unless the LLMs have significantly limited on the type of outputs, end users will go to write a fantasy novel with that.

You mean the LLM will write the fantasy novel. If the user writes the fantasy novel, then it’s a non-issue.

A direct proof

That’s not proof. That’s people looking to do something they can’t already do and discussing failed strategies. And if you read the comments, they admit it’s not actually doing what you claim. They’re pointing out that it’s not capable of what you claim.

“I have been able to get chatGPT to write 6 or 7 mediocre chapters of a teen novel.”

“It doesn’t have the memory to create an entire novel.”

“Of course I had to read each chapter to make sure they were good (and retry when it was not)”

“keep it in a notepad that way you can refer to it when you need it and be able to paste parts of it or the entirety of it in AI just in case it forgets somehow the basic framework will remain in its memory, and it will utilize that to keep the alignment of the story”

“AI can’t do that yet. It doesn’t have enough memory in its head to understand relationships between text that is far apart from each other.”

When P2P has many legal, non-infringing uses, as long as (1) users primarily use it for infringement, and (2) the company making the technology benefits from illegal uses. Then the company is liable (contributory copyright infringement).

You would have to prove that most uses of LLMs are primarily for infringement for this argument to work. I look forward to seeing your proof.

It’s the ruling, so don’t blame me.

I’ll blame you for misunderstanding the ruling and how it applies to LLM training or use.

See the contributory copyright infringement liability.

Provide a citation of law or case law that explicitly states that using an LLM is an infringement of a copyright.

Warhol v. Goldsmith. Like it or not. I don’t want to argue on this anymore.

You shouldn’t argue on this anymore. You continue to be wrong. That case is irrelevant to the argument.

Bad analogy. A better one would be: An imitation artist at an auction of Hayao Miyazaki who says he can replicate someone’s portrait of photo “in the style of Miyazaki” and collects money out of those imitations.

Except my analogy doesn’t assume as you do that the intended use is copyright violation. You have a bias that assumes that’s the primary use. You are wrong. Search for all the articles about the tips people are giving each other about using ChatGPT or Claude. They’re all saying stuff like, “here’s how to be more productive by having ChatGPT write you a task list,” “use these five prompts to improve your self-care regimen,” “Seven prompts to make you more productive at work.”

And I use Miyazaki instead of Picasso because Picasso’s copyrights have long been expired, making all infringement claims invalid and “fair use” arguments moot.

To reiterate for the seemingly thousandth time—We are discussing US copyright law. In the US, most of Picasso’s works are protected under copyright until 2043, which is 70 years after his death. You could have searched for this fact easily, but you chose not to. This is why you are useless on this topic. You have a world of data available to you and you choose not to fact check your own incorrect assumptions, making all your claims invalid and your arguments moot.

No. My assumption was humans don’t know they are LLM-generated when they purchase. So no “[replacement of] interests in buying a human-authored work” nonsense.

If humans don’t know they’re purchasing LLM-generated content, then the person claiming human authorship and copyright on the content is making a false copyright claim. That would make the competition argument moot.

You are thinking the factor 4 of fair use too narrowly. That factor was made to be broad, covering not only existing market but also potential markets that authors deserve exclusive rights on. And that’s how USCO mentioned the “market dilution” (that people called it controversial) and why that’s an important factor to consider.

As the article you didn’t read notes, the US Copyright Office’s opinion is non-binding and isn’t law. Find case law that actually supports your position or else your assertions are moot. Your characterization of any aspect of US copyright law isn’t reliable.

Emphasis added. You showed your true colors! So you advocate piracy after all.

No, you dumb fuck. You are so dense. You don’t understand the reference you made.

You quoted the US Copyright Office: “When a model takes the prompt “Ann Graham Lotz”…”

That image to which they referred is released under a permissive license. It’s not piracy to find and download it.

https://commons.wikimedia.org/wiki/File:Anne_Graham_Lotz_(October_2008).jpg

“This work is free and may be used by anyone for any purpose.”

And beyond that permissive license, it’s fair use to download the photo for personal use anyway, so the license itself isn’t even the only permissiveness. Copyright law itself allows such use.

You thought you had a gotcha moment, but it ironically only proves your eagerness to make bad assumptions and your complete and utter ignorance of US copyright law.

All your arguments about “fair use” of AI are just decoys to cover your true intentions that is encouraging people to pirate.

You can’t pirate what’s actually released for free, you complete dumbass. That said, who the hell wants to “pirate” a random photo of Billy Graham’s daughter anyway?

Enough said.

I absolutely agree. You have said quite enough, and you’ve been proven wrong multiple times, yet you keep spewing more laughable bullshit.

Explorer09 (profile)

June 11, 2025 at 12:24 am

Re: Re: Re:¹⁵

Except my analogy doesn’t assume as you do that the intended use is copyright violation. You have a bias that assumes that’s the primary use.

Well, that’s what contributory copyright infringement liability is about (plus vicarious infringement)! If your damn company benefitted from the copyright infringements done by end users, why the fuck can you not be liable? Even when it’s not the primary use of the tech.

(See also: Types of secondary copyright infringements as listed by Copyright Alliance
https://copyrightalliance.org/education/copyright-law-explained/copyright-infringement/secondary-copyright-infringement/)

MGM v. Grokster. As Grokster had ruled to be liable, you can’t shield ChatGPT or whatever generative AI there from this liability.

Search for all the articles about the tips people are giving each other about using ChatGPT or Claude. They’re all saying stuff like, “here’s how to be more productive by having ChatGPT write you a task list,” “use these five prompts to improve your self-care regimen,” “Seven prompts to make you more productive at work.”

I’m ignoring this argument because it doesn’t tell how it is necessary to train the AI model with copyrighted content in the first place. This is distraction.

In the US, most of Picasso’s works are protected under copyright until 2043, which is 70 years after his death. You could have searched for this fact easily, but you chose not to.

What about countries whose copyright laws set the lifespan to be 50 years after death (Berne Convention)?
If you’re advocating for shorter lifespan of copyright, you shouldn’t bring this straw man, as you would contradict yourself.

If humans don’t know they’re purchasing LLM-generated content, then the person claiming human authorship and copyright on the content is making a false copyright claim. That would make the competition argument moot.

No it doesn’t. One of the purpose of copyright is to protect the market from fake imitations. Even when the buyer may know they are fake, because the fake painting or books or music may be sold in a much cheaper price than the genuine one. Creating a huge temptation for buyers to look away from genuine goods. And that is the market competition in the factor 4.

Imagine this: You go to a video game store and look for a Nintendo Switch 2 Pro Controller to buy, and there is a illegal clone (from an unnamed manufacturer) with same boxing and priced US$10 cheaper (say, US$75 rather than US$85). Would you not be tempted to buy that cheaper knockoff?

As the article you didn’t read notes, the US Copyright Office’s opinion is non-binding and isn’t law. Find case law that actually supports your position or else your assertions are moot.

You cannot cite case laws that support your fair use claims either, why the hell should I listen?

Or should I wait until one of the AI copyright cases gets appealed to the Supreme Court and see who ultimately wins?

You quoted the US Copyright Office: “When a model takes the prompt “Ann Graham Lotz”…”

That image to which they referred is released under a permissive license. It’s not piracy to find and download it.

And you are fuckingly arguing the wrong thing! First, permissive licenses require attribution, and Stable Diffusion didn’t attribute the source or the original photographer. Second, people will be tempted to extract copyrighted materials from the AI, like it or not, even when it takes millions of attempts to do it (it’s relatively easy to automate generations of a million images these days). Third, you are assuming everything on the internet can be downloaded “for free” when you made that argument. That’s why it’s your true colors! You advocate piracy, period.

Don’t refute me on the third point. You just slipped it when you wrote the reply. Not my fault.

MrWilson (profile)

June 11, 2025 at 1:23 am

Re: Re: Re:¹⁶

as listed by Copyright Alliance

This citation says it all. You’re getting your propaganda from Big Media corporations that make their wealth exploiting the creativity of other people.

MGM v. Grokster. As Grokster had ruled to be liable, you can’t shield ChatGPT or whatever generative AI there from this liability.

ChatGPT isn’t peer-to-peer software.

I’m ignoring this argument because it doesn’t tell how it is necessary to train the AI model with copyrighted content in the first place. This is distraction.

So you assert that there’s a difference between training and use when you want to ignore an argument you can’t refute, but you don’t think there’s a difference when you want to conflate the two for a different argument about liability. Curious.

What about countries whose copyright laws set the lifespan to be 50 years after death (Berne Convention)?

We’re talking about US copyright law. We are always talking about US copyright law. That’s what the article is about. That’s what every comment I’ve made is about. If you want to talk about the laws of other countries, you won’t be doing it with me. That’s goalpost shifting and employing non sequiturs.

If you’re advocating for shorter lifespan of copyright, you shouldn’t bring this straw man, as you would contradict yourself.

You seem confused about the difference between stating a fact about the current state of the law and stating a wish that the law could be different. That’s not a contradiction. Unlike you, I don’t think my preference is magically the law because I really want it to be true.

Also, you don’t seem to understand what a straw man is. A straw man is arguing against a claim that the other person didn’t make. You did make the argument that Picasso’s works were no longer under US copyright, which was absolutely false.

No it doesn’t. One of the purpose of copyright is to protect the market from fake imitations. Even when the buyer may know they are fake, because the fake painting or books or music may be sold in a much cheaper price than the genuine one. Creating a huge temptation for buyers to look away from genuine goods. And that is the market competition in the factor 4.

That’s trademark, not copyright. Holy shit, you’re an idiot.

Imagine this: You go to a video game store and look for a Nintendo Switch 2 Pro Controller to buy, and there is a illegal clone (from an unnamed manufacturer) with same boxing and priced US$10 cheaper (say, US$75 rather than US$85).

That’s a trademark violation, not a copyright violation.

Would you not be tempted to buy that cheaper knockoff?

No, I don’t buy consoles.

You cannot cite case laws that support your fair use claims either, why the hell should I listen?

The major cases on the topic haven’t been decided. Of course that didn’t stop you from trying to cite undecided cases. I’m not intellectually dishonest like you that way.

Or should I wait until one of the AI copyright cases gets appealed to the Supreme Court and see who ultimately wins?

Yes, you should. It won’t really matter though. As I’ve already predicted, big corporations will win, whether it’s Big Media or Big Tech. Some will get slightly wealthier. Others will get slightly poorer. But the little guy is the one who will suffer and lose out.

First, permissive licenses require attribution,

Not all permissive licenses require attribution. And not all attribution licenses can legally require attribution for all uses. If you download a CC-BY licensed song and play it on your own speakers, you don’t have to attribute it anywhere. That’s just a personal use. Regarding LLMs, if the training dataset isn’t published, which it isn’t, you can’t attribute it. The trained material isn’t in the resulting model, so there’s nothing to attribute in the result. Again, you don’t understand how the technology works.

and Stable Diffusion didn’t attribute the source or the original photographer.

Should they post the attribution internally in their offices on a bulletin board? On a post-it note next to their computer?

Second, people will be tempted to extract copyrighted materials from the AI, like it or not, even when it takes millions of attempts to do it (it’s relatively easy to automate generations of a million images these days).

First, that’s not actually possible. Even this one example was super blurry and not an adequate substitute for the real image. But also, why would they waste so much time and effort to try to use an LLM to replicate copyrighted materials when they could just find it somewhere else and in some cases, legally for free? That argument doesn’t make any sense. Again, you really don’t understand the technology or how people are actually using it. You’re just paranoid about a fever dream moral panic you’ve had.

Third, you are assuming everything on the internet can be downloaded “for free” when you made that argument.

No, I’m not assuming that at all and such an assumption wouldn’t serve my argument at all. I said that one particular picture which is the one particular example you provided referencing a single “study” that tried really hard to find that one particular picture is available for free on the internet so no one would even logically bother to try to reproduce it via an LLM (except in the scenario that you’re a researcher desperate to prove a point with deliberately manipulated results). You aren’t vetting the people or the content you’re cribbing your propaganda from, so you’re ignorant of the implications. If you think I’m referring to “everything on the internet,” then you need to prove that an LLM can actually reproduce “everything on the internet.” If it can’t, which it can’t, then your argument is pointless. Not everything can be downloaded for free, nor can everything (or even much of anything) be easily, reliably, authentically reproduced by an LLM. Again, again, you don’t understand the technology.

That’s why it’s your true colors! You advocate piracy, period.

It’s not piracy to copy a free image in a legal manner, dumb shit. That’s the whole fucking point. You are so obtuse. That you think you’re scoring points by trying to accuse me of advocating for piracy really shows your “true colors.” You’ve already accused poor people of not existing and being freeloaders (positions which contradict each other). So you hate poor people. You admit to siding with wealthy corporations. You share propaganda from Big Media companies. Honestly, getting falsely accused of advocating for piracy is far better than what you actually are—a shill for the wealthy and powerful.

Don’t refute me on the third point. You just slipped it when you wrote the reply. Not my fault.

So you know your argument is bullshit and you don’t want me to explain why because you feel silly that you didn’t even understand the example you referenced. Well, too bad. As we’ve already established, you don’t get to tell me what I can and cannot say. We’ll add your ignorance of the 1st Amendment to the list of US laws you don’t understand, which already includes US copyright law, the DMCA, trademark law, et al.

Explorer09 (profile)

June 11, 2025 at 3:49 am

Re: Re: Re:¹⁷

You’ve already accused poor people of not existing and being freeloaders (positions which contradict each other). So you hate poor people.

I disregard the “poor people” arguments (bullshits) because you cannot name any single person or organization that fits your definition. In other words the “poor people” you mentioned don’t exist. And how would it make sense to claim that I “hate” people who don’t exist at all.

I could compassionate poor people if you can name a single person or organization who is “poor”. Otherwise, all of these are bullshit to argue about.

MrWilson (profile)

June 11, 2025 at 10:28 am

Re: Re: Re:¹⁸

I disregard the “poor people” arguments (bullshits) because you cannot name any single person or organization that fits your definition.

Every single (American) person who is not wealthy is on the list. As I have already said, listing actual names would take too long. Do you know an American citizen who isn’t wealthy? They’re on the list. I’m on the list. Everyone I know is on the list since I’m not friends with any billionaires or even millionaires. Anyone with a net worth less than a few million is on the list. 88% of the US population is a part of a household whose net worth is less than a million. That’s an easy enough line to draw. 88% of the US population is on the list. What do you think naming names will accomplish that this very broad inclusive statement doesn’t?

You’re just using a bullshit excuse to ignore the bulk of the American populace. Do you think there are only millionaires and billionaires in the US or is that just the only people you care about?

Also, bullshit is a mass or uncountable noun, so it’s just bullshit, not “bullshits.” If you add bullshit to bullshit, you just get more bullshit.

In other words the “poor people” you mentioned don’t exist.

You being skeptical about the existence of poor people in the US is the weirdest made up hill to die on. That you can’t even accept that poor people exist shows how irrational your position is.

And how would it make sense to claim that I “hate” people who don’t exist at all.

You apparently hate them so much that you deny their very existence despite the fact that you see them on the internet everyday and have the proof of their existence at your fingertips. You’ve been chatting with them on this website. Are you just talking to yourself here?

I could compassionate poor people if you can name a single person or organization who is “poor”. Otherwise, all of these are bullshit to argue about.

Let’s not pretend you have any compassion. If you can’t even imagine they exist without a citation as if you’re completely unaware of the state of the world, then you’ve decided on a weird, judgmental stance that deliberately erases them. Are there no poor people in your country? Is the concept of a poor person foreign to you? How wealthy are you?

Explorer09 (profile)

June 11, 2025 at 4:13 pm

Re: Re: Re:¹⁹

listing actual names would take too long

I only required you to list one name. So “too long” is an excuse.

Anyone with a net worth less than a few million is on the list. 88% of the US population is a part of a household whose net worth is less than a million.

Why this definition? Why is a household with net worth less than a million can’t even buy one work (music, book or movie)?

Bullshit. (This definition of “poor people” is made up and not reflecting the actual economic ability to purchase copyrighted works.)

MrWilson (profile)

June 11, 2025 at 6:33 pm

Re: Re: Re:²⁰

I only required you to list one name. So “too long” is an excuse.

You keep pretending like not naming a person means that no poor people exist. You do realize that reality doesn’t change based on what a person argues, right? This is the most absurd demand. I’m not going to provide you with a list. I’ve given you an entire category of American citizens that consists of about 300 million people. Do a google search, find references to poor Americans, there’s your name. Look at any public voter roll data in the US. There’s about an 88% chance the names you find match my definition.

Why this definition?

This has already been explained to you. You lost track of what we’re even discussing. I’m advocating for people who aren’t wealthy and powerful. Anyone who isn’t a millionaire, billionaire, or stockholder in a large amount of a big corporation qualifies. It also includes the vast majority of creators you pretend to care about, including me.

Why is a household with net worth less than a million can’t even buy one work (music, book or movie)?

I didn’t say that they can’t buy media. You claimed poor people were freeloaders. I never said anything about that. They are the primary consumers of media. They pay a lot of money for their media. And you’ve been saying they don’t exist or they’re freeloaders and pirates.

And if you needed this explained to you again, then you clearly aren’t tracking the argument and your positions are irrelevant. You’re admitting you don’t know what we’re talking about.

Bullshit. (This definition of “poor people” is made up and not reflecting the actual economic ability to purchase copyrighted works.)

Calling them poor was never about their ability to purchase copyrighted works. You made the claim that they didn’t. I never said that at all.

I am literally talking about every American who isn’t a wealthy corporation.

In other words, the right to freeload.

Explorer09 (profile)

June 12, 2025 at 3:37 am

Re: Re: Re:²¹

You keep pretending like not naming a person means that no poor people exist. You do realize that reality doesn’t change based on what a person argues, right? This is the most absurd demand. I’m not going to provide you with a list. I’ve given you an entire category of American citizens that consists of about 300 million people. Do a google search, find references to poor Americans, there’s your name. Look at any public voter roll data in the US. There’s about an 88% chance the names you find match my definition.

I don’t care about your “reality”. I would disregard it as if the judge dismisses a claim due to lack of evidence. Even though this is a casual debate and not a court battle. I still expect the rule of whoever brings up a claim must supply it with evidence.

So, dismissed.

MrWilson (profile)

June 12, 2025 at 10:10 am

Re: Re: Re:²²

I’m not referring to my “reality.” I’m referring to the actual reality we both exist in.

Are you denying the existence of US citizens who are worth less than a million dollars? To deny the existence of the categorization of people I’ve provided, you’re tacitly saying you think every US citizen is at least a millionaire. That isn’t an intellectually honest position to take. You’re saying homeless people don’t exist.

If you want to pretend this is a debate, you can’t take absurd positions and expect any third party to take you seriously. You’re just using this as an excuse to ignore the fact that you completely misunderstood my argument and you’ve been fighting in favor of wealthy corporations against the very people you pretend you’re championing.

I still expect the rule of whoever brings up a claim must supply it with evidence.

I provided better than names. I’ve cited the majority of US citizens. You’re pretending your demand for names supersedes the categorization I’ve provided. Providing one name isn’t greater than providing 300 million people.

Anonymous Coward

June 12, 2025 at 1:35 pm

Re: Re: Re:²³

Are you denying the existence of US citizens who are worth less than a million dollars? […] You’re saying homeless people don’t exist.

I don’t deny the existence of “US citizens who are worth less than a million dollars”. I deny you who claim to be representing them.

This is like a “class action” where you define your class improperly. I challenge your ability to represent the class you defined.

Explorer09 (profile)

June 12, 2025 at 2:25 pm

Re: Re: Re:²⁴

(Oops. Another accidental comment as signed out where I didn’t mean that.)

MrWilson (profile)

June 12, 2025 at 2:36 pm

Re: Re: Re:²⁴

I deny you who claim to be representing them.

I didn’t claim to represent them. You’ve lost track of the discussion again.

I had stated: “It was to say that you are attacking big corporations but hurting innocent independent non-profits, researchers, students, and poor individuals. That’s the whole point here.”

To which you responded: “Please name an “innocent independent non-profit, researcher, student, or poor individual” you are talking about. Or is it just me that I sense no one but some “bad students” who just want to freeload and use ChatGPT to complete their homework, ignoring academic ethics?”

This is like a “class action” where you define your class improperly. I challenge your ability to represent the class you defined.

I don’t represent them. I’m challenging your assertions as a non-US citizen to lie about US law to deny those people their rights. You’ve created them as a class of people by saying they should lose out and wealthy corporations should benefit from their loss of rights.

Again, your inability to follow the discussion points is concerning on top of your proven ignorance of the law and the technology. You don’t even remember what you’ve said in this thread on top of arguing often with straw men I never uttered.

Explorer09 (profile)

June 12, 2025 at 2:43 pm

Re: Re: Re:²⁵

I didn’t claim to represent them. You’ve lost track of the discussion again.

I had stated: “It was to say that you are attacking big corporations but hurting innocent independent non-profits, researchers, students, and poor individuals. That’s the whole point here.”

To which you responded: “Please name an “innocent independent non-profit, researcher, student, or poor individual” you are talking about.

Yes! Name someone who is hurt by me. Because it’s you that suggest that I hurt someone. You didn’t name any person (but rather a class that’s improperly defined), so I just disregard this claim. (Or should I say “dismiss”?) Anyway this claim is nothing to talk because you didn’t bring evidence.

MrWilson (profile)

June 12, 2025 at 3:23 pm

Re: Re: Re:²⁶

Bob Smith. There is at least one person in the US named Bob Smith who isn’t worth more than a million dollars. There you go. I named someone.

I look forward to seeing what new bullshit you will spin up now that we’ve gotten over the absurd barrier of your imagination to consider the existence of people without their names being mentioned.

Explorer09 (profile)

June 12, 2025 at 4:05 pm

Re: Re: Re:²⁷

Bob Smith. There is at least one person in the US named Bob Smith who isn’t worth more than a million dollars. There you go. I named someone.

Which one? https://en.wikipedia.org/wiki/Robert_Smith_(disambiguation)

MrWilson (profile)

June 12, 2025 at 6:08 pm

Re: Re: Re:²⁸

Wikipedia is more likely to feature famous, wealthy, or deceased people. You should search a white pages listing instead. But also, pinpointing a single person seems really weird since the category includes 300 million other people. Are you planning to harass Bob or something?

Explorer09 (profile)

June 13, 2025 at 4:16 am

Re: Re: Re:²⁹

Wikipedia is more likely to feature famous, wealthy, or deceased people. You should search a white pages listing instead. But also, pinpointing a single person seems really weird since the category includes 300 million other people. Are you planning to harass Bob or something?

I want you to bring that Bob to show up on this website as a witness. So you can’t just frame a random Bob and make a claim that you represent him (or them).

MrWilson (profile)

June 13, 2025 at 11:17 am

Re: Re: Re:³⁰

Again, I didn’t claim to represent them. You have claimed that. So you’re arguing with a straw man. And in fact, you’ve claimed to represent my interests since you said you were arguing on behalf of rightsholders and I’m a rightsholder. And I told you that you didn’t represent me. So not only is this straw man a confession of your own hypocrisy and stupidity, it’s just pointless, like most of your attempts at arguing.

This entire obstinacy in refusing to acknowledge the existence of poor people was just you waiting to assert that I didn’t represent people I didn’t purport to represent because you again didn’t understand what was being said. You create these weird barriers to your own understanding and spin out so much you don’t even remember what you’ve said, much less what I’ve said.

Explorer09 (profile)

June 10, 2025 at 1:29 pm

Re: Re: Re:¹³

You’d have to argue that people who could otherwise do a google search for the “copied” content and find it easily would rather intentionally download an obsolete LLM model and spend hours and hours trying to get a reproduction of a widely available image that you can download easily from the internet for free.

Emphasis added. You showed your true colors! So you advocate piracy after all. All of the “fair use” claims about AI from you are just decoys to conceal this true intention of you!

MrWilson (profile)

June 10, 2025 at 7:41 pm

Re: Re: Re:¹⁴

I love that you think this is an example of piracy and that pointing out that browsers have a download button in a menu is supposedly advocating for piracy. It says so much about you and nothing about me.

terop (profile)

June 7, 2025 at 9:18 pm

Re: Re: Re:⁵

“Implicating the right of reproduction” ≠ “AI practices are illegal”

Technically there is another element required, before you can conclude that AI practices are illegal, but it’s just the normal:
1) AI companies forgot to obtain licenses for the material they used for the training.

MrWilson (profile)

May 30, 2025 at 12:10 am

Re: Re: Re:²

First, not sure why you responded to me. I was talking to Explorer09.

Second, we still don’t care about your failed software here.

terop (profile)

May 30, 2025 at 11:48 am

Re: Re: Re:³

Second, we still don’t care about your failed software here.

That doesn’t mean that my refusal to implement AI does not have any effect on overall AI usage in western societies.

You still haven’t understood how important my “failed software” is to the overall software ecosystem. It’s a Symbian phone area edge module, and as such plays a pivotal role in protecting software vendors from entering an area they cannot handle without the help of successful phone companies.

It is a well-known fact that Symbian failed because developing software in the area was too difficult for open source developers. Thus only a small number of 3rd party applications were developed for the phone platform, and that has always been one element attributed to the failure.

Now that we built the edge modules designed to protect software vendors from entering the area which they cannot handle, your solution is to call it “failed software” and reject its purpose as an edge module.

We can easily call bullshit on your practices. There’s no reason to lower gameapi’s status from symbian edge module to failed software like you tried to do.

MrWilson (profile)

May 30, 2025 at 1:10 pm

Re: Re: Re:⁴

Me: “We don’t care about X…”

You: “Let me tell you more about X…”

your solution is to call it “failed software” and reject its purpose as an edge module.

I call it failed software because if it was successful you wouldn’t be spamming unrelated posts about it. I don’t care about the technical details or the philosophy you have about it or whatever narrative you’ve told yourself for why it’s important.

We can easily call bullshit on your practices. There’s no reason to lower gameapi’s status from symbian edge module to failed software like you tried to do.

The fuck are you talking about? You’re hallucinating worse than an LLM trained solely on crypto bro 4chan posts.

terop (profile)

May 30, 2025 at 9:31 pm

Re: Re: Re:⁵

Me: “We don’t care about X…”

I don’t care about the technical details or the philosophy

Common theme in your bullshit seems to be that you simply do not care.

I think that’s a significant problem. You really need to fix that problem. Sloppy practices where you simply do not care enough to arrive to work at correct time or care enough to actually create software that works properly… lazy and sloppy practices are consequences of not caring…

MrWilson (profile)

May 31, 2025 at 12:03 am

Re: Re: Re:⁶

Common theme in your bullshit seems to be that you simply do not care.

Yes! You’re finally catching on to the words that I am saying!

Sloppy practices where you simply do not care enough to arrive to work at correct time or care enough to actually create software that works properly…

Hey, apparently I’m a software developer. Alright. Where’s the paycheck for that gig? Can I start putting that on my resume?

terop (profile)

June 8, 2025 at 7:58 pm

Re: Re: Re:⁷

Hey, apparently I’m a software developer. Alright.

This kinda explains why you’re against copyright. You have not created any copyrighted works worthy of copyright protection. Let me know if that’s wrong and I’ll upgrade your status from ignorant to something better.

Explorer09 (profile)

June 8, 2025 at 11:23 pm

Re: Re: Re:⁸

Hey, apparently I’m a software developer. Alright.

This kinda explains why you’re against copyright. You have not created any copyrighted works worthy of copyright protection. Let me know if that’s wrong and I’ll upgrade your status from ignorant to something better.

Being a software developer is not a sufficient reason to be against copyright. In fact there are developers who are for copyright, too. I’m not saying this for myself, but for the plaintiffs in the Doe v. GitHub case.

In fact it was the open source developers who observed the copyright problems with generative AI (in particular GitHub Copilot, the AI code generator) and brought the lawsuit first! The key to keeping open source communities thriving is copyright, which makes the “copyleft” clauses in open source licenses work and protects the communities from corporate exploitation (e.g. incorporating open source into proprietary products without contributing back). @MrWilson seemed to have no idea about this.

The open source developers could have released everything to public domain, but they didn’t do it. Think of why.

MrWilson (profile)

June 9, 2025 at 12:30 pm

Re: Re: Re:⁸

This kinda explains why you’re against copyright.

First, I’m not against copyright. I’m against copyright maximalism and greed and putting power and money into the hands of those who are already wealthy and powerful and who subvert or outright suppress human and civil rights. Copyright maximalists have used their vast wealth to undermine my democracy and weaken the influence of citizens in their own government.

If I had my way, copyright would only be owned by humans, not by corporations. It would only be licensable, not transferrable, so corporations wouldn’t be able to retain them after a creator dies. I’d also not have copyright last so long, definitely not after the death of the creator. Temporary copyright protection was meant to incentivize new works being created and dead people can’t create new works. Longer copyright made more sense when it took longer to distribute copyrighted works in a market.

You have not created any copyrighted works worthy of copyright protection.

Copyright isn’t granted on some subjective measure of worthiness. It’s granted on the nature of the work and the human authorship. This kind of statement proves, again, that you don’t understand US copyright law.

That said, I have created many works that are copyrighted. I’ve sold copyrighted works. I’ve released some copyrighted works under various licenses, including Creative Commons licenses. I’ve had my copyrights violated by large corporations and individual random people. I’ve had people sell my creations and make money off them. I’ve issued DMCA takedown notices for such violations. I’ve had my copyrighted works used in media and products.

You might even consider me a software developer, not in the proper sense of being a coder, but according to the copyright office, some of the works I develop are classified as software.

Let me know if that’s wrong and I’ll upgrade your status from ignorant to something better.

The estimation of a fool isn’t something I’m going to care about.

Explorer09 (profile)

June 9, 2025 at 12:46 pm

Re: Re: Re:⁹

First, I’m not against copyright. I’m against copyright maximalism and greed and putting power and money into the hands of those who are already wealthy and powerful and who subvert or outright suppress human and civil rights. Copyright maximalists have used their vast wealth to undermine my democracy and weaken the influence of citizens in their own government.

If I had my way, copyright would only be owned by humans, not by corporations. It would only be licensable, not transferrable, so corporations wouldn’t be able to retain them after a creator dies. I’d also not have copyright last so long, definitely not after the death of the creator. Temporary copyright protection was meant to incentivize new works being created and dead people can’t create new works. Longer copyright made more sense when it took longer to distribute copyrighted works in a market.

And this is the ignorant part of you. When you argued about copyright and AI, you don’t really care about the creators whose works had been “stolen” (or scraped) by AI even less than a month after the works have been published. You don’t give a sh-t about “temporary copyright protection” when you mentioned it. Even if we have copyright law that is non transferrable, and have shorter lifespan (say, 10 to 20 years), you would still try to greenlight AIs that steal works faster than that.

In other words, we know you are lying.

MrWilson (profile)

June 9, 2025 at 1:31 pm

Re: Re: Re:¹⁰

Oh hey, you’re responding to me again. I feel honored! /s

And this is the ignorant part of you. When you argued about copyright and AI, you don’t really care about the creators whose works had been “stolen” (or scraped) by AI even less than a month after the works have been published.

You don’t get to tell me what I care about. I am a creator whose work has been used to train AI without my consent. You’ve been told this. That you pretend otherwise shows your ignorance. You just can’t fathom that someone might be in a similar situation and yet have a different perspective. The limitations of your understanding are your problem, not a deficiency on my part.

You don’t give a sh-t about “temporary copyright protection” when you mentioned it.

I wouldn’t have said it if I didn’t care about it. You seem to be missing the dominant theme in my interests in copyright as a topic or my perspective on LLM training.

Even if we have copyright law that is non transferrable, and have shorter lifespan (say, 10 to 20 years), you would still try to greenlight AIs that steal works faster than that.

Except if copyright duration were much shorter (you said 10 – 20, I didn’t), there would likely be less accommodation or need for fair use because works would come into the public domain so much faster. But this is all speculation. Getting all self-righteous about a hypothetical situation that isn’t likely to happen at all is very melodramatic. Clutch those hypothetical pearls!

In other words, we know you are lying.

You can’t know I’m lying or not if you don’t understand what I’m saying, which you’ve demonstrated multiple times, including this time. Also, who is we? You can only speak for yourself, and incoherently at that.

Explorer09 (profile)

June 9, 2025 at 4:18 pm

Re: Re: Re:¹¹

I am a creator whose work has been used to train AI without my consent.

I’m not replying to you. I’m making the post to let other people see your self-contradictory beliefs. Either you are ignorant or you have fooled yourself. Pick one.

As this quote tells everyone you don’t give a sh-t about the issue. Saying that your work had been used to train AI without consent and you don’t give a f-ck about your rights. And you just try to stop other people (creators) from exercising their rights just because you don’t give a sh-t about them. How arrogant, and stupid as well.

You just can’t fathom that someone might be in a similar situation and yet have a different perspective

There is no “different perspective”. You are just ignoring that it’s their right to not give consent on AI training. You are trying to enforce your thoughts on copyright to others. Who do you think you are when it comes to lawmaking? Copyright law exists, and IT’S THEIR RIGHTS, AND YOU SHOULD SHUT UP.

Except if copyright duration were much shorter (you said 10 – 20, I didn’t), there would likely be less accommodation or need for fair use because works would come into the public domain so much faster.

Copyright abolitionist. I get it. SO F-CK OFF AND GET OUT.

MrWilson (profile)

June 9, 2025 at 8:12 pm

Re: Re: Re:¹²

I’m not replying to you. I’m making the post to let other people see your self-contradictory beliefs. Either you are ignorant or you have fooled yourself. Pick one.

Or I am a rightsholder and proved your claims to represent rightsholders is clearly incorrect. You have fooled yourself into believing that all authors, artists, designers, etc. will side with your preferred wealthy, abusive corporations who you admit are not good guys. You don’t have to pick one. You are ignorant and you have fooled yourself. And you’ve openly admitted to compromising to side with corporations over everyone else, to whom you have referred as freeloaders, even if you’ve not proven that they have ever committed copyright violations.

As this quote tells everyone you don’t give a sh-t about the issue.

I’ve responded at length to a lot of your bullshit for someone who supposedly doesn’t give a shit about the issue. I can’t imagine another explanation for my motives.

Saying that your work had been used to train AI without consent and you don’t give a f-ck about your rights.

I do care about my rights. They’ve been abused by the corporations you’re siding with. I’ve said this before. You’ve ignored it and pretended it doesn’t matter, as if maybe you don’t actually care about creators or their rights.

And you just try to stop other people (creators) from exercising their rights just because you don’t give a sh-t about them. How arrogant, and stupid as well.

I’m not trying to stop anyone from doing anything. I offered a different perspective. You threw a bunch of assumptions on me and never understood what I was saying. You’re still making up new claims I haven’t stated or supported. You have a chronic issue with not reading what other people actually write.

There is no “different perspective”.

Yes, you’ve just proven that “You just can’t fathom that someone might be in a similar situation and yet have a different perspective.”

You are just ignoring that it’s their right to not give consent on AI training.

That hasn’t been determined as a right in any court yet. That’s what this argument is partially about.

You are trying to enforce your thoughts on copyright to others.

Last I checked, I don’t have the ability to control other people’s minds. I’m not trying to enforce anything. But that said, I am advocating, as I have already said, for the rights of poor people and researchers and students and others. You are trying to argue in favor of your copyright maximalism over their rights.

Who do you think you are when it comes to lawmaking?

An American citizen with the right to vote for legislators who pass laws and officials who appoint judges who judge the laws. An American citizen with a right of redress from the US government. An American citizen whose rights (both copyrights and constitutional rights) have been violated by the corporations you have admitted to champion in this argument.

You are not an American as far as I can tell (and you’ve never denied my determination that you are not) and are trying to tell Americans how their own laws should and do work. You can opine all you want. I don’t mind you spewing your ignorance because I know I can prove you wrong, as I have.

Copyright law exists, and IT’S THEIR RIGHTS,

No, it’s my rights too. It’s every American citizen’s rights that I’m advocating for. You’re apparently just advocating for copyright holders. But US law applies to us all. Legal precedents apply to us all (something I’ve had to explain to you multiple times).

AND YOU SHOULD SHUT UP.

The first amendment is also a law. Apparently the only US law you care about is copyright.

Copyright abolitionist.

I didn’t say that at all. I literally provided a system by which copyright would still exist. But you ignore anything I say that doesn’t fit into your chosen straw man narrative.

I get it.

No, you don’t, and that’s what’s so amusing. You seem to go to great lengths to not get it because, I’m guessing, your livelihood demands it.

SO F-CK OFF AND GET OUT.

You decided to come here to preach about your misguided views on copyright. If Mike wants me to leave, that’s his choice because it’s his site. That you’re pretending like you get to be the doorman for this website you don’t own or work for is also amusing and pathetic because you would be asserting control over Mike’s rights. Your assertion of caring about the rights of others is clearly bullshit.

Explorer09 (profile)

June 9, 2025 at 11:16 pm

Re: Re: Re:¹³

You are just ignoring that it’s their right to not give consent on AI training.

That hasn’t been determined as a right in any court yet. That’s what this argument is partially about.

The right to reproduction and the right to derivative works. They are rights. All the arguments about AI training does not reproduce copies are bullshit, and USCO refuted on that one, too.

You are trying to argue in favor of your copyright maximalism over their rights.

My position: (1) training data must be licensed, and (2) no copyright for AI generated works.

If I would support what you called “copyright maximalism”, I would let the big media companies copyright all AI generated things whatever they can. Why not? Why the copyright should be denied for AI generated works? This question is left for you to think. (Hint: Search for “Ban Corporate AI Profiteering” which was a campaign in 2023)

MrWilson (profile)

June 10, 2025 at 12:27 am

Re: Re: Re:¹⁴

The right to reproduction

The models don’t reproduce the training data.

and the right to derivative works.

The results aren’t derivative. They can’t be. If you put every famous physical artwork in a giant blender and made a collage out of random, pulped, minute little pieces (which isn’t even what an LLM does, but I’m being generous here with the metaphor), you couldn’t legally argue that the resulting collage was a derivative work of every single artwork that was pulped to make it.

Again, this demonstrates that you don’t understand how the technology works.

The LLM learned from several different texts that humans often say certain phrases. When it reproduces those phrases, it’s not copying any particular text but rather demonstrating what it learned about how humans frequently write text. If it randomly started quoting whole paragraphs from a text without being prompted to directly reference the source text, you might have a point, but that isn’t how they work.

They are rights.

They are indeed rights, just not rights that are legally assertable in this scenario.

All the arguments about AI training does not reproduce copies are bullshit, and USCO refuted on that one, too.

Except it didn’t. You should probably read the article since it refutes this claim. The fact that you’re saying this proves, again, that you don’t actually read what you’re claiming to disagree with.

My position: (1) training data must be licensed, and

Except you have no case law that backs up this position, just wishful thinking based on your interest in maximizing profits for copyright holders, primarily the large media corporations you have admitted to siding with.

(2) no copyright for AI generated works.

This is already law.

If I would support what you called “copyright maximalism”, I would let the big media companies copyright all AI generated things whatever they can.

Not necessarily. Big media might actually like AI generated content to not be copyrighted because they wouldn’t have to license it to use it in their own copyrighted works. AI generated art not being able to be copyrighted means it can be integrated and everyone else would have to only extract the AI generated parts but couldn’t legally use the copyrighted parts that aren’t AI generated, which can be difficult in certain forms of media.

You seem to have a lot of assumptions that you haven’t thought through much.

Why not? Why the copyright should be denied for AI generated works? This question is left for you to think. (Hint: Search for “Ban Corporate AI Profiteering” which was a campaign in 2023)

Why did you start arguing for a point that wasn’t being argued? You just brought this up out of nowhere. Where had you or I discussed AI content being granted copyrighted status before? This is a literal non sequitur. It doesn’t follow anything we’ve discussed up until now. It is further evidence that you are arguing with straw men.

Explorer09 (profile)

June 10, 2025 at 1:02 am

Re: Re: Re:¹⁵

The results aren’t derivative. They can’t be. If you put every famous physical artwork in a giant blender and made a collage out of random, pulped, minute little pieces (which isn’t even what an LLM does, but I’m being generous here with the metaphor), you couldn’t legally argue that the resulting collage was a derivative work of every single artwork that was pulped to make it.

Unfortunately yes. It IS derivative. It is the derivative of “every single work pulped to make it” as you say. At least that’s what the copyright holders alleged.

You really don’t have the idea of how “derivative works” in copyright law works (no pun intended). That’s why DJs mashing many popular songs together would get legal trouble if they don’t seek bulk licenses first.

And this derivative work rule is also critical of how open source licenses like GPL can enforce their “copyleft”. Because large software projects like Linux Kernel is technically a large “collage” of many developers’ contributions, each being small commits until everything is merged together.

Big media might actually like AI generated content to not be copyrighted because they wouldn’t have to license it to use it in their own copyrighted works.

No no no. The Big Media can get away with AI copyright pretty easily by making their own AI models with their own IPs. They just cannot exploit other companies’ IPs with their AIs. And the motives for Big Media to use AI is not to “steal” (they hold big IPs already and don’t need other people’s IPs to profit). It’s to cut labor costs by firing minor creative workers in the process.

AI generated art not being able to be copyrighted means it can be integrated and everyone else would have to only extract the AI generated parts but couldn’t legally use the copyrighted parts that aren’t AI generated, which can be difficult in certain forms of media.

I don’t see any problem with this.

Why not? Why the copyright should be denied for AI generated works?

Why did you start arguing for a point that wasn’t being argued?

Because you are assuming I am a “copyright maximalist” and I am saying I’m not, and this is one of the reasons.

You are making a wrong assumption that Big Media would want AI generated works to be copyright denied. The reality is actually different, as I mentioned above.

MrWilson (profile)

June 10, 2025 at 1:36 am

Re: Re: Re:¹⁶

Unfortunately yes. It IS derivative. It is the derivative of “every single work pulped to make it” as you say. At least that’s what the copyright holders alleged.

That’s not what the law or a court would determine though. Copyright holders aren’t legislators or judges, fortunately, despite their past successes in influencing laws for their benefit over the rights of citizens.

You really don’t have the idea of how “derivative works” in copyright law works (no pun intended).

I do actually. That’s why I made that hypothetical example. If you can’t even tell what works the parts are from, it can’t be derivative because you must have identifiable copyrightable elements to have a derivative work.

Another example because you’re being obtuse:

If you bought a bunch of books and cut out singular words from each book and pasted them together into a completely different work that said completely different things than the books that the words came from, it would not be derivative because copyright doesn’t protect individual words but rather the expression that consists of more than just singular linguistic parts. If that resulting work qualified as a derivative work, then all written works would be derivative because people learn to write from reading other people’s work. We learn to speak by repeating what other people say. We repeat the phrases that our parents spoke when we were learning to speak.

You don’t understand derivative works or US copyright law in general.

That’s why DJs mashing many popular songs together would get legal trouble if they don’t seek bulk licenses first.

Not all mashups are considered derivative works nor do they all require licensing. There’s nuance that you’re ignoring here.

And this derivative work rule is also critical of how open source licenses like GPL can enforce their “copyleft”. Because large software projects like Linux Kernel is technically a large “collage” of many developers’ contributions, each being small commits until everything is merged together.

Open source software is typically licensed such that derivative works are authorized with specific stipulations, such as the resulting work being similarly licensed. This doesn’t refute anything I’ve said or support any point you’re trying to make.

No no no. The Big Media can get away with AI copyright pretty easily by making their own AI models with their own IPs. They just cannot exploit other companies’ IPs with their AIs.

You’re not refuting what I said. You’re just reasserting your own preference.

And the motives for Big Media to use AI is not to “steal” (they hold big IPs already and don’t need other people’s IPs to profit). It’s to cut labor costs by firing minor creative workers in the process.

These two sentences contradict each other. Big Media companies exist by exploiting the creative works of others. They do need other people’s IPs to profit, specifically the IPs they get assigned to them from the actual creators.

I don’t see any problem with this.

But you also don’t see why this would allow Big Media to benefit more than others.

Why did you start arguing for a point that wasn’t being argued?

Because you are assuming I am a “copyright maximalist” and I am saying I’m not, and this is one of the reasons.

I’m not assuming. You’ve claimed poor people who haven’t been proven to violate any copyright are freeloaders. You’ve actively advocated for copyright maximalist corporations to be able to exploit poor people. Whether you want to call it that because it sounds “bad” is your issue. You are a copyright maximalist based on the position you’ve taken and the side you’ve chosen. That’s like saying you’re not a fascist but you’ve sided with Nazis. What you call yourself doesn’t matter. Your choices determine what you are.

You are making a wrong assumption that Big Media would want AI generated works to be copyright denied. The reality is actually different, as I mentioned above.

You’re just speculating randomly based on your own ignorance. That’s not reality. That’s your fantasy.

Explorer09 (profile)

June 10, 2025 at 3:35 am

Re: Re: Re:¹⁷

I do actually. That’s why I made that hypothetical example. If you can’t even tell what works the parts are from, it can’t be derivative because you must have identifiable copyrightable elements to have a derivative work.

That’s the difficulty of proof on the plaintiffs, but that doesn’t mean it’s not infringing. And that’s why in the UK the House of Lords is now making a bill that forces transparency on AI companies to disclose all copyrighted data during training.

And I would no longer reply with bullshits about AI is not infringement unless the copyright holders can proof it.

If you bought a bunch of books and cut out singular words from each book and pasted them together into a completely different work that said completely different things than the books that the words came from, it would not be derivative because copyright doesn’t protect individual words but rather the expression that consists of more than just singular linguistic parts. If that resulting work qualified as a derivative work, then all written works would be derivative because people learn to write from reading other people’s work. We learn to speak by repeating what other people say. We repeat the phrases that our parents spoke when we were learning to speak.

Bullshit. The words are not copyrightable but the specific arrangements of words form creative expressions that are copyrightable! And when the AI “learns” from those specific arrangements it copies the protected expression.

USCO report, pp. 47-48:

In providing this analysis, the Office rejects two common arguments about the transformative nature of AI training. As noted above, some argue that the use of copyrighted works to train AI models is inherently transformative because it is not for expressive purposes. We view this argument as mistaken. Language models are trained on examples that are hundreds of thousands of tokens in length, absorbing not just the meaning and parts of speech of words, but how they are selected and arranged at the sentence, paragraph, and document level – the essence of linguistic expression. Image models are trained on curated datasets of aesthetic images because those images lead to aesthetic outputs. Where the resulting model is used to generate expressive content, or potentially reproduce copyrighted expression, the training use cannot be fairly characterized as “non-expressive.”

Refute this one please.

MrWilson

Big Media companies exist by exploiting the creative works of others.

I disagree. That’s your claim, not mine. Even when Big Media companies do exploit, it’s irrelevant to the AI companies who are alleged to “steal” works.

They do need other people’s IPs to profit, specifically the IPs they get assigned to them from the actual creators.

Technically the IPs produced by their own employees, yes. But not “people from other companies”. Trying to argue the definition of “other people” wasn’t helpful.

You’ve actively advocated for copyright maximalist corporations to be able to exploit poor people.

F-ck you as I’ve said there is no “poor people”! This argument is moot and useless.

MrWilson (profile)

June 10, 2025 at 11:15 am

Re: Re: Re:¹⁸

That’s the difficulty of proof on the plaintiffs, but that doesn’t mean it’s not infringing.

You ignored what I said. If you can’t tell what the content is from, the plaintiffs wouldn’t even think to sue. You’re assuming omniscience on the part of the copyright holders.

And that’s why in the UK the House of Lords is now making a bill that forces transparency on AI companies to disclose all copyrighted data during training.

We’re strictly discussing US copyright law here. We shrugged off British law a few hundred years ago. Do you need a citation for that?

And I would no longer reply with bullshits about AI is not infringement unless the copyright holders can proof it.

You should no longer reply with “bullshits” either way.

Bullshit. The words are not copyrightable but the specific arrangements of words form creative expressions that are copyrightable!

Yes, exactly. And I said in the hypothetical example that only singular words were used. So you agree that my example wouldn’t be copyright infringement or a derivative work.

And when the AI “learns” from those specific arrangements it copies the protected expression.

Except it doesn’t. It learns how to compose sentences based on weights and tokens, not the actual text itself. You are, again, demonstrating you don’t understand how the technology works.

In providing this analysis, the Office rejects two common arguments about the transformative nature of AI training. As noted above, some argue that the use of copyrighted works to train AI models is inherently transformative because it is not for expressive purposes.

Note that this wasn’t my argument that the Copyright Office is arguing against, so citing it as a refutation of what I said is irrelevant. I didn’t claim it was or wasn’t expressive. This is a non sequitur.

We view this argument as mistaken. Language models are trained on examples that are hundreds of thousands of tokens in length, absorbing not just the meaning and parts of speech of words, but how they are selected and arranged at the sentence, paragraph, and document level – the essence of linguistic expression.

Note here that their argument is couched in the claim that the result is expressive, therefore the training must be expressive, but those are two different processes. Reading a book and writing a book are two significantly different processes.

Refute this one please.

I did, and with gusto!

I disagree. That’s your claim, not mine. Even when Big Media companies do exploit, it’s irrelevant to the AI companies who are alleged to “steal” works.

It’s not irrelevant to the people who are exploited. Since you acknowledge that you don’t care about people being exploited by big corporations, your claimed motive to protect authors is bullshit. They are the people getting exploited. That you can compartmentalize your morality this way reveals your hypocrisy.

Technically the IPs produced by their own employees, yes.

No, fuck no. You’re a fucking idiot. You are admitting here that you don’t understand US copyright law or the nature of publishing. Authors aren’t typically employees of a publisher. If they are, it’s usually in a different job with a different role. Authors are typically freelancers who may, but not always, assign their copyrights to the publisher in a contract negotiation for publishing. An employee in US law is a specific classification whose work for an employer would usually fall under “work for hire,” but that is not always the nature of publishing. That you are ignorant of this nuance further indicates how useless all your arguments are. You don’t know what you’re talking about. You’re trying to tell Americans how their laws should or do work. And you have the gall to tell me to shut up and get out?

But not “people from other companies”. Trying to argue the definition of “other people” wasn’t helpful.

Apparently it wasn’t helpful because you’re an idiot. There’s a very important distinction there.

F-ck you as I’ve said there is no “poor people”! This argument is moot and useless.

It’s not moot to point out that you are pretending the bulk of the US population doesn’t exist. It’s absurd that you can just declare this like it’s true. If it’s true, I don’t exist. Who are you even talking to?

terop (profile)

June 9, 2025 at 4:03 pm

Re: Re: Re:⁹

I’m against copyright maximalism and greed and putting power and money into the hands of those who are already wealthy and powerful

Yeah, but when you hear the real money amount that authors are getting, your response is to claim that the product wasn’t good enough. The time was already spent creating the product and you only have two responses: either we’re greedy or our products are not good enough for the marketplace.

Basically you can’t have it both ways. Either authors are getting the money that they deserve or they don’t get it, but you can’t go both ways.

MrWilson (profile)

June 10, 2025 at 12:36 am

Re: Re: Re:¹⁰

Yeah, but when you hear the real money amount that authors are getting, your response is to claim that the product wasn’t good enough.

No? No, it’s not. Why are you trying to tell me what I think when I’ve uttered no such statement? You know you don’t need to post here if you’re just going to make up fake positions to fight with. You can do that on your own offline.

The time was already spent creating the product

Copyright doesn’t hang on how much time is spent creating content.

and you only have two responses: either we’re greedy or our products are not good enough for the marketplace.

No, this is a false dilemma. There are other responses. Not all creators are greedy. And some high quality creations don’t become successful for reasons other than their quality. Sometimes there’s a timing issue or a lack of marketing that comes into play. Sometimes a similar piece of content gets released slightly earlier that spoils the market for a particular work. That you know so little about the various factors that can affect the success of a work indicates you might not be the best person to opine about this topic, especially not so confidently.

Basically you can’t have it both ways. Either authors are getting the money that they deserve or they don’t get it, but you can’t go both ways.

Deserving is a moral argument, not a legal one. You’re conflating different scales here.

Nobody “deserves” anything. You create a work and offer it for sale or license it some other way and hope to benefit in various possible ways from its release and publication, not all of which are monetary in nature. And some creators get ripped off and some become successful and still get ripped off. And many large media corporations make a lot of money based on the efforts of creators.

You continue to demonstrate your lack of understanding of the topic at hand.

terop (profile)

June 10, 2025 at 1:09 am

Re: Re: Re:¹¹

the time was already spent creating the product

Copyright doesn’t hang on how much time is spent creating content.

It is a very common pattern for companies to pay people based on the time spent, rather than on how much copyrighted work was created. So a very common form of compensation for copyrighted works is based on time.

Copyright is needed for a different purpose. It protects the company from ripoffs. While the software is being developed, there is a danger of unintended distribution of the source code, which allows the whole world to take the software and run with it without passing compensation to the authors. The company still needs to pay based on time spent, but the copyright’s value is zero once the product has been ripped off by pirates.

How is the company getting money to pay for the time?

MrWilson (profile)

June 10, 2025 at 11:17 am

Re: Re: Re:¹²

We’ve had this discussion before. You invented something that doesn’t exist in US copyright law. The length of time spent on a work doesn’t affect whether it can be protected by copyright. You’re conflating issues relating to paying employees hourly wages with copyright protections. They are different parts of US law. They are unrelated.

terop (profile)

June 10, 2025 at 1:13 pm

Re: Re: Re:¹³

You’re conflating issues relating to paying employees hourly wages with copyright protections.

Yes. Copyright is very much to do with two different aspects:
1) compensation
2) control of where and how the product is used

MrWilson (profile)

June 10, 2025 at 7:43 pm

Re: Re: Re:¹⁴

Copyright has nothing to do with compensation directly. Compensation is possible through what rights copyright law provides, but some copyrighted content is released for free under permissive licenses that don’t involve compensation at all. There’s nothing in copyright law that speaks to hourly wages or compensating creators at particular rates. That you think it does is just more proof of your ignorance. You didn’t even know the public domain existed until I pointed it out.

terop (profile)

June 10, 2025 at 10:01 pm

Re: Re: Re:¹⁵

You didn’t even know the public domain existed until I pointed it out.

You have a time machine to the 1980s? That’s when I learned about public domain/free software/open source/shareware/freeware/proprietary/commercial etc licenses..

MrWilson (profile)

June 11, 2025 at 10:37 am

Re: Re: Re:¹⁶

You have a time machine to the 1980s?

If I had a time machine, I wouldn’t go back to the 1980s. I remember it well enough and there wasn’t much good there.

That’s when I learned about public domain/free software/open source/shareware/freeware/proprietary/commercial etc licenses..

First, a lot has changed since the 1980s, including relating to case law and copyright legislation in the US, so if that’s the source of your information, it explains why your ideas are so wrong.

Second, you stated (incorrectly) on September 15th, 2024 that “everything created by mankind is covered by copyright.” This is factually incorrect. Public domain works are not covered by copyright. Claiming you know about the public domain yet not factoring it into your broad claims doesn’t prove you actually know or understand what the public domain is. It’s a giant amount of content. It’s the entirety of human creations in fixed mediums before the 20th century.

terop (profile)

June 11, 2025 at 3:23 pm

Re: Re: Re:¹⁷

Second, you stated (incorrectly) on September 15th, 2024 that “everything created by mankind is covered by copyright.” This is factually incorrect. Public domain works are not covered by copyright.

You’re missing context here. Some idiot was claiming that all content items in the internet can be freely used without considering copyright at all, simply because internet publishes the material to everyone and their mother.

That simply wasn’t true, and the default operation for all copyrighted works is that you need to obtain a license to use them in any way. Public domain is such a small piece of the pie that it can be ignored completely, and no one would want to use the content from the 1930s anyway. They didn’t even have the jpg standard back then.

MrWilson (profile)

June 11, 2025 at 6:49 pm

Re: Re: Re:¹⁸

You’re missing context here. Some idiot was claiming that all content items in the internet can be freely used without considering copyright at all, simply because internet publishes the material to everyone and their mother.

You’re missing the context. You made that statement in reply to me and I wasn’t making any such claim. I was pointing out that LLMs can be trained on content that isn’t covered by copyright and you falsely claimed that all content is covered by copyright and then I pointed out that the public domain exists.

Why are you lying about the record that we can just go back and look at?

That simply wasn’t true, and the default operation for all copyrighted works is that you need to obtain license to use them in any way.

No, that’s not the case at all. There are some legal uses that don’t require obtaining a license or asking for permission. This has been explained to you. And when you’re pressed on the letter of US copyright law and related case law, you have admitted you don’t study it.

Public domain is such a small piece of the pie that it can be ignored completely,

Plenty of people use works that are in the public domain every day. It’s not all that small. The Internet Archive, Google Books, libraries, and other websites are full of such works. One group actually trained an LLM on public domain works from the Library of Congress.

and no one would want to use the content from the 1930s anyway. They didn’t even have the jpg standard back then.

We’re mostly talking about text, but plenty of LLMs are also trained on public domain works as well as copyrighted works. You can make a digital image of an old photograph (and it doesn’t qualify for copyright if it’s just a reproduction). You’re just hand-waving away points that prove you wrong without actually refuting them. You’re also significantly wrong about content from the 1930s. Plenty of people are interested in it. Also, public domain is more than just works from the 1930s. There are some works from between the 1930s and 1978 that are also in the public domain since they failed to renew licenses or didn’t include copyright declarations as previously required.

terop (profile)

June 12, 2025 at 4:33 am

Re: Re: Re:¹⁹

There are some works from between the 1930s and 1978 that are also in the public domain since they failed to renew licenses

That’s only because the USA didn’t sign the Berne Convention rules until 1988… under Berne, copyright is automatic and does not require registration…

MrWilson (profile)

June 12, 2025 at 10:11 am

Re: Re: Re:²⁰

It’s still a fact that you weren’t aware of or at best disregarded in your false claims.

terop (profile)

June 13, 2025 at 5:27 am

Re: Re: Re:²¹

at best disregarded in your false claims.

Only a Sith deals in absolutes like that. It’s not always black and white, when there are levels of gray and all the vibrant colours too.

MrWilson (profile)

June 13, 2025 at 11:20 am

Re: Re: Re:²²

Only an ignorant liar tries to cover up their ignorance with lies and lazy logic.

Also, you’re using the Star Wars quote out of context for your own benefit, which is something a Sith would do.

terop (profile)

June 13, 2025 at 1:46 pm

Re: Re: Re:²³

you’re using the Star Wars quote out of context for your own benefit, which is something a Sith would do.

You simply don’t understand it. I’m an expert at examining colours, since I was responsible for making sure phone screens had exactly the correct coloured pixels. Try counting how many pixels I’ve examined to arrive at correct software to display accurate pixel colours.

It’s even so bad that an advertisement that didn’t use the RGBA colour palette caused significant problems/headaches.

MrWilson (profile)

June 15, 2025 at 8:44 pm

Re: Re: Re:²⁴

“Good luck. You’re gonna need it.”

terop (profile)

July 2, 2025 at 10:12 pm

Re: Re: Re:²⁵

Gambling is forbidden for all companies like Wikipedia, game developers with lootboxes, etc., since only Veikkaus in Finland is able to set up gambling.

Obviously Wikipedia’s Jimmy Wales begging for money in a money collection violated Finnish law about gambling.

terop (profile)

June 8, 2025 at 7:47 pm

Re: Re: Re:⁵

You’re hallucinating worse than an LLM trained solely on crypto bro 4chan posts.

4chan is a significantly better platform than whatever you’re using, at least they don’t put delays to messages for no good reason.

MrWilson (profile)

June 9, 2025 at 12:32 pm

Re: Re: Re:⁶

at least they don’t put delays to messages for no good reason.

If trolls like you are getting caught up in the spam filter, it seems to be working for a very good reason and quite effectively.

terop (profile)

June 9, 2025 at 3:59 pm

Re: Re: Re:⁷

If trolls like you are getting caught up in the spam filter, it seems to be working for a very good reason and quite effectively.

Well, you’re the one waiting for reply when that happens.

MrWilson (profile)

June 9, 2025 at 8:14 pm

Re: Re: Re:⁸

I don’t actually value what you say. I don’t care if I have to wait my entire lifetime for a nonsensical response.

terop (profile)

June 9, 2025 at 10:17 pm

Re: Re: Re:⁹

I don’t actually value what you say. I don’t care if I have to wait my entire lifetime

So this went again to the “let’s look at who wrote the message instead of analyzing the message content”.

MrWilson (profile)

June 11, 2025 at 6:50 pm

Re: Re: Re:¹⁰

If this were the first time you commented, you might have some benefit of the doubt left, but it doesn’t take a lot of pattern recognition from all your comments to guess that everything you say next will also be uneducated bullshit.

terop (profile)

June 12, 2025 at 8:38 am

Re: Re: Re:¹¹

If this were the first time you commented, you might have some benefit of the doubt left

You forgot to factor the possibility that all the bullshit you read is actually accurate and important copyright messages in our part of the world.

I know it is difficult in usa to understand how the world works, when you’re used to looking at your flagpole and singing national anthems, but maybe next time you will actually read the bullshit.

MrWilson (profile)

June 12, 2025 at 10:16 am

Re: Re: Re:¹²

You forgot to factor the possibility that all the bullshit you read is actually accurate and important copyright messages in our part of the world.

The article is about US copyright law. We have only ever been talking about US copyright law. The article was written to address the US Copyright Office commenting on US copyright law. This has never been about copyright in any other part of the world. You came here and chose to freely engage in a discussion about US copyright law. You don’t get to pretend we’ve been talking about anything else.

I know it is difficult in usa to understand how the world works, when you’re used to looking at your flagpole and singing national anthems, but maybe next time you will actually read the bullshit.

I love this absurd generalization. I know it’s difficult for you to understand how people in other countries work, but we’re not the stereotypes you see in movies. I haven’t sung the national anthem in 30 years. But the greater point here is that you’re completely ignorant of what you’re talking about, again.

terop (profile)

June 13, 2025 at 1:40 pm

Re: Re: Re:¹³

we’re not the stereotypes you see in movies.

I know. You’re actually artificial intelligence robots that are soon taking over the world, if we cannot stop the robots that you sent back in time via these devices that can cut through steel and it has nice rotating sphere that cuts its way through time.

MrWilson (profile)

June 15, 2025 at 7:41 pm

Re: Re: Re:¹⁴

The most pathetic aspect of this fantasy is that you’re not important enough that any machine would want to come back to assassinate you. You’re Willy Loman, not JFK.

terop (profile)

June 16, 2025 at 8:22 pm

Re: Re: Re:¹⁵

The most pathetic aspect of this fantasy is that you’re not important enough that any machine would want to come back to assassinate you.

I’m not important enough? I think you are gravely mistaken. Without my work, there wouldn’t exist over 150 million Symbian phones in the European market. And my 3D engine is used/downloaded by 650 lucky people.

terop (profile)

June 9, 2025 at 6:18 am

Re: Re: Re:⁵

I call it failed software because if it was successful you wouldn’t be spamming unrelated posts about it.

It’s not unrelated. I spent a significant amount of time (and time is money) developing a system for my software to avoid liability in the copyright area. Things like not allowing pirate material to ruin the products created with my software, or asking for license information for each asset before publishing them to the world…

Do you understand that your bullshit about how pirate practises are awesome way to create products for the world, with napster or limewire directly competing against my products with unfair and illegal business practices which violate copyright. You’re directly supporting criminals and criminal lifestyle.

MrWilson (profile)

June 9, 2025 at 12:39 pm

Re: Re: Re:⁶

It’s unrelated because the article doesn’t mention your software and no one here cares about your personal agenda.

Do you understand that your bullshit about how pirate practises are awesome way to create products for the world,

Quote me where I’ve said this.

with napster or limewire directly competing against my products with unfair and illegal business practices which violate copyright.

Are you still using AOL? Napster and Limewire are decades old news now.

You’re directly supporting criminals and criminal lifestyle.

[citation needed] You are assigning concepts to me that I have never uttered. You seem to think that because I point out that you’re ignorant about how US copyright law actually works, that I must agree with everything you oppose. That is very lazy, simplistic thinking. If any of this were true, you could easily quote me where I have said such things.

terop (profile)

June 9, 2025 at 3:20 pm

Re: Re: Re:⁷

It’s unrelated because the article doesn’t mention your software and no one here cares about your personal agenda.

They mention my software every time they run an article about copyright (and ruin it with propaganda about how it’s legal to do copyright infringement)

Mike Masnick (profile)

June 9, 2025 at 5:13 pm

Re: Re: Re:⁸

They mention my software every time they run an article about copyright

WTF are you talking about? I’ve never mentioned your software. I have no idea what your software is or what it does.

ruin it with propaganda about how it’s legal to do copyright infringement

Another lie. We have never said that it’s “legal to do copyright infringement.”

Are you having a stroke? Do you need help? Otherwise, why lie?

Mike Masnick (profile)

June 9, 2025 at 9:38 pm

Re: Re: Re:⁹

Hey terop:

I see that you posted a comment replying to this, also caught in the spam filter, where you used me calling out your lie that we mentioned your software, and used it as an excuse to promote your software. That’s just fucking spam, man.

It’s not getting through the filter.

Also: grow the fuck up.

terop (profile)

June 9, 2025 at 10:36 pm

Re: Re: Re:¹⁰

used it as an excuse to promote your software. That’s just fucking spam,

There are users screaming in your site about me not providing citations for the facts I’m posting, and when I finally post the urls, it gets filtered in the spam filter.

These users never saw the urls, since they’re newer than where the urls have been available.

But guess those citations and facts are not important. If that’s your decision, I can’t do anything but tell these people that the citations are not allowed in the site.

Maybe next time they won’t request the ability to check the facts we post…

MrWilson (profile)

June 10, 2025 at 11:18 am

Re: Re: Re:¹¹

Your own software isn’t proof of any of your claims. That you think being asked for citations of US law or case law is an opportunity to spam links to your own software is entirely on your self-interest and stupidity.

terop (profile)

June 10, 2025 at 12:20 pm

Re: Re: Re:¹²

Your own software isn’t proof of any of your claims.

You’re wrong. My software is like a town library. It contains the information gathered in the last decade and stores it for future use. The claims are fundamentally based on the stuff that has happened in the last decade, and thus the software source code should have entries proving all the claims I make. We saw how big a success sites like Wikipedia have become, and our software source code is our version of the information storage.

It is a significant failure on your part that you still do not have access to our software module catalog. It’s like the rest of the world got YouTube, but your internet is slow enough that you cannot access the videos. It’s our history and contributions that are contained in our software, and people who reject it will lose the next chapter in our magnificent technology development.

We are soon taking control of the world, and you will be left behind, poor and miserable, when you failed to innovate and embrace emerging technologies. You’ve seen a glimpse of our technological lead, but it seems to be so advanced that you instantly rejected it. This is why you’ll be late, finding out the world changed under you when no one knew such changes were possible.

so long, and thanks for all the fish in the sea…

MrWilson (profile)

June 10, 2025 at 7:51 pm

Re: Re: Re:¹³

The claims are fundamentally based on the stuff that has happened in last decade, and thus the software source code should have entries proving all the claims i make.

Except we’re discussing US copyright law. Your software doesn’t dictate US copyright law nor is it US case law relating to copyright. You can claim your software contains the secrets of Roswell, New Mexico, the JFK assassination, and the location of alien enclaves in the deep ocean, but that doesn’t make it a factual source.

It is significant failure in your part that you still do not have access to our software module catalog.

This is a hilarious attempt at advertising. You’re trying to neg people to get them interested in your software that you admit people aren’t finding valuable enough to pay for.

We are soon taking control of the world, and you will be left behind, poor and miserable, when you failed to innovate and embrace emerging technologies.

First, who’s we? You’re pluralizing yourself again.

Second, you have no clue as to what technologies I embrace. It sounds like you’ve been left behind since you’re touting software that does what other websites have already been doing, such as displaying 3D models on a web page.

Third, this egomaniacal rant is in humorous juxtaposition to your earlier self-deprecation regarding the failure of your software.

so long, and thanks for all the fish in the sea…

It’s “So Long, and Thanks for All the Fish,” not “all the fish in the sea…” The dolphins didn’t eat all the fish in the sea before they left earth.

terop (profile)

June 10, 2025 at 10:06 pm

Re: Re: Re:¹⁴

Except we’re discussing US copyright law. Your software doesn’t dictate US copyright law nor is it US case law relating to copyright.

US copyright law is based on the Berne Convention rules, so if I solely rely on that ruleset, the same stuff should work also on US soil. If not, then the USA legal system is violating their contractual responsibilities.

MrWilson (profile)

June 11, 2025 at 10:43 am

Re: Re: Re:¹⁵

US copyright law is based on the berne convention rules,

No, no it’s not. US copyright law recognizes the Berne Convention rules, but US copyright law is not based on the Berne Convention rules. US copyright law is more nuanced than the Berne Convention and the Berne Convention doesn’t cover everything in any of the signatory countries’ copyright laws. It’s just an agreement to respect copyright from other countries. It doesn’t cover things like the DMCA or US copyright case law or the vast majority of US copyright law. Your ignorance on this point is just one more black mark on your already disintegrated credibility.

so if I solely rely on that ruleset, the same stuff should work also in US soil.

No, that’s not how it works at all.

If not, then USA legal system is violating their contractual responsibilities.

No, US copyright law recognizes the Berne Convention, but goes beyond that with its own greater nuance. You don’t understand the Berne Convention. It’s not a superseding set of laws over every aspect of US copyright law. For one thing, the US would never agree to that if it was, nor would many of the other signatory countries. Berne only covers one minor aspect of copyright agreements between countries.

terop (profile)

June 11, 2025 at 11:50 am

Re: Re: Re:¹⁶

US copyright law recognizes the Berne Convention rules, but US copyright law is not based on the Berne Convention rules.

Yeah, but you only have experience with the Berne rules since 1988, since the USA was late in adopting the rules. We have experience with the rules for over 200 years, so our Berne techniques are significantly more fine-tuned and we can find subtle details in them which are completely unheard of in the USA.

MrWilson (profile)

June 11, 2025 at 12:42 pm

Re: Re: Re:¹⁷

Except we’re only talking about US copyright laws, not the Berne Convention. You can have more experience in underwater basket-weaving but if we’re not discussing that topic, it’s completely irrelevant and actually worse than useless when you pretend it substitutes for actual knowledge of the topic at hand.

terop (profile)

June 11, 2025 at 1:00 pm

Re: Re: Re:¹⁸

You can have more experience in underwater basket-weaving but if we’re not discussing that topic,

You should increase the scope of your rules to contain a larger area of the world than your tiny USA pond. I’m thinking maybe the Berne rules still don’t contain everything needed, when the rules of China, Asia and Africa should be included too. Staying in your local bubble is never too healthy.

MrWilson (profile)

June 11, 2025 at 1:40 pm

Re: Re: Re:¹⁹

The laws of other countries don’t affect me in my own country. This article is about US copyright laws. It’s not at all relevant to address foreign laws here. If you don’t want to be limited to this particular topic, comment on a different article that is actually about the laws of other countries. And the US is rather large for being a “local bubble.” We have states that are bigger than some European countries. This is quite a desperate goalpost shifting.

terop (profile)

June 13, 2025 at 1:37 pm

Re: Re: Re:²⁰

comment on a different article that is actually about the laws of other countries

I don’t think there is any such articles available in techdirt. There’s too much american bullshit and pre-censorship of foreign articles that the site almost never posts articles about foreign countries. So if I post something, it needs to be about copyright issues, since copyright rules are the same for everyone on the western world.

MrWilson (profile)

June 14, 2025 at 11:50 pm

Re: Re: Re:²¹

I don’t think there is any such articles available in techdirt.

Sounds like you should find a different website that caters to your interests.

since copyright rules are the same for everyone on the western world.

Except they aren’t.

terop (profile)

June 15, 2025 at 4:31 pm

Re: Re: Re:²²

Sounds like you should find a different website that caters to your interests.

The other websites don’t have the same kind of copyright problems that Techdirt has. So I wouldn’t fit well to their reader groups.

It is important that my software had significant problems with the copyright area. First it was attempting to clone YouTube’s main content catalog format, but failing to gain traction among users. Then the software allowed users to download content from the internet, even displaying “Downloading..” as a progress indication, indicating to older people that something illegal is happening. Then it built its 3D model catalog as a combination of its own software code and 3D models from Sketchfab, without displaying copyright or authorship information next to the model. Once that was fixed, it used artificial intelligence, which is known to have copyright problems.

Stuff like this is breaking every software project on the planet. Including mine. I’ve been researching solutions to most of the problems mentioned above, but not all the solutions found have been put into use, given that the solutions regularly cause issues that end users are unable to solve.

MrWilson (profile)

June 15, 2025 at 7:42 pm

Re: Re: Re:²³

So I wouldn’t fit well to their reader groups.

You don’t fit here at all.

It is important that my software…

Nope. Nobody cares about your software.

terop (profile)

June 16, 2025 at 6:52 pm

Re: Re: Re:²⁴

It is important that my software…

Nope. Nobody cares about your software.

This is where you’re going against established practices and common sense. Software is the magic enabler that solves every problem in today’s society, and I just happen to have the newest and most bleeding edge software available on the planet. But given that you did not even try to analyze what the software is doing, your analysis cannot be trusted and you’re always painting yourself to a corner.

Explorer09 (profile)

June 17, 2025 at 12:34 am

Re: Re: Re:²⁵

and I just happen to have the newest and most bleeding edge software available on the planet

@terop, please stop advertising your own software as anything better than others’.

terop (profile)

June 17, 2025 at 3:07 am

Re: Re: Re:²⁶

please stop advertising your own software as anything better than others’.

Nope. Then I would be lying.

Explorer09 (profile)

June 17, 2025 at 10:50 am

Re: Re: Re:²⁷

please stop advertising your own software as anything better than others’.

Nope. Then I would be lying.

Stronger claims require stronger evidence. Since you didn’t provide any evidence for your claim, that would be misleading advertising.

terop (profile)

June 17, 2025 at 1:50 pm

Re: Re: Re:²⁸

Stronger claims require stronger evidence.

My stronger claim is that “I actually have working software” instead of a pile of bits that crashes all the time. As you know from software vendor histories, this is better than what the software market as a whole can implement.

Explorer09 (profile)

June 17, 2025 at 8:11 pm

Re: Re: Re:²⁹

There are other software vendors that ship non-crashing software. So your basic claim like that isn’t helpful.

terop (profile)

June 18, 2025 at 2:46 pm

Re: Re: Re:³⁰

There are other software vendors that ship non-crashing software. So your basic claim like that isn’t helpful.

Rust developers are claiming on the internet that the software industry is not handling security issues, memory corruption, etc. carefully enough. So there are people on the internet claiming that your bullshit explodes in your face.

Explorer09 (profile)

May 30, 2025 at 1:28 pm

Re: Re: Re:⁴

protecting software vendors from entering area they cannot handle

I suspect you are referring to potential copyright infringement lawsuits. Well, IANAL (I am not a lawyer), but it is indeed a legal danger zone if companies use AI code generators without lawyers’ reviews. When the AI “regurgitates” any known copyrighted code (which can lead to accusations of copyright infringement), you won’t be able to know it right away.

terop (profile)

June 11, 2025 at 11:19 am

Re: Re: Re:⁵

it was indeed a legal danger zone if companies use AI code generators without lawyers’ reviews.

Today’s news tidbit kinda explains why AI is dangerous.

Disney is suing an AI company because copyrighted characters appear in Midjourney’s output…

terop (profile)

June 12, 2025 at 4:15 pm

Re: Re: Re:⁶

https://media.npr.org/assets/artslife/movies/misc/midjourney.pdf

If Disney is going to attack Midjourney with this level of big guns, Midjourney has no fucking chance. No amount of fair use is going to save them. This is a more-than-a-billion-dollar lawsuit, and Midjourney’s technology is going to be so fucking dead that we’re going to be digging up bones in a graveyard if we want to see the technology in the next 50 years.

terop (profile)

June 21, 2025 at 4:33 pm

Re: Re: Re:⁷

Now we’ve got what real lawyers think of the Disney v. Midjourney lawsuit:

https://www.youtube.com/watch?v=zpcWv1lHU6I

terop (profile)

June 24, 2025 at 9:26 am

Re: Re: Re:⁸

Now Claude is in big trouble after pirating the training data:
https://aifray.com/claude-ai-maker-anthropic-bags-key-fair-use-win-for-ai-platforms-but-faces-trial-over-damages-for-millions-of-pirated-works/

Explorer09 (profile)

June 24, 2025 at 1:37 pm

Re: Re: Re:⁹

@terop

Regarding the Bartz v. Anthropic summary judgement, the opinions I saw are mixed. In particular, creators are not happy.

The only good side of this judgement is that piracy is likely a game over for AI companies now. (I’m talking about Meta and OpenAI, too.)

I have >50% confidence that the fair use judgement in this case will be appealed. Because this “training is fair use so long as you legally acquired a copy” would mean a greenlight to OpenAI and Google scraping billions of web pages simply because they’re available gratis (for free). This is a terrible precedent for e.g. news companies that publish content mostly on the internet, and independent bloggers and writers.

terop (profile)

June 24, 2025 at 2:50 pm

Re: Re: Re:¹⁰

It seems that the “fair use” solution the companies trusted to fix the copyright issue doesn’t actually help when the use relies on pirated source material. Basically, pirated material can have things like DRM removed, or be format-shifted from proprietary file formats to the more commonly pirated formats like mp3, mp4, or png/jpeg files. This makes pirated versions more convenient for the AI companies, but the legal paperwork was very clear that the company’s convenience is not an acceptable reason to use pirated source materials.

MrWilson (profile)

June 24, 2025 at 7:03 pm

Re: Re: Re:¹⁰

Because this “training is fair use so long as you legally acquired a copy” would mean a greenlight to OpenAI and Google scraping billions of web pages simply because they’re available gratis (for free). This is a terrible precedent for e.g. news companies that publish content mostly on the internet, and independent bloggers and writers.

Or maybe they never had a right to demand a license for training and they’re not losing anything. Why is training magically not fair use when other transformative uses are? If the material is obtained legally, why isn’t the use legal?

“You can read this legal freely available material, but a machine can’t…except a web browser…and a search algorithm…and the internet archive…and a web scraper…and…”

terop (profile)

June 24, 2025 at 10:50 pm

Re: Re: Re:¹¹

Why is training magically not fair use when other transformative uses are?

Fair use should be limited to sentences of 6 words or fewer. Currently they’re asking fair use to apply to terabytes of data, and they’re not considering the amount of work that went into collecting those databases (much less creating the material from scratch). If companies paid proper amounts of money for the data, the AI databases would cost a significant amount of money, millions of dollars.

If the material is obtained legally, why isn’t the use legal?

I think it’s about the sheer amount of data used. Large collections of data have generally been illegal, given that no one is able to obtain licenses for all the data in the collection, since the mere negotiation process for millions of content items is too burdensome. But copyright law has generally solved this by insisting that the amount of data be reduced until proper license acquisition is possible. Acquiring a license is still a significantly easier process than creating the same material from scratch. Why should your company get access to a huge database of data, when the same data is unavailable for use for everyone else who follows copyright law?

It’s only because computers allow large data collections to exist that this has become a significant problem regarding copyrights. When books were copied manually with a printing press or ink pens, getting a license was a minor issue compared to the overall amount of work involved.

Basically none of the AI companies executed the proper process of dividing the data to small pieces and obtaining separate license for each piece from its author. They think it’s too burdensome, but copyright law thinks that they should not use that much data, since creating it from scratch is also burdensome.

When we learned copyright law, the conclusion was that everything on the internet is illegal to use in your own product. There simply weren’t licenses available for the data. Author names were missing and contacting authors via email turned out to be impossible since everyone is trying to avoid spam. I.e. the internet had huge collections of data available, but all of it was inaccessible when you wanted to follow the proper copyright process.

Now AI companies are trying to solve it the same way as how pirates collected their movie/song/software/game collections. This is the wrong way. They should develop technologies that use less data. Make their AI algorithms work with smaller datasets.

This is what I’m doing with my software. I only rely on a small amount of data for developing my 3d model technologies. A large amount of data is copyright-dangerous, and it also requires more compute time to analyze and utilize. Even normal 3d models are large enough that GPU cards struggle to render all the data passed to the hardware. If handling the data takes a long time, there’s no reason to collect such large databases.

Explorer09 (profile)

June 25, 2025 at 9:43 am

Re: Re: Re:¹²

@terop

In the U.S., “fair use” in copyright law is decided by the courts on a case-by-case basis. Rather than listing which particular uses are fair, the statute mandates four factors to consider (17 U.S. Code § 107). The judges evaluate the four factors separately and then weigh them together for the overall conclusion. The judges also reference precedents so that similar cases are evaluated in a similar way.

Fair use should be limited to sentences of 6 words or fewer. Currently they’re asking fair use to apply to terabytes of data, and they’re not considering the amount of work that went into collecting those databases (much less creating the material from scratch).

The sad fact is there was a case nicknamed “Google Books” (Authors Guild v. Google) that had ruled fair use even when Google scraped terabytes of data. It’s a book search and indexing engine, and the courts gave that fair use. So it isn’t about the amount of data scraped. Even terabytes can be fair for a search engine.

If companies paid proper amounts of money for the data, the AI databases would cost a significant amount of money, millions of dollars.

And yes this is why the AI companies try to lobby and try to gain fair use for everything they scraped. (They had fair use for search engines and are trying to push that for generative AI.)

Why should your company get access to a huge database of data, when the same data is unavailable for use for everyone else who follows copyright law?

Good point. And this is why, in the recent Anthropic case, the judge denied fair use for the pirated books. (I totally agree with this part, even though the rest of the ruling is significantly flawed.)

Basically none of the AI companies executed the proper process of dividing the data to small pieces and obtaining separate license for each piece from its author. They think it’s too burdensome, but copyright law thinks that they should not use that much data, since creating it from scratch is also burdensome.

Note: in the case of book search engines, creating data from scratch wouldn’t make sense. There is also another case (sorry, I can’t find the case law for this) of a plagiarism detector, where the machine needs to keep full copies of the books so that they can be used to find plagiarism in users’ inputs.

[W]e learned copyright law, the conclusion was that everything on the internet is illegal to use in your own product. There simply weren’t licenses available for the data.

That is partly true. Most content published on the internet is not licensed for commercial reuse. But there is a subset of data that comes with explicit licenses, such as Creative Commons, that permit you to use it without contacting the author. (I would argue that, with proper attribution, AI can be trained on Creative Commons licensed works. It’s just that we haven’t seen AI companies attribute their sources when they train AIs.)

They should develop technologies that use less data. Make their AI algorithms work with smaller datasets.

Or in the alternative, obtain licenses for all the datasets. This is how large, open-source software (such as Linux) thrives.

terop (profile)

June 26, 2025 at 8:46 am

Re: Re: Re:¹³

The sad fact is there was a case nicknamed “Google Books” (Authors Guild v. Google) that had ruled fair use even when Google scraped terabytes of data.

I think what separates Google Books from artificial intelligence is that Google Books only wanted to utilize the “captions” from the data, not the full text of the book. They published search snippets, which were intentionally restricted to a small piece of the text and could never contain a large section of text from the book. The full power came only from the fact that it could index multiple books.

Artificial intelligence is different. They use the full text of the book, for the core creative aspect of the book. AI’s trick is to try to “obfuscate” the source of the material, and thus they’re unable to produce a list of the works the end result has been created from. AI is not creating direct quotes of the text; instead they run the data through an obfuscation service. It’s similar to how criminals hide their crypto money trail by running the money through coin mixers.

Coin mixers have clearly been declared illegal in the financial world for helping criminals launder money, so if we consider copyrighted subject matter as a form of money, we must consider AI practices illegal too.

Explorer09 (profile)

June 26, 2025 at 12:03 pm

Re: Re: Re:¹⁴

@terop You didn’t read the case of Google Books and made the wrong assumption. Google did index the full content of the books.

And as Judge Chhabria ruled, you need to point to evidence that generative AI “obfuscated” the sources before your infringement claim works. Note that it’s not that I like AI; it’s that the infringement claim needs stronger evidence in order to work. And hell, I know data laundering is a serious moral issue, but that doesn’t lead to your conclusion.

terop (profile)

June 27, 2025 at 2:04 pm

Re: Re: Re:¹⁵

You didn’t read the case of Google Books and made the wrong assumption. Google did index the full content of the books.

The recent paperwork claimed that Meta’s AI output could only reproduce fewer than 50 words from each individual book, even if you carefully craft the prompt to look for info from that book.

And this fact was used to claim that the Google book-scanning case applies to the situation..
=> so the small amount of infringing data in the output is an essential part of their case…

Explorer09 (profile)

June 27, 2025 at 11:36 pm

Re: Re: Re:¹⁶

The recent paperwork claimed that Meta’s AI output could only reproduce fewer than 50 words from each individual book, even if you carefully craft the prompt to look for info from that book.

And this fact was used to claim that the Google book-scanning case applies to the situation..
=> so the small amount of infringing data in the output is an essential part of their case…

People often quote the content of a book to express opinions about the book itself. Such “quoting for commentary” uses are definitely fair under copyright law, unless you quote so much that your commentary effectively substitutes for sales of the book.

There is another weakness in the genAI fair use claim: there is a possibility that the regurgitated portion ends up in another book for sale that serves the same purpose as the original author’s work (e.g. a quote from a novel ends up in another commercial novel, or a quote from a news article is used to publish another news article without crediting the original source). That could defeat the fair use. Judge Chhabria might have had this “unfair use” in mind, yet the plaintiffs didn’t argue it. And so he had to rule Meta’s use marginally fair, with a lot of warnings.

Explorer09 (profile)

June 29, 2025 at 6:13 am

Re: Re: Re:¹⁷

@terop

Mind you, I don’t like the Google Books precedent at all. Even though a regurgitation of 50 words is not much, malicious users could eventually extract the whole book out of the AI by repeatedly trying prompts and piecing many 50-word outputs together to make a full, infringing version of the book.

The Google Books case is a Second Circuit ruling. Theoretically it can be overturned by the Supreme Court, but the aforementioned malicious use has not been seen and the plaintiffs didn’t cite any evidence of it. It isn’t worth appealing this case – it’s better to file suit again with different authors.

terop (profile)

June 30, 2025 at 8:52 am

Re: Re: Re:¹⁸

malicious users could eventually extract the whole book out of the AI by repeatedly trying prompts and piecing many 50-word outputs together

This is why many plaintiffs are trying to move from detecting infringement in the output to detecting it in the input of the AI system. On the input side, the infringement is blatant: the full text of the books is copied.

It’s no longer just selecting snippets from the books; instead the input side is cloning the full expressive content of the book. But then it hits the problem of linking the infringement to the copyright owner’s exclusive operations: PERFORM, DISTRIBUTE, DERIVATIVE WORKS and DISPLAY. The AI system isn’t exactly publicly displaying anything other than the snippets. The distribute bit fails for similar reasons. Perform doesn’t work either. But the key aspect that AI clearly infringes is the DERIVATIVE WORKS section. If plaintiffs focused on derivative works, they could win every AI-related lawsuit.

AI-based products on the internet are clearly infringing on the derivative works exclusive operation.

Explorer09 (profile)

June 30, 2025 at 11:55 am

Re: Re: Re:¹⁹

But the key aspect that AI clearly infringes is the DERIVATIVE WORKS section. If plaintiffs focused on derivative works, they could win every AI-related lawsuit.

The “fair use” in U.S. law does shield users from infringing the derivative work right. So your proposed focus would not work. (It’s Campbell v Acuff-Rose Music case law.)

terop (profile)

July 1, 2025 at 7:08 am

Re: Re: Re:²⁰

The “fair use” in U.S. law does shield users from infringing the derivative work right.

The courts and judges just warned in the paperwork against relying on the fair use defense, given that it was only the plaintiffs’ blunders that gave Meta its fair use decision; had the plaintiffs actually run the case properly, they would have won on fair use. They’re clearly recommending that every AI company take a license to the training material before content owners find out about the operations.

Explorer09 (profile)

June 25, 2025 at 3:31 am

Re: Re: Re:¹¹

Why is training magically not fair use when other transformative uses are? If the material is obtained legally, why isn’t the use legal?

A better analogy is that when you buy a CD from a music store, it grants you a license for personal (and home) enjoyment of the music, but you are not to play that music at your workplace or store.

Purchasing the CD does not imply a license for commercial use of that music.

Just to mention, I strongly suggest this case will be appealed. Judge Alsup’s reasoning is deeply flawed and it focused too much on “transformative”-ness that it engulfed other considerations of fair use. Also it erroneously equated machine learning to human learning (I’ve suggested this equation shouldn’t hold because there is no legal personhood for machines; not founded in any constitution of any country).

MrWilson (profile)

June 25, 2025 at 1:57 pm

Re: Re: Re:¹²

A better analogy is that when you buy a CD from a music store, it grants you a license for personal (and home) enjoyment of the music, but you are not to play that music at your workplace or store.

But purchasing a CD, listening to the music, learning to play guitar and understand chord progressions and then writing your own music using the skills you learned is perfectly legal. Adding “with a computer” shouldn’t magically make that different.

Purchasing the CD does not imply a license for commercial use of that music.

But purchasing the CD does not restrict you from learning to make music and make money from that skill you developed. But also, not everyone training an LLM is seeking to or is making money.

Let’s take the machine out of the process. You sit a chimp in front of Bob Ross episodes and the chimp learns to paint. The chimp paints a painting that isn’t a copy of a Bob Ross painting. Is the chimp violating copyrights? No, of course not. If you sell the chimp’s painting, you’re also not violating copyrights just because the chimp learned from watching Bob Ross. The chimp being supplanted by a machine doesn’t change the legal foundations of the process. That would be like saying you’re legally allowed to look at a painting, but no one can look at it using glasses to improve their vision. Glasses are just a tool. A computer is just a tool.

Just to mention, I strongly suggest this case will be appealed.

Of course it will. There’s a lot of money to be made in licensing deals if media companies can force everyone to license training data. That doesn’t mean it will be overturned on appeal. It could be. We’ll see.

Judge Alsup’s reasoning is deeply flawed and it focused too much on “transformative”-ness that it engulfed other considerations of fair use.

The thing about the four factors is that they aren’t applied equally. You could have three factors go against you and a court might rule that the remaining factor goes in your favor strongly enough to override the others, or vice versa. A poem consisting of four lines is so short that 100% of the work is used, but that doesn’t make it infringing if other factors weigh heavily towards a fair use.

Transformative is a powerful argument. Commentary and parody are transformative by nature and they get special carve outs.

Also it erroneously equated machine learning to human learning (I’ve suggested this equation shouldn’t hold because there is no legal personhood for machines; not founded in any constitution of any country).

The lack of legal personhood shouldn’t be an impediment. Evolving algorithms are already legal. Humans are just using the machines as tools to do something. That’s literally something humans have done for millions of years.

The personhood issue only speaks to whether the output can be copyrighted. It can’t because the authorship is by a machine, not a person (though there is some debate as to whether the human prompt is equivalent to other human input that does allow for copyrightability, such as a human clicking a camera button to take a picture, even though it’s the camera that is actually capturing the image – that’s a debate for another day).

Explorer09 (profile)

June 26, 2025 at 4:33 am

Re: Re: Re:¹³

But purchasing a CD, listening to the music, learning to play guitar and understand chord progressions and then writing your own music using the skills you learned is perfectly legal. Adding “with a computer” shouldn’t magically make that different.

This argument is fine for music maker applications, but probably not for music-generating AIs (Suno & Udio). AI cannot be equated with human learning, because there are no so-called “skills”. Rather, it is more about “samples” and the quality of those samples.

Let’s take the machine out of the process. You sit a chimp in front of Bob Ross episodes and the chimp learns to paint. The chimp paints a painting that isn’t a copy of a Bob Ross painting. Is the chimp violating copyrights? No, of course not. If you sell the chimp’s painting, you’re also not violating copyrights just because the chimp learned from watching Bob Ross. The chimp being supplanted by a machine doesn’t change the legal foundations of the process.

I would say this is a good analogy, MrWilson, but details can matter.

If the chimp’s painting is substantially similar to Bob Ross’s (think of the chimp photographing the painting rather than redrawing it with a brush), then the chimp’s output can still infringe copyright. Except that the chimp can’t be sued for infringement; it would be the human redistributor of the chimp’s work who is liable for infringement.
If there’s no substantial similarity (ignoring the aspect of “style” copying which is out of scope of copyright law but may be in scope of trademarks), then of course there’s no infringement. The chimp’s work would be uncopyrightable, by the way, assuming it can draw decently.
So it’s not always yes or always no, it’s the details.

There’s a lot of money to be made in licensing deals if media companies can force everyone to license training data. That doesn’t mean it will be overturned on appeal. It could be. We’ll see.

Considering that Judge Chhabria has also ruled on the case now, I won’t debate this part further. Judge Chhabria’s arguments are much better than Judge Alsup’s.

such as a human clicking a camera button to take a picture, even though it’s the camera that is actually capturing the image – that’s a debate for another day

I don’t think a debate is needed on this. The key is the amount of human creative control, which determines copyrightability. When the AI generates a significant part of the image/music/content that the human has no control over (as in, most of the internal decisions are a black box), those parts would be uncopyrightable. And the USCO has recognised certain works made with AI and registered copyrights for them, albeit each of them carries a waiver (on which parts are uncopyrightable). (I would call these cases “partial copyright” protection, in contrast to full copyright.)

terop (profile)

June 26, 2025 at 9:04 am

Re: Re: Re:⁹

Here’s a decision that Meta’s use of pirated books is fair use, simply because the authors couldn’t connect the dots between slowing book sales and the introduction of AI in the marketplace:

https://news.bloomberglaw.com/legal-ops-and-tech/meta-beats-copyright-suit-from-authors-over-ai-training-on-books

Explorer09 (profile)

June 11, 2025 at 2:47 am

and Stable Diffusion didn’t attribute the source or the original photographer.

Should they post the attribution internally in their offices on a bulletin board? On a post-it note next to their computer?

Attribution next to the generated output, you idiot!
Say “this AI-generated image incorporates materials from [author-name] [image-url], which is licensed under CC-BY 4.0”

But also, why would they waste so much time and effort to try to use an LLM to replicate copyrighted materials when they could just find it somewhere else and in some cases, legally for free?

As I’ve said, you slipped it out of your mouth. You support piracy, period.

You assume every image that an AI model has been “trained” with is legally obtained for free (which it is not; OpenAI, Meta and Anthropic all use pirated materials), otherwise you would not ask that question at all.

A person who respects copyright would say, buy the license for AI training. Even in a fair use scenario, Google Books for example, the result page of Google Books would show the user where to buy the book (e.g. buy from Amazon). You didn’t consider that, because you support pirates. And I should stop making you smarter by recognizing your mistakes.

MrWilson (profile)

June 11, 2025 at 11:07 am

Re:

Attribution next to the generated output, you idiot! Say “this AI-generated image incorporates materials from [author-name] [image-url], which is licensed under CC-BY 4.0”

Except the training dataset is not in the generated output, you idiot! That’s the whole fucking point! Again, again, again, you don’t understand how the technology works. The result from the model is not a derivative work. It doesn’t utilize a single image. It utilizes its understanding of denoising using tokens and weights. There’s no attribution that can be made, any more than pulping all physical paintings and using the pulp to make new art would allow a collage artist to identify the original source of any random element.

legally for free?

As I’ve said, you slipped it out of your mouth. You support piracy, period.

Yes, legally for free piracy. That makes total sense.

As I’ve said, you’re pretending the internet doesn’t exist and you’re pretending that it’s illegal to view images on the internet and you’re pretending that LLMs contain all images on the internet and you’re pretending that LLMs can just reproduce all images on the internet. Your assertions are nonsensical and completely impractical. You’re predicting internet users will forget the internet exists and just try to recreate all content with an LLM. This is such a weird fever dream and moral panic. Do you think Dungeons and Dragons actually teaches children real satanic spells and rituals? Do you think Halloween candy has razorblades and poison in it? Do you think playing popular music backwards lets you talk to demons?

Your sanity is in question at this point.

You assume every image that an AI model has been “trained” with is legally obtained for free (which it is not; OpenAI, Meta and Anthropic all use pirated materials), otherwise you would not ask that question at all.

I’m not assuming anything. You’re back to conflating training and use as if they’re the same thing. You’re also assuming that LLMs can perfectly replicate trained data that isn’t in the model.

A person who respects copyright would say, buy the license for AI training.

There isn’t a license for AI training in most cases. This hasn’t been a thing until now. There also isn’t an established legal right to demand licenses for training, which is what the court cases and this article are about. You’re pretending it’s already settled.

Even in a fair use scenario, Google Books for example, the result page of Google Books would show the user where to buy the book (e.g. buy from Amazon).

That’s not a mandatory action by Google. That’s not legally required. You’re inventing your own standard and pretending it’s legally binding.

You didn’t consider that, because you support pirates.

That’s just a useless ad hominem at this point. You claim poor people both don’t exist but also do exist and are freeloaders and pirates. If that’s your definition, the criticism isn’t bad. It’s telling that you think it’s an insult to support the rights and interests of poor people and you feel compelled to denigrate them.

And I should stop making you smarter by recognizing your mistakes.

That’s a weird claim coming from a person who has been proven wrong so many times. I’ve been giving you a free education despite the fact that you claim “You should pay tuitions to your teacher. Everyone pays someone when they learn.” I can give you a paypal or venmo link if you’d like to pay me “tuitions.”

Explorer09 (profile)

June 11, 2025 at 3:58 pm

Re: Re:

Except the training dataset is not in the generated output

Then why can the AI generate output that is similar to copyrighted works? By magic?

That’s common sense. (Remember the example of reproducing the photo of Anne Graham Lotz that USCO has cited.)

you’re pretending the internet doesn’t exist
and you’re pretending that it’s illegal to view images on the internet
and you’re pretending that LLMs contain all images on the internet
and you’re pretending that LLMs can just reproduce all images on the internet.

False.
Viewing images on the internet involves copying the data to your computer memory. And if those images are illegally distributed, you viewing it can constitute copyright infringement.
Not all images, but many images, including many images that are copyrighted.
Regurgitation is an issue. You even admitted regurgitation is technically possible, so why hide it? You even cited the “one in a million” chance, so if I can try one million times, I can prove your infringement.
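
As a rough sanity check on the “one in a million” arithmetic: if every attempt is treated as an independent try with the same fixed chance of success (an assumption for illustration only, not something the study or anyone in this thread established), the chance of at least one hit in n tries is 1 - (1 - p)^n. The function name below is made up for this sketch.

```python
import math


def prob_at_least_one(p: float, attempts: int) -> float:
    """Chance of at least one success in `attempts` independent tries.

    Assumes every try succeeds with the same probability `p`
    (a simplifying assumption, not a measured property of any model).
    """
    # 1 - (1 - p)**attempts, written with log1p/expm1 so that a tiny p
    # does not lose precision in floating point.
    return -math.expm1(attempts * math.log1p(-p))


if __name__ == "__main__":
    p = 1e-6  # the "one in a million" figure from the discussion
    for n in (1_000_000, 3_000_000, 10_000_000):
        print(f"{n:>10} tries -> {prob_at_least_one(p, n):.4f}")
```

Under those assumptions, one million tries at a one-in-a-million chance gives roughly a 63% chance of at least one hit, not a certainty; it takes several million tries to get above 95%.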

You’re predicting internet users will forget the internet exists and just try to recreate all content with an LLM.

Another slip of your mouth. You support piracy period.

Before the need to argue whether it’s practical to recreate copyrighted content with LLMs, you simply suggest users to download the content from the internet, which implies, you pirate.

(Otherwise you would say to buy instead. Some of the creative works are exclusive to physical media, some others allow downloading but are behind a paywall. You are worse than the AI companies trying to defend their training data as “publicly available” content, because you suggested pirate ways.)

There isn’t a license for AI training in most cases

Say what? https://authorsguild.org/advocacy/artificial-intelligence/ai-licensing-what-authors-should-know/

There also isn’t an established legal right to demand licenses for training

What the fuck is “the right to demand license”?

MrWilson (profile)

June 11, 2025 at 7:29 pm

Re: Re: Re:

Then why can the AI generate output that is similar to copyrighted works? By magic?

First, this question indicates your admission that you don’t understand how LLMs are trained, as I have already claimed multiple times. So at least you’re being honest here about your ignorance.

Second, it’s because it learned to generate based on its analysis of the training data. Text LLMs predict the next word based on weights and tokens. Image generation works through a similar process with denoising. This information is widely available on the internet. You shouldn’t be this ignorant, especially when claiming to know more than other people.
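
To make “predict the next word” concrete, here is a toy sketch. It is a plain word-pair counter, not a neural network and not how any production LLM is built; real models learn weights over tokens rather than keeping a lookup table. The corpus and the function names are invented for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(text: str) -> dict:
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(model: dict, word: str):
    """Return the most frequently seen follower of `word`, or None if unseen."""
    followers = model.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical training text, made up for this example.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)

if __name__ == "__main__":
    print(predict_next(model, "the"))    # most common word seen after "the"
    print(predict_next(model, "zebra"))  # word never seen in training
```

The only point of the sketch is the shape of the process: training turns text into statistics, and generation picks a likely continuation from those statistics.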

That’s common sense. (Remember the example of reproducing the photo of Anne Graham Lotz that USCO has cited.)

I am surprised you brought this example of your utter ignorance up again.

The study intentionally sought that image using a poorly trained model that contained a bunch of duplicates of that image and they had to try millions of times to get something that looked “similar” but not useful as a substitute or a competing work. And that’s the only example you’ve ever mentioned despite using this one example to assert that any LLM could do the exact same thing with virtually any copyrighted work.

Viewing images on the internet involves copying the data to your computer memory. And if those images are illegally distributed, you viewing it can constitute copyright infringement.

Except we’re talking about a legally free image. It wasn’t illegally distributed.

Not all images, but many images, including many images that are copyrighted.

No, LLMs don’t contain any images. That’s the whole point. The training dataset is not in the model. It can’t be. There isn’t a compression standard good enough to squeeze all that data into a few gigabytes. You’re showing off your ignorance again here.

Regurgitation is an issue. You even admitted regurgitation is technically possible, so why hide it?

I haven’t admitted it’s technically possible. I’ve pointed out the flaws in the conclusion that the study made that the US Copyright Office parroted and that you lazily trusted.

You even cited the “one in a million” chance,

Actually, you said one in a million. I said one in millions. And I pointed out the context that makes that one single example difficult to reproduce. And what it did render was not a copy, but a fuzzy reference at best. And, as I already pointed out, the image was already LEGALLY free and available on the internet, so no one would need to try so hard to get a bad imitation of it from an LLM.

Provide another example that is actually a passable copy of a work used in the training data. This one example is not your proof.

so if I can try one million times, I can prove your infringement.

Try it. I’d love to see the results. You claim it’s possible. Go for it. Until you provide this proof, we’ll consider it impossible.

Another slip of your mouth. You support piracy period.

There’s no slip. This is you admitting that you think people viewing stuff on the internet means they’re violating copyrights. I’m referring, as I have stated, to simply going to any random website that isn’t locked behind a paywall or a login and your browser just loads images. Or do you think you’re “pirating” and violating copyrights when you come to Techdirt and the logo loads in your browser? Why are you accusing yourself of being a pirate?

Before the need to argue whether it’s practical to recreate copyrighted content with LLMs, you simply suggest users to download the content from the internet, which implies, you pirate.

No, you dense motherfucker. I literally linked to the webpage where your one weakest example is legally available for free. Which implies you think browsing the internet is a copyright violation. Or at least that you’re just an idiot who doesn’t know what they’re talking about.

(Otherwise you would say to buy instead.

You literally can’t buy a lot of content that is legally available for free. There’s no purchase button on a Wikimedia page offering a free license on a photograph that the copyright holder has released with a permissive license.

That your mind jumps immediately to copyright violations indicates your bias. There’s plenty of legally free content on the internet that nobody can or needs to pay for. Are you like Terop and think the public domain doesn’t exist?

Some of the creative works are exclusive to physical media,

Then they aren’t getting trained in LLMs.

some others allow downloading but are behind a paywall.

Then they aren’t getting trained in LLMs.

You are worse than the AI companies trying to defend their training data as “publicly available” content, because you suggested pirate ways.)

I didn’t suggest “pirate ways” at all. You claimed legally free content required a purchase.

Say what? https://authorsguild.org/advocacy/artificial-intelligence/ai-licensing-what-authors-should-know/

You keep doing this where you just search for something you imagine supports your argument but you don’t actually examine what you link to. For dog sakes, you literally quoted a reddit page previously. That linked page is one organization talking about their licensing recommendations. But they don’t cover every creator whose work might be used in training an LLM. Not all creators are easily contacted. There are works still covered under copyright whose current owner isn’t easily identified. You’re pretending it’s as easy as just clicking a button on a website and paying for a license as if it’s openly offered in easily identified and verified manners.

I’m an author. I’m not a part of the authors guild. My work isn’t available for licensing for LLM training. It’s not like I have a button somewhere on the Amazon pages for my work.

What the fuck is “the right to demand license”?

There is no law or court case that has ruled definitively that all LLM training on copyrighted content requires a license, ergo, copyright owners do not possess such a recognized right.

Explorer09 (profile)

June 12, 2025 at 3:12 am

Re: Re: Re:²

it’s because it learned to generate based on its analysis of the training data. Text LLMs predict the next word based on weights and tokens. Image generation works through a similar process with denoising.

Weight prediction = expression “fixed in a tangible medium” -> copyright issue.

Denoising = a very-aggressive compression technique that does not affect the analysis of copyright.

The judge in Andersen v. Stability AI had already disagreed with you.

“That these works may be contained in Stable Diffusion as algorithmic or mathematical representations – and are therefore fixed in a different medium than they may have originally been produced in – is not an impediment to the claim at this juncture.” (Judge Orrick) https://admin.bakerlaw.com/wp-content/uploads/2024/08/ECF-223-Order-Granting-in-Part-and-Denying-in-Part-Defendants-Motions-to-Dismiss.pdf

LLMs don’t contain any images. That’s the whole point. The training dataset is not in the model.

Because it exists not in the form of images but in the form of mathematical expression! See the quote from the judge again.

do you think you’re “pirating” and violating copyrights when you come to Techdirt and the logo loads in your browser?

Straw man.

You literally can’t buy a lot of content that is legally available for free. There’s no purchase button on a Wikimedia page offering a free license on a photograph that the copyright holder has released with a permissive license.

That your mind jumps immediately to copyright violations indicates your bias. There’s plenty of legally free content on the internet that nobody can or needs to pay for. Are you like Terop and think the public domain doesn’t exist?

No, my question is, why the fuck can’t the AI models be trained with only public domain materials.

When you suggested many things on the internet can be “legally downloaded for free”, you ignored the reality that AI companies train with pirated materials, and that means copyright infringement! Suggesting “legally downloaded for free” content won’t help because the AI companies are not accused of this part!

Your replies suggest to me again and again that training the AI with only public domain content is the only non-infringing way to go! So fuck!

What the fuck is “the right to demand license”?

There is no law or court case that has ruled definitively that all LLM training on copyrighted content requires a license, ergo, copyright owners do not possess such a recognized right.

The right to reproduction and the right to derivatives cover AI training. And you dodged my original question of what the fuck is “the right to demand license”.

MrWilson (profile)

June 12, 2025 at 10:52 am

Re: Re: Re:³

Weight prediction = expression “fixed in a tangible medium” -> copyright issue.

Weight prediction isn’t a copied expression fixed in a tangible medium. It’s an original understanding of how to compose language. For this to be a copyright issue, children would likewise have to get copyright clearances to learn how to write. That is not how US copyright law works. You’re inventing a right that doesn’t exist.

Denoising = a very-aggressive compression technique that does not affect the analysis of copyright.

Denoising isn’t a compression technique at all. Again, you don’t understand how the technology works.

As the EFF has said:

“The complaint against Stable Diffusion characterizes this as “compressing” (and thus storing) the training images, but that’s just wrong. With few exceptions, there is no way to recreate the images used in the model based on the facts about them that are stored. Even the tiniest image file contains many thousands of bytes; most will include millions. Mathematically speaking, Stable Diffusion cannot be storing copies of all of its training images (for now, let’s put a pin in the question of whether it stores a copy of any of them).”
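
The arithmetic behind that quote can be sketched in a few lines. This is only a back-of-the-envelope illustration: the model size and training-image count below are hypothetical round numbers chosen for the example, not figures taken from the complaint, the EFF post, or any specific model.

```python
# Back-of-the-envelope check: how many bytes of model exist per training image?
# Both constants are ASSUMED round numbers for illustration only.

def bytes_per_image(model_size_bytes: int, num_training_images: int) -> float:
    """Average number of model bytes available per training image."""
    return model_size_bytes / num_training_images

MODEL_SIZE_BYTES = 4 * 10**9      # assumption: a model of "a few gigabytes"
NUM_TRAINING_IMAGES = 2 * 10**9   # assumption: training set in the billions

budget = bytes_per_image(MODEL_SIZE_BYTES, NUM_TRAINING_IMAGES)
print(f"{budget:.1f} bytes of model per training image")
```

Under those assumed numbers the average budget is a couple of bytes per image, against the “many thousands of bytes” the quote gives for even a tiny image file. An average like this says nothing about whether a handful of heavily duplicated images could still be memorized, which is the question the quote puts a pin in.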

The judge in Andersen v. Stability AI had already disagreed with you.

The judge doesn’t disagree with me. The word “may” in that sentence means the judge is saying no determination has been made at this stage. That was, as you can see, simply to address a motion to dismiss. The judge is only saying it’ll go to court to be determined. Again, you gleefully try to find anything you think will support your argument but you don’t understand the nuance so you end up posting things that at best don’t support your argument and at worst prove you wrong.

Also, I wouldn’t trust a judge over a technologist on a matter of technology.

Because it exists not in the form of images but in the form of mathematical expression! See the quote from the judge again.

Except that’s not actually true. What exists in mathematical expression is the original understanding that the LLM has about how to compose sentences and predict the next word and not predict the next word from a single work but from the entirety of every piece of data it has been trained on but doesn’t retain. It is not compressing the work at all, it is remembering what it learned. It’s like reading a bunch of books and learning how English syntax works and developing a vocabulary you can use to compose sentences from them. That is not copyright violation or else, again, human learning would violate copyright.

The judge also isn’t making a legal determination of fact. It was a motion to dismiss that they were ruling on, which involves factoring in the unproven arguments of both sides. You also don’t understand how the US court system works.

Straw man.

No, not at all. It’s literally your theory of copyright in a real world example. It contradicts your own assertions. If you don’t think you’re violating copyright by visiting this site, the same principle applies to going to the Wikimedia page for a photograph. But you’re referring to that as piracy and a copyright violation. You don’t like it being pointed out that the only example you can parrot from third parties doesn’t even support your argument because the image is available legally for free.

No, my question is, why the fuck can’t the AI models be trained with only public domain materials.

Someone tried. It was really hard and the result was less useful than one trained on larger datasets. But that’s not a technological or legal question. You’ll want to ask the people doing the training.

When you suggested many things on the internet can be “legally downloaded for free”, you ignored the reality that AI companies train with pirated materials that means copyright infringement!

This is you again pretending I’ve ever supported this. This is a straw man. I never said it was okay to use “pirated” materials.

It’s also a moved goalpost because on this particular topic, we weren’t discussing the conduct of the LLM trainers, but rather the efforts of users who you fantastically claimed would try to use an LLM to produce a flawed reproduction of a work that is legally available for free in a perfect reproduction on a website, again legally for free!!! Why would anyone go to great lengths to generate millions of results, wasting vast amounts of time and electricity, to get a failed reproduction of something you can just get legally for free on the internet? The scenario doesn’t even make sense because you would have to have already visited the Wikimedia page to have seen the legally free image in order to know what to try to reproduce using the LLM. The study scenario is contrived and your attempts to use it to assert people will pursue this as an avenue of copyright violation is absurd. This only says anything about your paranoia and ignorance.

Suggesting “legally downloaded for free” content won’t help because the AI companies are not accused of this part!

Again, we’re not talking about the AI companies in this respect. We’re talking about users that you accused of copyright violation in using an LLM to reproduce a failed attempt at a copy of a legally free image already available on the internet. If you can’t keep track of what you’ve argued, you should look back through the thread. Keep up, dude.

Your replies suggest to me again and again that training the AI with only public domain content is the only non-infringing way to go!

Feel free to train your LLMs with only public domain content. That’s your right, I assume, though I don’t know the laws in your country. In the US, that would be perfectly legal. Have fun with that.

So fuck!

Sofa king great!

The right to reproduction and the right to derivatives cover AI training.

Again, “There is no law or court case that has ruled definitively that all LLM training on copyrighted content requires a license, ergo, copyright owners do not possess such a recognized right.” You responded to the answer.

And you dodged my original question of what the fuck is “the right to demand license”.

The right to demand licenses is the right of the copyright holder to demand that people obtain a license for uses of their copyrighted content that are required by law. And again, again, again, no legislation or case law has yet determined that copyright holders have a right to demand a license or licenses for LLM training.

Explorer09 (profile)

June 12, 2025 at 2:19 pm

Re: Re: Re:⁴

Weight prediction isn’t a copied expression fixed in a tangible medium. It’s an original understanding of how to compose language. For this to be a copyright issue, children would likewise have to get copyright clearances to learn how to write. That is not how US copyright law works. You’re inventing a right that doesn’t exist.

“The complaint against Stable Diffusion characterizes this as “compressing” (and thus storing) the training images, but that’s just wrong.

Children don’t learn how to write by reading a copyrighted novel. It is already bullshit by analogizing machine learning with human learning (this is also debunked by USCO), now it’s more bullshit by suggesting children learn writing by reading copyrighted books.

If you had made a more reasonable claim, such as learning with dictionaries and textbooks, it could have made more sense, but no, you didn’t.

“Denoising isn’t a compression technique at all […]”

Say that to judges! Say that to Disney and Universal! It’s bullshit to suggest images can come out of thin air. And EFF’s reasoning is fundamentally flawed.

Mathematically speaking, Stable Diffusion cannot be storing copies of all of its training images (for now, let’s put a pin in the question of whether it stores a copy of any of them).”

Because Stable Diffusion doesn’t need to store “all” copies! Rather, it stores aggregated and “diffused” representations of all training data, and lets the model interpolate the images it didn’t store.

EFF was certainly confused about the “compression” claim. They thought the “compression” only happens on a single image, and not on the aggregated data! When there are lots of redundancies in the aggregations of image representations, compression can dictate that only one representation needs to be stored, even when the representations of objects are visually different among training images.

So what does that mean? There are still copies. But they are highly aggregated and “fused” with other images of the training data. By analyses in the copyright, the model itself becomes the derivative work of the training data. No magic here.

Someone tried. It was really hard and the result was less useful than one trained on larger datasets.

As if claiming you cannot make a “useful” AI without being unethical.

I never said it was okay to use “pirated” materials.

[…] something you can just get legally for free on the internet

There is no “legally for free” zone here. Only “illegal for free” zone.

You are questioning why people would reproduce “legally for free” images through LLM, but that’s never my question. What I said is people reproduce “illegally, copyrighted” images through LLM, not the “legally for free” bullshit.

The right to demand licenses is the right of the copyright holder to demand that people obtain a license for uses of their copyrighted content that are required by law. And again, again, again, no legislation or case law has yet determined that copyright holders have a right to demand a license or licenses for LLM training.

There is no need to “demand license”. Because people other than the copyright holder cannot legally use copyrighted content for any purpose. Rather than copyright holders making demands of other people, it’s people who plead with copyright holders for licenses for using content.

MrWilson (profile)

June 12, 2025 at 3:21 pm

Re: Re: Re:⁵

Children don’t learn how to write by reading a copyrighted novel.

I did. Not just one “novel,” but many different books. My classmates did too. Your ignorance of the US education system is revealed again.

It is already bullshit by analogizing machine learning with human learning (this is also debunked by USCO), now it’s more bullshit by suggesting children learn writing by reading copyrighted books.

Except they actually do. This is the weirdest claim. Are you a writer? Have you never learned about composition from reading written material? How did you learn to write English? You definitely read copyrighted content while learning to read English. You’re getting into Terop-level of absurd claims here.

If you had made a more reasonable claim, such as learning with dictionaries and textbooks, it could have made more sense, but no, you didn’t.

Dictionaries and textbooks are copyrighted content! Holy shit, you’re an idiot! Your ignorance is wild. It’s all over the place.

Say that to judges!

Lawyers for the defense already will, as well as specialists they call.

Say that to Disney and Universal!

There’s no reason to try to explain something to people whose livelihood depends on them not understanding it.

It’s bullshit to suggest images can come out of thin air. And EFF’s reasoning is fundamentally flawed.

Nobody suggested images come out of thin air. Again, you don’t understand the technology and you keep demonstrating that fact.

Because Stable Diffusion doesn’t need to store “all” copies!

So you admit it doesn’t store the images and therefore cannot reproduce the images. You just destroyed your own arguments.

Rather, it stores aggregated and “diffused” representations of all training data

Except it doesn’t at all. It stores what it learned through noising and denoising processes. The training data, to be specific, we’re talking about the images it’s trained on, are not represented in the model. A process is what the model actually contains. A process is not a copy of the training data.

and let the model interpolate the images it didn’t store.

That’s not what interpolate means. You can’t interpolate an image you aren’t able to insert and it doesn’t have the original image available after training.

EFF was certainly confused about the “compression” claim. They thought the “compression” only happens on a single image, and not on the aggregated data!

That’s not their stance. Also, “compressing aggregated data” isn’t the same as having compressed copies of all the trained content. Again, again, again, you don’t understand how the technology works. Your claims are useless.

When there are lots of redundancies in the aggregations of image representations, compression can dictate that only one representation is needed to store, even when the representations of objects are visually different among training images.

It seems like you’re quoting something here but I’m not seeing a source.

So what does that mean? There are still copies.

There are no copies! This can’t be said enough. The model does not contain the training data.

But they are highly aggregated and “fused” with other images of the training data.

The training data is not in the model, therefore this cannot be true.

By analyses in the copyright, the model itself becomes the derivative work of the training data. No magic here.

It isn’t. It can’t be. It doesn’t contain the training data. The model contains a process, not an object.

As if claiming you cannot make a “useful” AI without being unethical.

That’s a subjective statement. I’m claiming you’re arguing in favor of unethical media companies who can’t be as wealthy as they are without exploiting poor workers and creators. You don’t seem to have a problem with that, so your self-righteous assertions of ethics are hypocrisy. You’ve just chosen, as you have already admitted, which set of unethical corporations you want to side with, despite it being pointed out that you don’t have to side with either.

There is no “legally for free” zone here. Only “illegal for free” zone.

I literally linked to the image source where you can indeed find the image “legally for free.” You don’t get to declare other people’s chosen licenses to be illegal. And again, if your assertions are true, you are violating copyright by visiting this site. Of course that’s absurd, so your argument is absurd. Browsing the internet and seeing content that copyright holders have intentionally posted to the internet, served up by their servers and the servers of companies they have chosen to host their content on, is not illegal. You should really stop using the internet if you think everything is a copyright violation, otherwise you’re complicit and morally compromised.

You are questioning why people would reproduce “legally for free” images through LLM, but that’s never my question. What I said is people reproduce “illegally, copyrighted” images through LLM, not the “legally for free” bullshit.

But the only example you can provide is one that’s legally available for free! And it’s a terrible example because it’s a blurry failed reproduction from a model nobody actually uses after the researchers wasted millions of attempts and electricity and time to find something vaguely resembling the targeted image just to make the argument that it’s sorta maybe might could be possible.

You’re admitting that you’re using one really bad example to pretend your absurd, paranoid, unrealistic hypothetical scenario is plausible or desired.

There is no need to “demand license”.

You must have a right to license a particular use in order to demand, ask for, request, or expect someone to license your copyrighted material.

There is no right under copyright law for a copyright holder to demand a license for exempted acts, including fair use, but also those that fall under the TEACH Act, which I’m quite certain you’re completely ignorant of, as well as those listed under 17 U.S. Code § 110.

For example, “…the following are not infringements of copyright:…performance of a nondramatic literary or musical work or of a dramatico-musical work of a religious nature, or display of a work, in the course of services at a place of worship or other religious assembly;”

So as a copyright holder, you can’t demand a church license your copyrighted content for them to be able to legally perform or display the work during a church service.

Because people other than the copyright holder cannot legally use copyrighted content for any purpose.

They can legally use copyrighted content for many purposes that don’t require licenses or asking for permission.

Rather than copyright holders making demands of other people, it’s people who plead with copyright holders for licenses for using content.

You don’t have to plead for a license if your use doesn’t require one. That you are ignorant of this fact means, again, for the thousandth time, you don’t know what you’re talking about. You don’t understand US copyright law.

Explorer09 (profile)

June 12, 2025 at 3:59 pm

I did. Not just one “novel,” but many different books. My classmates did too. Your ignorance of the US education system is revealed again.

Except they actually do. This is the weirdest claim. Are you a writer? Have you never learned about composition from reading written material? How did you learn to write English? You definitely read copyrighted content while learning to read English. You’re getting into Terop-level of absurd claims here.

Dictionaries and textbooks are copyrighted content! Holy shit, you’re an idiot! Your ignorance is wild. It’s all over the place.

Three pieces of bullshit together.

Firstly, there are public domain books. Secondly, AI “reads” “millions” of books while no human could have that time to read that many. Thirdly, you cannot legally justify that AI has the right to “learn” even when it can learn like humans.

It stores what it learned through noising and denoising processes.

Emphasis added. Since you claim “what it learned” can be “stored”. “What it learned” is fixed in a tangible medium and such is subject to copyright! You can’t win.

The model contains a process, not an object.

Stored process. Since all computers in our world are stored-program computers, the “process” itself you are referring to is subject to copyright.

I’m claiming you’re arguing in favor of unethical media companies who can’t be as wealthy as they are without exploiting poor workers and creators.

Dismissed again because you failed to list any example of “poor worker or creator”.

But the only example you can provide is one that’s legally available for free! And it’s a terrible example because it’s a blurry failed reproduction from a model nobody actually uses after the researchers wasted millions of attempts and electricity and time to find something vaguely resembling the targeted image just to make the argument that it’s sorta maybe might could be possible.

Well, since there is news that Disney and Universal sued Midjourney just a day ago, why not read their complaint for such illegal examples?

Here you are: https://www.courthousenews.com/wp-content/uploads/2025/06/disney-ai-lawsuit.pdf

There is no right under copyright law for a copyright holder to demand a license for exempted acts, including fair use, but also those that fall under the TEACH Act, which I’m quite certain you’re completely ignorant of, as well as those listed under 17 U.S. Code § 110.

For example, “…the following are not infringements of copyright:…performance of a nondramatic literary or musical work or of a dramatico-musical work of a religious nature, or display of a work, in the course of services at a place of worship or other religious assembly;”

So as a copyright holder, you can’t demand a church license your copyrighted content for them to be able to legally perform or display the work during a church service.

First, your cited section 110 doesn’t cover anything about AI. Second, even your cited example (§110(3)) is flawed, because the limit is on “nondramatic” literary or musical works, or dramatico-musical work “of a religious nature” only. I still have exclusive rights if my works are “drama” and “non-religious”.

Because people other than the copyright holder cannot legally use copyrighted content for any purpose.

They can legally use copyrighted content for many purposes that don’t require licenses or asking for permission.

Are you intentionally misinterpreting my “any” word here?

MrWilson (profile)

June 12, 2025 at 5:12 pm

Re:

Firstly, there are public domain books.

There are indeed. That doesn’t mean I didn’t read copyrighted books to learn to write. This doesn’t refute anything. The fact that I did in fact read copyrighted books to learn to write disproves your claim entirely. I didn’t get permission or obtain a license to learn to write from reading the books I read that were subject to copyright, because I didn’t have to. US copyright law doesn’t require that!

Secondly, AI “reads” “millions” of books while no human could have that time to read that many.

Yes, and this lack of a limitation changes nothing about the legality. If a human could actually read that many that fast, it wouldn’t magically become illegal just because one person was exceptional. The speed at which someone or something can read is not a legal basis for anything. Otherwise, you’d be citing the law that refuted this.

Thirdly, you cannot legally justify that AI has the right to “learn” even when it can learn like humans.

LLMs have no rights. They’re not people. But humans have the right to train LLMs.

Emphasis added. Since you claim “what it learned” can be “stored”. “What it learned” is fixed in a tangible medium

What it learned is a process of its own actions, not a copyrighted work.

and such is subject to copyright! You can’t win.

No, it isn’t subject to copyright. The learned process is a machine process not of human authorship. It doesn’t qualify for copyright. Fixing something in a fixed medium doesn’t make it subject to copyright. That’s a requirement for copyright, not a proof that something is copyrighted. Machine-generated code is not subject to copyright. The US Copyright Office that you keep citing will tell you this.

Stored process. Since all computers in our world are stored-program computers, the “process” itself you are referring to is subject to copyright.

First, this is wrong. Second, it’s terrible logic. Computers storing programs isn’t a basis for them being subject to copyright. You can’t cite a law or case law that says that anything stored on a computer is subject to copyright by that fact. That would wipe out the public domain status for any public domain work that gets digitized or originally existed in a digital format.

Dismissed again because you failed to list any example of “poor worker or creator”.

I already listed an example. If you were an American citizen, I’d list you. I’ll list myself since I’m not a millionaire. You want someone else? Len Kaminski.

Now, what is the great response you were hoarding while waiting for a name? What brilliant reasoning will refute this claim because you can’t fathom the existence of poor people in the US?

Well, since there is news that Disney and Universal sued Midjourney just a day ago, why not read their complaint for such illegal examples?

Nope. You’re goalpost shifting. You said that one example was proof. I proved it wasn’t. Either you admit you were wrong or you prove that example was actually a violation of copyright and explain why it would lead to the scenario you claimed. Otherwise, your claim is completely dismissed, as you are so wont to do with far less logic.

First, your cited section 110 doesn’t cover anything about AI.

We’re discussing copyright. Holy fuck! We’re discussing US copyright law, you dumb motherfucker!!! How dense can you be? The use being for training an LLM doesn’t magically change copyright law. You’ve been arguing about US copyright law. You’re goalpost shifting again!

Second, even your cited example (§110(3)) is flawed, because the limit is on “nondramatic” literary or musical works, or dramatico-musical work “of a religious nature” only. I still have exclusive rights if my works are “drama” and “non-religious”.

That’s not refuting the example. The example is to show that there are exemptions to copyright. And that’s just one example. It’s not claiming all uses are exemptions of copyright law.

Are you intentionally misinterpreting my “any” word here?

No, I’m intentionally pointing out that there are some, in fact, there are many uses that don’t require licenses or asking for permission. You’re claiming that all uses require permission. Copyright law itself says that claim is false. You don’t understand US copyright law and as you’ve revealed in this post, you don’t even understand we’re talking about US copyright law!

Explorer09 (profile)

June 13, 2025 at 4:12 am

I didn’t get permission or obtain a license to learn to write from reading the books I read that were subject to copyright, because I didn’t have to. US copyright law doesn’t require that!

Which section of the U.S. copyright law?

If a human could actually read that many that fast, it wouldn’t magically become illegal just because one person was exceptional.

The debunk is about AI reading the same way as humans, not about the legality of “reading”. It’s that anthropomorphizing machines won’t work.

But humans have the right to train LLMs.

Which law? Especially on the copyright of that. Cite a clause in the U.S. copyright law.

The learned process is a machine process not of human authorship. It doesn’t qualify for copyright. Fixing something in a fixed medium doesn’t make it subject to copyright. That’s a requirement for copyright, not a proof that something is copyrighted. Machine-generated code is not subject to copyright. The US Copyright Office that you keep citing will tell you this.

You are taking out parts when you make the statement here. First, whether a work is subject to copyright is independent of whether it’s a machine that made it or a human made it. Machine-generated stuff (either the model or the LLM output) can be simultaneously uncopyrightable and infringing someone else’s copyright. That’s the key point of many copyright lawsuits about AI.

You can’t cite a law or case law that says that anything stored on a computer is subject to copyright by that fact. That would wipe out the public domain status for any public domain work that gets digitized or originally existed in a digital format.

Are you intentionally misinterpreting my “any” word here?

No, I’m intentionally pointing out that there are some, in fact, there are many uses that don’t require licenses or asking for permission. You’re claiming that all uses require permission.

I will reply to these two comments together, because when you are suggesting there are “some” uses that don’t require licenses blah-blah-blah, you then made the claim of “you can’t cite a law or case law that says that anything stored on a computer is subject to copyright”, misusing the word “any” here. What a hypocrite you are.

If I replace your “any” with “every”, then the claim is true, but with “any”, you instead suggest software code is not copyrightable, and then what the fuck would many lawsuits about software copyright be about?

If you were an American citizen, I’d list you. I’ll list myself since I’m not a millionaire. You want someone else? Len Kaminski.

Now, what is the great response you were hoarding while waiting for a name? What brilliant reasoning will refute this claim because you can’t fathom the existence of poor people in the US?

I’m not challenging your ability to name any “poor person” according to your definition, I’m challenging your ability to represent the “poor people” class you defined.

You cite Len Kaminski, and so I am fucking serious to ask this question, did he support piracy like you do? (I say Len Kaminski, a comic book writer, notable with some Iron Man works. https://marvel.fandom.com/wiki/Len_Kaminski )

MrWilson (profile)

June 13, 2025 at 12:13 pm

Re:

Which section of the U.S. copyright law?

The doctrine of first sale, fair use (107), and since 2002, the TEACH Act. And specifically, 106 doesn’t give copyright holders the right to restrict learning from copyrighted material. You seem to assume that US copyright law gives copyright owners infinite rights over their material. It doesn’t. You should avail yourself of the vast resources on the internet that are available legally for free to educate yourself on the topic better before trying to argue with others about it.

The debunk is about AI reading the same as humans, not about the legality of “reading”. It’s that anthropomorphizing machines won’t work.

Nobody is anthropomorphizing machines. Not being human doesn’t stop machines from learning. The learning process is not the same but is analogous. The analogies are made because you don’t understand the process and it tends to make it easier to understand if you compare it to something similar. But you’re intentionally being obtuse or at best, you’re just dense. Neither qualifies your perspective as useful, much less authoritative on the subject matter.

Which law? Especially on the copyright of that. Cite a clause in the U.S. copyright law.

The 10th Amendment of the Constitution, which supersedes US copyright law.

First, whether a work is subject to copyright is independent of whether it’s a machine that made it or a human made it.

Fuck no. You are absolutely wrong. And this point should have been a gimme. How did you not know this? You again prove you don’t know what you’re talking about yet you have the gall to pretend you can educate Americans on their own laws.

https://www.reuters.com/world/us/us-appeals-court-rejects-copyrights-ai-generated-art-lacking-human-creator-2025-03-18/

“The Human Authorship Requirement
The U.S. Copyright Office will register an original work of authorship, provided that the work was created by a human being. The copyright law only protects “the fruits of intellectual labor” that “are founded in the creative powers of the mind.” Trade-Mark Cases, 100 U.S. 82, 94 (1879). Because copyright law is limited to “original intellectual conceptions of the author,” the Office will refuse to register a claim if it determines that a human being did not create the work. Burrow-Giles Lithographic Co. v. Sarony, 111 U.S. 53, 58 (1884).”

https://www.copyright.gov/comp3/chap300/ch300-copyrightable-authorship.pdf

Your ignorance on this simple and essential aspect of US copyright law should, again, say it all. You don’t have a fucking clue what you’re talking about.

Machine-generated stuff (either the model or the LLM output) can be simultaneously uncopyrightable and infringing someone else’s copyright.

False. The model can’t be because it doesn’t include any of the trained data. You’re conflating the two.

That’s the key point of many copyright lawsuits about AI.

That’s a key argument that hasn’t been determined by a court of law definitively in a precedent that influences all of US copyright law.

I will reply to these two comments together, because when you are suggesting there are “some” uses that don’t require licenses blah-blah-blah,

Some uses don’t. That’s a factual statement.

you then made the claim of “you can’t cite a law or case law that says that anything stored on a computer is subject to copyright”,

You’ve omitted the key qualifying phrase in that statement, which makes your take entirely dishonest. What I said was “You can’t cite a law or case law that says that anything stored on a computer is subject to copyright by that fact.” You left out “by that fact,” meaning that the fact that something is stored on a computer doesn’t make it subject to copyright protection. That was the meaning that you’re either too stupid to comprehend or at worst, intentionally trying to twist. Indeed, what a hypocrite you are.

If I replace your “any” with “every”, then the claim is true,

No, it isn’t.

but with “any”, you instead suggest software code is not copyrightable,

No, I didn’t. You’re just revealing you didn’t understand what I said, yet again. I keep telling you that you’re not understanding what I’m saying and you just keep responding to things I didn’t say. How many times have I said this in this comment section? Of course software code is copyrightable. Quote me where I specifically said “software code is not copyrightable.” You can’t. I said that just because something is stored on a computer doesn’t mean that it is copyrightable, which is what you claimed. There are other requirements for things to be copyrightable beyond being stored in a fixed medium. One of which is human authorship, which I have already pointed out. Your ignorance of copyright law is astounding in direct correlation to how much you try to expound upon it.

I’m not challenging your ability to name any “poor person” according to your definition, I’m challenging your ability to represent the “poor people” class you defined.

I never purported to represent them. Quote me where I said I represent them! Otherwise, your argument is dismissed, as you are so wont to do.

You cite Len Kaminski, and so I am fucking serious to ask this question, did he support piracy like you do?

I haven’t supported piracy, so that’s a straw man unto itself. Len can represent himself, but I don’t recommend trying to harass him over your issues. He hasn’t been in good health recently. But he is a poor US citizen and a creator who has been fucked over by the big media companies you’re championing, which is another reason you shouldn’t bother him. You’re arguing in favor of the people who are responsible for his current state.

I love that you’re going down these rabbit holes based entirely on your own straw men.

I never asserted I represent others, meanwhile you asserted that you represented my interests. I never supported piracy, yet you’ve asserted that legally accessing freely available, permissively licensed content is piracy and a copyright violation. Your entire argument is a giant bundle of arrogant ignorance and straw men.

Let’s review: I’m a rightsholder. I disagree with your siding with Big Media corporations. Poor US citizens and creators exist. You don’t represent me. You don’t represent any of them. You don’t understand US copyright law. You don’t understand the US Constitution. You don’t understand LLM technology. You have proven nothing that you’ve claimed.

Explorer09 (profile)

June 13, 2025 at 1:55 pm

Re: Re:

I didn’t get permission or obtain a license to learn to write from reading the books I read that were subject to copyright, because I didn’t have to. US copyright law doesn’t require that!

The doctrine of first sale, fair use (107)

How do the four factors of 17 U.S. Code § 107 grant you permission on that?

the TEACH Act

Which section and clause?

106 doesn’t give copyright holders the right to restrict learning from copyrighted material.

It’s not restricting AI “learning”, it’s restricting AI reproduction of copyrighted work under the guise of “learning”. See § 106(1) and § 106(2), the exclusive rights to reproduction and derivative works.

In case you didn’t get it, here’s a hypothetical example:
A group of students learn to write horror fiction by reading novels by Stephen King, and through learning, the students are asked by their teacher to write book reports or “notes” documenting how a novel in King’s style can be made (so they’re not just reviews or commentaries that are “safe harbors” in fair use judgements), and then these “notes for creating works in Stephen King’s style” later got published and sold commercially. How do people claim that these students’ “learning” is fair use?

Here, the claim that AI doesn’t store any piece of the work won’t help. (1) The “notes” are “fixed in tangible media” and thus subject to copyright. (2) Even though the works are not present in the final use, the “notes” here are under derivative work category and “learning” is not an excuse for this one being fair. (3) The mention of specific author name in these “notes”, means that author can accuse trademark infringements in addition to copyright infringements. For now in court, the “styles can’t be copyrighted” defense cannot apply to this (Andersen v. Stability AI). So, how do you answer?

You cite Len Kaminski, and so I am fucking serious to ask this question, did he support piracy like you do?

I haven’t supported piracy, so that’s a straw man unto itself. Len can represent himself, but I don’t recommend trying to harass him over your issues. He hasn’t been in good health recently. But he is a poor US citizen and a creator who has been fucked over by the big media companies you’re championing, which is another reason you shouldn’t bother him. You’re arguing in favor of the people who are responsible for his current state.

I’m not saying I would harass anyone. I was saying if you cannot bring any person here as a direct witness, then your claim of me hurting “poor people” is unfounded and I dismiss it. You carry the burden of proof of your claim, not me.

I never supported piracy, yet you’ve asserted that legally accessing freely available, permissively licensed content is piracy and a copyright violation. Your entire argument is a giant bundle of arrogant ignorance and straw men.

You support piracy of AI companies under the guise of “machine learning”!!! While I asked why the fuck can’t AI companies train with only public domain content, you dodged the question and then simply insist that machine learning is legal (even when the source of training data is pirated)! That’s your stance, damnit. It cannot be any clearer.

If you really focus on “permissively licensed content” for training, then you should condemn the AI companies doing illegal actions. Otherwise the stance you claimed you are is not the same as what you behave.

Poor US citizens and creators exist.

Dismissed. (Even when poor citizens and creators exist, you can’t represent them. No witness showing up, so this argument is moot even when what you say is a fact.)

MrWilson (profile)

June 13, 2025 at 5:08 pm

Re: Re: Re:

How do the four factors of 17 U.S. Code § 107 grant you permission on that?

First, let’s cover the fact that you’re actively questioning the legality of humans learning to write under copyright law you don’t understand. This is absurd. Are you going to claim that eating food is a trademark violation next? This ignorance is appalling. If you need this explained to you, you are admitting you don’t understand US copyright law whole cloth.

Second, fair use would cover it, but 106 not reserving the right to copyright holders is all the justification you’d need.

But a four factor analysis would weigh easily in favor of reading to learn to write because the use is transformative (turning the reading of the content into pattern recognition of language composition in an intangible, non-fixed human brain medium) and the fact that the use wouldn’t affect the market for the original. Nobody refuses to buy a book from an author because a kid somewhere has read it before.

Which section and clause?

Teachers can make copies of works for educational purposes. Section 110(2) of the U.S. Copyright Act.

It’s not restricting AI “learning”, it’s restricting AI reproduction of copyrighted work under the guise of “learning”. See § 106(1) and § 106(2), the exclusive rights to reproduction and derivative works.

Quote where 106 says rights holders have the right to demand licenses for LLM training. You can’t, because, as we’ve already said, this hasn’t actually been determined by law or case law.

In case you didn’t get it, here’s a hypothetical example:
A group of students learn to write horror fiction by reading novels by Stephen King, and through learning, the students are asked by their teacher to write book reports or “notes” documenting how a novel in King’s style can be made (so they’re not just reviews or commentaries that are “safe harbors” in fair use judgements), and then these “notes for creating works in Stephen King’s style” later got published and sold commercially. How do people claim that these students’ “learning” is fair use?

This is perfectly legal. The notes are composed of the students’ pattern recognition, learning, analysis, reviews, and commentaries about Stephen King’s writing style. And here’s the goddamn kicker: Stephen King wrote a whole fucking book called “On Writing: A Memoir of the Craft” in order to teach writers how to write like him! Of all the writers you could choose to make an example out of, you chose someone who literally teaches others to write. I couldn’t have planned this so well if I wanted to. You created your own trap for yourself. But even if Stephen King hadn’t expressly put effort into teaching others to write, it would still be fair use to write notes about how he composes. US copyright law notably only covers a specific expression, not a writing style. Writing style falls under ideas, procedures, methods, systems, processes, concept, etc. which aren’t covered.

Copyright does not protect
• Ideas, procedures, methods, systems, processes, concepts, principles, or discoveries
• Works that are not fixed in a tangible form (such as a choreographic work that has not been notated or recorded or an improvisational speech that has not been written down)
• Titles, names, short phrases, and slogans
• Familiar symbols or designs
• Mere variations of typographic ornamentation, lettering, or coloring
• Mere listings of ingredients or contents

https://www.copyright.gov/circs/circ01.pdf
https://www.copyright.gov/circs/circ33.pdf

Here, the claim that AI doesn’t store any piece of the work won’t help. (1) The “notes” are “fixed in tangible media” and thus subject to copyright.

The notes are the creation of the reader, not the writer. If a human wrote the notes, that human owns the copyright, not the writer whose works are being commented on. And notes about how a writer writes do definitely fall under commentaries. And again, you’re confused about the fixed medium requirement. Being fixed in a tangible medium is a requirement if you want something to be copyrighted. It does not at all in any way mean that everything fixed in a tangible medium is subject to copyright. Public domain works are fixed in a tangible medium. Government works that don’t qualify for copyright are fixed in a tangible medium. Titles, names, short phrases, and slogans are fixed in a tangible medium and do not qualify for copyright. This has already been explained to you. You’re not learning a goddamn thing from this discussion.

(2) Even though the works are not present in the final use, the “notes” here are under derivative work category and “learning” is not an excuse for this one being fair.

The notes aren’t a derivative work because they don’t contain the original work and if they contain any part of the original work, that would be a de minimis use and the rest of the work that contains the commenter’s own thoughts would still qualify for its own copyright. They just wouldn’t own the copyright on any quoted parts written by the original writer.

(3) The mention of specific author name in these “notes”, means that author can accuse trademark infringements in addition to copyright infringements.

As long as the writer of the notes doesn’t purport to have the participation of the original writer, this wouldn’t be a trademark infringement. You could accuse all you like.

To be clear, again, you are ignorant of the US education system. If your theories of copyright were correct, almost all learning in public and private educational institutions would be illegal. Students would be violating copyright to write a book report or an essay on a work. This isn’t the case and you can’t point to any court cases in which this has been adjudicated. If you were right, you could. CITE YOUR PROOF OF THIS CLAIM.

For now in court, the “styles can’t be copyrighted” defense cannot apply to this (Andersen v. Stability AI). So, how do you answer?

It absolutely can. Andersen v. Stability AI hasn’t been decided at all. It’s not even set to begin for another year. You can’t claim a precedent from a case that hasn’t even started. And even when it’s decided, it won’t necessarily set a precedent in all US jurisdictions. You don’t understand how US law or US courts work either.

I’m not saying I would harass anyone. I was saying if you cannot bring any person here as a direct witness, then your claim of me hurting “poor people” is unfounded and I dismiss it.

You seem to be confused about the claim. You do realize people can have their rights violated and not even know it, correct? For example, I could bribe politicians to pass a law to take away a right that US citizens currently have and they may not be aware that I’ve taken it from them. Laws are complex and often hundreds of pages long. Many people (such as you) don’t take the time to learn what all is included in them. You’re demanding a subjective perspective from a person who may not know anything about the topic. This is such a useless counter-argument. At best you could say some of the people wouldn’t care about losing a right they weren’t planning on using, but that’s not the same thing.

Large media corporations have already hurt US citizens by bribing politicians and expanding copyright far beyond its original intent and purpose. They have already deprived US citizens of their rights. They have already subverted the democratic process and disenfranchised voters’ rights.

And you’re defending them. You are actually cheerleading for the further harm to US citizens.

You carry the burden of proof of your claim, not me.

It’s simple and I’ve already said it multiple times. “If corporations lose cases and the result is a legal precedent that all training requires financial compensation, poor people will not be able to afford to train LLMs and therefore only wealthy corporations will be able to.” The large AI corporations can pay for licenses if they lose. Poor people won’t be able to. So we’ll only have big, expensive, powerful AI companies licensing their AI to the government, to educational institutions. The future of our society will be controlled by the LLMs trained by the wealthy. You’re functionally just saying you want your bribe before they make my society worse.

You support piracy of AI companies under the guise of “machine learning”!!!

LLM training isn’t itself piracy or a copyright violation. CITE A COURT CASE OR A LAW THAT SAID EXACTLY THAT.

While I asked why the fuck can’t AI companies train with only public domain content, you dodged the question

I didn’t. I pointed out that someone has done so and the results were mediocre. It’s also not a question you can demand I answer. I’m not saying they can or can’t. Take it up with them.

and then simply insist that machine learning is legal

You can’t point to a law or case law that says it’s illegal. If it were illegal, then software dating back to at least the 70s or earlier would have been illegal. Algorithms would be copyright violations. Search engines would be copyright violations.

(even when the source of training data is pirated)!

I didn’t say this at all. I will say again QUOTE ME WHERE I SAID THAT. Your continuing claims while failing to actually quote anything I’m saying that actually does what you claim proves you’re wrong. You’re arguing with straw men.

That’s your stance, damnit. It cannot be any clearer.

That’s not my stance. You keep making up fake things I haven’t said instead of just quoting me. You say things are clear while you demonstrate multiple times that you don’t understand much of anything you’re saying.

If you really focus on “permissively licensed content” for training, then you should condemn the AI companies doing illegal actions.

You will note at the very beginning of this that I am not siding with the AI companies. I literally said you don’t have to side with or favor any large corporations and you insisted that you must. I haven’t and don’t support them. You have asserted I do based on your own false dilemma that a side must be chosen.

Otherwise the stance you claimed you are is not the same as what you behave.

I don’t “behave” at all. The stance I claim is all there is here. I don’t work for a large AI company. I’m not training their LLMs or downloading copyrighted content for their training purposes. You seem to assume I’m doing more than discussing the topic. That’s a really weird assertion. It’s almost as if you’ve propped me up as a straw man for arguments you wish to have with people who aren’t here.

Dismissed. (Even when poor citizens and creators exist, you can’t represent them.

I have specifically not purported to represent them, though you have purported to represent my interests, which makes you a hypocrite.

No witness showing up, so this argument is moot even when what you say is a fact.)

You didn’t ask for a witness. You asked for a name, which is not the same. And asking for a name or a witness is irrelevant. This isn’t a court of law, dumb shit. And you’re not a lawyer.

Explorer09 (profile)

June 13, 2025 at 9:53 pm

Re: Re: Re:²

First, let’s cover the fact that you’re actively questioning the legality of humans learning to write under copyright law you don’t understand. This is absurd. Are you going to claim that eating food is a trademark violation next? This ignorance is appalling. If you need this explained to you, you are admitting you don’t understand US copyright law whole cloth.

I intentionally question this so that you can explain your theory about how human learning fits the four fair use factors. Not because I believe that human learning is non-infringing, but because your “fair use” analysis on this simple case might be backed by any court ruling. (And if so, your claim about AI training being fair use would become flawed theory as well.)

But a four factor analysis would weigh easily in favor of reading to learn to write because the use is transformative (turning the reading of the content into pattern recognition of language composition in an intangible, non-fixed human brain medium) and the fact that the use wouldn’t affect the market for the original.

I would say this answer of yours might be wrong. But I am not certain that mine is right (you should ask a lawyer to confirm my answer). Anyway:

According to 17 U.S. Code § 101, the definition of a “copy” required the work to be “fixed” in a “material object”. The memories in human brain cells are not “fixed” in a strict sense, and so humans memorising stuff is not considered “reproduction” in the copyright law, and as a result, there is no need to consider fair use factors.

Nobody refuses to buy a book from an author because a kid somewhere has read it before.

Hold on, there are people who do choose not to buy a book if they had read it sometime before. E.g. they may have borrowed the book from someone (assuming that book is a legal copy). This can affect the fourth factor i.e. book sales, on the fair use. If I didn’t argue that memorization in brain cells is “not fixed” and thus “not a copy”…

Teachers can make copies of works for educational purposes. Section 110(2) of the U.S. Copyright Act.

§ 110 (2)(A), this exemption applies to “a governmental body or an accredited nonprofit educational institution” only. Nothing to do with AI training, and if you are a for-profit corporation, this section never applies.

Again, it’s to show many copyright exemptions to human learning don’t apply to AI training, and the analogy between the two is fundamentally flawed in the legal sense. The layman’s words for this is unlike humans, AIs don’t have rights to learn.

[H]ere’s the goddamn kicker: Stephen King wrote a whole fucking book called “On Writing: A Memoir of the Craft” in order to teach writers how to write like him!

Except that the cited Stephen King’s instructional book is not part of this example! It’s about how students pretend to be Stephen King and write a book about how to write like Stephen King without King’s permission!

If the students write generic instructional books about how to write without any mentions of King (or merely comments about how King inspired them), then the case would be much easier because then the students’ works would have no direct extraction of King’s works (or too minimal to bring copyright concerns). Even if there is reference or extraction of King’s works, the results would be “transformative” enough to not look like Stephen King’s works or their derivatives at all.

The notes aren’t a derivative work because they don’t contain the original work and if they contain any part of the original work, that would be a de minimis use and the rest of the work that contains the commenter’s own thoughts would still qualify for its own copyright. They just wouldn’t own the copyright on any quoted parts written by the original writer.

Emphasis added. Because I didn’t assume in this hypothetical scenario the quotes are de minimis use. And for substantial quotes the question comes in: The original writer didn’t give license to you to quote substantial portions of his book. And how would you justify fair use on that? Especially when the students’ books are later sold commercially.

If your theories of copyright were correct, almost all learning in public and private educational institutions would be illegal.

No, this is not my claim. The claim is about students selling books telling readers how to write “in Stephen King’s style” without permission from Stephen King. And the students also quote substantial content of Stephen King’s work without permission.

Note that since you cited that Stephen King also wrote instructional books, this means direct market competition with King’s instructional book, which could affect the factor four analysis in “fair use”.

Students would be violating copyright to write a book report or an essay on a work.

I didn’t deny the legality of commentaries.

Large media corporations have already hurt US citizens by bribing politicians and expanding copyright far beyond its original intent and purpose.

What purpose?

They have already deprived US citizens of their rights. They have already subverted the democratic process and disenfranchised voters’ rights.

Which rights are specifically deprived?

“If corporations lose cases and the result is a legal precedent that all training requires financial compensation, poor people will not be able to afford to train LLMs and therefore only wealthy corporations will be able to.” The large AI corporations can pay for licenses if they lose. Poor people won’t be able to. So we’ll only have big, expensive, powerful AI companies licensing their AI to the government, to educational institutions. The future of our society will be controlled by the LLMs trained by the wealthy. You’re functionally just saying you want your bribe before they make my society worse.

Before I request a proof of how that could happen, I would ask another thing: why do you have to rely on the LLM for everything, as if people cannot work without LLMs in all industries? There’s one thing I need to remind you of: you only need a pen, not an LLM, to write, and a brush and ink, not an LLM, to draw. Why the heck does the entire world depend on a technology that is more harmful to the environment than what came before (in terms of energy waste and air pollution)?

That’s part of the reason I don’t buy your argument about “[copyright/creative monopoly] making the society worse”: AI is already worse for the environment.

By the way, I dismiss your “poor people” claim here again.

Explorer09 (profile)

June 13, 2025 at 10:01 pm

Re: Re: Re:³

For environmental impact on AI, see also these:

https://www.unep.org/news-and-stories/story/ai-has-environmental-problem-heres-what-world-can-do-about

https://www.caltech.edu/about/news/air-pollution-and-the-public-health-costs-of-ai

MrWilson (profile)

June 18, 2025 at 12:22 am

Re: Re: Re:³

I intentionally question this so that you can explain your theory about how human learning fits the four fair use factors.

Again, it doesn’t need to be fair use. There is nothing in 106 that gives copyright holders the right to demand a license for people to learn to write (or draw or sing or play guitar) from their copyrighted works. There are no cases where this is disputed. It’s hard to find anything on the topic because it’s so legal that nobody is debating it because nobody with the least amount of understanding considers it a violation of copyright. Which is why it’s entirely absurd for you to even entertain the idea that it might not be legal.

I would say this answer of yours might be wrong. But I am not certain that mine is right (you should ask a lawyer to confirm my answer).

You’re ignoring the evidence that elementary schools are not daily being sued for copyright infringement for teaching American children how to write. I would say your answer is absolutely wrong because there’s no proof whatsoever and there would be plenty if you were right.

According to 17 U.S. Code § 101, the definition of a “copy” required the work to be “fixed” in a “material object”.

The memories in human brain cells are not “fixed” in a strict sense, and so humans memorising stuff is not considered “reproduction” in the copyright law, and as a result, there is no need to consider fair use factors.

That’s what I said.

Hold on, there are people who do choose not to buy a book if they had read it sometime before. E.g. they may have borrowed the book from someone (assuming that book is a legal copy). This can affect the fourth factor i.e. book sales, on the fair use. If I didn’t argue that memorization in brain cells is “not fixed” and thus “not a copy”…

You changed what I said. I said nobody refuses to buy a book because a kid somewhere has read it before. I didn’t say the kid who had read it already is the person who refuses to buy it. A random person in Seattle won’t refuse to buy a book in a bookstore in Seattle because some random kid (completely different person) read the same book in elementary school in Florida. A child reading a book and learning to write from it doesn’t affect the market for the book.

§ 110 (2)(A), this exemption applies to “a governmental body or an accredited nonprofit educational institution” only. Nothing to do with AI training, and if you are a for-profit corporation, this section never applies.

We were discussing children learning to write from reading. Now you’re goalpost shifting again. But, as it stands, I have already pointed out that I’m concerned about the ability of researchers (including employees and students of accredited nonprofit institutions) to train LLMs for research purposes. So it definitely applies. And you have demonstrated your ignorance of how education works in the US, so your assertions on the matter are useless.

Again, it’s to show many copyright exemptions to human learning don’t apply to AI training, and the analogy between the two is fundamentally flawed in the legal sense.

Except you’ve demonstrated you didn’t understand US copyright law even as much as it pertains to humans learning, so your analysis is fundamentally flawed in the legal sense.

The layman’s words for this is unlike humans, AIs don’t have rights to learn.

But humans have the right to train machines.

Except that the cited Stephen King’s instructional book is not part of this example!

You don’t get to pretend that students won’t have access to or knowledge of the fact that Stephen King wrote such a book. And since your scenario is that a teacher is asking students to study Stephen King’s writing, the teacher would most definitely avail the students of the knowledge King specifically offers regarding his writing in his book on writing. You can’t just exclude it because it completely destroys the premise and conclusion of your argument. That’s intellectually dishonest. You’re handling the fact that you fucked up by using Stephen King as an example pretty poorly.

It’s about how students pretend to be Stephen King

You’re changing the scenario again. Pretending to be Stephen King would be fraud.

and write a book about how to write like Stephen King without King’s permission!

Stephen King already wrote that book. Why would the students try to write it? That doesn’t make any sense. However, it wouldn’t be illegal since writing style isn’t copyrightable in the US. You might have to write a disclaimer stating that your book has not been approved or endorsed by Stephen King, the way many biographers do when writing an unauthorized biography about a famous person.

If the students write generic instructional books about how to write without any mentions of King (or merely comments about how King inspired them), then the case would be much easier because then the students’ works would have no direct extraction of King’s works (or too minimal to bring copyright concerns). Even if there is reference or extraction of King’s works, the results would be “transformative” enough to not look like Stephen King’s works or their derivatives at all.

You’re retreating here. This directly contradicts your previous assertions that this conduct was a copyright violation.

Emphasis added. Because I didn’t assume in this hypothetical scenario the quotes are de minimis use.

So you assumed copyright violations from the start, so the entire scenario is tainted by your negative perception of anyone who might be pursuing a legal approach to the topic.

And for substantial quotes the question comes in: The original writer didn’t give license to you to quote substantial portions of his book.

You don’t need a license to quote authors if it isn’t substantial. That’s literally the commentary exception. You’re just magically assuming it would be in quantity enough to violate copyright. You’ve decided to cripple this hypothetical author by insisting that they do something that is likely a copyright violation so that you can claim it is in fact a copyright violation. You’re using circular logic.

And how would you justify fair use on that? Especially when the students books are later sold commercially.

They wouldn’t use large portions of King’s work. If they didn’t self-publish, their editor and publisher wouldn’t let them include such large portions.

But this is all a moot point because you’re pretending this is an analogy about how LLM training works, except it’s not because there is less than de minimis use of the copyrighted works in the model itself. There is no copy of the works in the model. The correct analogy for LLM training is that a person read many copyrighted works, their brain used their analysis skills and pattern recognition to retain the process of composing phrases and sentences and paragraphs and chapters, and they then use those patterns they recognized to compose completely new sentences, paragraphs, chapters, and whole works. This is literally how I learned to write and how I have refined my writing skills over decades. I don’t include the sentences of the writers from whom I learned to write in my writing. But I have used pattern recognition to refine processes for composing good sentences and paragraphs, to compose effective prose and storytelling content.

MrWilson (profile)

June 18, 2025 at 12:23 am

Re: Re: Re:³

No, this is not my claim.

But this is the natural conclusion of your claim. You’re saying that learning is a copyright violation if that learning is recorded in a fixed medium. Except the learning is neither the original work nor a derivative work.

Here’s an example:

Author Joe Smith writes this paragraph.

The woman told him that he would encounter death if he chose to walk into the garden, but he walked in anyway. In the garden, he encountered a cloaked figure who spoke with a supernatural resonance in his voice and who knew the man’s fate. The man realized that this was Death whom he had encountered in the garden, just as the woman said.

And a student reading this passage might write notes like: “Play on words – ‘encounter death’ can mean to die, but it could also mean ‘encounter the personification of death.’ Use play on words in the future to seem clever in your writing.”

The student references only the phrase “encounter death” from the original work. It’s de minimis and the phrase itself doesn’t qualify for copyright protection itself since it’s so generic and common. It would only be copyrighted in the context of the greater work. The student learned a process about writing, specifically in this scenario about using word play to surprise the reader. This is not at all in any fashion a copyright violation.

The claim is about students selling books telling readers how to write “in Stephen King’s style” without permission from Stephen King.

This isn’t itself illegal. Styles aren’t subject to copyright protection.

And the students also quote substantial content of Stephen King’s work without permission.

This is different. You’ve included this to make it illegal. But it’s not even analogous to LLM learning. LLMs do not quote or contain the original works they’re trained on.

Note that, as you pointed out that Stephen King also wrote instructional books, this means direct market competition with King’s instructional book, which could affect the factor four analysis in “fair use”.

A random student who isn’t a bestseller author publishing a book about Stephen King’s writing style will not compete with Stephen King’s own book on his writing style.

I didn’t deny the legality of commentaries.

But you denied that writing a book report or essay on Stephen King’s works was commentary. Even publishing a book on Stephen King’s writing style would be considered commentary.

What purpose?

Article I, Section 8, Clause 8: To promote the Progress of Science and useful Arts

Which rights are specifically deprived?

They expanded copyright beyond its original term such that it is now the life of the author plus 70 years, which has deprived the public of the use of the public domain works within their own lifetime. If I die before 70, I will be dead before a work written during my lifetime is available in the public domain. But the lobbying and bribery by media companies has also bought corrupt politicians and deprived citizens of their right to representation and the right of redress of their government. When the legislature writes laws for the wealthy, they are not representing the people or upholding the Constitution. The normalization of this corruption and bribery opens the door for more big corporations and “campaign donors” to further subvert democracy and representation.

This deprives US citizens of some of the most fundamental constitutional rights since it affects free speech, free press, the right to petition the government, and the right to representation within Congress, among others. And this corruption affects other rights since wealthy corporations can support candidates who support creating more rights for wealthy corporations and those candidates may also violate other civil and human rights. Some media companies have donated to Donald Trump and he’s violated uncountable human and civil rights, so they are at best indirectly supporting those violations.

Before I request a proof of how that could happen, I would ask another thing? Why do you have to rely everything on the LLM, as if people cannot work without LLM in all industries?

I’m not relying on LLMs. I’m recognizing the fact that corporate leaders are only too eager to embrace “AI” for almost anything. Some educational institutions are embracing its use. Many corporations are hiring “prompt engineers.” Customer service representatives are being replaced by LLM chatbots. You have to be blind not to see the inevitability that the wealthy and the corporate leaders are going to force more and more AI on everyone. The Trump administration is already endorsing its use in the analysis of government systems in the disingenuous search for programs to cut funding for. Demanding licenses for training only means the AI companies just pay the Big Media companies an amount, but the LLMs are going to continue to be used. These court battles won’t kill AI even if the outcomes are negative for the AI companies. If the outcomes are negative, deals will be struck. It’s just a matter of how much money gets spread around to already wealthy people. But even if Big Media wins these lawsuits, the creators won’t see a significant windfall at all. The Big Media companies already use contracts to minimize how much they have to share of their profits with the creators who actually create the works they profit from.

There’s one thing I need to remind you: You only need a pen, and not LLM, to write, and a brush and ink, not LLM, to draw.

This statement reveals your fundamental misunderstanding of my position. I don’t use LLMs to write. I don’t use LLMs to create graphic designs. I don’t use LLMs to draw. I publish works that are my own efforts. I’m just not an idiot who can’t see that it’s inevitable that these LLMs will be pushed on the American public more and more and I’d prefer the LLMs that get the contracts for government and public services were trained by ethical researchers instead of major for-profit corporations who just happen to be the only ones wealthy enough to license enough content to train an effective LLM. When I’m old and needing greater amounts of health care, I want the LLM that gets assigned to my case not be programmed to maximize profits but rather to take my health into greatest consideration.

Why the heck did the entire world depend on a technology that is more harmful to environment than before (in terms of energy waste, and air pollution)?

Because the wealthy don’t care about the environment. Big media companies rely on giant environmentally-unfriendly server farms to stream their licensed content over and over to the same people because they don’t want their precious IP cached and replayable. You’re pretending AI companies are the only destructive corporations. The wealthy own stock in a wide variety of environment-destroying and exploiting operations.

That’s part of the reason I don’t buy your argument about “[copyright/creative monopoly] making the society worse”. As AI is already worse for the environment already.

Or, as I already said, wealthy corporations are all the same, regardless of which side they fall on for this particular topic and you don’t have to side with either of them as you will never be one of them. You’re at best a useful idiot for the big media companies. But…and here’s the kicker…the stock of the big media companies and the stock of the big AI companies can be owned by the same people. You’re not actually picking a side when the wealthy can own parts of everything.

By the way, I dismiss your “poor people” claim here again.

Yeah, you don’t give a damn about the poor people who will have your preferred licensed AI companies’ LLMs forced upon them. We’ve already established that for any possible reference you make to an ethical position or concern about the environment, you’re willing to throw that out the window in favor of big media corporations and their profits, which exist in the first place at the expense of the creators and poor people.

Explorer09 (profile)

June 19, 2025 at 11:00 am

Re: Re: Re:⁴

You’re saying that learning is a copyright violation if that learning is recorded in a fixed medium. Except the learning is neither the original work nor a derivative work.

Here’s an example:

Author Joe Smith writes this paragraph.

The woman told him that he would encounter death if he chose to walk into the garden, but he walked in anyway. In the garden, he encountered a cloaked figure who spoke with a supernatural resonance in his voice and who knew the man’s fate. The man realized that this was Death whom he had encountered in the garden, just as the woman said.

And a student reading this passage might write notes like: “Play on words – ‘encounter death’ can mean to die, but it could also mean ‘encounter the personification of death.’ Use play on words in the future to seem clever in your writing.”

The student references only the phrase “encounter death” from the original work. It’s de minimis and the phrase itself doesn’t qualify for copyright protection itself since it’s so generic and common. It would only be copyrighted in the context of the greater work. The student learned a process about writing, specifically in this scenario about using word play to surprise the reader. This is not at all in any fashion a copyright violation.

No. It’s not students writing “encounter death” that constitutes copyright infringement. It’s students writing “The woman told him that he would encounter death if he chose to walk into the garden … In the garden, he encountered a cloaked figure who spoke with a supernatural resonance in his voice and who knew the man’s fate. The man realized that this was Death” that constitutes copyright infringement.

The infringement lies in the bigger context and the arrangement of words and sentences, not in the de minimis cases of individual words or phrases. It’s not about a word play either.

This isn’t itself illegal. Styles aren’t subject to copyright protection.

You are oversimplifying things. (1) Styles of a specific artist are protected under TRADEMARK LAW. (2) Style mimicry while replicating a significant portion of the original author’s work is still copyright infringement. You can’t escape liability by just saying you copy someone’s “style”, because the devil is in the details.

And the students also quote substantial content of Stephen King’s work without permission.

This is different. You’ve included this to make it illegal. But it’s not even analogous to LLM learning. LLMs do not quote or contain the original works they’re trained on.

Have you read any lawsuit complaint before you make this BLATANTLY FALSE claim? Here is one, New York Times v. OpenAI, read pages 30 to 32:

https://admin.bakerlaw.com/wp-content/uploads/2024/01/ECF-1-Complaint-1-1.pdf

A random student who isn’t a bestseller author publishing a book about Stephen King’s writing style will not compete with Stephen King’s own book on his writing style.

What if the book becomes bestselling later on? This defense is fucking stupid when it comes to factor four on fair use.

But you denied that writing a book report or essay on Stephen King’s works was commentary. Even publishing a book on Stephen King’s writing style would be considered commentary.

Yes, I deny it. Why? Educating other people on Stephen King’s style with significant quotes from Stephen King’s copyrighted works goes beyond the boundary of being a commentary. I didn’t even mention AI regurgitating.

They [expanded] copyright beyond its original term such that it is now the life of the author plus 70 years, which has deprived the public of the use of the public domain works within their own lifetime. If I die before 70, I will be dead before a work written during my lifetime is available in the public domain. But the lobbying and bribery by media companies has also bought corrupt politicians and deprived citizens of their right to representation and the right of redress of their government. When the legislature writes laws for the wealthy, they are not representing the people or upholding the Constitution. The normalization of this corruption and bribery opens the door for more big corporations and “campaign donors” to further subvert democracy and representation.

This deprives US citizens of some of the most fundamental constitutional rights since it affects free speech, free press, the right to petition the government, and the right to representation within Congress, among others. And this corruption affects other rights since wealthy corporations can support candidates who support creating more rights for wealthy corporations and those candidates may also violate other civil and human rights. Some media companies have donated to Donald Trump and he’s violated uncountable human and civil rights, so they are at best indirectly supporting those violations.

(Emphasis added)

What bullshit are you talking about? There is no such right as “using the work within the people’s lifetime”. Quote the U.S. Constitution for such a right. The latter things about “right to representation” or “free speech, free press” have nothing to do with copyright. These are bullshit debunked many times. (I know this position is from the EFF, but after the Warhol v. Goldsmith case, the EFF never learned. See also: https://copyrightalliance.org/warhol-foundations-flawed-transformative-use-theory/)

I’m recognizing the fact that corporate leaders are only too eager to embrace “AI” for almost anything. Some educational institutions are embracing its use. Many corporations are hiring “prompt engineers.” Customer service representatives are being replaced by LLM chatbots. You have to be blind not to see the inevitability that the wealthy and the corporate leaders are going to force more and more AI on everyone.

Then the solution is not to let AI win but to fucking address the AI problems and not adopt AI blindly!

Copyright theft is just one of the problems with AIs. Misinformation, deepfakes, fraud, energy waste, dehumanizing creativity, are all there.

When I’m old and needing greater amounts of health care, I want the LLM that gets assigned to my case not be programmed to maximize profits but rather to take my health into greatest consideration.

Then your position still exploits the human workers who put effort into your health care. You don’t bother to pay any human who would help you in later life. So you’re selfish. But that’s fine, because every worker is selfish when they want to make a living and get paid and not get exploited.

We’ve already established that for any possible reference you make to an ethical position or concern about the environment, you’re willing to throw that out the window in favor of big media corporations and their profits, which exist in the first place at the expense of the creators and poor people.

Dismiss “poor people”, you can only represent yourself.

MrWilson (profile)

June 19, 2025 at 5:52 pm

Re: Re: Re:⁵

No. It’s not students writing “encounter death” that constitutes copyright infringement. It’s students writing “The woman told him that he would encounter death if he chose to walk into the garden … In the garden, he encountered a cloaked figure who spoke with a supernatural resonance in his voice and who knew the man’s fate. The man realized that this was Death” that constitutes copyright infringement. The infringement happens on the bigger context and word & sentence arrangements, not the de minimis cases of individual words or phrases. Not about a word play either.

First, even quoting that paragraph likely wouldn’t be enough to be infringing if it were part of a longer novel or even a short story. Second, you’re choosing a scenario that isn’t realistic. You’re starting with the assumption that they’re infringing copyrights and therefore learning is infringing, but it’s perfectly possible and perfectly legal to learn from copyrighted material. And this doesn’t serve as an analogy for LLM training because nothing of the original content is quoted. And more importantly, you’re ignoring that this is how human beings learn to write. You read the works of others. You compose sentences by seeing how other people compose sentences and some of those compositions you observe are still protected under copyright.

You are oversimplifying things. (1) Styles of a specific artist are protected under TRADEMARK LAW.

First, we’re not discussing trademark law. You can’t claim a violation of a copyrighted work as a trademark violation because those are different sets of laws. Second, you’re going to have to provide a citation that writing styles are protected under trademark. I’m not seeing anything in trademark law that would cover this. I’m starting to suspect you don’t understand what a writing style is.

(2) Style mimicry while replicating a significant portion of the original author’s work is still copyright infringement.

If you’re mimicking the style, you’re not using any portion of the original author’s work. If you’re using their work, you’re not mimicking them, you’re just copying them. You keep saying, “they’re violating copyright, therefore they’re violating copyright.” Your circular logic doesn’t work. I thought maybe you didn’t understand how the US education system works, but it seems like you just don’t understand human beings.

You can’t escape liability by just saying you copy someone’s “style”, because the devil is in the details.

You absolutely can avoid liability as long as you aren’t copying their actual work in portions large enough to trigger a copyright claim.

And the students also quote substantial content of Stephen King’s work without permission.

You keep saying this. That wouldn’t fit the instructor’s assignment.

Have you read any lawsuit complaint before you make this BLATANTLY FALSE claim?

Holy fuck. I love it when people provide citations that prove they didn’t read or at least understand their own citations.

Here’s the pertinent part of the claim: “In May 2023, Microsoft and OpenAI unveiled “Browse with Bing,” a plugin to ChatGPT that enabled it to access the latest content on the internet through the Microsoft Bing search engine.”

We’re discussing LLM training, not live search features. The model itself doesn’t contain the quoted text. It literally has to be connected to a live search add-on to have that functionality. This literally proves what I’ve been saying.

But again, you keep using the lawsuits as if I’ve been saying all the lawsuits are right or wrong. I’m not defending the practices of the large AI companies. I’m defending the singular principle that training an LLM isn’t itself necessarily copyright infringement. You keep using me as a proxy for the AI companies despite my disavowing them from the beginning. I’m not your AI company straw man.

What if the book becomes bestselling later on?

It’s still fair use and commentary as long as it doesn’t quote too much of his original work. You can describe a writer’s style without quoting it. You can describe plots and characters and sentence structure without quoting the author’s words. You’ve apparently never read a book report.

Yes, I deny it. Why? Educating other people on Stephen King’s style with significant quotes from Stephen King’s copyrighted works goes beyond the boundary of being a commentary. I didn’t even mention AI regurgitating.

You keep adding the “with significant quotes from Stephen King’s copyrighted works” part. That’s not what happens when humans learn to write from reading someone else’s work. That’s not what LLMs are doing when they’re responding to prompts. You’re making up a fake scenario that doesn’t apply to anything.

What bullshit are you talking about? There is no such a right as “using the work within the people lifetime”.

There was originally. https://copyright.gov/about/1790-copyright-act.html

Quote the U.S. constitution for such a right.

Amendment 10: “The powers not delegated to the United States by the Constitution, nor prohibited by it to the States, are reserved to the States respectively, or to the people.”

The latter things about “right to representation” or “free speech, free press” have nothing to do with copyright.

They are more important than copyright and they have been weakened and subverted by the big media companies you have chosen to side with.

These are bullshit debunked many times.

These what are bullshit? You didn’t address anything. What claims are you saying are debunked?

See also: https://copyrightalliance.org/warhol-foundations-flawed-transformative-use-theory/

Nope. I’m not touching a link to an industry-backed organization. You’re saying, “these are the good guys, just read their propaganda and they’ll tell you.” Cite a legal research document with case law and legislative sources. You keep citing the Copyright Alliance as if they’re not a biased source. This is why you don’t understand the topic. You’ve been getting all your information from biased claims, not neutral or factual sources. Their job is to empower themselves, not to speak factually about the law. They’re arguing for stricter protection for their profits.

Then the solution is not to let AI win but to fucking address the AI problems and not adopt AI blindly!

How do you propose doing that? You don’t have influence over all those companies and institutions. They have free will. They’re going to do it whether you or I approve. Setting a legal precedent that all LLM training requires licensing won’t stop that.

Copyright theft is just one of the problems with AIs. Misinformation, deepfakes, fraud, energy waste, dehumanizing creativity, are all there.

Those are issues that have existed before LLMs were widely available. They’re also misuses of various technologies by human beings. You’re complaining that technology can be exploited for unethical purposes. Of course it can. It always has been. It will continue to be in the future. It’s not effective to outlaw technology. You have to regulate human actions.

Then your position still exploits human workers that put efforts in your health care stuff. You don’t bother pay any human that would help in your later life. So you’re selfish. But that’s fine, because every worker is selfish when they want to make a living and get paid and not get exploited.

No, you dumb fuck. The scenario is that my health insurance and health care provider will be using AI to handle my case. You’re pretending like I would choose that. The whole fucking point is that this will be pushed on people against their preference. I do pay human beings right now for my health care. And since I’m an American, I likely pay a lot more than you pay and for worse health care than you probably get because the wealthy assholes who run the US won’t blink at an opportunity to monetize every human need. And what’s worse, we have non-Americans like you cheering it on self-righteously.

Dismiss “poor people”, you can only represent yourself.

I exist. I’m not wealthy. I’m also a copyright holder and a creator. You don’t represent me.

But you also don’t represent anyone else, which makes your entire stance useless unless you’re purporting to be a creator whose works are published in the US.

Explorer09 (profile)

June 20, 2025 at 1:56 am

Re: Re: Re:⁶

Second, you’re choosing a scenario that isn’t realistic. You’re starting with the assumption that they’re infringing copyrights and therefore learning is infringing, but it’s perfectly possible and perfectly legal to learn from copyrighted material.

Yes, I am using the scenario that “learning” is infringing. And by the way that “learning” word is in scare quotes because in this scenario it’s commercial reproduction of copyrighted works under the guise of “learning”.

If the “students” in this scenario publish nothing and sell nothing, there is no infringement! But heck that is not the example!

And this doesn’t serve as an analogy for LLM training because nothing of the original content is quoted.

No. Even when the LLM quotes nothing, it can still be infringing. One such case is when it takes the sentences or paragraphs and translates them word by word into another language, so that it technically quotes “nothing of the original”, but the creative expression through careful arrangement of words is still copied. This is technically a derivative work, but preparing derivative works is still an exclusive right of the copyright holder.

And more importantly, you’re ignoring that this is how human beings learn to write. You read the works of others. You compose sentences by seeing how other people compose sentences and some of those compositions you observe are still protected under copyright.

I can write without reading any copyrighted work! This analogy is fucking nonsense. It’s nonsense because you assume there is no public domain literature. And you are a hypocrite when you argue about public domain in your examples, while denying public domain in examples other people mentioned.

Here’s the pertinent part of the claim: “In May 2023, Microsoft and OpenAI unveiled “Browse with Bing,” a plugin to ChatGPT that enabled it to access the latest content on the internet through the Microsoft Bing search engine.”

Irrelevant. RAG (retrieval-augmented generation) is likely not fair use either, and it also “reproduces” copies, which is copyright infringement.

I’m not defending the practices of the large AI companies. I’m defending the singular principle that training an LLM isn’t itself necessarily copyright infringement. You keep using me as a proxy for the AI companies despite my disavowing them from the beginning. I’m not your AI company straw man.

You are effectively defending AI companies! Not a straw man. Because none of the research-only AI models are being sued. What are being sued are commercial AI models, even though they have secondary uses that are non-commercial.

And keep in mind the bottom line is: There is no blanket fair use for generative AI. Whether the AI model training is fair use depends on the ultimate use of the AI by its users.

What if the book becomes bestselling later on?

It’s still fair use and commentary as long as it doesn’t quote too much of his original work. You can describe a writer’s style without quoting it. You can describe plots and characters and sentence structure without quoting the author’s words. You’ve apparently never read a book report.

You have no fucking idea how judges rule on fair use. In the U.S., the law doesn’t state “a book report is always fair use” nor “commentary is always fair use”, because there could always be exceptional cases where a “report” reproduces a significant portion of the original work, so that the “report” becomes a market substitute for the original. Judges always have to evaluate the four factors to determine fair use. No skipping steps.

Factor one: Purpose is commercial selling of the book (against fair use). Factor two: Nature of the copyrighted work is fiction books, but likely already published (neutral or slightly for fair use). Factor three: The amount taken is likely not minimal (neutral, but let’s consider this leans toward fair use for the sake of argument). Factor four: It creates a market substitute for Stephen King’s own guidebooks on how to write fiction (thus against fair use in this most important factor).

This is how fair use is evaluated in the U.S. Oh wait, this analysis is quite similar to the Thomson Reuters v. Ross Intelligence ruling in the district court.

You keep adding the “with significant quotes from Stephen King’s copyrighted works” part. That’s not what does happen when humans learn to write from reading someone else’s work.

The analogy does not have to be “realistic” as to how humans would do it. It’s AI. That is one difference between human learning and AI “learning”, mind you. The AI training process copies a significant amount of works during the data gathering phase, which is even before the data is transformed into neural network weights.

What bullshit are you talking about? There is no such right as “using the work within people’s lifetime”.

There was originally. [U.S. Copyright Act of 1790]

Quote the exact section and sub-section of the law.

Quote the U.S. constitution for such a right.

Amendment 10: “The powers not delegated to the United States by the Constitution, nor prohibited by it to the States, are reserved to the States respectively, or to the people.”

Are you suggesting the U.S. Copyright law is unconstitutional? Because no case law has ruled that, and your quoting of Amendment 10 is vague because not even a State law implies a right of “using a copyrighted work within a person’s lifetime”.

Unless you can challenge the constitutionality of modern copyright law before the Supreme Court, I would disregard this one as being unfounded.

They are more important than copyright and they have been weakened and subverted by the big media companies you have chosen to side with.

Irrelevant. Because AI is not a person and has no free speech.

I’m not touching a link to an industry-backed organization. You’re saying, “these are the good guys, just read their propaganda and they’ll tell you.”

The Supreme Court ruling in Warhol is not propaganda, and you can try ignoring all arguments of the opposite side, but if you do so I’ll ignore your arguments (and those from EFF, too). It’s good to live in an echo chamber, until you see the fact that the Supreme Court wasn’t on your side.

Then the solution is not to let AI win but to fucking address the AI problems and not adopt AI blindly!

How do you propose doing that? You don’t have influence over all those companies and institutions. They have free will. They’re going to do it whether you or I approve. Setting a legal precedent that all LLM training requires licensing won’t stop that.

I don’t have to “stop that”, but I can get paid for my work and use it on environmental protection campaigns that could eventually stop that. I’m not the good guy as I have said before. I just want bad guys to stop exploiting the creative labor of humans.

You’re complaining that technology can be exploited for unethical purposes. Of course it can. It always has been. It will continue to be in the future. It’s not effective to outlaw technology. You have to regulate human actions.

And the copyright law fits exactly one of the purposes of regulating AI, especially on the no-exploitation-of-human-labor part.

The scenario is that my health insurance and health care provider will be using AI to handle my case.

Oh yeah and I would stop your damn health care provider from using AI for your case because that’s exploitation and should be stopped.

The whole fucking point is that this will be pushed on people against their preference.

And the whole fucking point is that creative workers are exploited by AI companies without their “preference” either.

I likely pay a lot more than you pay and for worse health care than you probably get because the wealthy assholes who run the US won’t blink at an opportunity to monetize every human need.

Why should I care about your health care if there is a chance that I may have a shorter lifetime than you? This is off-topic, but the point is that it’s not necessary for your healthcare to depend on AI, and even when it does, it’s not an excuse for exploiting creative works (why should healthcare AI be trained with novels anyway?)

But you also don’t represent anyone else, which makes your entire stance useless unless you’re purporting to be a creator whose works are published in the US.

True. And I am stripped of that market opportunity because of AI.

MrWilson (profile)

June 20, 2025 at 11:34 pm

Re: Re: Re:⁷

Yes, I am using the scenario that “learning” is infringing. And by the way that “learning” word is in scare quotes because in this scenario it’s commercial reproduction of copyrighted works under the guise of “learning”.

Except it’s not. You claimed children learning to write by reading copyrighted works was a copyright infringement. It is not.

If the “students” in this scenario publish nothing and sell nothing, there is no infringement! But heck that is not the example!

Students don’t publish their book reports usually.

Even when the LLM quotes nothing it can still be infringing. The case can be when it takes the sentences or paragraphs and translates word by word into another language,

It would have to have the sentences and paragraphs to be able to translate it. You can’t translate if you don’t have the original text. The model doesn’t have the original text!

so that it technically quotes “nothing of the original”,

You made up the translation scenario. That’s not realistic or relevant.

but the creative expression through careful arrangements of words is still copied.

That’s not expression. That’s an idea.

This is technically a derivative, but still an exclusive right to the copyright holder.

It’s not translating at all. You keep making bad analogies that have nothing to do with the topic. It just reveals, again, again, that you don’t understand how the technology works and you don’t understand how US copyright law works.

I can write without reading any copyrighted work!

And you can literally legally learn to write from copyrighted content. This isn’t an analogy. This is literally my lived experience. If I remember how to write based on what I’ve read, it’s legal. I often don’t remember the specific sentences I’ve read but the method of writing is retained, such that the “data” that I’ve trained myself on isn’t even present in my head. I can’t quote word for word much of some writer’s expression despite remembering plots, characters, and writing style.

This analogy is fucking nonsense.

This isn’t an analogy. This is literally how I learned to write.

It’s nonsense because you assume there is no public domain literature. And you are a hypocrite when you argue about public domain in your examples, while denying public domain in examples other people mentioned.

Except I haven’t assumed there’s no public domain. The existence of the public domain just doesn’t magically make it illegal to learn to write from copyrighted content. It’s not a necessary component of the argument because you don’t have to only rely on the public domain.

Irrelevant. RAG (retrieval-augmented generation) is likely not fair use either, and it also “reproduces” copies, which is copyright infringement.

It’s essential. The whole point is that you’ve been claiming the models can quote content, and now you admit they can’t quote without an added feature. The models don’t contain the original training data. The added feature isn’t a de facto part of all LLMs. You’re trying to find anything that can be infringing and you keep ignoring the entire reason we’re arguing here. You think I’m saying nothing is infringing. I haven’t said that.

You are effectively defending AI companies! Not a straw man.

I am not. Again, again, again, you need to reread my first posts here.

Because none of the research-only AI models are being sued. What are being sued are commercial AI models, even though they have secondary uses that are non-commercial.

I’m not talking about these lawsuits you keep bringing up. You’ve brought them up, not me. You keep pretending I’ve been defending the actions taken by the AI companies. You really, seriously, definitely need to read what I first said at the start of this whole thing. It would save you so much time. Here, I’ll do it for you, again:

“My frustration with the arguments of people claiming it’s not fair use and that all training must be licensed is that many people seem to think they’re championing the little guy when they’re inadvertently advocating for the benefit of the wealthy and corporations.”

If you don’t disagree with every part of what I said there, you should stop responding.

And keep in mind the bottom line is: There is no blanket fair use for generative AI. Whether the AI model training is fair use depends on the ultimate use of the AI by its users.

Not necessarily. The uses can be infringing without the model being infringing, the same way a VCR can record content off TV and the person making the recording can violate copyright by selling it without authorization or they can use it for time-shifting their fair use viewing. If your assertion were true, then any technology that can be used for an infringing use would be illegal. You wouldn’t be able to read this message because your Internet connected device would be illegal. Again, you don’t understand US copyright law.

What if the book becomes bestselling later on?

You have fucking no idea how court judges rule on fair use.

I do. I’ve studied many cases and decisions. You should do the same, and not just some corporate propaganda version.

In the U.S., the law doesn’t state “book report is always fair use” nor “commentary is always fair use”, because there could always be exceptional cases where a “report” reproduces a significant portion of the original work, so that the “report” becomes a market substitute for the original.

You’re changing the scenario. A book report doesn’t mean it gets published. You could reproduce a significant portion of the original work for a classroom assignment and it won’t be a market substitute at all. At most, the teacher would just tell you that it’s not necessary for the assignment to quote quite so much. It’s unlikely more than the student and the teacher would ever read the report.

Factor one: Purpose is commercial selling of the book (against fair use).

That’s not the purpose of a book report. It’s education.

Factor two: Nature of the copyrighted work is fiction books, but likely already published (neutral or slightly for fair use).

Some people write book reports on non-fiction. Stephen King writes non-fiction books, like his On Writing book.

Factor three: The amount taken is likely not minimal (neutral, but let’s consider this leans toward fair use for the sake of argument).

You’re saying they use a lot when they likely wouldn’t. This is like saying, “if someone owns a gun they’re a murderer because I made up a scenario where they murdered someone with their gun.” You are starting from the conclusion. That’s intellectually dishonest.

Factor four: It creates a market substitute for Stephen King’s own guidebooks on how to write fiction (thus against fair use in this most important factor).

It doesn’t create a market substitute. You’ve only claimed that a teacher would ask a student to publish their commentaries about Stephen King’s work, which isn’t likely. What publisher is going to print that? Are they self-publishing? Who’s doing the formatting? Is this for an assignment? Does the curriculum for the class cover this purpose? Your scenario doesn’t make any sense.

The analogy does not have to be “realistic” as to how humans would do it.

It wasn’t just an analogy. You said students learning to write from reading was a copyright infringement.

It’s AI. That is one difference between human learning and AI “learning”, mind you. The AI training process copies a significant amount of works during the data gathering phase, which is even before the data is transformed into neural network weights.

Wait, the data is transformed? Would you say the process of transforming something is… transformative?

Quote the exact section and sub-section of the law.

“…the author…shall have the sole right and liberty of printing, reprinting, publishing and vending such map, chart, book or books, for the term of fourteen years from the recording the title thereof in the clerk’s office… And if, at the expiration of the said term…the same exclusive right shall be continued to him or them, his or their executors, administrators or assigns, for the further term of fourteen years;”

Are you suggesting the U.S. Copyright law is unconstitutional?

No, quote me where I said that. I said that the public was deprived of the use of the public domain works within their own lifetime. And I provided the source of that.

Because no case law ruled that,

And I didn’t claim that.

and your quoting of Amendment 10 is vague because not even a State law implies a right of “using a copyrighted work within a person’s lifetime”.

I didn’t say it expressly said that. And you completely missed the pertinent clause in the 10th Amendment that I was referring to.

Unless you can challenge the constitutionality of modern copyright law before the Supreme Court, I would disregard this one as being unfounded.

Unless you can understand what I’m arguing, you’re dismissing your own ignorance here.

Irrelevant. Because AI is not a person and has no free speech.

I am a person and I have free speech. And my free speech, representation, redress, and other rights have been subverted by these large corporations you’re siding with. And what’s worse is that you’re siding with them over profits. You’re saying you care about hypothetical sales rather than the danger of my country turning into an authoritarian hell-hole.

The Supreme Court ruling in Warhol is not propaganda,

Then link to the ruling and not to a propaganda organization.

and you can try ignoring all arguments of the opposite side,

That’s not the opposite side. There aren’t only two sides. I’ve been pointing out that you’ve been engaging in a false dilemma from the very beginning but you keep applying that myopic perspective.

but if you do so I’ll ignore your arguments (and those from EFF, too).

You have already been ignoring everything I have said since the beginning. I’ve pointed out a field full of straw men. There are no crows for miles.

It’s good to live in an echo chamber,

I don’t live in an echo chamber. I’m aware of the bad, selfish, greed-driven perspectives of unabashed exploiters. I just won’t accept them as a valid citation when we’re talking about truth, not propaganda.

until you see the fact that the Supreme Court wasn’t on your side.

The Supreme Court has been making some unconstitutional decisions. Many justices are openly corrupt now. Many of their appointments were the result of corruption and unlawful activity. I won’t be surprised if any particular SCOTUS decision goes a way I’d disagree with.

How do you propose doing that? You don’t have influence over all those companies and institutions. They have free will. They’re going to do it whether you or I approve. Setting a legal precedent that all LLM training requires licensing won’t stop that.

I’m not the good guy as I have said before. I just want bad guys to stop exploiting the creative labor of humans.

You are siding with bad guys who are exploiting the creative labor of humans. That’s literally what I’m railing against! I am the exploited. You’re cheering on the people who have exploited me! I know you aren’t the good guy. You are a sycophant to the wealthy and powerful.

And the copyright law fits exactly one of the purposes of regulating AI, especially on the no-exploitation-of-human-labor part.

The entire system of capitalism is built on the exploitation of human labor. That ship has sailed. Private owners of the means of production make money from owning things. They’re not doing the work themselves. A worker can sometimes generate millions of dollars worth of profits for a company and only pull down a barely living wage. You’re pretending like AI companies are the only exploiters out there. It’s the whole damn system!

Oh yeah and I would stop your damn health care provider from using AI for your case because that’s exploitation and should be stopped.

But you won’t be able to. Again, even if LLM training data must be licensed, wealthy LLM companies will still be able to afford to license them (and small creators will get virtually nothing), and so the LLMs will get trained and used and pushed on us without our consent. You might as well say it’s okay because unicorns are going to emerge and give everyone a million dollars. Having unrealistic expectations of how things are going to go doesn’t justify bad decisions today.

And the whole fucking point is that creative workers are exploited by AI companies without their “preference” either.

The whole fucking point is that workers are exploited by companies without their preference. It’s not just creatives. It’s not just AI companies. It’s all capitalist companies that aren’t employee owned.

Why should I care about your health care if there is a chance that I may have a shorter lifetime than you?

I could get hit by a car tomorrow and die. What does the hypothetical that one of us would outlive the other have to do with the moral argument that health care should be humane and affordable? That you pose it as a zero sum scenario is weird.

This is off-topic, but the point is that it’s not necessary for your healthcare to depend on AI, and even when it does, it’s not an excuse for exploiting creative works (why should healthcare AI be trained with novels anyway?)

We’re not talking about LLMs being trained on fiction works. As you should recall, had you ever actually understood my premise, stated several times now, I am saying that you can’t just assert that all LLM training must require licensing, because it’s not established in law or case law at this time. And I am not talking about the big AI companies getting sued right now. I’m talking about anyone who might ever want to train an LLM in the future in the US, you know, including those poor people and students and researchers you don’t apparently think exist.

True. And I am stripped of that market opportunity because of AI.

This doesn’t seem like a loss for us. I can’t imagine your works would be appreciated in the US. You can’t understand things that have been taught to you multiple times.

Explorer09 (profile)

June 21, 2025 at 2:03 pm

Re: Re: Re:⁸

Except it’s not. You claimed children learning to write by reading copyrighted works was a copyright infringement. It is not.

There is no blanket fair use for children’s learning, mind you.

The American Geophysical Union v. Texaco case shows that intermediate copies could constitute infringement. In this particular case it was employees of a for-profit corporation learning, and it’s a human learning case that was ruled not fair use.

It would have to have the sentences and paragraphs to be able to translate it. You can’t translate if you don’t have the original text. The model doesn’t have the original text!

Because in the AI “pre-training” phase the text has been translated to model weights! It does not need to be text to constitute “copies”.

And you can literally legally learn to write from copyrighted content. This isn’t an analogy. This is literally my lived experience. If I remember how to write based on what I’ve read, it’s legal. I often don’t remember the specific sentences I’ve read but the method of writing is retained, such that the “data” that I’ve trained myself on isn’t even present in my head. I can’t quote word for word much of some writer’s expression despite remembering plots, characters, and writing style.

Why the hell should I care about your learning experience? Are you AI and not human?

The models don’t contain the original trained data.

You keep presenting this as a fact while I have disputed this many times! The model contains data in a different form than what the copyrighted content was originally “fixed” in. And for the purpose of copyright, the model itself is a derivative work!

And because it’s a derivative work, it needs a copyright license to distribute, period.

I’m not talking about these lawsuits you keep bringing up. You’ve brought them up, not me. You keep pretending I’ve been defending the actions taken by the AI companies. You really, seriously, definitely need to read what I first said at the start of this whole thing. It would save you so much time. Here, I’ll do it for you, again:

“My frustration with the arguments of people claiming it’s not fair use and that all training must be licensed is that many people seem to think they’re championing the little guy when they’re inadvertently advocating for the benefit of the wealthy and corporations.”

If you don’t disagree with every part of what I said there, you should stop responding.

You cited not a single case where AI training is not infringing or is “fair use” because there is none (yet). Stop making the claim that you can train AI without a copyright license!

Not necessarily. The uses can be infringing without the model being infringing, the same way a VCR can record content off TV and the person making the recording can violate copyright by selling it without authorization or they can use it for time-shifting their fair use viewing. If your assertion were true, then any technology that can be used for an infringing use would be illegal. […] Again, you don’t understand US copyright law.

If only the model is not infringing (which is a false premise already, and I don’t need to argue about the further what-if scenario).

(And I wish there were AI models with fully licensed data, damn it! But what we’ve seen are AI companies trying to defend their scraping as fair use while their arguments don’t hold. Thomson Reuters v. Ross is an example case.)

You could reproduce a significant portion of the original work for a classroom assignment and it won’t be a market substitute at all.

No. You’ve confused the exemption for schools (§ 110(1) and (2)) with the fair use exemption (§ 107). I’m not talking about the § 110 case. Copying for school classroom use is already exempt so I don’t need to address the fair use four factors.

[Factor one] That’s not the purpose of a book report. It’s education.

Factor one must be evaluated with the ultimate purpose of the use. So it’s not “education” in the intermediate step, but the commercial publishing of the notes/book that matters. Texaco case law.

[Factor three] You’re saying they use a lot when they likely wouldn’t. This is like saying, “if someone owns a gun they’re a murderer because I made up a scenario where they murdered someone with their gun.” You are starting from the conclusion. That’s intellectually dishonest.

Except that you are refuting a straw man. Even when what you say is true here, Factor Three would still rule neutrally here.

[Factor four] It doesn’t create a market substitute. You’ve only claimed that a teacher would ask a student to publish their commentaries about Stephen King’s work, which isn’t likely. What publisher is going to print that? Are they self-publishing? Who’s doing the formatting? Is this for an assignment? Does the curriculum for the class cover this purpose? Your scenario doesn’t make any sense.

Did I say the hypothetical scenario has to make sense? This is actually what generative AI has been doing as a metaphor. So no need to question whether there is a “publisher” who would print that because there just is.

Wait, the data is transformed? Would you say the process of transforming something is… transformative?

USCO even said that generative AI outputs are transformative, so what’s the issue here?

You mistakenly believing “transformative = fair use” is the issue.

I said that the public was deprived of the use of the public domain works within their own lifetime. And I provided the source of that.

There is no such right of “using works within their lifetime”! Even your quote from the U.S. Copyright Act of 1790 doesn’t say there is such a right.

Your quoted act only speaks of a copyright term of 14 years, but then, there is nothing unconstitutional about extending that term to “life + 50 years” or “life + 70 years”. When you insist on a right that doesn’t exist in statute, there is nothing to be “deprived” of.

Then link to the ruling and not to a propaganda organization.

https://www.supremecourt.gov/opinions/22pdf/21-869_87ad.pdf

The Supreme Court has been making some unconstitutional decisions. Many justices are openly corrupt now. Many of their appointments were the result of corruption and unlawful activity. I won’t be surprised if any particular SCOTUS decision goes a way I’d disagree with.

So you have more authority than the Supreme Court? Got it.

Setting a legal precedent that all LLM training requires licensing won’t stop that.

I would go for setting a legal precedent for that whether you like it or not. Call me a bad guy whenever you want, because you even disregard the Supreme Court.

You are siding with bad guys who are exploiting the creative labor of humans. That’s literally what I’m railing against! I am the exploited. You’re cheering on the people who have exploited me! I know you aren’t the good guy.

At least I can get paid for my works! I don’t care whether you are exploited! I even dismiss your claim about “poor people” as without evidence. Are you happy now?

You’re pretending like AI companies are the only exploiters out there. It’s the whole damn system!

And why the heck should I let AI companies keep exploiting anyway? I have no obligation to fight what you called the “whole damn system”.

[If] LLM training data must be licensed, wealthy LLM companies will still be able to afford to license them (and small creators will get virtually nothing), and so the LLMs will get trained and used and pushed on us without our consent.

Why do the small creators need LLM to do anything? You didn’t answer this question when I asked before, so the rest of the outcome you suggest is nonsense!

“LLM gets pushed on [me] without [my] consent”? What the fuck?

I’m talking about anyone who might ever want to train an LLM in the future in the US, you know, including those poor people and students and researchers you don’t apparently think exist.

Because an LLM is a luxury and not everyone can get that luxury however hard you try. An LLM requires a server farm, which means smaller companies won’t have the luxury of deploying an LLM unless they can rent the servers from someone else. This is unchangeable. A better world would be one where every personal computer is able to run SLMs (small language models) instead of relying on LLMs for most computing tasks that need AI. Assuming the SLMs have training materials that are all licensed, by the way.

MrWilson (profile)

June 21, 2025 at 6:14 pm

Re: Re: Re:⁹

There is no blanket fair use for children’s learning, mind you.

You don’t need fair use. It’s a non-infringing use. There is no legal precedent that human learning from copyrighted content is a copyright infringement (and to be clear, we’re just talking about learning, not some unrealistic impractical convoluted scenario you’ve invented where infringement is assumed). If there was such a precedent, you could cite it. You can’t.

The American Geophysical Union v. Texaco case shows that intermediate copies could constitute infringement. In this particular case it was employees of a for-profit corporation learning, and it’s a human learning case that was ruled not fair use.

That has nothing to do with human learning. That case was about intermediate copies, not non-fixed human brain learning.

Because in the AI “pre-training” phase the text has been translated to model weights! It does not need to be text to constitute “copies”.

The model weights aren’t the training data. It is the process of the model learning, not a translation or encryption or compression of the training data. You can’t take the weights and faithfully reproduce the training data.

Why the hell should I care about your learning experience? Are you AI and not human?

You said human learning was a copyright violation. If you can’t accept the legality of that, you’re admitting you’re just a myopic copyright maximalist trying to invent new rights that don’t exist.

You keep presenting this as a fact while I have disputed this many times!

You can claim it’s not true all you like. That doesn’t mean you’re correct. You’ve demonstrated that you don’t understand how the technology works.

The model contains data in a different form than what the copyrighted content was originally “fixed” in.

The data the model contains is not a different form of the training data. It is different data. It is data created by the model.

And for the purpose of copyright, the model itself is a derivative work!

The model doesn’t contain the original data. There’s nothing to be derivative of in the model.

And because it’s a derivative work, it needs a copyright license to distribute, period.

You can’t cite case law or law that says this.

You cited not a single case where AI training is not infringing or is “fair use” because there is none (yet). Stop making the claim that you can train AI without a copyright license!

You just contradicted yourself. If no case law or law has yet found LLM training to always require licensing, then it is legal until such a case or law becomes a precedent. You’re admitting here both that your claim that it’s illegal is just opinion, not fact-based, and that you don’t understand US laws and legal precedents. That understanding you prove you don’t have is a prerequisite for having a useful perspective on this matter. You’re getting to be as nonsensical as Terop.

If only the model is not infringing (which is a false premise already, and I don’t need to argue about the further what-if scenario).

You just admitted that the model hasn’t been determined to be infringing yet. Stop contradicting yourself.

(And I wish there were AI models with fully licensed data, damn it!

There will be eventually, but you’re missing that creators will still be exploited. The big media corporations will get the licensing funds and very few creators will get much of anything. Creators will be pressured to license their work for almost nothing when signing new contracts. The future you think you’re railing against and that you think can be prevented by successful lawsuits will not be stopped by these methods.

No. You’ve confused the exemption for schools (§ 110(1) and (2)) with the fair use exemption (§ 107). I’m not talking about the § 110 case. Copying for school classroom use is already exempt so I don’t need to address the fair use four factors.

No, you’re confused between the lack of a 106 right being established that covers human learning and the need for a justification that something is fair use. If there’s no right, then there’s no need to argue fair use.

Your entire fair use analysis is unnecessary.

Did I say the hypothetical scenario has to make sense?

Reality dictates that it must. You don’t legislate precedents based on absurd scenarios that will never happen.

This is actually what generative AI has been doing as a metaphor.

Generative AI output isn’t by default published to a market. That’s a false claim.

You mistakenly believing “transformative = fair use” is the issue.

It’s not always fair use, but it’s a strong argument in favor of fair use. It frequently is.

There is no such right of “using works within their lifetime”! Even your quote from the U.S. Copyright Act of 1790 doesn’t say there is such a right.

I didn’t claim the 1790 law was where the right was derived. I said the 10th Amendment was the source of the fact that it was legal before the expansion of the length of copyright.

Your quoted act only speaks of a copyright term of 14 years, but there is nothing unconstitutional about extending that term to “life + 50 years” or “life + 70 years”. When you insist on a right that doesn’t exist in statute, there is nothing to be “deprived” of.

I didn’t claim it was unconstitutional. You used that term. I said US citizens were deprived of the ability. I didn’t even call it a right. You did.

So you have more authority than the Supreme Court? Get it.

Authority? No, quote me where I uttered that straw man.

I would go for setting a legal precedent for that whether you like it or not.

So you accept that you’re rabidly arguing for a futile, useless scenario.

Call me a bad guy whenever you want,

You called yourself one.

because you even disregard the Supreme Court.

SCOTUS hasn’t yet ruled on this issue, but I do condemn corrupt SCOTUS decisions. They have made unconstitutional decisions at times. Would you argue that a SCOTUS justice isn’t capable of being corrupt or making a biased decision that contradicts the Constitution? If so, what are you smoking that makes you so trusting of the wealthy and powerful?

At least I can get paid for my works!

These lawsuits won’t get you a lot more money. The big media companies will just get a payout from the AI companies, some of whose stock is owned by the same people, and you’ll get pressured to accept a contract that includes the licensing for nothing or almost nothing. You’re literally fighting for someone else’s profit at the expense of poor people.

I don’t care whether you are exploited! I even dismiss your claim about “poor people” as without evidence. Are you happy now?

Yes. I’ve been saying this, but I appreciate you admitting it. You are saying you have no moral stance here. You’re just out for your own. You don’t care about others. That means no one is going to care about you or your pet issues. This should be the end of anything you say on the topic. Everything else you say is derived from this bias.

And why the heck should I let AI companies keep exploiting anyway? I have no obligation to fight what you called the “whole damn system”.

You have no power to fight the whole damn system or to allow or stop the AI companies. You wouldn’t be impotently arguing with straw men in this comment section if you had any power or influence. You’re just strutting around here bragging about how you only care about yourself. No one needs to feel any empathy for you. Your position is so futile and contradictory and ill-informed and full of propaganda that I don’t think you’re a creator. You seem to be a shill for the Copyright Alliance or a similar organization.

Why do the small creators need LLM to do anything?

Ask them.

You didn’t answer this question when I asked before, so the rest of the outcome you suggest is nonsense!

You assume I’m speaking for them even though I’ve said and you’ve agreed that I represent myself. But you’re also ignoring that it’s not just about creators. It’s about all US citizens, including the majority whose existence you dispute.

Because an LLM is a luxury, and not everyone can get that luxury however hard they try. An LLM requires a server farm, which means smaller companies won’t have the luxury of deploying an LLM unless they can rent the servers from someone else.

You’re ignoring local LLMs. You’re ignoring smaller purpose LLMs. You seem to assume every LLM present or future will be some big generalized LLM like ChatGPT or Gemini. You’re admitting your ignorance again.

This is unchangeable.

It’s not only changeable, but there are also contrary examples that prove your claims wrong.

A better world would be one where every personal computer can run SLMs (small language models) instead of relying on LLMs for most computing tasks that need AI. Assuming the SLMs have training materials that are all licensed, by the way.

Why do you think this won’t ever be reality? It’s already happening. This is another demonstration of your ignorance.

Explorer09 (profile)

June 21, 2025 at 8:38 pm

Re: Re: Re:¹⁰

It’s a non-infringing use. There is no legal precedent that human learning from copyrighted content is a copyright infringement (and to be clear, we’re just talking about learning, not some unrealistic impractical convoluted scenario you’ve invented where infringement is assumed).

Define “learning”, because it looks like we define the term differently.

That has nothing to do with human learning. That case was about intermediate copies, not non-fixed human brain learning.

Again, define “learning” before I can go on with this debate.

It is the process of the model learning, not a translation or encryption or compression of the training data. You can’t take the weights and faithfully reproduce the training data.

Again, define “learning” when it comes to machines as well.

The data the model contains is not a different form of the training data. It is different data. It is data created by the model.

Created through what?

There’s nothing to be derivative of in the model.

Then the models come from thin air?

And because it’s derivative work it needs copyright license to distribute, period.

You can’t cite case law or law that says this.

You didn’t read any recent lawsuit about AI training, did you? There is one case law, Thomson Reuters.

If no case law or law has yet found LLM training to always require licensing, then it is legal until such case or law becomes a precedent.

You can only say this before March, 2025.

And I can wait until another case law’s summary judgment is made, which is no later than this year.

The big media corporations will get the licensing funds and very few creators will get much of anything. Creators will be pressured to license their work for almost nothing when signing new contracts. The future you think you’re railing against and that you think can be prevented by successful lawsuits will not be stopped by these methods.

It’s marginally better for a creator to be “forced” to sign a licensing contract, than have their works taken without permission!

For two things: (1) the creator can get paid under the contract; (2) if the contract turns out to be unfair labor practice or anything, they can sue with sufficient evidence.

Generative AI output isn’t by default published to a market. That’s a false claim.

“By default”? What the fuck again? How can the companies be charged with secondary liability of copyright infringement only when the output is published “by default”?

By the way, there’s a recent complaint. Midjourney keeps the user’s image generation for public viewing in its “Explore” pages, and Disney (and Universal) cited the Explore pages for infringement proof. Say to Disney that it’s not “published”.

it’s a strong argument in favor of fair use. It frequently is.

Which argument?

I said the 10th Amendment was the source of the fact that it was legal before the expansion of the length of copyright.

And yet the expansion of copyright through Congress wasn’t illegal. End of story.

I said US citizens were deprived of the ability. I didn’t even call it a right. You did.

‘Oh! I am “deprived” of the ability to kill! I am “deprived” of the ability to commit adultery! I am “deprived” of the ability to steal!’

Seriously, what the fuck?

I do condemn corrupt SCOTUS decisions. They have made unconstitutional decisions at times. Would you argue that a SCOTUS justice isn’t capable of being corrupt or making a biased decision that contradicts the Constitution?

That is a political question. I won’t reply to this because it’s out of scope. You go persuade the politicians because this has nothing to do with me.

and you’ll get pressured to accept a contract that includes the licensing for nothing or almost nothing.

It’s already happening even without the suit! Google admits that it uses YouTube videos to train their video generating AI, Veo 3 (https://www.cnbc.com/2025/06/19/google-youtube-ai-training-veo-3.html)

And there’s no way to opt out!

YouTube Terms of Service
“By providing Content to the Service, you grant to YouTube a worldwide, non-exclusive, royalty-free, sublicensable and transferable license to use that Content (including to reproduce, distribute, prepare derivative works, display and perform it) in connection with the Service and YouTube’s (and its successors’ and Affiliates’) business, including for the purpose of promoting and redistributing part or all of the Service.” (Emphasis added)

You’re literally fighting for someone else’s profit at the expense of poor people.

I’m fighting for my profit. And dismiss “poor people” again. (This “poor people” is spamming bullshit as there is no single witness showing up. Why the fuck should I assume “poor people” exists or why should I care for them?)

You are saying you have no moral stance here. You’re just out for your own. You don’t care about others. That means no one is going to care about you or your pet issues.

My moral stance is respect copyright and no copyright exemption for AI training. There is no “others” you mentioned that I should care about. Bring a witness here, or else I dismiss.

Why do the small creators need LLM to do anything?

Ask them.

I ask you. Because you brought this bullshit argument about “poor people” need access to LLM.

You assume I’m speaking for them even though I’ve said and you’ve agreed that I represent myself. But you’re also ignoring that it’s not just about creators. It’s about all US citizens, including the majority whose existence you dispute.

I dispute even the “majority” word of yours here.

I would rather have polls like this before you claim your opinion is the majority:
https://theaipi.org/poll-biden-ai-executive-order-10-30-5/

MrWilson (profile)

June 22, 2025 at 7:45 pm

Re: Re: Re:¹¹

Define “learning”, because it looks like we define the term differently.

We’ll go with the verb form:

Verb
learn (third-person singular simple present learns, present participle learning, simple past and past participle learned or learnt)

To acquire, or attempt to acquire knowledge or an ability to do something.

In the context of English-speaking American children learning to write, it means reading and using pattern recognition to understand how nouns and verbs and other parts of speech are used in the composition of complete sentences. It also involves remembering vocabulary words so you can write more concepts. With that learning, you can compose a large variety of sentences. None of this process requires memorizing specific sentences composed by other people, copyrighted or not. It can be learned from reading copyrighted or public domain works. It is not illegal in the United States and is in fact referenced in Article I Section 8 Clause 8 of the Constitution as the original purpose of copyright: “To promote the progress of science and useful arts…” And in that historic context, science meant knowledge/learning.

Again, define “learning” before I can go on with this debate.

I doubt my definition will enable you to argue anything coherent since you seem to intentionally take the worst stances and also willfully misinterpret what I say, even after correction.

Again, define “learning” when it comes to machines as well.

Research how LLMs are trained.

Created through what?

A process.

Then the models come from thin air?

It comes from the process.

You didn’t read any recent lawsuit about AI training, did you? There is one case law, Thomson Reuters.

We’ve literally already talked about this. You don’t seem to remember anything we’ve said. Are you an LLM that can’t track a conversation over a long period of time? The appeal of Thomson Reuters v. Ross is not decided yet. It doesn’t mean anything yet. There is no precedent. And it doesn’t even cover the scope of what we’re talking about, so it won’t set the precedent you think it will.

You can only say this before March, 2025.

Again, again, again, you don’t understand US laws or how precedents are set. A ruling in a case that A) is on appeal B) doesn’t cover the entire scope of your argument and C) is not a precedent is not a good thing to rest your entire claim on. The details of Thomson Reuters v Ross aren’t the same as other scenarios. It’s not a blanket decision that all LLM training requires licensing.

And I can wait until another case law’s summary judgment is made, which is no later than this year.

So wait and stop making claims that you acknowledge aren’t based in current law or case law.

It’s marginally better for a creator to be “forced” to sign a licensing contract, than have their works taken without permission!

I’m a creator. It’s functionally the same to me. Being coerced is the same as not giving permission.

For two things: (1) the creator can get paid under the contract;

By whom? The AI company? It’s not going to pay small creators shit.

(2) if the contract turns out to be unfair labor practice or anything, they can sue with sufficient evidence.

Yes, with their big bags of money to hire lawyers, small creators will sue and definitely beat large corporations with top tier law firms on speed dial.

You really have no fucking idea how anything works.

“By default”? What the fuck again? How can the companies be charged with secondary liability of copyright infringement only when the output is published “by default”?

I didn’t say anything about whether they could be charged with secondary liability. You pretended in your unrealistic analogy that the output of an LLM is typically published publicly. It more often isn’t. People use a lot of LLMs privately.

By the way, there’s a recent complaint. Midjourney keeps the user’s image generation for public viewing in its “Explore” pages, and Disney (and Universal) cited the Explore pages for infringement proof. Say to Disney that it’s not “published”.

I have already said I’m not defending any particular AI company in any lawsuits. And you continue to bring them up as if I represent a person who supports them despite multiple iterations of me saying they’re not the people I’m concerned about. Big corporations can get fucked, regardless of what product or service they profit off of.

Which argument?

That a use is transformative.

And yet the expansion of copyright through Congress wasn’t illegal. End of story.

First, I didn’t claim it was illegal. Second, the Nuremberg Race Laws in Nazi Germany were legal. Legal is not the same as moral or ethical or right. Corrupt legislators can make immoral laws. Corrupt justices can make immoral case law. Are you willing here to say that you think all laws are morally correct? That would be a bold and stupid admission.

‘Oh! I am “deprived” of the ability to kill! I am “deprived” of the ability to commit adultery! I am “deprived” of the ability to steal!’

That you equate the use of a public domain work with the ability to kill or steal is insane! Holy shit!

Also, committing adultery is legal in the US, so the inclusion of that one is another admission on your part that you don’t understand what you’re talking about. Full stop.

Seriously, what the fuck?

That’s what I’m saying! What the fuck is up with your moral equivalences?!? Do you think people who use public domain works should be imprisoned or executed? That’s fucked up.

That is a political question. I won’t reply to this because it’s out of scope. You go persuade the politicians because this has nothing to do with me.

You’re arguing for a change in laws and rights. That’s political. And it’s infinitely more important than the petty 50 cents you think you might get out of licensing works for LLM training. And your admission that you don’t care about it makes your myopic obsession with a different country’s laws irrelevant.

It’s already happening even without the suit! Google admits that it uses YouTube videos to train their video generating AI, Veo 3 (https://www.cnbc.com/2025/06/19/google-youtube-ai-training-veo-3.html)

This supports my argument. I appreciate you pointing out that I’m right. Feel free to continue doing so. You’ve already done this a lot.

I’m fighting for my profit.

You won’t get shit. You’re fighting for nothing.

And dismiss “poor people” again. (This “poor people” is spamming bullshit as there is no single witness showing up.

I don’t know if you understand how arguments work, but it’s absurd to insist that I abduct someone and drag them into an argument they may not already be interested in. This is the weirdest demand for proof. Poor people exist. American citizens exist. Their rights are affected by case law and legal precedents. None of this should be in dispute. That you dispute it is intellectual dishonesty.

Why the fuck should I assume “poor people” exists

You shouldn’t assume. You should know. Have you never met a poor person? Are you not able to simply search the internet for the portion of the US population living under the poverty level?

or why should I care for them?)

If you don’t already understand why you should experience empathy for other human beings, I don’t know that I can answer this for you. If you admit you’re a sociopath incapable of empathy, then there’s no point in discussing anything anymore. You’re just a selfish toddler throwing a tantrum.

My moral stance is respect copyright and no copyright exemption for AI training.

That’s not a moral stance. You’ve admitted you’re only interested in your own profit. Unless you’re arguing your benefit is the only basis for morality, in which case you deserve no empathy from anyone else.

There is no “others” you mentioned that I should care about. Bring a witness here, or else I dismiss.

You can “dismiss” all you want. It doesn’t change the existence of other people you should care about. If you can’t even fathom the existence of such a person without having one of them shoved in your face, you’re at best a solipsist and at worst a narcissist and neither means your opinion carries any value.

I ask you. Because you brought this bullshit argument about “poor people” need access to LLM.

I didn’t say they need access to an LLM. I said they currently have a right to train an LLM if they want to because no law or legal precedent yet says otherwise. There are plenty of people training their own LLMs. Look at Hugging Face and GitHub. You’ll find lots. Ask them.

I dispute even the “majority” word of yours here.

You can’t dispute a fact without evidence. You have no evidence. Not everyone in the US is wealthy. Prove otherwise.

I would rather have polls like this before you claim your opinion is the majority

I didn’t claim my perspective was in the majority. The majority of people don’t care about this topic.

You should read that poll more closely. It doesn’t say what you think it says. And it doesn’t speak to the topic we’re actually discussing. Where was the question about licensing for LLM training?

Again, again, again, (how many times do I have to point this out?) you are not arguing with me. You’re arguing with a straw man. I’m not fighting for AI companies. I’m not saying AI should be used in drones in warfare. If I have ever said such a thing, you could fucking quote me and you never quote me whenever I ask you to quote me on something like this. Your failure to quote me is your tacit admission you’re full of shit.

Explorer09 (profile)

June 26, 2025 at 4:02 am

Well then. The ruling on Meta’s fair use is out.

https://storage.courtlistener.com/recap/gov.uscourts.cand.415175/gov.uscourts.cand.415175.598.0_1.pdf

Judge Chhabria grants fair use to Meta, but reluctantly. He complains that the plaintiffs brought the wrong arguments and warns that, if the evidence had been presented differently, the judgment would be different.
This fair use ruling is limited to this case against the 13 plaintiff authors only. This is not a class action, so other authors can still sue Meta with different evidence.
Judge Chhabria rejects Judge Alsup’s reasoning (Bartz v. Anthropic) that AI learning is like human learning (despite the conclusion granting fair use for Meta). In the opening section: “[W]hen it comes to market effects, using books to teach children to write is not remotely like using books to create a product that a single individual could employ to generate countless competing works with a miniscule fraction of the time and creativity it would otherwise take. This inapt analogy is not a basis for blowing off the most important factor in the fair use analysis.”
The judge also noticed that Meta’s AI can regurgitate parts of the plaintiffs’ works, but no more than 50 words. The limit is less than what Google Books could output in the Google Books case (Authors Guild v. Google).
The judge rejects the assumption that AI only copies unprotectable ideas during training (as in the U.S. Copyright Office report).
On the market factor, market dilution is addressed (as in the U.S. Copyright Office report), but the judge criticizes the plaintiffs’ lack of evidence of harm to their books (lost sales, etc.), even though dilution is possible.
The judge gives a warning that this fair use grant doesn’t mean Meta’s copying was lawful. In other words, the AI companies should seek licences for training data anyway.


The U.S. Copyright Office’s Draft Report On AI Training Errs On Fair Use

from the fair-use-matters dept
