Anonymous Coward

Is the alternative (re: not addressing copyright law with regards to AI) actually any better?

We’ve already seen companies limit API access, ostensibly because of cost, but simultaneously raising rates as the value of API access becomes more associated with AI outcomes.

We already see the largest, most influential tech companies able to far outstrip other competitors with AI, as they’re better able to parse the massive amount of data available.

Copyright currently exists to help the middlemen, not the artists, but is that an argument against applying copyright to AI, or an indictment of our copyright system in general?

Arguing that artists are already required to bend over backwards for major companies, thus copyright shouldn’t be enforced with regards to AI sounds like a bit of a backwards argument.

Mike Masnick (profile)

December 4, 2023 at 12:05 pm

Re:

Is the alternative (re: not addressing copyright law with regards to AI) actually any better?

Yes. Absolutely.

We’ve already seen companies limit API access, ostensibly because of cost, but simultaneously raising rates as the value of API access becomes more associated with AI outcomes.

A practice I’ve criticized, and one I don’t think is all that sustainable. I mean, many of the AI systems don’t use the APIs anyway, but resort to scraping or other content access mechanisms. So while the API pricing is notable, I’m not sure it makes much of a difference.

We already see the largest, most influential tech companies able to far outstrip other competitors with AI, as they’re better able to parse the massive amount of data available.

I… also don’t think that’s accurate. I mean, go back just a few years ago and we kept hearing how the only companies that would be able to parse the data and offer AI were Google, FB, Amazon, and Apple.

And yet, the leaders in AI are… not really any of those guys. They’re players, certainly, but OpenAI leads, and other companies are doing well as well, such as Anthropic.

It’s true that many of these companies have now taken billions of dollars from the big tech companies, but that’s because the big tech companies haven’t actually been able to get the same results.

Copyright currently exists to help the middlemen, not the artists, but is that an argument against applying copyright to AI, or an indictment of our copyright system in general?

It is an indictment of copyright, definitely. But it’s an indictment of what is inherent in the copyright system. And will remain, even here.

Arguing that artists are already required to bend over backwards for major companies, thus copyright shouldn’t be enforced with regards to AI sounds like a bit of a backwards argument.

But I’m not arguing either that “artists should bend over backwards” nor am I arguing that “copyright shouldn’t be enforced.” I’m saying there’s NO COPYRIGHT TO ENFORCE in training, because it’s fair use.

And, in the long run that will HELP artists, and not leave them as beholden to big companies, because there will be much more competition.

Anonymous Coward

December 4, 2023 at 2:04 pm

Re: Re:

Not the person you originally replied to, but

And, in the long run that will HELP artists, and not leave them as beholden to big companies, because there will be much more competition.

What exactly does this mean? What form of “competition” are you talking about?

Anonymous Coward

December 4, 2023 at 2:15 pm

Re: Re: Re:

Generative AI is a tool, and needs human guidance to produce meaningful output. Licensing would make it so that only corporations could afford the cost and the bureaucracy of dealing with licensing, giving them another way to trap and exploit artists. If anybody can train a model, or use a model without licensing, then artists are able to avail themselves of the tool without signing contracts which take away some of their control over their own works.

Arianity

December 5, 2023 at 1:34 am

Re: Re:

And, in the long run that will HELP artists, and not leave them as beholden to big companies, because there will be much more competition.

This feels like motivated reasoning. You want it to happen for fair use reasons, but this angle is pretty unfleshed out. But it solves what would otherwise be thorny tradeoffs.

You’re right that the current system only gives them pennies, but I can see why artists wouldn’t be thrilled to give up the pennies they’re getting now, given the alternative.

Anonymous Coward

December 5, 2023 at 4:54 am

Re: Re: Re:

Here’s a little exercise for you, that may take time to do. Look at the number of artists and creators publishing works on the Internet, and find out what percentage make any money from their art. I think that you will find that making money from art happens because a fair number of people, in the tens of thousands, like the work, and a small percentage of those decide to support the artists and creators. Copyright doesn’t come into it, as those supporting the creators do so because they want more new works, and supporting the artists helps that happen.

As ever, the lot of most of them is very little or no money, because they do not attract a big enough audience. Also note that pre-Internet, most works submitted to publishers sat in a pile and were never even looked at, and of those that were looked at, only a few were chosen for publication, so as a result most creators made nothing from their work.

arartist

December 5, 2023 at 5:44 am

Re: Re: Re:²

I think you don’t actually understand at all how artists make their living. The way artists make their living is by being employed by companies who need them.
Generative AI renders artists obsolete, and about 90 percent of the artists currently employed by advertising agencies, animation studios, game studios and many other industries will be unemployed.

Most artists don’t want to be unemployed and would prefer to keep doing what they currently do for a living, and not have to compete against AI that is trained on their work without their consent.

Anonymous Coward

December 5, 2023 at 6:57 am

Re: Re: Re:³

Most artists don’t want to be unemployed and would prefer to keep doing what they currently do for a living, and not have to compete against AI that is trained on their work without their consent.

Most people don’t want to be unemployed and would prefer to have a job to make a living, even if it is something they don’t like doing. Imagine if everyone could do what they liked and get paid for it.

In reality, almost all artists are unemployed as artists; some make some money on the internet, and a very few, relatively speaking, have it as a job.

Anonymous Coward

December 5, 2023 at 7:16 am

Re: Re: Re:³

The way artists make their living is by being employed by companies who need them

I am sure all those creators who have worked hard to build an audience while publishing via YouTube, just to give one self-publishing platform, far outnumber those who are employed by the corporations. It’s like how at one time landscape and portrait painters could find jobs with various magazines, or get contracts to paint landscapes and portraits, and then the camera came along and within a few years they were out of business.

Those relying on corporations to make a living may have to find a new job, while those who have built an audience via self publishing have a new tool to help them create, where it is useful. It is not as if generative AI will stop creators from making a living, but it may eliminate the need for those with technical skills who work to implement somebody else’s vision.

Anonymous Coward

December 5, 2023 at 9:51 am

Re: Re: Re:³

As it stands, the random content generators aren’t going to put anyone out of business… yet.

But some, ahem, unemployment will happen as people are temporarily shuffled around in their employment status due to random content generators being able to scoop up a fraction of those jobs.

As it is, the biggest reason for the layoffs is… the economy. Oh, and Elon firing half of Twitter’s workforce, including software engineers…

Your fearmongering about the random content generators notwithstanding, that is.

Anonymous Coward

December 4, 2023 at 10:37 am

Why is Allen Iverson getting all the attention nowadays?!?
What has he done, of significance, lately?

Thad (profile)

December 4, 2023 at 10:47 am

Anyone who thinks expanding copyright will help individual creators rather than corporate publishers hasn’t been paying attention the last…every single time we’ve ever tried that.

bluegrassgeek (profile)

December 4, 2023 at 11:25 am

Say what?

“Let companies rip off your work, or else only Big Tech will be able to rip off your work” is not a compelling argument.

Mike Masnick (profile)

December 4, 2023 at 11:59 am

Re:

“Let companies rip off your work, or else only Big Tech will be able to rip off your work” is not a compelling argument.

Well, your first (and largest) problem is assuming “fair use” is the equivalent of “ripping off your work.”

Fix that and the rest of your confusion might melt away.

Anonymous Coward

December 4, 2023 at 12:55 pm

Re: Re:

It’s absolutely not Fair Use. If a music producer wants to use a drum beat from another song and put it into their own music they have to ask the original artist for permission, get sampling clearances and pay for that. Just because you changed the tool you use to do that from Logic Pro to OpenAI doesn’t magically make that process legally any different.

No one’s asking for the expansion of copyright. Artists are asking that AI manufacturers and users are held to the same legal standards they already have to follow.

Anonymous Coward

December 4, 2023 at 1:16 pm

Re: Re: Re:

It’s absolutely not Fair Use. If a music producer wants to use a drum beat from another song and put it into their own music

Using a sample is not the same as using a similar style. In your world, things like jazz, ragtime, rock, etc. could not exist, because they all learned from the early adopters of a style.

Anonymous Coward

December 4, 2023 at 5:22 pm

Re: Re: Re:

If a music producer wants to use a drum beat from another song and put it into their own music they have to ask the original artist for permission, get sampling clearances and pay for that

How many producers have paid for their usage of the Four Chords?

Hell, if the lawsuits that Ed Sheeran and Katy Perry faced are anything to go by, not even following the rules will get you out of trouble. All you need is someone with an ax to grind and the resources to drag out a lengthy lawsuit.

No one’s asking for the expansion of copyright. Artists are asking that AI manufacturers and users are held to the same legal standards they already have to follow.

Artists are asking for people to pay for the mimicry of their art style, which is not under copyright.

Sarah Andersen is attempting to use copyright law to stop incels from AI-generating misogynistic content. The motivation is understandable, the law is not. Copyright law would not have protected her work from this usage. On the other hand, if a large corporation generates content similar to hers – by AI or otherwise – and sues her for copyright infringement, that would be the result of the precedent she sets if she wins a copyright lawsuit based on style and usage she disagrees with.

cpt kangarooski

December 4, 2023 at 6:18 pm

Re: Re: Re:

If a music producer wants to use a drum beat from another song and put it into their own music they have to ask the original artist for permission, get sampling clearances and pay for that.

That is because Judge Duffy, frankly, was biased against rap music, and was a jerk.

If a visual artist wants to use an image from another work, and clips it out and pastes it into a collage, that’s fine art. There is no reason whatsoever to treat audio sampling any differently.

Mike Masnick (profile)

December 5, 2023 at 12:15 am

Re: Re: Re:

It’s absolutely not Fair Use.

It absolutely is. Training is the equivalent of reading. Reading is not infringement.

If a music producer wants to use a drum beat from another song and put it into their own music they have to ask the original artist for permission, get sampling clearances and pay for that.

Yes. That’s true. But this is not sampling. They are not using the material in another work. They are training on it and producing something different.

No one’s asking for the expansion of copyright. Artists are asking that AI manufacturers and users are held to the same legal standards they already have to follow.

These cases seek to expand copyright and, in particular, overturn the Google Books and Hathitrust cases.

bluegrassgeek (profile)

December 5, 2023 at 4:19 am

Re: Re: Re:²

It absolutely is. Training is the equivalent of reading. Reading is not infringement.

That is just fundamentally wrong, and why your stance makes no sense. It’s not “just reading”.

They are not using the material in another work. They are training on it and producing something different.

This is fucking ridiculous levels of justifying ripping off someone else’s work. These systems aren’t “producing something different”, they’re making slight tweaks to existing work and passing it off as new. Taking two things and cut & pasting them together still means you’re using those two works. You can argue if it’s transformative or not, but you can’t justify “[t]hey are not using the material in another work.”

Anonymous Coward

December 5, 2023 at 5:05 am

Re: Re: Re:³

This is fucking ridiculous levels of justifying ripping off someone else’s work.

Do you play a musical instrument, take photographs, make videos, tell stories? If you do, are you not ripping off the works that you used to learn to do those things?

Anonymous Coward

December 5, 2023 at 7:00 am

Re: Re: Re:³

That is just fundamentally wrong, and why your stance makes no sense. It’s not “just reading”.

So tell us, what is it doing?

These systems aren’t “producing something different”, they’re making slight tweaks to existing work and passing it off as new

Which is something that most artists also do, or do you actually believe that there is wholly 100% original art being produced by artists?

Anonymous Coward

December 5, 2023 at 9:57 am

Re: Re: Re:³

That is just fundamentally wrong, and why your stance makes no sense. It’s not “just reading”.

Is learning 1+1=2 a copyright infringement? Is learning how to fucking read and write infringement?

Your argument, as it stands, will make education a form of copyright infringement. And will legitimately limit education to the ultrarich…

These systems aren’t “producing something different”, they’re making slight tweaks to existing work and passing it off as new.

And that’s how humans make new discoveries.

The Carolingian Revolution was the culmination of these “making slight tweaks to existing works” and compiled into a codified standard. And the wheel. And fire, and the Printing Press….

And pretty much all the music, all the writing, and all the math.

I don’t like the NFT scammers promoting random content generators as a form of artificial intelligence (they are not), and these, ahem, random content generators can do good work in strictly and narrowly defined conditions…

But crowing about training these systems being no different than infringement?

That’s a copyright maximalist argument, and congratulations, YOU JUST REPEATED THE FUCKING THING.

Nihiltres (profile)

December 5, 2023 at 11:12 am

Re: Re: Re:³

These systems aren’t “producing something different”, they’re making slight tweaks to existing work and passing it off as new. Taking two things and cut & pasting them together still means you’re using those two works.

I’m afraid that you must have been misled about how the technology works. It does not “cut and paste things together”. The process for text-to-image latent diffusion starts with an image of coloured static, pseudorandomly generated from the “seed” value, an integer value. As the diffusion process steps forward, it mutates the image to reflect the keywords in its prompt as though it were “seeing shapes in static”, slowly sharpening the image. Check out this example to get a visual idea.
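To make the “seeing shapes in static” idea a bit more concrete, here is a toy sketch in Python. It is not how any real diffusion model is implemented: a real system uses a trained neural network, working in a latent space, to decide at each step what the static should become. Here that network is replaced by a fixed list of numbers (`PROMPT_GUESS`, a hypothetical stand-in I made up for “what the prompt looks like”), and every function name is my own invention. The only points it illustrates are that the starting static is fully determined by an integer seed, and that the image is sharpened gradually over many small steps rather than assembled from pasted pieces.

```python
import random


def seeded_static(seed, size):
    """Pseudorandom 'static': the same integer seed always gives the same start."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(size)]


def denoise_step(image, guess, strength):
    """Nudge every pixel a small fraction of the way toward the current guess."""
    return [p + strength * (g - p) for p, g in zip(image, guess)]


def toy_diffusion(seed, guess, steps=20, strength=0.2):
    """Start from seeded static and sharpen it step by step."""
    image = seeded_static(seed, len(guess))
    for _ in range(steps):
        image = denoise_step(image, guess, strength)
    return image


def distance(a, b):
    """Total absolute difference between two equally sized 'images'."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical stand-in for the prompt conditioning (an assumption, not a real model).
PROMPT_GUESS = [0.0, 0.25, 0.5, 0.75, 1.0, 0.75, 0.5, 0.25]

start = seeded_static(42, len(PROMPT_GUESS))
result = toy_diffusion(42, PROMPT_GUESS)
print("distance before:", distance(start, PROMPT_GUESS))
print("distance after: ", distance(result, PROMPT_GUESS))
```

Running it with a different seed gives different starting static but ends up near the same guess, which is roughly why changing only the seed yields variations on a prompt.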

Once you see it pick up on features of a prompt, you can understand that it’s conditioned on words, which we all know are a smaller space than images (perhaps by the colloquial thousand?) and so share vectors with other examples of the same word. The prompt “oil painting” will evoke not only the common understanding of the phrase but many other senses of “oil” and “painting”. Diffusion users are recombining groups of features common to the keywords they use. That sort of “averaging” effect puts the elements used down in complexity, so we have to ask: are the used elements even copyrightable at their level of atomicity? It would clearly be ethical and legal for a human to look at a thousand images of a “dingleblorp” (an object no human has ever seen) and then draw a dingleblorp from memory, averaging a bit across the many they saw, so why should it be unethical or illegal for a machine to do the same? If a human can do it for a dingleblorp, then why can’t a machine, which has never seen a fire hydrant, do it with a million images of a fire hydrant?

I agree that it gets more complicated when the keywords are proper nouns, and are much more likely to reproduce something specific if used in a prompt. Prompting an artist’s name to evoke the style and techniques they use, or prompting a work title specifically, are techniques that are ethically dubious on their face, inviting cheap copying and ripoff. But the options that they represent ought nonetheless to exist: we shouldn’t exclude the sorts of uses that would be fair use or de minimis. We should have Mickey Mouse in the dataset, because someone’s going to make fair-use parody of Mickey Mouse. Techniques ought to be allowed that would evoke an artist or even a specific work but that are fair-use in context. It’s what people do with it that matters.

You may have been misled by examples where someone used diffusion software to do something that they shouldn’t have. There’s an important difference between Duchamp’s L.H.O.O.Q. and prompting “Mona Lisa Leonardo da Vinci”, and in practice some people will use diffusion in image-to-image mode, mutating an existing image just a bit or reinterpreting a provided image, in ways that are cheap ripoffs of the original. There are valid ways to use image-to-image diffusion too, but it’s user conduct that makes the difference, just as a user of a pencil can trivially infringe on copyright.

Mike Masnick (profile)

December 5, 2023 at 10:38 pm

Re: Re: Re:³

That is just fundamentally wrong, and why your stance makes no sense. It’s not “just reading”.

It literally is just reading. In the process it indexes, and indexing is already considered fair use.

So, the only one “fundamentally wrong” is you.

This is fucking ridiculous levels of justifying ripping off someone else’s work.

Again, fair use is not “ripping off someone else’s work.” Stop saying it is.

These systems aren’t “producing something different”, they’re making slight tweaks to existing work and passing it off as new.

That is not even remotely how these systems work.

I’m sorry, but you are ridiculously uninformed.

Taking two things and cut & pasting them together still means you’re using those two works.

And if that were happening, you’d have a point.

But it’s not.

So you don’t.

bhull242 (profile)

December 11, 2023 at 3:20 pm

Re: Re: Re:³

That is just fundamentally wrong, and why your stance makes no sense. It’s not “just reading”.

Yes, yes it is. That’s literally what the AI does. It reads and interprets that data, but doesn’t copy it.

This is fucking ridiculous levels of justifying ripping off someone else’s work. These systems aren’t “producing something different”, they’re making slight tweaks to existing work and passing it off as new.

Nope. They’re producing something different that has a similar style. Styles aren’t copyrightable.

Taking two things and cut & pasting them together still means you’re using those two works.

AIs don’t use cutting or pasting at all. Again, you clearly don’t know how AI works. The AI retains no copies of anything in the training data at all.

You can argue if it’s transformative or not, but you can’t justify “[t]hey are not using the material in another work.”

Yes, yes we can. You still haven’t even demonstrated copying is even occurring. Again, the AI retains none of its training material at all.

bluegrassgeek (profile)

December 5, 2023 at 4:14 am

Re: Re:

Which part of ChatGPT’s systems involves “fair use”? Explain that, and your own stance might need adjusting.

Anonymous Coward

December 5, 2023 at 7:01 am

Re: Re: Re:

So you don’t understand the word “reading” then?

Mike Masnick (profile)

December 6, 2023 at 12:22 am

Re: Re: Re:

Which part of ChatGPT’s systems involves “fair use”? Explain that, and your own stance might need adjusting.

I mean, this isn’t difficult:

Multiple cases have found that indexing copyright-covered work to create a new service is fair use: search has been deemed to be fair use, including the archived versions of the indexes. Book scanning for the purpose of search has been found to be fair use.

What’s happening with AI training is even less of a copyright issue than those other examples. In both of those other examples you could use the tools to see some of the copyright-covered content.

I am sorry. You do not seem to understand (1) how AI works (2) how copyright works or (3) how fair use works.

Fix that and you might start understanding things.

Anonymous Coward

December 4, 2023 at 12:04 pm

Re:

Somewhere along the line, copyright has expanded beyond the control of producing copies to controlling how people can use the contents of a work. Any generic form of usage licensing, such as the performance licenses that venues are required to get to allow musical performances, becomes a means of supporting another layer of parasites, and of transferring money from the poorer artists to the richer artists.

While getting a license seems such a simple solution, it is hugely impractical, especially as self publishing exists. Just who keeps track of who owns what copyrights when tens or hundreds of thousands of new works are published every day? Any licensing scheme similar to the collection societies means that a new layer of parasites is created, and the publishers take their cut of the license fees, while a few crumbs may make it to the richest creators that they have signed on. Obscure creators and self publishers will likely find their works covered by any licensing fee, but they will not see a penny of that income.

Anonymous Coward

December 4, 2023 at 9:12 pm

Re: Re:

While getting a license seems such a simple solution, it is hugely impractical, especially as self publishing exists. Just who keeps track of who owns what copyrights when tens or hundreds of thousands of new works are published every day?

“Getting a license is easy. If you won’t do that, you don’t deserve to use my work.”
“Okay, where’s the registration for the copyright you hold? If you don’t actually hold a registered copyright, on what grounds can you claim the copyright belongs to you?”
“Wait, not like that.”

Anonymous Coward

December 5, 2023 at 9:46 am

Re: Re: Re:

I know you are joking, but with a population of 333,287,557 (2022 census), and allowing photos and videos, along with created text and visual arts pieces, just how big would the copyright office have to be to handle registration of works being created in the US. A system designed to meet the needs of the legacy publishers, where it was easy to list the works published in a year in a couple of book sized catalogues, is not going to work in a world with easy self publishing and the flood of works that pre-Internet would never have been seen by more than a few family members.

cpt kangarooski

December 5, 2023 at 11:04 am

Re: Re: Re:²

with a population of 333,287,557 (2022 census), and allowing photos and videos, along with created text and visual arts pieces, just how big would the copyright office have to be to handle registration of works being created in the US.

Not that big. (Also, US censuses are decennial, in years evenly divisible by 10)

In 1970, the census determined the US population to be 203,302,031. Copyrights operated under the 1909 Copyright Act still, and so registration was essential to holding a copyright in a published work. Things seemed to work okay, and they didn’t even have the advantage of all of the modern computer equipment we have now.

According to the Copyright Office report from 1970, they had 316,465 registrations, only 23,549 renewal registrations in 1969, but I don’t think that they’ve ever had more than a few hundred employees.

Bringing back formalities, or even strengthening them to new levels, as I strongly believe should be done, as well as shortening terms but increasing renewals (ditto), would require more work by the Copyright Office, but a lot more of it could be automated than before. Renewals shouldn’t require human involvement at all on the administrative side, for example. (Much like registering and renewing domain names has gone from something done by two or three people at InterNIC to a big but thoroughly computerized industry, so long as there are no complaints or disputes requiring human interaction)

Remember, the issue isn’t the number of works being created; it’s how many of those works’ creators think that it’s worth the time, trouble, and money to bother to register. I bet you didn’t register your post that I’m replying to. I know I’m not going to bother to register this one.

If the author doesn’t care enough about getting a copyright to fill out a form, submit a couple-three best copies, and pay a nominal fee (just to avoid people spamming the system — a dollar would be enough for me), then why should anyone else care about granting them a copyright?

This self-selection works wonders to keep the number of registrations, and thus copyrights, to a manageable level — we just need to make copyrights contingent on registrations.

Anonymous Coward

December 5, 2023 at 12:02 pm

Re: Re: Re:³

You are overlooking a big difference between the world pre-Internet and post-Internet. Pre-Internet, the works being registered for copyright purposes were those few works selected by the publishers, labels and studios. Those registered in any year could be catalogued in a few book-size volumes, which for books at least was done. Post-Internet, anybody can publish a work, and on a worldwide scale. YouTube has about 500 hours’ worth of videos published every minute, say 400 works a minute. Instagram has about 64,000 photos published every minute. Most of those works are new works, whose copyright belongs to the creator, and those are only two places where new works are published.

Add in article-length blog posts, music and books published in various places around the Internet and you have a problem that is bigger than any practical copyright registration system. Just the two examples I have given are approaching 10 million new works being published a day. That figure should also tell you how little of human creativity was being published when publication had to go via a gate keeper.
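For anyone who wants to check the scale, here is the arithmetic as a short Python sketch. The per-minute rates are just the rough figures quoted above (about 400 YouTube works and 64,000 Instagram photos a minute), taken at face value and not verified; the variable names are only for illustration.

```python
MINUTES_PER_DAY = 60 * 24

# Rough per-minute figures as quoted in this comment (unverified estimates).
YOUTUBE_WORKS_PER_MINUTE = 400
INSTAGRAM_PHOTOS_PER_MINUTE = 64_000


def works_per_day(per_minute):
    """Scale a per-minute publication rate up to a full day."""
    return per_minute * MINUTES_PER_DAY


youtube_daily = works_per_day(YOUTUBE_WORKS_PER_MINUTE)
instagram_daily = works_per_day(INSTAGRAM_PHOTOS_PER_MINUTE)
total_daily = youtube_daily + instagram_daily

print(f"YouTube:   {youtube_daily:,} works/day")
print(f"Instagram: {instagram_daily:,} works/day")
print(f"Total:     {total_daily:,} works/day")
```

Taken literally, those two rates come to roughly 92.7 million works a day, so the 10 million figure is, if anything, a conservative floor for the point being made.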

cpt kangarooski

December 5, 2023 at 2:47 pm

Re: Re: Re:⁴

I don’t think so.

Remember, I’m not saying that every published work should be registered. I’m saying that aside from a minor copyright on unpublished works (to avoid pre-publication piracy), copyrights should not be granted without registration. Only the copyright claimant should be allowed to decide whether to register, and to actually go through the registration process.

And the process should not be free; it should involve some effort by the registrant, and it should involve some monetary cost.

Most authors or other copyright claimants won’t bother — indicating that they were willing to create and publish a work without copyright acting as an incentive, and that therefore they are undeserving of copyrights. (Which should be reserved for situations where they are necessary for works to be created and published)

Further, require registration prior to publication — except for works created contemporaneously with publication, such as live performances — and expand the concept of what constitutes publication to include any public availability of the work, rather than of a copy, making performances and displays count too.

Anyone will be able to make, register, and then upload to YouTube their react video or unboxing (or whatever is trendy now; I don’t keep up with what the kids think is cool), or upload to their blog, or whatever. But I bet that by and large they will not bother to.

Which is fine — it was their choice, and now those works are immediately in the public domain. I doubt it will have much effect; who would want to pirate this stuff? And if there is a lot of piracy, well, I’m not willing to protect authors from making mistakes or bad deals any more than we protect anyone else from such things.

Maybe there’s 10 million new works every day, but I bet that if we strictly required formalities, there’d be only a few thousand registrations a day. Because when copyrights aren’t free, and aren’t automatic, and claimants need to think about it, and put in some effort and money to register, they think a lot harder about whether they really need a copyright or not.

And if a lot of works are registered, and we need to increase the staff of the Copyright Office, well, that’s why it’s part of the government. I don’t mind it being taxpayer supported. (The registration fee isn’t meant to support it — it’s meant to impose a minor but tangible hurdle on the claimant so that they don’t spam registrations)

Anonymous Coward

December 5, 2023 at 3:25 pm

Re: Re: Re:⁵

Congratulations, you have proposed a two class copyright system, where copyrights largely exist for works that go through a gatekeeper publisher, and leave everything else unprotected. Also, if somebody has an unprotected work become popular, you have created a problem when they want to protect it, and maybe sell rights, like the movie rights. Also, you would have eviscerated the creative commons and opensource/free software licenses.

cpt kangarooski

December 5, 2023 at 8:36 pm

Re: Re: Re:⁶

Congratulations, you have proposed a two class copyright system, where copyrights largely exist for works that go through a gatekeeper publisher, and leave everything else unprotected.

I am okay with that. Firstly, because it has basically worked out okay for the history of copyright since the Statute of Anne, and secondly because no one would be stopping self-publishing authors from 1) self-publishing, or even 2) obtaining copyrights and then self-publishing … which also is actually how things worked for the same period of history, except that now self-publishing costs less.

Also, if somebody has an unprotected work become popular, you have created a problem when they want to protect it, and maybe sell rights, like the movie rights.

There is no problem at all. The work would be entirely impossible to protect. It would in fact, be in the public domain. This is deliberate.

The purpose of copyright is to encourage authors to create and publish works that they otherwise would not have created and published. If an author is willing to create and publish a work without a copyright, it is the height of stupidity to grant them a copyright, and it is directly contrary to the public interest, which is in favor of works not being copyrighted when it is not necessary for them to be.

If the author of such a work wanted protection, they should have thought of that before they created and published the work, when they’d have an opportunity to register it and get a copyright.

Also, you would have eviscerated the creative commons and open source/free software licenses.

Again, not worried about it. Those works would also be in the public domain, unless someone were concerned enough about copyright to register for one from the beginning. It would make things a lot like the BSD License, except with no requirement of credit.

Anonymous Coward

December 6, 2023 at 2:45 am

Re: Re: Re:⁷

Firstly, because it has basically worked out okay for the history of copyright since the Statute of Anne

Pre-Internet, the only way to publish was to use a publisher, who dealt with the registration of the copyright, which by the way was automatically granted to the author. Indeed, giving authors copyrights which they could transfer to a publisher in consideration of a royalty was how copyright worked. Also note that before publication, the author had control over who could see the work, and back in those days making a copy meant the laborious method of handwriting a new one. Indeed, the purpose of copyright protection was to give the means to stop another printer printing a release, either by getting a copy via industrial espionage, or by copying a published book if it was a big hit and a second printing was required. (Producing a second printing involved all the same work as producing the first, as the type used was recycled after printing.)

Also, if only 10% of works being self-published were being registered, that is a million registrations a day to be dealt with, and any delays would impact publishing on a schedule, or protecting analysis of current events.

Also, unless a registration system could deal with the real volume of new works, and in a world where the schedule from completion to publication can be measured in minutes, as opposed to the months of the printed book, record, film etc. world, you are creating a system that has two classes, those who have the time and money to register their works, and those who don’t.

cpt kangarooski

December 6, 2023 at 7:29 pm

Re: Re: Re:⁸

So what’s changed? Copying is still exactly as laborious for a pirate as it is for a publisher, or more so due to the latter’s ability to operate openly. The only leveling effect has been that publishers for some time now have refused to take advantage of modern technology.

(Producing a second printing involved all the same work as producing the first, as the type used was recycled after printing)

Depends. If the work was expected to be a big hit, it would be stereotyped or later, plates would be made and kept. If it wasn’t expected it would need to be reset, but sooner or later the publisher would take those steps to avoid having to reset it any more than necessary.

Also, if only 10% of works being self-published were being registered, that is a million registrations a day to be dealt with, and any delays would impact publishing on a schedule, or protecting analysis of current events.

First, as is typical in copyrights, patents, etc., you don’t have to wait for the registration to issue. If you’ve filed your paperwork, you’ve got your priority date, and you go ahead. The registration agency will catch up eventually.

Second, you’re damn optimistic. If it costs even a little money and takes a bit of effort, a lot of people just aren’t going to bother. I doubt you’d see anything like 10%. Some guy who goes on YouTube to talk about his political opinions, or to show off some piece of old technology he found, or just to show off his funny cat video, is not going to bother to take any affirmative steps to register a copyright. If he doesn’t care, why should anyone else?

It’s a self-sorting mechanism to determine who was actually incentivized by copyright, and therefore should get one, and who was not, and therefore should not get one. If you have a better way to figure out how to only grant copyrights where it was necessary for incentivizing an author to create and publish a work, I’d like to hear what it is.

Also, unless a registration system could deal with the real volume of new works, and in a world where the schedule from completion to publication can be measured in minutes, as opposed to the months of the printed book, record, film etc. world, you are creating a system that has two classes, those who have the time and money to register their works, and those who don’t.

Doesn’t strike me as being that difficult, especially since, again, all that is necessary to be done quickly is to file and deposit. Copyrights aren’t examined like patents or trademarks (and therefore registrations should not be treated as having any weight as to copyrightability) and it won’t be hard to dump it into the database and issue a registration number within a reasonable (but far from instant) time.

Anonymous Coward

December 7, 2023 at 3:42 am

Re: Re: Re:⁹

You are destroying the creative commons license, and the GPL and similar licenses, as there would be no protection of the work unless it was registered. Further, the creative commons licenses allow a creator to decide how others can use their works. Also, how often would software need to be re-registered, especially when development is carried out in public in a Git repository?

Also, how do you prevent someone registering someone else’s unregistered work, and turning copyright against the original creator so that they can successfully monetize the work? Also, do you expect the register to include a copy of the work? Because without that, how is it useful to establish priority when ownership is disputed?

The existing copyright system is not fit for purpose in the Internet age, but what you are proposing is even worse because creators would have no protection unless they register their works.

cpt kangarooski

December 7, 2023 at 3:08 pm

Re: Re: Re:¹⁰

You are destroying the creative commons license, and the GPL and similar licenses, as there would be no protection of the work unless it was registered.

As noted, I don’t think it’s that dire, and also it wouldn’t bother me if it were. The GPL, Creative Commons, etc. are attempts to make the best out of what is already a bad situation with copyrights being automatically granted upon creation. If most works were in the public domain, they would not be needed.

If it is important to an author to use such a license, they would merely need to register, just like anyone else. Presumably we’d see GPL4, which would require contributors to register their contributions so that the license continued to work basically as normal.

(Although I question what happens if a contribution is in the public domain, which is a scenario that can happen now. For example, suppose a GPL-ed piece of software is modified by a federal employee in the course of their duties, which means that it is uncopyrightable per 17 USC 105. Does the GPL permit the modified, partially GPLed, partially public domain work to be distributed, or would it prohibit the distribution of whatever fell under the GPL? The answer may be instructive as to the proposed reform.)

Whereas if registering contributions was too much of a hassle, I think that it would result in contributors looking for more permissive licensing, or — or — taking advantage of the greater quantity of public domain works for which there would be no hassle whatsoever. Practically a problem that solves itself!

Also, how do you prevent someone registering someone else’s unregistered work, and turning copyright against the original creator so that they can successfully monetize the work.

Same way we do that now.

As I mentioned earlier, I’m not unsympathetic to the concern over manuscript piracy, and obviously copyrights must initially vest in the author.

I would suggest that there is a weak, and short-lived copyright granted upon creation which is only useful for the purpose of providing authors with a remedy in the event that someone publishes (inclusive of public performance or display) their work without authorization. This gives the author time to shop the work around. But if the author publishes without registration, the copyright terminates. The protections should be geared to go after the specific culprits, but not members of the general public who happen to infringe; if authors want strong protection, they should register. And it should be short-lived so that works don’t molder on the shelf forever. The goal of copyright is to get works created and published that otherwise would not be, and to protect them as minimally and briefly as possible. If a work takes more than, say, 5 or 10 years to get published — a specific time period will need to be determined — then that’s long enough. There is a point in time when it is better that the work should be pirated than never known to the public at all.

Note also that an author can register and not publish, but because they’d have to deposit, the public still gets the work in the end. If one were worried about not being able to publish quickly, that would be the way to go.

Also, do you expect the register to include a copy of the work, because without that how is it useful to establish priority when ownership is disputed.

Deposit is a traditional copyright requirement, and very useful for many purposes including establishing priority. It should be made quite strong. Indeed, for software, I’ve suggested in the past that it should require deposit of source with sufficient comments that a person having reasonable skill in the art could understand and usefully modify the work. And that for an author (or someone acting under their aegis, like an authorized publisher) to apply DRM to a work should immediately terminate the copyright. (Publication contracts should not permit authors to waive damages from publishers who do this, so that they have some recourse.) Further, the Copyright Office and Library of Congress should sponsor efforts to circumvent DRM, since it leaves the public better off.

The existing copyright system is not fit for purpose in the Internet age, but what you are proposing is even worse because creators would have no protection unless they register their works.

Not what I’ve said, and if you look above you’ll see that, but the copyright system should strongly urge authors to register their works as soon as possible, and providing little to nothing for authors who fail to is part of that.

It worked great for centuries and there’s no reason it cannot continue to work well. Remember, the attacks on formalities began long before the Internet was dreamed up, much less before it became widely used.

Anonymous Coward

December 8, 2023 at 4:00 am

Re: Re: Re:¹¹

Presumably we’d see GPL4, which would require contributors to register their contributions so that the license continued to work basically as normal.

What do they register: the project, or every little update made public?

Deposit is a traditional copyright requirement, and very useful for many purposes including establishing priority.

Not what I’ve said, and if you look above you’ll see that, but the copyright system should strongly urge authors to register their works as soon as possible,

That worked when registration was only a requirement to protect the copyright of works about to be published, as the risk to other works was slight, requiring both the stealing or copying of a manuscript, and finding a publisher for the stolen work. (Note before the mid 70’s, copying involved writing out, typing up, or photocopying from a paper copy).

Registration would now need a system at the scale of Google to handle and store deposits in a usable form, and would be objected to by the traditional publisher, as any security failures of the system would allow pirates to steal their works.

I don’t think you grasp the scale of the problem: there are now more works being published in a minute or two than used to be published in a year. Also don’t forget that copyright applies to unpublished works as well, and is now important as gaining a copy of an unpublished work is now just one security breach away.

bhull242 (profile)

December 11, 2023 at 3:42 pm

Re: Re: Re:¹²

Also don’t forget that copyright applies to unpublished works as well, and is now important as gaining a copy of an unpublished work is now just one security breach away.

That’s still the case now. Or do you think that unpublished works don’t require a copy to be stored somewhere that could be hacked?

Most of the problems you cite basically fall into one of these groups:

Entirely speculative.
Acceptable to most people.
Grossly exaggerated.
No different between the current system and the proposal.

Here’s the thing: charging a fee and requiring registration should keep things to manageable levels. If it’s as high as you say, then I have no problem with the government having to use such a system. Google could do it, so it’s not impossible for the government.

Anonymous Coward

December 5, 2023 at 4:54 pm

Re: Re: Re:²

And therein lies the issue. Content creators and rightsholders would absolutely not agree to a system where they have to submit a registration for everything they create, or everything they roughly sketch/draft that never sees the light of day.

Yet they have no issues with people lining up around the city block to ask them for permission because one drum riff or one chord progression may actually be infringing if you squint your eyes and perk your ears on a blue moon. It’s entirely impractical, and they know this, but they demand it because they’re not the ones being held responsible when something inevitably fucks up. Hell, even major copyright holders can’t stop themselves from trying to DMCA their own websites off the Internet.

The entire system is basically run by a bunch of Tero Pulkinnen-level simpletons. It’s a system that is impossible to implement fairly and judiciously, but they demand it from everyone when they can’t even keep their own house in order. It’s hypocritical.

Anonymous Coward

December 4, 2023 at 12:45 pm

Re: Your pleas re "authors being ripped off" are unavailing.

“The primary objective of copyright is not to reward labor of authors, but to promote the Progress of Science and useful Arts.”

— Sandra Day O’Connor, Feist Publications

If you wish it otherwise, the constitutional amendment process is on your left.

Nathan F (profile)

December 4, 2023 at 12:11 pm

I suspect it will be even worse in the end. If the AI creators have to go to the gatekeepers, what is the likelihood of them being able to get the kind of data they want to train their AI vs some kind of prepackaged low-quality data? How vibrant of a marketplace will there be if all the training sets are the same?

Anonymous Coward

December 4, 2023 at 12:50 pm

Great article, absolutely true. And if they establish precedent against AI training, who knows how many other artistic pursuits will be foreclosed? Are we moving toward style being copyrightable? That would be a nightmare.

Nihiltres (profile)

December 5, 2023 at 11:21 am

Re:

Opponents of generative AI are already arguing for that. In the recently amended complaint of Andersen et al. v. Stability AI et al. the plaintiffs argue that their artistic styles represent “trade dress” and are therefore protectable as informal trademarks or something roughly to that effect.

Ninjasaid (profile)

December 5, 2023 at 11:45 am

Re: Re: reply to Nihiltres comment

“Opponents of generative AI are already arguing for that. In the recently amended complaint of Andersen et al. v. Stability AI et al. the plaintiffs argue that their artistic styles represent “trade dress” and are therefore protectable as informal trademarks or something roughly to that effect.”

https://www.comicmix.com/wp-content/uploads/2017/12/51-Order-on-Second-MTD.pdf

Didn’t a judge rule that styles cannot be trademarked?

Ninjasaid (profile)

December 5, 2023 at 11:49 am

Re: Re: Re: second reply to Nihiltres comment

Multiple courts have ruled that style cannot be trademarked or copyrighted.

Nihiltres (profile)

December 5, 2023 at 12:26 pm

Re: Re: Re:

I’m just saying that that’s what they’re currently arguing. See the amended complaint, for example at pp. 71–75 (it may be fastest to simply search the text for instances of “trade dress”).

I’m not a lawyer, but it seems extremely obvious that it’s an attempt to rope style into IP law.

That One Guy (profile)

December 4, 2023 at 3:50 pm

Any time I hear about how AI needs to ‘pay’ for the content that it’s learning from one of my first thoughts regarding the authors pushing/supporting that argument is ‘Great, now about how much did you pay all the authors you learned from in order to write your stuff?’

If learning is infringement that needs paying for, then not only is culture screwed, but there are a lot of currently hypocritical authors who need to start signing a lot of checks to put their money where their mouth is.

Anonymous Coward

December 4, 2023 at 4:08 pm

Re:

I strongly suspect this licensing is being pushed by the traditional publishers, labels and studios, as more widespread use of AI will increase the competition that they face for eyeballs and ears to consume the works they control.

K Smith (profile)

December 4, 2023 at 5:59 pm

There is a term ...

There is a term that describes the people who think that further empowering copyright holders will somehow magically benefit creators: USEFUL IDIOTS.

As Mike wrote, the past attempts to further empower copyright holders have almost entirely benefited the big companies, with very little benefit going to the creators. There is no reason to doubt that any future attempts will do the same.

-skh

Anonymous Coward

December 4, 2023 at 9:16 pm

Re:

There is a term that describes the people who think that further empowering copyright holders will somehow magically benefit creators: USEFUL IDIOTS.

Oh, there’s another group of people who believe the above. People who want to chip away at basic protections like “innocent until proven guilty”, and standards of evidence that have to be brought in front of a judge before they can grant an all-encompassing subpoena for information.

There’s been one guy flitting around Techdirt who’s claimed that Section 230 must die to make sure that celebrities can sue everyone and anyone who might have besmirched them in an Internet comment. To him, Section 230 is an obstacle to his goals of unfettered mass litigation.

That One Guy (profile)

December 5, 2023 at 2:22 am

Re: Mind blown

As Mike wrote, the past attempts to further empower copyright holders have almost entirely benefited the big companies, with very little benefit going to the creators. There is no reason to doubt that any future attempts will do the same.

Wait wait wait, do you mean to suggest that applying the logic behind trickle-down economics to copyright might be a bad idea?

Anonymous Coward

December 5, 2023 at 4:02 am

Except...

I think they know this, but they want to also aim at big tech.

They know smaller companies will get screwed, but that means they can get the bigger companies easier.

Even though they know big tech will survive, as Leif K-Brooks says:
while some of them are much larger companies with much greater resources, they all have their breaking point somewhere. I worry that, unless the tide turns soon, the Internet I fell in love with may cease to exist, and in its place, we will have something closer to a souped-up version of TV – focused largely on passive consumption, with much less opportunity for active participation and genuine human connection.

Politicians and people who want to stick it to big tech know this (at least I would assume), so don’t get surprised about this.

Anonymous Coward

December 6, 2023 at 5:34 am

Re:

Good luck, they already got bamboozled by Microsoft.

Guess who Sam Altman actually works for?

LostInLoDOS (profile)

December 6, 2023 at 6:36 pm

Question Mike, honest

Wouldn’t the solution be to use library content? Only.

Anonymous Coward

December 10, 2023 at 5:20 pm

Re:

I don’t think that would solve all the issues – what would be the proof that all materials used came from the library? Neither would it convince copyright holders that existing models were all trained with library material, even if they were legally owned.

And that’s not even going into what copyright holders think of libraries. There’s a non-zero number of them who would absolutely love to go after free public access to books.

LostInLoDOS (profile)

December 11, 2023 at 1:59 pm

Re: Re:

Though true, the courts have always stood with 1:1 access being legal.

Using a legal library account definitely creates a gigantic shield. It may not stop a lawsuit, but likely leads to the AI company prevailing.
There’s something to be said about going to war with the right equipment and all.

@nougatmachine@mastodon.social

December 6, 2023 at 9:46 pm

The comparison to Spotify and music labels here is asinine, because to work at all, generative AI models need more than the equivalent of musicians who’ve signed deals with publishers.

Generative AI’s strength is derived from the totality of its training data scraped across the entire internet, from both extremes of both axes of the graph: profitable to amateur, high-quality to low-quality.

Famous, successful creatives may be some of the loudest voices protesting generative works, but ChatGPT’s strength does not come from slurping up Stephen King. Its strength comes from slurping up every blog comment, every Amazon review, every SEO clickbait piece of absolute garbage, every brilliant Substack newsletter tragically under-read.

It’s conceivable that the most-valuable, most high-profile content in the world could get thrown together into some sort of record-label-esque gatekeeper consortium that requires exorbitant rates to access. But that’s not the point. Perhaps I’m ascribing too much of my own thoughts to other people’s motivations, but to me the beauty of a regime that requires copyright permission for training generative AI is the sheer impossibility of tracking down every author of unremarkable, low-quality, amateur content effectively makes any commercial application of generative AI impossible. This is a feature, not a bug: you can’t generate low-quality thoughts created by putting a bunch of words in a blender and hitting puree if it’s impossible to figure out where to buy the ingredients for the smoothie.

From the perspective of someone like me, who views generative AI as a vehicle for enshittification and making the signal-to-noise ratio of information on the web even worse than it already is, nothing could be more delicious.


If Creators Suing AI Companies Over Copyright Win, It Will Further Entrench Big Tech

from the be-careful-what-you-wish-for dept
