Copyright Licensing For AI Training Won’t Make Anything “Fair”
from the the-copyright-ratchet dept
A few weeks ago, the UK’s regional and national daily news titles ran similar front covers, exhorting the government there to “Make it Fair.” The campaign Web site explained:
Tech companies use creative content, such as news articles, books, music, film, photography, visual art, and all kinds of creative work, to train their generative AI models.
Publishers and creators say that doing this without proper controls, transparency or fair payment is unfair and threatens their livelihoods.
Under new UK proposals, creators will be able to opt out of their works being used for training purposes, but the current campaign wants more than that:
Creators argue this [opt-out] puts the burden on them to police their work and that tech companies should pay for using their content.
The campaign Web site then uses a familiar trope:
Tech giants should not profit from stolen content, or use it for free.
But the material is not stolen; it is simply analyzed as part of the AI training. Analyzing texts or images is about knowledge acquisition, not copyright infringement. Once again, the copyright industries are trying to place a (further) tax on knowledge. Moreover, levying that tax is completely impractical. Since there is no way to determine which works were used during training to produce any given output, the payments would have to be made according to each work’s contribution to the training material that went into creating the generative AI system itself. A Walled Culture post back in October 2023 noted that the amounts would be extremely small, because of the sheer quantity of training data that is used. Any monies collected from AI companies would therefore have to be handed over in aggregate, either to yet another inefficient collection society, or to the corporate intermediaries. For this reason, there is no chance that creators would benefit significantly from any AI tax.
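The point about tiny per-work payments can be made concrete with a rough back-of-the-envelope calculation. All figures below are hypothetical, chosen only to show the order of magnitude:

```python
# Back-of-the-envelope sketch (all figures hypothetical): even a large
# licensing pool shrinks to almost nothing when split across the billions
# of documents in a typical generative AI training corpus.
licensing_pool_usd = 1_000_000_000   # hypothetical $1bn annual AI licensing fund
training_documents = 3_000_000_000   # hypothetical corpus size (order of magnitude)

per_work_payment = licensing_pool_usd / training_documents
print(f"${per_work_payment:.4f} per work per year")  # roughly $0.33
```

Even tripling the hypothetical pool, or shrinking the corpus by an order of magnitude, leaves individual creators with pocket change before collection-society overheads are deducted.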
We’ve been here before. Five years ago, I wrote a post about the EU Copyright Directive’s plans for an ancillary copyright, also known as the snippet or link tax. One of the key arguments by the newspaper publishers was that this new tax was needed so that journalists were compensated when their writing appeared in search results and elsewhere. As I showed back then, the amounts involved would be negligible. In fact, few EU countries have even bothered to implement the provision on allocating a share to journalists, underlining how pointless it all was. At the time, the European Commission insisted on behalf of its publishing friends that ancillary copyright was absolutely necessary because:
The organisational and financial contribution of publishers in producing press publications needs to be recognised and further encouraged to ensure the sustainability of the publishing industry.
Now, on the new Make it Fair Web site we find a similar claim about sustainability:
We’re calling on the government to ensure creatives are rewarded properly so as to ensure a sustainable future for AI and the creative industries.
As with the snippet tax, an AI tax is not going to do that, since the sums involved are so small. A post from the News Media Association reveals the real issue here:
The UK’s creative industries have today launched a bold campaign to highlight how their content is at risk of being given away for free to AI firms as the government proposes weakening copyright law.
Walled Culture has noted many times that it is a matter of dogma for the industries involved that copyright must only ever get stronger, as if driven by a copyright ratchet. The fear is evidently that once it has been “weakened” in some way, a precedent would be set, and other changes might be made to give more rights to ordinary people (perish the thought) rather than to companies. It’s worth pointing out that the copyright world is deploying its usual sleight of hand here, writing:
The government must stand with the creative industries that make Britain great and enforce our copyright laws to allow creatives to assert their rights in the age of AI.
A fair deal for artists and writers isn’t just about making things right, it is essential for the future of creativity and AI.
Who could be against this call for the UK government to defend the poor artists and writers? No one, surely? But the way to do that, according to Make it Fair, is to “stand with the creative industries”. In other words, give the big copyright companies more power to act as gatekeepers, on the assumption that their interests are perfectly aligned with those of the struggling creators.
They are not. As Walled Culture the book explores in some detail (free digital versions available), the vast majority of those “artists and writers” invoked by the “Make it Fair” campaign are unable to make a decent living from their work under copyright. Meanwhile, huge global corporations enjoy fat profits as a result of that same creativity, but give very little back to the people who did all the work.
There are serious problems with the new AI offerings, and big tech companies definitely need to be reined in for many things, but not for their basic analysis of text and images. If publishers really want to “Make it Fair,” they should start by rewarding their own authors fairly, with more than the current pittance. And if they won’t do that, as seems likely given their history of exploitation, creators should explore some of the ways they can make a decent living without them. Notably, many of these have no need for a copyright system that is the epitome of unfairness, which is precisely why publishers are so desperate to defend it in this latest coordinated campaign.
Follow me @glynmoody on Mastodon and on Bluesky. Originally posted to WalledCulture.


Comments on “Copyright Licensing For AI Training Won’t Make Anything “Fair””
So, you’re going with “copyright isn’t perfect, therefore creatives deserve to have their entire collective works siphoned up and used to create derivative works ad nauseam”? That’s certainly a take.
If the payment is going to be a pittance anyway, why not opt out of having your work coopted to make your entire job largely obsolete?
I’m far from a defender of the copyright status quo, but that doesn’t mean allowing billion dollar companies to just ignore copyright altogether is the answer here. It’s kind of wild to literally see the argument that “creatives are already getting screwed by one group of companies, so why not a second?” in writing.
Re:
If that were what was said, sure.
What he actually said was:
“Copyright isn’t perfect and we shouldn’t strive to apply the same imperfection to this new technological reality. Instead, we should try to implement solutions that would actually benefit the creatives, not just the major companies who own the copyrights, and simultaneously make it possible to train LLMs”.
Re: Re:
And just what would those solutions be? You can’t replace copyright with vibes. I get that the majority of the readership on this website is very anti-copyright because of the ways it’s been abused by various actors (the estate of Marvin Gaye comes to mind). But it’s like democracy: the best of a bunch of bad systems.
Even just basic piracy destroyed the economics of recorded and live music (speaking as someone with many professional musician friends and as an economist). Small artists used to make money off of music sales and now make nothing. If piracy weren’t available as an outside option to listeners, musicians and record labels would have more bargaining power with streaming services. We as consumers have gotten used to things being free which never should have been free, and as a result, many many talented musicians forgo careers entirely.
Re: Re: Re:
People like Glyn Moody think that copyright can be replaced by “True Fans”, as he puts it. Essentially, he believes that an impossible number of artists can somehow survive off of Kickstarters, Patreons, and live concerts and stuff that “True Fans” will show up for. It’s a load of bull.
Re: Re: Re:
Hell if I know, but a bad solution isn’t inherently better than no solution.
How? Study after study shows that piracy doesn’t equal a lost sale, that piracy creates opportunities for those artists willing to embrace it, and that pirates spend way more money on culture than non-pirates.
Small artists make no money because streaming pays next to nothing, because it’s harder to be discovered when you’re side by side with thousands and thousands of other artists on a streaming platform, and/or the labels take most of the money anyway(if a small artist has a label).
If piracy were eliminated tomorrow, small artists would arguably make even less than they do now.
Musicians are literally the ones supplying the product that allows the streaming services to make money. They have ALL the bargaining power, especially in the internet age. There are dozens of platforms they could sell their music on.
Re: Re: Re:2
“study after study” source?
Streaming pays nothing because piracy pays nothing. Record companies have no leverage. And yes, I am very aware of Spotify’s efforts at hiring people to produce royalty-free music for major genre playlists. If piracy were not an option, streaming would be a hell of a lot more than $10 or so per month, and musicians would get a cut. In the 2000s it wasn’t just tech nerds pirating music; all kinds of regular young people were using tools like LimeWire. They would go back to it if Deezer were $50 a month.
Multiple online stores do not yield bargaining power if one of them just takes it for free.
Recorded music revenue globally is lower than in 1999 (https://www.ifpi.org/wp-content/uploads/2024/03/GMR2025_SOTI.pdf; adjusting for inflation, the $29 billion of 2024 is about $15 billion in 1999 dollars, despite dramatic growth in the world economy).
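The inflation adjustment in the comment above works out as a single division. The cumulative CPI multiplier used here (~1.9x for US inflation, 1999 to 2024) is an assumption; the exact figure depends on which price index you pick:

```python
# Sketch of the inflation adjustment above. The CPI multiplier is an
# assumed value (~1.9x cumulative US inflation, 1999-2024); the precise
# figure varies with the chosen price index.
revenue_2024_nominal = 29e9    # global recorded-music revenue, 2024 (IFPI)
cpi_factor_1999_to_2024 = 1.9  # assumed cumulative inflation multiplier

revenue_2024_in_1999_dollars = revenue_2024_nominal / cpi_factor_1999_to_2024
print(f"{revenue_2024_in_1999_dollars / 1e9:.1f} billion 1999 dollars")  # ~15.3
```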
Re: Re: Re:3
There have been multiple studies showing that those who pirate usually spend more money on media than those who don’t. Vice had an article a bunch of years ago that went through several studies showing this, here’s the article: Study Again Shows ‘Pirates’ Tend to Be The Biggest Buyers of Legal Content
Have you ever wondered how streaming sites pay the artists? Artists usually don’t get paid per stream; it’s more convoluted than that, because most streaming sites use a pro-rata model that’s tilted in favor of big artists. Since smaller artists have fewer streams than the big ones, they get next to nothing when the revenue is divvied up.
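The pro-rata mechanism mentioned above can be sketched in a few lines (all figures hypothetical): the whole revenue pool is split by share of total streams, so each artist receives their global fraction regardless of how dedicated their own listeners are.

```python
# Minimal sketch of a pro-rata streaming payout (all figures hypothetical).
# The entire revenue pool is divided by share of *total* streams, so a
# small artist with devoted listeners still only gets their tiny global cut.
revenue_pool = 1_000_000  # monthly payout pool in dollars

streams = {
    "megastar": 9_500_000,
    "mid_tier": 450_000,
    "small_artist": 50_000,
}

total_streams = sum(streams.values())
payouts = {artist: revenue_pool * n / total_streams for artist, n in streams.items()}

for artist, amount in payouts.items():
    print(f"{artist}: ${amount:,.2f}")
# small_artist ends up with $5,000.00 of the $1,000,000 pool (0.5%)
```

A user-centric model, by contrast, would divide each subscriber’s fee only among the artists that subscriber actually played, which is why small artists tend to favor it.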
Who do you think earns the most money from streaming aside from the platform that usually takes a 30% cut of all revenue? It’s the record companies, as long as they are making money they’ll keep their thumb on the artists while depriving them of their rights. Have you ever wondered why Taylor Swift started re-recording her earlier albums?
Yes, because people changed what they spent money on, and the statistics are quite clear: in 2000 people spent ~20 billion on video content and ~23 billion on music; last year people spent over 900 billion dollars on media content, including video streaming and other content like games, which ate the music industry’s revenue for the simple reason that people only have a finite amount of disposable income, and most of it these days ends up in the pockets of Amazon, Netflix, etc.
Another thing you seem to be very blind to, which is extremely relevant to what kind of media people consume: people don’t value music as they once did. These days music has been commoditized, which lowers its monetary value. Before digital media, people placed a higher value on music for the simple reason that physical media was a scarce resource in comparison.
Look at the price of any other product that started out scarce and then finally became commoditized: it went from very expensive to cheap. This is basic economics.
Re: Re: Re:4
That’s very different from saying piracy is beneficial, though. There’s a very obvious sampling issue, where the people most likely to pirate could also happen to be the people who most care about music. (There are other potential confounding factors as well, like how easy/available it is. It’s not the Napster days anymore.)
I’m pretty pro-piracy, but the idea that piracy is a good thing for sales overall is a really hard sell. It almost certainly has some negative effect (certainly not 1:1, but not 0 either). A good example would be something like Netflix’s crackdown on shared passwords: when that happened, subscriptions went up.
And there’s a very obvious way to check that whole hypothesis: if free distribution were that beneficial, people could choose to freely distribute their music. Almost no one does it intentionally to great commercial success. You don’t have to copyright your stuff. There are probably some niche exceptions.
Re: Re: Re:5
That’s kind of the point. He’s suggesting that pirates only care about getting stuff for free, but as my comment below lays out, there is ample evidence that pirates are some of the best friends that an artist can have.
Again, see my comment below.
Techdirt covered this, and in a nutshell, it was “post hoc, ergo propter hoc”.
That’s a strawman. Nobody has claimed that freely distributing one’s music will propel you into massive mainstream fame and income; just that it doesn’t hurt income the way many people think it does.
Re: Re: Re:5
Nobody used the word beneficial here except you, what is being said is that piracy isn’t all bad because it has some upsides. In some outlying examples it has been beneficial to an artist but such occurrences are very rare and can be discounted.
Your hypothesis is built on a mischaracterization of the point being made since nobody claimed piracy was 100% beneficial, the point was that pirates spend more money on average than any other consumer of music. And the other problem with your hypothesis is that if everything is freely distributed there is no need to pay for anything anyway. The caveat to that is that the artist can go with a freemium model to entice people to pay for extras.
Re: Re: Re:5
People do. Check out Leah Abram’s account on BandCamp if you don’t believe me.
Re: Re: Re:5
If a pirate streams a track from which an artist earns next to nothing without paying for it, and then spends nearly $500 on front row seats, earning the artist a significant amount of the take, how is that not beneficial?
Re: Re: Re:6
Whether it’s beneficial or not depends on whether they would still buy the seats if they couldn’t pirate. If that is true, it can be beneficial. But there isn’t strong evidence that a lot of sales are being converted because of piracy
Re: Re: Re:7
Sure there is:
Piracy Boosts Concert Ticket Sales, Hurts Box Office Revenue
File-Sharers More Likely to Pay for Movies, Books, Games and Concerts
Re: Re: Re:3
Online Music Piracy Doesn’t Hurt Sales, European Commission Finds
EUIPO Study: 60% of Pirates Also Buy Content From Legal Sources
Pirates Spend Much More Money on Music, Study Shows
Piracy Is Driven By Availability & Price, People Prefer Not to Break the Law, ISP Study Says
BitTorrent Piracy Boosts Music Sales, Study Finds
And for the Spotify and Youtube eras specifically:
Free Spotify and YouTube Users Are Now a Bigger Challenge Than Music Pirates
Except piracy can turn into income, as some of the studies above show.
Record labels could easily say to Spotify “We aren’t making the amount of money we’d like from your platform, so we’re finding another/making our own/etc.”
You can bet your balls that Spotify would give them whatever they wanted if they did that. So yes, they have all the leverage.
And there’d be even fewer subscribers if that were the case.
Also, Spotify posted a profit for the first time ever in 2024. They can’t just increase the payout the way you want them to.
In conclusion, piracy is not the massive problem you think it is, and is responsible for many artists actually growing their fanbases.
Re:
Contrary to what IP lobbyists would like everyone to believe, copyright is not an unlimited license to control every aspect of how creative content exists and is consumed or used. It’s not even a license to prevent people from benefiting from your work in the abstract.
Protecting Creatives
Isn’t the simplest method just to enforce Opt-In? No taxes to be disbursed. No issues of “is this or not fair use?” The creatives can rest assured that their works will not be a part of the training and therefore will not contribute to the output. If their work is original, then it will stay original, and they can be hired to create more of that quality.
More generally, why should any company that makes money via aggregation be entitled to use contributed content for more than the purpose of the aggregation platform?
Re:
You could argue that putting your work into the public domain could be seen as an ‘opt-in’ to let AI companies (or anyone else for that matter) use the work as they see fit.
Re: Re:
And if the AI companies were limiting themselves to using actual public domain works, that might mean something. But they’re using works that are covered by copyright. That many such works can be legally accessed by the general public without paying for the privilege doesn’t make those works “public domain”. Just because you (or an AI scraper) can see a picture on ArtStation without going through a paywall doesn’t mean the picture isn’t covered by copyright.
Re: Re: Re:
Completely agree.
Just saying that the ‘opt-in’ system the OP was hinting at actually already exists; just nobody is respecting it…
Re: Re: Re:
Whether AI looks at copyrighted works is irrelevant though. It’s not reproducing the work verbatim any more than your eyeballs reproduce the image of a work on your retina.
Re: Re: Re:2
I just asked GPT-4o to reproduce the opening paragraph of H.P. Lovecraft’s “The Call of Cthulhu” – a piece of prose I know well enough to spot-check from memory – and it did so flawlessly.
Re: Re: Re:3
If you ask GPT-4 to reproduce a specific work, it will search the web until it finds it, then copy and paste that into itself. If, on the other hand, you ask me to reproduce a specific work, I will search the web until I find it, then copy and paste that into this comment box, like so:
TL;DR: The only difference between AI reproduction and human reproduction is the location of the copy.
Re: Re: Re:2
It does reproduce very substantial portions of work verbatim, though. To such an extent that before the NYT lawsuit it was easy to find dozens of pages on Reddit showing how to use ChatGPT to get around NYT paywalls.
Mathematically, one interpretation of what LLMs are doing is acting as a stochastic lossy compression of the training data.
Re: Re:
Putting one’s work into the Public Domain requires the use of a specific type of Creative Commons license, or perhaps you meant publishing, which is a very different beast.
Re:
Being realistic, the simplest solution is probably to embed adversarial noise attacks in any media for which it is applicable (sorry writers, no good options for you yet). Like tree-spiking, but for data.
No, it is being stolen. If your regurgitation engines can’t work without swiping from humans maybe you shouldn’t be cheerleading for them.
Re:
So reading is theft. Stallman called it.
Re: Re:
You can’t read or regurgitate hundreds of thousands of books, and besides which you are a human. Humans are the point and deserve special privileges in accordance with that fact. Just because I can temporarily occupy a sidewalk doesn’t mean I can park a piece of industrial machinery there (which is more or less what an LLM is).
Re: Re:
Machines don’t read.
Re: Re: Re:
LLMs do, effectively. There’s a reason it’s called a “Large Learning Model”, and not a “Large Database Model”.
Re: Re: Re:2
LLM stands for “Large Language Model”. There’s no “learning” word there. You have confused it with ML that stands for “Machine Learning”. Get your facts clear first.
Re: Re: Re:3
That’s true, I got them mixed up, but the point stands: LLMs effectively read. They don’t just store whatever they’re trained on in a database ready to grab.
Re: Re: Re:4
The truth is they did store something in a database, which is what is being alleged as copyright infringement. You are an idiot and have no idea how computers work.
Loading software programs and data consists of copying, and could be ruled infringement had Congress not amended the Copyright Act with Section 117.
Re: Re: Re:5
You’re conflating the LLM with the dataset it is trained on. The LLM itself doesn’t store any copyrighted material.
And fair game if you want to punish the companies for illegally collecting copyrighted material, but there are already laws for that. New ones specifically for AI aren’t needed.
Re: Re: Re:5
You have to be more specific than using the word “something”, because anyone is allowed to take a copyrighted work if they have legal access to it and then break it down into different types of data like for example the most used words in a novel. What they can’t do is break it down into data that can be restored to a copy of the original work unless it’s necessary for that process to occur in normal usage of the work – like scaling it for viewing in a browser etc.
Copyright infringement is specifically about copying/reproducing works or producing derivative works, and there are many aspects of a work that aren’t copyrightable. Image AIs don’t store copies; they store categorized information about shapes, textures, colors and patterns after a work is run through a convolutional neural network, for example.
With that said, downloading a copyrighted work and storing it without a valid license is a copyright infringement; analyzing it with an image AI afterwards most likely isn’t, but it can be seen as the motivating factor for the initial infringement, which can lead to it being classified as willful infringement. Intent plays a big part in how an infringement case plays out.
Finally, when training an AI model there’s no database – there’s a dataset, they both store structured data but one is not like the other. If you are going to call someone names for using the wrong word – don’t be wrong.
Re: Re: Re:6
Say that to the Kadrey v. Meta case, in which Meta doesn’t.
The latter is known as regurgitation in generative AI, where AI companies try to argue it’s a bug, not an intended feature. But what if regurgitation is an inevitable feature that AI companies can’t totally avoid, due to how neural-weight models work?
Re: Re: Re:6
This theory of “categorized information” being uncopyrightable is one specific thing that is denied by the US Copyright Office. The USCO said that a work’s expression could inevitably end up in the neural weights model when you train it.
AI advocates like to argue that the neural weights are never data, which is nonsense.
Re: Re: Re:7
It’s not actually denied – it’s just limited to models that “has retained or memorized substantial protectable expression from the work(s) at issue” and “copying the resulting weights will only infringe where there is substantial similarity”.
Neural weights are numbers so of course they are data stored in a dataset but a dataset isn’t the same as a database so I don’t understand where your argument comes from because it has nothing to do with what I said.
Re: Re: Re:8
Any digitised data, including a photo of your face, is a bunch of numbers, and it is in those numbers that creative expressions are found in digital art pieces.
There are no limits on the mediums to which copyright may be applied. That means the neural weights may be a “tangible medium” for copyright protection, just like other digital mediums. That also means “copying” a work into a medium of neural weights can constitute copyright infringement.
Re: Re: Re:4
They don’t read. You just want to keep anthropomorphizing these machines and giving them more and more human-like legal protections until you’re legally allowed to marry one.
Re: Re: Re:2
“Effectively”, huh.
Machines aren’t people. We can use human terms to describe what we’ve made them do so that we can talk about it, but what is going on is not the same. They don’t read, they don’t write, they don’t think, they aren’t intelligent, they don’t learn, and they aren’t people.
And as soon as someone comes out with “it’s just like when you”, they’ve conceded the argument, because it isn’t “just like when I”, because I am not a machine. And the law is capable of drawing that distinction.
And the idea that because you stole from so many people you can’t even identify all of them you should be immune to consequence because it’d be too hard for you is odious.
I know TechDirt readers have a tendency to embrace any argument whatsoever that they think weakens copyright, but maybe rethink the one being pushed by a pack of thieves and liars.
Re:
No, it’s not. For something to be stolen means that some specific item is no longer available to anybody else. That’s not the case here.
Re: Re:
Nah, it’s still being stolen.
Re: Re: Re:
Source: “trust me bro”.
Re: Re: Re:2
What’s your thought on wage theft?
Re: Re: Re:3
Sounds like you’re trying to trap me into saying that LLMs training on copyrighted material steals income from artists, and therefore the act of training LLMs is theft.
Re: Re: Re:4
I’m more curious as to whether you think wage theft is theft and your reasoning for that.
Re: Re: Re:5
Do you need a basic primer in what legal obligations an employer has towards their employees? Wikipedia has a great article on wage theft (https://en.wikipedia.org/wiki/Wage_theft); you should go and read it.
If your work is unavailable except for purchase, you (or your publisher or distributor) will get paid. Problem solved.
Re:
And what if Meta torrents it?
I’m constantly astonished by how corrupt the UK press is. Just shameless.
Re: Uk press?
You posted that from the country with Fox “news”?
You’ve got chutzpah, or are incredibly ignorant….
That kind of bullshit “headline” is a global problem.
Re:
Dude, P-R-E-S-S is not the correct spelling of “greedy companies”. Just an FYI.
When I read this part:
My instinctive answer was “They should buy a copy of the work they want to use in training, just like I would have to buy a copy of the book I’m using to ‘train’ my own brain”.
Now, I can agree that I’m paying way too much for any knowledge books I need to train for, let’s say, computer programming and I too would rather have free access to these works.
Where do you see a difference in scanning a copy of a work to train an AI and me reading said book to train my own brain?
I think Techdirt (and Glyn) are more realists than AI apologists. “Strengthening copyright is not the way to save creatives from AI” is a good point, and Glyn (and Techdirt generally) are pretty consistent on this.
Copyright is an important right, but it’s gotten so far away from its main goals of protecting the livelihood of creators and “promoting the useful arts” that strengthening it any more is always a bad idea. Any added benefits will only go to the rentseeking oligopolies that control film, music, book publishing, entertainment media, etc.
I think part of the problem is that LLM crawlers and Gen AI companies are largely ignoring copyright by presuming crawling for LLMs is fair use/fair dealing and they’ll just continue to ignore a single national law if it’s too broad (Unless the law is US federal law or California law, which they will fight tooth and nail). Any laws need to be workable, and I think the UK proposals (with an opt-out) are aimed at being workable.
What would make it fair is me choosing not to consume the content, whether it’s made by rentseekers trying to demand money for content I’ve already paid for or made by corporations cynically remixing every genre under the sun.
Somehow I don’t think the authors will be happy with that, but it’s always been “if you don’t like our terms you can do without” until it starts happening to them.
This comment has been flagged by the community.
Glyn “We should listen to true innovators like Elon Musk and Jack Dorsey” Moody.
Re:
Please provide a source.
Re: Re:
Here you go.
Re: Re: Re:
Oh, so not at all an accurate characterisation. Thank you.
Artists want control of their work so that it’s not shoved into machine models where their art becomes part of a digital corpus of info that can (and will) be used to displace themselves and other artists looking for work, or used to spit out garbage like this.
People don’t want their work taken without their permission and used to create tools and systems that screw over or hurt themselves and other people.
If these companies that get gorillions of dollars in investor money can’t make their machines work without getting permission from artists & authors & more or paying those creators for the stuff the machines scrape? That says a whole lot more about the actual greed of the technology companies and the tech people slavering over the idea of innovation above all else.
Re:
You know, the world’s first computer put thousands of brocade weavers out of work, but I don’t see you bitching and moaning about the Jacquard Loom.
Re: Re:
It will never stop being hilarious how people like you default to “Buggy whip makers! Jacquard Looms!” and shit.
Why are y’all so gung-ho on automating and replacing human creativity and making it stupidly easy for anybody to produce slop they can use to sow chaos and harm?
Re: Re: Re:
It will never stop being hilarious how people like you default to “technological progress will replace human creativity because reason xyz”. Hasn’t happened yet and likely won’t happen for the foreseeable future. Your actual gripe is about money, not AI “replacing” creativity, unless you think money is the main motivator that drives human creativity.
Also, the whole Shrimp Jesus thing also illustrates something you were unable to grasp – human creativity allows us to use new technologies in creative ways and some of those uses aren’t for good, like mass producing AI-images to scam people.
When technological shifts happen, either you adapt or become some bitter dude constantly bitching about it.
Re: Re: Re:2
Of course I know people will continue to create stuff, dumbass. What I’m pointing out is this: A lot of artists ply their trade in order to afford to eat, pay rent, y’know, survive. Y’all seem to be practically drooling at the prospect of automating away those jobs. What do they “adapt” to when their work dries up because of these machines, or they get laid off and replaced by cheap contract labor that just touches up machine-generated outputs? Learn to code? Oh wait, that’s being replaced by machines too.
Re: Re: Re:3
The same thing happened during the industrial revolution. People seemed to come out of that okay.
Re: Re: Re:4
A whole lot of people did not come out of the Industrial Revolution “Okay”. It was also a period of rampant unchecked greed and exploitation.
Back to my question: What do you and Rocky want people displaced by these machine models to actually fucking do? Cause it really feels like you two specifically don’t care how much harm AI causes as long as you get your boners from seeing “innovation” happen.
Re: Re: Re:5
Why do you want us to come up with replacement jobs just because we (or at least I) are cautiously optimistic about the technology? I have endless sympathy for someone losing their job, but this implication that technological innovation should be artificially hamstrung because it’ll impact people is absolutely nuts.
What did people do when the telegraph replaced messengers? What did people do when the telephone replaced the telegraph? What did people do when factory automation replaced workers? What did people do when modern telephone infrastructure replaced switchboard workers?
Re: Re: Re:6
It is not innovation when it steals creative labor. Big Tech can still innovate while paying the content creators whose work goes into AI, because why not?
If the Big Tech want to avoid copyright suit, do it like the way of Adobe, whose AI is only trained with public domain materials!
Don’t try to excuse AI stealing. It is categorically NOT FAIR, and it exploit workers by not compensating though their works.
Re: Re: Re:7
Sure it is. There’s no fairness requirement to innovation that I’ve ever heard of, so if you’re trying to say that sacrificing creative labor for innovation is abhorrent, I understand.
Since you’re fond of demanding solutions from people with certain viewpoints, let me ask you this: what would be an actually functional way to create a licensing system for AI-training that wouldn’t turn into an economic black hole for companies who want to get into the industry?
I can’t find any record of them saying that. Only that they won’t train their AI on users’ content, though it seems to be a case of opting out. Do you have a link to that statement?
It’s not stealing, it’s copyright infringement, and I have no problem with punishing copyright infringement, but laws already exist for that.
Re: Re: Re:8
There is a “fair” word in the term “fair use”.
True. That’s my stance.
While I’m not obliged to provide a solution for such licensing, I do have an idea that the Techdirt site staff might be interested in.
In fact, I think this partly fits the vision of Techdirt, as they advocate new business models in the age of the internet.
The idea is: an artist payment and funding website like Patreon, where artists can set up payment accounts and offer licensing schemes to AI companies. And AI companies would pay them automatically when developing AI with their data. Such a platform would solve the problem of small artists having no way to offer AI companies their data in exchange for money. This could also disincentivise AI companies from going to pirate sites for copyrighted materials.
It was a quite old statement, and might not reflect the current state of Adobe’s AI:
https://www.fastcompany.com/90906560/adobe-feels-so-confident-its-firefly-generative-ai-wont-breach-copyright-itll-cover-your-legal-bills
https://www.wired.com/story/adobe-says-it-wont-train-ai-using-artists-work-creatives-arent-convinced/
Remember, the temptation for companies to break the rules is high, but there is no copyright lawsuit against Adobe for now, AFAICT.
Do you know why artists equate copyright infringement with stealing? You seem to have no idea.
It is because when AIs are trained on their works, and can easily produce works in the same style as theirs (potentially incorporating the copyrighted elements of their works), artists’ jobs are threatened with replacement. The “stealing” part refers to job stealing, not just work stealing.
Re: Re: Re:9
The Fair Use doctrine is not what causes innovation. It’s a legal carve-out intended to avoid stifling it. But innovation has nothing to do with fairness in and of itself.
I asked for a solution that wouldn’t turn into an economic black hole. With the amount of data that’s required to train and fine-tune a generative model, this wouldn’t be fiscally realistic for startups, and it would effectively require any company wanting to train a model to find and “subscribe” to thousands of artists.
That’s quite impressive, assuming that Adobe is telling the truth.
Then make that distinction instead of continually saying that the AI “steals” the content.
Re: Re: Re:10
Fair use would be a bad solution for lowering that barrier for AI startups.
Instead, do this: allow AI startups to rent a model pre-trained on licensed materials at a discounted price. Prevent the big AI companies from asserting a monopoly (or oligopoly) on those pre-trained models. This is how small telecoms can thrive when the telephone network is dominated by big ones.
Re: Re: Re:7
An “it” can’t steal anything, people steal things. If you can’t separate technology from what people do with it you should immediately stop what you are doing and start thinking hard about why that is.
Name just one artist who hasn’t been influenced by, or incorporated, styles, details and subject matter from other artists. If you can’t, then every artist has stolen content per your own definition above.
Your whole reasoning is based on “it’s not fair,” which leads you to look at the problem incorrectly. Whether it’s a person or an AI that learns by looking at what others have produced, it is just that: learning. There are some caveats that have legal ramifications for copyrighted works and how you access them, but the very act of learning something is never illegal in my opinion.
Aside from the question of legality, the actual problem is that people use AI to produce content at scale. Even if it has only been trained on public domain content, that won’t change, which means that, regardless of the source material, content creators will be displaced.
Personally, I think using AI that has been trained on copyrighted materials to produce content at scale for the sole purpose of monetization is morally reprehensible, and legally dubious to most likely illegal. Unfortunately, that is the imperfect world we live in, and until the legal landscape settles on the issue, the best anyone can do is adapt to the situation, even if it sucks, because not adapting will suck more.
Re: Re: Re:8
This argument depends on the definition of “stealing,” and I’m not interested in replying to this part because it is a straw man.
No. The argument is about AI copying, or “learning,” or whatever you call it, not humans. That you can argue humans get “inspiration” from others doesn’t mean it is just for machines to do so.
The assumption is that machines “learn” the same way humans do, which is not backed by evidence, so I can reject your opinion in this case.
Even if I assume machines do learn (which is debatable), machines don’t have a legal right to do it. Imagine someone uses a computer to copy books they didn’t author and share them on the internet, and when they get accused of illegal copying, they blame the computer that did the copying.
Well, let the AI displace them if this were indeed the case. You at least admitted one fact: that it is unnecessary to train AI on copyrighted materials at all. Which could be an important argument against fair use.
I agree with this part, and yet this is why authors sue. If authors didn’t do it early, the social outcomes could be worse.
Re: Re: Re:9
No, it doesn’t depend on the definition of stealing. As you said later in your post: Imagine someone uses a computer to copy books they didn’t author and share it on the internet, and when they got accused of illegal copying, they blame the computer that does the copying, which is exactly why a machine can’t steal by itself – there’s always people involved.
This statement indicates you don’t actually understand the technology being discussed, and it also indicates that you don’t really want to understand it either. What a machine does is entirely dependent on its master, so whether you think what an AI is doing is just or not, you are ignoring the one using the machine because you want to blame the machine.
Of course they can learn and act on information they gathered, and saying machines don’t have rights is like saying a stone doesn’t have rights – it doesn’t make sense.
Which is exactly the point I raised when you claimed machines are stealing, i.e. you blamed a machine for how people used it. So please decide which of these arguments you think is right, because currently you are arguing against yourself.
Admitted what? I explained the factual situation to you and nowhere did I say what type of material is necessary or not so please stop claiming I said something which I never did.
Re: Re: Re:10
No, you missed my point. This analogy was to question whether the fact that machines can learn has any effect on legal arguments. Because machines are never legal persons, they cannot be liable for what they do, and the liability falls on the operators of the machines. So, does the fact that machines “learn” have an impact on the judgement of copyright infringement? Probably not.
Perhaps the term “machine learning” is a misnomer altogether. A better term would be “training the machines,” which suggests there are humans taking responsibility. Now, putting copyright issues aside, what if a machine is trained on a company’s confidential data and then unwittingly spits out company secrets (through a phenomenon called “regurgitation”)? Whom could you blame for the leak?
You cannot simultaneously argue that machines learn and that machines don’t copy copyrighted data. Pick one or the other.
If the machines can steal jobs even without copyrighted data, it makes no sense to train them on copyrighted data at all. That’s why you cannot justify machine “learning” with copyright.
Re: Re: Re:11
We are going in circles. It doesn’t matter what you call what a machine does, since any legal liability invariably falls on a human operator or the owner of the machine, regardless of the technology involved. That a machine can learn has nothing to do with the legal ramifications of what content is used.
Don’t tell that to me, tell that to the people who do it. I’m just informing you of the reality of the situation, which you seem to take as an attack on those who are affected.
I can justify it on a case-by-case basis, and if you had actually bothered to read the USCO AI Report, you’d see they do too. The problem here is that you think I’m saying something I’m not, which is very strange.
Re: Re: Re:12
Let me clarify: that a machine can learn doesn’t mean the machine is entitled to learn it. Some kinds of content are just morally bad for a machine to even grab ideas from. One example is nude images.
Note that I’m avoiding copyright examples because the whole issue is not just copyright, but the general right of people to control their own data.
AIs can’t know why nude images are bad; they would just produce deepfakes of them whenever a user instructed them to. Of course the liabilities are on the operators. But since the AI models are also trained on porn, the AI companies should also be liable for deepfake porn that users instructed the AI to produce. If the AIs weren’t trained on porn, they would have no ability to make porn, and the companies would be safe.
Re: Re: Re:8
“An “it” can’t steal anything, people steal things. If you can’t separate technology from what people do with it you should immediately stop what you are doing and start thinking hard about why that is.”
Fun fact: It’s people setting up these regurgitation engines!
Nobody’s talking about holding the regurgitation engines themselves responsible. They’re machines. But the people who make and exploit them? They’re people and they hold moral culpability for their actions and the actions of the process they set going.
Re: Re: Re:9
Exactly my point.
Oh? So the debate is always about the bad actors and there’s nary a peep about the bad bad “regurgitation engines”. . .
Re: Re: Re:10
What the heck are you trying to say?
People opposed to how this has gone down are, in general, not blaming the regurgitation engines because unlike the people swallowing hype whole, they are capable of distinguishing machines from people.
But a tool not being morally culpable doesn’t mean that therefore anyone can do anything with it whenever they want, or that those people cannot be held responsible. Guns and cars are a couple of classic examples.
And if you can’t make your regurgitation engine operate honestly, I figure you shouldn’t be allowed to operate it at all. If the argument is that it would be an undue burden to compensate all the people you stole from because you stole from so many people you couldn’t possibly pay them, well, that’s a you problem, you know?
Re: Re: Re:6
Funny, because your attitude and the way you’ve carried yourself across multiple comment sections on this topic give me the vibe that you actually don’t.
Where does a coder, one of the most mentally demanding jobs out there, go for a new job when they get declared obsolete? What societal benefits do we gain by displacing people who work in mass-employment artistic fields such as animation? And where do those animators go after they get laid off and replaced with cheap contract work? There is a hard limit to the human ability to upskill, and declaring work like animation expendable in the pursuit of new efficiencies at the altar of “innovation” is disgusting.
Re: Re: Re:7
A company where they haven’t been declared obsolete.
Ideally, an influx of content(culture).
The same place as the coders: a company that values them.
Who has declared that?
Re: Re: Re:8
And what happens as the pool of companies that declare them obsolete increases massively over time and they can’t find jobs anymore?
“Ideally, displacing skilled people and replacing them with cheap touch-up contractors will mean that we get our cartoons out faster” is only a benefit for impatient manchildren, not society at large. Also, it’s hard to call AI slop “culture.”
Same thing with coders getting declared obsolete; the pool shrinks as companies decide they want to use AI and cheap contract labor rather than pay experienced people. How does society benefit besides “Oh I can get my shows faster than before”?
You seem to be declaring that, in the belief that it will lead to “an influx of content (culture).”
Re: Re: Re:5
That wasn’t really a question, you constructed an argument you thought we put forward when we just described how the world works. You are entirely free to point out where we said that we want people to be displaced by new technology.
As someone said to highlight the very fact that life isn’t what you want it to be: In order to live, I just can’t do the things that I like to do. It’s absolutely unfair.
Yeah, no to this whole article. Copyright can be reformed AND AI companies can pay creatives for their work. AI is currently being used to replace workers in creative industries (video game artists, for example). There was just a huge strike in Hollywood because writers wanted guarantees they would not be replaced by AI. Many artists are upset about AI, and it has nothing to do with copyright. Like, I’m sorry, but I flat out would NOT want my work contributing to an industry ALREADY stealing creatives’ jobs, copyrighted or not. If you want to enhance your knowledge with paywalled and copyrighted content, pay for it like the rest of us and use your billions of dollars to lobby for copyright changes.
Re: To protect creative workers
There are two things needed here to protect the creative worker from the AI monstrosity taking away jobs:
Yup. A copyright license for AI to analyze a work is the same as a license for a human to consume the same work over and above the purchase price.
Nono, surely this will be the time that expanding copyright will benefit creators and not giant corporations.
Re:
Not just creators, but everyone online who wants control of their own data.
Your photos may be set to be visible only to a few friends of yours, and yet companies may take them to train their AI without your consent. And suddenly their AI may be able to recognise your face among other photos without your knowledge. Or they may generate porn images with random faces that just happen to be yours. Or they fake your activity through an innocent image-generation prompt by a random user.
This fight for control is really about a lot of things. Privacy. The right to your personal image. Safety from impersonation (deepfakes). And more. It’s not just copyright.
Re: Re:
Do read the fine print in the TOS of any service you use, because by accepting it you have given them the rights to do almost anything with the content you post on it.
Re: Re: Re:
Well then. This is the issue that Techdirt should focus on fixing: lobbying legislators for stronger laws protecting personal data, and banning the “consent because of the TOS” (Terms of Service) nonsense.
And as you see, giving the AI companies unprecedented “fair use” on personal data won’t solve this.
An influx of content is not always a good thing. The internet has already proven this: lots of content, each piece contesting for readers’ attention. Much of the content ended up being “trash.”
The opposite is true, unfortunately.
There’s an upper limit to what AI can generate, based on how much human knowledge (i.e. data) it has consumed. When an AI starts to get fed its own output, it degrades and eventually collapses. The term “model collapse” was coined for this phenomenon. Go search for that term if you want to learn about it.
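You can see the gist of this with a toy sketch (a deliberately simplified stand-in, not a real generative model: here the hypothetical “model” is just a Gaussian fitted to its training data, and each generation it is retrained on its own samples):

```python
import random
import statistics

# Toy illustration of "model collapse" (hypothetical toy model, not a
# real generative system): the "model" is a Gaussian fitted to its
# training data. Retrained on its own samples each generation, the
# fitted spread (its "diversity") tends to decay toward zero.

def fit(data):
    """'Train' the model: estimate mean and standard deviation."""
    return statistics.mean(data), statistics.pstdev(data)

def generate(mu, sigma, n, rng):
    """'Generate' n new data points from the fitted model."""
    return [rng.gauss(mu, sigma) for _ in range(n)]

rng = random.Random(0)
data = generate(0.0, 1.0, 200, rng)   # generation 0: the "human" data
first_sigma = fit(data)[1]

for _ in range(1000):                 # retrain on its own output
    mu, sigma = fit(data)
    data = generate(mu, sigma, 200, rng)

last_sigma = fit(data)[1]
print(first_sigma, last_sigma)        # the spread shrinks over generations
```

No fresh “human” data ever enters the loop, so the estimation error compounds and the model drifts toward a single point: creativity out of nothing doesn’t happen.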
This suggests an interesting law akin to thermodynamics: just as you can’t make energy out of nothing, you can’t make intellect or creativity out of nothing either. AI is never a perpetual motion machine.