If Creators Suing AI Companies Over Copyright Win, It Will Further Entrench Big Tech
from the be-careful-what-you-wish-for dept
There’s been this weird idea lately, even among people who used to recognize that copyright only empowers the largest gatekeepers, that in the AI world we have to magically flip the script on copyright and use it as a tool to get AI companies to pay for the material they train on. But, as we’ve explained repeatedly, this would be a huge mistake. Even if people are concerned about how AI works, copyright is not the right tool to use here, and the risk of it being used to destroy all sorts of important and useful tools is quite high (ignoring Elon Musk’s prediction that “Digital God” will obsolete all of this).
However, because so many people think that they’re supporting creators and “sticking it” to Big Tech in supporting these copyright lawsuits over AI, I thought it might be useful to play out how this would work in practice. And, spoiler alert, the end result would be a disaster for creators, and a huge benefit to big tech. It’s exactly what we should be fighting against.
And, we know this because we have decades of copyright law and the internet to observe. Copyright law, by its very nature as a monopoly right, has always served the interests of gatekeepers over artists. This is why the most aggressive enforcers of copyright are the very middlemen with long histories of screwing over the actual creatives: the record labels, the TV and movie studios, the book publishers, etc.
This is because the nature of copyright law is such that it is most powerful when a few large entities act as central repositories for the copyrights and can lord their power over other entities and try to force them to pay up. This is how the music industry has worked for years, and you can see what’s happened. After years of fighting internet music, it finally devolved into a situation where there are a tiny number of online music services (Spotify, Apple, YouTube, etc.) who cut massive deals with the giant gatekeepers on the other side (the record labels, the performance rights orgs, the collection societies) while the actual creators get pennies.
This is why we’ve said that AI training will never fit neatly into a licensing regime. The almost certain outcome (because it’s what happens every other time a similar situation arises) is that there will be one (possibly two) giant entities who will be designated as the “collection society” with whom AI companies will have to negotiate, or from whom they will just purchase a “training license,” and that entity will then collect a ton of money, much of which will go towards “administration,” and actual artists will… get a tiny bit.
And, because of the nature of training data, which only needs to be collected once, it’s not likely that this will be a recurring payment, but a minuscule one-off for the right to train on the data.
But, given the enormity of the amount of content, and the structure of this kind of thing, the cost will be extremely high for the AI companies (a few pennies for every creator online can add up in aggregate), meaning that only the biggest of big tech will be able to afford it.
In other words, the end result of a win in this kind of litigation (or, if Congress decides to act to achieve something similar) would be the further locking-in of the biggest companies. Google, Meta, and OpenAI (with Microsoft’s money) can afford the license, and will toss off a tiny one-time payment to creators (while whatever collection society there is takes a big cut for administration).
And then all of the actually interesting smaller companies and open source models are screwed.
End result? More lock-in of the biggest of big tech in exchange for… a few pennies for creators?
That’s not a beneficial outcome. It’s a horrible outcome. It will not just limit innovation, but it will massively limit competition and provide an even bigger benefit to the biggest incumbents.
Is the alternative (re: not addressing copyright law with regards to AI) actually any better?
We’ve already seen companies limit API access, ostensibly because of cost, while simultaneously raising rates as the value of API access becomes more associated with AI outcomes.
We already see the largest, most influential tech companies able to far outstrip other competitors with AI, as they’re better able to parse the massive amount of data available.
Copyright currently exists to help the middlemen, not the artists, but is that an argument against applying copyright to AI, or an indictment of our copyright system in general?
Arguing that artists are already required to bend over backwards for major companies, and thus copyright shouldn’t be enforced with regard to AI, sounds like a bit of a backwards argument.
Yes. Absolutely.
A practice I’ve criticized, and one I don’t think is all that sustainable. I mean, many of the AI systems don’t use the APIs anyway, but resort to scraping or other content access mechanisms. So while the API pricing is notable, I’m not sure it’s difference-making.
I… also don’t think that’s accurate. I mean, go back just a few years ago and we kept hearing how the only companies that would be able to parse the data and offer AI were Google, FB, Amazon, and Apple.
And yet, the leaders in AI are… not really any of those guys. They’re players, certainly, but OpenAI leads, and other companies, such as Anthropic, are doing well too.
It’s true that many of these companies have now taken billions of dollars from the big tech companies, but that’s because the big tech companies haven’t actually been able to get the same results.
It is an indictment of copyright, definitely. But it’s an indictment of what is inherent in the copyright system. And will remain, even here.
But I’m neither arguing that “artists should bend over backwards” nor arguing that “copyright shouldn’t be enforced.” I’m saying there’s NO COPYRIGHT TO ENFORCE in training, because it’s fair use.
And, in the long run that will HELP artists, and not leave them as beholden to big companies, because there will be much more competition.
Anyone who thinks expanding copyright will help individual creators rather than corporate publishers hasn’t been paying attention the last…every single time we’ve ever tried that.
“Let companies rip off your work, or else only Big Tech will be able to rip off your work” is not a compelling argument.
Well, your first (and largest) problem is assuming “fair use” is the equivalent of “ripping off your work.”
Fix that and the rest of your confusion might melt away.
Somewhere along the line, copyright has expanded beyond the control of producing copies to controlling how people can use the contents of a work. Any generic form of usage licensing, such as the performance licenses that venues are required to get to allow musical performance, becomes a means of supporting another layer of parasites and transferring money from the poorer artists to the richer artists.
While “get a license” seems like such a simple solution, it is hugely impractical, especially as self-publishing exists. Just who keeps track of who owns what copyrights when tens or hundreds of thousands of new works are published every day? Any licensing scheme similar to the collection societies means that a new layer of parasites is created, and the publishers take their cut of the license fees, while a few crumbs may make it to the richest creators that they have signed on. Obscure creators and self-publishers will likely find their works covered by any licensing fee, but they will not see a penny of that income.
I suspect it will be even worse in the end. If the AI creators have to go to the gatekeepers, what is the likelihood of them being able to get the kind of data they want to train their AI vs. some kind of prepackaged low-quality data? How vibrant a marketplace will there be if all the training sets are the same?