Let’s Not Flip Sides On IP Maximalism Because Of AI
Copyright policy is a sticky, tricky thing, and there are battles that have been fought for decades between public and corporate interests. Typically, it’s the corporate interests that win, especially the content industry. We’ve seen power, and copyrights, collect among a small group of content companies because of this. But there is one significant win that the public interest has been able to defend all these years: Fair Use.
Fair use’s importance has only grown over the years. Put simply, fair use allows people limited use of copyrighted material without permission. Fair use’s foundations are in commentary, criticism, and parody. However, fair use has arguably filled in important gaps to allow us to basically exist on social media. That’s because there are open questions on what is and isn’t copyright infringement, and things as simple as retweeting or linking could theoretically get us in trouble. Fair use also allows a lot of art to exist, because a lot of art critiques or comments on older art. On the flip side, when fair use was ruled to not cover music sampling, it basically killed a lot of creative sampling in hip hop music. Now popular sample-based music is relatively tame and tends to use the same library of samples.
Fair use (probably) also protects the creator industry. Many people make a living streaming video games or making content around playing video games. All of that could violate copyright laws. We don’t know the extent of risk here, because it hasn’t been fully tested, but we do know that videogame makers have claimed videogame streaming content as copyrighted material. We also know that in Japan, which doesn’t have fair use, a streamer got two years in jail for posting Let’s Play videos. A lot of creators also make “react” content, which also relies on fair use protection.
Blowing up Fair Use
Considering the importance of fair use, and the historically bad behavior of the content industry towards ordinary people, it’s striking that a lot of public interest advocates want to blow it up to hurt AI companies. This is unfortunate, but not particularly surprising. Content industry lobbying has inflated copyright protections into a pretty big sledgehammer, and when you really want to smash something you often look for a sledgehammer. For example, copyright and right of publicity (a somewhat related state-level IP regime) were the first tools people turned to in order to protect victims when revenge porn first became a big problem.
Similarly, some public interest advocates are turning to copyright to stop AI from being trained on content without permission. However, that use is almost certainly a fair use (if it’s copyright infringement at all) and that’s a good thing. The ability of people to use computers to analyze content without permission is extremely useful, and it would be bad to weaken or destroy fair use just to stop companies from doing that in a socially problematic way. The best way to stop bad things is with policy purposefully made to address the whole problem. And these uses of copyright law often play into the hands of powerful interests: the copyright industry would love the chance to turn the public interest advocacy community against itself in order to kill fair use.
I’m not saying that there aren’t issues with AI that need to be addressed, especially worker exploitation. AI art generators can be especially infuriating for artists: they use a lot while giving back little. In fact, these generators are arguably being built to replace artists rather than to provide artists with new tools. It can be attractive to throw anything in the way to slow it down. But copyright, especially copyright maximalism, has done a terrible job of preventing artist exploitation.
Porting “on a computer” to copyright
One of the biggest public interest fights in patent law has been against “on a computer” software patents that clogged up the system and led to a number of patent infringement suits against small businesses for silly claimed inventions. The basic problem is this: it was initially permissible to claim an invention for doing something that was already known, but on a computer. These “on a computer” patents have been greatly restricted through Supreme Court rulings (which special interests would like to overturn). However, the bad effects of software patents still exist today, as do patent trolls seeking to exploit them.
This current fight over copyright in training data reminds me of this same problem. For example, if a writer wanted to study romance novels to find out what is popular, it would be perfectly acceptable under copyright policy for them to read and analyze a lot of popular romance novels and to use that analysis to take the most successful parts of those novels to create a new novel. It is also perfectly acceptable under copyright law for an artist to study a particular artist and replicate that artist’s style in their own works. But using an AI to do that analysis, doing it “on a computer,” is now suspect.
This is shortsighted for a number of reasons, but one I’d like to highlight is how difficult this shrinking of fair use is to contain. We are talking about an area in which the question of whether loading files into RAM is “copying” under copyright law (and therefore needs permission or is a violation) is an actual policy debate that public interest advocates have to fight. If using content as training data becomes a copyright violation, what’s the limiting principle? What kinds of computer analysis would no longer be protected under fair use?
I should also point out that IP maximalism is the easiest way to build oligopolies. Big companies will be able to figure out how to navigate the maze of rights necessary to build a model, and existing models will likely be grandfathered in (with a few lawsuits to get through). However, it will be impossible for any new company or new open source model to be created. Dealing with rights at scale is a problem so significant that even the rightsholder industry has trouble tracking them. And information about rights has been withheld to leverage better deals due to the risk (and high costs) of accidentally infringing someone’s rights.
Matthew Lane is a public interest advocate in DC focusing on tech and IP policy. This post was originally published to his Substack.
Why should it be “fair use” for a corporation worth billions to co-opt the work of creators who never agreed to have their work used to train AI models and who haven’t and won’t be compensated for that use, as the law currently stands?
People like to compare training of an AI model to a human reading and learning from the same work, but these simply aren’t the same thing. For one, AI models aren’t humans, they’re a product being created largely for the profit of corporations. They certainly don’t have the same inherent rights we assign to another human, and we’re a long way off from even considering that possibility.
More importantly, AI models don’t learn like humans. Just look at the recent work where researchers have been able to get AI models to disgorge entire sections of the works they’ve been trained on. The very same works we’re supposed to believe explicitly aren’t being copied in whole or in part as these models are trained on them.
Having some respect for creators in light of AI developments absolutely isn’t “blowing up” fair use. I think this is one area where TechDirt’s take will not age well.
Fair use does not require compensation. Agreement is never part of the equation in fair use. The whole point of fair use is to be able to use copyrighted materials without the copyright holder’s permission.
No copyright on the art? Then the artist is out of luck!
For the same reason that students can learn from books etc. that they buy, borrow or see in an art gallery or museum. It is the same reason that music genres exist: people can learn ways of doing things from other people’s works.
Those claiming it is unfair are asking for free money from computer analysis, which is not the same as copying their works for profit.
Further, people trying to make a living have to compete with everybody on the planet who publishes their works via the Internet. Indeed the idea that creative works were rare and valuable comes from a pre-Internet world, where publishers selected a few works from the many submitted for publication. Those who won that lottery have an overinflated idea of how rare creativity is, and how much their work should be worth.
Are you saying that a human who has read a book can’t reproduce sections from it with some trial and error and the correct prompting? How about a savant with perfect recall?
And I’m a bit interested in what you mean by “entire sections”? Section can refer to a couple of sentences, a paragraph, a page or even a chapter. So how much text could an LLM output with the right prompting that corresponded to a book?
This tells me you don’t even have an inkling of how LLMs actually work, because the statement above is proof that you think they “copy” the input.
More anti-capitalist bullshit from TechDirt. Surprise, Surprise (not).
Why is it good for the workers to be exploited?
You’re in favour of worker exploitation, then?
Absolutely, especially when I’m a shareholder and the employees are soon to be replaced with AI or automation anyway.
The other thing about AI, especially when it comes to copyright, is that none of the arguments are all that new.
In fact, most of them were done in some variation about 150 years ago when photography came along.
The difference is that case law has become much more complicated in the ensuing 150 years, and the Copyright Act has been updated multiple times.
As a result, the arguments are all old, but their application to the current copyright landscape is novel enough that people who weren’t alive last time around have a hard time seeing the potential simplicity, and instead want to torture the existing laws some more.
And removed the employment of most portrait and landscape artists. That did not stop people painting portraits and landscapes as a hobby, and a few managing to make a living by being well above average in ability.