AI Will Never Fit Into A Licensing Regime

from the that's-not-how-any-of-this-works dept

Yes, I’m aware that Nvidia and Adobe have announced they will license training data. I don’t know what those agreements will look like, but I can’t imagine they make any sense in terms of traditional licensing arrangements. Rather, I’m guessing they just brute-forced things to build goodwill among artist communities, and perhaps to distinguish themselves from other AI companies. I sincerely doubt these arrangements will help artists, though, and I fear these licensing conversations will distract from better conversations about how to balance interests. To explain my thoughts on this, I have to start from the beginning.

How AI Training Works

AI training is very complicated, but it can be explained pretty simply using the lie-to-children model. There’s an older video by CGP Grey (and a footnote video) that does a beautiful job, but I will try to summarize. AI training is the solution to the problem of making a computer good at something complicated. Let’s say you want to teach a computer to play chess (or Go). The simplest, least efficient way to start is to hand-write rules: “if the human moves this pawn, move that pawn,” and so on. Writing the program like this would take forever, and the computer would be bad at repeated games of chess because a human could quickly figure out what the computer is doing. So the programmer has to find a more efficient set of rules that also does a better job of adapting to and beating human players. But there is a limit: the program keeps getting more complex, and great chess players will still find the difficulty lacking. A big part of the problem is that humans and computers just think differently, and it’s really hard to write a program that gets a computer to act the way humans do when it comes to things humans are really good at.
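
To make the “if the human moves this pawn, move that pawn” approach concrete, here’s a minimal sketch of a hardcoded move table. The moves and notation are invented for illustration; no real chess engine works this way:

```python
# A deliberately naive "hardcoded rules" chess bot, as described above.
# The move table is hypothetical, purely for illustration.
HARDCODED_RESPONSES = {
    "e2e4": "e7e5",  # if the human pushes the king's pawn, mirror it
    "d2d4": "d7d5",  # if the human pushes the queen's pawn, mirror it
}

def respond(human_move: str) -> str:
    # Fails on any move the programmer didn't anticipate, and is
    # trivially predictable across repeated games.
    return HARDCODED_RESPONSES.get(human_move, "resign")

print(respond("e2e4"))  # e7e5
print(respond("g1f3"))  # resign -- the table has no answer
```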

Fortunately, there are several methods of getting computers to figure it out on their own. While these methods function differently, they generally involve giving the computer a goal, a lot of data, a way to test itself, and a tremendous amount of processing power. The computer keeps iterating until it passes the test, and the human keeps tweaking the goal, the data, and the test based on the kind of results they are getting. After a while this learning starts to resemble how humans learn, but done in a way that only computers can do. The result of this learning is often called a “black box,” because we don’t know exactly how the computer uses the model it creates to solve the problems we give it.
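
As a toy instance of that loop, here’s a one-parameter “model” iterating against a goal (a loss function), data, and a test until it passes. Everything here is illustrative; real training has the same shape but at a vastly larger scale:

```python
# Goal: learn the hidden rule y = 2x from data, by iterating until a test passes.
data = [(x, 2 * x) for x in range(10)]   # the data: (input, correct answer) pairs
w = 0.0                                   # one-parameter "model": predicts y = w * x
lr = 0.01                                 # how big a nudge each iteration makes

for step in range(10_000):
    # The goal, expressed as a number to minimize: average squared error.
    loss = sum((w * x - y) ** 2 for x, y in data) / len(data)
    if loss < 1e-6:                       # the test: close enough, stop iterating
        break
    # Which direction to nudge w to reduce the error (the gradient).
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad                        # iterate: adjust the model slightly

print(f"learned w = {w:.4f} after {step} steps")  # converges to ~2.0
```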

AI image generation training, at its simplest, is giving the model an image-and-text pair and telling it that some of the words describe a style and some of the words describe objects within the picture. The goal set for the model is to understand, and be able to reproduce, the style or the objects (or both) based on the words. Give it a picture of The Starry Night and a description, and the model will start to learn style concepts from the words “Van Gogh” and “post-impressionism.” It will also start to understand the objects of city, moon, and stars, and how they would look at night. It takes a lot of images to train toward a functional understanding of these concepts, and after each image is ingested into the model it’s basically trash: it isn’t stored in the model, only the concepts are. And those concepts should ideally not be tied to a single image (see overfitting).
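
Here is a toy sketch of that claim, that each image updates word-level “concepts” and is then discarded. This is not how diffusion models actually compute; it only illustrates the data flow, with made-up “features”:

```python
# Toy illustration: captions update per-word "concepts"; images are discarded.
from collections import defaultdict

concepts = defaultdict(lambda: [0.0, 0.0])  # word -> tiny learned vector

def train_step(image_pixels, caption):
    # Crude stand-ins for learned features: average brightness and contrast.
    features = [sum(image_pixels) / len(image_pixels),
                max(image_pixels) - min(image_pixels)]
    for word in caption.lower().split():
        vec = concepts[word]
        # Nudge each word's concept slightly toward this image's features.
        vec[0] += 0.1 * (features[0] - vec[0])
        vec[1] += 0.1 * (features[1] - vec[1])
    # Note: the image itself is not stored -- only the concept updates survive.

train_step([30, 80, 200, 140], "van gogh post-impressionism starry night")
print(concepts["van"])  # what remains: per-word statistics, not pixels
```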

This brute force learning is not that different from how humans learn art. A human has to learn the practical techniques of how to make art, but they also should be looking at other artists’ work. When I was learning pottery, there was a point where my instructor said, “now that you can produce a pot, it’s time to figure out your style.” That involved looking at lots of pictures of pottery. For computers, that’s the only step that really matters. Teaching a human to create art in this way would be like locking someone in a room with every Monet and a set of paints and not letting them out until they have produced impressionist art. Importantly, current AI simply reproduces learned styles; it would not create post-impressionism in protest if given the same task.

[Image: The war between impressionism and post-impressionism]

How Licensing Works (Traditionally)

Licensing can be a pretty complex process in which copyright law, a lot of lawyers, mutual business interests, and the threat of a lawsuit get together to produce a contract that everyone thinks should have been better for them. There is also usually art involved.

To continue oversimplifying things, let’s just say that every time a song is streamed someone, somewhere, moves a bead on an abacus. At the end of the month an amount is paid based on whatever the abacus says. An ad campaign uses a photo? Another abacus bead moves based on where it’s displayed, how many people see it, and so on. A song gets used in a tire commercial? More abacuses. The important thing is that an artist’s work is getting reproduced, in whole or in part, in a quantifiable way. That quantity relates back to agreed-upon terms, and the artist is paid based on those terms.
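
In code, the abacus model looks something like this (the rates and counts here are invented; real royalty schedules are far messier):

```python
# Every quantifiable use moves a bead; payment follows from agreed-upon rates.
# All rates and usage counts below are hypothetical.
RATES = {"stream": 0.003, "ad_impression": 0.0001, "sync_commercial": 5000.0}

usage = {"stream": 1_250_000, "ad_impression": 400_000, "sync_commercial": 1}

# Each use type is counted, priced per the agreed terms, and summed.
payout = sum(RATES[use] * count for use, count in usage.items())
print(f"monthly payout: ${payout:,.2f}")  # $8,790.00
```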

How AI Licensing Doesn’t Work for Artists

Let’s ignore the fact that AI training is likely fair use. Let’s ignore that the audience for the works is a computer. Let’s ignore that the works are only used to teach the computer concepts. Let’s ignore the fact that those works are (ideally) never reproduced or displayed to users of the AI. Even ignoring all that, there is still the problem that the number of times each work is used is one (ideally). Stable Diffusion’s first release ingested 2.3 billion images. So in abacus terms, each work moves one lonely bead to make up a 1/2.3-billionth (and counting) share of a theoretical license fee.
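
The arithmetic of that lonely bead, under a couple of hypothetical license-pool sizes:

```python
# Hypothetical license pools split evenly across the training set.
TRAINING_IMAGES = 2_300_000_000          # Stable Diffusion's first release
for pool in (10_000_000, 100_000_000):   # invented total license fees, in $
    per_image = pool / TRAINING_IMAGES
    print(f"${pool:,} pool -> ${per_image:.6f} per image")
# $10,000,000 pool -> $0.004348 per image
# $100,000,000 pool -> $0.043478 per image
```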

The next problem is what the share is in. The Copyright Office has so far said that AI-generated elements of a work can’t be copyrighted. Grimes recently tweeted that she would split 50/50 any royalties on songs using her voice. But royalties are based on licensing, which is based on copyright. Theoretically, streaming companies and other distributors could pay royalties on a copyright-less song, but would they? And would listeners pay for a song in the public domain? Maybe. The more likely answer is that works produced by AI have no standalone value in any meaningful sense that artists could get a piece of.

OK, but what about giving artists a share in some right for their contribution to the model? I don’t think this works, for a few reasons, but there are two main theories behind this proposal that we have to take in turn. The first theory relates to the open question of what claims an artist has over a song that sounds like them but isn’t based on any song in their catalog. Answering that question the wrong way could matter for music generally: creating a rule that gives an artist rights in songs that sound like them, but otherwise aren’t infringing, would mean Led Zeppelin would have a claim over Greta Van Fleet’s catalog. We currently don’t give rights for even heavy influence, or for impersonation. The same is true in the other arts as well.

The second theory is based on the fact that AI can be used to infringe copyright in the traditional sense. While output is ideally original, sometimes overfitting occurs and the model will output works that are close enough to works it trained on to be considered infringing under current copyright law. This has many parallels to the Betamax case, where the Supreme Court ruled that distributing recording devices and recordable media did not amount to contributory infringement even when they were sometimes used to infringe. Congress reacted to that ruling with the Audio Home Recording Act, which created a generic royalty on each device or piece of recordable media sold. I don’t think that’s the answer here, because overfitting is a bug, and AI developers are (and should be) working toward fixing it.

Either way, there isn’t even a good way to divvy up whatever money might be available. AI training requires a lot of data, so any particular artist’s share would be extremely diluted. One might argue that shares should be paid out based on importance to the model, but that is likely impossible to figure out in any meaningful way. AI functions as a black box, and it’s hard to quantify how much of each work is in any particular concept it invokes when responding to a prompt. That holds true even when artists’ names are used in prompts. Yes, the artist might have greater influence on the output (as intended), but all the other works were still necessary to teach the model what a cat is, what a bicycle is, and how a cat might ride a bicycle through Dali’s The Persistence of Memory.

Regardless, none of this solves the problem artists are facing: they are competing with a computer that is faster, cheaper, dumber, weirder, more chaotic, and less open to feedback. A residual of five cents a month is not going to fix that, even if the artist can collect it. For example, Stability AI has released the Stable Diffusion model as open source, and many people (myself included) run it on their own computers for free. There’s no way to put the genie back in the bottle.

Other Solutions

None of this dismisses the fact that there should be conversations about how to navigate this new space in a way that preserves art and artists. The simplest answer might be that artists are smart and talented people, and they are already figuring it out. A new survey shows that 60% of musicians are using AI to create. The special effects artists at Corridor Crew have figured out a workflow for what is basically AI rotoscoping. Professional photographer Aaron Nace (owner of Phlearn) has a video teaching how to integrate AI into creative photography projects. Disney animator Aaron Blaise has a video encouraging artists to embrace new AI technology. Artists will adapt to using AI as a tool and will produce better work than people who don’t understand art and have no idea what they are doing. And their use of AI will likely be a small enough part of their works that they will still be able to rely on copyright, if that is their model.

The next simplest answer is that denying copyright to low-effort uses of AI is probably the best way to protect artists long term. One of the biggest threats artists face from AI is that powerful studios, labels, etc., will use AI to cut them out of the process. If that work can’t be copyrighted, because no human contributed anything of artistic merit, then the copyright industry won’t be able to turn AI into a cash cow by cutting its artist costs.

Finally, yes, maybe we should have a conversation about whether a line should be drawn at training models heavily toward the style of living artists. But these kinds of threats are far more present among “hobbyist” communities (NSFW warning for about half the content there) that play with the technology for their own interests. Larger AI developers all seem to be training away from that, toward more general models that are easier to use and based on simple prompts.

Matthew Lane is a Senior Director at InSight Public Affairs. Originally posted to Substack. Republished here with permission.

Comments on “AI Will Never Fit Into A Licensing Regime”

Stephen T. Stone (profile) says:

Re:

Every AI evangelist keeps saying something to this effect. But their saying it tends to undercut their evangelism⁠—and expose something darker about themselves.

When a person experiences a new work of art that inspires them, they generally don’t slap a copy of that work on their workbench, copy the work wholesale, and mix-and-match pieces of works from the same artist to get a “just like [x]” result. They’ll study that work for inspiration and look at how they can adapt techniques used to create that inspiring work into their own works. An artist who dabbles in drawing, for example, can see a work from a skilled artist, think something like “they draw hands so well”, and look at how the artist renders hands to see if such techniques could be adapted into the inspired artist’s repertoire. (To wit: Drawing fingernails can help with hand/finger positioning.) Humans learn by absorbing knowledge from multiple sources and experimenting with new techniques. They can tell you what they did⁠—but more importantly, they can tell you why.

That, dear AC, is the limit of machine-generated art. A machine can do the “work” of accepting input and studying it, but it can’t actually produce anything with “soul”. It can’t explain why it chose a certain brush stroke or line, why it combined a set of particular words into a sentence/paragraph, why it went from a given set of notes to another. You can tell a machine to generate an image of a character or a short story, but the machine can only assemble from the parts that it knows to assemble from⁠—which is why the image is most likely to be a front-facing portrait with generic lighting and the short story is bound to be clichéd and trite with one-note characters. Machine-generated art is an autocomplete collage⁠—bits and pieces of other people’s works slapped together in a way that can only ever be, pardon the phrasing, artificially appealing. It is the drive of human creativity reduced to a goddamned assembly line.

Therein lies the dark aspect of AI evangelism: Deep down, the people who want machine-generated art to be popularized⁠—perhaps even to see it become the dominant form of art!⁠—want to take the humanity out of art. Maybe they want this done out of envy for the skills they don’t have. Maybe they want it done for the sake of increasing profits by paying fewer people to do “creative grunt work”. But whatever the reason, the ultimate goal of AI evangelism is to replace people with machines because dealing with human artists can be messy and waiting for their art can be time-consuming. To wit: Some asshole probably thinks they can train an LLM to finish George R.R. Martin’s A Song of Ice and Fire saga before he does.

People aren’t machines. They don’t create works of art in the same way machines generate “art”. Yes, someone can describe how people take inspiration from others’ works in the same way as one would describe someone programming, say, a model for an AI image generator. But the generator, no matter how good, will never be able to tell someone why it made what it made⁠—and that “why” is the “soul” of human creativity. Even if someone can’t explain every little detail and decision about their art (for one reason or another), they can still explain the broad strokes. A machine will only ever wait for you to press the “Generate” button.

Anonymous Coward says:

Re: Re:

When a person experiences a new work of art that inspires them, they generally don’t slap a copy of that work on their workbench, copy the work wholesale, and mix-and-match pieces of works from the same artist to get a “just like [x]” result.

Having watched and listened to John Sheahan (of the Dubliners) encouraging just that in a back-bar jam session with local fiddlers, I can say you are wrong. Indeed, many a musician starts out playing other people’s music in their style, and having learnt to emulate several people’s styles, they go on to develop their own style.

Indeed, the way humans learn anything, from language to an art, is by observing and copying others.

Also, have you looked at how these AI tools are used, such as this look at Stable Diffusion? It is at the word-processor level; that is, it allows someone with skill to produce good results, while anybody lacking those skills will likely produce a mess.

Stephen T. Stone (profile) says:

Re: Re: Re:

many a musician starts out playing other people’s music in their style

Which, again, is sort of my point: Even if someone does sit down to “copy” someone else’s art, chances are they’ll end up doing it in their own style. Humans learn by doing, and one of the things they learn by doing is making art⁠—and yes, that means they may have to make a copy or two first. But a computer can’t learn, can’t accept feedback, can’t grow as an artist⁠—it can only ever produce what it’s been programmed to produce. And if you’re going to say “well that programming can be expanded and that’s like human learning” or something to that effect, I have a word of advice for you: don’t.

it allows someone with skill to produce good results

So what? They’re not actually making the art. They’re skipping the actual work of making art⁠—the practice, the experimentation, the feedback, the adaptation of new inspiration⁠—so they can press a button and have a computer spit out an image from what is damn near literally an assembly line. They don’t want to be artists; they want computers to drive artists “out of business” because of envy, greed, or…shit, maybe they don’t like dealing with people. So please, don’t mistake someone being “good” at writing a prompt for an image generator with someone being an artist of any caliber.

Stephen T. Stone (profile) says:

Re: Re: Re:3

An image generator can’t imagine an impressionist-style image and create a wholly original work from that imagined picture. It can only ever spit out a collage of bits and bytes that it’s been programmed to “think” is an impressionist painting based on the input it’s been given⁠—input which could include Monet’s works.

An image generator can’t accept feedback. It can only ever be made to produce another image. It can’t figure out why it’s being told to produce another image⁠—why the previous one wasn’t what the person at the keyboard wanted⁠—so it can’t learn from criticism in the way people can.

An image generator can produce an image, sure. But it can’t come up with a unique image out of thin air⁠—it can only ever collage together a bunch of bits and pieces from other images. It is a machine, cold and unfeeling, and any “art” produced by a machine alone will never have the kind of heart and soul that you’ll find in a work created by a human being. If you don’t believe me, I challenge you to restrict your media diet for the next 24 hours to nothing but machine-generated “art” in any and every form. I guarantee that if you make it the full day, you’ll be sick of looking at AI art for the rest of your life.

Assuming, of course, that you’re not someone who already thinks machine-generated art can, will, and should replace human-generated art.

Anonymous Coward says:

Re: Re: Re:4

This is all simply, flatly, and fundamentally untrue. I’d suggest you take some time to muck about with one of the many free image generators out there to instantly understand how fundamentally untrue it is. They are not collages, but unique and bespoke pieces. See for yourself if you don’t believe. You can’t argue that the sky is green while people are outside looking at it. You’re just wrong.

TKnarr (profile) says:

It might help to remove AI from the picture and substitute a human instead. The problem with training AI to produce art isn’t so much the use of images as training data, or the use of AI itself, as it is having the AI then produce images in the style of a specific artist A, which then displace that artist’s images in the marketplace. So, replace the AI with another artist B. B goes and looks at a lot of A’s images and learns to produce images in A’s style. B then goes into business doing up images in A’s style and selling them as “works in A’s style but at a fraction of the price”. How would the law and artist A handle this scenario?

Stephen T. Stone (profile) says:

Re:

So long as Artist B isn’t saying they’re actually Artist A and aren’t producing 1:1 copies to sell to the mass market, the law can’t really do shit. Art styles, in and of themselves, aren’t copyrightable. If they were, Disney would have shitloads of lawsuits to file against people who’ve been making “Disney-style” art (2D or 3D) for years and years and years. Do you really want to go down the road that says “laws that could apply to machine-generated art must also apply to human-generated art”?

NaBUru38 (profile) says:

I disagree with the article’s claim that it’s impossible to detect the authors used as sources.

If you ask a generator to make “a song about cats in the style of Eminem”, the result will likely “borrow” heavily from that artist’s works, so much so that people will quickly identify the resemblance or even believe it’s an original.

At this point, the author (or copyright owner) may complain. Which they did with the Eminem cat song.

https://m.youtube.com/watch?v=zTs3FuJGiQ4

https://www.vice.com/en/article/88xadz/ai-generated-eminem-rap-youtube-chatgpt

Samuel Abram (profile) says:

Re:

The YouTube video was given a copyright strike, but that’s not a legal action; it’s more like YouTube’s private copyright law for dealing with infringement…which is also flawed. Otherwise, my friend Sebastian Tomczak wouldn’t have gotten a similar copyright strike over his white noise sounding like someone else’s white noise.

Anonymous Coward says:

Re: Re:

Otherwise, my friend Sebastian Tomczak wouldn’t have gotten a similar copyright strike over his white noise sounding like someone else’s white noise.

You know, maybe artists would have gotten more sympathy against the incoming wave of AI if they hadn’t been so trigger-happy about using copyright law to shut down anything and everything they disagreed with.

Edward C. Greenberg says:

AI

Among the issues not discussed in this article or in the comments section is that AI will serve to make money for lawyers like me who litigate copyright cases. The costs of litigating such cases, to determine the ever-changing “laws” that apply, will be paid by creatives and big companies alike. Creatives, even if they prevail in a copyright case, are (generally) not entitled to recover fees paid to experts, EVEN IF they are awarded some or all of their attorneys’ fees.

The law will NEVER catch up to the speed of AI development. If you enjoy litigation, and especially if you like paying legal fees, then expanding AI is just for you.
