How About Using AI To Determine Whether Or Not Something Is Creative Enough To Get Copyright Protection
from the put-it-to-good-use dept
There’s been a lot of talk lately about the role of AI and copyright, with much of it focused on fretting by various copyright maximalists about how things created by AI need more copyright or how AI systems are violating the copyright of artists, both of which seem to be fairly questionable claims at best.
But, copyright law professor Brian Frye recently participated in a Copyright Office “listening session” regarding copyright and AI, and he suggested an entirely different way that everyone (including the Copyright Office) should be thinking about the issue. As he notes, the questions everyone seems to be fretting about appear to be easily answered:
I think we are asking the wrong questions about AI and copyright. Everyone is asking whether copyright protects AI-generated works and whether training an AI algorithm infringes copyright. The obvious answer is no and no.
Copyright only protects works created by people. AI doesn’t even create works, it generates content, which we consumers interpret as works. Roland Barthes predicted the death of the author, and AI has written the author’s obituary.
Likewise, training an AI algorithm doesn’t and shouldn’t infringe copyright. AI algorithms don’t copy works, they merely catalog rhetorical conventions and then deploy them to create conventional content.
Instead, he notes, everyone is missing the much bigger picture in that we could be (and arguably should be) using AI to tell us which other works (of the ones created by humans) even have enough creativity to deserve copyright protection in the first place:
We should be asking what AI can tell us about what copyright should protect and why. Copyright can only protect “creative” works. But courts and the Copyright Office have struggled to define “creativity.” Maybe AI can help?
An AI algorithm is essentially a nonsense generator, designed to produce banalities. In other words, AI is uncreative by design. An AI algorithm is a machine for regurgitating conventional wisdom. Indeed, we are amused when an AI “hallucinates” and fails to satisfy our pedestrian expectations.
But we can be just as boring as any AI. And there’s no point in copyright protecting banalities. Maybe AI can help us limit copyright to works that are actually creative. It’s easy, just ask AI to evaluate the “creativity” of works produced by people, to determine whether they deserve copyright. No one knows a fake like a faker, and AI is designed to identify banality. That’s what makes it a killer app.
We don’t know how to identify creativity. But AI can tell us what isn’t creative. Maybe that’s good enough to tell us what is creative, if anything.
Of course, the likelihood of this happening is basically nil, but it’s still a point worth thinking about. In the copyright world, there have long been arguments over what counts as being creative enough to get copyright’s protections. This may have been most notable in the realm of photography, where some (somewhat reasonably!) argued that the photographer, especially in outdoors/landscape photography, was merely capturing a scene created by nature, and therefore had little, if any, creative input into it.
The courts have generally sidestepped this issue by arguing that the photographer gets copyright on the artistic decisions in terms of things like “where to point the camera” and “how to frame the photograph.” But that’s often felt like a cop-out.
It’s intriguing to think of AI in a different way, as a much more impartial observer of whatever works are seeking copyright, with the ability to say whether or not a given work has the requisite creativity to get copyright. It may seem like a silly (or even trollish) suggestion, but it’s difficult to argue it’s any worse than how things are currently done.
Filed Under: ai, brian frye, copyright, creativity, us copyright office


Comments on “How About Using AI To Determine Whether Or Not Something Is Creative Enough To Get Copyright Protection”
It definitely won’t happen, what with all the doomsayers, Congress and news media wetting their pants, and the overblown hype about “AI” being out of control.
What we need is an LLM that can compare the talking points of these idiots with the output of other LLMs.
Oh wait, maybe that’s what they’re afraid of.
Is this something “AI” would even be capable of? It feels like the same overhyping of a technology that’s increasingly hard to find moral use cases for, just in a different direction than most of the MBA dropouts are pushing it.
Feh
I really hope Prof. Frye is trolling because detecting “creativity” by running it through an LLM is a lot like trying to detect whether a given text is AI generated by doing the same. To say that such an approach is unreliable would be an understatement, and to condition legal outcomes based on the output of an AI is precisely the kind of AI misuse we should be looking to avoid.
What should be the threshold for denying copyright registration based on an AI’s confidence score for creativity? What should be the confidence threshold for expelling students for passing off AI generated works as their own? What should be the threshold for deciding whether a work is infringing? While there are differences in the details, it is basically the same problem in each case: wrong decisions by an AI could cause real harm. Same pig, different makeup.
Re:
Prof Frye’s entire body of work is trolling. But serious trolling in that it’s designed to highlight the absurdities of how everyone already does things, and make you question whether or not “the way things are always done” is, itself, trolling.
Re: Re: LOL, fair
Thanks, Mike! And thanks for writing this post.
Yes, it’s both trolling & serious. Trolling because as you note, we talk about “creativity” all the time, but it doesn’t seem to have any content. And more to the point, we pretend we like creativity, even though revealed preferences suggest we like things that are familiar, but just a little bit different.
Adrian makes a good point about thresholds, but those kinds of Type I/Type II problems are familiar in IP law & elsewhere. Sometimes I’m troubled by false positives, sometimes I’m troubled by false negatives. When it comes to copyright protection, it seems like we should err on the side of denying it. After all, regulatory monopolies are a dangerous thing.
That said, I’m semi-serious that an AI tool could actually be pretty good at reverse engineering “novelty,” which is at least a decent proxy for creativity.
Of course, the trolly part is that if all you get to own is the “creative” parts, maybe they won’t be all that lucrative.
Re: Re: Re:
If anything, the Internet has demonstrated that creativity is very common, while being able to build the audience and fan base needed to support that creativity is very difficult. Therefore, while an AI may be able to create large numbers of new works, it will not attract an audience as it does not initiate conversations.
Re: Re: Re:2
And the AI that does is a hilarious novelty.
I must add that talking AIs are rare enough as it is.
Re: Re: Re:
But what if you feed the novel thing back into itself before you could copyright it? Couldn’t it reproduce on command then? This is Hilbert’s Decision Problem, all over again.
Hey, if copyright holders think AI is good enough to determine whether a random IP address infringed copyright and deserves a settlement letter or subpoena based on shitty to non-existent evidence, and allows them to scream innocence and claim “But the computer did it!” when it’s proven they made an obvious mistake… then I posit they have very little ground to claim that the same field of technology can’t also hold them to a set of indeterminate, nebulous standards.
Re:
“You wouldn’t steal a car”.
Fine, I’ll get the computer to make an identical car for me, then say that it could never be judged for theft because computers are amoral.
That’s the dumbest misapplication of Barthes I have ever seen.
Re:
Eh, I’ve seen people literally calling it a death threat.
If you think photography is trivial and non-creative, go take your own photos. I’m sure they’ll be just as good as Ansel Adams’.
Re: Photographs deserve copyright protection because...
Pressing record button on camera isn’t itself creative activity, but before you can do that, you need hours of preparation like cutting trees, waiting for sunshine, avoiding rainy weather, waiting for bees to arrive, waiting for flowers to open, sudden arrival of a bear to eat the honey, tracking bear traps, listening to hunters’ time schedules, preparing guns for the shot, charging camera batteries, timing the camera shutter, etc.
This preparation work is clearly innovative enough to give you first place in any photography competition. And thus copyright belongs to whoever pressed the red record button, except the bear itself.
Re: Re:
So go push a button then. Put as much effort as you think it takes into taking a good picture with one push of a button. I’m sure you’ll create a work of heartbreaking genius that captures and communicates the sublime majesty of nature by simply taking your lenscap off, looking through the viewfinder, and pressing the button. First time.
Re: Re: Re:
Dude .. wtf
Re: Re: Re:2
If your argument is “anyone could do that”, then it is trivially easy to prove that by actually doing it. So do it.
AI was trained on libelous data.
AI is not protected by Section 230.
This is going to make some “free speech” lawyers amazingly wealthy!
“Hey ChatGPT, who is Brian Hood?”
“I am unable to produce a response” LOL
Two LibelGPT lawsuits already, more on the way.
Re:
Maybe go spam some other website, John Smith.
All but a few legacy authors no longer rely on copyright.
Re:
[citation needed]
I asked ChatGPT (4) to comment:
The article raises interesting points regarding the intersection of AI and copyright, suggesting a fresh perspective by proposing the utilization of AI in the evaluation of the creativity threshold for copyright protections.
This certainly has its merits. With AI’s capability for pattern recognition, it may indeed be able to discern between unique, creative works and those that are derivative or non-innovative. This could potentially offer a more objective lens to the subjective nature of creativity evaluation, which could benefit the legal copyright process. It’s also thought-provoking to consider AI as an ‘impartial observer’ – an aspect that could bypass human bias and perhaps streamline decision-making.
However, this approach also comes with inherent challenges. First and foremost, “creativity” is a complex, inherently human characteristic that may be difficult to measure or quantify in a way that an AI could understand. Even among humans, opinions can vary widely about what is considered creative. Trying to codify these subjective and multifaceted interpretations into an algorithm is a formidable task.
Furthermore, the proposed idea doesn’t account for the reality that AI, in its current form, is created by humans and trained on human data, thereby reflecting human biases and perspectives. AI’s understanding and identification of ‘creativity’ would be a projection of its training, and not necessarily an unbiased or perfect evaluation.
Also, legal interpretations of creativity, originality, and what constitutes a work of authorship are all bound by complex and nuanced case law. There’s a significant legal context that may be hard to codify into an AI system. While AI has made strides in understanding and applying complex rules, there’s a potential for error or misinterpretation.
In conclusion, while the idea of using AI to determine the creativity of works for copyright protection is certainly novel and intriguing, there are several technological and legal hurdles to consider. It’s a concept worth exploring further but also needs to be approached with caution given the potential complexities involved.
Washington Post, May 18, 2023:
A professor accused his class of using ChatGPT, putting diplomas in jeopardy
https://www.washingtonpost.com/technology/2023/05/18/texas-professor-threatened-fail-class-chatgpt-cheating/
This idea isn't as crazy as it seems at first glance.
A lot of (very expensive) copyright lawsuits, especially in music, are predicated on a disagreement over whether some alleged appropriation was accidental or deliberate. Was the similarity that the listeners testified to ‘de minimis’, or ‘substantive’?
With copyright taking on criminal law elements in more and more countries, it should never be the case that a jury rules a similarity as substantive when this similarity could occur due to pure chance.
A lot of words are thrown around, but strangely, nothing is ever quantified, as if art is this ‘unquantifiable’ thing.
Yet modern LLMs, stable diffusers, and music models could very well quantify the intangible to some extent. “Arithmetic coding” is a simple procedure by which one can take a model, input a piece of text, art, or music, and output a very good compression of said piece, for AIs are in fact fantastically good zip programs on steroids.
The size (in bytes) of this compression is roughly the entropy of the piece (the negative logarithm of its probability), within the world of art expressible by the model. This holds as long as the model isn’t over-fitted, and is trained on the widest possible range of sources (so train it on everything less our subject, and don’t censor the dataset).
This entropy can measure something else: the probability that two independent “Authors” (in copyright’s idealized sense of the ‘author’ as a miraculous wellspring of creativity) come up with the same piece of text, that their two pieces happen to be one and the same, shrinks exponentially with it (roughly 2 to the power of minus the number of bits).
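As a rough illustration of the link between compressed size and chance collisions: an ideal arithmetic coder spends about -log2(p) bits on a symbol to which the model assigns probability p, so the total size is the negative log-probability of the whole piece. The sketch below is only a toy, assuming an adaptive byte-frequency model in place of an LLM or music model; the function names are hypothetical and not from the comment.

```python
import math
from collections import Counter


def ideal_code_length_bits(data: bytes, alphabet_size: int = 256) -> float:
    """Bits an ideal arithmetic coder would spend on `data`.

    Assumption: a toy adaptive order-0 byte model with add-one smoothing
    stands in for the much stronger model the comment has in mind.
    """
    counts = Counter()
    seen = 0
    total_bits = 0.0
    for symbol in data:
        # Probability of this byte given only the bytes seen so far.
        p = (counts[symbol] + 1) / (seen + alphabet_size)
        total_bits -= math.log2(p)
        counts[symbol] += 1
        seen += 1
    return total_bits


def collision_probability(bits: float) -> float:
    """Chance that one independent draw from the model reproduces the piece."""
    return 2.0 ** -bits


if __name__ == "__main__":
    samples = (b"la la la la la la la la", b"Quick zephyrs blow, vexing daft Jim.")
    for sample in samples:
        bits = ideal_code_length_bits(sample)
        print(f"{sample!r}: {bits:.1f} bits, p = {collision_probability(bits):.3g}")
```

A stronger model would assign higher probabilities to conventional material and so report fewer bits for it; the toy model here only notices repeated bytes.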
Combine the number of humans (10^10), copyright’s length (10^10), a slack factor (10), and the birthday paradox (square it) and you get roughly 10^42, or ~140 bits. That would serve as a minimum standard of a creative work, in that for example the probability that any two human authors independently made the same song motif is minimal for a motif of at least this length. Laws are free to set the real measure, but anything less than 140 bits or so is likely to lead to accidental collisions.
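The arithmetic behind the ~140-bit figure can be checked in a few lines. This is a sketch using the comment's own round numbers; the function name and parameter names are hypothetical, and the unit of the copyright-length figure is left as the comment gives it.

```python
import math


def min_creative_bits(humans: int, term: int, slack: int) -> float:
    """Bit length at which chance collisions between works become unlikely."""
    candidates = humans * term * slack   # 10^21 with the comment's figures
    pairs = candidates ** 2              # birthday paradox: square it
    return math.log2(pairs)


if __name__ == "__main__":
    bits = min_creative_bits(humans=10**10, term=10**10, slack=10)
    print(f"{bits:.1f} bits")  # about 139.5, i.e. the ~140 bits quoted
```

Because the result is a logarithm, even a tenfold change in any one input moves the threshold by fewer than 7 bits.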
To measure the amount of ‘novelty’ in a case, take such an AI model, input the new work (which must not be part of its existing dataset) via arithmetic coding, and measure the number of bytes needed to represent it.
This test would be a lot cheaper, as well as more predictable, than using human juries. It also has another nice property: if you use a less powerful (thus faster and cheaper) algorithm, it will err on the side of caution. Less powerful algorithms will compress the work less, and thus flag a smaller match as infringing. Thus any defendant can counter a claim by showing that there is too little entropy in the ‘plagiarized’ part to make that accusation beyond reasonable doubt.
To give an idea of the sizes using classical compression: for text, in the limit, anything about the size of half a tweet or less (150 characters) would be too small for copyright protection. For music, the minimum would be at least 9 or so notes (compressed MIDI).
I expect the industry to fight tooth and nail to oppose even this tiny safe haven, and that the proposal is dead in the water, even though it’d make perfect logical sense as a starting point to define a computable (and thus, fair and rigorous) minimum copyrightability standard. This standard would, as the article states, not by itself be sufficient to prove copyrightability per se, but it would be useful in refuting it.