Sharing Material Used To Be The Norm For Newspapers, And Should Be For LLMs
from the the-exchange-of-information-is-good dept
Even though parents insist that it is good and right to share things, the copyright world has succeeded in establishing the contrary as the norm. Now, sharing is deemed a bad, possibly illegal thing. But it was not always thus, as a fascinating speech by Ryan Cordell, Associate Professor in the School of Information Sciences and Department of English at the University of Illinois Urbana-Champaign, underlines. In the US in the nineteenth century, newspaper material was explicitly not protected by copyright, and was routinely exchanged between titles:
Nineteenth-century editors’ attitude toward text reuse is exemplified in a selection that circulated in the last decade of the century, though often abbreviated from the version I cite here, which insists that “an editor’s selections from his contemporaries” are “quite often the best test of his editorial ability, and that the function of his scissors are not merely to fill up vacant spaces, but to reproduce the brightest and best thoughts…from all sources at the editor’s command.” While noting that sloppy or lazy selection will produce “a stupid issue,” this piece claims that just as often “the editor opens his exchanges, and finds a feast for eyes, heart and soul…that his space is inadequate to contain.” This piece ends by insisting “a newspaper’s real value is not the amount of original matter it contains, but the average quality of all the matter appearing in its columns whether original or selected.”
Material was not only copied verbatim, but modified and built upon in the process. As a result of this constant exchange, alteration and enhancement, newspaper readers in the US enjoyed a rich ecosystem of information, and a large number of titles flourished, since the cost of producing suitable material for each of them was shared and thus reduced.
That historical fact in itself is interesting. It’s also important at a time when newspaper publishers are some of the most aggressive in demanding ever stronger – and ever more disproportionate – copyright protection for their products, for example through “link taxes”. But Cordell’s speech is not simply backward-looking. It goes on to make another fascinating observation, this time about large language models (LLMs):
We can see in the nineteenth-century newspaper exchanges a massive system for recycling and remediating culture. I do not wish to slip into hyperbole or anachronism, and will not claim historical newspapers as a precise analogue for twenty-first century AI or large language models. But it is striking how often metaphors drawn from earlier media appear in our attempts to understand and explain these new technologies.
The whole speech is well worth reading as a useful reminder that the current copyright panic over LLMs is in part because we have forgotten that sharing material and helping others to build on it was once the norm. And despite blinkered and selfish views to the contrary, it is still the right thing to do, just as parents continue to tell their children.
Follow me @glynmoody on Mastodon and on Bluesky. Originally posted to Walled Culture.
Filed Under: ai, generative ai, information exchange, journalism, llms, ryan cordell, sharing


Comments on “Sharing Material Used To Be The Norm For Newspapers, And Should Be For LLMs”
The Past is Indeed a Different Country
At the same time the British government was scared of revolution, so they introduced heavy taxes on newspapers so only the well off could afford to read them.
Nobody’s looking to be actively sharing with generative models though. The companies just scrape up everything they can in a greedy mad dash that’s costing us massive amounts of energy and water.
I like Steph Sterling’s take on the subject. It’s 21 minutes but well worth the watch.
Re:
I’m blocking LLM spiders and am coordinating with other people for possible legal action against them, as in a class-action lawsuit.
These thugs are not only ignoring all of our terms-of-service, they’re not only ignoring our licenses (e.g. Creative Commons), they’re absolutely hammering our web sites to the point where their actions constitute denial-of-service attacks. And they’re giving us nothing back, not even credit for our “contributions”.
Note: before someone suggests various tactics and strategies for dealing with web server load: I KNOW. I’ve engineered all kinds of Internet operations for decades, and worked on the most-heavily-used site on the Internet for a while. So yes, I’m well aware of mitigation strategies for attacks like this…but that is not the point: the point is that I shouldn’t have to, and I wouldn’t, if these assholes would behave like good network neighbors instead of insatiably greedy pigs.
Re: Re:
That’s not going to be possible because you can only sue a person or organization. Have you tried finding out who’s running the LLMs and suing them instead?
Re: Re:
I’d like the theory on how they’re ignoring licenses. There’s a CC “Do Not Read or You Owe Me Something” license?
Not bothering at all
If they decide to, they’ll do it anyway.
LLMs are not the savior in this equation, but had copyright advocates and activists exercised a little more restraint in who they went after, copyright fans might not have been met with such apathy in the face of artificial intelligence. At least the job-destroying tech isn’t going to sue the fuck out of my family because someone thinks I might have whistled a song.
Count Grey approves of this.
Should LLMs be allowed to “share and build on” material in the same way the newspapers used to? Maybe. But I fear we’re headed towards a scenario where LLMs are allowed to do that and no one else is.
Brandeis wrote a worthwhile dissent about this in International News Service v. Associated Press, 248 U.S. 215 (1918)
SCOTUS got it wrong–I’d be curious to know the impact of this ruling upon the news industry at the time. Here’s how Brandeis describes the practice that SCOTUS ruled against:
Ever since its organization in 1909, it has included among the sources from which it gathers news, copies (purchased in the open market) of early editions of some papers published by members of the Associated Press and the bulletins publicly posted by them. These items, which constitute but a small part of the news transmitted to its subscribers, are generally verified by the International News Service before transmission, but frequently items are transmitted without verification, and occasionally even without being rewritten. In no case is the fact disclosed that such item was suggested by or taken from a paper or bulletin published by an Associated Press member.
https://supreme.justia.com/cases/federal/us/248/215/
https://en.wikipedia.org/wiki/International_News_Service_v._Associated_Press
Newspapers misquote, fabricate, and will even publish detailed reports/reviews on books, plays, academic papers, giving away the gist and often creating a more digestible derivative work that appeals to a larger audience.
They’ve been doing so for decade after decade.
How can that precedent be discarded and suddenly be unacceptable simply because “automated”?
I fear I really don’t understand the issue at all. What did I miss?
Re:
Because the people that newspapers agree with aren’t getting paid.
True but misleading
Regional newspapers sold locally, so to get New York Times news they copied it, and the NYT could do the same for news from other regions. But everyone bought locally, so all boats rose and the product got better. The internet meant the big papers, like the NYT, got the sales while regional reporters were still doing that work, and that broke the system down. The “Search Engine” podcast had Ezra Klein on and he summed it up pretty well. It’s a two-parter: Part 1 https://www.searchengine.show/listen/search-engine-1/how-do-we-survive-the-media-apocalypse and Part 2 https://www.searchengine.show/listen/search-engine-1/how-do-we-survive-the-media-apocalypse-part-2. I honestly don’t remember which one it was.