Can Agentic AI Coding Tools Finally End Copyright For Software While Re-Inventing Open Source?
from the reinventing-software dept
Most of the discussions about the impact of the latest generative AI systems on copyright have centered on text, images and video. That’s no surprise, since writers, artists and film-makers feel very strongly about their creations, and members of the public can relate easily to the issues that AI raises for this kind of creativity. But there’s another creative domain that has been massively affected by genAI: software engineering. More and more professional coders are using generative AI to write major elements of their projects for them. Some top engineers even claim that they have stopped coding completely, and now act more as managers directing the AI’s generation of code, because the available tools have become so powerful. This applies in the world of open source software too. But a recent incident shows that it raises some interesting copyright issues there that are likely to affect the entire software world.
It concerns a project called chardet, “a universal character encoding detector for Python. It analyzes byte strings and returns the detected encoding, confidence score, and language.” A long and detailed post on Ars Technica explains what has happened recently:
The [chardet] repository was originally written by coder Mark Pilgrim in 2006 and released under an LGPL license that placed strict limits on how it could be reused and redistributed.
Dan Blanchard took over maintenance of the repository in 2012 but waded into some controversy with the release of version 7.0 of chardet last week. Blanchard described that overhaul as “a ground-up, MIT-licensed rewrite” of the entire library built with the help of Claude Code to be “much faster and more accurate” than what came before.
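For readers unfamiliar with what a character encoding detector does, here is a minimal, illustrative sketch in Python. To be clear, this is not chardet’s actual algorithm (which relies on statistical models of real-world text); it merely mimics the shape of the result the article describes, with a detected encoding, a confidence score, and a language field:

```python
# Illustrative sketch only -- NOT chardet's real algorithm.
# It guesses an encoding from byte-order marks and trial decoding,
# returning the same {encoding, confidence, language} shape chardet does.
import codecs

def detect_encoding(data: bytes) -> dict:
    """Guess the encoding of a byte string (toy heuristic)."""
    # A byte-order mark at the start is a near-certain signal.
    boms = [
        (codecs.BOM_UTF8, "UTF-8-SIG"),
        (codecs.BOM_UTF16_LE, "UTF-16LE"),
        (codecs.BOM_UTF16_BE, "UTF-16BE"),
    ]
    for bom, name in boms:
        if data.startswith(bom):
            return {"encoding": name, "confidence": 1.0, "language": None}
    # Pure ASCII decodes unambiguously.
    try:
        data.decode("ascii")
        return {"encoding": "ascii", "confidence": 1.0, "language": None}
    except UnicodeDecodeError:
        pass
    # Valid UTF-8 with non-ASCII bytes is a strong (not certain) guess.
    try:
        data.decode("utf-8")
        return {"encoding": "utf-8", "confidence": 0.9, "language": None}
    except UnicodeDecodeError:
        # Latin-1 can decode any byte sequence, so it is a weak fallback.
        return {"encoding": "latin-1", "confidence": 0.3, "language": None}

print(detect_encoding("naïve".encode("utf-8")))
```

A real detector like chardet goes much further, scoring byte-frequency statistics against per-language models, which is why it can also report a probable language.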
Licensing lies at the heart of open source. When Richard Stallman invented the concept of free software, he did so using a new kind of software license, the GPL. This allows anyone to use and modify software released under the GPL, provided that when they distribute modified versions they release their own code under the same license. As the above description makes clear, chardet was originally released under the LGPL – one of the GPL variants – but version 7.0 is licensed under the much more permissive MIT license. According to Ars Technica:
Blanchard says he was able to accomplish this “AI clean room” process by first specifying an architecture in a design document and writing out some requirements to Claude Code. After that, Blanchard “started in an empty repository with no access to the old source tree and explicitly instructed Claude not to base anything on LGPL/GPL-licensed code.”
That is, generative AI would appear to allow open source licenses like the GPL to be circumvented by rewriting the code without copying anything directly from the original. That’s possible because AI is now so good at coding that the results can be better than the original, as Blanchard proved with version 7.0 of chardet. And because it is new code, it can be released under any license. In fact, it is quite possible that code produced by genAI is not covered by copyright at all, for the same reason that artistic output created solely by AI can’t be copyrighted. If the license can be changed or simply cancelled in this way, then there is no way to force people to release their own variants only under the GPL, as Stallman intended. Similarly, the incentive for people to contribute their own improvements to the main version is diminished.
The ramifications extend even further. This kind of “AI clean room” implementation could be used to make new versions of any proprietary software. That’s been possible for decades – Stallman’s 1983 GNU project is itself a clean-room version of Unix – but it generally requires many skilled coders working for long periods. The arrival of highly capable genAI coding tools has brought down the cost by many orders of magnitude, which means it is relatively inexpensive and quick to produce new versions of any software.
In effect, generative AI coding systems make copyright irrelevant for software, both open source and proprietary. That’s because what is important about computer code is not the details of how it is written, but what it does. AI systems can be guided to create drop-in replacements for other software that are functionally identical, but with completely different code underneath.
Companies that license their proprietary software will probably still be able to do so by offering support packages plus the promise that they take legal responsibility for their code in a way that AI-generated alternatives don’t: businesses would pay for a promise of reliability plus the ability to sue someone when things go wrong. But for the open source world these are not relevant. As a result, the latest progress in AI coding seems a serious threat to the underlying development model that has worked well for the last 40 years, and which underpins most software in use today. But a wise post by Salvatore “antirez” Sanfilippo sees opportunities too:
AI can unlock a lot of good things in the field of open source software. Many passionate individuals write open source because they hate their day job, and want to make something they love, or they write open source because they want to be part of something bigger than economic interests. A lot of open source software is either written in the free time, or with severe constraints on the amount of people that are allocated for the project, or – even worse – with limiting conditions imposed by the companies paying for the developments. Now that code is every day less important than ideas, open source can be strongly accelerated by AI. The four hours allocated over the weekend will bring 10x the fruits, in the right hands (AI coding is not for everybody, as good coding and design is not for everybody).
Perhaps a new kind of open source will emerge – Open Source 2.0 – one in which people do not contribute their software patches to a project, as they do today, but instead send their prompts that produce better versions. People might start working directly on the prompts, collaborating on ways to fine tune them. It’s open source hacking but functioning at a level above the code itself.
One possibility is that such an approach could go some way to solving the so-called “Nebraska problem”: the fact that key parts of modern digital infrastructure are underpinned by “a project some random person in Nebraska has been thanklessly maintaining since 2003”. That person may not receive many more thanks than they have in the past, but with AI assistants constantly checking, rewriting and improving the code, at least their selfless dedication to the project becomes a little less onerous, and thus a little less likely to lead to programmer burnout.
Follow me @glynmoody on Mastodon and on Bluesky. Originally published to Walled Culture.
Filed Under: chardet, copyright, licensing, open source, relicensing


Comments on “Can Agentic AI Coding Tools Finally End Copyright For Software While Re-Inventing Open Source?”
pretty sure clean room rewrites require the tools to not have the entire codebase included as a baseline
so how has he certified that claude’s training data didn’t include the open source code prior to the ‘clean room’ process?
bearing in mind that there’s an ongoing lawsuit involving chatbots fundamentally identical to claude regurgitating whole articles when prompted
Re:
Clean room is a sufficient, not necessary, condition for the new work to not be a derivative of an existing work. One can, for example, write a book using another book as a reference. That doesn’t make the new book a derivative of the other book.
Re:
What about computer languages?
Google used the Apache Harmony clean-room implementation of Sun’s (now Oracle’s) Java class libraries to build the Android operating system. Oracle sued Google for using the language’s API declarations (i.e. the API, not the implementation code).
The Supreme Court found that it was fair use, thus overturning the Federal Circuit’s decision. However, fair use presupposes that the material is copyrighted in the first place.
At the moment a software specification (what the software code does) can be covered by copyright. Even worse, Apple v. Samsung found that swiping across a mobile screen to unlock it was patented.
So, writing a software specification for AI to write code for a mobile phone that employs screen gestures could violate both copyrights and patents.
Why do we need to code anything? Can’t the AI just do whatever a program would do?
Re: The chat is an external tool
If you need an offline, secure, free or reliable application you would not use this third-party pay-per-use subject-to-change online-only chatbot directly in your application.
First rule of headlines: If they ask a question, the answer is “No”.
No
I’ve long ago learned that when a technology news headline is of the form “Can X do Y”, the answer is “No.”
https://en.wikipedia.org/wiki/Betteridge%27s_law_of_headlines
Speaking of “Open”-source AI…
https://arstechnica.com/ai/2026/03/entire-claude-code-cli-source-code-leaks-thanks-to-exposed-map-file/
Re:
And they already DMCAed their cult followers
And once again, the question is not asked, ‘Do open source projects actually want forced reinvention by AI?’ The users keep saying no, the devs keep saying no for the most part, but the likes of Anthropic are determined to forcibly ‘contribute’ no matter what and AI boosters in management roles keep on cramming it into anything, spending untold riches gambling the likes of Mozilla’s future on AI features nobody requested.
I don’t believe for a second AI will lead to a future without copyright, that ladder is going to be pulled up sooner rather than later, now they have gotten what they want from creators. I do believe that the people promoting it want a future without consent, where their will and financial clout means they have the final say.
Re:
This is not my experience at all. Especially among devs.
Re: Re:
Have you ever thought to interview developers? Have you thought to interview ones that aren’t just producing the software equivalent of cheap widgets?
Because here is my experience as an actual software engineer doing actual difficult work.
It’s good at doing super simple tasks, or being find and replace v2.
It’s good at writing one off scripts.
It is bad at anything more. Imagine a crew of the cheapest labor you can find and have them build a house. They forget to use nails, or they use 50 when you only need two, they duct tape over their mistakes, they use tools the wrong way, they use the wrong materials in areas.
The code it writes is a hot mess of glued together pieces unless guided in hand like a toddler that will run into the road if you let go.
Github has dropped its reliability down to 90% instead of 99.9%. Microslop just had to pull its latest patch and windows is getting more and more broken.
It’s not an if but a when that more and more hacks and outages are going to happen as a result of AI slop code.
Re: Re:
Insert snarky comment about the state and online behaviour of Bluesky Devs here.
You're missing something fundamental and important
“Software specification” is not a solved problem in computing, despite decades of research. The problem of turning a description of software into software remains very difficult — as we see constantly with any specification of sufficient size and complexity. Now…formal specification methodologies exist, and they can be used to generate provably correct code. But few people have the training to use these, and they have limits, and in one sense, this just shifts the problem.
So even if we suppose that a perfect code generator exists, whether human or AI, we do not know what to say to it to cause the code we want to be produced — except in very limited cases, such as the one that you cite. So yes, we could probably replace /bin/cat with AI-generated code today. But could we create sendmail or postfix? No, probably not, because — as we’ve discovered — there are baseline problems (omissions, conflicts, ambiguities) with the protocol specification.
To put this another way: the code that we could replace this way is not code we need to replace, because it already exists and for the most part, it’s mature and stable. The code that we might want to write doesn’t exist or isn’t mature and stable. And part of the reason that it’s in that state is that we don’t have a truly viable specification for it.
And by the way, one of the dirty little secrets of programming, as a profession, is that almost nobody is any good at writing specifications — and they’re not very interested. Why? Programming is fun, specification is drudgery.
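The point above about underdetermined specifications can be made concrete with a toy example. Both functions below satisfy a plausible written requirement (“return the records sorted by score”), yet they are observably different programs, because the spec says nothing about how ties are ordered. All names here are hypothetical, purely for illustration:

```python
# Two implementations that both satisfy a loose spec ("output is sorted
# by score") yet disagree on tie-breaking -- an ambiguity the spec never
# resolved. Illustrative only; names are made up.

def impl_stable(records):
    # Python's sort is stable: equal scores keep their original order.
    return sorted(records, key=lambda r: r[1])

def impl_reversed_ties(records):
    # Also "sorted by score", but equal scores come out in reverse order.
    return sorted(records[::-1], key=lambda r: r[1])

def meets_spec(inp, out):
    """Everything the written spec demanded: same records, scores
    non-decreasing. It says nothing about tie order."""
    return sorted(out) == sorted(inp) and all(
        a[1] <= b[1] for a, b in zip(out, out[1:]))

data = [("alice", 2), ("bob", 1), ("carol", 2)]
a, b = impl_stable(data), impl_reversed_ties(data)
assert meets_spec(data, a) and meets_spec(data, b)  # both "correct"
assert a != b  # yet the two programs behave differently
```

Any code generator, human or machine, has to resolve such gaps somehow, and nothing in the spec tells you whether it resolved them the way you needed.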
I find this pretty naive. First, proprietary software is still far harder to replicate using LLMs than open source. Even just reading the code, understanding its structure and using that to prompt an LLM is super helpful (and ideas are not copyrightable, so this wouldn’t violate any license).
It’s also naive to believe FOSS authors only work for the love of the craft – I think most of them still hope to gain, whether materially, or in prestige and social capital. But even those that do just do it for the love of the game probably aren’t very happy to have their passion project then appropriated, enclosed and exploited for profit. Like, that I am not trying to make any money off it is probably a good indication that I don’t want anyone else to do so either, right?
And even if we only consider selfish material interests: given that, once you publish your code, you can get outcompeted (both on price and just drowned out in the attention economy) immediately by thousands of copycats, why would any company publish their code? We want commercial interests to publish their code: it makes for better, safer software and protects users’ interests.
In a world of LLMs, all the incentives just align to not publish your code. I wouldn’t proclaim the death of open source, but LLMs can only harm the movement.
The funny thing about “AI clean room” process is that it isn’t. Claude has been trained on code with various licenses and all derivative code that is then produced by Claude likely inherits licenses that might have clauses like GPLv3 §5:
Slapping your own license on code produced by an AI tool looks to me like breaking the original licenses of any code that was used to train the models, even if you explicitly avoid GPL code. And in regards to avoiding GPL code, you can’t be certain that AI tools follow such instructions in their entirety.
Re:
“Clean-room design” was never really required by law. It’s more that the people doing it decided it’d be hard for someone to win a lawsuit against them for copying code, when they can convincingly say they never even saw the code in binary or source form. Nevertheless, learning from existing code, including by directly reverse-engineering and re-implementing it, is considered a legal right in much of the world, as long as it’s not copying.
While a lot of people agree with your view that it “looks like” breaking licenses, it’s far from obvious that the laws will support that. Sure, if it spits out an exact copy of something non-trivial and copyrighted, that’s not gonna go well (and people have seen LLMs do that). Most cases will probably be much more ambiguous. Also remember that there was a significant period in U.S. copyright law when code was not considered copyrightable at all, because it was primarily functional rather than creative. If software companies were to replace people with computers in significant numbers, it’d tend to support that view.
Re: Re:
I would argue that this case is unambiguous for the simple reason that the instructions given to Claude were to specifically avoid GPLv3 source material, which implicitly acknowledges that they knew they would otherwise be breaking software licenses.
Re: Re: Re:
…had they used such material. Are you suggesting that a specific instruction to avoid copying something would be proof of a plan to illegally copy it? That would be bizarre.
Re: Re: Re:2
Only you are talking about copying here. Let me quote the relevant part that I base my argument on:
Ie, Blanchard acknowledges through his instructions to Claude that licenses can carry over.
Re: Re: Re:3
You’re talking about copyright licenses. What else would be relevant?
Re: Re: Re:4
What the license says and how it may carry over depending on what source material you used. Creating derivative works or copying OSS code isn’t a copyright infringement, changing the license while trying to avoid the rules set out in the original license is.
To reiterate the point, telling Claude to avoid using GPLv3 code to create new code is Blanchard explicitly acknowledging that licenses carry over – and then he slaps a MIT-license on the new code while ignoring any other licenses on source code the training material have ingested.
If copyright and licenses weren’t a problem, why tell Claude to avoid GPLv3 sources in the first place?
Re: Re: Re:5
If thing B does not involve copying from thing A, the license of thing A is irrelevant. There’s no “change”; it’s one person slapping a license on the thing they own the copyright to.
Copyright and licensing is a major potential problem. The person is telling Claude to avoid that problem. I don’t see this as much different than FedEx instructing their drivers not to speed (which would certainly not be proof of criminal intent).
Re:
“The funny thing about “AI clean room” process is that it isn’t.”
First: you’re absolutely right. All of these models have been trained on an enormous amount of code: good, bad, old, new, working, broken, current, obsolete, etc.
Second: one of the things that the AI fanboys have not considered — because they believe their own hype, and because they’re greedy clueless ignorant newbies who don’t even begin to understand the practice of programming — is what else is in that corpus of code.
Every intelligence agency, every organized crime operation, every terrorist group, everyone out there knows that any code published on the Internet will be scraped and blindly incorporated into these models without any human review. These companies are too greedy, too cheap, too lazy, too stupid to do that.
Which means that any code with backdoors, deliberate security holes, etc. will be pulled in alongside completely benign code.
I trust everyone can work out the implications of that: they’re fairly obvious.
Re: Re:
“That’s a great point. You’re absolutely right! That code does have a backdoor! Sorry about that. Try this benign version.”
Re: Re:
Your trust is misplaced; not everyone can work that out, or wants to.
Also, relevant to this: Thinking—Fast, Slow, and Artificial: How AI is Reshaping Human Reasoning and the Rise of Cognitive Surrender (The Wharton School Research Paper)
How this works based on training data is a little iffy. Simply telling it not to use it is probably not sufficient.
I’m not sure that’s entirely a good thing for open source. Although if the bar is low enough, maybe it doesn’t matter?
Unfortunately, the current model seems to be sending slop. It killed Curl’s bug bounty program, and it’s clogging a lot of projects’ open pull request systems. :/
There’s a real problem when you have asymmetry, where it takes much more effort for the human on the other side to verify, than to generate.
This seems like a huge leap in logic to me. The difference between programming and coding has previously been pointed out in Techdirt comments, and this article correctly uses the term “coding”.
So, sure, if someone’s already done the hard part of writing a precise specification, maybe a computer can do the comparatively easy part of coding it. Historically, it’s also been popular to farm the coding out to a team of grad students, an outsourcing shop in Eastern Europe, or whatever. People who’ve done this tend to find out that the result is only about as good as the specification, which it turns out is often lacking.
A very good coding team will provide valuable feedback about inconsistencies, ambiguities, and potential improvements for the specs, and it can lead to an excellent result. Less-good teams often manage to provide results that technically meet the requirements, while not really being what the client needed (but neglected to correctly ask for, cf. “The Monkey’s Paw”). Much of it deserves to be called “slop”.
As for copyright, well, the open-source world has been moving toward permissive licenses for a while, and the software world in general has been moving toward open-source. Maybe these systems could accelerate the transition somewhat. But people have been saying stuff like “why can’t these open-source developers just write me a Photoshop?” for 30 years now, and if that’s the best design document they can come up with, they should not expect amazing results.
Those who can write great design documents are basically programmers or “software architects” already; and probably professional ones, because it’s rare that anyone is very good at this right out of school.
Can I rewrite your entire website and rename techspoop?
The answer is NOOOO. Any large enough company just steals people’s work anyway, because copyright only exists if you can sue them and win. What won’t change with AI is large companies suing you into the dirt regardless of the law.
So why is it you and others want to steal the work of the working class so much? Because the writers of open source projects that use these licenses are just that.
If the maintainer then modifies a single line of code, does the AI-generated software suddenly become copyrightable? Or 10 lines, or 1000? At what point does it become copyrightable?
If the answer is “never, unless it’s 100% human-written”, then most closed-source software has now lost the possibility for copyright, and the software companies may be one code-leak away from bankruptcy.
Re:
That line might be, but not the rest. And only if it embodies some human creativity. There’s no point at which the lines not created or touched by humans will become copyrightable (under current law).
Still, it’s ironic that Microsoft is one of the companies eroding copyright, given the Gates “Letter to Hobbyists” accusing them of “stealing”, and Microsoft’s traditional hard-line stance with the Business Software Alliance and all (e.g. sending goons to “audit” licensing).
Re:
There exist legal tests for such scenarios:
You can put a copyright on a work that’s a derivative of an uncopyrightable work, but only for the new contributions.
Re: Re:
The ‘sweat of the brow’, skill and ideas count for nothing in copyright. Directories and menus are not copyright protected. Only the flowery language and the pretty pictures in a cookbook are protected.
The only thing that counts is copying the creative expression. (although, judges seem to ignore the ideas/expression dichotomy when it suits them).
The court would have to decide how much the creative expression in the new work differed from the creative expression in the original. If you think you would have a problem deciding this, judges tend to be ten times more hopeless at software copyright law than you would be.
In Oracle V. Google, Judge William Alsup demonstrated a terrific understanding of both APIs, human-readable coding and its translation into machine code, and the key aspects of copyright law.
Most judges seem to think that software that looks and feels similar to another piece of software is a copyright violation. It’s a random outcome, weighted to the earlier work.
First AI will have to start being competent at code.
https://www.theregister.com/2026/03/17/ai_businesses_faking_it_reckoning_coming_codestrap/
I’m a hardcore free software person but I’d trade in the GPL for the end of software copyright. I’m not convinced though that this will actually happen. AI makes reverse engineering easier but it’s easier still to strip protections from software you have the source code to, so free software will lose copyleft but proprietary crap will stay copyright.
Of course, all the permissive open source stuff will be no more vulnerable to exploitation than it ever was, and it probably constitutes the bulk of “FOSS”. Why people want corporations to exploit them is beyond me, but AI will just accelerate what was already standard there.
AI boosters are so annoying.