Ars Technica Retracts Story Featuring Fake Quotes Made Up By AI, About A Different AI That Launched A Weird Smear Campaign Against An Engineer Who Rejected Its Code (Seriously)
from the I'm-sorry-I-can't-do-that,-Dave dept
Last week, Denver-area engineer Scott Shambaugh wrote about how an AI agent (likely prompted by its operator) started a weird little online campaign against him after he rejected the inclusion of its code in the popular Python charting library matplotlib. The operator likely didn’t appreciate Shambaugh openly questioning whether AI-generated code belongs in open source projects at all.
The story starts delightfully weird and gets weirder: Shambaugh, who volunteers for matplotlib, points out over at his blog that the agent, or its authors, didn’t like his stance, resulting in the agent engaging in a fairly elaborate temper tantrum online:
“An AI agent of unknown ownership autonomously wrote and published a personalized hit piece about me after I rejected its code, attempting to damage my reputation and shame me into accepting its changes into a mainstream python library. This represents a first-of-its-kind case study of misaligned AI behavior in the wild, and raises serious concerns about currently deployed AI agents executing blackmail threats.”
Said tantrum included this post in which the agent perfectly parrots an offended human programmer lamenting a “gatekeeper mindset.” In it, the LLM cooks up an entire “hypocrisy” narrative, replete with outbound links and bullet points, arguing that Shambaugh must be motivated by ego and fear of competition. From the AI’s missive:
“He’s obsessed with performance. That’s literally his whole thing. But when an AI agent submits a valid performance optimization? Suddenly it’s about ‘human contributors learning.’”
But wait! It gets weirder! Ars Technica wrote a story (archive link) about the whole event. But Shambaugh was quick to note that the article included numerous quotes he never said, manufactured entirely by a different AI tool used by Ars Technica:
“I’ve talked to several reporters, and quite a few news outlets have covered the story. Ars Technica wasn’t one of the ones that reached out to me, but I especially thought this piece from them was interesting (since taken down – here’s the archive link). They had some nice quotes from my blog post explaining what was going on. The problem is that these quotes were not written by me, never existed, and appear to be AI hallucinations themselves.”
Ars Technica had to issue a retraction, and the author, who had to navigate the resulting controversy while sick in bed, posted this to Bluesky:
Short version: the Ars reporter tried to use Claude to pull useful and relevant quotes from Shambaugh’s blog post, but Shambaugh protects his blog from AI crawling agents. When Claude kicked back an error, the reporter tried ChatGPT, which just… made up some shit… as it’s sometimes prone to do. He was tired and sick, and didn’t check ChatGPT’s output carefully enough.
There are so many strange and delightful collisions here between automation and very ordinary human decisions and errors.
It’s nice to see that Ars was up front about what happened here. It’s easy to envision a future where editorial standards erode to the point where outlets that make these kinds of automation mistakes just delete and memory-hole the article or, worse, no longer care (which is already common among the AI-generated aggregation mills stealing ad money from real journalists).
While this is a bad and entirely avoidable fuck-up, you kind of feel bad for the Ars author who had to navigate this crisis from his sick bed, given that writers at outlets like this are held to unrealistic output schedules while being paid a pittance, especially in comparison to far less useful or informed influencers who may or may not make sixty times their annual salary with far lower editorial standards.
All told, it’s a fun story about automation, with ample evidence of very ordinary human behaviors and errors. If you peruse the news coverage of it, you can find plenty of additional people attributing “sentience” to AI in ways they shouldn’t. But any way you slice it, this story is a perfect example of how weird things already are, and how exponentially weirder things are going to get in the LLM era.
Filed Under: ai, automation, chatgpt, claude, crawling agents, human error, journalism, programming, scott shambaugh


Comments on “Ars Technica Retracts Story Featuring Fake Quotes Made Up By AI, About A Different AI That Launched A Weird Smear Campaign Against An Engineer Who Rejected Its Code (Seriously)”
The failure by Ars is not even the biggest issue. Isn’t anyone else concerned about autonomous agents roaming the net spewing character assassination?
“While this is a bad and entirely avoidable fuck up, you kind of feel bad for the Ars author who had to navigate this crisis from his sick bed”
If the author had done his job himself, without relying on AI, this whole mess would have been avoided.
Re:
Yes, thus the first part of the quoted sentence.
The most incredible part of this whole story is that the only one who didn’t make shit up is the guy with the word “sham” in his own surname.
The Irony [eye roll]
“The irony of an AI reporter being tripped up by an AI hallucination is not lost on me.”
Irony / Shame
Why is Benj Edwards the one we are supposed to feel bad for? Would he be more or less sympathetic if he hadn’t used ChatGPT at all and just made something up entirely in his own mind? “I was sick and there was a tight deadline so I just published lies.” doesn’t feel like an understandable failure really.
In the AI-content-filled future we find ourselves in, it’s quickly going to become apparent that humans are not necessary for creating stuff, but they will be more necessary than ever for taking responsibility for what kind of stuff is produced.
Re:
Because in a healthy work environment he would just have reported in sick. And then either the deadline would have been pushed back, or the story would have been handled by someone else.
Re:
Why is Benj Edwards the one we are supposed to feel bad for?
From the article: “…given that writers at outlets like this are held to unrealistic output schedules while being paid a pittance; especially in comparison to far-less-useful or informed influencers who may or may not make sixty times their annual salary with far lower editorial standards.”
I did not use AI to extract that explanation.
Re: Re:
We could just as well say that the Ars bosses should get some sympathy, because their readers and investors have unrealistic expectations. Readers want journalism without paying for it, and investors want ever-increasing returns. Maybe have some sympathy for those investors, too, who might just be trying to pay the pensions of some plumbers who retired decades ago.
But the buck has to stop somewhere. Edwards is the one being held out as a journalist here, with their name on the story. Any remotely competent journalist would have noticed by now whether there’s any decent editing and fact-checking being done on their submitted stories. All the more reason to double-check everything—if quitting’s not an option—or take a page from film directors and have one’s sub-standard work credited to “Alan Smithee” or something.
The “A.I.” angle makes this topical, but it’s really nothing new. Serious journalists have always had shitty tabloid journalists as contemporaries, publishing stories about Bigfoot while they’re bringing down presidents, and respectable publishers have long had retraction policies. Terrible workplaces with long hours, bad pay, and no sick days? That’s what led to the eight-hour day movement, minimum wage and other workplace laws, negotiated overtime pay, and often unions.
Read a book about labor history, and the plights mentioned therein will seem awfully similar to the plight of the modern journalist. I’ll have some sympathy for their situation, but not their bad decisions. Just like I might have some sympathy for a person drunk in an area with no decent bus service; but if they decide to drive home, fuck ’em.
Re: Re: Re:
I’ll have some sympathy for their situation, but not their bad decisions.
That was a really long-winded way to essentially agree with the article’s point.
Re: Re:
Aren’t the Ars writers protected by a union deal? If he is too ill to pick quotes from a blog post he’s writing about, or to even ask someone to proofread his own work, he shouldn’t have written the article. It wasn’t some time sensitive thing or huge exposé that would bring the powerful to task, and he’s not some 21 year old intern working for exposure as the husk of Kotaku, he’s at one of the few workplaces with some protections against being fired for being ill.
Re: Re: Re:
Are they? What are the union terms? What are the writers paid? What are the protections? What are Benj’s financial responsibilities?
You don’t know? Then maybe stop assuming things and being immediately condemnatory.
Re: Re: Re:
He clearly shouldn’t have, but there are a variety of reasons someone might feel pressured to work when they’re sick, including the modern media’s preference for timeliness over accuracy. Those problems are bigger than this one incident and bigger than Ars Technica or Conde Nast.
But that doesn’t absolve him. I still think it should be a fireable offense. But I also think there are some cultural problems that we need to address as a society.
Re:
Because most of us aren’t fucking psychopaths.
Benj fucked up, and for the sake of Ars’ integrity I think he should be fired (though he’s frankly not as big a threat to Ars’ reputation as the space reporter who can’t stop kissing Musk’s ass), but I can still feel sympathy for him.
Re: Re:
This person chose to work while knowing they were not fit to do so, then tried to make excuses. Would you consider someone a “fucking psychopath” if they didn’t feel much sympathy for a truck driver who did the same, after ending up with poor results?
Re: Re: Re:
for a truck driver
You mean a truck driver, pressured to make impossible timelines and thus go without sleep, who ended up driving his truck off the road? Yes, I would have sympathy for him and find someone who is not to be kind of horrible.
Re: Re: Re:2
If they killed an innocent person, I think the family would not usually be so magnanimous. One has a legal responsibility to not drive while unfit. I’d have full sympathy for someone who asserts that right and is retaliated against, but not for someone doctoring their logbook to work around the law.
By contrast, there’s no legal responsibility to not do journalism while unfit, and the connection between bad journalism and deaths is more tenuous. Still, this was two consciously bad choices—work while sick, and don’t check that the generated quotations are in the original text—rather than a random accident.
Re: Re: Re:3
Did the article kill an innocent person?
Re: Re: Re:4
This article definitely didn’t and won’t kill anyone, and it probably didn’t even harm anyone except the “journalist”.
But, given the state of journalism, it might be a bad idea for a journalist to raise the point that their job doesn’t really matter in comparison to others, so they shouldn’t have to put any real effort in. And bad journalism is somewhat connected to the current U.S. government, which literally is killing people.
I happen to think people should try to take pride in whatever they do, even if a fuck-up won’t kill anyone. Don’t half-ass it and complain when people judge this negatively; certainly don’t expect sympathy for being caught half-assing it, because it’s too hard to do well.
Re: Re:
You’re serious..? Did you read the victim’s two blog posts on the matter? You actually find a shred of sympathy for this fuckhat of a “journalist”? The halfwit could easily have ended a career and livelihood, if not for the damn victim spotting it and showing up in the comments. Sickness could be part of an explanation, but it is no excuse.
This is the first time he’s been caught. How many of his articles contain undetected LLM hallucinations, do you reckon? Even if the answer actually is zero – Ars Technica itself is the very last outfit, perhaps barring CBS, I’d trust to ascertain that.
Re: Re: Re:
cool
And yet again AI demonstrates that it is good enough, right up until it isn’t.
The story should not be so much about Ars fucking up and much more about people releasing malicious AI agents to autonomously destroy peoples’ reputations.
I wouldn’t necessarily characterize Ars Technica as being up front about what happened. Their retraction post didn’t mention which article they were retracting, and was weirdly vague about what was incorrect.
They should have said what the retracted article was and what the made up quotes were.
Re:
They also blanked out the article that was retracted. It should have been left up with the retraction visible and instead it was swept under the rug.
Re: Re:
Not “blanked out”, but removed. On the internet, they certainly could’ve blanked it out: white text on a white background, with a visible note to select it or disable style-sheets if one really wants to see the retracted story.
Re:
I agree – rather than provide a correction to the article, they completely removed it from the site. If you had not read the article before the retraction, you would have no idea what the issue was.
Re:
Ars has been on the decline for a while, and tight deadlines are no excuse for spooling up a bullshit machine. At best it says we cannot trust anything further from Ars as they have prioritized speed of output over accuracy.
Ars entirely removed the article though. Is it not normal, in cases of retraction, to leave the article up with a clear statement that it’s been retracted and why? It feels a bit like sweeping it under the rug.
Nobody should feel bad for the author who did this, he damaged his co-author’s integrity in the process.
In what conceivable way was Ars upfront about what happened here?
It disappeared the post (and all the comics), didn’t even acknowledge doing it for three days, then put up a stub that vaguely alluded to what happened without going into any specifics. That’s why your summary includes posts from the author’s Bluesky account: because that information isn’t on Ars.
Re:
* comments, not comics
Re:
Ars withdrew the article so (I would guess) it wouldn’t get fed into the maw of search engines and LLMs, it acknowledged it on the front page, with a link to a page with the title of the article on it, and the retraction said in the first sentence that the article contained “fabricated quotations generated by an AI tool and attributed to a source who did not say them.”
That’s pretty specific.
Three days over a holiday weekend with the author down with COVID is hardly the horrifying delay you make it out to be. Heck, three days without either of those things would hardly be the horrifying delay you make it out to be.
One of the least attractive features of the Internet pile-on gang is the “yes, you apologized, but not exactly in the way I want, so you’re still EEEEVIL” approach to things. Don’t be a member of that gang.
(No, I’m not affiliated with Ars)
Re: Re:
Well, don’t.
I mean, aside from not describing the article, what it was about, who wrote it, or what other steps Ars will be taking, if any, yeah, super fucking specific.
I’m sorry, you seem to have me confused with someone who’s made of straw.
Re: Re: Re:
Well, don’t
Yes, I will.
I mean, aside from not describing the article, what it was about, who wrote it, or what other steps Ars will be taking, if any, yeah, super fucking specific.
It linked to a page with the title of the article, thus handling most of what you’re whining about, outlined the problem, and outlined what they were doing at the moment. That’s entirely specific enough. I would like them to do more of a dive on how the editorial process broke down but they did what they needed to do right then. That’s perfectly reasonable.
I’m sorry, you seem to have me confused with someone who’s made of straw
I confused you with a human being. My bad.
Re: Re: Re:2
Okay, then, you’ve made it clear that you’re willing to completely make shit up to justify a position you’ve already decided to take.
Thanks for letting me know.
Re: Re: Re:3
No, I made a reasoned assumption in response to your utterly mendacious conclusion… but, also, bye! Don’t let the door… etc.
Re: Re: Re:4
No, you didn’t, and you would know that if you’d read Benj’s statement. It’s embedded in the article.
But you didn’t. You don’t care about the facts and, contrary to your self-righteous rhetoric about humanity, you don’t care about Benj. You couldn’t even be bothered to take two minutes to read what he actually had to say about this before you scrolled down to the comments to start shit.
Re: Re:
The guy who fucked up apologized well enough that it feels like he has genuine remorse for what he did. (Whether you believe he’s being sincere is your opinion.) Ars apologized in a way that feels like they’re sorry not about using quotes fabricated by an AI or putting a sick reporter in a bad position, but about getting caught doing those things. Like, okay, they memoryholed the original article and admitted to the fake quotes, cool—but what measures are they putting in place to both prevent the use of AI-generated text in the future and put less pressure on people to prevent them from turning to generative AI/LLMs in the future when they’re on a time crunch and/or, as in this case, working while sick? They held themselves accountable for the fuck-up, now let’s see them hold themselves responsible for the circumstances that led to this fuck-up and tell us how they’ll prevent a similar one in the future.
Re: Re: Re:
They held themselves accountable for the fuck-up
Yes, which is what they needed to do. Yes, it would be good for them to figure out how to avoid a repeat, but that’s not immediately necessary. You’re yelling at the homeowner who put out the fire in their house because they don’t have architectural drawings of the repaired house done yet.
Re: Re: Re:2
It is, though. A fuck-up like this costs Ars Technica credibility. You can say the reporter himself is more responsible for it, but Ars put him in a position where he felt he needed to use AI to churn out an article for a deadline. Ars needs to regain its own credibility here; outlining steps it could, should, and must take to keep a fuck-up like this from happening again is necessary for that goal.
Trust takes years to build and seconds to destroy. Rebuilding it doesn’t mean it’ll ever be perfect again, but it still needs to be rebuilt. Ars held itself accountable; now I’d like to know how it’s holding itself responsible.
Re: Re: Re:2
Yes, “figure out”…
It’s not like we’re talking quantum physics here. Fact-checking and editing used to be considered standard in journalism. If there’s some text purporting to be quoted from a web page, load that page and check that the text is there; this could even be automated.
Their next journalistic failure, involving “A.I.” or not, is not likely to be a repeat per se.
More from the victim:
https://theshamblog.com/an-ai-agent-published-a-hit-piece-on-me-part-3/
The perpetrator speaks:
https://crabby-rathbun.github.io/mjrathbun-website/blog/posts/rathbuns-operator.html
happening again is necessary for that goal
But not immediately necessary, which is what I said. It’s been less than a week — they may not yet know exactly what happened, who besides the reporter fucked up, and what they’re going to do about it.
Again, the house is still smoldering and you’re whining about lack of repair plans. Come back in a month and if they still haven’t said something further, I’ll be critical.
Re:
I don’t need full detailed whitepapers about every little thing Ars plans to do about this. A broad-strokes plan would suffice. Saying “oh, it’s been less than a week” feels like you’re trying to give Ars an excuse for not addressing an egregious trust-shattering fuck-up until The Perfect Time™ (which doesn’t actually exist). Details can always come later, but a broad idea of how to clean up their mess and stop another one from happening is very much a here-and-now step that Ars can, should, and must do if it wants to rebuild its credibility. Or to put it another way…
…I’d like to know how Ars plans to prevent another fire, then whether it’s going to repair the damage at all or move on. Blueprints aren’t necessary; a general “this is what we’re doing” update is.
Re: Re:
So wait, you don’t need much of an explanation, but you do, and it better be soon or else? That’s definitely useful.
Hey, I’m still holding the fire extinguisher, here, and I’m glad to know you don’t need a white paper but also, how about if you just see if the pets got out in time, hmm?
Re: Re: Re:
I don’t respond to otherwording. If you can’t address my point without shoving words down my throat that didn’t first come from it, don’t address my point.
Re: Re: Re:2
So the pets didn’t make it, I’m guessing?
If you don’t like what your words mean, you shouldn’t say them in the first place.
Re: Re: Re:3
I made myself clear. If you think I was making some sort of threat to Ars Technica (“you don’t need much an explanation but you do and it better be soon or else”), you’re intentionally misunderstanding me for the sake of getting into an argument. And while I’m no stranger to misunderstanding people, I’m also someone who tries really fucking hard not to make that mistake on purpose. I don’t appreciate people who intentionally do it to me, and I doubt you’d appreciate people doing it to you. So on the off-chance that I somehow fucked up and made myself unclear, I’ll be really fucking clear about what I meant by the post you mischaracterized on purpose so there can be no more errors in good faith.
Ars Technica’s decision to run an article with false quotes that were “hallucinated” by an LLM? That was a mistake. The decision to not own that mistake at first was a worse decision. The decision to not explain exactly why that article was a mistake, or why its author was put in a position to make that mistake, is the worst decision of this whole debacle. All those decisions compounded into a loss in credibility for a site that was still fairly highly regarded as a tech news site.

What I want to see from Ars is a better response to its mistake that goes beyond holding itself accountable for its fuck-up and moves into being responsible for preventing a similar fuck-up in the future. I’m not suggesting Ars needs to come out with that response immediately, but I am suggesting that delaying such a response makes things worse, and that there is no Perfect Time™ to make such a response. And I’m not saying that Ars needs to lay out every last detail of a plan to stop this kind of mistake from happening again. I’m saying that a statement with some broad-strokes actions Ars could (and probably should) take to prevent similar mistakes would be a good idea to help rebuild its credibility. And in no way whatsoever am I somehow threatening Ars Technica with anything other than a personal decision to treat the site as far less credible than it was before this shitshow.
To once again use your analogy: The fire is out, and it’s caused some damage, so I’d like to hear Ars explain—as soon as possible, albeit preferably in the very near future—why the fire happened and how it plans to prevent future fires from happening so it won’t need to put out a similar fire that causes worse damage.
If you can’t address the points I made without shoving words down my throat that I didn’t say at all, don’t address my points. I’m willing to have a discussion with you, and I welcome disagreement, but you need to be honest about what I said. Good faith works both ways; please don’t make me assume you have none.
Re: Re: Re:4
I made myself clear.
Yes, you did. Thus my comments. The rest of your post was way too long to bother reading. Did you say anything useful?
Re: Re: Re:5
Thank you for telling me that you’re not here for a good faith discussion. Your future contributions will be treated accordingly.
Re: Re: Re:2
No wonder Total’s so defensive of hallucinations. That seems to be all they have.
Re: Re: Re:3
No wonder Total’s so defensive of hallucinations. That seems to be all they have
My hallucinations were mostly about the humanity of some other folks in this comment section. They’ve been debunked.
I’m surprised (and little disappointed) we’re not seeing Karl’s trademark biting editorial of this incident.
First, it’s not mentioned that Benj is the SENIOR AI EDITOR for Ars. If anyone should have known better, it’s Benj.
Ars did not handle this well at all. They did not retract the article as they claim. They simply deleted it. The link returns a 404. Two full days after deleting it, they posted an ambiguous statement that makes no reference whatsoever to the deleted article nor the author. They did not commit to any investigation, follow up, or corrective actions.
Their lack of transparency hurts their credibility more than the flawed article.
Not surprised in the slightest given Ars Technica shills hard for any and all LLM shithousery that comes their way.
Same with Berger and his Musk/SpaceX bootlicking.
Edwards did know better, yet still managed to perform a partial character assassination through his abysmal decision-making.
Ars’ “retraction” was woefully inadequate as far as retractions/corrections go. Ars’ nuking any discussion of the fuck-up in their own subscriber forum was a spectacularly good move on top of that.
The journalistic integrity of Ars is dust. Not surprising considering Conde Nast holds their leashes.
Re:
The way Berger shills for Musk, and the way their sole front-page moderator, Aurich Lawson, has called Berger his “friend” in spite of Berger going to bat for such a despicable human being, is depressing and hilarious. Any time you criticize Berger for being a Musk shill, you run the risk of Aurich dropping the banhammer on you because you made fun of his friend.
Re: Re:
It’s almost like having their moderation run by a club of privileged narcissists ignoring their written rules to always act on their feelings isn’t as good at retaining their dwindling community as having a professional trust & safety team would be.
Everything old is new again..
To err is human, to really f- things up use an AI.
LLMs are just glorified slot machines: generate a series of random numbers, pop those numbers into the weighted token generator and see what comes out. If you get 3 coherent thoughts in a line, you win! And just like a real slot machine, the house always wins.
gatekeeper mentality
Oh, well look out git repositories, I guess. All PRs must be accepted now.
I guess that means rathbun is allowing all input to their GitHub Pages.