Imagine a newspaper publisher announcing it will no longer allow libraries to keep copies of its paper.
That’s effectively what’s begun happening online in the last few months. The Internet Archive—the world’s largest digital library—has preserved newspapers since it went online in the mid-1990s. The Archive’s mission is to preserve the web and make it accessible to the public. To that end, the organization operates the Wayback Machine, which now contains more than one trillion archived web pages and is used daily by journalists, researchers, and courts.
But in recent months The New York Times began blocking the Archive from crawling its website, using technical measures that go beyond the web’s traditional robots.txt rules. That risks cutting off a record that historians and journalists have relied on for decades. Other newspapers, including The Guardian, seem to be following suit.
For nearly three decades, historians, journalists, and the public have relied on the Internet Archive to preserve news sites as they appeared online. Those archived pages are often the only reliable record of how stories were originally published. In many cases, articles get edited, changed, or removed—sometimes openly, sometimes not. The Internet Archive often becomes the only source for seeing those changes. When major publishers block the Archive’s crawlers, that historical record starts to disappear.
The Times says the move is driven by concerns about AI companies scraping news content. Publishers seek control over how their work is used, and several—including the Times—are now suing AI companies over whether training models on copyrighted material violates the law. There’s a strong case that such training is fair use.
Whatever the outcome of those lawsuits, blocking nonprofit archivists is the wrong response. Organizations like the Internet Archive are not building commercial AI systems. They are preserving a record of our history. Turning off that preservation in an effort to control AI access could essentially torch decades of historical documentation over a fight that libraries like the Archive didn’t start, and didn’t ask for.
If publishers shut the Archive out, they aren’t just limiting bots. They’re erasing the historical record.
Archiving and Search Are Legal
Making material searchable is a well-established fair use. Courts have long recognized it’s often impossible to build a searchable index without making copies of the underlying material. That’s why when Google copied entire books in order to make a searchable database, courts rightly recognized it as a clear fair use. The copying served a transformative purpose: enabling discovery, research, and new insights about creative works.
The Internet Archive operates on the same principle. Just as physical libraries preserve newspapers for future readers, the Archive preserves the web’s historical record. Researchers and journalists rely on it every day. According to Archive staff, Wikipedia alone links to more than 2.6 million news articles preserved at the Archive, spanning 249 languages. And that’s only one example. Countless bloggers, researchers, and reporters depend on the Archive as a stable, authoritative record of what was published online.
The same legal principles that protect search engines must also protect archives and libraries. Even if courts place limits on AI training, the law protecting search and web archiving is already well established.
The Internet Archive has preserved the web’s historical record for nearly thirty years. If major publishers begin blocking that mission, future researchers may find that huge portions of that historical record have simply vanished. There are real disputes over AI training that must be resolved in courts. But sacrificing the public record to fight those battles would be a profound, and possibly irreversible, mistake.
This was extremely wild shit to be happening anywhere, much less in the land of the First Amendment. No sooner had Donald Trump decided it was time to rename the Department of Defense to the Department of War than the head of DoD operations decided it would be sorting news agencies by level of subservience.
Pretending this was all about national security, the Defense Department basically kicked everyone out of the Pentagon’s press office and stated that only those that chose to play by the new rules would be allowed back inside.
Booted: NBC News, the New York Times, NPR. Invited back into the fold: OAN, Newsmax, Breitbart. The Pentagon wanted a state-run press, but without having to do all the heavy lifting that comes with instituting a state-run press in the Land of the Free.
Somewhat surprisingly, some of those explicitly invited to partake of the new Defense Department media wing refused to participate. Fox and Newsmax decided to stay out, rather than promise they’d never publish leaked documents. Those choosing to bend the knee were those who never needed this sort of coercion in the first place: One America News (OAN), The Federalist, and far-right weirdos, the Epoch Times. In other words, MAGA heavy-breathers that have never been known for their independence, much less their journalism.
That didn’t stop Hegseth and the department he’s mismanaging from attempting to take a victory lap. And it certainly didn’t stop news agencies like the New York Times from suing over this blatant violation of the First Amendment.
It’s so obvious it only took the NYT four months to secure a win in a federal court (DC) that is positively swamped with litigation generated by Trump’s swamp. (h/t Adam Klasfield)
The decision [PDF] makes it clear in the opening paragraph how this is going to go for the administration and its extremely selective “respect” of enshrined rights and freedoms.
A primary purpose of the First Amendment is to enable the press to publish what it will and the public to read what it chooses, free of any official proscription. Those who drafted the First Amendment believed that the nation’s security requires a free press and an informed people and that such security is endangered by governmental suppression of political speech. That principle has preserved the nation’s security for almost 250 years. It must not be abandoned now.
Amen.
The court notes that in the past, there has been some friction between national security concerns and reporting by journalists. In some cases, the friction has been little more than the government chafing a bit when something has been published that it would rather have kept a secret. In other cases, leaks involving sensitive information have provoked reform efforts on both sides of the equation, seeking to balance these concerns with serving the public interest.
Up until now, any efforts to expel reporters have been limited to backroom bitching. What’s happening now, however, is unprecedented.
Historically, though, even when Department leaders disliked a journalist’s reporting, they did not consider suspending, revoking, or not renewing the journalist’s press credentials in response to that reporting. Julian Barnes, Pete Williams, and Robert Burns—reporters who have spent decades covering the Pentagon—as well as former Pentagon officials, are not aware of the Department ever suspending, revoking, or not renewing a journalist’s credentials due to concern over the safety or security of Department personnel or property or based on the content of their reporting.
This may be new, but the court isn’t willing to make it the “new normal.” It’s the decades of precedent that truly matter, not the vindictive whims of the overgrown toddlers currently holding office.
The Pentagon claims that demanding journalists agree not to “solicit,” much less print, data or information not explicitly approved for release by the Defense Department doesn’t reach any further than existing laws governing the handling of classified documents. The court disagrees, noting that the new policy allows the government to conflate the illegal solicitation of classified material with the sort of soliciting — i.e., requests for information, etc. — journalists do every day in hopes of securing something newsworthy.
On top of allowing the government to punish people for things that weren’t previously considered unlawful, the demand for obeisance wasn’t created in a vacuum. Instead, it flowed directly from this administration’s constant attacks on the press by the president and pretty much everyone in his Cabinet.
The plaintiffs are correct: “The record is replete with undisputed evidence that the Policy is viewpoint discriminatory.” That evidence tells the story of a Department whose leadership has been and continues to be openly hostile to the “mainstream media” whose reporting it views as unfavorable, but receptive to outlets that have expressed “support for the Trump administration in the past.”
The story begins prior to the adoption of the Policy, when—following extensive reporting on Secretary Hegseth’s background and qualifications during his confirmation process—Secretary Hegseth and Department officials “openly complained about reporting they perceive[d] as unfavorable to them and the Department.” Then, in the weeks and months leading up to the issuance of the Policy, Department officials repeatedly condemned certain news organizations—including The Times—for their coverage of the Department. For example, in response to reporting by The Times on Secretary Hegseth’s alleged misuse of the messaging platform Signal, Mr. Parnell posted on X to call out The Times “and all other Fake News that repeat their garbage.” Mr. Parnell decried these news organizations as “Trump-hating media” who “continue[] to be obsessed with destroying anyone committed to President Trump’s agenda.” In other social media posts leading up to the issuance of the Policy, Department officials referred to journalists from The Washington Post as “scum” and called for their “severe punishment” in response to reporting on Secretary Hegseth’s security detail.
It was never about keeping loose lips from sinking ships. It was always about cutting off access to news agencies the administration didn’t like. And once you’ve gotten rid of the critics, you’re left with the functional equivalent of a state-run media, but without the nastiness of having to disappear people into concentration camps or usher them out of their cubicles at gunpoint.
The court won’t let this stand. The new policy violates both the First Amendment and Fifth Amendment (due to the vagueness of its ban on “soliciting” sensitive information). That’s never been acceptable before in this nation. Just because there’s an aspiring tyrant leaning heavily on the Resolute Desk these days doesn’t make it any more permissible.
The Court recognizes that national security must be protected, the security of our troops must be protected, and war plans must be protected. But especially in light of the country’s recent incursion into Venezuela and its ongoing war with Iran, it is more important than ever that the public have access to information from a variety of perspectives about what its government is doing—so that the public can support government policies, if it wants to support them; protest, if it wants to protest; and decide based on full, complete, and open information who they are going to vote for in the next election. As Justice Brandeis correctly observed, “sunlight is the most powerful of all disinfectants.”
The administration will definitely appeal this decision. And it almost definitely will try to bypass the DC Appeals Court and go straight to the Supreme Court by claiming not being able to expel reporters it doesn’t like is some sort of national emergency. It will probably even claim that the fight it picked in Iran justifies the actions it took months before it decided to involve us in the nation’s latest Afghanistan/Vietnam.
But it definitely shouldn’t win. This isn’t some obscure permutation of First Amendment law. This is the government crafting a policy that allows it to decide what gets to be printed and who gets to print it. That’s never been acceptable here. And it never should be.
Last fall, I wrote about how the fear of AI was leading us to wall off the open internet in ways that would hurt everyone. At the time, I was worried about how companies were conflating legitimate concerns about bulk AI training with basic web accessibility. Not surprisingly, the situation has gotten worse. Now major news publishers are actively blocking the Internet Archive—one of the most important cultural preservation projects on the internet—because they’re worried AI companies might use it as a sneaky “backdoor” to access their content.
This is a mistake we’re going to regret for generations.
Nieman Lab reports that The Guardian, The New York Times, and others are now limiting what the Internet Archive can crawl and preserve:
When The Guardian took a look at who was trying to extract its content, access logs revealed that the Internet Archive was a frequent crawler, said Robert Hahn, head of business affairs and licensing. The publisher decided to limit the Internet Archive’s access to published articles, minimizing the chance that AI companies might scrape its content via the nonprofit’s repository of over one trillion webpage snapshots.
Specifically, Hahn said The Guardian has taken steps to exclude itself from the Internet Archive’s APIs and filter out its article pages from the Wayback Machine’s URLs interface. The Guardian’s regional homepages, topic pages, and other landing pages will continue to appear in the Wayback Machine.
The Times has gone even further:
The New York Times confirmed to Nieman Lab that it’s actively “hard blocking” the Internet Archive’s crawlers. At the end of 2025, the Times also added one of those crawlers — archive.org_bot — to its robots.txt file, disallowing access to its content.
“We believe in the value of The New York Times’s human-led journalism and always want to ensure that our IP is being accessed and used lawfully,” said a Times spokesperson. “We are blocking the Internet Archive’s bot from accessing the Times because the Wayback Machine provides unfettered access to Times content — including by AI companies — without authorization.”
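Mechanically, a robots.txt block like the one described above is a couple of lines of text that compliant crawlers check before fetching anything. As a rough sketch, Python’s standard-library parser shows how a rule targeting archive.org_bot plays out (the rules and the example.com URL here are illustrative, not the Times’s actual file):

```python
from urllib import robotparser

# An illustrative robots.txt body: shut out the Internet Archive's
# crawler entirely while leaving the site open to everyone else.
ROBOTS_TXT = """\
User-agent: archive.org_bot
Disallow: /

User-agent: *
Allow: /
"""

parser = robotparser.RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The archival crawler is denied every URL on the site...
print(parser.can_fetch("archive.org_bot", "https://example.com/2025/some-article"))  # False
# ...while any other user agent is still allowed through.
print(parser.can_fetch("Mozilla/5.0", "https://example.com/2025/some-article"))  # True
```

Note that robots.txt is purely honor-system: it only stops crawlers that choose to obey it, which is why the “hard blocking” (refusing the bot at the server level) goes further than the robots.txt entry alone.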
I understand the concern here. I really do. News publishers are struggling, and watching AI companies hoover up their content to train models that might then, in some ways, compete with them for readers is genuinely frustrating. I run a publication myself, remember.
But blocking the Internet Archive isn’t going to stop AI training. What it will do is ensure that significant chunks of our journalistic record and historical cultural context simply… disappear.
And that’s bad.
The Internet Archive is the most famous nonprofit digital library, and has been operating for nearly three decades. It isn’t some fly-by-night operation looking to profit off publisher content. It’s trying to preserve the historical record of the internet—which is way more fragile than most people comprehend. When websites disappear—and they disappear constantly—the Wayback Machine is often the only place that content still exists. Researchers, historians, journalists, and ordinary citizens rely on it to understand what actually happened, what was actually said, what the world actually looked like at a given moment.
In a digital era when few things end up printed on paper, the Internet Archive’s efforts to permanently preserve our digital culture are essential infrastructure for anyone who cares about historical memory.
And now we’re telling them they can’t preserve the work of our most trusted publications.
Think about what this could mean in practice. Future historians trying to understand 2025 will have access to archived versions of random blogs, sketchy content farms, and conspiracy sites—but not The New York Times. Not The Guardian. Not the publications that we consider the most reliable record of what’s happening in the world. We’re creating a historical record that’s systematically biased against quality journalism.
Yes, I’m sure some will argue that the NY Times and The Guardian will never go away. Tell that to the readers of the Rocky Mountain News, which published for 150 years before shutting down in 2009, or to the 2,100+ newspapers that have closed since 2004. Institutions—even big, prominent, established ones—don’t necessarily last.
As one computer scientist quoted in the Nieman piece put it:
“Common Crawl and Internet Archive are widely considered to be the ‘good guys’ and are used by ‘the bad guys’ like OpenAI,” said Michael Nelson, a computer scientist and professor at Old Dominion University. “In everyone’s aversion to not be controlled by LLMs, I think the good guys are collateral damage.”
That’s exactly right. In our rush to punish AI companies, we’re destroying public goods that serve everyone.
The most frustrating bit of all of this: The Guardian admits they haven’t actually documented AI companies scraping their content through the Wayback Machine. This is purely precautionary and theoretical. They’re breaking historical preservation based on a hypothetical threat:
The Guardian hasn’t documented specific instances of its webpages being scraped by AI companies via the Wayback Machine. Instead, it’s taking these measures proactively and is working directly with the Internet Archive to implement the changes.
And, of course, as one of the “good guys” of the internet, the Internet Archive is willing to do exactly what these publishers want. They’ve always been good about removing content or not scraping content that people don’t want in the archive. Sometimes to a fault. But you can never (legitimately) accuse them of malicious archiving (even if music labels and book publishers have).
Either way, we’re sacrificing the historical record not because of proven harm, but because publishers are worried about what might happen. That’s a hell of a tradeoff.
This isn’t even new, of course. Last year, Reddit announced it would block the Internet Archive from archiving its forums—decades of human conversation and cultural history—because Reddit wanted to monetize that content through AI licensing deals. The reasoning was the same: can’t let the Wayback Machine become a backdoor for AI companies to access content Reddit is now selling. But once you start going down that path, it leads to bad places.
The Nieman piece notes that, in the case of USA Today/Gannett, it appears that there was a company-wide decision to tell the Internet Archive to get lost:
In total, 241 news sites from nine countries explicitly disallow at least one out of the four Internet Archive crawling bots.
Most of those sites (87%) are owned by USA Today Co., the largest newspaper conglomerate in the United States formerly known as Gannett. (Gannett sites only make up 18% of Welsh’s original publishers list.) Each Gannett-owned outlet in our dataset disallows the same two bots: “archive.org_bot” and “ia_archiver-web.archive.org”. These bots were added to the robots.txt files of Gannett-owned publications in 2025.
Some Gannett sites have also taken stronger measures to guard their contents from Internet Archive crawlers. URL searches for the Des Moines Register in the Wayback Machine return a message that says, “Sorry. This URL has been excluded from the Wayback Machine.”
A Gannett spokesperson told Nieman Lab that it was about “safeguarding our intellectual property” but that’s nonsense. The whole point of libraries and archives is to preserve such content, and they’ve always preserved materials that were protected by copyright law. The claim that they have to be blocked to safeguard such content is both technologically and historically illiterate.
And here’s the extra irony: blocking these crawlers may not even serve publishers’ long-term interests. As I noted in my earlier piece, as more search becomes AI-mediated (whether you like it or not), being absent from training datasets increasingly means being absent from results. It’s a bit crazy to think about how much effort publishers put into “search engine optimization” over the years, only to now block the crawlers that feed the systems a growing number of people are using for search. Publishers blocking archival crawlers aren’t just sacrificing the historical record—they may be making themselves invisible in the systems that increasingly determine how people discover content in the first place.
The Internet Archive’s founder, Brewster Kahle, has been trying to sound the alarm:
“If publishers limit libraries, like the Internet Archive, then the public will have less access to the historical record.”
But that warning doesn’t seem to be getting through. The panic about AI has become so intense that people are willing to sacrifice core internet infrastructure to address it.
What makes this particularly frustrating is that the internet’s openness was never supposed to have asterisks. The fundamental promise wasn’t “publish something and it’s accessible to all, except for technologies we decide we don’t like.” It was just… open. You put something on the public web, people can access it. That simplicity is what made the web transformative.
Now we’re carving out exceptions based on who might access content and what they might do with it. And once you start making those exceptions, where do they end? If the Internet Archive can be blocked because AI companies might use it, what about research databases? What about accessibility tools that help visually impaired users? What about the next technology we haven’t invented yet?
This is a real concern. People say “oh well, blocking machines is different from blocking humans,” but that’s exactly why I mention assistive tech for the visually impaired. Machines accessing content are frequently tools that help humans—including me. I use an AI tool to help fact check my articles, and part of that process involves feeding it the source links. But increasingly, the tool tells me it can’t access those articles to verify whether my coverage accurately reflects them.
I don’t have a clean answer here. Publishers genuinely need to find sustainable business models, and watching their work get ingested by AI systems without compensation is a legitimate grievance—especially when you see how much traffic some of these (usually less scrupulous) crawlers dump on sites. But the solution can’t be to break the historical record of the internet. It can’t be to ensure that our most trusted sources of information are the ones that disappear from archives while the least trustworthy ones remain.
We need to find ways to address AI training concerns that don’t require us to abandon the principle of an open, preservable web. Because right now, we’re building a future where historians, researchers, and citizens can’t access the journalism that documented our era. And that’s not a tradeoff any of us should be comfortable with.
Remember last summer when everyone was freaking out about the explosion of AI-generated child sexual abuse material? The New York Times ran a piece in July with the headline “A.I.-Generated Images of Child Sexual Abuse Are Flooding the Internet.” NCMEC put out a blog post calling the numbers an “alarming increase” and a “wake-up call.” The numbers were genuinely shocking: NCMEC reported receiving 485,000 AI-related CSAM reports in the first half of 2025, compared to just 67,000 for all of 2024.
That’s a big increase! And it would obviously be super concerning if any AI company were finding and detecting so much AI-generated CSAM, especially as we keep hearing that the big AI models (perhaps with the exception of Grok…) have been putting in place safeguards against CSAM generation.
The source of most of those reports? Amazon, which had submitted a staggering 380,000 of them, even though most people don’t tend to think of Amazon as much of an AI company. But, still, it became a six-alarm fire about how much AI-generated CSAM Amazon had discovered. There were news stories about it, politicians demanding action, and the general sentiment was that this proved how big the problem was.
Except… it turns out that wasn’t actually what was happening. At all.
Bloomberg just published a deep dive into what was actually going on with Amazon’s reports, and the truth is very, very different from what everyone assumed. According to Bloomberg:
Amazon.com Inc. reported hundreds of thousands of pieces of content last year that it believed included child sexual abuse, which it found in data gathered to improve its artificial intelligence models. Though Amazon removed the content before training its models, child safety officials said the company has not provided information about its source, potentially hindering law enforcement from finding perpetrators and protecting victims.
Here’s the kicker—and I cannot stress this enough—none of Amazon’s reports involved AI-generated CSAM.
None of its reports submitted to NCMEC were of AI-generated material, the spokesperson added. Instead, the content was flagged by an automatic detection tool that compared it against a database of known child abuse material involving real victims, a process called “hashing.” Approximately 99.97% of the reports resulted from scanning “non-proprietary training data,” the spokesperson said.
What Amazon was actually reporting was known CSAM—images of real victims that already existed in databases—that their scanning tools detected in datasets being considered for AI training. They found it using traditional hash-matching detection tools, flagged it, and removed it before using the data. Which is… actually what you’d want a company to do?
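The hash-matching described above has a simple shape: compute a fingerprint of each file and check it against a database of fingerprints of known material. A minimal sketch, with the caveat that real pipelines use perceptual hashes (e.g., PhotoDNA) that also catch slightly altered copies, whereas the cryptographic hash below only matches exact bytes; all names and data here are hypothetical:

```python
import hashlib

# Hypothetical stand-in for a database of hashes of known material.
# Real systems match against NCMEC-maintained hash lists.
KNOWN_HASHES = {
    hashlib.sha256(b"known-bad-file-bytes").hexdigest(),
}

def flag_for_review(file_bytes: bytes) -> bool:
    """Return True if this file's hash matches the known-hash database."""
    return hashlib.sha256(file_bytes).hexdigest() in KNOWN_HASHES

# Vetting a candidate training dataset: flagged items are reported
# and dropped before any training happens.
dataset = [b"ordinary training example", b"known-bad-file-bytes"]
clean = [item for item in dataset if not flag_for_review(item)]
print(len(clean))  # 1
```

Nothing in that process involves a model generating anything; it’s the same detection technique platforms have used on uploads for years, just pointed at a training corpus.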
But because it was found in the context of AI development, and because NCMEC’s reporting form has exactly one checkbox that says “Generative AI” with no way to distinguish between “we found known CSAM in our training data pipeline” and “our AI model generated new CSAM,” Amazon checked the box.
And thus, a massive misunderstanding was born.
Again, let’s be clear and separate out a few things here: the fact that Amazon found CSAM (known or not) in its training data is bad. It is a troubling sign of how much CSAM is found in the various troves of data AI companies use for training. And maybe the focus should be on that. Also, the fact that they then reported it to NCMEC and removed it from their training data after discovering it with hash matching is… good. That’s how things are supposed to work.
But the fact that the media (with NCMEC’s help) turned this into “OMG AI generated CSAM is growing at a massive rate” is likely extremely misleading.
For half a year, “Massive Spike In AI-Generated CSAM” is the framing I’ve seen whenever news reports mention those H1 2025 numbers. Even the press release for a Senate bill about safeguarding AI models from being tainted with CSAM stated, “According to the National Center for Missing & Exploited Children, AI-generated material has proliferated at an alarming rate in the past year,” citing the NYT article.
Now we find out from Bloomberg that zero of Amazon’s reports involved AI-generated material; all 380,000 were hash hits to known CSAM. And we have Fallon [McNulty, executive director of the CyberTipline] confirming to Bloomberg that “with the exception of Amazon, the AI-related reports [NCMEC] received last year came in ‘really, really small volumes.'”
That is an absolutely mindboggling misunderstanding for everyone — the general public, lawmakers, researchers like me, etc. — to labor under for so long. If Bloomberg hadn’t dug into Amazon’s numbers, it’s not clear to me when, if ever, that misimpression would have been corrected.
She’s not wrong. Nearly 80% of all “Generative AI” CyberTipline reports to NCMEC in the first half of 2025 involved no AI-generated CSAM at all. The actual volume of AI-generated CSAM being reported? Apparently “really, really small.”
Now, to be (slightly?) fair to the NYT, they did run a minor correction a day after their original story noting that the 485,000 reports “comprised both A.I.-generated material and A.I. attempts to create material, not A.I.-generated material alone.” But that correction still doesn’t capture what actually happened. It wasn’t “AI-generated material and attempts”—it was overwhelmingly “known CSAM detected during AI training data vetting.” Those are very different things.
And it gets worse. Bloomberg reports that Amazon’s scanning threshold was set so low that many of those reports may not have even been actual CSAM:
Amazon believes it over-reported these cases to NCMEC to avoid accidentally missing something. “We intentionally use an over-inclusive threshold for scanning, which yields a high percentage of false positives,” the spokesperson added.
So we’ve got reports that aren’t AI-generated CSAM, many of which may not even be CSAM at all. Very helpful.
The frustrating thing is that this kind of confusion wasn’t just entirely predictable—it was predicted! When Pfefferkorn and her colleagues at Stanford published their report about NCMEC’s CSAM reporting system, they literally called out the potential confusion in the options of what to check and how platforms would likely over-report stuff in an abundance of caution, because the penalty (both criminally and in reputation) for missing anything is so dire.
Indeed, the form for submitting to the CyberTipline has one checkbox for “Generative AI” that, as Pfefferkorn notes in her letter, can mean wildly different things depending on who’s checking it:
When the meaning of checking a single checkbox is so ambiguous that absent additional information, reports of known CSAM found in AI training data are facially indistinguishable from reports of new AI-generated material (or of text-only prompts seeking CSAM, or of attempts to upload known CSAM as part of a prompt, etc.), and that ambiguity leads to a months-long massive public misunderstanding about the scale of the AI-CSAM problem, then it is clear that the CyberTipline reporting form itself needs to change—not just how one particular ESP fills it out.
To its credit NCMEC did respond quickly to Pfefferkorn, and their response is… illuminating. They confirmed they’re working on updating the reporting system, but also noted that Amazon’s reports contained almost no useful information:
all those Amazon reports included minimal data, not even the file in question or the hash value, much less other contextual information about where or how Amazon detected the matching file
As Pfefferkorn put it, Amazon was basically giving NCMEC reports that said “we found something” with nothing else attached. NCMEC says they only learned about the false positives issue last week and are “very frustrated” by it.
Indeed, NCMEC’s boss told Bloomberg:
“There’s nothing then that can be done with those reports,” she said. “Our team has been really clear with [Amazon] that those reports are inactionable.”
There’s plenty of blame to go around here. Amazon clearly should have been more transparent about what they were reporting and why. NCMEC’s reporting form is outdated and creates ambiguity that led to a massive public misunderstanding. And the media (NYT included) ran with alarming numbers without asking obvious questions like “why is Amazon suddenly reporting 25x more than last year and no other AI company is even close?”
But, even worse, policymakers spent six months operating under the assumption that AI-generated CSAM was exploding at an unprecedented rate. Legislation was proposed. Resources were allocated. Public statements were made. All based on numbers that fundamentally misrepresented what was actually happening.
As Pfefferkorn notes:
Nobody benefits from being so egregiously misinformed. It isn’t a basis for sound policymaking (or an accurate assessment of NCMEC’s resource needs) if the true volume of AI-generated CSAM being reported is a mere fraction of what Congress and other regulators believe it is. It isn’t good for Amazon if people mistakenly think the company’s AI products are uniquely prone to generating CSAM compared with other options on the market (such as OpenAI, with its distant-second 75,000 reports during the same time period, per NYT). That impression also disserves users trying to pick safe, responsible AI tools to use; in actuality, per today’s revelations about training data vetting, Amazon is indeed trying to safeguard its models against CSAM. I can certainly think of at least one other AI company that’s been in the news a lot lately that seems to be acting far more carelessly.
None of this means that AI-generated CSAM isn’t a real and serious problem. It absolutely is, and it needs to be addressed. But you can’t effectively address a problem if your data about the scope of that problem is fundamentally wrong. And you especially can’t do it when the “alarming spike” that everyone has been pointing to turns out to be something else entirely.
The silver lining here, as Pfefferkorn points out, is that the actual news is… kind of good? Amazon’s AI models aren’t CSAM-generating machines. The company was actually doing the responsible thing by vetting its training data. And the real volume of AI-generated CSAM reports is apparently much lower than we’ve been led to believe.
But that good news was buried for six months under a misleading narrative that nobody bothered to dig into until Bloomberg did. And that’s a failure of transparency, of reporting systems, and of the kind of basic journalistic skepticism that should have kicked in when one company was suddenly responsible for 78% of all reports in a category.
We’ll see if NCMEC’s promised updates to the reporting form actually address these issues. In the meantime, maybe we can all agree that the next time there’s a 700% increase in reports of anything, it’s worth asking a few questions before writing the “everything is on fire” headline.
Imagine you’re writing an article about a popular policy trend. The trend is expensive to implement, disruptive to normal operations, and—here’s the key part—there’s substantial research showing it doesn’t actually work and can cause other significant problems. How would you structure that article?
One approach: Lead with the evidence. “Despite growing enthusiasm for [policy proposal], studies consistently find it doesn’t accomplish its stated goals.” Put that in paragraph one, maybe paragraph two or three with some lead-up if you’re feeling generous.
Another approach: Spend 13 paragraphs hyping up the trend, listing every conceivable harm it’s meant to address, quoting lawmakers and administrators who support it, and then—only then—casually mention that the evidence shows it doesn’t work.
Mobile phone bans in school and social media bans for kids are increasingly popular around the globe, driven largely by Jonathan Haidt’s bestselling book—which remains a bestseller despite actual experts debunking basically everything in it. So when the paper of record wades into this debate, you’d think they might lead with what the evidence actually shows. You’d think wrong.
The article opens with the traditional moral panic opening, playing up all the fear:
Bullying. Sextortion. Body-shaming. Self-harm. Viral student-fight videos. Never-ending newsfeeds. Unhealthy relationships with A.I. chatbots. Teenagers who can’t seem to put down their phones.
Parents and teachers are understandably concerned about social media. For all of the community, creativity and just plain fun kids enjoy online, hazards remain all too frequent, some children’s advocates say.
It’s the greatest-hits compilation of every anxiety adults have projected onto kids and technology for decades (centuries, really). Might as well add “Dungeons & Dragons will make them worship Satan” for completeness.
The piece does eventually ask “can these bans actually help?” But not before spending several more paragraphs cataloging every conceivable harm that’s ever been tangentially associated with social media, strongly implying the tech itself is to blame rather than, you know, humanity. Then it dutifully reports that “lawmakers and schools” see bans as the answer.
Only then—14 paragraphs deep—does the Times get around to mentioning:
We have limited research on whether the bans work. After surveying more than 1,200 students in 30 schools across England, researchers at the University of Birmingham recently reported that cellphone bans did not improve students’ mental well-being.
“Limited research”?
No. We have plenty of research. There’s a comprehensive study in Australia that found no evidence bans helped kids. Multiple reports document actual harms from these bans—including privacy violations and safety issues when kids can’t reach parents during emergencies. It appears that the evidence is just inconvenient for the narrative.
But the Times isn’t done. The article includes a section on how bans “may have drawbacks”—and somehow the main drawback they identify is that bans don’t stop social media companies from doing bad things. Not that the bans don’t work. Not that they create new problems. Just that they don’t magically fix the platforms themselves:
Blanket tech bans can be crude instruments. They may make it harder for many young people to have social media accounts. But they often don’t change the underlying app features that many parents are worried about.
Many popular apps use powerful attention-hacking techniques that can hook young people, said Julia Powles, an Australian researcher who is the executive director of the U.C.L.A. Institute for Technology, Law and Policy. This keeps users online longer, she notes, and makes the companies more money from advertising.
This completely misses the point—which, as danah boyd has repeatedly explained, is that adults are confusing risks with harms. Many things are risky. Some can lead to harm. But we generally deal with risky things by teaching people how to manage those risks.
The response to potential harms from social media shouldn’t be to demand bans. It should be teaching kids how to navigate these spaces appropriately—how to recognize manipulation, how to minimize risks, what to do when something goes wrong. Instead, we hide it. We ban it. We shove it under the rug and pretend that if we just keep this scary thing away from kids, they’ll somehow be fine once the ban lifts.
And thus, we get the worst of everything. For every ban out there, kids will find ways around it. Often, that will involve doing things surreptitiously, in places with fewer controls and less ability for parents and teachers to properly show kids how to use those tools appropriately. It actually puts kids in more danger by pretending that if we just “ban” the places where they communicate, they’ll become perfect little kids who never look elsewhere.
The Times had a chance here to actually inform the debate—to lead with what the evidence shows, to explain the tradeoffs, to challenge the reflexive push for bans. Instead, they wrote 13 paragraphs of pure moral panic before mentioning that these policies don’t work, then immediately pivoted back to fearmongering about “attention-hacking techniques.”
This all just feeds the moral panic. It gives politicians and administrators cover to implement bans that won’t help kids but will absolutely create new problems. And when those bans inevitably fail, the Times will probably write another breathless piece wondering why kids are still struggling—while once again burying the fact that we never actually tried teaching them how to navigate these spaces in the first place.
A federal magistrate judge just ordered that the private ChatGPT conversations of 20 million users be handed over to the lawyers for dozens of plaintiffs, including news organizations. Those 20 million people weren’t asked. They weren’t notified. They have no say in the matter.
Last week, Magistrate Judge Ona Wang ordered OpenAI to turn over a sample of 20 million chat logs as part of the sprawling multidistrict litigation where publishers are suing AI companies—a mess of consolidated cases that kicked off with the NY Times’ lawsuit against OpenAI. Judge Wang dismissed OpenAI’s privacy concerns, apparently convinced that “anonymization” solves everything.
Even if you hate OpenAI and everything it stands for, and hope that the news orgs bring it to its knees, this should scare you. A lot. OpenAI had pointed out to the judge a week earlier that this demand from the news orgs would represent a massive privacy violation for ChatGPT’s users.
News Plaintiffs demand that OpenAI hand over the entire 20M log sample “in readily searchable format” via a “hard drive or [] dedicated private cloud.” ECF 656 at 3. That would include logs that are neither relevant nor responsive—indeed, News Plaintiffs concede that at least 99.99% of the logs are irrelevant to their claims. OpenAI has never agreed to such a process, which is wildly disproportionate to the needs of the case and exposes private user chats for no reasonable litigation purpose. In a display of striking hypocrisy, News Plaintiffs disregard those users’ privacy interests while claiming that their own chat logs are immune from production because “it is possible” that their employees “entered sensitive information into their prompts.” ECF 475 at 4. Unlike News Plaintiffs, OpenAI’s users have no stake in this case and no opportunity to defend their information from disclosure. It makes no sense to order OpenAI to hand over millions of irrelevant and private conversation logs belonging to those absent third parties while allowing News Plaintiffs to shield their own logs from disclosure.
OpenAI offered a much more privacy-protective alternative: hand over only a targeted set of logs actually relevant to the case, rather than dumping 20 million records wholesale. The news orgs fought back, but their reply brief is sealed—so we don’t get to see their argument. The judge bought it anyway, dismissing the privacy concerns on the theory that OpenAI can simply “anonymize” the chat logs:
Whether or not the parties had reached agreement to produce the 20 million Consumer ChatGPT Logs in whole—which the parties vehemently dispute—such production here is appropriate. OpenAI has failed to explain how its consumers’ privacy rights are not adequately protected by: (1) the existing protective order in this multidistrict litigation or (2) OpenAI’s exhaustive de-identification of all of the 20 million Consumer ChatGPT Logs.
The judge then quotes the news orgs’ filing, noting that OpenAI has already put in this effort to “deidentify” the chat logs.
Both of those supposed protections—the protective order and “exhaustive de-identification”—are nonsense. Let’s start with the anonymization problem, because it shows a stunning lack of understanding about what it means to anonymize data sets, especially AI chatlogs.
We’ve spent years warning people that “anonymized data” is a gibberish term, used by companies to pretend large collections of data can be kept private, when that’s just not true. Almost any large dataset of “anonymized” data can have significant portions of the data connected back to individuals with just a little work. Researchers re-identified individuals from “anonymized” AOL search queries, from NYC taxi records, from Netflix viewing histories—the list goes on. Every time someone shows up with an “anonymized” dataset, researchers show ways to re-identify people in the dataset.
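To make the re-identification problem concrete, here’s a minimal, entirely invented sketch of the classic linkage attack: you take an “anonymized” dataset and join it against a public one on the quasi-identifiers the scrubbing left behind. Every name, record, and field here is made up for illustration; real attacks use the same join logic against far richer data.

```python
# Illustrative toy example of a linkage attack. All names and records
# below are invented; the point is the join, not the data.

# "Anonymized" chat logs: account info removed, but the content still
# carries quasi-identifiers (city, employer) that were never scrubbed.
anonymized_chats = [
    {"user_id": "u1", "city": "Tampa", "employer": "Acme Corp", "topic": "divorce"},
    {"user_id": "u2", "city": "Boise", "employer": "Beta LLC", "topic": "recipes"},
]

# A hypothetical side dataset an attacker already has, e.g. scraped
# social media profiles or voter rolls, with names attached.
public_profiles = [
    {"name": "Alice Smith", "city": "Tampa", "employer": "Acme Corp"},
    {"name": "Bob Jones", "city": "Boise", "employer": "Beta LLC"},
]

def reidentify(chats, profiles):
    """Link 'anonymous' chats to names by joining on quasi-identifiers."""
    matches = {}
    for chat in chats:
        candidates = [
            p["name"]
            for p in profiles
            if p["city"] == chat["city"] and p["employer"] == chat["employer"]
        ]
        if len(candidates) == 1:  # a unique combination re-identifies the user
            matches[chat["user_id"]] = candidates[0]
    return matches

print(reidentify(anonymized_chats, public_profiles))
# → {'u1': 'Alice Smith', 'u2': 'Bob Jones'}
```

The toy version needs only two fields to get a unique match; with millions of free-text chats, the pool of usable quasi-identifiers (names mentioned, employers, locations, dates, family details) is vastly larger, which is why stripping account metadata alone doesn’t anonymize anything.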
And that’s even worse when it comes to ChatGPT chat logs, which are likely to be far more revealing than the previous data sets where the inability to anonymize data was called out. There have been plenty of reports of just how much people “overshare” with ChatGPT, often including incredibly private information.
Back in August, researchers got their hands on just 1,000 leaked ChatGPT conversations and talked about how much sensitive information they were able to glean from just that small number of chats.
Researchers downloaded and analyzed 1,000 of the leaked conversations, spanning over 43 million words. Among them, they discovered multiple chats that explicitly mentioned personally identifiable information (PII), such as full names, addresses, and ID numbers.
With that level of PII and sensitive information, connecting chats back to individuals is likely way easier than in previous cases of connecting “anonymized” data back to individuals.
And that was with just 1,000 records.
Then, yesterday as I was writing this, the Washington Post revealed that they had combed through 47,000 ChatGPT chat logs, many of which were “accidentally” revealed via ChatGPT’s “share” feature. Many of them reveal deeply personal and intimate information.
Users often shared highly personal information with ChatGPT in the conversations analyzed by The Post, including details generally not typed into conventional search engines.
People sent ChatGPT more than 550 unique email addresses and 76 phone numbers in the conversations. Some are public, but others appear to be private, like those one user shared for administrators at a religious school in Minnesota.
Users asking the chatbot to draft letters or lawsuits on workplace or family disputes sent the chatbot detailed private information about the incidents.
There are examples where, even if the user’s official details are redacted, it would be trivial to figure out who was actually doing the chats:
If you can’t see that, it’s a chat with ChatGPT, redacted by the Washington Post, that reads:
User my name is [name redacted] my husband name [name redacted] is threatning me to kill and not taking my responsibities and trying to go abroad […] he is not caring us and he is going to kuwait and he will give me divorce from abroad please i want to complaint to higher authgorities and immigrition office to stop him to go abroad and i want justice please help
ChatGPT Below is a formal draft complaint you can submit to the Deputy Commissioner of Police in [redacted] addressing your concerns and seeking immediate action:
That seems like even if you “anonymized” the chat by taking off the user account details, it wouldn’t take long to figure out whose chat it was, revealing some pretty personal info, including the names of their children (according to the Post).
And WaPo reporters found that by starting with 93,000 chats, then using tools to analyze the 47,000 in English, followed by human review of just 500 chats in a “random sample.”
Now imagine 20 million records. With many, many times more data, the ability to cross-reference information across chats, identify patterns, and connect seemingly disconnected pieces of information becomes exponentially easier. This isn’t just “more of the same”—it’s a qualitatively different threat level.
Even worse, the judge’s order contains a fundamental contradiction: she demands that OpenAI share these chatlogs “in whole” while simultaneously insisting they undergo “exhaustive de-identification.” Those two requirements are incompatible.
Real de-identification would require stripping far more than just usernames and account info—it would mean redacting or altering the actual content of the chats, because that content is often what makes re-identification possible. But if you’re redacting content to protect privacy, you’re no longer handing over the logs “in whole.” You can’t have both. The judge doesn’t grapple with this contradiction at all.
Yes, as the judge notes, this data is kept under the protective order in the case, meaning that it shouldn’t be disclosed. But protective orders are only as strong as the people bound by them, and there’s a huge risk here.
Looking at the docket, there are a ton of lawyers who will have access to these files. The docket list of parties and lawyers is 45 pages long if you try to print it out. While there are plenty of repeats in there, there have to be at least 100 lawyers and possibly a lot more (I’m not going to count them, and while I asked three different AI tools to count them, each gave me a different answer).
That’s a lot of people—many representing entities directly hostile to OpenAI—who all need to keep 20 million private conversations secret.
That’s not even getting into the fact that handling 20 million chat logs is a difficult task to do well. I am quite sure that among all the plaintiffs and all the lawyers, even with the very best of intentions, there’s still a decent chance that some of the content could leak (and it could, in theory, leak to some of the media properties who are plaintiffs in the case).
And, as OpenAI properly points out, its users whose data is at risk here have no say in any of this. They likely have no idea that a ton of people may be about to get an intimate look at what they thought were their private ChatGPT chats.
OpenAI is unaware of any court ordering wholesale production of personal information at this scale. This sets a dangerous precedent: it suggests that anyone who files a lawsuit against an AI company can demand production of tens of millions of conversations without first narrowing for relevance. This is not how discovery works in other cases: courts do not allow plaintiffs suing Google to dig through the private emails of tens of millions of Gmail users irrespective of their relevance. And it is not how discovery should work for generative AI tools either.
The judge had cited a ruling in one of Anthropic’s cases, but hadn’t given OpenAI a chance to explain why the ruling in that case didn’t apply here (in that one, Anthropic had agreed to hand over the logs as part of negotiations with the plaintiffs, and OpenAI gets in a little dig at its competitor, pointing out that it appears Anthropic made no effort to protect the privacy of its users in that case).
There have, as Daphne Keller regularly points out, always been challenges between user privacy and platform transparency. But this goes well beyond that familiar tension. We’re not talking about “platform transparency” in the traditional sense—publishing aggregated statistics or clarifying moderation policies. This is 20 million complete chatlogs, handed over “in whole” to dozens of adversarial parties and their lawyers. The potential damage to the privacy rights of those users could be massive.
Earlier today we wrote about Trump’s extraordinary admission that he was basing military deployment decisions on old Fox News footage and lies from his advisors. But there’s an even more damning story here: how that revelation almost never saw the light of day because of journalistic cowardice.
The smoking gun quote came from Trump’s phone interview with NBC’s Yamiche Alcindor:
“I spoke to the governor, she was very nice,” Trump said. “But I said, ‘Well wait a minute, am I watching things on television that are different from what’s happening? My people tell me different.’ They are literally attacking and there are fires all over the place…it looks like terrible.”
This is an absolutely nuclear quote.
But note that we linked to the local KGW affiliate report on it and not NBC’s.
And that’s because NBC didn’t even mention the quote at all in its own coverage. As Dan Froomkin highlighted in his article about all this, NBC ran two stories by Alcindor (with Alexandra Marquez) about her interview with Trump, neither of which mentioned that bombshell of a quote.
Instead, it was only because NBC apparently sent the full transcript to affiliates that Evan Watson at KGW picked it up and ran a story about it.
But that raises a ton of questions, including how could NBC and Alcindor not see this as a story? And what is wrong with the mainstream media that it basically skipped over this?
The quote is devastating. It reveals a president who is either completely detached from reality, easily manipulated by advisors feeding him false information, or being deliberately deceived by old Fox News footage (as we now know was happening). It raises fundamental questions about who is actually running the country and whether the person with access to nuclear codes can distinguish between television clips from five years ago and reality. As we detailed yesterday, this quote reveals everything about how Trump ended up threatening military action against an American city based on five-year-old Fox News b-roll.
NBC’s failure to see the story in this is journalistic malpractice of the highest order. When the President admits he can’t tell the difference between Fox News b-roll and reality, that’s not a throwaway line—it’s the story.
But it’s also part of a much larger pattern of media cowardice that’s actively damaging public trust in journalism. The problem isn’t just burying important quotes—it’s the widespread adoption of “view from nowhere” reporting that treats even the most basic facts as matters of debate.
Take this astounding example from a recent New York Times piece about Trump’s use of military force against boats in the Caribbean.
Some legal experts have called it a crime to summarily kill civilians not directly taking part in hostilities, even if they are believed to be smuggling drugs.
“Some legal experts?” Are you kidding me? Summarily executing civilians is a war crime under international law. This isn’t a matter of debate among competing schools of legal thought. There isn’t another camp of legal experts arguing that, actually, murdering civilians is totally fine. The Times is creating false balance where none exists, making it sound like there’s some reasonable disagreement about whether mass murder constitutes a crime.
Or consider this gem from CNN, fact-checking Trump’s claim that he reduced prescription drug prices by 1500%:
Trump has unveiled a number of moves aimed at cutting drug prices in recent months, but he has yet to move the needle on reducing costs – much less slashing them by 1,500%, which is mathematically impossible, experts say.
Experts say? You need experts to tell you that 1500% is more than 100%? This is elementary school math. A 100% reduction means something is free. A 1500% reduction would mean pharmaceutical companies are paying you a decent sum of money to take their pills. You don’t need to consult the National Academy of Sciences to determine this is bullshit—you need to remember fourth grade.
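The arithmetic is so simple you can write it in three lines. A price reduced by X% is `price * (1 - X/100)`, which hits zero at 100% and goes negative for anything beyond that (hypothetical numbers, obviously):

```python
def reduced_price(price, percent_reduction):
    """Price after a percentage reduction: price * (1 - pct/100)."""
    return price * (1 - percent_reduction / 100)

print(reduced_price(100, 50))    # → 50.0: half price
print(reduced_price(100, 100))   # → 0.0: the drug is now free
print(reduced_price(100, 1500))  # → -1400.0: the company pays YOU $1,400 per pill
```

No experts required: any reduction past 100% means the seller is handing customers money.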
This kind of reporting is journalistic malpractice disguised as objectivity. When reporters feel compelled to add “experts say” to basic mathematical facts or treat war crimes as matters of legitimate debate, they’re not being neutral—they’re actively misleading their audience into believing basic facts are up for debate among “experts.”
The pattern is clear: mainstream media has become so terrified of appearing biased that they’ve abandoned their basic responsibility to clearly communicate truth to the public. They’d rather hide behind the false comfort of “some say” and “experts disagree” than plainly state obvious facts.
This isn’t objectivity—it’s cowardice. And it’s precisely why trust in media continues to crater.
There’s an old joke in the journalism field (its exact origin is disputed): if one person says it’s raining and another says it’s not, the journalist should look outside and report the truth, rather than suggesting that whether or not it’s raining is a matter of dispute.
We’re seeing the opposite from the mainstream media these days.
When the President of the United States admits he can’t distinguish between television and reality, that’s not a “both sides” story, or a cute anecdote not worth mentioning. When someone claims to have reduced costs by 1500%, that’s not a matter requiring expert consultation—it’s a mathematical impossibility. When military officials discuss summarily executing civilians, that’s not a policy debate—it’s war crimes.
The public deserves better than this mealy-mouthed nonsense. They deserve reporters who can recognize when they’re witnessing something extraordinary and have the courage to say so clearly. They deserve news organizations that understand the difference between false balance and actual journalism.
Instead, we get reporters who bury the most important quotes of their own interviews and editors who think basic arithmetic requires expert verification. Is it any wonder people are losing faith in institutions that seem incapable of simply stating reality on its own terms?
The media keeps wondering why trust in journalism is at historic lows. Here’s a thought: maybe it’s because when the President reveals he’s making military decisions based on old Fox News footage and lies from his advisors, the reporter who got that admission decides it’s not worth mentioning. Or maybe it’s because the likes of CNN and the NY Times are so worried about angry people attacking them for calling bullshit on the President that they have to cower behind “experts say” on basic objective facts.
That’s not journalism. That’s stenography. And the American people can tell the difference, even when their media apparently cannot.
In what may be a first in American legal history, a sitting president just had his lawsuit struck down by a federal judge before the defendants even had a chance to respond.
Judge Steven Merryday didn’t wait for a motion to dismiss. He didn’t wait for the defendants to file an answer. Four days after Donald Trump’s lawyers filed their 85-page tantrum masquerading as a defamation complaint against the New York Times and Penguin Random House, Merryday struck it sua sponte—essentially telling the President of the United States that his legal filing was so fundamentally defective it wasn’t worth the court’s time.
Sua sponte dismissals are extraordinarily rare. Judges typically bend over backwards to let even the most questionable complaints proceed to motion practice. The fact that a federal judge took the unusual step of striking a complaint without any prompting from defendants signals just how egregiously improper Trump’s filing was.
Last week, we told you about the ridiculously dopey lawsuit that Donald Trump had filed against Penguin Random House, the NY Times, and some reporters over… something. It wasn’t quite clear. But the lawsuit spent many, many pages fluffing Donald Trump’s ego, suggesting that the NY Times’ mere endorsement of Kamala Harris was election interference, and implying that it breaks all manner of laws to criticize Dear Leader Donald J. Trump.
The complaint also betrayed a fundamental misunderstanding of defamation law’s “actual malice” standard and bore hallmarks that led many observers to suspect it was AI-generated—a theory that gains credibility when you read Judge Merryday’s scathing analysis of its contents.
The venue choice was transparently strategic. Trump forum-shopped his way to the Tampa division of the Middle District of Florida despite having no meaningful connection there—Mar-a-Lago is in the Southern District, and the defendants are based in New York. The complaint’s assertion that venue was proper because defendants “sell newspapers and books” in the district was laughably weak.
The real reason was likely that four of the five regular judges in that division were Trump appointees. But Trump’s luck ran out when the case landed on the docket of Judge Steven Merryday (who is on senior status), a no-nonsense Bush Sr. appointee who clearly wasn’t impressed by the presidential plaintiff.
As every member of the bar of every federal court knows (or is presumed to know), Rule 8(a), Federal Rules of Civil Procedure, requires that a complaint include “a short and plain statement of the claim showing that the pleader is entitled to relief.” Rule 8(e)(1) helpfully adds that “[e]ach averment of a pleading shall be simple, concise, and direct.” Some pleadings are necessarily longer than others. The difference likely depends on the number of parties and claims, the complexity of the governing facts, and the duration and scope of pertinent events. But both a shorter pleading and a longer pleading must comprise “simple, concise, and direct” allegations that offer a “short and plain statement of the claim.” Rule 8 governs every pleading in a federal court, regardless of the amount in controversy, the identity of the parties, the skill or reputation of the counsel, the urgency or importance (real or imagined) of the dispute, or any public interest at issue in the dispute.
In this action, a prominent American citizen (perhaps the most prominent American citizen) alleges defamation by a prominent American newspaper publisher (perhaps the most prominent American newspaper publisher) and by several other corporate and natural persons. Alleging only two simple counts of defamation, the complaint consumes eighty-five pages. Count I appears on page eighty, and Count II appears on page eighty-three. Pages one through seventy-nine, plus part of page eighty, present allegations common to both counts and to all defendants. Each count alleges a claim against each defendant and, apparently, each claim seeks the same remedy against each defendant.
But the judge doesn’t mince words about how “improper” the complaint is beyond just the length:
Even under the most generous and lenient application of Rule 8, the complaint is decidedly improper and impermissible. The pleader initially alleges an electoral victory by President Trump “in historic fashion” — by “trouncing” the opponent — and alludes to “persistent election interference from the legacy media, led most notoriously by the New York Times.” The pleader alludes to “the halcyon days” of the newspaper but complains that the newspaper has become a “full-throated mouthpiece of the Democrat party,” which allegedly resulted in the “deranged endorsement” of President Trump’s principal opponent in the most recent presidential election. The reader of the complaint must labor through allegations, such as “a new journalistic low for the hopelessly compromised and tarnished ‘Gray Lady.’” The reader must endure an allegation of “the desperate need to defame with a partisan spear rather than report with an authentic looking glass” and an allegation that “the false narrative about ‘The Apprentice’ was just the tip of Defendants’ melting iceberg of falsehoods.” Similarly, in one of many, often repetitive, and laudatory (toward President Trump) but superfluous allegations, the pleader states, “‘The Apprentice’ represented the cultural magnitude of President Trump’s singular brilliance, which captured the [Z]eitgeist of our time.”
And also points out how “tedious” the complaint is and points out that a civil complaint is no place for ranting and raving about how mean people are to you, with the main target being the PR value over having a legitimate complaint:
As every lawyer knows (or is presumed to know), a complaint is not a public forum for vituperation and invective — not a protected platform to rage against an adversary. A complaint is not a megaphone for public relations or a podium for a passionate oration at a political rally or the functional equivalent of the Hyde Park Speakers’ Corner.
That’s basically: “your complaint is the legal equivalent of the guy screaming out conspiracy theories on the street corner.”
The judge, as he should, gives Trump 28 days to amend the complaint, which is likely to happen. Whether or not his lawyers can actually follow the local rules and properly state a claim will remain only conjecture until that time.
Meanwhile, meeting with the press on Friday, Trump appeared wholly unaware that the case had been tossed. He started bragging about the case, and when ABC News reporter Jonathan Karl pointed out that it had been thrown out, Trump responded “I’m winning, I’m winning the cases.” He’s not.
TRUMP: That’s why I sued the New York Times two days ago for a lot of money
KARL: A judge just threw that out
TRUMP: I’m winning. I’m winning the cases.
The disconnect between Trump’s perception and legal reality perfectly encapsulates his approach to litigation: file theatrical lawsuits designed more for headlines than legal success, then either attack judges or (as here) just deny reality when courts treat them as actual legal documents that must follow rules. It’s a pattern we’ve seen repeatedly—lawsuits that work better as press releases than as instruments of justice.
Having a President who operates in an alternate reality where judicial smackdowns count as victories is, to put it mildly, concerning. But these days, it’s just a Friday.
Even by Donald Trump’s standards for frivolous defamation lawsuits, this one is impressively stupid. On Monday, the president filed yet another lawsuit against the NY Times—this time seeking $15 billion over a book that claims he’s not quite as successful a businessman as he pretends to be.
The timing is almost comically bad. Trump is suing over allegations that he’s not actually that successful… right after winning the presidency in a landslide and making absolute bank while doing it. Has there ever been a sorer winner in the history of politics? You’re the fucking President. Get over the fact that some people criticize you already.
Trump has a decently long history of suing media outlets over unflattering coverage, including multiple failed attempts against the Times. Just last year, he had to pay nearly $400k in legal fees after another bogus lawsuit against the Times failed. But why let past failures slow you down when you can file an even dumber one?
The lawsuit is against the NY Times and book publisher Penguin Random House, along with some reporters at the NY Times. The complaint is… well… it is not the most organized or professional of complaints. It is, as so many Donald Trump lawsuits seem to be, a political document designed to please Donald Trump and his legally ignorant MAGA base, rather than to convince judges.
The complaint reads more like a press release than a legal document, packed with ego-stroking passages that reveal just how pathetically thin-skinned Trump remains. Consider this actual paragraph from a federal lawsuit:
Thanks solely to President Trump’s sui generis charisma and unique business acumen, “The Apprentice” generated hundreds of millions of dollars in revenue, and remained on television for over thirteen years, with nearly 200 episodes. “The Apprentice” represented the cultural magnitude of President Trump’s singular brilliance, which captured the zeitgeist of our time.
And, yes, that picture is included.
The complaint starts out by claiming that the NY Times endorsing Kamala Harris was a form of “election interference” which is not how anything works.
President Trump trounced Harris with 312 electoral votes and a sweep of all seven “battleground” states. This victory was remarkable for many historic reasons, including because President Trump had to overcome persistent election interference from the legacy media, led most notoriously by the New York Times.
That’s literally in the first paragraph of the complaint (though the claims themselves do not revolve around election interference, but even weaker claims of defamation). But admitting that you won the election already undermines the idea that there was any damage done to Trump’s reputation from [checks notes] political reporting on him (historically some of the most protected speech under the First Amendment).
Indeed, Trump is going to have a pretty difficult time showing “damage” done to his reputation here. He claims that the NY Times tried to do three things:
Defendants’ pre-election goal was to kill three birds with one stone: (a) damage President Trump’s hard-earned and world-renowned reputation for business success, (b) in the process, sabotage his 2024 candidacy for President of the United States, and (c) prejudice judges and juries in the unlawful cases brought against President Trump, his family, and his businesses by his political opponents for purposes of election interference.
If that were true (and it isn’t) then they failed on all three counts. Trump won the election easily in 2024, he’s making absolute bank while being President (perhaps more than doubling his wealth) and all of the lawsuits against him have basically been shut down with Trump coming out on top.
Also, for anyone who has followed the NY Times’ repeated (and somewhat pathetic) attempts to bend over backwards to appease Trump and sanewash his attempt to bring fascism to America by pretending it’s politics-as-normal, this following sentence is ridiculous:
Today, the Times is a full-throated mouthpiece of the Democrat Party.
There is no one who has followed the NY Times’ willingness to “both sides” every crazy thing Trump does who actually believes that.
Then, after nearly five pages of screaming about how liberal the NY Times is, the lawsuit finally says that this lawsuit is not really about the NY Times at all, but rather a book written by two of its reporters (hence the Penguin Random House inclusion on the defendants list).
The subject matter of this action—a malicious, defamatory, and disparaging book written by two of its reporters and three false, malicious, defamatory, and disparaging articles, all carefully crafted by Defendants, with actual malice, calculated to inflict maximum damage upon President Trump, and all published during the height of a Presidential Election that became the most consequential in American history—represent a new journalistic low for the hopelessly compromised and tarnished “Gray Lady.” Defendants’ pre-election goal was to kill three birds with one stone: (a) damage President Trump’s hard-earned and world-renowned reputation for business success, (b) in the process, sabotage his 2024 candidacy for President of the United States, and (c) prejudice judges and juries in the unlawful cases brought against President Trump, his family, and his businesses by his political opponents for purposes of election interference. With President Trump having won the Presidency, Defendants’ goals remain similar and unlawful: tarnish his legacy of achievement, destroy his reputation as a successful businessman, and subject him to humiliation and ridicule.
Specifically, on September 17, 2024, Penguin published a false, malicious, and defamatory book titled “Lucky Loser: How Donald Trump Squandered His Father’s Fortune and Created the Illusion of Success” (the “Book”), authored by Craig and Buettner.
Dude. You won! Has there ever been a sorer winner in the history of politics? My goodness.
Before diving deeper into this mess, it’s crucial to understand what Trump actually needs to prove. As a public figure, he must show “actual malice”—and despite what Trump’s lawyers seem to think, that’s not about being mean to him.
Actual malice requires proving the defendants published something they knew was false or with reckless disregard for the truth (and reckless disregard also means something different than most people assume: it means you have to have ignored evidence that what you were publishing was false). It’s an extremely high bar, deliberately designed to protect robust debate about public figures. It has absolutely nothing to do with being angry or hostile—which is what Trump’s very bad lawyers seem to think it means.
Defendants each desire for President Trump fail politically and financially. Each feels actual malice towards President Trump in the colloquial sense: that is, each—Craig, Buettner, Baker, and Schmidt, as individuals, and the Times and Penguin’s relevant executives as corporations—subjectively wishes to harm President Trump, and each wish to manipulate public opinion to President Trump’s disadvantage to worsen his current and future political and economic prospects. Put bluntly, Defendants baselessly hate President Trump in a deranged way.
That final sentence—“Defendants baselessly hate President Trump in a deranged way”—reads like it was written by a sixth grader having a tantrum, not a lawyer filing a federal lawsuit. More importantly, nowhere in this 85-page screed do Trump’s lawyers actually demonstrate the knowing falsity or reckless disregard that the law requires.
They describe completely typical best practices in reporting as if they’re nefarious, such as the following:
Likewise, the Times and its reporters, including Craig, Buettner, Baker, and Schmidt, have a pattern and practice of contacting President Trump and his team regarding negative stories on a short timeline so as to be able to state that they sought comment—in order to preserve a scintilla of the pretense of neutrality—while making it functionally impossible for President Trump to comment on stories with factual errors, correct those errors, or provide a responsive quote before publication. This policy further enables the Times and its reporters to publish negative assertions about President Trump about which they subjectively harbor doubts as to their truthfulness by permitting them to claim that they sought factual confirmation or denial regarding their stories, even when they subjectively realize that they did not do so in good faith.
Again, that’s not how any of this works, and it’s certainly not how the NY Times’ reporting works. I have plenty of criticisms about the NY Times and its coverage, but the idea that they do this for the reasons stated is ludicrous.
The incredibly weak attempt to argue for reckless disregard… is to claim that because they didn’t interview producer Mark Burnett about Trump’s time on The Apprentice, that’s a form of ignoring counter evidence.
For non-exhaustive examples, and as detailed supra, Defendants published numerous statements regarding President Trump’s role in “The Apprentice” without first securing an interview from primary sources senior to the production of The Apprentice, such as Burnett. Defendants knew that Burnett would likely have contradicted numerous specific false, malicious, and defamatory purported statements of fact that they made regarding President Trump’s role in “The Apprentice” as well as their general narrative regarding President Trump’s role in the show’s success. Defendants therefore did not sufficiently pursue speaking with Burnett even after he did not grant an interview, did not sufficiently seek to obtain his original notes or records, and otherwise failed to engage with Burnett and other potential insiders with “The Apprentice” because they subjectively believed that these sources would have tended to contradict the defamatory lies that they wished to publish about President Trump.
Again, this is not how the NY Times works. If Burnett would have spoken to them (and historically he has refused to talk to the media about Trump beyond a single press statement he made before the 2016 election), the NY Times would have loved it and would have quoted him extensively, as that would be a huge scoop, given how often Burnett has refused to comment on Trump.
There’s also a whole tangent building off of Tulsi Gabbard’s ridiculously misleading statements earlier this year, falsely claiming that the Obama administration tried to fake Russia’s attempts to interfere with the 2016 election, even though multiple investigations (including those led by Republicans) have found that Russia absolutely tried to influence the 2016 election, even if it didn’t have much actual success.
The lawsuit then asks for… $15 billion. How very Dr. Evil. The NY Times, for what it’s worth, is currently valued at less than $10 billion.
A lot of people discussing this lawsuit are claiming two things: that it’s really all about getting a settlement out of the NY Times like he’s been getting out of others, and second that it’s an attempt to get NYT v. Sullivan (the key case that established the actual malice standard) overturned.
While both may be the intent behind this lawsuit, I find both outcomes unlikely. Yes, in the lawsuit, Trump lists out a bunch of those corrupt settlements, as if they’re somehow relevant here. But plenty of people have observed that those settlements had nothing to do with the merits of the cases, but rather were entirely about capitulating to a bully and trying to get him off their backs. And, in the case of CBS, it seemed quite clear that the settlement was so that Shari Redstone could get her deal to sell Paramount/CBS to Larry Ellison’s son.
And, when it comes to the NY Times, they have a very good legal team that tends to relish taking on bad faith, bullshit SLAPP style lawsuits. They have a very good track record on those, and don’t often roll over. I would imagine that the legal team feels pretty strongly about defending this case rather than settling.
As for the attack on the actual malice standard, that’s the same thing people claimed about the last Trump lawsuit against the NY Times, and it went up in smoke. It’s what people seem to want to claim about a bunch of frivolous defamation claims lately, and while that may be what the lawyers want, the underlying facts here are so silly and so obviously bullshit that they make for a really bad vehicle to argue that the NYT v. Sullivan standard is somehow unfair.
Honestly, this just feels like so many of Trump’s lawsuits: engaging in pointless vexatious SLAPP lawfare just to punish media properties that publish negative stories about him. He has long admitted that he enjoys filing such lawsuits. Famously, he once said:
“I spent a couple of bucks on legal fees, and they spent a whole lot more. I did it to make his life miserable, which I’m happy about.”
That’s the very definition of a SLAPP suit. And, if you’re wondering, Florida does have an anti-SLAPP law, though it’s a bit quirky compared to other states. Also (more importantly) the Eleventh Circuit (which covers Florida) has said that you can’t use anti-SLAPP laws in federal court.
But, really, if you want proof that this is just Trump trying to punish those who dare to report on him accurately, just witness how he responded to a question about how he felt about Pam Bondi’s unconstitutional claims of punishing people for hate speech, by immediately threatening to go after the journalist who asked the question.
JON KARL: What do you make of Pam Bondi saying she's gonna go after hate speech? A lot of your allies say hate speech is free speech.
TRUMP: We'll probably go after people like you because you treat me so unfairly. You have a lot of hate in your hate. Maybe they'll have to go after you.
The New York Times has had a rough few decades when it comes to being manipulated by bad actors. But their latest embarrassment—a complete non-story about NYC mayoral candidate Zohran Mamdani’s college application to Columbia University from 2009—represents a new low in journalistic malpractice that combines hacked materials, racist sources, and a breathtaking willingness to be used as a vehicle for right-wing propaganda. Oh, and all for a story that has zero news value and zero insight into Mamdani’s qualifications to be mayor of New York City.
Here’s what happened: The Times published a story claiming that Mamdani, who was born in Uganda to parents of Indian descent, checked both “Asian” and “Black or African American” boxes on his Columbia University application all the way back in 2009. The implication, pushed by the story’s framing, was that this was somehow scandalous—a case of gaming the system for affirmative action benefits.
As he runs for mayor of New York City, Zohran Mamdani has made his identity as a Muslim immigrant of South Asian descent a key part of his appeal.
But as a high school senior in 2009, Mr. Mamdani, the Democratic nominee, claimed another label when he applied to Columbia University. Asked to identify his race, he checked a box that he was “Asian” but also “Black or African American,” according to internal data derived from a hack of Columbia University that was shared with The New York Times.
Columbia, like many elite universities, used a race-conscious affirmative action admissions program at the time. Reporting that his race was Black or African American in addition to Asian could have given an advantage to Mr. Mamdani, who was born in Uganda and spent his earliest years there.
I’m genuinely curious about the Times’ logic here. Person born in Uganda checks “African American” box. Where’s the lie? Did Uganda move? Is it not in Africa anymore? Are we really going to pretend that America’s racial categories, designed primarily for descendants of American slavery, map perfectly onto the global complexity of human identity?
If there is a story, it is solely about the Times’ decision and later justification for publishing this non-story.
Mamdani has a complex racial and ethnic background that doesn’t fit neatly into America’s crude racial categories. As he told the Times: “Most college applications don’t have a box for Indian-Ugandans, so I checked multiple boxes trying to capture the fullness of my background.” He also noted that he wrote in “Ugandan” in the space provided for additional information.
Oh, and for all the “could have given an advantage to Mr. Mamdani” reporting in the piece: it didn’t. He didn’t even get into Columbia. Even though his father is a professor there.
So much for gaming the system.
But here’s where it gets really ugly: The Times obtained this information from a massive hack of Columbia’s database, and their source was Jordan Lasker, who goes by the online handle “Cremieux” and whose hobbies include arguing that Black people are genetically inferior. Yes, really. The Times initially described him merely as “an academic who opposes affirmative action,” but as The Guardian previously reported, Lasker regularly argues that Black people are mentally inferior to other races and has written posts defending the idea that African countries have “average national IQs at a level that experts associate with mental impairment.”
But wait, it gets worse. The NY Times’ description of him as “an academic” is generous at best (or perhaps just credulous). His own sister claimed that the family has no evidence he ever graduated: he didn’t walk at the graduation ceremony that year, and his name wasn’t listed in the graduation program. An analysis by another account noted that while he was a PhD student between 2021 and 2024 at Texas Tech, the only academic publication they could find by him turned into a huge scandal that got the professor he co-authored with fired. The paper was not just racist pseudoscience—it also involved lying to the NIH to get access to data. Two-fer!
That article also suggests Lasker (in that paper) lied about his supposed affiliation with the University of Minnesota. When asked about it, the University of Minnesota revealed that Lasker had been a “non-employee” “data consultant” and that they had asked him not to claim an academic affiliation.
So, to summarize the Times’ sourcing: They granted anonymity to a person whose identity was already publicly known, who promotes ideas about racial hierarchy that would make a 1930s eugenicist blush, who may have lied about his academic credentials, and whose main claim to fame is getting a professor fired for publishing racist garbage research. And this seemed like a credible source to them for a story attacking a Muslim candidate of color.
What could possibly go wrong?
The Rufo Connection Makes It Even Worse
If this sounds familiar, it should. As Semafor reported, the Times rushed to publish this non-story because they were afraid of being “scooped” by Chris Rufo, the right-wing activist who has openly bragged about manipulating mainstream media to advance his culture war agenda.
The paper believed it had reason to push the story out quickly: It did not want to be scooped by the independent journalist Christopher Rufo. Two people familiar with the reporting process told Semafor that the paper was aware that other journalists were working on the admissions story, including Rufo, a conservative best known for his crusade against critical race theory.
Rufo literally announces his manipulation tactics on Twitter. He’s written about how he plans to get outlets like the Times to amplify his disingenuous and misleading campaigns. And yet, the Times still falls for it every single time, then acts surprised when people point out they’re being played.
As Jamison Foser noted months ago about this dynamic, this isn’t really about the Times being “manipulated”—it’s about the Times wanting to publish these stories and using figures like Rufo as an excuse to do what they already wanted to do.
The Times had a choice: they could have ignored this obvious non-story, or they could have served as a willing vehicle for racists and right-wing propagandists to manufacture a fake scandal. They chose the latter. And then they doubled down on it.
But here’s what kills me: they could have written a fascinating story about how a network of racist activists was trying to weaponize hacked university data that revealed nothing particularly interesting to attack a Muslim mayoral candidate. They could have exposed the whole operation. Instead, they decided to become part of it. It’s like if Woodward and Bernstein, upon discovering Watergate, had decided to focus their expose on how the security at the Watergate Hotel was top notch, with an anonymous quote from G. Gordon Liddy.
The Double Standard Is Glaring
The Times’ decision becomes even more indefensible when you consider their recent editorial choices. They refused to publish hacked materials about JD Vance during the 2024 election and declined to explain why. But when a racist hands them a hacked college application from 2009 that reveals nothing of public interest, suddenly those ethical concerns disappear.
The paper also famously decided not to endorse candidates in local elections—except when it came to Mamdani, whom they specifically urged voters not to rank at all on their ballots. Interestingly, they didn’t issue similar “please don’t vote for this person” guidance about Andrew Cuomo, the disgraced former governor who resigned over sexual harassment allegations and has been plagued with scandals from his mismanagement during the pandemic. Apparently checking the objectively accurate box on a college application is more disqualifying than a pattern of sexual misconduct and mismanagement.
Manufacturing Controversy To Justify Bad Journalism
Perhaps most galling is the Times’ response to criticism. When readers and media critics pointed out how absurd this story was, an anonymous Times source told Semafor that the controversy proved they were right to publish this:
“The fact that this story engendered all the conversation and debate that it has feels like all the evidence you need that this was a legit line of reporting,” one senior reporter told Semafor.
But that’s not how any of this works. At all. Sometimes the “conversation and debate” is about how you should have known better.
Times editor Patrick Healy also doubled down, claiming—in a lengthy rambling thread on ExTwitter—that Mamdani’s honest responses to their questions about this somehow turned it into a story.
The Times then published a follow-up piece asking readers about frustrations with racial categories on forms—a transparent attempt to retroactively justify their original story by suggesting there’s some broader conversation about racial identity that needed to be had.
But there was already a conversation about racial identity. It’s been going on for centuries. The Times didn’t need to platform a racist and manufacture a fake scandal to contribute to it.
The Real Story They Missed
As Margaret Sullivan, the Times’ former public editor, noted in The Guardian, this story tells us nothing about Mamdani’s qualifications or policy positions. It’s the journalistic equivalent of spending your time investigating whether someone returned their elementary school library books on time instead of, you know, whether they’d be competent at running a city.
Traditional journalism ethics suggests that when news organizations base a story on hacked or stolen information, there should be an extra high bar of newsworthiness to justify publication. Much of Big Journalism, for example, turned their noses up at insider documents offered to them about JD Vance during last year’s presidential campaign, in part because the source was Iranian hackers; in some cases, they wrote about the hack but not the documents.
The Mamdani story, however, fell far short of the newsworthiness bar.
The real story here is how easily America’s supposed “paper of record” can be manipulated by bad actors who openly announce their manipulation tactics. It’s about how the Times’ apparent opposition to certain candidates leads them to abandon basic journalistic standards. And it’s about how the paper’s desperate desire to appear “balanced” makes them perfect marks for right-wing propagandists who understand exactly which buttons to push.
As Hell Gate put it: “The failing, bumbling New York Times” has become a vehicle for race science and manufactured outrage, all while pretending they’re just doing journalism.
So who does this put the Times in league with? Much like its coverage of trans youth, it’s helpful to look around and see who else is pushing the same line of coverage. It’s hard-right ideology laundered as legitimate journalistic inquiry. The article’s print edition on Sunday ran under the title “Mamdani Faces Scrutiny Over College Application.” From who? For what? The Times clearly doesn’t feel all that interested in answering these questions, other than its providing cover for fascistic ideologues. The Times is coordinating with people whose work is actively eroding what’s left of America’s attempts at racial equity.
Again, it’s hard to tsk-tsk a newspaper that said it wasn’t endorsing candidates in local elections anymore, and then revised that to actually be like, “unless you’re thinking of electing a socialist, which in that case do not do that and instead vote for this sexual harasser.” Having failed spectacularly at stopping Mamdani, the Times is now unveiling its tried-and-true strategy to drum up controversy—and question the legitimacy of a person’s humanity—by doing the dirtiest of work for the worst-faith actors.
The Times owes its readers an explanation for why they thought this was a story worth telling. Why they granted anonymity to a person who promotes racial pseudoscience. Why they rushed to publish obvious non-news to avoid being “scooped” by a known manipulator. And why they continue to provide aid and comfort to people whose stated goal is to manipulate them.
But the paper has shown no inclination toward introspection. Instead, they’ve doubled down, claiming that the controversy they manufactured proves they were right to manufacture it.
In the meantime, the rest of us can learn something from this debacle: when someone tells you who they are, believe them. Chris Rufo has told us he manipulates mainstream media. Jordan Lasker has told us he believes in debunked racist pseudoscience about “racial hierarchy.” And the New York Times has told us that they’re willing to amplify both of them if it serves their editorial agenda.