Six Months Of ‘AI CSAM Crisis’ Headlines Were Based On Misleading Data
from the lies,-damned-lies,-and... dept
Remember last summer when everyone was freaking out about the explosion of AI-generated child sexual abuse material? The New York Times ran a piece in July with the headline “A.I.-Generated Images of Child Sexual Abuse Are Flooding the Internet.” NCMEC put out a blog post calling the numbers an “alarming increase” and a “wake-up call.” The numbers were genuinely shocking: NCMEC reported receiving 485,000 AI-related CSAM reports in the first half of 2025, compared to just 67,000 for all of 2024.
That’s a big increase! And it would obviously be super concerning if any AI company were detecting so much AI-generated CSAM, especially as we keep hearing that the big AI models (perhaps with the exception of Grok…) have put safeguards in place against CSAM generation.
The source of most of those reports? Amazon, which had submitted a staggering 380,000 of them, even though most people don’t tend to think of Amazon as much of an AI company. But, still, it became a six-alarm fire over how much AI-generated CSAM Amazon had discovered. There were news stories about it, politicians demanding action, and the general sentiment was that this proved how big the problem was.
Except… it turns out that wasn’t actually what was happening. At all.
Bloomberg just published a deep dive into what was actually going on with Amazon’s reports, and the truth is very, very different from what everyone assumed. According to Bloomberg:
Amazon.com Inc. reported hundreds of thousands of pieces of content last year that it believed included child sexual abuse, which it found in data gathered to improve its artificial intelligence models. Though Amazon removed the content before training its models, child safety officials said the company has not provided information about its source, potentially hindering law enforcement from finding perpetrators and protecting victims.
Here’s the kicker—and I cannot stress this enough—none of Amazon’s reports involved AI-generated CSAM.
None of its reports submitted to NCMEC were of AI-generated material, the spokesperson added. Instead, the content was flagged by an automatic detection tool that compared it against a database of known child abuse material involving real victims, a process called “hashing.” Approximately 99.97% of the reports resulted from scanning “non-proprietary training data,” the spokesperson said.
What Amazon was actually reporting was known CSAM—images of real victims that already existed in databases—that their scanning tools detected in datasets being considered for AI training. They found it using traditional hash-matching detection tools, flagged it, and removed it before using the data. Which is… actually what you’d want a company to do?
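The detection step described above can be sketched in a few lines. To be clear, this is an illustrative sketch, not Amazon's actual pipeline: production systems use perceptual hashes such as PhotoDNA (which survive resizing and re-encoding), and the database of known hashes comes from organizations like NCMEC. The names and hash values here are made up for illustration, using an exact cryptographic hash for simplicity.

```python
import hashlib

# Hypothetical database of hashes of known, previously identified material.
# (This value is the SHA-256 of the bytes b"test", used purely as a stand-in.)
KNOWN_HASHES = {
    "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08",
}

def flag_known_material(file_bytes: bytes) -> bool:
    """Return True if the file matches a known hash, meaning it should be
    removed from the training set and reported."""
    digest = hashlib.sha256(file_bytes).hexdigest()
    return digest in KNOWN_HASHES

# A training-data vetting pipeline would run this over every candidate file,
# drop the matches before training, and file a report for each hit.
print(flag_known_material(b"test"))  # prints True: byte-identical known file
```

Note what this kind of matching can and cannot tell you: it flags copies of material already in the database, but by construction it says nothing about newly generated imagery, which is exactly why these reports were not evidence of AI-generated CSAM.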
But because it was found in the context of AI development, and because NCMEC’s reporting form has exactly one checkbox that says “Generative AI” with no way to distinguish between “we found known CSAM in our training data pipeline” and “our AI model generated new CSAM,” Amazon checked the box.
And thus, a massive misunderstanding was born.
Again, let’s be clear and separate out a few things here: the fact that Amazon found CSAM (known or not) in its training data is bad. It is a troubling sign of how much CSAM is found in the various troves of data AI companies use for training. And maybe the focus should be on that. Also, the fact that they then reported it to NCMEC and removed it from their training data after discovering it with hash matching is… good. That’s how things are supposed to work.
But the fact that the media (with NCMEC’s help) turned this into “OMG AI generated CSAM is growing at a massive rate” is likely extremely misleading.
Riana Pfefferkorn at Stanford, who co-authored an important research report last year about the challenges of NCMEC’s reporting system (which we wrote two separate posts about), wrote a letter to NCMEC that absolutely nails what went wrong here:
For half a year, “Massive Spike In AI-Generated CSAM” is the framing I’ve seen whenever news reports mention those H1 2025 numbers. Even the press release for a Senate bill about safeguarding AI models from being tainted with CSAM stated, “According to the National Center for Missing & Exploited Children, AI-generated material has proliferated at an alarming rate in the past year,” citing the NYT article.
Now we find out from Bloomberg that zero of Amazon’s reports involved AI-generated material; all 380,000 were hash hits to known CSAM. And we have Fallon [McNulty, executive director of the CyberTipline] confirming to Bloomberg that “with the exception of Amazon, the AI-related reports [NCMEC] received last year came in ‘really, really small volumes.'”
That is an absolutely mindboggling misunderstanding for everyone — the general public, lawmakers, researchers like me, etc. — to labor under for so long. If Bloomberg hadn’t dug into Amazon’s numbers, it’s not clear to me when, if ever, that misimpression would have been corrected.
She’s not wrong. Nearly 80% of all “Generative AI” CyberTipline reports to NCMEC in the first half of 2025 involved no AI-generated CSAM at all. The actual volume of AI-generated CSAM being reported? Apparently “really, really small.”
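The "nearly 80%" figure follows directly from the numbers above:

```python
amazon_reports = 380_000    # hash matches to known CSAM, not AI-generated
total_ai_reports = 485_000  # all "Generative AI" CyberTipline reports, H1 2025

# Amazon's non-AI-generated reports as a share of the whole category
share = amazon_reports / total_ai_reports
print(f"{share:.1%}")  # prints 78.4%
```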
Now, to be (slightly?) fair to the NYT, they did run a minor correction a day after their original story noting that the 485,000 reports “comprised both A.I.-generated material and A.I. attempts to create material, not A.I.-generated material alone.” But that correction still doesn’t capture what actually happened. It wasn’t “AI-generated material and attempts”—it was overwhelmingly “known CSAM detected during AI training data vetting.” Those are very different things.
And it gets worse. Bloomberg reports that Amazon’s scanning threshold was set so low that many of those reports may not have even been actual CSAM:
Amazon believes it over-reported these cases to NCMEC to avoid accidentally missing something. “We intentionally use an over-inclusive threshold for scanning, which yields a high percentage of false positives,” the spokesperson added.
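Amazon’s “over-inclusive threshold” comment points at how fuzzy hash matching works: perceptual hashes are compared by distance, and a match is declared when two hashes are within some cutoff of each other. Raising that cutoff catches more altered copies of known images, but also sweeps in unrelated files. A toy sketch with small bit patterns standing in for perceptual hashes (the specific values and thresholds are made up for illustration):

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two perceptual hashes."""
    return bin(a ^ b).count("1")

def is_match(candidate: int, known: int, max_distance: int) -> bool:
    """Declare a match if the hashes differ in at most max_distance bits."""
    return hamming(candidate, known) <= max_distance

known = 0b1010_1010
near_dup = 0b1010_1011   # 1 bit off: e.g., a re-encoded copy of the same image
unrelated = 0b0101_0101  # 8 bits off: a completely different image

# Strict threshold: only true near-duplicates match.
print(is_match(near_dup, known, max_distance=2))    # prints True
print(is_match(unrelated, known, max_distance=2))   # prints False

# Over-inclusive threshold: unrelated content now matches too,
# which is the false-positive problem Amazon describes.
print(is_match(unrelated, known, max_distance=10))  # prints True
```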
So we’ve got reports that aren’t AI-generated CSAM, many of which may not even be CSAM at all. Very helpful.
The frustrating thing is that this kind of confusion wasn’t just entirely predictable—it was predicted! When Pfefferkorn and her colleagues at Stanford published their report about NCMEC’s CSAM reporting system, they explicitly called out the ambiguity in the reporting options and warned that platforms would likely over-report out of an abundance of caution, because the penalties (both criminal and reputational) for missing anything are so dire.
Indeed, the form for submitting to the CyberTipline has one checkbox for “Generative AI” that, as Pfefferkorn notes in her letter, can mean wildly different things depending on who’s checking it:
When the meaning of checking a single checkbox is so ambiguous that absent additional information, reports of known CSAM found in AI training data are facially indistinguishable from reports of new AI-generated material (or of text-only prompts seeking CSAM, or of attempts to upload known CSAM as part of a prompt, etc.), and that ambiguity leads to a months-long massive public misunderstanding about the scale of the AI-CSAM problem, then it is clear that the CyberTipline reporting form itself needs to change — not just how one particular ESP fills it out.
To its credit, NCMEC did respond quickly to Pfefferkorn, and their response is… illuminating. They confirmed they’re working on updating the reporting system, but also noted that Amazon’s reports contained almost no useful information:
all those Amazon reports included minimal data, not even the file in question or the hash value, much less other contextual information about where or how Amazon detected the matching file
As Pfefferkorn put it, Amazon was basically giving NCMEC reports that said “we found something” with nothing else attached. NCMEC says they only learned about the false positives issue last week and are “very frustrated” by it.
Indeed, NCMEC’s boss told Bloomberg:
“There’s nothing then that can be done with those reports,” she said. “Our team has been really clear with [Amazon] that those reports are inactionable.”
There’s plenty of blame to go around here. Amazon clearly should have been more transparent about what they were reporting and why. NCMEC’s reporting form is outdated and creates ambiguity that led to a massive public misunderstanding. And the media (NYT included) ran with alarming numbers without asking obvious questions like “why is Amazon suddenly reporting 25x more than last year and no other AI company is even close?”
But, even worse, policymakers spent six months operating under the assumption that AI-generated CSAM was exploding at an unprecedented rate. Legislation was proposed. Resources were allocated. Public statements were made. All based on numbers that fundamentally misrepresented what was actually happening.
As Pfefferkorn notes:
Nobody benefits from being so egregiously misinformed. It isn’t a basis for sound policymaking (or an accurate assessment of NCMEC’s resource needs) if the true volume of AI-generated CSAM being reported is a mere fraction of what Congress and other regulators believe it is. It isn’t good for Amazon if people mistakenly think the company’s AI products are uniquely prone to generating CSAM compared with other options on the market (such as OpenAI, with its distant-second 75,000 reports during the same time period, per NYT). That impression also disserves users trying to pick safe, responsible AI tools to use; in actuality, per today’s revelations about training data vetting, Amazon is indeed trying to safeguard its models against CSAM. I can certainly think of at least one other AI company that’s been in the news a lot lately that seems to be acting far more carelessly.
None of this means that AI-generated CSAM isn’t a real and serious problem. It absolutely is, and it needs to be addressed. But you can’t effectively address a problem if your data about the scope of that problem is fundamentally wrong. And you especially can’t do it when the “alarming spike” that everyone has been pointing to turns out to be something else entirely.
The silver lining here, as Pfefferkorn points out, is that the actual news is… kind of good? Amazon’s AI models aren’t CSAM-generating machines. The company was actually doing the responsible thing by vetting its training data. And the real volume of AI-generated CSAM reports is apparently much lower than we’ve been led to believe.
But that good news was buried for six months under a misleading narrative that nobody bothered to dig into until Bloomberg did. And that’s a failure of transparency, of reporting systems, and of the kind of basic journalistic skepticism that should have kicked in when one company was suddenly responsible for 78% of all reports in a category.
We’ll see if NCMEC’s promised updates to the reporting form actually address these issues. In the meantime, maybe we can all agree that the next time there’s a 700% increase in reports of anything, it’s worth asking a few questions before writing the “everything is on fire” headline.
Filed Under: ai, csam, cybertipline, data, moral panic
Companies: amazon, ncmec, ny times


Comments on “Six Months Of ‘AI CSAM Crisis’ Headlines Were Based On Misleading Data”
The correction is welcome (and overdue) but...
…let’s note that nonconsensual deepfake porn of adults is showing up in a lot of places. I haven’t attempted to develop a metric for it yet, so this is a subjective impression, and thus it could be subject to sample bias, confirmation bias, blahblahblah bias. It would be interesting to see Pfefferkorn et al. weigh in on this, because my bet is that they have the tools to make a much more rigorous assessment.
That seems a bit rosy. If they’re only matching known hashes of real victims against their training set, by design that wouldn’t tell us about AI CSAM in the wild or being output. And while it did vet its training data, it also generated garbage, unactionable reports. Not exactly what I’d call responsible, to say nothing of the false-positive rate.
To give NYT some small credit, they did also compare it with the UK’s numbers, which they said went from 2 to 1286. Between that, NCMEC’s comments, Amazon’s lack of comment, and the fact that Amazon really is actually reporting more (due to different reporting standards), I don’t totally blame them.
And to give NCMEC a little credit, there is a free-form box companies can use to clarify; it’s just not one of the checkboxes. They just… don’t use it. (And it being automated is no excuse; you can automate that.)
It still shouldn’t have happened, but it’s not totally just on lack of journalistic skepticism.
Honestly, while better numbers are better, I don’t think the actual legislation changes? To the extent that this provokes change, it’s likely to be at NCMEC.
The legal justification for banning child porn is that possession either involves abuse itself or acquisition, the latter of which creates a market for abuse. But AI-generated child porn does not involve abuse nor in any clear way create a market for it so the above argument falls apart. (I am assuming that the imagery does not appear to depict particular real people.) What exactly is the problem with such AI generated imagery that makes it a problem? That it is disturbing or disgusting? That it makes laws against real child porn hard to enforce?
Re:
I think that writing laws that ban disturbing or disgusting generated imagery is a slippery slope which in this case is based on a somewhat valid “think of the children” excuse.
The actual sticking point as I see it is what you asked (rephrased): Does AI-generated content make it harder to detect and prosecute actual sexual exploitation of children?
I think the answer is yes but I have no good answer how to deal with it without greasing the slippery slope.
Re: Re:
The answer is yes. I’ve spoken to actual experts about this (not the pretend experts) and they have said that the rise of AI CSAM has harmed investigations into actual producers and victims of CSAM. That’s a big concern. There is also some evidence suggesting that it leads to an increasing propensity to seek out real CSAM, though that’s less conclusive.
Re: Re: Re:
That last one comes from a “think of the children” “non-profit” which makes dodgy surveys conflating at times, say, violent content with sexual content to “juice the numbers”. They aggressively promote the E.U.’s “chat control” surveillance proposal. They were founded around the time that proposal was first proposed. It smells like astroturf, frankly.
This “non-profit” is funded by another “think of the children” “non-profit” bankrolled by (and partially launched by) the British government.
A few years ago, they managed to sucker a few technologists with less expertise in sociology, and so, you still see people repeating this line today.
Re:
You are correct. There is no justification for singling that aspect out. Many of the arguments hinge on “what ifs” (quite a few are being pushed by “save the children” “non-profits” bankrolled by the British government with a history of unscientific puritanical arguments, if not a cop’s “opinion”).
If you unpeel it, it’s essentially rhetoric used against something someone already has an issue with.
The so-called focus on “AI CSAM” supposedly being uniquely bad is really a clever Silicon Valley marketing operation to distract from the fact that AI models scrape a vast amount of data on the Internet, even data someone might not expect to be scraped. This is quite similar to the concerns around Clearview AI.
That is the issue. It’s an issue with all these systems being pushed by OpenAI, Google, and so on.
All this talk about this filter or that filter really distracts from the reality that these companies are scraping so much.
Re:
It’s actually really bad when someone conflates artificial content with clear deepfakes of someone. I don’t think it’s Mike’s fault here as he is more of a commentator but there are people in the industry who have pushed this irresponsible bs.
A deepfake is bad because it victimizes someone. If people muddle the two concepts, it dilutes the gravity and seriousness of this conduct.
It’s also the same reason you get things like the Grok case.
The hubbub here was not because it “might generate” some offensive content. Rather, it was the way that xAI put buttons on everyone else’s posts, or the way someone could tag in Grok in the replies. Everyone hated this feature.
Instead of directly dealing with this, the media discourse is about whether there is a “flood of CSAM” (this point was actually debunked by eSafety Commissioner Julie Inman Grant, who said there were no actionable reports made to their office; Australian law is quite strict and Julie hates Elon).
Whether it’s the methodology of measuring the number of attempts w/o checking the imagery or success rate, or anecdotal cases, the argument of “the flood of CSAM” didn’t seem to hold up.
For a number of people, even, it started to look like a partisan hit job against Elon (it didn’t help).
However, whether it is CSAM or not (or NCII for that matter), this tool can be used in a manner that is harassing. Whatever filter Elon installs or doesn’t install does not change that. It’s also not a matter of “punishing Elon”. Nor is it time for the nanny state.
It’s that damn feature.
Yeah, no, Amazon clearly sucks. Meaningless reports flooding the zone; and I feel like everyone is being extremely overly charitable about the checkbox. For now, I am going to doubt any reasonable person created or uses the checkbox to indicate “what was happening when we found it.”
we all know Amazon sucks
Amazon sucking but also doing what they should when it comes to scanning and reporting doesn’t make AI not full of CSAM (hello, X/grok). People are alarmed for quite reasonable reasons that you seem to gloss over.
Re:
Repeatedly throughout the article I said there was a real problem. Did you not read it?
That was six months ago. Enter Grok.
The tone of this article is odd. Perhaps because it is unusual for an article about heightened caution around reporting potential CSAM (AI-generated or not), especially when no one has been wrongfully accused, to be so “I told you so” focused?
Re:
You are missing the bigger picture: laws with negative consequences for everyone have been written based on less noise than this.
The point of laws is to address actual problems without creating other problems, which almost never happens when the cry “Think of the children!” is raised and used as the sole impetus, ignoring everything else.