Judge Orders OpenAI To Give Lawyers 20 Million Private Chats, Thinks ‘Anonymization’ Can Keep Them Private

from the seems-like-a-problem dept

A federal magistrate judge just ordered that the private ChatGPT conversations of 20 million users be handed over to the lawyers for dozens of plaintiffs, including news organizations. Those 20 million people weren’t asked. They weren’t notified. They have no say in the matter.

Last week, Magistrate Judge Ona Wang ordered OpenAI to turn over a sample of 20 million chat logs as part of the sprawling multidistrict litigation where publishers are suing AI companies—a mess of consolidated cases that kicked off with the NY Times’ lawsuit against OpenAI. Judge Wang dismissed OpenAI’s privacy concerns, apparently convinced that “anonymization” solves everything.

Even if you hate OpenAI and everything it stands for, and hope that the news orgs bring it to its knees, this should scare you. A lot. OpenAI had pointed out to the judge a week earlier that these demands from the news orgs would represent a massive privacy violation for ChatGPT’s users.

News Plaintiffs demand that OpenAI hand over the entire 20M log sample “in readily searchable format” via a “hard drive or [] dedicated private cloud.” ECF 656 at 3. That would include logs that are neither relevant nor responsive—indeed, News Plaintiffs concede that at least 99.99% of the logs are irrelevant to their claims. OpenAI has never agreed to such a process, which is wildly disproportionate to the needs of the case and exposes private user chats for no reasonable litigation purpose. In a display of striking hypocrisy, News Plaintiffs disregard those users’ privacy interests while claiming that their own chat logs are immune from production because “it is possible” that their employees “entered sensitive information into their prompts.” ECF 475 at 4. Unlike News Plaintiffs, OpenAI’s users have no stake in this case and no opportunity to defend their information from disclosure. It makes no sense to order OpenAI to hand over millions of irrelevant and private conversation logs belonging to those absent third parties while allowing News Plaintiffs to shield their own logs from disclosure.

OpenAI offered a much more privacy-protective alternative: hand over only a targeted set of logs actually relevant to the case, rather than dumping 20 million records wholesale. The news orgs fought back, but their reply brief is sealed—so we don’t get to see their argument. The judge bought it anyway, dismissing the privacy concerns on the theory that OpenAI can simply “anonymize” the chat logs:

Whether or not the parties had reached agreement to produce the 20 million Consumer ChatGPT Logs in whole—which the parties vehemently dispute—such production here is appropriate. OpenAI has failed to explain how its consumers’ privacy rights are not adequately protected by: (1) the existing protective order in this multidistrict litigation or (2) OpenAI’s exhaustive de-identification of all of the 20 million Consumer ChatGPT Logs.

The judge then quotes the news orgs’ filing, noting that OpenAI has already put in this effort to “deidentify” the chat logs.

Both of those supposed protections—the protective order and “exhaustive de-identification”—are nonsense. Let’s start with the anonymization problem, because it shows a stunning lack of understanding about what it means to anonymize data sets, especially AI chatlogs.

We’ve spent years warning people that “anonymized data” is a gibberish term, used by companies to pretend large collections of data can be kept private, when that’s just not true. Almost any large dataset of “anonymized” data can have significant portions of the data connected back to individuals with just a little work. Researchers re-identified individuals from “anonymized” AOL search queries, from NYC taxi records, from Netflix viewing histories—the list goes on. Every time someone shows up with an “anonymized” dataset, researchers show ways to re-identify people in the dataset.
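The basic mechanics of those re-identification cases are simple enough to sketch. The snippet below is a minimal, hypothetical illustration (the data and names are invented, loosely modeled on the NYC taxi example): stripping direct identifiers leaves behind quasi-identifiers, and joining those against outside information recovers identities.

```python
# Hypothetical linkage-attack sketch: "anonymized" records keep
# quasi-identifiers (here, a pickup zone and an hour of day) that can be
# joined against outside information to recover who is who.
anonymized_rides = [
    {"ride_id": "r1", "pickup_zone": "Midtown", "hour": 9},
    {"ride_id": "r2", "pickup_zone": "SoHo",    "hour": 23},
]

# Side information an attacker might have, e.g. a tabloid photo of a
# celebrity getting into a cab at a known place and time.
sightings = [
    {"name": "Alice Example", "zone": "SoHo", "hour": 23},
]

def reidentify(rides, sightings):
    """Match stripped records back to people via shared quasi-identifiers."""
    matches = []
    for ride in rides:
        for s in sightings:
            if ride["pickup_zone"] == s["zone"] and ride["hour"] == s["hour"]:
                matches.append((s["name"], ride["ride_id"]))
    return matches

print(reidentify(anonymized_rides, sightings))  # [('Alice Example', 'r2')]
```

With two quasi-identifier columns and a toy dataset the match is trivial; with millions of records and richer side channels, the same join just takes more compute, not more cleverness.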

And that’s even worse when it comes to ChatGPT chat logs, which are likely to be way more revealing than the previous data sets where the inability to anonymize data was called out. There have been plenty of reports of just how much people “overshare” with ChatGPT, often including incredibly private information.

Back in August, researchers got their hands on just 1,000 leaked ChatGPT conversations and talked about how much sensitive information they were able to glean from just that small number of chats.

Researchers downloaded and analyzed 1,000 of the leaked conversations, spanning over 43 million words. Among them, they discovered multiple chats that explicitly mentioned personally identifiable information (PII), such as full names, addresses, and ID numbers.

With that level of PII and sensitive information, connecting chats back to individuals is likely way easier than in previous cases of connecting “anonymized” data back to individuals.

And that was with just 1,000 records.

Then, yesterday as I was writing this, the Washington Post revealed that they had combed through 47,000 ChatGPT chat logs, many of which were “accidentally” revealed via ChatGPT’s “share” feature. Many of them reveal deeply personal and intimate information.

Users often shared highly personal information with ChatGPT in the conversations analyzed by The Post, including details generally not typed into conventional search engines.

People sent ChatGPT more than 550 unique email addresses and 76 phone numbers in the conversations. Some are public, but others appear to be private, like those one user shared for administrators at a religious school in Minnesota.

Users asking the chatbot to draft letters or lawsuits on workplace or family disputes sent the chatbot detailed private information about the incidents.

There are examples where, even if the user’s official details are redacted, it would be trivial to figure out who was actually doing the chats:

If you can’t see that, it’s a chat with ChatGPT, redacted by the Washington Post, saying:

User
my name is [name redacted] my husband name [name redacted] is threatning me to kill and not taking my responsibities and trying to go abroad […] he is not caring us and he is going to kuwait and he will give me divorce from abroad please i want to complaint to higher authgorities and immigrition office to stop him to go abroad and i want justice please help


ChatGPT
Below is a formal draft complaint you can submit to the Deputy Commissioner of Police in [redacted] addressing your concerns and seeking immediate action:

That seems like even if you “anonymized” the chat by taking off the user account details, it wouldn’t take long to figure out whose chat it was, revealing some pretty personal info, including the names of their children (according to the Post).

And WaPo reporters found that by starting with 93,000 chats, then using tools to do an analysis of the 47,000 in English, followed by human review of just 500 chats in a “random sample.”

Now imagine 20 million records. With many, many times more data, the ability to cross-reference information across chats, identify patterns, and connect seemingly disconnected pieces of information becomes exponentially easier. This isn’t just “more of the same”—it’s a qualitatively different threat level.

Even worse, the judge’s order contains a fundamental contradiction: she demands that OpenAI share these chatlogs “in whole” while simultaneously insisting they undergo “exhaustive de-identification.” Those two requirements are incompatible.

Real de-identification would require stripping far more than just usernames and account info—it would mean redacting or altering the actual content of the chats, because that content is often what makes re-identification possible. But if you’re redacting content to protect privacy, you’re no longer handing over the logs “in whole.” You can’t have both. The judge doesn’t grapple with this contradiction at all.
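To make the contradiction concrete, here is a minimal sketch (invented chat text, illustrative regexes only, not any real de-identification pipeline) of what stripping only the obvious structured identifiers looks like. The patterns catch an email address and a phone number, but the free-text details that actually identify the speaker pass straight through.

```python
import re

# Invented example chat text, not a real log.
chat = ("my name is Jane Doe, my husband Rahul is going to Kuwait; "
        "our children attend a religious school in Minnesota, contact "
        "admin@example.org or 555-0142")

# Naive "de-identification": scrub emails and phone numbers only.
scrubbed = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[EMAIL]", chat)
scrubbed = re.sub(r"\b\d{3}-\d{4}\b", "[PHONE]", scrubbed)

print(scrubbed)
# The names, the school, and the travel plans all survive. Removing them
# too would mean rewriting the content itself, at which point the logs
# are no longer being produced "in whole."
```

The point isn’t that better regexes exist; it’s that anything strong enough to remove identifying free-text detail necessarily alters the very content the production order says must be handed over intact.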

Yes, as the judge notes, this data is kept under the protective order in the case, meaning that it shouldn’t be disclosed. But protective orders are only as strong as the people bound by them, and there’s a huge risk here.

Looking at the docket, there are a ton of lawyers who will have access to these files. The docket list of parties and lawyers is 45 pages long if you try to print it out. While there are plenty of repeats in there, there have to be at least 100 lawyers and possibly a lot more (I’m not going to count them, and while I asked three different AI tools to count them, each gave me a different answer).

That’s a lot of people—many representing entities directly hostile to OpenAI—who all need to keep 20 million private conversations secret.

That’s not even getting into the fact that handling 20 million chat logs is a difficult task to do well. I am quite sure that among all the plaintiffs and all the lawyers, even with the very best of intentions, there’s still a decent chance that some of the content could leak (and it could, in theory, leak to some of the media properties who are plaintiffs in the case).

And, as OpenAI properly points out, its users whose data is at risk here have no say in any of this. They likely have no idea that a ton of people may be about to get an intimate look at what they thought were their private ChatGPT chats.

On Wednesday morning, OpenAI asked the judge to reconsider, warning of the very real potential harms:

OpenAI is unaware of any court ordering wholesale production of personal information at this scale. This sets a dangerous precedent: it suggests that anyone who files a lawsuit against an AI company can demand production of tens of millions of conversations without first narrowing for relevance. This is not how discovery works in other cases: courts do not allow plaintiffs suing Google to dig through the private emails of tens of millions of Gmail users irrespective of their relevance. And it is not how discovery should work for generative AI tools either.

The judge had cited a ruling in one of Anthropic’s cases, but hadn’t given OpenAI a chance to explain why the ruling in that case didn’t apply here (in that one, Anthropic had agreed to hand over the logs as part of negotiations with the plaintiffs, and OpenAI gets in a little dig at its competitor, pointing out that it appears Anthropic made no effort to protect the privacy of its users in that case).

There have, as Daphne Keller regularly points out, always been challenges between user privacy and platform transparency. But this goes well beyond that familiar tension. We’re not talking about “platform transparency” in the traditional sense—publishing aggregated statistics or clarifying moderation policies. This is 20 million complete chatlogs, handed over “in whole” to dozens of adversarial parties and their lawyers. The potential damage to the privacy rights of those users could be massive.

And the judge just waves it all away.



Comments on “Judge Orders OpenAI To Give Lawyers 20 Million Private Chats, Thinks ‘Anonymization’ Can Keep Them Private”

26 Comments
Rocky (profile) says:

Re:

Most people are totally oblivious to privacy-related problems because they don’t understand them and the implications, and their response is usually “Why should I care, I don’t have anything to hide.”

It’s the same thinking that’s so common when people engage in risky behaviors because “I never had any problems before.” That is, until their whole life gets fucked up when those risks suddenly become reality.

Anonymous Coward says:

This is extraordinarily bad

(I’m channeling Egon Spengler here.)

Those of us who deal with identified, deidentified, and anonymized data know that it is incredibly difficult to actually make this happen, even with simple data such as tables of alphanumeric values. But at least there are methodologies — painful and tedious methodologies — that allow us to do this, given enough effort and to use statistical analysis to show that we’ve done it.

Of course almost nobody ever bothers with that, all they do is strip out a few fields and declare success. And thus we have the parade of failures mentioned in this article.

But when it comes to the kind of data we’re talking about here, with its syntactic and semantic complexity, I wouldn’t even know where to begin. Heck, I’m not even aware of any research that provides guidance on how to do this with sample data sets, let alone millions.

There’s plenty of blame to go around here: plaintiffs, judge, etc. But it’s also OpenAI’s fault for not having the minimal foresight required to see this coming and realize that keeping so many chat logs was a disastrous choice which would inevitably lead to their disclosure, one way or another.

n00bdragon (profile) says:

Not saying this is a good thing, but OpenAI could have avoided all this trouble by simply not making and retaining those logs in the first place. Finance companies routinely purge old data that they are not legally required to keep, because having it can make them financially liable if a dispute arises. If you’re concerned that people are giving private info to your chat bot (and OpenAI must realize that this is happening), then the only way to truly protect yourself from the law is to not keep anything the law doesn’t require you to keep.

Pink Elephant says:

Re:

This is nonsense. These are not “old logs” that are not needed; these are users’ chats from yesterday. Users expect that previous conversations will be there, so they can search, re-use, and continue past conversations.

This is like saying “Google should purge all old emails the minute after you read them“.

Ethin Probst (profile) says:

OpenAI offered a much more privacy-protective alternative: hand over only a targeted set of logs actually relevant to the case, rather than dumping 20 million records wholesale.

Okay, but was this something OpenAI would have control over? If so, I can kinda understand why the news orgs were not even remotely eager to take them up on that offer. There would be nothing stopping OAI from doing some secret record clean-up to get completely off the hook, and I wouldn’t put it past them to try that given all the other weird things they’ve tried in these cases.

That One Guy (profile) says:

Apply the 'You first' test

Anyone claiming that data like that can be ‘anonymized’ should be told to put up or shut up.

Have them create a full copy of their personal information, from email, doctor’s records to financial data, scrub their name and address from it and then ask them, ‘How willing are you to hand this ‘anonymized’ data to someone who doesn’t know you? It doesn’t have your name or address, so clearly they could never identify you with it, right?’

Arianity (profile) says:

OpenAI offered a much more privacy-protective alternative:

And it just so happens to cover their asses as much as possible. I’m so glad it cares now, and not at any point while they were hoovering up 20 million users’ data (which has exactly the same potential to leak, be hacked, etc.).

Back in August, researchers got their hands on just 1,000 leaked ChatGPT conversations

“leaked”. They were using share links indexable by search engines lol.

there are a ton of lawyers who will have access to these files. The docket list of parties and lawyers is 45 pages long if you try to print it out.

It’s still quite a bit, but those largely seem to be the same few lawyers/firms, just repeated. Never mind that like 2/3 seem to be OpenAI and various subsidiaries. It’s the same Steptoe LLP, Susman Godfrey, Lieff Cabraser, Boies Schiller, Saveri, etc. repeated ad nauseam. Susman Godfrey, for instance, is listed 148 times.

TKnarr (profile) says:

I can’t help but consider this fiasco a good thing, though. It makes utterly clear the problems inherent in the third-party doctrine. Having people’s noses rubbed in it might just motivate enough outrage to get that doctrine revisited and replaced with one that recognizes that people do retain an expectation of privacy in records they give to third parties no matter how inconvenient the government might find that.

crazy_diamond (profile) says:

Redaction

I think that OpenAI should give up the records. Redact them with the same enthusiasm that the DOD and Justice Department do: apply black blocks over all the content (don’t use Acrobat!) except the date, various pronouns, punctuation marks, and any minor words which collectively say nothing. If that’s acceptable from the government, why not from private citizens?

I know, I know, the government will always lead with everything from “national security” to “ongoing criminal investigation” claims. But the argument for PERSONAL security should carry far more weight than it apparently does.

“The right of the people to be secure in their persons, houses, papers, and effects, against unreasonable searches and seizures, shall not be violated, …” seems like black letter law to me.

P.S. I also know that the 4th Amendment is a dead letter

MrWilson (profile) says:

No, this isn’t a good thing, even if you think it might have a good effect like “waking people up.” They’re going to fall for the next assurance that the next leak won’t happen, supposedly secure data will leak anyway, and that’s the reality of the world we live in. But a judge demanding this leak is still a violation that cannot be cheered, regardless of how irresponsible or naive you think the victims are. What if it’s your spouse or your friend or your boss or your employee who uses your name in their chats, the same way your friends’ and family’s emails contain your responses and, if leaked, will reveal your secrets? It doesn’t matter if it’s a hacker, a judge, or a bad system admin. This is bad. You don’t shoot people to teach them about firearms safety.

Ethin Probst (profile) says:

Re:

Then they either shouldn’t store the chats at all or should store them on the user’s device. This is literally a solved problem by now. There is no way of anonymizing the data (or de-identifying it or anything else) given it’s not structured data, and identifying what is “sensitive” and what isn’t is practically impossible. But if OAI didn’t want to suffer this, maybe they should’ve actually thought about cybersecurity and privacy instead of just creating something and ignoring it until it was impossible to ignore anymore. This is entirely OAI’s fault.

Anonymous Coward says:

Re: Re:

if OAI didn’t want to suffer this

OAI isn’t suffering shit. They aren’t suffering now. They won’t be suffering after this data is released. No suffering will occur for OAI regardless of the outcome here.

The choice is between OAI not suffering, and splashing the details of Bob’s abusive relationship across the internet while OAI doesn’t suffer.

MrWilson (profile) says:

Re: Re:

Then they either shouldn’t store the chats at all

So you’re just saying you don’t understand how the software functions at all?

or should store them on the users device.

This could be a possibility, but really you should just download the offline version of the model and chat locally, though that excludes mobile users and anyone without sufficient processing power and RAM, which brings us back to the question: do you just not understand how the software functions? Do you not understand that the logs are desired by the customers?

This is literally a solved problem by now.

You literally don’t seem to understand what you’re talking about.

There is no way of anonymizing the data (or de-identifying it or anything else) given it’s not structured data and identifying what is “sensitive” and not is practically impossible.

Correct, which is why the judge, who is the source of the issue, shouldn’t have demanded the leak.

But if OAI didn’t want to suffer this, maybe they should’ve actually thought about cybersecurity and privacy instead of just creating something and ignoring it until it was impossible to ignore anymore. This is entirely OAI’s fault.

Would you suggest the same of, say, Google Drive or Microsoft OneDrive? Do you not see any value to the customer in the retention of privately-accessed logs of previous use of the service? Do you think all cloud storage should be deleted immediately upon generation?
