discordian_eris

January 26, 2017 at 3:56 pm

Unreal

The Western world is at the point now where data acquisition meets or exceeds that behind the Great Firewall of China. The only difference is that it is not the government doing the spying; it is business. Thanks to the third-party doctrine and the All Writs Act, the governments here in the US have access to all of it, almost always without the need for something as onerous as a warrant or even probable cause.

Obama refused to crack down on these activities adequately and has handed Trump tools and weapons that no president should have. I’d say that it is up to Congress now to do their jobs, but since they work for the corporations, and not the people, that isn’t going to happen.

While I think that Obama did a number of good things, I sure wish that he had had the balls to actually heed the warnings that he was given. Like LBJ (in regard to Vietnam), Obama was too worried about being called a pussy to do the right thing about T.W.A.T. Now we all, Americans and the entire world, are going to be forced to deal with the consequences of his inaction.

I sure hope like hell that in 2020 Americans remember this kind of crap and put someone in the White House who isn’t too cowardly or psychotic to do the right thing for the country. It’s time we voted in people who KNOW that they work for the best interests of the people, not the best interests of the government, or the corporations.

madasahatter (profile)

January 26, 2017 at 4:24 pm

Oh Well

I am not surprised. The basic flaw is the assumption that anonymized data hides personal preferences and habits. It does not. Everyone tends to have well-defined usage patterns on the web. Correlating these patterns between various services can effectively deanonymize data. The more points of correlation, the more accurate the identification.
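To make the correlation point concrete, here is a minimal sketch of linking "anonymized" records to named profiles by overlap in habits. Everything in it is an assumption for illustration: the site names, the pseudonyms, the profiles, and the choice of Jaccard similarity as the matching score. Real attacks use richer signals, but the principle is the same.

```python
def jaccard(a, b):
    """Overlap between two sets of habits: 0.0 is disjoint, 1.0 is identical."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def link_records(anonymized, identified):
    """For each pseudonym, pick the named profile whose habits overlap most."""
    links = {}
    for pseudonym, habits in anonymized.items():
        best = max(identified, key=lambda name: jaccard(habits, identified[name]))
        links[pseudonym] = (best, jaccard(habits, identified[best]))
    return links


# Hypothetical toy data: sites seen per pseudonym in an "anonymized" dataset...
anonymized = {
    "user_7f3a": {"news.example", "knitting.example", "chess.example", "weather.example"},
    "user_c21b": {"cars.example", "news.example", "golf.example"},
}
# ...and sites seen per named account on some other service.
identified = {
    "alice": {"knitting.example", "chess.example", "weather.example",
              "news.example", "mail.example"},
    "bob": {"cars.example", "golf.example", "news.example", "stocks.example"},
}

links = link_records(anonymized, identified)
for pseudonym, (name, score) in links.items():
    print(pseudonym, "->", name, round(score, 2))
```

Each extra shared habit raises the score of the true match and lowers everyone else's, which is why more points of correlation mean a more accurate identification.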

Unanimous Cow Herd

January 26, 2017 at 8:11 pm

About time

Good to have you back.

Typical Business Executive

January 26, 2017 at 8:46 pm

Everybody knows that in order to look people up in a database you start with their last name. And you start that search with the first letter of their last name. So, by removing the first letter of the last name from the data we have made it impossible to link any of the data to any particular person because we have made it impossible to look them up! Thus, we have achieved perfectly anonymized data.

Anonymous Coward

January 27, 2017 at 4:49 am

Re: Re:

“Everybody knows that in order to look people up in a database you start with their last name.”

I do not think that everyone knows that tidbit; it seems like an unreasonably restrictive search criterion. Why is this the case when other identifying tags might provide equivalent results?

Anonymous Coward

January 27, 2017 at 6:33 am

Re: Re: Re:

Woosh!

Graham Cobb (profile)

January 27, 2017 at 7:10 am

How do we fight this?

Two possible routes (I am sure there are others):

The law. Allow for data to be marked, by the owner or source, as "anonymised" (whether any technical steps are taken or not) and make it a criminal offence to either (i) attempt to de-anonymise, or (ii) correlate such data with any other data. This should be enough to prevent (for example) insurance companies using such data to set premiums and it might even be enough to prevent major commercial data brokers from using the data (although steps would have to be taken to make sure investigation and penalties are severe enough to prevent data-washing, possibly abroad). Of course, it has no effect on governments, nor on commercial deals where the source is not willing to mark the data as "anonymised".
Publish standards (NIST?) for anonymisation. Maybe not so much specific algorithms as principles. For example, if identifiers are to be replaced by meaningless numbers, the identifier-to-number mapping must change more frequently than an adversary is likely to be able to gather enough data to de-anonymise. These would have to be based on research. For example, based on the research in the article, a database of tweets might need to change the mapping of the profile name every 29 tweets, or something. Or a database of ANPR data showing traffic movements might have to change the vehicle pseudo-identity every hour.

These two steps would also have to be accompanied by greater public awareness of de-anonymisation. The legal route is particularly important in making sure that companies cannot claim something is "anonymised" unless there are ways for the data subjects to actually enforce it.
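The mapping-rotation principle from the second route could look roughly like the sketch below. The class name, the `rotate_every` parameter, and the profile name are hypothetical; the figure of 29 is taken from the comment above and is not a validated threshold.

```python
import secrets
from collections import defaultdict


class RotatingPseudonymizer:
    """Replace identifiers with random tokens, issuing a fresh token
    after every `rotate_every` records for the same identifier."""

    def __init__(self, rotate_every):
        if rotate_every < 1:
            raise ValueError("rotate_every must be at least 1")
        self.rotate_every = rotate_every
        self._counts = defaultdict(int)  # records seen per real identifier
        self._tokens = {}                # current token per real identifier

    def pseudonym(self, identifier):
        # A new random token is drawn for the first record and again
        # each time the per-identifier count reaches a multiple of rotate_every.
        if self._counts[identifier] % self.rotate_every == 0:
            self._tokens[identifier] = secrets.token_hex(8)
        self._counts[identifier] += 1
        return self._tokens[identifier]


pseudonymizer = RotatingPseudonymizer(rotate_every=29)
tokens = [pseudonymizer.pseudonym("@some_profile") for _ in range(60)]
print(len(set(tokens)), "distinct pseudonyms for 60 records")
```

Rotation only limits how many records share one pseudonym. It does nothing about the content of the records themselves, so it would still need to be paired with the legal route described above.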

Anonymous Coward

January 27, 2017 at 8:11 am

It is not just Internet information that is currently being used. Quintiles/IMS gets prescribing information from pharmacists concerning which prescriptions are filled. They anonymize that information and resell it to pharmaceutical companies and anyone else who wants to buy it. Then that information is sent to other companies to append personal information to the records. The hit rate of a match is pretty good, otherwise no one would buy it. Ask Quintiles/IMS about this and you won’t get a very clear answer about exactly what they do.

Anonymous Coward

January 27, 2017 at 8:13 am

Another example is the public information of clinical trials. Personal information is “removed” but it is not hard to run the information supplied on an “anonymized” record and match it up with a person, especially on smaller clinical trials.

Anonymous Coward

January 27, 2017 at 12:03 pm

Differential Privacy

https://en.wikipedia.org/wiki/Differential_privacy

(Not the best wikipedia article; better to read Dwork’s papers & watch her videos on YouTube.)

Given a number of queries, n, DP fuzzes the dataset in such a way that

Anonymous Coward

January 27, 2017 at 12:05 pm

Re: Differential Privacy

“Given a number of queries, n, DP fuzzes the dataset in such a way that”

It looks like putting in a “less than” character will end the comment.

I’m not going to waste any time providing high quality feedback if your web page arbitrarily truncates the comments.

frank87 (profile)

January 28, 2017 at 3:04 am

33 bits of entropy

frank87 (profile)

January 28, 2017 at 3:06 am

33 bits of entropy

Is enough to identify a person. That’s less than the entropy in this story.
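The arithmetic behind the 33-bit figure is short enough to check directly. The population number is an assumed rough 2017 estimate, and the birthday example is only an illustration of how quickly ordinary facts add up.

```python
import math

WORLD_POPULATION = 7_500_000_000  # assumption: rough 2017 estimate

# Bits needed to give every person on Earth a distinct label.
bits_needed = math.log2(WORLD_POPULATION)
print(f"{bits_needed:.1f} bits to single out one person")


def bits_revealed(fraction_of_people):
    """Information gained from a fact shared by this fraction of people."""
    return -math.log2(fraction_of_people)


# Example: a day-of-year birthday narrows the field to roughly 1 in 365.
birthday_bits = bits_revealed(1 / 365)
print(f"{birthday_bits:.1f} bits from a birthday alone")
```

Since 2^33 is about 8.6 billion, 33 independent yes/no facts are enough in principle to distinguish everyone alive.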

historygeek (profile)

January 28, 2017 at 10:35 pm

All of today’s major web browsers collect and give out specific information about the computer being used. Not just things like the MAC address which is necessary for the current internet protocols but also what your operating system is. Geographical locators are turned on by default to optimize search results and simplify mapping. And now many websites take a “portrait” of the icons on your desktop and their arrangement as well as a list of all the programs installed on your system, which is about as unique as a fingerprint. Furthermore, a recent study showed that individual computer users could be reliably identified by the patterns formed by the routine movements they made with a computer mouser [as shown by the travel of the cursor across the screen]. Unless you are making serious, consistent efforts to hide your online behaviour you are always personally identifiable. This is the standard state of affairs.

historygeek (profile)

January 28, 2017 at 10:37 pm

Re: Re:

Computer mouser was a typo. Ironically it could be used as a term for the work of the tracking technologies described.


One More Time With Feeling: 'Anonymized' User Data Not Really Anonymous

from the we-can-see-you dept

Comments on “One More Time With Feeling: 'Anonymized' User Data Not Really Anonymous”
