That One Guy

'Our arguments don't apply to you though, honest...'

The best part of the government making that argument is that at least one of the USG agencies (and several others, I’m sure) knows full well that you can take bits and pieces of otherwise harmless data and combine them into something not so harmless, which would make it extra rich should they try the ‘it’s just metadata/anonymized data’ argument.
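
To make the "bits and pieces" point concrete, here is a minimal sketch of a linkage attack: joining a name-free dataset to a harmless-looking public list on shared fields. Every record, name, and field below is invented for illustration, and the choice of ZIP code, birth year, and sex as the joining fields is an assumption, not something taken from the article.

```python
# Toy records, invented purely for illustration.
# An "anonymized" release: names removed, quasi-identifiers kept.
anonymized_visits = [
    {"zip": "02139", "birth_year": 1984, "sex": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth_year": 1990, "sex": "M", "diagnosis": "flu"},
    {"zip": "94110", "birth_year": 1984, "sex": "F", "diagnosis": "migraine"},
]

# A separate, seemingly harmless public list that does carry names.
public_directory = [
    {"name": "Alice Example", "zip": "02139", "birth_year": 1984, "sex": "F"},
    {"name": "Bob Example", "zip": "02139", "birth_year": 1990, "sex": "M"},
    {"name": "Carol Example", "zip": "94110", "birth_year": 1984, "sex": "F"},
    {"name": "Dana Example", "zip": "94110", "birth_year": 1984, "sex": "F"},
]

# Hypothetical choice of fields that both datasets happen to share.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def link_records(anonymized, directory, keys=QUASI_IDENTIFIERS):
    """Return {name: anonymized_record} for records that match exactly one person."""
    by_key = {}
    for person in directory:
        by_key.setdefault(tuple(person[k] for k in keys), []).append(person["name"])

    reidentified = {}
    for record in anonymized:
        candidates = by_key.get(tuple(record[k] for k in keys), [])
        if len(candidates) == 1:  # a unique match means the record is no longer anonymous
            reidentified[candidates[0]] = record
    return reidentified


linked = link_records(anonymized_visits, public_directory)
for name, record in linked.items():
    print(name, "->", record["diagnosis"])
```

In this toy data two of the three "anonymous" records resolve to a single name. The third stays ambiguous only because two people share the same three fields, and one more shared field would likely split them.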

Anonymous Coward

July 30, 2019 at 6:50 am

Project Insight requires insight…

Anonymous Anonymous Coward (profile)

July 30, 2019 at 6:58 am

Inject false data

Wouldn’t the best way to muddy the ‘anonymized’ data be to irregularly, but often, do something entirely out of character for yourself? Maybe someone could make an app for that?

The hardest part might be finding out what your ‘characteristics’ are so that doing something ‘out of character’ can be determined (denial of what ‘your’ characteristics are will be a problem). The second piece, the list of things that are ‘out of character’ for you, but still acceptable (which might be a characteristic that in the end ‘outs’ you) would also be hard. But sending data that you did something you didn’t actually do might actually make acceptance irrelevant.

And, since we don’t seem to do random very well, making ‘random’ injections might just provide sufficient information to identify individual ‘randomness’ and separate that from other identifiable data points.

Oh well. Maybe it would just be best to not let anyone have the data to begin with, but Pandora’s box has been opened and I cannot think of any way of putting what has escaped back in the box. That is likely not possible, and banning future collections might just be futile.

Bruce C.

July 30, 2019 at 7:01 am

Anonymity

People used to (and maybe still) call the TSA "Security theater": action designed to give the appearance of security, without actually accomplishing much.

Current anonymization techniques should be called "anonymity theater". They give the appearance of anonymity to the average person, while still allowing companies to correlate the results with their entire history as well as other data gathering services.

Anonymous Coward

July 30, 2019 at 7:58 am

Re: Personally Identifiable Data

Anonymous Coward

July 30, 2019 at 8:12 am

Re: Re: Personally Identifiable Data

The more common term that people encounter is "Personally Identifiable Data" (PID) in Terms of Service agreements for various software, apps, services, memberships, etc.

Companies often claim that no PID is collected, stored, shared … or at least that it is very strictly secured for highly limited, absolutely necessary purposes.
Such TOS claims are usually false or misleading. Few people read them anyway.

Mason Wheeler (profile)

July 30, 2019 at 7:18 am

Another study looking at vehicle data found that 15 minutes’ worth of data from just brake pedal use could lead them to choose the right driver, out of 15 options, 90% of the time.

This example seems a bit "lying with statistics" to me. Simply because… how often are there only 15 people to pick from in datasets like this? How often could they use this information to choose the right driver out of 15 million options?

TripMN

July 30, 2019 at 7:41 am

Re: I think you missed the point

The point wasn’t to show you that they could pick out the correct person out of 15 correctly 90% of the time, but that we have personally identifiable characteristics even in as small a task as braking. Now think about all of the other things you do that you now have a sensor and micro-computer attached to that is taking very precise readings about you and giving them to someone else to use. Even anonymized, these readings do a very good job of "fingerprinting" you.
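
As a rough illustration of what "fingerprinting" means here, the sketch below matches an unlabeled trip to the closest of a few known braking profiles. It is not the method used in the study quoted above. The two features (mean pedal pressure and mean seconds per stop), the driver names, and all numbers are hypothetical.

```python
import math

# Hypothetical per-driver profiles: (mean pedal pressure, mean seconds per stop).
# All numbers are made up for illustration.
known_profiles = {
    "driver_a": (0.30, 4.0),
    "driver_b": (0.55, 2.5),
    "driver_c": (0.80, 1.5),
}


def summarize(samples):
    """Reduce raw (pressure, duration) braking events to a mean feature vector."""
    n = len(samples)
    return (sum(p for p, _ in samples) / n, sum(d for _, d in samples) / n)


def closest_driver(samples, profiles=known_profiles):
    """Return the known driver whose profile is nearest to the unlabeled trip."""
    trip = summarize(samples)
    return min(profiles, key=lambda name: math.dist(trip, profiles[name]))


# An "anonymized" trip: no name attached, only braking events.
unlabeled_trip = [(0.52, 2.7), (0.60, 2.4), (0.50, 2.6)]
print(closest_driver(unlabeled_trip))
```

A real attack would need far more features and data, but the shape is the same: once someone holds a labeled profile for each candidate, stripping the name off a new trip does little.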

Michael (profile)

July 30, 2019 at 7:54 am

Re: Re:

This is why I like to drive in the left lane on the highway while holding my brake pedal just enough to keep my brake lights on.

James Burkhardt (profile)

July 30, 2019 at 7:55 am

Re: Re:

How often are 15 million people the likely driver of a single vehicle?

The quote you cite is right after a discussion about combining leaked/stolen data pools. That we can use as random a data point as braking pattern to narrow and refine de-anonymization efforts is important. While it is true that a 15-person pool in isolation seems like a restriction that limits the value of this finding, I was struck by how likely someone with braking data is to know what car the data came from. I was also struck, reading that data point, by how, with enough data, the braking patterns could be used to determine who was driving a car and when. Then, if they had access to that car’s location from LPR databases, or perhaps the car’s LoJack GPS, they would know intimate details about each individual sharing that vehicle.

Zof (profile)

July 30, 2019 at 7:46 am

You can do the same thing back to them

You can apply the same logic to find the K Street PR firms that Boeing and Warren Buffett hired to get hate stories on Elon Musk pushed to tech sites.

You can use a VPN service to see how Google presents 15 different versions of Google News biased 15 different bizarre ways depending on what country you are in, instead of just telling the truth. You can watch for patterns in the bias to notice "enhanced news" as I like to call it. You can also see full fake campaigns being used to direct folks to websites that have only existed a day or two. You can see all of this because Google won’t mess with their own Trends. So you can see terms and conventions being thrown around by the Media at times that claim to be ancient but have existed for days, not years. You can literally observe the man behind the curtain.

Start using their logic to look where they don’t want you to look.

James Burkhardt (profile)

July 30, 2019 at 8:20 am

Re: You can do the same thing back to them

As noted by Techdirt, search is inherently biased. The whole reason Google got popular over Yahoo was that Google was better at showing you results you actually wanted to see. The algorithm is designed to bias towards current events and larger websites and news sources, because the data it gets from use indicate those results are what people are searching for.

Hell, Bing ran its entire initial ad campaign on the idea that it was better than Google at biasing its search results to websites relevant to your request.

I heard the term Zero-Day from Scorpion, thought it was a stupid term, and suddenly everyone was using it. It is a phenomenon that has been described for a long time: seemingly new terms erupt into your awareness and are suddenly everywhere. As well, academic terms often explode into public consciousness very suddenly. They can be old terms that only gained widespread usage recently. Millennial, for instance. A lot of Gen X’ers originally heard the term used by Marketing departments – the uninspired Gen Y – and assumed that is what everyone was using. But academics had coined the term Millennial around the same time. So I have spent 5 years listening to them discover that Millennial describes people born before 2000, not after, and angrily wondering when they ‘changed it’. Of course, marketing may have changed the term they used, but it was a term in long use in academic circles. They just assumed who Millennials were, much as Gen X and the MTV generation were being used by baby boomers to describe children born well into the 90s. Just because a term didn’t trend doesn’t mean it didn’t exist. Occam’s Razor.

Similarly, both Democratic Socialist and Social Democrat have been academic terms for a while, but have only recently become general-use terms, as we needed a way to differentiate from the socialist boogeyman of the Republican Party. They are often misused or misunderstood when compared to their academic roots, as most words from Academia are.

Your lack of citation for your claims makes your screed here look more like a conspiracy theory than a genuine claim. Perhaps show me a full fake campaign (what does that even mean?), or a term you think was just created out of thin air by [Google? The Media? It’s unclear from your rant] that Google somehow is hiding by letting us view the trend data. Then I have something to sink my teeth into.

James Burkhardt (profile)

July 30, 2019 at 8:43 am

Re: Re: You can do the same thing back to them

Submitted missing this part:

And none of this really follows from your subject. The closest thing you have to de-anonymizing data sets is your comment about backtracking a PR firm – but not actually backtracking from there to Boeing or Buffett, which would be the actual comparison to make.

Toom1275 (profile)

July 30, 2019 at 8:43 am

Re: You can do the same thing back to them

But can you see why kids love Cinnamon Toast Crunch?

Anonymous Coward

July 30, 2019 at 8:48 am

Re: You can do the same thing back to them

Wait, you’re telling me that different countries get different news results! Wow! Where did you discover this incredible conspiracy?

Anonymous Coward

July 30, 2019 at 8:49 am

Re: You can do the same thing back to them

You can use a VPN service to see how Google presents 15 different versions of Google News biased 15 different bizarre ways depending on what country you are in, instead of just telling the truth.

The idea that presenting different stories to people from different countries is some kind of bizarre and deceitful bias is itself bizarre. People have always cared more about local news. If it’s something more sinister than that, you haven’t made your point.

Anonymous Coward

July 30, 2019 at 9:12 am

Re: Re: You can do the same thing back to them

Alternatively, consider what happens if Google and other media outlets only push out one truth.

You can bet your back teeth Zof would piss himself.

Anonymous Coward

July 30, 2019 at 9:25 am

Re: Re: You can do the same thing back to them

Google does not write news stories, and different countries have different news sources, which report the news in slightly different ways. Also, they have different priorities as to what is most newsworthy.

However what would be frightening is if Google news presented the same stories from the same sources in the same order in every country.

Anonymous Coward

July 30, 2019 at 10:15 am

Re: Re: Re: You can do the same thing back to them

However what would be frightening is if Google news presented the same stories from the same sources in the same order in every country.

Certainly it could be useful to let people select their country, and one could argue that it would be more transparent to skip the automatic geolocation and just ask everyone which area they want to see news for. But "Google News uses geolocation!" is just about the weakest Google-conspiracy I’ve ever seen.

Anonymous Coward

July 31, 2019 at 3:15 pm

Re: Re: Re: You can do the same thing back to them

Almost like if some global broadcaster had every local news firm make the same "local" presentation, and then someone simulcast 100 of them at the same time showing the ‘local’ presentation from 100 different channels…

I mean nothing ‘creepy’ about that now is there?

Anonymous Coward

July 30, 2019 at 1:12 pm

Re: You can do the same thing back to them it’s called zoffing

And you can lie 15 different ways and get debunked just as much because you are dangerously addicted to failure.

Anonymous Coward

July 30, 2019 at 1:56 pm

Re: Today’s trending word is: Extremely

“Start using their logic to look where they don’t want you to look.”

And that place is a massively popular search tool? Do you realise how stupid and paranoid you sound?

Anonymous Coward

July 31, 2019 at 8:34 am

Re: Re:

Do tell how providing relevant local search results to users in other countries is somehow a conspiracy.

Weren’t you leaving?

TasMot (profile)

July 30, 2019 at 8:41 am

So, let’s see. Facial recognition is only accurate 20% of the time in identifying an individual (leading to lots of false arrests because LEOs don’t do any other checking). BUT, anonymized data available on the Internet is accurate 98.98% of the time. Why doesn’t law enforcement start using this instead?

Anonymous Coward

July 30, 2019 at 8:45 am

hmmmm....

if only there were a way to license my private data, including my face and physical image…

TasMot (profile)

July 30, 2019 at 8:53 am

Re: hmmmm....

There is, if you take the picture (such as a selfie) and don’t give away a license to it to the likes of Instagram, Facebook, Google, or anyone else. Then, they would need to license it. However, just walking down the street and letting them take your picture won’t do it. You’ll need to just start always walking in public wearing a "Guy Fawkes" mask.

Anonymous Coward

July 30, 2019 at 9:00 am

But we’re not supposed to be able to get THIS information because governments everywhere want us to think we are safe, that no one, in particular all governments, can get our data, let alone have any company on the Planet get it and sell it so money can be made by anyone/everyone except us, and privacy and safety go straight out the window!

Anonymous Coward

July 30, 2019 at 9:43 am

Re: Re:

The government has almost nothing to do with this. It’s really all about corporate liability. I’m oversimplifying this to make a point, but it goes something like this.

Person: I’m suing big corp for misusing my personal data.
Big Corp: We used industry standard anonymization.
Judge: Industry Standard. Case dismissed.

The problem here is that "everyone else is doing it" is a valid defense against negligence. No negligence means no liability. Most businesses only care about the liability. That’s why computer systems are so insecure. So long as a company follows industry standard practices, they really don’t have to worry about what actually happens, they won’t be liable.

Many companies use things like Symantec. It doesn’t matter if the "security" software is so insecure that it causes customer credit card numbers to be exposed. It matters that the company doesn’t have to worry about liability, because everyone is doing it.

bob

July 30, 2019 at 2:10 pm

Re: Re: Re:

They also don’t have to worry about liability if they have a 3rd party responsible for security and not the business. Another tactic is for the company to have insurance to cover the costs if an incident does occur. So as long as the premiums aren’t too high and a company is covered, they don’t care what happens.

ECA (profile)

July 30, 2019 at 12:06 pm

I’ve told many..

about the tricks and problems with Their computer devices..
Enter your name, address, phone, CC#, SS#..
At ANY TIME…and the browser will give it to anyone with clearance. MS sold that ability for $99 per year on Browsers.

Chrome recently discovered a flaw in the incognito mode and is revamping it to WORK better.

Bank keeps asking me to use the INTERNET, to access my accounts, and I tell them I know enough about computers to NOT do it..

People don’t get what is happening out there..
Compare sites, and log the names of certain people. In the same groups you will have alt names, but HOW THEY SPELL and form sentences.. will give you a good chance of knowing who they are.
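
A minimal sketch of that idea, comparing how two accounts write rather than what they call themselves. Character trigrams and cosine similarity are one common choice for this kind of comparison, picked here as an assumption. The sample posts are invented, and real stylometry needs much more text than this.

```python
from collections import Counter
import math


def trigram_profile(text):
    """Count overlapping character trigrams, keeping case and punctuation quirks."""
    return Counter(text[i:i + 3] for i in range(len(text) - 2))


def cosine_similarity(a, b):
    """Cosine similarity between two trigram count profiles (0.0 if either is empty)."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Invented sample posts from three hypothetical accounts.
post_site_one = "People dont get WHAT is happening.. they just dont.."
post_site_two = "Corps dont care WHAT you think.. they just dont.."
post_other = "I respectfully disagree; the evidence, in my view, is thin."

same = cosine_similarity(trigram_profile(post_site_one), trigram_profile(post_site_two))
diff = cosine_similarity(trigram_profile(post_site_one), trigram_profile(post_other))
print(same > diff)
```

The two posts sharing spelling and punctuation habits score closer to each other than to the third, even though no username appears anywhere in the comparison.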

A lot of people use the same name ALL OVER the net.. Hack 1 site he belongs to and the password is probably the same everywhere..(at least 50-60%) and if you can get access to his PERSONAL info sheet on a site…name, address, phone,…

All of this, and in the old days…they would collect the sales info you would use in mail order.. PCH is the BIG ONE.. order 2-3 mags and end up getting A TON more from others you never heard of.. And those lists can be bought. Even NOW.. your credit card is tracking you..and Corps can use that data to sell MORE info.
Then comes the fun…what can they do?
Depending on the info, open a bank account with your DATA and a fake picture. And then put you in debt… because they got your SS#..
The odds are, at this time there are 3-4 people in the nation USING your SS# to find work..(I love them doing this for me)..
Think about it..
SS# = 1234-34-2345 IS NOT supposed to be used by anyone except your bank, your work, and you.. And this has been bypassed, even in the past, so that the Corps use it to track you. Now they also use your bank and credit cards..

AND YOUR PHONE..built in GPS….love it. And a phone that COULD tell them everything about you..
Anyone remember Bluetooth hacking??? It still happens.

Anonymous Coward (user link)

July 30, 2019 at 1:47 pm

Once More With Feeling: 'Anonymized' Data Is Not Really Anonymous

To reduce the article to a sound bite:
“Even the most anonymous and seemingly innocuous data element is part of a trail of bread crumbs leading to sensitive and intimately personal data”

any moose cow word

July 30, 2019 at 2:26 pm

The crux of the problem of companies breaking user anonymity is users giving personally identifiable data needed to correlate with other anonymous data sets. What many users fail to realize is that most identifiable data that companies request is completely unnecessary to begin with. Sure, sites may want your real name, birthday, phone number and physical address, but do those companies have any actual need for it? I find the exceptions to be increasingly rare. Unless the company or its users actually need to interact with you personally, they don’t need any of that. In fact, not only can you fake most of the data provided to most sites, you absolutely should!

The vast majority of exceptions were purchases. Before the rise of third-party Internet payment services such as PayPal, purchases used to require providing each site a credit card number. Since those were inherently insecure, payment processors required various identifiable data in an attempt to verify the buyer’s identity and curb fraud. This practice gave many sites a lot of user data, which led to the proliferation of that data and consequently identity theft and fraud. As far as I can tell, PayPal doesn’t check any of that data. They use a completely different set of data points to identify the buyer. If the purchase isn’t for a physical item, then just about all data on the transaction can be fake as well.

One other exception is if you can’t reset an account password or if the account has been compromised: the site or a tech support rep may ask for some of the data you provided in order to verify your ID. While it may seem less inconvenient to use your real data, just in case, identity theft and other breaches of user anonymity can be considerably more inconvenient. Instead, keep an address book of your fake IDs so that you can provide the "correct" one when needed. And of course, secure that book just as well as you do your passwords, as internet hoodlums can use it to compromise your accounts just as if they had the password itself.

ECA (profile)

August 2, 2019 at 3:17 am

Corps know..

Corps know MORE about the people in this country than the Gov. ever has..
They COULD(if they were honest) answer almost any question about the Whole populace..


Once More With Feeling: 'Anonymized' Data Is Not Really Anonymous

from the nothing-to-see-here dept

Comments on “Once More With Feeling: 'Anonymized' Data Is Not Really Anonymous”
