It Took Just 5 Minutes Of Movement Data To Identify 'Anonymous' VR Users

from the no-such-thing-as-anonymous dept

As companies and governments increasingly hoover up our personal data, a common refrain to keep people from worrying is the claim that nothing can go wrong because the data itself is "anonymized" -- or stripped of personal identifiers like social security numbers. But time and time again, studies have shown that this is cold comfort, given that it takes only a little effort to identify a person by cross-referencing other data sets. Yet most companies, many privacy policy folk, and even government officials still like to act as if "anonymizing" your data means something.

The latest case in point: new research out of Stanford (first spotted by the German website Mixed) found that just five minutes of VR users' movement data was enough for researchers to identify them in the real world. The paper says participants using an HTC Vive headset and controllers watched five 20-second clips from a randomized set of 360-degree videos, then answered a set of questions in VR that were tracked in a separate research paper.

The movement data (including height, posture, head movement speed and what participants looked at and for how long) was then plugged into three machine learning algorithms, which, from a pool of 511 participants, were able to correctly identify 95% of users "when trained on less than 5 min of tracking data per person." The researchers went on to note that while VR headset makers (like every other company) assure users that "de-identified" or "anonymized" data will protect their identities, that's really not the case:

"In both the privacy policy of Oculus and HTC, makers of two of the most popular VR headsets in 2020, the companies are permitted to share any de-identified data,” the paper notes. “If the tracking data is shared according to rules for de-identified data, then regardless of what is promised in principle, in practice taking one’s name off a dataset accomplishes very little."
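To make the point concrete, here is a minimal sketch of how re-identification from movement features can work. It is not the Stanford team's method: the data is synthetic, the two features (height and head-turn speed) and every function name are hypothetical, and a simple nearest-centroid match stands in for the three machine learning algorithms the paper used. The idea is only that if a person's traits are stable across sessions, a nameless session can be matched back to a known profile.

```python
import math
import random
from collections import defaultdict


def make_sessions(n_users=20, sessions_per_user=10, seed=0):
    """Build synthetic sessions as (user_id, (height_cm, head_speed)) pairs.

    Assumption: each user has stable traits, and each session measures
    them with a little noise. None of this is real VR data.
    """
    rng = random.Random(seed)
    sessions = []
    for user in range(n_users):
        height_cm = 150.0 + 2.0 * user
        head_speed = 30.0 + 3.0 * ((user * 7) % n_users)
        for _ in range(sessions_per_user):
            features = (rng.gauss(height_cm, 0.3), rng.gauss(head_speed, 0.5))
            sessions.append((user, features))
    return sessions


def split_sessions(sessions):
    """Alternate each user's sessions between a training and a test set."""
    seen = defaultdict(int)
    train, test = [], []
    for user, features in sessions:
        (train if seen[user] % 2 == 0 else test).append((user, features))
        seen[user] += 1
    return train, test


def centroids(train):
    """Average each user's training sessions into one profile vector."""
    grouped = defaultdict(list)
    for user, features in train:
        grouped[user].append(features)
    return {
        user: tuple(sum(column) / len(column) for column in zip(*rows))
        for user, rows in grouped.items()
    }


def identify(features, profiles):
    """Return the user whose profile is closest to an unlabeled session."""
    return min(profiles, key=lambda user: math.dist(features, profiles[user]))


def reidentification_accuracy(train, test):
    """Fraction of held-out sessions matched back to the right user."""
    profiles = centroids(train)
    hits = sum(identify(features, profiles) == user for user, features in test)
    return hits / len(test)


if __name__ == "__main__":
    train, test = split_sessions(make_sessions())
    accuracy = reidentification_accuracy(train, test)
    print(f"Re-identified {accuracy:.0%} of held-out 'anonymous' sessions")
```

Note that the test sessions carry no name at all when they are matched; the label is only used afterward to score the guess. Stripping the identifier does nothing if the remaining measurements are themselves distinctive.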

If you don't like this study, there's just an absolute ocean of research over the last decade making the same point: "anonymized" or "de-identified" doesn't actually mean "anonymous." Researchers from the University of Washington and the University of California, San Diego, for example, found that they could identify drivers based on just 15 minutes’ worth of data collected from brake pedal usage alone. Researchers from Stanford and Princeton universities found that they could correctly identify an "anonymized" user 70% of the time just by comparing their browsing data to their social media activity.

The more data that's available to researchers (or corporations or governments), the easier it is to identify you. And with hacks, data leaks, and breaches dumping an endless ocean of existing datasets into the public domain, and no serious rules of the road governing things like the collection of location and other sensitive data, it shouldn't be too hard to see how the idea of "privacy" is a myth. Especially if the company is, say, Facebook, which is now tying your entire online Facebook experience to VR whether you like it or not.

It's all something to keep in mind for whenever the U.S. gets off its ass and finally crafts a meaningful privacy law for the internet era. Especially given that "don't worry, your data is anonymized!" will be an endless refrain by industry as they try to ensure any rules are as feeble as possible.


Filed Under: anonymity, anonymized data, data, de-anonymized, vr


Reader Comments



  • icon
    Upstream (profile), 9 Nov 2020 @ 7:16am

    Don't hold your breath

    ...for whenever the U.S. gets off its ass and finally crafts a meaningful privacy law for the internet era.

    This may never happen because:

    The more data that's available to researchers (or corporations or governments), the easier it is to identify you.


    • identicon
      Anonymous Coward, 9 Nov 2020 @ 8:21am

      Re: Don't hold your breath

      no serious rules of the road governing things like the collection of location and other sensitive data.

      Will never happen. There's too much money to be made for the corps to stop doing it on their own. Even if there wasn't, governments around the world would ensure that the price would go up to keep the tap wide open. If for nothing else so that governments can have their scapegoat when the plebs get angry about it, like they do every once in a blue moon.

      If you were actually serious about stopping it, the first thing would be a general ban on data collection in consumer products. No, I don't care about the "experience" needing to be optimized, or development feedback. You idiots got greedy and now it's time to take your toys away.

      Another rule: General ban on using consumer devices to auction off eyeballs. That should have never been permitted in the first place. Using the visitor's browser as a bot to make money should have died a quick death. Both in how much it slows down page loading and the fact the resources are being stolen from them without compensation. I don't think everyone visiting Walmart in-person would consent to whoring themselves out for 5-30 minutes (because humans are slower than computers) to advertisers at the edge of each aisle just to enter one. No one should be allowed to demand that of virtual visitors.

      You can't do anything about the data that is already out there, as such there's no ban on selling the info they already have. It would be a never ending endeavor for the courts to try and stop it. But the data collection can be observed, and therefore stopped, on consumer devices.

      the idea of "privacy" is a myth.

      It's only a myth because people don't give a shit about their own safety. Let alone anyone else's. They don't care if someone else is in that photo they sent to Facebook. They are selfish and believe that the person should be honored to be seen on their account. They don't care if someone could use all of the tweets they've made to figure out their lifestyle and falsely portray themselves to take advantage of them. They have to post that location update. My employees don't wanna pay for tracking their location, audio / video recordings, and time using my workplace app? Well, I need better employees then. Poisoning? Nope, they have to post that food pic. Theft? Nope, gotta post about the fact they left the front door unlocked and how funny it is. Rape? Nope, gotta post about going out to get drunk right now, at this specific bar, alone. Murder? Nope they actually took the damn cellphone with them and actually asked Siri where to bury the body, how to destroy evidence, etc.

      Go ahead and try to avoid these things where you have input, you'll find out just how selfish society really is. Hell, some of them may even try to re-educate you, or worse, punish you over it.


  • identicon
    Bruce C., 9 Nov 2020 @ 8:03am

    The irony of it all...

    On the internet, everyone who gets your data stream knows you're a dog.

    I'm still trying to decide whether law enforcement should have access to this kind of info. On the one hand, it's a huge government intrusion; on the other hand, it would (eventually) allow courts to be more strict about data for probable cause and search warrants and get rid of the "my years of experience" justification when not backed by data.


    • icon
      That One Guy (profile), 9 Nov 2020 @ 12:14pm

      Re: The irony of it all...

      Much more likely the collection and use of such data would be normalized, and that would be used on top of 'years of experience', so no, they definitely shouldn't have that data.


  • icon
    Koby (profile), 9 Nov 2020 @ 8:05am

    Misleading Marketing

    The term "anonymized" is shaping up to be another deceptive term, akin to "unlimited data plan" or "all-natural". As soon as someone starts trying to sell you on this, know that they're probably trying to con you into giving up your privacy.


  • icon
    That One Guy (profile), 9 Nov 2020 @ 12:12pm

    'You first.'

    Anyone who tries to argue that data like that is anonymous or not a privacy concern because it will be 'anonymised' should be presented with a 'put up or shut up' challenge, where they either admit that they're wrong or lying or have their data given that treatment and then pored over by a third party to see how much they could learn about them from 'anonymous' data, which would show that they are wrong or lying.


  • identicon
    Yes, I know I'm commenting anonymously, 10 Nov 2020 @ 8:22am

    One Simple Rule

    Anonymized data is worthless (i.e. no one wants to pay for it), therefore sold data will always be de-anonymizable.


