New Study Shows Anonymous Data Isn't Very Anonymous At All

from the hear-that? dept

We've pointed out time and time again that there's really no such thing as an anonymized dataset. Given the data, it's almost always easy enough to at least connect some of it back to a real person. It looks like there's now some research to support that. Steven Hoy points us to a new paper where some researchers wrote an algorithm that takes anonymized data from social networks and connects it back to names and addresses of individuals:
We present a framework for analyzing privacy and anonymity in social networks and develop a new re-identification algorithm targeting anonymized social-network graphs. To demonstrate its effectiveness on real-world networks, we show that a third of the users who can be verified to have accounts on both Twitter, a popular microblogging service, and Flickr, an online photo-sharing site, can be re-identified in the anonymous Twitter graph with only a 12% error rate.
Basically, the researchers are saying that anonymized data isn't really anonymous -- and social networks that insist they're "safe" because they've anonymized the data are being somewhat disingenuous.
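
For a rough feel of why graph structure alone can give people away, here is a minimal sketch in Python. It is not the paper's algorithm, just a toy version of the general idea: start from a few accounts already linked across two sites and grow the match outward through shared friends. The function name `reidentify`, the `min_score` threshold, and the tiny made-up graphs are all assumptions for illustration.

```python
from collections import defaultdict


def reidentify(anon_graph, known_graph, seeds, min_score=2):
    """Toy greedy matcher (hypothetical, not the paper's method).

    anon_graph, known_graph: dict mapping node -> set of neighbours
    seeds: dict mapping anonymous node -> known node, already linked
    min_score: minimum number of matched friends needed to accept a guess
    """
    mapping = dict(seeds)
    changed = True
    while changed:
        changed = False
        used = set(mapping.values())
        for anon_node, anon_nbrs in anon_graph.items():
            if anon_node in mapping:
                continue
            # Each already-matched friend "votes" for its own friends
            # in the known graph as candidates for anon_node.
            scores = defaultdict(int)
            for nbr in anon_nbrs:
                if nbr in mapping:
                    for cand in known_graph.get(mapping[nbr], ()):
                        if cand not in used:
                            scores[cand] += 1
            if not scores:
                continue
            ranked = sorted(scores.items(), key=lambda kv: (-kv[1], str(kv[0])))
            best, best_score = ranked[0]
            runner_up = ranked[1][1] if len(ranked) > 1 else 0
            # Accept only a clear winner with enough supporting friends.
            if best_score >= min_score and best_score > runner_up:
                mapping[anon_node] = best
                used.add(best)
                changed = True
    return mapping


# Made-up "named" graph (think of the site where identities are known).
known = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "carol", "dave"},
    "carol": {"alice", "bob", "dave"},
    "dave": {"bob", "carol", "erin"},
    "erin": {"dave"},
}
# The same friendships with the names stripped off.
anon = {
    "u1": {"u2", "u3"},
    "u2": {"u1", "u3", "u4"},
    "u3": {"u1", "u2", "u4"},
    "u4": {"u2", "u3", "u5"},
    "u5": {"u4"},
}
seeds = {"u1": "alice", "u2": "bob"}  # two accounts already linked

result = reidentify(anon, known, seeds)
print(result)
```

In this toy run, two seed accounts are enough to pin down two more users (`u3` and `u4`), while `u5`, with only one connection, stays unmatched. Real networks are far noisier, so treat this as an illustration of the principle rather than a measure of how well such an attack works.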

Reader Comments

  1.
    Weird Harold, Mar 27th, 2009 @ 3:02pm

    Nowhere to hide!


    You insipid little runts will pay for all the mean-spirited things you've been saying about me. It looks like you can't hide behind your little netwalls after all.

    I'll see you in hell, punks!


  2.
    some old guy, Mar 27th, 2009 @ 3:04pm

    Re: Nowhere to hide!

    Why do I suddenly have the urge to register ""?

  3.
    TheStuipdOne, Mar 27th, 2009 @ 3:11pm

    Isn't that difficult

    Take me on this site, for instance. I acknowledge that an IP address isn't anonymous: even if you use only the geographic information from the IP, you can get a pretty good idea of who I am.

    When I post on this site from work, the IP logging would see an IP from corporate headquarters (I think). When I post from home, you'd see the IP from my home. Knowing from my posts what profession I'm in, it wouldn't be too difficult to find a company headquartered in area A with an office in area B that does the type of work I claim to be doing.

    So while this example doesn't give my name and address, it does reveal more information than I've posted to this site to anyone who cares to look.

  4.
    Frau Farbissina, Mar 27th, 2009 @ 3:15pm

    Re: Nowhere to hide!

    Send in the CLONES!!

  5.
    Weird Harold, Mar 27th, 2009 @ 3:16pm

    Re: Re: Nowhere to hide!

    It's the weekend, I guess the children are home from school. Hopefully mommy will tuck him in to bed at 8 and he will go away, perhaps distracted by a shiny object or something.

  6.
    ehrichweiss, Mar 27th, 2009 @ 3:36pm

    Re: Isn't that difficult

    It's more than that though.

    What they found was that people tend to have the same groups of friends, even when they're "anonymous", and that by analyzing the groups of friends on different social sites, combined with a few other telltale signs, you can narrow down who they are about a third of the time.

    I should have published on this years ago, because I've hunted people down in this manner for ages. It's not that difficult, but it does require patience and some ability to use logic, as well as a keen understanding of human nature. Each piece of information is a stepping stone to the next, and eventually you reach the ultimate goal.

  7.
    Heird Warold, Mar 27th, 2009 @ 4:26pm

    Re: Nowhere to hide!


  8.
    AOL, Mar 27th, 2009 @ 4:56pm


    Your anonymized dataset is safe with us.

  9.
    AllXClub, Mar 27th, 2009 @ 5:54pm

    Anonymous Data

    Your data is safe with us. AllXClub is totally confidential, and private. You can read more about allxclub and how important the confidentiality is at or more not so confidential information about the new mlm allxclub at Why pay to play, when you can play and get paid?

  10.
    Anonymous Coward, Mar 27th, 2009 @ 6:06pm

    Re: Anonymous Data

    Please ignore the spam


  11.
    Twin, Mar 29th, 2009 @ 1:12am


    Well, *duh*.

  12.
    Twin, Mar 29th, 2009 @ 1:13am


    Post 10, I think post 9 was in the "satire" category...

  13.
    Ray, Mar 29th, 2009 @ 7:28pm


    It appears, and I say 'appears' advisedly since they have not posted the full results, that the more sites you post on that sell your "anonymized" information, the better the chance that you can be matched up with what you might consider your private data.

    There's really nothing new about this if that is true; it is a standard spy methodology for identifying what is going on someplace: just get a lot of data points and see what the pattern is. I first saw the effects of that back in the late '70s, when the monitoring of CB radios (about half the unit used them) and the telephones gave away the supposedly secret plans for a military training operation that most of us only knew tiny pieces of prior to the exercise. Or, in the late '80s, a security test group identified what was going on in a supposedly secret building by using license plates, normal phone listening, and hanging around local businesses people went to for lunch and drinks.

    So the next question is, what happens if you use a different IP address (not the same company/town, but a different company which has a different town listed) and a different user name for each social site? I think (but would not bet on it) that it would be much harder to cross-reference without analyzing postings carefully over a long period of time, though not impossible, since most people have unique habits that act like a signature.

  14.
    Anonymous Coward, Mar 29th, 2009 @ 7:52pm

    Below you will find the real names, addresses, phone numbers, and adjusted income for the 13 posters above.
    1. XXXX
    13. XXX

    [deleted by moderator]

