The Story Behind Facebook Threatening To Sue Developer Into Oblivion For Highlighting Useful Facebook Data

from the how-nice-of-them dept

Facebook's lawyers have been getting pretty nasty lately. We recently covered the company's threats against the creator of a useful Greasemonkey script, and now a developer named Pete Warden has shared the sordid details of his legal run-in with Facebook, in which the company threatened to sue him for aggregating publicly available data found on Facebook.

You should read the full story, but basically, he built a simple crawler for public Facebook info, initially for his own purposes. He made sure that Facebook's robots.txt didn't block such crawlers -- and he also emailed someone at Facebook (whom he had dealt with before), but didn't hear back from anyone. As his crawler worked, it started collecting a bunch of interesting data, and so he set up a website to let people explore some of this (again, public) data.

After playing with some of the data himself, he started making some interesting maps and charts with the data, and did a simple analysis of geographic locations of Facebook friend connections to show people what you could do with the data. He noted that if others (such as professional researchers) wanted to dig into the data, he would let them access a version of the data set (with identifying info stripped). The chart he released got picked up by a variety of sites and quickly got passed around.

And that's when the lawyers called:
On Sunday around 25,000 people read the article, via YCombinator and Reddit. After that a whole bunch of mainstream news sites picked it up, and over 150,000 people visited it on Monday. On Tuesday I was hanging out with my friends at Gnip trying to make sense of it all when my cell phone rang. It was Facebook's attorney.

He was with the head of their security team, who I knew slightly because I'd reported several security holes to Facebook over the years. The attorney said that they were just about to sue me into oblivion, but in light of my previous good relationship with their security team, they'd give me one chance to stop the process. They asked and received a verbal assurance from me that I wouldn't publish the data, and sent me on a letter to sign confirming that. Their contention was robots.txt had no legal force and they could sue anyone for accessing their site even if they scrupulously obeyed the instructions it contained. The only legal way to access any web site with a crawler was to obtain prior written permission.

Mathew Ingram reported on the data getting forced down, and got a statement from Facebook that seems to miss the point:
Andrew Noyes, manager of public policy communications at Facebook, said in an email that Warden "aggregated a large amount of data from over 200 million users without our permission, in violation of our terms. He also publicly stated he intended to make that raw data freely available to others." Noyes also noted that Facebook's statement of rights and responsibilities says that users agree not to collect users' content or information "using automated means (such as harvesting bots, robots, spiders, or scrapers) without our permission."

But I still don't see what the legal argument is. At best, I could see them terminating his account for disobeying the terms of service -- but even then the whole thing doesn't make much sense. The data is publicly available and, as Pete notes, it's pretty much standard practice for people to aggregate and analyze such data. However, he also pointed out that he couldn't afford to be a legal test case, and so he gave in and negotiated with Facebook to remove the data.

In the end, though, this shows Facebook's rather schizophrenic view towards data and privacy. On the one hand, it tries to push everyone to open up their info, but then if anyone does anything useful with it, they threaten to sue?


Reader Comments

  1.  
    dave blevins, Apr 7th, 2010 @ 11:15am

    Goodbye Facebook, it wasn't nice to know you.

     


  2.  
    Richard, Apr 7th, 2010 @ 11:28am

    since forever

    weeeelll this is an old battle, remember the aggregators of the late 90's? hotspot err altavista (before Fast bot) and so many others I can't remember. I don't really know what happened with any of those cases except that companies were sued out of business. The search engines (also aggregators) lasted and the others ended up on the penny exchange. I think the tried and true "sued into oblivion" strategy is the real story here. I mean, that's a massive failure of the legal system. It's denying justice to the poor and that's unconstitutional.

     


  3.  
    Anonymous Coward, Apr 7th, 2010 @ 11:30am

    He's lucky he wasn't arrested. Many prosecutors still seem to think that violating a site's terms of service means you're violating federal hacking laws.

    Does data that requires you to login constitute "public" data though? Where's the threshold?

     


  4.  
    Ryan, Apr 7th, 2010 @ 11:31am

    Re:

    Yeah, didn't he just commit federal computer fraud under the Lori Drew law? Or was that only in Missouri (for now)?

     


  5.  
    Anonymous Coward, Apr 7th, 2010 @ 11:42am

    Re:

    thanks to facebook's new policy most of that data is public. not stuff you log in to see, but just go to facebook and you can see it.

    in fact everyone should log out of facebook and google their names to make sure they know what is posted publicly.

     


  6.  
    Anonymous Coward, Apr 7th, 2010 @ 11:46am

    We have reached a point in time where corporate declarations hold the rule of law. When a company says you cannot use Product X to do Action Y, then the courts will say aye, tis the law of the land.

     


  7.  
    Beta, Apr 7th, 2010 @ 11:46am

    I know logic isn't involved, but...

    If Facebook's entire argument is based on his using an automated tool to gather this information, then he could crowdsource it: announce his plan to Facebook users and invite them to contribute information which they collect by hand.

    And by "could", I mean "could have". By which I mean that once he made the announcement, it'd be hard to prove that the data in his possession hadn't come from big crowds of helpful Facebookers.

     


  8.  
    Dave G, Apr 7th, 2010 @ 11:49am

    I saw this yesterday and it really irked me

    I saw this yesterday on another site. I went and read the blog and this really irked me. I wish people would get together and say enough when we see these types of abuses. Some people had arguments that their EULA states you can't spider their site without previous permission, but I say, then don't allow it in your robots.txt file. You can't play the open card, then shut the door when it doesn't serve your purpose. I feel the same way about people who set up rss feeds, then state you cannot use it in an open manner in some blurb on the website, but not in the feed itself.

     


  9.  
    david G, Apr 7th, 2010 @ 11:54am

    Re: I know logic isn't involved, but...

    "And by "could", I mean "could have". By which I mean that once he made the announcement, it'd be hard to prove that the data in his possession hadn't come from big crowds of helpful Facebookers."

    One problem, with today's standard you are guilty until you prove YOU DID NOT get it by other means.

     


  10.  
    John Doe, Apr 7th, 2010 @ 11:58am

    All about control...

    They only want you to open up your info if THEY can control it. They don't want anyone else to have it; you must come to them to get it. If it is that useful, they will want to charge for it.

    Personally I don't believe they have a legal leg to stand on, but our court system is for the rich as the rest of us can't afford to fight.

     


  11.  
    Anonymous Coward, Apr 7th, 2010 @ 11:59am

    Who gives a rat's @$$

     


  12.  
    JackSombra (profile), Apr 7th, 2010 @ 12:14pm

    "On the one hand, it tries to push everyone to open up their info, but then if anyone does anything useful with it, they threaten to sue? "
    The reason is very simple if you ask yourself a simple question, how does facebook make money?

    Via two methods. The obvious one is advertising, the second, not so obvious method is selling info like what was collected by this guy. He was cutting into their revenue stream, hence the trigger happy (but imo toothless) lawyers

     


  13.  
    Dental Chicken, Apr 7th, 2010 @ 12:41pm

    This sounds like something for the EFF to handle.

    If this is not a violation of the FB EULA (which I can't say whether I know it is or not) then it seems to me this is something that falls squarely in the charter of the EFF.

     


  14.  
    imbrucy (profile), Apr 7th, 2010 @ 12:53pm

    Re: Re:

    The woman was from Missouri, but the case was brought in LA (Myspace's location).

     


  15.  
    JP, Apr 7th, 2010 @ 12:58pm

    Trespass to Chattels

    Facebook could use Trespass to Chattels [http://en.wikipedia.org/wiki/Trespass_to_chattels] to win their case. It's been done before. As much BS as it is, I wish the EFF would help.

     


  16.  
    cc, Apr 7th, 2010 @ 1:25pm

    What kind of an argument is, "You can do it, but you can't make a computer do it for you"?

     


  17.  
    Anonymous Coward, Apr 7th, 2010 @ 3:02pm

    Re:

    it is a question of speed and volume, just like a library compared to a torrent. as a single person clicking and making notes you might get a few dozen pieces of information. a bot running 24 hours per day will collect much more data, more than anyone would personally need. scale is key.

     


  18.  
    Anonymous Coward, Apr 7th, 2010 @ 6:38pm

    Re: Re:

    Except...anyone personally studying Facebook's publicly available data, of course.

     


  19.  
    nasch (profile), Apr 8th, 2010 @ 7:58am

    Re: Re:

    We all understand computers can do it faster, but how does that change anything LEGALLY?

     


  20.  
    mark, Apr 8th, 2010 @ 7:59am

    Why don't they...

    ...just hire the guy? I'm baffled.

     


  21.  
    V for Vendetta, Jul 31st, 2010 @ 7:13am

    Facebook data leak - download all files here

    The original work, released a few days ago, was done on a Unix machine, and therefore, used Unix compression, which is woefully inadequate when compared to even WinRAR.

    So I have all the original Facebook data, decompressed them, and tested three Windows-based compressors - WinRAR won out (the other contestants were 7-Zip and WinZIP)

    The original data are merely huge text files, and came in at a hefty 15GB. With WinRAR, I was able to get that to just a bit over 2GB.

    If you would like the files, you can download them yourselves from RS, much faster I suspect than from a torrent. Here are the links:

    http://rapidshare.com/files/409949014/Facebook.repacked.part01.rar
    http://rapidshare.com/files/409947525/Facebook.repacked.part02.rar
    http://rapidshare.com/files/409947812/Facebook.repacked.part03.rar
    http://rapidshare.com/files/409997211/Facebook.repacked.part04.rar
    http://rapidshare.com/files/409997597/Facebook.repacked.part05.rar

    [b]YOU MUST DOWNLOAD all five files to get the data.[/b] Click FREE USER button if not a Premium Member.

    It's ALL public information, so is all legal - kinda fun to peruse through, though not exciting.

    Enjoy.

     


