Just Assume Any Info You Put Online Is Public

from the welcome-to-the-new-world dept

I have to admit that I was sorry to see that my fellow Techdirt blogger Julian had beaten me to the punch, writing a characteristically insightful post on the Robert Scoble/Facebook story. But Facebook and screen-scraping are two of my favorite things to talk about, so I can't resist pointing out that I disagree with some of Julian's analysis.

Having noted that a script acting on Scoble's behalf can only access information that Scoble himself can reach manually, Julian argues that this can't be considered the only criterion in evaluating the situation:

[P]rivacy is not just a function of the publicity of your personal information, but of the searchability and aggregability of that information. Public closed-circuit surveillance cameras, for instance, typically capture the same information that a casual observer on the street is already privy to. But we recognize that being spotted by diverse random pedestrians, or even being captured on diffuse and disconnected private security cameras, is not intrusive in the same way as being captured on a citywide surveillance system that is searchable from a centralized location.

All of this seems true: individuals' attitudes about privacy are rightly driven by a pragmatic appraisal of the likelihood of someone doing something bad with the available information — a judgment based on the information's value and the cost of obtaining it. Ripping up your credit card statement before throwing it in the trash doesn't make it impossible for a dumpster-diving thief to target you, but it increases the difficulty of ripping you off enough that you'll probably be safe.

But I think Julian makes a mistake when he assumes that this is a viable way to conduct your life online. The problem with applying this approach to a digital context is that a user's estimation of the accessibility of a given piece of online information is almost invariably going to be too low, and will be getting more so by the second. The costs of automatically collecting data are very small and getting smaller.

There are a few reasons for this. First, the tools are getting better. Libraries like WWW::Mechanize are simple for any programmer to use and available in a variety of languages. And GUI-based applications like Dapper and Piggy Bank aim to make things even simpler. Second, if done properly, it's very difficult to prevent, detect or punish automated data collection. Facebook's script detection technology is impressively existent relative to that of its competitors, but it's still almost certainly trivial to subvert it with proxies, faked user agents and plausibly human delays. Third, once the data is collected it can, of course, be easily distributed.
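
To give a rough sense of how low the bar is, here is a minimal sketch in Python using only the standard library's `html.parser`. It is not the script Scoble used, and the markup in `SAMPLE_PAGE`, the `ContactScraper` class and the `scrape_contacts` helper are all hypothetical names invented for illustration; no real site's HTML is assumed.

```python
from html.parser import HTMLParser

# Hypothetical markup standing in for a page of contacts.
# This is an invented example, not any real site's HTML.
SAMPLE_PAGE = """
<ul>
  <li class="friend"><span class="name">Alice Example</span>
      <a href="mailto:alice@example.com">email</a></li>
  <li class="friend"><span class="name">Bob Example</span>
      <a href="mailto:bob@example.com">email</a></li>
</ul>
"""


class ContactScraper(HTMLParser):
    """Collects (name, email) pairs from the hypothetical markup above."""

    def __init__(self):
        super().__init__()
        self.contacts = []      # one dict per contact found so far
        self._in_name = False   # True while inside <span class="name">

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span" and attrs.get("class") == "name":
            self._in_name = True
        elif tag == "a" and (attrs.get("href") or "").startswith("mailto:"):
            # Attach the address to the most recently seen name, if any.
            if self.contacts:
                self.contacts[-1]["email"] = attrs["href"][len("mailto:"):]

    def handle_data(self, data):
        if self._in_name:
            self.contacts.append({"name": data.strip(), "email": None})

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_name = False


def scrape_contacts(html):
    """Return a list of {'name': ..., 'email': ...} dicts found in html."""
    parser = ContactScraper()
    parser.feed(html)
    return parser.contacts


if __name__ == "__main__":
    for contact in scrape_contacts(SAMPLE_PAGE):
        print(contact["name"], contact["email"])
```

A few dozen lines turn a page a person could read by hand into structured records that can be stored, merged and passed along, which is the aggregation cost the argument above turns on.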

And the situation is only going to get worse! In fact, it's getting worse at such a rapid rate that counting on the privacy of any even slightly public online information is a mistake.

The negative reaction to Scoble's script is coming from users who think of it as a violation of the covenant they perceived to surround their data. But that covenant was based upon their own mistaken understanding of the internet. Scoble's actions shouldn't be viewed by these users as a transgression against them, but rather as a pleasantly benign lesson.

It's fine to lament the situation, or to applaud Facebook for taking steps to keep its valuable, freely acquired user data away from competitors (and, while they're at it, script-employing users). But this assertion of community norms is unlikely to stop those who, unlike Scoble, are genuinely acting in bad faith. The technology for containing digital cats in digital bags is woefully inadequate, and it's unlikely to improve anytime soon.

Filed Under: privacy, robert scoble, scraping, social graphs, social networks

Reader Comments

  1. Silicon Valley, 5 Jan 2008 @ 7:23am

    Already one of the top stories of 2008

    Talk about viral marketing.

    if Facebook does not learn its lesson from this - god help them.

    This really puts a damper on all the great publicity they were getting last year by opening up their API
