Just Assume Any Info You Put Online Is Public

from the welcome-to-the-new-world dept

Fri, Jan 4th 2008 03:20pm - Tom Lee

I have to admit that I was sorry to see that my fellow Techdirt blogger Julian had beaten me to the punch, writing a characteristically insightful post on the Robert Scoble/Facebook story. But Facebook and screen-scraping are two of my favorite things to talk about, so I can't resist pointing out that I disagree with some of Julian's analysis. Having noted that a script acting on Scoble's behalf can only access information that Scoble himself can reach manually, Julian argues that this can't be considered the only criterion in evaluating the situation:

[P]rivacy is not just a function of the publicity of your personal information, but of the searchability and aggregability of that information. Public closed-circuit surveillance cameras, for instance, typically capture the same information that a casual observer on the street is already privy to. But we recognize that being spotted by diverse random pedestrians, or even being captured on diffuse and disconnected private security cameras, is not intrusive in the same way as being captured on a citywide surveillance system that is searchable from a centralized location.

All of this seems true: individuals' attitudes about privacy are rightly driven by a pragmatic appraisal of the likelihood of someone doing something bad with the available information — a judgment based on the information's value and the cost of obtaining it. Ripping up your credit card statement before throwing it in the trash doesn't make it impossible for a dumpster-diving thief to target you, but it increases the difficulty of ripping you off enough that you'll probably be safe.

But I think Julian makes a mistake when he assumes that this is a viable way to conduct your life online. The problem with applying this approach to a digital context is that a user's estimate of the accessibility of a given piece of online information is almost invariably going to be too low, and will become more so by the second. The costs of automatically collecting data are very small and getting smaller.

There are a few reasons for this. First, the tools are getting better. Libraries like WWW::Mechanize are simple for any programmer to use and available in a variety of languages. And GUI-based applications like Dapper and Piggy Bank aim to make things even simpler. Second, if done properly, it's very difficult to prevent, detect or punish automated data collection. Facebook's script detection technology is impressively existent relative to that of its competitors, but it's still almost certainly trivial to subvert it with proxies, faked user agents and plausibly human delays. Third, once the data is collected it can, of course, be easily distributed.
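
To give a sense of how low that cost is, here is a minimal sketch in Python using only the standard library. The sample page, the `MailtoCollector` class and the `collect_addresses` helper are all made up for illustration; this is not Scoble's script and it does not touch any real site. It simply pulls every mailto: address out of a chunk of HTML.

```python
from html.parser import HTMLParser


class MailtoCollector(HTMLParser):
    """Hypothetical helper: gathers addresses from <a href="mailto:..."> links."""

    def __init__(self):
        super().__init__()
        self.addresses = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands us lowercased tag and attribute names.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.lower().startswith("mailto:"):
                # Keep everything after the "mailto:" scheme prefix.
                self.addresses.append(value[len("mailto:"):])


# Invented sample markup standing in for a page the user can already view.
SAMPLE_PAGE = """
<ul>
  <li><a href="mailto:alice@example.com">Alice</a></li>
  <li><a href="/profile/bob">Bob</a></li>
  <li><a href="mailto:carol@example.com">Carol</a></li>
</ul>
"""


def collect_addresses(html):
    """Return every mailto: address found in the given HTML string."""
    parser = MailtoCollector()
    parser.feed(html)
    return parser.addresses


if __name__ == "__main__":
    print(collect_addresses(SAMPLE_PAGE))
```

Nothing here is clever, which is the point: anything a logged-in user can see in a browser can be reduced to a list by a few dozen lines like these.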

And the situation is only going to get worse! In fact, it's getting worse at such a rapid rate that counting on the privacy of any even slightly public online information is a mistake.

The negative reaction to Scoble's script is coming from users who think of it as a violation of the covenant they perceived to surround their data. But that covenant was based upon their own mistaken understanding of the internet. Scoble's actions shouldn't be viewed by these users as a transgression against them, but rather as a pleasantly benign lesson.

It's fine to lament the situation, or to applaud Facebook for taking steps to keep its valuable, freely-acquired user data away from competitors (and, while they're at it, script-employing users). But this assertion of community norms is unlikely to stop those who, unlike Scoble, are genuinely acting in bad faith. The technology for containing digital cats in digital bags is woefully inadequate, and it's unlikely to improve anytime soon.

Comments on “Just Assume Any Info You Put Online Is Public”

listen_to_techdirt

January 4, 2008 at 3:34 pm

Facebook - should it be banned?

Hey Tom,
Paul Buchheit, Gmail's creator, wrote a really good post on this too. He argues that Facebook could itself be banned by Yahoo/Google/MSN for violating their terms of service.

Here is the link to his post.

http://paulbuchheit.blogspot.com/2008/01/should-gmail-yahoo-and-hotmail-block.html

TSO

January 4, 2008 at 3:53 pm

Rule #1: if it’s in Google cache, it’s fair game.
Don’t want it to spread — don’t post.

Anthony Lieuallen

January 4, 2008 at 6:29 pm

It's rude, here's why:

I just read a very good post by a friend of mine explaining why this was bad and wrong:

http://www.yardley.ca/dash/2008/01/04/on-implicit-social-contracts/

In short, these emails were rendered as images. There is exactly one reason to do so: so they won’t be scraped. That’s obvious. Scraping them anyway is wrong. The linked article says it much more elegantly.

dualboot

January 4, 2008 at 6:39 pm

I always assumed...

Ever since I found some guy's online journal of information that was intensely private in nature (back in 1995, with a dotted quad URL very similar to what I thought I typed) I have always assumed that if I ever sent, wrote, or posted anything online, the very nature of the internet as a bunch of computers SHARING information would make it inherently un-private. I don’t even email information under the premise that it’s private ever since I googled myself and found an email that I sent to a tech support company posted on their website. Privacy software, in my opinion, is just like locks on your house: they discourage honest people from becoming dishonest, but if the person really intends to get in, it will only slow them down the first time.

Bottom line… if you don’t want it out there, don’t put it out there, in any form.

Rose M. Welch

January 4, 2008 at 7:22 pm

In an ideal world...

…people would respect one another’s privacy on the Internet. In reality, we’re a bunch of celebrity-obsessed, YouTube-watching self-Googlers. (Not excluding myself.) In an ideal world, no one would have posted a distraught Beyonce falling on her ass on YouTube, esp. not after she asked everyone not to. In an ideal world, ambulance photos of a hysterical Britney would be shunned and no one would post them. (Her kids will have to see this shit in another ten years, yanno.) But, in reality, man is the animal who laughs, and we laugh primarily at the misfortune of others.

So if it’s on-line, in this wonderful information sharing medium, it’s pretty much fair-game. And while I’m not going to go where I’m not wanted, I could go, and less ethical people will. It’s reality. Deal with it.

Walter Dnes

January 4, 2008 at 10:32 pm

Re: What's new is old

I’m in my late 50’s. As a kid, growing up, I’d often hear the following news story…
– crime happens (murder/rape/robbery/whatever)
– police search a suspect’s apartment
– search finds a personal diary, laying out in great detail the commission of the crime the police were investigating
– the diary was an important piece of evidence leading to the ultimate conviction of the suspect

I never understood the mentality behind it. I was always waiting for the suspect to deny that he wrote it, and to claim that the diary was planted evidence. But none of them ever denied it. It blew my mind. Today, I see the same stupidity with lawbreakers bragging on Facebook, et al., about their exploits. Then they whine about their privacy being violated when police bypass the “friends-only” so-called “privacy settings”. Sheesh. People never learn.

bayareaguy

January 5, 2008 at 12:00 am

We need more fine-grained access control

“Robert is breaking the terms of service, but it’s also unclear if he owns those e-mail addresses,” Owyang told eWEEK Jan. 3. “People said, ‘Yes, you can be my friend,’ but they never said, ‘Robert, you can take my e-mail address and use it elsewhere.’”

Perhaps all items of data someone submits should come with some “ownership” checkboxes:

( ) private – this is my data, not to be redistributed under any conditions until I reclassify it
( ) privileged – may only be shared with people who I allow after they explicitly ask
( ) sensitive – may only be shared with people I’ve designated as friends, but they don’t need to ask
( ) public – may be shared with anyone

Melle Gloerich

January 5, 2008 at 2:39 am

Re: We need more fine-grained access control

How is more fine-grained control going to help against ‘attacks’ like the one Scoble did? Ownership of data is sooo RIAA and not how it works on the tubes we call internet. It’s exactly as Tom Lee said in this article: if it’s entered in any (online) database, some day its security is going to get breached and your info will be publicly available.

Sure, more control is going to slow down that process for a bit and is going to give more room for legal action, but it’s not going to secure your info.

Silicon Valley

January 5, 2008 at 7:23 am

Already one of the top stories of 2008

Talk about viral marketing.

If Facebook does not learn its lesson from this – god help them.

This really puts a damper on all the great publicity they were getting last year by opening up their API.

Tommy Jefferson

January 5, 2008 at 7:51 am

If you want some email privacy:

http://enigmail.mozdev.org

works well with

http://www.mozilla.com/en-US/thunderbird/

Anonymous Coward

January 6, 2008 at 2:07 pm

The stupidity of people never ceases to amaze me. How long has the internet been around? How long have we been hearing stories of people being outraged that their “privacy” has been compromised? I think it should be blatantly obvious to anyone smarter than a rabbit that if you send something via the internet it is NOT private. Frankly, I don’t have any sympathy for anyone who complains that their “privacy” was invaded on the internet.

shepherdtrust nezomba

January 30, 2008 at 2:00 am

hi

Sorry for not replying with a comment, but
my name is Trust Nezomba from Zimbabwe. I'm a boy aged 19.
I'm an ICT student at Africa University doing my Cisco, so my point is that I have the problem of having no computer to use at home.
my adress is
chinausunzi court 13 r6 Sakubva
mutare Zimbabwe
E-mail nezombat@cooltoad.com
