Just So We're Clear: More Data Isn't Better Data
from the it's-just-more-work dept
New data-retention policies went into effect in the UK this week, forcing ISPs to store details of all user emails and VoIP calls for a year, just in case law enforcement or the security services want to thumb through them. The government’s intent is to mine the data to try to recognize patterns in relationships and contacts that will help it find terrorists and criminals. The idea that all of this data is being stored by ISPs makes privacy activists shudder, and their worry is not unfounded. But it’s also important to understand that the idea that the government can easily root out terrorists by capturing all this data is bunk. More data doesn’t equal better data; it just makes it a hell of a lot more work to dig out useful information. It also raises the possibility of discovering false patterns that waste law enforcement’s time and suck in innocent people. Recently, a guy in Wales found himself in the middle of an armed anti-terror raid on his home after somebody told police that they thought he might be a terrorist because he had soundproofing gear and wiring. He wasn’t a terrorist, but rather a musician with a home recording studio. If police will go to such lengths based on unverified, anonymous tips, the thought of the conclusions they’ll draw from having an entire country’s email and VoIP records at their fingertips should raise a few eyebrows.
Filed Under: data retention, europe
Comments on “Just So We're Clear: More Data Isn't Better Data”
To be even clearer...
…as the article you actually link to mentions, this is NOT a UK directive – it comes from the EU
A good article can be found at http://www.guardian.co.uk/commentisfree/libertycentral/2009/apr/06/internet-houseofcommons (I got this from http://www.opendemocracy.net/ourkingdom/thomas_ash/isp_data_retention)
My guess is that it was indeed the British who brought this forward (we can’t get it past our own houses of parliament, so we have used the EU as a sort of backdoor)
Interestingly, this directive was apparently brought up as some sort of corporate rather than legal directive, thereby bypassing even more possibilities to vote on it within the EU itself. Sneaky bastards
It is by no means popular throughout Europe; the Swedes have already stated that they will refuse to follow the directive at all
The whole thing, who raised it, how it was tabled and who voted on it deserves more investigation
While requiring ISPs and telephony operators to log everything is bad, the even more creepy part is what you find by reading the only existing draft statutory instrument about data retention: http://www.opsi.gov.uk/si/si2009/draft/ukdsi_9780111473894_en_1
Cutting it short, the problem lies in point 2(e)(iii), which states the following:
“In these Regulations ‘public communications provider’ means-
(i) a provider of a public electronic communications network, or
(ii) a provider of a public electronic communications service
and ‘public electronic communications network’ and ‘public electronic communications service’ have the meaning given in section 151 of the Communications Act 2003(a).”
Off we go to check out section 151 of the Communications Act 2003(a) (which can be found at http://www.opsi.gov.uk/ACTS/acts2003/ukpga_20030021_en_15#pt2-ch1-pb28-l1g151):
“‘public electronic communications network’ means an electronic communications network provided wholly or mainly for the purpose of making electronic communications services available to members of the public;
‘public electronic communications service’ means any electronic communications service that is provided so as to be available for use by members of the public;”
These definitions are rather vague, and they could easily be interpreted in a way that would make sharing your Internet connection with your neighbour, or running a Tor relay, fall under the jurisdiction of the data retention directive. If that becomes a reality, then characterizing this as “overreaching” is an understatement.
more data is better data
it just requires the right kind of analysis.
Unless you are attempting to say that there is absolutely zero value in the data, the more of it you have the better. In fact, the more of it there is, the better it is for privacy as well, because it means that most of the data will be parsed and viewed through automated tools.
Re: more data is better data
No, more data is NOT better. More GOOD data is better.
If I want to know how many people prefer Coke to Pepsi, having a database of migratory swallow patterns doesn’t help AT ALL.
More data is not better. More good data is better.
Re: more data is better data
Actually, no, more data isn’t better data, and the kind of analysis doesn’t matter. The reason is that the data you are adding, being unrelated to what you’re looking for, is almost pure noise, and it comes in such great volumes that it precludes deep analysis by even automated means.
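To put rough numbers on the noise argument, here is a minimal sketch in Python. Every figure in it (the population size, the number of real targets, the detection rate and the false-alarm rate) is a made-up assumption for illustration, not a real statistic, and `flagged_breakdown` is a hypothetical helper, not anyone's actual screening method.

```python
def flagged_breakdown(population, true_targets, sensitivity, false_positive_rate):
    """Split the people flagged by a screening filter into real hits and false alarms.

    sensitivity: fraction of real targets the filter flags (assumed value).
    false_positive_rate: fraction of innocent people the filter flags (assumed value).
    """
    innocents = population - true_targets
    true_hits = true_targets * sensitivity
    false_alarms = innocents * false_positive_rate
    # Precision: of everyone flagged, what fraction is a real target?
    precision = true_hits / (true_hits + false_alarms)
    return true_hits, false_alarms, precision


# Illustrative assumptions only: 60 million people, 1,000 real targets,
# a filter that catches 99% of targets and wrongly flags 0.1% of everyone else.
hits, alarms, precision = flagged_breakdown(
    population=60_000_000,
    true_targets=1_000,
    sensitivity=0.99,
    false_positive_rate=0.001,
)
print(f"real targets flagged: {hits:.0f}")
print(f"innocent people flagged: {alarms:.0f}")
print(f"share of flagged people who are real targets: {precision:.1%}")
```

Under these assumed inputs, the innocent people flagged outnumber the real targets by roughly 60 to 1, even though the imagined filter is far more accurate than anything likely to exist in practice.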
So, following the logic of AC’s post (comment 2), it would appear that all the numbskulls who install a shiny new ADSL wireless router out of the box, and are amazed that their PC connected straight away, will be open to prosecution because they don’t know that their wireless connection is unencrypted.
Try a quick wardrive in most areas and you will find lots of open routers out there. I did a test and found SIX on a one mile stretch of road. ALL of those people are potential court cases waiting to happen as ignorance is not a defence that stands up in court.
A large percentage of the population don’t know about/understand wireless security and are happy that it ‘just works’ when they install it.
Anyone know of an ISP actually doing this?
So the rules are in place, but is anyone following them? I got this from a friend who runs a small ISP for business users:
“At current most ISPs don’t have the required equipment in place to log user activity…
…It’s probably going to stay that way for quite a while – unless the government starts paying for the required storage and processing (or issuing huge fines to companies who aren’t complying).”
Seriously, more GOOD data is better. Think again! All the data is saved so there can be NO BETTER DATA! They will be working with all there is. Their main trouble will be the volume of data and picking out useful, meaningful patterns. It will, for the most part, all boil down to the value of “p”, that is the probability that the results are meaningful. I wish them luck.
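The point about the value of “p” can be sketched with a little arithmetic. The snippet below is a simplified illustration that assumes every pattern test is independent and uses the conventional 0.05 threshold; both are assumptions made for the example, and the function names are hypothetical.

```python
def chance_of_spurious_hit(num_tests, alpha):
    """Probability that at least one of num_tests independent tests on pure
    noise comes out 'significant' at threshold alpha."""
    return 1.0 - (1.0 - alpha) ** num_tests


def expected_spurious_hits(num_tests, alpha):
    """Expected number of 'significant' results produced by noise alone."""
    return num_tests * alpha


# Assumed threshold and test counts, chosen only to show how the numbers scale.
alpha = 0.05
for num_tests in (1, 100, 1_000_000):
    chance = chance_of_spurious_hit(num_tests, alpha)
    expected = expected_spurious_hits(num_tests, alpha)
    print(f"{num_tests:>9} tests: P(at least one false pattern) = {chance:.4f}, "
          f"expected false patterns = {expected:.0f}")
```

With these assumptions, a hundred tests on pure noise already make at least one false pattern nearly certain, and the expected number of false patterns grows in direct proportion to the number of tests run.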
A terrorist attack will kill a dozen people in London. The massive amount of data will be mined, and anyone worldwide within 6 degrees of separation of the terrorist will be arrested and promptly executed.
Hmmm …. does getting spammed by the same spammer qualify as a 1st degree of separation?
When looking for a needle in a haystack...
it’s always more efficient to make the haystack as large as is technically possible…
Data Retention and Network Security
Finally, the EU gets it. They need time to find a sorting mechanism to extract useful data and catch the “wrongdoers”. They will need more time than one year. If they find any of mine, will they please send it to the FBI so they can correct their falsified records. I can’t seem to get in touch with them, indirectly. Thanks.
“In the future, we will all die from hearsay”