GCHQ's Karma Police: Tracking And Profiling Every Web User, Every Website

from the this-is-what-you'll-get,-when-you-mess-with-us dept

One of the very first revelations from the Snowden leaks was a GCHQ program modestly entitled “Mastering the Internet.” It was actually quite a good name, since it involved spying on vast swathes of the world’s online activity by tapping into the many fiber optic cables carrying Internet traffic that entered and left the UK. The scale of the operation was colossal: the original Guardian article spoke of a theoretical intake of 21 petabytes every day. As the Guardian put it:

For the 2 billion users of the world wide web, Tempora represents a window on to their everyday lives, sucking up every form of communication from the fibre-optic cables that ring the world.

But the big question was: what exactly did GCHQ do with that huge amount of information? Two years later, we finally know, thanks to a new article in The Intercept, which provides details of another major GCHQ program called “Karma Police” — the name of a song by Radiohead, with the repeated line “This is what you’ll get, when you mess with us”. A GCHQ document obtained by Snowden indicates that Karma Police goes back some years — at least to 2008. It provides the following summary of the project’s aims:

KARMA POLICE aims to correlate every user visible to passive SIGINT [signals intelligence] with every website they visit, hence providing either (a) a web browsing profile for every visible user on the internet, or (b) a user profile for every visible website on the internet.
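
The two outputs in that summary are the same correlation viewed from opposite ends. A minimal Python sketch, using invented identifiers and assuming nothing about how GCHQ's systems actually work, shows how a single pass over (user, site) pairs yields both a per-user browsing profile and a per-site user profile:

```python
from collections import defaultdict

# Hypothetical (user, site) observations; every identifier here is invented.
observations = [
    ("user-a", "news.example"),
    ("user-b", "news.example"),
    ("user-a", "maps.example"),
]


def build_profiles(pairs):
    """Return (sites seen per user, users seen per site) from (user, site) pairs."""
    sites_by_user = defaultdict(set)
    users_by_site = defaultdict(set)
    for user, site in pairs:
        sites_by_user[user].add(site)   # (a) browsing profile for each user
        users_by_site[site].add(user)   # (b) user profile for each website
    return dict(sites_by_user), dict(users_by_site)


sites_by_user, users_by_site = build_profiles(observations)
```

Here `sites_by_user["user-a"]` holds both example sites, while `users_by_site["news.example"]` holds both example users: one set of records, two kinds of profile.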

Profiling every (visible) user, and every (visible) website seems insanely ambitious, especially back in 2008 when computer speeds and storage capacities were far lower than today. But the information that emerges from the new documents published by The Intercept suggests GCHQ really meant it — and probably achieved it.

As of 2012, GCHQ was storing about 50 billion metadata records about online communications and Web browsing activity every day, with plans in place to boost capacity to 100 billion daily by the end of that year. The agency, under cover of secrecy, was working to create what it said would soon be the biggest government surveillance system anywhere in the world.

That’s somewhere between 18 and 36 trillion metadata records gathered in 2012 alone, depending on how quickly the planned capacity came online, and it’s probably even higher now. As Techdirt has covered previously, intelligence agencies like to say this is “just” metadata, skating over the fact that metadata is actually much more revealing than traditional content because it is much easier to combine and analyze. An important document released by The Intercept with this story tells us exactly what GCHQ considers to be metadata, and what it says is content. It’s called the “Content-Metadata Matrix,” and reveals that as far as GCHQ is concerned, “authentication data to a communications service: login ID, userid, password” are all considered to be metadata, which means GCHQ believes it can legally swipe and store them. Of course, intercepting your login credentials is a good example of why GCHQ’s line that it’s “only metadata” is ridiculous: doing so gives them access to everything you have and do on that service.
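
The yearly total follows directly from the daily rates quoted above, and it depends on which rate is assumed. A quick sketch of the arithmetic in Python (the function name is just for illustration):

```python
DAYS_IN_2012 = 366  # 2012 was a leap year


def records_per_year(records_per_day, days=DAYS_IN_2012):
    """Scale a constant daily intake up to a full year."""
    return records_per_day * days


# Daily rates taken from the Intercept passage quoted above.
at_reported_rate = records_per_year(50_000_000_000)   # "about 50 billion" a day
at_planned_rate = records_per_year(100_000_000_000)   # planned 100 billion a day

print(f"{at_reported_rate / 1e12:.1f} trillion")  # 18.3 trillion
print(f"{at_planned_rate / 1e12:.1f} trillion")   # 36.6 trillion
```

The higher figure assumes the planned capacity was sustained for a whole year; the actual 2012 total would fall somewhere between the two.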


Login ID, userid and password all considered to be “metadata”

The trillions of metadata records are stored in a huge repository called “Black Hole.” In August 2009, 41 percent of Black Hole’s holdings concerned web browsing histories. The rest included a wide range of other online services: email, instant messenger records, search engine queries, social media, and data about the use of tools providing anonymity online. GCHQ has developed software to analyze these other kinds of metadata in various ways:

SOCIAL ANTHROPOID, which is used to analyze metadata on emails, instant messenger chats, social media connections and conversations, plus “telephony” metadata about phone calls, cell phone locations, text and multimedia messages; MEMORY HOLE, which logs queries entered into search engines and associates each search with an IP address; MARBLED GECKO, which sifts through details about searches people have entered into Google Maps and Google Earth; and INFINITE MONKEYS, which analyzes data about the usage of online bulletin boards and forums.

In order to connect these different kinds of Internet activity with individuals, GCHQ makes great use of information stored in cookies:

A top-secret GCHQ document from March 2009 reveals the agency has targeted a range of popular websites as part of an effort to covertly collect cookies on a massive scale. It shows a sample search in which the agency was extracting data from cookies containing information about people’s visits to the adult website YouPorn, search engines Yahoo and Google, and the Reuters news website.

Other websites listed as “sources” of cookies in the 2009 document are Hotmail, YouTube, Facebook, Reddit, WordPress, Amazon, and sites operated by the broadcasters CNN, BBC, and the U.K.’s Channel 4.
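
Cookies are useful for this kind of linking because an IP address can change or be shared, while a cookie value tends to stay with one browser. A toy Python sketch with invented values (it assumes nothing about GCHQ's actual tooling or data formats) groups sightings by cookie identifier, so that visits made from different IP addresses end up in the same profile:

```python
from collections import defaultdict

# Invented sample events: (cookie_id, ip_address, site). Not a real data format.
events = [
    ("cookie-123", "198.51.100.7", "search.example"),
    ("cookie-123", "203.0.113.9", "news.example"),
    ("cookie-456", "198.51.100.7", "video.example"),
]


def group_by_cookie(rows):
    """Collect (ip, site) sightings under the cookie value they share."""
    grouped = defaultdict(list)
    for cookie_id, ip, site in rows:
        grouped[cookie_id].append((ip, site))
    return dict(grouped)


by_cookie = group_by_cookie(events)
```

In this sample, `cookie-123` is seen from two different IP addresses yet stays one profile, while the shared address `198.51.100.7` is split across two cookies.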

Clearly the above activities allow incredibly-detailed pictures of an individual’s online activities to be built up, not least their porn-viewing habits. One tool designed to “provide a near real-time diarisation of any IP address” is called, rather appropriately, Samuel Pepys, after the famous 17th-century English diarist.

The extraordinary scale of GCHQ’s spying on “every visible user” raises key questions about its legality. According to The Intercept story:

In 2010, GCHQ noted that what amounted to “25 percent of all Internet traffic” was transiting the U.K. through some 1,600 different cables. The agency said that it could “survey the majority of the 1,600” and “select the most valuable to switch into our processing systems.”

Much of that traffic will come from UK citizens accessing global services like Google or Facebook. GCHQ has admitted that it defines these as “external platforms,” so this traffic is completely stripped of what few safeguards UK law offers against this kind of intrusive surveillance by GCHQ.

This means that it is certain that many UK citizens, perhaps millions, have been profiled by GCHQ using these newly-revealed programs, without any kind of warrant or authorization being given or even sought. The information stored in the Black Hole repository, and analyzed with tools like Samuel Pepys, provides unprecedented insights into the minutiae of their daily lives: which websites they visit, which search terms they enter, who they contact by email or message on social networks. Within that material, there is likely to be a host of intimate facts that could prove highly damaging to the individual’s career or relationships if revealed. It is perfect blackmail material, in other words. Thanks to other Snowden documents, we know that the NSA had plans to use this kind of information in precisely this way. It would be naive to think it would never be used domestically, too.

It’s frustrating that it has taken over two years for these latest GCHQ documents to be published, since they reveal that the scale of British online surveillance and analysis is even worse than the first Snowden documents indicated, bad as they were. They prove that the current calls for additional spying powers in the Snooper’s Charter are even more outrageous than we thought, since the UK authorities already track and store British citizens’ online moves in great detail.

When Edward Snowden handed over his amazing trove of documents to journalists to release as they thought best, he also placed a huge responsibility on their shoulders to do so as expeditiously as possible. If, as seems likely, there are yet more important revelations about the scale of US and UK spying to come, it is imperative that they are published as soon as possible to help the fight against those countries’ continuing attempts to bolster mass surveillance and weaken our freedoms.

Follow me @glynmoody on Twitter or identi.ca, and +glynmoody on Google+



Comments on “GCHQ's Karma Police: Tracking And Profiling Every Web User, Every Website”

60 Comments
Anonymous Coward says:

This is what I have been saying for several years now

there is likely to be a host of intimate facts that could prove highly damaging to the individual’s career or relationships if revealed — perfect blackmail material, in other words.

This is exactly why this kind of surveillance needs to stop. It is only a matter of time where the party in charge of the government will have enough info on the other party to keep them from making a serious run for their office. All political parties should oppose this kind of snooping but they don’t. They all think they will be the one in charge when the music stops.

hobo says:

Re: This is what I have been saying for several years now

It’s worse than this as it’s not just politicians who are at stake. It is only a matter of time before databases like this are compromised and data dumps occur. Then your history, your passwords, your life is out on public display.

This is exactly the same as wanting back doors to encryption, they think that only the “good guys” will have access, and that they will only use it for “good reasons.” Neither of those suppositions is true.

That One Guy (profile) says:

Re: Re: This is what I have been saying for several years now

‘Only a matter of time’? Yeah, that database would be priceless for countless groups, the odds that not a single one of them has cracked it by now is so low as to be non-existent.

It’s pretty much a given that it’s already been compromised, the only question is how much and by who?

JMT says:

Re: This is what I have been saying for several years now

“It is only a matter of time where the party in charge of the government will have enough info on the other party to keep them from making a serious run for their office.”

This is the vital fact that needs to be hammered home to those people who don’t think government surveillance is such a big deal, because they’ve done nothing wrong, and terrorists! Even if you’re never directly affected by government surveillance, if it keeps going unchecked eventually it will be used in ways that completely undermine the idea of a democratic government. It’s just human nature, and there are plenty of historical examples. This is actually something worth hundreds of people dying at the hands of terrorists, because the end result could be hundreds of millions living under a far worse form of government than we have now.

That One Guy (profile) says:

Re: Insanity

Depends on who you ask.

A member of the public who’s paying attention to what’s been happening? Almost certainly not.

A member of the public who’s not been paying attention, and gets all their news from the government? Absolutely, after all the entire planet would be a smoking crater by now if it weren’t for the brave actions of the spy agencies, as clearly demonstrated by the claims made by the very same spy agencies.

A member of one of the spy agencies? Without a doubt. Never having to worry about your budget or any pesky ‘investigations’, the ability to absolutely ruin anyone who speaks out against you if you care to, or even just care to hint at it (access to their accounts allows more than just monitoring them, after all)… yeah, I’m sure they consider the world much improved thanks to their actions.

Anonymous Coward says:

The true story is how little press this is getting outside of tech/privacy circles. This is orders of magnitude more invasive than phone metadata collection, but now the public treats it with a halfhearted “meh”.

Awareness of these issues seems to have peaked and the opportunity to reform them expediently along with it.

Anonymous Coward says:

Re: Re:

Objecting to surveillance likely gets you surveilled, a vicious fucking circle that’s gonna make people angrier over THEIR acts of surveillance.

It’s like they think they’re entitled to do whatever the fuck they want with OTHER people’s lives, and expect everyone to be ok with it, and if not, they “MAKE” you “ok” with it, instead of respecting the rights and freedoms they supposedly “champion”.

No choice, no consent, we are OWNED by our self-appointed “betters”.

I have no sympathy for their self-inflicted problems…

Anonymous Coward says:

I’d just like to say thanks, Techdirt, for posting this Intercept article I read a few days ago, and for giving me the opportunity to say, without having to register: very well done, Intercept. Not many articles go into this kind of depth when it comes to this disgusting, entitled surveillance behaviour.

Our lives belong to us, not our governments

Just Another Anonymous Troll says:

Those GCHQ jokers and their operation names

The trillions of metadata records are stored in a huge repository called “Black Hole”.
How fitting.

MEMORY HOLE, which logs queries entered into search engines and associates each search with an IP address
Props for the ironic 1984 reference, GCHQ.

and INFINITE MONKEYS, which analyzes data about the usage of online bulletin boards and forums.
I suggest switching the name of this operation and the name of this agency.

Violynne (profile) says:

Re: Re:

This would be equivalent to saying “Those who work for spying-on-the-public agencies should be ashamed of themselves”, because these programmers work for the agencies that pull the data.

People should also be aware these programs were assisted by the NSA, which has several years’ experience capturing internet traffic.

Can’t wait until the story breaks on how encryption on the internet is made moot by the ghost certs these agencies use.

SteveMB (profile) says:

Maybe "As Expeditiously As Possible" Isn't The Best Strategy

When Edward Snowden handed over his amazing trove of documents to journalists to release as they thought best, he also placed a huge responsibility on their shoulders to do so as expeditiously as possible.

Continuing the steady drip-drip-drip approach may be more effective in the long run. It had the advantage of driving the cycle:

1. Disclose X
2. Three-letter apparatchik denies X
3. Disclose evidence for X
4. Three-letter apparatchik admits X, but double-pinky-swears that it’s only X and not Y
5. Disclose Y
6. Lather-rinse-repeat

Anonymous Coward says:

really, it’s all going to be so nice. all will be in agreement and no need for hard feelings or unpleasantry.

it’ll be like one big tea party where we’re all so nice, and nobody will bat an eye when the country we love chooses to eliminate an entire population somewhere that will have gotten between us and something we want.

tea party, yes. where the women come and go, talking of michelangelo. and of charlemagne and cortez, of course.

of course.

it’ll be like heaven without the feathers.

Median Wilfred says:

Data fog?

How does GCHQ work around deliberate or accidental data fog? I’ve spent several days with my “Google” login set to that of one of my kids, for example, which is going to mess up the nice profile for that kid.

Bruce Schneier looks up random people on facebook. Do I need to start doing that sort of thing? I can easily schedule “wget” to make HTTP requests of sites with DNS names composed from a list of goofy words, save the cookies I get, or just send random cookies, maybe with Verizon’s extra HTTP header, just to make people wonder.

Will this be effective in clogging GCHQ’s rather totalitarian database? If not, how many people doing this sort of thing would clog it?

Anonymous Coward says:

Re: Data fog?

The database is already fogged when it comes to finding the small number of well organised terrorists and criminals. It allows them to see a popular protest forming, giving a chance to send in the trolls to put reasonable people off taking part. It is also useful for identifying ring leaders of protests.
The gather-it-all approach mainly has use in controlling the population and heading off protest and the organization of political parties that could challenge the status quo.
