Latest Revelations Show How Collecting All The Haystacks To Find The Needle Makes The NSA's Job Harder

from the and-makes-us-all-less-safe dept

Here’s yet another post on the latest NSA revelations about its collection of buddy lists and email contacts. As we’d mentioned in the original post, the story noted that this data collection was at times overwhelming. Here’s the Washington Post’s report on this point:

The volume of NSA contacts collection is so high that it has occasionally threatened to overwhelm storage repositories, forcing the agency to halt its intake with “emergency detasking” orders. Three NSA documents describe short-term efforts to build an “across-the-board technology throttle for truly heinous data” and longer-term efforts to filter out information that the NSA does not need.

Spam has proven to be a significant problem for NSA — clogging databases with data that holds no foreign intelligence value. The majority of all e-mails, one NSA document says, “are SPAM from ‘fake’ addresses and never ‘delivered’ to targets.”

In fall 2011, according to an NSA presentation, the Yahoo account of an Iranian target was “hacked by an unknown actor,” who used it to send spam. The Iranian had “a number of Yahoo groups in his/her contact list, some with many hundreds or thousands of members.”

The cascading effects of repeated spam messages, compounded by the automatic addition of the Iranian’s contacts to other people’s address books, led to a massive spike in the volume of traffic collected by the Australian intelligence service on the NSA’s behalf.

After nine days of data-bombing, the Iranian’s contact book and contact books for several people within it were “emergency detasked.”

Here’s a slide from the leaked NSA presentation, in which it urges people to be more careful about what kind of data it collects via this program, saying they’re trying to “store less of the wrong data” and “shift the collection philosophy at the NSA” to “memorialize what you need” from “order one of everything off the menu and eat what you want.”

This is the very same NSA which is led by Keith Alexander, whose unofficial motto has been described as “collect it all.” At times, Alexander has argued in public that “you need the haystack to find the needle.”

Of course, that’s bogus, and the data deluge discussed in this program demonstrates why. Collecting it all makes it harder to find the right information. Piling more hay on the haystack doesn’t make it easier to find the needle; it makes it harder. That’s one of many reasons why we’re so concerned about these bulk data collection programs. Not only do they rarely seem to turn up useful information, but they also seem to obscure important information by flooding the system with bogus data.


Comments on “Latest Revelations Show How Collecting All The Haystacks To Find The Needle Makes The NSA's Job Harder”

36 Comments
Ninja says:

This is the very same NSA which is led by Keith Alexander, whose unofficial motto has been described as “collect it all.”

Sounds like some plot from some weird Japanese animation where some wacko from some city aims to catch weird monsters for no reason other than having them.

Hi! I’m Alexander Keith from Syracuse and I’m gonna be a Surveillance master!

*Alexander throws PRISM at DATA*

*Alexander caught PENIS ENLARGEMENT spam*

Yep. We need a parody.

Crusty the Ex-Clown says:

Mmmm, Master Plan

I, for one, am in favor of Alexander’s plan to ‘collect it all.’ Especially if that means the Nominal Security Agency drowns and chokes on zillions of terabytes of data. They will be paralyzed by their own success in actually acquiring everything, fools that they are. Go for it, Keith!! What a bunch of maroons……

out_of_the_blue says:

Re: Mmmm, Master Plan

@ “drowns and chokes on zillions of terabytes of data.”


So, you (and others below) accept the notion that your email provider can recognize spam and filter it out, but NSA can’t?

Yeah, let’s all laugh at poor old incompetent NSA! Don’t even have spam filters!

No. Didn’t mention this in my first post as seems obvious, but you’re being channeled into seeing NSA as just another clueless ineffective gov’t agency, lessening its crimes in your mind, and that has to be exactly as they wish.

Just more diversion from NSA crimes. The small but real opening Snowden gave us is being frittered away — in a manner that’s actually to NSA benefit.

Ninja (profile) says:

Re: Re: Mmmm, Master Plan

So, you (and others below) accept the notion that your email provider can recognize spam and filter it out, but NSA can’t?

Apparently they can’t. And if they could you just need to drive a sheer volume of mails to get out of their collection and sneak some coded message as if it was some spam.

you’re being channeled into seeing NSA as just another clueless ineffective gov’t agency

You need to check your sarcasm detector.

Anonymous Coward says:

So, my takeaways from this are:

1. It turns out that the NSA does not have the world’s most awesome spam filter. Indeed, it seems likely they get more spam sent to an account than the account itself might receive once the mail host’s spam filters get done with the traffic.

2. The best way for terrorists to avoid NSA scrutiny of their email is to become massive spammers. Ironically, they would likely cause more problems for the USA with their spamming activities than with their terrorist activities.

3. Emails that look like mass spam and phishing attacks will be the best way for terrorists to send emails in the future. Emails with seemingly random text that contain links to obvious phishing attacks could easily contain coded messages that the NSA would ignore because they aren’t storing phishing emails promising penis enlargement.

Mr. Applegate says:

Re: Re:

“Emails with seemingly random text that contain links to obvious phishing attacks could easily contain coded messages that the NSA would ignore because they aren’t storing phishing emails promising penis enlargement.”

I would not recommend placing coded messages in SPAM about Penis Enlargement because the NSA is obviously already compensating for a lack of penis size, so these emails may actually interest them.

Just sayin

Killer_Tofu (profile) says:

Re: Re:

The funny thing would be all of the random penis and breast enlargement sites they accidentally go to and try to enter their information into because they were just told “you will receive a special penis enlargement email”.

Pretty soon they will be pissed at the Nigerians for taking their terrorist funds.

Scammers vs Terrorists!

Anonymous Coward says:

Re: Re: Re:

That was another thought of mine: terrorists start up their own version of the Nigerian scam. Either the terrorists get funding from hundreds, if not thousands, of gullible people, or hundreds, if not thousands, of gullible people get their finances screwed by the government as it attempts to prevent the “terrorist event” of money being transferred. Either way the terrorists would win.

madasahatter (profile) says:

Deluge

The NSA failed to do traffic analysis and then target the accounts based on the analysis. Traffic analysis is based on the fact that all organizations have information flow patterns that can be determined. The patterns give one a clue about the structure and location of the various parts. You start with a known and work from there. In the case of terrorists, only some of the traffic links are important.

Also, for most people, traffic analysis will lead to nothing important for the NSA because they are doing nothing of interest to them. Vacuuming all traffic threatens to overwhelm the analyst with useless data that will never be of any value.
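
To make the “start with a known and work from there” idea concrete, here is a minimal sketch in Python. It assumes a toy contact graph; the account names and the `contacts_within` helper are hypothetical and are not anything the NSA is known to use. The sketch walks outward from one known account for a limited number of hops, so accounts with no nearby link to the target are never touched.

```python
from collections import deque


def contacts_within(graph, seed, max_hops):
    """Breadth-first walk outward from one known account.

    graph: dict mapping an account to the set of accounts it exchanges
    mail with (hypothetical data for illustration).
    Returns a dict of {account: hop count from seed}.
    """
    hops = {seed: 0}
    queue = deque([seed])
    while queue:
        account = queue.popleft()
        # Do not expand past the hop limit.
        if hops[account] == max_hops:
            continue
        for neighbor in graph.get(account, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[account] + 1
                queue.append(neighbor)
    return hops


# Toy contact graph: "bystander" and "other" have no link to the target.
graph = {
    "target": {"a", "b"},
    "a": {"target", "c"},
    "b": {"target"},
    "c": {"a", "d"},
    "d": {"c"},
    "bystander": {"other"},
    "other": {"bystander"},
}

reached = contacts_within(graph, "target", 2)
print(sorted(reached.items()))
```

With a two-hop limit the walk reaches `a`, `b` and `c` but never `d`, `bystander` or `other`: the unrelated accounts never enter the pile in the first place.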

Since I do not knowingly hang out with terrorists of any type, the chances that any of my conversations, emails, etc. will lead to an intelligence breakthrough are about 0. Collecting my data only clutters up the disk and will never provide any useful information.

The real risk is that the NSA angers enough people like me and the resulting political and commercial pressure forces an overreaction that hurts US businesses and the ability of the NSA to actually monitor foreign enemies.

John Fenderson (profile) says:

Re: Deluge

Since I do not knowingly hangout with terrorists of any type

You may not knowingly do so, but that doesn’t mean you don’t actually do so. Between the rather loose definition of “terrorist” and the fact that you don’t know who the people you hang out with also hang out with, I’m guessing that the odds that you’re linked with a terrorist are higher than you might think.

Brazenly Anonymous says:

Re: Deluge

The real risk is that the NSA’s new toys get used by someone with something more sinister in mind (something we know has happened on a small scale already). Whether they get there by being a politician/spy/contractor or just manage to install a back-door into the database is irrelevant. The existence of this capability is a massive security risk that should be fought with every tool we have (encryption services, publicizing vulnerabilities, etc).

When you find a weed in your garden, you dig down to the roots and yank the whole thing out. You don’t prune and/or nurture it.

out_of_the_blue says:

Logically, then, Google can't work!

That’s corollary to Mike’s argument. Of course, he has as premise that NSA is looking for needles instead of broad trends. But even for needles*, seems to me that Google’s “collecting all the haystacks” is precisely its “business model” and so proven to work in practice.

[* More accurately, soon as the system is informed that there IS a needle to be found: meaning for whatever reason an individual (dissident) is focused on, then it can be entirely effective. I think it obvious that the national surveillance system, including the mega-corporations, is to control the populace, not protect them.]


Google’s ability to target you for advertising is EXACTLY what NSA needs to target you as political dissident, NOT coincidentally.

Rikuo (profile) says:

Re: Logically, then, Google can't work!

Difference between Google and the NSA: The NSA is ostensibly looking for one thing (terrorists) in the overwhelming sea of haystacks. That means that 99.99999999999….% of the data they have is absolutely useless to them.
Google? They help their end users find whatever it is they’re looking for. Google isn’t just indexing the internet and looking for one thing, it’s looking for tens of billions of things, based on search engine queries.

Oh wait…I forgot. You’re an idiot who can’t bother to use logic and reason.

out_of_the_blue says:

Re: Re: Logically, then, Google can't work!

@ “Rikuo”
The NSA is ostensibly looking for one thing (terrorists) in the overwhelming sea of haystacks.


First: said you’d stop replying to me, but you can’t.

Listen, sonny. You obviously didn’t even read mine, are just yapping at sight of my screen name. Here’s what I wrote above that directly addresses the needle bit: “Of course, [Mike] has as premise that NSA is looking for needles instead of broad trends.” — SO WHY DO YOU REPEAT THAT PLUS SOME AD HOM? — Cause you’re just a nasty little kid trolling this fine site.

out_of_the_blue says:

Re: Re: Re:2 Logically, then, Google can't work!

@ “Rikuo”

>>> “Listen, sonny. You obviously didn’t even read mine,”

It’s true. I didn’t read your post. That’s how, in my reply, I was able to respond to a point you made and countered it…wait what?


So long as you’re OFF-TOPIC, it’s FINE with me! You’ve not countered my points.


So long as “The Market” (if not NSA directly) rewards Google for spying, do you expect it to do LESS of it?

Rikuo (profile) says:

Re: Re: Re:3 Logically, then, Google can't work!

“So long as you’re OFF-TOPIC, it’s FINE with me!”

You do realize what you’ve just said, don’t you? You don’t want a discussion on the topic of the article, you want to derail it. Oh and yes, I did counter your point. You said that because Google is able to find things, then obviously the NSA should be able to as well, which I countered by pointing out that Google and the NSA search for entirely different things and in completely different ways. Not to mention the fact that at best, NSA staff are incompetent, what with their Utah datacenter catching on fire at least ten times.

Ninja (profile) says:

Re: Logically, then, Google can't work!

Actually Google gives a shit to what you do. They want to present ads tailored to what they think suits you. And they often fail miserably even if you do give all details for free and by choice (remember the NSA just spies on you without asking for permission). That’s the ad part.

The search engine part wants to know everything that’s publicly available so you can search for it. Sort of a PRISM of websites except that they don’t go into what is locked down or closed.

I know you have wet dreams of going Rambo and entering Google headquarters with machine guns firing an impossibly large amount of bullets while looking and smelling definitely macho. But you see, it’s not what you are thinking…

out_of_the_blue says:

Re: Re: Logically, then, Google can't work!

@ “Ninja”
Actually Google gives a shit to what you do.


Darn right it does: that’s how it gets money, by tracking me all over the net. It’s as invasive and controlling as it can be, but has plans for more.

I’d already written a new tag line that answers your drivel:


So long as “The Market” (if not NSA directly) rewards Google for spying, do you expect it to do LESS of it?

Ninja (profile) says:

Re: Re: Re: Logically, then, Google can't work!

It doesn’t. It doesn’t care if you sniff mushrooms, sodomize yourself or make terrorist plots. They don’t care if you are Republican, Democrat, Pirate, Yuppie, Goat Lover or whatever. They just use the data they collect on you to try to drive you to click their ads. Nothing else.

That’s the difference 😉

Anonymous Coward says:

Of COURSE most email is spam

Long-term (multiyear) measurement across varied ASNs, networks, countries, domains, servers, MTAs, etc. suggests that somewhere around 96% to 98% of all email messages are abuse of some variety, including spam. Any estimate below 95% should be discarded as incompetent; any above 99% is certainly plausible, but likely not indicative of the whole.

If you accept the “collect it all” position, just for the purpose of argument, then the problems with this become obvious, both in terms of scale and searchability. If you don’t accept that position, then this leaves the NSA (and friends) with the problem of discerning — prior to collection — which traffic is and isn’t junk. Either way, these alternatives pose serious technical problems, even before we get to questions of legality, ethics, long-term benefit to the nation, etc.

And as a side note, let me add that in recent years spammers have gotten quite crafty about individualizing their messages to achieve traceability: for example, per-message differences in whitespace (often at the end of lines, where it’s hard to notice) have been used in order to figure out which abuse victim is the one reporting them and thus which one should be retaliated against. This same mechanism could also be used to bury useful information in massive spam runs: to the casual observer, it would look like another 300-million-message incident. But to the single recipient within those 300M, it could be a coded message.
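
As a rough illustration of how little it takes to hide a value in trailing whitespace, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `embed_tag` and `extract_tag` helpers are hypothetical, and the encoding (one bit per line, a trailing space for 0 and a trailing tab for 1) is just one simple scheme, not a description of what any actual spammer uses.

```python
def embed_tag(body, tag, bits=16):
    """Hide an integer tag in trailing whitespace, one bit per line.

    A trailing space encodes 0 and a trailing tab encodes 1
    (an arbitrary convention chosen for this sketch).
    """
    lines = body.split("\n")
    if len(lines) < bits:
        raise ValueError("body needs at least `bits` lines")
    if not 0 <= tag < 2 ** bits:
        raise ValueError("tag does not fit in `bits` bits")
    marked = []
    for i, line in enumerate(lines):
        # Remove any existing trailing whitespace so only our marks remain.
        line = line.rstrip(" \t")
        if i < bits:
            bit = (tag >> (bits - 1 - i)) & 1
            line += "\t" if bit else " "
        marked.append(line)
    return "\n".join(marked)


def extract_tag(body, bits=16):
    """Read the tag back from the first `bits` lines."""
    tag = 0
    for line in body.split("\n")[:bits]:
        tag = (tag << 1) | (1 if line.endswith("\t") else 0)
    return tag


# Hypothetical 20-line message body and a made-up tag value.
body = "\n".join(f"line {i}" for i in range(20))
marked = embed_tag(body, 12345)
print(extract_tag(marked))
```

The marked message looks identical to the original in most mail clients, because the only difference is invisible characters at the ends of lines.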

Anonymous Coward says:

“Latest Revelations Show How Collecting All The Haystacks To Find The Data Makes The NSA’s Job Harder”

Of course it’s making their job harder! And any first year CS student could spot the flaw a mile away.

They should be collecting hay in heaps instead of stacks. Much easier to search.

It’s shameful that the NSA doesn’t know of this.

/It’s a programming joke…move along. And please don’t punch the nerd.

mcinsand (profile) says:

overwhelming spam

As an internet user, I’m concerned about my bandwidth, especially since it will only be a matter of time before Raspberry Pis start being commissioned as unattended spam servers. Solar panel in the bushes plus tiny credit-card-sized computer near a wifi hotspot…

Spam seems to be a logical counteraction to the NSA’s dataslurping. I will be surprised if we don’t end up with overloaded internet infrastructure shortly.

Wally (profile) says:

Statistics...

The fact that the government was scanning way too much has been a statistical constant…the prevented attacks some view as nonexistent actually did exist…At that time, the search was narrowed to only spying on people on foreign soil…..

I think that because of this mass of statistical data it has, the NSA does a lot of bad math for “good” intentions…

I see where they could easily think that the metadata collected can easily create a psych profile for pretty much every smart phone user on Earth.

Bear in mind I’m putting rights aside for a moment as it is fairly obvious that the 4th Amendment was violated…I’m here to talk about math:

The mathematical flaw, it seems, isn’t in the data collected about us creating a good psych profile..it’s the quantity in which it was collected. When collecting metadata (variables in statistics and marketing) it’s always good to have good quality data in small organized chunks…The NSA gets an F in Statistics and Marketing…which deals in predictable outcomes…too much data can cause issues when data is actively being sought out…

In short…mathematically, the NSA had good quality data…but they now have so much data that a super computer has trouble finding correlation data…

Now in my profession, when a correlation study is run, we try to do a narrowed search about the specific data we need…the term “terrorist” is extremely broad in my opinion as psych classification metadata because there are so many things to look at and collect for to match that profile…The NSA failed so badly to narrow the data they needed that the abnormalities could not be spotted sufficiently…

I can now conclude that the entirety of the NSA completely failed in preventing Benghazi and Boston.
