NSA Personnel: Search For Needles Not Being Helped By Continual Addition Of Hay To The Stacks

from the can't-use-intelligence-if-you-can't-locate-it dept

The NSA’s desire to harvest as much data as possible lies at the root of its defense of the nearly-dead Section 215 collection. Although mostly useless, it is still being defended as a data collection the NSA needs to have around “just in case.” Data — lots and lots of it — is good and useful and helps locate terrorists. And, according to those running these collection programs, the only thing better than data is more data. Hence: “collect it all.” Hence, also: a gargantuan data center in Utah that is still in danger of losing its water supply.

“Collect it all,” proclaimed Keith Alexander, as the NSA amassed haystack after haystack with the needles seemingly little more than an afterthought. “Collect it all,” the analysts yelled back, frantically running haystacks through their analytic spinning wheels in hopes of appeasing King Alexander with the occasional production of counter-terrorist gold.

The Intercept’s cache of documents reveals not everyone in the NSA is so enthralled with haystack-building. Adding haystacks doesn’t aid in intelligence efforts. It just adds more hay. Sooner or later, everything bottlenecks at the analytic point. Worse, it adds to the amount of cleanup that must be done before the data can even be analyzed, as well as possibly removing “signal” while filtering out “noise.”

These (leaked) informal documents contain conversational discussions of intelligence topics that come from about as “everyman” a perspective as spooks sitting in a sea of servers can actually have.

From “Too Many Choices,” by the “SIGINT Philosopher:”

“Analysis paralysis” isn’t only a cute rhyme. It’s the term for what happens when you spend so much time analysing a situation that you ultimately stymie any outcome. It’s what happens inside your grandfather’s brain while you wait endlessly for him to make his move on the chessboard. It’s what happens when I stand in front of the jams and jellies at the supermarket. And it’s what happens in SIGINT when we have access to endless possibilities, but we struggle to prioritize, narrow, and exploit the best ones.

A.k.a. the Netflix problem, for those more prone to stream entertainment than purchase jams and/or jellies. If nothing immediately stands out, the tendency to cycle through list after list of possible choices results in more time spent looking for something to watch than actually watching something.

When lives are potentially on the line, adding more data makes it harder to find what you’re looking for in a timely fashion. Stack up enough hay, and more time will be spent examining and discarding false positives and negligible intelligence than will be spent looking at useful data that might point analysts towards an impending threat.

The SIGINT mission is far too pressing for many team-building activities or brain-storming sessions aiming to improve our organizational approach to analysis. At the same time, the SIGINT mission is far too vital to unnecessarily expand the haystacks while we search for the needles. Prioritization is key.

But this doesn’t seem to fit in with the NSA’s general approach to intelligence gathering. Nearly every program it runs is an effort to gather even more data than it already has. Every exploit it plants gives it another source for intel. Every new agreement it makes with foreign countries’ intelligence services gives it another set of haystacks to dig through. There is no apparent prioritization inherent in its intel gathering. Everything is potentially significant, but its significance can only be determined after it is collected and analyzed. The agency prefers collecting in bulk to targeting. It has been this way for years. So, it’s no surprise that those questioning this approach may find themselves doing the following: [Side note: this paragraph says some interesting things about the Section 215 program capabilities and comprehensiveness.]

Recently I tried to answer what seemed like a relatively straightforward question about which telephony metadata collection capabilities are the most important in case we need to shut something off when the metadata coffers get full. By the end of the day, I felt like capitulating with the white flag of, “We need COLOSSAL data storage so we don’t have to worry about it,” […] because getting the metrics for empirical evidence to review was so very difficult and, frankly, I’m still a little scarred by the experience.

The emphasis is “more hay,” not “better targeting.” And no one seems to know which collections are actually returning useful intel — at least not in an agency-wide sense.

There’s a running joke in the S3 community that we’ll only know if collection is important by shutting it off and seeing if someone screams.

And that screaming may only be because someone thinks their particular haystack-gatherer is useful, rather than it actually being useful.

Despite all of this incoming intel, terrorists are still evading the worldwide surveillance net cast by the NSA and its global partners. Officials tend to blame this on leaks, encryption, “going dark” — anything that doesn’t raise the uncomfortable possibility that the needles it’s looking for are already swimming through its massive haystacks. This isn’t because the NSA doesn’t know what it’s looking for. It’s because it can’t find what it’s looking for.

Snowden… noted in an interview with the Guardian that the men who committed recent terrorist attacks in France, Canada and Australia were under surveillance—their data was in the haystack yet they weren’t singled out. “It wasn’t the fact that we weren’t watching people or not,” Snowden said. “It was the fact that we were watching people so much that we did not understand what we had. The problem is that when you collect it all, when you monitor everyone, you understand nothing.”

Those in the analytic trenches seem to feel the NSA collects too much. Upper-level officials seem far less concerned. The NSA collects to collect. It collects “just in case.” This saves intelligence officials from the unlikely event of having to explain how a gap in coverage resulted in a terrorist attack. It’s CYA by massive data centers. The massive, overlapping collections are just as likely to result in an unthwarted terrorist attack, but it very pointedly won’t be because the NSA didn’t try.

Comments on “NSA Personnel: Search For Needles Not Being Helped By Continual Addition Of Hay To The Stacks”

David says:

Hay scales.

Processing haystacks is just a matter of scale. If you are an agency and given the prospect “we aren’t good enough, we’ll double your funding”, what are you going to do? Produce a tenth of data from the same input? How do you sink so much more money in that?

If you scale your work to your monetary resources, it’s much easier to just process proportionally more input. This just requires expanding your machinery, not the analysts being allowed to view security and privacy sensitive and classified information.

So, of course, you also get proportionally more output.

And if we take our own word redefinitions seriously, the moment human analysts take a look at individual data sets, we are “starting” to conduct a search requiring judicial warrants. Unfortunately, judges don’t scale. So there is not much of a point to scale up the human resources for per-case analysis since we can’t keep up the flow of warrants required to let them do their job.

So instead, computers have to do the “non”-searches. And they’ll turn up what looks suspicious to a computer. Which are patterns of behavior/communication that are so braindead that those programming the searches anticipate them.

Anonymous Coward says:

Re: Re: Hay scales.

Depends on how good they are at “minimizing” and departmentalizing the data.
Actually those two things would be the effect of having decentralized data storage at private companies too, but having to deal with probable cause is such a pain.

Overall, what I am getting from what is being written by SIGINT Philosopher is that a lot of money is used on practical projects while in reality they need to expand research to actually gain meaningful results from such data. Since the type of research needed does not rely on continuous data streams, it would seem much more effective for the NSA to deprecate the mass collection and rely on research to determine the data they can use.

Since research is pretty expensive in man-hours, I am sure that scaling that department up will be able to fill the current budget anyway.

orbitalinsertion (profile) says:

Re: Re: Hay scales.

The comment, your response, and jilocasin @ 15-06-02 07:16 “It’s just right for the proper mission” are all sort of pieces of what I tend to think about, which is: Whatever they are claiming now, and whatever they may actually try to do with this now, the real goal is somewhat nebulously forward-looking. They want a database (and the R&D trajectory) to have their Total Information Awareness in the future. They are waiting for trends in various technologies to reach certain points. They are waiting for science fiction to become reality. They want their massively cross-referenced petadata analytics with slick UIs that can relate “all the stuff” then let some humans make decisions based on correlations in their 3-D VR dashboard environments. All hand-swipey and everything. Who knows. But they want all this, what will be historical data, for some indefinite time in the future where their little wet dreams come true and Kurzweil is right or something. And they want it “too big to fail” until that time comes. (When my children’s children’s children can spy on, predict the actions of, manipulate, and incarcerate (or incinerate) your children’s children’s children and their brother’s sister’s cousin. All of them. All their wives, and all their children, and all their sheep, and all their cattle, and all their cats and dogs.)

tqk (profile) says:

Re: Re:

Maybe the most likely way to stop mass-surveillance is through a revolt by the intelligence workers.

“Dear Congresskritters, you’re sending too much funding to my employer who is then able to command me to do foolish, unproductive stuff. Please stop.”

I don’t think you’ve thought this through. Besides, we’re already demonizing too many whistleblowers. The few who have gotten away with complaining their bosses are breaking the law only barely show up on the radar.

Anonymous Coward says:

The massive, overlapping collections are just as likely to result in an unthwarted terrorist attack, but it very pointedly won’t be because the NSA didn’t try.

Because it’s much better to be seen as incompetent because there was too much information, than incompetent because there was not enough information?

Either way the NSA has an excuse, and our elected officials are dumb enough to swallow it.

Ever get the feeling that we’re just not firing enough people?

Anonymous Anonymous Coward says:

Re: Re:

“Ever get the feeling that we’re just not firing enough people?”

Start with politicians. Use what I’ll call the ‘Rabid Dog Method’. If a candidate foams at the mouth when discussing terrorism or crime or for that matter anything, vote for someone else. Anyone so invested in only one set of issues cannot be focused on what is good for everything, or anything.

But then I think of the way political parties and money in politics work and I get to…oh……wait….

jilocasin (profile) says:

It's just right for the proper mission

I can’t help but think that we only think the current methodology is ineffective, or counter productive even, because we are looking at the wrong use case.

For catching terrorists, where terrorists fit the traditional definition of outsiders trying to use terror to influence/change our government or society, “collect it all” is a useless strategy.

Given that most honest people agree with that position, in what situation does “collect it all” make sense?

The only one I can think of, off the top of my head, is where the government wants to protect itself from its citizens.

We have seen this situation play out countless times throughout history and continuing into today. You need look no further than Nazi Germany, Soviet Russia, Communist China, Islamist Iran, and (I’m not sure how to categorize) North Korea. In modern times, sadly, we can add Great Britain and the United States to that dismal array of countries in fear of their citizenry.

Here in the US of A, the Constitution, and especially the Fourth Amendment, serve as a bulwark against the oppressive mass surveillance of the common man by the government, or at least they used to. Unfortunately too many people are too cowed by the fear of terrorists to think straight. The odds of dying in a terrorist attack on US soil are so far down on the list that personally I don’t even think about it. I am far more likely to die in a car accident, drowning in a swimming pool, getting the flu, heck, being struck by lightning while winning the lottery. Have people died in terrorist attacks? Yes. Will people do so in the future? Most definitely. Will it be me, or someone I know? Not bloody likely.

Nothing is ever totally safe.

The ultimate goal of our government is to safeguard our freedoms. Therefore whenever you hear a government agent or politician say that their primary goal is to keep you safe, be afraid. Not of whatever boogeyman they have currently dragged up, but of their motives. What they are really trying to do is scare you into letting them strip you of your liberty and freedom.

As it’s been said by better men than me over the years:

“Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety.”

[Benjamin Franklin https://en.wikiquote.org/wiki/Benjamin_Franklin]

“Give me liberty, or give me death!”

[Patrick Henry https://en.wikipedia.org/wiki/Give_me_liberty,_or_give_me_death!]

“I’d rather die on my feet, than live on my knees.”

[Emiliano Zapata Salazar https://en.wikiquote.org/wiki/Emiliano_Zapata]

Anonymous Coward says:

Bad for finding terrorists. Great for finding dirt on people.

The main problem is that all this collection isn’t good for looking for patterns and finding “bad people” like in Winter Soldier. However, it’s great if you already have a person in mind and want to dig up dirt on them. And that’s why it’s so problematic. It’s a horrible tool for protecting us and a great tool for violating people’s rights, which by the NSA’s own admission has already happened several times.

Derek Kerton (profile) says:

Effectiveness Or Lack Thereof Is Not The Main Thrust

The effectiveness or lack thereof should not be our main argument against our egregious government surveillance of citizens. The arguments should begin with:

1. It is ethically wrong.
2. It goes against the Bill of Rights.
3. It is illegal.
4. It may not work well.

The main reason point 4 is a weak one is that I can make a very strong counter-argument to the article above:

OK, so the data is just bigger haystacks today. But we don’t want to be like the IBM CEO who estimated a market for maybe 6 computers in the world. The reality is that Moore’s, Kryder’s, and Nielsen’s laws are all in effect, and it’s only a matter of time before Big Data analytics tools actually manage to make sense of this massive haystack.

While we maybe can’t make sense of the haystack today, having data that goes back many years will prove “useful” in the future when we have greater analytical compute capacity. With years of data, not only is there more information to mine, but trend or panel data can be derived, as opposed to just “snapshot in time” data.

So, I’m not convinced the NSA is stupid to want all that data. I just think they are forward-looking. Unflappably insidious, for sure, but not stupid.

Ed C. says:

Re: Effectiveness Or Lack Thereof Is Not The Main Thrust

The crucial part you’re missing is that as computing power increases, the size of the haystacks doesn’t remain static. They continue to grow as well. Sure, there may, someday, be a tipping point when computational power overtakes the problems of data set size, but there’s still the issue of retrieval. Computing power is useless if the system has to constantly wait for the data to be retrieved.

John Fenderson (profile) says:

Re: Effectiveness Or Lack Thereof Is Not The Main Thrust

“The reality is that Moore’s, Kryder’s, and Nielsen’s laws are all in effect, and it’s only a matter of time before Big Data analytics tools actually manage to make sense of this massive haystack.”

Ed C. is correct. The same things that improve the ability to collect and analyze huge amounts of data also increase the amount and complexity of the data to be analyzed.

It’s a bit like crypto: increased computing power makes breaking crypto easier, but it also makes it possible to build even stronger crypto. It’s a perpetual race.

The ineffectiveness argument kind of drives me nuts. Not because it’s wrong (it isn’t wrong) but because it’s allowing the terms of the debate to be derailed from the real argument (ubiquitous surveillance is wrong) to one of technical capabilities.
