The recent leak of the XKeyscore source code has raised an interesting question. Is there a second leaker? The report written by Jacob Appelbaum and others for DasErste.de detailed the NSA's targeting of Tor users (and even those who just read about Tor) and the harvesting of their communications, but very explicitly did not state that Snowden was the source of this code snippet.
Others noticed this lack of attribution and commented on it. Cory Doctorow at Boing Boing apparently received confirmation that this particular leak did not come from Snowden's trove of documents.
Another expert said that s/he believed that this leak may come from a second source, not Edward Snowden, as s/he had not seen this in the original Snowden docs; and had seen other revelations that also appeared independent of the Snowden materials.
Cryptologist and security expert Bruce Schneier (who has seen the documents released to journalists by Snowden) concurred with Doctorow's conclusion:
And, since Cory said it, I do not believe that this came from the Snowden documents. I also don't believe the TAO catalog came from the Snowden documents. I think there's a second leaker out there.
The TAO catalog was originally revealed by Der Spiegel with reporting by (again) Jacob Appelbaum and Greenwald/Snowden partner Laura Poitras. Nothing in the story explicitly states its origin, although the inclusion of Poitras at least suggests the documents can be traced back to Snowden's stash.
Glenn Greenwald, however, offered his agreement with Schneier's take.
If so, then that's two people who have seen Snowden's documents, including one with ongoing access, claiming there's a second leaker. And if so, the NSA's problem, instead of gradually disappearing from the public eye, will become more severe. Add in the recent leak published by the Washington Post, which shows the agency harvests and stores plenty of unminimized non-terrorist communications with its 702 collections (the same collection the Privacy and Civil Liberties Oversight Board recently found to be more law-abiding and less Constitutionally unsound than the bulk metadata program), and the agency now looks worse than ever. It was completely unprepared for the Snowden revelations, but at least by this point, it has a general feel for the leak release process. Now, it possibly has another leaker offering new data and info to journalists, one who is a totally unknown quantity.
At this point, all anyone has is speculation. If there's another leaker, it's doubtful he or she will make his/her identity known any time soon. Snowden revealed himself as a leaker and that hasn't exactly worked out well for him.
But there are also some indications that this snippet of code came from Snowden's leaks. Errata Security (the group of bloggers that exposed the fakery behind NBC's pre-Winter Olympics "report" that all visitors to Sochi would be instantly hacked) has done its own fisking of the code snippet and come to the following conclusions.
1. The signatures are old (2011 to 2012), so it fits within the Snowden timeframe, and is unlikely to be a recent leak.
2. The code is weird, as if they are snippets combined from training manuals rather than operational code. That would mean it is “fake”.
3. The story makes claims about the source that are verifiably false, leading us to believe that they may have falsified the origin of this source code.
4. The code is so domain specific that it probably is, in some fashion, related to real XKeyScore code – if fake, it's not completely so.
Errata Security notes some of the oddities of the code, pointing out that it looks more like something pulled from a training exercise or manual than directly from XKeyscore itself. More investigation by Errata Security and The Grugq (another security expert) apparently uncovered the fact that the text was pulled from a document (pdf, docx, etc.) rather than an actual source file. But the aspect that seems to indicate this is part of Snowden's stash is the timeline.
As this post to the Tor developer mailing list describes, the signatures in the code are old. The earliest date this file can be valid is 2011-08-08, when the Linux journal reported on TAILS. The latest date might be 2012-09-21, just before a new server was added to Tor that isn't in the XKeyScore list. Since this is shortly before Snowden first tried to contact Greenwald, the dates sync up.
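The dating argument is simple interval reasoning: the file can't be older than the newest thing it mentions, and it can't be newer than the last day its server list still matched Tor. A rough sketch of that logic in Python, using only the two dates quoted above, might look like this. The function names, variable names and dictionary label are made up for illustration and have nothing to do with the leaked code itself.

```python
from datetime import date

# Bounds as quoted above. The dictionary key is only an illustrative label.
REFERENCED_IN_FILE = {
    "Linux Journal report on TAILS": date(2011, 8, 8),
}
# Last date on which the file's server list would still match Tor,
# according to the mailing list post.
LATEST_VALID = date(2012, 9, 21)


def creation_window(referenced, latest_valid):
    """Return (earliest, latest) dates on which the file could have been current."""
    # The file cannot predate the newest item it refers to.
    earliest = max(referenced.values())
    if earliest > latest_valid:
        raise ValueError("bounds are inconsistent")
    return earliest, latest_valid


def in_window(candidate, window):
    """True if a candidate date falls inside the window (inclusive)."""
    earliest, latest = window
    return earliest <= candidate <= latest


window = creation_window(REFERENCED_IN_FILE, LATEST_VALID)
span_days = (window[1] - window[0]).days
print(window[0].isoformat(), window[1].isoformat(), span_days)
```

Any proposed origin date for the file can then be tested with `in_window`. A window like this only bounds when the signatures were current, not when someone copied them.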
If the code is unrecognizable by those who've had access to the documents, that's probably due to it being compiled from various pages and mocked up into a short code excerpt. Rob Graham at Errata Security doesn't feel it's necessarily fake, but believes the origin of the quoted source code may have been obscured -- hence, no citation of Snowden's leaks or any acknowledgment of existing NSA files.
Of course, this could mean another leaker is simply hiding behind Snowden, and has pulled files from roughly the same date range so as to deliver new leaks while remaining undetected. If there is another leaker, my guess is he/she will be discovered rather than coming out publicly.
New leaker or no, the one-two punch of published leaks by Jacob Appelbaum and Barton Gellman (of the Washington Post) shows that the NSA is doing everything it's been accused of -- namely, hoovering up and holding onto incidental communications (even those originating from "untargeted" American citizens) and viewing anyone with even a passing interest in anonymity or encryption as "suspicious."