Truth, Reality and the NSA

What I See When I Look Through Bart Gellman's Dark Mirror

May 20, 2020

Bart Gellman’s Dark Mirror is neither a history of the National Security Agency, nor is it a biography of Edward Snowden, whose documents Gellman published in the Washington Post, and it’s certainly no polemic about government surveillance. What it is, at its core, is an account of his meticulous reporting on these subjects. That’s what makes it so valuable. It’s a story about how a good journalist with ethical sensibilities and an allegiance to his country makes hard choices about what to publish. It is a tale about truths.

Truths are hard to come by. It's very easy to say that we all want the truth about complicated political questions. It's also easy to say that, when confronted with truths, we might resist the allure of tribal affiliations and move towards the light. Trying to figure out what the NSA’s surveillance programs did and didn’t do shows the limits of the word “truth,” which suggests a certainty about facts. Reporters are supposed to pursue truths, and we wield that charge as a professional sword. What we are actually doing is searching for reality. A mirror can show you a truth. Reality may be elsewhere.

Even uncontested facts can be completely misinterpreted and misunderstood by the very people who are defending them in the first place. This problem is inherent to complex bureaucracies. It is acutely dangerous when those complex bureaucracies are involved in protecting national security. And it is potentially a threat to your privacy when the national security entity involved is the National Security Agency, which probably collects more information or, if unprocessed, the unrefined flower of information, than any other entity on earth, aside from Google, Facebook and Amazon.

Gellman writes about the National Security Agency from the perspective of a journalist who understands the natural tension between reporting and national security equities, and as someone who, throughout his career, has understood that the arguments for and against keeping something classified are often irreconcilable. That is, neither side can make a better argument than the one it’s already making. So at some point, you just have to choose.

Gellman was among the journalists who had early access to Snowden’s archive, communicated with Snowden and met him later on. His view, unlike the view of almost everyone who has written about Edward Snowden, is quite nuanced. And Gellman carefully explains how he formulated the view and how the view changed over time. He applies the same lens of curiosity, perspective, and not a small degree of empathy to the national security gatekeepers.

This is where lessons about truth become really interesting. In the course of his reporting, we learn a number of things about the NSA programs and lines of argument used to defend them. There are few abject liars in the book. There are, however, senior officials who pick a line and stick to it, even when reality seems to misalign with the facts. In some cases, it seems clear that the folks at the highest level of government who used NSA products to develop policy simply did not understand how the programs worked from the bottom up, and had little instinct to probe the engineering culture that built them. High dudgeon appears often when these officials, rightly, say that neither Gellman nor, certainly, Snowden had the right to determine what constitutes national security information worthy of protection, and that they therefore shouldn’t be in the business of publishing classified information. This is a normal argument from the government. But it presupposes that the folks who do make these decisions have a full sense of the reality about the programs. And Gellman shows, in several instances, how that wasn’t true. Indeed, if you’ve followed the NSA’s own struggle to comply with the law – a struggle that the agency itself has acknowledged through declassified court filings – you get the strong sense that the NSA’s worldwide system of intelligence collection became an emergent creation; no one person, or even a group of people, understood every truth. And so the agency found itself misrepresenting both the scope and limits of its own technology, over and over.

There are two revelations in particular, and in order to get to them, we have to wade into the muck a little bit. There is a repository at the NSA called MAINWAY. NSA likes to capitalize cover terms. Gellman makes the point that MAINWAY, when looked at from the perspective of NSA senior officials, was simply a database of telephone numbers that sits somewhere, or resides in various clouds, and doesn't do anything, and certainly isn’t invading anyone's privacy, until an analyst with a valid reason and a legal predicate decides to query it. This is true, but the truth is a self-selective truth. By that I mean: it is a truth that relies on a set of definitions the NSA itself invented to help it collect intelligence.

To non-NSA folks, MAINWAY is the place where all of those telephone records collected by the NSA went. It's the database that we associate with contact chaining – that is, this person called this person, who called this person, who called this person. What Gellman figured out is that in order for MAINWAY to actually work, it had to preprocess the data that it had, and essentially create “social graphs” on every American. These graphs live in MAINWAY. So the data within the database is not neutral. It’s not a set of numbers linked to a random identifier. From the NSA’s perspective, nothing happens to the data until an analyst queries it. It has no legal significance until an analyst queries it. The analyst says: I'd like to know, based on this particular reasonably articulated suspicion that Marc Ambinder is doing something that has a nexus to terrorism, who he has been in contact with. MAINWAY then spits out my chained contacts – who I called, who I texted, and whether the enriched data shows indices of suspicion.
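For readers who think better in code, here is a toy sketch of what contact chaining over a precomputed graph looks like. It is an illustration only, not a description of how MAINWAY is actually built: the function names (build_graph, chain_contacts), the two-hop limit and the sample records are all my assumptions. The point it makes is the one above. The graph is assembled when the records arrive, and the analyst's query merely reads what is already there.

```python
from collections import defaultdict, deque


def build_graph(call_records):
    """Hypothetical ingest step: turn (caller, callee) pairs into a contact graph.

    This runs when records arrive, before any analyst asks anything.
    """
    graph = defaultdict(set)
    for caller, callee in call_records:
        graph[caller].add(callee)
        graph[callee].add(caller)
    return graph


def chain_contacts(graph, selector, max_hops=2):
    """Hypothetical query step: return {number: hops} within max_hops of selector.

    A breadth-first walk over the graph that already exists.
    """
    seen = {selector: 0}
    queue = deque([selector])
    while queue:
        current = queue.popleft()
        if seen[current] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(current, ()):
            if neighbor not in seen:
                seen[neighbor] = seen[current] + 1
                queue.append(neighbor)
    del seen[selector]  # the selector is not its own contact
    return seen


# Made-up sample records; the letters stand in for phone numbers.
records = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "A")]
graph = build_graph(records)              # done at ingest
contacts = chain_contacts(graph, "A", 2)  # done at query time
print(contacts)
```

In this toy version, everything the query returns for "A" was fixed the moment build_graph ran. The query adds no new information. It only makes visible what was already computed.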

It’s like the NSA is a big shoe store. You ask for a 9 and a half. It already exists in the back, and the shoe salesperson will bring it to you. It didn’t spring into existence when you asked for it.

However, looking at it from the perspective of the people who put the database together, and Gellman does an amazing job of channeling the sociological impulses of the young hacker generation at work at the NSA, all of the work has already been done. That is to say: I exist as a social graph already in MAINWAY (or I did at the time that MAINWAY was actively operating in 2013). And not only that, my phone number and my phone contacts would already be enriched by other NSA data the moment the analyst decided to add my name and certify that I was a valid selector. So is it accurate to say that the NSA had dossiers on every American?

The NSA would say no. Because nobody could see anything. Nothing existed until an analyst asked for it. But I might be willing to say yes, because they existed. Both perspectives are true, but one is closer to reality. The NSA was afraid to say publicly that it had dossiers on Americans because of the connotation of the word. “Dossier” suggests some sort of careful curation. This curation was done by MAINWAY and other software, rather than humans. But it was a dossier nonetheless.

The question I still have is: did NSA officials who are not experts in engineering, and even a number of lawyers and certainly policymakers at the White House, actually know this? Not to say that anybody hid it from them, but would they have had the interest or knowledge to actually ask, “Well, how does this actually work? What is pre-computed and what isn't?” Was this ignorance willful? Or was it incidental?

There is another revelation in the book where this confusion of truth comes into play. Snowden revealed that the NSA had secret partnerships with corporations like Google, who would, upon receipt of a request from the NSA and the certification of the Justice Department, copy and then send stored communications to, from or about a particular non-U.S. person to a system created by NSA. This was PRISM: a secret relationship governed by laws and good faith. U.S. persons couldn’t be queried under PRISM (although NSA had trouble cutting away the U.S.-persons fat from the bone, because internet data tends to be bundled in a way designed for efficiency, not intelligence collection). Put aside the debate about PRISM itself (or read about it here).

Snowden then discovered that the NSA also collected unencrypted content and metadata from Google without Google’s knowledge. It had mapped the internal architecture used by the company, and tapped into the cables through which Overseas Google Server Farm A sent traffic to Overseas Google Server Farm B. Google has lots of server farms; within the Google system, at the time, encryption happened at the point where data left Google’s servers and touched the regular internet. Inside, everything was, for the sake of efficiency and ease, en clair. So NSA got stuff from Google formally and “upstream” – and the latter was a lot more, and had to include a lot of American e-mail traffic.

As you read this, you wonder: who would approve of such a system? How dangerous would it be to the actual relationship between NSA and Google, which had a legitimate intelligence purpose, if Google knew that the NSA had figured out how to intercept the unencrypted traffic between Google data centers outside the United States? Did policymakers know that this was how NSA was collecting Google's upstream data? Did policymakers know or ask or have the ability to ask the question?

It strikes me that the cohort of engineers that developed these fairly ingenious ways of intercepting lots of information had an incentive, perhaps, to be cagey about these things, or they simply didn't care because it wasn't their job to preserve the corporate relationship with Google. How much did NSA leaders know about playing Google on both sides? How much did the Senate and policymakers know?

In 2004, James Comey was very briefly the acting attorney general. He refused to sign off on a part of the Stellarwind program that dealt with internet metadata because he figured out, more or less on his own, that in order to determine what metadata was important to target, the NSA had to basically collect all of it. And that didn’t comport with Comey’s reading of the law or the classified legal interpretations that were used to buttress the argument that it was legal. It seems like the intelligence community bought into a linguistic obfuscation machine to convince the public that bulk collection on Americans couldn't happen because patriotic Americans would never engineer the system that way. The unpalatable reality was that one simply could not treat US communications differently as they came into the system; only at the point-of-service with the analyst could one make that distinction.

The NSA has worked hard to get a grip on its overcollection problems and synchronize policymakers’ understanding of its systems with the way its engineers do. It has become more transparent. I don’t believe that the agency is abusing its power, and there is ample evidence to suggest that its analysts are well-trained and the agency continues to improve its auditing and oversight. The NSA doesn’t collect phone records anymore and has wound down some of its collection programs that scoop up large amounts of American communication. People familiar with the agency say that it is easier now to find needles in haystacks because the haystacks are smaller; the NSA has figured out how to automatically search for, and screen out, unnecessary or uncollectable communications before they are brought into the system of systems. On the other hand, the agency uses so-called “cyber certs” to screen communication for malicious cyber activity that heads into the U.S., and this process often includes content created by innocent people living here. Corporations are much more careful about their secret cooperation with the government, and many have leaned aggressively into end-point encryption, which is good for the security of the commons. The Foreign Intelligence Surveillance Court asks more probing questions of the NSA and is not satisfied with its answers. And our own sense of privacy, or lack thereof, remains a chaotic mess. We don’t have a collective intuition about what belongs to us in the digital world, and thus we don’t know how to collectively demand that the government protect it or that corporations explicitly and repeatedly seek our consent to use it. Privacy means anything to anyone.

Bottom line: After 9/11, the NSA was asked to do a lot. A LOT. And over time, NSA couldn’t figure out exactly what it was doing on an organizational level. So it invented a language and a series of justifications to account for its technological insufficiencies rather than treating the technological insufficiencies as a hard limiter on what it could do.

Tradecraft: a newsletter by Marc Ambinder
