The N.S.A.’s Math Problem
Jonathan David Farley, a science fellow at the Center for International Security and Cooperation at Stanford, argues that the NSA’s scanning of phone records “probably isn’t worth infringing our civil liberties for — because it’s very unlikely that the type of information one can glean from it will help us win the war on terrorism.” The reason is mathematical:
If the program is along the lines described by USA Today — with the security agency receiving complete lists of who called whom from each of the phone companies — the object is probably to collect data and draw a chart, with dots or “nodes” representing individuals and lines between nodes if one person has called another.
Mathematicians who work with pictures like this are called graph theorists, and there is an entire academic field, social network analysis, that tries to determine information about a group from such a chart, like who the key players are or who the cell leaders might be. But without additional data, its reach is limited: as any mathematician will admit, even when you know everyone in the graph is a terrorist, it doesn’t directly portray information about the order or hierarchy of the cell. Social network researchers look instead for graph features like “centrality”: they try to identify nodes that are connected to a lot of other nodes, like spokes around the hub of a bicycle wheel. But this isn’t as helpful as you might imagine. First, the “central player” (the person with the most spokes) might not be as important as the hub metaphor suggests. For example, Jafar Adibi, an information scientist at the University of Southern California, analyzed e-mail traffic among Enron employees before the company collapsed. He found that if you naïvely analyzed the resulting graph, you could conclude that one of the “central” players was Ken Lay’s … secretary.
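To make the “centrality” idea concrete, here is a minimal sketch of the simplest such measure, degree centrality, which just counts how many distinct people each node is connected to. The call records and role names below are invented for illustration; they are not drawn from the Enron data or from anything the NSA is known to use.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return the number of distinct neighbours of each node in an undirected graph."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {node: len(adj) for node, adj in neighbours.items()}


# Hypothetical "who contacted whom" records (illustrative only).
calls = [
    ("assistant", "executive"),
    ("assistant", "analyst_1"),
    ("assistant", "analyst_2"),
    ("assistant", "vendor"),
    ("executive", "analyst_1"),
]

scores = degree_centrality(calls)
most_central = max(scores, key=scores.get)
print(most_central, scores[most_central])
```

In this toy graph the node with the most spokes is the assistant who relays messages, not the executive, which is the same trap the Enron example describes.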
Of course, even with his superior understanding of pattern analysis, Farley suffers the same disadvantage as all of us on the outside of this program: having to make guesses on the improbable basis that the USA Today reportage of this classified program is accurate. That is incredibly unlikely. Indeed, considering that I have never attended an event and not found glaring errors in how that event was covered in the press, it is simply inconceivable to me that reporters are going to get it right when covering things that are classified and incredibly complicated through the lens of leaking malcontents driven by an unknown agenda.
The remainder of Farley’s analysis is clever but rather odd:
In addition, the National Security Agency’s entire spying program seems to be based on a false assumption: that you can work out who might be a terrorist based on calling patterns. While I agree that anyone calling 1-800-ALQAEDA is probably a terrorist, in less obvious situations guilt by association is not just bad law, it’s bad mathematics, for two reasons.
The simplest reason is that we’re all connected. Not in the Haight-Ashbury/Timothy Leary/late-period Beatles kind of way, but in the sense of the Kevin Bacon game. The sociologist Stanley Milgram made this clear in the 1960’s when he took pairs of people unknown to each other, separated by a continent, and asked one of the pair to send a package to the other — but only by passing the package to a person he knew, who could then send the package only to someone he knew, and so on. On average, it took only six mailings — the famous six degrees of separation — for the package to reach its intended destination.
Looked at this way, President Bush is only a few steps away from Osama bin Laden (in the 1970’s he ran a company partly financed by the American representative for one of the Qaeda leader’s brothers). And terrorist hermits like the Unabomber are connected to only a very few people. So much for finding the guilty by association.
But the intent of the program, combining the USA Today report and a modicum of common sense, is not to establish guilt by association but rather to find clues. Only a small fraction of the people a terrorist makes contact with are fellow terrorists; that’s a given. Still, one is more likely to find other terrorists on the call list of terrorists than on the call lists of non-terrorists, right? Lawyers, doctors, schoolteachers, and bloggers are more apt to network with their fellows than with those outside those loops. Wouldn’t the same be true of terrorists?
A second problem with the spy agency’s apparent methodology lies in the way terrorist groups operate and what scientists call the “strength of weak ties.” As the military scientist Robert Spulak has described it to me, you might not see your college roommate for 10 years, but if he were to call you up and ask to stay in your apartment, you’d let him. This is the principle under which sleeper cells operate: there is no communication for years. Thus for the most dangerous threats, the links between nodes that the agency is looking for simply might not exist.
Then again, they might. Indeed, the fact that rooting out terrorist networks is hard is the reason for resorting to extraordinary means. If, for example, the American Jihadist Terrorist Association held an annual meeting and published a directory, it would be silly to spend a lot of time having computers scan phone records; we’d just stake out the convention center.
If our intelligence agencies are determined to use mathematics in rooting out terrorists, they may consider a profiling technique called formal concept analysis, a branch of lattice theory. The idea, in a nutshell, is that people who share many of the same characteristics are grouped together as one node, and links between nodes in this picture — called a “concept lattice” — indicate that all the members of a certain subgroup, with certain attributes, must also have other attributes.
For formal concept analysis to be helpful, you need much more than phone records. For instance, you might group together people based on what cafes, bookstores and mosques they visit, and then find out that all the people who go to a certain cafe also attend the same mosque (but maybe not vice versa).
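The cafe-and-mosque example can be sketched in a few lines. This is only a toy illustration of the attribute-implication idea from formal concept analysis, under the assumption that each person is recorded as a set of places visited; the people, place names, and helper functions (`extent`, `intent`, `implies`) are all hypothetical.

```python
def extent(context, attributes):
    """People who have every attribute in `attributes`."""
    return {person for person, attrs in context.items() if attributes <= attrs}


def intent(context, people):
    """Attributes shared by every person in `people` (assumes `people` is non-empty)."""
    return set.intersection(*(context[p] for p in people))


def implies(context, premise, conclusion):
    """True if everyone with the premise attributes also has the conclusion attributes."""
    return extent(context, premise) <= extent(context, conclusion)


# Hypothetical formal context: person -> set of places visited.
context = {
    "p1": {"cafe_a", "mosque_m"},
    "p2": {"cafe_a", "mosque_m", "bookstore_b"},
    "p3": {"mosque_m"},
    "p4": {"bookstore_b"},
}

cafe_goers = extent(context, {"cafe_a"})
shared = intent(context, cafe_goers)  # attributes common to all cafe-goers

print(sorted(cafe_goers), sorted(shared))
print(implies(context, {"cafe_a"}, {"mosque_m"}))  # cafe implies mosque
print(implies(context, {"mosque_m"}, {"cafe_a"}))  # but not vice versa
```

The pair (`cafe_goers`, `shared`) is one node of the concept lattice: a group of people together with exactly the attributes they all share. The one-way implication is the “but maybe not vice versa” in the passage above.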
Well, no kidding. You think our intelligence agencies don’t know that? You think they’re not following those kinds of leads? But, again, terrorists who are high enough on the food chain to be key intelligence targets are likely to avoid hanging out in the same cafe every day with their terrorist friends.
This is because, as Kennedy and Lincoln assassination buffs know, two people can be a lot alike without being the same person. Even if there is only a 1 in 150 million chance that someone might share the profile of a terrorist suspect, it still means that, in a country the size of the United States, two people might share that profile. One might be a terrorist, or he might be Cat Stevens.
Right. But, then, you’ve narrowed the field to 1 in 2 rather than x out of 300 million. That’s a good thing, right? The NSA isn’t taking people who fit a pattern into custody and shooting them. They’re investigating them more closely. Cat Stevens is, last I checked, at large.
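For anyone who wants to check the arithmetic, here is a quick sketch. It assumes a population of 300 million, which is what the op-ed’s own figures imply (a 1 in 150 million profile yielding two matches); the variable names are mine.

```python
from fractions import Fraction

population = 300_000_000                # assumed; implied by the op-ed's "two people"
profile_rate = Fraction(1, 150_000_000)  # chance a given person fits the profile

# Expected number of people in the country who fit the profile.
expected_matches = population * profile_rate

# How much smaller the pool to investigate becomes.
narrowing_factor = population / expected_matches

print(expected_matches, narrowing_factor)
```

Under those assumptions the profile cuts the pool from the whole country down to about two people, which is the point: a short list to look at more closely, not a verdict.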
This isn’t to say that mathematicians are useless in fighting terrorism. In September 2004 — 10 months before the bombing of the London Underground — Gordon Woo, a mathematician and risk-assessment consultant, gave a speech warning that London was a hotbed of jihadist radicalism. But Dr. Woo didn’t anticipate violence just using math; he also used his knowledge of London neighborhoods. That’s what law enforcement should have been doing then and should be doing now: using some common sense and knowledge of terrorists, not playing math games.
Again, just because you read about one particular program in a newspaper does not mean you understand the entire scope of U.S. counterterror ops. Indeed, I read everything I can get my hands (or a computer mouse) on and have no clue about the vast preponderance of it. That’s the nature of highly complicated, secret things.
Math is just a tool.
So, it seems, are some mathematicians.