Anonymized Data?

Richard Falkenrath, a Brookings scholar and former Bush NSC official, has an op-ed in today’s WaPo defending the NSA’s acquisition and algorithmic combing of phone records. His argument boils down to this:

The potential value of such anonymized domestic telephone records is best understood through a hypothetical example. Suppose a telephone associated with Mohamed Atta had called a domestic telephone number A. And then suppose that A had called domestic telephone number B. And then suppose that B had called C. And then suppose that domestic telephone number C had called a telephone number associated with Khalid Sheik Mohammed, the mastermind of the Sept. 11, 2001, attacks. The most effective way to recognize such patterns is the computerized analysis of billions of phone records. The large-scale analysis of anonymized data can pinpoint individuals — at home or abroad — who warrant more intrusive investigative or intelligence techniques, subject to all safeguards normally associated with those techniques.
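
The chain in that hypothetical is, in effect, a shortest-path search over a graph of phone numbers. What follows is only a minimal sketch of that idea, not a description of anything the NSA actually runs. The record format, the `max_hops` cutoff, the labels, and the choice to treat a call as a two-way link are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def build_call_graph(records):
    """Build a contact graph from (caller, callee) pairs.

    Assumption: a call links both parties, so edges are undirected.
    """
    graph = defaultdict(set)
    for caller, callee in records:
        graph[caller].add(callee)
        graph[callee].add(caller)
    return graph


def find_chain(graph, start, target, max_hops=4):
    """Breadth-first search for the shortest chain of calls from start to target."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        if len(path) - 1 >= max_hops:
            continue  # do not extend chains beyond the hop limit
        for neighbor in sorted(graph[path[-1]]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


# Hypothetical records mirroring the op-ed's example, plus unrelated calls.
records = [
    ("ATTA", "A"), ("A", "B"), ("B", "C"), ("C", "KSM"),
    ("A", "X"), ("X", "Y"),
]
graph = build_call_graph(records)
chain = find_chain(graph, "ATTA", "KSM")
print(" -> ".join(chain))
```

Note that the search needs no names at all, only consistent identifiers for each end of a call.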

Frankly, bloggers who aren’t at Brookings and haven’t served on the NSC were cranking out better analysis before our first cup of coffee Thursday morning. By early afternoon, my colleague Steve Verdon even had some cool formulas up laying all this out.

Regardless, Falkenrath’s analysis is fine as far as it goes. Kevin Drum, though, dubs Falkenrath an apparatchik, an odd designation for a legitimate expert employed for a short time at the highest levels of government, and is outraged by the use of the term of art “anonymized.”

Even a child knows that phone numbers can be linked to names and addresses using ordinary commercial databases. There is absolutely nothing anonymous about this data, and only a shameless con man would try to convince us otherwise. Why does the Post give space to this obvious agitprop?

But “anonymized” is not meant to convey “anonymous.” Of course there are means of reverse lookup on the data. What would be the point of searching it for patterns, otherwise?

Sir! We have found evidence that 202-555-1212 is a terrorist!

Excellent job, Smith! Let’s bring him in.

Sir, there’s a problem. . .

The point of anonymizing data is not to prevent figuring out whose data it is but to prevent inadvertent disclosure of information. In this case, presuming the USA Today report is accurate, NSA computers are combing through data looking for patterns that match algorithms written by NSA Poindexters, and analysts are looking at the printouts that get spit out. Presumably, a large number of those printouts turn out to be nonsense. Regardless, until a human gets to the point of thinking there’s a reasonable chance that an Aha! moment has occurred, no one has seen any information that is connected in their brain with any other specific human being.

At the moment a detailed human investigation begins, presumably, it’s time to add names and addresses back into the picture. I suspect the NSA’s handy dandy computers can do that in seconds.

It may well be the case, too, that–before the de-anonymization (I don’t know if that’s a word) occurs–warrants may need to be obtained to actually get the phone records of the individual suspects. It’s one thing to scan information in an anonymized database; it’s another to look at a specific individual’s phone records knowing who that individual is.

FILED UNDER: Uncategorized
James Joyner
About James Joyner
James Joyner is Professor and Department Head of Security Studies at Marine Corps University's Command and Staff College. He's a former Army officer and Desert Storm veteran. Views expressed here are his own. Follow James on Twitter @DrJJoyner.

Comments

  1. Ugh says:

    GWB: Goddam I hate Seymour Hersh.

    CIA/NSA/FBI: Uh-huh.

    GWB: What do we got on em?

    CIA/NSA/FBI: Phone records.

    GWB: Great. Who called him and who did he call?

    CIA/NSA/FBI: [REDACTED]

    GWB: Heh heh heh.

  2. RiverRat says:

    Algorithms Smalgorithms! They’re more likely combing for social webs keyed from a suspected overseas contact.

    If anyone saw the Gonzales testimony in January it was clear he was trying to isolate the “foreign surveillance” program from other programs. It seems to me he was attempting to avoid discussing the fusion of the two programs.

    A (in Pakistan) called B, called C, called D, called A. Now give us warrants to listen to B, C, & D. This is just one simple example.

  3. Toeaz says:

    It’s not the data, it’s who got access. If it’s not just DIA, it could be Plame not liking you…

  4. RiverRat says:

    UGH,

    Bush can’t do that without breaking very specific laws. He would need the collusion of the FBI or the CIA. He sure as hell is not going to get assistance from the CIA. Besides, who gives a damn who a moonbat like Hersh is talking to unless it’s a “reasonable” target? What the hell is the NSA going to do? Put out a contract on Hersh?

  5. anjin-san says:

    Look, let’s just have the government put cameras & mics in all of our homes and get it over with.

    The friggin’ terrorists have already won, because this country is becoming less free with each passing day. The scary thing is how many people have cheered as each bit of the Constitution has been shredded.

  6. Bithead says:

    Why, of course, Drum is going to label him an apparatchik… Since he has no other defense of his position what else would you expect a Drum to do?

  7. Tano says:

    “It may well be the case, too, that–before the de-anonymization (I don’t know if that’s a word) occurs–warrants may need to be obtained to actually get the phone records of the individual suspects.”

    This is the key. I assume that Kevin Drum is writing under the assumption that there is no need for warrants before taking this step (or, further, that any need for warrants might be ignored). I would assume the same. So perhaps we should try to find out if this assumption holds or not.

    If there is such a need for warrants, and it is respected, then Drum is being over-the-top, and James’ criticism is valid. If there is not such need, then Drum is right in his characterizations.

  8. Jon Hendry says:

    “If there is such a need for warrants, and it is respected, then Drum is being over-the-top, and James’ criticism is valid. If there is not such need, then Drum is right in his characterizations.”

    Respecting a need for warrants? Yeah, right. Old Alberto G. says Bush doesn’t need no stinking warrants, because Bush is the queen of America.

    James, your rationalizing of the “anonymization” bollocks doesn’t hold up. They have the data. It’s being used any way they want. I’m quite sure they have the phone-call database tied into other databases, and could bring up all your phone calls in seconds by putting in your name. (That may be against the law, but Bush and Al G. have decided that laws don’t apply to Queen George.)

    The only anonymization that would count would be if, BEFORE turning the data over, the phone companies replaced the phone numbers with unique numbers uncorrelated to any other data anywhere, so that the NSA could determine only that phone 345232 called phone 87322.

    That, however, would be useless for the NSA’s purposes. It would also prevent linking the datasets of all the phone companies, because each company’s data would be different.

    So, forget it. There is no anonymization, unless you trust the scofflaw Bush administration to restrain itself.
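
The scheme described in comment 8 above, where each phone company swaps real numbers for unrelated identifiers before handing anything over, can be sketched roughly as follows. This is a toy illustration under stated assumptions. The `CarrierPseudonymizer` class, the random hex tokens, and the fictional 555 numbers are all invented here, and nothing in the reporting suggests the records were actually processed this way.

```python
import secrets


class CarrierPseudonymizer:
    """Hypothetical per-carrier replacement of phone numbers with random tokens."""

    def __init__(self):
        # real number -> random token; in this sketch the table never leaves the carrier
        self._table = {}
        self._used = set()

    def token_for(self, number):
        """Return the same token every time the same number is seen."""
        if number not in self._table:
            token = secrets.token_hex(8)
            while token in self._used:  # keep tokens unique within this carrier
                token = secrets.token_hex(8)
            self._used.add(token)
            self._table[number] = token
        return self._table[number]

    def anonymize(self, records):
        """Replace both ends of every (caller, callee) record."""
        return [(self.token_for(a), self.token_for(b)) for a, b in records]


carrier_one = CarrierPseudonymizer()
carrier_two = CarrierPseudonymizer()

# 202-555-0102 is a customer-facing number that shows up in both carriers' logs.
records_one = carrier_one.anonymize([("202-555-0101", "202-555-0102")])
records_two = carrier_two.anonymize([("202-555-0102", "202-555-0103")])

# The shared number carries a different token in each dataset, so the two
# hops can no longer be joined into one chain.
linkable = records_one[0][1] == records_two[0][0]
print("chain can be followed across carriers:", linkable)
```

Within a single carrier the call patterns survive, but the hop from one carrier's data to the next is lost, which is the commenter's point about the approach being useless for linking datasets.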

  9. The example given is a load of bull; that’s not how datamining works. Indeed, if there was a known terrorist in your call chain, you could just get a traditional warrant and the NSA’s program would be a complete waste of effort. If you’re tracing a call chain from Mohamed Atta to A to B to C, why worry about collecting data about the unrelated D through Z who have no relation to any of them?

    A better example of datamining would be something like:

    Create two sets of call data. One of the total body (or a sufficiently large subset) of calls between the US and Saudi Arabia. A second of the calls between known terror suspects in the US and known terror suspects in Saudi Arabia.

    Determine a statistic whose expected value differs between the first set and the second set (e.g. the ratio of calls from the US to the calls from Saudi Arabia, the number of common contacts, etc.)

    Now look for pairs of people A and B for whom the value of this statistic is closer to the terrorist value than to the total call value.

    You’ll notice the difference here is that in this example neither A nor B need have known ties (direct or indirect) to terrorism at all. Nor is there anything explicitly terrorist about the activity that got them chosen for scrutiny.
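
The procedure in comment 9 can be sketched in a few lines. This is a toy version under loudly stated assumptions: the statistic chosen here (the share of a pair's calls that originate on the US side), the record layout, and every identifier in the sample data are invented for illustration, and real data mining would need far more care about sample size and false positives.

```python
from collections import defaultdict
from statistics import mean


def outbound_fraction_by_pair(calls):
    """For each (us_party, foreign_party) pair, the share of calls placed from the US."""
    counts = defaultdict(lambda: [0, 0])  # pair -> [outbound calls, total calls]
    for us_party, foreign_party, direction in calls:
        entry = counts[(us_party, foreign_party)]
        entry[1] += 1
        if direction == "out":
            entry[0] += 1
    return {pair: out / total for pair, (out, total) in counts.items()}


def flag_pairs(all_calls, suspect_pairs):
    """Flag non-suspect pairs whose statistic sits closer to the suspect average."""
    stats = outbound_fraction_by_pair(all_calls)
    baseline = mean(list(stats.values()))                    # whole body of calls
    suspect_value = mean([stats[p] for p in suspect_pairs])  # known suspects only
    return sorted(
        pair
        for pair, value in stats.items()
        if pair not in suspect_pairs
        and abs(value - suspect_value) < abs(value - baseline)
    )


# Hypothetical data: (us_party, foreign_party, "out" if placed from the US, else "in").
all_calls = (
    [("s1", "f1", "in")] * 3 + [("s1", "f1", "out")]     # known suspect pair
    + [("u1", "g1", "out")] * 3 + [("u1", "g1", "in")]
    + [("u2", "g2", "out")] * 4
    + [("u3", "g3", "in")] * 3 + [("u3", "g3", "out")]
    + [("u4", "g4", "out"), ("u4", "g4", "in")]
)
suspect_pairs = {("s1", "f1")}

flagged = flag_pairs(all_calls, suspect_pairs)
print(flagged)
```

The flagged pair has no link to the suspects in the data. It is picked out purely because its calling pattern resembles theirs, which is exactly the distinction the commenter draws.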

  10. Roger says:

    “It’s one thing to scan information in an anonymized database; it’s another to look at a specific individual’s phone records knowing who that individual is.”

    True, James. But if you can do the first without any legal oversight or cause for suspicion of criminal activity, why not the second? No one’s rights remain inviolate and the Constitution be damned. That’s the problem.

  11. James Joyner says:

    Roger,

    There’s no meaningful sense in which my rights are being violated by having NSA computers looking for webs of connections with terrorists. Indeed, to the extent that they have been violated, it’s because Verizon turned the records over without demanding a warrant.

    It’s a pretty big leap from that to warrantless searches of specific people’s information. Going from mass to personal is a big leap.

  12. Maggie says:

    I really don’t give a damn if numbers appearing on my phone records are “glanced over” by the government…but could someone suggest how I keep my husband from reviewing those same phone records…He had a total shit-fit when he saw my cell phone charges last month! Sheesh.

  13. Roger says:

    James, you have no way to know if your rights have been “meaningfully” violated. You didn’t know a few days ago Bush has your phone records. You may find out tomorrow he has transcripts of your phone calls. If you concede the laws don’t apply to Bush, you concede him your rights to abuse as he will.

  14. James Joyner says:

    Roger,

    “Bush” doesn’t have the records; NSA does. I don’t concede he doesn’t need to follow the law. As best we know so far, no law was broken. Records were voluntarily turned over by phone companies seeking to help fight the war on terrorists.

  15. Roger says:

    Ok, James, the NSA is withholding the phone records from Bush. Right. “As best we know so far, no law was broken.” Keep using that line while you can. As I’ve pointed out elsewhere, its shelf-life is fading. The mere fact that phone companies participate in criminal activity does not mean it’s legal. There are legal means to fight the war on terrorists that have not been shown to be any less effective than the series of illegal actions we’ve seen so far. In fact, the only reason it would seem to be necessary to engage in these activities without judicial sanction as required by law would be if you intended to misuse the info gathered.

  16. James Joyner says:

    Roger: It’s not at all clear on what basis a judge would be able to issue a warrant for a non-search of a non-specific person and place. Nor is it clear that FISA is even implicated in this particular op.

  17. Roger says:

    Interesting. The judiciary may have no clear basis to sanction this activity, but the executive can do it anyway. Truly fascinating stuff, James.

  18. James Joyner says:

    Executive officials much less accountable to the public than the president, down to the level of beat cops in fact, are authorized to make judgments about protecting public security. Searches are conducted, routinely, without warrants and the courts have long ruled that they are perfectly permissible so long as the basis for doing so is reasonable.

  19. Roger says:

    Wow, James. An amazing stretch there equating a cop on the beat making an on the spot decision within the law with a rogue regime in Washington that has been conducting illegal activities for years. The cop on the beat caught opening people’s phone bills to review their calling records without a warrant would not be taking any walks in the park for many years to come no matter how urgent his claim to be protecting public security.

  20. Jon Hendry says:

    “Searches are conducted, routinely, without warrants and the courts have long ruled that they are perfectly permissible so long as the basis for doing so is reasonable.”

    So, basically, Bush can just “pull over” the entire population of the United States and search us?

    And that’s ‘okay’?