Saturday, June 08, 2013

A Gift that Keeps on Giving

There is one very important question about government surveillance that I have not yet seen discussed: What happens to the data?

To see why that matters, imagine it is 2016 or 2020 and the candidate of the incumbent party faces a serious risk of losing. Someone in the security apparatus, loyal to the candidate or believing that the election of the opposition candidate poses a serious risk to security, starts looking through a massive database containing records of all phone calls made over the previous ten years—who called whom from where—looking for calls made by or to the candidate. He finds in the pattern evidence of an extra-marital affair. He waits until the candidate has been nominated, then leaks the information to a friendly reporter. Or imagine that some important bill is up in Congress and the vote is very close. A senator opposed to the bill gets a call from someone who makes it clear that he has somehow obtained information of misdeeds by the senator, political, marital, or legal, and the information will become public if the senator shows up and votes against. 

Modern technology makes it possible to inexpensively store and access vast amounts of data. The hard drives I own have a total capacity of several terabytes; a terabyte is enough to store a significant amount of data—about six hundred words worth—on every person in the U.S. The much larger storage facilities available to the National Security Agency should have no difficulty holding the complete calling records of the entire population over a period of decades. And once the information is there, whatever the legal purpose for which it was collected, it can be used for other purposes.
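
The per-person figure can be checked with a back-of-the-envelope calculation. The sketch below is only illustrative: the population estimate and the bytes-per-word figure are assumptions, not numbers from this post, and `words_per_person` is a hypothetical helper name.

```python
# Rough check of the "one terabyte covers everyone in the U.S." claim.
# Every constant here is an illustrative assumption.
TERABYTE_BYTES = 10**12        # decimal terabyte, as drive makers count it
US_POPULATION = 316_000_000    # rough 2013 population estimate (assumption)
BYTES_PER_WORD = 6             # about five letters plus a space in plain text (assumption)


def words_per_person(capacity_bytes: int, population: int, bytes_per_word: int) -> int:
    """Return how many plain-text words fit per person in the given capacity."""
    bytes_each = capacity_bytes // population
    return bytes_each // bytes_per_word


if __name__ == "__main__":
    words = words_per_person(TERABYTE_BYTES, US_POPULATION, BYTES_PER_WORD)
    print(f"About {words} words per person")
```

Under these assumptions the result lands in the range of five to six hundred words per person, which is consistent with the estimate above.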

Whatever else comes out of the current controversies, one thing that should come out, and probably will not, is a requirement that all data collected and not used be erased within a reasonable period, say a year, after collection.

26 comments:

Tibor said...
This comment has been removed by the author.
Tibor said...

That is an all too familiar pattern in my country. Before the second-to-last parliamentary elections here there was a tie between the ČSSD (socialist party) and ODS (sort of a right-wing/centrist party which uses a lot of free-trade rhetoric but then acts in a way similar to the German CDU, Merkel's party). A week before the election a scandalous police report "leaked" to the media. It was a report by Col. Kubice; it was called Octopus (not very imaginative, but never mind) and linked various ČSSD politicians to people from organized crime.

The timing was perfect and tipped the balance. We have a multi-party system which almost always results in ties, so those 2 or 3 seats they gained meant that they got a narrow majority (101 or 102 out of 200 seats) for their coalition in the parliament. Then there was one more election and 3 more internal affairs ministers...and now the current head of that ministry is Mr. Kubice. He was fired from the police force after his report was published, so he no longer is a policeman, but now he is a head of the department that controls the police.

I'm not saying that information should not be published...but the timing and the fact that Mr. Kubice is now a minister (although not a party member) in a government with an ODS prime minister (albeit there was yet another election in between those two events) is suspicious to say the least.

RKN said...

I've read that any and all access to the data must be approved by a FISA court judge, based on sound justification of reasonable suspicion of terrorist activity. Problem is, these rules, like the data itself, are evidently also a secret, hardly transparent, and what do you want to bet "mutable" as the government sees fit.

First the camel's nose appeared, nobody was alarmed, before long his whole head was inside the tent, and now ...

chriscal12 said...

Especially since we know that much more innocuous information (census data) has been used for much more sinister purposes (interning the Japanese).
http://www.scientificamerican.com/article.cfm?id=confirmed-the-us-census-b

Unknown said...

@RKN

That court is often little more than a rubber stamp, with only a handful of requests outright denied (many have been modified) since its inception in 1979.

Handle said...

If there's a reason to record it there's a reason to keep it. There's no such thing as some arbitrary time limit of when we can presume it won't be necessary to mine the data to uncover the activities of foreign and malicious non-state actors. That's the focus of the data aggregation, and it really is just that simple no matter how much people want to believe "The government is spying on American citizens!"

And if you do that - you've just created the world's most obvious loophole around US intelligence efforts. Foreign actors don't even have to be experts to go dark, they'll just use X-year strength encryption to pass their messages.

There are already plenty of anti-indefinite-retention "mandatory destruction" and "time to live" requirements.

But those apply to "Collection" which, unlike what everybody thinks it means, in the US is actually a legal term of art that means, basically, "delivered" to an analyst or some other government employee with a need to know.

There are also plenty of firewalls "for intelligence purposes only" to ensure that information is almost never given to Law Enforcement, and if so, that it can't be used for the purpose of prosecution.

And finally, the system is set up specifically to prevent its being accessed without a trail (mostly to detect even sophisticated espionage) and certainly not by any one lone loyalist, even a senior leader, trying to provide dirt to their political patron against the opposition. There's no way to get that stuff without lots of people knowing about it and, as I'm sure is apparent, no way to keep it secret if you did, because it'd be such an explosive piece of news.

There is no way to resolve the mandates of "protect privacy" and "know everything". Most of what people imagine are "structural reforms" to police such a system are, in reality, already in place. But the details are, and will properly always be, classified. We'll learn to live with it.

dWj said...

Would the FISA judge have the authority to require that the data be pre-processed in some manner before being passed along, or can the judge just say "these records will be released", or "these will not", with some non-computational specificity as to the records?

It seems, in principle, as though there are occasions when I'd say, "you've justified being able to give Verizon this particular script to run that will return 10 kilobytes of data that are initially fairly diffuse across the database, but not the entirety of any particular record independent of whatever calculation the script might be doing". (You'd need a somewhat odd combination of skills to put together such an order in a useful way, though. Hire a retired patent lawyer or something.)

August said...

No, it wouldn't be a pattern in that case. It would merely be the calls. In other words, the guy would go through the calls until he found something. See, the pretext is that this computer modeling works. I don't think it does. It has been creeping in everywhere, even DNA research, but its most annoying deployment is in climatology. At some point, you have to test against the real stuff, whatever it is, and these multivariant computer models aren't very good at prediction.

jimbino said...

There is no such thing as "amounts of data" any more than there is "amounts of people," "amounts of strata," or amounts of "errata."

An enormous amount of information no doubt comprises numerous data.

Anonymous said...

David is correct but the threat may actually be worse. We can assume that over time the IRS records, bank records (probably including credit card statements), emails, Google searches, etc. will all migrate into the same database. Now attach a computer with Watson-like programming and the Federal legal code in its memory. Plug in a name and get out a list of likely felonies. We saw what the IRS did with the Tea Parties. Imagine what could be done with this capability.

Tom W. Bell said...

A friendly amendment to your proposal that all collected data be subject to mandatory erasure: Give the subject of the data the right to receive a copy. Autobiographers, quantified selfers, and cryonicists, among others, would appreciate that data. We've paid for the service; why throw the work away?

Anonymous said...

Since data is easily copied and budgets a secret, any kind of firewall or mandatory erasure is unverifiable and therefore has no meaningful deterrent effect.

Does anybody believe the ATF doesn't keep a database of firearms sales, for example?

Tibor said...

I think you are all forgetting one thing here in the comments. There are not just legal ways for the information to leak out. That is number one. Number two: We have little information about the exact ways agencies such as NSA work and what the goals of some of their high ranking officers are. And I am pretty sure that if they want it bad enough, they can leak information by "forgetting to protect a datastream" or by "losing a laptop" with valuable information, or probably by some more ingenious methods.

Jimbino: Partly true, the amount of information can protect you a little, but not all that much. As long as you are not an important person for the agency, you are protected. They don't have enough power or workforce to spy on everyone. But they have the capability to spy on anyone. So once you become a moderately important politician, public speaker, entrepreneur or whatever they consider important, they can easily filter the data and see you.

A friend of mine now works for a company that deals with botnets and similar security threats. One effect of that is that (by his own words) he's becoming increasingly paranoid :) They received a newsletter from a similar company that actually deals with removing those threats (my friend's company just deals with making sure the customer's network is safe from such attacks) and they had one target labeled as "threat no. 1"...which turned out to be a 20 storey building in China full of hackers working for the government. Now, I don't know if I can see the report myself, probably I could since otherwise he wouldn't even tell me, so I might ask him to send it to me and post it here so it is not just hearsay. And also:

http://www.wired.com/threatlevel/2012/03/ff_nsadatacenter/

It is a "data centre"...but that is basically a euphemism for spying centre.

Sure there are good reasons to keep some stuff secret. But in the case of secret governmental agencies the question is whether the benefits really outweigh the cost. And it is hard to evaluate that since all of it is secret :)

And lastly one great SMBC comic on this:

http://www.smbc-comics.com/comics/20130428.gif

bruce said...

'a requirement that all data collected and not used be erased'

That requirement would be a huge moral hazard. How many career people, knowing there will be a cluster-f every administration if they follow that law, would quietly make illegal records? How many political appointees would think, 'if it's illegal to use, it must be worth doing'?

American political scandals are skewed by the Democrats running the media. This won't change. More laws, more crime.

RKN said...

"That requirement would be a huge moral hazard. How many career people, knowing there will be a cluster-f every administration if they follow that law, would quietly make illegal records?"

I hadn't considered that, adding phoney patterns of data to the database and doing so in a way that made it look like the data had been collected.

Unknown said...

Professor Friedman,

I think it is unreasonable to expect any data to be erased, as it is becoming increasingly easy to store secret backups, as storage capacity becomes cheaper. Not only that, but comprehensive surveillance is increasingly affordable for entities other than the government.

In the spirit of your "Future Imperfect" book, I'd be interested in visions of a future where comprehensive surveillance with infinite memory is accepted as a given and assumed by all.

I don't think that it would be the end of the world. It would simply mean that end-to-end encryption and communication path hiding (mixnets such as TOR) will become more important and ubiquitous.

Tibor said...

With regard to Future Imperfect and encryption I think what is going to be an interesting event pretty soon is the introduction of a functional quantum computer. Now when all computers (or most) are like that, society will go on as usual, only in improved conditions. I guess there are a lot of people in algebraic departments who work on encryption in the era of quantum computers. But what will be interesting and possibly dangerous is the leap in between the technologies.

If for instance, the US (or other) government (or some other organization) manages to produce a quantum computer, they will be able to brute force any encryption in the world...at least as long as other people still use regular computers. That window of opportunity won't be big, but possibly big enough for that organization to cause a lot of damage if they wish to. Or am I missing something and there is no such danger?

James said...

Who is to say this isn't already happening?

The official story is that it was a bank reporting requirement that set off the investigation of Eliot Spitzer, but we only have the Feds' word for that.

Spitzer made a lot of enemies on Wall Street. Is it really inconceivable that someone with pull in the administration started poking until they found something?

And once you have the full story on his activities, it's easy enough to go back and create a plausible pretext for the investigation.

Laird said...

"And once the information is there, whatever the legal purpose for which it was collected, it can be used for other purposes."

Correction: it WILL be used for illegal purposes. There has never been secret data collected which hasn't been perverted to illicit ends. Just as there has never been government power which hasn't been abused.

As others have already said, one can never be certain that the data has been erased. It's far too easy to keep secret backup copies. The only solution is to prohibit the collection of the data in the first place. Frankly, I'd rather take the risk of an occasional terror attack succeeding than give our government complete data about every one of us all the time.

And the FISA court is a joke. I have absolutely no confidence in their willingness to protect our privacy, because there is no control on their activities. As far as any of us know, they sign everything put before them. Indeed, that must be the easiest job in the entire federal judiciary; sign whatever the government gives you and head back to the golf course. We need transparency there as in everything else the government does.

Tibor said...

A remotely related comic joke:

http://www.smbc-comics.com/comics/20130610.png

Unknown said...

The alleged whistleblower, Edward Snowden, might seek asylum in Iceland. In a sense, he is attempting to "change godord" a la medieval Iceland. Obviously, modern Iceland does not resemble contemporary Iceland. But there is a (pretty cool) historical connection between what Snowden is trying to do and the history of the place in which he is trying to do it.

Mike Fagan said...

"Obviously, modern Iceland does not resemble contemporary Iceland..."

What a funny typo! Presumably you meant medieval Iceland does not resemble contemporary Iceland.

Unknown said...

Mike Fagan: Yes, I intended to write "medieval" where I ended up writing "modern."

Anonymous said...

TrackMeNot runs in Firefox as a low-priority background process that periodically issues randomized search-queries to popular search engines, e.g., AOL, Yahoo!, Google, and Bing. It hides users' actual search trails in a cloud of 'ghost' queries, significantly increasing the difficulty of aggregating such data into accurate or identifying user profiles. To better simulate user behavior TrackMeNot uses a dynamic query mechanism to 'evolve' each client (uniquely) over time, parsing the results of its searches for 'logical' future query terms with which to replace those already used.

I find this interesting. One method to discourage snooping might just be having computers constantly send junk data. The next step might be for computers to log on to fake email accounts and send affair conversations back and forth.
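
To make the junk-data idea concrete, here is a minimal sketch of hiding real queries among random decoys. It is not TrackMeNot's actual code and it sends nothing over the network; the function names, the toy vocabulary, and the `decoys_per_real` parameter are all hypothetical, chosen only for illustration.

```python
import random
from typing import Iterable, Iterator, Sequence, Tuple


def make_decoy(vocabulary: Sequence[str], rng: random.Random, max_terms: int = 3) -> str:
    """Build one random 'ghost' query from a word list."""
    n_terms = rng.randint(1, max_terms)
    return " ".join(rng.choice(vocabulary) for _ in range(n_terms))


def mix_with_decoys(
    real_queries: Iterable[str],
    vocabulary: Sequence[str],
    decoys_per_real: int,
    rng: random.Random,
) -> Iterator[Tuple[str, bool]]:
    """Yield (query, is_real) pairs, surrounding each real query with decoys.

    An observer sees only the query strings; the is_real flag exists so the
    caller (or a test) can tell the two kinds apart.
    """
    for query in real_queries:
        batch = [(query, True)]
        batch.extend(
            (make_decoy(vocabulary, rng), False) for _ in range(decoys_per_real)
        )
        rng.shuffle(batch)  # hide where the real query sits within its batch
        yield from batch


if __name__ == "__main__":
    rng = random.Random(0)
    words = ["weather", "recipes", "football", "movies", "cars"]  # toy vocabulary
    for text, is_real in mix_with_decoys(["david friedman blog"], words, 4, rng):
        print(("REAL " if is_real else "decoy"), text)
```

With four decoys per real query, only one in five logged queries reflects the user's actual interest. A purely random vocabulary like this one would be easy to filter out, which is presumably why the description above stresses evolving the decoy terms from real search results.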

Patrick R. Sullivan said...

Seems pretty far-fetched that anyone would go to the trouble to research their political opponents this way when it's much easier to get dirt more conventionally.

Both Robert Bork and Clarence Thomas had their video store rental records passed to groups opposed to their nominations. Just for one example.

We lived until the 1970s with a POTUS being able to have virtually anyone wiretapped by the FBI on nothing more than a claim that national security was threatened. Did we live in a police state then?

I'm more worried about the possibilities for government invasion of people's privacy through Obamacare than about this.

Patrick R. Sullivan said...

Here's another example of a far worse opportunity for surveillance of citizens:

https://www.e-zpassny.com/en/about/nycarea.shtml

It provided the crucial information for Lennie Briscoe to crack numerous cases over the years on Law and Order.