Oceans of Data
Try to visualize all the data about you that is recorded, stored or transmitted each day in one form or another. Consider every possible source, both public and private. What if it could all be put together, correlated with data about every other person on earth and sifted by detectives whose only task is to look for subtle patterns of behavior?
Let’s start with phone calls: In addition to the number dialed, the phone company knows your location and the caller ID of incoming calls, and even has access to the actual conversation. (Believe it or not, your government is listening.) Check the phone bills of both parties and we can figure out how often you call each other. If we then learn everything we can about the people that you talk to, we can probably learn a thing or two about you. And speaking of location, did you know that both iPhones and Android phones log your precise location every few seconds and then transmit your location history to Apple or Google several times each hour? An even more ominous program discovered this week is embedded in Android phones. It sends every keystroke to your carrier even if you opt out.
What about your health records, magazine subscriptions, tax filings, legal disputes, mortgage records and banking transactions, including charge card purchases? Now add your internet use – not just the sites at which you are registered, but every site you have ever visited. Suppose we add videos from convenience stores, traffic enforcement cameras and every ATM that you pass. Don’t forget the snapshot at the toll booth. They have one camera pointed at your face and another at the license plate. Of course, there is also a log entry from the toll payment device on your windshield and the key chain fob that you use when you buy gas.
What about the relationships that are revealed by your old high school yearbook, old newspaper articles or that 4th-grade poetry contest your daughter was in? There was a handout that night and so it counts as information related to you. How about that camera in the elevator at work? Suppose that it could recognize your face immediately and match it up with your fingerprints from your last international flight and your phone calls, web visits, hotel reservations and TV viewing habits.
Whew! That’s a lot of information to recognize or sift through in any meaningful way. But for a moment, ask yourself “What If”… What if all that data from every transaction record, GPS device, tax return and historical log could be accurately attributed, correlated, matched and analyzed? What could be accomplished with all of this? Who wants it and for what purpose? Would their goals align with yours?
Person of Interest
In the CBS television series Person of Interest, a government computer looks for clues to the next terrorist event by monitoring virtually everyone and everything. The project doesn’t require its creators to build a new surveillance network. Massive amounts of data are already floating around us every day.
Of course, the data is fragmented. It was gathered for different reasons – mostly for private commerce (banking, medicine, safety). Few people consider it a threat to privacy or personal freedoms, because we assume that it is too disparate and unwieldy for analysis by any single entity. Yet, in Person of Interest, the computer taps into all of these sources and mines the data for suspicious patterns.
As patterns emerge from all of this data, the computer finds converging threads based on individual behavior. Taken alone, the data points are meaningless — someone in Oregon signs for a package; someone using a different name in Rhode Island makes a plane reservation; someone in Pakistan fitting both descriptions checks into a motel and visits a convicted arms smuggler. The mobile phone carried by the last person accepts a phone call at a number previously used by one of the other individuals. Normally, no one could have ever fit these pieces together.
Eventually, the computer begins to identify suspicious activity. Depending on the programming and based on past findings, it even predicts events. But wait! Many of the patterns it finds are unrelated to terrorism. It finds clues to likely mob hits, crimes of passion, kidnapping, guns at school, and regional crime. The results are irrelevant to the machine’s purpose and, in this fictional drama, the government decides that analyzing them would constitute illegal domestic spying. So they order the programmer to purge “irrelevant data” by adding a software routine to periodically delete extraneous results.
Of course, if the “personal” results were deleted, we wouldn’t have a new and exciting television series (my personal favorite). So, the middle-aged geek who gave life to the analytics recasts himself as a vigilante. He teams up with a former special ops agent (in the mold of Harrison Ford) and together, they follow data-mined leads in hope of saving innocent individuals.
In the US, our government has such a program. In fact, there are many Total Information Awareness projects. Unlike the Hollywood version, there was never any intent to purge personal information. In fact, its collection and analysis is the whole point. Another difference from the television series is that our government is not satisfied to mine public data or even legally obtained data. Instead, the federal government adds new primary data mechanisms every month and builds enormous enterprises to spy on individuals. This generates voluminous information daily, all of it available for future data mining without anyone’s knowledge or consent.
Of course, information and videos of individuals are routinely recorded wherever we go. But typically, we assume that this information is not centrally gathered, compared or analyzed. Most people assume that they are “off the radar” if they are not being actively tracked as part of an investigation. But with data mining techniques, no one is really off the radar. Machines make decisions about patterns that should be flagged and escalated for additional scrutiny.
Mixmaster: An Innocent Tool or Antiforensics?
In the 1990s, despite a background in cryptography and computer science, I wasn’t aware of these programs. In the fields of political science and sociology, I was a ninnyhammer. It is either coincidence or perhaps prescience that I proposed and then participated in a project called a Mixmaster more than a decade ago…
The idea was simple: As you surf the web or send mail, your digital footprints are randomized so that an interloper or investigator cannot piece together the participants in an internet exchange, nor determine the habits of an individual user. Well, they’re not really random, but the IP address reported to the email service or web page you visit is replaced by one associated with another participant in the project. That’s because each packet of data leaving your PC is relayed through internet services associated with the others. We added a few simple facets to further obscure tracks:
- Recognizing that a rogue participant might keep a log of the individuals who hand off data through his own relay (or may be compelled to do so in the future), our code automatically increased the number of ‘hops’ in proportion to the number of available peers. Anonymity was enhanced, because an unfriendly investigator attempting to trace the source of a web visit or email would need cooperation from a larger pool of participants.
- Data between participants was encrypted and randomized in length and even timing, to thwart possible forensic analysis.
- A backward channel was added, but with very tight rules on expiration and purging. This allowed packet acknowledgement, web site navigation, and even two-way dialogue while still preserving anonymity.
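For the curious, here is a rough sketch of the first two facets in Python. It is not the project's actual code. The function names, the logarithmic scaling rule, the 4-byte length header and the default limits are all my own assumptions for illustration, and the encryption step is left out entirely.

```python
import math
import random
import secrets


def hop_count(available_peers: int, min_hops: int = 3, max_hops: int = 10) -> int:
    """Grow the route length as the peer pool grows (log scaling is an assumption)."""
    if available_peers < 1:
        raise ValueError("need at least one peer")
    scaled = min_hops + int(math.log2(available_peers))
    # Never ask for more distinct relays than exist, and cap the latency cost.
    return min(scaled, max_hops, available_peers)


def choose_path(peers: list[str], rng: random.Random) -> list[str]:
    """Pick distinct relays, in random order, for a single message."""
    return rng.sample(peers, hop_count(len(peers)))


def pad(payload: bytes, rng: random.Random, max_extra: int = 256) -> bytes:
    """Hide the true size: 4-byte length header, payload, then random filler."""
    filler = secrets.token_bytes(rng.randrange(max_extra + 1))
    return len(payload).to_bytes(4, "big") + payload + filler


def unpad(padded: bytes) -> bytes:
    """Recover the original payload by reading the length header."""
    size = int.from_bytes(padded[:4], "big")
    return padded[4:4 + size]


def relay_delay(rng: random.Random, max_seconds: float = 5.0) -> float:
    """Random hold time before forwarding, to blur timing correlation."""
    return rng.uniform(0.0, max_seconds)
```

In real use, the random source would be something like `random.SystemRandom()` rather than a seeded generator, and padding would be applied before encryption so that the length header is not visible to anyone watching the wire.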
Privacy & Politics
Most of us involved in the project had no endgame or political agenda. We simply recognized that it is occasionally comforting to send email, browse the web or post to a public forum without leaving a traceable return address. To those who claimed that our work might aid money launderers, terrorists or child molesters, we explained that identification and authentication should be under the control of the parties involved in a conversation. The internet is a new communications medium. But it was not designed to undermine the privacy of every conversation for the purpose of facilitating future forensic investigation. Investigators – if their purpose is supported by judicial oversight – have many old school methods and tools to aid their detective work. The growth of a new communication medium must not become a key to suppression or compromised privacy.
Anonymous, but authenticated
There is a big difference between identification and authentication. In a democracy, citizens are authenticated at the polls. But they enter a private booth to cast their vote and they turn in a ballot without a signature. They are identified (or even better, authenticated without identification) for the purpose of verifying eligibility. But their identity is not carried over to their voting decision. The real business is effectively anonymous.
This isn’t to say that all authorized entry systems should allow anonymous access. Of course not! Entry systems typically ask “Who are you?” (your User ID) and then ask for proof (typically a password). Your identity is not always required, but proof of authorized access can come in 3 forms. Very secure systems (such as banks) require at least 2 of these before allowing access:
- something you know: A password or challenge
- something you have: Evidence that you have a token or card
- something you are: A fingerprint, recognizable face, or voice match
In each case, it is the person behind the door that needs your identity or authorization and not your government.
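The "at least 2 of 3" rule is easy to sketch in code. What follows is only an illustration, not any bank's actual logic. The `Evidence` class and `grant_access` function are hypothetical names, and each flag stands in for a check (password, token, fingerprint) that a real system would perform properly. Notice that nothing in it records who you are, only what you could prove.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """Which kinds of proof a visitor presented successfully (hypothetical model)."""
    knows: bool = False      # something you know: a password or challenge
    has: bool = False        # something you have: a token or card
    inherent: bool = False   # something you are: fingerprint, face, or voice


def factors_presented(evidence: Evidence) -> int:
    """Count the distinct factor types that were verified."""
    return sum((evidence.knows, evidence.has, evidence.inherent))


def grant_access(evidence: Evidence, required: int = 2) -> bool:
    """Allow entry only when enough independent factors were verified."""
    return factors_presented(evidence) >= required
```

A password alone, `Evidence(knows=True)`, is refused under the default rule. A password plus a card, `Evidence(knows=True, has=True)`, is accepted.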
Anonymity and encryption go hand in hand. Both technologies are used to ensure that internet communication is private and does not become the affair of your friends, employer, former spouse, or government overseers. So where, exactly, does your government stand on the use of internet encryption or anonymity? In most of the world, the answer is clear. Governments stand for propaganda and crowd control. They are against any technology that enhances privacy. But this is not a universal axiom: In Germany, the government stands on the side of citizens. Your data and your identity belong to you. Very few of your affairs are open to the government. But in the United States, the answer is very murky…
The NSA conducts vacuum-cleaner surveillance of all data crossing the Internet–email, web surfing… everything! –Mark Klein
Under George W. Bush, every bit of information was Uncle Sam’s business. With oversight by Dick Cheney (and hidden from legislative or judicial oversight), the executive branch concocted mechanisms of blatant domestic spying. Of course, the ringleaders realized that each mechanism violated the US Constitution’s protection from unreasonable search, and so each was ordered and implemented covertly until a technician working for AT&T blew the whistle. Suddenly, stories were surfacing that Uncle Sam was implementing a Reagan-era project that had been shelved during the Clinton era. This launched a scramble to win public support for the Patriot Act, an absurd euphemism that attempts to whitewash illegal snooping as the patriotic duty of each citizen. (Talk about ‘deceptive’! Our leaders must think that we are sheep. Not just your garden-variety grass-eating sheep, but really, really dumb sheep that feed on bull chips!)
-=-=-=-=-=-=-=- (writing in progress)
Before the Obama administration, the answer was clear. These technologies were barely tolerated for banking, medicine and commerce. But they were to be weakened, subverted or thwarted when used by private citizens. In each case, the government sought to block the technology or insert a back door into the programming code (and into actual data centers) for use during any future investigation. Of course, in a bold era of predictive behavior modeling, authorized investigations often give way to fishing expeditions for the sole purpose of information gathering.
But something has changed in the past 2 years. As news spread about Internet censorship in China, the Arab Spring, and covert schools for girls in Taliban-controlled regions of Afghanistan, the US government began to recognize that uncensored and even untraceable Internet use sometimes coincided with foreign policy objectives. Imagine the conundrum this revelation must have generated within the State Department! On the one hand, the Patriot Act sanctions blatant acts of domestic spying (including preemptive data mining with programs like Dick Cheney’s “total information awareness”), back doors built into encryption chips, “deep packet” data sniffing installed at major switching centers, satellite interception of phone calls, and national security letters (a euphemism for warrantless snooping). On the other hand, the government also supports freedom of speech and privacy for anything that furthers US policy amongst our friends.
Today, the anonymous relay model has been widely adopted and greatly enhanced by an open source project called Tor. In this blog, I won’t try to justify the need for robust anonymous relays. Better writers and social philosophers than me have explained why free and anonymous communications channels are central to a free and democratic society. Better writers than me have chronicled the abuse of the Patriot Act, Echelon, TIA and numerous other forms of government overreach. Better writers than me have explained how open and free communication leads to increased safety even if it sometimes facilitates communications among terrorists, digital pirates or pornographers.
Turn of Events: Government as Advocate
- Obama lends support to Tor
- Tor to users: Use Amazon Cloud as bridge to anonymity (this section under development)