Enhancing Privacy: Blind Signaling and Response

A user-transparent privacy enhancement may allow online service providers like Google to provably shield personal data from prying eyes—even from themselves. Personal user data like search, email, doc and photo content, navigation and clicks will continue to support clearly defined purposes (advertising that users understand and agreed to), but the data will be unintelligible if inspected for any other purpose.
In effect, the purpose and processes of data access and manipulation determine whether data can be interpreted or even associated with individual users. If data is inspected for any purpose apart from the original scope, it is unintelligible, anonymous and self-expiring. It is useless for any person or process beyond that which was disclosed to users at the time of collection. It cannot even be correlated to individual users who generate the data.

Blind Signaling and Response is not yet built into internet services. But as it crosses development and test milestones, it will attract attention and community scrutiny. A presentation at a University of Montreal Privacy Workshop [video] gives insight into the process. The presenter can be contacted via the contact link at the top of this Blog page.

Can Internet services like Google protect user data from all threats—even from their own staff and processes—while still supporting their business model? If such commitment to privacy could be demonstrable, it could usher in an era of public trust. I believe that a modification to the way data is collected, stored and processed may prevent a breach or any disclosure of personal user information, even if compelled by a court order.

The goal of Blind Signaling and Response is to define a method of collecting and storing data that prevents anyone but the intended process from making sense of it. But this pet theory has quite a road ahead…

Before we can understand Blind Signaling and Response, it helps to understand classic signaling. When someone has a need, he can search for a solution.

When an individual is aware of their needs and problems, that’s typically the first step in marrying a problem to a solution. But in a marketing model, a solution (sometimes, one that a user might not even realize he would desire) reaches out to individuals.

Of course, the problem with unsolicited marketing is that the solution being hawked may be directed at recipients who have no matching needs. Good marketing is a result of careful targeting. The message is sent or advertised only to a perfect audience, filled with individuals who are glad that the marketer found them. Poor marketing blasts messages at inappropriate lists or posts advertisements in the wrong venue. For the marketer (or Spam email sender), it is a waste of resources and sometimes a crime. For the recipient of untargeted ads and emails, it is a source of irritation and an involuntary waste of resources, especially of the recipient’s attention.

Consider a hypothetical example of a signal and its response:

Pixar animators consume enormous computing resources creating each minute of animation. Pixar founder, John Lasseter, has many CGI tools at his disposal, most of them designed at Pixar. As John plans a budget for Pixar’s next big film, suppose that he learns of a radical new animation theory called Liquid Flow-Motion. It streamlines the most complex and costly processes. His team has yet to build or find a practical application that benefits animators, but John is determined to search everywhere.

Method #1: A consumer in need searches & signals

Despite a lack of public news on the nascent technique, John is convinced that there must be some workable code in a private lab, a university, or even at a competitor. And so, he creates a web page and uses SEO techniques to attract attention.

The web page is a signal. It broadcasts to the world (and hopefully to relevant parties) that Pixar is receptive to contact from anyone engaged in Liquid Flow-Motion research. With Google’s phenomenal search engine and the internet’s reach, this method of signaling may work, but a successful match involves a bit of luck. Individuals engaged in the new art may not be searching for outsiders. In fact, they may not be aware that their early stage of development would be useful to anyone.

Method #2: Google helps marketers target relevant consumers

Let’s discuss how Google facilitates market-driven signaling and a relevant marketing response today and let us also determine the best avenue for improvement…

At various times in the past few weeks, John had Googled the phrase “Liquid Flow-Motion” and some of the antecedents that the technology builds upon. John also signed up for a conference in which there was a lecture unit on the topic (the lecture was not too useful. It was given by his own employee and covered familiar ground). He also mentioned the technology in a few emails.

Google’s profile for John made connections between his browser, his email and his searches. It may even have factored in location data from John’s Android phone. In Czechoslovakia, a grad student studying Flow-Motion has created the first useful tool. Although he doesn’t know anything about Google AdWords, the university owns 75% of the rights to his research. They incorporate key words from research projects and buy the Google AdWords keyword “Liquid Flow-Motion”.

Almost immediately, John Lasseter notices very relevant advertising on the web pages that he visits. During his next visit to eBay, he notices a home page photo of a product that embodies the technique. The product was created in Israel for a very different application. Yet it is very relevant to Pixar’s next film. John reaches out to both companies–or more precisely, they reach out in response to his signal, without even knowing to whom they are replying.

Neat, eh? What is wrong with this model?

For many users, the gradual revelation that an abundance of very personal or sensitive data is being amassed by Google and the fact that it is being marketed to unknown parties is troubling. Part of the problem is perception. In the case described above and most other cases in which Google is arbiter, the result is almost always to the user’s advantage. But this fact, alone, doesn’t change the perception.

But consider Google’s process from input to output: the collection of user data from a vast array of free user services and the resulting routing of ads from marketing partners. What if data collection, storage and manipulation could be tweaked so that all personal data–including the participation of any user–were completely anonymized? Sounds crazy, right? If the data is anonymized, it’s not useful.

Wrong.

Method #3: Incorporate Blind Signaling & Response into AdWords
— and across the board

A signaling and response system can be constructed on blind credentials. The science is an offshoot of public key cryptography and is the basis of digital cash (at least, the anonymous form). It enables a buyer to satisfy a standard of evidence (the value of their digital cash) and also demonstrate that a fee has been paid, all without identifying the buyer or even the bank that guarantees cash value. The science of blind credentials is the brainchild of David Chaum, cryptographer and founder of DigiCash, a Dutch venture that made it possible to guarantee financial transactions without any party (including the bank) knowing any of the other parties.
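For readers who want to see the core trick, here is a toy sketch of a Chaum-style RSA blind signature in Python. It is only an illustration under loud assumptions: the key uses two small, well-known Mersenne primes, there is no padding, and the function names (blind, sign_blinded, unblind, verify) are my own labels for this example. It is not DigiCash code and not the Blind Signaling and Response specification.

```python
import hashlib
import secrets
from math import gcd

# Toy signer key (the "bank"). ASSUMPTION: tiny textbook RSA built from two
# Mersenne primes; far too small and unpadded for any real use.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (Python 3.8+)


def digest(message: bytes) -> int:
    """Map a message to an integer below N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def blind(message: bytes):
    """User side: hide the message digest behind a random factor r."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if gcd(r, N) == 1:
            break
    return (digest(message) * pow(r, E, N)) % N, r


def sign_blinded(blinded: int) -> int:
    """Signer side: signs without ever seeing the message or its digest."""
    return pow(blinded, D, N)


def unblind(blind_sig: int, r: int) -> int:
    """User side: strip r, leaving an ordinary signature on the message."""
    return (blind_sig * pow(r, -1, N)) % N


def verify(message: bytes, sig: int) -> bool:
    """Anyone can check the signature with the public key (N, E)."""
    return pow(sig, E, N) == digest(message)
```

In use, the user calls blind() on a message, hands only the blinded number to the signer, then calls unblind() on what comes back. The result passes verify(), yet the signer cannot link it to the blinded value that it signed.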

The takeaway from DigiCash and the pioneering work of David Chaum is that information can be precisely targeted–even with a back channel–without storing or transmitting data that aids in identifying a source or target. (Disclosure: I am developing a specification for the back channel mechanism. This critical component is not in the DigiCash implementation). Even more interesting is that the information that facilitates replying to a signal can be structured in a way that is useless to both outsiders and even to the database owner (in this case, Google).

The benefits aren’t restricted to Internet search providers. Choose the boogeyman: The government, your employer, someone taking a survey, your grandmother. In each case, the interloper can (if they wish) provably demonstrate that meaningful use of individually identifiable data is, by design, restricted to a stated purpose or algorithm. No other person or process can find meaning in the data—not even to whom it belongs.

The magic draws upon and forms an offshoot of Trusted Execution Technology, a means of attestation and authentication. In this case, it is the purpose of execution that must be authenticated before data can be interpreted, correlated with users or manipulated. This presentation at a University of Montreal privacy workshop pulls back the covers by describing a combination of TXT with a voting trust (the presenter rushes through key slides at the end of the video).
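The idea that the purpose of execution unlocks the data can be sketched in a few lines of Python. This is a loose illustration, not the mechanism from the presentation: the "measurement" is just a hash of the code that asks for access, the master secret is assumed to live somewhere that ordinary staff and processes cannot read, and the cipher is a toy keystream built from HMAC. All names here (measure, purpose_key, keystream_xor) are hypothetical.

```python
import hashlib
import hmac


def measure(process_code: bytes) -> bytes:
    """Stand-in for an attested measurement: a hash of the code that will run."""
    return hashlib.sha256(process_code).digest()


def purpose_key(master_secret: bytes, measurement: bytes) -> bytes:
    """Derive a data key that depends on WHICH process is asking."""
    return hmac.new(master_secret, measurement, hashlib.sha256).digest()


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (HMAC in counter mode). Illustration only: it has
    no integrity protection and is not a vetted construction."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    # XOR is its own inverse, so the same call encrypts and decrypts.
    return bytes(a ^ b for a, b in zip(data, stream))
```

A record sealed under the key for the disclosed ad-matching code decrypts only when that same code is measured again. Any other program, even one run by the database owner, derives a different key and sees noise.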

It’s reasonable to assume that privacy doesn’t exist in the Internet age. After all, unlike a meeting at your dining table, the path from whisper to ear passes through a public network. Although encryption and IP re-routing ensure privacy for P2P conversations, it seems implausible to maintain privacy in everyday searches, navigation, and online email services, especially when services are provided at no cost to the user. Individuals voluntarily disgorge personal information in exchange for services, especially if the goal is to keep the service provider incented to offer the service. For this reason, winning converts to Blind Signaling and Response requires a thoughtful presentation.

Suppose that you travel to another country and walk into a bar. You are not a criminal, nor a particularly famous or newsworthy person. You ask another patron if he knows where to find a good Cuban cigar. When you return to your country, your interest in cigars will probably remain private and so will the fact that you met with this particular individual or even walked into that bar.

Gradually, the internet is facilitating at a distance the privileges and empowerment that we take for granted in a personal meeting. With end-to-end encryption, it has already become possible to conduct a private conversation at a distance. With a TOR proxy and swarm routing, it is also possible to keep the identities of the parties private. But today, Google holds an incredible corpus of data that reveals much of what you buy, think, and fantasize about. To many, it seems that this is part of the Faustian bargain:

  • If you want the benefits of Google services, you must surrender personal data
  • Even if you don’t want to be the target of marketing,* it’s the price that you pay for using the Google service (Search, Gmail, Drive, Navigate, Translate, Picasa, etc).

Of course, Google stores and acts on the data that it gathers from your web habits. But both statements above are false!

a)  When Google incorporates Blind Signaling into its services, you will get all the benefits of Google services without anyone ever discovering personal information. Yet, Google will still benefit from your use of their services and have even more incentive to continue offering you valuable, personalized services, just as they do now.

b)  Surrendering personal data in a way that does not anonymize particulars is not “the price that you pay for Google services”. Google is paid by marketers and not end users. More importantly, marketers can still get relevant, targeted messages to the pages you visit, while Google protects privacy in toto! Google can make your personal data useless to any other party and for any other purpose. Google and their marketing partners will continue to benefit exactly as they do now.

Article in process…

* This is also a matter of perception. You really do want targeted messaging, even if you hate spam and, like me, prefer to search for a solution instead of having marketers push a solution to you. In a future article, I will demonstrate that every individual is pleased by relevant messaging, even if it is unsolicited, commercial or sent in bulk.

Will Google “Do No Evil”?

Google captures and keeps a vast amount of personal information about its users. What do they do with all that data? Despite some very persistent misconceptions, the answer is “Nothing bad”. But they could do a much better job ensuring that no one can ever do anything bad with that data—ever. Here is a rather simple but accurate description of what they do with what is gleaned from searches, email, browsing, documents, travel, photos, and more than 3 dozen other ways that they learn about you:

  • Increase the personal relevance of advertising as you surf the web
  • Earn advertising dollars–not because they sell information about you–but
    because they use that data to match and direct relevant traffic toward you

These aren’t bad things, even to a privacy zealot. With or without Google, we all see advertising wherever we surf. Google is the reason that so many of the ads appeal to our individual interests.

But what about all that personal data? Is it safe on Google’s servers? Can they be trusted? More importantly, can it someday be misused in ways that even Google had not intended?

I value privacy above everything else. And I have always detested marketing, especially the unsolicited variety. I don’t need unsolicited ‘solutions’ knocking on my door or popping up in web surfing. When I have needs, I will research my own solutions—thank you very much.

It took me years to come to terms with this apparent oxymoron, but the personalization brought about by information exchange is actually a very good bargain for all parties concerned, and if handled properly, it needn’t risk privacy at all! In fact, the things that Google does with our personal history and predilections really benefit us, but…

This is a pro-Google posting. Well, it’s ‘pro-Google’ if they “do no evil” (Yes—it’s the Google mantra!). First the good news: Google can thwart evil by adding a fortress of privacy around the vast corpus of personal data that they collect and process without weakening user services or the value exchange with their marketing partners. The not-so-good news is that I have urged Google to do this for over two years and so far, they have failed to act. What they need is a little urging from users and marketing partners. Doing no evil benefits everyone and sets an industry precedent that will permeate online businesses everywhere.

The CBS prime time television series, Person of Interest, pairs a freelance ‘James Bond’ with a computer geek. The geek, Mr. Finch, is the ultimate privacy hack. He correlates all manner of disparate data in seconds, including parking lot cameras, government records, high school yearbook photos and even the Facebook pages of third parties.

Mr. Finch & Eric Schmidt: Separated at birth?

It’s an eerie coincidence that Google Chairman, Eric Schmidt, looks like Mr. Finch. After all, they both have the same job! They find a gold mine of actionable data in the personal dealings of everyone.

Viewers accept the TV character. After all, Finch is fictional, he is one of the good guys, and his snooping ability (especially the piecing together of far-flung data) is probably an exaggeration of reality. Right?!

Of course, Eric Schmidt & Google CEO Larry Page are not fictional. They run the largest data gathering engine on earth. I may be in the minority. I believe that Google is “one of the good guys”. But let’s first explore the last assumption about Mr. Finch: Can any organization correlate and “mine” meaningful data from a wholesale sweep of a massive eavesdropping machine and somehow piece together a reasonable profile of your interests, behavior, purchasing history and proclivities? Not only are there organizations that do this today, but many of them act with our explicit consent and with a disclosed value exchange for all that personal data.

Data gathering organizations fall into three categories, which I classify based on the exchange of value with web surfers and, more importantly, whether the user is even aware of their role in collecting data. In this classification, Google has moved from the 2nd category to the first, and this is a good thing:

  1. Organizations that you are aware of–at least peripherally–and for which there is a value exchange (preferably, one that is disclosed). Google comes to mind, of course. Another organization with informed access to your online behavior is your internet service provider. If they wanted to compile a dossier of your interests, market your web surfing history to others, or comply with 3rd party demands to review your activities, it would be trivial to do so.
  2. Organizations that have massive access to personal and individualized data, but manage to “fly beneath the radar”. Example: Akamai Technologies operates a global network of servers that accelerate the web by caching pages close to users and optimizing the route of page requests. They are contracted by almost any company with a significant online presence. It’s safe to say that their servers and routers are inserted into almost every click of your mouse and are massively distributed throughout the world. Although Akamai’s customer relationship is not with end users, they provide an indirect service by speeding up the web experience. But because Internet users are not actively engaged with them (and are typically unaware of their role in caching data across the Internet), there are few checks on what they do with the click history of users, with whom they share data, and if–or how–individualized data is retained, anonymized or marketed.
  3. National governments. There is almost never disclosure or a personal value exchange. Most often, the activity involves compulsory assistance from organizations that are forbidden from disclosing the privacy breach or their own role in acts of domestic spying.

The US is preparing to spy on everyone, everywhere, at all times. The massive & intrusive project stuns scientists involved.

I have written about domestic spying before. In the US, it has become alarmingly broad, arbitrary and covert. The über secretive NSA is now building the world’s biggest data gathering site. It will gulp down everything about everyone. The misguided justification of their minions is alternatively “anti-terrorism” or an even more evasive “9/11”.

Regarding category #2, I have never had reason to suspect Akamai or Verizon of unfair or unscrupulous data mining. (As with Google, these companies could gain a serious ethical and market advantage by taking heed of today’s column.) But today, we focus on data gathering organizations in category #1—the ones with which we have a relationship and with whom we voluntarily share personal data.

Google is at the heart of most internet searches and they are partnered with practically every major organization on earth. Forty-eight free services contain code that many malware labs consider to be a stealth payload. These doohickeys give Google access to a mountain of data regarding clicks, searches, visitors, purchases, and just about anything else that makes a user tick.

It’s not just searching the web that phones home. Think of Google’s 48 services as a marketer’s bonanza. Browser plug-ins phone home with every click and build a profile of user behavior, location and idiosyncrasies. Google Analytics, a web traffic reporting tool used by a great many web sites, reveals a mountain of data about both the web site and every single visitor. (Analytics is market-speak for assigning identity or demographics to web visits). Don’t forget Gmail, Navigate, Picasa, Drive, Google Docs, Google+, Translate, and 3 dozen other projects that collect, compare and analyze user data. And what about Google’s project to scan everything that has ever been written? Do you suppose that Google knows who views these documents, and can correlate it with an astounding number of additional facts? You can bet Grandma Estelle’s cherry pie that they do!

How many of us ever wonder why all of these services are free to internet users everywhere? That’s an awful lot of free service! One might think that the company is very generous, very foolish, or very unprofitable. One would be wrong on all counts!

Google has mastered the art of marketing your interests, income stats, lifestyle, habits, and even your idiosyncrasies. Hell, they wrote the book on it!

But with great access to personal intelligence comes great responsibility. Does Google go the extra mile to protect user data from off-label use? Do they really care? Is it even reasonable to expect privacy when the bargain calls for data sharing with market interests?

At the end of 2009, Google Chairman, Eric Schmidt, made a major gaffe in a televised interview on CNBC. In fact, I was so convinced that his statement was toxic that I predicted a grave and swift consumer backlash. Referring to the billions of individuals using Google’s search engine, investigative anchor, Maria Bartiromo, asked Schmidt why it is that users enter their most private thoughts and fantasies. She wondered if they are aware of Google’s role in correlating, storing & sharing data—and in the implicit role of identifying users and correlating their identities with their interests.

Schmidt seemed to share Bartiromo’s surprise. He suggested that internet users were naive to trust Google, because their business model is not driven by privacy and because they are subject to oversight by the Patriot Act. He said:

If you have something that you don’t want anyone to know, maybe you shouldn’t be doing it in the first place. If you really need that kind of privacy, the reality is that search engines — including Google — do retain this information for some time and it’s important, for example, that we are all subject in the United States to the Patriot Act and it is possible that all that information could be made available to the authorities.

At the time, I criticized the statements as naive, but I have since become more sanguine. Mr. Schmidt is smarter than me. I recognize that he was caught off guard. But clearly, his response had the potential to damage Google’s reputation. Several Google partners jumped ship and realigned with Bing, Microsoft’s newer search engine. Schmidt’s response became a lightning rod–albeit brief–for both the EFF (Electronic Frontier Foundation) and the CDT (Center for Democracy & Technology). The CDT announced a front-page campaign, Take Back Your Privacy.

But wait…It needn’t be a train wreck! Properly designed, Google can ensure individual privacy, while still meeting the needs of their marketing partners – and having nothing of interest for government snoops, even with a proper subpoena.

I agree with the EFF that Schmidt’s statements undermine Google’s mission. Despite his high position, Schmidt may not fully recognize that Google’s marketing objectives can coexist with an ironclad guarantee of personal privacy – even in the face of the Patriot Act.

Schmidt could have salvaged the gaffe quickly. I urged him to demonstrate promptly that he understands and defends user privacy. But I overestimated consumer awareness and expectations for reasonable privacy. Moreover, consumers may feel that the benefits of Google’s various services inherently trade privacy for productivity (email, taste in restaurants, individualized marketing, etc).

As for my prediction of a damning consumer backlash against Google for whitewashing personal privacy, I was off by a few years, but in the end, my warnings will be vindicated. Public awareness of privacy and especially of internet data sharing and data mining has increased. Some are wondering if the bargain is worthwhile, while others are learning that data can be anonymized and used in ways that still facilitate user benefits and even the vendor’s marketing needs.

With massive access to public data and the mechanisms to gather it (often without the knowledge and consent of users), comes massive responsibility. (Schmidt’s interview contradicts that message.) Google must rapidly demonstrate a policy of “default protection” and a very high bar for sharing data. In fact, Google can achieve all its goals while fully protecting individual privacy.

Google’s data gathering and archiving mechanism needs a redesign (it’s not so big a task as it seems): Sharing data and cross-pollination should be virtually impossible – beyond a specified exchange between users and intended marketers. Even this exchange must be internally anonymous, useful only in aggregate, and self-expiring – without recourse for revival. Most importantly, it must be impossible for anyone – even a Google staffer – to make a personal connection between individual identities and search terms, Gmail users, ad clickers, voice searchers or navigating drivers! For a while now, voice search has been thought of as a huge potential advancement in the Google data gathering system, although many would argue that it has not yet had its desired impact.
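To make "internally anonymous, useful only in aggregate, and self-expiring" a little more concrete, here is a minimal Python sketch. It is my own illustration with hypothetical names (ExpiringAggregate, record, totals), not a description of any Google system: it keeps counts per interest, never stores a user identifier, and discards whole time buckets once they age out.

```python
import time
from collections import Counter


class ExpiringAggregate:
    """Counts interest signals per time bucket. It stores no user
    identifiers and drops buckets older than the previous one."""

    def __init__(self, ttl_seconds, now=time.time):
        self.ttl = ttl_seconds
        self.now = now          # injectable clock, handy for testing
        self.buckets = {}       # bucket index -> Counter of interests

    def _bucket(self):
        return int(self.now() // self.ttl)

    def _purge(self):
        current = self._bucket()
        for index in [i for i in self.buckets if i < current - 1]:
            del self.buckets[index]   # expired data is simply gone

    def record(self, interest):
        """Note that *someone* signaled this interest; never who."""
        self._purge()
        self.buckets.setdefault(self._bucket(), Counter())[interest] += 1

    def totals(self):
        """Aggregate counts across the buckets that are still alive."""
        self._purge()
        total = Counter()
        for counts in self.buckets.values():
            total.update(counts)
        return total
```

A marketer querying totals() learns how many signals matched a keyword in the recent window and nothing else. Once the window passes, not even the operator can revive the old counts.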

I modestly suggest that Google create a board position with real authority, and fill it with a visible and high-profile individual. (Disclosure: I have made a “ballsy” bid to fill such a position. There are plenty of higher profile individuals that I could recommend).

Schmidt’s statements have echoed for more than 2 years now. Have they faded at all? If so, it is because Google’s services are certainly useful and because the public has become somewhat inured to the creeping loss of privacy. But wouldn’t it be marvelous if Google seized the moment and reversed that trend? Wouldn’t it be awesome if someone at Google discovered that protecting privacy needn’t cripple the value of information that they gather? Google’s market activity is not at odds with protecting their users’ personal data from abuse. What’s more, the solution does not involve legislation or even public trust. There is a better model!

Schmidt’s statements are difficult to contain or spin. As Asa Dotzler at Firefox wrote in his blog, the Google CEO simply doesn’t understand privacy. Here in the USA, Schmidt’s statements have become a lightning rod for both the EFF and CDT (Center for Democracy & Technology). The CDT has even launched a front page campaign to “Take Back Your Privacy”.

Google’s not the only one situated at a data nexus. Other organizations fly below the radar, either because few understand their tools or because of government involvement. For example, Akamai probably has more access to web traffic data than Google. The US government has even more access because of an intricate web of programs that often force communications companies to plant data sniffing tools at the junction points of massive international data conduits. We’ve discussed this in other articles, and I certainly don’t advocate that Wild Ducks be privacy zealots and conspiracy alarmists. But the truth is, the zealots have a leg to stand on and the alarmists are very sane.