What Does Privacy Mean Now?
Last week, Anthropic refused to let the Department of War use their models to spy on Americans or kill people autonomously. In retaliation, the DoW declared Anthropic a “supply chain risk,” a form of embargo. OpenAI jumped in to take over Anthropic’s contract, pretending that they’d somehow secured promises from the DoW that Anthropic hadn’t.
I’m against killer robots, and I won’t analyze that here. I’m also against domestic spying, and I want to discuss how LLMs change the meaning of “privacy.” Does it matter whether a human or a machine is looking at your private data?
The NSA’s Definition of “Collect” #
In 2013, Obama’s director of national intelligence, James Clapper, was asked in a Senate hearing, “Does the NSA collect any type of data at all on millions or hundreds of millions of Americans?” He said no. Months later, Edward Snowden revealed that the NSA was doing exactly that. Clapper defended himself, citing a 1982 DoD manual that defines data as “collected” only when a human intelligence analyst reads it. In court, the government argued that machine processing of metadata about Americans’ phone calls didn’t require a search warrant. But Bruce Schneier asked, “is it really okay for a computer to monitor you online, and for that data collection and analysis only to count as a potential privacy invasion when a person sees it?”
Some people worried (correctly) that human analysts would illegally look at Americans’ data once it was stored and indexed on the NSA’s hard drives. Others thought that even if no human ever looked at their private information, their privacy was still violated.
Now, LLM analysts will make the distinction murkier still, by creating a middle level between humans and old-fashioned software. Or maybe all the levels are collapsing into one.
Many Kinds of Watchers #
I think privacy has to be measured on at least two axes. On one axis we measure what is revealed: My yearbook photo, or a nude photo? My political party, or whom I voted for? My resume, or my salary? And on the other axis is who or what observes my data.
When people talk about online privacy, or when a browser promises to protect my privacy, I think they’re mostly talking about ad networks—fully automated systems that build a profile of my interests and auction my attention to advertisers. No human at Google or Meta is reviewing my browsing history and thinking about me. It’s just dumb machines. I might still call it a privacy concern, because the information has left my control and is being used to manipulate my behavior. Or I might not care, because no conscious being sees my browsing behavior.
In “Nonstandard Observers and the Nature of Privacy” (2014), Eldon Soifer and David Elliott write:
Suppose you are naked in the shower one morning, when you suddenly notice an observer perched on the ledge outside, looking in through the window. It would probably make a great deal of difference to you whether the observer is your neighbor or your neighbor’s cat.
According to Soifer and Elliott, I have an interest in controlling my data because I care about the judgment of others and want to construct a certain public persona. This isn’t just about pride or avoiding embarrassment: others’ opinions of me determine the course of my career and my life. The cat’s opinion of me, however, has no effect.
Computers are a new kind of observer. How much can a computer violate my privacy? For me, an ad network is like a cat. Sure, it’s a little creepy when I search for pants on one site and see ads for pants on another site, but I know no human is watching me, so I don’t care. I just install AdBlock and hope for the best. For some people, though, it feels like a privacy violation just to have their data leave their control, gathered by distant and amoral corporations. The NSA’s domestic surveillance (which has supposedly, mostly, ended) was much worse than an ad network, because it exposed us to the risk that humans working for a powerful government might look at our private information.
What Kind of Observer is an LLM? #
An LLM is different from an ad network, because it can deduce so much more. Yes, ad networks have a spooky ability to guess my age, gender, income, etc. from my browsing behavior, using mere correlations. But LLMs can do even more than correlate: they can reason. Perhaps a powerful LLM can analyze my deliberate, public disclosures and deduce that I have a secret Care Bears fetish. Suddenly, the intelligence of the observer has changed what is revealed.
I doubt ad networks will use LLMs any time soon—LLMs are far too expensive compared to the fractions of cents involved in ad auctions. But a deep-pocketed and determined observer like the US government could use LLMs to logically deduce much more about a much larger group of people than has ever been possible before. Let’s say the Department of War deploys GPT to surveil Americans. GPT reads my public data, and concludes that I have a secret Care Bears fetish. But it also concludes I’m no threat to MAGA, so it doesn’t alert a human analyst and my secret is safe.
There are two problems I can see with this scenario. First, my private Care Bears fetish is waiting there in a file, until the day when a government analyst decides to read it. There’s far too much risk that such data will eventually be used for political oppression. Second, my privacy might be violated by the LLM’s silent judgment of me. The LLM is human enough that I might feel embarrassed, knowing that it knows my secret. As M. Ryan Calo writes in “People Can Be So Fake: A New Dimension to Privacy and Technology Scholarship” (2010),
We are hardwired to react to anthropomorphic technology as though a person were actually present. This causes changes in our attitude, behavior, even our physiological state. The resulting chill to curiosity and threat to solitude is all the more dangerous in that it cannot be addressed through traditional privacy protections such as encryption or anonymization.
In other words, the feeling of being observed by a conscious-seeming being is harmful, because it forces us to behave in private as if we were always in public.
The Inexorable Panopticon #
This year, the accelerating capabilities and deployment of AI seem inexorable. The technology flows like water around anyone who tries to slow it: when Anthropic tried to plug a hole in the dike, OpenAI poured over the top within hours. The shortage of GPUs will delay mass surveillance only for a short while. I have no faith that the law will protect our private data from the government. Even if it did, AI surveillance of our public data would be enough to deter dissent, because of its power of deduction. I think the only safeguard against government abuse of LLMs is to elect people we trust not to surveil us.
Images: Minya Diéz-Dührkoop