How Differential Privacy Works

Apple has long made a commitment to privacy that many of its competitors do not share. While Google and Microsoft are happy to collect personalized data that hackers and governments could exploit, Apple has refused to do so. As an example, Apple announced at its Worldwide Developers Conference that all iOS apps must encrypt web communications by the end of the year.

But Apple needs data in order to personalize its services and learn what adjustments its customers want. So on Tuesday, Craig Federighi, Apple’s senior vice president of software engineering, discussed a concept called differential privacy, which will be built into iOS 10.

According to Apple, differential privacy will “help discover the usage patterns of a large number of users without compromising individual privacy.” The idea is that while Apple can see user data in the aggregate to improve its services, it will be impossible for anyone to find data about any one user. That includes Apple itself, as well as hackers and governments.

The problems with privacy

How is it possible to get data in the aggregate but not at the individual level? To understand that, we need to start with the challenges of protecting user privacy.

Most companies do make some effort to protect your privacy, and they will often anonymize your data and refuse to publish personal information. But people can still use the data that is revealed to work out who you are.

It is comparable to finding out an Internet forum user’s real-life identity. You won’t have their real name or phone number, but you can note that the forum user lives in New York and went on a date at a particular restaurant. By combining facts like these, you can narrow down the possibilities until you discover their true identity. And as Wired pointed out, researchers were able to do something like this in 2007 with a data set of “anonymous” customers that Netflix had published.

This shows that even if a company tries to hide personal information, hackers can use the information that is available to glean personal data. And if the company hides all the information it has, then it cannot use that information itself.

But what if all the information is obscured?

The idea behind differential privacy

That is what differential privacy sets out to do. It works by algorithmically obscuring the data with noise so that hackers can never truly figure out what any one person said.

Many of the ideas behind differential privacy are theoretical, worked out by computer scientists and cryptographers. But Cynthia Dwork, the co-inventor of differential privacy according to Engadget, gives an example of how it could work, using a surveyor who asks someone whether they have cheated on an exam:

Before responding, the person is asked to flip a coin. If it’s heads, the response should be honest but the outcome of the coin shouldn’t be shared. If the coin comes up tails, the person needs to flip a second coin; if that one is heads, the response should be “yes.” If the second is tails, it’s “no.”

Since a coin should come up heads or tails about fifty percent of the time over the long run, the surveyor knows roughly how many “yes” answers the coins alone will produce, and can therefore estimate how many people in the aggregate actually cheated on their exam. But if a malicious agency finds out that one particular individual answered “yes,” it has no way of knowing whether that is because the individual cheated on the test or because the coin flips came up tails and then heads.
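The coin-flip survey can be sketched in a few lines of Python. This is only an illustration of the example above, not Apple’s implementation: the function names, the 30 percent cheating rate and the population size are all assumptions made up for the demonstration. Because half the respondents answer honestly and a quarter are forced to say “yes,” the share of “yes” answers should be about 0.25 + 0.5 × (true rate), and the surveyor can invert that to estimate the true rate.

```python
import random
from typing import List


def randomized_response(truth: bool, rng: random.Random) -> bool:
    """Answer given by one respondent under the two-coin protocol."""
    if rng.random() < 0.5:
        # First coin is heads: answer honestly.
        return truth
    # First coin is tails: a second coin decides (heads = "yes", tails = "no").
    return rng.random() < 0.5


def estimate_true_rate(answers: List[bool]) -> float:
    """Invert P(yes) = 0.25 + 0.5 * p to recover the true rate p."""
    observed_yes = sum(answers) / len(answers)
    return 2.0 * (observed_yes - 0.25)


rng = random.Random(42)   # fixed seed so the run is repeatable
true_rate = 0.30          # hypothetical share of people who really cheated
population = [rng.random() < true_rate for _ in range(100_000)]
answers = [randomized_response(t, rng) for t in population]
estimate = estimate_true_rate(answers)

print(f"true rate: {true_rate:.3f}, estimated rate: {estimate:.3f}")
```

The surveyor only ever sees the noisy answers, yet with a large enough group the estimate should land close to the true rate.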

Actual differential privacy algorithms are much more complicated, but they work along the same lines as the coin flip example. Because mathematical “noise” obscures each individual’s data, no one can be sure of any single data point, even someone who knows the algorithm.
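To see why knowing the algorithm does not reveal an individual answer, it helps to work out what an observer can infer from a single “yes” in the coin-flip survey. The sketch below applies Bayes’ rule; the function name and the prior values are assumptions chosen for illustration. Under the protocol, a person who cheated says “yes” with probability 0.75 and a person who did not says “yes” with probability 0.25.

```python
def posterior_given_yes(prior: float) -> float:
    """Probability that someone cheated, given that they answered "yes".

    `prior` is the observer's belief that the person cheated before
    seeing the answer.
    """
    p_yes_if_cheated = 0.75   # 1/2 honest "yes" + 1/2 * 1/2 forced "yes"
    p_yes_if_innocent = 0.25  # only the forced "yes" from the second coin
    evidence = p_yes_if_cheated * prior + p_yes_if_innocent * (1.0 - prior)
    return p_yes_if_cheated * prior / evidence


for prior in (0.1, 0.3, 0.5):
    print(f"prior {prior:.2f} -> belief after a 'yes': {posterior_given_yes(prior):.4f}")
```

A “yes” makes cheating at most three times as likely as it was before, so the observer’s suspicion rises but never reaches certainty. Formal definitions of differential privacy put a bound on exactly this kind of ratio.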

Potential concerns

Differential privacy could let Apple and other companies collect data that helps them while still protecting their customers’ privacy. But the fact is that most of the work surrounding differential privacy has been theoretical, and there have been no small-scale tests of how it might work.

Implementing it on a large scale, as Apple plans to do with iOS, without small-scale trials is risky.

However, differential privacy is not nearly as useful on a small scale. The mathematical noise obscures the data more heavily in a small sample, increasing the chances of entirely inaccurate results. Think about the coin example above. If the surveyor only surveyed 10 people, it is quite possible that eight of them flipped “tails” on the first coin and gave random answers, and the survey would be worthless. But if he surveyed 10,000, it is far less likely that 8,000 people flipped “tails,” and thus he can better trust his data.
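The effect of sample size can be put into rough numbers. The sketch below is an illustration only: the function names and the 30 percent true rate are assumptions. It computes the exact chance that at least eight of ten first coins come up tails, and the standard error of the surveyor’s estimate (twice the standard error of the observed “yes” share) for a small and a large survey.

```python
import math


def prob_at_least_tails(n: int, k: int) -> float:
    """Exact probability that at least k of n fair coin flips are tails."""
    return sum(math.comb(n, i) for i in range(k, n + 1)) / 2 ** n


def estimate_std_error(n: int, true_rate: float) -> float:
    """Standard error of the estimate 2 * (observed_yes - 0.25)."""
    q = 0.25 + 0.5 * true_rate   # expected share of "yes" answers
    return 2.0 * math.sqrt(q * (1.0 - q) / n)


true_rate = 0.30   # hypothetical share of people who really cheated

print(f"chance that 8+ of 10 flip tails: {prob_at_least_tails(10, 8):.4f}")
for n in (10, 10_000):
    print(f"n = {n:>6}: standard error of estimate = {estimate_std_error(n, true_rate):.4f}")
```

With ten respondents the error is about as large as the quantity being measured. With 10,000 it shrinks by a factor of roughly 32, the square root of 1,000.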

Differential privacy is a hard concept to understand. But if Apple is successful, it could seriously change how companies acquire data. While there will still be companies happy to take user data, the existence of a way to collect data without compromising individual privacy could have a huge effect on the relationship between companies and their customers.


Nathan Chandler
