It's more that we have a situation in which the simple solutions don't work, and people are up in arms that reality is being inconvenient to them.

Let's take consent, to start somewhere. Consent is just one tool in the privacy toolbox, and it's not a particularly good one either. In the overwhelming majority of everyday privacy contexts, we don't use consent because that would be absurd: is it okay that I listen while you're talking to me? Is it okay that I see you when you enter the room? Is it okay that, as your doctor, I analyse the symptoms you just described to me? The answer to problems caused by consent isn't more consent, automated consent, or consent that covers more processing. It's to get rid of consent in all cases in which it doesn't improve people's autonomy.

Consent is used as much as it is for three reasons: it seems simple (why think when you can just ask people!); there is a small but vocal group of privacy absolutists who think that there should be practically no data in the digital sphere and therefore everything should require consent; and there are some people in the data industry who know that this is the only way they're going to get any data (they usually call it "choice"). This makes consent look like a more useful solution than it is.

All consent does is offload privacy labour onto the user; only under very specific conditions does it increase the user's autonomy. This isn't new information. A 2015 review article in Science brings close to a hundred references to bear on the fact that digital self-determination in the face of privacy issues is highly manipulable[1], and that's even without dark patterns. There are entire books about the failure of notice regimes alone[2]. The pathologies of digital consent are known and well covered[3]. Lindsey Barrett summarised the situation well when she described notice and choice as "a method of privacy regulation which promises transparency and agency but delivers neither."[4]

Or we could take the perennial question of pseudonymous data. To people who have no experience in data protection, it sounds pretty reasonable: I mean, it's just a number, what do you really learn about someone? But there's an entire industry offering deanonymisation services[5]. Those companies can get quite sophisticated, but the basics of breaking identifiers are straightforward. Reporters taught themselves how to do it in a short time without significant prior knowledge of analytics[6]. Then a different set of writers did it again[7]. In fact, Arvind Narayanan recently said that it's so easy to do, you can't even get a research paper published about the practice[8]. Even with rotating keys, the protection only holds if you can defend against timing attacks and maintain sufficient k-anonymity. You'd think we might have learned from Netflix's very public failure at this over ten years ago[9], or AOL's in 2006[10].
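To make the k-anonymity point concrete, here's a minimal sketch (in Python, with entirely made-up records and field names) of the measure in question: the size of the smallest group of records that share the same quasi-identifiers. If that size is 1, the pseudonym is doing no work at all, because the surrounding attributes already single someone out.

```python
from collections import Counter

# Hypothetical records: (pseudonymous id, ZIP prefix, birth year, sex).
records = [
    ("a91f", "100", 1987, "F"),
    ("7c2e", "100", 1987, "F"),
    ("03bd", "112", 1990, "M"),
    ("e55a", "104", 1975, "F"),
]

def k_anonymity(rows):
    """Size of the smallest group of rows sharing the same
    quasi-identifiers (everything except the pseudonym).
    k == 1 means at least one record is unique: linking it to any
    outside dataset with the same attributes re-identifies that person."""
    groups = Counter(row[1:] for row in rows)
    return min(groups.values())

print(k_anonymity(records))  # -> 1: two of these "pseudonymous" records are unique
```

That's the entire linkage attack, conceptually: the attacker doesn't need to crack the identifier, only to join on the attributes travelling alongside it.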

People who should know better are routinely wrong — by which I mean mathematically wrong — about the safety of identifiers. How do we expect laypeople to meaningfully consent to this?

Another evergreen proposal is that we can somehow use a web of contracts to ensure self-regulation. We have that — we've had it since 2000[11]. The deal struck in the negotiations that took place in 1999-2000 was essentially that the tracking industry would be allowed to keep operating in exchange for setting up a self-regulatory regime[12]. I don't think it's unreasonable to consider that 20 years is more than enough of a chance for this approach to prove itself. We gave it more than a fair shot; the deal is off.

I'm not ruling out regulatory help (there are positive signs), but at any rate regulators need us to build technological alternatives; otherwise there is nothing for them to point to as better.

Then there's competition. Another recurring contention is that, after two decades in which unfettered personal data broadcasting led to some of the most significant concentrations of market power in the history of humankind, what we really need is more unfettered personal data broadcasting, because that will clearly lead to greater competition. I can see that there are non-market factors at play in the current situation, but still: given the empirical evidence, the burden of proof sits squarely with those making that claim.

From what I've read, it's not obvious that this is much more than wishful thinking. Data is inherently relational, meaning it forms a network, and the value of data depends in large part on its volume and its variety. This creates network effects: when data is broadly available, having even just a little more volume or variety than others can lead to winner-take-all outcomes[20]. The OECD's analysis concluded that data enables multi-sided markets which can combine with increased returns to scale and scope, leading to dominance, winner-take-all, and competition for the market rather than in the market[21]. In a separate study, they point to non-linear returns and network effects, with obvious competition implications[22].
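The non-linearity is the whole story here. As a toy illustration (every number below is made up, and this is a sketch of the general preferential-attachment dynamic, not of any cited model), a small simulation shows how returns that are even mildly superlinear in data volume turn a small initial lead into near-total dominance:

```python
import random

def simulate(alpha, steps=10_000, seed=1):
    """Each new data point joins a platform with probability proportional
    to (current volume) ** alpha. alpha > 1 models non-linear returns."""
    random.seed(seed)
    volumes = [100, 110]  # platform B starts with a small edge
    for _ in range(steps):
        weights = [v ** alpha for v in volumes]
        winner = random.choices((0, 1), weights=weights)[0]
        volumes[winner] += 1
    total = sum(volumes)
    return volumes[0] / total, volumes[1] / total

for alpha in (1.0, 1.5):
    a, b = simulate(alpha)
    print(f"alpha={alpha}: A={a:.0%}, B={b:.0%}")
# Linear returns (alpha=1.0) roughly preserve the initial ~48/52 split;
# with alpha=1.5 one platform absorbs essentially everything.
```

With linear returns the market stays split; nudge the exponent above 1 and you get competition for the market rather than in it.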

And speaking of dysfunctional markets, there's good evidence that the claim that people will enter into a "value exchange" involving their data is an imaginary position that only data marketers believe in[23]. People don't see a value exchange; they just hate you in silence.

Which brings us to another trope: that we somehow don't have a definition of privacy. I mean, it's a somewhat contested space, but not that much anymore. We have a pretty broadly accepted definition[24], one that has its own conference[25], is massively cited[26], and is the definition used in social and data science textbooks[27]. I covered it for a general audience[28]. I also introduced it for use in a standards context[29], which hopefully TAG/PING can find some consensus around soon.

I could keep going, but this might be enough of a literature review for today. The way I see it, we have a pretty straightforward choice. We can keep loudly blustering that doing more of exactly what we've done for the past two decades will somehow magically lead to different outcomes, or we can bite the bullet, whether we like it or not, and find solutions that actually work.

So with this in mind, I would like to suggest that we stop wasting time revisiting failed strategies to save a broken system, and instead invest in making it work.

