Skip to content

Instantly share code, notes, and snippets.

What would you like to do?
Notes from Peter Alvaro's talk at PWL, August 21, 2014

"Consensus is impossible" is a gloss for "there's always an execution that doesn't terminate".

Once you've internalized that you can't distinguish slow nodes from dead nodes, how do you deal with that?

"When you're engaged in a battle of wits with a Sicilian and death is on the line, you certainly need to think about epistemic logic."

A correct distributed program achieves (nontrivial) distributed property X. So we need to ask:

  1. is X even attainable?
  2. what's the cheapest* protocol that gets me X? (* according to some cost metric)
  3. how should I implement it?

(LK: What's a "distributed property"?)

A strong claim about distributed correctness properties: Uncertainty is what makes reasoning about distributed systems difficult. Uncertainty is the abundance of possibilities. Knowledge is the dual of possibility.

(LK: In modal logic, there are "possible" and "necessary" connectives. Are they dual?)

A triangle between properties, protocols, and people. People are flaky.

The "Consistency without borders" SOCC paper bridges the gap between properties and protocols.

If Alice wants to send a message to Bob over an unreliable protocol, well, we can have Alice send the message infinitely many times. But that's no good; then we have to store the message for an infinitely long time! (And Bob might receive it infinitely many times.)

OK, what if Bob has to acknowledge receipt? If we get the ack we can garbage-collect. But Alice still has to send the message infinitely many times, because it still might get lost.

OK, so then Bob has to have a receive buffer on his side, and then send the ack as soon as it's full; if the message arrives again, that has no effect. But then we have to keep Bob's buffer around forever, because he doesn't want to accidentally send the ack more than once, and what if the message arrives from Alice again? He doesn't know if she's received the ack...

OK, so what if they both send acks...You see where this is going.

"Warmup's over! Logic time!"

exists and forall are dual:

exists x. phi === not (forall x. not phi)

likewise, diamond and box are dual:

diamond phi === not (box (not phi))

in epistemic logic, diamond means "possible", box means "known" (LK: what I've heard is "necessary"; maybe "certain" would work too)

Let's say that "E^inf phi" means "Everyone knows that everyone knows that ... everyone knows that phi". Or, in other words, "It is common knowledge that phi". "C phi" for short.

OK, here we go: we want to show that "common knowledge" is not attainable via protocol. In other words, we could have a protocol that ensures that "everyone knows phi", or even "everyone knows everyone knows phi" or whatever, up to some arbitrary huge number of repetitions. But it can't be infinite.

In order to show this we need to leverage modal logic.

Some notation: S |= phi means "phi holds in the model S".

What is the model (or structure) in which a given formula is true?

"We need a database! The structure for first-order logic is just a database! And that's why first-order logic is such a great way to write queries of databases!"

Incidentally, the Boom Analytics paper and "I Do Declare" are both good papers about using FO logic to program distributed systems!

But we're supposed to be talking about modal logic! We need a richer structure than we had before. We need a structure that can interpret the propositional formula under different modalities.

A Kripke structure: (W, pi, R)

  • W is a set of worlds (a set of "states of affairs")
  • For each element of W, pi is a propositional structure
  • R is an accessibility relation

Emerson and Clarke got their model checking Turing award for observing that this mathematical structure -- a Kripke structure -- looks just like a state machine.

(LK: ooh, I never knew that! What does this have to do with Dreyer's state transition systems and relational transition systems?)

OK, enough semantics. Now let's model a distributed system. We've got some set of processors, p1, p2, ...

h(p, r, t) is the history of processor p as of time t and run r.

We're not going to go over the whole proof that common knowledge is not attainable via protocol, but here's an interesting and useful lemma that we use in the proof:

If, in two different runs (r and r') of the same protocol, some h(p, r, t) = h(p, r', t), then it must be the case that if something is common knowledge in the first, then it's common knowledge in the second!

(Aside for the philosophers out there: the only difference between knowledge and belief is that you throw in an axiom that says, 'if I know phi, then phi'. "Knowledge is belief in true things.")

(A corollary of our result about common knowledge: the "two generals" will never attack!)

OK. So we've just shown that we can't have common knowledge. Does that mean we can't have consensus?! Where does that leave us, in practice?

Reality check: let's look at the fragile assumptions on which the proofs rest.

We assumed:

  • a deterministic protocol
  • simultaneous* agreement is necessary (* LK: I don't know what this means)
  • "communication not guaranteed"

"It's not the case that there's a demon in the network that chooses exactly which packets you need to have get through, and then drops them." But in order to do the proof, we do, basically, have such a demon.

Also! We didn't prove that you can't have common knowledge; we proved that you couldn't reliably obtain any via protocol. But you might have some a priori! Real systems do. And, given that a priori knowledge, you might be able to bootstrap common knowledge...

Oh, wait! What about the "how should I implement it?" up above? What about the whole language design/verification/synthesis part of this? (See Halpern and Fagin, "Knowledge-based programming" [PODC '95] for some ideas.)

Oh, finally: monotonic growth of knowledge is interesting. If we can ensure that we accumulate knowledge monotonically, what kind of guarantees can we make? In consensus, receiving messages doesn't always add to your knowledge; it sometimes takes away from your knowledge.

Closing points:

  • Knowledge is the dual of possibility
  • Local knowledge dictates protocol behavior
  • "even more tantalizing", the purpose of protocols is obtaining a particular level of distributed knowledge
  • There are deep connections between semantic structures and system behavior. Pick a logic and find a rich correspondence between it and something in the real world, and..."profit".
  • Common knowledge is unattainable via protocol (but there's hope)

This comment has been minimized.

Copy link

commented Aug 22, 2014

In case I'm understanding this right: If anyone knows about common knowledge before anyone else, then it's not actually common knowledge yet, so it would be incorrect for them to "know" it. So common knowledge would have to be agreed upon simultaneously.


This comment has been minimized.

Copy link

commented Aug 23, 2014

that's right. halpern and moses also define a weaker form called "eventual" common knowledge -- but they show that even this is unattainable via protocol if communication is not guaranteed.


This comment has been minimized.

Copy link

commented Aug 23, 2014

(LK: What's a "distributed property"?)

a property that doesn't hold in vacuous executions in which no messages are received.

But Alice still has to send the message infinitely many times

No; when Alice GCs the buffer she implicitly GCs the behavior too. but we cannot ensure that Bob receives the message exactly once, since Alice can send it some # of times before the ACK arrives.

because he doesn't want to accidentally send the ack more than once

Bob probably doesn't care about that, but he wants to ensure that he doesn't deliver the message more than once (in case downstream processing is non-idempotent)


This comment has been minimized.

Copy link
Owner Author

commented Aug 26, 2014

@palvaro, thanks for the correx. The part of your talk where you were describing the interactions between Alice and Bob went too fast for me (and I had trouble caring about the details, since I suspected you were leading up to a "this will never work in general" conclusion), so I think I was sort of just making stuff up in that part of my notes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.