@trptcolin
Last active August 29, 2015 14:12

Double Barrier Confusion

Recipe

The way I'm reading the ZooKeeper Double Barriers recipe, the exit condition (for Leave) is that all children of the barrier node must be deleted before any given client process can complete the Leave operation.

So it seems to me that if a client process Enters the barrier after the minimum number of processes have already joined, it can hold up the client processes that Entered earlier, because they cannot finish Leave until the latecomer's node is gone too. I feel like a typical implementation of barriers might have the minimum number of processes equal to the total number of processes, so maybe this is an edge case? And maybe this is all totally OK and normal; I'm just looking to confirm/disconfirm my interpretation.
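To make that reading concrete, here is a rough in-memory sketch of a single pass through the Leave loop as I understand it. It does not talk to ZooKeeper at all: `leave_step`, the action names, and the `p1`/`p2`/`p3` node names are all made up for illustration, and I'm assuming "lowest" and "highest" mean plain lexicographic order on the node names.

```python
def leave_step(children, me):
    """One pass of the Leave loop, as I read the recipe (hypothetical helper).

    children: names of the znodes currently under the barrier node
    me: the name of this process's own znode
    Returns (action, node_to_watch).
    """
    nodes = sorted(children)  # assumption: lexicographic order on names
    if not nodes:
        # No children left: the exit condition holds.
        return ("exit", None)
    if nodes == [me]:
        # We are the only remaining process node.
        return ("delete_and_exit", None)
    if me == nodes[0]:
        # The lowest node keeps its znode and watches the highest one.
        return ("wait", nodes[-1])
    # Everyone else deletes its own znode (if it still exists)
    # and watches the lowest one.
    return ("delete_and_wait", nodes[0])


# Barrier threshold of 2: p1 and p2 entered, then p3 entered late
# and is still working (it has not called Leave yet).
children = {"p1", "p2", "p3"}
print(leave_step(children, "p1"))  # ('wait', 'p3')
print(leave_step(children, "p2"))  # ('delete_and_wait', 'p1')
```

In this scenario p1 keeps waiting on p3, and p2 keeps waiting on p1, so neither of the two original processes can finish Leave until p3 decides to leave as well.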

  1. Am I thinking about this right? If so, is there actually a problem here? If not, why not?

Paper

A related point of confusion for me: the language in the ZooKeeper paper for the Double Barrier alternately says "all of the processes have removed their children" (which sort of jibes with the pseudocode in the recipe, except for a possible disagreement about what "all of the processes" means), and "processes watch for a particular child to disappear" (which seems at odds with this pseudocode).

  1. Are those two wordings in the ZK paper somehow consistent with one another? Assuming so, how come?
@trptcolin
Author

OK, based on some actual code in Curator, I think my initial interpretation in 1) makes sense (Curator's implementation is similar to this pseudocode). Not sure if it's actually a problem in practice; probably not.

And on 2), I didn't read carefully enough 😧: the second quote went on to say "To leave, processes watch for a particular child to disappear and only check the exit condition once that znode has been removed." This seems like it's just the performance optimization around lowest/highest process nodes in the recipe.

So I think I get this now, unless somebody wants to come along and shed some additional light on it.
