What are the parameters R, W, and N?

  • R | Minimum number of nodes that must participate in a successful read operation
  • W | Minimum number of nodes that must participate in a successful write operation
  • N | Number of replica nodes that retain copies of the system's data (or a partition thereof)

How many node failures can the system tolerate while still guaranteeing that read operations will return up-to-date values?

  • Since R is the minimum number of replicas that must respond for a read operation to succeed, at least R of the N replica nodes have to remain healthy. The system can therefore tolerate up to N - R node failures and still serve reads; for those reads to be guaranteed up to date, the read and write quorums must also overlap (R + W > N).
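
A minimal sketch of that arithmetic (the function name is illustrative, not from the Dynamo paper):

```python
def tolerable_read_failures(n_replicas: int, read_quorum: int) -> int:
    """A read needs responses from at least `read_quorum` of the
    `n_replicas` copies, so the system can lose the difference."""
    assert 1 <= read_quorum <= n_replicas
    return n_replicas - read_quorum

# With N = 3 and R = 2, one replica can be down and reads still succeed.
print(tolerable_read_failures(3, 2))  # -> 1
```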

What’s “eventual consistency”?

Describes a system in which updates to its persistent state are eventually copied to all N replica nodes. This implies a window during which not every replica contains a particular update.
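
A toy sketch of that window, assuming a coordinator that applies a write locally and copies it to the other replicas after a delay (this is an illustration, not Dynamo's replication protocol):

```python
import time
from threading import Thread

class EventuallyConsistentStore:
    """Toy model: a write is applied to one replica immediately and
    propagated to the remaining replicas after a delay."""

    def __init__(self, n_replicas: int, propagation_delay: float = 0.5):
        self.replicas = [dict() for _ in range(n_replicas)]
        self.delay = propagation_delay

    def write(self, key, value):
        self.replicas[0][key] = value                     # coordinator applies locally
        Thread(target=self._propagate, args=(key, value)).start()

    def _propagate(self, key, value):
        time.sleep(self.delay)                            # the inconsistency window
        for replica in self.replicas[1:]:
            replica[key] = value

    def read(self, key, replica_index: int):
        return self.replicas[replica_index].get(key)

store = EventuallyConsistentStore(n_replicas=3)
store.write("cart:42", ["book"])
print(store.read("cart:42", 2))   # likely None: the update has not propagated yet
time.sleep(1)
print(store.read("cart:42", 2))   # ['book'] once replication completes
```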

What are some advantages and disadvantages?

  • Advantages include:
    • Consumers can access the data store with very little downtime (in Dynamo's case, at least 99.9% of the time)
    • Eventually consistent systems can sustain network failures or outages of some, if not most, of their replica nodes while remaining accessible to consumers. By comparison, strong consistency guarantees would reject an update if even a fraction of the system's replica nodes were unavailable.
  • Disadvantages include:
    • The need to resolve conflicts when read operations yield different, potentially contradictory, states for the same item.

What’s a specific example of behavior that could happen under this consistency model?

  • Amazon's shopping cart service, the example used in the text, shows that under its eventually consistent model customer carts occasionally include stale data (for example, items that were deleted). When a get operation surfaces conflicting states of the same cart, the service merges them, accepting the tradeoff that it is better to preserve every write (regardless of when it happened) at the cost of a small fraction of bad reads. Conflict resolution is pushed to the application layer, where the context for how that particular data is used is richest.
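
A hedged sketch of that merge-on-read idea, assuming conflicting cart versions are resolved by taking their union (the real service uses vector clocks and richer metadata):

```python
def merge_cart_versions(versions):
    """Application-level conflict resolution: union the item sets from all
    conflicting replicas. 'Add' operations are never lost, but an item deleted
    on one replica may resurface if another replica still holds it."""
    merged = set()
    for items in versions:
        merged |= set(items)
    return merged

# Two replicas diverged: one saw the deletion of "lamp", the other did not.
replica_a = {"book", "mug"}            # "lamp" was deleted here
replica_b = {"book", "mug", "lamp"}    # stale copy, still holds "lamp"
print(merge_cart_versions([replica_a, replica_b]))  # {'book', 'mug', 'lamp'}
```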

Why does “consistent hashing” (compared to, say, hash(key) % N) reduce the amount of data that needs to be moved when a new node comes online?

  • Computing a hash's modulo relative to the number of available nodes forces a significant amount of data movement whenever (a) a node is removed, (b) a node is added, or (c) a node is swapped out for another. In cases (a) and (b), every stored item must be rehashed and reassigned, because adding or removing a node changes the divisor used by the modulo operator. If this reorganization does not happen, some read/write operations will hash to a position served by no node (after a removal) or by a different node (after an addition). In case (c), the divisor is unchanged, so the only work is copying the departing node's data to its replacement. In every case, a system designed around fault tolerance (i.e. some nodes being unavailable) pays a performance penalty for each membership change.
  • By contrast, consistent hashing removes the dependency on the number of nodes. It maps both nodes and keys onto a hash ring whose circumference is divided into zones of responsibility: each node is assigned a zone, and each key hashes to a point on the ring. A node is responsible for every key that hashes into its zone. If a particular node is removed, only the keys in its zone need to be remapped to the remaining nodes; every other key keeps its zone. Adding a node resolves in a similar fashion. The total amount of remapping work is therefore considerably reduced, as the sketch below demonstrates.
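
The difference is easy to measure with a small sketch. The ring below is a bare-bones consistent hash with one position per node (Dynamo's actual scheme uses virtual nodes); the output is the fraction of keys that change owner when a fourth node joins:

```python
import hashlib
from bisect import bisect_right

def h(value: str) -> int:
    """Stable hash: map a string to a large integer position."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

def mod_assign(key: str, nodes: list) -> str:
    """Naive scheme: owner is nodes[hash(key) % number_of_nodes]."""
    return nodes[h(key) % len(nodes)]

class HashRing:
    """Minimal consistent-hash ring (one position per node, no virtual nodes).
    A key belongs to the first node clockwise from its position on the ring."""
    def __init__(self, nodes):
        ring = sorted((h(n), n) for n in nodes)
        self.positions = [pos for pos, _ in ring]
        self.nodes = [name for _, name in ring]

    def assign(self, key: str) -> str:
        i = bisect_right(self.positions, h(key)) % len(self.nodes)
        return self.nodes[i]

keys = [f"key-{i}" for i in range(10_000)]
old_nodes, new_nodes = ["n0", "n1", "n2"], ["n0", "n1", "n2", "n3"]

moved_mod = sum(mod_assign(k, old_nodes) != mod_assign(k, new_nodes) for k in keys)
old_ring, new_ring = HashRing(old_nodes), HashRing(new_nodes)
moved_ring = sum(old_ring.assign(k) != new_ring.assign(k) for k in keys)

print(f"modulo:    {moved_mod / len(keys):.0%} of keys moved")   # roughly 75% at these sizes
print(f"hash ring: {moved_ring / len(keys):.0%} of keys moved")  # only keys in the new node's zone
```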