@kubek2k
Last active May 25, 2017 14:43

Notes from a lecture by Kyle "aphyr" Kingsbury

original talk

He started with the problems he has found in several widely used databases. Then he moved on to commercial products, describing in detail what they are built on and what kinds of problems they can run into.

Crate.io

This is a DB built solely on Elasticsearch (which is interesting from our standpoint). According to him, ES has the following problems:

  • allows dirty reads (writes made by a transaction that has not committed are visible to other clients)
  • is based on a broken consensus algorithm

The conclusion was that ES should not be used as the primary record store.
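To make the term "dirty read" concrete, here is a minimal toy sketch in Python. It is not Elasticsearch code; `ToyStore` and all of its methods are hypothetical names invented for illustration. It only shows what it means for a reader to see a write that is later rolled back.

```python
class ToyStore:
    """Hypothetical toy key-value store, used only to illustrate dirty reads."""

    def __init__(self):
        self.committed = {}  # values that were committed
        self.pending = {}    # staged writes of one in-flight transaction

    def write(self, key, value):
        # Stage a write; it is not committed yet.
        self.pending[key] = value

    def commit(self):
        # Make all staged writes part of the committed state.
        self.committed.update(self.pending)
        self.pending.clear()

    def rollback(self):
        # Discard all staged writes.
        self.pending.clear()

    def dirty_read(self, key):
        # Deliberately wrong: staged writes leak to every reader.
        return self.pending.get(key, self.committed.get(key))

    def clean_read(self, key):
        # Only committed data is visible.
        return self.committed.get(key)


store = ToyStore()
store.write("balance", 100)
seen_dirty = store.dirty_read("balance")      # sees the uncommitted write
seen_clean = store.clean_read("balance")      # sees nothing yet
store.rollback()
after_rollback = store.dirty_read("balance")  # the value the reader saw is gone
```

A client that acted on `seen_dirty` made a decision based on data that, from the store's point of view, never existed.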

CockroachDB

  • Based on Raft
  • has problems with linearizability of operations (the observed order of operations that were performed one after another can diverge)
  • uses wall-clock time as a factor in synchronization and leader election, so the machines need really well synchronized clocks
  • to be sure that one action happens after another, in case of a failure the client has to wait for the potential clock skew to elapse before continuing with the next operation
  • he also pointed out that Cockroach has a nice feature of distinguishing between retryable and non-retryable exceptions (it signals whether an operation that failed, e.g. because of a timeout, may still have taken effect, or simply has to be repeated)
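The last two points suggest a client-side pattern, sketched here in Python. Everything in it is an assumption made for illustration: `RetryableError`, `AmbiguousError`, `MAX_CLOCK_SKEW_S` and `run_in_order` are hypothetical names, not part of any CockroachDB driver, and the skew bound is an arbitrary value.

```python
import time


class RetryableError(Exception):
    """Hypothetical: the operation did not take effect and can be repeated."""


class AmbiguousError(Exception):
    """Hypothetical: e.g. a timeout; the operation may have taken effect."""


MAX_CLOCK_SKEW_S = 0.25  # assumed upper bound on clock skew between nodes


def run_in_order(operation, max_attempts=3, sleep=time.sleep):
    """Run `operation`, repeating it only when that is known to be safe."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except RetryableError:
            # Safe to repeat: nothing was applied.
            if attempt == max_attempts:
                raise
        except AmbiguousError:
            # Outcome unknown: wait out the potential clock skew so that
            # whatever the caller does next is ordered after this attempt,
            # then let the caller decide (for example, read the state back).
            sleep(MAX_CLOCK_SKEW_S)
            raise
```

The `sleep` parameter is injectable only so the behaviour can be checked without real delays. Blindly repeating after an `AmbiguousError` is avoided on purpose, because the first attempt may already have been applied.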

All in all, it is a good store, but some things have to be considered before using it.

MongoDB

🙌

Apparently Mongo was one of the most broken DBs (losing documents from time to time for no apparent reason). But working together with him, they actually managed to make MongoDB pass all the Jepsen tests. Since version 3.2 Mongo can use the v1 replication protocol, which no longer relies only on the wall clock for leader election (in addition, it uses a data structure called a term). An additional recommendation he made was not to use arbiters.
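The idea of a term can be sketched with a simplified Raft-style voting rule in Python. This is not MongoDB's implementation: `Node`, `request_vote` and `run_election` are hypothetical names, and real protocols also compare how up to date the candidate's log is, which is left out here.

```python
class Node:
    """Hypothetical replica that votes at most once per term."""

    def __init__(self, name):
        self.name = name
        self.current_term = 0  # highest term this node has seen
        self.voted_for = None  # candidate this node voted for in current_term

    def request_vote(self, candidate, term):
        """Return True if this node grants its vote to `candidate` for `term`."""
        if term < self.current_term:
            return False  # stale candidate from an older term
        if term > self.current_term:
            # A newer term starts: forget the vote cast in the old one.
            self.current_term = term
            self.voted_for = None
        if self.voted_for in (None, candidate):
            self.voted_for = candidate
            return True
        return False  # already voted for someone else in this term


def run_election(candidate, nodes):
    """Candidate bumps its term, votes for itself and asks the other nodes."""
    candidate.current_term += 1
    term = candidate.current_term
    candidate.voted_for = candidate.name
    votes = 1
    for node in nodes:
        if node is not candidate and node.request_vote(candidate.name, term):
            votes += 1
    return votes > len(nodes) // 2  # strict majority required
```

Because terms are plain counters that only grow, deciding which election is newer does not depend on how well the machines' clocks agree.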
