Skip to content

Instantly share code, notes, and snippets.

@Haseeb-Qureshi
Created February 3, 2019 06:18
Show Gist options
  • Star 2 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save Haseeb-Qureshi/62380ae5d50898c6174fba31cef472eb to your computer and use it in GitHub Desktop.
Save Haseeb-Qureshi/62380ae5d50898c6174fba31cef472eb to your computer and use it in GitHub Desktop.
# Urkel Trees: An optimized and cryptographically provable key-value store for decentralized naming (SBC19)

Urkel Trees: An optimized and cryptographically provable key-value store for decentralized naming

Boyma Fahnbulleh (Handshake)

  • Merkle Trees are great, but can we do better?
  • Ethereum's Wish List on better Merkle Trees:
    • Wanted a key-value store for the state
    • Allow updates without having to reconstruct the entire tree
    • Has bounded depth
    • History independent: root hash doesn't depend on ordering among updates (i.e., commutative updates)
    • Merkle Patricia Tree is basically a fancy radix hash tree
      • Compresses any node that doesn't have more than 1 child
      • Saves a lot of memory—walking down an empty-ish paths is common
      • Stores a key for a value in a DB (like LevelDB)
      • The pointer to the next node in the tree has a cryptographic hash of that node
  • Looking at these, is there a better construction tailor-made for Handshake?
  • What is Handshake?
    • Blockchain-based DNS naming system, intended to replace the DNS root zone file & servers
    • Current root zone embedded in blockchain space, and rest of the space will be auctioned on blockchain
    • Light client is super important in this kind of system
      • Regular users who make DNS lookups will want something very light
  • Handshake's Merkle Tree wish list:
    • Key-value store with minimal storage
    • Exceptional performance on SSDs (for end consumers)
    • Small proof size (< 1kb)
    • History independence (again, commutativity of updates)
    • Candidates?
      • Rebalancing data structures are no good, since they're not history independent
      • Merkle Patricia Tree, right?
        • But proof sizes are too large, and storage requirements are too large
      • Sparse Merkle Tree?
        • Didn't cut it in terms of performance, requires a bunch of DB lookups
        • Way too many rounds of hashing per insertion
  • What can we use instead?
  • Urkel Tree: an optimized and cryptographically provable key value store for decentralized naming
    • Intended to be stored in append-only flat files
    • Base-2 Merkleized trie
      • Based on the Merkle Set, invented by Bram Cohen
    • Two types of nodes: internal nodes and leaf nodes
      • Storing 4 types of nodes like in Patricia Trees sucks
    • Append only files gives you crash-consistency
  • How does it work?
    • Basically involves a lot of tombstone nodes and common hash prefixes...
    • Honestly, just watch the talk if you want to know. Easier to explain w/ ASCII art in the slides than verbally.
  • Benchmarks
    • 50 million 300-byte leaves, 500 leaf batches, periodic commissions of 44K values
    • 500 value batches averaged 100-150ms
    • 44K value commissions averaged 400-600ms
    • Average node depth of 27-28 bits
    • 800 bytes proof size
    • 1-2ms proof creation time
  • One problem: key grinding attacks to force the tree to grow (27 bits is small)
    • Bitcoin produces 72-80 bit collisions on block headers all the time
    • Can do a base-2 Merkleized radix tree for DoS protection
      • But adds complexity to the code
    • Called "Merklix variant", based on Merklix Tree
  • Summary:
    • More performant
      • Urkel Tree allows you to write straight to disk, no underlying KV store
    • Simple
      • Only two types of nodes
    • Storage
      • Nodes are constant size
    • Proof size
      • Are really small
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment