Notes on foundational modules of Dat code:

Hypercore

Feeds

totally ordered sets of blocks encoded as merkle trees

static: pure merkle trees, created by merkle-tree-stream

node types are leaf, node & root
flat-tree numbers all nodes in tree for given tree size
tree root = hash of all roots (unless $\log_2 |\mathrm{blocks}| \in \mathbb{N}$, i.e. the block count is a power of two and the tree already has a single root)
stream is not written in flat-tree numeric order, but as an upwards walk of the flat tree, emitting each node as soon as it completes (0, 2, 1, 4, 6, 5, 3, …); see the sketch below
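
A quick sketch of these numbering rules using the flat-tree module (assuming its documented API; the printed values follow from the scheme above):

```js
// flat-tree numbering: leaves at even indexes, parents at odd.
var flat = require('flat-tree')

console.log(flat.parent(0))    // 1 -- parent of leaves 0 and 2
console.log(flat.sibling(0))   // 2
console.log(flat.children(3))  // [1, 5]

// fullRoots takes the index just past the last leaf (2 * blockCount).
// 3 blocks -> two roots, so the tree root is hash(node 1, node 4):
console.log(flat.fullRoots(6)) // [1, 4]
// 4 blocks (a power of two) -> a single root:
console.log(flat.fullRoots(8)) // [3]
```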

live: an ed25519 keypair that signs static roots
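
A minimal sketch of what the keypair buys you, using sodium-signatures (an ed25519 wrapper from the same module ecosystem; whether hypercore uses this exact module, and the byte layout it actually signs, are assumptions to check against the source):

```js
var signatures = require('sodium-signatures')

var keys = signatures.keyPair()  // publicKey doubles as the feed id

// Pretend this is the current tree root (hash of all merkle roots):
var treeRoot = require('crypto').createHash('sha256')
  .update('roots go here').digest()

var sig = signatures.sign(treeRoot, keys.secretKey)
console.log(signatures.verify(treeRoot, sig, keys.publicKey)) // true

// Holding secretKey is the write capability; anyone with just
// publicKey can verify that appends were signed by the owner.
```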

hypercore feed implementation

all feeds are polymorphic under reading (static, live, and live with secretKey)

unavailable blocks can be fetched implicitly, or reads can opt in to erroring instead

static & live feeds are polymorphic for appending (static as a purely functional data structure, live via a write capability (secretKey))
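
From memory of the leveldb-era hypercore API these notes describe (names approximate, not authoritative), the read/append polymorphism looks roughly like:

```js
var hypercore = require('hypercore')
var level = require('level')

var core = hypercore(level('./hypercore.db'))

// No key given: generates a keypair, i.e. a live writable feed.
var feed = core.createFeed()

feed.append(['hello', 'world'], function (err) {
  if (err) throw err
  // get() looks the same for static, live, and live + secretKey
  // feeds; a locally missing block would be fetched from peers.
  feed.get(0, function (err, block) {
    if (err) throw err
    console.log(block.toString()) // 'hello'
  })
})

// Someone else's feed: pass their public key, no secretKey, so it
// is readable but not appendable.
// var theirs = core.createFeed(theirPublicKey)
```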

tree-index & bitfield store block<->hash correspondence, indexing merkle tree and tracking actual available blocks in core (feed may be locally sparse)
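
The bitfield side of this is just one bit per block index; a self-contained sketch (not hypercore's actual implementation) of how sparse availability can be tracked:

```js
function Bitfield (blocks) {
  this.buffer = Buffer.alloc(Math.ceil(blocks / 8))
}

Bitfield.prototype.set = function (i) {
  this.buffer[i >> 3] |= 128 >> (i & 7)
}

Bitfield.prototype.get = function (i) {
  return !!(this.buffer[i >> 3] & (128 >> (i & 7)))
}

// A sparse feed: we hold block 3 but not block 4.
var have = new Bitfield(16)
have.set(3)
console.log(have.get(3), have.get(4)) // true false
```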

storage delegated to core._data

random per-feed namespace (_prefix) keyed by block number
feed blocks are not automatically deduplicated (within or between feeds)
does hypercore-archiver deduplicate based on data block hash?
is replication the only deduplicating operation?
core._nodes (= protobuf encoded merkle nodes) is trusted, core._data is not (can opt in to verify but doesn’t do a full merkle proof)
why are node indexes also saved in the value? aren’t they available from leveldb keys?
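
What the opt-in verification amounts to, as I read it: hash the stored data block and compare it against the trusted hash in core._nodes; this checks the block against its node, but is not a full merkle proof up to a signed root. (Sketch only; the real hash construction is hypercore's own and is an assumption here.)

```js
var crypto = require('crypto')

// trustedNode comes from core._nodes (protobuf decoded): it carries
// the hash the merkle tree committed to for this block.
function verifyBlock (dataBlock, trustedNode) {
  var hash = crypto.createHash('sha256').update(dataBlock).digest()
  return hash.equals(trustedNode.hash)
}
```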

Core

a collection of feeds

block selection is mutable, and may be scheduled in and out

are unscheduled blocks evicted?

cores are unique, core.id = random

but this ID is not really used in user-facing ways

one db (= levelup-ish) per core contains many feeds

separated into sublevels

nodes - protobuf merkle tree nodes - index, size & hash
data - actual data blocks
feeds - keys etc
signatures - live feeds
bitfields - block availability
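
A sketch of how one levelup db can be carved into those sublevels, using subleveldown (whether hypercore uses this exact module, and the key layout shown, are assumptions):

```js
var level = require('level')
var sub = require('subleveldown')

var db = level('./core.db')

var nodes = sub(db, 'nodes')           // protobuf merkle nodes
var data = sub(db, 'data')             // actual data blocks
var feeds = sub(db, 'feeds')           // keys etc
var signatures = sub(db, 'signatures') // live feed signatures
var bitfields = sub(db, 'bitfields')   // block availability

// Per-feed random prefix + block number, per the notes above
// (hypothetical key layout):
data.put('a1b2c3!0', Buffer.from('hello'), function (err) {
  if (err) throw err
})
```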

Hyperdrive

Archives

An archive is a hypercore feed

live or static

feed contains all data (kinda like a git tree; see hyperdrive-encoding)

data is written to the feed with rabin chunking (not yet available in the browser)
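
A rough sketch of writing into an archive with the hyperdrive API of this era (method names from memory; treat as approximate):

```js
var hyperdrive = require('hyperdrive')
var level = require('level')

var drive = hyperdrive(level('./drive.db'))
var archive = drive.createArchive() // a hypercore feed underneath

// File contents get chunked (rabin chunking per the note above)
// and appended to the feed as blocks.
var ws = archive.createFileWriteStream('hello.txt')
ws.end('hello world')

archive.list(function (err, entries) {
  if (err) throw err
  console.log(entries.map(function (e) { return e.name }))
})
```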

Drive

collection of archives

implemented as a hypercore feed containing hypercore feeds

hyperdrive-ln vs. hyperdrive-named-archives?

Replication

swarms & replicators

swarms allow replicators to establish p2p links

discovery-swarm, webrtc-swarm, etc

peers establish pairwise streams between them

replicators talk to each other over node streams

replication = idempotent block exchange (for appendable live hypercores - total order assumed)
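
Putting the pieces together, roughly (discovery-swarm API plus the era's replicate(); exact names are from memory):

```js
var discovery = require('discovery-swarm')
var swarm = discovery()

// discoveryKey is derived from the public key, so peers can find
// each other without announcing the key itself.
swarm.join(archive.discoveryKey)

swarm.on('connection', function (connection) {
  // connection is a duplex node stream to the peer; piping the
  // replication stream both ways makes the exchange symmetric
  // and idempotent.
  var stream = archive.replicate()
  connection.pipe(stream).pipe(connection)
})
```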

Do swarm joins deduplicate {simple-,}peers?

can you even replicate multiple things on the same peer connection? (not data channel)
