@martinheidegger
Last active October 9, 2019 07:53
Self-healing hypercores

In DAT, append-only logs addressable by a key are called hypercores. When you copy one such log to another computer and write to it, there are suddenly two logs in two different locations, the new one being a fork of the first. Both logs can be found under the same ID, making it impossible to distinguish them from one another.

DAT mitigates this by making every log "single-writer": you can append data to the log only with a secret writeKey. By convention that secret is stored in a folder hidden from the user, so that the user does not write to the log from a different location.

However, that doesn't mean that the user can't make copies of the secret and use it on another computer!
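
To make the problem concrete, here is a minimal sketch using Node's built-in `crypto` module as a stand-in; it is not how hypercore itself stores or signs data. The names `signEntry`, `verifyEntry` and `copiedSecret` are made up for this illustration. It shows that entries signed with a copy of the secret verify just as well as entries signed with the original, so a signature alone cannot tell two forks apart.

```javascript
const crypto = require('crypto')

// Stand-in for the log's key pair (assumption: an ed25519 pair).
const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519')

// Simulate copying the secret to a second computer: export it, import it again.
const copiedSecret = crypto.createPrivateKey(
  privateKey.export({ type: 'pkcs8', format: 'pem' })
)

// Hypothetical helpers: sign an entry with a secret, verify it with the public key.
function signEntry (secret, text) {
  return crypto.sign(null, Buffer.from(text), secret)
}

function verifyEntry (text, signature) {
  return crypto.verify(null, Buffer.from(text), publicKey, signature)
}

// Two different "entry 3"s, one written on each computer.
const sigOld = signEntry(privateKey, 'entry 3 written on the old computer')
const sigNew = signEntry(copiedSecret, 'entry 3 written on the new computer')

// Both verify against the same public key, although the logs now diverge.
console.log(verifyEntry('entry 3 written on the old computer', sigOld))
console.log(verifyEntry('entry 3 written on the new computer', sigNew))
```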

User Scenarios:

  • A user made a backup of their computer, including an append-log and its secret key. The computer breaks; they buy a new one and restore from that older backup. When a new file is added, a fork is accidentally created.
  • A user buys a new computer and wants to toss the old one. The user needs to move the secret key from one computer to the next; after moving it, they replicate the logs. Before the log is entirely replicated, a sync process adds new data at the end, creating an accidental fork.

Now, what happens when you distribute two forked logs with the same ID to different devices? When a device receives a new part, it checks whether that part belongs to the log and cannot verify it (hopefully; I haven't tested that case yet). As a result, the whole log is put into question :) - Merkle trees, gotta love 'em.

This is just the case I have been thinking about: what if there could be an auto-healing process for append-logs?

Reader case before writer merge

If a reader encounters a fork of a hypercore, it reverts to the last non-conflicting version.
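
A minimal sketch of that reader rule, under a simplified model that is an assumption and not the hypercore format: each fork is an array of entries, and each entry carries a `hash` that covers the entry and everything before it. The function names are hypothetical.

```javascript
// Count how many leading entries two forks share.
// Assumes entry.hash covers the entry and its whole history,
// so equal hashes at an index mean equal logs up to that index.
function lastNonConflictingLength (forkA, forkB) {
  const max = Math.min(forkA.length, forkB.length)
  let length = 0
  while (length < max && forkA[length].hash === forkB[length].hash) {
    length++
  }
  return length
}

// What a reader would expose once it has seen both forks:
// only the shared prefix, nothing from either diverging tail.
function readerView (forkA, forkB) {
  return forkA.slice(0, lastNonConflictingLength(forkA, forkB))
}
```

With one fork ending in an entry hashed `a3` and the other in `b3`, `b4` after a shared `h1`, `h2` prefix, the reader would fall back to the two shared entries.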

Writer case, merging

If a writer encounters a fork of a hypercore, it looks at the last non-conflicting version, then it calls a hook to ask what to do with the difference. Either:

  • drop one, use the other
  • use one, append the entries of the other
  • drop both
  • custom merging

(automatic: Use the longer one)

Once the selection is made, the writer adds a special "merged" entry after the last non-conflicting entry. Then it adds all the entries as defined by the merging strategy.
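
The writer steps could be sketched like this. Everything here is an assumption made for illustration: forks are arrays of `{ hash, value }` entries, the hook is a plain function receiving the two diverging tails, and the merge entry references the hash of the first diverging entry of each fork. A real log would also have to hash and sign the re-appended entries, which this sketch leaves out.

```javascript
// Hypothetical hooks, one per option in the list above.
const strategies = {
  useFirst: (tailA, tailB) => tailA,
  useSecond: (tailA, tailB) => tailB,
  appendSecond: (tailA, tailB) => tailA.concat(tailB),
  dropBoth: () => [],
  longer: (tailA, tailB) => (tailB.length > tailA.length ? tailB : tailA)
}

function mergeForks (forkA, forkB, resolve = strategies.longer) {
  // Find the length of the shared, non-conflicting prefix.
  let common = 0
  while (
    common < forkA.length &&
    common < forkB.length &&
    forkA[common].hash === forkB[common].hash
  ) common++

  const tailA = forkA.slice(common)
  const tailB = forkB.slice(common)

  // One log is just behind the other: nothing to merge.
  if (tailA.length === 0 || tailB.length === 0) {
    return (forkB.length > forkA.length ? forkB : forkA).slice()
  }

  // The merge entry goes right after the last non-conflicting entry.
  const mergeEntry = { merged: [tailA[0].hash, tailB[0].hash] }

  // Re-append only the values; old hashes no longer apply after the merge entry.
  const appended = resolve(tailA, tailB).map(entry => ({ value: entry.value }))
  return [...forkA.slice(0, common), mergeEntry, ...appended]
}
```

Passing the tails instead of the whole forks keeps the hook small: custom merging only has to decide what to do with the entries that actually differ.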

Reader case after writer merge

The reader looks at the entry following the last non-conflicting entry in every fork. If that entry is a merge entry, which states which forks it merges, the reader takes that fork as the fork-of-truth.

A merge entry looks like:

{ merged: ['hash-a', 'hash-b'] }
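
A minimal sketch of how a reader might use such an entry, again under assumptions: forks are plain arrays of entries, `commonLength` is the number of non-conflicting entries already determined, and both function names are hypothetical.

```javascript
// A merge entry is recognised by its `merged` list of fork hashes.
function isMergeEntry (entry) {
  return entry != null && Array.isArray(entry.merged)
}

// Look at the entry right after the shared prefix in every fork.
// The first fork that has a merge entry there becomes the fork-of-truth.
function forkOfTruth (forks, commonLength) {
  for (const fork of forks) {
    if (isMergeEntry(fork[commonLength])) return fork
  }
  // No writer has merged yet: the reader stays on the shared prefix.
  return null
}
```

This sketch does not check that the hashes listed in `merged` match the forks the reader has actually seen; a stricter reader would probably want to.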