martinheidegger/thoughts.md

## thoughts.md

      
Self Healing Hypercores

In DAT, append-only logs addressable by a key are called hypercores.
When you copy such a log to another computer and write to it, there
are suddenly two logs in two different locations, the new one being
a fork of the first. Both logs can be found under the same ID,
making it impossible to distinguish them from one another.

DAT mitigates this by making every log "single-writer": you can only append
data to the log with a secret writeKey. By convention that secret is
stored in a folder hidden from the user, so that the user does not write to
the log from a different location.

However, that doesn't mean that the user can't make copies of the secret
and use it on another computer!

User Scenarios:

- A user made a backup of their computer, including an append-only log and its
  secret key. The computer breaks, they buy a new one and restore from an old
  version of the backup. When a new file is added, a fork is accidentally created.
- A user buys a new computer and wants to toss the old one. The user needs
  to move the secret key from one computer to the next and then replicates
  the logs. Before the log is entirely replicated, a sync process adds new
  data at the end, creating an accidental fork.

Now, what happens when you distribute two forked logs with the same ID to different
devices? When a device receives a new part, it checks whether that part belongs to the
log. For a part from the other fork that check should fail (hopefully, I haven't tested
that case yet), and as a result the whole log is put into question :) - merkle trees,
gotta love 'em.

This is just the case I have been thinking about: what if there could be an auto-healing
process for append-only logs?
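
To make the failure mode concrete, here is a minimal sketch of why a part from one fork cannot be verified against the other. It is a toy hash chain, not Hypercore's actual Merkle tree, and every name in it (`chainHash`, `append`, `verifyNext`) is made up for illustration. The mixing function is FNV-1a style and not cryptographic.

```javascript
// Toy model only: a hash chain, not Hypercore's actual Merkle tree.
const GENESIS = 0x811c9dc5 // FNV offset basis, used as the seed for entry 0

// FNV-1a-style mixing seeded with the previous hash (not cryptographic).
function chainHash (prevHash, value) {
  let h = prevHash >>> 0
  for (let i = 0; i < value.length; i++) {
    h ^= value.charCodeAt(i)
    h = Math.imul(h, 0x01000193) >>> 0
  }
  return h
}

// Hash of the newest entry, or the seed for an empty log.
function head (log) {
  return log.length ? log[log.length - 1].hash : GENESIS
}

// Append a value; its hash depends on everything written before it.
function append (log, value) {
  return [...log, { value, hash: chainHash(head(log), value) }]
}

// A receiver accepts an entry only if it chains onto its own head.
function verifyNext (log, entry) {
  return entry.hash === chainHash(head(log), entry.value)
}

const shared = append(append([], 'a'), 'b') // history both computers agree on
const forkA = append(shared, 'edit A') // written on the old computer
const forkB = append(shared, 'edit B') // written on the new computer
const laterB = append(forkB, 'more')[3] // a later part of fork B

console.log(verifyNext(shared, forkA[2])) // true
console.log(verifyNext(forkA, laterB)) // false
console.log(verifyNext(forkB, laterB)) // true
```

A device that already holds fork A rejects `laterB` even though it was written with the same secret, which is the situation the healing process would have to resolve.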

Reader case before writer merge

If a reader encounters a fork of a hypercore, it reverts to the last non-conflicting
version.
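
A minimal sketch of that reader rule, assuming each fork is available as an array and that comparing entries directly stands in for comparing their hashes. `commonLength` and `readerView` are hypothetical names, not part of any DAT API.

```javascript
// Entries are modelled as plain strings; equality stands in for hash comparison.

// Number of leading entries on which both forks agree.
function commonLength (a, b) {
  const max = Math.min(a.length, b.length)
  let i = 0
  while (i < max && a[i] === b[i]) i++
  return i
}

// What a reader would expose after seeing both forks:
// only the last non-conflicting version.
function readerView (a, b) {
  return a.slice(0, commonLength(a, b))
}

const forkA = ['a', 'b', 'c1', 'd1']
const forkB = ['a', 'b', 'c2']

console.log(readerView(forkA, forkB)) // [ 'a', 'b' ]
```

The result does not depend on which fork is passed first, because only the shared prefix survives.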

Writer case, merging

If a writer encounters a fork of a hypercore, it looks at the last non-conflicting
version, then calls a hook to ask what to do with the difference. The options are:

- drop one, use the other
- use one, append the entries of the other
- drop both
- custom merging

(automatic default: use the longer one)

Once a choice is made, the writer adds a special "merged" entry after the last
non-conflicting entry. Then it adds all the entries as defined by the merging strategy.

Reader case after writer merge

For every fork, the reader looks at the entry following the last non-conflicting entry.
If that entry is a merge entry, it states which forks it merges, and the reader takes
the fork containing it as the fork of truth.

A merge entry looks like:
{ merged: ['hash-a', 'hash-b'] }
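
A reader following this rule might look like the sketch below. It assumes, as an illustration only, that entries carry a `hash` field, that the hashes in `merged` are those of each fork's first conflicting entry, and that a reader with no usable merge entry falls back to the last non-conflicting version. `pickForkOfTruth` is a hypothetical name.

```javascript
// Hypothetical sketch; entry shape and fork identification are assumptions.

// Number of leading entries whose hashes match in both forks.
function commonLength (a, b) {
  const max = Math.min(a.length, b.length)
  let i = 0
  while (i < max && a[i].hash === b[i].hash) i++
  return i
}

// forks: all logs found under the same ID, each an array of entries.
function pickForkOfTruth (forks) {
  // Length of the prefix shared by every fork.
  let n = forks[0].length
  for (const fork of forks) n = Math.min(n, commonLength(forks[0], fork))

  for (const candidate of forks) {
    const next = candidate[n] // entry following the last non-conflicting one
    if (!next || !Array.isArray(next.merged)) continue
    // The merge entry must name every other fork it claims to resolve.
    const others = forks.filter((f) => f !== candidate)
    if (others.every((f) => f[n] && next.merged.includes(f[n].hash))) {
      return candidate
    }
  }
  // No merge entry yet: revert to the last non-conflicting version.
  return forks[0].slice(0, n)
}

const entry = (name) => ({ hash: 'h-' + name, value: name })
const forkA = ['a', 'b', 'c1', 'd1'].map(entry)
const forkB = ['a', 'b', 'c2'].map(entry)
const healed = [
  entry('a'), entry('b'),
  { hash: 'h-m', merged: ['h-c1', 'h-c2'] },
  entry('c1'), entry('d1')
]

console.log(pickForkOfTruth([forkA, forkB, healed]) === healed) // true
console.log(pickForkOfTruth([forkA, forkB]).length) // 2
```

Requiring the merge entry to name every competing fork is a design choice of this sketch: a merge that predates a newer, third fork would not silently win over it.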