@jaseemabid
Last active October 9, 2018 03:06
Safe

A safe storage for your personal data like photos, documents and notes for life. Privacy and safety over anything else.

What would it be?

  1. A dead simple folder you can drag and drop files into and forget about. As simple as Dropbox, but something you can trust.

  2. A FUSE interface on Mac and Linux. Mounting a directory should be the only thing the user has to do. Files sync transparently in the background, and all UNIX tools like find and grep should work out of the box.

  3. A Dropbox/Google Drive-like interface on Android.

  4. Safely store redundant copies of the data in unrelated locations to minimise data loss.

  5. Pluggable data backends: a local hard drive, AWS S3, Google Drive or Dropbox.

  6. Encrypted at rest, with every backend treated as untrusted. Even if a backend is compromised or malicious, an attacker without the credentials should find it very hard to read any content at all.

  7. A simple blob storage model for scale, compatibility with multiple backends, and data integrity verification.

  8. True P2P; all nodes should be the same. Files added from a phone should be visible on the laptop immediately and vice versa. A local node over WiFi must get precedence over a remote server across the Atlantic. Should be efficient enough to stream the entire gallery.

  9. The corpus might be much larger than the storage available on any individual node. Fetch everything on demand. Cache intelligently based on topology.
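
Points 4, 5 and 7 fit together naturally if blobs are addressed by a hash of their content. The sketch below is only one possible shape, not a settled design: the `Backend` interface, `MemoryBackend`, and the `store_redundant`/`load_redundant` helpers are all hypothetical names, SHA-256 is an assumed choice of hash, and encryption is left out entirely (with encryption, the key would presumably be the hash of the ciphertext).

```python
import hashlib
from typing import Dict, Iterable, Protocol


class Backend(Protocol):
    """Hypothetical minimal interface a pluggable backend would implement."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...


class MemoryBackend:
    """In-memory stand-in for a local disk, an S3 bucket, and so on."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]  # raises KeyError if the blob is missing


def store_redundant(backends: Iterable[Backend], data: bytes) -> str:
    """Write the blob to every backend; the key is the SHA-256 of the content."""
    key = hashlib.sha256(data).hexdigest()
    for backend in backends:
        backend.put(key, data)
    return key


def load_redundant(backends: Iterable[Backend], key: str) -> bytes:
    """Return the first copy whose content still hashes to its key."""
    for backend in backends:
        try:
            data = backend.get(key)
        except KeyError:
            continue  # this backend does not have the blob
        if hashlib.sha256(data).hexdigest() == key:
            return data
        # Hash mismatch: the copy is corrupt or tampered with, try the next one.
    raise KeyError(f"no intact copy of blob {key}")
```

Because the key is derived from the content, any node can verify a blob it receives without trusting the backend that served it, and a damaged copy in one location is skipped in favour of an intact copy elsewhere.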

Rough design

  1. Data is split into fixed-size blobs, like the pieces of a torrent. Each blob is encrypted and signed. A file is an ordered collection of blobs. A directory is an ordered collection of files.

  2. A simple replication model. Consistent hashing as in Riak or Apache Cassandra might be overkill, but it is massively scalable and fun to implement. Tunable consistency at a directory level would be great. A simple model that replicates to all nodes might be good enough to begin with.

  3. The actual on-disk storage can be something dead simple like Git or something more sophisticated like LevelDB.

  4. True P2P sync with rsync or something else that is more optimised.

  5. Never delete anything! A last-write-wins model for conflict resolution can potentially lose data. Make it a versioned file system with smart garbage collection.
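
A rough sketch of how points 1 and 5 could combine: a file is an ordered list of blob IDs (a manifest), and every write appends a new manifest instead of replacing the old one. `VersionedStore`, its method names, and the 4 MiB default chunk size are assumptions made for illustration; encryption, signing, directories and garbage collection are deliberately left out.

```python
import hashlib
from typing import Dict, List


class VersionedStore:
    """Toy append-only store: fixed-size blobs plus per-path manifest history."""

    def __init__(self, chunk_size: int = 4 * 1024 * 1024) -> None:
        self.chunk_size = chunk_size
        self.blobs: Dict[str, bytes] = {}            # blob ID -> content
        self.history: Dict[str, List[List[str]]] = {}  # path -> manifests, oldest first

    def write(self, path: str, data: bytes) -> int:
        """Store data as a new version of path and return its version number."""
        manifest = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            blob_id = hashlib.sha256(chunk).hexdigest()
            # Identical chunks share one blob, so unchanged parts cost nothing.
            self.blobs.setdefault(blob_id, chunk)
            manifest.append(blob_id)
        versions = self.history.setdefault(path, [])
        versions.append(manifest)  # never overwrite an earlier manifest
        return len(versions) - 1

    def read(self, path: str, version: int = -1) -> bytes:
        """Reassemble a version of path; the default is the latest."""
        manifest = self.history[path][version]
        return b"".join(self.blobs[blob_id] for blob_id in manifest)
```

For example, with `chunk_size=4`, writing `b"aaaabbbbcc"` and then `b"aaaabbbbdd"` to the same path keeps both versions readable while storing only four distinct blobs, since the first two chunks are shared. Conflicting writes from two nodes would become two entries in the history rather than one silently winning.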

Inspirations

  1. Syncthing https://syncthing.net/

  2. Perkeep https://perkeep.org

    I like the design, but I don't care about the web UI or all the data feeders; the interface I need is a classic UNIX file system. Even if I build something custom, this might be a good backend.

  3. ZFS; yes the file system

    Rock solid. Every user-level program works well because it's a file system. Not sure about mobile support. Snapshots are awesome.

  4. Tahoe LAFS https://tahoe-lafs.org/trac/tahoe-lafs

    I've heard Tahoe can do most of these, but it's the tool I'm least familiar with. Needs investigation.
