Skip to content

Instantly share code, notes, and snippets.

@ahdinosaur
Created August 6, 2014 01:37
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save ahdinosaur/6629fe835b75dafa96ae to your computer and use it in GitHub Desktop.
Save ahdinosaur/6629fe835b75dafa96ae to your computer and use it in GitHub Desktop.
more info on gundb

taken from an email by mark@accelsor.com

question / answer

What type of database is gun?

It is a graph database.

How does data versioning work?

Gun combines vector clocks and timestamps together.

What is the conflict resolution strategy?

A new algorithm, modeled after the behavior of light. Both the data and the deltas sent through the system have the same format, where everything behaves as a state machine. Peers match and merge data relative to their state, such that all data is functionally immutable through time, but can still be manipulated in procedural ways within the machine's live state. This live state is defined as being the boundary between the machine's local time all the way back to the value's modified state. Any incoming data or deltas that exceed the machine's upper boundary is put in a queue to be processed when the machine reaches that state - this mitigates exploits from malicious peers that claim to be from the future, which timestamp systems are vulnerable to. Any incoming data or deltas that are earlier than the live data's state is logged, if logging is turned on. Incoming data or deltas within the machine's boundary is updated, if the states are identical then a naive lexical sort is used to converge data. A naive comparison is used in these rare cases because the only thing that matters is that absolutely every peer in the system uses the exact same comparison. This system is resilient to partitions and is highly available, however, per the CAP theorem, it cannot be globally consistent. Instead, it is eventually consistent, and this choice is important because the universe itself is naturally partitioned by the speed of light - that is, global consistency does not exist in reality. Because the algorithms mimic photons, the conflict resolution algorithms will work in Newtonian time as well as within quantum state where time breaks down. Meaning the synchronization protocol is also immune to any Grandfather Paradox, as network acknowledgements are only given relative to the machine's live state.

How do transactions work?

The vocabulary of ACID is vague, as a result we are explicit with the definition: A - Atomic, if you set a full node, or nodes of nodes, if any value is in error then nothing will be set. If you want sets to be independent of each other, you need to set each piece of the data individually. C - Consistency, if you use any reserved symbols or similar, the operation will be rejected as it could lead to an invalid read and thus an invalid state. I - Isolation, the conflict resolution algorithm guarantees idempotent transactions, across every peer, regardless of any partition, including a peer acting by itself or one having been disconnected from the network. D - Durability, if the acknowledgement receipt is received, then the state at which the final persistence hook was called on is guaranteed to have been written. The live state at point of confirmation may or may not be different than when it was called. If this causes any application-level concern, it can compare against the live data by immediately reading it, or accessing the logs if enabled.

Will it support OT or CRDT?

No. As a result arrays are not supported in gun, as they are eventually impossible to converge over significant distance based latency, even with OT. Arrays however can be emulated on the local machine by sorting nodes via a given key that they contain. You could build CRDT structures ontop of gun, which define more complicated data types that span multiple nodes in the data graph. This would be necessary for tightly coupled multi-transactional data, such as collaborative document editing - which has long been considered the holy grail of OT.

How does this compare to LevelDB?

It does not, as gun is a distributed cache that embeds into the same process as the application logic itself. Other "embedded" databases like SQLite and LevelDB still have C bindings that the application logic has to query through. In gun, you write your own custom Turing complete query logic, without having to learn a query language - if you want support for SQL or MongoDB's JSON query spec in gun, plugin modules can be written to parse them for you and run the query directly. The only similarity between gun and LevelDB is that it also handles persistence for you, which is implemented as a simple hook that could be used with any type of persistence layer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment