A spike into using datomic to model data registers.
You need Datomic. If you agree to the Datomic Free License you can install it using homebrew: brew install datomic.
- The database is a single, append-only value
- Clients connect to the database and download the entire value
- they run queries against the whole dataset using their local cached copy
- queries don’t require coordination with the central server
- further updates can be fetched incrementally
- transactions are sent to the transaction queue
- the Transactor processes transactions in series, imposing a strict total order on them
- everything happens in the same order from everyone’s point of view
- transactions can perform validation and abort
Datomic doesn’t implement its own storage backend; instead it hosts itself on another storage backend such as Postgres, Riak, DynamoDB, Infinispan, etc.
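As a minimal sketch of the moving parts described above — assuming the classic peer library (datomic.api) on the classpath, and using the in-memory storage URI that appears later in these notes:

```clojure
(require '[datomic.api :as d])

;; Create an in-memory database and connect to it.
(def uri "datomic:mem://register")
(d/create-database uri)
(def conn (d/connect uri))

;; Transactions go to the transactor's queue; deref blocks until
;; this transaction has been processed (in its total-order position).
@(d/transact conn [{:db/id "new-entity" :db/doc "hello, register"}])

;; A database *value*: an immutable snapshot the peer queries locally,
;; with no further coordination with the transactor.
(def the-db (d/db conn))
```

The string `"new-entity"` is just a tempid placeholder; the entity gets its internal numerical id when the transaction is processed.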
The basic unit of data is the “datom” or “fact”:
[entity attribute value tx-id added?]
In day-to-day operation you only need the first three.
Entities are internal numerical ids.
Attributes are special kinds of entities with metadata describing the kinds of values that they’re allowed to take, and their cardinality.
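By way of a sketch, installing an attribute is itself just a transaction of metadata (an open connection conn is assumed, and the attribute name :school/name is illustrative, not taken from the spike):

```clojure
;; Attributes are entities with schema metadata describing the
;; values they may take and their cardinality.
@(d/transact conn
  [{:db/ident       :school/name
    :db/valueType   :db.type/string      ; kind of value allowed
    :db/cardinality :db.cardinality/one  ; one value per entity
    :db/doc         "The name of a school (illustrative)"}])
```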
Queries are performed using a datalog-inspired query language. For example, to find the name of a school with the identity string “100000”:
(d/q '[:find ?name :where [?id "school" "100000"] [?id "name" ?name]] the-db)
On integrity, Datomic doesn’t help you at all: data can be excised, and there’s no built-in mechanism to detect that this has happened. You’re going to need to build an integrity-protection system on top.
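One hedged sketch of what such a layer might look like — entirely an assumption built on top, nothing here is Datomic API: chain a digest over each transaction’s datoms, so that a later excision breaks the chain:

```clojure
(import 'java.security.MessageDigest)

(defn sha-256
  "Hex SHA-256 digest of a string."
  [^String s]
  (->> (.digest (MessageDigest/getInstance "SHA-256")
                (.getBytes s "UTF-8"))
       (map #(format "%02x" %))
       (apply str)))

(defn chain-digests
  "Fold a running digest over a seq of transactions (each a seq of
  datoms), so digest n commits to all transactions up to n."
  [txs]
  (reductions (fn [prev tx] (sha-256 (str prev (pr-str tx))))
              (sha-256 "genesis")
              txs))
```

A peer could rebuild the chain from the history database and compare the final digest against a separately published head; a mismatch would reveal that something had been excised.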
History is a first-class citizen. Because the basic unit of data knows the transaction-id for when it was inserted, you can just query the history database in the same way you query the regular database. For example, to find all names that a school has ever been known by:
(d/q '[:find ?name ?tx :where [?id "school" "100000"] [?id "name" ?name ?tx]] (d/history the-db))
Note how we use the four-element clause [?id "name" ?name ?tx] to bind the transaction id.
Datomic supports running a single query over multiple data sources. Here’s an example:
(let [r-conn (d/connect "datomic:mem://register")
      s-conn (d/connect "datomic:mem://school")
      r-db   (d/db r-conn)
      s-db   (d/db s-conn)]
  (d/q '[:find ?postcode :in $register $school :where
         [$school ?id "school" "100000"]
         [$school ?id "address" ?addr-id]
         [$register ?addr "address" ?addr-id]
         [$register ?addr "postcode" ?postcode]]
       r-db s-db))
;; output: #{["EC3A 5DE"]}
- validation of approach
- separation of queries and updates
- interesting ideas:
  - local cache of whole db
  - custom functions in transactor
  - validation and process
  - (database-globally) unique monotonically increasing transaction id to order transactions independent of system clock
    - this is only achievable because of serialization behaviour of transactor
    - this means that the transactor cannot achieve CAP availability
      - that’s probably okay
- lessons of bootstrapping
  - appendix C of The Art of the Metaobject Protocol, “Living With Circularity”
- existing data model doesn’t quite fit datomic
  - no way to create schemas for entities, only attributes
  - no support for integrity checking
    - you’d have to build this on top yourself
  - querying “which records exist in a particular register” is implicitly encoded by the entity having exactly that register’s fieldset
    - or maybe we need an extra attribute to mark the entity as belonging to a particular register?
  - no support for custom datatypes afaics (though some references to desiring this feature in future)
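To illustrate the “custom functions in transactor” and validation-and-abort points above, here is a sketch using Datomic’s transaction-function API. The function name, the string attribute names (kept in the style of the queries above), and the validation rule are all illustrative, and an open connection conn is assumed:

```clojure
;; A transaction function runs inside the transactor, so it sees the
;; current database value and can abort the whole transaction by
;; throwing. Installing it is itself a transaction.
@(d/transact conn
  [{:db/ident :register/assert-new-name
    :db/fn (d/function
            '{:lang     "clojure"
              :requires [[datomic.api :as d]]
              :params   [db eid name]
              ;; Illustrative rule: reject duplicate names.
              :code     (if (seq (d/q '[:find ?e :in $ ?name
                                        :where [?e "name" ?name]]
                                      db name))
                          (throw (ex-info "name already taken"
                                          {:name name}))
                          [[:db/add eid "name" name]])})}])
```

It would then be invoked from inside a transaction, e.g. [:register/assert-new-name some-eid "St Hilda's"]; because the transactor processes transactions serially, the check-then-write is free of races.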