@philandstuff
Last active August 29, 2015 14:20
spike

datomic-spike

A spike into using datomic to model data registers.

Usage

You need Datomic. If you agree to the Datomic Free License you can install it using homebrew: brew install datomic.

analysis

communication and coordination model

  • The database is a single, append-only value
  • Clients connect to the database and download the entire value
  • they run queries against the whole dataset using their local cached copy
    • queries don’t require coordination with the central server
  • further updates can be fetched incrementally
  • transactions are sent to the transaction queue
  • the Transactor processes transactions in series, imposing a strict total order on them
    • everything happens in the same order from everyone’s point of view
  • transactions can perform validation and abort
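The "database as a value" idea above can be seen from the peer API. The following is a minimal sketch, assuming the Datomic peer library is on the classpath as in this spike; the database name `value-demo` and the use of the built-in `:db/doc` attribute are choices made only for illustration.

```clojure
(require '[datomic.api :as d])

;; Hypothetical in-memory database, used only for this illustration.
(def uri "datomic:mem://value-demo")
(d/create-database uri)
(def conn (d/connect uri))

;; (d/db conn) returns an immutable snapshot of the database value.
(def db-before (d/db conn))

;; d/transact sends the transaction to the transactor and returns a future;
;; dereferencing it waits for the result map.
(def result
  @(d/transact conn [{:db/id  (d/tempid :db.part/user)
                      :db/doc "hello"}]))

(def find-hello '[:find ?e :where [?e :db/doc "hello"]])

;; The snapshot taken before the transaction does not change underneath us...
(def seen-before (d/q find-hello db-before))
;; ...while the value returned with the transaction result includes the new fact.
(def seen-after (d/q find-hello (:db-after result)))
```

Both queries run against values held by the client, so neither one needs to coordinate with the transactor.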

storage

Datomic doesn’t implement its own storage backend; instead it hosts itself on another storage backend such as postgres, riak, dynamo, infinispan etc.

data model

The basic unit of data is the “datom” or “fact”:

[entity attribute value tx-id added?]

In day-to-day operation you only need the first three.
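To make the five-part shape concrete, here is a small sketch in plain Clojure that uses ordinary vectors as stand-ins for datoms. It is not how Datomic stores or indexes anything; the entity id `1` is made up, and the transaction ids are copied from the history output shown at the end of this document.

```clojure
;; Plain vectors standing in for datoms: [entity attribute value tx-id added?]
;; Entity id 1 is hypothetical.
(def datoms
  [[1 "name" "Sir John Cass's Foundation Primary School" 13194139534322 true]
   [1 "name" "Sir John Cass's Foundation Primary School" 13194139534327 false]
   [1 "name" "Sir John Cass's Church of England Foundation Primary School" 13194139534327 true]])

(defn current-facts
  "Replays datoms in transaction order: an assertion (added? true) adds the
  [e a v] triple, a retraction (added? false) removes it."
  [datoms]
  (reduce (fn [facts [e a v _tx added?]]
            (if added?
              (conj facts [e a v])
              (disj facts [e a v])))
          #{}
          (sort-by #(nth % 3) datoms)))

(current-facts datoms)
;; => #{[1 "name" "Sir John Cass's Church of England Foundation Primary School"]}
```

The result contains only the first three components, which is why those are all you need when working with the current database value.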

Entities are internal numerical ids.

Attributes are special kinds of entities with metadata describing the kinds of values that they’re allowed to take, and their cardinality.

Queries are performed using a datalog-inspired query language. For example, to find the name of a school with the identity string “100000”:

(d/q '[:find ?name :where [?id "school" "100000"] [?id "name" ?name]] the-db)

integrity

Datomic doesn’t help you at all here. You can excise data and there’s no in-built mechanism to detect that this has happened. You’re going to need to build an integrity-protection system on top.

history

History is a first-class citizen. Because the basic unit of data knows the transaction-id for when it was inserted, you can just query the history database in the same way you query the regular database. For example, to find all names that a school has ever been known by:

(d/q '[:find ?name ?tx :where [?id "school" "100000"] [?id "name" ?name ?tx]] (d/history the-db))

Note how the clause [?id "name" ?name ?tx] has a fourth position, which binds the transaction id.

joining multiple data sources

Datomic supports running a single query over multiple data sources. Here’s an example:

  (let [r-conn (d/connect "datomic:mem://register")
        s-conn (d/connect "datomic:mem://school")
        r-db   (d/db r-conn)
        s-db   (d/db s-conn)]

    (d/q '[:find ?postcode :in $register $school :where
           [$school ?id "school" "100000"]
           [$school ?id "address" ?addr-id]
           [$register ?addr "address" ?addr-id]
           [$register ?addr "postcode" ?postcode]]
         r-db s-db))
;;output: #{["EC3A 5DE"]}

what have we learned?

  • validation of approach
    • separation of queries and updates
  • interesting ideas:
    • local cache of whole db
    • custom functions in transactor
      • validation and process
    • (database-globally) unique monotonically increasing transaction id to order transactions independent of system clock
      • this is only achievable because of serialization behaviour of transactor
        • this means that the transactor cannot achieve CAP availability
        • that’s probably okay
  • lessons of bootstrapping
  • existing data model doesn’t quite fit datomic
    • no way to create schemas for entities, only attributes
    • no support for integrity checking
      • you’d have to build this on top yourself
    • querying “which records exist in a particular register” is implicitly encoded by the entity having exactly that register’s fieldset
    • or maybe we need an extra attribute to mark the entity as belonging to a particular register?
    • no support for custom datatypes afaics (though some references to desiring this feature in future)
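The two options for deciding which register a record belongs to can be compared in plain Clojure. This is only a sketch under assumptions: the fieldsets are copied from the `make-register` calls in this spike, records are treated as plain maps without `:db/id`, and the names `registers`, `register-of`, `tag-record` and the `"in-register"` key are hypothetical.

```clojure
;; Fieldsets copied from the make-register calls in this spike.
(def registers
  {"school"   #{"school" "name" "address"}
   "address"  #{"address" "street" "post-town" "postcode"}
   "postcode" #{"postcode"}})

(defn register-of
  "Implicit approach: a record belongs to the register whose fieldset equals
  the record's own set of keys. Returns nil when no register matches."
  [record]
  (some (fn [[register fields]]
          (when (= fields (set (keys record)))
            register))
        registers))

(defn tag-record
  "Explicit approach: mark the record with a hypothetical \"in-register\" key."
  [register record]
  (assoc record "in-register" register))

(def school
  {"school"  "100000"
   "name"    "Sir John Cass's Foundation Primary School"
   "address" "200000071925"})

(register-of school)                              ;; => "school"
(register-of (dissoc school "address"))           ;; => nil
(get (tag-record "school" school) "in-register")  ;; => "school"
```

The implicit approach breaks as soon as a record is missing an optional field, which is one argument for the explicit marker attribute.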

(ns datomic-spike.core
  (:require [datomic.api :as d]))

(defn make-entity
  ([ident] (make-entity ident :db.part/user))
  ([ident partition]
   {:db/id (d/tempid partition)
    :db/ident ident}))

(defn make-field
  ([ident type]
   (make-field ident type :db.cardinality/one))
  ([ident type cardinality]
   (assoc (make-entity ident :db.part/db)
          "field" ident
          :db/cardinality cardinality
          :db/valueType type
          :db.install/_attribute :db.part/db)))

(defn make-register [name fields text]
  {:db/id (d/tempid :db.part/user)
   "register" name
   "fields" fields
   "text" text})

(defn make-record [attr-vals]
  (assoc attr-vals :db/id (d/tempid :db.part/user)))

(defn -main
  "I don't do a whole lot."
  [& args]
  (do ;; reset everything
    (d/delete-database "datomic:mem://register")
    (d/create-database "datomic:mem://register")
    (d/delete-database "datomic:mem://school")
    (d/create-database "datomic:mem://school"))
  (let [r-conn (d/connect "datomic:mem://register")
        s-conn (d/connect "datomic:mem://school")]
    ;; bootstrapping: create the basic "field" field
    (let [create-field-field (let [field-id (d/tempid :db.part/db)]
                               [{:db/id field-id
                                 :db/ident "field"
                                 :db/valueType :db.type/string
                                 :db/cardinality :db.cardinality/one
                                 :db.install/_attribute :db.part/db}])]
      (d/transact r-conn create-field-field)
      (d/transact s-conn create-field-field)
      (d/transact r-conn [{:db/id "field" "field" "field"}])
      (d/transact s-conn [{:db/id "field" "field" "field"}]))
    ;; bootstrapping: create more fields
    (let [more-fields [(make-field "fields" :db.type/string :db.cardinality/many)
                       (make-field "register" :db.type/string)
                       (make-field "text" :db.type/string)]]
      (d/transact r-conn more-fields)
      (d/transact s-conn more-fields))
    ;; bootstrapping: create register register
    (d/transact r-conn
                [(make-register "register" #{"register" "fields" "text"}
                                "Factual data the law says government must record.")])
    ;; now we can create new registers and fields using convenience
    ;; methods
    (d/transact r-conn
                [(make-field "name" :db.type/string)
                 (make-field "address" :db.type/string)
                 (make-field "street" :db.type/string)
                 (make-field "post-town" :db.type/string)
                 (make-field "postcode" :db.type/string)
                 (make-register "address" #{"address" "street" "post-town" "postcode"} "an address")
                 (make-register "post-town" #{"post-town" "text"} "a post town")
                 (make-register "postcode" #{"postcode"} "a postcode")])
    (d/transact s-conn
                [(make-field "name" :db.type/string)
                 (make-field "school" :db.type/string)
                 (make-field "address" :db.type/string)
                 (make-register "school" #{"school" "name" "address"} "a school")
                 (make-register "address" #{"address" "street" "post-town" "postcode"} "an address")
                 (make-register "post-town" #{"post-town" "text"} "a post town")
                 (make-register "postcode" #{"postcode"} "a postcode")])
    ;; create some data
    (d/transact r-conn
                [(make-record {"postcode" "EC3A 5DE"})
                 (make-record {"post-town" "London"})
                 (make-record {"address" "200000071925"
                               "street" "Sir John Cass C Of E School, St James's Passage"
                               "post-town" "London"
                               "postcode" "EC3A 5DE"})])
    (d/transact s-conn
                [(make-record {"school" "100000"
                               "name" "Sir John Cass's Foundation Primary School"
                               "address" "200000071925"})])
    ;; change the name of a school: look up its entity id, then assert
    ;; a new value for the cardinality-one "name" field
    (let [id (d/q '[:find ?id . :where [?id "school" "100000"]] (d/db s-conn))]
      (d/transact s-conn
                  [{:db/id id
                    "name" "Sir John Cass's Church of England Foundation Primary School"}])))
  ;; now we can query the present value of the database:
  (let [s-conn (d/connect "datomic:mem://school")
        db     (d/db s-conn)]
    (d/q '[:find ?name :where [?id "school" "100000"] [?id "name" ?name]] db))
  ;;output: #{["Sir John Cass's Church of England Foundation Primary School"]}

  ;; we can even query data from multiple datasources
  (let [r-conn (d/connect "datomic:mem://register")
        s-conn (d/connect "datomic:mem://school")
        r-db   (d/db r-conn)
        s-db   (d/db s-conn)]
    (d/q '[:find ?postcode :in $register $school :where
           [$school ?id "school" "100000"]
           [$school ?id "address" ?addr-id]
           [$register ?addr "address" ?addr-id]
           [$register ?addr "postcode" ?postcode]]
         r-db s-db))
  ;;output: #{["EC3A 5DE"]}

  ;; or we can query the history of the database to get all values at any point in time
  ;; (school records live in the school database):
  (let [conn       (d/connect "datomic:mem://school")
        db         (d/db conn)
        history-db (d/history db)]
    (d/q '[:find ?name ?tx ?added? :where [?id "school" "100000"] [?id "name" ?name ?tx ?added?]] history-db))
  ;; output (latest entry first):
  ;;#{["Sir John Cass's Church of England Foundation Primary School" 13194139534327 true]
  ;;  ["Sir John Cass's Foundation Primary School" 13194139534327 false]
  ;;  ["Sir John Cass's Foundation Primary School" 13194139534322 true]}
  )

(defproject datomic-spike "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.5.1"]
                 [com.datomic/datomic-free "0.9.5153"]]
  :main datomic-spike.core)