Skip to content

Instantly share code, notes, and snippets.

@pchiusano
Created February 21, 2017 21:46
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save pchiusano/9e2fd5c9bb1aa6d9095d7759ba60dc86 to your computer and use it in GitHub Desktop.
Save pchiusano/9e2fd5c9bb1aa6d9095d7759ba60dc86 to your computer and use it in GitHub Desktop.
Rough draft of v3 of Unison distributed programming APIs

Unison computations can hop between nodes, can fail, can be forked to execute asynchronously, and can be supervised:

-- Promote a pure value to `Remote`
Remote.pure :  a . a -> Remote a

-- Sequencing of remote computations
Remote.bind :  a b . (a -> Remote b) -> Remote a -> Remote b

-- The current node where the computation is executing
Remote.here : Remote Node

-- Transfer control of remainder of computation to target node
Remote.transfer : Node -> Remote Unit

-- Explicitly fail a computation for the provided reason
Remote.fail :  a . Text -> Remote a

-- Sleep the current computation for the given duration
Remote.sleep : Duration -> Remote Unit

-- Start running a remote computation asynchronously, returning
-- a `Task` value that can be used for supervision
Remote.fork :  a . Remote a -> Remote Task

-- Halt a running task (and any running subtasks) using the provided `Cause`
Task.stop : Cause -> Task -> Remote Unit

-- Obtain the `Cause` that caused a running task to complete
Task.supervise : Task -> Remote (Remote Cause)

-- Create a duration from a number of seconds
Duration.seconds : Number -> Duration

-- this is TBD
type Cause = Error Text Node | Completed | Cancelled | Unresponsive Node

Unison computations can provision new nodes:

-- Like `Remote.spawn`, but create the node inside a fresh sandbox
Remote.spawn-sandboxed : Sandbox -> Remote Node

-- Like `Remote.spawn-sandboxed`, but use the provided symmetric key
-- to communicate with the returned `Node`
Remote.spawn-sandboxed' : Key -> Sandbox -> Remote Node

-- Create a new node 'in the same location' as the current node, sharing
-- current sandbox resources
Remote.spawn : Remote Node

-- Like `Remote.spawn`, but use the provided symmetric key
-- to communicate with the returned `Node`.
Remote.spawn' : Key -> Remote Node

-- Statically provision a `personal-info : Node`
node personal-info -- layout block starts here
  Sandbox 5% 10MB 3GB accept-from

-- TBD
type Sandbox =
  Sandbox CPU% Memory Storage ( a . Node -> Remote a -> Remote a)

We can encrypt / decrypt any value at all:

-- Encrypt a value, requires `Remote` since we use random IV / nonce
encrypt :  a . Key -> a -> Remote (Encrypted a)

-- Decrypt a value, or return `None` if key is incorrect
decrypt :  a . Key -> Encrypted a -> Either DecryptionFailure a

-- `Key` is just a symmetric encryption key. We might generate keys via:

AES256.key : Remote Key
Blowfish.key : Remote Key
-- etc

-- TBD
type DecryptionFailure = WrongKey | AlgorithmMismatch | IntegrityFailure

Unison programs have access to mutable variables, which also serve as a concurrency primitive:

-- Create an ephemeral `Box` on the current node; just a (GUID, Node) at runtime
Box.empty :  a . Remote (Box a)

-- Put a value into the box, or if the box is full,
-- wait until a `Box.take` empties the box.
Box.put :  a . a -> Box a -> Remote Unit

-- Remove and return the value in the box, or if the box is empty,
-- wait until a `Box.put` fills the box.
Box.take :  a . Box a -> Remote a

-- Like `Box.take`, but leaves the value inside the box
Box.read :  a . Box a -> Remote a

-- Read the current value inside the box or return `None` immediately.
-- Also returns a setter which returns `True` if the set was successful.
-- The `set` is successful only if the value inside the box has not
-- otherwise changed since the read, so this can be used to implement
-- "optimistic" atomic modifies.
Box.access :  a . Box a -> Remote (Optional a, a -> Remote Bool)

Unison can resolve references dynamically on a node:

-- Create a `Name`, which is a typed reference to a node-local value.
Name.make :  a . Remote (Name a)

-- Lookup the node-local value associated with the `Name`.
Name.resolve :  a . Name a -> Remote (Box a)

-- Declare `bob : Name Number` statically. The value bound to
-- the `Name` does not survive node restarting.
ephemeral name bob : Number

-- Declare `cluster-peers : Name (Vector Node)` statically. The current
-- value of `cluster-peers` survives node restarting.
durable name cluster-peers : Vector Node

Unison can make any value durable. Durable values are immutable:

-- Move any value from RAM to local durable storage
Durable.store :  a . a -> Remote (Durable a)

-- Synchronize any value to local durable storage, returning
-- `True` if the give `Node` has that `Durable a` locally and
-- the sync was successful.
Durable.sync-from :  a . Node -> Durable a -> Remote Boolean

-- Load a durable value into RAM, assuming it exists on the given node
Durable.load-from :  a . Node -> Durable a -> Remote (Optional a)

-- Returns a list of nodes that the Unison runtime believes could
-- successfully `Durable.load-from` or `Durable.sync-from` for the
-- given `Durable`.
Durable.peers :  a . Durable a -> Remote (Vector Node)

Lastly, we can declare foreign functions:

-- Declare `my-fn : Foreign (Number -> Remote Number)` statically
-- Bindings for some of these foreign declarations would be done
-- in some implementation-dependent way on Unison node container startup.
foreign my-fn : Number -> Remote Number

-- Ask the current node if it has a binding for a `Foreign a`
Foreign.ask : forall a . Foreign a -> Remote (Optional a)

Notes on semantics and implementation details

The Task returned by Remote.fork controls the entirety of the computation forked, including any subtasks forked. Stopping that Task stops anything that may be running underneath this fork.

Implementation notes on Task.supervise:

  • At runtime, a Task value contains a Node reference where the Task was originally forked.
  • To implement Task.supervise, the runtime maintains at each node a Map Task (Timestamp, Status, Optional Node), tracking for each task a timestamped last update for that task (when it was running on the current node), and an Optional Node if the computation was transferred elsewhere. This Map can be pruned using some ad hoc policy (like retain 30s of data or 5000 entries). Task.supervise then just chases the computation, following these transfer links until it obtains a "recent enough" status update for the computation. If a node is unresponsive or unreachable, this eventually leaks to an Unresponsive error being passed to the supervisor.

On node local storage:

  • The association between a Name and a Box is local to the node. Conceptually, each node has its own durable and ephemeral storage. There is no storage concept exposed to Unison at any granularity beyond nodes (though of course you can write multi-node storage as regular Unison libraries). Nodes are isolated from each other and must communicate explicitly (even if the nodes are all spawned in a single sandbox).
  • The durable name blah : Name Number is somewhat analogous to a typed file name. It can be resolved on any node to a Box Number, and the state of that Box Number (whether it is empty or full) will survive node restarts.
  • The node node-name block declares a node statically, by proving a Sandbox.
  • The various Durable functions give some flexibility to Unison programs in how they resolve Durable values and where they load them from.
@atacratic
Copy link

Hi, found the link to this on gitter :) Hope you don't mind more comments. (Have checked this supersedes all my previous ones so no need to look again at those.)

I like the name Name.

The fact that Durable is backed by Box and Box can be empty - that's going to mean pervasive matching on the Optional from Box.read - no type-level guarantee the box is non-empty. Is too many Boxes like programming with NULL I wonder? It also harms the ergonomics of building data structures with Durables. But I don't have a better idea. (Or maybe everyone maps their Optionals and it doesn't look too bad.)

Ah, you've switched to explicit syncing of Durables. I did quite like the implicit version, if there were a way that Unison could make this fully transparent in a non-leaky but reasonably performant way. But I can't see a way, if Durables might be megabytes in size (or transitively reference a million other Durables). So I think it's the right call. Fundamentally, big data transfers are an effect that the user will need to manage explicitly.

I wonder if it's still true that Durables are GCd on a per-node basis and can't contain cycles. I guess the GC should have a minimum grace time to give remote nodes a chance to decide to sync.

Observation: load-from makes it indistinguishable whether the node didn't have that Durable or whether it did but the Box was empty.

Inconsistency: your third-last bullet uses Name in durable name blah : Name Number, but your cluster-peers example doesn't use Name except in the comment. The : Name Number variant makes more sense to me.

If the durable name declaration scheme gives you a Box which is durable under the covers, then I'm not sure what the status of the Durable type is. Seems funny that some durable Boxes are called Box and some are called Durable. I guess this is coming from not wanting to spawn variants of Name.

Durable.peers - interesting! Curious to know how it will work.

Is the run-time representation of a Durable going to be a hash of the value or a GUID? (or something else)

I like the way foreign functions are no longer kept in mutable cells.

typos: in the sync-from comment (give->given); and leaks->leads; 'exposed to Unison'->'exposed by Unison'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment