belisarius222/remote-scry.md

## remote-scry.md

      
Remote Scry Protocol Proposal

Overview

Despite Urbit's "scry" namespace being global (every request path contains the host ship), there is no way to query other ships.  This proposal adds a second Urbit-to-Urbit network protocol that implements remote scrying.  This will allow ships to field read requests without incurring disk writes, and since the namespace is immutable, caching responses will be simple and worthwhile.
To "scry" in Urbit means to query the Urbit namespace.  Conceptually, if a query resolves, it can produce either a piece of marked data (meaning tagged with a system-recognized type) or an empty result indicating that this path will never contain data.  Not all requests resolve; some "block", which represents a refusal or inability to answer the question (such as a local query for a file at a future date).  The namespace is immutable in the sense that all nonblocking results to the same query must be identical.  Whether a query resolves is not specified; a query could succeed, then block, then succeed again, as long as both successes produce the same piece of marked data or both produce null.
The Arvo kernel guarantees local (same ship) immutability through its own design and the constraints it imposes on userspace code -- userspace code does not have the opportunity to violate immutability.  Producing two different results to the same scry request is a Byzantine fault, so without posting scry results to a blockchain or other consensus mechanism, it is infeasible to have a strong guarantee of immutability in the responses from other ships.  This imposes fewer limitations than one might think, but the scry namespace should not be mistaken for a blockchain.
The immediate use case for a remote scry protocol is to download new kernel source files from galaxies to perform over-the-air software updates.  Right now this happens over Ames, which is a stateful protocol -- every ack packet from a requester ship must be written to disk on the host.  This causes heavy disk write load on galaxies and stars during updates.
Longer-term use cases are much broader, and should generally encompass almost all network reads from ship to ship, possibly eventually including private subscription data.  This remote scry proposal is heavily inspired by Named Data Networking from Van Jacobson et al. and can productively be thought of as a simplified version of NDN optimized for Urbit.
This proposal is intended to be the simplest viable form of a remote scry protocol that still provides good scaling capabilities for remote read requests.  It does not implement subscriptions, long-lived requests, hop-to-hop MTUs, or courtesy responses to blocking requests, all of which might be worth considering in more sophisticated proposals.
Scry Query Syntax

A query can be represented as a path, for example:
/cx/~zod/kids/37/sys/lull/hoon

I am not convinced this particular path format will survive indefinitely, but the ability to represent a query as a path is likely to be conserved, and it's worth examining the different parts of the path to understand how the system can be used.
The first path element is really two separate pieces of data: the vane that should field the query (%c for Clay in this case) and the type of request, called a "care", which here is %x, meaning "grab the file contents at this path".  There are several other "care"s, including %y to request a directory listing and %z to request a directory hash.
The second element, ~zod, is the host ship -- the Urbit address of the server who produced the data.
%kids, the third element, represents a "desk", which is a sort of workspace, similar to a Git branch.
37 is an incrementing revision number.  This is a special case of the "case" datatype, which could alternatively be either a date e.g. ~2021.4.8 or a textual label.  The case is used to situate the request in time.  In Clay, for example, you cannot overwrite a file, since that would violate the namespace's immutability; instead, you can add a new revision of a desk at the current date and the next revision number.
The triple of ship, desk, and case is called a "beak", and it is a reference to a snapshot of a workspace (desk) on that user's (ship's) Arvo, at that date or revision number (case).
Finally, /sys/lull/hoon is the path within that beak that is being requested.  This example is one of the source files used to build the Arvo kernel.  Another example would be /app/dojo/hoon, which is the source file for the Dojo command-line shell.
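The decomposition above can be sketched in a few lines of Python.  This is only an illustration, not part of the proposal: the names `ScryPath` and `parse_scry_path` are hypothetical, the case is kept as uninterpreted text, and the sketch assumes the first character of the first element is the vane and the remainder is the care.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ScryPath:
    vane: str               # vane letter, e.g. "c" for Clay
    care: str               # request type, e.g. "x" for file contents
    ship: str               # host ship, e.g. "~zod"
    desk: str               # workspace, e.g. "kids"
    case: str               # revision number, date, or label (kept as text)
    rest: Tuple[str, ...]   # path within the beak


def parse_scry_path(path: str) -> ScryPath:
    """Split /<vane><care>/<ship>/<desk>/<case>/... into its parts."""
    if not path.startswith("/"):
        raise ValueError("scry path must start with '/'")
    parts = path[1:].split("/")
    if len(parts) < 4:
        raise ValueError("expected at least vane+care, ship, desk, and case")
    view, ship, desk, case = parts[:4]
    if len(view) < 2:
        raise ValueError("first element must hold a vane letter and a care")
    if not ship.startswith("~"):
        raise ValueError("host ship must start with '~'")
    return ScryPath(view[0], view[1:], ship, desk, case, tuple(parts[4:]))


parsed = parse_scry_path("/cx/~zod/kids/37/sys/lull/hoon")
beak = (parsed.ship, parsed.desk, parsed.case)  # ("~zod", "kids", "37")
```

The `beak` tuple at the end corresponds to the ship, desk, and case triple described above.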
Protocol Layering

There are two layers to consider for a remote scry protocol: the message layer and the packet layer.  The canonical version of the protocol specifies both, using UDP as the transport layer and performing its own authentication and message fragmentation, but an implementation that uses some other secure channel could use a different packet layer while retaining the same message layer.
Message Layer

A remote scry message is a query for a piece of data hosted on another ship.  At a high level, the requester sends a scry request as a piece of data over the wire to the host ship, who should respond with the result data.  One request message yields at most one response message.  A response is only given in response to a request.
The host should attest to the scry result by signing the tuple of the request and response, and it should send this signature along with the response data.  Most requests will probably not be signed, but some should be, such as requests made during a DDoS attack or requests that require the host to expend significant CPU or memory resources to calculate the result; therefore, the requester signature is an optional field in a request.
Once the host ship's Vere hears a remote scry request, it will scry into Arvo using Arvo's +peek arm to perform the query, then packetize the result using the packet layer.
Logically, the request and response types can be defined as follows:
|%
+$  request
  $:  pax=path                          ::  scry query
      aut=(unit [ship life signature])  ::  requester authentication
  ==
+$  response
  $:  dat=(unit [mark noun])            ::  response data or empty
      aut=[life signature]              ::  host authentication
  ==
--
::  The scry result in `dat` will be a jammed (serialized) noun that can be:
::
::  ~                ::  empty result
::  [~ mark result]  ::  nonempty result
::
::  (`~` is Hoon's null value, which is a typed version of the atom 0;
::  these response types are structurally discriminable.)

"Blocking" requests are dropped by the host's ship, producing no response.  The downside of this approach is that from the requester's perspective, there is no way to tell the difference between slowness and a (possibly permanent) refusal to respond to the request.  One can imagine adding an explicit refusal response type to avoid infinite repetition of bogus requests.  This proposal does not include such a response message, but future revisions might.
The requester ship's Arvo emits request packets to Vere, which sends them over the wire to the host ship, whose Vere will scry into Arvo using Arvo's +peek arm, then generate and send response packets based on the result.  These packets can be cached by the requester, host, and any relay between the two.
Packet Layer

Every packet should fit within an MTU of 1500 bytes.  Response packets include the request path, so the request path is limited to 384 characters, ensuring response packets can include up to 1024 bytes of response fragment data.  If an app tries to make a remote scry request with path greater than 384 characters, the request will fail without even attempting a remote request.
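As a rough illustration of these limits, the sketch below rejects over-long paths locally and cuts a response message into fragments of at most 1024 bytes.  The function names are hypothetical, and the sketch assumes one byte per path character (ASCII) and a non-empty message buffer.

```python
MAX_PATH_CHARS = 384       # request path limit from the text above
MAX_FRAGMENT_BYTES = 1024  # response fragment data per packet


def check_path(path: str) -> None:
    """Fail locally, before any packet is sent, if the path is too long."""
    if len(path) > MAX_PATH_CHARS:
        raise ValueError(f"scry path exceeds {MAX_PATH_CHARS} characters")


def split_fragments(message: bytes) -> list:
    """Cut a message buffer into consecutive fragments of <= 1024 bytes."""
    if not message:
        raise ValueError("message buffer must not be empty")
    return [
        message[i:i + MAX_FRAGMENT_BYTES]
        for i in range(0, len(message), MAX_FRAGMENT_BYTES)
    ]


fragments = split_fragments(bytes(2500))
sizes = [len(f) for f in fragments]  # two full fragments and a remainder
```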
A scry packet is either a request packet, sent from requester to host, or a response packet, sent from host to requester.  Some request packets and all response packets are signed.  The signature algorithm is Ed25519, run on a hash of the jammed response.  The response packet signature applies to the bytes of the request section of the packet, concatenated with the bytes in the packet's response section excluding the signature itself; the specific fields are responder life, responder address, number of fragments, and response data.
Every response packet can be validated by any node that receives it, as long as that node knows the public key associated with that Urbit address.  Relays do not need to know how to +cue (deserialize) Nock nouns in order to validate and relay packets.
The message buffer resulting from concatenating all response packet buffers will contain both a message signature and the jammed message, with the signature coming first; i.e. the initial 512 bits of the first response packet's response-data segment will be the signature on the tuple [host-address host-life request-path response-message], jammed (serialized) and hashed using SHA-256.
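A minimal sketch of how a requester might take such a message buffer apart is shown below.  It assumes the jammed tuple is already available as bytes (jam itself is not implemented here), and it stops at computing the SHA-256 digest, because Ed25519 verification needs a library outside the Python standard library.  The function names are hypothetical.

```python
import hashlib
from typing import Tuple

SIGNATURE_BYTES = 64  # 512 bits


def split_message(buf: bytes) -> Tuple[bytes, bytes]:
    """Separate the leading 512-bit signature from the jammed message."""
    if len(buf) < SIGNATURE_BYTES:
        raise ValueError("message buffer is shorter than a signature")
    return buf[:SIGNATURE_BYTES], buf[SIGNATURE_BYTES:]


def signing_digest(jammed_tuple: bytes) -> bytes:
    """SHA-256 digest of the jammed [address life path message] tuple."""
    return hashlib.sha256(jammed_tuple).digest()


signature, jammed_message = split_message(bytes(64) + b"jammed-noun-bytes")
```

The digest returned by `signing_digest` is what an Ed25519 verifier would check against `signature` and the host's public key.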
Both request and response contain the scry request data structure, partially decomposed for easier processing by relays.  The host Urbit address is encoded separately from the rest of the scry request, so the remaining path looks like /cx/kids/37/sys/lull/hoon.  Paths are encoded as null-terminated ASCII text, except for the host ship, which is encoded the same way as in Ames, so as not to slow down relaying with string parsing.
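The path decomposition described here can be sketched as follows.  The helper names are hypothetical, and the Ames encoding of the ship address is deliberately left out; the sketch only separates the host from the path and produces the null-terminated ASCII form of the remainder.

```python
from typing import Tuple


def split_host(path: str) -> Tuple[str, str]:
    """Return (host ship, path with the host element removed)."""
    parts = path.split("/")
    # A leading "/" yields an empty first element; the ship is the third.
    if len(parts) < 3 or parts[0] != "" or not parts[2].startswith("~"):
        raise ValueError("expected a path of the form /<view>/<ship>/...")
    ship = parts[2]
    remainder = "/".join(parts[:2] + parts[3:])
    return ship, remainder


def encode_path(path: str) -> bytes:
    """Encode the host-free path as null-terminated ASCII."""
    return path.encode("ascii") + b"\x00"


ship, remainder = split_host("/cx/~zod/kids/37/sys/lull/hoon")
wire_path = encode_path(remainder)
```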
Any scry packet can contain an origin field just like Ames packets.  If a relay responds to a scry request from its cache without asking the host, the relay should include an origin containing the last known IP and port of the host.  Just like Ames, the remote scry protocol should be resilient against the origin pointing at an unreachable IP and port.
Packet Format

32-bit Header (Same for Requests and Responses)

The header is exactly the same as the Ames header, but with the "scry or ames?" bit set to "scry", not "ames".
2  bits: unused
1  bit:  1 if request, 0 if response
1  bit:  scry or ames?
3  bits: protocol version
2  bits: sender address size
2  bits: receiver address size
20 bits: checksum
1  bit:  relayed?
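The field list above can be packed into a 32-bit word as sketched below.  Two things are assumptions of this sketch and are not stated in the proposal: fields are laid out starting from the least-significant bit in the order listed, and a value of 1 in the "scry or ames?" bit means scry.  The field names are hypothetical.

```python
# (name, width in bits), in the order the field list gives them
HEADER_FIELDS = [
    ("unused", 2), ("is_request", 1), ("is_scry", 1), ("version", 3),
    ("sender_size", 2), ("receiver_size", 2), ("checksum", 20), ("relayed", 1),
]


def pack_header(values: dict) -> int:
    """Pack named fields into one 32-bit word, first field in the low bits."""
    word, shift = 0, 0
    for name, width in HEADER_FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word |= value << shift
        shift += width
    return word


def unpack_header(word: int) -> dict:
    """Inverse of pack_header: recover each field from the 32-bit word."""
    fields, shift = {}, 0
    for name, width in HEADER_FIELDS:
        fields[name] = (word >> shift) & ((1 << width) - 1)
        shift += width
    return fields


header = pack_header({"is_request": 1, "is_scry": 1, "version": 1})
```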

Body (Request Packet)

The first five fields of the packet body are the same as in Ames.
::  prelude

  4 bits: sender life (mod 16)
  4 bits: receiver life (mod 16)
variable: sender address
variable: receiver address
 48 bits: (optional) 48-bit origin

::  request

512 bits: client signature
 32 bits: fragment number
 16 bits: path string length
variable: path as ASCII
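The request section listed above might be serialized as in the following sketch.  Little-endian byte order for the integer fields is an assumption of the sketch, as are the function names.  The text does not say how an unsigned request fills the signature field, so the sketch simply requires 64 bytes from the caller.

```python
import struct

SIGNATURE_BYTES = 64               # 512-bit client signature
_FIXED = struct.Struct("<IH")      # fragment number (32), path length (16)


def encode_request_section(signature: bytes, fragment: int, path: str) -> bytes:
    """Lay out signature, fragment number, path length, and ASCII path."""
    if len(signature) != SIGNATURE_BYTES:
        raise ValueError("signature field must be exactly 64 bytes")
    raw_path = path.encode("ascii")
    return signature + _FIXED.pack(fragment, len(raw_path)) + raw_path


def decode_request_section(buf: bytes):
    """Parse the request section back into (signature, fragment, path)."""
    signature = buf[:SIGNATURE_BYTES]
    fragment, path_len = _FIXED.unpack_from(buf, SIGNATURE_BYTES)
    start = SIGNATURE_BYTES + _FIXED.size
    raw_path = buf[start:start + path_len]
    if len(raw_path) != path_len:
        raise ValueError("truncated path")
    return signature, fragment, raw_path.decode("ascii")


section = encode_request_section(bytes(64), 1, "/cx/kids/37/sys/lull/hoon")
```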

Origin (if relayed)

TODO: check order
32 bits: IPv4 address
16 bits: port

Body (Response Packet)

Just like with request packets, the first five fields (the prelude) are the same as in Ames.
::  prelude

  4 bits: sender life (mod 16)
  4 bits: receiver life (mod 16)
variable: sender address
variable: receiver address
 48 bits: (optional) 48-bit origin

::  request

 32 bits: fragment number (starting at 1)
 16 bits: path string length
variable: path as ASCII

::  response

512 bits: responder signature
 32 bits: number of fragments
 16 bits: response data size (0 if null response)
variable: response data
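A sketch of reading the request and response sections of a response packet, once the prelude has been stripped, is given below.  As before, little-endian integers and all names are assumptions of the sketch rather than part of the proposal.

```python
import struct
from typing import NamedTuple

_REQ = struct.Struct("<IH")   # fragment number, path length
_RES = struct.Struct("<IH")   # number of fragments, response data size


class ResponseBody(NamedTuple):
    fragment: int      # 1-based fragment number
    path: str
    signature: bytes   # 512-bit responder signature
    total: int         # number of fragments in the whole message
    data: bytes        # this fragment's data (empty for a null response)


def encode_response_body(body: ResponseBody) -> bytes:
    raw_path = body.path.encode("ascii")
    return (_REQ.pack(body.fragment, len(raw_path)) + raw_path
            + body.signature + _RES.pack(body.total, len(body.data))
            + body.data)


def decode_response_body(buf: bytes) -> ResponseBody:
    fragment, path_len = _REQ.unpack_from(buf, 0)
    pos = _REQ.size
    raw_path = buf[pos:pos + path_len]
    pos += path_len
    signature = buf[pos:pos + 64]
    pos += 64
    total, size = _RES.unpack_from(buf, pos)
    pos += _RES.size
    data = buf[pos:pos + size]
    if len(raw_path) != path_len or len(signature) != 64 or len(data) != size:
        raise ValueError("truncated response body")
    if not 1 <= fragment <= total:
        raise ValueError("fragment number out of range")
    return ResponseBody(fragment, raw_path.decode("ascii"), signature, total, data)


example = ResponseBody(2, "/cx/kids/37/sys/lull/hoon", bytes(64), 3, b"abc")
roundtrip = decode_response_body(encode_response_body(example))
```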

Implementation

Requests will be handled by a new vane, called Fine, after French mathematician and cartographer Oronce Fine.  Fine will use an adapted version of Ames's congestion control system to send and re-send request packets and collate response packets into messages.  Response messages will be returned to requesting vanes along with their signatures to preserve provenance.
Unlike in Ames, there is no enforced ordering among scry request messages.  This ensures commutativity of requests and responses, allowing for maximum parallelism in generating responses.
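Collating response packets into a message could look roughly like the sketch below, which accepts fragments in any order and joins them once all have arrived.  The `Reassembler` class is hypothetical and ignores congestion control, re-sends, and signature checks.

```python
class Reassembler:
    """Collect response fragments (numbered from 1) for one scry request."""

    def __init__(self, total: int):
        if total < 1:
            raise ValueError("a message has at least one fragment")
        self.total = total
        self.parts = {}

    def add(self, number: int, data: bytes) -> bool:
        """Store a fragment; return True once every fragment is present."""
        if not 1 <= number <= self.total:
            raise ValueError("fragment number out of range")
        # Duplicates are ignored: the first copy of a fragment wins.
        self.parts.setdefault(number, data)
        return len(self.parts) == self.total

    def message(self) -> bytes:
        """Concatenate fragments in order; only valid once complete."""
        if len(self.parts) != self.total:
            raise ValueError("message is incomplete")
        return b"".join(self.parts[i] for i in range(1, self.total + 1))


r = Reassembler(3)
r.add(3, b"cc")
r.add(1, b"aa")
done = r.add(2, b"bb")
```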
When the host ship receives a scry request packet, its Vere engages in the following logic:

1. Check if the response for this request is cached.  If yes, skip to packetization.  Otherwise, continue.
2. Run Arvo's +peek arm with the scry request.
3. Take the resulting noun, sign the request+response tuple described above, and store the response in the cache.
4. Run a packetization function on the response based on the fragment number in the request packet.
5. Send the resulting packet to the IP and port from which we heard the request packet.
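The host-side logic above can be sketched as a single function.  The `peek`, `sign`, and `packetize` callables are hypothetical stand-ins for Arvo's +peek arm, the signing step, and the packetization function.  The sketch assumes `peek` returns `None` for a blocking request, which is then dropped, and leaves the final send to the caller.

```python
from typing import Callable, Dict, Optional


def handle_request(
    path: str,
    fragment: int,
    cache: Dict[str, bytes],
    peek: Callable[[str], Optional[bytes]],
    sign: Callable[[str, bytes], bytes],
    packetize: Callable[[bytes, int], bytes],
) -> Optional[bytes]:
    """Return the response packet for one request, or None if it blocks."""
    message = cache.get(path)
    if message is None:
        result = peek(path)             # scry into Arvo
        if result is None:
            return None                 # blocking request: no response
        message = sign(path, result)    # signature followed by the result
        cache[path] = message
    return packetize(message, fragment)


# Stub collaborators, for illustration only.
peek_calls = []

def fake_peek(path):
    peek_calls.append(path)
    return b"noun"

def fake_sign(path, result):
    return b"S" * 64 + result

def fake_packetize(message, fragment):
    return message[(fragment - 1) * 1024:fragment * 1024]

cache = {}
first = handle_request("/cx/kids/1/a", 1, cache, fake_peek, fake_sign, fake_packetize)
second = handle_request("/cx/kids/1/a", 1, cache, fake_peek, fake_sign, fake_packetize)
```

The second call is answered from the cache, so `fake_peek` runs only once.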

Each Arvo kelvin version has its own packetization function.  As specified, this function is scried out of the Fine vane as a gate; alternatively, it could simply be hard-coded into Vere for each kelvin version.
The scry cache could live purely in memory or be loaded from disk.  The initial implementation should probably live in memory.  Cache invalidation should not be necessary, generally speaking; cache eviction could use LRU, or a clock algorithm for ease of implementation, or any other heuristic deemed appropriate.
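For illustration, an in-memory cache with LRU eviction can be sketched with Python's `OrderedDict`.  The `ScryCache` class and its capacity parameter are hypothetical.  Because the namespace is immutable, the sketch has no invalidation path, only eviction.

```python
from collections import OrderedDict
from typing import Optional


class ScryCache:
    """In-memory LRU cache from scry path to signed response message."""

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, path: str) -> Optional[bytes]:
        if path not in self._items:
            return None
        self._items.move_to_end(path)         # mark as most recently used
        return self._items[path]

    def put(self, path: str, response: bytes) -> None:
        self._items[path] = response
        self._items.move_to_end(path)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)   # evict least recently used


lru = ScryCache(capacity=2)
lru.put("/cx/kids/1/a", b"A")
lru.put("/cx/kids/1/b", b"B")
lru.get("/cx/kids/1/a")        # touch "a" so "b" becomes the oldest entry
lru.put("/cx/kids/1/c", b"C")  # evicts "b"
```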