This proposal advocates for using the WireGuard protocol as the Terran basis of the next generation of Urbit networking, with QUIC layered on top. This takes care of the following, all in the Urth process, meaning no Nock execution or Arvo events required:
- sponsor keepalives
- star packet forwarding
- peer discovery
- authentication
- encryption (including forward secrecy)
- transmission control (packetization and congestion control)
- scry notifications
- DDoS protection
- IP anonymization
- traffic obfuscation
All that's left is Arvo telling Vere to send a message to some ship, and Vere handles the rest.
Only stars tighten routes, and only down to other stars. This makes each star like a VPN relay, and it also simplifies routing to a single level of tightening, which we can expect to remain stable, since stars rarely change IP addresses.
We will use supersymmetric routing: all packets to and from a planet or moon go through the sponsoring star. Most packets are planet-to-planet, and will go through the sender's star and then the receiver's star.
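As a rough sketch of the routing rule above, the hop list for a message can be computed from the two ship addresses alone. This assumes each ship's sponsor is the default one implied by its address (the low bits of the address); real sponsorship comes from Azimuth and can differ after an escape. The function names `rank`, `default_sponsor`, `relay`, and `route` are made up for illustration.

```python
def rank(ship: int) -> str:
    """Classify an Urbit address by its size."""
    if ship < 0 or ship >= 1 << 128:
        raise ValueError("not a valid ship address")
    if ship < 1 << 8:
        return "galaxy"
    if ship < 1 << 16:
        return "star"
    if ship < 1 << 32:
        return "planet"
    if ship < 1 << 64:
        return "moon"
    return "comet"


# Width in bits of the default sponsor's address, per rank.
# Assumption: default (address-derived) sponsorship, not live Azimuth state.
_SPONSOR_BITS = {"star": 8, "planet": 16, "moon": 32, "comet": 16}


def default_sponsor(ship: int) -> int:
    """Return the address-derived sponsor; a galaxy sponsors itself."""
    r = rank(ship)
    if r == "galaxy":
        return ship
    return ship & ((1 << _SPONSOR_BITS[r]) - 1)


def relay(ship: int) -> int:
    """Walk up the sponsor chain until reaching a star (or a galaxy)."""
    while rank(ship) not in ("star", "galaxy"):
        ship = default_sponsor(ship)
    return ship


def route(src: int, dst: int) -> list:
    """Hops for one message: sender, sender's star, receiver's star, receiver."""
    path = []
    for hop in (src, relay(src), relay(dst), dst):
        # Collapse repeats, e.g. when both ships share a star
        # or when the sender is itself a star.
        if not path or path[-1] != hop:
            path.append(hop)
    return path
```

For two planets under different stars this gives four hops; for two planets under the same star it gives three, since the star appears only once.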
Every network hop is between two ships and is its own WireGuard channel. End-to-end encryption is achieved through WireGuard's standard VPN-style forwarding capability.
A ship will ping its sponsor using WireGuard's built-in keepalive configuration.
To stave off DDoS attacks, a ship will use WireGuard to accept connections only from specific ships, as authenticated by the Azimuth PKI (augmented by Ames-based key propagation for moons and comets). A ship's Curve25519 networking key will be used as its WireGuard public key. All other packets will be discarded.
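To make the keepalive and key-based admission concrete, here is a minimal sketch that renders a WireGuard `[Peer]` stanza from a ship's 32-byte networking key. `PublicKey`, `AllowedIPs`, `Endpoint`, and `PersistentKeepalive` are standard WireGuard configuration keys; the helper `peer_stanza`, the endpoint, the 25-second interval, and the address range are all placeholders. In particular, the proposal does not say how ships map onto `AllowedIPs`, so that part is purely an assumption.

```python
import base64


def peer_stanza(public_key, allowed_ips, endpoint=None, keepalive=None):
    """Render one WireGuard [Peer] section as text.

    public_key:  raw 32-byte Curve25519 public key (the ship's networking key)
    allowed_ips: list of CIDR strings (hypothetical mapping, see lead-in)
    endpoint:    "host:port" of the peer, if known
    keepalive:   seconds between keepalive packets, or None to disable
    """
    if len(public_key) != 32:
        raise ValueError("Curve25519 public keys are 32 bytes")
    lines = [
        "[Peer]",
        # WireGuard expects keys in base64 form.
        "PublicKey = " + base64.b64encode(public_key).decode("ascii"),
        "AllowedIPs = " + ", ".join(allowed_ips),
    ]
    if endpoint is not None:
        lines.append("Endpoint = " + endpoint)
    if keepalive is not None:
        lines.append("PersistentKeepalive = " + str(keepalive))
    return "\n".join(lines) + "\n"


# A planet's view of its sponsoring star: known endpoint, keepalive enabled.
sponsor_key = bytes(32)  # placeholder key, not a real ship's key
print(peer_stanza(sponsor_key, ["10.0.0.0/8"], "203.0.113.7:51820", 25))
```

Since WireGuard silently drops packets from keys that have no `[Peer]` entry, the list of stanzas a ship installs is effectively its allowlist.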
A galaxy allows incoming connections from galaxies and its own subnet*, and it is responsible for knowing and sharing its stars' lanes with other stars. A star allows incoming connections from its sponsoring galaxy, other stars, and its own subnet (planets, registered moons, and comets). A planet or moon allows no incoming connections but maintains an outgoing connection to its sponsoring star.
* Alternatively, a galaxy could support unauthenticated requests for the locations of its stars, reducing the amount of PKI state it needs to store.
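The admission rules above can be written down as a small predicate. This is only a sketch: the `Ship` record and `accepts` function are invented for illustration, `sponsor` stands in for whatever PKI state the runtime actually holds, and comets are treated like planets and moons (no incoming connections), which the proposal does not state explicitly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Ship:
    name: str
    rank: str                # "galaxy", "star", "planet", "moon", or "comet"
    sponsor: Optional[str]   # name of the galaxy or star it connects through


def accepts(me: Ship, peer: Ship) -> bool:
    """Would `me` allow an incoming connection from `peer`?"""
    in_subnet = peer.sponsor == me.name
    if me.rank == "galaxy":
        # Galaxies, plus the stars this galaxy sponsors.
        return peer.rank == "galaxy" or (in_subnet and peer.rank == "star")
    if me.rank == "star":
        # Any star, the sponsoring galaxy, and this star's own subnet.
        return (
            peer.rank == "star"
            or peer.name == me.sponsor
            or (in_subnet and peer.rank in ("planet", "moon", "comet"))
        )
    # Planets and moons (and, by assumption, comets) only dial out.
    return False


zod = Ship("~zod", "galaxy", None)
marzod = Ship("~marzod", "star", "~zod")
palnet = Ship("~sampel-palnet", "planet", "~marzod")  # illustrative sponsorship
print(accepts(marzod, palnet), accepts(zod, palnet), accepts(palnet, marzod))
```

Under these rules a star never needs keys for planets outside its own subnet, which keeps the per-ship allowlist small.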
A ship initiates a QUIC connection with another ship over its WireGuard channel to implement the Urth-to-Urth protocol described below. This QUIC connection uses the ship's Curve25519 key to create a TLS 1.3 certificate. OpenSSL has an implementation of the certificate procedure. The need for a TLS handshake is a bit unfortunate since the WireGuard connection is already authenticated and encrypted, but QUIC requires it and it's a minimal (1-RTT) handshake.
The following requests can be sent over a QUIC channel between two ships. This could use the %newt length-prefixed jammed-noun protocol, or a more bespoke encoding.
+$  request
  $%  [%poke =bone seq=@ud command=*]     ::  command
      [%peek =path]                       ::  namespace read
      [%lane sponsee=ship]                ::  request sponsee location
  ==
+$  response
  $%  [%poke =bone seq=@ud ok=?]          ::  command (n)ack
      [%peek seal=@ux =path value=*]      ::  namespace response, signed
      [%bide =path]                       ::  namespace request ack
      [%lane sponsee=ship =lane]          ::  sponsee location
  ==
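For a sense of what length-prefixed framing over a QUIC stream involves, here is a generic sketch. It treats each jammed noun as an opaque byte string and assumes a 4-byte little-endian length header; the real %newt header layout may differ, and `frame` and `Deframer` are names invented here.

```python
import struct

# Assumption: 4-byte little-endian payload length. Not necessarily %newt's header.
HEADER = struct.Struct("<I")


def frame(payload: bytes) -> bytes:
    """Prefix one message (e.g. a jammed request noun) with its length."""
    return HEADER.pack(len(payload)) + payload


class Deframer:
    """Reassemble whole messages from arbitrarily chunked stream data."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, data: bytes) -> list:
        """Add received bytes; return every message completed so far."""
        self._buf += data
        out = []
        while len(self._buf) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buf)
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # message body not fully received yet
            out.append(bytes(self._buf[HEADER.size:end]))
            del self._buf[:end]
        return out
```

The receiver calls `feed` with whatever each stream read returns and gets back zero or more complete messages, so message boundaries do not depend on how QUIC happens to chunk the data.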
A %poke request represents a cross-ship command, which must be acknowledged or rejected by a %poke response. If the QUIC connection dies without either process restarting, both sides should remember how much of the poke request has been sent, and resume from that point when a new connection is established.
TODO: messages to resume %poke request or %peek response after connection death
A %peek is a remote scry request: an attempt to read a path in Urbit's scry namespace. If the read succeeds, which may occur at a significantly later date, the value grown at that path will be sent back to the requester in a %peek response, along with a signature. If the QUIC connection dies without either process restarting, both sides should remember how much of the peek response has been sent, and resume from that point when a new connection is established.
A %bide response indicates that the responder has heard the %peek request and will send a %peek response once it's ready. This happens when the requester asked for a path that doesn't exist yet. Unlike %poke requests and %peek responses, which live until a process restarts, %bide responses are scoped to the QUIC connection. If a connection dies, the server will no longer be responsible for delivering any %peek responses; %peek requests re-sent in a later connection could cause new %bide responses if the requested path still doesn't exist.
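A small sketch of the connection-scoped %bide bookkeeping described above, from the responder's side. The class `PeekServer` and its method names are hypothetical, connection ids are arbitrary hashable values, and the signature (`seal`) on %peek responses is left out entirely.

```python
class PeekServer:
    """Track which connections are waiting on paths that don't exist yet."""

    def __init__(self):
        self.namespace = {}  # path -> value already grown at that path
        self.waiting = {}    # connection id -> set of paths it is biding on

    def request(self, conn, path):
        """Handle a %peek request; answer now or promise an answer later."""
        if path in self.namespace:
            return ("peek", path, self.namespace[path])
        self.waiting.setdefault(conn, set()).add(path)
        return ("bide", path)

    def grow(self, path, value):
        """Bind a value at a path; return (conn, response) pairs to deliver."""
        if path in self.namespace:
            raise ValueError("scry paths are immutable once bound")
        self.namespace[path] = value
        deliveries = []
        for conn, paths in self.waiting.items():
            if path in paths:
                paths.discard(path)
                deliveries.append((conn, ("peek", path, value)))
        return deliveries

    def close(self, conn):
        """Connection died: forget every %bide promise made on it."""
        self.waiting.pop(conn, None)
```

Because `close` drops the promises, a requester that reconnects has to re-send its %peek request, and gets either the value or a fresh %bide.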
This means download resumption will not work across process restarts, and the downloads will need to start from scratch. I don't think this is a big problem, since restarts happen rarely. If it is deemed to be a problem, though, the runtime could checkpoint downloads by periodically delivering data chunks to Arvo as events, possibly through Khan, then using that information to resume the download after restart.
During the QUIC handshake, each side should share a nonce identifying its current process. When a new connection is established, each side can then check whether the peer has restarted since the last connection by comparing the peer's nonce with the one it saw previously. If so, all pending poke and peek requests will need to be re-sent.
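One way the nonce check and resumption could fit together is sketched below. Since the resume messages are still a TODO, everything here is hypothetical: the `PeerSession` class, the idea of tracking a confirmed byte offset per transfer, and the transfer ids are illustrative only. State lives in memory, so a local restart loses it, matching the "until a process restarts" lifetime.

```python
class PeerSession:
    """Per-peer resumption state, kept only for the life of this process."""

    def __init__(self):
        self.peer_nonce = None  # nonce the peer presented on the last connection
        self.progress = {}      # transfer id -> bytes the peer has confirmed

    def record(self, transfer_id, confirmed_bytes):
        """Note how much of a pending poke or peek response got through."""
        self.progress[transfer_id] = confirmed_bytes

    def on_connect(self, peer_nonce):
        """Return the offset to resume each pending transfer from."""
        restarted = self.peer_nonce is not None and peer_nonce != self.peer_nonce
        if restarted:
            # The peer lost its in-memory state, so start every transfer over.
            self.progress = {tid: 0 for tid in self.progress}
        self.peer_nonce = peer_nonce
        return dict(self.progress)


session = PeerSession()
session.on_connect(b"nonce-1")
session.record("poke-1", 100)
print(session.on_connect(b"nonce-1"))  # same process: resume mid-transfer
print(session.on_connect(b"nonce-2"))  # peer restarted: start from zero
```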
TODO: how do comets and moons get onto the network?
I want to register strong opposition to proposals that keep route tightening at the star level. Star and galaxy forwarding should be in service of peer discovery, and a backup against failure to create a direct connection.
I understand doing remote scry this way initially, but seeing it now move back into proposals for redoing Urbit networking generally is worrisome.