A blockchain is a combination of two technologies: a fully-ordered event log and a consensus mechanism for ensuring the validity of a pending event to be appended to the event log. The Urbit PKI is a system that can be represented as an event log of logical operations that modify properties of Urbit addresses such as owner, network key, and sponsor. The Urbit PKI contains a hierarchy of addresses such that the higher parts of the hierarchy (galaxies and stars) are more expensive than planets and comets.
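As a rough illustration of the "event log of logical operations" view, the sketch below replays an ordered list of property updates into a PKI state. The `Point` and `Event` types, their field names, and the string-valued keys are assumptions made for this example; they are not taken from Azimuth.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Point:
    """Mutable properties of one Urbit address (hypothetical field names)."""
    owner: Optional[str] = None
    network_key: Optional[str] = None
    sponsor: Optional[int] = None


@dataclass(frozen=True)
class Event:
    """One logical operation: set a single property of a single address."""
    address: int
    prop: str      # "owner", "network_key", or "sponsor"
    value: object


def replay(log: List[Event]) -> Dict[int, Point]:
    """Fold the ordered event log into the current PKI state."""
    state: Dict[int, Point] = {}
    for event in log:
        if event.prop not in ("owner", "network_key", "sponsor"):
            raise ValueError(f"unknown property: {event.prop}")
        point = state.setdefault(event.address, Point())
        setattr(point, event.prop, event.value)
    return state


# Later events override earlier ones, so order is part of the state.
log = [
    Event(address=256, prop="owner", value="0xabc"),
    Event(address=256, prop="sponsor", value=0),
    Event(address=256, prop="owner", value="0xdef"),
]
state = replay(log)
```

Because the state is a pure function of the ordered log, any two nodes that agree on the log agree on the state, which is why the rest of the problem reduces to agreeing on the log.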
The current Urbit PKI implementation, Azimuth, resides on Ethereum. Ethereum is a Proof of Work-based blockchain that serially executes Turing-complete contract code in each new block. This code, typically written in Solidity and compiled to EVM bytecode, allows for the creation of “smart contracts”, which consist of functions and mutable state.
Ethereum has a few long-term scaling issues, which inevitably cause a tragedy of the commons for everyone who uses Ethereum smart contracts:
- Within the Ethereum Virtual Machine (EVM), all transactions and function calls are computed serially for every smart contract, which means that a node can use only a single CPU core to compute all of Ethereum’s smart contract code
- Every Ethereum node must store all of the Ethereum state, which includes the state of all smart contracts
- Work on increasing the speed of the EVM is not progressing nearly as quickly as demand for computation on the EVM is increasing
The transaction cost for using the EVM currently fluctuates up to an order of magnitude above the current value of an Urbit planet, which prices out the onboarding of new users.
Any Urbit PKI must fulfill a minimal set of requirements:
- Must prevent double-spending of assets
- Must have a globally-consistent state
- Must have data availability of the full PKI state
- Must have cheap enough transactions to allow user onboarding and maintenance
- Must have interoperability with other blockchains so as to allow trustless atomic swaps of Urbit assets for digital currencies
- Must have high enough throughput to support Urbit’s use-cases
- Must have a protocol, social or technological, for upgrading the PKI ruleset
Ethereum currently fails to provide cheap enough transactions for user onboarding to support our use-cases. Long-term, Ethereum will also fail to provide a cost-effective state storage solution, as Urbit’s demands for data storage will amount to about 400 GB once the network is in full use. Layer two additions to Ethereum lose some of the existing validity guarantees and add significant additional complexity and technological dependencies.
Proof of Authority is the simplest approach to running a blockchain, and one that’s especially suited for the Urbit network’s design. In a Proof of Authority chain, a fixed set of authorities must come to consensus on the state of the event log at any given time, and on the ruleset for modifying that event log. In order for them to consistently come to a distributed consensus, a majority of authorities must agree to any new transactions. Failure modes must also be defined, as well as conventions around ruleset upgrades. Even with these requirements, providing a complete Urbit PKI this way is about as simple as a distributed ledger system can be without having a single authority (in other words, being a centralized database).
The galaxies currently have formal and social control over the PKI and are the obvious candidates for maintaining property rights across the address space. These galaxies are heavily invested in the long-term value of the network as a whole. They are incentivized to maintain the validity and stability of property rights within the address space and to increase liquidity of address space. They’re also incentivized to be online and available, since their primary revenue stream will be providing routing services to stars.
We have a federated hierarchical network with a fixed semi-trusted set of participants, whereas other systems such as Bitcoin and Ethereum have a flat, trustless, anonymous network structure without a fixed set of participants. The trustless network structures of Bitcoin and Ethereum require much more intricate game theory to negate possible attack vectors. Urbit’s federated hierarchy relies on some amount of trust in the galaxies to function as stewards of the network, and this design facilitates significantly simpler governance.
The Urbit PKI has historically been based on a social agreement about ownership of galaxies, stars, and planets. In 2019, this agreement was reified as a set of Ethereum contracts. These contracts no longer serve their purpose, so it has become necessary to migrate this social agreement of property rights to a new medium. We have previously had external dependencies on technologies outside of the control of the network’s stakeholders, and have been burned by this arrangement. As such, we find it prudent to move to a more permanent PKI solution that is entirely within the control of the stakeholders.
A Proof of Authority blockchain is a combination of an event log and a Proof of Authority consensus mechanism. A Proof of Authority consensus mechanism is one in which a predetermined set of stakeholders is responsible for guaranteeing the validity and order of new events being appended to the event log.
For a Proof of Authority consensus mechanism to be Byzantine-fault tolerant, it must have a mechanism for preventing double-spends by rogue authority nodes.
There are two overall strategies for this. One is for the galaxies to run an off-the-shelf Byzantine-fault-tolerant consensus algorithm such as pBFT or HotStuff, using specialized nodes outside of Urbit. The alternative is an “Urbit-native” solution, which implements a BFT consensus algorithm using the galaxy Urbit nodes themselves.
Both options would be run either by all galaxies or by some subset of galaxies to whom non-participating galaxies have delegated their PKI management responsibilities. An Urbit-native solution could either implement pBFT or HotStuff in Hoon or use an Urbit-specific consensus algorithm, which would ideally be simpler and easier to tailor to Urbit’s use case over time. This document describes an Urbit-specific system, but the others are worth considering.
To process transactions, the galaxies elect one of their members as a “coordinator” who will be responsible for processing transactions and broadcasting them to the rest of the galaxies, until the galaxies choose a different coordinator. This is the same idea as a “leader” in traditional consensus algorithms. Requests to process transactions are sent to the coordinator, who validates them, orders and timestamps them, and broadcasts them to the other galaxies. Every galaxy is responsible for maintaining the full PKI state and serving requests on it.
The coordinator is the only galaxy that could attempt a double-spend, which it could do by committing a Byzantine fault: broadcasting two different chains of transactions to two different partitions of galaxies. The cause could be the coordinator attempting to double-spend its own assets, the coordinator colluding with another node to double-spend that node’s assets, a deliberate attempt to sow confusion by breaking consistency, or even a bug -- it’s easy to forget that Byzantine faults were originally encountered in automated sensor networks, without any human antagonism.
To prevent double-spends, the network would require signatures from a majority of galaxies on every proposed transaction (or block of transactions). The coordinator would send the proposed transaction to all galaxies. Each galaxy would validate the transaction, sign it, then send the signature back to the coordinator. Once the coordinator receives signatures from a majority of galaxies, the coordinator signs the transaction itself to seal it, then broadcasts the finalized transaction to all the other galaxies. Each other galaxy then validates the majority signature, then broadcasts the transaction to its stars. As an implementation detail, signature aggregation could be used to reduce space and computation costs.
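The majority check at the heart of this round can be sketched as follows. This is a minimal illustration under loud assumptions: HMAC-SHA256 with a per-galaxy secret stands in for a real public-key signature scheme (a real verifier would hold public keys, not secrets), and `sign`, `has_majority`, and the `total` parameter are hypothetical names introduced here.

```python
import hashlib
import hmac
from typing import Dict

GALAXY_COUNT = 256  # the galaxy address space is 8 bits wide


def sign(secret: bytes, block: bytes) -> bytes:
    """Stand-in for a real signature: HMAC-SHA256 over the block bytes."""
    return hmac.new(secret, block, hashlib.sha256).digest()


def has_majority(block: bytes,
                 signatures: Dict[int, bytes],
                 secrets: Dict[int, bytes],
                 total: int = GALAXY_COUNT) -> bool:
    """True if strictly more than half of `total` galaxies validly signed `block`."""
    valid = sum(
        1
        for galaxy, sig in signatures.items()
        if galaxy in secrets
        and hmac.compare_digest(sig, sign(secrets[galaxy], block))
    )
    return 2 * valid > total


# Simulated key registry; in this sketch every galaxy has a distinct secret.
secrets = {g: bytes([g]) * 32 for g in range(GALAXY_COUNT)}
block = b"transfer ~sampel-palnet to key K1"
```

Requiring a strict majority is what rules out the double-spend: two conflicting blocks can both reach the threshold only if at least one galaxy signs both of them.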
Not all galaxies are online yet. It’s also possible for a galaxy that is currently online to go offline. Any self-hosted Urbit PKI blockchain must take this into account. However, more than half of galaxies could realistically be kept online now; outages causing a loss of quorum are already relatively rare, and will become less frequent as Urbit matures.
The simplest solution would be a social requirement that every galaxy run a node, even if it doesn’t have any live stars yet. This also has the positive property that it would be more difficult for other galaxies to kick out a disliked galaxy based on some sort of liveness criterion -- there is no liveness criterion for validation; every galaxy gets to participate.
One thought for dealing with this is that as soon as a galaxy joins the Urbit network, it begins participating in the PKI consensus system automatically. Only a majority of live galaxies would be required for validation. This does not provide for a mechanism by which a galaxy could go offline, which might be necessary.
Another way to handle liveness would be to require each galaxy to either be online or delegate its PKI responsibilities to another galaxy. This could be performed socially when the system is first launched, and then a galaxy could later change its delegate or enter the live set.
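The delegation option can be pictured as a lookup from each galaxy to the live galaxy that acts for it. The sketch below assumes single-hop delegation (a delegate must itself be live and cannot re-delegate); that restriction, and the names `acting_validator`, `live`, and `delegates`, are assumptions for illustration only.

```python
from typing import Dict, Optional, Set


def acting_validator(galaxy: int,
                     live: Set[int],
                     delegates: Dict[int, int]) -> Optional[int]:
    """Return the live galaxy that validates on behalf of `galaxy`.

    A live galaxy acts for itself. An offline galaxy is represented by its
    delegate, but only if that delegate is live. Otherwise it is unrepresented.
    """
    if galaxy in live:
        return galaxy
    delegate = delegates.get(galaxy)
    if delegate in live:
        return delegate
    return None


live = {0, 1}
delegates = {2: 0, 3: 4}  # galaxy 3 delegated to galaxy 4, which is offline
```

A galaxy changing its delegate or entering the live set is then just an update to `delegates` or `live`, which could itself be recorded as a transaction.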
There is prior art for dealing with liveness problems, most of which include some sort of staking protocol. An Urbit-native PKI would ideally not need to incorporate monetary payments directly, but this problem deserves more attention.
The coordinator proposes one block of transactions for validation by the other galaxies per block interval, which will likely be something like five to ten minutes. This is standard procedure among blockchains: use blocks to measure and agree upon time. If the coordinator fails to broadcast a new validated block for a certain amount of time (say, an hour’s worth of blocks), then the other galaxies will treat it as having gone offline.
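The offline rule reduces to a simple timeout comparison. In the sketch below, the ten-minute interval and the six-block threshold are just one point in the ranges the text mentions ("five to ten minutes", "an hour’s worth of blocks"), and the function name and parameters are hypothetical.

```python
def coordinator_timed_out(last_block_time: float,
                          now: float,
                          block_interval: float = 600.0,
                          missed_blocks: int = 6) -> bool:
    """True once more than `missed_blocks` intervals have passed without a block.

    Times are in seconds. With the defaults, the threshold is 6 * 600 s = 1 hour.
    """
    return now - last_block_time > block_interval * missed_blocks
```

With a five-minute interval, the same hour would correspond to `missed_blocks=12`.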
Galaxies will then expect the next coordinator to be the galaxy in the list of currently registered live galaxies whose address is one more than the previous coordinator’s, modulo 256 (e.g. ~zod (0) -> ~nec (1)). That candidate is then responsible for sending a block to the currently live galaxies and accruing signatures from ⅔ of them in order to be elected as coordinator. This provides the leader election with a high degree of Byzantine fault tolerance.
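One reading of the rotation rule is sketched below: step forward from the previous coordinator, modulo 256, until a live galaxy is found, then require signatures from at least two thirds of the live set. Skipping over offline addresses and treating ⅔ as "at least" rather than "more than" are interpretations assumed here, not something the text pins down.

```python
from typing import Set

GALAXY_COUNT = 256


def next_candidate(previous: int, live: Set[int]) -> int:
    """Return the first live galaxy after `previous`, wrapping modulo 256."""
    for step in range(1, GALAXY_COUNT + 1):
        candidate = (previous + step) % GALAXY_COUNT
        if candidate in live:
            return candidate
    raise ValueError("no live galaxies")


def is_elected(signers: Set[int], live: Set[int]) -> bool:
    """True if at least two thirds of the live galaxies signed the candidate's block."""
    if not live:
        return False
    # Integer arithmetic avoids floating-point rounding at the threshold.
    return 3 * len(signers & live) >= 2 * len(live)
```

Because every galaxy computes the same candidate from the same live list, no extra communication is needed to agree on who should try next.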
The candidate has some amount of time (maybe two blocks’ worth) to gather all needed signatures and become the coordinator. Otherwise the galaxies increment again and try with the next candidate. If the galaxies somehow become out of sync (which would happen only if there were an unbounded amount of clock skew, or some other, worse error), then the system would fail to a state of unavailability, but would retain safety and consistency. In this case, galaxies could manually resynchronize and pick a new coordinator deliberately.
Most transactions can be validated automatically without human oversight. Some, however, require active voting by the galaxies: leader election and upgrades to the PKI contract itself. This would likely be performed via precommitment and automated voting. Galaxies would first vote informally, then when they’re confident that the required majority has been reached, they each pre-approve the hash of the new contract on their own nodes. The coordinator emits the vote transaction as part of a later block, which the galaxies then automatically sign, since that contract hash has been validated. Failure to live up to a precommitment could cause downtime, but would retain safety and consistency.
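The precommitment step can be sketched as a set of pre-approved hashes held by each galaxy node. SHA-256 as the contract hash, and the `GalaxyNode` class with its method names, are assumptions made for this example.

```python
import hashlib
from typing import Set


class GalaxyNode:
    """Holds the contract hashes this galaxy's operator has pre-approved."""

    def __init__(self) -> None:
        self.preapproved: Set[str] = set()

    def preapprove(self, contract_hash: str) -> None:
        """Record an operator's out-of-band decision to accept an upgrade."""
        self.preapproved.add(contract_hash)

    def should_sign_upgrade(self, contract_source: bytes) -> bool:
        """Sign automatically only if this exact contract was pre-approved."""
        digest = hashlib.sha256(contract_source).hexdigest()
        return digest in self.preapproved


new_contract = b"(hypothetical new PKI ruleset source)"
node = GalaxyNode()
node.preapprove(hashlib.sha256(new_contract).hexdigest())
```

Since approval is keyed on the hash, a coordinator that substitutes a different contract in the vote transaction gets no automatic signatures.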
The Urbit blockchain must support hashed timelock contracts to allow an Urbit address to be locked up by some other system until either a timeout expires, in which case the address is unlocked, or a value is supplied that hashes to a pre-specified value, in which case the address is transferred to a pre-specified public key.
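The two release conditions can be sketched as a small state check. This is an illustration only: SHA-256 as the hash function, block height as the clock, the rule that the timeout takes precedence once reached, and all names in `HashedTimelock` are assumptions.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HashedTimelock:
    address: int          # the locked Urbit address
    original_owner: str   # regains control if the timeout expires
    recipient: str        # pre-specified public key that can claim the address
    hashlock: bytes       # SHA-256 digest the revealed value must hash to
    timeout_block: int    # block height at which the lock expires

    def resolve(self, current_block: int,
                preimage: Optional[bytes] = None) -> Optional[str]:
        """Return the key that now controls the address, or None if still locked."""
        if current_block >= self.timeout_block:
            return self.original_owner
        if preimage is not None and hashlib.sha256(preimage).digest() == self.hashlock:
            return self.recipient
        return None


secret = b"swap-secret"
lock = HashedTimelock(
    address=65536,
    original_owner="owner-key",
    recipient="recipient-key",
    hashlock=hashlib.sha256(secret).digest(),
    timeout_block=100,
)
```

In an atomic swap, the same `hashlock` is used on both chains, so revealing the preimage to claim one asset also enables the counterparty to claim the other.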
Urbit block time will likely not be as robust as block time on other chains, due to the lower number of nodes participating in block generation. This should be accounted for by making sure the timeout on the other chain is long enough to provide a reasonable degree of certainty that the transaction can be revoked on both sides in the case of an outage of the Urbit chain.
The rollout would start with planets, then move stars and galaxies over. The first step would run the full Proof of Authority chain as the canonical source of PKI data, but it would only perform writes about planet keys and sponsorship information; all other writes would be sourced from the existing Ethereum chain, including galactic senate voting and changes to star and galaxy keys.
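The first-stage split can be read as a routing rule on each write. The sketch below relies on the address widths of the Urbit hierarchy (galaxies are 8-bit, stars 16-bit, planets 32-bit). The operation names and the reading that only planet key and planet sponsorship writes move to the new chain are assumptions; the text leaves the exact boundary open.

```python
def rank(address: int) -> str:
    """Classify an Urbit address by the width of its address space."""
    if address < 0:
        raise ValueError("addresses are non-negative")
    if address < 2 ** 8:
        return "galaxy"
    if address < 2 ** 16:
        return "star"
    if address < 2 ** 32:
        return "planet"
    return "other"


def write_source(address: int, operation: str) -> str:
    """Decide which chain is authoritative for a write during the first stage.

    `operation` is a hypothetical label such as "keys", "sponsor", or "vote".
    """
    if rank(address) == "planet" and operation in ("keys", "sponsor"):
        return "proof-of-authority"
    return "ethereum"
```

Later stages would widen the first branch to cover stars and then galaxies.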
TODO: describe stages of rollout, omitting leader election, voting, and atomic swaps for an MVP