chriseth/Patricia.sol

## evm-based sidechains.md

      
    Raw
  

              evm-based sidechains.md
            
          
    Verifying evm-based sidechain executions

Specific smart contract that can be used both off-chain and on-chain for verification.
Multiple execution modes:

direct:

The smart contract gets the full state (encoded in a certain way) plus the transaction as input.
It updates the state thereby calculating
(a) helper values for verification,
(b) Merkle proofs for all reads and writes into the state and
(c) the new root hash

with helpers:

as in 1, just that the helper values that are generated in 1(a) are also added to the input.
Those are used to speed up the computation like in the following example.
If this is run with _helper == 0, it will produce a value to be found and a value for helper.
If that helper value is used in the next run, it will speed up the search. At the same time,
there is no value for _helper which can produce a different return value (it will only
revert in some cases).
function findValue(uint[] a, uint x, uint _helper) returns (uint value, uint helper) {
  // a is sorted by increasing value
  for (uint i = _helper; i < a.length; i++) {
    if (array[i] >= x && (i == 0 || array[i-1] < x))
      return (array[i], i);
  }
  // it is important to detect an invalid value of _helper
  revert();
}


with helpers plus merkle proofs for all state access

In all three variants, state access has to be performed by special functions:

readState(MerkleTree _state, bytes _key, bytes _merkleProofInput) returns (bytes _value, bytes _merkleProof)
writeState(MerkleTree _state, bytes _key, bytes _value, bytes _merkleProofInput) returns (bytes _merkleProof)

If run in the first mode, the input merkle proofs are empty. Instead they are generated and provided as return value.
writeState updates the merkle tree in place.
If run in the second mode, the merkle proof can be provided as input. In this case, (for read) it already contains the
value to be read and is only verified against the state. The function just returns the same proof again.
For writeState, the state is also updated in place, but it is assumed to consist only of the root hash anyway.
More formal writeup (work in progress)

This technique allows to re-run and and in particular verify transactions such that the second run is much more
efficient than the first run. The use-case is to process transactions in a private chain and verify them on the
public chain only on demand and thus take load off the public chain. Since the second run will not generate
the full state, the operators of the private chain have to ensure data availability and can be punished if
they do not provide required data.
Let f be a fixed function taking three parameters and returning two values:

Let f(s, x) = (s', y, w) be the result of running f in full mode (0) on the current state s with input x, producing the new state s', actual output y and witness w

We call g a verifier of f using the hash function H (think of H as returning the root of a Merkle tree for structured input), if for all s and x and (s', y, w) = f(s, x) it holds that g(H(s), x, w) = (H(s'), y).
Patricia-Merkle-Tree

The Patricia-Merkle-Tree encodes an arbitrary partial mapping m: bits -> bits as an authenticated data structure with
short proofs. Let <.> -> bits256 be a hash function. We first transform m to M by M(<k>) = v for m(k) = v.
For a prefix pref of a key in M let path_M(pref: bits) be the longest bit string b such that pref . b
is a unique prefix of keys in M. If pref is not a prefix of a key in M, then path_M(pref) is the empty bit string.
We now define a partial function node_M that takes a prefix and returns the value of a node in the Patricia-Merkle-tree.
<node(path_M('')) . path_M('')> is called the root hash of M if M is non-empty. If it is empty, then
the root hash is the all-zero bitstring of 256 bits.
Definition

For an arbitrary prefix of a key in M, node(pref) is defined as follows:
node_M(pref) =
  - M(pref)                                          if pref is a key in M
  - <node_M(pref . '0' . path_M(pref . '0'))> .        otherwise
    <node_M(pref . '1' . path_M(pref . '1'))> . 
    length(path_M(pref . '0')): uint8 . 
    length(path_M(pref . '1')): uint8 . 
    path_M(pref . '0')) [ padded to multiples of 8 bits ] . 
    path_M(pref . '1')) [ padded to multiples of 8 bits ]

Definition

A bit string p is called a branch point of M if it is a longest unique prefix
of a key in M, but not equal to a key.
Observation

For a branch point p of M, node_M(p) only contains recursive mentions
of node_M with branch points or keys of M as arguments.
For a key k of M, node_M(k) does not contain recursive mentions
of node_M.
From this it follows that if p is a prefix of a key in M, then node_M(p . path_M(p)) does
not contains recursive mentions of node_M where the argument is not a prefix
of a key in M.

  
## Patricia.sol
contract PatriciaTree {
    // List of all keys, not necessarily sorted.
    bytes[] keys;
    // Mapping of hash of key to value
    mapping (bytes32 => bytes) values;

    struct Label {
        bytes32 data;
        uint length;
    }
    struct Edge {
        bytes32 node;
        Label label;
    }
    struct Node {
        Edge[2] children;
    }
    // Particia tree nodes (hash to decoded contents)
    mapping (bytes32 => Node) nodes;
    // The current root hash, keccak256(node(path_M('')), path_M(''))
    bytes32 public root;
    Edge rootEdge;

    function edgeHash(Edge e) internal returns (bytes32) {
        return keccak256(e.node, e.label.length, e.label.data);
    }

    // Returns the hash of the encoding of a node.
    function hash(Node memory n) internal returns (bytes32) {
        // if (n.isLeaf)
        //     return keccak256(n.value);
        // else
            return keccak256(edgeHash(n.children[0]), edgeHash(n.children[1]));
    }

    // Returns the Merkle-proof for the given key
    // Proof format should be:
    //  - uint8 branchMask - bitmask with high bits at the positions in the key
    //                    where we have branch nodes (bit in key denotes direction)
    //  - bytes32[] hashes - hashes of sibling edges
    function getProof(bytes key) returns (uint8 branchMask, bytes32[] _siblings) {
        Label memory k = Label(keccak256(key), 256);
        Edge memory e = rootEdge;
        bytes32[256] siblings;
        uint length;
        uint numSiblings;
        while (k.length > 0) {
            var (prefix, suffix) = splitCommonPrefix(e.label, k);
            require(prefix.length == e.label.length);
            length += prefix.length + 1;
            branchMask |= 1 << (32 - length);
            var (head, tail) = chopFirstBit(suffix);
            siblings[numSiblings++] = edgeHash(nodes[e.node].children[1 - head]);
            e = nodes[e.node].children[head];
            k = tail;
        }
        _siblings = new bytes32[](numSiblings);
        for (uint i = 0; i < numSiblings; i++)
            _siblings[i] = siblings[i];
    }

    function verifyProof(bytes32 rootHash, bytes key, bytes value, uint8 branchMask, bytes32[] siblings) {
        Label memory k = Label(keccak256(key), 256);
        Edge memory e;
        e.node = keccak256(value);
        uint previousBranch = 32;
        for (uint i = 0; ; i++) {
            uint branch = lowestBitSet(branchMask);
            (k, e.label) = splitAt(k, branch);
            if (branchMask == 0)
                break;
            uint bit = bitSet(uint8(branch), k.length);
            bytes32[2] memory edgeHashes;
            edgeHashes[bit] = edgeHash(e);
            edgeHashes[1 - bit] = siblings[siblings.length - i - 1];
            e.node = keccak256(edgeHashes);
            branchMask &= uint8(~(uint(1) << branch));
            // TODO try to get rid of this
            previousBranch = branch;
        }
        require(rootHash == edgeHash(e));
    }

    function insert(bytes key, bytes value) {
        Label memory k = Label(keccak256(key), 256);
        values[k.data] = value;
        keys.push(key);
        rootEdge = insertAtEdge(rootEdge, k, value);
        root = edgeHash(rootEdge);
    }

    // TODO also return the proof (which is basically just the array of encodings of nodes)
    function insertAtNode(bytes32 nodeHash, Label key, bytes value) internal returns (bytes32) {
        Node memory n = nodes[nodeHash];
        var (head, tail) = chopFirstBit(key);
        n.children[head] = insertAtEdge(n.children[head], tail, value);
        return replaceNode(nodeHash, n);
    }

    function insertAtEdge(Edge e, Label key, bytes value) internal returns (Edge) {
        var (prefix, suffix) = splitCommonPrefix(e.label, key);
        bytes32 newNodeHash;
        if (prefix.length >= e.label.length) {
            newNodeHash = insertAtNode(e.node, suffix, value);
        } else {
            // Mismatch, so let us create a new branch node.
            var (head, tail) = chopFirstBit(suffix);
            Node memory branchNode;
            branchNode.children[head] = Edge(insertValueNode(value), tail);
            branchNode.children[1 - head] = Edge(e.node, removePrefix(e.label, prefix.length + 1));
            newNodeHash = insertNode(branchNode);
        }
        return Edge(newNodeHash, prefix);
    }
    function splitCommonPrefix(Label check, Label label) internal returns (Label prefix, Label labelSuffix) {
        return splitAt(label, commonPrefix(check, label));
    }
    /// Splits the label at the given position and returns prefix and suffix,
    /// i.e. prefix.length == pos and prefix.data . suffix.data == l.data.
    function splitAt(Label l, uint pos) internal returns (Label prefix, Label suffix) {
        require(pos <= l.length);
        prefix.length = pos;
        prefix.data = (l.data >> (32 - pos)) << (32 - pos);
        suffix.length = l.length - pos;
        suffix.data = l.data << pos;
    }
    function commonPrefix(Label a, Label b) internal returns (uint prefix) {
        for (prefix = 0; prefix < a.length && prefix < b.length; prefix++)
            if (a.data[prefix] != b.data[prefix])
                break;
    }
    function removePrefix(Label l, uint prefix) internal returns (Label r) {
        require(prefix <= l.length);
        r.length = l.length - prefix;
        l.data = l.data << prefix;
    }
    function chopFirstBit(Label l) internal returns (uint firstBit, Label tail) {
        require(l.length > 0);
        return (uint(l.data >> 255), Label(l.data << 1, l.length - 1));
    }
    function insertValueNode(bytes value) internal returns (bytes32 newHash) {
        // Node memory n;
        // n.isLeaf = true;
        // n.value = value;
        // return insertNode(n);
    }
    function insertNode(Node memory n) internal returns (bytes32 newHash) {
        bytes32 h = hash(n);
        nodes[h] = n;
        return h;
    }
    function replaceNode(bytes32 oldHash, Node memory n) internal returns (bytes32 newHash) {
        delete nodes[oldHash];
        return insertNode(n);
    }
    function lowestBitSet(uint8 bitfield) returns (uint bit) {
        for (bit = 31; bit > 0; bit --)
            if (bitSet(bitfield, bit) != 0)
                return bit;
        return 0;
    }
    function bitSet(uint8 bitfield, uint bit) returns (uint) {
        return bitfield & (1 << (31-bit)) != 0 ? 1 : 0;
    }
}
	contract PatriciaTree {
	// List of all keys, not necessarily sorted.
	bytes[] keys;
	// Mapping of hash of key to value
	mapping (bytes32 => bytes) values;

	struct Label {
	bytes32 data;
	uint length;
	}
	struct Edge {
	bytes32 node;
	Label label;
	}
	struct Node {
	Edge[2] children;
	}
	// Particia tree nodes (hash to decoded contents)
	mapping (bytes32 => Node) nodes;
	// The current root hash, keccak256(node(path_M('')), path_M(''))
	bytes32 public root;
	Edge rootEdge;

	function edgeHash(Edge e) internal returns (bytes32) {
	return keccak256(e.node, e.label.length, e.label.data);
	}

	// Returns the hash of the encoding of a node.
	function hash(Node memory n) internal returns (bytes32) {
	// if (n.isLeaf)
	// return keccak256(n.value);
	// else
	return keccak256(edgeHash(n.children[0]), edgeHash(n.children[1]));
	}

	// Returns the Merkle-proof for the given key
	// Proof format should be:
	// - uint8 branchMask - bitmask with high bits at the positions in the key
	// where we have branch nodes (bit in key denotes direction)
	// - bytes32[] hashes - hashes of sibling edges
	function getProof(bytes key) returns (uint8 branchMask, bytes32[] _siblings) {
	Label memory k = Label(keccak256(key), 256);
	Edge memory e = rootEdge;
	bytes32[256] siblings;
	uint length;
	uint numSiblings;
	while (k.length > 0) {
	var (prefix, suffix) = splitCommonPrefix(e.label, k);
	require(prefix.length == e.label.length);
	length += prefix.length + 1;
	branchMask \|= 1 << (32 - length);
	var (head, tail) = chopFirstBit(suffix);
	siblings[numSiblings++] = edgeHash(nodes[e.node].children[1 - head]);
	e = nodes[e.node].children[head];
	k = tail;
	}
	_siblings = new bytes32[](numSiblings);
	for (uint i = 0; i < numSiblings; i++)
	_siblings[i] = siblings[i];
	}

	function verifyProof(bytes32 rootHash, bytes key, bytes value, uint8 branchMask, bytes32[] siblings) {
	Label memory k = Label(keccak256(key), 256);
	Edge memory e;
	e.node = keccak256(value);
	uint previousBranch = 32;
	for (uint i = 0; ; i++) {
	uint branch = lowestBitSet(branchMask);
	(k, e.label) = splitAt(k, branch);
	if (branchMask == 0)
	break;
	uint bit = bitSet(uint8(branch), k.length);
	bytes32[2] memory edgeHashes;
	edgeHashes[bit] = edgeHash(e);
	edgeHashes[1 - bit] = siblings[siblings.length - i - 1];
	e.node = keccak256(edgeHashes);
	branchMask &= uint8(~(uint(1) << branch));
	// TODO try to get rid of this
	previousBranch = branch;
	}
	require(rootHash == edgeHash(e));
	}

	function insert(bytes key, bytes value) {
	Label memory k = Label(keccak256(key), 256);
	values[k.data] = value;
	keys.push(key);
	rootEdge = insertAtEdge(rootEdge, k, value);
	root = edgeHash(rootEdge);
	}

	// TODO also return the proof (which is basically just the array of encodings of nodes)
	function insertAtNode(bytes32 nodeHash, Label key, bytes value) internal returns (bytes32) {
	Node memory n = nodes[nodeHash];
	var (head, tail) = chopFirstBit(key);
	n.children[head] = insertAtEdge(n.children[head], tail, value);
	return replaceNode(nodeHash, n);
	}

	function insertAtEdge(Edge e, Label key, bytes value) internal returns (Edge) {
	var (prefix, suffix) = splitCommonPrefix(e.label, key);
	bytes32 newNodeHash;
	if (prefix.length >= e.label.length) {
	newNodeHash = insertAtNode(e.node, suffix, value);
	} else {
	// Mismatch, so let us create a new branch node.
	var (head, tail) = chopFirstBit(suffix);
	Node memory branchNode;
	branchNode.children[head] = Edge(insertValueNode(value), tail);
	branchNode.children[1 - head] = Edge(e.node, removePrefix(e.label, prefix.length + 1));
	newNodeHash = insertNode(branchNode);
	}
	return Edge(newNodeHash, prefix);
	}
	function splitCommonPrefix(Label check, Label label) internal returns (Label prefix, Label labelSuffix) {
	return splitAt(label, commonPrefix(check, label));
	}
	/// Splits the label at the given position and returns prefix and suffix,
	/// i.e. prefix.length == pos and prefix.data . suffix.data == l.data.
	function splitAt(Label l, uint pos) internal returns (Label prefix, Label suffix) {
	require(pos <= l.length);
	prefix.length = pos;
	prefix.data = (l.data >> (32 - pos)) << (32 - pos);
	suffix.length = l.length - pos;
	suffix.data = l.data << pos;
	}
	function commonPrefix(Label a, Label b) internal returns (uint prefix) {
	for (prefix = 0; prefix < a.length && prefix < b.length; prefix++)
	if (a.data[prefix] != b.data[prefix])
	break;
	}
	function removePrefix(Label l, uint prefix) internal returns (Label r) {
	require(prefix <= l.length);
	r.length = l.length - prefix;
	l.data = l.data << prefix;
	}
	function chopFirstBit(Label l) internal returns (uint firstBit, Label tail) {
	require(l.length > 0);
	return (uint(l.data >> 255), Label(l.data << 1, l.length - 1));
	}
	function insertValueNode(bytes value) internal returns (bytes32 newHash) {
	// Node memory n;
	// n.isLeaf = true;
	// n.value = value;
	// return insertNode(n);
	}
	function insertNode(Node memory n) internal returns (bytes32 newHash) {
	bytes32 h = hash(n);
	nodes[h] = n;
	return h;
	}
	function replaceNode(bytes32 oldHash, Node memory n) internal returns (bytes32 newHash) {
	delete nodes[oldHash];
	return insertNode(n);
	}
	function lowestBitSet(uint8 bitfield) returns (uint bit) {
	for (bit = 31; bit > 0; bit --)
	if (bitSet(bitfield, bit) != 0)
	return bit;
	return 0;
	}
	function bitSet(uint8 bitfield, uint bit) returns (uint) {
	return bitfield & (1 << (31-bit)) != 0 ? 1 : 0;
	}
	}