BIP: X1 Title: Merkle prefix trees Author: Mark Friedenbach <email@example.com> Status: Draft Type: Standards Track Created: 19-12-2013
This BIP describes a Merkle hash tree variant of the prefix-tree data structure, ideally suited for encoding key-value indices which support space-efficient extracted proofs; extensions to the peer-to-peer protocol which enable clients to perform trust-free queries of the contents of these indices and updates to them; and modifications to the JSON-RPC API to enable extraction and validation of serialized proofs of inclusion or modification.
There are a number of applications which would benefit from having a data structure with the following properties:
- Arbitrary mapping of keys to values. A key can be any bytestring, and its value any other bytestring.
- Duplicate keys disallowed. Every key has one, and only one value associated with it. Some applications demand assurance that no key value is reused, and that this constraint can be checked without requiring access to the entire data structure.
- Efficient look-up by key. The data structure should support sub-linear lookup operations with respect to the number of keys in the mapping. Logarithmic time or linear with respect to the length of the key should be achievable and would be sufficient for realistic applications.
- Merkle compression of mapping structure. It should be possible to produce a reduced description of the tree consisting of a single root hash value which is deterministically calculated from the mapping structure.
- Efficient proofs of inclusion or modification. It should be possible to extract a proof of key/value mapping or record of modification which is limited in size and verification time by the length of the key in the worst case.
- Computation of updates using local information. Given a set of inclusion proofs, it should be possible to calculate adjustments to the local mapping structure (update or deletion of included mappings, or insertion between two included mappings which are adjacent in the global structure) without requiring regions of the data structure not included in the proofs.
- Retrieval of proof data from untrusted nodes. It should be possible for nodes to query inclusion or operational proofs via the peer-to-peer protocol, and for remote nodes to return partial proofs due to size, data, or query constraints constraints.
scriptPubKey, document time-stamping, and more efficient merged mining. This BIP describes a Merkle hash tree variant of prefix trees which has the above properties, and extensions to the peer-to-peer protocol and JSON-RPC API to enable extraction and validation of serialized proofs, but leaves the myriad applications to be formalized in future BIPs.
This BIP defines a binary prefix tree. Such a structure provides a mapping of bitstrings (the keys) to bytestrings (the values). It is an acyclic binary tree which implicitly encodes keys within the traversal path -- a "left" branch is a 0, and a "right" branch is a 1. Each node is reachable by only one unique path, and reading off the branches taken (0 for each left, 1 for each right) as one follows the path from root to target yields the target node's key.
The particular binary prefix tree defined by this BIP is a hybrid PATRICIA and de la Brandais tree structure.
PATRICIA trees compress a long sequence of non-branching nodes into a single interior node with a per-branch skip prefix. This achieves significant savings in storage space, root hash calculation, and traversal time.
A de la Brandais trie achieves compression by only storing branches actually taken in a node. The space savings are minimal for a binary tree, but it reduces by a significant constant factor the number of hash operations required to perform updates and validate proofs.
This BIP describes two variations of the authenticated prefix tree and its serialized representations. Additional BIPs describe the application of authenticated prefix trees to such applications as committed output indices, compressed proof of work, and document time-stamping and merged mining.
In the general case, each node of the prefix tree is itself a binary hash tree of depth 2, as represented in the following block diagram:
----------- | NODE HASH | ----------- / \ ------ ---------- | SELF | | CHILDREN | ------ ---------- / \ / \ ------ ------- ------ ------- | INFO | | VALUE | | LEFT | | RIGHT | ------ ------- ------ ------- | | / \ info value ... ...
Figure 1: Hash structure of an interior node.
To determine the hash value of a node, the info-string (defined below), and the node's value bytestring are separately hashed. The resulting two hashes are then concatenated and the result hashed, forming the left side of the hash tree. The hash of the left branch and the right branch are concatenated together and the result hashed to form the right side of the hash tree. These two intermediate hash values are then concatenated and hashed to generate the hash value of the sub-tree rooted at the node. Thus a total of five hash operations are required to calculate the root hash of a node (the left and right branches are themselves the root hash values of their respective sub-trees, so we do not double-count their construction in this calculation).
The PATRICIA optimization mandates that any chained sequence of non-branching, non-terminal interior nodes are elided, with the sequence of left-right branches recorded as a skip prefix in the info string of the parent node to the sequence. This makes every left- or right-branch in a hash tree structure point to a terminal or branching node, minimizing the number of nodes in the tree and the number of hashes required to compute the root hash of the tree.
The de la Brandais optimization instructs that anytime the value, left-branch, or right-branch is not used, the unused position is cut from the hash tree structure and its sibling hash (if any) propagated upwards. The presence of these positions can be inferred from the first byte of the info string.
For example, an interior node that does not itself contain a value has the VALUE hash removed and the INFO hash promoted:
----------- | NODE HASH | ----------- / \ ------ ---------- | INFO | | CHILDREN | ------ ---------- | / \ info ------ ------- | LEFT | | RIGHT | ------ ------- / \ ... ...
Figure 2: Empty interior node (VALUE cut).
Likewise, if either the left- or right-branch is empty in an interior node, that branch is removed and its sibling promoted:
----------- | NODE HASH | ----------- / \ ------ ------ | SELF | | LEFT | ------ ------ / \ | ------ ------- ... | INFO | | VALUE | ------ ------- | | info value
Figure 3: Non-branching interior node (RIGHT cut).
----------- | NODE HASH | ----------- / \ ------ ------- | SELF | | RIGHT | ------ ------- / \ | ------ ------- ... | INFO | | VALUE | ------ ------- | | info value
Figure 4: Non-branching interior node (LEFT cut).
PATRICIA compression assures that non-branching, non-terminating interior nodes (e.g. VALUE and RIGHT cut, or VALUE and LEFT cut) are never observed.
In a leaf node, both branches are cut:
----------- | NODE HASH | ----------- / \ ------ ------- | INFO | | VALUE | ------ ------- | | info value
Figure 5: Leaf node (LEFT and RIGHT cut).
In the single, unique case of the root node of an empty tree, it is possible for all three of the value, left- and right-branches to be cut, resulting in the following structure:
----------- | NODE HASH | ----------- | info
Figure 6: Root node of an empty tree (VALUE, LEFT and RIGHT cut); the root hash is the hash of the info string of an empty node.
Two cryptographic hash function constructions are used in this BIP:
First, a method is required for reducing arbitrary bytestrings to fixed-length hash values, for the reduction of info strings and values in the left-side of the hash tree structure. Bitcoin's standard "hash256" double-SHA256 construction is used for this purpose.
Second, a method is required for the Merkle compression of two hash values in the context of calculating the root hash of the Merkle hash tree. In principle this could be the same double-SHA256 construction, but there are two practical concerns which motivate the selection of an alternative construction:
- The nested construction is not required as inputs to the Merkle compression function are always of fixed length (two hash values), so length extension attacks are not possible. The second invokation of the underlying hash function can be eliminated, thereby reducing by one the number of applications required for combining two hash values.
- Only 448 bytes are availble for use in the final block of a SHA256 application, due to the 64-bit length field in the NIST padding scheme. This means that two 256-bit hash values, 512 bytes total, will not fit in a single message block and will therefore require two applications of the SHA256 compression function.
Note that we are explicitly not using SHA224 or SHA256/224, the NIST contructs designed for this purpose. This is for two reasons:
- The NIST constructs specify different initialization vectors, which are not guaranteed to be supported by every special-purpose SHA256 hardware accelerator.
- The root hash of the tree remains a 256-bit value, one of the expected hash value sizes in existing bitcoin protocols, since truncation is performed as an internal step of the Merkle compression function.
As a hierarchical structure, the serialization of an entire tree is the serialization of its root node. There are two serialization formats of concern to this BIP: the hash tree serialization used to reduce info strings and values to bytestrings suitable for hashing, and the transmission serialization formats which define how (pruned sections of) such trees are serialized to disk or for network transmission.
In the sections that follow, both serialization formats will be fully defined. Two envelopes will then be considered: one for encoding inclusion (or exclusion) proof extracted from a prefix tree, and one for encoding operational proofs providing the necessary information to perform some update to a tree.
The binary prefix tree maps bitstrings (keys) to bytestrings (values). It is manifested as a tree container structure where the nodes of the tree have two branches corresponding to
0 (left) and
1 (right), for each branch a possibly empty skip prefix bitstring, and an optional bytestring value.
The key of a node is calculated by beginning with an empty bitstring and traversing the tree from the root to the node in question. For each branch taken, a
1 is affixed for a left or right branch, respectfully, followed by the branches' skip prefix as recorded in the info string.
The value of a node, if the
has_value flag of the info string (described below) is set, has a dedicated position in the hash tree structure. The position is filled by serializing the value as a bytestring and hashing it.
Two different methods exist in the bitcoin source file serialize.h for compact serialization of unsigned integers,
CVarInt. Neither format provides lexigraphical ordering of items if used in the serialization of key values for a prefix tree.
CVarInt is incapable of providing such capability, but the problem with
CompactSize is merely a matter of endianness. This BIP recommends but does not require the addition of a big-endian variation of
CompactSize in which a multi-byte
xSize is first converted to/from big-endian or network byte order. This big-endian encoding is used in some of the proposed applications of this BIP's data structure, including in the construction of the example transaction output tree test vectors included in this BIP.
For clarity it is suggested that the original little-endian
CompactSize be referred to as
LittleCompactSize, and the newly introduced big-endian serialization as
The info string (described below) contains some universally useful aggregate information about the left and right branches of a node, if present, including the number of items contained under the branch, and an the size of the canonical serialization of the branch in its unpruned state.
Other aggregate information may also be included in the meta field of the info string. The contents of a meta field are application specific, but all meta fields must have the property that they can be deterministically built from the meta fields of the left and right branch and the node's value (additive construction), or from the parent node's meta field and value and the meta field of its sibling (subtractive construction).
For example, an index over the unspent transaction output set might include the aggregate number of bitcoins contained by a subtree within its meta field, enabling efficient proofs of inflationary fraud.
An info string is the concatenation of five structures:
info := flags || meta || value || left || right
flags is a single byte field whose composite values determine the bytes that follow.
flags = (left_code << 0) | (right_code << 2) | (has_value << 4) | (prune_left << 5) | (prune_right << 6) | (prune_value << 7)
These codes and flag fields take on different values depending on whether the info string is being serialized for transmission, or as part of the hash tree calculations. In the transmission serialization formats, the
right_code are special 2-bit enumeration fields:
- A value of 0 indicates that the node does not branch in this direction, and the corresponding
rightbranch is missing (replaced with the empty string in the node serialization).
- A value of 1 indicates a single bit key prefix for this branch, implicitly 0 for
leftand 1 for
- A 2 indicates up to 7 bits of additional skip prefix (beyond the implicit first bit, making 8 bits total) are stored in a compact single-byte format.
- A 3 indicates a skip prefix with greater than 7 additional bits, stored run-length encoded.
- A 0 indicates that the branch is empty: its position is cut from the hash tree structure, neither a skip prefix nor the size and count fields are included for this branch.
- A 1 indicates the branch is present, with an run-length encoded skip prefix, count, and size fields in the info string.
right_codefields are not not in the hash tree serialization and must be zero.
The single bit
has_value indicates whether the node stores a data bytestring, the value associated with its key prefix. Since keys may be any value or length, including one key being a prefix of another, it is possible for interior nodes in addition to leaf nodes to have values associated with them, and therefore an explicit value-existence bit is required.
The remaining three bits are used for proof extraction, and are masked away prior to hash operations -- they are
0 in the hash tree serialization.
prune_left indicates that the entire left branch has been pruned.
prune_right has similar meaning for the right branch. If
has_value is set,
prune_value may be set to exclude the node's value from encoded proof. This is necessary field because interior nodes may hold values, and it is possible their values may be pruned while their children are not.
meta field is a run-length encoded bytestring, and must always be present, even for applications which do not use it. It takes on a (possibly zero length) bytestring value defined by the particular application. Use of the
meta field is application dependent, and will not be covered in this specification.
meta field can be set to the empty bytestring (serialized as a single zero byte to encode its length) if the application has no use for the
with calculate_meta(node, ...) as data: meta := CVarInt(nbytes(meta)) || CFlatData(meta)
meta is a bytestring: its length prefix in serialized form is the number of bytes, as calculated by the pseudocode function
In the transmission serialization formats only, the run-length encoded value bytestring or, if pruned, the 28-byte truncated double-SHA256 hash of the value bytestring is serialized after the
right non-terminals are only present if the corresponding
flags.right_code are non-zero. The format depends on the value of this flags setting.
In the transmission serialization formats a compact encoding is used, the structure of which is selected by the
switch flags.branch_code: case 0: branch := ε case 1: branch := branch_node_or_hash case 2: branch := int_to_byte(1 << (nbits(prefix) - 1) | bits_to_int(prefix >> 1)) || branch_node_or_hash case 3: branch := CVarInt(nbits(prefix) - 9) || CFlatData(bits_to_string(prefix >> 1)) || branch_node_or_hash
In the hash tree serialization a simpler bimodal encoding is used where the skip prefix is always encoded in the same format:
switch flags.branch_code: case 0: branch := ε case 1: branch := CVarInt(nbits(prefix) - 1) || CFlatData(bits_to_string(prefix >> 1)) || CVarInt(count(branch)) || CVarInt(size(branch))
branch_code is a stand-in meant to describe either
right_code, and likewise everywhere else in the above pseudocode
branch can be replaced with either
prefix is the key bits between the current node and the next branching, terminal, and/or leaf node, including the implicit leading bit for the branch (0 for the left branch, 1 for the right branch). In the above code,
nbits(prefix) returns the number of bits in the prefix, and
prefix >> 1 drops the first bit reducing the size of the prefix by one and renumbering the indices accordingly.
int_to_byte takes an integer in the range [0,] and returns the octet representing that value. This is a NOP in many languages, but present in this pseudocode so as to be explicit about what is going on.
bits_to_int interprets a sequence of bits as a little-endian integer value. This is analogous to the following pseudocode:
def bits_to_int(bits): result = 0 for idx in 1..len(bits): if bits[idx] == 1: result |= 1<<idx return result
bits_to_string serializes a sequence of bits into a byte string. It uses little-endian bit and byte order, as demonstrated by the following pseudocode:
def bits_to_string(bits): bytes =  * ceil(len(bits) / 8) for idx in 1..len(bits): if bits[idx] == 1: bytes[idx / 8] |= 1 << idx % 8 return map(int_to_byte, bytes)
branch_node_or_hash, used in the transmission serialization formats, is the serialized child node or if pruned its 28-byte truncated hash and associated meta-data:
if flags.branch_pruned: branch_node_or_hash := CFlatData(H(branch)) || CVarInt(count(branch)) || CVarInt(size(branch)) else: branch_node_or_hash := serialize(branch)
count(branch) returns the number terminal (value containing) nodes in the fully expanded / unpruned branch, and
size(branch) is the canonical serialization size of the branch. The canonical serializatin size of a node is defined to be:
def size(node): size = 1 if flags.has_value: size += nbytes(CVarInt(nbytes(value))) size += nbytes(value) for branch in left,right: if flags.branch_code != 0: size += nbytes(CVarInt(nbits(branch_prefix) - 1)) size += nbytes(bits_to_string(branch_prefix >> 1)) size += size(branch)
That is to say, the canonical size of a node is defined to be the length of the flags field (one byte) plus the serialized size of its run-length encoded value bytestring, if present, plus for each non-empty branch the size of the hash tree serialization of that branch's skip prefix and its recursively computed canonical size.
An inclusion proof is a prefix tree pruned to contain a subset of its keys. The serialization of an inclusion proof takes the following form:
inclusion_proof := variant || root_hash || tree || checksum
CVarIntencoded version field for future extensions. The format for inclusion proofs with a varient value greater than zero are undefined at this time.
root_hashis the Merkle compression hash of the tree, the 32-byte SHA256 hash of the root node.
treeis the possibly pruned, serialized representation of the tree.
checksumis the first 4 bytes of the double-SHA256 checksum of
tree(all fields prior to
-----BEGIN INCLUSION PROOF----- ATzPZheRnns6KfqBKzZs0dXLOxithdan2p18KgJ2c4O0DgARBwAauzpZmgIQAAP9agcPADEt7Zxd TC3tABAAA/1xBylsTG0t7cwBEAAD/YgHJ5OO2rOcGhAAA/3ZByEg+2g= -----END INCLUSION PROOF-----
Decoded, it looks like this:
0x01 # Level-compressed hashing # Merkle root: 0x3ccf6617919e7b3a29fa812b366cd1d5cb3b18ad85d6a7da9d7c2a02767383b4 # Serialized tree (unpruned): 0x0e001107001abb3a599a02100003fd6a070f00312ded9c5d4c2ded00100003fd 0x7107296c4c6d2dedcc01100003fd880727938edab39c1a100003fdd907 # Checksum: 0x2120fb68
An operational proof is a list of insert/update and delete operations suffixed to an inclusion proof which contains the pathways necessary to perform the specified operations. The inclusion proof must contain the key values to be updated or deleted, and the nearest adjacent nodes for each insertion. The serialization of an operational proof takes the following form:
operational_proof := variant || root_hash || tree || vector(delete*) || vector(update*) || new_hash || checksum
delete := CVarInt(nbits(key)) || CFlatData(bits_to_string(key)) update := CVarInt(nbits(key)) || CFlatData(bits_to_string(key)) || CVarInt(nbytes(value)) || CFlatData(value)
The first three fields,
tree are the inclusion proof, and take the same values described in the previous section.
vector(delete*) is a list of key values to be deleted; each key value in this list must exist in the inclusion proof.
vector(update*) is a list of key, value mappings which are to be inserted into the tree, possibly replacing any mapping for the key which already exists; either the key itself must be present in the insertion proof if it exists (update), or the two lexicographically closest nodes on either side if it does not (insert).
new_hash is the resulting 32-byte Merkle root hash after the insertion, updates, and deletes are performed, and
checksum is the initial 4 bytes of the SHA-256 hash of the preceding fields.
Just like inclusion proofs, an operational proof is encoded in base64 for display and transport. Here's an example operational proof:
-----BEGIN OPERATIONAL PROOF----- ATzPZheRnns6KfqBKzZs0dXLOxithdan2p18KgJ2c4O0LgARaIsVaQi/GdhOPOgA8p4Pu4PiEfEg lcmy3j7bOc7hXw0DLSeTjtqznBoQAAP92QcBMOS4reacrACzuZJbyP7fqIOf5VEk4iarG4+uPoZC oun8BztQMQBy0LHVeSY= -----END OPERATIONAL PROOF-----
Decoded and broken into its constituent fields:
0x01 # Level-compressed hashing # Original Merkle root: 0x3ccf6617919e7b3a29fa812b366cd1d5cb3b18ad85d6a7da9d7c2a02767383b4 # Serialized tree (included keys: '中本'): 0x2e0011688b156908bf19d84e3ce800f29e0fbb83e211f12095c9b2de3edb39ce 0xe15f0d032d27938edab39c1a100003fdd907 # Deletion list ['中本']: 0x01 0x30e4b8ade69cac # Insertion list : 0x00 # New Merkle root: 0xb3b9925bc8fedfa8839fe55124e226ab1b8fae3e8642a2e9fc073b50310072d0 # Checksum: 0xb1d57926
This BIP adds two new messages to the peer-to-peer protocol.
Retrieves an inclusion proof for zero or more prefixes of a given root hash. The message body consists of a 32-byte root header and a vector of prefixes. The response is the inclusion proof indicating the presence or absence of the requested prefixes. A 64-bit nonce is used to match requests to replies.
prefix := CVarInt(nbits(prefix)) || CFlatData(bits_to_string(prefix))
Retrieves an operational proof recording the transformation of an authenticated prefix tree between two states. The message body consists of the two 32-byte (before and after) hashes, and the response is the corresponding operational proof. A 64-bit nonce is used to match requests to replies.
getdelta := nonce || prev || next invdelta := nonce || operational_proof
This BIP adds two new API endpoints for retrieving inclusion and operational proofs.
A new command
getrawprefix is added, for the purpose of retrieving inclusion proofs. What is returned is the serialized inclusion proof, or an error if the root hash is not recognized.
getrawprefix root [prefixes]
A new command
getrawdelta is added, for the purpose of retrieving oerational proofs mapping one tree state to another. It takes the before and after root hashes and returns the serialized operational proof, or an error if the hashes do not constitute a known delta.
getrawdelta prev next