jachiang/TaprootDescriptorProposal.md

## TaprootDescriptorProposal.md

      
    Raw
  

              TaprootDescriptorProposal.md
            
          
    Taproot Descriptor Proposal

Status: Discussion
Co-Authors: Elichai Turkel (elichai.turkel@gmail.com), James Chiang (james.chiang@protonmail.com)
Output descriptor support in Bitcoin Core provides an intuitive language which simplifies how wallets determine which UTXOs they can sign and spend. A descriptor expression today expands to a single output script of a given output.
However, with the introduction of Taproot, a given output can now have multiple spending paths at different heights of the taproot tree. We wish to propose a taproot output descriptor which encapsulates both individual tapscripts and mid-level tapscript descriptors while encoding the intended taptree structure.
The proposed taproot descriptor design prioritizes:

Supporting current and foreseeable future descriptor language at the tapscript descriptor level.
Providing human-readable encoding of each tapscript depth in the taptree.
Enabling support for mid-level descriptor expressions which expand to multiple tapscripts (See multisig tapscript subscriptors below).
Maintaining a 1-to-1 mapping between taproot descriptor and output script.

The design achieves the goal of providing a unique descriptor for given output script at the minor cost of excluding certain tree structures with equivalent tapscript leaf depths. Since we believe the user is mostly interested in cost(tapscript depth) this appears to be a reasonable trade-off.
Taproot Descriptor Encoding.

The encoding of taproot output information required for solvability and spendability requires both tapscript and tree structure information to be encoded in the descriptor.
The taproot descriptor encapsulates a sequence of tapscript depth/descriptor pairs ordered first by leaf depth and then lexicographically (tapscript descriptor string).
tap([depth₀, descriptor₀(keys)], [depth₁, descriptor₁(keys)], ... , [depth_i, descriptor_i(keys)])
Description:

There can only be one encapsulating tap() descriptor for a given taproot output.
The taproot descriptor encapsulates a series of ordered [depth, tapscript descriptor] tuples.
The tapscript descriptors can be any valid output descriptor.
The depth/tapscript tuples must be first ordered by leaf depth, and then lexicographically by descriptor string, which includes the descriptor keys or arguments.
Tapscript depths are valid only if they are solvable into a valid binary tree structure (Described below).

Examples:

tap([1, pk(key)], [2, pk(key)], [2, pk(key)])
tap([2, pk(key)], [2, pk(key)], [2, pk(key)], [2, raw(HEX)])
tap([2, pk(key)], [2, p2wpkh(key)], [2, p2wsh(pk(key))], [3, pk(key)], [3, pk(key)])

We illustrate the third example as a taptree below.
                 Taproot *   
                        / \ 
                       *   *
                      / \ / \
              pk(key) * | |  * 
                        | | / \
            p2wpkh(key) * | | | 
                          | | |
           p2wsh(pk(key)) * | |
                            | |
                    pk(key) * |
                              |
                      pk(key) *

Taproot Descriptor Decoding.

We introduce the following greedy algorithm to demonstrate the decoding from taproot descriptor to taptree.
tap([depth₀, descriptor₀], [depth₁, descriptor₁], [depth₂, descriptor₂], [depth₃, descriptor₃])
The tuples must be lexicographically ordered from left to right, first by depth then script (if descriptors are identical the ordering of them doesn't matter). We insert the tapscripts into the tree from left to right. The greedy algorithm individually inserts each tapscript leaf at the desired depth. It does so by conducting a left-biased traversal beginning at the root. If it cannot find an insertion location during its left-biased downwards traversal, it traverses back up the tree until an alternative downwards path can be taken.
Left-biased downwards traversal:

Traverse downwards through the binary tree on the left until the tapscript leaf can be inserted at the desired depth.

If a leaf is found while traversing down left, continue the left-biased downwards traversal from the right sibling.

If the right sibling is occupied by a leaf, begin the upwards traversal.


If you hit an empty left position at the desired height, insert there and return.
If a left leaf is preoccupied but the right is empty insert there.


Upwards traversal:

At every height during the upwards traversal:

If not already traversed, continue the left-biased downwards traversal with the right sibling.


This algorithm has been designed and implemented in Python by Elichai and can be found in the appendix.
Note: A given set of tapscript/depth pairs may be solveable into multiple different tree structures satisfying the same tapscript depth requirements. That is why we have proposed ordering tapscript descriptor/depth pairs by height and then by descriptor string above.
Multisig Tapscript Descriptor.

We propose to support higher-level tapscript descriptor expressions which expand to a specific sequence of output descriptors. For example, there are two obvious n-of-m multisig variants which imply a series of n-of-n tapscript descriptor permutations.
Non-interactive n-of-m multi:

multi(2, PK0, PK1, PK2) expands to:

multi(PK0+PK1) or <PK0> CHECKSIG <PK1> CHECKSIGADD 2 EQUAL
multi(PK0+PK2) or <PK0> CHECKSIG <PK2> CHECKSIGADD 2 EQUAL
multi(PK1+PK2) or <PK1> CHECKSIG <PK2> CHECKSIGADD 2 EQUAL


A resulting multi tapscript subscriptor without an n-index simply implies an n-of-n multisig output descriptor.
Also note that the ordering of the n-of-n multisig tapscripts results from the public key subset permutations, determined in order of increasing public key indices and following the sequence of the public keys expressed in the higher-level n-of-m multi descriptor. This ordering is important, as it facilitates the indication of the leaf heights of each n-of-n multi tapscript descriptor.
Since the number and order of n-of-n multisig descriptors (subset determination) resulting from an expanded multisig descriptor expression are known, we can simply indicate the depths of the individual tapscripts as a sequence preceding the n-of-m multisig descriptor:
For example:

tap([[2,2,1], multisig(2, PK0, PK1, PK2)])

Note that the element order in the depth tuple is not in decreasing order but instead corresponds to the respective n-of-n multi tapscripts from the expanded n-of-m multisig descriptor.
Interactive n-of-m musig:

musig(2, PK0, PK1, PK2) expands to:

musig(PK0+PK1) or <PK0+PK1> CHECKSIG
musig(PK0+PK2) or <PK0+PK2> CHECKSIG
musig(PK1+PK2) or <PK1+PK2> CHECKSIG


The same ordering principles of the public key subset permutations from the multisig descriptor apply here.
Taproot Descriptors with multisig expressions:
A taproot descriptor with both regular and multi tapscript descriptors always orders the multisig tapscript descriptors last, and in lexicographic string order if multiple n-of-m multisig/musig tapscript descriptors are featured.
Example:

tap([2, pk(key)], [3, pk(key)], [[2,3,2], multi(2, PK0, PK1, PK2)]
Expands to: tap([2, multi(PK0, PK1)], [2, multi(PK1, PK2)], [2, pk(key)], [3, multi(PK0, PK2)], [3, pk(key)])

This corresponds to the following taptree.
                 Taproot *   
                        / \ 
                       *   *
                      / \ / \
      multi(PK0, PK1) * | |  * 
                        | | / \
        multi(PK1, PK2) * | | | 
                          | | |
                  pk(key) * | |
                            | |
            multi(PK0, PK2) * |
                              |
                      pk(key) *

The ordering of the tapscripts in this example may be a little more complicated but follow the same ordering principles as previously described. In general, after expanding all multisig descriptor expressions into (n-of-n multi) tapscripts, all leaves in the tree are ordered from left to right by the depth and then lexicographically.
Appendix: Implementation of Taproot Decoding Algorithm

Input: A tap() descriptor string.

Parse the input into a list of tuples (depth, script).
Assert that the tuples are lexicographically ordered, first by depth then by script (if some are equal the ordering of them doesn't matter).
Loop through the ordered list and call tree.insert(script, depth).
The insertion algorithm(tree.insert(script, d)) basically inserts scripts from left to right while always trying to default to left whenever possible.

class Node:

    def __init__(self):
        self.left = None
        self.right = None
        self.data = None
        self.parent = None

    def insert(self, data, depth):
        operating = self
        for i in range(depth):
            if operating.is_leaf():
                operating.insert_right(data, depth - i)
                return
            operating = operating.get_left_child()

        if operating.data is None:
            operating.set(data)
        else:
            if operating.is_left():
                operating.get_right_sibling().insert(data, 0)
            elif operating.is_right():
                operating.insert_right(data, 0)

    def insert_right(self, data, depth):
        operating = self
        while operating.is_right():
            depth += 1
            operating = operating.parent
        operating.get_right_sibling().insert(data, depth)

    def is_right(self):
        return self is self.parent.right

    def get_left_child(self):
        if self.left is None:
            self.left = Node()
            self.left.parent = self
        return self.left

    def get_right_sibling(self):
        assert not self.is_right()
        if self.parent.right is None:
            self.parent.right = Node()
            self.parent.right.parent = self.parent
        return self.parent.right

    def is_left(self):
        return self is self.parent.left

    def is_leaf(self):
        return self.data is not None

    def set(self, data):
        assert self.right is None and self.left is None
        self.data = data


def main():
    root = Node()
    root.insert("Descriptor0", 2)
    root.insert("Descriptor1", 3)
    root.insert("Descriptor2", 3)
    root.insert("Descriptor3", 3)
    root.insert("Descriptor4", 3)
    root.insert("Descriptor5", 4)
    root.insert("Descriptor6", 4)
    root.insert("Descriptor7", 4)
    root.insert("Descriptor8", 4)

    print("Descriptor0 is depth 2 and should be 2 nodes to the left: ", root.left.left.data)
    print("Descriptor8 is depth 4 and because it was last it should be 4 nodes to the right:",
          root.right.right.right.right.data)


if __name__ == "__main__":
    main()