Here is a proposal for tapscript miniscript, as a list of changes w.r.t. existing P2WSH miniscript.
- `multi` is not available in tapscript miniscript.
- `multi_a` is added; it is not available in P2WSH miniscript. `multi_a(k,A,B,C,...,Z)` translates to `[A] CHECKSIG [B] CHECKSIGADD [C] CHECKSIGADD ... [Z] CHECKSIGADD <k> OP_NUMEQUAL`.
- `d:` remains available, and corresponds to the same script, but has different typing rules in tapscript miniscript (it always has the `u` type property). An alternative is introducing a separate fragment that maps to the same script with different typing rules, but we consider the added ambiguity worse.
- All key expressions are x-only when converting to script (and for hashing in `pk_h`).
- 64-character hex arguments are valid key expressions (x-only keys), in addition to 66-character ones (whose top byte is ignored on conversion to script).
- For witness size estimation, signatures are now 64 (or 65) bytes.
- Instead of ops/scriptsize/stacksize limits, tapscript miniscript has a unified "score" resource limit, which is deliberately more restrictive than what consensus/policy allow, but significantly simpler to reason about:
  - Every leaf fragment (`0`, `1`, `after`, `older`, `pk`, `pk_h`, `multi_a`, and the hashlocks) gets an associated integer "score" equal to the number of stack elements consumed by it, but at least 1 (so it is 1 for all except `pk_h`, which has 2, and `multi_a`, which has score n). All other fragments/wrappers have score 0.
  - The score of a miniscript is the sum of all scores in it.
  - A tapscript miniscript is valid when its score is 997 or below (the actual limit is to be determined). The idea is that every score increment corresponds to:
    - a maximum script size increment (somewhere around ~36 bytes for cases that push a pubkey plus opcodes to later aggregate it with other results). So a limit on score obviously keeps the maximum script size far below the (implicit tx-based) limit of 400000.
    - a maximum runtime stack size increment (of 1), while the number of stack elements that may be added during execution beyond those counted by the score is believed to be a constant (here assumed to be 3, to obtain the max score 997, with consensus limiting the maximum stack size upon entry to, or during, execution to 1000).
  - The ops, witness size, and stack element size limits don't affect us (they're never a bottleneck in miniscript), so the score limit alone should be sufficient to satisfy all policy and consensus resource limits.
  - This rule does outlaw a number of perfectly-valid constructions like `or_i(multi_a(990,...), a:multi_a(990,...))`. However, correct assessment of runtime stack size limitations is nontrivial to get right. Given concerns from hardware manufacturers about implementing this correctly, this score rule is much simpler to implement, and likely sufficient for almost all real-world use cases. If it really isn't, it could be relaxed later.
These rules apply on a per-miniscript basis. A `tr()` descriptor can hold multiple such miniscripts. Rules like no reuse of pubkeys only apply within one miniscript.
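To make the score rule concrete, here is a rough sketch of how it could be computed as a recursive sum over the expression tree. This is a sketch only: the tuple-based AST, fragment spellings, and helper names are hypothetical, and 997 is the provisional limit from the proposal.

```python
# Hypothetical AST: a node is (name, args), where args holds child nodes
# for combinators/wrappers, or raw arguments (k, keys) for leaves.
LEAVES = {"0", "1", "after", "older", "pk", "pk_h", "multi_a",
          "sha256", "hash256", "ripemd160", "hash160"}

def score(node):
    name, args = node
    if name in LEAVES:
        if name == "pk_h":
            return 2              # consumes a signature and a pubkey
        if name == "multi_a":
            return len(args) - 1  # n: one stack element per key (args[0] is k)
        return 1                  # every other leaf: max(consumed, 1) = 1
    # Combinators and wrappers themselves have score 0; sum over children.
    return sum(score(a) for a in args if isinstance(a, tuple))

def valid(node, limit=997):
    """Score-based validity check with the provisional 997 limit."""
    return score(node) <= limit
```

For example, `or_i(multi_a(2,A,B,C), a:pk(K))` would get score 3 + 1 = 4, comfortably under the limit.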
This is a change to miniscript, applicable both to P2WSH miniscript and tapscript miniscript (see further).
The existing fragments `and_v`, `and_b`, and `or_b` are extended to accept arbitrary numbers of arguments (at least 2; the maximum is perhaps an open question):

- `and_v(A,B,...,Z)` translates to `[A] [B] ... [Z]`.
- `and_b(A,B,C,...,Z)` translates to `[A] [B] BOOLAND [C] BOOLAND ... [Z] BOOLAND`.
- `or_b(A,B,C,...,Z)` translates to `[A] [B] BOOLOR [C] BOOLOR ... [Z] BOOLOR`.
Thus, `and_v(A,B,C)` corresponds to the same script as `and_v(and_v(A,B),C)` and `and_v(A,and_v(B,C))`. In addition, `and_b(A,B,C)` is equivalent to `and_b(and_b(A,B),C)` (but not to `and_b(A,and_b(B,C))`); the same holds for `or_b`. Despite mapping to the same script, these are distinct miniscripts. So when parsing from string, `and_v(and_v(A,B),C)` must be serialized back to the nested structure, as the alternative would break expectations of stability when round-tripping existing descriptors. Still, when parsing from script, the n-ary form is preferred over the nested binary form.
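These equivalences can be checked mechanically with a token-level sketch. The `"[A]"` placeholders stand for the subexpressions' compiled scripts, the helper names are hypothetical, and `or_b` behaves like `and_b` with `BOOLOR` in place of `BOOLAND`:

```python
# Token-level sketch of the n-ary translations (names hypothetical).

def and_v(*subs):
    """and_v(A,B,...,Z) -> [A] [B] ... [Z]"""
    return [t for s in subs for t in s]

def and_b(*subs):
    """and_b(A,B,...,Z) -> [A] [B] BOOLAND ... [Z] BOOLAND"""
    out = list(subs[0])
    for s in subs[1:]:
        out += list(s) + ["BOOLAND"]
    return out

A, B, C = ["[A]"], ["[B]"], ["[C]"]

# and_v is associative at the script level: all nestings coincide.
assert and_v(A, B, C) == and_v(and_v(A, B), C) == and_v(A, and_v(B, C))
# and_b only matches the left-nested binary form.
assert and_b(A, B, C) == and_b(and_b(A, B), C)
assert and_b(A, B, C) != and_b(A, and_b(B, C))
```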
It may be worthwhile to have similar n-ary `and`/`or` in the policy language, but that's orthogonal. Further, it's probably useful to have policy compilers output these n-ary fragments (either directly, or by preferring the compatible nesting structure and converting to n-ary as a post-processing step). This is all out of scope for this proposal.
This change is relevant in the context of tapscript miniscript, as without it, the optimal way of writing an n-of-n policy is an awkward nesting of `and_v` with `pk()` leaves. Making `and_v` n-ary also has an additional advantage: it avoids some ambiguity in how parsing from script should work (arguably by adding a third way of parsing, but a much more natural one).
This is a change that applies to descriptors in general, but with implications for miniscript ones (including tapscript miniscript).
- `raw(HEXSCRIPT)` becomes permitted inside `sh()` and `wsh()`, for P2(W)SH with a known script, but nothing else known about it.
- Inside `tr()` script tree expressions, `rawnode(HEXNODEHASH)` is added, for conveying the hash of a subtree (which could be a TapLeaf or a TapBranch hash), without a known way it was constructed.
- Inside `tr()` script tree expressions, `rawleaf(HEXSCRIPT[,LEAFVER])` is added, for conveying a leaf of the script tree with known script/leaf version. This differs from `rawnode()` in that the script is known. An open question is whether this should just be `raw()` instead (which would mean adding an optional LEAFVER argument to `raw()` inside `tr()`).
- `rawpkh(HEXPUBKEYHASH160)` and `rawpk_h(HEXPUBKEYHASH160)` are added in all places where `pkh` and `pk_h` are allowed (toplevel, in `sh`/`wsh`/`tr`, in miniscript). They represent a variant of `pkh()`/`pk_h()` where only the pubkey hash is known, and not the full (x-only) pubkey.
- `rawtr(KEY)` (PR already open) is added at the toplevel for encoding a taproot output with a known output key. Note that KEY can be any key expression, not just a hex key. This makes `rawtr()` different from all other `raw*` constructions in that its argument isn't hex-encoded bytes that can appear as-is on chain, but rather just bypasses the internal key tweaking operation.
`rawsh(HEXSCRIPTHASH)` or `rawwsh(HEXSCRIPTHASH)` are NOT added, because at the top level these are equivalent to `addr(ADDR)`, and `rawwsh` inside `sh()` can be done through `raw()` inside `sh()` instead. While these alternative constructions aren't as readable, there is no difference in information conveyed. The same argument does not apply to `rawtr()`, precisely because it accepts any KEY expression and not just a hex pubkey.
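To illustrate what `rawtr()` bypasses: with `tr(KEY)` the output key is the internal key tweaked by a BIP341 tagged hash, while `rawtr(KEY)` uses KEY directly as the output key. Below is a minimal sketch of the tagged-hash step only; the elliptic-curve addition `internal + t*G` that completes the tweak is omitted.

```python
import hashlib

def tagged_hash(tag: str, msg: bytes) -> bytes:
    """BIP340/341-style tagged hash: SHA256(SHA256(tag) || SHA256(tag) || msg)."""
    th = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(th + th + msg).digest()

# tr(KEY) with no script tree: t = tagged_hash("TapTweak", xonly(KEY)),
# output_key = xonly(KEY) + int(t)*G.  rawtr(KEY): output_key = xonly(KEY).
internal = bytes.fromhex(
    "50929b74c1a04954b78b4b6035e97a5e078a5a0f28ec96d547bfee9ace803ac0")
tweak = tagged_hash("TapTweak", internal)
```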
This introduces a provably-unspendable, derivable, backward-compatible KEY expression. It applies to descriptors in general (including miniscript), but is primarily useful as the first KEY argument to `tr()`, where unspendable keys are inevitable for conditions without an available key-spending path.
`unspend(HEXCHAINCODE)` is added; it is usable anywhere an xpub/xprv is allowed, under the condition that it is followed by a `/`. So for example `tr(unspend(0000...0000))` is not allowed, but `tr(unspend(0000...0000)/0)` or `tr(unspend(0000...0000)/1/*)` is. It is equivalent to writing a level-0 (master) xpub whose public key is (in compressed hex notation) 0250929b74c1a04954b78b4b6035e97a5e078a5a0f28ec96d547bfee9ace803ac0 (the provably unspendable pubkey H from BIP341), and whose chaincode is the provided 64-character argument to `unspend()`:
- It is backward compatible in the sense that it is equivalent to an xpub, which can already be serialized into PSBT and related code.
- It can be inferred back from PSBT/script data by recognizing the xpub's key as H, thereby revealing that it is in fact unspendable.
- It's compatible with ranged descriptors that each have their own internal unspendable key, as `unspend()` can be followed by `/*`. Software not supporting `unspend()` will just treat it as a regular xpub, and be able to derive from it.
- It is private in that the keys derived from such an `unspend()` are not recognizably unspendable without knowing the unspendable master they were derived from.
- The requirement of having a `/` after it means that the fixed H key never actually ends up as the derived pubkey, as that would be obviously linkable.
As with n-ary `and_v` etc., using `unspend()` does yield a distinct descriptor, which must be serialized back to its original form if parsed from a string.
A potential implementation concern is that the Bitcoin Core dummy signer should be able to detect keys derived from such an unspendable xpub and treat them as non-signable.
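As an illustration of the equivalence, the xpub corresponding to `unspend(CHAINCODE)` could be constructed as below. This is a sketch assuming mainnet xpub serialization; `b58check` and `unspend_xpub` are hypothetical helper names.

```python
import hashlib

# Compressed encoding of the provably unspendable pubkey H from BIP341.
H = bytes.fromhex(
    "0250929b74c1a04954b78b4b6035e97a5e078a5a0f28ec96d547bfee9ace803ac0")

def b58check(payload: bytes) -> str:
    """Base58Check-encode a payload (4-byte double-SHA256 checksum)."""
    alphabet = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
    data = payload + hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    n = int.from_bytes(data, "big")
    out = ""
    while n:
        n, r = divmod(n, 58)
        out = alphabet[r] + out
    # Preserve leading zero bytes as '1' characters.
    return "1" * (len(data) - len(data.lstrip(b"\x00"))) + out

def unspend_xpub(chaincode_hex: str) -> str:
    """Level-0 mainnet xpub with pubkey H and the given chaincode."""
    cc = bytes.fromhex(chaincode_hex)
    assert len(cc) == 32
    payload = (bytes.fromhex("0488b21e")  # mainnet xpub version bytes
               + b"\x00"                  # depth 0 (master)
               + b"\x00" * 4              # parent fingerprint
               + b"\x00" * 4              # child number
               + cc + H)
    return b58check(payload)
```

A wallet recognizing H in the decoded key field can then infer the unspendability and serialize back to the `unspend()` form.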
Following finalization of the MuSig2 BIP, we envision a `musig()` extension to descriptors, with the following semantics:
- `musig(KEY,KEY,...,KEY)` is a valid KEY expression, representing the MuSig2* key aggregate of the provided KEY expressions (which are treated as full XY keys if Y parity is available for them; even otherwise).
- `musig(KEY,KEY,...,KEY)/...` is a valid KEY expression if all KEY arguments are xpubs or derived from xpubs. It is interpreted as a synthetic xpub expression whose key is the MuSig2* aggregate of the provided keys, and whose chaincode is a fixed chosen constant (e.g. "MuSigMuSigMuSig..." or so).
- It is essential that the arguments are xpub expressions, which are designed to be derived from (as otherwise using pubkey expressions inside and outside a `musig()` simultaneously may allow an attacker to link the generated keys together).
- Ignoring the chaincode of the provided arguments simplifies the necessary corresponding PSBT changes, and isn't a security reduction (because the chaincode isn't actually necessary for security). The simplification is that at the PSBT level, the key-musig and xpub-musig constructions are essentially equivalent, as only the aggregation of pubkeys into a pubkey would need to be conveyed through a new field. Hardcoding the chaincode does mean signers should only permit such an aggregation to be provided for xpubs whose chaincode matches the hardcoded value; the alternative would mean that signers are required to sign for whatever chaincode is claimed in the PSBT, without validation. While that doesn't have obvious security problems, it feels unacceptably lax.
- An alternative here is letting the chaincode of such a `musig()` expression be a hash of the inputs' chaincodes. That's more obviously safe, but more complex too, as now all the participants' xpub chaincode data needs to go into the PSBT, along with requiring that verifiers track the chaincode data and verify it.
`musig()` is allowed inside `tr()` and `rawtr()`, both in their first (key path spend) argument and inside the scripts that follow, in `pk()`, `pk_k()`, or `multi_a()`. It is not allowed inside `pkh()` or `pk_h()`, because the 160-bit hash there would result in just 80-bit collision security (relevant for an attacker grinding their key to yield an output they can steal, when combined/aggregated with a cosigner's key). It's not available outside of `tr()`, because MuSig(2) depends on BIP340.
I'm poking at implementing Miniscript for Tapscript and I don't think we need to introduce a new "score" metric. I think we can simply estimate the upper bound on the stack size after the execution of a fragment, choosing the largest in case of a fragment containing an `OP_IF`. This is already what we are doing now for computing the number of executed opcodes depending on whether a `CHECKMULTISIG` is executed or not.

The motivation for the score metric is to put a conservative bound on 1) the script size, 2) the runtime stack size. The former can be statically computed. In fact it will very likely be computed by implementations for the purpose of fee estimation, and having a check that it is smaller than the maximum standard transaction size minus a few bytes is trivial. The latter is trickier, since we need to make sure that it never overflows when executing every single opcode within a fragment. And since the stack size is modified differently by each opcode, the initial stack size before executing a fragment may vary depending on which fragments were executed or not beforehand. But that's exactly like the current limit on the number of executed opcodes, which may vary depending on whether a `CHECKMULTISIG` was executed beforehand or not.

Therefore I suggest we simply generalize the "executed opcodes" approach we have now to take into account the runtime stack size. A `max_stack_size(node, main_start_size, alt_start_size)` function is introduced. It returns either a `(main_stack_size, alt_stack_size)` tuple corresponding to the stack sizes after executing the fragment, or `None` if the size limit was exceeded while executing it:

- For a leaf fragment: for each increase `n` of the stack size during the execution of the fragment, verify that `main_start_size + alt_start_size + n <= 1000`. Return the stack sizes after execution.
- For fragments executing both `A` and `B`: query `max_stack_size` on `A`, and then query it on `B` with the result values from the former as `main_start_size` and `alt_start_size`.
- For fragments where one of `A` and `B` is satisfied and the other dissatisfied: return the largest between the stack size after dissat(A) + sat(B) and the stack size after sat(A) + dissat(B).
- `andor(A, B, C)`: combine the two from above, the largest between satisfying A + B and satisfying C.
- `thresh(k, A, B, ..., Z)`: return the largest between all possible satisfactions of `k` subs. It's more expensive, but not more than what exists today.

This approach does not introduce a new metric, doesn't introduce new complexity (both in logic and computation), and allows for much finer-grained conservative bounds. The `or_i(multi_a(990,...), a:multi_a(990,...))` miniscript given above would for instance be accepted under this rule.

What do you think?
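Not part of the comment itself, but the recursion it describes could be sketched roughly as follows, under a heavily simplified node model: each leaf reports its intermediate stack growth (`peaks`) and its net effect (`delta`), the alt stack is left untouched, and the sat/dissat case analysis is collapsed into a worst-case choice between branches (all names hypothetical):

```python
LIMIT = 1000  # consensus bound on the stack size during tapscript execution

def max_stack_size(node, main_start_size, alt_start_size):
    """Return (main, alt) stack sizes after executing `node`, or None if
    the limit could be exceeded at some intermediate point."""
    kind = node["kind"]
    if kind == "leaf":
        for n in node["peaks"]:  # intermediate stack-growth values
            if main_start_size + alt_start_size + n > LIMIT:
                return None
        return main_start_size + node["delta"], alt_start_size
    if kind == "both":           # A then B always execute (and_v-like)
        r = max_stack_size(node["a"], main_start_size, alt_start_size)
        return max_stack_size(node["b"], *r) if r else None
    if kind == "either":         # worst case over the two execution paths
        ra = max_stack_size(node["a"], main_start_size, alt_start_size)
        rb = max_stack_size(node["b"], main_start_size, alt_start_size)
        return None if ra is None or rb is None else max(ra, rb)
    raise ValueError(f"unknown kind: {kind}")
```

A real implementation would additionally track alt-stack movements (for `a:` wrappers) and distinguish satisfaction from dissatisfaction per branch, as the comment outlines.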
EDIT: I don't think we should be concerned about the stack size actually. So it's even easier. Leaving as is until I'm done implementing it.

EDIT 2: actually, we should. :)