Kixunil/taproot_review.md

## taproot_review.md

      
    Raw
  

              taproot_review.md
            
          
    My review of the Taproot PR by Pieter Wuille

About

I have reviewed the Taproot PR and decided to write some notes.
This is a combination of notes that may be useful for other (less experienced) reviewers to understand the code
and possibly helpful for a bit wider technical audience to understand how Taproot works in better details.
It's definitely not intended for general public.
Summary/conclusion

I find the reviewed code in the PR very solid and I didn't find any serious flaw except for it using BIP9 activation logic.
Yes, the same logic that greatly contributed towards tension during the scaling debate, the need for a quite risky UASF movement and related problems.
Taproot is an important privacy improvement and it should be expected that certain entities will not like it and attempt to prevent it.
As such I find this PR inssuficien in its current form.
That being said the rest is very good and I'd like to thank Pieter and all other contributors, reviewers and testers for all their work.
Their contribution contains the core of the logic and the modification of activation rules should not be too much effort even for less experienced contributors.
Things excluded from my review:


Cryptographic code - I don't want to cause a false sense of security because of "Martin said it's correct".
Tests - I will try to write my own from scratch to test my understanding of BIPs.
This will probably be better if I don't see the existing tests.
Will revisit if I'm unable to write them myself.

Full log

Disclaimer: this is pretty much a brain dump that I wrote during review. Don't expect good structure or lack of typos.

TaggedHash takes a const string&,
which forces the callers to allocate, but it's fine because hahsers are constants
so the cost of allocation is only felt at the startup, where it doesn't matter so much.
(Rustceans would use &str, so that'd mean const char * or Span<const char> in C++.)
Policy changes add additional rules on top of consensus that only govern accepting transaction into mempool.
Annex is an extension mechanism reserved for future upgrades.
The meaning is not defined yet, so a transaction with it is non-standard.
Transaction using it won't be accepted into mempool but can be present in a block.
(It is ignored by the consensus rules.)
There's another extension mechanism called "leaf version".
The policy rules check that the specific, known, leaf version is used.
The transaction is non-standard if the version is unknown.
Note that consensus rules are not checked in policy, that's correct.
The exceptions are those required to avoid undefined behavior (e.g. checing array sizes).
BIP341 introduces a new type of public key: X-only public key.
The usual public keys are 64 bytes long because they contain X and Y coordinates from the elliptic curve.
However, the Y coordinate can be calculated from X, so it's not normally transferred over wire in full.
(Except for old bitcoin wallets.)
Only parity bit is transferred but working with 33 bits is annoying as most datastructures work with 32 bits.
Thus X-only public key defines a scheme to get rid of that one bit by defining the parity to always be even.
The core logic of this is implemented in libsecp256k1.
The Taproot PR defines a new class for X-only pubkey.
This class has two most important methods, one to verify the signature,
the other to verify that the pubky was constructed from another pubkey using specified tweak.
This is the core feature of taproot.
Note that signature is not verified in this case - only commitment to the script (MAST).
This is correct because in the case of revealing the script branch the signature is not present.
Interpreter is another core piece of code that was modified.
EvalChecksig is the old singature verification function and was renamed to a more appropriate name EvalChecksigPreTapscript.
An assert makes sure it's not called with Tapscript.
CheckSig was renamed to CheckECDSASignature too, which distinguishes it from Schnorr.
A completely new EvalChecksigTapscript function was added
which contains the core logic of signature verification in Tapscript.
It contains a bit surprising logic where the signature is not checked but the function
still returns true, indicating false in the success output variable.
The purpose is to implement OP_CHECKSIGADD, it should be more obvious when reading its implementation (seee below).
A new EvalChecksig function was added which dispatches to the appropriate implementation based on script version.
There's a possibly surprising change in logic checking max script size.
This may look like a hard fork (previously invalid size may be valid), however, it's not because in case of an unknown version the script would be blindly accepted without any checking by the caller.
As a result, this is actually a soft fork.
There's a bit strange line that looks like refactoring, but it's not.
The condition is unrealated to increment but it may be easy to miss from a glance.
The check of maximum number of non-push opcodes in Segwit v1 was removed.
This is according to BIP342
OP_CHECKSIGADD was added.
It correctly verifies the inputs (Tapscript, number of elements on stack >= 3) and takes the values from stack.
The order of items is in a bit strange order - signature is on the bottom, then a counter follows ended by pubkey.
The reason for such order is to allow efficient multisig implementation.
For pubkey script OP_0 OP_PUSH(pubkey_1) OP_CHECKSIGADD OP_PUSH(pubkey_2) OP_CHECKSIGADD ... OP_PUSH(pubkey_n) OP_CHECKSIGADD OP_PUSH(k) OP_NUMEQUALVERIFY
a sig script OP_PUSH(sig_1) OP_PUSH(sig_2) ... OP_PUSH(sig_k) interspersed with n - k instances of OP_0 can unlock the k of n multisig contract.
There are mostly trivial functions for hashing spent amounts and spent scripts.
I found the endianess of serialization of the amount to be not very obvious, so I checked it's little endian as required by the BIP.
The functions defined in consensus code are used so any unexpected changes should be caught quickly by tests.
PrecomputedTransactionData was modified to take Taproot into account.
One of the first things to point out when it comes to this change: don't mistake BIP134 with BIP341.
The function first scans what kinds of signatures are present in script and then it pre-computes the data needed to verify the signatures.
This is an optimization that avoids recomputing the same data for each input.
SignatureHashSchnorr function was added.
This calculates the hash of the transaction used in signing/verifying operation.
The hash_type is verified and written to the hasher, then other items according to BIP341 follow.
I have verified that the code matches the BIP regarding those rules.
(This is sometimes slightly confusing as BIP talks about various items separately, but they are used in the code together in a data sctructure.
e.g. line 1520)
The message is extended with other data defined in BIP132 in case of Tapscript.
This code matches as well.
The sha256 is returned at the end as expected.
CheckShnorrSignature is pretty straightforward.
It verifies the inputs as expected, according to BIP341 (including the rule about 65B signatures).
There was a pre-scan added to ExecuteWitnessScript
which makes an output spendable without other conditions if the script contains OP_SUCCESSx anywhere.
This is intentional design of the BIP which makes future upgrades easier.
(Especially without leaking the information about used script version into the commitment.)
The check of control size is not present in VerifyTaprootCommitment,
so I verified that it's present in the caller.
The compact size of script is present in the script structure, so the calculation of k is correct.
The following loop is pretty much literal translation of rules from BIP.
The remaining rules are then validated in CheckPayToContract.
If Taproot softfork is not activated then encountering witness version 1 must succeed to maintain consensus.
This is correctly implemented in VerifyWitnessProgram.
The code checking annex does not drop an empty item because an empty item doesn't begin with 0x50.
Checking key spend is pretty straightforward.
Checking script path spending looks more involved but it's mostly moving values around and verifying some inputs.
The leaf versioning logic is also implemented there together with the policy rule about non-standard leaf versions.
There are two more minor changes that make sure VerifyWitnessProgram takes the information about P2SH being used.
(Taproot is not defined for P2SH!)
Folloving the interpreter, there's bunch of rather trivial definitions of various item.
However I managed to find an interesting discrepancy between BIP114 and BIP341
I skipped reviewing libsecp256k1.
I'm not a professional cryptographer, so I'm not willing to create a false sense of security ("Martin said it's correct").
In my favorite file, validation.cpp, the caching logic was moved around a bit to accomodate for recent changes.
There's a part of activation logic as well - setting validation flags to verify Taproot if the soft fork was activated.
The activation logic uses BIP9.
I find this very dangerous as Taproot brings privacy benefits which could lead to political battle like during the scaling debate (SegWit used BIP9 activation).
BIP8 fixes the core issues with BIP9 activation logic, so it should be used instead.
I've skipped reviewing the tests for now as I'm seriously considering writing my own test from scratch (minus cryptography) purely based on the BIPs.
Doing so without my mind unaffected by seeing other tests could be more valuable.
If I'm unable to create such tests, I will reconsider reviewing the existing ones.