glozow/package_mempool_accept_post.md

## package_mempool_accept_post.md

      
    Raw
  

              package_mempool_accept_post.md
            
          
    Package Mempool Accept Post

Hi there,
I'm writing to propose a set of mempool policy changes to enable package validation (in preparation for package relay) in Bitcoin Core. These would not be consensus or P2P protocol changes. However, since mempool policy significantly affects transaction propagation, I believe this is relevant for the mailing list.
My proposal enables packages consisting of multiple parents and 1 child. If you develop software that relies on specific transaction relay assumptions and/or are interested in using package relay in the future, I'm very interested to hear your feedback on the utility or restrictiveness of these package policies for your use cases.
A draft implementation of this proposal can be found in Bitcoin Core PR#22290.
An illustrated version of this post can be found at https://gist.github.com/glozow/dc4e9d5c5b14ade7cdfac40f43adb18a.
Background

Feel free to skip this section if you are already familiar with mempool policy and package relay terminology.
Terminology Clarifications


Package = an ordered list of related transactions, representable by a Directed Acyclic Graph.


Package Feerate = the total modified fees divided by the total virtual size of all transactions in the package.

Modified fees = a transaction's base fees + fee delta applied by the user with prioritisetransaction. As such, we expect this to vary across mempools.
Virtual Size = the maximum of virtual sizes calculated using BIP141 virtual size and sigop weight. Implemented here in Bitcoin Core.
Note that feerate is not necessarily based on the base fees and serialized size.


Fee-Bumping = user/wallet actions that take advantage of miner incentives to boost a transaction's candidacy for inclusion in a block, including Child Pays for Parent (CPFP) and BIP125 Replace-by-Fee (RBF). Our intention in mempool policy is to recognize when the new transaction is more economical to mine than the original one(s) but not open DoS vectors, so there are some limitations.


Policy

The purpose of the mempool is to store the best (to be most incentive-compatible with miners,  highest feerate) candidates for inclusion in a block. Miners use the mempool to build block templates. The mempool is also useful as a cache for boosting block relay and validation performance, aiding transaction relay, and generating feerate estimations.
Ideally, all consensus-valid transactions paying reasonable fees should make it to miners through normal transaction relay, without any special connectivity or relationships with miners. On the other hand, nodes do not have unlimited resources, and a P2P network designed to let any honest node broadcast their transactions also exposes the transaction validation engine to DoS attacks from malicious peers.
As such, for unconfirmed transactions we are considering for our mempool, we apply a set of validation rules in addition to consensus, primarily to protect us from resource exhaustion and aid our efforts to keep the highest fee transactions. We call this mempool policy: a set of (configurable, node-specific) rules that transactions must abide by in order to be accepted into our mempool. Transaction "Standardness" rules and mempool restrictions such as "too-long-mempool-chain" are both examples of policy.
Package Relay and Package Mempool Accept

In transaction relay, we currently consider transactions one at a time for submission to the mempool. This creates a limitation in the node's ability to determine which transactions have the highest feerates, since we cannot take into account descendants (i.e. cannot use CPFP) until all the transactions are in the mempool. Similarly, we cannot use a transaction's descendants when considering it for RBF. When an individual transaction does not meet the mempool minimum feerate and the user isn't able to create an replacement transaction directly, it will not be accepted by mempools.
This limitation presents a security issue for applications and users relying on time-sensitive transactions. For example, Lightning and other protocols create UTXOs with multiple spending paths, where one counterparty's spending path opens up after a timelock, and users are protected from cheating scenarios as long as they redeem on-chain in time. A key security assumption is that all parties' transactions will propagate and confirm in a timely manner. This assumption can be broken if fee-bumping does not work as intended.
The end goal for Package Relay is to consider multiple transactions at the same time, e.g. a transaction with its high-fee child. This may help us better determine whether transactions should be accepted to our mempool, especially if they don't meet fee requirements individually or are better RBF candidates as a package. A combination of changes to mempool validation logic, policy, and transaction relay allows us to better propagate the transactions with the highest package feerates to miners, and makes fee-bumping tools more powerful for users.
The "relay" part of Package Relay suggests P2P messaging changes, but a large part of the changes are in the mempool's package validation logic. We call this Package Mempool Accept.
Previous Work


Given that mempool validation is DoS-sensitive and complex, it would be dangerous to haphazardly tack on package validation logic. Many efforts have been made to make mempool validation less opaque (see #16400, #21062, #22675, #22796).
#20833 Added basic capabilities for package validation, test accepts only (no submission to mempool).
#21800 Implemented package ancestor/descendant limit checks for arbitrary packages. Still test accepts only.
Previous package relay proposals (see #16401, #19621).

Existing Package Rules

These are in master as introduced in #20833 and #21800. I'll consider them as "given" in the rest of this document, though they can be changed, since package validation is test-accept only right now.


A package cannot exceed MAX_PACKAGE_COUNT=25 count and MAX_PACKAGE_SIZE=101KvB total size (#20833)
Rationale: This is already enforced as mempool ancestor/descendant limits. Presumably, transactions in a package are all related, so exceeding this limit would mean that the package can either be split up or it wouldn't pass this mempool policy.


Packages must be topologically sorted: if any dependencies exist between transactions, parents must appear somewhere before children. (#20833)


A package cannot have conflicting transactions, i.e. none of them can spend the same inputs. This also means there cannot be duplicate transactions. (#20833)


When packages are evaluated against ancestor/descendant limits in a test accept, the union of all of their descendants and ancestors is considered. This is essentially a "worst case" heuristic where every transaction in the package is treated as each other's ancestor and descendant. (#21800)


Packages for which ancestor/descendant limits are accurately captured by this heuristic: 
There are also limitations such as the fact that CPFP carve out is not applied to package transactions. #20833 also disables RBF in package validation; this proposal overrides that to allow packages to use RBF.
Proposed Changes

The next step in the Package Mempool Accept project is to implement submission to mempool, initially through RPC only. This allows us to test the submission logic before exposing it on P2P.
Summary


Packages may contain already-in-mempool transactions.
Packages are 2 generations, Multi-Parent-1-Child.
Fee-related checks use the package feerate. This means that wallets can create a package that utilizes CPFP.
Parents are allowed to RBF mempool transactions with a set of rules similar to BIP125. This enables a combination of CPFP and RBF, where a transaction's descendant fees pay for replacing mempool conflicts.

There is a draft implementation in #22290. It is WIP, but feedback is always welcome.
Details

Packages May Contain Already-in-Mempool Transactions

A package may contain transactions that are already in the mempool. We remove ("deduplicate") those transactions from the package for the purposes of package mempool acceptance. If a package is empty after deduplication, we do nothing.
Rationale: Mempools vary across the network. It's possible for a parent to be accepted to the mempool of a peer on its own due to differences in policy and fee market fluctuations. We should not reject or penalize the entire package for an individual transaction as that could be a censorship vector.
Packages Are Multi-Parent-1-Child

Only packages of a specific topology are permitted. Namely, a package is exactly 1 child with all of its unconfirmed parents. After deduplication, the package may be exactly the same, empty, 1 child, 1 child with just some of its unconfirmed parents, etc. Note that it's possible for the parents to be indirect descendants/ancestors of one another, or for parent and child to share a parent, so we cannot make any other topology assumptions.
Rationale: This allows for fee-bumping by CPFP. Allowing multiple parents makes it possible to fee-bump a batch of transactions. Restricting packages to a defined topology is also easier to reason about and simplifies the validation logic greatly. Multi-parent-1-child allows us to think of the package as one big transaction, where:

Inputs = all the inputs of parents + inputs of the child that come from confirmed UTXOs
Outputs = all the outputs of the child + all outputs of the parents that aren't spent by other transactions in the package

Examples of packages that follow this rule (variations of example A show some possibilities after deduplication): 
Fee-Related Checks Use Package Feerate

Package Feerate = the total modified fees divided by the total virtual size of all transactions in the package.
To meet the two feerate requirements of a mempool, i.e., the pre-configured minimum relay feerate (minRelayTxFee) and dynamic mempool minimum feerate, the total package feerate is used instead of the individual feerate. The individual transactions are allowed to be below feerate requirements if the package meets the feerate requirements. For example, the parent(s) in the package can have 0 fees but be paid for by the child.
Rationale: This can be thought of as "CPFP within a package," solving the issue of a parent not meeting minimum fees on its own. This allows L2 applications to adjust their fees at broadcast time instead of overshooting or risking getting stuck/pinned.
We use the package feerate of the package after deduplication.
Rationale:  It would be incorrect to use the fees of transactions that are already in the mempool, as we do not want a transaction's fees to be double-counted for both its individual RBF and package RBF.
Examples F and G show the same package, but P1 is submitted individually before the package in example G. In example F, we can see that the 300vB package pays an additional 200sat in fees, which is not enough to pay for its own bandwidth (BIP125#4). In example G, we can see that P1 pays enough to replace M1, but using P1's fees again during package submission would make it look like a 300sat increase for a 200vB package. Even including its fees and size would not be sufficient in this example, since the 300sat looks like enough for the 300vB package. The calculcation after deduplication is 100sat increase for a package of size 200vB, which correctly fails BIP125#4. Assume all transactions have a size of 100vB.

Package RBF

If it meets feerate requirements, the package can replace mempool transactions if any of the parents conflict with mempool transactions. The child cannot conflict with any mempool transactions. Multiple transactions can replace the same transaction, but in order to be valid, none of the transactions can try to replace an ancestor of another transaction in the same package (which would thus make its inputs unavailable).
Rationale: Even if we are using package feerate, a package will not propagate as intended if RBF still requires each individual transaction to meet the feerate requirements.
We use a set of rules almost identical to BIP125 as follows:
Signaling (Rule #1)

All mempool transactions to be replaced must signal replaceability.
Rationale: Package RBF signaling logic should be the same for package RBF and single transaction acceptance. This would be updated if single transaction validation moves to full RBF.
New Unconfirmed Inputs (Rule #2)

All transactions in the package, including those that do not directly conflict with any mempool transactions, only include an unconfirmed input if that input was included in one of the directly conflicting transactions or is from another transaction in the package.
Absolute Fee (Rule #3)

The package must increase the absolute fee of the mempool, i.e. the total fees of the package must be higher than the absolute fees of the mempool transactions it replaces. Combined with the CPFP rule above, this differs from BIP125 Rule #3 - an individual transaction in the package may have lower fees than the transaction(s) it is replacing. In fact, it may have 0 fees, and the child pays for RBF.
Feerate (Rule #4)

The package must pay for its own bandwidth; the package feerate must be higher than the replaced transactions by at least minimum relay feerate (incrementalRelayFee). Combined with the CPFP rule above, this differs from BIP125 Rule #4 - an individual transaction in the package can have a lower feerate than the transaction(s) it is replacing. In fact, it may have 0 fees, and the child pays for RBF.
Total Number of Replaced Transactions (Rule #5)

The package cannot replace more than 100 mempool transactions. This is identical to BIP125 Rule #5.
Always Try Individual Submission First

We should always try to submit the individual transactions first, and then use package validation for the transactions leftover. Not doing so wouldn't break package validation, but could be incentive-incompatible.
Rationale: Packages are intended for incentive-compatible fee-bumping. That is, transaction A is a legitimate fee-bump transaction B only if A is a descendant of B and has a higher feerate. One way of checking this would be to analyze the fees and relationships between all transactions in the package, but submitting them individually first is much simpler.
Examples Q1 and Q2 show that a naive application of package feerate can cause "parent pays for child" behavior. Assume that all transactions are 100vB; both have a package feerate of 2.5sat/vB, but with very little contribution from the child transactions. In Q1, just accepting P1 gets us all the fees without the extra 100vB from P2, so P2 should be rejected. In Q2, P3 contributes a meager 50sat but is 100vB, so it should be rejected. Note an edge case in Q2: submitting each transaction individually doesn't help since P1 has 0sat and P2 requires P1 in order to be valid. Luckily, P1+P2 is a valid package, and can be broadcast as such.

Examples R1 and R2 show that a naive application of package feerate can cause "siblings pay for each other" behavior. Assume that all transactions are 100vB. R1 barely scrapes past the mempool minimum feerate with a package feerate of 1sat/vB; we could accept the package as is, but P2+P3 would be evicted very quickly from the mempool (P2 has a descendant feerate of 0.5sat/vB).
In R2, the package feerate would be enough to replace both M1 and M2, but P1 pays the majority of the fees. A miner can collect the same amount of fees by mining P1+M2 as opposed to P1+P2. Mining P1+P2+P3 instead of P1+M2 adds an additional 100sat, but requires relaying an additional 200vB worth of transactions - it doesn't pay for its bandwidth. We easily enforce this by accepting P1 individually first, then package validating P2+P3.

Rationale: We can fail faster. There's no need to run package validation if the first transaction fails signature checks.
Rationale: Package validation should only increase a transaction's chances of getting into the mempool; we want to prevent subtle bugs where a slight inconsistency between single and package policies causes package relay to create censorship vectors for individual transactions.
Expected FAQs


Is it possible for only some of the package to make it into the mempool?
Yes, it is. In the case where it would be more economical to accept some transactions but not others, we will do so. If people are using packages for fee-bumping (i.e. the child sponsors the fees of the parents that otherwise don't make it into the mempool), the most common scenario would be all-or-nothing.


Should we allow packages to contain already-confirmed transactions?
No, for practical reasons. In mempool validation, we actually aren't able to tell with 100% confidence if we are looking at a transaction that has already confirmed, because we look up inputs using a UTXO set. If we have historical block data, it's possible to look for it, but this is inefficient, not always possible for pruning nodes, and unnecessary because we're not going to do anything with the transaction anyway. As such, we already have the expectation that transaction relay is somewhat "stateful" i.e. nobody should be relaying transactions that have already been confirmed. Similarly, we shouldn't be relaying packages that contain already-confirmed transactions.


Acknowledgements

Thanks to John Newbery, Dave Harding, and Mark Erhardt for the help in writing this document.