@sdaftuar
Last active August 22, 2018
General observations

  1. It's hopeless to hide the dandelion relay mapping (inbound_peer -> outbound_peer) from a determined adversary. This is noted in the paper, and there's some analysis bounding the precision of an attacker who has full knowledge of the dandelion routing.
  2. Without a mempool-like construct, dandelion routing would be trivial to DoS attack. Transactions that could be routed via dandelion but not accepted to the mempool would pose a DoS risk, as an attacker could use bandwidth without incurring cost (such as transaction fees).
    • An alternate approach could be to come up with other relay limits on dandelion transactions that bound the free relay/bandwidth attack, such as rate-limiting spends from utxos (Matt's idea). It's not clear that rate-limiting utxo spends is sufficient (my recollection from when we analyzed the old priority rate limiter is that there are too many utxos to really rely on this to prevent relay attacks, especially since creating new utxos is pretty cheap), but perhaps we could come up with some other clever metric that bounds relay attacks while still allowing "normal" transactions to propagate well.
    • Short of coming up with something totally new, requiring new transactions to pass our existing anti-DoS rules for the mempool seems like a more practical solution.
  3. Dandelion relay should be a full replacement for our existing relay system; we shouldn't drop anything on the floor in situations where relay would generally have succeeded in the past (timing differences should generally be tolerable).
    • So in particular, Dandelion relay should support transaction chains in some capacity, as well as BIP 125 replacement.
    • However, transaction chains seem to have some fundamental relay issues with Dandelion: route shuffling can happen at any time, anywhere in the path, breaking relay if parents are routed via a different path than their children. This seems to impose a real constraint on wallet relay behavior (wallets should queue their transactions themselves until parents are in the mempool). We can also build some buffers into the dandelion relay to make sure that chained transactions don't break due to mempool timing differences.
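Rate-limiting spends from utxos could be sketched as a per-outpoint token bucket. This is a hypothetical illustration (the class, parameter names, and default rates are all invented here, not Bitcoin Core code), and as noted above it likely isn't sufficient on its own, since fresh utxos are cheap to create:

```python
import time

class UtxoSpendRateLimiter:
    """Hypothetical sketch of per-utxo spend rate limiting (not Bitcoin Core
    code). Each outpoint gets a token bucket: relaying a spend consumes a
    token, and tokens refill slowly. An attacker reusing the same utxos is
    throttled, but creating new utxos sidesteps the limiter entirely."""

    def __init__(self, capacity=4, refill_per_sec=1 / 600.0, now=time.time):
        self.capacity = capacity          # max burst of spends per outpoint
        self.refill = refill_per_sec      # tokens regained per second
        self.now = now                    # injectable clock for testing
        self.buckets = {}                 # outpoint -> (tokens, last_update)

    def allow_spend(self, outpoint):
        tokens, last = self.buckets.get(outpoint, (self.capacity, self.now()))
        t = self.now()
        tokens = min(self.capacity, tokens + (t - last) * self.refill)
        if tokens < 1:
            self.buckets[outpoint] = (tokens, t)
            return False                  # rate limit hit: don't relay
        self.buckets[outpoint] = (tokens - 1, t)
        return True
```

A real limiter would also need to bound the size of the buckets map itself, since the attacker controls how many distinct outpoints appear.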

Proposal

Keep a global stempool -- an additional, private mempool -- with the following semantics:

  1. When new transactions are added to the mempool via legacy/fluff relay, attempt to add them to the stempool as well. (Don't try to add things that fail mempool acceptance.)
  2. Whenever transactions are removed from the mempool for any reason, remove them from the stempool as well (along with any descendants, as usual, to keep the stempool consistent).
  3. The stempool gets its own memory limits - discussed below.
  4. When dandelion transactions arrive, try to add to mempool via test_accept:
    1. If success, then try to add to stempool:
      • If the transaction is already in the stempool, drop it on the floor (don't relay or send reject message).
      • If stempool acceptance succeeds, relay on to next dandelion hop (and set embargo) or fluff, as per the dandelion algorithm.
      • If stempool acceptance fails, then add to mempool and fluff (as though this were a legacy-relayed transaction).
    2. Else if mempool is missing inputs:
      • Add to per-peer orphan map, a memory limited data structure. Reprocess as dandelion transaction when parents arrive in mempool.
    3. Else if mempool acceptance fails for some other reason (feerate, invalid sigs, etc):
      • Just drop it on the floor and do not relay.
  5. For transactions added to stempool (and relayed to next dandelion hop):
    1. When the embargo timer fires: if the transaction is not yet in the mempool, try to accept it to the mempool, and fluff on success. If mempool acceptance fails, remove the transaction and its descendants from the stempool and orphan pool.
  6. For dandelion transactions added to orphan map:
    1. Either on a timer (same as embargo timer?) or on mempool signal (checking for parents becoming available): check to see if all parents are now available in mempool. If so, reprocess as a new dandelion transaction, and either drop on the floor or relay as usual. (Does it matter how we do this?)
  7. The stempool gets its own memory limit, equal to the mempool limit, and accounts for all transactions in the stempool against its memory limit (thus double-counting transactions that are also in the mempool, which are counted against the mempool's memory limit).
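The acceptance flow in step 4 above can be sketched as follows. This is a hedged Python sketch, not Bitcoin Core code: FakePool, FakeOrphans, and all method names are invented stand-ins for the real machinery (CTxMemPool, AcceptToMemoryPool):

```python
# Hedged sketch of proposal step 4; all names here are invented stand-ins,
# not Bitcoin Core APIs.
MISSING_INPUTS = "missing-inputs"

class FakePool:
    def __init__(self, accept_ok=True, reason=None):
        self.txs = set()
        self.accept_ok = accept_ok
        self.reason = reason
    def test_accept(self, tx):
        # Validate without adding (stand-in for a test-accept mode).
        return (self.accept_ok, self.reason)
    def contains(self, tx):
        return tx in self.txs
    def accept(self, tx):
        if self.accept_ok:
            self.txs.add(tx)
        return self.accept_ok

class FakeOrphans:
    def __init__(self):
        self.entries = []   # memory-limited, per-peer map in reality
    def add(self, peer, tx):
        self.entries.append((peer, tx))

def handle_dandelion_tx(tx, mempool, stempool, orphans, peer):
    ok, reason = mempool.test_accept(tx)
    if ok:
        if stempool.contains(tx):
            return "drop"       # already seen: no relay, no reject message
        if stempool.accept(tx):
            return "stem"       # relay to next dandelion hop, set embargo
        mempool.accept(tx)
        return "fluff"          # stempool rejected: treat as legacy relay
    if reason == MISSING_INPUTS:
        orphans.add(peer, tx)   # reprocess when parents reach the mempool
        return "orphan"
    return "drop"               # feerate / invalid sigs / etc.: drop silently
```

Note how a mempool-valid transaction that fails stempool acceptance is fluffed rather than dropped, matching step 4.1's last bullet.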

Analysis

  1. The stempool and mempool can generally be out of sync, but any stempool-only transactions should eventually be tested for mempool acceptance and discarded if mempool acceptance fails.
    1. New dandelion transactions are temporarily added to stempool and then should be flushed later if mempool acceptance fails. However, accepting a new dandelion transaction could cause some other stempool transaction to be evicted from the stempool but not the mempool. So the stempool can be missing mempool transactions.
    2. Because the stempool also removes anything that is removed from the mempool, nothing should stay in the stempool after mempool eviction. So the stempool should not contain anything outside the mempool other than embargoed transactions.
    3. Since the stempool will generally just be missing transactions, this means that we might prematurely fluff transactions that depend on in-mempool but stempool-evicted transactions, but this seems like not a big concern.
    4. Free relay is prevented by only dandelion-relaying transactions that successfully get into the stempool (and would be admitted to the mempool), and only fluffing transactions that successfully get into the mempool. In the event of a bandwidth attack where a node is flooded with transactions, the rising stempool min fee would ratchet up the cost of maintaining the attack.
  2. Even though we cannot prevent a dedicated attacker from learning the dandelion relay mapping, it's important that we do not leak the presence of transactions in the stempool to anyone other than our inbound and corresponding outbound edge for that transaction (doing otherwise would defeat the purpose of dandelion routing). Consequently, any transaction whose parents are only in the stempool must appear, to any other peer, to be treated the same as an orphan transaction (otherwise, chaining a child transaction would leak whether the parents are in the stempool) -- even if it is invalid for stempool acceptance (due to feerate, policy limits, or a consensus rule violation).
    1. Note that by fluffing transactions which are stempool-invalid but mempool-valid, we're introducing a possible source of information leakage about the presence of stempool transactions. If it were possible for an arbitrary transaction to have its stempool presence tested by relay of some other transaction via dandelion, then this could be a problem. I think there are limited situations where it might be possible -- e.g. if you and I both have outputs from a common unconfirmed parent, and you create a transaction spending your output, then once I learn of your transaction I could test its presence in a given node's stempool by creating a just-below-the-package-limit transaction spending my output and seeing whether it gets immediately fluffed. But I haven't come up with a general attack along these lines, so my guess is we can live with this.
  3. As mentioned above, routing transaction chains through dandelion before parents are available in the mempool may not be worth the effort (due to changing dandelion routes causing propagation issues). However we could consider a different algorithm in the case of a dandelion transaction having inputs that are only in the stempool but not the mempool:
    1. Try to add to the stempool. If that succeeds, and if all stempool-only parents (but not mempool parents) came from the same sending peer and were all forwarded to the same next hop (so no dandelion route shuffle in between), then continue with dandelion relay, computing the embargo time as starting from max(parent_embargo_time) over all stempool-only parents.
    2. If adding to the stempool fails, or not all stempool parents were received from the same sender, or not all stempool parents were forwarded to the same next hop, then treat it as an orphan transaction and put it in the memory-limited orphan map.
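The chained-relay rule above might look like the following sketch. All names are hypothetical, and the exact embargo computation -- here taken as the max over the stempool-only parents' embargo times and the child's own base embargo -- is an assumption about what "starting from max(parent_embargo_time)" means:

```python
# Hypothetical sketch of the chained-dandelion decision above.
# parent_info maps a stempool-only parent txid ->
#   (from_peer, next_hop, embargo_time); mempool parents are excluded.
def chained_stem_decision(child_from_peer, stem_parents, parent_info,
                          base_embargo):
    if not stem_parents:
        # No stempool-only parents: ordinary dandelion relay.
        return ("stem", base_embargo)
    froms = {parent_info[p][0] for p in stem_parents}
    hops = {parent_info[p][1] for p in stem_parents}
    if froms == {child_from_peer} and len(hops) == 1:
        # Same sender and same next hop for every stempool-only parent:
        # safe to continue on the stem; inherit the latest parent embargo.
        embargo = max(base_embargo,
                      max(parent_info[p][2] for p in stem_parents))
        return ("stem", embargo)
    # Route shuffle (or mixed senders) in between: fall back to orphan map.
    return ("orphan", None)
```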

Memory limiting the stempool

The stempool should be big enough to hold everything in the mempool, along with some new, not-yet-fluffed transactions.

  1. If it's smaller than the mempool, then in the event of a transaction spike, lower-feerate transactions which would be eligible for the mempool might stop being eligible for the stempool, as its min_fee goes up. This would break dandelion relay for lower-feerate transactions, which is undesirable.
  2. If it's bigger than the mempool, then a stream of transactions that would cause the mempool's minimum feerate to go up may not cause a fee increase in the stempool. This introduces potential for a free-relay attack, where an adversary sends a stream of transactions that would be accepted to the mempool if processed individually, but taken in the aggregate would be rejected due to mempool limiting, while the stempool would relay at the lower feerate. So we should bound how much bigger than the mempool the stempool can be, to limit the size of this kind of attack.

Thus we propose giving the stempool its own memory limit, equal to the mempool's (perhaps 150MB or 200MB each), counting every transaction it contains against that limit. Since mempool transactions are admitted to the stempool, this will generally double-count memory (a lot!).

TODO: estimate how much smaller the mempool could be without materially affecting fee estimation over long time horizons.
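A toy model illustrates the first sizing problem above: with a stempool smaller than the mempool, a transaction that is still mempool-eligible can fail stempool acceptance once the stempool's min fee rises. This sketch uses a deliberately simplified pool (invented names, flat transactions, no chains or block-template interactions):

```python
# Toy pool: evict lowest feerate when full; once full, only transactions
# above the lowest held feerate get in (a crude min-fee stand-in).
class TinyPool:
    def __init__(self, limit):
        self.limit = limit
        self.entries = []   # list of (feerate, size)

    def used(self):
        return sum(s for _, s in self.entries)

    def min_fee(self):
        if self.used() < self.limit:
            return 0
        return min((f for f, _ in self.entries), default=0)

    def try_add(self, feerate, size):
        if feerate <= self.min_fee():
            return False
        self.entries.append((feerate, size))
        self.entries.sort()
        while self.used() > self.limit:
            self.entries.pop(0)   # evict lowest-feerate entry
        return True
```

With a 100-unit mempool and a 50-unit stempool fed the same transactions, the stempool keeps only the top half by feerate, so its min fee climbs above the mempool's and mid-feerate dandelion relay breaks.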

Discussion

This proposal is wasteful in the sense that the stempool should generally not actually use anywhere near that much memory, since much of its contents is shared with the mempool; but it ensures that the stempool has enough room to grow even when the mempool is full:

  1. Suppose we have a mempool that is full with a 200MB limit, and a stempool with a 50MB limit (also full, since it contains the highest-feerate elements from the mempool), and then 100MB of high-feerate transactions (with no chains/replacements) arrive. We want to be able to relay all 100MB of transactions and eventually clear out our current mempool in favor of the new transactions. This seems not achievable if our stempool is limited to a size much smaller than our mempool -- instead the stempool would impose an increasing min relay fee as transactions arrive, and not all of the new transactions would get relayed, even though they should all get into our mempool.
  2. An alternate stempool interpretation might be to try to use it in a way where it only contains the transactions in flight, and not the transactions that are already in the mempool -- then it would be reasonable to limit this to a smaller number that reflects the amount of bandwidth we can reasonably expect to buffer before fluffing. However I don't think this works:
    • In practice you would need to pull in mempool entries into the stempool in order to evaluate the DoS limits in AcceptToMemoryPool (such as all descendants of all ancestors of a given dandelion transaction!). That means that it would be relatively straightforward to relay a small amount of data via dandelion and use up all the stempool memory with these extra mempool entries; note that the package limits do not limit the size of the required data here. And once the stempool fills up, new transactions would be evaluated based on the stempool's min_fee, which will go up, interfering with relay as described above.
    • This idea of the stempool transactions as sitting on top of the mempool creates semantic ambiguities in the event that we accept a replacement transaction into the stempool (and thus the stempool needs to think of the conflicting mempool transactions as being removed). In a model where we don't replicate transactions from the mempool into the stempool, how do we distinguish between conflicting transactions and merely missing transactions?
  3. Giving the stempool its own memory limit that is calculated the same as the mempool's should make implementation and reasoning about the algorithm easier. Presumably we could just call ATMP with a different CTxMemPool instance and get correct behavior.

Performance considerations

  1. Having two mempools is slower than one. We should consider how to efficiently update the stempool after a new block is found -- we must update the mempool immediately so that CreateNewBlock will DoTheRightThing, but could we update the stempool later/asynchronously? Is this necessary?
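One shape the asynchronous stempool update could take, as a sketch (invented names, not Bitcoin Core code): queue the txids of connected blocks on the critical path and apply the removals later from a background task. A real implementation would also need to remove descendants and handle reorgs:

```python
from collections import deque

class DeferredStempoolUpdater:
    """Hypothetical sketch: defer stempool block-connected updates off the
    critical path. The mempool itself is still updated synchronously so
    block template creation sees fresh state."""

    def __init__(self, stempool):
        self.stempool = stempool          # here: a plain set of txids
        self.pending_blocks = deque()

    def on_block_connected(self, block_txids):
        # Critical path: just record the work, O(block size) copy.
        self.pending_blocks.append(list(block_txids))

    def drain(self):
        # Background step: apply the queued removals to the stempool.
        while self.pending_blocks:
            for txid in self.pending_blocks.popleft():
                self.stempool.discard(txid)
```

The tradeoff is that, until drain() runs, the stempool may still hold confirmed transactions and so over-reject new dandelion transactions that conflict with them.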