Background: Currently, a node will only rebroadcast a transaction if it originated from the node's own wallet. This is awful for privacy, since a rebroadcast marks the node as the likely source of the transaction.
PR #16698 reworks the rebroadcast logic to significantly improve privacy.
Overview of changes
- Instead of the wallet directly relaying transactions to peers, the wallet will submit unconfirmed txns to the node, and the node will apply logic to trigger txn rebroadcasts.
- The node will apply rebroadcast conditions to all transactions.
- The wallet will attempt to resubmit unconfirmed transactions to the node on a scheduled timer. This is only useful if the txn is dropped from the local mempool before it gets mined.
- The mempool tracks locally submitted transactions (wallet & rpc) to ensure they are successfully rebroadcast. Success is defined as receiving a GETDATA for the txn.
New rebroadcast conditions:
- Regularly refresh a fee rate cache: compute the top of the mempool & store the min package fee rate a txn needs to be included
- When it is time to rebroadcast, calculate the top transactions that are older than 30 mins.
- Filter out any txns with fee rate < cached fee rate
- Queue remaining set to be sent to peers
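The steps above can be sketched roughly as follows. This is a minimal illustration in Python, not the PR's C++ code: the `MempoolTx` type, the function names, the greedy top-of-mempool selection, and the sat/vB unit are all assumptions made for the example, and the order of the age and fee rate filters is one plausible reading of the list.

```python
from dataclasses import dataclass

REBROADCAST_MIN_AGE_SECS = 30 * 60  # only txns older than 30 minutes qualify


@dataclass(frozen=True)
class MempoolTx:  # hypothetical stand-in for a mempool entry
    txid: str
    package_fee_rate: float  # assumed unit: sat/vB
    vsize: int               # virtual size in bytes
    entry_time: int          # unix seconds when the txn entered the mempool


def top_of_mempool(txs, max_vsize):
    """Greedily pick the highest package-fee-rate txns that fit in max_vsize."""
    picked, used = [], 0
    for tx in sorted(txs, key=lambda t: t.package_fee_rate, reverse=True):
        if used + tx.vsize <= max_vsize:
            picked.append(tx)
            used += tx.vsize
    return picked


def min_fee_rate_for_inclusion(txs, max_vsize):
    """Value the fee rate cache would store: lowest fee rate in the top set."""
    top = top_of_mempool(txs, max_vsize)
    return min((t.package_fee_rate for t in top), default=0.0)


def rebroadcast_candidates(txs, now, cached_min_fee_rate, max_vsize):
    """Top of mempool -> drop recent txns -> drop txns below the cached rate."""
    top = top_of_mempool(txs, max_vsize)
    old_enough = [t for t in top if now - t.entry_time > REBROADCAST_MIN_AGE_SECS]
    return [t for t in old_enough if t.package_fee_rate >= cached_min_fee_rate]
```

The cached fee rate is computed at an earlier point in time than the candidate set, which is why the last filter can still remove txns that are currently in the top of the mempool.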
Params & currently proposed values
These constants can all be changed to adjust rebroadcast behavior.
- Frequency of resubmission attempt from wallet to node -> wallet resubmits once / day
- Frequency of triggering rebroadcast -> ~ once / hour
- Defining highest priority transactions (top of mempool for potential txns to rebroadcast) -> 3/4 block worth of txns based on package fee rate
- Defining what a “recent” transaction means -> only rebroadcast if txn is >30 minutes old.
- Frequency of fee rate cache -> 20 minutes
Fundamental concepts & design choices
topic: avoid extreme bandwidth spikes network-wide
current prevention strategies:
- poisson distribution of rebroadcast timings per node
- filtering logic for rebroadcast candidates
- filterInventoryKnown is a per-peer rolling bloom filter that prevents resending invs to the same peer within a short time span.
- worst case hard limit from chosen params (currently 3/4 block of txns every ~1 hr)
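To illustrate the first strategy, here is a minimal sketch of Poisson-distributed rebroadcast timing, assuming each node draws the gap to its next rebroadcast from an exponential distribution with a mean of one hour. The function name and constant are hypothetical and only meant to show why nodes are unlikely to rebroadcast in lockstep.

```python
import random

REBROADCAST_MEAN_INTERVAL_SECS = 60 * 60  # ~ once / hour on average


def next_rebroadcast_time(now, rng=random):
    """Return the next rebroadcast time for a Poisson process.

    Gaps between events of a Poisson process are exponentially
    distributed, so each node picks its own random gap instead of a
    fixed one-hour interval.
    """
    return now + rng.expovariate(1.0 / REBROADCAST_MEAN_INTERVAL_SECS)
```

Because each node samples independently, rebroadcasts across the network are spread over time even if many nodes started at the same moment.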
Monitoring the network
after running the patch for..
- 10 days, node has only outbound connections -> 30 additional invs sent to peers (28, 2, 1)
- 8 days, node also accepts incoming connections -> 186 additional invs sent to peers (22, 29, 28, 2, 3, 24, 35, 7, 5, 3, 28)
Since each inv message is 36 bytes, this means...
- ~1 KB of data sent in 10 days with only outbound connections
- ~6.5 KB of data sent in 8 days when also accepting incoming connections
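The arithmetic behind these estimates, using the inv totals reported above and 36 bytes per inv:

```python
INV_BYTES = 36  # bytes per inv, as stated above

outbound_only_invs = 30   # 10 days, outbound connections only
with_inbound_invs = 186   # 8 days, also accepting incoming connections

outbound_only_bytes = outbound_only_invs * INV_BYTES  # 1080 bytes
with_inbound_bytes = with_inbound_invs * INV_BYTES    # 6696 bytes

print(f"{outbound_only_bytes / 1024:.2f} KiB")  # 1.05 KiB
print(f"{with_inbound_bytes / 1024:.2f} KiB")   # 6.54 KiB
```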
Other things to monitor
- 2 rebroadcast nodes connected to each other
- How many of these INV messages are actually followed with a
- PR review club: https://bitcoincore.reviews/16698.html
- Conceptual example of how the rebroadcast filters work: https://gist.github.com/amitiuttarwar/17ddf44e28e3de896b9be0139621f6f9
Open questions & solutions
concern: excessive bandwidth usage per node
- possible solution -> add a max of [# of rebroadcasts per duration] as a safety net (eg. 1000 txns / hour)
- possible solution -> add the ability to enable/disable the new rebroadcast logic. This could also be used for rolling out. The downside would be that nodes become fingerprintable, but the privacy leak might be minimal. See
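A minimal sketch of the first safety net (a max number of rebroadcasts per duration), assuming a sliding-window counter. The class name and defaults are hypothetical; 1000 txns / hour is the example value from above.

```python
from collections import deque


class RebroadcastLimiter:
    """Allow at most max_txns rebroadcasts within any window_secs span."""

    def __init__(self, max_txns=1000, window_secs=3600):
        self.max_txns = max_txns
        self.window_secs = window_secs
        self._sent = deque()  # timestamps of rebroadcasts still in the window

    def allow(self, now):
        # Forget rebroadcasts that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_secs:
            self._sent.popleft()
        if len(self._sent) >= self.max_txns:
            return False
        self._sent.append(now)
        return True
```

The node would call `allow(now)` once per candidate txn and stop queueing as soon as it returns False.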
concern: introducing dependency on mining code
- explanation -> https://github.com/bitcoin/bitcoin/pull/16698#pullrequestreview-321309451
- possible solution -> rebroadcast kill switch discussed above.
- possible solution -> (proposed in same comment) rebuild mapTx index as a bounded-size priority queue. The current diff of the rebroadcast change is already significant so I'd prefer to avoid this route.
concern: nodes with old policies will always have txns that cannot be mined
- explanation -> when policy rules are updated, nodes that are not upgraded can have their mempools cluttered with txns that will never be mined & never expire, since the rebroadcast logic ping pongs the txns between their pools. While this is already the case (it just takes 1 node to rebroadcast), the proposed changes increase the chances.
- possible solution -> additional max_rebroadcast_count data structure. It would maintain a blacklist of txids, expiry time, and count (num times I rebroadcasted). The txn would be removed from the list if mined into a block. ATMP would reject a txn if it is on the blacklist.
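One way the max_rebroadcast_count idea could look, as a rough sketch. The class, method names, and default values are hypothetical, and it assumes a txn only counts as blacklisted once its rebroadcast count reaches the max and its entry has not yet expired.

```python
from dataclasses import dataclass

MAX_REBROADCAST_COUNT = 6               # hypothetical value
ENTRY_LIFETIME_SECS = 14 * 24 * 60 * 60  # hypothetical, two weeks


@dataclass
class RebroadcastRecord:
    count: int   # number of times this node rebroadcast the txn
    expiry: int  # unix seconds after which the record is dropped


class RebroadcastTracker:
    def __init__(self, max_count=MAX_REBROADCAST_COUNT,
                 lifetime_secs=ENTRY_LIFETIME_SECS):
        self.max_count = max_count
        self.lifetime_secs = lifetime_secs
        self._records = {}  # txid -> RebroadcastRecord

    def record_rebroadcast(self, txid, now):
        rec = self._records.get(txid)
        if rec is None or rec.expiry <= now:
            rec = RebroadcastRecord(count=0, expiry=now + self.lifetime_secs)
            self._records[txid] = rec
        rec.count += 1

    def is_blacklisted(self, txid, now):
        """What ATMP would consult before accepting the txn again."""
        rec = self._records.get(txid)
        if rec is None:
            return False
        if rec.expiry <= now:
            del self._records[txid]
            return False
        return rec.count >= self.max_count

    def on_block_connected(self, mined_txids):
        # Mined txns no longer need tracking.
        for txid in mined_txids:
            self._records.pop(txid, None)
```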
concern: GETDATA is insufficient to ensure a txn was successfully propagated to the network
- explanation -> https://github.com/bitcoin/bitcoin/pull/16698#discussion_r348840194 & https://github.com/bitcoin/bitcoin/pull/16698#discussion_r350450955
- probably not a big deal. reasoning explained in links.
- possible solution -> add a timer. Eg. timer starts when first GETDATA is received & txn is only removed from unbroadcast set after timer is up.
- possible solution -> require x number of GETDATA messages before removing from unbroadcast set.
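Both possible solutions can be expressed with one small structure. This is a hedged sketch, not the PR's implementation: the class, its parameters (`required_getdata`, `hold_secs`), and the `prune` step are assumptions. With the defaults it behaves like the current rule (first GETDATA removes the txn).

```python
class UnbroadcastSet:
    """Track locally submitted txns until they look successfully relayed."""

    def __init__(self, required_getdata=1, hold_secs=0):
        self.required_getdata = required_getdata  # "x number of GETDATAs"
        self.hold_secs = hold_secs                # timer after first GETDATA
        self._entries = {}  # txid -> [getdata_count, first_getdata_time]

    def add(self, txid):
        self._entries.setdefault(txid, [0, None])

    def on_getdata(self, txid, now):
        entry = self._entries.get(txid)
        if entry is None:
            return
        entry[0] += 1
        if entry[1] is None:
            entry[1] = now  # timer starts at the first GETDATA

    def prune(self, now):
        """Remove and return txids that satisfy both conditions."""
        done = [
            txid
            for txid, (count, first) in self._entries.items()
            if first is not None
            and count >= self.required_getdata
            and now - first >= self.hold_secs
        ]
        for txid in done:
            del self._entries[txid]
        return done

    def __contains__(self, txid):
        return txid in self._entries
```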
Follow up PRs
- persist the unbroadcast txn set to
- fix circular dependency introduced between txmempool & miner
There is an inherent tradeoff between defining param [top of mempool] vs param [age of transactions filtered out]. The values I have proposed opt to reduce #1 (top of mempool) to allow for leniency in #2 (transaction age). Allowing more recent transactions enables txns evicted from the mempool during volatility to be rebroadcast, and thus confirmed, sooner.
Open question: what are privacy implications for when nodes have varied mempool expiry settings? For example, is it a privacy leak if you expire txns quicker, your wallet resubmits the txn, and you rebroadcast sooner than default 2-week expiration? -> While it would be great to understand the fingerprint possibilities, this is not a blocker because the privacy leak is dramatically less than the current situation.
Can the compact blocks relay code be used to minimize the data set? -> we'd need to introduce a different P2P message to indicate these are mempool transactions. Concerns about bandwidth usage can be addressed with simpler solutions.
What are the implications of empty blocks? -> TODO