Last active January 6, 2020 20:16

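Background: Currently, a node will only rebroadcast a transaction if it is the originating wallet. This is awful for privacy: a peer that sees a node rebroadcast a transaction can infer that the transaction belongs to that node's wallet.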

PR #16698 reworks the rebroadcast logic to significantly improve privacy.

Overview of changes

  • Instead of the wallet directly relaying transactions to peers, the wallet will submit unconfirmed txns to the node, and the node will apply logic to trigger txn rebroadcasts.
  • The node will apply rebroadcast conditions to all transactions.
  • The wallet will attempt to resubmit unconfirmed transactions to the node on a scheduled timer. This is only useful if the txn is dropped from the local mempool before it gets mined.
  • The mempool tracks locally submitted transactions (wallet & rpc) to ensure they are successfully rebroadcast. Success is defined as receiving a GETDATA for the txn.

New rebroadcast conditions:

  • Regularly run a fee rate cache job that computes the top of the mempool & stores the min package fee rate a txn needs to be included
  • When it is time to rebroadcast, calculate the top transactions that are older than 30 mins.
  • Filter out any txns with fee rate < cached fee rate
  • Queue remaining set to be sent to peers
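
The conditions above can be sketched roughly as follows. This is an illustration only, not the code in the PR: `MempoolTx`, `select_rebroadcast_candidates`, and the `max_vsize` budget are hypothetical names, and the order of the steps (age filter, then top of mempool, then cached fee rate) is an assumption.

```python
from dataclasses import dataclass

MIN_AGE_SECONDS = 30 * 60  # proposed ">30 minutes old" threshold


@dataclass(frozen=True)
class MempoolTx:
    """Hypothetical, simplified view of a mempool entry."""
    txid: str
    fee_rate: float    # package fee rate (unit is up to the caller)
    vsize: int         # virtual size
    entry_time: float  # seconds since epoch when the txn entered the mempool


def select_rebroadcast_candidates(mempool, cached_min_fee_rate, now, max_vsize):
    """Return txids to queue for rebroadcast, highest fee rate first."""
    # Only txns older than the age threshold are eligible (assumed to run first).
    old_enough = [tx for tx in mempool if now - tx.entry_time > MIN_AGE_SECONDS]

    # Approximate "top of mempool" by filling a size budget in descending
    # fee-rate order, skipping txns that no longer fit.
    top, used = [], 0
    for tx in sorted(old_enough, key=lambda t: t.fee_rate, reverse=True):
        if used + tx.vsize <= max_vsize:
            top.append(tx)
            used += tx.vsize

    # Drop anything below the fee rate stored by the cache.
    return [tx.txid for tx in top if tx.fee_rate >= cached_min_fee_rate]
```

The real change works on package fee rates via the mining code, so treat the per-txn sort here as a simplification.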

Params & currently proposed values

These constants can all be changed to adjust rebroadcast behavior.

  • Frequency of resubmission attempt from wallet to node -> wallet resubmits once / day
  • Frequency of triggering rebroadcast -> ~ once / hour
  • Defining highest priority transactions (top of mempool for potential txns to rebroadcast) -> 3/4 block worth of txns based on package fee rate
  • Defining what a “recent” transaction means -> only rebroadcast if txn is >30 minutes old.
  • Frequency of fee rate cache -> 20 minutes

Fundamental concepts & design choices

topic: avoid extreme bandwidth spikes network-wide

current prevention strategies:

  • poisson distribution of rebroadcast timings per node
  • filtering logic for rebroadcast candidates
  • filterInventoryKnown is a per-peer rolling bloom filter that prevents resending invs to the same peer within a short time span.
  • worst case hard limit from chosen params (currently 3/4 block of txns every ~1 hr)
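
To make the first strategy concrete: in a Poisson process the gaps between events are exponentially distributed, so nodes that each draw their own gap do not line up on the same schedule. A minimal sketch, assuming a one-hour mean taken from the proposed "~ once / hour" value; `next_rebroadcast_time` is a hypothetical helper, not a function from the PR.

```python
import random

AVERAGE_REBROADCAST_INTERVAL = 60 * 60  # seconds; the proposed "~ once / hour"


def next_rebroadcast_time(now, rng=random, mean_interval=AVERAGE_REBROADCAST_INTERVAL):
    """Sample when this node should next trigger a rebroadcast."""
    # Exponential inter-event gaps with rate 1/mean_interval give a
    # Poisson process whose average spacing is mean_interval.
    return now + rng.expovariate(1.0 / mean_interval)
```

Each node would call this after every rebroadcast to schedule the next one, so the average stays at one hour while individual timings vary.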

Monitoring the network

After running the patch for...

  • 10 days, node has only outbound connections -> 30 additional invs sent to peers (28, 2, 1)
  • 8 days, node also accepts incoming connections -> 186 additional invs sent to peers (22, 29, 28, 2, 3, 24, 35, 7, 5, 3, 28)

Since each inv entry is 36 bytes (a 4-byte type plus a 32-byte hash, not counting message headers), this means...

  • ~1 kb of data sent in 10 days with only outbound connections
  • ~6.5 kb of data sent in 8 days when also accepting incoming connections
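
The arithmetic behind these estimates, using the 36-byte figure and the totals stated above (the function and constant names are only for illustration):

```python
INV_ENTRY_BYTES = 36  # per-inv size used in the estimate above


def inv_bytes(inv_count):
    """Total bytes for the given number of additional invs."""
    return inv_count * INV_ENTRY_BYTES


for label, count in [("outbound only, 10 days", 30), ("with inbound, 8 days", 186)]:
    total = inv_bytes(count)
    print(f"{label}: {total} bytes ({total / 1024:.2f} KiB)")
```

This gives 1080 bytes and 6696 bytes respectively, which round to the ~1 kb and ~6.5 kb figures.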

Other things to monitor

  • 2 rebroadcast nodes connected to each other
  • How many of these INV messages are actually followed with a GETDATA?

Other resources

Open questions & solutions

concern: excessive bandwidth usage per node

  • possible solution -> add a max of [# of rebroadcasts per duration] as a safety net (eg. 1000 txns / hour)
  • possible solution -> add the ability to enable/disable the new rebroadcast logic. This could also be used for rolling out the change. The downside would be new fingerprinting possibilities, but the privacy leak might be minimal. See walletbroadcast=0.
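
The safety-net idea (a max number of rebroadcasts per duration) could look something like the sliding-window counter below. This is a sketch under assumptions: `RebroadcastLimiter` is a hypothetical name, and the defaults simply mirror the "1000 txns / hour" example.

```python
from collections import deque


class RebroadcastLimiter:
    """Sliding-window cap on rebroadcasts (hypothetical helper)."""

    def __init__(self, max_count=1000, window_seconds=3600):
        self.max_count = max_count
        self.window_seconds = window_seconds
        self._sent = deque()  # timestamps of rebroadcasts still inside the window

    def allow(self, now):
        """Return True and record the rebroadcast if the cap is not yet hit."""
        # Forget rebroadcasts that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) >= self.max_count:
            return False
        self._sent.append(now)
        return True
```

The node would call `allow(now)` once per txn before queueing it and skip the txn when it returns False.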

concern: introducing dependency on mining code

  • explanation -> bitcoin/bitcoin#16698 (review)
  • possible solution -> rebroadcast kill switch discussed above.
  • possible solution -> (proposed in same comment) rebuild mapTx index as a bounded-size priority queue. The current diff of the rebroadcast change is already significant so I'd prefer to avoid this route.

concern: nodes with old policies will always have txns that cannot be mined

  • explanation -> when policy rules are updated, nodes that are not upgraded can have their mempools cluttered with txns that will never be mined & never expire, since the rebroadcast logic ping pongs the txns between their pools. While this is already the case (it just takes 1 node to rebroadcast), these proposed changes make it more likely.
  • possible solution -> additional max_rebroadcast_count data structure. It would maintain a blacklist of txids, each with an expiry time and a count (number of times this node has rebroadcast it). The txn would be removed from the list if mined into a block. ATMP (AcceptToMemoryPool) would reject a txn that is on the blacklist.
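
A rough shape for such a structure is sketched below. The class and method names are hypothetical, and two details are assumptions the notes leave open: a txn counts as blacklisted once its count reaches the max, and an entry expires a fixed time after the first rebroadcast.

```python
class RebroadcastBlacklist:
    """Tracks per-txid rebroadcast count and expiry (hypothetical sketch)."""

    def __init__(self, max_rebroadcast_count, expiry_seconds):
        self.max_rebroadcast_count = max_rebroadcast_count
        self.expiry_seconds = expiry_seconds
        self._entries = {}  # txid -> [count, expiry_time]

    def record_rebroadcast(self, txid, now):
        # Expiry is set on the first rebroadcast and not extended (assumption).
        entry = self._entries.setdefault(txid, [0, now + self.expiry_seconds])
        entry[0] += 1

    def remove_mined(self, txid):
        # Called when the txn is included in a block.
        self._entries.pop(txid, None)

    def should_reject(self, txid, now):
        """The check mempool acceptance would consult before admitting a txn."""
        entry = self._entries.get(txid)
        if entry is None:
            return False
        count, expiry_time = entry
        if now >= expiry_time:
            del self._entries[txid]  # entry expired; allow the txn again
            return False
        return count >= self.max_rebroadcast_count
```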

concern: one GETDATA is insufficient to ensure a txn was successfully propagated to the network

  • explanation -> bitcoin/bitcoin#16698 (comment) & bitcoin/bitcoin#16698 (comment)
  • probably not a big deal. reasoning explained in links.
  • possible solution -> add a timer. Eg. timer starts when first GETDATA is received & txn is only removed from unbroadcast set after timer is up.
  • possible solution -> require x number of GETDATA messages before removing from unbroadcast set.
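
Both possible solutions can be expressed as parameters on the unbroadcast set. The sketch below is an assumption about how that might look, not code from the PR; `UnbroadcastSet`, `required_getdata`, and `hold_seconds` are hypothetical names. With the defaults it behaves like the current proposal, where the first GETDATA removes the txn.

```python
class UnbroadcastSet:
    """Locally submitted txns not yet considered broadcast (hypothetical sketch)."""

    def __init__(self, required_getdata=1, hold_seconds=0.0):
        self.required_getdata = required_getdata  # "x number of GETDATA messages"
        self.hold_seconds = hold_seconds          # timer started by the first GETDATA
        self._pending = {}  # txid -> [getdata_count, first_getdata_time]

    def add(self, txid):
        self._pending.setdefault(txid, [0, None])

    def on_getdata(self, txid, now):
        entry = self._pending.get(txid)
        if entry is None:
            return
        entry[0] += 1
        if entry[1] is None:
            entry[1] = now  # the timer starts at the first GETDATA
        self._maybe_remove(txid, now)

    def tick(self, now):
        # Called periodically so that timers can fire without a new GETDATA.
        for txid in list(self._pending):
            self._maybe_remove(txid, now)

    def _maybe_remove(self, txid, now):
        count, first_time = self._pending[txid]
        enough = count >= self.required_getdata
        timer_up = first_time is not None and now - first_time >= self.hold_seconds
        if enough and timer_up:
            del self._pending[txid]

    def __contains__(self, txid):
        return txid in self._pending
```

For example, `UnbroadcastSet(hold_seconds=60)` keeps a txn in the set for a minute after its first GETDATA, while `UnbroadcastSet(required_getdata=2)` waits for a second request.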

Follow up PRs

  • persist the unbroadcast txn set to mempool.dat
  • remove m_best_block_time
  • fix circular dependency introduced between txmempool & miner

Other stuff

  • There is an inherent tradeoff between defining param [top of mempool] vs param [age of transactions filtered out]. The values I have proposed opt to reduce #1 to allow for leniency in #2. Including more recent transactions enables txns evicted from the mempool during volatility to be rebroadcast and thus confirmed sooner.

  • Open question: what are the privacy implications when nodes have varied mempool expiry settings? For example, is it a privacy leak if you expire txns quicker, your wallet resubmits the txn, and you rebroadcast sooner than the default 2-week expiration? -> While it would be great to understand the fingerprinting possibilities, this is not a blocker because the privacy leak is dramatically smaller than in the current situation.

  • Can the compact blocks relay code be used to minimize the data set? -> we'd need to introduce a different P2P message to indicate these are mempool transactions. Concerns about bandwidth usage can be addressed with simpler solutions.

  • What are the implications of empty blocks? -> TODO
