Instantly share code, notes, and snippets.

Embed
What would you like to do?
Bitcoin Core Fee Estimation Algorithm

High level description Bitcoin Core's fee estimation algorithm

The algorithm takes as input a target which represents a number of blocks within which you would like your transaction to be included in the blockchain. It returns a fee rate that you should use on your transaction in order to achieve this.

The algorithm is conceptually very simple and does not attempt to have any predictive power over future conditions. It only looks at some recent history of transactions and returns the lowest fee rate such that in that recent history a very high fraction of transactions with that fee rate were confirmed in the block chain in less than the target number of blocks.

Transactions can occur with a nearly continuous range of fee rates and so in order to avoid tracking every historical transaction independently, they are grouped into fee rate "buckets". A fee rate bucket represents a range of fee rates within which the algorithm treats all transactions as having approximately the same fee rate and the answer the algorithm returns is actually the lowest fee rate bucket such that transactions in that bucket and all higher buckets were confirmed within the target over the recent history.

The recent history is defined as an exponentially decaying moving average with the data point counted at the time of transaction confirmation. In 0.14 this decay is 0.998 or a half-life of 346 blocks. Only transactions which had no unconfirmed parents at the time they entered the mempool are considered.

The implementation conceptually is that for each fee rate bucket you keep a counter for each possible confirmation target (1-25 in 0.14) of how many transactions in that bucket were confirmed in the target or less. You also keep an overall counter of all transactions of that fee rate. So if a block includes a specific transaction T, when you look up its mempool entry and see it entered the mempool 12 blocks ago, you then increment the counters for targets 12-25 for its fee rate bucket and increment the total tx count for that bucket. All counters are decayed by 0.998 every block.

To calculate an estimate you look only at the counters corresponding to your desired confirmation target and you start at the highest fee rate bucket and evaluate for txs in that fee rate bucket whether:

txs-confirmed-in-target-or-less / (all-confirmed-txs + txs-still-in-mempool-for-at-least-target) > threshold (0.95 in 0.14).

If you pass the threshold you check the next highest fee rate bucket and keep going, returning the last bucket that met the threshold before the first bucket which fails it. There is a required number of data points per bucket before doing the evaluation and you will just combine successive fee rate buckets to achieve that.

Proposed Changes for 0.15

Please see PR 10199

Under the hood improvements

  • The fee estimate calculation has been improved so that failing to hit the threshold returns a fee rate only if continuing to add datapoints from lower fee rate buckets does not cause the passing percentage to re-exceed the threshold. This allows for a smaller required number of data points because extraneous noise won't accidentally cause a high estimate.
  • The size of fee rate buckets has been halved (more precision possible in part due to above change)
  • An additional counter is introduced for each feerate,target pair that counts transactions which are evicted or otherwise removed from the mempool without being included in a block. This counter is incremented for every target <= the number of blocks the transaction had been in the mempool at the time of its eviction. And the threshold calculation is changed to add failed-txs-which-left-mempool-after-target to the denominator. These failures are recorded for all mempool transactions at shutdown.
  • A concept of scale is introduced for performance reasons, where confirmation targets are only resolved to within a certain scale. (for instance at scale of 2 a transaction confirmed in 19 or 20 blocks is tracked in the same counter)

Algorithm Changes

  • The algorithm is now repeated 3 times to cover different horizons for recent history.

    • Short history: 0.962 decay (half-life 18 blocks, 3 hours) for targets up to 12 (scale 1)
    • Medium history: 0.9952 decay (half-life of 144 blocks, 1 day) for targets up to 48 (scale 2)
    • Long hisotry: 0.99931 decay (half-life of 1008 blocks, 1 week) for targets up to 1008 (scale 24)
  • The logic of estimatesmartfee has changed. Instead of requiring a 95% threshold at the target, it now returns the maximum of these 3 calculations:

    • 60% threshold at target/2
    • 85% threshold at target
    • 95% threshold at 2 * target

    Each of the above calculations is done with the fee estimate data for the shortest time horizon that contains the target. If we can not query the short time horizon for a target of say 14, but the maximum possible target of 12 in the short time horizon would produce a lower fee rate, that lower answer is used. The ability to use a much shorter time horizon when possible makes the estimates far more responsive to changes in the prevailing conditions.

  • Estimates are optionally (but by default) conservative, meaning that the 95% threshold for 2 * target must be met not only at the shortest possible horizon but also at all longer horizons. This is useful for transactions (such as those unable to use CPFP or RBF) that do not want to take the risk of an extended delay if conditions change adversely.

  • A maximum target for which estimates can be returned for is determined by tracking how many blocks have been recorded for data and setting the max target to half this number. When estimates are saved, this historical max target can be used if the estimates are restarted within a reasonable time window (6 weeks). The historical max target is replaced on shutdown when the new estimates have achieved a max target at least half as long.

@jonasschnelli

This comment has been minimized.

jonasschnelli commented Apr 19, 2017

ping @crwatkins (he showed us some great observation about the current landscape of "fee estimations" in Berlin).

@Christewart

This comment has been minimized.

Christewart commented Apr 20, 2017

Can you explain where this threshold (0.95 in 0.14) is coming from? Just to be clear, once our equation evaluates to a value <= the threshold we choose that fee rate correct?

It also might be useful to add what happens if fee estimation fails. I know there was talk in the IRC channel about how you need to observe 2^N blocks to be able to correctly estimate a target for n blocks?

@morcos

This comment has been minimized.

Owner

morcos commented Apr 21, 2017

The threshold is just a number I picked, originally it was 85% but at some later point it changed to 95%, it just represents that if less than the threshold percentage of recent transactions in that fee rate are not getting confirmed within your target, then you should probably use a higher fee rate if you want to meet the target.

I don't want to dwell on the failure modes for fee estimation, but the most common historical failure mode was if there was no fee rate high enough such that 95% of transactions were confirmed within 1 block (b/c miners may not include very recent transactions). Or if there simply were not enough data points all together to meet the required minimum. The 2*N (not 2^N) observed blocks required to return an estimate is part of the new changes. (about to update post for those)

@DavidVorick

This comment has been minimized.

DavidVorick commented Jun 9, 2017

I've come up with something that follows an altogether different strategy, based around looking at the mempool, based on looking at the 90%-ile fee rate in each block, and based on looking at how much fees tend to vary historically over 'N' blocks, for a user-chosen value of 'N'. My proposal I think is also much more careful about miners that try to maliciously influence the fee estimation.

https://gist.github.com/DavidVorick/ba17fb7db50b9270d1c2131ff425422b

In an attempt to keep it simple, I did not include things like decay, but I think those are good strategies, and would want that added to any serious implementation of my proposal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment