@gavinandresen
Unifying validation and storage costs

Single cost metric for transactions and blocks

Building on the work of Jonas Nick, Greg Sanders, and Mark Friedenbach (for background, see the video: https://youtu.be/37LiYOOevqs?t=2h16m7s and slides: https://scalingbitcoin.org/hongkong2015/presentations/DAY2/3_tweaking_the_chain_3_nick.pdf)

The goal: create a consensus rule or rules that reflect all we have learned since 2010, replacing the hard-coded limits Satoshi hurriedly put into place back then.

Desirable properties for the rule or rules:

  • Simple (because simple designs are less likely to lead to bugs or security flaws)
  • Dynamic, adjusting as the ecosystem evolves (not set arbitrarily by developers or a committee of experts)
  • Safe, not subject to attack

Nick et al. suggest a "full cost metric" that takes into account transaction size (bandwidth cost) and signature hashing and verification (CPU cost), based on measurements of actual hardware/bandwidth costs, with an upper threshold set at the average cost metric of some number of recent blocks (TODO: check with Jonas; is this actually what was being proposed?)

They suggest that miners be allowed to build blocks up to half that threshold, and larger blocks only if those blocks decrease the size of the UTXO set: the more a block decreases the UTXO set, the larger it is allowed to be, up to the full threshold. (TODO: check... this can't be right, because it would mean a constant downward spiral in allowed validation cost.)

This is an interesting proposal, but it ignores two critical, related issues: how do miners decide which transactions to put into their blocks, and how do wallets decide how much to pay in fees? The non-linear behavior of the allowed block size at the half-threshold level makes selecting which transactions to include in a block a hard problem, and makes the wallet's task of choosing an appropriate transaction fee even harder.

I think it makes sense to start from basic principles, and realize that there are three relevant costs:

  • bandwidth cost (related to transaction size, in bytes)
  • cpu cost (dominated by signature hash creation and ECDSA validation; see Nick et al.'s presentation)
  • storage cost (dominated by UTXO-in-fast-memory storage costs)

The actual magnitude of those costs does not matter if the consensus rule is set based on the average (or median) of recent blocks; the consensus rule would not be "block must cost less than 0.00011 cents to validate and store" but might be "block must cost less to validate and store than twice the average cost of blocks in the last difficulty adjustment period." As long as the same cost metric is used to measure historical blocks, new blocks being tested for consensus validity, and individual transactions, everything works out. Wallets will use the cost metric to decide what transaction fee is necessary, and miners will sort transactions by greatest fee per unit of cost to maximize the fees they collect.
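
For concreteness, here is a minimal Python sketch of that relative rule; the transaction and block representations are hypothetical, and the factor of two and the averaging window are just the example values above:

```python
# Minimal sketch of the relative consensus rule; tx.cost is the unified
# cost metric defined below. The factor of 2 and the one-difficulty-period
# window are the example values from the text, not fixed proposals.

def block_cost(block):
    # 'block' is assumed to be an iterable of transaction objects.
    return sum(tx.cost for tx in block)

def block_is_cost_valid(new_block, last_period_blocks):
    """Block must cost less to validate and store than twice the average
    cost of blocks in the last difficulty adjustment period."""
    avg = sum(block_cost(b) for b in last_period_blocks) / len(last_period_blocks)
    return block_cost(new_block) < 2 * avg
```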

Using real-world money as "cost" allows CPU/memory/bandwidth to be combined into a single unified cost metric (whose units would be "reference hardware 2016 dollars").

Calculating a baseline bandwidth cost is straightforward. If the average worldwide broadband internet user pays the equivalent of $50 per month for 300GB of data, and we assume each node will receive and then transmit each transaction once, on average (so each transaction byte consumes two bytes of that allowance), the cost per byte of transaction is $50/150,000,000,000. Alternatively, bandwidth costs for a cloud service provider like Amazon Web Services could be used to determine the "cost metric ruler."
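
Spelling out that arithmetic (all inputs are the assumed example numbers above):

```python
# Worked version of the bandwidth arithmetic; assumed numbers, not data.
MONTHLY_COST_USD = 50.0
MONTHLY_DATA_BYTES = 300e9  # 300GB per month
# Each transaction byte is received once and transmitted once, so it
# consumes two bytes of the monthly allowance:
usd_per_tx_byte = MONTHLY_COST_USD / (MONTHLY_DATA_BYTES / 2)
# == $50 / 150,000,000,000, about 3.3e-10 dollars per transaction byte
```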

Calculating a baseline CPU cost is a little less straightforward. CPU costs for a cloud service provider could be used after running benchmarks to see how many ECDSA validations or signature hashes can be performed for some amount of money. Alternatively, benchmarks could be run on a typical personal computer, and a CPU cost calculated based on how many validations could be done over the typical useful lifetime of a personal computer (e.g. 4 or 5 years).
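
A sketch of that benchmark-to-dollars conversion; the hourly price and verification rate below are placeholders, not measured values:

```python
# Hypothetical inputs throughout; only the method is the point.
USD_PER_CPU_HOUR = 0.05          # assumed cloud price per core-hour
ECDSA_VERIFIES_PER_SEC = 2000    # assumed single-core benchmark result
usd_per_ecdsa_verify = USD_PER_CPU_HOUR / (ECDSA_VERIFIES_PER_SEC * 3600)
# about 7e-9 dollars per signature validation under these assumptions
```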

Calculating a baseline UTXO storage cost is the most difficult, but it is also possible. We can measure the size of the average UTXO, measure how long the median transaction output remains in the UTXO set (assigning a time of infinity to any currently-unspent output), and use the same technique to estimate how much it costs to rent or own that much high-speed memory for that amount of time. TODO: a more accurate accounting would use a discount rate, based on the drop in memory cost over time, for transactions that NEVER leave the UTXO set.
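
One way to turn those measurements into a per-byte storage constant (every number below is an assumed placeholder):

```python
# Assumed placeholders; the method (memory price x median lifetime) is
# what the paragraph above describes.
USD_PER_BYTE_YEAR = 1e-8        # assumed cost of fast memory, per byte-year
MEDIAN_LIFETIME_YEARS = 0.1     # assumed median time an output stays unspent
usd_per_utxo_byte = USD_PER_BYTE_YEAR * MEDIAN_LIFETIME_YEARS
# == 1e-9 dollars per UTXO byte; a typical 40-byte output would cost
# about 4e-8 dollars to store under these assumptions
```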

Calculating all of the above for a single reference point in time will produce constants for a cost function for a transaction or a block of transactions:

Cost(size_in_bytes, ecdsa_validations, sighash_bytes, utxo_change_in_bytes)
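
Assuming the simplest (linear) combination of the baselines sketched above (the exact functional form is not pinned down here), the function might look like this, with the placeholder constants carried over from the earlier sketches:

```python
# Placeholder constants in "reference hardware 2016 dollars"; the linear
# form is an assumption, not a settled design.
USD_PER_TX_BYTE      = 3.3e-10   # bandwidth (broadband example above)
USD_PER_ECDSA_VERIFY = 7e-9      # cpu: signature validation (assumed)
USD_PER_SIGHASH_BYTE = 1e-12     # cpu: signature hashing (assumed)
USD_PER_UTXO_BYTE    = 1e-9      # storage (assumed, from the sketch above)

def cost(size_in_bytes, ecdsa_validations, sighash_bytes, utxo_change_in_bytes):
    """Unified cost of a transaction or block. A transaction that shrinks
    the UTXO set has a negative utxo_change_in_bytes, earning a storage
    credit against its bandwidth and cpu costs."""
    return (size_in_bytes        * USD_PER_TX_BYTE
          + ecdsa_validations    * USD_PER_ECDSA_VERIFY
          + sighash_bytes        * USD_PER_SIGHASH_BYTE
          + utxo_change_in_bytes * USD_PER_UTXO_BYTE)
```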

The cost function will never be perfect, and it may become less perfect over time if (for example) CPU costs drop much faster than memory or bandwidth costs. Software optimizations may also change the actual costs. Fortunately, the cost metric does not have to be perfect; as long as it is consistent, it will serve as a coordination mechanism between transaction creators and miners, and as an anti-denial-of-service mechanism in the unlikely event of a "rogue miner" trying to produce an extremely expensive-to-validate block.

If the benefits of resetting the cost metric formula ever outweighed the costs, it could be changed via hard fork. However, since the benefits are small (resetting the metric might just make transactions with lots of signature hashing pay a few satoshis less in fees, while making transactions with OP_RETURN data pay a few satoshis more) and the costs are large (all transaction-creating software would have to be updated to calculate fees using the new cost metric), that is unlikely.

TODO: work out an actual cost metric formula, then apply it to recent blocks and transactions and try to estimate how much disruption there might be if we switched overnight and wallets were just using today's simple fee-per-kilobyte metric.

@instagibbs

Simple (because simple designs are less likely to lead to bugs or security flaws)
Dynamic, adjusting as the ecosystem evolves (not set arbitrarily by developers or a committee of experts)
Safe, not subject to attack

I think there may be some sort of Zooko's-triangle tradeoff going on here: dynamic often means gameable, a more complex dynamic rule won't be simple, etc. For "version 1" we might think of tackling "simple and safe," and then perhaps think about what "dynamic" would entail, but as a second-class citizen.

This is an interesting proposal, but ignores two critical related issues: how do miners decide which transactions to put into their blocks?
In my version I'd rather just have a linear combination of factors that estimate how long propagation takes (assuming modest, not crazy, amounts of parallelism), rather than calculating an explicit "computer cost". We would estimate fairly conservative numbers for how much each attribute contributes to validation/propagation latency, then plug those into the overall cost function. This makes block-packing a simple 0-1 weighted knapsack problem, just like today; I suppose even easier, since we drop the "&& sigops <=" criterion. I think this cost metric formulation has lots of support, and we're taking our first steps with SegWit.
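
A minimal sketch of that packing (the transaction attributes are hypothetical, and in-mempool parent/child dependencies are ignored):

```python
# Greedy 0-1 knapsack fill by fee-per-cost; tx.fee and tx.cost are
# assumed attributes, and parent/child dependencies are ignored.
def pack_block(mempool, cost_limit):
    block, total = [], 0.0
    for tx in sorted(mempool, key=lambda t: t.fee / t.cost, reverse=True):
        if total + tx.cost <= cost_limit:
            block.append(tx)
            total += tx.cost
    return block
```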

The one "dimension" to this problem I'm not sold on yet is UTXO set stuff wrt to my "validation time" metric. Previous blocks can severely impact performance of follow-up blocks, and I don't want a block cost to be dependent on how exactly the UTXO set is stored. My inclination is to go along with a capped UTXO set size(with some slow growth over time), or forgetting old ones, ala http://lists.linuxfoundation.org/pipermail/bitcoin-dev/2015-December/011952.html . It might make sense to keep around the SegWit discounting, or some variant of it, to encourage people to not kick too much data in the UTXO set.

If we ever manage to get memristors working for all-in-memory databases, we can revisit that point. :)

That hopefully crosses off "simple" and "safe," by forcing people to internalize costs to the system as well as capping inherent propagation delays.

For "dynamic", I haven't thought a lot about it beyond what maaku has thought about it re:flexcap. I think getting the first 2 parts right makes the 3rd much easier, so I'm optimistic.
