Gavin Andresen (gavinandresen)
gavinandresen /
Last active Apr 5, 2018
Storing the UTXO as a bit-vector

Half-baked thoughts exploring a different way of implementing a fully-validating BCH node.

The idea is to shift the storage of full transaction data to wallets, and explore how little data a fully validating node could store. This isn't a problem today (the UTXO set easily fits in the RAM of an inexpensive server-class machine), but might eventually be at very large transaction volumes.

Initial block download is a problem today (it is annoying to have to wait several hours or days to sync up a new node), and this scheme could make it orders of magnitude faster by shifting the time when full transaction data is broadcast from initial block download to new transaction announcement.
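One way to picture the idea (this is my illustrative reading of the title, not a design taken from the gist): give every transaction output ever created a sequential global index, and let the node's UTXO set be a bit-vector where bit i answers "is output i still unspent?". All names below are hypothetical:

```python
class UtxoBitVector:
    """Toy UTXO set as a growable bit-vector, one bit per output ever created."""

    def __init__(self):
        self.bits = bytearray()   # packed bits, 8 outputs per byte
        self.count = 0            # total outputs ever created

    def add_output(self):
        """A new output is created (unspent); returns its global index."""
        idx = self.count
        if idx // 8 >= len(self.bits):
            self.bits.append(0)
        self.bits[idx // 8] |= 1 << (idx % 8)
        self.count += 1
        return idx

    def spend(self, idx):
        """Mark output idx as spent; reject a double-spend."""
        byte, bit = idx // 8, 1 << (idx % 8)
        if not self.bits[byte] & bit:
            raise ValueError("output already spent (double spend)")
        self.bits[byte] &= ~bit

    def is_unspent(self, idx):
        return bool(self.bits[idx // 8] & (1 << (idx % 8)))
```

The point of the sketch: a node storing only this vector (plus headers) knows *whether* an output is spendable but not *what* it is, so wallets would have to supply the full previous-transaction data, with a proof tying it to its global index, at spend time.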

gavinandresen /
Last active Jan 5, 2018
Backwards Initial Block Download Scheme

This is a scheme for discovering the current "unspent transaction output set"; it could be used to bootstrap a node that stores the UTXO set using a Cuckoo Filter.

It has the following nice properties:

  1. It starts with an empty UTXO set, but quickly discovers a large enough subset to validate a majority of new transactions (TODO: empirically measure how many blocks to be able to validate 50+% of new tx), and incrementally gets better over time.
  2. It uses a simple data structure to keep track of progress (TODO: empirically calculate maximum size of the spent set processing tip to genesis).
  3. It doesn't require any consensus-level change or require miners to maintain any extra data structures or indexes.

However, it does require downloading every single block.
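The backwards scan can be sketched in a few lines (block and transaction structures are simplified to toy tuples here; a real node would use txids and output indexes from actual block data). Walking tip-to-genesis, any output not already seen spent by a later transaction must be unspent, and the "spent set" shrinks whenever the creating transaction is found:

```python
def backwards_utxo_scan(blocks_tip_to_genesis):
    """blocks_tip_to_genesis: iterable of blocks, newest first; each block
    is a list of transactions; each tx is (txid, spent_outpoints, n_outputs)."""
    spent = set()   # outpoints spent by some later (already-seen) tx
    utxo = set()    # discovered unspent outputs
    for block in blocks_tip_to_genesis:
        # Process transactions within a block in reverse order too, so
        # in-block spends are recorded before the creating tx is seen.
        for txid, inputs, n_outputs in reversed(block):
            for i in range(n_outputs):
                op = (txid, i)
                if op in spent:
                    spent.discard(op)   # resolved: the spent set shrinks
                else:
                    utxo.add(op)        # no later tx spends it: unspent
            spent.update(inputs)        # the txs these spend are older
    return utxo
```

This also makes the first "nice property" concrete: `utxo` is usable (for the outputs discovered so far) long before the scan reaches genesis.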

gavinandresen /
Last active Sep 6, 2018
Using a cuckoo filter for fast 'unspent?' lookups

A "Cuckoo Filter" is a nifty data structure invented a few years ago; read the paper for details.

Like a Bloom filter, you insert items and then can later ask "does this item exist in the filter?" You'll get either a definite "no" or "yes, probably" with some false-positive error rate. Cuckoo filters have two advantages over Bloom filters:

  1. They are more space efficient at false positive rates less than about 0.03.
  2. You can delete items from them without affecting the false positive rate at all.

It seems to me that an in-memory cuckoo filter could work really well to keep track of Bitcoin's "unspent transaction output set" (UTXO set), to make transaction (or block) validation as fast as possible using minimal memory or disk.
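As a sketch of the mechanics (not of any particular implementation), here is a minimal partial-key cuckoo filter: each item is reduced to a short fingerprint that can live in one of two buckets, and the second bucket index is recoverable from the first plus the fingerprint alone, which is what makes eviction and deletion work. Bucket count, bucket size, and the 8-bit fingerprint are arbitrary illustrative choices:

```python
import hashlib
import random

class CuckooFilter:
    def __init__(self, n_buckets=1024, bucket_size=4, max_kicks=500):
        self.n = n_buckets            # must be a power of two
        self.size = bucket_size
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(n_buckets)]

    def _fp(self, item):
        """8-bit fingerprint, never zero."""
        return (hashlib.sha256(item).digest()[0] % 255) + 1

    def _i1(self, item):
        h = hashlib.sha256(item).digest()
        return int.from_bytes(h[1:5], "big") % self.n

    def _i2(self, i, fp):
        """Partner bucket; XOR with hash(fp) makes this an involution."""
        h = hashlib.sha256(bytes([fp])).digest()
        return i ^ (int.from_bytes(h[:4], "big") % self.n)

    def insert(self, item):
        fp = self._fp(item)
        i1 = self._i1(item)
        i2 = self._i2(i1, fp)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.size:
                self.buckets[i].append(fp)
                return True
        # Both buckets full: kick a random resident to its partner bucket.
        i = random.choice((i1, i2))
        for _ in range(self.max_kicks):
            j = random.randrange(len(self.buckets[i]))
            fp, self.buckets[i][j] = self.buckets[i][j], fp
            i = self._i2(i, fp)
            if len(self.buckets[i]) < self.size:
                self.buckets[i].append(fp)
                return True
        return False                  # filter is too full

    def contains(self, item):
        fp = self._fp(item)
        i1 = self._i1(item)
        return fp in self.buckets[i1] or fp in self.buckets[self._i2(i1, fp)]

    def delete(self, item):
        fp = self._fp(item)
        i1 = self._i1(item)
        for i in (i1, self._i2(i1, fp)):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False
```

For the UTXO use case, you would insert each new outpoint and delete it when spent; property 2 above (deletion without degrading the false-positive rate) is exactly what a Bloom filter can't give you here.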

Recall that Bitcoin transactions (other than coinbase transactions that create new coins) have inputs that refer to unspent outputs. An input refers to a previous output by giving the transaction id (a 32-byte hash) combined with an output index.

gavinandresen / ReplayProtection.patch
Created Jun 15, 2017
OP_RETURN replay protection patch
View ReplayProtection.patch
diff --git a/src/policy/policy.cpp b/src/policy/policy.cpp
index ec398f662..6724399c0 100644
--- a/src/policy/policy.cpp
+++ b/src/policy/policy.cpp
@@ -117,6 +117,10 @@ bool IsStandardTx(const CTransaction& tx, std::string& reason, const bool witnes
return false;
+ if (tx.ReplayProtected()) {
+ return false;
gavinandresen /
Created Apr 22, 2017
Random Sanity for specific languages

The standard library of every programming language has at least one pseudo-random number generator. C has rand(), JavaScript has Math.random()... and if you're writing code that needs good randomness you shouldn't use those.

There could be a series of github projects that generate bad random datastreams using those old, deprecated generators and "typical" starting seeds (e.g. 0, current Unix time, small integers that might be process ids) and feed the results into the database.

That only does some good if programmers start inserting checks into their C or JavaScript applications... and if they don't know enough to use RAND_bytes() or crypto.randomBytes().
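A toy version of such a check (everything here is illustrative, using Python's Mersenne Twister as the stand-in "deprecated generator"): precompute the streams produced by guessable seeds, then flag any data that matches one of them.

```python
import random
import secrets

def stream_from_seed(seed, n=16):
    """Deterministic, NOT cryptographic: same seed, same bytes, every time."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(n))

# A tiny "database" of streams from typical weak seeds (0, small integers);
# a real service would also cover recent Unix times, pids, etc.
BAD_STREAMS = {stream_from_seed(s) for s in range(256)}

def looks_like_weak_randomness(data):
    return data[:16] in BAD_STREAMS

weak = stream_from_seed(0)          # what rand()-style code produces
good = secrets.token_bytes(16)      # OS CSPRNG: what you should use
```

The asymmetry is the point: weak generators with guessable seeds produce a small, enumerable set of streams, while a proper CSPRNG output will essentially never collide with the database.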


For head-first relaying of blocks, I implemented an 'invalidblock' message that, like a 'block' message, broadcasts the full block data when the header of the block was valid but the full block turned out to be invalid.

This should be a very rare occurrence-- miners are incentivized to fix bugs that cause them to produce invalid blocks. So transmitting the full block data once or twice a year shouldn't be an issue... but perhaps it will be an issue if partial validation ever becomes necessary to scale.

First, I assume the 80-byte block header is valid: its hash satisfies proof-of-work.

Proving that the block reward in the coinbase transaction is <= sum(fees)+subsidy requires the full block data (although once segwit is deployed, signatures in the witness data aren't needed).

Proving the block is over size or sigop or sighash limits requires all of the transaction data up to the first transaction that puts the block over the limit (and proof that those transactions are in the block, so a partial merkle tree is needed).
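The header assumption above is cheap to check on its own. A sketch (the function names are mine; the compact nBits encoding and the little-endian hash comparison follow Bitcoin's rules):

```python
import hashlib

def target_from_nbits(nbits):
    """Expand Bitcoin's compact difficulty encoding into a full 256-bit target."""
    exponent = nbits >> 24
    mantissa = nbits & 0x007FFFFF
    return mantissa * (1 << (8 * (exponent - 3)))

def header_satisfies_pow(header80, nbits):
    """Does the double-SHA256 of the 80-byte header meet the target?"""
    if len(header80) != 80:
        raise ValueError("block headers are exactly 80 bytes")
    digest = hashlib.sha256(hashlib.sha256(header80).digest()).digest()
    # Header hashes are compared as little-endian 256-bit integers.
    return int.from_bytes(digest, "little") <= target_from_nbits(nbits)
```

Everything beyond this first check is where the partial-validation trade-offs described above begin.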

gavinandresen /
Last active Feb 2, 2016
Flexcap as a long-term maximum size solution
gavinandresen /
Last active Jan 25, 2016
2MB voting tests

Rules being proposed for the 2MB hard-fork bump of Bitcoin Classic:

  1. Threshold vote by miners, using block.version bit 0x10000000
  2. Votes after expiration date (timestamp in block header > 2018-01-01 00:00:00 GMT) do not count
  3. New rule (bigger blocks) not allowed until after threshold reached plus grace period (4 weeks)
  4. Once threshold reached, voting bit does not matter (but version must be greater than 4 to be compatible with previous soft forks)

Edge case tests missing from current Classic regression test:

Threshold-triggering vote at or just-after the expiration date (two test cases).
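Rules 1 and 2 are easy to model directly, which is also the cleanest way to express those edge-case tests. A sketch (the 750-of-1000 window and threshold here are illustrative placeholders, not taken from the rules above):

```python
VOTE_BIT = 0x10000000
EXPIRATION = 1514764800   # 2018-01-01 00:00:00 GMT as a Unix timestamp

def is_vote(version, timestamp):
    """A block votes iff the bit is set and the header time is not past
    the expiration (rule 2: strictly-after does not count)."""
    return bool(version & VOTE_BIT) and timestamp <= EXPIRATION

def threshold_reached(headers, window=1000, needed=750):
    """headers: list of (version, timestamp) pairs, oldest first."""
    recent = headers[-window:]
    return sum(is_vote(v, t) for v, t in recent) >= needed
```

The missing edge cases then become one-liners: a threshold-triggering vote with timestamp exactly EXPIRATION should count, and one at EXPIRATION + 1 should not.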

gavinandresen /
Last active Jan 20, 2016
Expensive-to-validate "standard" transactions

Standard transactions (that will be relayed/mined) have been limited to 100,000 bytes for years.

How many signature operations or how much signature hashing can a standard transaction have? (excluding pay-to-script-hash transactions specifically designed to be expensive to validate for the moment)

Back-of-the-envelope estimates:

Lots-of-inputs, one-output transactions are most expensive to validate. A minimal input takes about 120 bytes (74 bytes for a signature plus another 40 or so); 100,000 / 120 is a max of about 830 inputs.

If each of those spends a 1-of-3 'raw' multisig unspent transaction output using the first key (so three signature operations are needed), you get about 2,500 signature operations for that 100,000 byte transaction.
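The back-of-the-envelope arithmetic, written out (constants are the estimates from the text, not exact consensus values):

```python
MAX_STD_TX_BYTES = 100_000   # long-standing standardness limit
MIN_INPUT_BYTES = 120        # ~74-byte signature plus ~40 bytes of overhead
SIGOPS_PER_1OF3 = 3          # 1-of-3 raw multisig: all three keys may be checked

max_inputs = MAX_STD_TX_BYTES // MIN_INPUT_BYTES      # ~830 inputs
max_sigops = max_inputs * SIGOPS_PER_1OF3             # ~2,500 sigops
```

So a single maximum-size standard transaction built this way forces roughly 2,500 signature operations on every validating node.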

gavinandresen /
Last active Jan 2, 2017
Unifying validation and storage costs

Single cost metric for transactions and blocks

Building on the work of Jonas Nick, Greg Sanders and Mark Friedenbach (see their video and slides for background)

The goal: create a consensus rule or rules, replacing the hard-coded limits Satoshi hurriedly put into place in 2010, that reflect all we have learned since then.

Desirable properties for the rule or rules:

  • Simple (because simple designs are less likely to lead to bugs or security flaws)