Create a gist now

Instantly share code, notes, and snippets.

What would you like to do?

A description of known problems in Satoshi Nakamoto's paper, "Bitcoin: A Peer-to-Peer Electronic Cash System", as well as notes on terminology changes and how Bitcoin's implementation differs from that described in the paper.

Abstract

The longest chain not only serves as proof of the sequence of events witnessed, but proof that it came from the largest pool of CPU power.

Implementation detail: If each link in the chain (called "blocks" in Bitcoin) was built using the same amount of Proof Of Work (POW), the longest chain would be the one backed by the largest pool of computational power. However, Bitcoin was implemented in such a way that the amount of POW can vary between blocks, so it became import not to check for the "the longest chain" but rather "the chain demonstrating the most POW"; this is often shortened to "strongest chain". (The change from checking for the longest chain to checking for the strongest chain occured after Bitcoin's initial release.)

Terminology change: General CPUs were used to generate the POW for the earliest Bitcoin blocks but POW generation today is mostly performed by specialist Application Specific Integrated Circuits (ASICs), so instead of saying "CPU power" it is perhaps more correct to say "computational power" or, simply, "hash rate" for the hashing used in generating the POW.

As long as a majority of CPU power is controlled by nodes that are not cooperating to attack the network, they'll generate the longest chain and outpace attackers.

Terminology change: The term "nodes" today is used to refer to full validation nodes, which are programs that enforce all the rules of the system. Programs (and hardware) that extend the chain today are called "miners" based on Nakamoto's analogy to gold miners in section FIXME of the paper. Nakamoto expected all miners to be nodes but the software he released did not require all nodes to be miners. Today it is the case that the overwhelming number of nodes are not miners and that many individuals who own mining hardware do not use it with their own nodes (and even those that do often mine for short periods of time on top of newly discovered blocks without ensuring their node considers the new block valid). The early parts of the paper where "nodes" is mostly used without modification refer to mining using a full validation node; the later parts of the paper which refer to "network nodes" is mainly about what nodes can do even if they aren't mining.

Post-publication discovery: When a new block is produced, the miner who produces that block can begin working on its sequel immediately but all other miners must wait for that new block to propagate across the network to them. This gives miners who produce many blocks an edge over miners who produce fewer blocks, and this can be exploited in what's known as the selfish mining attack to allow an attacker with around 30% of total network hash rate to make other miners less profitable, perhaps driving them into following the attacking miner's policy. So instead of saying "a majority of CPU power is controlled by nodes that are not cooperating to attack the network" it is perhaps more correct to say "as long as nodes cooperating to attack the network control less than about 30% of the network".

2. Transactions

We define an electronic coin as a chain of digital signatures. Each owner transfers the coin to the next by digitally signing a hash of the previous transaction and the public key of the next owner and adding these to the end of the coin.

Implementation detail: Bitcoin implements a more general version of this system where digital signatures are not used directly but rather a "deterministic expression" is used instead. Just as a signature that matches a know public key can be used to enable a payment, the data that satisfies an known expression can also enable a payment. Generically, the expression that must be satisfied in Bitcoin in order to spend a coin is known as an "encumbrance". Almost all encumbrances in Bitcoin to date require providing at least one signature. So instead of saying "a chain of digital signatures" it is more correct to say "a chain of encumbrances".

4. Proof-of-Work

we implement the proof-of-work by incrementing a nonce in the block until a value is found that gives the block's hash the required zero bits.

Implementation detail: Adam Back's Hashcash implementation requires finding a hash with the required number of leading zero bits. Bitcoin treats the hash as a integer and requires that it be less than a specified integer, which effectively allows a fractional number of bits to be specified.

Proof-of-work is essentially one-CPU-one-vote.

Important note: the vote here is not on the rules of the system but merely on the ordering of the transactions in order to provide assurances that an "electronic coin" cannot be easily double spent. This is described in more detail in section 11 of the paper where it says, "We consider the scenario of an attacker trying to generate an alternate chain faster than the honest chain. Even if this is accomplished, it does not throw the system open to arbitrary changes, such as creating value out of thin air or taking money that never belonged to the attacker. Nodes are not going to accept an invalid transaction as payment, and honest nodes will never accept a block containing them."

proof-of-work difficulty is determined by a moving average targeting an average number of blocks per hour.

Implementation detail: The average implemented in Bitcoin targets an average number of blocks per two weeks (not per hour). Other implemented rules may further slow.

7. Reclaiming Disk Space

Once the latest transaction in a coin is buried under enough blocks, the spent transactions before it can be discarded to save disk space

Possible post-publication discovery: Although the Merkle Tree structure described in this section can prove a transaction was included in a particular block, there is currently no way in Bitcoin to prove that a transaction has not been spent except to process all subsequent data in the blockchain. This means the method described here cannot be universally used for reclaiming disk space among all nodes, as all new nodes will need to process all transactions. A method called segregated witness that is expected to be deployed in the near future will allow nodes to skip validation of some data in blocks that use segregated witness. Another group of related methods called UTXO-, TXO-, and STXO-commitments could allow something similar to what Nakamoto described in this section.

8. Simplified Payment Verification

One strategy to protect against this would be to accept alerts from network nodes when they detect an invalid block, prompting the user's software to download the full block and alerted transactions to confirm the inconsistency.

Important Note: although software has been produced that implements most parts of this section and calls that Simplified Payment Verification (SPV), none of these programs currently accepts alerts from network nodes (full validation nodes) when invalid blocks have been detected. This has placed bitcoins in so-called SPV wallets at risk in the past.

chris-belcher commented Jan 29, 2017 edited

Also:

10. Privacy

Some linking is still unavoidable with multi-input transactions, which necessarily reveal that their inputs were owned by the same owner

Multiple owners can co-operate to produce a single bitcoin transaction with multiple inputs owned by each of them. This is what coinjoin is based on and is used to improve bitcoin privacy today.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment