Skip to content

Instantly share code, notes, and snippets.

@nothingmuch
Last active August 20, 2018 21:10
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save nothingmuch/d3ccc0633960f62d1d23bfc05857a9b7 to your computer and use it in GitHub Desktop.
Save nothingmuch/d3ccc0633960f62d1d23bfc05857a9b7 to your computer and use it in GitHub Desktop.

In Bitcoin every redeemable UTXO is encumbered by a script, which along with the right inputs (that cause it to evaluate to true) would release those funds.

Focusing only on pay to pubkey hash scripts, which refer to a single key by the RIPEMD160 of the SHA256 of the curve point generate from the private key, if you enumerate private keys, derive the pubkey, and hash it, the birthday problem quantifies your chances of finding a RIPEMD160 hash collision for which you have the corresponding private key that allows you to sign a transaction that redeems that UTXO.

This is the most efficient way of brute forcing P2PKH outputs under the assumption that the keys were generated with a cryptographically secure random number genreator.

We could also consider P2PK outputs but they are not in common use, or P2SH outputs which would require finding a collision of script hashes (as opposed to pubkey hashes, so this is a much larger space of inputs, but some of those scripts are deterministically generatable from a pubkey, e.g. SegWit wrapper scripts, and we're still only searching for RIPEMD160 collisions), or P2WPKH... but to keep it simple let's just focus on P2PKH.

If the keys are truly random we will have a very hard time doing better than a brute force search. However, if we know that some low entropy seed is used to generate a key, with some sort of key derivation function (e.g. BIP39 and BIP42, maps a single seed to a large set of private keys), even though every point in the search space now costs strictly more to compute than just iterating through possible private keys, there is a significant advantage to doing this because we're actually enumerating a significantly smaller space and then examining its injection into the set of possible private keys, and using it to derive an ordering over the set of possible private keys.

So now we are no longer searching for RIPEMD160 collisions, we're actually searching for collisions in the space of passwords, but we still need to look up these collisions in the RIPEMD160 hashes for a given UTXO set using this complicated mapping.

In other words, weak brain wallets ARE low entropy private keys, because brain wallets are precisely this mapping between passwords and private keys.

It's helpful to emphasize that entropy is a relative concept (knowing how a certain subset of the possible private keys can be represented with a smaller number of bits is what matters, without that knowledge all private keys require 256 bits of entropy).

For truly weak brain wallets, the additional overhead of deriving the mapped values is negligible, because iterating through the smaller space of weak passwords is orders of magnitude cheaper than enmuerating a larger space (e.g. the space of RIPEMD160 collisions on a given UTXO set, which is what the large bitcoin collider attempts, which in turn is orders of magnitude smaller than e.g. P2WPKH collisions, since those use SHA256 instead of RIPEMD160 to identify pubkeys).

Brainwallets are a way of biasing the search space of all possible RIPEMD160 collisions derived from a private key, that assumes that some RIPEMD160 hashes correspond to keys that under the brainwallet model have a very low entropy representation.

However, if considering the search space of randomly generated passwords, which is significantly larger than the search space of possible keys, and involves more work to map to the space of private keys / redeem scripts, it's obvious that it's much cheaper to just try private keys directly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment