Skip to content

Instantly share code, notes, and snippets.

@dabura667
Last active August 29, 2015 14:15
Show Gist options
  • Save dabura667/875bb2c159b219c18885 to your computer and use it in GitHub Desktop.
Save dabura667/875bb2c159b219c18885 to your computer and use it in GitHub Desktop.
A proposal for an improved HD wallet generation algorithm

WARNING: Current proposal does not protect against colluding of a single ancestor public node and a single descendant private node. This is still a WIP.

Abstract

The current derivation method of Hierarchical Deterministic wallets has a weakness in which any individual private key may be combined with any ancestor extended public key (as long as there are no hardened keys in between) to generate the associated extended private key.

This proposal will set out to eliminate this weakness by:

  1. Using 1 leak protection of keys
  2. Using convention to prevent multiple keys being derived from the same parent.

Motivation

There is a lot of considerations for security in use cases that would benefit from an extended public key on a server to generate public keys. Currently there are even certain business models that are impossible to use BIP32 due to the weakness.

One example is the service of Reality Keys. The site must generate a yes and no address for every bet that takes place, and reveal the private key to the winning address as a smart contract + oracle based betting service. As they are purposely giving out private keys, they can not use BIP32 via an online Extended public key, as one private key would reveal all private keys for all bets.

Another example are organizations that would want to use HD wallets to give departmentalized extended private keys without having a risk of colluding with a third party auditor with the Extended Public Key that is ancestor to the colluder's key.

In short, the motivation of this proposal is to allow the great feature of Extended Public Keys to be used without worry of the destruction of the whole tree later down the line.

Technical Specification

Master generation

(Note: All HMAC-SHA512 are split into left 32 bytes and right 32 bytes)

a, c1 = HMAC-SHA512(seed)
b, c2 = HMAC-SHA512(concat(a, c1, seed))
chaincode = c1 xor c2

Deriving from Private Node (Extended Private Key): (Contents: chaincode, a, b)

(Deriving index 3)
i3, c1 = HMAC-SHA512(concat(3,chaincode,A))
j3, c2 = HMAC-SHA512(concat(3,chaincode,B))
New chaincode = c1 xor c2
 
x3 = i3 * a
y3 = j3 * b

p3 = x3 + y3

a3 = x3 + p3
b3 = y3 + p3

k3 = a3 + b3

k3 = the derived private key

Cost Increases deriving from Private Node

  • Proposed Calculation cost: 2 EC Multiplication + 2 HMAC-SHA512 + 1 XOR + 2 BIGINT Multiplication + 6 BIGINT MOD + 4 BIGINT Addition
  • For Parent privkey to Child pubkey add 1 EC Multiplication)
  • Current BIP32 cost: 1 EC Multiplication + 1 HMAC-SHA512 + 1 BIGINT MOD + 1 BIGINT Addition
  • Extra calculations: 1 EC Multiplication + 1 HMAC-SHA512 + 1 XOR + 2 BIGINT Multiplication + 5 BIGINT MOD + 3 BIGINT Addition

Deriving from Public Node (Extended Public Key): (Contents: chaincode, A, B)

(Deriving index 3)
i3, c1 = HMAC-SHA512(concat(3,chaincode,A))
j3, c2 = HMAC-SHA512(concat(3,chaincode,B))
New chaincode = c1 xor c2

X3 = i3 * A
Y3 = j3 * B

P3 = X3 + Y3

A3 = X3 + P3
B3 = Y3 + P3
 
K3 = A3 + B3

K3 = the derived public key

Cost Increases deriving from Public Node

  • Proposed Calculation cost: 2 EC Multiplication + 2 HMAC-SHA512 + 1 XOR + 4 EC Addition
  • Current BIP32 cost: 1 EC Multiplication + 1 HMAC-SHA512 + 1 EC Addition
  • Extra calculations: 1 EC Multiplication + 1 HMAC-SHA512 + 1 XOR + 3 EC Addition

Conventions

  1. All generated keys must end in key code index for last layer (abbreviated as "k")
  2. All nodes generated must have two consecutive last layers as node code index (abbreviated as a single "n"). Nodes MUST end paths with n.
  3. All intermediate layers must not use key or node codes as index
  4. Any path not ending in k or n will be assumed to be a key, and automatically place a k as the extra last layer.
  5. A key can not be the first layer, nor may it immediately follow a node layer. ("k" may not directly follow "m" or "n")

Key and Node index values

(Notice: 5 byte index instead of 4)

 key code index = (0x00 || 0xffffffff)
node code index = (0x01 || 0xffffffff)

New Master/Node format (xprv/xpub)

  • Only difference from BIP32 is adding 33 bytes after public/private key bytes to accommodate for both a/A and b/B.
  • Private node will have 1 0x00 padding byte for each key (a and b) to preserve the byte length of the node.
  • Magic numbers will change to 0xc88efaba for private ("xprv") and 0xc88fb5c4 for public ("xpub") masters/nodes.
  • Nodes' child numbers shall contain the index of the last non-node layer. So the node m/2/n will be index 0x00000002
  • Parent fingerprint will be the first 4 bytes of the pubkey hash of the parent's compressed "key" (Which is the point obtained from the EC addition of the two pubkeys within the parent)

Examples

(Using m/i'/c/k setup from BIP32)

Keys

Key Paths Key Explanation
m/0/1/0/k First Account's First Change Key
m/0/0/3/k First Account's Fourth Receiving Key
m/4/0/7 Fifth Account's Eighth Receiving Key (same as m/4/0/7/k)

Nodes

Key Paths Key Explanation
m/0/n First Account's Node (Extended Key)
m/0/n/2/0/4/k First Account's Node's Third Account's 5th Receiving Key

Notice how "m" may derive over "n" to generate lower layers)

Note on Depth

Path  : m/0/  n  /2/0/...
Depth : 0.1.(2.3).4.5....
  • "n" represents both the third and fourth depths.
  • The Extended key will contain the depth of 0x03 (The deepest depth of the 2 layers)

Considerations for derivation standards

One consideration that must be made is that any derived key path can only have as many Public nodes that can derive it as the number of "n" node layers above it + 1 (for the master node "m")

Example: A key such as the following BIP44 key would need to be altered to follow the functionality sought out in BIP44

BIP44 path New HD path
m/44'/0'/0'/0/0 m/44/0/0/n/0/0/k
The logic behind this conversion is asking the question "which layer would need a node in the future?" to which the only one would be the "Account" layer. However, This may be made obsolete by the fact that releasing M (Master Public Node) would not compromise any private keys, but would be useful for use cases such as "I will give you one of the Account Public Nodes of my chain and you can make payments to those addresses incrementally." where you don't want to give the other party access to your whole wallet's public keys.

"Two Leaked Keys" Weakness

(Stronger than BIP32, which is completely broken with one leaked key)

An attacker could theoretically derive the parent Private Node (extended private key) if he had 2 things:

  1. The associated Public Node (extended public key)
  2. Any 2 private keys that are child to the Public Node

(Note: This attack does not work with a grandparent Public Node and 2 grandchildren private keys)

Equations

Knowns: k0, k1, A, B, chaincode (2 derived keys and the A and B pubkeys + chaincode from their parent Public Node)

... i0 = HMAC-SHA512(0,chaincode,A)
... i1 = HMAC-SHA512(1,chaincode,A)
... j0 = HMAC-SHA512(0,chaincode,B)
... j1 = HMAC-SHA512(1,chaincode,B)

And now we have: k0, k1, i0, i1, j0, j1, A, B, chaincode

  • Since Bn = 2*jn*B + in*A, we can replace it as such:
(G is the generator point)
k0*G = 3*i0*a*G + 3*j0*b*G
k1*G = 3*i1*a*G + 3*j1*b*G
  • Cross out the generator points to get:
k0 = 3*i0*a + 3*j0*b
k1 = 3*i1*a + 3*j1*b
  • Bring a to one side on both to subtract equations:
a = (k0 - 3*j0*b)/(3*i0)
a = (k1 - 3*j1*b)/(3*i1)
  • Bring b to one side:
0 = (k0 - 3*j0*b)/(3*i0) - (k1 - 3*j1*b)/(3*i1)
0 = 3*i1*k0 - 3*i1*j0*b - 3*i0*k1 + 3*i0*j1*b
3*i0*k1 - 3*i1*k0 = b(3*i0*j1 - 3*i1*j0)
b = (3*i0*k1 - 3*i1*k0)/(3*i0*j1 - 3*i1*j0) mod n
  • Now we know b, plug in and solve for a:
a = (k0 - 3*j0*b)/(3*i0)

Knowing only 1 child privkey you can not solve for parent privkey

This is because knowing only k0 = i0*a + j0*b, there are 2 unknowns (a,b) and only 1 equation.

Since convention will require wallets to generate keys alone on their lowest layer (using the key code index) it is impossible for an attacker to retrieve 2 keys from the same parent.

However, a malicious implementation of the method could purposefully generate multiple keys from the same parent in order to break the wallet in the future... but by this point, there are other more conventional attacks they could carry out.

Also, requiring nodes (descendant xpub/xprvs) that are shared to others (ie. departments of a company) to derive two layers of depth using index values that are not allowed for normal key derivation, we further protect the possibility of multiple department heads colluding, as they could not possibly be under the same parent.

Implementation

I have yet to make a UI for it, but I have modified bip32.js from my website to accommodate these new rules. Here is the diff. https://github.com/dabura667/NewBIP32js/compare/oldbip32...newbip32

@jhoenicke
Copy link

You wrote:

An attacker could theoretically derive the parent Private Node (extended private key) if he had 2 things:

  1. The associated Public Node (extended public key)
  2. Any 2 private keys that are child to the Public Node

(Note: This attack does not work with a grandparent Public Node and 2 grandchildren private keys)

The last note is not true. If you have a public node, you can compute all i,j for all keys below this node. For some greatgreatgrandchild, a{1/2/3/4} = a * i1 * i2 * i3 * i4 holds, so you can see it as a * i for some i=i1 * i2 * i3 * i4 that you can compute. So i is a kind of summary of the i1,i2,i3,i4 on the path. If you now leak a public node and two private keys anywhere below this public node, I can compute the summarized i,i',j,j' and use the algorithm you sketched to recover a,b and all private keys.

@dabura667
Copy link
Author

So if I gave you m/0/0/k and m/1/2/k privkeys (so you do not know their a and b, only their added k values) and I gave you M (the Master public node) you would be able to calculate the a and b of the Master Private Node?

I would be interested to see how that works... If you could point that out I would appreciate it.

@dabura667
Copy link
Author

Looks like I'll need to go back to the drawing board on this one...

While keys may be safe (I am awaiting on your reply) your mention of i = i1 * i2 * i3... got me thinking that any ancestor Public Node and any descendant Private Node could still collude to generate the Private Node of the ancestor... ie. a * i = ax mod n so a = ax / i mod n where x just means "the descendant" and i is the i values generated all the way down multiplied together (which can be calculated by an ancestor Public Node) and the same could be done for b as well. So the Public Node <> Private Node collusion weakness is still at 1...

@dabura667
Copy link
Author

Thanks for the feedback, @jhoenicke

I have added b into a's derivation equation and a into b's so now both individual keys and nodes can not be colluded as long as they are not multiple generated from the same parent.

Please let me know if you see any more weaknesses.

Thanks!

@oleganza
Copy link

In your example of Reality Keys someone gives out private keys. Then it means they must have access to a parent private key, so they can simply use hardened derivation so revealing private key won't be an issue. Or do I miss something there?

@dabura667
Copy link
Author

@oleganza It would be better to have a xpub on the server deriving addresses at will, as without it the site admin would have to generate the public keys offline and upload them en masse to the server.
They can also "simply" use bitcoind to generate tons of keys too, but the seed gives an ease of backing up... the xpub on a server would also give the ease of pubkey generation without worry of leakage leading to destruction of the entire chain.

@jhoenicke I just realized that adding b and a into each others' equations still allows for colluding with the parent public node...

knowns: A, B, chaincode, a1, b1, chaincode1

definitions:
a1 = 2*i1*a + j1*b
b1 = 2*j1*b + i1*a

separate a in both equations:
a = (a1 - j1*b)/(2*i1)
a = (b1 - 2*j1*b)/i1

subtract equations
0 = (a1 - j1*b)/(2*i1) - (b1 - 2*j1*b)/i1

separate b:
0 = a1*i1 - j1*b*i1 - b1*2*i1 + 4*j1*b*i1
0 = b(4*j1*i1 - j1*i1) - (b1*2*i1 - a1*i1)
b = (b1*2*i1 - a1*i1)/(4*j1*i1 - j1*i1)
now solve for a etc...

This is very difficult...

Back to the drawing board.

@jhoenicke
Copy link

I see no easy way around this.

If you know the master public key, you can see a,b of the master node as the two only unknowns. Every other private key is a linear combination of a,b with known factors. For every known private key you get one linear equation in a,b. For every known master private key you get two linear equations in a,b.

So if you know the master public key and either one master private key or two private keys somewhere below in the hierarchy you get linear equations that you can solve for a and b. This is because a non-singular system of two linear equations with two unknowns is always solvable for the unknowns.

You need to somehow avoid the linear dependencies of the private keys. However, I think addition and scalar multiplication are the only sensible operations you can perform on the public keys to get a new public key. These correspond to linear operations on the private keys.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment