What's this all about?
Digital cryptography! This is a subject I've been interested in since taking a class with Prof. Fred Schneider back in college. Articles pop up on Hacker News fairly often that pique my interest and this technique is the result of one of them.
Specifically, this is about Lamport signatures. There are many signature algorithms (ECDSA and RSA are the most commonly used) but Lamport signatures are unique because they are formed using a hash function. Many cryptographers believe that this makes them resistant to attacks made possible by quantum computers.
How does a Lamport Signature work?
Here's the long version. This is the short copy without all the parameterization:
- Come up with 2 sets (set A and set B) each containing 256 random 256-bit numbers. Keep these secret! They're your private key.
- Take the SHA-256 hash of each of your secret numbers. These 512 hashes are your public key.
- Get the SHA-256 hash of whatever document you want to sign.
- For each bit i of the hash, from 0 to 255: If the bit is a 0, publish the ith number from secret set A. If it is a 1, publish the ith number from secret set B. Destroy all unused numbers.
- You now have a signature (the 256 random numbers from step 4 corresponding to the bits of the hash from step 3) and a public key (the 512 hashes from step 2 that define if the value of a bit is 0 or 1 for each bit of the signature).
Because hashes are one-way functions, it is computationally hard to forge the secret, random numbers you created in step 1 that would allow an attacker to change your signature. In other words, it takes an impractically huge amount of processing power for an adversary to produce verifiable proof that you signed something other than what you actually signed.
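The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not a vetted implementation: the function names (keygen, sign, verify, digest_bits) are made up for this example, and it assumes the hash bits are read most-significant-first, which the description above does not specify.

```python
import hashlib
import secrets

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def digest_bits(message: bytes) -> list:
    """Return the 256 bits of SHA-256(message), most significant bit first (an assumption)."""
    return [(byte >> (7 - j)) & 1 for byte in sha256(message) for j in range(8)]

def keygen():
    # Two sets (index 0 = set A, index 1 = set B) of 256 random 256-bit secrets.
    private = [[secrets.token_bytes(32) for _ in range(256)] for _ in range(2)]
    # The public key is the hash of every secret: 512 hashes in total.
    public = [[sha256(secret) for secret in row] for row in private]
    return private, public

def sign(private, message: bytes) -> list:
    # A 0 bit reveals the ith secret from set A, a 1 bit the ith secret from set B.
    return [private[bit][i] for i, bit in enumerate(digest_bits(message))]

def verify(public, message: bytes, signature: list) -> bool:
    bits = digest_bits(message)
    # Each revealed secret must hash to the public value selected by its bit.
    return len(signature) == 256 and all(
        sha256(revealed) == public[bit][i]
        for i, (bit, revealed) in enumerate(zip(bits, signature))
    )
```

After signing, the private key should be discarded, since any second signature made with it reveals more of the secret numbers.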
There are two snags with using Lamport signatures in practice:
- This is a one-time signature. You can't sign anything else with the same public key. Doing so reveals more of your secret numbers and would allow an attacker to forge your signature for other documents. Fortunately, this is easily solved with a hash tree that merges many public keys down to a single "root". This is called the Merkle signature scheme.
- The signatures are enormous compared to ECDSA or RSA. Signing an N-bit hash requires N numbers of N bits each. That's 8 KiB for signing a 256-bit hash or 32 KiB for a 512-bit hash, which is over 100x the size of a comparable ECDSA signature.
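To make the hash-tree idea from the first snag a bit more concrete, here is a rough sketch of folding several one-time public keys into a single root. The merkle_root name, the use of SHA-256 for the tree, and the choice to duplicate the last node on odd-sized levels are all assumptions for illustration. A full Merkle signature scheme also has to publish an authentication path with each signature, which this sketch leaves out.

```python
import hashlib

def merkle_root(leaves):
    """Fold a list of byte strings (e.g. serialized one-time public keys) into one root hash."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            # Odd number of nodes: duplicate the last one (one common convention, assumed here).
            level.append(level[-1])
        # Hash each adjacent pair of nodes to build the next level up.
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]
```

Only the 32-byte root needs to be published ahead of time, no matter how many one-time keys sit underneath it.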
How to Shorten Lamport Signatures
The Wikipedia article outlines ways to shorten the private key using a cryptographically secure pseudorandom number generator and compress the public key with a hash list. No solution for shortening the public signature itself is published. That's what I hope to contribute with this article!
The Other End of the Spectrum: A Hash Ladder
First, I present a toy demonstration of the algorithm. It is impractical to implement for a number of reasons, but it shows the functionality. After we go through this, I'll modify it to make it usable in practice.
Assume you have a 3-bit hash of a document, H_3bit(document) = 6. Pick two random 5-bit values, 12 and 29, as your private key. Starting from each value, hash repeatedly with a perfect random 5-bit oracle called H_5bit until each chain holds 2^signed_hash_bits + 2 = 2^3 + 2 = 10 values (the secret plus 9 successive hashes). Each possible value of your 3-bit hash has a corresponding index in these chains of 5-bit hashes:
    hash ix:          0      1      2      3      4      5      6      7
    A:        12 -->  5 --> 24 -->  9 -->  1 --> 15 --> 10 -->  8 --> 11 --> 27
    B:        23 <-- 19 <-- 20 <--  4 <-- 13 <-- 17 <--  3 <-- 14 <-- 28 <-- 29
                                                                ^
A starts on the left and repeated hashes are listed to the right.
B starts on the right and repeated hashes are listed to the left. This forms the "ladder" metaphor, with pairs of hashes making the "rungs". The hash is 5 bits to accommodate at least 20 unique non-colliding hashes. In this example, I made up a perfect random oracle hash function with no collisions. In practice, the hash values are incredibly unlikely to collide because their range is 256+ bits rather than 5.
The values at the end of each chain (A=27, B=23) are your public key.
To sign your document with hash 6, find the pair of values at that index: (A=8, B=14). By publishing these values, anyone with the public key (A=27, B=23) can verify that the pair (A=8, B=14) corresponds to the value 6:
- Take the A part of the signature pair, 8. Hash it until it equals 27, the A part of the public key. This takes 2 iterations.
- Take the B part of the signature pair, 14. Hash it until it equals 23, the B part of the public key. This takes 7 iterations.
- The number of signable positions on each hash chain is 2^3 = 8. This is public knowledge as an algorithmic constant.
- 8-2 = 6, which is the claimed hash.
- 7-1 = 6, which is also the claimed hash. These values equal each other, and are thus the value that was signed by the owner of the public key.
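The toy verification can be replayed in a few lines of Python. The lookup table below stands in for the made-up oracle and contains only the transitions from the two example chains. The names H_5BIT, steps_to and signed_value are hypothetical helpers for this sketch.

```python
# Toy 5-bit "oracle": a lookup table holding only the transitions from the example.
H_5BIT = {12: 5, 5: 24, 24: 9, 9: 1, 1: 15, 15: 10, 10: 8, 8: 11, 11: 27,      # chain A
          29: 28, 28: 14, 14: 3, 3: 17, 17: 13, 13: 4, 4: 20, 20: 19, 19: 23}  # chain B

POSITIONS = 8            # 2^3 signable values, a public constant
PUBLIC_KEY = (27, 23)    # (end of chain A, end of chain B)

def steps_to(value, target, limit=POSITIONS):
    """Count hash applications needed to get from value to target, or None if unreachable."""
    for steps in range(limit + 1):
        if value == target:
            return steps
        if value not in H_5BIT:
            return None
        value = H_5BIT[value]
    return None

def signed_value(pair, public_key=PUBLIC_KEY):
    """Return the value that the pair signs, or None if the two legs disagree."""
    i_a = steps_to(pair[0], public_key[0])
    i_b = steps_to(pair[1], public_key[1])
    if i_a is None or i_b is None:
        return None
    from_a = POSITIONS - i_a     # 8 - 2 = 6 for the example pair
    from_b = i_b - 1             # 7 - 1 = 6 for the example pair
    return from_a if from_a == from_b else None
```

With this table, signed_value((8, 14)) returns 6, while a mismatched pair such as (11, 14) returns None because the two legs point at different positions.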
An adversary cannot create a valid signature for another value without inverting one of the hashes. For example, to change the pair to sign the value 7, the adversary must be able to solve H_5bit(x) = 14 for x in order to produce the pair (11, 28). Similarly, to change the pair and sign the value 5, the adversary must be able to invert H_5bit(x) = 8 and produce the pair (10, 3). I call this construct a hash ladder because every pair of hash values in the two rows locks each other in place and defines a distinct location on the number line.
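Here is a small sketch of what an adversary can and cannot do with the published pair (8, 14), again using a lookup table for the toy oracle. The attacker may apply the hash forward on either leg, but doing so only moves that one leg, so the two legs stop agreeing. The helper names forward and leg_positions are invented for this example.

```python
# Same toy oracle as in the example; anyone can evaluate it forward, but with a
# real hash function nobody could run it backward.
H_5BIT = {12: 5, 5: 24, 24: 9, 9: 1, 1: 15, 15: 10, 10: 8, 8: 11, 11: 27,      # chain A
          29: 28, 28: 14, 14: 3, 3: 17, 17: 13, 13: 4, 4: 20, 20: 19, 19: 23}  # chain B

def forward(value, steps):
    """Apply the toy hash the given number of times."""
    for _ in range(steps):
        value = H_5BIT[value]
    return value

def leg_positions(pair, public_key=(27, 23), positions=8):
    """Return the index claimed by the A leg and by the B leg of a pair."""
    def steps_to(value, target):
        steps = 0
        while value != target:
            value = H_5BIT[value]   # raises KeyError if the value is not on a chain
            steps += 1
        return steps
    return (positions - steps_to(pair[0], public_key[0]),
            steps_to(pair[1], public_key[1]) - 1)

signature = (8, 14)                                   # honest signature for the value 6
forged = (forward(signature[0], 1), signature[1])     # advance only the A leg
```

Here leg_positions(signature) is (6, 6), but leg_positions(forged) is (7, 6): the A leg now claims 7 while the B leg still claims 6. Making the B leg claim 7 as well would require 28, the preimage of 14, which is exactly the inversion described above.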
Now, we can't actually use this in practice without some modification. Real hashes are at least 256 bits. Creating a signature in this way for a 256-bit value would require a 257-bit hash function to be executed 2*2^256+2 times just to make the public key. Not only would one signature take longer to compute than the age of the universe, it isn't secure without a hash function that is a perfect random oracle. Any machine that could actually compute this function would be able to invert hashes by brute force anyway.
To shorten Lamport signatures with a hash ladder in practice, we need to chop up the hash to be signed into chunks with not very many bits (8-16) and create a ladder for each. With between 2^8 and 2^16 positions on the ladder, the ladder is short enough to be both computable and to have a very low probability of having hash collisions anywhere in the ladder itself.
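As a sketch of the chopping step, the function below splits a digest into k-bit chunks. The name split_into_chunks is hypothetical, and padding the final chunk with zero bits when k does not divide the digest size (for example k = 12 with a 256-bit hash) is an assumption of this example rather than something specified here.

```python
import hashlib

def split_into_chunks(digest: bytes, chunk_bits: int) -> list:
    """Split a digest into chunk_bits-wide integers, most significant chunk first."""
    total_bits = len(digest) * 8
    count = -(-total_bits // chunk_bits)           # ceiling division
    padded_bits = count * chunk_bits
    # Assumption: pad with zero bits on the right when chunk_bits does not divide the digest size.
    value = int.from_bytes(digest, "big") << (padded_bits - total_bits)
    mask = (1 << chunk_bits) - 1
    return [(value >> (padded_bits - chunk_bits * (i + 1))) & mask for i in range(count)]

digest = hashlib.sha256(b"example document").digest()
chunks = split_into_chunks(digest, 8)              # 32 values, each in [0, 255]
```

Each chunk value then selects one position on its own ladder.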
The Implementable Algorithm
While this can be parameterized to use different ladder chunks and different hash sizes, I present this actual algorithm using 8-bit chunks and 256-bit hashes.
- Take the SHA-256 hash of the document you want to sign
- Split the 256-bit hash of your document into 32 8-bit chunks
- For each chunk, generate a pair of secret random 256-bit numbers. These 64 numbers are your private key.
- Hash each of these numbers 257 times, so that each chain holds 2^8 + 2 = 258 values including the secret. This final set of 32 pairs of hashes is your public key. (Note: Use a hash list and this public key becomes just 256 bits)
- To create your signature, examine each chunk again. Let the value of this chunk be n with the range [0, 255]. There are 2 256-bit numbers of the private key associated with that chunk. Let a equal the first of these numbers hashed n+1 times. Let b equal the second of these numbers hashed 256-n times. Publish the result (a,b). This pair is your signature for this 8-bit chunk.
- Collect up the 32 signatures from each chunk, and you have a 32*2*(256/8) = 2 KiB signature! This is 4x smaller than the usual Lamport signature.
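A rough Python sketch of key generation and signing with 8-bit chunks follows. It assumes each secret sits 257 hash applications below its public value, with the A leg published after n+1 applications and the B leg after 256-n, because those counts are what make the verification checks (256 - i_a = V and i_b - 1 = V) come out consistent. The names hash_n, keygen and sign are invented for the example.

```python
import hashlib
import secrets

CHUNKS = 32          # a 256-bit digest split into 8-bit chunks
CHAIN_HASHES = 257   # assumption: 2^8 + 1 applications from secret to public value

def hash_n(value: bytes, times: int) -> bytes:
    """Apply SHA-256 to value the given number of times."""
    for _ in range(times):
        value = hashlib.sha256(value).digest()
    return value

def keygen():
    # One (A, B) pair of 256-bit secrets per chunk: 64 numbers in total.
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(CHUNKS)]
    public = [(hash_n(a, CHAIN_HASHES), hash_n(b, CHAIN_HASHES)) for a, b in private]
    return private, public

def sign(private, message: bytes) -> list:
    digest = hashlib.sha256(message).digest()   # each byte is one 8-bit chunk value n
    # Leg A climbs n + 1 steps from its secret, leg B climbs 256 - n steps.
    return [(hash_n(a, n + 1), hash_n(b, 256 - n))
            for (a, b), n in zip(private, digest)]
```

The resulting signature is 32 pairs of 32-byte values, or 2048 bytes. As with plain Lamport signatures, each key pair should sign only one document.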
To verify a signature:
- Take the SHA-256 hash of the document you want to verify
- Split the 256-bit hash of the document into 32 8-bit chunks
- For each chunk, let the chunk's value from the hash be V, the signature pair of numbers be (a, b) and the corresponding public key pair be (Pa, Pb).
- Hash a and count the iterations until it equals Pa or it has been hashed 256 times. If it was hashed 256 times without reaching Pa, the signature is invalid. Save the number of iterations it took to reach Pa as i_a.
- Repeat step (4) for b, saving the number of iterations to reach Pb as i_b.
- If 256-i_a != i_b-1 or 256-i_a != V, this signature is invalid.
- If there are more chunks, check the next chunk starting with step (3)
- The signature is valid if all chunks are signed correctly.
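The verification steps might look like this in Python. It is a sketch under the same assumptions as the 8-bit description (32 chunks, SHA-256 throughout), and it expects the public key and signature as lists of 32 byte-string pairs. The names verify and steps_to are hypothetical.

```python
import hashlib

def steps_to(value: bytes, target: bytes, limit: int = 256):
    """Count SHA-256 applications (0..limit) needed to reach target, or None if never reached."""
    for steps in range(limit + 1):
        if value == target:
            return steps
        value = hashlib.sha256(value).digest()
    return None

def verify(public, message: bytes, signature) -> bool:
    """Check a 32-pair hash-ladder signature against a 32-pair public key."""
    digest = hashlib.sha256(message).digest()   # each byte is one chunk value V
    if len(signature) != len(digest) or len(public) != len(digest):
        return False
    for v, (a, b), (pa, pb) in zip(digest, signature, public):
        i_a = steps_to(a, pa)
        i_b = steps_to(b, pb)
        if i_a is None or i_b is None:
            return False                        # a leg never reached the public key
        if 256 - i_a != i_b - 1 or 256 - i_a != v:
            return False                        # legs disagree, or do not match the chunk
    return True
```

Both legs are always checked: a pair whose A leg alone chains to the public key is rejected unless the B leg lands on the same position.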
We trade off storage size for computation. Rather than having to compute 256 hashes to verify a 256-bit signature, we now must compute at least 256/8 * 256 = 8192 hashes. However, given that hash functions are intended to be fast, this is likely to be a good tradeoff for cacheable chunk sizes.
In general, where n is the number of bits of the hash function and k is the bit size of each chunk:
- (n/8)*2*(n/k) bytes is the size of the public key
- n/k * 2^k is the number of hashes that must be computed to verify the key
| Hash Bits | Chunk Bits | Public Key Size | Relative Size | Hashes to Verify | Time to Verify @ 100 kHps |
|---|---|---|---|---|---|
| 256 | Lamport | 8192 bytes | 100 % | 256 | ~ 2.6 milliseconds |
| 256 | 8 | 2048 bytes | 25 % | 8192 | ~ 80 milliseconds |
| 256 | 12 | ~1400 bytes | ~17 % | ~90000 | ~ 1 second |
| 256 | 16 | 1024 bytes | 13 % | 1048576 | ~ 10 seconds |
| 512 | Lamport | 32768 bytes | 100 % | 512 | ~ 5 milliseconds |
| 512 | 8 | 8192 bytes | 25 % | 16384 | ~ 160 milliseconds |
| 512 | 12 | ~5500 bytes | ~17 % | ~176128 | ~ 1.8 seconds |
| 512 | 16 | 4096 bytes | 13 % | 2097152 | ~ 21 seconds |
100 kHps (100,000 hashes per second) was chosen from this list as most CPUs are able to do SHA-256 at the megahash-per-second level or better.
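The two formulas are easy to evaluate directly. The helper below is a sketch; the name ladder_cost and the default rate parameter are made up for the example, and fractional chunk counts (such as 256/12) are left unrounded, as in the formulas.

```python
def ladder_cost(hash_bits: int, chunk_bits: int, hashes_per_second: float = 100_000):
    """Return (size in bytes, hashes to verify, seconds to verify) for n = hash_bits, k = chunk_bits."""
    chunks = hash_bits / chunk_bits                   # n/k, fractional when k does not divide n
    size_bytes = (hash_bits / 8) * 2 * chunks         # (n/8) * 2 * (n/k)
    hashes = chunks * 2 ** chunk_bits                 # (n/k) * 2^k
    return size_bytes, hashes, hashes / hashes_per_second

for n in (256, 512):
    for k in (8, 12, 16):
        size, hashes, seconds = ladder_cost(n, k)
        print(f"n={n} k={k}: {size:.0f} bytes, {hashes:.0f} hashes, {seconds:.2f} s")
```

For n = 256 and k = 8 this evaluates to 2048 bytes, 8192 hashes and roughly 0.08 seconds at 100 kHps.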
I have been rewriting this page periodically to try to clarify the algorithm. It appears it still needs more work :)
The main thing that makes me think I haven't communicated it clearly is that the B leg of the ladder is absolutely critical to the algorithm. The fact that a single value hash-chains to the public key is, as you correctly called out, not at all secure. It is trivial to derive any value in that chain after the one provided by the signer and claim that corresponds to the value that was signed. What is impossible (computationally unfeasible) is changing its partner--the hash value in the other chain. Since the hash legs go in opposite directions, advancing the A leg requires inverting the hash on the B leg and vice-versa. There is exactly one location where the values from both legs of the ladder are public: the value that was signed. I think defining my random oracle and giving more thorough examples will help explain this.
In the toy example, the entire hash is 3 bits: i.e. the values of the hash can only be 0-7, so there is exactly 1 place that corresponds to that value on the hash ladder. In a real implementation, it is not feasible to have hash chains 2^256 links long. So, the hash must be split into smaller chunks, each with their own ladder. That is what is shown in the implementable algorithm and the table at the end.
Just as with a Lamport signature, a CSPRNG can be used (but is not required) to derive the roots of the A and B ladders. I left off compression methods like this because they are the same as for the usual Lamport signature.
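One possible way to derive the ladder roots from a single seed is sketched below. Using HMAC-SHA256 as the pseudorandom function and labelling the legs with "A" and "B" plus a chunk index are assumptions of this example, not part of the scheme as described. The name derive_roots is hypothetical.

```python
import hashlib
import hmac
import secrets

def derive_roots(seed: bytes, chunks: int = 32) -> list:
    """Derive one (A, B) pair of 256-bit ladder roots per chunk from a single seed."""
    def prf(label: bytes, index: int) -> bytes:
        # HMAC-SHA256 keyed by the seed, used here as a pseudorandom function.
        return hmac.new(seed, label + index.to_bytes(4, "big"), hashlib.sha256).digest()
    return [(prf(b"A", i), prf(b"B", i)) for i in range(chunks)]

seed = secrets.token_bytes(32)     # the only secret that has to be stored
roots = derive_roots(seed)         # 32 pairs, regenerated on demand
```

This shrinks the stored private key from 64 numbers to one 256-bit seed, at the cost of recomputing the roots whenever they are needed.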
I'll work on this explanation when I get some time. Please check back!