hhanh00/crypto-basic.md

## crypto-basic.md

      
    Raw
  

              crypto-basic.md
            
          
    Digital Signatures

Intro

Let's imagine an island whose inhabitants all have an identification number and a phone number. The identification number (ID) starts from 1 and the phone number is a 10 digit number.
They have a phone directory that lists everyone's ID and phone number.
The phone number is given to the user in a way that is difficult to understand.

Given a phone number, we cannot figure out the person ID. There is no reverse directory.

The impossibility of a reverse directory is strange. After all, if one has a phone directory why can't he make another directory by phone number? For the moment, let's assume that it is not possible. We'll see how we can do this later.
The person who assigns phone numbers is a strange mathematician and he decided that it would be fun to give them so that if you take the ID of any two people, sum them together and look at the phone number of the result, you will find that it is the same as 'combining' their phone numbers. Essentially, he has a way to combine phone numbers together that we call | such that

Phone(ID1) | Phone(ID2) = Phone(ID1+ID2)

From this property, he shows that

Phone(ID)|... [repeated N times] = Phone(ID+... [repeated N times])
= Phone(ID x N)

It comes from the previous property applied N times in a row.
Exactly how | is defined is not important, the relationship between | and + is what matters.
With this clever trick, the mathematician at the phone company saved a lot of paper! Because he can claim that phone books are not necessary. He publishes Phone(1) and with the previous relationship he can tell people that they can figure out their phone number themselves.
After all,
Phone(2) = Phone(1)|Phone(1)
Phone(3) = Phone(1)|Phone(1)|Phone(1)
...

Some people complained that he was just lazy and that it is quite unconvenient for the people with a high ID number but he replied that he has a fast method for them.
Let's say your ID is 8. You may choose to combine Phone(1) 8 times or you can double:

Phone(8) = Phone(4)|Phone(4)
Phone(4) = Phone(2)|Phone(2)
Phone(2) = Phone(1)|Phone(1)

Only 3 operations instead of 8. Not bad at all. If we want Phone(10), we first calculate Phone(8) and Phone(2) then combine them together.
He also says that his method is extensible. If new people come to the island, they get assigned a phone number with this method and there is no need to print a new phone book. Everyone wins.

The only problem is that a reverse phone directory is impossible (It could be that | is reversible but let's say that it is not). In other words, ID -> Phone(ID) is easy but Phone(ID) -> ID is too hard.

A dedicated spook could try to build a reverse directory by calculating every phone number starting from 1. To make this impractical, the island authorities extended the ID range to 40 digits (between 1 and 10000000000000000000000000000000000000000) and they did the same with the phone numbers too. Only a tiny tiny part of the possible IDs and phone numbers are in use. It's a complete deterrent to the reverse phone book approach.

Your ID is known only to you. Your phone number is public.
This property will be used to our advantage to build a crypto system.

Sharing a secret

Let's see some practical use. Alice and Bob want to communicate secretly but they know that other people can listen on their conversation.
They know of a good encryption algorithm. Problem is, the algorithm has a parameter - the key. If both of them use the same key, the algorithm works. Alice can encrypt and Bob can decrypt. Without the key, the conversation looks like gibberish.
Great, but how will they tell the key to each other? The key has to be agreed on before the encryption can take place. But without encryption, anyone could listen and grab the key. It seems impossible - a chicken and an egg problem.
The same mathematician tells them that he knows a solution.
First of all, he observes that the value of the key doesn't matter. Both Alice and Bob must agree on the same value - that's it.

He tells Alice to pick a random IDa and calculate Phone(IDa). The IDa may be in use or not. It's not relevant for this method.
She sends Phone(IDa) to Bob.
Bob does the same thing. He picks a random IDb and sends Phone(IDb)
The shared secret is Phone(IDa x IDb)

Alice can calculate it from IDa (her secret choice) and Phone(IDb) (Bob gave it to her).
Phone(IDb)|Phone(IDb)|...|Phone(IDb) [repeat IDa times] = Phone(IDa x IDb)

and likewise Bob can calculate it from IDb and Phone(IDa)
Phone(IDa)|Phone(IDa)|...|Phone(IDa) [repeat IDb times] = Phone(IDa x IDb)

In both case, they would use the doubling technique to quickly combine the same phone number repeatedly.
The poor eavesdropper cannot figure out the secret because she doesn't have IDa or IDb.
Signing a message

Another application of these numbers is digital signatures. Now Alice has a message she wants to send to Bob. The message is not encrypted and it's ok if someone else can read it. However, the message will go through many intermediaries before Bob receives it. Alice is worried that someone will modify her message along the way and Bob will be tricked into believing that the modified message is from her.
She wants to add a 'signature' to the message so that if anyone modifies the message, Bob will be able to tell because the signature will not match. For this to work,

Alice must be able to apply her signature and Bob must be able to recognize it
If the message is modified by Eve, Eve cannot create a signature that works with the modified message

The signature scheme that we present here is named after its inventor Schnorr. There are other digital signature algorithms. Another popular one is DSA but Schnorr is simplier and as powerful.
Before we get into the algorithm itself, we need to talk about hash functions.
The message that Alice wants to sign can be long. It could be 100 bytes or even 1 GB. Instead of working on the message itself, the signature works on a shorter form of it called a digest or a hash.
The main idea of a hash function is that it shuffles and mixes the content of the message in a reproducible way but it gives a result that is impossible to guess. The output will always have the same length - enough to make it impossible to try out all the numbers. For example, a commonly used hashing function SHA-256 returns a 256-bit result, i.e. 32 bytes.
32 bytes is much shorter than the length of some of the input messages. A typical page would have thousands of bytes. Yet, the function is designed so that the result still cannot be guessed in advance.

You change 1 byte of the message and the result is completely different
You have the result but you cannot find a different input that would give the same result
In fact, you cannot give any pair of messages which give the same result

That is how good the hash function is. If any of these statements is remotely false, then the hash function is not good at all.
Now that we have a hash function, we can build the digital signature.
The principle is similar to what Alice did for sharing a secret with Bob. This time Alice does not receive a reply from Bob though.
She starts by choosing a random ID like before. She calculates Phone(ID). It will be the first part of the signature. She appends Phone(ID) to her message and calculates the hash of it. Now
she has a value that is tied to her message and to Phone(ID). If someone changes either of them, the hash will be different. Let's call the hash H.
She has her own ID_A and Bob knows her phone number Phone(ID_A).
She wants to give to Bob something that he can use to verify against Phone(ID). Obviously sending ID is pointless. It is not related to the message or to Alice.
But if Alice sends ID - H x ID_A, then Bob can verify the signature by following the steps:

Calculate Phone(ID - H x ID_A)
He has Phone(ID) (it's the first part of the signature) and he has the message. He can calculate H just like Alice did.
Now he can calculate Phone(ID_A)|... [repeated H times] using the doubling trick. This is the same as Phone(H x ID_A)
Finally he calculates Phone(ID - H x ID_A) | Phone(H x ID_A)
This is the same as Phone(ID). So at the end, if everything checks out he should find the same value as the first part of the signature.

If the message was modified, H will be different. Eve cannot sign for Alice because she doesn't know ID_A.
This doesn't prove that the signature is without backdoors though. A security proof is much more complicated.
DSA is based on the same principles but works differently to avoid some patent issues.