srikanthmanda/sec-comms.md

## sec-comms.md

      
    Raw
  

              sec-comms.md
            
          
    Secure Communications

What? Keeping secrets secret.
How? Encryption.
What is Encryption?

Turning something intelligible i.e., which can be understood/comprehended easily, into something unintelligible i.e., meaningless.
Decryption is the reverse operation of Encryption.
A Brief History

One early need for encryption was the need to send battle plans and strategies in the open, without enemy understanding the communication.
Some early techniques:

Caesar cipher - Substitute the letters in input with letters that are fixed number of positions away in the alphabet.
Polyalphabetic cipher - Substitution, but the position shift of the letters is cycled from a list of numbers.

In both the above techniques, the secret part is the list of position shift numbers. This is called the key. Since same key is used for encryption and decryption, we may consider them to be symmetric encryption techniques.
Demo

Caesar cipher

Substitute the letters of input by letters 1 (one) position away in the alphabet.

Input, also called plaintext: HELLO WORLD
Output, also called ciphertext: IFMMP XPSME

Here the key is 1 (one).
Link: Demo
Polyalphabetic cipher

Substitute letters' shift will be based on the list 11-4-23.


The 1st letter H will be replaced by the letter 11 letters away in the alphabet, S.


The 2nd letter E will be replaced by the letter 4 letters away in the alphabet, I.


The 3rd letter L will be replaced by the letter 23 letters away in the alphabet, I.


The 4th letter L will be replaced by the letter 11 letters away in the alphabet, W.


The 5th letter O will be replaced by the letter 4 letters away in the alphabet, S.


So shifts applied to each input letter are: 11-4-23-11-4-23-11-4-23-11.


Input, also called plaintext: HELLO WORLD


Output, also called ciphertext: SIIWS TZVIO


Here the key is the list 11-4-23. Since these numbers are less than 26, we may represent the key itself as K-D-W.
Encryption Strength

The strength of an encryption technique is essentially how hard it is to decipher the secret without knowing the key.
One way is brute forcing all possible permutations and combinations of the key. How to mitigate this problem? By increasing the length of the key. The longer the key, the more guesses are needed.
So key length is an important factor in encryption strength.
But given enough compute cycles, any key can be brute forced. How to work around this? By changing the key itself. If a key is not reused, a brute forced key of one communication will not work for another one. But what if the key pattern itself can be guessed? (Remember how you used to change your laptop password from @CCen123 to @CCen124?) This is a very real problem and the solution is randomness.
Randomness of the key is another important factor in encryption strength.
Can things be mode even harder? Yes, by increasing the steps in the ecryption process. For e.g., following steps in addition to substitution:

Swap adjacent letters
Reverse the string
Split the string in half and swap the two portions.
Do different operations on letters in even and odd number positions.

The idea is to make the cipher text as far removed from the input text as possible.
The following modern encryption algorithms are designed to be secure, fast:

Advanced Encryption Standard (AES)
Serpent
Twofish

AES, being a standard, is supported even at hardware level. This support and resultant performance is crucial for applications like BitLocker that perform harddisk encryption.
Limitation

The techniques and algorithms mentioned above use symmetric keys. While performant, these do have a limitation that both the parties of the communication should know the same secret. While this works where the two parties can meet and share the secret, it doesn't scale for something like the Internet.
Diffie–Hellman Key Exchange

Diffie-Hellman key exchange is an algorithm for two parties to share something in the open and arrive at a shared secret. It uses a one way function that is easy to perform but are impossible or very hard to reverse. A physical example - it is easy to mix two different colours and form a third colour. But once they are mixed, it is very hard to separate the two colours from the third colour.
The mathematical equivalent is discrete logarithm problem. It takes the form g^x % p, where p is a prime number and g is primitive root of p. What primitive root means is that the result of the operation g^x % p is equally likely to be any of the numbers from 0 to p-1, for any x.
The two parties can publicly agree upon a g and p pair, come up with a secret x and share the results r of the operation g^x % p publicly. Both parties can then arrive at a shared secret by performing in the operation x^r % p, where x is their secret and r is the result shared by the other party.
Demo


Party A and party B publicly agree on g = 5 and p = 23.
Party A sets secret x = 4, performs 5^4 % 23 and shares the result 4.
Party B sets secret x = 3, performs 5^3 % 23 and shares the result 10.
Both parties perform the equivalent operations 10^4 % 23 & 4^3 % 23 and arrive at shared secret 18.

Due to the nature of the discrete logarithm problem, it is computationally intensive to arrive at the shared secret 18, without knowing either party's secret, 4 or 3. Knowing 5 (g), 23 (p) and the results 10 & 4 does not help either.
Usage

Using this Diffie-Hellman and related key exchange algorithms, two parties can arrive at a shared secret key that can then be used with the actual encryption algorithm like AES.
Identify & Authentication

Using Diffie-Hellman key exhange, two parties can share a secret and using that secret and a symmetric encryption algorithm like AES, they can communincate securely out in the open without anyone being privy to the information they are sharing.
However, communication over computer networks has one more problem - Authentication. How do the two parties know for certain that the other party is who they claim to be? When you log into Gmail, how do you know it is Google's email service and not a spoof being run by your internet service provider or your company or your government?
Asymmetric Key Encryption

An alternative to symmetric key encryption is using two different keys - one for encryption and the other for decryption. This is referred to as asymmetric key encryption. In this technique, a party has a pair of keys - a private key they keep secret and a public key they share openly. The fact that these keys work only as a pair can solve the identity and authentication problem if a trusted party can vouch for the ownership of the public key.
Phi (Φ) Function

Phi function is a one way mathematical function. It is easy to perform but hard to reverse, like the differential logarithm problem.

For a number n, Φ(n) is the largest number between 0 and n-1 that does not share a factor with n.
Phi (Φ) function is also multiplicative i.e., Φ(n * m) = Φ(n) * Φ(m).
By the nature of prime numbers, for a prime number p, Φ(p) = p-1.
So for two primes p and q - Φ(p * q) = Φ(p) * Φ(q) = (p-1) * (q-1).
With prime numbers, it is easier to calculate the product but hard to determine prime factors of the product.

RSA Algorithm

A pair of prime numbers can be used with Phi function to generate a pair of keys, one public and one private, which can be used to encrypt and decrypt asymmetrically.
Steps:

Pick two prime numbers p, q.
Determine e, the smallest prime number that is NOT a factor of Φ(p * q) i.e., Φ(p * q) % e != 0.
Determine d, an integer of the form ((K * Φ(p*q)) + 1) / e, where K is an integer too.
Key Pair: e & (p * q), d & (p * q)
Encryption: (input)^e % (p * q)
Decryption: (output)^d % (p * q)

Usage

Both keys can encrypt data that can be decrypted only by the other key. So either of the keys can be made private (or public), but not both.
Demo


p = 23, q = 31
p * q = 713
Φ(p * q) = (p - 1) * (q - 1) = 22 * 30 = 660
e = 7 as -

660 % 2 = 0
660 % 3 = 0
660 % 5 = 0
660 % 7 = 2 != 0


d = 283 as -

((1 * 660) + 1) / 7 = 94.428
((2 * 660) + 1) / 7 = 188.714
((3 * 660) + 1) / 7 = 283 - an integer


Key pairs: (7, 713) and (283, 713)
Encryption: input = 42, output = 42^7 % 713 = 199
Decryption: output = 199, input = 199^283 % 713 = 42

Public Key Infrastructure (PKI)

SSL Certificate = Owner Information + Owner's Public Key + Issuer Information + Issuer's Digital Signature
Digital Signature is information encrypted with a private key. If a public key can decrypt the information, then the public and private keys are a pair and should belong to same entity.
By this principle, if a system/browser/app has a public key of an issuer it knows, then that issuer can vouch for ownership of another public certificate by signing the ownership information using their own (i.e., the issuer's) private key. Thus a chain of trust can be built starting with one known entity. This is the basis of Public Key Infrastructure (PKI).
There are a handful of Certificate Authorities (CAs) like Quovadis whose public certificates are embedded in many software. These CAs issue certificates to what are called intermediary CAs. Most often it is these intermediary CAs, like LetsEncrypt that issue the end user certificates to websites, persons etc.
Steps for getting a SSL certificate:

Owner generates a pair of public and private keys.
Owner generates a certificate signing request (CSR) with their information and public key.
CA signs the CSR with their private key and issues the owner a SSL certificate.

How SSL certificates work:

Client initiates a connection with a server.
Server presents its certificate.
Client retrives the issuer's public key and verifies the signature of the certificate.
If the issuer certificate is not in the Client trust store, the Client continues to verify every certificate in the chain until it finds a certificate in its trust store.
The connection is closed if no certificate in the chain is not found in the trust store.

Resources


OpenSSL Cookbook
Art of Problem Cryptography Videos
Cipher Suite
Computerphile Videos:

Almost All Web Encryption Works Like This (SP Networks)
Public Key Cryptography
Prime Numbers & RSA Encryption Algorithm
Secret Key Exchange (Diffie-Hellman)
Man in the Middle Attacks & Superfish
Transport Layer Security (TLS)
TLS Handshake Explained
AES Explained (Advanced Encryption Standard)
What are Digital Signatures?
Diffie Hellman - the Mathematics bit
Key Exchange Problems
One Encryption Standard to Rule Them All!
128 Bit or 256 Bit Encryption?


Appendix

Randomness

In English, the most common letter is E. So in a ciphertext, the most common occurrence of a letter/pattern may correspond to E. So text that is not random enough may reveal patterns that leak information. During World War 2, repeated use of the phrase "Heil Hitler" in the encrypted German communications led to it being deciphered long before Allies could break the Enigma machine code (reference).
Computers are deterministic, i.e., they are not very good at randomness. The random number generators in computers are actually pseudorandom. They all take an initial seed to generate sequence of random numbers. The randomness of the sequence depends on the randomness of the initial seed. Things like our mouse movements, CPU temperature are used in generating this initial seed. A tidbit - CloudFlare uses a group of Lava lamps to generate seeds that underpin the SSL encryption of a lot of websites (reference).