Skip to content

Instantly share code, notes, and snippets.

@hujohnson
Created August 7, 2013 18:08
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save hujohnson/fe56214c6676ccb3ee92 to your computer and use it in GitHub Desktop.
Save hujohnson/fe56214c6676ccb3ee92 to your computer and use it in GitHub Desktop.
A draft of RSA example
{
"metadata": {
"name": "sage_python"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": "#An example of RSA\nHow does RSA encrypt and decrypt information?\nI'll outline the process below. You can follow along to produce your own secret messages using [this](http://aleph.sagemath.org/) free SAGE notebook server. (when using the SAGE server, the **sg** prefix used in my code is superfluous.)\n\nThe mathematics used in this example is very simple number theory, but it is necessary that you understand the basics of modular arithmetic. If you've never seen congruence arithmetic before, you may want to take a moment to read this [article](https://www.khanacademy.org/math/applied-math/cryptography/modarithmetic/a/what-is-modular-arithmetic).\n\nThe line of code below is necessary for those who might want to follow along using the iPython notebook rather than the online SAGE notebook. Using SAGE from within iPython unfortunately does take some work -- [these](http://www.liafa.univ-paris-diderot.fr/~labbe/blogue/categorie/ipython/) instructions are a good starting place."
},
{
"cell_type": "code",
"collapsed": false,
"input": "#sage users do not need this line\nimport sage.all as sg",
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 7
},
{
"cell_type": "markdown",
"metadata": {},
"source": "RSA is an example of **public key cryptography**. The basic idea is that for every communicator (you!) there exists a *private key* and a *public key*, and together these keys allow for the encryption and decryption of messages to you. Anyone in the world can encrypt a message (using your **public** key) and send it to you in such a way that only you can decrypt the result (using your **private** key). Notice that in this scenario, \"you\" are the only agent receiving messages. Also note that the messages are encoded especially for you, using _your_ public key. To produce messages and send them secretly to other agents, you must use *their* keys, in particular their public key. They then decode messages encrypted for them via their private key.\n\nAt the most abstract level, you can think of a **key** as just being _information_. Of course disembodied information is somewhat mysterious and impractical. Keys are represented as integers, _i.e._ whole numbers. In RSA the keys are actually pairs of integers.\n\n\nNow we'll see how the public and private keys are made. The process requires several steps. The outline of the key generation process for RSA given on Wikipedia is considerably more concise than what I will do -- if you are able to follow the mathematics, you may want to skim the outline given there.\n\nThe first step is to select two large (mine aren't so large) prime numbers, which we'll call $p$ and $q$. Python together with the SAGE libraries handily does this for us in the next cell."
},
{
"cell_type": "code",
"collapsed": false,
"input": "#sage users do not need the sg prefix\np = sg.random_prime(100,lbound=15)\nq = sg.random_prime(100,lbound=15)\n[p,q] #secret...sh!",
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 39,
"text": "[37, 73]"
}
],
"prompt_number": 39
},
{
"cell_type": "markdown",
"metadata": {},
"source": "The above python code gave us two random prime numbers $p$ and $q$, selected at random in the range 15 to 100. It turns out that $p = 37$ and $q = 73$. Think of these numbers as *ingredients* to the keys we are making. These ingredients are **secret** -- we don't want anyone copying our recipe exactly. These numbers are also the *whole secret* in the sense that the values of $p$ and $q$ are the only secret parts of the encryption/decryption process. They are not the key(s) exactly, but we will see that knowing $p$ and $q$ along with the public key is enough to deduce the private key (_i.e._ to break RSA).\n\n\nThe next step is to combine these ingredients to make a public value called the *modulus*."
},
{
"cell_type": "code",
"collapsed": false,
"input": "n = p*q #making the modulus n\nn #public -- anyone can know this!",
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 40,
"text": "2701"
}
],
"prompt_number": 40
},
{
"cell_type": "markdown",
"metadata": {},
"source": "Though $p$ and $q$ are secret and special to each communicator, the modulus $n=pq$ is public knowledge. It may seem strange that a number would be public knowledge, but its factors secret. Couldn't someone just factor $n$ and recover $p$ and $q$? The answer is *yes* in principle -- a person who was able to do this could read our secret messages. However, at the time of this writing, there is no known method to factor $n$ in a reasonable amount of time when $p$ and $q$ are sufficiently large. Obviously, the $p$ and $q$ in this example are not large enough to offer real security!\n\n\nThe next stage in cooking up the public and private keys is to apply Euler's totient function $\\phi(\\cdot)$ to the modulus $n$. The reason for doing this is somewhat technical--it has to do with a mathematical proposition based on the Chinese Remainder Theorem which ensures that messages are encrypted and decrypted correctly. Mathematically, this is the \"brilliant\" part of RSA. But it's easy to understand what $\\phi(n)$ does to $n$ -- it just returns the number of integers in the set {$1,2,\\ldots,n-1$} which are [relatively prime](http://mathworld.wolfram.com/RelativelyPrime.html) to $n$. "
},
{
"cell_type": "code",
"collapsed": false,
"input": "phi_n = sg.euler_phi(n)\nphi_n #secret!",
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 41,
"text": "2592"
}
],
"prompt_number": 41
},
{
"cell_type": "markdown",
"metadata": {},
"source": "Note that this means that in the set {1,2,...,2700}, there are 2592 elements $a$ such that $gcd(a,2701) = 1$.\n\nThe value $\\verb=phi_n=$, or $\\phi(n)$, is a *secret* ingredient in the key generation process. Knowing $\\phi(n)$ is actually \"equivalent\" to knowing the secret values $p$ and $q$ because of the relationship \n\n$$\\phi(pq) = (p-1)(q-1)$$ \n\nwhich holds whenever $p$ and $q$ are prime. To flesh things out a little bit more, we are saying that if we know\n\n$$2592 = (p-1)(q-1)$$\n\nand also that\n\n$$2701 = pq$$\n\nthen these two equations in two unknowns can easily be solved to find $p$ and $q$. \n\nIn particular, we do $q = \\frac{2701}{p}$ and then make this substitution in the first equation to yield\n\n$$2592 = (p-1)(\\frac{2701}{p}-1)$$\n\nThis equation, in turn, can be solved using the quadratic formula to give $p = 37$, from which point the value of $q$ is immediate via $q = \\frac{2701}{p}$ .\n\n\nConversely, if we know $p$ and $q$ we can easily find $\\phi(n)$. This is the sense in which knowing $\\phi(n)$ is \"equivalent\" to knowing $p$ and $q$.\n\nKeys\n----\n\n\nWe have now arrived at a stage at which we can say what keys in RSA \"really are\". The **public key** in RSA is a pair of integers $(n,e)$ where $n$ is the modulus and $e$ is a value called the *public exponent*. There is some free choice in selecting $e$. For the math to work, we need only that $1 < e < \\phi(n)$ and that $e$ and $\\phi(n)$ are relatively prime.\n\nFor those who know a little group theory, the point of this condition on $e$ is to ensure that $e$ is a member of the multiplicative group of integers mod $\\phi(n)$, and in particular that $e^{-1}$ exists. \n\nIn plain language, the condition guarantees that there is another number $d$ with $1 < d < \\phi(n)$ such that $e\\cdot d \\equiv 1$ mod $\\phi(n)$. In fact, the resulting $d$ not only exists, but is unique.\n\nThe number pair $(n,d)$ constitutes the **private key** in RSA. Note that the roles played by $e$ and $d$ are completely symmetric. The public key could be swapped for the private key, and all the constraints would still hold.\n\nThe following bit of code selects a public exponent $e$ through a random process. In a real implementation, $e$ would be selected slightly more carefully."
},
{
"cell_type": "code",
"collapsed": false,
"input": "my_public_exponent = sg.randint(10,phi_n) \nwhile sg.gcd(my_public_exponent,phi_n) != 1:\n my_public_exponent = sg.randint(10,phi_n-1)\nmy_public_exponent #public (duh)",
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 49,
"text": "913"
}
],
"prompt_number": 49
},
{
"cell_type": "markdown",
"metadata": {},
"source": "Now that we have $e=913$, we can quickly compute $d$. The code masks the underlying process of computing $d$, which is quite efficient -- it amounts to using the extended Euclidean algorithm on $e$ and $\\phi(n)$ to find the Bezout coefficient of $e$."
},
{
"cell_type": "code",
"collapsed": false,
"input": "my_private_exponent = sg.inverse_mod(my_public_exponent,phi_n)\nmy_private_exponent #secret!",
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 50,
"text": "2257"
}
],
"prompt_number": 50
},
{
"cell_type": "markdown",
"metadata": {},
"source": "Thus we have $d = 2257$. If you like you can check that $913\\cdot2257 \\equiv 1$ mod $2592$.\n\nThe key generation phase of RSA is now (finally!) over. We have arrived at a public key of $(2701, 913)$ and a private key of $(2701,2257)$. Now all we need is a message to encrypt.\n\nRSA only works on numerical messages. Any message someone wishes to send you must first be coded as an integer between 0 and $n-1$, in some publicly known way. If the message is too big to allow for that then it needs to be broken down into suitably small chunks. Below we select a message at random."
},
{
"cell_type": "code",
"collapsed": false,
"input": "message = sg.randint(0,n-1) #the message can be anything\nmessage #secret",
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 51,
"text": "1965"
}
],
"prompt_number": 51
},
{
"cell_type": "markdown",
"metadata": {},
"source": "##Encryption\n\nNow that we have the message $M=1965$, we want to encrypt it. The formula for computing the ciphertext $C$ from the message $M$ is\n\n$C = M^e$ mod $n$."
},
{
"cell_type": "code",
"collapsed": false,
"input": "ciphertext = sg.power_mod(message,my_public_exponent,n) \nciphertext #public",
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 52,
"text": "1631"
}
],
"prompt_number": 52
},
{
"cell_type": "markdown",
"metadata": {},
"source": "Thus the message $M = 1965$ is encrypted as $C = 1965^{913}$ mod $2701 = 1631$. \n\nOf course, you want to use a special procedure to compute the modular exponent -- it is infeasible to expand $1965^{913}$ precisely. This is why we use the SAGE $\\verb=power_mod=$ function rather than direct exponentiation.\n\n##Decryption\n\nNow suppose we want to get from $C$ back to $M$. The formula for this is \n\n$M = C^d$ mod $n$"
},
{
"cell_type": "code",
"collapsed": false,
"input": "decoded_ciphertext = sg.power_mod(ciphertext,my_private_exponent,n)\ndecoded_ciphertext",
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 53,
"text": "1965"
}
],
"prompt_number": 53
},
{
"cell_type": "markdown",
"metadata": {},
"source": "Success! Note that the original message has popped out. All that's left is to figure out _why_ it worked. The reason is as follows.\n\nNotice that when we computed\n\n$C^d$ mod $n$\n\nwe equivalently did\n\n${M^e}^d$ mod $n$\n\nby definition of $C$.\n\nWe now have to explain why\n\n$M \\equiv {M^e}^d$ mod $n$\n\nUnfortunately a full account requires a discussion of the Chinese Remainder Theorem, which is beyond the scope of a short example. For those wishing to go further, [here](http://amca01.wordpress.com/2008/04/04/the-rsa-theorem/) is a place to start. The road to complete enlightenment is not long -- you should be able to prove the correctness of RSA with a couple of hours of study."
}
],
"metadata": {}
}
]
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment