@rhydomako
Created August 2, 2017 21:22
Project Euler problem 59
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Project Euler -- Problem 59\n",
"\n",
"__Problem text__\n",
"\n",
"Each character on a computer is assigned a unique code and the preferred standard is ASCII (American Standard Code for Information Interchange). For example, uppercase A = 65, asterisk (*) = 42, and lowercase k = 107.\n",
"\n",
"A modern encryption method is to take a text file, convert the bytes to ASCII, then XOR each byte with a given value, taken from a secret key. The advantage with the XOR function is that using the same encryption key on the cipher text, restores the plain text; for example, 65 XOR 42 = 107, then 107 XOR 42 = 65.\n",
"\n",
"For unbreakable encryption, the key is the same length as the plain text message, and the key is made up of random bytes. The user would keep the encrypted message and the encryption key in different locations, and without both \"halves\", it is impossible to decrypt the message.\n",
"\n",
"Unfortunately, this method is impractical for most users, so the modified method is to use a password as a key. If the password is shorter than the message, which is likely, the key is repeated cyclically throughout the message. The balance for this method is using a sufficiently long password key for security, but short enough to be memorable.\n",
"\n",
"Your task has been made easy, as the encryption key consists of three lower case characters. Using cipher.txt (right click and 'Save Link/Target As...'), a file containing the encrypted ASCII codes, and the knowledge that the plain text must contain common English words, decrypt the message and find the sum of the ASCII values in the original text."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Approach\n",
"\n",
"* First, develop the infrastructure to be able to apply the key to the cipher text and examine the resulting text\n",
"* Next, compute the character distribution for the deciphered text and compare that distribution to a reference distribution (some other English text)\n",
"* Find a key that minimizes the distance between the deciphered character distribution and the reference distribution by cycling through all possible key combinations, or by some other clever minimization of the distribution distance"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import numpy as np\n",
"from collections import Counter\n",
"import string"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__Download cipher text__"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"--2017-08-02 11:20:15-- https://projecteuler.net/project/resources/p059_cipher.txt\n",
"Resolving projecteuler.net... 185.119.173.194\n",
"Connecting to projecteuler.net|185.119.173.194|:443... connected.\n",
"HTTP request sent, awaiting response... 200 OK\n",
"Length: 3203 (3.1K) [text/plain]\n",
"Saving to: 'p059_cipher.txt.1'\n",
"\n",
"p059_cipher.txt.1 100%[===================>] 3.13K --.-KB/s in 0s \n",
"\n",
"2017-08-02 11:20:16 (54.5 MB/s) - 'p059_cipher.txt.1' saved [3203/3203]\n",
"\n"
]
}
],
"source": [
"!wget https://projecteuler.net/project/resources/p059_cipher.txt"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"cipher_text = open('p059_cipher.txt').read().strip()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"'79,59,12,2,79,35,8,28,20,2,3,68,8,9,68,45,0,12,9,67,68,4,7,5,23,27,1,21,79,85,78,79,85,71,38,10,71,27,12,2,79,6,2,8,13,9,1,13,9,8,68,19,7,1,71,56,11,21,11,68,6,3,22,2,14,0,30,79,1,31,6,23,19,10,0,73,79,44,2,79,19,6,28,68,16,6,16,15,79,35,8,11,72,71,14,10,3,79,12,2,79,19,6,28,68,32,0,0,73,79,86,71,39,1,71,24,5,20,79,13,9,79,16,15,10,68,5,10,3,14,1,10,14,1,3,71,24,13,19,7,68,32,0,0,73,79,87,71,39,1,71,12,22,2,14,16,2,11,68,2,25,1,21,22,16,15,6,10,0,79,16,15,10,22,2,79,13,20,65,68,41,0,16,15,6,10,0,79,1,31,6,23,19,28,68,19,7,5,19,79,12,2,79,0,14,11,10,64,27,68,10,14,15,2,65,68,83,79,40,14,9,1,71,6,16,20,10,8,1,79,19,6,28,68,14,1,68,15,6,9,75,79,5,9,11,68,19,7,13,20,79,8,14,9,1,71,8,13,17,10,23,71,3,13,0,7,16,71,27,11,71,10,18,2,29,29,8,1,1,73,79,81,71,59,12,2,79,8,14,8,12,19,79,23,15,6,10,2,28,68,19,7,22,8,26,3,15,79,16,15,10,68,3,14,22,12,1,1,20,28,72,71,14,10,3,79,16,15,10,68,3,14,22,12,1,1,20,28,68,4,14,10,71,1,1,17,10,22,71,10,28,19,6,10,0,26,13,20,7,68,14,27,74,71,89,68,32,0,0,71,28,1,9,27,68,45,0,12,9,79,16,15,10,68,37,14,20,19,6,23,19,79,83,71,27,11,71,27,1,11,3,68,2,25,1,21,22,11,9,10,68,6,13,11,18,27,68,19,7,1,71,3,13,0,7,16,71,28,11,71,27,12,6,27,68,2,25,1,21,22,11,9,10,68,10,6,3,15,27,68,5,10,8,14,10,18,2,79,6,2,12,5,18,28,1,71,0,2,71,7,13,20,79,16,2,28,16,14,2,11,9,22,74,71,87,68,45,0,12,9,79,12,14,2,23,2,3,2,71,24,5,20,79,10,8,27,68,19,7,1,71,3,13,0,7,16,92,79,12,2,79,19,6,28,68,8,1,8,30,79,5,71,24,13,19,1,1,20,28,68,19,0,68,19,7,1,71,3,13,0,7,16,73,79,93,71,59,12,2,79,11,9,10,68,16,7,11,71,6,23,71,27,12,2,79,16,21,26,1,71,3,13,0,7,16,75,79,19,15,0,68,0,6,18,2,28,68,11,6,3,15,27,68,19,0,68,2,25,1,21,22,11,9,10,72,71,24,5,20,79,3,8,6,10,0,79,16,8,79,7,8,2,1,71,6,10,19,0,68,19,7,1,71,24,11,21,3,0,73,79,85,87,79,38,18,27,68,6,3,16,15,0,17,0,7,68,19,7,1,71,24,11,21,3,0,71,24,5,20,79,9,6,11,1,71,27,12,21,0,17,0,7,68,15,6,9,75,79,16,15,10,68,16,0,22,11,11,68,3,6,0,9,72,16,71,29,1,4,0,3,9,6,30,2,79,12,14,2,68,16,7,1,9,79,12,2,79,7,6,2,1,73,79,85,86,79,33,17,10,
10,71,6,10,71,7,13,20,79,11,16,1,68,11,14,10,3,79,5,9,11,68,6,2,11,9,8,68,15,6,23,71,0,19,9,79,20,2,0,20,11,10,72,71,7,1,71,24,5,20,79,10,8,27,68,6,12,7,2,31,16,2,11,74,71,94,86,71,45,17,19,79,16,8,79,5,11,3,68,16,7,11,71,13,1,11,6,1,17,10,0,71,7,13,10,79,5,9,11,68,6,12,7,2,31,16,2,11,68,15,6,9,75,79,12,2,79,3,6,25,1,71,27,12,2,79,22,14,8,12,19,79,16,8,79,6,2,12,11,10,10,68,4,7,13,11,11,22,2,1,68,8,9,68,32,0,0,73,79,85,84,79,48,15,10,29,71,14,22,2,79,22,2,13,11,21,1,69,71,59,12,14,28,68,14,28,68,9,0,16,71,14,68,23,7,29,20,6,7,6,3,68,5,6,22,19,7,68,21,10,23,18,3,16,14,1,3,71,9,22,8,2,68,15,26,9,6,1,68,23,14,23,20,6,11,9,79,11,21,79,20,11,14,10,75,79,16,15,6,23,71,29,1,5,6,22,19,7,68,4,0,9,2,28,68,1,29,11,10,79,35,8,11,74,86,91,68,52,0,68,19,7,1,71,56,11,21,11,68,5,10,7,6,2,1,71,7,17,10,14,10,71,14,10,3,79,8,14,25,1,3,79,12,2,29,1,71,0,10,71,10,5,21,27,12,71,14,9,8,1,3,71,26,23,73,79,44,2,79,19,6,28,68,1,26,8,11,79,11,1,79,17,9,9,5,14,3,13,9,8,68,11,0,18,2,79,5,9,11,68,1,14,13,19,7,2,18,3,10,2,28,23,73,79,37,9,11,68,16,10,68,15,14,18,2,79,23,2,10,10,71,7,13,20,79,3,11,0,22,30,67,68,19,7,1,71,8,8,8,29,29,71,0,2,71,27,12,2,79,11,9,3,29,71,60,11,9,79,11,1,79,16,15,10,68,33,14,16,15,10,22,73'"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cipher_text"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"cipher = [int(x) for x in cipher_text.split(',')]"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"[79, 59, 12, 2, 79, 35, 8, 28, 20, 2]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cipher[:10]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__Use a reference text to get a rough distribution of character frequencies__"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"--2017-08-02 11:20:17-- https://www.gutenberg.org/files/1342/1342-0.txt\n",
"Resolving www.gutenberg.org... 152.19.134.47, 2610:28:3090:3000::bad:cafe:47\n",
"Connecting to www.gutenberg.org|152.19.134.47|:443... connected.\n",
"HTTP request sent, awaiting response... 200 OK\n",
"Length: 726223 (709K) [text/plain]\n",
"Saving to: '1342-0.txt.1'\n",
"\n",
"1342-0.txt.1 100%[===================>] 709.20K 369KB/s in 1.9s \n",
"\n",
"2017-08-02 11:20:19 (369 KB/s) - '1342-0.txt.1' saved [726223/726223]\n",
"\n"
]
}
],
"source": [
"!wget https://www.gutenberg.org/files/1342/1342-0.txt"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"reference_text = open('1342-0.txt').read().strip()"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"ref_freq = Counter([c for c in reference_text.lower() if c in string.lowercase])\n",
"ref_freq_normalized = { k:(float(v)/sum(ref_freq.values())) for k,v in ref_freq.items() }"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"{'a': 0.07738415667509493,\n",
" 'b': 0.01697071857752644,\n",
" 'c': 0.025491422201680214,\n",
" 'd': 0.041405435777530065,\n",
" 'e': 0.12905394995604616,\n",
" 'f': 0.022446371768032408,\n",
" 'g': 0.018930063529177203,\n",
" 'h': 0.06268272568264412,\n",
" 'i': 0.0705926066900483,\n",
" 'j': 0.001758154119427603,\n",
" 'k': 0.006057475326935103,\n",
" 'l': 0.04000253754202804,\n",
" 'm': 0.027412704022910378,\n",
" 'n': 0.07018841249558196,\n",
" 'o': 0.07500249222877754,\n",
" 'p': 0.01573819816390709,\n",
" 'q': 0.0011563941527781554,\n",
" 'r': 0.06066537977035245,\n",
" 's': 0.06139039177836383,\n",
" 't': 0.0872932582945905,\n",
" 'u': 0.028110528080621335,\n",
" 'v': 0.010585175316966186,\n",
" 'w': 0.022788939941817788,\n",
" 'x': 0.001571463527364672,\n",
" 'y': 0.023620891221010847,\n",
" 'z': 0.0017001531587866924}"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ref_freq_normalized"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__Functions to XOR the key with the ciphertext and compute the distance to the reference distribution__"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"def apply_key(key, cipher):\n",
" \"\"\" Apply the key to the ciphertext \"\"\"\n",
" extended_key = np.tile(key, len(cipher)).astype(int)\n",
" return [chr( cipher[i] ^ extended_key[i] ) for i in range(len(cipher))]\n",
"\n",
"def compare(key, cipher, ref_freq):\n",
" \"\"\" Return a measure of how close the character distribution is to a reference distribution \"\"\"\n",
" \n",
" deciphered = apply_key(key, cipher)\n",
"\n",
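" # calculate the character frequency distribution\n",
" # NOTE: letters only (case-folded) -- everything else is dropped from the counts,\n",
" # but the frequencies below are still normalized by the total number of characters\n",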
" freq = Counter([c.lower() for c in deciphered if c.lower() in string.lowercase])\n",
" freq_normalized = { k:(float(v)/len(deciphered)) for k,v in freq.items() }\n",
"\n",
" # here, I'm using a chi-squared statistic as a measure of distance between the \n",
" # deciphered character frequencies and the reference distribution\n",
" chi = 0.\n",
" for k,v in ref_freq.items():\n",
" chi += (freq_normalized.get(k,0) - v)**2 / v\n",
" \n",
" return chi"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Brute force\n",
"\n",
"* Enumerate all 26^3 = 17,576 possible keys (the problem text constrains the key space to three lowercase characters)\n",
"* Guaranteed to find the global minimum of our metric\n",
"* But it performs a lot of excess computation (about 26 seconds here)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 25.3 s, sys: 282 ms, total: 25.6 s\n",
"Wall time: 26.4 s\n"
]
}
],
"source": [
"%%time\n",
"metric = []\n",
"keys = []\n",
"\n",
"for a in range(97, 123):\n",
" for b in range(97, 123):\n",
" for c in range(97, 123):\n",
" key = [a,b,c]\n",
" metric.append( compare(key, cipher, ref_freq_normalized) )\n",
" keys.append( key )"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"[103, 111, 100]"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"keys[np.argmin(metric)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Line-scanning\n",
"\n",
"* Minimize the individual key characters one at a time -- that is, scan across one of the key characters while keeping the other two constant\n",
"* Only 3*26 = 78 evaluations of the metric\n",
"* Not guaranteed to find the global minimum in general, although here it finds the same key as the brute-force search"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"key: (103, 111, 100)\n",
"CPU times: user 126 ms, sys: 6.28 ms, total: 132 ms\n",
"Wall time: 137 ms\n"
]
}
],
"source": [
"%%time\n",
"first_metric = []\n",
"first_keys = []\n",
"for a in range(97, 123):\n",
" key = [a,0,0]\n",
" first_metric.append( compare(key, cipher, ref_freq_normalized) )\n",
" first_keys.append( a )\n",
" \n",
"A = first_keys[np.argmin(first_metric)]\n",
"\n",
"second_metric = []\n",
"second_keys = []\n",
"for b in range(97, 123):\n",
" key = [A,b,0]\n",
" second_metric.append( compare(key, cipher, ref_freq_normalized) )\n",
" second_keys.append( b )\n",
"\n",
"B = second_keys[np.argmin(second_metric)]\n",
"\n",
"third_metric = []\n",
"third_keys = []\n",
"for c in range(97, 123):\n",
" key = [A,B,c]\n",
" third_metric.append( compare(key, cipher, ref_freq_normalized) )\n",
" third_keys.append( c )\n",
"\n",
"C = third_keys[np.argmin(third_metric)]\n",
"\n",
"print \"key:\", (A,B,C)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Take a look at the key we found"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"('g', 'o', 'd')"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chr(103), chr(111), chr(100)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"deciphered = apply_key([103, 111, 100], cipher)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"\"(The Gospel of John, chapter 1) 1 In the beginning the Word already existed. He was with God, and he was God. 2 He was in the beginning with God. 3 He created everything there is. Nothing exists that he didn't make. 4 Life itself was in him, and this life gives light to everyone. 5 The light shines through the darkness, and the darkness can never extinguish it. 6 God sent John the Baptist 7 to tell everyone about the light so that everyone might believe because of his testimony. 8 John himself was not the light; he was only a witness to the light. 9 The one who is the true light, who gives light to everyone, was going to come into the world. 10 But although the world was made through him, the world didn't recognize him when he came. 11 Even in his own land and among his own people, he was not accepted. 12 But to all who believed him and accepted him, he gave the right to become children of God. 13 They are reborn! This is not a physical birth resulting from human passion or plan, this rebirth comes from God.14 So the Word became human and lived here on earth among us. He was full of unfailing love and faithfulness. And we have seen his glory, the glory of the only Son of the Father.\""
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"\"\".join(deciphered)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Finally, sum the ASCII values to find the solution to the original problem"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"107359"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sum([ord(c) for c in deciphered])"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python [default]",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
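The decrypt-and-score idea in the notebook above can also be sketched in Python 3. This is a minimal sketch, not the author's code: instead of the notebook's line scan, it scores each key position only on the bytes that position touches (every third byte), which is a swapped-in variant of the same idea. The names `xor_with_key`, `chi_squared`, `recover_key`, and `REF_FREQ` are hypothetical, and the frequency table is rounded from the `ref_freq_normalized` output shown in the notebook.

```python
from collections import Counter
from itertools import cycle

# Letter frequencies rounded from the notebook's reference table
# (hypothetical constant name; values come from the ref_freq_normalized output).
REF_FREQ = {
    'a': 0.0774, 'b': 0.0170, 'c': 0.0255, 'd': 0.0414, 'e': 0.1291,
    'f': 0.0224, 'g': 0.0189, 'h': 0.0627, 'i': 0.0706, 'j': 0.0018,
    'k': 0.0061, 'l': 0.0400, 'm': 0.0274, 'n': 0.0702, 'o': 0.0750,
    'p': 0.0157, 'q': 0.0012, 'r': 0.0607, 's': 0.0614, 't': 0.0873,
    'u': 0.0281, 'v': 0.0106, 'w': 0.0228, 'x': 0.0016, 'y': 0.0236,
    'z': 0.0017,
}


def xor_with_key(key, data):
    """XOR each byte value in `data` with the key, repeated cyclically."""
    return [d ^ k for d, k in zip(data, cycle(key))]


def chi_squared(codes, ref_freq=REF_FREQ):
    """Chi-squared style distance between the letter frequencies of the
    decoded byte values and the reference distribution.

    As in the notebook, frequencies are normalized by the total number of
    characters, so non-letter output is penalized.
    """
    if not codes:
        return float('inf')
    counts = Counter(chr(c).lower() for c in codes)
    total = len(codes)
    return sum((counts.get(ch, 0) / total - f) ** 2 / f
               for ch, f in ref_freq.items())


def recover_key(cipher, key_len=3):
    """For each key position, pick the lowercase letter whose decryption of
    that position's bytes is closest to the reference distribution."""
    key = []
    for pos in range(key_len):
        column = cipher[pos::key_len]  # bytes encrypted with key[pos]
        best = min(range(ord('a'), ord('z') + 1),
                   key=lambda k: chi_squared([c ^ k for c in column]))
        key.append(best)
    return key


# Round trip on a short message: XOR with the same key restores the input.
secret = [ord(ch) for ch in "god"]
message = [ord(ch) for ch in "In the beginning the Word already existed."]
encrypted = xor_with_key(secret, message)
assert xor_with_key(secret, encrypted) == message
```

With the `cipher` list from the notebook, `recover_key(cipher)` returns three integer character codes and `sum(xor_with_key(key, cipher))` gives the sum of the decrypted values. On very short messages the per-position statistics are noisy, so the recovered key is best checked by reading the decrypted text.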
@wtberry

wtberry commented Aug 3, 2017

Hi, I met you at the pyHawaii meetup earlier tonight, and would you mind if I fork this page?

@rhydomako
Author

@wtberry go ahead!
