Skip to content

Instantly share code, notes, and snippets.

@tsaylor
Created May 2, 2011 14:59
Show Gist options
  • Save tsaylor/951727 to your computer and use it in GitHub Desktop.
Save tsaylor/951727 to your computer and use it in GitHub Desktop.
An experiment in statistical decryption of simple letter replacement ciphers. Calculates the frequency of each letter and then replaces each according to the frequency of letters in the english language.
import string
cipher = '''
Vs lbh'er yvxr zr, lbh znl erpnyy Gehzcrg Jvafbpx, gung avsgl yvggyr ovg bs fbsgjner gung cebivqrq na vagresnpr sebz jvaqbjf gb gur GPC/VC cebgbpby fgnpx gung crbcyr nyy bire gur jbeyq hfrq gb pbaarpg gb gur vagrearg sbe gur svefg gvzr.
Fb, vg gheaf bhg, gur thl jub znqr guvf qvqa'g znxr penc bss uvf jbex. Gehzcrg Jvafbpx jnf fgbyra naq npgviryl tvira njnl ba vafgnyy qvfxf sebz nyy gur znwbe grpu zntnmvarf naq va pbecbengr vafgnyyf. Ur jnf whfg n fznyy gvzr thl naq uvf pbzcnal unq ab jnl gb svtug gur enzcnag gursg.
Uvf anzr jnf Crgre Gnggnz. Fbzrbar ba erqqvg gubhtug vg'q or n tbbq vqrn gb fgneg n qbangvba cbg gb frr vs sbyxf jbhyq cbal hc sbe gur fbsgjner gung znqr vg cbffvoyr sbe gurz gb trg gb gur ovt jvqr jro. V qvq. :)
Fb, tb ba ol vs lbh hfrq vg, gbff n puvc be gjb va. Vg'f n avpr guvat gb qb.'''
real_plaintext = '''
If you're like me, you may recall Trumpet Winsock, that nifty little bit of software that provided an interface from windows to the TCP/IP protocol stack that people all over the world used to connect to the internet for the first time.
So, it turns out, the guy who made this didn't make crap off his work. Trumpet Winsock was stolen and actively given away on install disks from all the major tech magazines and in corporate installs. He was just a small time guy and his company had no way to fight the rampant theft.
His name was Peter Tattam. Someone on reddit thought it'd be a good idea to start a donation pot to see if folks would pony up for the software that made it possible for them to get to the big wide web. I did. :)
So, go on by if you used it, toss a chip or two in. It's a nice thing to do.
'''
print "Contractions"
contractions = set()
for word in cipher.split(' '):
if "'" in word:
contractions.add(word)
print contractions
print "\nSingle Letters"
singles = set()
for word in cipher.split(' '):
if len(word) == 1:
singles.add(word)
print singles
print "\nLetter Frequency"
cipher = cipher.lower()
charlist = list(cipher)
chardict = {}
for i in charlist:
if i in list('abcdefghijklmnopqrstuvwxyz'):
chardict[i] = chardict.get(i, 0) + 1
charlist = zip(chardict.values(), chardict.keys())
charlist = sorted(charlist, key=lambda x: x[0])
charlist.reverse()
for j,k in charlist:
print j, k
print "\nStatistical Frequency Plaintext"
intab = ''.join([x[1] for x in charlist])
outtab = 'etaoinshrdlcumwfgypbvkjxqz'[:len(intab)]
plain = string.translate(cipher, string.maketrans(intab, outtab))
print plain
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment