Last active
October 2, 2023 06:31
-
-
Save junaidk/5120470 to your computer and use it in GitHub Desktop.
Code to decipher Vigenere Cipher Using frequency analysis
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
First find the strings of characters that are repeated in the ciphertext.The distances between consecutive occurrences of the strings are likely to be multiples of the length of the keyword. Then factors each of these numbers. If any number is repeated in the majority of these factorings, it is likely to be the length of the keyword. | |
After finding keyword length ... | |
We find k0 (first letter of keyword) first. Each of the characters y0, yL, y2L, | |
y3L, y4L, ..., y(q–1)L (q ≈ n / L)have been encrypted by adding k0 . Essentially, if we restrict to these characters, we have a simple shift cipher with shift k0.) We count the frequency of each letter (A, B, ..., Z) in these q characters of the ciphertext, and divide these frequencies by q to obtain probabilities. | |
Let W = (w0, w1, ..., w25), where w0, w1, ..., w25 are the probabilities of A, B, ..., Z in | |
positions 0, L, 2L, ..., (q−1)L of the ciphertext. | |
w0, w1, ..., w25 are the probabilities of A− k0, B− k0, ..., Z− k0 in | |
the plaintext (same positions). | |
So W ≈ A(k0) assuming these positions of our ciphertext | |
resemble a typical English language text. | |
We can use the size of W.A(k0) as a measure of how close W | |
is to A(k0). We compute W.A(i) for i = 0, 1, ...,25. If one value of i | |
makes W.A(i) significantly larger than any other, that value of | |
i is our best guess for k0. We can attempt to find k1, k2,..., kL−1 | |
in a similar manner. For example, to find k1, we would look at | |
ciphertext characters y1, y1+L, y1+2L, y1+3L, y1+4L, ... |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
alphabets = 'abcdefghijklmnopqrstuvwxyz' | |
#sample strings | |
string1 = 'yysptuglntomaztiqblczgngvphueqdbatrcqnbbulrlzganytiybribohezniwwnvptkeevmkzqzrguszijcavtosalphhrndtrjujsbjyotldcggtecifxkohipgvgxzxprsofumfyusahmwhvrtiyuflrksiwtrtavzagpnyvhdgqurztegbmxkefgicvqeyljepenuqrdbipmlhirboabsqsaxdlrhnfjrkzmampqdtryqgnflrehhbiajmekmvyrsmienewzhurnmpsqwukkuiargtprhdgywryimgaaluernmgzqvfulxzerpuvmtxixyczeynaggzhshyxkeacurkxwvxvrhxnoqagfmbgpjruafbugfwhvoefozdqagfmbgmsumrfbukfiefprdtvqvblflrflwvatgeuooljtzxkrrgbvsqwexwidtrfqgcmwpxyxdiangfuyiniwidlgqmyyuivgallsjcgvnmhthaqblfcvqonygpoewacrmnroshekmwbrvprxqarklsqllhwhxxigmlvviavprcavqpomfhctmpkpiqxswlenplvzqrqxkmqefwzrrkxuxdsudzwaghqgbflmwrvglnrxsgalviohttrzfiellrgiaiqaknygvvyodskvqtarrmvjltgjmpgeifhuszijcavtbsfllwvibpwsiaeawteqdvyifozechzmwibpbbgfxnvrxkeskzfzyiflhkhoaemzudiqbcmgiaiqgozxbpvvgsnplcafxvgnhrtfhwekmgulfqeoyypvitanlzxllywvxtaaalvxuenvmqofabkrigohvqaztmfyhwkibpurxqifeuiqojvprluvfmsiwtrtkntaryriidwukkuoeezhzxxsrhcyjuwphciuyfkvpkuxbvjyusaqnrcqvgahrwhegmgoyifbuxkifupbxfwrgaiqcrcvqztiubzeosbcxcgdiamprwhrumpuzhjhyhqojkbokoszxzephrtmnkepnglsufvntvtsmamoirbikwhyheptuglefkvgnqrnflephrtmnhqwytuibiuclfuyearsiwtrtaauixutamfohnlcxagrxkaltuewayuhrkhfoepqvsopiavlxrtugariarqflwvatgeuooljhyohdbwbvtflvlmevhvqvnkxvvxzlhrrkkbaxhbgscpaxgarteiorwywtvpoggzhtyvvwhroqfyurtelxweeuiajeycivwlntvpnzflrghqhwnubugfsslvqhhbwarudmaghxzhvepgnqaebaiuwnuaggkmazprvprebbxyeemprdnqkpnjxmfmlrhdjkbuztihmtsvtvpbrxqwgmvxkeswtygzhpeleuapewhtfssavapystqrtplnwwvrdhemqxqwheawzhvepugpprwasvopqucrqxrtjspmnplbbqvbnyhlfskkhrfmrldldtqkllughbmoiqsvtifqqhgalmqscgkgud' | |
# string1 = 'LVFTWOUPTBJDXBYHTZFSIQQAHGNEDBBESBJSSODTWXFNJOWYPBIIHHMEGSFNNFJQJWWESFJASWSGUCWMDBIANHMEQCTKXGKUCRFMTBYAAGTFUOGYAOWSDBBIARWIVVYOGWXTWSHOJFXEEOHKDIYDDSXTWSJDXHNOCAFTISWTWSHOJFXEDIYLXBJSPMXTWPZTXBYHTAFRZSYOCZDTWSYHXGFVPWQAQZJ' | |
# string1 = 'IFNTTEXHOUEQULGHQPIKDJLNEQVEAREWNUEFPYLTTNIFEHRLMLQIEXPWBLAAKPRQGMZWNYENPGAGRSZEYUDNKRUENSWCSZFHZBNMQRBSVZOSRYOYXEZKYUWVXBEISLBGBPSGTCPOGVAWZHCXASGDAIALRLEQURVOZQILDLRGTCPOGVAWYBTYULRSMCALRIUGULGHQQIJULTBZJLTTIBZSZAIMGYANPM' | |
string1 = string1.downcase | |
# real alphabet frequecy in normal english | |
b = [0.08167,0.01492,0.02782,0.04253,0.12702,0.02228,0.02015,0.06094,0.06966,0.00153,0.00772,0.04025,0.02406,0.06749,0.07507,0.01929,0.00095,0.05987,0.06327,0.09056,0.02758,0.00978,0.02360,0.00150,0.01974,0.00074] | |
# gives a number 0 to 25 to each alphabet | |
def Encode (text) | |
i=0, j=0, uc=0, k = 0; | |
uc = text.ord | |
if uc >= 97 && uc <= 122 | |
k = uc-97 | |
et=k | |
j = j+1 | |
end | |
if uc >= 65 && uc <= 90 | |
k= uc-65 | |
et = k | |
j = j+1 | |
end | |
et | |
end | |
i = 0 | |
counts=Hash.new(0) | |
countsArry = [] | |
# size 3 to 6 | |
for k in 3..6 | |
size = k | |
for i in 0..string1.length | |
toComp = string1[i,size] | |
j = i | |
for j in (i+size)..string1.length | |
if toComp == string1[j,size] | |
dist = j - i # finds distance between two similars | |
counts[toComp] = dist | |
countsArry << dist | |
end | |
end | |
end | |
end | |
key_length = Hash.new(0) | |
# find factors for each distance so that | |
# most common factor can be found | |
countsArry.each do |value| | |
# puts key | |
(2..20).collect do |n| | |
if ((value/n) * n) == value | |
key_length[n] += 1 | |
end | |
end | |
end | |
key_length.each {|key,value| puts "size #{key} : #{value}" } | |
puts "enter the approximate keyword size : " | |
keyWord_size = gets.to_f | |
# dividing cipher text in buckets based on key word size | |
buckets = Hash.new{|h, k| h[k] = []} | |
i = 0 | |
for i in 0..(keyWord_size-1) | |
x = 0 | |
begin | |
buckets[i] << string1[x+i] | |
x += keyWord_size | |
end until x > string1.length | |
end | |
buckets.each {|key,value| puts "#{key} : #{value.join}\n" } | |
j=0 | |
keyword = [] | |
for buktN in 0..(keyWord_size-1) | |
frequency = Hash.new{|h, k| h[k] = []} | |
frequencyNo = Hash.new(0) | |
for i in 0..(alphabets.length-1) | |
frequencyNo[alphabets[i]] = 0 | |
end | |
# frequency of alphabets in each bucket | |
buckets[buktN].each do |alpha| | |
frequencyNo[alpha] += 1 | |
end | |
realFreq = [] | |
frequencyNo.each do |key,value| | |
realFreq << (value.to_f)/buckets[buktN].length.to_f | |
end | |
temp = 0 | |
sum = 0 | |
result = [] | |
shift = [] | |
for i in 0..25 | |
for j in 0..25 | |
index = (i+j)%26 | |
shift[j] = realFreq[index] | |
end | |
for j in 0..(shift.length-1) | |
temp = shift[j]*b[j] | |
sum = sum + temp | |
end | |
result[i] = sum | |
sum = 0 | |
end | |
keyword << ((result.index(result.max))+97).chr | |
end | |
puts '//////////****/////' | |
puts "keyword is : #{keyword.join}" | |
key = keyword.join | |
keyString = Array.new((string1.length/keyWord_size)+keyWord_size ,key).join | |
# puts keyString | |
deciphered = [] | |
for i in 0..(string1.length-1) | |
temp = Encode(string1[i]) - Encode(keyString[i]) | |
temp2 = temp%26 | |
deciphered[i]= (temp2+97).chr | |
end | |
puts "deciphered text : " | |
puts deciphered.join |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment