@cutaway
Created September 2, 2019 18:54
Conduct frequency analysis on all characters in a binary blob.
#!/usr/bin/env python3
import sys

# Set DEBUG to True to stop early in large files
DEBUG = False
COLUMNS = 4

def main():
    if len(sys.argv) != 2:
        sys.exit('Usage: {} <file>'.format(sys.argv[0]))

    # Preload a dictionary with a zero count for every byte value
    table = {e: 0 for e in range(256)}

    # Grab data from file
    with open(sys.argv[1], 'rb') as f:
        data = f.read()

    # Loop through data and count; iterating over bytes yields ints 0-255
    if DEBUG: b = 0
    for c in data:
        table[c] += 1
        # Debug large files: stop after the first 2000 bytes
        if DEBUG:
            b += 1
            if b == 2000: break

    # Print results, COLUMNS entries per row
    for e in range(0, 256, COLUMNS):
        print()
        # min() keeps the row in range when COLUMNS does not divide 256
        for r in range(min(COLUMNS, 256 - e)):
            # Fixed widths keep the columns aligned: hex value, repr of the character, count
            print('{:<4s} {:>6s} {:>8d}'.format(hex(e + r), repr(chr(e + r)), table[e + r]), end='')
            # Don't put the colon on the last column
            if r < (COLUMNS - 1): print(' : ', end='')
    print()

if __name__ == '__main__':
    main()
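
The script above reads the whole file into memory and relies on DEBUG to cut large inputs short. As a rough alternative sketch, the same byte counts could be gathered with collections.Counter while reading the file in fixed-size chunks. The function name count_bytes and the chunk_size parameter are hypothetical and not part of the original gist.

```python
from collections import Counter

def count_bytes(path, chunk_size=65536):
    """Return a 256-entry list where index i holds the count of byte value i.

    Hypothetical helper, not part of the original gist.
    """
    counts = Counter()
    with open(path, 'rb') as fh:
        # Read fixed-size chunks so a large file is never fully in memory;
        # iter() stops when read() returns the empty bytes object at EOF
        for chunk in iter(lambda: fh.read(chunk_size), b''):
            # Iterating over bytes yields ints in range 0-255
            counts.update(chunk)
    # Counter returns 0 for missing keys, so every byte value gets an entry
    return [counts[value] for value in range(256)]
```

The returned list can be indexed the same way as table in the script, for example counts[0x41] for the number of 'A' bytes, so the column-printing loop would work on it unchanged.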