@Semnodime
Forked from averagesecurityguy/pdf_flatedecode.py
Last active March 31, 2024 20:07
Decompress FlateDecode Objects in PDF
#!/usr/bin/env python3
import re
import sys
import zlib


def main(filename: str):
    """Find each FlateDecode stream in the given PDF document using a regular expression, decompress it with zlib, and print the decompressed data."""
    with open(filename, 'rb') as pdf_file:
        # Capture the raw bytes between 'stream' and 'endstream' that follow a FlateDecode filter entry.
        pdf_stream = re.compile(rb'.*?FlateDecode.*?stream(.*?)endstream', re.S)
        for s in pdf_stream.findall(pdf_file.read()):
            # Drop the end-of-line bytes surrounding the stream data.
            s = s.strip(b'\r\n')
            try:
                print(zlib.decompress(s))
                print('-' * 64)
            except zlib.error:
                # Only zlib failures are reported; other exceptions (e.g. KeyboardInterrupt) propagate.
                print('Error: Could not decompress pdf stream using zlib:', s, file=sys.stderr)


if __name__ == '__main__':
    if len(sys.argv) != 2:
        print('Provide the filename of the pdf as commandline argument.', file=sys.stderr)
        sys.exit(1)
    main(filename=sys.argv[1])
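
The script prints its results directly, which makes the extraction step awkward to reuse or test. Below is a minimal sketch of the same regex-based approach as a function that returns the decompressed payloads. The names `STREAM_RE` and `inflate_streams` are hypothetical and not part of the gist. The sketch also swaps `zlib.decompress` for `zlib.decompressobj()`, on the assumption that tolerating trailing end-of-line bytes is preferable to stripping them: stripping `\r`/`\n` from the end of the data may clip a compressed stream whose last checksum byte happens to be one of those values.

```python
import re
import zlib
from typing import List

# Hypothetical pattern: same idea as the gist's regex, but it also consumes the
# end-of-line marker that follows the 'stream' keyword.
STREAM_RE = re.compile(rb'FlateDecode.*?stream\r?\n(.*?)endstream', re.S)


def inflate_streams(pdf_bytes: bytes) -> List[bytes]:
    """Return the decompressed payload of every FlateDecode stream zlib can decode."""
    results = []
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # A decompress object stops at the end of the zlib stream and puts any
            # bytes after it (here: the EOL before 'endstream') into unused_data.
            results.append(zlib.decompressobj().decompress(raw))
        except zlib.error:
            # Not valid zlib data; skip it instead of aborting the whole scan.
            continue
    return results


if __name__ == '__main__':
    # Tiny synthetic example rather than a real PDF file.
    payload = b'BT /F1 12 Tf (Hello) Tj ET'
    sample = (b'1 0 obj\n<< /Filter /FlateDecode >>\nstream\n'
              + zlib.compress(payload) + b'\nendstream\nendobj\n')
    print(inflate_streams(sample))
```

Like the original script, this is a heuristic scan of the raw bytes, not a PDF parser: it does not follow `/Length`, and it does not handle streams that are encrypted or that chain additional filters.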