@letorbi
Created December 14, 2021 22:48
A shell script that tries to remove exploits and malware from PDFs
#!/bin/bash
# References:
# https://security.stackexchange.com/questions/103323/effectiveness-of-flattening-a-pdf-to-remove-malware
# https://superuser.com/a/373740
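# Require an input file so the gs calls below never run with an empty path
[ -n "$1" ] || { echo "Usage: $0 FILE.pdf" >&2; exit 1; }
TEMPFILE=$(mktemp /tmp/pdfsanitize.XXXXXXXXX)
# Strip a trailing .pdf extension (any letter case) and append _sanitized.pdf
OUTFILE="${1%.[Pp][Dd][Ff]}_sanitized.pdf"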
# Rewrite the PDF and store all images uncompressed (no JPEG/JPX pass-through) to drop image metadata (EXIF)
gs -sDEVICE=pdfwrite -dColorConversionStrategy=/LeaveColorUnchanged -dPassThroughJPEGImages=false -dPassThroughJPXImages=false -dEncodeColorImages=false -dEncodeGrayImages=false -dEncodeMonoImages=false -dNOPAUSE -dBATCH -sOutputFile="$TEMPFILE" "$1"
# Re-compress images and downgrade PDF version to destroy (hopefully) all malware and exploits
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dNOPAUSE -dBATCH -sOutputFile="$OUTFILE" "$TEMPFILE"
# Clean up
rm "$TEMPFILE"
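
The script only "hopefully" destroys active content, so a rough check of the result can be useful. The following is a minimal sketch, not part of the original gist: `find_active_markers` is a hypothetical helper that greps a file for PDF name tokens commonly associated with scripts, automatic actions, and attachments. It assumes these names appear as plain text in the file, so it can miss names that are hex-escaped or stored inside compressed object streams, and it can report false positives from binary stream data.

```shell
# Hypothetical helper (an assumption for illustration, not in the original script):
# print each suspicious PDF name token that still occurs in the given file.
find_active_markers() {
    local file=$1 marker
    for marker in /JavaScript /JS /OpenAction /AA /Launch /EmbeddedFile; do
        # -a: treat binary data as text; -q: only report via exit status.
        # The name must be followed by a non-alphanumeric character or end of
        # line, so that e.g. /JS does not match a longer name such as /JSFoo.
        if LC_ALL=C grep -aqE "${marker}([^A-Za-z0-9]|\$)" "$file"; then
            printf '%s\n' "$marker"
        fi
    done
}
```

Calling `find_active_markers "$OUTFILE"` after the second `gs` pass prints nothing when none of the tokens are found. Empty output is a weak signal only and does not prove that the file is safe.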