@jwalton
Created May 24, 2021 17:37
Extract the JPGs from a PDF of JPGs, recompress them, and then convert them back to a PDF.
# Launch a fresh Ubuntu container with the current directory mounted at /mount
docker run --rm -it -v "$(pwd)":/mount ubuntu
# Extract PDF to JPGs
cd /mount
apt-get update
apt-get install poppler-utils
pdfimages -j file.pdf fileprefix
mkdir extract
mv *.jpg extract
# Recompress JPGs and convert to greyscale
apt-get install imagemagick
mkdir shrink && cd shrink
# Loop over a glob and quote every path so filenames with spaces survive.
for file in ../extract/*.jpg; do
  convert "$file" -strip -quality 30 -colorspace Gray "./$(basename "$file")"
done
# Back to PDF
# Note that your Docker VM needs a lot of RAM for this step if you have a lot of images.
# Alternatively, if you're on a Mac, you can `brew install imagemagick` and
# run this last step natively.
convert *.jpg ../final.pdf
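
Once final.pdf exists, it can be useful to check how much space the recompression saved. The helper below is a minimal sketch and not part of the original steps: `size_reduction` is a hypothetical name, and it only compares byte counts using `wc`, so it says nothing about image quality.

```shell
# Hypothetical helper: print how much smaller the second file is than the
# first, as a whole-number percentage (integer division rounds toward zero).
size_reduction() {
  before=$(wc -c < "$1") || return 1
  after=$(wc -c < "$2") || return 1
  # Arithmetic expansion also strips any padding some wc versions emit.
  before=$((before))
  after=$((after))
  if [ "$before" -eq 0 ]; then
    echo "error: $1 is empty" >&2
    return 1
  fi
  echo $(( (before - after) * 100 / before ))
}
```

Assuming the file names used above and that you are still inside the shrink directory, `size_reduction ../file.pdf ../final.pdf` prints the percentage saved. A negative number means the new PDF came out larger than the original.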