@ahmed-musallam
Last active March 10, 2024 13:53
How to compress PDF with ghostscript

How to compress PDF using ghostscript

As a developer, it bothers me when someone sends me a PDF file that is large relative to its number of pages. Recently, I received a 12MB scanned document for just one letter-sized page... so I got to googling, like I usually do, and found Ghostscript!

To learn more about Ghostscript (gs): https://www.ghostscript.com/

What we are interested in is the gs command-line tool, which provides many options for manipulating PDFs. Here, we want to compress those large PDFs into small yet legible documents.

Credit goes to this answer on the Ask Ubuntu forum: https://askubuntu.com/questions/3382/reduce-filesize-of-a-scanned-pdf/3387#3387?newreg=bceddef8bc334e5b88bbfd17a6e7c4f9

The steps below were only tried on macOS Sierra.

You can install gs via the official site or via Homebrew:

brew install ghostscript

Now, to compress a PDF:

gs \
 -q -dNOPAUSE -dBATCH -dSAFER \
 -sDEVICE=pdfwrite \
 -dCompatibilityLevel=1.3 \
 -dPDFSETTINGS=/screen \
 -dEmbedAllFonts=true -dSubsetFonts=true \
 -dColorImageDownsampleType=/Bicubic \
 -dColorImageResolution=144 `# PDF downsample color image resolution` \
 -dGrayImageDownsampleType=/Bicubic \
 -dGrayImageResolution=144 `# PDF downsample gray image resolution` \
 -dMonoImageDownsampleType=/Bicubic \
 -dMonoImageResolution=144 `# PDF downsample mono image resolution` \
 -sOutputFile=out.pdf `# Output file` \
 file.pdf `# Input file`

You can find documentation on Ghostscript commands here: https://www.ghostscript.com/doc/current/Use.htm#Options

You'll notice that I set all the ImageResolution options to 144. I found that this value gives the best results for legible text scans, but you can change it to whatever you like.

I also added a function to my .bash_profile as a shorthand that compresses file.pdf and writes the result to file.pdf.compressed.pdf:

pdfcompress ()
{
   gs -q -dNOPAUSE -dBATCH -dSAFER -sDEVICE=pdfwrite -dCompatibilityLevel=1.3 -dPDFSETTINGS=/screen -dEmbedAllFonts=true -dSubsetFonts=true -dColorImageDownsampleType=/Bicubic -dColorImageResolution=144 -dGrayImageDownsampleType=/Bicubic -dGrayImageResolution=144 -dMonoImageDownsampleType=/Bicubic -dMonoImageResolution=144 -sOutputFile="$1.compressed.pdf" "$1"
}

Use it: pdfcompress somefile.pdf

@mideoye

mideoye commented May 23, 2020

Hi Ahmed, thanks for posting this - was really helpful :)

I just wanted to notify you about a little error in the description of the input and output files:

-sOutputFile=out.pdf \ #Output file
file.pdf #Input file

@ahmed-musallam
Author

@mideoye, fixed it! Thank you!

@oliverlambson

oliverlambson commented Jan 4, 2021

I removed the original file extension so instead of file.pdf.compressed.pdf you get file.compressed.pdf

pdfcompress ()
{
   gs -q -dNOPAUSE -dBATCH -dSAFER -sDEVICE=pdfwrite -dCompatibilityLevel=1.3 -dPDFSETTINGS=/screen -dEmbedAllFonts=true -dSubsetFonts=true -dColorImageDownsampleType=/Bicubic -dColorImageResolution=144 -dGrayImageDownsampleType=/Bicubic -dGrayImageResolution=144 -dMonoImageDownsampleType=/Bicubic -dMonoImageResolution=144 -sOutputFile="${1%.*}.compressed.pdf" "$1"
}

Thanks for the script! Adding it to my .zshrc is going to save loads of wasted googling time every time I need to compress a pdf!

@dheadshot

I found I got some corrupted pages unless I used -dCompatibilityLevel=1.4 instead of -dCompatibilityLevel=1.3 and 144 Resolution seemed to scale it up, so I changed that to 64 and it worked perfectly. Settings are probably specific to my scenario though, so YMMV; just be aware these might need adjusting.

Thanks for this!

@rajan-31

I am getting an error on Windows 10, PowerShell 5:

gswin64c.exe `
-q -dNOPAUSE -dBATCH -dSAFER `
-sDEVICE=pdfwrite `
-dCompatibilityLevel="1.3" `
-dPDFSETTINGS=/screen `
-dEmbedAllFonts=true `
-dSubsetFonts=true `
-dColorImageDownsampleType=/Bicubic `
-dColorImageResolution=144 `
-dGrayImageDownsampleType=/Bicubic `
-dGrayImageResolution=144 `
-dMonoImageDownsampleType=/Bicubic `
-dMonoImageResolution=144 `
-sOutputFile="output.pdf" "input.pdf"
**** Error: stream operator isn't terminated by valid EOL.
               Output may be incorrect.
**** Error: stream operator isn't terminated by valid EOL.
               Output may be incorrect.

However, I can open the compressed PDF, and nothing seems wrong or corrupted.

@rootwork

Thanks for this! Three notes:

  • I think the very first line of your script needs a \ after gs.
  • While I didn't get corrupted pages like @dheadshot, I got dramatically smaller file sizes using -dCompatibilityLevel=1.4 -- like, 90% smaller files. PDF 1.4 was released in 2001, so I can't imagine there are that many systems or software that can't deal with it.
  • In situations where the PDF was destined to be printed or I otherwise knew it wouldn't matter, adding -dNOTRANSPARENCY saved a little bit extra as well.
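
A rough sketch of how the last two notes could be folded into the gist's function. The name pdfcompress14 and its optional second argument are made up for illustration; only -dCompatibilityLevel=1.4 and -dNOTRANSPARENCY come from the notes above, and any savings will depend on the input file.

```shell
# Hypothetical variant of the gist's pdfcompress function (the name and the
# "notransparency" argument are assumptions, not part of the original gist).
# Usage: pdfcompress14 input.pdf [notransparency]
pdfcompress14() {
  local input="$1"
  local extra=""
  if [ "${2:-}" = "notransparency" ]; then
    # Only sensible when transparency does not matter, e.g. print-only PDFs.
    extra="-dNOTRANSPARENCY"
  fi
  # $extra is left unquoted on purpose: when empty it adds no argument.
  gs -q -dNOPAUSE -dBATCH -dSAFER \
    -sDEVICE=pdfwrite \
    -dCompatibilityLevel=1.4 \
    -dPDFSETTINGS=/screen \
    -dEmbedAllFonts=true -dSubsetFonts=true \
    -dColorImageDownsampleType=/Bicubic -dColorImageResolution=144 \
    -dGrayImageDownsampleType=/Bicubic -dGrayImageResolution=144 \
    -dMonoImageDownsampleType=/Bicubic -dMonoImageResolution=144 \
    $extra \
    -sOutputFile="${input%.*}.compressed.pdf" \
    "$input"
}
```

Something like pdfcompress14 scan.pdf notransparency would then write scan.compressed.pdf next to the original.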

@NightSpirit2099

I found I got some corrupted pages unless I used -dCompatibilityLevel=1.4 instead of -dCompatibilityLevel=1.3 and 144 Resolution seemed to scale it up, so I changed that to 64 and it worked perfectly. Settings are probably specific to my scenario though, so YMMV; just be aware these might need adjusting.

Thanks for this!

I just added an argument to the function, so you can define the dpi when you call it.
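
The code for that change was not posted, so the following is only a guess at what such a function might look like. The name pdfcompress_dpi, the default of 144 and the digit check are assumptions; it uses -dCompatibilityLevel=1.4 as suggested in the quoted comment.

```shell
# Hypothetical sketch: like pdfcompress, but with an optional dpi argument.
# Usage: pdfcompress_dpi input.pdf [dpi]   (dpi defaults to 144, as in the gist)
pdfcompress_dpi() {
  local input="$1"
  local dpi="${2:-144}"
  # Reject anything that is not made up of digits only.
  case "$dpi" in
    ''|*[!0-9]*) echo "dpi must be a whole number" >&2; return 1 ;;
  esac
  gs -q -dNOPAUSE -dBATCH -dSAFER \
    -sDEVICE=pdfwrite \
    -dCompatibilityLevel=1.4 \
    -dPDFSETTINGS=/screen \
    -dEmbedAllFonts=true -dSubsetFonts=true \
    -dColorImageDownsampleType=/Bicubic -dColorImageResolution="$dpi" \
    -dGrayImageDownsampleType=/Bicubic -dGrayImageResolution="$dpi" \
    -dMonoImageDownsampleType=/Bicubic -dMonoImageResolution="$dpi" \
    -sOutputFile="${input%.*}.compressed.pdf" \
    "$input"
}
```

For example, pdfcompress_dpi scan.pdf 64 would apply the 64 dpi setting mentioned in the quote, while pdfcompress_dpi scan.pdf keeps 144.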

@muzimuzhi

you can find documentation on ghostcript commands here: https://www.ghostscript.com/doc/current/Use.htm#Options

Link is broken. Try this one instead: https://ghostscript.readthedocs.io/en/latest/Use.html#command-line-options.

@andrewschaeffer

Worked like a charm! thanks!

@Herz3h

Herz3h commented May 3, 2023

Adding this link here as it is interesting: https://www.ghostscript.com/blog/optimizing-pdfs.html

@JaosnHsieh

Thank you. In my case, compression only works if changed to -dCompatibilityLevel=1.4 instead of -dCompatibilityLevel=1.3.

-dCompatibilityLevel=1.3: 11MB to 12MB

-dCompatibilityLevel=1.4: 11MB to 800KB
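
One way to check numbers like these on your own files is to print both sizes side by side after each run. A minimal sketch; pdf_size_report is a made-up helper name, and it only uses wc, not Ghostscript.

```shell
# Hypothetical helper: compare the byte size of an original and a compressed file.
# Usage: pdf_size_report original.pdf compressed.pdf
pdf_size_report() {
  local before after
  if [ ! -f "$1" ] || [ ! -f "$2" ]; then
    echo "usage: pdf_size_report original.pdf compressed.pdf" >&2
    return 1
  fi
  # The arithmetic expansion strips the padding some wc versions add.
  before=$(( $(wc -c < "$1") ))
  after=$(( $(wc -c < "$2") ))
  if [ "$before" -eq 0 ]; then
    echo "$1 is empty" >&2
    return 1
  fi
  # Integer percentage of the original size (rounded down).
  printf '%s: %s bytes -> %s: %s bytes (%s%% of original)\n' \
    "$1" "$before" "$2" "$after" "$(( after * 100 / before ))"
}
```

A result above 100% would correspond to the first case reported here, where the output grew instead of shrinking.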

@didim99

didim99 commented Jun 29, 2023

@JaosnHsieh thank you, it also works for me!

@vanabel

vanabel commented Jul 6, 2023

Great, it works smoothly on my Mac.

@FaheemOnHub

I want to create a PDF compression website, can you help me with that?

@vanabel

vanabel commented Jul 11, 2023

I want to create a PDF compression website, can you help me with that?

It should be very simple, if you mean uploading multiple PDF files, combining them into one, and letting the uploader download the combined file.

@FaheemOnHub

I don't want to make a combiner, I want to make a compression tool which will reduce the size of the PDF file.

@vanabel

vanabel commented Jul 12, 2023

I am sorry that I misunderstood you. They are almost the same in principle: just change the shell command behind the PHP/Node code.

@kwvg

kwvg commented Jul 24, 2023

Been wanting a convenient PDF compression command defined in my shell profile. I've written my own attempts, but none of them seemed to retain an acceptable level of quality. This, alongside the suggestions made here, has found a new home in my .zshrc. Thanks for your help!~

@MikaelFangel

MikaelFangel commented Oct 28, 2023

I believe setting the monochrome downsampling type to Bicubic doesn't contribute to the overall compression or quality of the PDF. See this quote from the Ghostscript blog:

Moving on, the final parameter is the type of downsampling to apply, and there are three possibilities: Subsample, Average and Bicubic. For Monochrome images, we can only use Subsample because the other types involve coming up with some kind of average value of the pixels we are considering. Since monochrome images can only have black or white pixels, there’s no way to come up with an average.

Source
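
Following that quote, a variant of the gist's function could ask for /Subsample on monochrome images explicitly. This is only a sketch: the name pdfcompress_subsample is invented here, and whether the output differs from the /Bicubic setting will depend on the Ghostscript version and the file.

```shell
# Hypothetical variant: same options as the gist's pdfcompress, except that
# monochrome images use /Subsample, the only type the quoted blog post says
# applies to them. The function name is an assumption.
# Usage: pdfcompress_subsample input.pdf
pdfcompress_subsample() {
  local input="$1"
  gs -q -dNOPAUSE -dBATCH -dSAFER \
    -sDEVICE=pdfwrite \
    -dCompatibilityLevel=1.3 \
    -dPDFSETTINGS=/screen \
    -dEmbedAllFonts=true -dSubsetFonts=true \
    -dColorImageDownsampleType=/Bicubic -dColorImageResolution=144 \
    -dGrayImageDownsampleType=/Bicubic -dGrayImageResolution=144 \
    -dMonoImageDownsampleType=/Subsample -dMonoImageResolution=144 \
    -sOutputFile="${input%.*}.compressed.pdf" \
    "$input"
}
```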
