# Gist by @BrianZbr, forked from legumbre/gist:1182280 (last active August 29, 2015 14:19)
# convert a multipage PDF to single-page TIFFs, one numbered file per page
gs -q -dNOPAUSE -dBATCH -sDEVICE=tiffg4 -sOutputFile=%04d.tif source.pdf -c quit
# or use -sDEVICE=pgmraw (with a .pgm output name) to write PGM files instead
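# A minimal sketch of the pgmraw variant mentioned above, wrapped in a function so
# it only runs when called. The function name pdf_to_pgm and the 300 dpi default
# are assumptions for illustration, not part of the original recipe; -r sets the
# Ghostscript rendering resolution.
pdf_to_pgm() {
    src=${1:-source.pdf}   # input PDF (defaults to the name used above)
    dpi=${2:-300}          # assumed scan resolution; adjust to your source
    gs -q -dNOPAUSE -dBATCH -sDEVICE=pgmraw -r"$dpi" \
       -sOutputFile=%04d.pgm "$src" -c quit
}
# usage: pdf_to_pgm source.pdf 300   # writes 0001.pgm, 0002.pgm, ...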
# run unpaper: each scanned image holds two physical pages, so rotate it 90 degrees (--pre-rotate 90),
# read it as a double-page layout (--layout double), and write two output files per input
# (--output-pages 2) to split the two pages apart
unpaper -v --deskew-scan-deviation 3.0 --border-align top --deskew-scan-range 15 --no-grayfilter --no-blurfilter --no-noisefilter --overwrite --pre-rotate 90 --border-scan-step 4 --layout double --output-pages 2 %04d.pgm.pbm unpaper%04d.pbm
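# Optional sanity check, sketched under the assumption that every input sheet
# should yield exactly two split pages (as --output-pages 2 implies). The helper
# name check_split_count is hypothetical; it only compares file counts and does
# not inspect image contents.
check_split_count() {
    dir=${1:-.}
    # $(( ... )) strips the padding some wc implementations add
    n_in=$(( $(find "$dir" -maxdepth 1 -name '*.pgm.pbm' | wc -l) ))
    n_out=$(( $(find "$dir" -maxdepth 1 -name 'unpaper*.pbm' | wc -l) ))
    [ "$n_out" -eq $(( n_in * 2 )) ]
}
# usage: check_split_count . || echo "unexpected number of split pages" >&2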
# trim the pages and convert them to single-page PDFs (six conversions in parallel)
find . -name 'unpaper*' | xargs -I{} -P6 convert -trim +repage {} {}.pdf
# finally, reassemble the PDF with Ghostscript (only the unpaper pages, so source.pdf is not pulled in)
gs -sDEVICE=pdfwrite -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf unpaper*.pdf
# (optional) convert PGM to PBM; run this before unpaper if you used -sDEVICE=pgmraw
find . -name '*.pgm' | xargs -I{} sh -c 'pgmtopbm "$1" > "$1.pbm"' _ {}