Skip to content

Instantly share code, notes, and snippets.

@IgnoredAmbience
Last active August 23, 2019 10:14
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save IgnoredAmbience/334526541f2cdf83cfb3f83c6518a895 to your computer and use it in GitHub Desktop.
Save IgnoredAmbience/334526541f2cdf83cfb3f83c6518a895 to your computer and use it in GitHub Desktop.
Tiff and PDF Conversion Commands

Extracting PDF Files to Images

PDF files that contain a single image are often a pain to work with, but it is simple to extract that image using pdfimages.

First, run pdfimages with the list command to check the document is composed of a single image:

$ pdfimages -list file.pdf
page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
--------------------------------------------------------------------------------------------
   1     0 image    3508  2480  rgb     3   8  jpeg   no         8  0   300   300  409K 1.6%

If the output gives one image per page, then it should be possible to extract the image to the raw image file. The following command will extract the file to its native format (if tiff or jpeg), or fallback to png otherwise. You should confirm that the two resulting files are visually identical before discarding the original.

pdfimages -png -tiff -j file.pdf file

Converting corrupted/invalid TIFF Files into another image format

libtiff is reporting the following error for a number of my tiff files and refuses to parse them. (Using feh as an image viewer that uses libtiff as a backend).

$ feh file.tif
TIFFReadDirectoryCheckOrder: Warning, Invalid TIFF directory; tags are not sorted in ascending order.
JPEGPreDecode: Improper JPEG sampling factors 2,2
Apparently should be 4,1..
...

mupdf is able to read such files and the companion program mutool can convert them.

$ mutool convert -o file.png file.tif
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment