IgnoredAmbience/pdf.md

## pdf.md

      
### Extracting PDF Files to Images

PDF files that contain a single image are often a pain to work with, but it is simple to extract that image using pdfimages.
First, run pdfimages with the `-list` option to check that the document is composed of a single image:

    $ pdfimages -list file.pdf
    page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
    --------------------------------------------------------------------------------------------
       1     0 image    3508  2480  rgb     3   8  jpeg   no         8  0   300   300  409K 1.6%
If the listing shows exactly one image per page, each image can be extracted to a standalone image file.
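
For PDFs with many pages, the one-image-per-page check can be automated. The following is a minimal sketch, not part of pdfimages itself: `one_image_per_page` is a hypothetical helper name, and it assumes the `-list` layout shown above (two header lines, page number in the first column). It counts every row, so masks and soft masks listed alongside an image also cause the check to fail.

```shell
# Reads `pdfimages -list` output on stdin and succeeds only if every
# listed page has exactly one row. Hypothetical helper, assuming the
# two-line header and the page number in column 1.
one_image_per_page() {
    awk 'NR > 2 && NF > 0 {
             if (!($1 in count)) pages++   # first row seen for this page
             count[$1]++
         }
         END {
             if (pages == 0) exit 1        # nothing listed at all
             for (p in count) if (count[p] != 1) exit 1
         }'
}
```

It could be used as `pdfimages -list file.pdf | one_image_per_page && echo "one image per page"`.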
The following command writes JPEG-encoded images out unchanged (`-j`) and converts anything else to PNG or TIFF. The output files are named after the given root, e.g. `file-000.jpg`.

    $ pdfimages -png -tiff -j file.pdf file

Confirm that the extracted image is visually identical to the PDF page before discarding the original.

  
## tiff.md

      
### Converting corrupted/invalid TIFF Files into another image format

libtiff reports the following errors for a number of my TIFF files and refuses to parse them. (I am using feh as an image viewer, which uses libtiff as a backend.)

    $ feh file.tif
    TIFFReadDirectoryCheckOrder: Warning, Invalid TIFF directory; tags are not sorted in ascending order.
    JPEGPreDecode: Improper JPEG sampling factors 2,2
    Apparently should be 4,1..
    ...
mupdf is able to read such files, and its companion program mutool can convert them:

    $ mutool convert -o file.png file.tif
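
Since the problem affects a number of files, the same command can be wrapped in a loop. This is only a sketch: `tiffs_to_png` is a hypothetical helper name, and it assumes each input is a single-page TIFF whose PNG should be written next to it under the same base name.

```shell
# Convert each TIFF passed as an argument to a PNG with the same base name.
# Hypothetical helper; it uses only the `mutool convert -o OUT IN` form
# shown above.
tiffs_to_png() {
    status=0
    for src in "$@"; do
        out="${src%.*}.png"              # file.tif -> file.png
        if mutool convert -o "$out" "$src"; then
            printf 'converted %s -> %s\n' "$src" "$out"
        else
            printf 'failed: %s\n' "$src" >&2
            status=1                     # remember failure, keep going
        fi
    done
    return "$status"
}
```

For example, `tiffs_to_png *.tif` converts every TIFF in the current directory and returns a non-zero status if any conversion failed.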