How to extract images from PDF files recursively in folders in fish shell


First install the tools (Debian/Ubuntu):

sudo apt install poppler-utils imagemagick

To extract all the images:

for file in **.pdf
    pdfimages -all "$file" "$file"
end
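
pdfimages names its output `<image-root>-NNN.<ext>`, so passing the PDF path as the root gives names like `report.pdf-000.png`. If you would rather not have `.pdf` in the middle of the name, a minimal sketch (the variable name `root` is my own, not part of the original recipe) is:

```fish
for file in **.pdf
    # Strip the trailing ".pdf" so outputs are named like "report-000.png"
    set root (string replace -r '\.pdf$' '' -- $file)
    pdfimages -all "$file" "$root"
end
```

Note that `-all` writes embedded JPEG, JPEG2000, JBIG2 and CCITT images in their native formats and only the rest as PNG, so the `*.png` loops below assume the PDFs contain images that come out as PNG.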

Then, since they're full-page images, crop 100 pixels off the bottom of each page to remove the ID number:

for file in *.png
    convert "$file" -crop +0-100 +repage "cropped $file"
end

and then auto-crop the white space:

for file in cropped*.png
    convert "$file" -trim +repage "$file"
end

though the trim still needs to get a little closer, so add some fuzz:

for file in cropped*.png
    convert "$file" -fuzz 10% -trim +repage "$file"
end
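
The three ImageMagick passes above can probably be folded into one, since `convert` applies its operators in order. This is a sketch rather than a tested replacement: it assumes the same 100-pixel bottom strip and 10% fuzz, and the skip check for already-cropped files is my own addition.

```fish
for file in *.png
    # Skip outputs from an earlier run so they are not cropped twice
    if string match -q 'cropped *' -- $file
        continue
    end
    # Drop the bottom 100 pixels, then trim near-white margins in one pass
    convert "$file" -crop +0-100 +repage -fuzz 10% -trim +repage "cropped $file"
end
```

On ImageMagick 7 the same operators work with `magick` in place of `convert`.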

endolith commented Jan 30, 2022

On Windows, just use the poppler-utils inside Windows Subsystem for Linux:

 for %f in (*.pdf) do wsl pdfimages -all "%f" "%f"

(In a batch file, write %%f instead of %f.)

