@dan-r95
Created April 13, 2023 10:04
OCR all PDFs - make them searchable
#!/bin/zsh
# create the output directory if it does not exist yet
mkdir -p output
# iterate through all PDF files in the current directory
for file in *.pdf
do
  # get the filename without the extension
  filename="${file%.*}"
  # build the output path: output/<name>-s.pdf
  newfile="output/$filename-s.pdf"
  # run OCR (German + English), auto-rotate and deskew pages,
  # clean pages before OCR, and write PDF/A
  ocrmypdf -l deu+eng --rotate-pages --deskew --jobs 8 --clean --rotate-pages-threshold 1.5 --output-type pdfa "$file" "$newfile"
done
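
A batch like this may be interrupted or re-run after new scans arrive. The following is a minimal sketch, not part of the original gist, of how the loop could skip files that already have an output. The helper names output_name and ocr_pending are hypothetical; the ocrmypdf flags are the ones used in the script above.

```shell
#!/bin/sh
# Sketch only: output_name and ocr_pending are hypothetical helper names,
# not part of the original script.

# Map an input PDF name to its output path, e.g. "a.pdf" -> "output/a-s.pdf".
output_name() {
  printf 'output/%s-s.pdf\n' "${1%.*}"
}

# OCR every PDF in the current directory that has no output file yet.
ocr_pending() {
  mkdir -p output
  for file in *.pdf
  do
    # sh/bash leave the pattern unexpanded when nothing matches; skip it
    [ -e "$file" ] || continue
    newfile="$(output_name "$file")"
    # skip files that were already converted in an earlier run
    if [ -e "$newfile" ]; then
      echo "skipping $file (already done)"
      continue
    fi
    ocrmypdf -l deu+eng --rotate-pages --deskew --jobs 8 --clean \
      --rotate-pages-threshold 1.5 --output-type pdfa "$file" "$newfile"
  done
}
```

Calling ocr_pending from the directory that holds the PDFs runs the batch. Note that zsh, unlike sh and bash, aborts with "no matches found" by default when no PDF is present, so the existence check only matters in the other shells.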