@dsanson
Created September 15, 2010 15:13
pdfocr.sh: OCR scanned PDFs to make them searchable
#!/bin/sh
#
# This is a shell wrapper around the command line version of
# [VelOCRaptor](http://www.velocraptor.com/), an affordable
# OCR program for OS X.
#
if [ "$#" -eq 2 ]; then
    # Two arguments: write the OCRed copy to the given output path.
    echo "Scanning file..."
    /Applications/VelOCRaptor.app/Contents/SharedSupport/velocraptor.rb "$1" "$2" && \
        echo "OCRed file saved as $2."
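elif [ "$#" -eq 1 ]; then
    # One argument: OCR into a temporary file, then swap it with the original.
    tmp=$(mktemp -t "${1##*/}") || exit 1
    output="$tmp.pdf"
    echo "Scanning file..."
    /Applications/VelOCRaptor.app/Contents/SharedSupport/velocraptor.rb "$1" "$output" && \
        mv "$1" "${1%%.pdf}_notext.pdf" && \
        mv "$output" "$1" && \
        echo "Original file saved as ${1%%.pdf}_notext.pdf"
    status=$?
    rm -f "$tmp" "$output"  # remove the mktemp placeholder and any leftover output
    exit "$status"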
else
    # Wrong number of arguments: print usage to stderr and fail.
    echo "Usage: ${0##*/} input.pdf [output.pdf]" >&2
    exit 1
fi
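
The script relies on two POSIX parameter expansions: `${1##*/}` strips the directory part of the input path, and `${1%%.pdf}` strips a trailing `.pdf` before `_notext.pdf` is appended. The sketch below isolates them so they can be tried without VelOCRaptor installed. The helper names `base_name` and `backup_name` and the path `scans/report.pdf` are made up for illustration and are not part of pdfocr.sh.

```shell
#!/bin/sh
# Illustration only: the same expansions pdfocr.sh applies to "$1".
# base_name and backup_name are hypothetical helpers, not part of the script.

# Strip everything up to and including the last slash.
base_name() {
    printf '%s\n' "${1##*/}"
}

# Strip a trailing ".pdf", then append the backup suffix.
backup_name() {
    printf '%s\n' "${1%%.pdf}_notext.pdf"
}

base_name "scans/report.pdf"     # prints report.pdf
backup_name "scans/report.pdf"   # prints scans/report_notext.pdf
```

The suffix match is case-sensitive, so an input named `scan.PDF` would be backed up as `scan.PDF_notext.pdf`.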