@jbl0ndie
Created August 6, 2016 09:14
OCR a pdf in OS X
#User downloads a PDF, but it's an image-only document and the text cannot be searched
#open a terminal
#brew install tesseract (unless you already have it installed)
#open the document in Preview and export to a TIFF document (multipage is supported; 150 dpi seems OK)
#change to the file's directory in the terminal so you don't have to type the full path
#tesseract filename.tiff outputfilename pdf
#tesseract then crunches through your file and creates outputfilename.pdf, a PDF with searchable text
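
The steps above could be wrapped in a small shell function. This is only a sketch: the names `output_base` and `ocr_tiff` and the `_ocr` suffix are made up for illustration, and it assumes the TIFF has already been exported from Preview.

```shell
#!/bin/sh
# Hypothetical helper: derive an output base name from the input file name,
# e.g. scan.tiff -> scan_ocr (tesseract adds the .pdf extension itself).
output_base() {
  name=$(basename "$1")
  printf '%s_ocr\n' "${name%.*}"
}

# Hypothetical wrapper around the tesseract command from the steps above.
ocr_tiff() {
  input=$1
  # Stop early if tesseract is not installed.
  if ! command -v tesseract >/dev/null 2>&1; then
    echo "tesseract not found; try: brew install tesseract" >&2
    return 1
  fi
  # Stop early if the exported TIFF is missing.
  if [ ! -f "$input" ]; then
    echo "no such file: $input" >&2
    return 1
  fi
  # The trailing "pdf" asks tesseract to write a searchable PDF.
  tesseract "$input" "$(output_base "$input")" pdf
}
```

With these functions loaded, `ocr_tiff filename.tiff` should write `filename_ocr.pdf` to the current directory.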