Jakob Buron jburon

jburon / ocrpdf.sh
Last active January 23, 2018 16:42 — forked from wcaleb/ocrpdf.sh
Take a PDF, OCR it, and add the OCR text as a background layer to the original PDF to make it searchable
#!/bin/sh
# Take a PDF, OCR it, and add the OCR text as a background layer to the original PDF to make it searchable.
# Hacked together using tips from these websites:
# http://www.jlaundry.com/2012/ocr-a-scanned-pdf-with-tesseract/
# http://askubuntu.com/questions/27097/how-to-print-a-regular-file-to-pdf-from-command-line
# Dependencies: pdftk, tesseract, imagemagick, hocr2pdf
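# Optional pre-flight check (a sketch, not part of the original script):
# report any dependency that is not found on PATH before doing any work.
# Assumptions: check_deps is a hypothetical helper name, and ImageMagick is
# assumed to be installed under the command name `convert`.
check_deps() {
    missing=0
    for cmd in "$@"; do
        # command -v is the POSIX way to test whether a command exists
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing dependency: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}
# Only warn here; the steps below will fail on their own if a tool is absent.
check_deps pdftk tesseract convert hocr2pdf || echo "warning: some tools are missing; later steps may fail" >&2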
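# Keep a backup of the original PDF; quoting "$1" handles paths with spaces.
cp "$1" "$1.bak"
# Split the PDF into one file per page: tesspage_01.pdf, tesspage_02.pdf, ...
pdftk "$1" burst output tesspage_%02d.pdf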