@geakstr
Created January 8, 2014 16:30
Install Tesseract OCR and a bash script for using it
sudo apt-get install \
imagemagick libpng12-dev libjpeg8-dev libtiff4-dev bc \
libtesseract-dev libtesseract3 tesseract-ocr tesseract-ocr-equ \
tesseract-ocr-osd tesseract-ocr-eng tesseract-ocr-rus
touch i2t
chmod +x i2t
nano i2t
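#!/bin/bash
# Usage 1: i2t image.png rus
# Usage 2: i2t image.png
# The recognized text is written to <image>.txt next to the input file.
file_in=$1
# Default to English plus Russian when no language is given
langs=${2:-eng+rus}
# Read the horizontal resolution, e.g. "72" or "72 PixelsPerInch"
dpi=$(convert "$file_in" -format "%x" info:)
# Keep only the leading integer part so the numeric test below works
dpi=${dpi%%[!0-9]*}
# Resample low-resolution (or unknown-resolution) images to 300 DPI before OCR
if [ "${dpi:-0}" -lt 300 ]
then
    convert "$file_in" -resample 300 .ocr.tmp.tif
else
    convert "$file_in" .ocr.tmp.tif
fi
tesseract .ocr.tmp.tif "$file_in" -l "$langs"
rm -f .ocr.tmp.tif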
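
The script compares the resolution reported by `convert` against 300, so the `%x` string has to be reduced to a plain integer first. The sketch below shows one way to do that in pure bash. It is not part of the original gist: `normalize_dpi` is a hypothetical helper name, and it assumes that `%x` may come back as a bare number, a decimal, or a number followed by a unit word such as `PixelsPerInch` or `PixelsPerCentimeter`, which can vary between ImageMagick versions.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not from the original script):
# turn ImageMagick's "%x" output into a whole number of pixels per inch.
normalize_dpi() {
    local raw=$1
    # Drop everything from the first non-digit onward:
    # "72.009 PixelsPerInch" becomes "72".
    local value=${raw%%[!0-9]*}
    # Treat a missing or unparsable resolution as 0 so the caller resamples.
    value=${value:-0}
    case $raw in
        # Convert centimeter-based resolutions to inches (1 inch = 2.54 cm).
        # 10# forces base 10 in case of leading zeros.
        *PixelsPerCentimeter*) value=$(( 10#$value * 254 / 100 )) ;;
    esac
    echo "$value"
}
```

Inside the script this could be used as `dpi=$(normalize_dpi "$(convert "$file_in" -format "%x" info:)")`. The integer truncation rounds down, so an image at exactly 300 DPI stored in centimeter units may be reported slightly below 300 and resampled anyway.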