This also has a JS version, but as with `Ocrad`, the results are not promising.
It gives better results than `Ocrad` and 'so-called' support for more formats. You can use `imagemagick` to convert images to a supported format.
This is the most accurate and useful OCR software I found. For language support, download the language packs from here and place the extracted data in the `share/tessdata` folder. E.g. download the Chinese Simplified pack and place its extracted data in the `/usr/local/Cellar/tesseract/3.02.02_3/share/tessdata` folder, then run `tesseract` with the `-l chi_sim` option (`tesseract image.pnm out -l chi_sim`). I tried the Google homepage logo, and it seems `tesseract` works better with the `.pnm` format.
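The steps above can be wrapped in a small script. This is only a sketch: the helper names `has_lang` and `ocr_image` and the `TESSDATA_DIR` variable are made up for illustration, and the default path assumes the Homebrew install location mentioned above. It relies on `tesseract` language packs being stored as `<lang>.traineddata` files.

```shell
#!/bin/sh
# Sketch only: TESSDATA_DIR and both helper names are illustrative assumptions.
TESSDATA_DIR="${TESSDATA_DIR:-/usr/local/Cellar/tesseract/3.02.02_3/share/tessdata}"

# Succeeds if the language pack <lang>.traineddata exists in the given folder.
# Usage: has_lang chi_sim /path/to/tessdata
has_lang() {
    [ -f "$2/$1.traineddata" ]
}

# Convert an image to PNM with ImageMagick, then OCR it with tesseract.
# Usage: ocr_image input.jpg out chi_sim   (text is written to out.txt)
ocr_image() {
    input=$1
    outbase=$2
    lang=$3
    if ! has_lang "$lang" "$TESSDATA_DIR"; then
        echo "missing language pack: $TESSDATA_DIR/$lang.traineddata" >&2
        return 1
    fi
    convert "$input" "$outbase.pnm" || return 1
    tesseract "$outbase.pnm" "$outbase" -l "$lang"
}
```

For example, after sourcing the script, `ocr_image logo.png logo chi_sim` should leave the recognized text in `logo.txt`, or fail early if the language pack is not where the script expects it.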
- Convert image to bitonal image
convert input.jpg -threshold 50% output.jpg
# add `-negate` option to invert the image
convert input.jpg -negate output.jpg
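
To apply the same preprocessing to several files, the two commands can be combined in one helper. This is a rough sketch: the function names, the `INVERT` switch, and the `-bw.pnm` naming scheme are assumptions for illustration, and it writes `.pnm` output only because that format seemed to work better with `tesseract` above.

```shell
#!/bin/sh
# Derive the output name, e.g. photo.jpg -> photo-bw.pnm (naming is an assumption).
bitonal_name() {
    printf '%s-bw.pnm\n' "${1%.*}"
}

# Threshold each given image to black and white; set INVERT=1 to also negate it.
# Usage: to_bitonal a.jpg b.png      or      INVERT=1 to_bitonal a.jpg
to_bitonal() {
    for img in "$@"; do
        if [ "${INVERT:-0}" = 1 ]; then
            convert "$img" -threshold 50% -negate "$(bitonal_name "$img")"
        else
            convert "$img" -threshold 50% "$(bitonal_name "$img")"
        fi
    done
}
```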