Created April 25, 2018 03:55
extract sentence using rtesseract
# brew install tesseract
# gem install rtesseract
# gem install mini_magick
# example:
# puts image_has_sentence?('/Users/me/Pictures/hi.png', 'hello world')
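require 'tempfile'
require 'rtesseract'
require 'mini_magick'

def image_has_sentence?(image_path, sentence)
  # Temporary file that receives the preprocessed image.
  image_output = Tempfile.new(['rtesseract', '.jpg'])
  # Scale up 2x, invert, and convert to grayscale. They said it works like a charm. It does!
  MiniMagick::Tool::Convert.new do |convert|
    convert << image_path
    convert.merge! ['-resize', '200%', '-negate', '-set', 'colorspace', 'Gray']
    convert << image_output.path
  end
  # The secret here is the `psm` (page segmentation mode) value. See `tesseract --help`.
  # Different kinds of images benefit from different psm values.
  result = RTesseract.new(image_output.path, processor: 'mini_magick', psm: 1, debug: true)
  # With a block: the block receives the sentence and must return a callable,
  # which is then called with the OCR text.
  return (yield sentence).call result.to_s if block_given?
  result.to_s.include? sentence
end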
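
The block form of `image_has_sentence?` is easy to misread: the block is yielded the sentence and must return a callable, which is then called with the OCR text. A minimal sketch of one such matcher follows, assuming a whitespace- and case-insensitive comparison is wanted. `normalize_ocr` and `FUZZY_MATCH` are hypothetical names introduced here for illustration, not part of the gist or of rtesseract.

```ruby
# Hypothetical helper (not in the original gist): lowercases and collapses
# runs of whitespace so OCR line breaks and casing do not defeat the match.
def normalize_ocr(text)
  text.downcase.gsub(/\s+/, ' ').strip
end

# Hypothetical matcher factory: receives the sentence and returns a callable
# that receives the OCR text, matching the shape the block form expects.
FUZZY_MATCH = lambda do |sentence|
  ->(ocr_text) { normalize_ocr(ocr_text).include?(normalize_ocr(sentence)) }
end

# Usage with the method above (needs tesseract and the image file):
# puts image_has_sentence?('/Users/me/Pictures/hi.png', 'hello world', &FUZZY_MATCH)

# The matcher can be exercised on plain strings without running OCR:
matcher = FUZZY_MATCH.call('Hello World')
puts matcher.call("HELLO\n  world\n") # prints true
puts matcher.call('goodbye')          # prints false
```

Keeping the matcher separate from the OCR call means the comparison logic can be tried on plain strings before involving tesseract at all.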