@drjwbaker
Last active December 5, 2016 17:35
Tesseract OCR Engine
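import pytesseract
import requests
from io import BytesIO
from PIL import Image
from PIL import ImageFilter


def process_image(url):
    """Download the image at `url` and return the text Tesseract recognises in it."""
    image = _get_image(url)
    # Optional preprocessing, currently disabled. resize() and filter() return a
    # new image, so the result must be assigned back for it to take effect.
    # image = image.resize(tuple(int(2 * s) for s in image.size))
    # image = image.filter(ImageFilter.SHARPEN)
    # image = image.filter(ImageFilter.EDGE_ENHANCE)
    # image = image.filter(ImageFilter.FIND_EDGES)
    return pytesseract.image_to_string(image)


def _get_image(url):
    """Fetch `url` and open the response body as a PIL image."""
    # The response body is bytes, so wrap it in BytesIO (works on Python 2.6+ and 3).
    return Image.open(BytesIO(requests.get(url).content))


# edit 5/12/16: image filters and resizing need to be tested to see what works best.
# adapted from https://realpython.com/blog/python/setting-up-a-simple-ocr-server/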
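
The note above says the filters and resizing still need testing. One way to do that might look like the sketch below: it runs OCR once per candidate preprocessing step and collects the results so they can be compared by eye. The helper names `preprocessing_variants` and `compare_preprocessing` are hypothetical and not part of the original gist. The `ocr` parameter is also an assumption, added so a stand-in function can replace Tesseract when experimenting.

```python
from PIL import Image, ImageFilter


def preprocessing_variants(image):
    """Yield (name, image) pairs, one per candidate preprocessing step."""
    # Filters do not work on palette images, so normalise to RGB first.
    image = image.convert("RGB")
    yield "original", image
    yield "resize_2x", image.resize(tuple(int(2 * s) for s in image.size))
    yield "sharpen", image.filter(ImageFilter.SHARPEN)
    yield "edge_enhance", image.filter(ImageFilter.EDGE_ENHANCE)
    yield "find_edges", image.filter(ImageFilter.FIND_EDGES)


def compare_preprocessing(image, ocr=None):
    """Run OCR on every variant and return {variant name: recognised text}."""
    if ocr is None:
        # Imported lazily so the helper can be tried with a stand-in OCR function.
        import pytesseract
        ocr = pytesseract.image_to_string
    return {name: ocr(variant) for name, variant in preprocessing_variants(image)}
```

With the gist's own helper, `compare_preprocessing(_get_image(url))` would return one string per variant. Which variant reads best is likely to depend on the source image, so it is worth trying on several representative scans.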
@drjwbaker
Author

Yay that works! Thanks Sarah!

@snim2
snim2 commented Dec 5, 2016

np
