@jarodsmk
Created September 23, 2019 11:51
Correct text-image orientation with Python/Tesseract/OpenCV
import cv2
import pytesseract
import urllib.request
import numpy as np
import re

# Installs: https://www.learnopencv.com/deep-learning-based-text-recognition-ocr-using-tesseract-and-opencv/

if __name__ == '__main__':
    # Uncomment the line below to provide the path to tesseract manually
    # pytesseract.pytesseract.tesseract_cmd = '/usr/bin/tesseract'

    # Read image from URL
    # Taken from https://stackoverflow.com/questions/21061814/how-can-i-read-an-image-from-an-internet-url-in-python-cv2-scikit-image-and-mah
    # Sample images:
    # https://i.ibb.co/4mm9WvZ/book-rot.jpg
    # https://i.ibb.co/M7jwWR2/book.jpg
    # https://i.ibb.co/27bKNJ8/book-rot2.jpg
    resp = urllib.request.urlopen('https://i.ibb.co/27bKNJ8/book-rot2.jpg')
    image = np.asarray(bytearray(resp.read()), dtype="uint8")
    image = cv2.imdecode(image, cv2.IMREAD_COLOR)  # Initially decode as color

    # Taken from https://www.pyimagesearch.com/2017/02/20/text-skew-correction-opencv-python/
    # Convert the image to grayscale and flip the foreground and background
    # to ensure the foreground is "white" and the background is "black"
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    gray = cv2.bitwise_not(gray)

    # Ask Tesseract's orientation-and-script detection (OSD) for the rotation
    rot_data = pytesseract.image_to_osd(image)
    print("[OSD] " + rot_data)
    rot = re.search(r'(?<=Rotate: )\d+', rot_data).group(0)

    angle = float(rot)
    if angle > 0:
        angle = 360 - angle
    print("[ANGLE] " + str(angle))

    # Rotate the image to deskew it
    (h, w) = image.shape[:2]
    center = (w // 2, h // 2)
    M = cv2.getRotationMatrix2D(center, angle, 1.0)
    rotated = cv2.warpAffine(image, M, (w, h),
                             flags=cv2.INTER_CUBIC, borderMode=cv2.BORDER_REPLICATE)

    # TODO: Rotated image can be saved here
    print(pytesseract.image_to_osd(rotated))

    print("[TEXT]")
    # Run Tesseract OCR on the deskewed image
    text = pytesseract.image_to_string(rotated, lang='eng', config="--psm 1")
    # Print recognized text
    print(text.encode(encoding='UTF-8'))
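The `Rotate:` value is pulled out of Tesseract's plain-text OSD report with a lookbehind regex. A minimal sketch of that parsing step against a sample report (the report text and its values below are illustrative, not actual output for this image):

```python
import re

# Illustrative OSD report in the format Tesseract prints (values are made up)
osd_report = """Page number: 0
Orientation in degrees: 270
Rotate: 90
Orientation confidence: 12.34
Script: Latin
Script confidence: 5.67"""

# Same lookbehind regex as the gist, written as a raw string
rot = int(re.search(r'(?<=Rotate: )\d+', osd_report).group(0))

# Mirror the gist's angle adjustment for cv2's rotation convention
angle = 360 - rot if rot > 0 else 0
print(rot, angle)  # 90 270
```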
@kucukagan

Terrible. [Translated from Turkish: "berbat"]

@jarodsmk
Author

@kucukagan - Sorry that this wasn't what you were looking for. I made this a gist because it was a quick mock-up of what I actually did, not a proper repository.

Would you care to point out what you didn't like about this post that prompted your comment?

@Goutam-Kelam

The code is very helpful. However, I found one change that could really improve it.
Currently you find an angle, decide whether the image should be rotated clockwise or anticlockwise, and then use cv2 functions to build the rotation matrix and warp the image. The problem is that OpenCV does not automatically allocate enough space for the entire rotated image to fit in the output frame. As a result, if your image is rectangular and rotated +90 or -90 degrees, half of it will be cut off once the orientation is corrected.

To avoid that, you can rotate with bound awareness. I am using the imutils function from https://www.pyimagesearch.com/2021/01/20/opencv-rotate-image/

import cv2
import pytesseract
import urllib.request
import numpy as np
import re
import imutils  # added

resp = urllib.request.urlopen('https://i.ibb.co/27bKNJ8/book-rot2.jpg')
image = np.asarray(bytearray(resp.read()), dtype="uint8")
image = cv2.imdecode(image, cv2.IMREAD_COLOR)  # Initially decode as color
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.bitwise_not(gray)

rot_data = pytesseract.image_to_osd(image)
print("[OSD] " + rot_data)
rot = re.search(r'(?<=Rotate: )\d+', rot_data).group(0)

angle = float(rot)

# Rotate the image to deskew it, expanding the canvas so nothing is clipped
rotated = imutils.rotate_bound(image, angle)  # added

# TODO: Rotated image can be saved here
print(pytesseract.image_to_osd(rotated))
print("[TEXT]")
# Run Tesseract OCR on the deskewed image
text = pytesseract.image_to_string(rotated, lang='eng', config="--psm 6")

# Print recognized text
print(text)
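rotate_bound avoids the clipping by computing the bounding box of the rotated image and enlarging the output canvas before warping. A minimal sketch of the dimension math involved (a pure-math illustration of the idea, not imutils' actual source):

```python
import math

def rotated_bounds(w, h, angle_degrees):
    """Size of the axis-aligned bounding box of a w x h image
    rotated by angle_degrees about its centre."""
    cos = abs(math.cos(math.radians(angle_degrees)))
    sin = abs(math.sin(math.radians(angle_degrees)))
    new_w = int(h * sin + w * cos)
    new_h = int(h * cos + w * sin)
    return new_w, new_h

# A 200x100 landscape image rotated 90 degrees needs a 100x200 canvas;
# warping into the original 200x100 frame would clip half of it.
print(rotated_bounds(200, 100, 90))  # (100, 200)
print(rotated_bounds(200, 100, 0))   # (200, 100)
```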
