@SourabhJaz
Last active December 26, 2019 05:18

Approach 1: OCR Space (Link)

OCR Space provides an easy-to-use API for integrating OCR with any system. The free tier allows up to 25,000 calls per month, with a maximum document size of 1 MB and a maximum of 500 calls per day (as of December 2019). We observed an accuracy of 55% on high quality images, 50% on medium quality images and 45% on low quality images. These numbers didn't look promising, particularly for medium and low quality images.
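As a rough illustration of what a call to this API can look like, the sketch below uploads one image to OCR Space's public `parse/image` endpoint and pulls the text out of the JSON response. It is not the script used for the numbers above. The function names are hypothetical, the third-party `requests` package is assumed to be installed, and the response fields used (`ParsedResults`, `ParsedText`, `IsErroredOnProcessing`, `ErrorMessage`) should be checked against the current OCR Space documentation.

```python
OCR_SPACE_ENDPOINT = "https://api.ocr.space/parse/image"


def extract_text(response_json):
    """Join the text of every parsed page in an OCR Space JSON response."""
    if response_json.get("IsErroredOnProcessing"):
        raise RuntimeError(f"OCR failed: {response_json.get('ErrorMessage')}")
    results = response_json.get("ParsedResults") or []
    return "\n".join(r.get("ParsedText", "") for r in results).strip()


def ocr_space_file(path, api_key, language="eng"):
    """Upload one image file (hypothetical helper) and return the recognised text."""
    # Third-party dependency, imported here so extract_text works without it.
    import requests

    with open(path, "rb") as image_file:
        response = requests.post(
            OCR_SPACE_ENDPOINT,
            headers={"apikey": api_key},
            files={"file": image_file},
            data={"language": language},
            timeout=60,
        )
    response.raise_for_status()
    return extract_text(response.json())
```

Keeping the response parsing separate from the network call makes it easy to check the parsing against a saved response without spending any of the daily quota.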

Approach 2: Tesseract OCR (Link)

Tesseract is among the most popular open source OCR engines. We wrote a Python script to run pytesseract (Link) on our dataset. Tesseract provides many configuration parameters; we used the LSTM-based OCR engine for our purpose. We observed an accuracy of 73% on high quality images, 72% on medium quality images and 37% on low quality images. Clearly, there was some work to be done for low quality images.
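A minimal sketch of selecting the LSTM engine through pytesseract is shown here. The original script is not part of this write-up, so the helper names and the page segmentation mode (`--psm 3`, Tesseract's default) are assumptions. `--oem 1` is the Tesseract flag for the LSTM-only engine. The sketch assumes `pytesseract`, Pillow and a Tesseract 4+ binary are installed.

```python
def tesseract_config(oem=1, psm=3):
    """Build the Tesseract CLI flags: --oem 1 selects the LSTM-only engine."""
    return f"--oem {oem} --psm {psm}"


def ocr_image(path, lang="eng", oem=1, psm=3):
    """Run Tesseract on one image file (hypothetical helper) and return its text."""
    # Third-party dependencies, imported here so tesseract_config works alone.
    import pytesseract
    from PIL import Image

    with Image.open(path) as image:
        return pytesseract.image_to_string(
            image, lang=lang, config=tesseract_config(oem, psm)
        )
```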

Approach 3: Tesseract OCR with Custom Pre-Processing

Approach 2 looked effective on high and medium quality images. To improve the accuracy of Tesseract on low quality images, we pre-processed each image before passing it to Tesseract. We applied thresholding (Link), scaling and smoothing (Link). This improved the accuracy on low quality images only marginally, from 37% to 40%, while the accuracy on high and medium quality images dipped to 68%.
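One way the three pre-processing steps could be chained with OpenCV is sketched below. The order of operations, the 2x scale factor, the Gaussian kernel size and the use of Otsu thresholding are all assumptions made for illustration; the text above does not state which variants or parameters were used. The function name is hypothetical, and `opencv-python` is assumed to be installed.

```python
def preprocess_for_ocr(image, scale=2.0, blur_kernel=5):
    """Scale, smooth and binarise a BGR image before handing it to an OCR engine."""
    if blur_kernel <= 0 or blur_kernel % 2 == 0:
        # cv2.GaussianBlur only accepts odd, positive kernel sizes.
        raise ValueError("blur_kernel must be a positive odd integer")

    import cv2  # third-party dependency (opencv-python)

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Scaling: enlarge small text so character strokes span more pixels.
    scaled = cv2.resize(gray, None, fx=scale, fy=scale,
                        interpolation=cv2.INTER_CUBIC)
    # Smoothing: suppress high-frequency noise before thresholding.
    smoothed = cv2.GaussianBlur(scaled, (blur_kernel, blur_kernel), 0)
    # Thresholding: Otsu picks the global cut-off from the image histogram.
    _, binary = cv2.threshold(smoothed, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return binary
```

The result is a single-channel array that can be passed to `pytesseract.image_to_string` in place of the original image. Aggressive binarisation can erase detail that the engine would otherwise use, which is one plausible reason such a pipeline can hurt accuracy on images that were already clean.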

Approach 4: Tesseract OCR with EAST (Link)

EAST is an accurate text detector that efficiently locates text regions in images. Detecting the text regions first reduces the effect of background noise on OCR accuracy. We took reference from this post and checked the performance of OCR + EAST on our dataset. This performed better than the previous approaches on low quality images, with an accuracy of 50%; the accuracy was 63% on medium quality images and 81% on high quality images. However, this approach took more time per image than the previous approaches. We still had to improve the accuracy further and reduce the processing time.
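The glue between a text detector and the OCR engine is mostly coordinate bookkeeping, sketched below in plain Python. EAST expects input dimensions that are multiples of 32, so the image is resized before detection and each detected box has to be mapped back to the original image before cropping. Both helper names are hypothetical, and the padding step is an assumption added for illustration, not something the text above describes.

```python
def east_input_size(width, height, multiple=32):
    """Round dimensions down to the nearest multiple of 32 (at least 32)."""
    def snap(value):
        return max(multiple, (value // multiple) * multiple)
    return snap(width), snap(height)


def to_original_box(box, resized, original, padding=0.05):
    """Map a box from the resized detector input back to the original image.

    box:      (start_x, start_y, end_x, end_y) in resized-image pixels
    resized:  (width, height) fed to the detector
    original: (width, height) of the source image
    padding:  fraction of the box size added on each side, clipped to the image
    """
    start_x, start_y, end_x, end_y = box
    resized_w, resized_h = resized
    orig_w, orig_h = original
    ratio_x = orig_w / resized_w
    ratio_y = orig_h / resized_h

    x0, y0 = start_x * ratio_x, start_y * ratio_y
    x1, y1 = end_x * ratio_x, end_y * ratio_y
    pad_x = (x1 - x0) * padding
    pad_y = (y1 - y0) * padding

    return (
        max(0, int(x0 - pad_x)),
        max(0, int(y0 - pad_y)),
        min(orig_w, int(x1 + pad_x)),
        min(orig_h, int(y1 + pad_y)),
    )
```

Each returned box can then be used to crop the original image, and every crop is passed to Tesseract separately. Running the detector and then OCR once per region is one reason this kind of pipeline tends to take longer per image.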

Approach 5: Google Vision (Link)

Google Vision is considered by many to be the best OCR solution available online. Vision allows up to 1,000 free calls per month and charges $1.50 per 1,000 images after that (as of December 2019). We were curious to check Vision's performance on our dataset. Vision achieved an accuracy of 95% on high quality images, 91% on medium quality images and 85% on low quality images, far better than any other approach we tried. Vision's response time was also shorter than that of OCR + EAST. We decided to conclude our OCR selection at this point.
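The pricing quoted above translates into a simple monthly estimate, sketched here. The function name is hypothetical, and the sketch assumes billing is pro-rated per image beyond the free allowance; actual billing may differ, and prices may have changed since December 2019.

```python
def vision_monthly_cost(images, free_calls=1000, price_per_thousand=1.5):
    """Estimate the monthly cost in USD from the December 2019 figures above."""
    # Only calls beyond the free allowance are billed (assumed pro-rated).
    billable = max(0, images - free_calls)
    return billable / 1000 * price_per_thousand


if __name__ == "__main__":
    for volume in (500, 11_000, 101_000):
        print(volume, vision_monthly_cost(volume))
```

Under these assumptions, 11,000 images in a month leave 10,000 billable calls, or $15.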
