@williamratcliff
williamratcliff / README.md
Created March 24, 2016 20:41 — forked from dannguyen/README.md
Using Google Cloud Vision API to OCR scanned documents to extract structured data

Using Google Cloud Vision API's OCR to extract text from photos and scanned documents

Just a quickie test in Python 3 (using Requests) to see if Google Cloud Vision can be used to effectively OCR a scanned data table and preserve its structure, in the way that products such as ABBYY FineReader can OCR an image and provide Excel-ready output.
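For orientation, here is a minimal sketch of how such a test might build its request body. It assumes the public Cloud Vision REST endpoint and the `TEXT_DETECTION` feature type. The helper name `build_ocr_request` and the placeholder image bytes are made up for illustration and are not taken from the original script.

```python
import base64

# Public REST endpoint for Cloud Vision image annotation.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_ocr_request(image_bytes, max_results=1):
    """Return a JSON-serializable body asking Cloud Vision to OCR one image.

    Hypothetical helper for this sketch, not part of any library.
    """
    # The API expects the raw image bytes as a base64-encoded string.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "requests": [
            {
                "image": {"content": encoded},
                "features": [
                    {"type": "TEXT_DETECTION", "maxResults": max_results}
                ],
            }
        ]
    }


# Placeholder bytes stand in for a real image file read in binary mode.
payload = build_ocr_request(b"fake image bytes")
```

With Requests, the payload could then be sent with something like `requests.post(VISION_ENDPOINT, params={"key": api_key}, json=payload)`, where `api_key` is your own API key.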

The short answer: No. While Cloud Vision provides bounding polygon coordinates in its output, it doesn't provide them at the word or region level, which is what you would need to calculate the data delimiters.
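To make the bounding polygon output concrete, the sketch below pulls the text and polygon vertices out of a response. It assumes the response nests results under `responses[].textAnnotations[]`, each with a `description` and a `boundingPoly.vertices` list. The `sample` dictionary is hand-written for illustration and is not real API output, and `extract_text_boxes` is a hypothetical helper name.

```python
def extract_text_boxes(response_json):
    """Return (text, [(x, y), ...]) pairs from an annotate response dict."""
    boxes = []
    for response in response_json.get("responses", []):
        for annotation in response.get("textAnnotations", []):
            vertices = annotation.get("boundingPoly", {}).get("vertices", [])
            # A vertex may omit a coordinate; this sketch assumes that
            # a missing value means 0.
            points = [(v.get("x", 0), v.get("y", 0)) for v in vertices]
            boxes.append((annotation.get("description", ""), points))
    return boxes


# Hand-written sample in the assumed response shape, NOT real API output.
sample = {
    "responses": [
        {
            "textAnnotations": [
                {
                    "description": "STOP\n",
                    "boundingPoly": {
                        "vertices": [
                            {"x": 10, "y": 20},
                            {"x": 110, "y": 20},
                            {"x": 110, "y": 60},
                            {"x": 10, "y": 60},
                        ]
                    },
                }
            ]
        }
    ]
}

boxes = extract_text_boxes(sample)
```

Each entry pairs the detected text with the corners of its polygon, which is the raw material you would need for any attempt at reconstructing table columns.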

On the other hand, the OCR quality is pretty good if you just need to identify text anywhere in an image, without regard to its physical coordinates. I've included two examples:

###### 1. A low-resolution photo of road signs

williamratcliff / forms.py
Created March 14, 2012 08:55 — forked from maraujop/forms.py
django-crispy-forms bootstrap form example
# -*- coding: utf-8 -*-
from django import forms
from crispy_forms.helper import FormHelper
from crispy_forms.layout import Layout, Div, Submit, HTML, Button, Row, Field
from crispy_forms.bootstrap import AppendedText, PrependedText, FormActions
class MessageForm(forms.Form):
    text_input = forms.CharField()
williamratcliff / plist2json.py
Created April 28, 2011 20:07 — forked from saturngod/plist2json.py
Plist to Json
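import json
import plistlib

# Read the property list; binary mode lets plistlib detect XML or binary format
with open("source.plist", "rb") as in_file:
    plist_dict = plistlib.load(in_file)

# Write the parsed contents out as JSON
# (values such as dates or raw bytes are not JSON-serializable and would raise TypeError)
with open("source.json", "w") as out_file:
    json.dump(plist_dict, out_file)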
williamratcliff / second file
Created November 2, 2010 14:05 — forked from jmalonzo/gist:654412
cloned gist

"Other developers are just like us - weird"

"If you ever need to deploy Django, you're good. If you know Capistrano and Unicorn they've got rip-offs of all that stuff."

"I think we should all admit we're horrible coders and move on"

"We're all drug addicts - we're fighting methods..."

"It's easy to learn to play the guitar and be able to play Bob Dylan and Weezer and never get better."

"This is basically a talk that was given 30 years ago. We just have to keep giving it every few years because young guys come along and forget it."