dannguyen /
Last active Oct 20, 2021
Using Python 3.x and Google Cloud Vision API to OCR scanned documents to extract structured data

Using Python 3 + Google Cloud Vision API's OCR to extract text from photos and scanned documents

Just a quickie test in Python 3 (using Requests) to see if Google Cloud Vision can be used to effectively OCR a scanned data table and preserve its structure, in the way that products such as ABBYY FineReader can OCR an image and provide Excel-ready output.
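For readers who want to try something similar, here is a minimal sketch of how such a request might be put together with Requests. It is not the gist's own script: the helper names `build_ocr_request` and `request_ocr` are made up for illustration, and it assumes authentication with a plain API key passed as the `key` query parameter to the public `images:annotate` REST endpoint.

```python
import base64

# Public REST endpoint for Cloud Vision image annotation.
VISION_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_ocr_request(image_bytes, max_results=1):
    """Wrap raw image bytes in the JSON body that images:annotate expects.

    Hypothetical helper: the image is sent inline as base64 text and a
    single TEXT_DETECTION feature is requested.
    """
    return {
        "requests": [
            {
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "TEXT_DETECTION", "maxResults": max_results}],
            }
        ]
    }


def request_ocr(api_key, image_path):
    """POST one image file to Cloud Vision and return the decoded JSON reply.

    Assumes API-key authentication; other credential types need a
    different setup.
    """
    # Third-party dependency, imported here so build_ocr_request can be
    # used (and tested) without Requests installed.
    import requests

    with open(image_path, "rb") as f:
        body = build_ocr_request(f.read())
    resp = requests.post(VISION_URL, params={"key": api_key}, json=body)
    resp.raise_for_status()
    return resp.json()
```

Keeping the payload builder separate from the network call makes it easy to inspect the request body before spending any API quota.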

The short answer: No. While Cloud Vision provides bounding polygon coordinates in its output, it doesn't provide it at the word or region level, which would be needed to then calculate the data delimiters.
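To make the shape of that output concrete, the sketch below pulls the detected text and one bounding polygon out of a response dictionary. The `sample` value is a hand-written fragment for illustration, not real API output, and the helper names are hypothetical. It assumes the first entry of `textAnnotations` carries the full detected text, and that a vertex may omit a coordinate, which is treated as 0 here.

```python
def full_text(response):
    """Return the text of the first annotation, or "" if none was found."""
    annotations = response["responses"][0].get("textAnnotations", [])
    return annotations[0]["description"] if annotations else ""


def bounding_vertices(annotation):
    """Return an annotation's bounding polygon as a list of (x, y) tuples.

    Assumption: a missing "x" or "y" key means the coordinate is 0.
    """
    vertices = annotation["boundingPoly"]["vertices"]
    return [(v.get("x", 0), v.get("y", 0)) for v in vertices]


# Hand-written example fragment (not real API output).
sample = {
    "responses": [
        {
            "textAnnotations": [
                {
                    "description": "STOP\nMain St\n",
                    "boundingPoly": {
                        "vertices": [
                            {"x": 10, "y": 12},
                            {"x": 200, "y": 12},
                            {"x": 200, "y": 90},
                            {"x": 10, "y": 90},
                        ]
                    },
                }
            ]
        }
    ]
}

text = full_text(sample)
corners = bounding_vertices(sample["responses"][0]["textAnnotations"][0])
```

With a polygon like this you know where a block of text sits in the image, but turning that into rows and columns still requires coordinates at a finer granularity.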

On the other hand, the OCR quality is pretty good if you just need to identify text anywhere in an image, without regard to its physical coordinates. I've included two examples:

###### 1. A low-resolution photo of road signs

tswaters /
Last active Nov 24, 2021
Adding subdirectory of a remote repo to a subdirectory in local repo

This is way more complicated than it should be. The following conditions need to be met:

  1. need to be able to track and merge in upstream changes
  2. don't want remote commit messages in master
  3. only interested in sub-directory of another repo
  4. needs to go in a subdirectory in my repo.

In this particular case, I'm interested in bringing in the 'default' template of jsdoc as a sub-directory in my project so I can potentially make changes to the markup it generates while still being able to update from upstream if there are changes. Ideally their template would be a separate repo added to jsdoc via a submodule -- that way I could fork it and things would be much easier... but it is what it is.

After much struggling with git, subtree and git-subtree, I ended up finding this -- it basically sets up separate branches for tracking the remote and the particular sub-directory, and uses the git subtree contrib module to pull it all together. Following are



Many of you, like me, have taken to the streets more than once this year.

At the end of last year, a series of events caused considerable concern over the deterioration in the state of democracy in Taiwan. As a result, a group of friends who work in the IT industry founded a movement called “g0v” (“0” as in “007”).

The goal of g0v is to improve information transparency in the government. Using modern technology, we aim to transform society and ensure that its citizens are both heard and seen.

To have a voice, we organized a large amount of data, wrote many lines of code and built websites, so that the public can gain a deeper understanding of various issues and become more willing to participate in such matters. Why do we want to stay in front of the computer all day, instead of going out and having fun?

tobytailor / get_barcode_from_image.js
Created Jun 1, 2010
Barcode recognition with JavaScript - Demo:
/*
 * Copyright (c) 2010 Tobias Schneider
 * This script is freely distributable under the terms of the MIT license.
 */
var UPC_SET = {
"3211": '0',
"2221": '1',
"2122": '2',