Skip to content

Instantly share code, notes, and snippets.

View sabuncu's full-sized avatar

sabuncu sabuncu

  • Ankara, Turkey
  • 04:27 (UTC -12:00)
View GitHub Profile
@dannguyen
dannguyen / README.md
Last active July 6, 2024 16:36
Using Python 3.x and Google Cloud Vision API to OCR scanned documents to extract structured data

Using Python 3 + Google Cloud Vision API's OCR to extract text from photos and scanned documents

Just a quickie test in Python 3 (using Requests) to see if Google Cloud Vision can be used to effectively OCR a scanned data table and preserve its structure, in the way that products such as ABBYY FineReader can OCR an image and provide Excel-ready output.

The short answer: No. While Cloud Vision provides bounding polygon coordinates in its output, it doesn't provide it at the word or region level, which would be needed to then calculate the data delimiters.

On the other hand, the OCR quality is pretty good, if you just need to identify text anywhere in an image, without regards to its physical coordinates. I've included two examples:

####### 1. A low-resolution photo of road signs

@danharper
danharper / background.js
Last active June 29, 2024 19:32
Bare minimum Chrome extension to inject a JS file into the given page when you click on the browser action icon. The script then inserts a new div into the DOM.
// this is the background code...
// listen for our browerAction to be clicked
chrome.browserAction.onClicked.addListener(function (tab) {
// for the current tab, inject the "inject.js" file & execute it
chrome.tabs.executeScript(tab.ib, {
file: 'inject.js'
});
});
using System;
using System.Linq;
using System.Runtime.InteropServices;
using EnvDTE;
using EnvDTE80;
using Microsoft.VisualStudio;
using Microsoft.VisualStudio.Shell;
using Microsoft.VisualStudio.Shell.Interop;
using Microsoft.VisualStudio.Text.Editor;
using Microsoft.VisualStudio.TextManager.Interop;