Skip to content

Instantly share code, notes, and snippets.

@dannguyen
dannguyen / README.md
Last active December 28, 2023 15:21
Using Python 3.x and Google Cloud Vision API to OCR scanned documents to extract structured data

Using Python 3 + Google Cloud Vision API's OCR to extract text from photos and scanned documents

Just a quickie test in Python 3 (using Requests) to see if Google Cloud Vision can be used to effectively OCR a scanned data table and preserve its structure, in the way that products such as ABBYY FineReader can OCR an image and provide Excel-ready output.

The short answer: No. While Cloud Vision provides bounding polygon coordinates in its output, it doesn't provide it at the word or region level, which would be needed to then calculate the data delimiters.

On the other hand, the OCR quality is pretty good, if you just need to identify text anywhere in an image, without regards to its physical coordinates. I've included two examples:

####### 1. A low-resolution photo of road signs

@GilLevi
GilLevi / README.md
Last active June 17, 2023 20:58
Emotion Recognition in the Wild via Convolutional Neural Networks and Mapped Binary Patterns

Gil Levi and Tal Hassner, Emotion Recognition in the Wild via Convolutional Neural Networks and Mapped Binary Patterns

Convolutional neural networks for emotion classification from facial images as described in the following work:

Gil Levi and Tal Hassner, Emotion Recognition in the Wild via Convolutional Neural Networks and Mapped Binary Patterns, Proc. ACM International Conference on Multimodal Interaction (ICMI), Seattle, Nov. 2015

Project page: http://www.openu.ac.il/home/hassner/projects/cnn_emotions/

If you find our models useful, please add suitable reference to our paper in your work.

@GilLevi
GilLevi / README.md
Last active July 25, 2023 18:05
Age and Gender Classification using Convolutional Neural Networks
@karpathy
karpathy / gist:7bae8033dcf5ca2630ba
Created May 5, 2015 07:31
Efficient LSTM cell in Torch
--[[
Efficient LSTM in Torch using nngraph library. This code was optimized
by Justin Johnson (@jcjohnson) based on the trick of batching up the
LSTM GEMMs, as also seen in my efficient Python LSTM gist.
--]]
function LSTM.fast_lstm(input_size, rnn_size)
local x = nn.Identity()()
local prev_c = nn.Identity()()
local prev_h = nn.Identity()()
@karpathy
karpathy / gist:587454dc0146a6ae21fc
Last active March 19, 2024 05:50
An efficient, batched LSTM.
"""
This is a batched LSTM forward and backward pass
"""
import numpy as np
import code
class LSTM:
@staticmethod
def init(input_size, hidden_size, fancy_forget_bias_init = 3):
@syhw
syhw / dnn.py
Last active January 24, 2024 19:38
A simple deep neural network with or w/o dropout in one file.
"""
A deep neural network with or w/o dropout in one file.
License: Do What The Fuck You Want to Public License http://www.wtfpl.net/
"""
import numpy, theano, sys, math
from theano import tensor as T
from theano import shared
from theano.tensor.shared_randomstreams import RandomStreams
@fish2000
fish2000 / setup.py
Created January 17, 2012 19:29
Setup script for the Python VLFeat bindings
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Installation script for the vlfeat module
"""
import sys, os
from distutils.core import Extension, setup
from distutils.errors import DistutilsFileError