Peter Nguyen pengux

package main

import (
	"fmt"
	"math/rand"
	"os"
	"sort"
	"time"
)
@dannguyen
dannguyen / README.md
Last active July 29, 2025 14:26
Using Python 3.x and Google Cloud Vision API to OCR scanned documents to extract structured data

Using Python 3 + Google Cloud Vision API's OCR to extract text from photos and scanned documents

Just a quickie test in Python 3 (using Requests) to see if Google Cloud Vision can be used to effectively OCR a scanned data table and preserve its structure, in the way that products such as ABBYY FineReader can OCR an image and provide Excel-ready output.

The short answer: No. While Cloud Vision provides bounding polygon coordinates in its output, it doesn't provide it at the word or region level, which would be needed to then calculate the data delimiters.
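For orientation, here is a minimal sketch of the request and response handling such a test involves. It assumes the public v1 `images:annotate` REST endpoint with the `TEXT_DETECTION` feature; the helper names `build_ocr_request` and `extract_full_text` are hypothetical and are not taken from the original script.

```python
import base64
import json

# Assumed endpoint: the v1 REST method for image annotation.
VISION_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_ocr_request(image_bytes, max_results=1):
    """Build the JSON body for a single-image TEXT_DETECTION request."""
    return {
        "requests": [
            {
                # The image is sent inline as base64-encoded text.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "TEXT_DETECTION", "maxResults": max_results}],
            }
        ]
    }


def extract_full_text(response_json):
    """Return the description of the first text annotation, or '' if none."""
    responses = response_json.get("responses", [])
    if not responses:
        return ""
    annotations = responses[0].get("textAnnotations", [])
    return annotations[0]["description"] if annotations else ""


if __name__ == "__main__":
    # Placeholder bytes stand in for a real image file's contents.
    body = build_ocr_request(b"not a real image")
    print(json.dumps(body, indent=2))
```

With Requests, the body could be sent along the lines of `requests.post(VISION_URL, params={"key": api_key}, json=body)`, where `api_key` is your own Cloud Vision API key.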

On the other hand, the OCR quality is pretty good if you just need to identify text anywhere in an image, without regard to its physical coordinates. I've included two examples:

###### 1. A low-resolution photo of road signs

@jongrover
jongrover / kiosk-pi.txt
Last active December 3, 2023 21:45
How to Kiosk Raspberry Pi
Software for the Project:
Raspbian Wheezy Debian Linux
Win32Disk Imager
The CanaKit comes with a pre-loaded SD card that includes the same version of Debian Wheezy that I used for this project. However, in an effort to get a little more speed out of the system, I used the 95MB/s SanDisk Extreme listed above. It seemed to help, but I did not benchmark it beyond observation.
Anyway, let's get down to building a Raspberry Pi Web Kiosk.
Step 0: Get all of the hardware.
Step 1: Get all of the software.
@MohamedAlaa
MohamedAlaa / tmux-cheatsheet.markdown
Last active December 10, 2025 21:19
tmux shortcuts & cheatsheet

start new:

tmux

start new with session name:

tmux new -s myname