Skip to content

Instantly share code, notes, and snippets.

@dannguyen
dannguyen / README.md
Last active May 17, 2024 02:07
Using Python 3.x and Google Cloud Vision API to OCR scanned documents to extract structured data

Using Python 3 + Google Cloud Vision API's OCR to extract text from photos and scanned documents

Just a quickie test in Python 3 (using Requests) to see if Google Cloud Vision can be used to effectively OCR a scanned data table and preserve its structure, in the way that products such as ABBYY FineReader can OCR an image and provide Excel-ready output.

The short answer: No. While Cloud Vision provides bounding polygon coordinates in its output, it doesn't provide it at the word or region level, which would be needed to then calculate the data delimiters.

On the other hand, the OCR quality is pretty good, if you just need to identify text anywhere in an image, without regards to its physical coordinates. I've included two examples:

####### 1. A low-resolution photo of road signs

@alirobe
alirobe / reclaimWindows10.ps1
Last active May 22, 2024 20:58
This Windows 10 Setup Script turns off a bunch of unnecessary Windows 10 telemetery, bloatware, & privacy things. Not guaranteed to catch everything. Review and tweak before running. Reboot after running. Scripts for reversing are included and commented. Fork of https://github.com/Disassembler0/Win10-Initial-Setup-Script (different defaults). N.…
###
###
### UPDATE: For Win 11, I recommend using this tool in place of this script:
### https://christitus.com/windows-tool/
### https://github.com/ChrisTitusTech/winutil
### https://www.youtube.com/watch?v=6UQZ5oQg8XA
### iwr -useb https://christitus.com/win | iex
###
###
@joepie91
joepie91 / vpn.md
Last active May 20, 2024 03:37
Don't use VPN services.

Don't use VPN services.

No, seriously, don't. You're probably reading this because you've asked what VPN service to use, and this is the answer.

Note: The content in this post does not apply to using VPN for their intended purpose; that is, as a virtual private (internal) network. It only applies to using it as a glorified proxy, which is what every third-party "VPN provider" does.

  • A Russian translation of this article can be found here, contributed by Timur Demin.
  • A Turkish translation can be found here, contributed by agyild.
  • There's also this article about VPN services, which is honestly better written (and has more cat pictures!) than my article.
@haasn
haasn / about:config.md
Last active April 2, 2024 18:46
Firefox bullshit removal via about:config

Firefox bullshit removal

Updated: Just use qutebrowser (and disable javascript). The web is done for.

@ndarville
ndarville / webm.md
Last active September 30, 2023 18:56
4chan’s guide to converting GIF to WebM - https://boards.4chan.org/g/res/41212767

Grab ffmpeg from https://www.ffmpeg.org/download.html

It's a command line tool which means you will have to type things with your keyboard instead of clicking on buttons.

The most trivial operation would be converting gifs:

ffmpeg -i your_gif.gif -c:v libvpx -crf 12 -b:v 500K output.webm
  • -crf values can go from 4 to 63. Lower values mean better quality.
  • -b:v is the maximum allowed bitrate. Higher means better quality.
@arvearve
arvearve / gist:4158578
Created November 28, 2012 02:01
Mathematics: What do grad students in math do all day?

Mathematics: What do grad students in math do all day?

by Yasha Berchenko-Kogan

A lot of math grad school is reading books and papers and trying to understand what's going on. The difficulty is that reading math is not like reading a mystery thriller, and it's not even like reading a history book or a New York Times article.

The main issue is that, by the time you get to the frontiers of math, the words to describe the concepts don't really exist yet. Communicating these ideas is a bit like trying to explain a vacuum cleaner to someone who has never seen one, except you're only allowed to use words that are four letters long or shorter.

What can you say?

@andreyvit
andreyvit / tmux.md
Created June 13, 2012 03:41
tmux cheatsheet

tmux cheat sheet

(C-x means ctrl+x, M-x means alt+x)

Prefix key

The default prefix is C-b. If you (or your muscle memory) prefer C-a, you need to add this to ~/.tmux.conf:

remap prefix to Control + a