Skip to content

Instantly share code, notes, and snippets.

@aparrish
aparrish / understanding-word-vectors.ipynb
Last active April 20, 2024 01:36
Understanding word vectors: A tutorial for "Reading and Writing Electronic Text," a class I teach at ITP. (Python 2.7) Code examples released under CC0 https://creativecommons.org/choose/zero/, other text released under CC BY 4.0 https://creativecommons.org/licenses/by/4.0/
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
@dannguyen
dannguyen / README.md
Last active December 28, 2023 15:21
Using Python 3.x and Google Cloud Vision API to OCR scanned documents to extract structured data

Using Python 3 + Google Cloud Vision API's OCR to extract text from photos and scanned documents

Just a quickie test in Python 3 (using Requests) to see if Google Cloud Vision can be used to effectively OCR a scanned data table and preserve its structure, in the way that products such as ABBYY FineReader can OCR an image and provide Excel-ready output.

The short answer: No. While Cloud Vision provides bounding polygon coordinates in its output, it doesn't provide it at the word or region level, which would be needed to then calculate the data delimiters.

On the other hand, the OCR quality is pretty good, if you just need to identify text anywhere in an image, without regards to its physical coordinates. I've included two examples:

####### 1. A low-resolution photo of road signs

@jmindek
jmindek / gist:62c50dd766556b7b16d6
Last active January 31, 2024 15:48
DISTINCT ON like functionality for Redshift

distinct column -> For each row returned, return only the unique members of a set. Think of it as for each row in a projection, concatenate all the column values and return only the strings that are unique.

test_db=# SELECT DISTINCT parent_id, child_id, id FROM test.foo_table ORDER BY parent_id, child_id, id LIMIT 10;
parent_id | child_id | id
-----------+------------+-----------------------------
1000040 | 103 | 1000040|2645405726|0001|103
@carcinocron
carcinocron / debugger pause beforeunload
Last active April 25, 2024 16:48
Chrome: pause before redirect
// Run this in the F12 javascript console in chrome
// if a redirect happens, the page will pause
// this helps because chrome's network tab's
// "preserve log" seems to technically preserve the log
// but you can't actually LOOK at it...
// also the "replay xhr" feature does not work after reload
// even if you "preserve log".
window.addEventListener("beforeunload", function() { debugger; }, false)
@s-y
s-y / coro.py
Last active January 10, 2016 05:45 — forked from keis/coro.py
import asyncio
@asyncio.coroutine
def waiting(r):
print("hello from waiting -", r)
yield from asyncio.sleep(2)
print("bye from waiting -", r)
return r
@keis
keis / coro.py
Created April 14, 2014 08:27
asyncio examples
import asyncio
@asyncio.coroutine
def waiting(r):
print("hello from waiting -", r)
yield from asyncio.sleep(2)
print("bye from waiting -", r)
return r
; ___ _ __ ___ __ ___
; / __|_ _ __ _| |_____ / /| __|/ \_ )
; \__ \ ' \/ _` | / / -_) _ \__ \ () / /
; |___/_||_\__,_|_\_\___\___/___/\__/___|
; An annotated version of the snake example from Nick Morgan's 6502 assembly tutorial
; on http://skilldrick.github.io/easy6502/ that I created as an exercise for myself
; to learn a little bit about assembly. I **think** I understood everything, but I may
; also be completely wrong :-)
@bosconian-dynamics
bosconian-dynamics / Shortened URL RegEx
Last active July 10, 2018 21:46
Regular expression to match shortened/masked URLs generated from LongURL.org's api via the JavaScript functions at http://gist.github.com/KuroTsuto/8447974
/(?:https?:\/\/)?(?:(?:0rz\.tw)|(?:1link\.in)|(?:1url\.com)|(?:2\.gp)|(?:2big\.at)|(?:2tu\.us)|(?:3\.ly)|(?:307\.to)|(?:4ms\.me)|(?:4sq\.com)|(?:4url\.cc)|(?:6url\.com)|(?:7\.ly)|(?:a\.gg)|(?:a\.nf)|(?:aa\.cx)|(?:abcurl\.net)|(?:ad\.vu)|(?:adf\.ly)|(?:adjix\.com)|(?:afx\.cc)|(?:all\.fuseurl.com)|(?:alturl\.com)|(?:amzn\.to)|(?:ar\.gy)|(?:arst\.ch)|(?:atu\.ca)|(?:azc\.cc)|(?:b23\.ru)|(?:b2l\.me)|(?:bacn\.me)|(?:bcool\.bz)|(?:binged\.it)|(?:bit\.ly)|(?:bizj\.us)|(?:bloat\.me)|(?:bravo\.ly)|(?:bsa\.ly)|(?:budurl\.com)|(?:canurl\.com)|(?:chilp\.it)|(?:chzb\.gr)|(?:cl\.lk)|(?:cl\.ly)|(?:clck\.ru)|(?:cli\.gs)|(?:cliccami\.info)|(?:clickthru\.ca)|(?:clop\.in)|(?:conta\.cc)|(?:cort\.as)|(?:cot\.ag)|(?:crks\.me)|(?:ctvr\.us)|(?:cutt\.us)|(?:dai\.ly)|(?:decenturl\.com)|(?:dfl8\.me)|(?:digbig\.com)|(?:digg\.com)|(?:disq\.us)|(?:dld\.bz)|(?:dlvr\.it)|(?:do\.my)|(?:doiop\.com)|(?:dopen\.us)|(?:easyuri\.com)|(?:easyurl\.net)|(?:eepurl\.com)|(?:eweri\.com)|(?:fa\.by)|(?:fav\.me)|(?:fb\.me)|(?:fbshare\.me)|(?:ff\.im)|(?:fff\
@denji
denji / nginx-tuning.md
Last active April 26, 2024 11:21
NGINX tuning for best performance

Moved to git repository: https://github.com/denji/nginx-tuning

NGINX Tuning For Best Performance

For this configuration you can use web server you like, i decided, because i work mostly with it to use nginx.

Generally, properly configured nginx can handle up to 400K to 500K requests per second (clustered), most what i saw is 50K to 80K (non-clustered) requests per second and 30% CPU load, course, this was 2 x Intel Xeon with HyperThreading enabled, but it can work without problem on slower machines.

You must understand that this config is used in testing environment and not in production so you will need to find a way to implement most of those features best possible for your servers.

@dypsilon
dypsilon / frontendDevlopmentBookmarks.md
Last active April 28, 2024 08:45
A badass list of frontend development resources I collected over time.