Qiang Kou (KK) thirdwing
# Make mouse useful
setw -g mouse on
# Allow xterm titles in terminal window, terminal scrolling with scrollbar, and setting overrides of C-Up, C-Down, C-Left, C-Right
# (commented out because it disables cursor navigation in vim)
#set -g terminal-overrides "xterm*:XT:smcup@:rmcup@:kUP5=\eOA:kDN5=\eOB:kLFT5=\eOD:kRIT5=\eOC"
# Scroll History
set -g history-limit 30000
# ~/.R/Makevars: compiler settings for building R packages
CFLAGS += -O3 -Wall -pipe -pedantic
CXXFLAGS += -O3 -Wall -pipe -Wno-unused -pedantic
# gcc version suffix, e.g. gcc-4.8
VER=-4.8
# wrap every compiler in ccache to speed up repeated builds
CCACHE=ccache
CC=$(CCACHE) gcc$(VER)
CXX=$(CCACHE) g++$(VER)
SHLIB_CXXLD=g++$(VER)
FC=$(CCACHE) gfortran$(VER)
F77=$(CCACHE) gfortran$(VER)
# parallel builds
MAKE=make -j8
thirdwing / ls_env.R
Created October 12, 2015 17:49
ls functions
# List the functions visible at a search-path position
# (simplified from the original, which used helper functions
# named() and lsall() from the mvbutils package)
find.funs <- function(pos = 1, ..., mode = "function") {
  o <- ls(pos = pos, all.names = TRUE, ...)
  if (!length(o))
    return(character(0))
  # keep only names bound to objects of the requested mode
  keep <- sapply(o, exists, where = pos, mode = mode, inherits = FALSE)
  o[keep]
}
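The same idea — enumerating the names in a namespace that are bound to functions — can be sketched in Python (a rough analogue for comparison, not part of the original gist; `json` is just an arbitrary stdlib module to inspect):

```python
import inspect
import json  # an arbitrary stdlib module to inspect

def find_funs(namespace):
    """Return the names in `namespace` that are bound to functions."""
    return sorted(
        name for name, obj in vars(namespace).items()
        if inspect.isfunction(obj) or inspect.isbuiltin(obj)
    )

print(find_funs(json))  # includes e.g. 'dumps' and 'loads'
```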
// boost graph serialization example
// g++ boost_graph_serialize.cpp -lboost_serialization -o test
#include <fstream>
#include <iostream>
#include <set>
#include <string>
// archive and graph-serialization headers needed for -lboost_serialization
#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/graph/adj_list_serialize.hpp>
require(RCurl)
require(XML)
# fetch the page and split it into lines
webpage <- getURL("https://en.wikipedia.org/wiki/N-gram")
webpage <- readLines(tc <- textConnection(webpage)); close(tc)
# parse the HTML and extract the text of every <p> node
pagetree <- htmlTreeParse(webpage, error=function(...){}, useInternalNodes = TRUE)
x <- xpathSApply(pagetree, "//*/p", xmlValue)
# clean up: drop tabs, trim surrounding whitespace, remove empty lines
x <- unlist(strsplit(x, "\n"))
x <- gsub("\t", "", x)
x <- sub("^[[:space:]]*(.*?)[[:space:]]*$", "\\1", x, perl=TRUE)
x <- x[!(x %in% c("", "|"))]
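The same extract-and-clean pipeline can be sketched offline in Python with only the standard library (a rough analogue of the R code above; the HTML string is a made-up stand-in for the fetched page, and tabs are replaced with spaces rather than deleted):

```python
from html.parser import HTMLParser

class ParaExtractor(HTMLParser):
    """Collect the text content of <p> elements."""
    def __init__(self):
        super().__init__()
        self.in_p = False
        self.paras = []
    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_p = True
            self.paras.append("")
    def handle_endtag(self, tag):
        if tag == "p":
            self.in_p = False
    def handle_data(self, data):
        if self.in_p:
            self.paras[-1] += data

html = "<html><body><p>  An n-gram is a\tsequence. </p><p></p></body></html>"
p = ParaExtractor()
p.feed(html)
# mirror the clean-up steps: normalize tabs, trim, drop empty entries
cleaned = [t.replace("\t", " ").strip() for t in p.paras]
cleaned = [t for t in cleaned if t]
print(cleaned)
```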
#!/usr/bin/env python
from collections import OrderedDict

def memo(f, k):
    """Memoize f, keeping at most k cached results (oldest evicted first)."""
    cache = OrderedDict()
    def memoized(n):
        if n not in cache:
            cache[n] = f(n)
            if len(cache) > k:
                # cache.keys()[0] breaks on Python 3; popitem is portable
                cache.popitem(last=False)  # evict the oldest entry
        return cache[n]
    return memoized
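For comparison, the standard library's functools.lru_cache gives a bounded memoization cache out of the box, with least-recently-used rather than oldest-first eviction:

```python
from functools import lru_cache

@lru_cache(maxsize=3)  # keep at most 3 results, evicting least recently used
def square(n):
    return n * n

print([square(i) for i in (1, 2, 3, 2)])  # [1, 4, 9, 4]
print(square.cache_info().hits)  # the repeated square(2) call is the one hit
```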
thirdwing / README.md
Created March 24, 2016 20:19 — forked from dannguyen/README.md
Using Google Cloud Vision API to OCR scanned documents to extract structured data

Using Google Cloud Vision API's OCR to extract text from photos and scanned documents

Just a quickie test in Python 3 (using Requests) to see if Google Cloud Vision can be used to effectively OCR a scanned data table and preserve its structure, in the way that products such as ABBYY FineReader can OCR an image and provide Excel-ready output.
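A minimal sketch of the kind of request such a test sends (the endpoint and payload shape follow the public Cloud Vision REST API; the placeholder image bytes and the API key are hypothetical):

```python
import base64
import json

def build_annotate_request(image_bytes):
    """Build the JSON body for Cloud Vision's images:annotate endpoint,
    requesting OCR via the TEXT_DETECTION feature."""
    return {
        "requests": [{
            "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
            "features": [{"type": "TEXT_DETECTION"}],
        }]
    }

body = build_annotate_request(b"\x89PNG...")  # placeholder bytes, not a real image
# POST this to https://vision.googleapis.com/v1/images:annotate?key=API_KEY
print(json.dumps(body)[:60])
```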

The short answer: No. While Cloud Vision provides bounding polygon coordinates in its output, it doesn't provide them at the word or region level, which would be needed to calculate the data delimiters.

On the other hand, the OCR quality is pretty good if you just need to identify text anywhere in an image, without regard to its physical coordinates. I've included two examples:

#### 1. A low-resolution photo of road signs

logger <- mx.metric.logger$new()
# callback that plots the training metric every 'period' batches
mx.callback.plot.train.metric <- function(period, logger = NULL) {
  function(iteration, nbatch, env, verbose = TRUE) {
    if (nbatch %% period == 0 && !is.null(env$metric)) {
      N <- env$end.round
      result <- env$metric$get(env$train.metric)
      # empty plot with a fixed y-range, then draw the metric curve
      plot(c(0.5, 1) ~ c(0, N), col = NA,
           ylab = paste0("Train-", result$name), xlab = "")
      logger$train <- c(logger$train, result$value)
      lines(logger$train, lwd = 3, col = "red")
    }
    TRUE  # returning TRUE tells MXNet to keep training
  }
}
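The callback pattern itself — record a metric every N batches, return a flag to continue training — can be sketched in a framework-agnostic way in Python (all names here are illustrative, not MXNet APIs; plotting is omitted):

```python
class MetricLogger:
    """Minimal analogue of the logging callback above: record a training
    metric every `period` batches."""
    def __init__(self, period):
        self.period = period
        self.values = []

    def __call__(self, nbatch, metric_value):
        if nbatch % self.period == 0:
            self.values.append(metric_value)
        return True  # keep training

log = MetricLogger(period=2)
for batch, acc in enumerate([0.5, 0.6, 0.7, 0.8]):
    log(batch, acc)
print(log.values)  # batches 0 and 2 are recorded
```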
// java -Djava.awt.headless=true Main <image-file>
import java.awt.image.BufferedImage;
import java.awt.image.DataBufferByte;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public class Main {
    public static void main(String[] args) throws IOException {
        BufferedImage img = ImageIO.read(new File(args[0]));
        // raw pixel bytes backing the image (byte-typed buffers, e.g. JPEG)
        byte[] pixels = ((DataBufferByte) img.getRaster().getDataBuffer()).getData();
        System.out.println(img.getWidth() + "x" + img.getHeight() + ", " + pixels.length + " bytes");
    }
}
thirdwing / r_install_gpu_win.md
Last active April 6, 2018 02:03
MXNet R installation with GPU support on Windows

Clone the MXNet github repo

git clone --recursive https://github.com/dmlc/mxnet

The --recursive flag is needed to clone all the submodules used by MXNet.