Sumin Byeon suminb

import sys
def parse_column(line):
    cols = line.split('\t')
    col_count = len(cols)
    if col_count == 13:
        return cols
    elif col_count == 12:
suminb / main.scala
Created April 2, 2016 17:41
My first Scala program
object Main {
  abstract class Tree
  case class Sum(l: Tree, r: Tree) extends Tree
  case class Var(n: String) extends Tree
  case class Const(v: Int) extends Tree
suminb / freq.py
Created October 6, 2011 13:55
Counting word frequency
import codecs
import re
f = codecs.open('freq.txt', 'r', 'utf-8')
data = f.read()
f.close()
freq = {}
for word in re.findall(r'\w+', data, re.UNICODE):
    if word not in freq:
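The preview above is cut off inside the loop. A minimal Python 3 sketch of the same idea, assuming the goal is simply a count per word; `count_words` is a hypothetical helper name and the sample sentence is invented, neither comes from the gist:

```python
import re
from collections import Counter


def count_words(text):
    """Count occurrences of each word (hypothetical helper, not from the gist)."""
    # In Python 3, \w already matches Unicode word characters in str patterns,
    # so the re.UNICODE flag is not required.
    return Counter(re.findall(r'\w+', text))


# Invented sample text, for illustration only.
freq = count_words('the quick fox and the lazy dog and the cat')
print(freq.most_common(2))  # [('the', 3), ('and', 2)]
```

`Counter` replaces the manual "insert the key if missing, then increment" pattern with a single call.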
suminb / pandoc.sh
Last active December 30, 2015 20:29
Pandoc LaTeX template for CJK
pandoc \
    --latex-engine=xelatex \
    --template=template.tex \
    -V geometry:margin=1.25in \
    -o Final\ Paper.pdf \
    Final\ Paper.md
suminb / defective.c
Last active December 30, 2015 17:59
Explain why the following code does not (always) work. Discuss any potential security vulnerabilities.
char* concat(char *s, char *t) {
    char strbuf[BUF_SIZE];
    strcpy(strbuf, s);
    strcpy(strbuf+strlen(s), t);
    return strbuf;
}
suminb / analytics.py
Last active December 22, 2015 10:49
Analytics
from multiprocessing import Pool
from collections import Counter
from operator import add, itemgetter
import os, sys
import csv
def substrings(s):
    n = len(s)
    for i in xrange(0, n):
suminb / substrings.py
Created September 3, 2013 21:41
Compute all non-empty substrings of a given string
def substrings(s):
    n = len(s)
    for i in xrange(0, n):
        for j in xrange(i+1, n+1):
            yield s[i:j]
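The snippet above targets Python 2 (`xrange`). A Python 3 port with a small usage example might look like this; the input string `'abc'` is chosen only for illustration:

```python
def substrings(s):
    """Yield every non-empty contiguous substring of s."""
    n = len(s)
    for i in range(n):                 # start index
        for j in range(i + 1, n + 1):  # end index (exclusive)
            yield s[i:j]


result = list(substrings('abc'))
print(result)       # ['a', 'ab', 'abc', 'b', 'bc', 'c']
print(len(result))  # 6, i.e. n * (n + 1) / 2 for n = 3
```

Because the function is a generator, substrings are produced lazily; wrapping it in `list()` materializes all n(n+1)/2 of them at once.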
suminb / memory_usage.sh
Last active December 20, 2015 19:19
Memory usage of Better Translator (in KB)
ps aux | grep '[t]ranslator/apache' | awk '{ mem += $6 } END { print mem }'
suminb / histogram.sql
Created July 9, 2013 08:35
Better Translator translation demand histogram
SELECT CASE
         WHEN text_length < 10 THEN '< 10'
         WHEN text_length < 100 THEN '< 100'
         WHEN text_length < 1000 THEN '< 1000'
         WHEN text_length < 10000 THEN '< 10000'
         ELSE '>= 10000'
       END AS length_bucket,
       COUNT(*) AS qty
FROM (SELECT *, length(original_text) AS text_length FROM translation) AS t
GROUP BY length_bucket;
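The bucketing done by the query can also be sketched in Python. This is an illustration only: `length_bucket` is a hypothetical helper, the sample lengths are invented, and the sketch assumes half-open buckets (strict upper bounds) so that each label holds literally, with everything from 10000 upward in the last bucket:

```python
from collections import Counter

# Exclusive upper bounds for each bucket, in ascending order.
BOUNDS = [10, 100, 1000, 10000]


def length_bucket(text_length):
    """Return the label of the first bucket whose upper bound exceeds text_length."""
    for bound in BOUNDS:
        if text_length < bound:
            return '< %d' % bound
    return '>= %d' % BOUNDS[-1]


# Invented sample lengths, for illustration only.
lengths = [3, 10, 99, 100, 2500, 10000]
histogram = Counter(length_bucket(n) for n in lengths)
print(histogram['< 100'])  # 2 (lengths 10 and 99)
```

Checking the bounds in ascending order mirrors how a SQL `CASE` expression returns the first matching `WHEN` branch.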
# Takes arbitrary data, splits it into multiple pieces, and encodes each piece into a QR code.
# The python-qrcode package can be obtained from https://github.com/lincolnloop/python-qrcode
import qrcode
import base64
from multiprocessing import Pool
BLOCK_SIZE = 100 # in bytes
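The preview ends before the splitting logic. The sketch below covers only the splitting step, in Python 3. It assumes each block is base64-encoded before being rendered, which the `base64` import suggests but the preview does not confirm. `split_blocks` is a hypothetical helper name, and QR rendering is left out so the example runs without third-party packages:

```python
import base64

BLOCK_SIZE = 100  # in bytes, as in the snippet above


def split_blocks(data, block_size=BLOCK_SIZE):
    """Split bytes into base64-encoded blocks of at most block_size raw bytes.

    Hypothetical helper, not taken from the original gist.
    """
    return [
        base64.b64encode(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]


payload = bytes(range(256))  # 256 bytes of sample data
blocks = split_blocks(payload)
print(len(blocks))  # 3 blocks: 100 + 100 + 56 raw bytes

# Decoding and joining the blocks in order restores the original data.
restored = b''.join(base64.b64decode(block) for block in blocks)
print(restored == payload)  # True
```

Each encoded block could then be passed to python-qrcode (for example `qrcode.make(block)`) to produce one image per block; that step is an assumption about the rest of the script.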