Jaemin Cho j-min

NLTK API to Stanford NLP Tools compiled on 2015-12-09

Stanford NER

With NLTK version 3.1 and Stanford NER tool 2015-12-09, it is possible to hack the StanfordNERTagger._stanford_jar to include other .jar files that are necessary for the new tagger.

First, set up the environment variables as instructed at https://github.com/nltk/nltk/wiki/Installing-Third-Party-Software
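The idea of pointing the tagger at several .jar files can be sketched with the standard library alone. This is only an illustration under stated assumptions: `build_classpath` is a hypothetical helper, not part of NLTK, and the final assignment to `_stanford_jar` is shown as a comment because it depends on a private attribute whose behavior may differ between NLTK versions.

```python
import os


def build_classpath(jar_dir):
    """Join every .jar file directly inside jar_dir into one classpath string.

    Hypothetical helper for illustration; it is not an NLTK function.
    """
    jars = sorted(
        os.path.join(jar_dir, name)
        for name in os.listdir(jar_dir)
        if name.endswith(".jar")
    )
    # os.pathsep is ':' on Linux/macOS and ';' on Windows.
    return os.pathsep.join(jars)


# Assumed usage, following the hack described above (not verified against
# any particular NLTK release):
#   tagger._stanford_jar = build_classpath(os.path.dirname(tagger._stanford_jar))
```

Sorting the file names keeps the resulting classpath stable between runs, which makes it easier to compare when something fails to load.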

@j-min
j-min / AWS_Jupyter_Notebook.sh
Last active November 8, 2016 08:34
Jupyter Notebook setup on AWS EC2
sudo apt-get update
sudo apt-get upgrade
sudo pip install jupyter notebook
jupyter notebook --generate-config
ipython
from notebook.auth import passwd
passwd()
@j-min
j-min / TF_MDN.ipynb
Created November 15, 2016 10:46
Mixture Density Network in TensorFlow
import csv
import os
def get_csv_writer(filename, rows, delimiter):
    with open(filename, 'w') as csvfile:
        fieldnames = rows[0].keys()
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames, delimiter=delimiter)
        writer.writeheader()
        for row in rows:
            try:
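The preview above is cut off inside the loop, so the original error handling is unknown. As a rough, self-contained sketch of the same `csv.DictWriter` pattern, the version below simply writes every row; the name `write_rows_to_csv` and the default delimiter are assumptions made for illustration.

```python
import csv


def write_rows_to_csv(filename, rows, delimiter=","):
    """Write a non-empty list of dicts to filename, one CSV row per dict."""
    # The header is taken from the keys of the first row.
    fieldnames = list(rows[0].keys())
    # newline="" lets the csv module control line endings itself.
    with open(filename, "w", newline="") as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames, delimiter=delimiter)
        writer.writeheader()
        for row in rows:
            writer.writerow(row)
```

A row containing a key that is missing from the first row would raise `ValueError` here, which may be what the truncated `try:` was guarding against.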
@j-min
j-min / 2_3.py
Created November 24, 2016 06:10
Python 2/3 compatibility
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
@j-min
j-min / tokenize_dparser.py
Last active December 3, 2016 09:02
Get tokenized list from dparser
import json
import requests
def tokenize_dparser(text):
    dparser_link = 'http://parser.datanada.com/parse?version=1&string='
    url = dparser_link+text
    response = requests.get(url)
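The snippet above appends the raw text directly to the query string, which can break for input containing spaces, `&`, or non-ASCII characters such as Hangul. One possible way to build the URL safely is sketched below; `build_dparser_url` is a hypothetical helper, and whether the service expects percent-encoded UTF-8 is an assumption.

```python
from urllib.parse import quote

# Base URL copied from the snippet above.
DPARSER_LINK = "http://parser.datanada.com/parse?version=1&string="


def build_dparser_url(text):
    """Return the request URL with text percent-encoded as UTF-8."""
    # safe="" also encodes '/', so no character in text can alter the URL structure.
    return DPARSER_LINK + quote(text, safe="")
```

The returned string can then be passed to `requests.get` in place of the plain concatenation.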
@j-min
j-min / hangul.py
Created December 20, 2016 04:07 — forked from allieus/hangul.py
# -*- coding: utf-8 -*-
class Hangul:
    BASE_CODE = 44032
    CHOSUNG = 588
    JUNGSUNG = 28
    # Initial consonant (chosung) list, indices 00 to 18
    CHOSUNG_LIST = [
        'ㄱ', 'ㄲ', 'ㄴ', 'ㄷ', 'ㄸ', 'ㄹ', 'ㅁ', 'ㅂ', 'ㅃ',
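The three constants in the preview are enough to split a precomposed Hangul syllable into its component indices. The sketch below shows that arithmetic on its own; `decompose_indices` and `LAST_CODE` are names introduced here for illustration, and the rest of the original class is not shown in the preview.

```python
BASE_CODE = 44032  # code point of '가', the first precomposed syllable
CHOSUNG = 588      # syllables per initial consonant (21 vowels * 28 finals)
JUNGSUNG = 28      # syllables per vowel (27 finals plus "no final")
LAST_CODE = 55203  # code point of '힣', the last precomposed syllable


def decompose_indices(syllable):
    """Return (initial, vowel, final) indices for one Hangul syllable."""
    code = ord(syllable)
    if not BASE_CODE <= code <= LAST_CODE:
        raise ValueError("not a precomposed Hangul syllable: %r" % syllable)
    offset = code - BASE_CODE
    initial = offset // CHOSUNG
    vowel = (offset % CHOSUNG) // JUNGSUNG
    final = offset % JUNGSUNG  # 0 means the syllable has no final consonant
    return initial, vowel, final
```

For example, '한' yields `(18, 0, 4)`: index 18 in the initial consonant list is 'ㅎ', and the other two values index into the vowel and final consonant lists in the same way.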
@j-min
j-min / convertSize.py
Created January 6, 2017 16:14
convertSize.py
import math
def convertSize(size):
    """
    Return filesize (in Bytes) in human-readable format
    """
    if size == 0:
        return '0B'
    units = ("B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB")
    i = int(math.floor(math.log(size, 1024)))
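The preview stops before the function returns, so the remainder of the original is unknown. The following is one possible complete version, not the author's. It swaps the `math.log` step for repeated division, which avoids floating-point rounding at exact powers of 1024; the name `convert_size` and the one-decimal output format are assumptions.

```python
def convert_size(size):
    """Return a non-negative byte count as a human-readable string."""
    units = ("B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB")
    if size == 0:
        return "0B"
    value = float(size)
    i = 0
    # Divide by 1024 until the value fits the current unit or units run out.
    while value >= 1024 and i < len(units) - 1:
        value /= 1024
        i += 1
    return "{:.1f}{}".format(value, units[i])
```

With this formatting, `convert_size(1536)` gives `'1.5KB'` and `convert_size(500)` gives `'500.0B'`.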
ZSH=$HOME/.zsh
HISTFILE=$HOME/.history
HISTSIZE=10000
SAVEHIST=10000
export TERM=xterm-256color
export LANG=en_US.UTF-8
# added by Anaconda3 4.1.1 installer
export PATH=$HOME/anaconda3/bin:$PATH
@j-min
j-min / backprop.ipynb
Created March 1, 2017 14:25
Simple backprop implementation in TensorFlow without its optimizer API