IKEGAMI Yukino ikegami-yukino

@ikegami-yukino
ikegami-yukino / levenshtein.py
Created February 5, 2014 07:02
Weighted Levenshtein Distance
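def weighted_levenshtein(a, b, insert=1, delete=1, substitute=1):
    len_a = len(a)
    len_b = len(b)
    # m[i][j] holds the cost of turning a[:i] into b[:j]
    m = [[0] * (len_b + 1) for i in range(len_a + 1)]
    # Turning a[:i] into the empty string takes i deletions
    for i in range(len_a + 1):
        m[i][0] = i * delete
    # Turning the empty string into b[:j] takes j insertions
    for j in range(len_b + 1):
        m[0][j] = j * insert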
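The preview stops after the first row and column of the table are initialized. A minimal sketch of how the computation could be completed, assuming the standard dynamic-programming recurrence for edit distance with per-operation weights; the name `weighted_levenshtein_sketch` is hypothetical and the rest of the gist may differ:

```python
def weighted_levenshtein_sketch(a, b, insert=1, delete=1, substitute=1):
    """Weighted edit distance from a to b (illustrative sketch, not the gist's code)."""
    len_a, len_b = len(a), len(b)
    # m[i][j] holds the cost of turning a[:i] into b[:j]
    m = [[0] * (len_b + 1) for _ in range(len_a + 1)]
    for i in range(len_a + 1):
        m[i][0] = i * delete
    for j in range(len_b + 1):
        m[0][j] = j * insert
    for i in range(1, len_a + 1):
        for j in range(1, len_b + 1):
            # Matching characters cost nothing; otherwise pay the substitution weight
            cost = 0 if a[i - 1] == b[j - 1] else substitute
            m[i][j] = min(
                m[i - 1][j] + delete,    # drop a[i-1]
                m[i][j - 1] + insert,    # add b[j-1]
                m[i - 1][j - 1] + cost,  # match or substitute
            )
    return m[len_a][len_b]


if __name__ == "__main__":
    print(weighted_levenshtein_sketch("kitten", "sitting"))
```

With a substitution weight above `insert + delete`, the recurrence prefers a deletion followed by an insertion, which is the usual reason for exposing the weights separately.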
ikegami-yukino / madoka_bayes.py
Last active August 29, 2015 13:57
Standard Naive Bayes and Complement Naive Bayes using madoka
#-*- coding: utf-8 -*-
import numpy as np
from collections import Counter, defaultdict
import madoka
NUM_DOCS_INDEX = '[[NUM_DOCS]]'
ALL_WORD_INDEX = '[[ALL]]'
class TFIDF(object):
ikegami-yukino / pig_.sh
Last active August 29, 2015 14:04
Apache Pig Installation on Ubuntu
wget http://ftp.kddilabs.jp/infosystems/apache/pig/latest/pig-0.13.0.tar.gz
tar -xvf pig-0.13.0.tar.gz
sudo mv pig-0.13.0 /usr/local/pig
rm pig-0.13.0.tar.gz
echo 'export PIG_HOME=/usr/local/pig' >> ~/.bashrc
echo 'export PATH=$PATH:$PIG_HOME/bin' >> ~/.bashrc
echo 'export PIG_CLASSPATH=$HADOOP_HOME/conf/' >> ~/.bashrc
source ~/.bashrc
pig -h
ikegami-yukino / vim_pig.sh
Last active August 29, 2015 14:07
Pig Latin syntax coloring for Vim
#!/bin/sh
git clone https://github.com/motus/pig.vim.git /tmp/pig.vim
mkdir -p ~/.vim/syntax/
mkdir -p ~/.vim/ftdetect/
cp /tmp/pig.vim/syntax/pig.vim ~/.vim/syntax/
cp /tmp/pig.vim/ftdetect/pig.vim ~/.vim/ftdetect/
rm -rf /tmp/pig.vim
ikegami-yukino / longest_contiguous_common_subsequence.py
Created October 15, 2014 09:59
Longest Contiguous Common Subsequence
def to_ngrams(s, minimum_n):
    """Generate n-grams (len(string) >= n >= minimum_n) from string
    Params:
        <str> s
        <int> minimum_n
    Return:
        <set <str>> ngrams
    """
    ngrams = []
    length = len(s)
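The preview ends before any n-grams are generated. One way the idea could be carried through, as a sketch only: enumerate every substring of length at least `minimum_n` for both strings, intersect the two sets, and keep the longest member. The names `to_ngrams_sketch` and `longest_common_substring_sketch` are hypothetical, and the tie-breaking rule is an assumption rather than anything the gist states:

```python
def to_ngrams_sketch(s, minimum_n):
    """Return the set of all substrings of s with length >= minimum_n."""
    length = len(s)
    return {
        s[i:i + n]
        for n in range(minimum_n, length + 1)
        for i in range(length - n + 1)
    }


def longest_common_substring_sketch(a, b, minimum_n=1):
    """Longest substring shared by a and b, or '' if none is long enough."""
    common = to_ngrams_sketch(a, minimum_n) & to_ngrams_sketch(b, minimum_n)
    if not common:
        return ''
    # Prefer the longest n-gram; break ties lexicographically so the result is deterministic
    return max(common, key=lambda g: (len(g), g))


if __name__ == "__main__":
    print(longest_common_substring_sketch("abcdef", "zcdemn"))
```

This enumerates a quadratic number of substrings per string, so it suits short inputs; a dynamic-programming table or suffix structure would scale better.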
ikegami-yukino / file0.txt
Last active April 9, 2016 08:23
Pure Python version of the online morphological analysis tool Rakuten MA ref: http://qiita.com/yukinoi/items/925bc238185aa2fad8a7
from rakutenma import RakutenMA
rma = RakutenMA(phi=1024, c=0.007812)
rma.load("model_ja.json")
rma.hash_func = rma.create_hash_func(15)
print(rma.tokenize("うらにわにはにわにわとりがいる"))
print(rma.train_one(
[["うらにわ","N-nc"],
["に","P-k"],
ikegami-yukino / file0.txt
Last active August 29, 2015 14:13
Using MeCab's constrained parsing in Python ref: http://qiita.com/yukinoi/items/4e7afb5e72b3a46da0f2
# -*- coding: utf-8 -*-
import re
import MeCab
from MeCab import MECAB_ANY_BOUNDARY, MECAB_INSIDE_TOKEN, MECAB_TOKEN_BOUNDARY
DICINFO_KEYS = ('charset', 'filename', 'lsize', 'rsize', 'size', 'type', 'version')
class Tagger(MeCab.Tagger):
    '''
    DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
    Version 2, December 2004
    Copyright (C) 2004 Sam Hocevar <sam@hocevar.net>
    Everyone is permitted to copy and distribute verbatim or modified
    copies of this license document, and changing it is allowed as long
    as the name is changed.
ikegami-yukino / vim_hive.sh
Created January 23, 2015 10:06
Hive syntax for Vim
mkdir -p ~/.vim/syntax
wget -O ~/.vim/syntax/hive.vim https://raw.githubusercontent.com/autowitch/hive.vim/master/syntax/hive.vim
echo "au BufNewFile,BufRead *.hql set filetype=hive expandtab" >> ~/.vimrc
echo "au BufNewFile,BufRead *.q set filetype=hive expandtab" >> ~/.vimrc
ikegami-yukino / speech_install.sh
Last active March 25, 2017 07:01
Install hts_engine, hts_voice and open_jtalk
HTS_ENGINE_VERSION=1.10
HTS_VOICE_VERSION=1.05
OPENJTALK_VERSION=1.10
pushd .
cd /tmp
wget http://downloads.sourceforge.net/hts-engine/hts_engine_API-${HTS_ENGINE_VERSION}.tar.gz
tar xzf hts_engine_API-${HTS_ENGINE_VERSION}.tar.gz
cd hts_engine_API-${HTS_ENGINE_VERSION}