IKEGAMI Yukino ikegami-yukino

@ikegami-yukino
ikegami-yukino / install_unidic_scientific_linux6.sh
Last active December 18, 2015 07:37
Install UniDic on Scientific Linux 6
if [ ! -e `/usr/local/bin/mecab-config --dicdir`/unidic ]; then
    if [ "`yum list installed | grep unzip.x86_64`" = "" ]; then
        yum install -y unzip
    fi
    wget "http://jaist.dl.osdn.jp/unidic/58338/unidic-mecab-2.1.2_src.zip" -O /tmp/unidic-mecab-2.1.2_src.zip
    unzip /tmp/unidic-mecab-2.1.2_src.zip -d /tmp
    cd /tmp/unidic-mecab-2.1.2_src
    ./configure
    make
    make install
ikegami-yukino / install_java8_scientific_linux_6.sh
Created December 18, 2015 07:32
Install Java 8 on Scientific Linux 6
wget --no-cookies --no-check-certificate \
    --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F; oraclelicense=accept-securebackup-cookie" \
    "http://download.oracle.com/otn-pub/java/jdk/8u65-b14/jdk-8u65-linux-x64.rpm" -O /tmp/jdk-8u65-linux-x64.rpm
rpm -ivh /tmp/jdk-8u65-linux-x64.rpm
ikegami-yukino / 1_parse_vs_parseToNode.ipynb
Last active December 15, 2015 11:31
Tips for speeding up MeCab in Python
pushd .
# Install blas
cd /tmp
wget http://www.netlib.org/blas/blas.tgz
tar xzf blas.tgz
cd BLAS*
gfortran -O3 -m64 -fPIC -c *.f
ar r libfblas.a *.o
ranlib libfblas.a
ikegami-yukino / dict2sparse.py
Last active May 6, 2021 09:01
dict to scipy.sparse
import numpy as np
from scipy.sparse import csr_matrix

def dict2sparse(d):
    data = list(d.values())
    indices = list(d.keys())
    indptr = [0, len(d)]
    return csr_matrix((data, indices, indptr), shape=(1, max(d)+1), dtype=np.uint32)
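
To make the `(data, indices, indptr)` triple passed to `csr_matrix` easier to read, here is a minimal pure-Python sketch of how such a triple expands into a single dense row. The helper name `csr_row_to_dense` is hypothetical and not part of SciPy or the gist; it only mirrors the CSR layout for illustration.

```python
def csr_row_to_dense(data, indices, indptr, n_cols, row=0):
    """Expand one row of a CSR triple into a dense Python list.

    Hypothetical helper for illustration only; not a SciPy function.
    """
    dense = [0] * n_cols
    # indptr[row]:indptr[row + 1] is the slice of data/indices owned by this row.
    start, end = indptr[row], indptr[row + 1]
    for value, col in zip(data[start:end], indices[start:end]):
        dense[col] += value
    return dense


# Build the same triple that dict2sparse builds from a {column: value} dict.
d = {0: 3, 4: 7, 2: 1}
data = list(d.values())
indices = list(d.keys())
indptr = [0, len(d)]  # a single row that owns every stored value

dense = csr_row_to_dense(data, indices, indptr, n_cols=max(d) + 1)
print(dense)  # [3, 0, 1, 0, 7]
```

The dict keys act as column indices and the values as the stored entries, so the row is one column wider than the largest key.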
ikegami-yukino / arabic2chinese.py
Created September 3, 2015 02:55
Convert Arabic numerals to Chinese numerals
CHINESE_MAP = {'1': '一', '2': '二', '3': '三', '4': '四', '5': '五', '6': '六', '7': '七', '8': '八', '9': '九'}
CHINESE_DIGITS = ('十', '百', '千', '万', '十万', '百万', '千万', '億', '十億', '百億', '千億', '兆', '十兆', '百兆', '千兆')

def arabic2chinese(arabic):
    chinese = []
    # Compare the string itself; len(arabic) == '0' compared an int with a str and was never true.
    if arabic == '0':
        return '〇'
    arabic = arabic.replace(',', '')
    for (i, num) in enumerate(arabic[::-1]):
        if num == '0':
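
The preview of `arabic2chinese` is cut off above, so the rest of the author's loop is unknown. As an independent illustration of the general idea, the following sketch converts a digit string by splitting it into groups of four digits. The names `to_kanji`, `DIGITS`, `SMALL_UNITS`, and `LARGE_UNITS` are hypothetical, and the convention of always writing 一 before a unit (for example 一十) is an assumption, not something taken from the gist.

```python
# Hypothetical names; this is not the gist's implementation.
DIGITS = {'1': '一', '2': '二', '3': '三', '4': '四', '5': '五',
          '6': '六', '7': '七', '8': '八', '9': '九'}
SMALL_UNITS = ('', '十', '百', '千')  # positions inside one 4-digit group
LARGE_UNITS = ('', '万', '億', '兆')  # one unit per 4-digit group


def to_kanji(arabic):
    """Convert a decimal digit string (commas allowed) to kanji numerals."""
    cleaned = arabic.replace(',', '')
    if not cleaned or any(c not in '0123456789' for c in cleaned):
        raise ValueError('expected a string of decimal digits')
    digits = cleaned.lstrip('0')
    if not digits:
        return '〇'
    if len(digits) > 4 * len(LARGE_UNITS):
        raise ValueError('number too large for this sketch')

    # Split from the right into groups of up to four digits.
    groups = []
    while digits:
        groups.append(digits[-4:])
        digits = digits[:-4]

    parts = []
    for group_index, group in enumerate(groups):
        text = ''
        for pos, digit in enumerate(reversed(group)):
            if digit != '0':
                text = DIGITS[digit] + SMALL_UNITS[pos] + text
        if text:  # skip groups that are all zeros
            parts.append(text + LARGE_UNITS[group_index])
    return ''.join(reversed(parts))


print(to_kanji('1,234'))   # 一千二百三十四
print(to_kanji('20,015'))  # 二万一十五
```

Grouping by four digits keeps 万, 億, and 兆 from being repeated for every digit inside the same group.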
ikegami-yukino / sparse_eliminate_zero_raws.py
Last active December 29, 2015 05:15
Remove all-zero rows from a scipy sparse matrix
import numpy as np

def eliminate_zero_raws(x):
    return x[np.unique(x.nonzero()[0])]
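
The one-liner above collects the row index of every nonzero entry, deduplicates those indices, and then selects only those rows. A minimal sketch of the same selection on plain Python lists, with a hypothetical helper name `drop_zero_rows`, makes the steps visible without NumPy or SciPy:

```python
def drop_zero_rows(rows):
    """Keep only rows that contain at least one nonzero value.

    Hypothetical list-based stand-in for the sparse-matrix version.
    """
    # Row index of every nonzero entry, deduplicated and sorted
    # (the role np.unique plays in the original).
    nonzero_row_ids = sorted(
        {i for i, row in enumerate(rows) for value in row if value != 0}
    )
    return [rows[i] for i in nonzero_row_ids]


rows = [
    [0, 0, 0],
    [1, 0, 2],
    [0, 0, 0],
    [0, 3, 0],
]
kept = drop_zero_rows(rows)
print(kept)  # [[1, 0, 2], [0, 3, 0]]
```

The surviving rows keep their original relative order.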
import heapq
from collections import deque

class TopK():
    def __init__(self, k=5):
        self.k = k
        self._initialize()
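
The body of `TopK` is cut off in this preview, so how it uses `heapq` and `deque` is not shown. As a general illustration of top-k selection with `heapq`, and not a reconstruction of the class, a size-bounded min-heap can be sketched as follows. The function name `top_k` is hypothetical.

```python
import heapq


def top_k(values, k=5):
    """Return the k largest values in descending order using a bounded min-heap."""
    if k <= 0:
        return []
    heap = []  # min-heap holding the k largest values seen so far
    for value in values:
        if len(heap) < k:
            heapq.heappush(heap, value)
        elif value > heap[0]:
            # The new value beats the smallest kept value, so swap it in.
            heapq.heapreplace(heap, value)
    return sorted(heap, reverse=True)


print(top_k([5, 1, 9, 3, 7, 8], k=3))  # [9, 8, 7]
```

Because the heap never grows past `k` entries, memory stays bounded even when the input is a long stream.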
ikegami-yukino / mac_word2vec_install.sh
Last active May 28, 2019 19:41
Install word2vec on Mac OS X versions later than 10.9
pushd . &> /dev/null
cd /tmp
git clone --depth=1 https://github.com/tmikolov/word2vec
cd word2vec
sed -i -e 's/malloc.h/stdlib.h/g' *.c
make
rm *.c* *.txt makefile LICENSE
cp * /usr/local/bin
popd &> /dev/null
git clone --depth 1 https://github.com/neologd/mecab-ipadic-neologd.git /tmp/mecab-ipadic-neologd
bash /tmp/mecab-ipadic-neologd/bin/install-mecab-ipadic-neologd -n -y
rm -rf /tmp/mecab-ipadic-neologd