brew install autoconf automake libtool protobuf
pushd .
# Clone into /tmp/sentencepiece explicitly; cloning into the non-empty /tmp/ itself fails.
git clone --depth=1 https://github.com/google/sentencepiece.git /tmp/sentencepiece
cd /tmp/sentencepiece
# Homebrew installs GNU libtoolize as glibtoolize.
perl -i -pe 's/libtoolize/glibtoolize/' autogen.sh
./autogen.sh
./configure
make
make check

wget http://www.chasen.org/~taku/software/mecab-skkserv/mecab-skkserv-0.03.tar.gz
tar xzf mecab-skkserv-0.03.tar.gz
cd mecab-skkserv-0.03
ls *|xargs nkf -w --overwrite
./configure --with-charset=utf8
echo 'cost-factor = 700' >>dicrc
perl -i -ne '$i++; print if ($i != 36 && $i != 37 && $i != 38 && $i != 44 && $i != 45 && $i != 46 && $i != 47 && $i != 48)' mecab-skkserv.cpp
make
make install

# Required download
# cudnn-8.0-linux-x64-v5.1.tgz
curl -L -o cuda_8.0.44_linux.run https://developer.nvidia.com/compute/cuda/8.0/prod/local_installers/cuda_8.0.44_linux-run
curl -L -O http://us.download.nvidia.com/XFree86/Linux-x86_64/367.27/NVIDIA-Linux-x86_64-367.27.run
sudo apt-get install build-essential
sudo apt-get install linux-image-extra-`uname -r`
sudo sh cuda_8.0.44_linux.run
# Extend the existing LD_LIBRARY_PATH (the original referenced the unset LD_LINKER_PATH).
echo -e "export CUDA_HOME=/usr/local/cuda\nexport PATH=\$PATH:\$CUDA_HOME/bin\nexport LD_LIBRARY_PATH=\$LD_LIBRARY_PATH:\$CUDA_HOME/lib64" >> ~/.bashrc

import re
import os
import glob

# Katakana reading, a hyphen, then a Latin-script spelling.
# A raw string avoids invalid-escape warnings for the regex escapes.
re_pair = re.compile(r"^([ァ-ンー]+)-([a-zA-Z '\-()]+)")
UNIDIC_PATH = 'path to UniDic directory'

with open('result.tsv', 'w') as out_fd:
    for csvfile in glob.glob(os.path.join(UNIDIC_PATH, '*.csv')):
        with open(csvfile) as dic_fd:

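The preview above is cut off before `re_pair` is applied, so the following is only a minimal sketch of what the pattern captures, not the original loop body. The helper `split_pair` and the sample strings are hypothetical and do not come from UniDic itself.

```python
import re

# Same pattern as in the snippet above: katakana, a hyphen, then Latin text.
re_pair = re.compile(r"^([ァ-ンー]+)-([a-zA-Z '\-()]+)")


def split_pair(field):
    """Return (katakana, latin) if the field starts with such a pair, else None."""
    m = re_pair.match(field)
    return (m.group(1), m.group(2)) if m else None


# Hypothetical sample fields, for illustration only.
print(split_pair("コンピューター-computer"))
print(split_pair("猫"))
```

The first call yields the katakana form and its Latin spelling as a tuple; the second returns `None` because the field contains no katakana-hyphen-Latin prefix.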
#!/usr/bin/env bash
# from "Using the JUMAN morphological analysis system with Python 3"
# https://abeerforyou.com/?p=715
set -eu
pushd . > /dev/null
cd /tmp
curl -L -O 'http://nlp.ist.i.kyoto-u.ac.jp/DLcounter/lime.cgi?down=http://nlp.ist.i.kyoto-u.ac.jp/nl-resource/juman/juman-7.01.tar.bz2&name=juman-7.01.tar.bz2'

--- linear.cpp	2015-09-27 07:03:33.000000000 +0900
+++ new_linear.cpp	2016-04-09 01:32:23.000000000 +0900
@@ -2685,9 +2685,10 @@ double predict_probability(const struct
 	double label=predict_values(model_, x, prob_estimates);
 	for(i=0;i<nr_w;i++)
-		prob_estimates[i]=1/(1+exp(-prob_estimates[i]));
+		prob_estimates[i]=exp(prob_estimates[i]);
 	if(nr_class==2) // for binary classification

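The hunk above is cut off, so the rest of `predict_probability` is not visible here. As a rough sketch, assuming the multi-class branch goes on to divide each entry by the sum of all entries (as upstream liblinear does), swapping the logistic transform for a plain `exp` turns the result into a softmax over the decision values. The function names and scores below are hypothetical.

```python
import math


def normalized_sigmoid(scores):
    # Unpatched behaviour (assumed): logistic transform, then divide by the sum.
    p = [1.0 / (1.0 + math.exp(-s)) for s in scores]
    total = sum(p)
    return [v / total for v in p]


def softmax(scores):
    # Patched behaviour (assumed): exp, then divide by the sum.
    # Like the patch, this does not subtract the maximum for numerical stability.
    p = [math.exp(s) for s in scores]
    total = sum(p)
    return [v / total for v in p]


scores = [2.0, 0.5, -1.0]  # hypothetical decision values for three classes
print(normalized_sigmoid(scores))
print(softmax(scores))
```

Both functions return values that sum to 1, but the softmax puts noticeably more mass on the highest-scoring class.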
import os
import shutil
import tempfile

import tcptest
from elasticsearch import Elasticsearch

SYNONYMS_PATH = "/tmp/wikipedia_synonym.txt"
settings = {