Jwata/0bb45cd81010148fbdbb
Last active August 29, 2015 14:26
import MeCab
from gensim import corpora

file_name = 'requirements_3.txt'
# The dictionary path is specific to the author's Homebrew install of mecab-ipadic-neologd.
mecab = MeCab.Tagger("-Ochasen -d /usr/local/Cellar/mecab/0.996/lib/mecab/dic/mecab-ipadic-neologd/")

# Tokenize each line of the input file into a list of surface forms.
all_tokens = []
with open(file_name, encoding='utf-8') as f:
    for line in f:
        token_node = mecab.parseToNode(line.strip())
        tokens = []
        while token_node:
            # The BOS/EOS nodes have an empty surface, so skip them.
            if token_node.surface:
                tokens.append(token_node.surface)
            token_node = token_node.next
        all_tokens.append(tokens)
print(all_tokens)

# Build a gensim dictionary (token -> integer id) from the tokenized lines and save it.
dic = corpora.Dictionary(all_tokens)
dic.save('/tmp/mecab-dict')
print(dic.token2id)
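To make the dictionary step easier to follow, here is a minimal, dependency-free sketch of what a token-to-id mapping and a bag-of-words conversion look like. It is not gensim's implementation, and gensim may assign different ids. The helper names `build_token2id` and `to_bow` and the sample token lists are hypothetical, made up for illustration only.

```python
from collections import Counter


def build_token2id(documents):
    """Assign an integer id to each distinct token, in first-seen order."""
    token2id = {}
    for tokens in documents:
        for token in tokens:
            if token not in token2id:
                token2id[token] = len(token2id)
    return token2id


def to_bow(tokens, token2id):
    """Convert one token list into sorted (token_id, count) pairs.

    Tokens that are missing from the mapping are ignored.
    """
    counts = Counter(token2id[t] for t in tokens if t in token2id)
    return sorted(counts.items())


# Hypothetical sample standing in for `all_tokens` (not real MeCab output).
sample_tokens = [
    ["機械", "学習", "に", "関する", "知識"],
    ["開発", "経験", "に", "関する", "知識"],
]
sample_token2id = build_token2id(sample_tokens)
sample_bow = to_bow(["知識", "知識", "開発", "未知"], sample_token2id)
print(sample_token2id)
print(sample_bow)
```

With gensim itself, the corresponding call on the saved dictionary is `dic.doc2bow(tokens)`, which likewise returns `(token_id, count)` pairs for the tokens it knows.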
【必須】
・何らかの言語での開発経験をお持ちであり、IoT領域に対しての興味関心をお持ちの方
【尚可】
・ハードウェアに関する開発経験
・Web系開発言語(PHP、Python、Ruby、Perl、Java、Golang、Scala等)でのWebアプリケーション開発経験
・データマイニング、機械学習に関する知識、経験やHadoop/Hiveに関わる知識・利用経験
【求める人物像】
・少数精鋭チームで自ら進んでタスクを見つけ、遂行できる能力を持った方
・技術に関する興味、探究心が強く、ものを作るのが好きな方
・技術的なチャレンジに臆することなく向かうことでき、成長に対する熱意がある方