Last active
August 19, 2018 13:13
[R][MeCab] Installing RMeCab and running morphological analysis. ref: https://qiita.com/hujuu/items/314a64a50875cdabf755
$ brew doctor
$ brew doctor
Your system is ready to brew.
$ brew doctor
Please note that these warnings are just used to help the Homebrew maintainers
with debugging if you file an issue. If everything you use Homebrew for is
working fine: please don't worry and just ignore them. Thanks!
$ brew install mecab
$ brew install mecab-ipadic
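Before moving on to R, it may help to confirm that the `mecab` binary installed above is actually on the PATH. The following is only a sketch, not part of the original gist: `have_cmd` is a hypothetical helper name, and the check assumes a POSIX shell.

```shell
#!/bin/sh
# Hypothetical helper (not from the original gist): succeeds if a command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd mecab; then
  # Print the installed version, then tokenize the sentence used in the R example.
  mecab --version
  echo "すもももももももものうち" | mecab ||
    echo "mecab ran but failed; is a dictionary such as mecab-ipadic installed?" >&2
else
  echo "mecab not found on PATH; try re-running the brew install steps" >&2
fi
```

If the tokenized output appears here, RMeCab should be able to find the same MeCab installation from R.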
install.packages("RMeCab", repos = "http://rmecab.jp/R")
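library(RMeCab)
# Run morphological analysis on a sample sentence
res <- RMeCabC("すもももももももものうち")
# Flatten the result into a named character vector (names are parts of speech)
unlist(res)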
    名詞     助詞     名詞     助詞     名詞     助詞     名詞
"すもも"     "も"   "もも"     "も"   "もも"     "の"   "うち"
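library(RMeCab)
library(ggplot2)
# Load the text to analyze and count morpheme frequencies
res <- RMeCabFreq("steve-jobs-speech.txt")
# Keep only nouns ("名詞") in the data frame res_noun
res_noun <- res[res[, 2] == "名詞", ]
# Narrow to nouns that appear two or more times; res[, 4] is the "Freq" column
res_noun <- res[res[, 2] == "名詞" & res[, 4] > 1, ]
# Number of nouns that appear two or more times
nrow(res_noun)
# Show res_noun sorted by Freq in descending order
res_noun[order(res_noun$Freq, decreasing = TRUE), ]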
# Build a data frame from columns 1 (term) and 4 (frequency)
res_noun2 <- data.frame(word = as.character(res_noun[, 1]),
                        freq = res_noun[, 4])
# Narrow to the top 25 by frequency (the original "< 25" kept only ranks up to 24)
res_noun2 <- subset(res_noun2, rank(-freq) <= 25)
# Draw a horizontal bar chart with ggplot
ggplot(res_noun2, aes(x = reorder(word, freq), y = freq)) +
  geom_bar(stat = "identity", fill = "grey") +
  theme_bw(base_size = 10, base_family = "HiraKakuProN-W3") +
  coord_flip()