Last active
April 16, 2017 07:45
[R][tidyr] From scraping to plotting the data, all in one run. ref: http://qiita.com/hujuu/items/2bb08a511546f3cbc322
install.packages("tidyr")
library(tidyr)
library(XML)
kanken = "http://www.kanken.or.jp/kanken/investigation/transition.html"
# Read the first table on the page into a data frame
kanken.table = readHTMLTable(kanken, header=T, which=1, stringsAsFactors=F)
# Set the column names
colnames(kanken.table) = c("年度","志願者数","合格者数")
  年度                  志願者数    合格者数
1 平成12年度(2000年度) 1,576,959人 686,388人
2 平成13年度(2001年度) 1,797,608人 859,902人
3 平成14年度(2002年度) 2,044,170人 1,067,356人
4 平成15年度(2003年度) 2,195,595人 1,203,597人
5 平成16年度(2004年度) 2,240,344人 1,133,875人
6 平成17年度(2005年度) 2,407,075人 1,227,430人
・・・
kanken.g <- kanken.table %>% tidyr::gather(分類, 人数, -年度)
   年度                  分類     人数
1  平成12年度(2000年度) 志願者数 1,576,959人
2  平成13年度(2001年度) 志願者数 1,797,608人
3  平成14年度(2002年度) 志願者数 2,044,170人
・・・
17 平成12年度(2000年度) 合格者数 686,388人
18 平成13年度(2001年度) 合格者数 859,902人
19 平成14年度(2002年度) 合格者数 1,067,356人
・・・
# Remove the "," thousands separators
kanken.g[,3] <- gsub(",","",kanken.g[,3])
# Remove the "人" (persons) suffix and convert to numeric with as.numeric
kanken.g[,3] <- as.numeric(gsub("人","",kanken.g[,3]))
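
The two cleanup steps above can also be collapsed into a single pass. The following is a minimal sketch using base R only; the vector name `raw` is hypothetical, and the sample values are copied from the table output shown earlier. It assumes the column holds non-negative whole numbers, since stripping every non-digit character would also remove a minus sign or decimal point.

```r
# Sample values in the same format as the scraped columns
# (`raw` is a hypothetical name used only for this illustration).
raw <- c("1,576,959人", "686,388人")

# Drop every character that is not a digit (the commas and the unit
# suffix), then convert the remaining digits to numeric.
counts <- as.numeric(gsub("[^0-9]", "", raw))
print(counts)
```

Applied to the gist's data frame, this would amount to `kanken.g[,3] <- as.numeric(gsub("[^0-9]", "", kanken.g[,3]))`.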
library(ggplot2)
library(ggthemes)  # provides theme_pander()
g <- ggplot(kanken.g, aes(x = 年度, y = 人数, fill = 分類))
# Set the graph type to a (dodged) bar chart
g <- g + geom_bar(width = 0.8, stat = "identity", position = "dodge")
g <- g + scale_linetype_identity()
g <- g + theme_pander()
g <- g + theme(text = element_text(family = "HiraKakuProN-W3"),
               plot.margin = unit(c(1, 1, 1, 1), "lines"),
               axis.text.x = element_text(angle = 45, hjust = 1))
# Show the y axis with a comma every three digits
g <- g + scale_y_continuous(labels = scales::comma)
g
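
The x-axis labels in the plot are long strings such as 平成12年度(2000年度), which is why the theme rotates them by 45 degrees. As an optional variation that is not part of the original gist, the four-digit Western year could be pulled out of each label and used on the axis instead. This is a minimal sketch in base R; the names `labels` and `years` are hypothetical, and it assumes every label contains exactly one four-digit run.

```r
# Sample labels in the same format as the 年度 column
# (`labels` and `years` are hypothetical names for this illustration).
labels <- c("平成12年度(2000年度)", "平成13年度(2001年度)")

# regexpr() finds the first run of four digits in each label and
# regmatches() extracts it. Labels without a match would be dropped,
# so check the lengths before assigning the result back to a column.
years <- as.integer(regmatches(labels, regexpr("[0-9]{4}", labels)))
stopifnot(length(years) == length(labels))
print(years)
```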