@gghatano
Created January 9, 2014 23:58
plyr vs doBy speed test
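# Packages providing ddply()/.() and summaryBy()
library(plyr)
library(doBy)

# Group labels
types <- c("A", "B", "C", "D", "E")
# Number of rows
obs <- 4e+06
# Create the data frame with a single factor column
dat <- data.frame(type = as.factor(sample(types, obs, replace = TRUE)))
# Measure the time while increasing the number of columns
Nmax <- 10
# Elapsed seconds for each column count
plyr_time <- numeric(Nmax)
doBy_time <- numeric(Nmax)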
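for (N in 1:Nmax) {
  # Append one more numeric column (value1, value2, ...) on each pass
  dat[, N + 1] <- round(runif(obs, min = 0, max = 1), digits = 2)
  names(dat)[N + 1] <- paste("value", N, sep = "")

  # The summary is always the mean of value1 per type; only the width of dat changes
  plyr_time[N] <- system.time(
    plyr_res <- ddply(dat, .(type), summarize,
                      mean_percent = mean(value1))
  )[["elapsed"]]

  doBy_time[N] <- system.time(
    doBy_res <- summaryBy(value1 ~ type, data = dat, FUN = mean)
  )[["elapsed"]]
}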