@ayman
Created May 21, 2015 18:12
Telling read.csv how many rows a file has before reading it (via nrows) speeds up R considerably and reduces memory use on large files. Here's a quick trick to get a file's row count within R.
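f.name <- "file.csv"
# Redirect the file into wc so it prints only the line count, not the file name.
# Note: wc -l counts newline characters, so the file should end with a newline.
f.command <- paste("wc -l <", shQuote(f.name))
f.rows <- as.numeric(system(f.command, intern = TRUE))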
f <- read.csv(f.name,
              nrows = f.rows,
              col.names = c("user", "phone", "total", "percent"),
              colClasses = c("numeric", "numeric", "numeric", "numeric"),
              header = FALSE)
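Since the trick above shells out to wc, it needs a Unix-like system. A minimal sketch of a pure-R alternative follows, assuming a plain text file; `count_rows` and its `chunk_size` argument are hypothetical names, not part of the original gist. It reads the file in chunks, so the whole file is never held in memory just to count it.

```r
# Hypothetical helper (not from the original gist): count the lines of a
# text file in pure R, reading chunk_size lines at a time.
count_rows <- function(path, chunk_size = 10000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  n <- 0
  repeat {
    # readLines() returns character(0) once the connection is exhausted.
    chunk <- readLines(con, n = chunk_size, warn = FALSE)
    if (length(chunk) == 0L) break
    n <- n + length(chunk)
  }
  n
}

# Usage: write a small sample file, then pass the count to read.csv as nrows.
tmp <- tempfile(fileext = ".csv")
writeLines(c("1,2,3,4", "5,6,7,8", "9,10,11,12"), tmp)
f.rows <- count_rows(tmp)
f <- read.csv(tmp, nrows = f.rows, header = FALSE,
              col.names = c("user", "phone", "total", "percent"),
              colClasses = rep("numeric", 4))
```

Unlike wc -l, this also counts a final line that lacks a trailing newline. The trade-off is an extra pass over the file in R, which is likely slower than wc on very large files.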