@iros
Created September 28, 2012 20:46
R Trick - reading data faster
# Making R read data faster by precomputing the column
# data types
sample <- read.table("data.txt", nrows = 100)
types <- sapply(sample, classes)
allData <- read.table("data.txt", colClasses = classes)
@alexmasselot

Hi Irene
Does it work better with a couple of fixes?
sample <- read.table(fname, nrows = 100, header=TRUE)
types <- sapply(sample, class)
read.table(fname, colClasses = types, header=TRUE)

Or maybe I'm missing something about the classes function and where types is used.
I then benchmarked the classic method against this one on ~10M lines with 3 columns (numeric and text), and I did not find a major time improvement (barely a few percent).
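
A comparison along these lines can be sketched as below. This is only a minimal illustration under stated assumptions, not the benchmark that was actually run: the generated file, its size (1e5 rows rather than ~10M), and the column names id, value and label are all hypothetical. One caveat is that classes inferred from the first 100 rows may not hold for the whole file (for example, a column that looks integer at the top but contains decimals later).

```r
# Hypothetical benchmark sketch: read.table with and without colClasses.
# The data below is generated for illustration; it is NOT the 10M-line file
# mentioned in the comment.
set.seed(1)
n <- 1e5
fname <- tempfile(fileext = ".txt")
df <- data.frame(
  id    = seq_len(n),
  value = rnorm(n),
  label = sample(letters, n, replace = TRUE),
  stringsAsFactors = FALSE
)
write.table(df, fname, row.names = FALSE, quote = FALSE)

# Classic method: let read.table guess every column type.
t_classic <- system.time(
  classic <- read.table(fname, header = TRUE, stringsAsFactors = FALSE)
)

# Precomputed types: infer classes from the first 100 rows, then reuse them.
t_typed <- system.time({
  head_rows <- read.table(fname, nrows = 100, header = TRUE,
                          stringsAsFactors = FALSE)
  types <- sapply(head_rows, class)
  typed <- read.table(fname, colClasses = types, header = TRUE)
})

# Elapsed seconds for each approach (values depend on machine and R version).
print(c(classic = t_classic[["elapsed"]], typed = t_typed[["elapsed"]]))

# Both approaches should load the same number of rows and columns.
stopifnot(identical(dim(classic), dim(typed)))
unlink(fname)
```

Timings from a single run are noisy, so repeating each read a few times and comparing medians would give a fairer picture.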

This method might still pay off in other situations (or with older R versions).

(Anyway, I love your tweets, I'm a big fan. Stuck on a slow load this morning, I remembered this one from 3 months ago.)
Alex