@nskeip
Created June 12, 2014 12:18
Read multiple CSV files into a single dataframe
create_big_data_from_csv_dir <- function(directory, ids) {
  # locate the files and keep only those selected by position in ids
  files <- list.files(directory, full.names = TRUE)[ids]
  # read the files into a list of data.frames
  data.list <- lapply(files, read.csv)
  # concatenate into one big data.frame
  data.cat <- do.call(rbind, data.list)
  # optional aggregate step (assumes columns named value and index)
  # data.agg <- aggregate(value ~ index, data.cat, mean)
  # return the combined data.frame visibly
  data.cat
}
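
A minimal usage sketch, assuming the function works as posted above. The temporary directory, the file names, and the demo values are made up for illustration; the `index` and `value` columns simply follow the commented-out aggregate line. A compact copy of the function is included so the sketch runs on its own.

```r
# Compact copy of the gist's function so this sketch is self-contained
create_big_data_from_csv_dir <- function(directory, ids) {
  files <- list.files(directory, full.names = TRUE)[ids]
  data.list <- lapply(files, read.csv)
  do.call(rbind, data.list)
}

# Hypothetical demo data: three small CSV files in a fresh temporary directory
demo.dir <- tempfile("csv_demo")
dir.create(demo.dir)
for (i in 1:3) {
  part <- data.frame(index = 1:2, value = c(i, i * 10))
  write.csv(part, file.path(demo.dir, sprintf("part%d.csv", i)), row.names = FALSE)
}

# Read only the first two files (list.files returns them in alphabetical order)
big <- create_big_data_from_csv_dir(demo.dir, 1:2)

# The aggregate step that is commented out in the gist: mean of value per index
agg <- aggregate(value ~ index, big, mean)
print(agg)
```

Note that `ids` selects files by their position in the sorted listing, not by anything in the file names, so adding files to the directory can shift which ones are read.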
@SimonStolz

Thanks for sharing! I added a parameter to load lists from JSON files into one combined list instead of reading CSV files. Enable it by calling getAllInDir("directoryOfJSONs", asList = TRUE). I also removed the ids parameter for my use case.

# fromJSON(file = ...) matches the rjson package's signature (assumed here)
library(rjson)

getAllInDir <- function(directory, asList = FALSE) {
  # locate the files
  files <- list.files(directory, full.names = TRUE)

  if (!asList) {
    # read the CSV files into a list of data.frames
    data.list <- lapply(files, read.csv)
    # concatenate into one big data.frame
    return(do.call(rbind, data.list))
  }

  # read each JSON file into a list and return the combined list
  lapply(files, function(x) fromJSON(file = x))
}
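
A rough sketch of what the asList = TRUE path produces, assuming `fromJSON` comes from the rjson package (the comment does not say which JSON package is used, but the `file =` argument matches rjson). The demo directory, the file names, and the JSON contents are hypothetical, and naming the result by file is an optional extra that is not part of the function above.

```r
# Assumption: the rjson package is installed; the demo is skipped otherwise
if (requireNamespace("rjson", quietly = TRUE)) {
  # Hypothetical demo data: two small JSON files in a fresh temporary directory
  demo.dir <- tempfile("json_demo")
  dir.create(demo.dir)
  writeLines('{"name": "a", "count": 1}', file.path(demo.dir, "a.json"))
  writeLines('{"name": "b", "count": 2}', file.path(demo.dir, "b.json"))

  # Same steps as the asList = TRUE branch: one parsed list per file
  files <- list.files(demo.dir, full.names = TRUE)
  combined <- lapply(files, function(x) rjson::fromJSON(file = x))

  # Label each element with its file name so entries are easier to find
  names(combined) <- basename(files)
  str(combined)
}
```

Unlike the CSV branch, the result stays a list with one element per file, since JSON documents do not necessarily share a tabular shape that `rbind` could stack.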

@nskeip
Author

nskeip commented Oct 16, 2020

@SimonStolz that's pretty cool :) Thank you! 👍
