@apoorvalal
Last active December 5, 2021 21:02
Functions that use the dataverse API bindings in R to download repositories
library(dataverse)
library(stringr)
# add credentials (required for large jobs or gated files)
#Sys.setenv("DATAVERSE_SERVER" = "dataverse.harvard.edu")
#Sys.setenv("DATAVERSE_KEY" = "examplekey12345")
######################################################################
# download individual file from dataverse doi
dl_file = function(fn, dirn, doi){
  cat(paste0("Downloading ", fn, "\n"))
  # fetch the file contents as a raw vector; try() catches request errors
  f = try(get_file(fn, doi, original = TRUE))
  # failure state - e.g. 404 or 403 errors
  if(inherits(f, "try-error")){
    cat(paste0(fn, " failed. \n"))
    return(NULL)
  } else { # write binary to file
    writeBin(f, file.path(dirn, fn))
    cat(paste0(fn, " downloaded! \n"))
    return(TRUE)
  }
}
# function to mirror entire dataverse doi
dv_download = function(doi, dirn, pat = ""){
  # create folder if it doesn't exist
  dir.create(dirn, showWarnings = FALSE)
  # get metadata
  dataset = get_dataset(doi)
  # list of files in repo
  fl = dataset$files$label
  # subset to files matching the regex `pat`; skipped when `pat` is empty,
  # because str_subset() does not accept an empty pattern in recent stringr versions
  if(nzchar(pat)) fl = str_subset(fl, pat)
  # download all files in repo
  for(f in fl) dl_file(f, dirn, doi)
}
######################################################################
dv_download("doi:10.7910/DVN/5TBFXL", 'test')
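# A minimal sketch of how the `pat` argument narrows a download. The file
# labels below are hypothetical (not taken from the dataset above); `pat` is
# interpreted as a regular expression by str_subset().
example_labels = c("analysis.R", "data.csv", "codebook.pdf", "results.csv")
csv_only = stringr::str_subset(example_labels, "\\.csv$")
print(csv_only)
# To mirror only the matching files, pass the same pattern to dv_download().
# This assumes the dataset contains .csv files; 'test_csv' is a hypothetical
# folder name.
# dv_download("doi:10.7910/DVN/5TBFXL", "test_csv", pat = "\\.csv$")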