msd files list
library(XML)
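#read the bucket listing (XML) from the url
#named listing rather than search to avoid masking base::search()
listing<-readLines('http://tbmmsd.s3.amazonaws.com/')
#convert to data.frame
df<-xmlToDataFrame(listing)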
#pull out files list
Files<-df$Key
#clean up NAs
Files2<-Files[!is.na(Files)]
#construct one hadoop copy command per file
code<-paste0("hadoop fs -cp s3://tbmmsd/", Files2, " /data/files/", Files2)
#get the list: run either of the two lines below (writeClipboard works on Windows only)
writeClipboard(code)
write.table(code, "file_list.txt", quote=FALSE, row.names=FALSE, col.names=FALSE)
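#optional sketch, not part of the original gist: quote both paths so that
#keys containing spaces or shell metacharacters survive when the commands are run.
#example_keys are made-up names for illustration, not real bucket contents;
#type="sh" is set explicitly because the default quoting style differs on Windows.
example_keys<-c("A.tsv.a", "dir/with space.txt")
src<-shQuote(paste0("s3://tbmmsd/", example_keys), type="sh")
dst<-shQuote(paste0("/data/files/", example_keys), type="sh")
example_code<-paste0("hadoop fs -cp ", src, " ", dst)
print(example_code)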