@Zoldin
Created July 21, 2017 20:42
train_test_splitting.R
#!/usr/bin/Rscript
# Usage: Rscript train_test_splitting.R <input.csv> <test_ratio> <seed> <train_out.csv> <test_out.csv>
library(caret)
# Read the five positional command-line arguments.
args <- commandArgs(trailingOnly = TRUE)
if (length(args) != 5) {
  stop("Five arguments must be supplied (input file name, test set split ratio, seed, train output file name, test output file name).\n", call. = FALSE)
}
# Seed the RNG so the split is reproducible.
set.seed(as.numeric(args[3]))
df <- read.csv(args[1], stringsAsFactors = FALSE)
# Stratified sample on the `label` column; p is the share of rows assigned to the test set.
test.index <- createDataPartition(df$label, p = as.numeric(args[2]), list = FALSE)
train <- df[-test.index, ]
test <- df[test.index, ]
write.csv(train, file = args[4], row.names = FALSE)
write.csv(test, file = args[5], row.names = FALSE)
print("train/test files created.")