@xiaohk
Last active September 20, 2017 02:35
A simple function that generates both the data file and the description file for the regression tree software GUIDE
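# Create a clean data file and a description file for GUIDE.
#   df          data frame holding the predictors and the response
#   y_name      name of the response column in df
#   output_name path of the data file to write
#   desc_name   path of the description file to write
prepare_for_guide = function(df, y_name, output_name, desc_name){
  # Find the index of the response variable y
  y_index = which(colnames(df) == y_name)
  if (length(y_index) != 1){
    stop("y_name must match exactly one column of df")
  }

  # Data file: no header and no quotes; missing values are written as "NA"
  write.table(df, output_name, row.names=FALSE, col.names=FALSE, quote=FALSE)

  # Description file header: data file name, missing value code,
  # and the line number of the first data record
  lines = c(output_name, "NA", "1")
  names = colnames(df)
  for (i in seq_along(names)){
    # 'd' = response, 'c' = categorical predictor, 'n' = numerical predictor
    if (i == y_index){
      column_type = 'd'
    } else if (is.factor(df[[i]]) || is.character(df[[i]])) {
      column_type = 'c'
    } else {
      column_type = 'n'
    }
    lines[length(lines)+1] = sprintf("%d %s %s", i, names[i], column_type)
  }
  # writeLines opens and closes the file itself when given a path
  writeLines(lines, desc_name)
}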
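
To make the description file layout concrete, here is a minimal sketch of the lines prepare_for_guide would be expected to write for a small data frame. The data frame, the column names, and the file name toy.txt are hypothetical and chosen only for illustration. The column types are computed with a vectorized equivalent of the loop rather than by calling the function, so the snippet runs on its own.

```r
# Hypothetical toy data: two predictors and a response named "income"
df <- data.frame(
  age    = c(23, 35, 51),
  sex    = factor(c("F", "M", "F")),
  income = c(30.5, 42.0, 58.3)
)
y_name <- "income"

# Same rule as the loop: 'd' for the response, 'c' for factors, 'n' otherwise
is_factor <- vapply(df, is.factor, logical(1))
types <- unname(ifelse(names(df) == y_name, "d", ifelse(is_factor, "c", "n")))

# Header (data file name, missing value code, first data line), then one line per column
desc_lines <- c("toy.txt", "NA", "1",
                sprintf("%d %s %s", seq_along(df), names(df), types))
writeLines(desc_lines)
# toy.txt
# NA
# 1
# 1 age n
# 2 sex c
# 3 income d
```

Calling prepare_for_guide(df, "income", "toy.txt", "toy_desc.txt") on the same data frame should write these six lines to toy_desc.txt.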