@tylerlittlefield
Last active December 20, 2018 21:09
Create a data.frame that describes each field
df <- data.frame(
  x1 = letters[1:10],
  x2 = 1:10,
  x3 = sample(c(TRUE, FALSE), 10, replace = TRUE),
  x4 = rnorm(10),
  x5 = as.factor(LETTERS[1:10]),
  stringsAsFactors = FALSE
)
describe <- function(x, class = TRUE) {
  # Todo: Investigate class/typeof, which should be the default?
  # https://stackoverflow.com/questions/21125222
  # Message the user
  message("Write a concise description for each column in your dataset.")
  message("Note: Press 'control + c' (or Esc in RStudio) to end this process.")
  # Loop through the column names and prompt the user to describe each one
  user_input <- sapply(names(x), function(nm) readline(prompt = paste0(nm, ": ")))
  # Stack the result to go from wide to long (columns: values, ind)
  input_stack <- stack(user_input)
  # Label the extra column after what it holds: "class" (default) or "type"
  kind <- if (class) "class" else "type"
  # Use if/else for control flow; ifelse() is meant for vectorised values.
  # Only the first class is kept so each column yields a single string.
  input_stack[[kind]] <- if (class) {
    sapply(x, function(col) base::class(col)[1])
  } else {
    sapply(x, typeof)
  }
  # Rename the variables
  names(input_stack) <- c("description", "variable", kind)
  # Return in my opinionated order
  input_stack[c("variable", kind, "description")]
}
describe(df)
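Because `readline()` returns an empty string immediately when R is not running interactively, `describe()` cannot be exercised from a script or a test. A minimal sketch of a non-interactive variant follows, assuming the descriptions are supplied up front as a named character vector. The name `describe_with()` and its `descriptions` argument are hypothetical and not part of the original gist.

```r
# Hypothetical helper: same output shape as describe(), but without prompts.
describe_with <- function(x, descriptions, class = TRUE) {
  # Require exactly one description per column, matched by name
  stopifnot(setequal(names(x), names(descriptions)))
  # Label the middle column after what it holds
  kind <- if (class) "class" else "type"
  # Keep only the first class so each column yields a single string
  info <- if (class) {
    vapply(x, function(col) base::class(col)[1], character(1))
  } else {
    vapply(x, typeof, character(1))
  }
  out <- data.frame(
    variable = names(x),
    kind = unname(info),
    description = unname(descriptions[names(x)]),
    stringsAsFactors = FALSE
  )
  names(out)[2] <- kind
  out
}

# Small example data set (made up for illustration)
example <- data.frame(id = 1:3, name = c("a", "b", "c"), stringsAsFactors = FALSE)
result <- describe_with(example, c(id = "Row identifier", name = "Display name"))
```

Matching descriptions by name rather than by position means the order of the supplied vector does not matter, at the cost of requiring a name for every column.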