@toyeiei
Last active August 15, 2019 07:52
generate data R for Excel users
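## generate vectors
set.seed(99)  # fix the seed so the random data is reproducible
id <- 1:100
# random five-letter names built from uppercase letters
student_name <- replicate(100, paste0(sample(LETTERS, size = 5, replace = TRUE), collapse = ""))
# codes 0/1 drawn with 70%/30% probability, then labelled "M"/"F"
gender <- factor(sample(c(0, 1), size = 100, replace = TRUE, prob = c(0.7, 0.3)),
                 labels = c("M", "F"))
age <- sample(19:32, size = 100, replace = TRUE)
gpa <- rnorm(n = 100, mean = 3.15, sd = 0.25)
gpa <- pmin(gpa, 4)  # cap GPA at 4.00
# codes 0/1/2 drawn with 50%/30%/20% probability, then labelled "TH"/"US"/"CH"
nationality <- factor(sample(c(0, 1, 2), size = 100, replace = TRUE, prob = c(0.5, 0.3, 0.2)),
                      labels = c("TH", "US", "CH"))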
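## how factor() pairs numeric codes with labels (a small sketch, not part of the original gist;
## `codes`, `demo`, and `demo_safe` are hypothetical names used only for illustration)
## factor() sorts the unique values and assigns `labels` in that order, so 0 -> "M" and 1 -> "F"
codes <- c(0, 1, 1, 0)
demo <- factor(codes, labels = c("M", "F"))
as.character(demo)  # "M" "F" "F" "M"
## if a code never appears in the sample, the labels no longer line up and factor() raises an error;
## passing `levels` explicitly is one way to guard against that
demo_safe <- factor(c(1, 1), levels = c(0, 1), labels = c("M", "F"))
as.character(demo_safe)  # "F" "F"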
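## create a data frame from the vectors
# stringsAsFactors = FALSE keeps student_name as character on R versions before 4.0
df <- data.frame(id, student_name, gender, age, gpa, nationality, stringsAsFactors = FALSE)
head(df)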
## create a lookup table mapping US student ids to states
library(dplyr)  # provides %>%, filter(), and pull()
set.seed(99)
us_students <- df %>% filter(nationality == "US") %>% pull(id)
lookup_state <- data.frame(us_students,
                           states = sample(state.name, size = length(us_students), replace = TRUE))
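## using a lookup table the way VLOOKUP is used in Excel (a sketch on toy data, not part of the
## original gist; `toy_students`, `toy_lookup`, and `toy_joined` are hypothetical names)
## merge() with all.x = TRUE keeps every row of the left table; ids with no match get NA
toy_students <- data.frame(id = 1:4, nationality = c("TH", "US", "CH", "US"))
toy_lookup <- data.frame(us_students = c(2, 4), states = c("Ohio", "Texas"))
toy_joined <- merge(toy_students, toy_lookup,
                    by.x = "id", by.y = "us_students", all.x = TRUE)
toy_joined
## assuming df and lookup_state exist as built above, the same call should apply to them:
## merge(df, lookup_state, by.x = "id", by.y = "us_students", all.x = TRUE)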