@leeper
Last active July 6, 2017 21:23
Base R Equivalents of the "Teach the Tidyverse to Beginners" Example
url <- "http://varianceexplained.org/files/Brauer2008_DataSet1.tds"
# Clean and tidy the data
d1 <- rio::import(url, format = "tsv")
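# Split NAME on "||" into its fields, drop the fifth, and name the remaining four
d2 <- cbind(d1, setNames(do.call(rbind.data.frame, strsplit(d1$NAME, " ?\\|\\| ?"))[, -5],
                         c("name", "BP", "MF", "systematic_name")))
# Sample columns (e.g. G0.05) are the ones whose second character is "0"
sample_cols <- names(d2)[grepl("^.0", names(d2))]
# Reshape to long format, derive nutrient and rate from the sample label,
# then drop missing or unnamed rows and the columns that are no longer needed
d3 <-
  subset(
    within(
      reshape(d2,
              varying = list(sample_cols),
              v.names = "expression",
              times = sample_cols,
              timevar = "sample",
              direction = "long"
      ), {
        nutrient <- substring(sample, 1, 1)
        rate <- as.numeric(gsub("^.{1}", "", sample))
      }),
    !is.na(expression) & systematic_name != "",
    -c(sample, id, NAME, GID, YORF, GWEIGHT)
  )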
# Visualize a set of four genes
library(ggplot2)
ggplot(subset(d3, BP == "leucine biosynthesis"),
       aes(rate, expression, color = nutrient)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  facet_wrap(~name + systematic_name)
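
The reshape() call is the densest step in the script above, so here is a minimal sketch of what it does, run on a made-up two-gene data frame. The `toy` data, its values, and the names `toy_long` and `sample_cols` are illustrative assumptions and are not part of the Brauer dataset or the original gist.

```r
# Toy stand-in for d2: two genes and two sample columns (values are made up)
toy <- data.frame(
  name  = c("geneA", "geneB"),
  G0.05 = c(0.1, -0.2),
  G0.1  = c(0.3, 0.4)
)

# Same column-selection rule as the script: second character is "0"
sample_cols <- names(toy)[grepl("^.0", names(toy))]

# Stack the sample columns into one "expression" column,
# recording the source column name in "sample"
toy_long <- reshape(toy,
  varying   = list(sample_cols),
  v.names   = "expression",
  times     = sample_cols,
  timevar   = "sample",
  direction = "long"
)

# Split the sample label into nutrient (first letter) and rate (the rest)
toy_long <- within(toy_long, {
  nutrient <- substring(sample, 1, 1)
  rate <- as.numeric(substring(sample, 2))
})

print(toy_long)
```

Because `toy` has no identifier column, reshape() adds one called `id`, which is why the script drops `id` in its final subset() call.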