@jvwong
Created June 22, 2018 16:46
Fisher's exact for Miller et al.
# Probability of GO term (GeneSet) in gene hits (Hits) from a screen
# Initialize our relevant variables
nBackground <- 407 # Total genes screened
nGeneSet <- 18 # Genes in GO term that are in the background
nHits <- 60 # Genes enriched in screen in background
nOutsideGeneSet <- nBackground - nGeneSet # Genes not in GeneSet but in background
nQueryPathway <- 0:nGeneSet # Overlap between query genes ('Hits') and GO term (in background)
# Use the dhyper built-in function for hypergeometric probability
probability <- dhyper(nQueryPathway, nGeneSet, nOutsideGeneSet, nHits, log = FALSE)
data <- data.frame( x = nQueryPathway, y = probability )
# Bar plot
library(ggplot2)
ggplot(data, aes(x = factor(x), y = y)) +
  theme(
    axis.text = element_text(size = 14),
    axis.title = element_text(size = 18, face = "bold"),
    axis.title.x = element_text(margin = margin(20, 0, 0, 0)),
    axis.title.y = element_text(margin = margin(0, 20, 0, 0))
  ) +
  geom_bar(stat = "identity", fill = "#2c3e50", colour = "black") +
  labs(x = "Hits in GeneSet", y = "Probability")
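# Note: summing over every possible overlap (0..nGeneSet) covers the whole
# distribution, so this value is 1. An enrichment p-value would sum only the
# probabilities at or above the observed overlap.
pval <- sum(probability)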
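
A one-sided enrichment p-value is the upper tail of the hypergeometric distribution, starting at the observed overlap between the hits and the gene set. The script does not state that overlap, so the sketch below uses a hypothetical value, `nObserved <- 6`, purely for illustration; it is not a result from the screen. Under that assumption, the tail sum from `dhyper`, the `phyper` upper tail, and a one-sided `fisher.test` on the corresponding 2x2 table should all agree.

```r
# Same counts as in the script above
nBackground <- 407
nGeneSet <- 18
nHits <- 60
nOutsideGeneSet <- nBackground - nGeneSet

# ASSUMPTION: hypothetical observed overlap, not taken from the original data
nObserved <- 6

# Upper tail P(X >= nObserved) as an explicit sum of point probabilities
pvalTail <- sum(dhyper(nObserved:nGeneSet, nGeneSet, nOutsideGeneSet, nHits))

# Same tail via the cumulative distribution: P(X > nObserved - 1)
pvalPhyper <- phyper(nObserved - 1, nGeneSet, nOutsideGeneSet, nHits,
                     lower.tail = FALSE)

# Same test as a one-sided Fisher's exact test on the 2x2 table
# rows: hit / not hit; columns: in gene set / outside gene set
contingency <- matrix(
  c(nObserved, nGeneSet - nObserved,
    nHits - nObserved, nOutsideGeneSet - (nHits - nObserved)),
  nrow = 2
)
pvalFisher <- fisher.test(contingency, alternative = "greater")$p.value
```

Using `phyper` avoids building the full probability vector, while `fisher.test` makes the contingency table explicit; which form to prefer is a matter of readability.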