@turgeonmaxime
Last active May 12, 2017 20:56
Mapping gene symbols to KEGG pathways
library(clusterProfiler)
library(org.Hs.eg.db)
library(KEGGREST)
library(dplyr)
library(magrittr)
# List of genes of interest
x <- c("GPX3", "GLRX", "LBP", "CRYAB", "DEFB1", "HCLS1", "SOD2", "HSPA2",
       "ORM1", "IGFBP1", "PTHLH", "GPC3", "IGFBP3", "TOB1", "MITF", "NDRG1",
       "NR1H4", "FGFR3", "PVR", "IL6", "PTPRM", "ERBB2", "NID2", "LAMB1",
       "COMP", "PLS3", "MCAM", "SPP1", "LAMC1", "COL4A2", "COL4A1", "MYOC",
       "ANXA4", "TFPI2", "CST6", "SLPI", "TIMP2", "CPM", "GGT1", "NNMT",
       "MAL", "EEF1A2", "HGD", "TCN2", "CDA", "PCCA", "CRYM", "PDXK",
       "STC1", "WARS", "HMOX1", "FXYD2", "RBP4", "SLC6A12", "KDELR3", "ITM2B")
# Need to specify an annotation package; here we use org.Hs.eg.db
# Note: the interface of bitr has changed; earlier versions used annoDb instead of OrgDb
map_symb2entrez <- clusterProfiler::bitr(x, fromType = "SYMBOL", toType = "ENTREZID", OrgDb = "org.Hs.eg.db")
# keggLink returns a named character vector: the names are pathway IDs and the
# values are gene IDs of the form 'hsa:####', so we strip the four-character
# 'hsa:' prefix to recover the ENTREZID
map_entrez2path <- KEGGREST::keggLink("hsa", "pathway") %>%
  substr(start = 5, stop = nchar(.)) %>%
  data.frame(PATH = names(.), ENTREZID = .,
             stringsAsFactors = FALSE)
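# A prefix-agnostic alternative to the fixed substr() offset above; a minimal
# sketch, not part of the original pipeline. strip_kegg_prefix is a
# hypothetical helper name, and the IDs in the example call are toy values.
strip_kegg_prefix <- function(ids) {
  # Drop everything up to and including the first colon, e.g. "hsa:" or "path:"
  sub("^[^:]*:", "", ids)
}
strip_kegg_prefix(c("hsa:10327", "path:hsa00010"))  # "10327" "hsa00010"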
# Inner join on ENTREZID to link gene symbols to pathways
# (genes without any KEGG pathway are dropped)
map_gene2path <- dplyr::inner_join(map_symb2entrez, map_entrez2path, by = "ENTREZID")
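# Both bitr() and the inner join drop genes that have no match, so it can help
# to list what was lost along the way. A minimal sketch: unmapped_symbols is a
# hypothetical helper, and it assumes the mapping has a SYMBOL column, as the
# output of bitr() with fromType = "SYMBOL" does.
unmapped_symbols <- function(symbols, mapping) {
  # Symbols of interest that never appear in the mapping table
  setdiff(symbols, unique(mapping$SYMBOL))
}
# Usage: unmapped_symbols(x, map_gene2path)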
# You can also get the names of each pathway
KEGGREST::keggGet(c("path:hsa00480", "path:hsa00590")) %>%
  sapply(function(query) query$NAME)
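# keggGet() returns at most 10 entries per call (a limit on the KEGG server
# side), so a longer list of pathway IDs has to be queried in batches.
# A minimal sketch: chunk_ids is a hypothetical helper, and its default batch
# size of 10 is taken from that limit.
chunk_ids <- function(ids, size = 10) {
  # Split a vector into consecutive groups of at most `size` elements
  unname(split(ids, ceiling(seq_along(ids) / size)))
}
# Usage (queries KEGG over the network, so it is left commented out):
# path_names <- unique(map_gene2path$PATH) %>%
#   chunk_ids() %>%
#   lapply(KEGGREST::keggGet) %>%
#   unlist(recursive = FALSE) %>%
#   sapply(function(query) query$NAME)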