other databases: https://twitter.com/tangming2005/status/1470513427080699905
GSEA/MSigDB has a cell-surface gene set, GOCC_CELL_SURFACE: https://www.gsea-msigdb.org/gsea/msigdb/cards/GOCC_CELL_SURFACE.html
Use the msigdbr package to get it: https://cran.r-project.org/web/packages/msigdbr/vignettes/msigdbr-intro.html
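library(msigdbr)
library(dplyr)  # provides %>%, filter() and pull()

# C5 holds the ontology gene sets; keep only the GO cell-surface set.
# Newer msigdbr releases may expect `collection` instead of `category`.
cell_surface_proteins <- msigdbr(species = "Homo sapiens", category = "C5") %>%
  filter(gs_name == "GOCC_CELL_SURFACE") %>%
  pull(human_gene_symbol)
head(cell_surface_proteins)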
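Once the gene set is a plain character vector, screening a gene list against it is a one-liner with `%in%`. Below is a minimal sketch in base R. It uses a small made-up stand-in for `cell_surface_proteins` and a hypothetical `genes_of_interest` vector (neither is real data), so it runs without downloading anything.

```r
# Toy stand-in for the vector returned by msigdbr (illustrative values only)
cell_surface_proteins <- c("CD19", "EGFR", "ERBB2", "CD3E")

# Hypothetical genes to screen, e.g. markers from a differential expression result
genes_of_interest <- c("EGFR", "TP53", "CD19", "GAPDH")

# Keep only the genes that appear in the cell-surface set,
# preserving the order of genes_of_interest
surface_hits <- genes_of_interest[genes_of_interest %in% cell_surface_proteins]
surface_hits
```

With the real vector, the same line works unchanged. Wrapping the gene set in `unique()` first is harmless if symbols are repeated.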
The MSigDB gene set is too restrictive, so use the surfaceome table below instead.
Download the data (the R code below expects the file under data/):
wget http://wlab.ethz.ch/surfaceome/table_S3_surfaceome.xlsx
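library(tidyverse)
library(janitor)
library(here)

# The real column names sit in the first data row, so promote that row to the header
surfaceome <- readxl::read_xlsx(here("data/table_S3_surfaceome.xlsx"))
colnames(surfaceome) <- surfaceome %>% slice(1) %>% unlist(use.names = FALSE)
surfaceome <- surfaceome %>%
  clean_names() %>%  # convert column names to snake_case
  slice(-1)          # drop the row that held the header
#View(surfaceome)

# Count entries per surfaceome label
table(surfaceome$surfaceome_label)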
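The table can then be reduced to the genes labelled as surface proteins. The sketch below uses a toy data frame in place of the real spreadsheet. `surfaceome_label` comes from the code above, while the `gene_symbol` column name and the `surface`/`nonsurface` label values are assumptions; check them against `colnames(surfaceome)` and the `table()` output before reusing this.

```r
# Toy stand-in for the cleaned surfaceome table (not real data)
surfaceome_toy <- data.frame(
  gene_symbol      = c("EGFR", "TP53", "CD19", "GAPDH"),  # assumed column name
  surfaceome_label = c("surface", "nonsurface", "surface", "nonsurface"),  # assumed values
  stringsAsFactors = FALSE
)

# Same tabulation as in the real workflow
label_counts <- table(surfaceome_toy$surfaceome_label)

# Keep the rows labelled as surface and pull out their gene symbols
is_surface <- surfaceome_toy$surfaceome_label == "surface"
surface_genes <- surfaceome_toy$gene_symbol[is_surface]
surface_genes
```

The resulting character vector can be used the same way as the MSigDB-derived one, for example with `%in%` to flag surface proteins in a gene list.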
There is a more recent paper on this: "The Cancer Surfaceome Atlas integrates genomic, functional and drug response data to identify actionable targets".
Another paper, "Surfaceome mapping of primary human heart cells with CellSurfer uncovers cardiomyocyte surface protein LSMEM2 and proteome dynamics in failing hearts", references several tools, including https://www.cellsurfer.net/surfacegenie