@sudevschiz
Created January 28, 2016 19:30
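## If using for the first time, install the annotation package first.
## Original (legacy) installer, retired in newer Bioconductor releases:
# source("https://bioconductor.org/biocLite.R")
# biocLite("org.Mm.eg.db")
## Current installer:
# if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
# BiocManager::install("org.Mm.eg.db")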
##Load the db to the environment
library("org.Mm.eg.db")
## The list of Ensembl gene IDs is stored in the file "ens.txt", one ID per line. Read it as a one-column table
my.ensmusg <- read.table("ens.txt",
                         sep = "\n",
                         col.names = c("gene_id"),
                         fill = FALSE,
                         strip.white = TRUE,
                         stringsAsFactors = FALSE)
## Convert the single-column table to a character vector
my.ensmusg1 <- as.character(my.ensmusg$gene_id)
## Check whether each Ensembl ID has an entry in the Ensembl-to-Entrez Gene map
a <- sapply(my.ensmusg1, function(x) exists(x, org.Mm.egENSEMBL2EG))
## Keep only the IDs that have a mapping
my.ensmusg.existed <- my.ensmusg1[a]
## Convert entire db to a list and choose the required gene_ids
xx <- as.list(org.Mm.egENSEMBL2EG)
out <- xx[my.ensmusg.existed]
## Write the output to a temporary file and read it back as a table. A direct attempt to transpose the out variable didn't work
write.table(out,"tmp.csv",sep=",")
tmp <- read.csv("tmp.csv",fill=FALSE,strip.white=TRUE)
## Take a transpose of the data
t_tmp <- t(tmp)
## Write the final output to a file.
write.table(t_tmp,"out_gene_id.csv",sep=",")
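## Alternative sketch (an addition, not part of the original workflow): the temporary-file
## round trip above might be avoided by flattening the named list directly in base R.
## Assumption: `out` is a named list of character vectors (Ensembl ID -> Entrez Gene IDs).
## The toy list `out_demo`, its made-up IDs, and the file name below are hypothetical.
out_demo <- list(ENSMUSG_A = "1",
                 ENSMUSG_B = c("2", "3"))
## One row per (Ensembl ID, Entrez ID) pair, so an ID that maps to several genes gets several rows
flat_demo <- data.frame(ensembl_id = rep(names(out_demo), lengths(out_demo)),
                        entrez_id = unlist(out_demo, use.names = FALSE),
                        stringsAsFactors = FALSE)
## To try this on the real data, replace out_demo with out and write the result, e.g.:
# write.csv(flat_demo, "out_gene_id_long.csv", row.names = FALSE)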