@friveroll
Created May 14, 2012 07:32
Get an input file for color-KEGG-pathways
##This script builds an input file for color-KEGG-pathways
##https://github.com/ajmazurie/color-KEGG-pathways/
##It is based on the SPIA results from this tutorial:
##http://gettinggeneticsdone.blogspot.mx/2012/03/pathway-analysis-for-high-throughput.html
##which uses this code:
##https://gist.github.com/1945349
##https://gist.github.com/1950232
#Load the stringr library, needed for text manipulation
#To install it, run:
#install.packages("stringr")
library("stringr")
#Get the link for the pathway you need from the SPIA results, in this case the 1st.
link <- spia_result$KEGGLINK[1]
#Manipulate the text to get all the info needed (KEGG_Pathway_ID,KEGG_obj_ID,value)
remove.link <- str_replace(link, "http://www.genome.jp/dbget-bin/show_pathway[?]", "")
texto <- gsub('[+]',',',remove.link)
texto.t <- read.table(textConnection(texto), sep = ",")
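#Illustration only: a minimal sketch of what the three steps above do to a link.
#The link below is an illustrative example written by hand, not a real SPIA result.
#It uses base R only, and local() keeps the temporary variables out of the workspace.
example.parsed <- local({
  example.link <- "http://www.genome.jp/dbget-bin/show_pathway?hsa04110+1017+1019+595"
  example.ids <- sub("http://www.genome.jp/dbget-bin/show_pathway[?]", "", example.link)
  #Result: one row, with the pathway ID in the first column and one gene ID per remaining column
  read.table(text = gsub("[+]", ",", example.ids), sep = ",")
})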
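#Get all the ENTREZ gene IDs from the SPIA KEGG link.
#The first column holds the pathway name, so it is skipped here rather than
#coerced to a number and filtered out afterwards (which would also drop gene ID 1).
genes.via <- as.numeric(unlist(texto.t[1, -1]))
#The pathway name is already excluded, so every remaining value is a gene ID
genes.via.sel <- genes.via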
#Get the KEGG pathway ID (the first field of the link, e.g. hsa04110)
via <- as.character(texto.t[,1])
#Get the KEGG organism code (the leading letters of the pathway ID)
org <- str_extract(via, "[a-z]+")
#Subset the logFC values to the genes linked to the pathway
genes.both <- sig_genes[which(names(sig_genes) %in% genes.via.sel)]
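#For reference, a small sketch of the subsetting step above on toy data.
#It assumes sig_genes is a named numeric vector of logFC values whose names are
#ENTREZ gene IDs, as the code above implies; all values here are made up.
example.subset <- local({
  toy.sig_genes <- c("1017" = 1.8, "595" = -2.1, "7157" = 0.9)
  toy.pathway.ids <- c(1017, 1019, 595)
  #%in% compares the character names with the numeric IDs after coercing both to character
  toy.sig_genes[names(toy.sig_genes) %in% toy.pathway.ids]
})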
#Convert the results to a data frame
genes.both.df <- as.data.frame(genes.both)
#KEGG_obj_ID
via.out <- c(paste(org ,names(genes.both), sep=":"))
#KEGG_Pathway_ID
ruta <- rep(via, length(via.out))
#Collect all the info needed for the input file as data.frame
color.input <- data.frame(KEGG_Pathway_ID=ruta, KEGG_obj_ID=via.out, value=genes.both.df[[1]])
#Write the input as a csv file.
write.table(color.input, paste(via, "color.input.csv", sep="."), sep=",", quote=FALSE, row.names = FALSE,
col.names = FALSE)
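#A possible extension, sketched under assumptions: wrap the steps above in a function
#so that several pathways can be exported in one go. make_color_input is a
#hypothetical helper name, not part of SPIA or color-KEGG-pathways; it uses base R only.
make_color_input <- function(link, logfc) {
  ids <- sub("http://www.genome.jp/dbget-bin/show_pathway[?]", "", link)
  parts <- strsplit(ids, "+", fixed = TRUE)[[1]]
  pathway <- parts[1]                    #first field is the pathway ID
  org <- sub("[0-9]+$", "", pathway)     #organism code: pathway ID minus its trailing digits
  hits <- logfc[names(logfc) %in% parts[-1]]
  data.frame(KEGG_Pathway_ID = rep(pathway, length(hits)),
             #sprintf returns a zero-length result when no genes match
             KEGG_obj_ID = sprintf("%s:%s", org, names(hits)),
             value = unname(hits),
             stringsAsFactors = FALSE)
}
#Usage, assuming spia_result and sig_genes exist as in the steps above:
#for (link in spia_result$KEGGLINK[1:3]) {
#  out <- make_color_input(link, sig_genes)
#  if (nrow(out) > 0) {
#    write.table(out, paste(out$KEGG_Pathway_ID[1], "color.input.csv", sep = "."),
#                sep = ",", quote = FALSE, row.names = FALSE, col.names = FALSE)
#  }
#}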