@aaronwolen
Created February 11, 2014 15:33
Download the latest GENCODE annotations file and export as a GRanges object
# Download the latest GENCODE annotations file and export as a GRanges object
library(RCurl)
library(rtracklayer)
# Variables ---------------------------------------------------------------
# 'human' or 'mouse'
organism <- "human"
# output directory
out.dir <- "~/Downloads"
# Functions ---------------------------------------------------------------
# List the entries of a remote (FTP) directory, one name per element
ls_url <- function(url) {
  stopifnot(url.exists(url))
  out <- getURL(url, ftp.use.epsv = FALSE, dirlistonly = TRUE)
  readLines(textConnection(out))
}
# Determine latest release ------------------------------------------------
ftp.url <- paste0("ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_", organism)
ftp.dirs <- ls_url(ftp.url)
ftp.dirs <- Filter(function(x) grepl("release", x), ftp.dirs)
releases <- sapply(strsplit(ftp.dirs, "_"), function(x) x[2])
# Sort by the numeric part of the release, then by any letter prefix/suffix
ftp.dirs <- ftp.dirs[order(as.numeric(sub("\\D+", "", releases)),
                           sub("\\d+", "", releases))]
message("Latest release is ", tail(ftp.dirs, 1))
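# Worked example of the ordering above. This is only a sketch: the directory
# names below are hypothetical and are not read from the server. Releases are
# sorted by their numeric part first and then by any letter suffix, so
# "release_3d" lands after "release_3c" and before "release_4".
demo.dirs <- c("release_10", "release_3c", "release_4", "release_3d")
demo.rel <- sapply(strsplit(demo.dirs, "_"), function(x) x[2])
demo.sorted <- demo.dirs[order(as.numeric(sub("\\D+", "", demo.rel)),
                               sub("\\d+", "", demo.rel))]
# demo.sorted is: "release_3c" "release_3d" "release_4" "release_10"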
# Download latest annotation.gtf file -------------------------------------
ftp.url <- paste0(ftp.url, "/", tail(ftp.dirs, 1), "/")
ftp.files <- ls_url(ftp.url)
gtf.file <- Filter(function(x) grepl("\\d\\.annotation\\.gtf", x), ftp.files)
gtf.url <- paste0(ftp.url, gtf.file)
# mode = "wb" keeps the gzipped file intact on Windows
download.file(gtf.url, destfile = file.path(out.dir, gtf.file), mode = "wb")
# Export GRanges object ---------------------------------------------------
gencode <- import.gff(file.path(out.dir, gtf.file), version = "2")
save(gencode, file = file.path(out.dir, sub("gtf\\.gz$", "rda", gtf.file)))