Skip to content

Instantly share code, notes, and snippets.

@astatham
Created December 15, 2010 02:35
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save astatham/741538 to your computer and use it in GitHub Desktop.
Save astatham/741538 to your computer and use it in GitHub Desktop.
library(Repitools)
library(chipseq)
library(BSgenome.Hsapiens.UCSC.hg18)
library(rtracklayer)
rootdir <- "/home/data/NGS_New"
NGS <- read.csv(paste(rootdir, "NGS_Log14.csv", sep="/"), stringsAsFactors=FALSE, header=T)
load(paste(NGS$Path[1], NGS$RdataGR[1], sep="/"))
#set chromosome lengths
seqlengths(rs) <- seqlengths(Hsapiens)[names(seqlengths(rs))]
#extend reads
rs <- resize(rs, 300)
#create coverages - setting the seqlengths on the GRanges sets the length of each chromosome in the coverage properly
rs.cov <- coverage(rs)
#sample the coverage object every 100bp
#I dont know when GenomeBlocks got changed to return a GRanges, I originally made it return a RangedData so maybe we need to make it return a specified type eg RangesList, GRanges or RangedData
tiled <- genomeBlocks(Hsapiens, names(rs.cov), 100)
#actually that creates a tiled GRanges with 100bp widths, we just want to sample 1bp every 100bp
tiled <- resize(tiled, 1) # this object can be now reused for every lane you want to extract data from
tiled.split <- split(start(tiled), as.character(seqnames(tiled)))
#and extract the reads, and normalise to the number of reads
values(tiled)$score <- unlist(lapply(names(rs.cov), function(x) as.integer(rs.cov[[x]][tiled.split[[x]]])))/length(rs)*1e6
#(optionally) drop any which have a score of 0 as they wont get plotted in UCSC anyway
tiled <- tiled[!values(tiled)$score==0]
#convert to a RangedData for export to a bigwig through rtracklayer
export(as(tiled, "RangedData"), "test.bw", seqlengths=seqlengths(Hsapiens)[names(rs.cov)])
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment