Skip to content

Instantly share code, notes, and snippets.

View al2na's full-sized avatar
💭
fishing 🎣

Altuna Akalin al2na

💭
fishing 🎣
View GitHub Profile
@al2na
al2na / readBismarkFiles.R
Last active August 30, 2023 15:47
read Bismark coverage and cytosine report files as methylKit objects
#' Read bismark coverage file as a methylKit object
#'
#' Bismark aligner can output methylation information per base in
#' multiple different formats. This function reads coverage files,
#' which have chr,start,end, number of cytosines (methylated bases)
#' and number of thymines (unmethylated bases).
#'
#' @param location a list or vector of file paths to coverage files
#'
#' @param sample.id a list or vector of sample ids
@al2na
al2na / ideoDMC.R
Last active June 7, 2023 16:02
ideogram for differential methylation values
#' function for making ideogram for differential methylation values
#' requires methylKit, ggbio and GenomicRanges
#'
#' @example
#' library(BSgenome)
#' library("BSgenome.Hsapiens.UCSC.hg18")
#' chr.len = seqlengths(Hsapiens) # get chromosome lengths
#' # remove X,Y,M and random chromosomes
#' chr.len = chr.len[grep("_|M|X|Y", names(chr.len), invert = T)]
#'
@al2na
al2na / callPrimer3.R
Last active December 11, 2021 19:48
calling primer3 from R
#' call primer3 for a given set of DNAstringSet object
#'
#' @param seq DNAstring object, one DNA string for the given amplicon
#' @param size_range default: '151-500'
#' @param Tm melting temprature parameters default:c(55,57,58)
#' @param name name of the amplicon in chr_start_end format
#' @param primer3 primer3 location
#' @param therme.param thermodynamic parameters folder
#' @param settings text file for parameters
#' @author Altuna Akalin modified Arnaud Krebs' original function
@al2na
al2na / readMethylRawList.para.R
Created January 3, 2015 12:26
read files in parallel
library(foreach)
library(methylKit)
library(doMC)
registerDoMC(2)
file.list=list( system.file("extdata", "test1.myCpG.txt", package = "methylKit"),
system.file("extdata", "test2.myCpG.txt", package = "methylKit"),
system.file("extdata", "control1.myCpG.txt", package = "methylKit"),
system.file("extdata", "control2.myCpG.txt", package = "methylKit") )
ids=c("test1","test2","ctrl1","ctrl2")
@al2na
al2na / annotateWithFeatures.R
Last active June 10, 2020 09:31
new annotation functions for genomation,
library(data.table)
library(genomation)
library(ggplot2)
#' Annotate given ranges with genomic features
#'
#' The function annotates a target GRangesList or GRanges object as overlapping
#' or not with
#' the elements of named GRangesList. This is useful to annotate your regions
@al2na
al2na / summarizeScoresOnRegions.R
Last active March 23, 2020 22:20
getting average score of a GRanges over another GRanges object. summarizeScores(loc.gr,score.gr,score.col)
.libPaths("/work2/gschub/altuna/RlibsDev")
require(data.table)
summarizeScores<-function(loc.gr,score.gr,score.col){
ov <- as.matrix(findOverlaps(loc.gr,score.gr))
dt=data.table(id=ov[,1],score=values(score.gr)[ov[,2],which(names(values(score.gr))==score.col)])
dt=dt[,list(av=mean(score)),by=id]
p300.url="http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeAwgTfbsUniform/wgEncodeAwgTfbsHaibH1hescP300V0416102UniPk.narrowPeak.gz"
Nanog.url="http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeAwgTfbsUniform/wgEncodeAwgTfbsHaibH1hescNanogsc33759V0416102UniPk.narrowPeak.gz"
SP1.url="http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeAwgTfbsUniform/wgEncodeAwgTfbsHaibH1hescSp1Pcr1xUniPk.narrowPeak.gz"
chrHMM.url="http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmH1hescHMM.bed.gz"
#http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeAwgSegmentation/wgEncodeAwgSegmentationChromhmmH1hesc.bed.gz"
# get chromHMM annotation and make a list out of it
@al2na
al2na / getOE.R
Last active January 4, 2016 02:59
get O/E ration and CpG ratio and GC content for given GRanges object and BSgenome. Returns a matrix of O/E ratio, CpGcontent and GC content
require(GenomicRanges)
require(Biostrings)
#' get OE ratio and GC content for a given set of DNAstrings
getOE.strset<-function(str.set)
{
di.mat=dinucleotideFrequency( str.set )
a.mat =alphabetFrequency( str.set ,baseOnly=TRUE )
exp=(a.mat[,2]*a.mat[,3])/width(str.set )
@al2na
al2na / tileMethylCounts2.R
Last active December 30, 2015 21:49
a version of tileMethylCounts where the chromosome names obtained from GRanges out of unique seqnames, so there will be no chr names that do not have coverage. see the discussion here: https://groups.google.com/forum/#!topic/methylkit_discussion/75y6dbrbCWI
setGeneric("tileMethylCounts2",
function(object,win.size=1000,step.size=1000,cov.bases=0)
standardGeneric("tileMethylCounts2") )
setMethod("tileMethylCounts2", signature(object="methylRaw"),
function(object,win.size,step.size,cov.bases){
g.meth =as(object,"GRanges")
chrs =as.character(unique(seqnames(g.meth)))
widths =seqlengths(g.meth)
@al2na
al2na / getCMethMatrix.R
Created November 11, 2013 15:13
getting a matrix methylation status matrix via QuasR
require(data.table)
require(QuasR)
# arguments:
# proj: qProject object
# range: GRanges object with ONE!!!! range
# samp: sample.name
#
getCMethMatrix<-function(proj,range,samp){
Cs=qMeth(proj, query=range,mode="allC",reportLevel="alignment")