Skip to content

Instantly share code, notes, and snippets.

@gklambauer
Forked from sonali-bioc/cnv-cn.mops.R
Last active October 29, 2016 21:24
Show Gist options
  • Save gklambauer/8955203 to your computer and use it in GitHub Desktop.
Save gklambauer/8955203 to your computer and use it in GitHub Desktop.
library(cn.mops)
library(RUnit)
tumor_gr <- getReadCountsFromBAM("tumorA.chr4.bam",
refSeqName="chr4",WL=10000,mode="unpaired")
normal_gr <- getReadCountsFromBAM("normalA.chr4.bam",
refSeqName="chr4",WL=10000,mode="unpaired")
# We need a special normalization because the tumor has made large CNVs
X <- tumor_gr
values(X) <- cbind(values(tumor_gr),values(normal_gr))
X <- normalizeGenome(X,normType="mode")
# Parameter settings for tumor:
# - norm=0, because we already have normalized
# - integer copy numbers higher than 8 allowed
# - DNAcopy as segmentation algorithm.
ref_analysis_norm0 <- referencecn.mops(X[,1], X[,2],
norm=0,
I = c(0.025, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 8,16,32,64),
classes = c("CN0", "CN1", "CN2", "CN3", "CN4", "CN5", "CN6", "CN7","CN8","CN16","CN32","CN64","CN128"),
segAlgorithm="DNAcopy")
resCNMOPS <- calcIntegerCopyNumbers(ref_analysis_norm0)
resCNMOPS <-cn.mops:::.replaceNames(resCNMOPS, "tumorA.chr4.bam","tumor")
png("cn.mops-segplot.png")
segplot(resCNMOPS)
dev.off()
cnvs(resCNMOPS) #look at individual CNV regions
#-------------------------------------------------------------------------------
test_cnv_cn.mops <-
function()
{
BamFiles <- list.files(system.file("extdata", package="cn.mops") ,
pattern=".bam$", full.names=TRUE)
bamDataRanges <- getReadCountsFromBAM(BamFiles,
sampleNames=paste("Sample",1:2), mode="unpaired")
checkEquals(856, bamDataRanges$Sample.1[1])
data(cn.mops)
got <- cn.mops(XRanges[,1:3])
checkEquals(6,length(cnvs(got)))
checkEquals(1775001 ,start(ranges(cnvr(got))[1]))
checkEquals(1850000, end(ranges(cnvr(got))[1]))
checkEquals(75000, width(ranges(cnvr(got))[1]))
}
@jowkar
Copy link

jowkar commented Oct 29, 2016

Hi!

I have some questions regarding the referencecn.mops function and how to perform a tumor/normal analysis for the purpose of detecting somatic variants:

How does the referencecn.mops function operate when multiple tumors and normals are provided? I ran it using 20 tumors and 4 normals in one go, where the 4 normals matched 4 of the tumors. I did not perform any special normalization (merely setting norm=0, which may or may not be the right thing to do).

First of all: is this correct, or is it better to run the function on one pair at a time as in this gist example?

Second: I noticed that some of the CNVs that were detected in tumor samples by running it this way were also present in some of the normal samples. I suppose it would be correct to therefore label them variants, but is not the purpose of the referencecn.mops function to filter out variants that are present in the normal samples? Did I misunderstand the function, or are they just false positives?

Third: if it is indeed correct to run the function with different numbers of tumors and normals, would it be correct, for the purpose of normalization, to modify the above code to:

normal_gr <- bamDataRanges[,c(5,17,21,23)]
tumor_gr <- bamDataRanges[,-c(5,17,21,23)]
X <- tumor_gr
values(X) <- cbind(values(tumor_gr),values(normal_gr)) 
X <- normalizeGenome(X,normType="mode")

ref_analysis_norm0 <- referencecn.mops(X[,1:20], X[,21:24],
                                       norm=0,
                                       I = c(0.025, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 8,16,32,64),
                                       classes = c("CN0", "CN1", "CN2", "CN3", "CN4", "CN5", "CN6", "CN7","CN8","CN16","CN32","CN64","CN128"),
                                       segAlgorithm="DNAcopy")

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment