Oliver Hofmann ohofmann

  • University of Melbourne
  • Melbourne, Australia
details:
  - analysis: variant2
    genome_build: GRCh37
    algorithm:
      # Alignment parameters
      aligner: bwa
      recalibrate: false
      realign: false
      mark_duplicates: true
      remove_lcr: false
@ohofmann
ohofmann / sig.Rmd
Last active December 7, 2017 02:18
R postprocess
---
title: "UMCCR Patient Summary"
author: "Oliver Hofmann"
date: "`r Sys.Date()`"
output:
  html_document:
    theme: readable
    toc: false
    toc_float: false
    code_folding: hide
ohofmann / postprocess.md
Created December 6, 2017 05:23
Postprocessing notes

Patient analysis notes

Set up result/sample names:

BCINSTALL="/data/projects/punim0010/local/share/bcbio/"
BCRESULT="/data/cephfs/punim0010/data/Results/Avner/WPT-013/final"
BCFINAL="2017-10-19_WPT-013"
BCPOST="/data/cephfs/punim0010/projects/Hofmann_Explore/testrun"
EXTRAS="/data/projects/punim0010/local/share/extras/"
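Before running anything against these locations, it can help to confirm that the directories actually exist. The following is a minimal sketch, not part of the original notes: `check_dirs` is a hypothetical helper name, and it assumes the variables above point at directories on the current machine.

```shell
# Hypothetical helper (an assumption, not from the original notes):
# print every argument that is not an existing directory and
# return non-zero if at least one is missing.
check_dirs() {
    missing=0
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "Missing directory: $dir" >&2
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}
```

With the variables set, a call such as `check_dirs "$BCINSTALL" "$BCRESULT" "$BCPOST" "$EXTRAS"` would report any path that is not available. `BCFINAL` is left out because it looks like a run name rather than a standalone directory.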

Code of Conduct

This code of conduct is based on Titus Brown's version (with his permission).

Lab Code of Conduct

All members of the lab, along with visitors, are expected to agree with the following code of conduct. We will enforce this code as needed. We expect cooperation from all members to help ensure a safe environment for everybody.

The Quick Version

2016-07-27 23:57:53.400 [IPEngineApp] Using existing profile dir: u'/scratch/hofmann/projects/NA12878-exome-eval/work_cloud_dist/log/ipython'
2016-07-27 23:57:53.419 [IPEngineApp] Loading url_file u'/scratch/hofmann/projects/NA12878-exome-eval/work_cloud_dist/log/ipython/security/ipcontroller-1c9737c4-68fd-49be-b74e-85d8bdda4c49-engine.json'
2016-07-27 23:57:53.461 [IPEngineApp] Registering with controller at tcp://45.113.233.138:59195
2016-07-27 23:57:53.623 [IPEngineApp] Starting to monitor the heartbeat signal from the hub every 5010 ms.
2016-07-27 23:57:53.634 [IPEngineApp] Using existing profile dir: u'/scratch/hofmann/projects/NA12878-exome-eval/work_cloud_dist/log/ipython'
2016-07-27 23:57:53.637 [IPEngineApp] Completed registration with id 1
2016-07-28 00:14:25.604 [IPEngineApp] WARNING | No heartbeat in the last 5010 ms (1 time(s) in a row).
cat: write error: Broken pipe
cat: write error: Broken pipe
2016-07-28 00:20:01.180 [IPEngineApp] Exception in apply request:
ohofmann / x10.md
Last active February 27, 2016 10:14
Illumina X Duplication Check

We are in the process of troubleshooting a relatively new Illumina X10 installation with different libraries, including Genome in a Bottle NA12878 samples. Preliminary results look comparable to Illumina's Platinum Genome data as well as data from other X10 facilities:

However, we still need to get a handle on our duplication rates. The Xs (and the 4000s) are known to struggle with higher duplication rates when flowcells are 'underloaded': molecules migrate to nearby empty nanowells, which in turn increases the optical duplication rate. Even with that caveat, duplication rates should be in the 15-20% range, whereas ours tend to be quite a bit higher:

> summary(dup)
       V1       
 Min.   
---
title: "QC Report"
author: "Automatically generated"
date: "tba"
output:
  html_document:
    toc: true
    theme: united
---
qc = read.table(file.path(path_results, "metrics", "metrics.tsv"),
                header=TRUE, sep="\t", check.names=FALSE,
                colClasses=list("sample"="character"))
rownames(qc) = qc$sample
qc$Mapped_reads_pct = as.numeric(gsub("%", "", qc$Mapped_reads_pct))
qc$Duplicates_pct = as.numeric(gsub("%", "", qc$Duplicates_pct))
metrics = c("sample", "Total_reads", "Mapped_reads_pct", "Duplicates_pct",
            "offtarget",
            "%GC", "Sequence_length", "Median_insert_size")
library(tidyr)
library(dplyr)
# Import/concat BED coverage files
# list.files() takes a regular expression, not a shell glob
file_list <- list.files(path='coverage/', pattern='_coverage_fixed\\.bed$')
for (file in file_list){
  # if the merged dataset does exist, append to it
  if (exists("dataset")){
    temp_dataset <- read.table(file.path('coverage', file),
Individual filtered VCFs vs filtered standard (variants present in 5/9) samples, `rtg vcfeval -b Shared.vcf.gz -c foo.vcf.gz -t /cm/shared/apps/bcbio/20150720-devel/data/genomes/Hsapiens/GRCh37/rtg/GRCh37.sdf -o eval`:
```
File                        True-pos  False-pos  False-neg  Precision  Sensitivity  F-measure
filtered.Fresh_4706          3338450      54119     339450     0.9840       0.9077     0.9443
filtered.Fresh_4707          3409349     154908     268551     0.9565       0.9270     0.9415
filtered.Fresh_Normal_4708   3464245     190615     213655     0.9478       0.9419     0.9449
filtered.UMFIX_4085          3134462     145638     543438     0.9556       0.8522     0.9010
filtered.UMFIX_4090          3101645     105139     576255     0.9672       0.8433     0.9010
filtered.UMFIX_Normal_4088   3268466     157860     409434     0.9539       0.8887     0.9201