@heuermh
Created October 26, 2020 19:33

freebayes performance notes

Small dataset: GIAB whole exome, chromosomes 21 and 22 only

BAM → VCF

On a laptop with 8 cores and 32 GB RAM

$ time freebayes --fasta-reference /data/Homo_sapiens_assembly19.fasta --strict-vcf /data/151002_7001448_0359_AC7F6GANXX_Sample_HG002-EEogPU_v02-KIT-Av5_AGATGTAC_L008.posiSrt.markDup.21.22.bam > /data/151002_7001448_0359_AC7F6GANXX_Sample_HG002-EEogPU_v02-KIT-Av5_AGATGTAC_L008.posiSrt.markDup.21.22.vcf

real  3m1.476s
user  2m59.293s
sys   0m1.480s

$ time ./bin/cannoli-submit --driver-memory 8g --executor-memory 8g -- freebayes -reference /data/Homo_sapiens_assembly19.fasta -single /data/151002_7001448_0359_AC7F6GANXX_Sample_HG002-EEogPU_v02-KIT-Av5_AGATGTAC_L008.posiSrt.markDup.21.22.bam /data/151002_7001448_0359_AC7F6GANXX_Sample_HG002-EEogPU_v02-KIT-Av5_AGATGTAC_L008.posiSrt.markDup.21.22.vcf

real   3m12.600s
user  11m22.074s
sys    0m12.315s
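
To compare runs like the two above, the `time` values can be converted to seconds, and (user + sys) / real gives a rough estimate of how many cores were busy on average. This is only a minimal sketch: the helper names `to_seconds` and `cpu_ratio` are hypothetical and not part of freebayes or cannoli, and the parser assumes the bash `NmS.SSSs` format shown in these notes.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a bash `time` value such as '3m1.476s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp.strip())
    if match is None:
        raise ValueError(f"unrecognized time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def cpu_ratio(real: str, user: str, system: str) -> float:
    """(user + sys) / real: roughly how many cores were busy on average."""
    return (to_seconds(user) + to_seconds(system)) / to_seconds(real)


# (real, user, sys) copied from the two runs above.
runs = {
    "freebayes": ("3m1.476s", "2m59.293s", "0m1.480s"),
    "cannoli freebayes -single": ("3m12.600s", "11m22.074s", "0m12.315s"),
}

for name, (real, user, system) in runs.items():
    wall = to_seconds(real)
    ratio = cpu_ratio(real, user, system)
    print(f"{name}: {wall:.1f} s wall clock, ~{ratio:.1f} cores busy")
```

On this small input the cannoli run keeps several cores busy but does not finish sooner, which suggests fixed startup and partitioning overhead dominates at this size.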

With docker

$ time docker run -i -v /data:/data --rm quay.io/biocontainers/freebayes:1.3.2--py38h40864fe_2 freebayes --fasta-reference /data/Homo_sapiens_assembly19.fasta --strict-vcf /data/151002_7001448_0359_AC7F6GANXX_Sample_HG002-EEogPU_v02-KIT-Av5_AGATGTAC_L008.posiSrt.markDup.21.22.bam > /data/151002_7001448_0359_AC7F6GANXX_Sample_HG002-EEogPU_v02-KIT-Av5_AGATGTAC_L008.posiSrt.markDup.21.22.vcf

real  2m55.908s
user  0m1.111s
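sys   0m1.921s

Note: user and sys above cover only the docker client process; freebayes runs inside the container, so its CPU time is not counted.
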
$ time ./bin/cannoli-submit --driver-memory 8g --executor-memory 8g -- freebayes -use_docker -single -reference /data/Homo_sapiens_assembly19.fasta /data/151002_7001448_0359_AC7F6GANXX_Sample_HG002-EEogPU_v02-KIT-Av5_AGATGTAC_L008.posiSrt.markDup.21.22.bam /data/151002_7001448_0359_AC7F6GANXX_Sample_HG002-EEogPU_v02-KIT-Av5_AGATGTAC_L008.posiSrt.markDup.21.22.vcf

real  3m44.914s
user  9m17.129s
sys   0m16.778s

Parquet Alignments → Parquet VariantContexts

On a laptop with 8 cores and 32 GB RAM

$ time ./bin/cannoli-submit --driver-memory 8g --executor-memory 8g -- freebayes -reference /data/Homo_sapiens_assembly19.fasta /data/151002_7001448_0359_AC7F6GANXX_Sample_HG002-EEogPU_v02-KIT-Av5_AGATGTAC_L008.posiSrt.markDup.21.22.alignments.adam /data/151002_7001448_0359_AC7F6GANXX_Sample_HG002-EEogPU_v02-KIT-Av5_AGATGTAC_L008.posiSrt.markDup.21.22.variantContexts.adam

real   2m26.583s
user  15m39.566s
sys    0m15.045s

With docker

$ time ./bin/cannoli-submit --driver-memory 8g --executor-memory 8g -- freebayes -use_docker -reference /data/Homo_sapiens_assembly19.fasta /data/151002_7001448_0359_AC7F6GANXX_Sample_HG002-EEogPU_v02-KIT-Av5_AGATGTAC_L008.posiSrt.markDup.21.22.alignments.adam /data/151002_7001448_0359_AC7F6GANXX_Sample_HG002-EEogPU_v02-KIT-Av5_AGATGTAC_L008.posiSrt.markDup.21.22.variantContexts.adam

real  3m2.373s
user  9m44.724s
sys   0m18.408s

Large dataset: GIAB whole exome

BAM → VCF

On a laptop with 8 cores and 32 GB RAM

$ time freebayes --fasta-reference /data/hs37d5.fa --strict-vcf /data/151002_7001448_0359_AC7F6GANXX_Sample_HG002-EEogPU_v02-KIT-Av5_AGATGTAC_L008.posiSrt.markDup.bam > /data/151002_7001448_0359_AC7F6GANXX_Sample_HG002-EEogPU_v02-KIT-Av5_AGATGTAC_L008.posiSrt.markDup.vcf

real  94m21.787s
user  89m34.957s
sys    0m35.746s

$ time ./bin/cannoli-submit --driver-memory 8g --executor-memory 8g -- freebayes -single -reference /data/hs37d5.fa /data/151002_7001448_0359_AC7F6GANXX_Sample_HG002-EEogPU_v02-KIT-Av5_AGATGTAC_L008.posiSrt.markDup.bam /data/151002_7001448_0359_AC7F6GANXX_Sample_HG002-EEogPU_v02-KIT-Av5_AGATGTAC_L008.posiSrt.markDup.vcf

real   58m0.169s
user  437m31.885s
sys     5m28.345s

Parquet Alignments → Parquet VariantContexts

On a laptop with 8 cores and 32 GB RAM

$ time ./bin/cannoli-submit --driver-memory 8g --executor-memory 8g -- freebayes -reference /data/hs37d5.fa /data/151002_7001448_0359_AC7F6GANXX_Sample_HG002-EEogPU_v02-KIT-Av5_AGATGTAC_L008.posiSrt.markDup.alignments.adam /data/151002_7001448_0359_AC7F6GANXX_Sample_HG002-EEogPU_v02-KIT-Av5_AGATGTAC_L008.posiSrt.markDup.variantContexts.adam

real   53m22.312s
user  404m47.057s
sys     5m13.893s
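
The whole-exome wall-clock times can be summarized as speedups relative to the single-process freebayes run. This is a sketch under stated assumptions: the `speedups` helper and the dictionary labels are hypothetical, and the numbers are the `real` values copied from the three runs in this section.

```python
# Wall-clock ("real") times from the whole-exome runs, as (minutes, seconds).
real_times = {
    "freebayes, BAM to VCF": (94, 21.787),
    "cannoli -single, BAM to VCF": (58, 0.169),
    "cannoli, Parquet to Parquet": (53, 22.312),
}


def speedups(times: dict, baseline: str) -> dict:
    """Return baseline wall-clock time divided by each run's wall-clock time."""
    base_minutes, base_seconds = times[baseline]
    base = base_minutes * 60 + base_seconds
    return {
        name: base / (minutes * 60 + seconds)
        for name, (minutes, seconds) in times.items()
    }


result = speedups(real_times, "freebayes, BAM to VCF")
for name, factor in result.items():
    print(f"{name}: {factor:.2f}x")
```

With these inputs both cannoli runs come out well under 2x faster than plain freebayes on 8 cores, with the Parquet path slightly ahead of the BAM path.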