Aaron Quinlan arq5x

@arq5x
arq5x / test.sh
Last active November 30, 2023 12:50
Compress and then decompress a string with zlib.
# compile
$ g++ zlib-example.cpp -lz -o zlib-example
# run
$ ./zlib-example
Uncompressed size is: 36
Uncompressed string is: Hello Hello Hello Hello Hello Hello!
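For comparison, the same round trip can be sketched in Python with the standard-library `zlib` module. This is only an illustration, not the gist's C++ source; the variable names are made up for the example, and the input is the 36-character string from the output above.

```python
import zlib

# The string printed by the example run above (36 characters).
text = "Hello Hello Hello Hello Hello Hello!"
raw = text.encode("ascii")

# Compress, then decompress, and check that nothing was lost.
compressed = zlib.compress(raw)
restored = zlib.decompress(compressed).decode("ascii")

print("Uncompressed size is:", len(raw))
print("Compressed size is:", len(compressed))
print("Round trip matches:", restored == text)
```

Because the input is highly repetitive, the compressed form should be noticeably smaller than the original 36 bytes.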
----------
#!/usr/bin/env python
"""
given a callable file from goleft depth on stdin, merge adjacent LOW_COVERAGE and CALLABLE regions.
also split regions that are larger than max_region.
this is to even out parallelism for sending regions to freebayes.
"""
from __future__ import print_function, division
import sys
from itertools import groupby
from operator import itemgetter
chr1 10
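The preview above is cut off before the logic, so the following is only a rough sketch of what the docstring describes, not the gist's actual code. It assumes four-column rows of (chrom, start, end, label) sorted by position, that abutting LOW_COVERAGE and CALLABLE rows are merged and relabeled CALLABLE, and that `max_region` is a plain base-pair limit. The function names `merge_adjacent` and `split_large` are hypothetical.

```python
MERGEABLE = {"LOW_COVERAGE", "CALLABLE"}  # labels to merge (assumption)

def merge_adjacent(rows):
    """Merge abutting LOW_COVERAGE/CALLABLE rows on the same chromosome."""
    merged = []
    for chrom, start, end, label in rows:
        keep = label in MERGEABLE
        prev = merged[-1] if merged else None
        if (keep and prev is not None and prev[3] == "CALLABLE"
                and prev[0] == chrom and prev[2] == start):
            # Extend the previous merged region instead of adding a new one.
            merged[-1] = (chrom, prev[1], end, "CALLABLE")
        else:
            merged.append((chrom, start, end, "CALLABLE" if keep else label))
    return merged

def split_large(rows, max_region):
    """Yield rows cut into pieces no longer than max_region bases."""
    for chrom, start, end, label in rows:
        for s in range(start, end, max_region):
            yield (chrom, s, min(s + max_region, end), label)

rows = [
    ("chr1", 0, 100, "CALLABLE"),
    ("chr1", 100, 250, "LOW_COVERAGE"),
    ("chr1", 250, 300, "NO_COVERAGE"),
    ("chr1", 300, 400, "CALLABLE"),
]
merged = merge_adjacent(rows)
pieces = list(split_large(merged, max_region=100))
for piece in pieces:
    print(*piece, sep="\t")
```

Splitting after merging keeps every piece at or below `max_region`, which is what evens out the work sent to each parallel job.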
arq5x / Overview.md
Last active November 17, 2022 23:03
Analysis of rare disease VCF with bcftools


Big Picture - You are the genome detective.

Here's the deal.

  • We have sequenced the exomes of an affected child and their unaffected parents. The child has a rare skin condition.
  • We aligned their exome data to the human reference genome.
  • We called variants using GATK.
  • The resulting VCF file is called trio.trim.vep.vcf.gz.
arq5x / all_the_seqs.txt
Created November 14, 2022 23:46
toy data for CSHL.
GTTGTACTTCGTTCAATCGGTAGGTGTTTAACCGGATGGTCACGCCTACC
CAAGCATACTTCATTCAGTCAGGCGAAATTATTGCCAGGTCGCCGCCTAC
GTTGTACTTCGTTCAGTCGGTGGTGTTTAACTGGGTCATCGCCTACCGTG
AGTAATACTTCGTTCAGTTTGTGGAAGGTAGTGTTTAACCGGTTGCTCGC
GGTATGCGCTGGTCAAATCGGAGAGTGGGTGTTTATCGGATGGATCGCTG
CGGTGCGTGCTGTGATTCATCTTTGACTGGTGTTTATGGTCGGTCTTTAC
CATTGTACTTCCCGTTCAGTTTATCAAATTTGGTGTTTATATGAACCATA
GATGGTGGTGGTGGCGGTGGCGCTGGTGTTGCATGGTGTTGCGCATTATT
GTATCGCGTTCAGTTTCGGAAGGTGGTGTTTAACCAGTTCGCCGCCTACC
GTTTTTTCGTTCATTTGGTACGGTGTTGCGCCGGTCGCCGCCTACCGTGA
arq5x / inheritance_scenarios.md
Last active March 28, 2022 21:45
mendelian violations
dad      mom      kid      Inheritance description
HOM_REF  HOM_REF  HOM_REF  Expected
HOM_REF  HOM_REF  HET      Mendelian violation (plausible de novo)
HOM_REF  HOM_REF  HOM_ALT  Mendelian violation (implausible de novo)
HOM_REF  HOM_ALT  HOM_REF  Mendelian violation (uniparental disomy)
HOM_REF  HOM_ALT  HET      Expected
HOM_REF  HOM_ALT  HOM_ALT  Mendelian violation (uniparental disomy)
HOM_REF  HET      HOM_REF  Expected
HOM_REF  HET      HET      Expected
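The "Expected" versus "Mendelian violation" split in these rows follows from one rule: the child must receive one allele from each parent. A minimal Python sketch of that rule is below. The `ALLELES` mapping and the `is_expected` name are illustrative assumptions, and the sketch does not try to reproduce the finer labels (de novo, uniparental disomy).

```python
# Each genotype class as the pair of alleles it carries.
ALLELES = {
    "HOM_REF": ("ref", "ref"),
    "HET": ("ref", "alt"),
    "HOM_ALT": ("alt", "alt"),
}

def is_expected(dad, mom, kid):
    """True if kid's genotype can be built from one allele of each parent."""
    kid_alleles = sorted(ALLELES[kid])
    return any(
        sorted((d, m)) == kid_alleles
        for d in ALLELES[dad]
        for m in ALLELES[mom]
    )

# The eight trios listed above.
trios = [
    ("HOM_REF", "HOM_REF", "HOM_REF"),
    ("HOM_REF", "HOM_REF", "HET"),
    ("HOM_REF", "HOM_REF", "HOM_ALT"),
    ("HOM_REF", "HOM_ALT", "HOM_REF"),
    ("HOM_REF", "HOM_ALT", "HET"),
    ("HOM_REF", "HOM_ALT", "HOM_ALT"),
    ("HOM_REF", "HET", "HOM_REF"),
    ("HOM_REF", "HET", "HET"),
]
results = [is_expected(*trio) for trio in trios]
for trio, ok in zip(trios, results):
    print(*trio, "Expected" if ok else "Mendelian violation", sep="\t")
```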
arq5x / go.sh
Last active November 23, 2021 02:38
compute average scores for shared intervals
cat a.bed
chr1 10 50 10
cat b.bed
chr1 20 40 20
cat c.bed
chr1 30 33 30
# Find the sub-intervals shared and unique to each file.
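The preview stops before the commands that do this, so here is a small Python sketch of the idea rather than the gist's own pipeline. It assumes all intervals sit on one chromosome, cuts the range at every start and end coordinate, and averages the scores of the files covering each piece. The function name `average_shared` is hypothetical; the three intervals are the ones shown in a.bed, b.bed, and c.bed above.

```python
def average_shared(intervals):
    """intervals: list of (start, end, score) on a single chromosome.

    Returns (start, end, n_files, mean_score) for each covered sub-interval.
    """
    # Every start and end coordinate is a potential breakpoint.
    points = sorted({p for s, e, _ in intervals for p in (s, e)})
    out = []
    for lo, hi in zip(points, points[1:]):
        # Scores of the input intervals that fully cover this piece.
        scores = [sc for s, e, sc in intervals if s <= lo and hi <= e]
        if scores:
            out.append((lo, hi, len(scores), sum(scores) / len(scores)))
    return out

# chr1 rows from a.bed, b.bed and c.bed.
intervals = [(10, 50, 10), (20, 40, 20), (30, 33, 30)]
shared = average_shared(intervals)
for start, end, n, mean in shared:
    print("chr1", start, end, n, mean, sep="\t")
```

Pieces covered by one file keep that file's score, while pieces shared by several files get the mean of the overlapping scores.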
arq5x / make-master-hmm.sh
Created July 18, 2012 20:19
For Gemini: Create a master ChromHMM track from the 9 distinct cell types.
echo "http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmGm12878HMM.bed.gz
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmH1hescHMM.bed.gz
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmHepg2HMM.bed.gz
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmHmecHMM.bed.gz
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmHsmmHMM.bed.gz
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmHuvecHMM.bed.gz
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmK562HMM.bed.gz
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmNhekHMM.bed.gz
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmNhlfHMM.bed.gz" \
> chromhmm-files.txt
arq5x / autosomal-dominant.sh
Last active November 13, 2019 17:55
GEMINI Tutorial Commands
# assumes you have SSH'ed and qlogin'ed
cd thu
cd mydata
# slide 5
curl https://s3.amazonaws.com/gemini-tutorials/trio.trim.vep.vcf.gz > trio.trim.vep.vcf.gz
curl https://s3.amazonaws.com/gemini-tutorials/dominant.ped > dominant.ped
gemini load --cores 2 \
-v trio.trim.vep.vcf.gz \
-t VEP \
arq5x / rest-example.md
Last active July 23, 2019 00:16
Example use of a RESTful API to GEMINI databases.

Load a GEMINI database from a VCF

$ gemini load -v nobel.vcf -t VEP --cores 23 -p samples.ped nobel.db

Launch the GEMINI web server

(this will run on your local machine on port 8088)

$ gemini browser nobel.db