Aaron Quinlan arq5x
@arq5x
arq5x / macs.sh
Last active August 29, 2015 13:56
Examples for generating haplotypes with Macs
DL: https://code.google.com/p/macs/
# simulate:
# 100 individuals (200 haplotypes)
# "genome" is 1Mb (1e6)
# mutation and recombination rates of 0.001
macs 200 1e6 -T -t .001 -r .001 > 200.macs
# peek at the file:
grep SITE: 200.macs | head
@arq5x
arq5x / summary.md
Last active August 29, 2015 13:57
Discrepancies in PolyPhen2 predictions.

Example

Nonsynonymous change in question (human, build 37; 1-based coordinate)
chrom: chr22 
pos:   24379402
ref:   T
alt:   G
@arq5x
arq5x / go.sh
Created April 9, 2014 18:15
Track GitHub repo release download info, etc.
curl -i https://api.github.com/repos/arq5x/bedtools2/releases
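The releases endpoint returns JSON in which each release asset carries a `download_count` field. A minimal sketch of totalling those counts with standard tools follows; the helper name `sum_downloads` is hypothetical, and the text matching is a rough stand-in for a real JSON parser.

```shell
# Hypothetical helper: sum every "download_count" value found on stdin.
# Plain text matching, so it assumes the field name appears only as an asset counter.
sum_downloads() {
  grep -o '"download_count": *[0-9]*' \
    | awk -F: '{ total += $2 } END { print total + 0 }'
}

# Inline sample standing in for an API response (made-up numbers):
printf '{"assets":[{"download_count": 5},{"download_count": 7}]}\n' | sum_downloads

# Against the live API (requires network access):
# curl -s https://api.github.com/repos/arq5x/bedtools2/releases | sum_downloads
```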
@arq5x
arq5x / tabulate_mutations.py
Created April 10, 2014 17:42
ovarian-cancer-analysis
#!/usr/bin/env python
print("hi")
@arq5x
arq5x / go.sh
Last active August 29, 2015 14:00
Find intervals where the mean RNA-seq coverage among 3 (for example) samples is >= a minimum threshold.
# compute a BEDGRAPH RNA-seq coverage across the entire genome for each sample:
bedtools genomecov -bga -split -ibam sample1.sorted.bam > 1.bg
bedtools genomecov -bga -split -ibam sample2.sorted.bam > 2.bg
bedtools genomecov -bga -split -ibam sample3.sorted.bam > 3.bg
# compute the mean coverage at each interval.
bedtools unionbedg -i 1.bg 2.bg 3.bg \
| awk '{sum=0; for (col=4; col<=NF; col++) sum += $col; print $0"\t"sum/(NF-4+1); }'
chr1 900 1000 0 60 0 20
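The preview stops before the threshold step named in the title. One way it could be finished is sketched below: append the mean of the per-sample columns and keep only rows at or above a minimum. The helper name `mean_cov_filter` and its `min` argument are assumptions for illustration, not part of the original script.

```shell
# Hypothetical helper: read unionbedg-style rows (chrom, start, end, then one
# coverage column per sample) and print rows whose mean coverage is >= min,
# with the mean appended as a final column.
mean_cov_filter() {
  awk -v min="$1" 'BEGIN { OFS = "\t" }
  {
    sum = 0
    for (col = 4; col <= NF; col++) sum += $col
    mean = sum / (NF - 3)          # NF - 3 = number of sample columns
    if (mean >= min) print $0, mean
  }'
}

# Two made-up rows; only the first has a mean of at least 10.
printf 'chr1\t900\t1000\t0\t60\t0\nchr1\t1000\t1100\t3\t3\t3\n' | mean_cov_filter 10
```

In a real run the input would come from the `bedtools unionbedg` command above rather than from `printf`.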
@arq5x
arq5x / daniel.sh
Last active August 29, 2015 14:01
Merging adjacent BEDGRAPH intervals with coverage >= mincov
# assumes that the BEDGRAPH is the output of the bedtools genomecov tool using the -bga option.
$ cat foo.bedg
chr1 0 10 5
chr1 10 20 5
chr1 20 30 10
chr1 30 40 11
chr1 40 50 9
chr1 50 60 10
chr1 60 70 11
chr1 70 80 15
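The merge step itself is not visible in this preview. As a rough sketch of the idea, the awk-only version below keeps rows with coverage >= mincov and joins runs whose start equals the previous end on the same chromosome. This is a swapped-in approach for illustration (the original may well use `bedtools merge`), and the name `merge_min_cov` is hypothetical.

```shell
# Hypothetical helper: merge adjacent BEDGRAPH intervals whose coverage
# (column 4) is >= mincov. Assumes input sorted by chromosome and start.
merge_min_cov() {
  awk -v mincov="$1" 'BEGIN { OFS = "\t" }
  $4 >= mincov {
    # Extend the open interval when this row starts where it stopped.
    if (have && $1 == chrom && $2 == stop) { stop = $3; next }
    if (have) print chrom, start, stop
    chrom = $1; start = $2; stop = $3; have = 1
  }
  END { if (have) print chrom, start, stop }'
}

# The foo.bedg rows from above, with mincov = 10:
printf '%s\n' \
  'chr1 0 10 5'   'chr1 10 20 5'  'chr1 20 30 10' 'chr1 30 40 11' \
  'chr1 40 50 9'  'chr1 50 60 10' 'chr1 60 70 11' 'chr1 70 80 15' \
  | merge_min_cov 10
```

Rows below the threshold are simply skipped, so the 40-50 interval (coverage 9) breaks the run and two merged intervals come out: 20-40 and 50-80.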
@arq5x
arq5x / test.sh
Last active August 29, 2015 14:02
testing bedtools intersect with multiple database (-b) files
# test each database file individually
time bedtools intersect -wa -wb -sorted \
-a hg19.rmsk.bed.gz \
-b hg19.segdup.bed.gz \
> /dev/null
real 0m5.069s
user 0m5.007s
sys 0m0.056s
time bedtools intersect -wa -wb -sorted \
@arq5x
arq5x / run.sh
Last active August 29, 2015 14:05
report duplicate rates for a directory
DIR=bam/t1d-run2a-redo/
# line 10: run flagstats on marked dup BAM
# line 11: grab the number of duplicates line and the total reads line
# line 12: grab the total and dup total
# line 13: place the total followed by duplicate count on same line
# line 14: print the duplicate fraction
for file in "$DIR"/*.bwamem.sort.dedup.bam;
do
samtools flagstat $file | \
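The rest of the pipeline is cut off in this preview. The steps described in the comments could be sketched as a single awk pass over flagstat output, as below. The helper name `dup_fraction` is hypothetical, and the line layout it matches ("N + M in total ...", "N + M duplicates") is assumed from typical samtools flagstat output.

```shell
# Hypothetical helper: read samtools flagstat text on stdin and print
# total reads, duplicate reads, and the duplicate fraction (tab-separated).
dup_fraction() {
  awk '
    $4 == "in" && $5 == "total" { total = $1 }   # e.g. "1000 + 0 in total (...)"
    $4 == "duplicates"          { dups = $1 }    # e.g. "150 + 0 duplicates"
    END {
      if (total > 0) printf "%d\t%d\t%.4f\n", total, dups, dups / total
    }'
}

# Made-up flagstat excerpt standing in for: samtools flagstat "$file" | dup_fraction
printf '%s\n' \
  '1000 + 0 in total (QC-passed reads + QC-failed reads)' \
  '150 + 0 duplicates' \
  '990 + 0 mapped (99.00% : N/A)' \
  | dup_fraction
```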
@arq5x
arq5x / aws-setup.sh
Last active August 29, 2015 14:07
gemini database design experiment
# update things
sudo apt-get update
# get postgres up and running
sudo apt-get install postgresql-9.3
sudo apt-get install postgresql-server-dev-9.3
sudo apt-get install postgresql-client
sudo apt-get install postgresql postgresql-contrib
sudo -u postgres psql postgres
\password postgres