Aaron Quinlan arq5x

@arq5x
arq5x / multi.yaml
Created December 4, 2014 23:42
multiline
attributes:
  name: cpg
  version: 0.1
recipe:
  full:
    recipe_type: bash
    recipe_cmds:
      - >
        mysql --user=genome --host=genome-mysql.cse.ucsc.edu \
arq5x / aws-setup.sh
Last active August 29, 2015 14:07
gemini database design experiment
# update things
sudo apt-get update
# get postgres up and running
sudo apt-get install postgresql-9.3
sudo apt-get install postgresql-server-dev-9.3
sudo apt-get install postgresql-client
sudo apt-get install postgresql postgresql-contrib
sudo -u postgres psql postgres
\password postgres
arq5x / run.sh
Last active August 29, 2015 14:05
report duplicate rates for a directory
DIR=bam/t1d-run2a-redo/
# line 10: run flagstats on marked dup BAM
# line 11: grab the number of duplicates line and the total reads line
# line 12: grab the total and dup total
# line 13: place the total followed by duplicate count on same line
# line 14: print the duplicate fraction
for file in "$DIR"/*.bwamem.sort.dedup.bam;
do
    samtools flagstat "$file" | \
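The preview above is cut off before the duplicate fraction is computed. As a rough sketch of the steps the comments describe (take the total reads line and the duplicates line, then divide), the same calculation could be written in Python. The function name and the sample text are hypothetical, and the parser assumes the usual `N + M label` layout of `samtools flagstat` lines; it counts only the QC-passed column, which may differ from what the original script does.

```python
def duplicate_fraction(flagstat_text):
    """Return (total, duplicates, fraction) parsed from flagstat-style text."""
    total = None
    dups = None
    for line in flagstat_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        # Assumed layout: "<QC-passed> + <QC-failed> <label ...>"
        label = " ".join(fields[3:])
        if label.startswith("in total"):
            total = int(fields[0])
        elif label == "duplicates":
            dups = int(fields[0])
    if total is None or dups is None:
        raise ValueError("could not find total and duplicate counts")
    fraction = dups / total if total else 0.0
    return total, dups, fraction


# Hypothetical flagstat output, for illustration only (not real data)
example = """\
1000 + 0 in total (QC-passed reads + QC-failed reads)
50 + 0 duplicates
900 + 0 mapped (90.00%:-nan%)
"""

total, dups, fraction = duplicate_fraction(example)
print(total, dups, fraction)  # prints: 1000 50 0.05
```

In practice the text would come from running `samtools flagstat` on each BAM file and capturing its standard output.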
arq5x / autosomal-dominant.sh
Last active November 13, 2019 17:55
GEMINI Tutorial Commands
# assumes you have SSH'ed and qlogin'ed
cd thu
cd mydata
# slide 5
curl https://s3.amazonaws.com/gemini-tutorials/trio.trim.vep.vcf.gz > trio.trim.vep.vcf.gz
curl https://s3.amazonaws.com/gemini-tutorials/dominant.ped > dominant.ped
gemini load --cores 2 \
-v trio.trim.vep.vcf.gz \
-t VEP \
arq5x / test.sh
Last active August 29, 2015 14:02
testing bedtools intersect with multiple database (-b) files
# test each database file individually
time bedtools intersect -wa -wb -sorted \
-a hg19.rmsk.bed.gz \
-b hg19.segdup.bed.gz \
> /dev/null
real 0m5.069s
user 0m5.007s
sys 0m0.056s
time bedtools intersect -wa -wb -sorted \
arq5x / daniel.sh
Last active August 29, 2015 14:01
Merging adjacent BEDGRAPH intervals with coverage >= mincov
# assumes that the BEDGRAPH is the output of the bedtools genomecov tool using the -bga option.
$ cat foo.bedg
chr1 0 10 5
chr1 10 20 5
chr1 20 30 10
chr1 30 40 11
chr1 40 50 9
chr1 50 60 10
chr1 60 70 11
chr1 70 80 15
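The gist preview ends before the merging command itself, so the following is only a sketch of one way to merge adjacent BEDGRAPH intervals with coverage >= mincov, not the author's method. It uses the rows shown above; the function name `merge_covered` and the choice of `mincov = 10` are assumptions made for illustration. Because `-bga` output is contiguous, dropping a low-coverage interval leaves a coordinate gap, so checking that one interval starts where the previous one ends is enough to keep separate blocks apart.

```python
def merge_covered(intervals, mincov):
    """Merge adjacent intervals whose depth is >= mincov.

    intervals: iterable of (chrom, start, end, depth), sorted by chrom, start.
    Returns a list of (chrom, start, end) merged blocks.
    """
    merged = []
    for chrom, start, end, depth in intervals:
        if depth < mincov:
            continue  # low-coverage interval: leaves a gap between blocks
        if merged and merged[-1][0] == chrom and merged[-1][2] == start:
            # Same chromosome and touching the previous block: extend it
            merged[-1] = (chrom, merged[-1][1], end)
        else:
            merged.append((chrom, start, end))
    return merged


# The rows of foo.bedg shown above
rows = [
    ("chr1", 0, 10, 5),
    ("chr1", 10, 20, 5),
    ("chr1", 20, 30, 10),
    ("chr1", 30, 40, 11),
    ("chr1", 40, 50, 9),
    ("chr1", 50, 60, 10),
    ("chr1", 60, 70, 11),
    ("chr1", 70, 80, 15),
]

for chrom, start, end in merge_covered(rows, mincov=10):
    print(chrom, start, end)
# prints:
# chr1 20 40
# chr1 50 80
```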
arq5x / go.sh
Last active August 29, 2015 14:00
Find intervals where the mean RNA-seq coverage among 3 (for example) samples is >= a minimum threshold.
# compute a BEDGRAPH RNA-seq coverage across the entire genome for each sample:
bedtools genomecov -bga -split -ibam sample1.sorted.bam > 1.bg
bedtools genomecov -bga -split -ibam sample2.sorted.bam > 2.bg
bedtools genomecov -bga -split -ibam sample3.sorted.bam > 3.bg
# compute the mean coverage at each interval.
bedtools unionbedg -i 1.bg 2.bg 3.bg \
| awk '{sum=0; for (col=4; col<=NF; col++) sum += $col; print $0"\t"sum/(NF-4+1); }'
chr1 900 1000 0 60 0 20
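For readers less familiar with awk, the mean calculation above can be sketched in Python. This assumes tab- or space-delimited `unionbedg` output with the per-sample depths in column 4 onward; the function names and the threshold of 20 are hypothetical, and the second input row is invented for illustration.

```python
def mean_coverage(line):
    """Mean of the per-sample depth columns (column 4 onward) of one line."""
    fields = line.split()
    depths = [float(x) for x in fields[3:]]
    return sum(depths) / len(depths)


def intervals_above(lines, min_mean):
    """Return (chrom, start, end, mean) for lines with mean depth >= min_mean."""
    kept = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        chrom, start, end = line.split()[:3]
        mean = mean_coverage(line)
        if mean >= min_mean:
            kept.append((chrom, int(start), int(end), mean))
    return kept


lines = [
    "chr1\t900\t1000\t0\t60\t0",   # the interval shown above
    "chr1\t1000\t1100\t3\t3\t3",   # hypothetical row, for illustration
]

print(intervals_above(lines, min_mean=20))
# prints: [('chr1', 900, 1000, 20.0)]
```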
import sys
class Line(object):
    def __init__(self, line):
        self.fields = line.split('\t')
        self.chrom = self.fields[0]
        self.start = int(self.fields[1])
        self.end = int(self.fields[2])
        self.depth = int(self.fields[3])
arq5x / tabulate_mutations.py
Created April 10, 2014 17:42
ovarian-cancer-analysis
#!/usr/bin/env python
print("hi")