Aaron Quinlan arq5x

@arq5x
arq5x / methods.sh
Last active Mar 14, 2018
breast-cancer-evolution-cnv-segmentation
View methods.sh
# bedtools --version
# bedtools v2.24.0-14-gaa11ef9
########################################################
# Create a BED file of 5kb windows with 2.5kb overlap
# tiling build 37 (hg19) of the human genome
########################################################
bedtools makewindows -g hg19.txt -w 5000 -s 2500 > hg19.w5k.s2.5k.bedg
########################################################
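The tiling described above (5 kb windows, each starting 2.5 kb after the previous one) can be sketched in Python. This is a minimal illustration, not the bedtools implementation: `make_windows` is a hypothetical helper, and clipping the last windows at the chromosome end is an assumption about how the final partial windows are handled.

```python
def make_windows(chrom_sizes, window, step):
    """Yield (chrom, start, end) windows of width `window` every `step` bases.

    `chrom_sizes` maps chromosome name to length, like the two columns of a
    genome file such as hg19.txt. Windows that would run past the end of the
    chromosome are clipped (assumed behaviour for this sketch).
    """
    for chrom, size in chrom_sizes.items():
        for start in range(0, size, step):
            yield chrom, start, min(start + window, size)


if __name__ == "__main__":
    # Toy chromosome of 12 kb; real input would come from the genome file.
    for chrom, start, end in make_windows({"chrT": 12000}, 5000, 2500):
        print(chrom, start, end, sep="\t")
```

With `-w 5000 -s 2500`, every base away from the chromosome ends is covered by exactly two windows, which is what makes the windows overlap by half.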
arq5x / example.sh
Last active Jan 24, 2019
Natural sort a VCF
View example.sh
chmod a+x vcfsort.sh
vcfsort.sh trio.trim.vep.vcf.gz
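The contents of `vcfsort.sh` are not shown in this preview, so the following is only a sketch of what "natural sort" usually means for variant records: chromosome names are compared with their embedded numbers treated as integers, so chr2 comes before chr10. The names `natural_key` and `sort_records` are hypothetical.

```python
import re


def natural_key(text):
    """Split text into alternating non-digit and digit parts; digits become ints."""
    return [int(part) if part.isdigit() else part
            for part in re.split(r"(\d+)", text)]


def sort_records(records):
    """Sort (chrom, pos) pairs by natural chromosome order, then by position."""
    return sorted(records, key=lambda rec: (natural_key(rec[0]), rec[1]))


if __name__ == "__main__":
    recs = [("chr10", 5), ("chr2", 900), ("chr2", 15), ("chrX", 1), ("chr1", 7)]
    for chrom, pos in sort_records(recs):
        print(chrom, pos, sep="\t")
```

A plain lexicographic sort would place chr10 before chr2. The key above avoids that without needing a hard-coded chromosome list.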
View example.sh
sudo pip install awscli
aws configure
aws s3 ls
aws s3 ls s3://gqt-data
arq5x / example.sh
Created Apr 4, 2015
minimum tiling path
View example.sh
cat ivl.bed
chr1 10 30
cat data.bed
chr1 9 20 d1
chr1 12 18 d2
chr1 12 20 d3
chr1 15 16 d4
chr1 25 40 d5
chr1 26 30 d6
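The gist's own approach is cut off in this preview. As a hedged illustration, one common way to compute a minimum tiling path is the greedy interval-cover rule: starting at the left edge of the target, repeatedly pick the interval that starts at or before the current position and reaches furthest right. The function name `greedy_tiling_path` and the choice to skip over uncovered gaps are assumptions of this sketch.

```python
def greedy_tiling_path(target, intervals):
    """Return names of intervals forming a greedy tiling path across `target`.

    target    -- (start, end) of the region to cover
    intervals -- iterable of (start, end, name) on the same chromosome
    Stretches that no interval covers are skipped (assumed behaviour).
    """
    t_start, t_end = target
    # Keep only intervals that overlap the target, ordered by start.
    cands = sorted((s, e, n) for s, e, n in intervals if s < t_end and e > t_start)
    path, pos, i = [], t_start, 0
    while pos < t_end and i < len(cands):
        best = None
        # Among intervals starting at or before pos, remember the furthest-reaching.
        while i < len(cands) and cands[i][0] <= pos:
            if best is None or cands[i][1] > best[1]:
                best = cands[i]
            i += 1
        if best is None:
            pos = cands[i][0]  # gap: jump to the next available start
        elif best[1] > pos:
            path.append(best[2])
            pos = best[1]
    return path


if __name__ == "__main__":
    ivl = (10, 30)
    data = [(9, 20, "d1"), (12, 18, "d2"), (12, 20, "d3"),
            (15, 16, "d4"), (25, 40, "d5"), (26, 30, "d6")]
    print(greedy_tiling_path(ivl, data))  # ['d1', 'd5']
```

For the example data, nothing covers positions 20 to 25, so the sketch reports d1 and d5 and leaves that stretch untiled.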
arq5x / cl.py
Last active Aug 29, 2015
Python simulation of Chutes and Ladders
View cl.py
import sys
import numpy as np
"""
Simulate chutes and ladders.
Reports the number of moves for 1-player to reach the end,
followed by the list of rolls that player had.
Run as follows for 100000 games with 1 player. Report the total
number of moves made by the winning player:
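The rest of `cl.py` is not visible in this preview. A minimal single-player sketch of such a simulation follows. It uses the standard library `random` module rather than numpy, assumes the classic Milton Bradley board layout, and assumes that a roll overshooting square 100 is forfeited. All names are hypothetical.

```python
import random

# Classic board layout (assumed; the gist's own table is not shown).
LADDERS = {1: 38, 4: 14, 9: 31, 21: 42, 28: 84, 36: 44, 51: 67, 71: 91, 80: 100}
CHUTES = {16: 6, 47: 26, 49: 11, 56: 53, 62: 19, 64: 60,
          87: 24, 93: 73, 95: 75, 98: 78}
JUMPS = {**LADDERS, **CHUTES}


def play_one_game(rng):
    """Play one single-player game; return (number of moves, list of rolls)."""
    position, rolls = 0, []
    while position < 100:
        roll = rng.randint(1, 6)
        rolls.append(roll)
        target = position + roll
        if target <= 100:  # overshooting rolls are forfeited (assumption)
            position = JUMPS.get(target, target)
    return len(rolls), rolls


if __name__ == "__main__":
    rng = random.Random(42)
    games = [play_one_game(rng)[0] for _ in range(100000)]
    print("mean moves:", sum(games) / len(games))
```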
arq5x / complexity.py
Last active Feb 6, 2019
kmer fun with jellyfish
View complexity.py
import sys
from itertools import *
"""
compute the complexity of each kmer passed in
given the format of the output of `jellyfish dump -ct`
complexity is measured as the number of runs divided
by the total length of the sequence.
e.g., "AAAAA" would be 1/5
and "ACTGC" would be 5/5
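The measure described above (number of runs divided by sequence length) can be sketched with `itertools.groupby`, which groups consecutive identical characters. This is an illustration, not the gist's own code, which is truncated here. The assumption is that each input line holds a k-mer and its count separated by whitespace, as `jellyfish dump -ct` is described to produce. `complexity` and `main` are hypothetical names.

```python
import sys
from itertools import groupby


def complexity(kmer):
    """Number of runs of identical bases divided by the k-mer length."""
    runs = sum(1 for _ in groupby(kmer))
    return runs / len(kmer)


def main(handle):
    # Each line is assumed to be "<kmer> <count>".
    for line in handle:
        fields = line.split()
        if len(fields) < 2:
            continue
        kmer, count = fields[0], fields[1]
        print(kmer, count, complexity(kmer), sep="\t")


if __name__ == "__main__":
    main(sys.stdin)
```

For the two examples in the docstring, `complexity("AAAAA")` gives 0.2 (one run over five bases) and `complexity("ACTGC")` gives 1.0 (five runs over five bases).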
arq5x / table_s1.txt
Created Jan 2, 2015
Vogelstein Table S1
View table_s1.txt
Cancer_type Lifetime_cancer_incidence Total_cells_tissue Total_Stem_Cells Stem_cell_divisions_per_year Stem_cell_divisions_per_lifetime LCSD
ALL 0.0041 3000000000000 135000000 12 960 129900000000
BCC 0.3 180000000000 5820000000 7.6 608 3550000000000
CLL 0.0052 3000000000000 135000000 12 960 129900000000
Colorectal 0.048 30000000000 200000000 73 5840 1168000000000
Colorectal_FAP 1 30000000000 200000000 73 5840 1168000000000
Colorectal_Lynch 0.5 30000000000 200000000 73 5840 1168000000000
Duodenum_adenocarcinoma 0.0003 680000000 4000000 24 1947 7796000000
Duodenum_adenocarcinoma_with_FAP 0.035 680000000 4000000 24 1947 7796000000
Esophageal_squamous_cell_carcinoma 0.001938 3240000000 846000 17.4 1390 1203000000
arq5x / workflow.sh
Last active Aug 29, 2015
big multi-file intersect examples
View workflow.sh
# 1. Download BED files of 349 DHS experiments from Science, 337, no. 6099, pp. 1190-1195, 7 Sep. 2012
# http://www.uwencode.org/proj/Science_Maurano_Humbert_et_al/
wget http://www.uwencode.org/proj/Science_Maurano_Humbert_et_al/data/all_fdr0.05_hot.tgz
# 2. Unpack.
tar -zxvf all_fdr0.05_hot.tgz
# 3. Make sure all of the files are sorted lexicographically by chrom, then numerically by start.
# This is required for the sweep algorithm.
# Hint: they are sorted correctly, this is just a sanity check.
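The sanity check in step 3 is cut off in this preview. One way to express the same check in Python is sketched below: each record must compare greater than or equal to the previous one on (chrom as a string, start as an integer). `is_bed_sorted` is a hypothetical helper, and Python's code-point string comparison is assumed to be an acceptable stand-in for a C-locale lexicographic sort.

```python
def is_bed_sorted(lines):
    """Return True if BED lines are ordered by chrom (lexicographic), then start (numeric)."""
    prev = None
    for line in lines:
        # Skip blank lines and common BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        key = (fields[0], int(fields[1]))
        if prev is not None and key < prev:
            return False
        prev = key
    return True


if __name__ == "__main__":
    import sys
    for path in sys.argv[1:]:
        with open(path) as handle:
            status = "sorted" if is_bed_sorted(handle) else "NOT sorted"
        print(path, status, sep="\t")
```

Note that lexicographic order places chr10 before chr2. That is the order the check expects, and it differs from a natural chromosome ordering.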
View multi.yaml
attributes:
name: cpg
version: 0.1
recipe:
full:
recipe_type: bash
recipe_cmds:
- >
mysql --user=genome --host=genome-mysql.cse.ucsc.edu \
arq5x / aws-setup.sh
Last active Aug 29, 2015
gemini database design experiment
View aws-setup.sh
# update things
sudo apt-get update
# get postgres up and running
sudo apt-get install postgresql-9.3
sudo apt-get install postgresql-server-dev-9.3
sudo apt-get install postgresql-client
sudo apt-get install postgresql postgresql-contrib
sudo -u postgres psql postgres
\password postgres