Aaron Quinlan arq5x

@arq5x
arq5x / sim-gemini-bin-genotypes.py
Created June 20, 2012 16:19
prototype for storing/reading gemini genotypes in PLINK BED-like format
#!/usr/bin/python
import array
import gzip
import struct
import numpy as np
import multiprocessing as mp
class BinaryGenoWriter(object):
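The preview above is cut off at the class definition, so the author's actual writer is not visible here. As a rough illustration of the idea behind a PLINK BED-like layout (each genotype stored in two bits, four genotypes per byte, low-order bits first), here is a minimal standard-library sketch. The function names `pack_genotypes` and `unpack_genotypes` are hypothetical and are not taken from the gist; what each 2-bit code means is left to the caller.

```python
def pack_genotypes(codes):
    """Pack 2-bit genotype codes (integers 0..3) into bytes, four per byte.

    Hypothetical helper for illustration; not the gist's BinaryGenoWriter.
    The first genotype occupies the two lowest-order bits of each byte.
    """
    packed = bytearray((len(codes) + 3) // 4)  # unused trailing slots stay 0
    for i, code in enumerate(codes):
        if not 0 <= code <= 3:
            raise ValueError("genotype codes must be in 0..3")
        packed[i // 4] |= code << (2 * (i % 4))
    return bytes(packed)


def unpack_genotypes(packed, n):
    """Recover the first n 2-bit codes from bytes made by pack_genotypes."""
    return [(packed[i // 4] >> (2 * (i % 4))) & 0b11 for i in range(n)]


codes = [0, 1, 2, 3, 3, 0]
packed = pack_genotypes(codes)          # 6 genotypes fit in 2 bytes
restored = unpack_genotypes(packed, len(codes))
```

The number of genotypes has to be stored separately (or known from the sample list), because padding in the last byte is indistinguishable from code 0.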
@arq5x
arq5x / install.txt
Created July 9, 2012 19:27
CUDA instructions for BITS paper
1. Download and install the CUDA toolkit.
- http://developer.nvidia.com/cuda-downloads
- For Mac, http://developer.download.nvidia.com/compute/cuda/4_2/rel/toolkit/cudatoolkit_4.2.9_macos.pkg
2. Download and install the CUDA drivers for your system.
- http://developer.nvidia.com/cuda-downloads
- For Mac, http://developer.download.nvidia.com/compute/cuda/4_2/rel/drivers/devdriver_4.2.10_macos.dmg
3. Download and install the CUDPP library.
- http://code.google.com/p/cudpp/
@arq5x
arq5x / make-master-hmm.sh
Created July 18, 2012 20:19
For Gemini: Create a master ChromHMM track from the 9 distinct cell types.
echo "http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmGm12878HMM.bed.gz
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmH1hescHMM.bed.gz
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmHepg2HMM.bed.gz
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmHmecHMM.bed.gz
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmHsmmHMM.bed.gz
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmHuvecHMM.bed.gz
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmK562HMM.bed.gz
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmNhekHMM.bed.gz
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmNhlfHMM.bed.gz" \
> chromhmm-files.txt
@arq5x
arq5x / config.log
Created August 17, 2012 15:44
Homebrew error log for graph-tool installation error.
This file contains any messages produced by compilers while
running configure, to aid debugging if configure makes a mistake.
It was created by graph-tool configure 2.2.14, which was
generated by GNU Autoconf 2.68. Invocation command line was
$ ./configure --disable-debug --disable-dependency-tracking --prefix=/usr/local/Cellar/graph-tool/2.2.14
## --------- ##
## Platform. ##
@arq5x
arq5x / nbn-chipseq-workflow.sh
Created August 29, 2012 15:06
Pipeline for NBN ChIP-seq
# alias the raw data files
ln -s C0PYJACXX_s7_0_GSL48index_10_SL16067.fastq.gz 0h-IP-nbn.fq.gz
ln -s C0PYJACXX_s7_0_GSL48index_11_SL16068.fastq.gz 3Gy-IP-nbn.fq.gz
ln -s C0PYJACXX_s7_0_GSL48index_12_SL16069.fastq.gz PpoI-IP-nbn.fq.gz
ln -s C0PYJACXX_s7_0_GSL48index_7_SL16064.fastq.gz 0h-Input.fq.gz
ln -s C0PYJACXX_s7_0_GSL48index_8_SL16065.fastq.gz 3Gy-Input.fq.gz
ln -s C0PYJACXX_s7_0_GSL48index_9_SL16066.fastq.gz PpoI-Input.fq.gz
# build a list of sample stubs
SAMPLES="0h-Input 0h-IP-nbn 3Gy-Input 3Gy-IP-nbn PpoI-Input PpoI-IP-nbn"
@arq5x
arq5x / make-unified-segmentation.sh
Created September 14, 2012 00:55
ENCODE consensus segmentations
# 1. Get the ENCODE segmentations from EBI.
# consensus
wget http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/awgHub/byDataType/segmentations/jan2011/gm12878.combined.bb
wget http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/awgHub/byDataType/segmentations/jan2011/h1hesc.combined.bb
wget http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/awgHub/byDataType/segmentations/jan2011/helas3.combined.bb
wget http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/awgHub/byDataType/segmentations/jan2011/hepg2.combined.bb
wget http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/awgHub/byDataType/segmentations/jan2011/huvec.combined.bb
wget http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/awgHub/byDataType/segmentations/jan2011/k562.combined.bb
# Segway (ahem; https://twitter.com/michaelhoffman/status/246679147164880897)
$ make clean
cd lib && make clean
rm -f *.o zlib-1.2.3/*.o zlib-1.2.3/libz.1.2.3.dylib
cd python && rm -fr build khmer/*.so
$ make all
cd lib && make
cd zlib-1.2.3; ./configure --shared; make; rm minigzip.o; rm example.o
Checking for gcc...
Checking for shared library support...
@arq5x
arq5x / unique-bash-history.sh
Created January 22, 2013 02:52
Unique BASH history
HISTCONTROL="erasedups"
export HISTCONTROL
{
"pile": {
"chrom": "chr1",
"start": 102033445,
"end": 102033446,
"ref": "G",
"total_ref_alleles": 45,
"total_alt_alleles": 15,
"total_fwd_a": 7,
"total_rev_a": 8
@arq5x
arq5x / bioch5080.md
Last active December 15, 2015 07:09

Genome arithmetic for data exploration

The goal of today's practical session is to get your hands dirty with bedtools. We will study ChIP-seq data from three different cell types, each assayed for H3K27ac. Our research goal is to explore the similarities and differences between the ChIP peaks observed in the three cell types.
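To make "genome arithmetic" concrete before turning to bedtools itself, here is a minimal Python sketch of what intersecting two peak sets means. It is not bedtools and does not reproduce its options or output format. It assumes both lists are on a single chromosome, sorted by start, free of internal overlaps, and use BED-style half-open coordinates. The function name `intersect` and the peak coordinates are invented for illustration.

```python
def intersect(a, b):
    """Return the regions shared by two sorted lists of (start, end) intervals.

    Assumes half-open coordinates and no overlaps within either list.
    """
    shared = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:  # the two current intervals overlap
            shared.append((start, end))
        # Advance whichever interval ends first; it cannot overlap anything else.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return shared


# Made-up H3K27ac peaks for two hypothetical cell types.
peaks_cell_a = [(100, 200), (300, 400)]
peaks_cell_b = [(150, 350)]
common = intersect(peaks_cell_a, peaks_cell_b)
```

Regions in `common` are covered by a peak in both cell types. Peaks seen in only one cell type would be found with a subtraction-style comparison instead.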