Eric T. Dawson edawson

@edawson
edawson / example.sh
Last active March 5, 2019 20:42 — forked from arq5x/example.sh
Natural sort a VCF
chmod a+x vcfsort.sh
./vcfsort.sh trio.trim.vep.vcf.gz
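The body of vcfsort.sh is not shown here, so the following is only a sketch of what "natural sort" means for VCF records, not the script's actual method. The function names `natural_key` and `sort_records` are hypothetical. The idea is to split each contig name into text and integer pieces so that chr2 orders before chr10.

```python
import re

def natural_key(chrom):
    """Split a contig name into text and integer pieces (hypothetical helper)."""
    # re.split with a capturing group alternates text, digits, text, ...
    # so equal list positions always hold the same type and compare safely.
    return [int(tok) if tok.isdigit() else tok for tok in re.split(r"(\d+)", chrom)]

def sort_records(lines):
    """Keep header lines first and in order; sort body lines by (contig, position)."""
    header = [l for l in lines if l.startswith("#")]
    body = [l for l in lines if not l.startswith("#")]
    body.sort(key=lambda l: (natural_key(l.split("\t")[0]), int(l.split("\t")[1])))
    return header + body
```

A plain lexical sort would place chr10 before chr2; with this key, numbered contigs come out in numeric order.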
edawson / picard_intervals_to_bed.sh
Created December 11, 2018 01:24
Convert a Picard interval list file to a bgzip-compressed, tabix-indexed BED file.
# Picard interval lists are 1-based and closed; BED is 0-based and half-open, so the start is shifted by 1.
out="$(dirname "$1")/$(basename "$1" .interval_list).bed"
grep -v "^@" "$1" | awk 'BEGIN{OFS="\t"} {print $1, $2-1, $3, $5}' > "$out" && \
bgzip "$out" && \
tabix -p bed "$out.gz"
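The column selection and coordinate conversion can also be sketched in Python. This assumes the standard tab-delimited Picard layout (contig, start, end, strand, name) with 1-based, closed coordinates; the function name `interval_list_to_bed` is hypothetical.

```python
def interval_list_to_bed(lines):
    """Yield BED rows (chrom, start, end, name) from Picard interval_list lines."""
    for line in lines:
        if line.startswith("@"):  # skip the SAM-style header
            continue
        chrom, start, end, strand, name = line.rstrip("\n").split("\t")[:5]
        # interval_list is 1-based and closed; BED is 0-based and half-open,
        # so only the start coordinate changes.
        yield (chrom, int(start) - 1, int(end), name)
```

For example, an interval `chr1 100 200 + target_1` becomes the BED row `chr1 99 200 target_1`.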
edawson / fastq_splitter.sh
Last active February 5, 2019 14:07
Split a FASTQ (or pair) into chunks of 1M reads (4,000,000 lines each) using GNU split and pigz. Modified from an original script by @ekg.
first_reads=$1
second_reads=$2
ddir=$(dirname "$first_reads")
obase_first=$(basename "$first_reads" .fastq.gz)
obase_second=$(basename "$second_reads" .fastq.gz)
# Lines per chunk; a FASTQ record is 4 lines, so this is 1,000,000 reads.
splitsz=4000000
if [ -n "${first_reads}" ] && [ -e "${first_reads}" ]
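The rest of the script is not shown here, so the following only sketches the chunking idea in Python rather than reproducing the GNU split and pigz pipeline. The function name `split_fastq` is hypothetical. The key constraint is that the chunk size in lines is a multiple of 4, so that no record is cut in half and paired files split with the same size stay in step.

```python
from itertools import islice

def split_fastq(lines, reads_per_chunk):
    """Yield lists of lines, each holding at most reads_per_chunk 4-line records."""
    it = iter(lines)
    lines_per_chunk = 4 * reads_per_chunk  # one FASTQ record = 4 lines
    while True:
        chunk = list(islice(it, lines_per_chunk))
        if not chunk:
            return
        yield chunk
```

Passing a file handle from `gzip.open(path, "rt")` as `lines` would stream a compressed FASTQ without loading it whole; only the last chunk may be shorter than the others.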
edawson / processCentroAndTelo.py
Created July 20, 2018 10:33
Convert the default UCSC table browser format to BRASS-compatible centro/telomere dictionary file (python3).
import sys
import gzip
from collections import defaultdict
if __name__ == "__main__":
    d_telo = defaultdict(list)
    d_centro = {}
    with gzip.open(sys.argv[1], "rt") as ifi:
        for line in ifi:
edawson / repo-rinse.sh
Created January 30, 2018 18:23 — forked from nicktoumpelis/repo-rinse.sh
Cleans and resets a git repo and its submodules
git clean -xfd
git submodule foreach --recursive git clean -xfd
git reset --hard
git submodule foreach --recursive git reset --hard
git submodule update --init --recursive
edawson / import_vcf_into_df.R
Created June 18, 2017 17:55 — forked from sephraim/import_vcf_into_df.R
Import VCF file into data frame in R
library(vcfR)
# Import VCF
my.vcf <- read.vcfR('my.vcf.gz')
# Combine CHROM thru FILTER cols + INFO cols
my.vcf.df <- cbind(as.data.frame(getFIX(my.vcf)), INFO2df(my.vcf))
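For comparison, a rough standard-library Python analogue of the same idea (fixed CHROM through FILTER columns joined with one column per INFO key) might look like the sketch below. It does not use vcfR, the names `parse_info` and `read_vcf_rows` are hypothetical, and all values are left as strings.

```python
def parse_info(info):
    """Turn an INFO string such as 'DP=10;DB' into a dict; flags map to True."""
    out = {}
    if info in (".", ""):
        return out
    for field in info.split(";"):
        key, sep, value = field.partition("=")
        out[key] = value if sep else True  # flag entries carry no '='
    return out

def read_vcf_rows(lines):
    """Yield one dict per record: CHROM through FILTER plus the INFO keys."""
    cols = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER"]
    for line in lines:
        if line.startswith("#"):  # meta-information and column header lines
            continue
        fields = line.rstrip("\n").split("\t")
        row = dict(zip(cols, fields[:7]))
        row.update(parse_info(fields[7]))
        yield row
```

The resulting dicts can be handed to a table library of choice; records that lack a given INFO key simply omit it.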