
C. Titus Brown ctb

@ctb
ctb / argparse-foo.py
Created March 20, 2014 02:23
weird behavior by argparse
#!/usr/bin/env python
import argparse
def main():
    argParser = argparse.ArgumentParser(description='Test argparser')
    argParser.add_argument('-A', type=str, nargs=1,
                           default='myarg',
                           choices=['myarg', 'otherarg'],
                           help='Select which arg to run', dest='arg')
    argVals = argParser.parse_args()
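The preview is cut off before it shows which behavior is meant. One likely candidate, offered here only as an assumption, is that `nargs=1` makes argparse store a one-element list when `-A` is given, while the string default is stored unchanged when it is omitted. A minimal sketch (the `build_parser` helper and the explicit argument lists are illustrative and not part of the gist):

```python
import argparse


def build_parser():
    # Same option as in the gist: nargs=1 combined with a plain-string default.
    # build_parser is a hypothetical helper added for this illustration.
    parser = argparse.ArgumentParser(description='Test argparser')
    parser.add_argument('-A', type=str, nargs=1,
                        default='myarg',
                        choices=['myarg', 'otherarg'],
                        help='Select which arg to run', dest='arg')
    return parser


parser = build_parser()

# No -A on the command line: the default is stored as-is, so arg is a str.
without_flag = parser.parse_args([]).arg

# -A given: nargs=1 collects the value into a one-element list.
with_flag = parser.parse_args(['-A', 'otherarg']).arg

print(type(without_flag).__name__, repr(without_flag))
print(type(with_flag).__name__, repr(with_flag))
```

If this is the behavior in question, dropping `nargs=1` (or using `default=['myarg']`) would make both cases return the same type.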
curl -O -L http://sourceforge.net/projects/bowtie-bio/files/bowtie/0.12.7/bowtie-0.12.7-linux-x86_64.zip
@ctb
ctb / gist:b147518c0e25cda65207
Created September 8, 2014 10:55
AGTA 2014 talk info
Title: What's ahead for biology? The data-intensive future.
The rise of -omics over the last 20 years, and the increasing need for
efficient and integrative approaches to data analysis, portend some
interesting and challenging changes in biology research over the next decade.
For example, what happens when high-density multi-omics data is available
at low cost for every experiment? I will discuss both my own work on
efficient sequence analysis approaches and the broader context of
some larger-scale data investigations.
import khmer
import screed
import random
K = 20
# read in genome reference
genome = list(screed.open('genome.fa'))[0].sequence
# build a counting hash and a read aligner on top of that
@ctb
ctb / .gitignore
Last active August 29, 2015 14:06
*.bam
*.sam
*.ht
*.bai
*.ebwt
*.fai
*.corr
*.keepalign
*.keep
*.keepvar
@ctb
ctb / compare.py
Last active August 29, 2015 14:06
#!/usr/bin/python
import sys
import screed
if len(sys.argv) != 4:
    print >>sys.stderr, "USAGE: compare.py <mutations.txt> <orig.fasta> <corrected.fasta>"
    sys.exit(1)
fp_out = open('fp.out', 'w')
echo hello, world!
@ctb
ctb / gist:819802526bb91304ff31
Last active August 29, 2015 14:06
tuesday 9/23 talk abstract
Title: "Challenges and rewards of data-intensive biology: sequencing
as a starting point for studying non-model organisms."
From microbial ecology to organismal biology, we know very little about molecular and organismal function. The tremendous growth in sequencing capacity has given us a wonderful opportunity to explore these non-model organisms - but only if we can make some sense of the data. My chosen challenge for the last six years has been to develop tools and approaches for efficiently working with sequence data from environmental microbiology, evo-devo systems, and agricultural animals. In this talk, I will describe some of the progress we’ve made in understanding some new biology, and discuss our integrated approach to developing new computational techniques, applying them to real biological data while working closely with collaborators, investing heavily in teaching and training, and practicing open science. I will also discuss some of the big challenges ahead for biology more broadly, and prov
@ctb
ctb / gist:e182ef6139f80edda2fd
Created October 2, 2014 10:50
ABIC abstract.
Building Better Bioinformatics Software: Why the Heck Not?
I've been tremendously unhappy with a number of bioinformatics papers, many of
which focus on uselessly benchmarking isolated components of much larger
pipelines (short-read mappers being the best, or worst, example). I will
talk about our efforts to provide pipeline-level benchmarks of a small
subset of the massive profusion of NGS tools, discuss the parlous state of
effective bioinformatics tools research, and argue that to a large degree
bioinformaticians deserve the scorn of biologists everywhere. But I'll be
humorous about it.
@ctb
ctb / gist:fc99f09669c7852b4af9
Last active August 29, 2015 14:07
Canberra reproducibility talk
Title: Openness and reproducibility in computational science: tools,
approaches, and thought patterns.
Oct 17, NICTA, Canberra.
Abstract:
Computational reproducibility should be relatively straightforward to achieve
at the time of publication, but experience has shown that most research
groups struggle with it. I will discuss the increasingly rich (and