C. Titus Brown ctb

## argparse-foo.py
#!/usr/bin/env python
import argparse

def main():
    argParser = argparse.ArgumentParser(description='Test argparser')
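    # Note: nargs=1 stores a supplied value as a one-element list,
    # while the default below stays a plain string.
    argParser.add_argument('-A', type=str, nargs=1,
                           default='myarg',
                           choices=['myarg', 'otherarg'],
                           help='Select which arg to run', dest='arg')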
    argVals = argParser.parse_args()
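
A hedged sketch of how `nargs=1` changes what argparse stores for an option like `-A`. The helper `build_parser` and its `use_nargs` flag are hypothetical names introduced only for this illustration; the option definition otherwise mirrors the snippet. It parses explicit argument lists rather than `sys.argv`, so it can be run as-is.

```python
import argparse


def build_parser(use_nargs):
    """Build a parser with the -A option, with or without nargs=1 (illustrative helper)."""
    parser = argparse.ArgumentParser(description='Test argparser')
    # Only pass nargs when requested, so the two variants can be compared.
    extra = {'nargs': 1} if use_nargs else {}
    parser.add_argument('-A', type=str, default='myarg',
                        choices=['myarg', 'otherarg'],
                        help='Select which arg to run', dest='arg', **extra)
    return parser


with_nargs = build_parser(True)
without_nargs = build_parser(False)

# With nargs=1, a supplied value comes back wrapped in a one-element list...
print(with_nargs.parse_args(['-A', 'otherarg']).arg)
# ...but the default is returned unchanged, as a plain string.
print(with_nargs.parse_args([]).arg)
# Without nargs, both the supplied value and the default are plain strings.
print(without_nargs.parse_args(['-A', 'otherarg']).arg)
print(without_nargs.parse_args([]).arg)
```

Code that reads `argVals.arg` therefore has to handle both a list and a string when `nargs=1` is combined with a string default; omitting `nargs` is one way to keep the type consistent.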

## gist:6fd30c50341e6f930b65
curl -O -L http://sourceforge.net/projects/bowtie-bio/files/bowtie/0.12.7/bowtie-0.12.7-linux-x86_64.zip

## gist:b147518c0e25cda65207
Title: What's ahead for biology?  The data-intensive future.

The rise of -omics over the last 20 years, and the increasing need for
efficient and integrative approaches to data analysis, portend some
interesting and challenging changes in biology research over the next decade.
For example, what happens when high-density multi-omics data is available
at low cost for every experiment?  I will discuss both my own work on
efficient sequence analysis approaches and the broader context of
some larger-scale data investigations.

## demo-map-by-align.py
import khmer
import screed
import random

K = 20

# read in genome reference
genome = list(screed.open('genome.fa'))[0].sequence

# build a counting hash and a read aligner on top of that

## .gitignore
*.bam
*.sam
*.ht
*.bai
*.ebwt
*.fai
*.corr
*.keepalign
*.keep
*.keepvar

## compare.py
#!/usr/bin/python

import sys
import screed

if len(sys.argv) != 4:
    sys.stderr.write("USAGE: compare.py <mutations.txt> <orig.fasta> <corrected.fasta>\n")
    sys.exit(1)

fp_out = open('fp.out', 'w')

## gist:57dc9723e11f4138fb5c
echo hello, world!

## gist:819802526bb91304ff31
Title: "Challenges and rewards of data-intensive biology: sequencing
as a starting point for studying non-model organisms."

From microbial ecology to organismal biology, we know very little about molecular and organismal function. The tremendous growth in sequencing capacity has given us a wonderful opportunity to explore these non-model organisms - but only if we can make some sense of the data. My chosen challenge for the last six years has been to develop tools and approaches for efficiently working with sequence data from environmental microbiology, evo-devo systems, and agricultural animals.  In this talk, I will describe some of the progress we’ve made in understanding some new biology, and discuss our integrated approach to developing new computational techniques, applying them to real biological data while working closely with collaborators, investing heavily in teaching and training, and practicing open science.  I will also discuss some of the big challenges ahead for biology more broadly, and prov

## gist:e182ef6139f80edda2fd
Building Better Bioinformatics Software: Why the Heck Not?

I've been tremendously unhappy with a number of bioinformatics papers, many of
which focus on uselessly benchmarking isolated components of much larger
pipelines (short-read mappers being the best, or worst, example).  I will
talk about our efforts to provide pipeline-level benchmarks of a small
subset of the massive profusion of NGS tools, discuss the parlous state of
effective bioinformatics tools research, and argue that to a large degree
bioinformaticians deserve the scorn of biologists everywhere.  But I'll be
humorous about it.

## gist:fc99f09669c7852b4af9
Title: Openness and reproducibility in computational science: tools,
approaches, and thought patterns.

Oct 17, NICTA, Canberra.

Abstract:

Computational reproducibility should be relatively straightforward to achieve
at the time of publication, but experience has shown that most research
groups struggle with it.  I will discuss the increasingly rich (and