C. Titus Brown ctb

Hi Jonathan,
I noticed the omission as well! I do have some sympathy with people who don't know how to deal with preprints, but the almost complete omission of any discussion of your published, peer-reviewed paper is quite a bit stranger. Regardless, I think the best thing for science would have been for them to embrace the preprint, discuss it in some detail, and compare and contrast their results with yours. The other thought I have is this: do people who argue for precedence really think that this kind of interaction is going to lead in positive directions for their future work? It seems a bit shortsighted.
@ctb
ctb / diffk.ipynb
Last active August 29, 2015 14:13
EC k sizes
*.bt2
*.sam
*.pos
*.fq
*.kh
*.kh.info
*.fai
*~
*.fq.gz
*-reads.fa
@ctb
ctb / quote.txt
Last active August 29, 2015 14:13
Michael Barton quote @Bioinformatics during #BaltiandBioinformatics
https://www.youtube.com/watch?x-yt-cl=84359240&x-yt-ts=1421782837&feature=player_embedded&v=ZACVcJt0oJA#t=7303
Kai Blin: If we containerize all these things won’t it just encourage
worse software development practices; right now developers still need
to consider someone other than themselves installing the software.
Michael Barton:
“It’s a good point. Ultimately, though, if I can get a container, and
it works, and I know it will work, do you care how well it was
@ctb
ctb / README.md
Created February 15, 2015 12:00
Reproducibility workflow workshop at MSU - description

Seminar Topic: Reproducible Computational Analysis - How to start a new project

Time: March 24, 2015, from 9:00am - 12:00 noon
Location: Biomedical & Physical Sciences Bldg, Room 2245
Instructor: Dr. Titus Brown

Description

Computational science projects, from data analysis to modeling, can benefit dramatically from a little up-front investment in automation; starting off with version control and automated building of results will pay off in efficiency, agility, and both transparency and reproducibility of the results. However, most computational researchers have never been exposed to a completely automated analysis pipeline. I will demonstrate the process of initiating a new project, building a few initial scripts, and automating the generation of results, as well as building some graphs. While the topic will be from my own research in bioinformatics, the overall approach should apply to anyone doing data analysis or simulations.

@ctb
ctb / README.md
Last active August 29, 2015 14:16
Genome 10k #G10K meeting - Erich Jarvis' suggestions for the future of G10K

Six suggestions for the future of G10K -- Erich Jarvis

  1. Make a more focused mission than just 10k species

  2. Sequence one species per genus, ~9.5k species

  3. Generate a vertebrate genera phylogenome tree

  4. Reclassify species based on genomic distances and speciation timing.

Mike, great post - and I think the bit at the end is in some ways the most important. There are other ways to go about this, too; I have made the decision to openly sign my paper peer reviews and this has led me to be both more careful and more polite in my reviews, to the point where I feel comfortable posting them publicly once the paper is out.

Two additional thoughts --

I absolutely don't want to see a centralized commenting system come into being for all sorts of reasons; I think we need something sensibly federated. To that end, you might be interested in Chris Lee's "selected papers" network idea, http://journal.frontiersin.org/article/10.3389/fncom.2012.00001/abstract as a way to actually do pre-"pub" peer review in a minimally sensible way.

Second, there are annotation platforms like hypothes.is that I'd love to see applied to this general question of how to (technically) do post-pub peer review. See https://hypothes.is/. Any thoughts as to suitability?

advert said: "Titus will talk about the pros and cons of graduate school, how to choose if graduate school is a good fit for you, and tips for applying to graduate school"

see also:

https://www.netjeff.com/humor/item.cgi?file=GradSchoolOrHell

--

Grad school is very dependent on advisor, dept; talk to current grad students beforehand!

Trained by physicists, with a BA in pure math, a PhD in molecular developmental
biology, and lots of open source code to my name, I am currently a biologist
trapped in a computer science department. I work at the intersection of big
sequence data, novel computer science data structures and algorithms, and
biological hypothesis generation & validation.
# let's use the CSV library: http://docs.python.org/library/csv.html
import csv
import urllib
from datetime import datetime
# here workouts is GLOBAL
workouts = []

def load_data(url):
    data = []