@vals
Created April 29, 2012 19:01
Notes from Bio-IT World Expo 2012

###Jill Mesirov - Broad Institute

@broadinstitute

Broad has generated >150 TBPs in a year. How was this number calculated?


###Bas Burger - BT Global Commerce

There were like 5 misspellings in the feedback from Siri in the demo.

(Pretty sure Siri doesn't tie in to 3rd-party apps)


###David Dooling - Washington University

WUSTL

"When in doubt, throw it out"

Consensus is to store the BAM

[they] Track everything in great detail

"We try to eliminate people"


###Toby Bloom - Broad Institute

Another number: 3 TBases per day (≈ 4 TBytes)

QC feedback to the lab has become critical

Having data graphically displayed in real time has changed everything. (Uses Tableau: http://www.tableausoftware.com/)

80% at 20x
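
(The talk didn't define this metric. A common reading is "80% of target positions covered by at least 20 reads". A minimal sketch under that assumption; `fraction_at_depth` and the toy depths are made up for illustration, not Broad's code or data.)

```python
def fraction_at_depth(depths, min_depth=20):
    """Fraction of positions whose coverage is at least min_depth."""
    if not depths:
        raise ValueError("no positions given")
    covered = sum(1 for depth in depths if depth >= min_depth)
    return covered / len(depths)


# Toy per-position coverage values, invented for illustration only.
toy_depths = [35, 22, 20, 41, 19, 28, 5, 30, 24, 21]
print(f"{fraction_at_depth(toy_depths):.0%} of positions at >= 20x")
```

In practice the depths would come from a per-base coverage tool rather than a hand-written list.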

51 Illumina machines

(One particular day, 14 TBytes)

Network becomes bottleneck (storage you can scale)
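
(Back-of-the-envelope check on why the network matters, using the daily volumes quoted in the talk. Assumes decimal units, 1 TByte = 10^12 bytes, and data spread evenly over 24 hours; real traffic is burstier, so these are lower bounds. Function names are mine.)

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_rate(tbytes_per_day):
    """Return (MB/s, Gbit/s) needed to move a daily volume given in decimal TBytes."""
    bytes_per_second = tbytes_per_day * 1e12 / SECONDS_PER_DAY
    return bytes_per_second / 1e6, bytes_per_second * 8 / 1e9


def bytes_per_base(tbytes, tbases):
    """Average storage cost per sequenced base."""
    return tbytes / tbases


if __name__ == "__main__":
    # 3 TBases/day stored as roughly 4 TBytes/day (figures from the talk).
    print(f"bytes per base: {bytes_per_base(4, 3):.2f}")
    for label, volume in [("typical day", 4), ("peak day", 14)]:
        mb_s, gbit_s = sustained_rate(volume)
        print(f"{label}: {volume} TB/day = {mb_s:.0f} MB/s = {gbit_s:.2f} Gbit/s")
```

So a typical day needs roughly 46 MB/s sustained, and the 14 TByte day about 162 MB/s (about 1.3 Gbit/s), before any copying or re-reading by analysis jobs.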

Tried writing to named pipes instead of lots and lots of files -> improved turnaround time by 50%.
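
(No details were given on how the pipes were wired up. A minimal sketch of the general idea, assuming a POSIX system: one stage streams records into a FIFO and the next stage reads them, so no intermediate file is written to disk. `produce`, `consume`, and `stream_through_fifo` are hypothetical names, not from the talk.)

```python
import os
import tempfile
import threading


def produce(path, records):
    # Opening a FIFO for writing blocks until a reader opens the other end.
    with open(path, "w") as fifo:
        for record in records:
            fifo.write(record + "\n")


def consume(path):
    # Reads until the writer closes its end of the pipe (EOF).
    with open(path) as fifo:
        return [line.rstrip("\n") for line in fifo]


def stream_through_fifo(records):
    """Pass records from a writer thread to a reader via a named pipe."""
    with tempfile.TemporaryDirectory() as tmpdir:
        fifo_path = os.path.join(tmpdir, "stage.fifo")
        os.mkfifo(fifo_path)  # POSIX only; not available on Windows
        writer = threading.Thread(target=produce, args=(fifo_path, records))
        writer.start()
        received = consume(fifo_path)
        writer.join()
    return received


if __name__ == "__main__":
    print(stream_through_fifo(["read_1", "read_2", "read_3"]))
```

The data only ever lives in the kernel's pipe buffer, which is presumably where the savings over many small files come from.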

Haven't looked much at Hadoop style stuff

Lots of work to make sure analysis is run on files in place, so users don't need to copy data.


###Chris Botka - Harvard Medical School

About 40 TB of data with keywords like "junk", "trash", "backup", "copy", etc.
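
(How that number was obtained wasn't stated. One way such a survey could be done is to walk the file tree and total the sizes of files whose path contains one of the keywords. A rough sketch under that assumption; the keyword list only covers the four named in the talk, and the function names are mine.)

```python
import os
import tempfile

# Only the keywords quoted in the talk; the "etc" is left unfilled.
KEYWORDS = ("junk", "trash", "backup", "copy")


def matches_keyword(relative_path, keywords=KEYWORDS):
    """True if any keyword occurs in the path (case-insensitive)."""
    lowered = relative_path.lower()
    return any(keyword in lowered for keyword in keywords)


def flagged_bytes(root, keywords=KEYWORDS):
    """Total size in bytes of files under root whose relative path matches a keyword."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            if matches_keyword(os.path.relpath(full_path, root), keywords):
                try:
                    total += os.path.getsize(full_path)
                except OSError:
                    pass  # file vanished or is unreadable; skip it
    return total


if __name__ == "__main__":
    # Tiny throwaway tree, just to show the call.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "old_backup"))
        with open(os.path.join(root, "old_backup", "run.bam"), "wb") as handle:
            handle.write(b"x" * 1024)
        with open(os.path.join(root, "sample.bam"), "wb") as handle:
            handle.write(b"x" * 2048)
        print(flagged_bytes(root), "bytes flagged")
```

Substring matching is crude ("copy" also hits "copynumber"), so treat the total as a starting point for cleanup, not a delete list.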
