Last active
December 17, 2015 10:49
Activity 2
{
 "metadata": {
  "name": "statement_2"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "heading",
     "level": 1,
     "metadata": {},
     "source": "Activity 2"
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": "The file `data/names.txt` has a million lines, each of which contains a *Saccharomyces cerevisiae* gene name."
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": "!wc -l data/names.txt",
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": "1000000 data/names.txt\r\n"
      }
     ],
     "prompt_number": 1
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": "!head data/names.txt",
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": "YIL054W\r\nYOR366W\r\nYNL244C\r\nYNL085W\r\nYDL244W\r\nYGL204C\r\nYNL046W\r\nYIL006W\r\nYBR077C\r\nYHL029C\r\n"
      }
     ],
     "prompt_number": 2
    },
    {
     "cell_type": "heading",
     "level": 3,
     "metadata": {},
     "source": "Main goal"
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": "Create a dictionary of (gene name: number of times that gene name appears in the file) pairs."
    },
    {
     "cell_type": "heading",
     "level": 3,
     "metadata": {},
     "source": "Stretch goals"
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": "Which gene name appears most often?\n\nVisualize the distribution of counts for all names."
    },
    {
     "cell_type": "heading",
     "level": 3,
     "metadata": {},
     "source": "Super stretch goal"
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": "Implement the naive array-based scheme described in the lecture and compare its speed to using a dictionary."
    }
   ],
   "metadata": {}
  }
 ]
}
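
One possible starting point for the main goal and the first stretch goal is sketched below. It is not the notebook author's solution: the function names `count_names` and `most_common_name` are hypothetical, and the three sample lines are made up for illustration (they reuse names from the `head` output above). The sketch assumes one gene name per line, as the notebook states.

```python
def count_names(lines):
    """Return a dict mapping each gene name to the number of lines it appears on."""
    counts = {}
    for line in lines:
        name = line.strip()  # drop the trailing "\n" (and any "\r")
        if not name:
            continue  # skip blank lines
        # dict.get supplies 0 the first time a name is seen
        counts[name] = counts.get(name, 0) + 1
    return counts


def most_common_name(counts):
    """Return the (name, count) pair with the largest count."""
    return max(counts.items(), key=lambda pair: pair[1])


# Illustrative in-memory input; the real data would come from the file.
sample_lines = ["YIL054W\n", "YOR366W\n", "YIL054W\n"]
sample_counts = count_names(sample_lines)
print(sample_counts)
print(most_common_name(sample_counts))
```

To run this on the real data, an open file handle can be passed directly, for example `with open("data/names.txt") as handle: counts = count_names(handle)`, since iterating over a file yields its lines one at a time. The standard library's `collections.Counter` should give the same counts with less code.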