@kantale
Last active March 30, 2018 14:53
Instructions for the 1st exercise
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Κάποιες οδηγίες για την άσκηση"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Ας \"κατεβάσουμε\" το dataset:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"!wget -o gwas.tcv \"https://www.ebi.ac.uk/gwas/api/search/downloads/full\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Ανοίγουμε το αρχείο:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"f = open('gwas.tsv')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Διαβάζουμε τη 1η γραμμή:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"first_line = f.readline()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Το first_line περιέχει και το \"enter\" ('\\n'). Μπορούμε να το αφαιρέσουμε:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"first_line = first_line.replace('\\n', '')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Τη κάνουμε \"split\" με βάση τα tabs:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"header = first_line.split('\\t')"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['DATE ADDED TO CATALOG', 'PUBMEDID', 'FIRST AUTHOR', 'DATE', 'JOURNAL', 'LINK', 'STUDY', 'DISEASE/TRAIT', 'INITIAL SAMPLE SIZE', 'REPLICATION SAMPLE SIZE', 'REGION', 'CHR_ID', 'CHR_POS', 'REPORTED GENE(S)', 'MAPPED_GENE', 'UPSTREAM_GENE_ID', 'DOWNSTREAM_GENE_ID', 'SNP_GENE_IDS', 'UPSTREAM_GENE_DISTANCE', 'DOWNSTREAM_GENE_DISTANCE', 'STRONGEST SNP-RISK ALLELE', 'SNPS', 'MERGED', 'SNP_ID_CURRENT', 'CONTEXT', 'INTERGENIC', 'RISK ALLELE FREQUENCY', 'P-VALUE', 'PVALUE_MLOG', 'P-VALUE (TEXT)', 'OR or BETA', '95% CI (TEXT)', 'PLATFORM [SNPS PASSING QC]', 'CNV']\n"
]
}
],
"source": [
"print (header)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Ας αποθηκεύσουμε τώρα σε μία λίστα όλες τις υπόλοιπες γραμμές:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"lines = f.readlines()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Σε κάθε μία από αυτές τις γραμμές ας βγάλουμε το enter, και ας κάνουμε split με βάση τα tabs"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"content = [x.replace('\\n', '').split('\\t') for x in lines]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Πόσα entries έχει το content;"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"64239"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(content)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Τώρα μπορούμε να κλείσουμε και το αρχείο:"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"f.close()"
]
},
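{
"cell_type": "markdown",
"metadata": {},
"source": [
"The open / read / close steps above can also be sketched with a `with` block, which closes the file automatically, and the standard-library `csv` module, which does the tab splitting. This is only an illustrative alternative, not part of the original exercise; the names `handle`, `header_alt` and `content_alt` are made up here so that `header` and `content` are left untouched. `csv.QUOTE_NONE` is chosen on the assumption that quote characters should be kept as plain text, as `str.split` does."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import csv\n",
"\n",
"# Assumes gwas.tsv was downloaded by the wget cell at the top of the notebook\n",
"with open('gwas.tsv', newline='') as handle:\n",
"    reader = csv.reader(handle, delimiter='\\t', quoting=csv.QUOTE_NONE)\n",
"    header_alt = next(reader)   # first row: column names\n",
"    content_alt = list(reader)  # remaining rows, each a list of fields\n",
"\n",
"len(content_alt)"
]
},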
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Τώρα έχουμε όλο το περιεχόμενο του αρχείου στις λίστες header και content. Ποιο είναι το index του 'CHR_ID';"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"11"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"header.index('CHR_ID')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Ας πάρουμε τα CHR_ID από όλο το content:"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"chromosomes = [x[11] for x in content]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Ας ελέγξουμε τη τιμή για τα πρώτα 10"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['1', '13', '15', '1', '3', '15', '15', '8', '11', '18']"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chromosomes[:10]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Ποιες διαφορετικές τιμές υπάρχουν;"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'',\n",
" '1',\n",
" '1 x 1',\n",
" '1 x 10',\n",
" '1 x 13',\n",
" '1 x 14',\n",
" '1 x 16',\n",
" '1 x 17',\n",
" '1 x 19',\n",
" '1 x 3',\n",
" '1 x 6',\n",
" '1 x 7',\n",
" '1 x 9',\n",
" '10',\n",
" '10 x 11',\n",
" '10 x 12',\n",
" '10 x 14',\n",
" '10 x 19',\n",
" '10 x 21',\n",
" '10 x 22',\n",
" '10 x 8',\n",
" '10;10;10',\n",
" '10;10;10;10',\n",
" '10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10;10',\n",
" '11',\n",
" '11 x 4',\n",
" '11;11;11',\n",
" '11;11;11;11',\n",
" '12',\n",
" '12 x 12',\n",
" '12 x 15',\n",
" '12 x 16',\n",
" '12 x 17',\n",
" '12 x 20',\n",
" '12 x 22',\n",
" '12 x 8',\n",
" '12;12',\n",
" '12;12;12',\n",
" '13',\n",
" '13 x 16',\n",
" '13 x 18',\n",
" '13 x 2',\n",
" '13 x 5',\n",
" '13 x 8',\n",
" '14',\n",
" '14 x 11',\n",
" '14 x 21',\n",
" '14 x 3',\n",
" '14;14;14;14;14;14',\n",
" '15',\n",
" '15 x 11',\n",
" '15 x 8',\n",
" '15;15',\n",
" '16',\n",
" '16 x 7',\n",
" '16;16;16',\n",
" '16;16;16;16;16;16',\n",
" '17',\n",
" '17;17',\n",
" '17;17;17;17;17;17;17',\n",
" '17;17;17;17;17;17;17;17;17;17;17;17',\n",
" '18',\n",
" '18 x 22',\n",
" '18 x 3',\n",
" '18 x X',\n",
" '19',\n",
" '19;19;19;19',\n",
" '19;19;19;19;19;19;19',\n",
" '1;1',\n",
" '1;1;1',\n",
" '1;1;1;1',\n",
" '1;1;1;1;1',\n",
" '1;1;1;1;1;1',\n",
" '1;1;1;1;1;1;1',\n",
" '1;1;1;1;1;1;1;1',\n",
" '1;1;1;1;1;1;1;1;1',\n",
" '1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1',\n",
" '2',\n",
" '2 x 11',\n",
" '2 x 12',\n",
" '2 x 13',\n",
" '2 x 15',\n",
" '2 x 17',\n",
" '2 x 2',\n",
" '2 x 20',\n",
" '2 x 3',\n",
" '2 x 5',\n",
" '2 x 6',\n",
" '2 x 9',\n",
" '20',\n",
" '20 x 19',\n",
" '20 x 20',\n",
" '20;20',\n",
" '20;20;20;20',\n",
" '21',\n",
" '22',\n",
" '22 x 11',\n",
" '22 x 4',\n",
" '22 x 8',\n",
" '22;22;22;22',\n",
" '2;1;2;2;2;2;2;2;2;2;2;2;2;2',\n",
" '2;2',\n",
" '2;2;2',\n",
" '2;2;2;2',\n",
" '2;2;2;2;2;2;2;2;2;2;2;2',\n",
" '2;2;2;2;2;2;2;2;2;2;2;2;2;2;2;2;2;2;2;2',\n",
" '3',\n",
" '3 x 10',\n",
" '3 x 11',\n",
" '3 x 12',\n",
" '3 x 15',\n",
" '3 x 18',\n",
" '3 x 2',\n",
" '3 x 20',\n",
" '3 x 22',\n",
" '3 x 3',\n",
" '3 x 4',\n",
" '3 x 5',\n",
" '3 x 7',\n",
" '3 x 9',\n",
" '3;3',\n",
" '3;3;3;3',\n",
" '4',\n",
" '4 x 11',\n",
" '4 x 12',\n",
" '4 x 18',\n",
" '4 x 19',\n",
" '4 x 20',\n",
" '4 x 22',\n",
" '4 x 4',\n",
" '4 x 6',\n",
" '4 x 8',\n",
" '4;4',\n",
" '4;4;4;4',\n",
" '4;4;4;4;4',\n",
" '5',\n",
" '5 x 10',\n",
" '5 x 11',\n",
" '5 x 13',\n",
" '5 x 14',\n",
" '5 x 15',\n",
" '5 x 16',\n",
" '5 x 17',\n",
" '5 x 19',\n",
" '5 x 21',\n",
" '5 x 3',\n",
" '5 x 5',\n",
" '5 x 6',\n",
" '5 x 7',\n",
" '5 x 8',\n",
" '5;5',\n",
" '6',\n",
" '6 x 1',\n",
" '6 x 12',\n",
" '6 x 16',\n",
" '6 x 17',\n",
" '6 x 6',\n",
" '6 x 7',\n",
" '6 x 8',\n",
" '6 x 9',\n",
" '6;6',\n",
" '6;6;6',\n",
" '6;6;6;6',\n",
" '6;6;6;6;6',\n",
" '6;6;6;6;6;6',\n",
" '6;6;6;6;6;6;6;6;6;6;6;6;6;6;6;6;6;6;6;6;6;6;6;6;6;6;6',\n",
" '7',\n",
" '7 x 1',\n",
" '7 x 10',\n",
" '7 x 15',\n",
" '7 x 16',\n",
" '7 x 17',\n",
" '7 x 20',\n",
" '7 x 8',\n",
" '7 x 9',\n",
" '7;7',\n",
" '7;7;7;7',\n",
" '8',\n",
" '8 x 10',\n",
" '8 x 11',\n",
" '8 x 15',\n",
" '8 x 18',\n",
" '8 x 8',\n",
" '8 x 9',\n",
" '8;8',\n",
" '8;8;8',\n",
" '8;8;8;8;8;8;8;8;8;8;8;8;8;8',\n",
" '9',\n",
" '9 x 10',\n",
" '9 x 15',\n",
" '9 x 3',\n",
" '9 x 4',\n",
" '9 x 8',\n",
" '9 x 9',\n",
" '9;9',\n",
" '9;9;9;9',\n",
" 'X',\n",
" 'Y'}"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"set(chromosomes)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Παρατηρούμε ότι έχει διάφορες τιμές πέρα από τα κλασσικά ονόματα χρωμοσωμάτων. Ας τα πετάξουμε!"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', 'X', 'Y']\n"
]
}
],
"source": [
"accepted = [str(x) for x in range(1,23)] + ['X', 'Y']\n",
"print (accepted)"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"chromosomes = [x for x in chromosomes if x in accepted]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Ας μετρήσουμε πόσες φορές υπάρχει το κάθε ένα. \n",
"\n",
"Πρώτος τρόπος: Custom dictionary"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'1': 5529,\n",
" '10': 2755,\n",
" '11': 3348,\n",
" '12': 2912,\n",
" '13': 1405,\n",
" '14': 1640,\n",
" '15': 2534,\n",
" '16': 2318,\n",
" '17': 2130,\n",
" '18': 1278,\n",
" '19': 2177,\n",
" '2': 5039,\n",
" '20': 1440,\n",
" '21': 550,\n",
" '22': 1091,\n",
" '3': 4122,\n",
" '4': 3554,\n",
" '5': 3485,\n",
" '6': 6068,\n",
" '7': 3035,\n",
" '8': 2791,\n",
" '9': 2512,\n",
" 'X': 372,\n",
" 'Y': 2}"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"counts = {}\n",
"for c in chromosomes:\n",
" counts[c] = counts.get(c, 0) + 1\n",
"counts"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Δεύτερος τρόπος dictionary comprehension:"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'1': 5529,\n",
" '10': 2755,\n",
" '11': 3348,\n",
" '12': 2912,\n",
" '13': 1405,\n",
" '14': 1640,\n",
" '15': 2534,\n",
" '16': 2318,\n",
" '17': 2130,\n",
" '18': 1278,\n",
" '19': 2177,\n",
" '2': 5039,\n",
" '20': 1440,\n",
" '21': 550,\n",
" '22': 1091,\n",
" '3': 4122,\n",
" '4': 3554,\n",
" '5': 3485,\n",
" '6': 6068,\n",
" '7': 3035,\n",
" '8': 2791,\n",
" '9': 2512,\n",
" 'X': 372,\n",
" 'Y': 2}"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"{x:sum([1 for y in chromosomes if y==x ]) for x in set(chromosomes)}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" Τρίτος τρόπος. Χρησιμοποιούμε τη κλάση Counter:"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Counter({'6': 6068, '1': 5529, '2': 5039, '3': 4122, '4': 3554, '5': 3485, '11': 3348, '7': 3035, '12': 2912, '8': 2791, '10': 2755, '15': 2534, '9': 2512, '16': 2318, '19': 2177, '17': 2130, '14': 1640, '20': 1440, '13': 1405, '18': 1278, '22': 1091, '21': 550, 'X': 372, 'Y': 2})\n"
]
}
],
"source": [
"from collections import Counter\n",
"counts = Counter(chromosomes)\n",
"print (counts)"
]
},
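{
"cell_type": "markdown",
"metadata": {},
"source": [
"A `Counter` can also report its most frequent keys directly through `most_common(n)`. The short sketch below uses a made-up list called `sample`, not the real dataset, purely to illustrate the call; each item of the result is a `(value, count)` pair, most frequent first."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from collections import Counter\n",
"\n",
"# Made-up values for illustration only, not taken from gwas.tsv\n",
"sample = ['6', '1', '6', '2', '1', '6']\n",
"\n",
"# The two most frequent values together with their counts\n",
"Counter(sample).most_common(2)"
]
},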
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Ποιο είναι το μικρότερο p-value για το χρωμόσωμα 5;"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"5e-274"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chromosome_index = header.index('CHR_ID')\n",
"pvalue_index = header.index('P-VALUE')\n",
"min([float(x[pvalue_index]) for x in content if x[chromosome_index] == '5'])"
]
},
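{
"cell_type": "markdown",
"metadata": {},
"source": [
"`float()` raises a `ValueError` on an empty or non-numeric string. The cell above ran cleanly on this download, so the following is only a defensive sketch for a file in which some P-VALUE fields might be missing. The helper `to_float` and the list `sample_pvalues` are hypothetical names with made-up values."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def to_float(text):\n",
"    # Return None instead of raising when the field is empty or not numeric\n",
"    try:\n",
"        return float(text)\n",
"    except ValueError:\n",
"        return None\n",
"\n",
"# Made-up field values for illustration only\n",
"sample_pvalues = ['5e-274', '', '3e-8', 'NR']\n",
"\n",
"# Smallest value among the fields that could be parsed\n",
"min(p for p in map(to_float, sample_pvalues) if p is not None)"
]
},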
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Ποιος first author έχει κάνει τις περισσότερες δημοσιεύσεις για το χρωμόσωμα 10;"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(224, 'Astle WJ')"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"first_author_index = header.index('FIRST AUTHOR')\n",
"max((v,k) for k,v in Counter([x[first_author_index] for x in content if x[chromosome_index] == '10']).items())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Πόσα entries υπάρχουν που έχουν στο \"STUDY\" τη λέξη: \"cancer\";"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"4108"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"study_index = header.index('STUDY')\n",
"sum(1 for x in content if 'cancer' in x[study_index])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Ποιο χρωμόσωμα έχει τις περισσότερες μελέτες για cancer που έχουν δημοσιευτεί στο Nature;"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(85, '5')"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"journal_index = header.index('JOURNAL')\n",
"all_chromosomes = [x[chromosome_index] for x in content \n",
" if 'cancer' in x[study_index] and 'nature' in x[journal_index].lower()]\n",
"max([(v,k) for k,v in Counter(all_chromosomes).items()])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Υπάρχει κάποιο χρωμόσωμα που δεν έχει καμία δημοσίευση με τον τίτλο cancer στο Nature"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'Y'}"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"set(accepted) - set(all_chromosomes)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Ας φτιάξουμε ένα dictionary όπου για κάθε χρωμόσωμα θα έχει τους 3 authors με τα περισσότερα publications στο Nature"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'1': [(131, 'Shungin D'), (67, 'Michailidou K'), (63, 'Locke AE')],\n",
" '10': [(64, 'Michailidou K'), (18, 'Locke AE'), (17, 'Okada Y')],\n",
" '11': [(41, 'Locke AE'), (35, 'Michailidou K'), (32, 'Shungin D')],\n",
" '12': [(66, 'Shungin D'), (33, 'Locke AE'), (12, 'Teslovich TM')],\n",
" '13': [(20, 'Locke AE'), (12, 'Shungin D'), (10, 'Michailidou K')],\n",
" '14': [(28, 'Locke AE'), (22, 'Michailidou K'), (19, 'Shungin D')],\n",
" '15': [(33, 'Shungin D'), (16, 'Locke AE'), (11, 'Lango Allen H')],\n",
" '16': [(52, 'Shungin D'), (37, 'Locke AE'), (21, 'Michailidou K')],\n",
" '17': [(36, 'Shungin D'), (29, 'Michailidou K'), (21, 'Locke AE')],\n",
" '18': [(28, 'Shungin D'), (23, 'Locke AE'), (19, 'Michailidou K')],\n",
" '19': [(39, 'Shungin D'), (28, 'Locke AE'), (21, 'Michailidou K')],\n",
" '2': [(76, 'Shungin D'), (70, 'Locke AE'), (50, 'Michailidou K')],\n",
" '20': [(37, 'Shungin D'), (11, 'Locke AE'), (10, 'Michailidou K')],\n",
" '21': [(8, 'Michailidou K'), (7, 'Locke AE'), (5, 'Okada Y')],\n",
" '22': [(37, 'Michailidou K'), (14, 'Shungin D'), (6, 'Okada Y')],\n",
" '3': [(93, 'Shungin D'), (54, 'Locke AE'), (39, 'Michailidou K')],\n",
" '4': [(39, 'Shungin D'), (26, 'Michailidou K'), (20, 'Locke AE')],\n",
" '5': [(83, 'Michailidou K'), (55, 'Shungin D'), (20, 'Locke AE')],\n",
" '6': [(126, 'Shungin D'), (68, 'Michailidou K'), (45, 'Locke AE')],\n",
" '7': [(54, 'Shungin D'), (33, 'Michailidou K'), (24, 'Locke AE')],\n",
" '8': [(40, 'Michailidou K'), (27, 'Locke AE'), (26, 'Shungin D')],\n",
" '9': [(26, 'Michailidou K'), (24, 'Locke AE'), (19, 'Shungin D')],\n",
" 'X': [(3, 'Ripke S'), (2, 'Okada Y'), (2, 'Michailidou K')],\n",
" 'Y': []}"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"results = { chromosome: \n",
" sorted(\n",
" [(times, author) for author, times in \n",
" Counter(\n",
" [y[first_author_index] for y in content \n",
" if y[chromosome_index]==chromosome and 'nature' in y[journal_index].lower()]\n",
" ).items()\n",
" ], \n",
" reverse=True)[:3] \n",
" for chromosome in accepted\n",
" }\n",
"\n",
"results"
]
},
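{
"cell_type": "markdown",
"metadata": {},
"source": [
"The comprehension above scans `content` once per chromosome. A single-pass variant can be sketched with a `defaultdict` of `Counter` objects. The list `pairs` below is made-up data standing in for the filtered (chromosome, first author) rows, and `per_chromosome` is a hypothetical name. Note two differences from the cell above: `most_common` returns `(author, count)` pairs rather than `(count, author)`, and it orders ties by first appearance rather than by author name."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from collections import Counter, defaultdict\n",
"\n",
"# Made-up (chromosome, first author) pairs for illustration only\n",
"pairs = [('1', 'A'), ('1', 'B'), ('1', 'A'), ('2', 'B')]\n",
"\n",
"# One Counter of authors per chromosome, filled in a single pass\n",
"per_chromosome = defaultdict(Counter)\n",
"for chromosome, author in pairs:\n",
"    per_chromosome[chromosome][author] += 1\n",
"\n",
"# Up to three most frequent authors for each chromosome\n",
"{c: counter.most_common(3) for c, counter in per_chromosome.items()}"
]
},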
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Ας προσπαθήσουμε να δείξουμε τα αποτελέσματα πιο όμορφα:"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style>\n",
" .dataframe thead tr:only-child th {\n",
" text-align: right;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: left;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>1st</th>\n",
" <th>2nd</th>\n",
" <th>3rd</th>\n",
" </tr>\n",
" <tr>\n",
" <th>chromosome</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Shungin D</td>\n",
" <td>Michailidou K</td>\n",
" <td>Locke AE</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Shungin D</td>\n",
" <td>Locke AE</td>\n",
" <td>Michailidou K</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Shungin D</td>\n",
" <td>Locke AE</td>\n",
" <td>Michailidou K</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Shungin D</td>\n",
" <td>Michailidou K</td>\n",
" <td>Locke AE</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>Michailidou K</td>\n",
" <td>Shungin D</td>\n",
" <td>Locke AE</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>Shungin D</td>\n",
" <td>Michailidou K</td>\n",
" <td>Locke AE</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>Shungin D</td>\n",
" <td>Michailidou K</td>\n",
" <td>Locke AE</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>Michailidou K</td>\n",
" <td>Locke AE</td>\n",
" <td>Shungin D</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>Michailidou K</td>\n",
" <td>Locke AE</td>\n",
" <td>Shungin D</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>Michailidou K</td>\n",
" <td>Locke AE</td>\n",
" <td>Okada Y</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>Locke AE</td>\n",
" <td>Michailidou K</td>\n",
" <td>Shungin D</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12</th>\n",
" <td>Shungin D</td>\n",
" <td>Locke AE</td>\n",
" <td>Teslovich TM</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>Locke AE</td>\n",
" <td>Shungin D</td>\n",
" <td>Michailidou K</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td>Locke AE</td>\n",
" <td>Michailidou K</td>\n",
" <td>Shungin D</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>Shungin D</td>\n",
" <td>Locke AE</td>\n",
" <td>Lango Allen H</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16</th>\n",
" <td>Shungin D</td>\n",
" <td>Locke AE</td>\n",
" <td>Michailidou K</td>\n",
" </tr>\n",
" <tr>\n",
" <th>17</th>\n",
" <td>Shungin D</td>\n",
" <td>Michailidou K</td>\n",
" <td>Locke AE</td>\n",
" </tr>\n",
" <tr>\n",
" <th>18</th>\n",
" <td>Shungin D</td>\n",
" <td>Locke AE</td>\n",
" <td>Michailidou K</td>\n",
" </tr>\n",
" <tr>\n",
" <th>19</th>\n",
" <td>Shungin D</td>\n",
" <td>Locke AE</td>\n",
" <td>Michailidou K</td>\n",
" </tr>\n",
" <tr>\n",
" <th>20</th>\n",
" <td>Shungin D</td>\n",
" <td>Locke AE</td>\n",
" <td>Michailidou K</td>\n",
" </tr>\n",
" <tr>\n",
" <th>21</th>\n",
" <td>Michailidou K</td>\n",
" <td>Locke AE</td>\n",
" <td>Okada Y</td>\n",
" </tr>\n",
" <tr>\n",
" <th>22</th>\n",
" <td>Michailidou K</td>\n",
" <td>Shungin D</td>\n",
" <td>Okada Y</td>\n",
" </tr>\n",
" <tr>\n",
" <th>X</th>\n",
" <td>Ripke S</td>\n",
" <td>Okada Y</td>\n",
" <td>Michailidou K</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" 1st 2nd 3rd\n",
"chromosome \n",
"1 Shungin D Michailidou K Locke AE\n",
"2 Shungin D Locke AE Michailidou K\n",
"3 Shungin D Locke AE Michailidou K\n",
"4 Shungin D Michailidou K Locke AE\n",
"5 Michailidou K Shungin D Locke AE\n",
"6 Shungin D Michailidou K Locke AE\n",
"7 Shungin D Michailidou K Locke AE\n",
"8 Michailidou K Locke AE Shungin D\n",
"9 Michailidou K Locke AE Shungin D\n",
"10 Michailidou K Locke AE Okada Y\n",
"11 Locke AE Michailidou K Shungin D\n",
"12 Shungin D Locke AE Teslovich TM\n",
"13 Locke AE Shungin D Michailidou K\n",
"14 Locke AE Michailidou K Shungin D\n",
"15 Shungin D Locke AE Lango Allen H\n",
"16 Shungin D Locke AE Michailidou K\n",
"17 Shungin D Michailidou K Locke AE\n",
"18 Shungin D Locke AE Michailidou K\n",
"19 Shungin D Locke AE Michailidou K\n",
"20 Shungin D Locke AE Michailidou K\n",
"21 Michailidou K Locke AE Okada Y\n",
"22 Michailidou K Shungin D Okada Y\n",
"X Ripke S Okada Y Michailidou K"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import pandas as pd\n",
"\n",
"items_2 = sorted([x for x in results.items() if x[1]], key=lambda x : accepted.index(x[0]))\n",
"results_2 = {\n",
" 'chromosome': [x[0] for x in items_2],\n",
" '1st': [x[1][0][1] for x in items_2],\n",
" '2nd': [x[1][1][1] for x in items_2],\n",
" '3rd': [x[1][2][1] for x in items_2],\n",
"}\n",
"\n",
"df = pd.DataFrame.from_dict(results_2)\n",
"df = df.set_index('chromosome')\n",
"df\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python [Root]",
"language": "python",
"name": "Python [Root]"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}