@flashton2003
Created April 17, 2014 12:38
{
"metadata": {
"name": "",
"signature": "sha256:dd3f380620371d7def93011e81b190a7a8f796fba0551a284b06626a6757a641"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"Salmonella E-burst groups"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This IPython notebook (ipynb) is intended to convey the genetic diversity (or homogeneity!) of different Salmonella serotypes, as we move from phenotypic to genotypic methods.\n",
"\n",
"It is populated with 14 of the UK's most frequently observed Salmonella serotypes. Each serotype maps to a Python data structure (details below) that contains its e-burst groups (EBGs), and each EBG maps to the MLST sequence types (STs) that make it up. For more details see:\n",
"\n",
"http://www.plospathogens.org/article/info%3Adoi%2F10.1371%2Fjournal.ppat.1002776\n",
"\n",
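"For people familiar with Python syntax, the dictionary takes the format:\n",
"\n",
"{serotype: {ebg: [sts], ebg: [sts]}, serotype: {ebg: [sts], ...}, ...}\n",
"\n",
"EBG keys are integers and STs are stored as strings (some carry a prefix such as SLV).\n",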
"\n"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"sero_ebg_st = {'parab-java': {32: ['681', '423', '42', '733', '734'], 19: ['88', '372', '127'], 59: ['28'], 155: ['404', '679'], 5: ['896', '772', '570', '307', '267', '110', '265', '264', '149', '86', '43', '325', '266']}, 'enteritidis': {32: ['74'], 4: ['745', '310', '616', '460', '168', '136', '11', '691', '640', '11SLV', '183', '366', '1479', '814'], 93: ['180', '172']}, 'typhimurium': {1: ['376', '429', '205', '204', '323', '19', '302', '394', '35', '34', '99', '98', '159', '137', '456', '209', '128', '313', '332', '328', '213'], 26: ['15'], 138: ['SLV36', '36'], 54: ['13']}, 'virchow': {9: ['618', '755', '303', 'SLV16', 'TLV16', '38', '16', '181', '326', '648', '359'], 70: ['197', '333']}, 'braenderup': {24: ['194', '311', '21', '22']}, 'kentucky': {56: ['727', '198'], 164: ['832', 'SLV314', '314'], 15: ['151', '152', '318', '723', '221']}, 'agona': {26: ['15'], 83: ['463'], 54: ['13', '37', '1328', '1215']}, 'newport': {3: ['201', '1496', '157', '211', '46', '158', '45', '355', '193', '121', '131', '116', '353', '125', '184', '350', '614', '165'], 2: ['199', '190', '115', '117', 'SLV118', '119', '118', '345', '347', '5', '187', '189', '120', '122', '123', '164', '167', '223', '163', '354', '352', '351', '375'], 35: ['166', '360', '156'], 154: ['808', '807'], 7: ['132', '346', '200', '348', '31', '191', '188', '349']}, 'oranienburg': {41: ['47', '1523', '1522', '23'], 203: ['1538', '1392', '1513', '1512'], 44: ['1553', '1516', '320', '169', '1515', '174'], 45: ['292'], 50: ['91', '1514'], 52: ['179']}, 'infantis': {31: ['295', '32', '1032', '41', 'SLV32']}, 'typhi': {13: ['1', '890', '3', '2', '8', '911', 'SLV2', '892']}, 'paratyphi-a': {11: ['130', '129', '495', '494', 'SLV85', '479', '85']}, 'stanley': {29: ['1027', 'DLV29', '51', '29', '182', 'SLV51']}, 'montevideo': {208: ['749', '1491'], 4: ['195', '305', '316', '1493', '1489', '1488', '81', '699', '1537', '1536', '1535', '4', '1531'], 39: ['748', '138', '1518']}}"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 46
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The code below reports how many E-burst groups each serotype contains."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"print 'serotype\\tnumber of EBGs\\n'\n",
"for serotype in sero_ebg_st:\n",
"    print serotype, '\\t', len(sero_ebg_st[serotype])"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"serotype\tnumber of EBGs\n",
"\n",
"parab-java \t5\n",
"enteritidis \t3\n",
"typhimurium \t4\n",
"braenderup \t1\n",
"infantis \t1\n",
"stanley \t1\n",
"oranienburg \t6\n",
"virchow \t2\n",
"kentucky \t3\n",
"agona \t3\n",
"typhi \t1\n",
"newport \t5\n",
"paratyphi-a \t1\n",
"montevideo \t3\n"
]
}
],
"prompt_number": 47
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The code below lists the EBGs for each serotype and the number of STs in each."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"print 'serotype\\tNum of EBGs\\n'\n",
"for serotype in sero_ebg_st:\n",
"    print serotype, '\\t', len(sero_ebg_st[serotype]), '\\t'\n",
"    print 'EBG : Num STs'\n",
"    for ebg in sero_ebg_st[serotype]:\n",
"        # The trailing comma keeps all EBGs of a serotype on one line (Python 2)\n",
"        print '\\t', ebg, ':', len(sero_ebg_st[serotype][ebg]), '\\t\\t',\n",
"    print"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"serotype\tNum of EBGs\n",
"\n",
"parab-java \t5 \t\n",
"EBG : Num STs\n",
"\t32 : 5 \t\t\t59 : 1 \t\t\t19 : 3 \t\t\t155 : 2 \t\t\t5 : 13 \t\t\n",
"enteritidis \t3 \t\n",
"EBG : Num STs\n",
"\t32 : 1 \t\t\t4 : 14 \t\t\t93 : 2 \t\t\n",
"typhimurium \t4 \t\n",
"EBG : Num STs\n",
"\t1 : 21 \t\t\t26 : 1 \t\t\t138 : 2 \t\t\t54 : 1 \t\t\n",
"braenderup \t1 \t\n",
"EBG : Num STs\n",
"\t24 : 4 \t\t\n",
"infantis \t1 \t\n",
"EBG : Num STs\n",
"\t31 : 5 \t\t\n",
"stanley \t1 \t\n",
"EBG : Num STs\n",
"\t29 : 6 \t\t\n",
"oranienburg \t6 \t\n",
"EBG : Num STs\n",
"\t41 : 4 \t\t\t203 : 4 \t\t\t44 : 6 \t\t\t45 : 1 \t\t\t50 : 2 \t\t\t52 : 1 \t\t\n",
"virchow \t2 \t\n",
"EBG : Num STs\n",
"\t9 : 11 \t\t\t70 : 2 \t\t\n",
"kentucky \t3 \t\n",
"EBG : Num STs\n",
"\t56 : 2 \t\t\t164 : 3 \t\t\t15 : 5 \t\t\n",
"agona \t3 \t\n",
"EBG : Num STs\n",
"\t26 : 1 \t\t\t83 : 1 \t\t\t54 : 4 \t\t\n",
"typhi \t1 \t\n",
"EBG : Num STs\n",
"\t13 : 8 \t\t\n",
"newport \t5 \t\n",
"EBG : Num STs\n",
"\t35 : 3 \t\t\t2 : 23 \t\t\t3 : 18 \t\t\t154 : 2 \t\t\t7 : 8 \t\t\n",
"paratyphi-a \t1 \t\n",
"EBG : Num STs\n",
"\t11 : 7 \t\t\n",
"montevideo \t3 \t\n",
"EBG : Num STs\n",
"\t208 : 2 \t\t\t4 : 13 \t\t\t39 : 3 \t\t\n"
]
}
],
"prompt_number": 48
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Conclusion"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Admittedly, this does nothing that a simple blog post couldn't do, but I think it would be neat to let people interact with the ipynb, e.g. to input an ST and find which EBG it belongs to.\n",
"I think I will save that for another day though.\n",
"\n",
"I also think it is quite cool for any budding Pythonistas, as it shows off the advantages of nested dictionaries. You can download the dictionary and the code and experiment with dictionaries within dictionaries. They're great!"
]
}
],
"metadata": {}
}
]
}