Skip to content

Instantly share code, notes, and snippets.

@QuLogic
Last active December 17, 2015 21:44
Show Gist options
  • Save QuLogic/c65772479f6101393fc3 to your computer and use it in GitHub Desktop.
Save QuLogic/c65772479f6101393fc3 to your computer and use it in GitHub Desktop.
R/Pandas Introduction
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Introduction to Pandas via data wrangling\n",
"## Based on \"Introduction to R via data wrangling\" by Luke Johnston\n",
"https://uoftcoders.github.io/studyGroup/lessons/introR/lesson/"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"%matplotlib inline\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Import/export your data\n",
"\n",
"Use `pd.read_csv` or `df.to_csv`."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"pd.read_csv?"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"pd.DataFrame.to_csv?"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# There's no builtin \"swiss\" dataset, so just read the csv from R.\n",
"ds = pd.read_csv('swiss.csv')"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Unnamed: 0</th>\n",
" <th>Fertility</th>\n",
" <th>Agriculture</th>\n",
" <th>Examination</th>\n",
" <th>Education</th>\n",
" <th>Catholic</th>\n",
" <th>Infant.Mortality</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Courtelary</td>\n",
" <td>80.2</td>\n",
" <td>17.0</td>\n",
" <td>15</td>\n",
" <td>12</td>\n",
" <td>9.96</td>\n",
" <td>22.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Delemont</td>\n",
" <td>83.1</td>\n",
" <td>45.1</td>\n",
" <td>6</td>\n",
" <td>9</td>\n",
" <td>84.84</td>\n",
" <td>22.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Franches-Mnt</td>\n",
" <td>92.5</td>\n",
" <td>39.7</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>93.40</td>\n",
" <td>20.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Moutier</td>\n",
" <td>85.8</td>\n",
" <td>36.5</td>\n",
" <td>12</td>\n",
" <td>7</td>\n",
" <td>33.77</td>\n",
" <td>20.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Neuveville</td>\n",
" <td>76.9</td>\n",
" <td>43.5</td>\n",
" <td>17</td>\n",
" <td>15</td>\n",
" <td>5.16</td>\n",
" <td>20.6</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>Porrentruy</td>\n",
" <td>76.1</td>\n",
" <td>35.3</td>\n",
" <td>9</td>\n",
" <td>7</td>\n",
" <td>90.57</td>\n",
" <td>26.6</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>Broye</td>\n",
" <td>83.8</td>\n",
" <td>70.2</td>\n",
" <td>16</td>\n",
" <td>7</td>\n",
" <td>92.85</td>\n",
" <td>23.6</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>Glane</td>\n",
" <td>92.4</td>\n",
" <td>67.8</td>\n",
" <td>14</td>\n",
" <td>8</td>\n",
" <td>97.16</td>\n",
" <td>24.9</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>Gruyere</td>\n",
" <td>82.4</td>\n",
" <td>53.3</td>\n",
" <td>12</td>\n",
" <td>7</td>\n",
" <td>97.67</td>\n",
" <td>21.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>Sarine</td>\n",
" <td>82.9</td>\n",
" <td>45.2</td>\n",
" <td>16</td>\n",
" <td>13</td>\n",
" <td>91.38</td>\n",
" <td>24.4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>Veveyse</td>\n",
" <td>87.1</td>\n",
" <td>64.5</td>\n",
" <td>14</td>\n",
" <td>6</td>\n",
" <td>98.61</td>\n",
" <td>24.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>Aigle</td>\n",
" <td>64.1</td>\n",
" <td>62.0</td>\n",
" <td>21</td>\n",
" <td>12</td>\n",
" <td>8.52</td>\n",
" <td>16.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12</th>\n",
" <td>Aubonne</td>\n",
" <td>66.9</td>\n",
" <td>67.5</td>\n",
" <td>14</td>\n",
" <td>7</td>\n",
" <td>2.27</td>\n",
" <td>19.1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>Avenches</td>\n",
" <td>68.9</td>\n",
" <td>60.7</td>\n",
" <td>19</td>\n",
" <td>12</td>\n",
" <td>4.43</td>\n",
" <td>22.7</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td>Cossonay</td>\n",
" <td>61.7</td>\n",
" <td>69.3</td>\n",
" <td>22</td>\n",
" <td>5</td>\n",
" <td>2.82</td>\n",
" <td>18.7</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>Echallens</td>\n",
" <td>68.3</td>\n",
" <td>72.6</td>\n",
" <td>18</td>\n",
" <td>2</td>\n",
" <td>24.20</td>\n",
" <td>21.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16</th>\n",
" <td>Grandson</td>\n",
" <td>71.7</td>\n",
" <td>34.0</td>\n",
" <td>17</td>\n",
" <td>8</td>\n",
" <td>3.30</td>\n",
" <td>20.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>17</th>\n",
" <td>Lausanne</td>\n",
" <td>55.7</td>\n",
" <td>19.4</td>\n",
" <td>26</td>\n",
" <td>28</td>\n",
" <td>12.11</td>\n",
" <td>20.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>18</th>\n",
" <td>La Vallee</td>\n",
" <td>54.3</td>\n",
" <td>15.2</td>\n",
" <td>31</td>\n",
" <td>20</td>\n",
" <td>2.15</td>\n",
" <td>10.8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>19</th>\n",
" <td>Lavaux</td>\n",
" <td>65.1</td>\n",
" <td>73.0</td>\n",
" <td>19</td>\n",
" <td>9</td>\n",
" <td>2.84</td>\n",
" <td>20.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>20</th>\n",
" <td>Morges</td>\n",
" <td>65.5</td>\n",
" <td>59.8</td>\n",
" <td>22</td>\n",
" <td>10</td>\n",
" <td>5.23</td>\n",
" <td>18.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>21</th>\n",
" <td>Moudon</td>\n",
" <td>65.0</td>\n",
" <td>55.1</td>\n",
" <td>14</td>\n",
" <td>3</td>\n",
" <td>4.52</td>\n",
" <td>22.4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>22</th>\n",
" <td>Nyone</td>\n",
" <td>56.6</td>\n",
" <td>50.9</td>\n",
" <td>22</td>\n",
" <td>12</td>\n",
" <td>15.14</td>\n",
" <td>16.7</td>\n",
" </tr>\n",
" <tr>\n",
" <th>23</th>\n",
" <td>Orbe</td>\n",
" <td>57.4</td>\n",
" <td>54.1</td>\n",
" <td>20</td>\n",
" <td>6</td>\n",
" <td>4.20</td>\n",
" <td>15.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>24</th>\n",
" <td>Oron</td>\n",
" <td>72.5</td>\n",
" <td>71.2</td>\n",
" <td>12</td>\n",
" <td>1</td>\n",
" <td>2.40</td>\n",
" <td>21.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25</th>\n",
" <td>Payerne</td>\n",
" <td>74.2</td>\n",
" <td>58.1</td>\n",
" <td>14</td>\n",
" <td>8</td>\n",
" <td>5.23</td>\n",
" <td>23.8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>26</th>\n",
" <td>Paysd'enhaut</td>\n",
" <td>72.0</td>\n",
" <td>63.5</td>\n",
" <td>6</td>\n",
" <td>3</td>\n",
" <td>2.56</td>\n",
" <td>18.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>27</th>\n",
" <td>Rolle</td>\n",
" <td>60.5</td>\n",
" <td>60.8</td>\n",
" <td>16</td>\n",
" <td>10</td>\n",
" <td>7.72</td>\n",
" <td>16.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>28</th>\n",
" <td>Vevey</td>\n",
" <td>58.3</td>\n",
" <td>26.8</td>\n",
" <td>25</td>\n",
" <td>19</td>\n",
" <td>18.46</td>\n",
" <td>20.9</td>\n",
" </tr>\n",
" <tr>\n",
" <th>29</th>\n",
" <td>Yverdon</td>\n",
" <td>65.4</td>\n",
" <td>49.5</td>\n",
" <td>15</td>\n",
" <td>8</td>\n",
" <td>6.10</td>\n",
" <td>22.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>30</th>\n",
" <td>Conthey</td>\n",
" <td>75.5</td>\n",
" <td>85.9</td>\n",
" <td>3</td>\n",
" <td>2</td>\n",
" <td>99.71</td>\n",
" <td>15.1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>31</th>\n",
" <td>Entremont</td>\n",
" <td>69.3</td>\n",
" <td>84.9</td>\n",
" <td>7</td>\n",
" <td>6</td>\n",
" <td>99.68</td>\n",
" <td>19.8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>32</th>\n",
" <td>Herens</td>\n",
" <td>77.3</td>\n",
" <td>89.7</td>\n",
" <td>5</td>\n",
" <td>2</td>\n",
" <td>100.00</td>\n",
" <td>18.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>33</th>\n",
" <td>Martigwy</td>\n",
" <td>70.5</td>\n",
" <td>78.2</td>\n",
" <td>12</td>\n",
" <td>6</td>\n",
" <td>98.96</td>\n",
" <td>19.4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>34</th>\n",
" <td>Monthey</td>\n",
" <td>79.4</td>\n",
" <td>64.9</td>\n",
" <td>7</td>\n",
" <td>3</td>\n",
" <td>98.22</td>\n",
" <td>20.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>35</th>\n",
" <td>St Maurice</td>\n",
" <td>65.0</td>\n",
" <td>75.9</td>\n",
" <td>9</td>\n",
" <td>9</td>\n",
" <td>99.06</td>\n",
" <td>17.8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>36</th>\n",
" <td>Sierre</td>\n",
" <td>92.2</td>\n",
" <td>84.6</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>99.46</td>\n",
" <td>16.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>37</th>\n",
" <td>Sion</td>\n",
" <td>79.3</td>\n",
" <td>63.1</td>\n",
" <td>13</td>\n",
" <td>13</td>\n",
" <td>96.83</td>\n",
" <td>18.1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>38</th>\n",
" <td>Boudry</td>\n",
" <td>70.4</td>\n",
" <td>38.4</td>\n",
" <td>26</td>\n",
" <td>12</td>\n",
" <td>5.62</td>\n",
" <td>20.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>39</th>\n",
" <td>La Chauxdfnd</td>\n",
" <td>65.7</td>\n",
" <td>7.7</td>\n",
" <td>29</td>\n",
" <td>11</td>\n",
" <td>13.79</td>\n",
" <td>20.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>40</th>\n",
" <td>Le Locle</td>\n",
" <td>72.7</td>\n",
" <td>16.7</td>\n",
" <td>22</td>\n",
" <td>13</td>\n",
" <td>11.22</td>\n",
" <td>18.9</td>\n",
" </tr>\n",
" <tr>\n",
" <th>41</th>\n",
" <td>Neuchatel</td>\n",
" <td>64.4</td>\n",
" <td>17.6</td>\n",
" <td>35</td>\n",
" <td>32</td>\n",
" <td>16.92</td>\n",
" <td>23.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>42</th>\n",
" <td>Val de Ruz</td>\n",
" <td>77.6</td>\n",
" <td>37.6</td>\n",
" <td>15</td>\n",
" <td>7</td>\n",
" <td>4.97</td>\n",
" <td>20.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>43</th>\n",
" <td>ValdeTravers</td>\n",
" <td>67.6</td>\n",
" <td>18.7</td>\n",
" <td>25</td>\n",
" <td>7</td>\n",
" <td>8.65</td>\n",
" <td>19.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>44</th>\n",
" <td>V. De Geneve</td>\n",
" <td>35.0</td>\n",
" <td>1.2</td>\n",
" <td>37</td>\n",
" <td>53</td>\n",
" <td>42.34</td>\n",
" <td>18.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>45</th>\n",
" <td>Rive Droite</td>\n",
" <td>44.7</td>\n",
" <td>46.6</td>\n",
" <td>16</td>\n",
" <td>29</td>\n",
" <td>50.43</td>\n",
" <td>18.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>46</th>\n",
" <td>Rive Gauche</td>\n",
" <td>42.8</td>\n",
" <td>27.7</td>\n",
" <td>22</td>\n",
" <td>29</td>\n",
" <td>58.33</td>\n",
" <td>19.3</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Unnamed: 0 Fertility Agriculture Examination Education Catholic \\\n",
"0 Courtelary 80.2 17.0 15 12 9.96 \n",
"1 Delemont 83.1 45.1 6 9 84.84 \n",
"2 Franches-Mnt 92.5 39.7 5 5 93.40 \n",
"3 Moutier 85.8 36.5 12 7 33.77 \n",
"4 Neuveville 76.9 43.5 17 15 5.16 \n",
"5 Porrentruy 76.1 35.3 9 7 90.57 \n",
"6 Broye 83.8 70.2 16 7 92.85 \n",
"7 Glane 92.4 67.8 14 8 97.16 \n",
"8 Gruyere 82.4 53.3 12 7 97.67 \n",
"9 Sarine 82.9 45.2 16 13 91.38 \n",
"10 Veveyse 87.1 64.5 14 6 98.61 \n",
"11 Aigle 64.1 62.0 21 12 8.52 \n",
"12 Aubonne 66.9 67.5 14 7 2.27 \n",
"13 Avenches 68.9 60.7 19 12 4.43 \n",
"14 Cossonay 61.7 69.3 22 5 2.82 \n",
"15 Echallens 68.3 72.6 18 2 24.20 \n",
"16 Grandson 71.7 34.0 17 8 3.30 \n",
"17 Lausanne 55.7 19.4 26 28 12.11 \n",
"18 La Vallee 54.3 15.2 31 20 2.15 \n",
"19 Lavaux 65.1 73.0 19 9 2.84 \n",
"20 Morges 65.5 59.8 22 10 5.23 \n",
"21 Moudon 65.0 55.1 14 3 4.52 \n",
"22 Nyone 56.6 50.9 22 12 15.14 \n",
"23 Orbe 57.4 54.1 20 6 4.20 \n",
"24 Oron 72.5 71.2 12 1 2.40 \n",
"25 Payerne 74.2 58.1 14 8 5.23 \n",
"26 Paysd'enhaut 72.0 63.5 6 3 2.56 \n",
"27 Rolle 60.5 60.8 16 10 7.72 \n",
"28 Vevey 58.3 26.8 25 19 18.46 \n",
"29 Yverdon 65.4 49.5 15 8 6.10 \n",
"30 Conthey 75.5 85.9 3 2 99.71 \n",
"31 Entremont 69.3 84.9 7 6 99.68 \n",
"32 Herens 77.3 89.7 5 2 100.00 \n",
"33 Martigwy 70.5 78.2 12 6 98.96 \n",
"34 Monthey 79.4 64.9 7 3 98.22 \n",
"35 St Maurice 65.0 75.9 9 9 99.06 \n",
"36 Sierre 92.2 84.6 3 3 99.46 \n",
"37 Sion 79.3 63.1 13 13 96.83 \n",
"38 Boudry 70.4 38.4 26 12 5.62 \n",
"39 La Chauxdfnd 65.7 7.7 29 11 13.79 \n",
"40 Le Locle 72.7 16.7 22 13 11.22 \n",
"41 Neuchatel 64.4 17.6 35 32 16.92 \n",
"42 Val de Ruz 77.6 37.6 15 7 4.97 \n",
"43 ValdeTravers 67.6 18.7 25 7 8.65 \n",
"44 V. De Geneve 35.0 1.2 37 53 42.34 \n",
"45 Rive Droite 44.7 46.6 16 29 50.43 \n",
"46 Rive Gauche 42.8 27.7 22 29 58.33 \n",
"\n",
" Infant.Mortality \n",
"0 22.2 \n",
"1 22.2 \n",
"2 20.2 \n",
"3 20.3 \n",
"4 20.6 \n",
"5 26.6 \n",
"6 23.6 \n",
"7 24.9 \n",
"8 21.0 \n",
"9 24.4 \n",
"10 24.5 \n",
"11 16.5 \n",
"12 19.1 \n",
"13 22.7 \n",
"14 18.7 \n",
"15 21.2 \n",
"16 20.0 \n",
"17 20.2 \n",
"18 10.8 \n",
"19 20.0 \n",
"20 18.0 \n",
"21 22.4 \n",
"22 16.7 \n",
"23 15.3 \n",
"24 21.0 \n",
"25 23.8 \n",
"26 18.0 \n",
"27 16.3 \n",
"28 20.9 \n",
"29 22.5 \n",
"30 15.1 \n",
"31 19.8 \n",
"32 18.3 \n",
"33 19.4 \n",
"34 20.2 \n",
"35 17.8 \n",
"36 16.3 \n",
"37 18.1 \n",
"38 20.3 \n",
"39 20.5 \n",
"40 18.9 \n",
"41 23.0 \n",
"42 20.0 \n",
"43 19.5 \n",
"44 18.0 \n",
"45 18.2 \n",
"46 19.3 "
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ds"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Viewing your data"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Unnamed: 0</th>\n",
" <th>Fertility</th>\n",
" <th>Agriculture</th>\n",
" <th>Examination</th>\n",
" <th>Education</th>\n",
" <th>Catholic</th>\n",
" <th>Infant.Mortality</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Courtelary</td>\n",
" <td>80.2</td>\n",
" <td>17.0</td>\n",
" <td>15</td>\n",
" <td>12</td>\n",
" <td>9.96</td>\n",
" <td>22.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Delemont</td>\n",
" <td>83.1</td>\n",
" <td>45.1</td>\n",
" <td>6</td>\n",
" <td>9</td>\n",
" <td>84.84</td>\n",
" <td>22.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Franches-Mnt</td>\n",
" <td>92.5</td>\n",
" <td>39.7</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>93.40</td>\n",
" <td>20.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Moutier</td>\n",
" <td>85.8</td>\n",
" <td>36.5</td>\n",
" <td>12</td>\n",
" <td>7</td>\n",
" <td>33.77</td>\n",
" <td>20.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Neuveville</td>\n",
" <td>76.9</td>\n",
" <td>43.5</td>\n",
" <td>17</td>\n",
" <td>15</td>\n",
" <td>5.16</td>\n",
" <td>20.6</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Unnamed: 0 Fertility Agriculture Examination Education Catholic \\\n",
"0 Courtelary 80.2 17.0 15 12 9.96 \n",
"1 Delemont 83.1 45.1 6 9 84.84 \n",
"2 Franches-Mnt 92.5 39.7 5 5 93.40 \n",
"3 Moutier 85.8 36.5 12 7 33.77 \n",
"4 Neuveville 76.9 43.5 17 15 5.16 \n",
"\n",
" Infant.Mortality \n",
"0 22.2 \n",
"1 22.2 \n",
"2 20.2 \n",
"3 20.3 \n",
"4 20.6 "
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ds.head()"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"Index(['Unnamed: 0', 'Fertility', 'Agriculture', 'Examination', 'Education',\n",
" 'Catholic', 'Infant.Mortality'],\n",
" dtype='object')"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ds.columns"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that the first column is not automatically transformed into a \"Factor\" variable like in R.\n",
"It is just a column of `object` (any non-integral Python type qualifies; this one is in fact `str`.)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"Unnamed: 0 object\n",
"Fertility float64\n",
"Agriculture float64\n",
"Examination int64\n",
"Education int64\n",
"Catholic float64\n",
"Infant.Mortality float64\n",
"dtype: object"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ds.dtypes"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"pandas.core.frame.DataFrame"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"type(ds)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Fertility</th>\n",
" <th>Agriculture</th>\n",
" <th>Examination</th>\n",
" <th>Education</th>\n",
" <th>Catholic</th>\n",
" <th>Infant.Mortality</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>count</th>\n",
" <td>47.000000</td>\n",
" <td>47.000000</td>\n",
" <td>47.000000</td>\n",
" <td>47.000000</td>\n",
" <td>47.00000</td>\n",
" <td>47.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>70.142553</td>\n",
" <td>50.659574</td>\n",
" <td>16.489362</td>\n",
" <td>10.978723</td>\n",
" <td>41.14383</td>\n",
" <td>19.942553</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>12.491697</td>\n",
" <td>22.711218</td>\n",
" <td>7.977883</td>\n",
" <td>9.615407</td>\n",
" <td>41.70485</td>\n",
" <td>2.912697</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>35.000000</td>\n",
" <td>1.200000</td>\n",
" <td>3.000000</td>\n",
" <td>1.000000</td>\n",
" <td>2.15000</td>\n",
" <td>10.800000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25%</th>\n",
" <td>64.700000</td>\n",
" <td>35.900000</td>\n",
" <td>12.000000</td>\n",
" <td>6.000000</td>\n",
" <td>5.19500</td>\n",
" <td>18.150000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>70.400000</td>\n",
" <td>54.100000</td>\n",
" <td>16.000000</td>\n",
" <td>8.000000</td>\n",
" <td>15.14000</td>\n",
" <td>20.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75%</th>\n",
" <td>78.450000</td>\n",
" <td>67.650000</td>\n",
" <td>22.000000</td>\n",
" <td>12.000000</td>\n",
" <td>93.12500</td>\n",
" <td>21.700000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>92.500000</td>\n",
" <td>89.700000</td>\n",
" <td>37.000000</td>\n",
" <td>53.000000</td>\n",
" <td>100.00000</td>\n",
" <td>26.600000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Fertility Agriculture Examination Education Catholic \\\n",
"count 47.000000 47.000000 47.000000 47.000000 47.00000 \n",
"mean 70.142553 50.659574 16.489362 10.978723 41.14383 \n",
"std 12.491697 22.711218 7.977883 9.615407 41.70485 \n",
"min 35.000000 1.200000 3.000000 1.000000 2.15000 \n",
"25% 64.700000 35.900000 12.000000 6.000000 5.19500 \n",
"50% 70.400000 54.100000 16.000000 8.000000 15.14000 \n",
"75% 78.450000 67.650000 22.000000 12.000000 93.12500 \n",
"max 92.500000 89.700000 37.000000 53.000000 100.00000 \n",
"\n",
" Infant.Mortality \n",
"count 47.000000 \n",
"mean 19.942553 \n",
"std 2.912697 \n",
"min 10.800000 \n",
"25% 18.150000 \n",
"50% 20.000000 \n",
"75% 21.700000 \n",
"max 26.600000 "
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ds.describe()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Wrangling your data\n",
"\n",
"On the one hand, you need to import `pandas` to load your data; on the other hand, you don't need the extra libraries to do the same manipulation done below in R."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Education</th>\n",
" <th>Catholic</th>\n",
" <th>Fertility</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>12</td>\n",
" <td>9.96</td>\n",
" <td>80.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>9</td>\n",
" <td>84.84</td>\n",
" <td>83.1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>5</td>\n",
" <td>93.40</td>\n",
" <td>92.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>7</td>\n",
" <td>33.77</td>\n",
" <td>85.8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>15</td>\n",
" <td>5.16</td>\n",
" <td>76.9</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Education Catholic Fertility\n",
"0 12 9.96 80.2\n",
"1 9 84.84 83.1\n",
"2 5 93.40 92.5\n",
"3 7 33.77 85.8\n",
"4 15 5.16 76.9"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ds[['Education', 'Catholic', 'Fertility']].head()"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"ds.rename?"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>County</th>\n",
" <th>Fertility</th>\n",
" <th>Agriculture</th>\n",
" <th>Examination</th>\n",
" <th>Education</th>\n",
" <th>Catholic</th>\n",
" <th>Infant.Mortality</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Courtelary</td>\n",
" <td>80.2</td>\n",
" <td>17.0</td>\n",
" <td>15</td>\n",
" <td>12</td>\n",
" <td>9.96</td>\n",
" <td>22.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Delemont</td>\n",
" <td>83.1</td>\n",
" <td>45.1</td>\n",
" <td>6</td>\n",
" <td>9</td>\n",
" <td>84.84</td>\n",
" <td>22.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Franches-Mnt</td>\n",
" <td>92.5</td>\n",
" <td>39.7</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>93.40</td>\n",
" <td>20.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Moutier</td>\n",
" <td>85.8</td>\n",
" <td>36.5</td>\n",
" <td>12</td>\n",
" <td>7</td>\n",
" <td>33.77</td>\n",
" <td>20.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Neuveville</td>\n",
" <td>76.9</td>\n",
" <td>43.5</td>\n",
" <td>17</td>\n",
" <td>15</td>\n",
" <td>5.16</td>\n",
" <td>20.6</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" County Fertility Agriculture Examination Education Catholic \\\n",
"0 Courtelary 80.2 17.0 15 12 9.96 \n",
"1 Delemont 83.1 45.1 6 9 84.84 \n",
"2 Franches-Mnt 92.5 39.7 5 5 93.40 \n",
"3 Moutier 85.8 36.5 12 7 33.77 \n",
"4 Neuveville 76.9 43.5 17 15 5.16 \n",
"\n",
" Infant.Mortality \n",
"0 22.2 \n",
"1 22.2 \n",
"2 20.2 \n",
"3 20.3 \n",
"4 20.6 "
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dsc = ds.rename(columns={'Unnamed: 0': 'County'})\n",
"dsc.head()"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Unnamed: 0</th>\n",
" <th>Fertility</th>\n",
" <th>Agriculture</th>\n",
" <th>Examination</th>\n",
" <th>Education</th>\n",
" <th>Catholic</th>\n",
" <th>Infant.Mortality</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Courtelary</td>\n",
" <td>80.2</td>\n",
" <td>17.0</td>\n",
" <td>15</td>\n",
" <td>12</td>\n",
" <td>9.96</td>\n",
" <td>22.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>29</th>\n",
" <td>Yverdon</td>\n",
" <td>65.4</td>\n",
" <td>49.5</td>\n",
" <td>15</td>\n",
" <td>8</td>\n",
" <td>6.10</td>\n",
" <td>22.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>42</th>\n",
" <td>Val de Ruz</td>\n",
" <td>77.6</td>\n",
" <td>37.6</td>\n",
" <td>15</td>\n",
" <td>7</td>\n",
" <td>4.97</td>\n",
" <td>20.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Unnamed: 0 Fertility Agriculture Examination Education Catholic \\\n",
"0 Courtelary 80.2 17.0 15 12 9.96 \n",
"29 Yverdon 65.4 49.5 15 8 6.10 \n",
"42 Val de Ruz 77.6 37.6 15 7 4.97 \n",
"\n",
" Infant.Mortality \n",
"0 22.2 \n",
"29 22.5 \n",
"42 20.0 "
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# NOTE: Parentheses are important! The & has higher priority than >, which will mess things up.\n",
"# Also note that filtering does *not* relabel the index (0, 29, 42)\n",
"ds[(ds.Catholic < 20) & (ds.Examination == 15)]"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Unnamed: 0</th>\n",
" <th>Fertility</th>\n",
" <th>Agriculture</th>\n",
" <th>Examination</th>\n",
" <th>Education</th>\n",
" <th>Catholic</th>\n",
" <th>Infant.Mortality</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>Aigle</td>\n",
" <td>64.1</td>\n",
" <td>62</td>\n",
" <td>21</td>\n",
" <td>12</td>\n",
" <td>8.52</td>\n",
" <td>16.5</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Unnamed: 0 Fertility Agriculture Examination Education Catholic \\\n",
"11 Aigle 64.1 62 21 12 8.52 \n",
"\n",
" Infant.Mortality \n",
"11 16.5 "
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# You can't do `ds.Unnamed: 0` since there's a space in the name, and\n",
"# space or colon can't be used in Python identifiers.\n",
"# ds.iloc[row, col] specifies rows and columns by *index*.\n",
"ds[ds.iloc[:, 0] == 'Aigle']"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Unnamed: 0</th>\n",
" <th>Fertility</th>\n",
" <th>Agriculture</th>\n",
" <th>Examination</th>\n",
" <th>Education</th>\n",
" <th>Catholic</th>\n",
" <th>Infant.Mortality</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>Aigle</td>\n",
" <td>64.1</td>\n",
" <td>62</td>\n",
" <td>21</td>\n",
" <td>12</td>\n",
" <td>8.52</td>\n",
" <td>16.5</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Unnamed: 0 Fertility Agriculture Examination Education Catholic \\\n",
"11 Aigle 64.1 62 21 12 8.52 \n",
"\n",
" Infant.Mortality \n",
"11 16.5 "
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# You can also do ds.loc[row, col] which specifies rows and columns by the *name*.\n",
"ds[ds.loc[:, 'Unnamed: 0'] == 'Aigle']"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Education</th>\n",
" <th>Fertility</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>9</td>\n",
" <td>83.1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>5</td>\n",
" <td>92.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>7</td>\n",
" <td>85.8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>7</td>\n",
" <td>76.1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>7</td>\n",
" <td>83.8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>8</td>\n",
" <td>92.4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>7</td>\n",
" <td>82.4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>13</td>\n",
" <td>82.9</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>6</td>\n",
" <td>87.1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>2</td>\n",
" <td>68.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>30</th>\n",
" <td>2</td>\n",
" <td>75.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>31</th>\n",
" <td>6</td>\n",
" <td>69.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>32</th>\n",
" <td>2</td>\n",
" <td>77.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>33</th>\n",
" <td>6</td>\n",
" <td>70.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>34</th>\n",
" <td>3</td>\n",
" <td>79.4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>35</th>\n",
" <td>9</td>\n",
" <td>65.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>36</th>\n",
" <td>3</td>\n",
" <td>92.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>37</th>\n",
" <td>13</td>\n",
" <td>79.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>44</th>\n",
" <td>53</td>\n",
" <td>35.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>45</th>\n",
" <td>29</td>\n",
" <td>44.7</td>\n",
" </tr>\n",
" <tr>\n",
" <th>46</th>\n",
" <td>29</td>\n",
" <td>42.8</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Education Fertility\n",
"1 9 83.1\n",
"2 5 92.5\n",
"3 7 85.8\n",
"5 7 76.1\n",
"6 7 83.8\n",
"7 8 92.4\n",
"8 7 82.4\n",
"9 13 82.9\n",
"10 6 87.1\n",
"15 2 68.3\n",
"30 2 75.5\n",
"31 6 69.3\n",
"32 2 77.3\n",
"33 6 70.5\n",
"34 3 79.4\n",
"35 9 65.0\n",
"36 3 92.2\n",
"37 13 79.3\n",
"44 53 35.0\n",
"45 29 44.7\n",
"46 29 42.8"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ds.loc[ds.Catholic > 20, ['Education', 'Fertility']]"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>County</th>\n",
" <th>Education</th>\n",
" <th>Fertility</th>\n",
" <th>Agriculture</th>\n",
" <th>infertile</th>\n",
" <th>testing</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>30</th>\n",
" <td>Conthey</td>\n",
" <td>2</td>\n",
" <td>75.5</td>\n",
" <td>85.9</td>\n",
" <td>no</td>\n",
" <td>Yes</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>Echallens</td>\n",
" <td>2</td>\n",
" <td>68.3</td>\n",
" <td>72.6</td>\n",
" <td>no</td>\n",
" <td>Yes</td>\n",
" </tr>\n",
" <tr>\n",
" <th>32</th>\n",
" <td>Herens</td>\n",
" <td>2</td>\n",
" <td>77.3</td>\n",
" <td>89.7</td>\n",
" <td>no</td>\n",
" <td>Yes</td>\n",
" </tr>\n",
" <tr>\n",
" <th>36</th>\n",
" <td>Sierre</td>\n",
" <td>3</td>\n",
" <td>92.2</td>\n",
" <td>84.6</td>\n",
" <td>no</td>\n",
" <td>Yes</td>\n",
" </tr>\n",
" <tr>\n",
" <th>34</th>\n",
" <td>Monthey</td>\n",
" <td>3</td>\n",
" <td>79.4</td>\n",
" <td>64.9</td>\n",
" <td>no</td>\n",
" <td>Yes</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Franches-Mnt</td>\n",
" <td>5</td>\n",
" <td>92.5</td>\n",
" <td>39.7</td>\n",
" <td>no</td>\n",
" <td>Yes</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>Veveyse</td>\n",
" <td>6</td>\n",
" <td>87.1</td>\n",
" <td>64.5</td>\n",
" <td>no</td>\n",
" <td>Yes</td>\n",
" </tr>\n",
" <tr>\n",
" <th>31</th>\n",
" <td>Entremont</td>\n",
" <td>6</td>\n",
" <td>69.3</td>\n",
" <td>84.9</td>\n",
" <td>no</td>\n",
" <td>Yes</td>\n",
" </tr>\n",
" <tr>\n",
" <th>33</th>\n",
" <td>Martigwy</td>\n",
" <td>6</td>\n",
" <td>70.5</td>\n",
" <td>78.2</td>\n",
" <td>no</td>\n",
" <td>Yes</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Moutier</td>\n",
" <td>7</td>\n",
" <td>85.8</td>\n",
" <td>36.5</td>\n",
" <td>no</td>\n",
" <td>Yes</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>Porrentruy</td>\n",
" <td>7</td>\n",
" <td>76.1</td>\n",
" <td>35.3</td>\n",
" <td>no</td>\n",
" <td>Yes</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>Broye</td>\n",
" <td>7</td>\n",
" <td>83.8</td>\n",
" <td>70.2</td>\n",
" <td>no</td>\n",
" <td>Yes</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>Gruyere</td>\n",
" <td>7</td>\n",
" <td>82.4</td>\n",
" <td>53.3</td>\n",
" <td>no</td>\n",
" <td>Yes</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>Glane</td>\n",
" <td>8</td>\n",
" <td>92.4</td>\n",
" <td>67.8</td>\n",
" <td>no</td>\n",
" <td>Yes</td>\n",
" </tr>\n",
" <tr>\n",
" <th>35</th>\n",
" <td>St Maurice</td>\n",
" <td>9</td>\n",
" <td>65.0</td>\n",
" <td>75.9</td>\n",
" <td>no</td>\n",
" <td>Yes</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Delemont</td>\n",
" <td>9</td>\n",
" <td>83.1</td>\n",
" <td>45.1</td>\n",
" <td>no</td>\n",
" <td>Yes</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>Sarine</td>\n",
" <td>13</td>\n",
" <td>82.9</td>\n",
" <td>45.2</td>\n",
" <td>no</td>\n",
" <td>Yes</td>\n",
" </tr>\n",
" <tr>\n",
" <th>37</th>\n",
" <td>Sion</td>\n",
" <td>13</td>\n",
" <td>79.3</td>\n",
" <td>63.1</td>\n",
" <td>no</td>\n",
" <td>Yes</td>\n",
" </tr>\n",
" <tr>\n",
" <th>45</th>\n",
" <td>Rive Droite</td>\n",
" <td>29</td>\n",
" <td>44.7</td>\n",
" <td>46.6</td>\n",
" <td>yes</td>\n",
" <td>Yes</td>\n",
" </tr>\n",
" <tr>\n",
" <th>46</th>\n",
" <td>Rive Gauche</td>\n",
" <td>29</td>\n",
" <td>42.8</td>\n",
" <td>27.7</td>\n",
" <td>yes</td>\n",
" <td>Yes</td>\n",
" </tr>\n",
" <tr>\n",
" <th>44</th>\n",
" <td>V. De Geneve</td>\n",
" <td>53</td>\n",
" <td>35.0</td>\n",
" <td>1.2</td>\n",
" <td>yes</td>\n",
" <td>Yes</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" County Education Fertility Agriculture infertile testing\n",
"30 Conthey 2 75.5 85.9 no Yes\n",
"15 Echallens 2 68.3 72.6 no Yes\n",
"32 Herens 2 77.3 89.7 no Yes\n",
"36 Sierre 3 92.2 84.6 no Yes\n",
"34 Monthey 3 79.4 64.9 no Yes\n",
"2 Franches-Mnt 5 92.5 39.7 no Yes\n",
"10 Veveyse 6 87.1 64.5 no Yes\n",
"31 Entremont 6 69.3 84.9 no Yes\n",
"33 Martigwy 6 70.5 78.2 no Yes\n",
"3 Moutier 7 85.8 36.5 no Yes\n",
"5 Porrentruy 7 76.1 35.3 no Yes\n",
"6 Broye 7 83.8 70.2 no Yes\n",
"8 Gruyere 7 82.4 53.3 no Yes\n",
"7 Glane 8 92.4 67.8 no Yes\n",
"35 St Maurice 9 65.0 75.9 no Yes\n",
"1 Delemont 9 83.1 45.1 no Yes\n",
"9 Sarine 13 82.9 45.2 no Yes\n",
"37 Sion 13 79.3 63.1 no Yes\n",
"45 Rive Droite 29 44.7 46.6 yes Yes\n",
"46 Rive Gauche 29 42.8 27.7 yes Yes\n",
"44 V. De Geneve 53 35.0 1.2 yes Yes"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# There isn't really a `mutate`, so just make a copy (which is done by default in many of these methods below.)\n",
"ds2 = ds[ds.Catholic > 20]\\\n",
" .rename(columns={'Unnamed: 0': 'County'})\\\n",
" [['County', 'Education', 'Fertility', 'Agriculture']]\\\n",
" .sort_values('Education')\n",
"\n",
"ds2['infertile'] = 'no'\n",
"ds2.loc[ds2.Fertility < 50, 'infertile'] = 'yes'\n",
"# Could also just set directly as a boolean:\n",
"# ds2['infertile'] = ds.Fertility < 50\n",
"\n",
"ds2['testing'] = 'Yes'\n",
"\n",
"ds2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Reorganizing your data"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>County</th>\n",
" <th>Fertility</th>\n",
" <th>Agriculture</th>\n",
" <th>Examination</th>\n",
" <th>Education</th>\n",
" <th>Catholic</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Courtelary</td>\n",
" <td>80.2</td>\n",
" <td>17.0</td>\n",
" <td>15</td>\n",
" <td>12</td>\n",
" <td>9.96</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Delemont</td>\n",
" <td>83.1</td>\n",
" <td>45.1</td>\n",
" <td>6</td>\n",
" <td>9</td>\n",
" <td>84.84</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Franches-Mnt</td>\n",
" <td>92.5</td>\n",
" <td>39.7</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>93.40</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Moutier</td>\n",
" <td>85.8</td>\n",
" <td>36.5</td>\n",
" <td>12</td>\n",
" <td>7</td>\n",
" <td>33.77</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Neuveville</td>\n",
" <td>76.9</td>\n",
" <td>43.5</td>\n",
" <td>17</td>\n",
" <td>15</td>\n",
" <td>5.16</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>Porrentruy</td>\n",
" <td>76.1</td>\n",
" <td>35.3</td>\n",
" <td>9</td>\n",
" <td>7</td>\n",
" <td>90.57</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>Broye</td>\n",
" <td>83.8</td>\n",
" <td>70.2</td>\n",
" <td>16</td>\n",
" <td>7</td>\n",
" <td>92.85</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>Glane</td>\n",
" <td>92.4</td>\n",
" <td>67.8</td>\n",
" <td>14</td>\n",
" <td>8</td>\n",
" <td>97.16</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>Gruyere</td>\n",
" <td>82.4</td>\n",
" <td>53.3</td>\n",
" <td>12</td>\n",
" <td>7</td>\n",
" <td>97.67</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>Sarine</td>\n",
" <td>82.9</td>\n",
" <td>45.2</td>\n",
" <td>16</td>\n",
" <td>13</td>\n",
" <td>91.38</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>Veveyse</td>\n",
" <td>87.1</td>\n",
" <td>64.5</td>\n",
" <td>14</td>\n",
" <td>6</td>\n",
" <td>98.61</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>Aigle</td>\n",
" <td>64.1</td>\n",
" <td>62.0</td>\n",
" <td>21</td>\n",
" <td>12</td>\n",
" <td>8.52</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12</th>\n",
" <td>Aubonne</td>\n",
" <td>66.9</td>\n",
" <td>67.5</td>\n",
" <td>14</td>\n",
" <td>7</td>\n",
" <td>2.27</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>Avenches</td>\n",
" <td>68.9</td>\n",
" <td>60.7</td>\n",
" <td>19</td>\n",
" <td>12</td>\n",
" <td>4.43</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td>Cossonay</td>\n",
" <td>61.7</td>\n",
" <td>69.3</td>\n",
" <td>22</td>\n",
" <td>5</td>\n",
" <td>2.82</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>Echallens</td>\n",
" <td>68.3</td>\n",
" <td>72.6</td>\n",
" <td>18</td>\n",
" <td>2</td>\n",
" <td>24.20</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16</th>\n",
" <td>Grandson</td>\n",
" <td>71.7</td>\n",
" <td>34.0</td>\n",
" <td>17</td>\n",
" <td>8</td>\n",
" <td>3.30</td>\n",
" </tr>\n",
" <tr>\n",
" <th>17</th>\n",
" <td>Lausanne</td>\n",
" <td>55.7</td>\n",
" <td>19.4</td>\n",
" <td>26</td>\n",
" <td>28</td>\n",
" <td>12.11</td>\n",
" </tr>\n",
" <tr>\n",
" <th>18</th>\n",
" <td>La Vallee</td>\n",
" <td>54.3</td>\n",
" <td>15.2</td>\n",
" <td>31</td>\n",
" <td>20</td>\n",
" <td>2.15</td>\n",
" </tr>\n",
" <tr>\n",
" <th>19</th>\n",
" <td>Lavaux</td>\n",
" <td>65.1</td>\n",
" <td>73.0</td>\n",
" <td>19</td>\n",
" <td>9</td>\n",
" <td>2.84</td>\n",
" </tr>\n",
" <tr>\n",
" <th>20</th>\n",
" <td>Morges</td>\n",
" <td>65.5</td>\n",
" <td>59.8</td>\n",
" <td>22</td>\n",
" <td>10</td>\n",
" <td>5.23</td>\n",
" </tr>\n",
" <tr>\n",
" <th>21</th>\n",
" <td>Moudon</td>\n",
" <td>65.0</td>\n",
" <td>55.1</td>\n",
" <td>14</td>\n",
" <td>3</td>\n",
" <td>4.52</td>\n",
" </tr>\n",
" <tr>\n",
" <th>22</th>\n",
" <td>Nyone</td>\n",
" <td>56.6</td>\n",
" <td>50.9</td>\n",
" <td>22</td>\n",
" <td>12</td>\n",
" <td>15.14</td>\n",
" </tr>\n",
" <tr>\n",
" <th>23</th>\n",
" <td>Orbe</td>\n",
" <td>57.4</td>\n",
" <td>54.1</td>\n",
" <td>20</td>\n",
" <td>6</td>\n",
" <td>4.20</td>\n",
" </tr>\n",
" <tr>\n",
" <th>24</th>\n",
" <td>Oron</td>\n",
" <td>72.5</td>\n",
" <td>71.2</td>\n",
" <td>12</td>\n",
" <td>1</td>\n",
" <td>2.40</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25</th>\n",
" <td>Payerne</td>\n",
" <td>74.2</td>\n",
" <td>58.1</td>\n",
" <td>14</td>\n",
" <td>8</td>\n",
" <td>5.23</td>\n",
" </tr>\n",
" <tr>\n",
" <th>26</th>\n",
" <td>Paysd'enhaut</td>\n",
" <td>72.0</td>\n",
" <td>63.5</td>\n",
" <td>6</td>\n",
" <td>3</td>\n",
" <td>2.56</td>\n",
" </tr>\n",
" <tr>\n",
" <th>27</th>\n",
" <td>Rolle</td>\n",
" <td>60.5</td>\n",
" <td>60.8</td>\n",
" <td>16</td>\n",
" <td>10</td>\n",
" <td>7.72</td>\n",
" </tr>\n",
" <tr>\n",
" <th>28</th>\n",
" <td>Vevey</td>\n",
" <td>58.3</td>\n",
" <td>26.8</td>\n",
" <td>25</td>\n",
" <td>19</td>\n",
" <td>18.46</td>\n",
" </tr>\n",
" <tr>\n",
" <th>29</th>\n",
" <td>Yverdon</td>\n",
" <td>65.4</td>\n",
" <td>49.5</td>\n",
" <td>15</td>\n",
" <td>8</td>\n",
" <td>6.10</td>\n",
" </tr>\n",
" <tr>\n",
" <th>30</th>\n",
" <td>Conthey</td>\n",
" <td>75.5</td>\n",
" <td>85.9</td>\n",
" <td>3</td>\n",
" <td>2</td>\n",
" <td>99.71</td>\n",
" </tr>\n",
" <tr>\n",
" <th>31</th>\n",
" <td>Entremont</td>\n",
" <td>69.3</td>\n",
" <td>84.9</td>\n",
" <td>7</td>\n",
" <td>6</td>\n",
" <td>99.68</td>\n",
" </tr>\n",
" <tr>\n",
" <th>32</th>\n",
" <td>Herens</td>\n",
" <td>77.3</td>\n",
" <td>89.7</td>\n",
" <td>5</td>\n",
" <td>2</td>\n",
" <td>100.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>33</th>\n",
" <td>Martigwy</td>\n",
" <td>70.5</td>\n",
" <td>78.2</td>\n",
" <td>12</td>\n",
" <td>6</td>\n",
" <td>98.96</td>\n",
" </tr>\n",
" <tr>\n",
" <th>34</th>\n",
" <td>Monthey</td>\n",
" <td>79.4</td>\n",
" <td>64.9</td>\n",
" <td>7</td>\n",
" <td>3</td>\n",
" <td>98.22</td>\n",
" </tr>\n",
" <tr>\n",
" <th>35</th>\n",
" <td>St Maurice</td>\n",
" <td>65.0</td>\n",
" <td>75.9</td>\n",
" <td>9</td>\n",
" <td>9</td>\n",
" <td>99.06</td>\n",
" </tr>\n",
" <tr>\n",
" <th>36</th>\n",
" <td>Sierre</td>\n",
" <td>92.2</td>\n",
" <td>84.6</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>99.46</td>\n",
" </tr>\n",
" <tr>\n",
" <th>37</th>\n",
" <td>Sion</td>\n",
" <td>79.3</td>\n",
" <td>63.1</td>\n",
" <td>13</td>\n",
" <td>13</td>\n",
" <td>96.83</td>\n",
" </tr>\n",
" <tr>\n",
" <th>38</th>\n",
" <td>Boudry</td>\n",
" <td>70.4</td>\n",
" <td>38.4</td>\n",
" <td>26</td>\n",
" <td>12</td>\n",
" <td>5.62</td>\n",
" </tr>\n",
" <tr>\n",
" <th>39</th>\n",
" <td>La Chauxdfnd</td>\n",
" <td>65.7</td>\n",
" <td>7.7</td>\n",
" <td>29</td>\n",
" <td>11</td>\n",
" <td>13.79</td>\n",
" </tr>\n",
" <tr>\n",
" <th>40</th>\n",
" <td>Le Locle</td>\n",
" <td>72.7</td>\n",
" <td>16.7</td>\n",
" <td>22</td>\n",
" <td>13</td>\n",
" <td>11.22</td>\n",
" </tr>\n",
" <tr>\n",
" <th>41</th>\n",
" <td>Neuchatel</td>\n",
" <td>64.4</td>\n",
" <td>17.6</td>\n",
" <td>35</td>\n",
" <td>32</td>\n",
" <td>16.92</td>\n",
" </tr>\n",
" <tr>\n",
" <th>42</th>\n",
" <td>Val de Ruz</td>\n",
" <td>77.6</td>\n",
" <td>37.6</td>\n",
" <td>15</td>\n",
" <td>7</td>\n",
" <td>4.97</td>\n",
" </tr>\n",
" <tr>\n",
" <th>43</th>\n",
" <td>ValdeTravers</td>\n",
" <td>67.6</td>\n",
" <td>18.7</td>\n",
" <td>25</td>\n",
" <td>7</td>\n",
" <td>8.65</td>\n",
" </tr>\n",
" <tr>\n",
" <th>44</th>\n",
" <td>V. De Geneve</td>\n",
" <td>35.0</td>\n",
" <td>1.2</td>\n",
" <td>37</td>\n",
" <td>53</td>\n",
" <td>42.34</td>\n",
" </tr>\n",
" <tr>\n",
" <th>45</th>\n",
" <td>Rive Droite</td>\n",
" <td>44.7</td>\n",
" <td>46.6</td>\n",
" <td>16</td>\n",
" <td>29</td>\n",
" <td>50.43</td>\n",
" </tr>\n",
" <tr>\n",
" <th>46</th>\n",
" <td>Rive Gauche</td>\n",
" <td>42.8</td>\n",
" <td>27.7</td>\n",
" <td>22</td>\n",
" <td>29</td>\n",
" <td>58.33</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" County Fertility Agriculture Examination Education Catholic\n",
"0 Courtelary 80.2 17.0 15 12 9.96\n",
"1 Delemont 83.1 45.1 6 9 84.84\n",
"2 Franches-Mnt 92.5 39.7 5 5 93.40\n",
"3 Moutier 85.8 36.5 12 7 33.77\n",
"4 Neuveville 76.9 43.5 17 15 5.16\n",
"5 Porrentruy 76.1 35.3 9 7 90.57\n",
"6 Broye 83.8 70.2 16 7 92.85\n",
"7 Glane 92.4 67.8 14 8 97.16\n",
"8 Gruyere 82.4 53.3 12 7 97.67\n",
"9 Sarine 82.9 45.2 16 13 91.38\n",
"10 Veveyse 87.1 64.5 14 6 98.61\n",
"11 Aigle 64.1 62.0 21 12 8.52\n",
"12 Aubonne 66.9 67.5 14 7 2.27\n",
"13 Avenches 68.9 60.7 19 12 4.43\n",
"14 Cossonay 61.7 69.3 22 5 2.82\n",
"15 Echallens 68.3 72.6 18 2 24.20\n",
"16 Grandson 71.7 34.0 17 8 3.30\n",
"17 Lausanne 55.7 19.4 26 28 12.11\n",
"18 La Vallee 54.3 15.2 31 20 2.15\n",
"19 Lavaux 65.1 73.0 19 9 2.84\n",
"20 Morges 65.5 59.8 22 10 5.23\n",
"21 Moudon 65.0 55.1 14 3 4.52\n",
"22 Nyone 56.6 50.9 22 12 15.14\n",
"23 Orbe 57.4 54.1 20 6 4.20\n",
"24 Oron 72.5 71.2 12 1 2.40\n",
"25 Payerne 74.2 58.1 14 8 5.23\n",
"26 Paysd'enhaut 72.0 63.5 6 3 2.56\n",
"27 Rolle 60.5 60.8 16 10 7.72\n",
"28 Vevey 58.3 26.8 25 19 18.46\n",
"29 Yverdon 65.4 49.5 15 8 6.10\n",
"30 Conthey 75.5 85.9 3 2 99.71\n",
"31 Entremont 69.3 84.9 7 6 99.68\n",
"32 Herens 77.3 89.7 5 2 100.00\n",
"33 Martigwy 70.5 78.2 12 6 98.96\n",
"34 Monthey 79.4 64.9 7 3 98.22\n",
"35 St Maurice 65.0 75.9 9 9 99.06\n",
"36 Sierre 92.2 84.6 3 3 99.46\n",
"37 Sion 79.3 63.1 13 13 96.83\n",
"38 Boudry 70.4 38.4 26 12 5.62\n",
"39 La Chauxdfnd 65.7 7.7 29 11 13.79\n",
"40 Le Locle 72.7 16.7 22 13 11.22\n",
"41 Neuchatel 64.4 17.6 35 32 16.92\n",
"42 Val de Ruz 77.6 37.6 15 7 4.97\n",
"43 ValdeTravers 67.6 18.7 25 7 8.65\n",
"44 V. De Geneve 35.0 1.2 37 53 42.34\n",
"45 Rive Droite 44.7 46.6 16 29 50.43\n",
"46 Rive Gauche 42.8 27.7 22 29 58.33"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dsc.drop('Infant.Mortality', axis=1)"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>County</th>\n",
" <th>Measure</th>\n",
" <th>Value</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Courtelary</td>\n",
" <td>Fertility</td>\n",
" <td>80.20</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Courtelary</td>\n",
" <td>Agriculture</td>\n",
" <td>17.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Courtelary</td>\n",
" <td>Examination</td>\n",
" <td>15.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Courtelary</td>\n",
" <td>Education</td>\n",
" <td>12.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Courtelary</td>\n",
" <td>Catholic</td>\n",
" <td>9.96</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>Delemont</td>\n",
" <td>Fertility</td>\n",
" <td>83.10</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>Delemont</td>\n",
" <td>Agriculture</td>\n",
" <td>45.10</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>Delemont</td>\n",
" <td>Examination</td>\n",
" <td>6.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>Delemont</td>\n",
" <td>Education</td>\n",
" <td>9.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>Delemont</td>\n",
" <td>Catholic</td>\n",
" <td>84.84</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>Franches-Mnt</td>\n",
" <td>Fertility</td>\n",
" <td>92.50</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>Franches-Mnt</td>\n",
" <td>Agriculture</td>\n",
" <td>39.70</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12</th>\n",
" <td>Franches-Mnt</td>\n",
" <td>Examination</td>\n",
" <td>5.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>Franches-Mnt</td>\n",
" <td>Education</td>\n",
" <td>5.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td>Franches-Mnt</td>\n",
" <td>Catholic</td>\n",
" <td>93.40</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>Moutier</td>\n",
" <td>Fertility</td>\n",
" <td>85.80</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16</th>\n",
" <td>Moutier</td>\n",
" <td>Agriculture</td>\n",
" <td>36.50</td>\n",
" </tr>\n",
" <tr>\n",
" <th>17</th>\n",
" <td>Moutier</td>\n",
" <td>Examination</td>\n",
" <td>12.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>18</th>\n",
" <td>Moutier</td>\n",
" <td>Education</td>\n",
" <td>7.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>19</th>\n",
" <td>Moutier</td>\n",
" <td>Catholic</td>\n",
" <td>33.77</td>\n",
" </tr>\n",
" <tr>\n",
" <th>20</th>\n",
" <td>Neuveville</td>\n",
" <td>Fertility</td>\n",
" <td>76.90</td>\n",
" </tr>\n",
" <tr>\n",
" <th>21</th>\n",
" <td>Neuveville</td>\n",
" <td>Agriculture</td>\n",
" <td>43.50</td>\n",
" </tr>\n",
" <tr>\n",
" <th>22</th>\n",
" <td>Neuveville</td>\n",
" <td>Examination</td>\n",
" <td>17.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>23</th>\n",
" <td>Neuveville</td>\n",
" <td>Education</td>\n",
" <td>15.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>24</th>\n",
" <td>Neuveville</td>\n",
" <td>Catholic</td>\n",
" <td>5.16</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25</th>\n",
" <td>Porrentruy</td>\n",
" <td>Fertility</td>\n",
" <td>76.10</td>\n",
" </tr>\n",
" <tr>\n",
" <th>26</th>\n",
" <td>Porrentruy</td>\n",
" <td>Agriculture</td>\n",
" <td>35.30</td>\n",
" </tr>\n",
" <tr>\n",
" <th>27</th>\n",
" <td>Porrentruy</td>\n",
" <td>Examination</td>\n",
" <td>9.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>28</th>\n",
" <td>Porrentruy</td>\n",
" <td>Education</td>\n",
" <td>7.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>29</th>\n",
" <td>Porrentruy</td>\n",
" <td>Catholic</td>\n",
" <td>90.57</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>205</th>\n",
" <td>Neuchatel</td>\n",
" <td>Fertility</td>\n",
" <td>64.40</td>\n",
" </tr>\n",
" <tr>\n",
" <th>206</th>\n",
" <td>Neuchatel</td>\n",
" <td>Agriculture</td>\n",
" <td>17.60</td>\n",
" </tr>\n",
" <tr>\n",
" <th>207</th>\n",
" <td>Neuchatel</td>\n",
" <td>Examination</td>\n",
" <td>35.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>208</th>\n",
" <td>Neuchatel</td>\n",
" <td>Education</td>\n",
" <td>32.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>209</th>\n",
" <td>Neuchatel</td>\n",
" <td>Catholic</td>\n",
" <td>16.92</td>\n",
" </tr>\n",
" <tr>\n",
" <th>210</th>\n",
" <td>Val de Ruz</td>\n",
" <td>Fertility</td>\n",
" <td>77.60</td>\n",
" </tr>\n",
" <tr>\n",
" <th>211</th>\n",
" <td>Val de Ruz</td>\n",
" <td>Agriculture</td>\n",
" <td>37.60</td>\n",
" </tr>\n",
" <tr>\n",
" <th>212</th>\n",
" <td>Val de Ruz</td>\n",
" <td>Examination</td>\n",
" <td>15.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>213</th>\n",
" <td>Val de Ruz</td>\n",
" <td>Education</td>\n",
" <td>7.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>214</th>\n",
" <td>Val de Ruz</td>\n",
" <td>Catholic</td>\n",
" <td>4.97</td>\n",
" </tr>\n",
" <tr>\n",
" <th>215</th>\n",
" <td>ValdeTravers</td>\n",
" <td>Fertility</td>\n",
" <td>67.60</td>\n",
" </tr>\n",
" <tr>\n",
" <th>216</th>\n",
" <td>ValdeTravers</td>\n",
" <td>Agriculture</td>\n",
" <td>18.70</td>\n",
" </tr>\n",
" <tr>\n",
" <th>217</th>\n",
" <td>ValdeTravers</td>\n",
" <td>Examination</td>\n",
" <td>25.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>218</th>\n",
" <td>ValdeTravers</td>\n",
" <td>Education</td>\n",
" <td>7.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>219</th>\n",
" <td>ValdeTravers</td>\n",
" <td>Catholic</td>\n",
" <td>8.65</td>\n",
" </tr>\n",
" <tr>\n",
" <th>220</th>\n",
" <td>V. De Geneve</td>\n",
" <td>Fertility</td>\n",
" <td>35.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>221</th>\n",
" <td>V. De Geneve</td>\n",
" <td>Agriculture</td>\n",
" <td>1.20</td>\n",
" </tr>\n",
" <tr>\n",
" <th>222</th>\n",
" <td>V. De Geneve</td>\n",
" <td>Examination</td>\n",
" <td>37.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>223</th>\n",
" <td>V. De Geneve</td>\n",
" <td>Education</td>\n",
" <td>53.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>224</th>\n",
" <td>V. De Geneve</td>\n",
" <td>Catholic</td>\n",
" <td>42.34</td>\n",
" </tr>\n",
" <tr>\n",
" <th>225</th>\n",
" <td>Rive Droite</td>\n",
" <td>Fertility</td>\n",
" <td>44.70</td>\n",
" </tr>\n",
" <tr>\n",
" <th>226</th>\n",
" <td>Rive Droite</td>\n",
" <td>Agriculture</td>\n",
" <td>46.60</td>\n",
" </tr>\n",
" <tr>\n",
" <th>227</th>\n",
" <td>Rive Droite</td>\n",
" <td>Examination</td>\n",
" <td>16.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>228</th>\n",
" <td>Rive Droite</td>\n",
" <td>Education</td>\n",
" <td>29.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>229</th>\n",
" <td>Rive Droite</td>\n",
" <td>Catholic</td>\n",
" <td>50.43</td>\n",
" </tr>\n",
" <tr>\n",
" <th>230</th>\n",
" <td>Rive Gauche</td>\n",
" <td>Fertility</td>\n",
" <td>42.80</td>\n",
" </tr>\n",
" <tr>\n",
" <th>231</th>\n",
" <td>Rive Gauche</td>\n",
" <td>Agriculture</td>\n",
" <td>27.70</td>\n",
" </tr>\n",
" <tr>\n",
" <th>232</th>\n",
" <td>Rive Gauche</td>\n",
" <td>Examination</td>\n",
" <td>22.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>233</th>\n",
" <td>Rive Gauche</td>\n",
" <td>Education</td>\n",
" <td>29.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>234</th>\n",
" <td>Rive Gauche</td>\n",
" <td>Catholic</td>\n",
" <td>58.33</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>235 rows × 3 columns</p>\n",
"</div>"
],
"text/plain": [
" County Measure Value\n",
"0 Courtelary Fertility 80.20\n",
"1 Courtelary Agriculture 17.00\n",
"2 Courtelary Examination 15.00\n",
"3 Courtelary Education 12.00\n",
"4 Courtelary Catholic 9.96\n",
"5 Delemont Fertility 83.10\n",
"6 Delemont Agriculture 45.10\n",
"7 Delemont Examination 6.00\n",
"8 Delemont Education 9.00\n",
"9 Delemont Catholic 84.84\n",
"10 Franches-Mnt Fertility 92.50\n",
"11 Franches-Mnt Agriculture 39.70\n",
"12 Franches-Mnt Examination 5.00\n",
"13 Franches-Mnt Education 5.00\n",
"14 Franches-Mnt Catholic 93.40\n",
"15 Moutier Fertility 85.80\n",
"16 Moutier Agriculture 36.50\n",
"17 Moutier Examination 12.00\n",
"18 Moutier Education 7.00\n",
"19 Moutier Catholic 33.77\n",
"20 Neuveville Fertility 76.90\n",
"21 Neuveville Agriculture 43.50\n",
"22 Neuveville Examination 17.00\n",
"23 Neuveville Education 15.00\n",
"24 Neuveville Catholic 5.16\n",
"25 Porrentruy Fertility 76.10\n",
"26 Porrentruy Agriculture 35.30\n",
"27 Porrentruy Examination 9.00\n",
"28 Porrentruy Education 7.00\n",
"29 Porrentruy Catholic 90.57\n",
".. ... ... ...\n",
"205 Neuchatel Fertility 64.40\n",
"206 Neuchatel Agriculture 17.60\n",
"207 Neuchatel Examination 35.00\n",
"208 Neuchatel Education 32.00\n",
"209 Neuchatel Catholic 16.92\n",
"210 Val de Ruz Fertility 77.60\n",
"211 Val de Ruz Agriculture 37.60\n",
"212 Val de Ruz Examination 15.00\n",
"213 Val de Ruz Education 7.00\n",
"214 Val de Ruz Catholic 4.97\n",
"215 ValdeTravers Fertility 67.60\n",
"216 ValdeTravers Agriculture 18.70\n",
"217 ValdeTravers Examination 25.00\n",
"218 ValdeTravers Education 7.00\n",
"219 ValdeTravers Catholic 8.65\n",
"220 V. De Geneve Fertility 35.00\n",
"221 V. De Geneve Agriculture 1.20\n",
"222 V. De Geneve Examination 37.00\n",
"223 V. De Geneve Education 53.00\n",
"224 V. De Geneve Catholic 42.34\n",
"225 Rive Droite Fertility 44.70\n",
"226 Rive Droite Agriculture 46.60\n",
"227 Rive Droite Examination 16.00\n",
"228 Rive Droite Education 29.00\n",
"229 Rive Droite Catholic 50.43\n",
"230 Rive Gauche Fertility 42.80\n",
"231 Rive Gauche Agriculture 27.70\n",
"232 Rive Gauche Examination 22.00\n",
"233 Rive Gauche Education 29.00\n",
"234 Rive Gauche Catholic 58.33\n",
"\n",
"[235 rows x 3 columns]"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# `.stack()` works somewhat like `gather`, taking column *names* and entering them as a new index level.\n",
"# It also doesn't allow you to ignore a column, so we temporarily set 'County' as the first index level\n",
"# and then reset it.\n",
"dsc2 = dsc\\\n",
" .drop('Infant.Mortality', axis=1)\\\n",
" .set_index('County')\\\n",
" .stack()\\\n",
" .reset_index()\\\n",
" .rename(columns={'level_1': 'Measure', 0: 'Value'})\n",
"dsc2"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th>Measure</th>\n",
" <th>Agriculture</th>\n",
" <th>Catholic</th>\n",
" <th>Education</th>\n",
" <th>Examination</th>\n",
" <th>Fertility</th>\n",
" </tr>\n",
" <tr>\n",
" <th>County</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>Aigle</th>\n",
" <td>62.0</td>\n",
" <td>8.52</td>\n",
" <td>12</td>\n",
" <td>21</td>\n",
" <td>64.1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Aubonne</th>\n",
" <td>67.5</td>\n",
" <td>2.27</td>\n",
" <td>7</td>\n",
" <td>14</td>\n",
" <td>66.9</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Avenches</th>\n",
" <td>60.7</td>\n",
" <td>4.43</td>\n",
" <td>12</td>\n",
" <td>19</td>\n",
" <td>68.9</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Boudry</th>\n",
" <td>38.4</td>\n",
" <td>5.62</td>\n",
" <td>12</td>\n",
" <td>26</td>\n",
" <td>70.4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Broye</th>\n",
" <td>70.2</td>\n",
" <td>92.85</td>\n",
" <td>7</td>\n",
" <td>16</td>\n",
" <td>83.8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Conthey</th>\n",
" <td>85.9</td>\n",
" <td>99.71</td>\n",
" <td>2</td>\n",
" <td>3</td>\n",
" <td>75.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Cossonay</th>\n",
" <td>69.3</td>\n",
" <td>2.82</td>\n",
" <td>5</td>\n",
" <td>22</td>\n",
" <td>61.7</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Courtelary</th>\n",
" <td>17.0</td>\n",
" <td>9.96</td>\n",
" <td>12</td>\n",
" <td>15</td>\n",
" <td>80.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Delemont</th>\n",
" <td>45.1</td>\n",
" <td>84.84</td>\n",
" <td>9</td>\n",
" <td>6</td>\n",
" <td>83.1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Echallens</th>\n",
" <td>72.6</td>\n",
" <td>24.20</td>\n",
" <td>2</td>\n",
" <td>18</td>\n",
" <td>68.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Entremont</th>\n",
" <td>84.9</td>\n",
" <td>99.68</td>\n",
" <td>6</td>\n",
" <td>7</td>\n",
" <td>69.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Franches-Mnt</th>\n",
" <td>39.7</td>\n",
" <td>93.40</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>92.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Glane</th>\n",
" <td>67.8</td>\n",
" <td>97.16</td>\n",
" <td>8</td>\n",
" <td>14</td>\n",
" <td>92.4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Grandson</th>\n",
" <td>34.0</td>\n",
" <td>3.30</td>\n",
" <td>8</td>\n",
" <td>17</td>\n",
" <td>71.7</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Gruyere</th>\n",
" <td>53.3</td>\n",
" <td>97.67</td>\n",
" <td>7</td>\n",
" <td>12</td>\n",
" <td>82.4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Herens</th>\n",
" <td>89.7</td>\n",
" <td>100.00</td>\n",
" <td>2</td>\n",
" <td>5</td>\n",
" <td>77.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>La Chauxdfnd</th>\n",
" <td>7.7</td>\n",
" <td>13.79</td>\n",
" <td>11</td>\n",
" <td>29</td>\n",
" <td>65.7</td>\n",
" </tr>\n",
" <tr>\n",
" <th>La Vallee</th>\n",
" <td>15.2</td>\n",
" <td>2.15</td>\n",
" <td>20</td>\n",
" <td>31</td>\n",
" <td>54.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Lausanne</th>\n",
" <td>19.4</td>\n",
" <td>12.11</td>\n",
" <td>28</td>\n",
" <td>26</td>\n",
" <td>55.7</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Lavaux</th>\n",
" <td>73.0</td>\n",
" <td>2.84</td>\n",
" <td>9</td>\n",
" <td>19</td>\n",
" <td>65.1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Le Locle</th>\n",
" <td>16.7</td>\n",
" <td>11.22</td>\n",
" <td>13</td>\n",
" <td>22</td>\n",
" <td>72.7</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Martigwy</th>\n",
" <td>78.2</td>\n",
" <td>98.96</td>\n",
" <td>6</td>\n",
" <td>12</td>\n",
" <td>70.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Monthey</th>\n",
" <td>64.9</td>\n",
" <td>98.22</td>\n",
" <td>3</td>\n",
" <td>7</td>\n",
" <td>79.4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Morges</th>\n",
" <td>59.8</td>\n",
" <td>5.23</td>\n",
" <td>10</td>\n",
" <td>22</td>\n",
" <td>65.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Moudon</th>\n",
" <td>55.1</td>\n",
" <td>4.52</td>\n",
" <td>3</td>\n",
" <td>14</td>\n",
" <td>65.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Moutier</th>\n",
" <td>36.5</td>\n",
" <td>33.77</td>\n",
" <td>7</td>\n",
" <td>12</td>\n",
" <td>85.8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Neuchatel</th>\n",
" <td>17.6</td>\n",
" <td>16.92</td>\n",
" <td>32</td>\n",
" <td>35</td>\n",
" <td>64.4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Neuveville</th>\n",
" <td>43.5</td>\n",
" <td>5.16</td>\n",
" <td>15</td>\n",
" <td>17</td>\n",
" <td>76.9</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Nyone</th>\n",
" <td>50.9</td>\n",
" <td>15.14</td>\n",
" <td>12</td>\n",
" <td>22</td>\n",
" <td>56.6</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Orbe</th>\n",
" <td>54.1</td>\n",
" <td>4.20</td>\n",
" <td>6</td>\n",
" <td>20</td>\n",
" <td>57.4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Oron</th>\n",
" <td>71.2</td>\n",
" <td>2.40</td>\n",
" <td>1</td>\n",
" <td>12</td>\n",
" <td>72.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Payerne</th>\n",
" <td>58.1</td>\n",
" <td>5.23</td>\n",
" <td>8</td>\n",
" <td>14</td>\n",
" <td>74.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Paysd'enhaut</th>\n",
" <td>63.5</td>\n",
" <td>2.56</td>\n",
" <td>3</td>\n",
" <td>6</td>\n",
" <td>72.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Porrentruy</th>\n",
" <td>35.3</td>\n",
" <td>90.57</td>\n",
" <td>7</td>\n",
" <td>9</td>\n",
" <td>76.1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Rive Droite</th>\n",
" <td>46.6</td>\n",
" <td>50.43</td>\n",
" <td>29</td>\n",
" <td>16</td>\n",
" <td>44.7</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Rive Gauche</th>\n",
" <td>27.7</td>\n",
" <td>58.33</td>\n",
" <td>29</td>\n",
" <td>22</td>\n",
" <td>42.8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Rolle</th>\n",
" <td>60.8</td>\n",
" <td>7.72</td>\n",
" <td>10</td>\n",
" <td>16</td>\n",
" <td>60.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Sarine</th>\n",
" <td>45.2</td>\n",
" <td>91.38</td>\n",
" <td>13</td>\n",
" <td>16</td>\n",
" <td>82.9</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Sierre</th>\n",
" <td>84.6</td>\n",
" <td>99.46</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>92.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Sion</th>\n",
" <td>63.1</td>\n",
" <td>96.83</td>\n",
" <td>13</td>\n",
" <td>13</td>\n",
" <td>79.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>St Maurice</th>\n",
" <td>75.9</td>\n",
" <td>99.06</td>\n",
" <td>9</td>\n",
" <td>9</td>\n",
" <td>65.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>V. De Geneve</th>\n",
" <td>1.2</td>\n",
" <td>42.34</td>\n",
" <td>53</td>\n",
" <td>37</td>\n",
" <td>35.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Val de Ruz</th>\n",
" <td>37.6</td>\n",
" <td>4.97</td>\n",
" <td>7</td>\n",
" <td>15</td>\n",
" <td>77.6</td>\n",
" </tr>\n",
" <tr>\n",
" <th>ValdeTravers</th>\n",
" <td>18.7</td>\n",
" <td>8.65</td>\n",
" <td>7</td>\n",
" <td>25</td>\n",
" <td>67.6</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Vevey</th>\n",
" <td>26.8</td>\n",
" <td>18.46</td>\n",
" <td>19</td>\n",
" <td>25</td>\n",
" <td>58.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Veveyse</th>\n",
" <td>64.5</td>\n",
" <td>98.61</td>\n",
" <td>6</td>\n",
" <td>14</td>\n",
" <td>87.1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Yverdon</th>\n",
" <td>49.5</td>\n",
" <td>6.10</td>\n",
" <td>8</td>\n",
" <td>15</td>\n",
" <td>65.4</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"Measure Agriculture Catholic Education Examination Fertility\n",
"County \n",
"Aigle 62.0 8.52 12 21 64.1\n",
"Aubonne 67.5 2.27 7 14 66.9\n",
"Avenches 60.7 4.43 12 19 68.9\n",
"Boudry 38.4 5.62 12 26 70.4\n",
"Broye 70.2 92.85 7 16 83.8\n",
"Conthey 85.9 99.71 2 3 75.5\n",
"Cossonay 69.3 2.82 5 22 61.7\n",
"Courtelary 17.0 9.96 12 15 80.2\n",
"Delemont 45.1 84.84 9 6 83.1\n",
"Echallens 72.6 24.20 2 18 68.3\n",
"Entremont 84.9 99.68 6 7 69.3\n",
"Franches-Mnt 39.7 93.40 5 5 92.5\n",
"Glane 67.8 97.16 8 14 92.4\n",
"Grandson 34.0 3.30 8 17 71.7\n",
"Gruyere 53.3 97.67 7 12 82.4\n",
"Herens 89.7 100.00 2 5 77.3\n",
"La Chauxdfnd 7.7 13.79 11 29 65.7\n",
"La Vallee 15.2 2.15 20 31 54.3\n",
"Lausanne 19.4 12.11 28 26 55.7\n",
"Lavaux 73.0 2.84 9 19 65.1\n",
"Le Locle 16.7 11.22 13 22 72.7\n",
"Martigwy 78.2 98.96 6 12 70.5\n",
"Monthey 64.9 98.22 3 7 79.4\n",
"Morges 59.8 5.23 10 22 65.5\n",
"Moudon 55.1 4.52 3 14 65.0\n",
"Moutier 36.5 33.77 7 12 85.8\n",
"Neuchatel 17.6 16.92 32 35 64.4\n",
"Neuveville 43.5 5.16 15 17 76.9\n",
"Nyone 50.9 15.14 12 22 56.6\n",
"Orbe 54.1 4.20 6 20 57.4\n",
"Oron 71.2 2.40 1 12 72.5\n",
"Payerne 58.1 5.23 8 14 74.2\n",
"Paysd'enhaut 63.5 2.56 3 6 72.0\n",
"Porrentruy 35.3 90.57 7 9 76.1\n",
"Rive Droite 46.6 50.43 29 16 44.7\n",
"Rive Gauche 27.7 58.33 29 22 42.8\n",
"Rolle 60.8 7.72 10 16 60.5\n",
"Sarine 45.2 91.38 13 16 82.9\n",
"Sierre 84.6 99.46 3 3 92.2\n",
"Sion 63.1 96.83 13 13 79.3\n",
"St Maurice 75.9 99.06 9 9 65.0\n",
"V. De Geneve 1.2 42.34 53 37 35.0\n",
"Val de Ruz 37.6 4.97 7 15 77.6\n",
"ValdeTravers 18.7 8.65 7 25 67.6\n",
"Vevey 26.8 18.46 19 25 58.3\n",
"Veveyse 64.5 98.61 6 14 87.1\n",
"Yverdon 49.5 6.10 8 15 65.4"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# `.unstack()` probably works like `spread`, but I still can't wrap my head around it.\n",
"# `.pivot()` seems good enough here...\n",
"dsc2.pivot(index='County', columns='Measure', values='Value')"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"Fertility 70.142553\n",
"Agriculture 50.659574\n",
"Examination 16.489362\n",
"Education 10.978723\n",
"Catholic 41.143830\n",
"Infant.Mortality 19.942553\n",
"dtype: float64"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dsc\\\n",
" .drop('County', axis='columns')\\\n",
" .stack()\\\n",
" .groupby(level=1)\\\n",
" .mean()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Challenge: Try this out for yourself!\n",
"\n",
"Make a table with the means of `Agriculture`, `Examination`, `Education`,\n",
"and `Infant.Mortality` for each category of `Fertility` (hint: convert it\n",
"into a factor by values `>50` vs `<50`), when `Catholic` is less than 60.\n",
"Have the `Fertility` groups as two columns in the new table."
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th>Fertility</th>\n",
" <th>False</th>\n",
" <th>True</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>Agriculture</th>\n",
" <td>25.166667</td>\n",
" <td>44.900000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Examination</th>\n",
" <td>25.000000</td>\n",
" <td>19.607143</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Education</th>\n",
" <td>37.000000</td>\n",
" <td>10.678571</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Infant.Mortality</th>\n",
" <td>18.500000</td>\n",
" <td>19.621429</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"Fertility False True \n",
"Agriculture 25.166667 44.900000\n",
"Examination 25.000000 19.607143\n",
"Education 37.000000 10.678571\n",
"Infant.Mortality 18.500000 19.621429"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ds3 = dsc.loc[dsc.Catholic < 60].drop(['Catholic', 'County'], axis=1)\n",
"# This may not be the most straightforward way...\n",
"# ds3 = ds.loc[ds.Catholic < 60, ['Agriculture', 'Examination', 'Education', 'Infant.Mortality', 'Fertility']]\n",
"\n",
"# Set fertility to boolean yes/no\n",
"ds3.Fertility = ds3.Fertility > 50\n",
"\n",
"# Alternate method: Set separate infertility column\n",
"#ds3['infertility'] = 'no'\n",
"#ds3.loc[ds3.Fertility < 50, 'infertility'] = 'yes'\n",
"\n",
"ds3.groupby('Fertility').mean().T"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.4.3"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Fertility Agriculture Examination Education Catholic Infant.Mortality
Courtelary 80.2 17 15 12 9.96 22.2
Delemont 83.1 45.1 6 9 84.84 22.2
Franches-Mnt 92.5 39.7 5 5 93.4 20.2
Moutier 85.8 36.5 12 7 33.77 20.3
Neuveville 76.9 43.5 17 15 5.16 20.6
Porrentruy 76.1 35.3 9 7 90.57 26.6
Broye 83.8 70.2 16 7 92.85 23.6
Glane 92.4 67.8 14 8 97.16 24.9
Gruyere 82.4 53.3 12 7 97.67 21
Sarine 82.9 45.2 16 13 91.38 24.4
Veveyse 87.1 64.5 14 6 98.61 24.5
Aigle 64.1 62 21 12 8.52 16.5
Aubonne 66.9 67.5 14 7 2.27 19.1
Avenches 68.9 60.7 19 12 4.43 22.7
Cossonay 61.7 69.3 22 5 2.82 18.7
Echallens 68.3 72.6 18 2 24.2 21.2
Grandson 71.7 34 17 8 3.3 20
Lausanne 55.7 19.4 26 28 12.11 20.2
La Vallee 54.3 15.2 31 20 2.15 10.8
Lavaux 65.1 73 19 9 2.84 20
Morges 65.5 59.8 22 10 5.23 18
Moudon 65 55.1 14 3 4.52 22.4
Nyone 56.6 50.9 22 12 15.14 16.7
Orbe 57.4 54.1 20 6 4.2 15.3
Oron 72.5 71.2 12 1 2.4 21
Payerne 74.2 58.1 14 8 5.23 23.8
Paysd'enhaut 72 63.5 6 3 2.56 18
Rolle 60.5 60.8 16 10 7.72 16.3
Vevey 58.3 26.8 25 19 18.46 20.9
Yverdon 65.4 49.5 15 8 6.1 22.5
Conthey 75.5 85.9 3 2 99.71 15.1
Entremont 69.3 84.9 7 6 99.68 19.8
Herens 77.3 89.7 5 2 100 18.3
Martigwy 70.5 78.2 12 6 98.96 19.4
Monthey 79.4 64.9 7 3 98.22 20.2
St Maurice 65 75.9 9 9 99.06 17.8
Sierre 92.2 84.6 3 3 99.46 16.3
Sion 79.3 63.1 13 13 96.83 18.1
Boudry 70.4 38.4 26 12 5.62 20.3
La Chauxdfnd 65.7 7.7 29 11 13.79 20.5
Le Locle 72.7 16.7 22 13 11.22 18.9
Neuchatel 64.4 17.6 35 32 16.92 23
Val de Ruz 77.6 37.6 15 7 4.97 20
ValdeTravers 67.6 18.7 25 7 8.65 19.5
V. De Geneve 35 1.2 37 53 42.34 18
Rive Droite 44.7 46.6 16 29 50.43 18.2
Rive Gauche 42.8 27.7 22 29 58.33 19.3
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment