@iswetha522
Created March 17, 2021 16:11
{
"cells": [
{
"metadata": {},
"cell_type": "markdown",
"source": "# Working with structured data in Python using Pandas\n\n\n## Table of Contents\n\n1. [Introduction](#introduction)<br>\n2. [Series and DataFrames](#series)<br>\n3. [Cleaning Data](#cleaning)<br>\n4. [Selecting Data](#selection)<br>\n5. [Merging Data](#merging)<br>\n6. [Grouping Data](#grouping)<br>\n7. [Visualising Data](#visualise)<br>"
},
{
"metadata": {},
"cell_type": "markdown",
"source": "<a id=\"introduction\"></a>\n## 1. Introduction\n\nA lot of data is **structured data**, which is data that is organized and formatted so it is easily readable, for example a table with variables as columns and records as rows, or key-value pairs in a noSQL database. As long as the data is formatted consistently and has multiple records with numbers, text and dates, you can probably read the data with [Pandas](https://pandas.pydata.org/pandas-docs/stable/index.html), an open-source Python package providing high-performance data manipulation and analysis."
},
{
"metadata": {},
"cell_type": "markdown",
"source": "### Data\n\nThe data that you will explore in this notebook is about the boroughs in London. Within Greater London there are [32 boroughs](https://en.wikipedia.org/wiki/London_boroughs). You can download the data from [data.gov.uk](https://data.gov.uk/dataset/248f5f04-23cf-4470-9216-0d0be9b877a8/london-borough-profiles-and-atlas) where this description is given:\n\n> The London Borough Profiles help paint a general picture of an area by presenting a range of headline indicator data to help show statistics covering demographic, economic, social and environmental datasets for each borough, alongside relevant comparator areas. \n\n**Let's start with loading the required Python packages and loading our data into the notebook.**\n\n* To run the code, select the below cell by clicking on it, and then click on the `Run` button at the top of the notebook (or use `Shift+Enter`), to run the cells in the notebook\n* The numbers in front of the cells tell you in which order you have run them, for instance `[1]`\n* When you see a `[*]` the cell is currently running and `[]` means you have not run the cell yet. Make sure run all of them!"
},
{
"metadata": {},
"cell_type": "code",
"source": "import numpy as np\nimport pandas as pd",
"execution_count": 1,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "**Read data from a CSV file using the `read_csv` function. Load a file by running the next cell:**\n\nThis file is read directly from a URL, but you can replace this with a local path when running this notebook on a local system. When you are using IBM Watson Studio you can also [upload](https://dataplatform.cloud.ibm.com/docs/content/wsj/manage-data/add-data-project.html?linkInPage=true) a file to your Cloud Object Storage, and then [import](https://dataplatform.cloud.ibm.com/docs/content/wsj/manage-data/add-data-project.html?linkInPage=true#os) it by clicking on the file in the menu on the right of the notebook. "
},
{
"metadata": {},
"cell_type": "code",
"source": "df = pd.read_csv('https://data.london.gov.uk/download/london-borough-profiles/c1693b82-68b1-44ee-beb2-3decf17dc1f8/london-borough-profiles.csv',encoding = 'unicode_escape', sep=',', thousands=',')",
"execution_count": 2,
"outputs": []
},
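{
"metadata": {},
"cell_type": "markdown",
"source": "A minimal sketch of what the `sep` and `thousands` arguments of `read_csv` do, using a small made-up table instead of the borough file. The place names and numbers are hypothetical and only serve as an illustration:"
},
{
"metadata": {},
"cell_type": "code",
"source": "import io\nimport pandas as pd\n\n# Hypothetical toy data (not from the borough file): ';' separates the fields\n# and ',' is the thousands separator inside the numbers\ntoy_csv = io.StringIO('Area_name;Population\\nToyville;1,234\\nSampleton;56,789')\n\n# thousands=',' lets pandas read '1,234' as the number 1234 instead of as text\ntoy = pd.read_csv(toy_csv, sep=';', thousands=',')\nprint(toy.dtypes)\nprint(toy)",
"execution_count": null,
"outputs": []
},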
{
"metadata": {},
"cell_type": "markdown",
"source": "Only keep the data from the 32 boroughs by removng the last 5 rows from the DataFrame: "
},
{
"metadata": {},
"cell_type": "code",
"source": "df = df.drop([33,34,35,36,37])\ndf.head(35)",
"execution_count": 3,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 3,
"data": {
"text/plain": " Code Area_name Inner/_Outer_London \\\n0 E09000001 City of London Inner London \n1 E09000002 Barking and Dagenham Outer London \n2 E09000003 Barnet Outer London \n3 E09000004 Bexley Outer London \n4 E09000005 Brent Outer London \n5 E09000006 Bromley Outer London \n6 E09000007 Camden Inner London \n7 E09000008 Croydon Outer London \n8 E09000009 Ealing Outer London \n9 E09000010 Enfield Outer London \n10 E09000011 Greenwich Outer London \n11 E09000012 Hackney Inner London \n12 E09000013 Hammersmith and Fulham Inner London \n13 E09000014 Haringey Inner London \n14 E09000015 Harrow Outer London \n15 E09000016 Havering Outer London \n16 E09000017 Hillingdon Outer London \n17 E09000018 Hounslow Outer London \n18 E09000019 Islington Inner London \n19 E09000020 Kensington and Chelsea Inner London \n20 E09000021 Kingston upon Thames Outer London \n21 E09000022 Lambeth Inner London \n22 E09000023 Lewisham Inner London \n23 E09000024 Merton Outer London \n24 E09000025 Newham Inner London \n25 E09000026 Redbridge Outer London \n26 E09000027 Richmond upon Thames Outer London \n27 E09000028 Southwark Inner London \n28 E09000029 Sutton Outer London \n29 E09000030 Tower Hamlets Inner London \n30 E09000031 Waltham Forest Outer London \n31 E09000032 Wandsworth Inner London \n32 E09000033 Westminster Inner London \n\n GLA_Population_Estimate_2017 GLA_Household_Estimate_2017 \\\n0 8800 5326 \n1 209000 78188 \n2 389600 151423 \n3 244300 97736 \n4 332100 121048 \n5 327900 140602 \n6 242500 107654 \n7 386500 159010 \n8 351600 132663 \n9 333000 130328 \n10 280100 113964 \n11 274300 115417 \n12 185300 83552 \n13 278000 115608 \n14 252300 92557 \n15 254300 104098 \n16 301000 110827 \n17 274200 105887 \n18 231200 105038 \n19 159000 80200 \n20 175400 69849 \n21 328900 144400 \n22 303400 131076 \n23 208100 84201 \n24 342900 119172 \n25 304200 110708 \n26 197300 85108 \n27 314300 134254 \n28 202600 85243 \n29 304000 123720 \n30 276200 105981 \n31 321000 138149 \n32 242100 118975 
\n\n Inland_Area_(Hectares) Population_density_(per_hectare)_2017 \\\n0 290 30.3 \n1 3,611 57.9 \n2 8,675 44.9 \n3 6,058 40.3 \n4 4,323 76.8 \n5 15,013 21.8 \n6 2,179 111.3 \n7 8,650 44.7 \n8 5,554 63.3 \n9 8,083 41.2 \n10 4,733 59.2 \n11 1,905 144 \n12 1,640 113 \n13 2,960 93.9 \n14 5,046 50 \n15 11,235 22.6 \n16 11,570 26 \n17 5,598 49 \n18 1,486 155.6 \n19 1,212 131.1 \n20 3,726 47.1 \n21 2,681 122.7 \n22 3,515 86.3 \n23 3,762 55.3 \n24 3,620 94.7 \n25 5,642 53.9 \n26 5,741 34.4 \n27 2,886 108.9 \n28 4,385 46.2 \n29 1,978 153.7 \n30 3,881 71.2 \n31 3,426 93.7 \n32 2,149 112.7 \n\n Average_Age,_2017 Proportion_of_population_aged_0-15,_2015 \\\n0 43.2 11.4 \n1 32.9 27.2 \n2 37.3 21.1 \n3 39.0 20.6 \n4 35.6 20.9 \n5 40.2 19.9 \n6 36.4 17.3 \n7 37.0 22.0 \n8 36.2 21.4 \n9 36.3 22.8 \n10 35.0 21.9 \n11 33.1 20.7 \n12 35.7 17.4 \n13 35.1 20.0 \n14 38.3 20.5 \n15 40.3 19.3 \n16 36.4 21.3 \n17 35.8 21.1 \n18 34.8 15.9 \n19 39.3 16.4 \n20 37.1 19.6 \n21 34.5 17.6 \n22 35.0 20.6 \n23 36.7 20.6 \n24 32.1 22.7 \n25 35.8 22.8 \n26 38.8 20.7 \n27 34.4 18.6 \n28 38.9 20.7 \n29 31.4 20.1 \n30 35.1 21.8 \n31 35.0 17.8 \n32 37.7 15.9 \n\n Proportion_of_population_of_working-age,_2015 ... \\\n0 73.1 ... \n1 63.1 ... \n2 64.9 ... \n3 62.9 ... \n4 67.8 ... \n5 62.6 ... \n6 71.0 ... \n7 64.9 ... \n8 66.8 ... \n9 64.4 ... \n10 67.7 ... \n11 72.1 ... \n12 72.3 ... \n13 70.7 ... \n14 64.5 ... \n15 62.3 ... \n16 65.6 ... \n17 67.6 ... \n18 75.3 ... \n19 69.3 ... \n20 67.2 ... \n21 74.6 ... \n22 70.1 ... \n23 67.2 ... \n24 70.2 ... \n25 65.0 ... \n26 64.5 ... \n27 73.5 ... \n28 64.3 ... \n29 73.9 ... \n30 67.9 ... \n31 72.8 ... \n32 72.3 ... 
\n\n Happiness_score_2011-14_(out_of_10) Anxiety_score_2011-14_(out_of_10) \\\n0 6.0 5.6 \n1 7.1 3.1 \n2 7.4 2.8 \n3 7.2 3.3 \n4 7.2 2.9 \n5 7.4 3.3 \n6 7.1 3.6 \n7 7.2 3.3 \n8 7.3 3.6 \n9 7.3 2.6 \n10 7.2 3.4 \n11 7.0 3.8 \n12 7.2 3.1 \n13 7.2 3.2 \n14 7.3 2.7 \n15 7.2 3.3 \n16 7.3 3.5 \n17 7.4 3.4 \n18 7.1 3.7 \n19 7.6 3.1 \n20 7.4 3.3 \n21 7.2 3.5 \n22 7.3 3.4 \n23 7.1 3.6 \n24 7.2 3.4 \n25 7.3 3.2 \n26 7.3 3.2 \n27 7.3 3.4 \n28 7.3 3.2 \n29 7.2 3.3 \n30 7.1 3.1 \n31 7.4 3.6 \n32 7.1 3.4 \n\n Childhood_Obesity_Prevalance_(%)_2015/16 People_aged_17+_with_diabetes_(%) \\\n0 NaN 2.6 \n1 28.5 7.3 \n2 20.7 6.0 \n3 22.7 6.9 \n4 24.3 7.9 \n5 16 5.2 \n6 21.3 3.9 \n7 24.5 6.5 \n8 23.8 6.9 \n9 25.2 7.0 \n10 27.7 6.1 \n11 27 5.8 \n12 21.3 4.4 \n13 23.8 5.9 \n14 20.2 8.5 \n15 21.8 5.9 \n16 21.1 6.4 \n17 24.1 6.5 \n18 22.8 5.0 \n19 18.6 4.2 \n20 16.9 4.9 \n21 23 5.0 \n22 23.6 6.1 \n23 19.2 5.6 \n24 27.6 7.6 \n25 23.3 7.9 \n26 12.6 3.7 \n27 27.6 5.5 \n28 18.4 5.9 \n29 27.1 6.6 \n30 26.3 6.4 \n31 19.3 4.2 \n32 24.9 4.4 \n\n Mortality_rate_from_causes_considered_preventable_2012/14 \\\n0 129 \n1 228 \n2 134 \n3 164 \n4 169 \n5 148 \n6 164 \n7 178 \n8 164 \n9 152 \n10 193 \n11 211 \n12 187 \n13 183 \n14 134 \n15 159 \n16 170 \n17 166 \n18 203 \n19 136 \n20 141 \n21 205 \n22 191 \n23 162 \n24 193 \n25 142 \n26 137 \n27 207 \n28 163 \n29 239 \n30 185 \n31 177 \n32 162 \n\n Political_control_in_council \\\n0 . \n1 Lab \n2 Cons \n3 Cons \n4 Lab \n5 Cons \n6 Lab \n7 Lab \n8 Lab \n9 Lab \n10 Lab \n11 Lab \n12 Lab \n13 Lab \n14 Lab \n15 No Overall Control \n16 Cons \n17 Lab \n18 Lab \n19 Cons \n20 Cons \n21 Lab \n22 Lab \n23 Lab \n24 Lab \n25 Lab \n26 Cons \n27 Lab \n28 Lib Dem \n29 Tower Hamlets First \n30 Lab \n31 Cons \n32 Cons \n\n Proportion_of_seats_won_by_Conservatives_in_2014_election \\\n0 . 
\n1 0 \n2 50.8 \n3 71.4 \n4 9.5 \n5 85 \n6 22.2 \n7 42.9 \n8 17.4 \n9 34.9 \n10 15.7 \n11 7 \n12 43.5 \n13 0 \n14 41.3 \n15 40.7 \n16 64.6 \n17 18.3 \n18 0 \n19 74 \n20 58.3 \n21 4.8 \n22 0 \n23 33.3 \n24 0 \n25 39.7 \n26 72.2 \n27 3.2 \n28 16.7 \n29 11.1 \n30 26.7 \n31 68.3 \n32 73.3 \n\n Proportion_of_seats_won_by_Labour_in_2014_election \\\n0 . \n1 100 \n2 . \n3 23.8 \n4 88.9 \n5 11.7 \n6 74.1 \n7 57.1 \n8 76.8 \n9 65.1 \n10 84.3 \n11 87.7 \n12 56.5 \n13 84.2 \n14 54 \n15 1.9 \n16 35.4 \n17 81.7 \n18 97.9 \n19 24 \n20 4.2 \n21 93.7 \n22 98.1 \n23 60 \n24 100 \n25 55.6 \n26 0 \n27 76.2 \n28 0 \n29 48.9 \n30 73.3 \n31 31.7 \n32 26.7 \n\n Proportion_of_seats_won_by_Lib_Dems_in_2014_election \\\n0 . \n1 0 \n2 1.6 \n3 0 \n4 1.6 \n5 0 \n6 1.9 \n7 0 \n8 5.8 \n9 0 \n10 0 \n11 5.3 \n12 0 \n13 15.8 \n14 1.6 \n15 0 \n16 0 \n17 0 \n18 0 \n19 2 \n20 37.5 \n21 0 \n22 0 \n23 1.7 \n24 0 \n25 4.8 \n26 27.8 \n27 20.6 \n28 83.3 \n29 0 \n30 0 \n31 0 \n32 0 \n\n Turnout_at_2014_local_elections \n0 . \n1 36.5 \n2 40.5 \n3 39.6 \n4 36.3 \n5 40.8 \n6 38.7 \n7 38.6 \n8 41.2 \n9 38.2 \n10 37.3 \n11 39.4 \n12 37.6 \n13 38.1 \n14 40.7 \n15 43.1 \n16 36.1 \n17 36.8 \n18 38.4 \n19 29.8 \n20 43.1 \n21 34.5 \n22 37.2 \n23 41.3 \n24 40.5 \n25 39.7 \n26 46.1 \n27 36.2 \n28 42.6 \n29 47.2 \n30 37.6 \n31 36.9 \n32 32.3 \n\n[33 rows x 84 columns]",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Code</th>\n <th>Area_name</th>\n <th>Inner/_Outer_London</th>\n <th>GLA_Population_Estimate_2017</th>\n <th>GLA_Household_Estimate_2017</th>\n <th>Inland_Area_(Hectares)</th>\n <th>Population_density_(per_hectare)_2017</th>\n <th>Average_Age,_2017</th>\n <th>Proportion_of_population_aged_0-15,_2015</th>\n <th>Proportion_of_population_of_working-age,_2015</th>\n <th>...</th>\n <th>Happiness_score_2011-14_(out_of_10)</th>\n <th>Anxiety_score_2011-14_(out_of_10)</th>\n <th>Childhood_Obesity_Prevalance_(%)_2015/16</th>\n <th>People_aged_17+_with_diabetes_(%)</th>\n <th>Mortality_rate_from_causes_considered_preventable_2012/14</th>\n <th>Political_control_in_council</th>\n <th>Proportion_of_seats_won_by_Conservatives_in_2014_election</th>\n <th>Proportion_of_seats_won_by_Labour_in_2014_election</th>\n <th>Proportion_of_seats_won_by_Lib_Dems_in_2014_election</th>\n <th>Turnout_at_2014_local_elections</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>E09000001</td>\n <td>City of London</td>\n <td>Inner London</td>\n <td>8800</td>\n <td>5326</td>\n <td>290</td>\n <td>30.3</td>\n <td>43.2</td>\n <td>11.4</td>\n <td>73.1</td>\n <td>...</td>\n <td>6.0</td>\n <td>5.6</td>\n <td>NaN</td>\n <td>2.6</td>\n <td>129</td>\n <td>.</td>\n <td>.</td>\n <td>.</td>\n <td>.</td>\n <td>.</td>\n </tr>\n <tr>\n <th>1</th>\n <td>E09000002</td>\n <td>Barking and Dagenham</td>\n <td>Outer London</td>\n <td>209000</td>\n <td>78188</td>\n <td>3,611</td>\n <td>57.9</td>\n <td>32.9</td>\n <td>27.2</td>\n <td>63.1</td>\n <td>...</td>\n <td>7.1</td>\n <td>3.1</td>\n <td>28.5</td>\n <td>7.3</td>\n <td>228</td>\n <td>Lab</td>\n <td>0</td>\n <td>100</td>\n <td>0</td>\n 
<td>36.5</td>\n </tr>\n <tr>\n <th>2</th>\n <td>E09000003</td>\n <td>Barnet</td>\n <td>Outer London</td>\n <td>389600</td>\n <td>151423</td>\n <td>8,675</td>\n <td>44.9</td>\n <td>37.3</td>\n <td>21.1</td>\n <td>64.9</td>\n <td>...</td>\n <td>7.4</td>\n <td>2.8</td>\n <td>20.7</td>\n <td>6.0</td>\n <td>134</td>\n <td>Cons</td>\n <td>50.8</td>\n <td>.</td>\n <td>1.6</td>\n <td>40.5</td>\n </tr>\n <tr>\n <th>3</th>\n <td>E09000004</td>\n <td>Bexley</td>\n <td>Outer London</td>\n <td>244300</td>\n <td>97736</td>\n <td>6,058</td>\n <td>40.3</td>\n <td>39.0</td>\n <td>20.6</td>\n <td>62.9</td>\n <td>...</td>\n <td>7.2</td>\n <td>3.3</td>\n <td>22.7</td>\n <td>6.9</td>\n <td>164</td>\n <td>Cons</td>\n <td>71.4</td>\n <td>23.8</td>\n <td>0</td>\n <td>39.6</td>\n </tr>\n <tr>\n <th>4</th>\n <td>E09000005</td>\n <td>Brent</td>\n <td>Outer London</td>\n <td>332100</td>\n <td>121048</td>\n <td>4,323</td>\n <td>76.8</td>\n <td>35.6</td>\n <td>20.9</td>\n <td>67.8</td>\n <td>...</td>\n <td>7.2</td>\n <td>2.9</td>\n <td>24.3</td>\n <td>7.9</td>\n <td>169</td>\n <td>Lab</td>\n <td>9.5</td>\n <td>88.9</td>\n <td>1.6</td>\n <td>36.3</td>\n </tr>\n <tr>\n <th>5</th>\n <td>E09000006</td>\n <td>Bromley</td>\n <td>Outer London</td>\n <td>327900</td>\n <td>140602</td>\n <td>15,013</td>\n <td>21.8</td>\n <td>40.2</td>\n <td>19.9</td>\n <td>62.6</td>\n <td>...</td>\n <td>7.4</td>\n <td>3.3</td>\n <td>16</td>\n <td>5.2</td>\n <td>148</td>\n <td>Cons</td>\n <td>85</td>\n <td>11.7</td>\n <td>0</td>\n <td>40.8</td>\n </tr>\n <tr>\n <th>6</th>\n <td>E09000007</td>\n <td>Camden</td>\n <td>Inner London</td>\n <td>242500</td>\n <td>107654</td>\n <td>2,179</td>\n <td>111.3</td>\n <td>36.4</td>\n <td>17.3</td>\n <td>71.0</td>\n <td>...</td>\n <td>7.1</td>\n <td>3.6</td>\n <td>21.3</td>\n <td>3.9</td>\n <td>164</td>\n <td>Lab</td>\n <td>22.2</td>\n <td>74.1</td>\n <td>1.9</td>\n <td>38.7</td>\n </tr>\n <tr>\n <th>7</th>\n <td>E09000008</td>\n <td>Croydon</td>\n <td>Outer London</td>\n 
<td>386500</td>\n <td>159010</td>\n <td>8,650</td>\n <td>44.7</td>\n <td>37.0</td>\n <td>22.0</td>\n <td>64.9</td>\n <td>...</td>\n <td>7.2</td>\n <td>3.3</td>\n <td>24.5</td>\n <td>6.5</td>\n <td>178</td>\n <td>Lab</td>\n <td>42.9</td>\n <td>57.1</td>\n <td>0</td>\n <td>38.6</td>\n </tr>\n <tr>\n <th>8</th>\n <td>E09000009</td>\n <td>Ealing</td>\n <td>Outer London</td>\n <td>351600</td>\n <td>132663</td>\n <td>5,554</td>\n <td>63.3</td>\n <td>36.2</td>\n <td>21.4</td>\n <td>66.8</td>\n <td>...</td>\n <td>7.3</td>\n <td>3.6</td>\n <td>23.8</td>\n <td>6.9</td>\n <td>164</td>\n <td>Lab</td>\n <td>17.4</td>\n <td>76.8</td>\n <td>5.8</td>\n <td>41.2</td>\n </tr>\n <tr>\n <th>9</th>\n <td>E09000010</td>\n <td>Enfield</td>\n <td>Outer London</td>\n <td>333000</td>\n <td>130328</td>\n <td>8,083</td>\n <td>41.2</td>\n <td>36.3</td>\n <td>22.8</td>\n <td>64.4</td>\n <td>...</td>\n <td>7.3</td>\n <td>2.6</td>\n <td>25.2</td>\n <td>7.0</td>\n <td>152</td>\n <td>Lab</td>\n <td>34.9</td>\n <td>65.1</td>\n <td>0</td>\n <td>38.2</td>\n </tr>\n <tr>\n <th>10</th>\n <td>E09000011</td>\n <td>Greenwich</td>\n <td>Outer London</td>\n <td>280100</td>\n <td>113964</td>\n <td>4,733</td>\n <td>59.2</td>\n <td>35.0</td>\n <td>21.9</td>\n <td>67.7</td>\n <td>...</td>\n <td>7.2</td>\n <td>3.4</td>\n <td>27.7</td>\n <td>6.1</td>\n <td>193</td>\n <td>Lab</td>\n <td>15.7</td>\n <td>84.3</td>\n <td>0</td>\n <td>37.3</td>\n </tr>\n <tr>\n <th>11</th>\n <td>E09000012</td>\n <td>Hackney</td>\n <td>Inner London</td>\n <td>274300</td>\n <td>115417</td>\n <td>1,905</td>\n <td>144</td>\n <td>33.1</td>\n <td>20.7</td>\n <td>72.1</td>\n <td>...</td>\n <td>7.0</td>\n <td>3.8</td>\n <td>27</td>\n <td>5.8</td>\n <td>211</td>\n <td>Lab</td>\n <td>7</td>\n <td>87.7</td>\n <td>5.3</td>\n <td>39.4</td>\n </tr>\n <tr>\n <th>12</th>\n <td>E09000013</td>\n <td>Hammersmith and Fulham</td>\n <td>Inner London</td>\n <td>185300</td>\n <td>83552</td>\n <td>1,640</td>\n <td>113</td>\n <td>35.7</td>\n <td>17.4</td>\n 
<td>72.3</td>\n <td>...</td>\n <td>7.2</td>\n <td>3.1</td>\n <td>21.3</td>\n <td>4.4</td>\n <td>187</td>\n <td>Lab</td>\n <td>43.5</td>\n <td>56.5</td>\n <td>0</td>\n <td>37.6</td>\n </tr>\n <tr>\n <th>13</th>\n <td>E09000014</td>\n <td>Haringey</td>\n <td>Inner London</td>\n <td>278000</td>\n <td>115608</td>\n <td>2,960</td>\n <td>93.9</td>\n <td>35.1</td>\n <td>20.0</td>\n <td>70.7</td>\n <td>...</td>\n <td>7.2</td>\n <td>3.2</td>\n <td>23.8</td>\n <td>5.9</td>\n <td>183</td>\n <td>Lab</td>\n <td>0</td>\n <td>84.2</td>\n <td>15.8</td>\n <td>38.1</td>\n </tr>\n <tr>\n <th>14</th>\n <td>E09000015</td>\n <td>Harrow</td>\n <td>Outer London</td>\n <td>252300</td>\n <td>92557</td>\n <td>5,046</td>\n <td>50</td>\n <td>38.3</td>\n <td>20.5</td>\n <td>64.5</td>\n <td>...</td>\n <td>7.3</td>\n <td>2.7</td>\n <td>20.2</td>\n <td>8.5</td>\n <td>134</td>\n <td>Lab</td>\n <td>41.3</td>\n <td>54</td>\n <td>1.6</td>\n <td>40.7</td>\n </tr>\n <tr>\n <th>15</th>\n <td>E09000016</td>\n <td>Havering</td>\n <td>Outer London</td>\n <td>254300</td>\n <td>104098</td>\n <td>11,235</td>\n <td>22.6</td>\n <td>40.3</td>\n <td>19.3</td>\n <td>62.3</td>\n <td>...</td>\n <td>7.2</td>\n <td>3.3</td>\n <td>21.8</td>\n <td>5.9</td>\n <td>159</td>\n <td>No Overall Control</td>\n <td>40.7</td>\n <td>1.9</td>\n <td>0</td>\n <td>43.1</td>\n </tr>\n <tr>\n <th>16</th>\n <td>E09000017</td>\n <td>Hillingdon</td>\n <td>Outer London</td>\n <td>301000</td>\n <td>110827</td>\n <td>11,570</td>\n <td>26</td>\n <td>36.4</td>\n <td>21.3</td>\n <td>65.6</td>\n <td>...</td>\n <td>7.3</td>\n <td>3.5</td>\n <td>21.1</td>\n <td>6.4</td>\n <td>170</td>\n <td>Cons</td>\n <td>64.6</td>\n <td>35.4</td>\n <td>0</td>\n <td>36.1</td>\n </tr>\n <tr>\n <th>17</th>\n <td>E09000018</td>\n <td>Hounslow</td>\n <td>Outer London</td>\n <td>274200</td>\n <td>105887</td>\n <td>5,598</td>\n <td>49</td>\n <td>35.8</td>\n <td>21.1</td>\n <td>67.6</td>\n <td>...</td>\n <td>7.4</td>\n <td>3.4</td>\n <td>24.1</td>\n <td>6.5</td>\n 
<td>166</td>\n <td>Lab</td>\n <td>18.3</td>\n <td>81.7</td>\n <td>0</td>\n <td>36.8</td>\n </tr>\n <tr>\n <th>18</th>\n <td>E09000019</td>\n <td>Islington</td>\n <td>Inner London</td>\n <td>231200</td>\n <td>105038</td>\n <td>1,486</td>\n <td>155.6</td>\n <td>34.8</td>\n <td>15.9</td>\n <td>75.3</td>\n <td>...</td>\n <td>7.1</td>\n <td>3.7</td>\n <td>22.8</td>\n <td>5.0</td>\n <td>203</td>\n <td>Lab</td>\n <td>0</td>\n <td>97.9</td>\n <td>0</td>\n <td>38.4</td>\n </tr>\n <tr>\n <th>19</th>\n <td>E09000020</td>\n <td>Kensington and Chelsea</td>\n <td>Inner London</td>\n <td>159000</td>\n <td>80200</td>\n <td>1,212</td>\n <td>131.1</td>\n <td>39.3</td>\n <td>16.4</td>\n <td>69.3</td>\n <td>...</td>\n <td>7.6</td>\n <td>3.1</td>\n <td>18.6</td>\n <td>4.2</td>\n <td>136</td>\n <td>Cons</td>\n <td>74</td>\n <td>24</td>\n <td>2</td>\n <td>29.8</td>\n </tr>\n <tr>\n <th>20</th>\n <td>E09000021</td>\n <td>Kingston upon Thames</td>\n <td>Outer London</td>\n <td>175400</td>\n <td>69849</td>\n <td>3,726</td>\n <td>47.1</td>\n <td>37.1</td>\n <td>19.6</td>\n <td>67.2</td>\n <td>...</td>\n <td>7.4</td>\n <td>3.3</td>\n <td>16.9</td>\n <td>4.9</td>\n <td>141</td>\n <td>Cons</td>\n <td>58.3</td>\n <td>4.2</td>\n <td>37.5</td>\n <td>43.1</td>\n </tr>\n <tr>\n <th>21</th>\n <td>E09000022</td>\n <td>Lambeth</td>\n <td>Inner London</td>\n <td>328900</td>\n <td>144400</td>\n <td>2,681</td>\n <td>122.7</td>\n <td>34.5</td>\n <td>17.6</td>\n <td>74.6</td>\n <td>...</td>\n <td>7.2</td>\n <td>3.5</td>\n <td>23</td>\n <td>5.0</td>\n <td>205</td>\n <td>Lab</td>\n <td>4.8</td>\n <td>93.7</td>\n <td>0</td>\n <td>34.5</td>\n </tr>\n <tr>\n <th>22</th>\n <td>E09000023</td>\n <td>Lewisham</td>\n <td>Inner London</td>\n <td>303400</td>\n <td>131076</td>\n <td>3,515</td>\n <td>86.3</td>\n <td>35.0</td>\n <td>20.6</td>\n <td>70.1</td>\n <td>...</td>\n <td>7.3</td>\n <td>3.4</td>\n <td>23.6</td>\n <td>6.1</td>\n <td>191</td>\n <td>Lab</td>\n <td>0</td>\n <td>98.1</td>\n <td>0</td>\n <td>37.2</td>\n 
</tr>\n <tr>\n <th>23</th>\n <td>E09000024</td>\n <td>Merton</td>\n <td>Outer London</td>\n <td>208100</td>\n <td>84201</td>\n <td>3,762</td>\n <td>55.3</td>\n <td>36.7</td>\n <td>20.6</td>\n <td>67.2</td>\n <td>...</td>\n <td>7.1</td>\n <td>3.6</td>\n <td>19.2</td>\n <td>5.6</td>\n <td>162</td>\n <td>Lab</td>\n <td>33.3</td>\n <td>60</td>\n <td>1.7</td>\n <td>41.3</td>\n </tr>\n <tr>\n <th>24</th>\n <td>E09000025</td>\n <td>Newham</td>\n <td>Inner London</td>\n <td>342900</td>\n <td>119172</td>\n <td>3,620</td>\n <td>94.7</td>\n <td>32.1</td>\n <td>22.7</td>\n <td>70.2</td>\n <td>...</td>\n <td>7.2</td>\n <td>3.4</td>\n <td>27.6</td>\n <td>7.6</td>\n <td>193</td>\n <td>Lab</td>\n <td>0</td>\n <td>100</td>\n <td>0</td>\n <td>40.5</td>\n </tr>\n <tr>\n <th>25</th>\n <td>E09000026</td>\n <td>Redbridge</td>\n <td>Outer London</td>\n <td>304200</td>\n <td>110708</td>\n <td>5,642</td>\n <td>53.9</td>\n <td>35.8</td>\n <td>22.8</td>\n <td>65.0</td>\n <td>...</td>\n <td>7.3</td>\n <td>3.2</td>\n <td>23.3</td>\n <td>7.9</td>\n <td>142</td>\n <td>Lab</td>\n <td>39.7</td>\n <td>55.6</td>\n <td>4.8</td>\n <td>39.7</td>\n </tr>\n <tr>\n <th>26</th>\n <td>E09000027</td>\n <td>Richmond upon Thames</td>\n <td>Outer London</td>\n <td>197300</td>\n <td>85108</td>\n <td>5,741</td>\n <td>34.4</td>\n <td>38.8</td>\n <td>20.7</td>\n <td>64.5</td>\n <td>...</td>\n <td>7.3</td>\n <td>3.2</td>\n <td>12.6</td>\n <td>3.7</td>\n <td>137</td>\n <td>Cons</td>\n <td>72.2</td>\n <td>0</td>\n <td>27.8</td>\n <td>46.1</td>\n </tr>\n <tr>\n <th>27</th>\n <td>E09000028</td>\n <td>Southwark</td>\n <td>Inner London</td>\n <td>314300</td>\n <td>134254</td>\n <td>2,886</td>\n <td>108.9</td>\n <td>34.4</td>\n <td>18.6</td>\n <td>73.5</td>\n <td>...</td>\n <td>7.3</td>\n <td>3.4</td>\n <td>27.6</td>\n <td>5.5</td>\n <td>207</td>\n <td>Lab</td>\n <td>3.2</td>\n <td>76.2</td>\n <td>20.6</td>\n <td>36.2</td>\n </tr>\n <tr>\n <th>28</th>\n <td>E09000029</td>\n <td>Sutton</td>\n <td>Outer London</td>\n 
<td>202600</td>\n <td>85243</td>\n <td>4,385</td>\n <td>46.2</td>\n <td>38.9</td>\n <td>20.7</td>\n <td>64.3</td>\n <td>...</td>\n <td>7.3</td>\n <td>3.2</td>\n <td>18.4</td>\n <td>5.9</td>\n <td>163</td>\n <td>Lib Dem</td>\n <td>16.7</td>\n <td>0</td>\n <td>83.3</td>\n <td>42.6</td>\n </tr>\n <tr>\n <th>29</th>\n <td>E09000030</td>\n <td>Tower Hamlets</td>\n <td>Inner London</td>\n <td>304000</td>\n <td>123720</td>\n <td>1,978</td>\n <td>153.7</td>\n <td>31.4</td>\n <td>20.1</td>\n <td>73.9</td>\n <td>...</td>\n <td>7.2</td>\n <td>3.3</td>\n <td>27.1</td>\n <td>6.6</td>\n <td>239</td>\n <td>Tower Hamlets First</td>\n <td>11.1</td>\n <td>48.9</td>\n <td>0</td>\n <td>47.2</td>\n </tr>\n <tr>\n <th>30</th>\n <td>E09000031</td>\n <td>Waltham Forest</td>\n <td>Outer London</td>\n <td>276200</td>\n <td>105981</td>\n <td>3,881</td>\n <td>71.2</td>\n <td>35.1</td>\n <td>21.8</td>\n <td>67.9</td>\n <td>...</td>\n <td>7.1</td>\n <td>3.1</td>\n <td>26.3</td>\n <td>6.4</td>\n <td>185</td>\n <td>Lab</td>\n <td>26.7</td>\n <td>73.3</td>\n <td>0</td>\n <td>37.6</td>\n </tr>\n <tr>\n <th>31</th>\n <td>E09000032</td>\n <td>Wandsworth</td>\n <td>Inner London</td>\n <td>321000</td>\n <td>138149</td>\n <td>3,426</td>\n <td>93.7</td>\n <td>35.0</td>\n <td>17.8</td>\n <td>72.8</td>\n <td>...</td>\n <td>7.4</td>\n <td>3.6</td>\n <td>19.3</td>\n <td>4.2</td>\n <td>177</td>\n <td>Cons</td>\n <td>68.3</td>\n <td>31.7</td>\n <td>0</td>\n <td>36.9</td>\n </tr>\n <tr>\n <th>32</th>\n <td>E09000033</td>\n <td>Westminster</td>\n <td>Inner London</td>\n <td>242100</td>\n <td>118975</td>\n <td>2,149</td>\n <td>112.7</td>\n <td>37.7</td>\n <td>15.9</td>\n <td>72.3</td>\n <td>...</td>\n <td>7.1</td>\n <td>3.4</td>\n <td>24.9</td>\n <td>4.4</td>\n <td>162</td>\n <td>Cons</td>\n <td>73.3</td>\n <td>26.7</td>\n <td>0</td>\n <td>32.3</td>\n </tr>\n </tbody>\n</table>\n<p>33 rows \u00d7 84 columns</p>\n</div>"
},
"metadata": {}
}
]
},
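{
"metadata": {},
"cell_type": "markdown",
"source": "`drop` removes rows by index label, so the call above relies on the last five rows being labelled 33 to 37. As a sketch of an alternative that does not hard-code the labels, position-based slicing with `iloc` can be compared with `drop` on a small hypothetical DataFrame (the column name and values are made up):"
},
{
"metadata": {},
"cell_type": "code",
"source": "import pandas as pd\n\n# Hypothetical frame with 8 rows and the default integer index 0..7\ntoy = pd.DataFrame({'value': range(8)})\n\nby_label = toy.drop([3, 4, 5, 6, 7])  # drop rows by index label\nby_position = toy.iloc[:-5]           # keep everything except the last 5 rows\n\n# Both approaches should leave the same three rows\nprint(by_label.equals(by_position))",
"execution_count": null,
"outputs": []
},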
{
"metadata": {},
"cell_type": "markdown",
"source": "**Let's take a first look at the data loaded into the notebook**\n\n* With `df.head()` or `df.tail()` you can view the first five or last five lines from the data \n* Add a number between the brackets `()` to specify the number of lines you want to display., e.g. `df.head(2)`\n* Use `df.dtypes` to check the different variables and their datatype\n* `df.columns` gives a list of all column names\n* `len(df)` gives the number of rows\n* `df.shape` gives the number of rows and columns\n\n> **Tip**: to add more cells to run additional commands, activate a cell by clicking on it and then click on the '+' button at the top of the notebook. This will add a new cell. Click on the buttons with the upwards and downwards arrows to move the cells up and down to change their order\n\n<div class=\"alert alert-success\">\n <b>EXERCISE</b> <br/> \n Now let's have a look at the data that was loaded into the notebook. What are we actually looking at? \n \n Explore some of the following commands:\n <ul>\n <li><font face=\"Courier\">df.head()</font></li>\n <li><font face=\"Courier\">df.tail()</font></li>\n <li><font face=\"Courier\">df.columns</font></li>\n <li><font face=\"Courier\">df.values</font></li>\n <li><font face=\"Courier\">len(df)</font></li>\n <li><font face=\"Courier\">list(df)</font></li>\n </ul>\n</div> \n"
},
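{
"metadata": {},
"cell_type": "markdown",
"source": "To get a feel for what these commands return before trying them on the borough data, here is a sketch on a small made-up DataFrame. The column names and values are hypothetical:"
},
{
"metadata": {},
"cell_type": "code",
"source": "import pandas as pd\n\n# Hypothetical toy data, only for illustration\ntoy = pd.DataFrame({\n    'name': ['A', 'B', 'C'],\n    'population': [100, 200, 300],\n    'density': [1.5, 2.5, 3.5],\n})\n\nprint(toy.head(2))        # first 2 rows\nprint(toy.dtypes)         # data type of each column\nprint(list(toy.columns))  # column names as a plain list\nprint(len(toy))           # number of rows\nprint(toy.shape)          # (number of rows, number of columns)",
"execution_count": null,
"outputs": []
},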
{
"metadata": {},
"cell_type": "code",
"source": "# try the commands here (add as many cells as you need):\n\ndf.head()",
"execution_count": 4,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 4,
"data": {
"text/plain": " Code Area_name Inner/_Outer_London \\\n0 E09000001 City of London Inner London \n1 E09000002 Barking and Dagenham Outer London \n2 E09000003 Barnet Outer London \n3 E09000004 Bexley Outer London \n4 E09000005 Brent Outer London \n\n GLA_Population_Estimate_2017 GLA_Household_Estimate_2017 \\\n0 8800 5326 \n1 209000 78188 \n2 389600 151423 \n3 244300 97736 \n4 332100 121048 \n\n Inland_Area_(Hectares) Population_density_(per_hectare)_2017 \\\n0 290 30.3 \n1 3,611 57.9 \n2 8,675 44.9 \n3 6,058 40.3 \n4 4,323 76.8 \n\n Average_Age,_2017 Proportion_of_population_aged_0-15,_2015 \\\n0 43.2 11.4 \n1 32.9 27.2 \n2 37.3 21.1 \n3 39.0 20.6 \n4 35.6 20.9 \n\n Proportion_of_population_of_working-age,_2015 ... \\\n0 73.1 ... \n1 63.1 ... \n2 64.9 ... \n3 62.9 ... \n4 67.8 ... \n\n Happiness_score_2011-14_(out_of_10) Anxiety_score_2011-14_(out_of_10) \\\n0 6.0 5.6 \n1 7.1 3.1 \n2 7.4 2.8 \n3 7.2 3.3 \n4 7.2 2.9 \n\n Childhood_Obesity_Prevalance_(%)_2015/16 People_aged_17+_with_diabetes_(%) \\\n0 NaN 2.6 \n1 28.5 7.3 \n2 20.7 6.0 \n3 22.7 6.9 \n4 24.3 7.9 \n\n Mortality_rate_from_causes_considered_preventable_2012/14 \\\n0 129 \n1 228 \n2 134 \n3 164 \n4 169 \n\n Political_control_in_council \\\n0 . \n1 Lab \n2 Cons \n3 Cons \n4 Lab \n\n Proportion_of_seats_won_by_Conservatives_in_2014_election \\\n0 . \n1 0 \n2 50.8 \n3 71.4 \n4 9.5 \n\n Proportion_of_seats_won_by_Labour_in_2014_election \\\n0 . \n1 100 \n2 . \n3 23.8 \n4 88.9 \n\n Proportion_of_seats_won_by_Lib_Dems_in_2014_election \\\n0 . \n1 0 \n2 1.6 \n3 0 \n4 1.6 \n\n Turnout_at_2014_local_elections \n0 . \n1 36.5 \n2 40.5 \n3 39.6 \n4 36.3 \n\n[5 rows x 84 columns]",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Code</th>\n <th>Area_name</th>\n <th>Inner/_Outer_London</th>\n <th>GLA_Population_Estimate_2017</th>\n <th>GLA_Household_Estimate_2017</th>\n <th>Inland_Area_(Hectares)</th>\n <th>Population_density_(per_hectare)_2017</th>\n <th>Average_Age,_2017</th>\n <th>Proportion_of_population_aged_0-15,_2015</th>\n <th>Proportion_of_population_of_working-age,_2015</th>\n <th>...</th>\n <th>Happiness_score_2011-14_(out_of_10)</th>\n <th>Anxiety_score_2011-14_(out_of_10)</th>\n <th>Childhood_Obesity_Prevalance_(%)_2015/16</th>\n <th>People_aged_17+_with_diabetes_(%)</th>\n <th>Mortality_rate_from_causes_considered_preventable_2012/14</th>\n <th>Political_control_in_council</th>\n <th>Proportion_of_seats_won_by_Conservatives_in_2014_election</th>\n <th>Proportion_of_seats_won_by_Labour_in_2014_election</th>\n <th>Proportion_of_seats_won_by_Lib_Dems_in_2014_election</th>\n <th>Turnout_at_2014_local_elections</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>E09000001</td>\n <td>City of London</td>\n <td>Inner London</td>\n <td>8800</td>\n <td>5326</td>\n <td>290</td>\n <td>30.3</td>\n <td>43.2</td>\n <td>11.4</td>\n <td>73.1</td>\n <td>...</td>\n <td>6.0</td>\n <td>5.6</td>\n <td>NaN</td>\n <td>2.6</td>\n <td>129</td>\n <td>.</td>\n <td>.</td>\n <td>.</td>\n <td>.</td>\n <td>.</td>\n </tr>\n <tr>\n <th>1</th>\n <td>E09000002</td>\n <td>Barking and Dagenham</td>\n <td>Outer London</td>\n <td>209000</td>\n <td>78188</td>\n <td>3,611</td>\n <td>57.9</td>\n <td>32.9</td>\n <td>27.2</td>\n <td>63.1</td>\n <td>...</td>\n <td>7.1</td>\n <td>3.1</td>\n <td>28.5</td>\n <td>7.3</td>\n <td>228</td>\n <td>Lab</td>\n <td>0</td>\n <td>100</td>\n <td>0</td>\n 
<td>36.5</td>\n </tr>\n <tr>\n <th>2</th>\n <td>E09000003</td>\n <td>Barnet</td>\n <td>Outer London</td>\n <td>389600</td>\n <td>151423</td>\n <td>8,675</td>\n <td>44.9</td>\n <td>37.3</td>\n <td>21.1</td>\n <td>64.9</td>\n <td>...</td>\n <td>7.4</td>\n <td>2.8</td>\n <td>20.7</td>\n <td>6.0</td>\n <td>134</td>\n <td>Cons</td>\n <td>50.8</td>\n <td>.</td>\n <td>1.6</td>\n <td>40.5</td>\n </tr>\n <tr>\n <th>3</th>\n <td>E09000004</td>\n <td>Bexley</td>\n <td>Outer London</td>\n <td>244300</td>\n <td>97736</td>\n <td>6,058</td>\n <td>40.3</td>\n <td>39.0</td>\n <td>20.6</td>\n <td>62.9</td>\n <td>...</td>\n <td>7.2</td>\n <td>3.3</td>\n <td>22.7</td>\n <td>6.9</td>\n <td>164</td>\n <td>Cons</td>\n <td>71.4</td>\n <td>23.8</td>\n <td>0</td>\n <td>39.6</td>\n </tr>\n <tr>\n <th>4</th>\n <td>E09000005</td>\n <td>Brent</td>\n <td>Outer London</td>\n <td>332100</td>\n <td>121048</td>\n <td>4,323</td>\n <td>76.8</td>\n <td>35.6</td>\n <td>20.9</td>\n <td>67.8</td>\n <td>...</td>\n <td>7.2</td>\n <td>2.9</td>\n <td>24.3</td>\n <td>7.9</td>\n <td>169</td>\n <td>Lab</td>\n <td>9.5</td>\n <td>88.9</td>\n <td>1.6</td>\n <td>36.3</td>\n </tr>\n </tbody>\n</table>\n<p>5 rows \u00d7 84 columns</p>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "df.tail()",
"execution_count": 5,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 5,
"data": {
"text/plain": " Code Area_name Inner/_Outer_London \\\n28 E09000029 Sutton Outer London \n29 E09000030 Tower Hamlets Inner London \n30 E09000031 Waltham Forest Outer London \n31 E09000032 Wandsworth Inner London \n32 E09000033 Westminster Inner London \n\n GLA_Population_Estimate_2017 GLA_Household_Estimate_2017 \\\n28 202600 85243 \n29 304000 123720 \n30 276200 105981 \n31 321000 138149 \n32 242100 118975 \n\n Inland_Area_(Hectares) Population_density_(per_hectare)_2017 \\\n28 4,385 46.2 \n29 1,978 153.7 \n30 3,881 71.2 \n31 3,426 93.7 \n32 2,149 112.7 \n\n Average_Age,_2017 Proportion_of_population_aged_0-15,_2015 \\\n28 38.9 20.7 \n29 31.4 20.1 \n30 35.1 21.8 \n31 35.0 17.8 \n32 37.7 15.9 \n\n Proportion_of_population_of_working-age,_2015 ... \\\n28 64.3 ... \n29 73.9 ... \n30 67.9 ... \n31 72.8 ... \n32 72.3 ... \n\n Happiness_score_2011-14_(out_of_10) Anxiety_score_2011-14_(out_of_10) \\\n28 7.3 3.2 \n29 7.2 3.3 \n30 7.1 3.1 \n31 7.4 3.6 \n32 7.1 3.4 \n\n Childhood_Obesity_Prevalance_(%)_2015/16 People_aged_17+_with_diabetes_(%) \\\n28 18.4 5.9 \n29 27.1 6.6 \n30 26.3 6.4 \n31 19.3 4.2 \n32 24.9 4.4 \n\n Mortality_rate_from_causes_considered_preventable_2012/14 \\\n28 163 \n29 239 \n30 185 \n31 177 \n32 162 \n\n Political_control_in_council \\\n28 Lib Dem \n29 Tower Hamlets First \n30 Lab \n31 Cons \n32 Cons \n\n Proportion_of_seats_won_by_Conservatives_in_2014_election \\\n28 16.7 \n29 11.1 \n30 26.7 \n31 68.3 \n32 73.3 \n\n Proportion_of_seats_won_by_Labour_in_2014_election \\\n28 0 \n29 48.9 \n30 73.3 \n31 31.7 \n32 26.7 \n\n Proportion_of_seats_won_by_Lib_Dems_in_2014_election \\\n28 83.3 \n29 0 \n30 0 \n31 0 \n32 0 \n\n Turnout_at_2014_local_elections \n28 42.6 \n29 47.2 \n30 37.6 \n31 36.9 \n32 32.3 \n\n[5 rows x 84 columns]",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Code</th>\n <th>Area_name</th>\n <th>Inner/_Outer_London</th>\n <th>GLA_Population_Estimate_2017</th>\n <th>GLA_Household_Estimate_2017</th>\n <th>Inland_Area_(Hectares)</th>\n <th>Population_density_(per_hectare)_2017</th>\n <th>Average_Age,_2017</th>\n <th>Proportion_of_population_aged_0-15,_2015</th>\n <th>Proportion_of_population_of_working-age,_2015</th>\n <th>...</th>\n <th>Happiness_score_2011-14_(out_of_10)</th>\n <th>Anxiety_score_2011-14_(out_of_10)</th>\n <th>Childhood_Obesity_Prevalance_(%)_2015/16</th>\n <th>People_aged_17+_with_diabetes_(%)</th>\n <th>Mortality_rate_from_causes_considered_preventable_2012/14</th>\n <th>Political_control_in_council</th>\n <th>Proportion_of_seats_won_by_Conservatives_in_2014_election</th>\n <th>Proportion_of_seats_won_by_Labour_in_2014_election</th>\n <th>Proportion_of_seats_won_by_Lib_Dems_in_2014_election</th>\n <th>Turnout_at_2014_local_elections</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>28</th>\n <td>E09000029</td>\n <td>Sutton</td>\n <td>Outer London</td>\n <td>202600</td>\n <td>85243</td>\n <td>4,385</td>\n <td>46.2</td>\n <td>38.9</td>\n <td>20.7</td>\n <td>64.3</td>\n <td>...</td>\n <td>7.3</td>\n <td>3.2</td>\n <td>18.4</td>\n <td>5.9</td>\n <td>163</td>\n <td>Lib Dem</td>\n <td>16.7</td>\n <td>0</td>\n <td>83.3</td>\n <td>42.6</td>\n </tr>\n <tr>\n <th>29</th>\n <td>E09000030</td>\n <td>Tower Hamlets</td>\n <td>Inner London</td>\n <td>304000</td>\n <td>123720</td>\n <td>1,978</td>\n <td>153.7</td>\n <td>31.4</td>\n <td>20.1</td>\n <td>73.9</td>\n <td>...</td>\n <td>7.2</td>\n <td>3.3</td>\n <td>27.1</td>\n <td>6.6</td>\n <td>239</td>\n <td>Tower Hamlets First</td>\n <td>11.1</td>\n 
<td>48.9</td>\n <td>0</td>\n <td>47.2</td>\n </tr>\n <tr>\n <th>30</th>\n <td>E09000031</td>\n <td>Waltham Forest</td>\n <td>Outer London</td>\n <td>276200</td>\n <td>105981</td>\n <td>3,881</td>\n <td>71.2</td>\n <td>35.1</td>\n <td>21.8</td>\n <td>67.9</td>\n <td>...</td>\n <td>7.1</td>\n <td>3.1</td>\n <td>26.3</td>\n <td>6.4</td>\n <td>185</td>\n <td>Lab</td>\n <td>26.7</td>\n <td>73.3</td>\n <td>0</td>\n <td>37.6</td>\n </tr>\n <tr>\n <th>31</th>\n <td>E09000032</td>\n <td>Wandsworth</td>\n <td>Inner London</td>\n <td>321000</td>\n <td>138149</td>\n <td>3,426</td>\n <td>93.7</td>\n <td>35.0</td>\n <td>17.8</td>\n <td>72.8</td>\n <td>...</td>\n <td>7.4</td>\n <td>3.6</td>\n <td>19.3</td>\n <td>4.2</td>\n <td>177</td>\n <td>Cons</td>\n <td>68.3</td>\n <td>31.7</td>\n <td>0</td>\n <td>36.9</td>\n </tr>\n <tr>\n <th>32</th>\n <td>E09000033</td>\n <td>Westminster</td>\n <td>Inner London</td>\n <td>242100</td>\n <td>118975</td>\n <td>2,149</td>\n <td>112.7</td>\n <td>37.7</td>\n <td>15.9</td>\n <td>72.3</td>\n <td>...</td>\n <td>7.1</td>\n <td>3.4</td>\n <td>24.9</td>\n <td>4.4</td>\n <td>162</td>\n <td>Cons</td>\n <td>73.3</td>\n <td>26.7</td>\n <td>0</td>\n <td>32.3</td>\n </tr>\n </tbody>\n</table>\n<p>5 rows \u00d7 84 columns</p>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "df.dtypes",
"execution_count": 6,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 6,
"data": {
"text/plain": "Code object\nArea_name object\nInner/_Outer_London object\nGLA_Population_Estimate_2017 int64\nGLA_Household_Estimate_2017 object\n ... \nPolitical_control_in_council object\nProportion_of_seats_won_by_Conservatives_in_2014_election object\nProportion_of_seats_won_by_Labour_in_2014_election object\nProportion_of_seats_won_by_Lib_Dems_in_2014_election object\nTurnout_at_2014_local_elections object\nLength: 84, dtype: object"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "df.columns",
"execution_count": 7,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 7,
"data": {
"text/plain": "Index(['Code', 'Area_name', 'Inner/_Outer_London',\n 'GLA_Population_Estimate_2017', 'GLA_Household_Estimate_2017',\n 'Inland_Area_(Hectares)', 'Population_density_(per_hectare)_2017',\n 'Average_Age,_2017', 'Proportion_of_population_aged_0-15,_2015',\n 'Proportion_of_population_of_working-age,_2015',\n 'Proportion_of_population_aged_65_and_over,_2015',\n 'Net_internal_migration_(2015)', 'Net_international_migration_(2015)',\n 'Net_natural_change_(2015)',\n '%_of_resident_population_born_abroad_(2015)',\n 'Largest_migrant_population_by_country_of_birth_(2011)',\n '%_of_largest_migrant_population_(2011)',\n 'Second_largest_migrant_population_by_country_of_birth_(2011)',\n '%_of_second_largest_migrant_population_(2011)',\n 'Third_largest_migrant_population_by_country_of_birth_(2011)',\n '%_of_third_largest_migrant_population_(2011)',\n '%_of_population_from_BAME_groups_(2016)',\n '%_people_aged_3+_whose_main_language_is_not_English_(2011_Census)',\n 'Overseas_nationals_entering_the_UK_(NINo),_(2015/16)',\n 'New_migrant_(NINo)_rates,_(2015/16)',\n 'Largest_migrant_population_arrived_during_2015/16',\n 'Second_largest_migrant_population_arrived_during_2015/16',\n 'Third_largest_migrant_population_arrived_during_2015/16',\n 'Employment_rate_(%)_(2015)', 'Male_employment_rate_(2015)',\n 'Female_employment_rate_(2015)', 'Unemployment_rate_(2015)',\n 'Youth_Unemployment_(claimant)_rate_18-24_(Dec-15)',\n 'Proportion_of_16-18_year_olds_who_are_NEET_(%)_(2014)',\n 'Proportion_of_the_working-age_population_who_claim_out-of-work_benefits_(%)_(May-2016)',\n '%_working-age_with_a_disability_(2015)',\n 'Proportion_of_working_age_people_with_no_qualifications_(%)_2015',\n 'Proportion_of_working_age_with_degree_or_equivalent_and_above_(%)_2015',\n 'Gross_Annual_Pay,_(2016)', 'Gross_Annual_Pay_-_Male_(2016)',\n 'Gross_Annual_Pay_-_Female_(2016)',\n 'Modelled_Household_median_income_estimates_2012/13',\n 
'%_adults_that_volunteered_in_past_12_months_(2010/11_to_2012/13)',\n 'Number_of_jobs_by_workplace_(2014)',\n '%_of_employment_that_is_in_public_sector_(2014)', 'Jobs_Density,_2015',\n 'Number_of_active_businesses,_2015',\n 'Two-year_business_survival_rates_(started_in_2013)',\n 'Crime_rates_per_thousand_population_2014/15',\n 'Fires_per_thousand_population_(2014)',\n 'Ambulance_incidents_per_hundred_population_(2014)',\n 'Median_House_Price,_2015',\n 'Average_Band_D_Council_Tax_charge_(\u00a3),_2015/16',\n 'New_Homes_(net)_2015/16_(provisional)',\n 'Homes_Owned_outright,_(2014)_%',\n 'Being_bought_with_mortgage_or_loan,_(2014)_%',\n 'Rented_from_Local_Authority_or_Housing_Association,_(2014)_%',\n 'Rented_from_Private_landlord,_(2014)_%',\n '%_of_area_that_is_Greenspace,_2005', 'Total_carbon_emissions_(2014)',\n 'Household_Waste_Recycling_Rate,_2014/15',\n 'Number_of_cars,_(2011_Census)',\n 'Number_of_cars_per_household,_(2011_Census)',\n '%_of_adults_who_cycle_at_least_once_per_month,_2014/15',\n 'Average_Public_Transport_Accessibility_score,_2014',\n 'Achievement_of_5_or_more_A*-_C_grades_at_GCSE_or_equivalent_including_English_and_Maths,_2013/14',\n 'Rates_of_Children_Looked_After_(2016)',\n '%_of_pupils_whose_first_language_is_not_English_(2015)',\n '%_children_living_in_out-of-work_households_(2015)',\n 'Male_life_expectancy,_(2012-14)', 'Female_life_expectancy,_(2012-14)',\n 'Teenage_conception_rate_(2014)',\n 'Life_satisfaction_score_2011-14_(out_of_10)',\n 'Worthwhileness_score_2011-14_(out_of_10)',\n 'Happiness_score_2011-14_(out_of_10)',\n 'Anxiety_score_2011-14_(out_of_10)',\n 'Childhood_Obesity_Prevalance_(%)_2015/16',\n 'People_aged_17+_with_diabetes_(%)',\n 'Mortality_rate_from_causes_considered_preventable_2012/14',\n 'Political_control_in_council',\n 'Proportion_of_seats_won_by_Conservatives_in_2014_election',\n 'Proportion_of_seats_won_by_Labour_in_2014_election',\n 'Proportion_of_seats_won_by_Lib_Dems_in_2014_election',\n 
'Turnout_at_2014_local_elections'],\n dtype='object')"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "df.values",
"execution_count": 8,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 8,
"data": {
"text/plain": "array([['E09000001', 'City of London', 'Inner London', ..., '.', '.',\n '.'],\n ['E09000002', 'Barking and Dagenham', 'Outer London', ..., '100',\n '0', '36.5'],\n ['E09000003', 'Barnet', 'Outer London', ..., '.', '1.6', '40.5'],\n ...,\n ['E09000031', 'Waltham Forest', 'Outer London', ..., '73.3', '0',\n '37.6'],\n ['E09000032', 'Wandsworth', 'Inner London', ..., '31.7', '0',\n '36.9'],\n ['E09000033', 'Westminster', 'Inner London', ..., '26.7', '0',\n '32.3']], dtype=object)"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "len(df)",
"execution_count": 9,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 9,
"data": {
"text/plain": "33"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "df.shape",
"execution_count": 10,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 10,
"data": {
"text/plain": "(33, 84)"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "<a id=\"series\"></a>\n## 2. Series and DataFrames \n\nA `Series` is a one-dimensional labelled array that can contain of any type (integer, string, float, python objects, etc.)."
},
{
"metadata": {},
"cell_type": "code",
"source": "s = pd.Series([1, 3, 5, np.nan, 6, 8])\ns",
"execution_count": 11,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 11,
"data": {
"text/plain": "0 1.0\n1 3.0\n2 5.0\n3 NaN\n4 6.0\n5 8.0\ndtype: float64"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "A `DataFrame` is a two-dimensional data structure, the data consists of rows and columns that you can create a in many ways, by loading a file or using a NumPy array and a date for the index.\n\n<div class=\"alert alert-info\" style=\"font-size:100%\">\n<a href=\"https://numpy.org\"> NumPy</a> is a Python library for working with multi-dimensional arrays and matrices with a large collection of mathematical functions to operate on these arrays.\nHave a look at this <a href=\"https://docs.scipy.org/doc/numpy-1.15.0/user/quickstart.html\"> NumPy tutorial</a> for an overview.\n</div>\n\n"
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Create DataFrame `df1` with `dates` as the index, a 6 by 4 array of random `numbers` as values, and column names A, B, C and D (the index will be explained in the next section): "
},
{
"metadata": {},
"cell_type": "code",
"source": "dates = pd.date_range('20200101', periods=6)\ndates",
"execution_count": 12,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 12,
"data": {
"text/plain": "DatetimeIndex(['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04',\n '2020-01-05', '2020-01-06'],\n dtype='datetime64[ns]', freq='D')"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "numbers = np.random.randn(6, 4)\nnumbers",
"execution_count": 13,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 13,
"data": {
"text/plain": "array([[ 1.09054916, -0.51653351, -0.00731379, -0.43697309],\n [-0.10125424, 0.59432908, -0.30622382, 0.89295525],\n [-1.00937237, 0.12463626, -1.16027016, -0.17297152],\n [-1.49150918, -0.70101389, -1.63453832, 1.10616148],\n [-0.57305205, 0.92916559, 0.03763414, 1.35654442],\n [-0.25329179, 2.92123624, -1.42516862, 0.40216703]])"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "df1 = pd.DataFrame(numbers, index=dates, columns=['A', 'B', 'C', 'D'])\ndf1",
"execution_count": 14,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 14,
"data": {
"text/plain": " A B C D\n2020-01-01 1.090549 -0.516534 -0.007314 -0.436973\n2020-01-02 -0.101254 0.594329 -0.306224 0.892955\n2020-01-03 -1.009372 0.124636 -1.160270 -0.172972\n2020-01-04 -1.491509 -0.701014 -1.634538 1.106161\n2020-01-05 -0.573052 0.929166 0.037634 1.356544\n2020-01-06 -0.253292 2.921236 -1.425169 0.402167",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>A</th>\n <th>B</th>\n <th>C</th>\n <th>D</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>2020-01-01</th>\n <td>1.090549</td>\n <td>-0.516534</td>\n <td>-0.007314</td>\n <td>-0.436973</td>\n </tr>\n <tr>\n <th>2020-01-02</th>\n <td>-0.101254</td>\n <td>0.594329</td>\n <td>-0.306224</td>\n <td>0.892955</td>\n </tr>\n <tr>\n <th>2020-01-03</th>\n <td>-1.009372</td>\n <td>0.124636</td>\n <td>-1.160270</td>\n <td>-0.172972</td>\n </tr>\n <tr>\n <th>2020-01-04</th>\n <td>-1.491509</td>\n <td>-0.701014</td>\n <td>-1.634538</td>\n <td>1.106161</td>\n </tr>\n <tr>\n <th>2020-01-05</th>\n <td>-0.573052</td>\n <td>0.929166</td>\n <td>0.037634</td>\n <td>1.356544</td>\n </tr>\n <tr>\n <th>2020-01-06</th>\n <td>-0.253292</td>\n <td>2.921236</td>\n <td>-1.425169</td>\n <td>0.402167</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Or create a DataFrame by combining the above in one command:"
},
{
"metadata": {},
"cell_type": "code",
"source": "df2 = pd.DataFrame({'A': 1.,\n 'B': pd.Timestamp('20130102'),\n 'C': pd.Series(1, index=list(range(4)), dtype='float32'),\n 'D': np.array([3] * 4, dtype='int32'),\n 'E': pd.Categorical([\"test\", \"train\", \"test\", \"train\"]),\n 'F': 'foo'})",
"execution_count": 15,
"outputs": []
},
{
"metadata": {},
"cell_type": "code",
"source": "df2.head()",
"execution_count": 16,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 16,
"data": {
"text/plain": " A B C D E F\n0 1.0 2013-01-02 1.0 3 test foo\n1 1.0 2013-01-02 1.0 3 train foo\n2 1.0 2013-01-02 1.0 3 test foo\n3 1.0 2013-01-02 1.0 3 train foo",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>A</th>\n <th>B</th>\n <th>C</th>\n <th>D</th>\n <th>E</th>\n <th>F</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>1.0</td>\n <td>2013-01-02</td>\n <td>1.0</td>\n <td>3</td>\n <td>test</td>\n <td>foo</td>\n </tr>\n <tr>\n <th>1</th>\n <td>1.0</td>\n <td>2013-01-02</td>\n <td>1.0</td>\n <td>3</td>\n <td>train</td>\n <td>foo</td>\n </tr>\n <tr>\n <th>2</th>\n <td>1.0</td>\n <td>2013-01-02</td>\n <td>1.0</td>\n <td>3</td>\n <td>test</td>\n <td>foo</td>\n </tr>\n <tr>\n <th>3</th>\n <td>1.0</td>\n <td>2013-01-02</td>\n <td>1.0</td>\n <td>3</td>\n <td>train</td>\n <td>foo</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Use `type()` to check the data type of each variable. Below `print` is used to display the data type of all of them used so far:"
},
{
"metadata": {},
"cell_type": "code",
"source": "print('Data type of s is '+str(type(s)))\nprint('Data type of s is '+str(type(dates)))\nprint('Data type of s is '+str(type(numbers)))\nprint('Data type of df is '+str(type(df1)))",
"execution_count": 17,
"outputs": [
{
"output_type": "stream",
"text": "Data type of s is <class 'pandas.core.series.Series'>\nData type of s is <class 'pandas.core.indexes.datetimes.DatetimeIndex'>\nData type of s is <class 'numpy.ndarray'>\nData type of df is <class 'pandas.core.frame.DataFrame'>\n",
"name": "stdout"
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "type(df)",
"execution_count": 18,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 18,
"data": {
"text/plain": "pandas.core.frame.DataFrame"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "<a id=\"cleaning\"></a>\n## 3. Cleaning Data\n\nWhen exploring data there are always transformations needed to get it in the format you need for your analysis, visualisations or models. Below are only a few examples of the endless possibilities. The best way to learn is to find a dataset and try to answer questions with the data."
},
{
"metadata": {},
"cell_type": "markdown",
"source": "First, let's make a copy of the Dataframe loaded from the URL:"
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs = df.copy()",
"execution_count": 19,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "### Adding an index\n\nIndexing and selecting data is key to data analysis and creating visualizations. For more information on indexing have a look at the [documentation](https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html).\n\nSet the area code (`Code`) as the index, which will change the table slightly:"
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs = boroughs.set_index(['Code'])\nboroughs.head()",
"execution_count": 20,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 20,
"data": {
"text/plain": " Area_name Inner/_Outer_London \\\nCode \nE09000001 City of London Inner London \nE09000002 Barking and Dagenham Outer London \nE09000003 Barnet Outer London \nE09000004 Bexley Outer London \nE09000005 Brent Outer London \n\n GLA_Population_Estimate_2017 GLA_Household_Estimate_2017 \\\nCode \nE09000001 8800 5326 \nE09000002 209000 78188 \nE09000003 389600 151423 \nE09000004 244300 97736 \nE09000005 332100 121048 \n\n Inland_Area_(Hectares) Population_density_(per_hectare)_2017 \\\nCode \nE09000001 290 30.3 \nE09000002 3,611 57.9 \nE09000003 8,675 44.9 \nE09000004 6,058 40.3 \nE09000005 4,323 76.8 \n\n Average_Age,_2017 Proportion_of_population_aged_0-15,_2015 \\\nCode \nE09000001 43.2 11.4 \nE09000002 32.9 27.2 \nE09000003 37.3 21.1 \nE09000004 39.0 20.6 \nE09000005 35.6 20.9 \n\n Proportion_of_population_of_working-age,_2015 \\\nCode \nE09000001 73.1 \nE09000002 63.1 \nE09000003 64.9 \nE09000004 62.9 \nE09000005 67.8 \n\n Proportion_of_population_aged_65_and_over,_2015 ... \\\nCode ... \nE09000001 15.5 ... \nE09000002 9.7 ... \nE09000003 14.0 ... \nE09000004 16.6 ... \nE09000005 11.3 ... \n\n Happiness_score_2011-14_(out_of_10) \\\nCode \nE09000001 6.0 \nE09000002 7.1 \nE09000003 7.4 \nE09000004 7.2 \nE09000005 7.2 \n\n Anxiety_score_2011-14_(out_of_10) \\\nCode \nE09000001 5.6 \nE09000002 3.1 \nE09000003 2.8 \nE09000004 3.3 \nE09000005 2.9 \n\n Childhood_Obesity_Prevalance_(%)_2015/16 \\\nCode \nE09000001 NaN \nE09000002 28.5 \nE09000003 20.7 \nE09000004 22.7 \nE09000005 24.3 \n\n People_aged_17+_with_diabetes_(%) \\\nCode \nE09000001 2.6 \nE09000002 7.3 \nE09000003 6.0 \nE09000004 6.9 \nE09000005 7.9 \n\n Mortality_rate_from_causes_considered_preventable_2012/14 \\\nCode \nE09000001 129 \nE09000002 228 \nE09000003 134 \nE09000004 164 \nE09000005 169 \n\n Political_control_in_council \\\nCode \nE09000001 . 
\nE09000002 Lab \nE09000003 Cons \nE09000004 Cons \nE09000005 Lab \n\n Proportion_of_seats_won_by_Conservatives_in_2014_election \\\nCode \nE09000001 . \nE09000002 0 \nE09000003 50.8 \nE09000004 71.4 \nE09000005 9.5 \n\n Proportion_of_seats_won_by_Labour_in_2014_election \\\nCode \nE09000001 . \nE09000002 100 \nE09000003 . \nE09000004 23.8 \nE09000005 88.9 \n\n Proportion_of_seats_won_by_Lib_Dems_in_2014_election \\\nCode \nE09000001 . \nE09000002 0 \nE09000003 1.6 \nE09000004 0 \nE09000005 1.6 \n\n Turnout_at_2014_local_elections \nCode \nE09000001 . \nE09000002 36.5 \nE09000003 40.5 \nE09000004 39.6 \nE09000005 36.3 \n\n[5 rows x 83 columns]",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Area_name</th>\n <th>Inner/_Outer_London</th>\n <th>GLA_Population_Estimate_2017</th>\n <th>GLA_Household_Estimate_2017</th>\n <th>Inland_Area_(Hectares)</th>\n <th>Population_density_(per_hectare)_2017</th>\n <th>Average_Age,_2017</th>\n <th>Proportion_of_population_aged_0-15,_2015</th>\n <th>Proportion_of_population_of_working-age,_2015</th>\n <th>Proportion_of_population_aged_65_and_over,_2015</th>\n <th>...</th>\n <th>Happiness_score_2011-14_(out_of_10)</th>\n <th>Anxiety_score_2011-14_(out_of_10)</th>\n <th>Childhood_Obesity_Prevalance_(%)_2015/16</th>\n <th>People_aged_17+_with_diabetes_(%)</th>\n <th>Mortality_rate_from_causes_considered_preventable_2012/14</th>\n <th>Political_control_in_council</th>\n <th>Proportion_of_seats_won_by_Conservatives_in_2014_election</th>\n <th>Proportion_of_seats_won_by_Labour_in_2014_election</th>\n <th>Proportion_of_seats_won_by_Lib_Dems_in_2014_election</th>\n <th>Turnout_at_2014_local_elections</th>\n </tr>\n <tr>\n <th>Code</th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>E09000001</th>\n <td>City of London</td>\n <td>Inner London</td>\n <td>8800</td>\n <td>5326</td>\n <td>290</td>\n <td>30.3</td>\n <td>43.2</td>\n <td>11.4</td>\n <td>73.1</td>\n <td>15.5</td>\n <td>...</td>\n <td>6.0</td>\n <td>5.6</td>\n <td>NaN</td>\n <td>2.6</td>\n <td>129</td>\n <td>.</td>\n <td>.</td>\n <td>.</td>\n <td>.</td>\n <td>.</td>\n </tr>\n <tr>\n <th>E09000002</th>\n 
<td>Barking and Dagenham</td>\n <td>Outer London</td>\n <td>209000</td>\n <td>78188</td>\n <td>3,611</td>\n <td>57.9</td>\n <td>32.9</td>\n <td>27.2</td>\n <td>63.1</td>\n <td>9.7</td>\n <td>...</td>\n <td>7.1</td>\n <td>3.1</td>\n <td>28.5</td>\n <td>7.3</td>\n <td>228</td>\n <td>Lab</td>\n <td>0</td>\n <td>100</td>\n <td>0</td>\n <td>36.5</td>\n </tr>\n <tr>\n <th>E09000003</th>\n <td>Barnet</td>\n <td>Outer London</td>\n <td>389600</td>\n <td>151423</td>\n <td>8,675</td>\n <td>44.9</td>\n <td>37.3</td>\n <td>21.1</td>\n <td>64.9</td>\n <td>14.0</td>\n <td>...</td>\n <td>7.4</td>\n <td>2.8</td>\n <td>20.7</td>\n <td>6.0</td>\n <td>134</td>\n <td>Cons</td>\n <td>50.8</td>\n <td>.</td>\n <td>1.6</td>\n <td>40.5</td>\n </tr>\n <tr>\n <th>E09000004</th>\n <td>Bexley</td>\n <td>Outer London</td>\n <td>244300</td>\n <td>97736</td>\n <td>6,058</td>\n <td>40.3</td>\n <td>39.0</td>\n <td>20.6</td>\n <td>62.9</td>\n <td>16.6</td>\n <td>...</td>\n <td>7.2</td>\n <td>3.3</td>\n <td>22.7</td>\n <td>6.9</td>\n <td>164</td>\n <td>Cons</td>\n <td>71.4</td>\n <td>23.8</td>\n <td>0</td>\n <td>39.6</td>\n </tr>\n <tr>\n <th>E09000005</th>\n <td>Brent</td>\n <td>Outer London</td>\n <td>332100</td>\n <td>121048</td>\n <td>4,323</td>\n <td>76.8</td>\n <td>35.6</td>\n <td>20.9</td>\n <td>67.8</td>\n <td>11.3</td>\n <td>...</td>\n <td>7.2</td>\n <td>2.9</td>\n <td>24.3</td>\n <td>7.9</td>\n <td>169</td>\n <td>Lab</td>\n <td>9.5</td>\n <td>88.9</td>\n <td>1.6</td>\n <td>36.3</td>\n </tr>\n </tbody>\n</table>\n<p>5 rows \u00d7 83 columns</p>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "### Adding and deleting columns\n\nAdding a column can be done by creating a new column `new`, which can be dropped using the `drop` function."
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs['new'] = 1\nboroughs.head()",
"execution_count": 21,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 21,
"data": {
"text/plain": " Area_name Inner/_Outer_London \\\nCode \nE09000001 City of London Inner London \nE09000002 Barking and Dagenham Outer London \nE09000003 Barnet Outer London \nE09000004 Bexley Outer London \nE09000005 Brent Outer London \n\n GLA_Population_Estimate_2017 GLA_Household_Estimate_2017 \\\nCode \nE09000001 8800 5326 \nE09000002 209000 78188 \nE09000003 389600 151423 \nE09000004 244300 97736 \nE09000005 332100 121048 \n\n Inland_Area_(Hectares) Population_density_(per_hectare)_2017 \\\nCode \nE09000001 290 30.3 \nE09000002 3,611 57.9 \nE09000003 8,675 44.9 \nE09000004 6,058 40.3 \nE09000005 4,323 76.8 \n\n Average_Age,_2017 Proportion_of_population_aged_0-15,_2015 \\\nCode \nE09000001 43.2 11.4 \nE09000002 32.9 27.2 \nE09000003 37.3 21.1 \nE09000004 39.0 20.6 \nE09000005 35.6 20.9 \n\n Proportion_of_population_of_working-age,_2015 \\\nCode \nE09000001 73.1 \nE09000002 63.1 \nE09000003 64.9 \nE09000004 62.9 \nE09000005 67.8 \n\n Proportion_of_population_aged_65_and_over,_2015 ... \\\nCode ... \nE09000001 15.5 ... \nE09000002 9.7 ... \nE09000003 14.0 ... \nE09000004 16.6 ... \nE09000005 11.3 ... \n\n Anxiety_score_2011-14_(out_of_10) \\\nCode \nE09000001 5.6 \nE09000002 3.1 \nE09000003 2.8 \nE09000004 3.3 \nE09000005 2.9 \n\n Childhood_Obesity_Prevalance_(%)_2015/16 \\\nCode \nE09000001 NaN \nE09000002 28.5 \nE09000003 20.7 \nE09000004 22.7 \nE09000005 24.3 \n\n People_aged_17+_with_diabetes_(%) \\\nCode \nE09000001 2.6 \nE09000002 7.3 \nE09000003 6.0 \nE09000004 6.9 \nE09000005 7.9 \n\n Mortality_rate_from_causes_considered_preventable_2012/14 \\\nCode \nE09000001 129 \nE09000002 228 \nE09000003 134 \nE09000004 164 \nE09000005 169 \n\n Political_control_in_council \\\nCode \nE09000001 . \nE09000002 Lab \nE09000003 Cons \nE09000004 Cons \nE09000005 Lab \n\n Proportion_of_seats_won_by_Conservatives_in_2014_election \\\nCode \nE09000001 . 
\nE09000002 0 \nE09000003 50.8 \nE09000004 71.4 \nE09000005 9.5 \n\n Proportion_of_seats_won_by_Labour_in_2014_election \\\nCode \nE09000001 . \nE09000002 100 \nE09000003 . \nE09000004 23.8 \nE09000005 88.9 \n\n Proportion_of_seats_won_by_Lib_Dems_in_2014_election \\\nCode \nE09000001 . \nE09000002 0 \nE09000003 1.6 \nE09000004 0 \nE09000005 1.6 \n\n Turnout_at_2014_local_elections new \nCode \nE09000001 . 1 \nE09000002 36.5 1 \nE09000003 40.5 1 \nE09000004 39.6 1 \nE09000005 36.3 1 \n\n[5 rows x 84 columns]",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Area_name</th>\n <th>Inner/_Outer_London</th>\n <th>GLA_Population_Estimate_2017</th>\n <th>GLA_Household_Estimate_2017</th>\n <th>Inland_Area_(Hectares)</th>\n <th>Population_density_(per_hectare)_2017</th>\n <th>Average_Age,_2017</th>\n <th>Proportion_of_population_aged_0-15,_2015</th>\n <th>Proportion_of_population_of_working-age,_2015</th>\n <th>Proportion_of_population_aged_65_and_over,_2015</th>\n <th>...</th>\n <th>Anxiety_score_2011-14_(out_of_10)</th>\n <th>Childhood_Obesity_Prevalance_(%)_2015/16</th>\n <th>People_aged_17+_with_diabetes_(%)</th>\n <th>Mortality_rate_from_causes_considered_preventable_2012/14</th>\n <th>Political_control_in_council</th>\n <th>Proportion_of_seats_won_by_Conservatives_in_2014_election</th>\n <th>Proportion_of_seats_won_by_Labour_in_2014_election</th>\n <th>Proportion_of_seats_won_by_Lib_Dems_in_2014_election</th>\n <th>Turnout_at_2014_local_elections</th>\n <th>new</th>\n </tr>\n <tr>\n <th>Code</th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>E09000001</th>\n <td>City of London</td>\n <td>Inner London</td>\n <td>8800</td>\n <td>5326</td>\n <td>290</td>\n <td>30.3</td>\n <td>43.2</td>\n <td>11.4</td>\n <td>73.1</td>\n <td>15.5</td>\n <td>...</td>\n <td>5.6</td>\n <td>NaN</td>\n <td>2.6</td>\n <td>129</td>\n <td>.</td>\n <td>.</td>\n <td>.</td>\n <td>.</td>\n <td>.</td>\n <td>1</td>\n </tr>\n <tr>\n <th>E09000002</th>\n <td>Barking and Dagenham</td>\n 
<td>Outer London</td>\n <td>209000</td>\n <td>78188</td>\n <td>3,611</td>\n <td>57.9</td>\n <td>32.9</td>\n <td>27.2</td>\n <td>63.1</td>\n <td>9.7</td>\n <td>...</td>\n <td>3.1</td>\n <td>28.5</td>\n <td>7.3</td>\n <td>228</td>\n <td>Lab</td>\n <td>0</td>\n <td>100</td>\n <td>0</td>\n <td>36.5</td>\n <td>1</td>\n </tr>\n <tr>\n <th>E09000003</th>\n <td>Barnet</td>\n <td>Outer London</td>\n <td>389600</td>\n <td>151423</td>\n <td>8,675</td>\n <td>44.9</td>\n <td>37.3</td>\n <td>21.1</td>\n <td>64.9</td>\n <td>14.0</td>\n <td>...</td>\n <td>2.8</td>\n <td>20.7</td>\n <td>6.0</td>\n <td>134</td>\n <td>Cons</td>\n <td>50.8</td>\n <td>.</td>\n <td>1.6</td>\n <td>40.5</td>\n <td>1</td>\n </tr>\n <tr>\n <th>E09000004</th>\n <td>Bexley</td>\n <td>Outer London</td>\n <td>244300</td>\n <td>97736</td>\n <td>6,058</td>\n <td>40.3</td>\n <td>39.0</td>\n <td>20.6</td>\n <td>62.9</td>\n <td>16.6</td>\n <td>...</td>\n <td>3.3</td>\n <td>22.7</td>\n <td>6.9</td>\n <td>164</td>\n <td>Cons</td>\n <td>71.4</td>\n <td>23.8</td>\n <td>0</td>\n <td>39.6</td>\n <td>1</td>\n </tr>\n <tr>\n <th>E09000005</th>\n <td>Brent</td>\n <td>Outer London</td>\n <td>332100</td>\n <td>121048</td>\n <td>4,323</td>\n <td>76.8</td>\n <td>35.6</td>\n <td>20.9</td>\n <td>67.8</td>\n <td>11.3</td>\n <td>...</td>\n <td>2.9</td>\n <td>24.3</td>\n <td>7.9</td>\n <td>169</td>\n <td>Lab</td>\n <td>9.5</td>\n <td>88.9</td>\n <td>1.6</td>\n <td>36.3</td>\n <td>1</td>\n </tr>\n </tbody>\n</table>\n<p>5 rows \u00d7 84 columns</p>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs = boroughs.drop(columns='new')\nboroughs.head()",
"execution_count": 22,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 22,
"data": {
"text/plain": " Area_name Inner/_Outer_London \\\nCode \nE09000001 City of London Inner London \nE09000002 Barking and Dagenham Outer London \nE09000003 Barnet Outer London \nE09000004 Bexley Outer London \nE09000005 Brent Outer London \n\n GLA_Population_Estimate_2017 GLA_Household_Estimate_2017 \\\nCode \nE09000001 8800 5326 \nE09000002 209000 78188 \nE09000003 389600 151423 \nE09000004 244300 97736 \nE09000005 332100 121048 \n\n Inland_Area_(Hectares) Population_density_(per_hectare)_2017 \\\nCode \nE09000001 290 30.3 \nE09000002 3,611 57.9 \nE09000003 8,675 44.9 \nE09000004 6,058 40.3 \nE09000005 4,323 76.8 \n\n Average_Age,_2017 Proportion_of_population_aged_0-15,_2015 \\\nCode \nE09000001 43.2 11.4 \nE09000002 32.9 27.2 \nE09000003 37.3 21.1 \nE09000004 39.0 20.6 \nE09000005 35.6 20.9 \n\n Proportion_of_population_of_working-age,_2015 \\\nCode \nE09000001 73.1 \nE09000002 63.1 \nE09000003 64.9 \nE09000004 62.9 \nE09000005 67.8 \n\n Proportion_of_population_aged_65_and_over,_2015 ... \\\nCode ... \nE09000001 15.5 ... \nE09000002 9.7 ... \nE09000003 14.0 ... \nE09000004 16.6 ... \nE09000005 11.3 ... \n\n Happiness_score_2011-14_(out_of_10) \\\nCode \nE09000001 6.0 \nE09000002 7.1 \nE09000003 7.4 \nE09000004 7.2 \nE09000005 7.2 \n\n Anxiety_score_2011-14_(out_of_10) \\\nCode \nE09000001 5.6 \nE09000002 3.1 \nE09000003 2.8 \nE09000004 3.3 \nE09000005 2.9 \n\n Childhood_Obesity_Prevalance_(%)_2015/16 \\\nCode \nE09000001 NaN \nE09000002 28.5 \nE09000003 20.7 \nE09000004 22.7 \nE09000005 24.3 \n\n People_aged_17+_with_diabetes_(%) \\\nCode \nE09000001 2.6 \nE09000002 7.3 \nE09000003 6.0 \nE09000004 6.9 \nE09000005 7.9 \n\n Mortality_rate_from_causes_considered_preventable_2012/14 \\\nCode \nE09000001 129 \nE09000002 228 \nE09000003 134 \nE09000004 164 \nE09000005 169 \n\n Political_control_in_council \\\nCode \nE09000001 . 
\nE09000002 Lab \nE09000003 Cons \nE09000004 Cons \nE09000005 Lab \n\n Proportion_of_seats_won_by_Conservatives_in_2014_election \\\nCode \nE09000001 . \nE09000002 0 \nE09000003 50.8 \nE09000004 71.4 \nE09000005 9.5 \n\n Proportion_of_seats_won_by_Labour_in_2014_election \\\nCode \nE09000001 . \nE09000002 100 \nE09000003 . \nE09000004 23.8 \nE09000005 88.9 \n\n Proportion_of_seats_won_by_Lib_Dems_in_2014_election \\\nCode \nE09000001 . \nE09000002 0 \nE09000003 1.6 \nE09000004 0 \nE09000005 1.6 \n\n Turnout_at_2014_local_elections \nCode \nE09000001 . \nE09000002 36.5 \nE09000003 40.5 \nE09000004 39.6 \nE09000005 36.3 \n\n[5 rows x 83 columns]",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Area_name</th>\n <th>Inner/_Outer_London</th>\n <th>GLA_Population_Estimate_2017</th>\n <th>GLA_Household_Estimate_2017</th>\n <th>Inland_Area_(Hectares)</th>\n <th>Population_density_(per_hectare)_2017</th>\n <th>Average_Age,_2017</th>\n <th>Proportion_of_population_aged_0-15,_2015</th>\n <th>Proportion_of_population_of_working-age,_2015</th>\n <th>Proportion_of_population_aged_65_and_over,_2015</th>\n <th>...</th>\n <th>Happiness_score_2011-14_(out_of_10)</th>\n <th>Anxiety_score_2011-14_(out_of_10)</th>\n <th>Childhood_Obesity_Prevalance_(%)_2015/16</th>\n <th>People_aged_17+_with_diabetes_(%)</th>\n <th>Mortality_rate_from_causes_considered_preventable_2012/14</th>\n <th>Political_control_in_council</th>\n <th>Proportion_of_seats_won_by_Conservatives_in_2014_election</th>\n <th>Proportion_of_seats_won_by_Labour_in_2014_election</th>\n <th>Proportion_of_seats_won_by_Lib_Dems_in_2014_election</th>\n <th>Turnout_at_2014_local_elections</th>\n </tr>\n <tr>\n <th>Code</th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>E09000001</th>\n <td>City of London</td>\n <td>Inner London</td>\n <td>8800</td>\n <td>5326</td>\n <td>290</td>\n <td>30.3</td>\n <td>43.2</td>\n <td>11.4</td>\n <td>73.1</td>\n <td>15.5</td>\n <td>...</td>\n <td>6.0</td>\n <td>5.6</td>\n <td>NaN</td>\n <td>2.6</td>\n <td>129</td>\n <td>.</td>\n <td>.</td>\n <td>.</td>\n <td>.</td>\n <td>.</td>\n </tr>\n <tr>\n <th>E09000002</th>\n 
<td>Barking and Dagenham</td>\n <td>Outer London</td>\n <td>209000</td>\n <td>78188</td>\n <td>3,611</td>\n <td>57.9</td>\n <td>32.9</td>\n <td>27.2</td>\n <td>63.1</td>\n <td>9.7</td>\n <td>...</td>\n <td>7.1</td>\n <td>3.1</td>\n <td>28.5</td>\n <td>7.3</td>\n <td>228</td>\n <td>Lab</td>\n <td>0</td>\n <td>100</td>\n <td>0</td>\n <td>36.5</td>\n </tr>\n <tr>\n <th>E09000003</th>\n <td>Barnet</td>\n <td>Outer London</td>\n <td>389600</td>\n <td>151423</td>\n <td>8,675</td>\n <td>44.9</td>\n <td>37.3</td>\n <td>21.1</td>\n <td>64.9</td>\n <td>14.0</td>\n <td>...</td>\n <td>7.4</td>\n <td>2.8</td>\n <td>20.7</td>\n <td>6.0</td>\n <td>134</td>\n <td>Cons</td>\n <td>50.8</td>\n <td>.</td>\n <td>1.6</td>\n <td>40.5</td>\n </tr>\n <tr>\n <th>E09000004</th>\n <td>Bexley</td>\n <td>Outer London</td>\n <td>244300</td>\n <td>97736</td>\n <td>6,058</td>\n <td>40.3</td>\n <td>39.0</td>\n <td>20.6</td>\n <td>62.9</td>\n <td>16.6</td>\n <td>...</td>\n <td>7.2</td>\n <td>3.3</td>\n <td>22.7</td>\n <td>6.9</td>\n <td>164</td>\n <td>Cons</td>\n <td>71.4</td>\n <td>23.8</td>\n <td>0</td>\n <td>39.6</td>\n </tr>\n <tr>\n <th>E09000005</th>\n <td>Brent</td>\n <td>Outer London</td>\n <td>332100</td>\n <td>121048</td>\n <td>4,323</td>\n <td>76.8</td>\n <td>35.6</td>\n <td>20.9</td>\n <td>67.8</td>\n <td>11.3</td>\n <td>...</td>\n <td>7.2</td>\n <td>2.9</td>\n <td>24.3</td>\n <td>7.9</td>\n <td>169</td>\n <td>Lab</td>\n <td>9.5</td>\n <td>88.9</td>\n <td>1.6</td>\n <td>36.3</td>\n </tr>\n </tbody>\n</table>\n<p>5 rows \u00d7 83 columns</p>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "As not all columns are needed, let's remove some. If you are interested in any of these, change the code and do not remove the columns."
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs = boroughs.drop(columns=['GLA_Household_Estimate_2017',\n 'Proportion_of_population_aged_0-15,_2015',\n 'Proportion_of_population_of_working-age,_2015',\n 'Proportion_of_population_aged_65_and_over,_2015',\n 'Net_internal_migration_(2015)', 'Net_international_migration_(2015)',\n 'Net_natural_change_(2015)',\n '%_of_largest_migrant_population_(2011)',\n 'Second_largest_migrant_population_by_country_of_birth_(2011)',\n '%_of_second_largest_migrant_population_(2011)',\n 'Third_largest_migrant_population_by_country_of_birth_(2011)',\n '%_of_third_largest_migrant_population_(2011)',\n '%_of_population_from_BAME_groups_(2016)',\n '%_people_aged_3+_whose_main_language_is_not_English_(2011_Census)',\n 'Overseas_nationals_entering_the_UK_(NINo),_(2015/16)',\n 'Largest_migrant_population_arrived_during_2015/16',\n 'Second_largest_migrant_population_arrived_during_2015/16',\n 'Third_largest_migrant_population_arrived_during_2015/16',\n 'Male_employment_rate_(2015)',\n 'Female_employment_rate_(2015)', 'Unemployment_rate_(2015)',\n 'Youth_Unemployment_(claimant)_rate_18-24_(Dec-15)',\n 'Proportion_of_16-18_year_olds_who_are_NEET_(%)_(2014)',\n 'Proportion_of_the_working-age_population_who_claim_out-of-work_benefits_(%)_(May-2016)',\n '%_working-age_with_a_disability_(2015)',\n 'Proportion_of_working_age_people_with_no_qualifications_(%)_2015',\n 'Proportion_of_working_age_with_degree_or_equivalent_and_above_(%)_2015',\n 'Gross_Annual_Pay,_(2016)',\n 'Modelled_Household_median_income_estimates_2012/13',\n '%_adults_that_volunteered_in_past_12_months_(2010/11_to_2012/13)',\n 'Number_of_jobs_by_workplace_(2014)',\n '%_of_employment_that_is_in_public_sector_(2014)', 'Jobs_Density,_2015',\n 'Number_of_active_businesses,_2015',\n 'Two-year_business_survival_rates_(started_in_2013)',\n 'Crime_rates_per_thousand_population_2014/15',\n 'Fires_per_thousand_population_(2014)',\n 'Ambulance_incidents_per_hundred_population_(2014)',\n 
'Average_Band_D_Council_Tax_charge_(\u00a3),_2015/16',\n 'New_Homes_(net)_2015/16_(provisional)',\n 'Homes_Owned_outright,_(2014)_%',\n 'Being_bought_with_mortgage_or_loan,_(2014)_%',\n 'Rented_from_Local_Authority_or_Housing_Association,_(2014)_%',\n 'Rented_from_Private_landlord,_(2014)_%',\n 'Total_carbon_emissions_(2014)',\n 'Household_Waste_Recycling_Rate,_2014/15',\n 'Number_of_cars,_(2011_Census)',\n 'Number_of_cars_per_household,_(2011_Census)',\n '%_of_adults_who_cycle_at_least_once_per_month,_2014/15',\n 'Average_Public_Transport_Accessibility_score,_2014',\n 'Achievement_of_5_or_more_A*-_C_grades_at_GCSE_or_equivalent_including_English_and_Maths,_2013/14',\n 'Rates_of_Children_Looked_After_(2016)',\n '%_of_pupils_whose_first_language_is_not_English_(2015)',\n '%_children_living_in_out-of-work_households_(2015)',\n 'Male_life_expectancy,_(2012-14)', 'Female_life_expectancy,_(2012-14)',\n 'Teenage_conception_rate_(2014)',\n 'Life_satisfaction_score_2011-14_(out_of_10)',\n 'Worthwhileness_score_2011-14_(out_of_10)',\n 'Anxiety_score_2011-14_(out_of_10)',\n 'Childhood_Obesity_Prevalance_(%)_2015/16',\n 'People_aged_17+_with_diabetes_(%)',\n 'Mortality_rate_from_causes_considered_preventable_2012/14',\n 'Proportion_of_seats_won_by_Conservatives_in_2014_election',\n 'Proportion_of_seats_won_by_Labour_in_2014_election',\n 'Proportion_of_seats_won_by_Lib_Dems_in_2014_election'])",
"execution_count": 23,
"outputs": []
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs.columns",
"execution_count": 24,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 24,
"data": {
"text/plain": "Index(['Area_name', 'Inner/_Outer_London', 'GLA_Population_Estimate_2017',\n 'Inland_Area_(Hectares)', 'Population_density_(per_hectare)_2017',\n 'Average_Age,_2017', '%_of_resident_population_born_abroad_(2015)',\n 'Largest_migrant_population_by_country_of_birth_(2011)',\n 'New_migrant_(NINo)_rates,_(2015/16)', 'Employment_rate_(%)_(2015)',\n 'Gross_Annual_Pay_-_Male_(2016)', 'Gross_Annual_Pay_-_Female_(2016)',\n 'Median_House_Price,_2015', '%_of_area_that_is_Greenspace,_2005',\n 'Happiness_score_2011-14_(out_of_10)', 'Political_control_in_council',\n 'Turnout_at_2014_local_elections'],\n dtype='object')"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "<a id=\"Renaming\"></a>\n\nYou can change names of columns using `rename`:"
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs.rename(columns={'Area_name':'Name',\n 'Inner/_Outer_London':'Inner/Outer',\n 'GLA_Population_Estimate_2017':'Population',\n 'Inland_Area_(Hectares)':'Area (ha)',\n 'Average_Age,_2017':'Average Age',\n 'Political_control_in_council':'Political control',\n 'Population_density_(per_hectare)_2017':'Population density (/ha)',\n 'New_migrant_(NINo)_rates,_(2015/16)':'New migrant rates',\n 'Happiness_score_2011-14_(out_of_10)':'Happiness score',\n '%_of_resident_population_born_abroad_(2015)':'Population born abroad (%)',\n 'Employment_rate_(%)_(2015)':'Employment rate (%)',\n 'Turnout_at_2014_local_elections':'Turnout at local elections',\n 'Median_House_Price,_2015':'Median House Price',\n \"Largest_migrant_population_by_country_of_birth_(2011)\":'Largest migrant population',\n 'Gross_Annual_Pay_-_Female_(2016)':'Gross Pay (Female)',\n 'Gross_Annual_Pay_-_Male_(2016)':'Gross Pay (Male)',\n '%_of_area_that_is_Greenspace,_2005':'Greenspace (%)'},\n inplace=True)",
"execution_count": 25,
"outputs": []
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs.columns",
"execution_count": 26,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 26,
"data": {
"text/plain": "Index(['Name', 'Inner/Outer', 'Population', 'Area (ha)',\n 'Population density (/ha)', 'Average Age', 'Population born abroad (%)',\n 'Largest migrant population', 'New migrant rates',\n 'Employment rate (%)', 'Gross Pay (Male)', 'Gross Pay (Female)',\n 'Median House Price', 'Greenspace (%)', 'Happiness score',\n 'Political control', 'Turnout at local elections'],\n dtype='object')"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs.head()",
"execution_count": 27,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 27,
"data": {
"text/plain": " Name Inner/Outer Population Area (ha) \\\nCode \nE09000001 City of London Inner London 8800 290 \nE09000002 Barking and Dagenham Outer London 209000 3,611 \nE09000003 Barnet Outer London 389600 8,675 \nE09000004 Bexley Outer London 244300 6,058 \nE09000005 Brent Outer London 332100 4,323 \n\n Population density (/ha) Average Age Population born abroad (%) \\\nCode \nE09000001 30.3 43.2 . \nE09000002 57.9 32.9 37.8 \nE09000003 44.9 37.3 35.2 \nE09000004 40.3 39.0 16.1 \nE09000005 76.8 35.6 53.9 \n\n Largest migrant population New migrant rates Employment rate (%) \\\nCode \nE09000001 United States 152.2 64.6 \nE09000002 Nigeria 59.1 65.8 \nE09000003 India 53.1 68.5 \nE09000004 Nigeria 14.4 75.1 \nE09000005 India 100.9 69.5 \n\n Gross Pay (Male) Gross Pay (Female) Median House Price \\\nCode \nE09000001 . . 799999 \nE09000002 30104 24602 243500 \nE09000003 36475 31235 445000 \nE09000004 37881 28924 275000 \nE09000005 30129 29600 407250 \n\n Greenspace (%) Happiness score Political control \\\nCode \nE09000001 4.8 6.0 . \nE09000002 33.6 7.1 Lab \nE09000003 41.3 7.4 Cons \nE09000004 31.7 7.2 Cons \nE09000005 21.9 7.2 Lab \n\n Turnout at local elections \nCode \nE09000001 . \nE09000002 36.5 \nE09000003 40.5 \nE09000004 39.6 \nE09000005 36.3 ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Name</th>\n <th>Inner/Outer</th>\n <th>Population</th>\n <th>Area (ha)</th>\n <th>Population density (/ha)</th>\n <th>Average Age</th>\n <th>Population born abroad (%)</th>\n <th>Largest migrant population</th>\n <th>New migrant rates</th>\n <th>Employment rate (%)</th>\n <th>Gross Pay (Male)</th>\n <th>Gross Pay (Female)</th>\n <th>Median House Price</th>\n <th>Greenspace (%)</th>\n <th>Happiness score</th>\n <th>Political control</th>\n <th>Turnout at local elections</th>\n </tr>\n <tr>\n <th>Code</th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>E09000001</th>\n <td>City of London</td>\n <td>Inner London</td>\n <td>8800</td>\n <td>290</td>\n <td>30.3</td>\n <td>43.2</td>\n <td>.</td>\n <td>United States</td>\n <td>152.2</td>\n <td>64.6</td>\n <td>.</td>\n <td>.</td>\n <td>799999</td>\n <td>4.8</td>\n <td>6.0</td>\n <td>.</td>\n <td>.</td>\n </tr>\n <tr>\n <th>E09000002</th>\n <td>Barking and Dagenham</td>\n <td>Outer London</td>\n <td>209000</td>\n <td>3,611</td>\n <td>57.9</td>\n <td>32.9</td>\n <td>37.8</td>\n <td>Nigeria</td>\n <td>59.1</td>\n <td>65.8</td>\n <td>30104</td>\n <td>24602</td>\n <td>243500</td>\n <td>33.6</td>\n <td>7.1</td>\n <td>Lab</td>\n <td>36.5</td>\n </tr>\n <tr>\n <th>E09000003</th>\n <td>Barnet</td>\n <td>Outer London</td>\n <td>389600</td>\n <td>8,675</td>\n <td>44.9</td>\n <td>37.3</td>\n <td>35.2</td>\n <td>India</td>\n <td>53.1</td>\n <td>68.5</td>\n <td>36475</td>\n <td>31235</td>\n <td>445000</td>\n 
<td>41.3</td>\n <td>7.4</td>\n <td>Cons</td>\n <td>40.5</td>\n </tr>\n <tr>\n <th>E09000004</th>\n <td>Bexley</td>\n <td>Outer London</td>\n <td>244300</td>\n <td>6,058</td>\n <td>40.3</td>\n <td>39.0</td>\n <td>16.1</td>\n <td>Nigeria</td>\n <td>14.4</td>\n <td>75.1</td>\n <td>37881</td>\n <td>28924</td>\n <td>275000</td>\n <td>31.7</td>\n <td>7.2</td>\n <td>Cons</td>\n <td>39.6</td>\n </tr>\n <tr>\n <th>E09000005</th>\n <td>Brent</td>\n <td>Outer London</td>\n <td>332100</td>\n <td>4,323</td>\n <td>76.8</td>\n <td>35.6</td>\n <td>53.9</td>\n <td>India</td>\n <td>100.9</td>\n <td>69.5</td>\n <td>30129</td>\n <td>29600</td>\n <td>407250</td>\n <td>21.9</td>\n <td>7.2</td>\n <td>Lab</td>\n <td>36.3</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "### Further Data Cleaning\n\n**Things to check:**\n\n* Is the data tidy: each variable forms a column, each observation forms a row and each type of observational unit forms a table.\n* Are all columns in the right data format?\n* Are there missing values?\n* Are there unrealistic outliers?"
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Get a quick overview of the numeric data using the `.describe()` function. If any of the numeric columns are missing this is a probably because of a wrong data type."
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs.describe()",
"execution_count": 28,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 28,
"data": {
"text/plain": " Population Average Age New migrant rates Employment rate (%) \\\ncount 33.000000 33.000000 33.000000 33.000000 \nmean 267739.393939 36.375758 55.330303 72.715152 \nstd 75383.345058 2.487849 29.414659 4.219384 \nmin 8800.000000 31.400000 14.400000 64.600000 \n25% 231200.000000 35.000000 37.600000 69.200000 \n50% 276200.000000 36.200000 53.500000 73.100000 \n75% 321000.000000 37.700000 66.200000 75.400000 \nmax 389600.000000 43.200000 152.200000 79.600000 \n\n Happiness score \ncount 33.000000 \nmean 7.209091 \nstd 0.249203 \nmin 6.000000 \n25% 7.200000 \n50% 7.200000 \n75% 7.300000 \nmax 7.600000 ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Population</th>\n <th>Average Age</th>\n <th>New migrant rates</th>\n <th>Employment rate (%)</th>\n <th>Happiness score</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>count</th>\n <td>33.000000</td>\n <td>33.000000</td>\n <td>33.000000</td>\n <td>33.000000</td>\n <td>33.000000</td>\n </tr>\n <tr>\n <th>mean</th>\n <td>267739.393939</td>\n <td>36.375758</td>\n <td>55.330303</td>\n <td>72.715152</td>\n <td>7.209091</td>\n </tr>\n <tr>\n <th>std</th>\n <td>75383.345058</td>\n <td>2.487849</td>\n <td>29.414659</td>\n <td>4.219384</td>\n <td>0.249203</td>\n </tr>\n <tr>\n <th>min</th>\n <td>8800.000000</td>\n <td>31.400000</td>\n <td>14.400000</td>\n <td>64.600000</td>\n <td>6.000000</td>\n </tr>\n <tr>\n <th>25%</th>\n <td>231200.000000</td>\n <td>35.000000</td>\n <td>37.600000</td>\n <td>69.200000</td>\n <td>7.200000</td>\n </tr>\n <tr>\n <th>50%</th>\n <td>276200.000000</td>\n <td>36.200000</td>\n <td>53.500000</td>\n <td>73.100000</td>\n <td>7.200000</td>\n </tr>\n <tr>\n <th>75%</th>\n <td>321000.000000</td>\n <td>37.700000</td>\n <td>66.200000</td>\n <td>75.400000</td>\n <td>7.300000</td>\n </tr>\n <tr>\n <th>max</th>\n <td>389600.000000</td>\n <td>43.200000</td>\n <td>152.200000</td>\n <td>79.600000</td>\n <td>7.600000</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "When looking at the `Turnout at local elections` columns you can see a `.`, this needs to be replaced to a missing value. Change them all with `replace`:"
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs = boroughs.replace('.', float('NaN'))",
"execution_count": 29,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Check if all datatypes are as you expect with `dtypes`:"
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs.dtypes",
"execution_count": 30,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 30,
"data": {
"text/plain": "Name object\nInner/Outer object\nPopulation int64\nArea (ha) object\nPopulation density (/ha) object\nAverage Age float64\nPopulation born abroad (%) object\nLargest migrant population object\nNew migrant rates float64\nEmployment rate (%) float64\nGross Pay (Male) object\nGross Pay (Female) object\nMedian House Price object\nGreenspace (%) object\nHappiness score float64\nPolitical control object\nTurnout at local elections object\ndtype: object"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Expect for `Inner/Outer `, `Largest migration population` and `Political control` these all should be numeric (`float64` or `int64`). Change the data type to numeric with `to_numeric`:"
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs['Population density (/ha)'] = pd.to_numeric(boroughs['Population density (/ha)'])\nboroughs['Population born abroad (%)'] = pd.to_numeric(boroughs['Population born abroad (%)'])\nboroughs['Gross Pay (Male)'] = pd.to_numeric(boroughs['Gross Pay (Male)'])\nboroughs['Gross Pay (Female)'] = pd.to_numeric(boroughs['Gross Pay (Female)'])\nboroughs['Median House Price'] = pd.to_numeric(boroughs['Median House Price'])\nboroughs['Greenspace (%)'] = pd.to_numeric(boroughs['Greenspace (%)'])\nboroughs['Turnout at local elections'] = pd.to_numeric(boroughs['Turnout at local elections'])\n\nboroughs['Area (ha)'] = boroughs['Area (ha)'].str.replace(',', '')\nboroughs['Area (ha)'] = pd.to_numeric(boroughs['Area (ha)'])\n\nboroughs.dtypes",
"execution_count": 31,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 31,
"data": {
"text/plain": "Name object\nInner/Outer object\nPopulation int64\nArea (ha) int64\nPopulation density (/ha) float64\nAverage Age float64\nPopulation born abroad (%) float64\nLargest migrant population object\nNew migrant rates float64\nEmployment rate (%) float64\nGross Pay (Male) float64\nGross Pay (Female) float64\nMedian House Price int64\nGreenspace (%) float64\nHappiness score float64\nPolitical control object\nTurnout at local elections float64\ndtype: object"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "<a id=\"selection\"></a>\n## 4. Selecting Data"
},
{
"metadata": {},
"cell_type": "markdown",
"source": "\nAccess single or groups of rows and columns with labels using `.loc[]`. (This only works for the column that was set to the index):"
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs.loc['E09000001', 'Area (ha)']",
"execution_count": 32,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 32,
"data": {
"text/plain": "290"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs.loc['E09000001':'E09000004', ['Area (ha)', 'Average Age']]",
"execution_count": 33,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 33,
"data": {
"text/plain": " Area (ha) Average Age\nCode \nE09000001 290 43.2\nE09000002 3611 32.9\nE09000003 8675 37.3\nE09000004 6058 39.0",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Area (ha)</th>\n <th>Average Age</th>\n </tr>\n <tr>\n <th>Code</th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>E09000001</th>\n <td>290</td>\n <td>43.2</td>\n </tr>\n <tr>\n <th>E09000002</th>\n <td>3611</td>\n <td>32.9</td>\n </tr>\n <tr>\n <th>E09000003</th>\n <td>8675</td>\n <td>37.3</td>\n </tr>\n <tr>\n <th>E09000004</th>\n <td>6058</td>\n <td>39.0</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Or select by position with `.iloc[]`. Select a single row, multiple rows (or columns) at particular positions in the index. This function is integer based (from 0 to length-1 of the axis):"
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs.iloc[0]",
"execution_count": 34,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 34,
"data": {
"text/plain": "Name City of London\nInner/Outer Inner London\nPopulation 8800\nArea (ha) 290\nPopulation density (/ha) 30.3\nAverage Age 43.2\nPopulation born abroad (%) NaN\nLargest migrant population United States\nNew migrant rates 152.2\nEmployment rate (%) 64.6\nGross Pay (Male) NaN\nGross Pay (Female) NaN\nMedian House Price 799999\nGreenspace (%) 4.8\nHappiness score 6\nPolitical control NaN\nTurnout at local elections NaN\nName: E09000001, dtype: object"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs.iloc[:,1]",
"execution_count": 35,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 35,
"data": {
"text/plain": "Code\nE09000001 Inner London\nE09000002 Outer London\nE09000003 Outer London\nE09000004 Outer London\nE09000005 Outer London\nE09000006 Outer London\nE09000007 Inner London\nE09000008 Outer London\nE09000009 Outer London\nE09000010 Outer London\nE09000011 Outer London\nE09000012 Inner London\nE09000013 Inner London\nE09000014 Inner London\nE09000015 Outer London\nE09000016 Outer London\nE09000017 Outer London\nE09000018 Outer London\nE09000019 Inner London\nE09000020 Inner London\nE09000021 Outer London\nE09000022 Inner London\nE09000023 Inner London\nE09000024 Outer London\nE09000025 Inner London\nE09000026 Outer London\nE09000027 Outer London\nE09000028 Inner London\nE09000029 Outer London\nE09000030 Inner London\nE09000031 Outer London\nE09000032 Inner London\nE09000033 Inner London\nName: Inner/Outer, dtype: object"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs.iloc[:,0:2]",
"execution_count": 36,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 36,
"data": {
"text/plain": " Name Inner/Outer\nCode \nE09000001 City of London Inner London\nE09000002 Barking and Dagenham Outer London\nE09000003 Barnet Outer London\nE09000004 Bexley Outer London\nE09000005 Brent Outer London\nE09000006 Bromley Outer London\nE09000007 Camden Inner London\nE09000008 Croydon Outer London\nE09000009 Ealing Outer London\nE09000010 Enfield Outer London\nE09000011 Greenwich Outer London\nE09000012 Hackney Inner London\nE09000013 Hammersmith and Fulham Inner London\nE09000014 Haringey Inner London\nE09000015 Harrow Outer London\nE09000016 Havering Outer London\nE09000017 Hillingdon Outer London\nE09000018 Hounslow Outer London\nE09000019 Islington Inner London\nE09000020 Kensington and Chelsea Inner London\nE09000021 Kingston upon Thames Outer London\nE09000022 Lambeth Inner London\nE09000023 Lewisham Inner London\nE09000024 Merton Outer London\nE09000025 Newham Inner London\nE09000026 Redbridge Outer London\nE09000027 Richmond upon Thames Outer London\nE09000028 Southwark Inner London\nE09000029 Sutton Outer London\nE09000030 Tower Hamlets Inner London\nE09000031 Waltham Forest Outer London\nE09000032 Wandsworth Inner London\nE09000033 Westminster Inner London",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Name</th>\n <th>Inner/Outer</th>\n </tr>\n <tr>\n <th>Code</th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>E09000001</th>\n <td>City of London</td>\n <td>Inner London</td>\n </tr>\n <tr>\n <th>E09000002</th>\n <td>Barking and Dagenham</td>\n <td>Outer London</td>\n </tr>\n <tr>\n <th>E09000003</th>\n <td>Barnet</td>\n <td>Outer London</td>\n </tr>\n <tr>\n <th>E09000004</th>\n <td>Bexley</td>\n <td>Outer London</td>\n </tr>\n <tr>\n <th>E09000005</th>\n <td>Brent</td>\n <td>Outer London</td>\n </tr>\n <tr>\n <th>E09000006</th>\n <td>Bromley</td>\n <td>Outer London</td>\n </tr>\n <tr>\n <th>E09000007</th>\n <td>Camden</td>\n <td>Inner London</td>\n </tr>\n <tr>\n <th>E09000008</th>\n <td>Croydon</td>\n <td>Outer London</td>\n </tr>\n <tr>\n <th>E09000009</th>\n <td>Ealing</td>\n <td>Outer London</td>\n </tr>\n <tr>\n <th>E09000010</th>\n <td>Enfield</td>\n <td>Outer London</td>\n </tr>\n <tr>\n <th>E09000011</th>\n <td>Greenwich</td>\n <td>Outer London</td>\n </tr>\n <tr>\n <th>E09000012</th>\n <td>Hackney</td>\n <td>Inner London</td>\n </tr>\n <tr>\n <th>E09000013</th>\n <td>Hammersmith and Fulham</td>\n <td>Inner London</td>\n </tr>\n <tr>\n <th>E09000014</th>\n <td>Haringey</td>\n <td>Inner London</td>\n </tr>\n <tr>\n <th>E09000015</th>\n <td>Harrow</td>\n <td>Outer London</td>\n </tr>\n <tr>\n <th>E09000016</th>\n <td>Havering</td>\n <td>Outer London</td>\n </tr>\n <tr>\n <th>E09000017</th>\n <td>Hillingdon</td>\n <td>Outer London</td>\n </tr>\n <tr>\n <th>E09000018</th>\n <td>Hounslow</td>\n <td>Outer London</td>\n </tr>\n <tr>\n <th>E09000019</th>\n <td>Islington</td>\n <td>Inner London</td>\n </tr>\n <tr>\n 
<th>E09000020</th>\n <td>Kensington and Chelsea</td>\n <td>Inner London</td>\n </tr>\n <tr>\n <th>E09000021</th>\n <td>Kingston upon Thames</td>\n <td>Outer London</td>\n </tr>\n <tr>\n <th>E09000022</th>\n <td>Lambeth</td>\n <td>Inner London</td>\n </tr>\n <tr>\n <th>E09000023</th>\n <td>Lewisham</td>\n <td>Inner London</td>\n </tr>\n <tr>\n <th>E09000024</th>\n <td>Merton</td>\n <td>Outer London</td>\n </tr>\n <tr>\n <th>E09000025</th>\n <td>Newham</td>\n <td>Inner London</td>\n </tr>\n <tr>\n <th>E09000026</th>\n <td>Redbridge</td>\n <td>Outer London</td>\n </tr>\n <tr>\n <th>E09000027</th>\n <td>Richmond upon Thames</td>\n <td>Outer London</td>\n </tr>\n <tr>\n <th>E09000028</th>\n <td>Southwark</td>\n <td>Inner London</td>\n </tr>\n <tr>\n <th>E09000029</th>\n <td>Sutton</td>\n <td>Outer London</td>\n </tr>\n <tr>\n <th>E09000030</th>\n <td>Tower Hamlets</td>\n <td>Inner London</td>\n </tr>\n <tr>\n <th>E09000031</th>\n <td>Waltham Forest</td>\n <td>Outer London</td>\n </tr>\n <tr>\n <th>E09000032</th>\n <td>Wandsworth</td>\n <td>Inner London</td>\n </tr>\n <tr>\n <th>E09000033</th>\n <td>Westminster</td>\n <td>Inner London</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs.iloc[2:4,0:2]",
"execution_count": 37,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 37,
"data": {
"text/plain": " Name Inner/Outer\nCode \nE09000003 Barnet Outer London\nE09000004 Bexley Outer London",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Name</th>\n <th>Inner/Outer</th>\n </tr>\n <tr>\n <th>Code</th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>E09000003</th>\n <td>Barnet</td>\n <td>Outer London</td>\n </tr>\n <tr>\n <th>E09000004</th>\n <td>Bexley</td>\n <td>Outer London</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "All the above examples can be used to create a new DataFrame. Or create a new DataFrame from 2 columns:"
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs2 = boroughs[['Area (ha)', 'Average Age']]\nboroughs2.head()",
"execution_count": 38,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 38,
"data": {
"text/plain": " Area (ha) Average Age\nCode \nE09000001 290 43.2\nE09000002 3611 32.9\nE09000003 8675 37.3\nE09000004 6058 39.0\nE09000005 4323 35.6",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Area (ha)</th>\n <th>Average Age</th>\n </tr>\n <tr>\n <th>Code</th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>E09000001</th>\n <td>290</td>\n <td>43.2</td>\n </tr>\n <tr>\n <th>E09000002</th>\n <td>3611</td>\n <td>32.9</td>\n </tr>\n <tr>\n <th>E09000003</th>\n <td>8675</td>\n <td>37.3</td>\n </tr>\n <tr>\n <th>E09000004</th>\n <td>6058</td>\n <td>39.0</td>\n </tr>\n <tr>\n <th>E09000005</th>\n <td>4323</td>\n <td>35.6</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "### Filtering\n\nSelecting rows based on a certain condition can be done with Boolean indexing. This uses the actual values of the data in the DataFrame as opposed to the row/column labels or index positions."
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs['Average Age'] > 39",
"execution_count": 39,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 39,
"data": {
"text/plain": "Code\nE09000001 True\nE09000002 False\nE09000003 False\nE09000004 False\nE09000005 False\nE09000006 True\nE09000007 False\nE09000008 False\nE09000009 False\nE09000010 False\nE09000011 False\nE09000012 False\nE09000013 False\nE09000014 False\nE09000015 False\nE09000016 True\nE09000017 False\nE09000018 False\nE09000019 False\nE09000020 True\nE09000021 False\nE09000022 False\nE09000023 False\nE09000024 False\nE09000025 False\nE09000026 False\nE09000027 False\nE09000028 False\nE09000029 False\nE09000030 False\nE09000031 False\nE09000032 False\nE09000033 False\nName: Average Age, dtype: bool"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "When you want to select the rows and see all the data add `boroughs[]` around your function:"
},
{
"metadata": {
"scrolled": true
},
"cell_type": "code",
"source": "boroughs[boroughs['Average Age'] > 39]",
"execution_count": 40,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 40,
"data": {
"text/plain": " Name Inner/Outer Population Area (ha) \\\nCode \nE09000001 City of London Inner London 8800 290 \nE09000006 Bromley Outer London 327900 15013 \nE09000016 Havering Outer London 254300 11235 \nE09000020 Kensington and Chelsea Inner London 159000 1212 \n\n Population density (/ha) Average Age Population born abroad (%) \\\nCode \nE09000001 30.3 43.2 NaN \nE09000006 21.8 40.2 18.3 \nE09000016 22.6 40.3 10.9 \nE09000020 131.1 39.3 51.9 \n\n Largest migrant population New migrant rates Employment rate (%) \\\nCode \nE09000001 United States 152.2 64.6 \nE09000006 India 14.4 75.3 \nE09000016 Ireland 17.0 76.5 \nE09000020 United States 66.2 68.2 \n\n Gross Pay (Male) Gross Pay (Female) Median House Price \\\nCode \nE09000001 NaN NaN 799999 \nE09000006 42026.0 32491.0 374975 \nE09000016 36539.0 27455.0 287500 \nE09000020 NaN NaN 1200000 \n\n Greenspace (%) Happiness score Political control \\\nCode \nE09000001 4.8 6.0 NaN \nE09000006 57.8 7.4 Cons \nE09000016 59.3 7.2 No Overall Control \nE09000020 15.1 7.6 Cons \n\n Turnout at local elections \nCode \nE09000001 NaN \nE09000006 40.8 \nE09000016 43.1 \nE09000020 29.8 ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Name</th>\n <th>Inner/Outer</th>\n <th>Population</th>\n <th>Area (ha)</th>\n <th>Population density (/ha)</th>\n <th>Average Age</th>\n <th>Population born abroad (%)</th>\n <th>Largest migrant population</th>\n <th>New migrant rates</th>\n <th>Employment rate (%)</th>\n <th>Gross Pay (Male)</th>\n <th>Gross Pay (Female)</th>\n <th>Median House Price</th>\n <th>Greenspace (%)</th>\n <th>Happiness score</th>\n <th>Political control</th>\n <th>Turnout at local elections</th>\n </tr>\n <tr>\n <th>Code</th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>E09000001</th>\n <td>City of London</td>\n <td>Inner London</td>\n <td>8800</td>\n <td>290</td>\n <td>30.3</td>\n <td>43.2</td>\n <td>NaN</td>\n <td>United States</td>\n <td>152.2</td>\n <td>64.6</td>\n <td>NaN</td>\n <td>NaN</td>\n <td>799999</td>\n <td>4.8</td>\n <td>6.0</td>\n <td>NaN</td>\n <td>NaN</td>\n </tr>\n <tr>\n <th>E09000006</th>\n <td>Bromley</td>\n <td>Outer London</td>\n <td>327900</td>\n <td>15013</td>\n <td>21.8</td>\n <td>40.2</td>\n <td>18.3</td>\n <td>India</td>\n <td>14.4</td>\n <td>75.3</td>\n <td>42026.0</td>\n <td>32491.0</td>\n <td>374975</td>\n <td>57.8</td>\n <td>7.4</td>\n <td>Cons</td>\n <td>40.8</td>\n </tr>\n <tr>\n <th>E09000016</th>\n <td>Havering</td>\n <td>Outer London</td>\n <td>254300</td>\n <td>11235</td>\n <td>22.6</td>\n <td>40.3</td>\n <td>10.9</td>\n <td>Ireland</td>\n <td>17.0</td>\n <td>76.5</td>\n <td>36539.0</td>\n <td>27455.0</td>\n <td>287500</td>\n 
<td>59.3</td>\n <td>7.2</td>\n <td>No Overall Control</td>\n <td>43.1</td>\n </tr>\n <tr>\n <th>E09000020</th>\n <td>Kensington and Chelsea</td>\n <td>Inner London</td>\n <td>159000</td>\n <td>1212</td>\n <td>131.1</td>\n <td>39.3</td>\n <td>51.9</td>\n <td>United States</td>\n <td>66.2</td>\n <td>68.2</td>\n <td>NaN</td>\n <td>NaN</td>\n <td>1200000</td>\n <td>15.1</td>\n <td>7.6</td>\n <td>Cons</td>\n <td>29.8</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "You can combine different columns using `&`, `|` and `==` operators:"
},
{
"metadata": {
"scrolled": true
},
"cell_type": "code",
"source": "boroughs[(boroughs['Average Age'] > 39) & (boroughs['Political control'] == 'Cons')]",
"execution_count": 41,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 41,
"data": {
"text/plain": " Name Inner/Outer Population Area (ha) \\\nCode \nE09000006 Bromley Outer London 327900 15013 \nE09000020 Kensington and Chelsea Inner London 159000 1212 \n\n Population density (/ha) Average Age Population born abroad (%) \\\nCode \nE09000006 21.8 40.2 18.3 \nE09000020 131.1 39.3 51.9 \n\n Largest migrant population New migrant rates Employment rate (%) \\\nCode \nE09000006 India 14.4 75.3 \nE09000020 United States 66.2 68.2 \n\n Gross Pay (Male) Gross Pay (Female) Median House Price \\\nCode \nE09000006 42026.0 32491.0 374975 \nE09000020 NaN NaN 1200000 \n\n Greenspace (%) Happiness score Political control \\\nCode \nE09000006 57.8 7.4 Cons \nE09000020 15.1 7.6 Cons \n\n Turnout at local elections \nCode \nE09000006 40.8 \nE09000020 29.8 ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Name</th>\n <th>Inner/Outer</th>\n <th>Population</th>\n <th>Area (ha)</th>\n <th>Population density (/ha)</th>\n <th>Average Age</th>\n <th>Population born abroad (%)</th>\n <th>Largest migrant population</th>\n <th>New migrant rates</th>\n <th>Employment rate (%)</th>\n <th>Gross Pay (Male)</th>\n <th>Gross Pay (Female)</th>\n <th>Median House Price</th>\n <th>Greenspace (%)</th>\n <th>Happiness score</th>\n <th>Political control</th>\n <th>Turnout at local elections</th>\n </tr>\n <tr>\n <th>Code</th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>E09000006</th>\n <td>Bromley</td>\n <td>Outer London</td>\n <td>327900</td>\n <td>15013</td>\n <td>21.8</td>\n <td>40.2</td>\n <td>18.3</td>\n <td>India</td>\n <td>14.4</td>\n <td>75.3</td>\n <td>42026.0</td>\n <td>32491.0</td>\n <td>374975</td>\n <td>57.8</td>\n <td>7.4</td>\n <td>Cons</td>\n <td>40.8</td>\n </tr>\n <tr>\n <th>E09000020</th>\n <td>Kensington and Chelsea</td>\n <td>Inner London</td>\n <td>159000</td>\n <td>1212</td>\n <td>131.1</td>\n <td>39.3</td>\n <td>51.9</td>\n <td>United States</td>\n <td>66.2</td>\n <td>68.2</td>\n <td>NaN</td>\n <td>NaN</td>\n <td>1200000</td>\n <td>15.1</td>\n <td>7.6</td>\n <td>Cons</td>\n <td>29.8</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {
"scrolled": true
},
"cell_type": "code",
"source": "boroughs[(boroughs['Political control'] == 'Lab') | (boroughs['Political control'] == 'Lib Dem')]",
"execution_count": 42,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 42,
"data": {
"text/plain": " Name Inner/Outer Population Area (ha) \\\nCode \nE09000002 Barking and Dagenham Outer London 209000 3611 \nE09000005 Brent Outer London 332100 4323 \nE09000007 Camden Inner London 242500 2179 \nE09000008 Croydon Outer London 386500 8650 \nE09000009 Ealing Outer London 351600 5554 \nE09000010 Enfield Outer London 333000 8083 \nE09000011 Greenwich Outer London 280100 4733 \nE09000012 Hackney Inner London 274300 1905 \nE09000013 Hammersmith and Fulham Inner London 185300 1640 \nE09000014 Haringey Inner London 278000 2960 \nE09000015 Harrow Outer London 252300 5046 \nE09000018 Hounslow Outer London 274200 5598 \nE09000019 Islington Inner London 231200 1486 \nE09000022 Lambeth Inner London 328900 2681 \nE09000023 Lewisham Inner London 303400 3515 \nE09000024 Merton Outer London 208100 3762 \nE09000025 Newham Inner London 342900 3620 \nE09000026 Redbridge Outer London 304200 5642 \nE09000028 Southwark Inner London 314300 2886 \nE09000029 Sutton Outer London 202600 4385 \nE09000031 Waltham Forest Outer London 276200 3881 \n\n Population density (/ha) Average Age Population born abroad (%) \\\nCode \nE09000002 57.9 32.9 37.8 \nE09000005 76.8 35.6 53.9 \nE09000007 111.3 36.4 41.4 \nE09000008 44.7 37.0 29.4 \nE09000009 63.3 36.2 47.4 \nE09000010 41.2 36.3 35.0 \nE09000011 59.2 35.0 35.4 \nE09000012 144.0 33.1 35.8 \nE09000013 113.0 35.7 43.2 \nE09000014 93.9 35.1 39.6 \nE09000015 50.0 38.3 49.6 \nE09000018 49.0 35.8 46.3 \nE09000019 155.6 34.8 36.6 \nE09000022 122.7 34.5 32.2 \nE09000023 86.3 35.0 34.9 \nE09000024 55.3 36.7 37.4 \nE09000025 94.7 32.1 54.1 \nE09000026 53.9 35.8 40.2 \nE09000028 108.9 34.4 38.4 \nE09000029 46.2 38.9 23.1 \nE09000031 71.2 35.1 37.2 \n\n Largest migrant population New migrant rates Employment rate (%) \\\nCode \nE09000002 Nigeria 59.1 65.8 \nE09000005 India 100.9 69.5 \nE09000007 United States 60.7 69.2 \nE09000008 India 32.3 75.4 \nE09000009 India 65.2 72.7 \nE09000010 Turkey 43.8 73.0 \nE09000011 Nigeria 37.6 72.1 \nE09000012 
Turkey 46.0 69.0 \nE09000013 France 71.4 77.5 \nE09000014 Poland 78.5 71.3 \nE09000015 India 65.4 73.9 \nE09000018 India 62.4 74.2 \nE09000019 Ireland 54.3 72.6 \nE09000022 Jamaica 46.5 78.5 \nE09000023 Jamaica 38.3 75.9 \nE09000024 Poland 48.6 78.8 \nE09000025 India 109.6 66.2 \nE09000026 India 54.6 68.3 \nE09000028 Nigeria 53.5 74.2 \nE09000029 Sri Lanka 16.1 78.2 \nE09000031 Pakistan 83.9 73.1 \n\n Gross Pay (Male) Gross Pay (Female) Median House Price \\\nCode \nE09000002 30104.0 24602.0 243500 \nE09000005 30129.0 29600.0 407250 \nE09000007 NaN 36632.0 700000 \nE09000008 35839.0 29819.0 300000 \nE09000009 32185.0 29875.0 430000 \nE09000010 35252.0 30222.0 320000 \nE09000011 35596.0 29833.0 340000 \nE09000012 NaN 31919.0 485000 \nE09000013 43845.0 34808.0 730000 \nE09000014 NaN 29513.0 432500 \nE09000015 NaN 29335.0 396150 \nE09000018 32235.0 27226.0 355000 \nE09000019 38284.0 NaN 583000 \nE09000022 35995.0 30173.0 450000 \nE09000023 NaN 31641.0 352000 \nE09000024 NaN 30722.0 415000 \nE09000025 30141.0 24006.0 305000 \nE09000026 39272.0 29204.0 345000 \nE09000028 36712.0 32696.0 475000 \nE09000029 36636.0 28540.0 320000 \nE09000031 33126.0 NaN 366569 \n\n Greenspace (%) Happiness score Political control \\\nCode \nE09000002 33.6 7.1 Lab \nE09000005 21.9 7.2 Lab \nE09000007 24.8 7.1 Lab \nE09000008 37.1 7.2 Lab \nE09000009 30.9 7.3 Lab \nE09000010 45.6 7.3 Lab \nE09000011 34.4 7.2 Lab \nE09000012 23.2 7.0 Lab \nE09000013 19.1 7.2 Lab \nE09000014 25.5 7.2 Lab \nE09000015 34.6 7.3 Lab \nE09000018 39.6 7.4 Lab \nE09000019 12.4 7.1 Lab \nE09000022 17.3 7.2 Lab \nE09000023 22.5 7.3 Lab \nE09000024 34.6 7.1 Lab \nE09000025 23.9 7.2 Lab \nE09000026 40.6 7.3 Lab \nE09000028 24.9 7.3 Lab \nE09000029 32.0 7.3 Lib Dem \nE09000031 31.4 7.1 Lab \n\n Turnout at local elections \nCode \nE09000002 36.5 \nE09000005 36.3 \nE09000007 38.7 \nE09000008 38.6 \nE09000009 41.2 \nE09000010 38.2 \nE09000011 37.3 \nE09000012 39.4 \nE09000013 37.6 \nE09000014 38.1 \nE09000015 40.7 
\nE09000018 36.8 \nE09000019 38.4 \nE09000022 34.5 \nE09000023 37.2 \nE09000024 41.3 \nE09000025 40.5 \nE09000026 39.7 \nE09000028 36.2 \nE09000029 42.6 \nE09000031 37.6 ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Name</th>\n <th>Inner/Outer</th>\n <th>Population</th>\n <th>Area (ha)</th>\n <th>Population density (/ha)</th>\n <th>Average Age</th>\n <th>Population born abroad (%)</th>\n <th>Largest migrant population</th>\n <th>New migrant rates</th>\n <th>Employment rate (%)</th>\n <th>Gross Pay (Male)</th>\n <th>Gross Pay (Female)</th>\n <th>Median House Price</th>\n <th>Greenspace (%)</th>\n <th>Happiness score</th>\n <th>Political control</th>\n <th>Turnout at local elections</th>\n </tr>\n <tr>\n <th>Code</th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>E09000002</th>\n <td>Barking and Dagenham</td>\n <td>Outer London</td>\n <td>209000</td>\n <td>3611</td>\n <td>57.9</td>\n <td>32.9</td>\n <td>37.8</td>\n <td>Nigeria</td>\n <td>59.1</td>\n <td>65.8</td>\n <td>30104.0</td>\n <td>24602.0</td>\n <td>243500</td>\n <td>33.6</td>\n <td>7.1</td>\n <td>Lab</td>\n <td>36.5</td>\n </tr>\n <tr>\n <th>E09000005</th>\n <td>Brent</td>\n <td>Outer London</td>\n <td>332100</td>\n <td>4323</td>\n <td>76.8</td>\n <td>35.6</td>\n <td>53.9</td>\n <td>India</td>\n <td>100.9</td>\n <td>69.5</td>\n <td>30129.0</td>\n <td>29600.0</td>\n <td>407250</td>\n <td>21.9</td>\n <td>7.2</td>\n <td>Lab</td>\n <td>36.3</td>\n </tr>\n <tr>\n <th>E09000007</th>\n <td>Camden</td>\n <td>Inner London</td>\n <td>242500</td>\n <td>2179</td>\n <td>111.3</td>\n <td>36.4</td>\n <td>41.4</td>\n <td>United States</td>\n <td>60.7</td>\n <td>69.2</td>\n <td>NaN</td>\n <td>36632.0</td>\n 
<td>700000</td>\n <td>24.8</td>\n <td>7.1</td>\n <td>Lab</td>\n <td>38.7</td>\n </tr>\n <tr>\n <th>E09000008</th>\n <td>Croydon</td>\n <td>Outer London</td>\n <td>386500</td>\n <td>8650</td>\n <td>44.7</td>\n <td>37.0</td>\n <td>29.4</td>\n <td>India</td>\n <td>32.3</td>\n <td>75.4</td>\n <td>35839.0</td>\n <td>29819.0</td>\n <td>300000</td>\n <td>37.1</td>\n <td>7.2</td>\n <td>Lab</td>\n <td>38.6</td>\n </tr>\n <tr>\n <th>E09000009</th>\n <td>Ealing</td>\n <td>Outer London</td>\n <td>351600</td>\n <td>5554</td>\n <td>63.3</td>\n <td>36.2</td>\n <td>47.4</td>\n <td>India</td>\n <td>65.2</td>\n <td>72.7</td>\n <td>32185.0</td>\n <td>29875.0</td>\n <td>430000</td>\n <td>30.9</td>\n <td>7.3</td>\n <td>Lab</td>\n <td>41.2</td>\n </tr>\n <tr>\n <th>E09000010</th>\n <td>Enfield</td>\n <td>Outer London</td>\n <td>333000</td>\n <td>8083</td>\n <td>41.2</td>\n <td>36.3</td>\n <td>35.0</td>\n <td>Turkey</td>\n <td>43.8</td>\n <td>73.0</td>\n <td>35252.0</td>\n <td>30222.0</td>\n <td>320000</td>\n <td>45.6</td>\n <td>7.3</td>\n <td>Lab</td>\n <td>38.2</td>\n </tr>\n <tr>\n <th>E09000011</th>\n <td>Greenwich</td>\n <td>Outer London</td>\n <td>280100</td>\n <td>4733</td>\n <td>59.2</td>\n <td>35.0</td>\n <td>35.4</td>\n <td>Nigeria</td>\n <td>37.6</td>\n <td>72.1</td>\n <td>35596.0</td>\n <td>29833.0</td>\n <td>340000</td>\n <td>34.4</td>\n <td>7.2</td>\n <td>Lab</td>\n <td>37.3</td>\n </tr>\n <tr>\n <th>E09000012</th>\n <td>Hackney</td>\n <td>Inner London</td>\n <td>274300</td>\n <td>1905</td>\n <td>144.0</td>\n <td>33.1</td>\n <td>35.8</td>\n <td>Turkey</td>\n <td>46.0</td>\n <td>69.0</td>\n <td>NaN</td>\n <td>31919.0</td>\n <td>485000</td>\n <td>23.2</td>\n <td>7.0</td>\n <td>Lab</td>\n <td>39.4</td>\n </tr>\n <tr>\n <th>E09000013</th>\n <td>Hammersmith and Fulham</td>\n <td>Inner London</td>\n <td>185300</td>\n <td>1640</td>\n <td>113.0</td>\n <td>35.7</td>\n <td>43.2</td>\n <td>France</td>\n <td>71.4</td>\n <td>77.5</td>\n <td>43845.0</td>\n <td>34808.0</td>\n 
<td>730000</td>\n <td>19.1</td>\n <td>7.2</td>\n <td>Lab</td>\n <td>37.6</td>\n </tr>\n <tr>\n <th>E09000014</th>\n <td>Haringey</td>\n <td>Inner London</td>\n <td>278000</td>\n <td>2960</td>\n <td>93.9</td>\n <td>35.1</td>\n <td>39.6</td>\n <td>Poland</td>\n <td>78.5</td>\n <td>71.3</td>\n <td>NaN</td>\n <td>29513.0</td>\n <td>432500</td>\n <td>25.5</td>\n <td>7.2</td>\n <td>Lab</td>\n <td>38.1</td>\n </tr>\n <tr>\n <th>E09000015</th>\n <td>Harrow</td>\n <td>Outer London</td>\n <td>252300</td>\n <td>5046</td>\n <td>50.0</td>\n <td>38.3</td>\n <td>49.6</td>\n <td>India</td>\n <td>65.4</td>\n <td>73.9</td>\n <td>NaN</td>\n <td>29335.0</td>\n <td>396150</td>\n <td>34.6</td>\n <td>7.3</td>\n <td>Lab</td>\n <td>40.7</td>\n </tr>\n <tr>\n <th>E09000018</th>\n <td>Hounslow</td>\n <td>Outer London</td>\n <td>274200</td>\n <td>5598</td>\n <td>49.0</td>\n <td>35.8</td>\n <td>46.3</td>\n <td>India</td>\n <td>62.4</td>\n <td>74.2</td>\n <td>32235.0</td>\n <td>27226.0</td>\n <td>355000</td>\n <td>39.6</td>\n <td>7.4</td>\n <td>Lab</td>\n <td>36.8</td>\n </tr>\n <tr>\n <th>E09000019</th>\n <td>Islington</td>\n <td>Inner London</td>\n <td>231200</td>\n <td>1486</td>\n <td>155.6</td>\n <td>34.8</td>\n <td>36.6</td>\n <td>Ireland</td>\n <td>54.3</td>\n <td>72.6</td>\n <td>38284.0</td>\n <td>NaN</td>\n <td>583000</td>\n <td>12.4</td>\n <td>7.1</td>\n <td>Lab</td>\n <td>38.4</td>\n </tr>\n <tr>\n <th>E09000022</th>\n <td>Lambeth</td>\n <td>Inner London</td>\n <td>328900</td>\n <td>2681</td>\n <td>122.7</td>\n <td>34.5</td>\n <td>32.2</td>\n <td>Jamaica</td>\n <td>46.5</td>\n <td>78.5</td>\n <td>35995.0</td>\n <td>30173.0</td>\n <td>450000</td>\n <td>17.3</td>\n <td>7.2</td>\n <td>Lab</td>\n <td>34.5</td>\n </tr>\n <tr>\n <th>E09000023</th>\n <td>Lewisham</td>\n <td>Inner London</td>\n <td>303400</td>\n <td>3515</td>\n <td>86.3</td>\n <td>35.0</td>\n <td>34.9</td>\n <td>Jamaica</td>\n <td>38.3</td>\n <td>75.9</td>\n <td>NaN</td>\n <td>31641.0</td>\n <td>352000</td>\n <td>22.5</td>\n 
<td>7.3</td>\n <td>Lab</td>\n <td>37.2</td>\n </tr>\n <tr>\n <th>E09000024</th>\n <td>Merton</td>\n <td>Outer London</td>\n <td>208100</td>\n <td>3762</td>\n <td>55.3</td>\n <td>36.7</td>\n <td>37.4</td>\n <td>Poland</td>\n <td>48.6</td>\n <td>78.8</td>\n <td>NaN</td>\n <td>30722.0</td>\n <td>415000</td>\n <td>34.6</td>\n <td>7.1</td>\n <td>Lab</td>\n <td>41.3</td>\n </tr>\n <tr>\n <th>E09000025</th>\n <td>Newham</td>\n <td>Inner London</td>\n <td>342900</td>\n <td>3620</td>\n <td>94.7</td>\n <td>32.1</td>\n <td>54.1</td>\n <td>India</td>\n <td>109.6</td>\n <td>66.2</td>\n <td>30141.0</td>\n <td>24006.0</td>\n <td>305000</td>\n <td>23.9</td>\n <td>7.2</td>\n <td>Lab</td>\n <td>40.5</td>\n </tr>\n <tr>\n <th>E09000026</th>\n <td>Redbridge</td>\n <td>Outer London</td>\n <td>304200</td>\n <td>5642</td>\n <td>53.9</td>\n <td>35.8</td>\n <td>40.2</td>\n <td>India</td>\n <td>54.6</td>\n <td>68.3</td>\n <td>39272.0</td>\n <td>29204.0</td>\n <td>345000</td>\n <td>40.6</td>\n <td>7.3</td>\n <td>Lab</td>\n <td>39.7</td>\n </tr>\n <tr>\n <th>E09000028</th>\n <td>Southwark</td>\n <td>Inner London</td>\n <td>314300</td>\n <td>2886</td>\n <td>108.9</td>\n <td>34.4</td>\n <td>38.4</td>\n <td>Nigeria</td>\n <td>53.5</td>\n <td>74.2</td>\n <td>36712.0</td>\n <td>32696.0</td>\n <td>475000</td>\n <td>24.9</td>\n <td>7.3</td>\n <td>Lab</td>\n <td>36.2</td>\n </tr>\n <tr>\n <th>E09000029</th>\n <td>Sutton</td>\n <td>Outer London</td>\n <td>202600</td>\n <td>4385</td>\n <td>46.2</td>\n <td>38.9</td>\n <td>23.1</td>\n <td>Sri Lanka</td>\n <td>16.1</td>\n <td>78.2</td>\n <td>36636.0</td>\n <td>28540.0</td>\n <td>320000</td>\n <td>32.0</td>\n <td>7.3</td>\n <td>Lib Dem</td>\n <td>42.6</td>\n </tr>\n <tr>\n <th>E09000031</th>\n <td>Waltham Forest</td>\n <td>Outer London</td>\n <td>276200</td>\n <td>3881</td>\n <td>71.2</td>\n <td>35.1</td>\n <td>37.2</td>\n <td>Pakistan</td>\n <td>83.9</td>\n <td>73.1</td>\n <td>33126.0</td>\n <td>NaN</td>\n <td>366569</td>\n <td>31.4</td>\n <td>7.1</td>\n 
<td>Lab</td>\n <td>37.6</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
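{
"metadata": {},
"cell_type": "markdown",
"source": "Chaining several `==` comparisons with `|` gets long when there are many values to match. As a minimal sketch, using a small hypothetical `sample` DataFrame built from three rows of the output above rather than the full `boroughs` data, `isin()` tests membership in a list and `~` negates a boolean mask:"
},
{
"metadata": {},
"cell_type": "code",
"source": "import pandas as pd\n\n# Hypothetical three-row sample; values copied from the boroughs output above\nsample = pd.DataFrame(\n    {'Name': ['Bromley', 'Camden', 'Sutton'],\n     'Political control': ['Cons', 'Lab', 'Lib Dem']},\n    index=pd.Index(['E09000006', 'E09000007', 'E09000029'], name='Code'))\n\n# True where 'Political control' is one of the listed values\nmask = sample['Political control'].isin(['Lab', 'Lib Dem'])\n\nprint(sample[mask])   # rows matching either value\nprint(sample[~mask])  # all remaining rows",
"execution_count": null,
"outputs": []
},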
{
"metadata": {},
"cell_type": "markdown",
"source": "<div class=\"alert alert-success\">\n <b>EXERCISE</b> <br/> \n With the above commands you can now start exploring the data some more. Answer the following questions by writing a little code (add as many cells as you need):\n <ol>\n <li>Which borough has the largest population density per hectare? </li> \n <li>What are the maximum and minimum number of new migrants? And for which boroughs?</li> \n <li> Which borough is happiest? </li>\n \n </ol> \n</div> \n\n\n> *Tips*: \n- Find the maximum of a row with for instance `boroughs['Population'].max()` \n- Extract the value from a cell in a DataFrame with `.value[]`\n- Print a value with `print()` for instance: `print(boroughs['area'][0])` for the first row. If you calculate multiple values in one cell you will need this, else the answers will not be displayed.\n- To extract an entire row use `idxmax()` which returns column with maximum value, and `.loc[]` to return row of the index\n- To see the answer uncomment the line in the cell that contains `%load` (by deleting the `#`) and then run the cell, but try to find your own solution first in the cell above the solution!"
},
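{
"metadata": {},
"cell_type": "markdown",
"source": "As a minimal sketch of the `max()`, `idxmax()` and `.loc[]` tips above, here they are applied to a small hypothetical `scores` DataFrame rather than the borough data, so the exercise answers are not given away. The names `scores`, `team` and `points` are made up for illustration:"
},
{
"metadata": {},
"cell_type": "code",
"source": "import pandas as pd\n\n# Hypothetical toy data, not part of the borough dataset\nscores = pd.DataFrame({'team': ['red', 'green', 'blue'],\n                       'points': [12, 30, 25]},\n                      index=['a', 'b', 'c'])\n\nprint(scores['points'].max())      # largest value in the column\nlabel = scores['points'].idxmax()  # index label of the row holding that value\nprint(label)\nprint(scores.loc[label])           # the entire row for that label\nprint(scores.loc[label, 'team'])   # a single cell from that row",
"execution_count": null,
"outputs": []
},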
{
"metadata": {},
"cell_type": "markdown",
"source": "**Which borough has the largest population density per hectare?**"
},
{
"metadata": {},
"cell_type": "code",
"source": "# your answer:\nboroughs['Population density (/ha)'].max()",
"execution_count": 43,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 43,
"data": {
"text/plain": "155.6"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "# %load https://raw.githubusercontent.com/IBMDeveloperUK/python-pandas-workshop/master/answers/answer1.py\nboroughs[boroughs['Population density (/ha)'] == boroughs['Population density (/ha)'].max()]\n",
"execution_count": 44,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 44,
"data": {
"text/plain": " Name Inner/Outer Population Area (ha) \\\nCode \nE09000019 Islington Inner London 231200 1486 \n\n Population density (/ha) Average Age Population born abroad (%) \\\nCode \nE09000019 155.6 34.8 36.6 \n\n Largest migrant population New migrant rates Employment rate (%) \\\nCode \nE09000019 Ireland 54.3 72.6 \n\n Gross Pay (Male) Gross Pay (Female) Median House Price \\\nCode \nE09000019 38284.0 NaN 583000 \n\n Greenspace (%) Happiness score Political control \\\nCode \nE09000019 12.4 7.1 Lab \n\n Turnout at local elections \nCode \nE09000019 38.4 ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Name</th>\n <th>Inner/Outer</th>\n <th>Population</th>\n <th>Area (ha)</th>\n <th>Population density (/ha)</th>\n <th>Average Age</th>\n <th>Population born abroad (%)</th>\n <th>Largest migrant population</th>\n <th>New migrant rates</th>\n <th>Employment rate (%)</th>\n <th>Gross Pay (Male)</th>\n <th>Gross Pay (Female)</th>\n <th>Median House Price</th>\n <th>Greenspace (%)</th>\n <th>Happiness score</th>\n <th>Political control</th>\n <th>Turnout at local elections</th>\n </tr>\n <tr>\n <th>Code</th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>E09000019</th>\n <td>Islington</td>\n <td>Inner London</td>\n <td>231200</td>\n <td>1486</td>\n <td>155.6</td>\n <td>34.8</td>\n <td>36.6</td>\n <td>Ireland</td>\n <td>54.3</td>\n <td>72.6</td>\n <td>38284.0</td>\n <td>NaN</td>\n <td>583000</td>\n <td>12.4</td>\n <td>7.1</td>\n <td>Lab</td>\n <td>38.4</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "**What are the maximum and minimum number of new migrants? And for which boroughs?**"
},
{
"metadata": {},
"cell_type": "code",
"source": "# your answer:\nboroughs[(boroughs['New migrant rates'] == boroughs['New migrant rates'].max()) | (boroughs['New migrant rates'] == boroughs['New migrant rates'].min())]",
"execution_count": 45,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 45,
"data": {
"text/plain": " Name Inner/Outer Population Area (ha) \\\nCode \nE09000001 City of London Inner London 8800 290 \nE09000004 Bexley Outer London 244300 6058 \nE09000006 Bromley Outer London 327900 15013 \n\n Population density (/ha) Average Age Population born abroad (%) \\\nCode \nE09000001 30.3 43.2 NaN \nE09000004 40.3 39.0 16.1 \nE09000006 21.8 40.2 18.3 \n\n Largest migrant population New migrant rates Employment rate (%) \\\nCode \nE09000001 United States 152.2 64.6 \nE09000004 Nigeria 14.4 75.1 \nE09000006 India 14.4 75.3 \n\n Gross Pay (Male) Gross Pay (Female) Median House Price \\\nCode \nE09000001 NaN NaN 799999 \nE09000004 37881.0 28924.0 275000 \nE09000006 42026.0 32491.0 374975 \n\n Greenspace (%) Happiness score Political control \\\nCode \nE09000001 4.8 6.0 NaN \nE09000004 31.7 7.2 Cons \nE09000006 57.8 7.4 Cons \n\n Turnout at local elections \nCode \nE09000001 NaN \nE09000004 39.6 \nE09000006 40.8 ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Name</th>\n <th>Inner/Outer</th>\n <th>Population</th>\n <th>Area (ha)</th>\n <th>Population density (/ha)</th>\n <th>Average Age</th>\n <th>Population born abroad (%)</th>\n <th>Largest migrant population</th>\n <th>New migrant rates</th>\n <th>Employment rate (%)</th>\n <th>Gross Pay (Male)</th>\n <th>Gross Pay (Female)</th>\n <th>Median House Price</th>\n <th>Greenspace (%)</th>\n <th>Happiness score</th>\n <th>Political control</th>\n <th>Turnout at local elections</th>\n </tr>\n <tr>\n <th>Code</th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>E09000001</th>\n <td>City of London</td>\n <td>Inner London</td>\n <td>8800</td>\n <td>290</td>\n <td>30.3</td>\n <td>43.2</td>\n <td>NaN</td>\n <td>United States</td>\n <td>152.2</td>\n <td>64.6</td>\n <td>NaN</td>\n <td>NaN</td>\n <td>799999</td>\n <td>4.8</td>\n <td>6.0</td>\n <td>NaN</td>\n <td>NaN</td>\n </tr>\n <tr>\n <th>E09000004</th>\n <td>Bexley</td>\n <td>Outer London</td>\n <td>244300</td>\n <td>6058</td>\n <td>40.3</td>\n <td>39.0</td>\n <td>16.1</td>\n <td>Nigeria</td>\n <td>14.4</td>\n <td>75.1</td>\n <td>37881.0</td>\n <td>28924.0</td>\n <td>275000</td>\n <td>31.7</td>\n <td>7.2</td>\n <td>Cons</td>\n <td>39.6</td>\n </tr>\n <tr>\n <th>E09000006</th>\n <td>Bromley</td>\n <td>Outer London</td>\n <td>327900</td>\n <td>15013</td>\n <td>21.8</td>\n <td>40.2</td>\n <td>18.3</td>\n <td>India</td>\n <td>14.4</td>\n <td>75.3</td>\n <td>42026.0</td>\n <td>32491.0</td>\n <td>374975</td>\n 
<td>57.8</td>\n <td>7.4</td>\n <td>Cons</td>\n <td>40.8</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "# %load https://raw.githubusercontent.com/IBMDeveloperUK/python-pandas-workshop/master/answers/answer2.py\nprint (boroughs['New migrant rates'].min())\nprint (boroughs['Name'][boroughs['New migrant rates'] == boroughs['New migrant rates'].min()])\n\nprint (boroughs['New migrant rates'].max())\nprint (boroughs['Name'][boroughs['New migrant rates'] == boroughs['New migrant rates'].max()])\n",
"execution_count": 46,
"outputs": [
{
"output_type": "stream",
"text": "14.4\nCode\nE09000004 Bexley\nE09000006 Bromley\nName: Name, dtype: object\n152.2\nCode\nE09000001 City of London\nName: Name, dtype: object\n",
"name": "stdout"
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "**Which borough is happiest?**"
},
{
"metadata": {},
"cell_type": "code",
"source": "# your answer:\nboroughs[boroughs['Happiness score'] == boroughs['Happiness score'].max()]",
"execution_count": 47,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 47,
"data": {
"text/plain": " Name Inner/Outer Population Area (ha) \\\nCode \nE09000020 Kensington and Chelsea Inner London 159000 1212 \n\n Population density (/ha) Average Age Population born abroad (%) \\\nCode \nE09000020 131.1 39.3 51.9 \n\n Largest migrant population New migrant rates Employment rate (%) \\\nCode \nE09000020 United States 66.2 68.2 \n\n Gross Pay (Male) Gross Pay (Female) Median House Price \\\nCode \nE09000020 NaN NaN 1200000 \n\n Greenspace (%) Happiness score Political control \\\nCode \nE09000020 15.1 7.6 Cons \n\n Turnout at local elections \nCode \nE09000020 29.8 ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Name</th>\n <th>Inner/Outer</th>\n <th>Population</th>\n <th>Area (ha)</th>\n <th>Population density (/ha)</th>\n <th>Average Age</th>\n <th>Population born abroad (%)</th>\n <th>Largest migrant population</th>\n <th>New migrant rates</th>\n <th>Employment rate (%)</th>\n <th>Gross Pay (Male)</th>\n <th>Gross Pay (Female)</th>\n <th>Median House Price</th>\n <th>Greenspace (%)</th>\n <th>Happiness score</th>\n <th>Political control</th>\n <th>Turnout at local elections</th>\n </tr>\n <tr>\n <th>Code</th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>E09000020</th>\n <td>Kensington and Chelsea</td>\n <td>Inner London</td>\n <td>159000</td>\n <td>1212</td>\n <td>131.1</td>\n <td>39.3</td>\n <td>51.9</td>\n <td>United States</td>\n <td>66.2</td>\n <td>68.2</td>\n <td>NaN</td>\n <td>NaN</td>\n <td>1200000</td>\n <td>15.1</td>\n <td>7.6</td>\n <td>Cons</td>\n <td>29.8</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs['Name'].loc[boroughs['Happiness score'].idxmax()]",
"execution_count": 48,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 48,
"data": {
"text/plain": "'Kensington and Chelsea'"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "# %load https://raw.githubusercontent.com/IBMDeveloperUK/python-pandas-workshop/master/answers/answer3.py\nboroughs['Name'].loc[boroughs['Happiness score'].idxmax()]\n",
"execution_count": 49,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 49,
"data": {
"text/plain": "'Kensington and Chelsea'"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "<a id=\"merging\"></a>\n## 5. Merging Data\n\nPandas has several different options to combine or merge data. The [documentation](https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html) has lots of examples. "
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Let's create two new Dataframes to explore this: `cities` and `cities2`"
},
{
"metadata": {},
"cell_type": "code",
"source": "data = {'city': ['London','Manchester','Birmingham','Leeds','Glasgow'],\n 'population': [9787426, 2553379, 2440986, 1777934,1209143],\n 'area': [1737.9, 630.3, 598.9, 487.8, 368.5 ]}\ncities = pd.DataFrame(data)\n\ndata2 = {'city': ['Liverpool','Southampton'],\n 'population': [864122, 855569],\n 'area': [199.6, 192.0]}\ncities2 = pd.DataFrame(data2)",
"execution_count": 50,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Use `append()` to combine these Dataframes:"
},
{
"metadata": {},
"cell_type": "code",
"source": "cities = cities.append(cities2)\ncities",
"execution_count": 51,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 51,
"data": {
"text/plain": " city population area\n0 London 9787426 1737.9\n1 Manchester 2553379 630.3\n2 Birmingham 2440986 598.9\n3 Leeds 1777934 487.8\n4 Glasgow 1209143 368.5\n0 Liverpool 864122 199.6\n1 Southampton 855569 192.0",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>city</th>\n <th>population</th>\n <th>area</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>London</td>\n <td>9787426</td>\n <td>1737.9</td>\n </tr>\n <tr>\n <th>1</th>\n <td>Manchester</td>\n <td>2553379</td>\n <td>630.3</td>\n </tr>\n <tr>\n <th>2</th>\n <td>Birmingham</td>\n <td>2440986</td>\n <td>598.9</td>\n </tr>\n <tr>\n <th>3</th>\n <td>Leeds</td>\n <td>1777934</td>\n <td>487.8</td>\n </tr>\n <tr>\n <th>4</th>\n <td>Glasgow</td>\n <td>1209143</td>\n <td>368.5</td>\n </tr>\n <tr>\n <th>0</th>\n <td>Liverpool</td>\n <td>864122</td>\n <td>199.6</td>\n </tr>\n <tr>\n <th>1</th>\n <td>Southampton</td>\n <td>855569</td>\n <td>192.0</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
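{
"metadata": {},
"cell_type": "markdown",
"source": "Notice that the combined DataFrame keeps the original row labels, so `0` and `1` each appear twice. A minimal sketch of one way to avoid this is `pd.concat()` with `ignore_index=True`, shown here on two small hypothetical frames (`top` and `bottom`, with values copied from the cities above). Calling `reset_index(drop=True)` afterwards would be an alternative:"
},
{
"metadata": {},
"cell_type": "code",
"source": "import pandas as pd\n\n# Hypothetical frames for illustration; values copied from the cities above\ntop = pd.DataFrame({'city': ['London', 'Manchester'],\n                    'population': [9787426, 2553379]})\nbottom = pd.DataFrame({'city': ['Liverpool', 'Southampton'],\n                       'population': [864122, 855569]})\n\n# ignore_index=True discards the old labels and renumbers the rows from 0\ncombined = pd.concat([top, bottom], ignore_index=True)\nprint(combined)",
"execution_count": null,
"outputs": []
},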
{
"metadata": {},
"cell_type": "code",
"source": "data = {'city': ['London','Manchester','Birmingham','Leeds','Glasgow'],\n 'density': [5630,4051,4076,3645,3390]}\ncities3 = pd.DataFrame(data)",
"execution_count": 52,
"outputs": []
},
{
"metadata": {},
"cell_type": "code",
"source": "cities3",
"execution_count": 53,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 53,
"data": {
"text/plain": " city density\n0 London 5630\n1 Manchester 4051\n2 Birmingham 4076\n3 Leeds 3645\n4 Glasgow 3390",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>city</th>\n <th>density</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>London</td>\n <td>5630</td>\n </tr>\n <tr>\n <th>1</th>\n <td>Manchester</td>\n <td>4051</td>\n </tr>\n <tr>\n <th>2</th>\n <td>Birmingham</td>\n <td>4076</td>\n </tr>\n <tr>\n <th>3</th>\n <td>Leeds</td>\n <td>3645</td>\n </tr>\n <tr>\n <th>4</th>\n <td>Glasgow</td>\n <td>3390</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "An extra column can be added with `.merge()` with an outer join using the city names:"
},
{
"metadata": {},
"cell_type": "code",
"source": "cities = pd.merge(cities, cities3, how='outer', sort=True,on='city')\ncities",
"execution_count": 54,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 54,
"data": {
"text/plain": " city population area density\n0 Birmingham 2440986 598.9 4076.0\n1 Glasgow 1209143 368.5 3390.0\n2 Leeds 1777934 487.8 3645.0\n3 Liverpool 864122 199.6 NaN\n4 London 9787426 1737.9 5630.0\n5 Manchester 2553379 630.3 4051.0\n6 Southampton 855569 192.0 NaN",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>city</th>\n <th>population</th>\n <th>area</th>\n <th>density</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>Birmingham</td>\n <td>2440986</td>\n <td>598.9</td>\n <td>4076.0</td>\n </tr>\n <tr>\n <th>1</th>\n <td>Glasgow</td>\n <td>1209143</td>\n <td>368.5</td>\n <td>3390.0</td>\n </tr>\n <tr>\n <th>2</th>\n <td>Leeds</td>\n <td>1777934</td>\n <td>487.8</td>\n <td>3645.0</td>\n </tr>\n <tr>\n <th>3</th>\n <td>Liverpool</td>\n <td>864122</td>\n <td>199.6</td>\n <td>NaN</td>\n </tr>\n <tr>\n <th>4</th>\n <td>London</td>\n <td>9787426</td>\n <td>1737.9</td>\n <td>5630.0</td>\n </tr>\n <tr>\n <th>5</th>\n <td>Manchester</td>\n <td>2553379</td>\n <td>630.3</td>\n <td>4051.0</td>\n </tr>\n <tr>\n <th>6</th>\n <td>Southampton</td>\n <td>855569</td>\n <td>192.0</td>\n <td>NaN</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "<a id=\"grouping\"></a>\n## 6. Grouping Data\n\nGrouping data is a quick way to calculate values for classes in your DataFrame. "
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs.groupby(['Inner/Outer']).mean()",
"execution_count": 55,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 55,
"data": {
"text/plain": " Population Area (ha) Population density (/ha) Average Age \\\nInner/Outer \nInner London 252550.000000 2280.5 110.850000 35.550000 \nOuter London 278931.578947 6594.0 47.673684 36.984211 \n\n Population born abroad (%) New migrant rates \\\nInner/Outer \nInner London 40.715385 69.514286 \nOuter London 33.636842 44.878947 \n\n Employment rate (%) Gross Pay (Male) Gross Pay (Female) \\\nInner/Outer \nInner London 71.571429 38580.714286 32674.909091 \nOuter London 73.557895 35668.125000 29986.666667 \n\n Median House Price Greenspace (%) Happiness score \\\nInner/Outer \nInner London 600321.357143 20.985714 7.135714 \nOuter London 366102.315789 39.094737 7.263158 \n\n Turnout at local elections \nInner/Outer \nInner London 37.446154 \nOuter London 39.794737 ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Population</th>\n <th>Area (ha)</th>\n <th>Population density (/ha)</th>\n <th>Average Age</th>\n <th>Population born abroad (%)</th>\n <th>New migrant rates</th>\n <th>Employment rate (%)</th>\n <th>Gross Pay (Male)</th>\n <th>Gross Pay (Female)</th>\n <th>Median House Price</th>\n <th>Greenspace (%)</th>\n <th>Happiness score</th>\n <th>Turnout at local elections</th>\n </tr>\n <tr>\n <th>Inner/Outer</th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>Inner London</th>\n <td>252550.000000</td>\n <td>2280.5</td>\n <td>110.850000</td>\n <td>35.550000</td>\n <td>40.715385</td>\n <td>69.514286</td>\n <td>71.571429</td>\n <td>38580.714286</td>\n <td>32674.909091</td>\n <td>600321.357143</td>\n <td>20.985714</td>\n <td>7.135714</td>\n <td>37.446154</td>\n </tr>\n <tr>\n <th>Outer London</th>\n <td>278931.578947</td>\n <td>6594.0</td>\n <td>47.673684</td>\n <td>36.984211</td>\n <td>33.636842</td>\n <td>44.878947</td>\n <td>73.557895</td>\n <td>35668.125000</td>\n <td>29986.666667</td>\n <td>366102.315789</td>\n <td>39.094737</td>\n <td>7.263158</td>\n <td>39.794737</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "When you have multiple categorial variables you can create a nested index:"
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs.groupby(['Inner/Outer','Political control']).sum().head(8)",
"execution_count": 56,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 56,
"data": {
"text/plain": " Population Area (ha) \\\nInner/Outer Political control \nInner London Cons 722100 6787 \n Lab 2500800 22872 \n Tower Hamlets First 304000 1978 \nOuter London Cons 1635500 50783 \n Lab 3207300 58883 \n Lib Dem 202600 4385 \n No Overall Control 254300 11235 \n\n Population density (/ha) Average Age \\\nInner/Outer Political control \nInner London Cons 337.5 112.0 \n Lab 1030.4 311.1 \n Tower Hamlets First 153.7 31.4 \nOuter London Cons 214.5 228.8 \n Lab 622.5 394.7 \n Lib Dem 46.2 38.9 \n No Overall Control 22.6 40.3 \n\n Population born abroad (%) \\\nInner/Outer Political control \nInner London Cons 134.5 \n Lab 356.2 \n Tower Hamlets First 38.6 \nOuter London Cons 155.5 \n Lab 449.6 \n Lib Dem 23.1 \n No Overall Control 10.9 \n\n New migrant rates Employment rate (%) \\\nInner/Outer Political control \nInner London Cons 181.7 212.6 \n Lab 558.8 654.4 \n Tower Hamlets First 80.5 70.4 \nOuter London Cons 165.8 446.1 \n Lab 653.8 796.8 \n Lib Dem 16.1 78.2 \n No Overall Control 17.0 76.5 \n\n Gross Pay (Male) Gross Pay (Female) \\\nInner/Outer Political control \nInner London Cons 46627.0 75379.0 \n Lab 184977.0 251388.0 \n Tower Hamlets First 38461.0 32657.0 \nOuter London Cons 193777.0 193327.0 \n Lab 303738.0 290438.0 \n Lib Dem 36636.0 28540.0 \n No Overall Control 36539.0 27455.0 \n\n Median House Price Greenspace (%) \\\nInner/Outer Political control \nInner London Cons 2677000 80.2 \n Lab 4512500 193.6 \n Tower Hamlets First 415000 15.2 \nOuter London Cons 2429975 267.2 \n Lab 3918469 384.3 \n Lib Dem 320000 32.0 \n No Overall Control 287500 59.3 \n\n Happiness score Turnout at local elections \nInner/Outer Political control \nInner London Cons 22.1 99.0 \n Lab 64.6 340.6 \n Tower Hamlets First 7.2 47.2 \nOuter London Cons 44.0 246.2 \n Lab 79.5 424.2 \n Lib Dem 7.3 42.6 \n No Overall Control 7.2 43.1 ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th></th>\n <th>Population</th>\n <th>Area (ha)</th>\n <th>Population density (/ha)</th>\n <th>Average Age</th>\n <th>Population born abroad (%)</th>\n <th>New migrant rates</th>\n <th>Employment rate (%)</th>\n <th>Gross Pay (Male)</th>\n <th>Gross Pay (Female)</th>\n <th>Median House Price</th>\n <th>Greenspace (%)</th>\n <th>Happiness score</th>\n <th>Turnout at local elections</th>\n </tr>\n <tr>\n <th>Inner/Outer</th>\n <th>Political control</th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th rowspan=\"3\" valign=\"top\">Inner London</th>\n <th>Cons</th>\n <td>722100</td>\n <td>6787</td>\n <td>337.5</td>\n <td>112.0</td>\n <td>134.5</td>\n <td>181.7</td>\n <td>212.6</td>\n <td>46627.0</td>\n <td>75379.0</td>\n <td>2677000</td>\n <td>80.2</td>\n <td>22.1</td>\n <td>99.0</td>\n </tr>\n <tr>\n <th>Lab</th>\n <td>2500800</td>\n <td>22872</td>\n <td>1030.4</td>\n <td>311.1</td>\n <td>356.2</td>\n <td>558.8</td>\n <td>654.4</td>\n <td>184977.0</td>\n <td>251388.0</td>\n <td>4512500</td>\n <td>193.6</td>\n <td>64.6</td>\n <td>340.6</td>\n </tr>\n <tr>\n <th>Tower Hamlets First</th>\n <td>304000</td>\n <td>1978</td>\n <td>153.7</td>\n <td>31.4</td>\n <td>38.6</td>\n <td>80.5</td>\n <td>70.4</td>\n <td>38461.0</td>\n <td>32657.0</td>\n <td>415000</td>\n <td>15.2</td>\n <td>7.2</td>\n <td>47.2</td>\n </tr>\n <tr>\n <th rowspan=\"4\" valign=\"top\">Outer London</th>\n <th>Cons</th>\n <td>1635500</td>\n <td>50783</td>\n <td>214.5</td>\n <td>228.8</td>\n <td>155.5</td>\n <td>165.8</td>\n <td>446.1</td>\n 
<td>193777.0</td>\n <td>193327.0</td>\n <td>2429975</td>\n <td>267.2</td>\n <td>44.0</td>\n <td>246.2</td>\n </tr>\n <tr>\n <th>Lab</th>\n <td>3207300</td>\n <td>58883</td>\n <td>622.5</td>\n <td>394.7</td>\n <td>449.6</td>\n <td>653.8</td>\n <td>796.8</td>\n <td>303738.0</td>\n <td>290438.0</td>\n <td>3918469</td>\n <td>384.3</td>\n <td>79.5</td>\n <td>424.2</td>\n </tr>\n <tr>\n <th>Lib Dem</th>\n <td>202600</td>\n <td>4385</td>\n <td>46.2</td>\n <td>38.9</td>\n <td>23.1</td>\n <td>16.1</td>\n <td>78.2</td>\n <td>36636.0</td>\n <td>28540.0</td>\n <td>320000</td>\n <td>32.0</td>\n <td>7.3</td>\n <td>42.6</td>\n </tr>\n <tr>\n <th>No Overall Control</th>\n <td>254300</td>\n <td>11235</td>\n <td>22.6</td>\n <td>40.3</td>\n <td>10.9</td>\n <td>17.0</td>\n <td>76.5</td>\n <td>36539.0</td>\n <td>27455.0</td>\n <td>287500</td>\n <td>59.3</td>\n <td>7.2</td>\n <td>43.1</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "<a id=\"visualise\"></a>\n## 7. Visualising Data\n\nPandas uses [`Matplotlib`](https://matplotlib.org/users/index.html) as the default for visualisations. \n\nImport the package and also add the magic line starting with `%` to output the charts within the notebook:"
},
{
"metadata": {},
"cell_type": "code",
"source": "import matplotlib.pyplot as plt\n\n%matplotlib inline",
"execution_count": 57,
"outputs": []
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs = boroughs.reset_index()",
"execution_count": 58,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "The default plot is a line chart that uses the index for the x-axis:"
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs['Employment rate (%)'].plot();",
"execution_count": 59,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 432x288 with 1 Axes>",
"image/png": "\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "To create a plot that makes more sense for this data have a look at the [documentation](https://pandas.pydata.org/pandas-docs/stable/user_guide/visualization.html) for all options. \n\nFor the above example, a histogram might work better. You can change the number of `bins` to get the desired output:"
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs['Employment rate (%)'].plot.hist(bins=10);",
"execution_count": 60,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 432x288 with 1 Axes>",
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXgAAAD4CAYAAADmWv3KAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAP3klEQVR4nO3dfYxldX3H8fcHFoS1IFoGNcp0xCrWGBUcsBYfwmoNiNJaW6vRRm11+2CN9CG6WlNtmiagrU+pqa5PRXyqoFAr1SJaNCYF3EVUZDFaXWRFBW0tioYV/PaPe5a9O7s7e3dmztzDb9+vZDL3nPvw+3Dv/j6ce+bcc1NVSJLac9C0A0iS+mHBS1KjLHhJapQFL0mNsuAlqVFrph1g3NFHH11zc3PTjiFJdxmbN2/+flXN7Om6QRX83NwcmzZtmnYMSbrLSHL93q5zF40kNcqCl6RGWfCS1CgLXpIaZcFLUqMseElqVK8Fn+SoJBckuS7JliSP6XM8SdJOfR8H/ybgE1X120kOBdb2PJ4kqdNbwSc5Eng88HyAqtoObO9rPEnSrvrcgj8OuBl4d5JHAJuBl1bVreM3SrIeWA8wOzvbYxxp6eY2XDyVcbeefcZUxlUb+twHvwY4EfinqjoBuBXYsPBGVbWxquaran5mZo+nU5AkLUGfBb8N2FZVV3TLFzAqfEnSKuit4Kvqu8ANSY7vVj0RuLav8SRJu+r7KJqXAO/rjqD5BvCCnseTJHV6LfiquhqY73MMSdKe+UlWSWqUBS9JjbLgJalRFrwkNcqCl6RGWfCS1CgLXpIaZcFLUqMseElqlAUvSY2y4CWpURa8JDXKgpekRlnwktQoC16SGmXBS1KjLHhJapQFL0mNsuAlqVEWvCQ1yoKXpEZZ8JLUKAtekhplwUtSoyx4SWrUmj4fPMlW4EfAHcDtVTXf53iSpJ16LfjOqVX1/VUYR5I0xl00ktSovrfgC7gkSQFvq6qNC2+QZD2wHmB2drbnOJImNbfh4qmMu/XsM6YyLrT339z3FvwpVXUicDrw4iSPX3iDqtpYVfNVNT8zM9NzHEk6cPRa8FV1Y/f7JuBC4OQ+x5Mk7dRbwSe5e5IjdlwGngxc09d4kqRd9bkP/t7AhUl2jPP+qvpEj+NJksb0VvBV9Q3gEX09viRpcR4mKUmNsuAlqVEWvCQ1yoKXpEZZ8JLUKAtekhplwUtSoyx4SWqUBS9JjbLgJalRFrwkNcqCl6RGWfCS1CgLXpIaZcFLUqMseElqlAUvSY2y4CWpURa8JDXKgpekRlnwktQoC16SGmXBS1KjLHhJapQFL0mNsuAlqVG9F3ySg5N8IcnH+h5LkrTTamzBvxTYsgrjSJLG9FrwSe4PnAG8o89xJEm7W9Pz478ReBlwxN5ukGQ9sB5gdna25zhtmdtw8VTG3Xr2GVMZ90A0rddYbehtCz7JU4GbqmrzYrerqo1VNV9V8zMzM33FkaQDTp+7aE4BzkyyFfggsC7Je3scT5I0preCr6pXVNX9q2oOeBbw6ap6bl/jSZJ25XHwktSovv/ICkBVXQZcthpjSZJGJtqCT/KwvoNIklbWpLto3prkyiR/kuSoXhNJklbERAVfVY8FngMcC2xK8v4kv95rMknSskz8R9aq+hrwKuDlwBOANye5Lslv9RVOkrR0k+6Df3iSNzA6p8w64GlV9Svd5Tf0mE+StESTHkXzj8DbgVdW1U93rKyqG5O8qpdkkqRlmbTgnwL8tKruAEhyEHBYVf2kqs7rLZ0kackm3Qd/KXD42PLabp0kaaAmLfjDqurHOxa6y2v7iSRJWgmTFvytSU7csZDkUcBPF7m9JGnKJt0HfxZwfpIbu+X7Ar/bTyRJ0kqYqOCr6vNJHgIcDwS4rqp+1msySdKy7M/Jxk4C5rr7nJCEqnpPL6kkScs2UcEnOQ94IHA1cEe3ugALXpIGatIt+HngoV
VVfYaRJK2cSY+iuQa4T59BJEkra9It+KOBa5NcCdy2Y2VVndlLKknSsk1a8K/pM4QkaeVNepjkZ5L8EvCgqro0yVrg4H6jSZKWY9LTBb8IuAB4W7fqfsBFfYWSJC3fpH9kfTFwCnAL3PnlH8f0FUqStHyTFvxtVbV9x0KSNYyOg5ckDdSkBf+ZJK8EDu++i/V84N/6iyVJWq5JC34DcDPwZeAPgX9n9P2skqSBmvQomp8z+sq+t/cbR5K0UiY9F8032cM+96o6bpH7HAZ8FrhbN84FVfXqJeaUJO2n/TkXzQ6HAb8D3Gsf97kNWFdVP05yCPC5JB+vqsuXkFOStJ8m2gdfVT8Y+/l2Vb0RWLeP+9TY1/wd0v145I0krZJJd9GcOLZ4EKMt+iMmuN/BwGbgl4G3VNUVe7jNemA9wOzs7CRx9mhuw8VLvu9ybD37jKmMO03Teq7hwHy+paWadBfNP4xdvh3YCjxzX3eqqjuARyY5CrgwycOq6poFt9kIbASYn593C1+SVsikR9GcupxBquqHSS4DTmN06mFJUs8m3UXz54tdX1Wv38N9ZoCfdeV+OPAk4JwlpZQk7bf9OYrmJOCj3fLTGB0CecMi97kvcG63H/4g4ENV9bGlBpUk7Z/9+cKPE6vqRwBJXgOcX1Uv3NsdqupLwAnLTihJWpJJT1UwC2wfW94OzK14GknSipl0C/484MokFzI6lv3pwHt6SyVJWrZJj6L5uyQfBx7XrXpBVX2hv1iSpOWadBcNwFrglqp6E7AtyQN6yiRJWgGTfmXfq4GXA6/oVh0CvLevUJKk5Zt0C/7pwJnArQBVdSMTnKpAkjQ9kxb89qoqupOFJbl7f5EkSSth0oL/UJK3AUcleRFwKX75hyQN2j6PokkS4F+AhwC3AMcDf11Vn+w5myRpGfZZ8FVVSS6qqkcBlrok3UVMuovm8iQn9ZpEkrSiJv0k66nAHyXZyuhImjDauH94X8EkScuzaMEnma2qbwGnr1IeSdIK2dcW/EWMziJ5fZIPV9UzViOUJGn59rUPPmOXj+sziCRpZe2r4GsvlyVJA7evXTSPSHILoy35w7vLsPOPrEf2mk6StGSLFnxVHbxaQSRJK2t/ThcsSboLseAlqVEWvCQ1yoKXpEZZ8JLUKAtekhplwUtSoyx4SWpUbwWf5Ngk/5lkS5KvJHlpX2NJknY36fngl+J24C+q6qokRwCbk3yyqq7tcUxJUqe3Lfiq+k5VXdVd/hGwBbhfX+NJkna1Kvvgk8wBJwBX7OG69Uk2Jdl08803r0YcSTog9F7wSX4B+DBwVlXdsvD6qtpYVfNVNT8zM9N3HEk6YPRa8EkOYVTu76uqj/Q5liRpV30eRRPgncCWqnp9X+NIkvaszy34U4DfA9Ylubr7eUqP40mSxvR2mGRVfY5dv9NVkrSK/CSrJDXKgpekRlnwktQoC16SGmXBS1KjLHhJapQFL0mNsuAlqVEWvCQ1yoKXpEZZ8JLUKAtekhplwUtSoyx4SWqUBS9JjbLgJalRFrwkNcqCl6RGWfCS1CgLXpIaZcFLUqMseElqlAUvSY2y4CWpURa8JDWqt4JP8q4kNyW5pq8xJEl71+cW/D8Dp/X4+JKkRfRW8FX1WeB/+np8SdLi1kw7QJL1wHqA2dnZKafZf3MbLp52hAOKz3f7fI1XztT/yFpVG6tqvqrmZ2Zmph1Hkpox9YKXJPXDgpekRvV5mOQHgP8Cjk+yLckf9DWWJGl3vf2Rtaqe3ddjS5L2zV00ktQoC16SGmXBS1KjLHhJapQFL0mNsuAlqVEWvCQ1yoKXpEZZ8JLUKAtekhplwUtSoyx4SWqUBS9JjbLgJalRFrwkNcqCl6RGWfCS1CgLXpIaZcFLUqMseElqlAUvSY2y4CWpURa8JDXKgpekRlnwktQoC16SGtVrwSc5LclXk3w9yYY+x5Ik7aq3gk9yMPAW4HTgocCzkzy0r/EkSbvqcwv+ZODrVfWNqt
oOfBD4jR7HkySNWdPjY98PuGFseRvw6IU3SrIeWN8t/jjJV7vLRwPf7zHfcg09Hww/49DzgRlXwtDzwZQz5px93mSxfL+0tzv1WfDZw7rabUXVRmDjbndONlXVfB/BVsLQ88HwMw49H5hxJQw9Hww/41Lz9bmLZhtw7Njy/YEbexxPkjSmz4L/PPCgJA9IcijwLOCjPY4nSRrT2y6aqro9yZ8C/wEcDLyrqr6yHw+x226bgRl6Phh+xqHnAzOuhKHng+FnXFK+VO22W1yS1AA/ySpJjbLgJalRgyj4JEcluSDJdUm2JHnM2HV/maSSHD3EjEle0p2O4StJXjukfEkemeTyJFcn2ZTk5CnmO77LsePnliRnJblXkk8m+Vr3+54Dy/e67jn9UpILkxw1jXyLZRy7fupzZbGMQ5gri7zOg5krXc4/656na5J8IMlhS5orVTX1H+Bc4IXd5UOBo7rLxzL6I+31wNFDywicClwK3K1bf8zA8l0CnN6tewpw2bRf6y7LwcB3GX1A47XAhm79BuCcgeV7MrCmW3/OEPItzNgtD2au7OV5HMxc2Uu+wcwVRh8S/SZweLf8IeD5S5krU9+CT3Ik8HjgnQBVtb2qfthd/QbgZezhA1KraZGMfwycXVW3detvGli+Ao7sbnYPhvM5hCcC/11V1zM6fcW53fpzgd+cWqqd7sxXVZdU1e3d+ssZfZ5jCMafQxjIXFlgPOMg5soC4/mGNlfWAIcnWQOs7fLs91yZesEDxwE3A+9O8oUk70hy9yRnAt+uqi9OOR/sJSPwYOBxSa5I8pkkJw0s31nA65LcAPw98Iop5VvoWcAHusv3rqrvAHS/j5laqp3G8437feDjq5xlb+7MOLC5Mm78eRzKXBk3nm8wc6Wqvt1l+BbwHeD/quoSljJXBvA2aR64HXh0t/wm4HXAFcA9unVbmeLbzr1k/FvgGuDNjE7LcDKjt1UZUL43A8/o1j0TuHQAr/ehjM6pce9u+YcLrv/fIeUbW/9XwIXTeH0Xy8ho624wc2WR13kQc2WRfIOZK8A9gU8DM8AhwEXAc5cyV4awBb8N2FZVV3TLFwAnAg8AvphkK6O3xVcluc90Iu414zbgIzVyJfBzRicFGkq+5wEf6dadz2hiTdvpwFVV9b1u+XtJ7gvQ/Z72W/eF+UjyPOCpwHOqm1lTNp7xgQxrruyw8HkcylzZYWG+Ic2VJwHfrKqbq+pnjHL9GkuYK1Mv+Kr6LnBDkuO7VU9k9MQfU1VzVTXH6B/Hid1th5LxWkb/Z10HkOTB7NwqGEq+G4EndOvWAV9b7Wx78Gx23f3xUUaTi+73v656ol3tki/JacDLgTOr6idTS7WrOzNW1ZeHNFfGLHydBzFXxizMN6S58i3gV5OsTRJG83kLS5kr03obsuCtxiOBTcCXGP1DuOeC67cy/aNodsvI6B/pexm9/bwKWDewfI8FNgNfZPQ2/lFTfg7XAj+g253QrftF4FOMJtSngHsNLN/XGZ32+uru561Dew4XXD+EubKn53FIc2VP+YY2V/4GuK57vs4D7raUueKpCiSpUVPfRSNJ6ocFL0mNsuAlqVEWvCQ1yoKXpEZZ8JLUKAtekhr1/zOsOPLH1GQ5AAAAAElFTkSuQmCC\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Change the size of the histogram with the `figsize` option:"
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs['Employment rate (%)'].plot.hist(bins=15,figsize=(10,5));",
"execution_count": 61,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 720x360 with 1 Axes>",
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAlcAAAEvCAYAAABoouS1AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAASWklEQVR4nO3de5Cld1kn8O9DBjYJgqCZKELaBi+h4pYkYWBRdJWgW4FovJUrlG5ZeBkv7JawqxDUcvGPrYrrei2tXQOiiIomCMgaUEIhoFUCJgE0kFggDiSEFRAxgJgYePaPPkPONN0zJ8nv7XNO9+dT1dXnvOfyPv306d985/feqrsDAMAY91l2AQAA+4lwBQAwkHAFADCQcAUAMJBwBQAwkHAFADDQoWUXMO+ss87qzc3NZZcBAHBK11133Ye6+/D25SsVrjY3N3PttdcuuwwAgFOqqvfstNxmQQCAgYQrAICBhCsAgIGEKwCAgYQrAICBhCsAgIGEKwCAgSY9z1VVHUvy0SSfTHJndx+Zcn0AAMu2FycRfUJ3f2gP1gMAsHQ2CwIADDR1uOokr66q66rq6MTrAgBYuqk3Cz6+u2+tqrOTXFNVN3X3G+afMAtdR5NkY2Nj4nIA7r7Ny65edglJkmOXX7LsEoAFTDpz1d23zr5/IMnLkjx2h+dc0d1HuvvI4cOfcWFpAIC1Mlm4qqr7V9UDjt9O8h+S3DDV+gAAVsGUmwU/L8nLqur4en63u/94wvUBACzdZOGqu9+d5FFTvT8AwCpyKgYAgIGEKwCAgYQrAICBhCsAgIGEKwCAgYQrAICBhCsAgIGEKwCAgYQrAICBhCsAgIGEKwCAgYQrAICBhCsAgIGEKwCAgYQrAICBhCsAgIGEKwCAgYQrAICBhCsAgIGEKwCAgYQrAICBhCsAgIGEKwCAgYQrAICBhCsAgIGEKwCAgYQrAICBhCsAgIGEKwCAgYQrAICBhCsAgIGEKwCAgYQrAICBhCsAgIGEKwCAgYQrAICBhCsAgIGEKwCAgYQrAICBhCsAgIGEKwCAgYQrAICBhCsAgIEmD1dVdVpVvaWq/mjqdQEALNtezFz9SJIb92A9AABLN2m4qqqHJbkkyfOnXA8AwKqYeubqF5M8K8mnJl4PAMBKODTVG1fVNyT5QHdfV1Vfe5LnHU1yNEk2NjamKgdg7W1edvWyS0iSHLv8kmWXACttypmrxye5tKqOJfm9JBdV1W9vf1J3X9HdR7r7yOHDhycsBwBgepOFq+5+Tnc/rLs3kzwlyWu7+7umWh8AwCpwnisAgIEm2+dqXne/Lsnr9mJdAADLZOYKAGAg4QoAYCDhCgBgIOEKAGAg4QoAYCDhCgBgIOEKAGAg4QoAYCDhCgBgIOEKAGAg4QoAYCDhCgBgIOEKAGAg4QoAYCDhCgBgIOEKAGAg4QoAYCDhCgBgIOEKAGAg4QoAYCDhCgBgIOEKAGAg4QoAYCDhCgBgIOEKAGAg4QoAYCDhCgBgIOEKAGAg4QoAYCDhCgBgIOEKAGAg4QoAYCDhCgBgIOEKAGAg4QoAYCDhCgBgIOEKAGAg4QoAYCDhCgBgIOEKAGAg4QoAYCDhCgBgoIXCVVX926kLAQDYDxadufo/VfXmqvrhqnrQIi+oqtNnr3lbVb29qn76XtQJALAWFgpX3f1VSb4zyTlJrq2q362qrz/Fy25PclF3PyrJ+UkurqrH3atqAQBW3ML7XHX3O5P8ZJJnJ/maJL9cVTdV1bfu8vzu7o/N7t539tX3sl4AgJV2aJEnVdWXJ3lakkuSXJPkG7v7+qr6giR/keSlu7zutCTXJfniJL/a3W/a4TlHkxxNko2NjXvyM6ylzcuuXnYJSZJjl1+y7BJWpherYhV+J3Ayq/I3629l9fhsbFl05upXklyf5FHd/fTuvj5JuvvWbM1m7ai7P9nd5yd5WJLH7rRjfHdf0d1HuvvI4c
OH7/5PAACwQhaauUry5CSf6O5PJklV3SfJ6d39z939olO9uLs/UlWvS3JxkhvuabEAAKtu0Zmr1yQ5Y+7+mbNlu6qqw8ePLKyqM5J8XZKb7kmRAADrYtGZq9Pndk5Pd3+sqs48xWsekuSFs/2u7pPkyu7+o3tYJwDAWlg0XH28qi48vq9VVT06ySdO9oLu/qskF9zL+gAA1sqi4eoZSa6qqltn9x+S5DumKQkAYH0tFK66+y+r6pFJzk1SSW7q7n+dtDIAgDW06MxVkjwmyebsNRdUVbr7tyapCgBgTS16EtEXJfmiJG9N8snZ4k4iXAEAzFl05upIkvO62+VrAABOYtHzXN2Q5POnLAQAYD9YdObqrCTvqKo3J7n9+MLuvnSSqgAA1tSi4eq5UxYBALBfLHoqhtdX1Rcm+ZLufs3s7OynTVsaAMD6WWifq6r6/iQvSfJrs0UPTfLyqYoCAFhXi+7Q/vQkj09yW5J09zuTnD1VUQAA62rRcHV7d99x/E5VHcrWea4AAJizaLh6fVX9eJIzqurrk1yV5P9OVxYAwHpaNFxdluSDSf46yQ8keWWSn5yqKACAdbXo0YKfSvK82RcAALtY9NqCf5cd9rHq7kcMrwgAYI3dnWsLHnd6km9P8jnjywEAWG8L7XPV3f8w9/W+7v7FJBdNXBsAwNpZdLPghXN375OtmawHTFIRAMAaW3Sz4M/N3b4zybEk/3F4NQAAa27RowWfMHUhAAD7waKbBf/ryR7v7p8fUw4AwHq7O0cLPibJK2b3vzHJG5LcPEVRAADratFwdVaSC7v7o0lSVc9NclV3f99UhQEArKNFL3+zkeSOuft3JNkcXg0AwJpbdObqRUneXFUvy9aZ2r8lyW9NVhUAwJpa9GjB/1FVr0ry1bNFT+vut0xXFgDAelp0s2CSnJnktu7+pSS3VNXDJ6oJAGBtLRSuquq/J3l2kufMFt03yW9PVRQAwLpadObqW5JcmuTjSdLdt8blbwAAPsOi4eqO7u5s7cyeqrr/dCUBAKyvRcPVlVX1a0keVFXfn+Q1SZ43XVkAAOvplEcLVlUl+f0kj0xyW5Jzk/xUd18zcW0AAGvnlOGqu7uqXt7dj04iUAEAnMSimwXfWFWPmbQSAIB9YNEztD8hyQ9W1bFsHTFY2ZrU+vKpCgMAWEcnDVdVtdHd703ypD2qBwBgrZ1q5urlSS7s7vdU1R9097ftRVEAAOvqVPtc1dztR0xZCADAfnCqcNW73AYAYAen2iz4qKq6LVszWGfMbid37dD+wEmrAwBYMycNV9192l4VAgCwHyx6nqu7rarOqao/raobq+rtVfUjU60LAGBVLHqeq3viziT/rbuvr6oHJLmuqq7p7ndMuE4AgKWabOaqu9/f3dfPbn80yY1JHjrV+gAAVsFk4WpeVW0muSDJm/ZifQAAyzLlZsEkSVV9VpI/SPKM7r5th8ePJjmaJBsbG1OXA8A+sXnZ1csuIUly7PJLll1CktXpBxPPXFXVfbMVrH6nu1+603O6+4ruPtLdRw4fPjxlOQAAk5vyaMFK8utJbuzun59qPQAAq2TKmavHJ/lPSS6qqrfOvp484foAAJZusn2uuvvPc+K1CQEA9r09OVoQAOCgEK4AAAYSrgAABhKuAAAGEq4AAAYSrgAABhKuAAAGEq4AAAYSrgAABhKuAAAGEq4AAAYSrgAABhKuAAAGEq4AAAYSrgAABhKuAAAGEq4AAAYSrgAABhKuAAAGEq4AAAYSrgAABhKuAAAGEq4AAAYSrgAABhKuAAAGEq4AAAYSrgAABhKuAAAGEq4AAAYSrgAABhKuAAAGEq4AAAYSrgAABhKuAAAGEq4AAAYSrgAABhKuAAAGEq4AAAYSrgAABhKuAAAGEq4AAAYSrgAABhKuAAAGmixcVdULquoDVXXDVOsAAFg1U85c/WaSiyd8fwCAlTNZuOruNyT58FTvDwCwiuxzBQAw0KFlF1BVR5McTZKNjY
3J17d52dWTrwPuDZ9RWC/+Ztlu6TNX3X1Fdx/p7iOHDx9edjkAAPfK0sMVAMB+MuWpGF6c5C+SnFtVt1TV9061LgCAVTHZPlfd/dSp3hsAYFXZLAgAMJBwBQAwkHAFADCQcAUAMJBwBQAwkHAFADCQcAUAMJBwBQAwkHAFADCQcAUAMJBwBQAwkHAFADCQcAUAMJBwBQAwkHAFADCQcAUAMJBwBQAwkHAFADCQcAUAMJBwBQAwkHAFADCQcAUAMJBwBQAwkHAFADCQcAUAMJBwBQAwkHAFADCQcAUAMJBwBQAwkHAFADCQcAUAMJBwBQAwkHAFADCQcAUAMJBwBQAwkHAFADCQcAUAMJBwBQAwkHAFADCQcAUAMJBwBQAwkHAFADCQcAUAMNCk4aqqLq6qv6mqd1XVZVOuCwBgFUwWrqrqtCS/muRJSc5L8tSqOm+q9QEArIIpZ64em+Rd3f3u7r4jye8l+aYJ1wcAsHRThquHJrl57v4ts2UAAPvWoQnfu3ZY1p/xpKqjSY7O7n6sqv5m21POSvKhwbWtq+G9qJ8Z+W57zmfjRPpxF704kX7cRS9OtC/7cS/+bbu7/fjCnRZOGa5uSXLO3P2HJbl1+5O6+4okV+z2JlV1bXcfGV/e+tGLE+nHifTjLnpxIv24i16cSD9ONKofU24W/MskX1JVD6+q+yV5SpJXTLg+AIClm2zmqrvvrKr/nORPkpyW5AXd/fap1gcAsAqm3CyY7n5lklfey7fZdZPhAaQXJ9KPE+nHXfTiRPpxF704kX6caEg/qvsz9jEHAOAecvkbAICBVipcVdWDquolVXVTVd1YVV8x99iPVlVX1VnLrHEv7daPqvovs8sKvb2q/uey69wLO/Wiqs6vqjdW1Vur6tqqeuyy69wLVXXu7Gc+/nVbVT2jqj6nqq6pqnfOvj942bXuhZP042dnn5e/qqqXVdWDll3r1HbrxdzjB2ocPVk/Dug4utvfykEdS585+/3fUFUvrqrTR42jK7VZsKpemOTPuvv5syMMz+zuj1TVOUmen+SRSR7d3fvunBw72akfSS5I8hNJLunu26vq7O7+wFIL3QO79OLKJL/Q3a+qqicneVZ3f+0y69xrs8tMvS/Jv0vy9CQf7u7LZ9fyfHB3P3upBe6xbf04N8lrZwfX/EySHKR+zPeiu99zUMfR47Z9Nh6RAziOztvWj+flgI2lVfXQJH+e5Lzu/kRVXZmtfcTPy4BxdGVmrqrqgUn+fZJfT5LuvqO7PzJ7+BeSPCs7nIR0vzpJP34oyeXdffts+b4fEE7Si07ywNnTPjs7nEftAHhikr/t7vdk6/JSL5wtf2GSb15aVcvz6X5096u7+87Z8jdm61x7B8n8ZyM5gOPoNvP9OHDj6A7m+3FQx9JDSc6oqkPZ+g/7rRk0jq5MuMrW/yQ+mOQ3quotVfX8qrp/VV2a5H3d/bYl17fXduxHki9N8tVV9aaqen1VPWa5Ze6J3XrxjCQ/W1U3J/lfSZ6zzCKX5ClJXjy7/Xnd/f4kmX0/e2lVLc98P+Z9T5JX7XEty/bpXhzgcXTe/GfjII6j283348CNpd39vmz9rO9N8v4k/9Tdr86gcXSVwtWhJBcm+d/dfUGSjyd5brambn9qiXUty079uGy2/MFJHpfkx5JcWVU7XWpoP9mtFz+U5JndfU6SZ2Y2s3VQzDaPXprkqmXXsgp260dV/USSO5P8zjLqWob5XlTVmTm442iSHT8bB3Ec/bQd+nHgxtLZvlTflOThSb4gyf2r6rtGvf8qhatbktzS3W+a3X9Jtv5BfXiSt1XVsWxN619fVZ+/nBL31G79uCXJS3vLm5N8KlvXQtrPduvFdyd56WzZVUkOxE6Yc56U5Pru/vvZ/b+vqockyez7QdvUsb0fqarvTvINSb6zV2kH0+nN9+KLcnDH0eO2fzYO4jg6b3s/DuJY+nVJ/q67P9jd/5qtn/8rM2gcXZlw1d3/L8nNVXXubN
ETs/XLP7u7N7t7M1t/EBfOnruv7dKPdyR5eZKLkqSqvjTJ/bIPL7o57yS9uDXJ18yWXZTknUsob5memhM3gb0iW4NkZt//cM8rWq4T+lFVFyd5dpJLu/ufl1bVcny6F9391wd1HJ2z/W/lwI2j22zvx0EcS9+b5HFVdeZs1vKJSW7MoHF01Y4WPD9bR7PcL8m7kzytu/9x7vFjSY4clKNcdupHtjaJvSDJ+UnuSPKj3f3apRW5R3bpxZcl+aVsTfH/S5If7u7rllbkHppt6rk5ySO6+59myz43W0dQbmRr4Pj27v7w8qrcO7v0411J/k2Sf5g97Y3d/YNLKnHP7NSLbY8fy8EaR3f6bNwvB3AcTXbtx1flAI6lVfXTSb4jW7sNvCXJ9yX5rAwYR1cqXAEArLuV2SwIALAfCFcAAAMJVwAAAwlXAAADCVcAAAMJVwAAAwlXAAADCVcAAAP9f3ms7/o7DJP9AAAAAElFTkSuQmCC\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Within the plot command you can select the data directly. The below histogram shows the Employment rate for Outer London only:"
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs['Employment rate (%)'][boroughs['Inner/Outer']=='Outer London'].plot.hist(bins=15,figsize=(10,5));",
"execution_count": 62,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 720x360 with 1 Axes>",
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAmEAAAEvCAYAAAANTxbKAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAVZUlEQVR4nO3df9BldX0f8PdHFov4o5juWinysJohWsykgitiTRqicUbQQtOaBieNGTvJFmI60cZGYhxN/uiMaRptHDJu8Ecj1pr6k9IKk+gkNfoHKiCgiI5bg7IuVaMTEKESzKd/3At5eHj22euyZ793n+f1mrnznPM933vvZ75779n3nO8551Z3BwCAI+thowsAANiKhDAAgAGEMACAAYQwAIABhDAAgAGEMACAAbaNLuD7tX379t65c+foMgAADuraa6/9y+7esd62oy6E7dy5M9dcc83oMgAADqqqvnygbaYjAQAGEMIAAAYQwgAABhDCAAAGEMIAAAYQwgAABhDCAAAGmCyEVdVxVfXJqrqhqm6qqt9ap09V1Zuqam9V3VhVZ0xVDwDAMpnyZq3fTfKc7r6zqo5N8vGquqq7r17V55wkp84fz0zy5vlfAIBNbbIjYT1z53z12Pmj13Q7P8ll875XJzmhqk6cqiYAgGUx6TlhVXVMVV2f5OtJPtzdn1jT5aQkt65a3zdvAwDY1Cb97cju/l6Sp1XVCUk+WFU/3N2fXdWl1nva2oaq2p1kd5KsrKxMUitw9Np58YdGl5AkueX1LxhdwlLx7wIbOyJXR3b3XyX530mev2bTviQnr1p/QpL96zz/0u7e1d27duxY94fIAQCOKlNeHbljfgQsVfWIJD+Z5PNrul2R5CXzqyTPSnJ7d982VU0AAMtiyunIE5O8o6qOySzsvae7/1dVXZgk3b0nyZVJzk2yN8ldSV46YT0AAEtjshDW3TcmOX2d9j2rljvJy6aqAQBgWbljPgDAAEIYAMAAQhgAwABCGADAAEIYAMAAQhgAwABCGADAAEIYAMAAQhgAwABCGADAAEIYAMAAQhgAwABCGADAAEIYAMAAQhgAwABCGADAAEIYAMAAQhgAwABCGADAAEIYAMAAQhgAwABCGADAAEIYAMAAQhgAwABCGADAAEIYAMAAQhgAwABCGADAAEIYAMAAQhgAwABCGADAAEIYAMAAQhgAwABCGADAAJOFsKo6uar+rKpurqqbqupX1ulzdlXdXlXXzx+vnaoeAIBlsm3C1743ya9293VV9egk11bVh7v7c2v6fay7XzhhHQAAS2eyI2HdfVt3Xzdf/naSm5OcNNX7AQAcTY7IOWFVtTPJ6Uk+sc7mZ1XVDVV1VVU99UjUAwAw2pTTkUmSqnpUkvcneXl337Fm83VJTunuO6vq3CSXJzl1ndfYnWR3kqysrExcMQDA9CY9ElZVx2YWwN7V3R9Yu7277+juO+fLVyY5tqq2r9Pv0u7e1d27duzYMWXJAABHxJRXR1aStyW5ubvfcIA+j5/3S1WdOa/nm1PVBACwLKacjnx2kp9L8pmqun7e9uokK0nS3XuSvCjJRVV1b5K7k1zQ3T1hTQAAS2GyENbdH09SB+lzSZJLpqoBAGBZuWM+AMAAQhgAwABCGADAAEIYAMAAQhgAwABCGADAAEIYAMAAQhgAwABCGADAAEIYAMAAQhgAwABCGADAAEIYAMAAQhgAwABCGADAAEIYAMAAQhgAwABCGADAAEIYAMAAQhgAwABCGADAAEIYAMAAQhgAwABCGADAAEIYAMAAQhgAwABCGADAAEIYAMAAQhgAwABCGADAAEIYAMAAQhgAwABCGADAAEIYAMAAk4Wwqjq5qv6sqm6uqpuq6lfW6VNV9aaq2ltVN1bVGVPVAwCwTLZN+Nr3JvnV7r6uqh6d5Nqq+nB3f25Vn3OSnDp/PDPJm+d/AQA2tcmOhHX3bd193Xz520
luTnLSmm7nJ7msZ65OckJVnThVTQAAy+KInBNWVTuTnJ7kE2s2nZTk1lXr+/LgoAYAsOlMOR2ZJKmqRyV5f5KXd/cdazev85Re5zV2J9mdJCsrK4e9Rja28+IPjS4hSXLL618wugTY0LJ8V2DZLct3ZfT/K5MeCauqYzMLYO/q7g+s02VfkpNXrT8hyf61nbr70u7e1d27duzYMU2xAABH0JRXR1aStyW5ubvfcIBuVyR5yfwqybOS3N7dt01VEwDAsphyOvLZSX4uyWeq6vp526uTrCRJd+9JcmWSc5PsTXJXkpdOWA8AwNKYLIR198ez/jlfq/t0kpdNVQMAwLJyx3wAgAGEMACAAYQwAIABhDAAgAGEMACAAYQwAIABhDAAgAGEMACAARYKYVX1w1MXAgCwlSx6JGxPVX2yqn6pqk6YtCIAgC1goRDW3T+a5GeTnJzkmqr6b1X1vEkrAwDYxBY+J6y7v5jkNUleleTHk7ypqj5fVf98quIAADarRc8J+5GqemOSm5M8J8k/7e5/OF9+44T1AQBsStsW7HdJkrckeXV3331fY3fvr6rXTFIZAMAmtmgIOzfJ3d39vSSpqoclOa677+rud05WHQDAJrXoOWEfSfKIVevHz9sAADgEi4aw47r7zvtW5svHT1MSAMDmt2gI+05VnXHfSlU9PcndG/QHAGADi54T9vIk762q/fP1E5P8zDQlAQBsfguFsO7+VFU9JcmTk1SSz3f3X09aGQDAJrbokbAkeUaSnfPnnF5V6e7LJqkKAGCTWyiEVdU7k/xgkuuTfG/e3EmEMACAQ7DokbBdSU7r7p6yGACArWLRqyM/m+TxUxYCALCVLHokbHuSz1XVJ5N8977G7j5vkqoAADa5RUPYb05ZBADAVrPoLSo+WlWnJDm1uz9SVccnOWba0gAANq+Fzgmrql9M8r4kfzBvOinJ5VMVBQCw2S16Yv7Lkjw7yR1J0t1fTPK4qYoCANjsFg1h3+3ue+5bqaptmd0nDACAQ7BoCPtoVb06ySOq6nlJ3pvkf05XFgDA5rZoCLs4yTeSfCbJv0lyZZLXTFUUAMBmt+jVkX+T5C3zBwAAD9Givx35F1nnHLDuftJhrwgAYAv4fn478j7HJfnpJD9w+MsBANgaFjonrLu/uerx1e7+z0mes9FzqurtVfX1qvrsAbafXVW3V9X188drD6F+AICj0qLTkWesWn1YZkfGHn2Qp/1hkkuSXLZBn4919wsXqQEAYDNZdDryd1ct35vkliT/cqMndPefV9XOQ6oKAGCTW/TqyJ+Y6P2fVVU3JNmf5JXdfdN6napqd5LdSbKysjJRKQAAR86i05H/bqPt3f2GQ3jv65Kc0t13VtW5mf0W5akHeP1Lk1yaJLt27XKnfgDgqLfozVp3Jbkosx/uPinJhUlOy+y8sIOdG7au7r6ju++cL1+Z5Niq2n4orwUAcLRZ9Jyw7UnO6O5vJ0lV/WaS93b3LxzqG1fV45N8rbu7qs7MLBB+81BfDwDgaLJoCFtJcs+q9XuS7NzoCVX17iRnJ9leVfuSvC7JsUnS3XuSvCjJRVV1b5K7k1zQ3aYaAYAtYdEQ9s4kn6yqD2Z25/yfysa3nkh3v/gg2y/J7BYWAABbzqJXR/6HqroqyY/Nm17a3Z+eriwAgM1t0RPzk+T4JHd09+8l2VdVT5yoJgCATW+hEFZVr0vyqiS/Pm86Nsl/naooAIDNbtEjYT+V5Lwk30mS7t6fQ7w1BQAAi4ewe+ZXLnaSVNUjpysJAGDzWzSEvaeq/iDJCVX1i0k+kuQt05UFALC5HfTqyKqqJP89yVOS3JHkyUle290fnrg2AIBN66AhbH5H+8u7++lJBC8AgMNg0enIq6vqGZNWAgCwhSx6x/yfSHJhVd2S2RWSldlBsh+ZqjAAgM1swxBWVSvd/ZUk5xyhegAAtoSDHQm7PMkZ3f3lqnp/d/+LI1EUAMBmd7BzwmrV8pOmLAQAYCs5WA
jrAywDAPAQHGw68h9V1R2ZHRF7xHw5+dsT8x8zaXUAAJvUhiGsu485UoUAAGwli94nDACAw0gIAwAYQAgDABhACAMAGEAIAwAYQAgDABhACAMAGEAIAwAYQAgDABhACAMAGEAIAwAYQAgDABhACAMAGEAIAwAYQAgDABhACAMAGEAIAwAYQAgDABhgshBWVW+vqq9X1WcPsL2q6k1VtbeqbqyqM6aqBQBg2Ux5JOwPkzx/g+3nJDl1/tid5M0T1gIAsFQmC2Hd/edJvrVBl/OTXNYzVyc5oapOnKoeAIBlMvKcsJOS3Lpqfd+8DQBg09s28L1rnbZet2PV7symLLOysjJlTffbefGHjsj7bOSW179gdAmsYxk+G8lyfD6WZSxgI8vwOV2G72uyHGPB3xp5JGxfkpNXrT8hyf71Onb3pd29q7t37dix44gUBwAwpZEh7IokL5lfJXlWktu7+7aB9QAAHDGTTUdW1buTnJ1ke1XtS/K6JMcmSXfvSXJlknOT7E1yV5KXTlULAMCymSyEdfeLD7K9k7xsqvcHAFhm7pgPADCAEAYAMIAQBgAwgBAGADCAEAYAMIAQBgAwgBAGADCAEAYAMIAQBgAwgBAGADCAEAYAMIAQBgAwgBAGADCAEAYAMIAQBgAwgBAGADCAEAYAMIAQBgAwgBAGADCAEAYAMIAQBgAwgBAGADCAEAYAMIAQBgAwgBAGADCAEAYAMIAQBgAwgBAGADCAEAYAMIAQBgAwgBAGADCAEAYAMIAQBgAwgBAGADDApCGsqp5fVV+oqr1VdfE628+uqtur6vr547VT1gMAsCy2TfXCVXVMkt9P8rwk+5J8qqqu6O7Pren6se5+4VR1AAAsoymPhJ2ZZG93f6m770nyR0nOn/D9AACOGlOGsJOS3Lpqfd+8ba1nVdUNVXVVVT11wnoAAJbGZNORSWqdtl6zfl2SU7r7zqo6N8nlSU590AtV7U6yO0lWVlYOd50AAEfclEfC9iU5edX6E5LsX92hu+/o7jvny1cmObaqtq99oe6+tLt3dfeuHTt2TFgyAMCRMWUI+1SSU6vqiVX18CQXJLlidYeqenxV1Xz5zHk935ywJgCApTDZdGR331tVv5zkj5Mck+Tt3X1TVV04374nyYuSXFRV9ya5O8kF3b12yhIAYNOZ8pyw+6YYr1zTtmfV8iVJLpmyBgCAZeSO+QAAAwhhAAADCGEAAAMIYQAAAwhhAAADCGEAAAMIYQAAAwhhAAADCGEAAAMIYQAAAwhhAAADCGEAAAMIYQAAAwhhAAADCGEAAAMIYQAAAwhhAAADCGEAAAMIYQAAAwhhAAADCGEAAAMIYQAAAwhhAAADCGEAAAMIYQAAAwhhAAADCGEAAAMIYQAAAwhhAAADCGEAAAMIYQAAAwhhAAADCGEAAAMIYQAAA0wawqrq+VX1haraW1UXr7O9qupN8+03VtUZU9YDALAsJgthVXVMkt9Pck6S05K8uKpOW9PtnCSnzh+7k7x5qnoAAJbJlEfCzkyyt7u/1N33JPmjJOev6XN+kst65uokJ1TViRPWBACwFKYMYScluXXV+r552/fbBwBg09k24WvXOm19CH1SVbszm65Mkjur6gsPsbaHanuSv5z6Teq3p36HSR32MTrKx+NADnmcNul4rOeIfN82AeO0mCHjdJR9X7fMZ+kh/rssOk6nHGjDlCFsX5KTV60/Icn+Q+iT7r40yaWHu8BDVVXXdPeu0XUsM2O0GON0cMZoMcZpMcbp4IzRYg7HOE05HfmpJKdW1ROr6uFJLkhyxZo+VyR5yfwqybOS3N7dt01YEwDAUpjsSFh331tVv5zkj5Mck+Tt3X1TVV04374nyZVJzk2yN8ldSV46VT0AAMtkyunIdPeVmQWt1W17Vi13kpdNWcNElmZqdIkZo8UYp4MzRosxTosxTgdnjBbzkMepZjkIAIAjyc8WAQAMIIQdRFWdUFXvq6rPV9XNVfWsefu/nf
8k001V9R9H1znaeuNUVU+rqqur6vqquqaqzhxd5yhV9eT5ONz3uKOqXl5VP1BVH66qL87/PnZ0rSNtME6/M/9s3VhVH6yqE0bXOsqBxmjV9ldWVVfV9pF1jrbRONl/z2zwfbPvXqOqXjH/vHy2qt5dVccdjv236ciDqKp3JPlYd791fpXn8UlOT/IbSV7Q3d+tqsd199eHFjrYAcbpPUne2N1XVdW5SX6tu88eWecymP+k11eTPDOzcyK/1d2vn/++6mO7+1VDC1wSa8bpyUn+dH7Bz28niXF64Bh195er6uQkb03ylCRP7+4tca+ng1nzWXpS7L8fZM0YvSX23ferqpOSfDzJad19d1W9J7Pz3U/LQ9x/OxK2gap6TJJ/kuRtSdLd93T3XyW5KMnru/u78/Yt/QXeYJw6yWPm3f5u1rkH3Bb13CT/p7u/nNlPd71j3v6OJP9sWFXL5/5x6u4/6e575+1XZ3ZPQR74WUqSNyb5taxz0+stbvU42X+vb/UY2Xc/2LYkj6iqbZkdZNifw7D/FsI29qQk30jyX6rq01X11qp6ZJIfSvJjVfWJqvpoVT1jbJnDHWicXp7kd6rq1iT/KcmvjyxyiVyQ5N3z5b9/373x5n8fN6yq5bN6nFb710muOsK1LKv7x6iqzkvy1e6+YWxJS2n1Z8n+e32rx8i+e5Xu/mpm4/CVJLdldk/TP8lh2H8LYRvbluSMJG/u7tOTfCfJxfP2xyY5K8m/T/KeqlrvJ5i2igON00VJXtHdJyd5ReZHyray+VTteUneO7qWZXagcaqq30hyb5J3jahrmaweo6o6PrMptteOrWr5rPNZsv9eY50xsu9eZX6u1/lJnpjkHyR5ZFX9q8Px2kLYxvYl2dfdn5ivvy+zsLEvyQd65pNJ/iaz35Daqg40Tj+f5APztvcm2fIndyY5J8l13f21+frXqurEJJn/NTUys3acUlU/n+SFSX62ncyaPHCMfjCz/yBuqKpbMpuuva6qHj+wvmWx9rNk//1ga8fIvvuBfjLJX3T3N7r7rzMbm3+cw7D/FsI20N3/N8mtVfXkedNzk3wuyeVJnpMkVfVDSR6eLfJjp+vZYJz2J/nxedtzknxxQHnL5sV54BTbFZnt8DL/+z+OeEXL6QHjVFXPT/KqJOd1913Dqlou949Rd3+mux/X3Tu7e2dmQeOM+Xdzq1v7nbP/frC1Y2Tf/UBfSXJWVR0/P2r63CQ35zDsv10deRBV9bTMrjZ6eJIvZfbTSt9J8vYkT0tyT5JXdvefDityCRxgnJ6a5PcyO/z//5L8UndfO6zIweZTRrcmeVJ33z5v+3uZXUW6ktkX/ae7+1vjqhzvAOO0N8nfSfLNeberu/vCQSUOt94Yrdl+S5JdW/3qyAN8lh4e++/7HWCMfjT23Q9QVb+V5GcyOx3i00l+Icmj8hD330IYAMAApiMBAAYQwgAABhDCAAAGEMIAAAYQwgAABhDCAAAGEMIAAAYQwgAABvj/M5PZqsmMRwsAAAAASUVORK5CYII=\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "To add the Employment rate for Inner London, repeat the plot command with a different selection of the data:"
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs['Employment rate (%)'][boroughs['Inner/Outer']=='Outer London'].plot.hist(bins=15,figsize=(10,5));\nboroughs['Employment rate (%)'][boroughs['Inner/Outer']=='Inner London'].plot.hist(bins=15,figsize=(10,5));",
"execution_count": 63,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 720x360 with 1 Axes>",
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAmEAAAEvCAYAAAANTxbKAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAWDklEQVR4nO3df7CldX0f8PdHwCL+wnTXSpFlNUO0mEkFV8SaNETjjKCVpjUNJo0ZOskWYjrRxkZiHMU/OmOaVBuHjAR/NGLV1J+UVphEJ6nRP1ABAUV03BqUFapGK4igBPPpH+cgd6/37h7Xfe737j2v18yZe57n+Z7zfPaz5z773ufXqe4OAAAb6wGjCwAAWEZCGADAAEIYAMAAQhgAwABCGADAAEIYAMAAR44u4Ae1bdu23rlz5+gyAAAO6Jprrvmb7t6+1rLDLoTt3LkzV1999egyAAAOqKq+sN4yhyMBAAYQwgAABhDCAAAGEMIAAAYQwgAABhDCAAAGEMIAAAaYLIRV1dFV9bGqur6qbqyqV60xpqrqdVW1p6puqKpTp6oHAGAzmfJmrd9J8vTuvrOqjkrykaq6sruvWjHmzCQnzR9PSfL6+U8AgC1tsj1hPXPnfPKo+aNXDTs7yaXzsVclObaqjpuqJgCAzWLSc8Kq6oiqui7JV5J8oLs/umrI8UluWTG9dz4PAGBLm/S7I7v7u0meWFXHJnlfVf14d39qxZBa62WrZ1TV7iS7k2THjh2T1Aqw2s4L3j9s3Te/+tnD1j2SnrNMNuTqyO7+RpL/neRZqxbtTXLCiulHJ7l1jddf0t27unvX9u1rfhE5AMBhZcqrI7fP94Clqh6U5GeTfGbVsMuTvGB+leTpSW7v7tumqgkAYLOY8nDkcUneUlVHZBb23tnd/6uqzkuS7r44yRVJzkqyJ8ldSc6dsB4AgE1jshDW3TckOWWN+ReveN5JXjhVDQAAm5U75gMADCCEAQAMIIQBAAwghAEADCCEAQAMIIQBAAwghAEADCCEAQAMIIQBAAwghAEADCCEAQAMIIQBAAwghAEADCCEAQAMIIQBAAwghAEADCCEAQAMIIQBAAwghAEADCCEAQAMIIQBAAwghAEADCCEAQAMIIQBAAwghAEADCCEAQAMIIQBAAwghAEADCCEAQAMIIQBAAwghAEADCCEAQAMIIQBAAwghAEADDBZCKuqE6rqL6vqpqq6sap+c40xZ1TV7VV13fzxiqnqAQDYTI6c8L3vTfJb3X1tVT00yTVV9YHu/vSqcR/u7udMWAcAwKYz2Z6w7r6tu6+dP/9mkpuSHD/V+gAADicbck5YVe1MckqSj66x+KlVdX1VXVlVT9iIegAARpvycGSSpKoekuQ9SV7U3XesWnxtkhO7+86qOivJZUlOWuM9difZnSQ7duyYuGIAgOlNuiesqo7KLIC9rbvfu3p5d9/R3XfOn1+R5Kiq2rbGuEu6e1d379q+ffuUJQMAbIgpr46sJG9KclN3v2adMY+aj0tVnTav52tT1QQAsFlMeTjyaUl+Ocknq+q6+byXJdmRJN19cZLnJTm/qu5NcneSc7q7J6wJAGBTmCyEdfdHktQBxlyU5KKpagAA2KzcMR8AYAAhDABgACEMAGAAIQwAYAAhDABgACEMAGAAIQwAYAAhDABgACEMAGAAIQwAYAAhDABgACEMAGAAIQwAYAAhDABgACEMAGAAIQwAYAAhDABgACEMAGAAIQwAYAAhDABgACEMAGAAIQwAYAAhDABgACEMAGAAIQwAYAAhDABgACEMAGAAIQwAYAAhDABgACEMAGAAIQwAYAAhDABgACEMAGAAIQwAYIDJQlhVnVBVf1lVN1XVjVX1m2uMqap6XVXtqaobqurUqeoBANhMjpzwve9N8lvdfW1VPTTJNVX1ge7+9IoxZyY5af54SpLXz38CAGxpk+0J6+7buvva+fNvJrkpyf
Grhp2d5NKeuSrJsVV13FQ1AQBsFhtyTlhV7UxySpKPrlp0fJJbVkzvzfcHNQCALWfKw5FJkqp6SJL3JHlRd9+xevEaL+k13mN3kt1JsmPHjkNe46Z04cMHr//2setfx84L3r8h67n56F/ckPWsa5P2n42zUZ91+D6D//3Z+e23b9i6bn71szdsXWuZdE9YVR2VWQB7W3e/d40he5OcsGL60UluXT2ouy/p7l3dvWv79u3TFAsAsIGmvDqykrwpyU3d/Zp1hl2e5AXzqyRPT3J7d982VU0AAJvFlIcjn5bkl5N8sqqum897WZIdSdLdFye5IslZSfYkuSvJuRPWAwCwaUwWwrr7I1n7nK+VYzrJC6eqAQBgs3LHfACAAYQwAIABhDAAgAGEMACAAYQwAIABhDAAgAGEMACAAYQwAIABFgphVfXjUxcCALBMFt0TdnFVfayqfr2qjp20IgCAJbBQCOvun0zyS0lOSHJ1Vb29qp45aWUAAFvYwueEdffnkrw8yUuT/HSS11XVZ6rqX0xVHADAVrXoOWE/UVWvTXJTkqcn+Wfd/Y/mz187YX0AAFvSkQuOuyjJG5K8rLvvvm9md99aVS+fpDIAgC1s0RB2VpK7u/u7SVJVD0hydHff1d1vnaw6AIAtatFzwj6Y5EErpo+ZzwMA4CAsGsKO7u4775uYPz9mmpIAALa+RUPYt6rq1PsmqupJSe7ez3gAAPZj0XPCXpTkXVV163z6uCS/ME1JAABb30IhrLs/XlWPT/K4JJXkM939t5NWBgCwhS26JyxJnpxk5/w1p1RVuvvSSaoCANjiFgphVfXWJD+a5Lok353P7iRCGADAQVh0T9iuJCd3d09ZDADAslj06shPJXnUlIUAACyTRfeEbUvy6ar6WJLv3Dezu587SVUAAFvcoiHswimLAABYNoveouJDVXVikpO6+4NVdUySI6YtDQBg61ronLCq+rUk707yx/NZxye5bKqiAAC2ukVPzH9hkqcluSNJuvtzSR45VVEAAFvdoiHsO919z30TVXVkZvcJAwDgICwawj5UVS9L8qCqemaSdyX5n9OVBQCwtS0awi5I8tUkn0zyb5NckeTlUxUFALDVLXp15N8lecP8AQDAD2nR747866xxDlh3P/aQVwQAsAR+kO+OvM/RSX4+yY8c+nIAAJbDQueEdffXVjy+1N3/JcnT9/eaqnpzVX2lqj61zvIzqur2qrpu/njFQdQPAHBYWvRw5KkrJh+Q2Z6xhx7gZX+S5KIkl+5nzIe7+zmL1AAAsJUsejjyP694fm+Sm5P8q/29oLv/qqp2HlRVAABb3KJXR/7MROt/alVdn+TWJC/p7hvXGlRVu5PsTpIdO3ZMVAoAwMZZ9HDkv9/f8u5+zUGs+9okJ3b3nVV1VmbfRXnSOu9/SZJLkmTXrl3u1A8AHPYWvVnrriTnZ/bF3ccnOS/JyZmdF3agc8PW1N13dPed8+dXJDmqqrYdzHsBABxuFj0nbFuSU7v7m0lSVRcmeVd3/+rBrriqHpXky93dVXVaZoHwawf7fgAAh5NFQ9iOJPesmL4nyc79vaCq3pHkjCTbqmpvklcmOSpJuvviJM9Lcn5V3Zvk7iTndLdDjQDAUlg0hL01yceq6n2Z3Tn/57L/W0+ku59/gOUXZXYLCwCApbPo1ZH/saquTPJT81nndvcnpisLAGBrW/TE/CQ5Jskd3f2HSfZW1WMmqgkAYMtbKIRV1SuTvDTJ78xnHZXkv01VFADAVrfonrCfS/LcJN9Kku6+NQd5awoAABYPYffMr1zsJKmqB09XEgDA1rdoCHtnVf1xkmOr6teSfDDJG6YrCwBgazvg1ZFVVUn+e5LHJ7kjyeOSvKK7PzBxbQAAW9YBQ9j8jvaXdfeTkgheAACHwKKHI6+qqidPWgkAwBJZ9I75P5PkvKq6ObMrJCuznWQ/MVVhAABb2X5DWFXt6O4vJjlzg+oBAFgKB9oTdlmSU7v7C1X1nu
7+lxtRFADAVnegc8JqxfPHTlkIAMAyOVAI63WeAwDwQzjQ4ch/XFV3ZLZH7EHz58n9J+Y/bNLqAAC2qP2GsO4+YqMKAQBYJoveJwwAgENICAMAGEAIAwAYQAgDABhACAMAGEAIAwAYQAgDABhACAMAGEAIAwAYQAgDABhACAMAGEAIAwAYQAgDABhACAMAGEAIAwAYQAgDABhACAMAGEAIAwAYYLIQVlVvrqqvVNWn1lleVfW6qtpTVTdU1alT1QIAsNlMuSfsT5I8az/Lz0xy0vyxO8nrJ6wFAGBTmSyEdfdfJfn6foacneTSnrkqybFVddxU9QAAbCYjzwk7PsktK6b3zucBAGx5Rw5cd60xr9ccWLU7s0OW2bFjx5Q13e/Ch2/MejapnRe8f91lN7/62dMXsE7/bz56+lVvVvv7OznUbj76FzdsXZvZqM/bzm+/fcyK55b27//C2Y8R/d9nu7rk//4sk5F7wvYmOWHF9KOT3LrWwO6+pLt3dfeu7du3b0hxAABTGhnCLk/ygvlVkqcnub27bxtYDwDAhpnscGRVvSPJGUm2VdXeJK9MclSSdPfFSa5IclaSPUnuSnLuVLUAAGw2k4Ww7n7+AZZ3khdOtX4AgM3MHfMBAAYQwgAABhDCAAAGEMIAAAYQwgAABhDCAAAGEMIAAAYQwgAABhDCAAAGEMIAAAYQwgAABhDCAAAGEMIAAAYQwgAABhDCAAAGEMIAAAYQwgAABhDCAAAGEMIAAAYQwgAABhDCAAAGEMIAAAYQwgAABhDCAAAGEMIAAAYQwgAABhDCAAAGEMIAAAYQwgAABhDCAAAGEMIAAAYQwgAABhDCAAAGEMIAAAaYNIRV1bOq6rNVtaeqLlhj+RlVdXtVXTd/vGLKegAANosjp3rjqjoiyR8leWaSvUk+XlWXd/enVw39cHc/Z6o6AAA2oyn3hJ2WZE93f76770nyp0nOnnB9AACHjSlD2PFJblkxvXc+b7WnVtX1VXVlVT1hwnoAADaNyQ5HJqk15vWq6WuTnNjdd1bVWUkuS3LS971R1e4ku5Nkx44dh7pOAIANN+WesL1JTlgx/egkt64c0N13dPed8+dXJDmqqratfqPuvqS7d3X3ru3bt09YMgDAxpgyhH08yUlV9ZiqemCSc5JcvnJAVT2qqmr+/LR5PV+bsCYAgE1hssOR3X1vVf1Gkj9LckSSN3f3jVV13nz5xUmel+T8qro3yd1Jzunu1YcsAQC2nCnPCbvvEOMVq+ZdvOL5RUkumrIGAIDNyB3zAQAGEMIAAAYQwgAABhDCAAAGEMIAAAYQwgAABhDCAAAGEMIAAAYQwgAABhDCAAAGEMIAAAYQwgAABhDCAAAGEMIAAAYQwgAABhDCAAAGEMIAAAYQwgAABhDCAAAGEMIAAAYQwgAABhDCAAAGEMIAAAYQwgAABhDCAAAGEMIAAAYQwgAABhDCAAAGEMIAAAYQwgAABhDCAAAGEMIAAAYQwgAABhDCAAAGmDSEVdWzquqzVbWnqi5YY3lV1evmy2+oqlOnrAcAYLOYLIRV1RFJ/ijJmUlOTvL8qjp51bAzk5w0f+xO8vqp6gEA2Eym3BN2WpI93f357r4nyZ8mOXvVmLOTXNozVyU5tqqOm7AmAIBNYcoQdnySW1ZM753P+0HHAABsOUdO+N61xrw+iDGpqt2ZHa5Mkjur6rOrhmxL8jc/cIVb1yHox3PWXVK/98O98wCH3+fjVWv9ahwSC/VisrVvPpv0s7H+79+EvteLJfr7X8dzkg3+bBwG29UN7MfGff4Psu8/aC9OXG/BlCFsb5ITVkw/OsmtBzEm3X1JkkvWW1FVXd3duw6+1K1FP/alH/fTi33px/30Yl/6sS/9uN+h7MWUhyM/nuSkqnpMVT0wyTlJLl815vIkL5hfJXl6ktu7+7YJawIA2BQm2xPW3fdW1W8k+bMkRyR5c3ffWFXnzZdfnOSKJGcl2ZPkriTnTlUPAMBmMu
XhyHT3FZkFrZXzLl7xvJO88BCsat1DlUtKP/alH/fTi33px/30Yl/6sS/9uN8h60XNchAAABvJ1xYBAAxwWIawqjq2qt5dVZ+pqpuq6qkrlr2kqrqqto2scaOs14uq+nfzr4y6sar+0+g6N8pa/aiqJ1bVVVV1XVVdXVWnja5zI1TV4+Z/5vsed1TVi6rqR6rqA1X1ufnPR4yudWr76cXvzz8rN1TV+6rq2NG1boT1+rFi+bJtR9ftx7JtS/fzu7KU29EkqaoXz//+P1VV76iqow/VdvSwPBxZVW9J8uHufuP8ystjuvsbVXVCkjcmeXySJ3X3Jrz/z6G1Vi+SnJLkd5M8u7u/U1WP7O6vDC10g6zTj3cmeW13X1lVZyX57e4+Y2SdG23+NWJfSvKUzM7D/Hp3v7pm3+n6iO5+6dACN9CqXjwuyV/MLyT6vSRZpl4k+/aju7+wjNvRlVZ9Ph6bJd2WJt/XizdkCbejVXV8ko8kObm7766qd2Z2rvvJOQTb0cNuT1hVPSzJP03ypiTp7nu6+xvzxa9N8ttZ44avW9F+enF+kld393fm85dio7GffnSSh82HPTxr3ItuCTwjyf/p7i9k9nVhb5nPf0uSfz6sqjG+14vu/vPuvnc+/6rM7lW4bFZ+NpIl246uYWU/lnJbusLKXizzdvTIJA+qqiMz+4/9rTlE29HDLoRl9j+Tryb5r1X1iap6Y1U9uKqem+RL3X394Po20pq9SPJjSX6qqj5aVR+qqiePLXPDrNePFyX5/aq6JckfJPmdkUUOck6Sd8yf/4P77sc3//nIYVWNsbIXK/2bJFducC2bwff6saTb0dVWfj6WdVt6n5W9WMrtaHd/KbM/7xeT3JbZ/Uz/PIdoO3o4hrAjk5ya5PXdfUqSbyW5MLNdxq8YWNcIa/Xigvn8RyQ5Pcl/SPLOqlqGbyJZrx/nJ3lxd5+Q5MWZ7ylbFvPDss9N8q7RtYy2Xi+q6neT3JvkbSPqGmVlP6rqmCzndvR71vh8LOu2dK1eLOV2dH6u19lJHpPkHyZ5cFX960P1/odjCNubZG93f3Q+/e7M/uF9TJLrq+rmzA4pXFtVjxpT4oZZrxd7k7y3Zz6W5O8y+66rrW69fvxKkvfO570rydKcUDp3ZpJru/vL8+kvV9VxSTL/uUyHWFb3IlX1K5l9Wd0v9eF4kuwPZ2U/fjTLuR1dafXnY1m3pcn392JZt6M/m+Svu/ur3f23mfXgn+QQbUcPuxDW3f83yS1V9bj5rGdk9kF5ZHfv7O6dmf3inDofu2Wt04tPJ7ksydOTpKp+LMkDsym/pPjQ2k8/bk3y0/N5T0/yuQHljfT87Hv47fLMNqiZ//wfG17ROPv0oqqeleSlSZ7b3XcNq2qc7/Wjuz+5jNvRVVb/rizltnRudS+WdTv6xSSnV9Ux872gz0hyUw7RdvRwvTryiZldvfPAJJ9Pcm53/78Vy29OsmsZrupZqxeZHYZ7c5InJrknyUu6+y+GFbmB1unHE5L8YWaHFr6d5Ne7+5phRW6g+SGmW5I8trtvn8/7+5ldMbojsw3Mz3f318dVuTHW6cWeJH8vydfmw67q7vMGlbih1urHquU3Z0m2o8m6n48HZgm3pev04iezvNvRVyX5hcxOWfhEkl9N8pAcgu3oYRnCAAAOd4fd4UgAgK1ACAMAGEAIAwAYQAgDABhACAMAGEAIAwAYQAgDABhACAMAGOD/A6mXPJbz0w+jAAAAAElFTkSuQmCC\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "The above plot is difficult to read as the histograms have overlapped. You can fix this by changing the colours and making them transparant. \n \nTo add a legend each histogram needs to be assigned to an object `ax`. With `legend()` you can then add a legend. With `plt.xlabel()` you can also add a label for the x-axis (this works similar for the y-axis):"
},
{
"metadata": {},
"cell_type": "code",
"source": "ax = boroughs['Employment rate (%)'][boroughs['Inner/Outer']=='Outer London'].plot.hist(\n bins=15,figsize=(10,5),alpha=0.5,color='#1A4D3B');\nax = boroughs['Employment rate (%)'][boroughs['Inner/Outer']=='Inner London'].plot.hist(\n bins=15,figsize=(10,5),alpha=0.5,color='#4D1A39');\nax.legend(['Outer London','Inner London'])\nplt.xlabel('Employment rate (%)');",
"execution_count": 64,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 720x360 with 1 Axes>",
"image/png": "\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
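{
"metadata": {},
"cell_type": "markdown",
"source": "The legend in the cell above depends on listing the names in the same order in which the histograms were drawn. A minimal sketch of an alternative, assuming a small made-up DataFrame `df` in place of `boroughs` (its values are illustrative only): draw both groups on one explicit `ax`, give them the same bin edges, and pass `label=` so each legend entry stays attached to its data.\n\n```python\nimport numpy as np\nimport pandas as pd\nimport matplotlib.pyplot as plt\n\n# Hypothetical stand-in for the boroughs DataFrame (illustrative values)\ndf = pd.DataFrame({\n    'Inner/Outer': ['Inner London'] * 4 + ['Outer London'] * 4,\n    'Employment rate (%)': [64.6, 69.2, 73.0, 78.1, 65.8, 68.5, 75.1, 69.5],\n})\n\n# Shared bin edges so the bars of both histograms line up\nvalues = df['Employment rate (%)'].dropna()\nbins = np.linspace(values.min(), values.max(), 8)\n\nfig, ax = plt.subplots(figsize=(10, 5))\nfor area, group in df.groupby('Inner/Outer'):\n    # label= attaches the legend text to the data it describes\n    group['Employment rate (%)'].plot.hist(bins=bins, alpha=0.5, ax=ax, label=area)\nax.set_xlabel('Employment rate (%)')\nax.legend()\n```\n\nWith the real data you would replace `df` by `boroughs`; the number of bins used here is an arbitrary choice."
},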
{
"metadata": {},
"cell_type": "markdown",
"source": "There are various options available to change every aspect of your chart. Below are some examples to get you started.\n \n**Go ahead and create new charts and customise the options.** \n\nEspecially the next one can be improved on to make it look better:"
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs['Population density (/ha)'].plot.hist(\n bins=15, \n title=\"Population Density (/ha)\",\n legend=False,\n fontsize=14,\n grid=False,\n linestyle='--',\n edgecolor='black',\n color='darkred',\n linewidth=3);",
"execution_count": 65,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 432x288 with 1 Axes>",
"image/png": "\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs['Population density (/ha)'].plot.hist(bins=15, figsize =(10,5), title='Population Density (/ha)', alpha=0.5, linestyle = '--', edgecolor='green', linewidth=3)",
"execution_count": 66,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 66,
"data": {
"text/plain": "<matplotlib.axes._subplots.AxesSubplot at 0x7f84ba874fd0>"
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 720x360 with 1 Axes>",
"image/png": "\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## Seaborn\n\nSeaborn is a Python data visualization library based on matplotlib. It is an easy to use visualisation package that works well with Pandas DataFrames. \n\nBelow are a few examples using Seaborn. \n\nRefer to this [documentation](https://seaborn.pydata.org/index.html) for information on lots of plots you can create."
},
{
"metadata": {},
"cell_type": "code",
"source": "import seaborn as sns",
"execution_count": 67,
"outputs": []
},
{
"metadata": {},
"cell_type": "code",
"source": "sns.distplot(boroughs['Population density (/ha)'])",
"execution_count": 69,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 69,
"data": {
"text/plain": "<matplotlib.axes._subplots.AxesSubplot at 0x7f84bc6ab050>"
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 432x288 with 1 Axes>",
"image/png": "\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Let's look at a distribution plot using `distplot`, which shows a distribution of the data. \n\nUse the `dropna()` function to remove rows and columns with Null/NaN values:"
},
{
"metadata": {},
"cell_type": "code",
"source": "sns.distplot(boroughs['Population density (/ha)'].dropna());",
"execution_count": 70,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 432x288 with 1 Axes>",
"image/png": "\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
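{
"metadata": {},
"cell_type": "markdown",
"source": "Newer seaborn releases (0.11 and later) deprecate `distplot` in favour of `histplot` and `displot`. If the cell above prints a deprecation warning, a rough equivalent is sketched below. It assumes seaborn 0.11 or newer and uses a short hypothetical Series (the first five density values from the table plus one missing value) instead of the full `boroughs` column.\n\n```python\nimport pandas as pd\nimport seaborn as sns\n\n# Hypothetical sample: five density values and one missing entry\ndensity = pd.Series([30.3, 57.9, 44.9, 40.3, 76.8, None],\n                    name='Population density (/ha)')\n\n# kde=True overlays a density curve on the histogram, similar to distplot\nsns.histplot(density.dropna(), kde=True)\n```\n\nNote that `histplot` shows counts on the y-axis by default, so the scale can differ from the `distplot` output."
},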
{
"metadata": {},
"cell_type": "markdown",
"source": "<You can create categorical plots with `catplot`. There are categorical scatter plots, distribution plots and estimate plots. The `kind` parameter selects the function to use, for instance box, violin, swarm ,bar, stripplot and boxen.\n \nThe default representation in catplot() uses a scatter plot:"
},
{
"metadata": {},
"cell_type": "code",
"source": "sns.catplot(x='Turnout at local elections', y='Political control', data=boroughs);",
"execution_count": 71,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 360x360 with 1 Axes>",
"image/png": "\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Also try `kind=\"swarm\"`, `kind=\"box\"` or `kind=\"violin\"`:"
},
{
"metadata": {},
"cell_type": "code",
"source": "sns.catplot(x='Median House Price', y='Name', kind='swarm', data=boroughs);",
"execution_count": 72,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 360x360 with 1 Axes>",
"image/png": "\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "sns.catplot(x='Employment rate (%)', y='Largest migrant population', kind=\"box\", data=boroughs);",
"execution_count": 73,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 360x360 with 1 Axes>",
"image/png": "\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "sns.catplot(x='Turnout at local elections', y='Political control', kind=\"violin\", data=boroughs);",
"execution_count": 74,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 360x360 with 1 Axes>",
"image/png": "\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "<div class=\"alert alert-success\">\n <b>EXERCISE</b>\n <ol>\n <li>Create two histograms that compare the Gross Annual pay for Male and Female Employees using `.plot.hist()`</li>\n <li>Create a bar plot comparing the median house prices for different boroughs</li>\n <li>Create a scatter plot comparing the Median House price and percentage of area that is greenspace </li>\n </ol> \n </div> \n \n <ul></ul> \n <ul></ul> \n <ul></ul> \n \n > *Tips*:\n- To add two histograms to one plot you can repeat `.plot()` in the same cell \n- Add a legend by assiging each histogram to an object `ax`, which is used to create a legend\n- To customise the size of your maps, use the example of `[fig, ax]`, which customises the figsize for each map in other examples above \n\n**Create two histograms that compare the Gross Annual pay for Male and Female Employees using `.plot.hist()`**"
},
{
"metadata": {},
"cell_type": "code",
"source": "boroughs.head()",
"execution_count": 75,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 75,
"data": {
"text/plain": " Code Name Inner/Outer Population Area (ha) \\\n0 E09000001 City of London Inner London 8800 290 \n1 E09000002 Barking and Dagenham Outer London 209000 3611 \n2 E09000003 Barnet Outer London 389600 8675 \n3 E09000004 Bexley Outer London 244300 6058 \n4 E09000005 Brent Outer London 332100 4323 \n\n Population density (/ha) Average Age Population born abroad (%) \\\n0 30.3 43.2 NaN \n1 57.9 32.9 37.8 \n2 44.9 37.3 35.2 \n3 40.3 39.0 16.1 \n4 76.8 35.6 53.9 \n\n Largest migrant population New migrant rates Employment rate (%) \\\n0 United States 152.2 64.6 \n1 Nigeria 59.1 65.8 \n2 India 53.1 68.5 \n3 Nigeria 14.4 75.1 \n4 India 100.9 69.5 \n\n Gross Pay (Male) Gross Pay (Female) Median House Price Greenspace (%) \\\n0 NaN NaN 799999 4.8 \n1 30104.0 24602.0 243500 33.6 \n2 36475.0 31235.0 445000 41.3 \n3 37881.0 28924.0 275000 31.7 \n4 30129.0 29600.0 407250 21.9 \n\n Happiness score Political control Turnout at local elections \n0 6.0 NaN NaN \n1 7.1 Lab 36.5 \n2 7.4 Cons 40.5 \n3 7.2 Cons 39.6 \n4 7.2 Lab 36.3 ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Code</th>\n <th>Name</th>\n <th>Inner/Outer</th>\n <th>Population</th>\n <th>Area (ha)</th>\n <th>Population density (/ha)</th>\n <th>Average Age</th>\n <th>Population born abroad (%)</th>\n <th>Largest migrant population</th>\n <th>New migrant rates</th>\n <th>Employment rate (%)</th>\n <th>Gross Pay (Male)</th>\n <th>Gross Pay (Female)</th>\n <th>Median House Price</th>\n <th>Greenspace (%)</th>\n <th>Happiness score</th>\n <th>Political control</th>\n <th>Turnout at local elections</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>E09000001</td>\n <td>City of London</td>\n <td>Inner London</td>\n <td>8800</td>\n <td>290</td>\n <td>30.3</td>\n <td>43.2</td>\n <td>NaN</td>\n <td>United States</td>\n <td>152.2</td>\n <td>64.6</td>\n <td>NaN</td>\n <td>NaN</td>\n <td>799999</td>\n <td>4.8</td>\n <td>6.0</td>\n <td>NaN</td>\n <td>NaN</td>\n </tr>\n <tr>\n <th>1</th>\n <td>E09000002</td>\n <td>Barking and Dagenham</td>\n <td>Outer London</td>\n <td>209000</td>\n <td>3611</td>\n <td>57.9</td>\n <td>32.9</td>\n <td>37.8</td>\n <td>Nigeria</td>\n <td>59.1</td>\n <td>65.8</td>\n <td>30104.0</td>\n <td>24602.0</td>\n <td>243500</td>\n <td>33.6</td>\n <td>7.1</td>\n <td>Lab</td>\n <td>36.5</td>\n </tr>\n <tr>\n <th>2</th>\n <td>E09000003</td>\n <td>Barnet</td>\n <td>Outer London</td>\n <td>389600</td>\n <td>8675</td>\n <td>44.9</td>\n <td>37.3</td>\n <td>35.2</td>\n <td>India</td>\n <td>53.1</td>\n <td>68.5</td>\n <td>36475.0</td>\n <td>31235.0</td>\n <td>445000</td>\n <td>41.3</td>\n <td>7.4</td>\n <td>Cons</td>\n <td>40.5</td>\n </tr>\n <tr>\n <th>3</th>\n <td>E09000004</td>\n <td>Bexley</td>\n <td>Outer London</td>\n <td>244300</td>\n 
<td>6058</td>\n <td>40.3</td>\n <td>39.0</td>\n <td>16.1</td>\n <td>Nigeria</td>\n <td>14.4</td>\n <td>75.1</td>\n <td>37881.0</td>\n <td>28924.0</td>\n <td>275000</td>\n <td>31.7</td>\n <td>7.2</td>\n <td>Cons</td>\n <td>39.6</td>\n </tr>\n <tr>\n <th>4</th>\n <td>E09000005</td>\n <td>Brent</td>\n <td>Outer London</td>\n <td>332100</td>\n <td>4323</td>\n <td>76.8</td>\n <td>35.6</td>\n <td>53.9</td>\n <td>India</td>\n <td>100.9</td>\n <td>69.5</td>\n <td>30129.0</td>\n <td>29600.0</td>\n <td>407250</td>\n <td>21.9</td>\n <td>7.2</td>\n <td>Lab</td>\n <td>36.3</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "# your answer:\nax = boroughs['Gross Pay (Male)'].plot.hist(bins=10, figsize=(10,5), color='skyblue', alpha=0.5)\nax = boroughs['Gross Pay (Female)'].plot.hist(bins=10, figsize=(10,5), color='pink', alpha=0.5)\nax.legend(['Male','Female'])\nplt.xlabel('Gross Annual Pay')",
"execution_count": 84,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 84,
"data": {
"text/plain": "Text(0.5, 0, 'Gross Annual Pay')"
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 720x360 with 1 Axes>",
"image/png": "\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "# %load https://raw.githubusercontent.com/IBMDeveloperUK/python-pandas-workshop/master/answers/answer4.py\nax = boroughs['Gross Pay (Female)'].plot.hist(bins=15,figsize=(10,5),alpha=0.5);\nax = boroughs['Gross Pay (Male)'].plot.hist(bins=15,figsize=(10,5),alpha=0.5);\nax.legend(['female','male']);\n",
"execution_count": 85,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 720x360 with 1 Axes>",
"image/png": "\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "**Create a bar plot comparing the median house prices for different boroughs**"
},
{
"metadata": {},
"cell_type": "code",
"source": "# your answer:\nax = sns.barplot(y ='Name', x='Median House Price', data=boroughs)",
"execution_count": 94,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 432x288 with 1 Axes>",
"image/png": "\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "# %load https://raw.githubusercontent.com/IBMDeveloperUK/python-pandas-workshop/master/answers/answer5.py\n[fig, ax] = plt.subplots(1, figsize=(7,7))\nsns.barplot(x='Median House Price', y='Name', data=boroughs, ax=ax);\n",
"execution_count": 96,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 504x504 with 1 Axes>",
"image/png": "\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
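{
"metadata": {},
"cell_type": "markdown",
"source": "A bar plot with many categories is usually easier to read when the bars are sorted by value. The sketch below is one possible refinement, not part of the workshop answers: it sorts a small hypothetical DataFrame `df` (the five boroughs shown by `boroughs.head()`) before plotting. It relies on seaborn drawing string categories in their order of appearance.\n\n```python\nimport pandas as pd\nimport matplotlib.pyplot as plt\nimport seaborn as sns\n\n# Hypothetical subset: the five rows shown by boroughs.head()\ndf = pd.DataFrame({\n    'Name': ['City of London', 'Barking and Dagenham', 'Barnet', 'Bexley', 'Brent'],\n    'Median House Price': [799999, 243500, 445000, 275000, 407250],\n})\n\n# Sort so the most expensive borough is drawn first (at the top)\nordered = df.sort_values('Median House Price', ascending=False)\n\nfig, ax = plt.subplots(figsize=(7, 7))\nsns.barplot(x='Median House Price', y='Name', data=ordered, ax=ax)\n```\n\nWith the full data you would sort `boroughs` itself in the same way."
},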
{
"metadata": {},
"cell_type": "markdown",
"source": "**Create a scatter plot comparing the Median House price and percentage of greenspace area** "
},
{
"metadata": {},
"cell_type": "code",
"source": "# your answer:\n\nsns.scatterplot(x = 'Median House Price', y ='Greenspace (%)', data=boroughs)",
"execution_count": 106,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 106,
"data": {
"text/plain": "<matplotlib.axes._subplots.AxesSubplot at 0x7f84b7b9e050>"
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 432x288 with 1 Axes>",
"image/png": "\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"cell_type": "code",
"source": "# %load https://raw.githubusercontent.com/IBMDeveloperUK/python-pandas-workshop/master/answers/answer6.py\n[fig, ax] = plt.subplots(1, figsize=(7,7))\nax=sns.scatterplot(y='Median House Price', x='Greenspace (%)', data=boroughs,ax=ax);\n",
"execution_count": 107,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 504x504 with 1 Axes>",
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAbkAAAG0CAYAAACvyln2AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nO3de3RdZ3nn8e8jCQVjDDGO6wGEa6CJM5SVBBCX1rQk4dKEYUg9LS1XUy5OaYChs6bT0M5MmTa9wEzLKrS4aZwGcFvCAkpSoCEp0IRkDClRIIRcasgKpYjQxDGi8TjGQtYzf5wt50SR5CPr7HN5z/ezlpZ09tk659FWop/fy37fyEwkSSrRULcLkCSpLoacJKlYhpwkqViGnCSpWIacJKlYhpwkqVh9GXIRcUlE3BMRt7R4/i9ExG0RcWtEfKju+iRJvSH68T65iPhp4P8BuzLzqUc590TgI8CZmTkVET+Smfd0ok5JUnf1ZUsuM68Fvtd8LCKeHBFXRsSNEXFdRJxcPbUdeF9mTlXfa8BJ0oDoy5BbxEXAWzPzGcCvATuq4ycBJ0XE7oi4PiLO6lqFkqSOGul2Ae0QEY8EfhL4aETMHT6u+jwCnAicDowB10XEUzPz+52uU5LUWUWEHI0W6fcz87QFnpsErs/MHwLfjIg9NELvhk4WKEnqvCK6KzPzPhoB9jKAaDi1evpy4Izq+Ak0ui/v7EqhkqSO6suQi4hLgS8CmyNiMiLeALwKeENEfBW4FTinOv0qYF9E3AZcDfy3zNzXjbolSZ3Vl7cQSJLUir5syUmS1Iq+m3hywgkn5KZNm7pdhiSph9x44433Zub6+cf7LuQ2bdrExMREt8uQJPWQiPjWQsftrpQkFcuQkyQVy5CTJBXLkJMkFcuQkyQVy5CTJBXLkJMkFcuQkyQVy5CTJBXLkJMkFcuQkyQVy5CTJBWrtpCLiEsi4p6IuGWR518VETdXH19o2slbkqS2qLMl9wHgrCWe/ybwvMw8BbgAuKjGWrSA2dlk7/5DfGfqfvbuP8TsrBvoSipLbVvtZOa1EbFpiee/0PTwemCsrlr0ULOzyZ6797N91wSTUwcZW7uKndvG2bxhDUND0e3yJKktemVM7g3Ap7tdxCDZd2D6SMABTE4dZPuuCfYdmO5yZZLUPl3fNDUizqARcs9d4pxzgXMBNm7c2KHKyjY9c/hIwM2ZnDrI9MzhLlUkSe3X1ZZcRJwCXAyck5n7FjsvMy/KzPHMHF+//iG7m+sYjI4MM7Z21YOOja1dxejIcJcqkqT261rIRcRG4OPAazLz692qY1CtWz3Kzm3jR4Jubkxu3erRLlcmSe1TW3dlRFwKnA6cEBGTwDuAhwFk5oXAbwHrgB0RATCTmeN11aMHGxoKNm9Yw2XnbWF65jCjI8OsWz3qpBNJRalzduUrjvL8G4E31vX+OrqhoWD9muO6XYYk1aZXZldKktR2hpwkqViGnCSpWIacJKlYhpwkqViGnCSpWIacJKlYhpwkqViGnCSpWIacJKlYhpwkqViGnCSpWIacJKlYhpwkqViGnCSpWIacJKlYhpwkqViGnCSpWIacJKlYhpwkqViGnCSpWIacJKlYhpwkqViGnCSpWIacJKlYhpwkqViGnCSpWIacJKlYhpwkqViGnCSpWIacJKlYhpwkqViGnCSpWIacJKlYhpwkqViGnCSpWIacJKlYhpwkqViGnCSpWIacJKlYhpwkqViGnCSpWIacJKlYhpwkqViGnCSpWIacJKlYhpwkqViGnCSpWIacJKlYhpwkqViGnCSpWIacJKlYhpwkqViGnCSpWIacJKlYhpwkqViGnCSpWIacJKlYhpwkqViGnCSpWIacJKlYhpwkqViGnCSpWIacJKlYtYVcRFwSEfdExC2LPB8R8d6IuCMibo6Ip9dViyRpMNXZkvsAcNYSz58NnFh9nAv8WY21SJIGUG
0hl5nXAt9b4pRzgF3ZcD1wfEQ8tq56JEmDp5tjco8Hvt30eLI69hARcW5ETETExN69eztSnCSp/3Uz5GKBY7nQiZl5UWaOZ+b4+vXray5LklSKbobcJPCEpsdjwF1dqkWSVKBuhtwngG3VLMvnAP+Wmd/tYj2SpMKM1PXCEXEpcDpwQkRMAu8AHgaQmRcCVwAvBu4A7gdeV1ctkqTBVFvIZeYrjvJ8Am+u6/0lSXLFE0lSsQw5SVKxDDlJUrEMOUlSsQw5SVKxDDlJUrEMOUlSsQw5SVKxDDlJUrEMOUlSsQw5SVKxDDlJUrEMOUlSsQw5SVKxDDlJUrEMOUlSsQw5SVKxDDlJUrEMOUlSsQw5SVKxRrpdgKTyzM4m+w5MMz1zmNGRYdatHmVoKLpdlgaQISeprWZnkz1372f7rgkmpw4ytnYVO7eNs3nDGoNOHWd3paS22ndg+kjAAUxOHWT7rgn2HZjucmUaRIacpLaanjl8JODmTE4dZHrmcJcq0iAz5CS11ejIMGNrVz3o2NjaVYyODHepIg0yQ05SW61bPcrObeNHgm5uTG7d6tEuV6ZB5MQTSW01NBRs3rCGy87b4uxKdZ0hJ6nthoaC9WuO63YZkt2VkqRyGXKSpGIZcpKkYhlykqRiGXKSpGIZcpKkYhlykqRiGXKSpGIZcpKkYhlykqRiGXKSpGIZcpKkYhlykqRiGXKSpGIZcpKkYhlykqRiGXKSpGIZcpKkYhlykqRiGXKSpGIdNeQi4qSI+FxE3FI9PiUi/kf9pUmStDKttOR2Ar8B/BAgM28GXl5nUZIktUMrIfeIzPzSvGMzdRQjSVI7tRJy90bEk4EEiIifB75ba1WSJLXBSAvnvBm4CDg5Ir4DfBN4da1VSZLUBkcNucy8E3hBRKwGhjJzf/1lSZK0cq3Mrvz9iDg+Mw9k5v6IWBsRv9uJ4iRJWolWxuTOzszvzz3IzCngxfWVJElSe7QScsMRcdzcg4hYBRy3xPmSJPWEViae/BXwuYh4P40Zlq8HPlhrVZIktUErE0/+d0R8DXg+EMAFmXlV7ZVJkrRCrbTkyMxPA5+uuRZJktpq0ZCLiP+bmc+NiP1UN4LPPQVkZj6q9uokSVqBRUMuM59bfV7TuXIkSWqfJWdXRsTQ3O4DkiT1myVDLjNnga9GxMZjefGIOCsi9kTEHRHx9gWef3REfDIivhoRt0bE647lfZZjdjbZu/8Q35m6n737DzE7m0f/JklSX2pl4sljgVsj4kvAgbmDmfnSpb4pIoaB9wEvBCaBGyLiE5l5W9NpbwZuy8z/GBHrgT0R8deZOb3cH6QVs7PJnrv3s33XBJNTBxlbu4qd28bZvGENQ0NRx1tKkrqolZD77WN87WcBd1RrXxIRHwbOAZpDLoE1ERHAI4HvUeM2PvsOTB8JOIDJqYNs3zXBZedtYf0a72+XpNIsNbvy4cCbgB8Dvgb8RWYuJ4AeD3y76fEk8Ox55/wp8AngLmAN8ItVF+n8Ws4FzgXYuPGYek4BmJ45fCTgjhQ1dZDpmcPH/JqSpN611JjcB4FxGgF3NvBHy3zthfr/5g+A/QxwE/A44DTgTyPiIbcmZOZFmTmemePr169fZhkPGB0ZZmztqgcdG1u7itGR4WN+TUlS71oq5J6Sma/OzD8Hfh74qWW+9iTwhKbHYzRabM1eB3w8G+6gsVfdyct8n5atWz3Kzm3jR4Jubkxu3erRut5SktRFS43J/XDui8ycaQybLcsNwIkR8UTgO8DLgVfOO+dfaCwXdl1EbAA2A3cu941aNTQUbN6whsvO28L0zGFGR4ZZt3rUSSeSVKilQu7UiLiv+jqAVdXjllY8qYLxLcBVwDBwSWbeGhFvqp6/ELgA+EC1NmYA52fmvSv7kZY2NBROMpGkAbHUiicrHqjKzCuAK+Ydu7Dp67uAF630fSRJWkgr+8lJktSXDDlJUrEMOUlSsVoKuYj40Yh4QfX1qohwZwJJUs87as
hFxHbgY8CfV4fGgMvrLEqSpHZopSX3ZmALcB9AZn4D+JE6i5IkqR1aCblDzbsCRMQID12eS5KkntNKyH0+In6Txs3gLwQ+Cnyy3rIkSVq5VkLu7cBeGgs1/zKNm7v/R51FSZLUDkfdT67a+mYnsDMiHgOMZabdlZKkntfK7MprIuJRVcDdBLw/It5df2mSJK1MK92Vj87M+4D/BLw/M58BvKDesiRJWrlWQm4kIh4L/ALwqZrrkSSpbVoJud+hsV3OHZl5Q0Q8CfhGvWVJkrRyrUw8+SiN2wbmHt8J/FydRUmS1A5HDbmIeD8L3Pydma+vpSJJktrkqCHHg8fhHg5sBe6qpxxJktqnle7Kv2l+HBGXAp+trSIVZ3Y22XdgmumZw4yODLNu9ShDQ9HtsiQNgFZacvOdCGxsdyEq0+xssufu/WzfNcHk1EHG1q5i57ZxNm9YY9BJql0rN4Pvj4j75j7TWLfy/PpLUwn2HZg+EnAAk1MH2b5rgn0Hpo/ynZK0cq10V7pBqo7Z9MzhIwE3Z3LqINMzh7tUkaRB0lJ3ZUS8FPjp6uE1melN4WrJ6MgwY2tXPSjoxtauYnRkuItVSRoUrXRXvhN4G3Bb9fG2iPiDugtTGdatHmXntnHG1q4CODImt271aJcrkzQI4mgbCkTEzcBp1W4ERMQw8JXMPKUD9T3E+Ph4TkxMdOOtdYycXSmpbhFxY2aOzz/e6uzK44HvVV8/um1VaSAMDQXr1xzX7TIkDaBWQu4PgK9ExNVA0Bib+41aq5IkqQ1amV15aURcAzyTRsidn5n/WndhkiSt1KIhFxFPn3dosvr8uIh4XGZ+ub6ypHo5TigNhqVacn/U9PUzgAkaLTloLNh8Zl1FSXVyFRZpcCwacpl5xtzXEfGVzDTUVITFVmG57LwtTpCRCtPKpqmwwFY7Ur9yFRZpcLQaclIx5lZhaeYqLFKZlpp48ic80IIbi4j3Nj+fmf+5zsKkusytwjJ/TM5VWKTyLDXxpHlZkRvrLkTqlKGhYPOGNVx23hZnV0qFW2riyQc7WYjUSa7CIg0Gx+QkScUy5CRJxTLkJEnFOuralRGxHtgObGo+PzNfX19ZkiStXCu7EPwtcB3wWcC7ZSVJfaOVkHtEZp5feyWSJLVZK2Nyn4qIF9deiSRJbdZKyL2NRtAdjIj7ImJ/RNxXd2GSJK1UK5umrulEIZIktVsrY3JExFrgRODhc8cy89q6ipIkqR1auYXgjTS6LMeAm4DnAF/ETVMlST2u1TG5ZwLfqjZSfRqwt9aqJElqg1ZC7geZ+QOAiDguM/8J2FxvWZIkrVwrY3KTEXE8cDnwmYiYAu6qtyxJklauldmVW6sv/1dEXA08Griy1qokSWqDpXYGf1Rm3hcRj2k6/LXq8yOB79VamSRJK7RUS+5DwEto7AqeQPO2yQk8qca6JElasaV2Bn9J9fmJnStHkqT2Waq78ulLfWNmfrn95UiS1D5LdVf+UfX54cA48FUaXZanAP8IPLfe0iRJWplF75PLzDOqm7+/BTw9M8cz8xk0bga/o1MFSpJ0rFq5GfzkzJybVUlm3gKcVl9JkiS1Rys3g98eERcDf0VjVuWrgdtrrUqSpDZoJeReB/wKjTUsAa4F/qy2iiRJapNWVjz5QURcCFyRmXs6UJMkSW1x1DG5iHgpjS12rqwenxYRn6i7MEmSVqqViSfvAJ4FfB8gM28CNtVYkyRJbdFKyM1k5r/VXokkSW3WSsjdEhGvBIYj4sSI+BPgC628eEScFRF7IuKOiHj7IuecHhE3RcStEfH5ZdQuSdKSWgm5twI/DhwCLgXuA371aN8UEcPA+4CzgacAr4iIp8w753hgB/DSzPxx4GXLql6SpCW0MrvyfuC/Vx/L8Szgjsy8EyAiPgycA9zWdM4rgY9n5r9U73XPMt9DkqRFLbVA85IzKDPzpUd57ccD3256PAk8e945JwEPi4hrgDXAezJz11FeV5KklizVkv
sJGiF1KY0FmWOJcxey0Pm5wPs/A3g+sAr4YkRcn5lff9ALRZwLnAuwcePGZZYhSRpUS43J/TvgN4GnAu8BXgjcm5mfz8xWJohMAk9oejwG3LXAOVdm5oHMvJfGaiqnzn+hzLyoWiB6fP369S28tSRJS+9CcDgzr8zM1wLPobHzwDUR8dYWX/sG4MSIeGJEjAIvB+Z3gf4t8FMRMRIRj6DRnem6mJKktlhy4klEHAf8B+AVNG4Afy/w8VZeODNnIuItwFXAMHBJZt4aEW+qnr8wM2+PiCuBm4FZ4OJqlwMtYXY22XdgmumZw4yODLNu9ShDQ8vtTZak8kXm/GGy6omID9Loqvw08OFeCZ/x8fGcmJjodhldMzub7Ll7P9t3TTA5dZCxtavYuW2czRvWGHSSBlZE3JiZ4/OPLzUm9xoasx/fBnwhIu6rPvZHxH11Faql7TswfSTgACanDrJ91wT7Dkx3uTJJ6j2LdldmZis3iqvDpmcOHwm4OZNTB5meOdyliiTVxaGJlWtlPzn1kNGRYcbWrnpQ0I2tXcXoyHAXq5LUbg5NtIettT6zbvUoO7eNM7Z2FcCR//DXrR7tcmWS2smhifawJddnhoaCzRvWcNl5W+zCkArm0ER7GHJ9aGgoWL/muG6XIalGDk20h92VktSDHJpoD1ty0jI4202d4tBEexhyUouc7aZOc2hi5eyulFrkbDep/xhyUouc7Sb1H0NOatHcbLdmznaTepshJ7XI2W5S/3HiidQiZ7tJ/ceQk5bB2W5Sf7G7UpJULENOklQsQ06SVCxDTpJULENOklQsQ06SVCxDTpJULENOklQsQ06SVCxDTpJULENOklQsQ06SVCxDTpJULENOklQsQ06SVCxDTpJULENOklQsQ06SVKyRbhcgtdPsbLLvwDTTM4cZHRlm3epRhoai22VJ6hJDTsWYnU323L2f7bsmmJw6yNjaVezcNs7mDWsMOmlA2V2pYuw7MH0k4AAmpw6yfdcE+w5Md7kySd1iyKkY0zOHjwTcnMmpg0zPHO5SRZK6zZBTMUZHhhlbu+pBx8bWrmJ0ZHjB82dnk737D/GdqfvZu/8Qs7PZiTIldZAhp2KsWz3Kzm3jR4Jubkxu3erRh5w7N363dcdutrzrarbu2M2eu/cbdFJhIrO//qceHx/PiYmJbpehHtXq7Mq9+w+xdcfuB3Vvjq1dxWXnbWH9muM6WbKkNoiIGzNzfP5xZ1eqKEND0VJIOX4nDQa7KzWQljt+J6k/GXIaSMsZv5PUv+yu1EAaGgo2b1jDZedtcXUUqWCGnAZWq+N3kvqX3ZWSpGLZktOyuQiypH5hyGlZXARZUj+xu1LL4iLI6gaXYNOxsiWnZfEmanWavQdaCVtyWhZvolan2XuglTDktCzeRK1Os/dAK2F3pZal5JuonTXam+Z6D+Yvpm3vgVphyGnZSryJutfGfQzcB8z1Hsz/3dh7oFa41Y5Eb22902uB2wsMfR3NYlvtOCYn0VvjPk60eKi53oPHr30E69ccZ8CpZYacRG/NGu2lwJX6nSEn0VuzRnspcKV+55jcgHOs4wG9ci06MSbXKz+r1C6Ljck5u3KAOcHhwXpl1mjdt2n4e9cgsbtygDnBoXfVOdHiWH7vrh2pfmVLboD12wQHu9jaY7m/d1t+6me25AZYP01wmPtDu3XHbra862q27tjNnrv326I4Bsv9vdviVz8z5AZYL80oPBr/0LbPcn/v/dbil5rV2l0ZEWcB7wGGgYsz852LnPdM4HrgFzPzY3XWpAf00zqU/qFtn+X+3l07Uv2stpZcRAwD7wPOBp4CvCIinrLIee8CrqqrFi2uX1aS6Keu1dL0U4tfmq/OltyzgDsy806AiPgwcA5w27zz3gr8DfDMGmtRn3OR3vZZ7kSSfmrxS/PVGXKPB77d9HgSeHbzCRHxeGArcCZLhFxEnAucC7Bx48a2F6re5x/a9llsfHOpxah75R5CabnqDLmF/v
rMnwr3x8D5mXk4YvE/Vpl5EXARNFY8aVuF6iv+oW0Pxzc1SOoMuUngCU2Px4C75p0zDny4CrgTgBdHxExmXl5jXdJAcyKJBkmdtxDcAJwYEU+MiFHg5cAnmk/IzCdm5qbM3AR8DDjPgJPq5UQSDZLaWnKZORMRb6Exa3IYuCQzb42IN1XPX1jXe0tanOObGiS13ieXmVcAV8w7tmC4ZeYv1VmLpAc4vqlB4dqVkqS26bU1Zg05SVJb9OJi3q5dKUlqi15cY9aWnNTjeq37R1pML96DachJPawXu3+kxfTiPZh2V0o9rBe7f6TF9OI9mLbkpB7Wi90/0mJ68R5MQ07qYb3Y/dMujjWWqdfuwbS7Un1rdjbZu/8Q35m6n737DzE7W97a3b3Y/dMOc2ONW3fsZsu7rmbrjt3suXt/kb9DdVdk9td/VOPj4zkxMdHtMtRlgzQho8QWz979h9i6Y/dDWqhLbfdTihJ/n70gIm7MzPH5x23JqS8N0oSMftm9fTkGdazRFmznGXLqS4P6R7IUc2ONzebGGkvuhh6kf5z1CkNOfWmpP5LqfYuNNa5d9bCiWzr+46zzDDn1pVInZHRCL7SUmqea7z7/DC47bwubN6xh6uAPi27p+I+zzvMWAvWlXrwfpx/00oSdhaaal97SmfvH2fzr7z/O6mPIqW/12v04/WCxMaFemdVY8n2B4D/OusHuSmmA9HpLaRC6oUucLdvLbMlJA6TXW0q2dNRutuSkAdIPLSVbOmonW3LSALGlpEFjyEl9ZqXLQjlh5+hceqschpzUR3rpFoBSeY3L4pic1EdcFqp+XuOyGHJSH+n1WwBK4DUuiyEn9RGXhaqf17gshpyK1gvrNLZTP9wC0O+8xmVx01QVq9QJBM78q5/XuP+4aaoGTqkTCHrhZunSWsjz9cI1Vnt4C4GK5QSCepTaQlaZbMmpWE4gqEepLWSVyZBTsZxAUA9byOondleqWK7TWI9e38lAamZLTkVzAkH72UJWO3Rq8pItOUnLYgtZK9XJyUu25CQtmy1krUQnJy8ZcpKkjurk5CVDTuoTpd+ArcHRydt7DDmpD8yNYWzdsZst77qarTt2s+fu/Qad+lInJy+5dqXUB/buP8TWHbsfMm3/svO2uMu3+lK71wddbO1KZ1dKfcAbsFWauclLtb9P7e8gacVcokw6Noac1Ae8AVs6NnZXSn3AG7ClY2PISX2iU2MYUknsrpQkFcuQkyQVy5CTJBXLkJMkFcuJJ1LB2r2qhNRvDDmpUJ3cs0vqVXZXSoXq5J5dUq8y5KRCud6lZMhJxXK9S8mQk4rlepeSE0+kYrnepWTISUVzvUsNOrsrJUnFMuQkScUy5CRJxTLkJEnFMuQkScUy5CRJxTLkJEnFqjXkIuKsiNgTEXdExNsXeP5VEXFz9fGFiDi1znokSYOltpCLiGHgfcDZwFOAV0TEU+ad9k3geZl5CnABcFFd9UiSBk+dLblnAXdk5p2ZOQ18GDin+YTM/EJmTlUPrwfGaqxHkjRg6gy5xwPfbno8WR1bzBuATy/0REScGxETETGxd+/eNpYoSSpZnSG30CqwueCJEWfQCLnzF3o+My/KzPHMHF+/fn0bS5QklazOBZongSc0PR4D7pp/UkScAlwMnJ2Z+472ojfeeOO9EfGttlXZfScA93a7iC7zGngNwGsAXoM5x3IdfnShg5G5YONqxSJiBPg68HzgO8ANwCsz89amczYC/wBsy8wv1FJIj4uIicwc73Yd3eQ18BqA1wC8BnPaeR1qa8ll5kxEvAW4ChgGLsnMWyPiTdXzFwK/BawDdkQEwIy/YElSu9S6n1xmXgFcMe/YhU1fvxF4Y501SJIGlyuedJ/3BnoNwGsAXgPwGsxp23WobUxOkqRusyUnSSqWISdJKpYh10ERcUlE3BMRtzQde0xEfCYivlF9XtvNGusUEU+IiKsj4vaIuDUi3lYdH5hrABARD4+IL0XEV6vr8NvV8U
G7DsMR8ZWI+FT1eKB+foCI+OeI+FpE3BQRE9WxgboOEXF8RHwsIv6p+tvwE+28BoZcZ30AOGvesbcDn8vME4HPVY9LNQP818z898BzgDdXi3YP0jUAOAScmZmnAqcBZ0XEcxi86/A24Pamx4P28885IzNPa7p9atCuw3uAKzPzZOBUGv9NtO8aZKYfHfwANgG3ND3eAzy2+vqxwJ5u19jBa/G3wAsH/Bo8Avgy8OxBug40VkD6HHAm8Knq2MD8/E3X4Z+BE+YdG5jrADyKxm40Udc1sCXXfRsy87sA1ecf6XI9HRERm4CnAf/IAF6DqqvuJuAe4DOZOWjX4Y+BXwdmm44N0s8/J4G/j4gbI+Lc6tggXYcnAXuB91dd1xdHxGraeA0MOXVcRDwS+BvgVzPzvm7X0w2ZeTgzT6PRonlWRDy12zV1SkS8BLgnM2/sdi09YEtmPp3Gvptvjoif7nZBHTYCPB34s8x8GnCANnfPGnLdd3dEPBag+nxPl+upVUQ8jEbA/XVmfrw6PFDXoFlmfh+4hsZY7aBchy3ASyPin2nsM3lmRPwVg/PzH5GZd1Wf7wEuo7EP5yBdh0lgsurJAPgYjdBr2zUw5LrvE8Brq69fS2OcqkjRWKD0L4DbM/PdTU8NzDUAiIj1EXF89fUq4AXAPzEg1yEzfyMzxzJzE/By4B8y89UMyM8/JyJWR8Saua+BFwG3MEDXITP/Ffh2RGyuDj0fuI02XgNXPOmgiLgUOJ3GNhJ3A+8ALgc+AmwE/gV4WWZ+r1s11ikingtcB3yNB8ZifpPGuNxAXAM4sr3UB2ksXD4EfCQzfyci1jFA1wEgIk4Hfi0zXzJoP39EPIlG6w0a3XYfyszfG8DrcBqN7dZGgTuB11H9f0EbroEhJ0kqlt2VkqRiGXKSpGIZcpKkYhlykqRiGXKSpGIZctJRRMSGiPhQRNxZLb/0xYjY2u26OiEifjYifqv6+q0RcUtEXBERo9Wx50bEu5vOXx8RV3arXmk+Q05aQnUD++XAtZn5pMx8Bo0bmMcWOHek0/V1wK8DO6qv3wicAnwF+Jnq2vxP4IK5kzNzL/DdiNjS6UKlhRhy0tLOBKYz88K5A5n5rcz8E4CI+KWI+GhEfKnslWkAAAOZSURBVJLGQrurq30Db6gWnD2nOm84Iv5PdfzmiPjl6vjpEXFN035af12FBxHxzoi4rTr/D6tjH4iICyPiuoj4erUOJBGxqTr25erjJ+fqjYhfr/Ys+2pEvLM69uSIuLJqmV4XESfP/8Ej4iTgUGbe23T4YTR2Tvgh8BrgisycmvetlwOvWslFl9qlxH95Su304zS2wlnKTwCnZOb3IuL3aSxT9fpq6a4vRcRnafzR/7fMfGZEHAfsjoi/r77/adX73AXsBrZExG3AVuDkzMy5ZcAqm4DnAU8Gro6IH6Oxtt8LM/MHEXEicCkwHhFnAz8LPDsz74+Ix1SvcRHwpsz8RkQ8m0Zr7cx5P9eWeT/7HwLXA7dWdV7OQ/dHBJgAfvco10zqCENOWoaIeB/wXBqtu2dWhz/TtOTQi2gsPvxr1eOH01ia6EXAKRHx89XxRwMnAtPAlzJzsnr9m2iE2PXAD4CLI+LvgE81lfGRzJwFvhERdwIn09iT60+rJZIOAydV574AeH9m3g9QBfEjgZ8EPlo1GgGOW+DHfSyNbVCovvcvgb+s6nwH8F7g7IjYBnybxoa4szQC93FLXkipQww5aWm3Aj839yAz3xwRJ9Borcw50PR1AD+XmXuaX6TqgnxrZl417/jpNHYKn3MYGMnMmYh4Fo0Fa18OvIUHWlrz1+JL4L/QWA/1VBrDED9oqmf++UPA96utfpZykEYYP0hEPA54Zmb+dkR8iUZL9veqWj9DI9gPHuW1pY5wTE5a2j8AD4+IX2k69oglzr8KeGvTuNrTmo7/SjS2GiIiTqpWnl9Q1dp6dGZeAfwq0BxIL4uIoYh4Mo1NJ/fQCKPvVi2p19BY/B
ng74HXR8Qjqtd9TLWH3zcj4mXVsYiIUxco43bgxxY4fgGNCScAq2iE6CwPXJeTaKymL3WdISctIRsrmP8s8LyI+GbVcvkgcP4i33IBjckZN0fELTww8/BiGluIfLk6/ucs3ZOyBvhURNwMfJ5GS23OnurYp2mMq/2AxpjaayPiehohc6Cq/0oa25ZMVF2hc92orwLeEBFfpdFaPWeBGq4FnjYX2PBAaGfmV6pDf0FjV4mnA3O3DpwB/N0SP5vUMe5CIPWRiPgA8KnM/FiH3u89wCcz87PL+J5rgXMWmHUpdZwtOUlL+X2W7p59kIhYD7zbgFOvsCUnSSqWLTlJUrEMOUlSsQw5SVKxDDlJUrEMOUlSsf4/Qoyyc8CTcm0AAAAASUVORK5CYII=\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Now that you have explored some real data with Python and Pandas keep learning by exploring more of this dataset or create a new notebook and start with your own data. The data used here is a clean dataset, which is definitely not always the case, so stay alert to always check all your data. \n\n<div class=\"alert alert-info\" style=\"font-size:100%\">\n<b>To learn more about Pandas start with this <a href=\"http://pandas.pydata.org/pandas-docs/stable/getting_started/10min.html\">10 minute introduction</a><br>\n</div>\n\n## Optional Excercises and further learning\n\nIf you finish early:\n\n2. Try to create other plots. Have a look at the [Pandas plot examples](https://pandas.pydata.org/pandas-docs/stable/user_guide/visualization.html) or the [Seaborn gallery](https://seaborn.pydata.org/examples/index.html) for inspiration. \n3. Or load one of your own datasets into a new notebook and play around with the data to practice what you have learned. You can use the free account you created today for your own projects as well! \n4. Have a look at these Pandas workshops and book: <br>\n4.1. [Pandas workshop by Alexander Hensdorf](https://github.com/alanderex/pydata-pandas-workshop) <br>\n4.2. [Pandas tutorial by Joris van den Bossche](https://github.com/jorisvandenbossche/pandas-tutorial) <br>\n4.3. [Python Data Science Handbook](https://jakevdp.github.io/PythonDataScienceHandbook/) <br>\n\n### Authors\n\nMargriet Groenendijk is a Data & AI Developer Advocate for IBM. She develops and presents talks and workshops about data science and AI. She is active in the local developer communities through attending, presenting and organising meetups and conferences. She has a background in climate science where she\u00a0explored large observational\u00a0datasets of carbon uptake by forests\u00a0during her PhD, and\u00a0global scale weather and climate models as a postdoctoral fellow.\u00a0\n\nYamini Rao is a Developer Advocate for IBM. 
She compiles developer scenarios, workshops and training material based on IBM Cloud technologies to demonstrate value. She also works as a community manager, collaborating with local developer communites to organise workshops and meetups. She has a background in computer science and has worked extensively as an Implementation Engineer for various IBM Analytical tools. \n\nCopyright \u00a9 2020 IBM. This notebook and its source code are released under the terms of the MIT License."
},
{
"metadata": {},
"cell_type": "code",
"source": "",
"execution_count": null,
"outputs": []
}
],
"metadata": {
"kernelspec": {
"name": "python3",
"display_name": "Python 3.7",
"language": "python"
},
"language_info": {
"name": "python",
"version": "3.7.10",
"mimetype": "text/x-python",
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"pygments_lexer": "ipython3",
"nbconvert_exporter": "python",
"file_extension": ".py"
}
},
"nbformat": 4,
"nbformat_minor": 2
}