Skip to content

Instantly share code, notes, and snippets.

@ikoustou
Created March 12, 2018 20:39
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save ikoustou/d2a0ba583fe59d5219640a2abe0fae4c to your computer and use it in GitHub Desktop.
Save ikoustou/d2a0ba583fe59d5219640a2abe0fae4c to your computer and use it in GitHub Desktop.
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import numpy as np"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import pandas as pd"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"from numpy.random import randn #randn return samples from the standard normal distribution"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create a dataframe (2-Dimensional data size tabular data), giving 'data', 'index' and 'columns'"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"df = pd.DataFrame(randn(3,3), index = ['A','B','C'] , columns = 'X Y Z'.split())\n",
"#you can give the names of index, columns as a list or you can split a string to create the list"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style>\n",
" .dataframe thead tr:only-child th {\n",
" text-align: right;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: left;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>X</th>\n",
" <th>Y</th>\n",
" <th>Z</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>A</th>\n",
" <td>-0.555143</td>\n",
" <td>0.694758</td>\n",
" <td>0.940296</td>\n",
" </tr>\n",
" <tr>\n",
" <th>B</th>\n",
" <td>0.122956</td>\n",
" <td>1.242026</td>\n",
" <td>0.220803</td>\n",
" </tr>\n",
" <tr>\n",
" <th>C</th>\n",
" <td>-0.207121</td>\n",
" <td>0.039789</td>\n",
" <td>-1.027087</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" X Y Z\n",
"A -0.555143 0.694758 0.940296\n",
"B 0.122956 1.242026 0.220803\n",
"C -0.207121 0.039789 -1.027087"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Show first column"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"A -0.555143\n",
"B 0.122956\n",
"C -0.207121\n",
"Name: X, dtype: float64"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df['X'] #use []"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The result is a 'Series'. So we can count each column of a 'Dataframe' as a 'Series'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Show columns 'X' and 'Y'"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style>\n",
" .dataframe thead tr:only-child th {\n",
" text-align: right;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: left;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>X</th>\n",
" <th>Y</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>A</th>\n",
" <td>-0.555143</td>\n",
" <td>0.694758</td>\n",
" </tr>\n",
" <tr>\n",
" <th>B</th>\n",
" <td>0.122956</td>\n",
" <td>1.242026</td>\n",
" </tr>\n",
" <tr>\n",
" <th>C</th>\n",
" <td>-0.207121</td>\n",
" <td>0.039789</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" X Y\n",
"A -0.555143 0.694758\n",
"B 0.122956 1.242026\n",
"C -0.207121 0.039789"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df[['X', 'Y']] #notice the double [[]]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Show the first two rows"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style>\n",
" .dataframe thead tr:only-child th {\n",
" text-align: right;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: left;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>X</th>\n",
" <th>Y</th>\n",
" <th>Z</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>A</th>\n",
" <td>-0.555143</td>\n",
" <td>0.694758</td>\n",
" <td>0.940296</td>\n",
" </tr>\n",
" <tr>\n",
" <th>B</th>\n",
" <td>0.122956</td>\n",
" <td>1.242026</td>\n",
" <td>0.220803</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" X Y Z\n",
"A -0.555143 0.694758 0.940296\n",
"B 0.122956 1.242026 0.220803"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df[:2]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Show the last 2 rows of columns 'Y' and 'Z'"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style>\n",
" .dataframe thead tr:only-child th {\n",
" text-align: right;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: left;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Y</th>\n",
" <th>Z</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>B</th>\n",
" <td>1.242026</td>\n",
" <td>0.220803</td>\n",
" </tr>\n",
" <tr>\n",
" <th>C</th>\n",
" <td>0.039789</td>\n",
" <td>-1.027087</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Y Z\n",
"B 1.242026 0.220803\n",
"C 0.039789 -1.027087"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df[-2:][['Y', 'Z']]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Add a new column with the name 'SUM' at the end of the Dataframe containing the sum of X,Y,Z "
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"df['SUM'] = df['X'] + df['Y'] + df['Z']"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style>\n",
" .dataframe thead tr:only-child th {\n",
" text-align: right;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: left;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>X</th>\n",
" <th>Y</th>\n",
" <th>Z</th>\n",
" <th>SUM</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>A</th>\n",
" <td>-0.555143</td>\n",
" <td>0.694758</td>\n",
" <td>0.940296</td>\n",
" <td>1.079911</td>\n",
" </tr>\n",
" <tr>\n",
" <th>B</th>\n",
" <td>0.122956</td>\n",
" <td>1.242026</td>\n",
" <td>0.220803</td>\n",
" <td>1.585785</td>\n",
" </tr>\n",
" <tr>\n",
" <th>C</th>\n",
" <td>-0.207121</td>\n",
" <td>0.039789</td>\n",
" <td>-1.027087</td>\n",
" <td>-1.194419</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" X Y Z SUM\n",
"A -0.555143 0.694758 0.940296 1.079911\n",
"B 0.122956 1.242026 0.220803 1.585785\n",
"C -0.207121 0.039789 -1.027087 -1.194419"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Drop column 'SUM' permanently "
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"df.drop('SUM', axis=1 , inplace=True) \n",
"\n",
"#axis=1 because rows have axis=0 and columns have axis=1\n",
"#inplace=True for permanent drop. If False or don't include it the drop is not permanent"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style>\n",
" .dataframe thead tr:only-child th {\n",
" text-align: right;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: left;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>X</th>\n",
" <th>Y</th>\n",
" <th>Z</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>A</th>\n",
" <td>-0.555143</td>\n",
" <td>0.694758</td>\n",
" <td>0.940296</td>\n",
" </tr>\n",
" <tr>\n",
" <th>B</th>\n",
" <td>0.122956</td>\n",
" <td>1.242026</td>\n",
" <td>0.220803</td>\n",
" </tr>\n",
" <tr>\n",
" <th>C</th>\n",
" <td>-0.207121</td>\n",
" <td>0.039789</td>\n",
" <td>-1.027087</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" X Y Z\n",
"A -0.555143 0.694758 0.940296\n",
"B 0.122956 1.242026 0.220803\n",
"C -0.207121 0.039789 -1.027087"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Locate using 'loc'. \n",
"Locate the four corners of your Dataframe (where rows 'A','C' meet columns 'X' and 'Z')"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style>\n",
" .dataframe thead tr:only-child th {\n",
" text-align: right;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: left;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>X</th>\n",
" <th>Z</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>A</th>\n",
" <td>-0.555143</td>\n",
" <td>0.940296</td>\n",
" </tr>\n",
" <tr>\n",
" <th>C</th>\n",
" <td>-0.207121</td>\n",
" <td>-1.027087</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" X Z\n",
"A -0.555143 0.940296\n",
"C -0.207121 -1.027087"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.loc[['A','C'] , ['X','Z']]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Conditional Selection"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Condition on 'X' column, showing all the row: Show the rows where 'X' is negative"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style>\n",
" .dataframe thead tr:only-child th {\n",
" text-align: right;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: left;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>X</th>\n",
" <th>Y</th>\n",
" <th>Z</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>A</th>\n",
" <td>-0.555143</td>\n",
" <td>0.694758</td>\n",
" <td>0.940296</td>\n",
" </tr>\n",
" <tr>\n",
" <th>C</th>\n",
" <td>-0.207121</td>\n",
" <td>0.039789</td>\n",
" <td>-1.027087</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" X Y Z\n",
"A -0.555143 0.694758 0.940296\n",
"C -0.207121 0.039789 -1.027087"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df[df['X']<0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Condition on 'X' column but data on 'Y' column"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"A 0.694758\n",
"C 0.039789\n",
"Name: Y, dtype: float64"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df[df['X']<0]['Y']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Dataframe is a collection of Series"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"pandas.core.frame.DataFrame"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"type(df)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"pandas.core.series.Series"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"type(df['X'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Load a CSV file from a URL or a local file"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Read a 'csv' file offered online by gov.uk which shows the volume of property sales in UK from 1995 to Oct 2017. The link is\n",
"http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/Sales-2017-12.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=sales&utm_term=9.30_13_02_18"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"df_sales = pd.read_csv('http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/Sales-2017-12.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=sales&utm_term=9.30_13_02_18')"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style>\n",
" .dataframe thead tr:only-child th {\n",
" text-align: right;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: left;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Date</th>\n",
" <th>Region_Name</th>\n",
" <th>Area_Code</th>\n",
" <th>Sales_Volume</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1995-01-01</td>\n",
" <td>England</td>\n",
" <td>E92000001</td>\n",
" <td>47639</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1995-01-01</td>\n",
" <td>Wales</td>\n",
" <td>W92000004</td>\n",
" <td>1991</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>1995-01-01</td>\n",
" <td>Inner London</td>\n",
" <td>E13000001</td>\n",
" <td>2890</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>1995-01-01</td>\n",
" <td>Outer London</td>\n",
" <td>E13000002</td>\n",
" <td>4616</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>1995-01-01</td>\n",
" <td>London</td>\n",
" <td>E12000007</td>\n",
" <td>7506</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Date Region_Name Area_Code Sales_Volume\n",
"0 1995-01-01 England E92000001 47639\n",
"1 1995-01-01 Wales W92000004 1991\n",
"2 1995-01-01 Inner London E13000001 2890\n",
"3 1995-01-01 Outer London E13000002 4616\n",
"4 1995-01-01 London E12000007 7506"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#show first rows of Dataframe\n",
"df_sales.head()"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style>\n",
" .dataframe thead tr:only-child th {\n",
" text-align: right;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: left;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Date</th>\n",
" <th>Region_Name</th>\n",
" <th>Area_Code</th>\n",
" <th>Sales_Volume</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>115597</th>\n",
" <td>2017-10-01</td>\n",
" <td>Conwy</td>\n",
" <td>W06000003</td>\n",
" <td>188</td>\n",
" </tr>\n",
" <tr>\n",
" <th>115598</th>\n",
" <td>2017-10-01</td>\n",
" <td>Flintshire</td>\n",
" <td>W06000005</td>\n",
" <td>176</td>\n",
" </tr>\n",
" <tr>\n",
" <th>115599</th>\n",
" <td>2017-10-01</td>\n",
" <td>England and Wales</td>\n",
" <td>K04000001</td>\n",
" <td>67408</td>\n",
" </tr>\n",
" <tr>\n",
" <th>115600</th>\n",
" <td>2017-10-01</td>\n",
" <td>Great Britain</td>\n",
" <td>K03000001</td>\n",
" <td>76255</td>\n",
" </tr>\n",
" <tr>\n",
" <th>115601</th>\n",
" <td>2017-10-01</td>\n",
" <td>United Kingdom</td>\n",
" <td>K02000001</td>\n",
" <td>81756</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Date Region_Name Area_Code Sales_Volume\n",
"115597 2017-10-01 Conwy W06000003 188\n",
"115598 2017-10-01 Flintshire W06000005 176\n",
"115599 2017-10-01 England and Wales K04000001 67408\n",
"115600 2017-10-01 Great Britain K03000001 76255\n",
"115601 2017-10-01 United Kingdom K02000001 81756"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#show last rows of Dataframe\n",
"df_sales.tail()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Show sales for Reading"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style>\n",
" .dataframe thead tr:only-child th {\n",
" text-align: right;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: left;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Date</th>\n",
" <th>Region_Name</th>\n",
" <th>Area_Code</th>\n",
" <th>Sales_Volume</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>380</th>\n",
" <td>1995-01-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>137</td>\n",
" </tr>\n",
" <tr>\n",
" <th>719</th>\n",
" <td>1995-02-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>161</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1169</th>\n",
" <td>1995-03-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>217</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1538</th>\n",
" <td>1995-04-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>153</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1967</th>\n",
" <td>1995-05-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>232</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2291</th>\n",
" <td>1995-06-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>209</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2739</th>\n",
" <td>1995-07-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>241</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3149</th>\n",
" <td>1995-08-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>244</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3536</th>\n",
" <td>1995-09-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>223</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3883</th>\n",
" <td>1995-10-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>182</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4322</th>\n",
" <td>1995-11-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>206</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4650</th>\n",
" <td>1995-12-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>196</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5047</th>\n",
" <td>1996-01-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>151</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5512</th>\n",
" <td>1996-02-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>164</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5859</th>\n",
" <td>1996-03-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>244</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6249</th>\n",
" <td>1996-04-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>203</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6626</th>\n",
" <td>1996-05-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>251</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7052</th>\n",
" <td>1996-06-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>281</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7474</th>\n",
" <td>1996-07-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>290</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7874</th>\n",
" <td>1996-08-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>345</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8226</th>\n",
" <td>1996-09-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>311</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8614</th>\n",
" <td>1996-10-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>286</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9035</th>\n",
" <td>1996-11-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>280</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9416</th>\n",
" <td>1996-12-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>303</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9795</th>\n",
" <td>1997-01-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>210</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10196</th>\n",
" <td>1997-02-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>248</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10614</th>\n",
" <td>1997-03-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>279</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10979</th>\n",
" <td>1997-04-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>269</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11348</th>\n",
" <td>1997-05-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>290</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11770</th>\n",
" <td>1997-06-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>324</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>102793</th>\n",
" <td>2015-05-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>305</td>\n",
" </tr>\n",
" <tr>\n",
" <th>103248</th>\n",
" <td>2015-06-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>254</td>\n",
" </tr>\n",
" <tr>\n",
" <th>103683</th>\n",
" <td>2015-07-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>285</td>\n",
" </tr>\n",
" <tr>\n",
" <th>104087</th>\n",
" <td>2015-08-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>288</td>\n",
" </tr>\n",
" <tr>\n",
" <th>104552</th>\n",
" <td>2015-09-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>259</td>\n",
" </tr>\n",
" <tr>\n",
" <th>104943</th>\n",
" <td>2015-10-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>281</td>\n",
" </tr>\n",
" <tr>\n",
" <th>105416</th>\n",
" <td>2015-11-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>243</td>\n",
" </tr>\n",
" <tr>\n",
" <th>105890</th>\n",
" <td>2015-12-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>227</td>\n",
" </tr>\n",
" <tr>\n",
" <th>106278</th>\n",
" <td>2016-01-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>188</td>\n",
" </tr>\n",
" <tr>\n",
" <th>106744</th>\n",
" <td>2016-02-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>225</td>\n",
" </tr>\n",
" <tr>\n",
" <th>107168</th>\n",
" <td>2016-03-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>461</td>\n",
" </tr>\n",
" <tr>\n",
" <th>107591</th>\n",
" <td>2016-04-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>194</td>\n",
" </tr>\n",
" <tr>\n",
" <th>108077</th>\n",
" <td>2016-05-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>155</td>\n",
" </tr>\n",
" <tr>\n",
" <th>108485</th>\n",
" <td>2016-06-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>261</td>\n",
" </tr>\n",
" <tr>\n",
" <th>108949</th>\n",
" <td>2016-07-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>291</td>\n",
" </tr>\n",
" <tr>\n",
" <th>109365</th>\n",
" <td>2016-08-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>255</td>\n",
" </tr>\n",
" <tr>\n",
" <th>109832</th>\n",
" <td>2016-09-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>213</td>\n",
" </tr>\n",
" <tr>\n",
" <th>110245</th>\n",
" <td>2016-10-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>201</td>\n",
" </tr>\n",
" <tr>\n",
" <th>110678</th>\n",
" <td>2016-11-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>184</td>\n",
" </tr>\n",
" <tr>\n",
" <th>111127</th>\n",
" <td>2016-12-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>214</td>\n",
" </tr>\n",
" <tr>\n",
" <th>111585</th>\n",
" <td>2017-01-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>194</td>\n",
" </tr>\n",
" <tr>\n",
" <th>112015</th>\n",
" <td>2017-02-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>188</td>\n",
" </tr>\n",
" <tr>\n",
" <th>112490</th>\n",
" <td>2017-03-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>204</td>\n",
" </tr>\n",
" <tr>\n",
" <th>112891</th>\n",
" <td>2017-04-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>154</td>\n",
" </tr>\n",
" <tr>\n",
" <th>113374</th>\n",
" <td>2017-05-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>161</td>\n",
" </tr>\n",
" <tr>\n",
" <th>113786</th>\n",
" <td>2017-06-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>231</td>\n",
" </tr>\n",
" <tr>\n",
" <th>114237</th>\n",
" <td>2017-07-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>204</td>\n",
" </tr>\n",
" <tr>\n",
" <th>114675</th>\n",
" <td>2017-08-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>215</td>\n",
" </tr>\n",
" <tr>\n",
" <th>115136</th>\n",
" <td>2017-09-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>173</td>\n",
" </tr>\n",
" <tr>\n",
" <th>115556</th>\n",
" <td>2017-10-01</td>\n",
" <td>Reading</td>\n",
" <td>E06000038</td>\n",
" <td>173</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>274 rows × 4 columns</p>\n",
"</div>"
],
"text/plain": [
" Date Region_Name Area_Code Sales_Volume\n",
"380 1995-01-01 Reading E06000038 137\n",
"719 1995-02-01 Reading E06000038 161\n",
"1169 1995-03-01 Reading E06000038 217\n",
"1538 1995-04-01 Reading E06000038 153\n",
"1967 1995-05-01 Reading E06000038 232\n",
"2291 1995-06-01 Reading E06000038 209\n",
"2739 1995-07-01 Reading E06000038 241\n",
"3149 1995-08-01 Reading E06000038 244\n",
"3536 1995-09-01 Reading E06000038 223\n",
"3883 1995-10-01 Reading E06000038 182\n",
"4322 1995-11-01 Reading E06000038 206\n",
"4650 1995-12-01 Reading E06000038 196\n",
"5047 1996-01-01 Reading E06000038 151\n",
"5512 1996-02-01 Reading E06000038 164\n",
"5859 1996-03-01 Reading E06000038 244\n",
"6249 1996-04-01 Reading E06000038 203\n",
"6626 1996-05-01 Reading E06000038 251\n",
"7052 1996-06-01 Reading E06000038 281\n",
"7474 1996-07-01 Reading E06000038 290\n",
"7874 1996-08-01 Reading E06000038 345\n",
"8226 1996-09-01 Reading E06000038 311\n",
"8614 1996-10-01 Reading E06000038 286\n",
"9035 1996-11-01 Reading E06000038 280\n",
"9416 1996-12-01 Reading E06000038 303\n",
"9795 1997-01-01 Reading E06000038 210\n",
"10196 1997-02-01 Reading E06000038 248\n",
"10614 1997-03-01 Reading E06000038 279\n",
"10979 1997-04-01 Reading E06000038 269\n",
"11348 1997-05-01 Reading E06000038 290\n",
"11770 1997-06-01 Reading E06000038 324\n",
"... ... ... ... ...\n",
"102793 2015-05-01 Reading E06000038 305\n",
"103248 2015-06-01 Reading E06000038 254\n",
"103683 2015-07-01 Reading E06000038 285\n",
"104087 2015-08-01 Reading E06000038 288\n",
"104552 2015-09-01 Reading E06000038 259\n",
"104943 2015-10-01 Reading E06000038 281\n",
"105416 2015-11-01 Reading E06000038 243\n",
"105890 2015-12-01 Reading E06000038 227\n",
"106278 2016-01-01 Reading E06000038 188\n",
"106744 2016-02-01 Reading E06000038 225\n",
"107168 2016-03-01 Reading E06000038 461\n",
"107591 2016-04-01 Reading E06000038 194\n",
"108077 2016-05-01 Reading E06000038 155\n",
"108485 2016-06-01 Reading E06000038 261\n",
"108949 2016-07-01 Reading E06000038 291\n",
"109365 2016-08-01 Reading E06000038 255\n",
"109832 2016-09-01 Reading E06000038 213\n",
"110245 2016-10-01 Reading E06000038 201\n",
"110678 2016-11-01 Reading E06000038 184\n",
"111127 2016-12-01 Reading E06000038 214\n",
"111585 2017-01-01 Reading E06000038 194\n",
"112015 2017-02-01 Reading E06000038 188\n",
"112490 2017-03-01 Reading E06000038 204\n",
"112891 2017-04-01 Reading E06000038 154\n",
"113374 2017-05-01 Reading E06000038 161\n",
"113786 2017-06-01 Reading E06000038 231\n",
"114237 2017-07-01 Reading E06000038 204\n",
"114675 2017-08-01 Reading E06000038 215\n",
"115136 2017-09-01 Reading E06000038 173\n",
"115556 2017-10-01 Reading E06000038 173\n",
"\n",
"[274 rows x 4 columns]"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_sales[df_sales['Region_Name']=='Reading']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Find the max 'Sales_Volume' properties sold in Reading"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"463"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_sales[df_sales['Region_Name']=='Reading']['Sales_Volume'].max()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Find the index of the max 'Sales_Volume' sold in Reading"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"21616"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_sales[df_sales['Region_Name']=='Reading']['Sales_Volume'].argmax()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Locate a row using its index. Use 'iloc' method"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Date 1999-07-01\n",
"Region_Name Reading\n",
"Area_Code E06000038\n",
"Sales_Volume 463\n",
"Name: 21616, dtype: object"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_sales.iloc[21616]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Find the Region where there was the minimum Sales_volume ever"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'City of London'"
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#Use loc and idxmin() methods\n",
"df_sales.loc[df_sales['Sales_Volume'].argmin()]['Region_Name']"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.14"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment