Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save psychemedia/5ed854a661d91c2ed970a65464282ac9 to your computer and use it in GitHub Desktop.
Save psychemedia/5ed854a661d91c2ed970a65464282ac9 to your computer and use it in GitHub Desktop.
Casting time periods in pandas - NHS timeseries example
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Representing Time Periods\n",
"\n",
"A notebook that demonstrates how to cast various time periods (weekly, monthly, quarterly) using *pandas*."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Quarterly Time Series\n",
"\n",
"Time series such as the NHS Digital Accident and Emergency figures are published as quarterly and monthly time series.\n",
"\n",
"The following example shows how to cast the quarterly reports to a quarterly time period in *pandas*, with quarters aligned by financial year."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"--2017-08-08 11:15:51-- https://www.england.nhs.uk/statistics/wp-content/uploads/sites/2/2016/06/Quarterly-timeseries-May-2017.xls\n",
"Resolving www.england.nhs.uk... 54.230.199.109, 54.230.199.252, 54.230.199.172, ...\n",
"Connecting to www.england.nhs.uk|54.230.199.109|:443... connected.\n",
"HTTP request sent, awaiting response... 200 OK\n",
"Length: 92672 (90K) [application/vnd.ms-excel]\n",
"Saving to: 'data/Quarterly-timeseries-May-2017.xls'\n",
"\n",
"Quarterly-timeserie 100%[=====================>] 90.50K --.-KB/s in 0.02s \n",
"\n",
"2017-08-08 11:15:52 (4.52 MB/s) - 'data/Quarterly-timeseries-May-2017.xls' saved [92672/92672]\n",
"\n"
]
}
],
"source": [
"#Download example data file\n",
"url='https://www.england.nhs.uk/statistics/wp-content/uploads/sites/2/2016/06/Quarterly-timeseries-May-2017.xls'\n",
"!wget -P data/ {url}"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style>\n",
" .dataframe thead tr:only-child th {\n",
" text-align: right;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: left;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr>\n",
" <th>0</th>\n",
" <th colspan=\"3\" halign=\"left\">Reporting</th>\n",
" <th colspan=\"4\" halign=\"left\">A&amp;E attendances</th>\n",
" <th colspan=\"6\" halign=\"left\">A&amp;E attendances &gt; 4 hours from arrival to admission, transfer or discharge</th>\n",
" <th colspan=\"8\" halign=\"left\">Emergency Admissions</th>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <th>Code</th>\n",
" <th>Year</th>\n",
" <th>Quarter</th>\n",
" <th>Type 1 Departments - Major A&amp;E</th>\n",
" <th>Type 2 Departments - Single Specialty</th>\n",
" <th>Type 3 Departments - Other A&amp;E/Minor Injury Unit</th>\n",
" <th>Total attendances</th>\n",
" <th>Type 1 Departments - Major A&amp;E</th>\n",
" <th>Type 2 Departments - Single Specialty</th>\n",
" <th>Type 3 Departments - Other A&amp;E/Minor Injury Unit</th>\n",
" <th>...</th>\n",
" <th>Percentage in 4 hours or less (type 1)</th>\n",
" <th>Percentage in 4 hours or less (all)</th>\n",
" <th>Emergency Admissions via Type 1 A&amp;E</th>\n",
" <th>Emergency Admissions via Type 2 A&amp;E</th>\n",
" <th>Emergency Admissions via Type 3 and 4 A&amp;E</th>\n",
" <th>Total Emergency admissions via A&amp;E</th>\n",
" <th>Other Emergency admissions (i.e not via A&amp;E)</th>\n",
" <th>Total emergency admissions</th>\n",
" <th>Number of patients spending &gt;4 hours from decision to admit to admission</th>\n",
" <th>Number of patients spending &gt;12 hours from decision to admit to admission</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>NaN</td>\n",
" <td>2004-05</td>\n",
" <td>Q1: April - June</td>\n",
" <td>3377850</td>\n",
" <td>159966</td>\n",
" <td>964762</td>\n",
" <td>4502578</td>\n",
" <td>232110</td>\n",
" <td>2612</td>\n",
" <td>3305</td>\n",
" <td>...</td>\n",
" <td>0.931285</td>\n",
" <td>0.947135</td>\n",
" <td>651785</td>\n",
" <td>6778</td>\n",
" <td>7830</td>\n",
" <td>666393</td>\n",
" <td>666393</td>\n",
" <td>666393</td>\n",
" <td>27665</td>\n",
" <td>27665</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>Q2: July - Sept</td>\n",
" <td>3381219</td>\n",
" <td>156697</td>\n",
" <td>1018779</td>\n",
" <td>4556695</td>\n",
" <td>182554</td>\n",
" <td>1830</td>\n",
" <td>2625</td>\n",
" <td>...</td>\n",
" <td>0.946009</td>\n",
" <td>0.95896</td>\n",
" <td>673452</td>\n",
" <td>6670</td>\n",
" <td>8256</td>\n",
" <td>688378</td>\n",
" <td>688378</td>\n",
" <td>688378</td>\n",
" <td>20989</td>\n",
" <td>20989</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>Q3: Oct - Dec</td>\n",
" <td>3257398</td>\n",
" <td>150643</td>\n",
" <td>966886</td>\n",
" <td>4374927</td>\n",
" <td>136953</td>\n",
" <td>1415</td>\n",
" <td>1364</td>\n",
" <td>...</td>\n",
" <td>0.957956</td>\n",
" <td>0.968061</td>\n",
" <td>705901</td>\n",
" <td>6802</td>\n",
" <td>8691</td>\n",
" <td>721394</td>\n",
" <td>721394</td>\n",
" <td>721394</td>\n",
" <td>19831</td>\n",
" <td>19831</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>Q4: Jan - Mar</td>\n",
" <td>3249353</td>\n",
" <td>151926</td>\n",
" <td>1001701</td>\n",
" <td>4402980</td>\n",
" <td>115978</td>\n",
" <td>982</td>\n",
" <td>687</td>\n",
" <td>...</td>\n",
" <td>0.964307</td>\n",
" <td>0.97328</td>\n",
" <td>724814</td>\n",
" <td>6975</td>\n",
" <td>5414</td>\n",
" <td>737203</td>\n",
" <td>737203</td>\n",
" <td>737203</td>\n",
" <td>24178</td>\n",
" <td>24178</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>NaN</td>\n",
" <td>2005-06</td>\n",
" <td>Q1: April - June</td>\n",
" <td>3520931</td>\n",
" <td>165801</td>\n",
" <td>1172846</td>\n",
" <td>4859578</td>\n",
" <td>83845</td>\n",
" <td>742</td>\n",
" <td>555</td>\n",
" <td>...</td>\n",
" <td>0.976187</td>\n",
" <td>0.98248</td>\n",
" <td>719644</td>\n",
" <td>7717</td>\n",
" <td>6048</td>\n",
" <td>733409</td>\n",
" <td>733409</td>\n",
" <td>733409</td>\n",
" <td>12991</td>\n",
" <td>12991</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 21 columns</p>\n",
"</div>"
],
"text/plain": [
"0 Reporting A&E attendances \\\n",
"1 Code Year Quarter Type 1 Departments - Major A&E \n",
"0 NaN 2004-05 Q1: April - June 3377850 \n",
"1 NaN NaN Q2: July - Sept 3381219 \n",
"2 NaN NaN Q3: Oct - Dec 3257398 \n",
"3 NaN NaN Q4: Jan - Mar 3249353 \n",
"4 NaN 2005-06 Q1: April - June 3520931 \n",
"\n",
"0 \\\n",
"1 Type 2 Departments - Single Specialty \n",
"0 159966 \n",
"1 156697 \n",
"2 150643 \n",
"3 151926 \n",
"4 165801 \n",
"\n",
"0 \\\n",
"1 Type 3 Departments - Other A&E/Minor Injury Unit Total attendances \n",
"0 964762 4502578 \n",
"1 1018779 4556695 \n",
"2 966886 4374927 \n",
"3 1001701 4402980 \n",
"4 1172846 4859578 \n",
"\n",
"0 A&E attendances > 4 hours from arrival to admission, transfer or discharge \\\n",
"1 Type 1 Departments - Major A&E \n",
"0 232110 \n",
"1 182554 \n",
"2 136953 \n",
"3 115978 \n",
"4 83845 \n",
"\n",
"0 \\\n",
"1 Type 2 Departments - Single Specialty \n",
"0 2612 \n",
"1 1830 \n",
"2 1415 \n",
"3 982 \n",
"4 742 \n",
"\n",
"0 \\\n",
"1 Type 3 Departments - Other A&E/Minor Injury Unit \n",
"0 3305 \n",
"1 2625 \n",
"2 1364 \n",
"3 687 \n",
"4 555 \n",
"\n",
"0 ... \\\n",
"1 ... \n",
"0 ... \n",
"1 ... \n",
"2 ... \n",
"3 ... \n",
"4 ... \n",
"\n",
"0 \\\n",
"1 Percentage in 4 hours or less (type 1) Percentage in 4 hours or less (all) \n",
"0 0.931285 0.947135 \n",
"1 0.946009 0.95896 \n",
"2 0.957956 0.968061 \n",
"3 0.964307 0.97328 \n",
"4 0.976187 0.98248 \n",
"\n",
"0 Emergency Admissions \\\n",
"1 Emergency Admissions via Type 1 A&E Emergency Admissions via Type 2 A&E \n",
"0 651785 6778 \n",
"1 673452 6670 \n",
"2 705901 6802 \n",
"3 724814 6975 \n",
"4 719644 7717 \n",
"\n",
"0 \\\n",
"1 Emergency Admissions via Type 3 and 4 A&E \n",
"0 7830 \n",
"1 8256 \n",
"2 8691 \n",
"3 5414 \n",
"4 6048 \n",
"\n",
"0 \\\n",
"1 Total Emergency admissions via A&E \n",
"0 666393 \n",
"1 688378 \n",
"2 721394 \n",
"3 737203 \n",
"4 733409 \n",
"\n",
"0 \\\n",
"1 Other Emergency admissions (i.e not via A&E) Total emergency admissions \n",
"0 666393 666393 \n",
"1 688378 688378 \n",
"2 721394 721394 \n",
"3 737203 737203 \n",
"4 733409 733409 \n",
"\n",
"0 \\\n",
"1 Number of patients spending >4 hours from decision to admit to admission \n",
"0 27665 \n",
"1 20989 \n",
"2 19831 \n",
"3 24178 \n",
"4 12991 \n",
"\n",
"0 \n",
"1 Number of patients spending >12 hours from decision to admit to admission \n",
"0 27665 \n",
"1 20989 \n",
"2 19831 \n",
"3 24178 \n",
"4 12991 \n",
"\n",
"[5 rows x 21 columns]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import pandas as pd\n",
"\n",
"#Preview data\n",
"dfw=pd.read_excel('data/Quarterly-timeseries-May-2017.xls',\n",
" skiprows=15,header=None,na_values='-').dropna(how='all').dropna(axis=1,how='all')\n",
"dfw.iloc[0,0]='Reporting'\n",
"dfw.iloc[1,0]='Code'\n",
"\n",
"dfw= dfw.fillna(axis=1,method='ffill').T.set_index([0,1]).T.dropna(how='all').dropna(axis=1,how='all')\n",
"dfw=dfw.reset_index(drop=True)\n",
"dfw.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note how the time series is described in the data file across two columns: the financial year and the quarter:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style>\n",
" .dataframe thead tr:only-child th {\n",
" text-align: right;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: left;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr>\n",
" <th>0</th>\n",
" <th colspan=\"2\" halign=\"left\">Reporting</th>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <th>Year</th>\n",
" <th>Quarter</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>2004-05</td>\n",
" <td>Q1: April - June</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>NaN</td>\n",
" <td>Q2: July - Sept</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>NaN</td>\n",
" <td>Q3: Oct - Dec</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>NaN</td>\n",
" <td>Q4: Jan - Mar</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>2005-06</td>\n",
" <td>Q1: April - June</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"0 Reporting \n",
"1 Year Quarter\n",
"0 2004-05 Q1: April - June\n",
"1 NaN Q2: July - Sept\n",
"2 NaN Q3: Oct - Dec\n",
"3 NaN Q4: Jan - Mar\n",
"4 2005-06 Q1: April - June"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dfw[[('Reporting','Year'),('Reporting','Quarter')]].head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Fill down on the financial year, and create a new *period* column joining those two elements."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style>\n",
" .dataframe thead tr:only-child th {\n",
" text-align: right;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: left;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr>\n",
" <th>0</th>\n",
" <th colspan=\"3\" halign=\"left\">Reporting</th>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <th>Year</th>\n",
" <th>Quarter</th>\n",
" <th>period</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>2004-05</td>\n",
" <td>Q1: April - June</td>\n",
" <td>2004-05 Q1: April - June</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2004-05</td>\n",
" <td>Q2: July - Sept</td>\n",
" <td>2004-05 Q2: July - Sept</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>2004-05</td>\n",
" <td>Q3: Oct - Dec</td>\n",
" <td>2004-05 Q3: Oct - Dec</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>2004-05</td>\n",
" <td>Q4: Jan - Mar</td>\n",
" <td>2004-05 Q4: Jan - Mar</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>2005-06</td>\n",
" <td>Q1: April - June</td>\n",
" <td>2005-06 Q1: April - June</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"0 Reporting \n",
"1 Year Quarter period\n",
"0 2004-05 Q1: April - June 2004-05 Q1: April - June\n",
"1 2004-05 Q2: July - Sept 2004-05 Q2: July - Sept\n",
"2 2004-05 Q3: Oct - Dec 2004-05 Q3: Oct - Dec\n",
"3 2004-05 Q4: Jan - Mar 2004-05 Q4: Jan - Mar\n",
"4 2005-06 Q1: April - June 2005-06 Q1: April - June"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dfw['Reporting','Year'].fillna(method='ffill',inplace=True)\n",
"dfw['Reporting','period']=dfw['Reporting','Year']+' '+dfw['Reporting','Quarter']\n",
"dfw[[('Reporting','Year'),('Reporting','Quarter'),('Reporting','period')]].head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Use the *pandas* [\"anchored offsets\"](http://pandas.pydata.org/pandas-docs/version/0.12.0/timeseries.html#anchored-offsets) to specifiy the quarterly period relative to the financial year."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style>\n",
" .dataframe thead tr:only-child th {\n",
" text-align: right;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: left;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr>\n",
" <th>0</th>\n",
" <th colspan=\"4\" halign=\"left\">Reporting</th>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <th>Year</th>\n",
" <th>Quarter</th>\n",
" <th>period</th>\n",
" <th>_quarter</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>2004-05</td>\n",
" <td>Q1: April - June</td>\n",
" <td>2004-05 Q1: April - June</td>\n",
" <td>2005Q1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2004-05</td>\n",
" <td>Q2: July - Sept</td>\n",
" <td>2004-05 Q2: July - Sept</td>\n",
" <td>2005Q2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>2004-05</td>\n",
" <td>Q3: Oct - Dec</td>\n",
" <td>2004-05 Q3: Oct - Dec</td>\n",
" <td>2005Q3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>2004-05</td>\n",
" <td>Q4: Jan - Mar</td>\n",
" <td>2004-05 Q4: Jan - Mar</td>\n",
" <td>2005Q4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>2005-06</td>\n",
" <td>Q1: April - June</td>\n",
" <td>2005-06 Q1: April - June</td>\n",
" <td>2006Q1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"0 Reporting \n",
"1 Year Quarter period _quarter\n",
"0 2004-05 Q1: April - June 2004-05 Q1: April - June 2005Q1\n",
"1 2004-05 Q2: July - Sept 2004-05 Q2: July - Sept 2005Q2\n",
"2 2004-05 Q3: Oct - Dec 2004-05 Q3: Oct - Dec 2005Q3\n",
"3 2004-05 Q4: Jan - Mar 2004-05 Q4: Jan - Mar 2005Q4\n",
"4 2005-06 Q1: April - June 2005-06 Q1: April - June 2006Q1"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#Return a date that we can use to anchor a period on\n",
"def getMonthYear(row):\n",
" if not row['Reporting','Quarter'].startswith('Q'): return ''\n",
" month=row['Reporting','Quarter'].split(':')[1].split('-')[0].strip()\n",
" year=int(row['Reporting','Year'].split('-')[0])\n",
" if month in ['Jan']:\n",
" year= year+1\n",
" #Following the conversion, the _quarter year specifies the calendar year in which the financial year ends\n",
" #Return the date corresponding to the first day of the month in the calendar year the financial quarter applies to\n",
" return pd.to_datetime(\"01-{}-{}\".format(month[:3],year),format='%d-%b-%Y')\n",
"\n",
"#Get a date corrsponding to the first day in the month of the first month in the quarter, by calendar year\n",
"#Use this date to align a particular quarterly period\n",
"#Q-MAR specifies a quarterly frequency, with year ending in March\n",
"dfw['Reporting','_quarter']=pd.PeriodIndex(dfw.apply(getMonthYear,axis=1), freq='Q-MAR')\n",
"\n",
"#Sort the columns\n",
"dfw=dfw.sort_index(axis=1)\n",
"dfw=dfw.reindex(columns=['Reporting']+[c for c in dfw.columns.levels[0] if c!='Reporting'],level=0)\n",
"\n",
"#Preview the data with the quarter time period column defined\n",
"dfw[[('Reporting','Year'),('Reporting','Quarter'),('Reporting','period'),('Reporting','_quarter')]].head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Monthly Time Series\n",
"\n",
"NHS Digital Accident and Emergency figures are also published as a monthly time series."
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"--2017-08-08 11:31:25-- https://www.england.nhs.uk/statistics/wp-content/uploads/sites/2/2017/06/May-17-Monthly-AE-Timeseries-with-growth-charts.xlsx\n",
"Resolving www.england.nhs.uk... 54.230.199.30, 54.230.199.113, 54.230.199.252, ...\n",
"Connecting to www.england.nhs.uk|54.230.199.30|:443... connected.\n",
"HTTP request sent, awaiting response... 200 OK\n",
"Length: 111779 (109K) [application/vnd.openxmlformats-officedocument.spreadsheetml.sheet]\n",
"Saving to: 'data/May-17-Monthly-AE-Timeseries-with-growth-charts.xlsx'\n",
"\n",
"May-17-Monthly-AE-T 100%[=====================>] 109.16K --.-KB/s in 0.02s \n",
"\n",
"2017-08-08 11:31:26 (4.53 MB/s) - 'data/May-17-Monthly-AE-Timeseries-with-growth-charts.xlsx' saved [111779/111779]\n",
"\n"
]
}
],
"source": [
"url='https://www.england.nhs.uk/statistics/wp-content/uploads/sites/2/2017/06/May-17-Monthly-AE-Timeseries-with-growth-charts.xlsx'\n",
"!wget -P data/ {url}"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style>\n",
" .dataframe thead tr:only-child th {\n",
" text-align: right;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: left;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr>\n",
" <th>0</th>\n",
" <th colspan=\"2\" halign=\"left\">Reporting</th>\n",
" <th colspan=\"4\" halign=\"left\">A&amp;E attendances</th>\n",
" <th colspan=\"6\" halign=\"left\">A&amp;E attendances &gt; 4 hours from arrival to admission, transfer or discharge</th>\n",
" <th colspan=\"9\" halign=\"left\">Emergency Admissions</th>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <th>Code</th>\n",
" <th>Period</th>\n",
" <th>Type 1 Departments - Major A&amp;E</th>\n",
" <th>Type 2 Departments - Single Specialty</th>\n",
" <th>Type 3 Departments - Other A&amp;E/Minor Injury Unit</th>\n",
" <th>Total Attendances</th>\n",
" <th>Type 1 Departments - Major A&amp;E</th>\n",
" <th>Type 2 Departments - Single Specialty</th>\n",
" <th>Type 3 Departments - Other A&amp;E/Minor Injury Unit</th>\n",
" <th>Total Attendances &gt; 4 hours</th>\n",
" <th>...</th>\n",
" <th>Percentage in 4 hours or less (all)</th>\n",
" <th>Emergency Admissions via Type 1 A&amp;E</th>\n",
" <th>Emergency Admissions via Type 2 A&amp;E</th>\n",
" <th>Emergency Admissions via Type 3 and 4 A&amp;E</th>\n",
" <th>Total Emergency Admissions via A&amp;E</th>\n",
" <th>Other Emergency Admissions (i.e not via A&amp;E)</th>\n",
" <th>Total Emergency Admissions</th>\n",
" <th>Number of patients spending &gt;4 hours from decision to admit to admission</th>\n",
" <th>Number of patients spending &gt;12 hours from decision to admit to admission</th>\n",
" <th>Operational standard (Performance)</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>NaN</td>\n",
" <td>2010-08-01 00:00:00</td>\n",
" <td>1138652</td>\n",
" <td>54371</td>\n",
" <td>559358</td>\n",
" <td>1752381</td>\n",
" <td>32664</td>\n",
" <td>97</td>\n",
" <td>423</td>\n",
" <td>33184</td>\n",
" <td>...</td>\n",
" <td>0.981063</td>\n",
" <td>287438</td>\n",
" <td>5367</td>\n",
" <td>8081</td>\n",
" <td>300886</td>\n",
" <td>124816</td>\n",
" <td>425702</td>\n",
" <td>3697</td>\n",
" <td>1</td>\n",
" <td>0.95</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>NaN</td>\n",
" <td>2010-09-01 00:00:00</td>\n",
" <td>1150728</td>\n",
" <td>55181</td>\n",
" <td>550359</td>\n",
" <td>1756268</td>\n",
" <td>40416</td>\n",
" <td>142</td>\n",
" <td>593</td>\n",
" <td>41151</td>\n",
" <td>...</td>\n",
" <td>0.976569</td>\n",
" <td>293991</td>\n",
" <td>5543</td>\n",
" <td>3673</td>\n",
" <td>303207</td>\n",
" <td>121693</td>\n",
" <td>424900</td>\n",
" <td>5907</td>\n",
" <td>0</td>\n",
" <td>0.95</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>NaN</td>\n",
" <td>2010-10-01 00:00:00</td>\n",
" <td>1163143</td>\n",
" <td>54961</td>\n",
" <td>583244</td>\n",
" <td>1801348</td>\n",
" <td>46467</td>\n",
" <td>141</td>\n",
" <td>806</td>\n",
" <td>47414</td>\n",
" <td>...</td>\n",
" <td>0.973679</td>\n",
" <td>303452</td>\n",
" <td>5485</td>\n",
" <td>2560</td>\n",
" <td>311497</td>\n",
" <td>124718</td>\n",
" <td>436215</td>\n",
" <td>6932</td>\n",
" <td>0</td>\n",
" <td>0.95</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>NaN</td>\n",
" <td>2010-11-01 00:00:00</td>\n",
" <td>1.11129e+06</td>\n",
" <td>53727.4</td>\n",
" <td>486005</td>\n",
" <td>1.65103e+06</td>\n",
" <td>45838.4</td>\n",
" <td>143.429</td>\n",
" <td>454.571</td>\n",
" <td>46436.4</td>\n",
" <td>...</td>\n",
" <td>0.971874</td>\n",
" <td>297832</td>\n",
" <td>5731.14</td>\n",
" <td>3279</td>\n",
" <td>306842</td>\n",
" <td>122257</td>\n",
" <td>429099</td>\n",
" <td>7179</td>\n",
" <td>2</td>\n",
" <td>0.95</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>NaN</td>\n",
" <td>2010-12-01 00:00:00</td>\n",
" <td>1.1592e+06</td>\n",
" <td>45536.4</td>\n",
" <td>533001</td>\n",
" <td>1.73774e+06</td>\n",
" <td>88475.3</td>\n",
" <td>140.571</td>\n",
" <td>1301.43</td>\n",
" <td>89917.3</td>\n",
" <td>...</td>\n",
" <td>0.948256</td>\n",
" <td>318602</td>\n",
" <td>6277</td>\n",
" <td>3198.43</td>\n",
" <td>328078</td>\n",
" <td>124651</td>\n",
" <td>452729</td>\n",
" <td>13818.1</td>\n",
" <td>15</td>\n",
" <td>0.95</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 21 columns</p>\n",
"</div>"
],
"text/plain": [
"0 Reporting A&E attendances \\\n",
"1 Code Period Type 1 Departments - Major A&E \n",
"2 NaN 2010-08-01 00:00:00 1138652 \n",
"3 NaN 2010-09-01 00:00:00 1150728 \n",
"4 NaN 2010-10-01 00:00:00 1163143 \n",
"5 NaN 2010-11-01 00:00:00 1.11129e+06 \n",
"6 NaN 2010-12-01 00:00:00 1.1592e+06 \n",
"\n",
"0 \\\n",
"1 Type 2 Departments - Single Specialty \n",
"2 54371 \n",
"3 55181 \n",
"4 54961 \n",
"5 53727.4 \n",
"6 45536.4 \n",
"\n",
"0 \\\n",
"1 Type 3 Departments - Other A&E/Minor Injury Unit Total Attendances \n",
"2 559358 1752381 \n",
"3 550359 1756268 \n",
"4 583244 1801348 \n",
"5 486005 1.65103e+06 \n",
"6 533001 1.73774e+06 \n",
"\n",
"0 A&E attendances > 4 hours from arrival to admission, transfer or discharge \\\n",
"1 Type 1 Departments - Major A&E \n",
"2 32664 \n",
"3 40416 \n",
"4 46467 \n",
"5 45838.4 \n",
"6 88475.3 \n",
"\n",
"0 \\\n",
"1 Type 2 Departments - Single Specialty \n",
"2 97 \n",
"3 142 \n",
"4 141 \n",
"5 143.429 \n",
"6 140.571 \n",
"\n",
"0 \\\n",
"1 Type 3 Departments - Other A&E/Minor Injury Unit \n",
"2 423 \n",
"3 593 \n",
"4 806 \n",
"5 454.571 \n",
"6 1301.43 \n",
"\n",
"0 ... \\\n",
"1 Total Attendances > 4 hours ... \n",
"2 33184 ... \n",
"3 41151 ... \n",
"4 47414 ... \n",
"5 46436.4 ... \n",
"6 89917.3 ... \n",
"\n",
"0 Emergency Admissions \\\n",
"1 Percentage in 4 hours or less (all) Emergency Admissions via Type 1 A&E \n",
"2 0.981063 287438 \n",
"3 0.976569 293991 \n",
"4 0.973679 303452 \n",
"5 0.971874 297832 \n",
"6 0.948256 318602 \n",
"\n",
"0 \\\n",
"1 Emergency Admissions via Type 2 A&E \n",
"2 5367 \n",
"3 5543 \n",
"4 5485 \n",
"5 5731.14 \n",
"6 6277 \n",
"\n",
"0 \\\n",
"1 Emergency Admissions via Type 3 and 4 A&E \n",
"2 8081 \n",
"3 3673 \n",
"4 2560 \n",
"5 3279 \n",
"6 3198.43 \n",
"\n",
"0 \\\n",
"1 Total Emergency Admissions via A&E \n",
"2 300886 \n",
"3 303207 \n",
"4 311497 \n",
"5 306842 \n",
"6 328078 \n",
"\n",
"0 \\\n",
"1 Other Emergency Admissions (i.e not via A&E) Total Emergency Admissions \n",
"2 124816 425702 \n",
"3 121693 424900 \n",
"4 124718 436215 \n",
"5 122257 429099 \n",
"6 124651 452729 \n",
"\n",
"0 \\\n",
"1 Number of patients spending >4 hours from decision to admit to admission \n",
"2 3697 \n",
"3 5907 \n",
"4 6932 \n",
"5 7179 \n",
"6 13818.1 \n",
"\n",
"0 \\\n",
"1 Number of patients spending >12 hours from decision to admit to admission \n",
"2 1 \n",
"3 0 \n",
"4 0 \n",
"5 2 \n",
"6 15 \n",
"\n",
"0 \n",
"1 Operational standard (Performance) \n",
"2 0.95 \n",
"3 0.95 \n",
"4 0.95 \n",
"5 0.95 \n",
"6 0.95 \n",
"\n",
"[5 rows x 21 columns]"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dfw=pd.read_excel('data/May-17-Monthly-AE-Timeseries-with-growth-charts.xlsx',skiprows=16,header=None,na_values='-').dropna(how='all').dropna(axis=1,how='all')\n",
"dfw.iloc[0,0]='Reporting'\n",
"dfw.iloc[1,0]='Code'\n",
"\n",
"dfw= dfw.fillna(axis=1,method='ffill').T.set_index([0,1]).T.dropna(how='all').dropna(axis=1,how='all')\n",
"dfw.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The *pandas* `dt.to_period('M')` function returns a monthly period corresponding to the month within which a specified date falls."
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style>\n",
" .dataframe thead tr:only-child th {\n",
" text-align: right;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: left;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr>\n",
" <th>0</th>\n",
" <th colspan=\"2\" halign=\"left\">Reporting</th>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <th>Period</th>\n",
" <th>_period</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>2010-08-01 00:00:00</td>\n",
" <td>2010-08</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>2010-09-01 00:00:00</td>\n",
" <td>2010-09</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>2010-10-01 00:00:00</td>\n",
" <td>2010-10</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>2010-11-01 00:00:00</td>\n",
" <td>2010-11</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>2010-12-01 00:00:00</td>\n",
" <td>2010-12</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"0 Reporting \n",
"1 Period _period\n",
"2 2010-08-01 00:00:00 2010-08\n",
"3 2010-09-01 00:00:00 2010-09\n",
"4 2010-10-01 00:00:00 2010-10\n",
"5 2010-11-01 00:00:00 2010-11\n",
"6 2010-12-01 00:00:00 2010-12"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#Cast the date corresponding to the month start as a month period starting on that date\n",
"dfw['Reporting','_period']=pd.to_datetime(dfw['Reporting','Period'],\n",
" format='%d/%m/%Y').dt.to_period('M')\n",
"\n",
"dfw=dfw.sort_index(axis=1)\n",
"dfw=dfw.reindex(columns=['Reporting']+[c for c in dfw.columns.levels[0] if c!='Reporting'],level=0)\n",
"dfw[[('Reporting','Period'),('Reporting','_period')]].head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Weekly Time Series\n",
"\n",
"The NHS used to publish weekly time series for A & E figures. Archival data is [still available](https://www.england.nhs.uk/statistics/statistical-work-areas/ae-waiting-times-and-activity/statistical-work-areasae-waiting-times-and-activityweekly-ae-sitreps-2015-16/)."
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"--2017-08-08 11:40:14-- https://www.england.nhs.uk/statistics/wp-content/uploads/sites/2/2015/04/2015.06.28-AE-TimeseriesBaG87.xls\n",
"Resolving www.england.nhs.uk... 54.230.199.90, 54.230.199.191, 54.230.199.113, ...\n",
"Connecting to www.england.nhs.uk|54.230.199.90|:443... connected.\n",
"HTTP request sent, awaiting response... 200 OK\n",
"Length: 84480 (82K) [application/vnd.ms-excel]\n",
"Saving to: 'data/2015.06.28-AE-TimeseriesBaG87.xls'\n",
"\n",
"2015.06.28-AE-Times 100%[=====================>] 82.50K --.-KB/s in 0.03s \n",
"\n",
"2017-08-08 11:40:15 (3.18 MB/s) - 'data/2015.06.28-AE-TimeseriesBaG87.xls' saved [84480/84480]\n",
"\n"
]
}
],
"source": [
"url='https://www.england.nhs.uk/statistics/wp-content/uploads/sites/2/2015/04/2015.06.28-AE-TimeseriesBaG87.xls'\n",
"!wget -P data/ {url}"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style>\n",
" .dataframe thead tr:only-child th {\n",
" text-align: right;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: left;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr>\n",
" <th>0</th>\n",
" <th colspan=\"2\" halign=\"left\">Reporting</th>\n",
" <th colspan=\"4\" halign=\"left\">A&amp;E attendances</th>\n",
" <th colspan=\"6\" halign=\"left\">A&amp;E attendances &gt; 4 hours from arrival to admission, transfer or discharge</th>\n",
" <th colspan=\"8\" halign=\"left\">Emergency Admissions</th>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <th>Code</th>\n",
" <th>Period</th>\n",
" <th>Type 1 Departments - Major A&amp;E</th>\n",
" <th>Type 2 Departments - Single Specialty</th>\n",
" <th>Type 3 Departments - Other A&amp;E/Minor Injury Unit</th>\n",
" <th>Total Attendances</th>\n",
" <th>Type 1 Departments - Major A&amp;E</th>\n",
" <th>Type 2 Departments - Single Specialty</th>\n",
" <th>Type 3 Departments - Other A&amp;E/Minor Injury Unit</th>\n",
" <th>Total Attendances &gt; 4 hours</th>\n",
" <th>Percentage in 4 hours or less (type 1)</th>\n",
" <th>Percentage in 4 hours or less (all)</th>\n",
" <th>Emergency Admissions via Type 1 A&amp;E</th>\n",
" <th>Emergency Admissions via Type 2 A&amp;E</th>\n",
" <th>Emergency Admissions via Type 3 and 4 A&amp;E</th>\n",
" <th>Total Emergency Admissions via A&amp;E</th>\n",
" <th>Other Emergency Admissions (i.e not via A&amp;E)</th>\n",
" <th>Total Emergency Admissions</th>\n",
" <th>Number of patients spending &gt;4 hours from decision to admit to admission</th>\n",
" <th>Number of patients spending &gt;12 hours from decision to admit to admission</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>NaN</td>\n",
" <td>2015-16 Year to Date</td>\n",
" <td>3702458</td>\n",
" <td>159697</td>\n",
" <td>1850023</td>\n",
" <td>5712178</td>\n",
" <td>329060</td>\n",
" <td>1278</td>\n",
" <td>6765</td>\n",
" <td>337103</td>\n",
" <td>0.911124</td>\n",
" <td>0.940985</td>\n",
" <td>994256</td>\n",
" <td>4077</td>\n",
" <td>12543</td>\n",
" <td>1010876</td>\n",
" <td>363004</td>\n",
" <td>1373880</td>\n",
" <td>71252</td>\n",
" <td>130</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Eng</td>\n",
" <td>W/E 07/11/2010</td>\n",
" <td>267142</td>\n",
" <td>12806</td>\n",
" <td>111547</td>\n",
" <td>391495</td>\n",
" <td>11827</td>\n",
" <td>40</td>\n",
" <td>85</td>\n",
" <td>11952</td>\n",
" <td>0.955728</td>\n",
" <td>0.969471</td>\n",
" <td>73589</td>\n",
" <td>1372</td>\n",
" <td>1272</td>\n",
" <td>76233</td>\n",
" <td>28181</td>\n",
" <td>104414</td>\n",
" <td>1847</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>Eng</td>\n",
" <td>W/E 14/11/2010</td>\n",
" <td>256893</td>\n",
" <td>13135</td>\n",
" <td>115657</td>\n",
" <td>385685</td>\n",
" <td>10223</td>\n",
" <td>29</td>\n",
" <td>131</td>\n",
" <td>10383</td>\n",
" <td>0.960205</td>\n",
" <td>0.973079</td>\n",
" <td>67526</td>\n",
" <td>1340</td>\n",
" <td>563</td>\n",
" <td>69429</td>\n",
" <td>28558</td>\n",
" <td>97987</td>\n",
" <td>1601</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>Eng</td>\n",
" <td>W/E 21/11/2010</td>\n",
" <td>260958</td>\n",
" <td>12751</td>\n",
" <td>115034</td>\n",
" <td>388743</td>\n",
" <td>9790</td>\n",
" <td>33</td>\n",
" <td>94</td>\n",
" <td>9917</td>\n",
" <td>0.962484</td>\n",
" <td>0.97449</td>\n",
" <td>68990</td>\n",
" <td>1343</td>\n",
" <td>649</td>\n",
" <td>70982</td>\n",
" <td>28794</td>\n",
" <td>99776</td>\n",
" <td>1430</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>Eng</td>\n",
" <td>W/E 28/11/2010</td>\n",
" <td>256061</td>\n",
" <td>12126</td>\n",
" <td>113086</td>\n",
" <td>381273</td>\n",
" <td>10087</td>\n",
" <td>36</td>\n",
" <td>96</td>\n",
" <td>10219</td>\n",
" <td>0.960607</td>\n",
" <td>0.973198</td>\n",
" <td>68181</td>\n",
" <td>1305</td>\n",
" <td>639</td>\n",
" <td>70125</td>\n",
" <td>29095</td>\n",
" <td>99220</td>\n",
" <td>1703</td>\n",
" <td>2</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"0 Reporting A&E attendances \\\n",
"1 Code Period Type 1 Departments - Major A&E \n",
"2 NaN 2015-16 Year to Date 3702458 \n",
"4 Eng W/E 07/11/2010 267142 \n",
"5 Eng W/E 14/11/2010 256893 \n",
"6 Eng W/E 21/11/2010 260958 \n",
"7 Eng W/E 28/11/2010 256061 \n",
"\n",
"0 \\\n",
"1 Type 2 Departments - Single Specialty \n",
"2 159697 \n",
"4 12806 \n",
"5 13135 \n",
"6 12751 \n",
"7 12126 \n",
"\n",
"0 \\\n",
"1 Type 3 Departments - Other A&E/Minor Injury Unit Total Attendances \n",
"2 1850023 5712178 \n",
"4 111547 391495 \n",
"5 115657 385685 \n",
"6 115034 388743 \n",
"7 113086 381273 \n",
"\n",
"0 A&E attendances > 4 hours from arrival to admission, transfer or discharge \\\n",
"1 Type 1 Departments - Major A&E \n",
"2 329060 \n",
"4 11827 \n",
"5 10223 \n",
"6 9790 \n",
"7 10087 \n",
"\n",
"0 \\\n",
"1 Type 2 Departments - Single Specialty \n",
"2 1278 \n",
"4 40 \n",
"5 29 \n",
"6 33 \n",
"7 36 \n",
"\n",
"0 \\\n",
"1 Type 3 Departments - Other A&E/Minor Injury Unit \n",
"2 6765 \n",
"4 85 \n",
"5 131 \n",
"6 94 \n",
"7 96 \n",
"\n",
"0 \\\n",
"1 Total Attendances > 4 hours Percentage in 4 hours or less (type 1) \n",
"2 337103 0.911124 \n",
"4 11952 0.955728 \n",
"5 10383 0.960205 \n",
"6 9917 0.962484 \n",
"7 10219 0.960607 \n",
"\n",
"0 Emergency Admissions \\\n",
"1 Percentage in 4 hours or less (all) Emergency Admissions via Type 1 A&E \n",
"2 0.940985 994256 \n",
"4 0.969471 73589 \n",
"5 0.973079 67526 \n",
"6 0.97449 68990 \n",
"7 0.973198 68181 \n",
"\n",
"0 \\\n",
"1 Emergency Admissions via Type 2 A&E \n",
"2 4077 \n",
"4 1372 \n",
"5 1340 \n",
"6 1343 \n",
"7 1305 \n",
"\n",
"0 \\\n",
"1 Emergency Admissions via Type 3 and 4 A&E \n",
"2 12543 \n",
"4 1272 \n",
"5 563 \n",
"6 649 \n",
"7 639 \n",
"\n",
"0 \\\n",
"1 Total Emergency Admissions via A&E \n",
"2 1010876 \n",
"4 76233 \n",
"5 69429 \n",
"6 70982 \n",
"7 70125 \n",
"\n",
"0 \\\n",
"1 Other Emergency Admissions (i.e not via A&E) Total Emergency Admissions \n",
"2 363004 1373880 \n",
"4 28181 104414 \n",
"5 28558 97987 \n",
"6 28794 99776 \n",
"7 29095 99220 \n",
"\n",
"0 \\\n",
"1 Number of patients spending >4 hours from decision to admit to admission \n",
"2 71252 \n",
"4 1847 \n",
"5 1601 \n",
"6 1430 \n",
"7 1703 \n",
"\n",
"0 \n",
"1 Number of patients spending >12 hours from decision to admit to admission \n",
"2 130 \n",
"4 0 \n",
"5 0 \n",
"6 0 \n",
"7 2 "
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dfw=pd.read_excel('data/2015.06.28-AE-TimeseriesBaG87.xls',\n",
" skiprows=14,header=None,na_values='-').dropna(how='all').dropna(axis=1,how='all')\n",
"dfw.iloc[0,0]='Reporting'\n",
"dfw.iloc[1,0]='Code'\n",
"\n",
"dfw= dfw.fillna(axis=1,method='ffill').T.set_index([0,1]).T.dropna(how='all').dropna(axis=1,how='all')\n",
"\n",
"dfw.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The *pandas* `.to_period('W')` function returns a week period, starting on a Monday, that a given date falls within."
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style>\n",
" .dataframe thead tr:only-child th {\n",
" text-align: right;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: left;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr>\n",
" <th>0</th>\n",
" <th colspan=\"2\" halign=\"left\">Reporting</th>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <th>Period</th>\n",
" <th>_period</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>W/E 07/11/2010</td>\n",
" <td>2010-11-01/2010-11-07</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>W/E 14/11/2010</td>\n",
" <td>2010-11-08/2010-11-14</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>W/E 21/11/2010</td>\n",
" <td>2010-11-15/2010-11-21</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>W/E 28/11/2010</td>\n",
" <td>2010-11-22/2010-11-28</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>W/E 05/12/2010</td>\n",
" <td>2010-11-29/2010-12-05</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"0 Reporting \n",
"1 Period _period\n",
"4 W/E 07/11/2010 2010-11-01/2010-11-07\n",
"5 W/E 14/11/2010 2010-11-08/2010-11-14\n",
"6 W/E 21/11/2010 2010-11-15/2010-11-21\n",
"7 W/E 28/11/2010 2010-11-22/2010-11-28\n",
"8 W/E 05/12/2010 2010-11-29/2010-12-05"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#Filter to retain rows that have a W/E period specified\n",
"dfw=dfw[dfw[('Reporting','Period')].str.startswith('W/E')]\n",
"\n",
"#Remove the W/E prefix and use the date to define\n",
"dfw['Reporting','_period']=pd.to_datetime(dfw['Reporting','Period'].str.replace('W/E ',''),\n",
" format='%d/%m/%Y').dt.to_period('W') \n",
"\n",
"dfw[[('Reporting','Period'),('Reporting','_period')]].head()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment