{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<center>\n",
" <img src=\"https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DV0101EN-SkillsNetwork/labs/Module%204/logo.png\" width=\"300\" alt=\"cognitiveclass.ai logo\" />\n",
"</center>\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Assignment\n",
"\n",
"- [Story](#story)\n",
"- [Components of the report items](#components-of-the-report-items)\n",
"- [Expected layout](#expected-layout)\n",
"- [Requirements to create the dashboard](#requirements-to-create-the-dashboard)\n",
"- [What is new in this exercise compared to other labs?](#what-is-new-in-this-exercise-compared-to-other-labs?)\n",
"- [Review](#review)\n",
"- [Hints to complete TODOs](#hints-to-complete-todos)\n",
"- [Application](#application)\n",
"\n",
"## Story:\n",
"\n",
"As a data analyst, you have been given the task of monitoring and reporting on US domestic airline flight performance. The goal is to analyze the performance of the reporting airlines to improve flight reliability, thereby improving customer reliability.\n",
"\n",
"The key report items are:\n",
"\n",
"- Yearly airline performance report \n",
"- Yearly average flight delay statistics\n",
"\n",
"_NOTE:_ Year range is between 2005 and 2020.\n",
"\n",
"## Components of the report items\n",
"\n",
"1. Yearly airline performance report \n",
"\n",
" For the chosen year, provide:\n",
"\n",
" - Number of flights under different cancellation categories using bar chart.\n",
" - Average flight time by reporting airline using line chart.\n",
" - Percentage of diverted airport landings per reporting airline using pie chart.\n",
" - Number of flights flying from each state using choropleth map.\n",
" - Number of flights flying to each state from each reporting airline using treemap chart.\n",
"2. Yearly average flight delay statistics\n",
"\n",
" For the chosen year, provide:\n",
"\n",
" - Monthly average carrier delay by reporting airline for the given year.\n",
" - Monthly average weather delay by reporting airline for the given year.\n",
" - Monthly average national air system delay by reporting airline for the given year.\n",
" - Monthly average security delay by reporting airline for the given year.\n",
" - Monthly average late aircraft delay by reporting airline for the given year.\n",
"\n",
" _NOTE:_ You have already created the same dashboard components in the `Flight Delay Time Statistics Dashboard` section. We will reuse them here.\n",
"\n",
"## Expected Layout\n",
"\n",
"<center>\n",
" <img src=\"https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DV0101EN-SkillsNetwork/labs/Module%205/Layout.png\" width=\"2000\" alt=\"cognitiveclass.ai logo\"/>\n",
"</center>\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Requirements to create the dashboard\n",
"\n",
"- Create dropdown using the reference [here](https://dash.plotly.com/dash-core-components/dropdown?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDeveloperSkillsNetworkDV0101ENSkillsNetwork20297740-2021-01-01)\n",
"- Create two HTML divisions that can accommodate two components (in one division) side by side. One is an HTML heading and the other is a dropdown.\n",
"- Add graph components. \n",
"- Add a callback function to compute the data, create the graphs, and return them to the layout.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## What's new in this exercise compared to other labs?\n",
"\n",
"- Make sure the layout is clean, without any default graphs or graph layouts. We will do this with three changes:\n",
"\n",
" 1. Add `app.config.suppress_callback_exceptions = True` right after `app = JupyterDash(__name__)`.\n",
" 2. Add an empty `html.Div` and use the callback to output the `dcc.Graph` as the `children` of that Div. \n",
" 3. Add a state variable in addition to the callback decorator's input and output parameters. This allows us to pass extra values without firing the callback. \n",
" Here, we need to pass two inputs, `chart type` and `year`. The input is read only after the user has entered all the information.\n",
"\n",
"- Use the HTML display style `flex` to arrange the dropdown menu alongside its description.\n",
"\n",
"- Update the app run step to avoid an error message appearing before the callback is initiated.\n",
"\n",
"_NOTE:_ These steps are only for review.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Review\n",
"\n",
"Search for `REVIEW` in the code to see how commands are used and computations are carried out. There are 8 review items.\n",
"\n",
"- REVIEW1: Clear the layout and do not display exceptions until the callback gets executed.\n",
"- REVIEW2: Dropdown creation.\n",
"- REVIEW3: Observe how we add an empty division and provide an id that will be updated during the callback.\n",
"- REVIEW4: Hold the output state until the user enters all the form information. In this case, it will be chart type and year.\n",
"- REVIEW5: Number of flights flying from each state using a choropleth.\n",
"- REVIEW6: Return the `dcc.Graph` component to the empty division.\n",
"- REVIEW7: This covers chart type 2; we completed this exercise in the Flight Delay Time Statistics Dashboard section.\n",
"- REVIEW8: Adding `dev_tools_ui=False, dev_tools_props_check=False` to the app run step can prevent errors from appearing before the callback function is called.\n",
"\n",
"## Hints to complete TODOs\n",
"\n",
"### TODO1\n",
"\n",
"Reference [link](https://dash.plotly.com/dash-html-components/h1?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDeveloperSkillsNetworkDV0101ENSkillsNetwork20297740-2021-01-01)\n",
"\n",
"- Set the title of the Dash application to `US Domestic Airline Flights Performance`.\n",
"- Make the heading center aligned, set color as `#503D36`, and font size as `24`.\n",
" Sample: style={'textAlign': 'left', 'color': '#000000', 'font-size': 0}\n",
"\n",
"### TODO2\n",
"\n",
"Reference [link](https://dash.plotly.com/dash-core-components/dropdown?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDeveloperSkillsNetworkDV0101ENSkillsNetwork20297740-2021-01-01)\n",
"\n",
"Create a dropdown menu and add two chart options to it.\n",
"\n",
"Parameters to be updated in `dcc.Dropdown`:\n",
"\n",
"- Set `id` as `input-type`.\n",
"- Set `options` to a list of dictionaries, each with the display text under the key `label` and the option identifier under the key `value`. \n",
"\n",
" _1st dictionary_\n",
"\n",
" - label: Yearly Airline Performance Report\n",
" - value: OPT1\n",
"\n",
" _2nd dictionary_\n",
"\n",
" - label: Yearly Airline Delay Report\n",
" - value: OPT2\n",
"- Set placeholder to `Select a report type`.\n",
"- Set width as `80%`, padding as `3px`, font size as `20px`, text-align-last as `center` inside style parameter dictionary.\n",
"\n",
"#### Skeleton:\n",
"\n",
"```\n",
" dcc.Dropdown(id='....', \n",
" options=[\n",
" {'label': '....', 'value': '...'},\n",
" {'label': '....', 'value': '...'}\n",
" ],\n",
" placeholder='....',\n",
" style={....})\n",
"```\n",
"\n",
"### TODO3\n",
"\n",
"Add a division with two empty divisions inside. For reference, observe how code under `REVIEW` has been structured.\n",
"\n",
"Provide division ids as `plot4` and `plot5`. Set the display style to `flex`.\n",
"\n",
"#### Skeleton\n",
"\n",
"```\n",
"html.Div([\n",
" html.Div([ ], id='....'),\n",
" html.Div([ ], id='....')\n",
" ], style={....})\n",
"```\n",
"\n",
"### TODO4\n",
"\n",
"Our layout has 5 outputs, so we need to create 5 output components. Review how the input components are constructed to fill in the output components.\n",
"\n",
"It is a list with 5 output parameters, each with a component id and property. Here, the component property will be `children`, because we created empty divisions and pass in `dcc.Graph` after computation.\n",
"\n",
"Component ids will be `plot1`, `plot2`, `plot3`, `plot4`, and `plot5`.\n",
"\n",
"#### Skeleton\n",
"\n",
"```\n",
"[Output(component_id='plot1', component_property='children'),\n",
" Output(....),\n",
" Output(....),\n",
" Output(....),\n",
" Output(....)]\n",
"```\n",
"\n",
"### TODO5\n",
"\n",
"Deals with creating line plots using returned dataframes from the above step using `plotly.express`. Link for reference is [here](https://plotly.com/python/line-charts/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDeveloperSkillsNetworkDV0101ENSkillsNetwork20297740-2021-01-01)\n",
"\n",
"Average flight time by reporting airline\n",
"\n",
"- Set figure name as `line_fig`, data as `line_data`, x as `Month`, y as `AirTime`, color as `Reporting_Airline` and `title` as `Average monthly flight time (minutes) by airline`.\n",
"\n",
"#### Skeleton\n",
"\n",
"```\n",
"carrier_fig = px.line(avg_car, x='Month', y='CarrierDelay', color='Reporting_Airline', title='Average carrier delay time (minutes) by airline')\n",
"```\n",
"\n",
"### TODO6\n",
"\n",
"Deals with creating treemap plot using returned dataframes from the above step using `plotly.express`. Link for reference is [here](https://plotly.com/python/treemaps/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDeveloperSkillsNetworkDV0101ENSkillsNetwork20297740-2021-01-01)\n",
"\n",
"Number of flights flying to each state from each reporting airline\n",
"\n",
"- Set figure name as `tree_fig`, data as `tree_data`, path as `['DestState', 'Reporting_Airline']`, values as `Flights`, color as `Flights`, color_continuous_scale as `'RdBu'`, and title as `'Flight count by airline to destination state'`.\n",
"\n",
"#### Skeleton\n",
"\n",
"```\n",
"tree_fig = px.treemap(data, path=['...', '...'], \n",
" values='...',\n",
" color='...',\n",
" color_continuous_scale='...',\n",
" title='...'\n",
" )\n",
"```\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Application\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Import required libraries\n",
"import pandas as pd\n",
"import dash\n",
"import dash_html_components as html\n",
"import dash_core_components as dcc\n",
"from dash.dependencies import Input, Output, State\n",
"from jupyter_dash import JupyterDash\n",
"import plotly.graph_objects as go\n",
"import plotly.express as px\n",
"from dash import no_update\n",
"\n",
"\n",
"# Create a dash application\n",
"app = JupyterDash(__name__)\n",
"JupyterDash.infer_jupyter_proxy_config()\n",
"\n",
"# REVIEW1: Clear the layout and do not display exception till callback gets executed\n",
"app.config.suppress_callback_exceptions = True\n",
"\n",
"# Read the airline data into pandas dataframe\n",
"airline_data = pd.read_csv('https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DV0101EN-SkillsNetwork/Data%20Files/airline_data.csv', \n",
" encoding = \"ISO-8859-1\",\n",
" dtype={'Div1Airport': str, 'Div1TailNum': str, \n",
" 'Div2Airport': str, 'Div2TailNum': str})\n",
"\n",
"\n",
"# List of years \n",
"year_list = [i for i in range(2005, 2021, 1)]\n",
"\n",
"def compute_data_choice_1(df):\n",
"    \"\"\"Compute graph data for creating the yearly airline performance report.\n",
"\n",
"    Takes airline data as input and creates 5 dataframes based on the grouping condition, to be used for plotting charts and graphs.\n",
"\n",
"    Argument:\n",
"        df: Filtered dataframe\n",
"\n",
"    Returns:\n",
"        Dataframes to create graph.\n",
"    \"\"\"\n",
"    # Cancellation Category Count\n",
"    bar_data = df.groupby(['Month','CancellationCode'])['Flights'].sum().reset_index()\n",
"    # Average flight time by reporting airline\n",
"    line_data = df.groupby(['Month','Reporting_Airline'])['AirTime'].mean().reset_index()\n",
"    # Diverted Airport Landings\n",
"    div_data = df[df['DivAirportLandings'] != 0.0]\n",
"    # Source state count\n",
"    map_data = df.groupby(['OriginState'])['Flights'].sum().reset_index()\n",
"    # Destination state count\n",
"    tree_data = df.groupby(['DestState', 'Reporting_Airline'])['Flights'].sum().reset_index()\n",
"    return bar_data, line_data, div_data, map_data, tree_data\n",
"\n",
"\n",
"def compute_data_choice_2(df):\n",
"    \"\"\"Compute graph data for creating the yearly airline delay report.\n",
"\n",
"    Takes in airline data already filtered to the selected year and performs the computation for creating charts and plots.\n",
"\n",
"    Arguments:\n",
"        df: Input airline data.\n",
"\n",
"    Returns:\n",
"        Computed average dataframes for carrier delay, weather delay, NAS delay, security delay, and late aircraft delay.\n",
"    \"\"\"\n",
"    # Compute delay averages\n",
"    avg_car = df.groupby(['Month','Reporting_Airline'])['CarrierDelay'].mean().reset_index()\n",
"    avg_weather = df.groupby(['Month','Reporting_Airline'])['WeatherDelay'].mean().reset_index()\n",
"    avg_NAS = df.groupby(['Month','Reporting_Airline'])['NASDelay'].mean().reset_index()\n",
"    avg_sec = df.groupby(['Month','Reporting_Airline'])['SecurityDelay'].mean().reset_index()\n",
"    avg_late = df.groupby(['Month','Reporting_Airline'])['LateAircraftDelay'].mean().reset_index()\n",
"    return avg_car, avg_weather, avg_NAS, avg_sec, avg_late\n",
"\n",
"\n",
"# Application layout\n",
"app.layout = html.Div(children=[\n",
"    # TODO1: Add title to the dashboard\n",
"    html.H1('US Domestic Airline Flights Performance',\n",
"            style={'textAlign': 'center',\n",
"                   'color': '#503D36',\n",
"                   'font-size': 24}),\n",
"\n",
"    # REVIEW2: Dropdown creation\n",
"    # Create an outer division\n",
"    html.Div([\n",
"        # Add a division\n",
"        html.Div([\n",
"            # Create a division for adding dropdown helper text for report type\n",
"            html.Div(\n",
"                [\n",
"                    html.H2('Report Type:', style={'margin-right': '2em'}),\n",
"                ]\n",
"            ),\n",
"            # TODO2: Add a dropdown\n",
"            dcc.Dropdown(id='input-type',\n",
"                         options=[\n",
"                             {'label': 'Yearly Airline Performance Report', 'value': 'OPT1'},\n",
"                             {'label': 'Yearly Airline Delay Report', 'value': 'OPT2'}\n",
"                         ],\n",
"                         placeholder='Select a report type',\n",
"                         style={'width': '80%', 'padding': '3px', 'font-size': '20px', 'text-align-last': 'center'}),\n",
"            # Place them next to each other using the division style\n",
"        ], style={'display': 'flex'}),\n",
"\n",
"        # Add next division\n",
"        html.Div([\n",
"            # Create a division for adding dropdown helper text for choosing year\n",
"            html.Div(\n",
"                [\n",
"                    html.H2('Choose Year:', style={'margin-right': '2em'})\n",
"                ]\n",
"            ),\n",
"            dcc.Dropdown(id='input-year',\n",
"                         # Update dropdown values using list comprehension\n",
"                         options=[{'label': i, 'value': i} for i in year_list],\n",
"                         placeholder=\"Select a year\",\n",
"                         style={'width': '80%', 'padding': '3px', 'font-size': '20px', 'text-align-last': 'center'}),\n",
"            # Place them next to each other using the division style\n",
"        ], style={'display': 'flex'}),\n",
"    ]),\n",
"\n",
"    # Add computed graphs\n",
"    # REVIEW3: Observe how we add an empty division and provide an id that will be updated during callback\n",
"    html.Div([ ], id='plot1'),\n",
"\n",
"    html.Div([\n",
"        html.Div([ ], id='plot2'),\n",
"        html.Div([ ], id='plot3')\n",
"    ], style={'display': 'flex'}),\n",
"\n",
"    # TODO3: Add a division with two empty divisions inside. See above division for example.\n",
"    html.Div([\n",
"        html.Div([ ], id='plot4'),\n",
"        html.Div([ ], id='plot5')\n",
"    ], style={'display': 'flex'}),\n",
"])\n",
"\n",
"# Callback function definition\n",
"# TODO4: Add 5 output components\n",
"@app.callback([Output(component_id='plot1', component_property='children'),\n",
"               Output(component_id='plot2', component_property='children'),\n",
"               Output(component_id='plot3', component_property='children'),\n",
"               Output(component_id='plot4', component_property='children'),\n",
"               Output(component_id='plot5', component_property='children')],\n",
"              [Input(component_id='input-type', component_property='value'),\n",
"               Input(component_id='input-year', component_property='value')],\n",
"              # REVIEW4: Holding output state till user enters all the form information. In this case, it will be chart type and year\n",
"              [State(\"plot1\", 'children'), State(\"plot2\", \"children\"),\n",
"               State(\"plot3\", \"children\"), State(\"plot4\", \"children\"),\n",
"               State(\"plot5\", \"children\")\n",
"              ])\n",
"# Add computation to callback function and return graph\n",
"def get_graph(chart, year, children1, children2, c3, c4, c5):\n",
"\n",
"    # Select data\n",
"    df = airline_data[airline_data['Year']==int(year)]\n",
"\n",
"    if chart == 'OPT1':\n",
"        # Compute required information for creating graph from the data\n",
"        bar_data, line_data, div_data, map_data, tree_data = compute_data_choice_1(df)\n",
"\n",
"        # Number of flights under different cancellation categories\n",
"        bar_fig = px.bar(bar_data, x='Month', y='Flights', color='CancellationCode', title='Monthly Flight Cancellation')\n",
"\n",
"        # TODO5: Average flight time by reporting airline\n",
"        line_fig = px.line(line_data, x='Month', y='AirTime', color='Reporting_Airline', title='Average monthly flight time (minutes) by airline')\n",
"\n",
"        # Percentage of diverted airport landings per reporting airline\n",
"        pie_fig = px.pie(div_data, values='Flights', names='Reporting_Airline', title='% of flights by reporting airline')\n",
"\n",
"        # REVIEW5: Number of flights flying from each state using choropleth\n",
"        map_fig = px.choropleth(map_data,  # Input data\n",
"                                locations='OriginState',\n",
"                                color='Flights',\n",
"                                hover_data=['OriginState', 'Flights'],\n",
"                                locationmode='USA-states',  # Set to plot as US States\n",
"                                color_continuous_scale='GnBu',\n",
"                                range_color=[0, map_data['Flights'].max()])\n",
"        map_fig.update_layout(\n",
"            title_text='Number of flights from origin state',\n",
"            geo_scope='usa')  # Plot only the USA instead of globe\n",
"\n",
"        # TODO6: Number of flights flying to each state from each reporting airline\n",
"        tree_fig = px.treemap(tree_data, path=['DestState', 'Reporting_Airline'],\n",
"                              values='Flights',\n",
"                              color='Flights',\n",
"                              color_continuous_scale='RdBu',\n",
"                              title='Flight count by airline to destination state'\n",
"                              )\n",
"\n",
"        # REVIEW6: Return dcc.Graph component to the empty division\n",
"        return [dcc.Graph(figure=tree_fig),\n",
"                dcc.Graph(figure=pie_fig),\n",
"                dcc.Graph(figure=map_fig),\n",
"                dcc.Graph(figure=bar_fig),\n",
"                dcc.Graph(figure=line_fig)\n",
"               ]\n",
"    else:\n",
"        # REVIEW7: This covers chart type 2 and we have completed this exercise under Flight Delay Time Statistics Dashboard section\n",
"        # Compute required information for creating graph from the data\n",
"        avg_car, avg_weather, avg_NAS, avg_sec, avg_late = compute_data_choice_2(df)\n",
"\n",
"        # Create graph\n",
"        carrier_fig = px.line(avg_car, x='Month', y='CarrierDelay', color='Reporting_Airline', title='Average carrier delay time (minutes) by airline')\n",
"        weather_fig = px.line(avg_weather, x='Month', y='WeatherDelay', color='Reporting_Airline', title='Average weather delay time (minutes) by airline')\n",
"        nas_fig = px.line(avg_NAS, x='Month', y='NASDelay', color='Reporting_Airline', title='Average NAS delay time (minutes) by airline')\n",
"        sec_fig = px.line(avg_sec, x='Month', y='SecurityDelay', color='Reporting_Airline', title='Average security delay time (minutes) by airline')\n",
"        late_fig = px.line(avg_late, x='Month', y='LateAircraftDelay', color='Reporting_Airline', title='Average late aircraft delay time (minutes) by airline')\n",
"\n",
"        return [dcc.Graph(figure=carrier_fig),\n",
"                dcc.Graph(figure=weather_fig),\n",
"                dcc.Graph(figure=nas_fig),\n",
"                dcc.Graph(figure=sec_fig),\n",
"                dcc.Graph(figure=late_fig)]\n",
"\n",
"\n",
"# Run the app\n",
"if __name__ == '__main__':\n",
"    # REVIEW8: Adding dev_tools_ui=False, dev_tools_props_check=False can prevent error appearing before calling callback function\n",
"    app.run_server(mode=\"inline\", host=\"localhost\", debug=False, dev_tools_ui=False, dev_tools_props_check=False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Summary\n",
"\n",
"Congratulations on completing your Dash and Plotly assignment. \n",
"\n",
"More information about the libraries can be found [here](https://dash.plotly.com/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDeveloperSkillsNetworkDV0101ENSkillsNetwork20297740-2021-01-01)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Author\n",
"\n",
"[Saishruthi Swaminathan](https://www.linkedin.com/in/saishruthi-swaminathan/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDeveloperSkillsNetworkDV0101ENSkillsNetwork20297740-2021-01-01) \n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Changelog\n",
"\n",
"| Date | Version | Changed by | Change Description |\n",
"| ---------- | ------- | ---------- | ------------------------------------ |\n",
"| 12-18-2020 | 1.0 | Nayef | Added dataset link and upload to Git |\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## <h3 align=\"center\"> © IBM Corporation 2020. All rights reserved. </h3>\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python",
"language": "python",
"name": "conda-env-python-py"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.13"
}
},
"nbformat": 4,
"nbformat_minor": 4
}