@barronh
Created March 12, 2021 16:45
Python AQS API Example
{
"cells": [
{
"cell_type": "markdown",
"id": "comfortable-consent",
"metadata": {},
"source": [
"# Query AQS API Parameters and Perform Analysis\n",
"\n",
" author: Barron H. Henderson\n",
" date: 2021-03-12\n",
"\n",
"\n",
"Overview:\n",
"* List Criteria Pollutant and CBSA codes\n",
"* Query observational data for a pollutant and CBSA on a day\n",
"* Show descriptive statistics\n",
"* Plot a spatial map\n",
"\n",
"Prerequisites: Python3, numpy, pandas, matplotlib, and pycno\n",
"\n",
"Tested on: Windows, Linux, Google Colab"
]
},
{
"cell_type": "markdown",
"id": "stunning-conflict",
"metadata": {},
"source": [
"## Import Libraries\n",
"\n",
"* `urllib` and `json` are part of the Python standard library\n",
" * `urllib` is used to interact with the AQS API\n",
" * `json` is used to interpret the results\n",
"* `pandas` is used to convert the JSON to a tabular format\n",
"* `pycno` is a Python interface to NASA's overlays for mapping"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "enormous-format",
"metadata": {},
"outputs": [],
"source": [
"from urllib.request import urlretrieve, urlopen\n",
"import json\n",
"import pandas as pd\n",
"import pycno"
]
},
{
"cell_type": "markdown",
"id": "economic-vietnamese",
"metadata": {},
"source": [
"## Configure to use your own credentials\n",
"\n",
"* The test credentials will only work for small queries\n",
"* see https://aqs.epa.gov/aqsweb/documents/data_api.html"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "interim-brass",
"metadata": {},
"outputs": [],
"source": [
"user = \"test@aqs.api\"\n",
"key = \"test\""
]
},
{
"cell_type": "markdown",
"id": "advance-husband",
"metadata": {},
"source": [
"## Display CBSA Codes\n",
"\n",
"* The API uses codes that correspond to CBSAs (Core-Based Statistical Areas).\n",
"* This is the first query, which retrieves the CBSA codes.\n",
"* The display is filtered to Rhode Island because the result is short, but you can change RI to any state abbreviation.\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "limited-toddler",
"metadata": {},
"outputs": [],
"source": [
"responsetxt = urlopen(f'https://aqs.epa.gov/data/api/list/cbsas?email={user}&key={key}').read()\n",
"cbsa_df = pd.DataFrame.from_dict(json.loads(responsetxt)['Data']).set_index('code')"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "focal-country",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>value_represented</th>\n",
" </tr>\n",
" <tr>\n",
" <th>code</th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>39300</th>\n",
" <td>Providence-Warwick, RI-MA</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" value_represented\n",
"code \n",
"39300 Providence-Warwick, RI-MA"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Find a CBSA by state\n",
"cbsa_df.query('value_represented.str.contains(\"RI\")')"
]
},
{
"cell_type": "markdown",
"id": "occupational-agreement",
"metadata": {},
"source": [
"## Display Pollutant Parameter Codes\n",
"\n",
"* This is similar to CBSAs, but for pollutants.\n",
"* First, we query the parameter classes.\n",
"* Then, we use the CRITERIA class to query pollutant codes."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "concrete-therapy",
"metadata": {},
"outputs": [],
"source": [
"responsetxt = urlopen(f'https://aqs.epa.gov/data/api/list/classes?email={user}&key={key}').read()\n",
"polltype_df = pd.DataFrame.from_dict(json.loads(responsetxt)['Data']).set_index('code')"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "hollywood-drawing",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>value_represented</th>\n",
" </tr>\n",
" <tr>\n",
" <th>code</th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>AIRNOW MAPS</th>\n",
" <td>The parameters represented on AirNow maps (881...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>ALL</th>\n",
" <td>Select all Parameters Available</td>\n",
" </tr>\n",
" <tr>\n",
" <th>AQI POLLUTANTS</th>\n",
" <td>Pollutants that have an AQI Defined</td>\n",
" </tr>\n",
" <tr>\n",
" <th>CORE_HAPS</th>\n",
" <td>Urban Air Toxic Pollutants</td>\n",
" </tr>\n",
" <tr>\n",
" <th>CRITERIA</th>\n",
" <td>Criteria Pollutants</td>\n",
" </tr>\n",
" <tr>\n",
" <th>CSN DART</th>\n",
" <td>List of CSN speciation parameters to populate ...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>FORECAST</th>\n",
" <td>Parameters routinely extracted by AirNow (STI)</td>\n",
" </tr>\n",
" <tr>\n",
" <th>HAPS</th>\n",
" <td>Hazardous Air Pollutants</td>\n",
" </tr>\n",
" <tr>\n",
" <th>IMPROVE CARBON</th>\n",
" <td>IMPROVE Carbon Parameters</td>\n",
" </tr>\n",
" <tr>\n",
" <th>IMPROVE_SPECIATION</th>\n",
" <td>PM2.5 Speciated Parameters Measured at IMPROVE...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>MET</th>\n",
" <td>Meteorological Parameters</td>\n",
" </tr>\n",
" <tr>\n",
" <th>NATTS CORE HAPS</th>\n",
" <td>The core list of toxics of interest to the NAT...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>NATTS REQUIRED</th>\n",
" <td>Required compounds to be collected in the Nati...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>PAMS</th>\n",
" <td>Photochemical Assessment Monitoring System</td>\n",
" </tr>\n",
" <tr>\n",
" <th>PAMS_VOC</th>\n",
" <td>Volatile Organic Compound subset of the PAMS P...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>PM COARSE</th>\n",
" <td>PM between 2.5 and 10 micrometers</td>\n",
" </tr>\n",
" <tr>\n",
" <th>PM10 SPECIATION</th>\n",
" <td>PM10 Speciated Parameters</td>\n",
" </tr>\n",
" <tr>\n",
" <th>PM2.5 CONT NONREF</th>\n",
" <td>PM2.5 Continuous, Nonreference Methods</td>\n",
" </tr>\n",
" <tr>\n",
" <th>PM2.5 MASS/QA</th>\n",
" <td>PM2.5 Mass and QA Parameters</td>\n",
" </tr>\n",
" <tr>\n",
" <th>SCHOOL AIR TOXICS</th>\n",
" <td>School Air Toxics Program Parameters</td>\n",
" </tr>\n",
" <tr>\n",
" <th>SPECIATION</th>\n",
" <td>PM2.5 Speciated Parameters</td>\n",
" </tr>\n",
" <tr>\n",
" <th>SPECIATION CARBON</th>\n",
" <td>PM2.5 Speciation Carbon Parameters</td>\n",
" </tr>\n",
" <tr>\n",
" <th>SPECIATION CATION/ANION</th>\n",
" <td>PM2.5 Speciation Cation/Anion Parameters</td>\n",
" </tr>\n",
" <tr>\n",
" <th>SPECIATION METALS</th>\n",
" <td>PM2.5 Speciation Metal Parameters</td>\n",
" </tr>\n",
" <tr>\n",
" <th>UATMP CARBONYL</th>\n",
" <td>Urban Air Toxics Monitoring Program Carbonyls</td>\n",
" </tr>\n",
" <tr>\n",
" <th>UATMP VOC</th>\n",
" <td>Urban Air Toxics Monitoring Program VOCs</td>\n",
" </tr>\n",
" <tr>\n",
" <th>VOC</th>\n",
" <td>Volatile organic compounds</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" value_represented\n",
"code \n",
"AIRNOW MAPS The parameters represented on AirNow maps (881...\n",
"ALL Select all Parameters Available\n",
"AQI POLLUTANTS Pollutants that have an AQI Defined\n",
"CORE_HAPS Urban Air Toxic Pollutants\n",
"CRITERIA Criteria Pollutants\n",
"CSN DART List of CSN speciation parameters to populate ...\n",
"FORECAST Parameters routinely extracted by AirNow (STI)\n",
"HAPS Hazardous Air Pollutants\n",
"IMPROVE CARBON IMPROVE Carbon Parameters\n",
"IMPROVE_SPECIATION PM2.5 Speciated Parameters Measured at IMPROVE...\n",
"MET Meteorological Parameters\n",
"NATTS CORE HAPS The core list of toxics of interest to the NAT...\n",
"NATTS REQUIRED Required compounds to be collected in the Nati...\n",
"PAMS Photochemical Assessment Monitoring System\n",
"PAMS_VOC Volatile Organic Compound subset of the PAMS P...\n",
"PM COARSE PM between 2.5 and 10 micrometers\n",
"PM10 SPECIATION PM10 Speciated Parameters\n",
"PM2.5 CONT NONREF PM2.5 Continuous, Nonreference Methods\n",
"PM2.5 MASS/QA PM2.5 Mass and QA Parameters\n",
"SCHOOL AIR TOXICS School Air Toxics Program Parameters\n",
"SPECIATION PM2.5 Speciated Parameters\n",
"SPECIATION CARBON PM2.5 Speciation Carbon Parameters\n",
"SPECIATION CATION/ANION PM2.5 Speciation Cation/Anion Parameters\n",
"SPECIATION METALS PM2.5 Speciation Metal Parameters\n",
"UATMP CARBONYL Urban Air Toxics Monitoring Program Carbonyls\n",
"UATMP VOC Urban Air Toxics Monitoring Program VOCs\n",
"VOC Volatile organic compounds"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"polltype_df"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "heard-burden",
"metadata": {},
"outputs": [],
"source": [
"responsetxt = urlopen(f'https://aqs.epa.gov/data/api/list/parametersByClass?email={user}&key={key}&pc=CRITERIA').read()\n",
"param_df = pd.DataFrame.from_dict(json.loads(responsetxt)['Data']).set_index('code')"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "judicial-northwest",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>value_represented</th>\n",
" </tr>\n",
" <tr>\n",
" <th>code</th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>14129</th>\n",
" <td>Lead (TSP) LC</td>\n",
" </tr>\n",
" <tr>\n",
" <th>42101</th>\n",
" <td>Carbon monoxide</td>\n",
" </tr>\n",
" <tr>\n",
" <th>42401</th>\n",
" <td>Sulfur dioxide</td>\n",
" </tr>\n",
" <tr>\n",
" <th>42602</th>\n",
" <td>Nitrogen dioxide (NO2)</td>\n",
" </tr>\n",
" <tr>\n",
" <th>44201</th>\n",
" <td>Ozone</td>\n",
" </tr>\n",
" <tr>\n",
" <th>81102</th>\n",
" <td>PM10 Total 0-10um STP</td>\n",
" </tr>\n",
" <tr>\n",
" <th>85129</th>\n",
" <td>Lead PM10 LC FRM/FEM</td>\n",
" </tr>\n",
" <tr>\n",
" <th>88101</th>\n",
" <td>PM2.5 - Local Conditions</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" value_represented\n",
"code \n",
"14129 Lead (TSP) LC\n",
"42101 Carbon monoxide\n",
"42401 Sulfur dioxide\n",
"42602 Nitrogen dioxide (NO2)\n",
"44201 Ozone\n",
"81102 PM10 Total 0-10um STP\n",
"85129 Lead PM10 LC FRM/FEM\n",
"88101 PM2.5 - Local Conditions"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"param_df"
]
},
{
"cell_type": "markdown",
"id": "wired-enemy",
"metadata": {},
"source": [
"## Select Data Query\n",
"\n",
"Based on the previous queries, we know:\n",
"1. 88101 corresponds to PM2.5 (Local Conditions)\n",
"2. 16980 corresponds to Chicago-Naperville-Elgin, IL-IN-WI\n",
"\n",
"In this case, I chose to focus on a single day: July 4th, 2020."
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "boxed-headset",
"metadata": {},
"outputs": [],
"source": [
"param = 88101\n",
"cbsa = 16980\n",
"bdate = 20200704\n",
"edate = 20200704"
]
},
{
"cell_type": "markdown",
"id": "agricultural-greenhouse",
"metadata": {},
"source": [
"## Now download and Archive Data\n",
"\n",
"* Storing data in a json file\n",
"* Loading data into `records` a pandas.DataFrame"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "federal-omaha",
"metadata": {},
"outputs": [],
"source": [
"# This is where results will be stored\n",
"archivepath = 'temp.json'"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "million-scout",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"('temp.json', <http.client.HTTPMessage at 0x161634fefd0>)"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Download the query result and save it to archivepath\n",
"cbsa_url = f'https://aqs.epa.gov/data/api/dailyData/byCBSA?email={user}&key={key}&param={param}&bdate={bdate}&edate={edate}&cbsa={cbsa}'\n",
"urlretrieve(cbsa_url, archivepath)\n"
]
},
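{
"cell_type": "code",
"execution_count": 12,
"id": "growing-tennessee",
"metadata": {},
"outputs": [],
"source": [
"# Read the archived JSON; the with-block closes the file afterwards\n",
"with open(archivepath, 'r') as archivefile:\n",
"    data = json.load(archivefile)\n",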
"records = pd.DataFrame.from_dict(data['Data'])"
]
},
{
"cell_type": "markdown",
"id": "mineral-controversy",
"metadata": {},
"source": [
"## Show Descriptive Information\n",
"\n",
"* Grouping by duration because the results can be quite different.\n",
"* `pandas.DataFrame.describe` displays counts, means, standard deviation and quantile information"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "dominican-swing",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>sample_duration</th>\n",
" <th>1 HOUR</th>\n",
" <th>24-HR BLK AVG</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th rowspan=\"8\" valign=\"top\">arithmetic_mean</th>\n",
" <th>count</th>\n",
" <td>13.000000</td>\n",
" <td>52.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>18.917338</td>\n",
" <td>18.869231</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>8.456861</td>\n",
" <td>8.227966</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>11.391304</td>\n",
" <td>11.300000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25%</th>\n",
" <td>12.220833</td>\n",
" <td>12.200000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>15.570833</td>\n",
" <td>15.500000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75%</th>\n",
" <td>23.150000</td>\n",
" <td>23.100000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>37.233333</td>\n",
" <td>37.200000</td>\n",
" </tr>\n",
" <tr>\n",
" <th rowspan=\"8\" valign=\"top\">first_max_value</th>\n",
" <th>count</th>\n",
" <td>13.000000</td>\n",
" <td>52.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>62.476923</td>\n",
" <td>18.869231</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>68.668675</td>\n",
" <td>8.227966</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>15.200000</td>\n",
" <td>11.300000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25%</th>\n",
" <td>19.600000</td>\n",
" <td>12.200000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>38.600000</td>\n",
" <td>15.500000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75%</th>\n",
" <td>66.000000</td>\n",
" <td>23.100000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>247.600000</td>\n",
" <td>37.200000</td>\n",
" </tr>\n",
" <tr>\n",
" <th rowspan=\"8\" valign=\"top\">first_max_hour</th>\n",
" <th>count</th>\n",
" <td>13.000000</td>\n",
" <td>52.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>18.153846</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>6.135228</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25%</th>\n",
" <td>15.000000</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>21.000000</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75%</th>\n",
" <td>22.000000</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>23.000000</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"sample_duration 1 HOUR 24-HR BLK AVG\n",
"arithmetic_mean count 13.000000 52.000000\n",
" mean 18.917338 18.869231\n",
" std 8.456861 8.227966\n",
" min 11.391304 11.300000\n",
" 25% 12.220833 12.200000\n",
" 50% 15.570833 15.500000\n",
" 75% 23.150000 23.100000\n",
" max 37.233333 37.200000\n",
"first_max_value count 13.000000 52.000000\n",
" mean 62.476923 18.869231\n",
" std 68.668675 8.227966\n",
" min 15.200000 11.300000\n",
" 25% 19.600000 12.200000\n",
" 50% 38.600000 15.500000\n",
" 75% 66.000000 23.100000\n",
" max 247.600000 37.200000\n",
"first_max_hour count 13.000000 52.000000\n",
" mean 18.153846 0.000000\n",
" std 6.135228 0.000000\n",
" min 1.000000 0.000000\n",
" 25% 15.000000 0.000000\n",
" 50% 21.000000 0.000000\n",
" 75% 22.000000 0.000000\n",
" max 23.000000 0.000000"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"records.filter(['sample_duration', 'arithmetic_mean', 'first_max_value', 'first_max_hour']).groupby('sample_duration').describe().T"
]
},
{
"cell_type": "markdown",
"id": "arabic-scholarship",
"metadata": {},
"source": [
"## Display Data as a Map\n",
"\n",
"* `pycno` is used to fetch and display a medium resolution map\n",
" * You may get a warning the first time this is run. pycno is downloading the overlay.\n",
" * You could replace the pycno functionality with `basemap` or `cartopy`, which have more options.\n",
" * You can also make your own `cnob` or `cno` files from shapefiles or kml, but that is not covered here.\n",
"* `records.plot.scatter` is used to display the arithmetic_mean"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "temporal-turning",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 2 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"ax = records.plot.scatter(x='longitude', y='latitude', c='arithmetic_mean', cmap='viridis')\n",
"cno = pycno.cno()\n",
"cno.draw('MWDB_Coasts_USA_1.cnob');"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "hidden-condition",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"((41.157377749999995, 42.568881250000004), (-88.240557, -87.14957700000001))"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ax.get_ylim(), ax.get_xlim()"
]
},
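{
"cell_type": "markdown",
"id": "chubby-lesbian",
"metadata": {},
"source": [
"## Many more options are available in addition to CBSA\n",
"\n",
"* You could replace `cbsa_url` with any of the following:\n",
"* `box_url` to use a longitude/latitude bounding box\n",
"* `state_url` to query a whole state by FIPS code (17 is Illinois)\n",
"* `county_url` to query a single county (031 is Cook County, IL)\n",
"* `site_url` to query a single monitoring site\n",
"* see https://aqs.epa.gov/aqsweb/documents/data_api.html"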
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "minor-budget",
"metadata": {},
"outputs": [],
"source": [
"box_url = f'https://aqs.epa.gov/data/api/dailyData/byBox?email={user}&key={key}&param={param}&bdate={bdate}&edate={edate}&minlat=41.15&maxlat=42.6&minlon=-88.24&maxlon=-87.15'\n",
"state_url = f'https://aqs.epa.gov/data/api/dailyData/byState?email={user}&key={key}&param={param}&bdate={bdate}&edate={edate}&state=17'\n",
"county_url = f'https://aqs.epa.gov/data/api/dailyData/byCounty?email={user}&key={key}&param={param}&bdate={bdate}&edate={edate}&state=17&county=031'\n",
"site_url = f'https://aqs.epa.gov/data/api/dailyData/bySite?email={user}&key={key}&param={param}&bdate={bdate}&edate={edate}&state=17&county=031&site=4201'"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "powered-guyana",
"metadata": {},
"outputs": [],
"source": [
"responsetxt = urlopen(state_url).read()\n",
"records = pd.DataFrame.from_dict(json.loads(responsetxt)['Data'])"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "turned-involvement",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>sample_duration</th>\n",
" <th>1 HOUR</th>\n",
" <th>24-HR BLK AVG</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th rowspan=\"8\" valign=\"top\">arithmetic_mean</th>\n",
" <th>count</th>\n",
" <td>17.000000</td>\n",
" <td>68.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>23.102496</td>\n",
" <td>23.064706</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>9.275212</td>\n",
" <td>9.064277</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>11.266667</td>\n",
" <td>11.200000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25%</th>\n",
" <td>16.787500</td>\n",
" <td>16.700000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>20.575000</td>\n",
" <td>20.500000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75%</th>\n",
" <td>31.809091</td>\n",
" <td>31.800000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>40.695833</td>\n",
" <td>40.600000</td>\n",
" </tr>\n",
" <tr>\n",
" <th rowspan=\"8\" valign=\"top\">first_max_value</th>\n",
" <th>count</th>\n",
" <td>17.000000</td>\n",
" <td>68.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>102.841176</td>\n",
" <td>23.064706</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>77.531518</td>\n",
" <td>9.064277</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>22.500000</td>\n",
" <td>11.200000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25%</th>\n",
" <td>51.800000</td>\n",
" <td>16.700000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>66.100000</td>\n",
" <td>20.500000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75%</th>\n",
" <td>150.800000</td>\n",
" <td>31.800000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>247.600000</td>\n",
" <td>40.600000</td>\n",
" </tr>\n",
" <tr>\n",
" <th rowspan=\"8\" valign=\"top\">first_max_hour</th>\n",
" <th>count</th>\n",
" <td>17.000000</td>\n",
" <td>68.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>20.705882</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>5.132795</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25%</th>\n",
" <td>21.000000</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>22.000000</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75%</th>\n",
" <td>22.000000</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>23.000000</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"sample_duration 1 HOUR 24-HR BLK AVG\n",
"arithmetic_mean count 17.000000 68.000000\n",
" mean 23.102496 23.064706\n",
" std 9.275212 9.064277\n",
" min 11.266667 11.200000\n",
" 25% 16.787500 16.700000\n",
" 50% 20.575000 20.500000\n",
" 75% 31.809091 31.800000\n",
" max 40.695833 40.600000\n",
"first_max_value count 17.000000 68.000000\n",
" mean 102.841176 23.064706\n",
" std 77.531518 9.064277\n",
" min 22.500000 11.200000\n",
" 25% 51.800000 16.700000\n",
" 50% 66.100000 20.500000\n",
" 75% 150.800000 31.800000\n",
" max 247.600000 40.600000\n",
"first_max_hour count 17.000000 68.000000\n",
" mean 20.705882 0.000000\n",
" std 5.132795 0.000000\n",
" min 1.000000 0.000000\n",
" 25% 21.000000 0.000000\n",
" 50% 22.000000 0.000000\n",
" 75% 22.000000 0.000000\n",
" max 23.000000 0.000000"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"records.filter(['sample_duration', 'arithmetic_mean', 'first_max_value', 'first_max_hour']).groupby(['sample_duration']).describe().T"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "dirty-syntax",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 2 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"ax = records.plot.scatter(x='longitude', y='latitude', c='arithmetic_mean', cmap='viridis')\n",
"cno = pycno.cno()\n",
"cno.draw('MWDB_Coasts_USA_1.cnob');"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "mounted-distribution",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.7"
},
"toc-showtags": false
},
"nbformat": 4,
"nbformat_minor": 5
}
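
The notebook builds every query URL with an f-string and reads `['Data']` straight from the response. The following is a minimal sketch of two helpers that could make that pattern a little more robust. It is not part of the original notebook. `build_aqs_url` and `extract_data` are hypothetical names, and the sketch assumes that the API replies with a JSON object holding a `Header` list and a `Data` list, as the notebook's use of `['Data']` suggests.

```python
from urllib.parse import urlencode

# Base address taken from the URLs used in the notebook
AQS_BASE = 'https://aqs.epa.gov/data/api'


def build_aqs_url(service, user, key, **params):
    """Return an AQS API URL with every query value percent-encoded.

    `service` is the path after the base address, e.g. 'dailyData/byCBSA'.
    Hypothetical helper; the notebook itself uses f-strings instead.
    """
    query = {'email': user, 'key': key}
    query.update(params)
    return f'{AQS_BASE}/{service}?{urlencode(query)}'


def extract_data(payload):
    """Return the list of records from a decoded AQS response.

    Assumes the layout {'Header': [{...}], 'Data': [...]}. Raises ValueError
    with the header contents when no 'Data' block is present, so a failed
    request does not surface later as an unexplained KeyError.
    """
    header = (payload.get('Header') or [{}])[0]
    if 'Data' not in payload:
        raise ValueError(f'AQS response has no Data block: {header}')
    return payload['Data']
```

With these helpers, the CBSA query could be written as `build_aqs_url('dailyData/byCBSA', user, key, param=param, bdate=bdate, edate=edate, cbsa=cbsa)`, and the table could be created with `pd.DataFrame(extract_data(json.loads(responsetxt)))`. Encoding the query with `urlencode` matters mostly for the email address, which contains an `@` character.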