@mhweber
Last active June 9, 2022 21:06
Get StreamCat metrics from secure ftp for particular catchments and certain metrics
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This notebook downloads StreamCat data for the counties that you mentioned.\n",
"Before running it, I selected the counties you named from a county layer, then\n",
"selected the NHD catchments that intersect those counties and exported them as a shapefile,\n",
"which I call cats. The notebook reads that file's attribute table (the dbf) into a DataFrame to get an array of the selected catchments,\n",
"then calls the ftp site and strips the rows for those FEATUREIDs out of each listed table with a '17' in its name,\n",
"which is the zone where the counties lie. The only package that you may need to install to run this is geopandas (imported below), which you\n",
"can install with conda."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"# identify where you want the tables\n",
"out_dir = 'C:/Users/mweber/Temp'"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import os\n",
"import geopandas as gpd\n",
"import ftplib\n",
"#%qtconsole"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# location and name of the NHDPlus catchments you're interested in\n",
"cats = gpd.read_file('C:/Users/mweber/OneDrive - Environmental Protection Agency (EPA)/WA_Catchments.shp')"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'250 Directory changed to /EPADataCommons/ORD/NHDPlusLandscapeAttributes/StreamCat/HydroRegions'"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
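"# HTTPS location of the zipped StreamCat tables; pandas reads the tables from here\n",
"url = 'https://gaftp.epa.gov/EPADataCommons/ORD/NHDPlusLandscapeAttributes/StreamCat/HydroRegions'\n",
"# anonymous FTP connection, used only to list the files in the matching directory\n",
"ftp_url = 'newftp.epa.gov'\n",
"ftp = ftplib.FTP(ftp_url)\n",
"ftp.login()\n",
"ftp.cwd('EPADataCommons/ORD/NHDPlusLandscapeAttributes/StreamCat/HydroRegions')"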
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Loop through all the files in the directory and select only those in the list of metrics you need\n",
"- Read each into a DataFrame, keep only the rows whose COMID is in the cats.FEATUREID array, and write the result out to CSV"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CanalDensity_Region17.zip\n",
"FirePerimetersRipBuf100_Region17.zip\n",
"GeoChemPhys1_Region17.zip\n"
]
}
],
"source": [
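"# zipped StreamCat tables to pull; '17' is the zone where these counties lie\n",
"metric_list = ['CanalDensity_Region17.zip','FirePerimetersRipBuf100_Region17.zip','GeoChemPhys1_Region17.zip']\n",
"for f in ftp.nlst():\n",
"    if f in metric_list:\n",
"        out_file = '{}/{}'.format(out_dir, f.replace('.zip','.csv'))\n",
"        # skip tables that were already downloaded\n",
"        if not os.path.exists(out_file):\n",
"            print(f)\n",
"            # pandas infers zip compression from the file extension\n",
"            data = pd.read_csv('{}/{}'.format(url, f))\n",
"            # keep only the rows for the selected catchments\n",
"            data = data.loc[data.COMID.isin(cats.FEATUREID.values)]\n",
"            data.to_csv(out_file, index=False)"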
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.7"
}
},
"nbformat": 4,
"nbformat_minor": 1
}
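
The filtering step in the notebook's last cell (keep only rows whose COMID is among the selected catchments) can be tried without network access or pandas. The following is a minimal standard-library sketch, not the notebook's own method: it swaps `pd.read_csv` for `zipfile` and `csv`, and assumes each archive holds a single CSV with a `COMID` column. The function names, the `demo.csv` file, the `DEMO_METRIC` column, and the ID values are all hypothetical and exist only for illustration.

```python
import csv
import io
import zipfile


def filter_zipped_csv(zip_bytes, keep_ids, id_column="COMID"):
    """Return (header, rows) for rows whose id_column value is in keep_ids."""
    # IDs are compared as strings, so integer IDs are assumed here.
    keep = {str(i) for i in keep_ids}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        # Assumption: the archive contains exactly one CSV member.
        member = zf.namelist()[0]
        with zf.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.DictReader(text)
            rows = [row for row in reader if row[id_column] in keep]
            header = reader.fieldnames
    return header, rows


def make_demo_zip():
    """Build a tiny in-memory zip with made-up values (illustration only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("demo.csv", "COMID,DEMO_METRIC\n101,0.0\n102,0.5\n103,1.2\n")
    return buf.getvalue()


# Stand-in for cats.FEATUREID.values in the notebook.
header, rows = filter_zipped_csv(make_demo_zip(), keep_ids=[101, 103])
print(header)
print([row["COMID"] for row in rows])
```

In the notebook itself, `data.COMID.isin(cats.FEATUREID.values)` does the same membership test in one vectorized call, which is the better choice for full-size StreamCat tables.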