mhweber/GetStreamCatMetricsList.ipynb

## GetStreamCatMetricsList.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This notebook downloads StreamCat data for the counties that you mentioned.\n",
    "Before running it, I selected the counties you named, then selected the NHDPlus catchments\n",
    "that intersect those counties and exported them as a shapefile, which I call `cats`.\n",
    "The notebook reads that shapefile into a DataFrame to get an array of the selected catchment FEATUREIDs,\n",
    "then reads each listed table with '17' in its name from the FTP site and keeps only the rows matching those FEATUREIDs.\n",
    "Region 17 is the zone where the counties lie. The only package you may need to install to run this is geopandas,\n",
    "which you can install with conda."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "# identify where you want the tables\n",
    "out_dir = 'C:/Users/mweber/Temp'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import os\n",
    "import geopandas as gpd\n",
    "import ftplib\n",
    "#%qtconsole"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "# location and name of the NHDPlus catchments you're interested in\n",
    "cats = gpd.read_file('C:/Users/mweber/OneDrive - Environmental Protection Agency (EPA)/WA_Catchments.shp')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'250 Directory changed to /EPADataCommons/ORD/NHDPlusLandscapeAttributes/StreamCat/HydroRegions'"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# base URL that pandas reads each zipped table from\n",
    "url = 'https://gaftp.epa.gov/EPADataCommons/ORD/NHDPlusLandscapeAttributes/StreamCat/HydroRegions'\n",
    "# anonymous FTP session, used to list the files in the same directory\n",
    "ftp_url = 'newftp.epa.gov'\n",
    "ftp = ftplib.FTP(ftp_url)\n",
    "ftp.login()\n",
    "ftp.cwd('EPADataCommons/ORD/NHDPlusLandscapeAttributes/StreamCat/HydroRegions')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- Loop through all the files in the FTP directory and keep only those in the list of metrics you need\n",
    "- Read each one into a DataFrame, keep only the rows whose COMID is in the cats.FEATUREID array, and write the result out to CSV"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CanalDensity_Region17.zip\n",
      "FirePerimetersRipBuf100_Region17.zip\n",
      "GeoChemPhys1_Region17.zip\n"
     ]
    }
   ],
   "source": [
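    "metric_list = ['CanalDensity_Region17.zip','FirePerimetersRipBuf100_Region17.zip','GeoChemPhys1_Region17.zip']\n",
    "for f in ftp.nlst():\n",
    "    if f in metric_list:\n",
    "        # name the output CSV after the zip, swapping only the extension\n",
    "        out_csv = os.path.join(out_dir, os.path.splitext(f)[0] + '.csv')\n",
    "        # skip tables that were already written on an earlier run\n",
    "        if not os.path.exists(out_csv):\n",
    "            print(f)\n",
    "            # read_csv infers zip compression from the '.zip' extension\n",
    "            data = pd.read_csv('{}/{}'.format(url, f))\n",
    "            # keep only the rows for the selected catchments\n",
    "            data = data.loc[data.COMID.isin(cats.FEATUREID.values)]\n",
    "            data.to_csv(out_csv, index=False)"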
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
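
The file-selection step in the last cell can be tried without an FTP connection. The sketch below is not part of the notebook: `pending_downloads` is a hypothetical helper, the `listing` values stand in for whatever `ftp.nlst()` returns, and `streamcat_tables` is a made-up output folder. It only shows one way to decide which tables still need to be fetched, assuming each output CSV is named after its zip file.

```python
from pathlib import Path


def pending_downloads(remote_files, metric_list, out_dir):
    """Return (zip_name, csv_path) pairs for wanted tables not yet in out_dir.

    Hypothetical helper for illustration; the notebook does this inline.
    """
    wanted = set(metric_list)
    pending = []
    for name in remote_files:
        if name not in wanted:
            continue
        # Swap only the file extension, so a 'zip' elsewhere in the name is untouched.
        csv_path = Path(out_dir) / Path(name).with_suffix('.csv').name
        # Skip tables that an earlier run already wrote.
        if not csv_path.exists():
            pending.append((name, csv_path))
    return pending


# Stand-in for ftp.nlst(); the Region18 and README entries are made up.
listing = [
    'CanalDensity_Region17.zip',
    'CanalDensity_Region18.zip',
    'GeoChemPhys1_Region17.zip',
    'README.txt',
]
metrics = ['CanalDensity_Region17.zip', 'GeoChemPhys1_Region17.zip']

for zip_name, csv_path in pending_downloads(listing, metrics, 'streamcat_tables'):
    print(zip_name, '->', csv_path.name)
```

Separating the selection from the download keeps the skip-if-exists check easy to test on its own; each returned pair can then be passed to `pd.read_csv` and `to_csv` as in the notebook.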