knu2xs/load_csv_spatialdataframe.ipynb

## load_csv_spatialdataframe.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Load Data from a CSV Previously Exported from a SpatialDataFrame\n",
    "\n",
    "Data previously in a SpatialDataFrame exported to a CSV file unfortunately does not currently (at v1.4) load directly back into a SpatialDataFrame. An intermediate step is required to get the SpatialDataFrame to correclty recognize the SHAPE column as geometry."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from arcgis.features import SpatialDataFrame as SDF\n",
    "from arcgis.geometry import Geometry\n",
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load Data into a Pandas DataFrame\n",
    "\n",
    "The first step is to load data into a normal Pandas DataFrame. This gives us a starting point to start manipulating the data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>ID</th>\n",
       "      <th>STATE</th>\n",
       "      <th>CLUSTER</th>\n",
       "      <th>PROJ_EARLY_MORN</th>\n",
       "      <th>SHAPE</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>150010201001</td>\n",
       "      <td>HI</td>\n",
       "      <td>5</td>\n",
       "      <td>214127.969928</td>\n",
       "      <td>{\"rings\": [[[-155.09868999999998, 19.801140001...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>150010201002</td>\n",
       "      <td>HI</td>\n",
       "      <td>5</td>\n",
       "      <td>177040.856850</td>\n",
       "      <td>{\"rings\": [[[-155.10497000099997, 19.869760001...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>150010201003</td>\n",
       "      <td>HI</td>\n",
       "      <td>5</td>\n",
       "      <td>211432.928668</td>\n",
       "      <td>{\"rings\": [[[-155.09024999999997, 19.804200001...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>150010201004</td>\n",
       "      <td>HI</td>\n",
       "      <td>5</td>\n",
       "      <td>208857.945568</td>\n",
       "      <td>{\"rings\": [[[-155.11466041399996, 19.793709357...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>150010202021</td>\n",
       "      <td>HI</td>\n",
       "      <td>5</td>\n",
       "      <td>162845.872009</td>\n",
       "      <td>{\"rings\": [[[-155.09386972499996, 19.757200099...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             ID STATE  CLUSTER  PROJ_EARLY_MORN  \\\n",
       "0  150010201001    HI        5    214127.969928   \n",
       "1  150010201002    HI        5    177040.856850   \n",
       "2  150010201003    HI        5    211432.928668   \n",
       "3  150010201004    HI        5    208857.945568   \n",
       "4  150010202021    HI        5    162845.872009   \n",
       "\n",
       "                                               SHAPE  \n",
       "0  {\"rings\": [[[-155.09868999999998, 19.801140001...  \n",
       "1  {\"rings\": [[[-155.10497000099997, 19.869760001...  \n",
       "2  {\"rings\": [[[-155.09024999999997, 19.804200001...  \n",
       "3  {\"rings\": [[[-155.11466041399996, 19.793709357...  \n",
       "4  {\"rings\": [[[-155.09386972499996, 19.757200099...  "
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = pd.read_csv(path_csv, index_col=0)\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create a DataFrame Without the Geometry Column\n",
    "\n",
    "When we create the SpatialDataFrame, we will load the geometry in through a separate input argumen than the rest of the data. Hence, we need to create a DataFrame _without_ the `SHAPE` column."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>ID</th>\n",
       "      <th>STATE</th>\n",
       "      <th>CLUSTER</th>\n",
       "      <th>PROJ_EARLY_MORN</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>150010201001</td>\n",
       "      <td>HI</td>\n",
       "      <td>5</td>\n",
       "      <td>214127.969928</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>150010201002</td>\n",
       "      <td>HI</td>\n",
       "      <td>5</td>\n",
       "      <td>177040.856850</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>150010201003</td>\n",
       "      <td>HI</td>\n",
       "      <td>5</td>\n",
       "      <td>211432.928668</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>150010201004</td>\n",
       "      <td>HI</td>\n",
       "      <td>5</td>\n",
       "      <td>208857.945568</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>150010202021</td>\n",
       "      <td>HI</td>\n",
       "      <td>5</td>\n",
       "      <td>162845.872009</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             ID STATE  CLUSTER  PROJ_EARLY_MORN\n",
       "0  150010201001    HI        5    214127.969928\n",
       "1  150010201002    HI        5    177040.856850\n",
       "2  150010201003    HI        5    211432.928668\n",
       "3  150010201004    HI        5    208857.945568\n",
       "4  150010202021    HI        5    162845.872009"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_data = df[[col for col in df.columns if col != 'SHAPE']]\n",
    "df_data.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Convert the SHAPE Column to Geometry\n",
    "\n",
    "To illustrate, if we take a look at the first value from the `SHAPE` column, it is actually an extremely long string. To be recoginzed as a Geometry object by the Python API, it needs to be a dictionary."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "str"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "type(df.SHAPE.iloc[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can convert the string to a dictionary by using the Python `eval` method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "dict"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "type(eval(df.SHAPE.iloc[0]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This gets us _most_ of the way there. We still need to create a geometry object. This can be accomplished by converting the dictionary using the Geometry object constructor. The Geometry object constructor reads the geometry type based on the JSON strucutre of the input dictionary."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "arcgis.geometry._types.Polygon"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "type(Geometry(eval(df.SHAPE.iloc[0])))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To convert the entire column, we can use the Pandas `apply` method to apply this conversion to every value in the column."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0    {'rings': [[[-155.09868999999998, 19.801140001...\n",
       "1    {'rings': [[[-155.10497000099997, 19.869760001...\n",
       "2    {'rings': [[[-155.09024999999997, 19.804200001...\n",
       "3    {'rings': [[[-155.11466041399996, 19.793709357...\n",
       "4    {'rings': [[[-155.09386972499996, 19.757200099...\n",
       "Name: SHAPE, dtype: object"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "geom = df.SHAPE.apply(lambda value: Geometry(eval(value)))\n",
    "geom.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, with a Pandas `DataFrame` containing the normal tabular data columns, and a Pandas `Series` popualted with valid `Geometry` objects, we use these as input parameters to create a `SpatialDataFrame`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>ID</th>\n",
       "      <th>STATE</th>\n",
       "      <th>CLUSTER</th>\n",
       "      <th>PROJ_EARLY_MORN</th>\n",
       "      <th>SHAPE</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>150010201001</td>\n",
       "      <td>HI</td>\n",
       "      <td>5</td>\n",
       "      <td>214127.969928</td>\n",
       "      <td>{'rings': [[[-155.09868999999998, 19.801140001...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>150010201002</td>\n",
       "      <td>HI</td>\n",
       "      <td>5</td>\n",
       "      <td>177040.856850</td>\n",
       "      <td>{'rings': [[[-155.10497000099997, 19.869760001...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>150010201003</td>\n",
       "      <td>HI</td>\n",
       "      <td>5</td>\n",
       "      <td>211432.928668</td>\n",
       "      <td>{'rings': [[[-155.09024999999997, 19.804200001...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>150010201004</td>\n",
       "      <td>HI</td>\n",
       "      <td>5</td>\n",
       "      <td>208857.945568</td>\n",
       "      <td>{'rings': [[[-155.11466041399996, 19.793709357...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>150010202021</td>\n",
       "      <td>HI</td>\n",
       "      <td>5</td>\n",
       "      <td>162845.872009</td>\n",
       "      <td>{'rings': [[[-155.09386972499996, 19.757200099...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             ID STATE  CLUSTER  PROJ_EARLY_MORN  \\\n",
       "0  150010201001    HI        5    214127.969928   \n",
       "1  150010201002    HI        5    177040.856850   \n",
       "2  150010201003    HI        5    211432.928668   \n",
       "3  150010201004    HI        5    208857.945568   \n",
       "4  150010202021    HI        5    162845.872009   \n",
       "\n",
       "                                               SHAPE  \n",
       "0  {'rings': [[[-155.09868999999998, 19.801140001...  \n",
       "1  {'rings': [[[-155.10497000099997, 19.869760001...  \n",
       "2  {'rings': [[[-155.09024999999997, 19.804200001...  \n",
       "3  {'rings': [[[-155.11466041399996, 19.793709357...  \n",
       "4  {'rings': [[[-155.09386972499996, 19.757200099...  "
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sdf = SDF(data=df_data, geometry=geom)\n",
    "sdf.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## A Function Doing It All\n",
    "\n",
    "This can all be condensed into a single function facilitating using this as part of larger workflows."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "def read_csv(path_csv):\n",
    "    df = pd.read_csv(path_csv, index_col=0)\n",
    "    sdf = SDF(\n",
    "        data=df[[col for col in df.columns if col != 'SHAPE']],\n",
    "        geometry=df.SHAPE.apply(lambda value: Geometry(eval(value)))\n",
    "    )\n",
    "    return sdf"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For the sake of completeness, this is how using this function looks when used to get the same result."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>ID</th>\n",
       "      <th>STATE</th>\n",
       "      <th>CLUSTER</th>\n",
       "      <th>PROJ_EARLY_MORN</th>\n",
       "      <th>SHAPE</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>150010201001</td>\n",
       "      <td>HI</td>\n",
       "      <td>5</td>\n",
       "      <td>214127.969928</td>\n",
       "      <td>{'rings': [[[-155.09868999999998, 19.801140001...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>150010201002</td>\n",
       "      <td>HI</td>\n",
       "      <td>5</td>\n",
       "      <td>177040.856850</td>\n",
       "      <td>{'rings': [[[-155.10497000099997, 19.869760001...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>150010201003</td>\n",
       "      <td>HI</td>\n",
       "      <td>5</td>\n",
       "      <td>211432.928668</td>\n",
       "      <td>{'rings': [[[-155.09024999999997, 19.804200001...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>150010201004</td>\n",
       "      <td>HI</td>\n",
       "      <td>5</td>\n",
       "      <td>208857.945568</td>\n",
       "      <td>{'rings': [[[-155.11466041399996, 19.793709357...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>150010202021</td>\n",
       "      <td>HI</td>\n",
       "      <td>5</td>\n",
       "      <td>162845.872009</td>\n",
       "      <td>{'rings': [[[-155.09386972499996, 19.757200099...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             ID STATE  CLUSTER  PROJ_EARLY_MORN  \\\n",
       "0  150010201001    HI        5    214127.969928   \n",
       "1  150010201002    HI        5    177040.856850   \n",
       "2  150010201003    HI        5    211432.928668   \n",
       "3  150010201004    HI        5    208857.945568   \n",
       "4  150010202021    HI        5    162845.872009   \n",
       "\n",
       "                                               SHAPE  \n",
       "0  {'rings': [[[-155.09868999999998, 19.801140001...  \n",
       "1  {'rings': [[[-155.10497000099997, 19.869760001...  \n",
       "2  {'rings': [[[-155.09024999999997, 19.804200001...  \n",
       "3  {'rings': [[[-155.11466041399996, 19.793709357...  \n",
       "4  {'rings': [[[-155.09386972499996, 19.757200099...  "
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sdf = read_csv(path_csv)\n",
    "sdf.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
	{
	"cells": [
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"# Load Data from a CSV Previously Exported from a SpatialDataFrame\n",
	"\n",
	"Data previously in a SpatialDataFrame exported to a CSV file unfortunately does not currently (at v1.4) load directly back into a SpatialDataFrame. An intermediate step is required to get the SpatialDataFrame to correclty recognize the SHAPE column as geometry."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 1,
	"metadata": {
	"collapsed": true
	},
	"outputs": [],
	"source": [
	"from arcgis.features import SpatialDataFrame as SDF\n",
	"from arcgis.geometry import Geometry\n",
	"import pandas as pd"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"## Load Data into a Pandas DataFrame\n",
	"\n",
	"The first step is to load data into a normal Pandas DataFrame. This gives us a starting point to start manipulating the data."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 3,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/html": [
	"<div>\n",
	"<style scoped>\n",
	" .dataframe tbody tr th:only-of-type {\n",
	" vertical-align: middle;\n",
	" }\n",
	"\n",
	" .dataframe tbody tr th {\n",
	" vertical-align: top;\n",
	" }\n",
	"\n",
	" .dataframe thead th {\n",
	" text-align: right;\n",
	" }\n",
	"</style>\n",
	"<table border=\"1\" class=\"dataframe\">\n",
	" <thead>\n",
	" <tr style=\"text-align: right;\">\n",
	" <th></th>\n",
	" <th>ID</th>\n",
	" <th>STATE</th>\n",
	" <th>CLUSTER</th>\n",
	" <th>PROJ_EARLY_MORN</th>\n",
	" <th>SHAPE</th>\n",
	" </tr>\n",
	" </thead>\n",
	" <tbody>\n",
	" <tr>\n",
	" <th>0</th>\n",
	" <td>150010201001</td>\n",
	" <td>HI</td>\n",
	" <td>5</td>\n",
	" <td>214127.969928</td>\n",
	" <td>{\"rings\": [[[-155.09868999999998, 19.801140001...</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>1</th>\n",
	" <td>150010201002</td>\n",
	" <td>HI</td>\n",
	" <td>5</td>\n",
	" <td>177040.856850</td>\n",
	" <td>{\"rings\": [[[-155.10497000099997, 19.869760001...</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>2</th>\n",
	" <td>150010201003</td>\n",
	" <td>HI</td>\n",
	" <td>5</td>\n",
	" <td>211432.928668</td>\n",
	" <td>{\"rings\": [[[-155.09024999999997, 19.804200001...</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>3</th>\n",
	" <td>150010201004</td>\n",
	" <td>HI</td>\n",
	" <td>5</td>\n",
	" <td>208857.945568</td>\n",
	" <td>{\"rings\": [[[-155.11466041399996, 19.793709357...</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>4</th>\n",
	" <td>150010202021</td>\n",
	" <td>HI</td>\n",
	" <td>5</td>\n",
	" <td>162845.872009</td>\n",
	" <td>{\"rings\": [[[-155.09386972499996, 19.757200099...</td>\n",
	" </tr>\n",
	" </tbody>\n",
	"</table>\n",
	"</div>"
	],
	"text/plain": [
	" ID STATE CLUSTER PROJ_EARLY_MORN \\\n",
	"0 150010201001 HI 5 214127.969928 \n",
	"1 150010201002 HI 5 177040.856850 \n",
	"2 150010201003 HI 5 211432.928668 \n",
	"3 150010201004 HI 5 208857.945568 \n",
	"4 150010202021 HI 5 162845.872009 \n",
	"\n",
	" SHAPE \n",
	"0 {\"rings\": [[[-155.09868999999998, 19.801140001... \n",
	"1 {\"rings\": [[[-155.10497000099997, 19.869760001... \n",
	"2 {\"rings\": [[[-155.09024999999997, 19.804200001... \n",
	"3 {\"rings\": [[[-155.11466041399996, 19.793709357... \n",
	"4 {\"rings\": [[[-155.09386972499996, 19.757200099... "
	]
	},
	"execution_count": 3,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"df = pd.read_csv(path_csv, index_col=0)\n",
	"df.head()"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"## Create a DataFrame Without the Geometry Column\n",
	"\n",
	"When we create the SpatialDataFrame, we will load the geometry in through a separate input argumen than the rest of the data. Hence, we need to create a DataFrame _without_ the `SHAPE` column."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 4,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/html": [
	"<div>\n",
	"<style scoped>\n",
	" .dataframe tbody tr th:only-of-type {\n",
	" vertical-align: middle;\n",
	" }\n",
	"\n",
	" .dataframe tbody tr th {\n",
	" vertical-align: top;\n",
	" }\n",
	"\n",
	" .dataframe thead th {\n",
	" text-align: right;\n",
	" }\n",
	"</style>\n",
	"<table border=\"1\" class=\"dataframe\">\n",
	" <thead>\n",
	" <tr style=\"text-align: right;\">\n",
	" <th></th>\n",
	" <th>ID</th>\n",
	" <th>STATE</th>\n",
	" <th>CLUSTER</th>\n",
	" <th>PROJ_EARLY_MORN</th>\n",
	" </tr>\n",
	" </thead>\n",
	" <tbody>\n",
	" <tr>\n",
	" <th>0</th>\n",
	" <td>150010201001</td>\n",
	" <td>HI</td>\n",
	" <td>5</td>\n",
	" <td>214127.969928</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>1</th>\n",
	" <td>150010201002</td>\n",
	" <td>HI</td>\n",
	" <td>5</td>\n",
	" <td>177040.856850</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>2</th>\n",
	" <td>150010201003</td>\n",
	" <td>HI</td>\n",
	" <td>5</td>\n",
	" <td>211432.928668</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>3</th>\n",
	" <td>150010201004</td>\n",
	" <td>HI</td>\n",
	" <td>5</td>\n",
	" <td>208857.945568</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>4</th>\n",
	" <td>150010202021</td>\n",
	" <td>HI</td>\n",
	" <td>5</td>\n",
	" <td>162845.872009</td>\n",
	" </tr>\n",
	" </tbody>\n",
	"</table>\n",
	"</div>"
	],
	"text/plain": [
	" ID STATE CLUSTER PROJ_EARLY_MORN\n",
	"0 150010201001 HI 5 214127.969928\n",
	"1 150010201002 HI 5 177040.856850\n",
	"2 150010201003 HI 5 211432.928668\n",
	"3 150010201004 HI 5 208857.945568\n",
	"4 150010202021 HI 5 162845.872009"
	]
	},
	"execution_count": 4,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"df_data = df[[col for col in df.columns if col != 'SHAPE']]\n",
	"df_data.head()"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"## Convert the SHAPE Column to Geometry\n",
	"\n",
	"To illustrate, if we take a look at the first value from the `SHAPE` column, it is actually an extremely long string. To be recoginzed as a Geometry object by the Python API, it needs to be a dictionary."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 5,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"str"
	]
	},
	"execution_count": 5,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"type(df.SHAPE.iloc[0])"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"We can convert the string to a dictionary by using the Python `eval` method."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 6,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"dict"
	]
	},
	"execution_count": 6,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"type(eval(df.SHAPE.iloc[0]))"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"This gets us _most_ of the way there. We still need to create a geometry object. This can be accomplished by converting the dictionary using the Geometry object constructor. The Geometry object constructor reads the geometry type based on the JSON strucutre of the input dictionary."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 7,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"arcgis.geometry._types.Polygon"
	]
	},
	"execution_count": 7,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"type(Geometry(eval(df.SHAPE.iloc[0])))"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"To convert the entire column, we can use the Pandas `apply` method to apply this conversion to every value in the column."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 8,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/plain": [
	"0 {'rings': [[[-155.09868999999998, 19.801140001...\n",
	"1 {'rings': [[[-155.10497000099997, 19.869760001...\n",
	"2 {'rings': [[[-155.09024999999997, 19.804200001...\n",
	"3 {'rings': [[[-155.11466041399996, 19.793709357...\n",
	"4 {'rings': [[[-155.09386972499996, 19.757200099...\n",
	"Name: SHAPE, dtype: object"
	]
	},
	"execution_count": 8,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"geom = df.SHAPE.apply(lambda value: Geometry(eval(value)))\n",
	"geom.head()"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Now, with a Pandas `DataFrame` containing the normal tabular data columns, and a Pandas `Series` popualted with valid `Geometry` objects, we use these as input parameters to create a `SpatialDataFrame`."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 9,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/html": [
	"<div>\n",
	"<style scoped>\n",
	" .dataframe tbody tr th:only-of-type {\n",
	" vertical-align: middle;\n",
	" }\n",
	"\n",
	" .dataframe tbody tr th {\n",
	" vertical-align: top;\n",
	" }\n",
	"\n",
	" .dataframe thead th {\n",
	" text-align: right;\n",
	" }\n",
	"</style>\n",
	"<table border=\"1\" class=\"dataframe\">\n",
	" <thead>\n",
	" <tr style=\"text-align: right;\">\n",
	" <th></th>\n",
	" <th>ID</th>\n",
	" <th>STATE</th>\n",
	" <th>CLUSTER</th>\n",
	" <th>PROJ_EARLY_MORN</th>\n",
	" <th>SHAPE</th>\n",
	" </tr>\n",
	" </thead>\n",
	" <tbody>\n",
	" <tr>\n",
	" <th>0</th>\n",
	" <td>150010201001</td>\n",
	" <td>HI</td>\n",
	" <td>5</td>\n",
	" <td>214127.969928</td>\n",
	" <td>{'rings': [[[-155.09868999999998, 19.801140001...</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>1</th>\n",
	" <td>150010201002</td>\n",
	" <td>HI</td>\n",
	" <td>5</td>\n",
	" <td>177040.856850</td>\n",
	" <td>{'rings': [[[-155.10497000099997, 19.869760001...</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>2</th>\n",
	" <td>150010201003</td>\n",
	" <td>HI</td>\n",
	" <td>5</td>\n",
	" <td>211432.928668</td>\n",
	" <td>{'rings': [[[-155.09024999999997, 19.804200001...</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>3</th>\n",
	" <td>150010201004</td>\n",
	" <td>HI</td>\n",
	" <td>5</td>\n",
	" <td>208857.945568</td>\n",
	" <td>{'rings': [[[-155.11466041399996, 19.793709357...</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>4</th>\n",
	" <td>150010202021</td>\n",
	" <td>HI</td>\n",
	" <td>5</td>\n",
	" <td>162845.872009</td>\n",
	" <td>{'rings': [[[-155.09386972499996, 19.757200099...</td>\n",
	" </tr>\n",
	" </tbody>\n",
	"</table>\n",
	"</div>"
	],
	"text/plain": [
	" ID STATE CLUSTER PROJ_EARLY_MORN \\\n",
	"0 150010201001 HI 5 214127.969928 \n",
	"1 150010201002 HI 5 177040.856850 \n",
	"2 150010201003 HI 5 211432.928668 \n",
	"3 150010201004 HI 5 208857.945568 \n",
	"4 150010202021 HI 5 162845.872009 \n",
	"\n",
	" SHAPE \n",
	"0 {'rings': [[[-155.09868999999998, 19.801140001... \n",
	"1 {'rings': [[[-155.10497000099997, 19.869760001... \n",
	"2 {'rings': [[[-155.09024999999997, 19.804200001... \n",
	"3 {'rings': [[[-155.11466041399996, 19.793709357... \n",
	"4 {'rings': [[[-155.09386972499996, 19.757200099... "
	]
	},
	"execution_count": 9,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"sdf = SDF(data=df_data, geometry=geom)\n",
	"sdf.head()"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"## A Function Doing It All\n",
	"\n",
	"This can all be condensed into a single function facilitating using this as part of larger workflows."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 10,
	"metadata": {},
	"outputs": [],
	"source": [
	"def read_csv(path_csv):\n",
	" df = pd.read_csv(path_csv, index_col=0)\n",
	" sdf = SDF(\n",
	" data=df[[col for col in df.columns if col != 'SHAPE']],\n",
	" geometry=df.SHAPE.apply(lambda value: Geometry(eval(value)))\n",
	" )\n",
	" return sdf"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"For the sake of completeness, this is how using this function looks when used to get the same result."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 11,
	"metadata": {},
	"outputs": [
	{
	"data": {
	"text/html": [
	"<div>\n",
	"<style scoped>\n",
	" .dataframe tbody tr th:only-of-type {\n",
	" vertical-align: middle;\n",
	" }\n",
	"\n",
	" .dataframe tbody tr th {\n",
	" vertical-align: top;\n",
	" }\n",
	"\n",
	" .dataframe thead th {\n",
	" text-align: right;\n",
	" }\n",
	"</style>\n",
	"<table border=\"1\" class=\"dataframe\">\n",
	" <thead>\n",
	" <tr style=\"text-align: right;\">\n",
	" <th></th>\n",
	" <th>ID</th>\n",
	" <th>STATE</th>\n",
	" <th>CLUSTER</th>\n",
	" <th>PROJ_EARLY_MORN</th>\n",
	" <th>SHAPE</th>\n",
	" </tr>\n",
	" </thead>\n",
	" <tbody>\n",
	" <tr>\n",
	" <th>0</th>\n",
	" <td>150010201001</td>\n",
	" <td>HI</td>\n",
	" <td>5</td>\n",
	" <td>214127.969928</td>\n",
	" <td>{'rings': [[[-155.09868999999998, 19.801140001...</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>1</th>\n",
	" <td>150010201002</td>\n",
	" <td>HI</td>\n",
	" <td>5</td>\n",
	" <td>177040.856850</td>\n",
	" <td>{'rings': [[[-155.10497000099997, 19.869760001...</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>2</th>\n",
	" <td>150010201003</td>\n",
	" <td>HI</td>\n",
	" <td>5</td>\n",
	" <td>211432.928668</td>\n",
	" <td>{'rings': [[[-155.09024999999997, 19.804200001...</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>3</th>\n",
	" <td>150010201004</td>\n",
	" <td>HI</td>\n",
	" <td>5</td>\n",
	" <td>208857.945568</td>\n",
	" <td>{'rings': [[[-155.11466041399996, 19.793709357...</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>4</th>\n",
	" <td>150010202021</td>\n",
	" <td>HI</td>\n",
	" <td>5</td>\n",
	" <td>162845.872009</td>\n",
	" <td>{'rings': [[[-155.09386972499996, 19.757200099...</td>\n",
	" </tr>\n",
	" </tbody>\n",
	"</table>\n",
	"</div>"
	],
	"text/plain": [
	" ID STATE CLUSTER PROJ_EARLY_MORN \\\n",
	"0 150010201001 HI 5 214127.969928 \n",
	"1 150010201002 HI 5 177040.856850 \n",
	"2 150010201003 HI 5 211432.928668 \n",
	"3 150010201004 HI 5 208857.945568 \n",
	"4 150010202021 HI 5 162845.872009 \n",
	"\n",
	" SHAPE \n",
	"0 {'rings': [[[-155.09868999999998, 19.801140001... \n",
	"1 {'rings': [[[-155.10497000099997, 19.869760001... \n",
	"2 {'rings': [[[-155.09024999999997, 19.804200001... \n",
	"3 {'rings': [[[-155.11466041399996, 19.793709357... \n",
	"4 {'rings': [[[-155.09386972499996, 19.757200099... "
	]
	},
	"execution_count": 11,
	"metadata": {},
	"output_type": "execute_result"
	}
	],
	"source": [
	"sdf = read_csv(path_csv)\n",
	"sdf.head()"
	]
	},
	{
	"cell_type": "code",
	"execution_count": null,
	"metadata": {
	"collapsed": true
	},
	"outputs": [],
	"source": []
	}
	],
	"metadata": {
	"kernelspec": {
	"display_name": "Python 3",
	"language": "python",
	"name": "python3"
	},
	"language_info": {
	"codemirror_mode": {
	"name": "ipython",
	"version": 3
	},
	"file_extension": ".py",
	"mimetype": "text/x-python",
	"name": "python",
	"nbconvert_exporter": "python",
	"pygments_lexer": "ipython3",
	"version": "3.6.2"
	}
	},
	"nbformat": 4,
	"nbformat_minor": 2
	}