@BryanSWeber
Created March 29, 2023 16:34
Day 1 Lesson.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": [],
"authorship_tag": "ABX9TyMwNVGpBx48It5VQR+7X8TP",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/BryanSWeber/f08f05c5e977dfd0b7a088d3adbd424b/day-1-lesson.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"source": [
"## Day 1 Lecture"
],
"metadata": {
"id": "5GweGhl0_iNz"
}
},
{
"cell_type": "markdown",
"source": [
"Let's begin with a basic task that we know will be successful.\n",
"\n",
"I will ask many questions in the code comments. I *suggest* you explore them on your own."
],
"metadata": {
"id": "EwS8qczVkcw-"
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Jz-U7rjKMPbe"
},
"outputs": [],
"source": [
"#Think of packages as the \"DLC\" of Python. We need two, and we give each a short alias (pd, plt).\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "code",
"source": [
"#We are using pandas to get the data into our computer.\n",
"pd.read_csv('sample_data/california_housing_train.csv')\n",
"#Note the '' or \"\" around the text: quotes mean \"read the whole thing as one string.\""
],
"metadata": {
"id": "V2NvuNpo9m90",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 424
},
"outputId": "2ceb799b-24b4-4782-c83c-e4e7767da3eb"
},
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" longitude latitude housing_median_age total_rooms total_bedrooms \\\n",
"0 -114.31 34.19 15.0 5612.0 1283.0 \n",
"1 -114.47 34.40 19.0 7650.0 1901.0 \n",
"2 -114.56 33.69 17.0 720.0 174.0 \n",
"3 -114.57 33.64 14.0 1501.0 337.0 \n",
"4 -114.57 33.57 20.0 1454.0 326.0 \n",
"... ... ... ... ... ... \n",
"16995 -124.26 40.58 52.0 2217.0 394.0 \n",
"16996 -124.27 40.69 36.0 2349.0 528.0 \n",
"16997 -124.30 41.84 17.0 2677.0 531.0 \n",
"16998 -124.30 41.80 19.0 2672.0 552.0 \n",
"16999 -124.35 40.54 52.0 1820.0 300.0 \n",
"\n",
" population households median_income median_house_value \n",
"0 1015.0 472.0 1.4936 66900.0 \n",
"1 1129.0 463.0 1.8200 80100.0 \n",
"2 333.0 117.0 1.6509 85700.0 \n",
"3 515.0 226.0 3.1917 73400.0 \n",
"4 624.0 262.0 1.9250 65500.0 \n",
"... ... ... ... ... \n",
"16995 907.0 369.0 2.3571 111400.0 \n",
"16996 1194.0 465.0 2.5179 79000.0 \n",
"16997 1244.0 456.0 3.0313 103600.0 \n",
"16998 1298.0 478.0 1.9797 85800.0 \n",
"16999 806.0 270.0 3.0147 94600.0 \n",
"\n",
"[17000 rows x 9 columns]"
]
},
"metadata": {},
"execution_count": 16
}
]
},
{
"cell_type": "code",
"source": [
"#If we don't save it as an object, it will just print and disappear. :0!\n",
"myDataObj = pd.read_csv('sample_data/california_housing_train.csv')"
],
"metadata": {
"id": "FRR8gdEo-SMA"
},
"execution_count": null,
"outputs": []
},
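{
"cell_type": "markdown",
"source": [
"A quick sanity check after saving the data is to look at its size and column names. The sketch below assumes `myDataObj` from the cell above; `.shape`, `.columns`, and `.describe()` are standard pandas tools, but which of them you find useful is a matter of taste."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"#A minimal sketch, assuming myDataObj was created in the cell above.\n",
"print(myDataObj.shape)   #(number of rows, number of columns)\n",
"print(myDataObj.columns) #the column names\n",
"myDataObj.describe()     #summary statistics for the numeric columns"
],
"metadata": {},
"execution_count": null,
"outputs": []
},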
{
"cell_type": "code",
"source": [
"#Let's look at our data in a meaningful way! This is the histogram command.\n",
"myhist = plt.hist(myDataObj['population'])"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 265
},
"id": "yzUdcHxR-1hO",
"outputId": "1ace4eff-eeea-43fd-ca0c-d4cd351a529b"
},
"execution_count": null,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
],
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYMAAAD4CAYAAAAO9oqkAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/NK7nSAAAACXBIWXMAAAsTAAALEwEAmpwYAAAWRklEQVR4nO3dfZBd9X3f8fenksHPloAtpZKmkmPFGcEkNt4AGaeexHSQwJ6IP4hHTFJURxPNxHLqpGltiGdCapsZcNLiMLHxKEZBuB6EQpyiqXGIiklpp+FhMY8CY60FtqQBtEYCJ/UEIvztH/cn+7LeXWnvvfsg6/2a2dlzvud3zv2es1f67D3n7L2pKiRJJ7Z/NtcNSJLmnmEgSTIMJEmGgSQJw0CSBCyc6wZ6ddppp9Xy5cvnug1JOq488MAD362qofH14zYMli9fzsjIyFy3IUnHlSTfnqjuaSJJkmEgSTIMJEkYBpIkDANJEoaBJAnDQJKEYSBJwjCQJHEc/wVyP5Zf/pU5edynr37fnDyuJB3NUV8ZJNmS5ECSx8bVfzvJN5LsSvLprvoVSUaTPJlkdVd9TauNJrm8q74iyb2tfkuSkwa1c5KkY3Msp4luBNZ0F5L8MrAW+LmqOhP441ZfBawDzmzrfC7JgiQLgM8CFwKrgEvbWIBrgGur6m3AIWBDvzslSZqeo4ZBVd0NHBxX/i3g6qp6qY050OprgW1V9VJVPQWMAue0r9Gq2lNVLwPbgLVJArwXuLWtvxW4uL9dkiRNV68XkH8a+Nft9M7/SvLzrb4E2Ns1bl+rTVY/FXihqg6Pq08oycYkI0lGxsbGemxdkjRer2GwEDgFOA/4T8D29lv+jKqqzVU1XFXDQ0M/9nbckqQe9Xo30T7gy1VVwH1JfgCcBuwHlnWNW9pqTFJ/HliUZGF7ddA9XpI0S3p9ZfDfgV8GSPLTwEnAd4EdwLokJydZAawE7gPuB1a2O4dOonOReUcLk7uAS9p21wO39diTJKlHR31lkORm4JeA05LsA64EtgBb2u2mLwPr23/su5JsBx4HDgObquqVtp0PA3cAC4AtVbWrPcTHgG1JPgU8CNwwwP2TJB2Do4ZBVV06yaJfn2T8VcBVE9RvB26foL6Hzt1GkqQ54ttRSJIMA0mSYSBJwjCQJGEYSJIwDCRJGAaSJAwDSRKGgSQJw0CShGEgScIwkCRhGEiSMAwkSRgGkiQMA0kSxxAGSbYkOdA+1Wz8st9LUklOa/NJcl2S0SSPJDm7a+z6JLvb1/qu+ruSPNrWuS5JBrVzkqRjcyyvDG4E1owvJlkGXAB8p6t8IZ3PPV4JbASub2NPofNxmefS+VSzK5MsbutcD/xm13o/9liSpJl11DCoqruBgxMsuhb4KFBdtbXATdVxD7AoyRnAamBnVR2sqkPATmBNW/bmqrqnfYbyTcDFfe2RJGnaerpmkGQtsL+qHh63aAmwt2t+X6tNVd83QX2yx92YZCTJyNjYWC+tS5ImMO0wSPJ64PeBPxh8O1Orqs1VNVxVw0NDQ7P98JL0E6uXVwY/BawAHk7yNLAU+HqSfwHsB5Z1jV3aalPVl05QlyTNommHQVU9WlX/vKqWV9VyOqd2zq6qZ4EdwGXtrqLzgBer6hngDuCCJIvbheMLgDvasu8lOa/dRXQZcNuA9k2SdIyO5dbSm4G/A96eZF+SDVMMvx3YA4wCfwZ8CKCqDgKfBO5vX59oNdqYL7R1vgV8tbddkST1auHRBlTVpUdZvrxruoBNk4zbAmyZoD4CnHW0PiRJM8e/QJYkGQaSJMNAkoRhIEnCMJAkYRhIkjAMJEkYBpIkDANJEoaBJAnDQJKEYSBJwjCQJGEYSJIwDCRJGAaSJI7tk862JDmQ5LGu2h8l+UaSR5L8VZJFXcuuSDKa5Mkkq7vqa1ptNMnlXfUVSe5t9VuSnDTA/ZMkHYNjeWVwI7BmXG0ncFZV/SzwTeAKgCSrgH
XAmW2dzyVZkGQB8FngQmAVcGkbC3ANcG1VvQ04BEz1sZqSpBlw1DCoqruBg+Nqf1NVh9vsPcDSNr0W2FZVL1XVU3Q+1/ic9jVaVXuq6mVgG7A2SYD3Are29bcCF/e3S5Kk6RrENYPf4EcfYr8E2Nu1bF+rTVY/FXihK1iO1CeUZGOSkSQjY2NjA2hdkgR9hkGSjwOHgS8Npp2pVdXmqhququGhoaHZeEhJOiEs7HXFJP8OeD9wflVVK+8HlnUNW9pqTFJ/HliUZGF7ddA9XpI0S3p6ZZBkDfBR4Feq6vtdi3YA65KcnGQFsBK4D7gfWNnuHDqJzkXmHS1E7gIuaeuvB27rbVckSb06lltLbwb+Dnh7kn1JNgB/CrwJ2JnkoSSfB6iqXcB24HHgr4FNVfVK+63/w8AdwBPA9jYW4GPAf0gySucawg0D3UNJ0lEd9TRRVV06QXnS/7Cr6irgqgnqtwO3T1DfQ+duI0nSHPEvkCVJhoEkyTCQJGEYSJIwDCRJGAaSJAwDSRKGgSQJw0CShGEgScIwkCRhGEiSMAwkSRgGkiQMA0kShoEkCcNAksSxfezlliQHkjzWVTslyc4ku9v3xa2eJNclGU3ySJKzu9ZZ38bvTrK+q/6uJI+2da5LkkHvpCRpasfyyuBGYM242uXAnVW1ErizzQNcCKxsXxuB66ETHsCVwLl0PuLyyiMB0sb8Ztd64x9LkjTDjhoGVXU3cHBceS2wtU1vBS7uqt9UHfcAi5KcAawGdlbVwao6BOwE1rRlb66qe6qqgJu6tiVJmiW9XjM4vaqeadPPAqe36SXA3q5x+1ptqvq+CeoTSrIxyUiSkbGxsR5blySN1/cF5PYbfQ2gl2N5rM1VNVxVw0NDQ7PxkJJ0Qug1DJ5rp3ho3w+0+n5gWde4pa02VX3pBHVJ0izqNQx2AEfuCFoP3NZVv6zdVXQe8GI7nXQHcEGSxe3C8QXAHW3Z95Kc1+4iuqxrW5KkWbLwaAOS3Az8EnBakn107gq6GtieZAPwbeADbfjtwEXAKPB94IMAVXUwySeB+9u4T1TVkYvSH6Jzx9LrgK+2L0nSLDpqGFTVpZMsOn+CsQVsmmQ7W4AtE9RHgLOO1ockaeb4F8iSJMNAkmQYSJIwDCRJGAaSJAwDSRKGgSQJw0CShGEgScIwkCRhGEiSMAwkSRgGkiQMA0kShoEkCcNAkkSfYZDkd5PsSvJYkpuTvDbJiiT3JhlNckuSk9rYk9v8aFu+vGs7V7T6k0lW97lPkqRp6jkMkiwB/j0wXFVnAQuAdcA1wLVV9TbgELChrbIBONTq17ZxJFnV1jsTWAN8LsmCXvuSJE1fv6eJFgKvS7IQeD3wDPBe4Na2fCtwcZte2+Zpy89PklbfVlUvVdVTdD4/+Zw++5IkTUPPYVBV+4E/Br5DJwReBB4AXqiqw23YPmBJm14C7G3rHm7jT+2uT7DOqyTZmGQkycjY2FivrUuSxunnNNFiOr/VrwD+JfAGOqd5ZkxVba6q4aoaHhoamsmHkqQTSj+nif4N8FRVjVXVPwFfBt4NLGqnjQCWAvvb9H5gGUBb/hbg+e76BOtIkmZBP2HwHeC8JK9v5/7PBx4H7gIuaWPWA7e16R1tnrb8a1VVrb6u3W20AlgJ3NdHX5KkaVp49CETq6p7k9wKfB04DDwIbAa+AmxL8qlWu6GtcgPwxSSjwEE6dxBRVbuSbKcTJIeBTVX1Sq99SZKmr+cwAKiqK4Erx5X3MMHdQFX1j8CvTrKdq4Cr+ulFktQ7/wJZkmQYSJIMA0kShoEkCcNAkoRhIEnCMJAkYRhIkjAMJEkYBpIkDANJEoaBJAnDQJKEYSBJwjCQJGEYSJLoMwySLEpya5JvJHkiyS8kOSXJziS72/fFbWySXJdkNMkjSc7u2s76Nn53kvWTP6IkaSb0+8rgT4C/rqqfAX4OeAK4HLizqlYCd7Z5gAvpfL7xSmAjcD1AklPofFrauXQ+Ie3KIw
EiSZodPYdBkrcA76F9xnFVvVxVLwBrga1t2Fbg4ja9FripOu4BFiU5A1gN7Kyqg1V1CNgJrOm1L0nS9PXzymAFMAb8eZIHk3whyRuA06vqmTbmWeD0Nr0E2Nu1/r5Wm6wuSZol/YTBQuBs4Pqqeifw//jRKSEAqqqA6uMxXiXJxiQjSUbGxsYGtVlJOuH1Ewb7gH1VdW+bv5VOODzXTv/Qvh9oy/cDy7rWX9pqk9V/TFVtrqrhqhoeGhrqo3VJUreew6CqngX2Jnl7K50PPA7sAI7cEbQeuK1N7wAua3cVnQe82E4n3QFckGRxu3B8QatJkmbJwj7X/23gS0lOAvYAH6QTMNuTbAC+DXygjb0duAgYBb7fxlJVB5N8Eri/jftEVR3ssy9J0jT0FQZV9RAwPMGi8ycYW8CmSbazBdjSTy+SpN75F8iSJMNAkmQYSJIwDCRJGAaSJAwDSRKGgSQJw0CShGEgScIwkCRhGEiSMAwkSRgGkiQMA0kShoEkCcNAkoRhIEliAGGQZEGSB5P8jza/Ism9SUaT3NI+EpMkJ7f50bZ8edc2rmj1J5Os7rcnSdL0DOKVwUeAJ7rmrwGuraq3AYeADa2+ATjU6te2cSRZBawDzgTWAJ9LsmAAfUmSjlFfYZBkKfA+4AttPsB7gVvbkK3AxW16bZunLT+/jV8LbKuql6rqKWAUOKefviRJ09PvK4PPAB8FftDmTwVeqKrDbX4fsKRNLwH2ArTlL7bxP6xPsM6rJNmYZCTJyNjYWJ+tS5KO6DkMkrwfOFBVDwywnylV1eaqGq6q4aGhodl6WEn6ibewj3XfDfxKkouA1wJvBv4EWJRkYfvtfymwv43fDywD9iVZCLwFeL6rfkT3OpKkWdDzK4OquqKqllbVcjoXgL9WVb8G3AVc0oatB25r0zvaPG3516qqWn1du9toBbASuK/XviRJ09fPK4PJfAzYluRTwIPADa1+A/DFJKPAQToBQlXtSrIdeBw4DGyqqldmoC9J0iQGEgZV9bfA37bpPUxwN1BV/SPwq5OsfxVw1SB6kSRNn3+BLEkyDCRJhoEkCcNAkoRhIEnCMJAkYRhIkjAMJEkYBpIkDANJEoaBJAnDQJKEYSBJwjCQJGEYSJIwDCRJ9BEGSZYluSvJ40l2JflIq5+SZGeS3e374lZPkuuSjCZ5JMnZXdta38bvTrJ+sseUJM2Mfl4ZHAZ+r6pWAecBm5KsAi4H7qyqlcCdbR7gQjqfb7wS2AhcD53wAK4EzqXzCWlXHgkQSdLs6DkMquqZqvp6m/574AlgCbAW2NqGbQUubtNrgZuq4x5gUZIzgNXAzqo6WFWHgJ3Aml77kiRN30CuGSRZDrwTuBc4vaqeaYueBU5v00uAvV2r7Wu1yeoTPc7GJCNJRsbGxgbRuiSJAYRBkjcCfwn8TlV9r3tZVRVQ/T5G1/Y2V9VwVQ0PDQ0NarOSdMLrKwySvIZOEHypqr7cys+10z+07wdafT+wrGv1pa02WV2SNEv6uZsowA3AE1X1X7sW7QCO3BG0Hritq35Zu6voPODFdjrpDuCCJIvbheMLWk2SNEsW9rHuu4F/Czya5KFW+33gamB7kg3At4EPtGW3AxcBo8D3gQ8CVNXBJJ8E7m/jPlFVB/voS5I0TT2HQVX9HyCTLD5/gvEFbJpkW1uALb32Iknqj3+BLEkyDCRJhoEkCcNAkoRhIEnCMJAkYRhIkjAMJEkYBpIkDANJEoaBJAnDQJJEf+9aqmlafvlX5uyxn776fXP22JLmP18ZSJIMA0mSYSBJwjCQJDGPwiDJmiRPJhlNcvlc9yNJJ5J5EQZJFgCfBS4EVgGXJlk1t11J0oljvtxaeg4wWlV7AJJsA9YCj89pVz9B5uq2Vm9plY4P8yUMlgB7u+b3AeeOH5RkI7Cxzf5Dkid7fLzTgO/2uO5sO156nbDPXDMHnRzd8XJM4fjp9XjpE46fXmeqz381UXG+hMExqarNwOZ+t5
NkpKqGB9DSjDteej1e+gR7nQnHS59w/PQ6233Oi2sGwH5gWdf80laTJM2C+RIG9wMrk6xIchKwDtgxxz1J0gljXpwmqqrDST4M3AEsALZU1a4ZfMi+TzXNouOl1+OlT7DXmXC89AnHT6+z2meqajYfT5I0D82X00SSpDlkGEiSTqwwmC9veZHk6SSPJnkoyUirnZJkZ5Ld7fviVk+S61rPjyQ5u2s769v43UnWD6i3LUkOJHmsqzaw3pK8q+37aFs3A+zzD5Psb8f1oSQXdS27oj3mk0lWd9UnfE60mxnubfVb2o0NPUmyLMldSR5PsivJR1p9Xh3XKfqcd8c1yWuT3Jfk4dbrf55q+0lObvOjbfnyXvdhgL3emOSpruP6jlafm39XVXVCfNG5MP0t4K3AScDDwKo56uVp4LRxtU8Dl7fpy4Fr2vRFwFeBAOcB97b6KcCe9n1xm148gN7eA5wNPDYTvQH3tbFp6144wD7/EPiPE4xd1X7eJwMr2vNgwVTPCWA7sK5Nfx74rT6O6RnA2W36TcA3W0/z6rhO0ee8O65tP9/Ypl8D3Nv2f8LtAx8CPt+m1wG39LoPA+z1RuCSCcbPyc//RHpl8MO3vKiql4Ejb3kxX6wFtrbprcDFXfWbquMeYFGSM4DVwM6qOlhVh4CdwJp+m6iqu4GDM9FbW/bmqrqnOs/gm7q2NYg+J7MW2FZVL1XVU8AonefDhM+J9lvVe4FbJ9jnXnp9pqq+3qb/HniCzl/dz6vjOkWfk5mz49qOzT+02de0r5pi+93H+lbg/NbPtPZhwL1OZk5+/idSGEz0lhdTPdFnUgF/k+SBdN5iA+D0qnqmTT8LnN6mJ+t7NvdnUL0tadPj64P04fbSesuR0y499Hkq8EJVHR50n+30xDvp/HY4b4/ruD5hHh7XJAuSPAQcoPMf47em2P4Pe2rLX2z9zMq/r/G9VtWR43pVO67XJjl5fK/H2NNAfv4nUhjMJ79YVWfTeZfWTUne072wpfu8vOd3PvcGXA/8FPAO4Bngv8xpN+MkeSPwl8DvVNX3upfNp+M6QZ/z8rhW1StV9Q4671hwDvAzc9vR5Mb3muQs4Ao6Pf88nVM/H5u7Dk+sMJg3b3lRVfvb9wPAX9F5Ij/XXu7Rvh9owyfrezb3Z1C97W/TM9JzVT3X/tH9APgzOse1lz6fp/PSfOG4es+SvIbOf7Bfqqovt/K8O64T9Tmfj2vr7wXgLuAXptj+D3tqy9/S+pnVf19dva5pp+Wqql4C/pzej+tgfv7TvchwvH7R+WvrPXQuEh25IHTmHPTxBuBNXdP/l865/j/i1RcTP92m38erLybdVz+6mPQUnQtJi9v0KQPqcTmvvjA7sN748QtdFw2wzzO6pn+XzrlggDN59UXCPXQuEE76nAD+gldfiPxQH32Gznncz4yrz6vjOkWf8+64AkPAojb9OuB/A++fbPvAJl59AXl7r/swwF7P6DrunwGuntOff69P8OPxi85V+m/SObf48Tnq4a3tifUwsOtIH3TOX94J7Ab+Z9cPOXQ++OdbwKPAcNe2foPOBa9R4IMD6u9mOqcC/onOuccNg+wNGAYea+v8Ke2v4AfU5xdbH4/QeW+r7v/EPt4e80m67rSY7DnRfk73tf7/Aji5j2P6i3ROAT0CPNS+Lppvx3WKPufdcQV+Fniw9fQY8AdTbR94bZsfbcvf2us+DLDXr7Xj+hjw3/jRHUdz8vP37SgkSSfUNQNJ0iQMA0mSYSBJMgwkSRgGkiQMA0kShoEkCfj/E7YzjWQ7ur0AAAAASUVORK5CYII=\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
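{
"cell_type": "markdown",
"source": [
"The default histogram has no axis labels and only a few bars. One possible refinement is sketched below; it assumes `myDataObj` and `plt` from the earlier cells, and `bins=50` is an arbitrary choice for illustration, not a recommendation."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"#A sketch only: assumes myDataObj and plt exist from the cells above.\n",
"fig, ax = plt.subplots() #Create a figure and a set of axes.\n",
"ax.hist(myDataObj['population'], bins=50) #More bins show more detail; 50 is arbitrary.\n",
"ax.set(xlabel=\"population\", ylabel=\"Number of rows\") #Label both axes.\n",
"plt.show()"
],
"metadata": {},
"execution_count": null,
"outputs": []
},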
{
"cell_type": "markdown",
"source": [
"## Problem Solving\n"
],
"metadata": {
"id": "z-bDg3hYkUX4"
}
},
{
"cell_type": "markdown",
"source": [
"Let's make another plot/graph for a new data set that isn't already in Colab.\n",
"\n",
"How do we get our own data into Python? Drag and drop the file into the Files panel in Colab's left sidebar.\n"
],
"metadata": {
"id": "jVhFly4k_hW0"
}
},
{
"cell_type": "code",
"source": [
"titanicData = pd.read_csv('titanic.csv')\n",
"titanicData.head() #What does this command do? Why would I do it? Any guesses as to what titanicData.tail() would do?"
],
"metadata": {
"id": "b2RThVwQ-_kY",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 336
},
"outputId": "e99f51bd-054b-4fd7-d537-19255b80bbe4"
},
"execution_count": null,
"outputs": [
{
"output_type": "error",
"ename": "FileNotFoundError",
"evalue": "ignored",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mFileNotFoundError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-19-3ab1df7d767a>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mtitanicData\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mpd\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mread_csv\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'titanic.csv'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0mtitanicData\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mhead\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m#What does this command do? Why would I do it? Any guesses as to what titanicData.tail() would do?\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.8/dist-packages/pandas/util/_decorators.py\u001b[0m in \u001b[0;36mwrapper\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m 309\u001b[0m \u001b[0mstacklevel\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mstacklevel\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 310\u001b[0m )\n\u001b[0;32m--> 311\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mfunc\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 312\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 313\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mwrapper\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.8/dist-packages/pandas/io/parsers/readers.py\u001b[0m in \u001b[0;36mread_csv\u001b[0;34m(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, encoding_errors, dialect, error_bad_lines, warn_bad_lines, on_bad_lines, delim_whitespace, low_memory, memory_map, float_precision, storage_options)\u001b[0m\n\u001b[1;32m 584\u001b[0m \u001b[0mkwds\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mupdate\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mkwds_defaults\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 585\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 586\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0m_read\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfilepath_or_buffer\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mkwds\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 587\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 588\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.8/dist-packages/pandas/io/parsers/readers.py\u001b[0m in \u001b[0;36m_read\u001b[0;34m(filepath_or_buffer, kwds)\u001b[0m\n\u001b[1;32m 480\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 481\u001b[0m \u001b[0;31m# Create the parser.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 482\u001b[0;31m \u001b[0mparser\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mTextFileReader\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfilepath_or_buffer\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwds\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 483\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 484\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mchunksize\u001b[0m \u001b[0;32mor\u001b[0m \u001b[0miterator\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.8/dist-packages/pandas/io/parsers/readers.py\u001b[0m in \u001b[0;36m__init__\u001b[0;34m(self, f, engine, **kwds)\u001b[0m\n\u001b[1;32m 809\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0moptions\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m\"has_index_names\"\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mkwds\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m\"has_index_names\"\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 810\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 811\u001b[0;31m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_engine\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_make_engine\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mengine\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 812\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 813\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mclose\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.8/dist-packages/pandas/io/parsers/readers.py\u001b[0m in \u001b[0;36m_make_engine\u001b[0;34m(self, engine)\u001b[0m\n\u001b[1;32m 1038\u001b[0m )\n\u001b[1;32m 1039\u001b[0m \u001b[0;31m# error: Too many arguments for \"ParserBase\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1040\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mmapping\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mengine\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mf\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0moptions\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# type: ignore[call-arg]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 1041\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1042\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0m_failover_to_python\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.8/dist-packages/pandas/io/parsers/c_parser_wrapper.py\u001b[0m in \u001b[0;36m__init__\u001b[0;34m(self, src, **kwds)\u001b[0m\n\u001b[1;32m 49\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 50\u001b[0m \u001b[0;31m# open handles\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 51\u001b[0;31m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_open_handles\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0msrc\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mkwds\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 52\u001b[0m \u001b[0;32massert\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mhandles\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 53\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.8/dist-packages/pandas/io/parsers/base_parser.py\u001b[0m in \u001b[0;36m_open_handles\u001b[0;34m(self, src, kwds)\u001b[0m\n\u001b[1;32m 220\u001b[0m \u001b[0mLet\u001b[0m \u001b[0mthe\u001b[0m \u001b[0mreaders\u001b[0m \u001b[0mopen\u001b[0m \u001b[0mIOHandles\u001b[0m \u001b[0mafter\u001b[0m \u001b[0mthey\u001b[0m \u001b[0mare\u001b[0m \u001b[0mdone\u001b[0m \u001b[0;32mwith\u001b[0m \u001b[0mtheir\u001b[0m \u001b[0mpotential\u001b[0m \u001b[0mraises\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 221\u001b[0m \"\"\"\n\u001b[0;32m--> 222\u001b[0;31m self.handles = get_handle(\n\u001b[0m\u001b[1;32m 223\u001b[0m \u001b[0msrc\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 224\u001b[0m \u001b[0;34m\"r\"\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.8/dist-packages/pandas/io/common.py\u001b[0m in \u001b[0;36mget_handle\u001b[0;34m(path_or_buf, mode, encoding, compression, memory_map, is_text, errors, storage_options)\u001b[0m\n\u001b[1;32m 700\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mioargs\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mencoding\u001b[0m \u001b[0;32mand\u001b[0m \u001b[0;34m\"b\"\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mioargs\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmode\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 701\u001b[0m \u001b[0;31m# Encoding\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 702\u001b[0;31m handle = open(\n\u001b[0m\u001b[1;32m 703\u001b[0m \u001b[0mhandle\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 704\u001b[0m \u001b[0mioargs\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmode\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mFileNotFoundError\u001b[0m: [Errno 2] No such file or directory: 'titanic.csv'"
]
}
]
},
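{
"cell_type": "markdown",
"source": [
"A `FileNotFoundError` like the one above usually means the file has not been uploaded yet, or has a different name. One way to check before reading is sketched below; it assumes the file is meant to sit in the current working folder under the name `titanic.csv`."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"import os\n",
"import pandas as pd\n",
"\n",
"print(os.listdir('.')) #Which files can Python see in the working folder?\n",
"\n",
"if os.path.exists('titanic.csv'):\n",
"    titanicData = pd.read_csv('titanic.csv')\n",
"else:\n",
"    print(\"titanic.csv not found. Upload it first, then rerun this cell.\")"
],
"metadata": {},
"execution_count": null,
"outputs": []
},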
{
"cell_type": "code",
"source": [
"#Uh oh!\n",
"#plt.pie(titanicData['class'])"
],
"metadata": {
"id": "MOSsEzdt_RLF"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"Hmm... the command failed. This isn't unusual, so get used to it happening. Your response should be to first read the error message, particularly the end. Then search for more information.\n",
"\n",
"My search, in order:\n",
"\n",
"1. pie chart in matplotlib\n",
"2. pie chart string matplotlib\n",
"3. matplotlib pie chart string input (worked!)\n",
"\n",
"\n",
"Note that the successful search contains the *name of the package* and takes into account the error message I received: *Could not convert* **string** *to* **float**. You should be familiar with these bolded terms from your Zybooks readings.\n",
"\n",
"Solution was found here: https://stackoverflow.com/questions/63687789/how-do-i-create-a-pie-chart-using-categorical-data-in-matplotlib"
],
"metadata": {
"id": "TYESXbL1g_XH"
}
},
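{
"cell_type": "markdown",
"source": [
"The error message quoted above can be reproduced on a tiny, made-up column of text labels (not the real Titanic data). This sketch assumes `plt.pie` is handed raw strings, which is what the commented-out line would do; the name `toyClass` is hypothetical."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"\n",
"#Made-up values for illustration only.\n",
"toyClass = pd.Series(['First', 'Third', 'Third', 'Second', 'Third'])\n",
"\n",
"try:\n",
"    plt.pie(toyClass) #A pie chart needs numbers, not text.\n",
"except ValueError as err:\n",
"    print(err) #Read the message, especially the end.\n",
"\n",
"plt.close() #Discard the empty figure left behind by the failed call."
],
"metadata": {},
"execution_count": null,
"outputs": []
},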
{
"cell_type": "markdown",
"source": [
"But the solution isn't exactly what I am looking for (different context, etc.). How do we adapt? **Always test your solutions one piece at a time.**"
],
"metadata": {
"id": "FcsQp9qNkP_K"
}
},
{
"cell_type": "code",
"source": [
"# Let's test the first part.\n",
"# What do you think .groupby('class') does?\n",
"# How about .size()?\n",
"\n",
"collapsedTitanic = titanicData.groupby('class').size()\n",
"collapsedTitanic\n",
"\n",
"#You will eventually need this groupby function, it is in about 40% of projects."
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 241
},
"id": "TvH5MMvLkF14",
"outputId": "b7ed594d-7a4b-4968-ad78-c7f1297e0998"
},
"execution_count": null,
"outputs": [
{
"output_type": "error",
"ename": "NameError",
"evalue": "ignored",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-21-35bc9e3321ee>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0;31m# How about .size()?\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 5\u001b[0;31m \u001b[0mcollapsedTitanic\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtitanicData\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mgroupby\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'class'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msize\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 6\u001b[0m \u001b[0mcollapsedTitanic\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 7\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mNameError\u001b[0m: name 'titanicData' is not defined"
]
}
]
},
{
"cell_type": "code",
"source": [
"titanicPie = plt.pie(collapsedTitanic)\n",
"#No labels using this approach. Checking documentation again.\n",
"#Programming with small data operates on a \"figure it out\" basis. \n",
"#With large data, test everything on a smaller set before you try to \"figure it out\" on the big one. You will only need to make this mistake once or twice... unless you're paid hourly."
],
"metadata": {
"id": "EkOItTzNgu6n"
},
"execution_count": null,
"outputs": []
},
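{
"cell_type": "markdown",
"source": [
"One way to get labels with `plt.pie` is its `labels` argument. The sketch below uses a small made-up table (the names `toyData` and `toyCounts` are hypothetical) so that it runs even without the Titanic file."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"\n",
"#Made-up rows for illustration only; not the real Titanic data.\n",
"toyData = pd.DataFrame({'class': ['First', 'Third', 'Third', 'Second', 'Third']})\n",
"\n",
"toyCounts = toyData.groupby('class').size() #Count the rows in each class.\n",
"print(toyCounts)\n",
"\n",
"plt.pie(toyCounts, labels=toyCounts.index) #The group names become the slice labels.\n",
"plt.show()"
],
"metadata": {},
"execution_count": null,
"outputs": []
},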
{
"cell_type": "code",
"source": [
"#Alternative method from Stack Overflow, has partial labels.\n",
"titanicPie = collapsedTitanic.plot(kind = 'pie')"
],
"metadata": {
"id": "iPAtOaFhhhWb"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"#Adding labels.\n",
"titanicPie = collapsedTitanic.plot(kind = 'pie')\n",
"titanicPie = titanicPie.set_ylabel('Class of Titanic Passengers')"
],
"metadata": {
"id": "kn6nMpmlmaja"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"That looks good. If this were my personal work, I would now:\n",
"1. Go back and clean up my incorrect/bad code. \n",
"2. RERUN everything from the first cell. (Restart runtime: Ctrl+M then period; Run all: Ctrl+F9)\n",
"\n",
"You will end up with bad code if you only do one cleanup at the end. Why? If you're too tired to do it immediately, you'll be too tired to do it at the end.\n",
"\n",
"More uses of the `groupby` command:"
],
"metadata": {
"id": "8A7rwR7BqaTW"
}
},
{
"cell_type": "code",
"source": [
"#What can I tell from this? Which class would I rather be from?\n",
"titanicData.groupby(['survived','class']).size()"
],
"metadata": {
"id": "69ayfCdRpamz"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"#Why is this better and what did I do?\n",
"titanicData.groupby(['survived','class']).size()/titanicData.groupby(['class']).size() * 100\n",
"#Always manually check that your plan works."
],
"metadata": {
"id": "Vl8jwI6_qT0Z"
},
"execution_count": null,
"outputs": []
},
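{
"cell_type": "markdown",
"source": [
"If `survived` is coded as 0/1 (an assumption about the file, not something shown above), the average of that column within each class is the survival share. A sketch on made-up rows, with the hypothetical name `toySurvival`, that you can check by hand:"
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"import pandas as pd\n",
"\n",
"#Made-up rows for illustration only; not the real Titanic data.\n",
"toySurvival = pd.DataFrame({\n",
"    'survived': [1, 0, 0, 1, 0, 1],\n",
"    'class': ['First', 'First', 'Second', 'Second', 'Third', 'Third']\n",
"})\n",
"\n",
"#The mean of a 0/1 column is the share of 1s; times 100 gives a percentage.\n",
"toySurvival.groupby('class')['survived'].mean() * 100"
],
"metadata": {},
"execution_count": null,
"outputs": []
},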
{
"cell_type": "markdown",
"source": [
"Now do your own groupbys on the Titanic data, and make a new chart of this data (one I haven't shown you).\n",
"\n"
],
"metadata": {
"id": "WLLvGXy8rCck"
}
},
{
"cell_type": "code",
"source": [
"fig, ax = plt.subplots() #Create a figure and a set of axes.\n",
"ax.scatter(titanicData['age'], titanicData['fare']) #Tell it what to plot on the axes.\n",
"ax.set(xlabel = \"Passenger Age\", ylabel = \"Passenger Fare\") #Tell it what to put on the labels.\n",
"plt.show() #Show us the plot."
],
"metadata": {
"id": "P1W8HJaUrBoL"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"Great work, everyone. Don't forget to look at your first project.\n",
"Continue the Zybooks work."
],
"metadata": {
"id": "aFRBOP9dseBB"
}
}
]
}