@davixcky
Last active August 29, 2020 20:10
Introduction to data science - Universidad del Norte
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "IS_HW1.ipynb",
"provenance": [],
"collapsed_sections": [],
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/davixcky/a9d1729af3e098a7b2ad8692297ff4a3/is_hw1.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "UvEzkU-64k9W",
"colab_type": "text"
},
"source": [
"# **Introduction to Systems Engineering**"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "DhIuksNiAtmU",
"colab_type": "text"
},
"source": [
"![unnamed.png](https://upload.wikimedia.org/wikipedia/commons/thumb/f/ff/Logo_uninorte_colombia.jpg/360px-Logo_uninorte_colombia.jpg)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "q1dCRnET7JUK",
"colab_type": "text"
},
"source": [
"Professor: Elías D. Niño-Ruiz, Ph.D."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "JNRYWNXx6sye",
"colab_type": "text"
},
"source": [
"* Student name: David Orozco\n",
"* Student ID: 200152584"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "de09gPd_4TYz",
"colab_type": "text"
},
"source": [
"### ***Activity 1 - Individual - DataFrame, Seaborn, PyPlot, and Aggregation***"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "vzuhewTv5qiY",
"colab_type": "text"
},
"source": [
"In this activity you will work individually on the database of accidents recorded on Calle 30 in Barranquilla, Colombia. As you know, the data source is hosted on [Datos Abiertos](https://www.datos.gov.co/); specifically, see [Accidentes de Tránsito Barranquilla Calle 30](https://www.datos.gov.co/Transporte/accidentes-calle-30-2015-2019/sefb-a755)."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-90zLfj-8bnF",
"colab_type": "text"
},
"source": [
"For our analysis, we first load the Python libraries (**do not forget to run ALL the code cells**)."
]
},
{
"cell_type": "code",
"metadata": {
"id": "f6q5Iob77g2P",
"colab_type": "code",
"colab": {}
},
"source": [
"import pandas as pd\n",
"import seaborn as sns\n",
"import matplotlib.pyplot as plt"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "ZYf9ZNeY7d26",
"colab_type": "text"
},
"source": [
"We create the `DataFrame` from the database hosted on the Datos Abiertos server:"
]
},
{
"cell_type": "code",
"metadata": {
"id": "Dk3VYRf34kBk",
"colab_type": "code",
"colab": {}
},
"source": [
"# Load the accident records straight from the Datos Abiertos CSV export\n",
"df_trans = pd.read_csv('https://www.datos.gov.co/api/views/yb9r-2dsi/rows.csv?accessType=DOWNLOAD')"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "F8auYgO48I3_",
"colab_type": "text"
},
"source": [
"The first five records of the `df_trans` DataFrame are:"
]
},
{
"cell_type": "code",
"metadata": {
"id": "4LXn52-B4St8",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 326
},
"outputId": "433a9502-621a-4d1f-bd3a-b50ef8f88625"
},
"source": [
"df_trans.head(5)"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>FECHA_ACCIDENTE</th>\n",
" <th>AÑO_ACCIDENTE</th>\n",
" <th>MES_ACCIDENTE</th>\n",
" <th>DIA_ACCIDENTE</th>\n",
" <th>HORA_ACCIDENTE</th>\n",
" <th>GRAVEDAD_ACCIDENTE</th>\n",
" <th>CLASE_ACCIDENTE</th>\n",
" <th>SITIO_EXACTO_ACCIDENTE</th>\n",
" <th>CANT_HERIDOS_EN _SITIO_ACCIDENTE</th>\n",
" <th>CANT_MUERTOS_EN _SITIO_ACCIDENTE</th>\n",
" <th>CANTIDAD_ACCIDENTES</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>01/01/2015 12:00:00 AM</td>\n",
" <td>2015</td>\n",
" <td>1</td>\n",
" <td>Jue</td>\n",
" <td>02:10:00:PM</td>\n",
" <td>Con heridos</td>\n",
" <td>Choque</td>\n",
" <td>VIA 40 CON 77</td>\n",
" <td>1.0</td>\n",
" <td>NaN</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>01/01/2015 12:00:00 AM</td>\n",
" <td>2015</td>\n",
" <td>1</td>\n",
" <td>Jue</td>\n",
" <td>02:15:00:PM</td>\n",
" <td>Solo daños</td>\n",
" <td>Choque</td>\n",
" <td>CALLE 14 CR 13</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>01/01/2015 12:00:00 AM</td>\n",
" <td>2015</td>\n",
" <td>1</td>\n",
" <td>Jue</td>\n",
" <td>02:20:00:PM</td>\n",
" <td>Solo daños</td>\n",
" <td>Choque</td>\n",
" <td>CL 74 CR 38C</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>01/01/2015 12:00:00 AM</td>\n",
" <td>2015</td>\n",
" <td>1</td>\n",
" <td>Jue</td>\n",
" <td>03:30:00:PM</td>\n",
" <td>Con heridos</td>\n",
" <td>Choque</td>\n",
" <td>CL 45 CR 19</td>\n",
" <td>2.0</td>\n",
" <td>NaN</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>01/01/2015 12:00:00 AM</td>\n",
" <td>2015</td>\n",
" <td>1</td>\n",
" <td>Jue</td>\n",
" <td>04:20:00:AM</td>\n",
" <td>Solo daños</td>\n",
" <td>Choque</td>\n",
" <td>CRA 15 CLLE 21</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" FECHA_ACCIDENTE ... CANTIDAD_ACCIDENTES\n",
"0 01/01/2015 12:00:00 AM ... 1\n",
"1 01/01/2015 12:00:00 AM ... 1\n",
"2 01/01/2015 12:00:00 AM ... 1\n",
"3 01/01/2015 12:00:00 AM ... 1\n",
"4 01/01/2015 12:00:00 AM ... 1\n",
"\n",
"[5 rows x 11 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 17
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Bb_rITOF-ZwU",
"colab_type": "text"
},
"source": [
"**Point 1.** Create a violin plot (`violinplot`) showing the number of accidents per day of the week. What conclusions can you draw from the plot?"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "g6qvt9R4-4Rx",
"colab_type": "text"
},
"source": [
"**Answer.**"
]
},
{
"cell_type": "code",
"metadata": {
"id": "Utl5k8QlDWG-",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 280
},
"outputId": "6771673f-ba4c-4cf4-be46-d00854033d26"
},
"source": [
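"# Weekday labels as they appear in DIA_ACCIDENTE, in calendar order\n",
"label_dia = ['Dom', 'Lun', 'Mar', 'Mié', 'Jue', 'Vie', 'Sáb']\n",
"\n",
"df_semana = df_trans.copy()\n",
"# The trailing semicolon suppresses the Axes repr in the notebook output\n",
"sns.violinplot(data=df_semana,\n",
"               x='DIA_ACCIDENTE',\n",
"               y='CANTIDAD_ACCIDENTES',\n",
"               order=label_dia);"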
],
"execution_count": null,
"outputs": [
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"tags": [],
"needs_background": "light"
}
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "yLwmGXx4-ZId",
"colab_type": "text"
},
"source": [
"Friday is the day of the week with the highest number of accidents per day."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "2xfXLGk5_gyJ",
"colab_type": "text"
},
"source": [
"**Point 2.** Create a box plot (`boxplot`) showing the number of accidents by severity. What can you conclude from the resulting plot?"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "kWJkAH4R_0MG",
"colab_type": "text"
},
"source": [
"**Answer.**"
]
},
{
"cell_type": "code",
"metadata": {
"id": "IheJ50nf_1WM",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 298
},
"outputId": "dc2b1743-28c6-4015-8c0e-fcd1d89cd2d6"
},
"source": [
"# Severity categories in increasing order of seriousness\n",
"labels_stick = ['Solo daños', 'Con heridos', 'Con muertos']\n",
"\n",
"df_accidentes = df_trans.copy()\n",
"sns.boxplot(data=df_accidentes,\n",
"            x='GRAVEDAD_ACCIDENTE',\n",
"            y='CANTIDAD_ACCIDENTES',\n",
"            order=labels_stick)"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"<matplotlib.axes._subplots.AxesSubplot at 0x7f3062fa95c0>"
]
},
"metadata": {
"tags": []
},
"execution_count": 19
},
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"tags": [],
"needs_background": "light"
}
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "idoDH8XgDnBb",
"colab_type": "text"
},
"source": [
"The \"Con heridos\" and \"Con muertos\" severity levels look very similar, whereas \"Solo daños\" has a maximum point of 2.0."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ZOvVN5GUB41A",
"colab_type": "text"
},
"source": [
"**Point 3.** How many accidents occur per day of the week? Export the result to an Excel file."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Cex6TLpLCRja",
"colab_type": "text"
},
"source": [
"**Answer.**"
]
},
{
"cell_type": "code",
"metadata": {
"id": "URgqQKnxCTHo",
"colab_type": "code",
"colab": {}
},
"source": [
"# Keep only the weekday and the per-record accident count\n",
"df_accidents_by_day = df_trans[['DIA_ACCIDENTE', 'CANTIDAD_ACCIDENTES']].copy()\n",
"# Sum rather than count: a record can report more than one accident\n",
"df_accidents_by_day = df_accidents_by_day.groupby(by=['DIA_ACCIDENTE']).sum()\n",
"\n",
"# Put the weekdays in calendar order before exporting\n",
"label_dia = ['Dom', 'Lun', 'Mar', 'Mié', 'Jue', 'Vie', 'Sáb']\n",
"df_accidents_by_day = df_accidents_by_day.reindex(label_dia)\n",
"\n",
"df_accidents_by_day.to_excel('output_ex3.xlsx')"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "ZdJk_O3l9ZC-",
"colab_type": "text"
},
"source": [
"Sunday is the day with the fewest accidents, and Friday the day with the most."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "J7eCs671Cg6t",
"colab_type": "text"
},
"source": [
"**Point 4.** Produce a report showing, per year, the number of accidents by severity. Export the resulting DataFrame to an Excel file. What conclusions can you draw from the generated report?"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "16TJczBrC9G2",
"colab_type": "text"
},
"source": [
"**Answer.**"
]
},
{
"cell_type": "code",
"metadata": {
"id": "wLl5wxPlC-z7",
"colab_type": "code",
"colab": {}
},
"source": [
"# Keep only the columns needed for the year-by-severity report\n",
"df_accidents_by_year = df_trans[['AÑO_ACCIDENTE', 'GRAVEDAD_ACCIDENTE', 'CANTIDAD_ACCIDENTES']].copy()\n",
"# Sum rather than count: a record can report more than one accident\n",
"df_accidents_by_year = df_accidents_by_year.groupby(by=['AÑO_ACCIDENTE', 'GRAVEDAD_ACCIDENTE']).sum()\n",
"\n",
"df_accidents_by_year.to_excel('output_ex4.xlsx')"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "uG_SlUvX97qg",
"colab_type": "text"
},
"source": [
"Most accidents are of severity \"Solo daños\"."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8_jKQBv464An",
"colab_type": "text"
},
"source": [
"**Remember:** Plagiarism will not be tolerated in this course. Without exception, should it occur, an investigation will be opened against the students involved, and action will be taken in accordance with the Student Regulations of the Universidad del Norte. Plagiarism includes: using content without proper attribution, either verbatim or with minimal changes that do not alter the spirit of the text/code; acquiring, intentionally or not, the work of third parties and presenting it partially or wholly as one's own; submitting group work in which a member did not contribute or in which there was no demonstrable teamwork; among other situations defined in the Universidad del Norte academic fraud manual:\n",
"\n",
"(https://guayacan.uninorte.edu.co/normatividad_interna/upload/File/Guia_Prevencion_Fraude%20estudiantes(5).pdf )."
]
}
]
}
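
The notebook's first and third exercises work on `CANTIDAD_ACCIDENTES` one record at a time, so the violin plot shows the distribution of per-record values rather than of accidents per calendar day. The following is a minimal sketch of one possible alternative, not part of the original notebook: total the records for each date first, then compare weekdays. The helper `daily_totals`, the constant `LABEL_DIA` and the `sample` frame are hypothetical names introduced here, and the sample values are made up for illustration; only the column names come from the `df_trans.head(5)` output.

```python
import pandas as pd

# Weekday order used in the notebook's plots.
LABEL_DIA = ['Dom', 'Lun', 'Mar', 'Mié', 'Jue', 'Vie', 'Sáb']


def daily_totals(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per calendar date with the total accidents on that date.

    Assumes the columns shown in df_trans.head(): FECHA_ACCIDENTE,
    DIA_ACCIDENTE and CANTIDAD_ACCIDENTES.
    """
    return (
        df.groupby(['FECHA_ACCIDENTE', 'DIA_ACCIDENTE'], as_index=False)
          ['CANTIDAD_ACCIDENTES']
          .sum()
    )


# Tiny made-up sample (NOT real data) laid out like df_trans.
sample = pd.DataFrame({
    'FECHA_ACCIDENTE': ['01/01/2015 12:00:00 AM'] * 3
                       + ['01/02/2015 12:00:00 AM'] * 2
                       + ['01/09/2015 12:00:00 AM'] * 4,
    'DIA_ACCIDENTE': ['Jue'] * 3 + ['Vie'] * 2 + ['Vie'] * 4,
    'CANTIDAD_ACCIDENTES': [1, 1, 2, 1, 1, 1, 1, 1, 1],
})

# One row per date: this is the frame whose distribution is worth plotting.
per_date = daily_totals(sample)

# Total accidents per weekday in calendar order; weekdays missing from the
# sample are filled with 0 instead of NaN.
per_weekday = (
    per_date.groupby('DIA_ACCIDENTE')['CANTIDAD_ACCIDENTES']
            .sum()
            .reindex(LABEL_DIA, fill_value=0)
)

print(per_date)
print(per_weekday)
```

With the real data, passing the per-date frame to `sns.violinplot(data=per_date, x='DIA_ACCIDENTE', y='CANTIDAD_ACCIDENTES', order=LABEL_DIA)` would show how the daily totals vary for each weekday, which is probably closer to what the exercise asks for.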