{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Práctica independiente"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import plotly.tools as tls\n",
"#tls.set_credentials_file(username='sebasggx', api_key='uLZskgQnsV7QPNGSOGbq')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Importar los paquetes requeridos\n",
"\n",
"No nos olvidemos de ejecutar esta celda"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Paquetes numéricos y estadísticos:\n",
"import numpy as np\n",
"import scipy.stats as stats\n",
"\n",
"# Pandas maneja la carga y manipulación del dataset\n",
"import pandas as pd\n",
"\n",
"import plotly\n",
"import plotly.plotly as py\n",
"import plotly.graph_objs as go\n",
"\n",
"# Inicializar plotly en modo offline para la notebook\n",
"plotly.offline.init_notebook_mode()\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 1 Dataset salary.csv: Una vez más\n",
"\n",
"Queremos generar un gráfico paracido a este utilizando el dataset salary.csv y ``plotly``"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Grafico](grafico.png)\n",
"![Este es el gráfico que queremos generar]()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"salary = pd.read_csv('salary.csv')\n",
"salary.columns = ['gender', 'professor_rank', 'years_in_job', 'degree_level', 'years_since_degree', 'yearly_salary']\n",
"salary.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" ### Obtener los 6 subconjuntos del dataset que representan los cruces entre gender y professor_rank (hay 6 subconjuntos: 2 generos * 3 tipos de cargo de profesor = 6)) "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"genders = salary.gender.unique()\n",
"ranks = salary.professor_rank.unique()\n",
"\n",
"symbols = [\"circle\", \"cross\", \"x\"]\n",
"colors = [\"#F22\", \"#22F\"]\n",
"\n",
"traces = []\n",
"for i, gender in enumerate(genders):\n",
" for j, rank in enumerate(ranks):\n",
" data = salary[(salary['gender'] == gender) & (salary['professor_rank'] == rank)]\n",
" \n",
" traces.append(go.Scatter(\n",
" name=gender + '-' + rank,\n",
" x=data.years_in_job.values,\n",
" y=data.yearly_salary.values,\n",
" mode='markers',\n",
" marker=dict(\n",
" color=colors[i],\n",
" symbol=symbols[j],\n",
" size=data.years_since_degree.values\n",
" )\n",
" ))\n",
"\n",
"layout = go.Layout(\n",
" title='Salaries'\n",
")\n",
"\n",
"fig = go.Figure(data=traces, layout=layout)\n",
"plotly.offline.iplot(fig)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 2. NBA\n",
"\n",
"Vamos a utilizar el dataset 'nba.tsv' que representa la probabilidad que tiene un equipo dado en un minuto dado del partido de ganar. Cada fila representa un equipo y cada columna un minuto del partido (de 0 a 48)\n",
"\n",
"Queremos un gráfico que nos muestre en el eje x los minutos de un partido de NBA y en el eje Y la probabilidad de ganar que, en ese minuto de juego, tiene cada equipo.\n",
"\n",
"Utilizaremos la funcion ``Scatter`` de ``plotly``."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"nba = pd.read_csv('nba.tsv', sep='\\t')\n",
"\n",
"xaxis = nba.columns.values[1:]\n",
"\n",
"data = [go.Scatter(name=i[1].team, x=xaxis, y=i[1].values[1:]) for i in nba.iterrows()]\n",
"\n",
"layout = go.Layout(\n",
" title='NBA',\n",
" xaxis=dict(title='Minutes'),\n",
" yaxis=dict(title='Percentage')\n",
")\n",
"\n",
"fig = go.Figure(data=data, layout=layout)\n",
"plotly.offline.iplot(fig)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 3. Mapas\n",
"\n",
"Utilizando el dataset del Desafío 1, generar un gráfico que muestre el mapa de Estados Unidos y mostrar por intensidad/divergencia de color las notas promedio de los estudiantes de cada estado para alguno de los tests.\n",
"\n",
"Utilizar la estrucutra ``choropleth`` y basarse en el ejemplo de la demo"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"df = pd.read_csv('sat_scores.csv')\n",
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"scl = [\n",
" [0.0, 'rgb(242,240,247)'],\n",
" [0.2, 'rgb(218,218,235)'],\n",
" [0.4, 'rgb(188,189,220)'],\n",
" [0.6, 'rgb(158,154,200)'],\n",
" [0.8, 'rgb(117,107,177)'],\n",
" [1.0, 'rgb(84,39,143)']\n",
"]\n",
"\n",
"df['text'] = df['State'] + '<br>' + \\\n",
" 'Rate ' + df['Rate'].astype(str) + '<br>' + \\\n",
" 'Math ' + df['Math'].astype(str)+ '<br>' + \\\n",
" 'Verbal ' + df['Verbal'].astype(str)\n",
"\n",
"data = [\n",
" dict(\n",
" type='choropleth',\n",
" colorscale = scl,\n",
" autocolorscale = False,\n",
" locations = df['State'],\n",
" z = df['Rate'],\n",
" locationmode = 'USA-states',\n",
" text = df['text'],\n",
" marker = dict(\n",
" line = dict (\n",
" color = 'rgb(255,255,255)',\n",
" width = 2\n",
" )\n",
" ),\n",
" colorbar = dict(\n",
" title = \"Rate\"\n",
" )\n",
" )\n",
"]\n",
"\n",
"layout = dict(\n",
" title = '2011 SAT scores',\n",
" geo = dict(\n",
" scope='usa',\n",
" projection=dict(type='albers usa'),\n",
" showlakes = True,\n",
" lakecolor = 'rgb(255, 255, 255)'\n",
" ),\n",
")\n",
"\n",
"fig = dict(data=data, layout=layout)\n",
"plotly.offline.iplot(fig)"
]
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 0
}