
# patrickthoreson/DV0101EN-Exercise-Pie-Charts-Box-Plots-Scatter-Plots-and-Bubble-Plots-py.ipynb Created Feb 22, 2019

Created on Cognitive Class Labs
 { "cells": [ { "cell_type": "markdown", "metadata": { "button": false, "deletable": true, "editable": true, "new_sheet": false, "run_control": { "read_only": false } }, "source": [ " \n", "\n", "

Pie Charts, Box Plots, Scatter Plots, and Bubble Plots

" ] }, { "cell_type": "markdown", "metadata": { "button": false, "deletable": true, "editable": true, "new_sheet": false, "run_control": { "read_only": false } }, "source": [ "## Introduction\n", "\n", "In this lab session, we continue exploring the Matplotlib library. More specifically, we will learn how to create pie charts, box plots, scatter plots, and bubble plots." ] }, { "cell_type": "markdown", "metadata": { "button": false, "deletable": true, "editable": true, "new_sheet": false, "run_control": { "read_only": false } }, "source": [ "## Table of Contents\n", "\n", "
\n", "\n", "1. [Exploring Datasets with *p*andas](#0)
\n", "3. [Visualizing Data using Matplotlib](#4)
\n", "4. [Pie Charts](#6)
\n", "5. [Box Plots](#8)
\n", "6. [Scatter Plots](#10)
\n", "7. [Bubble Plots](#12)
\n", "
\n", "
|   | Type | Coverage | OdName | AREA | AreaName | REG | RegName | DEV | DevName | 1980 | ... | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Immigrants | Foreigners | Afghanistan | 935 | Asia | 5501 | Southern Asia | 902 | Developing regions | 16 | ... | 2978 | 3436 | 3009 | 2652 | 2111 | 1746 | 1758 | 2203 | 2635 | 2004 |
| 1 | Immigrants | Foreigners | Albania | 908 | Europe | 925 | Southern Europe | 901 | Developed regions | 1 | ... | 1450 | 1223 | 856 | 702 | 560 | 716 | 561 | 539 | 620 | 603 |
| 2 | Immigrants | Foreigners | Algeria | 903 | Africa | 912 | Northern Africa | 902 | Developing regions | 80 | ... | 3616 | 3626 | 4807 | 3623 | 4005 | 5393 | 4752 | 4325 | 3774 | 4331 |
| 3 | Immigrants | Foreigners | American Samoa | 909 | Oceania | 957 | Polynesia | 902 | Developing regions | 0 | ... | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | Immigrants | Foreigners | Andorra | 908 | Europe | 925 | Southern Europe | 901 | Developed regions | 0 | ... | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 |

5 rows × 43 columns

| Continent | 1980 | 1981 | 1982 | 1983 | 1984 | 1985 | 1986 | 1987 | 1988 | 1989 | ... | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Africa | 3951 | 4363 | 3819 | 2671 | 2639 | 2650 | 3782 | 7494 | 7552 | 9894 | ... | 27523 | 29188 | 28284 | 29890 | 34534 | 40892 | 35441 | 38083 | 38543 | 618948 |
| Asia | 31025 | 34314 | 30214 | 24696 | 27274 | 23850 | 28739 | 43203 | 47454 | 60256 | ... | 159253 | 149054 | 133459 | 139894 | 141434 | 163845 | 146894 | 152218 | 155075 | 3317794 |
| Europe | 39760 | 44802 | 42720 | 24638 | 22287 | 20844 | 24370 | 46698 | 54726 | 60893 | ... | 35955 | 33053 | 33495 | 34692 | 35078 | 33425 | 26778 | 29177 | 28691 | 1410947 |
| Latin America and the Caribbean | 13081 | 15215 | 16769 | 15427 | 13678 | 15171 | 21179 | 28471 | 21924 | 25060 | ... | 24747 | 24676 | 26011 | 26547 | 26867 | 28818 | 27856 | 27173 | 24950 | 765148 |
| Northern America | 9378 | 10030 | 9074 | 7100 | 6661 | 6543 | 7074 | 7705 | 6469 | 6790 | ... | 8394 | 9613 | 9463 | 10190 | 8995 | 8142 | 7677 | 7892 | 8503 | 241142 |

5 rows × 35 columns
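
In a pie chart, each wedge is proportional to one continent's share of the `Total` column. The arithmetic behind the wedges can be sketched in plain Python. This is a minimal sketch that assumes only the five continents displayed above; the full table may hold more rows, so the shares are relative to the displayed rows only. The names `totals` and `wedge_shares` are illustrative and do not come from the lab.

```python
# Totals copied from the "Total" column of the five rows displayed above.
# Assumption: rows not displayed in the preview are ignored here.
totals = {
    "Africa": 618948,
    "Asia": 3317794,
    "Europe": 1410947,
    "Latin America and the Caribbean": 765148,
    "Northern America": 241142,
}


def wedge_shares(values):
    """Return each entry's share of the overall sum, in percent."""
    grand_total = sum(values.values())
    return {name: 100.0 * count / grand_total for name, count in values.items()}


shares = wedge_shares(totals)

# Print the wedges from largest to smallest.
for name, pct in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {pct:.1f}%")
```

Sorting by share before plotting is one common way to keep the small wedges grouped together and easier to label.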

| Country | Japan |
|---|---|
| 1980 | 701 |
| 1981 | 756 |
| 1982 | 598 |
| 1983 | 309 |
| 1984 | 246 |

| Country | Japan |
|---|---|
| count | 34.000000 |
| mean | 814.911765 |
| std | 337.219771 |
| min | 198.000000 |
| 25% | 529.000000 |
| 50% | 902.000000 |
| 75% | 1079.000000 |
| max | 1284.000000 |
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "\n", "" ], "text/plain": [ "Country Japan\n", "count 34.000000\n", "mean 814.911765\n", "std 337.219771\n", "min 198.000000\n", "25% 529.000000\n", "50% 902.000000\n", "75% 1079.000000\n", "max 1284.000000" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_japan.describe()" ] }, { "cell_type": "markdown", "metadata": { "button": false, "deletable": true, "editable": true, "new_sheet": false, "run_control": { "read_only": false } }, "source": [ "A key benefit of box plots is that they make it easy to compare the distributions of multiple datasets. In one of the previous labs, we observed that China and India had very similar immigration trends. Let's analyze these two countries further using box plots.\n", "\n", "**Question:** Compare the distribution of the number of new immigrants from India and China for the period 1980 - 2013." ] }, { "cell_type": "markdown", "metadata": { "button": false, "deletable": true, "editable": true, "new_sheet": false, "run_control": { "read_only": false } }, "source": [ "Step 1: Get the dataset for China and India and call the dataframe **df_CI**." ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "button": false, "collapsed": false, "deletable": true, "new_sheet": false, "run_control": { "read_only": false } }, "outputs": [ { "data": { "text/html": [ "
| Country | China | India |
|---|---|---|
| 1980 | 5123 | 8880 |
| 1981 | 6682 | 8670 |
| 1982 | 3308 | 8147 |
| 1983 | 1863 | 7338 |
| 1984 | 1527 | 5704 |
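
Step 1 selects the China and India rows and transposes them, so that years become the row index and countries become the columns. The same reshaping can be sketched in plain Python. This is a minimal sketch, not the lab's pandas code: the dict `rows_by_country` is a hypothetical stand-in for the two selected rows of `df_can`, and it holds only the five years shown above.

```python
# Hypothetical stand-in for the two selected rows of df_can,
# limited to the five years displayed in the preview above.
years = [1980, 1981, 1982, 1983, 1984]
rows_by_country = {
    "China": [5123, 6682, 3308, 1863, 1527],
    "India": [8880, 8670, 8147, 7338, 5704],
}


def transpose(rows, index):
    """Turn {country: [value per year]} into {year: {country: value}}."""
    return {
        year: {country: values[i] for country, values in rows.items()}
        for i, year in enumerate(index)
    }


by_year = transpose(rows_by_country, years)
print(by_year[1980])
```

After the transpose, each country is one column of values, which is the layout a per-column box plot expects.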
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "\n", "" ], "text/plain": [ "Country China India\n", "1980 5123 8880\n", "1981 6682 8670\n", "1982 3308 8147\n", "1983 1863 7338\n", "1984 1527 5704" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "### type your answer here\n", "df_CI = df_can.loc[['China', 'India'], years].transpose()\n", "df_CI.head()" ] }, { "cell_type": "markdown", "metadata": { "button": false, "deletable": true, "new_sheet": false, "run_control": { "read_only": false } }, "source": [ "Double-click __here__ for the solution.\n", "" ] }, { "cell_type": "markdown", "metadata": { "button": false, "deletable": true, "new_sheet": false, "run_control": { "read_only": false } }, "source": [ "Let's view the percentiles associated with both countries using the `describe()` method." ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "button": false, "collapsed": false, "deletable": true, "new_sheet": false, "run_control": { "read_only": false }, "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
| Country | China | India |
|---|---|---|
| count | 34.000000 | 34.000000 |
| mean | 19410.647059 | 20350.117647 |
| std | 13568.230790 | 10007.342579 |
| min | 1527.000000 | 4211.000000 |
| 25% | 5512.750000 | 10637.750000 |
| 50% | 19945.000000 | 20235.000000 |
| 75% | 31568.500000 | 28699.500000 |
| max | 42584.000000 | 36210.000000 |
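
The `25%`, `50%`, and `75%` values reported by `describe()` are the quantities a box plot draws as the bottom, middle line, and top of each box. By default, Matplotlib extends the whiskers to the most extreme data point within 1.5 times the interquartile range (IQR) of the box and marks anything beyond that as an outlier. The sketch below works through that arithmetic for China and India using the numbers from the `describe()` output above. The helper names `whisker_fences` and `has_outliers` are illustrative, and the check relies only on `min` and `max`, which is enough to tell whether any point falls outside the fences.

```python
# Summary statistics copied from the describe() output above.
summary = {
    "China": {"min": 1527.0, "q1": 5512.75, "median": 19945.0,
              "q3": 31568.5, "max": 42584.0},
    "India": {"min": 4211.0, "q1": 10637.75, "median": 20235.0,
              "q3": 28699.5, "max": 36210.0},
}


def whisker_fences(q1, q3, whis=1.5):
    """Return the (lower, upper) limits beyond which points count as outliers."""
    iqr = q3 - q1
    return q1 - whis * iqr, q3 + whis * iqr


def has_outliers(stats, whis=1.5):
    """True if the smallest or largest value lies outside the fences."""
    lower, upper = whisker_fences(stats["q1"], stats["q3"], whis)
    return stats["min"] < lower or stats["max"] > upper


for country, stats in summary.items():
    lower, upper = whisker_fences(stats["q1"], stats["q3"])
    print(f"{country}: IQR={stats['q3'] - stats['q1']}, "
          f"fences=({lower}, {upper}), outliers={has_outliers(stats)}")
```

For both countries the minimum and maximum sit inside the fences, so under the default 1.5 × IQR rule neither box plot would show outlier markers, and the whiskers would end at the observed minimum and maximum.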