Skip to content

Instantly share code, notes, and snippets.

Created October 13, 2017 07:33
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save anonymous/3d70bdff025c985d4c22ca816fbedcb7 to your computer and use it in GitHub Desktop.
Save anonymous/3d70bdff025c985d4c22ca816fbedcb7 to your computer and use it in GitHub Desktop.
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Crash Course in Pandas\n",
"\n",
"[Pandas](http://pandas.pydata.org) is a very popular Python package that provides data structures and data analysis tools. It includes tools for reading and writing various data formats, processing data sets in an efficient DataFrame object, and the ability to reshape, filter, index, and subset data easily.\n",
"\n",
"This notebook is a quick introduction to some of the features of the Pandas package."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Reading Data into DataFrames\n",
"\n",
"The primary data structure in the Pandas package is the DataFrame. A DataFrame is a table-like object that contains named columns of data. Pandas provides readers for many different file types including CSV, Excel, SAS, HTML, etc. Since CSV is such a common format, let's look at reading a CSV file into a Pandas DataFrame.\n",
"\n",
"First, we need to import the Pandas package. A very common convention is to import Pandas using a shortcut named of 'pd'. This is done with the following line."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import pandas as pd"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The function that reads CSV files into DataFrames is called ``read_csv``. In the simplest form, you just supply it with a filename or URL. We have the SAS cars data set stored in github that we can point to directly. We'll use the ``head`` method to display just the first few records."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Make</th>\n",
" <th>Model</th>\n",
" <th>Type</th>\n",
" <th>Origin</th>\n",
" <th>DriveTrain</th>\n",
" <th>MSRP</th>\n",
" <th>Invoice</th>\n",
" <th>EngineSize</th>\n",
" <th>Cylinders</th>\n",
" <th>Horsepower</th>\n",
" <th>MPG_City</th>\n",
" <th>MPG_Highway</th>\n",
" <th>Weight</th>\n",
" <th>Wheelbase</th>\n",
" <th>Length</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Acura</td>\n",
" <td>MDX</td>\n",
" <td>SUV</td>\n",
" <td>Asia</td>\n",
" <td>All</td>\n",
" <td>36945.0</td>\n",
" <td>33337.0</td>\n",
" <td>3.5</td>\n",
" <td>6.0</td>\n",
" <td>265.0</td>\n",
" <td>17.0</td>\n",
" <td>23.0</td>\n",
" <td>4451.0</td>\n",
" <td>106.0</td>\n",
" <td>189.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Acura</td>\n",
" <td>RSX Type S 2dr</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>23820.0</td>\n",
" <td>21761.0</td>\n",
" <td>2.0</td>\n",
" <td>4.0</td>\n",
" <td>200.0</td>\n",
" <td>24.0</td>\n",
" <td>31.0</td>\n",
" <td>2778.0</td>\n",
" <td>101.0</td>\n",
" <td>172.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Acura</td>\n",
" <td>TSX 4dr</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>26990.0</td>\n",
" <td>24647.0</td>\n",
" <td>2.4</td>\n",
" <td>4.0</td>\n",
" <td>200.0</td>\n",
" <td>22.0</td>\n",
" <td>29.0</td>\n",
" <td>3230.0</td>\n",
" <td>105.0</td>\n",
" <td>183.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Acura</td>\n",
" <td>TL 4dr</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>33195.0</td>\n",
" <td>30299.0</td>\n",
" <td>3.2</td>\n",
" <td>6.0</td>\n",
" <td>270.0</td>\n",
" <td>20.0</td>\n",
" <td>28.0</td>\n",
" <td>3575.0</td>\n",
" <td>108.0</td>\n",
" <td>186.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Acura</td>\n",
" <td>3.5 RL 4dr</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>43755.0</td>\n",
" <td>39014.0</td>\n",
" <td>3.5</td>\n",
" <td>6.0</td>\n",
" <td>225.0</td>\n",
" <td>18.0</td>\n",
" <td>24.0</td>\n",
" <td>3880.0</td>\n",
" <td>115.0</td>\n",
" <td>197.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Make Model Type Origin DriveTrain MSRP Invoice \\\n",
"0 Acura MDX SUV Asia All 36945.0 33337.0 \n",
"1 Acura RSX Type S 2dr Sedan Asia Front 23820.0 21761.0 \n",
"2 Acura TSX 4dr Sedan Asia Front 26990.0 24647.0 \n",
"3 Acura TL 4dr Sedan Asia Front 33195.0 30299.0 \n",
"4 Acura 3.5 RL 4dr Sedan Asia Front 43755.0 39014.0 \n",
"\n",
" EngineSize Cylinders Horsepower MPG_City MPG_Highway Weight \\\n",
"0 3.5 6.0 265.0 17.0 23.0 4451.0 \n",
"1 2.0 4.0 200.0 24.0 31.0 2778.0 \n",
"2 2.4 4.0 200.0 22.0 29.0 3230.0 \n",
"3 3.2 6.0 270.0 20.0 28.0 3575.0 \n",
"4 3.5 6.0 225.0 18.0 24.0 3880.0 \n",
"\n",
" Wheelbase Length \n",
"0 106.0 189.0 \n",
"1 101.0 172.0 \n",
"2 105.0 183.0 \n",
"3 108.0 186.0 \n",
"4 115.0 197.0 "
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = pd.read_csv('https://raw.githubusercontent.com/sassoftware/sas-viya-programming/master/data/cars.csv')\n",
"df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There are many more Pandas data readers that you read about on the [Pandas web site](http://pandas.pydata.org/pandas-docs/stable/io.html)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Displaying Information about DataFrames\n",
"\n",
"We have displayed the DataFrame above. We can get more information about the DataFrame using various properties and methods.\n",
"\n",
"The list of column names can be displayed using the ``columns`` property."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Index(['Make', 'Model', 'Type', 'Origin', 'DriveTrain', 'MSRP', 'Invoice',\n",
" 'EngineSize', 'Cylinders', 'Horsepower', 'MPG_City', 'MPG_Highway',\n",
" 'Weight', 'Wheelbase', 'Length'],\n",
" dtype='object')"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.columns"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The data types of the columns can be displayed using the ``dtypes`` property."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Make object\n",
"Model object\n",
"Type object\n",
"Origin object\n",
"DriveTrain object\n",
"MSRP float64\n",
"Invoice float64\n",
"EngineSize float64\n",
"Cylinders float64\n",
"Horsepower float64\n",
"MPG_City float64\n",
"MPG_Highway float64\n",
"Weight float64\n",
"Wheelbase float64\n",
"Length float64\n",
"dtype: object"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.dtypes"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For general information about the DataFrame as a whole, you can use the ``info`` method."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 428 entries, 0 to 427\n",
"Data columns (total 15 columns):\n",
"Make 428 non-null object\n",
"Model 428 non-null object\n",
"Type 428 non-null object\n",
"Origin 428 non-null object\n",
"DriveTrain 428 non-null object\n",
"MSRP 428 non-null float64\n",
"Invoice 428 non-null float64\n",
"EngineSize 428 non-null float64\n",
"Cylinders 426 non-null float64\n",
"Horsepower 428 non-null float64\n",
"MPG_City 428 non-null float64\n",
"MPG_Highway 428 non-null float64\n",
"Weight 428 non-null float64\n",
"Wheelbase 428 non-null float64\n",
"Length 428 non-null float64\n",
"dtypes: float64(10), object(5)\n",
"memory usage: 50.2+ KB\n"
]
}
],
"source": [
"df.info()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we know more about the columns and their data types, we can move on to subsetting DataFrames into other DataFrames or columns."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Subsetting DataFrames\n",
"\n",
"You can select columns from a DataFrame to create a DataFrame with fewer columns, or you can select a single column of data into a Series. The syntax used for this operation is Python's indexing syntax ``[...]``.\n",
"\n",
"If you index into a DataFrame using just the name of a column, you will get a Series back. If you index into a DataFrame using a list of names, you will get a DataFrame back."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Make</th>\n",
" <th>Model</th>\n",
" <th>Horsepower</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Acura</td>\n",
" <td>MDX</td>\n",
" <td>265.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Acura</td>\n",
" <td>RSX Type S 2dr</td>\n",
" <td>200.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Acura</td>\n",
" <td>TSX 4dr</td>\n",
" <td>200.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Acura</td>\n",
" <td>TL 4dr</td>\n",
" <td>270.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Acura</td>\n",
" <td>3.5 RL 4dr</td>\n",
" <td>225.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Make Model Horsepower\n",
"0 Acura MDX 265.0\n",
"1 Acura RSX Type S 2dr 200.0\n",
"2 Acura TSX 4dr 200.0\n",
"3 Acura TL 4dr 270.0\n",
"4 Acura 3.5 RL 4dr 225.0"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"subdf = df[['Make', 'Model', 'Horsepower']]\n",
"subdf.head()"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0 MDX\n",
"1 RSX Type S 2dr\n",
"2 TSX 4dr\n",
"3 TL 4dr\n",
"4 3.5 RL 4dr\n",
"Name: Model, dtype: object"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model = df['Model']\n",
"model.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In addition to selecting columns of data, DataFrames have more advanced ways of selecting data. We'll look at that in the next section."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Row Indexing\n",
"\n",
"Pandas DataFrames have an index associated both with rows and columns. We have seen the column index in the ``columns`` property above. We have also indexed into that collection using the column names. You can also index rows by position and label using the ``iloc`` and ``loc`` properties, respectively.\n",
"\n",
"Currently, our cars DataFrame only has numeric row indexes, it doesn't have labels, so we'll start with ``iloc``."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Make</th>\n",
" <th>Model</th>\n",
" <th>Type</th>\n",
" <th>Origin</th>\n",
" <th>DriveTrain</th>\n",
" <th>MSRP</th>\n",
" <th>Invoice</th>\n",
" <th>EngineSize</th>\n",
" <th>Cylinders</th>\n",
" <th>Horsepower</th>\n",
" <th>MPG_City</th>\n",
" <th>MPG_Highway</th>\n",
" <th>Weight</th>\n",
" <th>Wheelbase</th>\n",
" <th>Length</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Acura</td>\n",
" <td>TL 4dr</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>33195.0</td>\n",
" <td>30299.0</td>\n",
" <td>3.2</td>\n",
" <td>6.0</td>\n",
" <td>270.0</td>\n",
" <td>20.0</td>\n",
" <td>28.0</td>\n",
" <td>3575.0</td>\n",
" <td>108.0</td>\n",
" <td>186.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Acura</td>\n",
" <td>3.5 RL 4dr</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>43755.0</td>\n",
" <td>39014.0</td>\n",
" <td>3.5</td>\n",
" <td>6.0</td>\n",
" <td>225.0</td>\n",
" <td>18.0</td>\n",
" <td>24.0</td>\n",
" <td>3880.0</td>\n",
" <td>115.0</td>\n",
" <td>197.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>Acura</td>\n",
" <td>3.5 RL w/Navigation 4dr</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>46100.0</td>\n",
" <td>41100.0</td>\n",
" <td>3.5</td>\n",
" <td>6.0</td>\n",
" <td>225.0</td>\n",
" <td>18.0</td>\n",
" <td>24.0</td>\n",
" <td>3893.0</td>\n",
" <td>115.0</td>\n",
" <td>197.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>Acura</td>\n",
" <td>NSX coupe 2dr manual S</td>\n",
" <td>Sports</td>\n",
" <td>Asia</td>\n",
" <td>Rear</td>\n",
" <td>89765.0</td>\n",
" <td>79978.0</td>\n",
" <td>3.2</td>\n",
" <td>6.0</td>\n",
" <td>290.0</td>\n",
" <td>17.0</td>\n",
" <td>24.0</td>\n",
" <td>3153.0</td>\n",
" <td>100.0</td>\n",
" <td>174.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Make Model Type Origin DriveTrain MSRP Invoice \\\n",
"3 Acura TL 4dr Sedan Asia Front 33195.0 30299.0 \n",
"4 Acura 3.5 RL 4dr Sedan Asia Front 43755.0 39014.0 \n",
"5 Acura 3.5 RL w/Navigation 4dr Sedan Asia Front 46100.0 41100.0 \n",
"6 Acura NSX coupe 2dr manual S Sports Asia Rear 89765.0 79978.0 \n",
"\n",
" EngineSize Cylinders Horsepower MPG_City MPG_Highway Weight \\\n",
"3 3.2 6.0 270.0 20.0 28.0 3575.0 \n",
"4 3.5 6.0 225.0 18.0 24.0 3880.0 \n",
"5 3.5 6.0 225.0 18.0 24.0 3893.0 \n",
"6 3.2 6.0 290.0 17.0 24.0 3153.0 \n",
"\n",
" Wheelbase Length \n",
"3 108.0 186.0 \n",
"4 115.0 197.0 \n",
"5 115.0 197.0 \n",
"6 100.0 174.0 "
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.iloc[3:7]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As you can see above, we selected the row indexs from 3 to 7. In Python, numeric slices such as this include the first index, but do not include the last index. That is why you see row indexes 3, 4, 5, and 6 only in the result.\n",
"\n",
"You can also select rows using a list of row indexes."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Make</th>\n",
" <th>Model</th>\n",
" <th>Type</th>\n",
" <th>Origin</th>\n",
" <th>DriveTrain</th>\n",
" <th>MSRP</th>\n",
" <th>Invoice</th>\n",
" <th>EngineSize</th>\n",
" <th>Cylinders</th>\n",
" <th>Horsepower</th>\n",
" <th>MPG_City</th>\n",
" <th>MPG_Highway</th>\n",
" <th>Weight</th>\n",
" <th>Wheelbase</th>\n",
" <th>Length</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Acura</td>\n",
" <td>3.5 RL 4dr</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>43755.0</td>\n",
" <td>39014.0</td>\n",
" <td>3.5</td>\n",
" <td>6.0</td>\n",
" <td>225.0</td>\n",
" <td>18.0</td>\n",
" <td>24.0</td>\n",
" <td>3880.0</td>\n",
" <td>115.0</td>\n",
" <td>197.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>Acura</td>\n",
" <td>3.5 RL w/Navigation 4dr</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>46100.0</td>\n",
" <td>41100.0</td>\n",
" <td>3.5</td>\n",
" <td>6.0</td>\n",
" <td>225.0</td>\n",
" <td>18.0</td>\n",
" <td>24.0</td>\n",
" <td>3893.0</td>\n",
" <td>115.0</td>\n",
" <td>197.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>Audi</td>\n",
" <td>A4 3.0 Quattro 4dr manual</td>\n",
" <td>Sedan</td>\n",
" <td>Europe</td>\n",
" <td>All</td>\n",
" <td>33430.0</td>\n",
" <td>30366.0</td>\n",
" <td>3.0</td>\n",
" <td>6.0</td>\n",
" <td>220.0</td>\n",
" <td>17.0</td>\n",
" <td>26.0</td>\n",
" <td>3583.0</td>\n",
" <td>104.0</td>\n",
" <td>179.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Make Model Type Origin DriveTrain MSRP \\\n",
"4 Acura 3.5 RL 4dr Sedan Asia Front 43755.0 \n",
"5 Acura 3.5 RL w/Navigation 4dr Sedan Asia Front 46100.0 \n",
"10 Audi A4 3.0 Quattro 4dr manual Sedan Europe All 33430.0 \n",
"\n",
" Invoice EngineSize Cylinders Horsepower MPG_City MPG_Highway Weight \\\n",
"4 39014.0 3.5 6.0 225.0 18.0 24.0 3880.0 \n",
"5 41100.0 3.5 6.0 225.0 18.0 24.0 3893.0 \n",
"10 30366.0 3.0 6.0 220.0 17.0 26.0 3583.0 \n",
"\n",
" Wheelbase Length \n",
"4 115.0 197.0 \n",
"5 115.0 197.0 \n",
"10 104.0 179.0 "
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.iloc[[4, 5, 10]]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Arguably a more useful way of indexing Pandas DataFrames is using row labels. We can set a column as a row index using the ``set_index`` method. Then we can use those labels in the ``loc`` property for easy data selection."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"df = df.set_index('Model')"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Make</th>\n",
" <th>Type</th>\n",
" <th>Origin</th>\n",
" <th>DriveTrain</th>\n",
" <th>MSRP</th>\n",
" <th>Invoice</th>\n",
" <th>EngineSize</th>\n",
" <th>Cylinders</th>\n",
" <th>Horsepower</th>\n",
" <th>MPG_City</th>\n",
" <th>MPG_Highway</th>\n",
" <th>Weight</th>\n",
" <th>Wheelbase</th>\n",
" <th>Length</th>\n",
" </tr>\n",
" <tr>\n",
" <th>Model</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>MDX</th>\n",
" <td>Acura</td>\n",
" <td>SUV</td>\n",
" <td>Asia</td>\n",
" <td>All</td>\n",
" <td>36945.0</td>\n",
" <td>33337.0</td>\n",
" <td>3.5</td>\n",
" <td>6.0</td>\n",
" <td>265.0</td>\n",
" <td>17.0</td>\n",
" <td>23.0</td>\n",
" <td>4451.0</td>\n",
" <td>106.0</td>\n",
" <td>189.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>RSX Type S 2dr</th>\n",
" <td>Acura</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>23820.0</td>\n",
" <td>21761.0</td>\n",
" <td>2.0</td>\n",
" <td>4.0</td>\n",
" <td>200.0</td>\n",
" <td>24.0</td>\n",
" <td>31.0</td>\n",
" <td>2778.0</td>\n",
" <td>101.0</td>\n",
" <td>172.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>TSX 4dr</th>\n",
" <td>Acura</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>26990.0</td>\n",
" <td>24647.0</td>\n",
" <td>2.4</td>\n",
" <td>4.0</td>\n",
" <td>200.0</td>\n",
" <td>22.0</td>\n",
" <td>29.0</td>\n",
" <td>3230.0</td>\n",
" <td>105.0</td>\n",
" <td>183.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>TL 4dr</th>\n",
" <td>Acura</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>33195.0</td>\n",
" <td>30299.0</td>\n",
" <td>3.2</td>\n",
" <td>6.0</td>\n",
" <td>270.0</td>\n",
" <td>20.0</td>\n",
" <td>28.0</td>\n",
" <td>3575.0</td>\n",
" <td>108.0</td>\n",
" <td>186.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3.5 RL 4dr</th>\n",
" <td>Acura</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>43755.0</td>\n",
" <td>39014.0</td>\n",
" <td>3.5</td>\n",
" <td>6.0</td>\n",
" <td>225.0</td>\n",
" <td>18.0</td>\n",
" <td>24.0</td>\n",
" <td>3880.0</td>\n",
" <td>115.0</td>\n",
" <td>197.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Make Type Origin DriveTrain MSRP Invoice EngineSize \\\n",
"Model \n",
"MDX Acura SUV Asia All 36945.0 33337.0 3.5 \n",
"RSX Type S 2dr Acura Sedan Asia Front 23820.0 21761.0 2.0 \n",
"TSX 4dr Acura Sedan Asia Front 26990.0 24647.0 2.4 \n",
"TL 4dr Acura Sedan Asia Front 33195.0 30299.0 3.2 \n",
"3.5 RL 4dr Acura Sedan Asia Front 43755.0 39014.0 3.5 \n",
"\n",
" Cylinders Horsepower MPG_City MPG_Highway Weight \\\n",
"Model \n",
"MDX 6.0 265.0 17.0 23.0 4451.0 \n",
"RSX Type S 2dr 4.0 200.0 24.0 31.0 2778.0 \n",
"TSX 4dr 4.0 200.0 22.0 29.0 3230.0 \n",
"TL 4dr 6.0 270.0 20.0 28.0 3575.0 \n",
"3.5 RL 4dr 6.0 225.0 18.0 24.0 3880.0 \n",
"\n",
" Wheelbase Length \n",
"Model \n",
"MDX 106.0 189.0 \n",
"RSX Type S 2dr 101.0 172.0 \n",
"TSX 4dr 105.0 183.0 \n",
"TL 4dr 108.0 186.0 \n",
"3.5 RL 4dr 115.0 197.0 "
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You'll see in the output above that the Model column is now displayed as a row index. We can use the values in that index in the ``loc`` property. This includes both single values, lists of values, or slices."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Make Acura\n",
"Type SUV\n",
"Origin Asia\n",
"DriveTrain All\n",
"MSRP 36945\n",
"Invoice 33337\n",
"EngineSize 3.5\n",
"Cylinders 6\n",
"Horsepower 265\n",
"MPG_City 17\n",
"MPG_Highway 23\n",
"Weight 4451\n",
"Wheelbase 106\n",
"Length 189\n",
"Name: MDX, dtype: object"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.loc['MDX']"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Make</th>\n",
" <th>Type</th>\n",
" <th>Origin</th>\n",
" <th>DriveTrain</th>\n",
" <th>MSRP</th>\n",
" <th>Invoice</th>\n",
" <th>EngineSize</th>\n",
" <th>Cylinders</th>\n",
" <th>Horsepower</th>\n",
" <th>MPG_City</th>\n",
" <th>MPG_Highway</th>\n",
" <th>Weight</th>\n",
" <th>Wheelbase</th>\n",
" <th>Length</th>\n",
" </tr>\n",
" <tr>\n",
" <th>Model</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>TSX 4dr</th>\n",
" <td>Acura</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>26990.0</td>\n",
" <td>24647.0</td>\n",
" <td>2.4</td>\n",
" <td>4.0</td>\n",
" <td>200.0</td>\n",
" <td>22.0</td>\n",
" <td>29.0</td>\n",
" <td>3230.0</td>\n",
" <td>105.0</td>\n",
" <td>183.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>TL 4dr</th>\n",
" <td>Acura</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>33195.0</td>\n",
" <td>30299.0</td>\n",
" <td>3.2</td>\n",
" <td>6.0</td>\n",
" <td>270.0</td>\n",
" <td>20.0</td>\n",
" <td>28.0</td>\n",
" <td>3575.0</td>\n",
" <td>108.0</td>\n",
" <td>186.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Make Type Origin DriveTrain MSRP Invoice EngineSize \\\n",
"Model \n",
"TSX 4dr Acura Sedan Asia Front 26990.0 24647.0 2.4 \n",
"TL 4dr Acura Sedan Asia Front 33195.0 30299.0 3.2 \n",
"\n",
" Cylinders Horsepower MPG_City MPG_Highway Weight Wheelbase \\\n",
"Model \n",
"TSX 4dr 4.0 200.0 22.0 29.0 3230.0 105.0 \n",
"TL 4dr 6.0 270.0 20.0 28.0 3575.0 108.0 \n",
"\n",
" Length \n",
"Model \n",
"TSX 4dr 183.0 \n",
"TL 4dr 186.0 "
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.loc[['TSX 4dr', 'TL 4dr']]"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Make</th>\n",
" <th>Type</th>\n",
" <th>Origin</th>\n",
" <th>DriveTrain</th>\n",
" <th>MSRP</th>\n",
" <th>Invoice</th>\n",
" <th>EngineSize</th>\n",
" <th>Cylinders</th>\n",
" <th>Horsepower</th>\n",
" <th>MPG_City</th>\n",
" <th>MPG_Highway</th>\n",
" <th>Weight</th>\n",
" <th>Wheelbase</th>\n",
" <th>Length</th>\n",
" </tr>\n",
" <tr>\n",
" <th>Model</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>MDX</th>\n",
" <td>Acura</td>\n",
" <td>SUV</td>\n",
" <td>Asia</td>\n",
" <td>All</td>\n",
" <td>36945.0</td>\n",
" <td>33337.0</td>\n",
" <td>3.5</td>\n",
" <td>6.0</td>\n",
" <td>265.0</td>\n",
" <td>17.0</td>\n",
" <td>23.0</td>\n",
" <td>4451.0</td>\n",
" <td>106.0</td>\n",
" <td>189.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>RSX Type S 2dr</th>\n",
" <td>Acura</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>23820.0</td>\n",
" <td>21761.0</td>\n",
" <td>2.0</td>\n",
" <td>4.0</td>\n",
" <td>200.0</td>\n",
" <td>24.0</td>\n",
" <td>31.0</td>\n",
" <td>2778.0</td>\n",
" <td>101.0</td>\n",
" <td>172.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>TSX 4dr</th>\n",
" <td>Acura</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>26990.0</td>\n",
" <td>24647.0</td>\n",
" <td>2.4</td>\n",
" <td>4.0</td>\n",
" <td>200.0</td>\n",
" <td>22.0</td>\n",
" <td>29.0</td>\n",
" <td>3230.0</td>\n",
" <td>105.0</td>\n",
" <td>183.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>TL 4dr</th>\n",
" <td>Acura</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>33195.0</td>\n",
" <td>30299.0</td>\n",
" <td>3.2</td>\n",
" <td>6.0</td>\n",
" <td>270.0</td>\n",
" <td>20.0</td>\n",
" <td>28.0</td>\n",
" <td>3575.0</td>\n",
" <td>108.0</td>\n",
" <td>186.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Make Type Origin DriveTrain MSRP Invoice EngineSize \\\n",
"Model \n",
"MDX Acura SUV Asia All 36945.0 33337.0 3.5 \n",
"RSX Type S 2dr Acura Sedan Asia Front 23820.0 21761.0 2.0 \n",
"TSX 4dr Acura Sedan Asia Front 26990.0 24647.0 2.4 \n",
"TL 4dr Acura Sedan Asia Front 33195.0 30299.0 3.2 \n",
"\n",
" Cylinders Horsepower MPG_City MPG_Highway Weight \\\n",
"Model \n",
"MDX 6.0 265.0 17.0 23.0 4451.0 \n",
"RSX Type S 2dr 4.0 200.0 24.0 31.0 2778.0 \n",
"TSX 4dr 4.0 200.0 22.0 29.0 3230.0 \n",
"TL 4dr 6.0 270.0 20.0 28.0 3575.0 \n",
"\n",
" Wheelbase Length \n",
"Model \n",
"MDX 106.0 189.0 \n",
"RSX Type S 2dr 101.0 172.0 \n",
"TSX 4dr 105.0 183.0 \n",
"TL 4dr 108.0 186.0 "
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.loc['MDX':'TL 4dr']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that when using slicing on labels, the endpoint for the slice is included. This is different than for row indexes."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Indexing Both Rows and Columns\n",
"\n",
"The ``iloc`` and ``loc`` properties can be used to subset both rows and columns simultaneously. However, the ``iloc`` property must only contain positional row and column index information, and ``loc`` must only contain row and column labels."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>DriveTrain</th>\n",
" <th>MSRP</th>\n",
" <th>Invoice</th>\n",
" </tr>\n",
" <tr>\n",
" <th>Model</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>RSX Type S 2dr</th>\n",
" <td>Front</td>\n",
" <td>23820.0</td>\n",
" <td>21761.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>TSX 4dr</th>\n",
" <td>Front</td>\n",
" <td>26990.0</td>\n",
" <td>24647.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>TL 4dr</th>\n",
" <td>Front</td>\n",
" <td>33195.0</td>\n",
" <td>30299.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3.5 RL 4dr</th>\n",
" <td>Front</td>\n",
" <td>43755.0</td>\n",
" <td>39014.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" DriveTrain MSRP Invoice\n",
"Model \n",
"RSX Type S 2dr Front 23820.0 21761.0\n",
"TSX 4dr Front 26990.0 24647.0\n",
"TL 4dr Front 33195.0 30299.0\n",
"3.5 RL 4dr Front 43755.0 39014.0"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.iloc[1:5, 3:6]"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>MSRP</th>\n",
" <th>Invoice</th>\n",
" <th>EngineSize</th>\n",
" <th>Cylinders</th>\n",
" <th>Horsepower</th>\n",
" </tr>\n",
" <tr>\n",
" <th>Model</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>MDX</th>\n",
" <td>36945.0</td>\n",
" <td>33337.0</td>\n",
" <td>3.5</td>\n",
" <td>6.0</td>\n",
" <td>265.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>RSX Type S 2dr</th>\n",
" <td>23820.0</td>\n",
" <td>21761.0</td>\n",
" <td>2.0</td>\n",
" <td>4.0</td>\n",
" <td>200.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>TSX 4dr</th>\n",
" <td>26990.0</td>\n",
" <td>24647.0</td>\n",
" <td>2.4</td>\n",
" <td>4.0</td>\n",
" <td>200.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>TL 4dr</th>\n",
" <td>33195.0</td>\n",
" <td>30299.0</td>\n",
" <td>3.2</td>\n",
" <td>6.0</td>\n",
" <td>270.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" MSRP Invoice EngineSize Cylinders Horsepower\n",
"Model \n",
"MDX 36945.0 33337.0 3.5 6.0 265.0\n",
"RSX Type S 2dr 23820.0 21761.0 2.0 4.0 200.0\n",
"TSX 4dr 26990.0 24647.0 2.4 4.0 200.0\n",
"TL 4dr 33195.0 30299.0 3.2 6.0 270.0"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.loc['MDX':'TL 4dr', 'MSRP':'Horsepower']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Boolean Indexing\n",
"\n",
"A more dynamic way of indexing DataFrames is to use boolean indexing. Rather than specifying explicit values for the index, we will use an expression to indicate which rows we want to select. The expression creates a boolean Series which indicates the rows to keep. Here is an example of a boolean series created by comparing the MSRP column to 40000."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Model\n",
"MDX False\n",
"RSX Type S 2dr False\n",
"TSX 4dr False\n",
"TL 4dr False\n",
"3.5 RL 4dr True\n",
"3.5 RL w/Navigation 4dr True\n",
"NSX coupe 2dr manual S True\n",
"A4 1.8T 4dr False\n",
"A41.8T convertible 2dr False\n",
"A4 3.0 4dr False\n",
"A4 3.0 Quattro 4dr manual False\n",
"A4 3.0 Quattro 4dr auto False\n",
"A6 3.0 4dr False\n",
"A6 3.0 Quattro 4dr False\n",
"A4 3.0 convertible 2dr True\n",
"A4 3.0 Quattro convertible 2dr True\n",
"A6 2.7 Turbo Quattro 4dr True\n",
"A6 4.2 Quattro 4dr True\n",
"A8 L Quattro 4dr True\n",
"S4 Quattro 4dr True\n",
"RS 6 4dr True\n",
"TT 1.8 convertible 2dr (coupe) False\n",
"TT 1.8 Quattro 2dr (convertible) False\n",
"TT 3.2 coupe 2dr (convertible) True\n",
"A6 3.0 Avant Quattro True\n",
"S4 Avant Quattro True\n",
"X3 3.0i False\n",
"X5 4.4i True\n",
"325i 4dr False\n",
"325Ci 2dr False\n",
" ... \n",
"Tundra Regular Cab V6 False\n",
"Tundra Access Cab V6 SR5 False\n",
"Matrix XR False\n",
"Touareg V6 False\n",
"Golf GLS 4dr False\n",
"GTI 1.8T 2dr hatch False\n",
"Jetta GLS TDI 4dr False\n",
"New Beetle GLS 1.8T 2dr False\n",
"Jetta GLI VR6 4dr False\n",
"New Beetle GLS convertible 2dr False\n",
"Passat GLS 4dr False\n",
"Passat GLX V6 4MOTION 4dr False\n",
"Passat W8 4MOTION 4dr False\n",
"Phaeton 4dr True\n",
"Phaeton W12 4dr True\n",
"Jetta GL False\n",
"Passat GLS 1.8T False\n",
"Passat W8 True\n",
"XC90 T6 True\n",
"S40 4dr False\n",
"S60 2.5 4dr False\n",
"S60 T5 4dr False\n",
"S60 R 4dr False\n",
"S80 2.9 4dr False\n",
"S80 2.5T 4dr False\n",
"C70 LPT convertible 2dr True\n",
"C70 HPT convertible 2dr True\n",
"S80 T6 4dr True\n",
"V40 False\n",
"XC70 False\n",
"Name: MSRP, dtype: bool"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df['MSRP'] > 40000"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As you can see, the values of the Series are all True or False. If we use that condition as an index value of a DataFrame, we'll get a new DataFrame containing only the rows where the condition was true."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Make</th>\n",
" <th>Type</th>\n",
" <th>Origin</th>\n",
" <th>DriveTrain</th>\n",
" <th>MSRP</th>\n",
" <th>Invoice</th>\n",
" <th>EngineSize</th>\n",
" <th>Cylinders</th>\n",
" <th>Horsepower</th>\n",
" <th>MPG_City</th>\n",
" <th>MPG_Highway</th>\n",
" <th>Weight</th>\n",
" <th>Wheelbase</th>\n",
" <th>Length</th>\n",
" </tr>\n",
" <tr>\n",
" <th>Model</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>3.5 RL 4dr</th>\n",
" <td>Acura</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>43755.0</td>\n",
" <td>39014.0</td>\n",
" <td>3.5</td>\n",
" <td>6.0</td>\n",
" <td>225.0</td>\n",
" <td>18.0</td>\n",
" <td>24.0</td>\n",
" <td>3880.0</td>\n",
" <td>115.0</td>\n",
" <td>197.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3.5 RL w/Navigation 4dr</th>\n",
" <td>Acura</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>46100.0</td>\n",
" <td>41100.0</td>\n",
" <td>3.5</td>\n",
" <td>6.0</td>\n",
" <td>225.0</td>\n",
" <td>18.0</td>\n",
" <td>24.0</td>\n",
" <td>3893.0</td>\n",
" <td>115.0</td>\n",
" <td>197.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>NSX coupe 2dr manual S</th>\n",
" <td>Acura</td>\n",
" <td>Sports</td>\n",
" <td>Asia</td>\n",
" <td>Rear</td>\n",
" <td>89765.0</td>\n",
" <td>79978.0</td>\n",
" <td>3.2</td>\n",
" <td>6.0</td>\n",
" <td>290.0</td>\n",
" <td>17.0</td>\n",
" <td>24.0</td>\n",
" <td>3153.0</td>\n",
" <td>100.0</td>\n",
" <td>174.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>A4 3.0 convertible 2dr</th>\n",
" <td>Audi</td>\n",
" <td>Sedan</td>\n",
" <td>Europe</td>\n",
" <td>Front</td>\n",
" <td>42490.0</td>\n",
" <td>38325.0</td>\n",
" <td>3.0</td>\n",
" <td>6.0</td>\n",
" <td>220.0</td>\n",
" <td>20.0</td>\n",
" <td>27.0</td>\n",
" <td>3814.0</td>\n",
" <td>105.0</td>\n",
" <td>180.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>A4 3.0 Quattro convertible 2dr</th>\n",
" <td>Audi</td>\n",
" <td>Sedan</td>\n",
" <td>Europe</td>\n",
" <td>All</td>\n",
" <td>44240.0</td>\n",
" <td>40075.0</td>\n",
" <td>3.0</td>\n",
" <td>6.0</td>\n",
" <td>220.0</td>\n",
" <td>18.0</td>\n",
" <td>25.0</td>\n",
" <td>4013.0</td>\n",
" <td>105.0</td>\n",
" <td>180.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Make Type Origin DriveTrain MSRP \\\n",
"Model \n",
"3.5 RL 4dr Acura Sedan Asia Front 43755.0 \n",
"3.5 RL w/Navigation 4dr Acura Sedan Asia Front 46100.0 \n",
"NSX coupe 2dr manual S Acura Sports Asia Rear 89765.0 \n",
"A4 3.0 convertible 2dr Audi Sedan Europe Front 42490.0 \n",
"A4 3.0 Quattro convertible 2dr Audi Sedan Europe All 44240.0 \n",
"\n",
" Invoice EngineSize Cylinders Horsepower \\\n",
"Model \n",
"3.5 RL 4dr 39014.0 3.5 6.0 225.0 \n",
"3.5 RL w/Navigation 4dr 41100.0 3.5 6.0 225.0 \n",
"NSX coupe 2dr manual S 79978.0 3.2 6.0 290.0 \n",
"A4 3.0 convertible 2dr 38325.0 3.0 6.0 220.0 \n",
"A4 3.0 Quattro convertible 2dr 40075.0 3.0 6.0 220.0 \n",
"\n",
" MPG_City MPG_Highway Weight Wheelbase \\\n",
"Model \n",
"3.5 RL 4dr 18.0 24.0 3880.0 115.0 \n",
"3.5 RL w/Navigation 4dr 18.0 24.0 3893.0 115.0 \n",
"NSX coupe 2dr manual S 17.0 24.0 3153.0 100.0 \n",
"A4 3.0 convertible 2dr 20.0 27.0 3814.0 105.0 \n",
"A4 3.0 Quattro convertible 2dr 18.0 25.0 4013.0 105.0 \n",
"\n",
" Length \n",
"Model \n",
"3.5 RL 4dr 197.0 \n",
"3.5 RL w/Navigation 4dr 197.0 \n",
"NSX coupe 2dr manual S 174.0 \n",
"A4 3.0 convertible 2dr 180.0 \n",
"A4 3.0 Quattro convertible 2dr 180.0 "
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df[df['MSRP'] > 40000].head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can combine conditions using ``&`` for \"and\" and ``|`` for \"or\". For example, if we wanted cars that had an MSRP greater than $40,000, but also had 8 or more Cylinders, we could do the following. Note that due to the order of precedence of comparison and combination operatiors, you need to surround your comparions with parentheses."
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Make</th>\n",
" <th>Type</th>\n",
" <th>Origin</th>\n",
" <th>DriveTrain</th>\n",
" <th>MSRP</th>\n",
" <th>Invoice</th>\n",
" <th>EngineSize</th>\n",
" <th>Cylinders</th>\n",
" <th>Horsepower</th>\n",
" <th>MPG_City</th>\n",
" <th>MPG_Highway</th>\n",
" <th>Weight</th>\n",
" <th>Wheelbase</th>\n",
" <th>Length</th>\n",
" </tr>\n",
" <tr>\n",
" <th>Model</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>Viper SRT-10 convertible 2dr</th>\n",
" <td>Dodge</td>\n",
" <td>Sports</td>\n",
" <td>USA</td>\n",
" <td>Rear</td>\n",
" <td>81795.0</td>\n",
" <td>74451.0</td>\n",
" <td>8.3</td>\n",
" <td>10.0</td>\n",
" <td>500.0</td>\n",
" <td>12.0</td>\n",
" <td>20.0</td>\n",
" <td>3410.0</td>\n",
" <td>99.0</td>\n",
" <td>176.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Excursion 6.8 XLT</th>\n",
" <td>Ford</td>\n",
" <td>SUV</td>\n",
" <td>USA</td>\n",
" <td>All</td>\n",
" <td>41475.0</td>\n",
" <td>36494.0</td>\n",
" <td>6.8</td>\n",
" <td>10.0</td>\n",
" <td>310.0</td>\n",
" <td>10.0</td>\n",
" <td>13.0</td>\n",
" <td>7190.0</td>\n",
" <td>137.0</td>\n",
" <td>227.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>CL600 2dr</th>\n",
" <td>Mercedes-Benz</td>\n",
" <td>Sedan</td>\n",
" <td>Europe</td>\n",
" <td>Rear</td>\n",
" <td>128420.0</td>\n",
" <td>119600.0</td>\n",
" <td>5.5</td>\n",
" <td>12.0</td>\n",
" <td>493.0</td>\n",
" <td>13.0</td>\n",
" <td>19.0</td>\n",
" <td>4473.0</td>\n",
" <td>114.0</td>\n",
" <td>196.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>SL600 convertible 2dr</th>\n",
" <td>Mercedes-Benz</td>\n",
" <td>Sports</td>\n",
" <td>Europe</td>\n",
" <td>Rear</td>\n",
" <td>126670.0</td>\n",
" <td>117854.0</td>\n",
" <td>5.5</td>\n",
" <td>12.0</td>\n",
" <td>493.0</td>\n",
" <td>13.0</td>\n",
" <td>19.0</td>\n",
" <td>4429.0</td>\n",
" <td>101.0</td>\n",
" <td>179.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Phaeton W12 4dr</th>\n",
" <td>Volkswagen</td>\n",
" <td>Sedan</td>\n",
" <td>Europe</td>\n",
" <td>Front</td>\n",
" <td>75000.0</td>\n",
" <td>69130.0</td>\n",
" <td>6.0</td>\n",
" <td>12.0</td>\n",
" <td>420.0</td>\n",
" <td>12.0</td>\n",
" <td>19.0</td>\n",
" <td>5399.0</td>\n",
" <td>118.0</td>\n",
" <td>204.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Make Type Origin DriveTrain \\\n",
"Model \n",
"Viper SRT-10 convertible 2dr Dodge Sports USA Rear \n",
"Excursion 6.8 XLT Ford SUV USA All \n",
"CL600 2dr Mercedes-Benz Sedan Europe Rear \n",
"SL600 convertible 2dr Mercedes-Benz Sports Europe Rear \n",
"Phaeton W12 4dr Volkswagen Sedan Europe Front \n",
"\n",
" MSRP Invoice EngineSize Cylinders \\\n",
"Model \n",
"Viper SRT-10 convertible 2dr 81795.0 74451.0 8.3 10.0 \n",
"Excursion 6.8 XLT 41475.0 36494.0 6.8 10.0 \n",
"CL600 2dr 128420.0 119600.0 5.5 12.0 \n",
"SL600 convertible 2dr 126670.0 117854.0 5.5 12.0 \n",
"Phaeton W12 4dr 75000.0 69130.0 6.0 12.0 \n",
"\n",
" Horsepower MPG_City MPG_Highway Weight \\\n",
"Model \n",
"Viper SRT-10 convertible 2dr 500.0 12.0 20.0 3410.0 \n",
"Excursion 6.8 XLT 310.0 10.0 13.0 7190.0 \n",
"CL600 2dr 493.0 13.0 19.0 4473.0 \n",
"SL600 convertible 2dr 493.0 13.0 19.0 4429.0 \n",
"Phaeton W12 4dr 420.0 12.0 19.0 5399.0 \n",
"\n",
" Wheelbase Length \n",
"Model \n",
"Viper SRT-10 convertible 2dr 99.0 176.0 \n",
"Excursion 6.8 XLT 137.0 227.0 \n",
"CL600 2dr 114.0 196.0 \n",
"SL600 convertible 2dr 101.0 179.0 \n",
"Phaeton W12 4dr 118.0 204.0 "
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df[(df['MSRP'] > 40000) & (df['Cylinders'] > 8)].head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Sorting\n",
"\n",
"Sorting in Pandas DataFrames can be done according to the index or column values. The methods used to sort a DataFrame are ``sort_index`` for sorting by the index and ``sort_values`` for sorting by the data values.\n",
"\n",
"Since we still have our index set as Model, let's sort by that first."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Make</th>\n",
" <th>Type</th>\n",
" <th>Origin</th>\n",
" <th>DriveTrain</th>\n",
" <th>MSRP</th>\n",
" <th>Invoice</th>\n",
" <th>EngineSize</th>\n",
" <th>Cylinders</th>\n",
" <th>Horsepower</th>\n",
" <th>MPG_City</th>\n",
" <th>MPG_Highway</th>\n",
" <th>Weight</th>\n",
" <th>Wheelbase</th>\n",
" <th>Length</th>\n",
" </tr>\n",
" <tr>\n",
" <th>Model</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>3.5 RL 4dr</th>\n",
" <td>Acura</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>43755.0</td>\n",
" <td>39014.0</td>\n",
" <td>3.5</td>\n",
" <td>6.0</td>\n",
" <td>225.0</td>\n",
" <td>18.0</td>\n",
" <td>24.0</td>\n",
" <td>3880.0</td>\n",
" <td>115.0</td>\n",
" <td>197.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3.5 RL w/Navigation 4dr</th>\n",
" <td>Acura</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>46100.0</td>\n",
" <td>41100.0</td>\n",
" <td>3.5</td>\n",
" <td>6.0</td>\n",
" <td>225.0</td>\n",
" <td>18.0</td>\n",
" <td>24.0</td>\n",
" <td>3893.0</td>\n",
" <td>115.0</td>\n",
" <td>197.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>300M 4dr</th>\n",
" <td>Chrysler</td>\n",
" <td>Sedan</td>\n",
" <td>USA</td>\n",
" <td>Front</td>\n",
" <td>29865.0</td>\n",
" <td>27797.0</td>\n",
" <td>3.5</td>\n",
" <td>6.0</td>\n",
" <td>250.0</td>\n",
" <td>18.0</td>\n",
" <td>27.0</td>\n",
" <td>3581.0</td>\n",
" <td>113.0</td>\n",
" <td>198.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>300M Special Edition 4dr</th>\n",
" <td>Chrysler</td>\n",
" <td>Sedan</td>\n",
" <td>USA</td>\n",
" <td>Front</td>\n",
" <td>33295.0</td>\n",
" <td>30884.0</td>\n",
" <td>3.5</td>\n",
" <td>6.0</td>\n",
" <td>255.0</td>\n",
" <td>18.0</td>\n",
" <td>27.0</td>\n",
" <td>3650.0</td>\n",
" <td>113.0</td>\n",
" <td>198.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>325Ci 2dr</th>\n",
" <td>BMW</td>\n",
" <td>Sedan</td>\n",
" <td>Europe</td>\n",
" <td>Rear</td>\n",
" <td>30795.0</td>\n",
" <td>28245.0</td>\n",
" <td>2.5</td>\n",
" <td>6.0</td>\n",
" <td>184.0</td>\n",
" <td>20.0</td>\n",
" <td>29.0</td>\n",
" <td>3197.0</td>\n",
" <td>107.0</td>\n",
" <td>177.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Make Type Origin DriveTrain MSRP \\\n",
"Model \n",
"3.5 RL 4dr Acura Sedan Asia Front 43755.0 \n",
"3.5 RL w/Navigation 4dr Acura Sedan Asia Front 46100.0 \n",
"300M 4dr Chrysler Sedan USA Front 29865.0 \n",
"300M Special Edition 4dr Chrysler Sedan USA Front 33295.0 \n",
"325Ci 2dr BMW Sedan Europe Rear 30795.0 \n",
"\n",
" Invoice EngineSize Cylinders Horsepower \\\n",
"Model \n",
"3.5 RL 4dr 39014.0 3.5 6.0 225.0 \n",
"3.5 RL w/Navigation 4dr 41100.0 3.5 6.0 225.0 \n",
"300M 4dr 27797.0 3.5 6.0 250.0 \n",
"300M Special Edition 4dr 30884.0 3.5 6.0 255.0 \n",
"325Ci 2dr 28245.0 2.5 6.0 184.0 \n",
"\n",
" MPG_City MPG_Highway Weight Wheelbase Length \n",
"Model \n",
"3.5 RL 4dr 18.0 24.0 3880.0 115.0 197.0 \n",
"3.5 RL w/Navigation 4dr 18.0 24.0 3893.0 115.0 197.0 \n",
"300M 4dr 18.0 27.0 3581.0 113.0 198.0 \n",
"300M Special Edition 4dr 18.0 27.0 3650.0 113.0 198.0 \n",
"325Ci 2dr 20.0 29.0 3197.0 107.0 177.0 "
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.sort_index().head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Sorting by values is done by specify the columns that you want to sort by."
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Make</th>\n",
" <th>Type</th>\n",
" <th>Origin</th>\n",
" <th>DriveTrain</th>\n",
" <th>MSRP</th>\n",
" <th>Invoice</th>\n",
" <th>EngineSize</th>\n",
" <th>Cylinders</th>\n",
" <th>Horsepower</th>\n",
" <th>MPG_City</th>\n",
" <th>MPG_Highway</th>\n",
" <th>Weight</th>\n",
" <th>Wheelbase</th>\n",
" <th>Length</th>\n",
" </tr>\n",
" <tr>\n",
" <th>Model</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>Rio 4dr manual</th>\n",
" <td>Kia</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>10280.0</td>\n",
" <td>9875.0</td>\n",
" <td>1.6</td>\n",
" <td>4.0</td>\n",
" <td>104.0</td>\n",
" <td>26.0</td>\n",
" <td>33.0</td>\n",
" <td>2403.0</td>\n",
" <td>95.0</td>\n",
" <td>167.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Accent 2dr hatch</th>\n",
" <td>Hyundai</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>10539.0</td>\n",
" <td>10107.0</td>\n",
" <td>1.6</td>\n",
" <td>4.0</td>\n",
" <td>103.0</td>\n",
" <td>29.0</td>\n",
" <td>33.0</td>\n",
" <td>2255.0</td>\n",
" <td>96.0</td>\n",
" <td>167.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Echo 2dr manual</th>\n",
" <td>Toyota</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>10760.0</td>\n",
" <td>10144.0</td>\n",
" <td>1.5</td>\n",
" <td>4.0</td>\n",
" <td>108.0</td>\n",
" <td>35.0</td>\n",
" <td>43.0</td>\n",
" <td>2035.0</td>\n",
" <td>93.0</td>\n",
" <td>163.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Ion1 4dr</th>\n",
" <td>Saturn</td>\n",
" <td>Sedan</td>\n",
" <td>USA</td>\n",
" <td>Front</td>\n",
" <td>10995.0</td>\n",
" <td>10319.0</td>\n",
" <td>2.2</td>\n",
" <td>4.0</td>\n",
" <td>140.0</td>\n",
" <td>26.0</td>\n",
" <td>35.0</td>\n",
" <td>2692.0</td>\n",
" <td>103.0</td>\n",
" <td>185.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Rio 4dr auto</th>\n",
" <td>Kia</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>11155.0</td>\n",
" <td>10705.0</td>\n",
" <td>1.6</td>\n",
" <td>4.0</td>\n",
" <td>104.0</td>\n",
" <td>25.0</td>\n",
" <td>32.0</td>\n",
" <td>2458.0</td>\n",
" <td>95.0</td>\n",
" <td>167.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Make Type Origin DriveTrain MSRP Invoice \\\n",
"Model \n",
"Rio 4dr manual Kia Sedan Asia Front 10280.0 9875.0 \n",
"Accent 2dr hatch Hyundai Sedan Asia Front 10539.0 10107.0 \n",
"Echo 2dr manual Toyota Sedan Asia Front 10760.0 10144.0 \n",
"Ion1 4dr Saturn Sedan USA Front 10995.0 10319.0 \n",
"Rio 4dr auto Kia Sedan Asia Front 11155.0 10705.0 \n",
"\n",
" EngineSize Cylinders Horsepower MPG_City MPG_Highway \\\n",
"Model \n",
"Rio 4dr manual 1.6 4.0 104.0 26.0 33.0 \n",
"Accent 2dr hatch 1.6 4.0 103.0 29.0 33.0 \n",
"Echo 2dr manual 1.5 4.0 108.0 35.0 43.0 \n",
"Ion1 4dr 2.2 4.0 140.0 26.0 35.0 \n",
"Rio 4dr auto 1.6 4.0 104.0 25.0 32.0 \n",
"\n",
" Weight Wheelbase Length \n",
"Model \n",
"Rio 4dr manual 2403.0 95.0 167.0 \n",
"Accent 2dr hatch 2255.0 96.0 167.0 \n",
"Echo 2dr manual 2035.0 93.0 163.0 \n",
"Ion1 4dr 2692.0 103.0 185.0 \n",
"Rio 4dr auto 2458.0 95.0 167.0 "
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.sort_values(['MSRP', 'Horsepower']).head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can also use the ``ascending=`` option to specify the order in which the columns should be sorted."
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Make</th>\n",
" <th>Type</th>\n",
" <th>Origin</th>\n",
" <th>DriveTrain</th>\n",
" <th>MSRP</th>\n",
" <th>Invoice</th>\n",
" <th>EngineSize</th>\n",
" <th>Cylinders</th>\n",
" <th>Horsepower</th>\n",
" <th>MPG_City</th>\n",
" <th>MPG_Highway</th>\n",
" <th>Weight</th>\n",
" <th>Wheelbase</th>\n",
" <th>Length</th>\n",
" </tr>\n",
" <tr>\n",
" <th>Model</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>911 GT2 2dr</th>\n",
" <td>Porsche</td>\n",
" <td>Sports</td>\n",
" <td>Europe</td>\n",
" <td>Rear</td>\n",
" <td>192465.0</td>\n",
" <td>173560.0</td>\n",
" <td>3.6</td>\n",
" <td>6.0</td>\n",
" <td>477.0</td>\n",
" <td>17.0</td>\n",
" <td>24.0</td>\n",
" <td>3131.0</td>\n",
" <td>93.0</td>\n",
" <td>175.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>CL600 2dr</th>\n",
" <td>Mercedes-Benz</td>\n",
" <td>Sedan</td>\n",
" <td>Europe</td>\n",
" <td>Rear</td>\n",
" <td>128420.0</td>\n",
" <td>119600.0</td>\n",
" <td>5.5</td>\n",
" <td>12.0</td>\n",
" <td>493.0</td>\n",
" <td>13.0</td>\n",
" <td>19.0</td>\n",
" <td>4473.0</td>\n",
" <td>114.0</td>\n",
" <td>196.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>SL600 convertible 2dr</th>\n",
" <td>Mercedes-Benz</td>\n",
" <td>Sports</td>\n",
" <td>Europe</td>\n",
" <td>Rear</td>\n",
" <td>126670.0</td>\n",
" <td>117854.0</td>\n",
" <td>5.5</td>\n",
" <td>12.0</td>\n",
" <td>493.0</td>\n",
" <td>13.0</td>\n",
" <td>19.0</td>\n",
" <td>4429.0</td>\n",
" <td>101.0</td>\n",
" <td>179.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>SL55 AMG 2dr</th>\n",
" <td>Mercedes-Benz</td>\n",
" <td>Sports</td>\n",
" <td>Europe</td>\n",
" <td>Rear</td>\n",
" <td>121770.0</td>\n",
" <td>113388.0</td>\n",
" <td>5.5</td>\n",
" <td>8.0</td>\n",
" <td>493.0</td>\n",
" <td>14.0</td>\n",
" <td>21.0</td>\n",
" <td>4235.0</td>\n",
" <td>101.0</td>\n",
" <td>179.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>CL500 2dr</th>\n",
" <td>Mercedes-Benz</td>\n",
" <td>Sedan</td>\n",
" <td>Europe</td>\n",
" <td>Rear</td>\n",
" <td>94820.0</td>\n",
" <td>88324.0</td>\n",
" <td>5.0</td>\n",
" <td>8.0</td>\n",
" <td>302.0</td>\n",
" <td>16.0</td>\n",
" <td>24.0</td>\n",
" <td>4085.0</td>\n",
" <td>114.0</td>\n",
" <td>196.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Make Type Origin DriveTrain MSRP \\\n",
"Model \n",
"911 GT2 2dr Porsche Sports Europe Rear 192465.0 \n",
"CL600 2dr Mercedes-Benz Sedan Europe Rear 128420.0 \n",
"SL600 convertible 2dr Mercedes-Benz Sports Europe Rear 126670.0 \n",
"SL55 AMG 2dr Mercedes-Benz Sports Europe Rear 121770.0 \n",
"CL500 2dr Mercedes-Benz Sedan Europe Rear 94820.0 \n",
"\n",
" Invoice EngineSize Cylinders Horsepower MPG_City \\\n",
"Model \n",
"911 GT2 2dr 173560.0 3.6 6.0 477.0 17.0 \n",
"CL600 2dr 119600.0 5.5 12.0 493.0 13.0 \n",
"SL600 convertible 2dr 117854.0 5.5 12.0 493.0 13.0 \n",
"SL55 AMG 2dr 113388.0 5.5 8.0 493.0 14.0 \n",
"CL500 2dr 88324.0 5.0 8.0 302.0 16.0 \n",
"\n",
" MPG_Highway Weight Wheelbase Length \n",
"Model \n",
"911 GT2 2dr 24.0 3131.0 93.0 175.0 \n",
"CL600 2dr 19.0 4473.0 114.0 196.0 \n",
"SL600 convertible 2dr 19.0 4429.0 101.0 179.0 \n",
"SL55 AMG 2dr 21.0 4235.0 101.0 179.0 \n",
"CL500 2dr 24.0 4085.0 114.0 196.0 "
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.sort_values(['MSRP', 'Horsepower'], ascending=[False, True]).head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Keep in mind that the ``sort_index`` and ``sort_values`` methods return a new DataFrame with the sorting applied. If you want to sort a DataFrame in place, you need to specify the ``inplace=True`` option."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's reset the index before moving on to the next sections."
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"df.reset_index(inplace=True)"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"## Doing Simple Statistics\n",
"\n",
"Pandas DataFrames have many builtin methods for doing simple statistics. Probably the most common one is ``describe``. It computes various basic statistics for the entire DataFrame."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>MSRP</th>\n",
" <th>Invoice</th>\n",
" <th>EngineSize</th>\n",
" <th>Cylinders</th>\n",
" <th>Horsepower</th>\n",
" <th>MPG_City</th>\n",
" <th>MPG_Highway</th>\n",
" <th>Weight</th>\n",
" <th>Wheelbase</th>\n",
" <th>Length</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>count</th>\n",
" <td>428.000000</td>\n",
" <td>428.000000</td>\n",
" <td>428.000000</td>\n",
" <td>426.000000</td>\n",
" <td>428.000000</td>\n",
" <td>428.000000</td>\n",
" <td>428.000000</td>\n",
" <td>428.000000</td>\n",
" <td>428.000000</td>\n",
" <td>428.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>32774.855140</td>\n",
" <td>30014.700935</td>\n",
" <td>3.196729</td>\n",
" <td>5.807512</td>\n",
" <td>215.885514</td>\n",
" <td>20.060748</td>\n",
" <td>26.843458</td>\n",
" <td>3577.953271</td>\n",
" <td>108.154206</td>\n",
" <td>186.362150</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>19431.716674</td>\n",
" <td>17642.117750</td>\n",
" <td>1.108595</td>\n",
" <td>1.558443</td>\n",
" <td>71.836032</td>\n",
" <td>5.238218</td>\n",
" <td>5.741201</td>\n",
" <td>758.983215</td>\n",
" <td>8.311813</td>\n",
" <td>14.357991</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>10280.000000</td>\n",
" <td>9875.000000</td>\n",
" <td>1.300000</td>\n",
" <td>3.000000</td>\n",
" <td>73.000000</td>\n",
" <td>10.000000</td>\n",
" <td>12.000000</td>\n",
" <td>1850.000000</td>\n",
" <td>89.000000</td>\n",
" <td>143.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25%</th>\n",
" <td>20334.250000</td>\n",
" <td>18866.000000</td>\n",
" <td>2.375000</td>\n",
" <td>4.000000</td>\n",
" <td>165.000000</td>\n",
" <td>17.000000</td>\n",
" <td>24.000000</td>\n",
" <td>3104.000000</td>\n",
" <td>103.000000</td>\n",
" <td>178.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>27635.000000</td>\n",
" <td>25294.500000</td>\n",
" <td>3.000000</td>\n",
" <td>6.000000</td>\n",
" <td>210.000000</td>\n",
" <td>19.000000</td>\n",
" <td>26.000000</td>\n",
" <td>3474.500000</td>\n",
" <td>107.000000</td>\n",
" <td>187.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75%</th>\n",
" <td>39205.000000</td>\n",
" <td>35710.250000</td>\n",
" <td>3.900000</td>\n",
" <td>6.000000</td>\n",
" <td>255.000000</td>\n",
" <td>21.250000</td>\n",
" <td>29.000000</td>\n",
" <td>3977.750000</td>\n",
" <td>112.000000</td>\n",
" <td>194.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>192465.000000</td>\n",
" <td>173560.000000</td>\n",
" <td>8.300000</td>\n",
" <td>12.000000</td>\n",
" <td>500.000000</td>\n",
" <td>60.000000</td>\n",
" <td>66.000000</td>\n",
" <td>7190.000000</td>\n",
" <td>144.000000</td>\n",
" <td>238.000000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" MSRP Invoice EngineSize Cylinders Horsepower \\\n",
"count 428.000000 428.000000 428.000000 426.000000 428.000000 \n",
"mean 32774.855140 30014.700935 3.196729 5.807512 215.885514 \n",
"std 19431.716674 17642.117750 1.108595 1.558443 71.836032 \n",
"min 10280.000000 9875.000000 1.300000 3.000000 73.000000 \n",
"25% 20334.250000 18866.000000 2.375000 4.000000 165.000000 \n",
"50% 27635.000000 25294.500000 3.000000 6.000000 210.000000 \n",
"75% 39205.000000 35710.250000 3.900000 6.000000 255.000000 \n",
"max 192465.000000 173560.000000 8.300000 12.000000 500.000000 \n",
"\n",
" MPG_City MPG_Highway Weight Wheelbase Length \n",
"count 428.000000 428.000000 428.000000 428.000000 428.000000 \n",
"mean 20.060748 26.843458 3577.953271 108.154206 186.362150 \n",
"std 5.238218 5.741201 758.983215 8.311813 14.357991 \n",
"min 10.000000 12.000000 1850.000000 89.000000 143.000000 \n",
"25% 17.000000 24.000000 3104.000000 103.000000 178.000000 \n",
"50% 19.000000 26.000000 3474.500000 107.000000 187.000000 \n",
"75% 21.250000 29.000000 3977.750000 112.000000 194.000000 \n",
"max 60.000000 66.000000 7190.000000 144.000000 238.000000 "
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.describe()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The ``describe`` method has a couple of options to select different columns of data (``include=``) and the percentiles to compute (``percentiles=``). In the example below, we are selecting all of the columns and changing the displayed percentiles to 30% and 70% (50% is always displayed)."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Make</th>\n",
" <th>Model</th>\n",
" <th>Type</th>\n",
" <th>Origin</th>\n",
" <th>DriveTrain</th>\n",
" <th>MSRP</th>\n",
" <th>Invoice</th>\n",
" <th>EngineSize</th>\n",
" <th>Cylinders</th>\n",
" <th>Horsepower</th>\n",
" <th>MPG_City</th>\n",
" <th>MPG_Highway</th>\n",
" <th>Weight</th>\n",
" <th>Wheelbase</th>\n",
" <th>Length</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>count</th>\n",
" <td>428</td>\n",
" <td>428</td>\n",
" <td>428</td>\n",
" <td>428</td>\n",
" <td>428</td>\n",
" <td>428.000000</td>\n",
" <td>428.000000</td>\n",
" <td>428.000000</td>\n",
" <td>426.000000</td>\n",
" <td>428.000000</td>\n",
" <td>428.000000</td>\n",
" <td>428.000000</td>\n",
" <td>428.000000</td>\n",
" <td>428.000000</td>\n",
" <td>428.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>unique</th>\n",
" <td>38</td>\n",
" <td>425</td>\n",
" <td>6</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>top</th>\n",
" <td>Toyota</td>\n",
" <td>C240 4dr</td>\n",
" <td>Sedan</td>\n",
" <td>Asia</td>\n",
" <td>Front</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>freq</th>\n",
" <td>28</td>\n",
" <td>2</td>\n",
" <td>262</td>\n",
" <td>158</td>\n",
" <td>226</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>32774.855140</td>\n",
" <td>30014.700935</td>\n",
" <td>3.196729</td>\n",
" <td>5.807512</td>\n",
" <td>215.885514</td>\n",
" <td>20.060748</td>\n",
" <td>26.843458</td>\n",
" <td>3577.953271</td>\n",
" <td>108.154206</td>\n",
" <td>186.362150</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>19431.716674</td>\n",
" <td>17642.117750</td>\n",
" <td>1.108595</td>\n",
" <td>1.558443</td>\n",
" <td>71.836032</td>\n",
" <td>5.238218</td>\n",
" <td>5.741201</td>\n",
" <td>758.983215</td>\n",
" <td>8.311813</td>\n",
" <td>14.357991</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>10280.000000</td>\n",
" <td>9875.000000</td>\n",
" <td>1.300000</td>\n",
" <td>3.000000</td>\n",
" <td>73.000000</td>\n",
" <td>10.000000</td>\n",
" <td>12.000000</td>\n",
" <td>1850.000000</td>\n",
" <td>89.000000</td>\n",
" <td>143.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>30%</th>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>22001.000000</td>\n",
" <td>20288.100000</td>\n",
" <td>2.400000</td>\n",
" <td>4.000000</td>\n",
" <td>172.100000</td>\n",
" <td>18.000000</td>\n",
" <td>25.000000</td>\n",
" <td>3217.000000</td>\n",
" <td>104.000000</td>\n",
" <td>178.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>27635.000000</td>\n",
" <td>25294.500000</td>\n",
" <td>3.000000</td>\n",
" <td>6.000000</td>\n",
" <td>210.000000</td>\n",
" <td>19.000000</td>\n",
" <td>26.000000</td>\n",
" <td>3474.500000</td>\n",
" <td>107.000000</td>\n",
" <td>187.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>70%</th>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>35938.000000</td>\n",
" <td>32987.500000</td>\n",
" <td>3.600000</td>\n",
" <td>6.000000</td>\n",
" <td>240.000000</td>\n",
" <td>21.000000</td>\n",
" <td>29.000000</td>\n",
" <td>3835.300000</td>\n",
" <td>111.000000</td>\n",
" <td>192.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>192465.000000</td>\n",
" <td>173560.000000</td>\n",
" <td>8.300000</td>\n",
" <td>12.000000</td>\n",
" <td>500.000000</td>\n",
" <td>60.000000</td>\n",
" <td>66.000000</td>\n",
" <td>7190.000000</td>\n",
" <td>144.000000</td>\n",
" <td>238.000000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Make Model Type Origin DriveTrain MSRP \\\n",
"count 428 428 428 428 428 428.000000 \n",
"unique 38 425 6 3 3 NaN \n",
"top Toyota C240 4dr Sedan Asia Front NaN \n",
"freq 28 2 262 158 226 NaN \n",
"mean NaN NaN NaN NaN NaN 32774.855140 \n",
"std NaN NaN NaN NaN NaN 19431.716674 \n",
"min NaN NaN NaN NaN NaN 10280.000000 \n",
"30% NaN NaN NaN NaN NaN 22001.000000 \n",
"50% NaN NaN NaN NaN NaN 27635.000000 \n",
"70% NaN NaN NaN NaN NaN 35938.000000 \n",
"max NaN NaN NaN NaN NaN 192465.000000 \n",
"\n",
" Invoice EngineSize Cylinders Horsepower MPG_City \\\n",
"count 428.000000 428.000000 426.000000 428.000000 428.000000 \n",
"unique NaN NaN NaN NaN NaN \n",
"top NaN NaN NaN NaN NaN \n",
"freq NaN NaN NaN NaN NaN \n",
"mean 30014.700935 3.196729 5.807512 215.885514 20.060748 \n",
"std 17642.117750 1.108595 1.558443 71.836032 5.238218 \n",
"min 9875.000000 1.300000 3.000000 73.000000 10.000000 \n",
"30% 20288.100000 2.400000 4.000000 172.100000 18.000000 \n",
"50% 25294.500000 3.000000 6.000000 210.000000 19.000000 \n",
"70% 32987.500000 3.600000 6.000000 240.000000 21.000000 \n",
"max 173560.000000 8.300000 12.000000 500.000000 60.000000 \n",
"\n",
" MPG_Highway Weight Wheelbase Length \n",
"count 428.000000 428.000000 428.000000 428.000000 \n",
"unique NaN NaN NaN NaN \n",
"top NaN NaN NaN NaN \n",
"freq NaN NaN NaN NaN \n",
"mean 26.843458 3577.953271 108.154206 186.362150 \n",
"std 5.741201 758.983215 8.311813 14.357991 \n",
"min 12.000000 1850.000000 89.000000 143.000000 \n",
"30% 25.000000 3217.000000 104.000000 178.000000 \n",
"50% 26.000000 3474.500000 107.000000 187.000000 \n",
"70% 29.000000 3835.300000 111.000000 192.000000 \n",
"max 66.000000 7190.000000 144.000000 238.000000 "
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.describe(include='all', percentiles=[0.3, 0.7])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can also select individual statistics using their own methods."
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Model 3.5 RL 4dr\n",
"Make Acura\n",
"Type Hybrid\n",
"Origin Asia\n",
"DriveTrain All\n",
"MSRP 10280\n",
"Invoice 9875\n",
"EngineSize 1.3\n",
"Cylinders 3\n",
"Horsepower 73\n",
"MPG_City 10\n",
"MPG_Highway 12\n",
"Weight 1850\n",
"Wheelbase 89\n",
"Length 143\n",
"dtype: object"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.min()"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"MSRP 27635.0\n",
"Invoice 25294.5\n",
"EngineSize 3.0\n",
"Cylinders 6.0\n",
"Horsepower 210.0\n",
"MPG_City 19.0\n",
"MPG_Highway 26.0\n",
"Weight 3474.5\n",
"Wheelbase 107.0\n",
"Length 187.0\n",
"dtype: float64"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.median()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"All of the above methods also work on individual columns, or subsets of columns."
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"count 428.000000\n",
"mean 32774.855140\n",
"std 19431.716674\n",
"min 10280.000000\n",
"25% 20334.250000\n",
"50% 27635.000000\n",
"75% 39205.000000\n",
"max 192465.000000\n",
"Name: MSRP, dtype: float64"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df['MSRP'].describe()"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"10280.0"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df['MSRP'].min()"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>MSRP</th>\n",
" <th>Horsepower</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>count</th>\n",
" <td>428.000000</td>\n",
" <td>428.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>32774.855140</td>\n",
" <td>215.885514</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>19431.716674</td>\n",
" <td>71.836032</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>10280.000000</td>\n",
" <td>73.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25%</th>\n",
" <td>20334.250000</td>\n",
" <td>165.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>27635.000000</td>\n",
" <td>210.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75%</th>\n",
" <td>39205.000000</td>\n",
" <td>255.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>192465.000000</td>\n",
" <td>500.000000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" MSRP Horsepower\n",
"count 428.000000 428.000000\n",
"mean 32774.855140 215.885514\n",
"std 19431.716674 71.836032\n",
"min 10280.000000 73.000000\n",
"25% 20334.250000 165.000000\n",
"50% 27635.000000 210.000000\n",
"75% 39205.000000 255.000000\n",
"max 192465.000000 500.000000"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df[['MSRP', 'Horsepower']].describe()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Grouping Data\n",
"\n",
"Another common operation in analyzing data is grouping by variable values. This is primarily done using the ``groupby`` method of DataFrames."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<pandas.core.groupby.DataFrameGroupBy object at 0x7f7784088550>"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"grpdf = df.groupby('Origin')\n",
"grpdf"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You'll notice that in this case the returned value is a ``DataFrameGroupBy`` object. Many of the methods available on a DataFrame will also work on the ``DataFrameGroupBy`` object."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th></th>\n",
" <th>Cylinders</th>\n",
" <th>EngineSize</th>\n",
" <th>Horsepower</th>\n",
" <th>Invoice</th>\n",
" <th>Length</th>\n",
" <th>MPG_City</th>\n",
" <th>MPG_Highway</th>\n",
" <th>MSRP</th>\n",
" <th>Weight</th>\n",
" <th>Wheelbase</th>\n",
" </tr>\n",
" <tr>\n",
" <th>Origin</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th rowspan=\"8\" valign=\"top\">Asia</th>\n",
" <th>count</th>\n",
" <td>156.000000</td>\n",
" <td>158.000000</td>\n",
" <td>158.000000</td>\n",
" <td>158.000000</td>\n",
" <td>158.000000</td>\n",
" <td>158.000000</td>\n",
" <td>158.000000</td>\n",
" <td>158.000000</td>\n",
" <td>158.000000</td>\n",
" <td>158.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>5.185897</td>\n",
" <td>2.774051</td>\n",
" <td>190.702532</td>\n",
" <td>22602.177215</td>\n",
" <td>182.816456</td>\n",
" <td>22.012658</td>\n",
" <td>28.265823</td>\n",
" <td>24741.322785</td>\n",
" <td>3319.316456</td>\n",
" <td>105.886076</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>1.269008</td>\n",
" <td>0.902310</td>\n",
" <td>59.392627</td>\n",
" <td>9842.984880</td>\n",
" <td>12.564148</td>\n",
" <td>6.733307</td>\n",
" <td>6.770503</td>\n",
" <td>11321.069675</td>\n",
" <td>717.842132</td>\n",
" <td>7.735249</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>3.000000</td>\n",
" <td>1.300000</td>\n",
" <td>73.000000</td>\n",
" <td>9875.000000</td>\n",
" <td>153.000000</td>\n",
" <td>13.000000</td>\n",
" <td>17.000000</td>\n",
" <td>10280.000000</td>\n",
" <td>1850.000000</td>\n",
" <td>89.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25%</th>\n",
" <td>4.000000</td>\n",
" <td>2.000000</td>\n",
" <td>142.000000</td>\n",
" <td>16271.500000</td>\n",
" <td>175.000000</td>\n",
" <td>18.000000</td>\n",
" <td>25.000000</td>\n",
" <td>17208.000000</td>\n",
" <td>2751.500000</td>\n",
" <td>102.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>6.000000</td>\n",
" <td>2.600000</td>\n",
" <td>187.500000</td>\n",
" <td>20949.500000</td>\n",
" <td>183.500000</td>\n",
" <td>20.500000</td>\n",
" <td>27.000000</td>\n",
" <td>23032.500000</td>\n",
" <td>3280.000000</td>\n",
" <td>105.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75%</th>\n",
" <td>6.000000</td>\n",
" <td>3.500000</td>\n",
" <td>233.750000</td>\n",
" <td>26645.000000</td>\n",
" <td>191.000000</td>\n",
" <td>24.000000</td>\n",
" <td>31.000000</td>\n",
" <td>28787.500000</td>\n",
" <td>3706.750000</td>\n",
" <td>110.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>8.000000</td>\n",
" <td>5.600000</td>\n",
" <td>340.000000</td>\n",
" <td>79978.000000</td>\n",
" <td>224.000000</td>\n",
" <td>60.000000</td>\n",
" <td>66.000000</td>\n",
" <td>89765.000000</td>\n",
" <td>5590.000000</td>\n",
" <td>140.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th rowspan=\"8\" valign=\"top\">Europe</th>\n",
" <th>count</th>\n",
" <td>123.000000</td>\n",
" <td>123.000000</td>\n",
" <td>123.000000</td>\n",
" <td>123.000000</td>\n",
" <td>123.000000</td>\n",
" <td>123.000000</td>\n",
" <td>123.000000</td>\n",
" <td>123.000000</td>\n",
" <td>123.000000</td>\n",
" <td>123.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>6.235772</td>\n",
" <td>3.206504</td>\n",
" <td>251.894309</td>\n",
" <td>44395.081301</td>\n",
" <td>181.845528</td>\n",
" <td>18.731707</td>\n",
" <td>26.008130</td>\n",
" <td>48349.796748</td>\n",
" <td>3680.723577</td>\n",
" <td>106.447154</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>1.679585</td>\n",
" <td>1.037157</td>\n",
" <td>80.738884</td>\n",
" <td>23080.368888</td>\n",
" <td>11.479870</td>\n",
" <td>3.289509</td>\n",
" <td>4.167588</td>\n",
" <td>25318.600464</td>\n",
" <td>579.576871</td>\n",
" <td>6.690099</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>4.000000</td>\n",
" <td>1.600000</td>\n",
" <td>100.000000</td>\n",
" <td>15437.000000</td>\n",
" <td>143.000000</td>\n",
" <td>12.000000</td>\n",
" <td>14.000000</td>\n",
" <td>16999.000000</td>\n",
" <td>2524.000000</td>\n",
" <td>93.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25%</th>\n",
" <td>5.000000</td>\n",
" <td>2.500000</td>\n",
" <td>195.500000</td>\n",
" <td>31287.500000</td>\n",
" <td>177.000000</td>\n",
" <td>17.000000</td>\n",
" <td>24.000000</td>\n",
" <td>33837.500000</td>\n",
" <td>3285.000000</td>\n",
" <td>102.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>6.000000</td>\n",
" <td>3.000000</td>\n",
" <td>225.000000</td>\n",
" <td>37575.000000</td>\n",
" <td>182.000000</td>\n",
" <td>19.000000</td>\n",
" <td>26.000000</td>\n",
" <td>40590.000000</td>\n",
" <td>3585.000000</td>\n",
" <td>107.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75%</th>\n",
" <td>8.000000</td>\n",
" <td>4.200000</td>\n",
" <td>302.000000</td>\n",
" <td>51042.500000</td>\n",
" <td>190.000000</td>\n",
" <td>20.000000</td>\n",
" <td>28.500000</td>\n",
" <td>56382.500000</td>\n",
" <td>3959.500000</td>\n",
" <td>110.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>12.000000</td>\n",
" <td>6.000000</td>\n",
" <td>493.000000</td>\n",
" <td>173560.000000</td>\n",
" <td>204.000000</td>\n",
" <td>38.000000</td>\n",
" <td>46.000000</td>\n",
" <td>192465.000000</td>\n",
" <td>5423.000000</td>\n",
" <td>123.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th rowspan=\"8\" valign=\"top\">USA</th>\n",
" <th>count</th>\n",
" <td>147.000000</td>\n",
" <td>147.000000</td>\n",
" <td>147.000000</td>\n",
" <td>147.000000</td>\n",
" <td>147.000000</td>\n",
" <td>147.000000</td>\n",
" <td>147.000000</td>\n",
" <td>147.000000</td>\n",
" <td>147.000000</td>\n",
" <td>147.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>6.108844</td>\n",
" <td>3.642857</td>\n",
" <td>212.823129</td>\n",
" <td>25949.340136</td>\n",
" <td>193.952381</td>\n",
" <td>19.074830</td>\n",
" <td>26.013605</td>\n",
" <td>28377.442177</td>\n",
" <td>3769.952381</td>\n",
" <td>112.020408</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>1.531090</td>\n",
" <td>1.194221</td>\n",
" <td>63.748618</td>\n",
" <td>10518.722194</td>\n",
" <td>15.305265</td>\n",
" <td>3.982992</td>\n",
" <td>5.396582</td>\n",
" <td>11711.982506</td>\n",
" <td>855.305524</td>\n",
" <td>8.788590</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>4.000000</td>\n",
" <td>1.600000</td>\n",
" <td>103.000000</td>\n",
" <td>10319.000000</td>\n",
" <td>150.000000</td>\n",
" <td>10.000000</td>\n",
" <td>12.000000</td>\n",
" <td>10995.000000</td>\n",
" <td>2348.000000</td>\n",
" <td>93.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25%</th>\n",
" <td>5.000000</td>\n",
" <td>2.700000</td>\n",
" <td>160.000000</td>\n",
" <td>18927.000000</td>\n",
" <td>185.000000</td>\n",
" <td>17.000000</td>\n",
" <td>22.000000</td>\n",
" <td>20315.000000</td>\n",
" <td>3234.000000</td>\n",
" <td>106.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>6.000000</td>\n",
" <td>3.600000</td>\n",
" <td>200.000000</td>\n",
" <td>23217.000000</td>\n",
" <td>194.000000</td>\n",
" <td>18.000000</td>\n",
" <td>26.000000</td>\n",
" <td>25520.000000</td>\n",
" <td>3606.000000</td>\n",
" <td>111.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75%</th>\n",
" <td>8.000000</td>\n",
" <td>4.600000</td>\n",
" <td>250.000000</td>\n",
" <td>30778.000000</td>\n",
" <td>201.500000</td>\n",
" <td>21.000000</td>\n",
" <td>29.000000</td>\n",
" <td>33767.500000</td>\n",
" <td>4235.000000</td>\n",
" <td>115.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>10.000000</td>\n",
" <td>8.300000</td>\n",
" <td>500.000000</td>\n",
" <td>74451.000000</td>\n",
" <td>238.000000</td>\n",
" <td>29.000000</td>\n",
" <td>37.000000</td>\n",
" <td>81795.000000</td>\n",
" <td>7190.000000</td>\n",
" <td>144.000000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Cylinders EngineSize Horsepower Invoice Length \\\n",
"Origin \n",
"Asia count 156.000000 158.000000 158.000000 158.000000 158.000000 \n",
" mean 5.185897 2.774051 190.702532 22602.177215 182.816456 \n",
" std 1.269008 0.902310 59.392627 9842.984880 12.564148 \n",
" min 3.000000 1.300000 73.000000 9875.000000 153.000000 \n",
" 25% 4.000000 2.000000 142.000000 16271.500000 175.000000 \n",
" 50% 6.000000 2.600000 187.500000 20949.500000 183.500000 \n",
" 75% 6.000000 3.500000 233.750000 26645.000000 191.000000 \n",
" max 8.000000 5.600000 340.000000 79978.000000 224.000000 \n",
"Europe count 123.000000 123.000000 123.000000 123.000000 123.000000 \n",
" mean 6.235772 3.206504 251.894309 44395.081301 181.845528 \n",
" std 1.679585 1.037157 80.738884 23080.368888 11.479870 \n",
" min 4.000000 1.600000 100.000000 15437.000000 143.000000 \n",
" 25% 5.000000 2.500000 195.500000 31287.500000 177.000000 \n",
" 50% 6.000000 3.000000 225.000000 37575.000000 182.000000 \n",
" 75% 8.000000 4.200000 302.000000 51042.500000 190.000000 \n",
" max 12.000000 6.000000 493.000000 173560.000000 204.000000 \n",
"USA count 147.000000 147.000000 147.000000 147.000000 147.000000 \n",
" mean 6.108844 3.642857 212.823129 25949.340136 193.952381 \n",
" std 1.531090 1.194221 63.748618 10518.722194 15.305265 \n",
" min 4.000000 1.600000 103.000000 10319.000000 150.000000 \n",
" 25% 5.000000 2.700000 160.000000 18927.000000 185.000000 \n",
" 50% 6.000000 3.600000 200.000000 23217.000000 194.000000 \n",
" 75% 8.000000 4.600000 250.000000 30778.000000 201.500000 \n",
" max 10.000000 8.300000 500.000000 74451.000000 238.000000 \n",
"\n",
" MPG_City MPG_Highway MSRP Weight Wheelbase \n",
"Origin \n",
"Asia count 158.000000 158.000000 158.000000 158.000000 158.000000 \n",
" mean 22.012658 28.265823 24741.322785 3319.316456 105.886076 \n",
" std 6.733307 6.770503 11321.069675 717.842132 7.735249 \n",
" min 13.000000 17.000000 10280.000000 1850.000000 89.000000 \n",
" 25% 18.000000 25.000000 17208.000000 2751.500000 102.000000 \n",
" 50% 20.500000 27.000000 23032.500000 3280.000000 105.000000 \n",
" 75% 24.000000 31.000000 28787.500000 3706.750000 110.000000 \n",
" max 60.000000 66.000000 89765.000000 5590.000000 140.000000 \n",
"Europe count 123.000000 123.000000 123.000000 123.000000 123.000000 \n",
" mean 18.731707 26.008130 48349.796748 3680.723577 106.447154 \n",
" std 3.289509 4.167588 25318.600464 579.576871 6.690099 \n",
" min 12.000000 14.000000 16999.000000 2524.000000 93.000000 \n",
" 25% 17.000000 24.000000 33837.500000 3285.000000 102.000000 \n",
" 50% 19.000000 26.000000 40590.000000 3585.000000 107.000000 \n",
" 75% 20.000000 28.500000 56382.500000 3959.500000 110.000000 \n",
" max 38.000000 46.000000 192465.000000 5423.000000 123.000000 \n",
"USA count 147.000000 147.000000 147.000000 147.000000 147.000000 \n",
" mean 19.074830 26.013605 28377.442177 3769.952381 112.020408 \n",
" std 3.982992 5.396582 11711.982506 855.305524 8.788590 \n",
" min 10.000000 12.000000 10995.000000 2348.000000 93.000000 \n",
" 25% 17.000000 22.000000 20315.000000 3234.000000 106.000000 \n",
" 50% 18.000000 26.000000 25520.000000 3606.000000 111.000000 \n",
" 75% 21.000000 29.000000 33767.500000 4235.000000 115.000000 \n",
" max 29.000000 37.000000 81795.000000 7190.000000 144.000000 "
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"grpdf.describe()"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th></th>\n",
" <th>MSRP</th>\n",
" <th>Horsepower</th>\n",
" </tr>\n",
" <tr>\n",
" <th>Origin</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th rowspan=\"8\" valign=\"top\">Asia</th>\n",
" <th>count</th>\n",
" <td>158.000000</td>\n",
" <td>158.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>24741.322785</td>\n",
" <td>190.702532</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>11321.069675</td>\n",
" <td>59.392627</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>10280.000000</td>\n",
" <td>73.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>20%</th>\n",
" <td>15680.800000</td>\n",
" <td>130.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>23032.500000</td>\n",
" <td>187.500000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>80%</th>\n",
" <td>30995.000000</td>\n",
" <td>240.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>89765.000000</td>\n",
" <td>340.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th rowspan=\"8\" valign=\"top\">Europe</th>\n",
" <th>count</th>\n",
" <td>123.000000</td>\n",
" <td>123.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>48349.796748</td>\n",
" <td>251.894309</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>25318.600464</td>\n",
" <td>80.738884</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>16999.000000</td>\n",
" <td>100.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>20%</th>\n",
" <td>32506.000000</td>\n",
" <td>184.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>40590.000000</td>\n",
" <td>225.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>80%</th>\n",
" <td>64248.000000</td>\n",
" <td>315.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>192465.000000</td>\n",
" <td>493.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th rowspan=\"8\" valign=\"top\">USA</th>\n",
" <th>count</th>\n",
" <td>147.000000</td>\n",
" <td>147.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>28377.442177</td>\n",
" <td>212.823129</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>11711.982506</td>\n",
" <td>63.748618</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>10995.000000</td>\n",
" <td>103.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>20%</th>\n",
" <td>19099.000000</td>\n",
" <td>150.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>25520.000000</td>\n",
" <td>200.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>80%</th>\n",
" <td>36079.000000</td>\n",
" <td>275.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>81795.000000</td>\n",
" <td>500.000000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" MSRP Horsepower\n",
"Origin \n",
"Asia count 158.000000 158.000000\n",
" mean 24741.322785 190.702532\n",
" std 11321.069675 59.392627\n",
" min 10280.000000 73.000000\n",
" 20% 15680.800000 130.000000\n",
" 50% 23032.500000 187.500000\n",
" 80% 30995.000000 240.000000\n",
" max 89765.000000 340.000000\n",
"Europe count 123.000000 123.000000\n",
" mean 48349.796748 251.894309\n",
" std 25318.600464 80.738884\n",
" min 16999.000000 100.000000\n",
" 20% 32506.000000 184.000000\n",
" 50% 40590.000000 225.000000\n",
" 80% 64248.000000 315.000000\n",
" max 192465.000000 493.000000\n",
"USA count 147.000000 147.000000\n",
" mean 28377.442177 212.823129\n",
" std 11711.982506 63.748618\n",
" min 10995.000000 103.000000\n",
" 20% 19099.000000 150.000000\n",
" 50% 25520.000000 200.000000\n",
" 80% 36079.000000 275.000000\n",
" max 81795.000000 500.000000"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"grpdf[['MSRP', 'Horsepower']].describe(percentiles=[0.2, 0.8])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It is also possible to loop through all of the groupings using iteration in Python."
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {
"scrolled": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"=================================== Asia ===================================\n",
"\n",
" Model Make Type Origin DriveTrain MSRP Invoice \\\n",
"0 MDX Acura SUV Asia All 36945.0 33337.0 \n",
"1 RSX Type S 2dr Acura Sedan Asia Front 23820.0 21761.0 \n",
"2 TSX 4dr Acura Sedan Asia Front 26990.0 24647.0 \n",
"3 TL 4dr Acura Sedan Asia Front 33195.0 30299.0 \n",
"4 3.5 RL 4dr Acura Sedan Asia Front 43755.0 39014.0 \n",
"\n",
" EngineSize Cylinders Horsepower MPG_City MPG_Highway Weight \\\n",
"0 3.5 6.0 265.0 17.0 23.0 4451.0 \n",
"1 2.0 4.0 200.0 24.0 31.0 2778.0 \n",
"2 2.4 4.0 200.0 22.0 29.0 3230.0 \n",
"3 3.2 6.0 270.0 20.0 28.0 3575.0 \n",
"4 3.5 6.0 225.0 18.0 24.0 3880.0 \n",
"\n",
" Wheelbase Length \n",
"0 106.0 189.0 \n",
"1 101.0 172.0 \n",
"2 105.0 183.0 \n",
"3 108.0 186.0 \n",
"4 115.0 197.0 \n",
"\n",
"=================================== Europe ===================================\n",
"\n",
" Model Make Type Origin DriveTrain MSRP \\\n",
"7 A4 1.8T 4dr Audi Sedan Europe Front 25940.0 \n",
"8 A41.8T convertible 2dr Audi Sedan Europe Front 35940.0 \n",
"9 A4 3.0 4dr Audi Sedan Europe Front 31840.0 \n",
"10 A4 3.0 Quattro 4dr manual Audi Sedan Europe All 33430.0 \n",
"11 A4 3.0 Quattro 4dr auto Audi Sedan Europe All 34480.0 \n",
"\n",
" Invoice EngineSize Cylinders Horsepower MPG_City MPG_Highway Weight \\\n",
"7 23508.0 1.8 4.0 170.0 22.0 31.0 3252.0 \n",
"8 32506.0 1.8 4.0 170.0 23.0 30.0 3638.0 \n",
"9 28846.0 3.0 6.0 220.0 20.0 28.0 3462.0 \n",
"10 30366.0 3.0 6.0 220.0 17.0 26.0 3583.0 \n",
"11 31388.0 3.0 6.0 220.0 18.0 25.0 3627.0 \n",
"\n",
" Wheelbase Length \n",
"7 104.0 179.0 \n",
"8 105.0 180.0 \n",
"9 104.0 179.0 \n",
"10 104.0 179.0 \n",
"11 104.0 179.0 \n",
"\n",
"=================================== USA ===================================\n",
"\n",
" Model Make Type Origin DriveTrain MSRP Invoice \\\n",
"46 Rainier Buick SUV USA All 37895.0 34357.0 \n",
"47 Rendezvous CX Buick SUV USA Front 26545.0 24085.0 \n",
"48 Century Custom 4dr Buick Sedan USA Front 22180.0 20351.0 \n",
"49 LeSabre Custom 4dr Buick Sedan USA Front 26470.0 24282.0 \n",
"50 Regal LS 4dr Buick Sedan USA Front 24895.0 22835.0 \n",
"\n",
" EngineSize Cylinders Horsepower MPG_City MPG_Highway Weight \\\n",
"46 4.2 6.0 275.0 15.0 21.0 4600.0 \n",
"47 3.4 6.0 185.0 19.0 26.0 4024.0 \n",
"48 3.1 6.0 175.0 20.0 30.0 3353.0 \n",
"49 3.8 6.0 205.0 20.0 29.0 3567.0 \n",
"50 3.8 6.0 200.0 20.0 30.0 3461.0 \n",
"\n",
" Wheelbase Length \n",
"46 113.0 193.0 \n",
"47 112.0 187.0 \n",
"48 109.0 195.0 \n",
"49 112.0 200.0 \n",
"50 109.0 196.0 \n",
"\n"
]
}
],
"source": [
"for name, group in grpdf:\n",
" print('=' * 35, name, '=' * 35)\n",
" print('')\n",
" print(group.head())\n",
" print('')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There are far too many grouping features in Pandas to cover here. For more information, see the [Pandas Group By documentation](http://pandas.pydata.org/pandas-docs/stable/groupby.html)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Plotting\n",
"\n",
"There are several packages for creating plots in Python. These include [matplotlib](http://matplotlib.org), [seaborn](https://stanford.edu/~mwaskom/software/seaborn/), [bokeh](http://bokeh.pydata.org/en/latest/), [plot.ly](https://plot.ly), or even Pandas itself. Many of these packages such as seaborn and the Pandas plotting features use matplotlib in the background. Packages like bokeh and plot.ly are primarily focused on graphics that are rendered in a web browser.\n",
"\n",
"The most basic plotting features can be accessed in the ``plot`` method of the DataFrame. To start, we can create a scatter plot of the MSRP values."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"%matplotlib inline\n",
"# lines starting with % are IPython magic functions \n",
"# in this case it tells IPython to render plots inline as an image on the page"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.axes._subplots.AxesSubplot at 0x7f777c6c5dd8>"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAuEAAAF3CAYAAAALl5VRAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzs3Xt8XXWd7//3d619SZr0ElKotGm5TECmZdoo0YpFzlDm\noljqzK9QHUEcB3ScA44zKi2OBxmG85sDZdSjwoyDlzlW8VLa0VZwZkSKB1uhGjCNtCpEBtq0DtCY\nlqZNd/be63v+2Hsn+7L2Ldm3JK/n49FJsvZa3/XdK8H55JvP9/Mx1loBAAAAqB2n3hMAAAAAZhqC\ncAAAAKDGCMIBAACAGiMIBwAAAGqMIBwAAACoMYJwAAAAoMYIwgEAAIAaIwgHAAAAaowgHAAAAKgx\ngnAAAACgxgL1nsBkzJ8/35599tn1ngYAAACmuSeffPKItfb0So03pYPws88+Wz09PfWeBgAAAKY5\nY8wLlRyPdBQAAACgxgjCAQAAgBojCAcAAABqjCAcAAAAqDGCcAAAAKDGCMIBAACAGiMIBwAAAGqM\nIBwAAACoMYJwAAAAoMaqGoQbY543xvzMGNNrjOlJHjvNGPOwMebZ5Me25HFjjPmMMabfGNNnjHlt\nNecGYGYZHI5o78GjGhyO1HsqAADUpG39ZdbaI2lf3yLpEWvtncaYW5Jfb5T0FknnJf+tlPRPyY8A\nMCnbew9p47Y+BR1HUc/TpnXLtbZrUb2nBQCYweqRjvI2SV9Ofv5lSX+UdnyzTXhC0jxjzJl1mB+A\naWRwOKKN2/p0KurpeCSmU1FPG7b1sSIOAKiragfhVtL3jDFPGmPelzy2wFr7a0lKfjwjeXyRpINp\n1w4kj2UwxrzPGNNjjOl5+eWXqzh1ANPBwNCIgk7m/9QFHUcDQyN1mhEAANVPR1llrT1sjDlD0sPG\nmF8UONf4HLM5B6y9T9J9ktTd3Z3zOgCk62hrVtTzMo5FPU8dbc11mhEAAFVeCbfWHk5+fEnStyS9\nXtKLqTST5MeXkqcPSFqcdnmHpMPVnB+A6a+9NaxN65arKehodjigpqCjTeuWq701XO+poQFMdsNu\nvuvZCAygmKqthBtjWiQ51trjyc//QNLfSdoh6d2S7kx+3J68ZIekm4wx31BiQ+axVNoKAEzG2q5F\nWtU5XwNDI+poayYAh6TJb9jNdz0bgQGUoprpKAskfcsYk7rP16y1/26M+YmkLcaY6yUdkHR18vzv\nSrpCUr+kk5LeU8W5AZhh2lvDBN8Yk75h95QS6UobtvVpVef8kn5O8l2/9Mw5kxoXwMxRtSDcWvuc\npBU+xwclXe5z3Eq6sVrzAQAgJbVhNxUoS+MbdksJlvNd33vw6KTGBTBz0DETADDjTHbDbr7ruxbP\nYyMwgJIQhAMAZpzJbtjNd33ngtlsBAZQEpPIApmauru7bU9PT72nAQCYogaHI5PasJvv+smOC6Dx\nGGOetNZ2V2q8WrStBwCgIU12w26+69kIDKAY0lEAAACAGiMIBwAAAGqMIBwAAACoMYJwAAAANLzB\n4Yj2HjyqweFIvadSEWzMBAAAQEPb3ntIG7f1Keg4inqeNq1brrVdi+o9rUlhJRwAAAANa3A4oo3b\n+nQq6ul4JKZTUU8btvVN+RVxgnAAAAA0rIGhEQWdzJA16DgaGBqp04wqgyAcAAAADaujrVlRz8s4\nFvU8dbQ112lGlUEQDgAAgIbV3hrWpnXL1RR0NDscUFPQ0aZ1y6d8Qyw2ZgIAAKChre1apFWd8zUw\nNKKOtuYpH4BLBOEAAACYAtpbw9Mi+E4hHQUAAACoMYJwAAAAoMYIwgEAAIAaIwgHAAAAaowgHAAA\nAKgxgnAAAACgxgjCAQAAgBojCAcAAABqjCAcAAAAqDGCcAAAAKDGCMIBAACAGiMIB4A0g8MR7T14\nVIPDkXpPBQAwjQXqPQEAaBTbew9p47Y+BR1HUc/TpnXLtbZrUb2nBQCYhlgJBwAlVsA3buvTqain\n45GYTkU9bdjWx4o4AKAqCMIBQNLA0IiCTub/JAYdRwNDI3WaEQBgOiMIBwBJHW3NinpexrGo56mj\nrblOMwIATGcE4QAgqb01rE3rlqsp6Gh2OKCmoKNN65arvTVc76kBAKYhNmYCQNLarkVa1TlfA0Mj\n6mhrJgAHAFQNQTgApGlvDRN8AwCqjnQUAAAAoMYIwgEAAIAaIwgHAAAAaowgHAAAAKgxgnAAAACg\nxgjCAQAAgBojCAcAAABqjCAcAAAAqDGCcAAAAKDGCMIBAACAGiMIBwAAAGqMIBwAAACoMYJwAAAA\noMYIwgE0lMHhiPYePKrB4Ui9pwIAQNUE6j0BAEjZ3ntIG7f1Keg4inqeNq1brrVdi+o9LQAAKo6V\ncAANYXA4oo3b+nQq6ul4JKZTUU8btvWxIg4AmJYIwgE0hIGhEQWdzP9JCjqOBoZG6jQjAACqhyAc\nQEPoaGtW1PMyjkU9Tx1tzXWaEQAA1UMQDqAhtLeGtWndcjUFHc0OB9QUdLRp3XK1t4brPTUAACqO\njZkAGsbarkVa1TlfA0Mj6mhrJgAHAExbBOEAGkp7a5jgGwAw7ZGOAgAAANQYQTgAAABQYwThAAAA\nQI0RhAMAAAA1RhAOAAAA1FjVg3BjjGuM+akx5sHk1+cYY/YYY541xnzTGBNKHg8nv+5Pvn52tecG\nAAAA1EMtVsI/KOnnaV/fJelT1trzJA1Juj55/HpJQ9baTkmfSp4HAAAATDtVDcKNMR2S3irpC8mv\njaTVkrYmT/mypD9Kfv625NdKvn558nwAAABgWqn2Svj/lrRBkpf8ul3SUWttLPn1gKRFyc8XSToo\nScnXjyXPBwAAAKaVqgXhxpg1kl6y1j6ZftjnVFvCa+njvs8Y02OM6Xn55ZcrMFMAAACgtqq5Er5K\n0lpjzPOSvqFEGsr/ljTPGBNIntMh6XDy8wFJiyUp+fpcSb/JHtRae5+1ttta23366adXcfoAAABA\ndVQtCLfWftRa22GtPVvSOyTttNZeI+lRSVclT3u3pO3Jz3ckv1by9Z3W2pyVcAAAAGCqq0ed8I2S\nPmSM6Vci5/uLyeNflNSePP4hSbfUYW4AAABA1QWKnzJ51tofSPpB8vPnJL3e55xTkq6uxXyAmWxw\nOKKBoRF1tDWrvTVc7+nkqOb8KjF2oz8/AMDUUJMgHEBj2N57SBu39SnoOIp6njatW661XYuKX1gj\n1ZxfJcZu9OcHAJg6aFsPzBCDwxFt3NanU1FPxyMxnYp62rCtT4PDkXpPTVJ151eJsRv9+QEAphaC\ncGCGGBgaUdDJ/E8+6DgaGBqp04wyVXN+lRi70Z8fAGBqIQgHZoiOtmZFPS/jWNTz1NHWXKcZZarm\n/CoxdqM/PwDA1EIQDswQ7a1hbVq3XE1BR7PDATUFHW1at7xhNhdWc36VGLvRnx8AYGoxU7kUd3d3\nt+3p6an3NIAppdGre1AdBQDQiIwxT1pruys1HtVRgBmmvTXc0MFjNedXibEb/fkBAKYG0lEAAACA\nGiMIBwAAAGqMIBwAAACoMYJwAAAAoMYIwgEAAIAaIwgHAAAAaowgHAAAAKgxgnAAAACgxgjCAQAA\ngBojCAcAAABqjCAcAAAAqDGCcAAAAKDGCMIBAACAGiMIBwAAAGqMIBwAAACoMYJwAAAAoMYIwgEA\nAIAaIwgHAAAAaowgHAAAAKgxgnAAAACgxgL1ngBQL4PDEQ0MjaijrVntreGqjN0ScnViNF72Pao5\nNwAAUH8E4ZiRtvce0sZtfQo6jqKep03rlmtt16KKjm09q0jcqimY+INTqfeo5twAAEBjIB0FM87g\ncEQbt/XpVNTT8UhMp6KeNmzr0+BwpKJjR+JWknQq6pV8j2rODQAANA6CcMw4A0MjCjqZP/pBx9HA\n0EhVxk5xHaNHf/FSwYC6mnMDAACNgyAcM05HW7OinpdxLOp56mhrrsrYKScicd22Y59W3bVTO3oP\n1XxuAACgcRCEY8Zpbw1r07rlago6mh0OqCnoaNO65RXZAJk+dtg1kqRwYPw/sxOj8YIpJtWcGwAA\naBzGWlvvOUxYd3e37enpqfc0MEXVqjpK78Gjum3HPp0YjY+9Pjsc0FdvWKkVi+fVfG4AAKB8xpgn\nrbXdlRqP6iioiXoGlfnu3d4annDZQEkF30/62G0tIX3s2z/LeL1Yiknq2lQuOIE4AADTC0E4qq6e\nJfcqde/0cUaiMRlj1BRwSxpzV/8ReWl/cAo4KppiQplCAACmN3LCUVX1LLlXqXtnjxPzpGjcljRm\n6tpofDwKdx1HqzrnV33eAACgcRGEo6rqWXKvUvcuVHaw2Jh+14bcwnOo1zMbHI5o78GjBPsAANQA\n6SioqnqW3KvUvQuVHSw25kTmUI9nRvoLAAC1xUo4qqqeJfcqde/21rDWX9SRccwxKmnMicyhvTWs\n9d2Z91vf3VG1Z0b6CwAAtcdKOKpubdcireqcX5fqKJW49+BwRFueHMg4FnQd3XvNa7Vs4ZyiY5Y7\nh8HhiLb0ZN5vS8+APnj5+VV5dqn0l1MaX31Ppb9QlQUAgOogCEdNlFsOsJHu7RekhlxHc5uDJY9b\nzhxqHRRPNv2FmuYAAJSPIBwootY52rW+XyplZkNWTngpATW55AAATAxBOFDEZILUqXA/aWJpO+m5\n5KlV+w3b+rSqcz4r4gAAFEEQDmTxS6+YTG55sXSNweGI9h0+JsmM5ZjXI4++3LSdyabNkMYCAJjJ\nCMKBNIXSKyaSW14sXWN77yF95IG9Y818Ao70yfVdWtu1qK559KXoaGvWSDSWcWwkGispbYY0FgDA\nTFe0RKExxjXG3F2LyQD1VOlSfcXGGxyOaMPWzG6aMU+6eeveKVMe0BhT8Gs/lEQEAKCEINxaG5d0\nkSnl/7sCU5hfp0rHmGSqSGXGS+98OTA0ItfJ/c/KNbXpKDpZA0Mjagq4GceaAm7RuZfbEZROngCA\n6ajUdJSfStpujHlA0onUQWvtv1ZlVkAd+FUlOTka13s39+juq1aUnS5RrMpJR1uz4p7NuS5ua9NR\ndLImWsWlnOtIWwEATFeldsw8TdKgpNWSrkz+W1OtSQH1kKpKEg5k/mcRidkJpUsU65bZ3hrW3Vct\nV9AdXw0PONLdV61o6FzwlIl2JC31OtJWAADTWUkr4dba91R7IkAjWNu1SPNmhfT+rzypk9H42PFC\nVT8KVfkoVuUk9Xp2dZSpYqJVXEq5jk6eAIDprKQg3BhzvqR/krTAWnuhMWa5pLXW2v9Z1dlhxmmE\nsnXLFs6Rp8w0kcmkSxSrctLeGtal559RmcnXwUSruBS7zi9tJRKLqyXk+p4/2Z+dRvjZAwDMHKWm\no3xe0kclRSXJWtsn6R3VmhRmpu29h7Tqrp269gt7tOqundrRe6gu8yBdojGkfx/CyZQdxzFac8+u\nnJ+Nyf7sNMrPHgBg5ih1Y+Ysa+2PswqkxPKdDJSr0bovki7RGNZ2LdLSM+fois/ukmR1Kpr7szHZ\nn51G+9kDAMwMpa6EHzHG/JaU+Bu9MeYqSb+u2qww45Rbtq6QSpW0a28Na8XieXkDsYlWB0F5TozG\nFXYLl3qczM9OJX/2AAAoVakr4TdKuk/SBcaYQ5L+U9I1VZsVZpxKBbS1LGmXSpfYkHU/Vk8rq5RS\nj5P52eGXKQBAPZS0Em6tfc5a+3uSTpd0gbX2EmvtC9WdGmaSiZa7S1ePHO21XYu0e+NqffWGldq9\ncTU1rKuglFKPk/nZqcTPHgAA5TLW5jYLyTnJmF9JekLSDyU9Zq3dX+2JlaK7u9v29PTUexqooMlU\nqNh78Kiu/cIeHY+Mb1eYHQ7oqzes1IrF8yo9VdRYsZ8NqqMAAKrJGPOktba7UuOVmo6yVNJKSW+S\n9A/GmAsk7bXW/nGlJoL6q3YQUsr4Ey13J5FWMN2VUupxMj+3k70eAIBylBqEx5UoTxiX5El6UdJL\n1ZoUaq/audS1yNUmRxsAAEwVpaajnJT0M0mflPR9a+1gCdc0SXpMUliJYH+rtfY2Y8w5kr4h6TRJ\nT0l6l7V21BgTlrRZ0kWSBiW93Vr7fKF7kI5SGYPDEa26a+dY+TdJago62r1xdUUCWL/xQwFH3/3A\nJepcMHvS4/vdL3vFvRFSDSYzh3KubYT3Op2V+3wn+v3g+wgAjaVe6Sh/IukSSf9d0g3GmB8pkRv+\nSIFrIpJWW2uHjTFBSbuMMf8m6UOSPmWt/YYx5nOSrleiG+f1koastZ3GmHdIukvS2yf2tlCOcupd\nTyQvd2BoRAEno8a8RmOervjMD/UPV68ouCJeSiCSfU76fSVpV/+Riq3CZ98r/evUPf3mev8TL+j2\nB/cr5BrFPFvWHMr5K0Ip5xLcTVy5f9GZ6F+AalnlBwBQHyWthI+dnMgFf4ukv5J0hrW2pGRbY8ws\nSbsk/YWkhyS9ylobM8ZcLOlvrbV/aIz5j+TnjxtjApL+S9LptsAEWQmvjFJXwosFBvlev/+JF/Sx\nbz/te+9CK+6lBCJ+51hp7NhoPC7PStH4+I/RRFf5s++1vrtDW3oGFHQcjURjMsaoKeDmzNXv/Zc6\nh3L+SlHKuQR3E1fuX4wm+hemav9lCgAwMZVeCS+pRKExZluyQsqnJbVKuk5SWwnXucaYXiXyxx+W\n9CtJR621qfIVA5JSEcAiSQclKfn6MUntpb8VTFQpJdqKlf/L93r/i8d1x0P5i+nka4pSSrlBv3Nu\n3rpXG7aOH4vEbEYAXuiehfjda/PjB8a+jnmJQD97roPDEd3+nX0547mOKWkO5TSSKXZuPUo4Tifl\nNvWZaBMgmgcBwMxQajrKnZKestbGyxk8eX6XMWaepG9J+m2/05IfTYHXxhhj3ifpfZK0ZMmScqaD\nAoq1aS+WspLv9d6DR3OOp8tXvaSUFBm/c1zj+P8klXDPQvzuVUh60BR0EyvyGXOI25LmUE7Fl2Ln\nlpN2hFzlVt+ZaLUeqvwAwMxQatv6Xkk3GmO2Jv99IJnnXRJr7VFJP5D0BknzkukmktQh6XDy8wFJ\niyUp+fpcSb/xGes+a223tbb79NNPL3UKKEGhNu0T7VrYtXheznFJagm7OSvu6e3mSwlE/M6JW09x\nL/N3N9dIIdfxvWep/O5VSGquHW3NivtkVN125dKS5lBOI5li5xLcTU65TX0m2gSI5kEAMDOUWh3l\nC5KCkr6cPPQuSXFr7Q0FrjldUtRae9QY0yzpe0pstny3pG1pGzP7rLX/aIy5UdLvWGvfn9yY+f9Z\na9cXmhc54bW1o/dQTvm/9HzifK9nH7/1rUt14aK5GSvufrnKkgrez++e67s79LU9BxRLxpquY2Rk\nFQ64isY93XblMl3zhrMq8v5TOeGuMToVjcsYqTkYyJnrjt5D+vADe8fSYlwjfertXWXlYpezQbUl\n5OrEaNz33GLfw4kodXPqRMZrxMCT6igAMDNVOie81CB8r7V2RbFjWa8vVyJod5VYcd9irf07Y8y5\nGi9R+FNJ11prI8mShl+R9BolVsDfYa19rtC8CMJrb6JdCwtdV2gjmlQ8qEsPPtfcsytjnGyT3eCW\n/T7Sq55E41Y3Xdapd65ckvPe33jnI4rEJr85NJ9yNlxWMrhLv2+hzamVeB8EpQCAeqpXicK4Mea3\nrLW/Sk7iXCUa9+Rlre1TIqDOPv6cpNf7HD8l6eoS54M6mWjXwkLXFcpVzpce4zf23iL55+njTjSI\nS38fg8MR3fHQfo3GPI0mtxrf+4N+vXNl5l6FgaERhVxXkVhs7NhES0D6Sd9wmXrvG7b1aVXn/LK/\nF+Xwu69kFY3His6h3PdRyTKTtcAvDACAYkoNwm+W9Kgx5jkltr2dJek9VZsVZpRK5SqXkrddyRzo\nUjc6lvr+Jlo+sF4bLottVi13Dvnex77Dr5T1S0a9UQYSAFCKkjZmJpvynCfpL5P/Xm2tfbSaE0N9\npG+OrNU9KrURzW+c6y5eUva4pT6DUoPrSpSArMQ8Kq3YLz3lziHf+5DslCnZRxlIAECpSloJT+Zr\n/3clumZaST80xnwumUKCaaIWK3j57lGsRGKp/Mb54OXnlzxuOc8gFVxnb3T0u8dkS0AWUs48Kin7\nvn454eXMId/7WLZw7pSp6kIZSABAqUrdmLlF0nFJX00e+hNJbdbauuZwszFz8gptaqz05sHJbsCs\ntsl0OKxERZDJdkmsVx5yLaqjVKOqSzXQ7RIApq96bcx8dVYllEeNMXsrNQnUR/qqbyQWl+Nkdrmp\n9ApevlXC+/cc0D/+oL/uAdZEVzErsdGxEqvZldpwWa7s+1biWWSPUam/lFRbvf4qAQCYekoNwn9q\njHmDtfYJSTLGrJS0u3rTQrX5VrbIau9e6T/5++X8jsY93fvos4rEbN033dW7mc1UCTQnYzKr9fX6\nJaNcM+H7CACYvFI7Zq6U9CNjzPPGmOclPS7pvxljfmaM6ava7FA1qVXfdGHXKBQobRPjRDZwZm9Q\nDAccrXvtIgUKbLordp9y5lHs3EboVFioa+lUt733kFbdtVPXfmGPVt21Uzt6D9V7SlUznb+PAIDK\nKHUl/M1VnQVqzm/V1zhGD910Sd5uiymT2cCZWiW8f88B3fvos9qx97BOjGaWnE+tPhe7TznzKPVc\nVjGro9xa5gAATHellih8QdJiSauTn5+Q5FhrX0h+jSkm36pv54LZBVfwKlWC7R9/0K9IzGo4Mh6A\nt4TcsXlIKnifcuZR7pxZxaw8v7+8NGqZQQAAaqHUEoW3SeqW9GpJ/yIppESllFXVmxqqbSKrvpUo\nweY3RkvY1e1XLtNlF5yRt/tl+n3KmQdl4+qv3vn2AAA0mlJzwv9Y0lolVsBlrT0saXa1JoXaKXfV\nN9/myvRgqv/F49rac1D9Lx4veYxY3NMZcwp3mUy/TzlBXSnnVjL3HLmK5dvzfAEAM02pOeGj1lpr\njLGSZIxpqeKc0MBSwdSHH9iraLKaStzztLv/iNZ2LdLHv/0zbX7iwNj51128RH/3tt/xHSO9yYtn\npRvv/2lGvnah+5RTCq7YuZXMPUd++f7ywvMFAMxEpTbr+YgSbet/X9L/kvRnkr5mrf1sdadXGM16\n6qP/xeN6y2ceUzRtP2VT0NFX/+z1uuqfn8g5//t/fak6F+T+4WRwOKJ9h1/Rezf3KBLzb+Dzxjsf\nUSRmc15LX0EtNZ3G79xizVVovlJdPF8AwFRRl2Y91tp/MMb8vqRXlMgL/7i19uFKTQJTx/beQ7p5\na19GAC4lcqwfe/aI7zW7+l/2rbjS3hrW3OagQq6TEYSnb9gLua4isVjGa/sOv6K5zcGM8VLnl9tU\np1i+OPnk1cXzBQDMVKVuzGyRtNNa+7Ax5tWSXm2MCVpro9WdHhpJqsrIaMzLeS3qebr0vPn6zM7+\nnNf+/rs/VzgQ8E01KJavnf3aSDSm927uUchNpC6sv6hDW54cmHAqQ7H7s6Gwuni+AICZqtSNmY9J\nChtjFkn6vqT3SPo/1ZoUGpNfmTlJCrlGm9YtV/c57bru4iUZrxlJo3HlLQ1YaMNebnMfI2OMIrHx\nUoObnziQUXrw5q179dgzL5W8wa+9Naxb37pUIddklEhMrcI2QgOf6YznCwCYqUrNCX/KWvtaY8wH\nJDVbazcZY35qrX1N9aeYHznhteWXvxsKOPruBy7JyPnuf/G4eg8eVVPQ1Uf/9Wc6HhlPJ5kdDuir\nN6zUisXzcsbOl9udeu3YyKhuvP+nGeP5mRVy5Vlb0qp4alNgwDEajVvdduVSXbPyLN/3TgOf6uH5\nAgAaXV1ywhP3NRdLukbS9WVeiynILyhKrVrevHWvXOMobj3dfdXynE2XnQtmq3PBbA0OR0pONcjO\n186+f2qTZPZ4fk4mO3AW68iY3sQn5Y4H9+vNy16Vc41fPjkqh+cLAJhpSk1H+aCkj0r6lrV2nzHm\nXEmPVm9aqKftvYe06q6duvYLe7Tqrp3a0Xto7LXE301MIs8k8X/ymmiqQb77+4133cVL1BR0NCvo\n5oxTrCMjXRwBAEC9FF3NNsa4kq601q5NHbPWPifpL6s5MdRH+upwqmJFakVZSrSST69kUmy1udyu\nnIXu394a9h3vg5ef71vqsNgGPzYFAgCAeim6Em6tjUu6qAZzQQPwWx12HaNHf/GS9h0+poCTufpd\naOU41QVRUsldOUtZnc7u8tneGtal55+uu68qb9WdTYEAAKBeSs3r/qkxZoekB5RsXS9J1tp/rcqs\nUDd+q8MnInHdtmOfIrG4sqsT5ls5nmgXxMmsTpe76j7RawAAACar1Jzw0yQNSlot6crkvzXVmhTq\nJ311uCU8nmd9YjQ3AJekW9cs9a1mkkopyVeasJT7T2R1OnuVvFrXlCP1F4FSyyYCAIDpr9SOme+p\n9kTQOFKrw4/+4iXdtmOfTozGfc9rCbm6cOHcnOOT7YI4nVanJ/oXAQAAML2VtBJujOkwxnzLGPOS\nMeZFY8w2Y0xHtSeH+mlvDeuyC85QvEAd+bi16mhr1uBwRI898/JYkxy/lJJI3FNLKLeCSaH7V3N1\nuhYm8xcBAAAwvZWajvIvknZIWihpkaTvJI9hGstODQm6RgFHGWkiu/qPaOXff1/XfenHuu5LP9Eb\n/tcj2t1/ZOy6pmDiR8xYqzX37MoodzjdUQIRAADkU2rHzF5rbVexY7VGx8zaSG+cIynj8zfe+Ygi\nscyfoXDA0Y9uWa2hE6O64jM/1Gh8/PWmoKPdG1dP6RXuUvl1GJ1J7x8AgOmk0h0zS10JP2KMudYY\n4yb/XavERk3MAOmpIanPJenRX7wk49OwxzFGA0MjOjEaVziQmYJSqZXg7M2O2Skxhc6tFUogAgCA\nfEotUfhnku6R9Cklmib+KHkMM1Bqs2HAMTrlUzJlJBrX04eO6c0XvqoqzXCyNzuu7+7Q1/YcGKve\nEnSNPnH1Cq3tWlT3jZHTaZMpAAConJLSURoV6Si155di4SeVdrG7/4g25AmC09NcSg1OS71/OODo\noQ9cojX37CIdBAAATFql01EKroQbYz6rxMq3L2stretnGL/yg01BR9aziqTlfqfSTvKtBE90hdrv\n/n5cx6iJzupNAAAgAElEQVT34NFJlUoEAAColmLpKOnLzLdLuq2Kc0EVTGS12e9aKREAt4TcnBQT\na61kjNJ/X0tPO0nlkqePmyrdlwqQN2zr06rO+UXn6Ff+0E/cs+paPK8q6TAAAACTVTAIt9Z+OfW5\nMeav0r9G45tMPnT6tadicVlr1RwMjOVgb+kZyBhXUk7aSb6AejLNfFKbHTcUyQm/+6rl6lwwO+dc\nNkYCAIBGUHJOuDHmKWvta6s8n7KQE57fZMrjFcu7bgo6evCmS3RiNJ6xwl7qqnslSvdl32twOKJ9\nh1+RZLVs4dyMOe07fEyS0bKFcwjAAQDAhNQ0JxxTl99qs2OM9h0+pmUL5+YNTAeHI8nSg/kFHUcn\nRuNjpQpTstNOUuNl3yt7NXs07unG3+0s6/353Wtuc7AieecAAADVVnAl3BhzXOOJvrMknUy9JMla\na+dUd3qFsRKeX77VbNdIMkZxL/FtDTjSJ9d3ZZTzK7XyiaSCK9/bew/pIw/sVTSee6/UHO/fc0D3\nPvqsQq474UDZL9heeuYcXfHZXRqNFV5tn0zOfLZKjgUAABpLpVfCKVE4je3oPaSbt/Yp4lPLO104\nYPTQB96UU84vn///jy9UazhQcJV5cDiiN965M+fe4YDRj265fCyFpBJpKdljBBzJGDMW/KfMDgf0\n1RtWjq3gV3KlnFV3AACmt3p1zMQUtLZrkT5/XbdmBd2C57nGUe/Bo3JNoSSUhOaAo8Vts8ZWzI9H\nYjoV9bRhW19GR8qBoRG5Tu54rhnvmJlKmUlXbkdNvzFinnICcGm8Mkqqu+aGrZnv4eatfXrsmZfL\n7qyZXu0l3/OYCurVWRQAgJmInPBpbtnCOfLyl3qXJMWtlyjnFy++Cp4YyxatbtLR1jyW8pJ9r1SJ\nQL9yg+WWECy1ZGEokGgZv6v/iDZu65Mjk7NKH4l5ev9XnpQnW9ZK9mSqvTQKVvIBAKgtVsKnudQm\nyKago9nhgAKOMlaoA45091Ur1Llgtm67clnBsVLnLls4t2jw3N4a1t1XLVfQzb1XKjDNnltT0Cm7\nhGD2GOGAybinJIVco+9+4BKt6pw/tmJ9Mhr3He9kNF72SnYlfpmop+mykg8AwFTCSvg047c5MLtr\n5dCJUe3qP6L5rSFd/FvjDXKuecNZOhGJ6e7v/VKukaKe1fWrztEl581XdiWVUupvr+qcry+++3V6\nZSSqOc3BjOtT81zVOV+7N67OKTdYzgbH7Pe3u/9Iztw6F8zWXp8OmlKixX32qng5K9l+tcunUj3y\n6bCSDwDAVEMQPo3kSylID2pT6Ripc25961JduGju2Guf/P4zCrmOonFPG//wAq08tz2jY6aUCDrz\ntaNPuf+JF3T7g/sVco1ins0ISgulPpSSFuEXpKeXLMw3N78V63DA6BNXr9CHH+hVJObf8bMUxZ5H\nI5vqK/kAAExFVEeZJvJVGrn1rUt1x0P7x+pxxz1P2cVSWsOuonGb97VT0biMMWoKlFZG8P4nXtDH\nvv10xrH0sob5KqIUeq0Stb8HhyP62p4DusenJOKO3kM5K9kzKSd6pr9/AACKoVkPfPmlFLjG6PYH\n92s05uWkYKQbjvjnR2e+ZhWNxyQl2tOv6pzvu9o7OBzR7d/Zl3PcdYwGhkZ0bCQqJ6sVUHpFlEJp\nEem5y6lzCs0lXXrwLhm979Jz9c6VS4quns8UM/39AwBQa2zMnCZ8UwrinkJu8bKD5SpURnBgaERB\nN/fHKhq3evrQMb13c0/OpsjRuKdjI1G1hNyc9zASjSsai4+PXaCkYb4Se9kbDyMxT/c82q99h1/R\n4HBk7Dop8RwHhkZm5KbE9tawViyeRwAOAEANsBI+TbS3hrX+og5tfuLA2LE/es1C7dj764zzAk4i\nR3syUkHz4HAkZyNlR1uzIrHclfWP/P75uuOh/TkbIAOOFPc83Xj/U4p6ntZ3d+j+PQfHyhvGPKur\n/vkJXXfxEn3w8vPz5i4XSlPx+ytBJObphi//RJ61Y6k2p2JxWWvVHAyQkgEAAKqKIHyaGByOaMuT\nAxnHduz9tW5ds1R3PJjICT8Viyvm08SmGMck0klSgWpG0HxRh7Y8OTC+0XPNUhljpLTa5AFHuuDM\nOTmBcHPQUczzFI1LxyOJVJdv/Pigb33xzY8f0HVvONu3ComkgmkqHW3NGvWpgT469izGU22k8bmU\nmuoCAABQLoLwaSJfmbkLF87V7o2rte/wMd3w5Z/4tu0JOFI44OrEqH9ueNBNtLU/fGxE793co0ha\n0JxaeU/d9/bv7E9WVxkfqzkYkGRzVrHjVokNkmnnZgfw6XoPHtVV3Ytzcpf9Sg+m55K3t4Z102Wd\n+sTDz+R9fr7vO6tMX7HSieWWVgQAADMXOeHTRKEyc+2tYc1tDing+Levv3blWYoWSFEJuYkAfW5z\nSCHXf4yUoGtyVp2jnqeFc5t14+92Khxw1BJ2FQo4+otLz83p0lmoWk/X4nmSxnOXJWnvwaO+ueTp\nLer3Hjyqt1z4KoUD5eXHp5fp2957SKvu2qlrv7BHq+7aqR29hzLOLfY6AABAOlbCp4liDWM62poV\nt7kpGQFH+vpPDsgkY1+/xjXpwWixFvGjMU9eWkAfdI3Wd3dozT27FHQcxeKJMoiuMfr0zn45WXHx\nynNO067+wZy18OsuXqLOBbPHvs7OAT/rtGb98sUTY6+/7qy2nJrob3/dYm3pGVA05ik9K8d1jGYF\n/XPCS6nKMpmqLQAAYGYiCJ9GCpWZS7SRX6EPbekdqwUedI2stRlNaqy1+psrLtAnH37GN5jPDvTX\nd3doS89ARh3y9ADXyOqbPxlQJL1MopXiyTA7ewH+h/2DGV+7kr75529Q9zntY8f6Xzyumx/Yq9G4\nHRszPQBPjbPnPwc1Gh9PldnSM6Cv/tnr9c4v/ljxtF80Ao507zWv0bKFcyUp5/kV6yhJx0kAAFAu\ngvApKl/+cXrnyGypIH3f4VeUyLs2uvH+p8byu6VEbvjKc9ozWslLibSPjrbmnEBfkn7vtxfolZGo\njgyP6u7/+KViabnlnie5TuHV80KaQq6eHzypebNCOjEa157nBrXpe78saYOpkSNlBcbPD55U2HU0\nmhaEh1xXc5tDknIDcKl4R0k6TgIAgHIRhE9Bk+ka2d4a1qXnny4pEcgXyiNvbw3nvVf6a5J0Kuop\n7BpFsoLjmJV8KhaW7MRoXB/79tOKxDy5RiqnuItV7nvrWjzP9z0/feiY3n7f477PtFiqT7HXAQAA\nstG2forJ154+vbV7OQq1K/e7Vzhg9PnrurVwbrPW3LMr47VKmhV0dLLEsUOu0cpzTstIZbnu4iXq\nPus03/eW/Z5vfetS3fHQ/qLPlOooAADMXLStn+EqnX9cKI/cv8mN1fu/+pRinpWpwC9wYdeRMVan\n0vLSW8Kurl91jr60+z81HCm8jB5wjb77l29S54LZ6n/xuHoPHlXX4nljmzj93lv2ey71mRZK9Snl\ndQAAgBSC8AZUaEW11PzjclZtU4ZOjGrf4WOSjJYtnON7L0k6maee+IQYK6vM2uBxz2rtioW674fP\nFb389iuXqXPBbA0OR3RiNK7LLjjDN3BOlSpMT7VJP4+cbgAAUEsE4Q2mWL53e2tY67s7tPnx8fb0\n67s7MgLKYmOkvz4SjckYI9cxGekYAUf65PqusVxnR0Yno5MPvl3HyGi8DOCtb12q35wY1T2PPpto\n3JOcb+eC2dq0brlu3rpX1ia6W6ZywsMBR1bSbVcu1TUrzyrr/eZ7puR0AwCAWiInvIGUku9d7JzB\n4YjeeOfOjFrf2a9nX59POGD0o1sulyTtO/xKoltmrPQccMdIf/HfztXnf/h8RgOfcMDR56/r1sHf\nnNQdD+0fK29402WdeufKJWPvdXvvIW3Y2ifXMYp7Vjdd1qm3XPgqnRiNj61ol/I8Ss2hJ6cbAADk\nU+mccDpmNpBUbnK6VG5yqefcv+dATqCc/rrf9fm4Zjwv+tLzT9dNl3WW9X4cY/TbZ85ROJB5v5Cb\n+Dq1GfJ4JKZIzNM9j/aPnZNqgBOJeTo5Glck5uneH/SrrSWkFYvn5dTwLuf9Zj/TlFQnTgJwAABQ\nbQThDaSUfO9C5wwOR3Tvo8/mjDsaL1zTOp+4zbz3aS2hkt+LJMU8q7/e0qtTWTUKE/e3OcFxJObp\na3sSaTalBs/U8AYAAFMRQXgDSeUmNwUdzQ4H1BR0cnKTC50zMDSikOvmjHvTZZ05Na1T1wecROdM\nPx+/cllGGswdD+0v+z1F44kunOFA5nyXLZybkaKScs+jz2pwOFJy8FzsmZXyTAEAAGqtahszjTGL\nJW2W9Col2hbeZ639tDHmNEnflHS2pOclrbfWDhljjKRPS7pC0klJf2qtfapa82tUhUoGFjvHL3AN\nuUbvXLkkI985+/p9h1/R+7/yZMbGy5aQqwuTbdwl/3KFzUFH//yui7RwbrN6Dx7Vb06M6h8efiaj\nG2XivIDuvea1mtscVEvI1YlkdZWbLuvUJx5+Jmu+rgaGRrRi8bySN0sWe2alPFMAAIBaqmZ1lJik\nD1trnzLGzJb0pDHmYUl/KukRa+2dxphbJN0iaaOkt0g6L/lvpaR/Sn6ccUqpNz10YlTPvnhcLSE3\n49zXn9Wmx9Ka1sQ8q7v+/Rf61lMDskq0kb/hTefokvNOl2TV0dasZQvnyFNWp0vPZqw6t4RcRbJW\nrq2kZQvnqr01rLaWkAaGRvS161+vd35hj0bTWltGPU/LFs7Rrv4jGVVKbl2zVOGAUSSWeW7qvuUE\nz9TwBgAAU0nNqqMYY7ZLuif573ettb82xpwp6QfW2lcbY/45+fnXk+f/MnVevjGnW3WUUn382z/T\n5ifGSxRed/ESXXTWafqrb/Sq3O9m0DX6xNUrJEkffmCvosngOVWicG3XorESf9azisStmoKJLKZU\nqb/sEoDruzu0pWcgYwV7Ved83yolqW6V+coHAgAANIIp2THTGHO2pNdI2iNpQSqwTgbiZyRPWyTp\nYNplA8ljeYPwmaj/xeMZAbgkbX78gL6+52DZAbgkReNWH35gr75+w0o5aanhMU/asK1PS8+co43b\n+jKCZ8+zY10qU1VMTkW9sVSVLT0DevCmSzJKCe49eNS3K+WFi+Zq98bVk04VobwgAACYSqoehBtj\nWiVtk/RX1tpXEqnf/qf6HMuJK40x75P0PklasmRJpaY5ZfQePOp73E4oBE+Ixq3e8fk9GUG4JLnG\naMfewwpkvRAOjOd152v5fmI0rhWL540dK7TRcrKpIsWa8QAAADSaqlZHMcYElQjA77fW/mvy8IvJ\nNBQlP76UPD4gaXHa5R2SDmePaa29z1rbba3tPv3006s3+QbVlRbYpiux6mBeMc9m5HFL0onRuL64\n6zkNR3JLDJZbAnAyVUpSLecHhyO+r6VW4o9HYjoV9bRhW5/vuQAAAI2imtVRjKQvSvq5tfaTaS/t\nkPRuSXcmP25PO36TMeYbSmzIPFYoH3wm8EuxaGsJ6fLfPl2P/PzljHNd18jGM9fDHSNddVGHtj45\nIK+MhXJXUirsPjE6HmC3hF3FPatN65ZLkvYePKqOtuaKVTHxU2yVe2BoJGelPlVPnLQUAADQqKqZ\njrJK0rsk/cwY05s89jdKBN9bjDHXSzog6erka99VojxhvxIlCt9Txbk1PL/g00rauK1Pnk9E3RRw\nde+7X6NXRqJ6YfCkzmpv0cW/1a721rA2vvkCfW/ff+l/fPtpxUsIxuM+x1pCrm6/cpkuu+AM7eo/\nolV37cyYW6l53eWknvjlm2/Y1qdVnfPHxnj60LGCK/UAAACNqGpBuLV2l/zzvCXpcp/zraQbqzWf\nRpa94u0XfN68da8kk9OSPmU07o2VC8zW3hrW0oVzNSsU0PFIbOx40JGiJaaxxK3VZRck9tD6Bca7\nN67OyAEv9P5KlS/fPLXKna+B0K1rlrIKDgAAGlpNqqMgP78V77PaW3KCT9c4+X+lUWZXTD/+udvF\n5zcr5MqzdizFJF+Vk3zpH5PZNFks39wvSM9uMgQAANCIaFtfR/k2FbaE3JzgM249xfMkdocDia6Y\nhaRvjJwVKu3bHg4Yfe7a12r3xtVjgXOpGzELvb9SN00W28zpN5e4taSiAACAhkcQXkepldx0qfJ+\n2cHn3Vet0N1XJY6lmuWEXTP2mqSCFUQee+YlzZsV0oM3XaIbLjm36NxCbmLcS88/Yyz1Y2+yPGK+\nwDi7ikm+9zcwNFLyM1rbtUi7N67WV29YmfHLgDS5iisAAAD1RDpKHRVaVV6xeJ5vJZHUsZaQO9YM\nx2+jZCpY3d57SB/J6oS54Q8vKDo3z47P6/4nXtDtD+5XyDWKJauj7N64WvsOH5NktHBukz7zyLO6\n99F+hdzMLpmjWa3uI3FPLSF3Mo8tw6rO+brvXd2SbN6ceAAAgEZTs7b11TAd2tbv6D2UU96vnEYz\ng8MR33bwuzeuliS98c6dOZs5wwGjt3Ut1JaeQwXHDgccffj3z9ff/9svMo6nt5uXlHHv7HNu2/G0\n0m8fdo2MY0p+n4VyymnSAwAAaqXSbesJwhvAZFqu7z14VOs/9yNF0moPhl2jLe9/oyTpTz7/hE6O\nZpbwmxV09fX3vUEtIVe7+l/WgcGT+vLjL+SUL2wOOorGPWUXZJkVchWLeznNfdK1hF1FY/nPSf2i\nUOj9FvsFI99rrIYDAIBKq3QQTjpKAyi1dvbgcET7Dr+i9NSLaCyeEYBLUiRuFY3FdTIaVyyeu0od\nt+Pt4jsXzNbgcET3//iA4rHMcaJxq6CTSEHJPO4p5DoajftVFE+7tsA52RVV/H4RKVSiMPV5qVVa\nAAAAGglB+BSxvfeQPryld2xVOuga/cnrF+sbPz7oe/76+55QSyggq0TnzFQcHXCku69akRGotreG\ndfdVK/ShtPElyXWMRnzqkt/8B6/WJ7//TN65hgOOPn7lUt3xYG4N75T0iir50kqKVWIptUoLAABA\noyEInwIGhyPasHVvRoAcjVttfvxA3ms8q7HGPOGAo09cvVxzmkNatnBOwZbyj/9qUB/a0qvRuM3I\nJZ8VdBTzrG67cpmuecNZetXcJm3Y1icpkRMedo1kpJsuO0/vXLlE7a1hzQ4HxvLdT8XistaqORjI\naG1frCvmpnXLc3LmU/Mv9BoAAEAjIwhvAMVywvcdPiZZI2li+fsh19Hi01rydrRMaW8Na/FpsxQO\nuBqNj3fWbAmPt6xPzS8VtGdXakmff/o56Q12Sk05aW8N54xRaHwCcAAAMFUQhNdZsQof23sP6UPf\n7M3ZNFmOYmka6b8E+DbA8WxGAJ5SSi579jnZ55fS/KfQfUrNpwcAAGgkNOupo2IdJVNpKH4BeNA1\nuu7iJWONapwCLe1vXbM0b6C6vfeQVt21U9d+YY9W3bVTu/uP1LQBDg13AADATMRKeB0VS8UYGBqR\n59OqPuwaff7dr9Ol55+uD15+vu7fc0Cf3fmsPJ9ovTno6MKFc33vny8fe/fG1dq9cXXN0jxIKwEA\nADMNQXgdFUvFaAm58umDI5vsUrn34FG1hFz94w/6xzpiZvNsYpzHnnlZ6aUN+188rh17D+f8KST1\nS8CKxfMIhgEAAKqEILyOilX/+O7T/+V73Wjc01s+80M1BVxF4p5MnoZLQdfo7a/r0Js//VhGacOV\n55ymXf2DvtecGI1VpMxf9mZTvxrnKXS+BAAAMw1BeJ3lS8UYHI7o3kf7814XjVtF0yqYpAu6Rp9a\n36ULXjVbb/3sD3NKG+YLwKXEyvnQidFJrYJnB9XrL+rQ1358IOMXgU9cvUJruxYVLVEIAAAwHRGE\nN4D0Ch+JFeNjeu7lEwo4RpESrm8KOorHPQVdV3Hr6e6rVmjNioXae/CoXONIyt/Z0s+u/iO+JQdL\n4RdUb34is555NG5189a+sV8+6HwJAABmGoLwBrK995A+8sDevPnd+ZyKegoHHMkoWU88oaOtWXHr\nk1RexN9/9+cKB9wJpYb4BdV+XMeMrf7T+RIAAMw0lChsEIlyhH2+AXg4UKD+YFIk5unkaFyR2HiZ\nw/bWsD6+ZllZ83CMNBq3viUTS+EXVPuJe3ZspZ0ShQAAYKZhJbxBDAyNyPUp9t0UdPQ/33ahuhbP\n03/s+y/d/b1nio6Vns5x4aK5ag27Go4UT0kJB4xcx9HJ0fFzy00N8dtsur67Q1/bk5kTfvdV44E2\nJQoBAMBMQxDeIDramhX3qQlurdXZ7bPUe/Cojo1ESxprNO7p2EhUg8MRdbQ1K+Yzbj7Zc5hIaohf\nUP3By8/PWx1FovMlAACYWYzNU95uKuju7rY9PT31nkbFfHz7z7T58cxNjK9e0KJfvnii4HWOSawu\nh1xXp2JxWWvVHAyM5XRLiYojUjJ/3DUyjtH67g5t6RnIKA2YOpdygQAAAOOMMU9aa7srNh5BeGMY\nHI7ojXc+okis/O/HrJCrz137WknSezf3ZIzRFHS0e+NqSYmUl5aQm1H5JLued2oupIYAAACMq3QQ\nTjpKFeQLYgsdf/QXL8lJlDcp+37RuKeFc5vVe/CoAo6jSFpJQteYgh0w/dJASA0BAACoLoLwCsvX\n/bHY8YBjNBIrv5ygJC1fNEdr7tmlgGN0YjRzA+aJ0biePnxMKxbPq8TbAwAAQAVQorCC0hvVpJf4\n63/xeNHj+aqXrDynreh9nzxwrOAYf/ed/WWVGQQAAEB1sRJeQfm6P/YePFry8XQtYVd/c8VStYTc\nkssT+kk1xiHFBAAAoDGwEl5B+bo/di2eV/LxjHPiVi0hV20tIf1Ox1z93gWnT2heqcY46QaHI9p7\n8Cgr5AAAAHXASngF+TWq2bRuuToXzM45fuuapToxGteta5bqjgf3J1bEk+UFA66jU1FPxlq9+dOP\nycr41hBPcYyU7+XsxjhS/rx1AAAA1AYlCqugWHWUpw8d0x0P7R8PyN+6VBcumquOtmYNnRjVFZ/5\noUZ92teXK+BI//7BS9W5YHbGHFbdtVOnouMr8KkyhqSrAAAA+Kt0iULSUaqgvTXsWxKwvTWsjrZm\n3fHQ/oxNmnc8tH8sYD8xGlc44FZkHiHXzamWkspbT5dqTQ8AAIDaIB2lRlKr4Ad/cyJZD3xcKghO\nBemF8sTLEbe5Lefz5a2X25oeAAAAE0cQXgXZ6SipHGzrWUV80kzSg+DsvPKTozGVkpkSdI2iaSe6\nRvr4mmW+q/Gb1i3XzVv3yjWO4jaRE04qCgAAQO0QhE+CX+539qbH1MbL9BzsdAFH+vNLz9Wjv3hJ\nXYvnqXPBbK3qnK/73nWRlFwxf/9XenQy7fpZQUejcatY2m5MI2nrn79Bj/ziJX1x1/MKBYzueGi/\nZjcFtKpzfsY8beoKI8lmrsoDAACg+gjCJ8ivwsiqzvljzXdStb9v/85+qUB6ScyTPv1I/9jXb+ps\n109eGBobd+2KhRkBeOIaq6agk9GcJxxwdTIa17/86HmNxj2lUsE/tKVXruMo5Gb+UhBJ6865YVuf\nVnXOZzUcAACgRtiYOQH5OmPuO3wsZ9Oja4xGy0jx/mH/YMa4W3oGcs75yB+8OmMVXFIyz9vk3D/m\nSZHY+Hi3f2e/XOOfkw4AAIDaIAifgHwVRiSTs+lxJOrfSn6iWsKuVp7brk3rlqsp6Gh2OKCmoKNN\n65Zr2cI5RTd1Bl2jU1lzGonG2JgJAABQQ6SjTEC+CiPLFs4Z21TpOkYnIpUNwKXx7pcrFs/LyfWW\nlLGpczQel2eVsWEzFvfkOEbxtGPGkBcOAABQS6yET0Cqwkj2SnR7a1hruxZp98bVuv3KZWoJlV/v\nu6tjbsa41128xPc+qXlk1yNP3f+rN6zUj265XJ+4ekXG9Tdddp6asuqQNwVc0lEAAABqiI6Zk5Cv\nM2bqtezOlEHX6B3di/WVPQd8xwu5Ro9/9HJJyhi30H3KnackOmYCAACUiY6ZDSRfZ8zUa2tXLMw4\n9tol87TlyYPKTv4IukbhgNH1l5yjfYdf0dCJ0ZLvIyWC7L0Hj2pwOFJ0noVW8QEAAFAbrIRXyf1P\nvKCPffvpks7tWjxH+w4fz8jdDrtGxjHatG651nYtynutX6nEQuenTHZ1HQAAYCZhJXwKGByO6PYH\n95d8fu/BVzICcEmKxO1Y6cN8K9z5SiXmOz9dsdV1AAAAVA9BeBXsO/yK3AoVHLGezbtpMl+pRDZZ\nAgAANDZKFFbY9t5D2rC1L6Mj5WRE4jZvlZV8pRKp+Q0AANDYWAmvoP4Xj+vmCQTgb+psVzDP0nlT\n0NGJUf9642yyBAAAmJpYCa+Q7b2HdPMDezUaz93oGnKU07o+4Eh/+sZz9I7XLVbngtkaHI7o8V8N\n6q+/+VNFs84ttLK9tmuRb9MeAAAANC5WwisgtUHSLwCXcgNwSYp5GgvAUxafNks3/+EFCrlGLSG3\n5JVtNlkCAABMLayEV0Bqg+QplZ6G4kjqPXhUbS0h7eo/oo3b+mQ9q0jcKhxwFPWsbrtyaUnlBgEA\nADC1EIRXQEdbs0aisYxjjpECjsm7Ou5Jum3HPn3s208r7nlKTyNP5ZTf8eB+vXnZq1jhBgAAmGZI\nR6kQYzI3VrqO0W1XLlNT0FEoz6bLE6NxRWKZAXg6yg0CAABMTwThFTAwNKKmQGYZwaaAq8WnNesf\nrlohb4JdSVPlBgeHI3rsmZf12DMvldSIJ12xlvYAAACoPdJRKsCvXvepWFzv3dwj1zh5V7rTOUby\n0mL1oJtoWb+r/4g+vKV3bIyga/SJq1eUlCs+0Zb2AAAAqC5Wwisgu153OODIWqtIzOpkNLfGt192\nipe1WO4YaemZc7Rh696MID4at7p5a/HW9JNpaQ8AAIDqYiW8QtLrdR8bierG+5/S8UjmZs1ZIVee\ntbrxdzv1uf/7q7xNeCQp5LrqPXhUrnEkZZ7nOmYsVzxffXC/ii2pHHM2egIAANQXQXgFtbeG1d4a\n1vouWhMAAA2RSURBVOBwJCc9RZLeffFZuuFN50qS7v1Bf8Gxop6nrsXzFLe548Q9q6cPHdPb73s8\nb6oJLe0BAAAaF+koE5Ta8Nj/4nE99sxLeuyZlzNSPd6z6uyca760+z8l+bebv+7iJTnt5zsXzNbd\nV61QIO27FHSNPn7lUt3x0P6CqSa0tAcAAGhcrIRPQGrDY6q5TkrAkd65com29AzIMbmJ35GY1df2\nHNAHLj/Pt938By8/Pye9JHXevsOvSLJatnBuyakmtLQHAABoTAThZRocjmjD1r6xhjrpYp60+fED\nBa+/59F+vXPlkrHUlfTAOPvr9OOXnn96xrFSU03yjQkAAID6IR2lTPfvOeAbgJcq5E6+AQ+pJgAA\nAFNb1VbCjTFfkrRG0kvW2guTx06T9E1JZ0t6XtJ6a+2QSbSb/LSkKySdlPSn1tqnqjW3iRocjugz\njzwzqTEqtTmSVBMAAICpq5or4f9H0puzjt0i6RFr7XmSHkl+LUlvkXRe8t/7JP1TFec1YfsOHyup\n8U669d0dVVuxbm8Na8XieZJEV0wAAIAppGor4dbax4wxZ2cdfpuk301+/mVJP5C0MXl8s7XWSnrC\nGDPPGHOmtfbX1ZrfxPh02SmgJeTqmpVnaeObL6jaijVdMQEAAKaeWueEL0gF1smPZySPL5J0MO28\ngeSxhrJs4ZyMcoHFxK0dC7xXLJ5X8QCcrpgAAABTU6NszPRbYrY+x2SMeZ8xpscY0/Pyyy9XeVqZ\n2lvD+uT6LoUDRrOCrkKudMXvLMg5zzWqyWbJVKnCdKlShQAAAGhctS5R+GIqzcQYc6akl5LHByQt\nTjuvQ9JhvwGstfdJuk+Suru7fQP1avLbEDk4HNHjvxrUkeFTunDhXAUDbk02S9IVEwAAYGqqdRC+\nQ9K7Jd2Z/Lg97fhNxphvSFop6Vjj5YOP86vvvWbFwprdf3A4MvZLwKZ1y7UhKyecSikAAACNrZol\nCr+uxCbM+caYAUm3KRF8bzHGXC/pgKSrk6d/V4nyhP1KlCh8T7XmNdX5bcTcvXE1pQoBAACmEJMo\nSDI1dXd3256ennpPo2YGhyNadddOnYqOp6A0BR3t3ria4BsAAKCKjDFPWmu7KzVeo2zMRAnYiAkA\nADA9EIRPIWzEBAAAmB4IwqeQ9tawNq1bXrUOnAAAAKiNWldHwST5lUgEAADA1EIQPgVll0gEAADA\n1EI6CgAAAFBjBOEAAABAjRGEAwAAADVGEA4AAADUGEE4AAAAUGME4QAAAECNEYRPEYPDEe09eFSD\nw5F6TwUAAACTRJ3wKWB77yFt3NanoOMo6nnatG651nYtqve0AAAAMEGshDe4weGINm7r06mop+OR\nmE5FPW3Y1seKOAAAwBRGEN7gBoZGFHQyv01Bx9HA0EidZgQAAIDJIghvcB1tzYp6XsaxqOepo625\nTjMCAADAZBGEN7j21rA2rVuupqCj2eGAmoKONq1brvbWcL2nBgAAgAliY+YUsLZrkVZ1ztfA0Ig6\n2poJwAEAAKY4gvApor01TPANAAAwTZCOAgAAANQYQTgAAABQYwThAAAAQI0RhAMAAAA1RhAOAAAA\n1BhBOAAAAFBjBOEAAABAjRGEAwAAADVGEA4AAADUGEE4AAAAUGPGWlvvOUyYMeZlSS/Uex5T2HxJ\nR+o9iWmCZ1lZPM/K4VlWFs+zcniWlcXzrJx8z/Isa+3plbrJlA7CMTnGmB5rbXe95zEd8Cwri+dZ\nOTzLyuJ5Vg7PsrJ4npVTq2dJOgoAAABQYwThAAAAQI0RhM9s99V7AtMIz7KyeJ6Vw7OsLJ5n5fAs\nK4vnWTk1eZbkhAMAAAA1xko4AAAAUGME4VOcMWaxMeZRY8zPjTH7jDEfTB7/W2PMIWNMb/LfFWnX\nfNQY02+M+aUx5g/Tjr85eazfGHNL2vFzjDF7jDHPGmO+aYwJ1fZd1o4x5nljzM+Sz6wneew0Y8zD\nyff/sDGmLXncGGM+k3xefcaY16aN8+7k+c8aY96ddvyi5Pj9yWtN7d9lbRhjXp3289drjHnFGPNX\n/GyWzhjzJWPMS8aYp9OOVf3nMd89prI8z/JuY8wvks/rW8aYecnjZxtjRtJ+Rj+Xdk1Zz6zQ92Uq\ny/M8q/7ftjEmnPy6P/n62bV5x9WT51l+M+05Pm+M6U0e52ezAJM/JmrM/9201vJvCv+TdKak1yY/\nny3pGUlLJf2tpI/4nL9U0l5JYUnnSPqVJDf571eSzpUUSp6zNHnNFknvSH7+OUl/Ue/3XcXn+byk\n+VnHNkm6Jfn5LZLuSn5+haR/k2QkvUHSnuTx0yQ9l/zYlvy8LfnajyVdnLzm3yS9pd7vuUbP9f+1\nd6+hcpR3HMe/fxJbG5t4QyWNtTFREamtMaENtAkl2jTeEi9BIt5QoUjbF9I3rQQq5JWCLRQqFWxC\nG8lFqlYPoiZBRX2RhJi7mlZPFGz0GG3SRrGl3n6+eJ4Nc5Yz68nm7Jyzs78PDGf2eWZ2d/77zOx/\n53nmzDjgXeBbbptHFLe5wIXAy1W2x7LX6OapJJbzgfF5/p5CLKcWl2t6niOKWdnn0u1TSTw7vm8D\nPwPuz/NLgIdGOxadiGVT/W+B37htDiuWZTnRmDxu+kx4l5M0IGlbnv8Q2ANMabHKImCtpP9LehPo\nB76Xp35Jb0j6GFgLLMq/8OYBD+f1/wJc2ZmtGbMWkbYbBm//ImClkk3ACRExGfgJsEHSQUn/BjYA\nC3LdJEkblfbSlfROLC8C9kpqdXMtt80mkl4ADjYVV9Eey16jaw0VS0nrJX2aH24CTm/1HG3GrOxz\n6WolbbPMSO7bxTg/DFzUOBPZrVrFMm/btcCaVs/htpm0yInG5HHTSXiN5G65GcDmXPSL3L2yotAt\nMgX4Z2G1fbmsrPxk4D+FL6pGeV0JWB8RWyPip7nsNEkDkHZw4NRcfqSxnJLnm8t7wRIGf4m4bbav\nivZY9hp1divprFbDmRGxPSKej4g5uaydmJXFv646vW8fXifXH8rL19UcYL+k1wtlbpvD0JQTjcnj\nppPwmoiIrwOPAHdI+gD4IzAduAAYIHVnQeo+aaY2yuvqB5IuBC4Bfh4Rc1ss61gOQx7LuRD4ay5y\n2+wMx69NEbEU+BRYlYsGgDMkzQB+CayOiEm0F7NeinMV+3YvxRPgOgafwHDbHIYhcqLSRYcoq+y4\n6SS8BiLiGFJjWyXpUQBJ+yV9Julz4AFStx+kX23fLKx+OvBOi/J/kbpnxjeV15Kkd/Lf94C/keK2\nv9FFl/++lxc/0ljuY3B3d61jWXAJsE3SfnDbHAFVtMey16idfMHV5cD1uXuZPGziQJ7fShq3fA7t\nxaws/rVT0b59eJ1cfzzDHxbTVfL2XQ081Chz2/xyQ+VEjNHjppPwLpfHiy0H9kj6XaG8OK7rKqBx\n1XUfsCTSFeZnAmeTLjLYApwd6Yr0r5CGD/TlL6XngMV5/ZuBxzu5TaMlIo6LiImNedJFWy+TYta4\nMrq4/X3ATfnq6tnAodwFtQ6YHxEn5u7Y+cC6XPdhRMzOn9tN1DSWTQadyXHbPGpVtMey16iViFgA\n/ApYKOm/hfJTImJcnp9GaotvtBmzss+ldirat4txXgw82/jxVEMXA3+XdHj4g9tma2U5EWP1uKkx\ncDWrp6O6EviHpK6QXcCOPF0KPAjszuV9wOTCOktJv57/QeG/c+T1Xst1Swvl00gHzH7SkIKvjvZ2\ndyiW00hX5+8EXmnEgDTe8Bng9fz3pFwewH05XruBWYXnujXHqx+4pVA+i/TFtBf4A/mGWXWdgAnA\nAeD4Qpnb5vDjt4bU/fwJ6QzMbVW0x7LX6OapJJb9pHGfjWNn479uXJOPATuBbcAV7cas1efSzVNJ\nPDu+bwPH5sf9uX7aaMeiE7HM5X8Gbm9a1m2zdSzLcqIxedz0HTPNzMzMzCrm4ShmZmZmZhVzEm5m\nZmZmVjEn4WZmZmZmFXMSbmZmZmZWMSfhZmZmZmYVcxJuZtbFIkIR8WDh8fiIeD8insiPT4uIJyJi\nZ0S8GhFP5vKpEfG/iNiRy1fmm1wQET+KiEORbo29JyLuGp2tMzOrLyfhZmbd7SPg2xHxtfz4x8Db\nhfplwAZJ35V0HvDrQt1eSRcA55Pu/HZtoe5FpVtjzwJuiIiZHdsCM7Me5CTczKz7PQVclucH3aEU\nmEy6AQgAknY1ryzpM9KNT6YMUfcRsBWYPoLv18ys5zkJNzPrfmtJtwU/FvgOsLlQdx+wPCKei4il\nEfGN5pXzet8Hnh6i7mRgNukufWZmNkKchJuZdbl8dnsq6Sz4k01160i3AH8AOBfYHhGn5OrpEbED\nOAC81XSWfE5EbAfWA3dLchJuZjaCnISbmdVDH3Avg4eiACDpoKTVkm4EtgBzc1VjTPhZwOyIWFhY\n7UVJMyTNlHR/p9+8mVmvcRJuZlYPK4BlknYXCyNiXkRMyPMTSWO73youI2mAdMHmnRW9VzOznuck\n3MysBiTtk/T7IapmAi9FxC5gI/AnSVuGWO4xYEJEzOnk+zQzsyQkjfZ7MDMzMzPrKT4TbmZmZmZW\nMSfhZmZmZmYVcxJuZmZmZlYxJ+FmZmZmZhVzEm5mZmZmVjEn4WZmZmZmFXMSbmZmZmZWMSfhZmZm\nZmYV+wLeCiVsfDOTjAAAAABJRU5ErkJggg==\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x7f777c66e7f0>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"df.plot.scatter(x='MSRP', y='Horsepower', figsize=(12,6)) # figsize=(width, height) in inches"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For more information on plotting features of DataFrames, see the [Pandas Visualization documentation](http://pandas.pydata.org/pandas-docs/stable/visualization.html)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Conclusion\n",
"\n",
"We've just covered the very basics of the Pandas package here. You should have enough to get started, but for more information, you should [see the official documentation](http://pandas.pydata.org/pandas-docs/stable/)."
]
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.2"
}
},
"nbformat": 4,
"nbformat_minor": 1
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment