@syamdev
Created February 2, 2020 11:39
Intro to Python for Data Science - 3 & 4
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "Intro to Python for Data Science - 3 & 4",
"provenance": [],
"collapsed_sections": [],
"authorship_tag": "ABX9TyPjlnQuxqPnkmxfJbws7LRt",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/syamdev/f4806982ca112e3879428305ae470177/intro-to-python-for-data-science-3-4.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "_R1etonslA8P",
"colab_type": "text"
},
"source": [
"# Python"
]
},
{
"cell_type": "code",
"metadata": {
"id": "Yny3adNqlW07",
"colab_type": "code",
"colab": {}
},
"source": [
"# Review"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "lj21OXQHlGsw",
"colab_type": "text"
},
"source": [
"## Exercise"
]
},
{
"cell_type": "code",
"metadata": {
"id": "ESyPV4n1lLGb",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"outputId": "7190082e-168d-4377-880b-f209794ac72f"
},
"source": [
"# Exercise 1\n",
"\n",
"# Write a function whose parameters / inputs are an integer and a list\n",
"# The output is that list, starting from the index given by the integer\n",
"\n",
"# Input: 2, [1, 2, 3, 4, 5]\n",
"# Output: [3, 4, 5]\n",
"\n",
"def slice_list(integer, list_of_integers):\n",
"    result = list_of_integers[integer:]\n",
"    return result\n",
"\n",
"slice_list(2, [1, 2, 3, 4])"
],
"execution_count": 24,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"[3, 4]"
]
},
"metadata": {
"tags": []
},
"execution_count": 24
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "DapXq4ranpNo",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"outputId": "84865e51-eee8-4bce-e765-b5e99ccd6f25"
},
"source": [
"# Exercise 2\n",
"\n",
"# Write a function whose parameter / input is a list of integers\n",
"# The output is True or False (True if the first and last numbers of the list are equal, False otherwise)\n",
"\n",
"# Input: [1, 2, 3, 4, 5]\n",
"# Output: False\n",
"\n",
"# Input: [1, 2, 3, 1]\n",
"# Output: True\n",
"\n",
"def angka_pertama_terakhir(list_of_integers):\n",
"    # Compare the first and last elements directly; the comparison is already a bool\n",
"    return list_of_integers[0] == list_of_integers[-1]\n",
"\n",
"# angka_pertama_terakhir([1, 2, 3, 4, 5])\n",
"\n",
"angka_pertama_terakhir([1, 2, 3, 1])"
],
"execution_count": 25,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"True"
]
},
"metadata": {
"tags": []
},
"execution_count": 25
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "m1QGZIYasKlz",
"colab_type": "text"
},
"source": [
"## Pandas"
]
},
{
"cell_type": "code",
"metadata": {
"id": "8C98Az6LsNgH",
"colab_type": "code",
"colab": {}
},
"source": [
"import pandas as pd"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "xxutv6gTtVK2",
"colab_type": "text"
},
"source": [
"### Series"
]
},
{
"cell_type": "code",
"metadata": {
"id": "hQct8ro1tZdS",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 102
},
"outputId": "a7d2cdd8-98dd-4539-992f-5945903f686a"
},
"source": [
"pd.Series([1, 2, 3, 4])"
],
"execution_count": 26,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"0 1\n",
"1 2\n",
"2 3\n",
"3 4\n",
"dtype: int64"
]
},
"metadata": {
"tags": []
},
"execution_count": 26
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ziWkplG8tj9B",
"colab_type": "text"
},
"source": [
"### DataFrame"
]
},
{
"cell_type": "code",
"metadata": {
"id": "a_2I5uQwtlxw",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 142
},
"outputId": "839ecab0-bdfa-4552-b7c8-9a22849237db"
},
"source": [
"data_df = {'Name': ['tom', 'nick', 'krish'],\n",
"           'Age': [20, 21, 19],\n",
"           'City': ['jakarta', 'bandung', 'yogyakarta']}\n",
"\n",
"df = pd.DataFrame(data_df)\n",
"df"
],
"execution_count": 27,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Name</th>\n",
" <th>Age</th>\n",
" <th>City</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>tom</td>\n",
" <td>20</td>\n",
" <td>jakarta</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>nick</td>\n",
" <td>21</td>\n",
" <td>bandung</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>krish</td>\n",
" <td>19</td>\n",
" <td>yogyakarta</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Name Age City\n",
"0 tom 20 jakarta\n",
"1 nick 21 bandung\n",
"2 krish 19 yogyakarta"
]
},
"metadata": {
"tags": []
},
"execution_count": 27
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "RD5DwAJcuilS",
"colab_type": "text"
},
"source": [
"#### Select DataFrame"
]
},
{
"cell_type": "code",
"metadata": {
"id": "u4wnujYzulxu",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 142
},
"outputId": "d8c102e0-1f31-47ac-e230-117b7ba9c7fa"
},
"source": [
"df[['Name', 'City']]"
],
"execution_count": 28,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Name</th>\n",
" <th>City</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>tom</td>\n",
" <td>jakarta</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>nick</td>\n",
" <td>bandung</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>krish</td>\n",
" <td>yogyakarta</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Name City\n",
"0 tom jakarta\n",
"1 nick bandung\n",
"2 krish yogyakarta"
]
},
"metadata": {
"tags": []
},
"execution_count": 28
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "NToMelLuvFIP",
"colab_type": "text"
},
"source": [
"#### Filter DataFrame"
]
},
{
"cell_type": "code",
"metadata": {
"id": "ALgWmIU9vJ3Q",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 80
},
"outputId": "54e6d7c0-bf34-4140-ee4f-69aa0032e67c"
},
"source": [
"df[df['Age'] > 20]"
],
"execution_count": 29,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Name</th>\n",
" <th>Age</th>\n",
" <th>City</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>nick</td>\n",
" <td>21</td>\n",
" <td>bandung</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Name Age City\n",
"1 nick 21 bandung"
]
},
"metadata": {
"tags": []
},
"execution_count": 29
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "wrJScJjswEN5",
"colab_type": "text"
},
"source": [
"#### Manipulate DataFrame"
]
},
{
"cell_type": "code",
"metadata": {
"id": "m3xOj7yjwehv",
"colab_type": "code",
"colab": {}
},
"source": [
"# bit.ly/dwp-data-ojol"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "mGsvQdwMwIzM",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 359
},
"outputId": "59c9b8d8-8284-4371-8bbc-8bda8fd997ce"
},
"source": [
"data_df = {'Customer': ['tom', 'nick', 'krish', 'jack', 'jack', 'tom', 'jack', 'krish', 'nick', 'nick'],\n",
"           'Origin': ['Tebet', 'Gondangdia', 'Mampang', 'SCBD', 'SCBD', 'Kuningan', 'Pancoran', 'Tebet', 'Mampang', 'Tebet'],\n",
"           'Destination': ['Gondangdia', 'Mampang', 'SCBD', 'Kuningan', 'Pancoran', 'Pancoran', 'Mampang', 'SCBD', 'SCBD', 'Kuningan'],\n",
"           'Distance': [4.5, 5.0, 3.0, 4.8, 2.2, 4.7, 3.4, 4.0, 2.2, 2.9],\n",
"           'Price': [20000, 23000, 14000, 24500, 7000, 20000, 15000, 18500, 9000, 11000]}\n",
"\n",
"df = pd.DataFrame(data_df)\n",
"\n",
"df"
],
"execution_count": 30,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Customer</th>\n",
" <th>Origin</th>\n",
" <th>Destination</th>\n",
" <th>Distance</th>\n",
" <th>Price</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>tom</td>\n",
" <td>Tebet</td>\n",
" <td>Gondangdia</td>\n",
" <td>4.5</td>\n",
" <td>20000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>nick</td>\n",
" <td>Gondangdia</td>\n",
" <td>Mampang</td>\n",
" <td>5.0</td>\n",
" <td>23000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>krish</td>\n",
" <td>Mampang</td>\n",
" <td>SCBD</td>\n",
" <td>3.0</td>\n",
" <td>14000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>jack</td>\n",
" <td>SCBD</td>\n",
" <td>Kuningan</td>\n",
" <td>4.8</td>\n",
" <td>24500</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>jack</td>\n",
" <td>SCBD</td>\n",
" <td>Pancoran</td>\n",
" <td>2.2</td>\n",
" <td>7000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>tom</td>\n",
" <td>Kuningan</td>\n",
" <td>Pancoran</td>\n",
" <td>4.7</td>\n",
" <td>20000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>jack</td>\n",
" <td>Pancoran</td>\n",
" <td>Mampang</td>\n",
" <td>3.4</td>\n",
" <td>15000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>krish</td>\n",
" <td>Tebet</td>\n",
" <td>SCBD</td>\n",
" <td>4.0</td>\n",
" <td>18500</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>nick</td>\n",
" <td>Mampang</td>\n",
" <td>SCBD</td>\n",
" <td>2.2</td>\n",
" <td>9000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>nick</td>\n",
" <td>Tebet</td>\n",
" <td>Kuningan</td>\n",
" <td>2.9</td>\n",
" <td>11000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Customer Origin Destination Distance Price\n",
"0 tom Tebet Gondangdia 4.5 20000\n",
"1 nick Gondangdia Mampang 5.0 23000\n",
"2 krish Mampang SCBD 3.0 14000\n",
"3 jack SCBD Kuningan 4.8 24500\n",
"4 jack SCBD Pancoran 2.2 7000\n",
"5 tom Kuningan Pancoran 4.7 20000\n",
"6 jack Pancoran Mampang 3.4 15000\n",
"7 krish Tebet SCBD 4.0 18500\n",
"8 nick Mampang SCBD 2.2 9000\n",
"9 nick Tebet Kuningan 2.9 11000"
]
},
"metadata": {
"tags": []
},
"execution_count": 30
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "S88KPMft9D1-",
"colab_type": "text"
},
"source": [
"##### Add Column"
]
},
{
"cell_type": "code",
"metadata": {
"id": "MncCn481wWzt",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 359
},
"outputId": "a927c543-a6c8-4c32-b81c-4b7fe4bea2b0"
},
"source": [
"# add column\n",
"\n",
"df['Rate'] = df['Price'] / df['Distance']\n",
"\n",
"df"
],
"execution_count": 49,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Customer</th>\n",
" <th>Origin</th>\n",
" <th>Destination</th>\n",
" <th>Distance</th>\n",
" <th>Price</th>\n",
" <th>Rate</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>tom</td>\n",
" <td>Tebet</td>\n",
" <td>Gondangdia</td>\n",
" <td>4.5</td>\n",
" <td>20000</td>\n",
" <td>4444.444444</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>nick</td>\n",
" <td>Gondangdia</td>\n",
" <td>Mampang</td>\n",
" <td>5.0</td>\n",
" <td>23000</td>\n",
" <td>4600.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>krish</td>\n",
" <td>Mampang</td>\n",
" <td>SCBD</td>\n",
" <td>3.0</td>\n",
" <td>14000</td>\n",
" <td>4666.666667</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>jack</td>\n",
" <td>SCBD</td>\n",
" <td>Kuningan</td>\n",
" <td>4.8</td>\n",
" <td>24500</td>\n",
" <td>5104.166667</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>jack</td>\n",
" <td>SCBD</td>\n",
" <td>Pancoran</td>\n",
" <td>2.2</td>\n",
" <td>7000</td>\n",
" <td>3181.818182</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>tom</td>\n",
" <td>Kuningan</td>\n",
" <td>Pancoran</td>\n",
" <td>4.7</td>\n",
" <td>20000</td>\n",
" <td>4255.319149</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>jack</td>\n",
" <td>Pancoran</td>\n",
" <td>Mampang</td>\n",
" <td>3.4</td>\n",
" <td>15000</td>\n",
" <td>4411.764706</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>krish</td>\n",
" <td>Tebet</td>\n",
" <td>SCBD</td>\n",
" <td>4.0</td>\n",
" <td>18500</td>\n",
" <td>4625.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>nick</td>\n",
" <td>Mampang</td>\n",
" <td>SCBD</td>\n",
" <td>2.2</td>\n",
" <td>9000</td>\n",
" <td>4090.909091</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>nick</td>\n",
" <td>Tebet</td>\n",
" <td>Kuningan</td>\n",
" <td>2.9</td>\n",
" <td>11000</td>\n",
" <td>3793.103448</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Customer Origin Destination Distance Price Rate\n",
"0 tom Tebet Gondangdia 4.5 20000 4444.444444\n",
"1 nick Gondangdia Mampang 5.0 23000 4600.000000\n",
"2 krish Mampang SCBD 3.0 14000 4666.666667\n",
"3 jack SCBD Kuningan 4.8 24500 5104.166667\n",
"4 jack SCBD Pancoran 2.2 7000 3181.818182\n",
"5 tom Kuningan Pancoran 4.7 20000 4255.319149\n",
"6 jack Pancoran Mampang 3.4 15000 4411.764706\n",
"7 krish Tebet SCBD 4.0 18500 4625.000000\n",
"8 nick Mampang SCBD 2.2 9000 4090.909091\n",
"9 nick Tebet Kuningan 2.9 11000 3793.103448"
]
},
"metadata": {
"tags": []
},
"execution_count": 49
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "G_IPuJaz_XmW",
"colab_type": "text"
},
"source": [
"##### Head & Tail"
]
},
{
"cell_type": "code",
"metadata": {
"id": "tcRmOY1Ow44L",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
},
"outputId": "54b3cf28-1c44-497c-a0d5-c9c3bc0472cd"
},
"source": [
"# Get the first 5 rows\n",
"\n",
"df.head()"
],
"execution_count": 50,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Customer</th>\n",
" <th>Origin</th>\n",
" <th>Destination</th>\n",
" <th>Distance</th>\n",
" <th>Price</th>\n",
" <th>Rate</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>tom</td>\n",
" <td>Tebet</td>\n",
" <td>Gondangdia</td>\n",
" <td>4.5</td>\n",
" <td>20000</td>\n",
" <td>4444.444444</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>nick</td>\n",
" <td>Gondangdia</td>\n",
" <td>Mampang</td>\n",
" <td>5.0</td>\n",
" <td>23000</td>\n",
" <td>4600.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>krish</td>\n",
" <td>Mampang</td>\n",
" <td>SCBD</td>\n",
" <td>3.0</td>\n",
" <td>14000</td>\n",
" <td>4666.666667</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>jack</td>\n",
" <td>SCBD</td>\n",
" <td>Kuningan</td>\n",
" <td>4.8</td>\n",
" <td>24500</td>\n",
" <td>5104.166667</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>jack</td>\n",
" <td>SCBD</td>\n",
" <td>Pancoran</td>\n",
" <td>2.2</td>\n",
" <td>7000</td>\n",
" <td>3181.818182</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Customer Origin Destination Distance Price Rate\n",
"0 tom Tebet Gondangdia 4.5 20000 4444.444444\n",
"1 nick Gondangdia Mampang 5.0 23000 4600.000000\n",
"2 krish Mampang SCBD 3.0 14000 4666.666667\n",
"3 jack SCBD Kuningan 4.8 24500 5104.166667\n",
"4 jack SCBD Pancoran 2.2 7000 3181.818182"
]
},
"metadata": {
"tags": []
},
"execution_count": 50
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "MBZ3va2Dx9DE",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 111
},
"outputId": "816493b6-37fd-4fdc-d7e6-8d5710483d72"
},
"source": [
"# Get the first 2 rows\n",
"\n",
"df.head(2)"
],
"execution_count": 51,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Customer</th>\n",
" <th>Origin</th>\n",
" <th>Destination</th>\n",
" <th>Distance</th>\n",
" <th>Price</th>\n",
" <th>Rate</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>tom</td>\n",
" <td>Tebet</td>\n",
" <td>Gondangdia</td>\n",
" <td>4.5</td>\n",
" <td>20000</td>\n",
" <td>4444.444444</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>nick</td>\n",
" <td>Gondangdia</td>\n",
" <td>Mampang</td>\n",
" <td>5.0</td>\n",
" <td>23000</td>\n",
" <td>4600.000000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Customer Origin Destination Distance Price Rate\n",
"0 tom Tebet Gondangdia 4.5 20000 4444.444444\n",
"1 nick Gondangdia Mampang 5.0 23000 4600.000000"
]
},
"metadata": {
"tags": []
},
"execution_count": 51
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "YqHYHEVqyM3A",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
},
"outputId": "fb6a4a63-4e39-4c78-c82e-dab62e76c797"
},
"source": [
"# Get the last 5 rows\n",
"\n",
"df.tail()"
],
"execution_count": 32,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Customer</th>\n",
" <th>Origin</th>\n",
" <th>Destination</th>\n",
" <th>Distance</th>\n",
" <th>Price</th>\n",
" <th>Rate</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>tom</td>\n",
" <td>Kuningan</td>\n",
" <td>Pancoran</td>\n",
" <td>4.7</td>\n",
" <td>20000</td>\n",
" <td>4255.319149</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>jack</td>\n",
" <td>Pancoran</td>\n",
" <td>Mampang</td>\n",
" <td>3.4</td>\n",
" <td>15000</td>\n",
" <td>4411.764706</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>krish</td>\n",
" <td>Tebet</td>\n",
" <td>SCBD</td>\n",
" <td>4.0</td>\n",
" <td>18500</td>\n",
" <td>4625.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>nick</td>\n",
" <td>Mampang</td>\n",
" <td>SCBD</td>\n",
" <td>2.2</td>\n",
" <td>9000</td>\n",
" <td>4090.909091</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>nick</td>\n",
" <td>Tebet</td>\n",
" <td>Kuningan</td>\n",
" <td>2.9</td>\n",
" <td>11000</td>\n",
" <td>3793.103448</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Customer Origin Destination Distance Price Rate\n",
"5 tom Kuningan Pancoran 4.7 20000 4255.319149\n",
"6 jack Pancoran Mampang 3.4 15000 4411.764706\n",
"7 krish Tebet SCBD 4.0 18500 4625.000000\n",
"8 nick Mampang SCBD 2.2 9000 4090.909091\n",
"9 nick Tebet Kuningan 2.9 11000 3793.103448"
]
},
"metadata": {
"tags": []
},
"execution_count": 32
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8ej7KGsR-JNZ",
"colab_type": "text"
},
"source": [
"##### Delete Column"
]
},
{
"cell_type": "code",
"metadata": {
"id": "I-Zo-debyQ4i",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
},
"outputId": "0109b8ef-45d1-44f9-add0-19e498d33b39"
},
"source": [
"# delete column\n",
"\n",
"df = df.drop(columns=['Rate'])\n",
"\n",
"df.head()"
],
"execution_count": 52,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Customer</th>\n",
" <th>Origin</th>\n",
" <th>Destination</th>\n",
" <th>Distance</th>\n",
" <th>Price</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>tom</td>\n",
" <td>Tebet</td>\n",
" <td>Gondangdia</td>\n",
" <td>4.5</td>\n",
" <td>20000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>nick</td>\n",
" <td>Gondangdia</td>\n",
" <td>Mampang</td>\n",
" <td>5.0</td>\n",
" <td>23000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>krish</td>\n",
" <td>Mampang</td>\n",
" <td>SCBD</td>\n",
" <td>3.0</td>\n",
" <td>14000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>jack</td>\n",
" <td>SCBD</td>\n",
" <td>Kuningan</td>\n",
" <td>4.8</td>\n",
" <td>24500</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>jack</td>\n",
" <td>SCBD</td>\n",
" <td>Pancoran</td>\n",
" <td>2.2</td>\n",
" <td>7000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Customer Origin Destination Distance Price\n",
"0 tom Tebet Gondangdia 4.5 20000\n",
"1 nick Gondangdia Mampang 5.0 23000\n",
"2 krish Mampang SCBD 3.0 14000\n",
"3 jack SCBD Kuningan 4.8 24500\n",
"4 jack SCBD Pancoran 2.2 7000"
]
},
"metadata": {
"tags": []
},
"execution_count": 52
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "pg3Y9mBe-O_B",
"colab_type": "text"
},
"source": [
"##### Rename Column"
]
},
{
"cell_type": "code",
"metadata": {
"id": "gQ-NibqmyjlK",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 359
},
"outputId": "21ad0270-0cbe-4423-a611-3914203a74ec"
},
"source": [
"# rename column\n",
"\n",
"df = df.rename(columns={'Price': 'Cost',\n",
"                        'Distance': 'KM'})\n",
"\n",
"df"
],
"execution_count": 53,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Customer</th>\n",
" <th>Origin</th>\n",
" <th>Destination</th>\n",
" <th>KM</th>\n",
" <th>Cost</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>tom</td>\n",
" <td>Tebet</td>\n",
" <td>Gondangdia</td>\n",
" <td>4.5</td>\n",
" <td>20000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>nick</td>\n",
" <td>Gondangdia</td>\n",
" <td>Mampang</td>\n",
" <td>5.0</td>\n",
" <td>23000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>krish</td>\n",
" <td>Mampang</td>\n",
" <td>SCBD</td>\n",
" <td>3.0</td>\n",
" <td>14000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>jack</td>\n",
" <td>SCBD</td>\n",
" <td>Kuningan</td>\n",
" <td>4.8</td>\n",
" <td>24500</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>jack</td>\n",
" <td>SCBD</td>\n",
" <td>Pancoran</td>\n",
" <td>2.2</td>\n",
" <td>7000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>tom</td>\n",
" <td>Kuningan</td>\n",
" <td>Pancoran</td>\n",
" <td>4.7</td>\n",
" <td>20000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>jack</td>\n",
" <td>Pancoran</td>\n",
" <td>Mampang</td>\n",
" <td>3.4</td>\n",
" <td>15000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>krish</td>\n",
" <td>Tebet</td>\n",
" <td>SCBD</td>\n",
" <td>4.0</td>\n",
" <td>18500</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>nick</td>\n",
" <td>Mampang</td>\n",
" <td>SCBD</td>\n",
" <td>2.2</td>\n",
" <td>9000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>nick</td>\n",
" <td>Tebet</td>\n",
" <td>Kuningan</td>\n",
" <td>2.9</td>\n",
" <td>11000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Customer Origin Destination KM Cost\n",
"0 tom Tebet Gondangdia 4.5 20000\n",
"1 nick Gondangdia Mampang 5.0 23000\n",
"2 krish Mampang SCBD 3.0 14000\n",
"3 jack SCBD Kuningan 4.8 24500\n",
"4 jack SCBD Pancoran 2.2 7000\n",
"5 tom Kuningan Pancoran 4.7 20000\n",
"6 jack Pancoran Mampang 3.4 15000\n",
"7 krish Tebet SCBD 4.0 18500\n",
"8 nick Mampang SCBD 2.2 9000\n",
"9 nick Tebet Kuningan 2.9 11000"
]
},
"metadata": {
"tags": []
},
"execution_count": 53
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8FSyRs0O1Ren",
"colab_type": "text"
},
"source": [
"#### Describe DataFrame"
]
},
{
"cell_type": "code",
"metadata": {
"id": "pt5XjRsu080M",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 187
},
"outputId": "2072bd4d-0dc4-475e-ec2c-08fced908470"
},
"source": [
"df.info()"
],
"execution_count": 55,
"outputs": [
{
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 10 entries, 0 to 9\n",
"Data columns (total 5 columns):\n",
"Customer 10 non-null object\n",
"Origin 10 non-null object\n",
"Destination 10 non-null object\n",
"KM 10 non-null float64\n",
"Cost 10 non-null int64\n",
"dtypes: float64(1), int64(1), object(3)\n",
"memory usage: 528.0+ bytes\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "xHbnFoER1X_y",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 297
},
"outputId": "bc8bd3af-f122-40b1-a6b3-bec11899541d"
},
"source": [
"df.describe()"
],
"execution_count": 56,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>KM</th>\n",
" <th>Cost</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>count</th>\n",
" <td>10.000000</td>\n",
" <td>10.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>3.670000</td>\n",
" <td>16200.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>1.071914</td>\n",
" <td>5954.456781</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>2.200000</td>\n",
" <td>7000.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25%</th>\n",
" <td>2.925000</td>\n",
" <td>11750.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>3.700000</td>\n",
" <td>16750.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75%</th>\n",
" <td>4.650000</td>\n",
" <td>20000.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>5.000000</td>\n",
" <td>24500.000000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" KM Cost\n",
"count 10.000000 10.000000\n",
"mean 3.670000 16200.000000\n",
"std 1.071914 5954.456781\n",
"min 2.200000 7000.000000\n",
"25% 2.925000 11750.000000\n",
"50% 3.700000 16750.000000\n",
"75% 4.650000 20000.000000\n",
"max 5.000000 24500.000000"
]
},
"metadata": {
"tags": []
},
"execution_count": 56
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "qKPrUDjC3EiW",
"colab_type": "text"
},
"source": [
"#### Aggregate DataFrame"
]
},
{
"cell_type": "code",
"metadata": {
"id": "dAe9Vw0L3IRc",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 359
},
"outputId": "e99f3852-8f62-40a2-b6ea-1ea323c8abb4"
},
"source": [
"df"
],
"execution_count": 57,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Customer</th>\n",
" <th>Origin</th>\n",
" <th>Destination</th>\n",
" <th>KM</th>\n",
" <th>Cost</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>tom</td>\n",
" <td>Tebet</td>\n",
" <td>Gondangdia</td>\n",
" <td>4.5</td>\n",
" <td>20000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>nick</td>\n",
" <td>Gondangdia</td>\n",
" <td>Mampang</td>\n",
" <td>5.0</td>\n",
" <td>23000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>krish</td>\n",
" <td>Mampang</td>\n",
" <td>SCBD</td>\n",
" <td>3.0</td>\n",
" <td>14000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>jack</td>\n",
" <td>SCBD</td>\n",
" <td>Kuningan</td>\n",
" <td>4.8</td>\n",
" <td>24500</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>jack</td>\n",
" <td>SCBD</td>\n",
" <td>Pancoran</td>\n",
" <td>2.2</td>\n",
" <td>7000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>tom</td>\n",
" <td>Kuningan</td>\n",
" <td>Pancoran</td>\n",
" <td>4.7</td>\n",
" <td>20000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>jack</td>\n",
" <td>Pancoran</td>\n",
" <td>Mampang</td>\n",
" <td>3.4</td>\n",
" <td>15000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>krish</td>\n",
" <td>Tebet</td>\n",
" <td>SCBD</td>\n",
" <td>4.0</td>\n",
" <td>18500</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>nick</td>\n",
" <td>Mampang</td>\n",
" <td>SCBD</td>\n",
" <td>2.2</td>\n",
" <td>9000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>nick</td>\n",
" <td>Tebet</td>\n",
" <td>Kuningan</td>\n",
" <td>2.9</td>\n",
" <td>11000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Customer Origin Destination KM Cost\n",
"0 tom Tebet Gondangdia 4.5 20000\n",
"1 nick Gondangdia Mampang 5.0 23000\n",
"2 krish Mampang SCBD 3.0 14000\n",
"3 jack SCBD Kuningan 4.8 24500\n",
"4 jack SCBD Pancoran 2.2 7000\n",
"5 tom Kuningan Pancoran 4.7 20000\n",
"6 jack Pancoran Mampang 3.4 15000\n",
"7 krish Tebet SCBD 4.0 18500\n",
"8 nick Mampang SCBD 2.2 9000\n",
"9 nick Tebet Kuningan 2.9 11000"
]
},
"metadata": {
"tags": []
},
"execution_count": 57
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "8RqUotR53SDV",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 173
},
"outputId": "0c2e99bf-090e-4bf3-e627-6228c25440c5"
},
"source": [
"df.groupby('Customer', as_index=False)['KM'].count()"
],
"execution_count": 58,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Customer</th>\n",
" <th>KM</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>jack</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>krish</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>nick</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>tom</td>\n",
" <td>2</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Customer KM\n",
"0 jack 3\n",
"1 krish 2\n",
"2 nick 3\n",
"3 tom 2"
]
},
"metadata": {
"tags": []
},
"execution_count": 58
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "tPt1Bvf63fv5",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 173
},
"outputId": "019cb9a8-1c84-473a-b3d9-45fc264bb094"
},
"source": [
"df.groupby('Customer', as_index=False)['KM'].sum()"
],
"execution_count": 59,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Customer</th>\n",
" <th>KM</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>jack</td>\n",
" <td>10.4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>krish</td>\n",
" <td>7.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>nick</td>\n",
" <td>10.1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>tom</td>\n",
" <td>9.2</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Customer KM\n",
"0 jack 10.4\n",
"1 krish 7.0\n",
"2 nick 10.1\n",
"3 tom 9.2"
]
},
"metadata": {
"tags": []
},
"execution_count": 59
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "B0V7ybTi4ZbE",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 173
},
"outputId": "4651010a-f6c5-43de-afd2-6cb6b1d40a68"
},
"source": [
"df.groupby('Customer', as_index=False)['KM'].mean()"
],
"execution_count": 60,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Customer</th>\n",
" <th>KM</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>jack</td>\n",
" <td>3.466667</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>krish</td>\n",
" <td>3.500000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>nick</td>\n",
" <td>3.366667</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>tom</td>\n",
" <td>4.600000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Customer KM\n",
"0 jack 3.466667\n",
"1 krish 3.500000\n",
"2 nick 3.366667\n",
"3 tom 4.600000"
]
},
"metadata": {
"tags": []
},
"execution_count": 60
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "_Ml0zW5O46yL",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 173
},
"outputId": "719138f2-3e5f-46cb-cd4d-6d98c8be0d64"
},
"source": [
"df.groupby('Customer', as_index=False)['KM'].min()"
],
"execution_count": 62,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Customer</th>\n",
" <th>KM</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>jack</td>\n",
" <td>2.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>krish</td>\n",
" <td>3.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>nick</td>\n",
" <td>2.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>tom</td>\n",
" <td>4.5</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Customer KM\n",
"0 jack 2.2\n",
"1 krish 3.0\n",
"2 nick 2.2\n",
"3 tom 4.5"
]
},
"metadata": {
"tags": []
},
"execution_count": 62
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "wmP-nXvU_s61",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 359
},
"outputId": "daaae33e-45a3-40d3-d618-a182146b9ba4"
},
"source": [
"df.sort_values('Cost')"
],
"execution_count": 83,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Customer</th>\n",
" <th>Origin</th>\n",
" <th>Destination</th>\n",
" <th>KM</th>\n",
" <th>Cost</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>jack</td>\n",
" <td>SCBD</td>\n",
" <td>Pancoran</td>\n",
" <td>2.2</td>\n",
" <td>7000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>nick</td>\n",
" <td>Mampang</td>\n",
" <td>SCBD</td>\n",
" <td>2.2</td>\n",
" <td>9000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>nick</td>\n",
" <td>Tebet</td>\n",
" <td>Kuningan</td>\n",
" <td>2.9</td>\n",
" <td>11000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>krish</td>\n",
" <td>Mampang</td>\n",
" <td>SCBD</td>\n",
" <td>3.0</td>\n",
" <td>14000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>jack</td>\n",
" <td>Pancoran</td>\n",
" <td>Mampang</td>\n",
" <td>3.4</td>\n",
" <td>15000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>krish</td>\n",
" <td>Tebet</td>\n",
" <td>SCBD</td>\n",
" <td>4.0</td>\n",
" <td>18500</td>\n",
" </tr>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>tom</td>\n",
" <td>Tebet</td>\n",
" <td>Gondangdia</td>\n",
" <td>4.5</td>\n",
" <td>20000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>tom</td>\n",
" <td>Kuningan</td>\n",
" <td>Pancoran</td>\n",
" <td>4.7</td>\n",
" <td>20000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>nick</td>\n",
" <td>Gondangdia</td>\n",
" <td>Mampang</td>\n",
" <td>5.0</td>\n",
" <td>23000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>jack</td>\n",
" <td>SCBD</td>\n",
" <td>Kuningan</td>\n",
" <td>4.8</td>\n",
" <td>24500</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Customer Origin Destination KM Cost\n",
"4 jack SCBD Pancoran 2.2 7000\n",
"8 nick Mampang SCBD 2.2 9000\n",
"9 nick Tebet Kuningan 2.9 11000\n",
"2 krish Mampang SCBD 3.0 14000\n",
"6 jack Pancoran Mampang 3.4 15000\n",
"7 krish Tebet SCBD 4.0 18500\n",
"0 tom Tebet Gondangdia 4.5 20000\n",
"5 tom Kuningan Pancoran 4.7 20000\n",
"1 nick Gondangdia Mampang 5.0 23000\n",
"3 jack SCBD Kuningan 4.8 24500"
]
},
"metadata": {
"tags": []
},
"execution_count": 83
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "UJCfdf-9_9Hm",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 359
},
"outputId": "0d2695b5-e8d1-4afc-8a5f-3a7755831f72"
},
"source": [
"df.sort_values('Cost', ascending=False)"
],
"execution_count": 84,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Customer</th>\n",
" <th>Origin</th>\n",
" <th>Destination</th>\n",
" <th>KM</th>\n",
" <th>Cost</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>jack</td>\n",
" <td>SCBD</td>\n",
" <td>Kuningan</td>\n",
" <td>4.8</td>\n",
" <td>24500</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>nick</td>\n",
" <td>Gondangdia</td>\n",
" <td>Mampang</td>\n",
" <td>5.0</td>\n",
" <td>23000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>tom</td>\n",
" <td>Tebet</td>\n",
" <td>Gondangdia</td>\n",
" <td>4.5</td>\n",
" <td>20000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>tom</td>\n",
" <td>Kuningan</td>\n",
" <td>Pancoran</td>\n",
" <td>4.7</td>\n",
" <td>20000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>krish</td>\n",
" <td>Tebet</td>\n",
" <td>SCBD</td>\n",
" <td>4.0</td>\n",
" <td>18500</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>jack</td>\n",
" <td>Pancoran</td>\n",
" <td>Mampang</td>\n",
" <td>3.4</td>\n",
" <td>15000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>krish</td>\n",
" <td>Mampang</td>\n",
" <td>SCBD</td>\n",
" <td>3.0</td>\n",
" <td>14000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>nick</td>\n",
" <td>Tebet</td>\n",
" <td>Kuningan</td>\n",
" <td>2.9</td>\n",
" <td>11000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>nick</td>\n",
" <td>Mampang</td>\n",
" <td>SCBD</td>\n",
" <td>2.2</td>\n",
" <td>9000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>jack</td>\n",
" <td>SCBD</td>\n",
" <td>Pancoran</td>\n",
" <td>2.2</td>\n",
" <td>7000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Customer Origin Destination KM Cost\n",
"3 jack SCBD Kuningan 4.8 24500\n",
"1 nick Gondangdia Mampang 5.0 23000\n",
"0 tom Tebet Gondangdia 4.5 20000\n",
"5 tom Kuningan Pancoran 4.7 20000\n",
"7 krish Tebet SCBD 4.0 18500\n",
"6 jack Pancoran Mampang 3.4 15000\n",
"2 krish Mampang SCBD 3.0 14000\n",
"9 nick Tebet Kuningan 2.9 11000\n",
"8 nick Mampang SCBD 2.2 9000\n",
"4 jack SCBD Pancoran 2.2 7000"
]
},
"metadata": {
"tags": []
},
"execution_count": 84
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "hd36gUPoAOp4",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 80
},
"outputId": "8a062b0b-ef17-4678-d4fc-e1224bcd90ac"
},
"source": [
"df.groupby('Customer', as_index=False)['KM'].min() \\\n",
" .sort_values('KM') \\\n",
" .rename(columns={'KM': 'Total'}) \\\n",
" .head(1)"
],
"execution_count": 85,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Customer</th>\n",
" <th>Total</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>jack</td>\n",
" <td>2.2</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Customer Total\n",
"0 jack 2.2"
]
},
"metadata": {
"tags": []
},
"execution_count": 85
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "cDfnms2o6TF4",
"colab_type": "code",
"colab": {}
},
"source": [
"data_bike = pd.read_csv('http://bit.ly/dwp-data-bike')"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "lzKJJfPCYMrv",
"colab_type": "code",
"outputId": "4c49e75c-c107-481f-eab8-bfa2bf0288d3",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 255
}
},
"source": [
"data_bike.info()"
],
"execution_count": 6,
"outputs": [
{
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 46230 entries, 0 to 46229\n",
"Data columns (total 9 columns):\n",
"tripduration 46230 non-null int64\n",
"starttime 46230 non-null object\n",
"stoptime 46230 non-null object\n",
"start station name 46230 non-null object\n",
"end station name 46230 non-null object\n",
"bikeid 46230 non-null int64\n",
"usertype 46230 non-null object\n",
"birth year 46230 non-null int64\n",
"gender 46230 non-null int64\n",
"dtypes: int64(4), object(5)\n",
"memory usage: 3.2+ MB\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "pv1pjKOumU82",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 111
},
"outputId": "8b954125-2de4-448d-94f3-534a447d6932"
},
"source": [
"data_bike.groupby('usertype', as_index=False)['age'].count().head()"
],
"execution_count": 9,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>usertype</th>\n",
" <th>age</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Customer</td>\n",
" <td>6149</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Subscriber</td>\n",
" <td>40081</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" usertype age\n",
"0 Customer 6149\n",
"1 Subscriber 40081"
]
},
"metadata": {
"tags": []
},
"execution_count": 9
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "2dY3Q55vBjAo",
"colab_type": "code",
"outputId": "5a57f69f-31c4-44de-87ad-474cf78226f6",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"data_bike['usertype'].unique()"
],
"execution_count": 11,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array(['Subscriber', 'Customer'], dtype=object)"
]
},
"metadata": {
"tags": []
},
"execution_count": 11
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "ugsGBjtCDwLe",
"colab_type": "code",
"outputId": "43df462b-0309-47fb-d1bc-e90cad651ad0",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 68
}
},
"source": [
"data_bike['usertype'].value_counts()"
],
"execution_count": 12,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"Subscriber 40081\n",
"Customer 6149\n",
"Name: usertype, dtype: int64"
]
},
"metadata": {
"tags": []
},
"execution_count": 12
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "cUbyfCJLFCG1",
"colab_type": "text"
},
"source": [
"#### Export & Import File"
]
},
{
"cell_type": "code",
"metadata": {
"id": "Eut8ZLJOFJHY",
"colab_type": "code",
"colab": {}
},
"source": [
"df.to_excel('data.xlsx', index=False)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "EOS98kM1Fnej",
"colab_type": "code",
"colab": {}
},
"source": [
"data_excel = pd.read_excel('data.xlsx', sheet_name='Sheet2')"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "VInFs805F0lO",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 80
},
"outputId": "a73dac4d-4dc3-4d14-aee2-5a5826354260"
},
"source": [
"data_excel"
],
"execution_count": 113,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>nama</th>\n",
" <th>umur</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>joko</td>\n",
" <td>25</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" nama umur\n",
"0 joko 25"
]
},
"metadata": {
"tags": []
},
"execution_count": 113
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "_GsHLC6kYGvK",
"colab_type": "text"
},
"source": [
"#### Data Type Dataframe"
]
},
{
"cell_type": "code",
"metadata": {
"id": "X_22TxQgD9rS",
"colab_type": "code",
"outputId": "17798ade-5aa5-4325-cad5-9cc8f0a5bf5b",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 272
}
},
"source": [
"data_bike.info()"
],
"execution_count": 13,
"outputs": [
{
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 46230 entries, 0 to 46229\n",
"Data columns (total 10 columns):\n",
"tripduration 46230 non-null int64\n",
"starttime 46230 non-null object\n",
"stoptime 46230 non-null object\n",
"start station name 46230 non-null object\n",
"end station name 46230 non-null object\n",
"bikeid 46230 non-null int64\n",
"usertype 46230 non-null object\n",
"birth year 46230 non-null int64\n",
"gender 46230 non-null int64\n",
"age 46230 non-null int64\n",
"dtypes: int64(5), object(5)\n",
"memory usage: 3.5+ MB\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "Wuj-atBz9d5k",
"colab_type": "code",
"outputId": "b035ac1e-3c52-4de4-cc5d-02baefbd5559",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
}
},
"source": [
"data_bike.head()"
],
"execution_count": 14,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>tripduration</th>\n",
" <th>starttime</th>\n",
" <th>stoptime</th>\n",
" <th>start station name</th>\n",
" <th>end station name</th>\n",
" <th>bikeid</th>\n",
" <th>usertype</th>\n",
" <th>birth year</th>\n",
" <th>gender</th>\n",
" <th>age</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>261</td>\n",
" <td>2019-08-01 00:14:55.9900</td>\n",
" <td>2019-08-01 00:19:17.4780</td>\n",
" <td>JC Medical Center</td>\n",
" <td>Liberty Light Rail</td>\n",
" <td>26268</td>\n",
" <td>Subscriber</td>\n",
" <td>1980</td>\n",
" <td>1</td>\n",
" <td>40</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>172</td>\n",
" <td>2019-08-01 00:23:06.9910</td>\n",
" <td>2019-08-01 00:25:59.1480</td>\n",
" <td>Dixon Mills</td>\n",
" <td>Grove St PATH</td>\n",
" <td>26162</td>\n",
" <td>Subscriber</td>\n",
" <td>1996</td>\n",
" <td>1</td>\n",
" <td>24</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>525</td>\n",
" <td>2019-08-01 00:23:28.6170</td>\n",
" <td>2019-08-01 00:32:13.7000</td>\n",
" <td>Newport Pkwy</td>\n",
" <td>Hamilton Park</td>\n",
" <td>29279</td>\n",
" <td>Subscriber</td>\n",
" <td>1991</td>\n",
" <td>1</td>\n",
" <td>29</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>219</td>\n",
" <td>2019-08-01 00:32:36.1410</td>\n",
" <td>2019-08-01 00:36:15.2730</td>\n",
" <td>Warren St</td>\n",
" <td>City Hall</td>\n",
" <td>29598</td>\n",
" <td>Subscriber</td>\n",
" <td>1988</td>\n",
" <td>1</td>\n",
" <td>32</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>262</td>\n",
" <td>2019-08-01 00:41:26.6700</td>\n",
" <td>2019-08-01 00:45:49.3530</td>\n",
" <td>Grove St PATH</td>\n",
" <td>Jersey &amp; 3rd</td>\n",
" <td>26162</td>\n",
" <td>Subscriber</td>\n",
" <td>1960</td>\n",
" <td>1</td>\n",
" <td>60</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" tripduration starttime ... gender age\n",
"0 261 2019-08-01 00:14:55.9900 ... 1 40\n",
"1 172 2019-08-01 00:23:06.9910 ... 1 24\n",
"2 525 2019-08-01 00:23:28.6170 ... 1 29\n",
"3 219 2019-08-01 00:32:36.1410 ... 1 32\n",
"4 262 2019-08-01 00:41:26.6700 ... 1 60\n",
"\n",
"[5 rows x 10 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 14
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "U2OT5K3kYb1H",
"colab_type": "code",
"colab": {}
},
"source": [
"data_bike['tripduration'] = data_bike['tripduration'].astype(str)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"outputId": "d2280c21-ca24-4e7a-ffa1-7accd2b675c6",
"id": "QB8NSZKKp0r5",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 272
}
},
"source": [
"data_bike.info()"
],
"execution_count": 16,
"outputs": [
{
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 46230 entries, 0 to 46229\n",
"Data columns (total 10 columns):\n",
"tripduration 46230 non-null object\n",
"starttime 46230 non-null object\n",
"stoptime 46230 non-null object\n",
"start station name 46230 non-null object\n",
"end station name 46230 non-null object\n",
"bikeid 46230 non-null int64\n",
"usertype 46230 non-null object\n",
"birth year 46230 non-null int64\n",
"gender 46230 non-null int64\n",
"age 46230 non-null int64\n",
"dtypes: int64(4), object(6)\n",
"memory usage: 3.5+ MB\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "mpfPvC5cXsyr",
"colab_type": "code",
"colab": {}
},
"source": [
"data_bike['tripduration'] = data_bike['tripduration'].astype(int)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "vdcWZrrvYufJ",
"colab_type": "code",
"outputId": "46efcbb8-f769-4e6b-f043-86ca8cc3766d",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 272
}
},
"source": [
"data_bike.info()"
],
"execution_count": 18,
"outputs": [
{
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 46230 entries, 0 to 46229\n",
"Data columns (total 10 columns):\n",
"tripduration 46230 non-null int64\n",
"starttime 46230 non-null object\n",
"stoptime 46230 non-null object\n",
"start station name 46230 non-null object\n",
"end station name 46230 non-null object\n",
"bikeid 46230 non-null int64\n",
"usertype 46230 non-null object\n",
"birth year 46230 non-null int64\n",
"gender 46230 non-null int64\n",
"age 46230 non-null int64\n",
"dtypes: int64(5), object(5)\n",
"memory usage: 3.5+ MB\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "rfhXu3bCq0uG",
"colab_type": "text"
},
"source": [
"##### Datatime Data Type"
]
},
{
"cell_type": "code",
"metadata": {
"id": "Qn_EcC4yqihU",
"colab_type": "code",
"colab": {}
},
"source": [
"# Datatime Data Type\n",
"\n",
"data_bike['starttime'] = pd.to_datetime(data_bike['starttime'])"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "OXKwzWR3qkyV",
"colab_type": "code",
"colab": {}
},
"source": [
"data_bike['year'] = data_bike['starttime'].dt.year"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "hcsGuQx0ZuC8",
"colab_type": "code",
"outputId": "63a23f01-cb0a-4ec0-8ce1-e97045419ff8",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 306
}
},
"source": [
"data_bike.sample(5)"
],
"execution_count": 22,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>tripduration</th>\n",
" <th>starttime</th>\n",
" <th>stoptime</th>\n",
" <th>start station name</th>\n",
" <th>end station name</th>\n",
" <th>bikeid</th>\n",
" <th>usertype</th>\n",
" <th>birth year</th>\n",
" <th>gender</th>\n",
" <th>age</th>\n",
" <th>year</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>20820</th>\n",
" <td>251</td>\n",
" <td>2019-08-14 19:50:32.421</td>\n",
" <td>2019-08-14 19:54:43.4890</td>\n",
" <td>Grove St PATH</td>\n",
" <td>Warren St</td>\n",
" <td>26230</td>\n",
" <td>Subscriber</td>\n",
" <td>1989</td>\n",
" <td>1</td>\n",
" <td>31</td>\n",
" <td>2019</td>\n",
" </tr>\n",
" <tr>\n",
" <th>23879</th>\n",
" <td>637</td>\n",
" <td>2019-08-16 17:01:57.191</td>\n",
" <td>2019-08-16 17:12:34.2710</td>\n",
" <td>Journal Square</td>\n",
" <td>Leonard Gordon Park</td>\n",
" <td>26431</td>\n",
" <td>Subscriber</td>\n",
" <td>1968</td>\n",
" <td>1</td>\n",
" <td>52</td>\n",
" <td>2019</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12109</th>\n",
" <td>236</td>\n",
" <td>2019-08-09 06:22:30.600</td>\n",
" <td>2019-08-09 06:26:27.0320</td>\n",
" <td>Brunswick St</td>\n",
" <td>Grove St PATH</td>\n",
" <td>29680</td>\n",
" <td>Subscriber</td>\n",
" <td>1998</td>\n",
" <td>1</td>\n",
" <td>22</td>\n",
" <td>2019</td>\n",
" </tr>\n",
" <tr>\n",
" <th>45845</th>\n",
" <td>101</td>\n",
" <td>2019-08-31 17:05:19.420</td>\n",
" <td>2019-08-31 17:07:00.4840</td>\n",
" <td>Newport PATH</td>\n",
" <td>Washington St</td>\n",
" <td>29205</td>\n",
" <td>Subscriber</td>\n",
" <td>1993</td>\n",
" <td>1</td>\n",
" <td>27</td>\n",
" <td>2019</td>\n",
" </tr>\n",
" <tr>\n",
" <th>32532</th>\n",
" <td>300</td>\n",
" <td>2019-08-22 17:28:37.461</td>\n",
" <td>2019-08-22 17:33:38.2110</td>\n",
" <td>Grove St PATH</td>\n",
" <td>Brunswick &amp; 6th</td>\n",
" <td>29508</td>\n",
" <td>Subscriber</td>\n",
" <td>1982</td>\n",
" <td>1</td>\n",
" <td>38</td>\n",
" <td>2019</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" tripduration starttime ... age year\n",
"20820 251 2019-08-14 19:50:32.421 ... 31 2019\n",
"23879 637 2019-08-16 17:01:57.191 ... 52 2019\n",
"12109 236 2019-08-09 06:22:30.600 ... 22 2019\n",
"45845 101 2019-08-31 17:05:19.420 ... 27 2019\n",
"32532 300 2019-08-22 17:28:37.461 ... 38 2019\n",
"\n",
"[5 rows x 11 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 22
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "gtWzPN2jrIiu",
"colab_type": "text"
},
"source": [
"##### Custom Format Datetime"
]
},
{
"cell_type": "code",
"metadata": {
"id": "ILy8LZZbZzEe",
"colab_type": "code",
"outputId": "02caad12-cad7-4687-fec0-3cf629eedaf5",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 221
}
},
"source": [
"# Custom Format Datetime\n",
"# strftime.org\n",
"\n",
"data_bike['starttime'].dt.strftime('%A / %B')"
],
"execution_count": 23,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"0 Thursday / August\n",
"1 Thursday / August\n",
"2 Thursday / August\n",
"3 Thursday / August\n",
"4 Thursday / August\n",
" ... \n",
"46225 Saturday / August\n",
"46226 Saturday / August\n",
"46227 Saturday / August\n",
"46228 Saturday / August\n",
"46229 Saturday / August\n",
"Name: starttime, Length: 46230, dtype: object"
]
},
"metadata": {
"tags": []
},
"execution_count": 23
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "JeAjuUXbjYEn",
"colab_type": "text"
},
"source": [
"#### Fill N/A"
]
},
{
"cell_type": "code",
"metadata": {
"id": "eUJbM-vZj44Z",
"colab_type": "code",
"colab": {}
},
"source": [
"data_ojol = {'Customer':['tom', 'nick', 'krish', 'jack', 'jack', 'tom', 'jack', 'krish', 'nick', 'nick'],\n",
" 'Origin':['Tebet', 'Gondangdia', 'Mampang', 'SCBD', 'SCBD', 'Kuningan', 'Pancoran', 'Tebet', 'Mampang', 'Tebet'],\n",
" 'Destination':['Gondangdia', 'Mampang', 'SCBD', 'Kuningan', 'Pancoran', 'Pancoran', 'Mampang', 'SCBD', 'SCBD', 'Kuningan'],\n",
" 'Distance': [None, None, 3.0, 4.8, 2.2, 4.7, 3.4, 4.0, 2.2, 2.9],\n",
" 'Price': [20000, 23000, 14000, 24500, 7000, 20000, 15000, 18500, 9000, 11000]}\n",
" \n",
"data_ojol = pd.DataFrame(data_ojol)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "bMeKxo1ij7RV",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 187
},
"outputId": "bb375a49-544a-4b93-fb03-aa3ac7e51143"
},
"source": [
"data_ojol.info()"
],
"execution_count": 167,
"outputs": [
{
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 10 entries, 0 to 9\n",
"Data columns (total 5 columns):\n",
"Customer 10 non-null object\n",
"Origin 10 non-null object\n",
"Destination 10 non-null object\n",
"Distance 8 non-null float64\n",
"Price 10 non-null int64\n",
"dtypes: float64(1), int64(1), object(3)\n",
"memory usage: 528.0+ bytes\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "8GbMooJPkH9N",
"colab_type": "code",
"colab": {}
},
"source": [
"data_ojol['Distance'] = data_ojol['Distance'].fillna(0)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "0viIa4CNks3q",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 359
},
"outputId": "5e5f411a-b18a-433d-f8ff-730815eacfb3"
},
"source": [
"data_ojol"
],
"execution_count": 171,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Customer</th>\n",
" <th>Origin</th>\n",
" <th>Destination</th>\n",
" <th>Distance</th>\n",
" <th>Price</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>tom</td>\n",
" <td>Tebet</td>\n",
" <td>Gondangdia</td>\n",
" <td>0.0</td>\n",
" <td>20000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>nick</td>\n",
" <td>Gondangdia</td>\n",
" <td>Mampang</td>\n",
" <td>0.0</td>\n",
" <td>23000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>krish</td>\n",
" <td>Mampang</td>\n",
" <td>SCBD</td>\n",
" <td>3.0</td>\n",
" <td>14000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>jack</td>\n",
" <td>SCBD</td>\n",
" <td>Kuningan</td>\n",
" <td>4.8</td>\n",
" <td>24500</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>jack</td>\n",
" <td>SCBD</td>\n",
" <td>Pancoran</td>\n",
" <td>2.2</td>\n",
" <td>7000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>tom</td>\n",
" <td>Kuningan</td>\n",
" <td>Pancoran</td>\n",
" <td>4.7</td>\n",
" <td>20000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>jack</td>\n",
" <td>Pancoran</td>\n",
" <td>Mampang</td>\n",
" <td>3.4</td>\n",
" <td>15000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>krish</td>\n",
" <td>Tebet</td>\n",
" <td>SCBD</td>\n",
" <td>4.0</td>\n",
" <td>18500</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>nick</td>\n",
" <td>Mampang</td>\n",
" <td>SCBD</td>\n",
" <td>2.2</td>\n",
" <td>9000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>nick</td>\n",
" <td>Tebet</td>\n",
" <td>Kuningan</td>\n",
" <td>2.9</td>\n",
" <td>11000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Customer Origin Destination Distance Price\n",
"0 tom Tebet Gondangdia 0.0 20000\n",
"1 nick Gondangdia Mampang 0.0 23000\n",
"2 krish Mampang SCBD 3.0 14000\n",
"3 jack SCBD Kuningan 4.8 24500\n",
"4 jack SCBD Pancoran 2.2 7000\n",
"5 tom Kuningan Pancoran 4.7 20000\n",
"6 jack Pancoran Mampang 3.4 15000\n",
"7 krish Tebet SCBD 4.0 18500\n",
"8 nick Mampang SCBD 2.2 9000\n",
"9 nick Tebet Kuningan 2.9 11000"
]
},
"metadata": {
"tags": []
},
"execution_count": 171
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "EUkpHinik65z",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"outputId": "6e7780ef-02cc-49db-b939-0317555356fc"
},
"source": [
"data_ojol['Distance'].mean()"
],
"execution_count": 172,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"2.7199999999999998"
]
},
"metadata": {
"tags": []
},
"execution_count": 172
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "ugBG3tlglGI4",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"outputId": "968b7d81-2d24-455d-c342-669c8e519542"
},
"source": [
"data_ojol['Price'].max()"
],
"execution_count": 173,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"24500"
]
},
"metadata": {
"tags": []
},
"execution_count": 173
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "iWhZ-J7XnXPI",
"colab_type": "text"
},
"source": [
"#### Change Value Dataframe"
]
},
{
"cell_type": "code",
"metadata": {
"id": "ZSUW_Uu_nUxQ",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 546
},
"outputId": "07c00e1d-e8d0-4dba-ef03-48979fe6db9b"
},
"source": [
"# 1. Based on 1 column\n",
"\n",
"def change_gender(value):\n",
" if value == 0:\n",
" return 'Unknown'\n",
" elif value == 1:\n",
" return 'Male'\n",
" else:\n",
" return 'Female'\n",
"\n",
"data_bike['gender'] = data_bike['gender'].apply(change_gender)\n",
"\n",
"data_bike.head(10)"
],
"execution_count": 179,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>tripduration</th>\n",
" <th>starttime</th>\n",
" <th>stoptime</th>\n",
" <th>start station name</th>\n",
" <th>end station name</th>\n",
" <th>bikeid</th>\n",
" <th>usertype</th>\n",
" <th>birth year</th>\n",
" <th>gender</th>\n",
" <th>age</th>\n",
" <th>year</th>\n",
" <th>date</th>\n",
" <th>year-month</th>\n",
" <th>day</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>261</td>\n",
" <td>2019-08-01 00:14:55.990</td>\n",
" <td>2019-08-01 00:19:17.4780</td>\n",
" <td>JC Medical Center</td>\n",
" <td>Liberty Light Rail</td>\n",
" <td>26268</td>\n",
" <td>Subscriber</td>\n",
" <td>1980</td>\n",
" <td>Female</td>\n",
" <td>40</td>\n",
" <td>2019</td>\n",
" <td>2019-08-01</td>\n",
" <td>2019-08</td>\n",
" <td>Thursday</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>172</td>\n",
" <td>2019-08-01 00:23:06.991</td>\n",
" <td>2019-08-01 00:25:59.1480</td>\n",
" <td>Dixon Mills</td>\n",
" <td>Grove St PATH</td>\n",
" <td>26162</td>\n",
" <td>Subscriber</td>\n",
" <td>1996</td>\n",
" <td>Female</td>\n",
" <td>24</td>\n",
" <td>2019</td>\n",
" <td>2019-08-01</td>\n",
" <td>2019-08</td>\n",
" <td>Thursday</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>525</td>\n",
" <td>2019-08-01 00:23:28.617</td>\n",
" <td>2019-08-01 00:32:13.7000</td>\n",
" <td>Newport Pkwy</td>\n",
" <td>Hamilton Park</td>\n",
" <td>29279</td>\n",
" <td>Subscriber</td>\n",
" <td>1991</td>\n",
" <td>Female</td>\n",
" <td>29</td>\n",
" <td>2019</td>\n",
" <td>2019-08-01</td>\n",
" <td>2019-08</td>\n",
" <td>Thursday</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>219</td>\n",
" <td>2019-08-01 00:32:36.141</td>\n",
" <td>2019-08-01 00:36:15.2730</td>\n",
" <td>Warren St</td>\n",
" <td>City Hall</td>\n",
" <td>29598</td>\n",
" <td>Subscriber</td>\n",
" <td>1988</td>\n",
" <td>Female</td>\n",
" <td>32</td>\n",
" <td>2019</td>\n",
" <td>2019-08-01</td>\n",
" <td>2019-08</td>\n",
" <td>Thursday</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>262</td>\n",
" <td>2019-08-01 00:41:26.670</td>\n",
" <td>2019-08-01 00:45:49.3530</td>\n",
" <td>Grove St PATH</td>\n",
" <td>Jersey &amp; 3rd</td>\n",
" <td>26162</td>\n",
" <td>Subscriber</td>\n",
" <td>1960</td>\n",
" <td>Female</td>\n",
" <td>60</td>\n",
" <td>2019</td>\n",
" <td>2019-08-01</td>\n",
" <td>2019-08</td>\n",
" <td>Thursday</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>820</td>\n",
" <td>2019-08-01 00:43:15.299</td>\n",
" <td>2019-08-01 00:56:55.5350</td>\n",
" <td>City Hall</td>\n",
" <td>Bergen Ave</td>\n",
" <td>29598</td>\n",
" <td>Subscriber</td>\n",
" <td>1981</td>\n",
" <td>Female</td>\n",
" <td>39</td>\n",
" <td>2019</td>\n",
" <td>2019-08-01</td>\n",
" <td>2019-08</td>\n",
" <td>Thursday</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>543</td>\n",
" <td>2019-08-01 00:52:59.212</td>\n",
" <td>2019-08-01 01:02:02.2640</td>\n",
" <td>Hilltop</td>\n",
" <td>Bergen Ave</td>\n",
" <td>29525</td>\n",
" <td>Subscriber</td>\n",
" <td>1985</td>\n",
" <td>Female</td>\n",
" <td>35</td>\n",
" <td>2019</td>\n",
" <td>2019-08-01</td>\n",
" <td>2019-08</td>\n",
" <td>Thursday</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>478</td>\n",
" <td>2019-08-01 01:00:40.394</td>\n",
" <td>2019-08-01 01:08:39.3750</td>\n",
" <td>Van Vorst Park</td>\n",
" <td>Lafayette Park</td>\n",
" <td>29641</td>\n",
" <td>Subscriber</td>\n",
" <td>1978</td>\n",
" <td>Female</td>\n",
" <td>42</td>\n",
" <td>2019</td>\n",
" <td>2019-08-01</td>\n",
" <td>2019-08</td>\n",
" <td>Thursday</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>646</td>\n",
" <td>2019-08-01 01:49:05.893</td>\n",
" <td>2019-08-01 01:59:52.0800</td>\n",
" <td>Hilltop</td>\n",
" <td>Bergen Ave</td>\n",
" <td>29448</td>\n",
" <td>Subscriber</td>\n",
" <td>1987</td>\n",
" <td>Female</td>\n",
" <td>33</td>\n",
" <td>2019</td>\n",
" <td>2019-08-01</td>\n",
" <td>2019-08</td>\n",
" <td>Thursday</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>265</td>\n",
" <td>2019-08-01 03:57:17.728</td>\n",
" <td>2019-08-01 04:01:43.1110</td>\n",
" <td>McGinley Square</td>\n",
" <td>Sip Ave</td>\n",
" <td>29477</td>\n",
" <td>Subscriber</td>\n",
" <td>1984</td>\n",
" <td>Female</td>\n",
" <td>36</td>\n",
" <td>2019</td>\n",
" <td>2019-08-01</td>\n",
" <td>2019-08</td>\n",
" <td>Thursday</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" tripduration starttime ... year-month day\n",
"0 261 2019-08-01 00:14:55.990 ... 2019-08 Thursday\n",
"1 172 2019-08-01 00:23:06.991 ... 2019-08 Thursday\n",
"2 525 2019-08-01 00:23:28.617 ... 2019-08 Thursday\n",
"3 219 2019-08-01 00:32:36.141 ... 2019-08 Thursday\n",
"4 262 2019-08-01 00:41:26.670 ... 2019-08 Thursday\n",
"5 820 2019-08-01 00:43:15.299 ... 2019-08 Thursday\n",
"6 543 2019-08-01 00:52:59.212 ... 2019-08 Thursday\n",
"7 478 2019-08-01 01:00:40.394 ... 2019-08 Thursday\n",
"8 646 2019-08-01 01:49:05.893 ... 2019-08 Thursday\n",
"9 265 2019-08-01 03:57:17.728 ... 2019-08 Thursday\n",
"\n",
"[10 rows x 14 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 179
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "ECv0hRZznd8u",
"colab_type": "code",
"colab": {}
},
"source": [
"# 2. Based on more 1 column\n",
"\n",
"def route(row):\n",
" result = row['start station name'] + ' - ' + row['end station name']\n",
" return result\n",
"\n",
"data_bike['route'] = data_bike.apply(route, axis=1)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "YFswh8TkusoX",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
},
"outputId": "524184f4-cc08-4bd4-bb26-1b1817c4fd63"
},
"source": [
"data_bike[['start station name', 'end station name', 'route']].head()"
],
"execution_count": 189,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>start station name</th>\n",
" <th>end station name</th>\n",
" <th>route</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>JC Medical Center</td>\n",
" <td>Liberty Light Rail</td>\n",
" <td>JC Medical Center - Liberty Light Rail</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Dixon Mills</td>\n",
" <td>Grove St PATH</td>\n",
" <td>Dixon Mills - Grove St PATH</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Newport Pkwy</td>\n",
" <td>Hamilton Park</td>\n",
" <td>Newport Pkwy - Hamilton Park</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Warren St</td>\n",
" <td>City Hall</td>\n",
" <td>Warren St - City Hall</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Grove St PATH</td>\n",
" <td>Jersey &amp; 3rd</td>\n",
" <td>Grove St PATH - Jersey &amp; 3rd</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" start station name ... route\n",
"0 JC Medical Center ... JC Medical Center - Liberty Light Rail\n",
"1 Dixon Mills ... Dixon Mills - Grove St PATH\n",
"2 Newport Pkwy ... Newport Pkwy - Hamilton Park\n",
"3 Warren St ... Warren St - City Hall\n",
"4 Grove St PATH ... Grove St PATH - Jersey & 3rd\n",
"\n",
"[5 rows x 3 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 189
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "kx9v4HKA5SIL",
"colab_type": "text"
},
"source": [
"### Exercise"
]
},
{
"cell_type": "code",
"metadata": {
"id": "qibt7gbH5WAv",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 173
},
"outputId": "5025364b-3ac3-4489-d707-f25803a82193"
},
"source": [
"# Exercise 1\n",
"\n",
"# Menggunakan data ojol\n",
"# Coba hitung rata-rata spending tiap customer\n",
"\n",
"df.groupby('Customer', as_index=False)['Cost'].mean()"
],
"execution_count": 66,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Customer</th>\n",
" <th>Cost</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>jack</td>\n",
" <td>15500.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>krish</td>\n",
" <td>16250.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>nick</td>\n",
" <td>14333.333333</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>tom</td>\n",
" <td>20000.000000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Customer Cost\n",
"0 jack 15500.000000\n",
"1 krish 16250.000000\n",
"2 nick 14333.333333\n",
"3 tom 20000.000000"
]
},
"metadata": {
"tags": []
},
"execution_count": 66
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "2x7jwqgj5Y3B",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 235
},
"outputId": "a3e8ced4-3e2d-4945-94fe-f07b44ec70d4"
},
"source": [
"# Exercise 2\n",
"# Menggunakan data ojol\n",
"# Coba cari origin favorit\n",
"\n",
"df.groupby('Origin', as_index=False)['Cost'].count().rename(columns={'Cost': 'Total'})"
],
"execution_count": 74,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Origin</th>\n",
" <th>Total</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Gondangdia</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Kuningan</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Mampang</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Pancoran</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>SCBD</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>Tebet</td>\n",
" <td>3</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Origin Total\n",
"0 Gondangdia 1\n",
"1 Kuningan 1\n",
"2 Mampang 2\n",
"3 Pancoran 1\n",
"4 SCBD 2\n",
"5 Tebet 3"
]
},
"metadata": {
"tags": []
},
"execution_count": 74
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "Te524bf39hAL",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
},
"outputId": "ead2f0d2-dcf4-433b-8726-a661338e86cb"
},
"source": [
"# Exercise 3\n",
"\n",
"# Coba buat kolom baru dgn nama 'Age'\n",
"# Diambil dari birth year\n",
"\n",
"data_bike['age'] = 2020 - data_bike['birth year']\n",
"\n",
"data_bike.head()\n"
],
"execution_count": 8,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>tripduration</th>\n",
" <th>starttime</th>\n",
" <th>stoptime</th>\n",
" <th>start station name</th>\n",
" <th>end station name</th>\n",
" <th>bikeid</th>\n",
" <th>usertype</th>\n",
" <th>birth year</th>\n",
" <th>gender</th>\n",
" <th>age</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>261</td>\n",
" <td>2019-08-01 00:14:55.9900</td>\n",
" <td>2019-08-01 00:19:17.4780</td>\n",
" <td>JC Medical Center</td>\n",
" <td>Liberty Light Rail</td>\n",
" <td>26268</td>\n",
" <td>Subscriber</td>\n",
" <td>1980</td>\n",
" <td>1</td>\n",
" <td>40</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>172</td>\n",
" <td>2019-08-01 00:23:06.9910</td>\n",
" <td>2019-08-01 00:25:59.1480</td>\n",
" <td>Dixon Mills</td>\n",
" <td>Grove St PATH</td>\n",
" <td>26162</td>\n",
" <td>Subscriber</td>\n",
" <td>1996</td>\n",
" <td>1</td>\n",
" <td>24</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>525</td>\n",
" <td>2019-08-01 00:23:28.6170</td>\n",
" <td>2019-08-01 00:32:13.7000</td>\n",
" <td>Newport Pkwy</td>\n",
" <td>Hamilton Park</td>\n",
" <td>29279</td>\n",
" <td>Subscriber</td>\n",
" <td>1991</td>\n",
" <td>1</td>\n",
" <td>29</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>219</td>\n",
" <td>2019-08-01 00:32:36.1410</td>\n",
" <td>2019-08-01 00:36:15.2730</td>\n",
" <td>Warren St</td>\n",
" <td>City Hall</td>\n",
" <td>29598</td>\n",
" <td>Subscriber</td>\n",
" <td>1988</td>\n",
" <td>1</td>\n",
" <td>32</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>262</td>\n",
" <td>2019-08-01 00:41:26.6700</td>\n",
" <td>2019-08-01 00:45:49.3530</td>\n",
" <td>Grove St PATH</td>\n",
" <td>Jersey &amp; 3rd</td>\n",
" <td>26162</td>\n",
" <td>Subscriber</td>\n",
" <td>1960</td>\n",
" <td>1</td>\n",
" <td>60</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" tripduration starttime ... gender age\n",
"0 261 2019-08-01 00:14:55.9900 ... 1 40\n",
"1 172 2019-08-01 00:23:06.9910 ... 1 24\n",
"2 525 2019-08-01 00:23:28.6170 ... 1 29\n",
"3 219 2019-08-01 00:32:36.1410 ... 1 32\n",
"4 262 2019-08-01 00:41:26.6700 ... 1 60\n",
"\n",
"[5 rows x 10 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 8
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "v-cl03NH-s_x",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 359
},
"outputId": "ba8492c7-4623-424c-ad41-252dc42e40f2"
},
"source": [
"# Exercise 4\n",
"\n",
"# Rata2 tripduration tiap age, urutkan dari rata2 tripduration terbanyak\n",
"\n",
"data_bike.groupby('age', as_index=False)['tripduration'].mean() \\\n",
" .sort_values('tripduration', ascending=False) \\\n",
" .head(10)"
],
"execution_count": 95,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>age</th>\n",
" <th>tripduration</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>18</td>\n",
" <td>973.372093</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>19</td>\n",
" <td>902.285714</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>21</td>\n",
" <td>896.060000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>22</td>\n",
" <td>888.409794</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>20</td>\n",
" <td>799.055556</td>\n",
" </tr>\n",
" <tr>\n",
" <th>34</th>\n",
" <td>51</td>\n",
" <td>734.295665</td>\n",
" </tr>\n",
" <tr>\n",
" <th>44</th>\n",
" <td>61</td>\n",
" <td>727.251256</td>\n",
" </tr>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>17</td>\n",
" <td>695.500000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>23</td>\n",
" <td>658.600671</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>25</td>\n",
" <td>585.491566</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" age tripduration\n",
"1 18 973.372093\n",
"2 19 902.285714\n",
"4 21 896.060000\n",
"5 22 888.409794\n",
"3 20 799.055556\n",
"34 51 734.295665\n",
"44 61 727.251256\n",
"0 17 695.500000\n",
"6 23 658.600671\n",
"8 25 585.491566"
]
},
"metadata": {
"tags": []
},
"execution_count": 95
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "jV01DpL5aGDa",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
},
"outputId": "fef6e7aa-09f0-4ffa-b935-60c8ba664a49"
},
"source": [
"# Exercise 5\n",
"\n",
"# Menggunakan data_bike\n",
"# Coba buat kolom baru dengan nama 'date', isinya adalah tanggal yang diambil dari starttime\n",
"\n",
"data_bike['date'] = data_bike['starttime'].dt.date\n",
"\n",
"data_bike[['starttime','date']].head()"
],
"execution_count": 136,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>starttime</th>\n",
" <th>date</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>2019-08-01 00:14:55.990</td>\n",
" <td>2019-08-01</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2019-08-01 00:23:06.991</td>\n",
" <td>2019-08-01</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>2019-08-01 00:23:28.617</td>\n",
" <td>2019-08-01</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>2019-08-01 00:32:36.141</td>\n",
" <td>2019-08-01</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>2019-08-01 00:41:26.670</td>\n",
" <td>2019-08-01</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" starttime date\n",
"0 2019-08-01 00:14:55.990 2019-08-01\n",
"1 2019-08-01 00:23:06.991 2019-08-01\n",
"2 2019-08-01 00:23:28.617 2019-08-01\n",
"3 2019-08-01 00:32:36.141 2019-08-01\n",
"4 2019-08-01 00:41:26.670 2019-08-01"
]
},
"metadata": {
"tags": []
},
"execution_count": 136
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "xSMSpnuudADv",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 306
},
"outputId": "213d9a80-07d2-459e-9071-2f9717cedebf"
},
"source": [
"# Exercise 6\n",
"\n",
"# Menggunakan data_bike\n",
"# Coba buat kolom baru dgn nama 'year-month', isinya adalah format tahun-bulan (2019-08 yang diambil dari starttime)\n",
"\n",
"data_bike['year-month'] = data_bike['starttime'].dt.strftime('%Y-%m')\n",
"\n",
"data_bike.head()\n",
"\n"
],
"execution_count": 147,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>tripduration</th>\n",
" <th>starttime</th>\n",
" <th>stoptime</th>\n",
" <th>start station name</th>\n",
" <th>end station name</th>\n",
" <th>bikeid</th>\n",
" <th>usertype</th>\n",
" <th>birth year</th>\n",
" <th>gender</th>\n",
" <th>age</th>\n",
" <th>year</th>\n",
" <th>date</th>\n",
" <th>year-month</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>261</td>\n",
" <td>2019-08-01 00:14:55.990</td>\n",
" <td>2019-08-01 00:19:17.4780</td>\n",
" <td>JC Medical Center</td>\n",
" <td>Liberty Light Rail</td>\n",
" <td>26268</td>\n",
" <td>Subscriber</td>\n",
" <td>1980</td>\n",
" <td>1</td>\n",
" <td>40</td>\n",
" <td>2019</td>\n",
" <td>2019-08-01</td>\n",
" <td>2019-08</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>172</td>\n",
" <td>2019-08-01 00:23:06.991</td>\n",
" <td>2019-08-01 00:25:59.1480</td>\n",
" <td>Dixon Mills</td>\n",
" <td>Grove St PATH</td>\n",
" <td>26162</td>\n",
" <td>Subscriber</td>\n",
" <td>1996</td>\n",
" <td>1</td>\n",
" <td>24</td>\n",
" <td>2019</td>\n",
" <td>2019-08-01</td>\n",
" <td>2019-08</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>525</td>\n",
" <td>2019-08-01 00:23:28.617</td>\n",
" <td>2019-08-01 00:32:13.7000</td>\n",
" <td>Newport Pkwy</td>\n",
" <td>Hamilton Park</td>\n",
" <td>29279</td>\n",
" <td>Subscriber</td>\n",
" <td>1991</td>\n",
" <td>1</td>\n",
" <td>29</td>\n",
" <td>2019</td>\n",
" <td>2019-08-01</td>\n",
" <td>2019-08</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>219</td>\n",
" <td>2019-08-01 00:32:36.141</td>\n",
" <td>2019-08-01 00:36:15.2730</td>\n",
" <td>Warren St</td>\n",
" <td>City Hall</td>\n",
" <td>29598</td>\n",
" <td>Subscriber</td>\n",
" <td>1988</td>\n",
" <td>1</td>\n",
" <td>32</td>\n",
" <td>2019</td>\n",
" <td>2019-08-01</td>\n",
" <td>2019-08</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>262</td>\n",
" <td>2019-08-01 00:41:26.670</td>\n",
" <td>2019-08-01 00:45:49.3530</td>\n",
" <td>Grove St PATH</td>\n",
" <td>Jersey &amp; 3rd</td>\n",
" <td>26162</td>\n",
" <td>Subscriber</td>\n",
" <td>1960</td>\n",
" <td>1</td>\n",
" <td>60</td>\n",
" <td>2019</td>\n",
" <td>2019-08-01</td>\n",
" <td>2019-08</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" tripduration starttime ... date year-month\n",
"0 261 2019-08-01 00:14:55.990 ... 2019-08-01 2019-08\n",
"1 172 2019-08-01 00:23:06.991 ... 2019-08-01 2019-08\n",
"2 525 2019-08-01 00:23:28.617 ... 2019-08-01 2019-08\n",
"3 219 2019-08-01 00:32:36.141 ... 2019-08-01 2019-08\n",
"4 262 2019-08-01 00:41:26.670 ... 2019-08-01 2019-08\n",
"\n",
"[5 rows x 13 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 147
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "vsYbrUzldzvm",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 266
},
"outputId": "fb4604bb-a2cb-4b86-9a39-0f7f1e7e38c2"
},
"source": [
"# Exercise 7\n",
"\n",
"# Coba cari tahu pada hari apa peminjaman sepeda paling banyak\n",
"\n",
"data_bike['day'] = data_bike['starttime'].dt.strftime('%A')\n",
"\n",
"data_bike.groupby('day', as_index=False)['starttime'].count() \\\n",
" .sort_values('starttime', ascending=False) \\\n",
" .head(10)\n",
"\n"
],
"execution_count": 160,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>day</th>\n",
" <th>starttime</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Thursday</td>\n",
" <td>8470</td>\n",
" </tr>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Friday</td>\n",
" <td>8100</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>Tuesday</td>\n",
" <td>6627</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Monday</td>\n",
" <td>6377</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Saturday</td>\n",
" <td>6258</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>Wednesday</td>\n",
" <td>5958</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Sunday</td>\n",
" <td>4440</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" day starttime\n",
"4 Thursday 8470\n",
"0 Friday 8100\n",
"5 Tuesday 6627\n",
"1 Monday 6377\n",
"2 Saturday 6258\n",
"6 Wednesday 5958\n",
"3 Sunday 4440"
]
},
"metadata": {
"tags": []
},
"execution_count": 160
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "AAaI3IgNh2rX",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 563
},
"outputId": "b47099cc-2b2b-4469-d309-b9bbaf4f74e2"
},
"source": [
"# Exercise 8\n",
"\n",
"# Menggunakan data_bike\n",
"# Coba buat kolom baru dengan nama 'trip-type' untuk mengklasifikasikan tipe peminjaman\n",
"# Jika tripduration kurang dari 500 detik -> 'short', antara 501 s/d 1000 detik -> medium, lebih dari 1000 -> long\n",
"\n",
"\n",
"def trip_type(value):\n",
" if value <= 500:\n",
" return 'short'\n",
" elif value >= 501 and value <= 1000:\n",
" return 'medium'\n",
" else:\n",
" return 'long'\n",
"\n",
"data_bike['trip_type'] = data_bike['tripduration'].apply(trip_type)\n",
"\n",
"data_bike.head(10)"
],
"execution_count": 183,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>tripduration</th>\n",
" <th>starttime</th>\n",
" <th>stoptime</th>\n",
" <th>start station name</th>\n",
" <th>end station name</th>\n",
" <th>bikeid</th>\n",
" <th>usertype</th>\n",
" <th>birth year</th>\n",
" <th>gender</th>\n",
" <th>age</th>\n",
" <th>year</th>\n",
" <th>date</th>\n",
" <th>year-month</th>\n",
" <th>day</th>\n",
" <th>trip_type</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>261</td>\n",
" <td>2019-08-01 00:14:55.990</td>\n",
" <td>2019-08-01 00:19:17.4780</td>\n",
" <td>JC Medical Center</td>\n",
" <td>Liberty Light Rail</td>\n",
" <td>26268</td>\n",
" <td>Subscriber</td>\n",
" <td>1980</td>\n",
" <td>Female</td>\n",
" <td>40</td>\n",
" <td>2019</td>\n",
" <td>2019-08-01</td>\n",
" <td>2019-08</td>\n",
" <td>Thursday</td>\n",
" <td>short</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>172</td>\n",
" <td>2019-08-01 00:23:06.991</td>\n",
" <td>2019-08-01 00:25:59.1480</td>\n",
" <td>Dixon Mills</td>\n",
" <td>Grove St PATH</td>\n",
" <td>26162</td>\n",
" <td>Subscriber</td>\n",
" <td>1996</td>\n",
" <td>Female</td>\n",
" <td>24</td>\n",
" <td>2019</td>\n",
" <td>2019-08-01</td>\n",
" <td>2019-08</td>\n",
" <td>Thursday</td>\n",
" <td>short</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>525</td>\n",
" <td>2019-08-01 00:23:28.617</td>\n",
" <td>2019-08-01 00:32:13.7000</td>\n",
" <td>Newport Pkwy</td>\n",
" <td>Hamilton Park</td>\n",
" <td>29279</td>\n",
" <td>Subscriber</td>\n",
" <td>1991</td>\n",
" <td>Female</td>\n",
" <td>29</td>\n",
" <td>2019</td>\n",
" <td>2019-08-01</td>\n",
" <td>2019-08</td>\n",
" <td>Thursday</td>\n",
" <td>medium</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>219</td>\n",
" <td>2019-08-01 00:32:36.141</td>\n",
" <td>2019-08-01 00:36:15.2730</td>\n",
" <td>Warren St</td>\n",
" <td>City Hall</td>\n",
" <td>29598</td>\n",
" <td>Subscriber</td>\n",
" <td>1988</td>\n",
" <td>Female</td>\n",
" <td>32</td>\n",
" <td>2019</td>\n",
" <td>2019-08-01</td>\n",
" <td>2019-08</td>\n",
" <td>Thursday</td>\n",
" <td>short</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>262</td>\n",
" <td>2019-08-01 00:41:26.670</td>\n",
" <td>2019-08-01 00:45:49.3530</td>\n",
" <td>Grove St PATH</td>\n",
" <td>Jersey &amp; 3rd</td>\n",
" <td>26162</td>\n",
" <td>Subscriber</td>\n",
" <td>1960</td>\n",
" <td>Female</td>\n",
" <td>60</td>\n",
" <td>2019</td>\n",
" <td>2019-08-01</td>\n",
" <td>2019-08</td>\n",
" <td>Thursday</td>\n",
" <td>short</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>820</td>\n",
" <td>2019-08-01 00:43:15.299</td>\n",
" <td>2019-08-01 00:56:55.5350</td>\n",
" <td>City Hall</td>\n",
" <td>Bergen Ave</td>\n",
" <td>29598</td>\n",
" <td>Subscriber</td>\n",
" <td>1981</td>\n",
" <td>Female</td>\n",
" <td>39</td>\n",
" <td>2019</td>\n",
" <td>2019-08-01</td>\n",
" <td>2019-08</td>\n",
" <td>Thursday</td>\n",
" <td>medium</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>543</td>\n",
" <td>2019-08-01 00:52:59.212</td>\n",
" <td>2019-08-01 01:02:02.2640</td>\n",
" <td>Hilltop</td>\n",
" <td>Bergen Ave</td>\n",
" <td>29525</td>\n",
" <td>Subscriber</td>\n",
" <td>1985</td>\n",
" <td>Female</td>\n",
" <td>35</td>\n",
" <td>2019</td>\n",
" <td>2019-08-01</td>\n",
" <td>2019-08</td>\n",
" <td>Thursday</td>\n",
" <td>medium</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>478</td>\n",
" <td>2019-08-01 01:00:40.394</td>\n",
" <td>2019-08-01 01:08:39.3750</td>\n",
" <td>Van Vorst Park</td>\n",
" <td>Lafayette Park</td>\n",
" <td>29641</td>\n",
" <td>Subscriber</td>\n",
" <td>1978</td>\n",
" <td>Female</td>\n",
" <td>42</td>\n",
" <td>2019</td>\n",
" <td>2019-08-01</td>\n",
" <td>2019-08</td>\n",
" <td>Thursday</td>\n",
" <td>short</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>646</td>\n",
" <td>2019-08-01 01:49:05.893</td>\n",
" <td>2019-08-01 01:59:52.0800</td>\n",
" <td>Hilltop</td>\n",
" <td>Bergen Ave</td>\n",
" <td>29448</td>\n",
" <td>Subscriber</td>\n",
" <td>1987</td>\n",
" <td>Female</td>\n",
" <td>33</td>\n",
" <td>2019</td>\n",
" <td>2019-08-01</td>\n",
" <td>2019-08</td>\n",
" <td>Thursday</td>\n",
" <td>medium</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>265</td>\n",
" <td>2019-08-01 03:57:17.728</td>\n",
" <td>2019-08-01 04:01:43.1110</td>\n",
" <td>McGinley Square</td>\n",
" <td>Sip Ave</td>\n",
" <td>29477</td>\n",
" <td>Subscriber</td>\n",
" <td>1984</td>\n",
" <td>Female</td>\n",
" <td>36</td>\n",
" <td>2019</td>\n",
" <td>2019-08-01</td>\n",
" <td>2019-08</td>\n",
" <td>Thursday</td>\n",
" <td>short</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" tripduration starttime ... day trip_type\n",
"0 261 2019-08-01 00:14:55.990 ... Thursday short\n",
"1 172 2019-08-01 00:23:06.991 ... Thursday short\n",
"2 525 2019-08-01 00:23:28.617 ... Thursday medium\n",
"3 219 2019-08-01 00:32:36.141 ... Thursday short\n",
"4 262 2019-08-01 00:41:26.670 ... Thursday short\n",
"5 820 2019-08-01 00:43:15.299 ... Thursday medium\n",
"6 543 2019-08-01 00:52:59.212 ... Thursday medium\n",
"7 478 2019-08-01 01:00:40.394 ... Thursday short\n",
"8 646 2019-08-01 01:49:05.893 ... Thursday medium\n",
"9 265 2019-08-01 03:57:17.728 ... Thursday short\n",
"\n",
"[10 rows x 15 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 183
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "DC_i9V_Zp2YG",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 700
},
"outputId": "ff629f80-d7c5-47f2-844f-3b902e33edfe"
},
"source": [
"# Exercise 9\n",
"\n",
"# Using data_bike\n",
"# Count the number of trips of each trip_type per day\n",
"\n",
"data_bike.groupby(['day', 'trip_type'], as_index=False)['tripduration'].count()"
],
"execution_count": 195,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>day</th>\n",
" <th>trip_type</th>\n",
" <th>tripduration</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Friday</td>\n",
" <td>long</td>\n",
" <td>695</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Friday</td>\n",
" <td>medium</td>\n",
" <td>1683</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Friday</td>\n",
" <td>short</td>\n",
" <td>5722</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Monday</td>\n",
" <td>long</td>\n",
" <td>491</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Monday</td>\n",
" <td>medium</td>\n",
" <td>1358</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>Monday</td>\n",
" <td>short</td>\n",
" <td>4528</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>Saturday</td>\n",
" <td>long</td>\n",
" <td>1013</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>Saturday</td>\n",
" <td>medium</td>\n",
" <td>1748</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>Saturday</td>\n",
" <td>short</td>\n",
" <td>3497</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>Sunday</td>\n",
" <td>long</td>\n",
" <td>774</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>Sunday</td>\n",
" <td>medium</td>\n",
" <td>1230</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>Sunday</td>\n",
" <td>short</td>\n",
" <td>2436</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12</th>\n",
" <td>Thursday</td>\n",
" <td>long</td>\n",
" <td>609</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>Thursday</td>\n",
" <td>medium</td>\n",
" <td>1678</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td>Thursday</td>\n",
" <td>short</td>\n",
" <td>6183</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>Tuesday</td>\n",
" <td>long</td>\n",
" <td>482</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16</th>\n",
" <td>Tuesday</td>\n",
" <td>medium</td>\n",
" <td>1326</td>\n",
" </tr>\n",
" <tr>\n",
" <th>17</th>\n",
" <td>Tuesday</td>\n",
" <td>short</td>\n",
" <td>4819</td>\n",
" </tr>\n",
" <tr>\n",
" <th>18</th>\n",
" <td>Wednesday</td>\n",
" <td>long</td>\n",
" <td>394</td>\n",
" </tr>\n",
" <tr>\n",
" <th>19</th>\n",
" <td>Wednesday</td>\n",
" <td>medium</td>\n",
" <td>1143</td>\n",
" </tr>\n",
" <tr>\n",
" <th>20</th>\n",
" <td>Wednesday</td>\n",
" <td>short</td>\n",
" <td>4421</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" day trip_type tripduration\n",
"0 Friday long 695\n",
"1 Friday medium 1683\n",
"2 Friday short 5722\n",
"3 Monday long 491\n",
"4 Monday medium 1358\n",
"5 Monday short 4528\n",
"6 Saturday long 1013\n",
"7 Saturday medium 1748\n",
"8 Saturday short 3497\n",
"9 Sunday long 774\n",
"10 Sunday medium 1230\n",
"11 Sunday short 2436\n",
"12 Thursday long 609\n",
"13 Thursday medium 1678\n",
"14 Thursday short 6183\n",
"15 Tuesday long 482\n",
"16 Tuesday medium 1326\n",
"17 Tuesday short 4819\n",
"18 Wednesday long 394\n",
"19 Wednesday medium 1143\n",
"20 Wednesday short 4421"
]
},
"metadata": {
"tags": []
},
"execution_count": 195
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "UAUr1Wnzw7cJ",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 793
},
"outputId": "db856c94-4a50-4ead-b2b3-211cd4cce703"
},
"source": [
"# Exercise 10\n",
"\n",
"# Using data_bike\n",
"# Find out at which hours of the day rentals are most frequent\n",
"\n",
"data_bike['hour'] = data_bike['starttime'].dt.strftime('%H')\n",
"\n",
"data_bike.groupby('hour', as_index=False)['starttime'].count() \\\n",
" .rename(columns={'starttime': 'total'}) \\\n",
" .sort_values('hour') \\\n",
" .head(24)"
],
"execution_count": 203,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>hour</th>\n",
" <th>total</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>00</td>\n",
" <td>343</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>01</td>\n",
" <td>186</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>02</td>\n",
" <td>110</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>03</td>\n",
" <td>79</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>04</td>\n",
" <td>90</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>05</td>\n",
" <td>463</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>06</td>\n",
" <td>1252</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>07</td>\n",
" <td>3199</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>08</td>\n",
" <td>5135</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>09</td>\n",
" <td>2861</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>10</td>\n",
" <td>1905</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>11</td>\n",
" <td>1802</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12</th>\n",
" <td>12</td>\n",
" <td>2088</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>13</td>\n",
" <td>1946</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td>14</td>\n",
" <td>1815</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>15</td>\n",
" <td>1964</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16</th>\n",
" <td>16</td>\n",
" <td>2612</td>\n",
" </tr>\n",
" <tr>\n",
" <th>17</th>\n",
" <td>17</td>\n",
" <td>4333</td>\n",
" </tr>\n",
" <tr>\n",
" <th>18</th>\n",
" <td>18</td>\n",
" <td>4667</td>\n",
" </tr>\n",
" <tr>\n",
" <th>19</th>\n",
" <td>19</td>\n",
" <td>3521</td>\n",
" </tr>\n",
" <tr>\n",
" <th>20</th>\n",
" <td>20</td>\n",
" <td>2428</td>\n",
" </tr>\n",
" <tr>\n",
" <th>21</th>\n",
" <td>21</td>\n",
" <td>1696</td>\n",
" </tr>\n",
" <tr>\n",
" <th>22</th>\n",
" <td>22</td>\n",
" <td>1082</td>\n",
" </tr>\n",
" <tr>\n",
" <th>23</th>\n",
" <td>23</td>\n",
" <td>653</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" hour total\n",
"0 00 343\n",
"1 01 186\n",
"2 02 110\n",
"3 03 79\n",
"4 04 90\n",
"5 05 463\n",
"6 06 1252\n",
"7 07 3199\n",
"8 08 5135\n",
"9 09 2861\n",
"10 10 1905\n",
"11 11 1802\n",
"12 12 2088\n",
"13 13 1946\n",
"14 14 1815\n",
"15 15 1964\n",
"16 16 2612\n",
"17 17 4333\n",
"18 18 4667\n",
"19 19 3521\n",
"20 20 2428\n",
"21 21 1696\n",
"22 22 1082\n",
"23 23 653"
]
},
"metadata": {
"tags": []
},
"execution_count": 203
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "k15akvFJsGE2",
"colab_type": "text"
},
"source": [
"### Add HTML link to notebook"
]
},
{
"cell_type": "code",
"metadata": {
"id": "shElOgIZr6g8",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 63
},
"outputId": "72a1f1d9-fd3e-46f8-d209-99786caf1307"
},
"source": [
"%%html\n",
"<p>Right-click a link, then choose \"Open link in new tab\"</p>\n",
"<a href=\"https://sy4m.com\">sy4m.com</a> | \n",
"<a href=\"https://kerjaonline.id\">Kerja Online</a> | \n",
"<a href=\"https://OwHub.com\">OwHub</a> | \n",
"<a href=\"https://www.SpinnerArtikelIndonesia.com\">Spinner Artikel Indonesia</a>"
],
"execution_count": 31,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/html": [
"<p>Right-click a link, then choose \"Open link in new tab\"</p>\n",
"<a href=\"https://sy4m.com\">sy4m.com</a> | \n",
"<a href=\"https://kerjaonline.id\">Kerja Online</a> | \n",
"<a href=\"https://OwHub.com\">OwHub</a> | \n",
"<a href=\"https://www.SpinnerArtikelIndonesia.com\">Spinner Artikel Indonesia</a>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {
"tags": []
}
}
]
}
]
}