jing-jin-mc/Pandas - Pivot.ipynb Secret

## Pandas - Pivot.ipynb
{
  "cells": [
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "## Pivot\n\n### Functions:\n\n* pivot_table\n* stack\n* unstack\n* melt "
    },
    {
      "metadata": {
        "trusted": true
      },
      "cell_type": "code",
      "source": "import pandas as pd ",
      "execution_count": 1,
      "outputs": []
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "#### create a test dataset to demo functions"
    },
    {
      "metadata": {
        "slideshow": {
          "slide_type": "slide"
        },
        "trusted": true
      },
      "cell_type": "code",
      "source": "df = pd.DataFrame(data = {'date':['week1','week1','week2','week3','week3','week3'],\n                          'sales':[50,20,35,10,20,40],\n                          'category':['a','b','a','a','b','a'],\n                          'revenue':[14,8,16,None,7,9]\n                         })\ndf.date = df.date.astype('str')\ndf.category = df.category.astype('str')\ndf",
      "execution_count": 4,
      "outputs": [
        {
          "output_type": "execute_result",
          "execution_count": 4,
          "data": {
            "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>category</th>\n      <th>date</th>\n      <th>revenue</th>\n      <th>sales</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>a</td>\n      <td>week1</td>\n      <td>14.0</td>\n      <td>50</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>b</td>\n      <td>week1</td>\n      <td>8.0</td>\n      <td>20</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>a</td>\n      <td>week2</td>\n      <td>16.0</td>\n      <td>35</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>a</td>\n      <td>week3</td>\n      <td>NaN</td>\n      <td>10</td>\n    </tr>\n    <tr>\n      <th>4</th>\n      <td>b</td>\n      <td>week3</td>\n      <td>7.0</td>\n      <td>20</td>\n    </tr>\n    <tr>\n      <th>5</th>\n      <td>a</td>\n      <td>week3</td>\n      <td>9.0</td>\n      <td>40</td>\n    </tr>\n  </tbody>\n</table>\n</div>",
            "text/plain": "  category   date  revenue  sales\n0        a  week1     14.0     50\n1        b  week1      8.0     20\n2        a  week2     16.0     35\n3        a  week3      NaN     10\n4        b  week3      7.0     20\n5        a  week3      9.0     40"
          },
          "metadata": {}
        }
      ]
    },
    {
      "metadata": {
        "trusted": true
      },
      "cell_type": "code",
      "source": "df.info()",
      "execution_count": 5,
      "outputs": [
        {
          "output_type": "stream",
          "text": "<class 'pandas.core.frame.DataFrame'>\nRangeIndex: 6 entries, 0 to 5\nData columns (total 4 columns):\ncategory    6 non-null object\ndate        6 non-null object\nrevenue     5 non-null float64\nsales       6 non-null int64\ndtypes: float64(1), int64(1), object(2)\nmemory usage: 272.0+ bytes\n",
          "name": "stdout"
        }
      ]
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "### pandas.pivot_table( data, values, index, columns, aggfunc, fill_value ...)\n\nCreate a spreadsheet-style pivot tables as a DataFrame. The levels in the pivot table will be stored in MultiIndex objects (hierarchical indexes) on the index and columns of result DataFrame\n\n* data: DataFrame \n* values: column to aggregate, optional \n* index: column, Grouper, array, or list of the them \n* columns: column, Grouper, array, or list of the them\n* aggfunc: function, list of functions, dict, default numpy.mean\n* fill_value: scalar, default None. Value to replace missing values with \n\n#### example 1: "
    },
    {
      "metadata": {
        "trusted": true
      },
      "cell_type": "code",
      "source": "pd.pivot_table(df,values='sales',index='date',columns='category',fill_value=0)",
      "execution_count": 6,
      "outputs": [
        {
          "output_type": "execute_result",
          "execution_count": 6,
          "data": {
            "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th>category</th>\n      <th>a</th>\n      <th>b</th>\n    </tr>\n    <tr>\n      <th>date</th>\n      <th></th>\n      <th></th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>week1</th>\n      <td>50</td>\n      <td>20</td>\n    </tr>\n    <tr>\n      <th>week2</th>\n      <td>35</td>\n      <td>0</td>\n    </tr>\n    <tr>\n      <th>week3</th>\n      <td>25</td>\n      <td>20</td>\n    </tr>\n  </tbody>\n</table>\n</div>",
            "text/plain": "category   a   b\ndate            \nweek1     50  20\nweek2     35   0\nweek3     25  20"
          },
          "metadata": {}
        }
      ]
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "#### example for using aggregate function  \nimport numpy library as np to use its functions "
    },
    {
      "metadata": {
        "trusted": true
      },
      "cell_type": "code",
      "source": "import numpy as np\npd.pivot_table(df,values=['sales','revenue'],columns='category',aggfunc = np.sum, fill_value=0)",
      "execution_count": 7,
      "outputs": [
        {
          "output_type": "execute_result",
          "execution_count": 7,
          "data": {
            "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th>category</th>\n      <th>a</th>\n      <th>b</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>revenue</th>\n      <td>39</td>\n      <td>15</td>\n    </tr>\n    <tr>\n      <th>sales</th>\n      <td>135</td>\n      <td>40</td>\n    </tr>\n  </tbody>\n</table>\n</div>",
            "text/plain": "category    a   b\nrevenue    39  15\nsales     135  40"
          },
          "metadata": {}
        }
      ]
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "#### exercise:\nshow the total revenue and sales for each week and each category"
    },
    {
      "metadata": {
        "trusted": true
      },
      "cell_type": "code",
      "source": "pd.pivot_table(df,\n               values = ['sales','revenue'],\n               index='date',\n               columns = 'category',\n               aggfunc =np.sum,\n               fill_value = 0\n              )",
      "execution_count": 8,
      "outputs": [
        {
          "output_type": "execute_result",
          "execution_count": 8,
          "data": {
            "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead tr th {\n        text-align: left;\n    }\n\n    .dataframe thead tr:last-of-type th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr>\n      <th></th>\n      <th colspan=\"2\" halign=\"left\">revenue</th>\n      <th colspan=\"2\" halign=\"left\">sales</th>\n    </tr>\n    <tr>\n      <th>category</th>\n      <th>a</th>\n      <th>b</th>\n      <th>a</th>\n      <th>b</th>\n    </tr>\n    <tr>\n      <th>date</th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>week1</th>\n      <td>14</td>\n      <td>8</td>\n      <td>50</td>\n      <td>20</td>\n    </tr>\n    <tr>\n      <th>week2</th>\n      <td>16</td>\n      <td>0</td>\n      <td>35</td>\n      <td>0</td>\n    </tr>\n    <tr>\n      <th>week3</th>\n      <td>9</td>\n      <td>7</td>\n      <td>50</td>\n      <td>20</td>\n    </tr>\n  </tbody>\n</table>\n</div>",
            "text/plain": "         revenue    sales    \ncategory       a  b     a   b\ndate                         \nweek1         14  8    50  20\nweek2         16  0    35   0\nweek3          9  7    50  20"
          },
          "metadata": {}
        }
      ]
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "### DataFrame.stack( level = -1, dropna = True)\n\nStack the prescribed level(s) from columns to index, like unpivot and melt\n\n#### example 1: "
    },
    {
      "metadata": {
        "trusted": true
      },
      "cell_type": "code",
      "source": "stack_example_df = pd.pivot_table(df,values=['sales','revenue'],index='date',columns='category',fill_value=0)\nstack_example_df",
      "execution_count": 9,
      "outputs": [
        {
          "output_type": "execute_result",
          "execution_count": 9,
          "data": {
            "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead tr th {\n        text-align: left;\n    }\n\n    .dataframe thead tr:last-of-type th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr>\n      <th></th>\n      <th colspan=\"2\" halign=\"left\">revenue</th>\n      <th colspan=\"2\" halign=\"left\">sales</th>\n    </tr>\n    <tr>\n      <th>category</th>\n      <th>a</th>\n      <th>b</th>\n      <th>a</th>\n      <th>b</th>\n    </tr>\n    <tr>\n      <th>date</th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>week1</th>\n      <td>14</td>\n      <td>8</td>\n      <td>50</td>\n      <td>20</td>\n    </tr>\n    <tr>\n      <th>week2</th>\n      <td>16</td>\n      <td>0</td>\n      <td>35</td>\n      <td>0</td>\n    </tr>\n    <tr>\n      <th>week3</th>\n      <td>9</td>\n      <td>7</td>\n      <td>25</td>\n      <td>20</td>\n    </tr>\n  </tbody>\n</table>\n</div>",
            "text/plain": "         revenue    sales    \ncategory       a  b     a   b\ndate                         \nweek1         14  8    50  20\nweek2         16  0    35   0\nweek3          9  7    25  20"
          },
          "metadata": {}
        }
      ]
    },
    {
      "metadata": {
        "trusted": true
      },
      "cell_type": "code",
      "source": "unstack_example_df = stack_example_df.stack()\nunstack_example_df",
      "execution_count": 10,
      "outputs": [
        {
          "output_type": "execute_result",
          "execution_count": 10,
          "data": {
            "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th></th>\n      <th>revenue</th>\n      <th>sales</th>\n    </tr>\n    <tr>\n      <th>date</th>\n      <th>category</th>\n      <th></th>\n      <th></th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th rowspan=\"2\" valign=\"top\">week1</th>\n      <th>a</th>\n      <td>14</td>\n      <td>50</td>\n    </tr>\n    <tr>\n      <th>b</th>\n      <td>8</td>\n      <td>20</td>\n    </tr>\n    <tr>\n      <th rowspan=\"2\" valign=\"top\">week2</th>\n      <th>a</th>\n      <td>16</td>\n      <td>35</td>\n    </tr>\n    <tr>\n      <th>b</th>\n      <td>0</td>\n      <td>0</td>\n    </tr>\n    <tr>\n      <th rowspan=\"2\" valign=\"top\">week3</th>\n      <th>a</th>\n      <td>9</td>\n      <td>25</td>\n    </tr>\n    <tr>\n      <th>b</th>\n      <td>7</td>\n      <td>20</td>\n    </tr>\n  </tbody>\n</table>\n</div>",
            "text/plain": "                revenue  sales\ndate  category                \nweek1 a              14     50\n      b               8     20\nweek2 a              16     35\n      b               0      0\nweek3 a               9     25\n      b               7     20"
          },
          "metadata": {}
        }
      ]
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "### DataFrame.unstack( level , fill_value)\n\nPivot a level of the index labels\n\n* level: int, str, or list of these, default -1 (last level). Level(s) of index to unstack, can pass level name.\n* fill_value: int, str or dict. Replace NaN with this value if the unstack produces missing values.\n\n#### example 1: "
    },
    {
      "metadata": {
        "trusted": true
      },
      "cell_type": "code",
      "source": "unstack_example_df.unstack(level=0)",
      "execution_count": 11,
      "outputs": [
        {
          "output_type": "execute_result",
          "execution_count": 11,
          "data": {
            "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead tr th {\n        text-align: left;\n    }\n\n    .dataframe thead tr:last-of-type th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr>\n      <th></th>\n      <th colspan=\"3\" halign=\"left\">revenue</th>\n      <th colspan=\"3\" halign=\"left\">sales</th>\n    </tr>\n    <tr>\n      <th>date</th>\n      <th>week1</th>\n      <th>week2</th>\n      <th>week3</th>\n      <th>week1</th>\n      <th>week2</th>\n      <th>week3</th>\n    </tr>\n    <tr>\n      <th>category</th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>a</th>\n      <td>14</td>\n      <td>16</td>\n      <td>9</td>\n      <td>50</td>\n      <td>35</td>\n      <td>25</td>\n    </tr>\n    <tr>\n      <th>b</th>\n      <td>8</td>\n      <td>0</td>\n      <td>7</td>\n      <td>20</td>\n      <td>0</td>\n      <td>20</td>\n    </tr>\n  </tbody>\n</table>\n</div>",
            "text/plain": "         revenue             sales            \ndate       week1 week2 week3 week1 week2 week3\ncategory                                      \na             14    16     9    50    35    25\nb              8     0     7    20     0    20"
          },
          "metadata": {}
        }
      ]
    },
    {
      "metadata": {
        "trusted": true
      },
      "cell_type": "code",
      "source": "unstack_example_df.unstack(level=-1)",
      "execution_count": 12,
      "outputs": [
        {
          "output_type": "execute_result",
          "execution_count": 12,
          "data": {
            "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead tr th {\n        text-align: left;\n    }\n\n    .dataframe thead tr:last-of-type th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr>\n      <th></th>\n      <th colspan=\"2\" halign=\"left\">revenue</th>\n      <th colspan=\"2\" halign=\"left\">sales</th>\n    </tr>\n    <tr>\n      <th>category</th>\n      <th>a</th>\n      <th>b</th>\n      <th>a</th>\n      <th>b</th>\n    </tr>\n    <tr>\n      <th>date</th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>week1</th>\n      <td>14</td>\n      <td>8</td>\n      <td>50</td>\n      <td>20</td>\n    </tr>\n    <tr>\n      <th>week2</th>\n      <td>16</td>\n      <td>0</td>\n      <td>35</td>\n      <td>0</td>\n    </tr>\n    <tr>\n      <th>week3</th>\n      <td>9</td>\n      <td>7</td>\n      <td>25</td>\n      <td>20</td>\n    </tr>\n  </tbody>\n</table>\n</div>",
            "text/plain": "         revenue    sales    \ncategory       a  b     a   b\ndate                         \nweek1         14  8    50  20\nweek2         16  0    35   0\nweek3          9  7    25  20"
          },
          "metadata": {}
        }
      ]
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "### DataFrame.melt()\n\nUnpivot a DataFrame from wide to long format, optionally leaving identifiers set.\n\n#### example 1: "
    },
    {
      "metadata": {
        "trusted": true
      },
      "cell_type": "code",
      "source": "stack_example_df",
      "execution_count": 13,
      "outputs": [
        {
          "output_type": "execute_result",
          "execution_count": 13,
          "data": {
            "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead tr th {\n        text-align: left;\n    }\n\n    .dataframe thead tr:last-of-type th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr>\n      <th></th>\n      <th colspan=\"2\" halign=\"left\">revenue</th>\n      <th colspan=\"2\" halign=\"left\">sales</th>\n    </tr>\n    <tr>\n      <th>category</th>\n      <th>a</th>\n      <th>b</th>\n      <th>a</th>\n      <th>b</th>\n    </tr>\n    <tr>\n      <th>date</th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>week1</th>\n      <td>14</td>\n      <td>8</td>\n      <td>50</td>\n      <td>20</td>\n    </tr>\n    <tr>\n      <th>week2</th>\n      <td>16</td>\n      <td>0</td>\n      <td>35</td>\n      <td>0</td>\n    </tr>\n    <tr>\n      <th>week3</th>\n      <td>9</td>\n      <td>7</td>\n      <td>25</td>\n      <td>20</td>\n    </tr>\n  </tbody>\n</table>\n</div>",
            "text/plain": "         revenue    sales    \ncategory       a  b     a   b\ndate                         \nweek1         14  8    50  20\nweek2         16  0    35   0\nweek3          9  7    25  20"
          },
          "metadata": {}
        }
      ]
    },
    {
      "metadata": {
        "trusted": true
      },
      "cell_type": "code",
      "source": "stack_example_df.melt()",
      "execution_count": 14,
      "outputs": [
        {
          "output_type": "execute_result",
          "execution_count": 14,
          "data": {
            "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>None</th>\n      <th>category</th>\n      <th>value</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>revenue</td>\n      <td>a</td>\n      <td>14</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>revenue</td>\n      <td>a</td>\n      <td>16</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>revenue</td>\n      <td>a</td>\n      <td>9</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>revenue</td>\n      <td>b</td>\n      <td>8</td>\n    </tr>\n    <tr>\n      <th>4</th>\n      <td>revenue</td>\n      <td>b</td>\n      <td>0</td>\n    </tr>\n    <tr>\n      <th>5</th>\n      <td>revenue</td>\n      <td>b</td>\n      <td>7</td>\n    </tr>\n    <tr>\n      <th>6</th>\n      <td>sales</td>\n      <td>a</td>\n      <td>50</td>\n    </tr>\n    <tr>\n      <th>7</th>\n      <td>sales</td>\n      <td>a</td>\n      <td>35</td>\n    </tr>\n    <tr>\n      <th>8</th>\n      <td>sales</td>\n      <td>a</td>\n      <td>25</td>\n    </tr>\n    <tr>\n      <th>9</th>\n      <td>sales</td>\n      <td>b</td>\n      <td>20</td>\n    </tr>\n    <tr>\n      <th>10</th>\n      <td>sales</td>\n      <td>b</td>\n      <td>0</td>\n    </tr>\n    <tr>\n      <th>11</th>\n      <td>sales</td>\n      <td>b</td>\n      <td>20</td>\n    </tr>\n  </tbody>\n</table>\n</div>",
            "text/plain": "        NaN category  value\n0   revenue        a     14\n1   revenue        a     16\n2   revenue        a      9\n3   revenue        b      8\n4   revenue        b      0\n5   revenue        b      7\n6     sales        a     50\n7     sales        a     35\n8     sales        a     25\n9     sales        b     20\n10    sales        b      0\n11    sales        b     20"
          },
          "metadata": {}
        }
      ]
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "## End\n\nThey are only a means to the end of learning how to do good data analysis "
    }
  ],
  "metadata": {
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3",
      "language": "python"
    },
    "gist": {
      "id": "",
      "data": {
        "description": "Pandas - Pivot.ipynb",
        "public": false
      }
    },
    "language_info": {
      "file_extension": ".py",
      "nbconvert_exporter": "python",
      "version": "3.5.4",
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "name": "python",
      "mimetype": "text/x-python",
      "pygments_lexer": "ipython3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 2
}
	{
	"cells": [
	{
	"metadata": {},
	"cell_type": "markdown",
	"source": "## Pivot\n\n### Functions:\n\n* pivot_table\n* stack\n* unstack\n* melt "
	},
	{
	"metadata": {
	"trusted": true
	},
	"cell_type": "code",
	"source": "import pandas as pd ",
	"execution_count": 1,
	"outputs": []
	},
	{
	"metadata": {},
	"cell_type": "markdown",
	"source": "#### create a test dataset to demo functions"
	},
	{
	"metadata": {
	"slideshow": {
	"slide_type": "slide"
	},
	"trusted": true
	},
	"cell_type": "code",
	"source": "df = pd.DataFrame(data = {'date':['week1','week1','week2','week3','week3','week3'],\n 'sales':[50,20,35,10,20,40],\n 'category':['a','b','a','a','b','a'],\n 'revenue':[14,8,16,None,7,9]\n })\ndf.date = df.date.astype('str')\ndf.category = df.category.astype('str')\ndf",
	"execution_count": 4,
	"outputs": [
	{
	"output_type": "execute_result",
	"execution_count": 4,
	"data": {
	"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>category</th>\n <th>date</th>\n <th>revenue</th>\n <th>sales</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>a</td>\n <td>week1</td>\n <td>14.0</td>\n <td>50</td>\n </tr>\n <tr>\n <th>1</th>\n <td>b</td>\n <td>week1</td>\n <td>8.0</td>\n <td>20</td>\n </tr>\n <tr>\n <th>2</th>\n <td>a</td>\n <td>week2</td>\n <td>16.0</td>\n <td>35</td>\n </tr>\n <tr>\n <th>3</th>\n <td>a</td>\n <td>week3</td>\n <td>NaN</td>\n <td>10</td>\n </tr>\n <tr>\n <th>4</th>\n <td>b</td>\n <td>week3</td>\n <td>7.0</td>\n <td>20</td>\n </tr>\n <tr>\n <th>5</th>\n <td>a</td>\n <td>week3</td>\n <td>9.0</td>\n <td>40</td>\n </tr>\n </tbody>\n</table>\n</div>",
	"text/plain": " category date revenue sales\n0 a week1 14.0 50\n1 b week1 8.0 20\n2 a week2 16.0 35\n3 a week3 NaN 10\n4 b week3 7.0 20\n5 a week3 9.0 40"
	},
	"metadata": {}
	}
	]
	},
	{
	"metadata": {
	"trusted": true
	},
	"cell_type": "code",
	"source": "df.info()",
	"execution_count": 5,
	"outputs": [
	{
	"output_type": "stream",
	"text": "<class 'pandas.core.frame.DataFrame'>\nRangeIndex: 6 entries, 0 to 5\nData columns (total 4 columns):\ncategory 6 non-null object\ndate 6 non-null object\nrevenue 5 non-null float64\nsales 6 non-null int64\ndtypes: float64(1), int64(1), object(2)\nmemory usage: 272.0+ bytes\n",
	"name": "stdout"
	}
	]
	},
	{
	"metadata": {},
	"cell_type": "markdown",
	"source": "### pandas.pivot_table( data, values, index, columns, aggfunc, fill_value ...)\n\nCreate a spreadsheet-style pivot tables as a DataFrame. The levels in the pivot table will be stored in MultiIndex objects (hierarchical indexes) on the index and columns of result DataFrame\n\n* data: DataFrame \n* values: column to aggregate, optional \n* index: column, Grouper, array, or list of the them \n* columns: column, Grouper, array, or list of the them\n* aggfunc: function, list of functions, dict, default numpy.mean\n* fill_value: scalar, default None. Value to replace missing values with \n\n#### example 1: "
	},
	{
	"metadata": {
	"trusted": true
	},
	"cell_type": "code",
	"source": "pd.pivot_table(df,values='sales',index='date',columns='category',fill_value=0)",
	"execution_count": 6,
	"outputs": [
	{
	"output_type": "execute_result",
	"execution_count": 6,
	"data": {
	"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th>category</th>\n <th>a</th>\n <th>b</th>\n </tr>\n <tr>\n <th>date</th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>week1</th>\n <td>50</td>\n <td>20</td>\n </tr>\n <tr>\n <th>week2</th>\n <td>35</td>\n <td>0</td>\n </tr>\n <tr>\n <th>week3</th>\n <td>25</td>\n <td>20</td>\n </tr>\n </tbody>\n</table>\n</div>",
	"text/plain": "category a b\ndate \nweek1 50 20\nweek2 35 0\nweek3 25 20"
	},
	"metadata": {}
	}
	]
	},
	{
	"metadata": {},
	"cell_type": "markdown",
	"source": "#### example for using aggregate function \nimport numpy library as np to use its functions "
	},
	{
	"metadata": {
	"trusted": true
	},
	"cell_type": "code",
	"source": "import numpy as np\npd.pivot_table(df,values=['sales','revenue'],columns='category',aggfunc = np.sum, fill_value=0)",
	"execution_count": 7,
	"outputs": [
	{
	"output_type": "execute_result",
	"execution_count": 7,
	"data": {
	"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th>category</th>\n <th>a</th>\n <th>b</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>revenue</th>\n <td>39</td>\n <td>15</td>\n </tr>\n <tr>\n <th>sales</th>\n <td>135</td>\n <td>40</td>\n </tr>\n </tbody>\n</table>\n</div>",
	"text/plain": "category a b\nrevenue 39 15\nsales 135 40"
	},
	"metadata": {}
	}
	]
	},
	{
	"metadata": {},
	"cell_type": "markdown",
	"source": "#### exercise:\nshow the total revenue and sales for each week and each category"
	},
	{
	"metadata": {
	"trusted": true
	},
	"cell_type": "code",
	"source": "pd.pivot_table(df,\n values = ['sales','revenue'],\n index='date',\n columns = 'category',\n aggfunc =np.sum,\n fill_value = 0\n )",
	"execution_count": 8,
	"outputs": [
	{
	"output_type": "execute_result",
	"execution_count": 8,
	"data": {
	"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead tr th {\n text-align: left;\n }\n\n .dataframe thead tr:last-of-type th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr>\n <th></th>\n <th colspan=\"2\" halign=\"left\">revenue</th>\n <th colspan=\"2\" halign=\"left\">sales</th>\n </tr>\n <tr>\n <th>category</th>\n <th>a</th>\n <th>b</th>\n <th>a</th>\n <th>b</th>\n </tr>\n <tr>\n <th>date</th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>week1</th>\n <td>14</td>\n <td>8</td>\n <td>50</td>\n <td>20</td>\n </tr>\n <tr>\n <th>week2</th>\n <td>16</td>\n <td>0</td>\n <td>35</td>\n <td>0</td>\n </tr>\n <tr>\n <th>week3</th>\n <td>9</td>\n <td>7</td>\n <td>50</td>\n <td>20</td>\n </tr>\n </tbody>\n</table>\n</div>",
	"text/plain": " revenue sales \ncategory a b a b\ndate \nweek1 14 8 50 20\nweek2 16 0 35 0\nweek3 9 7 50 20"
	},
	"metadata": {}
	}
	]
	},
	{
	"metadata": {},
	"cell_type": "markdown",
	"source": "### DataFrame.stack( level = -1, dropna = True)\n\nStack the prescribed level(s) from columns to index, like unpivot and melt\n\n#### example 1: "
	},
	{
	"metadata": {
	"trusted": true
	},
	"cell_type": "code",
	"source": "stack_example_df = pd.pivot_table(df,values=['sales','revenue'],index='date',columns='category',fill_value=0)\nstack_example_df",
	"execution_count": 9,
	"outputs": [
	{
	"output_type": "execute_result",
	"execution_count": 9,
	"data": {
	"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead tr th {\n text-align: left;\n }\n\n .dataframe thead tr:last-of-type th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr>\n <th></th>\n <th colspan=\"2\" halign=\"left\">revenue</th>\n <th colspan=\"2\" halign=\"left\">sales</th>\n </tr>\n <tr>\n <th>category</th>\n <th>a</th>\n <th>b</th>\n <th>a</th>\n <th>b</th>\n </tr>\n <tr>\n <th>date</th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>week1</th>\n <td>14</td>\n <td>8</td>\n <td>50</td>\n <td>20</td>\n </tr>\n <tr>\n <th>week2</th>\n <td>16</td>\n <td>0</td>\n <td>35</td>\n <td>0</td>\n </tr>\n <tr>\n <th>week3</th>\n <td>9</td>\n <td>7</td>\n <td>25</td>\n <td>20</td>\n </tr>\n </tbody>\n</table>\n</div>",
	"text/plain": " revenue sales \ncategory a b a b\ndate \nweek1 14 8 50 20\nweek2 16 0 35 0\nweek3 9 7 25 20"
	},
	"metadata": {}
	}
	]
	},
	{
	"metadata": {
	"trusted": true
	},
	"cell_type": "code",
	"source": "unstack_example_df = stack_example_df.stack()\nunstack_example_df",
	"execution_count": 10,
	"outputs": [
	{
	"output_type": "execute_result",
	"execution_count": 10,
	"data": {
	"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th></th>\n <th>revenue</th>\n <th>sales</th>\n </tr>\n <tr>\n <th>date</th>\n <th>category</th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th rowspan=\"2\" valign=\"top\">week1</th>\n <th>a</th>\n <td>14</td>\n <td>50</td>\n </tr>\n <tr>\n <th>b</th>\n <td>8</td>\n <td>20</td>\n </tr>\n <tr>\n <th rowspan=\"2\" valign=\"top\">week2</th>\n <th>a</th>\n <td>16</td>\n <td>35</td>\n </tr>\n <tr>\n <th>b</th>\n <td>0</td>\n <td>0</td>\n </tr>\n <tr>\n <th rowspan=\"2\" valign=\"top\">week3</th>\n <th>a</th>\n <td>9</td>\n <td>25</td>\n </tr>\n <tr>\n <th>b</th>\n <td>7</td>\n <td>20</td>\n </tr>\n </tbody>\n</table>\n</div>",
	"text/plain": " revenue sales\ndate category \nweek1 a 14 50\n b 8 20\nweek2 a 16 35\n b 0 0\nweek3 a 9 25\n b 7 20"
	},
	"metadata": {}
	}
	]
	},
	{
	"metadata": {},
	"cell_type": "markdown",
	"source": "### DataFrame.unstack( level , fill_value)\n\nPivot a level of the index labels\n\n* level: int, str, or list of these, default -1 (last level). Level(s) of index to unstack, can pass level name.\n* fill_value: int, str or dict. Replace NaN with this value if the unstack produces missing values.\n\n#### example 1: "
	},
	{
	"metadata": {
	"trusted": true
	},
	"cell_type": "code",
	"source": "unstack_example_df.unstack(level=0)",
	"execution_count": 11,
	"outputs": [
	{
	"output_type": "execute_result",
	"execution_count": 11,
	"data": {
	"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead tr th {\n text-align: left;\n }\n\n .dataframe thead tr:last-of-type th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr>\n <th></th>\n <th colspan=\"3\" halign=\"left\">revenue</th>\n <th colspan=\"3\" halign=\"left\">sales</th>\n </tr>\n <tr>\n <th>date</th>\n <th>week1</th>\n <th>week2</th>\n <th>week3</th>\n <th>week1</th>\n <th>week2</th>\n <th>week3</th>\n </tr>\n <tr>\n <th>category</th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>a</th>\n <td>14</td>\n <td>16</td>\n <td>9</td>\n <td>50</td>\n <td>35</td>\n <td>25</td>\n </tr>\n <tr>\n <th>b</th>\n <td>8</td>\n <td>0</td>\n <td>7</td>\n <td>20</td>\n <td>0</td>\n <td>20</td>\n </tr>\n </tbody>\n</table>\n</div>",
	"text/plain": " revenue sales \ndate week1 week2 week3 week1 week2 week3\ncategory \na 14 16 9 50 35 25\nb 8 0 7 20 0 20"
	},
	"metadata": {}
	}
	]
	},
	{
	"metadata": {
	"trusted": true
	},
	"cell_type": "code",
	"source": "unstack_example_df.unstack(level=-1)",
	"execution_count": 12,
	"outputs": [
	{
	"output_type": "execute_result",
	"execution_count": 12,
	"data": {
	"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead tr th {\n text-align: left;\n }\n\n .dataframe thead tr:last-of-type th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr>\n <th></th>\n <th colspan=\"2\" halign=\"left\">revenue</th>\n <th colspan=\"2\" halign=\"left\">sales</th>\n </tr>\n <tr>\n <th>category</th>\n <th>a</th>\n <th>b</th>\n <th>a</th>\n <th>b</th>\n </tr>\n <tr>\n <th>date</th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>week1</th>\n <td>14</td>\n <td>8</td>\n <td>50</td>\n <td>20</td>\n </tr>\n <tr>\n <th>week2</th>\n <td>16</td>\n <td>0</td>\n <td>35</td>\n <td>0</td>\n </tr>\n <tr>\n <th>week3</th>\n <td>9</td>\n <td>7</td>\n <td>25</td>\n <td>20</td>\n </tr>\n </tbody>\n</table>\n</div>",
	"text/plain": " revenue sales \ncategory a b a b\ndate \nweek1 14 8 50 20\nweek2 16 0 35 0\nweek3 9 7 25 20"
	},
	"metadata": {}
	}
	]
	},
	{
	"metadata": {},
	"cell_type": "markdown",
	"source": "### DataFrame.melt()\n\nUnpivot a DataFrame from wide to long format, optionally leaving identifiers set.\n\n#### example 1: "
	},
	{
	"metadata": {
	"trusted": true
	},
	"cell_type": "code",
	"source": "stack_example_df",
	"execution_count": 13,
	"outputs": [
	{
	"output_type": "execute_result",
	"execution_count": 13,
	"data": {
	"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead tr th {\n text-align: left;\n }\n\n .dataframe thead tr:last-of-type th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr>\n <th></th>\n <th colspan=\"2\" halign=\"left\">revenue</th>\n <th colspan=\"2\" halign=\"left\">sales</th>\n </tr>\n <tr>\n <th>category</th>\n <th>a</th>\n <th>b</th>\n <th>a</th>\n <th>b</th>\n </tr>\n <tr>\n <th>date</th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>week1</th>\n <td>14</td>\n <td>8</td>\n <td>50</td>\n <td>20</td>\n </tr>\n <tr>\n <th>week2</th>\n <td>16</td>\n <td>0</td>\n <td>35</td>\n <td>0</td>\n </tr>\n <tr>\n <th>week3</th>\n <td>9</td>\n <td>7</td>\n <td>25</td>\n <td>20</td>\n </tr>\n </tbody>\n</table>\n</div>",
	"text/plain": " revenue sales \ncategory a b a b\ndate \nweek1 14 8 50 20\nweek2 16 0 35 0\nweek3 9 7 25 20"
	},
	"metadata": {}
	}
	]
	},
	{
	"metadata": {
	"trusted": true
	},
	"cell_type": "code",
	"source": "stack_example_df.melt()",
	"execution_count": 14,
	"outputs": [
	{
	"output_type": "execute_result",
	"execution_count": 14,
	"data": {
	"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>None</th>\n <th>category</th>\n <th>value</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>revenue</td>\n <td>a</td>\n <td>14</td>\n </tr>\n <tr>\n <th>1</th>\n <td>revenue</td>\n <td>a</td>\n <td>16</td>\n </tr>\n <tr>\n <th>2</th>\n <td>revenue</td>\n <td>a</td>\n <td>9</td>\n </tr>\n <tr>\n <th>3</th>\n <td>revenue</td>\n <td>b</td>\n <td>8</td>\n </tr>\n <tr>\n <th>4</th>\n <td>revenue</td>\n <td>b</td>\n <td>0</td>\n </tr>\n <tr>\n <th>5</th>\n <td>revenue</td>\n <td>b</td>\n <td>7</td>\n </tr>\n <tr>\n <th>6</th>\n <td>sales</td>\n <td>a</td>\n <td>50</td>\n </tr>\n <tr>\n <th>7</th>\n <td>sales</td>\n <td>a</td>\n <td>35</td>\n </tr>\n <tr>\n <th>8</th>\n <td>sales</td>\n <td>a</td>\n <td>25</td>\n </tr>\n <tr>\n <th>9</th>\n <td>sales</td>\n <td>b</td>\n <td>20</td>\n </tr>\n <tr>\n <th>10</th>\n <td>sales</td>\n <td>b</td>\n <td>0</td>\n </tr>\n <tr>\n <th>11</th>\n <td>sales</td>\n <td>b</td>\n <td>20</td>\n </tr>\n </tbody>\n</table>\n</div>",
	"text/plain": " NaN category value\n0 revenue a 14\n1 revenue a 16\n2 revenue a 9\n3 revenue b 8\n4 revenue b 0\n5 revenue b 7\n6 sales a 50\n7 sales a 35\n8 sales a 25\n9 sales b 20\n10 sales b 0\n11 sales b 20"
	},
	"metadata": {}
	}
	]
	},
	{
	"metadata": {},
	"cell_type": "markdown",
	"source": "## End\n\nThey are only a means to the end of learning how to do good data analysis "
	}
	],
	"metadata": {
	"kernelspec": {
	"name": "python3",
	"display_name": "Python 3",
	"language": "python"
	},
	"gist": {
	"id": "",
	"data": {
	"description": "Pandas - Pivot.ipynb",
	"public": false
	}
	},
	"language_info": {
	"file_extension": ".py",
	"nbconvert_exporter": "python",
	"version": "3.5.4",
	"codemirror_mode": {
	"name": "ipython",
	"version": 3
	},
	"name": "python",
	"mimetype": "text/x-python",
	"pygments_lexer": "ipython3"
	}
	},
	"nbformat": 4,
	"nbformat_minor": 2
	}