nipunbatra/pandas_epoch

## pandas_epoch
{
 "metadata": {
  "name": "pandas_epoch"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "In this IPython notebook we shall learn how to use pandas to analyse timeseries CSV data with epoch based timestamps. For the uninitiated epoch timestamps represents the number of seconds since Jan 1, 1970 and is commonly used for data recording in mant scientific applications. Unix command **date** can be used to give the current time and we can also extract the current epoch timestamp as follows:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "!date"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "Thu Jun  6 09:01:16 IST 2013\r\n"
       ]
      }
     ],
     "prompt_number": 2
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "!date +%s"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "1370489477\r\n"
       ]
      }
     ],
     "prompt_number": 3
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "We now create a dummy dataset using pandas containing 10 rows and 2 columns and save it locally as a CSV."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "import pandas as pd\n",
      "print pd.__version__\n",
      "import datetime,time\n",
      "from pandas import DataFrame\n",
      "import numpy as np"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "0.11.0\n"
       ]
      }
     ],
     "prompt_number": 4
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "start_time=int(time.time())\n",
      "timestamp=start_time+np.arange(10)\n",
      "column_1=np.random.rand(10)\n",
      "column_2=np.random.rand(10)\n",
      "df=DataFrame({'timestamp':timestamp,'column_1':column_1,'column_2':column_2})"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 5
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>column_1</th>\n",
        "      <th>column_2</th>\n",
        "      <th>timestamp</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td> 0.620274</td>\n",
        "      <td> 0.720347</td>\n",
        "      <td> 1370489479</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td> 0.603379</td>\n",
        "      <td> 0.214927</td>\n",
        "      <td> 1370489480</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td> 0.508877</td>\n",
        "      <td> 0.562860</td>\n",
        "      <td> 1370489481</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>3</th>\n",
        "      <td> 0.236682</td>\n",
        "      <td> 0.062259</td>\n",
        "      <td> 1370489482</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>4</th>\n",
        "      <td> 0.923264</td>\n",
        "      <td> 0.278048</td>\n",
        "      <td> 1370489483</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>5</th>\n",
        "      <td> 0.105673</td>\n",
        "      <td> 0.246425</td>\n",
        "      <td> 1370489484</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>6</th>\n",
        "      <td> 0.204324</td>\n",
        "      <td> 0.770854</td>\n",
        "      <td> 1370489485</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>7</th>\n",
        "      <td> 0.206739</td>\n",
        "      <td> 0.702672</td>\n",
        "      <td> 1370489486</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>8</th>\n",
        "      <td> 0.180858</td>\n",
        "      <td> 0.440023</td>\n",
        "      <td> 1370489487</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>9</th>\n",
        "      <td> 0.269232</td>\n",
        "      <td> 0.684387</td>\n",
        "      <td> 1370489488</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "output_type": "pyout",
       "prompt_number": 6,
       "text": [
        "   column_1  column_2   timestamp\n",
        "0  0.620274  0.720347  1370489479\n",
        "1  0.603379  0.214927  1370489480\n",
        "2  0.508877  0.562860  1370489481\n",
        "3  0.236682  0.062259  1370489482\n",
        "4  0.923264  0.278048  1370489483\n",
        "5  0.105673  0.246425  1370489484\n",
        "6  0.204324  0.770854  1370489485\n",
        "7  0.206739  0.702672  1370489486\n",
        "8  0.180858  0.440023  1370489487\n",
        "9  0.269232  0.684387  1370489488"
       ]
      }
     ],
     "prompt_number": 6
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Saving the dataframe into a CSV file"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df.to_csv('epoch_demo.csv',index=False)\n",
      "!cat epoch_demo.csv"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "column_1,column_2,timestamp\r\n",
        "0.6202741436145147,0.7203473783163402,1370489479\r\n",
        "0.603378652829255,0.2149274411077019,1370489480\r\n",
        "0.5088772165417312,0.5628600230105181,1370489481\r\n",
        "0.23668184075365228,0.062259281774838415,1370489482\r\n",
        "0.923263574664376,0.27804805708797475,1370489483\r\n",
        "0.10567266566500166,0.24642510043158228,1370489484\r\n",
        "0.20432401714102266,0.7708544521037892,1370489485\r\n",
        "0.20673911644596843,0.7026719669208518,1370489486\r\n",
        "0.1808583958327319,0.4400233442122533,1370489487\r\n",
        "0.2692318429010231,0.684387368168343,1370489488\r\n"
       ]
      }
     ],
     "prompt_number": 7
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_2=pd.read_csv('epoch_demo.csv',index_col=2)\n",
      "df_2"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>column_1</th>\n",
        "      <th>column_2</th>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>timestamp</th>\n",
        "      <th></th>\n",
        "      <th></th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>1370489479</th>\n",
        "      <td> 0.620274</td>\n",
        "      <td> 0.720347</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1370489480</th>\n",
        "      <td> 0.603379</td>\n",
        "      <td> 0.214927</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1370489481</th>\n",
        "      <td> 0.508877</td>\n",
        "      <td> 0.562860</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1370489482</th>\n",
        "      <td> 0.236682</td>\n",
        "      <td> 0.062259</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1370489483</th>\n",
        "      <td> 0.923264</td>\n",
        "      <td> 0.278048</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1370489484</th>\n",
        "      <td> 0.105673</td>\n",
        "      <td> 0.246425</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1370489485</th>\n",
        "      <td> 0.204324</td>\n",
        "      <td> 0.770854</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1370489486</th>\n",
        "      <td> 0.206739</td>\n",
        "      <td> 0.702672</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1370489487</th>\n",
        "      <td> 0.180858</td>\n",
        "      <td> 0.440023</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1370489488</th>\n",
        "      <td> 0.269232</td>\n",
        "      <td> 0.684387</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "output_type": "pyout",
       "prompt_number": 8,
       "text": [
        "            column_1  column_2\n",
        "timestamp                     \n",
        "1370489479  0.620274  0.720347\n",
        "1370489480  0.603379  0.214927\n",
        "1370489481  0.508877  0.562860\n",
        "1370489482  0.236682  0.062259\n",
        "1370489483  0.923264  0.278048\n",
        "1370489484  0.105673  0.246425\n",
        "1370489485  0.204324  0.770854\n",
        "1370489486  0.206739  0.702672\n",
        "1370489487  0.180858  0.440023\n",
        "1370489488  0.269232  0.684387"
       ]
      }
     ],
     "prompt_number": 8
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "We can notice that currently the index is of integer type. To convert it into DateTime type we use tp_datetime() function, which accepts time in nanoseconds. \n",
      "$$1 second = 10^9 nanoseconds$$"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_2.index=pd.to_datetime((df_2.index.values*1e9).astype(int))\n",
      "df_2"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>column_1</th>\n",
        "      <th>column_2</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>2013-06-06 03:31:19</th>\n",
        "      <td> 0.620274</td>\n",
        "      <td> 0.720347</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2013-06-06 03:31:20</th>\n",
        "      <td> 0.603379</td>\n",
        "      <td> 0.214927</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2013-06-06 03:31:21</th>\n",
        "      <td> 0.508877</td>\n",
        "      <td> 0.562860</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2013-06-06 03:31:22</th>\n",
        "      <td> 0.236682</td>\n",
        "      <td> 0.062259</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2013-06-06 03:31:23</th>\n",
        "      <td> 0.923264</td>\n",
        "      <td> 0.278048</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2013-06-06 03:31:24</th>\n",
        "      <td> 0.105673</td>\n",
        "      <td> 0.246425</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2013-06-06 03:31:25</th>\n",
        "      <td> 0.204324</td>\n",
        "      <td> 0.770854</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2013-06-06 03:31:26</th>\n",
        "      <td> 0.206739</td>\n",
        "      <td> 0.702672</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2013-06-06 03:31:27</th>\n",
        "      <td> 0.180858</td>\n",
        "      <td> 0.440023</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2013-06-06 03:31:28</th>\n",
        "      <td> 0.269232</td>\n",
        "      <td> 0.684387</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "output_type": "pyout",
       "prompt_number": 9,
       "text": [
        "                     column_1  column_2\n",
        "2013-06-06 03:31:19  0.620274  0.720347\n",
        "2013-06-06 03:31:20  0.603379  0.214927\n",
        "2013-06-06 03:31:21  0.508877  0.562860\n",
        "2013-06-06 03:31:22  0.236682  0.062259\n",
        "2013-06-06 03:31:23  0.923264  0.278048\n",
        "2013-06-06 03:31:24  0.105673  0.246425\n",
        "2013-06-06 03:31:25  0.204324  0.770854\n",
        "2013-06-06 03:31:26  0.206739  0.702672\n",
        "2013-06-06 03:31:27  0.180858  0.440023\n",
        "2013-06-06 03:31:28  0.269232  0.684387"
       ]
      }
     ],
     "prompt_number": 9
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_2.index"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "pyout",
       "prompt_number": 10,
       "text": [
        "<class 'pandas.tseries.index.DatetimeIndex'>\n",
        "[2013-06-06 03:31:19, ..., 2013-06-06 03:31:28]\n",
        "Length: 10, Freq: None, Timezone: None"
       ]
      }
     ],
     "prompt_number": 10
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "However, now we can see that the time is lagging the current local time and the index does not yet have an associated timezone. Thus, we first put the index in UTC to associate a timezone and then localize with our timezone (Asia/Kolkata) in our case."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_2.index=df_2.index.tz_localize('UTC').tz_convert('Asia/Kolkata')\n",
      "df_2"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>column_1</th>\n",
        "      <th>column_2</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>2013-06-06 09:01:19+05:30</th>\n",
        "      <td> 0.620274</td>\n",
        "      <td> 0.720347</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2013-06-06 09:01:20+05:30</th>\n",
        "      <td> 0.603379</td>\n",
        "      <td> 0.214927</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2013-06-06 09:01:21+05:30</th>\n",
        "      <td> 0.508877</td>\n",
        "      <td> 0.562860</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2013-06-06 09:01:22+05:30</th>\n",
        "      <td> 0.236682</td>\n",
        "      <td> 0.062259</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2013-06-06 09:01:23+05:30</th>\n",
        "      <td> 0.923264</td>\n",
        "      <td> 0.278048</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2013-06-06 09:01:24+05:30</th>\n",
        "      <td> 0.105673</td>\n",
        "      <td> 0.246425</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2013-06-06 09:01:25+05:30</th>\n",
        "      <td> 0.204324</td>\n",
        "      <td> 0.770854</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2013-06-06 09:01:26+05:30</th>\n",
        "      <td> 0.206739</td>\n",
        "      <td> 0.702672</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2013-06-06 09:01:27+05:30</th>\n",
        "      <td> 0.180858</td>\n",
        "      <td> 0.440023</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2013-06-06 09:01:28+05:30</th>\n",
        "      <td> 0.269232</td>\n",
        "      <td> 0.684387</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "output_type": "pyout",
       "prompt_number": 11,
       "text": [
        "                           column_1  column_2\n",
        "2013-06-06 09:01:19+05:30  0.620274  0.720347\n",
        "2013-06-06 09:01:20+05:30  0.603379  0.214927\n",
        "2013-06-06 09:01:21+05:30  0.508877  0.562860\n",
        "2013-06-06 09:01:22+05:30  0.236682  0.062259\n",
        "2013-06-06 09:01:23+05:30  0.923264  0.278048\n",
        "2013-06-06 09:01:24+05:30  0.105673  0.246425\n",
        "2013-06-06 09:01:25+05:30  0.204324  0.770854\n",
        "2013-06-06 09:01:26+05:30  0.206739  0.702672\n",
        "2013-06-06 09:01:27+05:30  0.180858  0.440023\n",
        "2013-06-06 09:01:28+05:30  0.269232  0.684387"
       ]
      }
     ],
     "prompt_number": 11
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_2.index"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "pyout",
       "prompt_number": 12,
       "text": [
        "<class 'pandas.tseries.index.DatetimeIndex'>\n",
        "[2013-06-06 09:01:19, ..., 2013-06-06 09:01:28]\n",
        "Length: 10, Freq: None, Timezone: Asia/Kolkata"
       ]
      }
     ],
     "prompt_number": 12
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "References\n",
      "\n",
      "* https://github.com/pydata/pandas/issues/3757\n",
      "* https://github.com/pydata/pandas/issues/3746"
     ]
    }
   ],
   "metadata": {}
  }
 ]
}
	{
	"metadata": {
	"name": "pandas_epoch"
	},
	"nbformat": 3,
	"nbformat_minor": 0,
	"worksheets": [
	{
	"cells": [
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"In this IPython notebook we shall learn how to use pandas to analyse timeseries CSV data with epoch based timestamps. For the uninitiated epoch timestamps represents the number of seconds since Jan 1, 1970 and is commonly used for data recording in mant scientific applications. Unix command date can be used to give the current time and we can also extract the current epoch timestamp as follows:"
	]
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"!date"
	],
	"language": "python",
	"metadata": {},
	"outputs": [
	{
	"output_type": "stream",
	"stream": "stdout",
	"text": [
	"Thu Jun 6 09:01:16 IST 2013\r\n"
	]
	}
	],
	"prompt_number": 2
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"!date +%s"
	],
	"language": "python",
	"metadata": {},
	"outputs": [
	{
	"output_type": "stream",
	"stream": "stdout",
	"text": [
	"1370489477\r\n"
	]
	}
	],
	"prompt_number": 3
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"We now create a dummy dataset using pandas containing 10 rows and 2 columns and save it locally as a CSV."
	]
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"import pandas as pd\n",
	"print pd.__version__\n",
	"import datetime,time\n",
	"from pandas import DataFrame\n",
	"import numpy as np"
	],
	"language": "python",
	"metadata": {},
	"outputs": [
	{
	"output_type": "stream",
	"stream": "stdout",
	"text": [
	"0.11.0\n"
	]
	}
	],
	"prompt_number": 4
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"start_time=int(time.time())\n",
	"timestamp=start_time+np.arange(10)\n",
	"column_1=np.random.rand(10)\n",
	"column_2=np.random.rand(10)\n",
	"df=DataFrame({'timestamp':timestamp,'column_1':column_1,'column_2':column_2})"
	],
	"language": "python",
	"metadata": {},
	"outputs": [],
	"prompt_number": 5
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"df"
	],
	"language": "python",
	"metadata": {},
	"outputs": [
	{
	"html": [
	"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
	"<table border=\"1\" class=\"dataframe\">\n",
	" <thead>\n",
	" <tr style=\"text-align: right;\">\n",
	" <th></th>\n",
	" <th>column_1</th>\n",
	" <th>column_2</th>\n",
	" <th>timestamp</th>\n",
	" </tr>\n",
	" </thead>\n",
	" <tbody>\n",
	" <tr>\n",
	" <th>0</th>\n",
	" <td> 0.620274</td>\n",
	" <td> 0.720347</td>\n",
	" <td> 1370489479</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>1</th>\n",
	" <td> 0.603379</td>\n",
	" <td> 0.214927</td>\n",
	" <td> 1370489480</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>2</th>\n",
	" <td> 0.508877</td>\n",
	" <td> 0.562860</td>\n",
	" <td> 1370489481</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>3</th>\n",
	" <td> 0.236682</td>\n",
	" <td> 0.062259</td>\n",
	" <td> 1370489482</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>4</th>\n",
	" <td> 0.923264</td>\n",
	" <td> 0.278048</td>\n",
	" <td> 1370489483</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>5</th>\n",
	" <td> 0.105673</td>\n",
	" <td> 0.246425</td>\n",
	" <td> 1370489484</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>6</th>\n",
	" <td> 0.204324</td>\n",
	" <td> 0.770854</td>\n",
	" <td> 1370489485</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>7</th>\n",
	" <td> 0.206739</td>\n",
	" <td> 0.702672</td>\n",
	" <td> 1370489486</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>8</th>\n",
	" <td> 0.180858</td>\n",
	" <td> 0.440023</td>\n",
	" <td> 1370489487</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>9</th>\n",
	" <td> 0.269232</td>\n",
	" <td> 0.684387</td>\n",
	" <td> 1370489488</td>\n",
	" </tr>\n",
	" </tbody>\n",
	"</table>\n",
	"</div>"
	],
	"output_type": "pyout",
	"prompt_number": 6,
	"text": [
	" column_1 column_2 timestamp\n",
	"0 0.620274 0.720347 1370489479\n",
	"1 0.603379 0.214927 1370489480\n",
	"2 0.508877 0.562860 1370489481\n",
	"3 0.236682 0.062259 1370489482\n",
	"4 0.923264 0.278048 1370489483\n",
	"5 0.105673 0.246425 1370489484\n",
	"6 0.204324 0.770854 1370489485\n",
	"7 0.206739 0.702672 1370489486\n",
	"8 0.180858 0.440023 1370489487\n",
	"9 0.269232 0.684387 1370489488"
	]
	}
	],
	"prompt_number": 6
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Saving the dataframe into a CSV file"
	]
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"df.to_csv('epoch_demo.csv',index=False)\n",
	"!cat epoch_demo.csv"
	],
	"language": "python",
	"metadata": {},
	"outputs": [
	{
	"output_type": "stream",
	"stream": "stdout",
	"text": [
	"column_1,column_2,timestamp\r\n",
	"0.6202741436145147,0.7203473783163402,1370489479\r\n",
	"0.603378652829255,0.2149274411077019,1370489480\r\n",
	"0.5088772165417312,0.5628600230105181,1370489481\r\n",
	"0.23668184075365228,0.062259281774838415,1370489482\r\n",
	"0.923263574664376,0.27804805708797475,1370489483\r\n",
	"0.10567266566500166,0.24642510043158228,1370489484\r\n",
	"0.20432401714102266,0.7708544521037892,1370489485\r\n",
	"0.20673911644596843,0.7026719669208518,1370489486\r\n",
	"0.1808583958327319,0.4400233442122533,1370489487\r\n",
	"0.2692318429010231,0.684387368168343,1370489488\r\n"
	]
	}
	],
	"prompt_number": 7
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"df_2=pd.read_csv('epoch_demo.csv',index_col=2)\n",
	"df_2"
	],
	"language": "python",
	"metadata": {},
	"outputs": [
	{
	"html": [
	"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
	"<table border=\"1\" class=\"dataframe\">\n",
	" <thead>\n",
	" <tr style=\"text-align: right;\">\n",
	" <th></th>\n",
	" <th>column_1</th>\n",
	" <th>column_2</th>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>timestamp</th>\n",
	" <th></th>\n",
	" <th></th>\n",
	" </tr>\n",
	" </thead>\n",
	" <tbody>\n",
	" <tr>\n",
	" <th>1370489479</th>\n",
	" <td> 0.620274</td>\n",
	" <td> 0.720347</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>1370489480</th>\n",
	" <td> 0.603379</td>\n",
	" <td> 0.214927</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>1370489481</th>\n",
	" <td> 0.508877</td>\n",
	" <td> 0.562860</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>1370489482</th>\n",
	" <td> 0.236682</td>\n",
	" <td> 0.062259</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>1370489483</th>\n",
	" <td> 0.923264</td>\n",
	" <td> 0.278048</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>1370489484</th>\n",
	" <td> 0.105673</td>\n",
	" <td> 0.246425</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>1370489485</th>\n",
	" <td> 0.204324</td>\n",
	" <td> 0.770854</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>1370489486</th>\n",
	" <td> 0.206739</td>\n",
	" <td> 0.702672</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>1370489487</th>\n",
	" <td> 0.180858</td>\n",
	" <td> 0.440023</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>1370489488</th>\n",
	" <td> 0.269232</td>\n",
	" <td> 0.684387</td>\n",
	" </tr>\n",
	" </tbody>\n",
	"</table>\n",
	"</div>"
	],
	"output_type": "pyout",
	"prompt_number": 8,
	"text": [
	" column_1 column_2\n",
	"timestamp \n",
	"1370489479 0.620274 0.720347\n",
	"1370489480 0.603379 0.214927\n",
	"1370489481 0.508877 0.562860\n",
	"1370489482 0.236682 0.062259\n",
	"1370489483 0.923264 0.278048\n",
	"1370489484 0.105673 0.246425\n",
	"1370489485 0.204324 0.770854\n",
	"1370489486 0.206739 0.702672\n",
	"1370489487 0.180858 0.440023\n",
	"1370489488 0.269232 0.684387"
	]
	}
	],
	"prompt_number": 8
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"We can notice that currently the index is of integer type. To convert it into DateTime type we use tp_datetime() function, which accepts time in nanoseconds. \n",
	"$$1 second = 10^9 nanoseconds$$"
	]
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"df_2.index=pd.to_datetime((df_2.index.values*1e9).astype(int))\n",
	"df_2"
	],
	"language": "python",
	"metadata": {},
	"outputs": [
	{
	"html": [
	"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
	"<table border=\"1\" class=\"dataframe\">\n",
	" <thead>\n",
	" <tr style=\"text-align: right;\">\n",
	" <th></th>\n",
	" <th>column_1</th>\n",
	" <th>column_2</th>\n",
	" </tr>\n",
	" </thead>\n",
	" <tbody>\n",
	" <tr>\n",
	" <th>2013-06-06 03:31:19</th>\n",
	" <td> 0.620274</td>\n",
	" <td> 0.720347</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>2013-06-06 03:31:20</th>\n",
	" <td> 0.603379</td>\n",
	" <td> 0.214927</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>2013-06-06 03:31:21</th>\n",
	" <td> 0.508877</td>\n",
	" <td> 0.562860</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>2013-06-06 03:31:22</th>\n",
	" <td> 0.236682</td>\n",
	" <td> 0.062259</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>2013-06-06 03:31:23</th>\n",
	" <td> 0.923264</td>\n",
	" <td> 0.278048</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>2013-06-06 03:31:24</th>\n",
	" <td> 0.105673</td>\n",
	" <td> 0.246425</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>2013-06-06 03:31:25</th>\n",
	" <td> 0.204324</td>\n",
	" <td> 0.770854</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>2013-06-06 03:31:26</th>\n",
	" <td> 0.206739</td>\n",
	" <td> 0.702672</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>2013-06-06 03:31:27</th>\n",
	" <td> 0.180858</td>\n",
	" <td> 0.440023</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>2013-06-06 03:31:28</th>\n",
	" <td> 0.269232</td>\n",
	" <td> 0.684387</td>\n",
	" </tr>\n",
	" </tbody>\n",
	"</table>\n",
	"</div>"
	],
	"output_type": "pyout",
	"prompt_number": 9,
	"text": [
	" column_1 column_2\n",
	"2013-06-06 03:31:19 0.620274 0.720347\n",
	"2013-06-06 03:31:20 0.603379 0.214927\n",
	"2013-06-06 03:31:21 0.508877 0.562860\n",
	"2013-06-06 03:31:22 0.236682 0.062259\n",
	"2013-06-06 03:31:23 0.923264 0.278048\n",
	"2013-06-06 03:31:24 0.105673 0.246425\n",
	"2013-06-06 03:31:25 0.204324 0.770854\n",
	"2013-06-06 03:31:26 0.206739 0.702672\n",
	"2013-06-06 03:31:27 0.180858 0.440023\n",
	"2013-06-06 03:31:28 0.269232 0.684387"
	]
	}
	],
	"prompt_number": 9
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"df_2.index"
	],
	"language": "python",
	"metadata": {},
	"outputs": [
	{
	"output_type": "pyout",
	"prompt_number": 10,
	"text": [
	"<class 'pandas.tseries.index.DatetimeIndex'>\n",
	"[2013-06-06 03:31:19, ..., 2013-06-06 03:31:28]\n",
	"Length: 10, Freq: None, Timezone: None"
	]
	}
	],
	"prompt_number": 10
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"However, now we can see that the time is lagging the current local time and the index does not yet have an associated timezone. Thus, we first put the index in UTC to associate a timezone and then localize with our timezone (Asia/Kolkata) in our case."
	]
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"df_2.index=df_2.index.tz_localize('UTC').tz_convert('Asia/Kolkata')\n",
	"df_2"
	],
	"language": "python",
	"metadata": {},
	"outputs": [
	{
	"html": [
	"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
	"<table border=\"1\" class=\"dataframe\">\n",
	" <thead>\n",
	" <tr style=\"text-align: right;\">\n",
	" <th></th>\n",
	" <th>column_1</th>\n",
	" <th>column_2</th>\n",
	" </tr>\n",
	" </thead>\n",
	" <tbody>\n",
	" <tr>\n",
	" <th>2013-06-06 09:01:19+05:30</th>\n",
	" <td> 0.620274</td>\n",
	" <td> 0.720347</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>2013-06-06 09:01:20+05:30</th>\n",
	" <td> 0.603379</td>\n",
	" <td> 0.214927</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>2013-06-06 09:01:21+05:30</th>\n",
	" <td> 0.508877</td>\n",
	" <td> 0.562860</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>2013-06-06 09:01:22+05:30</th>\n",
	" <td> 0.236682</td>\n",
	" <td> 0.062259</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>2013-06-06 09:01:23+05:30</th>\n",
	" <td> 0.923264</td>\n",
	" <td> 0.278048</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>2013-06-06 09:01:24+05:30</th>\n",
	" <td> 0.105673</td>\n",
	" <td> 0.246425</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>2013-06-06 09:01:25+05:30</th>\n",
	" <td> 0.204324</td>\n",
	" <td> 0.770854</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>2013-06-06 09:01:26+05:30</th>\n",
	" <td> 0.206739</td>\n",
	" <td> 0.702672</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>2013-06-06 09:01:27+05:30</th>\n",
	" <td> 0.180858</td>\n",
	" <td> 0.440023</td>\n",
	" </tr>\n",
	" <tr>\n",
	" <th>2013-06-06 09:01:28+05:30</th>\n",
	" <td> 0.269232</td>\n",
	" <td> 0.684387</td>\n",
	" </tr>\n",
	" </tbody>\n",
	"</table>\n",
	"</div>"
	],
	"output_type": "pyout",
	"prompt_number": 11,
	"text": [
	" column_1 column_2\n",
	"2013-06-06 09:01:19+05:30 0.620274 0.720347\n",
	"2013-06-06 09:01:20+05:30 0.603379 0.214927\n",
	"2013-06-06 09:01:21+05:30 0.508877 0.562860\n",
	"2013-06-06 09:01:22+05:30 0.236682 0.062259\n",
	"2013-06-06 09:01:23+05:30 0.923264 0.278048\n",
	"2013-06-06 09:01:24+05:30 0.105673 0.246425\n",
	"2013-06-06 09:01:25+05:30 0.204324 0.770854\n",
	"2013-06-06 09:01:26+05:30 0.206739 0.702672\n",
	"2013-06-06 09:01:27+05:30 0.180858 0.440023\n",
	"2013-06-06 09:01:28+05:30 0.269232 0.684387"
	]
	}
	],
	"prompt_number": 11
	},
	{
	"cell_type": "code",
	"collapsed": false,
	"input": [
	"df_2.index"
	],
	"language": "python",
	"metadata": {},
	"outputs": [
	{
	"output_type": "pyout",
	"prompt_number": 12,
	"text": [
	"<class 'pandas.tseries.index.DatetimeIndex'>\n",
	"[2013-06-06 09:01:19, ..., 2013-06-06 09:01:28]\n",
	"Length: 10, Freq: None, Timezone: Asia/Kolkata"
	]
	}
	],
	"prompt_number": 12
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"References\n",
	"\n",
	"* https://github.com/pydata/pandas/issues/3757\n",
	"* https://github.com/pydata/pandas/issues/3746"
	]
	}
	],
	"metadata": {}
	}
	]
	}