DeepakRavi/Spanish A-B Test.ipynb

## Spanish A-B Test.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Spanish Translation A/B Test"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Company XYZ is a worldwide e-commerce site with localized versions of the site. It was observed that Spain-based users have\n",
    "a much higher conversion rate than users in any other Spanish-speaking country. One possible reason is poor translation:\n",
    "all Spanish-speaking countries were served the same translation as the Spain-based site, which was written by a Spaniard. It was therefore agreed to run an A/B test with two versions of the site: one written by a local translator from each country, and the original written by the Spaniard.\n",
    "\n",
    "After the test had run for five days, the results turned out to be negative: the local translation appeared to perform worse than the original translation.\n",
    "\n",
    "The following analysis investigates whether the test was actually negative and, if so, the possible reasons for it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "from pandas import DataFrame, Series\n",
    "from matplotlib import pyplot as plt\n",
    "import matplotlib.ticker as ticker\n",
    "import scipy as sc\n",
    "from scipy import stats\n",
    "import sklearn\n",
    "from sklearn.tree import DecisionTreeClassifier, export_graphviz\n",
    "from sklearn.metrics import classification_report\n",
    "# sklearn.cross_validation and sklearn.grid_search were merged into sklearn.model_selection (scikit-learn 0.18+)\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.pipeline import Pipeline\n",
    "from sklearn.model_selection import GridSearchCV\n",
    "from sklearn.preprocessing import LabelEncoder\n",
    "from sklearn.preprocessing import Imputer\n",
    "from StringIO import StringIO\n",
    "from inspect import getmembers"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Reading in the data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "test_table = pd.read_csv('test_table.csv')\n",
    "user_table = pd.read_csv('user_table.csv')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>user_id</th>\n",
       "      <th>date</th>\n",
       "      <th>source</th>\n",
       "      <th>device</th>\n",
       "      <th>browser_language</th>\n",
       "      <th>ads_channel</th>\n",
       "      <th>browser</th>\n",
       "      <th>conversion</th>\n",
       "      <th>test</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>315281</td>\n",
       "      <td>2015-12-03</td>\n",
       "      <td>Direct</td>\n",
       "      <td>Web</td>\n",
       "      <td>ES</td>\n",
       "      <td>NaN</td>\n",
       "      <td>IE</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>497851</td>\n",
       "      <td>2015-12-04</td>\n",
       "      <td>Ads</td>\n",
       "      <td>Web</td>\n",
       "      <td>ES</td>\n",
       "      <td>Google</td>\n",
       "      <td>IE</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>848402</td>\n",
       "      <td>2015-12-04</td>\n",
       "      <td>Ads</td>\n",
       "      <td>Web</td>\n",
       "      <td>ES</td>\n",
       "      <td>Facebook</td>\n",
       "      <td>Chrome</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>290051</td>\n",
       "      <td>2015-12-03</td>\n",
       "      <td>Ads</td>\n",
       "      <td>Mobile</td>\n",
       "      <td>Other</td>\n",
       "      <td>Facebook</td>\n",
       "      <td>Android_App</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>548435</td>\n",
       "      <td>2015-11-30</td>\n",
       "      <td>Ads</td>\n",
       "      <td>Web</td>\n",
       "      <td>ES</td>\n",
       "      <td>Google</td>\n",
       "      <td>FireFox</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   user_id        date  source  device browser_language ads_channel  \\\n",
       "0   315281  2015-12-03  Direct     Web               ES         NaN   \n",
       "1   497851  2015-12-04     Ads     Web               ES      Google   \n",
       "2   848402  2015-12-04     Ads     Web               ES    Facebook   \n",
       "3   290051  2015-12-03     Ads  Mobile            Other    Facebook   \n",
       "4   548435  2015-11-30     Ads     Web               ES      Google   \n",
       "\n",
       "       browser  conversion  test  \n",
       "0           IE           1     0  \n",
       "1           IE           0     1  \n",
       "2       Chrome           0     0  \n",
       "3  Android_App           0     1  \n",
       "4      FireFox           0     1  "
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "test_table.head(5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>user_id</th>\n",
       "      <th>sex</th>\n",
       "      <th>age</th>\n",
       "      <th>country</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>765821</td>\n",
       "      <td>M</td>\n",
       "      <td>20</td>\n",
       "      <td>Mexico</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>343561</td>\n",
       "      <td>F</td>\n",
       "      <td>27</td>\n",
       "      <td>Nicaragua</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>118744</td>\n",
       "      <td>M</td>\n",
       "      <td>23</td>\n",
       "      <td>Colombia</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>987753</td>\n",
       "      <td>F</td>\n",
       "      <td>27</td>\n",
       "      <td>Venezuela</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>554597</td>\n",
       "      <td>F</td>\n",
       "      <td>20</td>\n",
       "      <td>Spain</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   user_id sex  age    country\n",
       "0   765821   M   20     Mexico\n",
       "1   343561   F   27  Nicaragua\n",
       "2   118744   M   23   Colombia\n",
       "3   987753   F   27  Venezuela\n",
       "4   554597   F   20      Spain"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "user_table.head(5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Checking that `user_id` is unique in each table before merging the two datasets"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "len(test_table) == len(test_table['user_id'].unique())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "len(user_table) == len(user_table['user_id'].unique())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Comparing the number of rows in the two tables"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(453321, 9)"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "test_table.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(452867, 4)"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "user_table.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This implies that the user_table is missing a few user ids (453,321 rows in test_table versus 452,867 in user_table, a difference of 454).\n",
    "Therefore, when we join the tables, we should not lose the user ids that are in the test table but not in the user table.\n",
    "\n",
    "We could do either a left join or an outer join. The outer join is used here."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>user_id</th>\n",
       "      <th>date</th>\n",
       "      <th>source</th>\n",
       "      <th>device</th>\n",
       "      <th>browser_language</th>\n",
       "      <th>ads_channel</th>\n",
       "      <th>browser</th>\n",
       "      <th>conversion</th>\n",
       "      <th>test</th>\n",
       "      <th>sex</th>\n",
       "      <th>age</th>\n",
       "      <th>country</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>315281</td>\n",
       "      <td>2015-12-03</td>\n",
       "      <td>Direct</td>\n",
       "      <td>Web</td>\n",
       "      <td>ES</td>\n",
       "      <td>NaN</td>\n",
       "      <td>IE</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>M</td>\n",
       "      <td>32.0</td>\n",
       "      <td>Spain</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>497851</td>\n",
       "      <td>2015-12-04</td>\n",
       "      <td>Ads</td>\n",
       "      <td>Web</td>\n",
       "      <td>ES</td>\n",
       "      <td>Google</td>\n",
       "      <td>IE</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>M</td>\n",
       "      <td>21.0</td>\n",
       "      <td>Mexico</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>848402</td>\n",
       "      <td>2015-12-04</td>\n",
       "      <td>Ads</td>\n",
       "      <td>Web</td>\n",
       "      <td>ES</td>\n",
       "      <td>Facebook</td>\n",
       "      <td>Chrome</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>M</td>\n",
       "      <td>34.0</td>\n",
       "      <td>Spain</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>290051</td>\n",
       "      <td>2015-12-03</td>\n",
       "      <td>Ads</td>\n",
       "      <td>Mobile</td>\n",
       "      <td>Other</td>\n",
       "      <td>Facebook</td>\n",
       "      <td>Android_App</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>F</td>\n",
       "      <td>22.0</td>\n",
       "      <td>Mexico</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>548435</td>\n",
       "      <td>2015-11-30</td>\n",
       "      <td>Ads</td>\n",
       "      <td>Web</td>\n",
       "      <td>ES</td>\n",
       "      <td>Google</td>\n",
       "      <td>FireFox</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>M</td>\n",
       "      <td>19.0</td>\n",
       "      <td>Mexico</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   user_id        date  source  device browser_language ads_channel  \\\n",
       "0   315281  2015-12-03  Direct     Web               ES         NaN   \n",
       "1   497851  2015-12-04     Ads     Web               ES      Google   \n",
       "2   848402  2015-12-04     Ads     Web               ES    Facebook   \n",
       "3   290051  2015-12-03     Ads  Mobile            Other    Facebook   \n",
       "4   548435  2015-11-30     Ads     Web               ES      Google   \n",
       "\n",
       "       browser  conversion  test sex   age country  \n",
       "0           IE           1     0   M  32.0   Spain  \n",
       "1           IE           0     1   M  21.0  Mexico  \n",
       "2       Chrome           0     0   M  34.0   Spain  \n",
       "3  Android_App           0     1   F  22.0  Mexico  \n",
       "4      FireFox           0     1   M  19.0  Mexico  "
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data = pd.merge(test_table, user_table, on = 'user_id', how = 'outer')\n",
    "data.head(5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(453321, 12)"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data.shape"
   ]
  },
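  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a sanity check on the join described above, the cell below is a minimal sketch of an alternative merge that reports which rows found a match. It assumes the `test_table` and `user_table` frames loaded earlier; `checked` is an illustrative name that is not used elsewhere in this notebook. `validate='one_to_one'` makes pandas raise an error if `user_id` is duplicated on either side, and `indicator=True` adds a `_merge` column, where `left_only` marks test users with no demographic record."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Sketch only: `checked` is a hypothetical name, not part of the original analysis\n",
    "# A left join keeps every test_table row; the outer join used above gives the same rows here\n",
    "checked = pd.merge(test_table, user_table, on='user_id', how='left',\n",
    "                   validate='one_to_one', indicator=True)\n",
    "\n",
    "# Count matched rows ('both') versus test users missing from user_table ('left_only')\n",
    "checked['_merge'].value_counts()"
   ]
  },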
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Summarizing the data. Getting the basic descriptive statistics."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>user_id</th>\n",
       "      <th>conversion</th>\n",
       "      <th>test</th>\n",
       "      <th>age</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>count</th>\n",
       "      <td>453321.000000</td>\n",
       "      <td>453321.000000</td>\n",
       "      <td>453321.000000</td>\n",
       "      <td>452867.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mean</th>\n",
       "      <td>499937.514728</td>\n",
       "      <td>0.049579</td>\n",
       "      <td>0.476446</td>\n",
       "      <td>27.130740</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>std</th>\n",
       "      <td>288665.193436</td>\n",
       "      <td>0.217073</td>\n",
       "      <td>0.499445</td>\n",
       "      <td>6.776678</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>min</th>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>18.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25%</th>\n",
       "      <td>249816.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>22.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>50%</th>\n",
       "      <td>500019.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>26.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>75%</th>\n",
       "      <td>749522.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>31.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>max</th>\n",
       "      <td>1000000.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>70.000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "              user_id     conversion           test            age\n",
       "count   453321.000000  453321.000000  453321.000000  452867.000000\n",
       "mean    499937.514728       0.049579       0.476446      27.130740\n",
       "std     288665.193436       0.217073       0.499445       6.776678\n",
       "min          1.000000       0.000000       0.000000      18.000000\n",
       "25%     249816.000000       0.000000       0.000000      22.000000\n",
       "50%     500019.000000       0.000000       0.000000      26.000000\n",
       "75%     749522.000000       0.000000       1.000000      31.000000\n",
       "max    1000000.000000       1.000000       1.000000      70.000000"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data.describe()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "#### Some insights from the data so far. \n",
    "\n",
    "1. The average conversion rate is roughly 5% (0.0496). This is considered fairly normal for the industry.\n",
    "2. About 48% of the users are in the test group and 52% in the control group.\n",
    "3. This is a fairly young user base, with a mean age of 27 years. 75% of the users are 31 or younger. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Checking whether Spain actually converts better than the other Spanish-speaking countries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>conversion</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>country</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Argentina</th>\n",
       "      <td>0.013994</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Bolivia</th>\n",
       "      <td>0.048634</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Chile</th>\n",
       "      <td>0.049704</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Colombia</th>\n",
       "      <td>0.051332</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Costa Rica</th>\n",
       "      <td>0.053494</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Ecuador</th>\n",
       "      <td>0.049072</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>El Salvador</th>\n",
       "      <td>0.050765</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Guatemala</th>\n",
       "      <td>0.049653</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Honduras</th>\n",
       "      <td>0.049253</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Mexico</th>\n",
       "      <td>0.050341</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Nicaragua</th>\n",
       "      <td>0.053399</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Panama</th>\n",
       "      <td>0.048089</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Paraguay</th>\n",
       "      <td>0.048863</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Peru</th>\n",
       "      <td>0.050258</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Spain</th>\n",
       "      <td>0.079719</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Uruguay</th>\n",
       "      <td>0.012821</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Venezuela</th>\n",
       "      <td>0.049666</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             conversion\n",
       "country                \n",
       "Argentina      0.013994\n",
       "Bolivia        0.048634\n",
       "Chile          0.049704\n",
       "Colombia       0.051332\n",
       "Costa Rica     0.053494\n",
       "Ecuador        0.049072\n",
       "El Salvador    0.050765\n",
       "Guatemala      0.049653\n",
       "Honduras       0.049253\n",
       "Mexico         0.050341\n",
       "Nicaragua      0.053399\n",
       "Panama         0.048089\n",
       "Paraguay       0.048863\n",
       "Peru           0.050258\n",
       "Spain          0.079719\n",
       "Uruguay        0.012821\n",
       "Venezuela      0.049666"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data.groupby('country')[['conversion']].mean()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "From the above results, it is evident that Spain has a conversion rate of nearly 8%, whereas most other countries convert at roughly 5%. Argentina (1.4%) and Uruguay (1.3%) are notably lower. Therefore, Spain does indeed have the best conversion rate. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Below is a comparison of the conversion rates in the test and control groups. The control group (5.5%) did noticeably better than the test group (4.3%)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>conversion</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>test</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.055179</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.043425</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      conversion\n",
       "test            \n",
       "0       0.055179\n",
       "1       0.043425"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data.groupby('test')[['conversion']].mean()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Below is the same comparison of the test and control groups with Spain excluded"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>user_id</th>\n",
       "      <th>date</th>\n",
       "      <th>source</th>\n",
       "      <th>device</th>\n",
       "      <th>browser_language</th>\n",
       "      <th>ads_channel</th>\n",
       "      <th>browser</th>\n",
       "      <th>conversion</th>\n",
       "      <th>test</th>\n",
       "      <th>sex</th>\n",
       "      <th>age</th>\n",
       "      <th>country</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>497851</td>\n",
       "      <td>2015-12-04</td>\n",
       "      <td>Ads</td>\n",
       "      <td>Web</td>\n",
       "      <td>ES</td>\n",
       "      <td>Google</td>\n",
       "      <td>IE</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>M</td>\n",
       "      <td>21.0</td>\n",
       "      <td>Mexico</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>290051</td>\n",
       "      <td>2015-12-03</td>\n",
       "      <td>Ads</td>\n",
       "      <td>Mobile</td>\n",
       "      <td>Other</td>\n",
       "      <td>Facebook</td>\n",
       "      <td>Android_App</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>F</td>\n",
       "      <td>22.0</td>\n",
       "      <td>Mexico</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>548435</td>\n",
       "      <td>2015-11-30</td>\n",
       "      <td>Ads</td>\n",
       "      <td>Web</td>\n",
       "      <td>ES</td>\n",
       "      <td>Google</td>\n",
       "      <td>FireFox</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>M</td>\n",
       "      <td>19.0</td>\n",
       "      <td>Mexico</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>540675</td>\n",
       "      <td>2015-12-03</td>\n",
       "      <td>Direct</td>\n",
       "      <td>Mobile</td>\n",
       "      <td>ES</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Android_App</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>F</td>\n",
       "      <td>22.0</td>\n",
       "      <td>Venezuela</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>863394</td>\n",
       "      <td>2015-12-04</td>\n",
       "      <td>SEO</td>\n",
       "      <td>Mobile</td>\n",
       "      <td>Other</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Android_App</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>M</td>\n",
       "      <td>35.0</td>\n",
       "      <td>Mexico</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   user_id        date  source  device browser_language ads_channel  \\\n",
       "1   497851  2015-12-04     Ads     Web               ES      Google   \n",
       "3   290051  2015-12-03     Ads  Mobile            Other    Facebook   \n",
       "4   548435  2015-11-30     Ads     Web               ES      Google   \n",
       "5   540675  2015-12-03  Direct  Mobile               ES         NaN   \n",
       "6   863394  2015-12-04     SEO  Mobile            Other         NaN   \n",
       "\n",
       "       browser  conversion  test sex   age    country  \n",
       "1           IE           0     1   M  21.0     Mexico  \n",
       "3  Android_App           0     1   F  22.0     Mexico  \n",
       "4      FireFox           0     1   M  19.0     Mexico  \n",
       "5  Android_App           0     1   F  22.0  Venezuela  \n",
       "6  Android_App           0     0   M  35.0     Mexico  "
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
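   ],
   "source": [
    "# Exclude Spain; take an explicit copy so later column assignments do not act on a slice of `data`\n",
    "# Note: rows with a missing country are kept, because NaN != 'Spain' evaluates to True\n",
    "data_new = data[data['country'] != 'Spain'].copy()\n",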
    "data_new.head(5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>conversion</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>test</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.048330</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.043425</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      conversion\n",
       "test            \n",
       "0       0.048330\n",
       "1       0.043425"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_new.groupby('test')[['conversion']].mean()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "#### Some quick insights\n",
    "\n",
    "1. With Spain excluded, the control and test groups fare much more similarly (4.8% versus 4.3%), with the control group still performing slightly better.\n",
    "\n",
    "2. The test-group rate is unchanged after removing Spain, while the control-group rate drops from 5.5% to 4.8%. This suggests the Spanish users were concentrated in the control group, and their higher conversion rate inflated the control mean.\n",
    "\n",
    "3. For countries other than Spain, it does not seem to matter much at first glance whether the translation is by a local or by a Spaniard: the results are more or less the same."
   ]
  },
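  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One way to probe the group composition discussed above is to compare the country mix of the two groups. The cell below is a minimal sketch using `pd.crosstab` on the `data_new` frame; `country_mix` is an illustrative name that is not used elsewhere in this notebook. If users were assigned at random, each country should make up a similar share of the control (0) and test (1) columns."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Sketch only: `country_mix` is a hypothetical name, not part of the original analysis\n",
    "# normalize='columns' turns counts into each country's share within a group\n",
    "country_mix = pd.crosstab(data_new['country'], data_new['test'], normalize='columns')\n",
    "country_mix"
   ]
  },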
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Running a Welch two-sample t-test on the two groups to check whether their mean conversion rates differ significantly"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Ttest_indResult(statistic=7.3939374121344805, pvalue=1.4282994754055316e-13)"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "zero = data_new[data_new['test'] == 0]\n",
    "one = data_new[data_new['test'] == 1]\n",
    "\n",
    "sc.stats.ttest_ind(zero['conversion'], one['conversion'], equal_var = False, axis = 0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "#### Insights\n",
    "\n",
    "As the p-value (about 1.4e-13) is far below alpha = 0.05, we can reject the null hypothesis.\n",
    "That is, the difference between the two group means is statistically significant.\n",
    "Mean of test = 4.3% and mean of control = 4.8%, a relative drop of roughly 10%.\n",
    "This would be a large difference in conversion if it were real.\n",
    "\n",
    "Likely reasons for this include:\n",
    "\n",
    "1. Users were not really assigned to the control and test groups at random\n",
    "2. Not enough data for the sample to truly represent the population"
   ]
  },
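  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the test above less of a black box, here is a minimal sketch of Welch's t-test computed by hand and compared with `scipy.stats.ttest_ind`. The samples are simulated with hypothetical conversion rates and sizes, and `welch_t` is an illustrative helper; none of it comes from the experiment data.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "from scipy import stats\n",
    "\n",
    "rng = np.random.RandomState(0)\n",
    "# Hypothetical 0/1 conversion flags, NOT the notebook's data\n",
    "control = rng.binomial(1, 0.048, size=20000)\n",
    "test = rng.binomial(1, 0.043, size=20000)\n",
    "\n",
    "def welch_t(a, b):\n",
    "    # Per-group variance of the sample mean\n",
    "    va = a.var(ddof=1) / len(a)\n",
    "    vb = b.var(ddof=1) / len(b)\n",
    "    t = (a.mean() - b.mean()) / np.sqrt(va + vb)\n",
    "    # Welch-Satterthwaite approximation of the degrees of freedom\n",
    "    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))\n",
    "    return t, df\n",
    "\n",
    "t_manual, df = welch_t(control, test)\n",
    "p_manual = 2 * stats.t.sf(abs(t_manual), df)  # two-sided p-value\n",
    "\n",
    "t_scipy, p_scipy = stats.ttest_ind(control, test, equal_var=False)\n",
    "```\n",
    "\n",
    "The hand-computed statistic and p-value should agree with SciPy's up to floating-point error. Note that a small p-value only says the group means differ; it says nothing about why, which is what the rest of the analysis investigates."
   ]
  },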
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Converting the data to a standard format. This includes parsing the date column as datetime"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "data_new['date'] = pd.to_datetime(data_new['date'], infer_datetime_format = True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Computing the daily conversion rate of each group, then plotting the test/control ratio, to check for any anomalies or biases"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>conversion</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>date</th>\n",
       "      <th>test</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th rowspan=\"2\" valign=\"top\">2015-11-30</th>\n",
       "      <th>0</th>\n",
       "      <td>0.051378</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.043886</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"2\" valign=\"top\">2015-12-01</th>\n",
       "      <th>0</th>\n",
       "      <td>0.046287</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.041387</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"2\" valign=\"top\">2015-12-02</th>\n",
       "      <th>0</th>\n",
       "      <td>0.048550</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.044234</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"2\" valign=\"top\">2015-12-03</th>\n",
       "      <th>0</th>\n",
       "      <td>0.049284</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.043884</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"2\" valign=\"top\">2015-12-04</th>\n",
       "      <th>0</th>\n",
       "      <td>0.047043</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.043491</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                 conversion\n",
       "date       test            \n",
       "2015-11-30 0       0.051378\n",
       "           1       0.043886\n",
       "2015-12-01 0       0.046287\n",
       "           1       0.041387\n",
       "2015-12-02 0       0.048550\n",
       "           1       0.044234\n",
       "2015-12-03 0       0.049284\n",
       "           1       0.043884\n",
       "2015-12-04 0       0.047043\n",
       "           1       0.043491"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "time_series = data_new.groupby(['date','test'])[['conversion']].mean()\n",
    "time_series"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "date\n",
       "2015-11-30    0.854179\n",
       "2015-12-01    0.894141\n",
       "2015-12-02    0.911090\n",
       "2015-12-03    0.890439\n",
       "2015-12-04    0.924486\n",
       "dtype: float64"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Daily ratio of the test conversion rate to the control conversion rate\n",
    "time_series = time_series.unstack()['conversion'][1]/time_series.unstack()['conversion'][0]\n",
    "time_series"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZMAAAEZCAYAAABSN8jfAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3XecVNX5x/HPA4qKAjZiI2IhIthNQGLLKhaM/kRjYkCj\nYkESRbEDirLYEPOzBn9GFAWxYAHiCoiAuFYMKEVEWoIiCFggURRDfX5/nLs6WZdllpk7d2b2+369\n5sXcMveevbr7zDnPKebuiIiIZKJO0gUQEZHCp2AiIiIZUzAREZGMKZiIiEjGFExERCRjCiYiIpIx\nBRORSszsSDOblcB9PzKzY3N9X5FsUDCRWmtDf7zd/U13bxHTPdeb2Qoz+9rMFprZXWZmNbzGr8xs\nYRzlE9lUCiYiueXAge7eEGgLnAV0ruE1LLqOSN5QMBGppPI3/6gGc7WZTTezf5nZ02ZWL+X4KWY2\nNTr2ppkdUN3loxfuPhd4A9i/ijLUM7N7zexTM1tkZveY2eZmVh8YDeyaUsPZOVs/u8imUjARqVrl\nb/6/A04A9gQOAjoBmNkhwEBC7WJ74CGgzMw239gNzKwlcBQwpYrDvYDWwIHR/VoDvdx9JXASsNjd\nG7h7Q3dfWuOfTiTLFExE0nOfu3/m7v8GXgQOjvZ3Bv7q7u96MARYBbSp5lpTzGwZ8AIwwN0HVXHO\nWUAfd1/m7suAPsA52fphRLJts6QLIFIgPkt5vxLYJXrfFDjXzC6Ltg3YHNi1mmsd4u4fbeR+uwKf\npGwv2Mg1RRKlYCKSmYXAbe7etwafSaf31mJCoKrootw02gdKvkseUjOX1Hb1zGyLlFfdGn7+YeCP\nZtYawMy2NrNfm9nWGZbraaCXme1oZjsCNwJDomOfATuYWcMM7yGSNQomUtuNIjRbfRf927uKczZY\nE3D39wh5k/5mthyYC5xXzf2qq1WkHrsVeBd4H5gevb8tuuccQrCZb2bL1ZtL8oHFvTiWmbUD7iUE\nroHu3q/S8W2BR4G9Cb/QF7j7h2a2BfA6UI/QHPe8u/eJtbAiIrJJYg0mZlaH8E2tLaG9dzLQwd1n\np5xzJ7DC3W8xs+bAA+5+XHSsvruvjJoe3gIud/dJsRVYREQ2SdzNXK2Bee6+wN3XAEOB9pXOaQlM\ngO+r73uYWeNoe2V0zhaE2okSjyIieSjuYLIbobdLhUXRvlTTgd8AREnM3YEm0XYdM5sKLAXGufvk\nmMsrIiKbIB8S8HcA25nZFOBSYCqwDsDd17v7IYTgclg0YlhERPJM3ONMPiXUNCo0ifZ9z91XABdU\nbJvZR8D8Sud8bWavAu2ADyvfxMzU/CUiUkPuXqMZq6sTd81kMtDMzJpGE+N1AMpSTzCzRhXzGJlZ\nZ+A1d/8m6l/fKNq/FXA8MJsNcHe9svDq3bt34mUoppeep55nvr6yLdaaibuvM7OuwFh+6Bo8y8y6\nhMM+AGgBDDaz9cBM4MLo47tE++tEn33G3UfHWV4REdk0sU+n4u5jgOaV9j2U8v6dysej/TOAQ+Mu\nn4iIZC4fEvCSR0pKSpIuQlHR88wuPc/8FfsI+FwwMy+Gn0NEJFfMDC+gBLyIiNQCCiYiIpIxBRMR\nEcmYgomIiGRMwURERDKmYCIiIhlTMBERkYwpmIiISMYUTEREapmvv87+NRVMRERqkdWr4Ywzsn9d\nBRMRkVrCHTp3hq22yv61Y581WERE8sNNN8GcOTBhAmy9dXavrWAiIlILDBgATz8NEydC/frZv75m\nDRYRKXKjRsFFF8Ebb0CzZmFftmcNVs1ERKSIvfsudOoEL774QyCJgxLwIiJFav58OPVUeOQRaNMm\n3nspmIiIFKFly+Ckk+CGG6B9+/jvF3swMbN2ZjbbzOaaWfcqjm9rZsPNbLqZvWNmLaP9TcxsgpnN\nNLMZZnZ53GUVESkG330XaiSnnQaXXp
qbe8aagDezOsBcoC2wGJgMdHD32Snn3AmscPdbzKw58IC7\nH2dmOwM7u/s0M9sGeA9on/rZlGsoAS8iAqxbB2eeCVtsAU88AXU2UGUotGV7WwPz3H2Bu68BhgKV\nK1wtgQkA7j4H2MPMGrv7UnefFu3/BpgF7BZzeUVECtrVV8Py5fDYYxsOJHGI+1a7AQtTthfx44Aw\nHfgNgJm1BnYHmqSeYGZ7AAcDf4+pnCIiBe+ee2D8eBgxItRMcikfEvB3ANuZ2RTgUmAqsK7iYNTE\n9TzQLaqhiIhIJc89B3ffDaNHw7bb5v7+cY8z+ZRQ06jQJNr3PXdfAVxQsW1mHwHzo/ebEQLJEHd/\noboblZaWfv++pKSEkpKSzEouIlIg3ngjJNrHjoXdd6/6nPLycsrLy2MrQ9wJ+LrAHEICfgkwCejo\n7rNSzmkErHT3NWbWGTjC3TtFxx4HvnT3qzZyHyXgRaRWmjULSkpCsv3449P/XEGNgHf3dWbWFRhL\naFIb6O6zzKxLOOwDgBbAYDNbD8wELgQwsyOAs4EZZjYVcOB6dx8TZ5lFRArF0qXw61/DnXfWLJDE\nQXNziYgUoBUrQo3k9NOhV6+afz7bNRMFExGRArNmTRiU2KRJmA3YNiEkFNo4ExERySJ3+NOfwhiS\nBx/ctEASB80aLCJSQG65BaZNg/Jy2CyP/oLnUVFERKQ6gwaF19tvwzbbJF2a/6aciYhIARg7Fs45\nB157DfbdN/PrFVTXYBERydy0afCHP8Dw4dkJJHFQAl5EJI998gmcckpIth95ZNKl2TAFExGRPPWv\nf4UFrq65Bs44I+nSVE85ExGRPLRqFZxwAvz852ECx2zToMUqKJiISDFZvx7OOissdPXMM/GsS6IE\nvIhIkevRAz79FMaNy+0CV5lQMBERySP9+0NZWRhLsuWWSZcmfQomIiJ5YsQI6NsX3nwTtt8+6dLU\njIKJiEgemDgRLr4YxoyBPfdMujQ1VyCtcSIixWvu3DCV/ODBofdWIVIwERFJ0Oefh7Ekt94aFroq\nVAomIiIJ+fbbMLr97LPhoouSLk1mNM5ERCQBa9eGpq0ddoDHHsv9uiRaHEtEpMC5w2WXhVHuDz+c\nPwtcZSL2YGJm7cxstpnNNbPuVRzf1syGm9l0M3vHzFqmHBtoZp+Z2ftxl1Mk21atCsurilTWr1/o\nvfX887D55kmXJjtiDSZmVgfoD5wI7Ad0NLPKEyhfD0x194OA84D7U449Fn1WpGCsWwcDB4bunfvt\nB++8k3SJJJ88+WSYAXj0aGjYMOnSZE/cNZPWwDx3X+Dua4ChQPtK57QEJgC4+xxgDzNrHG2/Cfwr\n5jKKZM2ECaFr56BB8MILcPvtcNppcOONsHp10qWTpE2YAFddFQLJrrsmXZrsijuY7AYsTNleFO1L\nNR34DYCZtQZ2B5rEXC6RrJo7F9q3Dz1yevWC11+HVq3gt78NCxtNmwZt2sDMmUmXVJIyYwZ06BAm\nbtxvv6RLk335MAL+DuA+M5sCzACmAutqepHS0tLv35eUlFBSUpKl4ols2PLlcPPNoeniuuvg2Wdh\niy3++5yddw5zLQ0cCCUl0LMnXHFF4UzgJ5lbtAhOPhnuvz/8P5CE8vJyysvLY7t+rF2DzawNUOru\n7aLtHoC7e79qPvMRcIC7fxNtNwVedPcDq/mMugZLTq1ZA//3f3DbbaH20acPNG688c/Nnw/nnQd1\n64amsD32iLukkrSvvoKjjgrrt197bdKl+UGhdQ2eDDQzs6ZmVg/oAJSlnmBmjcxs8+h9Z+C1ikBS\ncUr0Ekmce6hl7L8/vPQSvPpqCCrpBBKAvfaC8vLwLbVVK3j00XBNKU6rV4cVEo8+OqyWWMxiH7Ro\nZu2A+wiBa6C732FmXQg1lAFR7WUwsB6YCVzo7l9Fn30KKAF2AD4Derv7Y1XcQzUTid306SF5unQp\n3H
UXtGuX2fVmzAjfVps2hQEDYKedslNOyQ/ucO65sGIFDBsWaqP5RCstVkHBROK0dGlIqr/4IvTu\nHWZ23SxL2cbVq0MT2cCBobvo6adn57qSvBtuCL23XnkF6tdPujQ/VmjNXCIF67vvQtfe/feH7baD\nOXPgkkuyF0gA6tULeZcRI0IC/7zzQhu7FLaHHgqdMcrK8jOQxEHBRKQSd3j6adh3X5gyBf7+d/jz\nn2HbbeO75y9/GboPb701HHhg+DYrhWnkSCgtDeuSpJtLKwZq5hJJMXFiyIusWQN33x0Sp7n28stw\n4YWhl1jfvrDVVrkvg2yayZND54qRI6F166RLUz01c4nEYMEC6NgRfvc7+NOfYNKkZAIJwIknwvvv\nw2efwaGHhj9Qkv/++c8wcPWRR/I/kMRBwURqtRUr4Prrwx/tffcNeZFzz01+QOH224emtt69w3oX\nffpo0sh89uWXYYGrG2+EU09NujTJUDCRWmndujD19z77wOLFoSbQu3fIWeSTDh1g6tQwWeThh8Ps\n2UmXSCr77rsQQM44I9RqayvlTKTWGT8+5EW23Rbuuacw1tx2Dz2EevUK334vuyz52pOELyW/+13I\naw0ZUlj/TTTOpAoKJpKOOXPCKOQPPwy9s04/vfAWJfrHP0Iz3FZbhdX5dt896RLVXu7QrRt88EGY\nDaHynGz5Tgl4kRpatgwuvxyOPBJ+9asQTH7zm8ILJADNmsEbb8Dxx8MvfgGPP67pWJJy991hOp3h\nwwsvkMRBwUSK1urVcO+90KJFaI748MNQMyn0X/y6daFHDxg3Dv73f0Nb/RdfJF2q2uWZZ8L/W6NH\nxzv+qJAomEjRcQ8LU+2/P4wdGyZWfOCB4htAdtBBodvwz34WBjqWlW38M5K5118POatRo+CnP026\nNPlDORMpKtOmheT655+HyRhPrCWLPr/5ZpiKpaQkdCoopuVg88mHH8Ixx8BTT0HbtkmXJjPKmYhU\nYcmSMGq8XTs488wQVGpLIIGQD5o2LTSBHXQQvPZa0iUqPosXw69/HZoWCz2QxEHBRArad9/BrbfC\nAQfAjjuGHlt//GN2J2MsFA0ahKns+/eHs84K+aH//CfpUhWHFSvCNCmdO4dlA+THFEykIK1fH5bK\nbd48rDMyaRL06weNGiVdsuSdfHJ4JgsWhDE0U6YkXaLCtmZNGEvSunWYLUGqppyJFJy334YrrwwB\n5Z57QhOP/FjF7MdXXBG6RvfoUTtrbJlwD82nn38Of/tbcT0/DVqsgoJJ7fDxx9C9ewgmt98OZ59d\nWCOOk7JoEZx/fmiqefzxMIWMpKdPnzADcHl5/k21kykl4KXW+fpr6NkzNNnst1/Ii5xzjgJJupo0\nCdPan3MOHHFE6Ca9fn3Spcp/jz4agu/IkcUXSOKgmonkrXXrwnK2vXuHXlq33Qa77pp0qQrb3Llh\nOpaGDcMfyyZNki5RfhozBjp1Cr3imjdPujTxKLiaiZm1M7PZZjbXzLpXcXxbMxtuZtPN7B0za5nu\nZ6V4jRsHhxwS+vOPGhXmoVIgydw++4QxKUcfHabdf/JJTcdS2ZQpIeAOH168gSQOsdZMzKwOMBdo\nCywGJgMd3H12yjl3Aivc/RYzaw484O7HpfPZlGuoZlIkZs2Ca68NU63/+c9w2mmFOYdWIZgyJTR9\ntWwJDz4YulbXdh9/HJoC//KXMH9bMSu0mklrYJ67L3D3NcBQoH2lc1oCEwDcfQ6wh5k1TvOzUiS+\n/DJMUXH00XDssWGkcSHO6ltIDj0U3nsvzDx80EGhBlibLV8eBiV27178gSQOcQeT3YCFKduLon2p\npgO/ATCz1sDuQJM0PysFbvXqMPtqixZhe9asMB1KvXrJlqu22HLLMO3Mk09C167QpQt8803Spcq9\n//wn1IJPOil0o5aay4de03cA95nZFGAGMBVYV9OLlJaWfv++pKSE
kpKSLBVP4uAe+u1fd11ox3/9\n9R8CiuReSUkY6HjllaGWMnhw7Rm/s359mNdsl11C02qxKi8vp7y8PLbrx50zaQOUunu7aLsH4O7e\nr5rPfAQcAOyf7meVMyksU6eG2seXX4ZvxSeckHSJJFVZWZiS5pxz4OabC3/K/o255powg8LYsaGm\nVlvkbNCimVXbaujuwzd6cbO6wBxCEn0JMAno6O6zUs5pBKx09zVm1hk4wt07pfPZlGsomBSAxYvh\nhhvCqnR9+oSRxcU0oriYfPFFaPL6xz/CcrQHHZR0ieJx//2h88Fbb8H22yddmtzKdjCp7lf5f6o5\n5sBGg4m7rzOzrsBYQn5moLvPMrMu4bAPAFoAg81sPTATuLC6z6bzQ0l+Wbky1EDuvTdMlDd3rqZI\nz3eNG8OwYSGQHH98qElee22YlbhYDB8Od94ZukrXtkASBw1alNisXx/GiVx/Pfzyl3DHHbDnnkmX\nSmrqk0/CAL5Vq0IupVmzpEuUubffhvbtw8wAhx6adGmSkfOuwWbWyMzuNrN3o9ddUdOUyAa9+Sa0\naRP66w8dGpY5VSApTLvvDuPHh3Vi2rSBv/61sAc6zpkTuv4OGVJ7A0kcNlozMbNhwAfA4GjXOcBB\n7p43PbFVM8kfH30U+um/8w707QsdO2oOrWIya1ZIzDduHKa6KbRZCT77DA4/POTuLrgg6dIkK4lB\ni3u7e293nx+9+gB7ZasAUhy++ioEkVatwnrks2drVt9i1KIFTJwIhx0Wprt55pmkS5S+b7+FU04J\nwbC2B5I4pPOr/p2Zfd/j3MyOAL6Lr0hSSNauDc0ezZuHHkDvvw+9ekH9+kmXTOKy+eZQWhpm0+3d\nO6zquHx50qWq3tq18PvfhxU5e/dOujTFKZ1g8kfgATP72Mw+BvoDXWItlRSEsWPh4IPDt9OXXgqz\n0BZas4dsulatwpihn/wk1EZffjnpElXNHS69NASUhx7SFD1xqTZnEk22+Ft3f9bMGgK4+9e5Kly6\nlDPJrVmz4OqrYd68MGK4fXv9gtZ2EyaEBbhOPjn8P5FP63/cfjs891yYZaFBg6RLkz9ymjNx9/XA\nddH7r/MxkEjufPllmL/p6KPD2IOZMzWrrwTHHhuaOL/9NtRWJ05MukTBkCEwYACMHq1AErd0mrnG\nm9k1ZvZTM9u+4hV7ySRvrFoVBh22aBES6rNnhzmcNBmjpGrUKIxD6dcvzPh8ww1hIs+kjB8fpkoZ\nPTrMuyXxSqdr8EdV7HZ3z5seXWrmioc7jBgRJmNs0SI0X+y7b9KlkkKwdClcfHEY8DhkSEh859L7\n78Nxx8Hzz4eatPxYzubmSrnhlu7+n43tS5KCSfa9916YQuNf/wpTxB93XNIlkkLjHlbI7N49fCG5\n6qrcTMeycGFY4OrPfw49uKRqSYwzeTvNfVIEPv00TJ1xyinwhz+E3joKJLIpzMJ4jkmTQjfiY44J\ng1rj9O9/hwWuunVTIMm1DQYTM9vZzH4ObGVmh5jZodGrBNAogiL06qthdthddw1TTnTuXFwT+0ky\n9twz/L/Vvj20bg2PPBLPdCyrV4dpUo45JtSCJLeqm4L+PKAT8Avg3ZRDK4BB6UxBnytq5src0qVh\nnqLBg0NPLZE4fPABnHsu7LYbPPww7Lxzdq7rHka2r1wZugHrS9DGJZEzOcPdh2XrhnFQMMnMunVh\ngaojjgiLIYnEafVquOWWEEweeADOOCPza15/PZSXwyuvwFZbZX692iCJYLIFcAawBynrn7h73vzZ\nUTDJTJ8+4Rdx/Hh9o5PceeedUEs57LAwu/S2227adR58EO65J0wrv+OO2S1jMUsiAf8C0B5YC3yb\n8pIiMGFCmGLiqacUSCS32rQJHTwaNQrTsYwfX/NrlJWFWs5LLymQJC2dmskH7r5/jsqzSVQz2TQV\neZLHH1ePLUnW2LFhGefTTw+L
qKUzUeikSaHX4ahRYZ4wqZlEugabWY6HHEnc1q0Ls7127qxAIsk7\n4YQw0HDZsjC1/aRJ1Z//z3+G3mGPPqpAki/SqZl8CDQDPgJWAUYYAX9g/MVLj2omNVdaGia+GzdO\nzVuSX559Fi67DLp0gRtvDFPep/rii9BZ5OqrwzmyaZJIwDetar+7L0jrBmbtgHsJtaCB7t6v0vGG\nwBPA7kBd4C53HxQd6wZcFJ36sLvfv4F7KJjUwCuvhG6UU6Zkr2umSDYtWQIXXRSaYocMgZYtw/6V\nK6Ft2zCx5G23JVvGQpfzYBLd9CDgqGjzDXefntbFwxT2c4G2wGJgMtDB3WennNMTaOjuPc1sR2AO\nsBPQHHgaaEVI/r8E/NHd51dxHwWTNFXkSYYMCb+UIvnKPQxwvP768OraNaxDv802Ic+n2aozk/Oc\nSVQ7eBL4SfR6wswuS/P6rYF57r7A3dcAQwk9w1I5UDE5dANgmbuvBVoAf3f3Ve6+DngdyJt15wtR\nRZ7k4osVSCT/mYWc3jvvwLBhsNdesGJFWHtegST/bLbxU7gQOMzdvwUws37AROAvaXx2N2BhyvYi\nQoBJ1R8oM7PFwDZAxYw6HwC3mtl2hFzNrwk1G9lEFQMSb7wx2XKI1MTee8Nrr8HQoaH3lpY+yE/p\nBBMD1qVsr4v2ZcuJwFR3P9bM9gbGmdmB7j47ClzjgG+AqZXK8V9KS0u/f19SUkJJSUkWi1j4xo8P\nI46nTFHCXQpP3bpw9tlJl6KwlZeXU15eHtv100nAXwWcB4yIdp1GmJvr3o1e3KwNUOru7aLtHoSe\nYP1SzhkJ9HX3t6LtV4Du7v5upWvdBix0979WcR/lTKqxZAn8/OfKk4jID3KeM3H3u4HzgeXR6/x0\nAklkMtDMzJqaWT2gA1BW6ZwFwHEAZrYTsA8wP9puHP27O3A68FSa95VIRZ6kSxcFEhGJz0abuaLa\nxUx3nxJtNzSzw9z97xv7rLuvM7OuwFh+6Bo8y8y6hMM+ALgVGGRm70cfu87dl0fvh0VLBK8BLtEa\n9DV3881hqd1evZIuiYgUs3SauaYCh1a0I0Xdfd9190NzUL60qJmrauPHw3nnhVUTNZ5ERFIlMZ3K\nf/2ldvf1pJe4lwQtWRJmZB0yRIFEROKXTjCZb2aXm9nm0asbUU5D8tPatdCxY8iTHHts0qURkdog\nnWDyR+Bw4FPCOJHDgIvjLJRk5uabYbPNlCcRkdypbtnejsBYd1+W2yLVnHImPxg3Djp1CuNJdtop\n6dKISL7Kds6kutzH7sBzZrY58AphbqxJ+qudvxYvDgn3J59UIBGR3EqnN1cDwjiQdoSpUGYBY4CX\n3f2z2EuYBtVMQp7kuONCjuSmm5IujYjku0RmDa5UgJbAScAJ7n5itgqSCQWTMN/WxInw8suaLkVE\nNi6J9Uxecfe2G9uXpNoeTMaOhfPPV55ERNKXs5yJmW0J1Ad2jGburbhpQ8JswJIHKvIkTz2lQCIi\nyakuAd8FuALYFXiPH4LJ14Rp4yVha9eGebcuuQSOOSbp0ohIbZZOM9dl7p7O2iWJqa3NXDfeGBYO\nGjNGeRIRqZkkplNZGvXowsx6mdlwM8ubeblqq7Fj4dFH4YknFEhEJHnpBJMb3X2FmR1J6CI8EHgw\n3mJJdT79VONJRCS/pBNMKlY3PBkY4O6jAC2cmZCKPMmll4IWkxSRfJFOMPnUzB4irM0+2sy2SPNz\nEoPSUthiC+jZM+mSiIj8IJ0EfH3C6PcZ7j7PzHYBDnD3sbkoYDpqSwL+5ZfhwgvDeJKf/CTp0ohI\nIUti2d6VwOfAkdGutcC8bBVA0vPpp2ECxyeeUCARkfyTTs2kN/ALoLm772NmuwLPufsRuShgOoq9\nZrJ2bZhz64QTNK28iGRHEl2DTwdOBb4FcPfFQINsFUA2rndv2HJL5UlEJH+lE0xWR1/7K9aA37
om\nNzCzdmY228zmmln3Ko43NLMyM5tmZjPMrFPKsSvN7AMze9/MnjSzWteL7OWXYfBgjScRkfyWTjB5\nNurNta2ZdQbGAw+nc3Ezq0OYeuVEYD+go5ntW+m0S4GZ7n4wcAxwl5ltFjWnXQYc6u4HEqZ+6ZDO\nfYtFRZ7kySeVJxGR/JZOMGkMPA8MA5oDNwFN0rx+a2Ceuy9w9zXAUKB9pXOcH5rNGgDL3H1ttF0X\n2NrMNiNMOrk4zfsWvIp13Lt2hV/9KunSiIhUL51gcry7j3P3a939GncfR1jPJB27AQtTthfx4xmH\n+wMtzWwxMB3oBt/nZu4CPiGsP/9vdx+f5n0L3k03KU8iIoWjuino/wRcAuxlZu+nHGoAvJXFMpwI\nTHX3Y81sb2CcmVU0a7UHmgJfAc+b2Vnu/lRVFyktLf3+fUlJCSUFPDx8zBh4/PEwnqSOhoeKSBaU\nl5dTXl4e2/U32DXYzBoB2wF9gR4ph1a4+/K0Lm7WBih193bRdg/A3b1fyjkjgb7u/la0/QrQHdgD\nONHdO0f7zwEOc/euVdynaLoGL1oEv/gFPPssHH100qURkWKVs8Wx3P0rQo2gYwbXnww0M7OmwBJC\nAr3y9RYQJpB8y8x2AvYB5hOa4NpEi3StAtpG1ytaFXmSyy9XIBGRwlLd4lgZc/d1ZtYVGEsIDgPd\nfZaZdQmHfQBwKzAopSntuqjmM8nMngemAmuifwfEWd6k3XQT1K8PPXps/FwRkXyy0RHwhaAYmrnG\njIGLLtK8WyKSGzlr5pLcWbQojCd59lkFEhEpTOorlLC1a6FDB+VJRKSwqZkrYT17wtSpMHq0ugGL\nSO6omauIvPRSmHNL40lEpNApmCRk0SI4/3x47jlo3Djp0oiIZEbfhxNQkSfp1g2OOirp0oiIZE45\nkwT06AHTp8OoUWreEpFkKGdS4EaPDlPKK08iIsVEwSSHFi6ECy6A559XnkREiou+G+fImjUhT3LF\nFXDkkUmXRkQku5QzyRHlSUQknyhnUoBGjVKeRESKm4JJzBYuhAsvVJ5ERIqbvifHqCJPcuWVypOI\nSHFTziRG3bvDjBkwcqSat0QkvyhnUiBGjYKnn1aeRERqBwWTGKTmSXbcMenSiIjET9+Zs2zNGvj9\n75UnEZHaRTmTLLvuOpg5E158Uc1bIpK/sp0zif3PnZm1M7PZZjbXzLpXcbyhmZWZ2TQzm2FmnaL9\n+5jZVDObEv37lZldHnd5MzFyJAwdCoMHK5CISO0Sa83EzOoAc4G2wGJgMtDB3WennNMTaOjuPc1s\nR2AOsJPziBC+AAAKd0lEQVS7r610nUXAYe6+sIr7JF4z+eQTaNUKhg+HI45ItCgiIhtVaDWT1sA8\nd1/g7muAoUD7Suc40CB63wBYlhpIIscB/6wqkOSDivEkV1+tQCIitVPcwWQ3IDUALIr2peoPtDSz\nxcB0oFsV1/k98HQsJcyCG26A7baDa65JuiQiIsnIh67BJwJT3f1YM9sbGGdmB7r7NwBmtjlwKtCj\nuouUlpZ+/76kpISSkpLYCpyqIk+i8SQiks/Ky8spLy+P7fpx50zaAKXu3i7a7gG4u/dLOWck0Nfd\n34q2XwG6u/u70fapwCUV19jAfRLJmVTkSUaMgMMPz/ntRUQ2WaHlTCYDzcysqZnVAzoAZZXOWUDI\niWBmOwH7APNTjnckD5u4KsaTXHONAomISOzjTMysHXAfIXANdPc7zKwLoYYywMx2AQYBu0Qf6evu\nT0efrU8INnu5+4pq7pHzmsm118KsWVBWpuYtESk82a6ZaNDiJnjxRejaNeRJdtghZ7cVEckaTfSY\nsAUL4KKLQp5EgUREJFADTQ2sXh3yJNdeqzyJiEgqNXPVwDXXwJw58MILypOISGFTM1dCysrguec0\nnkREpCqqmaRhwQJo3Rr+9jf45S9ju42ISM4U2jiTgpeaJ1
EgERGpmmomG3H11TB3rvIkIlJclDPJ\noRdegGHDlCcREdkYBZMN+PhjuPjikCfZfvukSyMikt/0fbsKq1eH9Umuu055EhGRdChnUoWrr4Z5\n80Izl2WtRVFEJH8oZxKz1DyJAomISHpUM0nx8cdw2GEhoLRpk3m5RETylcaZxKRiPEn37gokIiI1\npZpJ5Kqr4B//UJ5ERGoH5Uxi8MILMHy48iQiIpuq1tdMlCcRkdpIOZMsUp5ERCQ7anXN5MorYf78\nMMpdzVsiUpsUXM3EzNqZ2Wwzm2tm3as43tDMysxsmpnNMLNOKccamdlzZjbLzGaa2WHZKtff/haW\n3n3sMQUSEZFMxVozMbM6wFygLbAYmAx0cPfZKef0BBq6e08z2xGYA+zk7mvNbBDwmrs/ZmabAfXd\n/esq7lOjmslHH4U8yYsvhn9FRGqbQquZtAbmufsCd18DDAXaVzrHgQbR+wbAsiiQNASOcvfHANx9\nbVWBpKYq8iQ9eyqQiIhkS9zBZDdgYcr2omhfqv5ASzNbDEwHukX79wS+NLPHzGyKmQ0ws60yLVD3\n7rDrrnDFFZleSUREKuTDOJMTganufqyZ7Q2MM7MDCWU7FLjU3d81s3uBHkDvqi5SWlr6/fuSkhJK\nSkp+dM6IESFXovEkIlLblJeXU15eHtv1486ZtAFK3b1dtN0DcHfvl3LOSKCvu78Vbb8CdCfUaCa6\n+17R/iOB7u7+P1XcZ6M5k4o8yciRYT13EZHarNByJpOBZmbW1MzqAR2AskrnLACOAzCznYB9gPnu\n/hmw0Mz2ic5rC3y4KYWoyJNcf70CiYhIHGIfZ2Jm7YD7CIFroLvfYWZdCDWUAWa2CzAI2CX6SF93\nfzr67EHAI8DmwHzgfHf/qop7VFszueKKMNJ9xAg1b4mIQPZrJkU/aHHEiDCJ45QpsN12OS6YiEie\nUjCpwoaCifIkIiJVK7ScSWJWrYIzz4QbblAgERGJW9HWTLp1g08+CVPLK08iIvLftJ5JGoYPh7Iy\njScREcmVoquZzJ8fppNXnkREZMOUM6nGqlVhPInyJCIiuVVUNZPLL4dFi2DYMDVviYhURzmTDRg2\nLDRtKU8iIpJ7RVMzadzYGTUKWrVKujQiIvlPOZMN6NVLgUREJClFUzNZv97VvCUikibVTDZAgURE\nJDlFE0xERCQ5CiYiIpIxBRMREcmYgomIiGRMwURERDKmYCIiIhmLPZiYWTszm21mc82sexXHG5pZ\nmZlNM7MZZtYp5djHZjbdzKaa2aS4yyoiIpsm1mBiZnWA/sCJwH5ARzPbt9JplwIz3f1g4BjgLjOr\nmDNsPVDi7oe4u+YBzoHy8vKki1BU9DyzS88zf8VdM2kNzHP3Be6+BhgKtK90jgMNovcNgGXuvjba\nthyUUVLolzW79DyzS88zf8X9h3o3YGHK9qJoX6r+QEszWwxMB7qlHHNgnJlNNrPOsZZUREQ2WT5M\nQX8iMNXdjzWzvQnB40B3/wY4wt2XmFnjaP8sd38z2eKKiEhlsU70aGZtgFJ3bxdt9wDc3fulnDMS\n6Ovub0XbrwDd3f3dStfqDaxw97uruE/hz1YpIpJjhbQ41mSgmZk1BZYAHYCOlc5ZABwHvGVmOwH7\nAPPNrD5Qx92/MbOtgROAPlXdJJsPREREai7WYOLu68ysKzCWkJ8Z6O6zzKxLOOwDgFuBQWb2fvSx\n69x9uZntCYyIah2bAU+6+9g4yysiIpumKNYzERGRZOVlt1sza2JmE8xsZjSQ8fJo/3ZmNtbM5pjZ\ny2bWKNq/fXT+CjO7v9K1Xo0GTU41sylmtmMSP1NS9CyzS88zu/Q8syfxZ+nuefcCdgYOjt5vA8wB\n9gX6EZrBALoDd0Tv6wOHAxcD91e61qvAIUn/THqWxfHS89TzzNdX0s8yL2sm7r7U3adF778BZgFN\nCAMeB0enDQZOi85Z6e
5vA6s2cMm8/DlzQc8yu/Q8s0vPM3uSfpZ5/+DNbA/gYOAdYCd3/wzCgwN+\nkuZlBkVVtV6xFLJA6Flml55ndul5Zk8SzzKvg4mZbQM8D3SLIm3l3gLp9B44y90PAI4CjjKzP2S5\nmAVBzzK79DyzS88ze5J6lnkbTCxM9vg8MMTdX4h2f2ZhLApmtjPw+cau4+5Lon+/BZ4izBdWq+hZ\nZpeeZ3bpeWZPks8yb4MJ8Cjwobvfl7KvDOgUvT8PeKHyhwiTQ4Y3ZnXNbIfo/ebAKcAHsZQ2v+lZ\nZpeeZ3bpeWZPYs8yL8eZmNkRwOvADEKVzIHrgUnAs8BPCSPnz3T3f0ef+Ygw63A94N+EEfOfRNfZ\nDKgLjAeu8nz8oWOiZ5ldep7ZpeeZPUk/y7wMJiIiUljyuZlLREQKhIKJiIhkTMFEREQypmAiIiIZ\nUzAREZGMKZiIiEjGFExEasjM1kVzFn0QTdF9lZlVu9qnmTU1s8qrjIoUDQUTkZr71t0Pdff9geOB\nk4DeG/nMnsBZsZdMJCEKJiIZcPcvCetBdIXvayCvm9m70atNdGpf4MioRtPNzOqY2Z1m9nczm2Zm\nnZP6GUSyQSPgRWrIzL5294aV9i0HmgMrgPXuvtrMmgFPu3srM/sVcLW7nxqd3xlo7O63m1k94C3g\nt+6+ILc/jUh2bJZ0AUSKREXOpB7Q38wOBtYBP9vA+ScAB5jZ76LthtG5CiZSkBRMRDJkZnsBa939\nCzPrDSx19wPNrC7w3YY+Blzm7uNyVlCRGClnIlJzqdN1NwYeBP4S7WoELInen0uYdRVC81eDlGu8\nDFwSrT+Bmf3MzLaKs9AicVLNRKTmtjSzKYQmrTXA4+5+T3Ts/4BhZnYuMAb4Ntr/PrDezKYCg9z9\nvmhp1SlRt+LPidbmFilESsCLiEjG1MwlIiIZUzAREZGMKZiIiEjGFExERCRjCiYiIpIxBRMREcmY\ngomIiGRMwURERDL2/+gyJAnIwwnpAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0xd94ba58>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "fig, ax = plt.subplots(1,1)\n",
    "ax.plot(time_series)\n",
    "ax.set_xlabel('Date')\n",
    "ax.set_ylabel('test/control')\n",
    "ax.set_title('Line Plot')\n",
    "\n",
    "ax.xaxis.set_major_locator(ticker.MultipleLocator())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "#### Insights\n",
    "\n",
    "The graph above shows that control consistently does better than test on each of the five days.\n",
    "The variability of the test/control ratio is also low: Min = 0.85 and Max = 0.92.\n",
    "This suggests that the data collected is sufficient, but that there might be a bias in the control or test group.\n",
    "\n",
    "Likely cause\n",
    "\n",
    "1. Some segment of users with a higher or lower conversion rate has found its way disproportionately into either test or control, thus increasing or decreasing that group's overall conversion rate.\n",
    "\n",
    "2. We can use decision trees to check this. If the split between test and control is truly random, the tree should not be able to separate the two groups well."
   ]
  },
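  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The randomization check described above can be sketched on simulated data. Everything below is hypothetical (the `users` frame, the 90% assignment rate, the `holdout_accuracy` helper): a shallow tree is trained to predict the group label, and its held-out accuracy stays near 50% when assignment is fair but rises when one country is over-represented in test.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "from sklearn.tree import DecisionTreeClassifier\n",
    "\n",
    "rng = np.random.RandomState(0)\n",
    "n = 10000\n",
    "# Hypothetical users described by an encoded country (0..4) and an age\n",
    "users = pd.DataFrame({'country': rng.randint(0, 5, size=n),\n",
    "                      'age': rng.randint(18, 60, size=n)})\n",
    "\n",
    "# Fair assignment: every user has a 50% chance of landing in test\n",
    "fair = rng.binomial(1, 0.5, size=n)\n",
    "# Biased assignment: users from country 0 land in test 90% of the time\n",
    "biased = rng.binomial(1, np.where(users['country'] == 0, 0.9, 0.5))\n",
    "\n",
    "def holdout_accuracy(X, y):\n",
    "    # Fit on the first half of the rows, score on the second half\n",
    "    half = len(X) // 2\n",
    "    tree = DecisionTreeClassifier(max_depth=2, min_samples_leaf=200, random_state=0)\n",
    "    tree.fit(X.iloc[:half], y[:half])\n",
    "    return tree.score(X.iloc[half:], y[half:])\n",
    "\n",
    "acc_fair = holdout_accuracy(users, fair)\n",
    "acc_biased = holdout_accuracy(users, biased)\n",
    "```\n",
    "\n",
    "Held-out accuracy is used here rather than feature importances alone, because importances are normalized to sum to 1 whenever the tree splits at all, even when it is only fitting noise."
   ]
  },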
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "C:\\Users\\Deepak\\Anaconda2\\lib\\site-packages\\numpy\\lib\\arraysetops.py:200: FutureWarning: numpy not_equal will not check object identity in the future. The comparison did not return the same result as suggested by the identity (`is`)) and will change.\n",
      "  flag = np.concatenate(([True], aux[1:] != aux[:-1]))\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>user_id</th>\n",
       "      <th>source</th>\n",
       "      <th>device</th>\n",
       "      <th>browser_language</th>\n",
       "      <th>ads_channel</th>\n",
       "      <th>browser</th>\n",
       "      <th>sex</th>\n",
       "      <th>age</th>\n",
       "      <th>country</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>497851</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>2</td>\n",
       "      <td>21.0</td>\n",
       "      <td>10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>290051</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>22.0</td>\n",
       "      <td>10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>548435</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>19.0</td>\n",
       "      <td>10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>540675</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>22.0</td>\n",
       "      <td>16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>863394</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>35.0</td>\n",
       "      <td>10</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   user_id  source  device  browser_language  ads_channel  browser  sex   age  \\\n",
       "1   497851       0       1                 1            3        3    2  21.0   \n",
       "3   290051       0       0                 2            2        0    1  22.0   \n",
       "4   548435       0       1                 1            3        2    2  19.0   \n",
       "5   540675       1       0                 1            0        0    1  22.0   \n",
       "6   863394       2       0                 2            0        0    2  35.0   \n",
       "\n",
       "   country  \n",
       "1       10  \n",
       "3       10  \n",
       "4       10  \n",
       "5       16  \n",
       "6       10  "
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X = data_new.copy()\n",
    "# Encode each categorical column as integer labels (the encoder is refit for every column)\n",
    "lb = LabelEncoder()\n",
    "for col in ['source', 'country', 'device', 'browser_language', 'ads_channel', 'browser', 'sex']:\n",
    "    X[col] = lb.fit_transform(X[col])\n",
    "# Drop the outcome, the date and the group label; the group label becomes the target\n",
    "X = X.drop(['conversion','date','test'], axis = 1)\n",
    "y = data_new['test']\n",
    "X.head(5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "user_id               0\n",
       "source                0\n",
       "device                0\n",
       "browser_language      0\n",
       "ads_channel           0\n",
       "browser               0\n",
       "sex                   0\n",
       "age                 454\n",
       "country               0\n",
       "dtype: int64"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#Checking for missing values\n",
    "X.isnull().sum()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Int64Index([   819,   1696,   1934,   2409,   2721,   5042,   7552,   7855,\n",
       "              8930,   9082,\n",
       "            ...\n",
       "            444098, 444581, 444828, 445540, 445950, 446681, 451052, 452302,\n",
       "            452342, 453270],\n",
       "           dtype='int64', length=454)"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Index labels of the rows where age is missing\n",
    "index = X.index[X['age'].isnull()]\n",
    "index"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Imputing the missing values in the age column with the median age of that column"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>user_id</th>\n",
       "      <th>source</th>\n",
       "      <th>device</th>\n",
       "      <th>browser_language</th>\n",
       "      <th>ads_channel</th>\n",
       "      <th>browser</th>\n",
       "      <th>sex</th>\n",
       "      <th>age</th>\n",
       "      <th>country</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>497851.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>3.0</td>\n",
       "      <td>3.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>21.0</td>\n",
       "      <td>10.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>290051.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>22.0</td>\n",
       "      <td>10.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>548435.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>3.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>19.0</td>\n",
       "      <td>10.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>540675.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>22.0</td>\n",
       "      <td>16.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>863394.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>35.0</td>\n",
       "      <td>10.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    user_id  source  device  browser_language  ads_channel  browser  sex  \\\n",
       "0  497851.0     0.0     1.0               1.0          3.0      3.0  2.0   \n",
       "1  290051.0     0.0     0.0               2.0          2.0      0.0  1.0   \n",
       "2  548435.0     0.0     1.0               1.0          3.0      2.0  2.0   \n",
       "3  540675.0     1.0     0.0               1.0          0.0      0.0  1.0   \n",
       "4  863394.0     2.0     0.0               2.0          0.0      0.0  2.0   \n",
       "\n",
       "    age  country  \n",
       "0  21.0     10.0  \n",
       "1  22.0     10.0  \n",
       "2  19.0     10.0  \n",
       "3  22.0     16.0  \n",
       "4  35.0     10.0  "
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "impute = Imputer(missing_values = 'NaN', strategy = 'median', axis = 0, copy = True)\n",
    "imputed = DataFrame(impute.fit_transform(X))\n",
    "imputed.columns = X.columns.values\n",
    "imputed.head(5)"
   ]
  },
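  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In recent scikit-learn releases the `Imputer` class is no longer available and `sklearn.impute.SimpleImputer` takes its place. The following is a minimal sketch of the same median imputation on a small hypothetical frame (`demo` and its values are made up for illustration); it also keeps the original index and column names, which the cell above resets.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "from sklearn.impute import SimpleImputer\n",
    "\n",
    "# Hypothetical feature frame with a non-default index and one missing age\n",
    "demo = pd.DataFrame({'age': [21.0, np.nan, 19.0, 35.0],\n",
    "                     'country': [10, 10, 16, 10]},\n",
    "                    index=[1, 3, 4, 6])\n",
    "\n",
    "# Replace NaN values with the median of each column\n",
    "imputer = SimpleImputer(missing_values=np.nan, strategy='median')\n",
    "demo_imputed = pd.DataFrame(imputer.fit_transform(demo),\n",
    "                            columns=demo.columns, index=demo.index)\n",
    "```\n",
    "\n",
    "Keeping the index makes it easier to line the imputed rows up with the target `y` by label rather than by position."
   ]
  },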
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Creating a shallow decision tree classifier below. It will try to predict the group (test or control) from the user features"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "clf = DecisionTreeClassifier(criterion = 'entropy', max_depth = 2, min_samples_leaf = 2, min_samples_split = 2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "DecisionTreeClassifier(class_weight=None, criterion='entropy', max_depth=2,\n",
       "            max_features=None, max_leaf_nodes=None, min_samples_leaf=2,\n",
       "            min_samples_split=2, min_weight_fraction_leaf=0.0,\n",
       "            presort=False, random_state=None, splitter='best')"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "clf.fit(imputed,y)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Inspecting which features the tree relied on to separate the two groups"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.])"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "clf.feature_importances_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array(['user_id', 'source', 'device', 'browser_language', 'ads_channel',\n",
       "       'browser', 'sex', 'age', 'country'], dtype=object)"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "imputed.columns.values"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "#### Insights\n",
    "\n",
    "The way users were split into control and test groups appears to be biased. \n",
    "The feature importances show that `country` is the only feature the tree uses (importance 1.0) to separate the two groups. \n",
    "If assignment were truly random, no feature should help predict which group a user belongs to.\n",
    "Therefore, the split is not truly random with respect to country."
   ]
  },
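  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The claim above can also be checked more directly with a chi-square test of independence between country and group assignment. The cell below is only a minimal sketch on made-up data: the `toy` frame, its country labels `A` and `B`, and the 50/50 and 80/20 splits are assumptions for illustration, not the experiment's data. A small p-value would indicate that assignment depends on country."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "from scipy import stats\n",
    "\n",
    "# Hypothetical toy data: country 'A' is split about 50/50, country 'B' about 20/80.\n",
    "rng = np.random.RandomState(0)\n",
    "toy = pd.DataFrame({\n",
    "    'country': ['A'] * 1000 + ['B'] * 1000,\n",
    "    'test': np.concatenate([rng.binomial(1, 0.5, 1000), rng.binomial(1, 0.8, 1000)]),\n",
    "})\n",
    "\n",
    "# Contingency table of country vs. group assignment.\n",
    "counts = pd.crosstab(toy['country'], toy['test'])\n",
    "\n",
    "# Chi-square test of independence between the two columns.\n",
    "chi2, p, dof, expected = stats.chi2_contingency(counts)\n",
    "print(counts)\n",
    "print('p-value: %.3g' % p)"
   ]
  },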
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('country', 1.5, 1, 4),\n",
       " ('country', 0.5, 2, 3),\n",
       " ('age', -2.0, -1, -1),\n",
       " ('age', -2.0, -1, -1),\n",
       " ('country', 14.5, 5, 6),\n",
       " ('age', -2.0, -1, -1),\n",
       " ('age', -2.0, -1, -1)]"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
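    "# Each tuple is (split feature, threshold, left child index, right child index).\n",
    "# Leaf nodes store feature == -2 and threshold == -2.0, so columns[-2] mislabels them as 'age'; ignore those rows.\n",
    "list(zip(imputed.columns[clf.tree_.feature], clf.tree_.threshold, clf.tree_.children_left, clf.tree_.children_right))"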
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 1,  2, -1, -1,  5, -1, -1], dtype=int64)"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "clf.tree_.children_left"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 10.,  16.,   2.,   4.,  15.,   7.,  11.,  14.,   5.,   3.,   1.,\n",
       "         6.,   8.,   9.,  13.,  12.,   0.])"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "imputed['country'].unique()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array(['Mexico', 'Venezuela', 'Bolivia', 'Colombia', 'Uruguay',\n",
       "       'El Salvador', 'Nicaragua', 'Peru', 'Costa Rica', 'Chile',\n",
       "       'Argentina', 'Ecuador', 'Guatemala', 'Honduras', 'Paraguay',\n",
       "       'Panama', nan], dtype=object)"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_new['country'].unique()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>mean in control</th>\n",
       "      <th>mean in test</th>\n",
       "      <th>%samples in test group</th>\n",
       "      <th>p_value</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>country</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Argentina</th>\n",
       "      <td>0.015071</td>\n",
       "      <td>0.013725</td>\n",
       "      <td>0.799799</td>\n",
       "      <td>0.335147</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Bolivia</th>\n",
       "      <td>0.049369</td>\n",
       "      <td>0.047901</td>\n",
       "      <td>0.501079</td>\n",
       "      <td>0.718885</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Chile</th>\n",
       "      <td>0.048107</td>\n",
       "      <td>0.051295</td>\n",
       "      <td>0.500785</td>\n",
       "      <td>0.302848</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Colombia</th>\n",
       "      <td>0.052089</td>\n",
       "      <td>0.050571</td>\n",
       "      <td>0.498927</td>\n",
       "      <td>0.423719</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Costa Rica</th>\n",
       "      <td>0.052256</td>\n",
       "      <td>0.054738</td>\n",
       "      <td>0.498964</td>\n",
       "      <td>0.687876</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Ecuador</th>\n",
       "      <td>0.049154</td>\n",
       "      <td>0.048988</td>\n",
       "      <td>0.494432</td>\n",
       "      <td>0.961512</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>El Salvador</th>\n",
       "      <td>0.053554</td>\n",
       "      <td>0.047947</td>\n",
       "      <td>0.497492</td>\n",
       "      <td>0.248127</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Guatemala</th>\n",
       "      <td>0.050643</td>\n",
       "      <td>0.048647</td>\n",
       "      <td>0.496066</td>\n",
       "      <td>0.572107</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Honduras</th>\n",
       "      <td>0.050906</td>\n",
       "      <td>0.047540</td>\n",
       "      <td>0.491013</td>\n",
       "      <td>0.471463</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Mexico</th>\n",
       "      <td>0.049495</td>\n",
       "      <td>0.051186</td>\n",
       "      <td>0.500257</td>\n",
       "      <td>0.165544</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Nicaragua</th>\n",
       "      <td>0.052647</td>\n",
       "      <td>0.054177</td>\n",
       "      <td>0.491447</td>\n",
       "      <td>0.780400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Panama</th>\n",
       "      <td>0.046796</td>\n",
       "      <td>0.049370</td>\n",
       "      <td>0.502404</td>\n",
       "      <td>0.705327</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Paraguay</th>\n",
       "      <td>0.048493</td>\n",
       "      <td>0.049229</td>\n",
       "      <td>0.503199</td>\n",
       "      <td>0.883697</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Peru</th>\n",
       "      <td>0.049914</td>\n",
       "      <td>0.050604</td>\n",
       "      <td>0.498931</td>\n",
       "      <td>0.771953</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Uruguay</th>\n",
       "      <td>0.012048</td>\n",
       "      <td>0.012907</td>\n",
       "      <td>0.899613</td>\n",
       "      <td>0.879764</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Venezuela</th>\n",
       "      <td>0.050344</td>\n",
       "      <td>0.048978</td>\n",
       "      <td>0.496194</td>\n",
       "      <td>0.573702</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             mean in control  mean in test  %samples in test group   p_value\n",
       "country                                                                     \n",
       "Argentina           0.015071      0.013725                0.799799  0.335147\n",
       "Bolivia             0.049369      0.047901                0.501079  0.718885\n",
       "Chile               0.048107      0.051295                0.500785  0.302848\n",
       "Colombia            0.052089      0.050571                0.498927  0.423719\n",
       "Costa Rica          0.052256      0.054738                0.498964  0.687876\n",
       "Ecuador             0.049154      0.048988                0.494432  0.961512\n",
       "El Salvador         0.053554      0.047947                0.497492  0.248127\n",
       "Guatemala           0.050643      0.048647                0.496066  0.572107\n",
       "Honduras            0.050906      0.047540                0.491013  0.471463\n",
       "Mexico              0.049495      0.051186                0.500257  0.165544\n",
       "Nicaragua           0.052647      0.054177                0.491447  0.780400\n",
       "Panama              0.046796      0.049370                0.502404  0.705327\n",
       "Paraguay            0.048493      0.049229                0.503199  0.883697\n",
       "Peru                0.049914      0.050604                0.498931  0.771953\n",
       "Uruguay             0.012048      0.012907                0.899613  0.879764\n",
       "Venezuela           0.050344      0.048978                0.496194  0.573702"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Conversion rate per country and group, and the fraction of each country's users assigned to test\n",
    "rates = data_new.groupby(['country','test'])[['conversion']].mean().unstack()\n",
    "test_share = data_new.groupby('country')[['test']].mean()\n",
    "df = pd.concat([rates, test_share], axis = 1)\n",
    "\n",
    "# Welch's t-test (unequal variances) on conversion, control vs. test, within each country\n",
    "p_value = {}\n",
    "for country, grp in data_new.groupby('country'):\n",
    "    control = grp.loc[grp['test'] == 0, 'conversion']\n",
    "    treated = grp.loc[grp['test'] == 1, 'conversion']\n",
    "    p_value[country] = stats.ttest_ind(control, treated, equal_var = False)[1]\n",
    "\n",
    "df = pd.concat([df, Series(p_value)], axis = 1)\n",
    "\n",
    "df.columns = ['mean in control', 'mean in test', '%samples in test group', 'p_value']\n",
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "#### Conclusions\n",
    "\n",
    "The p-value for every country is above 0.05; the smallest is about 0.17 (Mexico).\n",
    "\n",
    "We therefore fail to reject the null hypothesis: within each country, there is no statistically significant difference in mean\n",
    "conversion between the test and control groups. \n",
    "\n",
    "However, Argentina and Uruguay have by far the lowest conversion rates (roughly 1.2% to 1.5%, against about 5% in every other country). \n",
    "They are also heavily over-represented in the test group: about 80% of users from Argentina and about 90% of users from Uruguay were \n",
    "assigned to test, against roughly 50% for every other country. This can be verified from the third column of the dataframe above, \n",
    "which is expressed as a fraction.\n",
    "\n",
    "Because these two low-converting countries contribute a disproportionate number of users to the test group, they pull the overall\n",
    "test conversion rate below the overall control conversion rate. The apparent negative result is therefore driven by the unbalanced\n",
    "country mix, not by the translation.\n",
    "\n",
    "Once country is controlled for, the test and control groups perform similarly. \n",
    "The data give no evidence that the local translation changed the conversion rate in either direction.\n",
    "\n",
    "Side Note\n",
    "\n",
    "1. Argentina and Uruguay have the lowest conversion rates.\n",
    "2. Marketing efforts could be focused on these two countries to improve their conversion rates."
   ]
  }
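  ,
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The mix effect described in the conclusions can be reproduced with a few made-up counts. The cell below is an illustrative sketch only: the `toy` frame, the labels `high` and `low`, and all counts are hypothetical. Within each country the test and control rates are identical (5% and 1%), yet the pooled test rate (about 3.2%) comes out below the pooled control rate (about 4.3%), purely because the low-converting country sends 80% of its users to the test group."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# Hypothetical counts: equal rates in test and control within each country,\n",
    "# but the low-converting country sends 80% of its users to the test group.\n",
    "toy = pd.DataFrame({\n",
    "    'country':     ['high', 'high', 'low', 'low'],\n",
    "    'test':        [0, 1, 0, 1],\n",
    "    'users':       [10000, 10000, 2000, 8000],\n",
    "    'conversions': [500, 500, 20, 80],\n",
    "})\n",
    "\n",
    "# Per-country rates are the same in both groups (0.05 and 0.01).\n",
    "toy['rate'] = toy['conversions'] / toy['users'].astype(float)\n",
    "print(toy)\n",
    "\n",
    "# Pooled rates differ only because of the country mix.\n",
    "pooled = toy.groupby('test')[['users', 'conversions']].sum()\n",
    "pooled['rate'] = pooled['conversions'] / pooled['users'].astype(float)\n",
    "print(pooled)"
   ]
  }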
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}