@usernaamee
Created January 30, 2018 07:38
Compare the performance of two machine learning models
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Testing statistical significance: Comparing two models\n",
"===============================\n",
"* Tutorial followed: https://machinelearningmastery.com/use-statistical-significance-tests-interpret-machine-learning-results/\n",
"* Steps:\n",
" * Assume two models. Simulate trial data (of model errors) with $\\mu_{1} = 50$, $\\sigma_{1} = 10$ & $\\mu_{2} = 60$, $\\sigma_{2} = 10$\n",
" * Convert to pandas DataFrame\n",
" * Describe the data using describe command\n",
" * Plot a Box & Whisker plot\n",
" * Plot a histogram\n",
" * As seen, on average, model A is better than B (lower error)\n",
" * Perform a normality test using the scipy.stats.normaltest function\n",
" * $H_{0}$ is that the sample comes from a normal distribution.\n",
" * If p-value > 0.05, we fail to reject $H_{0}$: the data are consistent with normality (this does not prove normality).\n",
" * If p-value < 0.05, we reject $H_{0}$ at the 5% significance level.\n",
" * Since both samples look **Gaussian** and have the **same variance**, we apply Student's t-test to check whether the difference between the means is significant.\n",
" * We use the scipy function ttest_ind()\n",
" * $H_{0}$ for this function is that the two independent samples have identical mean (expected) values.\n",
" * In other words, model A is on average no better than model B.\n",
" * A p-value <= 0.05 means we reject $H_{0}$ at the 5% significance level: the difference between the means is statistically significant.\n",
" * It does not mean the means differ with 95% probability. It means that, if $H_{0}$ were true, a difference at least this large would show up in at most 5% of repeated experiments.\n",
"* If the two samples had **different variances**, we would not use Student's t-test. We would use Welch's t-test instead.\n",
" * This can be done by setting the equal_var option of ttest_ind to False.\n",
"* The closer the distributions are, the higher the number of samples required to tell them apart.\n",
"* To compare non-Gaussian distributions, we can use the two-sample Kolmogorov-Smirnov test, which compares the whole distributions rather than only their means.\n",
" * We use the scipy function ks_2samp().\n",
" * The same test can be applied to Gaussian distributions, but it has less statistical power than the t-test for detecting a difference in means and may need more samples to tell the distributions apart."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"%matplotlib inline\n",
"\n",
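"# Simulated error scores for two models (lower is better)\n",
"mu1, mu2 = 50, 60\n",
"sigma1, sigma2 = 10, 10\n",
"trials1 = np.random.normal(mu1, sigma1, 100)  # 100 trials of model A\n",
"trials2 = np.random.normal(mu2, sigma2, 100)  # 100 trials of model B\n",
"data = pd.DataFrame({'A': trials1, 'B': trials2})"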
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" A B\n",
"count 100.000000 100.000000\n",
"mean 50.261176 58.846194\n",
"std 10.433515 10.326449\n",
"min 23.314465 28.507300\n",
"25% 43.559996 52.891956\n",
"50% 49.533210 59.425351\n",
"75% 59.201264 64.794323\n",
"max 76.034436 85.295188\n"
]
}
],
"source": [
"print(data.describe())"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.axes._subplots.AxesSubplot at 0x7efd5b524cd0>"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<matplotlib.figure.Figure at 0x7efd5b5245d0>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"data.boxplot()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[<matplotlib.axes._subplots.AxesSubplot object at 0x7efd594bfc50>,\n",
" <matplotlib.axes._subplots.AxesSubplot object at 0x7efd5936ed90>]], dtype=object)"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<matplotlib.figure.Figure at 0x7efd593d26d0>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"data.hist()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"p-value (model A): 0.607729394781 \t\tAccepted: True\n",
"p-value (model B): 0.715132302366 \t\tAccepted: True\n"
]
}
],
"source": [
"from scipy.stats import normaltest\n",
"\n",
"valA, pA = normaltest(data['A'])\n",
"valB, pB = normaltest(data['B'])\n",
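"# 'Accepted' here means that normality (H0) was not rejected at the 5% level\n",
"print('p-value (model A): {} \\t\\tAccepted: {}'.format(pA, pA > 0.05))\n",
"print('p-value (model B): {} \\t\\tAccepted: {}'.format(pB, pB > 0.05))"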
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Statistics: -5.84822205035 p-value 2.0247432256e-08 \tDifference is significant: True\n"
]
}
],
"source": [
"from scipy.stats import ttest_ind\n",
"# Student's t-test (assumes equal variances)\n",
"statistics, p_value = ttest_ind(data['A'], data['B'], equal_var=True)\n",
"print('Statistics: {} p-value {} \\tDifference is significant: {}'.format(statistics, p_value, p_value < 0.05))"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Statistics: -5.84822205035 p-value 2.02501577258e-08 \tDifference is significant: True\n"
]
}
],
"source": [
"from scipy.stats import ttest_ind\n",
"# Welch's t-test (does not assume equal variances)\n",
"statistics, p_value = ttest_ind(data['A'], data['B'], equal_var=False)\n",
"print('Statistics: {} p-value {} \\tDifference is significant: {}'.format(statistics, p_value, p_value < 0.05))"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Statistics: 0.36 p-value 2.84907339346e-06 \tDifference is significant: True\n"
]
}
],
"source": [
"from scipy.stats import ks_2samp\n",
"# Two-sample Kolmogorov-Smirnov test (compares the whole distributions)\n",
"statistics, p_value = ks_2samp(data['A'], data['B'])\n",
"print('Statistics: {} p-value {} \\tDifference is significant: {}'.format(statistics, p_value, p_value < 0.05))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
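
The notebook's remark that closer distributions need more samples to tell apart can be checked with a small simulation. The sketch below is not part of the original notebook: the helper name `estimated_power`, the 500 repeats, the fixed seed, and the choice of gaps and sample sizes are all assumptions made for illustration. It assumes Python 3 with NumPy 1.17 or later and SciPy, and it reuses the notebook's setup (Gaussian errors, sigma = 10, Student's t-test via `ttest_ind`).

```python
import numpy as np
from scipy.stats import ttest_ind


def estimated_power(mean_gap, n, sigma=10.0, alpha=0.05, n_repeats=500, seed=0):
    """Fraction of simulated experiments in which Student's t-test rejects H0.

    Hypothetical helper: model A has mean error 50, model B has mean error
    50 + mean_gap, and both share the same standard deviation sigma.
    """
    rng = np.random.default_rng(seed)
    # Each row is one simulated experiment with n error values per model.
    errors_a = rng.normal(50.0, sigma, size=(n_repeats, n))
    errors_b = rng.normal(50.0 + mean_gap, sigma, size=(n_repeats, n))
    # axis=1 runs one independent two-sample t-test per row.
    _, p_values = ttest_ind(errors_a, errors_b, axis=1, equal_var=True)
    # Share of experiments where the difference is declared significant.
    return float(np.mean(p_values < alpha))


# Smaller gaps between the means need more samples to be detected reliably.
for gap in (10, 5, 2):
    for n in (10, 100, 1000):
        print("gap={:>2}  n={:>4}  power={:.2f}".format(gap, n, estimated_power(gap, n)))
```

For a fixed gap, the printed rejection rate should rise with the sample size, and for a fixed sample size it should fall as the gap shrinks. The exact numbers depend on the seed and the number of repeats, so they are best read as rough estimates.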