Created
January 30, 2018 07:38
Compare the performance of two machine learning models
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Testing statistical significance: comparing two models\n", | |
"===============================\n", | |
"* Tutorial followed: https://machinelearningmastery.com/use-statistical-significance-tests-interpret-machine-learning-results/\n", | |
"* Steps:\n", | |
"    * Assume two models. Simulate trial data (model errors) with $\\mu_{1} = 50$, $\\sigma_{1} = 10$ and $\\mu_{2} = 60$, $\\sigma_{2} = 10$.\n", | |
"    * Convert to a pandas DataFrame.\n", | |
"    * Summarize the data using the describe command.\n", | |
"    * Plot a box-and-whisker plot.\n", | |
"    * Plot a histogram.\n", | |
"    * As seen, model A is better than model B on average (lower error).\n", | |
"    * Perform a normality test using the scipy.stats.normaltest function.\n", | |
"        * $H_{0}$ is that the sample was drawn from a normal distribution.\n", | |
"        * Our criterion: if p-value > 0.05, we fail to reject $H_{0}$ and treat the data as normal.\n", | |
"        * If p-value < 0.05, we reject $H_{0}$ at the 5% significance level.\n", | |
"    * Since both distributions are **Gaussian** and have the **same variance**, we apply Student's t-test to check whether the difference between the means is significant.\n", | |
"        * We use the scipy function ttest_ind().\n", | |
"        * $H_{0}$ for this function is that the two independent samples have identical means.\n", | |
"        * In other words, model A is no better than model B on average.\n", | |
"        * A p-value <= 0.05 means the difference between the means is statistically significant at the 5% level.\n", | |
"        * That is, if $H_{0}$ were true, a difference at least this large would be observed in at most 5% of repeated experiments. It does not mean that the means differ in 95% of samples.\n", | |
"* If the samples had **different variances**, we would use Welch's t-test instead of Student's t-test.\n", | |
"    * This is done by setting equal_var=False in ttest_ind().\n", | |
"* The closer the two distributions are, the more samples are needed to tell them apart.\n", | |
"* To compare non-Gaussian distributions, we use the two-sample Kolmogorov-Smirnov test, which compares the whole distributions rather than only their means.\n", | |
"    * We use the scipy function ks_2samp().\n", | |
"    * The same test can be applied to Gaussian data, but it has less statistical power than the t-test and may need more samples to detect a difference." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 1, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"import numpy as np\n", | |
"import pandas as pd\n", | |
"%matplotlib inline\n", | |
"\n", | |
"mu1 = 50; mu2 = 60\n", | |
"sigma1 = 10; sigma2 = 10;\n", | |
"trials1 = np.random.normal(mu1, sigma1, 100)\n", | |
"trials2 = np.random.normal(mu2, sigma2, 100)\n", | |
"data = pd.DataFrame({'A':trials1, 'B':trials2})" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 2, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
" A B\n", | |
"count 100.000000 100.000000\n", | |
"mean 50.261176 58.846194\n", | |
"std 10.433515 10.326449\n", | |
"min 23.314465 28.507300\n", | |
"25% 43.559996 52.891956\n", | |
"50% 49.533210 59.425351\n", | |
"75% 59.201264 64.794323\n", | |
"max 76.034436 85.295188\n" | |
] | |
} | |
], | |
"source": [ | |
"print(data.describe())" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 3, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"<matplotlib.axes._subplots.AxesSubplot at 0x7efd5b524cd0>" | |
] | |
}, | |
"execution_count": 3, | |
"metadata": {}, | |
"output_type": "execute_result" | |
}, | |
{ | |
"data": { | |
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXQAAAD8CAYAAABn919SAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAADr1JREFUeJzt3W9sXfV9x/H3tzGINF35v6sIJoJUNCwFka1XqF29ySalYutU8gAhrD2IJk9+lv3hQQH5Aeska4k0ifHUqjVl02boWBEVkaKizFedn2RNKJR1bkdKSZUoEBoRhhnaSPbdA5+sSbBz/+T+sX9+vyTr3nPu+fl8dXLy8U+/e875RWYiSVr/PjXoAiRJ3WGgS1IhDHRJKoSBLkmFMNAlqRAGuiQVwkCXpEIY6JJUCANdkgox1M+d3XLLLblt27Z+7rJoH374IVu2bBl0GdIneG5219GjR3+Rmbc2266vgb5t2zaOHDnSz10WrdFoMDo6OugypE/w3OyuiDjeynYOuUhSIQx0SSqEgS5JhTDQJakQBrokFcJAl9Q1c3NzbN++nZ07d7J9+3bm5uYGXdKG0tfLFiWVa25ujqmpKWZnZzl//jybNm1iYmICgPHx8QFXtzHYQ5fUFdPT08zOzjI2NsbQ0BBjY2PMzs4yPT096NI2DANdUlcsLi4yMjJyybqRkREWFxcHVNHGY6BL6orh4WEWFhYuWbewsMDw8PCAKtp4DHRJXTE1NcXExATz8/OcO3eO+fl5JiYmmJqaGnRpG4ZfikrqigtffO7Zs4fFxUWGh4eZnp72C9E+MtAldc34+Djj4+M+nGtAHHKRpEIY6JJUCANdkgphoEtSIQx0SSqEgS5JhTDQJakQXocuqWMR0XabzOxBJYIWe+gR8WcR8aOI+LeImIuI6yLizog4HBHHIuK5iLi218VKWlsyc8WfOx5/adXP1DtNAz0ibgP+GKhn5nZgE/AosA94OjM/B7wHTPSyUEnSlbU6hj4EbI6IIeDTwCngfuD56vP9wK7ulydJalXTQM/Mk8BfAT9nOcjfB44CZzPzXLXZCeC2XhUpSWqu6ZeiEXEj8BBwJ3AW+EfgwVZ3EBGTwCRArVaj0Wh0VKg+aWlpyeOpNctzs/9aucrly8DPMvNdgIj4NvAl4IaIGKp66bcDJ1dqnJkzwAxAvV5Pn8DWPT7RTmvWwQOemwPQyhj6z4EvRMSnY/kapZ3AvwPzwMPVNruBF3tToiSpFa2MoR9m+cvPV4DXqzYzwOPAYxFxDLgZmO1hnZKkJlq6sSgznwKeumz1m8B9Xa9IktQRb/2XpEIY6JJUCANdkgphoEtSIQx0SSqEgS5JhTDQJakQBrokFcIZi9a4TmaEAWeFkTYie+hr3GqzvjgrjKTLGeiSVAgDXZIKYaBLUiEMdEkqhIEuSYUw0CWpEAa6JBXCQJekQhjoklQIA12SCmGgS1IhDHRJKoSBLkmFaBroEfHrEfHqRT//GRF/GhE3RcTLEfFG9XpjPwqWJK2saaBn5k8yc0dm7gA+D/wX8ALwBHAoM+8CDlXLkqQBaXfIZSfw08w8DjwE7K/W7wd2dbMwSVJ72g30R4G56n0tM09V798Gal2rSpLUtpanoIuIa4GvAU9e/llmZkSsOE1OREwCkwC1Wo1Go9FZpVqRx1Nrledm/7Uzp+jvAq9k5jvV8jsRsTUzT0XEVuD0So0ycwaYAajX6zk6Ono19epiBw/g8dSa5Lk5EO0MuYzzy+EWgO8Au6v3u4EXu1WUJKl9LQV6RGwBHgC+fdHqvcADEfEG8OVqWZI0IC0NuWTmh8DNl607w/JVL5KkNcA7RSWpEAa6JBXCQJekQhjoklQIA12SCmGgS1IhDHRJKoSBLkmFMNAlqRAGuiQVwkCXpEIY6JJUiHaehy5pA7r3G9/l/Y8+brvdticOtLzt9Zuv4bWnvtL2PnQpA13SFb3/0ce8tferbbVpNBptTXDRTvhrdQ65SFIhDHRJKoSBLkmFMNAlqRAGuiQVwkCXpEIY6JJUCANdkgphoEtSIVoK9Ii4ISKej4
gfR8RiRHwxIm6KiJcj4o3q9cZeFytJWl2rPfRngIOZeTdwL7AIPAEcysy7gEPVsiRpQJoGekRcD/wOMAuQmf+TmWeBh4D91Wb7gV29KlKS1FwrPfQ7gXeBv4mIH0TENyNiC1DLzFPVNm8DtV4VKUlqrpWnLQ4BvwnsyczDEfEMlw2vZGZGRK7UOCImgUmAWq1Go9G4uop1CY+n+qHd82xpaantNp7LV6+VQD8BnMjMw9Xy8ywH+jsRsTUzT0XEVuD0So0zcwaYAajX69nOIzXVxMEDbT2iVOpIB+dZu4/P9VzujshcsWN96UYR/wL8UWb+JCL+HNhSfXQmM/dGxBPATZn59Sv9nnq9nkeOHLnamovU6SQC7XASAXXinv339GU/r+9+vS/7WY8i4mhm1ptt1+oEF3uAv4+Ia4E3gT9kefz9WxExARwHHum0WDmJgNauDxb3em6uEy0Fema+Cqz012Fnd8uRJHXKO0UlqRAGuiQVwkCXpEIY6JJUCANdkgphoEtSIQx0SSqEgS5JhTDQJakQBrokFcJAl6RCGOiSVAgDXZIKYaBLUiEMdEkqhIEuSYUw0CWpEAa6JBWi1TlFJW1gHc35ebD1Ntdvvqb9369PMNAlXVG7E0TD8h+ATtrp6jjkIkmFMNAlqRAGuiQVoqUx9Ih4C/gAOA+cy8x6RNwEPAdsA94CHsnM93pTpiSpmXZ66GOZuSMz69XyE8ChzLwLOFQtS5IG5GqGXB4C9lfv9wO7rr4cSVKnWg30BL4bEUcjYrJaV8vMU9X7t4Fa16uTJLWs1evQRzLzZET8KvByRPz44g8zMyMiV2pY/QGYBKjVajQajaupt2jtHpulpaW223j81S+ea/3XUqBn5snq9XREvADcB7wTEVsz81REbAVOr9J2BpgBqNfrOTo62pXCS/Mrx+9hz/EOGp5pYx/DMDr6egc7kdp08AD+X++/poEeEVuAT2XmB9X7rwB/AXwH2A3srV5f7GWhpftgcW/bd9Y1Go22/tN0dPu2pHWjlR56DXghIi5s/w+ZeTAivg98KyImgOPAI70rU5LUTNNAz8w3gXtXWH8G2NmLoiRJ7fNOUUkqhIEuSYUw0CWpEAa6JBXCQJekQhjoklQIA12SCmGgS1IhDHRJKoSBLkmFMNAlqRAGuiQVwkCXpEIY6JJUCANdkgrR6pyi6oOOZhQ62Hqb6zdf0/7vl7RuGOhrRLvTz8HyH4BO2kkqk0MuklQIA12SCmGgS1IhDHRJKoSBLkmFMNAlqRAtB3pEbIqIH0TES9XynRFxOCKORcRzEXFt78qUJDXTTg/9T4DFi5b3AU9n5ueA94CJbhYmSWpPS4EeEbcDXwW+WS0HcD/wfLXJfmBXLwqUJLWm1R76XwNfB/63Wr4ZOJuZ56rlE8BtXa5NktSGprf+R8TvA6cz82hEjLa7g4iYBCYBarUajUaj3V+hK/B4aq3y3Oy/Vp7l8iXgaxHxe8B1wGeBZ4AbImKo6qXfDpxcqXFmzgAzAPV6PUdHR7tRtwAOHsDjqTXJc3Mgmg65ZOaTmXl7Zm4DHgX+OTP/AJgHHq422w282LMqJUlNXc116I8Dj0XEMZbH1Ge7U5IkqRNtPT43MxtAo3r/JnBf90uSJHXC56FL6tjyFcyrfLZv5fWZ2aNq5K3/kjqWmSv+zM/Pr/qZesdAl6RCGOiSVAgDXZIKYaBLUiEMdEkqhIEuSYUw0CWpEAa6JBXCQJekQhjoklQIA12SCmGgS1IhDHRJKoSBLkmFMNAlqRAGuiQVwkCXpEIY6JJUCANdkgphoEtSIQx0SSpE00CPiOsi4l8j4rWI+FFEfKNaf2dEHI6IYxHxXERc2/tyJUmraaWH/t/A/Zl5L7ADeDAivgDsA57OzM8B7wETvStTktRM00DPZUvV4jXVTwL3A89X6/cDu3pSoSSpJS2NoUfEpoh4FTgNvAz8FDibmeeqTU4At/WmRElSK4Za2SgzzwM7IuIG4AXg7l
Z3EBGTwCRArVaj0Wh0UKZW4/HUWrS0tOS5OQAtBfoFmXk2IuaBLwI3RMRQ1Uu/HTi5SpsZYAagXq/n6Ojo1VWsXzp4AI+n1pK5uTmmp6dZXFxkeHiYqakpxsfHB13WhtE00CPiVuDjKsw3Aw+w/IXoPPAw8CywG3ixl4VKWtvm5uaYmppidnaW8+fPs2nTJiYmlq+VMNT7o5Ux9K3AfET8EPg+8HJmvgQ8DjwWEceAm4HZ3pUpaa2bnp5mdnaWsbExhoaGGBsbY3Z2lunp6UGXtmE07aFn5g+B31hh/ZvAfb0oStL6s7i4yMjIyCXrRkZGWFxcHFBFG493ikrqiuHhYRYWFi5Zt7CwwPDw8IAq2ngMdEldMTU1xcTEBPPz85w7d475+XkmJiaYmpoadGkbRltXuUjSai588blnz57/v8plenraL0T7yECX1DXj4+OMj4/TaDS8pHYAHHKRpEIY6JJUCANdkgphoEtSIQx0SSqEgS5JhTDQJakQBrokFcJAl6RCGOiSVAgDXZIKYaBLUiF8ONcaFxFX/nzfyuszswfVSFrL7KGvcZm56s/8/Pyqn0naeAx0SSqEgS5JhTDQJakQBrokFcJAl6RCGOiSVAgDXZIKYaBLUiGinzehRMS7wPG+7bB8twC/GHQR0go8N7vrjsy8tdlGfQ10dVdEHMnM+qDrkC7nuTkYDrlIUiEMdEkqhIG+vs0MugBpFZ6bA+AYuiQVwh66JBXCQF+HImJXRGRE3D3oWqQLIuJ8RLwaEa9FxCsR8VuDrmmjMdDXp3FgoXqV1oqPMnNHZt4LPAn85aAL2mgM9HUmIj4DjAATwKMDLkdazWeB9wZdxEbjnKLrz0PAwcz8j4g4ExGfz8yjgy5KAjZHxKvAdcBW4P4B17Ph2ENff8aBZ6v3z+Kwi9aOC0MudwMPAn8bzWY5V1d52eI6EhE3ASeAd4EENlWvd6T/kBqwiFjKzM9ctPwOcE9mnh5gWRuKPfT15WHg7zLzjszclpm/BvwM+O0B1yVdoroCaxNwZtC1bCSOoa8v48C+y9b9U7X+e/0vR7rEhTF0gAB2Z+b5QRa00TjkIkmFcMhFkgphoEtSIQx0SSqEgS5JhTDQJakQBrokFcJAl6RCGOiSVIj/Az/DPVaWwbPWAAAAAElFTkSuQmCC\n", | |
"text/plain": [ | |
"<matplotlib.figure.Figure at 0x7efd5b5245d0>" | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"data.boxplot()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 4, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"array([[<matplotlib.axes._subplots.AxesSubplot object at 0x7efd594bfc50>,\n", | |
" <matplotlib.axes._subplots.AxesSubplot object at 0x7efd5936ed90>]], dtype=object)" | |
] | |
}, | |
"execution_count": 4, | |
"metadata": {}, | |
"output_type": "execute_result" | |
}, | |
{ | |
"data": { | |
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAX4AAAEICAYAAABYoZ8gAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAFzhJREFUeJzt3X+w5XV93/Hnq6iJWRwBqdcViGumDC0jBdM7qKNtL6KIaMV0HIVSXaLOmoxO1Fkng+lMTPWPkjZqm5iRbJRCOwTMGFFGqbpDvEVn1LgQdEGgEFxl14VVoeA1tunqu3+c78br5dy9537Pufd7Dt/nY+bM+f74fM/3/T3ne973ez/n8/18UlVIkvrjH3QdgCRpc5n4JalnTPyS1DMmfknqGRO/JPWMiV+SesbEL0k9Y+KfMUkWkzyc5Be6jkXaaEn2JflxkqXmvP9MklO6jmvWmfhnSJJtwD8HCnhVp8FIm+dfVdWxwFbgQeCPOo5n5pn4Z8sbgK8AVwHbuw1F2lxV9X+AjwOndx3LrHtC1wFoXd4AfAD4KvCVJHNV9WDHMUmbIskvAa9jcPGjMZj4Z0SSFwHPAv68qr6f5G+AfwN8sNvIpA33ySSHgS3A94CXdRzPzLOqZ3ZsBz5fVd9v5v8Mq3vUD6+uquOAXwTeBvzPJM/oOKaZZuKfAUmeDLwW+JdJHkjyAPBO4MwkZ3YbnbQ5quonVfUJ4CfAi7qOZ5aZ+GfDqxmc7KcDZzWPfwJ8kUG9v/S4l4ELgeOBO7uOZ5bF/vinX5LPAndU1c4Vy18L/CFwclUd7iQ4aQMl2QfMMbjwKeDbwH+oqmu6jGvWmfglqWes6pGknjHxS1LPmPglqWdM/FJLSU5J8oUk30xyR5K3N8t/L8mBJLc1jwu6jlVabip/3D3xxBNr27ZtXYfR2o9+9CO2bNnSdRhjm+XjuOWWW75fVf9wI/eRZCuwtapuTfIU4BYGTW9fCyxV1R+M+lrTcM7P8uc9q7FPMu71nPNT2WXDtm3b2LNnT9dhtLa4uMjCwkLXYYxtlo8jybc3eh9VdRA42Ez/MMmdwEltXmsazvlZ/rxnNfZJxr2ec34qE780a5ous5/LoAO9FwJvS/IGYA+ws6oeHrLNDmAHwNzcHIuLi5sV7lBLS0udx9DWrMbeVdwmfmlMSY4F/gJ4R1U9muTDwPsY3HD0PuD9wBtXbldVu4BdAPPz89X1FeusXjXD7MbeVdz+uCuNIckTGST9a5p+ZKiqB5t+ZX4K/ClwdpcxSiuZ+KWWkgT4KHBnVX1g2fKty4r9GnD7ZscmHY1VPVJ7LwReD+xNcluz7HeAi5OcxaCqZx/wlm7Ck4Yz8UstVdWXgAxZdeNmxyKth1U9ktQzJn5J6hkTvyT1jHX8Hdt22WfWvc2+y1+xAZFIm6PNOQ/tznu/X8N5xS9JPWPil6SeMfFLUs+Y+CWpZ0z8ktQzayb+o4wydEKS3UnuaZ6PX2X77U2Ze5Jsn/QBSJLWZ5Qr/sMM+hM/HXg+8NYkpwOXATdV1anATc38z0lyAvAe4HkMeih8z2p/ICRJm2PNxF9VB6vq1mb6h8CRUYYuBK5uil3NYMi5lV4G7K6qh5qBKHYD508icElSO+uq418xytBcM/QcwAPA3JBNTgLuXza/n5ZD00mSJmPkO3eHjDL09+uqqpKMNWr7tA1DN471DKe284zD6379zXpvZnU4O0lHN1LiHzbKEPBgkq1VdbAZeOLQkE0PAAvL5k8GFoftY9qGoRvHeoZTu7TNLeWXjPba45rV4ewkHd0orXqGjjIE3AAcaaWzHfjUkM0/B5yX5PjmR93zmmWSpI6MUsd/ZJShFye5rXlcAFwOvDTJPcBLmnmSzCf5CEBVPcRgsOmvNY/3NsskSR1Zs6rnKKMMAZw7pPwe4M3L5q8ErmwboCRpsrxzV5J6xsQvST1j4peknjHxS1LPmPglqWdM/JLUMyZ+SeoZE78k9YyJX5J6xsQvST1j4peknjHxS1LPmPglqWdM/JLUMyZ+SeoZE78k9cwoQy9emeRQktuXLfvYst
G49iW5bZVt9yXZ25TbM8nApa4lOSXJF5J8M8kdSd7eLD8hye4k9zTPx3cdq7TcKFf8VwHnL19QVa+rqrOq6iwGg7B/YtiGjXOasvPtw5Sm0mFgZ1WdDjwfeGuS04HLgJuq6lTgpmZemhprJv6quhkYOk5uMxD7a4FrJxyXNPWq6mBV3dpM/xC4EzgJuBC4uil2NfDqbiKUhktVrV0o2QZ8uqqes2L5vwA+sNrVfJJvAQ8DBfxJVe06yj52ADsA5ubm/tl111034iFMn6WlJY499tiRyu498Mi6X/+Mk5667m3aWM9xTJtzzjnnls38L7P5jtwMPAf4TlUd1ywP8PCR+RXbTNU5v1mfd5tzHo5+3q8W+zR/v2Cy7/l6zvk1B1tfw8Uc/Wr/RVV1IMnTgd1J7mr+g3iM5o/CLoD5+flaWFgYM7TuLC4uMmr8l172mXW//r5LRnvtca3nOPosybEMqjzfUVWPDnL9QFVVkqFXV9N2zm/W593mnIejn/erxT7N3y/o7jvWulVPkicA/xr42GplqupA83wIuB44u+3+pGmU5IkMkv41VXXkt64Hk2xt1m8FDnUVnzTMOM05XwLcVVX7h61MsiXJU45MA+cBtw8rK82iphrno8CdVfWBZatuALY309uBT212bNLRjNKc81rgy8BpSfYneVOz6iJWVPMkeWaSG5vZOeBLSb4O/BXwmar67ORClzr3QuD1wIuXNW++ALgceGmSexhcIF3eZZDSSmvW8VfVxassv3TIsu8CFzTT9wFnjhmfNLWq6ktAVll97mbGIq2Hd+5KUs+Y+CWpZ0z8ktQzJn5J6hkTvyT1jIlfknrGxC9JPWPil6SeMfFLUs+M2zunltnW9AS484zDrXsg3Ejb1hlT2+PYd/kr1r2NpM3jFb8k9YxX/JJmwtH+Y53W/7KnlVf8ktQzJn5J6hkTvyT1jIlfknpmlBG4rkxyKMnty5b9XpIDK0YdGrbt+UnuTnJvkssmGbgkqZ1RrvivAs4fsvyDVXVW87hx5cokxwB/DLwcOB24OMnp4wQrSRrfmom/qm4GHmrx2mcD91bVfVX1d8B1wIUtXkeSNEHjtON/W5I3AHuAnVX18Ir1JwH3L5vfDzxvtRdLsgPYATA3N8fi4uIYoXVj5xmHAZh78s+mN0Lb92a9MbU9jln87KQ+aZv4Pwy8D6jm+f3AG8cJpKp2AbsA5ufna2FhYZyX68Sly7pseP/ejbs3bt8lC622W+8NLm2Po218kjZHq1Y9VfVgVf2kqn4K/CmDap2VDgCnLJs/uVkmSepQq8SfZOuy2V8Dbh9S7GvAqUmeneRJwEXADW32J0manDX/j09yLbAAnJhkP/AeYCHJWQyqevYBb2nKPhP4SFVdUFWHk7wN+BxwDHBlVd2xIUchSRrZmom/qi4esvijq5T9LnDBsvkbgcc09ZQkdcc7dyWpZ0z8ktQzJn5J6hkTvyT1jIlfknrGxC9JPWPil6SeMfFLUs+Y+CWpZ0z8UkvjjE4ndcnEL7V3FS1Gp5O6ZuKXWhpjdDqpUxs3WojUX2uNTgdM36hzS0tLmxLDRoxON8lR7zbzc9is93wlE780WSOPTjdto84tLi6yGTGsdyS4UUxy1LvNHEFus97zlazqkSZoxNHppE6tmfhXabnwn5LcleQbSa5Pctwq2+5Lsrdp3bBnkoFL02jE0emkTo1yxX8Vj225sBt4TlX9U+B/Ae8+yvbnNK0b5tuFKE2nZnS6LwOnJdmf5E3Af2wudr4BnAO8s9MgpSFGGYHr5iTbViz7/LLZrwCvmWxY0vRbz+h00jSZRB3/G4H/scq6Aj6f5JamBYMkqWNj/Qye5N8Bh4FrVinyoqo6kOTpwO4kdzVtn4e91lQ1bWvjSHOySTYtG6bte7PemNoexyx+dlKftE78SS4FXgmcW1U1rExVHWieDyW5nkELh6GJf9qatrVxpJnaJJuWDdO2udl6m9G1PY7NbA4naf1aVfUkOR
/4beBVVfW3q5TZkuQpR6aB87CFgyR1bpTmnMNaLnwIeAqD6pvbklzRlH1mkiN9k8wBX0rydeCvgM9U1Wc35CgkSSMbpVXPyC0Xquq7wAXN9H3AmWNFp6G2bcCdj5L6wzt3JalnTPyS1DMmfknqGRO/JPWMiV+SesbEL0k9Y+KXpJ4x8UtSz5j4JalnTPyS1DMmfknqGRO/JPWMiV+SesbEL0k9Y+KXpJ4x8UtSz4yU+JNcmeRQktuXLTshye4k9zTPx6+y7famzD1Jtk8qcElSO6Ne8V8FnL9i2WXATVV1KnBTM/9zkpwAvAd4HoOB1t+z2h8ISdLmGCnxV9XNwEMrFl8IXN1MXw28esimLwN2V9VDVfUwsJvH/gGRJG2iNcfcPYq5qjrYTD/AYHD1lU4C7l82v79Z9hhJdgA7AObm5lhcXBwjtG7sPOMwAHNP/tn0LGt7HLP42Ul9Mk7i/3tVVUlqzNfYBewCmJ+fr4WFhUmEtqkubQZB33nGYd6/dyJvbafaHse+SxYmH4ykiRmnVc+DSbYCNM+HhpQ5AJyybP7kZpkkqSPjJP4bgCOtdLYDnxpS5nPAeUmOb37UPa9ZJknqyKjNOa8FvgyclmR/kjcBlwMvTXIP8JJmniTzST4CUFUPAe8DvtY83tsskyR1ZKQK3Kq6eJVV5w4puwd487L5K4ErW0UnSZo479yVpJ4x8UtSz5j4JalnTPxSS+P0YSV1ycQvtXcVLfqwkrpm4pdaGqMPK6lTJn5pskbpw0rq1Ox3KCNNqbX6sJq2jgmXlpbWHcPeA4+sez87z1j3JmuaZMeIm/k5tHnPJ8HEL03Wg0m2VtXBo/RhBUxfx4SLi4usN4YjHRN2bZIdI25mJ4Nt3vNJsKpHmqxR+rCSOmXil1paTx9W0jSxqkdqaT19WEnTxCt+SeoZE78k9YxVPavYNiWtFSRp0rzil6SeaZ34k5yW5LZlj0eTvGNFmYUkjywr87vjhyxJGkfrqp6quhs4CyDJMQwGUb9+SNEvVtUr2+5HkjRZk6rqORf4m6r69oReT5K0QSb14+5FwLWrrHtBkq8D3wXeVVV3DCs0bf2WjNPvxyT7DelS2+Po+rOTdHRjJ/4kTwJeBbx7yOpbgWdV1VKSC4BPAqcOe51p67dknD5IJtlvSJfaHsdm9nUiaf0mUdXzcuDWqnpw5YqqerSqlprpG4EnJjlxAvuUJLU0icR/MatU8yR5RpI002c3+/vBBPYpSWpprPqIJFuAlwJvWbbsNwCq6grgNcBvJjkM/Bi4qKpW7Z9ckrTxxkr8VfUj4Gkrll2xbPpDwIfG2YekzbH3wCNT07++NpZ37kpSz5j4JalnTPyS1DMmfknqGRO/JPWMiV+SesbEL0k9Y+KXpJ4x8UtSz5j4JalnTPyS1DMmfknqGRO/JPWMiV+SesbEL0k9M3biT7Ivyd4ktyXZM2R9kvxhknuTfCPJr467T0lSe5MaEfycqvr+KutezmCA9VOB5wEfbp4l6XFhW8sBbK46f8uEIxnNZlT1XAj8txr4CnBckq2bsF9J0hCTuOIv4PNJCviTqtq1Yv1JwP3L5vc3yw4uL5RkB7ADYG5ujsXFxQmE1t7OMw633nbuyeNtPy3aHkfXn52ko5tE4n9RVR1I8nRgd5K7qurm9b5I8wdjF8D8/HwtLCxMILT2xhl7dOcZh3n/3knVonWn7XHsu2Rh8sFImpixq3qq6kDzfAi4Hjh7RZEDwCnL5k9ulkmSOjBW4k+yJclTjkwD5wG3ryh2A/CGpnXP84FHquog0uPYWq3dpC6NWx8xB1yf5Mhr/VlVfTbJbwBU1RXAjcAFwL3A3wK/PuY+pVlxtNZuUmfGSvxVdR9w5pDlVyybLuCt4+xHkjQ53rkrbYwjrd1uaVqsSVNj9pueSNNpzdZu09aEeZabIU8y9jafQ9t9Ly0tdfK5m/ilDbC8tVuSI63dbl5RZqqaMP/RNZ+a2WbIk2xC3aY5ctvm31
edv4UuPnereqQJG7G1m9SZ2fzzrqnWpt+SfZe/YgMi6czQ1m7dhiT9jIlfmrDVWrtJ08KqHknqGRO/JPWMiV+SesY6fklapu2gKrPEK35J6hkTvyT1jIlfknrGxC9JPTNTP+724UcXSdpoXvFLUs+0TvxJTknyhSTfTHJHkrcPKbOQ5JFm+LnbkvzueOFKksY1TlXPYWBnVd3a9ER4S5LdVfXNFeW+WFWvHGM/kqQJap34mwHTDzbTP0xyJ3ASsDLxS5KG2HvgkXX35T+Jnmwn8uNukm3Ac4GvDln9giRfB74LvKuq7ljlNdYcjWhWRgea5ZGMltvM4+h69CmpT8ZO/EmOBf4CeEdVPbpi9a3As6pqKckFwCeBU4e9ziijEbUd5WazTXI0oC5t5nG0GfVIUjtjtepJ8kQGSf+aqvrEyvVV9WhVLTXTNwJPTHLiOPuUJI1nnFY9AT4K3FlVH1ilzDOaciQ5u9nfD9ruU5I0vnH+j38h8Hpgb5LbmmW/A/wyQFVdAbwG+M0kh4EfAxdVVY2xT0nSmMZp1fMlIGuU+RDwobb7kCRNnnfuSlLPmPglqWdmv82hpMdo06HhzjM2IBBNJa/4JalnTPyS1DMmfknqGRO/JPWMiV+SesbEL0k9Y3NOTYW24ylPom9yqW9M/NIUa/sHUToaq3okqWdM/JLUMyZ+SeoZE78k9YyJX5J6Ztwxd89PcneSe5NcNmT9LyT5WLP+q0m2jbM/aVas9d2QujTOmLvHAH8MvBw4Hbg4yekrir0JeLiq/hHwQeD32+5PmhUjfjekzoxzxX82cG9V3VdVfwdcB1y4osyFwNXN9MeBc48Mvi49jo3y3ZA6M84NXCcB9y+b3w88b7UyVXU4ySPA04Dvr3yxJDuAHc3sUpK7x4itU78FJzLkGGfNLBxHVv8f8lmbGMZKo3w3pu6cn4XPezWzGnubuCdxzk/NnbtVtQvY1XUck5BkT1XNdx3HuB4vxzGtpu2cn+XPe1Zj7yrucap6DgCnLJs/uVk2tEySJwBPBX4wxj6lWTDKd0PqzDiJ/2vAqUmeneRJwEXADSvK3ABsb6ZfA/xlVdUY+5RmwSjfDakzrat6mjr7twGfA44BrqyqO5K8F9hTVTcAHwX+e5J7gYcYfAH6YGr+fR/T4+U4NtVq342OwxrFLH/esxp7J3HHC3BJ6hfv3JWknjHxS1LPmPgnIMkxSf46yaeb+Wc3XVTc23RZ8aSuYxxFkuOSfDzJXUnuTPKCJCck2Z3knub5+K7j1GTM6nk7y+dpkncmuSPJ7UmuTfKLXbzvJv7JeDtw57L53wc+2HRV8TCDritmwX8BPltV/xg4k8ExXQbcVFWnAjc183p8mNXzdibP0yQnAb8FzFfVcxj88H8RXbzvVeVjjAeDNto3AS8GPg2EwZ14T2jWvwD4XNdxjnAcTwW+RfOD/7LldwNbm+mtwN1dx+pjIp/3TJ63s3ye8rM7uk9g0KLy08DLunjfveIf338Gfhv4aTP/NOB/V9XhZn4/gw982j0b+B7wX5t//z+SZAswV1UHmzIPAHOdRahJmtXzdmbP06o6APwB8B3gIPAIcAsdvO8m/jEkeSVwqKpu6TqWCXgC8KvAh6vqucCPWPHvcg0uSWz/O+Nm/Lyd2fO0+d3hQgZ/vJ4JbAHO7yIWE/94Xgi8Ksk+Bj0wvphB/eNxTRcVMDu36+8H9lfVV5v5jzP4gj2YZCtA83yoo/g0ObN83s7yefoS4FtV9b2q+n/AJxh8Fpv+vpv4x1BV766qk6tqG4Mfaf6yqi4BvsCgiwoYdFnxqY5CHFlVPQDcn+S0ZtG5wDf5+W43ZuJYdHSzfN7O+Hn6HeD5SX6p6Z7+SOyb/r575+6EJFkA3lVVr0zyKwyupE4A/hr4t1X1f7uMbxRJzgI+AjwJuA/4dQYXB38O/DLwbeC1VfVQZ0FqombxvJ3l8zTJvwdeBxxm8B6/mUGd/qa+7yZ+SeoZq3okqWdM/J
LUMyZ+SeoZE78k9YyJX5J6xsQvST1j4peknvn/ZqgzNOgJt+AAAAAASUVORK5CYII=\n", | |
"text/plain": [ | |
"<matplotlib.figure.Figure at 0x7efd593d26d0>" | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"data.hist()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 5, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"p-value (model A): 0.607729394781 \t\tAccepted: True\n", | |
"p-value (model B): 0.715132302366 \t\tAccepted: True\n" | |
] | |
} | |
], | |
"source": [ | |
"from scipy.stats import normaltest\n", | |
"\n", | |
"valA, pA = normaltest(data['A'])\n", | |
"valB, pB = normaltest(data['B'])\n", | |
"print('p-value (model A): {} \\t\\tAccepted: {}'.format(pA, pA > 0.05))\n", | |
"print('p-value (model B): {} \\t\\tAccepted: {}'.format(pB, pB > 0.05))" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 6, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Statistics: -5.84822205035 p-value 2.0247432256e-08 \tDifference is significant: True\n" | |
] | |
} | |
], | |
"source": [ | |
"from scipy.stats import ttest_ind\n", | |
"statistics, p_value = ttest_ind(data['A'], data['B'], equal_var=True)\n", | |
"print('Statistics: {} p-value {} \\tDifference is significant: {}'.format(statistics, p_value, p_value < 0.05))" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 7, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Statistics: -5.84822205035 p-value 2.02501577258e-08 \tDifference is significant: True\n" | |
] | |
} | |
], | |
"source": [ | |
"from scipy.stats import ttest_ind\n", | |
"statistics, p_value = ttest_ind(data['A'], data['B'], equal_var=False)\n", | |
"print('Statistics: {} p-value {} \\tDifference is significant: {}'.format(statistics, p_value, p_value < 0.05))" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 8, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Statistics: 0.36 p-value 2.84907339346e-06 \tDifference is significant: True\n" | |
] | |
} | |
], | |
"source": [ | |
"from scipy.stats import ks_2samp\n", | |
"statistics, p_value = ks_2samp(data['A'], data['B'])\n", | |
"print('Statistics: {} p-value {} \\tDifference is significant: {}'.format(statistics, p_value, p_value < 0.05))" | |
] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 2", | |
"language": "python", | |
"name": "python2" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 2 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython2", | |
"version": "2.7.12" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 2 | |
} |
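
The steps in the notebook above can be condensed into one helper. The following is a minimal Python 3 sketch and is not part of the original notebook: the function name `compare_errors`, the `alpha` parameter, the returned dictionary keys, and the fixed seed are all assumptions made for illustration. It also assumes that choosing the test from the outcome of the normality check is acceptable for a quick comparison, and it uses Welch's t-test by default rather than the notebook's `equal_var=True`.

```python
import numpy as np
from scipy.stats import ks_2samp, normaltest, ttest_ind


def compare_errors(errors_a, errors_b, alpha=0.05):
    """Compare two samples of model errors (hypothetical helper, not from the notebook)."""
    # Normality check: H0 is that each sample comes from a normal distribution.
    _, p_norm_a = normaltest(errors_a)
    _, p_norm_b = normaltest(errors_b)
    looks_normal = bool(p_norm_a > alpha and p_norm_b > alpha)

    if looks_normal:
        # Welch's t-test (equal_var=False): H0 is that the two means are equal.
        statistic, p_value = ttest_ind(errors_a, errors_b, equal_var=False)
        test_name = "welch_t"
    else:
        # Two-sample Kolmogorov-Smirnov test: H0 is that both samples
        # come from the same distribution.
        statistic, p_value = ks_2samp(errors_a, errors_b)
        test_name = "ks_2samp"

    return {
        "test": test_name,
        "statistic": float(statistic),
        "p_value": float(p_value),
        "significant": bool(p_value < alpha),
    }


# Same simulation settings as the notebook; the seed is an assumption
# added so that repeated runs give the same numbers.
rng = np.random.default_rng(0)
errors_a = rng.normal(50, 10, 100)
errors_b = rng.normal(60, 10, 100)
result = compare_errors(errors_a, errors_b)
print(result)
```

Welch's test is used here because it does not require equal variances and behaves much like Student's t-test when the variances do match, which is consistent with the nearly identical p-values the notebook reports for `equal_var=True` and `equal_var=False`.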