danielfleischer/Statistic Tests.ipynb

## Statistic Tests.ipynb
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [],
   "source": [
    "a = np.random.randint(1,10,100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [],
   "source": [
    "b = np.random.normal(size=100)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "toc-hr-collapsed": false
   },
   "source": [
    "# Normality Tests"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Shapiro-Wilk\n",
    "Tests whether a sample was drawn from a Gaussian distribution. Observations need to be IID."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from scipy.stats import shapiro"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(0.9135679006576538, 6.518070676975185e-06)"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "shapiro(a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(0.9860052466392517, 0.374272882938385)"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "shapiro(b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## D'Agostino's $K^2$\n",
    "Tests whether a sample was drawn from a Gaussian distribution. Observations need to be IID."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {},
   "outputs": [],
   "source": [
    "from scipy.stats import normaltest"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "NormaltestResult(statistic=41.39502617899559, pvalue=1.0260872144248276e-09)"
      ]
     },
     "execution_count": 48,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "normaltest(a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "NormaltestResult(statistic=3.593125002661302, pvalue=0.16586808066854905)"
      ]
     },
     "execution_count": 49,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "normaltest(b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Anderson-Darling\n",
    "Tests whether a sample was drawn from a Gaussian distribution. Observations need to be IID."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {},
   "outputs": [],
   "source": [
    "from scipy.stats import anderson"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "AndersonResult(statistic=2.3516564884300664, critical_values=array([0.555, 0.632, 0.759, 0.885, 1.053]), significance_level=array([15. , 10. ,  5. ,  2.5,  1. ]))"
      ]
     },
     "execution_count": 53,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "anderson(a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "AndersonResult(statistic=0.3417980857269356, critical_values=array([0.555, 0.632, 0.759, 0.885, 1.053]), significance_level=array([15. , 10. ,  5. ,  2.5,  1. ]))"
      ]
     },
     "execution_count": 54,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "anderson(b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "toc-hr-collapsed": false
   },
   "source": [
    "# Correlation Tests"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Pearson Coefficient\n",
    "Tests whether two samples have a linear relationship. Observations need to be IID, and each sample should be normally distributed with the same variance."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {},
   "outputs": [],
   "source": [
    "from scipy.stats import pearsonr"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(0.0820377744456807, 0.4171172980326803)"
      ]
     },
     "execution_count": 56,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pearsonr(a,b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Spearman Rank\n",
    "Tests whether two samples have a monotonic relationship. Observations need to be IID and rankable."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {},
   "outputs": [],
   "source": [
    "from scipy.stats import spearmanr"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "SpearmanrResult(correlation=0.09828266099415585, pvalue=0.3306388846208107)"
      ]
     },
     "execution_count": 58,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "spearmanr(a,b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Kendall Rank\n",
    "Tests whether two samples have a monotonic relationship. Observations need to be IID and rankable."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {},
   "outputs": [],
   "source": [
    "from scipy.stats import kendalltau"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "KendalltauResult(correlation=0.07047459480753636, pvalue=0.32412404755614377)"
      ]
     },
     "execution_count": 60,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "kendalltau(a,b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## $\\chi^2$ Test\n",
    "Tests whether two categorical variables are related or independent. Observations need to be independent, with at least 25 examples in each cell of the contingency table (a commonly quoted minimum is an expected count of 5 per cell)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "metadata": {},
   "outputs": [],
   "source": [
    "from scipy.stats import chi2_contingency"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "metadata": {},
   "outputs": [],
   "source": [
    "table = np.array([[40, 10],[30, 40]])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(15.062204081632652,\n",
       " 0.00010402546661835286,\n",
       " 1,\n",
       " array([[29.16666667, 20.83333333],\n",
       "        [40.83333333, 29.16666667]]))"
      ]
     },
     "execution_count": 72,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "chi2_contingency(table)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "toc-hr-collapsed": false
   },
   "source": [
    "# Hypothesis Tests"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## T Test\n",
    "Tests whether the means of two independent samples are significantly different. Observations need to be independent, and each sample should be normally distributed with the same variance."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "metadata": {},
   "outputs": [],
   "source": [
    "from scipy.stats import ttest_ind"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 78,
   "metadata": {},
   "outputs": [],
   "source": [
    "a = np.random.normal(loc=10, size=100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 84,
   "metadata": {},
   "outputs": [],
   "source": [
    "b = np.random.normal(loc=11, size=100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 85,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Ttest_indResult(statistic=-9.211238980494853, pvalue=4.740751714730386e-17)"
      ]
     },
     "execution_count": 85,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ttest_ind(a,b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Paired Student T Test\n",
    "Tests whether the means of two paired samples are significantly different. Observations within each sample need to be independent, each sample should be normally distributed with the same variance, and observations must be paired across the two samples."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 87,
   "metadata": {},
   "outputs": [],
   "source": [
    "from scipy.stats import ttest_rel"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 95,
   "metadata": {},
   "outputs": [],
   "source": [
    "a = np.random.normal(loc=9.5, size=100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 96,
   "metadata": {},
   "outputs": [],
   "source": [
    "b = np.random.normal(loc=10.1, size=100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 97,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Ttest_relResult(statistic=-4.265035140526937, pvalue=4.575138155519719e-05)"
      ]
     },
     "execution_count": 97,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ttest_rel(a,b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Analysis of Variance (ANOVA)\n",
    "Tests whether the means of two or more samples are significantly different. Observations need to be independent, and each sample should be normally distributed with the same variance."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 98,
   "metadata": {},
   "outputs": [],
   "source": [
    "from scipy.stats import f_oneway"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 102,
   "metadata": {},
   "outputs": [],
   "source": [
    "a = np.random.normal(loc=3, size=100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 103,
   "metadata": {},
   "outputs": [],
   "source": [
    "b = np.random.normal(loc=11, size=100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 104,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "F_onewayResult(statistic=3087.4306882609585, pvalue=9.854457388246251e-123)"
      ]
     },
     "execution_count": 104,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "f_oneway(a,b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "toc-hr-collapsed": false
   },
   "source": [
    "# Non-parametric Hypothesis Testing"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Mann-Whitney U Test\n",
    "Tests whether the distributions of two independent samples are equal. Observations need to be independent and rankable."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 105,
   "metadata": {},
   "outputs": [],
   "source": [
    "from scipy.stats import mannwhitneyu"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 115,
   "metadata": {},
   "outputs": [],
   "source": [
    "a = np.random.randint(1,10,size=100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 116,
   "metadata": {},
   "outputs": [],
   "source": [
    "b = np.random.normal(loc=0, size=100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 117,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "MannwhitneyuResult(statistic=161.0, pvalue=1.3099464346996372e-32)"
      ]
     },
     "execution_count": 117,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mannwhitneyu(a,b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Wilcoxon Signed Rank Test\n",
    "Tests whether the distributions of two paired samples are equal. Observations need to be independent, rankable, and paired across the two samples."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 118,
   "metadata": {},
   "outputs": [],
   "source": [
    "from scipy.stats import wilcoxon"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 128,
   "metadata": {},
   "outputs": [],
   "source": [
    "a = np.random.normal(loc=0,size=100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 129,
   "metadata": {},
   "outputs": [],
   "source": [
    "b = np.random.normal(loc=0, size=100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 130,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "WilcoxonResult(statistic=2525.0, pvalue=1.0)"
      ]
     },
     "execution_count": 130,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "wilcoxon(a,b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Kruskal-Wallis H Test\n",
    "Tests whether the distributions of two or more independent samples are equal. Observations need to be independent and rankable."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 131,
   "metadata": {},
   "outputs": [],
   "source": [
    "from scipy.stats import kruskal"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 145,
   "metadata": {},
   "outputs": [],
   "source": [
    "a = np.random.normal(loc=0,size=100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 148,
   "metadata": {},
   "outputs": [],
   "source": [
    "b = np.random.normal(loc=1.3, size=100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 149,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "KruskalResult(statistic=56.23140895522397, pvalue=6.442411772846654e-14)"
      ]
     },
     "execution_count": 149,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "kruskal(a,b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "toc-hr-collapsed": false
   },
   "source": [
    "## Friedman Test\n",
    "Tests whether the distributions of two or more paired samples are equal. Observations need to be independent, rankable, and paired across the samples."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 154,
   "metadata": {},
   "outputs": [],
   "source": [
    "from scipy.stats import friedmanchisquare"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 155,
   "metadata": {},
   "outputs": [],
   "source": [
    "a = np.random.normal(loc=0,size=100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 156,
   "metadata": {},
   "outputs": [],
   "source": [
    "b = np.random.normal(loc=1.3, size=100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 158,
   "metadata": {},
   "outputs": [],
   "source": [
    "c = np.random.normal(loc=1.3, size=100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 159,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "FriedmanchisquareResult(statistic=52.460000000000036, pvalue=4.059342910977334e-12)"
      ]
     },
     "execution_count": 159,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "friedmanchisquare(a,b,c)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.2"
  },
  "toc-autonumbering": false,
  "toc-showcode": false,
  "toc-showmarkdowntxt": false
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
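
The normality tests in the notebook report their results in two different shapes: `shapiro` and `normaltest` return a p-value, while `anderson` returns a statistic together with critical values at fixed significance levels. The sketch below is not part of the original notebook; it shows one possible way to turn both shapes into a reject / do-not-reject decision. The helper names, the fixed seed, and the 5% level are assumptions chosen for illustration, and it assumes a SciPy version where `anderson` exposes `critical_values` and `significance_level`, as in the notebook's output.

```python
import numpy as np
from scipy.stats import anderson, shapiro

ALPHA = 0.05  # assumed significance level; the notebook does not fix one


def rejects_normality_shapiro(sample, alpha=ALPHA):
    """Reject the Gaussian null hypothesis when the p-value is below alpha."""
    _statistic, p_value = shapiro(sample)
    return bool(p_value < alpha)


def rejects_normality_anderson(sample, level_percent=5.0):
    """Reject when the statistic exceeds the critical value at the given level.

    anderson() reports significance levels in percent (15, 10, 5, 2.5, 1),
    so the matching critical value has to be looked up by position.
    """
    result = anderson(sample)  # dist='norm' is the default
    idx = np.flatnonzero(np.isclose(result.significance_level, level_percent))[0]
    return bool(result.statistic > result.critical_values[idx])


# Seeded generator so the sketch is reproducible (the notebook itself is unseeded).
rng = np.random.default_rng(0)
uniform_ints = rng.integers(1, 10, size=100)  # same range as np.random.randint(1, 10, 100)
gaussian = rng.normal(size=100)

for name, sample in [("uniform_ints", uniform_ints), ("gaussian", gaussian)]:
    print(
        name,
        "shapiro rejects:", rejects_normality_shapiro(sample),
        "anderson rejects:", rejects_normality_anderson(sample),
    )
```

A rejection says the data are unlikely under a Gaussian model; a non-rejection does not prove normality, it only means the test found no evidence against it at the chosen level.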