@xiaoouwang
Last active January 6, 2023 15:34
Independent t-test by hand in Python: with equal sample sizes and variance
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"ExecuteTime": {
"end_time": "2021-01-15T19:33:40.167472Z",
"start_time": "2021-01-15T19:33:40.165268Z"
}
},
"source": [
"# Independent two-sample t-test with equal sample sizes and variance\n",
"> Detail: https://www.nlpinpython.com/article/article-detail/4/\n",
"> Medium: https://xiaoouwang.medium.com/t-test-by-hand-in-python-d5513d1b55eb\n",
"\n",
"> Author: Xiaoou Wang. https://www.linkedin.com/in/xiaoou-wang "
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {
"ExecuteTime": {
"end_time": "2021-01-15T19:36:48.188317Z",
"start_time": "2021-01-15T19:36:48.182993Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The mean of a is around 4 -> 3.9028591091939 \n",
"The mean of b is around 0 -> -0.16958838211535887\n",
"The variance of a is around 1 -> 1.418239361307898 \n",
"The variance of b is around 1 -> 0.9982804180510149\n",
"The degree of freedom is -> 18\n"
]
}
],
"source": [
"## necessary packages\n",
"import numpy as np\n",
"from scipy import stats\n",
"from numpy.random import seed\n",
"\n",
"## generate data\n",
"# set the seed\n",
"seed(1)\n",
"N = 10\n",
"# Gaussian distributed data with mean = 4 and var = 1\n",
"# see https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.random.randn.html\n",
"a = np.random.randn(N) + 4\n",
"# Gaussian distributed data with mean = 0 and var = 1\n",
"b = np.random.randn(N)\n",
"print(\"The mean of a is around 4 ->\",np.mean(a),\"\\nThe mean of b is around 0 ->\",np.mean(b))\n",
"print(\"The variance of a is around 1 ->\",np.var(a),\"\\nThe variance of b is around 1 ->\",np.var(b))\n",
"df = N*2 - 2\n",
"print(\"The degree of freedom is ->\",df)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Formula for variance\n",
"\n",
"$$s_N^2 = \\frac{1}{N} \\sum_{i=1}^N \\left(x_i - \\bar{x}\\right)^2.$$\n",
"\n",
"## Formula for corrected variance\n",
"\n",
"An unbiased estimator for the *variance* is given by applying [Bessel's correction](https://www.wikiwand.com/en/Bessel%27s_correction), using *N−1* instead of *N* to yield the *unbiased sample variance* denoted *s<sup>2</sup>*:\n",
"$$s^2 = \\frac{1}{N - 1} \\sum_{i=1}^N \\left(x_i - \\bar{x}\\right)^2.$$\n",
"\n",
"## Formula for t statistic\n",
"\n",
"$$ t = \\frac{\\bar{X}_1 - \\bar{X}_2}{s_p \\sqrt{\\frac{2}{N}}} $$\n",
"\n",
"where $N$ is the size of each sample and the pooled standard deviation is\n",
"$$ s_p = \\sqrt{\\frac{s_{X_1}^2+s_{X_2}^2}{2}}.$$"
]
},
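{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a side note on where the t statistic formula and `df = N*2 - 2` come from, here is a short sketch starting from the general pooled-variance t-test. The symbols $n_1$ and $n_2$ (the two sample sizes) are introduced here for illustration only and do not appear elsewhere in this notebook.\n",
"\n",
"$$ s_p^2 = \\frac{(n_1 - 1)s_{X_1}^2 + (n_2 - 1)s_{X_2}^2}{n_1 + n_2 - 2}, \\qquad t = \\frac{\\bar{X}_1 - \\bar{X}_2}{s_p \\sqrt{\\frac{1}{n_1} + \\frac{1}{n_2}}}. $$\n",
"\n",
"Setting both sample sizes to the common value $N$ used in the code gives\n",
"\n",
"$$ s_p^2 = \\frac{(N - 1)\\left(s_{X_1}^2 + s_{X_2}^2\\right)}{2N - 2} = \\frac{s_{X_1}^2 + s_{X_2}^2}{2}, \\qquad \\sqrt{\\frac{1}{N} + \\frac{1}{N}} = \\sqrt{\\frac{2}{N}}, $$\n",
"\n",
"with $n_1 + n_2 - 2 = 2N - 2$ degrees of freedom, which is 18 for $N = 10$."
]
},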
{
"cell_type": "code",
"execution_count": 67,
"metadata": {
"ExecuteTime": {
"end_time": "2021-01-15T19:38:17.419549Z",
"start_time": "2021-01-15T19:38:17.413622Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"t hand calculated -> 7.859258455468933\n",
"p hand calculated -> 3.152614480583793e-07\n",
"t using scipy -> 7.8592584554689315\n",
"p using scipy -> 3.152614479389834e-07\n"
]
}
],
"source": [
"# ddof=1 -> corrected (unbiased) variance\n",
"var_a = a.var(ddof=1)\n",
"var_b = b.var(ddof=1)\n",
"sp = (var_a/2+var_b/2)**0.5\n",
"\n",
"## Calculate the t-statistic\n",
"t = (a.mean() - b.mean())/(sp*np.sqrt(2/N))\n",
"\n",
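"# one-tailed p-value beyond |t| (doubled below for the two-sided test); abs() keeps it valid for negative t\n",
"p = 1 - stats.t.cdf(abs(t),df=df)\n",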
"\n",
"print(\"t hand calculated ->\",t)\n",
"print(\"p hand calculated ->\",2*p)\n",
"\n",
"## Cross-check with scipy's built-in ttest_ind (equal variances assumed by default)\n",
"t2, p2 = stats.ttest_ind(a,b)\n",
"print(\"t using scipy ->\",t2)\n",
"print(\"p using scipy ->\",p2)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
},
"varInspector": {
"cols": {
"lenName": 16,
"lenType": 16,
"lenVar": 40
},
"kernels_config": {
"python": {
"delete_cmd_postfix": "",
"delete_cmd_prefix": "del ",
"library": "var_list.py",
"varRefreshCmd": "print(var_dic_list())"
},
"r": {
"delete_cmd_postfix": ") ",
"delete_cmd_prefix": "rm(",
"library": "var_list.r",
"varRefreshCmd": "cat(var_dic_list()) "
}
},
"types_to_exclude": [
"module",
"function",
"builtin_function_or_method",
"instance",
"_Feature"
],
"window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 4
}
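
The notebook's hand calculation can also be wrapped in a small reusable function. The following is only a minimal sketch under the same assumptions as the notebook (equal sample sizes, equal variances); the function name `equal_n_t_test` is hypothetical and not part of the original gist, and the sketch uses a local `RandomState` generator and `stats.t.sf` (the survival function) instead of `1 - cdf`, which is a substitution made here for illustration.

```python
import numpy as np
from scipy import stats


def equal_n_t_test(x, y):
    """Two-sample t-test by hand, assuming equal sample sizes and equal variances.

    Hypothetical helper for illustration; returns (t, two-sided p, df).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.size != y.size:
        raise ValueError("this simplified formula needs equal sample sizes")
    n = x.size
    # Pooled standard deviation from the Bessel-corrected variances (ddof=1).
    sp = np.sqrt((x.var(ddof=1) + y.var(ddof=1)) / 2)
    t = (x.mean() - y.mean()) / (sp * np.sqrt(2 / n))
    df = 2 * n - 2
    # Two-sided p-value; abs(t) keeps it valid when t is negative.
    p = 2 * stats.t.sf(abs(t), df=df)
    return t, p, df


# Same seed and sample size as the notebook, but with a local generator.
rng = np.random.RandomState(1)
a = rng.randn(10) + 4
b = rng.randn(10)

t_ab, p_ab, df_ab = equal_n_t_test(a, b)
t_ba, p_ba, _ = equal_n_t_test(b, a)   # swapped order gives a negative t
t_ref, p_ref = stats.ttest_ind(b, a)   # scipy reference for the swapped order

print("t (a, b) ->", t_ab, " p ->", p_ab, " df ->", df_ab)
print("t (b, a) ->", t_ba, " p ->", p_ba)
print("scipy (b, a) ->", t_ref, p_ref)
```

Swapping the two samples flips the sign of t but should leave the two-sided p-value unchanged, which is why the sketch takes `abs(t)` before looking up the tail probability.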