{
"cells": [
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\n",
" \"mu\": 20,\n",
" \"std\": 3.021175268004159,\n",
" \"n\": 12\n",
"}\n"
]
}
],
"source": [
"import json\n",
"\n",
"import numpy as np\n",
"from scipy.stats import norm, t\n",
"\n",
"# Compute n, the sample mean and the sample standard deviation (ddof=1)\n",
"x_example1 = [21.5, 24.5, 18.5, 17.2, 14.5,\n",
" 23.2, 22.1, 20.5, 19.4, 18.1, 24.1, 18.5]\n",
"mu_example1 = 20  # hypothesized population mean\n",
"mean_example1 = np.mean(x_example1)\n",
"std_example1 = np.std(x_example1, ddof=1)\n",
"n_example1 = len(x_example1)\n",
"params_example1 = {\n",
" \"mu\": mu_example1,\n",
" \"std\": std_example1,\n",
" \"n\": n_example1\n",
"}\n",
"print(json.dumps(params_example1, indent=4))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"$$\n",
"\\mu = 20\n",
"$$\n",
"$$\n",
"{\\overline{x}} = 20.175\n",
"$$\n",
"$$\n",
"s = 3.021175\n",
"$$\n",
"$$\n",
"n = 12\n",
"$$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"With that information we can compute the t-statistic, which standardizes the distribution of the sample means (seen above in the CLT chapter)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"$$\n",
"t = \\frac{\\overline{x}-\\mu}{SE}\n",
"$$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here SE is the standard error of the mean, which we approximate by substituting the sample standard deviation $s$ for $\\sigma$, because we don't know the population standard deviation."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"$$\n",
"SE = \\frac{\\sigma}{\\sqrt{n}} \\approx \\frac{s}{\\sqrt{n}}\n",
"$$"
]
},
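{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick worked check, plugging in the sample standard deviation printed above ($s \\approx 3.021175$, computed with `ddof=1`) and $n = 12$ gives approximately:\n",
"$$\n",
"SE \\approx \\frac{3.021175}{\\sqrt{12}} \\approx 0.8721\n",
"$$"
]
},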
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we state the following hypotheses:<br>\n",
"H0 (null hypothesis): the sample is consistent with a population mean of $\\mu = 20$; the difference between ${\\overline{x}}$ and $\\mu$ is due to chance. <br>\n",
"H1 (alternative hypothesis): the population mean is greater than 20 (a one-sided test, since ${\\overline{x}} > \\mu$)."
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.2006562773994862\n"
]
}
],
"source": [
"t_value_example1 = (mean_example1-mu_example1)/(std_example1/np.sqrt(n_example1))\n",
"print(t_value_example1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"$$\n",
"t = \\frac{20.175-20}{\\frac{3.021175}{\\sqrt{12}}} \\approx 0.200656\n",
"$$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we compute the p-value: the probability, assuming H0 is true, of obtaining a t-statistic at least as large as the one observed. We could look it up in a Student's t table; here we compute it directly. We use Student's t distribution with $n - 1 = 11$ degrees of freedom because n < 30 (this is more accurate than using the normal distribution), but as we'll see, the results are very similar."
]
},
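{
"cell_type": "markdown",
"metadata": {},
"source": [
"Written as a formula (a sketch in our own notation, where $T_{n-1}$ denotes a Student's t random variable with $n-1$ degrees of freedom and $F_{n-1}$ its cumulative distribution function), the one-sided p-value is the upper-tail probability, which is what the survival function `sf` returns:\n",
"$$\n",
"p = P(T_{n-1} \\geq t) = 1 - F_{n-1}(t)\n",
"$$"
]
},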
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.4223145946526807"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Using Student's t distribution (one-sided p-value, upper tail)\n",
"t.sf(t_value_example1, df=n_example1-1)"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.42048367493849975"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Using the normal distribution (one-sided p-value, upper tail)\n",
"norm.sf(t_value_example1)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.8"
}
},
"nbformat": 4,
"nbformat_minor": 4
}