Skip to content

Instantly share code, notes, and snippets.

@ashnair1
Last active October 29, 2021 11:55
Show Gist options
  • Save ashnair1/433ffbc1e747f80067f8a0439e346279 to your computer and use it in GitHub Desktop.
Save ashnair1/433ffbc1e747f80067f8a0439e346279 to your computer and use it in GitHub Desktop.
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Derivation of 1D Gaussian decision boundary threshold"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Consider a two class classification scenario. At the decision boundary, the posterior probability of classifying a data point into two classes will be equal i.e. $p(y=1|x) = p(y=2|x)$ \n",
"\n",
"Posterior definition\n",
"\n",
"$p(y=1|x) = \\large\\frac{p(x|y=1) * P(y=1)}{p(x)}$\n",
"\n",
"$p(y=2|x) = \\large\\frac{p(x|y=2) * P(y=2)}{p(x)}$\n",
"\n",
"where likelihoods are Gaussian i.e.\n",
"\n",
"$p(x|y=1) = \\mathcal{N}(x|\\mu_1, \\sigma_1)$ \n",
"\n",
"$p(x|y=2) = \\mathcal{N}(x|\\mu_2, \\sigma_2)$ \n",
"\n",
"So at the decision boundary, $p(y=1|x^*) = p(y=2|x^*)$ where $x^*$ is the threshold"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"$\n",
"\\begin{align}\n",
"&p(y=1|x^*) = p(y=2|x^*) \\\\\n",
"&\\Longrightarrow\\frac{1}{\\sqrt{2\\pi\\sigma_1^2}}\\exp(-\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2}) * P(y=1) = \\frac{1}{\\sqrt{2\\pi\\sigma_2^2}}\\exp(-\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2})* P(y=2)\\\\\n",
"\\end{align}\n",
"$\n",
"\n",
"Taking log on both sides,\n",
"\n",
"$\n",
"\\begin{align}\n",
"\\Rightarrow & \\small-\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2} -\\log\\sqrt{2\\pi}\\sigma_1 + \\log P(y=1)= -\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2} -\\log\\sqrt{2\\pi}\\sigma_2 + \\log P(y=2)\\\\\n",
"\\end{align}\n",
"$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"\n",
"Rearranging the above equation a bit we get,$$ \n",
"\\begin{align} \n",
"\\small\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2} -\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2}+\\log\\sqrt{2\\pi}\\sigma_1 -\\log\\sqrt{2\\pi}\\sigma_2 + \\log P(y=2) -\\log P(y=1)=0\\\\ \n",
"\\end{align} \n",
"$$\n",
"\n",
"Group the log terms,\n",
"\n",
"$$ \n",
"\\begin{align} \n",
"\\rightarrow \\small\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2} -\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2} + \\log \\frac{\\sigma_1}{\\sigma_2} + \\log \\frac{P(y=2)}{P(y=1)} = 0\\\\ \n",
"\\rightarrow \\small\\frac{(x^* - \\mu_1)^2}{2\\sigma_1^2} -\\frac{(x^* - \\mu_2)^2}{2\\sigma_2^2} + \\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)} = 0 \n",
"\\end{align} \n",
"$$\n",
"\n",
"Let's now only consider the first two terms and for convenience set $\\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)} = k$ a constant.\n",
"Expanding the squares we get,\n",
"\n",
"$$ \n",
"\\begin{align} \n",
"\\frac{2(\\sigma_2^2 - \\sigma_1^2 )x^{*^2} -4(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)x^* + 2(\\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2)}{4\\sigma_1^2\\sigma_2^2} +k=0\\\\ \n",
"\\frac{0.5(\\sigma_2^2 - \\sigma_1^2 )x^{*^2} -(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)x^* + 0.5(\\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2)}{\\sigma_1^2\\sigma_2^2} +k=0\\\\ \n",
"0.5(\\sigma_2^2 - \\sigma_1^2 )x^{*^2} -(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)x^* + 0.5(\\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2) + k\\sigma_1^2\\sigma_2^2=0\\\\\n",
"\\end{align} \n",
"$$\n",
"\n",
"Multiply by 2,\n",
"$$ \n",
"\\begin{align}\n",
"(\\sigma_2^2 - \\sigma_1^2 )x^{*^2} -2(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)x^* + \\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2 + 2k\\sigma_1^2\\sigma_2^2=0\n",
"\\end{align}\n",
"$$\n",
"\n",
"For a quadratic equation of the form $ax^2 + bx + c = 0$, the discriminant ($\\Delta$) $= b^2 - 4ac$\n",
"\n",
"Calculate the individual terms from above equation, \n",
"$$ \n",
"\\begin{align} \n",
"b^2 =\\left(-2(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2)\\right) ^2= 4\\left(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2\\right) ^2\\\\ \n",
"4ac =4 (\\sigma_2^2 - \\sigma_1^2 ) \\left(\\sigma_2^2\\mu_1^2 - \\sigma_1^2\\mu_2^2+ 2k\\sigma_1^2\\sigma_2^2 \\right)\\\\ \n",
"\\end{align} \n",
"$$\n",
"\n",
"Calculate discriminant \n",
"$$ \n",
"\\begin{align} \n",
"b^2 - 4ac = 4\\left(\\left(\\sigma_1^2\\mu_2 - \\sigma_2^2\\mu_1\\right) ^2 - (\\sigma_1^2 - \\sigma_2^2 ) \\left(\\sigma_1^2\\mu_2^2-\\sigma_2^2\\mu_1^2 - 2k\\sigma_1^2\\sigma_2^2 \\right)\\right)\\\\ \n",
"= 4\\left(\\left(\\sigma_1^2\\mu_2 - \\sigma_2^2\\mu_1\\right) ^2 - \\left(\\sigma_1^4\\mu_2^2 - \\sigma_1^2\\sigma_2^2\\mu_1^2-2k\\sigma_1^4\\sigma_2^2 - \\sigma_1^2\\sigma_2^2\\mu_2^2 + \\sigma_2^4\\mu_1^2 + 2k\\sigma_1^2\\sigma_2^4\\right)\\right)\\\\\n",
"= 4\\left(\\left(\\sigma_1^4\\mu_2^2 -2\\sigma_1^2\\sigma_2^2\\mu_1\\mu_2 +\\sigma_2^4\\mu_1^2\\right) - \\left(\\sigma_1^4\\mu_2^2 - \\sigma_1^2\\sigma_2^2\\mu_1^2-2k\\sigma_1^4\\sigma_2^2 - \\sigma_1^2\\sigma_2^2\\mu_2^2 + \\sigma_2^4\\mu_1^2 + 2k\\sigma_1^2\\sigma_2^4\\right)\\right)\\\\\n",
"= 4\\sigma_1^2\\sigma_2^2\\left(\\mu_1^2 +\\mu_2^2 - 2\\mu_1\\mu_2 + 2k(\\sigma_1^2 - \\sigma_2^2)\\right)\\\\ \n",
"= 4\\sigma_1^2\\sigma_2^2\\left( \\left(\\mu_1 - \\mu_2\\right)^2 + 2k(\\sigma_1^2 - \\sigma_2^2)\\right)\\\\ \n",
"\\end{align} \n",
"$$\n",
"\n",
"So the discriminant $\\Delta =4\\sigma_1^2\\sigma_2^2\\left(\\left(\\mu_1-\\mu_2\\right)^2 + 2\\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)}(\\sigma_1^2 - \\sigma_2^2)\\right)$. \n",
"\n",
"\n",
"Recall formula for roots of quadratic equation $ax^2 + bx + c = 0$,\n",
"$$\n",
"\\begin{align}\n",
"x = \\frac{-b \\pm \\sqrt\\Delta}{2a}\n",
"\\end{align}\n",
"$$\n",
"\n",
"So threshold ($x^*$) can be computed as follows:\n",
"$$\n",
"\\begin{align}\n",
"x^* = \\frac{2(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2) \\pm \\sqrt{4\\sigma_1^2\\sigma_2^2\\left(\\left(\\mu_1-\\mu_2\\right)^2 + 2\\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)}(\\sigma_1^2 - \\sigma_2^2)\\right)}}{2(\\sigma_2^2 - \\sigma_1^2 )}\\\\\n",
"x^*= \\frac{(\\sigma_2^2\\mu_1 - \\sigma_1^2\\mu_2) \\pm \\sqrt{\\sigma_1^2\\sigma_2^2\\left(\\left(\\mu_1-\\mu_2\\right)^2 + 0.5\\log \\frac{\\sigma_1P(y=2)}{\\sigma_2P(y=1)}(\\sigma_1^2 - \\sigma_2^2)\\right)}}{(\\sigma_2^2 - \\sigma_1^2 )}\n",
"\\end{align}\n",
"$$\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment