@petered
Created December 12, 2017 23:41
2017-12-12 Kasper EM
$$
\newcommand{\pderiv}[2]{\frac{\partial #1}{\partial #2}}
\newcommand{\pderivsq}[2]{\frac{\partial^2 #1}{\partial #2^2}}
\newcommand{\lderiv}[1]{\frac{\partial \mathcal L}{\partial #1}}
\newcommand{\pderivgiven}[3]{\left.\frac{\partial #1}{\partial #2}\right|_{#3}}
\newcommand{\norm}[1]{\frac12\| #1 \|_2^2}
\newcommand{\argmax}[1]{\underset{#1}{\operatorname{argmax}}}
\newcommand{\argmin}[1]{\underset{#1}{\operatorname{argmin}}}
\newcommand{\blue}[1]{\color{blue}{#1}}
\newcommand{\red}[1]{\color{red}{#1}}
\newcommand{\numel}[1]{|#1|}
\newcommand{\switch}[3]{\begin{cases} #2 & \text{if } {#1} \\ #3 &\text{otherwise}\end{cases}}
\newcommand{\pderivdim}[4]{\overset{\big[#3 \times #4 \big]}{\frac{\partial #1}{\partial #2}}}
\newcommand{\softmax}{\operatorname{softmax}}
\newcommand{\Bern}{\operatorname{Bern}}
\newcommand{\Cat}{\operatorname{Cat}}
\newcommand{\sigm}{\operatorname{sigm}}
\newcommand{\logfrac}[2]{\log \left( \frac{#1}{#2} \right)}
$$
We've assumed the following graphical model:
![enter image description here](https://docs.google.com/drawings/d/e/2PACX-1vRal4bK4gu7zruAjhV3R0CvjpqDP9sbAUHGop1FojAeaJnZmedx6bwoBQY762f-MTnWuuOkpyCoG8DX/pub?w=186&h=213)
This graph tells us that we can factorize our distribution as:
\begin{align}
p(X, C, F, ID; \theta)=p(X|C, F;\theta) p(C|ID;\theta) p(ID;\theta) p(F;\theta)
\end{align}
(where $\theta$ summarizes all model parameters).
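
To make the factorization concrete, here is a minimal sketch that evaluates the joint from its four factors. The notes do not specify the form of any of these distributions, so the binary variables and every number in the tables below are hypothetical, chosen only for illustration.

```python
# Hypothetical toy tables: all four variables are assumed binary here.
# None of these numbers come from the notes.
p_id = {0: 0.5, 1: 0.5}                      # p(ID)
p_f = {0: 0.7, 1: 0.3}                       # p(F)
p_c_given_id = {                             # p(C | ID), indexed [id][c]
    0: {0: 0.9, 1: 0.1},
    1: {0: 0.2, 1: 0.8},
}
p_x_given_cf = {                             # p(X | C, F), indexed [(c, f)][x]
    (0, 0): {0: 0.8, 1: 0.2},
    (0, 1): {0: 0.6, 1: 0.4},
    (1, 0): {0: 0.3, 1: 0.7},
    (1, 1): {0: 0.1, 1: 0.9},
}


def joint(x, c, f, id_):
    """p(X, C, F, ID) = p(X|C,F) p(C|ID) p(ID) p(F)."""
    return p_x_given_cf[(c, f)][x] * p_c_given_id[id_][c] * p_id[id_] * p_f[f]


# Sanity check: a valid factorization sums to 1 over all configurations.
values = (0, 1)
total = sum(
    joint(x, c, f, i)
    for x in values for c in values for f in values for i in values
)
```

Because each factor is a normalized (conditional) distribution, `total` should come out as 1 up to floating-point error.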
Now, how do we do EM on such a model?
**E-Step**
Find "responsibilities": $p(C | X, F, ID; \theta_{old})$
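Using Bayes' rule and the dependencies in the graph, we can rewrite this in terms of factors that we can evaluate directly. For each value $c$ of $C$:
\begin{align}
p(C=c | X, F, ID; \theta_{old}) &= \frac{p(X, C=c, F, ID; \theta_{old})}{p(X, F, ID;\theta_{old})} \\
&= \frac{p(X, C=c, F, ID; \theta_{old})}{\sum_{c'=1}^{|C|}p(X, C=c', F, ID;\theta_{old})} \\
&= \frac{p(X|C=c, F;\theta_{old}) p(C=c|ID;\theta_{old}) p(ID;\theta_{old}) p(F;\theta_{old})}{\sum_{c'=1}^{|C|}p(X|C=c', F;\theta_{old}) p(C=c'|ID;\theta_{old}) p(ID;\theta_{old}) p(F;\theta_{old})} \\
&= \frac{p(X|C=c, F;\theta_{old}) p(C=c|ID;\theta_{old})}{\sum_{c'=1}^{|C|}p(X|C=c', F;\theta_{old}) p(C=c'|ID;\theta_{old})} \\
&:= \gamma(c)
\end{align}
The factors $p(ID;\theta_{old})$ and $p(F;\theta_{old})$ do not depend on $c$, so they cancel between numerator and denominator.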
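
As a rough sketch of the E-step for a single observation: the responsibilities only need the likelihood $p(X|C=c, F)$ and the class prior $p(C=c|ID)$ for each $c$, since the $p(ID)$ and $p(F)$ factors are the same for every $c$. The function name `responsibilities` and its two arguments are hypothetical; working in log space with a log-sum-exp is an implementation choice for numerical stability, not something the notes prescribe.

```python
import math


def responsibilities(log_lik, log_prior):
    """Return gamma(c) for one observation.

    log_lik[c]   = log p(x | C=c, f; theta_old)   (assumed precomputed)
    log_prior[c] = log p(C=c | id; theta_old)     (assumed precomputed)
    """
    # Unnormalized log posterior for each class c.
    scores = [ll + lp for ll, lp in zip(log_lik, log_prior)]
    # Log-sum-exp: subtract the max before exponentiating to avoid underflow.
    m = max(scores)
    log_norm = m + math.log(sum(math.exp(s - m) for s in scores))
    return [math.exp(s - log_norm) for s in scores]


# Two classes, equal prior, likelihoods 0.2 and 0.6 (made-up numbers).
gamma = responsibilities(
    [math.log(0.2), math.log(0.6)],
    [math.log(0.5), math.log(0.5)],
)
```

With equal priors the responsibilities are just the normalized likelihoods, so here `gamma` is approximately `[0.25, 0.75]`.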
**M-Step**
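Maximize the expected complete-data log-likelihood with respect to the parameters, where the expectation is taken under the responsibilities from the E-step:
\begin{align}
\theta_{new} &\leftarrow \argmax{\theta} \sum_{c=1}^{|C|} p(C=c | X, F, ID; \theta_{old}) \log p(X, C=c, F, ID; \theta) \\
&=\argmax{\theta} \sum_{c=1}^{|C|} \gamma(c) \log p(X, C=c, F, ID; \theta) \\
&=\argmax{\theta} \sum_{c=1}^{|C|} \gamma(c) \left[ \log p(X|C=c, F;\theta) + \log p(C=c|ID;\theta) + \log p(ID;\theta) + \log p(F;\theta) \right]
\end{align}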
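
The M-step only has a closed form once the factors are given a concrete parameterization, which these notes leave open. As one illustrative case, assume $N$ i.i.d. observations and assume $p(C|ID)$ is a free categorical table with one row per ID value. Maximizing the $\gamma$-weighted $\log p(C=c|ID)$ terms under the constraint that each row sums to 1 then gives the average responsibility per ID. The function `m_step_class_prior` below is a hypothetical sketch of that one update; the update for $p(X|C, F)$ depends on its form and is not covered.

```python
from collections import defaultdict


def m_step_class_prior(ids, gammas, num_classes):
    """Update p(C | ID) assuming it is a free categorical table.

    ids[n]    = ID value of observation n
    gammas[n] = responsibilities gamma_n(c) from the E-step (sums to 1)
    Returns {id: [p(C=0 | id), ..., p(C=num_classes-1 | id)]}.
    """
    totals = defaultdict(lambda: [0.0] * num_classes)
    counts = defaultdict(int)
    for i, g in zip(ids, gammas):
        counts[i] += 1
        for c in range(num_classes):
            totals[i][c] += g[c]
    # Each gamma_n sums to 1, so dividing by the count normalizes each row.
    return {i: [t / counts[i] for t in totals[i]] for i in totals}


# Three observations, two ID values, two classes (made-up responsibilities).
new_prior = m_step_class_prior(
    ["a", "a", "b"],
    [[0.25, 0.75], [0.75, 0.25], [1.0, 0.0]],
    num_classes=2,
)
```

Here `new_prior["a"]` is `[0.5, 0.5]` and `new_prior["b"]` is `[1.0, 0.0]`.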