in progress notes on variational GMM
{
"metadata": {
"name": "",
"signature": "sha256:f4c2596208087783232f2bf4dbd6fc6742fa724077af8d31dee04d115c68c9f2"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Working through Bishop's chapter on this. Going to go through the analytical derivation of the update rules and then show how it works numerically. Worth noting that this is actually [a practical algorithm][sklearn].\n",
"\n",
"Derivation of update rules\n",
"==========================\n",
"\n",
"Basically, I'm copying out a lot of Bishop, trying to fill in where I found it unclear.\n",
"\n",
"Defining the priors\n",
"-------------------\n",
"\n",
"Describing the model, we have the conditional distribution of __Z__:\n",
"\n",
"$$ p(\\pmb{Z|\\pi}) = \\prod_{n=1}^{N} \\prod_{k=1}^{K} \\pi_{k}^{z_{nk}} $$\n",
"\n",
"Conditional distribution of the observed data:\n",
"\n",
"$$ p(\\pmb{X|Z, \\mu, \\Lambda}) = \\prod_{n=1}^{N} \\prod_{k=1}^{K} \\mathcal{N} (x_{n}|\\pmb{\\mu_{k}, \\Lambda_{k}^{-1}})^{z_{nk}} $$\n",
"\n",
"Next, define priors over the parameters. Use a Dirichlet over the mixing coefficients (guess because you know they should be between 0 and 1 in multivariate space:\n",
"\n",
"$$ p(\\pi) = Dir(\\pmb{\\pi|\\alpha_{0}}) = C(\\pmb{\\alpha_{0}}) \\prod_{k=1}^{K} \\pi_{k}^{\\alpha_{0} -1} $$\n",
"\n",
"Where $ C(\\pmb{\\alpha_{0}}) $ is the normalisation constant for a Dirichlet.\n",
"\n",
"Joint prior Gaussian-Wishart for mean and precision:\n",
"\n",
"$$ p(\\pmb{\\mu,\\Lambda}) = p(\\pmb{\\mu|\\Lambda}) p(\\pmb{\\Lambda}) $$\n",
"\n",
"Variational Distribution\n",
"------------------------\n",
"\n",
"The full joint distribution over the random variables factorizes according to the graphical model:\n",
"\n",
"$$ p(\\pmb{X,Z,\\pi,\\mu,\\Lambda}) = p(\\pmb{X|Z,\\mu,\\Lambda}) p(\\pmb{Z|\\pi}) p(\\pmb{\\pi}) p(\\pmb{\\mu|\\Lambda} p(\\pmb{\\Lambda}) $$\n",
"\n",
"$\\newcommand{\\L}{\\pmb{\\Lambda}}$\n",
"$\\newcommand{\\Z}{\\pmb{Z}}$\n",
"$\\newcommand{\\pib}{\\pmb{\\pi}}$\n",
"$\\newcommand{\\mub}{\\pmb{\\mu}}$\n",
"\n",
"Then the variational distribution can be one that factorizes between latent variables and parameters:\n",
"\n",
"$$ q(\\Z,\\pib,\\mub,\\L) = q(\\Z)q(\\pib,\\mub,\\L) $$\n",
"\n",
"Bishop notes here:\n",
"\n",
"> It is remarkable that this is the _only_ assumption that we need to make in order to obtain a tractable practical solution to our Bayesian mixture model.\n",
"> In particular, the functional form of the factors $q(\\Z)$ and $q(\\Z)q(\\pib,\\mub,\\L)$ will be determined automatically by optimization of the variational distribution.\n",
"\n",
"So to continue we just use apply equation 10.9 (every example is solved like this in Bishop):\n",
"\n",
"$\\newcommand{\\X}{\\pmb{X}}$\n",
"$$ \\ln q_{j}^{\\star} (\\Z_{j}) = \\mathbb{E}_{i\\neq j} \\left[ \\ln p(\\X , \\Z) \\right] + const. $$\n",
"\n",
"In this case, our factorisation is just between $q(\\Z)$ and $q(\\pib,\\mub,\\L)$, so we just end up with:\n",
"\n",
"$$ \\ln q^{\\star} (\\Z) = \\mathbb{E}_{\\mub,\\pib,\\L} \\left[ \\ln p(\\X , \\Z, \\mub, \\pib, \\L) \\right] + const. $$\n",
"\n",
"Looking into that expectation, we have the joint distribution, which we already know how to decompose (above):\n",
"\n",
"$$ \\ln q^{\\star} (\\Z) = \\mathbb{E}_{\\pib} \\left[ \\ln p(\\Z|\\pib) \\right] + \\mathbb{E}_{\\mub,\\L} \\left[ \\ln p(\\X | \\Z,\\pib,\\L) \\right] + const. $$\n",
"\n",
"Some more things have been absorbed into that constant ($p(\\L)$, $p(\\mub|\\L)$ and $p(\\pib)$) because they don't depend on $\\Z$.\n",
"\n",
"Then, since we know those conditional distributions, we can just substitute them in:\n",
"\n",
"$$ \\ln q^{\\star} (\\Z) = \\mathbb{E}_{\\pib} \\left[ \\ln \\prod_{n=1}^{N} \\prod_{k=1}^{K} \\pi_{k}^{z_{nk}} \\right] + \\mathbb{E}_{\\mub,\\L} \\left[ \\ln \\prod_{n=1}^{N} \\prod_{k=1}^{K} \\mathcal{N} (x_{n}|\\pmb{\\mu_{k}, \\Lambda_{k}^{-1}})^{z_{nk}} \\right] + const. $$\n",
"\n",
"And then rearrange a bit:\n",
"\n",
"$$ \\ln q^{\\star} (\\Z) = \\mathbb{E}_{\\pib} \\left[ \\sum_{n=1}^{N} \\sum_{k=1}^{K} z_{nk} \\ln \\pi_{k} \\right] + \\mathbb{E}_{\\mub,\\L} \\left[ \\sum_{n=1}^{N} \\sum_{k=1}^{K} z_{nk} \\ln \\mathcal{N} (x_{n}|\\pmb{\\mu_{k}, \\Lambda_{k}^{-1}}) \\right] + const. $$\n",
"\n",
"$$ \\ln q^{\\star} (\\Z) = \\sum_{n=1}^{N} \\sum_{k=1}^{K} z_{nk} \\mathbb{E}_{\\pib} \\left[ \\ln \\pi_{k} \\right] + \\sum_{n=1}^{N} \\sum_{k=1}^{K} z_{nk} \\mathbb{E}_{\\mub_{k},\\L_{k}} \\left[ \\ln \\mathcal{N} (x_{n}|\\pmb{\\mu_{k}, \\Lambda_{k}^{-1}}) \\right] + const. $$\n",
"\n",
"Then we can take everything inside the expectations and define a new variable $\\rho_{nk}$:\n",
"\n",
"$$ \\ln \\rho_{nk} = \\mathbb{E}_{\\pib} \\left[ \\ln \\pi_{k} \\right] + \\mathbb{E}_{\\mub_{k},\\L_{k}} \\left[ \\ln \\mathcal{N} (x_{n}|\\pmb{\\mu_{k}, \\Lambda_{k}^{-1}}) \\right] $$\n",
"\n",
"$ \\newcommand{\\x}{\\pmb{x}} $\n",
"$$ \\ln \\rho_{nk} = \\mathbb{E}_{\\pib} \\left[ \\ln \\pi_{k} \\right] + \\frac{1}{2} \\mathbb{E}_{\\L_{k}} \\left[ \\ln |\\L_{k}| \\right] - \\frac{D}{2} \\ln(2\\pi) - \\frac{1}{2} \\mathbb{E}_{\\mub_{k},\\L_{k}} \\left[(\\x_{k} + \\mub_{k})^{T} \\L_{k} (\\x_{k} + \\mub_{k}) \\right] $$\n",
"\n",
"Then we can collapse the sums in a nice way and pretend the equation is much simpler than it really is:\n",
"\n",
"$$ \\ln q^{\\star} (\\Z) = \\sum_{n=1}^{N} \\sum_{k=1}^{K} z_{nk} \\ln \\rho_{nk} + const. $$ \n",
"\n",
"Now we can take the exponential of both sides:\n",
"\n",
"$$ q^{\\star} (\\Z) = \\left( \\prod_{n=1}^{N} \\prod_{k=1}^{K} \\rho_{nk}^{z_{nk}} \\right) \\times const. $$\n",
"\n",
"[sklearn]: http://scikit-learn.org/stable/modules/generated/sklearn.mixture.VBGMM.html#sklearn.mixture.VBGMM"
]
},
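{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a tiny numerical aside (my own sketch, not from Bishop): because each $\\pmb{z}_{n}$ is one-of-$K$ coded, the product $\\prod_{k} \\pi_{k}^{z_{nk}}$ just picks out the $\\pi_{k}$ of the assigned component, which is why the products over $k$ above collapse so conveniently."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"\n",
"# one-of-K coding: z_n has a single 1, so prod_k pi_k**z_nk selects one pi_k\n",
"pi = np.array([0.2, 0.5, 0.3])\n",
"z_n = np.array([0, 1, 0])   # x_n assigned to component 2\n",
"print(np.prod(pi ** z_n))   # equals pi[1] = 0.5"
],
"language": "python",
"metadata": {},
"outputs": []
},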
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Exercise 10.12\n",
"--------------\n",
"\n",
"> Starting from the joint distribution (10.41), and applying the general result (10.9), show that the optimal variational distribution $q^{\\star}(\\Z)$ over the latent variables for the Bayesian mixture of Gaussians is given by (10.48) by verifying the steps given in the text.\n",
"\n",
"So far we've verified every step above. The final step brings us to:\n",
"\n",
"$$ q^{\\star} (\\Z) = \\prod_{n=1}^{N} \\prod_{k=1}^{K} r_{nk}^{z_{nk}} $$\n",
"\n",
"Where $r_{nk}$ has been defined:\n",
"\n",
"$$ r_{nk} = \\frac{\\rho_{nk}}{\\sum_{j=1}^{K} \\rho_{nj} } $$\n",
"\n",
"To get to this answer from before the text advises:\n",
"\n",
"> Requiring that this distribution be normalized, and nothing that for each value of $n$ the quantities $z_{nk}$ are binary and sum to 1 over all values of $k$...\n",
"\n",
"We can go forward by name $const$ from above $c$:\n",
"\n",
"$$ q^{\\star} (\\Z) = c \\prod_{n=1}^{N} \\prod_{k=1}^{K} \\rho_{nk}^{z_{nk}} $$\n",
"\n",
"For it to normalise:\n",
"\n",
"$$ 1 = \\sum_{Z} c \\prod_{n=1}^{N} \\prod_{k=1}^{K} \\rho_{nk}^{z_{nk}} $$\n",
"\n",
"I'm willing to admit that I'm not entirely comfortable with this next bit, but it seems to be correct.\n",
"\n",
"Then, since each vector of $\\pmb{z}_{n}$ is binary, with a single value being one, at each data point only a single $\\rho_{nj}$; if $j$ is where this one is located then the equation becomes:\n",
"\n",
"$$ 1 = c \\prod_{n=1}^{N} \\sum_{j=1}^{K} \\rho_{nj} $$\n",
"\n",
"Solving for c:\n",
"\n",
"$$ c = \\frac{1}{\\prod_{n=1}^{N} \\sum_{j=1}^{K} \\rho_{nj}} $$\n",
"\n",
"Then we can substitute that back into the original equation:\n",
"\n",
"$$ q^{\\star} (\\Z) = \\frac{1}{\\prod_{n=1}^{N} \\sum_{j=1}^{K} \\rho_{nj}} \\prod_{n=1}^{N} \\prod_{k=1}^{K} \\rho_{nk}^{z_{nk}} $$\n",
"\n",
"And rearrange:\n",
"\n",
"$$ q^{\\star} (\\Z) = \\prod_{n=1}^{N} \\frac{1}{\\sum_{j=1}^{K} \\rho_{nj}} \\prod_{k=1}^{K} \\rho_{nk}^{z_{nk}} $$\n",
"\n",
"As $z_{nk}$ must sum to one over all values of k ($\\sum_{k}^{K}z_{nk} = 1$):\n",
"\n",
"$$ q^{\\star} (\\Z) = \\prod_{n=1}^{N} \\left(\\frac{1}{\\sum_{j=1}^{K} \\rho_{nj}} \\right)^{1} \\prod_{k=1}^{K} \\rho_{nk}^{z_{nk}} $$\n",
"\n",
"$$ q^{\\star} (\\Z) = \\prod_{n=1}^{N} \\left(\\frac{1}{\\sum_{j=1}^{K} \\rho_{nj}} \\right)^{\\sum_{k=1}^{K}z_{nk}} \\prod_{k=1}^{K} \\rho_{nk}^{z_{nk}} $$\n",
"\n",
"$$ q^{\\star} (\\Z) = \\prod_{n=1}^{N} \\left(\\prod_{k=1}^{K} \\left(\\frac{1}{\\sum_{j=1}^{K} \\rho_{nj}}\\right)^{z_{nk}}\\right) \\left(\\prod_{k=1}^{K} \\rho_{nk}^{z_{nk}} \\right) $$\n",
"\n",
"$$ q^{\\star} (\\Z) = \\prod_{n=1}^{N} \\prod_{k=1}^{K} \\left( \\frac{\\rho_{nk} }{\\sum_{j=1}^{K} \\rho_{nj}} \\right)^{z_{nk}} $$\n",
"\n",
"Then all we have to do is substitute in $r_{nk}$ as defined above:\n",
"\n",
"$$ q^{\\star} (\\Z) = \\prod_{n=1}^{N} \\prod_{k=1}^{K} r_{nk}^{z_{nk}} $$\n",
"\n",
"And we're done."
]
},
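{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick numerical sketch of this normalisation (my own, assuming nothing beyond numpy): given any matrix of $\\ln \\rho_{nk}$ values, exponentiating and dividing by the row sums gives responsibilities that sum to one over $k$. Subtracting the row-wise max first keeps the exponentials from overflowing, which matters later when we compute $\\ln \\rho_{nk}$ directly."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"\n",
"def normalise_rho(log_rho):\n",
"    \"\"\"r_nk = rho_nk / sum_j rho_nj, computed stably in log-space.\"\"\"\n",
"    log_rho = log_rho - log_rho.max(axis=1, keepdims=True)\n",
"    rho = np.exp(log_rho)\n",
"    return rho / rho.sum(axis=1, keepdims=True)\n",
"\n",
"fake_log_rho = np.random.randn(5, 3)   # pretend ln rho_nk values, N=5, K=3\n",
"r = normalise_rho(fake_log_rho)\n",
"print(r.sum(axis=1))                   # each row sums to one"
],
"language": "python",
"metadata": {},
"outputs": []
},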
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Continuing\n",
"----------\n",
"\n",
"> Note that the optimal solution for the factor $q(\\Z)$ takes the same functional form as the prior $p(\\Z|\\pi)$.\n",
"\n",
"Finding the expectation of this:\n",
"\n",
"$$ \\mathbb{E}_{q^{\\star}} [z_{nk}] = \\sum_{Z} z_{nk} \\prod_{n=1}^{N} \\prod_{k=1}^{K} r_{nk}^{z_{nk}} $$\n",
"\n",
"The expectation selects a single element (again, _not_ comfortable with this):\n",
"\n",
"$$ \\mathbb{E}_{q^{\\star}} [z_{nk}] = r_{nk} $$\n",
"\n",
"Which is actually the expectation of any [categorical distribution][cat].\n",
"\n",
"> At this point, we shall find it convenient to define three statistics of the observed data set evaluated with respect to the responsibilities, given by\n",
"\n",
"$$ N_{k} = \\sum_{n=1}^{N} r_{nk} $$\n",
"\n",
"$$ \\bar{\\x}_{k} = \\frac{1}{N_{k}} \\sum_{n=1}^{N} r_{nk} \\x_{n} $$\n",
"\n",
"$$ \\pmb{S}_{k} = \\frac{1}{N_{k}} \\sum_{n=1}^{N} r_{nk} (\\x_{n} - \\bar{\\x}_k)(\\x_{n} - \\bar{\\x}_{k})^{T} $$\n",
"\n",
"Now we do the same application of equation 10.9 as above, but to the other part of the variational distribution:\n",
"\n",
"$$ \\ln q_{j}^{\\star} (\\Z_{j}) = \\mathbb{E}_{i\\neq j} \\left[ \\ln p(\\X , \\Z) \\right] + const. $$\n",
"\n",
"$$ \\ln q^{\\star} (\\pib, \\mub, \\L) = \\mathbb{E}_{\\Z} \\left[ \\ln p(\\X , \\Z, \\mub, \\pib, \\L) \\right] + const. $$\n",
"\n",
"Breaking the joint up into its conditional distributions and substituting yields:\n",
"\n",
"$$ \\ln q^{\\star} (\\pib, \\mub, \\L) = \\ln p(\\pib) + \\sum_{k=1}^{K} \\ln p(\\mub_{k}, \\L_{k}) + \\mathbb{E}_{Z}\\left[ \\ln p(\\Z|\\pib) \\right] + \\sum_{k=1}^{K} \\sum_{n=1}^{N} \\mathbb{E}[z_{nk}] \\ln \\mathcal{N} (\\x_{n}|\\mub_{k}, \\L_{k}^{-1}) + const. $$\n",
"\n",
"Observing there are two terms that involve only $\\pib$ and the rest contain only $\\mub_{k}$ and $\\L_{k}$ - and those are a sum of terms. This implies the variational distribution should factorise into:\n",
"\n",
"$$ \\ln q^{\\star} (\\pib, \\mub, \\L) = q(\\pib) \\prod_{k=1}^{K} q(\\mub_{k}, \\L_{k}) $$\n",
"\n",
"Splitting these up into their own equations we have, for $\\pib$:\n",
"\n",
"$$ \\ln q^{\\star}(\\pib) = (\\alpha_{0} -1) \\sum_{k=1}^{K} \\ln \\pi_{k} + \\sum_{k=1}^{K} \\sum_{n=1}^{N} r_{nk} \\ln_{\\pi_{k}} + const. $$\n",
"\n",
"Where $\\mathbb{E}[z_{nk}]$ has become $r_{nk}$.\n",
"\n",
"Then, we're supposed to look at that equation and think, \"hey! that kind of looks like a [Dirichlet distribution][d] someone's taken the log of!\" (alternatively, you could just assume it's going to be Dirichlet because the prior was Dirichlet). Anyway, what we do is take the exponential of it to reveal its inner Dirichlet-ness. However, first, I'm going to substitute in $N_{k}$ and rearrange:\n",
"\n",
"$$ \\ln q^{\\star}(\\pib) = \\sum_{k=1}^{K} (\\alpha_{0} -1) \\ln \\pi_{k} + \\sum_{k=1}^{K} N_{k} \\ln_{\\pi_{k}} + const. $$\n",
"\n",
"$$ \\ln q^{\\star}(\\pib) = \\sum_{k=1}^{K} \\ln \\pi_{k}(\\alpha_{0} -1 + N_{k}) + const. $$\n",
"\n",
"Then, taking the exponential:\n",
"\n",
"$$ q^{\\star}(\\pib) = e^{\\sum_{k=1}^{K} \\ln \\pi_{k}(\\alpha_{0} -1 + N_{k}) + const.} $$\n",
"\n",
"$$ q^{\\star}(\\pib) = e^{const.} \\prod_{k=1}^{K} e^{ \\ln \\pi_{k}(\\alpha_{0} -1 + N_{k})} $$\n",
"\n",
"$\\newcommand{\\ab}{\\pmb{\\alpha}}$\n",
"$$ q^{\\star}(\\pib) = \\frac{1}{B(\\ab)} \\prod_{k=1}^{K} e^{ \\ln \\pi_{k}(\\alpha_{0} -1 + N_{k})} $$\n",
"\n",
"$$ q^{\\star}(\\pib) = \\frac{1}{B(\\ab)} \\prod_{k=1}^{K} \\pi_{k}^{\\alpha_{0} -1 + N_{k}} = Dir(\\pib|\\ab) $$\n",
"\n",
"Where $\\ab$ has components given by:\n",
"\n",
"$$ \\alpha_{k} = \\alpha_{0} -1 + N_{k} $$\n",
"\n",
"We can break up the next part of the joint variational distribution using the product rule:\n",
"\n",
"$$ q^{\\star} (\\mub_{k},\\L_{k}) = q^{\\star}(\\mub_{k}|\\L_{k}) q^{\\star}(\\L_{k}) $$\n",
"\n",
"Expressing these factors in as a Gaussian-Wishart distribution is exercise 10.13.\n",
"\n",
"[cat]: https://en.wikipedia.org/wiki/Categorical_distribution\n",
"[d]: https://en.wikipedia.org/wiki/Dirichlet_distribution"
]
},
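{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here's a vectorised sketch (my own) of the three statistics and the resulting Dirichlet update, given a responsibility matrix `r` of shape $(N, K)$ and data `X` of shape $(N, D)$:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"\n",
"def statistics(X, r):\n",
"    \"\"\"N_k, xbar_k and S_k evaluated with respect to the responsibilities.\"\"\"\n",
"    N_k = r.sum(axis=0)                          # (K,)\n",
"    xbar = r.T.dot(X) / N_k[:, None]             # (K, D)\n",
"    S = []\n",
"    for k in range(r.shape[1]):\n",
"        diff = X - xbar[k]                       # (N, D)\n",
"        S.append(diff.T.dot(diff * r[:, k][:, None]) / N_k[k])\n",
"    return N_k, xbar, np.array(S)\n",
"\n",
"X = np.random.randn(100, 2)\n",
"r = np.random.dirichlet(np.ones(3), size=100)    # fake responsibilities\n",
"N_k, xbar, S = statistics(X, r)\n",
"alpha_0 = 1.0\n",
"alpha = alpha_0 + N_k                            # components of q*(pi) = Dir(pi|alpha)"
],
"language": "python",
"metadata": {},
"outputs": []
},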
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Exercise 10.13\n",
"--------------\n",
"\n",
"(I started working through this, but I don't have the time. The following is working through the answers, checking the steps used.)\n",
"\n",
"So we get the joint distibution by inspection as above with the $\\pib$ component.\n",
"\n",
"$ \\newcommand{\\N}{\\mathcal{N}} $\n",
"$ \\newcommand{\\m}{\\pmb{m}} $\n",
"$ \\newcommand{\\W}{\\mathcal{W}} $\n",
"$ \\newcommand{\\v}{\\mathcal{v}} $\n",
"$ \\newcommand{\\E}{\\mathbb{E}} $\n",
"$$ \\ln q^{\\star}(\\mub_{k},\\L_{k}) = \\ln \\N \\left( \\mub_{b}|\\m_{0},(\\beta_{0}\\L_{k})^{-1} \\right) + \\ln \\W (\\L_{k}|W_{0}, \\v_{0}) + \\sum_{n=1}^{N} \\E[z_{nk}] \\ln \\N (\\x_{n}| \\mub_{k}, \\L_{k}^{-1}) \\\\\n",
"= - \\frac{\\beta_{0}}{2} (\\mub_{k} - \\m_{0})^{T}\\L_{k}(\\mub_{k} - \\m_{0}) + \\frac{1}{2} \\ln |\\L_{k}| - \\frac{1}{2} Tr \\left( \\L_{k}W_{0}^{-1} \\right) + \\frac{v_{0} - D -1}{2} \\ln |\\L_{k}| - \\frac{1}{2} \\sum_{n=1}^{N} \\E [z_{nk}] (\\x_{n} - \\mub_{k})^{T} \\L_{k}(\\x_{n} - \\mub_{k}) + \\frac{1}{2} \\left( \\sum_{n=1}^{N} \\E [z_{nk}] \\right) \\ln |\\L_{k}| + const. $$\n",
"\n",
"\n",
"\n",
"Splitting this up as above, by considering only terms that depend on $\\mub_{k}$.\n",
"\n",
"$$ \\ln q^{\\star}(\\mub_{k}|\\L_{k}) = - \\frac{\\beta_{0}}{2} (\\mub_{k} - \\m_{0})^{T}\\L_{k}(\\mub_{k} - \\m_{0}) - \\frac{1}{2} \\sum_{n=1}^{N} \\E [z_{nk}] (\\x_{n} - \\mub_{k})^{T} \\L_{k}(\\x_{n} - \\mub_{k}) + const. $$\n",
"\n",
"Expanding, rearranging and absorbing more terms into the constant:\n",
"\n",
"$$ \\ln q^{\\star}(\\mub_{k}|\\L_{k}) = - \\frac{1}{2} \\mub_{k}^{T} \\left[ \\beta_{0} + \\sum_{n=1}^{N} \\E [z_{nk}] \\right] \\L_{k} \\mub_{k} + \\mub_{k}^{T}\\L_{k} \\left[ \\beta_{0} \\m_{0} + \\sum_{n=1}^{N} \\E[z_{nk}] \\x_{n} \\right] + const. $$\n",
"\n",
"Substituting in $N_{k}$ and $\\bar{\\x}_{n}$\n",
"\n",
"$$ \\ln q^{\\star}(\\mub_{k}|\\L_{k}) = - \\frac{1}{2} \\mub_{k}^{T} \\left[ \\beta_{0} + N_{k} \\right] \\L_{k} \\mub_{k} + \\mub_{k}^{T}\\L_{k} \\left[ \\beta_{0} \\m_{0} + N_{k} \\bar{\\x}_{n} \\right] + const. $$\n",
"\n",
"> Thus we see that $\\ln q^{\\star} (\\mub_{k}|\\L_{k})$ depends quadratically on $\\mub_{k}$ and hence $\\ln q^{\\star} (\\mub_{k}|\\L_{k})$ is a Gaussian distribution.\n",
"\n",
"Complete the square for the mean and precision:\n",
"\n",
"$$ \\ln q^{\\star} (\\mub_{k}|\\L_{k}) = \\N (\\mub_{k}| \\m_{k}, \\beta_{k}\\L_{k}) $$\n",
"\n",
"where:\n",
"\n",
"$$ \\beta_{k} = \\beta_{0} + N_{k} $$\n",
"\n",
"$$ \\m_{k} = \\frac{1}{\\beta_{k}} (\\beta_{0}\\m_{0} + N_{k} \\bar{\\x}_{k}) $$\n",
"\n",
"Using the product rule again, we can find $\\ln q^{\\star} (\\L_{k})$:\n",
"\n",
"$$ \\ln q^{\\star} (\\L_{k}) = \\ln q^{\\star} (\\mub_{k}, \\L_{k}) - \\ln q^{\\star} ( \\mub_{k}| \\L_{k}) $$\n",
"\n",
"We just found the second of these terms, and the first we stated earlier (the massive equation at the start of this exercise):\n",
"\n",
"$$ \\ln q^{\\star} (\\L_{k}) = - \\frac{\\beta_{0}}{2} (\\mub_{k} - \\m_{0})^{T}\\L_{k}(\\mub_{k} - \\m_{0}) + \\frac{1}{2} \\ln |\\L_{k}| - \\frac{1}{2} Tr \\left( \\L_{k}W_{0}^{-1} \\right) + \\frac{v_{0} - D -1}{2} \\ln |\\L_{k}| - \\frac{1}{2} \\sum_{n=1}^{N} \\E [z_{nk}] (\\x_{n} - \\mub_{k})^{T} \\L_{k}(\\x_{n} - \\mub_{k}) + \\frac{1}{2} \\left( \\sum_{n=1}^{N} \\E [z_{nk}] \\right) \\ln |\\L_{k}| + \\frac{\\beta_{k}}{2}(\\mub_{k} - \\m_{k})\\L_{k}(\\mub_{k} - \\m_{k}) - \\frac{1}{2} \\ln |\\L_{k}| + const. $$\n",
"\n",
"Rearrange this into the shape of Wishart distribution (I am not doing this):\n",
"\n",
"$ \\renewcommand{\\W}{\\pmb{W}} $\n",
"$$ \\ln q^{\\star} (\\L_{k}) = \\frac{(\\v_{k} - D - 1)}{2} \\ln | \\L_{k} | - \\frac{1}{2} Tr (\\L_{k} \\W_{k}^{-1}) + const. $$\n",
"\n",
"Defining $\\W_{k}^{-1}$ and $\\v_{k}$:\n",
"\n",
"$$ \\W_{k}^{-1} = \\W_{0}^{-1} + \\beta_{0} (\\mub_{k} - \\m_{0})(\\mub_{k} - \\m_{0})^{T} + \\sum_{n=1}^{N} \\E [z_{nk}] (\\x_{n} - \\mub_{k})(\\x_{n} - \\mub_{k})^{T} - \\beta_{k}(\\mub_{k} - \\m_{k})(\\mub_{k} - \\m_{k})^{T} $$\n",
"\n",
"$$ \\W_{k}^{-1} = \\W_{0}^{-1} + N_{k}\\pmb{S}_{k} + \\frac{\\beta_{0}N_{k}}{\\beta_{0} + N_{k}} (\\bar{\\x}_{k} - \\m_{0})(\\bar{\\x}_{k} - \\m_{0})^{T} $$\n",
"\n",
"$$ \\v_{k} = \\v_{0} + \\sum_{n=1}^{N} \\E[z_{nk}] \\\\ = \\v_{0} + N_{k} $$\n",
"\n",
"Making use of:\n",
"\n",
"$$ \\sum_{n=1}^{N} \\E [z_{nk}] \\x_{n} \\x_{n}^{T} = \\sum_{n=1}^{N} \\E[z_{nk}] (\\x_{n} - \\bar{\\x}_{k})(\\x_{n} - \\bar{\\x}_{k})^{T} + N_{k}\\bar{\\x}_{k}\\bar{\\x}_{k}^{T} \\\\ = N_{k} \\pmb{S}_{k} + N_{k} \\bar{\\x}_{k}\\bar{\\x}_{k}^{T} $$\n",
"\n",
"Now we can state the Wishart distribution we've found:\n",
"\n",
"$$ q^{\\star}(\\L_{k}) = \\mathcal{W}(\\L_{k}|\\W_{k},\\v_{k}) $$"
]
},
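{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a sketch of how these updates look in code (my own; `W_0` is the prior scale matrix, and `N_k`, `xbar_k`, `S_k` are the statistics defined earlier):"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"\n",
"def gauss_wishart_update(N_k, xbar_k, S_k, beta_0, m_0, W_0, v_0):\n",
"    \"\"\"Per-component Gaussian-Wishart updates from the equations above.\"\"\"\n",
"    beta_k = beta_0 + N_k\n",
"    m_k = (beta_0 * m_0 + N_k * xbar_k) / beta_k\n",
"    d = xbar_k - m_0\n",
"    W_k_inv = np.linalg.inv(W_0) + N_k * S_k \\\n",
"        + (beta_0 * N_k / (beta_0 + N_k)) * np.outer(d, d)\n",
"    v_k = v_0 + N_k\n",
"    return beta_k, m_k, np.linalg.inv(W_k_inv), v_k"
],
"language": "python",
"metadata": {},
"outputs": []
},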
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Exercise 10.14\n",
"--------------\n",
"\n",
"> Using the distribution (10.59)...\n",
"\n",
"$ \\newcommand{\\q}{q^{\\star}} $\n",
"$$ \\q(\\mub_{k},\\L_{k}) = \\q(\\mub_{k}|\\L_{k})\\q(\\L_{k}) \\\\ = \\N \\left( \\mub_{k}| \\m_{k}, (\\beta_{k}\\L_{k})^{-1} \\right) \\mathcal{W} (\\L_{k}| \\W_{k}, \\v_{k}) $$\n",
"\n",
"> ...verify the result (10.64):\n",
"\n",
"$$ \\E_{\\mub_{k},\\L_{k}} \\left[ (\\x_{n} - \\mub_{k})^{T} \\L_{k} (\\x_{n} - \\mub_{k}) \\right] \\\\ = D \\beta_{k}^{-1} + \\v_{k} (\\x_{n} - \\m_{k})^{T} \\W_{k} (\\x_{n} - \\m_{k}) $$\n",
"\n",
"With a bit of help from Matt Graham got this one figured out; start by expanding what you find in the expectations:\n",
"\n",
"$$ \\E_{\\mub_{k},\\L_{k}} \\left[ \\x_{n}^{T} \\L_{k} \\x_{n} - 2\\x_{n}^{T} \\L_{k} \\mub_{k} + \\mub_{k}^{T} \\L_{k} \\mub_{k} \\right] $$\n",
"\n",
"Then, we can apply the trace operator to do something a bit weird. As [wikipedia notes][trace] we can express the trace of a product of matrices as the sume of entry-wise products of elements. Luckily, this means we can also do the reverse:\n",
"\n",
"$$ \\mub_{k}^{T} \\L_{k} \\mub_{k} = \\sum_{i,j} \\mu_{k,i} \\Lambda_{k,ij} \\mu_{k,j} = \\sum_{i} \\left( \\sum_{j} (\\mu_{k,i} \\mu_{k,j}) \\Lambda_{k,ij} \\right) = Tr \\left( (\\mub_{k} \\mub_{k}^{T}) \\L_{k} \\right) = Tr \\left( \\L_{k} (\\mub_{k} \\mub_{k}^{T}) \\right) $$\n",
"\n",
"Substituting and moving in the expectation:\n",
"\n",
"$$ \\E_{\\L_{k}} \\left[ \\x_{n}^{T} \\L_{k} \\x_{n} - 2\\x_{n}^{T} \\L_{k} \\E_{\\mub_{k}}[\\mub_{k}] + Tr \\left( \\L_{k} \\E_{\\mub_{k}}[\\mub_{k} \\mub_{k}^{T}] \\right) \\right] $$\n",
"\n",
"Then, we can look at the Normally distributed part of the variational distribution we're using at this point and realise that the expectation of $\\mub_{k}$ with respect to this distribution must be equal to its mean, as it's a Normal distribution and its [first moment is equal to its mean][normal]. In other words:\n",
"\n",
"$$ \\E_{\\mub_{k}}[\\mub_{k}] = \\m_{k} $$\n",
"\n",
"Then, looking at the expectation inside the trace function and thinking about moments it's obvious that we're looking at the second moment under that normal distribution:\n",
"\n",
"$$ \\E_{\\mub_{k}}[\\mub_{k}\\mub_{k}^{T}] = (\\beta_{k} \\L_{k})^{-1} + m_{k} m_{k}^{T} $$\n",
"\n",
"Then we can just substitute and expand until we're happy with what we've got:\n",
"\n",
"$$ \\E_{\\L_{k}} \\left[ \\x_{n}^{T} \\L_{k} \\x_{n} - 2\\x_{n}^{T} \\L_{k} \\m_{k} + Tr \\left( \\L_{k} [ (\\beta_{k} \\L_{k})^{-1} + m_{k} m_{k}^{T} ] \\right) \\right] $$\n",
"\n",
"$$ \\E_{\\L_{k}} \\left[ \\x_{n}^{T} \\L_{k} \\x_{n} - 2\\x_{n}^{T} \\L_{k} \\m_{k} + Tr \\left( \\mathbf{I} \\beta_{k}^{-1} + \\L_{k} m_{k} m_{k}^{T} ] \\right) \\right] $$\n",
"\n",
"Then, we're able to do the same trick from above with the trace operator and bring the two terms outside:\n",
"\n",
"$$ \\E_{\\L_{k}} \\left[ \\x_{n}^{T} \\L_{k} \\x_{n} - 2\\x_{n}^{T} \\L_{k} \\m_{k} + D \\beta_{k}^{-1} + m_{k} \\L_{k} m_{k}^{T} \\right] $$\n",
"\n",
"Now, we can see that we can put together one of the terms in the solution we're hoping to get:\n",
"\n",
"$$ \\x_{n}^{T} \\L_{k} \\x_{n} - 2\\x_{n}^{T} \\L_{k} \\m_{k} + m_{k} \\L_{k} m_{k}^{T} = (\\x_{n}-\\m_{k})^{T}\\L_{k}(\\x_{n} - \\m_{k}) $$\n",
"\n",
"And the other does not depend on $\\L_{k}$ so we can just move the expectation in:\n",
"\n",
"$$ (\\x_{n}-\\m_{k})^{T} \\E_{\\L_{k}}\\left[ \\L_{k} \\right] (\\x_{n} - \\m_{k}) + D \\beta_{k}^{-1} $$\n",
"\n",
"Which we also know, as the expectation of a [Wishart distribution][wishart] will be it's mean, which is defined as $v_{k} \\mathbf{W}_{k}$:\n",
"\n",
"$$ v_{k} (\\x_{n}-\\m_{k})^{T} \\mathbf{W}_{k} (\\x_{n} - \\m_{k}) + D \\beta_{k}^{-1} $$\n",
"\n",
"Which is the answer.\n",
"\n",
"[wishart]: https://en.wikipedia.org/wiki/Wishart_distribution\n",
"[trace]: https://en.wikipedia.org/wiki/Trace_(linear_algebra)#Trace_of_a_product\n",
"[normal]: https://en.wikipedia.org/wiki/Normal_distribution#Moments"
]
},
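{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can also check (10.64) numerically with a Monte-Carlo estimate (my own sketch; it assumes a scipy recent enough to have `scipy.stats.wishart`): sample $\\L_{k} \\sim \\mathcal{W}(\\W_{k}, \\v_{k})$ and $\\mub_{k} \\sim \\N(\\m_{k}, (\\beta_{k}\\L_{k})^{-1})$, average the quadratic form, and compare against the closed form."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"from scipy.stats import wishart  # assumes scipy >= 0.16\n",
"\n",
"D, v_k, beta_k = 2, 5.0, 3.0\n",
"W_k = 0.5 * np.eye(D)\n",
"m_k = np.array([1.0, -1.0])\n",
"x_n = np.array([0.5, 0.5])\n",
"\n",
"samples = []\n",
"for _ in range(20000):\n",
"    Lam = wishart.rvs(df=v_k, scale=W_k)\n",
"    mu = np.random.multivariate_normal(m_k, np.linalg.inv(beta_k * Lam))\n",
"    d = x_n - mu\n",
"    samples.append(d.dot(Lam).dot(d))\n",
"\n",
"closed_form = D / beta_k + v_k * (x_n - m_k).dot(W_k).dot(x_n - m_k)\n",
"print(np.mean(samples))  # Monte-Carlo estimate\n",
"print(closed_form)       # should roughly agree"
],
"language": "python",
"metadata": {},
"outputs": []
},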
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Two more of these expectations gives us:\n",
"\n",
"$$ \\ln \\tilde{\\L}_{k} = \\E[\\ln |\\L_{k}|] = \\sum_{i=1}^{N} \\psi \\left( \\frac{\\v_{k} +1 -i}{2} \\right) + D \\ln 2 + \\ln | \\W_{k} | $$\n",
"\n",
"$$ \\ln \\tilde{\\pi_{k}} = \\E[\\ln \\pi_{k}] = \\psi(\\alpha_{k}) + \\psi(\\hat{\\alpha}) $$\n",
"\n",
"Where $\\psi$ is the digamma function and $\\hat{\\alpha} = \\sum_{k} \\alpha_{k}$.\n",
"\n",
"Apparently, now we can just sub in these three equations to the original equation for $\\ln \\rho_{nk}$\n",
"\n",
"$$ \\ln \\rho_{nk} = \\E [\\ln \\pi_{k} ] + \\frac{1}{2}\\E [\\ln |\\L_{k}|] - \\frac{D}{2} \\ln (2\\pi) - \\frac{1}{2} \\E_{\\mub_{k},\\L_{k}} \\left[ (\\x_{n} - \\mub_{k})^{T} \\L_{k} (\\x_{n} - \\mub_{k}) \\right] $$\n",
"\n",
"But, don't sub _too much_:\n",
"\n",
"$$ \\ln \\rho_{nk} = \\ln \\tilde{\\pi}_{k} + \\frac{1}{2} \\ln \\tilde{\\L} - \\frac{D}{2} \\ln (2\\pi) - \\frac{1}{2} \\left( D \\beta_{k}^{-1} + \\v_{k} (\\x_{n} - \\m_{k})^{T} \\W_{k} (\\x_{n} - \\m_{k} \\right) $$\n",
"\n",
"Then take the exponential:\n",
"\n",
"$$ \\rho_{nk} = \\tilde{\\pi}_{k} \\tilde{\\L}^{\\frac{1}{2}} (2\\pi)^{- \\frac{D}{2}} \\exp \\left(-\\frac{D}{2} \\beta_{k}^{-1} - \\frac{\\v_{k}}{2} (\\x_{n} - \\m_{k})^{T} \\W_{k} (\\x_{n} - \\m_{k} ) \\right) $$\n",
"\n",
"Then we can easily calculate $r_{nk}$ with:\n",
"\n",
"$$ r_{nk} = \\frac{\\rho_{nk}}{\\sum_{j=1}^{K} \\rho_{nj}} $$\n",
"\n",
"So, finally, we can learn the variational distribution of our model from the data by iteratively updating $r_{nk}$, fixing these responsibilities and updating the variational distributions from their definitions above. Then we just calculate $r_{nk}$ again and keep going round til we're satisfied.\n",
"\n",
"Time for the numerical example, but Bishop provides no code so we're going to have to make our own."
]
},
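{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before building anything, here's $\\ln \\rho_{nk}$ for a single data point and component, straight from the equations above (a sketch of mine; `digamma` is scipy's $\\psi$, and all the hyperparameters are assumed to have been updated already):"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"from scipy.special import digamma\n",
"\n",
"def log_rho_nk(x_n, alpha_k, alpha_hat, beta_k, m_k, W_k, v_k):\n",
"    D = len(x_n)\n",
"    E_log_pi = digamma(alpha_k) - digamma(alpha_hat)\n",
"    i = np.arange(1, D + 1)\n",
"    E_log_det = digamma((v_k + 1 - i) / 2.).sum() + D * np.log(2) \\\n",
"        + np.log(np.linalg.det(W_k))\n",
"    d = x_n - m_k\n",
"    E_quad = D / beta_k + v_k * d.dot(W_k).dot(d)\n",
"    return E_log_pi + 0.5 * E_log_det - 0.5 * D * np.log(2 * np.pi) - 0.5 * E_quad"
],
"language": "python",
"metadata": {},
"outputs": []
},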
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Numerical Example\n",
"=================\n",
"\n",
"We're going to be working with the old faithful data, so first we'd better make sure we have that:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"!wget http://www.stat.cmu.edu/~larry/all-of-statistics/=data/faithful.dat"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"--2015-01-20 13:08:07-- http://www.stat.cmu.edu/~larry/all-of-statistics/=data/faithful.dat\r\n"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Resolving www.stat.cmu.edu (www.stat.cmu.edu)... "
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"128.2.12.64\r\n",
"Connecting to www.stat.cmu.edu (www.stat.cmu.edu)|128.2.12.64|:80... "
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"connected.\r\n",
"HTTP request sent, awaiting response... "
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"200 OK\r\n",
"Length: 6608 (6.5K) [text/plain]\r\n",
"Saving to: \u2018faithful.dat\u2019\r\n",
"\r\n",
"\r",
"faithful.dat 0%[ ] 0 --.-KB/s \r",
"faithful.dat 100%[=====================>] 6.45K --.-KB/s in 0s \r\n",
"\r\n",
"2015-01-20 13:08:07 (828 MB/s) - \u2018faithful.dat\u2019 saved [6608/6608]\r\n",
"\r\n"
]
}
],
"prompt_number": 6
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Unfortunately, they've shoved a massive header into the file for some reason:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"!head -n 30 faithful.dat"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Old Faithful Geyser Data\r\n",
"\r\n",
"Description: (From R manual):\r\n",
"\r\n",
" Waiting time between eruptions and the duration of the eruption\r\n",
" for the Old Faithful geyser in Yellowstone National Park, Wyoming,\r\n",
" USA.\r\n",
"\r\n",
" A data frame with 272 observations on 2 variables.\r\n",
"\r\n",
"eruptions numeric Eruption time in mins\r\n",
"waiting numeric Waiting time to next eruption\r\n",
"\r\n",
"References:\r\n",
"\r\n",
" Hardle, W. (1991) Smoothing Techniques with Implementation in S.\r\n",
" New York: Springer.\r\n",
"\r\n",
" Azzalini, A. and Bowman, A. W. (1990). A look at some data on the\r\n",
" Old Faithful geyser. Applied Statistics 39, 357-365.\r\n",
"\r\n",
"\r\n",
"\r\n",
"\r\n",
"\r\n",
" eruptions waiting\r\n",
"1 3.600 79\r\n",
"2 1.800 54\r\n",
"3 3.333 74\r\n",
"4 2.283 62\r\n"
]
}
],
"prompt_number": 7
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We have to cut that part off before we have a useful csv:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"!tail -n+26 faithful.dat > faithful.csv"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 27
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then we can just use numpy to load the data into an array:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 28
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Skipping the first row because it's just labels."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"data = np.loadtxt(\"faithful.csv\",skiprows=1)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 36
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%matplotlib inline\n",
"import matplotlib.pyplot as plt"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 38
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"plt.scatter(data[:,1],data[:,2])"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 41,
"text": [
"<matplotlib.collections.PathCollection at 0x7f0968ff2518>"
]
},
{
"metadata": {},
"output_type": "display_data",
"png": "iVBORw0KGgoAAAANSUhEUgAAAXkAAAEACAYAAABWLgY0AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJztnX+QHGd55z/Psh5nba+92pWQBf4Rsz6HkDj2riuOHDm1\nC8Vqc1SiYOuOy1WRTLhcdNyPgsBAJKJwqIpRME5MSK6gfIQD9pIUMXdEKVFHtBI+NolyB4kdcBxA\nOBQ2FxIsI8sOEJTCSp77o3t350fPTE//mp7Z76eqSz3d/b7v0++Onn7n+z79vObuCCGEGE3GBm2A\nEEKI/JCTF0KIEUZOXgghRhg5eSGEGGHk5IUQYoSRkxdCiBGmq5M3sw+Y2Vkze7Th2LSZnTKzx8zs\npJlNNZx7i5n9lZmdMbO9eRouhBCiN71G8h8EfrTl2CHglLvfBDwYfsbMXgL8K+AlYZn3mpl+KQgh\nxADp6oTd/Y+BZ1oO7wNWwv0V4JXh/k8AH3b359z9CeBLwO3ZmSqEEKJfkoy0d7r72XD/LLAz3H8B\n8NWG674KvDCFbUIIIVKSSk7xICdCt7wIypkghBADZDxBmbNmdrW7P2lmu4CnwuN/A1zbcN014bEm\nzEyOXwghEuDu1m+ZJCP540A13K8Cv99w/CfNrGJmNwD/DPjTDoaWbnvb2942cBtkk2zainbJpnhb\nUrqO5M3sw8ACsN3M/hr4z8A9wEfM7GeBJ4BXhY7782b2EeDzwEXgP3gay4QQQqSmq5N393/d4dTL\nO1z/y8AvpzVKCCFENiiOPWRxcXHQJrQhm+Ihm+JTRrtkU75Y0YqKmUnFEUKIPjEzvKCJVyGEEEOC\nnLwQQowwcvJCCDHCyMkLIcQIIycvhBAjjJy8EELEYHV1lb1797N3735WV1cHbU5sFEIphBA9WF1d\n5a67qly48E4AJiYOcuzYCsvLy4XZkDSEUk5eCCF6sHfvfk6d2sdm2q4VlpaOc/LkRwuzQXHyQggh\n2kiSalgIIbYUtdoBTp+ucuFC8Hli4iC12kr3QiVBco0QQsRgdXWV++57HxA4/SL1eJAmL4QQI400\neSGEEG3IyQshRophjWfPC8k1QoiRoQzx7HkhTV4IseUpQzx7XkiTF0II0YacvBBbmLLr1/3aV6sd\nYGLiILACrITx7Adyt7PMSK4RYotSdv06qX2DjmfPC2nyQoi+KLt+XXb7ikaavBCiVJRdCtoqKHeN\nEFuUPPOxtEotp09X+5aChjlfTJmQXCPEFiYv/TorqWVU9fUkJJVrNJIXYguzvLxcasdZdvuGAWny\nQogNstLRFcpYHiTXCCGA7EMqJbVkS+EhlGb2euDfAgb8prv/uplNAw8A1wNPAK9y92dbysnJC1FC\nFLJYbgoNoTSz7ydw8D8I3AL8mJnNAoeAU+5+E/Bg+FkIIcSASKrJvxj4tLv/g7v/I/CHwH5gH4EI\nR/jvK9ObKIQogjQ6ejctf/3c/Pwi8/N3JtL7o+qPe2zL4+59bwRO/ovANHAZ8H+A3wCeabjGGj83\nHHchRDk5ceKELy3d7UtLd/uJEydil5mY2OnwIYcP+cTEzo2yredgu0Ot6Zok9dfr9VjH4rYxDIS+\ns39/naRQ0B7/BniIYBT/XuDXWp06cD6iXM5dIYQokqWlu0PH6uH2IV9aurvjObi76Zok9U9Pz8Y6\nFreNYSCpk08cJ+/uHwA+AGBmR4GvAmfN7Gp3f9LMdgFPRZU9cuTIxv7i4iKLi4tJzRBCiJFkbW2N\ntbW19BUleTIEDxWeH/57HfAF4CrgXuBgePwQcE9EuXwfd0IId08mvSRtp1KZctjtsNsrlalSyzVF\n9UvWMAC55o+AzwGfBV4aHpsGPgE8BpwEpiLK5d4ZQmx1uunkebRVqezYaKtS2dHU1rpTnZtb8Lm5\nPYmca5RjjnustZ5h1e2TOnm9DCXECFJkzPswxdcPk62tKNWwECITsghDfPjhR1KV72VDlqGSDz/8\nyGiHYCYZ/qfZkFwjRO4klSWSlGvX3a90qCWWQ3rZkEZy6TZHUPYQTIrW5JNucvJCFEOSCcZu4ZBx\n2grCGGupwhh72ZDUxkZbAzt3O5wYmhDMpE5eqYaFGFGKTNO73laged9cSJtJWV5e5rbbbgm1+S2Q\nNC3JkyHNhkbyYoQZ1vC8ddJGn3QKd+ynT/KUa3rZKblGTl6IjgxzeF4jaR9UjeWTOs44oZBpH6ZJ\nQjAHSVInrxBKITJimMPz8kJ9kh0KoRRCCNGGnLwQGVHmJe/6if/eTA18J/Pzi6lixsvQJ53u/ejR\no8zM3MjMzI0cPXq0UJsKJYnGk2ZDmrwYYcqo6fYzV7B5bS2MIU8/vzDIPul07/V6PYzn34ztr9fr\nhdrWL2jiVQgRRT9x5ZvXpotFLwud7r1TquIyk9TJS64RokAG99r8owSLt+0P9/Mhyf21llldXWV+\nfpGZmRuZn79zdNILDIokT4Y0GxrJiy3KoEIs+5Em0sg1WaREqFSmfHx8pintQGPq4n6RXCO5RojC\nSPs6flHtbqYG3uNzcwuxtfQk99deZndbHbA7VT91mhOo1+s+PT3r09OzpXfw7smdvNIaCCGaKDId\nQhF0up/Dhw9z+PDhAVhUMEmeDGk2NJIXW5Qs5Zp+3tbstdD23Nwen56e9bm5hVTyUb/3F7S94GaT\nDt/vsNvHxi51s6si5ZrWN2k7ReyUMcIpC5BcI0T5yep1/H7zrnR6KATL9m1q762rOuV1f520/0An\n3x86/Wmfnb15w8Fv3l+tSU9vfWiNQmqJKOTkhdgiRGnfSdLkBvW0a+DFzhO030twrNmW5nvurP0P\nat6jCJI6eYVQCiHEKJPkyZBmQyN5MSTkoe32G9HRSWbJIk1uN7kmy4iU9bpmZ1/ik5PX+fT0rFer\nVR8fvzxsu1GDj15Vqptc0ygx9Zp/GGatHsk1QmRHHtpuv7HZ/TqsJE4sauI1y9jyzbr2t5WFSxuc\n9pTPzt4aa0J1bm4hfEDsdtjdFkcf98E4bI5eTl6IDMlD2+33VfqyxdUnSQWwWVd7Wbimr7p62Zd1\nmbKR1MlLkxdCiFEmyZMhzYZG8mIIKLNck0ZbjlO2H7mmMcRx/S3Z2dmbfXLyOp+cvNZnZ1/ilcqO\nHnJNc129bO71t+kt19R8bGwm9XsBRYPkGiGypYwTr2kePv2mHO408To5ea3Dto0J0kplRziBuz4h\n2hj3Pu3j45f73NxCxMTrVb4eDx812drN5jQvfo2NbRtKbV5OXogtQBptOStdOqqeYBI0Ou4+KvdM\n3JTG/drc6/ph1uaTOnlp8kKUgE4pevNJTbxKkHL4fs6dOxvZTlS76yspffKTpyPq/BbwSJ92PBqW\nuT+0qcHCsP2HH34E+Bj9pUkuJq3y0JDkyZBmQyN5IZroJklkEQ/f2E5UXHxrnZvyy2Yb1Wq1QVNv\njlMP9i/rKNdEpQpu1/e3O9Qi77Exdr7XPEave
Y9hDqVEco0Qw0knCaHT8TRzBXNzC211RoVGBrLL\n5ufx8ee3XFPz8fHnh2Vf3HDuhMOL/XnP2+GTk9f63NyeSBs7pWZYv7d2e9pTHfTTl40M60tRSZ18\n4lTDZvYG4GcBJ/hN9BrgcuAB4HrgCeBV7v5sml8aQohm0qQC3r59JiMrbubKKye57bZbOHXqbxuO\nLwOHeNnLjnPy5Ef7qvG2225heXmZ++57X0Y2RjNqqZR7kuTJALwQ+DJwafj5AaAK3Av8QnjsIHBP\nRNm8H3hCDBX9yjVpRqH1er0tuqR/uWZTBomSgMymOo7gO93z2Ni2DVml9VzQ7ou98e3WuNE14+Mz\nfsUVu4ZmYZBuUKRcEzr5/wdsA8YJZkaWgDPAzvCaq4EzEWXz7w0hhoxuTqsxh3oaPbk5ve9uHxub\naXKsraGarfZ0Cv9cD02cnLzOzTZDK3vZFzxwZkJpqBaZtiFIYbCZ2yZqDqFTnPzs7K3hXEH8NAxl\nplAnH7TH64FvAk8BvxUee6bhvDV+bjiec1cIMZqkDf/LO3ww63DHTtfETaucJA1DmUnq5BNp8ma2\nDdgHfDfwd8D/MLNXt8hAbmYeVf7IkSMb+4uLiywuLiYxQwghRpa1tTXW1tbSV5TkyQD8S+D9DZ9/\nCngP8AXg6vDYLiTXCNFEPysnpcmk2G/5JHa1zg90m1uYm1sIM13uaZJ/GucHKpWptoXD04SRJsma\nWWYoWJO/HfhLYIJAllkB/iPBxOvB8JpDaOJViA3iOum0zjhJiuL+7Ypegi9K2w9y12zGw1cqUw2O\nOpgfMJv08fGZvmyO81BKkv++rBTq5IP2OBKO3B8NnfwlwDTwCeAx4CQwFVEu/94QooTE1ayL1t6T\n2ZX8XmB3hF4+mGUIh4mkTj5xnLy7HwkdfSPngZcnrVOIYWR1dXUjtrtWO5B7DPbq6ipvecs7+MpX\nvsr111/NO97x1p5tnjv3NHv37m+zcXV1NUwd8LcEAXHLbdcvLMzzh3/45+F1+8Ian+7XauB9YTvx\nXp351KceYu/e/Rv2Nvbzuk2N99Pr/JYlyZMhzYZG8mKE6Fcnz0KuiZI/usWKR8W9R2no66kFWq/f\nTCtQa9hvTl/QuARf670EIZDbm+qrVqstNk41yTWtywB2S3UQ5/wwvdnaCZTWQIji6VcaSTPx2qm9\nqCyPjeXn5vZE2tgpPDHq+s20ArWGFAcnfD3z5Nzcno73EpVKISo9Q9Dubg9Wkaq12dXZpt7nR0H6\nSerkE8s1Qoj+iftKfdpX7xvLr8sucbjttlt6XBGkMzh/HgJpZxlYYfv24x1LdEql0HqPgdSyDzgO\n3BzbZtGDJE+GNBsayYsRouishnHkmsZr10fyjVLIurQSN51Co/RRqUz57OzNYehj+5uz7u0RLVEp\nDKLSHmze2x6Hq5rsSibXNKdCSNLXZUpkhuQaIQZD0c6gU9x56zXNTu8yD1ZhanZ6cdMprD8sNh8w\nzcv5dVsicN3Rz80thGkMOq8CNTZ2aYMD3+0w5dVqtaNNUWGVUakQ+v27FP3wjoOcvBBigzTpeuPV\nGT0X0S2VQLxVm67pWD7Nvfd7v2VcQSqpk9fKUEIIMcLIyQsx5EQt1bewMM/Y2BsI3lNcAV4H3ACs\nMDFxkFrtQMeynZYcDOqsAXcQvOj+uo36K5U3c+7cWbZtu7TpOLyON77xNUAQrz4xcXDjXKMd6+fH\nx5/pWD4uzXa+qa2dOLTaOjb2Bs6dO5vhEowFkmT4n2ZDco0QmdE9t8vmxGi1Wo2VzqBTXpioydP1\nOufmFpri6sfHL/fJyesiUwn0mr8INP3vCmWba3x8/PKE6ZTb89Qn6dte8whFgjR5IbYeaVLx9lO2\nm0adpX5dtnTKZdLmkzp5yTVCCDHKJHkypNnQSF6IWCRdLSpuKt4ouaZarbrZFWH4YvNye91SLfQb\nbtjt3vqpa/1N2SCcdMHr9XrTewRJwid79ZHkGjl5IVIT90WlTuu+Js0PH2jr0w36+kysOvt5V6CX\n4+zH9tb1ZcfHZ3x8/PK2h1QayvJSlJy8ECNEJy04T404qDv/lL9Z3UMne4Nj+dk/KJI6eWnyQnSh\nUzjhVuLcufhphXv112Zq4/sJ0g8HPPzwI1u6j3MlyZMhzYZG8mJIGKQe26ntPJe025Q/phvq75wb\nJ67Nnc6vpzZOmha4k1wTlVZ5FEByjRDZMujwuSgtOLCp5kFagWA/S5tOnDjhV1yxK5Q87vYgnXC8\n+46XtqD5fJC2uJa4j1snXtfnLcqgoWdNUievVMNClJTO6YZvBn413F8BHs+0zTvuuINTp/YB1YY2\n8iFIW5w8rXCnPtrSK0G1kuTJkGZDI3kxJGQl12Q5ssxSQsoijLFXOGe3esfHZ3xiYofDNg/ecI0f\nDTOqo/VuILlGiOxJ60zy0PWzcHBZhDHGDeeMsn129lYP0h835n/fHyuuvUyx60UiJy9ECRm0rp+n\nXWnqiF6ubzZWHWXt07xJ6uQVQimEEKNMkidDmg2N5MUWoqzSQhZ2bYYw9v92aVQoKOyPnRKhjH2a\nNyi6Rojysby8zLFjK+Ei1VCrrZQi8iM7uy4BXhvuvzl2qcOHDwPwrne9neeee47nP/8aXvQij2VH\nWfu0rFjwgCiwQTMvuk0hRPbs3bu/LdRyaek4J09+dJBmjSxmhrtbv+WkyQu9up8z/fZv0r9Ht3L6\nG29hkmg8aTakyZeKrapvFkWS1LlJ/h5ZpwLO495EOlAIpUjCVg1HK4p++zfp36OolZta2YovJQ2K\npE4+0cSrmX0P8LsNh14EvBX4beAB4HrgCeBV7v5s4p8ZQohS0zn1gigNSZ4MjRuBrv814FrgXuAX\nwuMHgXsirs/3cSf6Qj+586VIuabTikjdQh3XF6sOEnzt0d++xDAouQbYC/xxuH8G2BnuXw2cibg+\n354QfaOf3PnSb/8m+Xv0cuRRD4DW4/2kFRbFk9TJpw6hNLMPAA+5+3vN7Bl33xYeN+D8+ueG6z1t\nm0KIZrqFM3Y6B0Rkm7yfpaUXKAyyhCQNoUz1MpSZVYAfJ5BmmnB3N7NIb37kyJGN/cXFRRYXF9OY\nIYQQI8fa2hpra2vpK0oy/PdN6eUngBMNn88AV4f7u5BcIwpmq0pPSUIo6/W6m01JrhkSGIRcY2a/\nC/yBu6+En+8Fnnb3d5rZIWDK3Q+1lPE0bQrRidXVVe66q8qFC+8EYGLiIMeObZ1X3ldXVxte9T/Q\ndN+t54Cwr14N/AnwRWZnd/Ge97xry/TXsJFUrkns5M3scuArwA3u/s3w2DTwEeA6OoRQysmLvNBr\n9vFRXw0fhWvy7v73wPaWY+eBlyetUwghRLYod40YGWq1A0xMHCSIEllhYuLghjSRlFHN+ZJHX4ly\noiyUYqTopksnqWuUNf4s+0rkT+GafFLk5MWwIN1alAmlGt6CjKqUIITIDq0MNaS0SgmnT1dHSkoo\nA7Xa
AU6frnLhQvA50K1XBmuUEH0iuWZIkZRQDNKtRVkYSFoDIUYdpdIVw440+SFFIXDFE2cORPMk\nomxIrhliJCUUR5xwylEPuRSDRSGUQuRInDkQzZOIPFEIpRBCiDbk5IeEIrTeLNoYVU06zhyI5klE\nKUmSnzjNhvLJ900R67Bm0caorxcbJ1f9Vs1nL/KHQS3/1y/S5PunCK03izakSQuRH9LkhRBCtCEn\n3yeD0JyLSKGbRRtbSZMe5NzDqM57iJxIovGk2RhiTX6QmnOWWm+n+8iija2gSQ/6ezDK8x6iMyTU\n5OXk+2Bp6e7wP5eH24d8aenuQZvVN6NyH4NikP2nv93WJamTl1yTM8Py0/rhhx8ppX3D0n9ClJYk\nT4Y0G0M8ku/3p3JZf1q32gXbHWqlsW+dYek/yTWiCJBcUwz9aM5l/ml94sQJn56eddjtcKJ09rmX\nv/8GNfewFeY9RDtJnbxSDffJqKSeXV5e5rbbbgnj2of/fopmkN+DUfkOimKQJp8RUdpxkpDCvDTo\nNPYNShffSiGZQuRGkuF/mo0hl2ui6KaT9vPTOi+9NY19g9aAJU0IEYA0+cGRlXaclwadpt4y6+JC\nbCWSOnnJNUIIMcJo4jUDarUDnD5d5cKF4HOgHa8MrJ4s683LJiFEMSgLZUZktRRfXkv6palXywwK\nMXgKX/7PzKaA9wPfBzjwGuCvgAeA64EngFe5+7Mt5UbSyQshRJ4MItXwrwMfd/fvBX4AOAMcAk65\n+03Ag+HnLUua0MNOZXvVuX5+fv5ObrzxB5iZuZH5+UWlBBBiq5Jktha4CvhyxPEzwM5w/2rgTMQ1\nOcw7l480oYfdskR2q3PzfM3hyjBdQXBtpbJDIYhCDDEUGUIJ3Ap8Gvgg8OfAbwKXA880XGONnxuO\n594ZZSCPsMVedW6evztMV6DQRyFGhaROPml0zTgwD/wnd/8zM3s3LdKMu7uZRYrvR44c2dhfXFxk\ncXExoRlCCDGarK2tsba2lr6iJE8GAinm8YbPdwL/C/gCcHV4bBcjLtd0exvzxIkTXqlMhSPq3W52\nhdfr9Vh1zs7e6rAtlF2yk2vq9breHhViSKHoN16BPwJuCvePAPeG28Hw2CHgnohyefdFIcRxuOPj\nMw3pfKe9UpnqmZq4UtnRUGbKzSabHg5x0hDMzS242aTD9Q7XuNk2r1arSlErxBAzCCd/C/BnwCPA\n7xFMxk4DnwAeA04CUxHl8u+NAoivj2+eh91ddfEkZeLWE6QVlkYvxLCS1MknfuPV3R8BfjDi1MuT\n1imEECJjkjwZ0myMyEi+Va6pVHb43NyeDRmlXXq50s22+ezsrU3XtdYZlKmFWv60j49f3resEiUl\n1et1yTVCDDEoC2XxrOvjc3ML4SRre1z73NyCX3HFLjebanD4nZfbq9frTdcmjW+P0u6VtleI4SWp\nk1fumgzYu3d/uMJSNTyywtLScU6e/GjH83Ac2Nd0XZy6hBBbk0GkNRBCCFFy5OQzoNsydaurq5w7\nd5axsTdsnIc3ATdELmfXra6jR48yM3MjMzM3cvTo0cLuTwgxxCTReNJsjJAm30gnDXxzsrPmY2Mz\nXSdeu9VVr9fDF5w2J3LjvFwlhBgNkCZfPrLU12dmbuT8+bc21TU9/XaefvpLGVkrhCgz0uSHnEYp\n5md+5mcSpygeNtKkYxZCxCDJ8D/NxojKNVHETTfcLMXUmmSZ9TKjKNekSccsxFYDxcmXkzix6c0p\nBzqnS6jX6z49PevT07ND7+Dd06VjFmKrkdTJayHvnFleXs5sTdTDhw9z+PDhTOoSQmwNpMkXTJQG\n/cY3vgZ4HUHY5A0N+ytUKm9uC7OMW2+cc4OkW7golNduIYaKJMP/NBtbTK5ppNuyfuPjlztcE6Y8\nuMzX89D3Sk/crd5e58pAJzmr7HYLUTRIky8/8Zb161+n7qZtD6vuPax2C5EXSZ285BohhBhhNPGa\nAaurq7zlLW/nK195kuuvv4Z3vOMtANx33/v48pfP8NRT38L9OSYnJzH7JO73AFNUKmeo1X4XgNOn\nq1y4AJuafECgU690bb9WO9BQ/lHGxj7EuXPfz+rqasu53vWtrq5y333v26g3q0njfunXbiFEB5IM\n/9NsjJhcs7mW6+Z6quPjM+Gx/WFse63p/Hqq4cY0wo3adJK1WIO0xnt8bGxbpOYfp76y6eBKjSzE\nJkiTHwyBdry7TT8Ojq3Hv0ct63d35jpzWh1bOrgQ5SWpk5cmnxtfB54Ffgn4cqIaokII8wwrPHfu\n6VTlFfIoRAlJ8mRIszFiI/kouWZsbCoMg1yXZy5zaF8ZqtuqT0mW8Esjt0TdRz+rUpVN6hFi1EBy\nzeBY18Onp2c3lvtrl2euCCWcBYc9Drt9bm5PxzqjpJPm9AfRckpSHXuzvROhlNTdvjj2SuoRIjuS\nOnlF12RAa+qCmZkb264ZH7+MixdfS2Oq4O3bj+duS4Iawi0f+4QQBZPkyZBmY4hG8klHxdVq1Vsz\nRlar1Y5vu0a1Ua/X2yJlOsk1WUShpJVbJNcIkS9IrsmWpE5rs9x+D9IUTHu1Wt041+iMu6U5CI7X\nHHb72NjMRtbJuHUkvec0DwuFPAqRH3LyGZNUY+6nXLw0B8nqEEKMFkmdvEIohRBihJGT70CvNLhx\ny42NvYGFhXkgiCOfn7+TmZkbmZ9fZGFhPvLazTreBNzB2Fhto45+7FTcuhBCck0XkmrMwaTpTBgy\nWduYNI2KQ69Wq23Xri/3F5WiIK6dmggVYrQgoVxjQdn+MbMngG8A/wg85+63m9k08ABwPfAE8Cp3\nf7alnCdtc1jYu3c/p07tozFccnr67Zw/vwN4bcTxtzYdW1oKQhdb61haOs7Jkx9NbEM/5YUQ5cLM\ncHfrt1waucaBRXefc/fbw2OHgFPufhPwYPhZCCHEoEgy/A9H4o8DMy3HzgA7w/2rgTMR5XL5KVMm\nOqUkGBu7zGF6Q5qpVHZ0jX1Pk6Jgbm4hlIFqfcXUKwxSiHJC0SGUBFm3PgM8BPxceOyZhvPW+Lnh\neM5dUQ5anWW9Xo98QSrq2k51xG238eEwNrbN5+b2xHpwSMcXorwMwsnvCv/dAXwW+JFWpw6cjyiX\nb0+UlKi8M9PTs5m3k2YpQMXcC1Fekjr5xLlr3P1r4b9fN7NjwO3AWTO72t2fNLNdwFNRZY8cObKx\nv7i4yOLiYlIzhprnnvsO8/N3Nq0otZ53ptMKTWVZuUkIkS9ra2usra2lryjJkwG4DJgM9y8H/gTY\nC9wLHAyPHwLuiSib7+OupETlszG7tC2kMl66g85ySrdrJNcIMbxQZAilmd0AHAs/jgO/4+7vCEMo\nPwJcxxYOoYwiCGm8gWC+GoK1XB8A6sQNn+x0vDUssttov9cvAf1SEKKcJA2hTCTXuPvjwK0Rx88D\nL09S59bgZuBXw/0VAiefPd3SDfdKRZw+VbEQokwkfhkqcYMjO
pKPM0Let++n+M53fgWA8fEa8A9c\nvDjBuuOvVN7M8eO/BcBdd1W5cOGdAExMHOTYsZWuxzX6FmK0STqSTxxdk3RjBDX5uFp5kNZgt8Nu\nr1SmvF6vN60oFSd8Ms9Uw0KI8kLRaQ2SMooj+TgpBPJKM6D0BUJsDQaR1kAIIUTJkZPPgFrtQKix\nB+l+x8drbWmJa7UDVCo/D9wB3EGl8vOxUhfHaTtJSmRQKmIhtgJy8hnw0EMPcfHiBeB+4H4uXrzA\nQw89FHHlJQRZKF8b7qdneXmZY8cCiWZp6TjHjq3EmnhdXV3lrruqnDq1j1On9nHXXVU5eiFGEGny\nGTAzc2NbuuDp6bfz9NNf2rimbNp52ewRQnRHmnypeJRvfOObzM/fyQte8D1ccslOHnzwQeBjmbck\nyUUI0ZUkITlpNkYwhLI5w2Qt3K81pSwI9i9z2J9ZqGPadMQKvRRieEAhlIPl6NGjvOtdH+Qb3/gm\nFy/eCxwHmuUQuJ/x8S/z0pfemclLS2klF6UwEGJ4KDStgWjn8OHDHD58OHS8na+78srJ0ujeSmEg\nxOgjTT5jNkMabwDexHpoY7D/F2zbdumGfp5WT48Kn1xYmJdGL4TYQHJNDqzLIOfOneXJJ7/O17/+\nLJdc4jzLA1bWAAAFo0lEQVT33Le5ePE9QJCnBp7jO995N7CZh6bfkXWj5LKwMM/Ro/+lLbeNRutC\nDD9J5Ro5+YKI0s+DuPr/u/E5bQijwiKFGF0UQimEEKINTbwWRK12gNOnq1y4EHzelGuCVMFBOoKV\nTNvIok4hxHAjuaZAWkMWIfs88AqLFGI0kSYvhBAjjDR5IYQQbcjJlwzlohFCZInkmhKxnv5Xce5C\niFakyY8AinMXQnRCmrwQQog2FCdfIhTnLoTIGsk1JUNx7kKIKKTJCyHECCNNXgghRBty8kIIMcKk\ncvJm9jwz+4yZfSz8PG1mp8zsMTM7aWZT2ZgphBAiCWlH8q8HPg+si+yHgFPufhPwYPh5KFhbWxu0\nCW3IpnjIpviU0S7ZlC+JnbyZXQO8Ang/sD4ZsI9gNQzCf1+ZyroCKeMfVTbFQzbFp4x2yaZ8STOS\n/zXgzcA/NRzb6e5nw/2zwM4U9QshhEhJIidvZj8GPOXun2FzFN9EGCepWEkhhBggieLkzeyXgZ8C\nLgLfBVwJ/B7wg8Ciuz9pZruAT7r7i1vKyvELIUQCBvIylJktAG9y9x83s3uBp939nWZ2CJhy96GZ\nfBVCiFEjqzj59SfFPcCSmT0GvCz8LIQQYkAUntZACCFEceTyxquZfcDMzprZo12u+Q0z+ysze8TM\n5vKwo1+7zGzRzP4ufMHrM2b2SwXYdK2ZfdLMPmdmf2lmr+twXWH9FcemovvKzL7LzD5tZp8NbTrS\n4boi+6mnTYP4ToXtNr2oGHG+8P9/vewa0P+/J8zsL8L2/rTDNYX2VS+b+u4nd898A34EmAMe7XD+\nFcDHw/0fAj6Vhx0J7FoEjhdhS0ObVwO3hvtXAF8EvneQ/RXTpkH01WXhv+PAp4AfGvT3KoZNhfdT\n2O4bgd+JantQ//9i2DWI79TjwHSX84P4TvWyqa9+ymUk7+5/DDzT5ZKNl6bc/dPAlJnlHlMfwy7o\nEBKaF+7+pLt/Ntz/FvAF4AUtlxXaXzFtguL76tvhbgW4hOZ3NGAA36sYNkHB/dThRcVGBvL/L4Zd\ndDmeJ93aHEhf9bApzvkNBpWg7IXAXzd8/ipwzYBsacSBHw5/ln3czF5SZONm9t0EvzQ+3XJqYP3V\nxabC+8rMxszsswQv2p109z9ruaTwfoph0yC+U1EvKjYyqO9TL7sG0VcOfMLMHjKzn4s4P4i+6mVT\nX/00yJWhWp9EZZgB/nPgWnf/tpn9c+D3gZuKaNjMrgD+J/D6cPTcdknL59z7q4dNhfeVu/8TcKuZ\nXQUcM7Pvc/fPtZrdWmzANhXaT9bwoqKZLXa7tOVzrv0U065B/P/b4+5fM7MdwCkzOxP+4m+k6P97\nvWzqq58GNZL/G+Dahs/XhMcGirt/c/3nt7v/AXCJmU3n3a6ZXQJ8FPhtd//9iEsK769eNg2qr8L2\n/g74JPCjLacG9r3qZNMA+umHgX1m9jjwYeBlZvbfW64ZRD/1tGsQ3yl3/1r479eBY8DtLZcU3le9\nbOq3nwbl5I8DPw1gZruBZ30z583AMLOdZmbh/u0EIabnc27TgP8GfN7d393hskL7K45NRfeVmW23\nMHW1mU0ASwRzBY0U3U89bSq6n9z9F939Wne/AfhJ4H+7+0+3XFb4/784dg3gO3WZmU2G+5cDe4HW\nyLuiv1M9beq3n3KRa8zsw8ACsN3M/hp4G8GkFO7+X93942b2CjP7EvD3wGvysKNfu4B/Afx7M7sI\nfJvgy5g3e4BXA39hZp8Jj/0icN26XQPor542UXxf7QJWzOx5BIOTB8J++XfrNg2gn3raxGC+U404\nwID7KZZdFN9XOwkkNgh84e+4+8kB91VPm+izn/QylBBCjDBa/k8IIUYYOXkhhBhh5OSFEGKEkZMX\nQogRRk5eCCFGGDl5IYQYYeTkhRBihJGTF0KIEeb/A8n8HI9IcDPUAAAAAElFTkSuQmCC\n",
"text": [
"<matplotlib.figure.Figure at 0x7f0969074400>"
]
}
],
"prompt_number": 41
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Creating a Variational Mixture of Gaussians model\n",
"-------------------------------------------------\n",
"\n",
"To define the model we first need to store the parameters of the model as described above. These are the parameters of the prior distributions: $\\alpha_{0}$, $\\m_{0}$, $\\W_{0}$ and $\\v_{0}$.\n",
"\n",
"Choosing default values for these is a little tricky. Going for 1 for the alpha array, hoping that's not too strong. $\\m_{0}$ is a bit easier:\n",
"\n",
"> Typically we would choose $\\m_{0} = 0$ by symmetry.\n",
"\n",
"$\\W_{0}$ is hard. It's our prior belief about the covariance matrix of the distributions. I guess a good prior would be that the covariance matrix is diagonal. What does that require the Wishart matrix to be then?\n",
"\n",
"[Wikipedia][wishart] might be able to help. It suggests that you just multiply your guess for the covariance matrix by the other parameter of the Wishart. Will just make a string option to implement this.\n",
"\n",
"[wishart]: https://en.wikipedia.org/wiki/Wishart_distribution"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"class VariationalGaussianMixtureBase:\n",
" def __init__(self, K_components=1, alpha_0=1, m_0=0,\n",
" W_0=\"diagonal\", v_0=False):\n",
" \"\"\" Initialise parameters of prior\n",
" distribution \"\"\"\n",
" self.K_components\n",
" self.alpha = np.array([alpha_0]*K_components)\n",
" # need to know dimensionality of data to initialise these?\n",
" # will have to do it later\n",
" self.m_0 = m_0\n",
" self.W_0 = W_0\n",
" self.v_0 = v_0"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 42
},
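{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick check (my own) that the prior container initialises as expected:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"base = VariationalGaussianMixtureBase(K_components=3)\n",
"print(base.alpha)  # [1 1 1]"
],
"language": "python",
"metadata": {},
"outputs": []
},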
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next we need to define the components of our update algorithm. To do this, we're going to have to define two variational updates. One, we hold $\\mathbf{r}$ fixed and update the expectations and in two we hold the expectations fixed and update $\\rho_{nk}$ and hence $\\mathbf{r}$. However, there is an obvious problem in that; on the first iteration one of the things we hold fixed is going to be undefined.\n",
"\n",
"The only way round this is to set an arbitrary initial value for one of the two. It seems like this might as well be $\\mathbf{r}$ and it might as well be uniform. We have to initialise $\\mathbf{r}$ the first time we do this update, because otherwise we don't know what the dimensionality of our input is going to be:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"class VariationalGaussianMixtureUpdateExpectations:\n",
" def update_expectations(self,X):\n",
" # assuming array with rows as examples\n",
" N = X.shape[0]\n",
" # if we don't have any rs yet, better initialise the dictionary\n",
" if not self.r:\n",
" self.r = {}\n",
" for n in range(N):\n",
" self.r[n] = {}\n",
" for k in range(self.K_components):\n",
" self.r[n][k] = 1./self.K_components\n",
" # then we can call methods to update each part of the expectations\n",
" "
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"class VariationalGaussianMixtureUpdate_r:\n",
" def update_r(self,X):\n",
" \"\"\"Half of the variational update, updating responsibilities, expressing\n",
" r and rho as dictionaries and iterating over all N and K. Assuming X is the\n",
" input data.\"\"\"\n",
" # if we don't have any rs yet, better initialise the dictionary\n",
" if not self.r:\n",
" self.r = {}\n",
" N = X.shape[0]\n",
" # outer sum\n",
" for n in range(N):\n",
" # doing this here to cache\n",
" rho_nj_sum = sum([self.rho[n][j] for j in self.K_components])\n",
" # inner sum\n",
" for k in self.K_components:\n",
" self.r[n][k] = self.rho[n][k]/sum([self.rho[j] for j in self.K_components])"
],
"language": "python",
"metadata": {},
"outputs": []
},
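{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, a sketch (mine, not from Bishop) of how these pieces would fit together once `update_expectations` actually fills in `self.rho`: mix the classes together and alternate the two half-updates until the responsibilities stop changing."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"class VariationalGaussianMixture(VariationalGaussianMixtureBase,\n",
"                                 VariationalGaussianMixtureUpdateExpectations,\n",
"                                 VariationalGaussianMixtureUpdate_r):\n",
"    def fit(self, X, n_iterations=100):\n",
"        # alternate the two halves of the variational update\n",
"        for i in range(n_iterations):\n",
"            self.update_expectations(X)  # would update the rho_nk given fixed r\n",
"            self.update_r(X)             # update r given fixed expectations\n",
"        return self"
],
"language": "python",
"metadata": {},
"outputs": []
}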
],
"metadata": {}
}
]
}