Created October 18, 2021 22:22
{
"cells": [
{
"cell_type": "markdown",
"id": "1dc27452-fac9-4fdc-9aff-98ea437ff9f2",
"metadata": {},
"source": [
"# Musings on BAR and MBAR\n",
"Here are a few notes I am putting together after developing a somewhat deeper understanding of the motivation and derivation of the BAR/MBAR free energy and expectation estimators.\n",
"\n",
"Suppose we are conducting a free energy calculation with $K$ replicas of a system, indexed by $k \\in 1, ..., K$. Each replica configuration $x_k$ evolves via MCMC w.r.t. the target distribution $p_k(x) = q_k(x) c_{k}^{-1}$, where $- \\log q_k(x) = u_k(x) = u(x | \\lambda_k)$ and $\\lambda_k$ is a (scalar or vector) parameter that parameterizes the distribution.\n",
"\n",
"I want to answer the following question: **what is the optimal set of \"bridging distributions\" $u_k(x)$ that maximizes the expectation of $\\Delta f$ given a fixed sample size $N$?**\n",
"\n",
"One point I found remarkable: the $K$ implicit equations that determine the $c_k$s (the MBAR equations) follow immediately from taking the expectation $\\langle O_i(x)\\rangle_i$ w.r.t. the mixture distribution $p_m$, where $O_i(x)$ is unity.\n",
"\n",
"I am going to skip over the derivations that Michael Shirts put forward in [this](https://arxiv.org/pdf/1704.00891.pdf) paper and get right to the point I wanted to make.\n"
]
},
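{
"cell_type": "markdown",
"id": "mbar-implicit-eqs",
"metadata": {},
"source": [
"To make that remark concrete (a sketch in the notation above, not copied from the paper): write $\\langle 1 \\rangle_i = \\int \\frac{p_i(x)}{p_m(x)} p_m(x) dx = 1$ and replace the mixture expectation with a sample average over all $N$ pooled samples. This yields the $K$ coupled implicit equations\n",
"$$\n",
"c_i = \\sum_{n=1}^N \\frac{q_i(x_n)}{\\sum_{k=1}^K N_k c_k^{-1} q_k(x_n)}, \\qquad i = 1, ..., K,\n",
"$$\n",
"which are solved self-consistently for the $c_i$s (up to a single overall multiplicative constant)."
]
},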
{
"cell_type": "markdown",
"id": "21ac80a9-028b-45d1-b3be-524d793c9072",
"metadata": {},
"source": [
"After we have solved the implicit equations to yield the $c_i$s, we can write the expectation of any observable $O(x)$ w.r.t. $p_i$ by importance weighting it w.r.t. the mixture distribution $p_m$, as in eq. 17:"
]
},
{
"cell_type": "markdown",
"id": "bc6a61ac-d211-43f2-8a52-ef320473a9cb",
"metadata": {},
"source": [
"$$\n",
"\\langle O(x) \\rangle_i = \\int O(x) \\frac{p_i(x)}{p_m(x)} p_m(x) dx = \\langle O(x) \\frac{p_i(x)}{p_m(x)} \\rangle_m \\approx \\sum_{n=1}^N O(x_n) \\frac{c_i^{-1} q_i(x_n)}{\\sum_{k=1}^K N_k c_k^{-1} q_k(x_n)}\n",
"$$"
]
},
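{
"cell_type": "markdown",
"id": "toy-mbar-leadin",
"metadata": {},
"source": [
"As an illustrative sanity check of this estimator (a sketch under a toy setup of my own choosing, not code from the paper): take the $q_k$ to be unnormalized zero-mean Gaussians of increasing width, so that the $c_k$, and hence $\\log(c_K/c_1)$, are known in closed form."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "toy-mbar-demo",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Toy check of the mixture-weighted estimator above (illustrative; all names are mine).\n",
"# States: unnormalized zero-mean Gaussians q_k(x) = exp(-x^2 / (2 sigma_k^2)), so\n",
"# c_k = sigma_k * sqrt(2 pi) and log(c_K / c_1) = log(sigma_K / sigma_1) exactly.\n",
"rng = np.random.default_rng(0)\n",
"sigmas = np.array([1.0, 1.3, 1.7, 2.0])  # the lambda_k \"protocol\"\n",
"K, N_k = len(sigmas), 5000\n",
"x = np.concatenate([rng.normal(0.0, s, N_k) for s in sigmas])  # pooled samples\n",
"\n",
"q_kn = np.exp(-0.5 * x[None, :] ** 2 / sigmas[:, None] ** 2)  # shape (K, N)\n",
"\n",
"# Self-consistent iteration of c_i = sum_n q_i(x_n) / sum_k N_k c_k^{-1} q_k(x_n);\n",
"# the c_k are only determined up to one overall multiplicative constant.\n",
"c = np.ones(K)\n",
"for _ in range(500):\n",
"    denom = ((N_k / c)[:, None] * q_kn).sum(axis=0)  # sum_k N_k c_k^{-1} q_k(x_n)\n",
"    c = (q_kn / denom[None, :]).sum(axis=1)\n",
"\n",
"# Per-sample weights w_{i,n} = c_i^{-1} q_i(x_n) / denom_n; each row sums to ~1.\n",
"denom = ((N_k / c)[:, None] * q_kn).sum(axis=0)\n",
"w = q_kn / denom[None, :] / c[:, None]\n",
"\n",
"# <x^2>_i should recover sigma_i^2, and log(c_K / c_1) the exact log-ratio.\n",
"print((w * x[None, :] ** 2).sum(axis=1), sigmas ** 2)\n",
"print(np.log(c[-1] / c[0]), np.log(sigmas[-1] / sigmas[0]))"
]
},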
{
"cell_type": "markdown",
"id": "f789e1ff-aa02-4b2d-bea1-cb773334940b",
"metadata": {},
"source": [
"Now, let"
]
},
{
"cell_type": "markdown",
"id": "15596b02-0914-4754-b4f0-4cab73dbf593",
"metadata": {},
"source": [
"$$\n",
"O(x) = \\frac{q_K(x)}{q_1(x)}\n",
"$$"
]
},
{
"cell_type": "markdown",
"id": "3f77af55-dee1-4a52-923c-64246127daf0",
"metadata": {},
"source": [
"so that"
]
},
{
"cell_type": "markdown",
"id": "83c72919-855a-41e6-bf77-7684d9c9af55",
"metadata": {},
"source": [
"$$\n",
"\\log \\langle O(x) \\rangle_1 = \\log \\frac{c_K}{c_1} = \\log \\int q_K(x) \\frac{c_1^{-1}}{p_m(x)} p_m(x) dx = -\\log c_1 + \\log \\langle \\frac{q_K(x)}{p_m(x)} \\rangle_m\n",
"$$\n"
]
},
{
"cell_type": "markdown",
"id": "c19d6601-fd01-47fb-8ff7-2b43f6b2cb0c",
"metadata": {},
"source": [
"If we rewrite the LHS and the far RHS, we see that"
]
},
{
"cell_type": "markdown",
"id": "c290df1b-b8d0-43c7-a00d-14284f64ed96",
"metadata": {},
"source": [
"$$\n",
"\\log c_K = \\log \\langle \\frac{q_K(x)}{p_m(x)} \\rangle_m\n",
"$$"
]
},
{
"cell_type": "markdown",
"id": "3d1ec93a-0258-45d9-960c-e03fbec1e869",
"metadata": {},
"source": [
"By Jensen's inequality,\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "ed22643a-effb-44d7-a769-166a91cd7255",
"metadata": {},
"source": [
"$$\n",
"\\log \\langle \\frac{q_K(x)}{p_m(x)} \\rangle_m \\ge \\langle \\log \\frac{q_K(x)}{p_m(x)} \\rangle_m\n",
"$$"
]
},
{
"cell_type": "markdown",
"id": "519b6927-98c3-4f51-a8ec-9e5eb4e36c27",
"metadata": {},
"source": [
"This provides a variational lower bound on $\\log c_K$ that lets us perform stochastic gradient ascent on the bound with respect to the $\\lambda_k$ parameters defining the bridging distributions $p_k$."
]
},
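{
"cell_type": "markdown",
"id": "jensen-bound-leadin",
"metadata": {},
"source": [
"As a quick numerical illustration of this bound (a sketch in a toy Gaussian mixture of my own choosing, with exactly known $c_k$): estimate both sides from samples drawn from $p_m$."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "jensen-bound-demo",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Check log <q_K/p_m>_m >= <log(q_K/p_m)>_m on a toy mixture of zero-mean\n",
"# Gaussians (illustrative; variable names are mine, not from the paper).\n",
"rng = np.random.default_rng(1)\n",
"sigmas = np.array([1.0, 1.3, 1.7, 2.0])\n",
"N_k = 5000\n",
"x = np.concatenate([rng.normal(0.0, s, N_k) for s in sigmas])  # samples from p_m\n",
"\n",
"c = sigmas * np.sqrt(2.0 * np.pi)  # exact normalizers for this toy problem\n",
"q_kn = np.exp(-0.5 * x[None, :] ** 2 / sigmas[:, None] ** 2)\n",
"p_m = (q_kn / c[:, None]).mean(axis=0)  # equal-N_k mixture density at each x_n\n",
"\n",
"ratio = q_kn[-1] / p_m  # q_K(x_n) / p_m(x_n)\n",
"lhs = np.log(ratio.mean())  # approaches log c_K as N grows\n",
"rhs = np.log(ratio).mean()  # Jensen lower bound; differentiable in the lambdas\n",
"print(lhs, rhs, np.log(c[-1]))\n",
"assert rhs <= lhs"
]
},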
{
"cell_type": "markdown",
"id": "002e357d-22fe-4725-8912-0bd01d6b8f40",
"metadata": {},
"source": [
"I'll also mention that we can place an analogous bound on $\\log c_1$ by defining"
]
},
{
"cell_type": "markdown",
"id": "c5ebe6ea-4141-4d28-a67d-01367cc28412",
"metadata": {},
"source": [
"$$\n",
"O(x) = \\frac{q_1(x)}{q_K(x)}\n",
"$$"
]
},
{
"cell_type": "markdown",
"id": "b10c7225-9d2d-41db-8358-6a2588b1c3c6",
"metadata": {},
"source": [
"and taking an expectation w.r.t. $p_K$."
]
},
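{
"cell_type": "markdown",
"id": "logc1-bound-sketch",
"metadata": {},
"source": [
"Explicitly (filling in the same manipulation as before):\n",
"$$\n",
"\\log \\langle O(x) \\rangle_K = \\log \\frac{c_1}{c_K} = -\\log c_K + \\log \\langle \\frac{q_1(x)}{p_m(x)} \\rangle_m,\n",
"$$\n",
"so that $\\log c_1 = \\log \\langle \\frac{q_1(x)}{p_m(x)} \\rangle_m \\ge \\langle \\log \\frac{q_1(x)}{p_m(x)} \\rangle_m$ by Jensen's inequality, i.e. the analogous variational lower bound on $\\log c_1$."
]
},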
{
"cell_type": "markdown",
"id": "f74f07cc-5752-4859-8aac-f01ad4b2abe4",
"metadata": {},
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.7"
}
},
"nbformat": 4,
"nbformat_minor": 5
}