hamelin/minimal-transport.ipynb

## minimal-transport.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "c2b9aa89-4fc2-4205-8df4-041f129d7eea",
   "metadata": {},
   "source": [
    "# Linearly constrained optimization\n",
    "\n",
    "Consider the problem\n",
    "\n",
    "$$\n",
    "\\min_{T \\in \\Pi_{p,q}} \\left<D, T\\right>_F\n",
    "$$\n",
    "\n",
    "where $\\left<\\cdot, \\cdot\\right>_F$ is the Frobenius norm, and\n",
    "\n",
    "$$\n",
    "\\Pi_{p,q} = \\left\\{T \\in \\mathbb{R}_{+}^{m,n} \\right|\\left. T{\\mathbb{1}_n} = p\\text{ and }T^\\mathrm{t}\\mathbb{1}_m = q\\right\\}\n",
    "$$\n",
    "\n",
    "This problem encodes many questions related to minimal transport to make two distributions as close to each other as possible. Taking $d \\in \\mathbb{R}^{m\\times n}$ as the vectorization of matrix $D$ above, this problem is equivalent to\n",
    "\n",
    "$$\n",
    "\\begin{array}{ll}\n",
    "    \\min_{t \\in \\mathbb{R}^{m \\times n}} & f(t) \\stackrel{\\text{def}}{=} \\left<d, t\\right> \\\\\n",
    "    \\text{subject to}                    & \\sum_{j=1}^n t_{ij} = p \\text{ for }1 \\leq i \\leq m\\\\\n",
    "                                         & \\sum_{i=1}^m t_{ij} = q \\text{ for }1 \\leq j \\leq n\\\\\n",
    "                                         & t \\geq 0,\n",
    "\\end{array}\n",
    "$$\n",
    "\n",
    "where $\\left<\\cdot,\\cdot\\right>$ denotes the inner vector product. Constraints involved are linear equality constraints and bound constraints. The former can be handled by a change of variables. First, we assess that summations to $p$ can be rewritten as the matrix-vector product\n",
    "\n",
    "$$\n",
    "\\left[\n",
    "\\begin{array}{cccccccccccccc}\n",
    "    1      & 0      & 0      & \\ldots & 0      & 1      & 0      & 0      & \\ldots & 0      & \\ldots & 1      & 0      & 0      & \\ldots & 0 \\\\\n",
    "    0      & 1      & 0      & \\ldots & 0      & 0      & 1      & 0      & \\ldots & 0      & \\ldots & 0      & 1      & 0      & \\ldots & 0 \\\\\n",
    "    0      & 0      & 1      & \\ldots & 0      & 0      & 0      & 1      & \\ldots & 0      & \\ldots & 0      & 0      & 1      & \\ldots & 0 \\\\\n",
    "    \\vdots & \\vdots & \\vdots & \\ddots & \\vdots & \\vdots & \\vdots & \\vdots & \\ddots & \\vdots & \\ddots & \\vdots & \\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
    "    0      & 0      & 0      & \\ldots & 1      & 0      & 0      & 0      & \\ldots & 1      & \\ldots & 0      & 0      & 0      & \\ldots & 1\n",
    "\\end{array}\n",
    "\\right]t \\stackrel{\\text{def}}{=} \\mathbf{\\mathrm{G_1}}t = p\n",
    "$$\n",
    "\n",
    "Similarly, summations to $q$ can be rewritten as\n",
    "\n",
    "$$\n",
    "\\left[\n",
    "\\begin{array}{}\n",
    "    1      & 1      & \\ldots & 1      & 0      & 0      & \\ldots & 0      & \\ldots & 0      & 0      & \\ldots & 0 \\\\\n",
    "    0      & 0      & \\ldots & 0      & 1      & 1      & \\ldots & 1      & \\ldots & 0      & 0      & \\ldots & 0 \\\\\n",
    "    \\vdots & \\vdots & \\ddots & \\vdots & \\vdots & \\vdots & \\ddots & \\vdots & \\ddots & \\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
    "    0      & 0      & \\ldots & 0      & 0      & 0      & \\ldots & 0      & \\ldots & 1      & 1      & \\ldots & 1\n",
    "\\end{array}\n",
    "\\right]t \\stackrel{\\text{def}}{=} \\mathbf{\\mathrm{G_2}}t = q\n",
    "$$\n",
    "\n",
    "This enables the reformulation of the optimization problem as\n",
    "\n",
    "$$\n",
    "\\begin{array}{ll}\n",
    "    \\min_{t \\in \\mathbb{R}^{m \\times n}} & f(t) \\\\\n",
    "    \\text{subject to}                    & \\mathbf{\\mathrm{G}}t = h \\\\\n",
    "                                         & t \\geq 0\n",
    "\\end{array}\n",
    "$$\n",
    "\n",
    "where\n",
    "\n",
    "$$\\mathbf{\\mathrm{G}} = \\left[\\begin{array}{c} \\mathbf{\\mathrm{G_1}} \\\\ \\mathbf{\\mathrm{G_2}} \\end{array}\\right]\\text{ and } h = \\left[\\begin{array}{c} p \\\\ q \\end{array}\\right].$$\n",
    "\n",
    "The rank of matrix $\\mathbf{\\mathrm{G}}$ is less than full: in addition to having merely $m+n$ rows to its $mn$ columns, the sum of the rows of its component parts $\\mathbf{\\mathrm{A}}$ and $\\mathbf{\\mathrm{B}}$ is respectively the same. Thus, we can find a linear transformation $\\mathbf{\\mathrm{T}}$ such that $\\mathbf{\\mathrm{G}}' = \\mathbf{\\mathrm{TG}}$ is full-rank, and such that $\\mathbf{\\mathrm{G'}}t = \\mathbf{\\mathrm{T}}h \\stackrel{\\mathrm{def}}{=} h' \\Leftrightarrow \\mathbf{\\mathrm{G}}t = h$.\n",
    "\n",
    "Second, the first-order optimality conditions for a solution $t^*$ to this problem are\n",
    "\n",
    "$$\n",
    "\\nabla f(t^*) = 0 \\\\\n",
    "\\mathbf{\\mathrm{G'}} t^* = h' \\\\\n",
    "t^* \\geq 0.\n",
    "$$\n",
    "\n",
    "A *feasible point* algorithm to solve this problem presumes the generation of a sequence of iterates $t_1, t_2\\ldots$ converging to a stationary point $t^*$ such that every iterate satisfies $\\mathbf{\\mathrm{G}}t_i = h$. Given one solution $\\hat{t}$ to this system, our iterates $t_i$ must belong to set $\\ker(\\mathbf{\\mathrm{G'}}) + \\{\\hat{t}\\}$. Assuming one forms a matrix $\\mathbf{\\mathrm{C}}$ whose columns for the basis of $\\ker(\\mathbf{\\mathrm{G'}})$, we can effect the change of variables\n",
    "\n",
    "$$\n",
    "t = \\mathbf{\\mathrm{C}}v + \\hat{t},\n",
    "$$\n",
    "\n",
    "and rewrite the optimization problem as\n",
    "\n",
    "$$\n",
    "\\begin{array}{ll}\n",
    "    \\min_{v \\in \\mathbb{R}^{k}} & f(\\mathbf{\\mathrm{C}}v + \\hat{t}) \\\\\n",
    "    \\text{subject to}           & v \\geq 0,\n",
    "\\end{array}\n",
    "$$\n",
    "\n",
    "where $k = \\dim(\\ker(\\mathbf{\\mathrm{G'}}))$. Optimality conditions for this last version of the problem are stated as\n",
    "\n",
    "$$\n",
    "\\mathbf{\\mathrm{C}}^\\mathrm{t}\\nabla f(\\mathbf{\\mathrm{C}}v^* + \\hat{t}) = 0 \\\\\n",
    "v^* \\geq 0.\n",
    "$$\n",
    "\n",
    "<!-- Second, the optimality conditions that a solution $t^*$ to the problem satisfy are\n",
    "\n",
    "$$\n",
    "\\nabla f(t^*) = 0 \\\\\n",
    "\\mathbf{\\mathrm{G'}}t^* = h' \\\\\n",
    "t^* \\geq 0.\n",
    "$$\n",
    "\n",
    "The various components of the solution may live either on the boundary of the feasible set, or in its interior. As such, a useful representation of the solution involves the partitioning of its components in three subsets:\n",
    "\n",
    "1. The *non-basis components*, which are null; these correspond as a submatrix $\\mathbf{\\mathrm{N}}$ of $\\mathbf{\\mathrm{G'}}$.\n",
    "1. The *in-basis components*, expressed as a linear combination of the basis of an inversible submatrix $\\mathrm{\\mathbf{B}}$ of $\\mathbf{\\mathrm{G'}}$.\n",
    "1. The *out-basis components*, expressed as a linear combination of other columns of $\\mathbf{\\mathrm{G'}}$.\n",
    " -->\n",
    "<!-- \n",
    "\n",
    " -->"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "root (Python) *",
   "language": "python",
   "name": "conda-root-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
	{
	"cells": [
	{
	"cell_type": "markdown",
	"id": "c2b9aa89-4fc2-4205-8df4-041f129d7eea",
	"metadata": {},
	"source": [
	"# Linearly constrained optimization\n",
	"\n",
	"Consider the problem\n",
	"\n",
	"$$\n",
	"\\min_{T \\in \\Pi_{p,q}} \\left<D, T\\right>_F\n",
	"$$\n",
	"\n",
	"where $\\left<\\cdot, \\cdot\\right>_F$ is the Frobenius norm, and\n",
	"\n",
	"$$\n",
	"\\Pi_{p,q} = \\left\\{T \\in \\mathbb{R}_{+}^{m,n} \\right\|\\left. T{\\mathbb{1}_n} = p\\text{ and }T^\\mathrm{t}\\mathbb{1}_m = q\\right\\}\n",
	"$$\n",
	"\n",
	"This problem encodes many questions related to minimal transport to make two distributions as close to each other as possible. Taking $d \\in \\mathbb{R}^{m\\times n}$ as the vectorization of matrix $D$ above, this problem is equivalent to\n",
	"\n",
	"$$\n",
	"\\begin{array}{ll}\n",
	" \\min_{t \\in \\mathbb{R}^{m \\times n}} & f(t) \\stackrel{\\text{def}}{=} \\left<d, t\\right> \\\\\n",
	" \\text{subject to} & \\sum_{j=1}^n t_{ij} = p \\text{ for }1 \\leq i \\leq m\\\\\n",
	" & \\sum_{i=1}^m t_{ij} = q \\text{ for }1 \\leq j \\leq n\\\\\n",
	" & t \\geq 0,\n",
	"\\end{array}\n",
	"$$\n",
	"\n",
	"where $\\left<\\cdot,\\cdot\\right>$ denotes the inner vector product. Constraints involved are linear equality constraints and bound constraints. The former can be handled by a change of variables. First, we assess that summations to $p$ can be rewritten as the matrix-vector product\n",
	"\n",
	"$$\n",
	"\\left[\n",
	"\\begin{array}{cccccccccccccc}\n",
	" 1 & 0 & 0 & \\ldots & 0 & 1 & 0 & 0 & \\ldots & 0 & \\ldots & 1 & 0 & 0 & \\ldots & 0 \\\\\n",
	" 0 & 1 & 0 & \\ldots & 0 & 0 & 1 & 0 & \\ldots & 0 & \\ldots & 0 & 1 & 0 & \\ldots & 0 \\\\\n",
	" 0 & 0 & 1 & \\ldots & 0 & 0 & 0 & 1 & \\ldots & 0 & \\ldots & 0 & 0 & 1 & \\ldots & 0 \\\\\n",
	" \\vdots & \\vdots & \\vdots & \\ddots & \\vdots & \\vdots & \\vdots & \\vdots & \\ddots & \\vdots & \\ddots & \\vdots & \\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
	" 0 & 0 & 0 & \\ldots & 1 & 0 & 0 & 0 & \\ldots & 1 & \\ldots & 0 & 0 & 0 & \\ldots & 1\n",
	"\\end{array}\n",
	"\\right]t \\stackrel{\\text{def}}{=} \\mathbf{\\mathrm{G_1}}t = p\n",
	"$$\n",
	"\n",
	"Similarly, summations to $q$ can be rewritten as\n",
	"\n",
	"$$\n",
	"\\left[\n",
	"\\begin{array}{}\n",
	" 1 & 1 & \\ldots & 1 & 0 & 0 & \\ldots & 0 & \\ldots & 0 & 0 & \\ldots & 0 \\\\\n",
	" 0 & 0 & \\ldots & 0 & 1 & 1 & \\ldots & 1 & \\ldots & 0 & 0 & \\ldots & 0 \\\\\n",
	" \\vdots & \\vdots & \\ddots & \\vdots & \\vdots & \\vdots & \\ddots & \\vdots & \\ddots & \\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
	" 0 & 0 & \\ldots & 0 & 0 & 0 & \\ldots & 0 & \\ldots & 1 & 1 & \\ldots & 1\n",
	"\\end{array}\n",
	"\\right]t \\stackrel{\\text{def}}{=} \\mathbf{\\mathrm{G_2}}t = q\n",
	"$$\n",
	"\n",
	"This enables the reformulation of the optimization problem as\n",
	"\n",
	"$$\n",
	"\\begin{array}{ll}\n",
	" \\min_{t \\in \\mathbb{R}^{m \\times n}} & f(t) \\\\\n",
	" \\text{subject to} & \\mathbf{\\mathrm{G}}t = h \\\\\n",
	" & t \\geq 0\n",
	"\\end{array}\n",
	"$$\n",
	"\n",
	"where\n",
	"\n",
	"$$\\mathbf{\\mathrm{G}} = \\left[\\begin{array}{c} \\mathbf{\\mathrm{G_1}} \\\\ \\mathbf{\\mathrm{G_2}} \\end{array}\\right]\\text{ and } h = \\left[\\begin{array}{c} p \\\\ q \\end{array}\\right].$$\n",
	"\n",
	"The rank of matrix $\\mathbf{\\mathrm{G}}$ is less than full: in addition to having merely $m+n$ rows to its $mn$ columns, the sum of the rows of its component parts $\\mathbf{\\mathrm{A}}$ and $\\mathbf{\\mathrm{B}}$ is respectively the same. Thus, we can find a linear transformation $\\mathbf{\\mathrm{T}}$ such that $\\mathbf{\\mathrm{G}}' = \\mathbf{\\mathrm{TG}}$ is full-rank, and such that $\\mathbf{\\mathrm{G'}}t = \\mathbf{\\mathrm{T}}h \\stackrel{\\mathrm{def}}{=} h' \\Leftrightarrow \\mathbf{\\mathrm{G}}t = h$.\n",
	"\n",
	"Second, the first-order optimality conditions for a solution $t^*$ to this problem are\n",
	"\n",
	"$$\n",
	"\\nabla f(t^*) = 0 \\\\\n",
	"\\mathbf{\\mathrm{G'}} t^* = h' \\\\\n",
	"t^* \\geq 0.\n",
	"$$\n",
	"\n",
	"A feasible point algorithm to solve this problem presumes the generation of a sequence of iterates $t_1, t_2\\ldots$ converging to a stationary point $t^*$ such that every iterate satisfies $\\mathbf{\\mathrm{G}}t_i = h$. Given one solution $\\hat{t}$ to this system, our iterates $t_i$ must belong to set $\\ker(\\mathbf{\\mathrm{G'}}) + \\{\\hat{t}\\}$. Assuming one forms a matrix $\\mathbf{\\mathrm{C}}$ whose columns for the basis of $\\ker(\\mathbf{\\mathrm{G'}})$, we can effect the change of variables\n",
	"\n",
	"$$\n",
	"t = \\mathbf{\\mathrm{C}}v + \\hat{t},\n",
	"$$\n",
	"\n",
	"and rewrite the optimization problem as\n",
	"\n",
	"$$\n",
	"\\begin{array}{ll}\n",
	" \\min_{v \\in \\mathbb{R}^{k}} & f(\\mathbf{\\mathrm{C}}v + \\hat{t}) \\\\\n",
	" \\text{subject to} & v \\geq 0,\n",
	"\\end{array}\n",
	"$$\n",
	"\n",
	"where $k = \\dim(\\ker(\\mathbf{\\mathrm{G'}}))$. Optimality conditions for this last version of the problem are stated as\n",
	"\n",
	"$$\n",
	"\\mathbf{\\mathrm{C}}^\\mathrm{t}\\nabla f(\\mathbf{\\mathrm{C}}v^* + \\hat{t}) = 0 \\\\\n",
	"v^* \\geq 0.\n",
	"$$\n",
	"\n",
	"<!-- Second, the optimality conditions that a solution $t^*$ to the problem satisfy are\n",
	"\n",
	"$$\n",
	"\\nabla f(t^*) = 0 \\\\\n",
	"\\mathbf{\\mathrm{G'}}t^* = h' \\\\\n",
	"t^* \\geq 0.\n",
	"$$\n",
	"\n",
	"The various components of the solution may live either on the boundary of the feasible set, or in its interior. As such, a useful representation of the solution involves the partitioning of its components in three subsets:\n",
	"\n",
	"1. The non-basis components, which are null; these correspond as a submatrix $\\mathbf{\\mathrm{N}}$ of $\\mathbf{\\mathrm{G'}}$.\n",
	"1. The in-basis components, expressed as a linear combination of the basis of an inversible submatrix $\\mathrm{\\mathbf{B}}$ of $\\mathbf{\\mathrm{G'}}$.\n",
	"1. The out-basis components, expressed as a linear combination of other columns of $\\mathbf{\\mathrm{G'}}$.\n",
	" -->\n",
	"<!-- \n",
	"\n",
	" -->"
	]
	}
	],
	"metadata": {
	"kernelspec": {
	"display_name": "root (Python) *",
	"language": "python",
	"name": "conda-root-py"
	},
	"language_info": {
	"codemirror_mode": {
	"name": "ipython",
	"version": 3
	},
	"file_extension": ".py",
	"mimetype": "text/x-python",
	"name": "python",
	"nbconvert_exporter": "python",
	"pygments_lexer": "ipython3",
	"version": "3.8.11"
	}
	},
	"nbformat": 4,
	"nbformat_minor": 5
	}