Skip to content

Instantly share code, notes, and snippets.

@hamelin
Last active October 27, 2021 14:52
Show Gist options
  • Save hamelin/b510387996f0d66ec7c8660bc854bb98 to your computer and use it in GitHub Desktop.
Save hamelin/b510387996f0d66ec7c8660bc854bb98 to your computer and use it in GitHub Desktop.
A discussion of optimization with linear equality constrains applied to minimal transport problems
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"cell_type": "markdown",
"id": "c2b9aa89-4fc2-4205-8df4-041f129d7eea",
"metadata": {},
"source": [
"# Linearly constrained optimization\n",
"\n",
"Consider the problem\n",
"\n",
"$$\n",
"\\min_{T \\in \\Pi_{p,q}} \\left<D, T\\right>_F\n",
"$$\n",
"\n",
"where $\\left<\\cdot, \\cdot\\right>_F$ is the Frobenius norm, and\n",
"\n",
"$$\n",
"\\Pi_{p,q} = \\left\\{T \\in \\mathbb{R}_{+}^{m,n} \\right|\\left. T{\\mathbb{1}_n} = p\\text{ and }T^\\mathrm{t}\\mathbb{1}_m = q\\right\\}\n",
"$$\n",
"\n",
"This problem encodes many questions related to minimal transport to make two distributions as close to each other as possible. Taking $d \\in \\mathbb{R}^{m\\times n}$ as the vectorization of matrix $D$ above, this problem is equivalent to\n",
"\n",
"$$\n",
"\\begin{array}{ll}\n",
" \\min_{t \\in \\mathbb{R}^{m \\times n}} & f(t) \\stackrel{\\text{def}}{=} \\left<d, t\\right> \\\\\n",
" \\text{subject to} & \\sum_{j=1}^n t_{ij} = p \\text{ for }1 \\leq i \\leq m\\\\\n",
" & \\sum_{i=1}^m t_{ij} = q \\text{ for }1 \\leq j \\leq n\\\\\n",
" & t \\geq 0,\n",
"\\end{array}\n",
"$$\n",
"\n",
"where $\\left<\\cdot,\\cdot\\right>$ denotes the inner vector product. Constraints involved are linear equality constraints and bound constraints. The former can be handled by a change of variables. First, we assess that summations to $p$ can be rewritten as the matrix-vector product\n",
"\n",
"$$\n",
"\\left[\n",
"\\begin{array}{cccccccccccccc}\n",
" 1 & 0 & 0 & \\ldots & 0 & 1 & 0 & 0 & \\ldots & 0 & \\ldots & 1 & 0 & 0 & \\ldots & 0 \\\\\n",
" 0 & 1 & 0 & \\ldots & 0 & 0 & 1 & 0 & \\ldots & 0 & \\ldots & 0 & 1 & 0 & \\ldots & 0 \\\\\n",
" 0 & 0 & 1 & \\ldots & 0 & 0 & 0 & 1 & \\ldots & 0 & \\ldots & 0 & 0 & 1 & \\ldots & 0 \\\\\n",
" \\vdots & \\vdots & \\vdots & \\ddots & \\vdots & \\vdots & \\vdots & \\vdots & \\ddots & \\vdots & \\ddots & \\vdots & \\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
" 0 & 0 & 0 & \\ldots & 1 & 0 & 0 & 0 & \\ldots & 1 & \\ldots & 0 & 0 & 0 & \\ldots & 1\n",
"\\end{array}\n",
"\\right]t \\stackrel{\\text{def}}{=} \\mathbf{\\mathrm{G_1}}t = p\n",
"$$\n",
"\n",
"Similarly, summations to $q$ can be rewritten as\n",
"\n",
"$$\n",
"\\left[\n",
"\\begin{array}{}\n",
" 1 & 1 & \\ldots & 1 & 0 & 0 & \\ldots & 0 & \\ldots & 0 & 0 & \\ldots & 0 \\\\\n",
" 0 & 0 & \\ldots & 0 & 1 & 1 & \\ldots & 1 & \\ldots & 0 & 0 & \\ldots & 0 \\\\\n",
" \\vdots & \\vdots & \\ddots & \\vdots & \\vdots & \\vdots & \\ddots & \\vdots & \\ddots & \\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
" 0 & 0 & \\ldots & 0 & 0 & 0 & \\ldots & 0 & \\ldots & 1 & 1 & \\ldots & 1\n",
"\\end{array}\n",
"\\right]t \\stackrel{\\text{def}}{=} \\mathbf{\\mathrm{G_2}}t = q\n",
"$$\n",
"\n",
"This enables the reformulation of the optimization problem as\n",
"\n",
"$$\n",
"\\begin{array}{ll}\n",
" \\min_{t \\in \\mathbb{R}^{m \\times n}} & f(t) \\\\\n",
" \\text{subject to} & \\mathbf{\\mathrm{G}}t = h \\\\\n",
" & t \\geq 0\n",
"\\end{array}\n",
"$$\n",
"\n",
"where\n",
"\n",
"$$\\mathbf{\\mathrm{G}} = \\left[\\begin{array}{c} \\mathbf{\\mathrm{G_1}} \\\\ \\mathbf{\\mathrm{G_2}} \\end{array}\\right]\\text{ and } h = \\left[\\begin{array}{c} p \\\\ q \\end{array}\\right].$$\n",
"\n",
"The rank of matrix $\\mathbf{\\mathrm{G}}$ is less than full: in addition to having merely $m+n$ rows to its $mn$ columns, the sum of the rows of its component parts $\\mathbf{\\mathrm{A}}$ and $\\mathbf{\\mathrm{B}}$ is respectively the same. Thus, we can find a linear transformation $\\mathbf{\\mathrm{T}}$ such that $\\mathbf{\\mathrm{G}}' = \\mathbf{\\mathrm{TG}}$ is full-rank, and such that $\\mathbf{\\mathrm{G'}}t = \\mathbf{\\mathrm{T}}h \\stackrel{\\mathrm{def}}{=} h' \\Leftrightarrow \\mathbf{\\mathrm{G}}t = h$.\n",
"\n",
"Second, the first-order optimality conditions for a solution $t^*$ to this problem are\n",
"\n",
"$$\n",
"\\nabla f(t^*) = 0 \\\\\n",
"\\mathbf{\\mathrm{G'}} t^* = h' \\\\\n",
"t^* \\geq 0.\n",
"$$\n",
"\n",
"A *feasible point* algorithm to solve this problem presumes the generation of a sequence of iterates $t_1, t_2\\ldots$ converging to a stationary point $t^*$ such that every iterate satisfies $\\mathbf{\\mathrm{G}}t_i = h$. Given one solution $\\hat{t}$ to this system, our iterates $t_i$ must belong to set $\\ker(\\mathbf{\\mathrm{G'}}) + \\{\\hat{t}\\}$. Assuming one forms a matrix $\\mathbf{\\mathrm{C}}$ whose columns for the basis of $\\ker(\\mathbf{\\mathrm{G'}})$, we can effect the change of variables\n",
"\n",
"$$\n",
"t = \\mathbf{\\mathrm{C}}v + \\hat{t},\n",
"$$\n",
"\n",
"and rewrite the optimization problem as\n",
"\n",
"$$\n",
"\\begin{array}{ll}\n",
" \\min_{v \\in \\mathbb{R}^{k}} & f(\\mathbf{\\mathrm{C}}v + \\hat{t}) \\\\\n",
" \\text{subject to} & v \\geq 0,\n",
"\\end{array}\n",
"$$\n",
"\n",
"where $k = \\dim(\\ker(\\mathbf{\\mathrm{G'}}))$. Optimality conditions for this last version of the problem are stated as\n",
"\n",
"$$\n",
"\\mathbf{\\mathrm{C}}^\\mathrm{t}\\nabla f(\\mathbf{\\mathrm{C}}v^* + \\hat{t}) = 0 \\\\\n",
"v^* \\geq 0.\n",
"$$\n",
"\n",
"<!-- Second, the optimality conditions that a solution $t^*$ to the problem satisfy are\n",
"\n",
"$$\n",
"\\nabla f(t^*) = 0 \\\\\n",
"\\mathbf{\\mathrm{G'}}t^* = h' \\\\\n",
"t^* \\geq 0.\n",
"$$\n",
"\n",
"The various components of the solution may live either on the boundary of the feasible set, or in its interior. As such, a useful representation of the solution involves the partitioning of its components in three subsets:\n",
"\n",
"1. The *non-basis components*, which are null; these correspond as a submatrix $\\mathbf{\\mathrm{N}}$ of $\\mathbf{\\mathrm{G'}}$.\n",
"1. The *in-basis components*, expressed as a linear combination of the basis of an inversible submatrix $\\mathrm{\\mathbf{B}}$ of $\\mathbf{\\mathrm{G'}}$.\n",
"1. The *out-basis components*, expressed as a linear combination of other columns of $\\mathbf{\\mathrm{G'}}$.\n",
" -->\n",
"<!-- \n",
"\n",
" -->"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "root (Python) *",
"language": "python",
"name": "conda-root-py"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.11"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment