Instantly share code, notes, and snippets.

# dhermes/cvxopt_spectral_norm.ipynb

Last active September 17, 2018 23:15
Show Gist options
• Save dhermes/18793452c4e3b89b2ec86d7d73439483 to your computer and use it in GitHub Desktop.
Student seminar talk on matrix 2-norm as a dual linear program
Display the source blob
Display the rendered blob
Raw
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters
 { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Realizing the Spectral Norm as an SDP\n", "\n", "**Claim**: The spectral norm can be viewed as a semi-definite program:\n", "$$\\|Z\\|_2 = \\min_t t \\quad \\text{s.t.} \\quad\n", "\\left[ \\begin{array}{c c}\n", "t I_m & Z \\\\\n", "Z^{T} & t I_n \\end{array}\\right] \\succeq 0$$\n", "for $Z \\in \\mathbf{R}^{m \\times n}$.\n", "Refer to [paper][1] on the ArXiV for source of claim.\n", "\n", "*Proof*: This proof proceeds in two parts:\n", "$$\\|Z\\|_2 \\leq t \\Longleftrightarrow S = t^2 I_m - Z Z^T \\succeq 0.$$\n", "and\n", "$$S = t^2 I_m - Z Z^T \\succeq 0 \\Longleftrightarrow\n", "M = \\left[ \\begin{array}{c c}\n", "t I_m & Z \\\\\n", "Z^{T} & t I_n \\end{array}\\right] \\succeq 0.$$\n", "\n", "------------------------------------------------------------\n", "\n", "**Part I**: Note that\n", "$$\\|Z\\|_2 \\leq t \\Longleftrightarrow \\sigma_k^2 \\leq t^2$$\n", "for all singular values $\\sigma_k$ of $Z$.\n", "\n", "Since $Z Z^T$ is symmetric, we may diagonalize via\n", "$$Z Z^T = \\left(U \\Sigma V^T\\right) \\left(V \\Sigma^T U^T\\right)\n", "= U \\Sigma^2 U^T.$$\n", "Thus the eigenvalues of\n", "$$S = t^2 I_m - Z Z^T = U \\left(t^2 I_m - \\Sigma^2\\right) U^T$$\n", "are $\\left\\{t^2 - \\sigma_k^2\\right\\}_k$. Hence\n", "$$\\|Z\\|_2 \\leq t \\Longleftrightarrow S \\succeq 0.$$\n", "\n", "------------------------------------------------------------\n", "\n", "**Part II**: The matrix $S$ is a Schur complement of $t I_n$ in\n", "$$M = \\left[ \\begin{array}{c c}\n", "t I_m & Z \\\\\n", "Z^{T} & t I_n \\end{array}\\right].$$\n", "\n", "To realize the Schur complement, form the block matrices\n", "$$D = \\left[ \\begin{array}{c c}\n", "\\frac{1}{t} S & 0 \\\\\n", "0 & t I_n \\end{array}\\right] \\quad \\text{and} \\quad\n", "Q = \\left[ \\begin{array}{c c}\n", "I_m & 0 \\\\\n", "\\frac{1}{t} Z^T & I_n \\end{array}\\right].$$\n", "\n", "We'll show that $D \\sim M$ (i.e. they are congruent), hence\n", "$$M \\succeq 0 \\Longleftrightarrow D \\succeq 0.$$\n", "The spectrum is given by\n", "$$\\rho(D) = \\left\\{t\\right\\} \\bigcup \\left\\{\\frac{1}{t} \\cdot \\lambda\n", "\\mid \\lambda \\in \\rho(S)\\right\\}$$\n", "hence\n", "$$M \\succeq 0 \\Longleftrightarrow D \\succeq 0 \\Longleftrightarrow S\n", "\\succeq 0.$$\n", "(This and the definitions of $D, Q$ rely on $t > 0$, but we can safely\n", "assume this since $t < 0$ does not make sense with a norm and\n", "$t = 0 \\Leftrightarrow Z = 0$.)\n", "\n", "It remains to show that $D \\sim M$. We compute directly:\n", "\\begin{align*}\n", "t Q^T D Q &= \\left[ \\begin{array}{c c}\n", "I_m & \\frac{1}{t} Z \\\\\n", "0 & I_n \\end{array}\\right] \\left[ \\begin{array}{c c}\n", "S & 0 \\\\\n", "0 & t^2 I_n \\end{array}\\right] \\left[ \\begin{array}{c c}\n", "I_m & 0 \\\\\n", "\\frac{1}{t} Z^T & I_n \\end{array}\\right] \\\\\n", "&= \\left[ \\begin{array}{c c}\n", "S & t Z \\\\\n", "0 & t^2 I_n \\end{array}\\right] \\left[ \\begin{array}{c c}\n", "I_m & 0 \\\\\n", "\\frac{1}{t} Z^T & I_n \\end{array}\\right] \\\\\n", "&= \\left[ \\begin{array}{c c}\n", "S + Z Z^T & t Z \\\\\n", "t Z^T & t^2 I_n \\end{array}\\right] \\\\\n", "&= \\left[ \\begin{array}{c c}\n", "t^2 I_n & t Z \\\\\n", "t Z^T & t^2 I_n \\end{array}\\right] = t M.\n", "\\end{align*}\n", "Since $t \\neq 0$ this means $M = Q^T D Q$. The final bit to\n", "show congruence is that $\\text{det}(Q) = 1$ since $Q$ is\n", "lower triangular. $\\blacksquare$\n", "\n", "------------------------------------------------------------\n", "\n", "**Addendum**: To see that congruent matrices have the same signature\n", "note that if\n", "$$X = P^T Y P$$\n", "then\n", "$$v^T X v = w^T Y w$$\n", "for $w = P v$. Clearly $P$ invertible means that $X, Y$ have the\n", "same signature.\n", "\n", "------------------------------------------------------------\n", "\n", "We will implement the SDP in the claim using Python code.\n", "\n", "[1]: http://arxiv.org/pdf/0706.4138v1.pdf" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "\n", "def get_random_Z(M, N):\n", " return np.random.random((M, N))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The SDP above can be realized as\n", "$$\\widetilde{Z} = \\left[ \\begin{array}{c c}\n", "0 & Z \\\\\n", "Z^{T} & 0 \\end{array}\\right] \\succeq t \\left(-I_{m + n}\\right)$$" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "def get_Z_tilde(Z, M, N):\n", " Z_tilde = np.zeros((M + N, M + N))\n", " Z_tilde[:M, M:] = Z\n", " Z_tilde[M:, :M] = Z.T\n", " return Z_tilde" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The Python library cvxopt requires that square matrices be flattened out into column vectors and then grouped as columns in a matrix. In addition, it requires the (linear) coefficients of the variables.\n", "\n", "See the cvxopt [docs](http://abel.ee.ucla.edu/cvxopt/userguide/coneprog.html) for more information. To get a sense for how to use it, witness the example\n", "\n", "$$\\begin{array}{ll}\n", "\\mbox{minimize} & x_1 - x_2 + x_3 \\\\\n", "\\mbox{subject to}\n", " & x_1 \\left[ \\begin{array}{cc}\n", " -7 & -11 \\\\ -11 & 3\n", " \\end{array}\\right] +\n", " x_2 \\left[ \\begin{array}{cc}\n", " 7 & -18 \\\\ -18 & 8\n", " \\end{array}\\right] +\n", " x_3 \\left[ \\begin{array}{cc}\n", " -2 & -8 \\\\ -8 & 1\n", " \\end{array}\\right] \\preceq\n", " \\left[ \\begin{array}{cc}\n", " 33 & -9 \\\\ -9 & 26\n", " \\end{array}\\right] \\\\\n", " & x_1 \\left[ \\begin{array}{ccc}\n", " -21 & -11 & 0 \\\\\n", " -11 & 10 & 8 \\\\\n", " 0 & 8 & 5\n", " \\end{array}\\right] +\n", " x_2 \\left[ \\begin{array}{ccc}\n", " 0 & 10 & 16 \\\\\n", " 10 & -10 & -10 \\\\\n", " 16 & -10 & 3\n", " \\end{array}\\right] +\n", " x_3 \\left[ \\begin{array}{ccc}\n", " -5 & 2 & -17 \\\\\n", " 2 & -6 & -7 \\\\\n", " -17 & 8 & 6\n", " \\end{array}\\right] \\preceq\n", " \\left[ \\begin{array}{ccc}\n", " 14 & 9 & 40 \\\\\n", " 9 & 91 & 10 \\\\\n", " 40 & 10 & 15\n", " \\end{array} \\right]\n", "\\end{array}$$\n", "\n", "In order to solve this SDP, the coefficients for the objective are created:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "import cvxopt\n", "c = cvxopt.matrix([1., -1., 1.])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "and then each of the matrices which are used with linear combinations of $x_1, x_2, x_3$ are stored as 1D flattened values" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "G1 = cvxopt.matrix([\n", " [-7., -11., -11., 3.],\n", " [ 7., -18., -18., 8.],\n", " [-2., -8., -8., 1.],\n", "])\n", "G2 = cvxopt.matrix([\n", " [-21., -11., 0., -11., 10., 8., 0., 8., 5.],\n", " [ 0., 10., 16., 10., -10., -10., 16., -10., 3.],\n", " [ -5., 2., -17., 2., -6., 8., -17., -7., 6.],\n", "])\n", "G = [G1, G2]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "and finally the constant right-hand sides (of the SDPs) are combined with their actual (unflattend shapes):" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "h1 = cvxopt.matrix([\n", " [33., -9.],\n", " [-9., 26.],\n", "])\n", "h2 = cvxopt.matrix([\n", " [14., 9., 40.],\n", " [9., 91., 10.],\n", " [40., 10., 15.],\n", "])\n", "h = [h1, h2]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once we have all this, the solution is found via:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ ">>> sol['x']\n", "[-3.68e-01]\n", "[ 1.90e+00]\n", "[-8.88e-01]\n", "\n", ">>> sol['zs'][0]\n", "[ 3.96e-03 -4.34e-03]\n", "[-4.34e-03 4.75e-03]\n", "\n", ">>> sol['zs'][1]\n", "[ 5.58e-02 -2.41e-03 2.42e-02]\n", "[-2.41e-03 1.04e-04 -1.05e-03]\n", "[ 2.42e-02 -1.05e-03 1.05e-02]\n", "\n" ] } ], "source": [ "import cvxopt.solvers\n", "# Suppress cvxopt logging.\n", "cvxopt.solvers.options['show_progress'] = False\n", "sol = cvxopt.solvers.sdp(c, Gs=G, hs=h)\n", "\n", "print(\">>> sol['x']\")\n", "print(sol['x'])\n", "print(\">>> sol['zs'][0]\")\n", "print(sol['zs'][0])\n", "print(\">>> sol['zs'][1]\")\n", "print(sol['zs'][1])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To apply this to our problem, we have just a single row for the matrices: \n", "python\n", "G1 = cvxopt.matrix(-np.eye(M + N).flatten())\n", "G = [G1]\n", "\n", "and a single coefficient\n", "python\n", "c = cvxopt.matrix([1.0])\n", "\n", "since we are optimizing $1.0 \\cdot t$." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "def get_coefficients(M, N):\n", " G1 = cvxopt.matrix(-np.eye(M + N).flatten())\n", " constraint_coefficients = [G1]\n", " optimized_coefficients = cvxopt.matrix([1.0])\n", " return constraint_coefficients, optimized_coefficients" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In addition, the constant matrix in the SDP expression — $\\widetilde{Z}$ here — needs to be represented as a matrix of the correct shape (unflattened) but cast to the custom cvxopt.matrix type." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "def get_solution(Z):\n", " M, N = Z.shape\n", " \n", " constraint_coefficients, optimized_coefficients = get_coefficients(M, N)\n", " # List of G's, i.e. matrices with scalar coefficients in the SDP.\n", " Gs = [cvxopt.matrix(constraint_coefficients)]\n", " \n", " Z_tilde = get_Z_tilde(Z, M, N)\n", " # List of h's, i.e. scalar RHS in the SDP.\n", " hs = [cvxopt.matrix(Z_tilde)]\n", "\n", " solution = cvxopt.solvers.sdp(optimized_coefficients, Gs=Gs, hs=hs)\n", " return solution" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we put it to work and compare." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Primal objective: 6.12584629081\n", " Duration: 0.0206348896027\n", "----------------------------------------------------------------------\n", " NumPy ||Z||₂: 6.12584641443\n", " Duration: 0.000224113464355\n" ] } ], "source": [ "# Time and compare solutions.\n", "import time\n", "\n", "\n", "M, N = 10, 15\n", "Z = get_random_Z(M, N)\n", "\n", "start = time.time()\n", "solution = get_solution(Z)\n", "duration = time.time() - start\n", "print 'Primal objective:', solution['primal objective']\n", "print ' Duration:', duration\n", "\n", "print '-' * 70\n", "\n", "# Use numpy to compute spectral norm.\n", "start = time.time()\n", "spectral_norm = np.linalg.norm(Z, ord=2)\n", "duration = time.time() - start\n", "print u' NumPy ||Z||\\u2082:', spectral_norm\n", "print ' Duration:', duration" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.0" } }, "nbformat": 4, "nbformat_minor": 1 }
Display the source blob
Display the rendered blob
Raw