tobydriscoll/TB-Lecture-21-julia.ipynb

## TB-Lecture-21-julia.ipynb

      
Display the source blob

    
Display the rendered blob

    
    Raw
  

              TB-Lecture-21-julia.ipynb
            
          
        Loading

      Sorry, something went wrong. Reload?
      Sorry, we cannot display this file.
      Sorry, this file is invalid so it cannot be displayed.
      
          Viewer requires iframe.
      
    
## TB-Lecture-21-matlab.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Lecture 21: Pivoting\n",
    "\n",
    "If we apply LU factorization as presented so far, the result is unstable. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "A =\n",
      "   1.0000e-20   1.0000e+00\n",
      "   1.0000e+00            0\n",
      "L =\n",
      "   1.0000e+00            0\n",
      "   1.0000e+20   1.0000e+00\n",
      "U =\n",
      "   1.0000e-20   1.0000e+00\n",
      "            0  -1.0000e+20\n",
      "AminusLU =\n",
      "     0     0\n",
      "     0     0\n"
     ]
    }
   ],
   "source": [
    "format compact, format short e\n",
    "A = [ 1e-20 1;1 0 ]\n",
    "L = [1 0; 1e20 1]\n",
    "U = [1e-20 1; 0 -1e20]\n",
    "AminusLU = A - L*U"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Even though the factorization above is correct, the $L$ and $U$ matrices have condition numbers exceeding $10^{20}$, creating catastrophic instability in their product and in solving linear systems with them. Yet $A$ itself is practically perfectly conditioned. \n",
    "\n",
    "In order to illustrate the cure for this problem, we return to the matrix of the last lecture."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "A =\n",
      "    17     2     3    13\n",
      "     5    12    10     8\n",
      "     9     7     7    12\n",
      "     4    14    15     2\n"
     ]
    }
   ],
   "source": [
    "format short\n",
    "A = magic(4) + eye(4)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Observe that when we compute the row multipliers to put into the lower triangle of $L$, they will all be less than one in magnitude:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "L1 =\n",
      "    1.0000         0         0         0\n",
      "   -0.2941    1.0000         0         0\n",
      "   -0.5294         0    1.0000         0\n",
      "   -0.2353         0         0    1.0000\n",
      "A1 =\n",
      "   17.0000    2.0000    3.0000   13.0000\n",
      "         0   11.4118    9.1176    4.1765\n",
      "         0    5.9412    5.4118    5.1176\n",
      "         0   13.5294   14.2941   -1.0588\n"
     ]
    }
   ],
   "source": [
    "L1 = eye(4);  L1(2:4,1) = -A(2:4,1)/A(1,1)\n",
    "A1 = L1*A"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "In the next step, however, we would have a multiplier of size 13.5/11.4 > 1. We can avoid this by swapping rows 2 and 4, using a permutation matrix."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "P2 =\n",
      "     1     0     0     0\n",
      "     0     0     0     1\n",
      "     0     0     1     0\n",
      "     0     1     0     0\n",
      "L2 =\n",
      "    1.0000         0         0         0\n",
      "         0    1.0000         0         0\n",
      "         0   -0.4391    1.0000         0\n",
      "         0   -0.8435         0    1.0000\n",
      "A2 =\n",
      "   17.0000    2.0000    3.0000   13.0000\n",
      "         0   13.5294   14.2941   -1.0588\n",
      "         0         0   -0.8652    5.5826\n",
      "         0         0   -2.9391    5.0696\n"
     ]
    }
   ],
   "source": [
    "P2 = eye(4);  P2([2 4],:) = P2([4 2],:)\n",
    "A1 = P2*A1;\n",
    "L2 = eye(4);  L2(3:4,2) = -A1(3:4,2)/A1(2,2)\n",
    "A2 = L2*A1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we must swap rows 3 and 4 to keep the multipliers less than one."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "P3 =\n",
      "     1     0     0     0\n",
      "     0     1     0     0\n",
      "     0     0     0     1\n",
      "     0     0     1     0\n",
      "L3 =\n",
      "    1.0000         0         0         0\n",
      "         0    1.0000         0         0\n",
      "         0         0    1.0000         0\n",
      "         0         0   -0.2944    1.0000\n",
      "A3 =\n",
      "   17.0000    2.0000    3.0000   13.0000\n",
      "         0   13.5294   14.2941   -1.0588\n",
      "         0         0   -2.9391    5.0696\n",
      "         0         0         0    4.0902\n"
     ]
    }
   ],
   "source": [
    "P3 = eye(4);  P3([3 4],:) = P3([4 3],:)\n",
    "A2 = P3*A2;\n",
    "L3 = eye(4);  L3(4:4,3) = -A2(4:4,3)/A2(3,3)\n",
    "A3 = L3*A2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here is the product that makes the matrix upper triangular."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "U =\n",
      "   17.0000    2.0000    3.0000   13.0000\n",
      "         0   13.5294   14.2941   -1.0588\n",
      "         0         0   -2.9391    5.0696\n",
      "    0.0000         0   -0.0000    4.0902\n"
     ]
    }
   ],
   "source": [
    "U = L3*P3*L2*P2*L1*A"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "(We no longer get exact zeros in the lower triangle because the product is evaluated from left to right, creating a different sequence of steps.) Of course this product isn't changed if we introduce some extra factors of $I$..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "U =\n",
      "   17.0000    2.0000    3.0000   13.0000\n",
      "         0   13.5294   14.2941   -1.0588\n",
      "         0         0   -2.9391    5.0696\n",
      "    0.0000         0   -0.0000    4.0902\n"
     ]
    }
   ],
   "source": [
    "U = L3 * P3*L2/P3 * P3*P2*L1/P2/P3 * P3*P2 * A"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This might look like a mess, but we can simplify a lot. First of all, each of these permutations happens to be its own inverse:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "U =\n",
      "   17.0000    2.0000    3.0000   13.0000\n",
      "         0   13.5294   14.2941   -1.0588\n",
      "         0         0   -2.9391    5.0696\n",
      "    0.0000         0   -0.0000    4.0902\n"
     ]
    }
   ],
   "source": [
    "U = L3 * P3*L2*P3 * P3*P2*L1*P2*P3 * P3*P2 * A"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, each permutation $P_k$ swaps row $k$ with row $i\\ge k$. This reshuffles the nontrivial multipliers in column $j$ of $L_j$. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "P3L2 =\n",
      "    1.0000         0         0         0\n",
      "         0    1.0000         0         0\n",
      "         0   -0.8435         0    1.0000\n",
      "         0   -0.4391    1.0000         0\n",
      "P3P2L1 =\n",
      "    1.0000         0         0         0\n",
      "   -0.2353         0         0    1.0000\n",
      "   -0.2941    1.0000         0         0\n",
      "   -0.5294         0    1.0000         0\n"
     ]
    }
   ],
   "source": [
    "P3L2 = P3*L2\n",
    "P3P2L1 = P3*P2*L1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, the right-multiplications by the $P_k$ undo the swap effects in the columns of $L_j$ beyond column $j$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "LL2 =\n",
      "    1.0000         0         0         0\n",
      "         0    1.0000         0         0\n",
      "         0   -0.8435    1.0000         0\n",
      "         0   -0.4391         0    1.0000\n",
      "LL1 =\n",
      "    1.0000         0         0         0\n",
      "   -0.2353    1.0000         0         0\n",
      "   -0.2941         0    1.0000         0\n",
      "   -0.5294         0         0    1.0000\n"
     ]
    }
   ],
   "source": [
    "LL2 = P3*L2*P3\n",
    "LL1 = P3*P2*L1*P2*P3"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We put all of this together, with one more definition, and find"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "U =\n",
      "   17.0000    2.0000    3.0000   13.0000\n",
      "         0   13.5294   14.2941   -1.0588\n",
      "         0         0   -2.9391    5.0696\n",
      "    0.0000         0   -0.0000    4.0902\n"
     ]
    }
   ],
   "source": [
    "P = P3*P2;\n",
    "U = L3*LL2*LL1*P*A"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Thus, let"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "L =\n",
      "    1.0000         0         0         0\n",
      "    0.2353    1.0000         0         0\n",
      "    0.2941    0.8435    1.0000         0\n",
      "    0.5294    0.4391    0.2944    1.0000\n"
     ]
    }
   ],
   "source": [
    "L = eye(4)/LL1/LL2/L3"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We have arrived at $PA=LU$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ans =\n",
      "   1.7764e-15\n"
     ]
    }
   ],
   "source": [
    "norm(P*A-L*U)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that we did nothing to control the size of the entries in $U$. We could have also swapped columns, too, introducing a permutation on the right. However, in practice there is _very_ rarely stability improved by doing so, and the number of comparisons/memory accesses would go from $O(m^2)$ to a more concerning $O(m^3)$. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Solving systems"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can transform $Ax=b$ into $Pb=PAx=LUx$. Hence the entries of $b$ are permuted the same way as the rows of $A$, then there are two triangular solves as usual. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "L =\n",
      "    1.0000         0         0         0\n",
      "    0.2353    1.0000         0         0\n",
      "    0.2941    0.8435    1.0000         0\n",
      "    0.5294    0.4391    0.2944    1.0000\n",
      "U =\n",
      "   17.0000    2.0000    3.0000   13.0000\n",
      "         0   13.5294   14.2941   -1.0588\n",
      "         0         0   -2.9391    5.0696\n",
      "         0         0         0    4.0902\n",
      "P =\n",
      "     1     0     0     0\n",
      "     0     0     0     1\n",
      "     0     1     0     0\n",
      "     0     0     1     0\n"
     ]
    }
   ],
   "source": [
    "[L,U,P] = lu(A)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "x =\n",
      "   -0.2640\n",
      "   -0.7830\n",
      "    1.0271\n",
      "    0.3056\n"
     ]
    }
   ],
   "source": [
    "b = (1:4)';  x = U\\(L\\(P*b))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ans =\n",
      "   1.6012e-15\n"
     ]
    }
   ],
   "source": [
    "norm(b-A*x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It's more common to use a vector of indices to keep track of the permutation and implement multiplication by $P$ as reordering of the indices."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "L =\n",
      "    1.0000         0         0         0\n",
      "    0.2353    1.0000         0         0\n",
      "    0.2941    0.8435    1.0000         0\n",
      "    0.5294    0.4391    0.2944    1.0000\n",
      "U =\n",
      "   17.0000    2.0000    3.0000   13.0000\n",
      "         0   13.5294   14.2941   -1.0588\n",
      "         0         0   -2.9391    5.0696\n",
      "         0         0         0    4.0902\n",
      "p =\n",
      "     1     4     2     3\n"
     ]
    }
   ],
   "source": [
    "[L,U,p] = lu(A,'vector')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "x =\n",
      "   -0.2640\n",
      "   -0.7830\n",
      "    1.0271\n",
      "    0.3056\n"
     ]
    }
   ],
   "source": [
    "x = U\\ (L\\ (b(p)) )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Psychologically lower triangular matrices"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you use the built in `lu` with two outputs, you get not $L$ but $P^{-1}L$. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "PL =\n",
      "    1.0000         0         0         0\n",
      "    0.2941    0.8435    1.0000         0\n",
      "    0.5294    0.4391    0.2944    1.0000\n",
      "    0.2353    1.0000         0         0\n",
      "U =\n",
      "   17.0000    2.0000    3.0000   13.0000\n",
      "         0   13.5294   14.2941   -1.0588\n",
      "         0         0   -2.9391    5.0696\n",
      "         0         0         0    4.0902\n"
     ]
    }
   ],
   "source": [
    "[PL,U] = lu(A)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The backslash is smart enough to recognize a 'psychologically lower triangular' matrix like this and do permuted triangular substitution with it. So the correct usage of the factors is just"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "x =\n",
      "   -0.2640\n",
      "   -0.7830\n",
      "    1.0271\n",
      "    0.3056\n"
     ]
    }
   ],
   "source": [
    "x = U\\(PL\\b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## The ~~(in)~~stability of pivoted LU factorization\n",
    "\n",
    "(This is a bit of the material in Lecture 22.) By the text's technical definitions, LU factorization with row pivoting is backward stable. However, the $O(\\epsilon_{\\rm machine})$ hides a so-called _growth factor_ $\\rho$ that can be as large as $2^{m-1}$ in the $m\\times m$ case. While this bound is uniform with respect to the data, it's so large that it can easily overwhelm the machine precision."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "A =\n",
      "     1     0     0     0     0     1\n",
      "    -1     1     0     0     0     1\n",
      "    -1    -1     1     0     0     1\n",
      "    -1    -1    -1     1     0     1\n",
      "    -1    -1    -1    -1     1     1\n",
      "    -1    -1    -1    -1    -1     1\n"
     ]
    }
   ],
   "source": [
    "m = 6;\n",
    "A = tril(-ones(m,m),-1) + eye(m);\n",
    "A(:,m) = 1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The row pivoting strategy makes no swaps during elimination. The problem lies with the $U$ matrix."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "U =\n",
      "     1     0     0     0     0     1\n",
      "     0     1     0     0     0     2\n",
      "     0     0     1     0     0     4\n",
      "     0     0     0     1     0     8\n",
      "     0     0     0     0     1    16\n",
      "     0     0     0     0     0    32\n"
     ]
    }
   ],
   "source": [
    "[L,U] = lu(A);\n",
    "U"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "There is the factor $2^{m-1}$. Since machine epsilon is $2^{-52}$, we are going to have trouble solving this system for $m>50$. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ans =\n",
      "    0.0169\n"
     ]
    }
   ],
   "source": [
    "m = 53;\n",
    "A = tril(-ones(m,m),-1) + eye(m);\n",
    "A(:,m) = 1;\n",
    "x = (1:m)'/m;\n",
    "b = A*x;\n",
    "norm(x - A\\b) / norm(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "That's 1% error in 16 digit arithmetic! And the matrix is very well conditioned:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ans =\n",
      "   23.6549\n"
     ]
    }
   ],
   "source": [
    "cond(A)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Indeed, the QR factorization has no trouble at all with this problem."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ans =\n",
      "   4.7754e-15\n"
     ]
    }
   ],
   "source": [
    "[Q,R] = qr(A);\n",
    "norm(x - R\\(Q'*b)) / norm(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It's surprising that an unstable algorithm (PLU) is chosen by default for linear systems when a fully reliable alternative (QR) is available. It appears (but is not fully known) that matrices that cause this instability all more or less look like this example, and they are astronomically rare---both in the sense of probability and in lived experience with applications. Since QR takes twice as many flops as LU, it's hardly ever used for square systems. "
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Matlab",
   "language": "matlab",
   "name": "matlab"
  },
  "language_info": {
   "codemirror_mode": "octave",
   "file_extension": ".m",
   "help_links": [
    {
     "text": "MetaKernel Magics",
     "url": "https://github.com/calysto/metakernel/blob/master/metakernel/magics/README.md"
    }
   ],
   "mimetype": "text/x-octave",
   "name": "matlab",
   "version": "0.11.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
	{
	"cells": [
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"# Lecture 21: Pivoting\n",
	"\n",
	"If we apply LU factorization as presented so far, the result is unstable. "
	]
	},
	{
	"cell_type": "code",
	"execution_count": 1,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"A =\n",
	" 1.0000e-20 1.0000e+00\n",
	" 1.0000e+00 0\n",
	"L =\n",
	" 1.0000e+00 0\n",
	" 1.0000e+20 1.0000e+00\n",
	"U =\n",
	" 1.0000e-20 1.0000e+00\n",
	" 0 -1.0000e+20\n",
	"AminusLU =\n",
	" 0 0\n",
	" 0 0\n"
	]
	}
	],
	"source": [
	"format compact, format short e\n",
	"A = [ 1e-20 1;1 0 ]\n",
	"L = [1 0; 1e20 1]\n",
	"U = [1e-20 1; 0 -1e20]\n",
	"AminusLU = A - L*U"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Even though the factorization above is correct, the $L$ and $U$ matrices have condition numbers exceeding $10^{20}$, creating catastrophic instability in their product and in solving linear systems with them. Yet $A$ itself is practically perfectly conditioned. \n",
	"\n",
	"In order to illustrate the cure for this problem, we return to the matrix of the last lecture."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 2,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"A =\n",
	" 17 2 3 13\n",
	" 5 12 10 8\n",
	" 9 7 7 12\n",
	" 4 14 15 2\n"
	]
	}
	],
	"source": [
	"format short\n",
	"A = magic(4) + eye(4)"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Observe that when we compute the row multipliers to put into the lower triangle of $L$, they will all be less than one in magnitude:"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 3,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"L1 =\n",
	" 1.0000 0 0 0\n",
	" -0.2941 1.0000 0 0\n",
	" -0.5294 0 1.0000 0\n",
	" -0.2353 0 0 1.0000\n",
	"A1 =\n",
	" 17.0000 2.0000 3.0000 13.0000\n",
	" 0 11.4118 9.1176 4.1765\n",
	" 0 5.9412 5.4118 5.1176\n",
	" 0 13.5294 14.2941 -1.0588\n"
	]
	}
	],
	"source": [
	"L1 = eye(4); L1(2:4,1) = -A(2:4,1)/A(1,1)\n",
	"A1 = L1*A"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"collapsed": false
	},
	"source": [
	"In the next step, however, we would have a multiplier of size 13.5/11.4 > 1. We can avoid this by swapping rows 2 and 4, using a permutation matrix."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 4,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"P2 =\n",
	" 1 0 0 0\n",
	" 0 0 0 1\n",
	" 0 0 1 0\n",
	" 0 1 0 0\n",
	"L2 =\n",
	" 1.0000 0 0 0\n",
	" 0 1.0000 0 0\n",
	" 0 -0.4391 1.0000 0\n",
	" 0 -0.8435 0 1.0000\n",
	"A2 =\n",
	" 17.0000 2.0000 3.0000 13.0000\n",
	" 0 13.5294 14.2941 -1.0588\n",
	" 0 0 -0.8652 5.5826\n",
	" 0 0 -2.9391 5.0696\n"
	]
	}
	],
	"source": [
	"P2 = eye(4); P2([2 4],:) = P2([4 2],:)\n",
	"A1 = P2*A1;\n",
	"L2 = eye(4); L2(3:4,2) = -A1(3:4,2)/A1(2,2)\n",
	"A2 = L2*A1"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Now we must swap rows 3 and 4 to keep the multipliers less than one."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 5,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"P3 =\n",
	" 1 0 0 0\n",
	" 0 1 0 0\n",
	" 0 0 0 1\n",
	" 0 0 1 0\n",
	"L3 =\n",
	" 1.0000 0 0 0\n",
	" 0 1.0000 0 0\n",
	" 0 0 1.0000 0\n",
	" 0 0 -0.2944 1.0000\n",
	"A3 =\n",
	" 17.0000 2.0000 3.0000 13.0000\n",
	" 0 13.5294 14.2941 -1.0588\n",
	" 0 0 -2.9391 5.0696\n",
	" 0 0 0 4.0902\n"
	]
	}
	],
	"source": [
	"P3 = eye(4); P3([3 4],:) = P3([4 3],:)\n",
	"A2 = P3*A2;\n",
	"L3 = eye(4); L3(4:4,3) = -A2(4:4,3)/A2(3,3)\n",
	"A3 = L3*A2"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Here is the product that makes the matrix upper triangular."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 6,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"U =\n",
	" 17.0000 2.0000 3.0000 13.0000\n",
	" 0 13.5294 14.2941 -1.0588\n",
	" 0 0 -2.9391 5.0696\n",
	" 0.0000 0 -0.0000 4.0902\n"
	]
	}
	],
	"source": [
	"U = L3P3L2P2L1*A"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"(We no longer get exact zeros in the lower triangle because the product is evaluated from left to right, creating a different sequence of steps.) Of course this product isn't changed if we introduce some extra factors of $I$..."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 7,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"U =\n",
	" 17.0000 2.0000 3.0000 13.0000\n",
	" 0 13.5294 14.2941 -1.0588\n",
	" 0 0 -2.9391 5.0696\n",
	" 0.0000 0 -0.0000 4.0902\n"
	]
	}
	],
	"source": [
	"U = L3 * P3L2/P3 P3P2L1/P2/P3 * P3P2 A"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"This might look like a mess, but we can simplify a lot. First of all, each of these permutations happens to be its own inverse:"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 8,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"U =\n",
	" 17.0000 2.0000 3.0000 13.0000\n",
	" 0 13.5294 14.2941 -1.0588\n",
	" 0 0 -2.9391 5.0696\n",
	" 0.0000 0 -0.0000 4.0902\n"
	]
	}
	],
	"source": [
	"U = L3 * P3L2P3 * P3P2L1P2P3 * P3P2 A"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Next, each permutation $P_k$ swaps row $k$ with row $i\\ge k$. This reshuffles the nontrivial multipliers in column $j$ of $L_j$. "
	]
	},
	{
	"cell_type": "code",
	"execution_count": 9,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"P3L2 =\n",
	" 1.0000 0 0 0\n",
	" 0 1.0000 0 0\n",
	" 0 -0.8435 0 1.0000\n",
	" 0 -0.4391 1.0000 0\n",
	"P3P2L1 =\n",
	" 1.0000 0 0 0\n",
	" -0.2353 0 0 1.0000\n",
	" -0.2941 1.0000 0 0\n",
	" -0.5294 0 1.0000 0\n"
	]
	}
	],
	"source": [
	"P3L2 = P3*L2\n",
	"P3P2L1 = P3P2L1"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Finally, the right-multiplications by the $P_k$ undo the swap effects in the columns of $L_j$ beyond column $j$."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 10,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"LL2 =\n",
	" 1.0000 0 0 0\n",
	" 0 1.0000 0 0\n",
	" 0 -0.8435 1.0000 0\n",
	" 0 -0.4391 0 1.0000\n",
	"LL1 =\n",
	" 1.0000 0 0 0\n",
	" -0.2353 1.0000 0 0\n",
	" -0.2941 0 1.0000 0\n",
	" -0.5294 0 0 1.0000\n"
	]
	}
	],
	"source": [
	"LL2 = P3L2P3\n",
	"LL1 = P3P2L1P2P3"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"We put all of this together, with one more definition, and find"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 11,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"U =\n",
	" 17.0000 2.0000 3.0000 13.0000\n",
	" 0 13.5294 14.2941 -1.0588\n",
	" 0 0 -2.9391 5.0696\n",
	" 0.0000 0 -0.0000 4.0902\n"
	]
	}
	],
	"source": [
	"P = P3*P2;\n",
	"U = L3LL2LL1PA"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Thus, let"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 12,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"L =\n",
	" 1.0000 0 0 0\n",
	" 0.2353 1.0000 0 0\n",
	" 0.2941 0.8435 1.0000 0\n",
	" 0.5294 0.4391 0.2944 1.0000\n"
	]
	}
	],
	"source": [
	"L = eye(4)/LL1/LL2/L3"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"We have arrived at $PA=LU$."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 13,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"ans =\n",
	" 1.7764e-15\n"
	]
	}
	],
	"source": [
	"norm(PA-LU)"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Note that we did nothing to control the size of the entries in $U$. We could have also swapped columns, too, introducing a permutation on the right. However, in practice there is _very_ rarely stability improved by doing so, and the number of comparisons/memory accesses would go from $O(m^2)$ to a more concerning $O(m^3)$. "
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"## Solving systems"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"We can transform $Ax=b$ into $Pb=PAx=LUx$. Hence the entries of $b$ are permuted the same way as the rows of $A$, then there are two triangular solves as usual. "
	]
	},
	{
	"cell_type": "code",
	"execution_count": 14,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"L =\n",
	" 1.0000 0 0 0\n",
	" 0.2353 1.0000 0 0\n",
	" 0.2941 0.8435 1.0000 0\n",
	" 0.5294 0.4391 0.2944 1.0000\n",
	"U =\n",
	" 17.0000 2.0000 3.0000 13.0000\n",
	" 0 13.5294 14.2941 -1.0588\n",
	" 0 0 -2.9391 5.0696\n",
	" 0 0 0 4.0902\n",
	"P =\n",
	" 1 0 0 0\n",
	" 0 0 0 1\n",
	" 0 1 0 0\n",
	" 0 0 1 0\n"
	]
	}
	],
	"source": [
	"[L,U,P] = lu(A)"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 15,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"x =\n",
	" -0.2640\n",
	" -0.7830\n",
	" 1.0271\n",
	" 0.3056\n"
	]
	}
	],
	"source": [
	"b = (1:4)'; x = U\\(L\\(P*b))"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 16,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"ans =\n",
	" 1.6012e-15\n"
	]
	}
	],
	"source": [
	"norm(b-A*x)"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"It's more common to use a vector of indices to keep track of the permutation and implement multiplication by $P$ as reordering of the indices."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 17,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"L =\n",
	" 1.0000 0 0 0\n",
	" 0.2353 1.0000 0 0\n",
	" 0.2941 0.8435 1.0000 0\n",
	" 0.5294 0.4391 0.2944 1.0000\n",
	"U =\n",
	" 17.0000 2.0000 3.0000 13.0000\n",
	" 0 13.5294 14.2941 -1.0588\n",
	" 0 0 -2.9391 5.0696\n",
	" 0 0 0 4.0902\n",
	"p =\n",
	" 1 4 2 3\n"
	]
	}
	],
	"source": [
	"[L,U,p] = lu(A,'vector')"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 18,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"x =\n",
	" -0.2640\n",
	" -0.7830\n",
	" 1.0271\n",
	" 0.3056\n"
	]
	}
	],
	"source": [
	"x = U\\ (L\\ (b(p)) )"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"## Psychologically lower triangular matrices"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"If you use the built in `lu` with two outputs, you get not $L$ but $P^{-1}L$. "
	]
	},
	{
	"cell_type": "code",
	"execution_count": 19,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"PL =\n",
	" 1.0000 0 0 0\n",
	" 0.2941 0.8435 1.0000 0\n",
	" 0.5294 0.4391 0.2944 1.0000\n",
	" 0.2353 1.0000 0 0\n",
	"U =\n",
	" 17.0000 2.0000 3.0000 13.0000\n",
	" 0 13.5294 14.2941 -1.0588\n",
	" 0 0 -2.9391 5.0696\n",
	" 0 0 0 4.0902\n"
	]
	}
	],
	"source": [
	"[PL,U] = lu(A)"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"The backslash is smart enough to recognize a 'psychologically lower triangular' matrix like this and do permuted triangular substitution with it. So the correct usage of the factors is just"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 20,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"x =\n",
	" -0.2640\n",
	" -0.7830\n",
	" 1.0271\n",
	" 0.3056\n"
	]
	}
	],
	"source": [
	"x = U\\(PL\\b)"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"## The ~~(in)~~stability of pivoted LU factorization\n",
	"\n",
	"(This is a bit of the material in Lecture 22.) By the text's technical definitions, LU factorization with row pivoting is backward stable. However, the $O(\\epsilon_{\\rm machine})$ hides a so-called _growth factor_ $\\rho$ that can be as large as $2^{m-1}$ in the $m\\times m$ case. While this bound is uniform with respect to the data, it's so large that it can easily overwhelm the machine precision."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 21,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"A =\n",
	" 1 0 0 0 0 1\n",
	" -1 1 0 0 0 1\n",
	" -1 -1 1 0 0 1\n",
	" -1 -1 -1 1 0 1\n",
	" -1 -1 -1 -1 1 1\n",
	" -1 -1 -1 -1 -1 1\n"
	]
	}
	],
	"source": [
	"m = 6;\n",
	"A = tril(-ones(m,m),-1) + eye(m);\n",
	"A(:,m) = 1"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"The row pivoting strategy makes no swaps during elimination. The problem lies with the $U$ matrix."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 22,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"U =\n",
	" 1 0 0 0 0 1\n",
	" 0 1 0 0 0 2\n",
	" 0 0 1 0 0 4\n",
	" 0 0 0 1 0 8\n",
	" 0 0 0 0 1 16\n",
	" 0 0 0 0 0 32\n"
	]
	}
	],
	"source": [
	"[L,U] = lu(A);\n",
	"U"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"There is the factor $2^{m-1}$. Since machine epsilon is $2^{-52}$, we are going to have trouble solving this system for $m>50$. "
	]
	},
	{
	"cell_type": "code",
	"execution_count": 23,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"ans =\n",
	" 0.0169\n"
	]
	}
	],
	"source": [
	"m = 53;\n",
	"A = tril(-ones(m,m),-1) + eye(m);\n",
	"A(:,m) = 1;\n",
	"x = (1:m)'/m;\n",
	"b = A*x;\n",
	"norm(x - A\\b) / norm(x)"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"That's 1% error in 16 digit arithmetic! And the matrix is very well conditioned:"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 24,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"ans =\n",
	" 23.6549\n"
	]
	}
	],
	"source": [
	"cond(A)"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Indeed, the QR factorization has no trouble at all with this problem."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 25,
	"metadata": {
	"collapsed": false
	},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"ans =\n",
	" 4.7754e-15\n"
	]
	}
	],
	"source": [
	"[Q,R] = qr(A);\n",
	"norm(x - R\\(Q'*b)) / norm(x)"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"It's surprising that an unstable algorithm (PLU) is chosen by default for linear systems when a fully reliable alternative (QR) is available. It appears (but is not fully known) that matrices that cause this instability all more or less look like this example, and they are astronomically rare---both in the sense of probability and in lived experience with applications. Since QR takes twice as many flops as LU, it's hardly ever used for square systems. "
	]
	}
	],
	"metadata": {
	"anaconda-cloud": {},
	"kernelspec": {
	"display_name": "Matlab",
	"language": "matlab",
	"name": "matlab"
	},
	"language_info": {
	"codemirror_mode": "octave",
	"file_extension": ".m",
	"help_links": [
	{
	"text": "MetaKernel Magics",
	"url": "https://github.com/calysto/metakernel/blob/master/metakernel/magics/README.md"
	}
	],
	"mimetype": "text/x-octave",
	"name": "matlab",
	"version": "0.11.0"
	}
	},
	"nbformat": 4,
	"nbformat_minor": 0
	}