Name: Nikolay Mayorov
E-mail: n59_ru@hotmail.com
GitHub: nmayorov
I'll add other officially required info later.
University: Lomonosov Moscow State University
Major: Applied Mechanics / Navigation and Control
Current Year: 2nd year
Expected Graduation Date: May 2015
Degree: Master
Title: Improve nonlinear least squares minimization functionality in SciPy
Abstract:
The project consists of two major parts.
- Finish porting the Levenberg-Marquardt algorithm from MINPACK (Fortran) to a Python / Cython implementation.
- Implement a modification of the dogleg method with support for sparse Jacobians and box constraints.
Details:
- Current scipy functionality for nonlinear least squares minimization is limited to a wrapper of the MINPACK routines lmder/lmdif. These functions implement a trust-region formulation of the Levenberg-Marquardt algorithm with a near-exact solution of the trust region subproblem at each iteration. This heavy-duty procedure is described in [1].
The reasonable approach to this code is not to change the algorithmic part at all, since it was developed by excellent specialists and has been well tested over time. On the other hand, working with Fortran code from Python is rather inconvenient. Luckily, a substantial part of the port of lmder/lmdif to Python was already done in scipy/scipy#90. The remaining challenges are to choose and implement a good API and to extend the test suite.
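For reference, here is a minimal sketch of how the current MINPACK wrapper is typically used; the exponential model and data are made up purely for illustration:

```python
import numpy as np
from scipy.optimize import leastsq

# Fit an exponential decay y = a * exp(-b * t) to noisy synthetic data.
rng = np.random.RandomState(0)
t = np.linspace(0, 4, 50)
y = 2.5 * np.exp(-1.3 * t) + 0.05 * rng.randn(t.size)

def residuals(params, t, y):
    a, b = params
    return a * np.exp(-b * t) - y

# leastsq wraps MINPACK's lmdif (or lmder when an analytic Jacobian
# is supplied via Dfun).  It returns the solution and a status flag.
popt, ier = leastsq(residuals, x0=[1.0, 1.0], args=(t, y))
```

Note that the residual function returns the vector of residuals, not their sum of squares; MINPACK forms the objective internally.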
I also suggest one possible extension hinted in [2, p. 255]: when the number of observations is large and the Jacobian is large and sparse, the near-exact subproblem solution becomes too expensive. One simple LM algorithm for large sparse problems is the inexact method of [3].
- The dogleg method [2, p. 73] finds an approximate solution to the quadratic trust region subproblem: $$ \min_{p \in \mathbb{R}^n} m(p) = f + g^T p + \frac{1}{2} p^T B p, \quad \text{s.t. } \|D p\| \leq \Delta $$
It seeks the solution along a piecewise linear path of two segments: the first runs from the origin to the minimizer of the model along the steepest descent direction $p^U$, the second from $p^U$ to the unconstrained (Gauss-Newton) minimizer $p^B$.
This method has the following advantages:

1. It's straightforward to implement. In fact, its basic version is already in scipy.
2. At each iteration we need to compute $p^U$ only once, and $p^B$ at most once. Then the trust region radius is adjusted until sufficient reduction of the objective function is achieved. That means it can work faster than LM, see [4].
3. It is well suited for nonlinear least squares, since the matrix $B = J^T J$ is positive definite in practical cases.
4. We can apply it when $J$ is large and sparse.
5. It allows a rather easy extension to box-constrained nonlinear least squares.
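To make the procedure concrete, here is a minimal dense sketch of a basic dogleg step for the least-squares model with $B = J^T J$, $g = J^T r$ and $D = I$; the function name and structure are my own illustration, not existing scipy code:

```python
import numpy as np

def dogleg_step(J, r, Delta):
    """Basic (dense) dogleg step for min ||J p + r||^2 s.t. ||p|| <= Delta."""
    g = J.T @ r                           # gradient of 0.5 * ||J p + r||^2
    # Steepest descent point p^U: minimizer of the model along -g.
    Jg = J @ g
    p_u = -(g @ g) / (Jg @ Jg) * g
    if np.linalg.norm(p_u) >= Delta:      # even p^U leaves the region:
        return -Delta / np.linalg.norm(g) * g   # scaled steepest descent
    # Gauss-Newton point p^B: unconstrained minimizer of the model.
    p_b = np.linalg.lstsq(J, -r, rcond=None)[0]
    if np.linalg.norm(p_b) <= Delta:      # p^B is feasible, take it
        return p_b
    # Otherwise walk from p^U toward p^B until hitting the boundary:
    # solve ||p_u + t (p_b - p_u)|| = Delta for t in (0, 1).
    d = p_b - p_u
    a, b, c = d @ d, 2 * p_u @ d, p_u @ p_u - Delta**2
    t = (-b + np.sqrt(b**2 - 4 * a * c)) / (2 * a)
    return p_u + t * d
```

In the sparse case only the two matrix products and the least-squares solve need to change; the path logic stays the same.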
More on point 4.
In order to apply the method we need to compute two vectors: the steepest descent step $p^U$ and the Gauss-Newton step $p^B$. Computing $p^U$ requires only products of $J$ and $J^T$ with vectors, so the Jacobian never has to be stored as a dense matrix; in scipy it could be a LinearOperator.
To compute the Gauss-Newton step we need to solve the equation
\begin{equation}
J^T J p^B = -J^T r
\end{equation}
But this is equivalent to solving
\begin{equation}
J p^B = -r
\end{equation}
in the least squares sense. For large sparse matrices this problem can be readily solved by the routines lsqr or lsmr available in scipy.
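As a sketch (the problem data here are random and purely illustrative), the Gauss-Newton step for a sparse Jacobian could be computed with the existing scipy machinery like this:

```python
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import lsmr

rng = np.random.RandomState(0)
m, n = 1000, 200
J = sparse_random(m, n, density=0.05, random_state=rng, format="csr")
r = rng.randn(m)

# Gauss-Newton step: solve J p = -r in the least-squares sense, which is
# equivalent to the normal equations J^T J p = -J^T r but avoids forming
# the (possibly much denser and worse-conditioned) matrix J^T J.
p_b = lsmr(J, -r)[0]
```

lsmr only needs products of $J$ and $J^T$ with vectors, so it works equally well when the Jacobian is given as a LinearOperator.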
More on point 5.
I admit that this point requires further research. I found three sources where some kind of gradient projection method is proposed.
In [5] the authors suggest using a rectangular trust region. At each step the intersection of the trust region and the constraint box forms an effective rectangular trust region in which every point is feasible, and the standard approach described above is applied. If the constraint on some variable is active and the gradient along this variable has the appropriate sign, that variable is eliminated from the optimization at the current step.
In [6] the author suggests using the projected gradient and the reduced Hessian, running the standard dogleg procedure with these modified matrices to find the optimal step, and finally projecting it onto the feasible region.
In [7] the authors suggest first finding the projection of the Gauss-Newton step onto the feasible region. For the gradient direction they suggest finding the generalized Cauchy point (the first minimizer of the quadratic model along the gradient, constrained to lie in the trust region). The optimal step is then sought as a feasible convex combination of these two steps.
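A building block common to all three approaches is projecting points onto the box and detecting which bounds are active. A minimal sketch follows; the helper names and the tolerance are my own, not taken from any of the cited papers:

```python
import numpy as np

def project_into_box(x, lb, ub):
    """Project a point onto the box [lb, ub] (componentwise clipping)."""
    return np.clip(x, lb, ub)

def active_bounds(x, g, lb, ub, tol=1e-10):
    """Boolean mask of variables sitting on a bound whose (negative-gradient)
    descent direction points outside the box, as in the elimination rule of [5].
    g is the gradient of the objective at x."""
    at_lower = (x - lb <= tol) & (g > 0)   # descent direction -g pushes below lb
    at_upper = (ub - x <= tol) & (g < 0)   # descent direction -g pushes above ub
    return at_lower | at_upper
```

Variables flagged by the mask can be held fixed for the current step, and the dogleg procedure applied to the remaining free variables.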
Links
[1] Jorge J. Moré. The Levenberg-Marquardt Algorithm: Implementation and Theory
[2] Jorge Nocedal and Stephen J. Wright. Numerical Optimization, 2nd edition
[3] S. J. Wright and J. N. Holt. An Inexact Levenberg-Marquardt Algorithm for Large Sparse Nonlinear Least Squares
[4] http://users.ics.forth.gr/~argyros/mypapers/2005_10_iccv_levenberg.pdf
[5] http://cs.uoi.gr/~voglis/dogbox.ps
[6] http://www.math.umt.edu/bardsley/papers/EMopt04.pdf
[7] http://www.optimization-online.org/DB_FILE/2009/12/2504.pdf