LSMR - Sparse Least Squares LSMR Solver
#include <spral_lsmr.h> /* or <spral.h> for all packages */
Purpose
This package uses the LSMR iterative method to solve sparse linear equations and sparse least-squares problems of the form:
where the \(m \times n\) matrix \(A\) may be square or rectangular, and may have any rank. The scalar \(\lambda\) is a damping parameter. If \(\lambda > 0\), the solution is regularized in the sense that a unique soluton always exists, and \(\|x\|_2\) is always bounded.
Preconditioning may be used to try to reduce the number of iterations. A suitable choice for the preconditioner depends on the user’s knowledge of \(A\). For a user-chosen \(n \times n\) nonsingular matrix \(P\), LSMR solves
The user must then recover the final solution \(x\) by computing \(x=Py\). \(P\) will be a good preconditioner if \(AP\) is significantly better conditioned than \(A\).
Reverse communication is used for preconditioning operations \(Pz\) and \(P^Tz\) and matrix-vector products of the form \(Av\) and \(A^Tu\).
The method used is based on the Golub-Kahan bidiagonalization process. It is algebraically equivalent to applying MINRES to the normal equation \((A^TA+\lambda^2I)x=A^Tb\) (or \(((AP)^T(AP)+\lambda^2I)y=(AP)^Tb\)), but has better numerical properties, especially if \(A\) is ill-conditioned.
Notation
In the rest of this documentation, we use the following notation:
and
Version history
- 2016-05-10 Version 1.0.0.
Initial release.
[For detailed history, see ChangeLog]
Subroutines
-
void spral_lsmr_default_options(struct spral_lsmr_options *options)
Initialises options to default values.
- Parameters:
options – data structure to be initialised.
-
int spral_lsmr_solve(int *action, int m, int n, double u[m], double v[n], double y[n], void **keep, struct spral_lsmr_options const *options, struct spral_lsmr_inform *inform, double *damp)
Solve the least squares problem. Uses reverse communication. Upon return the user must perform a task specified by the action parameter and recall the routine. Possible values of action and associated tasks are:
action
Task to be performed
0
None. Computation has terminated. Check inform.flag for reason.
1
The user must compute
\[v = v + P^TA^Tu\]without altering \(u\).
2
The user must compute
\[u = u + APv\]without altering \(v\).
3
The user may test for convergence.
To continue, recall without changing any of the arguments.
- Parameters:
action – reverse communication task (see table above). Must be 0 on first call.
m – number of rows in \(A\).
n – number of columns in \(A\).
u – the vector \(u\). Must contain \(b\) on first call.
v – the vector \(v\).
y – the current solution \(y\) of the preconditioned problem (note \(x=Py\)).
keep – private internal data for LSMR.
options – controls execution of algorithm.
inform – information about execution of algorithm. Must not be changed by the user.
damp – Damping parameter \(\lambda\).
- Returns:
Value of
spral_lsmr_inform.flag
.
-
int lsmr_free(void **keep)
Free memory allocated in keep. If a series of problems is being solved sequentially, the same keep may be used without calling
spral_lsmr_free()
between each solution.- Parameters:
keep – private data to be freed.
- Returns:
0 on success, or Fortran stat parameter on failed deallocation.
Derived types
- struct spral_lsmr_options
Specify options used by LSMR algorithm.
-
int print_freq_head
Frequency of printing heading information (that is, how many lines are printed before the heading information is reprinted). Default is 20.
-
int print_freq_itn
Frequency of printing status. There is printing on each of the first print_freq_itn iterations and then printing every print_freq_itn iterations. Default is 10.
-
int unit_diagnostics
Fortran unit for diagnostic printing. Printing is suppressed if negative. Default is 6.
-
int unit_error
Fortran unit for printing error messages. Printing is suppressed if negative. Default is 6.
-
double atol
Relative error in \(A\). i.e. if \(A\) is accurate to about 6 digits, set atol to 1.0e-6. Only used if
spral_lsmr_options.ctest
=3. Default issqrt(DBL_EPSILON)
.
-
double btol
Relative error in \(b\). i.e. if \(b\) is accurate to about 6 digits, set btol to 1.0e-6. Only used if
spral_lsmr_options.ctest
=3. Default issqrt(DBL_EPSILON)
.
-
double conlim
Upper limit on \(cond(\bar{A})\), apparent condition number of \(\bar{A}\). Only used if
spral_lsmr_options.ctest
=3. Default is1/(10*sqrt(DBL_EPSILON))
:
-
int ctest
Convergence test to use. Options are:
1
User to test convergence (action=3). Without computation of norms in inform.
2
User to test convergence (action=3). With computation of norms in inform.
3 (default)
Convergence test of Fond and Saunders (in preconditioner norm).
Default is 3.
-
int itnlim
Maximum number of iterations. If negative, the limit used is \(4n\). Default is -1.
-
int itn_test
Number of iterations between user convergence tests. If negative, use \(\min(n,10)\). Default is -1.
-
int localSize
Number of historical vectors to use for reorthogonalization. Default is 0.
-
int print_freq_head
- struct spral_lsmr_inform
Information about progress of algorithm.
-
double condAP
Estimate of \(cond(\bar{A})\). A very high value of condAP may again indicate an error in the products with \(A\), \(A^T\), \(P\), or \(P^T\). A negative value indicates that no estimate is currently available. This component is not used if
spral_lsmr_options.ctest
=1.
-
int flag
Exit status of algorithm. See table below.
-
int itn
Number of iterations performed
-
double normAP
Estimate of Frobenius norm of \(\bar{A}\). If \(\lambda\) is small and the columns of \(AP\) have all been scaled to have length 1.0, normAP should increase to roughly \(\sqrt{n}\). A radically different value for normAP may indicate an error in the user-supplied products with \(A\), \(A^T\), \(P\), or \(P^T\). A negative value indicates that no estimate is currently available. This component is not used if
spral_lsmr_options.ctest
=1.
-
double normAP
Estimate of \(\|\bar{A}^T\bar{r}\|_2\), (normal equations residual). This should be small in all cases. Note that normAPr will often be smaller than the true value. A negative value indicates that no estimate is currently available. This component is not used if
spral_lsmr_options.ctest
=1.
-
double normr
Estimate of \(\|\bar{r}\|_2\). This will be small if \(Ax = b\) has a solution. A negative value indicates that no estimate is currently available. This component is not used if
spral_lsmr_options.ctest
=1.
-
double normy
Estimate of \(\|y\|_2\). A negative value indicates that no estimate is currently available. This component is not used if
spral_lsmr_options.ctest
=1.
-
int stat
The Fortran stat parameter in the event of a failed allocation.
inform.flag
Interpretation
0
\(x = 0.0\) is the exact solution. No iterations were performed.
1
The equations \(Ax = b\) are probably compatible. \(\|Ax - b\|_2\) is sufficiently small, given the values of
spral_lsmr_options.atol
andspral_lsmr_options.btol
. (spral_lsmr_options.ctest
=3 only).2
If damp is not present or is zero then the system \(Ax = b\) is probably not compatible. A least-squares solution has been obtained that is sufficiently accurate, given the value of
spral_lsmr_options.atol
. Otherwise, damped least-squares solution has been obtained that is sufficiently accurate, given the value ofspral_lsmr_options.atol
. (spral_lsmr_options.ctest
=3 only).3
An estimate of cond(\(\bar{A}\)) has exceeded
spral_lsmr_options.conlim
. The system \(Ax = b\) appears to be ill-conditioned, or there could be an error in the products with \(A\), \(A^T\), \(P\), or \(P^T\). (spral_lsmr_options.ctest
=3 only).4
\(\|APy - b \|_2\) is small enough for this machine. (
spral_lsmr_options.ctest
=3 only).5
The least-squares solution is good enough for this machine. (
spral_lsmr_options.ctest
=3 only).6
The estimate
spral_lsmr_inform.condAP
appears to be too large for this machine. (spral_lsmr_options.ctest
=3 only).7
The iteration limit
spral_lsmr_options.itnlim
has been reached.8
An array allocation failed.
9
An array deallocation failed.
10
Either m<0 or n<0.
-
double condAP
Example
The following code illustrates the use of LMSR
/* examples/C/lsmr.c - Example code for SPRAL_LSMR package */
#include "spral.h"
#include <stdlib.h>
#include <stdio.h>
void matrix_mult(int m, int n, int const *ptr, int const *row,
double const *val, double const *v, double *u);
void matrix_mult_trans(int m, int n, int const *ptr, int const *row,
double const *val, double const *u, double *v);
int main(void) {
/* Derived types */
void *keep;
struct spral_lsmr_options options;
struct spral_lsmr_inform inform;
/* Initialize derived types */
keep = NULL;
spral_lsmr_default_options(&options);
options.unit_diagnostics = -1; /* Switch off diagnostic printing */
/* Data for matrix:
* ( 1.0 -1.0 )
* ( 2.0 )
* ( 2.0 2.0 )
* ( 5.0 3.0 -2.0 )
* ( 6.0 ) */
const int m = 5, n = 3;
int ptr[] = { 0, 3, 5, 9 };
int row[] = { 0, 2, 3, 1, 3, 0, 2, 3, 4 };
double val[] = { 1.0, 2.0, 5.0, 2.0, 3.0, -1.0, 2.0, -2.0, 6.0 };
/* Data for rhs b */
double b[] = { 1.0, 1.0, 1.0, 1.0, 1.0 };
/* prepare for LSMR calls (using no preconditioning) */
int action = 0;
double u[m], v[n], x[n];
for(int i=0; i<m; ++i) u[i] = b[i];
bool done = false;
while(!done) {
spral_lsmr_solve(&action, m, n, u, v, x, &keep, &options, &inform, NULL);
switch(action) {
case 0: /* we are done */
printf("Exit LSMR with inform.flag = %d and inform.itn = %d\n",
inform.flag, inform.itn);
printf("LS solution is:\n");
for(int i=0; i<n; ++i) printf(" %10.2f", x[i]);
printf("\n");
done = true;
break;
case 1: /* Compute v = v + A'*u without altering u */
matrix_mult_trans(m, n, ptr, row, val, u, v);
break;
case 2: /* Compute u = u + A*v without altering v */
matrix_mult(m, n, ptr, row, val, v, u);
break;
}
}
spral_lsmr_free(&keep);
return 0; // Success
}
/* Takes b and computes u = u + A * v (A in CSC format) */
void matrix_mult(int m, int n, int const *ptr, int const *row,
double const *val, double const *v, double *u) {
for(int j=0; j<n; ++j) {
for(int k=ptr[j]; k<ptr[j+1]; ++k) {
int i = row[k];
u[i] += val[k]*v[j];
}
}
}
/* Takes b and computes v = v + A^T * u (A in CSC format) */
void matrix_mult_trans(int m, int n, int const *ptr, int const *row,
double const *val, double const *u, double *v) {
for(int j=0; j<n; ++j) {
for(int k=ptr[j]; k<ptr[j+1]; ++k) {
int i = row[k];
v[j] += val[k]*u[i];
}
}
}
This returns the following output:
Exit LSMR with inform.flag = 2 and inform.itn = 3
LS solution is:
0.18 0.26 0.17
Method
Algorithm
The method used is based on the Golub-Kahan bidiagonalization process. It is algebraically equivalent to applying MINRES to the normal equation \((A^TA+\lambda^2I)x=A^Tb\) (or \(((AP)^T(AP)+\lambda^2I)y=(AP)^Tb\), \(Py = x\), if preconditioning is used), but has better numerical properties, especially if \(A\) is ill-conditioned. Full details may be found in [1].
Scaling
LSMR uses an iterative method to approximate the solution. The number of iterations required to reach a certain accuracy depends strongly on the scaling of the problem. Poor scaling of the rows or columns of \(A\) should therefore be avoided where possible. For example, in problem 1 the solution is unaltered by row-scaling. If a row of \(A\) is very small or large compared to the other rows of \(A\), the corresponding row of \(( A\; b )\) should be scaled up or down.
In problems 1 and 2, the solution \(x\) is easily recovered following column-scaling. Unless better information is known, the nonzero columns of \(A\) should be scaled so that they all have the same Euclidean norm (e.g., 1.0). In problem 3, there is no freedom to re-scale if damp is nonzero. However, the value of damp should be assigned only after attention has been paid to the scaling of \(A\).
The parameter damp is intended to help regularize ill-conditioned
systems, by preventing the true solution from being very large. Another
aid to regularization is provided by the spral_lsmr_inform.condAP
,
which may be used to terminate iterations before the computed solution becomes
very large.
Initial estimate
Note that \(x\) (or \(y\) for the preconditioned problem) is not an input parameter. If some initial estimate \(x_0\) of \(x\) is known and if \(\lambda = 0\), one could proceed as follows:
Compute a residual vector \(r_0 = b - Ax_0\).
Use LSMR to solve the system \(A \delta x = r_0\).
Add the correction \(\delta x\) to obtain a final solution \(x = x_0 + \delta x\).
This can be generalized for the preconditioned case. The guess \(x_0\) has
to be available before and after the calls to spral_lsmr_solve()
.
To judge the benefits, suppose spral_lsmr_solve()
takes \(k_1\)
iterations to solve \(Ax = b\) and \(k_2\) iterations to solve
\(A \delta x = r_0\). If \(x_0\) is “good”, \(\|r_0\|_2\) will be
smaller than \(\|b\|_2\). If the same stopping tolerances
spral_lsmr_options.atol
and spral_lsmr_options.btol
are
used for each system, \(k_1\) and \(k_2\) will
be similar, but the final solution \(x = x_0 + \delta x\) should be more
accurate. The only way to reduce the total work is to use a larger stopping
tolerance for the second system. If some value
spral_lsmr_options.btol
is suitable for
\(Ax=b\), the larger value
\(\mathrm{spral\_lsmr\_options.btol}*\|b\|_2 / \|r_0\|_2\) should be
suitable for \(A \delta x = r_0\).