L2RT#

purpose#

The l2rt package uses a Krylov-subspace iteration to find an approximation of the global minimizer of the regularized linear Euclidean-norm objective function. The aim is to minimize the objective function

\[r(x) = \sqrt{\|Ax - b\|_2^2 + \mu \|x\|_2^2} + \frac{\sigma}{p} \|x\|_2^p,\]
where the \(\ell_2\)-norm of \(x\) is defined to be \(\|x\|_2 = \sqrt{x^T x}\), and where the shift \(\mu \geq 0\), weight \(\sigma > 0\) and power \(p \geq 2\). The method may be suitable for large problems as no factorization is required. Reverse communication is used to obtain matrix-vector products of the form \(u + A v\) and \(v + A^T u.\)

See Section 4 of $GALAHAD/doc/l2rt.pdf for additional details.

method#

The required solution \(x\) necessarily satisfies the optimality condition \(A^T ( A x - b ) + \lambda x = 0\), where \(\lambda = \mu + \sigma \|x\|_2^{p-2} \sqrt{\| A x - b \|_2^2 + \mu \|x\|_2^2}.\)

The method is iterative. Starting with the vector \(u_1 = b\), a bi-diagonalisation process is used to generate the vectors \(v_k\) and \(u_k+1\) so that the \(n\) by \(k\) matrix \(V_k = ( v_1 \ldots v_k)\) and the \(m\) by \((k+1)\) matrix \(U_k = ( u_1 \ldots u_{k+1})\) together satisfy

\[A V_k = U_{k+1} B_k \;\;\mbox{and}\;\; b = \|b\|_2 U_{k+1} e_1,\]
where \(B_k\) is \((k+1)\) by \(k\) and lower bi-diagonal, \(U_k\) and \(V_k\) have orthonormal columns and \(e_1\) is the first unit vector. The solution sought is of the form \(x_k = V_k y_k\), where \(y_k\) solves the bi-diagonal regularized Euclidean-norm problem
\[\min \sqrt{\| B_k y - \|b\| e_1 \|_2^2 + \mu \|y\|_2^2} + \frac{1}{p} \sigma \| y \|_2^p.\;\;\mbox{(1)}\]

To solve (1), the optimality conditions

\[( B_k^T ( B_k^{} y(\lambda) - \|b\| e_1^{} ) + \lambda y(\lambda) = 0,\]
where
\[\lambda = \mu + \sigma \|y(\lambda)\|_2^{p-2} \sqrt{ \| B_k y(\lambda) - \|b\| e_1 \|_2^2 + \mu \|y(\lambda)\|_2^2},\]
are used as the basis of an iteration. The vector \(y(\lambda)\) is equivalently the solution to the regularized least-squares problem
\[\begin{split}\min \left \| \left(\begin{array}{c} B_k \\ \lambda^{1/2} I \end{array}\right) y - \|b\| e_1^{} \right \|_2.\end{split}\]
Thus, given an estimate \(\lambda \geq 0\), this regularized least-squares problem may be efficiently solved to give \(y(\lambda)\). It is then simply a matter of adjusting \(\lambda\) (for example by a Newton-like process) to solve the scalar nonlinear equation
\[\theta(\lambda) \equiv \sigma \|y(\lambda)\|_2^{p-2} \sqrt{ \| B_k y(\lambda) - \|b\| e_1 \|_2^2 + \mu \|y(\lambda)\|_2^2} + \mu - \lambda = 0.\;\;\mbox{(2)}\]
In practice (2) is reformulated, and a more rapidly converging iteration is used. Having found \(y_k\), a second pass in which \(x_k = V_k y_k\) is regenerated is needed—this need only be done once \(x_k\) has implicitly deemed to be sufficiently close to optimality. As this second pass is an additional expense, a record is kept of the optimal objective function values for each value of \(k\), and the second pass is only performed so far as to ensure a given fraction of the final optimal objective value. Large savings may be made in the second pass by choosing the required fraction to be significantly smaller than one.

Special code is used in the special case \(p=2\) as in this case the equation (2) significantly simplifies.

references#

A complete description of the un- an quadratically-regularized cases is given by

C. C. Paige and M. A. Saunders, ``LSQR: an algorithm for sparse linear equations and sparse least squares’’. ACM Transactions on Mathematical Software 8(1 (1982) 43–71,

and

C. C. Paige and M. A. Saunders, ``ALGORITHM 583: LSQR: an algorithm for sparse linear equations and sparse least squares’’. ACM Transactions on Mathematical Software 8(2) (1982) 195–209.

Additional details on the Newton-like process needed to determine \(\lambda\) and other details are described in

C. Cartis, N. I. M. Gould and Ph. L. Toint, ``Trust-region and other regularisation of linear least-squares problems’’. BIT 49(1) (2009) 21-53.

functions#

l2rt.initialize()#

Set default option values and initialize private data

Returns:

optionsdict
dictionary containing default control options:
errorint

error and warning diagnostics occur on stream error.

outint

general output occurs on stream out.

problemint

unit to write problem data into file problem_file.

print_levelint

the level of output required is specified by print_level. Possible values are

  • <=0

    gives no output,

  • 1

    gives a one-line summary for every iteration.

  • 2

    gives a summary of the inner iteration for each iteration.

  • >=3

    gives increasingly verbose (debugging) output.

start_printint

any printing will start on this iteration.

stop_printint

any printing will stop on this iteration.

print_gapint

the number of iterations between printing.

itminint

the minimum number of iterations allowed (-ve = no bound).

itmaxint

the maximum number of iterations allowed (-ve = no bound).

bitmaxint

the maximum number of Newton inner iterations per outer iteration allowed (-ve = no bound).

extra_vectorsint

the number of extra work vectors of length n used.

stopping_ruleint

the stopping rule used: 0=1.0, 1=norm step, 2=norm step/sigma (NOT USED).

freqint

frequency for solving the reduced tri-diagonal problem (NOT USED).

stop_relativefloat

the iteration stops successfully when \(\|A^Tr\|\) is less than max( stop_relative * \(\|A^Tb \|\), stop_absolute ), where \(r = A x - b\).

stop_absolutefloat

see stop_relative.

fraction_optfloat

an estimate of the solution that gives at least fraction_opt times the optimal objective value will be found.

time_limitfloat

the maximum elapsed time allowed (-ve means infinite).

space_criticalbool

if space_critical is True, every effort will be made to use as little space as possible. This may result in longer computation time.

deallocate_error_fatalbool

if deallocate_error_fatal is True, any array/pointer deallocation error will terminate execution. Otherwise, computation will continue.

prefixstr

all output lines will be prefixed by the string contained in quotes within prefix, e.g. ‘word’ (note the qutoes) will result in the prefix word.

l2rt.load_options(options=None)#

Import control options into internal storage prior to solution.

Parameters:

optionsdict, optional

dictionary of control options (see l2rt.initialize).

l2rt.solve_problem(status, m, n, power, weight, shift, u, v)#

Find the global moinimizer of the regularized objective function \(r(x)\).

Parameters:

statusint

holds the entry status. Possible values are

  • 1

an initial entry with u set to \(b\).

  • other

the value returned from the previous call, see Returns below.

mint

holds the number of residuals, i.e., the number of rows of \(A\).

nint

holds the number of variables, i.e., the number of columns of \(A\).

powerfloat

holds the regularization power, \(p \geq 2\).

weightfloat

holds the strictly positive regularization weight, \(\sigma\).

shiftfloat

holds the positive shift, \(\mu\).

undarray(m)

holds the result vector when initial or return status = 1, 2, 4 or 5 (see below).

vndarray(n)

holds the result vector when return status = 3 (see below).

Returns:

statusint

holds the exit status. Possible values are

  • 0

the solution has been found, no further reentry is required

  • 2

the sum \(u + A v\), involving the vectors \(u\) and \(v\) returned in u and v, must be formed, the result placed in u, and the function recalled with status set to 2.

  • 3

the sum \(v + A^T u\), involving the vectors \(u\) and \(v\) returned in u and v, must be formed, the result placed in v, and the function recalled with status set to 3.

  • 4

the iteration must be restarted by setting u to \(b\), and the function recalled with status set to 4.

  • <0

an error occurred, see status in l2rt.information for further details.

xndarray(n)

holds the values of the approximate minimizer \(x\).

undarray(m)

holds the result vector \(u\).

vndarray(n)

holds the result vector \(v\).

[optional] l2rt.information()

Provide optional output information

Returns:

informdict
dictionary containing output information:
statusint

return status. Possible values are:

  • 0

    The run was successful.

  • -1

    An allocation error occurred. A message indicating the offending array is written on unit options[‘error’], and the returned allocation status and a string containing the name of the offending array are held in inform[‘alloc_status’] and inform[‘bad_alloc’] respectively.

  • -2

    A deallocation error occurred. A message indicating the offending array is written on unit options[‘error’] and the returned allocation status and a string containing the name of the offending array are held in inform[‘alloc_status’] and inform[‘bad_alloc’] respectively.

  • -3

    The restriction n > 0, m > 0, \(p \geq 2\) or \(\sigma > 0\) has been violated.

  • -18

    The iteration limit has been exceeded.

  • -25

    status is negative on entry.

alloc_statusint

the status of the last attempted allocation/deallocation.

bad_allocstr

the name of the array for which an allocation/deallocation error occurred.

iterint

the total number of iterations required.

iter_pass2int

the total number of pass-2 iterations required.

bitersint

the total number of inner iterations performed.

biter_minint

the smallest number of inner iterations performed during an outer iteration.

biter_maxint

the largest number of inner iterations performed during an outer iteration.

objfloat

the value of the objective function.

multiplierfloat

the multiplier, $lambda = mu + sigma |x|^{p-2} * sqrt{|Ax-b|^2 + m

x_normfloat

the Euclidean norm of \(x\).

r_normfloat

the Euclidean norm of \(Ax-b\).

Atr_normfloat

the Euclidean norm of \(A^T (Ax-b) + \lambda x\).

biter_meanfloat

the average number of inner iterations performed during an outer iteration.

l2rt.terminate()#

Deallocate all internal private storage.

example code#

from galahad import l2rt
import numpy as np
np.set_printoptions(precision=4,suppress=True,floatmode='fixed')
print("\n** python test: l2rt")

# set problem data

n = 50
m = 2 * n
shift = 1.0
power = 3.0
weight = 1.0
u = np.empty(m)
v = np.empty(n)
b = np.empty(m)
b[:] = 1.0

# set default options
options = l2rt.initialize()

# set some non-default options
options['unitm'] = False
options['print_level'] = 0
#print("options:", options)

# load non-default options
l2rt.load_options(options)

# problem

print(" solve problem")
status = 1
u = b
while status > 0:
  status, x, u, v = l2rt.solve_problem(status, m, n, power, weight, shift, u, v)
  if status == 2: # u -> u + A v
    for i in range(n):
      u[i] = u[i] + v[i]
      u[n+i] = u[n+i] + (i+1) * v[i]
  elif status == 3: # v -> v + A^T u
    for i in range(n):
      v[i] = v[i] + u[i] + (i+1) * u[n+i]
  elif status == 4: # reset r to b
    u = b

# get information

inform = l2rt.information()
print(" optimal f: %.4f" % inform['obj'])
#print(" x:",x)
print('** l2rt exit status:', inform['status'])

# deallocate internal data

l2rt.terminate()

This example code is available in $GALAHAD/src/l2rt/Python/test_l2rt.py .