GLRT#

purpose#

The glrt package uses a Krylov-subspace iteration to find an approximation of the global minimizer of regularized quadratic objective function. The aim is to minimize the regularized quadratic objective function

\[r(x) = f + g^T x + \frac{1}{2} x^T H x + \frac{\sigma}{p} \|x\|_{M}^p,\]
where the weight \(\sigma \geq 0\), the power \(p \geq 2\), and where the \(M\)-norm of \(x\) is defined to be \(\|x\|_{M} = \sqrt{x^T M x}\). The method may be suitable for large problems as no factorization of \(H\) is required. Reverse communication is used to obtain matrix-vector products of the form \(H z\) and \(M^{-1} z.\)

See Section 4 of $GALAHAD/doc/glrt.pdf for additional details.

method#

The required solution \(x\) necessarily satisfies the optimality condition \(H x + \lambda M x + g = 0\), where \(\lambda = \sigma \|x\|_{M}^{p-2}\). In addition, the matrix \(H + \lambda M\) will be positive semi-definite.

The method is iterative. Starting with the vector \(M^{-1} g\), a matrix of Lanczos vectors is built one column at a time so that the \(k\)-th column is generated during iteration \(k\). These columns span a so-called Krylov space. The resulting \(n\) by \(k\) matrix \(Q_k \) has the property that \(Q_{k}^T H Q_k^{} = T_{k}^{}\), where \(T_k\) is tridiagonal. An approximation to the required solution may then be expressed formally as

\[x_{k+1} = Q_k y_k\]
where \(y_k \) solves the ``tridiagonal’’ subproblem of minimizing
\[\frac{1}{2} y^T T_k y + \|g\|_{M^{-1} } e_{1}^T y + \frac{1}{p} \sigma \| y \|_2^p,\;\;\mbox{(1)}\]
where \(e_1\) is the first unit vector.

To minimize (1), the optimality conditions

\[( T_k + \lambda I ) y(\lambda) = - g,\;\;\mbox{(2)}\]
where \(\lambda = \sigma \|y(\lambda)\|_{M}^{p-2}\) are used as the basis of an iteration. Specifically, given an estimate \(\lambda\) for which \(T_k + \lambda I\) is positive definite, the tridiagonal system (2) may be efficiently solved to give \(y(\lambda)\). It is then simply a matter of adjusting \(\lambda\) (for example by a Newton-like process) to solve the scalar nonlinear equation
\[\theta(\lambda) \equiv \|y(\lambda)\|_{M}^{p-2} - \frac{\lambda}{\sigma} = 0.\;\;\mbox{(3)}\]
In practice (3) is reformulated, and a more rapidly converging iteration is used.

It is possible to measure the optimality measure \(\|H x + \lambda M x + g\|_{M^{-1}}\) without computing \(x_{k+1}\), and thus without needing \(Q_k \). Once this measure is sufficiently small, a second pass is required to obtain the estimate \(x_{k+1} \) from \(y_k \). As this second pass is an additional expense, a record is kept of the optimal objective function values for each value of \(k\), and the second pass is only performed so far as to ensure a given fraction of the final optimal objective value. Large savings may be made in the second pass by choosing the required fraction to be significantly smaller than one.

Special code is used in the special case \(p=2\), as in this case a single pass suffices.

reference#

The method is described in detail in

C. Cartis, N. I. M. Gould and Ph. L. Toint, ``Adaptive cubic regularisation methods for unconstrained optimization. Part {I}: motivation, convergence and numerical results’’. Mathematical Programming 127(2) (2011) 245-295.

functions#

glrt.initialize()#

Set default option values and initialize private data

Returns:

optionsdict
dictionary containing default control options:
errorint

error and warning diagnostics occur on stream error.

outint

general output occurs on stream out.

problemint

unit to write problem data into file problem_file.

print_levelint

the level of output required is specified by print_level. Possible values are

  • <=0

    gives no output,

  • 1

    gives a one-line summary for every iteration.

  • 2

    gives a summary of the inner iteration for each iteration.

  • >=3

    gives increasingly verbose (debugging) output.

itmaxint

the maximum number of iterations allowed (-ve = no bound).

stopping_ruleint

the stopping rule used (see below). Possible values are:

  • 1

    stoping rule = norm of the step.

  • 2

    stopping rule is norm of the step / \(\sigma\).

  • other

    stopping rule = 1.0.

freqint

frequency for solving the reduced tri-diagonal problem.

extra_vectorsint

the number of extra work vectors of length n used.

ritz_printout_deviceint

the unit number for writing debug Ritz values.

stop_relativefloat

the iteration stops successfully when the gradient in the \(M^{-1}\) norm is smaller than max( stop_relative * min( 1, stopping_rule ) * norm initial gradient, stop_absolute ).

stop_absolutefloat

see stop_relative.

fraction_optfloat

an estimate of the solution that gives at least fraction_opt times the optimal objective value will be found.

rminvr_zerofloat

the smallest value that the square of the M norm of the gradient of the objective may be before it is considered to be zero.

f_0float

the constant term, f0, in the objective function.

unitmbool

is M the identity matrix ?.

impose_descentbool

is descent required i.e., should \(c^T x < 0\) ?.

space_criticalbool

if space_critical is True, every effort will be made to use as little space as possible. This may result in longer computation time.

deallocate_error_fatalbool

if deallocate_error_fatal is True, any array/pointer deallocation error will terminate execution. Otherwise, computation will continue.

print_ritz_valuesbool

should the Ritz values be written to the debug stream?.

ritz_file_namestr

name of debug file containing the Ritz values.

prefixstr

all output lines will be prefixed by the string contained in quotes within prefix, e.g. ‘word’ (note the qutoes) will result in the prefix word.

glrt.load_options(options=None)#

Import control options into internal storage prior to solution.

Parameters:

optionsdict, optional

dictionary of control options (see glrt.initialize).

glrt.solve_problem(status, n, power, weight, r, v)#

Find the global moinimizer of the regularized quadratic objective function \(r(x)\).

Parameters:

statusint

holds the entry status. Possible values are

  • 1

an initial entry with r set to \(g\).

  • 6

a restart entry with \(p\) and \(g\) unchanged but a larger weight \(\sigma\).

  • other

the value returned from the previous call, see Returns below.

nint

holds the number of variables.

powerfloat

holds the regularization power \(p \geq 2\).

weightfloat

holds the strinctly positive regularization weight \(\sigma\).

rndarray(n)

holds the values of the linear term \(g\) in the objective function when initial or return status = 1, 4 or 6 (see below).

vndarray(n)

holds the result vector when return status = 2 or 3 (see below).

Returns:

statusint

holds the exit status. Possible values are

  • 0

the solution has been found, no further reentry is required

  • 2

the inverse of \(M\) must be applied to the vector returned in v, the result placed in v, and the function recalled with status set to 2. This will only occur if options[‘unitm’] is False.

  • 3

the product of \(H\) with the vector returned in v must be formed, the result placed in v, and the function recalled with status set to 3.

  • 4

the iteration must be restarted by setting r to \(g\), and the function recalled with status set to 4.

  • <0

an error occurred, see status in glrt.information for further details.

xndarray(n)

holds the values of the approximate minimizer \(x\).

rndarray(n)

holds the values of the gradient \(g + Hx\) at the current \(x\).

vndarray(n)

holds the return vector when return status = 2 or 3 (see above).

[optional] glrt.information()

Provide optional output information

Returns:

informdict
dictionary containing output information:
statusint

return status. Possible values are:

  • 0

    The run was successful.

  • -1

    An allocation error occurred. A message indicating the offending array is written on unit options[‘error’], and the returned allocation status and a string containing the name of the offending array are held in inform[‘alloc_status’] and inform[‘bad_alloc’] respectively.

  • -2

    A deallocation error occurred. A message indicating the offending array is written on unit options[‘error’] and the returned allocation status and a string containing the name of the offending array are held in inform[‘alloc_status’] and inform[‘bad_alloc’] respectively.

  • -3

    The restriction n > 0, \(\sigma > 0\) or \(p \geq 2\) has been violated.

  • -7

    The objective function appears to be unbounded from below. This can only happen if \(p = 2\), and in this case the objective is unbounded along the arc x + t v, where x and v are as returned by glrt.solve_problem, as t goes to infinity.

  • -15

    \(M\) appears to be indefinite.

  • -18

    The iteration limit has been exceeded.

alloc_statusint

the status of the last attempted allocation/deallocation.

bad_allocstr

the name of the array for which an allocation/deallocation error occurred.

iterint

the total number of iterations required.

iter_pass2int

the total number of pass-2 iterations required.

objfloat

the value of the quadratic function.

obj_regularizedfloat

the value of the regularized quadratic function.

multiplierfloat

the multiplier, \(\sigma \|x\|^{p-2}\).

xpo_normfloat

the value of the norm \(\|x\|_M\).

leftmostfloat

an estimate of the leftmost generalized eigenvalue of the pencil \((H,M)\).

negative_curvaturebool

was negative curvature encountered ?.

hard_casebool

did the hard case occur ?.

glrt.terminate()#

Deallocate all internal private storage.

example code#

from galahad import glrt
import numpy as np
np.set_printoptions(precision=4,suppress=True,floatmode='fixed')
print("\n** python test: glrt")

# set problem data

n = 100
power = 3.0
v = np.empty(n)
hv = np.empty(n)
g = np.empty(n)
g[:] = 1.0

# set default options
options = glrt.initialize()

# set some non-default options
options['unitm'] = False
options['print_level'] = 0
#print("options:", options)

# load non-default options
glrt.load_options(options)

# first problem

print(" solve problem 1")
status = 1
weight = 1.0
r = g
while status > 0:
  status, x, r, v = glrt.solve_problem(status, n, power, weight, r, v)
  if status == 2: # precondition v -> v/2
    for i in range(n):
      v[i] = v[i] / 2.0;
  elif status == 3: # form v -> H v
    hv[0] = 2.0 * v[0] + v[1]
    for i in range(1,n-1):
       hv[i] = v[i-1] + 2.0 * v[i] + v[i+1]
    hv[n-1] = v[n-2] + 2.0 * v[n-1]
    for i in range(n):
      v[i] = hv[i]
  elif status == 4: # reset r
    r = g

# get information

inform = glrt.information()
print(" optimal f: %.4f" % inform['obj'])
#print(" x:",x)

# second problem, same data with larger weight

print("\n solve problem 2 with larger weight")
status = 6
weight = 10.0
r = g
while status > 0:
  status, x, r, v = glrt.solve_problem(status, n, power, weight, r, v)
  if status == 2: # precondition v -> v/2
    for i in range(n):
      v[i] = v[i] / 2.0;
  elif status == 3: # form v -> H v
    hv[0] =  2.0 * v[0] + v[1]
    for i in range(1,n-1):
       hv[i] = v[i-1] + 2.0 * v[i] + v[i+1]
    hv[n-1] = v[n-2] + 2.0 * v[n-1]
    for i in range(n):
      v[i] = hv[i]
  elif status == 4: # reset r
    r = g

# get information

inform = glrt.information()
print(" optimal f: %.4f" % inform['obj'])
#print(" x:",x)
print('** glrt exit status:', inform['status'])

# deallocate internal data

glrt.terminate()

This example code is available in $GALAHAD/src/glrt/Python/test_glrt.py .