GLTR#

purpose#

The gltr package uses a Krylov-subspace iteration to find an approximation to the global minimizer of a quadratic objective function within an ellipsoidal region; this is commonly known as the trust-region subproblem. The aim is to minimize the quadratic objective function

\[q(x) = f + g^T x + \frac{1}{2} x^T H x,\]
where the vector \(x\) is required to satisfy the ellipsoidal trust-region constraint \(\|x\|_{M} \leq \Delta\), where the \(M\)-norm of \(x\) is defined to be \(\|x\|_{M} = \sqrt{x^T M x}\), and where the radius \(\Delta > 0\). The method may be suitable for large problems as no factorization of \(H\) is required. Reverse communication is used to obtain matrix-vector products of the form \(H z\) and \(M^{-1} z.\)
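
For illustration, the following sketch evaluates \(q(x)\) and the \(M\)-norm constraint in NumPy for small, made-up data; the matrices \(H\) and \(M\), the vector \(g\), the constant \(f\) and the radius below are illustrative only and are not supplied by the package.

import numpy as np

# illustrative data (not part of the package): a small convex quadratic
n = 3
H = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])
M = np.eye(n)        # M defines the norm; the identity gives the usual 2-norm
g = np.ones(n)
f = 0.0
radius = 1.0

def q(x):
  # q(x) = f + g^T x + (1/2) x^T H x
  return f + g @ x + 0.5 * x @ H @ x

def m_norm(x):
  # ||x||_M = sqrt(x^T M x)
  return np.sqrt(x @ M @ x)

x = np.array([0.3, -0.2, 0.1])
print(q(x), m_norm(x) <= radius)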

See Section 4 of $GALAHAD/doc/gltr.pdf for additional details.

method#

The required solution \(x\) necessarily satisfies the optimality condition \(H x + \lambda M x + g = 0\), where \(\lambda \geq 0\) is a Lagrange multiplier corresponding to the constraint \(\|x\|_M \leq \Delta\). In addition, the matrix \(H + \lambda M\) will be positive semi-definite.
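
As a quick numerical check (a sketch, not part of the package interface), the residual of this optimality condition can be evaluated directly for a candidate solution \(x\) and multiplier \(\lambda\), e.g. using the multiplier reported by gltr.information() after a solve, provided the user holds the problem data \(H\), \(M\) and \(g\):

import numpy as np

def optimality_residual(H, M, g, x, lam):
  # measures || H x + lambda M x + g ||_2; this is near zero at the required solution
  return np.linalg.norm(H @ x + lam * (M @ x) + g)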

The method is iterative. Starting with the vector \(M^{-1} g\), a matrix of Lanczos vectors is built one column at a time so that the \(k\)-th column is generated during iteration \(k\). These columns span a so-called Krylov space. The resulting \(n\) by \(k\) matrix \(Q_k\) has the property that \(Q_k^T H Q_k = T_k\), where \(T_k\) is tridiagonal. An approximation to the required solution may then be expressed formally as

\[x_{k+1} = Q_k y_k\]
where \(y_k\) solves the “tridiagonal” subproblem of minimizing
\[\frac{1}{2} y^T T_k y + \|g\|_{M^{-1} } e_{1}^T y \;\;\; \mbox{subject to the constraint}\;\; \|y\|_2 \leq \Delta,\;\;\mbox{(1)}\]
and where \(e_1\) is the first unit vector.

If the solution to (1) lies interior to the constraint, the required solution \(x_{k+1}\) may simply be found as the \(k\)-th (preconditioned) conjugate-gradient iterate. This solution can be obtained without the need to access the whole matrix \(Q_k\). These conjugate-gradient iterates increase in \(M\)-norm, and thus once one of them exceeds \(\Delta\) in \(M\)-norm, the solution must occur on the constraint boundary. Thereafter, the solution to (1) is less easy to obtain, but an efficient inner iteration to solve (1) is nonetheless achievable because \(T_k \) is tridiagonal. It is possible to observe the optimality measure \(\|H x + \lambda M x + g\|_{M^{-1}}\) without computing \(x_{k+1}\), and thus without needing \(Q_k \). Once this measure is sufficiently small, a second pass is required to obtain the estimate \(x_{k+1} \) from \(y_k \). As this second pass is an additional expense, a record is kept of the optimal objective function values for each value of \(k\), and the second pass is only performed so far as to ensure a given fraction of the final optimal objective value. Large savings may be made in the second pass by choosing the required fraction to be significantly smaller than one.

A cheaper alternative is to use the Steihaug-Toint strategy, which is simply to stop at the first boundary point encountered along the piecewise linear path generated by the conjugate-gradient iterates. Note that if \(H\) is significantly indefinite, this strategy often produces a far-from-optimal point, but it is effective when \(H\) is positive definite or almost so.
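
If this cheaper strategy is wanted, it can be requested through the steihaug_toint control option described in the functions section below; a minimal sketch:

from galahad import gltr

options = gltr.initialize()
options['steihaug_toint'] = True   # stop at the first boundary point encountered
gltr.load_options(options)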

reference#

The method is described in detail in

N. I. M. Gould, S. Lucidi, M. Roma and Ph. L. Toint, “Solving the trust-region subproblem using the Lanczos method”. SIAM Journal on Optimization 9(2) (1999), 504-525.

functions#

gltr.initialize()#

Set default option values and initialize private data

Returns:

options : dict
dictionary containing default control options:
error : int

error and warning diagnostics occur on stream error.

out : int

general output occurs on stream out.

problem : int

unit to write problem data into file problem_file.

print_level : int

the level of output required is specified by print_level. Possible values are

  • <=0

    gives no output,

  • 1

    gives a one-line summary for every iteration.

  • 2

    gives a summary of the inner iteration for each iteration.

  • >=3

    gives increasingly verbose (debugging) output.

itmax : int

the maximum number of iterations allowed (-ve = no bound).

Lanczos_itmax : int

the maximum number of iterations allowed once the boundary has been encountered (-ve = no bound).

extra_vectors : int

the number of extra work vectors of length n used.

ritz_printout_device : int

the unit number for writing debug Ritz values.

stop_relative : float

the iteration stops successfully when the \(M^{-1}\)-norm of the gradient is smaller than max( stop_relative * \(M^{-1}\)-norm of the initial gradient, stop_absolute ).

stop_absolute : float

see stop_relative.

fraction_opt : float

an estimate of the solution that gives at least fraction_opt times the optimal objective value will be found.

f_min : float

the iteration stops if the objective-function value is lower than f_min.

rminvr_zero : float

the smallest value that the square of the \(M\)-norm of the gradient of the objective may be before it is considered to be zero.

f_0 : float

the constant term, \(f\), in the objective function.

unitm : bool

is \(M\) the identity matrix?

steihaug_toint : bool

should the iteration stop when the trust-region boundary is first encountered?

boundary : bool

is the solution thought to lie on the constraint boundary?

equality_problem : bool

is the solution required to lie on the constraint boundary?

space_critical : bool

if space_critical is True, every effort will be made to use as little space as possible. This may result in longer computation time.

deallocate_error_fatal : bool

if deallocate_error_fatal is True, any array/pointer deallocation error will terminate execution. Otherwise, computation will continue.

print_ritz_values : bool

should the Ritz values be written to the debug stream?

ritz_file_name : str

name of debug file containing the Ritz values.

prefix : str

all output lines will be prefixed by the string contained in quotes within prefix, e.g. ‘word’ (note the quotes) will result in the prefix word.
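
As a usage sketch (the particular values here are illustrative), the default dictionary can be retrieved and selectively overridden before it is passed to gltr.load_options:

from galahad import gltr

options = gltr.initialize()      # dictionary of default control options
options['print_level'] = 1       # one-line summary for every iteration
options['fraction_opt'] = 0.99   # accept 99% of the optimal objective value
gltr.load_options(options)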

gltr.load_options(options=None)#

Import control options into internal storage prior to solution.

Parameters:

options : dict, optional

dictionary of control options (see gltr.initialize).

gltr.solve_problem(status, n, radius, r, v)#

Find the global minimizer of the quadratic objective function \(q(x)\) within a trust-region of radius \(\Delta\).

Parameters:

status : int

holds the entry status. Possible values are

  • 1

an initial entry with r set to \(g\).

  • 4

a restart entry with \(g\) unchanged but a smaller radius \(\Delta\).

  • other

the value returned from the previous call, see Returns below.

n : int

holds the number of variables.

radius : float

holds the strictly positive trust-region radius, \(\Delta\).

r : ndarray(n)

holds the values of the linear term \(g\) in the objective function when initial or return status = 1, 4 or 5 (see below).

v : ndarray(n)

holds the result vector when return status = 2 or 3 (see below).

Returns:

status : int

holds the exit status. Possible values are

  • 0

the solution has been found, no further reentry is required

  • 2

the inverse of \(M\) must be applied to the vector returned in v, the result placed in v, and the function recalled with status set to 2. This will only occur if options[‘unitm’] is False.

  • 3

the product of \(H\) with the vector returned in v must be formed, the result placed in v, and the function recalled with status set to 3.

  • 5

the iteration must be restarted by setting r to \(g\), and the function recalled with status set to 5.

  • <0

an error occurred, see status in gltr.information for further details.

x : ndarray(n)

holds the values of the approximate minimizer \(x\).

r : ndarray(n)

holds the values of the gradient \(g + Hx\) at the current \(x\).

v : ndarray(n)

holds the return vector when return status = 2 or 3 (see above).
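
These status values drive a reverse-communication loop. A minimal skeleton is sketched below; apply_M_inverse and apply_H are hypothetical user-supplied operations, the remaining data is assumed to have been set up as in gltr.load_options, and a complete, runnable version appears in the example code section.

status = 1                     # initial entry with r set to g
while status > 0:
  status, x, r, v = gltr.solve_problem(status, n, radius, r, v)
  if status == 2:              # apply the preconditioner: v <- M^{-1} v
    v = apply_M_inverse(v)
  elif status == 3:            # form the product: v <- H v
    v = apply_H(v)
  elif status == 5:            # restart: reset r to the original gradient g
    r = g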

[optional] gltr.information()#

Provide optional output information

Returns:

inform : dict
dictionary containing output information:
status : int

return status. Possible values are:

  • 0

    The run was successful.

  • -1

    An allocation error occurred. A message indicating the offending array is written on unit options[‘error’], and the returned allocation status and a string containing the name of the offending array are held in inform[‘alloc_status’] and inform[‘bad_alloc’] respectively.

  • -2

    A deallocation error occurred. A message indicating the offending array is written on unit options[‘error’] and the returned allocation status and a string containing the name of the offending array are held in inform[‘alloc_status’] and inform[‘bad_alloc’] respectively.

  • -3

    The restriction n > 0 or \(\Delta > 0\) has been violated.

  • -15

    \(M\) appears to be indefinite.

  • -18

    The iteration limit has been exceeded.

  • -30

The trust-region boundary has been encountered in Steihaug-Toint mode.

  • -31

    The function value is smaller than options[‘f_min’].

alloc_status : int

the status of the last attempted allocation/deallocation.

bad_alloc : str

the name of the array for which an allocation/deallocation error occurred.

iter : int

the total number of iterations required.

iter_pass2 : int

the total number of pass-2 iterations required if the solution lies on the trust-region boundary.

obj : float

the value of the quadratic function.

multiplier : float

the Lagrange multiplier corresponding to the trust-region constraint.

mnormx : float

the \(M\)-norm of \(x\), \(\|x\|_M\).

piv : float

the latest pivot in the Cholesky factorization of the Lanczos tridiagonal matrix.

curv : float

the most negative curvature encountered.

rayleigh : float

the current Rayleigh quotient.

leftmost : float

an estimate of the leftmost generalized eigenvalue of the pencil \((H,M)\).

negative_curvature : bool

was negative curvature encountered?

hard_case : bool

did the hard case occur?
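
As a usage sketch, a few of these fields might be examined after a successful solve:

inform = gltr.information()
if inform['status'] == 0:
  print('objective value', inform['obj'])
  print('multiplier     ', inform['multiplier'])
  print('||x||_M        ', inform['mnormx'])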

gltr.terminate()#

Deallocate all internal private storage.

example code#

from galahad import gltr
import numpy as np
np.set_printoptions(precision=4,suppress=True,floatmode='fixed')
print("\n** python test: gltr")

# set problem data

n = 100
v = np.empty(n)
hv = np.empty(n)
g = np.empty(n)
g[:] = 1.0

# set default options
options = gltr.initialize()

# set some non-default options
options['unitm'] = False
options['print_level'] = 0
#print("options:", options)

# load non-default options
gltr.load_options(options)

# first problem

print(" solve problem 1")
status = 1
radius = 1.0
r = g
while status > 0:
  status, x, r, v = gltr.solve_problem(status, n, radius, r, v)
  if status == 2: # precondition v -> v/2
    for i in range(n):
      v[i] = v[i] / 2.0
  elif status == 3: # form v -> H v
    hv[0] = 2.0 * v[0] + v[1]
    for i in range(1,n-1):
      hv[i] = v[i-1] + 2.0 * v[i] + v[i+1]
    hv[n-1] = v[n-2] + 2.0 * v[n-1]
    for i in range(n):
      v[i] = hv[i]
  elif status == 5: # reset r
    r = g

# get information

inform = gltr.information()
print(" optimal f: %.4f" % inform['obj'])
#print(" x:",x)

# second problem, same data with smaller radius

print("\n solve problem 2 with smaller radius")
status = 4
radius = 0.1
r = g
while status > 0:
  status, x, r, v = gltr.solve_problem(status, n, radius, r, v)
  if status == 2: # precondition v -> v/2
    for i in range(n):
      v[i] = v[i] / 2.0
  elif status == 3: # form v -> H v
    hv[0] = 2.0 * v[0] + v[1]
    for i in range(1,n-1):
      hv[i] = v[i-1] + 2.0 * v[i] + v[i+1]
    hv[n-1] = v[n-2] + 2.0 * v[n-1]
    for i in range(n):
      v[i] = hv[i]
  elif status == 5: # reset r
    r = g

# get information

inform = gltr.information()
print(" optimal f: %.4f" % inform['obj'])
#print(" x:",x)
print('** gltr exit status:', inform['status'])

# deallocate internal data

gltr.terminate()

This example code is available in $GALAHAD/src/gltr/Python/test_gltr.py .