GLRT#

purpose#

The glrt package uses a Krylov-subspace iteration to find an approximation of the global minimizer of regularized quadratic objective function. The aim is to minimize the regularized quadratic objective function

\[r(x) = f + g^T x + \frac{1}{2} x^T H x + \frac{\sigma}{p} \|x\|_{M}^p,\]

where the weight $\sigma \geq 0$, the power $p \geq 2$, and where the $M$-norm of $x$ is defined to be $\|x\|_{M} = \sqrt{x^T M x}$. The method may be suitable for large problems as no factorization of $H$ is required. Reverse communication is used to obtain matrix-vector products of the form $H z$ and $M^{-1} z.$

See Section 4 of $GALAHAD/doc/glrt.pdf for additional details.

method#

The required solution $x$ necessarily satisfies the optimality condition $H x + \lambda M x + g = 0$, where $\lambda = \sigma \|x\|_{M}^{p-2}$. In addition, the matrix $H + \lambda M$ will be positive semi-definite.

The method is iterative. Starting with the vector $M^{-1} g$, a matrix of Lanczos vectors is built one column at a time so that the $k$-th column is generated during iteration $k$. These columns span a so-called Krylov space. The resulting $n$ by $k$ matrix $Q_k $ has the property that $Q_{k}^T H Q_k^{} = T_{k}^{}$, where $T_k$ is tridiagonal. An approximation to the required solution may then be expressed formally as

\[x_{k+1} = Q_k y_k\]

where $y_k $ solves the ``tridiagonal’’ subproblem of minimizing

\[\frac{1}{2} y^T T_k y + \|g\|_{M^{-1} } e_{1}^T y + \frac{1}{p} \sigma \| y \|_2^p,\;\;\mbox{(1)}\]

where $e_1$ is the first unit vector.

To minimize (1), the optimality conditions

\[( T_k + \lambda I ) y(\lambda) = - g,\;\;\mbox{(2)}\]

where $\lambda = \sigma \|y(\lambda)\|_{M}^{p-2}$ are used as the basis of an iteration. Specifically, given an estimate $\lambda$ for which $T_k + \lambda I$ is positive definite, the tridiagonal system (2) may be efficiently solved to give $y(\lambda)$. It is then simply a matter of adjusting $\lambda$ (for example by a Newton-like process) to solve the scalar nonlinear equation

\[\theta(\lambda) \equiv \|y(\lambda)\|_{M}^{p-2} - \frac{\lambda}{\sigma} = 0.\;\;\mbox{(3)}\]

In practice (3) is reformulated, and a more rapidly converging iteration is used.

It is possible to measure the optimality measure $\|H x + \lambda M x + g\|_{M^{-1}}$ without computing $x_{k+1}$, and thus without needing $Q_k $. Once this measure is sufficiently small, a second pass is required to obtain the estimate $x_{k+1} $ from $y_k $. As this second pass is an additional expense, a record is kept of the optimal objective function values for each value of $k$, and the second pass is only performed so far as to ensure a given fraction of the final optimal objective value. Large savings may be made in the second pass by choosing the required fraction to be significantly smaller than one.

Special code is used in the special case $p=2$, as in this case a single pass suffices.

reference#

The method is described in detail in

C. Cartis, N. I. M. Gould and Ph. L. Toint, ``Adaptive cubic regularisation methods for unconstrained optimization. Part {I}: motivation, convergence and numerical results’’. Mathematical Programming 127(2) (2011) 245-295.

functions#

glrt.initialize()#

Set default option values and initialize private data

Returns:

optionsdict

dictionary containing default control options:

errorint
error and warning diagnostics occur on stream error.

outint
general output occurs on stream out.

problemint
unit to write problem data into file problem_file.

print_levelint
the level of output required is specified by print_level. Possible values are

<=0

gives no output,

1

gives a one-line summary for every iteration.

2

gives a summary of the inner iteration for each iteration.

>=3

gives increasingly verbose (debugging) output.

itmaxint
the maximum number of iterations allowed (-ve = no bound).

stopping_ruleint
the stopping rule used (see below). Possible values are:

1

stoping rule = norm of the step.

2

stopping rule is norm of the step / $\sigma$.

other

stopping rule = 1.0.

freqint
frequency for solving the reduced tri-diagonal problem.

extra_vectorsint
the number of extra work vectors of length n used.

ritz_printout_deviceint
the unit number for writing debug Ritz values.

stop_relativefloat
the iteration stops successfully when the gradient in the $M^{-1}$ norm is smaller than max( stop_relative * min( 1, stopping_rule ) * norm initial gradient, stop_absolute ).

stop_absolutefloat
see stop_relative.

fraction_optfloat
an estimate of the solution that gives at least fraction_opt times the optimal objective value will be found.

rminvr_zerofloat
the smallest value that the square of the M norm of the gradient of the objective may be before it is considered to be zero.

f_0float
the constant term, f0, in the objective function.

unitmbool
is M the identity matrix ?.

impose_descentbool
is descent required i.e., should $c^T x < 0$ ?.

space_criticalbool
if space_critical is True, every effort will be made to use as little space as possible. This may result in longer computation time.

deallocate_error_fatalbool
if deallocate_error_fatal is True, any array/pointer deallocation error will terminate execution. Otherwise, computation will continue.

print_ritz_valuesbool
should the Ritz values be written to the debug stream?.

ritz_file_namestr
name of debug file containing the Ritz values.

prefixstr
all output lines will be prefixed by the string contained in quotes within prefix, e.g. ‘word’ (note the qutoes) will result in the prefix word.

glrt.load_options(options=None)#

Import control options into internal storage prior to solution.

Parameters:

optionsdict, optional
dictionary of control options (see glrt.initialize).

glrt.solve_problem(status, n, power, weight, r, v)#

Find the global moinimizer of the regularized quadratic objective function $r(x)$.

Parameters:

statusint
holds the entry status. Possible values are

1

an initial entry with r set to $g$.

6

a restart entry with $p$ and $g$ unchanged but a larger weight $\sigma$.

other

the value returned from the previous call, see Returns below.

nint
holds the number of variables.

powerfloat
holds the regularization power $p \geq 2$.

weightfloat
holds the strinctly positive regularization weight $\sigma$.

rndarray(n)
holds the values of the linear term $g$ in the objective function when initial or return status = 1, 4 or 6 (see below).

vndarray(n)
holds the result vector when return status = 2 or 3 (see below).

Returns:

statusint
holds the exit status. Possible values are

0

the solution has been found, no further reentry is required

2

the inverse of $M$ must be applied to the vector returned in v, the result placed in v, and the function recalled with status set to 2. This will only occur if options[‘unitm’] is False.

3

the product of $H$ with the vector returned in v must be formed, the result placed in v, and the function recalled with status set to 3.

4

the iteration must be restarted by setting r to $g$, and the function recalled with status set to 4.

<0

an error occurred, see status in glrt.information for further details.

xndarray(n)
holds the values of the approximate minimizer $x$.

rndarray(n)
holds the values of the gradient $g + Hx$ at the current $x$.

vndarray(n)
holds the return vector when return status = 2 or 3 (see above).

[optional] glrt.information()

Provide optional output information

Returns:

informdict

dictionary containing output information:

statusint
return status. Possible values are:

0

The run was successful.

-1

An allocation error occurred. A message indicating the offending array is written on unit options[‘error’], and the returned allocation status and a string containing the name of the offending array are held in inform[‘alloc_status’] and inform[‘bad_alloc’] respectively.

-2

A deallocation error occurred. A message indicating the offending array is written on unit options[‘error’] and the returned allocation status and a string containing the name of the offending array are held in inform[‘alloc_status’] and inform[‘bad_alloc’] respectively.

-3

The restriction n > 0, $\sigma > 0$ or $p \geq 2$ has been violated.

-7

The objective function appears to be unbounded from below. This can only happen if $p = 2$, and in this case the objective is unbounded along the arc x + t v, where x and v are as returned by glrt.solve_problem, as t goes to infinity.

-15

$M$ appears to be indefinite.

-18

The iteration limit has been exceeded.

alloc_statusint
the status of the last attempted allocation/deallocation.

bad_allocstr
the name of the array for which an allocation/deallocation error occurred.

iterint
the total number of iterations required.

iter_pass2int
the total number of pass-2 iterations required.

objfloat
the value of the quadratic function.

obj_regularizedfloat
the value of the regularized quadratic function.

multiplierfloat
the multiplier, $\sigma \|x\|^{p-2}$.

xpo_normfloat
the value of the norm $\|x\|_M$.

leftmostfloat
an estimate of the leftmost generalized eigenvalue of the pencil $(H,M)$.

negative_curvaturebool
was negative curvature encountered ?.

hard_casebool
did the hard case occur ?.

glrt.terminate()#

Deallocate all internal private storage.

example code#

from galahad import glrt
import numpy as np
np.set_printoptions(precision=4,suppress=True,floatmode='fixed')
print("\n** python test: glrt")

# set problem data

n = 100
power = 3.0
v = np.empty(n)
hv = np.empty(n)
g = np.empty(n)
g[:] = 1.0

# set default options
options = glrt.initialize()

# set some non-default options
options['unitm'] = False
options['print_level'] = 0
#print("options:", options)

# load non-default options
glrt.load_options(options)

# first problem

print(" solve problem 1")
status = 1
weight = 1.0
r = g
while status > 0:
  status, x, r, v = glrt.solve_problem(status, n, power, weight, r, v)
  if status == 2: # precondition v -> v/2
    for i in range(n):
      v[i] = v[i] / 2.0;
  elif status == 3: # form v -> H v
    hv[0] = 2.0 * v[0] + v[1]
    for i in range(1,n-1):
       hv[i] = v[i-1] + 2.0 * v[i] + v[i+1]
    hv[n-1] = v[n-2] + 2.0 * v[n-1]
    for i in range(n):
      v[i] = hv[i]
  elif status == 4: # reset r
    r = g

# get information

inform = glrt.information()
print(" optimal f: %.4f" % inform['obj'])
#print(" x:",x)

# second problem, same data with larger weight

print("\n solve problem 2 with larger weight")
status = 6
weight = 10.0
r = g
while status > 0:
  status, x, r, v = glrt.solve_problem(status, n, power, weight, r, v)
  if status == 2: # precondition v -> v/2
    for i in range(n):
      v[i] = v[i] / 2.0;
  elif status == 3: # form v -> H v
    hv[0] =  2.0 * v[0] + v[1]
    for i in range(1,n-1):
       hv[i] = v[i-1] + 2.0 * v[i] + v[i+1]
    hv[n-1] = v[n-2] + 2.0 * v[n-1]
    for i in range(n):
      v[i] = hv[i]
  elif status == 4: # reset r
    r = g

# get information

inform = glrt.information()
print(" optimal f: %.4f" % inform['obj'])
#print(" x:",x)
print('** glrt exit status:', inform['status'])

# deallocate internal data

glrt.terminate()

This example code is available in $GALAHAD/src/glrt/Python/test_glrt.py .