GALAHAD GLRT package#

purpose#

The glrt package uses a Krylov-subspace iteration to find an approximation of the global minimizer of regularized quadratic objective function. The aim is to minimize the regularized quadratic objective function

\[r(x) = f + g^T x + \frac{1}{2} x^T H x + \frac{\sigma}{p} \|x\|_{M}^p,\]
where the weight \(\sigma \geq 0\), the power \(p \geq 2\), and where the \(M\)-norm of \(x\) is defined to be \(\|x\|_{M} = \sqrt{x^T M x}\). The method may be suitable for large problems as no factorization of \(H\) is required. Reverse communication is used to obtain matrix-vector products of the form \(H z\) and \(M^{-1} z.\)

See Section 4 of $GALAHAD/doc/glrt.pdf for additional details.

method#

The required solution \(x\) necessarily satisfies the optimality condition \(H x + \lambda M x + g = 0\), where \(\lambda = \sigma \|x\|_{M}^{p-2}\). In addition, the matrix \(H + \lambda M\) will be positive semi-definite.

The method is iterative. Starting with the vector \(M^{-1} g\), a matrix of Lanczos vectors is built one column at a time so that the \(k\)-th column is generated during iteration \(k\). These columns span a so-called Krylov space. The resulting \(n\) by \(k\) matrix \(Q_k \) has the property that \(Q_{k}^T H Q_k^{} = T_{k}^{}\), where \(T_k\) is tridiagonal. An approximation to the required solution may then be expressed formally as

\[x_{k+1} = Q_k y_k\]
where \(y_k \) solves the ``tridiagonal’’ subproblem of minimizing
\[\frac{1}{2} y^T T_k y + \|g\|_{M^{-1} } e_{1}^T y + \frac{1}{p} \sigma \| y \|_2^p,\;\;\mbox{(1)}\]
where \(e_1\) is the first unit vector.

To minimize (1), the optimality conditions

\[( T_k + \lambda I ) y(\lambda) = - g,\;\;\mbox{(2)}\]
where \(\lambda = \sigma \|y(\lambda)\|_{M}^{p-2}\) are used as the basis of an iteration. Specifically, given an estimate \(\lambda\) for which \(T_k + \lambda I\) is positive definite, the tridiagonal system (2) may be efficiently solved to give \(y(\lambda)\). It is then simply a matter of adjusting \(\lambda\) (for example by a Newton-like process) to solve the scalar nonlinear equation
\[\theta(\lambda) \equiv \|y(\lambda)\|_{M}^{p-2} - \frac{\lambda}{\sigma} = 0.\;\;\mbox{(3)}\]
In practice (3) is reformulated, and a more rapidly converging iteration is used.

It is possible to measure the optimality measure \(\|H x + \lambda M x + g\|_{M^{-1}}\) without computing \(x_{k+1}\), and thus without needing \(Q_k \). Once this measure is sufficiently small, a second pass is required to obtain the estimate \(x_{k+1} \) from \(y_k \). As this second pass is an additional expense, a record is kept of the optimal objective function values for each value of \(k\), and the second pass is only performed so far as to ensure a given fraction of the final optimal objective value. Large savings may be made in the second pass by choosing the required fraction to be significantly smaller than one.

Special code is used in the special case \(p=2\), as in this case a single pass suffices.

reference#

The method is described in detail in

C. Cartis, N. I. M. Gould and Ph. L. Toint, ``Adaptive cubic regularisation methods for unconstrained optimization. Part {I}: motivation, convergence and numerical results’’. Mathematical Programming 127(2) (2011) 245-295.

introduction to function calls#

To solve a given problem, functions from the glrt package must be called in the following order:

  • glrt_initialize - provide default control parameters and set up initial data structures

  • glrt_read_specfile (optional) - override control values by reading replacement values from a file

  • glrt_import_control - import control parameters prior to solution

  • glrt_solve_problem - solve the problem by reverse communication, a sequence of calls are made under control of a status parameter, each exit either asks the user to provide additional informaton and to re-enter, or reports that either the solution has been found or that an error has occurred

  • glrt_information (optional) - recover information about the solution and solution process

  • glrt_terminate - deallocate data structures

See the examples section for illustrations of use.

parametric real type T#

Below, the symbol T refers to a parametric real type that may be Float32 (single precision) or Float64 (double precision). Calable functions as described are with T as Float64, but variants (with the additional suffix _s, e.g., glrt_initialize_s) are available with T as Float32.

callable functions#

    function glrt_initialize(data, control, status)

Set default control values and initialize private data

Parameters:

data

holds private internal data

control

is a structure containing control information (see glrt_control_type)

status

is a scalar variable of type Int32 that gives the exit status from the package. Possible values are (currently):

  • 0

    The initialization was successful.

    function glrt_read_specfile(control, specfile)

Read the content of a specification file, and assign values associated with given keywords to the corresponding control parameters. An in-depth discussion of specification files is available, and a detailed list of keywords with associated default values is provided in $GALAHAD/src/glrt/GLRT.template. See also Table 2.1 in the Fortran documentation provided in $GALAHAD/doc/glrt.pdf for a list of how these keywords relate to the components of the control structure.

Parameters:

control

is a structure containing control information (see glrt_control_type)

specfile

is a one-dimensional array of type Vararg{Cchar} that must give the name of the specification file

    function glrt_import_control(control, data, status)

Import control parameters prior to solution.

Parameters:

control

is a structure whose members provide control parameters for the remaining procedures (see glrt_control_type)

data

holds private internal data

status

is a scalar variable of type Int32 that gives the exit status from the package. Possible values are (currently):

  • 1

    The import was successful, and the package is ready for the solve phase

    function glrt_solve_problem(data, status, n, power, weight, x, r, vector)

Solve the regularized-quadratic problem using reverse communication.

Parameters:

data

holds private internal data

status

is a scalar variable of type Int32 that gives the entry and exit status from the package.

This must be set to

  • 1

    on initial entry. Set the argument r (below) to \(c\) for this entry.

  • 6

    the iteration is to be restarted with a larger weight but with all other data unchanged. Set the argument r to \(c\) for this entry.

Possible exit values are:

  • 0

    the solution has been found

  • 2

    the inverse of \(M\) must be applied to the argument vector (below) with the result returned in vector and the function re-entered with all other data unchanged. This will only happen if control.unitm is false

  • 3

    the product of \(H\) with vector must be formed, with the result returned in vector and the function re-entered with all other data unchanged

  • 4

    The iteration must be restarted. Reset the argument r to \(c\) and re-enter with all other data unchanged.

  • -1

    an array allocation has failed

  • -2

    an array deallocation has failed

  • -3

    n and/or radius is not positive

  • -7

    the problem is unbounded from below. This can only happen if power = 2, and in this case the objective is unbounded along the arc x + t vector as t goes to infinity

  • -15

    the matrix \(M\) appears to be indefinite

  • -18

    the iteration limit has been exceeded

n

is a scalar variable of type Int32 that holds the number of variables

power

is a scalar of type T that holds the egularization power, \(p \geq 2\)

weight

is a scalar of type T that holds the positive regularization weight, \(\sigma\)

x

is a one-dimensional array of size n and type T that holds the solution \(x\). The j-th component of x, j = 1, … , n, contains \(x_j\).

r

is a one-dimensional array of size n and type T that that must be set to \(c\) on entry (status = 1) and re-entry (status = 4, 5). On exit, r contains the resiual \(H x + c\).

vector

is a one-dimensional array of size n and type T that should be used and reset appropriately when status = 2 and 3 as directed.

    function glrt_information(data, inform, status)

Provides output information

Parameters:

data

holds private internal data

inform

is a structure containing output information (see glrt_inform_type)

status

is a scalar variable of type Int32 that gives the exit status from the package. Possible values are (currently):

  • 0

    The values were recorded successfully

    function glrt_terminate(data, control, inform)

Deallocate all internal private storage

Parameters:

data

holds private internal data

control

is a structure containing control information (see glrt_control_type)

inform

is a structure containing output information (see glrt_inform_type)

available structures#

glrt_control_type structure#

    struct glrt_control_type{T}
      f_indexing::Bool
      error::Int32
      out::Int32
      print_level::Int32
      itmax::Int32
      stopping_rule::Int32
      freq::Int32
      extra_vectors::Int32
      ritz_printout_device::Int32
      stop_relative::T
      stop_absolute::T
      fraction_opt::T
      rminvr_zero::T
      f_0::T
      unitm::Bool
      impose_descent::Bool
      space_critical::Bool
      deallocate_error_fatal::Bool
      print_ritz_values::Bool
      ritz_file_name::NTuple{31,Cchar}
      prefix::NTuple{31,Cchar}

detailed documentation#

control derived type as a Julia structure

components#

Bool f_indexing

use C or Fortran sparse matrix indexing

Int32 error

error and warning diagnostics occur on stream error

Int32 out

general output occurs on stream out

Int32 print_level

the level of output required is specified by print_level

Int32 itmax

the maximum number of iterations allowed (-ve = no bound)

Int32 stopping_rule

the stopping rule used (see below). Possible values are:

  • 1 stopping rule = norm of the step.

  • 2 stopping rule is norm of the step / \(\sigma\).

  • other. stopping rule = 1.0.

Int32 freq

frequency for solving the reduced tri-diagonal problem

Int32 extra_vectors

the number of extra work vectors of length n used

Int32 ritz_printout_device

the unit number for writing debug Ritz values

T stop_relative

the iteration stops successfully when the gradient in the \(M^{-1}\) norm is smaller than max( stop_relative \* min( 1, stopping_rule ) \* norm initial gradient, stop_absolute )

T stop_absolute

see stop_relative

T fraction_opt

an estimate of the solution that gives at least .fraction_opt times the optimal objective value will be found

T rminvr_zero

the smallest value that the square of the M norm of the gradient of the objective may be before it is considered to be zero

T f_0

the constant term, f0, in the objective function

Bool unitm

is M the identity matrix ?

Bool impose_descent

is descent required i.e., should \(c^T x < 0\)?

Bool space_critical

if .space_critical true, every effort will be made to use as little space as possible. This may result in longer computation time

Bool deallocate_error_fatal

if .deallocate_error_fatal is true, any array/pointer deallocation error will terminate execution. Otherwise, computation will continue

Bool print_ritz_values

should the Ritz values be written to the debug stream?

char ritz_file_name[31]

name of debug file containing the Ritz values

NTuple{31,Cchar} prefix

all output lines will be prefixed by .prefix(2:LEN(TRIM(.prefix))-1) where .prefix contains the required string enclosed in quotes, e.g. “string” or ‘string’

glrt_inform_type structure#

    struct glrt_inform_type{T}
      status::Int32
      alloc_status::Int32
      bad_alloc::NTuple{81,Cchar}
      iter::Int32
      iter_pass2::Int32
      obj::T
      obj_regularized::T
      multiplier::T
      xpo_norm::T
      leftmost::T
      negative_curvature::Bool
      hard_case::Bool

detailed documentation#

inform derived type as a Julia structure

components#

Int32 status

return status. See glrt_solve_problem for details

Int32 alloc_status

the status of the last attempted allocation/deallocation

NTuple{81,Cchar} bad_alloc

the name of the array for which an allocation/deallocation error occurred

Int32 iter

the total number of iterations required

Int32 iter_pass2

the total number of pass-2 iterations required

T obj

the value of the quadratic function

T obj_regularized

the value of the regularized quadratic function

T multiplier

the multiplier, \(\sigma \|x\|^{p-2}\)

T xpo_norm

the value of the norm \(\|x\|_M\)

T leftmost

an estimate of the leftmost generalized eigenvalue of the pencil \((H,M)\)

Bool negative_curvature

was negative curvature encountered ?

Bool hard_case

did the hard case occur ?

example calls#

This is an example of how to use the package to solve a regularization subproblem; the code is available in $GALAHAD/src/glrt/Julia/test_glrt.jl .

# test_glrt.jl
# Simple code to test the Julia interface to GLRT

using GALAHAD
using Test
using Printf
using Accessors

function test_glrt()
  # Derived types
  data = Ref{Ptr{Cvoid}}()
  control = Ref{glrt_control_type{Float64}}()
  inform = Ref{glrt_inform_type{Float64}}()

  # Set problem data
  n = 100 # dimension

  status = Ref{Cint}()
  weight = Ref{Float64}()
  power = 3.0
  x = zeros(Float64, n)
  r = zeros(Float64, n)
  vector = zeros(Float64, n)
  h_vector = zeros(Float64, n)

  # Initialize glrt
  glrt_initialize(data, control, status)

  # use a unit M ?
  for unit_m in 0:1
    if unit_m == 0
      @reset control[].unitm = false
    else
      @reset control[].unitm = true
    end

    glrt_import_control(control, data, status)

    # resolve with a larger weight ?
    for new_weight in 0:1
      if new_weight == 0
        weight[] = 1.0
        status[] = 1
      else
        weight[] = 10.0
        status[] = 6
      end

      for i in 1:n
        r[i] = 1.0
      end

      # iteration loop to find the minimizer
      terminated = false
      while !terminated # reverse-communication loop
        glrt_solve_problem(data, status, n, power, weight[], x, r, vector)
        if status[] == 0 # successful termination
          terminated = true
        elseif status[] < 0 # error exit
          terminated = true
        elseif status[] == 2 # form the preconditioned vector
          for i in 1:n
            vector[i] = vector[i] / 2.0
          end
        elseif status[] == 3 # form the Hessian-vector product
          h_vector[1] = 2.0 * vector[1] + vector[2]
          for i in 2:(n - 1)
            h_vector[i] = vector[i - 1] + 2.0 * vector[i] + vector[i + 1]
          end
          h_vector[n] = vector[n - 1] + 2.0 * vector[n]
          for i in 1:n
            vector[i] = h_vector[i]
          end
        elseif status[] == 4 # restart
          for i in 1:n
            r[i] = 1.0
          end
        else
          @printf(" the value %1i of status should not occur\n", status)
        end
      end

      glrt_information(data, inform, status)
      @printf("MR = %1i%1i glrt_solve_problem exit status = %i, f = %.2f\n", unit_m,
              new_weight, inform[].status,
              inform[].obj_regularized)
    end
  end

  # Delete internal workspace
  glrt_terminate(data, control, inform)

  return 0
end

@testset "GLRT" begin
  @test test_glrt() == 0
end