GALAHAD EXPO package#

purpose#

The expo package uses an exponential-penalty function method to solve a given constrained optimization problem. The aim is to find a (local) minimizer of a differentiable objective function $f(x)$ of $n$ variables $x$, subject to $m$ general constraints $c_l \leq c(x) \leq c_u$ and simple-bound constraints $x_l \leq x \leq x_u$ on the variables. Here, any of the components of the vectors of bounds $c_l$, $c_u$, $x_l$ and $x_u$ may be infinite. The method offers the choice of direct and iterative solution of the key unconstrained-optimization subproblems, and is most suitable for large problems. First derivatives are required. If second derivatives can be calculated, they will be exploited; if only the product of second derivatives with a vector can be found, but not the derivatives themselves, that may also be exploited.

N.B. This package is currently a beta release, and aspects may change before it is formally released.

See Section 4 of $GALAHAD/doc/expo.pdf for additional details.

terminology#

The exponential penalty function is defined to be

\[\begin{split}\begin{array}{rl}\phi(x,w,\mu,v,\nu) \!\! & = f(x) + \sum_{i} \mu_{li} w_{li} \exp[(c_{li} - c_i(x))/\mu_{li}] \\ & \;\;\;\;\;\;\;\;\;\;\;\;\; + \sum_{i} \mu_{ui} w_{ui} \exp[(c_i(x) - c_{ui})/\mu_{ui}] \\ & \;\;\;\;\;\;\;\;\;\;\;\;\; + \sum_{j} \nu_{lj} v_{lj} \exp[(x_{lj} - x_j)/\nu_{lj}] \\ & \;\;\;\;\;\;\;\;\;\;\;\;\; + \sum_{j} \nu_{uj} v_{uj} \exp[(x_j - x_{uj})/\nu_{uj}], \end{array} \end{split}\]

where $c_{li}$, $c_{ui}$ and $c_i(x)$ are the $i$-th components of $c_l$, $c_u$ and $c(x)$, and $x_{lj}$, $x_{uj}$ and $x_j$ are the $j$-th components of $x_l$, $x_u$ and $x$, respectively. Here the components of $\mu_l$, $\mu_u$, $\nu_l$ and $\nu_u$ are separate penalty parameters for each lower and upper, general and simple-bound constraint, respectively, while those of $w_l$, $w_u$, $v_l$, $v_u$ are likewise separate weights for the same. The algorithm iterates by approximately minimizing $\phi(x,w,\mu,v,\nu)$ for a fixed set of penalty parameters and weights, and then adjusting these parameters and weights. The adjustments are designed so that the sequence of approximate minimizers of $\phi$ converges to a minimizer of the specified constrained optimization problem.
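As a rough illustration of the definition above, the following Julia sketch evaluates $\phi$ for user-chosen weights and penalty parameters. The function `expo_penalty` and its keyword arguments are hypothetical names introduced here and are not part of the package; terms associated with infinite bounds are simply skipped.

```julia
# Hypothetical helper (not part of GALAHAD): evaluate the exponential penalty
# function phi for given weights (w, v) and penalty parameters (mu, nu).
function expo_penalty(f, c, x; c_l, c_u, x_l, x_u, w_l, w_u, v_l, v_u,
                      mu_l, mu_u, nu_l, nu_u)
  cx = c(x)
  phi = f(x)
  for i in eachindex(cx)
    # lower and upper general-constraint terms; infinite bounds contribute nothing
    isfinite(c_l[i]) && (phi += mu_l[i] * w_l[i] * exp((c_l[i] - cx[i]) / mu_l[i]))
    isfinite(c_u[i]) && (phi += mu_u[i] * w_u[i] * exp((cx[i] - c_u[i]) / mu_u[i]))
  end
  for j in eachindex(x)
    # lower and upper simple-bound terms
    isfinite(x_l[j]) && (phi += nu_l[j] * v_l[j] * exp((x_l[j] - x[j]) / nu_l[j]))
    isfinite(x_u[j]) && (phi += nu_u[j] * v_u[j] * exp((x[j] - x_u[j]) / nu_u[j]))
  end
  return phi
end

# toy problem: minimize x1^2 + x2^2 subject to x1 + x2 >= 1 and x >= 0
f(x) = x[1]^2 + x[2]^2
c(x) = [x[1] + x[2]]
x = [0.5, 0.5]
phi = expo_penalty(f, c, x;
                   c_l = [1.0], c_u = [Inf], x_l = [0.0, 0.0], x_u = [Inf, Inf],
                   w_l = [1.0], w_u = [1.0], v_l = [1.0, 1.0], v_u = [1.0, 1.0],
                   mu_l = [0.1], mu_u = [0.1], nu_l = [0.1, 0.1], nu_u = [0.1, 0.1])
```

At this point the general constraint is active, so its term contributes exactly $\mu_{l1} w_{l1}$, while the two inactive bounds contribute only small exponentially decaying amounts.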

Key constructs are the gradient of the objective function

\[g(x) := \nabla_x f(x),\]

the Jacobian of the vector of constraints,

\[J(x) := \nabla_x c(x),\]

and the gradient and Hessian of the Lagrangian function

\[g_L(x,y,z) := g(x) - J^T(x)y - z \;\;\mbox{and}\;\; H_L(x,y) := \nabla_{xx} \left[ f(x) - \sum_{i} y_i c_i(x)\right]\]

for given vectors $y$ and $z$.

Any required solution $x$ necessarily satisfies the primal optimality conditions

\[c_l \leq c(x) \leq c_u, \;\; x_l \leq x \leq x_u,\;\;\mbox{(1)}\]

the dual optimality conditions

\[g(x) = J^{T}(x) y + z,\;\; y = y_l + y_u \;\;\mbox{and}\;\; z = z_l + z_u,\;\;\mbox{(2a)}\]

and

\[y_l \geq 0, \;\; y_u \leq 0, \;\; z_l \geq 0 \;\;\mbox{and}\;\; z_u \leq 0,\;\;\mbox{(2b)}\]

and the complementary slackness conditions

\[( c(x) - c_l )^{T} y_l = 0,\;\; ( c(x) - c_u )^{T} y_u = 0,\;\; (x -x_l )^{T} z_l = 0 \;\;\mbox{and}\;\;(x -x_u )^{T} z_u = 0,\;\;\mbox{(3)}\]

where the vectors $y$ and $z$ are known as the Lagrange multipliers for the general constraints, and the dual variables for the simple bounds, respectively, and where the vector inequalities hold component-wise.
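The conditions (1)-(3) can be checked numerically for a candidate point. The sketch below is one possible way to do so, not the package's own stopping test: the helper `kkt_residuals` is hypothetical, it measures each condition in the infinity norm, and it checks complementary slackness component by component, ignoring terms whose bound is infinite.

```julia
using LinearAlgebra

# Hypothetical helper (not part of GALAHAD): measure how far
# (x, y_l, y_u, z_l, z_u) is from satisfying the optimality conditions (1)-(3).
function kkt_residuals(g, J, cx, x, y_l, y_u, z_l, z_u, c_l, c_u, x_l, x_u)
  # (1) primal infeasibility: violation of the general and simple-bound constraints
  primal = max(maximum(max.(c_l .- cx, cx .- c_u, 0)),
               maximum(max.(x_l .- x, x .- x_u, 0)))
  # (2a) dual infeasibility: g(x) - J'(x) y - z with y = y_l + y_u, z = z_l + z_u
  dual = norm(g - J' * (y_l + y_u) - (z_l + z_u), Inf)
  # (3) complementary slackness; terms for infinite bounds are ignored
  slack(r, d) = isfinite(r) ? abs(r * d) : 0.0
  comp = max(maximum(slack.(cx .- c_l, y_l)), maximum(slack.(cx .- c_u, y_u)),
             maximum(slack.(x .- x_l, z_l)), maximum(slack.(x .- x_u, z_u)))
  return primal, dual, comp
end

# toy problem: minimize x1^2 + x2^2 subject to x1 + x2 >= 1 and x >= 0,
# whose solution is x = (0.5, 0.5) with multiplier y_l = 1
x = [0.5, 0.5]
res = kkt_residuals([1.0, 1.0], [1.0 1.0], [1.0], x,
                    [1.0], [0.0], zeros(2), zeros(2),
                    [1.0], [Inf], zeros(2), fill(Inf, 2))
```

All three residuals vanish at the solution of this toy problem; the sign conditions (2b) are not tested by this helper and would need a separate check.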

method#

The method employed involves a sequential minimization of the exponential penalty function $\phi(x,w,\mu,v,\nu)$ for a sequence of positive penalty parameters $(\mu_{lk}, \mu_{uk}, \nu_{lk}, \nu_{uk})$ and weights $(w_{lk}, w_{uk}, v_{lk}, v_{uk})$, for increasing $k \geq 0$. Convergence is ensured if the penalty parameters are forced to zero, and may be accelerated by adjusting the weights. The minimization of $\phi(x,w,\mu,v,\nu)$ is accomplished using the trust-region unconstrained solver TRU. Although critical points $\{x_k\}$ of $\phi(x,w_k,\mu_k,v_k,\nu_k)$ converge to a local solution $x_*$ of the underlying problem, the reduction of the penalty parameters to zero often results in $x_k$ being a poor starting point for the minimization of $\phi(x,w_{k+1},\mu_{k+1},v_{k+1},\nu_{k+1})$. Consequently, a careful extrapolated starting point from $x_k$ is used instead. Moreover, once the algorithm is confident that it is sufficiently close to $x_*$, it switches to Newton’s method to accelerate the convergence. Both the extrapolation and the Newton iteration rely on the block-linear-system solver SSLS.
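The idea of sequential minimization can be seen on a one-variable toy problem. The sketch below is only an illustration under simplifying assumptions and is not the package's algorithm: it uses plain Newton iterations instead of TRU, keeps the weight fixed, performs no extrapolation, and shrinks the penalty parameter by a fixed factor. The name `toy_expo` and its keyword arguments are hypothetical.

```julia
# Toy illustration only: minimize x^2 subject to x >= 1 (solution x = 1, multiplier y = 2)
# by minimizing phi(x) = x^2 + mu * w * exp((1 - x) / mu) for decreasing mu.
function toy_expo(; x = 2.0, w = 1.0, mu = 1.0, shrink = 0.1, outer = 5)
  y = 0.0
  for k in 1:outer
    for _ in 1:100                         # Newton's method on phi'(x) = 0
      e = w * exp((1 - x) / mu)
      dphi = 2x - e                        # first derivative of phi
      abs(dphi) <= 1e-8 && break
      x -= dphi / (2 + e / mu)             # second derivative is 2 + e / mu
    end
    y = w * exp((1 - x) / mu)              # coefficient of the constraint gradient in phi'
    mu *= shrink                           # force the penalty parameter towards zero
  end
  return x, y
end

x, y = toy_expo()
```

Each outer pass starts Newton's method from the previous minimizer. As the penalty parameter shrinks, the minimizer approaches the constrained solution and the coefficient `y` approaches the Lagrange multiplier.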

The iteration is terminated as soon as residuals to the optimality conditions (1)–(3) are sufficiently small. For infeasible problems, this will not be possible, and instead the residuals to (1) will be made as small as possible.
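The link between minimizing $\phi$ and the optimality conditions can be made explicit by differentiating the penalty function. The short derivation below follows directly from the definition of $\phi$ given earlier; interpreting the resulting quantities as multiplier estimates is our reading and is not a statement taken from the package documentation.

\[\nabla_x \phi(x,w,\mu,v,\nu) = g(x) - J^T(x) \left[ y_l(x) + y_u(x) \right] - \left[ z_l(x) + z_u(x) \right],\]

where

\[y_{li}(x) := w_{li} \exp[(c_{li} - c_i(x))/\mu_{li}], \;\; y_{ui}(x) := - w_{ui} \exp[(c_i(x) - c_{ui})/\mu_{ui}],\]

\[z_{lj}(x) := v_{lj} \exp[(x_{lj} - x_j)/\nu_{lj}] \;\;\mbox{and}\;\; z_{uj}(x) := - v_{uj} \exp[(x_j - x_{uj})/\nu_{uj}].\]

Thus a critical point of $\phi$ satisfies (2a) with $y = y_l(x) + y_u(x)$ and $z = z_l(x) + z_u(x)$, and the sign conditions (2b) hold automatically whenever the weights are positive. What remains is to drive the residuals of (1) and (3) to zero, which is the role of the penalty parameter and weight adjustments.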

references#

The method is described in detail in

N. Gould, S. Leyffer, A. Montoison and C. Vanaret (2025). The exponential multiplier method in the 21st century. RAL Technical Report, in preparation.

matrix storage#

unsymmetric storage#

The unsymmetric $m$ by $n$ Jacobian matrix $J = J(x)$ may be presented and stored in a variety of convenient input formats.

Dense storage format: The matrix $J$ is stored as a compact dense matrix by rows, that is, the values of the entries of each row in turn are stored in order within an appropriate real one-dimensional array. In this case, component $n \ast (i-1) + j$ of the storage array J_val will hold the value $J_{ij}$ for $1 \leq i \leq m$, $1 \leq j \leq n$. The string J_type = ‘dense’ should be specified.

Dense by columns storage format: The matrix $J$ is stored as a compact dense matrix by columns, that is, the values of the entries of each column in turn are stored in order within an appropriate real one-dimensional array. In this case, component $m \ast (j-1) + i$ of the storage array J_val will hold the value $J_{ij}$ for $1 \leq i \leq m$, $1 \leq j \leq n$. The string J_type = ‘dense_by_columns’ should be specified.
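The two dense layouts can be illustrated with a small matrix. The sketch below assumes Julia's 1-based array indexing, under which $J_{ij}$ sits in position `n * (i - 1) + j` of the row-wise array and `m * (j - 1) + i` of the column-wise array; the variable names `J_val_rows` and `J_val_cols` are illustrative only.

```julia
# Pack a small m-by-n matrix J into one-dimensional arrays (1-based indices).
J = [1.0 2.0 3.0;
     4.0 5.0 6.0]
m, n = size(J)

# 'dense' (by rows): J[i, j] is stored in component n * (i - 1) + j
J_val_rows = [J[i, j] for i in 1:m for j in 1:n]

# 'dense_by_columns': J[i, j] is stored in component m * (j - 1) + i;
# this coincides with Julia's native column-major layout
J_val_cols = vec(J)
```

Because Julia stores matrices by columns, `vec(J)` already gives the column-wise layout, whereas the row-wise layout needs an explicit reordering.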

Sparse co-ordinate storage format: Only the nonzero entries of the matrix are stored. For the $l$-th entry, $1 \leq l \leq ne$, of $J$, its row index i, column index j and value $J_{ij}$, $1 \leq i \leq m$, $1 \leq j \leq n$, are stored as the $l$-th components of the integer arrays J_row and J_col and real array J_val, respectively, while the number of nonzeros is recorded as J_ne = $ne$. The string J_type = ‘coordinate’ should be specified.

Sparse row-wise storage format: Again only the nonzero entries are stored, but this time they are ordered so that those in row i appear directly before those in row i+1. For the i-th row of $J$ the i-th component of the integer array J_ptr holds the position of the first entry in this row, while J_ptr(m+1) holds the total number of entries plus one. The column indices j, $1 \leq j \leq n$, and values $J_{ij}$ of the nonzero entries in the i-th row are stored in components l = J_ptr(i), $\ldots$, J_ptr(i+1)-1, $1 \leq i \leq m$, of the integer array J_col, and real array J_val, respectively. For sparse matrices, this scheme almost always requires less storage than its predecessor. The string J_type = ‘sparse_by_rows’ should be specified.

Sparse column-wise storage format: Once again only the nonzero entries are stored, but this time they are ordered so that those in column j appear directly before those in column j+1. For the j-th column of $J$ the j-th component of the integer array J_ptr holds the position of the first entry in this column, while J_ptr(n+1) holds the total number of entries plus one. The row indices i, $1 \leq i \leq m$, and values $J_{ij}$ of the nonzero entries in the j-th column are stored in components l = J_ptr(j), $\ldots$, J_ptr(j+1)-1, $1 \leq j \leq n$, of the integer array J_row, and real array J_val, respectively. As before, for sparse matrices, this scheme almost always requires less storage than the co-ordinate format. The string J_type = ‘sparse_by_columns’ should be specified.
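To make the sparse schemes concrete, the sketch below stores a hypothetical 2 by 3 matrix first in co-ordinate form and then derives the row pointers needed for the row-wise form. It assumes 1-based (Fortran-style) indices and entries already sorted by row; the helper array `counts` is introduced only for this illustration.

```julia
# Hypothetical 2-by-3 example with four nonzeros:
#   J = [ 1.0  0.0  2.0
#         0.0  3.0  4.0 ]
m, n = 2, 3

# sparse co-ordinate format: one (row, column, value) triple per nonzero
J_ne  = 4
J_row = Int32[1, 1, 2, 2]
J_col = Int32[1, 3, 2, 3]
J_val = [1.0, 2.0, 3.0, 4.0]

# sparse row-wise format: J_col and J_val are reused unchanged, and
# J_ptr[i] gives the position of the first entry of row i
counts = zeros(Int32, m)
for i in J_row
  counts[i] += 1                      # number of entries in each row
end
J_ptr = Int32[1; 1 .+ cumsum(counts)]   # J_ptr[m + 1] = J_ne + 1
```

The entries of row i are then found in positions `J_ptr[i]:J_ptr[i+1]-1` of `J_col` and `J_val`, so the row indices no longer need to be stored.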

symmetric storage#

The symmetric $n$ by $n$ matrix $H = H(x,y)$ may be presented and stored in a variety of formats. Crucially, symmetry is exploited by storing only values from the lower triangular part (i.e., those entries that lie on or below the leading diagonal).

Dense storage format: The matrix $H$ is stored as a compact dense matrix by rows, that is, the values of the entries of each row in turn are stored in order within an appropriate real one-dimensional array. Since $H$ is symmetric, only the lower triangular part (that is the part $H_{ij}$ for $1 \leq j \leq i \leq n$) need be held. In this case the lower triangle should be stored by rows, that is component $(i-1) * i / 2 + j$ of the storage array H_val will hold the value $H_{ij}$ (and, by symmetry, $H_{ji}$) for $1 \leq j \leq i \leq n$. The string H_type = ‘dense’ should be specified.

Sparse co-ordinate storage format: Only the nonzero entries of the matrices are stored. For the $l$-th entry, $1 \leq l \leq ne$, of $H$, its row index i, column index j and value $H_{ij}$, $1 \leq j \leq i \leq n$, are stored as the $l$-th components of the integer arrays H_row and H_col and real array H_val, respectively, while the number of nonzeros is recorded as H_ne = $ne$. Note that only the entries in the lower triangle should be stored. The string H_type = ‘coordinate’ should be specified.

Sparse row-wise storage format: Again only the nonzero entries are stored, but this time they are ordered so that those in row i appear directly before those in row i+1. For the i-th row of $H$ the i-th component of the integer array H_ptr holds the position of the first entry in this row, while H_ptr(n+1) holds the total number of entries plus one. The column indices j, $1 \leq j \leq i$, and values $H_{ij}$ of the entries in the i-th row are stored in components l = H_ptr(i), …, H_ptr(i+1)-1 of the integer array H_col, and real array H_val, respectively. Note that as before only the entries in the lower triangle should be stored. For sparse matrices, this scheme almost always requires less storage than its predecessor. The string H_type = ‘sparse_by_rows’ should be specified.

Diagonal storage format: If $H$ is diagonal (i.e., $H_{ij} = 0$ for all $1 \leq i \neq j \leq n$) only the diagonal entries $H_{ii}$, $1 \leq i \leq n$ need be stored, and the first n components of the array H_val may be used for the purpose. The string H_type = ‘diagonal’ should be specified.

Multiples of the identity storage format: If $H$ is a multiple of the identity matrix, (i.e., $H = \alpha I$ where $I$ is the n by n identity matrix and $\alpha$ is a scalar), it suffices to store $\alpha$ as the first component of H_val. The string H_type = ‘scaled_identity’ should be specified.

The identity matrix format: If $H$ is the identity matrix, no values need be stored. The string H_type = ‘identity’ should be specified.

The zero matrix format: The same is true if $H$ is the zero matrix, but now the string H_type = ‘zero’ or ‘none’ should be specified.
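The lower-triangular conventions above can be illustrated on a small symmetric matrix. This is a sketch assuming 1-based indices; the matrix and the array name `H_val_dense` are hypothetical and chosen only for illustration.

```julia
# Hypothetical 3-by-3 symmetric matrix
H = [4.0 1.0 0.0;
     1.0 5.0 2.0;
     0.0 2.0 6.0]
n = size(H, 1)

# 'dense' format: lower triangle by rows, H[i, j] in component (i - 1) * i / 2 + j
H_val_dense = [H[i, j] for i in 1:n for j in 1:i]

# 'coordinate' format: nonzeros of the lower triangle only
H_row = Int32[]; H_col = Int32[]; H_val = Float64[]
for i in 1:n, j in 1:i
  if H[i, j] != 0
    push!(H_row, i); push!(H_col, j); push!(H_val, H[i, j])
  end
end
H_ne = length(H_val)
```

Only five of the nine entries are stored in co-ordinate form here: the strictly upper triangle is implied by symmetry and the zero entry is dropped.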

introduction to function calls#

To solve a given problem, functions from the expo package must be called in the following order:

expo_initialize - provide default control parameters and set up initial data structures
expo_read_specfile (optional) - override control values by reading replacement values from a file
expo_import - import structural data and set up data structures prior to solution
expo_reset_control (optional) - possibly change control parameters if a sequence of problems is being solved
expo_solve_hessian_direct - solve the problem using function calls to evaluate function, gradient and Hessian values
expo_information (optional) - recover information about the solution and solution process
expo_terminate - deallocate data structures

See the examples section for illustrations of use.

parametric real type T and integer type INT#

Below, the symbol T refers to a parametric real type that may be Float32 (single precision), Float64 (double precision) or, if supported, Float128 (quadruple precision). The symbol INT refers to a parametric integer type that may be Int32 (32-bit integer) or Int64 (64-bit integer).
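As a small illustration of how the two type parameters propagate through user code, the sketch below writes a helper that is generic in both. The function `make_arrays` is hypothetical and does not call the package; it only shows real data taking type T and index data taking type INT.

```julia
# Hypothetical helper: build arrays with the chosen real and integer types.
function make_arrays(::Type{T}, ::Type{INT}, n) where {T<:AbstractFloat,INT<:Integer}
  x = zeros(T, n)        # real data, e.g. the variables
  idx = INT.(1:n)        # integer data, e.g. row or column indices
  return x, idx
end

for T in (Float32, Float64), INT in (Int32, Int64)
  println(T, " / ", INT, " => ", map(eltype, make_arrays(T, INT, 3)))
end
```

Writing problem data this way lets the same setup code be reused for every supported combination of precision and integer width.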

callable functions#

    function expo_initialize(T, INT, data, control, inform)

Set default control values and initialize private data

Parameters:

data	holds private internal data
control	is a structure containing control information (see expo_control_type)
inform	is a structure containing output information (see expo_inform_type)

    function expo_read_specfile(T, INT, control, specfile)

Read the content of a specification file, and assign values associated with given keywords to the corresponding control parameters. An in-depth discussion of specification files is available, and a detailed list of keywords with associated default values is provided in $GALAHAD/src/expo/EXPO.template. See also Table 2.1 in the Fortran documentation provided in $GALAHAD/doc/expo.pdf for a list of how these keywords relate to the components of the control structure.

Parameters:

control	is a structure containing control information (see expo_control_type)
specfile	is a one-dimensional array of type Vararg{Cchar} that must give the name of the specification file

    function expo_import(T, INT, control, data, status, n, m,
                        J_type, J_ne, J_row, J_col, J_ptr,
                        H_type, H_ne, H_row, H_col, H_ptr )

Import problem data into internal storage prior to solution.

Parameters:

control	is a structure whose members provide control parameters for the remaining procedures (see expo_control_type)
data	holds private internal data
status	is a scalar variable of type INT that gives the exit status from the package. Possible values are:

1
The import was successful, and the package is ready for the solve phase

-1
An allocation error occurred. A message indicating the offending array is written on unit control.error, and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively.

-2
A deallocation error occurred. A message indicating the offending array is written on unit control.error and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively.

-3
The restrictions n > 0, m > 0 or requirement that J/H_type contains its relevant string ‘dense’, ‘dense_by_columns’, ‘coordinate’, ‘sparse_by_rows’, ‘sparse_by_columns’, ‘diagonal’ or ‘absent’ has been violated.

n	is a scalar variable of type INT that holds the number of variables.
m	is a scalar variable of type INT that holds the number of general constraints.
J_type	is a one-dimensional array of type Vararg{Cchar} that specifies the unsymmetric storage scheme used for the Jacobian, $J$. It should be one of ‘coordinate’, ‘sparse_by_rows’, ‘dense’ or ‘absent’, the latter if access to the Jacobian is via matrix-vector products; lower or upper case variants are allowed.
J_ne	is a scalar variable of type INT that holds the number of entries in $J$ in the sparse co-ordinate storage scheme. It need not be set for any of the other schemes.
J_row	is a one-dimensional array of size J_ne and type INT that holds the row indices of $J$ in the sparse co-ordinate storage scheme. It need not be set for any of the other schemes, and in this case can be C_NULL.
J_col	is a one-dimensional array of size J_ne and type INT that holds the column indices of $J$ in either the sparse co-ordinate, or the sparse row-wise storage scheme. It need not be set when the dense or diagonal storage schemes are used, and in this case can be C_NULL.
J_ptr	is a one-dimensional array of size m+1 and type INT that holds the starting position of each row of $J$, as well as the total number of entries, in the sparse row-wise storage scheme. It need not be set when the other schemes are used, and in this case can be C_NULL.
H_type	is a one-dimensional array of type Vararg{Cchar} that specifies the symmetric storage scheme used for the Hessian, $H_L$. It should be one of ‘coordinate’, ‘sparse_by_rows’, ‘dense’, ‘diagonal’ or ‘absent’, the latter if access to $H$ is via matrix-vector products; lower or upper case variants are allowed.
H_ne	is a scalar variable of type INT that holds the number of entries in the lower triangular part of $H_L$ in the sparse co-ordinate storage scheme. It need not be set for any of the other three schemes.
H_row	is a one-dimensional array of size H_ne and type INT that holds the row indices of the lower triangular part of $H_L$ in the sparse co-ordinate storage scheme. It need not be set for any of the other three schemes, and in this case can be C_NULL.
H_col	is a one-dimensional array of size H_ne and type INT that holds the column indices of the lower triangular part of $H_L$ in either the sparse co-ordinate, or the sparse row-wise storage scheme. It need not be set when the dense or diagonal storage schemes are used, and in this case can be C_NULL.
H_ptr	is a one-dimensional array of size n+1 and type INT that holds the starting position of each row of the lower triangular part of $H$, as well as the total number of entries, in the sparse row-wise storage scheme. It need not be set when the other schemes are used, and in this case can be C_NULL.
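After the import call it is convenient to turn the returned status into a readable message. The sketch below is a hypothetical convenience wrapper, not part of the package; the codes and their meanings are those listed for the status parameter above.

```julia
# Hypothetical helper (not part of GALAHAD): describe the exit status of
# expo_import, using the codes documented for its status parameter.
const IMPORT_STATUS = Dict(
  1  => "import successful; ready for the solve phase",
  -1 => "allocation error (see inform.alloc_status and inform.bad_alloc)",
  -2 => "deallocation error (see inform.alloc_status and inform.bad_alloc)",
  -3 => "invalid n, m, J_type or H_type",
)

describe_import_status(status::Integer) =
  get(IMPORT_STATUS, Int(status), "unrecognized status $status")

msg = describe_import_status(Int32(-3))
```

The conversion through `Int` lets the helper accept a status of either integer type INT.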

    function expo_reset_control(T, INT, control, data, status)

Reset control parameters after import if required.

Parameters:

control

is a structure whose members provide control parameters for the remaining procedures (see expo_control_type)

data

holds private internal data

status

is a scalar variable of type INT that gives the exit status from the package. Possible values are:

1
The import was successful, and the package is ready for the solve phase

    function expo_solve_hessian_direct(T, INT, data, userdata, status,
                                       n, m, j_ne, h_ne,
                                       c_l, c_u, x_l, x_u,
                                       x, y, z, c, gl,
                                       eval_fc, eval_gj, eval_hl)

Find a local minimizer of the constrained optimization problem using the exponential penalty method.

This call is for the case where the Hessian of the Lagrangian function is available explicitly, and all function/derivative information is available by (direct) function calls.

Parameters:

data	holds private internal data
userdata	is a structure that allows data to be passed into the function and derivative evaluation programs.
status	is a scalar variable of type INT that gives the entry and exit status from the package. On initial entry, status must be set to 1. Possible exit values are:

0
The run was successful

-1
An allocation error occurred. A message indicating the offending array is written on unit control.error, and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively.

-2
A deallocation error occurred. A message indicating the offending array is written on unit control.error and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively.

-3
The restriction n > 0 or requirement that type contains its relevant string ‘dense’, ‘coordinate’, ‘sparse_by_rows’, or ‘diagonal’ has been violated.

-9
The analysis phase of the factorization failed; the return status from the factorization package is given in the component inform.factor_status

-10
The factorization failed; the return status from the factorization package is given in the component inform.factor_status.

-11
The solution of a set of linear equations using factors from the factorization package failed; the return status from the factorization package is given in the component inform.factor_status.

-16
The problem is so ill-conditioned that further progress is impossible.

-17
The step is too small to make further impact.

-18
Too many iterations have been performed. This may happen if control.max_it or control.max_eval is too small, but may also be symptomatic of a badly scaled problem.

-19
The CPU time limit has been reached. This may happen if control.cpu_time_limit is too small, but may also be symptomatic of a badly scaled problem.

-82
The user has forced termination of the solver by removing the file named control.alive_file from unit control.alive_unit.

n	is a scalar variable of type INT that holds the number of variables.
m	is a scalar variable of type INT that holds the number of general constraints.
j_ne	is a scalar variable of type INT that holds the number of entries in $J$.
h_ne	is a scalar variable of type INT that holds the number of entries in the lower triangular part of $H_L$.
c_l	is a one-dimensional array of size m and type T that holds the values $c_l$ of the lower bounds on the constraint functions $c(x)$. The i-th component of c_l, $i = 1, \ldots, m$, contains $c_{li}$.
c_u	is a one-dimensional array of size m and type T that holds the values $c_u$ of the upper bounds on the constraint functions $c(x)$. The i-th component of c_u, $i = 1, \ldots, m$, contains $c_{ui}$.
x_l	is a one-dimensional array of size n and type T that holds the values $x_l$ of the lower bounds on the optimization variables $x$. The j-th component of x_l, $j = 1, \ldots, n$, contains $x_{lj}$.
x_u	is a one-dimensional array of size n and type T that holds the values $x_u$ of the upper bounds on the optimization variables $x$. The j-th component of x_u, $j = 1, \ldots, n$, contains $x_{uj}$.
x	is a one-dimensional array of size n and type T that holds the values $x$ of the optimization variables. The j-th component of `x`, j = 1, … , n, contains $x_j$. This should be set on input to an estimate of the minimizer.
y	is a one-dimensional array of size m and type T that holds the values $y$ of the Lagrange multipliers. The i-th component of `y`, i = 1, … , m, contains $y_i$.
z	is a one-dimensional array of size n and type T that holds the values $z$ of the dual variables. The j-th component of `z`, j = 1, … , n, contains $z_j$.
c	is a one-dimensional array of size m and type T that holds the constraint functions $c(x)$. The i-th component of `c`, i = 1, … , m, contains $c_i(x)$.
gl	is a one-dimensional array of size n and type T that holds the gradient $g_L(x,y,z) = g(x) - J^T(x)y - z$ of the Lagrangian function. The j-th component of `gl`, j = 1, … , n, contains $g_{Lj}$.
eval_fc	is a user-supplied function that must have the following signature: function eval_fc(n, m, x, f, c, userdata) The value of the objective function $f(x)$ and the components of the constraint functions $c(x)$ evaluated at x=$x$ must be assigned to f and c, respectively, and the function return value set to 0. If the evaluation is impossible at x, return should be set to a nonzero value. Data may be passed into `eval_fc` via the structure `userdata`.
eval_gj	is a user-supplied function that must have the following signature: function eval_gj(n, m, j_ne, x, g, j, userdata) The components of the gradient $g = g(x)$ of the objective and Jacobian $J = \nabla_x c(x)$ of the constraints evaluated at x=$x$ must be assigned to g and to j, in the same order as presented to expo_import, and the function return value set to 0. If the evaluation is impossible at x, return should be set to a nonzero value. Data may be passed into `eval_gj` via the structure `userdata`.
eval_hl	is a user-supplied function that must have the following signature: function eval_hl(n, m, h_ne, x, y, h, userdata) The nonzeros of the Hessian of the Lagrangian function $H_L(x,y) = \nabla_{xx}f(x) -\sum_i y_i \nabla_{xx}c_i(x)$ evaluated at x=$x$ and y=$y$ must be assigned to h in the same order as presented to expo_import, and the function return value set to 0. If the evaluation is impossible at x, return should be set to a nonzero value. Data may be passed into `eval_hl` via the structure `userdata`.
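Errors in hand-coded derivatives are a common source of solver failures, so it can be worth checking the evaluation routines before calling the solver. The sketch below compares a coded gradient and dense row-wise Jacobian with central finite differences. The toy problem, the vector-based callback forms and the helper `max_derivative_error` are all assumptions made for this illustration; they are not the package's test problem or API.

```julia
# Hypothetical toy problem with n = 2 variables and m = 2 constraints
function eval_fc(x, f, c, userdata)
  f[1] = x[1]^2 + x[2]^2
  c[1] = x[1] + x[2] - 1
  c[2] = x[1]^2 - x[2]
  return 0
end

# gradient g and Jacobian J, the latter in dense (row-wise) order
function eval_gj(x, g, jval, userdata)
  g[1] = 2 * x[1]
  g[2] = 2 * x[2]
  jval[1] = 1.0;      jval[2] = 1.0     # row 1: dc1/dx1, dc1/dx2
  jval[3] = 2 * x[1]; jval[4] = -1.0    # row 2: dc2/dx1, dc2/dx2
  return 0
end

# compare the coded derivatives with central finite differences
function max_derivative_error(x; h = 1e-6, n = 2, m = 2)
  g, jval = zeros(n), zeros(n * m)
  eval_gj(x, g, jval, nothing)
  err = 0.0
  for j in 1:n
    xp, xm = copy(x), copy(x)
    xp[j] += h; xm[j] -= h
    fp, fm, cp, cm = zeros(1), zeros(1), zeros(m), zeros(m)
    eval_fc(xp, fp, cp, nothing); eval_fc(xm, fm, cm, nothing)
    err = max(err, abs((fp[1] - fm[1]) / (2h) - g[j]))
    for i in 1:m
      err = max(err, abs((cp[i] - cm[i]) / (2h) - jval[n * (i - 1) + j]))
    end
  end
  return err
end

err = max_derivative_error([0.3, -0.7])
```

A large value of `err` points to a mistake in either the derivative code or the ordering of the Jacobian entries, which must match the storage scheme declared at import.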

    function expo_information(T, INT, data, inform, status)

Provides output information

Parameters:

data

holds private internal data

inform

is a structure containing output information (see expo_inform_type)

status

is a scalar variable of type INT that gives the exit status from the package. Possible values are (currently):

0
The values were recorded successfully

    function expo_terminate(T, INT, data, control, inform)

Deallocate all internal private storage

Parameters:

data	holds private internal data
control	is a structure containing control information (see expo_control_type)
inform	is a structure containing output information (see expo_inform_type)

available structures#

expo_control_type structure#

    struct expo_control_type{T,INT}
      f_indexing::Bool
      error::INT
      out::INT
      print_level::INT
      start_print::INT
      stop_print::INT
      print_gap::INT
      max_it::INT
      max_eval::INT
      alive_unit::INT
      alive_file::NTuple{31,Cchar}
      update_multipliers_itmin::INT
      update_multipliers_tol::T
      infinity::T
      stop_abs_p::T
      stop_rel_p::T
      stop_abs_d::T
      stop_rel_d::T
      stop_abs_c::T
      stop_rel_c::T
      stop_s::T
      stop_subproblem_rel::T
      initial_mu::T
      mu_reduce::T
      minimum_weight::T
      obj_unbounded::T
      try_advanced_start::T
      try_sqp_start::T
      stop_advanced_start::T
      cpu_time_limit::T
      clock_time_limit::T
      hessian_available::Bool
      subproblem_direct::Bool
      space_critical::Bool
      deallocate_error_fatal::Bool
      prefix::NTuple{31,Cchar}
      bsc_control::bsc_control_type
      tru_control::tru_control_type{T,INT}
      ssls_control::ssls_control_type{T,INT}

detailed documentation#

control derived type as a Julia structure

components#

Bool f_indexing

use C or Fortran sparse matrix indexing

INT error

error and warning diagnostics occur on stream error

INT out

general output occurs on stream out

INT print_level

the level of output required.

$\leq$ 0 gives no output,
= 1 gives a one-line summary for every iteration,
= 2 gives a summary of the inner iteration for each iteration,
$\geq$ 3 gives increasingly verbose (debugging) output

INT start_print

any printing will start on this iteration

INT stop_print

any printing will stop on this iteration

INT print_gap

the number of iterations between printing

INT max_it

the maximum number of iterations permitted

INT max_eval

the maximum number of function evaluations permitted

INT alive_unit

removal of the file alive_file from unit alive_unit terminates execution

NTuple{31,Cchar} alive_file

see alive_unit

INT update_multipliers_itmin

update the Lagrange multipliers/dual variables from iteration .update_multipliers_itmin (<0 means never) and once the primal infeasibility is below .update_multipliers_tol

T update_multipliers_tol

see update_multipliers_itmin

T infinity

any bound larger than infinity in modulus will be regarded as infinite

T stop_abs_p

the required absolute and relative accuracies for the primal infeasibility (1)

T stop_rel_p

see stop_abs_p

T stop_abs_d

the required absolute and relative accuracies for the dual infeasibility (2)

T stop_rel_d

see stop_abs_d

T stop_abs_c

the required absolute and relative accuracies for the complementary slackness (3)

T stop_rel_c

see stop_abs_c

T stop_s

the smallest the norm of the step can be before termination

T stop_subproblem_rel

the subproblem minimization that uses GALAHAD TRU will be stopped as soon as the relative decrease in the subproblem gradient falls below .stop_subproblem_rel. If .stop_subproblem_rel is 1.0 or bigger or 0.0 or smaller, this value will be ignored, and the choice of stopping rule delegated to .tru_control.stop_g_relative (see below)

T initial_mu

initial value for the penalty parameter (<=0 means set automatically)

T mu_reduce

the amount by which the penalty parameter is decreased

T obj_unbounded

the smallest value the objective function may take before the problem is marked as unbounded

T try_advanced_start

try an advanced start at the end of every iteration when the KKT residuals are smaller than .try_advanced_start (-ve means never)

T try_sqp_start

try an advanced SQP start at the end of every iteration when the KKT residuals are smaller than .try_sqp_start (-ve means never)

T stop_advanced_start

stop the advanced start search once the residuals are smaller than .stop_advanced_start

T cpu_time_limit

the maximum CPU time allowed (-ve means infinite)

T clock_time_limit

the maximum elapsed clock time allowed (-ve means infinite)

Bool hessian_available

is the Hessian matrix of second derivatives available or is access only via matrix-vector products (coming soon)?

Bool subproblem_direct

use a direct (factorization) or (preconditioned) iterative method (coming soon) to find the search direction

Bool space_critical

if .space_critical is true, every effort will be made to use as little space as possible. This may result in longer computation time

Bool deallocate_error_fatal

if .deallocate_error_fatal is true, any array/pointer deallocation error will terminate execution. Otherwise, computation will continue

NTuple{31,Cchar} prefix

all output lines will be prefixed by .prefix(2:LEN(TRIM(.prefix))-1) where .prefix contains the required string enclosed in quotes, e.g. “string” or ‘string’

struct bsc_control_type bsc_control

control parameters for BSC

struct tru_control_type tru_control

control parameters for TRU

struct ssls_control_type ssls_control

control parameters for SSLS

expo_time_type structure#

    struct expo_time_type{T}
      total::Float32
      preprocess::Float32
      analyse::Float32
      factorize::Float32
      solve::Float32
      clock_total::T
      clock_preprocess::T
      clock_analyse::T
      clock_factorize::T
      clock_solve::T

detailed documentation#

time derived type as a Julia structure

components#

Float32 total

the total CPU time spent in the package

Float32 preprocess

the CPU time spent preprocessing the problem

Float32 analyse

the CPU time spent analysing the required matrices prior to factorization

Float32 factorize

the CPU time spent factorizing the required matrices

Float32 solve

the CPU time spent computing the search direction

T clock_total

the total clock time spent in the package

T clock_preprocess

the clock time spent preprocessing the problem

T clock_analyse

the clock time spent analysing the required matrices prior to factorization

T clock_factorize

the clock time spent factorizing the required matrices

T clock_solve

the clock time spent computing the search direction

expo_inform_type structure#

    struct expo_inform_type{T,INT}
      status::INT
      alloc_status::INT
      bad_alloc::NTuple{81,Cchar}
      bad_eval::NTuple{13,Cchar}
      iter::INT
      cg_iter::INT
      fc_eval::INT
      gj_eval::INT
      hl_eval::INT
      obj::T
      primal_infeasibility::T
      dual_infeasibility::T
      complementary_slackness::T
      time::expo_time_type{T}
      bsc_inform::bsc_inform_type{T,INT}
      tru_inform::tru_inform_type{T,INT}
      ssls_inform::ssls_inform_type{T,INT}

detailed documentation#

inform derived type as a Julia structure

components#

INT status

return status. See EXPO_solve for details

INT alloc_status

the status of the last attempted allocation/deallocation

NTuple{81,Cchar} bad_alloc

the name of the array for which an allocation/deallocation error occurred

NTuple{13,Cchar} bad_eval

the name of the user-supplied evaluation routine for which an error occurred

INT iter

the total number of iterations performed

INT fc_eval

the total number of evaluations of the objective $f(x)$ and constraint $c(x)$ functions

INT gj_eval

the total number of evaluations of the gradient $g(x)$ and Jacobian $J(x)$

INT hl_eval

the total number of evaluations of the Hessian $H(x,y)$ of the Lagrangian

T obj

the value of the objective function $f(x)$ at the best estimate of the solution, x, determined by EXPO_solve

T primal_infeasibility

the norm of the primal infeasibility (1) at the best estimate of the solution x, determined by EXPO_solve

T dual_infeasibility

the norm of the dual infeasibility (2) at the best estimate of the solution (x,y,z), determined by EXPO_solve

T complementary_slackness

the norm of the complementary slackness (3) at the best estimate of the solution (x,y,z), determined by EXPO_solve

expo_time_type{T} time

timings (see above)

bsc_inform_type{T,INT} bsc_inform

inform parameters for BSC

tru_inform_type{T,INT} tru_inform

inform parameters for TRU

ssls_inform_type{T,INT} ssls_inform

inform parameters for SSLS
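
The bad_alloc and bad_eval components are fixed-length C character buffers held as tuples of Cchar, so they need decoding before they can be printed. The following is a minimal sketch of one way to do that. The helper name `cchar_tuple_to_string` is hypothetical and not part of the GALAHAD interface, and the sketch assumes the buffer is NUL-terminated and possibly blank-padded.

```julia
# Hypothetical helper (not part of GALAHAD): convert a fixed-length C
# character buffer, stored as an NTuple of Cchar, into a Julia String.
# Assumption: the name ends at the first NUL and may be blank-padded.
function cchar_tuple_to_string(t::NTuple{N,Cchar}) where {N}
  bytes = UInt8[]
  for c in t
    c == 0 && break           # stop at the C string terminator
    push!(bytes, c % UInt8)   # reinterpret the Cchar as a raw byte
  end
  return String(rstrip(String(bytes)))
end

# Build an illustrative 13-character buffer of the kind bad_eval holds.
name = "eval_hl"
buf = ntuple(i -> i <= length(name) ? Cchar(name[i]) : Cchar(0), 13)

decoded = cchar_tuple_to_string(buf)
println(decoded)
```

With an inform structure filled in by expo_information, the same helper could be applied as `cchar_tuple_to_string(inform[].bad_eval)` when status reports an evaluation error.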


example calls#

This is an example of how to use the package to solve a constrained optimization problem; the code is available in $GALAHAD/src/expo/Julia/test_expo.jl. A variety of supported Hessian and constraint matrix storage formats are shown.

# test_expo.jl
# Simple code to test the Julia interface to EXPO

using GALAHAD
using Test
using Printf
using Accessors
using Quadmath

# Custom userdata struct
mutable struct userdata_expo{T}
  p::T
end

function test_expo(::Type{T}, ::Type{INT}; mode::String="direct", sls::String="sytr") where {T,INT}

  # compute the objective and constraints
  function eval_fc(x::Vector{T}, f::Vector{T}, c::Vector{T}, userdata::userdata_expo{T})
    f[1] = x[1]^2 + x[2]^2
    c[1] = x[1] + x[2] - 1
    c[2] = x[1]^2 + x[2]^2 - 1
    c[3] = userdata.p * x[1]^2 + x[2]^2 - userdata.p
    c[4] = x[1]^2 - x[2]
    c[5] = x[2]^2 - x[1]
    return INT(0)
  end

  function eval_fc_c(n::INT, m::INT, x::Ptr{T}, f::Ptr{T}, c::Ptr{T}, userdata::Ptr{Cvoid})
    _x = unsafe_wrap(Vector{T}, x, n)
    _f = unsafe_wrap(Vector{T}, f, 1)
    _c = unsafe_wrap(Vector{T}, c, m)
    _userdata = unsafe_pointer_to_objref(userdata)::userdata_expo{T}
    eval_fc(_x, _f, _c, _userdata)
  end

  eval_fc_ptr = @eval @cfunction($eval_fc_c, $INT, ($INT, $INT, Ptr{$T}, Ptr{$T}, Ptr{$T}, Ptr{Cvoid}))

  # compute the gradient and Jacobian
  function eval_gj(x::Vector{T}, g::Vector{T}, jval::Vector{T}, userdata::userdata_expo{T})
    g[1] = 2 * x[1]
    g[2] = 2 * x[2]
    jval[1] = 1
    jval[2] = 1
    jval[3] = 2 * x[1]
    jval[4] = 2 * x[2]
    jval[5] = 2 * userdata.p * x[1]
    jval[6] = 2 * x[2]
    jval[7] = 2 * x[1]
    jval[8] = -1
    jval[9] = -1
    jval[10] = 2 * x[2]
    return INT(0)
  end

  function eval_gj_c(n::INT, m::INT, J_ne::INT, x::Ptr{T}, g::Ptr{T},
                     jval::Ptr{T}, userdata::Ptr{Cvoid})
    _x = unsafe_wrap(Vector{T}, x, n)
    _g = unsafe_wrap(Vector{T}, g, n)
    _jval = unsafe_wrap(Vector{T}, jval, J_ne)
    _userdata = unsafe_pointer_to_objref(userdata)::userdata_expo{T}
    eval_gj(_x, _g, _jval, _userdata)
  end

  eval_gj_ptr = @eval @cfunction($eval_gj_c, $INT, ($INT, $INT, $INT, Ptr{$T}, Ptr{$T}, Ptr{$T}, Ptr{Cvoid}))

  # compute the gradient and dense Jacobian
  function eval_gj_dense(x::Vector{T}, g::Vector{T}, jval::Vector{T}, userdata::userdata_expo{T})
    g[1] = 2 * x[1]
    g[2] = 2 * x[2]
    jval[1] = 1
    jval[2] = 1
    jval[3] = 2 * x[1]
    jval[4] = 2 * x[2]
    jval[5] = 2 * userdata.p * x[1]
    jval[6] = 2 * x[2]
    jval[7] = 2 * x[1]
    jval[8] = -1
    jval[9] = -1
    jval[10] = 2 * x[2]
    return INT(0)
  end

  function eval_gj_dense_c(n::INT, m::INT, J_ne::INT, x::Ptr{T}, g::Ptr{T},
                           jval::Ptr{T}, userdata::Ptr{Cvoid})
    _x = unsafe_wrap(Vector{T}, x, n)
    _g = unsafe_wrap(Vector{T}, g, n)
    _jval = unsafe_wrap(Vector{T}, jval, J_ne)
    _userdata = unsafe_pointer_to_objref(userdata)::userdata_expo{T}
    eval_gj_dense(_x, _g, _jval, _userdata)
  end

  eval_gj_dense_ptr = @eval @cfunction($eval_gj_dense_c, $INT, ($INT, $INT, $INT, Ptr{$T}, Ptr{$T}, Ptr{$T}, Ptr{Cvoid}))

  # compute the Hessian
  function eval_hl(x::Vector{T}, y::Vector{T}, hval::Vector{T}, userdata::userdata_expo{T})
    hval[1] = 2 - 2 * (y[2] + userdata.p * y[3] + y[4])
    hval[2] = 2 - 2 * (y[2] + y[3] + y[5])
    return INT(0)
  end

  function eval_hl_c(n::INT, m::INT, H_ne::INT, x::Ptr{T}, y::Ptr{T},
                     hval::Ptr{T}, userdata::Ptr{Cvoid})
    _x = unsafe_wrap(Vector{T}, x, n)
    _y = unsafe_wrap(Vector{T}, y, m)
    _hval = unsafe_wrap(Vector{T}, hval, H_ne)
    _userdata = unsafe_pointer_to_objref(userdata)::userdata_expo{T}
    eval_hl(_x, _y, _hval, _userdata)
  end

  eval_hl_ptr = @eval @cfunction($eval_hl_c, $INT, ($INT, $INT, $INT, Ptr{$T}, Ptr{$T}, Ptr{$T}, Ptr{Cvoid}))

  # compute the dense Hessian
  function eval_hl_dense(x::Vector{T}, y::Vector{T}, hval::Vector{T}, userdata::userdata_expo{T})
    hval[1] = 2 - 2 * (y[2] + userdata.p * y[3] + y[4])
    hval[2] = 0
    hval[3] = 2 - 2 * (y[2] + y[3] + y[5])
    return INT(0)
  end

  function eval_hl_dense_c(n::INT, m::INT, H_ne::INT, x::Ptr{T}, y::Ptr{T},
                           hval::Ptr{T}, userdata::Ptr{Cvoid})
    _x = unsafe_wrap(Vector{T}, x, n)
    _y = unsafe_wrap(Vector{T}, y, m)
    _hval = unsafe_wrap(Vector{T}, hval, H_ne)
    _userdata = unsafe_pointer_to_objref(userdata)::userdata_expo{T}
    eval_hl_dense(_x, _y, _hval, _userdata)
  end

  eval_hl_dense_ptr = @eval @cfunction($eval_hl_dense_c, $INT, ($INT, $INT, $INT, Ptr{$T}, Ptr{$T}, Ptr{$T}, Ptr{Cvoid}))

  # Derived types
  data = Ref{Ptr{Cvoid}}()
  control = Ref{expo_control_type{T,INT}}()
  inform = Ref{expo_inform_type{T,INT}}()

  # Set user data
  userdata = userdata_expo{T}(9)
  userdata_ptr = pointer_from_objref(userdata)

  # Set problem data
  n = INT(2)  # variables
  m = INT(5)  # constraints
  j_ne = INT(10) # Jacobian elements
  h_ne = INT(2)  # Hessian elements
  j_ne_dense = INT(10) # dense Jacobian elements
  h_ne_dense = INT(3) # dense Hessian elements (lower triangle)
  J_row = INT[1, 1, 2, 2, 3, 3, 4, 4, 5, 5]  # Jacobian J row indices
  J_col = INT[1, 2, 1, 2, 1, 2, 1, 2, 1, 2]  # Jacobian J column indices
  J_ptr = INT[1, 3, 5, 7, 9, 11]  # row pointers
  H_row = INT[1, 2]  # Hessian H
  H_col = INT[1, 2]  # NB lower triangle
  H_ptr = INT[1, 2, 3]  # row pointers

  # Set storage
  y = zeros(T, m)  # multipliers
  z = zeros(T, n)  # dual variables
  c = zeros(T, m)  # constraints
  gl = zeros(T, n) # gradient
  x_l = T[-50.0, -50.0]  # variable lower bound
  x_u = T[50.0, 50.0]  # variable upper bound
  c_l = T[0.0, 0.0, 0.0, 0.0, 0.0]  # constraint lower bound
  c_u = T[Inf, Inf, Inf, Inf, Inf]  # constraint upper bound
  st = ' '
  status = Ref{INT}()

  @printf(" Fortran sparse matrix indexing\n\n")

  if mode == "direct"
    @printf(" test direct-communication options\n\n")

    for d in 1:4
      # Initialize EXPO
      expo_initialize(T, INT, data, control, inform)

      # Set linear solvers
      @reset control[].ssls_control.symmetric_linear_solver = galahad_linear_solver(sls)

      # Set user-defined control options
      # @reset control[].print_level = INT(10)
      # @reset control[].tru_control.print_level = INT(10)
      # @reset control[].ssls_control.print_level = INT(10)
      # @reset control[].ssls_control.sls_control.print_level = INT(10)

      @reset control[].max_it = INT(20)
      @reset control[].max_eval = INT(100)
      @reset control[].stop_abs_p = T(0.00001)
      @reset control[].stop_abs_d = T(0.00001)
      @reset control[].stop_abs_c = T(0.00001)

      x = T[3.0, 1.0]  # starting point

      # sparse co-ordinate storage
      if d == 1
        st = 'C'
        expo_import(T, INT, control, data, status, n, m,
                    "coordinate", j_ne, J_row, J_col, C_NULL,
                    "coordinate", h_ne, H_row, H_col, C_NULL )

        expo_solve_hessian_direct(T, INT, data,
                                  userdata_ptr, status, n, m, j_ne, h_ne,
                                  c_l, c_u, x_l, x_u, x, y, z, c, gl,
                                  eval_fc_ptr, eval_gj_ptr, eval_hl_ptr)
      end

      # sparse by rows
      if d == 2
        st = 'R'
        expo_import(T, INT, control, data, status, n, m,
                    "sparse_by_rows", j_ne, C_NULL, J_col, J_ptr,
                    "sparse_by_rows", h_ne, C_NULL, H_col, H_ptr )

        expo_solve_hessian_direct(T, INT, data,
                                  userdata_ptr, status, n, m, j_ne, h_ne,
                                  c_l, c_u, x_l, x_u, x, y, z, c, gl,
                                  eval_fc_ptr, eval_gj_ptr, eval_hl_ptr)
      end

      # dense
      if d == 3
        st = 'D'
        expo_import(T, INT, control, data, status, n, m,
                    "dense", j_ne_dense, C_NULL, C_NULL, C_NULL,
                    "dense", h_ne_dense, C_NULL, C_NULL, C_NULL )

        expo_solve_hessian_direct(T, INT, data,
                                  userdata_ptr, status, n, m, j_ne_dense, h_ne_dense,
                                  c_l, c_u, x_l, x_u, x, y, z, c, gl,
                                  eval_fc_ptr, eval_gj_dense_ptr, eval_hl_dense_ptr)
      end

      # diagonal
      if d == 4
        st = 'I'
        expo_import(T, INT, control, data, status, n, m,
                    "sparse_by_rows", j_ne, C_NULL, J_col, J_ptr,
                    "diagonal", n, C_NULL, C_NULL, C_NULL )

        expo_solve_hessian_direct(T, INT, data,
                                  userdata_ptr, status, n, m, j_ne, n,
                                  c_l, c_u, x_l, x_u, x, y, z, c, gl,
                                  eval_fc_ptr, eval_gj_ptr, eval_hl_ptr)
      end

      expo_information(T, INT, data, inform, status)

      if inform[].status == 0
        @printf("%c:%6i iterations. Optimal objective value = %5.2f status = %1i\n",
                st, inform[].iter, inform[].obj, inform[].status)
      else
        @printf("%c: EXPO_solve exit status = %1i\n", st, inform[].status)
      end

      # Delete internal workspace
      expo_terminate(T, INT, data, control, inform)
    end
  end

  return 0
end

for (T, INT, libgalahad) in ((Float32 , Int32, GALAHAD.libgalahad_single      ),
                             (Float32 , Int64, GALAHAD.libgalahad_single_64   ),
                             (Float64 , Int32, GALAHAD.libgalahad_double      ),
                             (Float64 , Int64, GALAHAD.libgalahad_double_64   ),
                             (Float128, Int32, GALAHAD.libgalahad_quadruple   ),
                             (Float128, Int64, GALAHAD.libgalahad_quadruple_64))
  if isfile(libgalahad)
    @testset "EXPO -- $T -- $INT" begin
      @testset "$mode communication" for mode in ("direct",)
        @test test_expo(T, INT; mode) == 0
      end
    end
  end
end
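
The index arrays in the example describe the same matrices in several storage formats. As a rough illustration of how they relate, the sketch below rebuilds the row pointers from the coordinate row indices and computes where each entry falls in the dense value arrays. The helper `row_pointers` is hypothetical, and the dense layouts (row by row for the Jacobian, lower triangle by rows for the Hessian) are assumed from the entry ordering used in the example.

```julia
# Coordinate description of the 5-by-2 Jacobian used in the example.
J_row = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
J_col = [1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
m, n = 5, 2

# Hypothetical helper: build 1-based row pointers from row indices that
# are already sorted by row. Row i occupies entries ptr[i]:(ptr[i+1]-1).
function row_pointers(rows::Vector{Int}, m::Int)
  counts = zeros(Int, m)
  for r in rows
    counts[r] += 1
  end
  ptr = ones(Int, m + 1)
  for i in 1:m
    ptr[i + 1] = ptr[i] + counts[i]
  end
  return ptr
end

J_ptr = row_pointers(J_row, m)
println(J_ptr)

# Assumed dense layout: row by row, so entry (i, j) of the Jacobian
# sits at position n * (i - 1) + j of the value array.
dense_pos = [n * (i - 1) + j for (i, j) in zip(J_row, J_col)]
println(dense_pos)

# Assumed dense symmetric layout: lower triangle by rows, so entry
# (i, j) with j <= i sits at position i * (i - 1) / 2 + j.
H_dense_pos = [div(i * (i - 1), 2) + j for i in 1:n for j in 1:i]
println(H_dense_pos)
```

Because every entry of this Jacobian is stored, the sparse and dense value arrays coincide, which is why eval_gj and eval_gj_dense fill jval identically. The Hessian differs: the dense version needs the extra zero for the (2,1) entry.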
