GALAHAD NLS package#
purpose#
The nls
package uses a regularization method to find a (local) unconstrained
minimizer of a differentiable weighted sum-of-squares objective function
See Section 4 of $GALAHAD/doc/nls.pdf for additional details.
terminology#
The gradient \(\nabla_x f(x)\) of a function \(f(x)\) is the vector whose \(i\)-th component is \(\partial f(x)/\partial x_i\). The Hessian \(\nabla_{xx} f(x)\) of \(f(x)\) is the symmetric matrix whose \(i,j\)-th entry is \(\partial^2 f(x)/\partial x_i \partial x_j\). The Hessian is sparse if a significant and useful proportion of the entries are universally zero.
The algorithm used by the package is iterative. From the current best estimate of the minimizer \(x_k\), a trial improved point \(x_k + s_k\) is sought. The correction \(s_k\) is chosen to improve a model \(m_k(s)\) of the objective function \(f(x_k+s)\) built around \(x_k\). The model is the sum of two basic components, a suitable approximation \(t_k(s)\) of \(f(x_k+s)\), %another approximation of \((\rho/r) \|x_k+s\|_r^r\) (if \(\rho > 0\)), and a regularization term \((\sigma_k/p) \|s\|_{S_k}^p\) involving a weight \(\sigma_k\), power \(p\) and a norm \(\|s\|_{S_k} := \sqrt{s^T S_k s}\) for a given positive definite scaling matrix \(S_k\) that is included to prevent large corrections. The weight \(\sigma_k\) is adjusted as the algorithm progresses to ensure convergence.
The model \(t_k(s)\) is a truncated Taylor-series approximation, and this relies on being able to compute or estimate derivatives of \(c(x)\). Various models are provided, and each has different derivative requirements. We denote the \(m\) by \(n\) residual Jacobian \(J(x) \equiv \nabla_x c(x)\) as the matrix whose \(i,j\)-th component
the first-order Taylor approximation \(f(x_k) + g(x_k)^T s\), where \(g(x) = J^T(x) W c(x)\),
a barely second-order approximation \(f(x_k) + g(x_k)^T s + \frac{1}{2} s^T W s\),
the Gauss-Newton approximation \(\frac{1}{2} \| c(x_k) + J(x_k) s\|^2_W\),
the Newton (second-order Taylor) approximation
\(f(x_k) + g(x_k)^T s + \frac{1}{2} s^T [ J^T(x_k) W J(x_k) + H(x_k,W c(x_k))] s\), and
the tensor Gauss-Newton approximation \(\frac{1}{2} \| c(x_k) + J(x_k) s + \frac{1}{2} s^T \cdot P(x_k,s) \|^2_W\), where the \(i\)-th component of \(s^T \cdot P(x_k,s)\) is shorthand for the scalar \(s^T H_i(x_k) s\), where \(W\) is the diagonal matrix of weights \(w_i\), \(i = 1, \ldots m\)0.
method#
An adaptive regularization method is used. In this, an improvement to a current estimate of the required minimizer, \(x_k\) is sought by computing a step \(s_k\). The step is chosen to approximately minimize a model \(t_k(s)\) of \(f_{\rho,r}(x_k+s)\) that includes a weighted regularization term \(\frac{\sigma_k}{p} \|s\|_{S_k}^p\) for some specified positive weight \(\sigma_k\). The quality of the resulting step \(s_k\) is assessed by computing the “ratio” \((f(x_k) - f(x_k + s_k))/(t_k(0) - t_k(s_k))\). The step is deemed to have succeeded if the ratio exceeds a given \(\eta_s > 0\), and in this case \(x_{k+1} = x_k + s_k\). Otherwise \(x_{k+1} = x_k\), and the weight is increased by powers of a given increase factor up to a given limit. If the ratio is larger than \(\eta_v \geq \eta_d\), the weight will be decreased by powers of a given decrease factor again up to a given limit. The method will terminate as soon as \(f(x_k)\) or \(\|\nabla_x f(x_k)\|\) is smaller than a specified value.
A choice of linear, quadratic or quartic models \(t_k(s)\) is available (see the previous section), and normally a two-norm regularization will be used, but this may change if preconditioning is employed.
If linear or quadratic models are employed, an appropriate,
approximate model minimizer is found using either a direct approach
involving factorization of a shift of the model Hessian \(B_k\) or an
iterative (conjugate-gradient/Lanczos) approach based on approximations
to the required solution from a so-called Krlov subspace. The direct
approach is based on the knowledge that the required solution
satisfies the linear system of equations \((B_k + \lambda_k I) s_k
= - \nabla_x f(x_k)\) involving a scalar Lagrange multiplier \(\lambda_k\).
This multiplier is found by uni-variate root finding, using a safeguarded
Newton-like process, by RQS
. The iterative approach
uses GLRT
, and is best accelerated by preconditioning with
good approximations to the Hessian of the model using PSLS
. The
iterative approach has the advantage that only Hessian matrix-vector products
are required, and thus the Hessian \(B_k\) is not required explicitly.
However when factorizations of the Hessian are possible, the direct approach
is often more efficient.
When a quartic model is used, the model is itself of least-squares form, and the package calls itself recursively to approximately minimize its model. The quartic model often gives a better approximation, but at the cost of more involved derivative requirements.
references#
The generic adaptive cubic regularization method is described in detail in
C. Cartis, N. I. M. Gould and Ph. L. Toint, ``Adaptive cubic regularisation methods for unconstrained optimization. Part I: motivation, convergence and numerical results’’ Mathematical Programming 127(2) (2011) 245–295,
and uses ``tricks’’ as suggested in
N. I. M. Gould, M. Porcelli and Ph. L. Toint, ``Updating the regularization parameter in the adaptive cubic regularization algorithm’’. Computational Optimization and Applications 53(1) (2012) 1–22.
The specific methods employed here are discussed in
N. I. M. Gould, J. A. Scott and T. Rees, ``Convergence and evaluation-complexity analysis of a regularized tensor-Newton method for solving nonlinear least-squares problems’’. Computational Optimization and Applications 73(1) (2019) 1–35.
matrix storage#
unsymmetric storage#
The unsymmetric \(m\) by \(n\) Jacobian matrix \(J = J(x)\) and the residual-Hessians-vector product matrix \(P(x,v)\) may be presented and stored in a variety of convenient input formats. Let \(A\) be \(J\) or \(P\) as appropriate.
Dense storage format: The matrix \(A\) is stored as a compact dense matrix by rows, that is, the values of the entries of each row in turn are stored in order within an appropriate real one-dimensional array. In this case, component \(n \ast i + j\) of the storage array A_val will hold the value \(A_{ij}\) for \(1 \leq i \leq m\), \(1 \leq j \leq n\). The string A_type = ‘dense’ should be specified.
Dense by columns storage format: The matrix \(A\) is stored as a compact dense matrix by columns, that is, the values of the entries of each column in turn are stored in order within an appropriate real one-dimensional array. In this case, component \(m \ast j + i\) of the storage array A_val will hold the value \(A_{ij}\) for \(1 \leq i \leq m\), \(1 \leq j \leq n\). The string A_type = ‘dense_by_columns’ should be specified.
Sparse co-ordinate storage format: Only the nonzero entries of the matrices are stored. For the \(l\)-th entry, \(1 \leq l \leq ne\), of \(A\), its row index i, column index j and value \(A_{ij}\), \(1 \leq i \leq m\), \(1 \leq j \leq n\), are stored as the \(l\)-th components of the integer arrays A_row and A_col and real array A_val, respectively, while the number of nonzeros is recorded as A_ne = \(ne\). The string A_type = ‘coordinate’should be specified.
Sparse row-wise storage format: Again only the nonzero entries are stored, but this time they are ordered so that those in row i appear directly before those in row i+1. For the i-th row of \(A\) the i-th component of the integer array A_ptr holds the position of the first entry in this row, while A_ptr(m+1) holds the total number of entries plus one. The column indices j, \(1 \leq j \leq n\), and values \(A_{ij}\) of the nonzero entries in the i-th row are stored in components l = A_ptr(i), \(\ldots\), A_ptr(i+1)-1, \(1 \leq i \leq m\), of the integer array A_col, and real array A_val, respectively. For sparse matrices, this scheme almost always requires less storage than its predecessor. The string A_type = ‘sparse_by_rows’ should be specified.
Sparse column-wise storage format: Once again only the nonzero entries are stored, but this time they are ordered so that those in column j appear directly before those in column j+1. For the j-th column of \(A\) the j-th component of the integer array A_ptr holds the position of the first entry in this column, while A_ptr(n+1) holds the total number of entries plus one. The row indices i, \(1 \leq i \leq m\), and values \(A_{ij}\) of the nonzero entries in the j-th columnsare stored in components l = A_ptr(j), \(\ldots\), A_ptr(j+1)-1, \(1 \leq j \leq n\), of the integer array A_row, and real array A_val, respectively. As before, for sparse matrices, this scheme almost always requires less storage than the co-ordinate format. The string A_type = ‘sparse_by_columns’ should be specified.
symmetric storage#
The symmetric \(n\) by \(n\) matrix \(H = H(x,y)\) may be presented and stored in a variety of formats. But crucially symmetry is exploited by only storing values from the lower triangular part (i.e, those entries that lie on or below the leading diagonal).
Dense storage format: The matrix \(H\) is stored as a compact dense matrix by rows, that is, the values of the entries of each row in turn are stored in order within an appropriate real one-dimensional array. Since \(H\) is symmetric, only the lower triangular part (that is the part \(H_{ij}\) for \(1 \leq j \leq i \leq n\)) need be held. In this case the lower triangle should be stored by rows, that is component \((i-1) * i / 2 + j\) of the storage array H_val will hold the value \(H_{ij}\) (and, by symmetry, \(H_{ji}\)) for \(1 \leq j \leq i \leq n\). The string H_type = ‘dense’ should be specified.
Sparse co-ordinate storage format: Only the nonzero entries of the matrices are stored. For the \(l\)-th entry, \(1 \leq l \leq ne\), of \(H\), its row index i, column index j and value \(H_{ij}\), \(1 \leq j \leq i \leq n\), are stored as the \(l\)-th components of the integer arrays H_row and H_col and real array H_val, respectively, while the number of nonzeros is recorded as H_ne = \(ne\). Note that only the entries in the lower triangle should be stored. The string H_type = ‘coordinate’ should be specified.
Sparse row-wise storage format: Again only the nonzero entries are stored, but this time they are ordered so that those in row i appear directly before those in row i+1. For the i-th row of \(H\) the i-th component of the integer array H_ptr holds the position of the first entry in this row, while H_ptr(n+1) holds the total number of entries plus one. The column indices j, \(1 \leq j \leq i\), and values \(H_{ij}\) of the entries in the i-th row are stored in components l = H_ptr(i), …, H_ptr(i+1)-1 of the integer array H_col, and real array H_val, respectively. Note that as before only the entries in the lower triangle should be stored. For sparse matrices, this scheme almost always requires less storage than its predecessor. The string H_type = ‘sparse_by_rows’ should be specified.
Diagonal storage format: If \(H\) is diagonal (i.e., \(H_{ij} = 0\) for all \(1 \leq i \neq j \leq n\)) only the diagonals entries \(H_{ii}\), \(1 \leq i \leq n\) need be stored, and the first n components of the array H_val may be used for the purpose. The string H_type = ‘diagonal’ should be specified.
Multiples of the identity storage format: If \(H\) is a multiple of the identity matrix, (i.e., \(H = \alpha I\) where \(I\) is the n by n identity matrix and \(\alpha\) is a scalar), it suffices to store \(\alpha\) as the first component of H_val. The string H_type = ‘scaled_identity’ should be specified.
The identity matrix format: If \(H\) is the identity matrix, no values need be stored. The string H_type = ‘identity’ should be specified.
The zero matrix format: The same is true if \(H\) is the zero matrix, but now the string H_type = ‘zero’ or ‘none’ should be specified.
introduction to function calls#
To solve a given problem, functions from the nls package must be called in the following order:
To solve a given problem, functions from the nls package must be called in the following order:
nls_initialize - provide default control parameters and set up initial data structures
nls_read_specfile (optional) - override control values by reading replacement values from a file
nls_import - set up problem data structures and fixed values
nls_reset_control (optional) - possibly change control parameters if a sequence of problems are being solved
solve the problem by calling one of
nls_solve_with_mat - solve using function calls to evaluate function, gradient and Hessian values
nls_solve_without_mat - solve using function calls to evaluate function and gradient values and Hessian-vector products
nls_solve_reverse_with_mat - solve returning to the calling program to obtain function, gradient and Hessian values, or
nls_solve_reverse_without_mat - solve returning to the calling prorgram to obtain function and gradient values and Hessian-vector products
nls_information (optional) - recover information about the solution and solution process
nls_terminate - deallocate data structures
See the examples section for illustrations of use.
parametric real type T#
Below, the symbol T refers to a parametric real type that may be Float32 (single precision) or Float64 (double precision).
callable functions#
function nls_initialize(T, data, control, inform)
Set default control values and initialize private data
Parameters:
data |
holds private internal data |
control |
is a structure containing control information (see nls_control_type) |
inform |
is a structure containing output information (see nls_inform_type) |
function nls_read_specfile(T, control, specfile)
Read the content of a specification file, and assign values associated with given keywords to the corresponding control parameters. An in-depth discussion of specification files is available, and a detailed list of keywords with associated default values is provided in $GALAHAD/src/nls/NLS.template. See also Table 2.1 in the Fortran documentation provided in $GALAHAD/doc/nls.pdf for a list of how these keywords relate to the components of the control structure.
Parameters:
control |
is a structure containing control information (see nls_control_type) |
specfile |
is a one-dimensional array of type Vararg{Cchar} that must give the name of the specification file |
function nls_import(T, control, data, status, n, m, J_type, J_ne, J_row, J_col, J_ptr, H_type, H_ne, H_row, H_col, H_ptr, P_type, P_ne, P_row, P_col, P_ptr, w)
Import problem data into internal storage prior to solution.
Parameters:
control |
is a structure whose members provide control parameters for the remaining procedures (see nls_control_type) |
data |
holds private internal data |
status |
is a scalar variable of type Int32 that gives the exit status from the package. Possible values are:
|
n |
is a scalar variable of type Int32 that holds the number of variables. |
m |
is a scalar variable of type Int32 that holds the number of residuals. |
J_type |
is a one-dimensional array of type Vararg{Cchar} that specifies the unsymmetric storage scheme used for the Jacobian, \(J\). It should be one of ‘coordinate’, ‘sparse_by_rows’, ‘dense’ or ‘absent’, the latter if access to the Jacobian is via matrix-vector products; lower or upper case variants are allowed. |
J_ne |
is a scalar variable of type Int32 that holds the number of entries in \(J\) in the sparse co-ordinate storage scheme. It need not be set for any of the other schemes. |
J_row |
is a one-dimensional array of size J_ne and type Int32 that holds the row indices of \(J\) in the sparse co-ordinate storage scheme. It need not be set for any of the other schemes, and in this case can be C_NULL. |
J_col |
is a one-dimensional array of size J_ne and type Int32 that holds the column indices of \(J\) in either the sparse co-ordinate, or the sparse row-wise storage scheme. It need not be set when the dense or diagonal storage schemes are used, and in this case can be C_NULL. |
J_ptr |
is a one-dimensional array of size m+1 and type Int32 that holds the starting position of each row of \(J\), as well as the total number of entries, in the sparse row-wise storage scheme. It need not be set when the other schemes are used, and in this case can be C_NULL. |
H_type |
is a one-dimensional array of type Vararg{Cchar} that specifies the symmetric storage scheme used for the Hessian, \(H\). It should be one of ‘coordinate’, ‘sparse_by_rows’, ‘dense’, ‘diagonal’ or ‘absent’, the latter if access to \(H\) is via matrix-vector products; lower or upper case variants are allowed. |
H_ne |
is a scalar variable of type Int32 that holds the number of entries in the lower triangular part of \(H\) in the sparse co-ordinate storage scheme. It need not be set for any of the other three schemes. |
H_row |
is a one-dimensional array of size H_ne and type Int32 that holds the row indices of the lower triangular part of \(H\) in the sparse co-ordinate storage scheme. It need not be set for any of the other three schemes, and in this case can be C_NULL. |
H_col |
is a one-dimensional array of size H_ne and type Int32 that holds the column indices of the lower triangular part of \(H\) in either the sparse co-ordinate, or the sparse row-wise storage scheme. It need not be set when the dense or diagonal storage schemes are used, and in this case can be C_NULL. |
H_ptr |
is a one-dimensional array of size n+1 and type Int32 that holds the starting position of each row of the lower triangular part of \(H\), as well as the total number of entries, in the sparse row-wise storage scheme. It need not be set when the other schemes are used, and in this case can be C_NULL. |
P_type |
is a one-dimensional array of type Vararg{Cchar} that specifies the unsymmetric storage scheme used for the residual-Hessians-vector product matrix, \(P\). It should be one of ‘coordinate’, ‘sparse_by_columns’, ‘dense_by_columns’ or ‘absent’, the latter if access to \(P\) is via matrix-vector products; lower or upper case variants are allowed. |
P_ne |
is a scalar variable of type Int32 that holds the number of entries in \(P\) in the sparse co-ordinate storage scheme. It need not be set for any of the other schemes. |
P_row |
is a one-dimensional array of size P_ne and type Int32 that holds the row indices of \(P\) in either the sparse co-ordinate, or the sparse column-wise storage scheme. It need not be set when the dense storage scheme is used, and in this case can be C_NULL. |
P_col |
is a one-dimensional array of size P_ne and type Int32 that holds the row indices of \(P\) in the sparse co-ordinate storage scheme. It need not be set for any of the other schemes, and in this case can be C_NULL. |
P_ptr |
is a one-dimensional array of size n+1 and type Int32 that holds the starting position of each row of \(P\), as well as the total number of entries, in the sparse row-wise storage scheme. It need not be set when the other schemes are used, and in this case can be C_NULL. |
w |
is a one-dimensional array of size m and type T that holds the values \(w\) of the weights on the residuals in the least-squares objective function. It need not be set if the weights are all ones, and in this case can be C_NULL. |
function nls_reset_control(T, control, data, status)
Reset control parameters after import if required.
Parameters:
control |
is a structure whose members provide control parameters for the remaining procedures (see nls_control_type) |
data |
holds private internal data |
status |
is a scalar variable of type Int32 that gives the exit status from the package. Possible values are:
|
function nls_solve_with_mat(T, data, userdata, status, n, m, x, c, g, eval_c, j_ne, eval_j, h_ne, eval_h, p_ne, eval_hprods)
Find a local minimizer of a given function using a trust-region method.
This call is for the case where \(H = \nabla_{xx}f(x)\) is provided specifically, and all function/derivative information is available by function calls.
Parameters:
data |
holds private internal data |
userdata |
is a structure that allows data to be passed into the function and derivative evaluation programs. |
status |
is a scalar variable of type Int32 that gives the entry and exit status from the package. On initial entry, status must be set to 1. Possible exit values are:
|
n |
is a scalar variable of type Int32 that holds the number of variables. |
m |
is a scalar variable of type Int32 that holds the number of residuals. |
x |
is a one-dimensional array of size n and type T that holds the values \(x\) of the optimization variables. The j-th component of |
c |
is a one-dimensional array of size m and type T that holds the residual \(c(x)\). The i-th component of |
g |
is a one-dimensional array of size n and type T that holds the gradient \(g = \nabla_xf(x)\) of the objective function. The j-th component of |
eval_c |
is a user-supplied function that must have the following signature: function eval_c(n, x, c, userdata) The componnts of the residual function \(c(x)\)
evaluated at x=\(x\) must be assigned to c, and the
function return value set to 0. If the evaluation is
impossible at x, return should be set to a nonzero
value. Data may be passed into |
j_ne |
is a scalar variable of type Int32 that holds the number of entries in the Jacobian matrix \(J\). |
eval_j |
is a user-supplied function that must have the following signature: function eval_j(n, m, jne, x, j, userdata) The components of the Jacobian \(J = \nabla_x c(x\)) of
the residuals must be assigned to j in the same order
as presented to nls_import, and the function return
value set to 0. If the evaluation is impossible at x,
return should be set to a nonzero value. Data may be
passed into |
h_ne |
is a scalar variable of type Int32 that holds the number of entries in the lower triangular part of the Hessian matrix \(H\) if it is used. |
eval_h |
is a user-supplied function that must have the following signature: function eval_h(n, m, hne, x, y, h, userdata) The nonzeros of the matrix \(H = \sum_{i=1}^m y_i
\nabla_{xx}c_i(x)\) of the weighted residual Hessian
evaluated at x=\(x\) and y=\(y\) must be assigned to h
in the same order as presented to nls_import, and the
function return value set to 0. If the evaluation is
impossible at x, return should be set to a nonzero
value. Data may be passed into |
p_ne |
is a scalar variable of type Int32 that holds the number of entries in the residual-Hessians-vector product matrix \(P\) if it is used. |
eval_hprods |
is an optional user-supplied function that may be C_NULL. If non-NULL, it must have the following signature: function eval_hprods(n, m, pne, x, v, p, got_h, userdata) The entries of the matrix \(P\), whose i-th column is
the product \(\nabla_{xx}c_i(x) v\) between
\(\nabla_{xx}c_i(x)\), the Hessian of the i-th component
of the residual \(c(x)\) at x=\(x\), and v=\(v\) must be
returned in p and the function return value set
to 0. If the evaluation is impossible at x, return
should be set to a nonzero value. Data may be passed
into |
function nls_solve_without_mat(T, data, userdata, status, n, m, x, c, g, eval_c, eval_jprod, eval_hprod, p_ne, eval_hprods)
Find a local minimizer of a given function using a trust-region method.
This call is for the case where access to \(H = \nabla_{xx}f(x)\) is provided by Hessian-vector products, and all function/derivative information is available by function calls.
Parameters:
data |
holds private internal data |
userdata |
is a structure that allows data to be passed into the function and derivative evaluation programs. |
status |
is a scalar variable of type Int32 that gives the entry and exit status from the package. On initial entry, status must be set to 1. Possible exit values are:
|
n |
is a scalar variable of type Int32 that holds the number of variables |
m |
is a scalar variable of type Int32 that holds the number of residuals. |
x |
is a one-dimensional array of size n and type T that holds the values \(x\) of the optimization variables. The j-th component of |
c |
is a one-dimensional array of size m and type T that holds the residual \(c(x)\). The i-th component of |
g |
is a one-dimensional array of size n and type T that holds the gradient \(g = \nabla_xf(x)\) of the objective function. The j-th component of |
eval_c |
is a user-supplied function that must have the following signature: function eval_c(n, x, c, userdata) The componnts of the residual function \(c(x)\)
evaluated at x=\(x\) must be assigned to c, and the
function return value set to 0. If the evaluation is
impossible at x, return should be set to a nonzero
value. Data may be passed into |
eval_jprod |
is a user-supplied function that must have the following signature: function eval_jprod(n, m, x, transpose, u, v, got_j, userdata) The sum \(u + \nabla_{x}c_(x) v\) (if the Bool transpose
is false) or The sum \(u + (\nabla_{x}c_(x))^T v\) (if
tranpose is true) bewteen the product of the Jacobian
\(\nabla_{x}c_(x)\) or its tranpose with the vector
v=\(v\) and the vector $ \(u\) must be returned in u, and
the function return value set to 0. If the evaluation
is impossible at x, return should be set to a nonzero
value. Data may be passed into |
eval_hprod |
is a user-supplied function that must have the following signature: function eval_hprod(n, m, x, y, u, v, got_h, userdata) The sum \(u + \sum_{i=1}^m y_i \nabla_{xx}c_i(x) v\) of
the product of the weighted residual Hessian \(H =
\sum_{i=1}^m y_i \nabla_{xx}c_i(x)\) evaluated at
x=\(x\) and y=\(y\) with the vector v=\(v\) and the vector
\(u\) must be returned in u, and the function return
value set to 0. If the evaluation is impossible at x,
return should be set to a nonzero value. The Hessians
have already been evaluated or used at x if the Bool
got_h is true. Data may be passed into |
p_ne |
is a scalar variable of type Int32 that holds the number of entries in the residual-Hessians-vector product matrix \(P\) if it is used. |
eval_hprods |
is an optional user-supplied function that may be C_NULL. If non-NULL, it must have the following signature: function eval_hprods(n, m, p_ne, x, v, pval, got_h, userdata) The entries of the matrix \(P\), whose i-th column is
the product \(\nabla_{xx}c_i(x) v\) between
\(\nabla_{xx}c_i(x)\), the Hessian of the i-th component
of the residual \(c(x)\) at x=\(x\), and v=\(v\) must be
returned in pval and the function return value set
to 0. If the evaluation is impossible at x, return
should be set to a nonzero value. Data may be passed
into |
function nls_solve_reverse_with_mat(T, data, status, eval_status, n, m, x, c, g, j_ne, J_val, y, h_ne, H_val, v, p_ne, P_val)
Find a local minimizer of a given function using a trust-region method.
This call is for the case where \(H = \nabla_{xx}f(x)\) is provided specifically, but function/derivative information is only available by returning to the calling procedure
Parameters:
data |
holds private internal data |
status |
is a scalar variable of type Int32 that gives the entry and exit status from the package. On initial entry, status must be set to 1. Possible exit values are:
|
eval_status |
is a scalar variable of type Int32 that is used to indicate if objective function/gradient/Hessian values can be provided (see above) |
n |
is a scalar variable of type Int32 that holds the number of variables |
m |
is a scalar variable of type Int32 that holds the number of residuals. |
x |
is a one-dimensional array of size n and type T that holds the values \(x\) of the optimization variables. The j-th component of |
c |
is a one-dimensional array of size m and type T that holds the residual \(c(x)\). The i-th component of |
g |
is a one-dimensional array of size n and type T that holds the gradient \(g = \nabla_xf(x)\) of the objective function. The j-th component of |
j_ne |
is a scalar variable of type Int32 that holds the number of entries in the Jacobian matrix \(J\). |
J_val |
is a one-dimensional array of size j_ne and type T that holds the values of the entries of the Jacobian matrix \(J\) in any of the available storage schemes. See status = 3, above, for more details. |
y |
is a one-dimensional array of size m and type T that is used for reverse communication. See status = 4 above for more details. |
h_ne |
is a scalar variable of type Int32 that holds the number of entries in the lower triangular part of the Hessian matrix \(H\). |
H_val |
is a one-dimensional array of size h_ne and type T that holds the values of the entries of the lower triangular part of the Hessian matrix \(H\) in any of the available storage schemes. See status = 4, above, for more details. |
v |
is a one-dimensional array of size n and type T that is used for reverse communication. See status = 7, above, for more details. |
p_ne |
is a scalar variable of type Int32 that holds the number of entries in the residual-Hessians-vector product matrix, \(P\). |
P_val |
is a one-dimensional array of size p_ne and type T that holds the values of the entries of the residual-Hessians-vector product matrix, \(P\). See status = 7, above, for more details. |
function nls_solve_reverse_without_mat(T, data, status, eval_status, n, m, x, c, g, transpose, u, v, y, p_ne, P_val)
Find a local minimizer of a given function using a trust-region method.
This call is for the case where access to \(H = \nabla_{xx}f(x)\) is provided by Hessian-vector products, but function/derivative information is only available by returning to the calling procedure.
Parameters:
data |
holds private internal data |
status |
is a scalar variable of type Int32 that gives the entry and exit status from the package. On initial entry, status must be set to 1. Possible exit values are:
|
eval_status |
is a scalar variable of type Int32 that is used to indicate if objective function/gradient/Hessian values can be provided (see above) |
n |
is a scalar variable of type Int32 that holds the number of variables |
m |
is a scalar variable of type Int32 that holds the number of residuals. |
x |
is a one-dimensional array of size n and type T that holds the values \(x\) of the optimization variables. The j-th component of |
c |
is a one-dimensional array of size m and type T that holds the residual \(c(x)\). The i-th component of |
g |
is a one-dimensional array of size n and type T that holds the gradient \(g = \nabla_xf(x)\) of the objective function. The j-th component of |
transpose |
is a scalar variable of type Bool, that indicates whether the product with Jacobian or its transpose should be obtained when status=5. |
u |
is a one-dimensional array of size max(n,m) and type T that is used for reverse communication. See status = 5,6 above for more details. |
v |
is a one-dimensional array of size max(n,m) and type T that is used for reverse communication. See status = 5,6,7 above for more details. |
y |
is a one-dimensional array of size m and type T that is used for reverse communication. See status = 6 above for more details. |
p_ne |
is a scalar variable of type Int32 that holds the number of entries in the residual-Hessians-vector product matrix, \(P\). |
P_val |
is a one-dimensional array of size P_ne and type T that holds the values of the entries of the residual-Hessians-vector product matrix, \(P\). See status = 7, above, for more details. |
function nls_information(T, data, inform, status)
Provides output information
Parameters:
data |
holds private internal data |
inform |
is a structure containing output information (see nls_inform_type) |
status |
is a scalar variable of type Int32 that gives the exit status from the package. Possible values are (currently):
|
function nls_terminate(T, data, control, inform)
Deallocate all internal private storage
Parameters:
data |
holds private internal data |
control |
is a structure containing control information (see nls_control_type) |
inform |
is a structure containing output information (see nls_inform_type) |
available structures#
nls_subproblem_control_type structure#
struct nls_subproblem_control_type{T} error::Int32 out::Int32 print_level::Int32 start_print::Int32 stop_print::Int32 print_gap::Int32 maxit::Int32 alive_unit::Int32 alive_file::NTuple{31,Cchar} jacobian_available::Int32 hessian_available::Int32 model::Int32 norm::Int32 non_monotone::Int32 weight_update_strategy::Int32 stop_c_absolute::T stop_c_relative::T stop_g_absolute::T stop_g_relative::T stop_s::T power::T initial_weight::T minimum_weight::T initial_inner_weight::T eta_successful::T eta_very_successful::T eta_too_successful::T weight_decrease_min::T weight_decrease::T weight_increase::T weight_increase_max::T reduce_gap::T tiny_gap::T large_root::T switch_to_newton::T cpu_time_limit::T clock_time_limit::T subproblem_direct::Bool renormalize_weight::Bool magic_step::Bool print_obj::Bool space_critical::Bool deallocate_error_fatal::Bool prefix::NTuple{31,Cchar} rqs_control::rqs_control_type{T} glrt_control::glrt_control_type{T} psls_control::psls_control_type{T} bsc_control::bsc_control_type roots_control::roots_control_type{T}
detailed documentation#
subproblem_control derived type as a Julia structure
components#
Int32 error
error and warning diagnostics occur on stream error
Int32 out
general output occurs on stream out
Int32 print_level
the level of output required.
\(\leq\) 0 gives no output,
= 1 gives a one-line summary for every iteration,
= 2 gives a summary of the inner iteration for each iteration,
\(\geq\) 3 gives increasingly verbose (debugging) output
Int32 start_print
any printing will start on this iteration
Int32 stop_print
any printing will stop on this iteration
Int32 print_gap
the number of iterations between printing
Int32 maxit
the maximum number of iterations performed
Int32 alive_unit
removal of the file alive_file from unit alive_unit terminates execution
char alive_file[31]
see alive_unit
Int32 jacobian_available
is the Jacobian matrix of first derivatives available (\(\geq\) 2), is access only via matrix-vector products (=1) or is it not available (\(\leq\) 0) ?
Int32 hessian_available
is the Hessian matrix of second derivatives available (\(\geq\) 2), is access only via matrix-vector products (=1) or is it not available (\(\leq\) 0) ?
Int32 model
the model used.
Possible values are
0 dynamic (not yet implemented)
1 first-order (no Hessian)
2 barely second-order (identity Hessian)
3 Gauss-Newton (\(J^T J\) Hessian)
4 second-order (exact Hessian)
5 Gauss-Newton to Newton transition
6 tensor Gauss-Newton treated as a least-squares model
7 tensor Gauss-Newton treated as a general model
8 tensor Gauss-Newton transition from a least-squares to a general mode
Int32 norm
the regularization norm used.
The norm is defined via \(\|v\|^2 = v^T S v\), and will define the preconditioner used for iterative methods. Possible values for \(S\) are
-3 user’s own regularization norm
-2 \(S\) = limited-memory BFGS matrix (with .PSLS_control.lbfgs_vectors history) (not yet implemented)
-1 identity (= Euclidan two-norm)
0 automatic (not yet implemented)
1 diagonal, \(S\) = diag( max(\(J^TJ\) Hessian, .PSLS_control.min_diagonal ) )
2 diagonal, \(S\) = diag( max( Hessian, .PSLS_control.min_diagonal ) )
3 banded, \(S\) = band( Hessian ) with semi-bandwidth .PSLS_control.semi_bandwidth
4 re-ordered band, P=band(order(A)) with semi-bandwidth .PSLS_control.semi_bandwidth
5 full factorization, \(S\) = Hessian, Schnabel-Eskow modification
6 full factorization, \(S\) = Hessian, GMPS modification (not yet implemented)
7 incomplete factorization of Hessian, Lin-More’
8 incomplete factorization of Hessian, HSL_MI28
9 incomplete factorization of Hessian, Munskgaard (not yet implemented)
10 expanding band of Hessian (not yet implemented)
Int32 non_monotone
non-monotone \(\leq\) 0 monotone strategy used, anything else non-monotone strategy with this history length used
Int32 weight_update_strategy
define the weight-update strategy: 1 (basic), 2 (reset to zero when very successful), 3 (imitate TR), 4 (increase lower bound), 5 (GPT)
T stop_c_absolute
overall convergence tolerances. The iteration will terminate when \(||c(x)||_2 \leq\) MAX( .stop_c_absolute, .stop_c_relative \(* \|c(x_{\mbox{initial}})\|_2\), or when the norm of the gradient, \(g = J^T(x) c(x) / \|c(x)\|_2\), of \|\|c\|\|_2, satisfies \(\|g\|_2 \leq\) MAX( .stop_g_absolute, .stop_g_relative \(* \|g_{\mbox{initial}}\|_2\), or if the step is less than .stop_s
T stop_c_relative
see stop_c_absolute
T stop_g_absolute
see stop_c_absolute
T stop_g_relative
see stop_c_absolute
T stop_s
see stop_c_absolute
T power
the regularization power (<2 => chosen according to the model)
T initial_weight
initial value for the regularization weight (-ve => \(1/\|g_0\|)\))
T minimum_weight
minimum permitted regularization weight
T initial_inner_weight
initial value for the inner regularization weight for tensor GN (-ve => 0)
T eta_successful
a potential iterate will only be accepted if the actual decrease f - f(x_new) is larger than .eta_successful times that predicted by a quadratic model of the decrease. The regularization weight will be decreaed if this relative decrease is greater than .eta_very_successful but smaller than .eta_too_successful
T eta_very_successful
see eta_successful
T eta_too_successful
see eta_successful
T weight_decrease_min
on very successful iterations, the regularization weight will be reduced by the factor .weight_decrease but no more than .weight_decrease_min while if the iteration is unsucceful, the weight will be increased by a factor .weight_increase but no more than .weight_increase_max (these are delta_1, delta_2, delta3 and delta_max in Gould, Porcelli and Toint, 2011)
T weight_decrease
see weight_decrease_min
T weight_increase
see weight_decrease_min
T weight_increase_max
see weight_decrease_min
T reduce_gap
expert parameters as suggested in Gould, Porcelli and Toint, “Updating t regularization parameter in the adaptive cubic regularization algorithm” RAL-TR-2011-007, Rutherford Appleton Laboratory, England (2011), http://epubs.stfc.ac.uk/bitstream/6181/RAL-TR-2011-007.pdf (these are denoted beta, epsilon_chi and alpha_max in the paper)
T tiny_gap
see reduce_gap
T large_root
see reduce_gap
T switch_to_newton
if the Gauss-Newto to Newton model is specified, switch to Newton as soon as the norm of the gradient g is smaller than switch_to_newton
T cpu_time_limit
the maximum CPU time allowed (-ve means infinite)
T clock_time_limit
the maximum elapsed clock time allowed (-ve means infinite)
Bool subproblem_direct
use a direct (factorization) or (preconditioned) iterative method to find the search direction
Bool renormalize_weight
should the weight be renormalized to account for a change in scaling?
Bool magic_step
allow the user to perform a “magic” step to improve the objective
Bool print_obj
print values of the objective/gradient rather than ||c|| and its gradien
Bool space_critical
if .space_critical true, every effort will be made to use as little space as possible. This may result in longer computation time
Bool deallocate_error_fatal
if .deallocate_error_fatal is true, any array/pointer deallocation error will terminate execution. Otherwise, computation will continue
NTuple{31,Cchar} prefix
all output lines will be prefixed by .prefix(2:LEN(TRIM(.prefix))-1) where .prefix contains the required string enclosed in quotes, e.g. “string” or ‘string’
struct rqs_control_type rqs_control
control parameters for RQS
struct glrt_control_type glrt_control
control parameters for GLRT
struct psls_control_type psls_control
control parameters for PSLS
struct bsc_control_type bsc_control
control parameters for BSC
struct roots_control_type roots_control
control parameters for ROOTS
nls_control_type structure#
struct nls_control_type{T} f_indexing::Bool error::Int32 out::Int32 print_level::Int32 start_print::Int32 stop_print::Int32 print_gap::Int32 maxit::Int32 alive_unit::Int32 alive_file::NTuple{31,Cchar} jacobian_available::Int32 hessian_available::Int32 model::Int32 norm::Int32 non_monotone::Int32 weight_update_strategy::Int32 stop_c_absolute::T stop_c_relative::T stop_g_absolute::T stop_g_relative::T stop_s::T power::T initial_weight::T minimum_weight::T initial_inner_weight::T eta_successful::T eta_very_successful::T eta_too_successful::T weight_decrease_min::T weight_decrease::T weight_increase::T weight_increase_max::T reduce_gap::T tiny_gap::T large_root::T switch_to_newton::T cpu_time_limit::T clock_time_limit::T subproblem_direct::Bool renormalize_weight::Bool magic_step::Bool print_obj::Bool space_critical::Bool deallocate_error_fatal::Bool prefix::NTuple{31,Cchar} rqs_control::rqs_control_type{T} glrt_control::glrt_control_type{T} psls_control::psls_control_type{T} bsc_control::bsc_control_type roots_control::roots_control_type{T} subproblem_control::nls_subproblem_control_type{T}
detailed documentation#
control derived type as a Julia structure
components#
Bool f_indexing
use C or Fortran sparse matrix indexing
Int32 error
error and warning diagnostics occur on stream error
Int32 out
general output occurs on stream out
Int32 print_level
the level of output required.
\(\leq\) 0 gives no output,
= 1 gives a one-line summary for every iteration,
= 2 gives a summary of the inner iteration for each iteration,
\(\geq\) 3 gives increasingly verbose (debugging) output
Int32 start_print
any printing will start on this iteration
Int32 stop_print
any printing will stop on this iteration
Int32 print_gap
the number of iterations between printing
Int32 maxit
the maximum number of iterations performed
Int32 alive_unit
removal of the file alive_file from unit alive_unit terminates execution
char alive_file[31]
see alive_unit
Int32 jacobian_available
is the Jacobian matrix of first derivatives available (\(\geq\) 2), is access only via matrix-vector products (=1) or is it not available (\(\leq\) 0) ?
Int32 hessian_available
is the Hessian matrix of second derivatives available (\(\geq\) 2), is access only via matrix-vector products (=1) or is it not available (\(\leq\) 0) ?
Int32 model
the model used.
Possible values are
0 dynamic (not yet implemented)
1 first-order (no Hessian)
2 barely second-order (identity Hessian)
3 Gauss-Newton (\(J^T J\) Hessian)
4 second-order (exact Hessian)
5 Gauss-Newton to Newton transition
6 tensor Gauss-Newton treated as a least-squares model
7 tensor Gauss-Newton treated as a general model
8 tensor Gauss-Newton transition from a least-squares to a general mode
Int32 norm
the regularization norm used.
The norm is defined via \(\|v\|^2 = v^T S v\), and will define the preconditioner used for iterative methods. Possible values for \(S\) are
-3 user’s own regularization norm
-2 \(S\) = limited-memory BFGS matrix (with .PSLS_control.lbfgs_vectors history) (not yet implemented)
-1 identity (= Euclidan two-norm)
0 automatic (not yet implemented)
1 diagonal, \(S\) = diag( max(\(J^TJ\) Hessian, .PSLS_control.min_diagonal ) )
2 diagonal, \(S\) = diag( max( Hessian, .PSLS_control.min_diagonal ) )
3 banded, \(S\) = band( Hessian ) with semi-bandwidth .PSLS_control.semi_bandwidth
4 re-ordered band, P=band(order(A)) with semi-bandwidth .PSLS_control.semi_bandwidth
5 full factorization, \(S\) = Hessian, Schnabel-Eskow modification
6 full factorization, \(S\) = Hessian, GMPS modification (not yet implemented)
7 incomplete factorization of Hessian, Lin-More’
8 incomplete factorization of Hessian, HSL_MI28
9 incomplete factorization of Hessian, Munskgaard (not yet implemented)
10 expanding band of Hessian (not yet implemented)
Int32 non_monotone
non-monotone \(\leq\) 0 monotone strategy used, anything else non-monotone strategy with this history length used
Int32 weight_update_strategy
define the weight-update strategy: 1 (basic), 2 (reset to zero when very successful), 3 (imitate TR), 4 (increase lower bound), 5 (GPT)
T stop_c_absolute
overall convergence tolerances. The iteration will terminate when \(||c(x)||_2 \leq\) MAX( .stop_c_absolute, .stop_c_relative \(* \|c(x_{\mbox{initial}})\|_2\) or when the norm of the gradient, \(g = J^T(x) c(x) / \|c(x)\|_2\), of \|\|c(x)\|\|_2 satisfies \(\|g\|_2 \leq\) MAX( .stop_g_absolute, .stop_g_relative \(* \|g_{\mbox{initial}}\|_2\), or if the step is less than .stop_s
T stop_c_relative
see stop_c_absolute
T stop_g_absolute
see stop_c_absolute
T stop_g_relative
see stop_c_absolute
T stop_s
see stop_c_absolute
T power
the regularization power (<2 => chosen according to the model)
T initial_weight
initial value for the regularization weight (-ve => \(1/\|g_0\|)\))
T minimum_weight
minimum permitted regularization weight
T initial_inner_weight
initial value for the inner regularization weight for tensor GN (-ve => 0)
T eta_successful
a potential iterate will only be accepted if the actual decrease f - f(x_new) is larger than .eta_successful times that predicted by a quadratic model of the decrease. The regularization weight will be decreaed if this relative decrease is greater than .eta_very_successful but smaller than .eta_too_successful
T eta_very_successful
see eta_successful
T eta_too_successful
see eta_successful
T weight_decrease_min
on very successful iterations, the regularization weight will be reduced by the factor .weight_decrease but no more than .weight_decrease_min while if the iteration is unsucceful, the weight will be increased by a factor .weight_increase but no more than .weight_increase_max (these are delta_1, delta_2, delta3 and delta_max in Gould, Porcelli and Toint, 2011)
T weight_decrease
see weight_decrease_min
T weight_increase
see weight_decrease_min
T weight_increase_max
see weight_decrease_min
T reduce_gap
- expert parameters as suggested in Gould, Porcelli and Toint, “Updating the regularization parameter in the adaptive cubic
regularization algorithm”, RAL-TR-2011-007, Rutherford Appleton Laboratory, England (2011), http://epubs.stfc.ac.uk/bitstream/6181/RAL-TR-2011-007.pdf (these are denoted beta, epsilon_chi and alpha_max in the paper)
T tiny_gap
see reduce_gap
T large_root
see reduce_gap
T switch_to_newton
if the Gauss-Newto to Newton model is specified, switch to Newton as soon as the norm of the gradient g is smaller than switch_to_newton
T cpu_time_limit
the maximum CPU time allowed (-ve means infinite)
T clock_time_limit
the maximum elapsed clock time allowed (-ve means infinite)
Bool subproblem_direct
use a direct (factorization) or (preconditioned) iterative method to find the search direction
Bool renormalize_weight
should the weight be renormalized to account for a change in scaling?
Bool magic_step
allow the user to perform a “magic” step to improve the objective
Bool print_obj
print values of the objective/gradient rather than ||c|| and its gradient
Bool space_critical
if .space_critical true, every effort will be made to use as little space as possible. This may result in longer computation time
Bool deallocate_error_fatal
if .deallocate_error_fatal is true, any array/pointer deallocation error will terminate execution. Otherwise, computation will continue
NTuple{31,Cchar} prefix
all output lines will be prefixed by .prefix(2:LEN(TRIM(.prefix))-1) where .prefix contains the required string enclosed in quotes, e.g. “string” or ‘string’
struct rqs_control_type rqs_control
control parameters for RQS
struct glrt_control_type glrt_control
control parameters for GLRT
struct psls_control_type psls_control
control parameters for PSLS
struct bsc_control_type bsc_control
control parameters for BSC
struct roots_control_type roots_control
control parameters for ROOTS
struct nls_subproblem_control_type subproblem_control
control parameters for the step-finding subproblem
nls_time_type structure#
struct nls_time_type{T} total::Float32 preprocess::Float32 analyse::Float32 factorize::Float32 solve::Float32 clock_total::T clock_preprocess::T clock_analyse::T clock_factorize::T clock_solve::T
detailed documentation#
time derived type as a Julia structure
components#
Float32 total
the total CPU time spent in the package
Float32 preprocess
the CPU time spent preprocessing the problem
Float32 analyse
the CPU time spent analysing the required matrices prior to factorization
Float32 factorize
the CPU time spent factorizing the required matrices
Float32 solve
the CPU time spent computing the search direction
T clock_total
the total clock time spent in the package
T clock_preprocess
the clock time spent preprocessing the problem
T clock_analyse
the clock time spent analysing the required matrices prior to factorization
T clock_factorize
the clock time spent factorizing the required matrices
T clock_solve
the clock time spent computing the search direction
nls_subproblem_inform_type structure#
struct nls_subproblem_inform_type{T} status::Int32 alloc_status::Int32 bad_alloc::NTuple{81,Cchar} bad_eval::NTuple{13,Cchar} iter::Int32 cg_iter::Int32 c_eval::Int32 j_eval::Int32 h_eval::Int32 factorization_max::Int32 factorization_status::Int32 max_entries_factors::Int64 factorization_integer::Int64 factorization_real::Int64 factorization_average::T obj::T norm_c::T norm_g::T weight::T time::nls_time_type{T} rqs_inform::rqs_inform_type{T} glrt_inform::glrt_inform_type{T} psls_inform::psls_inform_type{T} bsc_inform::bsc_inform_type{T} roots_inform::roots_inform_type
detailed documentation#
subproblem_inform derived type as a Julia structure
components#
Int32 status
return status. See NLS_solve for details
Int32 alloc_status
the status of the last attempted allocation/deallocation
NTuple{81,Cchar} bad_alloc
the name of the array for which an allocation/deallocation error occurred
char bad_eval[13]
the name of the user-supplied evaluation routine for which an error occurred
Int32 iter
the total number of iterations performed
Int32 cg_iter
the total number of CG iterations performed
Int32 c_eval
the total number of evaluations of the residual function c(x)
Int32 j_eval
the total number of evaluations of the Jacobian J(x) of c(x)
Int32 h_eval
the total number of evaluations of the scaled Hessian H(x,y) of c(x)
Int32 factorization_max
the maximum number of factorizations in a sub-problem solve
Int32 factorization_status
the return status from the factorization
Int64 max_entries_factors
the maximum number of entries in the factors
Int64 factorization_integer
the total integer workspace required for the factorization
Int64 factorization_real
the total real workspace required for the factorization
T factorization_average
the average number of factorizations per sub-problem solve
T obj
the value of the objective function \(\frac{1}{2}\|c(x)\|^2_W\) at the best estimate the solution, x, determined by NLS_solve
T norm_c
the norm of the residual \(\|c(x)\|_W\) at the best estimate of the solution x, determined by NLS_solve
T norm_g
the norm of the gradient of \(\|c(x)\|_W\) of the objective function at the best estimate, x, of the solution determined by NLS_solve
T weight
the final regularization weight used
struct nls_time_type time
timings (see above)
struct rqs_inform_type rqs_inform
inform parameters for RQS
struct glrt_inform_type glrt_inform
inform parameters for GLRT
struct psls_inform_type psls_inform
inform parameters for PSLS
struct bsc_inform_type bsc_inform
inform parameters for BSC
struct roots_inform_type roots_inform
inform parameters for ROOTS
nls_inform_type structure#
struct nls_inform_type{T} status::Int32 alloc_status::Int32 bad_alloc::NTuple{81,Cchar} bad_eval::NTuple{13,Cchar} iter::Int32 cg_iter::Int32 c_eval::Int32 j_eval::Int32 h_eval::Int32 factorization_max::Int32 factorization_status::Int32 max_entries_factors::Int64 factorization_integer::Int64 factorization_real::Int64 factorization_average::T obj::T norm_c::T norm_g::T weight::T time::nls_time_type{T} rqs_inform::rqs_inform_type{T} glrt_inform::glrt_inform_type{T} psls_inform::psls_inform_type{T} bsc_inform::bsc_inform_type{T} roots_inform::roots_inform_type subproblem_inform::nls_subproblem_inform_type{T}
detailed documentation#
inform derived type as a Julia structure
components#
Int32 status
return status. See NLS_solve for details
Int32 alloc_status
the status of the last attempted allocation/deallocation
NTuple{81,Cchar} bad_alloc
the name of the array for which an allocation/deallocation error occurred
char bad_eval[13]
the name of the user-supplied evaluation routine for which an error occurred
Int32 iter
the total number of iterations performed
Int32 cg_iter
the total number of CG iterations performed
Int32 c_eval
the total number of evaluations of the residual function c(x)
Int32 j_eval
the total number of evaluations of the Jacobian J(x) of c(x)
Int32 h_eval
the total number of evaluations of the scaled Hessian H(x,y) of c(x)
Int32 factorization_max
the maximum number of factorizations in a sub-problem solve
Int32 factorization_status
the return status from the factorization
Int64 max_entries_factors
the maximum number of entries in the factors
Int64 factorization_integer
the total integer workspace required for the factorization
Int64 factorization_real
the total real workspace required for the factorization
T factorization_average
the average number of factorizations per sub-problem solve
T obj
the value of the objective function \(\frac{1}{2}\|c(x)\|^2_W\) at the best estimate the solution, x, determined by NLS_solve
T norm_c
the norm of the residual \(\|c(x)\|_W\) at the best estimate of the solution x, determined by NLS_solve
T norm_g
the norm of the gradient of \(\|c(x)\|_W\) of the objective function at the best estimate, x, of the solution determined by NLS_solve
T weight
the final regularization weight used
struct nls_time_type time
timings (see above)
struct rqs_inform_type rqs_inform
inform parameters for RQS
struct glrt_inform_type glrt_inform
inform parameters for GLRT
struct psls_inform_type psls_inform
inform parameters for PSLS
struct bsc_inform_type bsc_inform
inform parameters for BSC
struct roots_inform_type roots_inform
inform parameters for ROOTS
struct nls_subproblem_inform_type subproblem_inform
inform parameters for subproblem
example calls#
This is an example of how to use the package to solve a nonlinear least-squares problem; the code is available in $GALAHAD/src/nls/Julia/test_nls.jl . A variety of supported Hessian and constraint matrix storage formats are shown.
# test_nls.jl
# Simple code to test the Julia interface to NLS
using GALAHAD
using Test
using Printf
using Accessors
# Custom userdata struct
struct userdata_nls{T}
p::T
end
function test_nls(::Type{T}) where T
# compute the residuals
function res(n::Int, m::Int, x::Vector{T}, c::Vector{T},
userdata::userdata_nls)
c[1] = x[1]^2 + userdata.p
c[2] = x[1] + x[2]^2
c[3] = x[1] - x[2]
return 0
end
# compute the Jacobian
function jac(n::Int, m::Int, jne::Int, x::Vector{T}, jval::Vector{T},
userdata::userdata_nls)
jval[1] = 2.0 * x[1]
jval[2] = 1.0
jval[3] = 2.0 * x[2]
jval[4] = 1.0
jval[5] = -1.0
return 0
end
# compute the Hessian
function hess(n::Int, m::Int, hne::Int, x::Vector{T}, y::Vector{T},
hval::Vector{T}, userdata::userdata_nls)
hval[1] = 2.0 * y[1]
hval[2] = 2.0 * y[2]
return 0
end
# compute Jacobian-vector products
function jacprod(n::Int, m::Int, x::Vector{T}, trans::Bool, u::Vector{T},
v::Vector{T}, got_j::Bool, userdata::userdata_nls)
if trans
u[1] = u[1] + 2.0 * x[1] * v[1] + v[2] + v[3]
u[2] = u[2] + 2.0 * x[2] * v[2] - v[3]
else
u[1] = u[1] + 2.0 * x[1] * v[1]
u[2] = u[2] + v[1] + 2.0 * x[2] * v[2]
u[3] = u[3] + v[1] - v[2]
end
return 0
end
# compute Hessian-vector products
function hessprod(n::Int, m::Int, x::Vector{T}, y::Vector{T},
u::Vector{T}, v::Vector{T}, got_h::Bool,
userdata::userdata_nls)
u[1] = u[1] + 2.0 * y[1] * v[1]
u[2] = u[2] + 2.0 * y[2] * v[2]
return 0
end
# compute residual-Hessians-vector products
function rhessprods(n::Int, m::Int, pne::Int, x::Vector{T}, v::Vector{T},
pval::Vector{T}, got_h::Bool, userdata::userdata_nls)
pval[1] = 2.0 * v[1]
pval[2] = 2.0 * v[2]
return 0
end
# # scale v
function scale(n::Int, m::Int, x::Vector{T}, u::Vector{T}, v::Vector{T},
userdata::userdata_nls)
u[1] = v[1]
u[2] = v[2]
return 0
end
# compute the dense Jacobian
function jac_dense(n::Int, m::Int, jne::Int, x::Vector{T}, jval::Vector{T},
userdata::userdata_nls)
jval[1] = 2.0 * x[1]
jval[2] = 0.0
jval[3] = 1.0
jval[4] = 2.0 * x[2]
jval[5] = 1.0
jval[6] = -1.0
return 0
end
# compute the dense Hessian
function hess_dense(n::Int, m::Int, hne::Int, x::Vector{T}, y::Vector{T},
hval::Vector{T}, userdata::userdata_nls)
hval[1] = 2.0 * y[1]
hval[2] = 0.0
hval[3] = 2.0 * y[2]
return 0
end
# compute dense residual-Hessians-vector products
function rhessprods_dense(n::Int, m::Int, pne::Int, x::Vector{T},
v::Vector{T}, pval::Vector{T}, got_h::Bool,
userdata::userdata_nls)
pval[1] = 2.0 * v[1]
pval[2] = 0.0
pval[3] = 0.0
pval[4] = 2.0 * v[2]
pval[5] = 0.0
pval[6] = 0.0
return 0
end
# Derived types
data = Ref{Ptr{Cvoid}}()
control = Ref{nls_control_type{T}}()
inform = Ref{nls_inform_type{T}}()
# Set user data
userdata = userdata_nls(1.0)
# Set problem data
n = 2 # # variables
m = 3 # # residuals
j_ne = 5 # Jacobian elements
h_ne = 2 # Hesssian elements
p_ne = 2 # residual-Hessians-vector products elements
J_row = Cint[1, 2, 2, 3, 3] # Jacobian J
J_col = Cint[1, 1, 2, 1, 2] #
J_ptr = Cint[1, 2, 4, 6] # row pointers
H_row = Cint[1, 2] # Hessian H
H_col = Cint[1, 2] # NB lower triangle
H_ptr = Cint[1, 2, 3] # row pointers
P_row = Cint[1, 2] # residual-Hessians-vector product matrix
P_ptr = Cint[1, 2, 3, 3] # column pointers
# Set storage
g = zeros(T, n) # gradient
c = zeros(T, m) # residual
y = zeros(T, m) # multipliers
W = T[1.0, 1.0, 1.0] # weights
st = ' '
status = Ref{Cint}()
@printf(" Fortran sparse matrix indexing\n\n")
@printf(" tests reverse-communication options\n\n")
# reverse-communication input/output
eval_status = Ref{Cint}()
u = zeros(T, max(m, n))
v = zeros(T, max(m, n))
J_val = zeros(T, j_ne)
J_dense = zeros(T, m * n)
H_val = zeros(T, h_ne)
H_dense = zeros(T, div(n * (n + 1), 2))
H_diag = zeros(T, n)
P_val = zeros(T, p_ne)
P_dense = zeros(T, m * n)
trans = Ref{Bool}()
got_j = false
got_h = false
for d in 1:5
# Initialize NLS
nls_initialize(T, data, control, inform)
# Set user-defined control options
@reset control[].f_indexing = true # Fortran sparse matrix indexing
# @reset control[].print_level = Cint(1)
@reset control[].jacobian_available = Cint(2)
@reset control[].hessian_available = Cint(2)
@reset control[].model = Cint(6)
x = T[1.5, 1.5] # starting point
W = T[1.0, 1.0, 1.0] # weights
# sparse co-ordinate storage
if d == 1
st = 'C'
nls_import(T, control, data, status, n, m,
"coordinate", j_ne, J_row, J_col, C_NULL,
"coordinate", h_ne, H_row, H_col, C_NULL,
"sparse_by_columns", p_ne, P_row, C_NULL, P_ptr, W)
terminated = false
while !terminated # reverse-communication loop
nls_solve_reverse_with_mat(T, data, status, eval_status,
n, m, x, c, g, j_ne, J_val, y,
h_ne, H_val, v, p_ne, P_val)
if status[] == 0 # successful termination
terminated = true
elseif status[] < 0 # error exit
terminated = true
elseif status[] == 2 # evaluate c
eval_status[] = res(n, m, x, c, userdata)
elseif status[] == 3 # evaluate J
eval_status[] = jac(n, m, j_ne, x, J_val, userdata)
elseif status[] == 4 # evaluate H
eval_status[] = hess(n, m, h_ne, x, y, H_val, userdata)
elseif status[] == 7 # evaluate P
eval_status[] = rhessprods(n, m, p_ne, x, v, P_val, got_h, userdata)
else
@printf(" the value %1i of status should not occur\n", status)
end
end
end
# sparse by rows
if d == 2
st = 'R'
nls_import(T, control, data, status, n, m,
"sparse_by_rows", j_ne, C_NULL, J_col, J_ptr,
"sparse_by_rows", h_ne, C_NULL, H_col, H_ptr,
"sparse_by_columns", p_ne, P_row, C_NULL, P_ptr, W)
terminated = false
while !terminated # reverse-communication loop
nls_solve_reverse_with_mat(T, data, status, eval_status,
n, m, x, c, g, j_ne, J_val, y,
h_ne, H_val, v, p_ne, P_val)
if status[] == 0 # successful termination
terminated = true
elseif status[] < 0 # error exit
terminated = true
elseif status[] == 2 # evaluate c
eval_status[] = res(n, m, x, c, userdata)
elseif status[] == 3 # evaluate J
eval_status[] = jac(n, m, j_ne, x, J_val, userdata)
elseif status[] == 4 # evaluate H
eval_status[] = hess(n, m, h_ne, x, y, H_val, userdata)
elseif status[] == 7 # evaluate P
eval_status[] = rhessprods(n, m, p_ne, x, v, P_val, got_h, userdata)
else
@printf(" the value %1i of status should not occur\n", status)
end
end
end
# dense
if d == 3
st = 'D'
nls_import(T, control, data, status, n, m,
"dense", j_ne, C_NULL, C_NULL, C_NULL,
"dense", h_ne, C_NULL, C_NULL, C_NULL,
"dense", p_ne, C_NULL, C_NULL, C_NULL, W)
terminated = false
while !terminated # reverse-communication loop
nls_solve_reverse_with_mat(T, data, status, eval_status,
n, m, x, c, g, m * n, J_dense, y,
n * (n + 1) / 2, H_dense, v, m * n,
P_dense)
if status[] == 0 # successful termination
terminated = true
elseif status[] < 0 # error exit
terminated = true
elseif status[] == 2 # evaluate c
eval_status[] = res(n, m, x, c, userdata)
elseif status[] == 3 # evaluate J
eval_status[] = jac_dense(n, m, j_ne, x, J_dense, userdata)
elseif status[] == 4 # evaluate H
eval_status[] = hess_dense(n, m, h_ne, x, y, H_dense, userdata)
elseif status[] == 7 # evaluate P
eval_status[] = rhessprods_dense(n, m, p_ne, x, v, P_dense, got_h, userdata)
else
@printf(" the value %1i of status should not occur\n", status)
end
end
end
# diagonal
if d == 4
st = 'I'
nls_import(T, control, data, status, n, m,
"sparse_by_rows", j_ne, C_NULL, J_col, J_ptr,
"diagonal", h_ne, C_NULL, C_NULL, C_NULL,
"sparse_by_columns", p_ne, P_row, C_NULL, P_ptr, W)
terminated = false
while !terminated # reverse-communication loop
nls_solve_reverse_with_mat(T, data, status, eval_status,
n, m, x, c, g, j_ne, J_val, y,
n, H_diag, v, p_ne, P_val)
if status[] == 0 # successful termination
terminated = true
elseif status[] < 0 # error exit
terminated = true
elseif status[] == 2 # evaluate c
eval_status[] = res(n, m, x, c, userdata)
elseif status[] == 3 # evaluate J
eval_status[] = jac(n, m, j_ne, x, J_val, userdata)
elseif status[] == 4 # evaluate H
eval_status[] = hess(n, m, h_ne, x, y, H_diag, userdata)
elseif status[] == 7 # evaluate P
eval_status[] = rhessprods(n, m, p_ne, x, v, P_val, got_h, userdata)
else
@printf(" the value %1i of status should not occur\n", status)
end
end
end
# access by products
if d == 5
st = 'P'
# @reset control[].print_level = Cint(1)
nls_import(T, control, data, status, n, m,
"absent", j_ne, C_NULL, C_NULL, C_NULL,
"absent", h_ne, C_NULL, C_NULL, C_NULL,
"sparse_by_columns", p_ne, P_row, C_NULL, P_ptr, W)
terminated = false
while !terminated # reverse-communication loop
nls_solve_reverse_without_mat(T, data, status, eval_status,
n, m, x, c, g, trans,
u, v, y, p_ne, P_val)
if status[] == 0 # successful termination
terminated = true
elseif status[] < 0 # error exit
terminated = true
elseif status[] == 2 # evaluate c
eval_status[] = res(n, m, x, c, userdata)
elseif status[] == 5 # evaluate u + J v or u + J'v
eval_status[] = jacprod(n, m, x, trans[], u, v, got_j, userdata)
elseif status[] == 6 # evaluate u + H v
eval_status[] = hessprod(n, m, x, y, u, v, got_h, userdata)
elseif status[] == 7 # evaluate P
eval_status[] = rhessprods(n, m, p_ne, x, v, P_val, got_h, userdata)
else
@printf(" the value %1i of status should not occur\n", status)
end
end
end
nls_information(T, data, inform, status)
if inform[].status == 0
@printf("%c:%6i iterations. Optimal objective value = %5.2f status = %1i\n",
st, inform[].iter, inform[].obj, inform[].status)
else
@printf("%c: NLS_solve exit status = %1i\n", st, inform[].status)
end
# Delete internal workspace
nls_terminate(T, data, control, inform)
end
@printf("\n basic tests of models used, reverse access\n\n")
for model in 3:8
# Initialize NLS
nls_initialize(T, data, control, inform)
# Set user-defined control options
@reset control[].f_indexing = true # Fortran sparse matrix indexing
# @reset control[].print_level = Cint(1)
@reset control[].jacobian_available = Cint(2)
@reset control[].hessian_available = Cint(2)
@reset control[].model = Cint(model)
x = T[1.5, 1.5] # starting point
W = T[1.0, 1.0, 1.0] # weights
nls_import(T, control, data, status, n, m,
"sparse_by_rows", j_ne, C_NULL, J_col, J_ptr,
"sparse_by_rows", h_ne, C_NULL, H_col, H_ptr,
"sparse_by_columns", p_ne, P_row, C_NULL, P_ptr, W)
terminated = false
while !terminated # reverse-communication loop
nls_solve_reverse_with_mat(T, data, status, eval_status,
n, m, x, c, g, j_ne, J_val, y,
h_ne, H_val, v, p_ne, P_val)
if status[] == 0 # successful termination
terminated = true
elseif status[] < 0 # error exit
terminated = true
elseif status[] == 2 # evaluate c
eval_status[] = res(n, m, x, c, userdata)
elseif status[] == 3 # evaluate J
eval_status[] = jac(n, m, j_ne, x, J_val, userdata)
elseif status[] == 4 # evaluate H
eval_status[] = hess(n, m, h_ne, x, y, H_val, userdata)
elseif status[] == 7 # evaluate P
eval_status[] = rhessprods(n, m, p_ne, x, v, P_val, got_h, userdata)
else
@printf(" the value %1i of status should not occur\n", status)
end
end
nls_information(T, data, inform, status)
if inform[].status == 0
@printf("P%1i:%6i iterations. Optimal objective value = %5.2f status = %1i\n",
model, inform[].iter, inform[].obj, inform[].status)
else
@printf(" %i: NLS_solve exit status = %1i\n", model, inform[].status)
end
# Delete internal workspace
nls_terminate(T, data, control, inform)
end
@printf("\n basic tests of models used, reverse access by products\n\n")
for model in 3:8
# Initialize NLS
nls_initialize(T, data, control, inform)
# Set user-defined control options
@reset control[].f_indexing = true # Fortran sparse matrix indexing
# @reset control[].print_level = 1
@reset control[].jacobian_available = Cint(2)
@reset control[].hessian_available = Cint(2)
@reset control[].model = Cint(model)
x = T[1.5, 1.5] # starting point
W = T[1.0, 1.0, 1.0] # weights
nls_import(T, control, data, status, n, m,
"absent", j_ne, C_NULL, C_NULL, C_NULL,
"absent", h_ne, C_NULL, C_NULL, C_NULL,
"sparse_by_columns", p_ne, P_row, C_NULL, P_ptr, W)
terminated = false
while !terminated # reverse-communication loop
nls_solve_reverse_without_mat(T, data, status, eval_status,
n, m, x, c, g, trans,
u, v, y, p_ne, P_val)
if status[] == 0 # successful termination
terminated = true
elseif status[] < 0 # error exit
terminated = true
elseif status[] == 2 # evaluate c
eval_status[] = res(n, m, x, c, userdata)
elseif status[] == 5 # evaluate u + J v or u + J'v
eval_status[] = jacprod(n, m, x, trans[], u, v, got_j, userdata)
elseif status[] == 6 # evaluate u + H v
eval_status[] = hessprod(n, m, x, y, u, v, got_h, userdata)
elseif status[] == 7 # evaluate P
eval_status[] = rhessprods(n, m, p_ne, x, v, P_val, got_h, userdata)
else
@printf(" the value %1i of status should not occur\n", status)
end
end
nls_information(T, data, inform, status)
if inform[].status == 0
@printf("P%1i:%6i iterations. Optimal objective value = %5.2f status = %1i\n",
model, inform[].iter, inform[].obj, inform[].status)
else
@printf("P%i: NLS_solve exit status = %1i\n", model, inform[].status)
end
# Delete internal workspace
nls_terminate(T, data, control, inform)
end
return 0
end
@testset "NLS" begin
@test test_nls(Float32) == 0
@test test_nls(Float64) == 0
end