GALAHAD BNLS package#

purpose#

The bnls package uses a regularization method to solve a given bound-constrained nonlinear least-squares problem. The aim is to minimize the least-squares objective function

\[f(x) := \frac{1}{2} \sum_{i=0}^{m_r-1} w_i r_i^2(x) \equiv \frac{1}{2} \|r(x)\|^2_W\]

where the variables $x$ are required to lie within the simple bounds

\[x^l \leq x \leq x^u, \hspace{10mm} \mbox{(1)}\]

where the non-negative weights $w$ are given, and where the weighted-Euclidean norms is given by $\|v\|_W^2 = v^T W v$ with $W = \mbox{diag}(w)$. The method offers the choice of projected-gradient and interior-point solution of the key regularization subproblems, and is most suitable for problems involving a large number of unknowns $x$. First derivatives of the residual function $r(x)$ are required, and if second derivatives of the $r_i(x)$ can be calculated, they may be exploited.

terminology#

The algorithm used by the package is iterative. From the current best estimate of the minimizer $x_k$, a trial improved point $x_k + s_k$ is sought. The correction $s_k$ is chosen to improve a model $m_k(s)$ of the objective function $f(x_k+s)$ built around $x_k$. The model is the sum of two basic components, a suitable approximation $t_k(s)$ of $f(x_k+s)$, and a regularization term $\frac{1}{2} \sigma_k \|s\|_2^2$ involving a weight $\sigma_k$. The weight is adjusted as the algorithm progresses to ensure convergence.

The model $t_k(s)$ is a truncated Taylor-series approximation, and this relies on being able to compute or estimate derivatives of $c(x)$. Various models are provided, and each has different derivative requirements. We denote the $m$ by $n$ residual Jacobian $J_r(x) \equiv \nabla_x c(x)$ as the matrix whose $i,j$-th component

\[J_r(x)_{i,j} := \partial r_i(x) / \partial x_j \;\; \mbox{for $i=1,\ldots,m_r$ and $j=1,\ldots,n$.}\]

For a given $m_r$-vector $y$, the weighted-residual Hessian is the sum

\[H(x,y) := \sum_{\ell=1}^{m_r} y_{\ell} H_{\ell}(x), \;\; \mbox{where}\;\; H_{\ell}(x)_{i,j} := \partial^2 r_{\ell}(x) / \partial x_i \partial x_j \;\; \mbox{for $i,j=1,\ldots,n$}\]

is the Hessian of $r_\ell(x)$. The models $t_k(s)$ provided are,

the Gauss-Newton approximation $\frac{1}{2} \| r(x_k) + J_r(x_k) s\|^2_W$,
the Newton (second-order Taylor) approximation

$f(x_k) + g(x_k)^T s + \frac{1}{2} s^T [ J_r^T(x_k) W J_r(x_k) + H(x_k,W r(x_k))] s$

(although the latter has yet to be implemented).

The primal optimality conditions (1) and dual optimality conditions

\[J_r^T(x) W r(x) = z\]

necessarily hold at an optimal point $x$ for some dual variables $z$, where $x$ and $z$ satisfy appropriate complementarity conditions, namely that the $i$-th dual variable $z_i$ is non-negative if $x_i$ is on its lower bound $x^l_i$, non-positive if $x_i$ is on its upper bound $x^u_i$, and zero if $x_i$ lies strictly between its bounds.

method#

An adaptive regularization method is used. In this, an improvement to a current estimate of the required minimizer, $x_k$ is sought by computing a step $s_k$. The step is chosen to approximately minimize a model $t_k(s)$ of $f_{\rho,r}(x_k+s)$ that includes a weighted regularization term $\frac{\sigma_k}{p} \|s\|_{S_k}^p$ for some specified positive weight $\sigma_k$. The quality of the resulting step $s_k$ is assessed by computing the “ratio” $(f(x_k) - f(x_k + s_k))/(t_k(0) - t_k(s_k))$. The step is deemed to have succeeded if the ratio exceeds a given $\eta_s > 0$, and in this case $x_{k+1} = x_k + s_k$. Otherwise $x_{k+1} = x_k$, and the weight is increased by powers of a given increase factor up to a given limit. If the ratio is larger than $\eta_v \geq \eta_d$, the weight will be decreased by powers of a given decrease factor again up to a given limit. The method will terminate as soon as $f(x_k)$ or $\|\nabla_x f(x_k)\|$ is smaller than a specified value.

The iteration is terminated as soon as either the $W$-norm of the residual $r(x_k)$ or the the Euclidean norm of the projected gradient $P[x_k-\nabla f(x_k)]$, where the projection operator

\[P[v] = \min(\max( v, x^l), x^u) -x,\]

is sufficiently small.

references#

The generic adaptive cubic regularization method is described in detail in

C. Cartis, N. I. M. Gould and Ph. L. Toint, ‘’Evaluation complexity of algorithms for nonconvex optimization’’ SIAM-MOS Series on Optimization (2022),

and uses ‘’tricks’’ as suggested in

N. I. M. Gould, M. Porcelli and Ph. L. Toint, ‘’Updating the regularization parameter in the adaptive cubic regularization algorithm’’. Computational Optimization and Applications 53(1) (2012) 1–22.

The specific methods employed here are discussed in

N. I. M. Gould, ‘’A projection method for bound-constrained linear least-squares’’. STFC-Rutherford Appleton Laboratory Computational Mathematics Group Internal Report 2023-1 (2023).

matrix storage#

unsymmetric storage#

The unsymmetric $m_r$ by $n$ Jacobian matrix $J_r = J_r(x)$ may be presented and stored in a variety of convenient input formats. Let

Dense storage format: The matrix $J_r$ is stored as a compact dense matrix by rows, that is, the values of the entries of each row in turn are stored in order within an appropriate real one-dimensional array. In this case, component $n \ast i + j$ of the storage array Jr_val will hold the value $A_{ij}$ for $1 \leq i \leq m_r$, $1 \leq j \leq n$. The string Jr_type = ‘dense’ or ‘dense_by_rows’ should be specified.

Dense by columns storage format: The matrix $J_r$ is stored as a compact dense matrix by columns, that is, the values of the entries of each column in turn are stored in order within an appropriate real one-dimensional array. In this case, component $m \ast j + i$ of the storage array Jr_val will hold the value $A_{ij}$ for $1 \leq i \leq m_r$, $1 \leq j \leq n$. The string Jr_type = ‘dense_by_columns’ should be specified.

Sparse co-ordinate storage format: Only the nonzero entries of the matrices are stored. For the $l$-th entry, $1 \leq l \leq ne$, of $J_r$, its row index i, column index j and value $A_{ij}$, $1 \leq i \leq m_r$, $1 \leq j \leq n$, are stored as the $l$-th components of the integer arrays Jr_row and Jr_col and real array Jr_val, respectively, while the number of nonzeros is recorded as Jr_ne = $ne$. The string Jr_type = ‘coordinate’ should be specified.

Sparse row-wise storage format: Again only the nonzero entries are stored, but this time they are ordered so that those in row i appear directly before those in row i+1. For the i-th row of $J_r$ the i-th component of the integer array Jr_ptr holds the position of the first entry in this row, while Jr_ptr(m+1) holds the total number of entries plus one. The column indices j, $1 \leq j \leq n$, and values $A_{ij}$ of the nonzero entries in the i-th row are stored in components l = Jr_ptr(i), $\ldots$, Jr_ptr(i+1)-1, $1 \leq i \leq m_r$, of the integer array Jr_col, and real array Jr_val, respectively. For sparse matrices, this scheme almost always requires less storage than its predecessor. The string Jr_type = ‘sparse_by_rows’ should be specified.

Sparse column-wise storage format: Once again only the nonzero entries are stored, but this time they are ordered so that those in column j appear directly before those in column j+1. For the j-th column of $J_r$ the j-th component of the integer array Jr_ptr holds the position of the first entry in this column, while Jr_ptr(n+1) holds the total number of entries plus one. The row indices i, $1 \leq i \leq m_r$, and values $A_{ij}$ of the nonzero entries in the j-th columnsare stored in components l = Jr_ptr(j), $\ldots$, Jr_ptr(j+1)-1, $1 \leq j \leq n$, of the integer array Jr_row, and real array Jr_val, respectively. As before, for sparse matrices, this scheme almost always requires less storage than the co-ordinate format. The string Jr_type = ‘sparse_by_columns’ should be specified.

symmetric storage#

The symmetric $n$ by $n$ matrix $H = H(x,y)$ may be presented and stored in a variety of formats. But crucially symmetry is exploited by only storing values from the lower triangular part (i.e, those entries that lie on or below the leading diagonal).

Dense storage format: The matrix $H$ is stored as a compact dense matrix by rows, that is, the values of the entries of each row in turn are stored in order within an appropriate real one-dimensional array. Since $H$ is symmetric, only the lower triangular part (that is the part $H_{ij}$ for $1 \leq j \leq i \leq n$) need be held. In this case the lower triangle should be stored by rows, that is component $(i-1) * i / 2 + j$ of the storage array H_val will hold the value $H_{ij}$ (and, by symmetry, $H_{ji}$) for $1 \leq j \leq i \leq n$. The string H_type = ‘dense’ should be specified.

Sparse co-ordinate storage format: Only the nonzero entries of the matrices are stored. For the $l$-th entry, $1 \leq l \leq ne$, of $H$, its row index i, column index j and value $H_{ij}$, $1 \leq j \leq i \leq n$, are stored as the $l$-th components of the integer arrays H_row and H_col and real array H_val, respectively, while the number of nonzeros is recorded as H_ne = $ne$. Note that only the entries in the lower triangle should be stored. The string H_type = ‘coordinate’ should be specified.

Sparse row-wise storage format: Again only the nonzero entries are stored, but this time they are ordered so that those in row i appear directly before those in row i+1. For the i-th row of $H$ the i-th component of the integer array H_ptr holds the position of the first entry in this row, while H_ptr(n+1) holds the total number of entries plus one. The column indices j, $1 \leq j \leq i$, and values $H_{ij}$ of the entries in the i-th row are stored in components l = H_ptr(i), …, H_ptr(i+1)-1 of the integer array H_col, and real array H_val, respectively. Note that as before only the entries in the lower triangle should be stored. For sparse matrices, this scheme almost always requires less storage than its predecessor. The string H_type = ‘sparse_by_rows’ should be specified.

Diagonal storage format: If $H$ is diagonal (i.e., $H_{ij} = 0$ for all $1 \leq i \neq j \leq n$) only the diagonals entries $H_{ii}$, $1 \leq i \leq n$ need be stored, and the first n components of the array H_val may be used for the purpose. The string H_type = ‘diagonal’ should be specified.

Multiples of the identity storage format: If $H$ is a multiple of the identity matrix, (i.e., $H = \alpha I$ where $I$ is the n by n identity matrix and $\alpha$ is a scalar), it suffices to store $\alpha$ as the first component of H_val. The string H_type = ‘scaled_identity’ should be specified.

The identity matrix format: If $H$ is the identity matrix, no values need be stored. The string H_type = ‘identity’ should be specified.

The zero matrix format: The same is true if $H$ is the zero matrix, but now the string H_type = ‘zero’ or ‘none’ should be specified.

introduction to function calls#

To solve a given problem, functions from the bnls package must be called in the following order:

bnls_initialize - provide default control parameters and set up initial data structures
bnls_read_specfile (optional) - override control values by reading replacement values from a file
set up data structures by calling one of
- bnls_import - set up problem data structures and fixed values when $J_r(x)$ is available
- bnls_import_without_jac - set up problem data structures and fixed values when only products with $J_r(x)$ are available
bnls_reset_control (optional) - possibly change control parameters if a sequence of problems are being solved
solve the problem by calling one of
- bnls_solve_with_jac - solve using function calls to evaluate function, gradient and Hessian values
- bnls_solve_with_jacprod - solve using function calls to evaluate function and gradient values and Hessian-vector products
- bnls_solve_reverse_with_jac - solve returning to the calling program to obtain function, gradient and Hessian values, or
- bnls_solve_reverse_with_jacprod - solve returning to the calling prorgram to obtain function and gradient values and Hessian-vector products
bnls_information (optional) - recover information about the solution and solution process
bnls_terminate - deallocate data structures

See the examples section for illustrations of use.

parametric real type T and integer type INT#

Below, the symbol T refers to a parametric real type that may be Float32 (single precision), Float64 (double precision) or, if supported, Float128 (quadruple precision). The symbol INT refers to a parametric integer type that may be Int32 (32-bit integer) or Int64 (64-bit integer).

callable functions#

    function bnls_initialize(T, INT, data, control, inform)

Set default control values and initialize private data

Parameters:

data	holds private internal data
control	is a structure containing control information (see bnls_control_type)
inform	is a structure containing output information (see bnls_inform_type)

    function bnls_read_specfile(T, INT, control, specfile)

Read the content of a specification file, and assign values associated with given keywords to the corresponding control parameters. An in-depth discussion of specification files is available, and a detailed list of keywords with associated default values is provided in $GALAHAD/src/bnls/BNLS.template. See also Table 2.1 in the Fortran documentation provided in $GALAHAD/doc/bnls.pdf for a list of how these keywords relate to the components of the control structure.

Parameters:

control	is a structure containing control information (see bnls_control_type)
specfile	is a one-dimensional array of type Vararg{Cchar} that must give the name of the specification file

    function bnls_import(T, INT, control, data, status, n, m_r,
                         Jr_type, Jr_ne, Jr_row, Jr_col, Jr_ptr_ne, Jr_ptr)

Import problem data into internal storage prior to solution.

Parameters:

control	is a structure whose members provide control parameters for the remaining procedures (see bnls_control_type)
data	holds private internal data
status	is a scalar variable of type INT that gives the exit status from the package. Possible values are: 1 The import was successful, and the package is ready for the solve phase -1 An allocation error occurred. A message indicating the offending array is written on unit control.error, and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively. -2 A deallocation error occurred. A message indicating the offending array is written on unit control.error and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively. -3 The restrictions n > 0, m > 0 or requirement that Jr_type contains its relevant string ‘dense’, ‘dense_by_rows’, ‘dense_by_columns’, ‘coordinate’, ‘sparse_by_rows’ or ‘sparse_by_columns’ has been violated.
n	is a scalar variable of type INT that holds the number of variables.
m_r	is a scalar variable of type INT that holds the number of residuals.
Jr_type	is a one-dimensional array of type Vararg{Cchar} that specifies the unsymmetric storage scheme used for the Jacobian, $J_r$. It should be one of ‘coordinate’, ‘sparse_by_rows’, ‘dense’ or ‘absent’, the latter if access to the Jacobian is via matrix-vector products; lower or upper case variants are allowed.
Jr_ne	is a scalar variable of type INT that holds the number of entries in $J_r$ in the sparse co-ordinate storage scheme. It need not be set for any of the other schemes.
Jr_row	is a one-dimensional array of size Jr_ne and type INT that holds the row indices of $J_r$ in the sparse co-ordinate storage scheme. It need not be set for any of the other schemes, and in this case can be C_NULL.
Jr_col	is a one-dimensional array of size Jr_ne and type INT that holds the column indices of $J_r$ in either the sparse co-ordinate, or the sparse row-wise storage scheme. It need not be set when the dense or diagonal storage schemes are used, and in this case can be C_NULL.
J_ptr_ne	is a scalar variable of type INT, that holds the length of the pointer array if sparse row or column storage scheme is used for $J_r$. For the sparse row scheme, Jr_ptr_ne should be at least m_r+1, while for the sparse column scheme, it should be at least n+1, It should be set to 0 when the other schemes are used.
Jr_ptr	is a one-dimensional array of size m+1 and type INT that holds the starting position of each row of $J_r$, as well as the total number of entries, in the sparse row-wise storage scheme. It need not be set when the other schemes are used, and in this case can be C_NULL.

    function bnls_import_without_jac(T, INT, control, data, status, n, m_r)

Import problem data, excluding the structure of $J_r(x)$, into internal storage prior to solution.

Parameters:

control	is a structure whose members provide control parameters for the remaining procedures (see bnls_control_type)
data	holds private internal data
status	is a scalar variable of type INT that gives the exit status from the package. Possible values are: 1 The import was successful, and the package is ready for the solve phase -1 An allocation error occurred. A message indicating the offending array is written on unit control.error, and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively. -2 A deallocation error occurred. A message indicating the offending array is written on unit control.error and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively. -3 The restrictions n > 0, m > 0 has been violated.
n	is a scalar variable of type INT that holds the number of variables.
m_r	is a scalar variable of type INT that holds the number of residuals.

    function bnls_reset_control(T, INT, control, data, status)

Reset control parameters after import if required.

Parameters:

control

is a structure whose members provide control parameters for the remaining procedures (see bnls_control_type)

data

holds private internal data

status

is a scalar variable of type INT that gives the exit status from the package. Possible values are:

1
The import was successful, and the package is ready for the solve phase

    function bnls_solve_with_jac(T, INT, data, userdata, status,
                                 n, m_r, x_l, x_u, x, z, r, g, x_stat,
                                 eval_r, Jr_ne, eval_jr, w)

Solve the simplex-constrained nonlinear least-squares problem when the Jacobian $J_r(x)$ is available by function calls.

Parameters:

data	holds private internal data
userdata	is a structure that allows data to be passed into the function and derivative evaluation programs.
status	is a scalar variable of type INT that gives the entry and exit status from the package. On initial entry, status must be set to 1. Possible exit values are: 0 The run was successful -1 An allocation error occurred. A message indicating the offending array is written on unit control.error, and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively. -2 A deallocation error occurred. A message indicating the offending array is written on unit control.error and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively. -3 The restriction n > 0 or requirement that type contains its relevant string ‘dense’, ‘coordinate’, ‘sparse_by_rows’, ‘diagonal’ or ‘absent’ has been violated. -9 The analysis phase of the factorization failed; the return status from the factorization package is given in the component inform.factor_status -10 The factorization failed; the return status from the factorization package is given in the component inform.factor_status. -11 The solution of a set of linear equations using factors from the factorization package failed; the return status from the factorization package is given in the component inform.factor_status. -16 The problem is so ill-conditioned that further progress is impossible. -17 The step is too small to make further impact. -18 Too many iterations have been performed. This may happen if control.maxit is too small, but may also be symptomatic of a badly scaled problem. -19 The CPU time limit has been reached. This may happen if control.cpu_time_limit is too small, but may also be symptomatic of a badly scaled problem. -82 The user has forced termination of solver by removing the file named control.alive_file from unit unit control.alive_unit.
n	is a scalar variable of type INT that holds the number of variables.
m_r	is a scalar variable of type INT that holds the number of residuals.
x_l	is a one-dimensional array of size n and type T that holds the values $x^l$ of the lower bounds on the optimization variables $x$. The j-th component of `x_l`, $j = 1, \ldots, n$, contains $x^l_j$.
x_u	is a one-dimensional array of size n and type T that holds the values $x^u$ of the upper bounds on the optimization variables $x$. The j-th component of `x_u`, $j = 1, \ldots, n$, contains $x^u_j$.
x	is a one-dimensional array of size n and type T that holds the values $x$ of the optimization variables. The j-th component of `x`, j = 1, … , n, contains $x_j$.
z	is a one-dimensional array of size n and type T that holds the values $z$ of the dual variables. The j-th component of `z`, j = 1, … , n, contains $z_j$.
r	is a one-dimensional array of size m and type T that holds the residual $r(x)$. The i-th component of `r`, i = 1, … , m_r, contains $r_i(x)$.
g	is a one-dimensional array of size n and type T that holds the gradient $g = \nabla_xf(x)$ of the objective function. The j-th component of `g`, j = 1, … , n, contains $g_j$.
x_stat	is a one-dimensional array of size n and type INT that gives the optimal status of the problem variables. If x_stat[j] is negative, variable $x_j$ most likely lies at its zero, lower bound, while if it is zero, $x_j$ is free of its bound (or unconstrained).
eval_r	is a user-supplied function that must have the following signature: function eval_r(n, m_r, x, r, userdata) The componnts of the residual function $r(x)$ evaluated at x=$x$ must be assigned to r, and the function return value set to 0. If the evaluation is impossible at x, return should be set to a nonzero value. Data may be passed into `eval_r` via the structure `userdata`.
Jr_ne	is a scalar variable of type INT that holds the number of entries in the Jacobian matrix $J_r$.
eval_jr	is a user-supplied function that must have the following signature: function eval_jr(n, m_r, jr_ne, x, jr_val, userdata) The components of the Jacobian $J_r = \nabla_x r(x$) of the residuals must be assigned to jr_val in the same order as presented to bnls_import, and the function return value set to 0. If the evaluation is impossible at x, return should be set to a nonzero value. Data may be passed into `eval_jr` via the structure `userdata`.
w	is a one-dimensional array of size m_r and type T that holds the values $w$ of the weights on the residuals in the least-squares objective function. It need not be set if the weights are all ones, and in this case can be C_NULL.

    function bnls_solve_with_jacprod(T, INT, data, userdata, status,
                                     n, m_r, x_l, x_u, x, z, r, g, x_stat,
                                     eval_r, eval_jr_prod, eval_jr_prods,
                                     eval_jr_sprod, w)

Solve the simplex-constrained nonlinear least-squares problem when the products of the Jacobian $J_r(x)$ and its transpose are available by function calls.

Parameters:

data	holds private internal data
userdata	is a structure that allows data to be passed into the function and derivative evaluation programs.
status	is a scalar variable of type INT that gives the entry and exit status from the package. On initial entry, status must be set to 1. Possible exit values are: 0 The run was successful -1 An allocation error occurred. A message indicating the offending array is written on unit control.error, and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively. -2 A deallocation error occurred. A message indicating the offending array is written on unit control.error and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively. -3 The restriction n > 0 or requirement that type contains its relevant string ‘dense’, ‘coordinate’, ‘sparse_by_rows’, ‘diagonal’ or ‘absent’ has been violated. -9 The analysis phase of the factorization failed; the return status from the factorization package is given in the component inform.factor_status -10 The factorization failed; the return status from the factorization package is given in the component inform.factor_status. -11 The solution of a set of linear equations using factors from the factorization package failed; the return status from the factorization package is given in the component inform.factor_status. -16 The problem is so ill-conditioned that further progress is impossible. -17 The step is too small to make further impact. -18 Too many iterations have been performed. This may happen if control.maxit is too small, but may also be symptomatic of a badly scaled problem. -19 The CPU time limit has been reached. This may happen if control.cpu_time_limit is too small, but may also be symptomatic of a badly scaled problem. -82 The user has forced termination of solver by removing the file named control.alive_file from unit unit control.alive_unit.
n	is a scalar variable of type INT that holds the number of variables
m_r	is a scalar variable of type INT that holds the number of residuals.
x_l	is a one-dimensional array of size n and type T that holds the values $x^l$ of the lower bounds on the optimization variables $x$. The j-th component of `x_l`, $j = 1, \ldots, n$, contains $x^l_j$.
x_u	is a one-dimensional array of size n and type T that holds the values $x^u$ of the upper bounds on the optimization variables $x$. The j-th component of `x_u`, $j = 1, \ldots, n$, contains $x^u_j$.
x	is a one-dimensional array of size n and type T that holds the values $x$ of the optimization variables. The j-th component of `x`, j = 1, … , n, contains $x_j$.
z	is a one-dimensional array of size n and type T that holds the values $z$ of the dual variables. The j-th component of `z`, j = 1, … , n, contains $z_j$.
r	is a one-dimensional array of size m and type T that holds the residual $r(x)$. The i-th component of `r`, i = 1, … , m_r, contains $r_i(x)$.
g	is a one-dimensional array of size n and type T that holds the gradient $g = \nabla_xf(x)$ of the objective function. The j-th component of `g`, j = 1, … , n, contains $g_j$.
x_stat	is a one-dimensional array of size n and type INT that gives the optimal status of the problem variables. If x_stat[j] is negative, variable $x_j$ most likely lies at its zero, lower bound, while if it is zero, $x_j$ is free of its bound (or unconstrained).
eval_r	is a user-supplied function that must have the following signature: function eval_r(n, m_r, x, r, userdata) The componnts of the residual function $r(x)$ evaluated at x=$x$ must be assigned to r, and the function return value set to 0. If the evaluation is impossible at x, return should be set to a nonzero value. Data may be passed into `eval_r` via the structure `userdata`.
eval_jr_prod	is a user-supplied function that must have the following signature: function eval_jr_prod(n, m_r, x, transpose, v, p, got_jr, userdata) The product $p = J_r(x) v$ (if the Bool transpose is false) or $p = J_r^T(x) v$ (if tranpose is true) between the Jacobian $J_r(x) \nabla_{x}r_(x)$, evaluated at x=$x$, or its tranpose with the vector v=$v$ must be returned in p, and the function return value set to 0. If the evaluation is impossible at x, return should be set to a nonzero value. Data may be passed into `eval_jr_prod` via the structure `userdata`.
eval_jr_prods	is a user-supplied function that must have the following signature: function eval_jr_prods(n, m_r, x, v, p, iv, lvl, lvu, ip, lp, got_jr, userdata) The product $p = J_r(x) v$ bewteen the Jacobian $J_r(x) = \nabla_{x}r(x)$ evaluated at x$=x$ with the vector v=$v$ must be returned in p, and the function return value set to 0. Only the components iv[lvl:lvu] of $v$ will be nonzero. If ip or lp is C_NULL, the whole of p[1,m_r] should be filled. Otherwise, only the lp nonzero components p[ip[1:lp]] need be specified, and ip and lp returned accordingly. If the evaluation is impossible at x, return should be set to a nonzero value. Data may be passed into `eval_jr_prods` via the structure `userdata`.
eval_jr_sprod	is a user-supplied function that must have the following signature: function eval_jr_sprod(n, m_r, x, transpose, v, p, free, n_free, got_jr, userdata) The product $J_r(x) v$ (if tranpose is false) or $J_r^T(x) v$ (if tranpose is true) bewteen the Jacobian $J_r(x) = \nabla_{x}r(x)$, evaluated at x=$x$, or its tranpose with the vector v=$v$ must be returned in p, and the function return value set to 0. If transpose is false, only the components free[1:n_free] of $v$ will be nonzero, while if transpose is true, only the components free[1:n_free] of p should be set. If the evaluation is impossible at x, return should be set to a nonzero value. Data may be passed into `eval_jr_sprod` via the structure `userdata`
w	is a one-dimensional array of size m and type T that holds the values $w$ of the weights on the residuals in the least-squares objective function. It need not be set if the weights are all ones, and in this case can be C_NULL.

    function bnls_solve_reverse_with_jac(T, INT, data, status, eval_status,
                                         n, m_r, x_l, x_u, x, z, r, g,
                                         x_stat, jr_ne, Jr_val, w)

Solve the simplex-constrained nonlinear least-squares problem when the Jacobian $J_r(x)$ may be computed by the calling program.

Parameters:

data	holds private internal data
status	is a scalar variable of type INT that gives the entry and exit status from the package. On initial entry, status must be set to 1. Possible exit values are: 0 The run was successful -1 An allocation error occurred. A message indicating the offending array is written on unit control.error, and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively. -2 A deallocation error occurred. A message indicating the offending array is written on unit control.error and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively. -3 The restriction n > 0 or requirement that type contains its relevant string ‘dense’, ‘coordinate’, ‘sparse_by_rows’, ‘diagonal’ or ‘absent’ has been violated. -9 The analysis phase of the factorization failed; the return status from the factorization package is given in the component inform.factor_status -10 The factorization failed; the return status from the factorization package is given in the component inform.factor_status. -11 The solution of a set of linear equations using factors from the factorization package failed; the return status from the factorization package is given in the component inform.factor_status. -16 The problem is so ill-conditioned that further progress is impossible. -17 The step is too small to make further impact. -18 Too many iterations have been performed. This may happen if control.maxit is too small, but may also be symptomatic of a badly scaled problem. -19 The CPU time limit has been reached. This may happen if control.cpu_time_limit is too small, but may also be symptomatic of a badly scaled problem. -82 The user has forced termination of solver by removing the file named control.alive_file from unit unit control.alive_unit. 2 The user should compute the vector of residuals $r(x)$ at the point $x$ indicated in x and then re-enter the function. The required value should be set in r, and eval_status should be set to 0. If the user is unable to evaluate $r(x)$ for instance, if the function is undefined at $x$ the user need not set r, but should then set eval_status to a non-zero value. 3 The user should compute the Jacobian of the vector of residual functions, $J_r(x) = \nabla_x c(x)$, at the point $x$ indicated in x and then re-enter the function. The l-th component of the Jacobian stored according to the scheme specified for the remainder of $J_r$ in the earlier call to bnls_import should be set in Jr_val[l], for l = 1, …, Jr_ne and eval_status should be set to 0. If the user is unable to evaluate a component of $J_r$ for instance, if a component of the matrix is undefined at $x$ the user need not set Jr_val, but should then set eval_status to a non-zero value.
eval_status	is a scalar variable of type INT that is used to indicate if objective function/gradient/Hessian values can be provided (see above)
n	is a scalar variable of type INT that holds the number of variables
m_r	is a scalar variable of type INT that holds the number of residuals.
x_l	is a one-dimensional array of size n and type T that holds the values $x^l$ of the lower bounds on the optimization variables $x$. The j-th component of `x_l`, $j = 1, \ldots, n$, contains $x^l_j$.
x_u	is a one-dimensional array of size n and type T that holds the values $x^u$ of the upper bounds on the optimization variables $x$. The j-th component of `x_u`, $j = 1, \ldots, n$, contains $x^u_j$.
x	is a one-dimensional array of size n and type T that holds the values $x$ of the optimization variables. The j-th component of `x`, j = 1, … , n, contains $x_j$.
z	is a one-dimensional array of size n and type T that holds the values $z$ of the dual variables. The j-th component of `z`, j = 1, … , n, contains $z_j$.
r	is a one-dimensional array of size m and type T that holds the residual $r(x)$. The i-th component of `r`, i = 1, … , m, contains $r_i(x)$. See status = 2, above, for more details.
g	is a one-dimensional array of size n and type T that holds the gradient $g = \nabla_xf(x)$ of the objective function. The j-th component of `g`, j = 1, … , n, contains $g_j$.
x_stat	is a one-dimensional array of size n and type INT that gives the optimal status of the problem variables. If x_stat[j] is negative, variable $x_j$ most likely lies at its zero, lower bound, while if it is zero, $x_j$ is free of its bound (or unconstrained).
Jr_ne	is a scalar variable of type INT that holds the number of entries in the Jacobian matrix $J_r$.
Jr_val	is a one-dimensional array of size Jr_ne and type T that holds the values of the entries of the Jacobian matrix $J_r$ in any of the available storage schemes. See status = 3, above, for more details.
w	is a one-dimensional array of size m and type T that holds the values $w$ of the weights on the residuals in the least-squares objective function. It need not be set if the weights are all ones, and in this case can be C_NULL.

    function bnls_solve_reverse_with_jacprod(T, INT, data, status, eval_status,
                                             n, m_r, x_l, x_u, x, z, r, g, x_stat,
                                             v, iv, lvl, lvu, p, ip, lp, w)

Solve the simplex-constrained nonlinear least-squares problem when the products of the Jacobian $J_r(x)$ and its transpose with specified vectors may be computed by the calling program.

Parameters:

data	holds private internal data
status	is a scalar variable of type INT that gives the entry and exit status from the package. On initial entry, status must be set to 1. Possible exit values are: 0 The run was successful -1 An allocation error occurred. A message indicating the offending array is written on unit control.error, and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively. -2 A deallocation error occurred. A message indicating the offending array is written on unit control.error and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively. -3 The restriction n > 0 or requirement that type contains its relevant string ‘dense’, ‘coordinate’, ‘sparse_by_rows’, ‘diagonal’ or ‘absent’ has been violated. -9 The analysis phase of the factorization failed; the return status from the factorization package is given in the component inform.factor_status -10 The factorization failed; the return status from the factorization package is given in the component inform.factor_status. -11 The solution of a set of linear equations using factors from the factorization package failed; the return status from the factorization package is given in the component inform.factor_status. -16 The problem is so ill-conditioned that further progress is impossible. -17 The step is too small to make further impact. -18 Too many iterations have been performed. This may happen if control.maxit is too small, but may also be symptomatic of a badly scaled problem. -19 The CPU time limit has been reached. This may happen if control.cpu_time_limit is too small, but may also be symptomatic of a badly scaled problem. -82 The user has forced termination of solver by removing the file named control.alive_file from unit unit control.alive_unit. 2 The user should compute the vector of residuals $r(x)$ at the point $x$ indicated in x and then re-enter the function. The required value should be set in r, and eval_status should be set to 0. If the user is unable to evaluate $r(x)$ for instance, if the function is undefined at $x$ the user need not set r, but should then set eval_status to a non-zero value. 4 The user should compute the product $p = J_r(x) v$, at the point $x$ indicated in x, between the product of the Jacobian $J_r(x) = \nabla_{x}c_(x)$ with the vector v= $v$, and then re-enter the function. The result should be set in p, and eval_status should be set to 0. If the user is unable to evaluate the product, for instance, if the Jacobian is undefined at $x$ the user need not set p, but should then set eval_status to a non-zero value. 5 The user should compute the product $p = J_r^T(x) v$, at the point $x$ indicated in x, between the product of the transpose of the Jacobian $J_r(x) = \nabla_{x}c_(x)$ with the vector v= $v$, and then re-enter the function. The result should be set in p, and eval_status should be set to 0. If the user is unable to evaluate the product, for instance, if the Jacobian is undefined at $x$ the user need not set p, but should then set eval_status to a non-zero value. 6 The user should compute the product $p = J_r(x) v$ involving the residual Jacobian $J_r(x)$ at the point $x$, given in x, and a given sparse vector $v$, whose nonzeros are in positions iv[lvl:lvu] of v. The resulting $p$ should be placed in p and eval_status should be set to 0. If the user is unable to evaluate the product, for instance, if the Jacobian is undefined at $x$ the user need not set p, but should then set eval_status to a non-zero value. 7 The user should compute the nonzeros of the product $p = J_r(x) v$ involving the residual Jacobian $J_r(x)$ at the point $x$, given in x, and a given sparse vector $v$, whose nonzeros are in positions iv[lvl:lvu] of v. The nonzeros of the resulting $p$ should be placed in p, the indices of these nonzeros recorded in ip[1:lp] with lp set accordingly, and eval_status should be set to 0. If the user is unable to evaluate the product, for instance, if the Jacobian is undefined at $x$ the user need not set p, ip and lp, but should then set eval_status to a non-zero value. 8 The user should compute selected components of the product $p = J_r^T(x) v$ involving the transpose of the residual Jacobian $J_r(x)$ at the point $x$, given in x, and a given vector $v$. Only components iv[lvl:lvu] of $p$ should be computed, and recorded in p[iv[lvl:lvu]], and eval_status should be set to 0. If the user is unable to evaluate the product, for instance, if the Jacobian is undefined at $x$ the user need not set p, but should then set eval_status to a non-zero value. 9 The user has the opportunity to replace the estimate $x$ in x by an improved value $x^+$ for which $f(x^+) \leq f(x)$; in that case r must also be reset to hold $r(x^+)$.
eval_status	is a scalar variable of type INT that is used to indicate if objective function/gradient/Hessian values can be provided (see above)
n	is a scalar variable of type INT that holds the number of variables
m_r	is a scalar variable of type INT that holds the number of residuals.
x_l	is a one-dimensional array of size n and type T that holds the values $x^l$ of the lower bounds on the optimization variables $x$. The j-th component of `x_l`, $j = 1, \ldots, n$, contains $x^l_j$.
x_u	is a one-dimensional array of size n and type T that holds the values $x^u$ of the upper bounds on the optimization variables $x$. The j-th component of `x_u`, $j = 1, \ldots, n$, contains $x^u_j$.
x	is a one-dimensional array of size n and type T that holds the values $x$ of the optimization variables. The j-th component of `x`, j = 1, … , n, contains $x_j$.
z	is a one-dimensional array of size n and type T that holds the values $z$ of the dual variables. The j-th component of `z`, j = 1, … , n, contains $z_j$.
r	is a one-dimensional array of size m and type T that holds the residual $r(x)$. The i-th component of `r`, i = 1, … , m_r, contains $r_i(x)$. See status = 2, above, for more details.
g	is a one-dimensional array of size n and type T that holds the gradient $g = \nabla_xf(x)$ of the objective function. The j-th component of `g`, j = 1, … , n, contains $g_j$.
x_stat	is a one-dimensional array of size n and type INT that gives the optimal status of the problem variables. If x_stat[j] is negative, variable $x_j$ most likely lies at its zero, lower bound, while if it is zero, $x_j$ is free of its bound (or unconstrained).
v	is a one-dimensional array of size max(n,m_r) and type T, that is used for reverse communication. See status = 4, 5, 7 and 8 above for more details.
iv	is a one-dimensional array of size max(n,m_r) and type INT, that is used for reverse communication. See status = 6, 7 and 8 above for more details.
lvl	is a scalar variable of type INT, that is used for reverse communication. See status = 6, 7 and 8 above for more details.
lvu	is a scalar variable of type INT, that is used for reverse communication. See status = 6, 7 and 8 above for more details.
p	is a one-dimensional array of size max(n,m_r) and type T, that is used for reverse communication. See status = 4 to 8 above for more details.
ip	is a one-dimensional array of size n and type INT, that is used for reverse communication. See status = 7 above for more details.
lp	is a scalar variable of type INT, that is used for reverse communication. See status = 7 above for more details.
w	is a one-dimensional array of size m and type T that holds the values $w$ of the weights on the residuals in the least-squares objective function. It need not be set if the weights are all ones, and in this case can be C_NULL.

    function bnls_information(T, INT, data, inform, status)

Provides output information

Parameters:

data

holds private internal data

inform

is a structure containing output information (see bnls_inform_type)

status

is a scalar variable of type INT that gives the exit status from the package. Possible values are (currently):

0
The values were recorded successfully

    function bnls_terminate(T, INT, data, control, inform)

Deallocate all internal private storage

Parameters:

data	holds private internal data
control	is a structure containing control information (see bnls_control_type)
inform	is a structure containing output information (see bnls_inform_type)

available structures#

bnls_control_type structure#

    struct bnls_control_type{T,INT}
      f_indexing::Bool
      error::INT
      out::INT
      print_level::INT
      start_print::INT
      stop_print::INT
      print_gap::INT
      maxit::INT
      alive_unit::INT
      alive_file::NTuple{31,Cchar}
      jacobian_available::INT
      subproblem_solver::INT
      non_monotone::INT
      weight_update_strategy::INT
      infinity::T
      stop_r_absolute::T
      stop_r_relative::T
      stop_pg_absolute::T
      stop_pg_relative::T
      stop_s::T
      stop_pg_switch::T
      initial_weight::T
      minimum_weight::T
      eta_successful::T
      eta_very_successful::T
      eta_too_successful::T
      weight_decrease_min::T
      weight_decrease::T
      weight_increase::T
      weight_increase_max::T
      switch_to_newton::T
      cpu_time_limit::T
      clock_time_limit::T
      newton_acceleration::Bool
      magic_step::Bool
      print_obj::Bool
      space_critical::Bool
      deallocate_error_fatal::Bool
      prefix::NTuple{31,Cchar}
      blls_control::blls_control_type{T,INT}
      bllsb_control::bllsb_control_type{T,INT}

detailed documentation#

control derived type as a Julia structure

components#

Bool f_indexing

use C or Fortran sparse matrix indexing

INT error

error and warning diagnostics occur on stream error

INT out

general output occurs on stream out

INT print_level

the level of output required.

$\leq$ 0 gives no output,
= 1 gives a one-line summary for every iteration,
= 2 gives a summary of the inner iteration for each iteration,
$\geq$ 3 gives increasingly verbose (debugging) output

INT start_print

any printing will start on this iteration

INT stop_print

any printing will stop on this iteration

INT print_gap

the number of iterations between printing

INT maxit

the maximum number of iterations performed

INT alive_unit

removal of the file alive_file from unit alive_unit terminates execution

char alive_file[31]

see alive_unit

INT jacobian_available

is the Jacobian matrix of first derivatives available ($\geq$ 2), is access only via matrix-vector products (=1) or is it not available ($\leq$ 0) ?

INT subproblem_solver

the method used to solve the crucial step-determination subproblem. Possible values are

1 a projected-gradient method using GALAHAD’s blls will be used
2 an interior-point method using GALAHAD’s bllsb will be used
3 an interior-point method will initially be used, but a switch to a projected-gradient method will occur when sufficient progress has occurred (see .stop_pg_switch).

INT non_monotone

non-monotone $\leq$ 0 monotone strategy used, anything else non-monotone strategy with this history length used

INT weight_update_strategy

define the weight-update strategy: 1 (basic), 2 (reset to zero when very successful), 3 (imitate TR), 4 (increase lower bound), 5 (GPT)

T infinity

any variable bound larger than infinity in modulus will be regarded as infinite

T stop_r_absolute

overall convergence tolerances. The iteration will terminate when $||r(x)||_2 \leq$ MAX( .stop_r_absolute, .stop_r_relative $* \|r(x_0)\|_2$ or when the norm of the gradient, $g(x) = J^T(x) W r(x)$ satisfies $\|P[x-g(x)]-x\|_2 \leq$ MAX( .stop_pg_absolute, .stop_pg_relative $* \|P[x_0-g(x_0)]-x_0\|_2$ or if the norm of step is less than .stop_s, where $x_0$ is the initial point.

T stop_r_relative

see stop_r_absolute

T stop_pg_absolute

see stop_r_absolute

T stop_pg_relative

see stop_r_absolute

T stop_s

see stop_r_absolute

T stop_pg_switch

the step-computation solver will switch from an interior-point method to a projected-gradient one if .subproblem_solver = 3 (see above) and $\|P[x-g(x)]-x\|_2 \leq$ MAX( .stop_pg_absolute, .stop_pg_switch $* \|P[x_0-g(x_0)]-x_0\|_2.

T initial_weight

initial value for the regularization weight (-ve => $1/\|g_0\|)$)

T minimum_weight

minimum permitted regularization weight

T eta_successful

a potential iterate will only be accepted if the actual decrease f - f(x_new) is larger than .eta_successful times that predicted by a quadratic model of the decrease. The regularization weight will be decreaed if this relative decrease is greater than .eta_very_successful but smaller than .eta_too_successful

T eta_very_successful

see eta_successful

T eta_too_successful

see eta_successful

T weight_decrease_min

on very successful iterations, the regularization weight will be reduced by the factor .weight_decrease but no more than .weight_decrease_min while if the iteration is unsucceful, the weight will be increased by a factor .weight_increase but no more than .weight_increase_max (these are delta_1, delta_2, delta3 and delta_max in Gould, Porcelli and Toint, 2011)

T weight_decrease

see weight_decrease_min

T weight_increase

see weight_decrease_min

T weight_increase_max

see weight_decrease_min

T switch_to_newton

if the value of the two-norm of the projected gradient is less than .switch_to_newton, a switch is made from the Gauss-Newton model to the Newton one when .newton_acceleration is true

T cpu_time_limit

the maximum CPU time allowed (-ve means infinite)

T clock_time_limit

the maximum elapsed clock time allowed (-ve means infinite)

Bool newton_acceleration

if they are available, second derivatives should be used to accelerate the convergence of the algorithm

Bool magic_step

allow the user to perform a “magic” step to improve the objective

Bool print_obj

print values of the objective/gradient rather than $\|r\|$ and its gradient

Bool space_critical

if .space_critical true, every effort will be made to use as little space as possible. This may result in longer computation time

Bool deallocate_error_fatal

if .deallocate_error_fatal is true, any array/pointer deallocation error will terminate execution. Otherwise, computation will continue

NTuple{31,Cchar} prefix

all output lines will be prefixed by .prefix(2:LEN(TRIM(.prefix))-1) where .prefix contains the required string enclosed in quotes, e.g. “string” or ‘string’

struct blls_control_type blls_control

control parameters for BLLS

struct bllsb_control_type bllsb_control

control parameters for BLLSB

bnls_time_type structure#

    struct bnls_time_type{T}
      total::T
      blls::T
      bllsb::T
      clock_total::T
      clock_blls::T
      clock_bllsb::T

detailed documentation#

time derived type as a Julia structure

components#

T total

the total CPU time spent in the package

T blls

the CPU time spent in the bllsb package

T bllsb

the CPU time spent in the bllsb package

T clock_total

the total clock time spent in the package

T clock_blls

the clock time spent in the blls package

T clock_bllsb

the clock time spent in the bllsb package

bnls_inform_type structure#

    struct bnls_inform_type{T,INT}
      status::INT
      alloc_status::INT
      bad_alloc::NTuple{81,Cchar}
      bad_eval::NTuple{13,Cchar}
      iter::INT
      inner_iter::INT
      r_eval::INT
      jr_eval::INT
      obj::T
      norm_r::T
      norm_g::T
      norm_pg::T
      weight::T
      time::bnls_time_type{T}
      blls_inform::blls_inform_type{T,INT}
      bllsb_inform::bllsb_inform_type{T,INT}

detailed documentation#

inform derived type as a Julia structure

components#

INT status

return status. See BNLS_solve for details

INT alloc_status

the status of the last attempted allocation/deallocation

NTuple{81,Cchar} bad_alloc

the name of the array for which an allocation/deallocation error occurred

char bad_eval[13]

the name of the user-supplied evaluation routine for which an error occurred

INT iter

the total number of iterations performed

INT inner_iter

the total number of inner (projected gradient and/or interior-point) iterations performed

INT r_eval

the total number of evaluations of the residual function $r(x)$

INT jr_eval

the total number of evaluations of the Jacobian $J_r(x)$ of $r(x)$

T obj

the value of the objective function $\frac{1}{2}\|c(x)\|^2_W$ at the best estimate the solution, x, determined by BNLS_solve

T norm_r

the norm of the residual $\|r(x)\|_W$ at the best estimate of the solution x, determined by BNLS_solve

T norm_g

the norm of the gradient of $\|r(x)\|_W$ of the objective function at the best estimate, x, of the solution determined by BNLS_solve

T norm_pg

the norm of the projected gradient $\|P[x - J_r^T(x) W r(x)] - x\|_2$ of the residual function at the best estimate, x, of the solution determined by BNLS_solve

T weight

the final regularization weight used

struct bnls_time_type time

timings (see above)

struct blls_inform_type blls_inform

inform parameters for BLLSB

struct bllsb_inform_type bllsb_inform

inform parameters for BLLSB

Index

example calls#

This is an example of how to use the package to solve a bound-constrained nonlinear least-squares problem; the code is available in $GALAHAD/src/bnls/Julia/test_bnls.jl .

# test_bnls.jl
# Simple code to test the Julia interface to BNLS

using GALAHAD
using Test
using Printf
using Accessors
using Quadmath

# Custom userdata struct
mutable struct userdata_bnls{T,INT}
  p::T
  flag::INT
  flags::Vector{INT}
end

function Base.unsafe_convert(::Type{Ptr{Cvoid}}, userdata::userdata_bnls)
  return pointer_from_objref(userdata)
end

function test_bnls(::Type{T}, ::Type{INT}; mode::String="reverse", sls::String="sytr",dls::String="potr") where {T,INT}

  # compute the residuals
  function res(x::Vector{T}, r::Vector{T}, userdata::userdata_bnls{T,INT})
    r[1] = x[1] * x[2] - userdata.p
    r[2] = x[2] * x[3] - one(T)
    r[3] = x[3] * x[4] - one(T)
    r[4] = x[4] * x[5] - one(T)
    return INT(0)
  end

  function res_c(n::INT, m_r::INT, x::Ptr{T}, r::Ptr{T}, userdata::Ptr{Cvoid})
    _x = unsafe_wrap(Vector{T}, x, n)
    _r = unsafe_wrap(Vector{T}, r, m_r)
    _userdata = unsafe_pointer_to_objref(userdata)::userdata_bnls{T,INT}
    res(_x, _r, _userdata)
    return INT(0)
  end

  res_ptr = @eval @cfunction($res_c, $INT,
                             ($INT, $INT, Ptr{$T}, Ptr{$T}, Ptr{Cvoid}))

  # compute the Jacobian
  function jac(x::Vector{T}, jr_val::Vector{T}, userdata::userdata_bnls{T,INT})
    jr_val[1] = x[2]
    jr_val[2] = x[1]
    jr_val[3] = x[3]
    jr_val[4] = x[2]
    jr_val[5] = x[4]
    jr_val[6] = x[3]
    jr_val[7] = x[5]
    jr_val[8] = x[4]
    return INT(0)
  end

  function jac_c(n::INT, m_r::INT, jr_ne::INT, x::Ptr{T}, jr_val::Ptr{T},
                 userdata::Ptr{Cvoid})
    _x = unsafe_wrap(Vector{T}, x, n)
    _jr_val = unsafe_wrap(Vector{T}, jr_val, jr_ne)
    _userdata = unsafe_pointer_to_objref(userdata)::userdata_bnls{T,INT}
    jac(_x, _jr_val, _userdata)
    return INT(0)
  end

  jac_ptr = @eval @cfunction($jac_c, $INT,
                             ($INT, $INT, $INT, Ptr{$T}, Ptr{$T}, Ptr{Cvoid}))

  # compute Jacobian-vector products
  function jacprod(x::Vector{T}, transpose::Bool, v::Vector{T}, p::Vector{T},
                   got_jr::Bool, userdata::userdata_bnls{T,INT})
    if transpose
      p[1] = x[2] * v[1]
      p[2] = x[3] * v[2] + x[1] * v[1]
      p[3] = x[4] * v[3] + x[2] * v[2]
      p[4] = x[5] * v[4] + x[3] * v[3]
      p[5] = x[4] * v[4]
    else
      p[1] = x[2] * v[1] + x[1] * v[2]
      p[2] = x[3] * v[2] + x[2] * v[3]
      p[3] = x[4] * v[3] + x[3] * v[4]
      p[4] = x[5] * v[4] + x[4] * v[5]
    end
    return INT(0)
  end

  function jacprod_c(n::INT, m_r::INT, x::Ptr{T}, transpose::Bool,
                     v::Ptr{T}, p::Ptr{T}, got_jr::Bool, userdata::Ptr{Cvoid})
    _x = unsafe_wrap(Vector{T}, x, n)
    _v = unsafe_wrap(Vector{T}, v, transpose ? m_r : n)
    _p = unsafe_wrap(Vector{T}, p, transpose ? n : m_r)
    _userdata = unsafe_pointer_to_objref(userdata)::userdata_bnls{T,INT}
    jacprod(_x, transpose, _v, _p, got_jr, _userdata)
    return INT(0)
  end

  jacprod_ptr = @eval @cfunction($jacprod_c, $INT,
                                 ($INT, $INT, Ptr{$T}, Bool, Ptr{$T}, Ptr{$T},
                                  Bool, Ptr{Cvoid}))

  # compute a sparse product with the Jacobian
  function jacprods(n::INT, m_r::INT, x::Vector{T}, v::Vector{T}, p::Vector{T},
                    iv::Vector{INT}, lvl::INT, lvu::INT, ip::Vector{INT},
                    lp::Vector{INT}, got_jr::Bool,
                    userdata::userdata_bnls{T,INT})
    if !isempty(ip) && !isempty(lp)
      userdata.flag = userdata.flag + 1
      lp[1] = 0
      for l = lvl:lvu
        j = iv[l]
        val = v[j]
        if j == 1
          i = 1
          if userdata.flags[i] < userdata.flag
            userdata.flags[i] = userdata.flag
            p[i] = x[i+1] * val
            lp[1] = lp[1] + 1
            ip[lp[1]] = i
          else
            p[i] = p[i] + x[i+1] * val
          end
        elseif j == n
          i = m_r
          if userdata.flags[i] < userdata.flag
            userdata.flags[i] = userdata.flag
            p[i] = x[i] * val
            lp[1] = lp[1] + 1
            ip[lp[1]] = i
          else
            p[i] = p[i] + x[i] * val
          end
        else
          i = j - 1
          if userdata.flags[i] < userdata.flag
            userdata.flags[i] = userdata.flag
            p[i] = x[i] * val
            lp[1] = lp[1] + 1
            ip[lp[1]] = i
          else
            p[i] = p[i] + x[i] * val
          end
          i = j
          if userdata.flags[i] < userdata.flag
            userdata.flags[i] = userdata.flag
            p[i] = x[i+1] * val
            lp[1] = lp[1] + 1
            ip[lp[1]] = i
          else
            p[i] = p[i] + x[i+1] * val
          end
        end
      end
    else
      for i = 1:m_r
        p[i] = zero(T)
      end
      for l = lvl:lvu
        j = iv[l]
        val = v[j]
        if j == 1
          i = 1
          p[i] = p[i] + x[i+1] * val
        elseif j == n
          i = m_r
          p[i] = p[i] + x[i] * val
        else
          i = j - 1
          p[i] = p[i] + x[i] * val
          i = j
          p[i] = p[i] + x[i+1] * val
        end
      end
    end
    return INT(0)
  end

  function jacprods_c(n::INT, m_r::INT, x::Ptr{T}, v::Ptr{T}, p::Ptr{T},
                      iv::Ptr{INT}, lvl::INT, lvu::INT, ip::Ptr{INT},
                      lp::Ptr{INT}, got_jr::Bool, userdata::Ptr{Cvoid})
    mnm = max(m_r, n)
    _x = unsafe_wrap(Vector{T}, x, n)
    _v = unsafe_wrap(Vector{T}, v, n)
    _p = unsafe_wrap(Vector{T}, p, m_r)
    _iv = unsafe_wrap(Vector{INT}, iv, mnm)
    _ip = unsafe_wrap(Vector{INT}, ip, ip == C_NULL ? 0 : m_r)
    _lp = unsafe_wrap(Vector{INT}, lp, lp == C_NULL ? 0 : 1)
    _userdata = unsafe_pointer_to_objref(userdata)::userdata_bnls{T,INT}
    jacprods(n, m_r, _x, _v, _p, _iv, lvl, lvu, _ip, _lp, got_jr, _userdata)
    return INT(0)
  end

  jacprods_ptr = @eval @cfunction($jacprods_c, $INT,
                                  ($INT, $INT, Ptr{$T}, Ptr{$T}, Ptr{$T},
                                   Ptr{$INT}, $INT, $INT, Ptr{$INT}, Ptr{$INT},
                                   Bool, Ptr{Cvoid}))

  # compute a sparse product with the Jacobian or its transpose
  function sjacprod(n::INT, m_r::INT, x::Vector{T}, transpose::Bool,
                    v::Vector{T}, p::Vector{T}, free::Vector{INT}, n_free::INT,
                    got_jr::Bool, userdata::userdata_bnls{T,INT})
    if transpose
      for i in 1:n_free
        j = free[i]
        if j == 1
          p[1] = x[2] * v[1]
        elseif j == n
          p[n] = x[m_r] * v[m_r]
        else
          p[j] = x[j-1] * v[j-1] + x[j+1] * v[j]
        end
      end
    else
      for i in 1:n_free
        j = free[i]
        val = v[j]
        if j == 1
          p[1] = p[1] + x[2] * val
        elseif j == n
          p[m_r] = p[m_r] + x[m_r] * val
        else
          p[j-1] = p[j-1] + x[j-1] * val
          p[j] = p[j] + x[j+1] * val
        end
      end
    end
    return INT(0)
  end

  function sjacprod_c(n::INT, m_r::INT, x::Ptr{T}, transpose::Bool,
                      v::Ptr{T}, p::Ptr{T}, free::Ptr{INT}, n_free::INT,
                      got_jr::Bool, userdata::Ptr{Cvoid})
    _x = unsafe_wrap(Vector{T}, x, n)
    _v = unsafe_wrap(Vector{T}, v, transpose ? m_r : n)
    _p = unsafe_wrap(Vector{T}, p, transpose ? n : m_r)
    _free = unsafe_wrap(Vector{INT}, free, n_free)
    _userdata = unsafe_pointer_to_objref(userdata)::userdata_bnls{T}
    return sjacprod(n, m_r, _x, transpose, _v, _p, _free, n_free, got_jr,
                    _userdata)
  end

  sjacprod_ptr = @eval @cfunction($sjacprod_c, $INT,
                                  ($INT, $INT, Ptr{$T}, Bool, Ptr{$T}, Ptr{$T},
                                   Ptr{$INT}, $INT, Bool, Ptr{Cvoid}))

  # Derived types
  data = Ref{Ptr{Cvoid}}()
  control = Ref{bnls_control_type{T,INT}}()
  inform = Ref{bnls_inform_type{T,INT}}()

  # Set problem data
  n = INT(5)  # variables
  m_r = INT(4)  # observations
  w = T[1.0, 1.0, 1.0, 1.0]  # weights
  jr_ne = INT(8)  # Jacobian elements
  Jr_row = INT[1, 1, 2, 2, 3, 3, 4, 4]  # Jacobian J
  Jr_col = INT[1, 2, 2, 3, 3, 4, 4, 5]
  Jr_val = zeros(T, jr_ne)

  # Set storage
  x_l = zeros(T, n)  # lower bounds
  x_u = zeros(T, n)  # upper bounds
  x = zeros(T, n)  # variables
  z = zeros(T, n)  # dual variables
  r = zeros(T, m_r)  # residual
  g = zeros(T, n)  # gradient
  x_stat = zeros(INT, n)  # variable status
  status = Ref{INT}()

  # set variable bounds
  for i in 1:n
    x_l[i] = zero(T)  # lower bound
    x_u[i] = one(T)   # upper bound
  end

  # Set user data
  p = T(4)
  flag = INT(0)  # current flag value
  flags = zeros(INT, m_r)  # array of flags
  userdata = userdata_bnls{T,INT}(p, flag, flags)

  @printf(" fortran sparse matrix indexing\n\n")

  if mode == "direct"
    # solve via function calls
    for d in 1:2
    #for d in 1:0
      # Initialize BNLS
      bnls_initialize(T, INT, data, control, inform)

      # Set user-defined control options
      # @reset control[].out = INT(0)
      # @reset control[].blls_control.out = INT(0)
      #@reset control[].print_level = INT(10)
      #@reset control[].blls_control.print_level = INT(4)
      # @reset control[].bllsb_control.print_level = INT(1)
      # @reset control[].maxit = INT(10)
      # @reset control[].blls_control.maxit = INT(5)
      @reset control[].jacobian_available = INT(2)
      if T == Float32
        @reset control[].stop_pg_absolute = T(0.0001)
      else
        @reset control[].stop_pg_absolute = T(0.00001)
      end
      @reset control[].blls_control.sbls_control.definite_linear_solver =
        galahad_linear_solver(dls)
      @reset control[].blls_control.sbls_control.symmetric_linear_solver =
        galahad_linear_solver(sls)
      st = " "

      for i in 1:n
        x[i] = T(0.5)  # starting point
      end

      # solve when Jacobian is available via function calls
      # ---------------------------------------------------
      if d == 1
        st = "JF"
        @reset control[].jacobian_available = INT(2)
        bnls_import(T, INT, control, data, status, n, m_r, "coordinate", jr_ne,
                    Jr_row, Jr_col, INT(0), C_NULL)
        bnls_solve_with_jac(T, INT, data, userdata, status, n, m_r, x_l, x_u,
                            x, z, r, g, x_stat, res_ptr, jr_ne, jac_ptr, w)
      end

      # solve when Jacobian products are available via function calls
      # -------------------------------------------------------------
      if d == 2
        st = "PF"
        @reset control[].jacobian_available = INT(1)
        bnls_import_without_jac(T, INT, control, data, status, n, m_r)
        bnls_solve_with_jacprod(T, INT, data, userdata, status, n, m_r,
                                x_l, x_u, x, z, r, g, x_stat, res_ptr,
                                jacprod_ptr, jacprods_ptr, sjacprod_ptr, w)
      end

      bnls_information(T, INT, data, inform, status)

      if inform[].status == 0
        @printf(" BNLS(%s):%6d iterations. Optimal objective value = %5.2f status = %1d\n",
                st, inform[].iter, inform[].obj, inform[].status)
      else
        @printf(" BNLS(%s): exit status = %1d\n", st, inform[].status)
      end

      # Delete internal workspace
      bnls_terminate(T, INT, data, control, inform)
    end
  end

  if mode == "reverse"
    # reverse-communication input/output
    mnm = max(m_r, n)
    eval_status = Ref{INT}()
    lvl = Ref{INT}()
    lvu = Ref{INT}()
    iv = zeros(INT, mnm)
    #ip = zeros(INT, m_r)
    ip = zeros(INT, mnm)
    lp = zeros(INT, 1)
    v = zeros(T, mnm)
    p = zeros(T, mnm)
    got_jr = true

    # solve via reverse access
    # ------------------------
    #for d in 1:0
    for d in 1:2
      # Initialize BNLS
      bnls_initialize(T, INT, data, control, inform)

      # Set user-defined control options
      # @reset control[].print_level = INT(1)
      # @reset control[].maxit = INT(10)
      # @reset control[].blls_control.maxit = INT(5)
      if T == Float32
        @reset control[].stop_pg_absolute = T(0.0001)
      else
        @reset control[].stop_pg_absolute = T(0.00001)
      end
      @reset control[].blls_control.sbls_control.definite_linear_solver =
        galahad_linear_solver(dls)
      @reset control[].blls_control.sbls_control.symmetric_linear_solver =
        galahad_linear_solver(sls)
      st = " "

      for i in 1:n
        x[i] = T(0.5)  # starting point
      end

      if d == 1
        # solve when Jacobian is available via reverse access
        # ---------------------------------------------------
        st = "JR"
        @reset control[].jacobian_available = INT(2)
        bnls_import(T, INT, control, data, status, n, m_r, "coordinate",
                    jr_ne, Jr_row, Jr_col, INT(0), C_NULL)

        terminated = false
        while !terminated # reverse-communication loop
          bnls_solve_reverse_with_jac(T, INT, data, status, eval_status,
                                      n, m_r, x_l, x_u, x, z, r, g, x_stat,
                                      jr_ne, Jr_val, w)
          if status[] == 0 # successful termination
            terminated = true
          elseif status[] < 0 # error exit
            terminated = true
          elseif status[] == 2 # evaluate r
            eval_status[] = res(x, r, userdata)
          elseif status[] == 3 # evaluate Jr
            eval_status[] = jac(x, Jr_val, userdata)
          else
            @printf(" the value %1d of status should not occur\n", status[])
          end
        end
      end

      if d == 2
        # solve when Jacobian products are available via reverse access
        # -------------------------------------------------------------
        st = "PR"
        @reset control[].jacobian_available = INT(1)
        bnls_import_without_jac(T, INT, control, data, status, n, m_r)

        terminated = false
        while !terminated # reverse-communication loop
          bnls_solve_reverse_with_jacprod(T, INT, data, status, eval_status,
                                          n, m_r, x_l, x_u, x, z, r, g, x_stat,
                                          v, iv, lvl, lvu, p, ip, lp[1], w)
          if status[] == 0 # successful termination
            terminated = true
          elseif status[] < 0 # error exit
            terminated = true
          elseif status[] == 2 # evaluate r
            eval_status[] = res(x, r, userdata)
            got_jr = false
          elseif status[] == 4 # evaluate p = Jr v
            eval_status[] = jacprod(x, false, v, p, got_jr, userdata)
          elseif status[] == 5 # evaluate p = Jr' v
            eval_status[] = jacprod(x, true, v, p, got_jr, userdata)
          elseif status[] == 6 # evaluate p = Jr * sparse v
            eval_status[] = jacprods(n, m_r, x, v, p, iv, lvl[], lvu[],
                                     INT[], INT[], got_jr, userdata)
          elseif status[] == 7 # evaluate p = sparse(Jr(x) * sparse v)
            eval_status[] = jacprods(n, m_r, x, v, p, iv, lvl[], lvu[],
                                     ip, lp, got_jr, userdata)
          elseif status[] == 8 # evaluate p = sparse(Jr' v)
            eval_status[] = sjacprod(n, m_r, x, true, v, p, iv, lvu[],
                                     got_jr, userdata)
          else
            @printf(" the value %1d of status should not occur\n", status[])
          end
        end
      end

      bnls_information(T, INT, data, inform, status)

      if inform[].status == 0
        @printf(" BNLS(%s):%6d iterations. Optimal objective value = %5.2f status = %1d\n",
                st, inform[].iter, inform[].obj, inform[].status)
      else
        @printf(" BNLS(%s): exit status = %1d\n", st, inform[].status)
      end

      # Delete internal workspace
      bnls_terminate(T, INT, data, control, inform)
    end
  end

  return 0
end

for (T, INT, libgalahad) in ((Float32 , Int32, GALAHAD.libgalahad_single      ),
                             (Float32 , Int64, GALAHAD.libgalahad_single_64   ),
                             (Float64 , Int32, GALAHAD.libgalahad_double      ),
                             (Float64 , Int64, GALAHAD.libgalahad_double_64   ),
                             (Float128, Int32, GALAHAD.libgalahad_quadruple   ),
                             (Float128, Int64, GALAHAD.libgalahad_quadruple_64))
  if isfile(libgalahad)
    @testset "BNLS -- $T -- $INT" begin
      @testset "$mode communication" for mode in ("direct","reverse")
        @test test_bnls(T, INT; mode) == 0
      end
    end
  end
end