GALAHAD BNLS package#
purpose#
The bnls package uses a regularization method to solve
a given bound-constrained nonlinear least-squares problem.
The aim is to minimize the least-squares objective function
See Section 4 of $GALAHAD/doc/bnls.pdf for additional details.
terminology#
The algorithm used by the package is iterative. From the current best estimate of the minimizer \(x_k\), a trial improved point \(x_k + s_k\) is sought. The correction \(s_k\) is chosen to improve a model \(m_k(s)\) of the objective function \(f(x_k+s)\) built around \(x_k\). The model is the sum of two basic components, a suitable approximation \(t_k(s)\) of \(f(x_k+s)\), and a regularization term \(\frac{1}{2} \sigma_k \|s\|_2^2\) involving a weight \(\sigma_k\). The weight is adjusted as the algorithm progresses to ensure convergence.
The model \(t_k(s)\) is a truncated Taylor-series approximation, and this relies on being able to compute or estimate derivatives of \(r(x)\). Various models are provided, and each has different derivative requirements. We denote the \(m\) by \(n\) residual Jacobian \(J_r(x) \equiv \nabla_x r(x)\) as the matrix whose \(i,j\)-th component
the Gauss-Newton approximation \(\frac{1}{2} \| r(x_k) + J_r(x_k) s\|^2_W\),
the Newton (second-order Taylor) approximation
\(f(x_k) + g(x_k)^T s + \frac{1}{2} s^T [ J_r^T(x_k) W J_r(x_k) + H(x_k,W r(x_k))] s\)
(although the latter has yet to be implemented).
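As an illustration of how the Gauss-Newton approximation combines with the regularization term \(\frac{1}{2} \sigma_k \|s\|_2^2\), the following minimal sketch evaluates the resulting model for a small dense Jacobian. It is not the package's implementation: the function name gauss_newton_model is hypothetical, the Jacobian is assumed to be held by rows, and the \(W\)-norm is assumed to be defined by diagonal weights w.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of bnls): evaluate
     m(s) = 1/2 * sum_i w_i * (r_i + (J s)_i)^2 + 1/2 * sigma * ||s||_2^2
   where J is m_r by n, stored densely by rows, and w holds assumed
   diagonal weights (w == NULL is treated as all weights equal to one). */
double gauss_newton_model(size_t m_r, size_t n, const double r[],
                          const double J[], const double w[],
                          double sigma, const double s[])
{
    assert(sigma >= 0.0);
    double value = 0.0;
    for (size_t i = 0; i < m_r; i++) {
        /* i-th component of the linearized residual r + J s */
        double ri = r[i];
        for (size_t j = 0; j < n; j++) ri += J[n * i + j] * s[j];
        value += 0.5 * (w ? w[i] : 1.0) * ri * ri;
    }
    /* regularization term 1/2 * sigma * ||s||_2^2 */
    double s_norm_sq = 0.0;
    for (size_t j = 0; j < n; j++) s_norm_sq += s[j] * s[j];
    return value + 0.5 * sigma * s_norm_sq;
}
```

With r = (1, -1), J = [[1, 0], [0, 2]], s = (1, 1), unit weights and sigma = 2, the linearized residual is (2, 1), so the model value is 2.5 + 2 = 4.5.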
The primal optimality conditions (1) and dual optimality conditions
method#
An adaptive regularization method is used. In this, an improvement to a current estimate of the required minimizer, \(x_k\), is sought by computing a step \(s_k\). The step is chosen to approximately minimize a model \(t_k(s)\) of \(f(x_k+s)\) that includes a weighted regularization term \(\frac{\sigma_k}{p} \|s\|_{S_k}^p\) for some specified positive weight \(\sigma_k\). The quality of the resulting step \(s_k\) is assessed by computing the “ratio” \((f(x_k) - f(x_k + s_k))/(t_k(0) - t_k(s_k))\). The step is deemed to have succeeded if the ratio exceeds a given \(\eta_s > 0\), and in this case \(x_{k+1} = x_k + s_k\). Otherwise \(x_{k+1} = x_k\), and the weight is increased by powers of a given increase factor up to a given limit. If the ratio is larger than \(\eta_v \geq \eta_s\), the weight will be decreased by powers of a given decrease factor, again up to a given limit. The method will terminate as soon as \(f(x_k)\) or \(\|\nabla_x f(x_k)\|\) is smaller than a specified value.
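The acceptance test and weight update described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the logic used inside bnls: the struct reg_params, the function names, and the use of a single multiplication by the increase or decrease factor (rather than powers of it) are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical parameters for this sketch; these are not members of
   bnls_control_type. */
typedef struct {
    double eta_s;      /* success threshold, eta_s > 0 */
    double eta_v;      /* very-successful threshold, eta_v >= eta_s */
    double gamma_inc;  /* factor > 1 used to increase the weight */
    double gamma_dec;  /* factor in (0,1) used to decrease the weight */
    double sigma_min;  /* lower limit on the weight */
    double sigma_max;  /* upper limit on the weight */
} reg_params;

/* Ratio of actual to predicted decrease:
   (f(x_k) - f(x_k + s_k)) / (t_k(0) - t_k(s_k)). */
double decrease_ratio(double f_old, double f_new, double t_zero, double t_step)
{
    return (f_old - f_new) / (t_zero - t_step);
}

/* Returns true if the step is accepted, and updates *sigma in place. */
bool update_weight(const reg_params *p, double ratio, double *sigma)
{
    assert(p->gamma_inc > 1.0 && p->gamma_dec > 0.0 && p->gamma_dec < 1.0);
    bool accepted = ratio > p->eta_s;
    if (!accepted) {
        /* unsuccessful step: increase the weight, up to the limit */
        *sigma *= p->gamma_inc;
        if (*sigma > p->sigma_max) *sigma = p->sigma_max;
    } else if (ratio > p->eta_v) {
        /* very successful step: decrease the weight, down to the limit */
        *sigma *= p->gamma_dec;
        if (*sigma < p->sigma_min) *sigma = p->sigma_min;
    }
    return accepted;
}
```

A caller would set \(x_{k+1} = x_k + s_k\) only when update_weight returns true, and otherwise retry from the same point with the larger weight.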
The step \(s_k\) may be computed either by employing a projected-gradient
method to minimize the model within the simple-bound constraint set
\(x^L \leq x_k \leq x^U\) using the GALAHAD module blls, or by
applying the interior-point method available in the module bllsb
to the same subproblem. Experience has shown that it can be beneficial
to use the latter method during early iterations, but to switch to the
former as the iterates approach convergence.
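To make the projected-gradient idea concrete, the sketch below projects a point onto the simple-bound constraint set and takes one projected-gradient step. It is only an illustration: the function names and the fixed step length alpha are assumptions, and the blls module uses a more elaborate strategy.

```c
#include <assert.h>
#include <stddef.h>

/* Project x onto the box x_l <= x <= x_u, component by component. */
void project_onto_bounds(size_t n, const double x_l[], const double x_u[],
                         double x[])
{
    for (size_t j = 0; j < n; j++) {
        assert(x_l[j] <= x_u[j]);
        if (x[j] < x_l[j]) x[j] = x_l[j];
        else if (x[j] > x_u[j]) x[j] = x_u[j];
    }
}

/* One projected-gradient step x <- P[x - alpha * g], where g is the
   gradient at x and alpha > 0 is an assumed fixed step length. */
void projected_gradient_step(size_t n, const double x_l[], const double x_u[],
                             const double g[], double alpha, double x[])
{
    for (size_t j = 0; j < n; j++) x[j] -= alpha * g[j];
    project_onto_bounds(n, x_l, x_u, x);
}
```

For example, with bounds \(0 \leq x \leq 1\), x = (0.5, 0), g = (-2, 1) and alpha = 1, the unprojected point (2.5, -1) is mapped back to (1, 0).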
The iteration is terminated as soon as either the \(W\)-norm of the residual \(r(x_k)\) or the Euclidean norm of the projected gradient \(P[x_k-\nabla f(x_k)]\), for which the projection operator
references#
The generic adaptive cubic regularization method is described in detail in
C. Cartis, N. I. M. Gould and Ph. L. Toint, “Evaluation complexity of algorithms for nonconvex optimization”, SIAM-MOS Series on Optimization (2022),
and uses “tricks” as suggested in
N. I. M. Gould, M. Porcelli and Ph. L. Toint, “Updating the regularization parameter in the adaptive cubic regularization algorithm”, Computational Optimization and Applications 53(1) (2012) 1–22.
The specific methods employed here are discussed in
N. I. M. Gould, “A projection method for bound-constrained linear least-squares”, STFC-Rutherford Appleton Laboratory Computational Mathematics Group Internal Report 2023-1 (2023).
matrix storage#
The unsymmetric \(m_r\) by \(n\) Jacobian matrix \(J_r = J_r(x)\) may be presented and stored in a variety of convenient input formats.
Dense storage format: The matrix \(J_r\) is stored as a compact dense matrix by rows, that is, the values of the entries of each row in turn are stored in order within an appropriate real one-dimensional array. In this case, component \(n \ast i + j\) of the storage array Jr_val will hold the value \(J_{ij}\) for \(0 \leq i \leq m_r-1\), \(0 \leq j \leq n-1\).
Dense by columns storage format: The matrix \(J_r\) is stored as a compact dense matrix by columns, that is, the values of the entries of each column in turn are stored in order within an appropriate real one-dimensional array. In this case, component \(m_r \ast j + i\) of the storage array Jr_val will hold the value \(J_{ij}\) for \(0 \leq i \leq m_r-1\), \(0 \leq j \leq n-1\).
Sparse co-ordinate storage format: Only the nonzero entries of the matrices are stored. For the \(l\)-th entry, \(0 \leq l \leq ne-1\), of \(J_r\), its row index i, column index j and value \(J_{ij}\), \(0 \leq i \leq m_r-1\), \(0 \leq j \leq n-1\), are stored as the \(l\)-th components of the integer arrays Jr_row and Jr_col and real array Jr_val, respectively, while the number of nonzeros is recorded as Jr_ne = \(ne\).
Sparse row-wise storage format: Again only the nonzero entries are stored, but this time they are ordered so that those in row i appear directly before those in row i+1. For the i-th row of \(J_r\) the i-th component of the integer array Jr_ptr holds the position of the first entry in this row, while Jr_ptr(m_r) holds the total number of entries. The column indices j, \(0 \leq j \leq n-1\), and values \(J_{ij}\) of the nonzero entries in the i-th row are stored in components l = Jr_ptr(i), \(\ldots\), Jr_ptr(i+1)-1, \(0 \leq i \leq m_r-1\), of the integer array Jr_col, and real array Jr_val, respectively. For sparse matrices, this scheme almost always requires less storage than its predecessor.
Sparse column-wise storage format: Once again only the nonzero entries are stored, but this time they are ordered so that those in column j appear directly before those in column j+1. For the j-th column of \(J_r\) the j-th component of the integer array Jr_ptr holds the position of the first entry in this column, while Jr_ptr(n) holds the total number of entries. The row indices i, \(0 \leq i \leq m_r-1\), and values \(J_{ij}\) of the nonzero entries in the j-th column are stored in components l = Jr_ptr(j), \(\ldots\), Jr_ptr(j+1)-1, \(0 \leq j \leq n-1\), of the integer array Jr_row, and real array Jr_val, respectively. As before, for sparse matrices, this scheme almost always requires less storage than the co-ordinate format.
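The storage schemes for \(J_r\) above can be related to one another with a few lines of code. The sketch below gives the dense index formulas and converts a co-ordinate description into the sparse row-wise one. The helper names are hypothetical and are not provided by the package.

```c
#include <assert.h>
#include <stddef.h>

/* Position of J_ij in Jr_val for the dense (by rows) scheme. */
size_t dense_by_rows_index(size_t n, size_t i, size_t j)
{
    return n * i + j;
}

/* Position of J_ij in Jr_val for the dense-by-columns scheme. */
size_t dense_by_cols_index(size_t m_r, size_t i, size_t j)
{
    return m_r * j + i;
}

/* Convert ne co-ordinate entries (row, col, val), in any order, into the
   sparse row-wise scheme. ptr must have m_r+1 components; col_out and
   val_out must have ne components. */
void coordinate_to_sparse_by_rows(int m_r, int ne, const int row[],
                                  const int col[], const double val[],
                                  int ptr[], int col_out[], double val_out[])
{
    /* count the entries in each row, stored one position ahead */
    for (int i = 0; i <= m_r; i++) ptr[i] = 0;
    for (int l = 0; l < ne; l++) {
        assert(row[l] >= 0 && row[l] < m_r);
        ptr[row[l] + 1]++;
    }
    /* cumulative sum: ptr[i] becomes the start of row i */
    for (int i = 0; i < m_r; i++) ptr[i + 1] += ptr[i];
    /* scatter the entries, using ptr[i] as the next free slot in row i */
    for (int l = 0; l < ne; l++) {
        int pos = ptr[row[l]]++;
        col_out[pos] = col[l];
        val_out[pos] = val[l];
    }
    /* ptr[i] now holds the start of row i+1; shift back to restore */
    for (int i = m_r; i > 0; i--) ptr[i] = ptr[i - 1];
    ptr[0] = 0;
}
```

For the 2 by 3 matrix with rows (1, 0, 2) and (0, 3, 0), the row-wise description is Jr_ptr = (0, 2, 3), Jr_col = (0, 2, 1) and Jr_val = (1, 2, 3).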
The symmetric \(n\) by \(n\) matrix \(H = H(x,y)\) may be presented and stored in a variety of formats. Crucially, symmetry is exploited by storing only values from the lower triangular part (i.e., those entries that lie on or below the leading diagonal).
Dense storage format: The matrix \(H\) is stored as a compact dense matrix by rows, that is, the values of the entries of each row in turn are stored in order within an appropriate real one-dimensional array. Since \(H\) is symmetric, only the lower triangular part (that is the part \(H_{ij}\) for \(0 \leq j \leq i \leq n-1\)) need be held. In this case the lower triangle should be stored by rows, that is component \(i * (i + 1) / 2 + j\) of the storage array H_val will hold the value \(H_{ij}\) (and, by symmetry, \(H_{ji}\)) for \(0 \leq j \leq i \leq n-1\).
Sparse co-ordinate storage format: Only the nonzero entries of the matrices are stored. For the \(l\)-th entry, \(0 \leq l \leq ne-1\), of \(H\), its row index i, column index j and value \(H_{ij}\), \(0 \leq j \leq i \leq n-1\), are stored as the \(l\)-th components of the integer arrays H_row and H_col and real array H_val, respectively, while the number of nonzeros is recorded as H_ne = \(ne\). Note that only the entries in the lower triangle should be stored.
Sparse row-wise storage format: Again only the nonzero entries are stored, but this time they are ordered so that those in row i appear directly before those in row i+1. For the i-th row of \(H\) the i-th component of the integer array H_ptr holds the position of the first entry in this row, while H_ptr(n) holds the total number of entries. The column indices j, \(0 \leq j \leq i\), and values \(H_{ij}\) of the entries in the i-th row are stored in components l = H_ptr(i), …, H_ptr(i+1)-1 of the integer array H_col, and real array H_val, respectively. Note that as before only the entries in the lower triangle should be stored. For sparse matrices, this scheme almost always requires less storage than its predecessor.
Diagonal storage format: If \(H\) is diagonal (i.e., \(H_{ij} = 0\) for all \(0 \leq i \neq j \leq n-1\)) only the diagonal entries \(H_{ii}\), \(0 \leq i \leq n-1\) need be stored, and the first n components of the array H_val may be used for the purpose.
Multiples of the identity storage format: If \(H\) is a multiple of the identity matrix, (i.e., \(H = \alpha I\) where \(I\) is the n by n identity matrix and \(\alpha\) is a scalar), it suffices to store \(\alpha\) as the first component of H_val.
The identity matrix format: If \(H\) is the identity matrix, no values need be stored.
The zero matrix format: The same is true if \(H\) is the zero matrix.
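For the dense storage of the symmetric matrix \(H\), row \(i\) of the lower triangle (counting from zero) is preceded by \(1 + 2 + \cdots + i = i(i+1)/2\) stored entries. A minimal sketch of the resulting index calculation follows. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Position in H_val of the lower-triangular entry H_ij, j <= i, when the
   lower triangle is stored by rows with zero-based indices. */
size_t lower_packed_index(size_t i, size_t j)
{
    assert(j <= i);
    return i * (i + 1) / 2 + j;
}

/* Read H_ij for any i, j, using symmetry for entries above the diagonal. */
double sym_get(const double H_val[], size_t i, size_t j)
{
    return i >= j ? H_val[lower_packed_index(i, j)]
                  : H_val[lower_packed_index(j, i)];
}
```

A 3 by 3 matrix therefore occupies six components of H_val, ordered \(H_{00}, H_{10}, H_{11}, H_{20}, H_{21}, H_{22}\).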
introduction to function calls#
To solve a given problem, functions from the bnls package must be called in the following order:
bnls_initialize - provide default control parameters and set up initial data structures
bnls_read_specfile (optional) - override control values by reading replacement values from a file
set up data structures by calling one of
bnls_import - set up problem data structures and fixed values when \(J_r(x)\) is available
bnls_import_without_jac - set up problem data structures and fixed values when only products with \(J_r(x)\) are available
bnls_reset_control (optional) - possibly change control parameters if a sequence of problems is being solved
solve the problem by calling one of
bnls_solve_with_jac - solve using function calls to evaluate residual and Jacobian values
bnls_solve_with_jacprod - solve using function calls to evaluate residual values and Jacobian-vector products
bnls_solve_reverse_with_jac - solve returning to the calling program to obtain residual and Jacobian values, or
bnls_solve_reverse_with_jacprod - solve returning to the calling program to obtain residual values and Jacobian-vector products
bnls_information (optional) - recover information about the solution and solution process
bnls_terminate - deallocate data structures
See the examples section for illustrations of use.
callable functions#
overview of functions provided#
// typedefs
typedef float spc_;
typedef double rpc_;
typedef int ipc_;

// structs
struct bnls_control_type;
struct bnls_inform_type;
struct bnls_time_type;

// function calls
void bnls_initialize(
    void **data,
    struct bnls_control_type* control,
    struct bnls_inform_type* inform );

void bnls_read_specfile(
    struct bnls_control_type* control,
    const char specfile[] );

void bnls_import(
    struct bnls_control_type* control,
    void **data, ipc_ *status, ipc_ n, ipc_ m_r,
    const char J_type[], ipc_ J_ne,
    const ipc_ J_row[], const ipc_ J_col[],
    ipc_ J_ptr_ne, const ipc_ J_ptr[] );

void bnls_import_without_jac(
    struct bnls_control_type* control,
    void **data, ipc_ *status, ipc_ n, ipc_ m_r );

void bnls_reset_control(
    struct bnls_control_type* control,
    void **data, ipc_ *status );

void bnls_solve_with_jac(
    void **data, void *userdata, ipc_ *status, ipc_ n, ipc_ m_r,
    rpc_ x_l[], rpc_ x_u[], rpc_ x[], rpc_ z[], rpc_ r[], rpc_ g[],
    ipc_ x_stat[],
    ipc_(*)(ipc_, ipc_, const rpc_[], rpc_[], const void*) eval_r,
    ipc_ jr_ne,
    ipc_(*)(ipc_, ipc_, ipc_, const rpc_[], rpc_[], const void*) eval_jr,
    const rpc_ w[] );

void bnls_solve_with_jacprod(
    void **data, void *userdata, ipc_ *status, ipc_ n, ipc_ m_r,
    rpc_ x_l[], rpc_ x_u[], rpc_ x[], rpc_ z[], rpc_ r[], rpc_ g[],
    ipc_ x_stat[],
    ipc_(*)(ipc_, ipc_, const rpc_[], rpc_[], const void*) eval_r,
    ipc_(*)(ipc_, ipc_, const rpc_[], const bool, rpc_[], const rpc_[], bool, const void*) eval_jr_prod,
    ipc_(*)(ipc_, ipc_, const rpc_[], const rpc_[], rpc_[], const ipc_[], ipc_, ipc_, ipc_[], ipc_*, bool, const void*) eval_jr_prods,
    ipc_(*)(ipc_, ipc_, const rpc_[], const bool, const rpc_[], rpc_[], const ipc_[], ipc_, bool, const void*) eval_jr_sprod,
    const rpc_ w[] );

void bnls_solve_reverse_with_jac(
    void **data, ipc_ *status, ipc_ *eval_status, ipc_ n, ipc_ m_r,
    rpc_ x_l[], rpc_ x_u[], rpc_ x[], rpc_ z[], rpc_ r[], rpc_ g[],
    ipc_ x_stat[], ipc_ jr_ne, rpc_ Jr_val[], const rpc_ w[] );

void bnls_solve_reverse_with_jacprod(
    void **data, ipc_ *status, ipc_ *eval_status, ipc_ n, ipc_ m_r,
    rpc_ x_l[], rpc_ x_u[], rpc_ x[], rpc_ z[], rpc_ r[], rpc_ g[],
    ipc_ x_stat[], rpc_ v[], ipc_ iv[], rpc_ v[], ipc_ *lvl, ipc_ *lvu,
    const rpc_ p[], const ipc_ ip[], ipc_ lp, const rpc_ w[] );

void bnls_information(
    void **data,
    struct bnls_inform_type* inform,
    ipc_ *status );

void bnls_terminate(
    void **data,
    struct bnls_control_type* control,
    struct bnls_inform_type* inform );
typedefs#
typedef float spc_
spc_ is real single precision
typedef double rpc_
rpc_ is the real working precision used, but may be changed to float by
defining the preprocessor variable REAL_32 or (if supported) to
__real128 using the variable REAL_128.
typedef int ipc_
ipc_ is the default integer word length used, but may be changed to
int64_t by defining the preprocessor variable INTEGER_64.
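The way preprocessor variables can select the working types may be sketched as below. This is an assumed arrangement for illustration only, not a copy of the GALAHAD headers. The quadruple-precision branch is omitted, and the helper functions real_bytes and integer_bytes are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Real working precision: double unless REAL_32 is defined. */
#ifdef REAL_32
typedef float rpc_;
#else
typedef double rpc_;
#endif

/* Integer word length: int unless INTEGER_64 is defined. */
#ifdef INTEGER_64
typedef int64_t ipc_;
#else
typedef int ipc_;
#endif

/* Hypothetical helpers reporting the sizes, in bytes, of the chosen types. */
size_t real_bytes(void) { return sizeof(rpc_); }
size_t integer_bytes(void) { return sizeof(ipc_); }
```

Compiling the same source with, for example, -DREAL_32 -DINTEGER_64 would then switch both types without any change to the calling code.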
function and structure names#
The function and structure names described below are appropriate for the
default real working precision (double) and integer word length
(int32_t). To use the functions and structures with different precisions
and integer word lengths, an additional suffix must be added to their names
(and the arguments set accordingly). The appropriate suffixes are:
_s for single precision (float) reals and
standard 32-bit (int32_t) integers;
_q for quadruple precision (__real128) reals (if supported) and
standard 32-bit (int32_t) integers;
_64 for standard precision (double) reals and
64-bit (int64_t) integers;
_s_64 for single precision (float) reals and
64-bit (int64_t) integers; and
_q_64 for quadruple precision (__real128) reals (if supported) and
64-bit (int64_t) integers.
Thus a call to bnls_initialize below will instead be
void bnls_initialize_s_64(void **data, struct bnls_control_type_s_64* control, struct bnls_inform_type_s_64* inform)
if single precision (float) reals and 64-bit (int64_t) integers are
required. Thus it is possible to call functions for this package
with more than one precision and/or integer word length at the same time. An
example is provided for the package expo,
and the obvious modifications apply equally here.
function calls#
void bnls_initialize( void **data, struct bnls_control_type* control, struct bnls_inform_type* inform )
Set default control values and initialize private data
Parameters:
data |
holds private internal data |
control |
is a struct containing control information (see bnls_control_type) |
inform |
is a struct containing output information (see bnls_inform_type) |
void bnls_read_specfile(struct bnls_control_type* control, const char specfile[])
Read the content of a specification file, and assign values associated with given keywords to the corresponding control parameters. An in-depth discussion of specification files is available, and a detailed list of keywords with associated default values is provided in $GALAHAD/src/bnls/BNLS.template. See also Table 2.1 in the Fortran documentation provided in $GALAHAD/doc/bnls.pdf for a list of how these keywords relate to the components of the control structure.
Parameters:
control |
is a struct containing control information (see bnls_control_type) |
specfile |
is a character string containing the name of the specification file |
void bnls_import( struct bnls_control_type* control, void **data, ipc_ *status, ipc_ n, ipc_ m_r, const char J_type[], ipc_ J_ne, const ipc_ J_row[], const ipc_ J_col[], ipc_ J_ptr_ne, const ipc_ J_ptr[] )
Import problem data into internal storage prior to solution.
Parameters:
control |
is a struct whose members provide control parameters for the remaining procedures (see bnls_control_type) |
data |
holds private internal data |
status |
is a scalar variable of type ipc_, that gives the exit status from the package. Possible values are:
|
n |
is a scalar variable of type ipc_, that holds the number of variables. |
m_r |
is a scalar variable of type ipc_, that holds the number of residuals. |
J_type |
is a one-dimensional array of type char that specifies the unsymmetric storage scheme used for the Jacobian, \(J_r\). It should be one of ‘coordinate’, ‘sparse_by_rows’, ‘dense’ or ‘absent’, the latter if access to the Jacobian is via matrix-vector products; lower or upper case variants are allowed. |
J_ne |
is a scalar variable of type ipc_, that holds the number of entries in \(J_r\) in the sparse co-ordinate storage scheme. It need not be set for any of the other schemes. |
J_row |
is a one-dimensional array of size J_ne and type ipc_, that holds the row indices of \(J_r\) in the sparse co-ordinate and sparse column-wise storage schemes. It need not be set for any of the remaining schemes, and in this case can be NULL. |
J_col |
is a one-dimensional array of size J_ne and type ipc_, that holds the column indices of \(J_r\) in either the sparse co-ordinate, or the sparse row-wise storage scheme. It need not be set for any of the remaining schemes, and in this case can be NULL. |
J_ptr_ne |
is a scalar variable of type ipc_, that holds the length of the pointer array if the sparse row or column storage scheme is used for \(J_r\). For the sparse row scheme, J_ptr_ne should be at least m_r+1, while for the sparse column scheme, it should be at least n+1. It should be set to 0 when the other schemes are used. |
J_ptr |
is a one-dimensional array of size J_ptr_ne and type ipc_, that holds the starting position of each row of \(J_r\), as well as the total number of entries, in the sparse row-wise storage scheme, or the starting position of each column of \(J_r\), as well as the total number of entries, in the sparse column-wise storage scheme. It need not be set when the other schemes are used, and in this case can be NULL. |
void bnls_import_without_jac( struct bnls_control_type* control, void **data, ipc_ *status, ipc_ n, ipc_ m_r )
Import problem data, excluding the structure of \(J_r(x)\), into internal storage prior to solution.
Parameters:
control |
is a struct whose members provide control parameters for the remaining procedures (see bnls_control_type) |
data |
holds private internal data |
status |
is a scalar variable of type ipc_, that gives the exit status from the package. Possible values are:
|
n |
is a scalar variable of type ipc_, that holds the number of variables. |
m_r |
is a scalar variable of type ipc_, that holds the number of residuals. |
void bnls_reset_control( struct bnls_control_type* control, void **data, ipc_ *status )
Reset control parameters after import if required.
Parameters:
control |
is a struct whose members provide control parameters for the remaining procedures (see bnls_control_type) |
data |
holds private internal data |
status |
is a scalar variable of type ipc_, that gives the exit status from the package. Possible values are:
|
void bnls_solve_with_jac( void **data, void *userdata, ipc_ *status, ipc_ n, ipc_ m_r, rpc_ x_l[], rpc_ x_u[], rpc_ x[], rpc_ z[], rpc_ r[], rpc_ g[], ipc_ x_stat[], ipc_(*)(ipc_, ipc_, const rpc_[], rpc_[], const void*) eval_r, ipc_ jr_ne, ipc_(*)(ipc_, ipc_, ipc_, const rpc_[], rpc_[], const void*) eval_jr, const rpc_ w[] )
Solve the bound-constrained nonlinear least-squares problem when the Jacobian \(J_r(x)\) is available by function calls.
Parameters:
data |
holds private internal data |
userdata |
is a structure that allows data to be passed into the function and derivative evaluation programs. |
status |
is a scalar variable of type ipc_, that gives the entry and exit status from the package. On initial entry, status must be set to 1. Possible exit values are:
|
n |
is a scalar variable of type ipc_, that holds the number of variables. |
m_r |
is a scalar variable of type ipc_, that holds the number of residuals. |
x_l |
is a one-dimensional array of size n and type rpc_, that holds the lower bounds \(x^l\) on the variables \(x\). The j-th component of x_l, j = 0, … , n-1, contains \(x^l_j\). |
x_u |
is a one-dimensional array of size n and type rpc_, that holds the upper bounds \(x^u\) on the variables \(x\). The j-th component of x_u, j = 0, … , n-1, contains \(x^u_j\). |
x |
is a one-dimensional array of size n and type rpc_, that holds the values \(x\) of the optimization variables. The j-th component of x, j = 0, … , n-1, contains \(x_j\). |
z |
is a one-dimensional array of size n and type rpc_, that holds the values \(z\) of the dual variables. The j-th component of z, j = 0, … , n-1, contains \(z_j\). |
r |
is a one-dimensional array of size m_r and type rpc_, that holds the residual \(r(x)\). The i-th component of r, i = 0, … , m_r-1, contains \(r_i(x)\). |
g |
is a one-dimensional array of size n and type rpc_, that holds the gradient \(g = \nabla_xf(x)\) of the objective function. The j-th component of g, j = 0, … , n-1, contains \(g_j\). |
x_stat |
is a one-dimensional array of size n and type ipc_, that gives the optimal status of the problem variables. If x_stat[j] is negative, the variable \(x_j\) most likely lies at its lower bound, while if it is zero, \(x_j\) is free of its bounds (or unconstrained). |
eval_r |
is a user-supplied function that must have the following signature: ipc_ eval_r( ipc_ n, ipc_ m_r, const rpc_ x[], rpc_ r[], const void *userdata ) The components of the residual function \(r(x)\) evaluated at x= \(x\) must be assigned to r, and the function return value set to 0. If the evaluation is impossible at x, return should be set to a nonzero value. Data may be passed into eval_r via the structure userdata. |
jr_ne |
is a scalar variable of type ipc_, that holds the number of entries in the Jacobian matrix \(J_r\). |
eval_jr |
is a user-supplied function that must have the following signature: ipc_ eval_jr( ipc_ n, ipc_ m_r, ipc_ jr_ne, const rpc_ x[], rpc_ jr[], const void *userdata ) The components of the Jacobian \(J_r = \nabla_x r(x)\) of the residuals must be assigned to jr in the same order as presented to bnls_import, and the function return value set to 0. If the evaluation is impossible at x, return should be set to a nonzero value. Data may be passed into eval_jr via the structure userdata. |
w |
is a one-dimensional array of size m_r and type rpc_ that holds the values \(w\) of the weights on the residuals in the least-squares objective function. It need not be set if the weights are all ones, and in this case can be NULL. |
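As an illustration of the two callbacks, the sketch below implements eval_r and eval_jr for a small made-up problem with n = 2 and m_r = 3, whose residuals are \(r_0 = x_0^2 + p\), \(r_1 = x_0 + x_1^2\) and \(r_2 = x_0 - x_1\). The problem, the struct userdata_type and its member p are assumptions chosen for the example. The typedefs are repeated locally so that the fragment stands alone, and the Jacobian entries are assumed to have been declared to bnls_import in the co-ordinate order (0,0), (1,0), (1,1), (2,0), (2,1).

```c
typedef double rpc_;  /* local stand-ins for the package typedefs */
typedef int ipc_;

/* Hypothetical user data: a single parameter p of the residuals. */
struct userdata_type {
    rpc_ p;
};

/* Residuals r(x) of the example problem; returns 0 on success. */
ipc_ eval_r(ipc_ n, ipc_ m_r, const rpc_ x[], rpc_ r[], const void *userdata)
{
    const struct userdata_type *data = userdata;
    if (n != 2 || m_r != 3) return 1;  /* evaluation impossible */
    r[0] = x[0] * x[0] + data->p;
    r[1] = x[0] + x[1] * x[1];
    r[2] = x[0] - x[1];
    return 0;
}

/* Nonzero Jacobian values, in the assumed co-ordinate order
   (0,0), (1,0), (1,1), (2,0), (2,1); returns 0 on success. */
ipc_ eval_jr(ipc_ n, ipc_ m_r, ipc_ jr_ne, const rpc_ x[], rpc_ jr[],
             const void *userdata)
{
    (void) userdata;  /* the Jacobian does not depend on p */
    if (n != 2 || m_r != 3 || jr_ne != 5) return 1;
    jr[0] = 2.0 * x[0];  /* d r_0 / d x_0 */
    jr[1] = 1.0;         /* d r_1 / d x_0 */
    jr[2] = 2.0 * x[1];  /* d r_1 / d x_1 */
    jr[3] = 1.0;         /* d r_2 / d x_0 */
    jr[4] = -1.0;        /* d r_2 / d x_1 */
    return 0;
}
```

The address of a struct userdata_type instance would be passed as the userdata argument of bnls_solve_with_jac, and is handed back unchanged to each callback.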
void bnls_solve_with_jacprod( void **data, void *userdata, ipc_ *status, ipc_ n, ipc_ m_r, rpc_ x_l[], rpc_ x_u[], rpc_ x[], rpc_ z[], rpc_ r[], rpc_ g[], ipc_ x_stat[], ipc_(*)(ipc_, ipc_, const rpc_[], rpc_[], const void*) eval_r, ipc_(*)(ipc_, ipc_, const rpc_[], const bool, rpc_[], const rpc_[], bool, const void*) eval_jr_prod, ipc_(*)(ipc_, ipc_, const rpc_[], const rpc_[], rpc_[], const ipc_[], ipc_, ipc_, ipc_[], ipc_*, bool, const void *) eval_jr_prods, ipc_(*)(ipc_, ipc_, const rpc_[], const bool, const rpc_[], rpc_[], const ipc_[], ipc_, bool, const void*) eval_jr_sprod, const rpc_ w[] )
Solve the bound-constrained nonlinear least-squares problem when the products of the Jacobian \(J_r(x)\) and its transpose are available by function calls.
Parameters:
data |
holds private internal data |
userdata |
is a structure that allows data to be passed into the function and derivative evaluation programs. |
status |
is a scalar variable of type ipc_, that gives the entry and exit status from the package. On initial entry, status must be set to 1. Possible exit values are:
|
n |
is a scalar variable of type ipc_, that holds the number of variables. |
m_r |
is a scalar variable of type ipc_, that holds the number of residuals. |
x_l |
is a one-dimensional array of size n and type rpc_, that holds the lower bounds \(x^l\) on the variables \(x\). The j-th component of x_l, j = 0, … , n-1, contains \(x^l_j\). |
x_u |
is a one-dimensional array of size n and type rpc_, that holds the upper bounds \(x^u\) on the variables \(x\). The j-th component of x_u, j = 0, … , n-1, contains \(x^u_j\). |
x |
is a one-dimensional array of size n and type rpc_, that holds the values \(x\) of the optimization variables. The j-th component of x, j = 0, … , n-1, contains \(x_j\). |
z |
is a one-dimensional array of size n and type rpc_, that holds the values \(z\) of the dual variables. The j-th component of z, j = 0, … , n-1, contains \(z_j\). |
r |
is a one-dimensional array of size m_r and type rpc_, that holds the residual \(r(x)\). The i-th component of r, i = 0, … , m_r-1, contains \(r_i(x)\). |
g |
is a one-dimensional array of size n and type rpc_, that holds the gradient \(g = \nabla_xf(x)\) of the objective function. The j-th component of g, j = 0, … , n-1, contains \(g_j\). |
x_stat |
is a one-dimensional array of size n and type ipc_, that gives the optimal status of the problem variables. If x_stat[j] is negative, the variable \(x_j\) most likely lies at its lower bound, while if it is zero, \(x_j\) is free of its bounds (or unconstrained). |
eval_r |
is a user-supplied function that must have the following signature: ipc_ eval_r( ipc_ n, ipc_ m_r, const rpc_ x[], rpc_ r[], const void *userdata ) The components of the residual function \(r(x)\) evaluated at x= \(x\) must be assigned to r, and the function return value set to 0. If the evaluation is impossible at x, return should be set to a nonzero value. Data may be passed into eval_r via the structure userdata. |
eval_jr_prod |
is a user-supplied function that must have the following signature: ipc_ eval_jr_prod( ipc_ n, ipc_ m_r, const rpc_ x[], bool transpose, const rpc_ v[], rpc_ p[], bool got_jr, const void *userdata ) The product \(p = J_r(x) v\) (if transpose is false) or \(p = J_r^T(x) v\) (if transpose is true) of the Jacobian \(J_r(x) = \nabla_{x}r(x)\), or its transpose, with the vector v= \(v\) must be returned in p, and the function return value set to 0. If the evaluation is impossible at x, return should be set to a nonzero value. Data may be passed into eval_jr_prod via the structure userdata. |
eval_jr_prods |
is a user-supplied function that must have the following signature: ipc_ eval_jr_prods( ipc_ n, ipc_ m_r, const rpc_ x[], const rpc_ v[], rpc_ p[], const ipc_ iv[], ipc_ lvl, ipc_ lvu, ipc_ ip[], ipc_ *lp, bool got_jr, const void *userdata ) The product \(p = J_r(x) v\) of the Jacobian \(J_r(x) = \nabla_{x}r(x)\) evaluated at x\(=x\) with the vector v=\(v\) must be returned in p, and the function return value set to 0. Only the components iv[lvl:lvu] of \(v\) will be nonzero. If ip or lp is NULL, the whole of p[0:m_r-1] should be filled. Otherwise, only the lp nonzero components p[ip[0:lp-1]] need be specified, and ip and lp returned accordingly. If the evaluation is impossible at x, return should be set to a nonzero value. Data may be passed into eval_jr_prods via the structure userdata. |
eval_jr_sprod |
is a user-supplied function that must have the following signature: ipc_ eval_jr_sprod( ipc_ n, ipc_ m_r, const rpc_ x[], const bool transpose, const rpc_ v[], rpc_ p[], const ipc_ free[], ipc_ n_free, bool got_jr, const void *userdata ) The product \(J_r(x) v\) (if transpose is false) or \(J_r^T(x) v\) (if transpose is true) of the Jacobian \(J_r(x) = \nabla_{x}r(x)\), or its transpose, with the vector v=\(v\) must be returned in p, and the function return value set to 0. If transpose is false, only the components free[0 : n_free-1] of \(v\) will be nonzero, while if transpose is true, only the components free[0 : n_free-1] of p should be set. If the evaluation is impossible at x, return should be set to a nonzero value. Data may be passed into eval_jr_sprod via the structure userdata. |
w |
is a one-dimensional array of size m_r and type rpc_ that holds the values \(w\) of the weights on the residuals in the least-squares objective function. It need not be set if the weights are all ones, and in this case can be NULL. |
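The product callbacks can be illustrated on a small made-up problem with n = 2 and m_r = 3, whose Jacobian has columns \((2 x_0, 1, 1)^T\) and \((0, 2 x_1, -1)^T\). The sketch below is one possible reading of the descriptions above, not a reference implementation: the helper jr_column is hypothetical, the got_jr flag is ignored, the index array is named free_idx rather than free, and it is assumed that free_idx holds the indices of the nonzero components of \(v\) (or of the components of p to be set when transpose is true).

```c
#include <stdbool.h>

typedef double rpc_;  /* local stand-ins for the package typedefs */
typedef int ipc_;

enum { N = 2, M_R = 3 };

/* Hypothetical helper: column j of the example Jacobian J_r(x). */
static void jr_column(const rpc_ x[], ipc_ j, rpc_ col[M_R])
{
    if (j == 0) { col[0] = 2.0 * x[0]; col[1] = 1.0; col[2] = 1.0; }
    else        { col[0] = 0.0; col[1] = 2.0 * x[1]; col[2] = -1.0; }
}

/* p = J_r(x) v (transpose false) or p = J_r(x)^T v (transpose true). */
ipc_ eval_jr_prod(ipc_ n, ipc_ m_r, const rpc_ x[], bool transpose,
                  const rpc_ v[], rpc_ p[], bool got_jr, const void *userdata)
{
    (void) got_jr; (void) userdata;  /* unused in this sketch */
    if (n != N || m_r != M_R) return 1;
    rpc_ col[M_R];
    if (!transpose)
        for (ipc_ i = 0; i < M_R; i++) p[i] = 0.0;
    for (ipc_ j = 0; j < N; j++) {
        jr_column(x, j, col);
        if (transpose) {
            p[j] = 0.0;
            for (ipc_ i = 0; i < M_R; i++) p[j] += col[i] * v[i];
        } else {
            for (ipc_ i = 0; i < M_R; i++) p[i] += col[i] * v[j];
        }
    }
    return 0;
}

/* As eval_jr_prod, but restricted to the columns listed in free_idx. */
ipc_ eval_jr_sprod(ipc_ n, ipc_ m_r, const rpc_ x[], const bool transpose,
                   const rpc_ v[], rpc_ p[], const ipc_ free_idx[],
                   ipc_ n_free, bool got_jr, const void *userdata)
{
    (void) got_jr; (void) userdata;  /* unused in this sketch */
    if (n != N || m_r != M_R) return 1;
    rpc_ col[M_R];
    if (!transpose)
        for (ipc_ i = 0; i < M_R; i++) p[i] = 0.0;
    for (ipc_ k = 0; k < n_free; k++) {
        ipc_ j = free_idx[k];
        jr_column(x, j, col);
        if (transpose) {
            /* only the listed components of p are set */
            p[j] = 0.0;
            for (ipc_ i = 0; i < M_R; i++) p[j] += col[i] * v[i];
        } else {
            /* only the listed components of v contribute */
            for (ipc_ i = 0; i < M_R; i++) p[i] += col[i] * v[j];
        }
    }
    return 0;
}
```

Working column by column keeps the two callbacks nearly identical, which is convenient when the Jacobian is never formed explicitly.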
void bnls_solve_reverse_with_jac( void **data, ipc_ *status, ipc_ *eval_status, ipc_ n, ipc_ m_r, rpc_ x_l[], rpc_ x_u[], rpc_ x[], rpc_ z[], rpc_ r[], rpc_ g[], ipc_ x_stat[], ipc_ jr_ne, rpc_ Jr_val[], const rpc_ w[] )
Solve the bound-constrained nonlinear least-squares problem when the Jacobian \(J_r(x)\) may be computed by the calling program.
Parameters:
data |
holds private internal data |
status |
is a scalar variable of type ipc_, that gives the entry and exit status from the package. On initial entry, status must be set to 1. Possible exit values are:
|
eval_status |
is a scalar variable of type ipc_, that is used to indicate if objective function/gradient/Hessian values can be provided (see above) |
n |
is a scalar variable of type ipc_, that holds the number of variables. |
m_r |
is a scalar variable of type ipc_, that holds the number of residuals. |
x_l |
is a one-dimensional array of size n and type rpc_, that holds the lower bounds \(x^l\) on the variables \(x\). The j-th component of x_l, j = 0, … , n-1, contains \(x^l_j\). |
x_u |
is a one-dimensional array of size n and type rpc_, that holds the upper bounds \(x^u\) on the variables \(x\). The j-th component of x_u, j = 0, … , n-1, contains \(x^u_j\). |
x |
is a one-dimensional array of size n and type rpc_, that holds the values \(x\) of the optimization variables. The j-th component of x, j = 0, … , n-1, contains \(x_j\). |
z |
is a one-dimensional array of size n and type rpc_, that holds the values \(z\) of the dual variables. The j-th component of z, j = 0, … , n-1, contains \(z_j\). |
r |
is a one-dimensional array of size m_r and type rpc_, that holds the residual \(r(x)\). The i-th component of r, i = 0, … , m_r-1, contains \(r_i(x)\). See status = 2, above, for more details. |
g |
is a one-dimensional array of size n and type rpc_, that holds the gradient \(g = \nabla_xf(x)\) of the objective function. The j-th component of g, j = 0, … , n-1, contains \(g_j\). |
x_stat |
is a one-dimensional array of size n and type ipc_, that gives the optimal status of the problem variables. If x_stat[j] is negative, the variable \(x_j\) most likely lies at its lower bound, while if it is zero, \(x_j\) is free of its bounds (or unconstrained). |
jr_ne |
is a scalar variable of type ipc_, that holds the number of entries in the Jacobian matrix \(J_r\). |
Jr_val |
is a one-dimensional array of size jr_ne and type rpc_, that holds the values of the entries of the Jacobian matrix \(J_r\) in any of the available storage schemes. See status = 3, above, for more details. |
w |
is a one-dimensional array of size m_r and type rpc_ that holds the values \(w\) of the weights on the residuals in the least-squares objective function. It need not be set if the weights are all ones, and in this case can be NULL. |
void bnls_solve_reverse_with_jacprod( void **data, ipc_ *status, ipc_ *eval_status, ipc_ n, ipc_ m_r, rpc_ x_l[], rpc_ x_u[], rpc_ x[], rpc_ z[], rpc_ r[], rpc_ g[], ipc_ x_stat[], rpc_ v[], ipc_ iv[], rpc_ v[], ipc_ *lvl, ipc_ *lvu, const rpc_ p[], const ipc_ ip[], ipc_ lp, const rpc_ w[] )
Solve the bound-constrained nonlinear least-squares problem when the products of the Jacobian \(J_r(x)\) and its transpose with specified vectors may be computed by the calling program.
Parameters:
data |
holds private internal data |
status |
is a scalar variable of type ipc_, that gives the entry and exit status from the package. On initial entry, status must be set to 1. Possible exit values are:
|
eval_status |
is a scalar variable of type ipc_, that is used to indicate if objective function/gradient/Hessian values can be provided (see above) |
n |
is a scalar variable of type ipc_, that holds the number of variables. |
m_r |
is a scalar variable of type ipc_, that holds the number of residuals. |
x_l |
is a one-dimensional array of size n and type rpc_, that holds the lower bounds \(x^l\) on the variables \(x\). The j-th component of x_l, j = 0, … , n-1, contains \(x^l_j\). |
x_u |
is a one-dimensional array of size n and type rpc_, that holds the upper bounds \(x^u\) on the variables \(x\). The j-th component of x_u, j = 0, … , n-1, contains \(x^u_j\). |
x |
is a one-dimensional array of size n and type rpc_, that holds the values \(x\) of the optimization variables. The j-th component of x, j = 0, … , n-1, contains \(x_j\). |
z |
is a one-dimensional array of size n and type rpc_, that holds the values \(z\) of the dual variables. The j-th component of z, j = 0, … , n-1, contains \(z_j\). |
r |
is a one-dimensional array of size m_r and type rpc_, that holds the residual \(r(x)\). The i-th component of r, i = 0, … , m_r-1, contains \(r_i(x)\). See status = 2, above, for more details. |
g |
is a one-dimensional array of size n and type rpc_, that holds the gradient \(g = \nabla_xf(x)\) of the objective function. The j-th component of g, j = 0, … , n-1, contains \(g_j\). |
x_stat |
is a one-dimensional array of size n and type ipc_, that gives the optimal status of the problem variables. If x_stat[j] is negative, the variable \(x_j\) most likely lies at its lower bound, while if it is zero, \(x_j\) is free of its bound (or unconstrained). |
v |
is a one-dimensional array of size max(n,m_r) and type rpc_, that is used for reverse communication. See status = 4, 5, 7 and 8 above for more details. |
iv |
is a one-dimensional array of size max(n,m_r) and type ipc_, that is used for reverse communication. See status = 7 and 8 above for more details. |
lvl |
is a scalar variable of type ipc_, that is used for reverse communication. See status = 7 and 8 above for more details. |
lvu |
is a scalar variable of type ipc_, that is used for reverse communication. See status = 7 and 8 above for more details. |
index |
is a scalar variable of type ipc_, that is used for reverse communication. See status = 6 above for more details. |
p |
is a one-dimensional array of size max(n,m_r) and type rpc_, that is used for reverse communication. See status = 4 to 8 above for more details. |
ip |
is a one-dimensional array of size n and type ipc_, that is used for reverse communication. See status = 6 above for more details. |
lp |
is a scalar variable of type ipc_, that is used for reverse communication. See status = 6 above for more details. |
w |
is a one-dimensional array of size m_r and type rpc_ that holds the values \(w\) of the weights on the residuals in the least-squares objective function. It need not be set if the weights are all ones, and in this case can be NULL. |
void bnls_information(void **data, struct bnls_inform_type* inform, ipc_ *status)
Provides output information
Parameters:
data |
holds private internal data |
inform |
is a struct containing output information (see bnls_inform_type) |
status |
is a scalar variable of type ipc_, that gives the exit status from the package. Possible values are (currently):
|
void bnls_terminate( void **data, struct bnls_control_type* control, struct bnls_inform_type* inform )
Deallocate all internal private storage
Parameters:
data |
holds private internal data |
control |
is a struct containing control information (see bnls_control_type) |
inform |
is a struct containing output information (see bnls_inform_type) |
available structures#
bnls_control_type structure#
#include <galahad_bnls.h>

struct bnls_control_type {
    // components
    bool f_indexing;
    ipc_ error;
    ipc_ out;
    ipc_ print_level;
    ipc_ start_print;
    ipc_ stop_print;
    ipc_ print_gap;
    ipc_ maxit;
    ipc_ alive_unit;
    char alive_file[31];
    ipc_ jacobian_available;
    ipc_ subproblem_solver;
    ipc_ non_monotone;
    ipc_ weight_update_strategy;
    rpc_ infinity;
    rpc_ stop_r_absolute;
    rpc_ stop_r_relative;
    rpc_ stop_pg_absolute;
    rpc_ stop_pg_relative;
    rpc_ stop_s;
    rpc_ stop_pg_switch;
    rpc_ initial_weight;
    rpc_ minimum_weight;
    rpc_ eta_successful;
    rpc_ eta_very_successful;
    rpc_ eta_too_successful;
    rpc_ weight_decrease_min;
    rpc_ weight_decrease;
    rpc_ weight_increase;
    rpc_ weight_increase_max;
    rpc_ switch_to_newton;
    rpc_ cpu_time_limit;
    rpc_ clock_time_limit;
    bool newton_acceleration;
    bool magic_step;
    bool print_obj;
    bool space_critical;
    bool deallocate_error_fatal;
    char prefix[31];
    struct blls_control_type blls_control;
    struct bllsb_control_type bllsb_control;
};
detailed documentation#
control derived type as a C struct
components#
bool f_indexing
use C or Fortran sparse matrix indexing
ipc_ error
error and warning diagnostics occur on stream error
ipc_ out
general output occurs on stream out
ipc_ print_level
the level of output required.
\(\leq\) 0 gives no output,
= 1 gives a one-line summary for every iteration,
= 2 gives a summary of the inner iteration for each iteration,
\(\geq\) 3 gives increasingly verbose (debugging) output
ipc_ start_print
any printing will start on this iteration
ipc_ stop_print
any printing will stop on this iteration
ipc_ print_gap
the number of iterations between printing
ipc_ maxit
the maximum number of iterations performed
ipc_ alive_unit
removal of the file alive_file from unit alive_unit terminates execution
char alive_file[31]
see alive_unit
ipc_ jacobian_available
specifies how the Jacobian matrix of first derivatives may be accessed: \(\geq\) 2 if the matrix itself is available, = 1 if it is accessible only via matrix-vector products, and \(\leq\) 0 if it is not available
ipc_ subproblem_solver
the method used to solve the crucial step-determination subproblem. Possible values are
1 a projected-gradient method using GALAHAD’s blls will be used
2 an interior-point method using GALAHAD’s bllsb will be used
3 an interior-point method will initially be used, but a switch to a projected-gradient method will occur when sufficient progress has occurred (see .stop_pg_switch).
ipc_ non_monotone
if non_monotone \(\leq\) 0, a monotone strategy is used; any other value selects a non-monotone strategy with that history length
ipc_ weight_update_strategy
define the weight-update strategy: 1 (basic), 2 (reset to zero when very successful), 3 (imitate TR), 4 (increase lower bound), 5 (GPT)
rpc_ infinity
any variable bound larger than infinity in modulus will be regarded as infinite
rpc_ stop_r_absolute
overall convergence tolerances. The iteration will terminate when \(\|r(x)\|_W \leq\) MAX( .stop_r_absolute, .stop_r_relative \(* \|r(x_0)\|_W\) ), or when the gradient \(g(x) = J_r^T(x) W r(x)\) satisfies \(\|P[x-g(x)]-x\|_2 \leq\) MAX( .stop_pg_absolute, .stop_pg_relative \(* \|P[x_0-g(x_0)]-x_0\|_2\) ), or if the norm of the step is less than .stop_s, where \(x_0\) is the initial point.
rpc_ stop_r_relative
see stop_r_absolute
rpc_ stop_pg_absolute
see stop_r_absolute
rpc_ stop_pg_relative
see stop_r_absolute
rpc_ stop_s
see stop_r_absolute
rpc_ stop_pg_switch
the step-computation solver will switch from an interior-point method to a projected-gradient one if .subproblem_solver = 3 (see above) and \(\|P[x-g(x)]-x\|_2 \leq\) MAX( .stop_pg_absolute, .stop_pg_switch \(* \|P[x_0-g(x_0)]-x_0\|_2\) ).
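The projected-gradient tests used by the stop_pg_* tolerances can be sketched as follows. This is only an illustration under stated assumptions: \(P\) is taken to be the componentwise projection onto the bounds \(x^l \leq x \leq x^u\), the helper names are hypothetical, and plain int/double stand in for ipc_/rpc_. The squared norm is returned so that no square root is needed; the test then reads "squared norm \(\leq\) tolerance squared".

```c
/* Hypothetical helper: squared two-norm of P[x - g] - x, where P is
   ASSUMED to project componentwise onto the box x_l <= x <= x_u. */
double projected_gradient_norm_sq( int n, const double x[], const double g[],
                                   const double x_l[], const double x_u[] ){
    double sum = 0.0;
    for( int j = 0; j < n; j++ ){
        double t = x[j] - g[j];           /* unconstrained gradient step */
        if( t < x_l[j] ) t = x_l[j];      /* clip to the lower bound */
        if( t > x_u[j] ) t = x_u[j];      /* clip to the upper bound */
        sum += ( t - x[j] ) * ( t - x[j] );
    }
    return sum;
}

/* Hypothetical helper: the tolerance MAX( stop_abs, stop_rel * norm_pg0 ),
   where norm_pg0 is the projected-gradient norm at the initial point. */
double pg_tolerance( double stop_abs, double stop_rel, double norm_pg0 ){
    double tol = stop_rel * norm_pg0;
    return ( stop_abs > tol ) ? stop_abs : tol;
}
```

With these helpers, a caller would regard the test as satisfied when projected_gradient_norm_sq(...) is no larger than the square of pg_tolerance(...).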
rpc_ initial_weight
initial value for the regularization weight (-ve => \(1/\|g_0\|\))
rpc_ minimum_weight
minimum permitted regularization weight
rpc_ eta_successful
a potential iterate will only be accepted if the actual decrease f - f(x_new) is larger than .eta_successful times that predicted by a quadratic model of the decrease. The regularization weight will be decreased if this relative decrease is greater than .eta_very_successful but smaller than .eta_too_successful.
rpc_ eta_very_successful
see eta_successful
rpc_ eta_too_successful
see eta_successful
rpc_ weight_decrease_min
on very successful iterations, the regularization weight will be reduced by the factor .weight_decrease but no more than .weight_decrease_min, while if the iteration is unsuccessful, the weight will be increased by a factor .weight_increase but no more than .weight_increase_max (these are delta_1, delta_2, delta_3 and delta_max in Gould, Porcelli and Toint, 2011)
rpc_ weight_decrease
see weight_decrease_min
rpc_ weight_increase
see weight_decrease_min
rpc_ weight_increase_max
see weight_decrease_min
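The acceptance test and weight adjustment described by the eta_* and weight_* components can be sketched roughly as below. This is a simplified illustration, not the package's actual update: it assumes the predicted decrease is positive, applies single factors only, and ignores the limits .weight_decrease_min and .weight_increase_max. The struct step_params and the function assess_step are hypothetical names introduced for this sketch.

```c
/* Hypothetical parameter bundle mirroring a few bnls_control_type fields.
   Any values stored in it are the caller's choice, not package defaults. */
struct step_params {
    double eta_successful, eta_very_successful, eta_too_successful;
    double weight_decrease, weight_increase, minimum_weight;
};

/* Returns 1 if the trial point is accepted and 0 otherwise, updating
   *weight in place. Here actual = f(x) - f(x_new) and predicted is the
   decrease forecast by the model (ASSUMED to be positive). */
int assess_step( const struct step_params *c, double actual,
                 double predicted, double *weight ){
    if( actual <= c->eta_successful * predicted ){
        *weight *= c->weight_increase;    /* unsuccessful: regularize more */
        return 0;
    }
    if( actual > c->eta_very_successful * predicted &&
        actual < c->eta_too_successful * predicted ){
        *weight *= c->weight_decrease;    /* very successful: regularize less */
        if( *weight < c->minimum_weight ) *weight = c->minimum_weight;
    }
    return 1;                             /* accepted */
}
```

Comparing actual against eta times predicted, rather than forming the ratio, avoids a division when the predicted decrease is tiny.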
rpc_ switch_to_newton
if the value of the two-norm of the projected gradient is less than .switch_to_newton, a switch is made from the Gauss-Newton model to the Newton one when .newton_acceleration is true
rpc_ cpu_time_limit
the maximum CPU time allowed (-ve means infinite)
rpc_ clock_time_limit
the maximum elapsed clock time allowed (-ve means infinite)
bool newton_acceleration
if they are available, second derivatives should be used to accelerate the convergence of the algorithm
bool magic_step
allow the user to perform a “magic” step to improve the objective
bool print_obj
print values of the objective/gradient rather than \(\|r\|\) and its gradient
bool space_critical
if .space_critical is true, every effort will be made to use as little space as possible. This may result in longer computation time
bool deallocate_error_fatal
if .deallocate_error_fatal is true, any array/pointer deallocation error will terminate execution. Otherwise, computation will continue
char prefix[31]
all output lines will be prefixed by .prefix(2:LEN(TRIM(.prefix))-1) where .prefix contains the required string enclosed in quotes, e.g. “string” or ‘string’
struct blls_control_type blls_control
control parameters for BLLS
struct bllsb_control_type bllsb_control
control parameters for BLLSB
bnls_time_type structure#
#include <galahad_bnls.h>

struct bnls_time_type {
    // components
    rpc_ total;
    rpc_ blls;
    rpc_ bllsb;
    rpc_ clock_total;
    rpc_ clock_blls;
    rpc_ clock_bllsb;
};
detailed documentation#
time derived type as a C struct
components#
rpc_ total
the total CPU time spent in the package
rpc_ blls
the CPU time spent in the blls package
rpc_ bllsb
the CPU time spent in the bllsb package
rpc_ clock_total
the total clock time spent in the package
rpc_ clock_blls
the clock time spent in the blls package
rpc_ clock_bllsb
the clock time spent in the bllsb package
bnls_inform_type structure#
#include <galahad_bnls.h>

struct bnls_inform_type {
    // components
    ipc_ status;
    ipc_ alloc_status;
    char bad_alloc[81];
    char bad_eval[13];
    ipc_ iter;
    ipc_ inner_iter;
    ipc_ r_eval;
    ipc_ jr_eval;
    rpc_ obj;
    rpc_ norm_r;
    rpc_ norm_g;
    rpc_ norm_pg;
    rpc_ weight;
    struct bnls_time_type time;
    struct blls_inform_type blls_inform;
    struct bllsb_inform_type bllsb_inform;
};
detailed documentation#
inform derived type as a C struct
components#
ipc_ status
return status. See BNLS_solve for details
ipc_ alloc_status
the status of the last attempted allocation/deallocation
char bad_alloc[81]
the name of the array for which an allocation/deallocation error occurred
char bad_eval[13]
the name of the user-supplied evaluation routine for which an error occurred
ipc_ iter
the total number of iterations performed
ipc_ inner_iter
the total number of inner (projected gradient and/or interior-point) iterations performed
ipc_ r_eval
the total number of evaluations of the residual function \(r(x)\)
ipc_ jr_eval
the total number of evaluations of the Jacobian \(J_r(x)\) of \(r(x)\)
rpc_ obj
the value of the objective function \(\frac{1}{2}\|r(x)\|^2_W\) at the best estimate of the solution, \(x\), determined by BNLS_solve
rpc_ norm_r
the norm of the residual \(\|r(x)\|_W\) at the best estimate of the solution \(x\), determined by BNLS_solve
rpc_ norm_g
the norm of the gradient of the objective function at the best estimate, \(x\), of the solution determined by BNLS_solve
rpc_ norm_pg
the norm of the projected gradient \(\|P[x - J_r^T(x) W r(x)] - x\|_2\) of the residual function at the best estimate, x, of the solution determined by BNLS_solve
rpc_ weight
the final regularization weight used
struct bnls_time_type time
timings (see above)
struct blls_inform_type blls_inform
inform parameters for BLLS
struct bllsb_inform_type bllsb_inform
inform parameters for BLLSB
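To make the relationship between the obj and norm_r components concrete, here is a minimal sketch that evaluates \(\frac{1}{2}\|r\|_W^2\), assuming that \(W\) is the diagonal matrix formed from the weights w and that a NULL weight array means unit weights. The helper name weighted_objective is hypothetical, and plain int/double stand in for ipc_/rpc_. Under the same assumption, \(\|r\|_W^2\) is simply twice the returned value.

```c
#include <stddef.h>

/* Hypothetical helper: 0.5 * sum_i w_i * r_i^2, with W ASSUMED diagonal.
   Passing w == NULL is treated as all weights equal to one. */
double weighted_objective( int m_r, const double r[], const double w[] ){
    double sum = 0.0;
    for( int i = 0; i < m_r; i++ ){
        double wi = ( w != NULL ) ? w[i] : 1.0;
        sum += wi * r[i] * r[i];
    }
    return 0.5 * sum;
}
```

For instance, residuals of the form \(r_0 = x_0 x_1 - 4\) and \(r_i = x_i x_{i+1} - 1\), \(i = 1, 2, 3\), evaluated with every \(x_j = 0.5\) give \(r = (-3.75, -0.75, -0.75, -0.75)\), for which the sketch returns 7.875 with unit weights.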
example calls#
This is an example of how to use the package to solve a bound-constrained nonlinear least-squares problem; the code is available in $GALAHAD/src/bnls/C/bnlst.c .
Notice that C-style indexing is used, and that this is flagged by setting
control.f_indexing to false. The floating-point type rpc_
is set in galahad_precision.h to double by default, but to float
if the preprocessor variable SINGLE is defined. Similarly, the integer
type ipc_ from galahad_precision.h is set to int by default,
but to int64_t if the preprocessor variable INTEGER_64 is defined.
/* bnlst.c */
/* Full test for the BNLS C interface using C sparse matrix indexing */
/* Jari Fowkes & Nick Gould, STFC-Rutherford Appleton Laboratory, 2026 */
#include <stdio.h>
#include <math.h>
#include <string.h>
#include "galahad_precision.h"
#include "galahad_cfunctions.h"
#include "galahad_bnls.h"
#ifdef REAL_128
#include <quadmath.h>
#endif
// Define imax
ipc_ imax(ipc_ a, ipc_ b) {
return (a > b) ? a : b;
};
// Custom userdata struct
struct userdata_type {
rpc_ p;
ipc_ *flag;
ipc_ *flags;
};
// Function prototypes
ipc_ res( ipc_ n, ipc_ m_r, const rpc_ x[], rpc_ r[], const void * );
ipc_ jac( ipc_ n, ipc_ m_r, ipc_ jr_ne, const rpc_ x[],
rpc_ jr_val[], const void * );
ipc_ jacprod( ipc_ n, ipc_ m, const rpc_ x[], const bool transpose,
const rpc_ v[], rpc_ p[], bool got_jr, const void * );
ipc_ jacprods( ipc_ n, ipc_ m_r, const rpc_ x[], const rpc_ v[],
rpc_ p[], const ipc_ iv[], ipc_ lvl, ipc_ lvu,
ipc_ ip[], ipc_ *lp, bool got_jr, const void *userdata );
ipc_ sjacprod( ipc_ n, ipc_ m_r, const rpc_ x[], bool transpose,
const rpc_ v[], rpc_ p[], const ipc_ free[],
ipc_ n_free, bool got_jr, const void * );
int main(void) {
// Derived types
void *data;
struct bnls_control_type control;
struct bnls_inform_type inform;
// Set problem data
ipc_ n = 5; // # variables
ipc_ m_r = 4; // # observations
rpc_ w[] = {1.0, 1.0, 1.0, 1.0}; // weights
ipc_ jr_ne = 8; // Jacobian elements
ipc_ Jr_row[] = {0, 0, 1, 1, 2, 2, 3, 3}; // Jacobian J
ipc_ Jr_col[] = {0, 1, 1, 2, 2, 3, 3, 4};
rpc_ Jr_val[jr_ne];
// Set storage
rpc_ x_l[n]; // lower bounds
rpc_ x_u[n]; // upper bounds
rpc_ x[n]; // variables
rpc_ z[n]; // dual variables
rpc_ r[m_r]; // residual
rpc_ g[n]; // gradient
ipc_ x_stat[n]; // variable status
ipc_ status;
// set variable bounds
for( ipc_ i = 0; i < n; i++) x_l[i] = 0.0; // lower bounds
for( ipc_ i = 0; i < n; i++) x_u[i] = 1.0; // upper bounds
// set up array to flag current nonzeros in a vector
ipc_ flag = 0; // current flag value
ipc_ flags[m_r]; // array of flags
for( ipc_ i = 0; i < m_r; i++) flags[i] = 0;
// Set user data
struct userdata_type userdata;
userdata.p = 4.0;
userdata.flag = &flag;
userdata.flags = flags;
printf(" C sparse matrix indexing\n\n");
// solve when Jacobian is available via function calls
// Initialize BNLS
bnls_initialize( &data, &control, &inform );
// Set user-defined control options
control.f_indexing = false; // C sparse matrix indexing
//control.print_level = 1;
//control.blls_control.print_level = 1;
control.jacobian_available = 2;
#ifdef REAL_32
control.stop_pg_absolute = 0.0001;
#else
control.stop_pg_absolute = 0.00001;
#endif
strcpy(control.blls_control.sbls_control.definite_linear_solver, "potr ");
strcpy(control.blls_control.sbls_control.symmetric_linear_solver, "sytr ");
for( ipc_ i = 0; i < n; i++) x[i] = 0.5; // starting point
bnls_import( &control, &data, &status, n, m_r,
"coordinate", jr_ne, Jr_row, Jr_col, 0, NULL );
bnls_solve_with_jac( &data, &userdata, &status, n, m_r, x_l, x_u,
x, z, r, g, x_stat, res, jr_ne, jac, w );
bnls_information( &data, &inform, &status );
if(inform.status == 0){
printf(" BNLS(JF):%6" d_ipc_ " iterations. Optimal objective value"
" = %5.2f status = %1" d_ipc_ "\n",
inform.iter, (double)inform.obj, inform.status);
}else{
printf(" BNLS(JF): exit status = %1" d_ipc_ "\n", inform.status);
}
// Delete internal workspace
bnls_terminate( &data, &control, &inform );
// solve when Jacobian products are available via function calls
// Initialize BNLS
bnls_initialize( &data, &control, &inform );
// Set user-defined control options
control.f_indexing = false; // C sparse matrix indexing
//control.print_level = 1;
//control.blls_control.print_level = 1;
//control.maxit = 10;
//control.blls_control.maxit = 10;
control.jacobian_available = 1;
#ifdef REAL_32
control.stop_pg_absolute = 0.005;
#else
control.stop_pg_absolute = 0.00001;
#endif
strcpy(control.blls_control.sbls_control.definite_linear_solver, "potr ");
strcpy(control.blls_control.sbls_control.symmetric_linear_solver, "sytr ");
for( ipc_ i = 0; i < n; i++) x[i] = 0.5; // starting point
bnls_import_without_jac( &control, &data, &status, n, m_r );
bnls_solve_with_jacprod( &data, &userdata, &status,
n, m_r, x_l, x_u, x, z, r, g, x_stat,
res, jacprod, jacprods, sjacprod, w );
bnls_information( &data, &inform, &status );
if(inform.status == 0){
printf(" BNLS(PF):%6" d_ipc_ " iterations. Optimal objective value"
" = %5.2f status = %1" d_ipc_ "\n",
inform.iter, (double)inform.obj, inform.status);
}else{
printf(" BNLS(PF): exit status = %1" d_ipc_ "\n", inform.status);
}
// Delete internal workspace
bnls_terminate( &data, &control, &inform );
// reverse-communication input/output
ipc_ mnm, lp;
mnm = imax( m_r, n );
lp = 0;
ipc_ eval_status, lvl, lvu;
ipc_ iv[mnm], ip[mnm];
rpc_ v[mnm], p[mnm];
bool got_jr;
// solve when Jacobian is available via reverse access
// Initialize BNLS
bnls_initialize( &data, &control, &inform );
// Set user-defined control options
control.f_indexing = false; // C sparse matrix indexing
//control.print_level = 1;
//control.blls_control.print_level = 1;
control.jacobian_available = 2;
#ifdef REAL_32
control.stop_pg_absolute = 0.0001;
#else
control.stop_pg_absolute = 0.00001;
#endif
strcpy(control.blls_control.sbls_control.definite_linear_solver, "potr ");
strcpy(control.blls_control.sbls_control.symmetric_linear_solver, "sytr ");
for( ipc_ i = 0; i < n; i++) x[i] = 0.5; // starting point
bnls_import( &control, &data, &status, n, m_r,
"coordinate", jr_ne, Jr_row, Jr_col, 0, NULL );
while(true){ // reverse-communication loop
bnls_solve_reverse_with_jac( &data, &status, &eval_status,
n, m_r, x_l, x_u, x, z, r, g, x_stat,
jr_ne, Jr_val, w );
if(status == 0){ // successful termination
break;
}else if(status < 0){ // error exit
break;
}else if(status == 2){ // evaluate r
eval_status = res( n, m_r, x, r, &userdata );
}else if(status == 3){ // evaluate Jr
eval_status = jac( n, m_r, jr_ne, x, Jr_val, &userdata );
}else{
printf(" the value %1" d_ipc_ " of status should not occur\n",
status);
break;
}
}
bnls_information( &data, &inform, &status );
if(inform.status == 0){
printf(" BNLS(JR):%6" d_ipc_ " iterations. Optimal objective value"
" = %5.2f status = %1" d_ipc_ "\n",
inform.iter, (double)inform.obj, inform.status);
}else{
printf(" BNLS(JR): exit status = %1" d_ipc_ "\n", inform.status);
}
// Delete internal workspace
bnls_terminate( &data, &control, &inform );
// solve when Jacobian products are available via reverse access
// Initialize BNLS
bnls_initialize( &data, &control, &inform );
// Set user-defined control options
control.f_indexing = false; // C sparse matrix indexing
//control.print_level = 1;
//control.blls_control.print_level = 1;
control.jacobian_available = 1;
#ifdef REAL_32
control.stop_pg_absolute = 0.0001;
#else
control.stop_pg_absolute = 0.00001;
#endif
strcpy(control.blls_control.sbls_control.definite_linear_solver, "potr ");
strcpy(control.blls_control.sbls_control.symmetric_linear_solver, "sytr ");
lp = mnm;
for( ipc_ i = 0; i < mnm; i++) ip[i] = i;
for( ipc_ i = 0; i < n; i++) x[i] = 0.5; // starting point
bnls_import_without_jac( &control, &data, &status, n, m_r );
while(true){ // reverse-communication loop
bnls_solve_reverse_with_jacprod( &data, &status, &eval_status,
n, m_r, x_l, x_u, x, z, r, g, x_stat,
v, iv, &lvl, &lvu, p, ip, lp, w );
if(status == 0){ // successful termination
break;
}else if(status < 0){ // error exit
break;
}else if(status == 2){ // evaluate r
eval_status = res( n, m_r, x, r, &userdata );
got_jr = false;
}else if(status == 4){ // evaluate p = Jr v
eval_status = jacprod( n, m_r, x, false, v, p, got_jr, &userdata );
}else if(status == 5){ // evaluate p = Jr' v
eval_status = jacprod( n, m_r, x, true, v, p, got_jr, &userdata );
}else if(status == 6){ // evaluate p = Jr * sparse v
eval_status = jacprods( n, m_r, x, v, p, iv, lvl, lvu, NULL, NULL,
got_jr, &userdata );
}else if(status == 7){ // evaluate p = sparse( Jr(x) * sparse v )
eval_status = jacprods( n, m_r, x, v, p, iv, lvl, lvu, ip, &lp,
got_jr, &userdata );
}else if(status == 8){ // evaluate p = sparse(Jr' v)
eval_status = sjacprod( n, m_r, x, true, v, p, iv, lvu,
got_jr, &userdata );
}else{
printf(" the value %1" d_ipc_ " of status should not occur\n",
status);
break;
}
}
bnls_information( &data, &inform, &status );
if(inform.status == 0){
printf(" BNLS(PR):%6" d_ipc_ " iterations. Optimal objective value"
" = %5.2f status = %1" d_ipc_ "\n",
inform.iter, (double)inform.obj, inform.status);
} else {
printf(" BNLS(PR): exit status = %1" d_ipc_ "\n", inform.status);
}
// Delete internal workspace
bnls_terminate( &data, &control, &inform );
printf(" BNLS tests complete\n");
}
// compute the residuals
ipc_ res( ipc_ n, ipc_ m_r, const rpc_ x[], rpc_ r[], const void *userdata ){
struct userdata_type *myuserdata = ( struct userdata_type * ) userdata;
rpc_ p = myuserdata->p;
r[0] = x[0] * x[1] - p;
r[1] = x[1] * x[2] - 1.0;
r[2] = x[2] * x[3] - 1.0;
r[3] = x[3] * x[4] - 1.0;
return 0;
}
// compute the Jacobian
ipc_ jac( ipc_ n, ipc_ m_r, ipc_ jne, const rpc_ x[], rpc_ jr_val[],
const void *userdata ){
jr_val[0] = x[1];
jr_val[1] = x[0];
jr_val[2] = x[2];
jr_val[3] = x[1];
jr_val[4] = x[3];
jr_val[5] = x[2];
jr_val[6] = x[4];
jr_val[7] = x[3];
return 0;
}
// compute Jacobian-vector products
ipc_ jacprod( ipc_ n, ipc_ m_r, const rpc_ x[], const bool transpose,
const rpc_ v[], rpc_ p[], bool got_jr, const void *userdata ){
if (transpose) {
p[0] = x[1] * v[0];
p[1] = x[2] * v[1] + x[0] * v[0];
p[2] = x[3] * v[2] + x[1] * v[1];
p[3] = x[4] * v[3] + x[2] * v[2];
p[4] = x[3] * v[3];
} else {
p[0] = x[1] * v[0] + x[0] * v[1];
p[1] = x[2] * v[1] + x[1] * v[2];
p[2] = x[3] * v[2] + x[2] * v[3];
p[3] = x[4] * v[3] + x[3] * v[4];
}
got_jr = true;
return 0;
}
// compute a sparse product with the Jacobian
ipc_ jacprods( ipc_ n, ipc_ m_r, const rpc_ x[], const rpc_ v[],
rpc_ p[], const ipc_ iv[], ipc_ lvl, ipc_ lvu,
ipc_ ip[], ipc_ *lp, bool got_jr,
const void *userdata ) {
ipc_ i, j;
rpc_ val;
struct userdata_type *myuserdata = ( struct userdata_type * ) userdata;
ipc_ flag = *(myuserdata->flag);
ipc_ *flags = myuserdata->flags;
if (ip != NULL && lp != NULL) {
flag = flag+1;
*lp = 0;
for( ipc_ l = lvl; l <= lvu; l++){
j = iv[l];
val = v[j];
if (j == 0){
i = 0;
if (flags[i] < flag) {
flags[i] = flag;
p[i] = x[i+1] * val;
ip[*lp] = i;
*lp = *lp + 1;
} else {
p[i] = p[i] + x[i+1] * val;
}
} else if (j == n-1) {
i = m_r-1;
if (flags[i] < flag) {
flags[i] = flag;
p[i] = x[i] * val;
ip[*lp] = i;
*lp = *lp + 1;
} else {
p[i] = p[i] + x[i] * val;
}
} else {
i = j - 1;
if (flags[i] < flag) {
flags[i] = flag;
p[i] = x[i] * val;
ip[*lp] = i;
*lp = *lp + 1;
} else {
p[i] = p[i] + x[i] * val;
}
i = j;
if (flags[i] < flag) {
flags[i] = flag;
p[i] = x[i+1] * val;
ip[*lp] = i;
*lp = *lp + 1;
} else {
p[i] = p[i] + x[i+1] * val;
}
}
}
// reset the flags only once every column has been processed
for( ipc_ i = 0; i < *lp; i++) flags[ip[i]] = 0;
} else {
for( ipc_ i = 0; i < m_r; i++) p[i] = 0.0;
for( ipc_ l = lvl; l <= lvu; l++){
j = iv[l];
val = v[j];
if (j == 0) {
i = 0;
p[i] = p[i] + x[i+1] * val;
} else if (j == n-1) {
i = m_r - 1;
p[i] = p[i] + x[i] * val;
} else {
i = j - 1;
p[i] = p[i] + x[i] * val;
i = j;
p[i] = p[i] + x[i+1] * val;
}
}
}
got_jr = true;
return 0;
}
// compute a sparse product with the Jacobian or its transpose
ipc_ sjacprod( ipc_ n, ipc_ m_r, const rpc_ x[], bool transpose,
const rpc_ v[], rpc_ p[], const ipc_ free[], ipc_ n_free,
bool got_jr, const void *userdata ) {
ipc_ j;
rpc_ val;
if (transpose) {
for( ipc_ i = 0; i < n_free; i++) {
j = free[i];
if (j == 0) {
p[0] = x[1] * v[0];
} else if (j == n-1) {
p[n-1] = x[m_r-1] * v[m_r-1];
} else {
p[j] = x[j-1] * v[j-1] + x[j+1] * v[j];
}
}
} else {
for( ipc_ i = 0; i < m_r; i++) p[i] = 0.0;
for( ipc_ i = 0; i < n_free; i++) {
j = free[i];
val = v[j];
if (j == 0) {
p[0] = p[0] + x[1] * val;
} else if (j == n-1) {
p[m_r-1] = p[m_r-1] + x[m_r-1] * val;
} else {
p[j-1] = p[j-1] + x[j-1] * val;
p[j] = p[j] + x[j+1] * val;
}
}
}
got_jr = true;
return 0;
}
This is the same example, but now Fortran-style indexing is used; the code is available in $GALAHAD/src/bnls/C/bnlstf.c .
/* bnlstf.c */
/* Full test for the BNLS C interface using fortran sparse matrix indexing */
/* Jari Fowkes & Nick Gould, STFC-Rutherford Appleton Laboratory, 2026 */
#include <stdio.h>
#include <math.h>
#include <string.h>
#include "galahad_precision.h"
#include "galahad_cfunctions.h"
#include "galahad_bnls.h"
#ifdef REAL_128
#include <quadmath.h>
#endif
// Define imax
ipc_ imax(ipc_ a, ipc_ b) {
return (a > b) ? a : b;
};
// Custom userdata struct
struct userdata_type {
rpc_ p;
ipc_ *flag;
ipc_ *flags;
};
// Function prototypes
ipc_ res( ipc_ n, ipc_ m_r, const rpc_ x[], rpc_ r[], const void * );
ipc_ jac( ipc_ n, ipc_ m_r, ipc_ jr_ne, const rpc_ x[],
rpc_ jr_val[], const void * );
ipc_ jacprod( ipc_ n, ipc_ m, const rpc_ x[], const bool transpose,
const rpc_ v[], rpc_ p[], bool got_jr, const void * );
ipc_ jacprods( ipc_ n, ipc_ m_r, const rpc_ x[], const rpc_ v[],
rpc_ p[], const ipc_ iv[], ipc_ lvl, ipc_ lvu,
ipc_ ip[], ipc_ *lp, bool got_jr, const void *userdata );
ipc_ sjacprod( ipc_ n, ipc_ m_r, const rpc_ x[], bool transpose,
const rpc_ v[], rpc_ p[], const ipc_ free[],
ipc_ n_free, bool got_jr, const void * );
int main(void) {
// Derived types
void *data;
struct bnls_control_type control;
struct bnls_inform_type inform;
// Set problem data
ipc_ n = 5; // # variables
ipc_ m_r = 4; // # observations
rpc_ w[] = {1.0, 1.0, 1.0, 1.0}; // weights
ipc_ jr_ne = 8; // Jacobian elements
ipc_ Jr_row[] = {1, 1, 2, 2, 3, 3, 4, 4}; // Jacobian J
ipc_ Jr_col[] = {1, 2, 2, 3, 3, 4, 4, 5};
rpc_ Jr_val[jr_ne];
// Set storage
rpc_ x_l[n]; // lower bounds
rpc_ x_u[n]; // upper bounds
rpc_ x[n]; // variables
rpc_ z[n]; // dual variables
rpc_ r[m_r]; // residual
rpc_ g[n]; // gradient
ipc_ x_stat[n]; // variable status
ipc_ status;
// set variable bounds
for( ipc_ i = 0; i < n; i++) x_l[i] = 0.0; // lower bounds
for( ipc_ i = 0; i < n; i++) x_u[i] = 1.0; // upper bounds
// set up array to flag current nonzeros in a vector
ipc_ flag = 0; // current flag value
ipc_ flags[m_r]; // array of flags
for( ipc_ i = 0; i < m_r; i++) flags[i] = 0;
// Set user data
struct userdata_type userdata;
userdata.p = 4.0;
userdata.flag = &flag;
userdata.flags = flags;
printf(" fortran sparse matrix indexing\n\n");
// solve when Jacobian is available via function calls
// Initialize BNLS
bnls_initialize( &data, &control, &inform );
// Set user-defined control options
control.f_indexing = true; // fortran sparse matrix indexing
//control.print_level = 1;
//control.blls_control.print_level = 1;
control.jacobian_available = 2;
#ifdef REAL_32
control.stop_pg_absolute = 0.0001;
#else
control.stop_pg_absolute = 0.00001;
#endif
strcpy(control.blls_control.sbls_control.definite_linear_solver, "potr ");
strcpy(control.blls_control.sbls_control.symmetric_linear_solver, "sytr ");
for( ipc_ i = 0; i < n; i++) x[i] = 0.5; // starting point
bnls_import( &control, &data, &status, n, m_r,
"coordinate", jr_ne, Jr_row, Jr_col, 0, NULL );
bnls_solve_with_jac( &data, &userdata, &status, n, m_r, x_l, x_u,
x, z, r, g, x_stat, res, jr_ne, jac, w );
bnls_information( &data, &inform, &status );
if(inform.status == 0){
printf(" BNLS(JF):%6" d_ipc_ " iterations. Optimal objective value"
" = %5.2f status = %1" d_ipc_ "\n",
inform.iter, (double)inform.obj, inform.status);
}else{
printf(" BNLS(JF): exit status = %1" d_ipc_ "\n", inform.status);
}
// Delete internal workspace
bnls_terminate( &data, &control, &inform );
// solve when Jacobian products are available via function calls
// Initialize BNLS
bnls_initialize( &data, &control, &inform );
// Set user-defined control options
control.f_indexing = true; // fortran sparse matrix indexing
//control.print_level = 1;
//control.blls_control.print_level = 1;
//control.maxit = 10;
//control.blls_control.maxit = 5;
control.jacobian_available = 1;
#ifdef REAL_32
control.stop_pg_absolute = 0.005;
#else
control.stop_pg_absolute = 0.00001;
#endif
strcpy(control.blls_control.sbls_control.definite_linear_solver, "potr ");
strcpy(control.blls_control.sbls_control.symmetric_linear_solver, "sytr ");
for( ipc_ i = 0; i < n; i++) x[i] = 0.5; // starting point
bnls_import_without_jac( &control, &data, &status, n, m_r );
bnls_solve_with_jacprod( &data, &userdata, &status,
n, m_r, x_l, x_u, x, z, r, g, x_stat,
res, jacprod, jacprods, sjacprod, w );
bnls_information( &data, &inform, &status );
if(inform.status == 0){
printf(" BNLS(PF):%6" d_ipc_ " iterations. Optimal objective value"
" = %5.2f status = %1" d_ipc_ "\n",
inform.iter, (double)inform.obj, inform.status);
}else{
printf(" BNLS(PF): exit status = %1" d_ipc_ "\n", inform.status);
}
// Delete internal workspace
bnls_terminate( &data, &control, &inform );
// reverse-communication input/output
ipc_ mnm, lp;
mnm = imax( m_r, n );
lp = 0;
ipc_ eval_status, lvl, lvu;
ipc_ iv[mnm], ip[m_r];
rpc_ v[mnm], p[mnm];
bool got_jr;
// solve when Jacobian is available via reverse access
// Initialize BNLS
bnls_initialize( &data, &control, &inform );
// Set user-defined control options
control.f_indexing = true; // fortran sparse matrix indexing
//control.print_level = 1;
//control.blls_control.print_level = 1;
control.jacobian_available = 2;
#ifdef REAL_32
control.stop_pg_absolute = 0.0001;
#else
control.stop_pg_absolute = 0.00001;
#endif
strcpy(control.blls_control.sbls_control.definite_linear_solver, "potr ");
strcpy(control.blls_control.sbls_control.symmetric_linear_solver, "sytr ");
for( ipc_ i = 0; i < n; i++) x[i] = 0.5; // starting point
bnls_import( &control, &data, &status, n, m_r,
"coordinate", jr_ne, Jr_row, Jr_col, 0, NULL );
while(true){ // reverse-communication loop
bnls_solve_reverse_with_jac( &data, &status, &eval_status,
n, m_r, x_l, x_u, x, z, r, g, x_stat,
jr_ne, Jr_val, w );
if(status == 0){ // successful termination
break;
}else if(status < 0){ // error exit
break;
}else if(status == 2){ // evaluate r
eval_status = res( n, m_r, x, r, &userdata );
}else if(status == 3){ // evaluate Jr
eval_status = jac( n, m_r, jr_ne, x, Jr_val, &userdata );
}else{
printf(" the value %1" d_ipc_ " of status should not occur\n",
status);
break;
}
}
bnls_information( &data, &inform, &status );
if(inform.status == 0){
printf(" BNLS(JR):%6" d_ipc_ " iterations. Optimal objective value"
" = %5.2f status = %1" d_ipc_ "\n",
inform.iter, (double)inform.obj, inform.status);
}else{
printf(" BNLS(JR): exit status = %1" d_ipc_ "\n", inform.status);
}
// Delete internal workspace
bnls_terminate( &data, &control, &inform );
// solve when Jacobian products are available via reverse access
// Initialize BNLS
bnls_initialize( &data, &control, &inform );
// Set user-defined control options
control.f_indexing = true; // fortran sparse matrix indexing
//control.print_level = 1;
//control.blls_control.print_level = 1;
//control.maxit = 10;
//control.blls_control.maxit = 5;
control.jacobian_available = 1;
#ifdef REAL_32
control.stop_pg_absolute = 0.0001;
#else
control.stop_pg_absolute = 0.00001;
#endif
strcpy(control.blls_control.sbls_control.definite_linear_solver, "potr ");
strcpy(control.blls_control.sbls_control.symmetric_linear_solver, "sytr ");
for( ipc_ i = 0; i < n; i++) x[i] = 0.5; // starting point
bnls_import_without_jac( &control, &data, &status, n, m_r );
while(true){ // reverse-communication loop
bnls_solve_reverse_with_jacprod( &data, &status, &eval_status,
n, m_r, x_l, x_u, x, z, r, g, x_stat,
v, iv, &lvl, &lvu, p, ip, lp, w );
if(status == 0){ // successful termination
break;
}else if(status < 0){ // error exit
break;
}else if(status == 2){ // evaluate r
eval_status = res( n, m_r, x, r, &userdata );
got_jr = false;
}else if(status == 4){ // evaluate p = Jr v
eval_status = jacprod( n, m_r, x, false, v, p, got_jr, &userdata );
}else if(status == 5){ // evaluate p = Jr' v
eval_status = jacprod( n, m_r, x, true, v, p, got_jr, &userdata );
}else if(status == 6){ // evaluate p = Jr * sparse v
eval_status = jacprods( n, m_r, x, v, p, iv, lvl, lvu, NULL, NULL,
got_jr, &userdata );
}else if(status == 7){ // evaluate p = sparse( Jr(x) * sparse v )
eval_status = jacprods( n, m_r, x, v, p, iv, lvl, lvu, ip, &lp,
got_jr, &userdata );
}else if(status == 8){ // evaluate p = sparse(Jr' v)
eval_status = sjacprod( n, m_r, x, true, v, p, iv, lvu,
got_jr, &userdata );
}else{
printf(" the value %1" d_ipc_ " of status should not occur\n",
status);
break;
}
}
bnls_information( &data, &inform, &status );
if(inform.status == 0){
printf(" BNLS(PR):%6" d_ipc_ " iterations. Optimal objective value"
" = %5.2f status = %1" d_ipc_ "\n",
inform.iter, (double)inform.obj, inform.status);
} else {
printf(" BNLS(PR): exit status = %1" d_ipc_ "\n", inform.status);
}
// Delete internal workspace
bnls_terminate( &data, &control, &inform );
printf(" BNLS tests complete\n");
}
// compute the residuals r_i(x) = x_i * x_{i+1} - 1 (r_0 subtracts p instead of 1)
ipc_ res( ipc_ n, ipc_ m_r, const rpc_ x[], rpc_ r[], const void *userdata ){
const struct userdata_type *myuserdata = ( const struct userdata_type * ) userdata;
rpc_ p = myuserdata->p;
r[0] = x[0] * x[1] - p;
r[1] = x[1] * x[2] - 1.0;
r[2] = x[2] * x[3] - 1.0;
r[3] = x[3] * x[4] - 1.0;
return 0;
}
// compute the Jacobian values, stored row by row: row i holds
// d r_i/d x_i = x[i+1] followed by d r_i/d x_{i+1} = x[i]
ipc_ jac( ipc_ n, ipc_ m_r, ipc_ jne, const rpc_ x[], rpc_ jr_val[],
const void *userdata ){
jr_val[0] = x[1];
jr_val[1] = x[0];
jr_val[2] = x[2];
jr_val[3] = x[1];
jr_val[4] = x[3];
jr_val[5] = x[2];
jr_val[6] = x[4];
jr_val[7] = x[3];
return 0;
}
// compute the Jacobian-vector product p = Jr v, or p = Jr' v when transpose is true
ipc_ jacprod( ipc_ n, ipc_ m_r, const rpc_ x[], const bool transpose,
const rpc_ v[], rpc_ p[], bool got_jr, const void *userdata ){
if (transpose) {
p[0] = x[1] * v[0];
p[1] = x[2] * v[1] + x[0] * v[0];
p[2] = x[3] * v[2] + x[1] * v[1];
p[3] = x[4] * v[3] + x[2] * v[2];
p[4] = x[3] * v[3];
} else {
p[0] = x[1] * v[0] + x[0] * v[1];
p[1] = x[2] * v[1] + x[1] * v[2];
p[2] = x[3] * v[2] + x[2] * v[3];
p[3] = x[4] * v[3] + x[3] * v[4];
}
got_jr = true; // no effect on the caller: got_jr is passed by value
return 0;
}
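// compute p = Jr v for a sparse v whose nonzeros have the 1-based indices
// iv[lvl-1], ..., iv[lvu-1]; if ip and lp are supplied, also record the
// 1-based indices of the nonzeros of p in ip[0], ..., ip[*lp-1]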
ipc_ jacprods( ipc_ n, ipc_ m_r, const rpc_ x[], const rpc_ v[],
rpc_ p[], const ipc_ iv[], ipc_ lvl, ipc_ lvu,
ipc_ ip[], ipc_ *lp, bool got_jr,
const void *userdata ) {
ipc_ i, j;
rpc_ val;
struct userdata_type *myuserdata = ( struct userdata_type * ) userdata;
ipc_ flag = *(myuserdata->flag);
ipc_ *flags = myuserdata->flags;
if (ip != NULL && lp != NULL) {
flag = flag+1;
*lp = 0;
for( ipc_ l=lvl-1; l <= lvu-1; l++){
j = iv[l]-1;
val = v[j];
if (j == 0){
i = 0;
if (flags[i] < flag) {
flags[i] = flag;
p[i] = x[i+1] * val;
ip[*lp] = i+1;
*lp = *lp+1;
} else {
p[i] = p[i] + x[i+1] * val;
}
} else if (j == n-1) {
i = m_r-1;
if (flags[i] < flag) {
flags[i] = flag;
p[i] = x[i] * val;
ip[*lp] = i+1;
*lp = *lp+1;
} else {
p[i] = p[i] + x[i] * val;
}
} else {
i = j-1;
if (flags[i] < flag) {
flags[i] = flag;
p[i] = x[i] * val;
ip[*lp] = i+1;
*lp = *lp+1;
} else {
p[i] = p[i] + x[i] * val;
}
i = j;
if (flags[i] < flag) {
flags[i] = flag;
p[i] = x[i+1] * val;
ip[*lp] = i+1;
*lp = *lp + 1;
} else {
p[i] = p[i] + x[i+1] * val;
}
}
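}
// reset the flags only after every nonzero of v has been processed, so that
// contributions from neighbouring columns accumulate rather than overwrite
for( ipc_ i = 0; i < *lp; i++) flags[ip[i]-1] = 0;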
} else {
for( ipc_ i = 0; i < m_r; i++) p[i] = 0.0;
for( ipc_ l = lvl-1; l <= lvu-1; l++){
j = iv[l]-1;
val = v[j];
if (j == 0) {
i = 0;
p[i] = p[i] + x[i+1] * val;
} else if (j == n-1) {
i = m_r-1;
p[i] = p[i] + x[i] * val;
} else {
i = j-1;
p[i] = p[i] + x[i] * val;
i = j;
p[i] = p[i] + x[i+1] * val;
}
}
}
got_jr = true; // no effect on the caller: got_jr is passed by value
return 0;
}
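// compute p = Jr v using only the columns of Jr whose 1-based indices are
// listed in free or, when transpose is true, only the components of
// p = Jr' v whose 1-based indices are listed in free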
ipc_ sjacprod( ipc_ n, ipc_ m_r, const rpc_ x[], bool transpose,
const rpc_ v[], rpc_ p[], const ipc_ free[], ipc_ n_free,
bool got_jr, const void *userdata ) {
ipc_ j;
rpc_ val;
if (transpose) {
for( ipc_ i = 0; i < n_free; i++) {
j = free[i]-1;
if (j == 0) {
p[0] = x[1] * v[0];
} else if (j == n-1) {
p[n-1] = x[m_r-1] * v[m_r-1];
} else {
p[j] = x[j-1] * v[j-1] + x[j+1] * v[j];
}
}
} else {
for( ipc_ i = 0; i < m_r; i++) p[i] = 0.0;
for( ipc_ i = 0; i < n_free; i++) {
j = free[i]-1;
val = v[j];
if (j == 0) {
p[0] = p[0] + x[1] * val;
} else if (j == n-1) {
p[m_r-1] = p[m_r-1] + x[m_r-1] * val;
} else {
p[j-1] = p[j-1] + x[j-1] * val;
p[j] = p[j] + x[j+1] * val;
}
}
}
got_jr = true; // no effect on the caller: got_jr is passed by value
return 0;
}
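
Hand-coded products with Jr and its transpose, like those in the listing above, are easy to get subtly wrong, so it can help to check them before passing them to the solver. The following is a minimal sketch of two such checks for the same example residuals: a central-difference comparison for Jr v, and the adjoint identity u'(Jr v) = (Jr' u)'v. It is not part of the GALAHAD interface. The names `example_res`, `example_jacprod`, `example_fd_error` and `example_adjoint_gap` are hypothetical, and plain `double` and `int` are assumed in place of `rpc_` and `ipc_`.

```c
#include <assert.h>
#include <stdbool.h>

#define N 5   /* variables in the example problem */
#define M_R 4 /* residuals in the example problem */

/* Absolute value, written out to avoid depending on the math library. */
static double example_abs(double a) { return a < 0.0 ? -a : a; }

/* Residuals r_i(x) = x_i * x_{i+1} - 1, except that r_0 subtracts p. */
static void example_res(const double x[], double p, double r[]) {
    r[0] = x[0] * x[1] - p;
    for (int i = 1; i < M_R; i++) r[i] = x[i] * x[i + 1] - 1.0;
}

/* out = Jr(x) v (length M_R), or out = Jr(x)' v (length N) when transpose. */
static void example_jacprod(const double x[], bool transpose,
                            const double v[], double out[]) {
    if (transpose) {
        for (int j = 0; j < N; j++) out[j] = 0.0;
        for (int i = 0; i < M_R; i++) {
            out[i] += x[i + 1] * v[i]; /* d r_i / d x_i     */
            out[i + 1] += x[i] * v[i]; /* d r_i / d x_{i+1} */
        }
    } else {
        for (int i = 0; i < M_R; i++)
            out[i] = x[i + 1] * v[i] + x[i] * v[i + 1];
    }
}

/* Largest deviation between Jr(x) v and the central difference
   (r(x + h v) - r(x - h v)) / (2 h). The constant p cancels in the
   difference, so any fixed value may be used. */
static double example_fd_error(const double x[], const double v[], double h) {
    double xp[N], xm[N], rp[M_R], rm[M_R], jv[M_R], err = 0.0;
    assert(h > 0.0);
    for (int j = 0; j < N; j++) {
        xp[j] = x[j] + h * v[j];
        xm[j] = x[j] - h * v[j];
    }
    example_res(xp, 1.0, rp);
    example_res(xm, 1.0, rm);
    example_jacprod(x, false, v, jv);
    for (int i = 0; i < M_R; i++) {
        double d = example_abs((rp[i] - rm[i]) / (2.0 * h) - jv[i]);
        if (d > err) err = d;
    }
    return err;
}

/* Adjoint gap |u'(Jr v) - (Jr' u)'v|; it vanishes (up to rounding)
   when the forward and transpose products are consistent. */
static double example_adjoint_gap(const double x[], const double u[],
                                  const double v[]) {
    double jv[M_R], jtu[N], lhs = 0.0, rhs = 0.0;
    example_jacprod(x, false, v, jv);
    example_jacprod(x, true, u, jtu);
    for (int i = 0; i < M_R; i++) lhs += u[i] * jv[i];
    for (int j = 0; j < N; j++) rhs += jtu[j] * v[j];
    return example_abs(lhs - rhs);
}
```

Because these residuals are quadratic in x, the central difference is exact apart from rounding, so `example_fd_error` should return a value close to machine precision divided by `h`. A noticeably larger value, or a nonzero adjoint gap, would suggest an indexing slip in one of the products.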