GALAHAD SHA package#

purpose#

The sha package finds a component-wise secant approximation to the Hessian matrix \(H(x)\), for which \((H(x))_{i,j} = \partial f^2 (x) / \partial x_i \partial x_j\), \(1 \leq i, j \leq n\), using values of the gradient \(g(x) = \nabla_x f(x)\) of the function \(f(x)\) of \(n\) unknowns \(x = (x_1, \ldots, x_n)^T\) at a sequence of given distinct \(\{x^{(k)}\}\), \(k \geq 0\). More specifically, given differences

\[s^{(k)} = x^{(k+1)} - x^{(k)} \;\;\mbox{and}\;\; y^{(k)} = g(x^{(k+1)}) - g(x^{(k)})\]

the package aims to find an approximation \(B\) to \(H(x)\) for which the secant conditions \(B s^{(k)} \approx y^{(k)}\) hold for a chosen set of values \(k\). The methods provided take advantage of the entries in the Hessian that are known to be zero.

The package is particularly intended to allow gradient-based optimization methods, that generate iterates \(x^{(k+1)} = x^{(k)} + s^{(k)}\) based upon the values \(g( x^{(k)})\) for \(k \geq 0\), to build a suitable approximation to the Hessian \(H(x^{(k+1)})\). This then gives the method an opportunity to accelerate the iteration using the Hessian approximation.

See Section 4 of $GALAHAD/doc/sha.pdf for additional details.

method#

The package computes the entries in the each row of \(B\) one at a time. The entries \(b_{ij}\) in row \(i\) may be chosen to

\[(1) \;\;\; \min_{b_{i,j}}\!\mbox{imize} \;\; \sum_{k \in {\cal I}_i} \left[ \sum_{{\scriptscriptstyle \mbox{nonzeros}}\; j} b_{i,j} s_j^{(k)} - y_i^{(k)} \right]^2,\]

where \({\cal I}_i\) is ideally chosen to be sufficiently large so that (1) has a unique minimizer. Since this requires that there are at least as many \((s^{(k)}, y^{(k)})\) pairs as the maximum number of nonzeros in any row, this may be prohibitive in some cases. We might then be content with a minimum-norm (under-determined) least-squares solution; each row may then be processed in parallel. Or, we may take advantage of the symmetry of the Hessian, and note that if we have already found the values in row \(j\), then the value \(b_{i,j} = b_{j,i}\) in (1) is known before we process row \(i\). Thus by ordering the rows and exploiting symmetry we may reduce the numbers of unknowns in future unprocessed rows.

In the analysis phase, we order the rows by constructing the connectivity graph—a graph comprising nodes \(1\) to \(n\) and edges connecting nodes \(i\) and \(j\) if \(h_{i,j}\) is everywhere nonzero—of \(H(x)\). The nodes are ordered by increasing degree (that is, the number of edges emanating from the node) using a bucket sort. The row chosen to be ordered next corresponds to a node of minimum degree, the node is removed from the graph, the degrees updated efficiently, and the process repeated until all rows have been ordered. This often leads to a significant reduction in the numbers of unknown values in each row as it is processed in turn, but numerical rounding can lead to inaccurate values in some cases. A useful remedy is to process all rows for which there are sufficient \((s^{(k)}, y^{(k)})\) as before, and then process the remaining rows taking into account the symmetry. That is, the rows and columns are rearranged so that the matrix is in block form

\[\begin{split}B = \begin{pmatrix} B_{11} & B_{12} \\ B^T_{12} & B_{22}\end{pmatrix},\end{split}\]

the \(( B_{11} \;\; B_{12})\) rows are processed without regard for symmetry but give the \(2,1\) block \(B^T_{12}\), and finally the \(2,2\) block \(B_{22}\) is processed knowing \(B^T_{12}\) again without respecting symmetry. The rows in blocks \(( B_{11} \;\; B_{12})\) and \(B_{22}\) may be computed in parallel. It is also possible to generalise this so that \(B\) is decomposed into \(r\) blocks, and the blocks processed one at a time recursively using the symmetry from previos rows. More details of the precise algorithms (Algorithms 2.1–2.5) are given in the reference below. The linear least-squares problems (1) themselves are solved by a choice of LAPACK packages.

references#

The method is described in detail in

J. M. Fowkes, N. I. M. Gould and J. A. Scott, “Approximating large-scale Hessians using secant equations”. Preprint TR-2024-001, Rutherford Appleton Laboratory.

introduction to function calls#

To solve a given problem, functions from the sha package must be called in the following order:

  • sha_initialize - provide default control parameters and set up initial data structures

  • sha_read_specfile (optional) - override control values by reading replacement values from a file

  • sha_analyse_matrix - set up problem data structures and generate information that will be used when estimating Hessian values

  • sha_recover_matrix - recover the Hessian approximation

  • sha_information (optional) - recover information about the solution and solution process

  • sha_terminate - deallocate data structures

See the examples section for illustrations of use.

callable functions#

overview of functions provided#

// typedefs

typedef float spc_;
typedef double rpc_;
typedef int ipc_;

// structs

struct sha_control_type;
struct sha_inform_type;

// function calls

void sha_initialize(void **data, struct sha_control_type* control, ipc_ *status);

void sha_read_specfile(struct sha_control_type* control, const char specfile[]);

void sha_analyse_matrix(
    struct sha_control_type* control,
    void **data,
    ipc_ *status,
    ipc_ n,
    ipc_ ne,
    const ipc_ row[],
    const ipc_ col[],
    ipc_ *m
);

void sha_recover_matrix(
    void **data,
    ipc_ *status,
    ipc_ ne,
    ipc_ m,
    ipc_ ls1,
    ipc_ ls2,
            const real_wp_ strans[][ls2],
            ipc_ ly1,
            ipc_ ly2,
            const real_wp_ ytrans[][ly2],
            real_wp_ val[],
            const ipc_ order[]
);


void sha_information(void **data, struct sha_inform_type* inform, ipc_ *status);

void sha_terminate(
    void **data,
    struct sha_control_type* control,
    struct sha_inform_type* inform
);

typedefs#

typedef float spc_

spc_ is real single precision

typedef double rpc_

rpc_ is the real working precision used, but may be changed to float by defining the preprocessor variable REAL_32 or (if supported) to __real128 using the variable REAL_128.

typedef int ipc_

ipc_ is the default integer word length used, but may be changed to int64_t by defining the preprocessor variable INTEGER_64.

function calls#

void sha_initialize(void **data, struct sha_control_type* control, ipc_ *status)

Set default control values and initialize private data

Parameters:

data

holds private internal data

control

is a struct containing control information (see sha_control_type)

status

is a scalar variable of type ipc_, that gives the exit status from the package. Possible values are (currently):

  • 0

    The initialization was successful.

void sha_read_specfile(struct sha_control_type* control, const char specfile[])

Read the content of a specification file, and assign values associated with given keywords to the corresponding control parameters. An in-depth discussion of specification files is available, and a detailed list of keywords with associated default values is provided in $GALAHAD/src/xxx/XXX.template. See also Table 2.1 in the Fortran documentation provided in $GALAHAD/doc/xxx.pdf for a list of how these keywords relate to the components of the control structure.

Parameters:

control

is a struct containing control information (see sha_control_type)

specfile

is a character string containing the name of the specification file

void sha_analyse_matrix(
    struct sha_control_type* control,
    void **data,
    ipc_ *status,
    ipc_ n,
    ipc_ ne,
    const ipc_ row[],
    const ipc_ col[],
    ipc_ *m
)

Analsyse the sparsity structure of \(H\) to generate information that will be used when estimating its values.

Parameters:

control

is a struct whose members provide control paramters for the remaining prcedures (see sha_control_type)

data

holds private internal data

status

is a scalar variable of type ipc_, that gives the exit status from the package. Possible values are:

  • 0

    The import was successful

  • -1

    An allocation error occurred. A message indicating the offending array is written on unit control.error, and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively.

  • -2

    A deallocation error occurred. A message indicating the offending array is written on unit control.error and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively.

  • -3

    A restriction n > 0, ne \(\geq\) 0 or 0 \(\leq\) row[i] \(\leq\) col[i] \(\leq\) n has been violated.

n

is a scalar variable of type ipc_, that holds the number of variables

ne

is a scalar variable of type ipc_, that holds the number of entries in the upper triangular part of \(H\).

row

is a one-dimensional array of size ne and type ipc_, that holds the row indices of the upper triangular part of \(H\).

col

is a one-dimensional array of size ne and type ipc_, that holds the column indices of the upper triangular part of \(H\).

m

is a scalar variable of type ipc_, that gives the minimum number of \((s^{(k)},y^{(k)})\) pairs that will be needed to recover a good Hessian approximation.

void sha_recover_matrix(
    void **data,
    ipc_ *status,
            ipc_ ne,
            ipc_ m_available,
            ipc_ ls1,
            ipc_ ls2,
            const real_wp_ strans[][ls2],
            ipc_ ly1,
            ipc_ ly2,
            const real_wp_ ytrans[][ly2],
            real_wp_ val[],
            const ipc_ order[]
)

Estimate the nonzero entries of the Hessian \(H\) by component-wise secant approximation.

Parameters:

data

holds private internal data

status

is a scalar variable of type ipc_, that gives the exit status from the package. Possible values are:

  • 0

    The recovery was successful

  • 1

    Insufficient data pairs \((s_i,y_i)\) have been provided, as m is too small. The returned \(B\) is likely not fully accurate.

  • -1

    An allocation error occurred. A message indicating the offending array is written on unit control.error, and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively.

  • -2

    A deallocation error occurred. A message indicating the offending array is written on unit control.error and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively.

  • -31

    sha_recover_matrix has been called before sha_analyse_matrix.

ne

is a scalar variable of type ipc_, that holds the number of entries in the upper triangular part of \(H\).

m_available

is a scalar variable of type ipc_, that holds the number of differences provided. Ideally this will be as large as m as reported by sha_analyse_matrix, but better still there should be a further control.extra_differences to allow for unlikely singularities.

ls1

is a scalar variable of type ipc_, that holds the leading (first) dimension of the array strans.

ls2

is a scalar variable of type ipc_, that holds the trailing (second) dimension of the array strans.

strans

is a two-dimensional array of size [ls1][ls2] and type rpc_, that holds the values of the vectors \(\{s^{(k) T}\}\). Component [\(k\)][\(i\)] should hold \(s_i^{(k)}\).

ly1

is a scalar variable of type ipc_, that holds the leading (first) dimension of the array ytrans.

ly2

is a scalar variable of type ipc_, that holds the trailing (second) dimension of the array ytrans.

ytrans

is a two-dimensional array of size [ly1][ly2] and type rpc_, that holds the values of the vectors \(\{y^{(k) T}\}\). Component [\(k\)][\(i\)] should hold \(y_i^{(k)}\).

val

is a one-dimensional array of size ne and type rpc_, that holds the values of the entries of the upper triangular part of the symmetric matrix \(H\) in the sparse coordinate scheme.

order

is a one-dimensional array of size m and type ipc_, that holds the preferred order of access for the pairs \(\{(s^{(k)},y^{(k)})\}\). The \(k\)-th component of order specifies the row number of strans and ytrans that will be used as the \(k\)-th most favoured. order need not be set if the natural order, \(k, k = 1,...,\) m, is desired, and this case order should be NULL.

void sha_information(void **data, struct sha_inform_type* inform, ipc_ *status)

Provides output information

Parameters:

data

holds private internal data

inform

is a struct containing output information (see sha_inform_type)

status

is a scalar variable of type ipc_, that gives the exit status from the package. Possible values are (currently):

  • 0

    The values were recorded successfully

void sha_terminate(
    void **data,
    struct sha_control_type* control,
    struct sha_inform_type* inform
)

Deallocate all internal private storage

Parameters:

data

holds private internal data

control

is a struct containing control information (see sha_control_type)

inform

is a struct containing output information (see sha_inform_type)

available structures#

sha_control_type structure#

#include <galahad_sha.h>

struct sha_control_type {
    // fields

    bool f_indexing;
    ipc_ error;
    ipc_ out;
    ipc_ print_level;
    ipc_ approximation_algorithm;
    ipc_ dense_linear_solver;
    ipc_ extra_differences;
    ipc_ sparse_row;
    ipc_ recursion_max;
    ipc_ recursion_entries_required;
    bool average_off_diagonals;
    bool space_critical;
    bool deallocate_error_fatal;
    char prefix[31];
};

detailed documentation#

control derived type as a C struct

components#

bool f_indexing

use C or Fortran sparse matrix indexing

ipc_ error

error and warning diagnostics occur on stream error

ipc_ out

general output occurs on stream out

ipc_ print_level

the level of output required. <= 0 gives no output, = 1 gives a one-line summary for every iteration, = 2 gives a summary of the inner iteration for each iteration, >= 3 gives increasingly verbose (debugging) output

ipc_ approximation_algorithm

which approximation algorithm should be used?

  • 1 : unsymmetric, parallel (Alg 2.1 in paper)

  • 2 : symmetric (Alg 2.2 in pape)

  • 3 : composite, parallel (Alg 2.3 in paper)

  • 4 : composite, block parallel (Alg 2.4 in paper)

ipc_ dense_linear_solver

which dense linear equation solver should be used?

  • 1 : Gaussian elimination

  • 2 : QR factorization

  • 3 : singular-value decomposition

  • 4 : singular-value decomposition with divide-and-conquer

ipc_ extra_differences

if available use an addition extra_differences differences

ipc_ sparse_row

a row is considered sparse if it has no more than .sparse_row entries

    ipc_ recursion_max

limit on the maximum number of levels of recursion (Alg. 2.4)

    ipc_ recursion_entries_required

the minimum number of entries in a reduced row that are required if a further level of recuresion is allowed (Alg. 2.4)

bool average_off_diagonals

select if pairs of off-diagonal Hessian estimates are to be averaged on return. Otherwise pick the value from the upper triangle

bool space_critical

if space is critical, ensure allocated arrays are no bigger than needed

bool deallocate_error_fatal

exit if any deallocation fails

char prefix[31]

all output lines will be prefixed by .prefix(2:LEN(TRIM(.prefix))-1) where .prefix contains the required string enclosed in quotes, e.g. “string” or ‘string’

sha_inform_type structure#

#include <galahad_sha.h>

struct sha_inform_type {
    // fields

    ipc_ status;
    ipc_ alloc_status;
    ipc_ max_degree;
    ipc_ differences_needed;
    ipc_ max_reduced_degree;
    ipc_ approximation_algorithm_used;
    ipc_ bad_row;
    rpc_ max_off_diagonal_difference;
    char bad_alloc[81];
};

detailed documentation#

inform derived type as a C struct

components#

ipc_ status

return status. See SHA_solve for details

ipc_ alloc_status

the status of the last attempted allocation/deallocation.

ipc_ max_degree

the maximum degree in the adgacency graph.

ipc_ differences_needed

the number of differences that will be needed.

ipc_ max_reduced_degree

the maximum reduced degree in the adgacency graph.

ipc_ approximation_algorithm_used

the approximation algorithm actually used

ipc_ bad_row

a failure occured when forming the bad_row-th row (0 = no failure).

ipc_ max_off_diagonal_difference

the maximum difference between estimated Hessian off-diagonal pairs if approximation algorithm 1, 3 or 4 has been employed.

char bad_alloc[81]

the name of the array for which an allocation/deallocation error occurred.

example calls#

This is an example of how to use the package to find a Hessian approximation using gradient differences; the code is available in $GALAHAD/src/sha/C/shat.c .

Notice that C-style indexing is used, and that this is flagged by setting control.f_indexing to false. The floating-point type rpc_ is set in galahad_precision.h to double by default, but to float if the preprocessor variable SINGLE is defined. Similarly, the integer type ipc_ from galahad_precision.h is set to int by default, but to int64_t if the preprocessor variable INTEGER_64 is defined.

/* shat.c */
/* Full test for the SHA C interface using C sparse matrix indexing */

#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <float.h>
#include "galahad_precision.h"
#include "galahad_cfunctions.h"
#include "galahad_sha.h"

int main(void) {

    // Derived types
    void *data;
    struct sha_control_type control;
    struct sha_inform_type inform;

    // Set problem data
    ipc_ n = 5; // dimension of H
    ipc_ ne = 9; // number of entries in upper triangle of H
    ipc_ m_max = 9; // upper bound on max # differences required
    ipc_ row[] = {0, 0, 0, 0, 0, 1, 2, 3, 4}; // row indices, NB upper triangle
    ipc_ col[] = {0, 1, 2, 3, 4, 1, 2, 3, 4}; // column indices
    rpc_ val[] = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0}; // values
    ipc_ i, j, m, status;
    rpc_ rr, v;
    rpc_ val_est[ne];
    ipc_ ls1 = m_max, ls2 = n; // dimensions of s
    ipc_ ly1 = m_max, ly2 = n; // dimensions of y
    rpc_ strans[ls1][ls2];
    rpc_ ytrans[ly1][ly2];
    ipc_ order[m_max]; // diffference precedence order

    printf(" C sparse matrix indexing\n\n");

    printf(" basic tests of storage formats\n\n");

    for( ipc_ algorithm=1; algorithm <= 5; algorithm++){

        // Initialize SHA - use the sytr solver
        sha_initialize( &data, &control, &status );

        // Set user-defined control options
        control.f_indexing = false; // C sparse matrix indexing
        control.approximation_algorithm = algorithm;
        control.extra_differences = 1;

        // analyse the matrix and discover the number of differences needed
        sha_analyse_matrix( &control, &data, &status, n, ne, row, col, &m );
        printf(" Algorithm %" i_ipc_ " - %" i_ipc_ " differences required,"
               " one extra might help\n", algorithm, m);
        m = m + control.extra_differences;

        srand(1);
        for( ipc_ k = 0; k < m; k++) {
          // set up random differences strans
          for( ipc_ i = 0; i < n; i++) {
            rr = ((rpc_) rand()/ RAND_MAX);
            strans[k][i] = -1.0 + 2.0 * rr;
            ytrans[k][i] = 0.0; // initialize ytrans as the zero matrix
          }
          // form y = H s
          for( ipc_ l = 0; l < ne; l++) {
            i = row[l];
            j = col[l];
            v = val[l];
            ytrans[k][i] = ytrans[k][i] + v * strans[k][j];
            if ( i != j ) ytrans[k][j] = ytrans[k][j] + v * strans[k][i];
          }
          order[k] = m - k - 1; // pick the (s,y) vectors in reverse order
        }

        // recover the matrix
        sha_recover_matrix( &data, &status, ne, m, ls1, ls2, strans,
                            ly1, ly2, ytrans, val_est, order );
        //                  ly1, ly2, ytrans, val_est, NULL );
        //                  if the natural order is ok

        printf(" H from %" i_ipc_ " differences:\n", m);
        for( ipc_ i = 0; i < ne; i++) printf(" %1.2f", val_est[i]);

        sha_information( &data, &inform, &status );

        // Delete internal workspace
        sha_terminate( &data, &control, &inform );
        printf("\n\n");
    }
}

This is the same example, but now fortran-style indexing is used; the code is available in $GALAHAD/src/sha/C/shatf.c .

/* shatf.c */
/* Full test for the SHA C interface using Fortran sparse matrix indexing */

#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <float.h>
#include "galahad_precision.h"
#include "galahad_cfunctions.h"
#include "galahad_sha.h"

int main(void) {

    // Derived types
    void *data;
    struct sha_control_type control;
    struct sha_inform_type inform;

    // Set problem data
    ipc_ n = 5; // dimension of H
    ipc_ ne = 9; // number of entries in upper triangle of H
    ipc_ m_max = 9; // upper bound on max # differences required
    ipc_ row[] = {1, 1, 1, 1, 1, 2, 3, 4, 5}; // row indices, NB upper triangle
    ipc_ col[] = {1, 2, 3, 4, 5, 2, 3, 4, 5}; // column indices
    rpc_ val[] = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0}; // values
    ipc_ i, j, m, status;
    rpc_ rr, v;
    rpc_ val_est[ne];
    ipc_ ls1 = m_max, ls2 = n; // dimensions of s
    ipc_ ly1 = m_max, ly2 = n; // dimensions of y
    rpc_ strans[ls1][ls2];
    rpc_ ytrans[ly1][ly2];
    ipc_ order[m_max]; // diffference precedence order

    printf(" Fortran sparse matrix indexing\n\n");

    printf(" basic tests of storage formats\n\n");

    for( ipc_ algorithm=1; algorithm <= 5; algorithm++){

        // Initialize SHA - use the sytr solver
        sha_initialize( &data, &control, &status );

        // Set user-defined control options
        control.f_indexing = true; // Fortran sparse matrix indexing
        control.approximation_algorithm = algorithm;
        control.extra_differences = 1;

        // analyse the matrix and discover the number of differences needed
        sha_analyse_matrix( &control, &data, &status, n, ne, row, col, &m );
        printf(" Algorithm %" i_ipc_ " - %" i_ipc_ " differences required,"
               " one extra might help\n", algorithm, m);
        m = m + control.extra_differences;

        srand(1);
        for( ipc_ k = 0; k < m; k++) {
          // set up random differences strans
          for( ipc_ i = 0; i < n; i++) {
            rr = ((rpc_) rand()/ RAND_MAX);
            strans[k][i] = -1.0 + 2.0 * rr;
            ytrans[k][i] = 0.0; // initialize ytrans as the zero matrix
          }
          // form y = H s
          for( ipc_ l = 0; l < ne; l++) {
            i = row[l]-1;
            j = col[l]-1;
            v = val[l];
            ytrans[k][i] = ytrans[k][i] + v * strans[k][j];
            if ( i != j ) ytrans[k][j] = ytrans[k][j] + v * strans[k][i];
          }
          order[k] = m - k; // pick the (s,y) vectors in reverse order
        }

        // recover the matrix
        sha_recover_matrix( &data, &status, ne, m, ls1, ls2, strans,
                            ly1, ly2, ytrans, val_est, order );
        //                  ly1, ly2, ytrans, val_est, NULL );
        //                  if the natural order is ok

        printf(" H from %" i_ipc_ " differences:\n", m);
        for( ipc_ i = 0; i < ne; i++) printf(" %1.2f", val_est[i]);

        sha_information( &data, &inform, &status );

        // Delete internal workspace
        sha_terminate( &data, &control, &inform );
        printf("\n\n");
    }
}