GALAHAD SLS package#
purpose#
The sls package solves dense or sparse symmetric systems of linear equations using variants of Gaussian elimination. Given a sparse symmetric matrix \(A = \{ a_{ij} \}_{n \times n}\), and an \(n\)-vector \(b\) or a matrix \(B = \{ b_{ij} \}_{n \times r}\), this package solves the system \(A x = b\) or the system \(A X = B\). The matrix \(A\) need not be definite.
The package provides a common interface to a variety of well-known solvers from HSL and elsewhere. Currently supported solvers include MA27/SILS, HSL_MA57, HSL_MA77, HSL_MA86, HSL_MA87 and HSL_MA97 from HSL, SSIDS from SPRAL, MUMPS from Mumps Technologies, PARDISO both from the Pardiso Project and Intel’s MKL, PaStiX from Inria, and WSMP from the IBM alpha Works, as well as POTR, SYTR and PBTR from LAPACK. Note that, with the exception of SSIDS and the Netlib reference LAPACK codes, the solvers themselves do not form part of this package and must be obtained/linked to separately. Dummy instances are provided for solvers that are unavailable. Also note that additional flexibility may be obtained by calling the solvers directly rather than via this package.
terminology#
The solvers used each produce an \(L D L^T\) factorization of \(A\) or a perturbation thereof, where \(L\) is a permuted lower triangular matrix and \(D\) is a block diagonal matrix with blocks of order 1 and 2. It is convenient to write this factorization in the form
supported solvers#
The key features of the external solvers supported by sls are given in the following table:
solver | factorization | indefinite \(A\) | out-of-core | parallelised |
---|---|---|---|---|
SILS/MA27 | multifrontal | yes | no | no |
HSL_MA57 | multifrontal | yes | no | no |
HSL_MA77 | multifrontal | yes | yes | OpenMP core |
HSL_MA86 | left-looking | yes | no | OpenMP fully |
HSL_MA87 | left-looking | no | no | OpenMP fully |
HSL_MA97 | multifrontal | yes | no | OpenMP core |
SSIDS | multifrontal | yes | no | CUDA core |
MUMPS | multifrontal | yes | optionally | MPI |
PARDISO | left-right-looking | yes | no | OpenMP fully |
MKL PARDISO | left-right-looking | yes | optionally | OpenMP fully |
PaStiX | left-right-looking | yes | no | OpenMP fully |
WSMP | left-right-looking | yes | no | OpenMP fully |
POTR | dense | no | no | with parallel LAPACK |
SYTR | dense | yes | no | with parallel LAPACK |
PBTR | dense band | no | no | with parallel LAPACK |
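The table can also be held as data when a program needs to pick a solver by capability. The following is a minimal sketch, not part of the sls interface: the struct, the array `solver_table` and the function `find_solver` are hypothetical names, and pairing the table rows with solver strings in listing order is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One row of the feature table. The solver strings are those accepted by
   sls_initialize; pairing them with rows in listing order is an assumption. */
struct solver_features {
    const char *name;          /* solver string, lower case */
    const char *factorization; /* factorization variant */
    bool indefinite;           /* handles indefinite A */
    const char *out_of_core;   /* "yes", "no" or "optionally" */
    const char *parallelised;  /* parallelism column of the table */
};

static const struct solver_features solver_table[] = {
    {"sils",        "multifrontal",       true,  "no",         "no"},
    {"ma57",        "multifrontal",       true,  "no",         "no"},
    {"ma77",        "multifrontal",       true,  "yes",        "OpenMP core"},
    {"ma86",        "left-looking",       true,  "no",         "OpenMP fully"},
    {"ma87",        "left-looking",       false, "no",         "OpenMP fully"},
    {"ma97",        "multifrontal",       true,  "no",         "OpenMP core"},
    {"ssids",       "multifrontal",       true,  "no",         "CUDA core"},
    {"mumps",       "multifrontal",       true,  "optionally", "MPI"},
    {"pardiso",     "left-right-looking", true,  "no",         "OpenMP fully"},
    {"mkl pardiso", "left-right-looking", true,  "optionally", "OpenMP fully"},
    {"pastix",      "left-right-looking", true,  "no",         "OpenMP fully"},
    {"wsmp",        "left-right-looking", true,  "no",         "OpenMP fully"},
    {"potr",        "dense",              false, "no",         "with parallel LAPACK"},
    {"sytr",        "dense",              true,  "no",         "with parallel LAPACK"},
    {"pbtr",        "dense band",         false, "no",         "with parallel LAPACK"},
};

/* Return the table row for a lower-case solver string, or NULL if unknown. */
static const struct solver_features *find_solver(const char *name)
{
    size_t count = sizeof solver_table / sizeof solver_table[0];
    for (size_t k = 0; k < count; k++)
        if (strcmp(solver_table[k].name, name) == 0)
            return &solver_table[k];
    return NULL;
}
```

A caller with a possibly indefinite matrix could, for instance, reject any solver whose `indefinite` flag is false before passing its name to sls_initialize.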
method#
Variants of sparse Gaussian elimination are used. See Section 4 of $GALAHAD/doc/sls.pdf for a brief description of the method employed and other details.
The solver SILS is available as part of GALAHAD and relies on the HSL Archive package MA27. To obtain HSL Archive packages, see
The solvers HSL_MA57, HSL_MA77, HSL_MA86, HSL_MA87 and HSL_MA97, the ordering packages MC61 and HSL_MC68, and the scaling packages HSL_MC64 and MC77 are all part of HSL 2011. To obtain HSL 2011 packages, see
The solver SSIDS is from the SPRAL sparse-matrix collection, and is available as part of GALAHAD.
The solver MUMPS is available from Mumps Technologies in France, and version 5.5.1 or above is sufficient. To obtain MUMPS, see
The solver PARDISO is available from the Pardiso Project; version 4.0.0 or above is required. To obtain PARDISO, see
The solver MKL PARDISO is available as part of Intel’s oneAPI Math Kernel Library (oneMKL). To obtain this version of PARDISO, see
The solver PaStiX is available from Inria in France, and version 6.2 or above is sufficient. To obtain PaStiX, see
The solver WSMP is available from the IBM alpha Works; version 10.9 or above is required. To obtain WSMP, see
The solvers POTR, SYTR and PBTR are available as S/DPOTRF/S, S/DSYTRF/S and S/DPBTRF/S as part of LAPACK. Reference versions are provided by GALAHAD, but for good performance machine-tuned versions should be used.
Explicit sparsity re-orderings are obtained by calling the HSL package HSL_MC68. HSL_MC68, HSL_MA57 and PARDISO all rely optionally on the ordering package MeTiS (version 4) from the Karypis Lab. To obtain METIS, see
Bandwidth, profile and wavefront reduction is supported by calling HSL’s MC61.
The methods used are described in the user documentation for
HSL 2011, A collection of Fortran codes for large-scale scientific computation (2011),
and the papers
E. Agullo, P. R. Amestoy, A. Buttari, J.-Y. L’Excellent, A. Guermouche and F.-H. Rouet, “Robust memory-aware mappings for parallel multifrontal factorizations”. SIAM Journal on Scientific Computing 38(3) (2016) C256–C279,
P. R. Amestoy, I. S. Duff, J. Koster and J.-Y. L’Excellent, “A fully asynchronous multifrontal solver using distributed dynamic scheduling”. SIAM Journal on Matrix Analysis and Applications 23(1) (2001) 15–41,
A. Gupta, “WSMP: Watson Sparse Matrix Package Part I - direct solution of symmetric sparse systems”. IBM Research Report RC 21886, IBM T. J. Watson Research Center, NY 10598, USA (2010),
P. Henon, P. Ramet and J. Roman, “PaStiX: A High-Performance Parallel Direct Solver for Sparse Symmetric Definite Systems”. Parallel Computing 28(2) (2002) 301–321,
J. D. Hogg, E. Ovtchinnikov and J. A. Scott, “A sparse symmetric indefinite direct solver for GPU architectures”. ACM Transactions on Mathematical Software 42(1) (2014) Article 1,
O. Schenk and K. Gartner, “Solving Unsymmetric Sparse Systems of Linear Equations with PARDISO”. Journal of Future Generation Computer Systems 20(3) (2004) 475–487, and
O. Schenk and K. Gartner, “On fast factorization pivoting methods for symmetric indefinite systems”. Electronic Transactions on Numerical Analysis 23 (2006) 158–179.
matrix storage#
The symmetric \(n\) by \(n\) matrix \(A\) may be presented and stored in a variety of formats. Crucially, symmetry is exploited by storing only values from the lower triangular part (i.e., those entries that lie on or below the leading diagonal).
Dense storage format: The matrix \(A\) is stored as a compact dense matrix by rows, that is, the values of the entries of each row in turn are stored in order within an appropriate real one-dimensional array. Since \(A\) is symmetric, only the lower triangular part (that is, the part \(A_{ij}\) for \(0 \leq j \leq i \leq n-1\)) need be held. In this case the lower triangle should be stored by rows, that is, component \(i (i + 1) / 2 + j\) of the storage array A_val will hold the value \(A_{ij}\) (and, by symmetry, \(A_{ji}\)) for \(0 \leq j \leq i \leq n-1\). The string A_type = ‘dense’ should be specified.
Sparse co-ordinate storage format: Only the nonzero entries of the matrices are stored. For the \(l\)-th entry, \(0 \leq l \leq ne-1\), of \(A\), its row index i, column index j and value \(A_{ij}\), \(0 \leq j \leq i \leq n-1\), are stored as the \(l\)-th components of the integer arrays A_row and A_col and real array A_val, respectively, while the number of nonzeros is recorded as A_ne = \(ne\). Note that only the entries in the lower triangle should be stored. The string A_type = ‘coordinate’ should be specified.
Sparse row-wise storage format: Again only the nonzero entries are stored, but this time they are ordered so that those in row i appear directly before those in row i+1. For the i-th row of \(A\) the i-th component of the integer array A_ptr holds the position of the first entry in this row, while A_ptr(n) holds the total number of entries. The column indices j, \(0 \leq j \leq i\), and values \(A_{ij}\) of the entries in the i-th row are stored in components l = A_ptr(i), …, A_ptr(i+1)-1 of the integer array A_col, and real array A_val, respectively. Note that as before only the entries in the lower triangle should be stored. For sparse matrices, this scheme almost always requires less storage than its predecessor. The string A_type = ‘sparse_by_rows’ should be specified.
Diagonal storage format: If \(A\) is diagonal (i.e., \(A_{ij} = 0\) for all \(0 \leq i \neq j \leq n-1\)) only the diagonal entries \(A_{ii}\), \(0 \leq i \leq n-1\), need be stored, and the first n components of the array A_val may be used for the purpose. The string A_type = ‘diagonal’ should be specified.
Multiples of the identity storage format: If \(A\) is a multiple of the identity matrix (i.e., \(A = \alpha I\) where \(I\) is the n by n identity matrix and \(\alpha\) is a scalar), it suffices to store \(\alpha\) as the first component of A_val. The string A_type = ‘scaled_identity’ should be specified.
The identity matrix format: If \(A\) is the identity matrix, no values need be stored. The string A_type = ‘identity’ should be specified.
The zero matrix format: The same is true if \(A\) is the zero matrix, but now the string A_type = ‘zero’ or ‘none’ should be specified.
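The dense, co-ordinate and row-wise schemes above can be compared on one small matrix. The sketch below assumes 0-based (C) indexing and uses the hypothetical helpers `dense_lower_index` and `rows_to_ptr`; the arrays follow the A_row, A_col, A_val naming of the text, and A_dense is an illustrative name for the dense variant of A_val. For 0-based indices the packed position of \(A_{ij}\) in the row-ordered lower triangle is \(i(i+1)/2 + j\).

```c
#include <assert.h>

/* Example: A = [1 2 0; 2 3 4; 0 4 5], lower triangle only, 0-based.
   Co-ordinate scheme (entries happen to be ordered by rows here). */
static const int    A_row[]   = {0, 1, 1, 2, 2};
static const int    A_col[]   = {0, 0, 1, 1, 2};
static const double A_val[]   = {1.0, 2.0, 3.0, 4.0, 5.0};
/* Dense scheme: the full lower triangle by rows, zeros included. */
static const double A_dense[] = {1.0, 2.0, 3.0, 0.0, 4.0, 5.0};

/* Position of A_ij (0 <= j <= i) in the row-packed lower triangle. */
static int dense_lower_index(int i, int j)
{
    assert(0 <= j && j <= i);
    return i * (i + 1) / 2 + j;
}

/* Build the row-wise pointer array (length n+1) from co-ordinate row
   indices. The result only describes A_col and A_val correctly if the
   entries are already ordered by rows, as the row-wise scheme requires. */
static void rows_to_ptr(int n, int ne, const int row[], int ptr[])
{
    for (int i = 0; i <= n; i++)
        ptr[i] = 0;
    for (int l = 0; l < ne; l++)      /* count the entries in each row */
        ptr[row[l] + 1]++;
    for (int i = 0; i < n; i++)       /* prefix sums give row starts */
        ptr[i + 1] += ptr[i];
}
```

For this matrix `rows_to_ptr` yields the pointer array {0, 1, 3, 5}, so row 2 occupies positions 3 and 4 of A_col and A_val, while the same row starts at position `dense_lower_index(2, 0)` of A_dense.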
introduction to function calls#
To solve a given problem, functions from the sls package must be called in the following order:
sls_initialize - provide default control parameters and set up initial data structures
sls_read_specfile (optional) - override control values by reading replacement values from a file
sls_analyse_matrix - set up matrix data structures and analyse the structure to choose a suitable order for factorization
sls_reset_control (optional) - possibly change control parameters if a sequence of problems is being solved
sls_factorize_matrix - form and factorize the matrix \(A\)
one of
sls_solve_system - solve the linear system of equations \(Ax=b\)
sls_partial_solve_system - solve a linear system \(Mx=b\) involving one of the matrix factors \(M\) of \(A\)
sls_information (optional) - recover information about the solution and solution process
sls_terminate - deallocate data structures
See the examples section for illustrations of use.
callable functions#
overview of functions provided#
// typedefs
typedef float spc_;
typedef double rpc_;
typedef int ipc_;

// structs
struct sls_control_type;
struct sls_inform_type;
struct sls_time_type;

// global functions
void sls_initialize(
    const char solver[],
    void **data,
    struct sls_control_type* control,
    ipc_ *status
);

void sls_read_specfile(struct sls_control_type* control, const char specfile[]);

void sls_analyse_matrix(
    struct sls_control_type* control,
    void **data,
    ipc_ *status,
    ipc_ n,
    const char type[],
    ipc_ ne,
    const ipc_ row[],
    const ipc_ col[],
    const ipc_ ptr[]
);

void sls_reset_control(
    struct sls_control_type* control,
    void **data,
    ipc_ *status
);

void sls_factorize_matrix(
    void **data,
    ipc_ *status,
    ipc_ ne,
    const rpc_ val[]
);

void sls_solve_system(void **data, ipc_ *status, ipc_ n, rpc_ sol[]);

void sls_partial_solve_system(
    const char part[],
    void **data,
    ipc_ *status,
    ipc_ n,
    rpc_ sol[]
);

void sls_information(void **data, struct sls_inform_type* inform, ipc_ *status);

void sls_terminate(
    void **data,
    struct sls_control_type* control,
    struct sls_inform_type* inform
);
typedefs#
typedef float spc_
spc_ is real single precision
typedef double rpc_
rpc_ is the real working precision used, but may be changed to float by defining the preprocessor variable SINGLE.
typedef int ipc_
ipc_ is the default integer word length used, but may be changed to int64_t by defining the preprocessor variable INTEGER_64.
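The effect of the two preprocessor variables can be pictured as conditional typedefs. This is a sketch of one way a header might make the selection, assuming nothing beyond the behaviour described above; the actual GALAHAD headers may be organised differently.

```c
#include <stdint.h>

/* Assumed layout for illustration; the real header may differ in detail. */
typedef float spc_;          /* always single precision */

#ifdef SINGLE
typedef float rpc_;          /* working precision when SINGLE is defined */
#else
typedef double rpc_;         /* default working precision */
#endif

#ifdef INTEGER_64
typedef int64_t ipc_;        /* 64-bit integers when INTEGER_64 is defined */
#else
typedef int ipc_;            /* default integer word length */
#endif
```

With most C compilers such variables are defined on the command line, for example with `-DSINGLE` or `-DINTEGER_64`, and the same setting must then be used for every translation unit that shares these types.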
function calls#
void sls_initialize( const char solver[], void **data, struct sls_control_type* control, ipc_ *status )
Select solver, set default control values and initialize private data
Parameters:
solver |
is a one-dimensional array of type char that specifies the solver package that should be used to factorize the matrix \(A\). It should be one of ‘sils’, ‘ma27’, ‘ma57’, ‘ma77’, ‘ma86’, ‘ma87’, ‘ma97’, ‘ssids’, ‘mumps’, ‘pardiso’, ‘mkl pardiso’, ‘pastix’, ‘wsmp’, ‘potr’, ‘sytr’ or ‘pbtr’; lower or upper case variants are allowed. Only ‘potr’, ‘sytr’, ‘pbtr’ and, for OMP 4.0-compliant compilers, ‘ssids’ are installed by default, but others are easily installed (see README.external). |
data |
holds private internal data |
control |
is a struct containing control information (see sls_control_type) |
status |
is a scalar variable of type ipc_, that gives the exit status from the package. Possible values are:
|
void sls_read_specfile(struct sls_control_type* control, const char specfile[])
Read the content of a specification file, and assign values associated with given keywords to the corresponding control parameters. An in-depth discussion of specification files is available, and a detailed list of keywords with associated default values is provided in $GALAHAD/src/sls/SLS.template. See also Table 2.1 in the Fortran documentation provided in $GALAHAD/doc/sls.pdf for a list of how these keywords relate to the components of the control structure.
Parameters:
control |
is a struct containing control information (see sls_control_type) |
specfile |
is a character string containing the name of the specification file |
void sls_analyse_matrix( struct sls_control_type* control, void **data, ipc_ *status, ipc_ n, const char type[], ipc_ ne, const ipc_ row[], const ipc_ col[], const ipc_ ptr[] )
Import structural matrix data into internal storage prior to solution
Parameters:
control |
is a struct whose members provide control parameters for the remaining procedures (see sls_control_type) |
data |
holds private internal data |
status |
is a scalar variable of type ipc_, that gives the exit status from the package. Possible values are:
|
n |
is a scalar variable of type ipc_, that holds the number of rows in the symmetric matrix \(A\). |
type |
is a one-dimensional array of type char that specifies the symmetric storage scheme used for the matrix \(A\). It should be one of ‘coordinate’, ‘sparse_by_rows’ or ‘dense’; lower or upper case variants are allowed. |
ne |
is a scalar variable of type ipc_, that holds the number of entries in the lower triangular part of \(A\) in the sparse co-ordinate storage scheme. It need not be set for any of the other schemes. |
row |
is a one-dimensional array of size ne and type ipc_, that holds the row indices of the lower triangular part of \(A\) in the sparse co-ordinate storage scheme. It need not be set for any of the other three schemes, and in this case can be NULL. |
col |
is a one-dimensional array of size ne and type ipc_, that holds the column indices of the lower triangular part of \(A\) in either the sparse co-ordinate, or the sparse row-wise storage scheme. It need not be set when the dense storage scheme is used, and in this case can be NULL. |
ptr |
is a one-dimensional array of size n+1 and type ipc_, that holds the starting position of each row of the lower triangular part of \(A\), as well as the total number of entries, in the sparse row-wise storage scheme. It need not be set when the other schemes are used, and in this case can be NULL. |
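Because only lower-triangular entries should be supplied, it can help to check co-ordinate indices before they are passed on. The following is a minimal sketch under the assumption of 0-based (C) indexing; `index_report` and `check_lower_coordinate` are hypothetical names and do not belong to the sls interface, although the two counts mirror the out_of_range and upper components of sls_inform_type.

```c
/* Counts of suspicious entries in a co-ordinate description. */
struct index_report {
    int out_of_range;  /* row or column index outside 0..n-1 */
    int upper;         /* in-range entries with column > row */
};

/* Scan ne (row, col) pairs of an n by n matrix, 0-based indices assumed. */
static struct index_report check_lower_coordinate(int n, int ne,
                                                  const int row[],
                                                  const int col[])
{
    struct index_report report = {0, 0};
    for (int l = 0; l < ne; l++) {
        if (row[l] < 0 || row[l] >= n || col[l] < 0 || col[l] >= n)
            report.out_of_range++;
        else if (col[l] > row[l])
            report.upper++;
    }
    return report;
}
```

An entry flagged as upper can be repaired by swapping its row and column indices, since \(A_{ij} = A_{ji}\).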
void sls_reset_control( struct sls_control_type* control, void **data, ipc_ *status )
Reset control parameters after import if required.
Parameters:
control |
is a struct whose members provide control parameters for the remaining procedures (see sls_control_type) |
data |
holds private internal data |
status |
is a scalar variable of type ipc_, that gives the exit status from the package. Possible values are:
|
void sls_factorize_matrix( void **data, ipc_ *status, ipc_ ne, const rpc_ val[] )
Form and factorize the symmetric matrix \(A\).
Parameters:
data |
holds private internal data |
status |
is a scalar variable of type ipc_, that gives the exit status from the package. Possible values are:
|
ne |
is a scalar variable of type ipc_, that holds the number of entries in the lower triangular part of the symmetric matrix \(A\). |
val |
is a one-dimensional array of size ne and type rpc_, that holds the values of the entries of the lower triangular part of the symmetric matrix \(A\) in any of the supported storage schemes. |
void sls_solve_system(void **data, ipc_ *status, ipc_ n, rpc_ sol[])
Solve the linear system \(Ax=b\).
Parameters:
data |
holds private internal data |
status |
is a scalar variable of type ipc_, that gives the exit status from the package. Possible values are:
|
n |
is a scalar variable of type ipc_, that holds the number of entries in the vectors \(b\) and \(x\). |
sol |
is a one-dimensional array of size n and type rpc_. On entry, it must hold the vector \(b\). On a successful exit, it contains the solution \(x\). |
void sls_partial_solve_system( const char part[], void **data, ipc_ *status, ipc_ n, rpc_ sol[] )
Given the factorization \(A = L D U\) with \(U = L^T\), solve the linear system \(Mx=b\), where \(M\) is one of \(L\), \(D\), \(U\) or \(S = L \sqrt{D}\).
Parameters:
part |
is a one-dimensional array of type char that specifies the component \(M\) of the factorization that is to be used. It should be one of “L”, “D”, “U” or “S”, and these correspond to the parts \(L\), \(D\), \(U\) and \(S\); lower or upper case variants are allowed. |
data |
holds private internal data |
status |
is a scalar variable of type ipc_, that gives the entry and exit status from the package. On initial entry, status must be set to 1. Possible exit values are:
|
n |
is a scalar variable of type ipc_, that holds the number of entries in the vectors \(b\) and \(x\). |
sol |
is a one-dimensional array of size n and type rpc_. On entry, it must hold the vector \(b\). On a successful exit, it contains the solution \(x\). |
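To see what the individual parts mean, the solves with \(L\), \(D\) and \(U = L^T\) can be written out for a small dense factorization. The sketch below is illustrative only and does not reflect how any of the supported solvers stores its factors: it assumes a unit lower triangular \(L\) held densely by rows, a \(D\) with 1 by 1 pivots only, and no permutation. The function name `dense_partial_solve` is hypothetical, and the \(S\) part is omitted.

```c
#include <assert.h>

/* Solve M x = b in place for M = L, D or U = L^T.
   L is unit lower triangular, stored densely by rows (n*n values);
   d holds the diagonal of D, assuming 1x1 pivots only.
   Returns 0 on success, -1 for an unrecognised part. */
static int dense_partial_solve(char part, int n, const double L[],
                               const double d[], double sol[])
{
    assert(n > 0);
    switch (part) {
    case 'L': case 'l':               /* forward substitution */
        for (int i = 0; i < n; i++)
            for (int j = 0; j < i; j++)
                sol[i] -= L[i * n + j] * sol[j];
        return 0;
    case 'D': case 'd':               /* diagonal scaling */
        for (int i = 0; i < n; i++)
            sol[i] /= d[i];
        return 0;
    case 'U': case 'u':               /* back substitution with U = L^T */
        for (int i = n - 1; i >= 0; i--)
            for (int j = i + 1; j < n; j++)
                sol[i] -= L[j * n + i] * sol[j];
        return 0;
    default:
        return -1;
    }
}
```

Applying the parts in the order L, D, U to the same array solves \(A x = b\) for \(A = L D L^T\). For example, with \(L\) = [1 0; 2 1] and \(D\) = diag(4, 9), so that \(A\) = [4 8; 8 25], the right-hand side (12, 33) is overwritten by the solution (1, 1).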
void sls_information(void **data, struct sls_inform_type* inform, ipc_ *status)
Provide output information
Parameters:
data |
holds private internal data |
inform |
is a struct containing output information (see sls_inform_type) |
status |
is a scalar variable of type ipc_, that gives the exit status from the package. Possible values are (currently):
|
void sls_terminate( void **data, struct sls_control_type* control, struct sls_inform_type* inform )
Deallocate all internal private storage
Parameters:
data |
holds private internal data |
control |
is a struct containing control information (see sls_control_type) |
inform |
is a struct containing output information (see sls_inform_type) |
available structures#
sls_control_type structure#
#include <galahad_sls.h>

struct sls_control_type {
    // fields
    bool f_indexing;
    ipc_ error;
    ipc_ warning;
    ipc_ out;
    ipc_ statistics;
    ipc_ print_level;
    ipc_ print_level_solver;
    ipc_ bits;
    ipc_ block_size_kernel;
    ipc_ block_size_elimination;
    ipc_ blas_block_size_factorize;
    ipc_ blas_block_size_solve;
    ipc_ node_amalgamation;
    ipc_ initial_pool_size;
    ipc_ min_real_factor_size;
    ipc_ min_integer_factor_size;
    int64_t max_real_factor_size;
    int64_t max_integer_factor_size;
    int64_t max_in_core_store;
    rpc_ array_increase_factor;
    rpc_ array_decrease_factor;
    ipc_ pivot_control;
    ipc_ ordering;
    ipc_ full_row_threshold;
    ipc_ row_search_indefinite;
    ipc_ scaling;
    ipc_ scale_maxit;
    rpc_ scale_thresh;
    rpc_ relative_pivot_tolerance;
    rpc_ minimum_pivot_tolerance;
    rpc_ absolute_pivot_tolerance;
    rpc_ zero_tolerance;
    rpc_ zero_pivot_tolerance;
    rpc_ negative_pivot_tolerance;
    rpc_ static_pivot_tolerance;
    rpc_ static_level_switch;
    rpc_ consistency_tolerance;
    ipc_ max_iterative_refinements;
    rpc_ acceptable_residual_relative;
    rpc_ acceptable_residual_absolute;
    bool multiple_rhs;
    bool generate_matrix_file;
    ipc_ matrix_file_device;
    char matrix_file_name[31];
    char out_of_core_directory[401];
    char out_of_core_integer_factor_file[401];
    char out_of_core_real_factor_file[401];
    char out_of_core_real_work_file[401];
    char out_of_core_indefinite_file[401];
    char out_of_core_restart_file[501];
    char prefix[31];
};
detailed documentation#
control derived type as a C struct
components#
bool f_indexing
use C or Fortran sparse matrix indexing
ipc_ error
unit for error messages
ipc_ warning
unit for warning messages
ipc_ out
unit for monitor output
ipc_ statistics
unit for statistical output
ipc_ print_level
controls level of diagnostic output
ipc_ print_level_solver
controls level of diagnostic output from external solver
ipc_ bits
number of bits used in architecture
ipc_ block_size_kernel
the target blocksize for kernel factorization
ipc_ block_size_elimination
the target blocksize for parallel elimination
ipc_ blas_block_size_factorize
level 3 blocking in factorize
ipc_ blas_block_size_solve
level 2 and 3 blocking in solve
ipc_ node_amalgamation
a child node is merged with its parent if they both involve fewer than node_amalgamation eliminations
ipc_ initial_pool_size
initial size of task-pool arrays for parallel elimination
ipc_ min_real_factor_size
initial size for real array for the factors and other data
ipc_ min_integer_factor_size
initial size for integer array for the factors and other data
int64_t max_real_factor_size
maximum size for real array for the factors and other data
int64_t max_integer_factor_size
maximum size for integer array for the factors and other data
int64_t max_in_core_store
amount of in-core storage to be used for out-of-core factorization
rpc_ array_increase_factor
factor by which arrays sizes are to be increased if they are too small
rpc_ array_decrease_factor
if previously allocated internal workspace arrays are greater than array_decrease_factor times the currently required sizes, they are reset to current requirements
ipc_ pivot_control
pivot control:
1 Numerical pivoting will be performed.
2 No pivoting will be performed and an error exit will occur immediately a pivot sign change is detected.
3 No pivoting will be performed and an error exit will occur if a zero pivot is detected.
4 No pivoting is performed but pivots are changed to all be positive
ipc_ ordering
controls ordering (ignored if explicit PERM argument present)
<0 chosen by the specified solver with its own ordering-selected value -ordering
0 chosen package default (or the AMD ordering if no package default)
1 Approximate minimum degree (AMD) with provisions for “dense” rows/columns
2 Minimum degree
3 Nested dissection
4 indefinite ordering to generate a combination of 1x1 and 2x2 pivots
5 Profile/Wavefront reduction
6 Bandwidth reduction
>6 ordering chosen depending on matrix characteristics (not yet implemented)
ipc_ full_row_threshold
controls threshold for detecting full rows in analyse, registered as percentage of matrix order. If 100, only fully dense rows detected (default)
ipc_ row_search_indefinite
number of rows searched for pivot when using indefinite ordering
ipc_ scaling
controls scaling (ignored if explicit SCALE argument present)
<0 chosen by the specified solver with its own scaling-selected value -scaling
0 No scaling
1 Scaling using HSL’s MC64
2 Scaling using HSL’s MC77 based on the row one-norm
3 Scaling using HSL’s MC77 based on the row infinity-norm
ipc_ scale_maxit
the number of scaling iterations performed (default 10 used if .scale_maxit < 0)
rpc_ scale_thresh
the scaling iteration stops as soon as the row/column norms are less than 1+/-.scale_thresh
rpc_ relative_pivot_tolerance
pivot threshold
rpc_ minimum_pivot_tolerance
smallest permitted relative pivot threshold
rpc_ absolute_pivot_tolerance
any pivot smaller than this is considered zero
rpc_ zero_tolerance
any entry smaller than this is considered zero
rpc_ zero_pivot_tolerance
any pivot smaller than this is considered zero for positive-definite solvers
rpc_ negative_pivot_tolerance
any pivot smaller than this is considered to be negative for p-d solvers
rpc_ static_pivot_tolerance
used for setting static pivot level
rpc_ static_level_switch
used for switch to static
rpc_ consistency_tolerance
used to determine whether a system is consistent when seeking a Fredholm alternative
ipc_ max_iterative_refinements
maximum number of iterative refinements allowed
rpc_ acceptable_residual_relative
refinement will cease as soon as the residual ||Ax-b|| falls below max( acceptable_residual_relative * ||b||, acceptable_residual_absolute )
rpc_ acceptable_residual_absolute
see acceptable_residual_relative
bool multiple_rhs
set multiple_rhs to true if there is a possibility that the solver will be required to solve systems with more than one right-hand side. More efficient execution may be possible when multiple_rhs is false.
bool generate_matrix_file
set generate_matrix_file to true if a file describing the current matrix is to be generated
ipc_ matrix_file_device
specifies the unit number to write the input matrix (in co-ordinate form)
char matrix_file_name[31]
name of generated matrix file containing input problem
char out_of_core_directory[401]
directory name for out of core factorization and additional real workspace in the indefinite case, respectively
char out_of_core_integer_factor_file[401]
out of core superfile names for integer and real factor data, real works and additional real workspace in the indefinite case, respectively
char out_of_core_real_factor_file[401]
see out_of_core_integer_factor_file
char out_of_core_real_work_file[401]
see out_of_core_integer_factor_file
char out_of_core_indefinite_file[401]
see out_of_core_integer_factor_file
char out_of_core_restart_file[501]
see out_of_core_integer_factor_file
char prefix[31]
all output lines will be prefixed by prefix(2:LEN(TRIM(.prefix))-1) where prefix contains the required string enclosed in quotes, e.g. “string” or ‘string’
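The stopping rule described for acceptable_residual_relative and acceptable_residual_absolute can be sketched as a small predicate. This is an illustration rather than the package's code: the function name `residual_acceptable` is hypothetical, the norms are taken as given because the text does not say which norm is used, and treating equality as acceptable is an assumption.

```c
#include <stdbool.h>

/* Larger of two values. */
static double max2(double a, double b)
{
    return a > b ? a : b;
}

/* True once ||Ax-b|| is no larger than
   max(relative * ||b||, absolute); accepting equality is an assumption. */
static bool residual_acceptable(double residual_norm, double b_norm,
                                double relative, double absolute)
{
    return residual_norm <= max2(relative * b_norm, absolute);
}
```

In a refinement loop such a test would be evaluated after each correction, with the loop also bounded by max_iterative_refinements.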
sls_time_type structure#
#include <galahad_sls.h>

struct sls_time_type {
    // fields
    rpc_ total;
    rpc_ analyse;
    rpc_ factorize;
    rpc_ solve;
    rpc_ order_external;
    rpc_ analyse_external;
    rpc_ factorize_external;
    rpc_ solve_external;
    rpc_ clock_total;
    rpc_ clock_analyse;
    rpc_ clock_factorize;
    rpc_ clock_solve;
    rpc_ clock_order_external;
    rpc_ clock_analyse_external;
    rpc_ clock_factorize_external;
    rpc_ clock_solve_external;
};
detailed documentation#
time derived type as a C struct
components#
rpc_ total
the total cpu time spent in the package
rpc_ analyse
the total cpu time spent in the analysis phase
rpc_ factorize
the total cpu time spent in the factorization phase
rpc_ solve
the total cpu time spent in the solve phases
rpc_ order_external
the total cpu time spent by the external solver in the ordering phase
rpc_ analyse_external
the total cpu time spent by the external solver in the analysis phase
rpc_ factorize_external
the total cpu time spent by the external solver in the factorization phase
rpc_ solve_external
the total cpu time spent by the external solver in the solve phases
rpc_ clock_total
the total clock time spent in the package
rpc_ clock_analyse
the total clock time spent in the analysis phase
rpc_ clock_factorize
the total clock time spent in the factorization phase
rpc_ clock_solve
the total clock time spent in the solve phases
rpc_ clock_order_external
the total clock time spent by the external solver in the ordering phase
rpc_ clock_analyse_external
the total clock time spent by the external solver in the analysis phase
rpc_ clock_factorize_external
the total clock time spent by the external solver in the factorization phase
rpc_ clock_solve_external
the total clock time spent by the external solver in the solve phases
sls_inform_type structure#
#include <galahad_sls.h>

struct sls_inform_type {
    // fields
    ipc_ status;
    ipc_ alloc_status;
    char bad_alloc[81];
    ipc_ more_info;
    ipc_ entries;
    ipc_ out_of_range;
    ipc_ duplicates;
    ipc_ upper;
    ipc_ missing_diagonals;
    ipc_ max_depth_assembly_tree;
    ipc_ nodes_assembly_tree;
    int64_t real_size_desirable;
    int64_t integer_size_desirable;
    int64_t real_size_necessary;
    int64_t integer_size_necessary;
    int64_t real_size_factors;
    int64_t integer_size_factors;
    int64_t entries_in_factors;
    ipc_ max_task_pool_size;
    ipc_ max_front_size;
    ipc_ compresses_real;
    ipc_ compresses_integer;
    ipc_ two_by_two_pivots;
    ipc_ semi_bandwidth;
    ipc_ delayed_pivots;
    ipc_ pivot_sign_changes;
    ipc_ static_pivots;
    ipc_ first_modified_pivot;
    ipc_ rank;
    ipc_ negative_eigenvalues;
    ipc_ num_zero;
    ipc_ iterative_refinements;
    int64_t flops_assembly;
    int64_t flops_elimination;
    int64_t flops_blas;
    rpc_ largest_modified_pivot;
    rpc_ minimum_scaling_factor;
    rpc_ maximum_scaling_factor;
    rpc_ condition_number_1;
    rpc_ condition_number_2;
    rpc_ backward_error_1;
    rpc_ backward_error_2;
    rpc_ forward_error;
    bool alternative;
    char solver[21];
    struct sls_time_type time;
    struct sils_ainfo_type sils_ainfo;
    struct sils_finfo_type sils_finfo;
    struct sils_sinfo_type sils_sinfo;
    struct ma57_ainfo ma57_ainfo;
    struct ma57_finfo ma57_finfo;
    struct ma57_sinfo ma57_sinfo;
    struct ma77_info ma77_info;
    struct ma86_info ma86_info;
    struct ma87_info ma87_info;
    struct ma97_info ma97_info;
    struct spral_ssids_inform ssids_inform;
    ipc_ mc61_info[10];
    rpc_ mc61_rinfo[15];
    struct mc64_info mc64_info;
    struct mc68_info mc68_info;
    ipc_ mc77_info[10];
    rpc_ mc77_rinfo[10];
    ipc_ mumps_error;
    ipc_ mumps_info[80];
    rpc_ mumps_rinfo[40];
    ipc_ pardiso_error;
    ipc_ pardiso_IPARM[64];
    rpc_ pardiso_DPARM[64];
    ipc_ mkl_pardiso_error;
    ipc_ mkl_pardiso_IPARM[64];
    ipc_ pastix_info;
    ipc_ wsmp_error;
    ipc_ wsmp_iparm[64];
    rpc_ wsmp_dparm[64];
    ipc_ mpi_ierr;
    ipc_ lapack_error;
};
detailed documentation#
inform derived type as a C struct
components#
ipc_ status
reported return status:
0 success
-1 allocation error
-2 deallocation error
-3 matrix data faulty (n < 1, ne < 0)
-20 allegedly positive-definite matrix is not
-29 unavailable option
-31 input order is not a permutation or is faulty in some other way
-32 more than control.max_integer_factor_size integer space required for factors
-33 more than control.max_real_factor_size real space required for factors
-40 not possible to alter the diagonals
-41 no access to permutation or pivot sequence used
-42 no access to diagonal perturbations
-43 direct-access file error
-50 solver-specific error; see the solver’s info parameter
-101 unknown solver
ipc_ alloc_status
STAT value after allocate failure.
char bad_alloc[81]
name of array which provoked an allocate failure
ipc_ more_info
further information on failure
ipc_ entries
number of entries
ipc_ out_of_range
number of indices out-of-range
ipc_ duplicates
number of duplicates
ipc_ upper
number of entries from the strict upper triangle
ipc_ missing_diagonals
number of missing diagonal entries for an allegedly-definite matrix
ipc_ max_depth_assembly_tree
maximum depth of the assembly tree
ipc_ nodes_assembly_tree
nodes in the assembly tree (= number of elimination steps)
int64_t real_size_desirable
desirable or actual size for real array for the factors and other data
int64_t integer_size_desirable
desirable or actual size for integer array for the factors and other data
int64_t real_size_necessary
necessary size for real array for the factors and other data
int64_t integer_size_necessary
necessary size for integer array for the factors and other data
int64_t real_size_factors
predicted or actual number of reals to hold factors
int64_t integer_size_factors
predicted or actual number of integers to hold factors
int64_t entries_in_factors
number of entries in factors
ipc_ max_task_pool_size
maximum number of tasks in the factorization task pool
ipc_ max_front_size
forecast or actual size of largest front
ipc_ compresses_real
number of compresses of real data
ipc_ compresses_integer
number of compresses of integer data
ipc_ two_by_two_pivots
number of 2x2 pivots
ipc_ semi_bandwidth
semi-bandwidth of matrix following bandwidth reduction
ipc_ delayed_pivots
number of delayed pivots (total)
ipc_ pivot_sign_changes
number of pivot sign changes if no pivoting is used successfully
ipc_ static_pivots
number of static pivots chosen
ipc_ first_modified_pivot
first pivot modification when static pivoting
ipc_ rank
estimated rank of the matrix
ipc_ negative_eigenvalues
number of negative eigenvalues
ipc_ num_zero
number of pivots that are considered zero (and ignored)
ipc_ iterative_refinements
number of iterative refinements performed
int64_t flops_assembly
anticipated or actual number of floating-point operations in assembly
int64_t flops_elimination
anticipated or actual number of floating-point operations in elimination
int64_t flops_blas
additional number of floating-point operations for BLAS
rpc_ largest_modified_pivot
largest diagonal modification when static pivoting or ensuring definiteness
rpc_ minimum_scaling_factor
minimum scaling factor
rpc_ maximum_scaling_factor
maximum scaling factor
rpc_ condition_number_1
estimate of the condition number of the matrix (category 1 equations)
rpc_ condition_number_2
estimate of the condition number of the matrix (category 2 equations)
rpc_ backward_error_1
estimate of the backward error (category 1 equations)
rpc_ backward_error_2
estimate of the backward error (category 2 equations)
rpc_ forward_error
estimate of forward error
bool alternative
has an “alternative” \(y\): \(A y = 0\) and \(y^T b > 0\) been found when trying to solve \(A x = b\)?
char solver[21]
name of external solver used to factorize and solve
struct sls_time_type time
timings (see above)
struct sils_ainfo_type sils_ainfo
the output structure from sils
struct sils_finfo_type sils_finfo
see sils_ainfo
struct sils_sinfo_type sils_sinfo
see sils_ainfo
struct ma57_ainfo ma57_ainfo
the output structure from ma57
struct ma57_finfo ma57_finfo
see ma57_ainfo
struct ma57_sinfo ma57_sinfo
see ma57_ainfo
struct ma77_info ma77_info
the output structure from ma77
struct ma86_info ma86_info
the output structure from ma86
struct ma87_info ma87_info
the output structure from ma87
struct ma97_info ma97_info
the output structure from ma97
struct spral_ssids_inform ssids_inform
the output structure from ssids
ipc_ mc61_info[10]
the integer and real output arrays from mc61
rpc_ mc61_rinfo[15]
see mc61_info
struct mc64_info mc64_info
the output structure from mc64
struct mc68_info mc68_info
the output structure from mc68
ipc_ mc77_info[10]
the integer output array from mc77
rpc_ mc77_rinfo[10]
the real output status from mc77
ipc_ mumps_error
the output scalars and arrays from mumps
ipc_ mumps_info[80]
see mumps_error
rpc_ mumps_rinfo[40]
see mumps_error
ipc_ pardiso_error
the output scalars and arrays from pardiso
ipc_ pardiso_IPARM[64]
see pardiso_error
rpc_ pardiso_DPARM[64]
see pardiso_error
ipc_ mkl_pardiso_error
the output scalars and arrays from mkl_pardiso
ipc_ mkl_pardiso_IPARM[64]
see mkl_pardiso_error
ipc_ pastix_info
the output flag from pastix
ipc_ wsmp_error
the output scalars and arrays from wsmp
ipc_ wsmp_iparm[64]
see wsmp_error
rpc_ wsmp_dparm[64]
see wsmp_error
ipc_ mpi_ierr
the output flag from MPI routines
ipc_ lapack_error
the output flag from LAPACK routines
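Several of the integer fields above (two_by_two_pivots, negative_eigenvalues, num_zero and rank) describe the block diagonal factor \(D\). As a rough illustration of how such counts relate to \(D\), and not a description of how sls or any underlying solver computes them, the following sketch derives them for a \(D\) stored as a diagonal plus a sub-diagonal. The type inertia_summary, the function block_diagonal_inertia, the storage convention and the tolerance zero_tol are all assumptions made for this example.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical summary of a block diagonal D; not a GALAHAD type. */
typedef struct {
    int negative;   /* number of negative eigenvalues of D */
    int zero;       /* number of pivots treated as zero */
    int two_by_two; /* number of 2x2 blocks */
    int rank;       /* n minus the number of zero pivots */
} inertia_summary;

/* d holds the diagonal of D (length n) and e its sub-diagonal
   (length n-1). A nonzero e[i] is assumed to mark a 2x2 block
   occupying rows i and i+1. */
inertia_summary block_diagonal_inertia(const double *d, const double *e,
                                       int n, double zero_tol)
{
    inertia_summary s = {0, 0, 0, 0};
    int i = 0;
    assert(n >= 0 && d != NULL);
    while (i < n) {
        if (i + 1 < n && e[i] != 0.0) {
            /* 2x2 block [[a, b], [b, c]]: the signs of its two
               eigenvalues follow from its determinant and trace */
            double a = d[i], b = e[i], c = d[i + 1];
            double det = a * c - b * b, tr = a + c;
            s.two_by_two++;
            if (det < 0.0) {
                s.negative += 1;            /* one of each sign */
            } else if (det > 0.0) {
                if (tr < 0.0) s.negative += 2;
            } else {
                s.zero += 1;                /* singular block */
                if (tr < 0.0) s.negative += 1;
            }
            i += 2;
        } else {
            /* 1x1 block: the eigenvalue is the pivot itself */
            if (fabs(d[i]) <= zero_tol) s.zero++;
            else if (d[i] < 0.0) s.negative++;
            i++;
        }
    }
    s.rank = n - s.zero;
    return s;
}
```

Since \(A\) and \(D\) have the same inertia when \(A = L D L^T\) with \(L\) nonsingular, counts of this kind say how many eigenvalues of \(A\) are negative or zero without computing the eigenvalues themselves.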
example calls#
This is an example of how to use the package to solve a symmetric system of linear equations; the code is available in $GALAHAD/src/sls/C/slst.c . A variety of supported matrix storage formats are shown.
Notice that C-style indexing is used, and that this is flagged by setting control.f_indexing to false. The floating-point type rpc_ is set in galahad_precision.h to double by default, but to float if the preprocessor variable SINGLE is defined. Similarly, the integer type ipc_ from galahad_precision.h is set to int by default, but to int64_t if the preprocessor variable INTEGER_64 is defined.
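The type selection described above can be pictured with a few preprocessor conditionals. This is only a sketch of the idea: the real galahad_precision.h may be organised differently (for example with macros rather than typedefs), and the helper precision_widths is a hypothetical name introduced here to make the result observable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch only: mimics the behaviour described in the text, not the
   actual contents of galahad_precision.h. */
#ifdef INTEGER_64
typedef int64_t ipc_;   /* 64-bit integers when INTEGER_64 is defined */
#else
typedef int ipc_;       /* default integer type */
#endif

#ifdef SINGLE
typedef float rpc_;     /* single precision when SINGLE is defined */
#else
typedef double rpc_;    /* default floating-point type */
#endif

/* Hypothetical helper: report the widths chosen at compile time. */
void precision_widths(size_t *real_bytes, size_t *int_bytes)
{
    assert(real_bytes != NULL && int_bytes != NULL);
    *real_bytes = sizeof(rpc_);
    *int_bytes = sizeof(ipc_);
}
```

Compiling the same source with, say, -DSINGLE or -DINTEGER_64 on the compiler command line would then switch every rpc_ or ipc_ declaration at once, which is why the examples declare their data with these names rather than with double and int.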
/* slst.c */
/* Full test for the SLS C interface using C sparse matrix indexing */
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <float.h>
#include "galahad_precision.h"
#include "galahad_cfunctions.h"
#include "galahad_sls.h"
ipc_ maxabsarray(rpc_ a[], ipc_ n, rpc_ *maxabs);
int main(void) {
    // Derived types
    void *data;
    struct sls_control_type control;
    struct sls_inform_type inform;
    // Set problem data
    ipc_ n = 5; // dimension of A
    ipc_ ne = 7; // number of entries of A
    ipc_ dense_ne = 15; // number of elements of A as a dense matrix
    ipc_ row[] = {0, 1, 1, 2, 2, 3, 4}; // row indices, NB lower triangle
    ipc_ col[] = {0, 0, 4, 1, 2, 2, 4}; // column indices
    ipc_ ptr[] = {0, 1, 3, 5, 6, 7}; // pointers to indices
    rpc_ val[] = {2.0, 3.0, 6.0, 4.0, 1.0, 5.0, 1.0}; // values
    rpc_ dense[] = {2.0, 3.0, 0.0, 0.0, 4.0, 1.0, 0.0,
                    0.0, 5.0, 0.0, 0.0, 6.0, 0.0, 0.0, 1.0};
    rpc_ rhs[] = {8.0, 45.0, 31.0, 15.0, 17.0};
    rpc_ sol[] = {1.0, 2.0, 3.0, 4.0, 5.0};
    ipc_ i, status;
    rpc_ x[n];
    rpc_ error[n];
    rpc_ norm_residual;
    rpc_ good_x = pow( DBL_EPSILON, 0.3333 );
    printf(" C sparse matrix indexing\n\n");
    printf(" basic tests of storage formats\n\n");
    printf(" storage RHS refine partial\n");
    for( ipc_ d=1; d <= 3; d++){
        // Initialize SLS - use the sytr solver
        sls_initialize( "sytr", &data, &control, &status );
        // Set user-defined control options
        control.f_indexing = false; // C sparse matrix indexing
        switch(d){ // import matrix data and factorize
            case 1: // sparse co-ordinate storage
                printf(" coordinate ");
                sls_analyse_matrix( &control, &data, &status, n,
                                    "coordinate", ne, row, col, NULL );
                sls_factorize_matrix( &data, &status, ne, val );
                break;
            case 2: // sparse by rows
                printf(" sparse by rows ");
                sls_analyse_matrix( &control, &data, &status, n,
                                    "sparse_by_rows", ne, NULL, col, ptr );
                sls_factorize_matrix( &data, &status, ne, val );
                break;
            case 3: // dense
                printf(" dense ");
                sls_analyse_matrix( &control, &data, &status, n,
                                    "dense", ne, NULL, NULL, NULL );
                sls_factorize_matrix( &data, &status, dense_ne, dense );
                break;
        }
        // Set right-hand side and solve the system
        for(i=0; i<n; i++) x[i] = rhs[i];
        sls_solve_system( &data, &status, n, x );
        sls_information( &data, &inform, &status );
        if(inform.status == 0){
            for(i=0; i<n; i++) error[i] = x[i]-sol[i];
            status = maxabsarray( error, n, &norm_residual );
            if(norm_residual < good_x){
                printf(" ok ");
            }else{
                printf(" fail ");
            }
        }else{
            printf(" SLS_solve exit status = %1" i_ipc_ "\n", inform.status);
        }
        //printf("sol: ");
        //for( ipc_ i = 0; i < n; i++) printf("%f ", x[i]);
        // resolve, this time using iterative refinement
        control.max_iterative_refinements = 1;
        sls_reset_control( &control, &data, &status );
        for(i=0; i<n; i++) x[i] = rhs[i];
        sls_solve_system( &data, &status, n, x );
        sls_information( &data, &inform, &status );
        if(inform.status == 0){
            for(i=0; i<n; i++) error[i] = x[i]-sol[i];
            status = maxabsarray( error, n, &norm_residual );
            if(norm_residual < good_x){
                printf(" ok ");
            }else{
                printf(" fail ");
            }
        }else{
            printf(" SLS_solve exit status = %1" i_ipc_ "\n", inform.status);
        }
        // obtain the solution by part solves
        for(i=0; i<n; i++) x[i] = rhs[i];
        sls_partial_solve_system( "L", &data, &status, n, x );
        sls_partial_solve_system( "D", &data, &status, n, x );
        sls_partial_solve_system( "U", &data, &status, n, x );
        sls_information( &data, &inform, &status );
        if(inform.status == 0){
            for(i=0; i<n; i++) error[i] = x[i]-sol[i];
            status = maxabsarray( error, n, &norm_residual );
            if(norm_residual < good_x){
                printf(" ok ");
            }else{
                printf(" fail ");
            }
        }else{
            printf(" SLS_solve exit status = %1" i_ipc_ "\n", inform.status);
        }
        // Delete internal workspace
        sls_terminate( &data, &control, &inform );
        printf("\n");
    }
}
/* Store the largest absolute value of a[0..n-1] in *maxabs; returns 0 */
ipc_ maxabsarray(rpc_ a[], ipc_ n, rpc_ *maxabs)
{
    ipc_ i;
    rpc_ b, max;
    max=fabs(a[0]);
    for(i=1; i<n; i++)
    {
        b = fabs(a[i]);
        if(max<b)
            max=b;
    }
    *maxabs=max;
    return 0;
}
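The coordinate, sparse_by_rows and dense arrays in slst.c are meant to describe one and the same matrix. As a way of checking that reading, the sketch below forms the product \(A x\) from each representation using plain int and double in place of ipc_ and rpc_. The three function names are hypothetical, and the dense layout is assumed to be the lower triangle packed by rows, with entry \((i,j)\), \(j \leq i\), at position i*(i+1)/2+j, which is what the 15-entry dense array appears to hold.

```c
#include <assert.h>

/* y = A x, with the symmetric A given by one triangle in coordinate
   form. Each off-diagonal entry contributes to two rows of y. */
void sym_matvec_coordinate(int n, int ne, const int *row, const int *col,
                           const double *val, const double *x, double *y)
{
    for (int i = 0; i < n; i++) y[i] = 0.0;
    for (int k = 0; k < ne; k++) {
        int i = row[k], j = col[k];
        assert(i >= 0 && i < n && j >= 0 && j < n);
        y[i] += val[k] * x[j];
        if (i != j) y[j] += val[k] * x[i];
    }
}

/* Same product with row-wise storage: the entries of row i are
   those numbered ptr[i], ..., ptr[i+1]-1. */
void sym_matvec_by_rows(int n, const int *ptr, const int *col,
                        const double *val, const double *x, double *y)
{
    for (int i = 0; i < n; i++) y[i] = 0.0;
    for (int i = 0; i < n; i++) {
        for (int k = ptr[i]; k < ptr[i + 1]; k++) {
            int j = col[k];
            assert(j >= 0 && j < n);
            y[i] += val[k] * x[j];
            if (i != j) y[j] += val[k] * x[i];
        }
    }
}

/* Same product with the lower triangle packed by rows (assumed
   layout: entry (i,j), j <= i, sits at dense[i*(i+1)/2 + j]). */
void sym_matvec_dense(int n, const double *dense, const double *x,
                      double *y)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++) y[i] = 0.0;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j <= i; j++) {
            double a = dense[i * (i + 1) / 2 + j];
            y[i] += a * x[j];
            if (i != j) y[j] += a * x[i];
        }
    }
}
```

Applied to the arrays of slst.c with x set to sol, each of the three routines should return the vector rhs, which is also why the test compares the computed solution with sol.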
The slst.c example is repeated here with Fortran-style indexing; the code is available in $GALAHAD/src/sls/C/slstf.c .
/* slstf.c */
/* Full test for the SLS C interface using Fortran sparse matrix indexing */
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <float.h>
#include "galahad_precision.h"
#include "galahad_cfunctions.h"
#include "galahad_sls.h"
ipc_ maxabsarray(rpc_ a[], ipc_ n, rpc_ *maxabs);
int main(void) {
    // Derived types
    void *data;
    struct sls_control_type control;
    struct sls_inform_type inform;
    // Set problem data
    ipc_ n = 5; // dimension of A
    ipc_ ne = 7; // number of entries of A
    ipc_ dense_ne = 15; // number of elements of A as a dense matrix
    ipc_ row[] = {1, 2, 2, 3, 3, 4, 5}; // row indices, NB lower triangle
    ipc_ col[] = {1, 1, 5, 2, 3, 3, 5}; // column indices
    ipc_ ptr[] = {1, 2, 4, 6, 7, 8}; // pointers to indices
    rpc_ val[] = {2.0, 3.0, 6.0, 4.0, 1.0, 5.0, 1.0}; // values
    rpc_ dense[] = {2.0, 3.0, 0.0, 0.0, 4.0, 1.0, 0.0,
                    0.0, 5.0, 0.0, 0.0, 6.0, 0.0, 0.0, 1.0};
    rpc_ rhs[] = {8.0, 45.0, 31.0, 15.0, 17.0};
    rpc_ sol[] = {1.0, 2.0, 3.0, 4.0, 5.0};
    ipc_ i, status;
    rpc_ x[n];
    rpc_ error[n];
    rpc_ norm_residual;
    rpc_ good_x = pow( DBL_EPSILON, 0.3333 );
    printf(" Fortran sparse matrix indexing\n\n");
    printf(" basic tests of storage formats\n\n");
    printf(" storage RHS refine partial\n");
    for( ipc_ d=1; d <= 3; d++){
        // Initialize SLS - use the sytr solver
        sls_initialize( "sytr", &data, &control, &status );
        // Set user-defined control options
        control.f_indexing = true; // Fortran sparse matrix indexing
        switch(d){ // import matrix data and factorize
            case 1: // sparse co-ordinate storage
                printf(" coordinate ");
                sls_analyse_matrix( &control, &data, &status, n,
                                    "coordinate", ne, row, col, NULL );
                sls_factorize_matrix( &data, &status, ne, val );
                break;
            case 2: // sparse by rows
                printf(" sparse by rows ");
                sls_analyse_matrix( &control, &data, &status, n,
                                    "sparse_by_rows", ne, NULL, col, ptr );
                sls_factorize_matrix( &data, &status, ne, val );
                break;
            case 3: // dense
                printf(" dense ");
                sls_analyse_matrix( &control, &data, &status, n,
                                    "dense", ne, NULL, NULL, NULL );
                sls_factorize_matrix( &data, &status, dense_ne, dense );
                break;
        }
        // Set right-hand side and solve the system
        for(i=0; i<n; i++) x[i] = rhs[i];
        sls_solve_system( &data, &status, n, x );
        sls_information( &data, &inform, &status );
        if(inform.status == 0){
            for(i=0; i<n; i++) error[i] = x[i]-sol[i];
            status = maxabsarray( error, n, &norm_residual );
            if(norm_residual < good_x){
                printf(" ok ");
            }else{
                printf(" fail ");
            }
        }else{
            printf(" SLS_solve exit status = %1" i_ipc_ "\n", inform.status);
        }
        //printf("sol: ");
        //for( ipc_ i = 0; i < n; i++) printf("%f ", x[i]);
        // resolve, this time using iterative refinement
        control.max_iterative_refinements = 1;
        sls_reset_control( &control, &data, &status );
        for(i=0; i<n; i++) x[i] = rhs[i];
        sls_solve_system( &data, &status, n, x );
        sls_information( &data, &inform, &status );
        if(inform.status == 0){
            for(i=0; i<n; i++) error[i] = x[i]-sol[i];
            status = maxabsarray( error, n, &norm_residual );
            if(norm_residual < good_x){
                printf(" ok ");
            }else{
                printf(" fail ");
            }
        }else{
            printf(" SLS_solve exit status = %1" i_ipc_ "\n", inform.status);
        }
        // obtain the solution by part solves
        for(i=0; i<n; i++) x[i] = rhs[i];
        sls_partial_solve_system( "L", &data, &status, n, x );
        sls_partial_solve_system( "D", &data, &status, n, x );
        sls_partial_solve_system( "U", &data, &status, n, x );
        sls_information( &data, &inform, &status );
        if(inform.status == 0){
            for(i=0; i<n; i++) error[i] = x[i]-sol[i];
            status = maxabsarray( error, n, &norm_residual );
            if(norm_residual < good_x){
                printf(" ok ");
            }else{
                printf(" fail ");
            }
        }else{
            printf(" SLS_solve exit status = %1" i_ipc_ "\n", inform.status);
        }
        // Delete internal workspace
        sls_terminate( &data, &control, &inform );
        printf("\n");
    }
}
/* Store the largest absolute value of a[0..n-1] in *maxabs; returns 0 */
ipc_ maxabsarray(rpc_ a[], ipc_ n, rpc_ *maxabs)
{
    ipc_ i;
    rpc_ b, max;
    max=fabs(a[0]);
    for(i=1; i<n; i++)
    {
        b = fabs(a[i]);
        if(max<b)
            max=b;
    }
    *maxabs=max;
    return 0;
}
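Comparing the two listings, the row, col and ptr arrays of slstf.c are those of slst.c with one added to every entry, while val, dense and rhs are unchanged. A caller holding 0-based arrays could therefore, as one possible approach, shift them before setting control.f_indexing to true. The helper shift_indices below is a hypothetical name introduced for this sketch and is not part of the sls interface.

```c
#include <assert.h>
#include <stddef.h>

/* Add offset to every entry of an index array in place:
   offset = +1 maps C (0-based) indices to Fortran (1-based) ones,
   offset = -1 maps them back. */
void shift_indices(int *idx, int len, int offset)
{
    assert(len >= 0 && (idx != NULL || len == 0));
    for (int k = 0; k < len; k++) idx[k] += offset;
}
```

Note that ptr has n+1 entries and must be shifted along with row and col, since it holds positions within the index arrays and these are also counted from one under Fortran indexing.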