GALAHAD SLS package#

purpose#

The sls package solves dense or sparse symmetric systems of linear equations using variants of Gaussian elimination. Given a sparse symmetric matrix \(A = \{ a_{ij} \}_{n \times n}\), and an \(n\)-vector \(b\) or a matrix \(B = \{ b_{ij} \}_{n \times r}\), the package solves the system \(A x = b\) or the system \(A X = B\). The matrix \(A\) need not be definite.

The package provides a common interface to a variety of well-known solvers from HSL and elsewhere. Currently supported solvers include MA27/SILS, HSL_MA57, HSL_MA77, HSL_MA86, HSL_MA87 and HSL_MA97 from HSL, SSIDS from SPRAL, MUMPS from Mumps Technologies, PARDISO both from the Pardiso Project and Intel’s MKL, PaStiX from Inria, and WSMP from the IBM alphaWorks, as well as POTR, SYTR and PBTR from LAPACK. Note that, with the exception of SSIDS and the Netlib reference LAPACK codes, the solvers themselves do not form part of this package and must be obtained and linked separately. Dummy instances are provided for solvers that are unavailable. Also note that additional flexibility may be obtained by calling the solvers directly rather than via this package.

terminology#

The solvers used each produce an \(L D L^T\) factorization of \(A\) or a perturbation thereof, where \(L\) is a permuted lower triangular matrix and \(D\) is a block diagonal matrix with blocks of order 1 and 2. It is convenient to write this factorization in the form

\[A + E = P L D L^T P^T,\]
where \(P\) is a permutation matrix and \(E\) is any diagonal perturbation introduced.
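
As a small worked illustration (an example chosen here, not taken from the package documentation), consider a \(2 \times 2\) indefinite matrix for which no permutation or perturbation is needed, so that \(P = I\) and \(E = 0\):

\[\begin{pmatrix} 2 & 1 \\ 1 & -1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ \tfrac{1}{2} & 1 \end{pmatrix} \begin{pmatrix} 2 & 0 \\ 0 & -\tfrac{3}{2} \end{pmatrix} \begin{pmatrix} 1 & \tfrac{1}{2} \\ 0 & 1 \end{pmatrix}.\]
Here \(D\) consists of two blocks of order 1. Its one positive and one negative diagonal entry reflect the fact that this matrix has one positive and one negative eigenvalue.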

supported solvers#

The key features of the external solvers supported by sls are given in the following table:

External solver characteristics#

| solver      | factorization      | indefinite \(A\) | out-of-core | parallelised         |
|-------------|--------------------|------------------|-------------|----------------------|
| SILS/MA27   | multifrontal       | yes              | no          | no                   |
| HSL_MA57    | multifrontal       | yes              | no          | no                   |
| HSL_MA77    | multifrontal       | yes              | yes         | OpenMP core          |
| HSL_MA86    | left-looking       | yes              | no          | OpenMP fully         |
| HSL_MA87    | left-looking       | no               | no          | OpenMP fully         |
| HSL_MA97    | multifrontal       | yes              | no          | OpenMP core          |
| SSIDS       | multifrontal       | yes              | no          | CUDA core            |
| MUMPS       | multifrontal       | yes              | optionally  | MPI                  |
| PARDISO     | left-right-looking | yes              | no          | OpenMP fully         |
| MKL_PARDISO | left-right-looking | yes              | optionally  | OpenMP fully         |
| PaStiX      | left-right-looking | yes              | no          | OpenMP fully         |
| WSMP        | left-right-looking | yes              | no          | OpenMP fully         |
| POTR        | dense              | no               | no          | with parallel LAPACK |
| SYTR        | dense              | yes              | no          | with parallel LAPACK |
| PBTR        | dense band         | no               | no          | with parallel LAPACK |

method#

Variants of sparse Gaussian elimination are used. See Section 4 of $GALAHAD/doc/sls.pdf for a brief description of the method employed and other details.

The solver SILS is available as part of GALAHAD and relies on the HSL Archive package MA27. To obtain HSL Archive packages, see

The solvers HSL_MA57, HSL_MA77, HSL_MA86, HSL_MA87 and HSL_MA97, the ordering packages MC61 and HSL_MC68, and the scaling packages HSL_MC64 and MC77 are all part of HSL 2011. To obtain HSL 2011 packages, see

The solver SSIDS is from the SPRAL sparse-matrix collection, and is available as part of GALAHAD.

The solver MUMPS is available from Mumps Technologies in France, and version 5.5.1 or above is sufficient. To obtain MUMPS, see

The solver PARDISO is available from the Pardiso Project; version 4.0.0 or above is required. To obtain PARDISO, see

The solver MKL PARDISO is available as part of Intel’s oneAPI Math Kernel Library (oneMKL). To obtain this version of PARDISO, see

The solver PaStiX is available from Inria in France, and version 6.2 or above is sufficient. To obtain PaStiX, see

The solver WSMP is available from the IBM alphaWorks; version 10.9 or above is required. To obtain WSMP, see

The solvers POTR, SYTR and PBTR are available as S/DPOTRF/S, S/DSYTRF/S and S/DPBTRF/S as part of LAPACK. Reference versions are provided by GALAHAD, but for good performance machine-tuned versions should be used.

Explicit sparsity re-orderings are obtained by calling the HSL package HSL_MC68. HSL_MC68, HSL_MA57 and PARDISO all optionally rely on the ordering package METIS (version 4) from the Karypis Lab. To obtain METIS, see

Bandwidth, profile and wavefront reduction are supported by calling HSL’s MC61.

The methods used are described in the user-documentation for

HSL 2011, A collection of Fortran codes for large-scale scientific computation (2011).

and papers

E. Agullo, P. R. Amestoy, A. Buttari, J.-Y. L’Excellent, A. Guermouche and F.-H. Rouet, “Robust memory-aware mappings for parallel multifrontal factorizations”. SIAM Journal on Scientific Computing 38(3) (2016) C256–C279,

P. R. Amestoy, I. S. Duff, J. Koster and J.-Y. L’Excellent, “A fully asynchronous multifrontal solver using distributed dynamic scheduling”. SIAM Journal on Matrix Analysis and Applications 23(1) (2001) 15–41,

A. Gupta, “WSMP: Watson Sparse Matrix Package Part I - direct solution of symmetric sparse systems”. IBM Research Report RC 21886, IBM T. J. Watson Research Center, NY 10598, USA (2010),

P. Henon, P. Ramet and J. Roman, “PaStiX: A High-Performance Parallel Direct Solver for Sparse Symmetric Definite Systems”. Parallel Computing 28(2) (2002) 301–321,

J.D. Hogg, E. Ovtchinnikov and J.A. Scott, “A sparse symmetric indefinite direct solver for GPU architectures”. ACM Transactions on Mathematical Software 42(1) (2014) Article 1,

O. Schenk and K. Gartner, “Solving Unsymmetric Sparse Systems of Linear Equations with PARDISO”. Journal of Future Generation Computer Systems 20(3) (2004) 475–487, and

O. Schenk and K. Gartner, “On fast factorization pivoting methods for symmetric indefinite systems”. Electronic Transactions on Numerical Analysis 23 (2006) 158–179.

matrix storage#

The symmetric \(n\) by \(n\) matrix \(A\) may be presented and stored in a variety of formats. Crucially, symmetry is exploited by storing only values from the lower triangular part (i.e., those entries that lie on or below the leading diagonal).

Dense storage format: The matrix \(A\) is stored as a compact dense matrix by rows, that is, the values of the entries of each row in turn are stored in order within an appropriate real one-dimensional array. Since \(A\) is symmetric, only the lower triangular part (that is the part \(A_{ij}\) for \(1 \leq j \leq i \leq n\)) need be held. In this case the lower triangle should be stored by rows, that is component \((i-1) * i / 2 + j\) of the storage array A_val will hold the value \(A_{ij}\) (and, by symmetry, \(A_{ji}\)) for \(1 \leq j \leq i \leq n\). The string A_type = ‘dense’ should be specified.
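
The index formula above can be checked with a short, self-contained Julia sketch. The helper name pack_lower_by_rows and the sample matrix are illustrative assumptions and are not part of the sls interface; 1-based indexing is assumed throughout.

```julia
# Hypothetical helper (not part of sls): pack the lower triangle of a
# symmetric matrix by rows into a one-dimensional array, 1-based indexing.
function pack_lower_by_rows(A::AbstractMatrix{T}) where {T}
    n = size(A, 1)
    A_val = Vector{T}(undef, div(n * (n + 1), 2))
    for i in 1:n, j in 1:i
        # component (i-1)*i/2 + j holds A[i,j] for 1 <= j <= i <= n
        A_val[div((i - 1) * i, 2) + j] = A[i, j]
    end
    return A_val
end

A = [1.0 2.0 4.0;
     2.0 3.0 5.0;
     4.0 5.0 6.0]
A_val = pack_lower_by_rows(A)   # row 1, then row 2, then row 3 of the lower triangle
```

For an \(n\) by \(n\) matrix the packed array has \(n (n+1) / 2\) components, here 6.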

Sparse co-ordinate storage format: Only the nonzero entries of the matrices are stored. For the \(l\)-th entry, \(1 \leq l \leq ne\), of \(A\), its row index i, column index j and value \(A_{ij}\), \(1 \leq j \leq i \leq n\), are stored as the \(l\)-th components of the integer arrays A_row and A_col and real array A_val, respectively, while the number of nonzeros is recorded as A_ne = \(ne\). Note that only the entries in the lower triangle should be stored. The string A_type = ‘coordinate’ should be specified.
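
As a minimal sketch of the co-ordinate scheme, the following Julia code extracts the lower-triangular nonzeros of a small dense matrix. The helper name lower_coordinate and the sample matrix are illustrative assumptions, not part of the sls interface; indices are 1-based.

```julia
# Hypothetical helper (not part of sls): collect the nonzero entries of the
# lower triangle as (row, column, value) triples with 1-based indices.
function lower_coordinate(A::AbstractMatrix{T}) where {T}
    n = size(A, 1)
    A_row = Int32[]
    A_col = Int32[]
    A_val = T[]
    for i in 1:n, j in 1:i
        if A[i, j] != zero(T)
            push!(A_row, i)
            push!(A_col, j)
            push!(A_val, A[i, j])
        end
    end
    return A_row, A_col, A_val
end

A = [2.0 1.0 0.0;
     1.0 3.0 0.0;
     0.0 0.0 4.0]
A_row, A_col, A_val = lower_coordinate(A)
A_ne = length(A_val)   # number of stored lower-triangular nonzeros
```

The entries may be stored in any order in this scheme; the row-by-row order used here is simply the order in which the loop visits them.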

Sparse row-wise storage format: Again only the nonzero entries are stored, but this time they are ordered so that those in row i appear directly before those in row i+1. For the i-th row of \(A\) the i-th component of the integer array A_ptr holds the position of the first entry in this row, while A_ptr(n+1) holds the total number of entries plus one. The column indices j, \(1 \leq j \leq i\), and values \(A_{ij}\) of the entries in the i-th row are stored in components l = A_ptr(i), …, A_ptr(i+1)-1 of the integer array A_col, and real array A_val, respectively. Note that as before only the entries in the lower triangle should be stored. For sparse matrices, this scheme almost always requires less storage than its predecessor. The string A_type = ‘sparse_by_rows’ should be specified.
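
The row-wise scheme can be sketched in the same way. The helper name lower_sparse_by_rows and the sample matrix are illustrative assumptions, not part of the sls interface; indices and pointers are 1-based.

```julia
# Hypothetical helper (not part of sls): build the row pointers, column
# indices and values of the lower triangle, stored row by row (1-based).
function lower_sparse_by_rows(A::AbstractMatrix{T}) where {T}
    n = size(A, 1)
    A_ptr = Vector{Int32}(undef, n + 1)
    A_col = Int32[]
    A_val = T[]
    for i in 1:n
        A_ptr[i] = length(A_val) + 1        # position of the first entry in row i
        for j in 1:i
            if A[i, j] != zero(T)
                push!(A_col, j)
                push!(A_val, A[i, j])
            end
        end
    end
    A_ptr[n + 1] = length(A_val) + 1        # total number of entries plus one
    return A_ptr, A_col, A_val
end

A = [2.0 1.0 0.0;
     1.0 3.0 0.0;
     0.0 0.0 4.0]
A_ptr, A_col, A_val = lower_sparse_by_rows(A)
```

Compared with the co-ordinate scheme, the row indices are replaced by the n+1 row pointers, which is where the storage saving comes from when the number of entries exceeds n+1.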

Diagonal storage format: If \(A\) is diagonal (i.e., \(A_{ij} = 0\) for all \(1 \leq i \neq j \leq n\)) only the diagonal entries \(A_{ii}\), \(1 \leq i \leq n\) need be stored, and the first n components of the array A_val may be used for the purpose. The string A_type = ‘diagonal’ should be specified.

Multiples of the identity storage format: If \(A\) is a multiple of the identity matrix (i.e., \(A = \alpha I\) where \(I\) is the n by n identity matrix and \(\alpha\) is a scalar), it suffices to store \(\alpha\) as the first component of A_val. The string A_type = ‘scaled_identity’ should be specified.

The identity matrix format: If \(A\) is the identity matrix, no values need be stored. The string A_type = ‘identity’ should be specified.

The zero matrix format: The same is true if \(A\) is the zero matrix, but now the string A_type = ‘zero’ or ‘none’ should be specified.

introduction to function calls#

To solve a given problem, functions from the sls package must be called in the following order:

  • sls_initialize - provide default control parameters and set up initial data structures

  • sls_read_specfile (optional) - override control values by reading replacement values from a file

  • sls_analyse_matrix - set up matrix data structures and analyse the structure to choose a suitable order for factorization

  • sls_reset_control (optional) - possibly change control parameters if a sequence of problems is being solved

  • sls_factorize_matrix - form and factorize the matrix \(A\)

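  • one of

      • sls_solve_system - solve the linear system \(Ax=b\)

      • sls_partial_solve_system - solve a linear system \(Mx=b\) involving one of the factors \(M\) of \(A\)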

  • sls_information (optional) - recover information about the solution and solution process

  • sls_terminate - deallocate data structures

See the examples section for illustrations of use.

parametric real type T#

Below, the symbol T refers to a parametric real type that may be Float32 (single precision), Float64 (double precision) or, if supported, Float128 (quadruple precision).

callable functions#

    function sls_initialize(T, solver, data, control, status)

Select solver, set default control values and initialize private data

Parameters:

solver

is a one-dimensional array of type Vararg{Cchar} that specifies the solver package that should be used to factorize the matrix \(A\). It should be one of ‘sils’, ‘ma27’, ‘ma57’, ‘ma77’, ‘ma86’, ‘ma87’, ‘ma97’, ‘ssids’, ‘mumps’, ‘pardiso’, ‘mkl_pardiso’, ‘pastix’, ‘wsmp’, ‘potr’, ‘sytr’ or ‘pbtr’; lower or upper case variants are allowed.

data

holds private internal data

control

is a structure containing control information (see sls_control_type)

status

is a scalar variable of type Int32 that gives the exit status from the package. Possible values are:

  • 0

    The initialization was successful.

  • -26

    The requested solver is not available.

    function sls_read_specfile(T, control, specfile)

Read the content of a specification file, and assign values associated with given keywords to the corresponding control parameters. An in-depth discussion of specification files is available, and a detailed list of keywords with associated default values is provided in $GALAHAD/src/sls/SLS.template. See also Table 2.1 in the Fortran documentation provided in $GALAHAD/doc/sls.pdf for a list of how these keywords relate to the components of the control structure.

Parameters:

control

is a structure containing control information (see sls_control_type)

specfile

is a one-dimensional array of type Vararg{Cchar} that must give the name of the specification file

    function sls_analyse_matrix(T, control, data, status, n, type, ne, row, col, ptr)

Import structural matrix data into internal storage prior to solution

Parameters:

control

is a structure whose members provide control parameters for the remaining procedures (see sls_control_type)

data

holds private internal data

status

is a scalar variable of type Int32 that gives the exit status from the package.

Possible values are:

  • 0

    The import and analysis were conducted successfully.

  • -1

    An allocation error occurred. A message indicating the offending array is written on unit control.error, and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively.

  • -2

    A deallocation error occurred. A message indicating the offending array is written on unit control.error and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively.

  • -3

    The restriction n > 0, or the requirement that the matrix type must contain the relevant string ‘dense’, ‘coordinate’ or ‘sparse_by_rows’, has been violated.

  • -20

    The matrix is not positive definite while the solver used expected it to be.

  • -26

    The requested solver is not available.

  • -29

    This option is not available with this solver.

  • -32

    More than control.max_integer_factor_size words of internal integer storage are required for in-core factorization.

  • -34

    The package PARDISO failed; check the solver-specific information components inform.pardiso_IPARM and inform.pardiso_DPARM along with PARDISO’s documentation for more details.

  • -35

    The package WSMP failed; check the solver-specific information components inform.wsmp_iparm and inform.wsmp_dparm along with WSMP’s documentation for more details.

  • -36

    The scaling package HSL_MC64 failed; check the solver-specific information component inform.mc64_info along with HSL_MC64’s documentation for more details.

  • -37

    The scaling package MC77 failed; check the solver-specific information components inform.mc77_info and inform.mc77_rinfo along with MC77’s documentation for more details.

  • -43

    A direct-access file error occurred. See the value of inform.ma77_info.flag for more details.

  • -50

    A solver-specific error occurred; check the solver-specific information component of inform along with the solver’s documentation for more details.

n

is a scalar variable of type Int32 that holds the number of rows in the symmetric matrix \(A\).

type

is a one-dimensional array of type Vararg{Cchar} that specifies the symmetric storage scheme used for the matrix \(A\). It should be one of ‘coordinate’, ‘sparse_by_rows’ or ‘dense’; lower or upper case variants are allowed.

ne

is a scalar variable of type Int32 that holds the number of entries in the lower triangular part of \(A\) in the sparse co-ordinate storage scheme. It need not be set for any of the other schemes.

row

is a one-dimensional array of size ne and type Int32 that holds the row indices of the lower triangular part of \(A\) in the sparse co-ordinate storage scheme. It need not be set for any of the other three schemes, and in this case can be C_NULL.

col

is a one-dimensional array of size ne and type Int32 that holds the column indices of the lower triangular part of \(A\) in either the sparse co-ordinate, or the sparse row-wise storage scheme. It need not be set when the dense storage scheme is used, and in this case can be C_NULL.

ptr

is a one-dimensional array of size n+1 and type Int32 that holds the starting position of each row of the lower triangular part of \(A\), as well as the total number of entries, in the sparse row-wise storage scheme. It need not be set when the other schemes are used, and in this case can be C_NULL.

    function sls_reset_control(T, control, data, status)

Reset control parameters after import if required.

Parameters:

control

is a structure whose members provide control parameters for the remaining procedures (see sls_control_type)

data

holds private internal data

status

is a scalar variable of type Int32 that gives the exit status from the package. Possible values are:

  • 0

    The import was successful.

    function sls_factorize_matrix(T, data, status, ne, val)

Form and factorize the symmetric matrix \(A\).

Parameters:

data

holds private internal data

status

is a scalar variable of type Int32 that gives the exit status from the package.

Possible values are:

  • 0

    The factors were generated successfully.

  • -1

    An allocation error occurred. A message indicating the offending array is written on unit control.error, and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively.

  • -2

    A deallocation error occurred. A message indicating the offending array is written on unit control.error and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively.

  • -3

    The restriction n > 0, or the requirement that the matrix type must contain the relevant string ‘dense’, ‘coordinate’ or ‘sparse_by_rows’, has been violated.

  • -20

    The matrix is not positive definite while the solver used expected it to be.

  • -26

    The requested solver is not available.

  • -29

    This option is not available with this solver.

  • -32

    More than control.max_integer_factor_size words of internal integer storage are required for in-core factorization.

  • -34

    The package PARDISO failed; check the solver-specific information components inform.pardiso_IPARM and inform.pardiso_DPARM along with PARDISO’s documentation for more details.

  • -35

    The package WSMP failed; check the solver-specific information components inform.wsmp_iparm and inform.wsmp_dparm along with WSMP’s documentation for more details.

  • -36

    The scaling package HSL_MC64 failed; check the solver-specific information component inform.mc64_info along with HSL_MC64’s documentation for more details.

  • -37

    The scaling package MC77 failed; check the solver-specific information components inform.mc77_info and inform.mc77_rinfo along with MC77’s documentation for more details.

  • -43

    A direct-access file error occurred. See the value of inform.ma77_info.flag for more details.

  • -50

    A solver-specific error occurred; check the solver-specific information component of inform along with the solver’s documentation for more details.

ne

is a scalar variable of type Int32 that holds the number of entries in the lower triangular part of the symmetric matrix \(A\).

val

is a one-dimensional array of size ne and type T that holds the values of the entries of the lower triangular part of the symmetric matrix \(A\) in any of the supported storage schemes.

    function sls_solve_system(T, data, status, n, sol)

Solve the linear system \(Ax=b\).

Parameters:

data

holds private internal data

status

is a scalar variable of type Int32 that gives the exit status from the package.

Possible values are:

  • 0

    The required solution was obtained.

  • -1

    An allocation error occurred. A message indicating the offending array is written on unit control.error, and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively.

  • -2

    A deallocation error occurred. A message indicating the offending array is written on unit control.error and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively.

  • -34

    The package PARDISO failed; check the solver-specific information components inform.pardiso_IPARM and inform.pardiso_DPARM along with PARDISO’s documentation for more details.

  • -35

    The package WSMP failed; check the solver-specific information components inform.wsmp_iparm and inform.wsmp_dparm along with WSMP’s documentation for more details.

n

is a scalar variable of type Int32 that holds the number of entries in the vectors \(b\) and \(x\).

sol

is a one-dimensional array of size n and type T. On entry, it must hold the vector \(b\). On a successful exit, it contains the solution \(x\).

    function sls_partial_solve_system(T, part, data, status, n, sol)

Given the factorization \(A = L D U\) with \(U = L^T\), solve the linear system

\[Mx=b,\]
where \(M\) is one of \(L\), \(D\), \(U\) or \(S = L \sqrt{D}\).
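
One way to see how these partial solves relate to the full solve, assuming the factorization is exactly \(A = L D U\) as written above (any permutation or scaling that a particular solver applies internally is ignored in this sketch), is that \(A x = b\) can be solved by three successive partial solves:

\[L y = b, \qquad D z = y, \qquad U x = z,\]
since then \(A x = L D U x = L D z = L y = b\).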

Parameters:

part

is a one-dimensional array of type Vararg{Cchar} that specifies the component \(M\) of the factorization that is to be used. It should be one of “L”, “D”, “U” or “S”, and these correspond to the parts \(L\), \(D\), \(U\) and \(S\); lower or upper case variants are allowed.

data

holds private internal data

status

is a scalar variable of type Int32 that gives the entry and exit status from the package.

On initial entry, status must be set to 1.

Possible exit values are:

  • 0

    The required solution was obtained.

  • -1

    An allocation error occurred. A message indicating the offending array is written on unit control.error, and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively.

  • -2

    A deallocation error occurred. A message indicating the offending array is written on unit control.error and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively.

  • -34

    The package PARDISO failed; check the solver-specific information components inform.pardiso_IPARM and inform.pardiso_DPARM along with PARDISO’s documentation for more details.

  • -35

    The package WSMP failed; check the solver-specific information components inform.wsmp_iparm and inform.wsmp_dparm along with WSMP’s documentation for more details.

n

is a scalar variable of type Int32 that holds the number of entries in the vectors \(b\) and \(x\).

sol

is a one-dimensional array of size n and type T. On entry, it must hold the vector \(b\). On a successful exit, it contains the solution \(x\).

    function sls_information(T, data, inform, status)

Provide output information

Parameters:

data

holds private internal data

inform

is a structure containing output information (see sls_inform_type)

status

is a scalar variable of type Int32 that gives the exit status from the package. Possible values are (currently):

  • 0

    The values were recorded successfully.

    function sls_terminate(T, data, control, inform)

Deallocate all internal private storage

Parameters:

data

holds private internal data

control

is a structure containing control information (see sls_control_type)

inform

is a structure containing output information (see sls_inform_type)

available structures#

sls_control_type structure#

    struct sls_control_type{T}
      f_indexing::Bool
      error::Int32
      warning::Int32
      out::Int32
      statistics::Int32
      print_level::Int32
      print_level_solver::Int32
      bits::Int32
      block_size_kernel::Int32
      block_size_elimination::Int32
      blas_block_size_factorize::Int32
      blas_block_size_solve::Int32
      node_amalgamation::Int32
      initial_pool_size::Int32
      min_real_factor_size::Int32
      min_integer_factor_size::Int32
      max_real_factor_size::Int64
      max_integer_factor_size::Int64
      max_in_core_store::Int64
      array_increase_factor::T
      array_decrease_factor::T
      pivot_control::Int32
      ordering::Int32
      full_row_threshold::Int32
      row_search_indefinite::Int32
      scaling::Int32
      scale_maxit::Int32
      scale_thresh::T
      relative_pivot_tolerance::T
      minimum_pivot_tolerance::T
      absolute_pivot_tolerance::T
      zero_tolerance::T
      zero_pivot_tolerance::T
      negative_pivot_tolerance::T
      static_pivot_tolerance::T
      static_level_switch::T
      consistency_tolerance::T
      max_iterative_refinements::Int32
      acceptable_residual_relative::T
      acceptable_residual_absolute::T
      multiple_rhs::Bool
      generate_matrix_file::Bool
      matrix_file_device::Int32
      matrix_file_name::NTuple{31,Cchar}
      out_of_core_directory::NTuple{401,Cchar}
      out_of_core_integer_factor_file::NTuple{401,Cchar}
      out_of_core_real_factor_file::NTuple{401,Cchar}
      out_of_core_real_work_file::NTuple{401,Cchar}
      out_of_core_indefinite_file::NTuple{401,Cchar}
      out_of_core_restart_file::NTuple{501,Cchar}
      prefix::NTuple{31,Cchar}
    end

detailed documentation#

control derived type as a Julia structure

components#

Bool f_indexing

use C or Fortran sparse matrix indexing

Int32 error

unit for error messages

Int32 warning

unit for warning messages

Int32 out

unit for monitor output

Int32 statistics

unit for statistical output

Int32 print_level

controls level of diagnostic output

Int32 print_level_solver

controls level of diagnostic output from external solver

Int32 bits

number of bits used in architecture

Int32 block_size_kernel

the target blocksize for kernel factorization

Int32 block_size_elimination

the target blocksize for parallel elimination

Int32 blas_block_size_factorize

level 3 blocking in factorize

Int32 blas_block_size_solve

level 2 and 3 blocking in solve

Int32 node_amalgamation

a child node is merged with its parent if they both involve fewer than node_amalgamation eliminations

Int32 initial_pool_size

initial size of task-pool arrays for parallel elimination

Int32 min_real_factor_size

initial size for real array for the factors and other data

Int32 min_integer_factor_size

initial size for integer array for the factors and other data

Int64 max_real_factor_size

maximum size for real array for the factors and other data

Int64 max_integer_factor_size

maximum size for integer array for the factors and other data

Int64 max_in_core_store

amount of in-core storage to be used for out-of-core factorization

T array_increase_factor

factor by which arrays sizes are to be increased if they are too small

T array_decrease_factor

if previously allocated internal workspace arrays are greater than array_decrease_factor times the currently required sizes, they are reset to current requirements

Int32 pivot_control

pivot control:

  • 1 Numerical pivoting will be performed.

  • 2 No pivoting will be performed and an error exit will occur immediately a pivot sign change is detected.

  • 3 No pivoting will be performed and an error exit will occur if a zero pivot is detected.

  • 4 No pivoting is performed but pivots are changed to all be positive

Int32 ordering

controls ordering (ignored if explicit PERM argument present)

  • <0 chosen by the specified solver with its own ordering-selected value -ordering

  • 0 chosen package default (or the AMD ordering if no package default)

  • 1 Approximate minimum degree (AMD) with provisions for “dense” rows/columns

  • 2 Minimum degree

  • 3 Nested dissection

  • 4 indefinite ordering to generate a combination of 1x1 and 2x2 pivots

  • 5 Profile/Wavefront reduction

  • 6 Bandwidth reduction

  • >6 ordering chosen depending on matrix characteristics (not yet implemented)

Int32 full_row_threshold

controls threshold for detecting full rows in analyse, registered as percentage of matrix order. If 100, only fully dense rows detected (default)

Int32 row_search_indefinite

number of rows searched for pivot when using indefinite ordering

Int32 scaling

controls scaling (ignored if explicit SCALE argument present)

  • <0 chosen by the specified solver with its own scaling-selected value -scaling

  • 0 No scaling

  • 1 Scaling using HSL’s MC64

  • 2 Scaling using HSL’s MC77 based on the row one-norm

  • 3 Scaling using HSL’s MC77 based on the row infinity-norm

Int32 scale_maxit

the number of scaling iterations performed (default 10 used if .scale_maxit < 0)

T scale_thresh

the scaling iteration stops as soon as the row/column norms are within scale_thresh of 1

T relative_pivot_tolerance

pivot threshold

T minimum_pivot_tolerance

smallest permitted relative pivot threshold

T absolute_pivot_tolerance

any pivot smaller than this is considered zero

T zero_tolerance

any entry smaller than this is considered zero

T zero_pivot_tolerance

any pivot smaller than this is considered zero for positive-definite solvers

T negative_pivot_tolerance

any pivot smaller than this is considered to be negative for p-d solvers

T static_pivot_tolerance

used for setting static pivot level

T static_level_switch

used for switch to static

T consistency_tolerance

used to determine whether a system is consistent when seeking a Fredholm alternative

Int32 max_iterative_refinements

maximum number of iterative refinements allowed

T acceptable_residual_relative

refinement will cease as soon as the residual ||Ax-b|| falls below max( acceptable_residual_relative * ||b||, acceptable_residual_absolute )

T acceptable_residual_absolute

see acceptable_residual_relative
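
The interaction of max_iterative_refinements, acceptable_residual_relative and acceptable_residual_absolute can be sketched in plain Julia. This is an illustration under stated assumptions rather than the package's implementation: the 2-norm is assumed because the norm is not specified above, a dense Bunch-Kaufman factorization from Julia's LinearAlgebra stands in for whichever solver sls would use, and the function name refine_solution is hypothetical.

```julia
using LinearAlgebra

# Hypothetical sketch (not part of sls): iterative refinement that stops once
# ||Ax-b|| <= max(acceptable_residual_relative * ||b||, acceptable_residual_absolute)
# or after max_iterative_refinements corrections. The 2-norm is an assumption.
function refine_solution(A, b; max_iterative_refinements = 5,
                         acceptable_residual_relative = 1.0e-12,
                         acceptable_residual_absolute = 0.0)
    F = bunchkaufman(Symmetric(A, :L))      # stand-in symmetric indefinite factorization
    x = F \ b                               # initial solve
    threshold = max(acceptable_residual_relative * norm(b),
                    acceptable_residual_absolute)
    refinements = 0
    while norm(A * x - b) > threshold && refinements < max_iterative_refinements
        x += F \ (b - A * x)                # correct x using the current residual
        refinements += 1
    end
    return x, refinements
end

A = [2.0  1.0 0.0;
     1.0 -1.0 3.0;
     0.0  3.0 4.0]
b = A * [1.0, 2.0, 3.0]
x, refinements = refine_solution(A, b)
```

The absolute tolerance acts as a floor: when \(b\) is very small, the relative term alone would demand a residual that rounding error may never reach.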

Bool multiple_rhs

set multiple_rhs to true if there is a possibility that the solver will be required to solve systems with more than one right-hand side. More efficient execution may be possible when multiple_rhs is false

Bool generate_matrix_file

generate_matrix_file is true if a file describing the current matrix is to be generated

Int32 matrix_file_device

specifies the unit number to write the input matrix (in co-ordinate form)

NTuple{31,Cchar} matrix_file_name

name of generated matrix file containing input problem

NTuple{401,Cchar} out_of_core_directory

name of the directory used for out-of-core factorization files

NTuple{401,Cchar} out_of_core_integer_factor_file

names of the out-of-core superfiles for integer factor data (out_of_core_integer_factor_file), real factor data (out_of_core_real_factor_file), real workspace (out_of_core_real_work_file) and additional real workspace in the indefinite case (out_of_core_indefinite_file), respectively

NTuple{401,Cchar} out_of_core_real_factor_file

see out_of_core_integer_factor_file

NTuple{401,Cchar} out_of_core_real_work_file

see out_of_core_integer_factor_file

NTuple{401,Cchar} out_of_core_indefinite_file

see out_of_core_integer_factor_file

NTuple{501,Cchar} out_of_core_restart_file

see out_of_core_integer_factor_file

NTuple{31,Cchar} prefix

all output lines will be prefixed by prefix(2:LEN(TRIM(.prefix))-1) where prefix contains the required string enclosed in quotes, e.g. “string” or ‘string’

sls_time_type structure#

    struct sls_time_type{T}
      total::T
      analyse::T
      factorize::T
      solve::T
      order_external::T
      analyse_external::T
      factorize_external::T
      solve_external::T
      clock_total::T
      clock_analyse::T
      clock_factorize::T
      clock_solve::T
      clock_order_external::T
      clock_analyse_external::T
      clock_factorize_external::T
      clock_solve_external::T
    end

detailed documentation#

time derived type as a Julia structure

components#

T total

the total cpu time spent in the package

T analyse

the total cpu time spent in the analysis phase

T factorize

the total cpu time spent in the factorization phase

T solve

the total cpu time spent in the solve phases

T order_external

the total cpu time spent by the external solver in the ordering phase

T analyse_external

the total cpu time spent by the external solver in the analysis phase

T factorize_external

the total cpu time spent by the external solver in the factorization phase

T solve_external

the total cpu time spent by the external solver in the solve phases

T clock_total

the total clock time spent in the package

T clock_analyse

the total clock time spent in the analysis phase

T clock_factorize

the total clock time spent in the factorization phase

T clock_solve

the total clock time spent in the solve phases

T clock_order_external

the total clock time spent by the external solver in the ordering phase

T clock_analyse_external

the total clock time spent by the external solver in the analysis phase

T clock_factorize_external

the total clock time spent by the external solver in the factorization phase

T clock_solve_external

the total clock time spent by the external solver in the solve phases

sls_inform_type structure#

    struct sls_inform_type{T}
      status::Int32
      alloc_status::Int32
      bad_alloc::NTuple{81,Cchar}
      more_info::Int32
      entries::Int32
      out_of_range::Int32
      duplicates::Int32
      upper::Int32
      missing_diagonals::Int32
      max_depth_assembly_tree::Int32
      nodes_assembly_tree::Int32
      real_size_desirable::Int64
      integer_size_desirable::Int64
      real_size_necessary::Int64
      integer_size_necessary::Int64
      real_size_factors::Int64
      integer_size_factors::Int64
      entries_in_factors::Int64
      max_task_pool_size::Int32
      max_front_size::Int32
      compresses_real::Int32
      compresses_integer::Int32
      two_by_two_pivots::Int32
      semi_bandwidth::Int32
      delayed_pivots::Int32
      pivot_sign_changes::Int32
      static_pivots::Int32
      first_modified_pivot::Int32
      rank::Int32
      negative_eigenvalues::Int32
      num_zero::Int32
      iterative_refinements::Int32
      flops_assembly::Int64
      flops_elimination::Int64
      flops_blas::Int64
      largest_modified_pivot::T
      minimum_scaling_factor::T
      maximum_scaling_factor::T
      condition_number_1::T
      condition_number_2::T
      backward_error_1::T
      backward_error_2::T
      forward_error::T
      alternative::Bool
      solver::NTuple{21,Cchar}
      time::sls_time_type{T}
      sils_ainfo::sils_ainfo_type{T}
      sils_finfo::sils_finfo_type{T}
      sils_sinfo::sils_sinfo_type{T}
      ma57_ainfo::ma57_ainfo{T}
      ma57_finfo::ma57_finfo{T}
      ma57_sinfo::ma57_sinfo{T}
      ma77_info::ma77_info{T}
      ma86_info::ma86_info{T}
      ma87_info::ma87_info{T}
      ma97_info::ma97_info{T}
      ssids_inform::spral_ssids_inform
      mc61_info::NTuple{10,Cint}
      mc61_rinfo::NTuple{15,T}
      mc64_info::mc64_info
      mc68_info::mc68_info
      mc77_info::NTuple{10,Cint}
      mc77_rinfo::NTuple{10,T}
      mumps_error::Int32
      mumps_info::NTuple{80,Cint}
      mumps_rinfo::NTuple{40,T}
      pardiso_error::Int32
      pardiso_IPARM::NTuple{64,Cint}
      pardiso_DPARM::NTuple{64,T}
      mkl_pardiso_error::Int32
      mkl_pardiso_IPARM::NTuple{64,Cint}
      pastix_info::Int32
      wsmp_error::Int32
      wsmp_iparm::NTuple{64,Cint}
      wsmp_dparm::NTuple{64,T}
      mpi_ierr::Int32
      lapack_error::Int32

detailed documentation#

inform derived type as a Julia structure

components#

Int32 status

reported return status. Possible values are:

  • 0

    success

  • -1

    allocation error

  • -2

    deallocation error

  • -3

    matrix data faulty (n < 1, ne < 0)

  • -20

    allegedly positive definite matrix is indefinite

  • -29

    unavailable option

  • -31

    input order is not a permutation or is faulty in some other way

  • -32

    more than control.max_integer_factor_size integer space is required for the factors

  • -33

    more than control.max_real_factor_size real space is required for the factors

  • -40

    not possible to alter the diagonals

  • -41

    no access to permutation or pivot sequence used

  • -42

    no access to diagonal perturbations

  • -43

    direct-access file error

  • -50

    solver-specific error; see the solver’s info parameter

  • -101

    unknown solver
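The status values listed above can be turned into readable messages with a small lookup table. This is only a sketch: the names `SLS_STATUS_MESSAGES` and `sls_status_message` are hypothetical and not part of the package; the messages are copied from the list above.

```julia
# Hypothetical lookup table built from the documented status values.
const SLS_STATUS_MESSAGES = Dict{Int,String}(
  0 => "success",
  -1 => "allocation error",
  -2 => "deallocation error",
  -3 => "matrix data faulty (n < 1, ne < 0)",
  -20 => "allegedly positive definite matrix is indefinite",
  -29 => "unavailable option",
  -31 => "input order is not a permutation or is faulty in some other way",
  -32 => "too much integer space required for the factors",
  -33 => "too much real space required for the factors",
  -40 => "not possible to alter the diagonals",
  -41 => "no access to permutation or pivot sequence used",
  -42 => "no access to diagonal perturbations",
  -43 => "direct-access file error",
  -50 => "solver-specific error; see the solver's info parameter",
  -101 => "unknown solver",
)

# Hypothetical helper: fall back to a generic message for unlisted values.
function sls_status_message(status::Integer)
  return get(SLS_STATUS_MESSAGES, Int(status), "undocumented status $(status)")
end

println(sls_status_message(Int32(-20)))
```

With the real interface, the argument would presumably be `inform[].status` after a call to `sls_information`.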

Int32 alloc_status

STAT value after allocate failure.

NTuple{81,Cchar} bad_alloc

name of array which provoked an allocate failure

Int32 more_info

further information on failure

Int32 entries

number of entries

Int32 out_of_range

number of indices out-of-range

Int32 duplicates

number of duplicates

Int32 upper

number of entries from the strict upper triangle

Int32 missing_diagonals

number of missing diagonal entries for an allegedly-definite matrix

Int32 max_depth_assembly_tree

maximum depth of the assembly tree

Int32 nodes_assembly_tree

nodes in the assembly tree (= number of elimination steps)

Int64 real_size_desirable

desirable or actual size for real array for the factors and other data

Int64 integer_size_desirable

desirable or actual size for integer array for the factors and other data

Int64 real_size_necessary

necessary size for real array for the factors and other data

Int64 integer_size_necessary

necessary size for integer array for the factors and other data

Int64 real_size_factors

predicted or actual number of reals to hold factors

Int64 integer_size_factors

predicted or actual number of integers to hold factors

Int64 entries_in_factors

number of entries in factors

Int32 max_task_pool_size

maximum number of tasks in the factorization task pool

Int32 max_front_size

forecast or actual size of largest front

Int32 compresses_real

number of compresses of real data

Int32 compresses_integer

number of compresses of integer data

Int32 two_by_two_pivots

number of 2x2 pivots

Int32 semi_bandwidth

semi-bandwidth of matrix following bandwidth reduction

Int32 delayed_pivots

number of delayed pivots (total)

Int32 pivot_sign_changes

number of pivot sign changes if no pivoting is used successfully

Int32 static_pivots

number of static pivots chosen

Int32 first_modified_pivot

first pivot modification when static pivoting

Int32 rank

estimated rank of the matrix

Int32 negative_eigenvalues

number of negative eigenvalues

Int32 num_zero

number of pivots that are considered zero (and ignored)
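To illustrate what rank, negative_eigenvalues and num_zero describe, the sketch below computes the same quantities independently of SLS from a dense eigenvalue decomposition, using the 5 by 5 matrix from the example calls section. The tolerance is an assumption chosen for this illustration; a solver would obtain these counts from the pivots of the factorization rather than from eigenvalues.

```julia
using LinearAlgebra

# The symmetric matrix used in the example calls section, written out in full.
A = [2.0 3.0 0.0 0.0 0.0
     3.0 0.0 4.0 0.0 6.0
     0.0 4.0 1.0 5.0 0.0
     0.0 0.0 5.0 0.0 0.0
     0.0 6.0 0.0 0.0 1.0]

lambda = eigvals(Symmetric(A))

# Assumed tolerance for deciding that an eigenvalue is numerically zero.
tol = sqrt(eps(Float64)) * maximum(abs, lambda)

n_negative = count(<(-tol), lambda)          # counterpart of negative_eigenvalues
n_zero = count(x -> abs(x) <= tol, lambda)   # counterpart of num_zero
est_rank = length(lambda) - n_zero           # counterpart of rank

println("negative = ", n_negative, ", zero = ", n_zero, ", rank = ", est_rank)
```

For this matrix the counts are two negative eigenvalues, no zero eigenvalues and full rank, so the matrix is indefinite and nonsingular.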

Int32 iterative_refinements

number of iterative refinements performed

Int64 flops_assembly

anticipated or actual number of floating-point operations in assembly

Int64 flops_elimination

anticipated or actual number of floating-point operations in elimination

Int64 flops_blas

additional number of floating-point operations for BLAS

T largest_modified_pivot

largest diagonal modification when static pivoting or ensuring definiteness

T minimum_scaling_factor

minimum scaling factor

T maximum_scaling_factor

maximum scaling factor

T condition_number_1

estimate of the condition number of the matrix (category 1 equations)

T condition_number_2

estimate of the condition number of the matrix (category 2 equations)

T backward_error_1

estimate of the backward error (category 1 equations)

T backward_error_2

estimate of the backward error (category 2 equations)

T forward_error

estimate of forward error

Bool alternative

has an “alternative” \(y\): \(A y = 0\) and \(y^T b > 0\) been found when trying to solve \(A x = b\) ?
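A small hand-made example may help to see why such a \(y\) certifies that \(A x = b\) has no solution: if \(A x = b\) held, then \(y^T b = y^T A x = (A y)^T x = 0\), contradicting \(y^T b > 0\). The 2 by 2 data below are invented for illustration and do not come from the package.

```julia
using LinearAlgebra

A = [1.0 1.0
     1.0 1.0]      # singular: both rows coincide
b = [1.0, 0.0]     # not in the range of A, so A x = b is inconsistent
y = [1.0, -1.0]    # candidate alternative

println("A * y  = ", A * y)      # the zero vector, so A y = 0
println("y' * b = ", dot(y, b))  # strictly positive
```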

NTuple{21,Cchar} solver

name of external solver used to factorize and solve
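Character components such as bad_alloc and solver are stored as fixed-length tuples of `Cchar`, so they need converting before they can be printed. The helper below is a sketch under the assumption that the buffer is NUL-terminated and possibly padded with blanks; `cchars_to_string` is a hypothetical name, and the buffer is a stand-in built by hand rather than one returned by the package.

```julia
# Hypothetical helper: convert a fixed-length C character buffer to a String.
# Reading stops at the first NUL; trailing blanks are stripped as well.
function cchars_to_string(buf::NTuple{N,Cchar}) where {N}
  bytes = UInt8[]
  for c in buf
    c == 0 && break
    push!(bytes, c % UInt8)
  end
  return String(rstrip(String(bytes)))
end

# Stand-in for inform[].solver: "sytr" followed by NUL padding, 21 characters.
name = "sytr"
buf = ntuple(i -> i <= ncodeunits(name) ? Cchar(codeunit(name, i)) : Cchar(0), 21)

println(cchars_to_string(buf))
```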

sls_time_type{T} time

timings (see above)

sils_ainfo_type{T} sils_ainfo

the analyse output structure from sils

sils_finfo_type{T} sils_finfo

the factorize output structure from sils

sils_sinfo_type{T} sils_sinfo

the solve output structure from sils

ma57_ainfo{T} ma57_ainfo

the analyse output structure from hsl_ma57

ma57_finfo{T} ma57_finfo

the factorize output structure from hsl_ma57

ma57_sinfo{T} ma57_sinfo

the solve output structure from hsl_ma57

ma77_info{T} ma77_info

the output structure from hsl_ma77

ma86_info{T} ma86_info

the output structure from hsl_ma86

ma87_info{T} ma87_info

the output structure from hsl_ma87

ma97_info{T} ma97_info

the output structure from hsl_ma97

spral_ssids_inform ssids_inform

the output structure from ssids

NTuple{10,Cint} mc61_info

the integer output array from mc61 from HSL

NTuple{15,T} mc61_rinfo

the real output array from mc61 from HSL

mc64_info mc64_info

the output structure from hsl_mc64

mc68_info mc68_info

the output structure from hsl_mc68

NTuple{10,Cint} mc77_info

the integer output array from mc77 from HSL

NTuple{10,T} mc77_rinfo

the real output array from mc77 from HSL

Int32 mumps_error

the output scalars and arrays from mumps

NTuple{80,Cint} mumps_info

see mumps_error

NTuple{40,T} mumps_rinfo

see mumps_error

Int32 pardiso_error

the output scalars and arrays from pardiso

NTuple{64,Cint} pardiso_IPARM

see pardiso_error

NTuple{64,T} pardiso_DPARM

see pardiso_error

Int32 mkl_pardiso_error

the output scalars and arrays from mkl_pardiso

NTuple{64,Cint} mkl_pardiso_IPARM

see mkl_pardiso_error

Int32 pastix_info

the output flag from pastix

Int32 wsmp_error

the output scalars and arrays from wsmp

NTuple{64,Cint} wsmp_iparm

see wsmp_error

NTuple{64,T} wsmp_dparm

see wsmp_error

Int32 mpi_ierr

the output flag from MPI routines

Int32 lapack_error

the output flag from LAPACK routines

example calls#

This is an example of how to use the package to solve a symmetric system of linear equations; the code is available in $GALAHAD/src/sls/Julia/test_sls.jl. A variety of supported matrix storage formats are shown.
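Before the listing itself, the following sketch relates the three storage formats it uses. It is independent of SLS and uses only the problem data that appear in the listing: it rebuilds the full symmetric matrix from the coordinate triples, then checks that the row pointers, the packed dense array and the right-hand side all describe the same system. Mirroring every triple across the diagonal is an assumption made here so that the entry in position (2,5) is handled like the others.

```julia
using LinearAlgebra

n = 5
row = [1, 2, 2, 3, 3, 4, 5]
col = [1, 1, 5, 2, 3, 3, 5]
ptr = [1, 2, 4, 6, 7, 8]
val = [2.0, 3.0, 6.0, 4.0, 1.0, 5.0, 1.0]
dense = [2.0, 3.0, 0.0, 0.0, 4.0, 1.0, 0.0,
         0.0, 5.0, 0.0, 0.0, 6.0, 0.0, 0.0, 1.0]
rhs = [8.0, 45.0, 31.0, 15.0, 17.0]
sol = [1.0, 2.0, 3.0, 4.0, 5.0]

# Coordinate format: mirror every (row, col, val) triple to build the full matrix.
A = zeros(n, n)
for (i, j, v) in zip(row, col, val)
  A[i, j] = v
  A[j, i] = v
end

# Sparse-by-rows format: entries ptr[i]:(ptr[i+1]-1) belong to row i.
rows_from_ptr = [i for i in 1:n for k in ptr[i]:(ptr[i + 1] - 1)]

# Dense format: lower triangle packed row by row, n(n+1)/2 = 15 values.
packed = [A[i, j] for i in 1:n for j in 1:i]

println("ptr consistent with row: ", rows_from_ptr == row)
println("dense matches packed A:  ", packed == dense)
println("rhs equals A * sol:      ", A * sol == rhs)
```

All three checks print `true`, which confirms that `sol` is the exact solution the listing compares against.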

# test_sls.jl
# Simple code to test the Julia interface to SLS

using GALAHAD
using Test
using Printf
using Accessors
using Quadmath

function test_sls(::Type{T}) where T
  maxabsarray(a) = abs.(a) |> maximum

  # Derived types
  data = Ref{Ptr{Cvoid}}()
  control = Ref{sls_control_type{T}}()
  inform = Ref{sls_inform_type{T}}()

  # Set problem data
  n = 5 # dimension of A
  ne = 7 # number of entries of A
  dense_ne = 15 # number of elements of A as a dense matrix
  row = Cint[1, 2, 2, 3, 3, 4, 5]  # row indices, NB lower triangle
  col = Cint[1, 1, 5, 2, 3, 3, 5]  # column indices
  ptr = Cint[1, 2, 4, 6, 7, 8]  # pointers to indices
  val = T[2.0, 3.0, 6.0, 4.0, 1.0, 5.0, 1.0]  # values
  dense = T[2.0, 3.0, 0.0, 0.0, 4.0, 1.0, 0.0,
                  0.0, 5.0, 0.0, 0.0, 6.0, 0.0, 0.0, 1.0]
  rhs = T[8.0, 45.0, 31.0, 15.0, 17.0]
  sol = T[1.0, 2.0, 3.0, 4.0, 5.0]
  status = Ref{Cint}()
  x = zeros(T, n)
  error = zeros(T, n)

  norm_residual = Ref{T}()
  good_x = eps(Float64)^(1 / 3)

  @printf(" Fortran sparse matrix indexing\n\n")
  @printf(" basic tests of storage formats\n\n")
  @printf(" storage  RHS   refine  partial\n")

  for d in 1:3
    # Initialize SLS - use the sytr solver
    sls_initialize(T, "sytr", data, control, status)

    # Set user-defined control options
    @reset control[].f_indexing = true # Fortran sparse matrix indexing

    # sparse co-ordinate storage
    if d == 1
      @printf(" coordinate ")
      sls_analyse_matrix(T, control, data, status, n,
                         "coordinate", ne, row, col, C_NULL)
      sls_factorize_matrix(T, data, status, ne, val)
    end

    # sparse by rows
    if d == 2
      @printf(" sparse by rows ")
      sls_analyse_matrix(T, control, data, status, n,
                         "sparse_by_rows", ne, C_NULL, col, ptr)
      sls_factorize_matrix(T, data, status, ne, val)
    end

    # dense
    if d == 3
      @printf(" dense  ")
      sls_analyse_matrix(T, control, data, status, n,
                         "dense", ne, C_NULL, C_NULL, C_NULL)
      sls_factorize_matrix(T, data, status, dense_ne, dense)
    end

    # Set right-hand side and solve the system
    for i in 1:n
      x[i] = rhs[i]
    end

    sls_solve_system(T, data, status, n, x)
    sls_information(T, data, inform, status)

    if inform[].status == 0
      for i in 1:n
        error[i] = x[i] - sol[i]
      end

      norm_residual = maxabsarray(error)

      if norm_residual < good_x
        @printf("   ok  ")
      else
        @printf("  fail ")
      end
    else
      @printf(" SLS_solve exit status = %1i\n", inform[].status)
    end

    # @printf("sol: ")
    # for i = 1:n
    #   @printf("%f ", x[i])
    # end

    # resolve, this time using iterative refinement
    @reset control[].max_iterative_refinements = Cint(1)
    sls_reset_control(T, control, data, status)

    for i in 1:n
      x[i] = rhs[i]
    end

    sls_solve_system(T, data, status, n, x)
    sls_information(T, data, inform, status)

    if inform[].status == 0
      for i in 1:n
        error[i] = x[i] - sol[i]
      end

      norm_residual = maxabsarray(error)

      if norm_residual < good_x
        @printf("ok  ")
      else
        @printf("   fail ")
      end
    else
      @printf(" SLS_solve exit status = %1i\n", inform[].status)
    end

    # obtain the solution by part solves
    for i in 1:n
      x[i] = rhs[i]
    end

    sls_partial_solve_system(T, "L", data, status, n, x)
    sls_partial_solve_system(T, "D", data, status, n, x)
    sls_partial_solve_system(T, "U", data, status, n, x)
    sls_information(T, data, inform, status)

    if inform[].status == 0
      for i in 1:n
        error[i] = x[i] - sol[i]
      end

      norm_residual = maxabsarray(error)

      if norm_residual < good_x
        @printf("ok  ")
      else
        @printf("   fail ")
      end
    else
      @printf(" SLS_solve exit status = %1i\n", inform[].status)
    end

    # Delete internal workspace
    sls_terminate(T, data, control, inform)
  end

  return 0
end

@testset "SLS" begin
  @test test_sls(Float32) == 0
  @test test_sls(Float64) == 0
  @test test_sls(Float128) == 0
end
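The three partial solves near the end of the listing correspond to the factors in \(A = P L D L^T P^T\). As a rough illustration outside SLS, the sketch below performs the same three stages with `bunchkaufman` from Julia's LinearAlgebra standard library, which calls LAPACK's symmetric indefinite factorization. How SLS applies the permutation within its own partial solves is not stated in this section, so the explicit permutation handling here is an assumption of the sketch.

```julia
using LinearAlgebra

# The matrix, right-hand side and known solution from the listing above.
A = [2.0 3.0 0.0 0.0 0.0
     3.0 0.0 4.0 0.0 6.0
     0.0 4.0 1.0 5.0 0.0
     0.0 0.0 5.0 0.0 0.0
     0.0 6.0 0.0 0.0 1.0]
rhs = [8.0, 45.0, 31.0, 15.0, 17.0]
sol = [1.0, 2.0, 3.0, 4.0, 5.0]

# Bunch-Kaufman factorization of the lower triangle: A[p, p] = L * D * L'.
F = bunchkaufman(Symmetric(A, :L))
p = F.p

y = F.L \ rhs[p]   # "L" stage: forward substitution on the permuted right-hand side
z = F.D \ y        # "D" stage: solve with the block diagonal factor
w = F.L' \ z       # "U" stage: back substitution with the transposed factor

x = similar(rhs)
x[p] = w           # undo the permutation

println("max error = ", maximum(abs, x - sol))
```

The reported error should be at rounding level, mirroring the `ok` result that the listing prints for the partial solves.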