GALAHAD SLS package#
purpose#
The sls
package
solves dense or sparse symmetric systems of linear equations
using variants of Gaussian elimination.
Given a sparse symmetric matrix \(A = \{ a_{ij} \}_{n \times n}\), and an
\(n\)-vector \(b\) or a matrix \(B = \{ b_{ij} \}_{n \times r}\), this
function solves the system \(A x = b\) or the system \(A X = B\) .
The matrix \(A\) need not be definite.
The method provides a common interface to a variety of well-known
solvers from HSL and elsewhere. Currently supported solvers include
MA27/SILS
, HSL_MA57
, HSL_MA77
, HSL_MA86
,
HSL_MA87
and HSL_MA97
from {HSL},
SSIDS
from {SPRAL},
MUMPS
from Mumps Technologies,
PARDISO
both from the Pardiso Project and Intel’s MKL,
PaStiX
from Inria, and
WSMP
from the IBM alpha Works,
as well as POTR
, SYTR
and SBTR
from LAPACK.
Note that, with the exception of SSIDS
and the Netlib
reference LAPACK codes,
the solvers themselves do not form part of this package and
must be obtained/linked to separately.
Dummy instances are provided for solvers that are unavailable.
Also note that additional flexibility may be obtained by calling the
solvers directly rather that via this package.
terminology#
The solvers used each produce an \(L D L^T\) factorization of \(A\) or a perturbation thereof, where \(L\) is a permuted lower triangular matrix and \(D\) is a block diagonal matrix with blocks of order 1 and 2. It is convenient to write this factorization in the form
supported solvers#
The key features of the external solvers supported by sls
are
given in the following table:
solver |
factorization |
indefinite \(A\) |
out-of-core |
parallelised |
---|---|---|---|---|
|
multifrontal |
yes |
no |
no |
|
multifrontal |
yes |
no |
no |
|
multifrontal |
yes |
yes |
OpenMP core |
|
left-looking |
yes |
no |
OpenMP fully |
|
left-looking |
no |
no |
OpenMP fully |
|
multifrontal |
yes |
no |
OpenMP core |
|
multifrontal |
yes |
no |
CUDA core |
|
multifrontal |
yes |
optionally |
MPI |
|
left-right-looking |
yes |
no |
OpenMP fully |
|
left-right-looking |
yes |
optionally |
OpenMP fully |
|
left-right-looking |
yes |
no |
OpenMP fully |
|
left-right-looking |
yes |
no |
OpenMP fully |
|
dense |
no |
no |
with parallel LAPACK |
|
dense |
yes |
no |
with parallel LAPACK |
|
dense band |
no |
no |
with parallel LAPACK |
method#
Variants of sparse Gaussian elimination are used. See Section 4 of $GALAHAD/doc/sls.pdf for a brief description of the method employed and other details.
The solver SILS
is available as part of GALAHAD and relies on
the HSL Archive package MA27
. To obtain HSL Archive packages, see
The solvers
HSL_MA57
,
HSL_MA77
,
HSL_MA86
,
HSL_MA87
and
HSL_MA97
, the ordering packages
MC61
and HSL_MC68
, and the scaling packages
HSL_MC64
and MC77
are all part of HSL 2011.
To obtain HSL 2011 packages, see
The solver SSIDS
is from the SPRAL sparse-matrix collection,
and is available as part of GALAHAD.
The solver MUMPS
is available from Mumps Technologies in France, and
version 5.5.1 or above is sufficient.
To obtain MUMPS
, see
The solver PARDISO
is available from the Pardiso Project;
version 4.0.0 or above is required.
To obtain PARDISO
, see
The solver MKL PARDISO
is available as part of Intel’s oneAPI Math Kernel
Library (oneMKL).
To obtain this version of PARDISO
, see
The solver PaStix
is available from Inria in France, and
version 6.2 or above is sufficient.
To obtain PaStiX
, see
The solver WSMP
is available from the IBM alpha Works;
version 10.9 or above is required.
To obtain WSMP
, see
The solvers POTR
, SYTR
and PBTR
,
are available as
S/DPOTRF/S
,
S/DSYTRF/S
and S/DPBTRF/S
as part of LAPACK. Reference versions
are provided by GALAHAD, but for good performance
machined-tuned versions should be used.
Explicit sparsity re-orderings are obtained by calling the HSL package
HSL_MC68
.
Both this, HSL_MA57
and PARDISO
rely optionally
on the ordering package MeTiS
(version 4) from the Karypis Lab.
To obtain METIS
, see
Bandwidth, Profile and wavefront reduction is supported by
calling HSL’s MC61
.
The methods used are described in the user-documentation for
HSL 2011, A collection of Fortran codes for large-scale scientific computation (2011).
and papers
E. Agullo, P. R. Amestoy, A. Buttari, J.-Y. L’Excellent, A. Guermouche and F.-H. Rouet, “Robust memory-aware mappings for parallel multifrontal factorizations”. SIAM Journal on Scientific Computing, b 38(3) (2016), C256–C279,
P. R. Amestoy, I. S. Duff, J. Koster and J.-Y. L’Excellent. “A fully asynchronous multifrontal solver using distributed dynamic scheduling”. SIAM Journal on Matrix Analysis and Applications b 23(1) (2001) 15-41,
A. Gupta, “WSMP: Watson Sparse Matrix Package Part I - direct solution of symmetric sparse systems”. IBM Research Report RC 21886, IBM T. J. Watson Research Center, NY 10598, USA (2010),
P. Henon, P. Ramet and J. Roman, “PaStiX: A High-Performance Parallel Direct Solver for Sparse Symmetric Definite Systems”. Parallel Computing, b 28(2) (2002) 301–321,
J.D. Hogg, E. Ovtchinnikov and J.A. Scott. “A sparse symmetric indefinite direct solver for GPU architectures”. ACM Transactions on Mathematical Software b 42(1) (2014), Article 1,
O. Schenk and K. Gartner, “Solving Unsymmetric Sparse Systems of Linear Equations with PARDISO”. Journal of Future Generation Computer Systems b, 20(3) (2004) 475–487, and
O. Schenk and K. Gartner, “On fast factorization pivoting methods for symmetric indefinite systems”. Electronic Transactions on Numerical Analysis b 23 (2006) 158–179.
matrix storage#
The symmetric \(n\) by \(n\) matrix \(A\) may be presented and stored in a variety of formats. But crucially symmetry is exploited by only storing values from the lower triangular part (i.e, those entries that lie on or below the leading diagonal).
Dense storage format: The matrix \(A\) is stored as a compact dense matrix by rows, that is, the values of the entries of each row in turn are stored in order within an appropriate real one-dimensional array. Since \(A\) is symmetric, only the lower triangular part (that is the part \(A_{ij}\) for \(1 \leq j \leq i \leq n\)) need be held. In this case the lower triangle should be stored by rows, that is component \((i-1) * i / 2 + j\) of the storage array A_val will hold the value \(A_{ij}\) (and, by symmetry, \(A_{ji}\)) for \(1 \leq j \leq i \leq n\). The string A_type = ‘dense’ should be specified.
Sparse co-ordinate storage format: Only the nonzero entries of the matrices are stored. For the \(l\)-th entry, \(1 \leq l \leq ne\), of \(A\), its row index i, column index j and value \(A_{ij}\), \(1 \leq j \leq i \leq n\), are stored as the \(l\)-th components of the integer arrays A_row and A_col and real array A_val, respectively, while the number of nonzeros is recorded as A_ne = \(ne\). Note that only the entries in the lower triangle should be stored. The string A_type = ‘coordinate’ should be specified.
Sparse row-wise storage format: Again only the nonzero entries are stored, but this time they are ordered so that those in row i appear directly before those in row i+1. For the i-th row of \(A\) the i-th component of the integer array A_ptr holds the position of the first entry in this row, while A_ptr(n+1) holds the total number of entries plus one. The column indices j, \(1 \leq j \leq i\), and values \(A_{ij}\) of the entries in the i-th row are stored in components l = A_ptr(i), …, A_ptr(i+1)-1 of the integer array A_col, and real array A_val, respectively. Note that as before only the entries in the lower triangle should be stored. For sparse matrices, this scheme almost always requires less storage than its predecessor. The string A_type = ‘sparse_by_rows’ should be specified.
Diagonal storage format: If \(A\) is diagonal (i.e., \(A_{ij} = 0\) for all \(1 \leq i \neq j \leq n\)) only the diagonals entries \(A_{ii}\), \(1 \leq i \leq n\) need be stored, and the first n components of the array A_val may be used for the purpose. The string A_type = ‘diagonal’ should be specified.
Multiples of the identity storage format: If \(A\) is a multiple of the identity matrix, (i.e., \(H = \alpha I\) where \(I\) is the n by n identity matrix and \(\alpha\) is a scalar), it suffices to store \(\alpha\) as the first component of A_val. The string A_type = ‘scaled_identity’ should be specified.
The identity matrix format: If \(A\) is the identity matrix, no values need be stored. The string A_type = ‘identity’ should be specified.
The zero matrix format: The same is true if \(A\) is the zero matrix, but now the string A_type = ‘zero’ or ‘none’ should be specified.
introduction to function calls#
To solve a given problem, functions from the sls package must be called in the following order:
sls_initialize - provide default control parameters and set up initial data structures
sls_read_specfile (optional) - override control values by reading replacement values from a file
sls_analyse_matrix - set up matrix data structures and analyse the structure to choose a suitable order for factorization
sls_reset_control (optional) - possibly change control parameters if a sequence of problems are being solved
sls_factorize_matrix - form and factorize the matrix \(A\)
one of
sls_solve_system - solve the linear system of equations \(Ax=b\)
sls_partial_solve_system - solve a linear system \(Mx=b\) involving one of the matrix factors \(M\) of \(A\)
sls_information (optional) - recover information about the solution and solution process
sls_terminate - deallocate data structures
See the examples section for illustrations of use.
parametric real type T#
Below, the symbol T refers to a parametric real type that may be Float32 (single precision), Float64 (double precision) or, if supported, Float128 (quadruple precision).
callable functions#
function sls_initialize(T, solver, data, control, status)
Select solver, set default control values and initialize private data
Parameters:
solver |
is a one-dimensional array of type Vararg{Cchar} that specifies the solver package that should be used to factorize the matrix \(A\). It should be one of ‘sils’, ‘ma27’, ‘ma57’, ‘ma77’, ‘ma86’, ‘ma87’, ‘ma97’, ‘ssids’, ‘mumps’, ‘pardiso’, ‘mkl pardiso’, ‘pastix’, ‘wsmp’, ‘potr’, ‘sytr’ or ‘pbtr’; lower or upper case variants are allowed. |
data |
holds private internal data |
control |
is a structure containing control information (see sls_control_type) |
status |
is a scalar variable of type Int32 that gives the exit status from the package. Possible values are:
|
function sls_read_specfile(T, control, specfile)
Read the content of a specification file, and assign values associated with given keywords to the corresponding control parameters. An in-depth discussion of specification files is available, and a detailed list of keywords with associated default values is provided in $GALAHAD/src/sls/SLS.template. See also Table 2.1 in the Fortran documentation provided in $GALAHAD/doc/sls.pdf for a list of how these keywords relate to the components of the control structure.
Parameters:
control |
is a structure containing control information (see sls_control_type) |
specfile |
is a one-dimensional array of type Vararg{Cchar} that must give the name of the specification file |
function sls_analyse_matrix(T, control, data, status, n, type, ne, row, col, ptr)
Import structural matrix data into internal storage prior to solution
Parameters:
control |
is a structure whose members provide control parameters for the remaining procedures (see sls_control_type) |
data |
holds private internal data |
status |
is a scalar variable of type Int32 that gives the exit status from the package. Possible values are:
|
n |
is a scalar variable of type Int32 that holds the number of rows in the symmetric matrix \(A\). |
type |
is a one-dimensional array of type Vararg{Cchar} that specifies the symmetric storage scheme used for the matrix \(A\). It should be one of ‘coordinate’, ‘sparse_by_rows’ or ‘dense’; lower or upper case variants are allowed. |
ne |
is a scalar variable of type Int32 that holds the number of entries in the lower triangular part of \(A\) in the sparse co-ordinate storage scheme. It need not be set for any of the other schemes. |
row |
is a one-dimensional array of size ne and type Int32 that holds the row indices of the lower triangular part of \(A\) in the sparse co-ordinate storage scheme. It need not be set for any of the other three schemes, and in this case can be C_NULL. |
col |
is a one-dimensional array of size ne and type Int32 that holds the column indices of the lower triangular part of \(A\) in either the sparse co-ordinate, or the sparse row-wise storage scheme. It need not be set when the dense storage scheme is used, and in this case can be C_NULL. |
ptr |
is a one-dimensional array of size n+1 and type Int32 that holds the starting position of each row of the lower triangular part of \(A\), as well as the total number of entries, in the sparse row-wise storage scheme. It need not be set when the other schemes are used, and in this case can be C_NULL. |
function sls_reset_control(T, control, data, status)
Reset control parameters after import if required.
Parameters:
control |
is a structure whose members provide control parameters for the remaining procedures (see sls_control_type) |
data |
holds private internal data |
status |
is a scalar variable of type Int32 that gives the exit status from the package. Possible values are:
|
function sls_factorize_matrix(T, data, status, ne, val)
Form and factorize the symmetric matrix \(A\).
Parameters:
data |
holds private internal data |
status |
is a scalar variable of type Int32 that gives the exit status from the package. Possible values are:
|
ne |
is a scalar variable of type Int32 that holds the number of entries in the lower triangular part of the symmetric matrix \(A\). |
val |
is a one-dimensional array of size ne and type T that holds the values of the entries of the lower triangular part of the symmetric matrix \(A\) in any of the supported storage schemes. |
function sls_solve_system(T, data, status, n, sol)
Solve the linear system \(Ax=b\).
Parameters:
data |
holds private internal data |
status |
is a scalar variable of type Int32 that gives the exit status from the package. Possible values are:
|
n |
is a scalar variable of type Int32 that holds the number of entries in the vectors \(b\) and \(x\). |
sol |
is a one-dimensional array of size n and type double. On entry, it must hold the vector \(b\). On a successful exit, its contains the solution \(x\). |
function sls_partial_solve_system(T, part, data, status, n, sol)
Given the factorization \(A = L D U\) with \(U = L^T\), solve the linear system
Parameters:
part |
is a one-dimensional array of type Vararg{Cchar} that specifies the component \(M\) of the factorization that is to be used. It should be one of “L”, “D”, “U” or “S”, and these correspond to the parts \(L\), \(D\), \(U\) and \(S\); lower or upper case variants are allowed. |
data |
holds private internal data |
status |
is a scalar variable of type Int32 that gives the entry and exit status from the package. On initial entry, status must be set to 1. Possible exit values are:
|
n |
is a scalar variable of type Int32 that holds the number of entries in the vectors \(b\) and \(x\). |
sol |
is a one-dimensional array of size n and type double. On entry, it must hold the vector \(b\). On a successful exit, its contains the solution \(x\). |
function sls_information(T, data, inform, status)
Provide output information
Parameters:
data |
holds private internal data |
inform |
is a structure containing output information (see sls_inform_type) |
status |
is a scalar variable of type Int32 that gives the exit status from the package. Possible values are (currently):
|
function sls_terminate(T, data, control, inform)
Deallocate all internal private storage
Parameters:
data |
holds private internal data |
control |
is a structure containing control information (see sls_control_type) |
inform |
is a structure containing output information (see sls_inform_type) |
available structures#
sls_control_type structure#
struct sls_control_type{T} f_indexing::Bool error::Int32 warning::Int32 out::Int32 statistics::Int32 print_level::Int32 print_level_solver::Int32 bits::Int32 block_size_kernel::Int32 block_size_elimination::Int32 blas_block_size_factorize::Int32 blas_block_size_solve::Int32 node_amalgamation::Int32 initial_pool_size::Int32 min_real_factor_size::Int32 min_integer_factor_size::Int32 max_real_factor_size::Int64 max_integer_factor_size::Int64 max_in_core_store::Int64 array_increase_factor::T array_decrease_factor::T pivot_control::Int32 ordering::Int32 full_row_threshold::Int32 row_search_indefinite::Int32 scaling::Int32 scale_maxit::Int32 scale_thresh::T relative_pivot_tolerance::T minimum_pivot_tolerance::T absolute_pivot_tolerance::T zero_tolerance::T zero_pivot_tolerance::T negative_pivot_tolerance::T static_pivot_tolerance::T static_level_switch::T consistency_tolerance::T max_iterative_refinements::Int32 acceptable_residual_relative::T acceptable_residual_absolute::T multiple_rhs::Bool generate_matrix_file::Bool matrix_file_device::Int32 matrix_file_name::NTuple{31,Cchar} out_of_core_directory::NTuple{401,Cchar} out_of_core_integer_factor_file::NTuple{401,Cchar} out_of_core_real_factor_file::NTuple{401,Cchar} out_of_core_real_work_file::NTuple{401,Cchar} out_of_core_indefinite_file::NTuple{401,Cchar} out_of_core_restart_file::NTuple{501,Cchar} prefix::NTuple{31,Cchar}
detailed documentation#
control derived type as a Julia structure
components#
Bool f_indexing
use C or Fortran sparse matrix indexing
Int32 error
unit for error messages
Int32 warning
unit for warning messages
Int32 out
unit for monitor output
Int32 statistics
unit for statistical output
Int32 print_level
controls level of diagnostic output
Int32 print_level_solver
controls level of diagnostic output from external solver
Int32 bits
number of bits used in architecture
Int32 block_size_kernel
the target blocksize for kernel factorization
Int32 block_size_elimination
the target blocksize for parallel elimination
Int32 blas_block_size_factorize
level 3 blocking in factorize
Int32 blas_block_size_solve
level 2 and 3 blocking in solve
Int32 node_amalgamation
a child node is merged with its parent if they both involve fewer than node_amalgamation eliminations
Int32 initial_pool_size
initial size of task-pool arrays for parallel elimination
Int32 min_real_factor_size
initial size for real array for the factors and other data
Int32 min_integer_factor_size
initial size for integer array for the factors and other data
Int64 max_real_factor_size
maximum size for real array for the factors and other data
Int64 max_integer_factor_size
maximum size for integer array for the factors and other data
Int64 max_in_core_store
amount of in-core storage to be used for out-of-core factorization
T array_increase_factor
factor by which arrays sizes are to be increased if they are too small
T array_decrease_factor
if previously allocated internal workspace arrays are greater than array_decrease_factor times the currently required sizes, they are reset to current requirements
Int32 pivot_control
pivot control:
1 Numerical pivoting will be performed.
2 No pivoting will be performed and an error exit will occur immediately a pivot sign change is detected.
3 No pivoting will be performed and an error exit will occur if a zero pivot is detected.
4 No pivoting is performed but pivots are changed to all be positive
Int32 ordering
controls ordering (ignored if explicit PERM argument present)
<0 chosen by the specified solver with its own ordering-selected value -ordering
0 chosen package default (or the AMD ordering if no package default)
1 Approximate minimum degree (AMD) with provisions for “dense” rows/col
2 Minimum degree
3 Nested disection
4 indefinite ordering to generate a combination of 1x1 and 2x2 pivots
5 Profile/Wavefront reduction
6 Bandwidth reduction
>6 ordering chosen depending on matrix characteristics (not yet implemented)
Int32 full_row_threshold
controls threshold for detecting full rows in analyse, registered as percentage of matrix order. If 100, only fully dense rows detected (defa
Int32 row_search_indefinite
number of rows searched for pivot when using indefinite ordering
Int32 scaling
controls scaling (ignored if explicit SCALE argument present)
<0 chosen by the specified solver with its own scaling-selected value -scaling
0 No scaling
1 Scaling using HSL’s MC64
2 Scaling using HSL’s MC77 based on the row one-norm
3 Scaling using HSL’s MC77 based on the row infinity-norm
Int32 scale_maxit
the number of scaling iterations performed (default 10 used if .scale_maxit < 0)
T scale_thresh
the scaling iteration stops as soon as the row/column norms are less than 1+/-.scale_thresh
T relative_pivot_tolerance
pivot threshold
T minimum_pivot_tolerance
smallest permitted relative pivot threshold
T absolute_pivot_tolerance
any pivot small than this is considered zero
T zero_tolerance
any entry smaller than this is considered zero
T zero_pivot_tolerance
any pivot smaller than this is considered zero for positive-definite sol
T negative_pivot_tolerance
any pivot smaller than this is considered to be negative for p-d solvers
T static_pivot_tolerance
used for setting static pivot level
T static_level_switch
used for switch to static
T consistency_tolerance
used to determine whether a system is consistent when seeking a Fredholm alternative
Int32 max_iterative_refinements
maximum number of iterative refinements allowed
T acceptable_residual_relative
refinement will cease as soon as the residual ||Ax-b|| falls below max( acceptable_residual_relative * ||b||, acceptable_residual_absolute
T acceptable_residual_absolute
see acceptable_residual_relative
Bool multiple_rhs
set .multiple_rhs to .true. if there is possibility that the solver will be required to solve systems with more than one right-hand side. More efficient execution may be possible when .multiple_rhs = .false.
Bool generate_matrix_file
if .generate_matrix_file is .true. if a file describing the current matrix is to be generated
Int32 matrix_file_device
specifies the unit number to write the input matrix (in co-ordinate form
char matrix_file_name[31]
name of generated matrix file containing input problem
char out_of_core_directory[401]
directory name for out of core factorization and additional real workspace in the indefinite case, respectively
char out_of_core_integer_factor_file[401]
out of core superfile names for integer and real factor data, real works and additional real workspace in the indefinite case, respectively
char out_of_core_real_factor_file[401]
see out_of_core_integer_factor_file
char out_of_core_real_work_file[401]
see out_of_core_integer_factor_file
char out_of_core_indefinite_file[401]
see out_of_core_integer_factor_file
char out_of_core_restart_file[501]
see out_of_core_integer_factor_file
NTuple{31,Cchar} prefix
all output lines will be prefixed by prefix(2:LEN(TRIM(.prefix))-1) where prefix contains the required string enclosed in quotes, e.g. “string” or ‘string’
sls_time_type structure#
struct sls_time_type{T} total::T analyse::T factorize::T solve::T order_external::T analyse_external::T factorize_external::T solve_external::T clock_total::T clock_analyse::T clock_factorize::T clock_solve::T clock_order_external::T clock_analyse_external::T clock_factorize_external::T clock_solve_external::T
detailed documentation#
time derived type as a Julia structure
components#
T total
the total cpu time spent in the package
T analyse
the total cpu time spent in the analysis phase
T factorize
the total cpu time spent in the factorization phase
T solve
the total cpu time spent in the solve phases
T order_external
the total cpu time spent by the external solver in the ordering phase
T analyse_external
the total cpu time spent by the external solver in the analysis phase
T factorize_external
the total cpu time spent by the external solver in the factorization pha
T solve_external
the total cpu time spent by the external solver in the solve phases
T clock_total
the total clock time spent in the package
T clock_analyse
the total clock time spent in the analysis phase
T clock_factorize
the total clock time spent in the factorization phase
T clock_solve
the total clock time spent in the solve phases
T clock_order_external
the total clock time spent by the external solver in the ordering phase
T clock_analyse_external
the total clock time spent by the external solver in the analysis phase
T clock_factorize_external
the total clock time spent by the external solver in the factorization p
T clock_solve_external
the total clock time spent by the external solver in the solve phases
sls_inform_type structure#
struct sls_inform_type{T} status::Int32 alloc_status::Int32 bad_alloc::NTuple{81,Cchar} more_info::Int32 entries::Int32 out_of_range::Int32 duplicates::Int32 upper::Int32 missing_diagonals::Int32 max_depth_assembly_tree::Int32 nodes_assembly_tree::Int32 real_size_desirable::Int64 integer_size_desirable::Int64 real_size_necessary::Int64 integer_size_necessary::Int64 real_size_factors::Int64 integer_size_factors::Int64 entries_in_factors::Int64 max_task_pool_size::Int32 max_front_size::Int32 compresses_real::Int32 compresses_integer::Int32 two_by_two_pivots::Int32 semi_bandwidth::Int32 delayed_pivots::Int32 pivot_sign_changes::Int32 static_pivots::Int32 first_modified_pivot::Int32 rank::Int32 negative_eigenvalues::Int32 num_zero::Int32 iterative_refinements::Int32 flops_assembly::Int64 flops_elimination::Int64 flops_blas::Int64 largest_modified_pivot::T minimum_scaling_factor::T maximum_scaling_factor::T condition_number_1::T condition_number_2::T backward_error_1::T backward_error_2::T forward_error::T alternative::Bool solver::NTuple{21,Cchar} time::sls_time_type{T} sils_ainfo::sils_ainfo_type{T} sils_finfo::sils_finfo_type{T} sils_sinfo::sils_sinfo_type{T} ma57_ainfo::ma57_ainfo{T} ma57_finfo::ma57_finfo{T} ma57_sinfo::ma57_sinfo{T} ma77_info::ma77_info{T} ma86_info::ma86_info{T} ma87_info::ma87_info{T} ma97_info::ma97_info{T} ssids_inform::spral_ssids_inform mc61_info::NTuple{10,Cint} mc61_rinfo::NTuple{15,T} mc64_info::mc64_info mc68_info::mc68_info mc77_info::NTuple{10,Cint} mc77_rinfo::NTuple{10,T} mumps_error::Int32 mumps_info::NTuple{80,Cint} mumps_rinfo::NTuple{40,T} pardiso_error::Int32 pardiso_IPARM::NTuple{64,Cint} pardiso_DPARM::NTuple{64,T} mkl_pardiso_error::Int32 mkl_pardiso_IPARM::NTuple{64,Cint} pastix_info::Int32 wsmp_error::Int32 wsmp_iparm::NTuple{64,Cint} wsmp_dparm::NTuple{64,T} mpi_ierr::Int32 lapack_error::Int32
detailed documentation#
inform derived type as a Julia structure
components#
Int32 status
reported return status. Possible values are:
0
success
-1
allocation error
-2
deallocation error
-3
matrix data faulty (n < 1, ne < 0)
-20
alegedly postive definite definite matrix is indefinite
-29
unavailable option
-31
input order is not a permutation or is faulty in some other way
-32
> control.max_integer_factor_size integer space required for factor
-33
> control.max_real_factor_size real space required for factors
-40
not possible to alter the diagonals
-41
no access to permutation or pivot sequence used
-42
no access to diagonal perturbations
-43
direct-access file error
-50
solver-specific error; see the solver’s info parameter
-101
unknown solver
Int32 alloc_status
STAT value after allocate failure.
NTuple{81,Cchar} bad_alloc
name of array which provoked an allocate failure
Int32 more_info
further information on failure
Int32 entries
number of entries
Int32 out_of_range
number of indices out-of-range
Int32 duplicates
number of duplicates
Int32 upper
number of entries from the strict upper triangle
Int32 missing_diagonals
number of missing diagonal entries for an allegedly-definite matrix
Int32 max_depth_assembly_tree
maximum depth of the assembly tree
Int32 nodes_assembly_tree
nodes in the assembly tree (= number of elimination steps)
Int64 real_size_desirable
desirable or actual size for real array for the factors and other data
Int64 integer_size_desirable
desirable or actual size for integer array for the factors and other dat
Int64 real_size_necessary
necessary size for real array for the factors and other data
Int64 integer_size_necessary
necessary size for integer array for the factors and other data
Int64 real_size_factors
predicted or actual number of reals to hold factors
Int64 integer_size_factors
predicted or actual number of integers to hold factors
Int64 entries_in_factors
number of entries in factors
Int32 max_task_pool_size
maximum number of tasks in the factorization task pool
Int32 max_front_size
forecast or actual size of largest front
Int32 compresses_real
number of compresses of real data
Int32 compresses_integer
number of compresses of integer data
Int32 two_by_two_pivots
number of 2x2 pivots
Int32 semi_bandwidth
semi-bandwidth of matrix following bandwidth reduction
Int32 delayed_pivots
number of delayed pivots (total)
Int32 pivot_sign_changes
number of pivot sign changes if no pivoting is used successfully
Int32 static_pivots
number of static pivots chosen
Int32 first_modified_pivot
first pivot modification when static pivoting
Int32 rank
estimated rank of the matrix
Int32 negative_eigenvalues
number of negative eigenvalues
Int32 num_zero
number of pivots that are considered zero (and ignored)
Int32 iterative_refinements
number of iterative refinements performed
Int64 flops_assembly
anticipated or actual number of floating-point operations in assembly
Int64 flops_elimination
anticipated or actual number of floating-point operations in elimination
Int64 flops_blas
additional number of floating-point operations for BLAS
T largest_modified_pivot
largest diagonal modification when static pivoting or ensuring definiten
T minimum_scaling_factor
minimum scaling factor
T maximum_scaling_factor
maximum scaling factor
T condition_number_1
esimate of the condition number of the matrix (category 1 equations)
T condition_number_2
estimate of the condition number of the matrix (category 2 equations)
T backward_error_1
esimate of the backward error (category 1 equations)
T backward_error_2
esimate of the backward error (category 2 equations)
T forward_error
estimate of forward error
Bool alternative
has an “alternative” \(y\): \(A y = 0\) and \(y^T b > 0\) been found when trying to solve \(A x = b\) ?
char solver[21]
name of external solver used to factorize and solve
struct sls_time_type time
timings (see above)
struct sils_ainfo_type sils_ainfo
the analyse output structure from sils
struct sils_finfo_type sils_finfo
the factorize output structure from sils
struct sils_sinfo_type sils_sinfo
the solve output structure from sils
struct ma57_ainfo ma57_ainfo
the analyse output structure from hsl_ma57
struct ma57_finfo ma57_finfo
the factorize output structure from hsl_ma57
struct ma57_sinfo ma57_sinfo
the solve output structure from hsl_ma57
struct ma77_info ma77_info
the output structure from hsl_ma77
struct ma86_info ma86_info
the output structure from hsl_ma86
struct ma87_info ma87_info
the output structure from hsl_ma87
struct ma97_info ma97_info
the output structure from hsl_ma97
struct spral_ssids_inform ssids_inform
the output structure from ssids
Int32 mc61_info[10]
the real output array from mc61 from HSL
T mc61_rinfo[15]
the integer output array from mc61 from HSL
struct mc64_info mc64_info
the output structure from hsl_mc64
struct mc68_info mc68_info
the output structure from hsl_mc68
Int32 mc77_info[10]
the integer output array from mc77
T mc77_rinfo[10]
the real output array from mc77 from HSL
Int32 mumps_error
the output scalars and arrays from mumps
Int32 mumps_info[80]
see mumps_error
T mumps_rinfo[40]
see mumps_error
Int32 pardiso_error
the output scalars and arrays from pardiso
Int32 pardiso_IPARM[64]
see pardiso_error
T pardiso_DPARM[64]
see pardiso_error
Int32 mkl_pardiso_error
the output scalars and arrays from mkl_pardiso
Int32 mkl_pardiso_IPARM[64]
see mkl_pardiso_error
Int32 pastix_info
the output flag from pastix
Int32 wsmp_error
the output scalars and arrays from wsmp
Int32 wsmp_iparm[64]
see wsmp_error
T wsmp_dparm[64]
see wsmp_error
Int32 mpi_ierr
the output flag from MPI routines
Int32 lapack_error
the output flag from LAPACK routines
example calls#
This is an example of how to use the package to solve a symmetric system of linear equations; the code is available in $GALAHAD/src/sls/Julia/test_sls.jl . A variety of supported matrix storage formats are shown.
# test_sls.jl
# Simple code to test the Julia interface to SLS
using GALAHAD
using Test
using Printf
using Accessors
using Quadmath
function test_sls(::Type{T}) where T
maxabsarray(a) = abs.(a) |> maximum
# Derived types
data = Ref{Ptr{Cvoid}}()
control = Ref{sls_control_type{T}}()
inform = Ref{sls_inform_type{T}}()
# Set problem data
n = 5 # dimension of A
ne = 7 # number of entries of A
dense_ne = 15 # number of elements of A as a dense matrix
row = Cint[1, 2, 2, 3, 3, 4, 5] # row indices, NB lower triangle
col = Cint[1, 1, 5, 2, 3, 3, 5] # column indices
ptr = Cint[1, 2, 4, 6, 7, 8] # pointers to indices
val = T[2.0, 3.0, 6.0, 4.0, 1.0, 5.0, 1.0] # values
dense = T[2.0, 3.0, 0.0, 0.0, 4.0, 1.0, 0.0,
0.0, 5.0, 0.0, 0.0, 6.0, 0.0, 0.0, 1.0]
rhs = T[8.0, 45.0, 31.0, 15.0, 17.0]
sol = T[1.0, 2.0, 3.0, 4.0, 5.0]
status = Ref{Cint}()
x = zeros(T, n)
error = zeros(T, n)
norm_residual = Ref{T}()
good_x = eps(Float64)^(1 / 3)
@printf(" Fortran sparse matrix indexing\n\n")
@printf(" basic tests of storage formats\n\n")
@printf(" storage RHS refine partial\n")
for d in 1:3
# Initialize SLS - use the sytr solver
sls_initialize(T, "sytr", data, control, status)
# Set user-defined control options
@reset control[].f_indexing = true # Fortran sparse matrix indexing
# sparse co-ordinate storage
if d == 1
@printf(" coordinate ")
sls_analyse_matrix(T, control, data, status, n,
"coordinate", ne, row, col, C_NULL)
sls_factorize_matrix(T, data, status, ne, val)
end
# sparse by rows
if d == 2
@printf(" sparse by rows ")
sls_analyse_matrix(T, control, data, status, n,
"sparse_by_rows", ne, C_NULL, col, ptr)
sls_factorize_matrix(T, data, status, ne, val)
end
# dense
if d == 3
@printf(" dense ")
sls_analyse_matrix(T, control, data, status, n,
"dense", ne, C_NULL, C_NULL, C_NULL)
sls_factorize_matrix(T, data, status, dense_ne, dense)
end
# Set right-hand side and solve the system
for i in 1:n
x[i] = rhs[i]
end
sls_solve_system(T, data, status, n, x)
sls_information(T, data, inform, status)
if inform[].status == 0
for i in 1:n
error[i] = x[i] - sol[i]
end
norm_residual = maxabsarray(error)
if norm_residual < good_x
@printf(" ok ")
else
@printf(" fail ")
end
else
@printf(" SLS_solve exit status = %1i\n", inform[].status)
end
# @printf("sol: ")
# for i = 1:n
# @printf("%f ", x[i])
# end
# resolve, this time using iterative refinement
@reset control[].max_iterative_refinements = Cint(1)
sls_reset_control(T, control, data, status)
for i in 1:n
x[i] = rhs[i]
end
sls_solve_system(T, data, status, n, x)
sls_information(T, data, inform, status)
if inform[].status == 0
for i in 1:n
error[i] = x[i] - sol[i]
end
norm_residual = maxabsarray(error)
if norm_residual < good_x
@printf("ok ")
else
@printf(" fail ")
end
else
@printf(" SLS_solve exit status = %1i\n", inform[].status)
end
# obtain the solution by part solves
for i in 1:n
x[i] = rhs[i]
end
sls_partial_solve_system(T, "L", data, status, n, x)
sls_partial_solve_system(T, "D", data, status, n, x)
sls_partial_solve_system(T, "U", data, status, n, x)
sls_information(T, data, inform, status)
if inform[].status == 0
for i in 1:n
error[i] = x[i] - sol[i]
end
norm_residual = maxabsarray(error)
if norm_residual < good_x
@printf("ok ")
else
@printf(" fail ")
end
else
@printf(" SLS_solve exit status = %1i\n", inform[].status)
end
# Delete internal workspace
sls_terminate(T, data, control, inform)
end
return 0
end
@testset "SLS" begin
@test test_sls(Float32) == 0
@test test_sls(Float64) == 0
@test test_sls(Float128) == 0
end