GALAHAD PSLS package#
purpose#
Given a sparse symmetric matrix \(A = \{ a_{ij} \}_{n \times n}\),
the psls
package builds a suitable symmetric, positive definite—or
diagonally dominant—preconditioner \(P\) of \(A\) or a symmetric
sub-matrix thereof. The matrix \(A\) need not be definite. Facilities
are provided to apply the preconditioner to a given vector, and to
remove rows and columns (symmetrically) from the initial preconditioner
without a full re-factorization.
method#
The basic preconditioners are described in detail in
A. R. Conn, N. I. M. Gould and Ph. L. Toint, LANCELOT. A fortran package for large-scale nonlinear optimization (release A). Springer Verlag Series in Computational Mathematics 17, Berlin (1992), Section 3.3.10,
along with the more modern versions implements in ICFS
due to
C.-J. Lin and J. J. More’, ``Incomplete Cholesky factorizations with limited memory’’. SIAM Journal on Scientific Computing 21 (1999) 21-45,
and in HSL_MI28
described by
J. A. Scott and M. Tuma, ``HSL_MI28: an efficient and robust limited-memory incomplete Cholesky factorization code’’. ACM Transactions on Mathematical Software 40(4) (2014), Article 24.
The factorization methods used by the package sls
in conjunction
with some preconditioners are described in the documentation to that
package. The key features of the external solvers supported by sls
are given in the following table.
solver |
factorization |
indefinite \(A\) |
out-of-core |
parallelised |
---|---|---|---|---|
|
multifrontal |
yes |
no |
no |
|
multifrontal |
yes |
no |
no |
|
multifrontal |
yes |
yes |
OpenMP core |
|
left-looking |
yes |
no |
OpenMP fully |
|
left-looking |
no |
no |
OpenMP fully |
|
multifrontal |
yes |
no |
OpenMP core |
|
multifrontal |
yes |
no |
CUDA core |
|
multifrontal |
yes |
optionally |
MPI |
|
left-right-looking |
yes |
no |
OpenMP fully |
|
left-right-looking |
yes |
optionally |
OpenMP fully |
|
left-right-looking |
yes |
no |
OpenMP fully |
|
left-right-looking |
yes |
no |
OpenMP fully |
|
dense |
no |
no |
with parallel LAPACK |
|
dense |
yes |
no |
with parallel LAPACK |
|
dense band |
no |
no |
with parallel LAPACK |
Note that, with the exception of SSIDS
and the Netlib
reference LAPACK codes,
the solvers themselves do not form part of this package and
must be obtained/linked to separately. See the documentation for sls
for more details and references.
Dummy instances are provided for solvers that are unavailable.
Also note that additional flexibility may be obtained by calling the
solvers directly rather that via this package.
Orderings to reduce the bandwidth, as implemented in HSL’s MC61
, are due to
J. K. Reid and J. A. Scott, ``Ordering symmetric sparse matrices for small profile and wavefront’’. International Journal for Numerical Methods in Engineering 45 (1999) 1737-1755.
If a subset of the rows and columns are specified, the remaining rows/columns
are removed before processing. Any subsequent removal of rows and columns
is achieved using the Schur-complement updating package scu
unless a complete re-factorization is likely more efficient.
matrix storage#
The symmetric \(n\) by \(n\) matrix \(A\) may be presented and stored in a variety of formats. But crucially symmetry is exploited by only storing values from the lower triangular part (i.e, those entries that lie on or below the leading diagonal).
Dense storage format: The matrix \(A\) is stored as a compact dense matrix by rows, that is, the values of the entries of each row in turn are stored in order within an appropriate real one-dimensional array. Since \(A\) is symmetric, only the lower triangular part (that is the part \(A_{ij}\) for \(1 \leq j \leq i \leq n\)) need be held. In this case the lower triangle should be stored by rows, that is component \((i-1) * i / 2 + j\) of the storage array A_val will hold the value \(A_{ij}\) (and, by symmetry, \(A_{ji}\)) for \(1 \leq j \leq i \leq n\). The string A_type = ‘dense’ should be specified.
Sparse co-ordinate storage format: Only the nonzero entries of the matrices are stored. For the \(l\)-th entry, \(1 \leq l \leq ne\), of \(A\), its row index i, column index j and value \(A_{ij}\), \(1 \leq j \leq i \leq n\), are stored as the \(l\)-th components of the integer arrays A_row and A_col and real array A_val, respectively, while the number of nonzeros is recorded as A_ne = \(ne\). Note that only the entries in the lower triangle should be stored. The string A_type = ‘coordinate’ should be specified.
Sparse row-wise storage format: Again only the nonzero entries are stored, but this time they are ordered so that those in row i appear directly before those in row i+1. For the i-th row of \(A\) the i-th component of the integer array A_ptr holds the position of the first entry in this row, while A_ptr(n+1) holds the total number of entries plus one. The column indices j, \(1 \leq j \leq i\), and values \(A_{ij}\) of the entries in the i-th row are stored in components l = A_ptr(i), …, A_ptr(i+1)-1 of the integer array A_col, and real array A_val, respectively. Note that as before only the entries in the lower triangle should be stored. For sparse matrices, this scheme almost always requires less storage than its predecessor. The string A_type = ‘sparse_by_rows’ should be specified.
Diagonal storage format: If \(A\) is diagonal (i.e., \(A_{ij} = 0\) for all \(1 \leq i \neq j \leq n\)) only the diagonals entries \(A_{ii}\), \(1 \leq i \leq n\) need be stored, and the first n components of the array A_val may be used for the purpose. The string A_type = ‘diagonal’ should be specified.
Multiples of the identity storage format: If \(A\) is a multiple of the identity matrix, (i.e., \(H = \alpha I\) where \(I\) is the n by n identity matrix and \(\alpha\) is a scalar), it suffices to store \(\alpha\) as the first component of A_val. The string A_type = ‘scaled_identity’ should be specified.
The identity matrix format: If \(A\) is the identity matrix, no values need be stored. The string A_type = ‘identity’ should be specified.
The zero matrix format: The same is true if \(A\) is the zero matrix, but now the string A_type = ‘zero’ or ‘none’ should be specified.
introduction to function calls#
To solve a given problem, functions from the psls package must be called in the following order:
psls_initialize - provide default control parameters and set up initial data structures
psls_read_specfile (optional) - override control values by reading replacement values from a file
psls_import - set up matrix data structures for \(A\) prior to solution
psls_reset_control (optional) - possibly change control parameters if a sequence of problems are being solved
one of
psls_form_preconditioner - form and factorize a preconditioner \(P\) of the matrix \(A\)
psls_form_subset_preconditioner - form and factorize a preconditioner \(P\) of a symmetric submatrix of the matrix \(A\)
psls_update_preconditioner (optional) - update the preconditioner \(P\) when rows (amd columns) are removed
psls_apply_preconditioner - solve the linear system of equations \(Px=b\)
psls_information (optional) - recover information about the preconditioner and solution process
psls_terminate - deallocate data structures
See the examples section for illustrations of use.
parametric real type T#
Below, the symbol T refers to a parametric real type that may be Float32 (single precision), Float64 (double precision) or, if supported, Float128 (quadruple precision).
callable functions#
function psls_initialize(T, data, control, status)
Set default control values and initialize private data
Parameters:
data |
holds private internal data |
control |
is a structure containing control information (see psls_control_type) |
status |
is a scalar variable of type Int32 that gives the exit status from the package. Possible values are (currently):
|
function psls_read_specfile(T, control, specfile)
Read the content of a specification file, and assign values associated with given keywords to the corresponding control parameters. An in-depth discussion of specification files is available, and a detailed list of keywords with associated default values is provided in $GALAHAD/src/psls/PSLS.template. See also Table 2.1 in the Fortran documentation provided in $GALAHAD/doc/psls.pdf for a list of how these keywords relate to the components of the control structure.
Parameters:
control |
is a structure containing control information (see psls_control_type) |
specfile |
is a one-dimensional array of type Vararg{Cchar} that must give the name of the specification file |
function psls_import(T, control, data, status, n, type, ne, row, col, ptr)
Import structural matrix data into internal storage prior to solution.
Parameters:
control |
is a structure whose members provide control parameters for the remaining procedures (see psls_control_type) |
data |
holds private internal data |
status |
is a scalar variable of type Int32 that gives the exit status from the package. Possible values are:
|
n |
is a scalar variable of type Int32 that holds the number of rows in the symmetric matrix \(A\). |
type |
is a one-dimensional array of type Vararg{Cchar} that specifies the symmetric storage scheme used for the matrix \(A\). It should be one of ‘coordinate’, ‘sparse_by_rows’ or ‘dense’; lower or upper case variants are allowed. |
ne |
is a scalar variable of type Int32 that holds the number of entries in the lower triangular part of \(A\) in the sparse co-ordinate storage scheme. It need not be set for any of the other schemes. |
row |
is a one-dimensional array of size ne and type Int32 that holds the row indices of the lower triangular part of \(A\) in the sparse co-ordinate storage scheme. It need not be set for any of the other three schemes, and in this case can be C_NULL. |
col |
is a one-dimensional array of size ne and type Int32 that holds the column indices of the lower triangular part of \(A\) in either the sparse co-ordinate, or the sparse row-wise storage scheme. It need not be set when the dense storage scheme is used, and in this case can be C_NULL. |
ptr |
is a one-dimensional array of size n+1 and type Int32 that holds the starting position of each row of the lower triangular part of \(A\), as well as the total number of entries, in the sparse row-wise storage scheme. It need not be set when the other schemes are used, and in this case can be C_NULL. |
function psls_reset_control(T, control, data, status)
Reset control parameters after import if required.
Parameters:
control |
is a structure whose members provide control parameters for the remaining procedures (see psls_control_type) |
data |
holds private internal data |
status |
is a scalar variable of type Int32 that gives the exit status from the package. Possible values are:
|
function psls_form_preconditioner(T, data, status, ne, val)
Form and factorize a preconditioner \(P\) of the matrix \(A\).
Parameters:
data |
holds private internal data |
status |
is a scalar variable of type Int32 that gives the exit status from the package. Possible values are:
|
ne |
is a scalar variable of type Int32 that holds the number of entries in the lower triangular part of the symmetric matrix \(A\). |
val |
is a one-dimensional array of size ne and type T that holds the values of the entries of the lower triangular part of the symmetric matrix \(A\) in any of the supported storage schemes. |
function psls_form_subset_preconditioner(T, data, status, ne, val, n_sub, sub)
Form and factorize a \(P\) preconditioner of a symmetric submatrix of the matrix \(A\).
Parameters:
data |
holds private internal data |
status |
is a scalar variable of type Int32 that gives the exit status from the package. Possible values are:
|
ne |
is a scalar variable of type Int32 that holds the number of entries in the lower triangular part of the symmetric matrix \(A\). |
val |
is a one-dimensional array of size ne and type T that holds the values of the entries of the lower triangular part of the symmetric matrix \(A\) in any of the supported storage schemes. |
n_sub |
is a scalar variable of type Int32 that holds the number of rows (and columns) of the required submatrix of \(A\). |
sub |
is a one-dimensional array of size n_sub and type Int32 that holds the indices of the rows of required submatrix. |
function psls_update_preconditioner(T, data, status, ne, val, n_del, del)
Update the preconditioner \(P\) when rows (amd columns) are removed.
Parameters:
data |
holds private internal data |
status |
is a scalar variable of type Int32 that gives the exit status from the package. Possible values are:
|
ne |
is a scalar variable of type Int32 that holds the number of entries in the lower triangular part of the symmetric matrix \(A\). |
val |
is a one-dimensional array of size ne and type T that holds the values of the entries of the lower triangular part of the symmetric matrix \(A\) in any of the supported storage schemes. |
n_del |
is a scalar variable of type Int32 that holds the number of rows (and columns) of (sub) matrix that are to be deleted. |
del |
is a one-dimensional array of size n_fix and type Int32 that holds the indices of the rows that are to be deleted. |
function psls_apply_preconditioner(T, data, status, n, sol)
Solve the linear system \(Px=b\).
Parameters:
data |
holds private internal data |
status |
is a scalar variable of type Int32 that gives the exit status from the package. Possible values are:
|
n |
is a scalar variable of type Int32 that holds the number of entries in the vectors \(b\) and \(x\). |
sol |
is a one-dimensional array of size n and type double. On entry, it must hold the vector \(b\). On a successful exit, its contains the solution \(x\). Any component corresponding to rows/columns not in the initial subset recorded by psls_form_subset_preconditioner, or in those subsequently deleted by psls_update_preconditioner, will not be altered. |
function psls_information(T, data, inform, status)
Provide output information
Parameters:
data |
holds private internal data |
inform |
is a structure containing output information (see psls_inform_type) |
status |
is a scalar variable of type Int32 that gives the exit status from the package. Possible values are (currently):
|
function psls_terminate(T, data, control, inform)
Deallocate all internal private storage
Parameters:
data |
holds private internal data |
control |
is a structure containing control information (see psls_control_type) |
inform |
is a structure containing output information (see psls_inform_type) |
available structures#
psls_control_type structure#
struct psls_control_type{T} f_indexing::Bool error::Int32 out::Int32 print_level::Int32 preconditioner::Int32 semi_bandwidth::Int32 scaling::Int32 ordering::Int32 max_col::Int32 icfs_vectors::Int32 mi28_lsize::Int32 mi28_rsize::Int32 min_diagonal::T new_structure::Bool get_semi_bandwidth::Bool get_norm_residual::Bool space_critical::Bool deallocate_error_fatal::Bool definite_linear_solver::NTuple{31,Cchar} prefix::NTuple{31,Cchar} sls_control::sls_control_type{T} mi28_control::mi28_control{T}
detailed documentation#
control derived type as a Julia structure
components#
Bool f_indexing
use C or Fortran sparse matrix indexing
Int32 error
unit for error messages
Int32 out
unit for monitor output
Int32 print_level
controls level of diagnostic output
Int32 preconditioner
which preconditioner to use:
<0 no preconditioning occurs, \(P = I\)
0 the preconditioner is chosen automatically (forthcoming, and currently defaults to 1).
1 \(A\) is replaced by the diagonal, \(P\) = diag( max(\(A\), .min_diagonal ) ).
2 \(A\) is replaced by the band \(P\) = band(\(A\)) with semi-bandwidth .semi_bandwidth.
3 \(A\) is replaced by the reordered band \(P\) = band( order(\(A\)) ) with semi-bandwidth .semi_bandwidth, where order is chosen by the HSL package MC61 to move entries closer to the diagonal.
4 \(P\) is a full factorization of \(A\) using Schnabel-Eskow modifications, in which small or negative diagonals are made sensibly positive during the factorization.
5 \(P\) is a full factorization of \(A\) due to Gill, Murray, Ponceleon and Saunders, in which an indefinite factorization is altered to give a positive definite one.
6 \(P\) is an incomplete Cholesky factorization of \(A\) using the package ICFS due to Lin and More’.
7 \(P\) is an incomplete factorization of \(A\) implemented as HSL_MI28 from HSL.
8 \(P\) is an incomplete factorization of \(A\) due to Munskgaard (forthcoming).
>8 treated as 0.
N.B. Options 3-8 may require additional external software that is not part of the package, and that must be obtained separately.
Int32 semi_bandwidth
the semi-bandwidth for band(H) when .preconditioner = 2,3
Int32 scaling
not used at present
Int32 ordering
see scaling
Int32 max_col
maximum number of nonzeros in a column of \(A\) for Schur-complement factorization to accommodate newly deleted rpws and columns
Int32 icfs_vectors
number of extra vectors of length n required by the Lin-More’ incomplete Cholesky preconditioner when .preconditioner = 6
Int32 mi28_lsize
the maximum number of fill entries within each column of the incomplete factor L computed by HSL_MI28 when .preconditioner = 7. In general, increasing mi28_lsize improve the quality of the preconditioner but increases the time to compute and then apply the preconditioner. Values less than 0 are treated as 0
Int32 mi28_rsize
the maximum number of entries within each column of the strictly lower triangular matrix \(R\) used in the computation of the preconditioner by HSL_MI28 when .preconditioner = 7. Rank-1 arrays of size .mi28_rsize \* n are allocated internally to hold \(R\). Thus the amount of memory used, as well as the amount of work involved in computing the preconditioner, depends on mi28_rsize. Setting mi28_rsize > 0 generally leads to a higher quality preconditioner than using mi28_rsize = 0, and choosing mi28_rsize >= mi28_lsize is generally recommended
T min_diagonal
the minimum permitted diagonal in diag(max(H,.min_diagonal))
Bool new_structure
set new_structure true if the storage structure for the input matrix has changed, and false if only the values have changed
Bool get_semi_bandwidth
set get_semi_bandwidth true if the semi-bandwidth of the submatrix is to be calculated
Bool get_norm_residual
set get_norm_residual true if the residual when applying the preconditioner are to be calculated
Bool space_critical
if space is critical, ensure allocated arrays are no bigger than needed
Bool deallocate_error_fatal
exit if any deallocation fails
NTuple{31,Cchar} definite_linear_solver
the definite linear equation solver package used when .preconditioner = 3,4. Possible choices are currently: sils, ma27, ma57, ma77, ma86, ma87, ma97, ssids, mumps, pardiso, mkl_pardiso,pastix, wsmp, potr and pbtr, although only sils, potr, pbtr and, for OMP 4.0-compliant compilers, ssids are installed by default.
NTuple{31,Cchar} prefix
all output lines will be prefixed by prefix(2:LEN(TRIM(.prefix))-1) where prefix contains the required string enclosed in quotes, e.g. “string” or ‘string’
struct sls_control_type sls_control
control parameters for SLS
struct mi28_control mi28_control
control parameters for HSL_MI28
psls_time_type structure#
struct psls_time_type{T} total::Float32 assemble::Float32 analyse::Float32 factorize::Float32 solve::Float32 update::Float32 clock_total::T clock_assemble::T clock_analyse::T clock_factorize::T clock_solve::T clock_update::T
detailed documentation#
time derived type as a Julia structure
components#
Float32 total
total time
Float32 assemble
time to assemble the preconditioner prior to factorization
Float32 analyse
time for the analysis phase
Float32 factorize
time for the factorization phase
Float32 solve
time for the linear solution phase
Float32 update
time to update the factorization
T clock_total
total clock time spent in the package
T clock_assemble
clock time to assemble the preconditioner prior to factorization
T clock_analyse
clock time for the analysis phase
T clock_factorize
clock time for the factorization phase
T clock_solve
clock time for the linear solution phase
T clock_update
clock time to update the factorization
psls_inform_type structure#
struct psls_inform_type{T} status::Int32 alloc_status::Int32 analyse_status::Int32 factorize_status::Int32 solve_status::Int32 factorization_integer::Int64 factorization_real::Int64 preconditioner::Int32 semi_bandwidth::Int32 reordered_semi_bandwidth::Int32 out_of_range::Int32 duplicates::Int32 upper::Int32 missing_diagonals::Int32 semi_bandwidth_used::Int32 neg1::Int32 neg2::Int32 perturbed::Bool fill_in_ratio::T norm_residual::T bad_alloc::NTuple{81,Cchar} mc61_info::NTuple{10,Cint} mc61_rinfo::NTuple{15,T} time::psls_time_type{T} sls_inform::sls_inform_type{T} mi28_info::mi28_info{T}
detailed documentation#
inform derived type as a Julia structure
components#
Int32 status
reported return status:
0
success
-1
allocation error
-2
deallocation error
-3
matrix data faulty (.n < 1, .ne < 0)
Int32 alloc_status
STAT value after allocate failure.
Int32 analyse_status
status return from factorization
Int32 factorize_status
status return from factorization
Int32 solve_status
status return from solution phase
Int64 factorization_integer
number of integer words to hold factors
Int64 factorization_real
number of real words to hold factors
Int32 preconditioner
code for the actual preconditioner used (see control.preconditioner)
Int32 semi_bandwidth
the actual semi-bandwidth
Int32 reordered_semi_bandwidth
the semi-bandwidth following reordering (if any)
Int32 out_of_range
number of indices out-of-range
Int32 duplicates
number of duplicates
Int32 upper
number of entries from the strict upper triangle
Int32 missing_diagonals
number of missing diagonal entries for an allegedly-definite matrix
Int32 semi_bandwidth_used
the semi-bandwidth used
Int32 neg1
number of 1 by 1 pivots in the factorization
Int32 neg2
number of 2 by 2 pivots in the factorization
Bool perturbed
has the preconditioner been perturbed during the fctorization?
T fill_in_ratio
ratio of fill in to original nonzeros
T norm_residual
the norm of the solution residual
NTuple{81,Cchar} bad_alloc
name of array which provoked an allocate failure
Int32 mc61_info[10]
the integer and real output arrays from mc61
T mc61_rinfo[15]
see mc61_info
struct psls_time_type time
times for various stages
struct sls_inform_type sls_inform
inform values from SLS
struct mi28_info mi28_info
the output info structure from HSL’s mi28
example calls#
This is an example of how to use the package to solve a definite linear system; the code is available in $GALAHAD/src/psls/Julia/test_psls.jl . A variety of supported matrix storage formats are shown.
# test_psls.jl
# Simple code to test the Julia interface to PSLS
using GALAHAD
using Test
using Printf
using Accessors
using Quadmath
function test_psls(::Type{T}) where T
# Derived types
data = Ref{Ptr{Cvoid}}()
control = Ref{psls_control_type{T}}()
inform = Ref{psls_inform_type{T}}()
# Set problem data
n = 5 # dimension of A
ne = 7 # number of elements of A
dense_ne = div(n * (n + 1), 2) # number of elements of dense A
row = Cint[1, 2, 2, 3, 3, 4, 5] # A indices values, NB lower triangle
col = Cint[1, 1, 5, 2, 3, 3, 5]
ptr = Cint[1, 2, 4, 6, 7, 8]
val = T[2.0, 3.0, 6.0, 4.0, 1.0, 5.0, 1.0]
dense = T[2.0, 3.0, 0.0, 0.0, 4.0, 1.0, 0.0,
0.0, 5.0, 0.0, 0.0, 6.0, 0.0, 0.0, 1.0]
st = ' '
status = Ref{Cint}()
status_apply = Ref{Cint}()
@printf(" Fortran sparse matrix indexing\n\n")
@printf(" basic tests of storage formats\n\n")
for d in 1:3
# Initialize PSLS
psls_initialize(T, data, control, status)
@reset control[].preconditioner = Cint(2) # band preconditioner
@reset control[].semi_bandwidth = Cint(1) # semibandwidth
@reset control[].definite_linear_solver = galahad_linear_solver("sils")
# Set user-defined control options
@reset control[].f_indexing = true # fortran sparse matrix indexing
# sparse co-ordinate storage
if d == 1
st = 'C'
psls_import(T, control, data, status, n,
"coordinate", ne, row, col, C_NULL)
psls_form_preconditioner(T, data, status, ne, val)
end
# sparse by rows
if d == 2
st = 'R'
psls_import(T, control, data, status, n,
"sparse_by_rows", ne, C_NULL, col, ptr)
psls_form_preconditioner(T, data, status, ne, val)
end
# dense
if d == 3
st = 'D'
psls_import(T, control, data, status, n,
"dense", ne, C_NULL, C_NULL, C_NULL)
psls_form_preconditioner(T, data, status, dense_ne, dense)
end
# Set right-hand side b in x
x = T[8.0, 45.0, 31.0, 15.0, 17.0] # values
if status == 0
psls_information(T, data, inform, status)
psls_apply_preconditioner(T, data, status_apply, n, x)
else
status_apply[] = -1
end
@printf("%c storage: status from form factorize = %i apply = %i\n",
st, status[], status_apply[])
# @printf("x: ")
# for i = 1:n
# @printf("%f ", x[i])
# end
# Delete internal workspace
psls_terminate(T, data, control, inform)
end
return 0
end
@testset "PSLS" begin
@test test_psls(Float32) == 0
@test test_psls(Float64) == 0
@test test_psls(Float128) == 0
end