PSLS#

purpose#

Given a sparse symmetric matrix \(A = \{ a_{ij} \}_{n \times n}\), the psls package builds a suitable symmetric, positive definite—or diagonally dominant—preconditioner \(P\) of \(A\) or a symmetric sub-matrix thereof. The matrix \(A\) need not be definite. Facilities are provided to apply the preconditioner to a given vector, and to remove rows and columns (symmetrically) from the initial preconditioner without a full re-factorization.

method#

The basic preconditioners are described in detail in

A. R. Conn, N. I. M. Gould and Ph. L. Toint, LANCELOT. A fortran package for large-scale nonlinear optimization (release A). Springer Verlag Series in Computational Mathematics 17, Berlin (1992), Section 3.3.10,

along with the more modern versions implements in ICFS due to

C.-J. Lin and J. J. More’, ``Incomplete Cholesky factorizations with limited memory’’. SIAM Journal on Scientific Computing 21 (1999) 21-45,

and in HSL_MI28 described by

J. A. Scott and M. Tuma, ``HSL_MI28: an efficient and robust limited-memory incomplete Cholesky factorization code’’. ACM Transactions on Mathematical Software 40(4) (2014), Article 24.

The factorization methods used by the package sls in conjunction with some preconditioners are described in the documentation to that package. The key features of the external solvers supported by sls are given in the following table.

External solver characteristics#

solver

factorization

indefinite \(A\)

out-of-core

parallelised

SILS/MA27

multifrontal

yes

no

no

HSL_MA57

multifrontal

yes

no

no

HSL_MA77

multifrontal

yes

yes

OpenMP core

HSL_MA86

left-looking

yes

no

OpenMP fully

HSL_MA87

left-looking

no

no

OpenMP fully

HSL_MA97

multifrontal

yes

no

OpenMP core

SSIDS

multifrontal

yes

no

CUDA core

MUMPS

multifrontal

yes

optionally

MPI

PARDISO

left-right-looking

yes

no

OpenMP fully

MKL_PARDISO

left-right-looking

yes

optionally

OpenMP fully

PaStix

left-right-looking

yes

no

OpenMP fully

WSMP

left-right-looking

yes

no

OpenMP fully

POTR

dense

no

no

with parallel LAPACK

SYTR

dense

yes

no

with parallel LAPACK

PBTR

dense band

no

no

with parallel LAPACK

Note that, with the exception of SSIDS and the Netlib reference LAPACK codes, the solvers themselves do not form part of this package and must be obtained/linked to separately. See the documentation for sls for more details and references. Dummy instances are provided for solvers that are unavailable. Also note that additional flexibility may be obtained by calling the solvers directly rather that via this package.

Orderings to reduce the bandwidth, as implemented in HSL’s MC61, are due to

J. K. Reid and J. A. Scott, ``Ordering symmetric sparse matrices for small profile and wavefront’’. International Journal for Numerical Methods in Engineering 45 (1999) 1737-1755.

If a subset of the rows and columns are specified, the remaining rows/columns are removed before processing. Any subsequent removal of rows and columns is achieved using the Schur-complement updating package scu unless a complete re-factorization is likely more efficient.

matrix storage#

The symmetric \(n\) by \(n\) matrix \(A\) may be presented and stored in a variety of formats. But crucially symmetry is exploited by only storing values from the lower triangular part (i.e, those entries that lie on or below the leading diagonal).

Dense storage format: The matrix \(A\) is stored as a compact dense matrix by rows, that is, the values of the entries of each row in turn are stored in order within an appropriate real one-dimensional array. Since \(A\) is symmetric, only the lower triangular part (that is the part \(A_{ij}\) for \(0 \leq j \leq i \leq n-1\)) need be held. In this case the lower triangle should be stored by rows, that is component \(i * i / 2 + j\) of the storage array A_val will hold the value \(A_{ij}\) (and, by symmetry, \(A_{ji}\)) for \(0 \leq j \leq i \leq n-1\). The string A_type = ‘dense’ should be specified.

Sparse co-ordinate storage format: Only the nonzero entries of the matrices are stored. For the \(l\)-th entry, \(0 \leq l \leq ne-1\), of \(A\), its row index i, column index j and value \(A_{ij}\), \(0 \leq j \leq i \leq n-1\), are stored as the \(l\)-th components of the integer arrays A_row and A_col and real array A_val, respectively, while the number of nonzeros is recorded as A_ne = \(ne\). Note that only the entries in the lower triangle should be stored. The string A_type = ‘coordinate’ should be specified.

Sparse row-wise storage format: Again only the nonzero entries are stored, but this time they are ordered so that those in row i appear directly before those in row i+1. For the i-th row of \(A\) the i-th component of the integer array A_ptr holds the position of the first entry in this row, while A_ptr(n) holds the total number of entries. The column indices j, \(0 \leq j \leq i\), and values \(A_{ij}\) of the entries in the i-th row are stored in components l = A_ptr(i), …, A_ptr(i+1)-1 of the integer array A_col, and real array A_val, respectively. Note that as before only the entries in the lower triangle should be stored. For sparse matrices, this scheme almost always requires less storage than its predecessor. The string A_type = ‘sparse_by_rows’ should be specified.

Diagonal storage format: If \(A\) is diagonal (i.e., \(A_{ij} = 0\) for all \(0 \leq i \neq j \leq n-1\)) only the diagonals entries \(A_{ii}\), \(0 \leq i \leq n-1\) need be stored, and the first n components of the array A_val may be used for the purpose. The string A_type = ‘diagonal’ should be specified.

Multiples of the identity storage format: If \(A\) is a multiple of the identity matrix, (i.e., \(H = \alpha I\) where \(I\) is the n by n identity matrix and \(\alpha\) is a scalar), it suffices to store \(\alpha\) as the first component of A_val. The string A_type = ‘scaled_identity’ should be specified.

The identity matrix format: If \(A\) is the identity matrix, no values need be stored. The string A_type = ‘identity’ should be specified.

The zero matrix format: The same is true if \(A\) is the zero matrix, but now the string A_type = ‘zero’ or ‘none’ should be specified.

functions#

psls.initialize()#

Set default option values and initialize private data

Returns:

optionsdict
dictionary containing default control options:
errorint

error and warning diagnostics occur on stream error.

warningint

unit for warning messages.

outint

general output occurs on stream out.

statisticsint

unit for statistical output.

print_levelint

the level of output required is specified by print_level. Possible values are

  • <=0

    gives no output.

  • 1

    gives a summary of the process.

  • >=2

    gives increasingly verbose (debugging) output.

preconditionerint

which preconditioner to use. Possible values are:

  • <0

    no preconditioning occurs, \(P = I\).

  • 0

    the preconditioner is chosen automatically (forthcoming, and currently defaults to 1).

  • 1

    \(A\) is replaced by the diagonal, \(P\) = diag( max( \(A\), min_diagonal ) ).

  • 2

    \(A\) is replaced by the band \(P\) = band( \(A\) ) with semi-bandwidth semi_bandwidth.

  • 3

    \(A\) is replaced by the reordered band \(P\) = band( order( \(A\) ) ) with semi-bandwidth semi_bandwidth, where order is chosen by the HSL package MC61 to move entries closer to the diagonal.

  • 4

    \(P\) is a full factorization of \(A\) using Schnabel-Eskow modifications, in which small or negative diagonals are made sensibly positive during the factorization.

  • 5

    \(P\) is a full factorization of \(A\) due to Gill, Murray, Ponceleon and Saunders, in which an indefinite factorization is altered to give a positive definite one.

  • 6

    \(P\) is an incomplete Cholesky factorization of \(A\) using the package ICFS due to Lin and More’.

  • 7

    \(P\) is an incomplete factorization of \(A\) implemented as HSL_MI28 from HSL.

  • 8

    \(P\) is an incomplete factorization of \(A\) due to Munskgaard (forthcoming).

  • >8

    treated as 0.

Options 3-8 may require additional external software that is not part of the package, and that must be obtained separately.

semi_bandwidthint

the semi-bandwidth for band(H) when preconditioner = 2,3.

scalingint

not used at present.

orderingint

see scaling.

max_colint

maximum number of nonzeros in a column of \(A\) for Schur-complement factorization to accommodate newly deleted rpws and columns.

icfs_vectorsint

number of extra vectors of length n required by the Lin-More’ incomplete Cholesky preconditioner when preconditioner = 6.

mi28_lsizeint

the maximum number of fill entries within each column of the incomplete factor L computed by HSL_MI28 when preconditioner = 7. In general, increasing mi28_lsize improve the quality of the preconditioner but increases the time to compute and then apply the preconditioner. Values less than 0 are treated as 0.

mi28_rsizeint

the maximum number of entries within each column of the strictly lower triangular matrix \(R\) used in the computation of the preconditioner by HSL_MI28 when preconditioner = 7. Rank-1 arrays of size mi28_rsize * n are allocated internally to hold \(R\). Thus the amount of memory used, as well as the amount of work involved in computing the preconditioner, depends on mi28_rsize. Setting mi28_rsize > 0 generally leads to a higher quality preconditioner than using mi28_rsize = 0, and choosing mi28_rsize >= mi28_lsize is generally recommended.

min_diagonalfloat

the minimum permitted diagonal in diag(max(H,.min_diagonal)).

new_structurebool

set new_structure True if the storage structure for the input matrix has changed, and False if only the values have changed.

get_semi_bandwidthbool

set get_semi_bandwidth True if the semi-bandwidth of the submatrix is to be calculated.

get_norm_residualbool

set get_norm_residual True if the residual when applying the preconditioner are to be calculated.

space_criticalbool

if space_critical is True, every effort will be made to use as little space as possible. This may result in longer computation time.

deallocate_error_fatalbool

if deallocate_error_fatal is True, any array/pointer deallocation error will terminate execution. Otherwise, computation will continue.

definite_linear_solverstr

the definite linear equation solver used when preconditioner = 3,4. Possible choices are currently: sils, ma27, ma57, ma77, ma86, ma87, ma97, ssids, mumps, pardiso, mkl_pardiso, pastix, wsmp, potr and pbtr, although only sils, potr, pbtr and, for OMP 4.0-compliant compilers, ssids are installed by default.

prefixstr

all output lines will be prefixed by the string contained in quotes within prefix, e.g. ‘word’ (note the qutoes) will result in the prefix word.

sls_optionsdict

default control options for SLS (see sls.initialize).

mi28_optionsdict

default control options for HSL_MI28 (see mi28.initialize).

psls.load(n, A_type, A_ne, A_row, A_col, A_ptr, options=None)#

Import problem data into internal storage prior to factorization.

Parameters:

nint

holds the dimension of the system, \(n\).

A_typestring

specifies the symmetric storage scheme used for the matrix \(A\). It should be one of ‘coordinate’, ‘sparse_by_rows’, ‘dense’, ‘diagonal’, ‘scaled_identity’, ‘identity’, ‘zero’ or ‘none’; lower or upper case variants are allowed.

A_neint

holds the number of entries in the lower triangular part of \(A\) in the sparse co-ordinate storage scheme. It need not be set for any of the other schemes.

A_rowndarray(A_ne)

holds the row indices of the lower triangular part of \(A\) in the sparse co-ordinate storage scheme. It need not be set for any of the other schemes, and in this case can be None.

A_colndarray(A_ne)

holds the column indices of the lower triangular part of \(A\) in either the sparse co-ordinate, or the sparse row-wise storage scheme. It need not be set when the other storage schemes are used, and in this case can be None.

A_ptrndarray(n+1)

holds the starting position of each row of the lower triangular part of \(A\), as well as the total number of entries, in the sparse row-wise storage scheme. It need not be set when the other schemes are used, and in this case can be None.

optionsdict, optional

dictionary of control options (see psls.initialize).

psls.form_preconditioner(A_ne, A_val)#

Form and factorize the preconditioner \(P\) from the matrix \(A\).

Parameters:

A_neint

holds the number of entries in the lower triangular part of the matrix \(A\).

A_valndarray(A_ne)

holds the values of the nonzeros in the lower triangle of the matrix \(A\) in the same order as specified in the sparsity pattern in psls.load.

psls.form_subset_preconditioner(a_ne, A_val, n_sub, sub)#

Form and factorize the preconditioner of a symmetric subset of the rows and columns of \(A\).

Parameters:

A_neint

holds the number of entries in the lower triangular part of the matrix \(A\).

A_valndarray(A_ne)

holds the values of the nonzeros in the lower triangle of the matrix \(A\) in the same order as specified in the sparsity pattern in

n_subint

holds the number of rows (and columns) of the required submatrix of \(A\).

subndarray(n_sub)

holds the indices of the rows of the required submatrix of \(A\).

psls.update_preconditioner(a_ne, A_val, n_del, del)#

Update the preconditioner \(P\) when rows (and columns) are removed.

Parameters:

a_neint

holds the number of entries in the lower triangular part of the matrix \(A\).

A_valndarray(a_ne)

holds the values of the nonzeros in the lower triangle of the matrix \(A\) in the same order as specified in the sparsity pattern in

n_delint

holds the number of rows (and columns) that will be removed from \(A\).

delndarray(n_del)

holds the indices of the rows that will be removed from \(A\).

psls.apply_preconditioner(n, b)#

Solve the system \(Px=b\)

Parameters:

nint

holds the dimension of the system, \(n\). holds the number of variables.

bndarray(n)

holds the values of the right-hand side vector \(b\). Any component corresponding to rows/columns not in the initial subset recorded by psls.form_subset_preconditioner, or in those subsequently deleted by psls_update_preconditioner, will not be altered.

Returns:

xndarray(n)

holds the values of the solution \(x\) after a successful call. Any component corresponding to rows/columns not in the initial subset recorded by psls.form_subset_preconditioner, or in those subsequently deleted by psls_update_preconditioner, will be zero.

[optional] psls.information()

Provide optional output information

Returns:

informdict
dictionary containing output information:
statusint

reported return status. Possible values are

  • 0

    The run was successful.

  • -1

    An allocation error occurred. A message indicating the offending array is written on unit options[‘error’], and the returned allocation status and a string containing the name of the offending array are held in inform[‘alloc_status’] and inform[‘bad_alloc’] respectively.

  • -2

    A deallocation error occurred. A message indicating the offending array is written on unit options[‘error’] and the returned allocation status and a string containing the name of the offending array are held in inform[‘alloc_status’] and inform[‘bad_alloc’] respectively.

  • -3

    The restriction n > 0 or requirement that type contains its relevant string ‘dense’, ‘coordinate’, ‘sparse_by_rows’, ‘diagonal’, ‘scaled_identity’, ‘identity’, ‘zero’ or ‘none’ has been violated.

  • -9

    The analysis phase of the factorization failed; the return status from the factorization package is given by inform[‘factor_status’].

  • -10

    The factorization failed; the return status from the factorization package is given by inform[‘factor_status’].

  • -20

    The matrix \(A\) is not positive definite while the factorization solver used expected it to be.

  • -26

    The requested factorization solver is unavailable.

  • -29

    A requested option is unavailable.

  • -45

    The requested preconditioner is unavailable.

  • -80

    An error occurred when calling HSL MI28. See mi28 info%stat for more details.

alloc_statusint

the status of the last attempted allocation/deallocation.

bad_allocstr

the name of the array for which an allocation/deallocation error occurred.

analyse_statusint

status return from factorization.

factorize_statusint

status return from factorization.

solve_statusint

status return from solution phase.

factorization_integerlong

number of integer words to hold factors.

factorization_reallong

number of real words to hold factors.

preconditionerint

code for the actual preconditioner used (see control.preconditioner).

semi_bandwidthint

the actual semi-bandwidth.

reordered_semi_bandwidthint

the semi-bandwidth following reordering (if any).

out_of_rangeint

number of indices out-of-range.

duplicatesint

number of duplicates.

upperint

number of entries from the strict upper triangle.

missing_diagonalsint

number of missing diagonal entries for an allegedly-definite matrix.

semi_bandwidth_usedint

the semi-bandwidth used.

neg1int

number of 1 by 1 pivots in the factorization.

neg2int

number of 2 by 2 pivots in the factorization.

perturbedbool

has the preconditioner been perturbed during the fctorization?.

fill_in_ratiofloat

ratio of fill in to original nonzeros.

norm_residualfloat

the norm of the solution residual.

mc61_infoint

the integer and real output arrays from MC61.

mc61_rinfofloat

see mc61_info.

timedict
dictionary containing timing information:
totalfloat

total time.

assemblefloat

time to assemble the preconditioner prior to factorization.

analysefloat

time for the analysis phase.

factorizefloat

time for the factorization phase.

solvefloat

time for the linear solution phase.

updatefloat

time to update the factorization.

clock_totalfloat

total clock time spent in the package.

clock_assemblefloat

clock time to assemble the preconditioner prior to factorization.

clock_analysefloat

clock time for the analysis phase.

clock_factorizefloat

clock time for the factorization phase.

clock_solvefloat

clock time for the linear solution phase.

clock_updatefloat

clock time to update the factorization.

sls_informdict

inform parameters for SLS (see sls.information).

mi28_infodict

info parameters for HSL_MI28 (see mi28.information).

psls.terminate()#

Deallocate all internal private storage.

example code#

from galahad import psls
import numpy as np
np.set_printoptions(precision=4,suppress=True,floatmode='fixed')
print("\n** python test: psls")

#  describe problem (only the lower triangle of matrix is required)
#  ( 1     )     ( 1 )
#  (   2 1 ) x = ( 3 )
#  (   1 3 )     ( 4 )

n = 3
A_type = 'coordinate'
A_ne = 4
A_row = np.array([0,1,2,2])
A_col = np.array([0,1,1,2])
A_ptr = None
A_val = np.array([1.0,2.0,1.0,3.0])
b = np.array([1.0,3.0,4.0])

# allocate internal data and set default options
options = psls.initialize()

# set some non-default options
options['preconditioner'] = 6
#print("options:", options)

# load data (and optionally non-default options)
psls.load(n, A_type, A_ne, A_row, A_col, A_ptr, options)

# form the preconditioner
psls.form_preconditioner(A_ne, A_val)

# apply the preconditioner
x = psls.apply_preconditioner(n, b)
print(" x:",x)

# get information
inform = psls.information()
print(" preconditioner:",inform['preconditioner'])
print('** psls exit status:', inform['status'])

# deallocate internal data

psls.terminate()

This example code is available in $GALAHAD/src/psls/Python/test_psls.py .