GALAHAD ULS package#
purpose#
The uls
package
solves dense or sparse unsymmetric systems of linear equations
using variants of Gaussian elimination.
Given a sparse matrix \(A = \{ a_{ij} \}_{m \times n}\), and an
\(n\)-vector \(b\), this function solves the systems \(A x = b\) or \(A^T x = b\).
Both square (\(m=n\)) and rectangular (\(m\neq n\)) matrices are handled;
one of an infinite class of solutions for consistent systems will be returned
whenever \(A\) is not of full rank.
The method provides a common interface to a variety of well-known
solvers from HSL and elsewhere. Currently supported solvers include
MA28/GLS
and HSL_MA48
from {HSL},
as well as GETR
from LAPACK.
Note that, with the exception of he Netlib reference LAPACK code,
the solvers themselves do not form part of this package and
must be obtained/linked to separately.
Dummy instances are provided for solvers that are unavailable.
Also note that additional flexibility may be obtained by calling the
solvers directly rather that via this package.
terminology#
The solvers used each produce an \(L U\) factorization of \(A\), where \(L\) and U are permuted lower and upper triangular matrices (respectively). It is convenient to write this factorization in the form
supported solvers#
The key features of the external solvers supported by uls
are
given in the following table:
solver |
factorization |
out-of-core |
parallelised |
---|---|---|---|
|
sparse |
no |
no |
|
sparse |
no |
no |
|
dense |
no |
with parallel LAPACK |
method#
Variants of sparse Gaussian elimination are used. See Section 4 of $GALAHAD/doc/uls.pdf for a brief description of the method employed and other details.
The solver GLS
is available as part of GALAHAD and relies on
the HSL Archive package MA33
. To obtain HSL Archive packages, see
The solver HSL_MA48
is part of HSL 2011.
To obtain HSL 2011 packages, see
The solver GETR
is available as S/DGETRF/S
as part of LAPACK. Reference versions
are provided by GALAHAD, but for good performance
machined-tuned versions should be used.
The methods used are described in the user-documentation for
HSL 2011, A collection of Fortran codes for large-scale scientific computation (2011).
matrix storage#
The unsymmetric \(m\) by \(n\) matrix \(A\) may be presented and stored in a variety of convenient input formats.
Dense storage format: The matrix \(A\) is stored as a compact dense matrix by rows, that is, the values of the entries of each row in turn are stored in order within an appropriate real one-dimensional array. In this case, component \(n \ast i + j\) of the storage array A_val will hold the value \(A_{ij}\) for \(1 \leq i \leq m\), \(1 \leq j \leq n\). The string A_type = ‘dense’ should be specified.
Dense by columns storage format: The matrix \(A\) is stored as a compact dense matrix by columns, that is, the values of the entries of each column in turn are stored in order within an appropriate real one-dimensional array. In this case, component \(m \ast j + i\) of the storage array A_val will hold the value \(A_{ij}\) for \(1 \leq i \leq m\), \(1 \leq j \leq n\). The string A_type = ‘dense_by_columns’ should be specified.
Sparse co-ordinate storage format: Only the nonzero entries of the matrices are stored. For the \(l\)-th entry, \(1 \leq l \leq ne\), of \(A\), its row index i, column index j and value \(A_{ij}\), \(1 \leq i \leq m\), \(1 \leq j \leq n\), are stored as the \(l\)-th components of the integer arrays A_row and A_col and real array A_val, respectively, while the number of nonzeros is recorded as A_ne = \(ne\). The string A_type = ‘coordinate’should be specified.
Sparse row-wise storage format: Again only the nonzero entries are stored, but this time they are ordered so that those in row i appear directly before those in row i+1. For the i-th row of \(A\) the i-th component of the integer array A_ptr holds the position of the first entry in this row, while A_ptr(m+1) holds the total number of entries plus one. The column indices j, \(1 \leq j \leq n\), and values \(A_{ij}\) of the nonzero entries in the i-th row are stored in components l = A_ptr(i), \(\ldots\), A_ptr(i+1)-1, \(1 \leq i \leq m\), of the integer array A_col, and real array A_val, respectively. For sparse matrices, this scheme almost always requires less storage than its predecessor. The string A_type = ‘sparse_by_rows’ should be specified.
Sparse column-wise storage format: Once again only the nonzero entries are stored, but this time they are ordered so that those in column j appear directly before those in column j+1. For the j-th column of \(A\) the j-th component of the integer array A_ptr holds the position of the first entry in this column, while A_ptr(n+1) holds the total number of entries plus one. The row indices i, \(1 \leq i \leq m\), and values \(A_{ij}\) of the nonzero entries in the j-th columnsare stored in components l = A_ptr(j), \(\ldots\), A_ptr(j+1)-1, \(1 \leq j \leq n\), of the integer array A_row, and real array A_val, respectively. As before, for sparse matrices, this scheme almost always requires less storage than the co-ordinate format. The string A_type = ‘sparse_by_columns’ should be specified.
introduction to function calls#
To solve a given problem, functions from the uls package must be called in the following order:
uls_initialize - provide default control parameters and set up initial data structures
uls_read_specfile (optional) - override control values by reading replacement values from a file
uls_factorize_matrix - set up matrix data structures, analyse the structure to choose a suitable order for factorization, and then factorize the matrix \(A\)
uls_reset_control (optional) - possibly change control parameters if a sequence of problems are being solved
uls_solve_system - solve the linear system of equations \(Ax=b\) or \(A^Tx=b\)
uls_information (optional) - recover information about the solution and solution process
uls_terminate - deallocate data structures
See the examples section for illustrations of use.
parametric real type T#
Below, the symbol T refers to a parametric real type that may be Float32 (single precision), Float64 (double precision) or, if supported, Float128 (quadruple precision).
callable functions#
function uls_initialize(T, solver, data, control, status)
Set default control values and initialize private data
Select solver, set default control values and initialize private data
Parameters:
solver |
is a one-dimensional array of type Vararg{Cchar} that specifies the solver package that should be used to factorize the matrix \(A\). It should be one of ‘gls’, ‘ma28’, ‘ma48 or ‘getr’; lower or upper case variants are allowed. |
data |
holds private internal data |
control |
is a structure containing control information (see uls_control_type) |
status |
is a scalar variable of type Int32 that gives the exit status from the package. Possible values are:
|
function uls_read_specfile(T, control, specfile)
Read the content of a specification file, and assign values associated with given keywords to the corresponding control parameters. An in-depth discussion of specification files is available, and a detailed list of keywords with associated default values is provided in $GALAHAD/src/uls/ULS.template. See also Table 2.1 in the Fortran documentation provided in $GALAHAD/doc/uls.pdf for a list of how these keywords relate to the components of the control structure.
Parameters:
control |
is a structure containing control information (see uls_control_type) |
specfile |
is a one-dimensional array of type Vararg{Cchar} that must give the name of the specification file |
function uls_factorize_matrix(T, control, data, status, m, n, type, ne, val, row, col, ptr)
Import matrix data into internal storage prior to solution, analyse the sparsity patern, and subsequently factorize the matrix
Parameters:
control |
is a structure whose members provide control parameters for the remaining procedures (see uls_control_type) |
data |
holds private internal data |
status |
is a scalar variable of type Int32 that gives the exit status from the package. Possible values are:
|
m |
is a scalar variable of type Int32 that holds the number of rows in the unsymmetric matrix \(A\). |
n |
is a scalar variable of type Int32 that holds the number of columns in the unsymmetric matrix \(A\). |
type |
is a one-dimensional array of type Vararg{Cchar} that specifies the unsymmetric storage scheme used for the matrix \(A\). It should be one of ‘coordinate’, ‘sparse_by_rows’ or ‘dense’; lower or upper case variants are allowed. |
ne |
is a scalar variable of type Int32 that holds the number of entries in \(A\) in the sparse co-ordinate storage scheme. It need not be set for any of the other schemes. |
val |
is a one-dimensional array of size ne and type T, that holds the values of the entries of the matrix \(A\) in any of the supported storage schemes. |
row |
is a one-dimensional array of size ne and type Int32 that holds the row indices of the matrix \(A\) in the sparse co-ordinate storage scheme. It need not be set for any of the other three schemes, and in this case can be C_NULL. |
col |
is a one-dimensional array of size ne and type Int32 that holds the column indices of the matrix \(A\) in either the sparse co-ordinate, or the sparse row-wise storage scheme. It need not be set when the dense storage schemes is used, and in this case can be C_NULL. |
ptr |
is a one-dimensional array of size m+1 and type Int32 that holds the starting position of each row of the matrix \(A\), as well as the total number of entries, in the sparse row-wise storage scheme. It need not be set when the other schemes are used, and in this case can be C_NULL. |
function uls_reset_control(T, control, data, status)
Reset control parameters after import if required.
Parameters:
control |
is a structure whose members provide control parameters for the remaining procedures (see uls_control_type) |
data |
holds private internal data |
status |
is a scalar variable of type Int32 that gives the exit status from the package. Possible values are:
|
function uls_solve_system(T, data, status, m, n, sol, trans)
Solve the linear system \(Ax=b\) or \(A^Tx=b\).
Parameters:
data |
holds private internal data |
status |
is a scalar variable of type Int32 that gives the exit status from the package. Possible values are:
|
m |
is a scalar variable of type Int32 that holds the number of rows in the unsymmetric matrix \(A\). |
n |
is a scalar variable of type Int32 that holds the number of columns in the unsymmetric matrix \(A\). |
sol |
is a one-dimensional array of size n and type T. On entry, it must hold the vector \(b\). On a successful exit, its contains the solution \(x\). |
trans |
is a scalar variable of type Bool, that specifies whether to solve the equation \(A^Tx=b\) (trans=true) or \(Ax=b\) (trans=false). |
function uls_information(T, data, inform, status)
Provides output information
Parameters:
data |
holds private internal data |
inform |
is a structure containing output information (see uls_inform_type) |
status |
is a scalar variable of type Int32 that gives the exit status from the package. Possible values are (currently):
|
function uls_terminate(T, data, control, inform)
Deallocate all internal private storage
Parameters:
data |
holds private internal data |
control |
is a structure containing control information (see uls_control_type) |
inform |
is a structure containing output information (see uls_inform_type) |
available structures#
uls_control_type structure#
struct uls_control_type{T} f_indexing::Bool error::Int32 warning::Int32 out::Int32 print_level::Int32 print_level_solver::Int32 initial_fill_in_factor::Int32 min_real_factor_size::Int32 min_integer_factor_size::Int32 max_factor_size::Int64 blas_block_size_factorize::Int32 blas_block_size_solve::Int32 pivot_control::Int32 pivot_search_limit::Int32 minimum_size_for_btf::Int32 max_iterative_refinements::Int32 stop_if_singular::Bool array_increase_factor::T switch_to_full_code_density::T array_decrease_factor::T relative_pivot_tolerance::T absolute_pivot_tolerance::T zero_tolerance::T acceptable_residual_relative::T acceptable_residual_absolute::T prefix::NTuple{31,Cchar}
detailed documentation#
control derived type as a Julia structure
components#
Bool f_indexing
use C or Fortran sparse matrix indexing
Int32 error
unit for error messages
Int32 warning
unit for warning messages
Int32 out
unit for monitor output
Int32 print_level
controls level of diagnostic output
Int32 print_level_solver
controls level of diagnostic output from external solver
Int32 initial_fill_in_factor
prediction of factor by which the fill-in will exceed the initial number of nonzeros in \(A\)
Int32 min_real_factor_size
initial size for real array for the factors and other data
Int32 min_integer_factor_size
initial size for integer array for the factors and other data
Int64 max_factor_size
maximum size for real array for the factors and other data
Int32 blas_block_size_factorize
level 3 blocking in factorize
Int32 blas_block_size_solve
level 2 and 3 blocking in solve
Int32 pivot_control
pivot control:
1 Threshold Partial Pivoting is desired
2 Threshold Rook Pivoting is desired
3 Threshold Complete Pivoting is desired
4 Threshold Symmetric Pivoting is desired
5 Threshold Diagonal Pivoting is desired
Int32 pivot_search_limit
number of rows/columns pivot selection restricted to (0 = no restriction)
Int32 minimum_size_for_btf
the minimum permitted size of blocks within the block-triangular form
Int32 max_iterative_refinements
maximum number of iterative refinements allowed
Bool stop_if_singular
stop if the matrix is found to be structurally singular
T array_increase_factor
factor by which arrays sizes are to be increased if they are too small
T switch_to_full_code_density
switch to full code when the density exceeds this factor
T array_decrease_factor
if previously allocated internal workspace arrays are greater than array_decrease_factor times the currently required sizes, they are reset to current requirements
T relative_pivot_tolerance
pivot threshold
T absolute_pivot_tolerance
any pivot small than this is considered zero
T zero_tolerance
any entry smaller than this in modulus is reset to zero
T acceptable_residual_relative
refinement will cease as soon as the residual \(\|Ax-b\|\) falls below max( acceptable_residual_relative \* \(\|b\|\), acceptable_residual_absolute )
T acceptable_residual_absolute
see acceptable_residual_relative
NTuple{31,Cchar} prefix
all output lines will be prefixed by prefix(2:LEN(TRIM(.prefix))-1) where prefix contains the required string enclosed in quotes, e.g. “string” or ‘string’
uls_inform_type structure#
struct uls_inform_type{T} status::Int32 alloc_status::Int32 bad_alloc::NTuple{81,Cchar} more_info::Int32 out_of_range::Int64 duplicates::Int64 entries_dropped::Int64 workspace_factors::Int64 compresses::Int32 entries_in_factors::Int64 rank::Int32 structural_rank::Int32 pivot_control::Int32 iterative_refinements::Int32 alternative::Bool solver::NTuple{21,Cchar} gls_ainfo_type::gls_ainfo_type{T} gls_finfo_type::gls_finfo_type{T} gls_sinfo_type::gls_sinfo_type ma48_ainfo::ma48_ainfo{T} ma48_finfo::ma48_finfo{T} ma48_sinfo::ma48_sinfo lapack_error::Int32
detailed documentation#
inform derived type as a Julia structure
components#
Int32 status
reported return status:
0
success
-1
allocation error
-2
deallocation error
-3
matrix data faulty (m < 1, n < 1, ne < 0)
-26
unknown solver
-29
unavailable option
-31
input order is not a permutation or is faulty in some other way
-32
error with integer workspace
-33
error with real workspace
-50
solver-specific error; see the solver’s info parameter
Int32 alloc_status
STAT value after allocate failure.
NTuple{81,Cchar} bad_alloc
name of array which provoked an allocate failure
Int32 more_info
further information on failure
Int64 out_of_range
number of indices out-of-range
Int64 duplicates
number of duplicates
Int64 entries_dropped
number of entries dropped during the factorization
Int64 workspace_factors
predicted or actual number of reals and integers to hold factors
Int32 compresses
number of compresses of data required
Int64 entries_in_factors
number of entries in factors
Int32 rank
estimated rank of the matrix
Int32 structural_rank
structural rank of the matrix
Int32 pivot_control
pivot control:
1
Threshold Partial Pivoting has been used
2
Threshold Rook Pivoting has been used
3
Threshold Complete Pivoting has been desired
4
Threshold Symmetric Pivoting has been desired
5
Threshold Diagonal Pivoting has been desired
Int32 iterative_refinements
number of iterative refinements performed
Bool alternative
has an “alternative” y: A^T y = 0 and yT b > 0 been found when trying to solve A x = b ?
char solver[21]
name of external solver used to factorize and solve
struct gls_ainfo gls_ainfo
the analyse output structure from gls
struct gls_finfo gls_finfo
the factorize output structure from gls
struct gls_sinfo gls_sinfo
the solve output structure from gls
struct ma48_ainfo ma48_ainfo
the analyse output structure from hsl_ma48
struct ma48_finfo ma48_finfo
the factorize output structure from hsl_ma48
struct ma48_sinfo ma48_sinfo
the solve output structure from hsl_ma48
Int32 lapack_error
the LAPACK error return code
example calls#
This is an example of how to use the package to solve a linear system; the code is available in $GALAHAD/src/uls/Python/test_uls.jl . A variety of supported Hessian and constraint matrix storage formats are shown.
# test_uls.jl
# Simple code to test the Julia interface to ULS
using GALAHAD
using Test
using Printf
using Accessors
using Quadmath
function test_uls(::Type{T}) where T
maxabsarray(a) = maximum(abs.(a))
# Derived types
data = Ref{Ptr{Cvoid}}()
control = Ref{uls_control_type{T}}()
inform = Ref{uls_inform_type{T}}()
# Set problem data
m = 5 # column dimension of A
n = 5 # column dimension of A
ne = 7 # number of entries of A
dense_ne = 25 # number of elements of A as a dense matrix
row = Cint[1, 2, 2, 3, 3, 4, 5] # row indices
col = Cint[1, 1, 5, 2, 3, 3, 4] # column indices
ptr = Cint[1, 2, 4, 6, 7, 8] # pointers to indices
val = T[2.0, 3.0, 6.0, 4.0, 1.0, 5.0, 1.0] # values
dense = T[2.0, 0.0, 0.0, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0, 6.0, 0.0, 4.0, 1.0, 0.0, 0.0,
0.0, 0.0, 5.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0]
rhs = T[2.0, 33.0, 11.0, 15.0, 4.0]
rhst = T[8.0, 12.0, 23.0, 5.0, 12.0]
sol = T[1.0, 2.0, 3.0, 4.0, 5.0]
status = Ref{Cint}()
x = zeros(T, n)
error = zeros(T, n)
good_x = eps(Float64)^(1 / 3)
@printf(" Fortran sparse matrix indexing\n\n")
@printf(" basic tests of storage formats\n\n")
@printf(" storage RHS refine RHST refine\n")
for d in 1:3
# Initialize ULS - use the gls solver
uls_initialize(T, "getr", data, control, status)
# Set user-defined control options
@reset control[].f_indexing = true # Fortran sparse matrix indexing
# sparse co-ordinate storage
if d == 1
@printf(" coordinate ")
uls_factorize_matrix(T, control, data, status, m, n,
"coordinate", ne, val, row, col, C_NULL)
end
# sparse by rows
if d == 2
@printf(" sparse by rows ")
uls_factorize_matrix(T, control, data, status, m, n,
"sparse_by_rows", ne, val, C_NULL, col, ptr)
end
# dense
if d == 3
@printf(" dense ")
uls_factorize_matrix(T, control, data, status, m, n,
"dense", dense_ne, dense, C_NULL, C_NULL, C_NULL)
end
# Set right-hand side and solve the system A x = b
for i in 1:n
x[i] = rhs[i]
end
trans = false
uls_solve_system(T, data, status, m, n, x, trans)
uls_information(T, data, inform, status)
if inform[].status == 0
for i in 1:n
error[i] = x[i] - sol[i]
end
norm_residual = maxabsarray(error)
if norm_residual < good_x
@printf(" ok ")
else
@printf(" fail ")
end
else
@printf(" ULS_solve exit status = %1i\n", inform[].status)
end
# @printf("sol: ")
# for i = 1:n
# @printf("%f ", x[i])
# end
# resolve, this time using iterative refinement
@reset control[].max_iterative_refinements = Cint(1)
uls_reset_control(T, control, data, status)
for i in 1:n
x[i] = rhs[i]
end
uls_solve_system(T, data, status, m, n, x, trans)
uls_information(T, data, inform, status)
if inform[].status == 0
for i in 1:n
error[i] = x[i] - sol[i]
end
norm_residual = maxabsarray(error)
if norm_residual < good_x
@printf("ok ")
else
@printf(" fail ")
end
else
@printf(" ULS_solve exit status = %1i\n", inform[].status)
end
# Set right-hand side and solve the system A^T x = b
for i in 1:n
x[i] = rhst[i]
end
trans = true
uls_solve_system(T, data, status, m, n, x, trans)
uls_information(T, data, inform, status)
if inform[].status == 0
for i in 1:n
error[i] = x[i] - sol[i]
end
norm_residual = maxabsarray(error)
if norm_residual < good_x
@printf(" ok ")
else
@printf(" fail ")
end
else
@printf(" ULS_solve exit status = %1i\n", inform[].status)
end
# @printf("sol: ")
# for i = 1:n
# @printf("%f ", x[i])
# end
# resolve, this time using iterative refinement
@reset control[].max_iterative_refinements = Cint(1)
uls_reset_control(T, control, data, status)
for i in 1:n
x[i] = rhst[i]
end
uls_solve_system(T, data, status, m, n, x, trans)
uls_information(T, data, inform, status)
if inform[].status == 0
for i in 1:n
error[i] = x[i] - sol[i]
end
norm_residual = maxabsarray(error)
if norm_residual < good_x
@printf("ok ")
else
@printf(" fail ")
end
else
@printf(" ULS_solve exit status = %1i\n", inform[].status)
end
# Delete internal workspace
uls_terminate(T, data, control, inform)
@printf("\n")
end
return 0
end
@testset "ULS" begin
@test test_uls(Float32) == 0
@test test_uls(Float64) == 0
@test test_uls(Float128) == 0
end