GALAHAD ULS package#

purpose#

The uls package solves dense or sparse unsymmetric systems of linear equations using variants of Gaussian elimination. Given a sparse matrix \(A = \{ a_{ij} \}_{m \times n}\), and an \(n\)-vector \(b\), this function solves the systems \(A x = b\) or \(A^T x = b\). Both square (\(m=n\)) and rectangular (\(m\neq n\)) matrices are handled; one of an infinite class of solutions for consistent systems will be returned whenever \(A\) is not of full rank.

The method provides a common interface to a variety of well-known solvers from HSL and elsewhere. Currently supported solvers include MA28/GLS and HSL_MA48 from {HSL}, as well as GETR from LAPACK. Note that, with the exception of he Netlib reference LAPACK code, the solvers themselves do not form part of this package and must be obtained/linked to separately. Dummy instances are provided for solvers that are unavailable. Also note that additional flexibility may be obtained by calling the solvers directly rather that via this package.

terminology#

The solvers used each produce an \(L U\) factorization of \(A\), where \(L\) and U are permuted lower and upper triangular matrices (respectively). It is convenient to write this factorization in the form

\[A = P_R L U P_C,\]
where \(P_R\) and \(P_C\) are row and column permutation matrices, respectively.

supported solvers#

The key features of the external solvers supported by uls are given in the following table:

External solver characteristics#

solver

factorization

out-of-core

parallelised

GLS/MA28

sparse

no

no

HSL_MA48

sparse

no

no

GETR

dense

no

with parallel LAPACK

method#

Variants of sparse Gaussian elimination are used. See Section 4 of $GALAHAD/doc/uls.pdf for a brief description of the method employed and other details.

The solver GLS is available as part of GALAHAD and relies on the HSL Archive package MA33. To obtain HSL Archive packages, see

The solver HSL_MA48 is part of HSL 2011. To obtain HSL 2011 packages, see

The solver GETR is available as S/DGETRF/S as part of LAPACK. Reference versions are provided by GALAHAD, but for good performance machined-tuned versions should be used.

The methods used are described in the user-documentation for

HSL 2011, A collection of Fortran codes for large-scale scientific computation (2011).

matrix storage#

The unsymmetric \(m\) by \(n\) matrix \(A\) may be presented and stored in a variety of convenient input formats.

Dense storage format: The matrix \(A\) is stored as a compact dense matrix by rows, that is, the values of the entries of each row in turn are stored in order within an appropriate real one-dimensional array. In this case, component \(n \ast i + j\) of the storage array A_val will hold the value \(A_{ij}\) for \(1 \leq i \leq m\), \(1 \leq j \leq n\). The string A_type = ‘dense’ should be specified.

Dense by columns storage format: The matrix \(A\) is stored as a compact dense matrix by columns, that is, the values of the entries of each column in turn are stored in order within an appropriate real one-dimensional array. In this case, component \(m \ast j + i\) of the storage array A_val will hold the value \(A_{ij}\) for \(1 \leq i \leq m\), \(1 \leq j \leq n\). The string A_type = ‘dense_by_columns’ should be specified.

Sparse co-ordinate storage format: Only the nonzero entries of the matrices are stored. For the \(l\)-th entry, \(1 \leq l \leq ne\), of \(A\), its row index i, column index j and value \(A_{ij}\), \(1 \leq i \leq m\), \(1 \leq j \leq n\), are stored as the \(l\)-th components of the integer arrays A_row and A_col and real array A_val, respectively, while the number of nonzeros is recorded as A_ne = \(ne\). The string A_type = ‘coordinate’should be specified.

Sparse row-wise storage format: Again only the nonzero entries are stored, but this time they are ordered so that those in row i appear directly before those in row i+1. For the i-th row of \(A\) the i-th component of the integer array A_ptr holds the position of the first entry in this row, while A_ptr(m+1) holds the total number of entries plus one. The column indices j, \(1 \leq j \leq n\), and values \(A_{ij}\) of the nonzero entries in the i-th row are stored in components l = A_ptr(i), \(\ldots\), A_ptr(i+1)-1, \(1 \leq i \leq m\), of the integer array A_col, and real array A_val, respectively. For sparse matrices, this scheme almost always requires less storage than its predecessor. The string A_type = ‘sparse_by_rows’ should be specified.

Sparse column-wise storage format: Once again only the nonzero entries are stored, but this time they are ordered so that those in column j appear directly before those in column j+1. For the j-th column of \(A\) the j-th component of the integer array A_ptr holds the position of the first entry in this column, while A_ptr(n+1) holds the total number of entries plus one. The row indices i, \(1 \leq i \leq m\), and values \(A_{ij}\) of the nonzero entries in the j-th columnsare stored in components l = A_ptr(j), \(\ldots\), A_ptr(j+1)-1, \(1 \leq j \leq n\), of the integer array A_row, and real array A_val, respectively. As before, for sparse matrices, this scheme almost always requires less storage than the co-ordinate format. The string A_type = ‘sparse_by_columns’ should be specified.

introduction to function calls#

To solve a given problem, functions from the uls package must be called in the following order:

  • uls_initialize - provide default control parameters and set up initial data structures

  • uls_read_specfile (optional) - override control values by reading replacement values from a file

  • uls_factorize_matrix - set up matrix data structures, analyse the structure to choose a suitable order for factorization, and then factorize the matrix \(A\)

  • uls_reset_control (optional) - possibly change control parameters if a sequence of problems are being solved

  • uls_solve_system - solve the linear system of equations \(Ax=b\) or \(A^Tx=b\)

  • uls_information (optional) - recover information about the solution and solution process

  • uls_terminate - deallocate data structures

See the examples section for illustrations of use.

parametric real type T#

Below, the symbol T refers to a parametric real type that may be Float32 (single precision) or Float64 (double precision). Calable functions as described are with T as Float64, but variants (with the additional suffix _s, e.g., uls_initialize_s) are available with T as Float32.

callable functions#

    function uls_initialize(solver, data, control, status)

Set default control values and initialize private data

Select solver, set default control values and initialize private data

Parameters:

solver

is a one-dimensional array of type Vararg{Cchar} that specifies the solver package that should be used to factorize the matrix \(A\). It should be one of ‘gls’, ‘ma28’, ‘ma48 or ‘getr’; lower or upper case variants are allowed.

data

holds private internal data

control

is a structure containing control information (see uls_control_type)

status

is a scalar variable of type Int32 that gives the exit status from the package. Possible values are:

  • 0

    The initialization was successful.

  • -26

    The requested solver is not available.

    function uls_read_specfile(control, specfile)

Read the content of a specification file, and assign values associated with given keywords to the corresponding control parameters. An in-depth discussion of specification files is available, and a detailed list of keywords with associated default values is provided in $GALAHAD/src/uls/ULS.template. See also Table 2.1 in the Fortran documentation provided in $GALAHAD/doc/uls.pdf for a list of how these keywords relate to the components of the control structure.

Parameters:

control

is a structure containing control information (see uls_control_type)

specfile

is a one-dimensional array of type Vararg{Cchar} that must give the name of the specification file

    function uls_factorize_matrix(control, data, status,
                                  m, n, type, ne, val, row, col, ptr)

Import matrix data into internal storage prior to solution, analyse the sparsity patern, and subsequently factorize the matrix

Parameters:

control

is a structure whose members provide control parameters for the remaining procedures (see uls_control_type)

data

holds private internal data

status

is a scalar variable of type Int32 that gives the exit status from the package.

Possible values are:

  • 0

    The import, analysis and factorization were conducted successfully.

  • -1

    An allocation error occurred. A message indicating the offending array is written on unit control.error, and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively.

  • -2

    A deallocation error occurred. A message indicating the offending array is written on unit control.error and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively.

  • -3

    The restrictions n > 0 and m> 0 or requirement that the matrix type must contain the relevant string ‘dense’, ‘coordinate’ or ‘sparse_by_rows has been violated.

  • -26

    The requested solver is not available.

  • -29

    This option is not available with this solver.

  • -32

    More than control.max integer factor size words of internal integer storage are required for in-core factorization.

  • -50

    A solver-specific error occurred; check the solver-specific information component of inform along with the solver’s documentation for more details.

m

is a scalar variable of type Int32 that holds the number of rows in the unsymmetric matrix \(A\).

n

is a scalar variable of type Int32 that holds the number of columns in the unsymmetric matrix \(A\).

type

is a one-dimensional array of type Vararg{Cchar} that specifies the unsymmetric storage scheme used for the matrix \(A\). It should be one of ‘coordinate’, ‘sparse_by_rows’ or ‘dense’; lower or upper case variants are allowed.

ne

is a scalar variable of type Int32 that holds the number of entries in \(A\) in the sparse co-ordinate storage scheme. It need not be set for any of the other schemes.

val

is a one-dimensional array of size ne and type T, that holds the values of the entries of the matrix \(A\) in any of the supported storage schemes.

row

is a one-dimensional array of size ne and type Int32 that holds the row indices of the matrix \(A\) in the sparse co-ordinate storage scheme. It need not be set for any of the other three schemes, and in this case can be C_NULL.

col

is a one-dimensional array of size ne and type Int32 that holds the column indices of the matrix \(A\) in either the sparse co-ordinate, or the sparse row-wise storage scheme. It need not be set when the dense storage schemes is used, and in this case can be C_NULL.

ptr

is a one-dimensional array of size m+1 and type Int32 that holds the starting position of each row of the matrix \(A\), as well as the total number of entries, in the sparse row-wise storage scheme. It need not be set when the other schemes are used, and in this case can be C_NULL.

    function uls_reset_control(control, data, status)

Reset control parameters after import if required.

Parameters:

control

is a structure whose members provide control parameters for the remaining procedures (see uls_control_type)

data

holds private internal data

status

is a scalar variable of type Int32 that gives the exit status from the package. Possible values are:

  • 0

    The import was successful.

    function uls_solve_system(data, status, m, n, sol, trans)

Solve the linear system \(Ax=b\) or \(A^Tx=b\).

Parameters:

data

holds private internal data

status

is a scalar variable of type Int32 that gives the exit status from the package.

Possible values are:

  • 0

    The required solution was obtained.

  • -1

    An allocation error occurred. A message indicating the offending array is written on unit control.error, and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively.

  • -2

    A deallocation error occurred. A message indicating the offending array is written on unit control.error and the returned allocation status and a string containing the name of the offending array are held in inform.alloc_status and inform.bad_alloc respectively.

  • -34

    The package PARDISO failed; check the solver-specific information components inform.pardiso iparm and inform.pardiso_dparm along with PARDISO’s documentation for more details.

  • -35

    The package WSMP failed; check the solver-specific information components inform.wsmp_iparm and inform.wsmp dparm along with WSMP’s documentation for more details.

m

is a scalar variable of type Int32 that holds the number of rows in the unsymmetric matrix \(A\).

n

is a scalar variable of type Int32 that holds the number of columns in the unsymmetric matrix \(A\).

sol

is a one-dimensional array of size n and type T. On entry, it must hold the vector \(b\). On a successful exit, its contains the solution \(x\).

trans

is a scalar variable of type Bool, that specifies whether to solve the equation \(A^Tx=b\) (trans=true) or \(Ax=b\) (trans=false).

    function uls_information(data, inform, status)

Provides output information

Parameters:

data

holds private internal data

inform

is a structure containing output information (see uls_inform_type)

status

is a scalar variable of type Int32 that gives the exit status from the package. Possible values are (currently):

  • 0

    The values were recorded successfully

    function uls_terminate(data, control, inform)

Deallocate all internal private storage

Parameters:

data

holds private internal data

control

is a structure containing control information (see uls_control_type)

inform

is a structure containing output information (see uls_inform_type)

available structures#

uls_control_type structure#

    struct uls_control_type{T}
      f_indexing::Bool
      error::Int32
      warning::Int32
      out::Int32
      print_level::Int32
      print_level_solver::Int32
      initial_fill_in_factor::Int32
      min_real_factor_size::Int32
      min_integer_factor_size::Int32
      max_factor_size::Int64
      blas_block_size_factorize::Int32
      blas_block_size_solve::Int32
      pivot_control::Int32
      pivot_search_limit::Int32
      minimum_size_for_btf::Int32
      max_iterative_refinements::Int32
      stop_if_singular::Bool
      array_increase_factor::T
      switch_to_full_code_density::T
      array_decrease_factor::T
      relative_pivot_tolerance::T
      absolute_pivot_tolerance::T
      zero_tolerance::T
      acceptable_residual_relative::T
      acceptable_residual_absolute::T
      prefix::NTuple{31,Cchar}

detailed documentation#

control derived type as a Julia structure

components#

Bool f_indexing

use C or Fortran sparse matrix indexing

Int32 error

unit for error messages

Int32 warning

unit for warning messages

Int32 out

unit for monitor output

Int32 print_level

controls level of diagnostic output

Int32 print_level_solver

controls level of diagnostic output from external solver

Int32 initial_fill_in_factor

prediction of factor by which the fill-in will exceed the initial number of nonzeros in \(A\)

Int32 min_real_factor_size

initial size for real array for the factors and other data

Int32 min_integer_factor_size

initial size for integer array for the factors and other data

Int64 max_factor_size

maximum size for real array for the factors and other data

Int32 blas_block_size_factorize

level 3 blocking in factorize

Int32 blas_block_size_solve

level 2 and 3 blocking in solve

Int32 pivot_control

pivot control:

  • 1 Threshold Partial Pivoting is desired

  • 2 Threshold Rook Pivoting is desired

  • 3 Threshold Complete Pivoting is desired

  • 4 Threshold Symmetric Pivoting is desired

  • 5 Threshold Diagonal Pivoting is desired

Int32 pivot_search_limit

number of rows/columns pivot selection restricted to (0 = no restriction)

Int32 minimum_size_for_btf

the minimum permitted size of blocks within the block-triangular form

Int32 max_iterative_refinements

maximum number of iterative refinements allowed

Bool stop_if_singular

stop if the matrix is found to be structurally singular

T array_increase_factor

factor by which arrays sizes are to be increased if they are too small

T switch_to_full_code_density

switch to full code when the density exceeds this factor

T array_decrease_factor

if previously allocated internal workspace arrays are greater than array_decrease_factor times the currently required sizes, they are reset to current requirements

T relative_pivot_tolerance

pivot threshold

T absolute_pivot_tolerance

any pivot small than this is considered zero

T zero_tolerance

any entry smaller than this in modulus is reset to zero

T acceptable_residual_relative

refinement will cease as soon as the residual \(\|Ax-b\|\) falls below max( acceptable_residual_relative \* \(\|b\|\), acceptable_residual_absolute )

T acceptable_residual_absolute

see acceptable_residual_relative

NTuple{31,Cchar} prefix

all output lines will be prefixed by prefix(2:LEN(TRIM(.prefix))-1) where prefix contains the required string enclosed in quotes, e.g. “string” or ‘string’

uls_inform_type structure#

    struct uls_inform_type{T}
      status::Int32
      alloc_status::Int32
      bad_alloc::NTuple{81,Cchar}
      more_info::Int32
      out_of_range::Int64
      duplicates::Int64
      entries_dropped::Int64
      workspace_factors::Int64
      compresses::Int32
      entries_in_factors::Int64
      rank::Int32
      structural_rank::Int32
      pivot_control::Int32
      iterative_refinements::Int32
      alternative::Bool
      solver::NTuple{21,Cchar}
      gls_ainfo_type::gls_ainfo_type{T}
      gls_finfo_type::gls_finfo_type{T}
      gls_sinfo_type::gls_sinfo_type
      ma48_ainfo::ma48_ainfo{T}
      ma48_finfo::ma48_finfo{T}
      ma48_sinfo::ma48_sinfo
      lapack_error::Int32

detailed documentation#

inform derived type as a Julia structure

components#

Int32 status

reported return status:

  • 0

    success

  • -1

    allocation error

  • -2

    deallocation error

  • -3

    matrix data faulty (m < 1, n < 1, ne < 0)

  • -26

    unknown solver

  • -29

    unavailable option

  • -31

    input order is not a permutation or is faulty in some other way

  • -32

    error with integer workspace

  • -33

    error with real workspace

  • -50

    solver-specific error; see the solver’s info parameter

Int32 alloc_status

STAT value after allocate failure.

NTuple{81,Cchar} bad_alloc

name of array which provoked an allocate failure

Int32 more_info

further information on failure

Int64 out_of_range

number of indices out-of-range

Int64 duplicates

number of duplicates

Int64 entries_dropped

number of entries dropped during the factorization

Int64 workspace_factors

predicted or actual number of reals and integers to hold factors

Int32 compresses

number of compresses of data required

Int64 entries_in_factors

number of entries in factors

Int32 rank

estimated rank of the matrix

Int32 structural_rank

structural rank of the matrix

Int32 pivot_control

pivot control:

  • 1

    Threshold Partial Pivoting has been used

  • 2

    Threshold Rook Pivoting has been used

  • 3

    Threshold Complete Pivoting has been desired

  • 4

    Threshold Symmetric Pivoting has been desired

  • 5

    Threshold Diagonal Pivoting has been desired

Int32 iterative_refinements

number of iterative refinements performed

Bool alternative

has an “alternative” y: A^T y = 0 and yT b > 0 been found when trying to solve A x = b ?

char solver[21]

name of external solver used to factorize and solve

struct gls_ainfo gls_ainfo

the analyse output structure from gls

struct gls_finfo gls_finfo

the factorize output structure from gls

struct gls_sinfo gls_sinfo

the solve output structure from gls

struct ma48_ainfo ma48_ainfo

the analyse output structure from hsl_ma48

struct ma48_finfo ma48_finfo

the factorize output structure from hsl_ma48

struct ma48_sinfo ma48_sinfo

the solve output structure from hsl_ma48

Int32 lapack_error

the LAPACK error return code

example calls#

This is an example of how to use the package to solve a linear system; the code is available in $GALAHAD/src/uls/Python/test_uls.jl . A variety of supported Hessian and constraint matrix storage formats are shown.

# test_uls.jl
# Simple code to test the Julia interface to ULS

using GALAHAD
using Test
using Printf
using Accessors

function test_uls()
  maxabsarray(a) = maximum(abs.(a))

  # Derived types
  data = Ref{Ptr{Cvoid}}()
  control = Ref{uls_control_type{Float64}}()
  inform = Ref{uls_inform_type{Float64}}()

  # Set problem data
  m = 5 # column dimension of A
  n = 5 # column dimension of A
  ne = 7 # number of entries of A
  dense_ne = 25 # number of elements of A as a dense matrix
  row = Cint[1, 2, 2, 3, 3, 4, 5]  # row indices
  col = Cint[1, 1, 5, 2, 3, 3, 4]  # column indices
  ptr = Cint[1, 2, 4, 6, 7, 8]  # pointers to indices
  val = Float64[2.0, 3.0, 6.0, 4.0, 1.0, 5.0, 1.0]  # values
  dense = Float64[2.0, 0.0, 0.0, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0, 6.0, 0.0, 4.0, 1.0, 0.0, 0.0,
                  0.0, 0.0, 5.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0]
  rhs = Float64[2.0, 33.0, 11.0, 15.0, 4.0]
  rhst = Float64[8.0, 12.0, 23.0, 5.0, 12.0]
  sol = Float64[1.0, 2.0, 3.0, 4.0, 5.0]
  status = Ref{Cint}()
  x = zeros(Float64, n)
  error = zeros(Float64, n)
  good_x = eps(Float64)^(1 / 3)

  @printf(" Fortran sparse matrix indexing\n\n")
  @printf(" basic tests of storage formats\n\n")
  @printf(" storage  RHS   refine   RHST  refine\n")

  for d in 1:3
    # Initialize ULS - use the gls solver
    uls_initialize("getr", data, control, status)

    # Set user-defined control options
    @reset control[].f_indexing = true # Fortran sparse matrix indexing

    # sparse co-ordinate storage
    if d == 1
      @printf(" coordinate ")
      uls_factorize_matrix(control, data, status, m, n,
                           "coordinate", ne, val, row, col, C_NULL)
    end

    # sparse by rows
    if d == 2
      @printf(" sparse by rows ")
      uls_factorize_matrix(control, data, status, m, n,
                           "sparse_by_rows", ne, val, C_NULL, col, ptr)
    end

    # dense
    if d == 3
      @printf(" dense  ")
      uls_factorize_matrix(control, data, status, m, n,
                           "dense", dense_ne, dense, C_NULL, C_NULL, C_NULL)
    end

    # Set right-hand side and solve the system A x = b
    for i in 1:n
      x[i] = rhs[i]
    end

    trans = false
    uls_solve_system(data, status, m, n, x, trans)
    uls_information(data, inform, status)

    if inform[].status == 0
      for i in 1:n
        error[i] = x[i] - sol[i]
      end
      norm_residual = maxabsarray(error)
      if norm_residual < good_x
        @printf("   ok  ")
      else
        @printf("  fail ")
      end
    else
      @printf(" ULS_solve exit status = %1i\n", inform[].status)
    end

    # @printf("sol: ")
    # for i = 1:n
    #   @printf("%f ", x[i])
    # end

    # resolve, this time using iterative refinement
    @reset control[].max_iterative_refinements = Cint(1)
    uls_reset_control(control, data, status)

    for i in 1:n
      x[i] = rhs[i]
    end

    uls_solve_system(data, status, m, n, x, trans)
    uls_information(data, inform, status)

    if inform[].status == 0
      for i in 1:n
        error[i] = x[i] - sol[i]
      end
      norm_residual = maxabsarray(error)
      if norm_residual < good_x
        @printf("ok  ")
      else
        @printf("   fail ")
      end
    else
      @printf(" ULS_solve exit status = %1i\n", inform[].status)
    end

    # Set right-hand side and solve the system A^T x = b
    for i in 1:n
      x[i] = rhst[i]
    end

    trans = true
    uls_solve_system(data, status, m, n, x, trans)
    uls_information(data, inform, status)

    if inform[].status == 0
      for i in 1:n
        error[i] = x[i] - sol[i]
      end
      norm_residual = maxabsarray(error)
      if norm_residual < good_x
        @printf("   ok  ")
      else
        @printf("  fail ")
      end
    else
      @printf(" ULS_solve exit status = %1i\n", inform[].status)
    end

    # @printf("sol: ")
    # for i = 1:n
    #   @printf("%f ", x[i])
    # end

    # resolve, this time using iterative refinement
    @reset control[].max_iterative_refinements = Cint(1)

    uls_reset_control(control, data, status)
    for i in 1:n
      x[i] = rhst[i]
    end

    uls_solve_system(data, status, m, n, x, trans)
    uls_information(data, inform, status)

    if inform[].status == 0
      for i in 1:n
        error[i] = x[i] - sol[i]
      end
      norm_residual = maxabsarray(error)
      if norm_residual < good_x
        @printf("ok  ")
      else
        @printf("   fail ")
      end
    else
      @printf(" ULS_solve exit status = %1i\n", inform[].status)
    end

    # Delete internal workspace
    uls_terminate(data, control, inform)
    @printf("\n")
  end

  return 0
end

@testset "ULS" begin
  @test test_uls() == 0
end