DPS#
purpose#
The dps
package constructs a symmetric, positive definite matrix \(M\)
from a given \(H\) so that \(H\) is is diagonal
in the norm \(\|v\|_M = \sqrt{v^T M v}\) induced by \(M\), and consequently
minimizers of trust-region and regularized quadratic subproblems
may be computed efficiently.
The aim is either to minimize the quadratic objective function
See Section 4 of $GALAHAD/doc/dps.pdf for additional details.
method#
The required solution \(x_*\) necessarily satisfies the optimality condition \(H x_* + \lambda_* M x_* + g = 0\), where \(\lambda_* \geq 0\) is a Lagrange multiplier that corresponds to the constraint \(\|x\|_M \leq \Delta\) in the trust-region case, and is given by \(\lambda_* = \sigma \|x_*\|^{p-2}\) for the regularization problem involve \(r(x)\). In addition \(H + \lambda_* M\) will be positive semi-definite; in most instances it will actually be positive definite, but in special “hard” cases singularity is a possibility.
The matrix \(H\) is decomposed as
SLS
. Here \(P\) is a permutation
matrix, \(L\) is unit lower triangular and \(D\) is block diagonal, with
blocks of dimension at most two. The spectral decomposition of each diagonal
block of \(D\) is computed, and each eigenvalue \(\theta\) is replaced by
\(\max ( | \theta | , \theta_{\min} ) \),
where \(\theta_{\min}\) is a positive user-supplied value. The resulting block
diagonal matrix is \(B\), from which we define the modified-absolute-value
Given the factors of \(H\) (and \(M\)), the required solution is found by making the change of variables \(y = B^{1/2} L^T P^T x\) (or \(y = L^T P^T x\) in the Goldfarb case) which results in ``diagonal’’ trust-region and regularization subproblems, whose solution may be easily obtained suing a Newton or higher-order iteration of a resulting “secular” equation. If subsequent problems, for which \(H\) and \(g\) are unchanged, are to be attempted, the existing factorization and solution may easily be exploited.
The dominant cost is that for the factorization of the symmetric, but
potentially indefinite, matrix \(H\) using the package SLS
.
references#
The method is described in detail for the trust-region case in
while the adaptation for the regularization case is obvious. The method used to solve the diagonal trust-region and regularization subproblems are as given by
H. S. Dollar, N. I. M. Gould and D. P. Robinson, ``On solving trust-region and other regularised subproblems in optimization’’. Mathematical Programming Computation 2(1) (2010) 21–57
with simplifications due to the diagonal Hessian.
matrix storage#
The symmetric \(n\) by \(n\) matrix \(H\) may be presented and stored in a variety of formats. But crucially symmetry is exploited by only storing values from the lower triangular part (i.e, those entries that lie on or below the leading diagonal).
Dense storage format: The matrix \(H\) is stored as a compact dense matrix by rows, that is, the values of the entries of each row in turn are stored in order within an appropriate real one-dimensional array. Since \(H\) is symmetric, only the lower triangular part (that is the part \(H_{ij}\) for \(0 \leq j \leq i \leq n-1\)) need be held. In this case the lower triangle should be stored by rows, that is component \(i * i / 2 + j\) of the storage array H_val will hold the value \(H_{ij}\) (and, by symmetry, \(H_{ji}\)) for \(0 \leq j \leq i \leq n-1\). The string H_type = ‘dense’ should be specified.
Sparse co-ordinate storage format: Only the nonzero entries of the matrices are stored. For the \(l\)-th entry, \(0 \leq l \leq ne-1\), of \(H\), its row index i, column index j and value \(H_{ij}\), \(0 \leq j \leq i \leq n-1\), are stored as the \(l\)-th components of the integer arrays H_row and H_col and real array H_val, respectively, while the number of nonzeros is recorded as H_ne = \(ne\). Note that only the entries in the lower triangle should be stored. The string H_type = ‘coordinate’ should be specified.
Sparse row-wise storage format: Again only the nonzero entries are stored, but this time they are ordered so that those in row i appear directly before those in row i+1. For the i-th row of \(H\) the i-th component of the integer array H_ptr holds the position of the first entry in this row, while H_ptr(n) holds the total number of entries. The column indices j, \(0 \leq j \leq i\), and values \(H_{ij}\) of the entries in the i-th row are stored in components l = H_ptr(i), …, H_ptr(i+1)-1 of the integer array H_col, and real array H_val, respectively. Note that as before only the entries in the lower triangle should be stored. For sparse matrices, this scheme almost always requires less storage than its predecessor. The string H_type = ‘sparse_by_rows’ should be specified.
Diagonal storage format: If \(H\) is diagonal (i.e., \(H_{ij} = 0\) for all \(0 \leq i \neq j \leq n-1\)) only the diagonals entries \(H_{ii}\), \(0 \leq i \leq n-1\) need be stored, and the first n components of the array H_val may be used for the purpose. The string H_type = ‘diagonal’ should be specified.
Multiples of the identity storage format: If \(H\) is a multiple of the identity matrix, (i.e., \(H = \alpha I\) where \(I\) is the n by n identity matrix and \(\alpha\) is a scalar), it suffices to store \(\alpha\) as the first component of H_val. The string H_type = ‘scaled_identity’ should be specified.
The identity matrix format: If \(H\) is the identity matrix, no values need be stored. The string H_type = ‘identity’ should be specified.
The zero matrix format: The same is true if \(H\) is the zero matrix, but now the string H_type = ‘zero’ or ‘none’ should be specified.
functions#
- dps.initialize()#
Set default option values and initialize private data
Returns:
- optionsdict
- dictionary containing default control options:
- errorint
error and warning diagnostics occur on stream error.
- outint
general output occurs on stream out.
- problemint
unit to write problem data into file problem_file.
- print_levelint
the level of output required is specified by print_level. Possible values are
<=0
gives no output,
1
gives a one-line summary for every iteration.
2
gives a summary of the inner iteration for each iteration.
>=3
gives increasingly verbose (debugging) output.
- new_hint
how much of \(H\) has changed since the previous call. Possible values are
0
unchanged.
1
values but not indices have changed.
2
values and indices have changed.
- taylor_max_degreeint
maximum degree of Taylor approximant allowed.
- eigen_minfloat
smallest allowable value of an eigenvalue of the block diagonal factor of \(H\).
- lowerfloat
lower and upper bounds on the multiplier, if known.
- upperfloat
see lower.
- stop_normalfloat
stop trust-region solution when \(| \|x\|_M - \Delta | \leq\) max(
stop_normal
* delta,stop_absolute_normal
).- stop_absolute_normalfloat
see stop_normal.
- goldfarbbool
use the Goldfarb variant of the trust-region/regularization norm rather than the modified absolute-value version.
- space_criticalbool
if
space_critical
is True, every effort will be made to use as little space as possible. This may result in longer computation time.- deallocate_error_fatalbool
if
deallocate_error_fatal
is True, any array/pointer deallocation error will terminate execution. Otherwise, computation will continue.- problem_filestr
name of file into which to write problem data.
- symmetric_linear_solverstr
symmetric (indefinite) linear equation solver.
- prefixstr
all output lines will be prefixed by the string contained in quotes within
prefix
, e.g. ‘word’ (note the qutoes) will result in the prefix word.- sls_optionsdict
default control options for SLS (see
sls.initialize
).
- dps.load(n, H_type, H_ne, H_row, H_col, H_ptr, options=None)#
Import problem data into internal storage prior to solution.
Parameters:
- nint
holds the number of variables.
- H_typestring
specifies the symmetric storage scheme used for the Hessian \(H\). It should be one of ‘coordinate’, ‘sparse_by_rows’, ‘dense’, ‘diagonal’, ‘scaled_identity’, ‘identity’, ‘zero’ or ‘none’; lower or upper case variants are allowed.
- H_neint
holds the number of entries in the lower triangular part of \(H\) in the sparse co-ordinate storage scheme. It need not be set for any of the other schemes.
- H_rowndarray(H_ne)
holds the row indices of the lower triangular part of \(H\) in the sparse co-ordinate storage scheme. It need not be set for any of the other schemes, and in this case can be None.
- H_colndarray(H_ne)
holds the column indices of the lower triangular part of \(H\) in either the sparse co-ordinate, or the sparse row-wise storage scheme. It need not be set when the other storage schemes are used, and in this case can be None.
- H_ptrndarray(n+1)
holds the starting position of each row of the lower triangular part of \(H\), as well as the total number of entries, in the sparse row-wise storage scheme. It need not be set when the other schemes are used, and in this case can be None.
- optionsdict, optional
dictionary of control options (see
dps.initialize
).
- dps.solve_tr_problem(n, radius, f, g, H_ne, H_val)#
Find the global moinimizer of the quadratic objective function \(q(x)\) within the trust-region.
Parameters:
- nint
holds the number of variables.
- radiusfloat
holds the strictly positive trust-region radius, \(\Delta\).
- ffloat
holds the constant term \(f\) in the objective function.
- gndarray(n)
holds the values of the linear term \(g\) in the objective function.
- H_neint
holds the number of entries in the lower triangular part of the Hessian \(H\).
- H_valndarray(H_ne)
holds the values of the nonzeros in the lower triangle of the Hessian \(H\) in the same order as specified in the sparsity pattern in
dps.load
.Returns:
- xndarray(n)
holds the values of the approximate minimizer \(x\) after a successful call.
- rqs.solve_rq_problem(n, weight, power, f, g, H_ne, H_val)#
Find the global moinimizer of the regularized quadratic objective function \(r(x)\)
Parameters:
- nint
holds the number of variables.
- weightfloat
holds the strictly positive regularization weight, \(\sigma\).
- powerfloat
holds the regularization power, \(p \geq 2\).
- ffloat
holds the constant term \(f\) in the objective function.
- gndarray(n)
holds the values of the linear term \(g\) in the objective function.
- H_neint
holds the number of entries in the lower triangular part of the Hessian \(H\).
- H_valndarray(H_ne)
holds the values of the nonzeros in the lower triangle of the Hessian \(H\) in the same order as specified in the sparsity pattern in
rqs.load
.Returns:
- xndarray(n)
holds the values of the approximate minimizer \(x\) after a successful call.
- dps.resolve_tr_problem(n, radius, f, g)#
Find the global moinimizer of the quadratic objective function \(q(x)\) within the trust-region after any of the non-matrix terms has changed.
Parameters:
- nint
holds the number of variables.
- radiusfloat
holds the strictly positive trust-region radius, \(\Delta\).
- ffloat
holds the constant term \(f\) in the objective function.
- gndarray(n)
holds the values of the linear term \(g\) in the objective function.
Returns:
- xndarray(n)
holds the values of the approximate minimizer \(x\) after a successful call.
- rqs.resolve_rq_problem(n, weight, power, f, g)#
Find the global moinimizer of the regularized quadratic objective function \(r(x)\) after any of the non-matrix terms has changed.
Parameters:
- nint
holds the number of variables.
- weightfloat
holds the strictly positive regularization weight, \(\sigma\).
- powerfloat
holds the regularization power, \(p \geq 2\).
- ffloat
holds the constant term \(f\) in the objective function.
- gndarray(n)
holds the values of the linear term \(g\) in the objective function.
Returns:
- xndarray(n)
holds the values of the approximate minimizer \(x\) after a successful call.
- [optional] dps.information()
Provide optional output information
Returns:
- informdict
- dictionary containing output information:
- statusint
return status. Possible values are:
0
The run was successful.
-1
An allocation error occurred. A message indicating the offending array is written on unit options[‘error’], and the returned allocation status and a string containing the name of the offending array are held in inform[‘alloc_status’] and inform[‘bad_alloc’] respectively.
-2
A deallocation error occurred. A message indicating the offending array is written on unit options[‘error’] and the returned allocation status and a string containing the name of the offending array are held in inform[‘alloc_status’] and inform[‘bad_alloc’] respectively.
-3
The restriction n > 0, radius > 0, weight > 0 or power \(\geq\) 2, or requirement that type contains its relevant string ‘dense’, ‘coordinate’, ‘sparse_by_rows’, ‘diagonal’, ‘scaled_identity’, ‘identity’, ‘zero’ or ‘none’ has been violated.
-9
The analysis phase of the factorization failed; the return status from the factorization package is given by inform[‘factor_status’].
-10
The factorization failed; the return status from the factorization package is given by inform[‘factor_status’].
-11
The solution of a set of linear equations using factors from the factorization package failed; the return status from the factorization package is given by inform[‘factor_status’].
-16
The problem is so ill-conditioned that further progress is impossible.
-23
An entry from the strict upper triangle of \(H\) has been specified.
-40
An error occured when building \(M\).
- alloc_statusint
the status of the last attempted allocation/deallocation.
- bad_allocstr
the name of the array for which an allocation/deallocation error occurred.
- mod_1by1int
the number of 1 by 1 blocks from the factorization of \(H\) that were modified when constructing \(M\).
- mod_2by2int
the number of 2 by 2 blocks from the factorization of \(H\) that were modified when constructing \(M\).
- objfloat
the value of the quadratic function.
- obj_regularizedfloat
the value of the regularized quadratic function.
- x_normfloat
the \(M\)-norm of the solution.
- multiplierfloat
the Lagrange multiplier associated with the constraint/regularization.
- polefloat
a lower bound max(0,-lambda_1), where lambda_1 is the left-most eigenvalue of \((H,M)\).
- hard_casebool
has the hard case occurred?.
- timedict
- dictionary containing timing information:
- totalfloat
total CPU time spent in the package.
- analysefloat
CPU time spent reordering H prior to factorization.
- factorizefloat
CPU time spent factorizing H.
- solvefloat
CPU time spent solving the diagonal model system.
- clock_totalfloat
total clock time spent in the package.
- clock_analysefloat
clock time spent reordering H prior to factorization.
- clock_factorizefloat
clock time spent factorizing H.
- clock_solvefloat
clock time spent solving the diagonal model system.
- sls_informdict
inform parameters for SLS (see
sbls.information
).
- dps.terminate()#
Deallocate all internal private storage.
example code#
from galahad import dps
import numpy as np
np.set_printoptions(precision=4,suppress=True,floatmode='fixed')
print("\n** python test: dps")
# set parameters
p = 1.0
n = 3
m = 1
infinity = float("inf")
# describe objective function
f = 0.96
g = np.array([0.0,2.0,0.0])
H_type = 'coordinate'
H_ne = 4
H_row = np.array([0,1,2,2])
H_col = np.array([0,1,2,0])
H_ptr = None
H_val = np.array([1.0,2.0,3.0,4.0])
# allocate internal data and set default options
options = dps.initialize()
# set some non-default options
options['print_level'] = 0
#print("options:", options)
# load data (and optionally non-default options)
dps.load(n, H_type, H_ne, H_row, H_col, H_ptr, options)
# set trust-region radius
radius = 1.0
# find minimum of quadratic within the trust region
print("\n solve trust-region problem")
x = dps.solve_tr_problem(n, radius, f, g, H_ne, H_val)
print(" x:",x)
# get information
inform = dps.information()
print(" f: %.4f" % inform['obj'])
# set regualization weight and power
weight = 1.0
power = 3.0
# find minimum of regularized quadratic
print("\n solve regularization problem")
x = dps.solve_rq_problem(n, weight, power, f, g, H_ne, H_val)
print(" x:",x)
# get information
inform = dps.information()
print(" f: %.4f" % inform['obj'])
print('** dps exit status:', inform['status'])
# deallocate internal data
dps.terminate()
This example code is available in $GALAHAD/src/dps/Python/test_dps.py .