DGO#
purpose#
The dgo
package uses a deterministic partition-and-bound trust-region
method to find an approximation to the global minimizer of a
differentiable objective function \(f(x)\) of n variables \(x\),
subject to simple bounds \(x^l <= x <= x^u\) on the variables.
Here, any of the components of the vectors of bounds \(x^l\) and \(x^u\)
may be infinite. The method offers the choice of direct and
iterative solution of the key trust-region subproblems, and
is suitable for large problems. First derivatives are required,
and if second derivatives can be calculated, they will be exploited -
if the product of second derivatives with a vector may be found but
not the derivatives themselves, that may also be exploited.
Although there are theoretical guarantees, these may require a large
number of evaluations as the dimension and nonconvexity increase.
The alternative package bgo
may sometimes be preferred.
See Section 4 of $GALAHAD/doc/dgo.pdf for additional details.
method#
Starting with the initial box \(x^l \leq x \leq x^u\), a sequence of
boxes is generated by considering the current set, and partitioning
a promising candidate into three equally-sized sub-boxes by splitting
along one of the box dimensions. Each partition requires only a pair of
new function and derivative evaluations, and these values, together with
estimates of Lipschitz constants, makes it possible to remove other boxes
from further consideration as soon as they cannot contain a global minimizer.
Efficient control of the dictionary of vertices of the sub-boxes
is handled using a suitable hashing procedure provided by
HASH
; each sub-box is indexed by the concatenated
coordinates of a pair of opposite vertices. At various
stages, local minimization in a promising sub-box, using
TRB
, may be used to improve the best-known upper bound
on the global minimizer.
If \(n=1\), the specialised univariate global minimization package
UGO
is called directly.
We reiterate that although there are theoretical guarantees, these may require a large number of evaluations as the dimension and nonconvexity increase. Thus the method should best be viewed as a heuristic to try to find a reasonable approximation of the global minimum.
references#
The global minimization method employed is an extension of that due to
Ya. D. Sergeyev and D. E. Kasov, ``A deterministic global optimization using smooth diagonal auxiliary functions’’, Communications in Nonlinear Science and Numerical Simulation, 21(1-3) (2015) 99-111.
but adapted to use 2nd derivatives, while in the special case when \(n=1\), a simplification based on the ideas in
D. Lera and Ya. D. Sergeyev (2013), ``Acceleration of univariate global optimization algorithms working with Lipschitz functions and Lipschitz first derivatives’’ SIAM J. Optimization 23(1) (2013) 508–529.
is used instead. The generic bound-constrained trust-region method used for local minimization is described in detail in
A. R. Conn, N. I. M. Gould and Ph. L. Toint, Trust-region methods. SIAM/MPS Series on Optimization (2000).
matrix storage#
The symmetric \(n\) by \(n\) matrix \(H = \nabla^2_{xx}f\) may be presented and stored in a variety of formats. But crucially symmetry is exploited by only storing values from the lower triangular part (i.e, those entries that lie on or below the leading diagonal).
Dense storage format: The matrix \(H\) is stored as a compact dense matrix by rows, that is, the values of the entries of each row in turn are stored in order within an appropriate real one-dimensional array. Since \(H\) is symmetric, only the lower triangular part (that is the part \(H_{ij}\) for \(0 <= j <= i <= n-1\)) need be held. In this case the lower triangle should be stored by rows, that is component \(i * i / 2 + j\) of the storage array H_val will hold the value \(H_{ij}\) (and, by symmetry, \(H_{ji}\)) for \(0 <= j <= i <= n-1\). The string H_type = ‘dense’ should be specified.
Sparse co-ordinate storage format: Only the nonzero entries of the matrices are stored. For the \(l\)-th entry, \(0 <= l <= ne-1\), of \(H\), its row index i, column index j and value \(H_{ij}\), \(0 <= j <= i <= n-1\), are stored as the \(l\)-th components of the integer arrays H_row and H_col and real array H_val, respectively, while the number of nonzeros is recorded as H_ne = \(ne\). Note that only the entries in the lower triangle should be stored. The string H_type = ‘coordinate’ should be specified.
Sparse row-wise storage format: Again only the nonzero entries are stored, but this time they are ordered so that those in row i appear directly before those in row i+1. For the i-th row of \(H\) the i-th component of the integer array H_ptr holds the position of the first entry in this row, while H_ptr(n) holds the total number of entries. The column indices j, \(0 <= j <= i\), and values \(H_{ij}\) of the entries in the i-th row are stored in components l = H_ptr(i), …, H_ptr(i+1)-1 of the integer array H_col, and real array H_val, respectively. Note that as before only the entries in the lower triangle should be stored. For sparse matrices, this scheme almost always requires less storage than its predecessor. The string H_type = ‘sparse_by_rows’ should be specified.
functions#
- dgo.initialize()#
Set default option values and initialize private data
Returns:
- optionsdict
- dictionary containing default control options:
- errorint
error and warning diagnostics occur on stream error.
- outint
general output occurs on stream out.
- print_levelint
the level of output required. Possible values are:
<= 0
no output
1
a one-line summary for every improvement
2
a summary of each iteration
>= 3
increasingly verbose (debugging) output.
- start_printint
any printing will start on this iteration.
- stop_printint
any printing will stop on this iteration.
- print_gapint
the number of iterations between printing.
- maxitint
the maximum number of iterations performed.
- max_evalsint
the maximum number of function evaluations made.
- dictionary_sizeint
the size of the initial hash dictionary.
- alive_unitint
removal of the file alive_file from unit alive_unit terminates execution.
- alive_filestr
see alive_unit.
- infinityfloat
any bound larger than infinity in modulus will be regarded as infinite.
- lipschitz_lower_boundfloat
a small positive constant (<= 1e-6) that ensure that the estimted gradient Lipschitz constant is not too small.
- lipschitz_reliabilityfloat
the Lipschitz reliability parameter, the Lipschiz constant used will be a factor lipschitz_reliability times the largest value observed.
- lipschitz_controlfloat
the reliablity control parameter, the actual reliability parameter used will be
lipschitz_reliability
+ MAX( 1, n - 1 ) *lipschitz_control
/ iteration.- stop_lengthfloat
the iteration will stop if the length, delta, of the diagonal in the box with the smallest-found objective function is smaller than
stop_length
times that of the original bound box, delta_0.- stop_ffloat
the iteration will stop if the gap between the best objective value found and the smallest lower bound is smaller than
stop_f
.- obj_unboundedfloat
the smallest value the objective function may take before the problem is marked as unbounded.
- cpu_time_limitfloat
the maximum CPU time allowed (-ve means infinite).
- clock_time_limitfloat
the maximum elapsed clock time allowed (-ve means infinite).
- hessian_availablebool
is the Hessian matrix of second derivatives available or is access only via matrix-vector products?.
- prunebool
should boxes that cannot contain the global minimizer be pruned (i.e., removed from further consideration)?.
- perform_local_optimizationbool
should approximate minimizers be impoved by judicious local minimization?.
- space_criticalbool
if
space_critical
is True, every effort will be made to use as little space as possible. This may result in longer computation time.- deallocate_error_fatalbool
if
deallocate_error_fatal
is True, any array/pointer deallocation error will terminate execution. Otherwise, computation will continue.- prefixstr
all output lines will be prefixed by the string contained in quotes within
prefix
, e.g. ‘word’ (note the qutoes) will result in the prefix word.- ugo_optionsdict
default control options for UGO (see
ugo.initialize
).- hash_optionsdict
default control options for HASH (see
hash.initialize
).- trb_optionsdict
default control options for TRB (see
trb.initialize
).
- dgo.load(n, x_l, x_u, H_type, H_ne, H_row, H_col, H_ptr, options=None)#
Import problem data into internal storage prior to solution.
Parameters:
- nint
holds the number of variables.
- x_lndarray(n)
holds the values \(x^l\) of the lower bounds on the optimization variables \(x\).
- x_undarray(n)
holds the values \(x^u\) of the upper bounds on the optimization variables \(x\).
- H_typestring
specifies the symmetric storage scheme used for the Hessian. It should be one of ‘coordinate’, ‘sparse_by_rows’, ‘dense’, ‘diagonal’ or ‘absent’, the latter if access to the Hessian is via matrix-vector products; lower or upper case variants are allowed.
- H_neint
holds the number of entries in the lower triangular part of \(H\) in the sparse co-ordinate storage scheme. It need not be set for any of the other three schemes.
- H_rowndarray(H_ne)
holds the row indices of the lower triangular part of \(H\) in the sparse co-ordinate storage scheme. It need not be set for any of the other three schemes, and in this case can be None
- H_colndarray(H_ne)
holds the column indices of the lower triangular part of \(H\) in either the sparse co-ordinate, or the sparse row-wise storage scheme. It need not be set when the dense or diagonal storage schemes are used, and in this case can be None
- H_ptrndarray(n+1)
holds the starting position of each row of the lower triangular part of \(H\), as well as the total number of entries, in the sparse row-wise storage scheme. It need not be set when the other schemes are used, and in this case can be None
- optionsdict, optional
dictionary of control options (see
dgo.initialize
).
- dgo.solve(n, H_ne, x, eval_f, eval_g, eval_h)#
Find an approximation to the global minimizer of a given function subject to simple bounds on the variables using a multistart trust-region method.
Parameters:
- nint
holds the number of variables.
- H_neint
holds the number of entries in the lower triangular part of \(H\).
- xndarray(n)
holds the values of optimization variables \(x\).
- eval_fcallable
a user-defined function that must have the signature:
f = eval_f(x)
The value of the objective function \(f(x)\) evaluated at \(x\) must be assigned to
f
.- eval_gcallable
a user-defined function that must have the signature:
g = eval_g(x)
The components of the gradient \(\nabla f(x)\) of the objective function evaluated at \(x\) must be assigned to
g
.- eval_hcallable
a user-defined function that must have the signature:
h = eval_h(x)
The components of the nonzeros in the lower triangle of the Hessian \(\nabla^2 f(x)\) of the objective function evaluated at \(x\) must be assigned to
h
in the same order as specified in the sparsity pattern indgo.load
.Returns:
- xndarray(n)
holds the value of the approximate global minimizer \(x\) after a successful call.
- gndarray(n)
holds the gradient \(\nabla f(x)\) of the objective function.
- [optional] dgo.information()
Provide optional output information
Returns:
- informdict
- dictionary containing output information:
- statusint
return status. Possible values are:
0
The run was successful.
-1
An allocation error occurred. A message indicating the offending array is written on unit options[‘error’], and the returned allocation status and a string containing the name of the offending array are held in inform[‘alloc_status’] and inform[‘bad_alloc’] respectively.
-2
A deallocation error occurred. A message indicating the offending array is written on unit options[‘error’] and the returned allocation status and a string containing the name of the offending array are held in inform[‘alloc_status’] and inform[‘bad_alloc’] respectively.
-3
The restriction n > 0 or requirement that type contains its relevant string ‘dense’, ‘coordinate’, ‘sparse_by_rows’, ‘diagonal’ or ‘absent’ has been violated.
-7
The objective function appears to be unbounded from below.
-9
The analysis phase of the factorization failed; the return status from the factorization package is given by inform[‘factor_status’].
-10
The factorization failed; the return status from the factorization package is given by inform[‘factor_status’].
-11
The solution of a set of linear equations using factors from the factorization package failed; the return status from the factorization package is given by inform[‘factor_status’].
-16
The problem is so ill-conditioned that further progress is impossible.
-19
The CPU time limit has been reached. This may happen if options[‘cpu_time_limit’] is too small, but may also be symptomatic of a badly scaled problem.
-82
The user has forced termination of the solver by removing the file named options[‘alive_file’] from unit options[‘alive_unit’].
-91
The hash table used to store the dictionary of vertices of the sub-boxes is full, and there is no room to increase it further
-99
The budget limit on function evaluations has been reached. This will happen if the limit options[‘max_evals’] is exceeded, and is quite normal for stochastic global-optimization methods. The user may explore increasing options[‘max_evals’] to see if that produces a lower value of the objective function, but there are unfortunately no guarantees.
- alloc_statusint
the status of the last attempted allocation/deallocation.
- bad_allocstr
the name of the array for which an allocation/deallocation error occurred.
- iterint
the total number of iterations performed.
- f_evalint
the total number of evaluations of the objective function.
- g_evalint
the total number of evaluations of the gradient of the objective function.
- h_evalint
the total number of evaluations of the Hessian of the objective function.
- objfloat
the value of the objective function at the best estimate of the solution determined by
dgo.solve
.- norm_pgfloat
the norm of the projected gradient of the objective function at the best estimate of the solution determined by
dgo.solve
.- length_ratiofloat
the ratio of the final to the initial box lengths.
- f_gapfloat
the gap between the best objective value found and the lowest bound.
- why_stopstr
why did the iteration stop? This wil be ‘D’ if the box length is small enough, ‘F’ if the objective gap is small enough, and ‘ ‘ otherwise.
- timedict
- dictionary containing timing information:
- totalfloat
the total CPU time spent in the package.
- univariate_globalfloat
the CPU time spent performing univariate global optimization.
- multivariate_localfloat
the CPU time spent performing multivariate local optimization.
- clock_totalfloat
the total clock time spent in the package.
- clock_univariate_globalfloat
the clock time spent performing univariate global optimization.
- clock_multivariate_localfloat
the clock time spent performing multivariate local optimization.
- ugo_informdict
inform parameters for UGO (see
ugo.information
).- lhs_informdict
inform parameters for HASH (see
hash.information
).- trb_informdict
inform parameters for TRB (see
trb.information
).
- dgo.terminate()#
Deallocate all internal private storage.
example code#
from galahad import dgo
import numpy as np
np.set_printoptions(precision=4,suppress=True,floatmode='fixed')
print("\n** python test: dgo")
# allocate internal data and set default options
options = dgo.initialize()
# set some non-default options
options['print_level'] = 0
options['ugo_options']['print_level'] = 0
#print("options:", options)
# set parameters
p = 4
freq = 10
mag = 1000
# set bounds
n = 3
x_l = np.array([-10.0,-10.0,-10.0])
x_u = np.array([1.0,1.0,1.0])
# set Hessian sparsity
H_type = 'coordinate'
H_ne = 5
H_row = np.array([0,2,1,2,2])
H_col = np.array([0,0,1,1,2])
H_ptr = None
# load data (and optionally non-default options)
dgo.load(n, x_l, x_u, H_type, H_ne, H_row, H_col, H_ptr, options=options)
# define objective function and its derivatives
def eval_f(x):
return (x[0] + x[2] + p)**2 + (x[1] + x[2])**2 + mag * np.cos(freq * x[0]) + x[0] + x[1] + x[2]
def eval_g(x):
return np.array([2. * ( x[0] + x[2] + p )
- mag * freq * np.sin(freq * x[0]) + 1.,
2. * ( x[1] + x[2] ) + 1.,
2. * ( x[0] + x[2] + p ) + 2.0 * ( x[1] + x[2] ) + 1.])
def eval_h(x):
return np.array([2. - mag * freq * freq * np.cos(freq * x[0]),2.,2.,2.,4.])
# set starting point
x = np.array([1.,1.,1.])
# find optimum
x, g = dgo.solve(n, H_ne, x, eval_f, eval_g, eval_h)
print(" x:",x)
print(" g:",g)
# get information
inform = dgo.information()
print(" f: %.4f" % inform['obj'])
print('** dgo exit status:', inform['status'])
# deallocate internal data
dgo.terminate()
This example code is available in $GALAHAD/src/dgo/Python/test_dgo.py .