DGO#

purpose#

The dgo package uses a deterministic partition-and-bound trust-region method to find an approximation to the global minimizer of a differentiable objective function \(f(x)\) of \(n\) variables \(x\), subject to simple bounds \(x^l \leq x \leq x^u\) on the variables. Here, any of the components of the vectors of bounds \(x^l\) and \(x^u\) may be infinite. The method offers the choice of direct and iterative solution of the key trust-region subproblems, and is suitable for large problems. First derivatives are required; if second derivatives can be calculated, they will be exploited, and if only products of second derivatives with a vector can be formed, these too will be exploited.

Although there are theoretical guarantees, these may require a large number of evaluations as the dimension and nonconvexity increase. The alternative package bgo may sometimes be preferred.

See Section 4 of $GALAHAD/doc/dgo.pdf for additional details.

method#

Starting with the initial box \(x^l \leq x \leq x^u\), a sequence of boxes is generated by considering the current set, and partitioning a promising candidate into three equally-sized sub-boxes by splitting along one of the box dimensions. Each partition requires only a pair of new function and derivative evaluations, and these values, together with estimates of Lipschitz constants, make it possible to remove other boxes from further consideration as soon as they cannot contain a global minimizer. Efficient control of the dictionary of vertices of the sub-boxes is handled using a suitable hashing procedure provided by HASH; each sub-box is indexed by the concatenated coordinates of a pair of opposite vertices. At various stages, local minimization in a promising sub-box, using TRB, may be used to improve the best-known upper bound on the global minimizer. If \(n=1\), the specialised univariate global minimization package UGO is called directly.

We reiterate that although there are theoretical guarantees, these may require a large number of evaluations as the dimension and nonconvexity increase. Thus the method should best be viewed as a heuristic to try to find a reasonable approximation of the global minimum.
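The partition-and-bound idea described above can be sketched in a few lines of Python. The following is a simplified illustration only, not GALAHAD's implementation: it uses a single user-supplied Lipschitz constant (where dgo estimates one adaptively from observed values), trisects the box with the smallest lower bound along its longest edge, and prunes boxes whose bound \(f(c) - L r\) cannot beat the best value found. The function name and its interface are invented for this sketch.

```python
import numpy as np

def toy_partition_and_bound(f, x_l, x_u, lipschitz, max_boxes=2000):
    """Illustrative partition-and-bound search; NOT GALAHAD's algorithm.

    Each box keeps (lower_bound, f(centre), lo, hi, centre), where the
    lower bound f(c) - L * r uses a user-supplied Lipschitz constant L
    and the half-diagonal r of the box.
    """
    def make_box(lo, hi):
        c = 0.5 * (lo + hi)
        fc = f(c)
        r = 0.5 * np.linalg.norm(hi - lo)
        return (fc - lipschitz * r, fc, lo, hi, c)

    x_l, x_u = np.asarray(x_l, float), np.asarray(x_u, float)
    boxes = [make_box(x_l, x_u)]
    best_f, best_x = boxes[0][1], boxes[0][4]
    while boxes and len(boxes) < max_boxes:
        boxes.sort(key=lambda b: b[0])   # most promising = smallest lower bound
        lb, fc, lo, hi, c = boxes.pop(0)
        if lb >= best_f - 1e-10:         # no remaining box can contain a better point
            break
        d = int(np.argmax(hi - lo))      # split along the longest edge
        third = (hi[d] - lo[d]) / 3.0
        for k in range(3):               # trisect into three equal sub-boxes
            lo_k, hi_k = lo.copy(), hi.copy()
            lo_k[d] = lo[d] + k * third
            hi_k[d] = lo[d] + (k + 1) * third
            b = make_box(lo_k, hi_k)
            if b[1] < best_f:            # improve the incumbent upper bound
                best_f, best_x = b[1], b[4]
            if b[0] < best_f:            # prune boxes that cannot win
                boxes.append(b)
    return best_x, best_f
```

Note how pruning is only safe because the lower bound is valid: if \(f(c) - L r\) already exceeds the best value found, no point of that box can improve on the incumbent. dgo additionally exploits gradient and Hessian information in its bounds, which this sketch omits.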

references#

The global minimization method employed is an extension of that due to

Ya. D. Sergeyev and D. E. Kvasov, "A deterministic global optimization using smooth diagonal auxiliary functions", Communications in Nonlinear Science and Numerical Simulation, 21(1-3) (2015) 99-111.

but adapted to use 2nd derivatives, while in the special case when \(n=1\), a simplification based on the ideas in

D. Lera and Ya. D. Sergeyev, "Acceleration of univariate global optimization algorithms working with Lipschitz functions and Lipschitz first derivatives", SIAM Journal on Optimization 23(1) (2013) 508-529.

is used instead. The generic bound-constrained trust-region method used for local minimization is described in detail in

A. R. Conn, N. I. M. Gould and Ph. L. Toint, Trust-region methods. SIAM/MPS Series on Optimization (2000).

matrix storage#

The symmetric \(n\) by \(n\) matrix \(H = \nabla^2_{xx}f\) may be presented and stored in a variety of formats. Crucially, however, symmetry is exploited by storing only the values from the lower triangular part (i.e., those entries that lie on or below the leading diagonal).

Dense storage format: The matrix \(H\) is stored as a compact dense matrix by rows, that is, the values of the entries of each row in turn are stored in order within an appropriate real one-dimensional array. Since \(H\) is symmetric, only the lower triangular part (that is, the part \(H_{ij}\) for \(0 \leq j \leq i \leq n-1\)) need be held. In this case the lower triangle should be stored by rows, that is, component \(i * (i+1) / 2 + j\) of the storage array H_val will hold the value \(H_{ij}\) (and, by symmetry, \(H_{ji}\)) for \(0 \leq j \leq i \leq n-1\). The string H_type = 'dense' should be specified.

Sparse co-ordinate storage format: Only the nonzero entries of the matrix are stored. For the \(l\)-th entry, \(0 \leq l \leq ne-1\), of \(H\), its row index i, column index j and value \(H_{ij}\), \(0 \leq j \leq i \leq n-1\), are stored as the \(l\)-th components of the integer arrays H_row and H_col and real array H_val, respectively, while the number of nonzeros is recorded as H_ne = \(ne\). Note that only the entries in the lower triangle should be stored. The string H_type = 'coordinate' should be specified.

Sparse row-wise storage format: Again only the nonzero entries are stored, but this time they are ordered so that those in row i appear directly before those in row i+1. For the i-th row of \(H\), the i-th component of the integer array H_ptr holds the position of the first entry in this row, while H_ptr(n) holds the total number of entries. The column indices j, \(0 \leq j \leq i\), and values \(H_{ij}\) of the entries in the i-th row are stored in components l = H_ptr(i), …, H_ptr(i+1)-1 of the integer array H_col, and real array H_val, respectively. Note that as before only the entries in the lower triangle should be stored. For sparse matrices, this scheme almost always requires less storage than its predecessor. The string H_type = 'sparse_by_rows' should be specified.
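To make the three schemes concrete, the following fragment builds the dense, co-ordinate and row-wise representations of the same small symmetric matrix. The matrix itself is an assumed example chosen purely for illustration, not one taken from this documentation.

```python
import numpy as np

# an assumed 3-by-3 symmetric matrix used purely for illustration
H = np.array([[1., 2., 0.],
              [2., 3., 4.],
              [0., 4., 5.]])
n = H.shape[0]

# dense storage: the lower triangle by rows; entry (i,j) sits at i*(i+1)//2 + j
H_dense = np.array([H[i, j] for i in range(n) for j in range(i + 1)])

# sparse co-ordinate storage: only the nonzeros of the lower triangle
coords = [(i, j) for i in range(n) for j in range(i + 1) if H[i, j] != 0.0]
H_row = np.array([i for i, _ in coords])
H_col = np.array([j for _, j in coords])
H_val = np.array([H[i, j] for i, j in coords])
H_ne = len(coords)

# sparse row-wise storage: H_ptr[i] marks the start of row i in H_col/H_val,
# and H_ptr[n] holds the total number of stored entries
H_ptr = np.zeros(n + 1, dtype=int)
for i in H_row:
    H_ptr[i + 1] += 1
H_ptr = np.cumsum(H_ptr)
```

Here the dense scheme stores 6 values (including the explicit zero at position \(2 \cdot 3/2 + 0 = 3\)), while the two sparse schemes store only the 5 nonzeros.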

functions#

dgo.initialize()#

Set default option values and initialize private data.

Returns:

options : dict
dictionary containing default control options:
error : int

error and warning diagnostics occur on stream error.

out : int

general output occurs on stream out.

print_level : int

the level of output required. Possible values are:

  • <= 0

    no output

  • 1

    a one-line summary for every improvement

  • 2

    a summary of each iteration

  • >= 3

    increasingly verbose (debugging) output.

start_print : int

any printing will start on this iteration.

stop_print : int

any printing will stop on this iteration.

print_gap : int

the number of iterations between printing.

maxit : int

the maximum number of iterations performed.

max_evals : int

the maximum number of function evaluations made.

dictionary_size : int

the size of the initial hash dictionary.

alive_unit : int

removal of the file alive_file from unit alive_unit terminates execution.

alive_file : str

see alive_unit.

infinity : float

any bound larger than infinity in modulus will be regarded as infinite.

lipschitz_lower_bound : float

a small positive constant (<= 1e-6) that ensures that the estimated gradient Lipschitz constant is not too small.

lipschitz_reliability : float

the Lipschitz reliability parameter; the Lipschitz constant used will be a factor lipschitz_reliability times the largest value observed.

lipschitz_control : float

the reliability control parameter; the actual reliability parameter used will be lipschitz_reliability + MAX( 1, n - 1 ) * lipschitz_control / iteration.

stop_length : float

the iteration will stop if the length, delta, of the diagonal in the box with the smallest-found objective function is smaller than stop_length times that of the original bound box, delta_0.

stop_f : float

the iteration will stop if the gap between the best objective value found and the smallest lower bound is smaller than stop_f.

obj_unbounded : float

the smallest value the objective function may take before the problem is marked as unbounded.

cpu_time_limit : float

the maximum CPU time allowed (-ve means infinite).

clock_time_limit : float

the maximum elapsed clock time allowed (-ve means infinite).

hessian_available : bool

is the Hessian matrix of second derivatives available, or is access only via matrix-vector products?

prune : bool

should boxes that cannot contain the global minimizer be pruned (i.e., removed from further consideration)?

perform_local_optimization : bool

should approximate minimizers be improved by judicious local minimization?

space_critical : bool

if space_critical is True, every effort will be made to use as little space as possible. This may result in longer computation time.

deallocate_error_fatal : bool

if deallocate_error_fatal is True, any array/pointer deallocation error will terminate execution. Otherwise, computation will continue.

prefix : str

all output lines will be prefixed by the string contained in quotes within prefix, e.g. 'word' (note the quotes) will result in the prefix word.

ugo_options : dict

default control options for UGO (see ugo.initialize).

hash_options : dict

default control options for HASH (see hash.initialize).

trb_options : dict

default control options for TRB (see trb.initialize).

dgo.load(n, x_l, x_u, H_type, H_ne, H_row, H_col, H_ptr, options=None)#

Import problem data into internal storage prior to solution.

Parameters:

n : int

holds the number of variables.

x_l : ndarray(n)

holds the values \(x^l\) of the lower bounds on the optimization variables \(x\).

x_u : ndarray(n)

holds the values \(x^u\) of the upper bounds on the optimization variables \(x\).

H_type : string

specifies the symmetric storage scheme used for the Hessian. It should be one of 'coordinate', 'sparse_by_rows', 'dense', 'diagonal' or 'absent', the latter if access to the Hessian is via matrix-vector products; lower or upper case variants are allowed.

H_ne : int

holds the number of entries in the lower triangular part of \(H\) in the sparse co-ordinate storage scheme. It need not be set for any of the other schemes.

H_row : ndarray(H_ne)

holds the row indices of the lower triangular part of \(H\) in the sparse co-ordinate storage scheme. It need not be set for any of the other schemes, and in this case can be None.

H_col : ndarray(H_ne)

holds the column indices of the lower triangular part of \(H\) in either the sparse co-ordinate, or the sparse row-wise storage scheme. It need not be set when the dense or diagonal storage schemes are used, and in this case can be None.

H_ptr : ndarray(n+1)

holds the starting position of each row of the lower triangular part of \(H\), as well as the total number of entries, in the sparse row-wise storage scheme. It need not be set when the other schemes are used, and in this case can be None.

options : dict, optional

dictionary of control options (see dgo.initialize).

dgo.solve(n, H_ne, x, eval_f, eval_g, eval_h)#

Find an approximation to the global minimizer of a given function subject to simple bounds on the variables using a deterministic partition-and-bound trust-region method.

Parameters:

n : int

holds the number of variables.

H_ne : int

holds the number of entries in the lower triangular part of \(H\).

x : ndarray(n)

holds the values of optimization variables \(x\).

eval_f : callable

a user-defined function that must have the signature:

f = eval_f(x)

The value of the objective function \(f(x)\) evaluated at \(x\) must be assigned to f.

eval_g : callable

a user-defined function that must have the signature:

g = eval_g(x)

The components of the gradient \(\nabla f(x)\) of the objective function evaluated at \(x\) must be assigned to g.

eval_h : callable

a user-defined function that must have the signature:

h = eval_h(x)

The components of the nonzeros in the lower triangle of the Hessian \(\nabla^2 f(x)\) of the objective function evaluated at \(x\) must be assigned to h in the same order as specified in the sparsity pattern in dgo.load.
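To illustrate the required ordering, suppose dgo.load had been given the co-ordinate pattern H_row = [0, 1, 1] and H_col = [0, 0, 1] for a hypothetical two-variable problem (this example is invented here, and is not the one from this documentation). Then eval_h must return the values \(H_{00}, H_{10}, H_{11}\) in exactly that order:

```python
import numpy as np

# hypothetical objective f(x) = x_0^2 + x_0 x_1 + x_1^2, for illustration only
def eval_f(x):
    return x[0]**2 + x[0]*x[1] + x[1]**2

def eval_g(x):
    # gradient (2 x_0 + x_1, x_0 + 2 x_1)
    return np.array([2.0*x[0] + x[1], x[0] + 2.0*x[1]])

def eval_h(x):
    # lower-triangle Hessian entries in the order given to dgo.load:
    # (0,0) -> 2, (1,0) -> 1, (1,1) -> 2
    return np.array([2.0, 1.0, 2.0])
```

Here the Hessian happens to be constant, so eval_h ignores its argument; in general each returned component must correspond, position by position, to the (H_row, H_col) pattern supplied to dgo.load.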

Returns:

x : ndarray(n)

holds the value of the approximate global minimizer \(x\) after a successful call.

g : ndarray(n)

holds the gradient \(\nabla f(x)\) of the objective function.

[optional] dgo.information()#

Provide optional output information.

Returns:

inform : dict
dictionary containing output information:
status : int

return status. Possible values are:

  • 0

    The run was successful.

  • -1

    An allocation error occurred. A message indicating the offending array is written on unit options[‘error’], and the returned allocation status and a string containing the name of the offending array are held in inform[‘alloc_status’] and inform[‘bad_alloc’] respectively.

  • -2

    A deallocation error occurred. A message indicating the offending array is written on unit options[‘error’] and the returned allocation status and a string containing the name of the offending array are held in inform[‘alloc_status’] and inform[‘bad_alloc’] respectively.

  • -3

The restriction n > 0 or the requirement that H_type contains one of the strings 'dense', 'coordinate', 'sparse_by_rows', 'diagonal' or 'absent' has been violated.

  • -7

    The objective function appears to be unbounded from below.

  • -9

    The analysis phase of the factorization failed; the return status from the factorization package is given by inform[‘factor_status’].

  • -10

    The factorization failed; the return status from the factorization package is given by inform[‘factor_status’].

  • -11

    The solution of a set of linear equations using factors from the factorization package failed; the return status from the factorization package is given by inform[‘factor_status’].

  • -16

    The problem is so ill-conditioned that further progress is impossible.

  • -19

    The CPU time limit has been reached. This may happen if options[‘cpu_time_limit’] is too small, but may also be symptomatic of a badly scaled problem.

  • -82

    The user has forced termination of the solver by removing the file named options[‘alive_file’] from unit options[‘alive_unit’].

  • -91

The hash table used to store the dictionary of vertices of the sub-boxes is full, and there is no room to increase it further.

  • -99

The budget limit on function evaluations has been reached. This will happen if the limit options['max_evals'] is exceeded, and is quite normal for global-optimization methods. The user may explore increasing options['max_evals'] to see if that produces a lower value of the objective function, but there are unfortunately no guarantees.

alloc_status : int

the status of the last attempted allocation/deallocation.

bad_alloc : str

the name of the array for which an allocation/deallocation error occurred.

iter : int

the total number of iterations performed.

f_eval : int

the total number of evaluations of the objective function.

g_eval : int

the total number of evaluations of the gradient of the objective function.

h_eval : int

the total number of evaluations of the Hessian of the objective function.

obj : float

the value of the objective function at the best estimate of the solution determined by dgo.solve.

norm_pg : float

the norm of the projected gradient of the objective function at the best estimate of the solution determined by dgo.solve.

length_ratio : float

the ratio of the final to the initial box lengths.

f_gap : float

the gap between the best objective value found and the lowest bound.

why_stop : str

why did the iteration stop? This will be 'D' if the box length is small enough, 'F' if the objective gap is small enough, and ' ' otherwise.

time : dict
dictionary containing timing information:
total : float

the total CPU time spent in the package.

univariate_global : float

the CPU time spent performing univariate global optimization.

multivariate_local : float

the CPU time spent performing multivariate local optimization.

clock_total : float

the total clock time spent in the package.

clock_univariate_global : float

the clock time spent performing univariate global optimization.

clock_multivariate_local : float

the clock time spent performing multivariate local optimization.

ugo_inform : dict

inform parameters for UGO (see ugo.information).

hash_inform : dict

inform parameters for HASH (see hash.information).

trb_inform : dict

inform parameters for TRB (see trb.information).

dgo.terminate()#

Deallocate all internal private storage.

example code#

from galahad import dgo
import numpy as np
np.set_printoptions(precision=4,suppress=True,floatmode='fixed')
print("\n** python test: dgo")

# allocate internal data and set default options
options = dgo.initialize()

# set some non-default options
options['print_level'] = 0
options['ugo_options']['print_level'] = 0
#print("options:", options)

# set parameters
p = 4
freq = 10
mag = 1000
# set bounds
n = 3
x_l = np.array([-10.0,-10.0,-10.0])
x_u = np.array([1.0,1.0,1.0])

# set Hessian sparsity
H_type = 'coordinate'
H_ne = 5
H_row = np.array([0,2,1,2,2])
H_col = np.array([0,0,1,1,2])
H_ptr = None

# load data (and optionally non-default options)
dgo.load(n, x_l, x_u, H_type, H_ne, H_row, H_col, H_ptr, options=options)

# define objective function and its derivatives
def eval_f(x):
    return (x[0] + x[2] + p)**2 + (x[1] + x[2])**2 + mag * np.cos(freq * x[0]) + x[0] + x[1] + x[2]
def eval_g(x):
    return np.array([2. * ( x[0] + x[2] + p )
                     - mag * freq * np.sin(freq * x[0]) + 1.,
                     2. * ( x[1] + x[2] ) + 1.,
                     2. * ( x[0] + x[2] + p ) + 2.0 * ( x[1] + x[2] ) + 1.])
def eval_h(x):
    return np.array([2. - mag * freq * freq * np.cos(freq * x[0]),2.,2.,2.,4.])

# set starting point
x = np.array([1.,1.,1.])

# find optimum
x, g = dgo.solve(n, H_ne, x, eval_f, eval_g, eval_h)
print(" x:",x)
print(" g:",g)

# get information
inform = dgo.information()
print(" f: %.4f" % inform['obj'])
print('** dgo exit status:', inform['status'])

# deallocate internal data
dgo.terminate()

This example code is available in $GALAHAD/src/dgo/Python/test_dgo.py .