Managing Editors:
Panos Pardalos
University of Florida, U.S.A.
Reiner Horst
University of Trier, Germany
Advisory Board:
Ding-Zhu Du
University of Minnesota, U.S.A.
C. A. Floudas
Princeton University, U.S.A.
G. Infanger
Stanford University, U.S.A.
J. Mockus
Lithuanian Academy of Sciences, Lithuania
H. D. Sherali
Virginia Polytechnic Institute and State University, U.S.A.
I. E. Grossmann
Carnegie Mellon University
The titles published in this series are listed at the end of this volume.
Global Optimization in
Engineering Design
Edited by
Ignacio E. Grossmann
Carnegie Mellon University
ISBN 978-1-4419-4754-3
All Rights Reserved
© 1996 Springer Science+Business Media Dordrecht
Originally published by Kluwer Academic Publishers in 1996
No part of the material protected by this copyright notice may be reproduced or
utilized in any form or by any means, electronic or mechanical,
including photocopying, recording or by any information storage and
retrieval system, without written permission from the copyright owner.
TABLE OF CONTENTS
Preface ..................................................... vii
To see how the need for global optimization has been motivated in chemical
engineering, it is instructive to briefly follow the development of nonlinear
optimization over the last 20 years in this engineering discipline. In the
70's, pioneering research on nonlinear programming algorithms, which were
applied to process design and optimal control problems, was performed at Imperial
College. Subsequently the advent of the successive quadratic programming
algorithm spurred a great deal of interest and was first applied to chemical process
simulators at Wisconsin in the late 70's. This algorithm was also adapted to
problems with many equations and few degrees of freedom (a common case in
engineering) with decomposition approaches developed in the early 80's at
Carnegie Mellon. New techniques for mixed-integer nonlinear programming
emerged also at Carnegie Mellon in the mid to late 80's and were applied for
the first time to chemical process synthesis problems. Nonlinear programming
techniques, especially successive quadratic programming algorithms, continue
to be of active interest at Carnegie Mellon, Clarkson and Imperial College, par-
ticularly for large scale applications, such as real time optimization. Likewise,
mixed-integer nonlinear programming algorithms continue to be of interest at
Abo Akademi, Carnegie Mellon, Dundee, Maribor and Imperial College. Ini-
tial work in global optimization was performed at Stanford in the early 70's,
while the study of implications of nonconvexities in design and their handling
in decomposition strategies were developed at Florida in the mid-70's. Inter-
est in global optimization resurfaced in the late 80's with the development of
Benders type of algorithms at Princeton. Since that time global optimization
has attracted increased attention, and work is being pursued at a number of
universities (largely represented in this monograph). It is this level of research
activity that has motivated the creation of this monograph.
The chapters of this monograph are roughly divided into two major parts: chap-
ters 1 to 6 emphasize algorithms, and chapters 7 to 12 emphasize applications.
Chapters 1 and 2 describe a novel and elegant LP-based branch and bound
approach by Epperly and Swaney for solving nonlinear programs that are ex-
pressed in factorable form. Computational experience is reported on a set of
50 test problems. Chapters 3 and 4 describe several recent enhancements and
improvements to the GOP algorithm by Visweswaran and Floudas. Details of
the implementation of the algorithm are described, as well as results of process
design and general test problems. Chapters 5 and 6 describe implementations
of methods based on interval analysis. Chapter 5 emphasizes implementations
of existing methods and the effect of inclusion functions, while chapter 6 em-
phasizes strategies for accelerating the search and quickly identifying infeasible
solutions.
As for the applications, chapter 7 presents an exact and finite branch and bound
solution approach to the planning of process networks with separable concave
costs. Chapter 8 by Ierapetritou and Pistikopoulos deals with a number of
stochastic planning and scheduling models which are shown to obey convexity
properties when discretization and decomposition schemes are used on two-
stage programming formulations. Chapter 9 by Iyer and Grossmann presents
the extension of the global optimization method for heat exchanger networks by
Quesada and Grossmann to the case of multiperiod design problems. Chapter
10 by Quesada and Grossmann explores alternative bounding approximations
for their algorithm for linear fractional and bilinear functions which is applied
to problems in layout design, design of truss structures, and multiproduct batch
design. Chapter 11 by Sherali, Smith and Kim outlines a comprehensive solution ...
We believe that this monograph provides a good overview of the current state-
of-the-art of deterministic global optimization techniques for engineering de-
sign.
Ignacio E. Grossmann
Carnegie Mellon University
1
BRANCH AND BOUND FOR
GLOBAL NLP: NEW BOUNDING LP
Thomas G. W. Epperly
Ross E. Swaney
Department of Chemical Engineering
University of Wisconsin
Madison, Wisconsin
We present here a new method for bounding nonlinear programs which forms the
foundation for a branch and bound algorithm presented in the next chapter. The
bounding method is a generalization of the method proposed py Swaney [34] and
is applicable to NLPs in factorable form, which include problems with quadratic
objective functions and quadratic constraints as well as problems with twice differ-
entiable transcendental functions. This class of problems is wide enough to cover
many useful engineering applications including the following which have appeared
in the literature: phase and chemical equilibrium problems [5, 15, 16], complex re-
actor networks [5], heat exchanger networks [5, 23, 38], pool blending [36, 37], and
flowsheet optimization [5, 28]. Reviews of the applications of general nonlinear and
bilinear programs are available in [1, 5].
Although the problem of finding the global optimum of nonconvex NLPs has been
studied for over 30 years, still relatively few concrete algorithms have been proposed
and tested for solving problems where the feasible region is a general compact non-
convex set. Much of the research has focused on concave programs with a convex
feasible region, concave and indefinite quadratic programs, and bilinear programs.
Many of these methods have been summarized in the following review papers or
monographs [11, 6, 12, 10, 21, 22]. While these algorithms have many possible
applications, there are many engineering design problems that do not fit the re-
quirements of these methods because of general nonconvex objective functions or
nonconvex feasible regions.
relaxed dual subproblems required during each iteration may increase exponentially
with the number of variables appearing in bilinear terms. To address this difficulty,
Visweswaran and Floudas [37] introduced some properties to reduce the number of
relaxed dual subproblems required in most cases.
The second line of approaches involves various branch and bound algorithms applied
to the continuous variable domain. These are distinguished by how they obtain
bounds and how they partition the variable domain. Bounding approaches fall
into two main groups, using either interval mathematics or convex underestimating
programs. Interval mathematics provides tools for placing bounds on the objective
function and restricting the feasible region; these methods have been summarized
in the following review articles and monographs [9, 25, 26, 27]. Interval methods
have been applied recently to some engineering design problems in [35]. The tools
offered by interval mathematics have several uses in global optimization.
The other group of bounding techniques is based on convex underestimating programs.
Falk and Soland [4] used convex envelopes of separable functions to provide
bounds for problems with nonconvex objective functions and convex feasible regions,
and Soland [33] extended their work to separable nonconvex constraints.
McCormick [13] then introduced a general method for constructing convex/concave
envelopes for factorable functions, thereby removing the need for separability, and
in [14] presented a branch and bound algorithm based on these envelopes. More
recently, Swaney [34] presented new bounding functions based on McCormick's
envelopes and positive definite combinations of quadratic terms to improve conver-
gence and to provide finite termination even when the minimum is not determined
by a full set of active constraints.
The tightness of the bound strategy used by a branch and bound algorithm is crit-
ical to its success. Tighter bounding functions reduce the need for partitioning,
decreasing the computational effort required. Swaney [34] identified and remedied
a problem that may occur when McCormick's envelopes are constructed at minima
with less than a full active set, but the covering program involved had to be con-
structed at a local minimum, where the projection of the Hessian of the Lagrangian
in the unconstrained directions is positive semi-definite. In this chapter, we relax
this requirement.
Branch and bound algorithms typically get an upper bound on the solution from the
best known feasible point, so some branch and bound algorithms use a local NLP
method to find good feasible points. The algorithm presented here incorporates
global bounding information into the search for local minima and feasible points.
The bounding method here is developed for branch and bound algorithms using
rectangular domain regions. An underestimating LP based on McCormick's convex
envelopes and additional constraints formed from positive definite combinations of
quadratic terms is used to provide a lower bound for the original problem over a
rectangular region. Two different approaches are used in the search for feasible
points, one using MINOS 5.4 [17] and a second one using the underestimating LP
in an iterative scheme. These aspects are treated in the next chapter.
The key features of the bounding method are its ease of evaluation, its wide appli-
cability, and its ability to provide the exact lower bound over a region of finite size.
Because the bounding method uses an LP, it can be reliably and efficiently solved
using existing algorithms. It can be applied to a large class of problems because it
only requires them to be in factorable form. Lastly, the underestimating LP is de-
signed to provide the exact lower bound when built around the global minimum of a
region of finite size, enabling finite termination of the branch and bound algorithm.
The resulting underestimating LP will be used in the branch and bound algorithm
presented in the following chapter. It is used to provide lower bounds for each
region, and is also used iteratively in a search for local minima and feasible points.
The procedure to develop the orthant program from the NLP is summarized as
follows. The first step is to transform the original problem into a quadratic NLP, a
NLP with a quadratic objective and quadratic constraints. Next, using a variable
transformation, the quadratic terms are separated into two groups - those with
gradients in the range space of the constraint gradients, and those with gradients
in the null space of the constraint gradients. Linear bounds for the constraint
space quadratic terms are constructed using McCormick's convex envelopes for a
bilinearity, while linear bounds for the null space quadratic terms are constructed
from positive definite combinations. Both types of constraints are combined into an
LP which underestimates the original NLP for a particular orthant. The details of
each of these steps are presented below.
It is assumed that the problem is well scaled and written in the following factorable
form:
min x_0    (1.1)

s.t.  g_i(x) = b_i^T x + x^T H^{(i)} x + \sum_{j \in F_i} f_j(x_j) \le 0,    i \in 1, \ldots, m    (1.2)

      x^L \le x \le x^U    (1.3)
McCormick [14] shows how many problems can be written in this form. Equality
constraints are written as two opposing inequality constraints. H^{(i)} is chosen to be a
symmetric matrix, and the functions f_j(x_j) are nonlinear, single variable functions.
These definitions assume that the first n components of u correspond to the variable
bounds, followed by m components corresponding to the general constraints.
Swaney's algorithm [34] required that x̄ be a local minimum of (1.1) and that ū and
the active sets I_A, I_A' be the corresponding Lagrange multipliers and active set. The orthant
program presented here is more general because it can be applied at any point, including
infeasible ones; when x̄ is a local minimum, it is equivalent to the earlier version.
The first step in the transformation is to replace the single variable, nonlinear
functions with underestimating quadratics. The Taylor series expansion of f_j(x_j)
about x̄ can be written as
where ζ_j(x_j) holds the higher order terms. Using the expansion, g_i can be rewritten
exactly

(1.10)

where \underline{c}_j is the largest value that underestimates f_j over the range [x_j^L, x_j^U]. With
the following definition,

(1.11)
Now the original nonlinear program has been transformed into a quadratic NLP, and
the next task is to provide bounding functions for the quadratic terms. McCormick
[14, pages 387-416] developed the convex/concave envelopes for quadratic terms,
and these can be used to provide bounding functions. However, in this context
these bounds are insufficient for two reasons explained below. These shortcomings
may be overcome by separating the bilinearities into two groups of components,
those whose gradients lie in the space spanned by the gradients of the set of active
constraints, and those whose gradients lie in the null space. This separation is
accomplished through a variable transformation.
The first problem with McCormick's envelopes is that the underestimators match
the actual quadratic functions only on the boundaries of the region, and for the
underestimating program to remain stationary when x̄ is a global minimum, there
must be no error in the underestimation at that point. This point is demonstrated
in Figures 1, 2.a, and 2.b, which show a bilinear function, its convex envelope, and
the estimation error as a function of position. Later, w and d will be used to denote
deviations from x̄, so the error at (0,0) prevents this bound from being tight at x̄.
This difficulty is addressed below in Section 1.3 by bounding each orthant separately
using piecewise convex envelopes.
Concave terms in the Lagrangian give rise to the other problem with convex en-
velopes. If piecewise convex envelopes are generated for a function which is the
sum of a convex part and a concave part, the addition of the two bounding func-
tions is not the tightest possible linear bound for the function. Figure 3 provides
a one dimensional illustration of this difficulty, which arises in multiple dimensions
through combination of bilinear terms having individual envelope functions. This
[Figures 1, 2.a and 2.b: a bilinear function of w and d, its convex envelope, and the estimation error as a function of position.]
Both of these difficulties with convex envelopes can be solved after separating the
quadratic terms into constraint space contributions and null space contributions.
To perform the necessary separation, new coordinate variables d and p are defined
for the constraint space and null space respectively. To express the relation between
d and p and x, the constraint space basis G and the null space basis N are defined
as follows:
G = [\, \ldots, \nabla g_i, \ldots, e_{i'}, \ldots \,],    i \in I_A,  i' \in I_A'    (1.12)

Note that \nabla g_i = \tilde{b}_i and e_{i'} refers to the i'-th column of an n x n identity matrix. This
basis matrix is generally not orthogonal. The null space basis N can be calculated
from G. The rows of G are permuted so that G can be partitioned G = [ G_1 ; G_2 ] with
G_1 nonsingular. N may then be taken as

N = [ -G_1^{-T} G_2^T ; I ].    (1.13)
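To make the construction concrete, a minimal numpy sketch of this null space computation is given below. It is only an illustration under stated assumptions: the function name is hypothetical, QR factorization with column pivoting is used here to select the independent rows of G (the implementation described later in this chapter uses the LUSOL LU factorization instead), and G is assumed to have full column rank.

    import numpy as np
    from scipy.linalg import qr

    def null_space_basis(G):
        # G has shape (n, m), m <= n; its columns are the gradients of the
        # active constraints and variable bounds.  Rows are reordered so the
        # leading m x m block G1 is nonsingular, and N = [-G1^{-T} G2^T; I]
        # in that ordering (cf. (1.13)), mapped back to the original order.
        n, m = G.shape
        # column pivoting on G^T picks m linearly independent rows of G
        _, _, piv = qr(G.T, pivoting=True)
        basic, nonbasic = piv[:m], piv[m:]
        G1, G2 = G[basic, :], G[nonbasic, :]
        N = np.zeros((n, n - m))
        N[basic, :] = -np.linalg.solve(G1.T, G2.T)   # -G1^{-T} G2^T
        N[nonbasic, :] = np.eye(n - m)
        return N

    # quick check on a random instance: the columns of N annihilate G^T
    G = np.random.default_rng(0).normal(size=(6, 2))
    N = null_space_basis(G)
    assert np.allclose(G.T @ N, 0.0)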
By defining

d = ( I + N(N^T N)^{-1} N^T )( x - \bar{x} )    (1.16)

and

w = 2( x - \bar{x} ) - d    (1.17)

(1.15) becomes

Then by defining

t_{jk} = p^T ( N^T e_j e_k^T N )\, p    \forall j,k \in 1, \ldots, n    (1.18)
q_{jk} = w_j d_k    \forall (j,k) \in 1, \ldots, n    (1.19)

\sum_{j,k \in 1,\ldots,n} \tilde{H}^{(i)}_{jk} ( t_{jk} + q_{jk} )    (1.20)
The qjk terms hold the constraint space terms and the cross terms, and the tjk
terms hold the null space terms. Linear bounding functions for the qjk and tjk
terms will be developed in sections 1.3 and 1.4.
Figure 2 demonstrated the problem with using the convex envelope to bound bilin-
earities when the point of interest is in the interior of the region. This problem will
be solved in this section by dividing the region into orthants and using the convex
and concave envelopes within each orthant. The envelopes for all of the orthants
will be combined into a single program in Section 2.
The quadratic terms qjk to be bounded come from the constraint space and cross
terms as defined in (1.19). If w_j \in [w_j^L, w_j^U] and d_k \in [d_k^L, d_k^U], McCormick [14]
provides the convex and concave envelopes

max\{ w_j^L d_k + d_k^L w_j - w_j^L d_k^L,  w_j^U d_k + d_k^U w_j - w_j^U d_k^U \} \le q_{jk}    (1.23)

q_{jk} \le min\{ w_j^U d_k + d_k^L w_j - w_j^U d_k^L,  w_j^L d_k + d_k^U w_j - w_j^L d_k^U \}    (1.24)

The bounds on w and d are calculated by evaluating (1.21) and (1.22) with interval
arithmetic:

d \in ( I + N(N^T N)^{-1} N^T )( [x^L, x^U] - \bar{x} )    (1.21)
w \in ( I - N(N^T N)^{-1} N^T )( [x^L, x^U] - \bar{x} )    (1.22)
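The envelopes (1.23)-(1.24) for a single bilinear term are easy to evaluate directly. The following short Python function is an illustrative sketch (the function name and arguments are ours, not from the text): given bounds on w_j and d_k it returns the tightest linear under- and overestimate of q_jk = w_j d_k at a point.

    def mccormick_bounds(w, d, wL, wU, dL, dU):
        # convex/concave envelope values of q = w*d over [wL, wU] x [dL, dU]
        under = max(wL * d + dL * w - wL * dL,   # supporting planes from the
                    wU * d + dU * w - wU * dU)   # (L, L) and (U, U) corners
        over = min(wU * d + dL * w - wU * dL,    # planes from the (U, L) and
                   wL * d + dU * w - wL * dU)    # (L, U) corners
        return under, over

    # the envelopes coincide with w*d at the corners of the box
    lo, hi = mccormick_bounds(1.0, -2.0, 0.0, 1.0, -2.0, 3.0)
    assert abs(lo - (-2.0)) < 1e-12 and abs(hi - (-2.0)) < 1e-12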
[Figure 4: piecewise convex envelope of a bilinear term obtained by bounding each orthant separately, and the corresponding estimation error, plotted against w and d.]
The next step in the bounding procedure is to split the variable region into orthants
around x̄ and then to develop convex and concave envelopes for each orthant. The
combined result of this is a piecewise convex/concave envelope. The resulting piecewise
convex envelope is demonstrated in Figure 4 along with the estimation error.
As desired, there is no estimation error at x̄.
Below are the linear bound constraints that are added to the orthant underestimating
program. σ_jk indicates the direction of support that is needed.

σ_{jk} = sign( \sum_i \bar{u}_i \tilde{H}^{(i)}_{jk} ) = sign( \bar{γ}_{jk} )    (1.25)

c^1_{jk} w_j + c^2_{jk} d_k \le σ_{jk} q_{jk}    (1.26)
c^3_{jk} w_j + c^4_{jk} d_k \le σ_{jk} q_{jk}    (1.27)

The coefficients c^1_{jk}, ..., c^4_{jk} are assigned by (1.28) when σ_{jk} = 1 and by (1.29)
when σ_{jk} = -1, with the cases distinguished by whether the orthant bound d_k^{oU}
equals d_k^U or d_k^{oL} equals d_k^L.
Constraints (1.26) and (1.27) have first order gaps in the directions of the space
spanned by the gradients of the active constraints, so Lagrange multiplier adjust-
ments will compensate for those deviations in the stationarity conditions. This point
will be demonstrated in section 1.5 where the first order optimality conditions for
the orthant program will be presented.
The remaining quadratic terms are those in the null space directions. These terms
are not bounded well by termwise convex envelopes because, as demonstrated in
Figure 3, problems with any concave terms will not be tightly bound even if the
combined function behavior is convex. The bounds for these variables are developed
from positive definite combinations of the quadratic terms, derived from the overall
behavior of the program at x̄ to make a tight bound. The quadratic terms under
consideration are defined as follows:

Given any n x n positive semidefinite (PSD) matrix γ, the following is true by
definition:

p^T N^T γ N p = p^T ( \sum_{j,k} γ_{jk} N^T e_j e_k^T N ) p    (1.30)
             = \sum_{j,k} γ_{jk} t_{jk}    (1.31)
             \ge 0    (1.32)
Legitimate constraints of this form can be written for any PSD matrix γ, but only
certain γ's will give a sufficiently tight bound to support stationarity as desired.
Writing the orthant program with a single constraint of the form (1.31) provides
some insight into the choice of γ. Below is a program with only the constraints
involving γ, and its first order optimality conditions for the t_jk variables, using
Δx = x - x̄,  Δx^U = x^U - x̄,  Δx^L = x^L - x̄:

min Δx_0
s.t.   u_i:   \bar{g}_i + \tilde{b}_i^T Δx + \sum_{j,k} \tilde{H}^{(i)}_{jk} ( t_{jk} + q_{jk} ) \le 0    \forall i
       λ:    \sum_{j,k} \bar{γ}_{jk} t_{jk} \ge 0    (1.33)

(First Order Optimality Conditions, for the t_jk variables:)

\sum_i u_i \tilde{H}^{(i)}_{jk} - λ \bar{γ}_{jk} = 0    \forall j, k    (1.34)
If x̄ is a local minimum of (1.1), the second order optimality conditions require that
\sum_i \bar{u}_i N^T \tilde{H}^{(i)} N be PSD, and choosing

\bar{γ} = \sum_i \bar{u}_i \tilde{H}^{(i)}    (1.35)

allows (1.34) to be satisfied. This choice, however, requires x̄ to be a local minimum
with the correct multipliers.
The solution to this difficulty is to generate a set of γ matrices and a corresponding
set of constraints to bound the null space quadratic terms. The set of γ's that
can accommodate the greatest allowable change in u would inscribe the space of
positive definite combinations of the H̃^{(i)}'s. It is not clear how that particular set
of constraints can be conveniently generated, so the method presented here tries
to span the largest space possible while requiring reasonable computational effort
and maintaining problem sparsity. The method takes γ̄ and perturbs it in each of
the bilinear directions which appear in the problem until the semi-definite limit is
reached. If γ̄ is not positive definite in the null space, it is adjusted with a diagonal
positive definite matrix. Here are the details of the method.
Q = N^T \bar{γ} N    (1.36)

Q is perturbed by one symmetric dyad pair at a time in each direction which appears
in the problem:

\tilde{Q} = Q - ρ_{jk} N^T ( e_j e_k^T + e_k e_j^T ) N

By realizing that at the semidefinite limit \tilde{Q} becomes singular, it is possible to
calculate a formula for the limiting value of ρ_{jk}:

( Q - ρ_{jk} N^T ( e_j e_k^T + e_k e_j^T ) N ) x = 0    for some x \ne 0    (1.37)
x = ρ_{jk} ( Q^{-1} N^T e_j e_k^T N x + Q^{-1} N^T e_k e_j^T N x )
e_k^T N x = ρ_{jk} ( e_k^T N Q^{-1} N^T e_j\, e_k^T N x + e_k^T N Q^{-1} N^T e_k\, e_j^T N x )    (1.38)
e_j^T N x = ρ_{jk} ( e_j^T N Q^{-1} N^T e_j\, e_k^T N x + e_j^T N Q^{-1} N^T e_k\, e_j^T N x )    (1.39)

Then define

η_{jk} = e_j^T N Q^{-1} N^T e_k,
a = e_k^T N x,
b = e_j^T N x.

Solving these two equations together gives the limiting values of ρ_{jk}:

ρ_{jk}^{-1} = η_{jk} \pm ( η_{jj} η_{kk} )^{1/2}    (1.40)
For each perturbation dyad jk, equation (1.40) will give either a positive and a zero
value for ρ_{jk}^{-1}, or two nonzero values of opposite signs. These two values give two
constraints via (1.30)-(1.32), each of the form

(1.41)

The two values of ρ_{jk}^{-1} are renamed separately as \underline{ρ}_{jk}^{-1} and \bar{ρ}_{jk}^{-1}, such that

\underline{ρ}_{jk}^{-1} < 0,    \bar{ρ}_{jk}^{-1} > 0.

Then the two constraints from (1.41) can be rewritten as
If x̄ is not a local minimum or the Lagrange multipliers are not correct, Q in (1.36)
is potentially indefinite. In the development above, it is necessary to factorize Q.
A modified Cholesky algorithm [30] is used, and if necessary, a diagonal matrix is
added to make Q positive definite. γ̄ is modified to incorporate the adjustment to Q.
Given a diagonal adjustment E to Q, the equivalent adjustment to γ̄ is determined
so that

N^T \bar{γ} N = Q + E    (1.44)

Constraints (1.42) and (1.43) are included in the orthant program for every pair
jk that is used in the problem or added by the diagonal adjustment. When
\| H^{(i)} - \tilde{H}^{(i)} \| and \| u - \bar{u} \| are not too large, this set of constraints derived from
perturbing Q in each direction is sufficient to support the first order optimality
conditions of (1.1).
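A small numpy sketch of this perturbation step is given below for illustration. All names are ours; γ̄ is assumed to have been assembled already as Σ_i ū_i H̃^{(i)}, a crude eigenvalue shift stands in for the modified Cholesky factorization of [30], and the limiting values are computed from the η quantities defined above.

    import numpy as np

    def limiting_rho(gamma_bar, N, pairs, tol=1e-8):
        # Q = N^T gamma_bar N, shifted if necessary so it is positive definite
        Q = N.T @ gamma_bar @ N
        lam_min = np.linalg.eigvalsh(Q).min()
        if lam_min < tol:                         # stand-in for modified Cholesky
            Q = Q + (tol - lam_min) * np.eye(Q.shape[0])
        # eta[j, k] = e_j^T N Q^{-1} N^T e_k
        eta = N @ np.linalg.solve(Q, N.T)
        rho_inv = {}
        for j, k in pairs:                        # one dyad pair per bilinear direction
            root = np.sqrt(eta[j, j] * eta[k, k])
            rho_inv[(j, k)] = (eta[j, k] - root, eta[j, k] + root)
        return Q, rho_inv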
With the pieces derived, the complete orthant program can be written. This orthant
program is useful as an intermediate result, though the 2n orthants are too many
to be used directly in an algorithm. This combinatorial difficulty is solved by the
covering program presented in the next section.
It is illuminating to compare the original quadratic NLP with the orthant program
and to compare their first order optimality conditions.
For the most part, the programs are very similar, differing only in how the quadratic
variables are treated, and if Δx = 0 satisfies the original program's optimality
conditions, Δx = 0 will also satisfy the optimality conditions of the orthant program
if the region is small enough. The changes in (1.48) create differences in (1.58),
(1.60), and (1.61). Constraint (1.58) can be satisfied by adjusted values of the
associated multipliers as long as sign(\bar{γ}_{jk}) = sign(\sum_i u_i \tilde{H}^{(i)}_{jk}). These changes will ultimately
require changes in u via the variables π and β.

The changes in (1.47) create differences in (1.57) and (1.59). When Δx = 0, constraints
(1.59) are identical because p = 0, so that will not cause the orthant program
to have a different solution. The orthant version of (1.57) may be satisfied by some
combination of the corresponding multipliers if \sum_i u_i \tilde{H}^{(i)} is in the space of positive
semi-definite matrices spanned by the set of γ developed above, which will be true
when \| \bar{γ} - \sum_i u_i \tilde{H}^{(i)} \| is not too large.
The orthant program derived in the previous section may be impractical as a means
of obtaining a bound on a region because it implies the solving of 2n subproblems.
In this section, an interval-based relaxation will be developed and then applied to
the set of 2n orthant programs to give a single program to solve for a lower bound
on a region.
In the orthant program, only constraints (1.48.A) and (1.48.B) and variable bounds
(1.53) and (1.54) are affected by the choice of orthant. To extend the orthant
program into a single program for the whole variable domain, the variable bounds
can be extended to their whole ranges, and the coefficients in constraints (1.48.A)
and (1.48.B) can be replaced with intervals.
For the derivation that follows, it is necessary to develop a set of linear constraints
from a linear system of equations with interval coefficients. Given a linear system

A u = c    (1.63)
u \ge 0

with A \in [\underline{A}, \bar{A}], an interval matrix, the goal is to develop the tightest set of linear
constraints that can be written to limit the range of u \in [\underline{u}, \bar{u}]. (The motivation here
is to be able to show \underline{u} \ge 0 in order to demonstrate stationarity.) Use of LP and
linear constraints to approximate linear systems with interval coefficients has been
studied before [2, 3, 19, 20]. The method presented here includes the constraints
of this previous work plus an additional constraint to increase the tightness of the
bounds on u. Neumaier [18] presents other interval methods for solving this problem,
but these methods are not applicable for the method presented here because
they cannot be used within an LP.
For any variable x, the notations x^{(+)} and x^{(-)} will refer to its positive and negative
parts defined as

x^{(+)} = max\{0, x\}
x^{(-)} = max\{0, -x\}
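As a small illustration of how the positive and negative parts are used, the sketch below (with hypothetical names, not taken from the text) bounds a single row sum Σ_j A_ij u_j over all A_ij in the interval [A̲_ij, Ā_ij], given u_j in [u̲_j, ū_j] with u ≥ 0.

    def pos(x):
        return max(0.0, x)

    def neg(x):
        return max(0.0, -x)

    def row_sum_bounds(A_lo, A_hi, u_lo, u_hi):
        # interval [lo, hi] containing sum_j A_j*u_j for all A_j in [A_lo[j], A_hi[j]]
        # and u_j in [u_lo[j], u_hi[j]], assuming u_j >= 0
        lo = sum(pos(al) * ul - neg(al) * uh
                 for al, ul, uh in zip(A_lo, u_lo, u_hi))
        hi = sum(pos(ah) * uh - neg(ah) * ul
                 for ah, ul, uh in zip(A_hi, u_lo, u_hi))
        return lo, hi

    # e.g. coefficients in [-1, 2] and [1, 3], u in [0, 1] x [0, 2] gives (-1.0, 8.0)
    print(row_sum_bounds([-1.0, 1.0], [2.0, 3.0], [0.0, 0.0], [1.0, 2.0]))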
Consider a row i of (1.63), and choose a particular multiplier, u_{r(i)}. Under the
condition that u \ge 0, the following may be used to obtain one limit of a valid
interval [\underline{u}_{r(i)}, \bar{u}_{r(i)}] containing the value of u_{r(i)} for some solution of (1.63) for all
A_{i r(i)} \in [\underline{A}_{i r(i)}, \bar{A}_{i r(i)}]:

A_{i r(i)}\, u_{r(i)} \le c_i - \sum_{j \ne r(i)} ( \bar{A}_{ij}^{(+)} \bar{u}_j - \bar{A}_{ij}^{(-)} \underline{u}_j )    (1.64)

One method for selecting r(i) will be described in the following section. By choosing
the value of A_{i r(i)} which maximizes the left hand side and the value of u_{r(i)} which
minimizes it, the following is obtained:

(1.65)

for all A_{i r(i)} \in [\underline{A}_{i r(i)}, \bar{A}_{i r(i)}] and some u_{r(i)}. By choosing the A_{i r(i)} which minimizes
the left hand side and the value of u_{r(i)} which maximizes it and performing
similar rearrangements, the following is similarly obtained:

\sum_j ( \underline{A}_{ij}^{(+)} \bar{u}_j - \bar{A}_{ij}^{(-)} \underline{u}_j ) - |\underline{A}_{i r(i)}| ( \bar{u}_{r(i)} - \underline{u}_{r(i)} ) \le -c_i    (1.68)
Combining (1.66) and (1.68) for all i gives a set of constraints which can be used
to determine intervals [\underline{u}, \bar{u}] containing the values of the solution u to (1.63) for
A \in [\underline{A}, \bar{A}]. These constraints will be used below to develop the underestimating
linear program.
The next row index i_next is selected by (1.71) as the unassigned row that maximizes,
over the unassigned columns j, the smaller of the attainable magnitudes of the
products A_{ij} u_j.    (1.71)
r(i) is determined by calculating i_next and then calculating r(i_next). Let s(j) be
defined as the inverse function of r(i) such that

i = s(r(i)),    j = r(s(j))
Now that constraints for an interval linear system have been developed, it is possible
to develop an underestimating program for a linear program that has an interval
coefficient matrix. Given an LP of the form:

with A \in [\underline{A}, \bar{A}] \subset R^{m x n}, the set of all m x n matrices with interval coefficients.
The dual of this LP is:
Constraint (1.75) is an interval linear system, so the bounding constraints developed
in Section 2.1 above can be applied to it. Applying these constraints leads to the
following linear program

\sum_j ( \bar{A}_{ji}^{(+)} \bar{u}_j - \bar{A}_{ji}^{(-)} \underline{u}_j ) - |\bar{A}_{r(i) i}| ( \bar{u}_{r(i)} - \underline{u}_{r(i)} ) \le -c_i    \forall i    (1.79)
\underline{u} \ge 0    (1.80)
\bar{u} - \underline{u} \ge 0    (1.81)

Constraints (1.78) and (1.79) define restrictions on \underline{u}, \bar{u} such that they describe
a valid interval containing the solution to (1.74) for all A \in [\underline{A}, \bar{A}]. The interval
[\underline{u}, \bar{u}] so defined may overestimate the true range of u values in (1.74), i.e. these
constraints may be somewhat overrestrictive in that role. Also, (1.80) is introduced
The linear program obtained in this way minimizes an objective assembled from the
positive and negative parts of the interval coefficients; for each row i it contains the
equality constraints (1.83) and (1.84), which define y_i and z_i, together with the
nonnegativity conditions

x^{(+)} \ge 0    (1.85)
x^{(-)} \ge 0    (1.86)
z \ge 0    (1.87)
y \ge 0    (1.88)

Equation (1.83) can be used to eliminate y, and z converts (1.84) into an inequality.
s.t.  \sum_j ( ( \underline{A}_{ij}^{(+)} - \underline{A}_{ij}^{(-)} ) x_j^{(+)} - ( \bar{A}_{ij}^{(+)} - \bar{A}_{ij}^{(-)} ) x_j^{(-)} ) \le b_i    \forall i    (1.90)

      |\bar{A}_{i s(i)}| x_{s(i)}^{(+)} + |\bar{A}_{i s(i)}| x_{s(i)}^{(-)} - \sum_j ( \underline{A}_{ij}^{(-)} x_j^{(+)} + \bar{A}_{ij}^{(+)} x_j^{(-)} ) \le b_i^{(+)}    \forall i    (1.91)

      x^{(+)} \ge 0    (1.92)
      x^{(-)} \ge 0    (1.93)
The choice of the objective function (1.77) for the restricted dual causes the right
hand side of (1.90) to be equivalent to the constraints in the original LP, and it
makes (1.91) as tight as it can be.
Given any x feasible in the original LP, the choice

x^{(+)} = max\{0, x\},    x^{(-)} = max\{0, -x\}    (1.94)

is feasible in (1.89). This choice of x^{(+)} and x^{(-)} clearly satisfies the positivity
constraints (1.92) and (1.93). The following shows that (1.90) is satisfied:

\underline{A} x^{(+)} - \bar{A} x^{(-)} = -\underbrace{( A - \underline{A} ) x^{(+)}}_{\ge 0} - \underbrace{( \bar{A} - A ) x^{(-)}}_{\ge 0} + \underbrace{A x}_{\le b} \le b
To justify the right hand side of (1.91) and to illustrate that it is satisfied, it is
useful to consider a simple constraint

x_j \le b_i

To prove that (1.91) is satisfied by x, two cases need to be considered. In the first
case, s(i) = 0 or A_{i s(i)} = 0, so (1.91) is reduced to

\le b_i^{(+)}

In this case, this constraint is redundant, so it need not be included when solving
the underestimating LP.
In the remaining case, s(i) \ne 0 and A_{i s(i)} \ne 0. From above, the following is true
for row i:

b_i^{(+)} \ge b_i \ge \underline{A}_i x^{(+)} - \bar{A}_i x^{(-)} = \sum_j ( ( \underline{A}_{ij}^{(+)} - \underline{A}_{ij}^{(-)} ) x_j^{(+)} - ( \bar{A}_{ij}^{(+)} - \bar{A}_{ij}^{(-)} ) x_j^{(-)} )

Because of the (+) and (-) designations and the fact that \bar{A} \ge \underline{A}, only one of
the four products involving A_{i s(i)} and x_{s(i)} may be nonzero. This constraint must
be satisfied for each of these four terms alone. The third and fourth terms are
satisfied trivially, and the first and second terms give the following constraint which
is equivalent to (1.91).
The constraints (1.48.A) and (1.48.B) are the only general constraints to depend on
the choice of orthant. The values of c^1_{jk} and c^2_{jk} are determined by equations (1.28)
and (1.29). If σ_{jk} = 1, c^1_{jk} \in [d_k^L, d_k^U] and c^2_{jk} \in [w_j^L, w_j^U]. The convex and concave
envelopes of these quadratic terms can be written as (1.99) and (1.100). The simple
variable bounds on w and d become (1.111) and (1.112).
The general McCormick convex and concave envelopes for Δx_j Δx_k are added to
the program to give a bound which will work if the other bounding functions fail
to keep the LP stationary at Δx = 0. This requires the addition of constraints
(1.101)-(1.104).
(1.97)
\forall i    (1.98)
0 \le [d_k^L, d_k^U]\, w_j - q_{jk} \le 0    \forall j,k    (1.99)
0 \le [w_j^L, w_j^U]\, d_k - q_{jk} \le 0    \forall j,k    (1.100)
Δx_j^L Δx_k + Δx_k^L Δx_j - \tfrac{1}{4}( q_{jk} + t_{jk} + q_{kj} + t_{kj} ) \le Δx_j^L Δx_k^L    \forall j,k    (1.101)
Δx_j^U Δx_k + Δx_k^U Δx_j - \tfrac{1}{4}( q_{jk} + t_{jk} + q_{kj} + t_{kj} ) \le Δx_j^U Δx_k^U    \forall j,k    (1.102)
Δx_j^U Δx_k + Δx_k^L Δx_j - \tfrac{1}{4}( q_{jk} + t_{jk} + q_{kj} + t_{kj} ) \ge Δx_j^U Δx_k^L    \forall j,k    (1.103)
Δx_j^L Δx_k + Δx_k^U Δx_j - \tfrac{1}{4}( q_{jk} + t_{jk} + q_{kj} + t_{kj} ) \ge Δx_j^L Δx_k^U    \forall j,k    (1.104)
0 \le ( t_{jk} + t_{kj} ) + |\underline{ρ}_{jk}^{-1}| \sum_{r,s} \bar{γ}_{rs} t_{rs}    \forall j,k    (1.105)
\forall j,k    (1.106)

(1.113)
\forall i    (1.114)
0 \le [d_k^L, d_k^U]\, e_j^T ( I + N(N^T N)^{-1} N^T ) Δx - q_{jk} \le 0    \forall j,k    (1.115)
0 \le [w_j^L, w_j^U]\, e_j^T ( I - N(N^T N)^{-1} N^T ) Δx - q_{jk} \le 0    \forall j,k    (1.116)
Δx_j^L Δx_k + Δx_k^L Δx_j - \tfrac{1}{4}( q_{jk} + t_{jk} + q_{kj} + t_{kj} ) \le Δx_j^L Δx_k^L    \forall j,k    (1.117)
Δx_j^U Δx_k + Δx_k^U Δx_j - \tfrac{1}{4}( q_{jk} + t_{jk} + q_{kj} + t_{kj} ) \le Δx_j^U Δx_k^U    \forall j,k    (1.118)
Δx_j^U Δx_k + Δx_k^L Δx_j - \tfrac{1}{4}( q_{jk} + t_{jk} + q_{kj} + t_{kj} ) \ge Δx_j^U Δx_k^L    \forall j,k    (1.119)
Δx_j^L Δx_k + Δx_k^U Δx_j - \tfrac{1}{4}( q_{jk} + t_{jk} + q_{kj} + t_{kj} ) \ge Δx_j^L Δx_k^U    \forall j,k    (1.120)
0 \le ( t_{jk} + t_{kj} ) + |\underline{ρ}_{jk}^{-1}| \sum_{r,s} \bar{γ}_{rs} t_{rs}    \forall j,k    (1.121)
\forall j,k    (1.122)
(1.123)
When the interval LP relaxation derived above is applied to this LP, it is able
to verify the global minimum for some finite region when there is no constraint
gradient null space. However, it is unable to verify the global minimum for finite
sized regions when there is a null space. This problem can be solved by introducing
a new variable z to deal with the rank deficiency of G.

z = ( I + [\,0\ \ N\,] ) Δx    (1.126)

because \tilde{b}_i^T N = 0 for i \in I_A. Second, the components of z and Δx in the constraint
space are identical.

Introducing z into the LP and replacing Δx with it where possible tightens the
bounds produced by the covering program.

\forall j,k    (1.138)

z = ( I + [\,0\ \ N\,] ) Δx    (1.139)
Δx^L \le Δx \le Δx^U    (1.140)
z^L \le z \le z^U    (1.141)
q^L \le q \le q^U    (1.142)
t^L \le t \le t^U    (1.143)

where

z_j^L = z_j^U = 0    \forall j \notin J
[ q_{jk}^L, q_{jk}^U ] = [ w_j^L, w_j^U ] x [ d_k^L, d_k^U ]    \forall j,k
[ t_{jk}^L, t_{jk}^U ] = ( e_j^T N(N^T N)^{-1} N^T [Δx^L, Δx^U] ) x ( e_k^T N(N^T N)^{-1} N^T [Δx^L, Δx^U] )
Applying the interval LP relaxation to this program gives the end result:

Covering Program

min  c^T ( Δx^{(+)} - Δx^{(-)} )    (1.144)
s.t.  \tilde{b}_i^T ( z^{(+)} - z^{(-)} ) + \sum_{j,k} \tilde{H}^{(i)}_{jk} ( t_{jk}^{(+)} - t_{jk}^{(-)} + q_{jk}^{(+)} - q_{jk}^{(-)} ) \le 0    \forall i \in I_A    (1.145)
      \forall i \notin I_A    (1.146)
      \forall j,k    (1.147)
      \forall j,k    (1.148)
      \forall j,k    (1.149)
      \forall j,k    (1.150)
      \forall j,k    (1.151)
      \forall j,k    (1.152)
      \forall j,k    (1.153)
      \forall j,k    (1.154)
      \forall j,k    (1.155)
      0 \le -2 ( t_{jk}^{(+)} - t_{jk}^{(-)} ) + |\bar{ρ}_{jk}^{-1}| \sum_{r,s} \bar{γ}_{rs} ( t_{rs}^{(+)} - t_{rs}^{(-)} )    \forall j,k    (1.156)
      z^{(+)} - z^{(-)} = ( I + [\,0\ \ N\,] ) ( Δx^{(+)} - Δx^{(-)} )    (1.157)
      Δx^L \le Δx^{(+)} - Δx^{(-)} \le Δx^U    (1.158)
      z^L \le z^{(+)} - z^{(-)} \le z^U    (1.159)
      q^L \le q^{(+)} - q^{(-)} \le q^U    (1.160)
      t^L \le t^{(+)} - t^{(-)} \le t^U    (1.161)
\tilde{b}_{i s(i)}^{(+)} z_{s(i)}^{(+)} + \tilde{b}_{i s(i)}^{(-)} z_{s(i)}^{(-)} - \sum_{j \ne s(i)} ( \tilde{b}_{ij}^{(-)} z_j^{(+)} + \tilde{b}_{ij}^{(+)} z_j^{(-)} )
  - \sum_{j,k} ( \tilde{H}^{(i)(-)}_{jk} ( t_{jk}^{(+)} + q_{jk}^{(+)} ) + \tilde{H}^{(i)(+)}_{jk} ( t_{jk}^{(-)} + q_{jk}^{(-)} ) ) \le 0    \forall i \in I_A    (1.162)

-z_i^{(-)} + ( I + [\,0\ \ N\,] )_{ii}^{(+)} Δx_i^{(+)} + ( I + [\,0\ \ N\,] )_{ii}^{(-)} Δx_i^{(-)}
  - \sum_{j \ne i} ( ( I + [\,0\ \ N\,] )_{ij}^{(-)} Δx_j^{(+)} + ( I + [\,0\ \ N\,] )_{ij}^{(+)} Δx_j^{(-)} ) \ge 0    \forall i \in J    (1.163)

q_{jk}^{(-)} - \bar{h}_{jk}^{(-)T} z^{(+)} - \bar{h}_{jk}^{(+)T} z^{(-)} \le 0    \forall (j,k) \in U    (1.164)
q_{jk}^{(+)} - \bar{h}_{jk}^{(+)T} z^{(+)} - \bar{h}_{jk}^{(-)T} z^{(-)} \le 0    \forall (j,k) \notin U    (1.165)

( 2 + |\underline{ρ}_{jk}^{-1}| \bar{γ}_{jk} )^{(-)} t_{jk}^{(+)} + ( 2 + |\underline{ρ}_{jk}^{-1}| \bar{γ}_{jk} )^{(+)} t_{jk}^{(-)}
  - \sum_{r,s \ne j,k} ( ( |\underline{ρ}_{jk}^{-1}| \bar{γ}_{rs} )^{(+)} t_{rs}^{(+)} + ( |\underline{ρ}_{jk}^{-1}| \bar{γ}_{rs} )^{(-)} t_{rs}^{(-)} ) \ge 0    \forall (j,k) \in C    (1.166)

( -2 + |\bar{ρ}_{jk}^{-1}| \bar{γ}_{jk} )^{(-)} t_{jk}^{(+)} + ( -2 + |\bar{ρ}_{jk}^{-1}| \bar{γ}_{jk} )^{(+)} t_{jk}^{(-)}
  - \sum_{r,s \ne j,k} ( ( |\bar{ρ}_{jk}^{-1}| \bar{γ}_{rs} )^{(+)} t_{rs}^{(+)} + ( |\bar{ρ}_{jk}^{-1}| \bar{γ}_{rs} )^{(-)} t_{rs}^{(-)} ) \ge 0    \forall (j,k) \notin C    (1.167)
where
The interval relaxation also includes a constraint for each variable of the form (1.91).
For each variable, a particular constraint is chosen to make an extra constraint to
tighten the bound. Constraints (1.162) are the constraints for the z variables, and
s(i) is defined by the pivot sequence used in LU factoring G. The x constraints
(1.163) are taken from the equations (1.139) that relate x and z, and the q_jk constraints
(1.164) and (1.165) are taken from the constraint (1.132) using σ_jk to
indicate whether the upper or lower bound will be active. Lastly, the t constraints
(1.166) and (1.167) are chosen from (1.137) and (1.138) using the constraint with
the largest coefficient on t_jk.
When there is no null space, the interval LP can be simplified to the following:

where

[ q_{jk}^L, q_{jk}^U ] = [ Δx_j^L, Δx_j^U ] x [ Δx_k^L, Δx_k^U ]

Applying the interval LP relaxation gives the following useful result:
The above give two forms of the covering program available to calculate a lower
bound for the NLP over a range of variables. LP (1.177) is much smaller and is
used when the solution point is completely constrained, while the larger (1.144) is
used when there is a constraint null space. This decision is based on the results
from a previous solution. In addition, the solution to these gives a step for the x
variables, Δx, which will be used in a line search in the algorithm.

One problematic feature of the null space covering program is that there is no
constraint relating t_jk to Δx except for the McCormick envelopes (1.151)-(1.154).
When the program can reduce its objective function by moving away from x̄ in the
direction of the null space, McCormick's constraints and variable bounds are the
only ones that keep it from moving infinitely in the null space direction, and these
constraints do not give a Δx which will lead the iterative algorithm to the global
minimum. In this section, a constraint which can be added to the covering program
to limit motion in the null space will be developed. It is designed to try to produce
a Newton step in the null space.
Assuming that the active set is correct, the LU factors of G used in calculating N
may also be used to calculate the Newton step in the constraint space:

d^N = -[ G_1^{-T} ; 0 ]\, \bar{g}_A    (1.191)

for all p. Choosing the correct values of p̄ will produce a constraint which is valid
and can give a Newton step in the null space direction.
( p - \bar{p} )^T Q ( p - \bar{p} ) \ge 0    (1.199)
p^T Q p - 2 \bar{p}^T Q p + \bar{p}^T Q \bar{p} \ge 0,    where    p^T Q p = \sum_{ij} \bar{γ}_{ij} t_{ij}    (1.200)
\sum_{ij} \bar{γ}_{ij} t_{ij} \ge 0    (1.201)

The goal is to choose a set of p̄'s such that when constraints (1.200) and (1.201) are
active, the Newton step in the null space is given by (1.196), so (1.200) and (1.201)
will be treated like equalities below. By scaling each row of (1.196) by -α_m, the
value of p̄ which makes (1.200) equivalent to a row of (1.196) when (1.201) is active
can be determined:

-α_m e_m^T Q p = α_m e_m^T N^T ( \nabla f(x_k) + \bar{γ} d^N )    (1.202)
-2 \bar{p}^T Q p = -\bar{p}^T Q \bar{p}    (1.203)
\bar{p} = \tfrac{α_m}{2} e_m    (1.204)
α_m = \frac{ -4\, e_m^T N^T ( \nabla f(x_k) + \bar{γ} d^N ) }{ Q_{mm} }    (1.205)

These values of α and p̄ give a constraint of the following form for each null space
dimension.
Given a problem in factorable form (1.1), the following index sets are defined:

K is the index set required for q_jk, and T is the initial index set required for t_jk.
The cardinality of a set is denoted with the notation | . |. The number of active
constraints is |I_A| + |I_A'|, and the null space has dimension n - (|I_A| + |I_A'|). In
Section 1.4, some additional quadratic terms may be added to adjust the Q matrix;
let a be the number of quadratic terms added in this step. It can be shown that
a \le n - (|I_A| + |I_A'|).

For the first form of the covering program (1.144), the number of variables and rows
can be calculated and bounded as follows:
[Figure 5: number of extra constraints in the covering program versus the number of problem variables n, for the test problems and for the worst case (log-log scale).]
The second form of the covering program (1.177) always has the same number of
variables and rows because it does not depend on the null space. The number of
variables and rows required is given by

A completely dense problem presents the worst case for covering program size,
giving |K| = n^2 and |T| = \tfrac{1}{2} n(n+1). Figure 5 shows a comparison of the number
of extra constraints (in addition to the m original constraints) required in the null
space version of the covering program applied to the test problems used in the next
chapter and the worst case behavior. The slope for the line of best fit through the
test points is 1.03, indicating linear growth in the number of extra constraints with
respect to the number of original problem variables.
3 CONCLUSIONS
The size of the covering program is a linear function of the number of variables,
number of original constraints, and the number of unique nonlinear terms appearing
quadratically or in single variable functions. For dense problems, the number of
bilinear terms is proportional to n^2, so the covering program may be too large
for conventional solvers. However, the majority of engineering problems will be
sparse, and for these the number of additional constraints grows linearly with n, as
suggested by the test problem set.
Acknowledgements
This work was supported by the Computational Science Graduate Fellowship Pro-
gram of the Office of Scientific Computing in the Department of Energy. The Na-
tional Science Foundation also provided partial support under grant DDM-8619582.
REFERENCES
[25] H. Ratschek and J. Rokne. New Computer Methods For Global Optimization.
Mathematics and its Applications. Ellis Horwood Limited, 1988.
[26] H. Ratschek and J. Rokne. Interval tools for global optimization. Computers
and Mathematics with Applications, 21(6/7):41-50, 1991.
[27] H. Ratschek and R. L. Voller. What can interval analysis do for global opti-
mization? Journal of Global Optimization, 1:111-130, 1991.
[28] G. V. Reklaitis, A. Ravindran, and K. M. Ragsdell. Engineering Optimization:
Methods and Applications. John Wiley & Sons, Inc., 1983.
[29] H. S. Ryoo and N. V. Sahinidis. Global optimization of nonconvex NLPs
and MINLPs with applications in process design. Computers and Chemical
Engineering, 19(5):551-566, 1995.
This chapter presents a branch and bound algorithm for global solution of nonconvex
nonlinear programs. The algorithm utilizes the covering program developed in
the previous chapter to compute bounds over rectangular domain partitions. An
adaptive rectangular partitioning strategy is employed to locate and verify a global
solution.
Two versions of the algorithm are presented which differ in how they search for
feasible points. Version 1 uses MINOS 5.4 [4] to search for local minima. Version
2 employs an iterative strategy using a search direction obtained from the covering
program as well as an approximate Newton step.
Section 1 describes the algorithm in detail. Section 2 reports our results in applying
the algorithm to a number of test problems and engineering design problems, and
we give some brief conclusions in Section 3.
1 THE ALGORITHM
The algorithm uses the covering program to calculate a lower bound for the NLP,
and uses the objective function at feasible points as an upper bound on the solution
of the problem. When the lower bound for the NLP equals the upper bound, the
problem is solved. If the algorithm is applied to a problem without a solution, the
algorithm determines this by eliminating all subsets of the variable domain because
of infeasibility. The overall goal during the algorithm is to increase the lower bound
and decrease the upper bound until they meet.
In difficult problems, the covering program applied to the original domain will
provide a lower bound significantly lower than the global optimum, so the branch
and bound algorithm must resort to an adaptive domain partitioning strategy to
increase the lower bound. The original variable domain is split into a list of subsets
which are bounded separately. The lower bound for the NLP is the least lower
bound of all of the regions in the list. Any member of the list can be removed if its
lower bound is greater than or equal to the current upper bound or if its covering
program is infeasible. The algorithm proceeds by bounding the region with the
least lower bound, and if it cannot be ruled out, it is removed from the list and split
into two subsets which are added to the list. As the region sizes decrease, the lower
bound increases, until all other regions can be ruled out and the global optimum is
verified, or all of the regions are infeasible.
Decreases in the upper bound are obtained in the course of searching for better
feasible points. The program stores the best feasible point and its objective function
value. Each time the problem constraints are evaluated at a point, the program
checks to see if it is a better feasible point than the stored one, and if it is, it replaces
the stored one. The systematic search for better feasible points is accomplished
by the region analysis procedure. Two different versions of the region analysis
procedure have been developed for comparison. The first is a modification of the
algorithm presented by Swaney [6] and uses MINOS 5.4 [4] to search for local
minima of the NLP. The second uses a line search in the direction provided by the
covering program solution, and the direction provided by a Newton step if available,
to search for new feasible points.
The algorithm presentation below will be organized in three main parts: the branch
and bound loop, the variable splitting method, and the region analysis procedure.
The branch and bound loop is the manager for the algorithm. It organizes and
maintains the region list, and it makes the choice of which region to analyze next.
It keeps track of the current upper and lower bounds, and prunes the region list
when needed. All of the other procedures operate under its control.
One of the primary functions of the branch and bound loop is to manage the region
list. Each element in the region list contains the specification of a region describing
the lower and upper bounds for each variable and a lower bound on the objective
function within that region. This list is kept sorted in order of increasing lower
bounds, and the next region to be analyzed is always taken from the top of the
list. This sorting and region selection corresponds to analyzing the region with
the lowest lower bound first. The algorithm has also been operated in a highest
lower bound first manner, in a last in first out (LIFO) manner, and first in first out
(FIFO) manner. However, the lowest lower bound appears to be the best based on
tests conducted early in the algorithm development.
The region list is initialized with the variable bounds from the problem specification,
and the best objective function value is initialized to a large value. The algorithm
terminates when the region list is empty.
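A minimal skeleton of this loop is sketched below. The helper functions lower_bound (the covering program), analyze_region (the region analysis procedure, which may improve the incumbent) and split are hypothetical placeholders, and the sketch omits many details of the actual implementation.

    import heapq

    def branch_and_bound(root, lower_bound, analyze_region, split, tol=1e-4):
        # best-first branch and bound over rectangular regions
        incumbent, best_point = float("inf"), None
        heap, counter = [], 0
        root_bound = lower_bound(root)
        if root_bound is not None:                 # None signals an infeasible covering program
            heap.append((root_bound, counter, root))
        while heap:
            bound, _, region = heapq.heappop(heap)
            if bound >= incumbent - tol:
                break                              # all remaining regions have larger bounds
            value, point = analyze_region(region)  # may locate a better feasible point
            if value is not None and value < incumbent:
                incumbent, best_point = value, point
                heap = [e for e in heap if e[0] < incumbent - tol]   # prune the list
                heapq.heapify(heap)
            for child in split(region):            # split into two subregions
                child_bound = lower_bound(child)
                if child_bound is None:
                    continue                       # infeasible region is discarded
                child_bound = max(child_bound, bound)   # a child is never looser than its parent
                if child_bound < incumbent - tol:
                    counter += 1
                    heapq.heappush(heap, (child_bound, counter, child))
        return incumbent, best_point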
The procedures that evaluate the objective function and constraints keep track of
the best feasible point. Each time the constraints are evaluated, the procedure
checks if the point is feasible (not violating any of the constraints by more than
ε^{0.55} = 2.4578 x 10^{-9}) and if it is better than the current best point. The evaluation
procedures also store the gradient of the constraints at the best feasible point to
avoid recalculating them. The branch and bound loop can poll the evaluation
routines to obtain the best objective function value, the best point, and the number
of feasible points found.
The objective function value of the best feasible point can be used to remove el-
ements from the region list. Any region list element with a lower bound greater
than or equal to the current best objective function value cannot contain a better
feasible point, so it is removed from the region list (pruned).
The covering program requires a point around which to construct its bounds. In this
step, the algorithm first checks if the best feasible point is contained in the region
of interest; if so, it chooses the best feasible point. Otherwise, the algorithm adapts
the point used in the previous iteration (the "current point"). The algorithm checks
if each component Xi of the current point is inside the bounds for that variable in the
current region. If it is inside the bounds, its value remains unchanged; otherwise,
Xi is assigned the value of the average of the lower and upper bounds for Xi. If the
current point is inside the region, region analysis can be started without having to
reevaluate the objective function, constraints, and the constraint gradients.
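A sketch of this point-selection rule (with hypothetical argument names) is:

    def select_expansion_point(lo, hi, current, best_feasible=None):
        # use the best known feasible point if it lies inside the region
        if best_feasible is not None and all(
                l <= b <= h for l, b, h in zip(lo, best_feasible, hi)):
            return list(best_feasible)
        # otherwise keep each in-bounds component of the previous point and
        # replace any out-of-bounds component by the midpoint of its bounds
        return [x if l <= x <= h else 0.5 * (l + h)
                for l, x, h in zip(lo, current, hi)]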
When a new feasible point is found, the algorithm checks each element in the region
list to see if it can be pruned. Any region whose bound is greater than the current
upper bound can be removed from the list. For finite precision mathematics, a region
is pruned if (upper bound - lower bound) \le 1 x 10^{-4}. The problems are scaled using
a heuristic method described below which attempts to give the objective function an
order of magnitude of 10^0. This pruning criterion guarantees the objective function
to a high enough tolerance for most engineering applications. However, problems
with many local minima with objective function values very close to each other may
require a tighter tolerance.
The method for choosing which variable to split and where it should be split has a
large effect on the performance of the overall algorithm. Each time the algorithm
splits it creates two more regions that may need to be stored and processed or
pruned, so it is important that each split be chosen judiciously. Otherwise, the
algorithm may just generate more work without improving the lower bound. The
goal of the method presented here is to choose the variable which will most greatly
improve the objective function bound.
The method used is analogous to the method employed in [6]. The starting idea is
to choose the variable whose bounds most greatly affect the value of the covering
program objective function. The effect on the objective function of changes in
the bounds is estimated from sensitivity analysis of the solution to the covering
program. Given the following program which depends on certain parameters a
min r.p(x)
x
s.t. g(x, a) ~ 0
and the solution r.p* = r.p(x*), the effect of a change in the parameters, 6.a, can be
estimated as follows:
6.r.p*~U*T 8g I 6.a (2.1)
8aT x=x·
where u* are the Lagrange multipliers at the solution. For the variable splitting
procedure, the parameters are the variable bounds and the program is either (1.144)
or (1.177). For the null space program, constraints (1.145-1.146) are affected by
The process of splitting takes place in two steps. First the variable to split is chosen,
and then the actual location to split at is chosen. In the first step, it is necessary
to choose a hypothetical location for the split to occur. For each variable, a point
in its domain, x_i^s, is chosen and the effects of changing the bounds from [x_i^L, x_i^U] to
[x_i^L, x_i^s] and [x_i^s, x_i^U] are estimated. x_i^s is chosen in the following manner:

x_i^s = \bar{x}_i    if x_i^L < \bar{x}_i < x_i^U
x_i^s = \bar{x}_i + Δx_i    otherwise, if x_i^L < \bar{x}_i + Δx_i < x_i^U    (2.2)
x_i^s = \tfrac{1}{2}( x_i^L + x_i^U )    if neither of the above is satisfied

where Δx_i comes from the solution of the appropriate covering program.
The effects of changing each variable's lower and upper bound are estimated and
labeled Δ_i^L and Δ_i^U respectively. For purposes of choosing the variable to split, it
is better to reduce the contribution of the simple variable bounds on the x and z
variables, constraints (1.158)-(1.159) or (1.185)-(1.186), to a small percentage of their
contribution. The goal of splitting is to improve the bounds on the quadratic terms,
and the effect of simple variable bounds should be secondary. The results presented
in Section 2 are for an algorithm using 1% of the contribution from the simple
variable bounds. The estimates with the reduced contribution from the simple
variable bounds are called Δ_i^{L'} and Δ_i^{U'}. The variable to split on is chosen by the
following:

l = arg max_i \tfrac{1}{2}( Δ_i^{L'} + Δ_i^{U'} )    (2.3)

If Δ_l^{L'} and Δ_l^{U'} should be zero, the nonlinear variable with the widest domain is
bisected, and Step 5 ends, causing the algorithm to continue with Step 1.
When Δ_l^{L'} and Δ_l^{U'} are not zero, the algorithm now switches to the task of choosing
the location of the split. If an upper bound for the problem is available, the
algorithm attempts to determine a split which will result in one region that can be
eliminated by a lower bound increase and one region that still might contain the
global minimum. This procedure may fail if the estimated bound increase is not
large enough to predict that part of the region may be eliminated, or if it requires
a division by a number near zero. If this procedure should fail or if an upper bound
is not available, the algorithm will split at x_l^s as long as it satisfies the following
conditions

x_l^L + 0.2( x_l^U - x_l^L ) \le x_l^s \le x_l^U - 0.2( x_l^U - x_l^L )

Otherwise, the variable is split at \tfrac{1}{2}( x_l^L + x_l^U ).
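These rules can be summarized in a short sketch (hypothetical names; the inputs are the bound-change estimates with the simple-bound contribution already reduced to 1%, the current and original variable widths, and the indices of the nonlinear variables):

    def choose_split_variable(delta_lo, delta_up, lo, hi, nonlinear, width0):
        # variable with the largest average estimated bound improvement (2.3)
        scores = [0.5 * (dl + du) for dl, du in zip(delta_lo, delta_up)]
        if max(scores) > 0.0:
            return max(range(len(scores)), key=scores.__getitem__)
        # fall back: bisect the widest nonlinear variable (normalized widths)
        widths = [(hi[i] - lo[i]) / width0[i] for i in nonlinear]
        return nonlinear[max(range(len(nonlinear)), key=widths.__getitem__)]

    def clip_split_point(xs, lo, hi, margin=0.2):
        # keep the proposed split point away from either bound, else bisect
        if lo + margin * (hi - lo) <= xs <= hi - margin * (hi - lo):
            return xs
        return 0.5 * (lo + hi)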
When an upper bound is available, the estimated increase of the covering program
objective function can be used to predict a split location yielding one region that
can be eliminated and another that may contain the solution. The lower bound
given by the covering program solution is \bar{x}_0 + Δx_0, and the estimated new bound
as a function of split location, x_l^s, can be written as

The goal is for x_0^{est} to meet or exceed the upper bound, UB, so by substituting in
for x_0^{est}, the following equations for x_l^s are obtained.

Here φ is a small multiplier (1 \le φ \le 1.2) to overestimate the cut needed. If the
estimate for one of the bound changes lies outside the variable domain or if the
corresponding estimate Δ' is zero, this analysis for that bound cannot predict a split
that will eliminate part of the region. The splitting algorithm chooses that value of
x_l^s which will eliminate the largest piece of the region. For a lower bound change,
x_l^U - x_l^s is predicted to be eliminated, and for an upper bound change, x_l^s - x_l^L is
predicted to be eliminated.
After the variable and location have been recommended, the algorithm checks the
ratio of the width of the recommended variable to the width of the widest nonlinear
variable normalized by their original widths. If the recommended variable's nor-
malized width is less than one one-hundredth of the widest nonlinear variable's nor-
malized width, the algorithm overrides the recommendation and bisects the widest
nonlinear variable.
After the splitting method has chosen a variable and a split location, it needs to
add the two new regions to the branch and bound list. The new regions take their
lower bound on the objective function from the region they subdivided because
their lower bounds have to be at least as high as the region from which they come.
If their bound is the same as the first element in the list, they are added to the
front of the list; otherwise, they are inserted in order of increasing lower bound.
The region which includes the current point (the point around which the covering
program was constructed) is added second (on top). If this region has the least
lower bound, the covering program of the next iteration can be constructed without
having to evaluate the objective function, constraints, and their gradients again.
This is not necessary for the algorithm to work, but reduces the number of function
evaluations.
This region analysis procedure is a modification of the one presented by Swaney [6].
It is different because it uses MINOS 5.4 [4] as its local NLP solver and because it
uses the generalized covering program.
Because the generalized covering program is used, this region analysis loop can still
provide a lower bound even when MINOS 5.4 does not find a feasible point. If the
problem is infeasible, the covering program will become infeasible when the region
size is small enough. The details of constructing the covering program are shown
in the next section.
The second version of the region analysis procedure is a new approach combining
a search direction derived from the covering program and a Newton like search
direction. The covering program provides a search direction based on the global
character of the problem which may provide an improved point when the local
methods fail. The local step is used to provide quick convergence when close to a
local minimum.
Given Lagrange multiplier estimates, the algorithm first counts the number of active
constraints as determined by the value of the multipliers and MINOS' basis array. If
the number of active constraints is less than the number of variables, the algorithm
constructs the null space covering program; otherwise, it constructs the full rank
covering program.
Constructing the full rank program (1.177) is relatively straightforward. Given the
current point and the bounds on the variables, constructing the LP does not require
any difficult operations.
Constructing the null space program (1.144) requires more computational effort.
First the algorithm constructs the matrix G, defined by (1.12), whose columns are
gradients of the active constraints determined from the Lagrange multipliers. Next
the LU factors of G are calculated using LUSOL [3], the sparse LU factorization
routine from MINOS 5.4 [4]. If G is not of full rank, the linearly dependent con-
straints are removed from the active set, and G is recalculated until a set of linearly
independent active constraint gradients is found.
Once a G of full rank has been found, the algorithm uses it to calculate new estimates
of the multipliers. Assuming that the current point is a Karush-Kuhn-Tucker point,
c+Gu = 0
which is solved for u using the LU factors of G. These updated u estimates are
checked to make sure they are of the correct sign. If a sign is incorrect, the associated
constraint is removed from the active set, and a new G is used.
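A small sketch of this update, assuming the objective gradient c and the active-constraint gradient matrix G are dense NumPy arrays; a least-squares solve stands in for the LU-factor-based solve of the text, and the sign convention is an assumption:

    import numpy as np

    def update_multipliers(c, G, active, sign=+1):
        """Re-estimate multipliers from c + G u = 0, dropping wrong-signed ones.

        `active` lists the columns of G (active constraints) currently kept."""
        while active:
            u, *_ = np.linalg.lstsq(G[:, active], -c, rcond=None)
            keep = [k for k, ui in zip(active, u) if sign * ui >= 0]
            if len(keep) == len(active):
                return active, u
            active = keep                      # remove wrong-signed constraints and re-solve
        return [], np.array([])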
Using the LU factors of G, a basis N for the null space of the active constraint gradients is
calculated:
    N = [ -G_1^{-T} G_2^T ]
        [        I        ]                                             (1.13)
where G_1 contains the most linearly independent rows of G, as determined by the row
permutation provided by the LU factorization routines, and G_2 the remaining rows. The Newton step in the
constraint space d^N is computed from the LU factors of G using (1.191).
Once N has been calculated, the projection matrix N(N^T N)^{-1} N^T is calculated
using the QR factorization of N, from which N(N^T N)^{-1} N^T = Q_N Q_N^T. This
matrix is calculated and stored using full matrix routines.
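A sketch of this construction under the assumptions above, using dense NumPy/SciPy routines in place of LUSOL (the row selection comes from LU partial pivoting on G, as in the text):

    import numpy as np
    from scipy.linalg import lu, qr

    def null_space_basis(G):
        """Basis N with G.T @ N = 0, built from the rows picked by LU pivoting as G_1."""
        n, m = G.shape
        p, _, _ = lu(G)                                  # G = p @ l @ u
        idx = np.argmax(p, axis=0)                       # original row sent to each pivot position
        sel, rest = idx[:m], idx[m:]
        G1, G2 = G[sel, :], G[rest, :]
        N = np.zeros((n, n - m))
        N[sel, :] = -np.linalg.solve(G1.T, G2.T)
        N[rest, :] = np.eye(n - m)
        return N

    def projector(N):
        """N (N^T N)^{-1} N^T computed as Q_N Q_N^T from the QR factors of N."""
        Q, _ = qr(N, mode="economic")
        return Q @ Q.T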
Next ""y is calculated as Li UiH(i). Matrix Q is calculated from ""Y according to the
definition (1.36). Q is stored and processed as a full matrix. Q is factorized using
a modified Cholesky factorization enforcing positive definiteness [5] which adds the
smallest possible diagonal matrix needed to Q to make it positive definite. That
procedure was modified to add a larger diagonal matrix to avoid scaling problems
in the LP. The diagonal adjustment is propagated to ""Y using (1.44). From Q, r] is
calculated using full matrix computations using equation (1.39), and from r], the
required elements of p-l and p-l are calculated using (1.40). The Cholesky factors
of Q are also used to calculate the Newton step in the null space pN, using (1.197)
which gives the total Newton step of l1x N = d N + NpN. The a's for the Newton
constraints are calculated using equation (1. 205).
From this information, the null space covering program can be constructed and
solved using MINOS 5.4.
The algorithm employs a special kind of line search that can search two directions
simultaneously. In some cases, the algorithm will have a search direction from the
covering program, Δx^c, and a Newton search direction, Δx^N, and sometimes it
will have only one of the two. When the full rank bound is used, only Δx^c is
available, and when the covering program gives a zero step or cannot be solved by
MINOS 5.4 (which occurs very rarely), only the Newton step is available.
The line search uses the same merit function for evaluating progress for either search
direction. The merit function contains the objective function and a weighted sum
of the infeasibility.
(2.6)
The weighting factors w are a kind of moving average of the Lagrange multipliers.
The rule for calculating w^{k+1} from w^k is chosen as the first that applies among three
cases: u_i = 0, w^k_i = 0, and otherwise (2.7).
This definition allows the weights to be adjusted without drastic changes that can
cause cycling.
The line search first calculates the directional derivative for each available search
direction. If the Newton directional derivative is nonnegative, which may occur if
the active set is wrong, the Newton step is ignored. When the directional derivative
is negative, an Armijo criterion [1, Section 8.3] is used as the terminating condition.
Otherwise, a fixed decrease is required for the step to be accepted. The line search
takes a step with each available direction, and then pursues the one that gives the
best merit function value.
After taking the first step, if the directional derivative is negative, the minimum
of a quadratic approximation of the merit function is used to calculate a new step
length. If the directional derivative is positive, the step is reduced by a factor of 0.4 at each
iteration. The line search continues until an improved point is found or until a
maximum of 8 iterations have been taken.
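A condensed sketch of such a search; merit and grad_merit stand in for (2.6) and its gradient, and the quadratic-interpolation step of the text is replaced here by simple backtracking (all names and constants are illustrative):

    import numpy as np

    def two_direction_search(x, merit, grad_merit, d_cover=None, d_newton=None,
                             armijo=1e-4, shrink=0.4, fixed_decrease=1e-8, max_iter=8):
        """Search the covering-program and Newton directions; keep the better point."""
        f0, g0 = merit(x), grad_merit(x)
        best = None
        for d in (d_cover, d_newton):
            if d is None:
                continue
            slope = float(np.dot(g0, d))
            if d is d_newton and slope >= 0:
                continue                                   # wrong active set: ignore the Newton step
            alpha = 1.0
            for _ in range(max_iter):
                trial = x + alpha * d
                f_trial = merit(trial)
                accept = (f_trial <= f0 + armijo * alpha * slope) if slope < 0 \
                         else (f_trial <= f0 - fixed_decrease)
                if accept:
                    if best is None or f_trial < best[0]:
                        best = (f_trial, trial)
                    break
                alpha *= shrink                            # step reduction (the text also uses 0.4)
        return best                                        # None if neither direction improved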
This algorithm uses a simple, a priori procedure to scale the variables and function
values. After a problem has been defined as published, the scaling procedure is
applied to compute variable and function scales, which are added to the problem
definition file before executing the branch and bound algorithm.
The variable scale is based on the order of magnitude of the average of the upper
and lower bound. Here are the definitions of some of the functions used and how
the scale for variable i is determined.
    Magnitude(y) = { 10^{trunc(log10 |y|)}   if y ≠ 0
                   { 1                        if y = 0            (2.9)
To determine the constraint scale, 100 points are chosen uniformly at random in
the variable domain determined by the upper and lower bounds, and
the magnitudes are estimated from the averages of the largest magnitude term in
each expression as explained below. The random points are designated x^k for k
ranging from 1 to 100. It is assumed that problems are in the form (1.1) with one
possible modification. If the objective function of the original problem specification
is linear, the linear objective function is used as published, and x_0 is not introduced
as a dummy objective variable.
Maximums and absolute values give the scale of the largest term in the expression.
The method for the constraints is similar.
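A compact sketch of this a priori scaling, assuming the problem exposes its variable bounds and, for each constraint, a list of per-term functions (illustrative names, not the authors' file format):

    import numpy as np

    def magnitude(y):
        """Order of magnitude as in (2.9)."""
        return 10.0 ** np.trunc(np.log10(abs(y))) if y != 0 else 1.0

    def variable_scales(lower, upper):
        """Scale each variable by the magnitude of the midpoint of its bounds."""
        return [magnitude(0.5 * (lo + up)) for lo, up in zip(lower, upper)]

    def constraint_scales(term_funcs, lower, upper, samples=100, seed=0):
        """Scale each constraint by the average magnitude of its largest term
        over points sampled uniformly in the variable box."""
        rng = np.random.default_rng(seed)
        lower, upper = np.asarray(lower, float), np.asarray(upper, float)
        pts = rng.uniform(lower, upper, size=(samples, len(lower)))
        scales = []
        for terms in term_funcs:               # one list of term functions per constraint
            largest = [max(abs(t(x)) for t in terms) for x in pts]
            scales.append(magnitude(float(np.mean(largest))))
        return scales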
Without this scaling procedure, the algorithm fails to solve several of the test prob-
lems. The scales produced by this method are sufficient for the set of test problems,
and this is the only method that has been tested.
Some of the single variable functions or their derivatives are undefined for some
subset of the variable domain, and to solve them with this algorithm, small adjust-
ments must be made to the problem definition. Of all of the single variable functions
needed to define the 50 test problems, p_0 x^{p_1}, p_0 ln(p_1 x + p_2), and p_0 x ln(p_1 x + p_2)
are the only three that may have undefined function values or derivatives. With
ln(β), β ≤ 0, undefined, the bounds on variable x may need to be adjusted to keep
the lower bound of p_1 x + p_2 greater than or equal to some small value strictly greater
than zero.
The only power function required in the test problems has p_1 = 0.6, resulting from the estimated cost of heat
exchangers taken as proportional to (Area)^{0.6}. In this case, the power function is
defined for all nonnegative values of x, but the derivative approaches infinity as x
approaches zero from the right. To treat this infinite derivative, the power function
is replaced by a fourth order polynomial which goes smoothly to zero for x ≤ 0.05.
The fourth order polynomial chosen has the minimum integrated error squared such
that the value and derivative of the power function are matched at x = 0.05 and
the value of the power function is matched at x = 0.
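A sketch of one way to construct such a polynomial: the integrated squared error is approximated by a dense sample on [0, 0.05], and the equality-constrained least-squares problem is solved through its KKT system. This reproduces the construction, not necessarily the authors' exact coefficients.

    import numpy as np

    P1, XC = 0.6, 0.05                          # exponent and matching point
    xs = np.linspace(0.0, XC, 2001)[1:]         # sample points standing in for the integral
    A = np.column_stack([xs**k for k in range(1, 5)])   # quartic with p(0) = 0 built in
    b = xs**P1
    # Equality constraints: p(XC) = XC**P1 and p'(XC) = P1 * XC**(P1 - 1)
    C = np.array([[XC, XC**2, XC**3, XC**4],
                  [1.0, 2*XC, 3*XC**2, 4*XC**3]])
    d = np.array([XC**P1, P1 * XC**(P1 - 1.0)])
    # KKT system for: minimize ||A c - b||^2 subject to C c = d
    K = np.block([[2*A.T @ A, C.T],
                  [C, np.zeros((2, 2))]])
    rhs = np.concatenate([2*A.T @ b, d])
    coeffs = np.linalg.solve(K, rhs)[:4]        # coefficients of x, x^2, x^3, x^4

    def poly(x):
        return sum(c * x**(k + 1) for k, c in enumerate(coeffs))

    print(poly(XC), XC**P1)                     # the two values agree at the matching point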
2 COMPUTATIONAL EXPERIENCE
The algorithm presented was implemented in C++ with calls to FORTRAN numer-
ical libraries, and it was tested on four different platforms: a DECstation 3100, a
NeXTstation Turbo, an HP 9000/735, and an IBM SP-2. This section will present a
summary of the results obtained by observing the behavior of the algorithm applied
to a variety of problems from the literature.
Tables 1, 2, 3, and 4 give an overview of the problem sizes and characteristics and
the runtimes obtained on a HP 9000 Series 700 Model 735 with a clock speed of 99
MHz and 80MB of RAM. Table 5 gives the benchmark results for this machine as
reported by HP.
The algorithm was able to solve 47 of the 50 test problems within the arbitrary
iteration limit of 25000. In 42 of these cases, it found a result agreeing with the
results reported in the problem source, and in 5 cases, it found either better solutions
or alternative minima. The problem definitions and locations of these minima are
presented in an Appendix available on request from the authors.
For the three unsolved problems, the algorithm provides mixed results. For prob-
lem fp_3_1, both versions found the solution but were unable to reduce the bound
gap to zero within the iteration limit. In the case of problem fp_4_9, neither version
of the algorithm found a feasible point. Problem fp_5_1 is not solved within the
iteration limit with Version 1, and the runtime was prohibitive for Version 2. For
the unsolved problems, Table 6 shows the objective function value found by the al-
gorithm, the bound gap after 25000 iterations, and the number of regions remaining
on the list at the end.
Figures 1 and 2 show the runtime required to solve each problem versus the number
of problem variables and number of bilinearities respectively. Both show a high
variance, with the runtime correlating poorly with both. However, the runtime
correlates somewhat better with the number of quadratic terms than the number
of variables.
                                         Version 1           Version 2
Name       n^a   x_i x_j^b  SVF^c  NSD^d   Time(s)    Iter.    Time(s)    Iter.  Soln.^e
fp_2_1      5       5        0      0        1.48       33       1.23       37    √
fp_2_2      6       5        0      0        0.03        1       0.02        1    √
fp_2_3     13       4        0      1        1.82       16       0.11        3    √
fp_2_4      6       1        0      1        0.11        3       0.08        7    √
fp_2_5     10       7        0      0        1.24       31       0.81       21    *
fp_2_6     10      10        0      0        3.91       31       2.73       27    √
fp_2_7_1   20      20        0      0      275.68      643     363.55      663    √
fp_2_7_2   20      20        0      0      265.86      615     275.10      515    √
fp_2_7_3   20      20        0      0      751.67      971     642.74      835    √
fp_2_7_4   20      20        0      0      198.26      623     213.15      473    √
fp_2_7_5   20      20        0      0     1168.83     1909     745.30     1611    *
fp_2_8     24      24        0      1       48.18       53     148.49      141    *
fp_3_1      8       5        0      2     1278.21    25000    4837.56    25000    x
fp_3_2      5       8        0      2        1.26       27      21.15       40    √
fp_3_3      6       7        0      0        1.01       23       0.40        7    √
fp_3_4      3       9        0      0      245.90     9213     595.69     6269    *
fp_4_3      4       2        2      0        0.23       15       0.12        5    √
fp_4_4      4       2        2      0        0.34       25       0.30       19    √
fp_4_5      6       3        3      0        0.62       35       0.80       29    *
fp_4_6      2       1        2      0        0.15       13       0.32       11    √
fp_4_7      2       2        1      1        0.05        3       0.37       17    √
fp_4_9     11^f    16        8      0     9357.13    25000   19472.80    25000    x
fp_5_1     48      44        0      9   192930.82    25000        n/a      n/a    x

a The number of variables.
b The number of bilinearities.
c The number of single variable nonlinear functions. Each of these is also counted as a quadratic variable.
d The number of null space dimensions at the solution.
e √ means the solution agrees with published values, * means the algorithm found a better or
different solution, and x means the algorithm did not find a solution within the iteration limit.
f Application of the algorithm to this problem required 11 additional variables.

Table 1  Problem characteristics and runtimes
                                         Version 1           Version 2
Name       n^a   x_i x_j^b  SVF^c  NSD^d   Time(s)    Iter.    Time(s)    Iter.  Soln.^e
La          2       1        0      1        0.46       35       0.92       23    √
Lb          2       1        0      0        0.03        1       0.01        1    √
Lc          4       4        0      0        0.04        1       0.03        1    √
Ld         20      20        0      0       49.93       87      30.67      103    √
Le          6       6        0      0        1.38       23       0.17        3    √
Lf          2       2        0      1        0.21        9       0.27        9    √
Lg_I        9       2        0      1        0.08        3       0.52        6    √
Lg_II       9       2        0      1        0.31        3       0.29        3    √
Lg_III      9       2        0      0        0.35        7       1.91       13    √
Lwingo      1       1        3      1        0.32       27       0.24       27    √

a,b,c,d,e See Table 1 footnotes.

Table 2  Problem characteristics and runtimes (continued)
                                         Version 1           Version 2
Name       n^a   x_i x_j^b  SVF^c  NSD^d   Time(s)    Iter.    Time(s)    Iter.  Soln.^e
s_1         2       3        0      0        0.35       19       0.36       19    √
s_1b        2       3        0      0        0.39       19       0.29       21    √
s_1c        2       3        0      0        0.11        7       0.10        7    √
s_1d        2       3        0      0        0.31       17       0.26       13    √
s_2         2       1        0      1        0.02        1       0.16        5    √
s_2b        2       1        0      1        1.39      119       0.13        7    √
s_2c        2       1        0      1        0.02        1       0.04        1    √
s_2d        2       1        0      1        0.03        1       0.04        1    √
s_3         2       2        1      1        0.06        3       0.18        3    √
s_4         3       3        2      0        0.40       23       2.91       56    √
s_5         4       2        2      0        0.32       19       0.22       17    √
s_6         9       2        0      1        0.18        3       0.45        3    √

a,b,c,d,e See Table 1 footnotes.

Table 3  Problem characteristics and runtimes (continued)
                                         Version 1           Version 2
Name       n^a   x_i x_j^b  SVF^c  NSD^d   Time(s)    Iter.    Time(s)    Iter.  Soln.^e
diet        6       0        0      0        0.02        1       0.01        1    √
e_1         3       3        0      2        0.14        7       0.12        7    √
w           6       8        5      0      278.13     4129     128.25      885    √
w_b         6       5        8      0       12.51      323      19.34      128    √
w_c         3       5        8      0        7.97      307      11.43      129    √

a,b,c,d,e See Table 1 footnotes.

Table 4  Problem characteristics and runtimes (continued)
SPECint92        109
SPECfp92         168
MIPS             125
MFLOPS (dp)       45

Table 5  HP 9000/735 benchmark figures as reported by HP
                    Version 1                        Version 2
Name        Objective       Gap    Regions   Objective       Gap    Regions
fp_3_1        7049.25    108.98       2443     7049.25    135.59       2181
fp_4_9           None         ∞       3693        None         ∞       2189
fp_5_1          1.567  0.132931       5868         n/a       n/a        n/a

Table 6  Results for unsolved problems
Figure 1  Runtime versus number of variables
[Figure 2: runtime (s, log scale) versus Number of bilinearities (0 to 25), Version One and Version Two]
The results of the runs of both algorithms have been grouped into characteristic
behaviors, and some examples of each characteristic behavior will be presented.
The runs have been grouped into the following categories: trivial runs, runs where
the log of the bound gap decreases nearly linearly, runs where the log of the bound
gap decreases superlinearly, and poor runs. All of the graphs are in terms of the
scaled objective function. Some problems appear in different groups for the different
algorithm versions.
The first category consists of runs which took less than fifteen iterations to solve for
a particular algorithm. Figures 3 and 4 show the bounds as a function of iteration
number for problem Lg_III for Versions 1 and 2 of the algorithm respectively. In
both cases, the bounds converge in a few iterations.
The second category are those where the log of the bound gap decreases approx-
imately linearly with the iteration number. Applying Version 1 of the algorithm
to problem fp_2_7_5 demonstrates this behavior on a problem with 20 variables
and 20 quadratic terms, and applying Version 2 of the algorithm to problem s_1
demonstrates this performance on a problem with 2 variables and 1 quadratic term.
Figures 5, 6, and 7 show the bounds, bound gap, and region size versus iteration
number for fp_2_7_5 using Version 1, and Figures 8, 9, and 10 show the analogous
results for s_1.
The next category are those where the log of the bound gap decreases superlin-
early with respect to the iteration number. Applying Version 2 of the algorithm to
problem fp_2_7_3 gives a typical example of this as shown in Figures 11, 12 and 13.
                 Version 1                    Version 2
Name        Obj.^a   Const.^b   Grad.^c    Obj.   Const.   Grad.
fp_2_1       4162      4195      4224       172     181      54
fp_2_2         10        11        12         2       2       2
fp_2_3        218       234       243        22      23       4
fp_2_4        137       140       142        35      38      11
fp_2_5        229       273       290        64      65      20
fp_2_6       4463      4814      4836        74      74      24
fp_2_7_1     7647     40855     41183      3614    3760     679
fp_2_7_2    13347     38686     39014      3565    3775     697
fp_2_7_3    43323    154888    155372      4740    4886     770
fp_2_7_4    13425     35490     35818      2899    3028     490
fp_2_7_5    78500    245840    246859      5653    5735    1390
fp_2_8       4651      4709      4736       917     994     197
fp_3_2         67      1327      1339       624     673     118
fp_3_3        833      1989      2001        41      45      11
fp_3_4      87141    334185    341338     57562   64512   11849
fp_4_3         49        69        77        39      39       9
fp_4_4        145       175       192        93      98      22
fp_4_5        247       346       366       240     260      40
fp_4_6        121       213       220       241     261      46
fp_4_7          6        91        92        94     100      17

a Objective function evaluations.
b Constraint evaluations.
c Constraint gradient evaluations.

Table 7  Function evaluation counts
                 Version 1                    Version 2
Name        Obj.^a   Const.^b   Grad.^c    Obj.   Const.   Grad.
La            176       398       416       352     423      89
Lb             43        44        45         2       2       2
Lc             13        16        17         2       2       2
Ld           4454      8882      8925       331     336      70
Le            933      3342      3354        17      19       6
Lf            252       440       446       106     122      24
Lg_I           24        84        86        39      42      10
Lg_II          19       155       157        13      15       9
Lg_III         46       499       503        69      72      19
Lwingo        202       406       420       129     134      20

a,b,c See Table 7 footnotes.

Table 8  Function evaluation counts (continued)
                 Version 1                    Version 2
Name        Obj.^a   Const.^b   Grad.^c    Obj.   Const.   Grad.
s_1           350       541       550       123     130      25
s_1b          520       768       778        97      99      19
s_1c           21        48        51        26      27       8
s_1d          279       422       430        62      65      13
s_2             3         3         4        75      85      15
s_2b          633      1038      1124        29      32       9
s_2c            3         3         4         6       8       4
s_2d            3         3         4         8      11       5
s_3             5       100       101        24      31       9
s_4            76       389       401       524     577      95
s_5            97       125       134        62      64      12
s_6            12        84        86        16      18       6

a,b,c See Table 7 footnotes.

Table 9  Function evaluation counts (continued)
                 Version 1                    Version 2
Name        Obj.^a   Const.^b   Grad.^c    Obj.   Const.   Grad.
diet           10        10        11         2       2       2
e_1           134       198       202        39      41      12
w            4683     61969     64406      9632   11105    2441
w_b          1223      6991      7160      1279    1462     301
w_c          1734      4619      4778      1439    1537     273

a,b,c See Table 7 footnotes.

Table 10  Function evaluation counts (continued)
Version 1 Version 2
Function evaluations 274669 95194
Constraint evaluations 960107 105010
Constraint gradient evaluations 972947 19950
Figure 3  Bounds versus iteration number for typical rapid convergence problem (Lg_III, Version 1)
Figure 4  Bounds versus iteration number for typical rapid convergence problem (Lg_III, Version 2)
Figure 5  Bounds versus iteration number for typical linear convergence problem (fp_2_7_5, Version 1)
Figure 6  Bound gap versus iteration number for typical linear convergence problem (fp_2_7_5, Version 1)
[Figure 7: region size versus Iteration Number for typical linear convergence problem (fp_2_7_5, Version 1)]
Figure 8  Bounds versus iteration number for typical linear convergence problem (s_1, Version 2)
Figure 9  Bound gap versus iteration number for typical linear convergence problem (s_1, Version 2)
Figure 10  ∏_{i=1}^{n} (x_i^U − x_i^L) versus iteration number for typical linear convergence problem (s_1, Version 2)
[Figure 11: bounds (Lower bound, Upper bound) versus Iteration Number for typical superlinear convergence problem (fp_2_7_3, Version 2)]
Figure 12  Bound gap versus iteration number for typical superlinear convergence problem (fp_2_7_3, Version 2)
Figure 13  ∏_{i=1}^{n} (x_i^U − x_i^L) versus iteration number for typical linear convergence problem (fp_2_7_3, Version 2)
Problem fp_3_4 displays another kind of superlinear decrease, and the behavior is
virtually the same for both algorithms. Because the behaviors are similar, only
Version 1 of the algorithm is shown. Figures 14, 15, and 16 show the bounds,
bound gap, and region size versus iteration number. The lower bound increases
slowly and linearly with respect to iteration number, which causes the semi-log
plot to have negative curvature. This was the only problem to display this kind of
behavior. The difficulty is not caused by problem size, because the problem only has
three variables and nine quadratic terms, and it has a linear objective function,
two linear constraints, and one reverse convex constraint.
Another interesting difference in this problem is the region size versus iteration
number graph, Figure 16. Most problems show a much greater reduction in the
region size over that many iterations, which suggests that the difficulty may lie in
the splitting method. Most of the problems are able to use the split recommended
by the sensitivity analysis, referred to here as the "best split," most of the time; the
alternative is referred to here as the "widest split." Table 12 shows the percentages
of best and widest splits for problems using more than 5% widest split for one of
the versions. The runtimes in Tables 1, 2, 3, and 4, tend to be higher for problems
with a high percentage of widest splits. This suggests that when the algorithm has
to resort to a widest split for certain variables, the efficiency suffers.
[Figure 14: bounds (Lower bound, Upper bound) versus Iteration Number for problem fp_3_4, Version 1]
Figure 15  Bound gap versus iteration number for problem fp_3_4, Version 1
[Figure 16: ∏_{i=1}^{n} (x_i^U − x_i^L) versus Iteration Number for problem fp_3_4, Version 1]
             Version 1              Version 2
Name      Widest      Best      Widest      Best
fp_3_1    0.3558    0.6442      0.3500    0.6500
fp_3_4    0.4870    0.5130      0.4650    0.5350
fp_4_9    0.5781    0.4219      0.3965    0.6035
fp_5_1    0.1438    0.8562         n/a       n/a
s_2b      0.1017    0.8983      0.0000    1.0000
s_4       0.0909    0.9091      0.4722    0.5278
w         0.5804    0.4196      0.6240    0.3760
w_b       0.3540    0.6460      0.4318    0.5682
w_c       0.4314    0.5686      0.4302    0.5698

Table 12  Problems which use the widest rule over 5% of the time
One of the adjustable parameters in the algorithm is the ratio of variable widths
required to override the best split. The normal value is 0.01, and the best split is
overridden while the following inequality is satisfied:
    (x^U_best − x^L_best) / (x̄^U_best − x̄^L_best)  ≤  (Override) · (x^U_widest − x^L_widest) / (x̄^U_widest − x̄^L_widest)
where the denominators are the original (unsplit) widths of the two variables.
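In code, the check amounts to comparing normalized widths (a small sketch; variable records carrying current and original bounds are assumed):

    def use_widest_split(best, widest, override=0.01):
        """True when the recommended variable's normalized width is too small
        relative to the widest nonlinear variable's normalized width."""
        def normalized_width(v):
            return (v["upper"] - v["lower"]) / (v["orig_upper"] - v["orig_lower"])
        return normalized_width(best) <= override * normalized_width(widest)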
Table 13 shows how varying the widest override parameter affects the runtime and
the number of regions searched for the problems in Table 12. In the first two cases
of problem fp_3_1, the algorithm does not find a solution. For Override = 0.001,
problem fp_3_1 terminates with a bound gap of 0.158927, and for Override = 0.01,
it terminates with a bound gap of 0.135591. For the third case, Override = 0.1, the
algorithm solves fp_3_1. For Override = 0.001, problem fp_3_4 exceeds the iteration
limit before the solution is verified; however, as the override factor increases the
problem solves faster. Problem fp_4_9 does not find a feasible point in any of the three
cases, so its bound gap is infinite. Most of these problems benefit from having a
higher override factor, which branches using the widest rule more frequently, thereby
showing that the sensitivity based selection rule is not providing good splits for these
problems.
Table 14 shows the same information for the remaining problems when the change
in override factor caused more than a negligible change in the performance. Most
of the problems are unaffected by the change in override factor, and among those
that are, there is no consistent trend.
In the case of problem fp_3_4, the slow convergence is due to the splitting rule. The
best splitting rule almost always recommends variable one or two; however, the al-
gorithm requires variable three to be split to verify the solution. Generally, variable
three is only split when the best split is overridden (i.e., when variable three's re-
gion width is 100 times that of the recommended variable). The sensitivity analysis
usually misses the dependence on the variable three bounds because the McCormick
constraints (1.151-1.154) are active. For example, if constraint (1.151) is active for
bilinearity 2,3 with a Lagrange multiplier of u_{2,3}, the effect on the objective of a
change in Δx_3^L in this constraint is estimated by
where Δx_3^{L'} is the new bound for Δx_3 and Δx_3^L is the old bound. When Δx_2 =
Δx_2^U, which occurs frequently in problem fp_3_4, the dependence on Δx_3^L disappears.
The sensitivity analysis predicts changes assuming that the LP basis remains the
same, so it cannot predict how a change in the bounds of x_3 will affect the LP basis
and consequently the objective function. This kind of problem keeps the sensitivity
based selection rule from choosing variable three. If x, the point around which
the covering program is constructed, is in the interior of the region (i.e., not at a
variable bound), McCormick's bounds, constraints (1.151-1.154), will always have
a bound gap as shown in Figure 2, so ultimately those constraints must leave the
basis and be replaced by the constraint space or null space quadratic constraints.
It may be possible to improve the selection rule by using a more sophisticated
and computationally intensive sensitivity analysis which can account for potential
changes in the LP basis [2].
Figures 17 and 18 illustrate how the algorithm can proceed for a number of itera-
tions without any improvement in the lower bound. The flat spaces in the bound
graph can be explained by a feature of the splitting strategy. The algorithm always
examines the region with the least lower bound. When it splits, the splitting strat-
egy may select a variable that produces one region with an increased lower bound
and another with no increase. This happens when the solution of the covering pro-
gram depends strongly on either the upper or lower variable bound but not both.
Because of this feature, the splitting strategy may have to split several times before
it picks a variable that will raise the least lower bound. The criterion for choosing a
variable, (2.3), favors variables where both bounds are significant, but it may still
choose variables which do not improve the least lower bound.
[Figure 17: bounds (Lower bound, Upper bound) versus Iteration Number for problem Ld, Version 1]
Figure 18  Bound gap versus iteration number for problem Ld, Version 1
[Figure 19: bound gap versus Iteration Number for problem w, Version 1]
The last category of problems contains those on which either one or both of the
algorithms performed poorly. Applying Version 1 to problems w and s_2b results
in slow convergence. In the case of problem w shown in Figure 19, the difficulty
seems to be due to the unbounded derivative of x^{0.6} at x = 0. x^{0.6} is replaced
with a polynomial at small values of x, but it still has large first and second derivatives,
which causes the underestimating quadratic to fit poorly. The poor underestimation
causes it to require a very small region to verify the minimum.
Version 1 requires 119 iterations to solve problem s_2b, which is much larger than
the 7 iterations required by Version 2, and the performance of Version 1 is shown in
Figure 20. Of the 119 regions that the algorithm examines, only 4 of them contain
the global minimum. The remaining 115 are needed to eliminate the area that does
not contain the solution, which is more than it ought to take considering the size
of the problem and the overall size of the variable domain. The excess is due to a
split very near the global optimum and looseness in the full rank program (1.177).
The split near the global optimum causes the regions to have lower bounds very
close to the global optimum, and the looseness in the full rank program is sufficient
to keep them from being pruned. The looseness in the full rank program comes
from the difficulty in choosing constraints (1.187). There are three variables and
only two constraints, so one of the variables does not have a constraint of the form
of (1.187) to help enforce complementarity of its positive and negative components.
This problem might be solved by generating more than one constraint of the form
(1.187) for a particular original constraint.
Figure 20  Bound gap versus iteration number for problem s_2b, Version 1
The splitting rule is not recommending the best sequence of splits to improve the
lower bound. The algorithm logs show that this problem is suffering from the same
difficulty as fp_3_4. The sensitivity with respect to some variables is disappearing
because the McCormick envelopes indicate no dependence on some variable bounds.
The sensitivity based splitting rule is usually recommending three of the eight vari-
ables, and the remaining five are usually split by the widest split override. Table 15
shows the frequency of different types of splits for each variable. It is also interesting
to note the small percentage of the best splits which are predicted to eliminate a
region.
Another difficulty with problem fp_3_1 is that the branch and bound list size keeps
growing. Figure 24 compares the list size behavior for fp_3_1 with the typical be-
havior shown by Version 1 applied to problem fp_2_7_5.
The Newton constraint turned out to be insignificant on many of the problems. For
Version 1, the constraint was only active in problems fp_3_1, fp_4_9, and e_1. For
Version 2, it was only active in problems fp_3_1, fp_3_4, fp_4_7, fp_4_9, La, Lf, s_1d,
and s_3. In the other 41 problems, it was never active in the solution of a covering
program. Rerunning these problems with the Newton constraint removed from the
[Figure 21: bounds (Lower bound, Upper bound) versus Iteration Number for problem fp_3_1, Version 2]
Figure 22  Bound gap versus iteration number for problem fp_3_1, Version 2
[Figure 23: ∏_{i=1}^{n} (x_i^U − x_i^L) versus Iteration Number for problem fp_3_1, Version 2]
[Figure 24: branch and bound list size versus iteration number, comparing problem fp_2_7_5 (Version 1, 0-2000 iterations) with problem fp_3_1 (0-25000 iterations)]
                  Version 2                 Version 2
               Without Newton             With Newton
Name         Time (s)   Regions        Time (s)   Regions
fp_3_1        4644.20     25000         4837.56     25000
fp_3_4         641.06      7129          595.69      6269
fp_4_7           0.32        17            0.37        17
fp_4_9       21639.35     25000        19472.80     25000
e_1              0.11         7            0.12         7
La               0.52        23            0.92        23
Lf               0.47        11            0.27         9
s_1d             0.24        13            0.26        13
s_3              0.13         3            0.18         3

Table 16  Version 2 performance with and without the Newton constraint
covering program had a small effect on the overall performance of the algorithm, as
shown in Table 16.
3 CONCLUSIONS
Both versions of the algorithm have been shown to be successful at solving to a high tol-
erance a variety of problems including concave and indefinite quadratic programs,
bilinear programs, polynomial programs, and quadratic NLPs with nonlinear tran-
scendental functions. The time required for solution is highly problem dependent,
but correlates somewhat with the number of quadratic terms.
Both versions of the algorithm perform about equally in terms of runtime, but
Version 2 of the algorithm requires far fewer constraint and constraint gradient eval-
uations. The evaluations required for Version 1 might be reduced by using a dif-
ferent local NLP solver such as successive quadratic programming. MINOS was
used primarily because of its availability and its ability to solve both NLPs and LPs
efficiently.
In the cases where the algorithm fails or performs poorly, the primary cause of the
poor performance is the branching rules. The problems with poor runtimes use a
higher percentage of widest-variable splits. In some cases, the widest split is needed
because the sensitivity analysis ignores the effect of the bounds of some variables,
and in other cases, the widest split rule is a hindrance to success because it causes
the algorithm to split on variables that do not matter. The majority of the problems
are able to succeed while using the best split over 95% of the time.
This branch and bound algorithm can be readily adapted to massively parallel
computers or parallel distributed computers, which is a subject currently under
study. Preliminary results show that the problems that could not be solved with
one processor can be solved on a multiprocessor machine.
Acknowledgements
This work was supported by the Computational Science Graduate Fellowship Pro-
gram of the Office of Scientific Computing in the Department of Energy. The Na-
tional Science Foundation also provided partial support under grant DDM-8619582.
REFERENCES
[1] M. S. Bazaraa, H. D. Sherali, and C. M. Shetty. Nonlinear Programming:
Theory and Algorithms. John Wiley & Sons, Inc., second edition, 1993.
ABSTRACT
In Floudas and Visweswaran (1990, 1993), a deterministic global optimization approach was
proposed for solving certain classes of nonconvex optimization problems. A global optimization
algorithm, GOP, was presented for the solution of the problem through a series of primal and
relaxed dual problems that provide valid upper and lower bounds respectively on the global
solution. The algorithm was proven to have finite convergence to an f-global optimum. In this
paper, a branch-and-bound framework of the GOP algorithm is presented, along with several
reduction tests that can be applied at each node of the branch-and-bound tree. The effect of the
properties is to prune the tree and provide tighter underestimators for the relaxed dual problems.
We also present a mixed-integer linear programming (MILP) formulation for the relaxed dual
problem, which enables an implicit enumeration of the nodes in the branch-and-bound tree
at each iteration. Finally, an alternate branching scheme is presented for the solution of the
relaxed dual problem through a linear number of subproblems. Simple examples are presented
to illustrate the new approaches. Detailed computational results on the implementation of both
versions of the algorithm can be found in the companion paper in chapter 4.
1 INTRODUCTION
In recent years, the global optimization of constrained nonlinear problems has
received widespread attention. A considerable body of research has focused on the
theoretical, algorithmic and computational aspects for identifying the global solution.
Comprehensive reviews of the various existing approaches can be found in Dixon and
Szego (1975, 1978), Archetti and Schoen (1984), Pardalos and Rosen (1986, 1987),
Törn and Žilinskas (1989), Mockus (1989), Horst and Tuy (1990) and Floudas and
Pardalos (1990, 1992).
The GOP algorithm presented in Floudas and Visweswaran (1990, 1993) follows a
cutting plane approach to the solution of the relaxed dual subproblems. While this
approach provides tight lower bounds by including all the valid cuts in the relaxed dual
subproblems, it renders the implementation of the actual relaxed dual problem more
complex. In particular, the identification of valid underestimators at each iteration of
the algorithm must be followed with care. Moreover, the algorithm leaves open the
questions of (i) an implicit enumeration of all the relaxed dual subproblems, and (ii)
the reduction of the number of relaxed dual subproblems from exponential to linear,
which would greatly improve the efficiency of the solution procedure.
One of the main advantages of the branch-and-bound framework for the GOP algorithm
is that it allows naturally for an implicit enumeration of the relaxed dual subproblems
at each level. The introduction of binary variables linked to the sign of the derivatives
of the Lagrange function results in mixed integer linear and nonlinear programming
formulations that offer considerable scope for incorporation of reduction tests on a per
node basis. The resulting GOP/MILP algorithm is discussed in detail in Section 5.
Due to the partitioning of the variable domain using the gradients of the Lagrange
function, the GOP algorithm can require, in the worst case, an exponential number of
dual subproblems at each iteration. This can lead to large CPU times as the number of
variables increases. Therefore, it is worth considering alternate partitioning schemes
that can reduce the number of subproblems that need to be solved at each iteration. In
Section 6, one such branching scheme is presented that requires only a linear number
of subproblems for the determination of the lower bound. A simple example is used
to illustrate the new scheme.
2 PROBLEM FORMULATION
The general form of the optimization problem addressed in this paper is given as
follows:
    min_{x,y}  F(x, y)
    s.t.  G(x, y) ≤ 0
          H(x, y) = 0
          x ∈ X,  y ∈ Y                                        (3.1)
where X and Y are non-empty, compact, convex sets, F(x, y) is the objective function
to be minimized, G(x, y) is a vector of inequality constraints and H(x, y) is a vector
of equality constraints. It is assumed that these functions are continuous and piecewise
differentiable over X × Y. For the sake of convenience, it will be assumed that the
set X is incorporated into the first two sets of constraints. In addition, the problem is
also assumed to satisfy the following conditions:
Conditions (A):
(a) F(x, y) and G(x, y) are convex in x for every fixed y, and convex in y for every
fixed x,
(b) H(x, y) is affine in x for every fixed y, and affine in y for every fixed x,
It has been shown (Floudas and Visweswaran, 1990) that the class of problems that
satisfies these conditions includes, but is not restricted to, bilinear problems, quadratic
problems with quadratic constraints and polynomial and rational polynomial problems.
Recently, it has also been shown (Liu and Floudas, 1993; Liu and Floudas, 1995) that
a very large class of smooth optimization problems can be converted to a form where
they satisfy Conditions (A), and hence are solvable by the GOP algorithm.
where y^k ∈ Y. It has been assumed here that any bounds on the x variables are
incorporated into the first set of constraints. Notice that because of the introduction of
additional constraints by fixing the y variables, this problem provides an upper bound
on the global optimum of (3.1). Moreover, P^k(y^k), the solution value of this problem,
yields a solution x^k for the x variables and Lagrange multipliers λ^k and μ^k for the
equality and inequality constraints respectively¹.
The Lagrange function constructed from the primal problem is given as:
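A minimal statement of the standard form assumed in what follows (the multipliers λ^k and μ^k weight the equality and inequality constraints of the primal problem solved at y = y^k):

    L^k(x, y, λ^k, μ^k) = F(x, y) + (λ^k)^T H(x, y) + (μ^k)^T G(x, y)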
The x variables that are present in the linearization of the Lagrange function around
x^k, and for which the gradients of the Lagrange function with respect to x at x^k are
functions of the y-variables, are called the connected variables.
¹ It is assumed here that the primal problem is feasible for y = y^k. See Floudas and Visweswaran (1990,
1993) for the treatment of the cases when the primal problem is infeasible for a given value of y.
It can easily be shown that the linearization of the Lagrange function around x^k can also be written in the
form:
    L^k(x, y, λ^k, μ^k)|^lin_{x^k} = L_0^k(y, λ^k, μ^k) + Σ_{i=1}^{NI_c^k} x_i g_i^k(y)          (3.4)
where NI_c^k is the number of connected variables at the kth iteration (representing the
x variables that appear in the Lagrange function), and L_0^k(y, λ^k, μ^k) represents all the
terms in the linearized Lagrange function that depend only on y. The positivity and
negativity of the functions g_i^k(y) define a set of equations that are called the qualifying
constraints of the Lagrange function at the kth iteration, and which partition the y
variable space into 2^{NI_c^k} subregions. In each of these subregions, a Lagrange function
can be constructed (using the bounds for the x variables) that underestimates the global
solution in the subregion, and can therefore be minimized to provide a lower bound
for the global solution in that region.
Consider the first iteration of the GOP algorithm. The initial parent region is the
entire space y ∈ Y from the original problem. This region is subdivided into 2^{NI_c^1}
subregions, and in each of these subregions, a subproblem of the following form is
solved:
where I_c^1 is the set of connected variables at the first iteration, NI_c^1 is the number
of connected variables, and x_i^L and x_i^U are the lower and upper bounds on the ith
connected variable respectively. This subproblem corresponds to the minimization
of the Lagrange function, with the connected variables replaced by a combination of
their lower and upper bounds. Note the presence of the qualifying constraints in the
problem. These constraints ensure that the minimization is carried out in a subregion
of the parent node. If this problem has a value of μ_B that is lower than the current
best upper bound obtained from the primal problem, then it is added to the set of
candidate lower bounds; otherwise, the solution is fathomed, that is, removed from
consideration for further refinement.
Consider a problem with two x and two y variables. In the first iteration, assuming
that both x_1 and x_2 are in the set of connected variables for the first iteration, there
are four relaxed dual subproblems solved. These problems are shown in Figure 1a.
[Figure 1: the four subregions of the y space defined by the signs of the qualifying constraints g_1^1(y) and g_2^1(y), with candidate points y^A through y^F marked]
In the second iteration, the relaxed dual problem is equivalent to further partitioning
the subregion that was selected for refinement. In each of these partitions, a relaxed
dual subproblem is solved. Figure 2a shows the subregions created in the example,
assuming that there was only one connected variable in this iteration. The two relaxed
dual subproblems solved in this iteration give new solutions y^E and y^F and are possible
candidates for entering at future iterations. Figure 2b shows the corresponding nodes
in the branch-and-bound tree created by this iteration.
The preceding discussion illustrates the key features of a branch and bound framework
for the algorithm. The framework is based upon the successive refinement of regions
by partitioning on the basis of the qualifying constraints. In the next section, the key
features of its implementation are discussed, based on which a formal statement of the
algorithm is then presented.
At the beginning of the algorithm, there are no subdivisions in the y-space. Therefore,
the root node in the branch and bound tree is simply the starting point for the
algorithm, yl. The region of application for this node (i.e., the current region) is the
entire y-space.
At each node, the current region of application is divided into several subregions using
the qualifying constraints of the current Lagrange function. It is possible to conduct
simple tests on the basis of the signs of the qualifying constraints that can be used to
reduce the number of connected variables. One such test, based upon the properties
first presented in Visweswaran and Floudas (1993) is presented below:
Reduction Test:
(i) If g_i^k(y) ≥ 0 ∀y ∈ R_j, set x_i = x_i^L in L^k(x, y, λ^k, μ^k) and remove i from the
set of connected variables.
(ii) If g_i^k(y) ≤ 0 ∀y ∈ R_j, set x_i = x_i^U in L^k(x, y, λ^k, μ^k) and remove i from the
set of connected variables.
The proofs of the validity of these reductions can be easily obtained by considering
that the term x_i g_i^k(y) can be underestimated by x_i^L g_i^k(y) for all positive g_i^k(y) and
x_i^U g_i^k(y) for all negative g_i^k(y). For more details, the reader is referred to Visweswaran
and Floudas (1993).
1. Choose an i ∈ I_c^k.
2. Solve the following two problems:
    min_{x,y} x_i   and   max_{x,y} x_i
    s.t.  y ∈ R_j
Use the solutions of the two problems for the lower and upper bounds on x_i
respectively. Note that the set R_j includes all linear and convex constraints from
the original problem.
3. Repeat Steps 1 and 2 for all i ∈ I_c^k.
Similarly, when there are other convex constraints in x and y, these constraints can
be added to the above problem. This procedure can be very useful in obtaining the
tightest bounds on the connected x variables at each iteration and consequently in
obtaining the tightest underestimators for the relaxed dual subproblems. Note also that
in the case of nonconvex constraints, we can incorporate their convex underestimators
in the evaluation of the bounds problems.
STEP 0: Initialization
(a) Read in the data for the problem including tolerance for convergence, E.
(b) Define initial upper and lower bounds (fu I fL) on the global optimum.
(c) Generate initial bounds for the x variables, x^L and x^U.
(d) Choose a starting point yl for the algorithm.
(e) Set K = 1, C = Pc = 1, N = 1.
(b) Determine the set of connected variables I_c^K and the corresponding partial
derivatives g_i^K(y) (i = 1, ..., NI_c^K) of the current Lagrange function.
(c) For each connected variable, determine (if possible) tight lower and upper bounds
x_i^L and x_i^U in the current region y ∈ R_c. Otherwise, use the original bounds.
(d) Evaluate lower and upper bounds on g_i^K(y) in the region y ∈ R_c.
(a) Select a combination of the bounds B_l of the connected variables, say B_l = B^1.
(b) Find the solution (μ_B^*, y^*) to the following relaxed dual subproblem:
    g_i^K(y) ≥ 0  if  x_i^{B_l} = x_i^L
    g_i^K(y) ≤ 0  if  x_i^{B_l} = x_i^U
    (y, μ_B) ∈ R_c
(i) If μ_B^* < f^U − ε, set j = N + 1, P(j) = C, N = N + 1, and store the
solution in μ_B^j, y^j.
(ii) If μ_B^* ≥ f^U − ε, fathom the solution.
(c) Select a new combination of bounds, say B_l = B^2, for the connected variables.
(d) Repeat steps (b) and (c) until all the combinations of bounds for the connected
variables have been considered.
Select the infimum of all stored μ_B^j, say μ_B^p. Set C = p, y^{K+1} = y^p, f^L = μ_B^p.
STEP 6: Check for convergence
4.5 Illustration
Consider the application of the branch and bound algorithm to the following problem,
taken from Al-Khayyal and Falk (1983):
    min_{x,y}  −x + xy − y
    s.t.  −6x + 8y − 3 ≤ 0
          3x − y − 3 ≤ 0
          x, y ≥ 0
Note that with these constraints, the bounds on both x and y are (0, 1.5). Consider a
starting point of y^1 = 1 for the algorithm.
Iteration 1
For y^1 = 1, the first primal problem has the solution x = 0, μ_1^1 = μ_2^1 = μ_3^1 = 0,
with the objective value of −1. The upper bound on the problem is therefore −1. The
Lagrange function is given by
    L^1(x, y, μ^1) = −x + xy − y = x g_1^1(y) − y
where g_1^1(y) = y − 1 is the first (and only) qualifying constraint. From the original
problem, the bounds on y are (0, 1.5). Therefore, −1.0 ≤ g_1^1(y) ≤ 0.5, implying that
two relaxed dual subproblems need to be solved. These problems, solved for positive
and negative g_1^1(y) (with x set to 0 and 1.5 respectively), are shown below:
Iteration 2
From node 2, the current value of y is 0.0. For this value, the primal problem has the
solution x = 1.0, μ_1^2 = 0, μ_2^2 = 1/3, μ_3^2 = 0, with the objective value of −1.0. The
Lagrange function from this problem is
    L^2(x, y, μ^2) = −x + xy − y + (1/3)(3x − y − 3) = x g_1^2(y) − (4/3)y − 1
where g_1^2(y) = y. For this iteration, the relaxed dual subproblems are solved in the region 0 ≤ y ≤ 1.
The tightest bounds on x for this region can be found by solving the following two
problems:
[Figure 3: branch and bound tree for the example, showing explored and fathomed nodes, the y value at each node (1.0, 1.5, 1.25, 0.333, 0.047), and the iteration K at which each node was created]
    min x   and   max x
    s.t.  −6x + 8y − 3 ≤ 0
          3x − y − 3 ≤ 0
          0 ≤ y ≤ 1
          x ≥ 0
The solutions of these problems provide lower and upper bounds for x respectively.
Thus, for the region 0 ≤ y ≤ 1, this yields the bounds 0 ≤ x ≤ 4/3.
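As a check, these bound-tightening subproblems can be reproduced with any LP solver; a minimal sketch using scipy.optimize.linprog (an assumed dependency):

    import numpy as np
    from scipy.optimize import linprog

    # Feasible region for this step: -6x + 8y <= 3, 3x - y <= 3, 0 <= y <= 1, x >= 0.
    A_ub = np.array([[-6.0,  8.0],
                     [ 3.0, -1.0]])
    b_ub = np.array([3.0, 3.0])
    bounds = [(0.0, None), (0.0, 1.0)]       # (x, y)

    for sense, c in (("min", [1.0, 0.0]), ("max", [-1.0, 0.0])):
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
        val = res.fun if sense == "min" else -res.fun
        print(sense, "x =", round(val, 4))   # expect 0.0 and 1.3333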
Since y > 0, it is obvious that g_1^2(y) is positive for all y in the current region. Therefore,
only one relaxed dual problem needs to be solved, with a valid underestimator to
L^2(x, y, μ^2) being obtained by fixing x to its lower bound. Moreover, from the first
iteration, the Lagrange function corresponding to node 2 is also a valid cut for this
region. Note, however, that instead of using the original bounds on x in both these
Lagrange functions, the improved bounds can be used. This yields the following
relaxed dual problem:
    min_{y, μ_B}  μ_B
    s.t.  μ_B ≥ (1/3)y − 4/3
          g_1^1(y):  y − 1 ≤ 0
          μ_B ≥ −(4/3)y − 1
          0 < y ≤ 1.5
Solution: y = 0.2, μ_B = −1.26667.
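This relaxed dual subproblem can also be verified numerically; a small sketch (again assuming scipy.optimize.linprog), with variables ordered (y, μ_B):

    import numpy as np
    from scipy.optimize import linprog

    c = np.array([0.0, 1.0])                 # minimize mu_B
    A_ub = np.array([[ 1.0/3.0, -1.0],       #  y/3 - 4/3 <= mu_B
                     [-4.0/3.0, -1.0],       # -4y/3 - 1  <= mu_B
                     [ 1.0,      0.0]])      #  y - 1 <= 0  (qualifying constraint)
    b_ub = np.array([4.0/3.0, 1.0, 1.0])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0.0, 1.5), (None, None)])
    print(res.x)                             # approximately [0.2, -1.2667]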
At the end of this iteration, there are two candidate regions for further partitioning: (i)
the region 1 ≤ y ≤ 1.5 corresponding to node 1, with a lower bound of −1.5, and (ii)
the region 0 ≤ y ≤ 1 corresponding to node 3, with the lower bound of −1.26667.
Following the criterion of selecting the region with the best lower bound, node 1 is
chosen for further exploration.
Iteration 3
From node 1, the current value of y is 1.5. For this value, the primal problem has the
solution x = 1.5, μ_1^3 = 1/12, μ_2^3 = 0, μ_3^3 = 0, with the objective value of −0.75. The
Lagrange function for this iteration is
    L^3(x, y, μ^3) = −x + xy − y + (1/12)(−6x + 8y − 3) = x g_1^3(y) − (1/3)y − 1/4
where g_1^3(y) = y − 1.5 is the qualifying constraint for this Lagrange function.
For this iteration, the relaxed dual subproblems are solved in the region 1 ≤ y ≤ 1.5.
In this region, solving the bounds problems for x yields 5/6 ≤ x ≤ 1.5. Since g_1^3(y) ≤ 0
for all y, only one relaxed dual problem needs to be solved, with x fixed to its upper
bound. From the first iteration, the Lagrange function corresponding to node 1 is also
a valid cut for this region. Using the improved bounds on x shown above yields the
following relaxed dual problem:
    min_{y, μ_B}  μ_B
    s.t.  μ_B ≥ −(1/6)y − 5/6
          g_1^1(y):  y − 1 ≥ 0
          μ_B ≥ (7/6)y − 5/2
          0 < y ≤ 1.5
Solution: y = 1.25, μ_B = −1.04167.
Again, there is no partition of the region in this iteration, but the relaxed dual provides
a tighter lower bound for this region than was originally available.
At the end of this iteration, there are two candidate regions for further partitioning: (i)
the region 0 ≤ y ≤ 1, corresponding to node 3, with the lower bound of −1.26667,
and (ii) the region 1 ≤ y ≤ 1.5, corresponding to node 4, with the lower bound of
−1.04167. Following the criterion of selecting the region with the best lower bound,
node 3 is chosen for further exploration.
Iteration 4
From node 3, the current value of y is 0.2. For this value, the primal problem has
the solution x = 1.0667, μ_1^4 = 0, μ_2^4 = 0.2667, μ_3^4 = 0, with the objective value
of −1.05333. Note that the solution of this problem has the immediate consequence
that it provides an upper bound that is lower than the lower bound for node 4 (which
is −1.04167). Therefore, node 4 can be immediately fathomed, i.e., removed from
consideration for any further refinement or exploration.
The Lagrange function from this problem is
    L^4(x, y, μ^4) = −x + xy − y + 0.2667(3x − y − 3) = x g_1^4(y) − 1.2667y − 0.8
where g_1^4(y) = y − 0.2 is the qualifying constraint for this Lagrange function.
For this iteration, the relaxed dual subproblems are solved in the region 0 ≤ y ≤ 1.0,
and try to provide refined lower bounds by partitioning the region further. The tightest
bounds for x in this region are 0 ≤ x ≤ 4/3.
Unlike the previous two iterations, it is necessary to partition the current region
since −0.2 ≤ g_1^4(y) ≤ 1.3 and the reduction tests of Section 4.2 do not provide any
help. It is therefore necessary to solve two relaxed dual subproblems in the current
iteration. For both these problems, the Lagrange functions from nodes 2 and 3 are
valid underestimators. These two problems are shown below:
Node 5:
    min_{y, μ_B}  μ_B
    s.t.  μ_B ≥ (1/3)y − 4/3
          g_1^1(y):  y − 1 ≤ 0
          μ_B ≥ −(4/3)y − 1
          μ_B ≥ −1.2667y − 0.8
          g_1^4(y):  y − 0.2 ≥ 0
          0 < y ≤ 1.5
    Solution: y = 0.333, μ_B = −1.2222.

Node 6:
    min_{y, μ_B}  μ_B
    s.t.  μ_B ≥ (1/3)y − 4/3
          g_1^1(y):  y − 1 ≤ 0
          μ_B ≥ −(4/3)y − 1
          μ_B ≥ 0.0667y − 1.0667
          g_1^4(y):  y − 0.2 ≤ 0
          0 < y ≤ 1.5
    Solution: y = 0.04762, μ_B = −1.06349.
Together, these two problems provide a tighter lower bound (−1.2222) for the region
0 ≤ y ≤ 1 than before (−1.26667).
At the end of this iteration, there are two candidate regions for further partitioning: (i)
the region 0 ≤ y ≤ 0.2, corresponding to node 6, with the lower bound of −1.06349,
and (ii) the region 0.2 ≤ y ≤ 1, corresponding to node 5, with the lower bound of
−1.2222. Therefore, node 5 is chosen for further refinement.
The algorithm continues in this fashion for 18 iterations, converging to the global
solution of −1.0833 at x = 1.1667, y = 0.5 with a tolerance of 0.001 between the
upper and lower bounds. It is interesting to note that the original GOP algorithm,
which does not compute the tightest bounds on the x variables at each iteration, takes
76 iterations to converge with the same tolerance. This indicates the importance of
having the tightest possible bounds on the connected variables at each iteration.
At the Kth iteration, the Lagrange function has the form given by (3.4). Consider the
ith term in the summation. In each of the 2^{NI_c^K} relaxed dual subproblems, this term
takes on one of two values:
    x_i^L g_i^K(y)   if g_i^K(y) ≥ 0
    x_i^U g_i^K(y)   if g_i^K(y) ≤ 0
Now, x_i can be implicitly expressed as a combination of its lower and upper bounds:
    x_i = (1 − α_i^K) x_i^L + α_i^K x_i^U,   α_i^K ∈ {0, 1}                        (3.5)
This leads to the following formulation for the ith term in (3.4), where g_i^{K,L} and g_i^{K,U}
are respectively the lower and upper bounds on the qualifying constraints. As the
following property shows, this can be used to reformulate the relaxed dual problem as
a mixed integer linear program (MILP):
Property 5.1 Suppose that, at the Kth iteration, C denotes the current node to be
partitioned, and R_C denotes the set of constraints defining the region associated with
C. Then, the best solution from all the relaxed dual subproblems at this iteration can
be obtained as the optimal solution of the following mixed-integer linear program:
be obtained as the optimal solution of the following mixed-integer linear program.
mm I'B (3.6)
~eY'''B
t,'"
NI~ NI~
B.t. I'B > I: tf + I: zfgf (y) + L{! (y,,\K, I'K) (3.7)
i=l i=l
, >
t!' (zV - zf)(gf (y) - (1 - af)gf) (3.9)
a!'g!' :$ gf (y) :$ (1 - af)gf (3.10)
, -'-
tK E ~NI~, a K E{O,l}NI~, yEY (3.11)
(y,I'B) E Re (3.12)
where gf and gf are the lower and upper bounds on gf (y) over Y.
Proof. Since α_i^K is a binary variable, it can take on only two values in any solution,
either 0 or 1. Consider these two possible cases for α_i^K:
Case II (α_i^K = 1):
In this case, equations (3.8)-(3.10) reduce to
    t_i^K ≥ (x_i^U − x_i^L) g_i^{K,L}                                              (3.16)
    t_i^K ≥ (x_i^U − x_i^L) g_i^K(y)                                               (3.17)
    g_i^{K,L} ≤ g_i^K(y) ≤ 0                                                       (3.18)
Thus, it can be seen that any solution of the relaxed dual problem in Step 4 of the
algorithm in Section 4 is automatically embedded in the set of constraints described by
(3.7)-(3.12). Therefore, (3.6)-(3.12) is a valid formulation for obtaining the solution
of the relaxed dual problem. □
Remark 5.1 If L_0^K(y, λ^K, μ^K) are convex functions in y, then (3.6)-(3.12) is a
convex MINLP, and can be solved with the Generalized Benders Decomposition
(Geoffrion, 1972; Floudas et al., 1989) or the Outer Approximation algorithm (Duran
and Grossmann, 1986).
It should be noted that the reduction tests of Section 4.2 can also be applied to the
MILP formulation, as shown by the following property.
Property 5.2
(i) If g_i^K(y) ≥ 0 for all y (respectively g_i^K(y) ≤ 0 for all y), then variable α_i^K can be
fixed to 0 (respectively 1).
(ii) If g_i^K(y) = 0 for all y, then variable α_i^K vanishes from formulation (3.6)-(3.12).
Proof. (i) Suppose that g_i^K(y) ≥ 0 for all y ∈ Y. Then, to underestimate the Lagrange
function from the Kth iteration, x_i must be set to x_i^L. By the definition of α_i^K,
this leads to α_i^K = 0. Conversely, if g_i^K(y) ≤ 0 for all y ∈ Y, then α_i^K must
be equal to 1.
(ii) In this case,
    g_i^{K,L} = g_i^K(y) = g_i^{K,U} = 0.
Therefore, in (3.6)-(3.12), t_i^K is always equal to zero, and the variable α_i^K vanishes
from the formulation. □
Backtracking
With the MILP reformulation, it is possible to solve the relaxed dual subproblems
implicitly for the best solution at each iteration. However, it is not sufficient to find
the best solution; it must also be determined whether any of the other partitions can
provide a useful solution for further refinement.
Consider the relaxed dual subproblems solved when node j is being partitioned.
Suppose that this node was partitioned during iteration K. Then, there are NI_c^K
binary variables, and 2^{NI_c^K} partitions to consider. Solving the problem (3.6)-(3.12)
gives the best solution among these partitions. Suppose that this solution corresponds
to the combination α^C. Suppose also that J^C is the set of binary variables that are
equal to 1 in this combination, and that there are NJ^C of them. Consider now the
following cut:
    Σ_{i ∈ J^C} α_i − Σ_{i ∉ J^C} α_i ≤ NJ^C − 1
If problem (3.6)-(3.12) is resolved with the above cut added to the problem, then the
solution will have a value for α different from α^C, and will therefore correspond to
a different subregion of the current problem. Note that the objective value of this
problem represents the "second" best possible solution. The best solution, of course,
is the one corresponding to the solution of the first MILP problem, with α = α^C.
Therefore, this methodology is sufficient to go back to a partitioned node at any point.
Note that although the size of the MILP problems increases slightly at each iteration
due to the accumulation of constraints from previous iterations, the number of binary
variables present in these problems is equal to the number of connected variables for
each iteration. In other words, the number of binary variables in the MILP problems
is bounded by the number of x variables in the original problem.
STEP 0: Initialization
This step is the same as in Section 4.4, with the addition of setting A^1 = ∅.
STEP 1 -- STEP 3:
Let the solution for the binary variables in this problem be α = α^C. Let J^C be the
set of variables which are 1 in this solution, and let NJ^C be the number of such binary
variables.
Suppose that the solution selected in Step 5 corresponds to node C, and that this node
was originally partitioned at iteration k. Then, add the cut
    Σ_{i ∈ J^C} α_i − Σ_{i ∉ J^C} α_i ≤ NJ^C − 1
to the set of binary cuts A_C. Solve the MILP problem (3.6)-(3.12) with the added set
of binary cuts A_C. Suppose the solution of this problem is μ_B.
Remark 5.2 After the MILP problem has been solved in either Step 4 or Step 6, an
integer cut is added to the corresponding formulation which ensures that that solution
cannot be repeated. This implies that the same MILP formulation might be solved
several times over the course of the iterations with small differences arising from the
additional integer cuts. Consequently, there is considerable potential for storing the
tree information from these problems for use in future iterations.
Remark 5.3 At each iteration of the algorithm, there is a single MILP problem solved
in Step 4 or Step 6, as compared to the original algorithm, which needs to solve 2^{N_I^K}
subproblems at the Kth iteration. This MILP problem contains N_I^K binary variables
in the case of Step 4, or N_I^k variables in Step 6. In either case, the number of binary
variables present in any MILP formulation during all the iterations is bounded by the
maximum number of x variables. However, it is usually the case that the number of
connected variables is a fraction of the total number of x variables, implying that the
MILP problems are likely to have few binary variables.
Remark 5.4 The major advantage of the MILP problem appears when there are more
than about 15 connected variables at any iteration. In such cases, the original algorithm
would need to solve over 2 million problems at that iteration, the vast majority of
which would never be considered as candidate solutions for further branching. In the
case of the MILP algorithm, the implicit enumeration allows for far fewer problems
to be solved. The maximum number of MILP problems solved is twice the number of
iterations of the algorithm.
Iteration 1
For y^1 = 1, the first primal problem has the solution x = 0, μ_1^1 = μ_2^1 = μ_3^1 = 0,
with the objective value of -1. The upper bound on the problem is therefore -1. The
Lagrange function is given by
  L^1(x, y, μ^1) = -x + xy - y = x g_1^1(y) - y
where g_1^1(y) = y - 1 is the first (and only) qualifying constraint.
I"B > t} - y
t 11 > -1.5a}
t 11 > 1.5(gi - 0.5(1 - aD)
a} < gi ::; 0.5(1 - aD
gi y-1
0 < y ::; 1.5
The solution of this problem is y =
0.0, I"B =
-1.5, a1 =
1. Note that this
corresponds to node 2 in the branch and bound tree in Figure 3. This solution is
chosen to be the next candidate for branching. However, in order to ensure that the
other regions are also considered for future reference, it is necessary to solve one more
problem, with the cut
  α_1^1 <= 0
added to the MILP. This problem has the solution y = 1.5, μ_B = -1.5 and α_1^1 = 0. It
is stored for future reference.
Iteration 2
For y = 0.0, the primal problem has the solution x = 1.0, μ_1^2 = 0, μ_2^2 = 1/3, μ_3^2 = 0,
with the objective value of -1.0. The Lagrange function from this problem is
  L^2(x, y, μ^2) = -x + xy - y + (1/3)(3x - y - 3) = x g_1^2(y) - (4/3) y - 1
where g_1^2(y) = y is the qualifying constraint. Since 0 <= y <= 1, tight bounds on x can
be obtained: 0 <= x <= 4/3. Since g_1^2(y) = y >= 0, a valid underestimator of L^2(x, y, μ^2)
for all y can be obtained by fixing x to its lower bound. Therefore, there are no binary
variables, and consequently, the MILP formulation reduces to the same formulation as
in Section 4.4. The solution of the resulting subproblem is y = 0.2, μ_B = -1.2667.
At the end of this iteration, there are two candidate regions for further branching: (i)
node 1 (1 <= y <= 1.5) with a lower bound of -1.5, and (ii) node 3 (0 <= y <= 1) with a
lower bound of -1.2667. The former node is selected for further exploration.
Iteration 3
For y = 1.5, the primal problem has the solution x = 1.5, μ_1^3 = 1/12, μ_2^3 = 0, μ_3^3 = 0,
with the objective value of -0.75. The Lagrange function from this problem is
  L^3(x, y, μ^3) = -x + xy - y + (1/12)(-6x + 8y - 3) = x g_1^3(y) - (1/3) y - 1/4
where g_1^3(y) = y - 1.5 is the qualifying constraint for this Lagrange function.
For 1 <= y <= 1.5, the tightest bounds on x are 5/6 <= x <= 1.5. Again, only one relaxed dual
problem needs to be solved, with a valid underestimator of L^3(x, y, μ^3) being obtained
by fixing x to its upper bound. Therefore, the MILP is again identical to the original
algorithm formulation, and has the solution y = 1.25, μ_B = -1.04167.
At the end of this iteration, there are two candidate regions for further partitioning:
(i) the region 0 <= y <= 1, corresponding to node 3, with a lower bound of -1.26667,
and (ii) the region 1 <= y <= 1.5, corresponding to node 4, with a lower bound of
-1.04167. Following the criterion of selecting the region with the best lower bound,
node 3 is chosen for further exploration.
Iteration 4
For y = 0.2, the primal problem has the solution x = 1.0667, μ_1^4 = 0, μ_2^4 = 0.2667,
μ_3^4 = 0, with the objective value of -1.05333. Note that the solution of this problem
provides an upper bound that is lower than the lower bound for node 4 (which is
-1.04167). Therefore, node 4 can be immediately fathomed, i.e., removed from
consideration for any further refinement or exploration.
where g_1^4(y) = y - 0.2 is the qualifying constraint for this Lagrange function.
For this iteration, the relaxed dual subproblems are solved in the region 0 <= y <= 1.0,
and try to provide refined lower bounds by partitioning the region further. The tightest
bounds for x in this region are 0 <= x <= 4/3.
Unlike the previous two iterations, it is necessary to partition the current region since
-0.2 <= g_1^4(y) <= 1.3. Therefore, the MILP in this iteration takes the form:
  min_{y, μ_B}  μ_B
  s.t.  μ_B >= (1/3) y - 4/3
        g_1^1(y) = y - 1 <= 0
        μ_B >= -(4/3) y - 1
        μ_B >= t_1^4 - 1.2667 y - 0.8
        t_1^4 >= -0.26667 α_1^4
        t_1^4 >= 1.3333 (g_1^4 - (1 - α_1^4) · 1.0667)
        -0.2 α_1^4 <= g_1^4 <= (1 - α_1^4) · 0.8
        g_1^4 = y - 0.2
        0 <= y <= 1.5
Remark 5.5 Note that in this example, there is no real advantage to using the
MILP formulation, since it needs to be solved for both combinations of α_1 at each
iteration. However, for problems with more than one connected variable, it is obvious
that this formulation can offer a major advantage over the original formulation. This is
because at each iteration, no more than 2 MILP problems need to be solved. Although
these problems are bigger in size and more complex than the original relaxed dual
subproblems, their structure is such that finding their solution is not really dependent
on the presence of the binary variables, and a good MILP solver can be expected to
solve them very efficiently. At the same time, they feature the key advantage of not
having to solve the full set of subproblems at each iteration.
It should be noted, however, that the convenience of solving just one compact problem
is achieved at the expense of problem size. Because all possible solutions of the relaxed
dual problem have to be incorporated in the GOP/MILP formulation, the result is a
much larger problem to solve. A number of constraints and variables need to be used
to implicitly represent all the possible bound combinations. For large problems, this
could cause difficulties, although the availability of increasingly fast MILP solvers
makes this less of a drawback.
Can the exponential number of relaxed dual subproblems be reduced? In this section, we present one branching scheme that
achieves this goal. This scheme originates from the study of Barmish et al. (1995a,
1995b) on the stability of polytopes of matrices in robust control systems.
Consider the relaxed dual problem at the k-th iteration. This problem has the constraint
  μ_B >= L_0^k(y, λ^k, μ^k) + Σ_{i=1}^{N_I^k} z_i g_i^k(y).
Suppose that all the z variables are bounded between -1 and 1. If this is not the
case, it can be achieved by use of the following linear transformation. Suppose that
z^L <= z <= z^U. Then, define z' such that -1 <= z' <= 1, and
  z = a · z' + b.
The substitution of the lower and upper bounds gives the standard affine scaling
constants a = (z^U - z^L)/2 and b = (z^U + z^L)/2.
(a) If g_i^k(y) >= 0, then z_i g_i^k(y) >= -g_i^k(y) = -|g_i^k(y)|.
(b) If g_i^k(y) <= 0, then z_i g_i^k(y) >= g_i^k(y) = -|g_i^k(y)|.
In either case z_i g_i^k(y) >= -|g_i^k(y)|, and hence
  μ_B >= L_0^k(y, λ^k, μ^k) - Σ_{i=1}^{N_I^k} |g_i^k(y)|     (3.19)
The first term on the right hand side is convex, and can remain unaltered. Consider
now the summation term. Using the concept of the infinity norm, (3.19) can be written
as
  μ_B >= L_0^k(y, λ^k, μ^k) - N_I^k |g_j^k(y)|     (3.20)
implying that
  |g_j^k(y)| >= |g_i^k(y)|,   i = 1, ..., N_I^k     (3.21)
Consider the following two possibilities:
(a) If g_j^k(y) >= 0, then |g_j^k(y)| = g_j^k(y), and (3.21) reduces to the two inequalities
  g_j^k(y) >= g_i^k(y)
  g_j^k(y) >= -g_i^k(y)     i = 1, ..., N_I^k,  i ≠ j     (3.22)
and (3.20) becomes
  μ_B >= L_0^k(y, λ^k, μ^k) - N_I^k g_j^k(y).
(b) If g_j^k(y) <= 0, then |g_j^k(y)| = -g_j^k(y), and (3.21) reduces to the two inequalities
  g_j^k(y) <= g_i^k(y)
  g_j^k(y) <= -g_i^k(y)     i = 1, ..., N_I^k,  i ≠ j     (3.23)
and (3.20) becomes
  μ_B >= L_0^k(y, λ^k, μ^k) + N_I^k g_j^k(y).
The two cases presented above indicate how the summation in (3.19) can be replaced
by a linear term when gj (y) represents the maximum of all the qualifying constraints
at a given value of y. This concept can then be extended to cover the entire region for
y. To do this, the above procedure needs to be repeated for all values of j, resulting
in 2 × N_I^k subproblems that need to be solved in order to properly underestimate the
Lagrange function at all values of y.
Remark 6.1 It should be noted that with the use of the linear branching scheme, the
same space in y is now spanned by a linear number of underestimators (as opposed
to an exponential number in the original algorithm). Therefore, the tightness of these
underestimators will be less than with the original algorithm. Therefore, at the end
of each iteration, the lower bounds obtained from the dual problems with the linear
branching scheme will be looser than those obtained with the original algorithm,
resulting in an increase in the number of iterations required for convergence. At the
same time, the number of subproblems solved at each iteration is vastly reduced.
Therefore, the total computational effort required for the entire algorithm is likely to
be much smaller with the linear branching scheme.
6.2 Illustration
s.t.  x_1 - y_1 = 0
      x_2 - y_2 = 0
      -1 <= x, y <= 1
Suppose that the GOP algorithm is applied to this problem, with the starting point of
y = 0. The first primal problem has the solution x = 0, λ_1^1 = 0 and λ_2^1 = 0. This
leads to the following constraint sets in the first relaxed dual problems:
  y_1 - y_2 <= 0
  y_1 + y_2 <= 0
  y_1 <= 0
[Figure: partitions of the (y_1, y_2) space induced by the constraint sets of the linear branching scheme]
  y_1 - y_2 >= 0
  y_1 + y_2 >= 0
  y_1 >= 0

  y_2 - y_1 <= 0
  y_2 + y_1 <= 0
  y_2 <= 0

  y_2 - y_1 >= 0
  y_2 + y_1 >= 0
  y_2 >= 0
Thus, it can be seen that the use of equations (3.22) and (3.23) results in a new
set of partitions of the region in y. For this example, there are still 4 partitions, so
there is no reduction in the number of subproblems to be solved. However, when
the number of connected variables is more than 2, the use of these transformations
results in a linearly increasing (as opposed to exponentially increasing) number of
subproblems at each iteration. For example, when there are 10 connected variables,
the new partitioning scheme requires 20 relaxed dual subproblems as opposed to 1024
for the original GOP algorithm.
7 CONCLUSIONS
This paper has focussed on presenting the GOP Algorithm of Floudas and Visweswaran
(1990, 1993) in a branch and bound framework. This framework is based upon
branching on the gradients of the Lagrange function, and is considerably simpler
than the original cutting plane algorithm. The primary advantage of the framework
is in simplicity of implementation. In particular, the selection of previous Lagrange
functions as cuts for current dual problems is considerably simplified. Moreover, the
framework allows for the use of a mixed integer formulation that implicitly enumerates
the solutions of all the dual subproblems. This paper has also considered the issue
of reducing the number of subproblems at each iteration, and in Section 6, a new
partitioning scheme was presented that requires only a linear number of subproblems.
This is a significant reduction from the exponential number of subproblems required
by the original algorithm.
The new algorithms have been implemented in a package cGOP (Visweswaran and
Floudas, 1995a) and applied to a large number of problems. The results of these
applications can be found in the companion paper (Visweswaran and Floudas, 1995b).
Acknowledgements
Financial support from the National Science Foundation under grant CTS-9221411 is
gratefully acknowledged.
REFERENCES
[1] F. A. Al-Khayyal and J. E. Falk. Jointly constrained biconvex programming.
Math. of Oper. Res., 8(2):273, 1983.
[2] F. Archetti and F. Schoen. A Survey on the Global Optimization Problem: General
Theory and Computational Approaches. Annals of Operations Research, 1:87,
1984.
[5] L.C.W. Dixon and G.P. Szego. Towards global optimisation. North-Holland,
Amsterdam, 1975.
[6] L.C.W. Dixon and G.P. Szego. Towards global optimisation 2. North-Holland,
Amsterdam, 1978.
ABSTRACT
Recently, Floudas and Visweswaran (1990, 1993) proposed a global optimization algorithm
(GOP) for the solution of a large class of nonconvex problems through a series of primal
and relaxed dual subproblems that provide upper and lower bounds on the global solution.
Visweswaran and Floudas (1995a) proposed a reformulation of the algorithm in the framework
of a branch and bound approach that allows for an easier implementation. They also proposed
an implicit enumeration of all the nodes in the resulting branch and bound tree using a mixed
integer linear (MILP) formulation, and a linear branching scheme that reduces the number
of subproblems from exponential to linear. In this paper, a complete implementation of the
new versions of the GOP algorithm, as well as detailed computational results of applying the
algorithm to various classes of nonconvex optimization problems are presented. The problems
considered include pooling and blending problems, problems with separation and heat
exchanger networks, robust stability analysis with real parameter uncertainty, and concave and
indefinite quadratic problems of medium size.
1 INTRODUCTION
In this paper, a complete implementation of the new versions of the GOP algorithm,
along with computational results, is discussed. The actual details of the implementation
can be found in Appendix A, which discusses the various aspects involved in the
implementation, including reduction tests and local enhancements at each node of the
tree. In particular, the movement of data from one part of the program to another is
discussed in detail. In the following sections, the results of applying the implementation
to various classes of nonconvex optimization problems, including pooling and blending
problems, problems with separation and heat exchanger networks, and quadratic
problems from the literature are described.
2 COMPUTATIONAL RESULTS
Heat exchanger network synthesis problems have traditionally been solved using a
decomposition strategy, where the aims of targeting, selection of matches and opti-
mization of the resulting network configuration are treated as independent problems.
Given the minimum utility requirements and a set of matches, a superstructure of
all the possible alternatives is formulated. The resulting optimization problem is
nonconvex. In this section, two such superstructures of heat exchanger networks are
solved using the GOP algorithm.
  min  Σ_{ij ∈ MA}  c_{ij} [ Q_{ij} / (U_{ij} LMTD_{ij}) ]^β
  s.t.
  LMTD_{ij} = (2/3)(DT1_{ij} · DT2_{ij})^{1/2} + (1/6)(DT1_{ij} + DT2_{ij})
Here, U_{ij} are the fixed heat transfer coefficients. It should be noted that for fixed
Q_{ij}, the objective function is convex. Therefore, by projecting on the flow rates
f_i, the primal problem becomes convex in the remaining variables (the temperatures
and temperature differences). Linearization of the Lagrange function ensures that the
relaxed dual subproblems are LP subproblems in the flowrates.
Example 2.1 This example is taken from Floudas and Ciric (1989). In this problem,
the objective is to determine the globally optimal network for a system of two hot
streams and one cold stream. The superstructure of all possible solutions is shown in
Figure 1. Based upon this superstructure, the model can be formulated as the following
optimization problem:
  min  1300 [ 1000 / ( 0.05 [ (2/3)(ΔT11 · ΔT12)^{1/2} + (1/6)(ΔT11 + ΔT12) ] ) ]^{0.6}
     + 1300 [  600 / ( 0.05 [ (2/3)(ΔT21 · ΔT22)^{1/2} + (1/6)(ΔT21 + ΔT22) ] ) ]^{0.6}
  s.t.
  f1 + f2 = 10
  f1 + f6 - f3 = 0
  f2 + f5 - f4 = 0
  f5 + f7 - f3 = 0
  f6 + f8 - f4 = 0
  150 f1 + t_2^O f6 - t_1^I f3 = 0
  150 f2 + t_1^O f5 - t_2^I f4 = 0
  f3 (t_1^O - t_1^I) = 1000
  f4 (t_2^O - t_2^I) = 600
  ΔT11 = 500 - t_1^O ,  ΔT12 = 250 - t_1^I
  ΔT21 = 350 - t_2^O ,  ΔT22 = 200 - t_2^I
  ΔT11, ΔT12, ΔT21, ΔT22 >= 10
Considering the set of possible solutions inherent in Figure 1, it is obvious that the
bypass streams (f5 and f6) can never be simultaneously active, i.e. at least one of
these streams has to be zero. Therefore, two different problems can be solved, one
with f5 = 0 and another with f6 = 0. When the GOP algorithm is applied to the
[Figures 1 and 2: superstructure and optimal network for Example 2.1]
Example 2.2 This example is also taken from Floudas and Ciric (1989). It features
three hot streams and two cold streams.
  min
  s.t.  f1 + f2 + f3 = 45
        f1 + f11 + f14 - f4 = 0
        f2 + f8 + f15 - f5 = 0
        f3 + f9 + f12 - f6 = 0
        f7 + f8 + f9 - f4 = 0
        f10 + f11 + f12 - f5 = 0
        f13 + f14 + f15 - f6 = 0
        100 f1 + t_2^O f11 + t_3^O f14 - t_1^I f4 = 0
        100 f2 + t_1^O f8 + t_3^O f15 - t_2^I f5 = 0
        100 f3 + t_1^O f9 + t_2^O f12 - t_3^I f6 = 0
        f4 (t_1^O - t_1^I) = 2000 ,  f5 (t_2^O - t_2^I) = 1000 ,  f6 (t_3^O - t_3^I) = 1500
        ΔT11 = 210 - t_1^O ,  ΔT21 = 210 - t_2^O ,  ΔT31 = 210 - t_3^O
        ΔT12 = 130 - t_1^I ,  ΔT22 = 160 - t_2^I ,  ΔT32 = 180 - t_3^I
        ΔT11, ΔT12, ΔT21, ΔT22, ΔT31, ΔT32 >= 10
        0 <= f1, f2, f3, f4, f5, f6, ...
The superstructure for this example is shown in Figure 3. There are a total of 27
variables and 19 constraints (of which six are bilinear). With a projection on the flow
rates, there are six connected variables. The GOP algorithm requires a total of 39
iterations and 54.62 cpu seconds to solve this problem. The optimal solution found by
the algorithm is given in Figure 4.
[Figures 3 and 4: superstructure and optimal network for Example 2.2]
In order to reduce these problems to a form where the GOP algorithm could be applied,
we employ the ideas of Liu and Floudas (1993), which involve a difference of convex
functions transformation. This involves use of eigenvalue analysis on the resulting
fractional objective functions in order to determine the smallest quadratic terms that
are needed to "convexify" the objective function. Since this method is very general
and can be of use in various problems of this type, it is outlined in some detail here for
one of the examples.
The problem formulation, featuring constraints for the heat balances, minimum
temperature approaches and feasibility is shown below:
min
Temperature Differences:
  2 ΔT1 = 150 + T1 - T4
  2 ΔT2 = 500 + T2 - T4 - T5
  2 ΔT3 = 150 + T3 - T5
Heat Balances:
Feasibility:
The three heat balance equations can be used to eliminate three of the variables in the
problem. Choosing the intermediate streams T4 and T5 as the independent variables
leads to
  T1 = 750 - T4
  T2 = 500 + T4 - T5
  T3 = 150 + T5
Using the minimum temperature approaches, tighter bounds on T4 and T5 are obtained.
Substituting the eliminated variables into the temperature differences gives
  ΔT1 = 450 - T4 ,  ΔT2 = 500 - T5 ,  ΔT3 = 150
  ∂²F1/∂T4² = 300 / (450 - T4)³
which is always positive, since T4 <= 400. Therefore, this term is convex for all
values of T4 and T5.
where z = 500 - T4 and y = 500 - T5. The eigenvalues of this Hessian are
given by
It can be seen that the second eigenvalue (for the negative value of the square root)
will always be negative. Thus, the Hessian has mixed eigenvalues, indicating
that the second term in the objective is nonconvex.
In order to "convexify" this term, a quadratic term in one or more of the variables
can be added. Suppose that the term α T4² is added. Then, the term becomes
  F2 = (T5 - T4) / (500 - T5) + α T4²
For the second eigenvalue to be positive for all values of T4 and T5, the term in
the square brackets must be positive.
Thus, adding such a quadratic term, with its coefficient chosen to satisfy this
condition, to F2 is sufficient to make this term convex. The net
result of this is that the objective function can now be written as
Table 1 Heat Exchanger Network Problems from Quesada and Grossmann (1993) with
variables eliminated as detailed in Section 2.2
  z_1 - y_1 = 0
  300 <= z_1, y_1, y_2 <= 400
Now the problem satisfies the conditions of the GOP algorithm, being a convex
problem in Y for all fixed z and a linear problem in z for all fixed y.
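More generally, the device used above can be summarized as a difference-of-convex rearrangement; the following statement is a sketch of the general idea with a generic coefficient α, not the authors' exact expression. If the Hessian of a term F(T4, T5) has eigenvalues bounded below by -2α over the region of interest, then
  F(T4, T5) = [ F(T4, T5) + α T4² + α T5² ] - α T4² - α T5²
writes F as a convex function (the bracketed term, whose Hessian is shifted by 2α I) minus a separable quadratic; the subtracted quadratic is then the only nonconvex part that the algorithm has to treat. In the example above, only the T4² term was needed.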
Similar reductions were obtained for all the example problems given in Quesada and
Grossmann (1993). The results of applying the GOP algorithm to these problems are
given in Table 1. Note that in all the cases, the problems reduced to either one or
two variable unconstrained problems. Consequently, the subproblems solved by the
algorithm are very small in size, as shown in the CPU times taken to converge to the
optimum.
Pooling and blending problems are a feature of models for most chemical processes.
In particular, for problems relating to refinery and petrochemical processing, it is
often necessary to model not only the product flows but the properties of intermediate
streams as well. These streams are usually combined in a tank or pool, and the pool
is used in downstream processing or blending. The presence of these streams in the
model introduces nonlinearities, often in a nonconvex manner. The nonconvexities
arise from the interactions between the qualities of the input streams and the blended
products.
Traditionally, pooling problems have been solved using successive linear programming
(SLP) techniques. The first SLP algorithm (Method of Approximation Programming)
was proposed by Griffith and Stewart (1961). Subsequently, SLP algorithms have
[Figure 6: the Haverly pooling problem. Feeds A (3% S), B (1% S) and C (2% S) are blended into products x (max 2.5% S) and y (max 1.5% S); A and B share a single pool.]
been proposed by Lasdon et al. (1979), Palacios-Gomez et al. (1982) and Baker and
Lasdon (1985) among others. These algorithms have been applied to pooling problems
by Haverly (1978) and Lasdon et al. (1979). SLP algorithms have the advantage
that they can utilize existing LP codes and can handle large scale systems easily.
However, to guarantee convergence to the global solution, they require convexity in
the objective function and the constraints. For this reason, these methods cannot be
relied upon to determine the best solution for all pooling problems.
Various formulations have been proposed for pooling and blending problems. In
the following sections, we consider the application of the GOP algorithm to three of
these formulations, namely, the Haverly Pooling problem, two pooling problems from
Ben-Tal and Gershovitz (1992), and a multiperiod tankage quality problem commonly
occurring in refineries.
In his studies of the recursive behavior of linear programming (LP) models, Haverly
(1978) defined a pooling problem as shown in Figure 6. Three substances A, B and
C with different sulfur contents are to be combined to form two products z and y
with specified maximum sulfur contents. In the absence of a pooling restriction, the
problem can be formulated and solved as an LP. However, when the streams need to
be pooled (as, for example, when there is only one tank to store A and B), the LP must
be modified. Haverly has shown that without the explicit incorporation of the effect
of the economics associated with the sulfur constraints on the feed selection process,
a recursive algorithm for solving a simple formulation having only a pool balance
cannot find the global solution. Lasdon et al. (1979) added a pool quality constraint
to the formulation. This complete NLP formulation is shown below:
  x - P_x - C_x = 0
  y - P_y - C_y = 0        } component balances
  x <= x^U
  y <= y^U                 } upper bounds on products
where p is the sulfur quality of the pool; its lower and upper bounds are 1 and 3
respectively. This problem was solved by both Haverly (1979) and Lasdon et al.
(1979). In all cases, however, the global optimum could not always be determined,
the final solution being dependent on the starting point.
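For reference, a commonly cited complete statement of this formulation (a sketch assembled from the data of Figure 6, with generic price and cost coefficients; the exact numbers should be taken from Haverly (1978) and Lasdon et al. (1979)) is:
  max  p_x x + p_y y - c_A A - c_B B - c_C (C_x + C_y)
  s.t. P_x + P_y - A - B = 0                     (pool balance)
       x - P_x - C_x = 0 ,  y - P_y - C_y = 0    (component balances)
       p P_x + 2 C_x - 2.5 x <= 0                (sulfur limit on product x)
       p P_y + 2 C_y - 1.5 y <= 0                (sulfur limit on product y)
       3 A + B - p (P_x + P_y) = 0               (pool quality)
       x <= x^U ,  y <= y^U ,  1 <= p <= 3
where P_x, P_y are the pool flows to products x and y, and C_x, C_y are the direct flows of feed C.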
More recently, Floudas and Aggarwal (1990) solved the problem using the Global
Optimum Search (Floudas et ai., 1989). They had to reformulate the problem by
adding variables and constraints, and although they were successful in finding
the global minimum from 28 out of 30 starting points, they could not mathematically
guarantee that the algorithm would converge to the global minimum.
By projecting on the pool quality p, the problem becomes linear in the remaining
variables. Hence, p is chosen as the "y" variable. From the constraint set, it can
be seen that only Px and Py are the connected variables. Hence, four relaxed dual
subproblems need to be solved at each iteration. Three cases of the pooling problem
have been solved using the GOP and GOP/MILP algorithms. The data for these
three cases, as well as the average number of iterations required by the algorithms to
converge, are given in Table 2. It can be seen that in all cases, the algorithms require
less than 15 iterations to identify and converge to the global solution.
Using this notation, these pooling problems have the following form:
  Σ z_il + Σ_j y_lj = 0
  Σ z_il <= S_i
The data for these problems can be found in Ben-Tal and Gershovitz (1992). The
results of the application of the GOP algorithm to these problems are given in Table 3.
This example concerns a multiperiod tankage quality problem that arises often in the
operations of refineries. The models for these problems are similar to the pooling
problem of the previous section.
In order to develop the mathematical formulation, the following sets are defined :
For this problem, there are 3 products (Pl,P2,P3), 2 components (Cl, C2), and 3 time
periods (to, tl, t2). The following variables are defined :
The objective of the problem is to maximize the total value at the end of the last time
period. The terminal value of each product (vp) is given. Also provided are lower and
upper bounds on the qualities of the products, quantities of stocks at the start of each time
period (s_{p,t}), qualities of each component (QU_{c,l}), and the product liftings (LF_{p,t}) for
every period. The data for this problem is provided in Table 4.
  max  Σ_{p ∈ PR} v_p s_{p,t2}
  s.t.
  Σ_{p ∈ PR} x_{c,p,t} <= AR_{c,t}     t ∈ {t1, t2},  c ∈ CO
The sources of nonconvexities in this problem are the bilinear terms s_{p,t} · q_{p,l,t} in the
last set of constraints. Thus, fixing either the set of s or q variables makes the problem
linear in the remaining variables.
The GOP Algorithm: To apply the GOP algorithm to this problem, we can project
on the qualities (q1, q2). Then, the stocks are the connected variables. Since there are
six of them (corresponding to three products at two time periods), 64 relaxed dual
problems need to be solved at every iteration. The results of solving this
problem using the branch-and-bound GOP and GOP/MILP algorithms are shown in
Table 5.
As in the case of heat exchanger networks, problems involving separations (sharp and
nonsharp) can often be posed as a superstructure from which the best alternative is to
be selected. The following example considers one such formulation.
Example 2.3 This problem involves the separation of a three component mixture into
two multicomponent products using separators, splitters, blenders and pools. The
superstructure for the problem (Floudas and Aggarwal, 1990) is given in Figure 7.
The NLP formulation for the problem is given below:
subject to
(Overall Mass Balances)
  F1 + F2 + F3 + F4 = 300
  F6 - F7 - F8 = 0
  F9 - F10 - F11 - F12 = 0
  F14 - F15 - F16 - F17 = 0
  F18 - F19 - F20 = 0
(Splitter Component Balances)
  F5 z_{j,5} - F6 z_{j,6} - F9 z_{j,9} = 0        j = A, B, C
  F13 z_{j,13} - F14 z_{j,14} - F18 z_{j,18} = 0  j = A, B, C
(Compositions)
  z_{A,i} + z_{B,i} + z_{C,i} = 1      i = 5, 6, 9, 13, 14, 18
(Sharp Split)
  z_{B,6} = z_{C,6} = z_{A,9} = z_{C,14} = z_{A,18} = z_{B,18} = 0
By projecting on the compositions z_{A,i}, z_{B,i} and z_{C,i}, the primal and relaxed dual sub-
problems become linear. There are a total of 38 variables and 32 equality constraints.
There are initially 20 connected variables (the flow rates). However, considering
Figure 7, it is obvious that the recycle streams cannot both be simultaneously active.
This leads to solving two independent problems, with FlO = 0 in the first case and
F15 = 0 in the second case. In each case, the resulting problem has 9 connected
variables. Application of the GOP algorithm to the problem identifies the optimal
solution (shown in Figure 8) in 17 iterations using the parallel configuration as a
starting point. The total CPU time taken was 3.84 seconds on an HP730.
Phase and Chemical equilibrium problems are of crucial importance in several process
separation applications. For conditions of constant pressure and temperature, a global
minimum of the Gibbs free energy function describes the equilibrium state. Moreover,
the Gibbs tangent plane criterion can be used to test the intrinsic thermodynamic
stability of solutions obtained via the minimization of the Gibbs free energy. Simply
stated, this criterion seeks the minimum of the distance between the Gibbs free energy
function at a given point and the tangent plane constructed from any other point in
the mole fraction space. If the minimum is positive, then the equilibrium solution is
stable.
The tangent plane criterion for phase stability of an n-component mixture can be
formulated as the following optimization problem (McDonald and Floudas, 1995):
  min_y  F(y) = Σ_{i ∈ C} y_i { μ_i(y) - μ_i^0(z) }
[Figures 7 and 8: superstructure and globally optimal solution for Example 2.3]
  s.t.  Σ_{i ∈ C} y_i = 1
        0 <= y_i <= 1
where y is the mole fraction vector for the various components, μ_i(y) is the chemical
potential of component i, and μ_i^0(z) represents the tangent constructed to the Gibbs
free energy surface at mole fraction z. The use of the NRTL equation for the chemical
potential reduces the problem to the following formulation:
where τ_ij are nonsymmetric binary interaction parameters, g_ij are parameters
introduced for convenience, and the function C(y) is a convex function. By projecting
on y_i, it can be seen that this problem satisfies Conditions (A).
The GOP algorithm was applied to solve several problems in this class. These
problems are taken from McDonald and Floudas (1995) and have been solved by them
using the GLOPEQ package (McDonald and Floudas, 1994). The results are shown
in Table 6. It can be seen that for most of the problems, the GOP algorithm performs
very well when compared to the specialized code in GLOPEQ, which is a package
specifically designed for phase equilibrium problems.
The following example was first studied by de Gaston and Sofonov (1988). It
concerns the exact computation of the stability margin for a system with real parameter
uncertainty. This problem (shown in Figure 9) involves a single-input single-output
feedback system with a lead-lag element controller. The model for the problem is
given below:
  min  k_m = x_6
[Figure 9: single-input single-output feedback system with lead-lag controller (λ + 2)/(λ + 10) and plant q_1 / (λ (λ + q_2)(λ + q_3))]
Details of the development of the model can be found in Psarris and Floudas (1993).
The optimal solution for this problem is km = 0.3417. Application of the GOP
algorithm to this problem converges to the optimal solution in 45 iterations, requiring
1.5 seconds on an HP730.
The conditions under which the GOP algorithm can be applied make it highly attractive
for problems with quadratic functions in the objective and/or constraints. Of particular
interest are quadratic problems with linear constraints, which occur as subproblems in
successive quadratic programming (SQP) and other optimization techniques, as well
as being interesting global optimization problems in their own right. In this section,
the results of applying the GOP and GOP/MILP algorithms to various problems of
this type are discussed.
Eleven small-size concave quadratic problems from Phillips and Rosen (1988) have
been solved using the GOP algorithm. The problems have the following form:
  θ_1, θ_2 ∈ ℝ
Here, m is the number of linear constraints, n is the number of concave variables (x),
and k is the number of linear variables (y). The parameters θ_1 and θ_2 are -1 and 1
respectively, and the relative tolerance for convergence between the upper and lower
bounds (ε) is 0.001.
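For orientation, the test-problem form of Phillips and Rosen (1988) can be written (in the notation assumed here, using the data constants referred to later in this section) as
  min_{x,y}  (θ_1 / 2) Σ_{i=1}^{n} λ_i (x_i - ω̄_i)²  +  θ_2 dᵀ y
  s.t.  A_1 x + A_2 y <= b ,   x >= 0 ,  y >= 0
with x ∈ ℝⁿ and y ∈ ℝᵏ. With θ_1 < 0, positive λ_i produce concave quadratic problems, while mixed-sign λ_i produce indefinite ones.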
The results of the application of the algorithm to these problems are given in Table 7.
The CPU times for the GOP algorithm and the Phillips and Rosen algorithm (denoted
by P&R) are given in seconds. It should be noted that the P&R algorithm was run on
a CRAY-2. As can be seen, the algorithm solves problems of this size very fast, taking
about 5 iterations to identify and converge to the optimal solution.
Results from application of the GOP algorithm to another set of concave and indefinite
quadratic test problems taken from Floudas and Pardalos (1990) are given in Table 8.
These problems have also been solved recently by Sherali and Tuncbilek (1994), whose
results are listed in the same table. Here, N_x, N_y and N_c refer to the number of x and
y variables and the number of linear constraints respectively.
Table 9 Concave Quadratic Problems from Phillips and Rosen (1988), ε = 0.01
Table 10 Concave Quadratic Problems from Phillips and Rosen (1988), ε = 0.1
Table 11 Indefinite Quadratic Problems from Phillips and Rosen (1988), ε = 0.1 and
0.01
studied by Phillips and Rosen (1988), and we generated the data for the constants
λ, ω̄, d, A_1, A_2 and b as they have used. The parameters θ_1 and θ_2 have been set to
values of -0.001 and 0.1 respectively. Depending on the values of λ_i, the problems
generated are either concave quadratic or indefinite quadratic problems. For the
case of indefinite quadratic problems, roughly as many positive λ_i as negative λ_i are
generated. For each problem size, 5-10 different problems (using various seeds) have
been generated and solved.
Tables 9 and 10 present the results for concave quadratic problems using tolerances of
0.01 and 0.1 respectively, while Table 11 presents the results for indefinite quadratic
problems using tolerances of 0.01 and 0.1 with the GOP algorithm. In all the cases, it
can be seen that the algorithm generally requires very few iterations for the upper and
lower bounds to be within 10% of the optimal solution; generally, convergence to
within 1% is achieved in a few more iterations. Moreover, certain trends are noticeable
in all cases. For example, as the number of constraints (m) grows, the problems
generally become easier to solve. Conversely, as the number of linear variables (k)
increases, the algorithm requires more time for the solution of the dual problems,
leading to larger overall CPU times. In general, these results indicate that the GOP
and GOP/MILP algorithms can be very effective in solving medium sized quadratic
problems with several hundred variables and constraints.
It should be noted that several sizes of these problems have also been solved on
a supercomputer using a specially parallelized version of the GOP algorithm. The
results can be found in Androulakis et al. (1995).
3 CONCLUSIONS
Visweswaran and Floudas (1995) proposed new formulations and branching strategies
for the GOP algorithm for solving nonconvex optimization problems. In this paper, a
complete implementation of the various versions of the algorithm has been discussed.
The new formulation as a branch and bound algorithm permits a simplified implemen-
tation. The resulting package cGOP has been applied to a large number of engineering
design and control problems as well as quadratic problems. It can be seen from the
results that the implementation permits very efficient solutions of problems of medium
size.
Acknowledgments
Financial support from the National Science Foundation under grant CTS-9221411 is
gratefully acknowledged.
REFERENCES
[1] I. P. Androulakis, V. Visweswaran, and C. A. Floudas. Distributed
Decomposition-Based Approaches in Global Optimization. In Proceedings
of State of the Art in Global Optimization: Computational Methods and Appli-
cations (&is. C.A. Floudas and P.M. Pardalos), Kluwer Academic Series on
Nonconvex Optimization and Its Applications, 1995. To Appear.
[2] T.E. Baker and L.S. Lasdon. Successive linear programming at Exxon. Mgmt.
Sci., 31(3):264, 1985.
[3] A. Ben-Tal and V. Gershovitz. Computational Methods for the Solution of
the Pooling/Blending Problem. Technical report, Technion-Israel Institute of
Technology, Haifa, Israel, 1992.
[4] R. R. E. de Gaston and M. G. Sofonov. Exact calculation of the multiloop
stability margin. IEEE Transactions on Automatic Control, 2: 156, 1988.
[5] C. A. Floudas and A. Aggarwal. A decomposition strategy for global optimum
search in the pooling problem. ORSA, Journal on Computing, 2(3):225, 1990.
[6] C. A. Floudas, A. Aggarwal, and A. R. Ciric. Global optimum search for
nonconvex NLP and MINLP problems. C&ChE, 13(10): 1117, 1989.
[7] C. A. Floudas and A. R. Ciric. Strategies for overcoming uncertainties in heat
exchanger network synthesis. Comp. & Chem. Eng., 13(10):1133, 1989.
[8] C. A. Floudas and P. M. Pardalos. A Collection of Test Problems for Constrained
Global Optimization Algorithms, volume 455 of Lecture Notes in Computer
Science. Springer-Verlag, Berlin, Germany, 1990.
[9] C. A. Floudas and V. Visweswaran. A global optimization algorithm (GOP) for
certain classes of nonconvex NLPs: I. theory. C&ChE, 14:1397,1990.
[10] C. A. Floudas and V. Visweswaran. A primal-relaxed dual global optimization
approach. J. Optim. Theory and Appl., 78(2):187,1993.
[11] R. E. Griffith and R. A. Stewart. A nonlinear programming technique for the
optimization of continuous processing systems. Manag. Sci., 7:379, 1961.
[12] C. A. Haverly. Studies of the Behaviour of Recursion for the Pooling Problem. ACM
SIGMAP Bulletin, 25:19, 1978.
[13] C. A. Haverly. Behaviour of Recursion Model - More Studies. ACM SIGMAP
Bulletin, 26:22, 1979.
[14] L.S. Lasdon, A.D. Waren, S. Sarkar, and F. Palacios-Gomez. Solving the
Pooling Problem Using Generalized Reduced Gradient and Successive Linear
Programming Algorithms. ACM SIGMAP Bulletin, 27:9, 1979.
[15] W. B. Liu and C. A. Floudas. A Remark on the GOP Algorithm for Global
Optimization. J. Global Optim., 3:519, 1993.
[16] C.D. Maranas and C.A. Floudas. A Global Optimization Approach for Lennard-
Jones Microclusters. J. Chem. Phys., 97(10):7667, 1992.
[17] C.M. McDonald and C.A. Floudas. A user guide to GLOPEQ. Computer Aided
Systems Laboratory, Chemical Engineering Department, Princeton University,
NJ, 1994.
[18] C.M. McDonald and C.A. Floudas. Global Optimization for the Phase Stability
Problem. AIChE Journal, 41:1798, 1995.
Data Structures
Since the cGOP package is written in C, it is highly convenient to aggregate the data
transfer from one routine to another using structures (equivalent to COMMON blocks
in Fortran). The primary data structures used in the package describe the problem
data, the solutions of the various primal problems, the data for the various Lagrange
functions, and the solutions of the relaxed dual subproblems at each iteration.
The most important group of data is obviously the problem data itself. In order to
facilitate easy and general use of this data, the implementation was written assuming
Given the formulation (4.2), the data for the problem can be separated into one part
containing the linear and bilinear terms, and another part containing the nonlinear
terms Fj(z) and Gj(Y). The first part can be specified through a data file or as
arguments during the subroutine call that runs the algorithm. The nonlinear terms,
which in general cannot be specified using data files, can be given through user defined
subroutines that compute the contribution to the objective function and constraints
from these terms, as well as their contribution to the Hessian of the objective function
and the Jacobian of the constraints. The problem data is therefore carried in one
data structure (called pdat from here on, and shown in Figure 10) that describes the
following items:
Control Data This refers to the type of the problem (bilinear, quadratic, nonlinear,
etc), number of z and y variables, the number of constraints, type and value of
the starting point for the y variables, as well as tolerances for convergence.
Bilinear Data For reasons of convenience, the linear and bilinear terms in the
objective function and constraints are treated together. The data is stored in
sparse form, with only the nonzero terms being stored. For each term, the value
of the term as well as the indices of its z and/or y terms are stored.
Bounds The global bounds on the variables (which can be changed before the start of
the algorithm, but thereafter remain constant) are stored in arrays.
Nonlinear Data The pointers to the functions that compute the nonlinear terms and
their gradients are stored in the data structure.
Iteration Data Various counters and loop variables that control and aid in the progress
of the iterations are stored in the main data structure. In addition, the best solution
obtained by the algorithm so far is also stored.
It is important to note that almost all of the main data structure, once it has been
read in from the data file or passed to the main subroutine in the algorithm, remains
constant throughout the progress of the algorithm. The only exceptions are the iteration
variables and the best solution obtained by the algorithm so far.
The solution of the primal problem is stored together as another data structure, psol
(shown in Figure 11). This contains the value of yK for which the primal problem was
solved, the solution for the x variables, the marginals for all the constraints and variables
at their bounds, as well as an indicator of whether the primal was feasible or not.
Because of the form (4.2), the Lagrange function (for iterations with feasible primal
problems) can be written (after linearization of the terms with respect to x and
substitution of the KKT optimality conditions for the primal problem) as
  L^K(x, y, λ^K)|^lin_{x^K} = L_c + L_L^T y + Σ_{i=1}^{N_I^K} x_i g_i^T (y - y^K) + G'(y)
where G'(y) represents all the nonlinear terms weighted by the marginals, and can be
written as
  G'(y) = G_0(y) + Σ_{j=M_1+1}^{M_2} λ_j^K G_j(y)
Introducing variables z_j for the nonlinear terms then gives
  L^K(x, y, λ^K)|^lin_{x^K} = L_c + L_L^T y + Σ_{i=1}^{N_I^K} x_i g_i^T (y - y^K) + Σ_{j=M_1+1}^{M_2} λ_j^K z_j   (4.3)
  z_j >= G_0(y) + G_j(y)   (4.4)
Note that a simplistic implementation of the algorithm for the general nonlinear
problem in (4.2) leads to a problem with nonlinear terms in each Lagrange function,
making it much more computationally intensive. Given the fact that the nonlinear
terms are the same in each Lagrange function except for a factor due to the marginals
λ_j^K, it is far more efficient to group the terms together, and therefore to compute
their gradients only once. Moreover, the regrouping of the terms means that as far as
struct pdat {
/* Control section */
char *probname; /* Name of original problem */
char objtype; /* Type of objective function */
char contype; /* Type of constraints */
char primaltype; /* Type of primal problems */
char rdualtype; /* Type of relaxed dual problems */
int nxvar; /* Number of x variables */
int nyvar; /* Number of y variables */
int ncon; /* Number of constraints */
int nzcnt; /* Total number of non-zeros */
/* Data */
char *ctype; /* Type of X and Y variables */
int *sense; /* Sense of row: <=, ==, >= */
double *rhs; /* Right hand sides of the rows */
int *count; /* Number of entries in each row */
int *begin; /* Start of entries for each row */
TERMS terms; /* Bilinear terms in problem */
double *xlbd, *xubd; /* Bounds on X variables */
double *ylbd, *yubd; /* Bounds on Y variables */
double objconst; /* Constants in the objective */
double epsa; /* Absolute tolerance specified */
double epsr; /* Relative tolerance specified */
int maxiter; /* Maximum number of iterations */
/* Various functions */
void (*userobj)(); /* Nonlinear terms in objective */
void (*usercon)(); /* Nonlinear terms in constraints */
/* Solution */
int niter; /* Number of iterations so far */
double primalubd; /* Current upper bound from primals */
double rdlbd; /* Current lower bound from duals */
double *x; /* Starting point, solution for X */
double *y; /* Starting point, solution for Y */
double abserror; /* Absolute error between bounds */
double relerror; /* Relative error between bounds */
};
Figure 10 Main data structure for the GOP and GOPIMILP algorithms
struct psol {
int modstat; /* Feasible or infeasible */
int nxvar; /* Number of x variables */
int nyvar; /* Number of y variables */
int ncon; /* Number of constraints */
double *yval; /* Fixed values for Y variables */
double objval; /* Objective value for primal */
double *varval; /* solution for X variables */
double *cmargval; /* Marginals for constraints */
double *bmargval; /* Marginals for bounds */
char *varstat; /* Status for each variable */
char *solver; /* Which solver was used */
};
each individual Lagrange function is concerned, only the data regarding (4.3) need to
be stored, i.e. the coefficients of the linear terms L_L, the bilinear terms g_i, and the
multipliers λ_j^K. Its structure is shown in Figure 12.
The solutions of the relaxed dual subproblems comprise the last major data structure.
Apart from the actual objective value for the solution and the values of the y variables,
this data includes information about which iteration and parent node generated each
child node in the branch and bound tree. Thus, the entire information about the tree is
stored in the array of relaxed dual solution structures, rdsol.
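Figure 12 itself is not reproduced here; the following is a minimal sketch, with hypothetical field names, of what the Lagrange-function data and one node of the rdsol list might look like (an illustration, not the actual cGOP declarations):

struct ldat {
    int nconn;        /* number of connected (x) variables N_I^K                */
    double lconst;    /* constant term L_c of the linearized Lagrange function  */
    double *lincoef;  /* coefficients L_L of the linear terms in y              */
    double *grad;     /* gradients g_i with respect to each connected x_i       */
    double *lambda;   /* multipliers lambda_j weighting the nonlinear terms     */
};

struct rdsol {
    int iter;         /* iteration at which this node was generated             */
    int parent;       /* parent node in the branch and bound tree               */
    double objval;    /* relaxed dual objective (lower bound) at this node      */
    double *yval;     /* y values of this relaxed dual solution                 */
};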
Based upon these various data units, the overall scheme of the implementation is now
presented. A pictorial view of the algorithm is given in Figure 13.
Initialization of parameters
At the start of the algorithm, the list of relaxed dual solutions rdsol is initialized to
contain the starting point for the y variables, indicating the root node for the whole
branch and bound tree. An initial local optimization problem can be solved to find a
good upper bound and starting point for the y variables, if desired. Various counters
and bookkeeping variables are initialized before the start of the iterations.
At any given iteration, the relaxed dual subproblems will contain a Lagrange function
from the current iteration, and one from each of the parent nodes of the current node
in the branch and bound tree. In order to select these functions, a backward search
is done through the list of solutions to the relaxed dual problems starting from the
current node (i.e. the node that has been chosen at the end of the previous iteration).
The following steps are repeated:
Step 0. Initialize lagsel[MAXITER], the array of parent nodes for the current node.
Step 1. Add the current node C to lagsel. Set lagsel[1] = C, and set the number of
Lagrange functions numlag = 1.
Step 2. Using the parent information stored in rdsol, find the iteration P at which the
current node was generated.
Step 3. Go to the node corresponding to iteration P (say node D) and add this node
to the list, i.e. set numlag = numlag + 1, lagsel[numlag] = D.
Step 4. Repeat Steps 2 and 3 until the root node has been reached.
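In code, this backward search is essentially a walk up the parent pointers. A minimal sketch, using the hypothetical rdsol fields from the sketch above and an assumed root index of 0 (again not the package's actual routine), is:

#define MAXITER 500

/* Collect the current node and all of its ancestors into lagsel[];
   returns the number of Lagrange functions selected. */
int select_lagrange(struct rdsol *sol, int current, int *lagsel)
{
    int numlag = 0;
    int node = current;

    lagsel[numlag++] = node;              /* Step 1: add the current node   */
    while (node != 0 && numlag < MAXITER) {
        node = sol[node].parent;          /* Steps 2-3: move to the parent  */
        lagsel[numlag++] = node;          /* and add its Lagrange function  */
    }
    return numlag;                        /* Step 4: stopped at the root    */
}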
[Figure 13: overall flow of the implementation. The problem data and current y are set up; the primal problem is solved as a black box (NPSOL/MINOS for nonlinear problems, CPLEX/OSL for linear ones); previous Lagrange functions are selected using the gradients as criterion (one Lagrange function per iteration); the relaxed dual problem is solved either in its MILP form (CPLEX/OSL, one problem and one solution) or in its original form (several subproblems, branching on the gradients of the Lagrange function, bases reused between problems, solutions stored in a linked list); bounds are then updated and the algorithm either iterates or cleans up and exits.]
The list of nodes generated in the above steps provides a set of qualifying constraints
(one set per node) that define the region of operation for the current node.
Primal problem
The primal problem takes as data the pdat structure, along with the current vector
for yK. It is also given the current region for the problem as defined by the selected
qualifying constraints. There are several schemes that can be followed to solve the
primal problem, all of which involve various combinations of the primal, relaxed
primal or a local optimization problem solved in the current region. One possible
scheme is as follows:
(a) Solve the full NLP as a local optimization problem in the current region.
(b) If the NLP solution is lower than the upper bound, replace yK with the NLP
solution and go to Step 1. Otherwise go to Step 4.
An alternative scheme is:
(a) Solve the full NLP as a local optimization problem in the current region.
(b) If the NLP provides a feasible solution, then replace yK with the new
solution from the NLP and go to Step 1. Otherwise, solve the relaxed primal
problem and go to Step 4.
The solution of the current primal (or relaxed primal) problem is used to determine
the set of connected variables. Several reduction tests are used to determine the set.
These include testing for the lower and upper bounds on the gradients of the Lagrange
function and the tightness of the bounds on the x variables. If the lower and upper
bounds on an x variable are within a certain tolerance, that variable can be fixed at
its bound. Provision is also made for user defined tests for reducing the number of
connected variables.
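A sketch of the kind of test described here (assumed for illustration, not the package's actual routine) is:

#include <math.h>

/* Decide whether an x variable needs to be treated as connected: it can be
   dropped if its bounds are essentially equal, or if the gradient of the
   Lagrange function with respect to it keeps one sign over the current region
   (in which case x can simply be fixed at the corresponding bound). */
int is_connected(double xlbd, double xubd, double gradlbd, double gradubd, double tol)
{
    if (fabs(xubd - xlbd) < tol)
        return 0;    /* bounds tight: fix x at its bound                  */
    if (gradlbd >= 0.0 || gradubd <= 0.0)
        return 0;    /* single-signed gradient: no branching needed       */
    return 1;        /* otherwise x is a connected variable               */
}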
As mentioned earlier, only the data for the Lagrange functions (4.3) are stored. This
data is generated from the current psol structure. Once the data is generated, it can be
used again whenever the Lagrange functions from that iteration need to be generated.
If there are no connected variables in the Lagrange function generated at the current
iteration, then this function contains only the y variables. Therefore, it is a valid
underestimator for the entire y space, and can be included as a cut for all future relaxed
dual subproblems. In such a case, the current Lagrange function is added to the list of
"global" Lagrange functions.
Given the current region and a set of connected variables, the region is partitioned
using the qualifying constraints of the current Lagrange function. Then, a relaxed
dual subproblem is solved in each region, and the solutions are stored as part of rdsol
if feasible. The nonlinear terms in the objective function and constraints are again
incorporated through calls to the user defined functions. In the case of the GOP/MILP
algorithm, only one MILP problem needs to be solved.
After the relaxed dual problem has been solved for every possible combination of the
bounds of the connected variables (in the case of the GOP/MILP algorithm, after the
MILP has been solved), a new lower bound needs to be determined for the global
solution. Since the solutions are all stored as a linked list, this permits a simple
search for the best solution. This solution is then removed by simply removing the
corresponding node from the linked list. At the same time, the corresponding value of
y is also extracted to use for the next iteration.
In the case of the GOP/MILP formulation, after a solution has been selected from
the list of candidate solutions, the MILP formulation corresponding to the iteration
from which the solution was generated needs to be resolved. To accomplish this, a
binary cut that excludes the selected solution is generated and added to the MILP
formulation, which is then solved. Because the formulation for any given iteration is
likely to be solved several times, such formulations are stored in memory, so that
resolving one is merely a matter of restarting the problem with the additional binary
cut. This saves valuable loading and startup time for the solution of these problems.
Convergence
Finally, the check for convergence is done. The algorithm is deemed to have converged
if the relative difference between the upper bound from the primal problems and the
lower bound from the relaxed dual problems is less than ε. Then, the algorithm
terminates (in the case of the standalone version) or returns to the calling routine (in
case of the subroutine version). Otherwise, the algorithm continues with the new fixed
value of y for the primal problem found from the previous step.
5
SOLVING NONCONVEX PROCESS
OPTIMISATION PROBLEMS USING
INTERVAL SUBDIVISION
ALGORITHMS
R.P. Byrne and I.D.L. Bogle
Department of Chemical & Biochemical Engineering,†
University College London, London, England
ABSTRACT
Many Engineering Design problems are nonconvex. A particular approach to global
optimisation, the class of 'Covering Methods', is reviewed in a general framework.
The method can be used to solve general nonconvex problems and provides guaran-
tees that solutions are globally optimal. Aspects of the Interval Subdivision method
are presented with the results of their application to some illustrative test problems.
The results show the care that must be taken in constructing inclusion functions and
demonstrate the effects of some different implementation decisions. Some particular
difficulties of applying the method to constrained problems are brought to light by the
results.
1 INTRODUCTION
1.1 Motivation
The advantages of optimisation are well known: it provides the best possible
solution to a well defined problem. Thus the Design Engineer can be confident
that the design produced is the best one available for the problem. Traditionally,
however, this has only been the case if the problem is convex. When nonconvex
problems are attempted it can no longer be assumed that the design is the best
possible design because the solution to the optimisation problem may not be
the global solution.
† This work was done as part of the Centre For Process Systems Engineering and supported
by the EPSRC.
  min_{x ∈ A} f(x)     (5.1)
about the problem structure and are more successful but still cannot guaran-
tee global optimality. In order to do so the problem space must be covered and
bounded. This is the basis for 'Covering Methods' which solve nonconvex global
optimisation problems (§2). Covering methods, typically, require more inform-
ation about the problem than Random Search methods but global optimality
can be guaranteed.
To use Covering Methods there must be some way of obtaining a lower bound
on f(x) and an upper bound on the value of the global optimum, f(x*). A
methodology for Covering Methods, independent of the bounding procedure, is
discussed in §2. The bounds needed to cover the problem may be provided in
a number of different ways.
For functions which are Lipschitz, the Lipschitz constant, when known, can be
used to provide these bounds (§2.1). These are the simplest of the covering
methods. However, not all practical problems are Lipschitz and the constant
may not be available.
2 COVERING/EXCLUSION METHODS
In order to ensure that the solution to a general nonconvex optimisation problem
is global it is necessary either to locate all the minima or to cover the region of
interest so that no minima are missed. These 'Covering Methods' are usually
based on excluding subregions until a region, or set of regions, that is sufficiently
small may be said to contain the global optimum. To exhaustively search a
region it is necessary to have some mechanism for obtaining lower, and perhaps
upper, bounds on the value of the objective over this region and an upper bound,
ŷ, on the value of f(x*).
The algorithms rely on bounding the objective function over subsets, Xk, of
the feasible region, and maintaining an upper bound on the value of the global
optimum. If the lower bound for a given X_k is greater than ŷ then X_k cannot
contain a global minimiser. This is the exclusion principle and is common to
all covering methods.
1. Initialise:
(a) an upper bound on the value of f(x*), ŷ = ∞.
Methods vary in the manner bounds are generated, in how the region is divided
and in the assumptions required for f(x) and A but, while many are more
sophisticated, all follow this general scheme.
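To fix ideas, this general scheme can be written as a short sketch. The Python below is illustrative only: the bounding routine, the splitting rule, the tolerance and the list handling are assumptions of the sketch, not any particular published implementation.

    # Minimal sketch of the generic covering/exclusion scheme (illustrative only).
    # A box is a list of (lo, hi) pairs; bound(box) must return (lower, upper) bounds
    # on f over the box, and split(box) must return two sub-boxes.

    def covering_method(x0, bound, split, eps=1e-4, max_iter=10000):
        y_hat = float("inf")            # upper bound on f(x*)  (step 1a)
        work = [(bound(x0)[0], x0)]     # list L of (lower bound, box) pairs
        for _ in range(max_iter):
            if not work:
                break
            # choose the box with the lowest lower bound
            work.sort(key=lambda item: item[0])
            lk, box = work.pop(0)
            lo, up = bound(box)
            y_hat = min(y_hat, up)      # maintain the upper bound on the global optimum
            if up - lo < eps and lk <= y_hat:
                work.append((lo, box))  # converged box: keep it on the list
                break
            for sub in split(box):
                sub_lo, _ = bound(sub)
                if sub_lo <= y_hat:     # exclusion principle: discard boxes with lk > y_hat
                    work.append((sub_lo, sub))
        # boxes whose lower bound does not exceed y_hat may still contain a global minimiser
        return [b for lb, b in work if lb <= y_hat], y_hat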
In some algorithms which solve a relaxed problem to generate lower bounds ([4],
[6]) it is possible that the solution of the relaxed problem will be a solution to
the original problem. As the relaxed problem is a convex problem this solution
is global and so is also the global minimum of f(x) on Xk. In this case a slightly
different treatment may be applied whereby Xk is removed from L and added
to a secondary list of regions whose global minimum is known. Elements of this
list will also be discarded in step 5 subject to the criterion lk > ŷ.
Most covering methods are constructed so that they can take advantage of
additional information or properties if they are available. An interval algorithm due
to Hansen [7] takes advantage of differentiability to reduce Xk at step 2 using
an Interval Newton method and to discard more sets from the list in step 5 with
a monotonicity test.
Given this common base Covering methods can be split into two main groups:
The former class can take advantage of well developed convex optimisation
techniques applying them to relaxed problems or decompositions of the original
problem. Examples may be found in, amongst others, [4], [6] and [8].
The task then, is to determine the density of the grid which will provide a
solution to the required accuracy. Evaluating f(x) at N points Xi in A, given
by
x_i = a + (2i − 1)ε/L,                                                 (5.6)

N > L(b − a)/(2ε),                                                     (5.7)

will result in at least one point, x_f, satisfying an ε-global optimum criterion (Eqn
5.4) [9].
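A brief sketch of this passive covering strategy is given below; the test function, its Lipschitz constant and the tolerance are assumed purely for illustration.

    import math

    def lipschitz_grid_minimise(f, a, b, L, eps):
        """Evaluate f on the grid of Eqns (5.6)-(5.7); at least one grid point is
        then within eps of the global minimum value of an L-Lipschitz f on [a, b]."""
        n = math.ceil(L * (b - a) / (2.0 * eps))                      # Eqn (5.7)
        xs = [min(a + (2 * i - 1) * eps / L, b) for i in range(1, n + 1)]   # Eqn (5.6)
        best = min(xs, key=f)
        return best, f(best)

    # Illustrative use (assumed data): f(x) = sin(3x) + 0.2x on [0, 4], with L <= 3.2
    x_eps, f_eps = lipschitz_grid_minimise(lambda x: math.sin(3 * x) + 0.2 * x,
                                           0.0, 4.0, 3.2, 1e-3)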
The Lipschitz methods are, in general, effective for single variable problems if
f(x) is known to be Lipschitz and L can be found. Complications arise when
solving multivariate problems because the regions or sets which can be excluded
are hyper-spheres. Thus, as more hyper-spheres are excluded from the search,
the region remaining becomes increasingly difficult to describe computationally
[12].
A degenerate interval [a, a] is a real number in much the same way as a complex
number z = a + 0i is real. The set of all possible intervals is denoted by 𝕀 and
ℝ ⊂ 𝕀.
This algebraic system was first applied to bound the error of finite-precision
arithmetic operations on computers. If an operation on a number is,
instead, posed as an operation on an interval which bounds the number then
the resulting interval will bound the answer. This provides a representation of
the number and the absolute error incurred.
where A is a positive constant. Thus for small intervals the higher order inclusion
functions will produce tighter bounds. However for wide intervals the converse
is true and a lower order inclusion will produce tighter bounds. In practice if
w(X) ≤ 2 a second order inclusion will be better than a first order inclusion
[15].
the problem. The choice of inclusion function directly affects the 'tightness'
of the bounds that can be generated. The better the bounds are the fewer
subdivisions need to be made and, consequently, the algorithm's performance is
improved.² In general the lower order inclusion functions are better at the start
of an algorithm when the intervals, Xk, are large and the higher order inclusions
are more suitable for small intervals.
Once the base functions have been defined (e.g. EXP(X)) inclusion functions
may be constructed using natural extension, one of the centred forms or a
combination of these and the rules of Interval Arithmetic.
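To make natural extension concrete, the toy sketch below replaces each real operation in f(x) = x² − x by its interval counterpart. It is not the authors' C++ code, and only the operations needed for this example are implemented.

    class Interval:
        """Toy interval type supporting the operations needed for a natural extension."""
        def __init__(self, lo, hi):
            self.lo, self.hi = min(lo, hi), max(lo, hi)
        def __add__(self, other):
            return Interval(self.lo + other.lo, self.hi + other.hi)
        def __sub__(self, other):
            return Interval(self.lo - other.hi, self.hi - other.lo)
        def __mul__(self, other):
            products = [self.lo * other.lo, self.lo * other.hi,
                        self.hi * other.lo, self.hi * other.hi]
            return Interval(min(products), max(products))
        def width(self):
            return self.hi - self.lo
        def __repr__(self):
            return f"[{self.lo}, {self.hi}]"

    # Natural extension of f(x) = x*x - x: real operations replaced by interval ones.
    def F_natural(X):
        return X * X - X

    X = Interval(0.0, 1.0)
    print(F_natural(X))    # [-1.0, 1.0]: a valid but loose enclosure of the true range [-0.25, 0.0]

The overestimation visible here arises because the two occurrences of x are treated as independent quantities, which is one of the differences between real and interval analysis referred to below.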
The significance of choosing the best inclusion for a given problem is illustrated
by the Six Hump Camel problem (Eqn. 5.25) where different inclusion functions
have a dramatic effect on the convergence and attainable accuracy.
Some of the differences between Real Analysis and Interval Analysis are import-
ant for the construction of inclusion functions by Natural Extension. Foremost
²See also Eqn. 5.23.
The constant c can be chosen as the midpoint of X. The Mean Value form of
an inclusion is of order two [15]. Therefore it will provide tighter bounds on
intervals with a small width.
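For reference, a commonly used statement of the mean value form, consistent with the description above (with G an inclusion of the derivative and c the midpoint of X), is:

\[ F_{MV}(X) \;=\; f(c) + G(X)\,(X - c), \qquad c = \operatorname{mid}(X). \]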
Specialised inclusion forms for univariate and/or rational f(x) can be found in
[7] and [16].
Inequality Constraints
For an inequality constraint, g(x) ≤ 0, with an inclusion function G(X), interval
analysis provides bounds on the value of the constraint function and thus the
'status' of Xk may be determined to be feasible, infeasible or indeterminate³
with respect to g(x) ≤ 0.
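A minimal sketch of this classification step, assuming an inclusion function G that returns the pair of bounds (lo, hi) of g over a box:

    def classify_box(G, box):
        """Classify a box with respect to g(x) <= 0 from the interval bounds of G.
        G(box) must return (lo, hi), an enclosure of g over the box."""
        lo, hi = G(box)
        if hi <= 0.0:
            return "feasible"        # every point of the box satisfies g(x) <= 0
        if lo > 0.0:
            return "infeasible"      # no point of the box can satisfy g(x) <= 0
        return "indeterminate"       # the bounds straddle zero; subdivide further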
Equality Constraints
A box, Xk, may be determined to contain feasible points with respect to an
equality, h(x), iff 0 ∈ H(X), the inclusion of h(x).
Clearly it is not possible for the process of finite subdivision to result in a box
which is entirely feasible with respect to an equality, h(x) = 0. For this reason
some form of relaxation must be made with respect to equality constraints. It
³Relational expressions on 𝕀 are outlined in §3.1.
Alternatively, a maximum box width, β, is chosen and boxes satisfying w(Xk) ≤
β are considered to be feasible. This results in a 'chain' of boxes, below the
acceptable size, along the equality. As with the treatment of inequality con-
straints, this retains all feasible points.
If it holds that

w(G) → 0  and  w(H) → 0  as  w(X) → 0,                                 (5.23)

then the region Z can be reduced such that its complement with A is smaller
than any reasonably specified tolerance. It is always the case that A ⊆ Z so
that no feasible points are discarded.
As boxes are divided and bounds on the objective function are accumulated
they may be discarded if they do not contain any feasible points or if the lower
bound on f(x) over the box is greater than the current estimate of the global
optimum, ŷ.
contains constraints Xk must be a feasible point. Thus only those Xk that are
feasible contribute. This can be improved if a feasible point algorithm is used
to determine a feasible point in each indeterminate box and improved further
if a constrained local search is performed. In the case of equality constrained
problems it is necessary to use a feasible point search, or constrained local
optimiser, in order to generate values for ŷ.
5.1 Implementation
When implementing an interval optimisation algorithm a number of decisions
must be made. These concern: choosing which box from the list of possible
boxes should be investigated first, how boxes should be partitioned and how the
termination criteria are to be satisfied.
Ideally the box to be chosen from the list will be the one that, once partitioned
and bounded, will result in the removal of more boxes from the list. In practice
it is not possible to know which box, Xk, will be the best before evaluating
F(Xk); thus a choice may be made between choosing the box with the lowest
upper bound or the lowest lower bound. Since the lower bound must be
stored with Xk regardless of which scheme is chosen, and boxes are excluded
on the basis of this lower bound, choosing the box with the lowest lower bound
seems to be the more practical choice.
F_j(X) = F(x_1, ..., x_(j−1), X_j, x_(j+1), ..., x_n),                 (5.24)

that is, the inclusion of f(x) over X, with all but the jth component of X
reduced to a point. This results in j extra evaluations of F(X) per iteration
but reduces the overall number of partitions that need to be made. The points
x_j are usually taken to be the points at which the partition is to be made.
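One way to realise this direction-selection idea is sketched below; the representation of boxes as lists of (lo, hi) pairs, and the callable inclusion function F, are assumptions of the sketch rather than the authors' implementation.

    def best_partition_direction(F, box, points):
        """Rank coordinate directions as described above.
        F(box) returns (lo, hi); box is a list of (lo, hi) pairs; points[j] is the
        intended partition point in direction j. Returns the direction whose inclusion,
        with all other components reduced to points, is widest."""
        widths = []
        for j in range(len(box)):
            thin_box = [(points[k], points[k]) if k != j else box[k]
                        for k in range(len(box))]
            lo, hi = F(thin_box)
            widths.append(hi - lo)          # extra evaluations of F per iteration
        return max(range(len(box)), key=widths.__getitem__)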
The results were obtained on an Intel 486DX using original code written in
C++.
A second order Taylor form inclusion function was used in addition to the
natural interval extension. If c ∈ ℝ² is a point in X ∈ 𝕀², the Taylor form
inclusion, F_T, is given by

F_T(X) = f(c) + ∇f(c)^T (X − c) + (1/2)(X − c)^T H(X)(X − c),

where H(X) is an interval enclosure of the Hessian of f over X.
This problem has 15 stationary points in the region of interest. The two global
minima are at (0.0898, −0.7126)^T and (−0.0898, 0.7126)^T. With ε < 10⁻⁴ the
Natural Interval Extension did not converge in 1000 subdivisions. For ε = 10⁻⁷
the Taylor form converges in less than 400 subdivisions, to a list with two
members:

([0.0898, 0.0900], [−0.7130, −0.7119])^T  and  ([−0.0900, −0.0898], [0.7119, 0.7130])^T,
necessarily the best. The algorithm using a natural extension of this function
does not converge for values of ε < 10⁻⁴, independent of the maximum number
of iterations allowed. The Taylor form, however, is of a higher convergence order
and provides better bounds on the objective value over small boxes. These
results, along with the CPU time required, are summarised in Table 1.
The second problem (Eqn. 5.28) exhibits two global minima and demonstrates
how the interval optimisation locates both while discarding local minima.
Given X₀ = [(−5, 8), (−2, 3), (−2, 3)]^T this problem converges to the same point
for both bisection methods but requires 110 bisections using Eqn 5.24 as com-
pared to 174 with the traditional bisection mechanism.
X* = [ [−3×10⁻⁵, 2×10⁻⁵]
       ±2.7859
       [−1.83×10⁻⁴, 1.22×10⁻⁵] ]                                       (5.29)
A multiextremal constrained problem (Eqn. 5.30) from [5] has also been solved:

min(x ∈ A)   −x1 + x1 x2 − x2
s.t.         −6x1 + 8x2 ≤ 3
             3x1 − x2 ≤ 3                                              (5.30)
             x1 ≥ 0,  x2 ≤ 5.

This problem has two minima, as reported by [5], which occur at [0.916, 1.062]
and [1.167, 0.5] with f(x) = −1.0052 and −1.0833 respectively.
The problem is solved by modifying the objective function, f(x), and the inclu-
sion function, F(X), such that infeasible boxes are discarded while feasible and
indeterminate boxes are retained. Only feasible points contribute to the upper
bound.
Because boxes must be divided until they are considered feasible this approach
can result in a large number of boxes if the maximum width, β (see §4), is small.
This problem required in excess of 2000 divisions (> 1208) to achieve an un-
certainty on the value of f(x) of less than 10⁻⁴. At this point the algorithm
was stopped. There were 607 interval/bound pairs, all lying on the constraint,
on the list. All of these pairs fit into a box K ∈ 𝕀² with a range of f(x), Y ∈ 𝕀.
This illustrates one of the difficulties of using interval methods for constrained
problems. The minima for this problem lie on the boundary of A. This bound-
ary is not orthogonal therefore a finite number of orthogonal subdivisions can-
not produce an entirely feasible box containing the global optimum. Thus, the
In terms of practical application any point in any of the 607 boxes will provide
a starting point for which f(x) < −1.0052. Furthermore a local optimisation
procedure will converge to the global optimum given any one of these boxes as
a starting point.
In practice the maximum acceptable size for an indeterminate box can be made
very small. However this has an appreciable impact on the computation time
required to solve the problem. It may not be necessary if a local optimisation
algorithm is to be used to refine the solution as this will converge to the global
minimum feasible point within the box.
6 CONCLUSIONS
In this chapter we have shown an approach to solving nonconvex Nonlinear Pro-
gramming problems. This approach is a method of covering the feasible region,
obtaining bounds on f(x) using Interval Analysis and locating the global minima
using the exclusion principle. The solutions provided are intervals containing
the global minimisers of f(x).
The solutions provided by the interval method are all the global minimisers.
This is particularly useful in Engineering Design where factors that are not
included in the objective function, but are important in the final design, can be
used to make the final decision between globally optimal solutions.
Some aspects of this technique have been considered and test problems used
to illustrate them. The first test problem shows how different inclusion func-
tions can affect performance and robustness. It was shown that a higher order
inclusion function can considerably increase the accuracy possible in the solu-
tion. This does not necessarily imply that the higher order inclusion will be
the best for all problems. The lower order inclusion functions produce tighter
bounds over large intervals. Moreover, the cost of evaluating, for example, a
Taylor Form Inclusion where automatic differentiation is performed may res-
ult in fewer divisions but could require more computation time per division
possibly reducing the overall efficiency. A combination of these forms may be
appropriate using a Natural Extension for large intervals and augmenting with
one of the centred forms close to the solution.
The same aspects of efficiency are relevant to the second problem. This problem
highlights the improvement that can be obtained by changing the partitioning
strategy using more information about the problem than the standard bisection
method. Again the performance improvement will depend on the scaling of the
variables and the cost of evaluating the objective function f(x) as opposed to
the relatively inexpensive partitioning phase. A number of other 'accelerations'
are described in [17].
The third problem has a nonconvex objective with linear constraints and serves
to indicate the use of Interval Subdivision in constrained problems whose global
minimisers lie on the boundary of the feasible region. This argument applies
equally to equality constrained problems, where the solution must lie on a
constraint.
The Interval Optimisation algorithm must relax the feasibility criteria in order
to solve these problems. It is shown that, while the number of divisions may be
high, the algorithm can produce solutions to a prescribed accuracy. Therefore
the Interval Algorithm can certainly be used to supplement current convex
optimisation algorithms allowing location of the global optimum.
Future Work.
It is clear that an appropriate choice of inclusion function and partitioning
strategy can reduce the number of subdivisions that must be generated but it is
not clear how this affects performance when the objective function is expensive
to evaluate. A more extensive study, with an efficient implementation, to profile
the performance of these algorithms would provide insight into the optimal
implementation.
Interval Methods have successfully been applied to solving large scale nonlinear
equations using parallel computer architectures [18]. The Interval Optimisation
method is very similar and can probably be scaled in a similar fashion. This
Acknowledgements
This work was funded by the Engineering and Physical Sciences Research Coun-
cil and performed as part of the Centre for Process Systems Engineering.
The authors would like to thank Kevlin Henney for his consistently excellent
advice on the subject of C++.
REFERENCES
[1] Rinnooy Kan, A.H.G., Timmer, G.T. (1987). "Stochastic Global Optimiza-
tion Methods Part II: Multi-Level Methods." Math. Prog. 39 (1) 57-78.

[2] Rinnooy Kan, A.H.G., Timmer, G.T. (1987). "Stochastic Global Optimiza-
tion Methods Part I: Clustering Methods." Math. Prog. 39 (1) 27-56.

[3] Androulakis, I.P. and Venkatasubramanian, V. (1991) "A Genetic Al-
gorithmic Framework for Process Design and Optimization." Comput.
Chem. Engng. 15 (4) 217-228.

[4] Floudas, C.A., Visweswaran, V. (1990) "A Global Optimization Algorithm
(GOP) for Certain Classes of Nonconvex NLPs. I. Theory." Comput. Chem.
Eng. 14 (12) 1397-1417.

[5] Floudas, C.A., Aggarwal, A., Ciric, A.R. (1989) "Global Optimum Search
for Nonconvex NLP and MINLP Problems." Comput. Chem. Eng. 13
(10) 1117-1132.

[6] Quesada, I., Grossmann, I.E. (1993) "Global Optimization Algorithm for
Heat-Exchanger Networks." Ind. Eng. Chem. Res. 32 (3) 487-499.

[7] Hansen, E. (1992) "Global Optimization Using Interval Analysis." Marcel
Dekker, New York.

[8] Kocis, G.R., Grossmann, I.E. (1991). "Global Optimization of Noncon-
vex Mixed-Integer Nonlinear-Programming (MINLP) Problems in Process
Synthesis." Ind. Eng. Chem. Res. 27 (8) 1407-1421.

[9] Piyavskii, S.A. (1972). "An Algorithm for Finding the Absolute Extremum
of a Function." USSR Comput. Math. & Math. Phys. 12 57-67.

[10] Meewella, C.C., Mayne, D.Q. (1988) "An Algorithm for Global Optimization
of Lipschitz Continuous Functions." J. Optim. Theory Appl. 57 (2) 307-322.

[11] Törn, A., Žilinskas, A. (1989), "Global Optimization." Lecture Notes in
Computer Science, Springer-Verlag, Berlin.

[12] Hansen, P., Jaumard, B., Lu, S.H. (1992). "On Using Estimates of Lipschitz
Constants in Global Optimization." J. Optim. Theory Appl. 75 (1) 195-200.

[13] Hansen, P., Jaumard, B., Lu, S.H. (1992). "Global Optimization of Uni-
variate Lipschitz Functions. I. Survey and Properties." Math. Prog. 55
(3) 251-272.

[14] Moore, R.E. (1966), "Interval Analysis." Prentice-Hall, Englewood Cliffs,
NJ.

[15] Ratschek, H., Rokne, J. (1988), "New Computer Methods for Global Op-
timization", Ellis Horwood Ltd., Chichester, West Sussex, England.
ABSTRACT
In this work, we introduce a global optimization algorithm based on interval analysis
for solving nonconvex Mixed Integer Nonlinear Programs (MINLPs). The algorithm
is a generalization of the procedure proposed by the authors (Vaidyanathan and El-
Halwagi, 1994a) for solving nonconvex Nonlinear Programs (NLPs) globally. The al-
gorithm features several tools for accelerating the convergence to the global solution.
A new discretization procedure is proposed within the framework of interval analysis
for partitioning the search space. Furthermore, infeasible search spaces are elimi-
nated without directly checking the constraints. Illustrative examples are solved to
demonstrate the applicability of the proposed algorithm to solve nonconvex MINLPs
efficiently.
1 INTRODUCTION
A large number of chemical engineering problems can be formulated as mixed-
integer nonlinear programs (MINLPs). These MINLPs are typically nonconvex
and hence possess multiple local optima. Over the past three decades, a number
of algorithms have been proposed to solve optimization programs globally (for
recent reviews the reader is referred to Floudas and Grossmann, 1994; Floudas
and Pardalos, 1992 and Horst, 1990). The proposed procedures have mainly
been developed utilizing branch and bound, outer approximation, primal-dual
decomposition and interval analysis principles. Swaney (1990), Maranas and
Floudas (1994) and Ryoo and Sahinidis (1994) have developed global optimiza-
tion algorithms based on branch and bound methods. An outer-approximation
Interval analysis can provide an attractive framework for the global solution of
nonconvex optimization problems. Interval analysis algorithms are based on the
concept of continually deleting sub-optimal portions of the search space until
only the global solution is retained. Interval Analysis algorithms have the
attractive property of guaranteed convergence to the global solution. The con-
cept of interval analysis was originally introduced to provide upper and lower
bounds on errors that occur in computations (Moore, 1966). Since then, the
scope of interval analysis has been significantly enhanced, particularly in the
area of global optimization. A number of implementations of interval-based
optimization procedures have been recently developed to solve nonlinear pro-
grams, "NLPs" (e.g. Moore et al., 1992; Ratschek and Rokne, 1991; Hansen,
1980; Moore, 1979; Ichida and Fujii, 1979). However, they are all computa-
tionally intensive for most problems. Recently, Vaidyanathan and El-Halwagi
(1994) have introduced an interval-based global optimization procedure for the
solution of NLPs. In particular, they have introduced new techniques that ac-
celerate the solution and eliminate infeasible domains without directly checking
the constraints.
In this work, we propose a new algorithm for the global solution of MINLPs.
This algorithm is a generalization of the NLP-solving procedure developed by
Vaidyanathan and El-Halwagi (1994). In addition to the accelerating tools, new
strategies for partitioning the search space in the presence of discrete variables
will be discussed. For computational economy, the procedure treats discrete
variables as being continuous while applying the interval analysis tests. Case
studies have been solved to illustrate the efficacy of the algorithm.
2 PROBLEM STATEMENT
The problem to be addressed in this chapter can be stated as follows:
minimize (globally)  f(x, y)
subject to the nonlinear inequality constraints,
which define the initial search box B. The domain (search space) is represented
by both continuous variables (x) and discrete variables (y).
i.e. x is real-valued and y is integer-valued.
The objective function f(x,y) is assumed to be continuous and twice differen-
tiable whereas each constraint pᵢ(x, y) is assumed to be continuous and once
differentiable. Equality constraints can be handled as two inequalities. Alter-
natively, an equality constraint may be eliminated by solving for some variables
that are separable.
The width of an interval box is the maximum edge length over all the co-
ordinate directions, i.e.
w(X) = max(1 ≤ i ≤ n) w(Xᵢ),                                           (6.2)
where

w(Xᵢ) = bᵢ − aᵢ  for  Xᵢ = [aᵢ, bᵢ].                                   (6.3)
Interval Arithmetic: The basic mathematical operations that are performed
with real numbers can also be performed with intervals. A set of rules has
been established to carry out the mathematical operations with intervals. Some
rules for performing interval operations are:

Addition
[a, b] + [c, d] = [a + c, b + d].                                      (6.4)

Negation
−[a, b] = [−b, −a].                                                    (6.5)

Multiplication
[a, b] * [c, d] = [min(ac, ad, bc, bd), max(ac, ad, bc, bd)].          (6.6)

Division
[a, b] / [c, d] = [min(a/c, a/d, b/c, b/d), max(a/c, a/d, b/c, b/d)],  (6.7)
if 0 ∉ [c, d] (other rules apply when 0 ∈ [c, d]).                     (6.8)
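These rules translate directly into code; the following sketch (intervals as (lo, hi) tuples, division restricted to the case 0 ∉ [c, d], as in the text) is illustrative only.

    def iadd(x, y):                      # Eqn (6.4)
        (a, b), (c, d) = x, y
        return (a + c, b + d)

    def ineg(x):                         # Eqn (6.5)
        a, b = x
        return (-b, -a)

    def imul(x, y):                      # Eqn (6.6)
        (a, b), (c, d) = x, y
        p = (a * c, a * d, b * c, b * d)
        return (min(p), max(p))

    def idiv(x, y):                      # Eqns (6.7)-(6.8): valid only when 0 is not in [c, d]
        (a, b), (c, d) = x, y
        if c <= 0.0 <= d:
            raise ValueError("0 in divisor interval; extended rules apply")
        q = (a / c, a / d, b / c, b / d)
        return (min(q), max(q))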
Monotonicity test:
Consider a certainly strictly feasible sub-box X. By certainly strictly feasible,
we mean that Pᵢ(X) < 0 for all i. Let gᵢ(x) denote the gradient of the objective
function f(x) with respect to xᵢ. Also, let Gᵢ(X) be the inclusion of gᵢ(x) over
X. If

0 ∉ Gᵢ(X)  for all i,                                                  (6.13)

then the objective function is monotonic in all the coordinate directions. Hence,
only the end point that corresponds to the minimum objective function value
in the box should be retained and the rest of the box X can be deleted.
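A sketch of the monotonicity test, assuming each Gi returns an enclosure (lo, hi) of the corresponding gradient component over the box:

    def monotonicity_test(G_list, box):
        """G_list[i](box) returns (lo, hi), an enclosure of df/dx_i over the box.
        If every gradient enclosure excludes zero, f is strictly monotonic in each
        coordinate, and only the corner minimising f need be retained."""
        corner = []
        for Gi, (lo_i, hi_i) in zip(G_list, box):
            g_lo, g_hi = Gi(box)
            if g_lo <= 0.0 <= g_hi:
                return None                  # not monotonic in this direction: keep the box
            # f decreases along x_i if the derivative enclosure is negative, so the
            # minimum lies at the upper end point, and conversely for a positive one
            corner.append(hi_i if g_hi < 0.0 else lo_i)
        return corner                        # the single point to retain from the box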
Having discussed the basic principles of interval analysis and their applica-
tion in global optimization, we are now in a position to present our proposed
algorithm in the next section. It is based on generalizing the interval-based
algorithm proposed by Vaidyanathan and El-Halwagi (1994a,b) for tackling
NLPs. In addition, a discretization scheme is developed to tackle the special
aspects of searching over integer variables.
4 REVISED ALGORITHM
The revised interval analysis algorithm for global optimization incorporating
the discretization procedure for treating integer variables will be discussed be-
low. In addition, the algorithm retains the tools developed earlier (Vaidyanathan
and El-Halwagi, 1994a) to significantly accelerate interval-based global opti-
mization algorithms. The integer constraints are relaxed while applying the
interval analysis tests for deleting sub-optimal and/or infeasible portions of the
search space. This relaxation is reconciled later when the search space is par-
titioned. Accordingly, the shifted partitioning strategy has been modified to
utilize the discrete nature of some of the variables in deleting infeasible portions
of the search space.
Let a valid lower bound on the value of the objective function at the global
minimum be denoted by lwbd. Consider a sub-box X of the initial search box
B. Suppose that the inclusion of the objective function over X is given by [LBX,
UBX]. If the following condition holds:

UBX < lwbd,                                                            (6.15)

then the sub-box X is completely infeasible and can hence be deleted (if X
contained a feasible point, its objective value would lie below lwbd, contradicting
the validity of the lower bound).
Several methods can be used to evaluate a lower bound on the value of the
objective function at the global minimum. For instance, if the optimization
program features an objective function and a set of constraints that are rational
polynomials, the procedure proposed by Manousiouthakis and Sourlas (1992)
can be used. For more general structures of NLPs, convex under-estimators
can be used to obtain a lower bound on the objective function (McCormick,
1976).
Given an infeasible point, x ∈ X, the distrust region method will identify a
hypercube of side 2σ around the point x such that the hypercube is completely
infeasible. The scalar σ is called the distrust length.
The task of identifying the hypercube can be formulated as the following opti-
mization problem:
max σ

subject to

Pᵢ([x − σI, x + σI]) > 0  for some i = 1, 2, ..., m,                   (6.16)

σ ≥ 0,                                                                 (6.17)

where Pᵢ is the inclusion of the range of the constraint pᵢ and I is a unit vector.
This optimization program can be solved using any local optimization algo-
rithm since a local solution is sufficient for the purpose. An alternate solution
procedure involves solving the optimization program by trial and error. To
begin with, a large value of σ is assumed and the feasibility with respect to
the constraints described by (6.16) is checked. If one or more of the constraints
are satisfied, a solution has been obtained. Otherwise, σ can be scaled down
iteratively until at least one constraint in (6.16) is satisfied.
The solution to this program identifies a hypercube of side length 2σ surround-
ing x. This hypercube can be completely deleted from the search space. A good
starting infeasible point is the global solution to the unconstrained optimiza-
tion problem which can be obtained via interval-analysis techniques. However,
any point that is infeasible with respect to the constraints of the problem may
be used.
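The trial-and-error variant can be sketched as follows; the shrink factor and the stopping threshold are assumptions made only for illustration.

    def distrust_length(P_list, x, sigma0, shrink=0.5, sigma_min=1e-8):
        """P_list[i](box) returns (lo, hi), an enclosure of constraint p_i over the box.
        Returns a sigma such that the hypercube of side 2*sigma around the infeasible
        point x violates at least one constraint everywhere, or None if none is found."""
        sigma = sigma0
        while sigma > sigma_min:
            cube = [(xi - sigma, xi + sigma) for xi in x]
            # condition (6.16): some constraint is positive over the whole hypercube
            if any(Pi(cube)[0] > 0.0 for Pi in P_list):
                return sigma
            sigma *= shrink            # scale sigma down and try again
        return None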
In general, splitting the search box at a point will yield 2ⁿ sub-boxes. To
avoid such a tremendous increase in the number of sub-boxes, we split the
search box into two sub-boxes only. This is accomplished by partitioning in
only one direction. We arbitrarily assign this direction j as the one with the
largest width among all directions. This selection is aimed at quickly reducing
j = arg max(1 ≤ i ≤ n) w(Xᵢ).                                          (6.19)
In addition, we slightly shift the partitioning point from the local minimum.
Let x* be the local minimum and its component in the jth direction be xj*. The
integer constraints were relaxed while applying the tests described above. This
relaxation will be reconciled while performing the partitioning of the search box
as discussed below. We propose the following rules for partitioning, depending
on the discrete or continuous nature of xj.
Case 1: If xj is real and xj* ≠ aj or bj:
The shifted partitioning around the local minimum will yield the two sub-boxes:
[a1, b1], [a2, b2], ..., [a(j−1), b(j−1)], [aj, xj* − β], [a(j+1), b(j+1)], ..., [an, bn]
and
[a1, b1], [a2, b2], ..., [a(j−1), b(j−1)], [xj* − β, bj], [a(j+1), b(j+1)], ..., [an, bn],
where β is a very small number.

Case 2: If xj is real and xj* = aj or bj:
The partitioning may be carried out at the midpoint of the interval representing
xj. This will yield the following two sub-boxes after partition:
[a1, b1], ..., [a(j−1), b(j−1)], [aj, (aj + bj)/2], [a(j+1), b(j+1)], ..., [an, bn]
and
[a1, b1], ..., [a(j−1), b(j−1)], [(aj + bj)/2, bj], [a(j+1), b(j+1)], ..., [an, bn].

Case 3: If xj is an integer variable and xj* ≠ bj, then the partitioning yields
the two sub-boxes:
[a1, b1], ..., [a(j−1), b(j−1)], [aj, int(xj*)], [a(j+1), b(j+1)], ..., [an, bn]
and
[a1, b1], ..., [a(j−1), b(j−1)], [int(xj*) + 1, bj], [a(j+1), b(j+1)], ..., [an, bn].

Case 4: If xj is an integer variable and xj* = bj, then the partitioning again
yields two sub-boxes. (These partitioning rules are sketched in code below.)
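The representation of boxes, the value of β and, in particular, the Case 4 split (taken here as [aj, bj − 1] and [bj, bj], since the corresponding sub-boxes are not reproduced above) are assumptions of the following sketch.

    def partition_box(box, j, x_star_j, is_integer, beta=1e-6):
        """Split box (a list of [a, b] pairs) in direction j around the local minimum
        component x_star_j, following Cases 1-4 above.  The Case 4 split is an
        assumption, since those sub-boxes are not reproduced in the text."""
        a_j, b_j = box[j]
        if not is_integer:
            if a_j < x_star_j < b_j:                 # Case 1: shifted partition
                cut = x_star_j - beta
            else:                                    # Case 2: split at the midpoint
                cut = 0.5 * (a_j + b_j)
            left, right = [a_j, cut], [cut, b_j]
        else:
            if x_star_j < b_j:                       # Case 3: split between integer values
                left, right = [a_j, int(x_star_j)], [int(x_star_j) + 1, b_j]
            else:                                    # Case 4 (assumed form)
                left, right = [a_j, b_j - 1], [b_j, b_j]
        box1 = [list(iv) for iv in box]
        box2 = [list(iv) for iv in box]
        box1[j], box2[j] = left, right
        return box1, box2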
With these accelerating tools, we are now in a position to present the pro-
posed algorithm as illustrated in Fig. 1.
The details of the proposed interval algorithm for global optimization are pre-
sented next.
Step 1. In this step, input data are prepared in a suitable form. First, the initial
search box (B, which is given by the problem statement) is placed as the
first element in a list L. This list will acquire additional elements as the
algorithm proceeds. In addition, one has to specify ε (the desired width
of the final box), δ (the desired accuracy in the range of the objective
function over the final box) and α (the minimum width of a box below
which a distrust-region method will not be implemented). Furthermore,
lower and upper bounds on the value of the objective function at the global
solution (referred to as lwbd and upbd, respectively) would be evaluated.
As has been previously described, lwbd can be obtained via the methods
proposed by Manousiouthakis and Sourlas (1992) or McCormick (1976).
On the other hand, upbd can be taken as the value ofthe objective function
at a local minimum.
If all of the optimization variables are integers, then ε and δ are each
specified to be zero for termination of the algorithm.
Step 2. Designate the largest box in list L as the active box. If it has a width less
than or equal to ε and the range of the objective function over the box is
less than or equal to δ, go to Step 13. Otherwise, go to Step 3.
Step 3. Relax the integer constraints and assume that the discrete variables can
take any real value within the bounds specified by the box. The discrete
nature of these variables will be reconciled while performing the box
partitioning in Step 11. Go to Step 4.
Step 4. Apply the upper bound test to the active box. If the active box is deleted,
remove it from list L and go to Step 2. If the active box is not deleted, go
to Step 5.
Step 5. Apply the lower bound test to the active box. If the active box is deleted,
remove it from list L and go to Step 2. If the active box is not deleted,
proceed to Step 6.
Step 6. Apply the infeasibility test. If the active box is completely infeasible, delete
it from list L and go to Step 2. Otherwise, go to Step 7.
Step 7. If the width of the active box is greater than α, go to Step 12. Otherwise
go to Step 8.
Step 8. If the active box is certainly strictly feasible, go to step 9. Otherwise, go
to Step 11.
Step 9. Apply the monotonicity test. If the active box is monotonic, add the end
point (which yields the lower value of the objective function) to list L
while deleting the rest of the active box from the list and go to Step 2.
Otherwise, go to Step 10.
Step 10. Apply the nonconvexity test. If the interior of the active box can be
deleted, remove the active box from list L and add its exterior alone to the
list. Then, go to Step 2. Otherwise, go to the next step.
Step 11. Obtain the constrained local minimum (using a local optimizer), with the
integer constraints imposed, in the active box. Apply the discretization
procedure discussed earlier in section 4.3 to partition the box around the
local minimum. Remove the active box from list L and add the two new
sub-boxes to it. If the objective function value of the constrained local
minimum is less than the current upbd, then update upbd. Go to Step 2.
Step 12. Choose an infeasible point in the active box and apply the distrust-region
method to it. Delete the active box from list L and add the sub-boxes that
are created after deleting the distrust sub-box in the active box. Go to
Step 2. If there is no infeasible point in the active box go to Step 8.
Step 13. If all the variables involved in the problem are integers, then go to Step 14.
Otherwise, the remaining boxes in the list L contain the global solution
and the algorithm is, therefore, terminated.
Step 14. The remaining boxes in the list L are all of zero width and therefore,
actually represent a finite number of points in the search space. These
points are then screened for feasibility with respect to the constraints of
the original problem. The objective function is then evaluated at the
feasible points. The point(s) that gives the least value for the objective
function is the global solution. The algorithm is then terminated.
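The box-processing loop of Steps 2-12 can be summarised as the simplified sketch below; the `tests` object bundling the individual tests is an assumption introduced only for illustration, and several details of the full algorithm (e.g. the nonconvexity test and the handling of boxes without infeasible points) are omitted.

    def interval_minlp_skeleton(B, tests, eps, delta, alpha, max_iter=100000):
        """Simplified outline of Steps 2-12.  tests is an assumed object bundling the
        routines discussed in the text; it is not part of the original algorithm."""
        L = [B]                                    # Step 1: list initialised with the search box
        for _ in range(max_iter):
            if not L:
                break
            box = max(L, key=tests.width)          # Step 2: take the largest box
            if tests.width(box) <= eps and tests.f_range(box) <= delta:
                break                              # termination; proceed to Steps 13-14
            L.remove(box)
            if tests.upper_bound_delete(box):      # Step 4
                continue
            if tests.lower_bound_delete(box):      # Step 5
                continue
            if tests.infeasible(box):              # Step 6
                continue
            if tests.width(box) > alpha:           # Steps 7 and 12: distrust region
                L.extend(tests.distrust_subboxes(box))
                continue
            if tests.certainly_feasible(box) and tests.monotonic(box):   # Steps 8-9
                L.append(tests.best_end_point(box))
                continue
            L.extend(tests.partition_around_local_min(box))              # Step 11
        return L                                   # remaining boxes enclose the global solution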
5 ILLUSTRATIVE EXAMPLES
In order to demonstrate the applicability of the proposed algorithm, the fol-
lowing example problems are tackled.
min f = 2x + y,

subject to

−x² − y ≤ −1.25,
x + y ≤ 1.6,
0 ≤ x ≤ 1.6,
y ∈ {0, 1}.
This problem has one real variable (x) and one integer variable (y). The first
constraint is nonlinear and nonconvex in the real variable. The initial box B =
([0, 1.6], [0, 1]) was used to search for the global optimum. The interval analysis
algorithm was applied to the problem with tolerances ε and δ on the width
of the solution box and the objective function being 0.000001 and 0.00001, re-
spectively. The global solution was found to be x = [0.5, 0.5] and y = [1, 1]. The
corresponding range of the objective function is [2, 2]. The computing time
was 0.1 s on a SUN Sparcstation 10.
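Because the example is so small, the reported solution can be checked by fixing each value of the binary y and minimising over x directly; the short sketch below uses the constraints as reconstructed above.

    # Check of Example 1 by enumeration over y: min 2x + y
    # s.t. x**2 + y >= 1.25, x + y <= 1.6, 0 <= x <= 1.6, y in {0, 1}.
    import math

    best = None
    for y in (0, 1):
        x_low = math.sqrt(max(1.25 - y, 0.0))      # smallest x satisfying x**2 + y >= 1.25
        if x_low <= 1.6 and x_low + y <= 1.6:      # remaining constraints
            candidate = (2 * x_low + y, x_low, y)  # objective grows with x, so x_low is optimal
            if best is None or candidate < best:
                best = candidate
    print(best)    # (2.0, 0.5, 1): agrees with the interval result x = 0.5, y = 1, f = 2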
bi-valent, three tri-valent and one tetravalent) were chosen to represent the
initial search space. The groups along with their contributions to the vari-
ous properties are shown in Table 1. The non-linearity and non-convexity in
the program are introduced by the structural feasibility constraints and some
property constraints. The optimization program is represented by:
max   [ Σ(i=1..12) Ygi Xi + 3.5 (Σ(i=1..12) Cpi Xi) ] / (Σ(i=1..12) Mi Xi)

subject to

90 ≤ Σ(i=1..12) Mi Xi ≤ 104,

0.95 ≤ (Σ(i=1..12) Mi Xi) / (Σ(i=1..12) Vi Xi) ≤ 1.25,

1.15 ≤ (Σ(i=1..12) Cpi Xi) / (Σ(i=1..12) Mi Xi) ≤ 1.45,

300 ≤ (Σ(i=1..12) Ygi Xi) / (Σ(i=1..12) Mi Xi) ≤ 380,

{X1 + X2 + X3 + X4 + X5 + 2(X6 + X7 + X8 + X9)
  + 3(X10 + X11) + 4X12 − 6}(X10 + X11) ≤ 0,

X5 + X8 ≤ 1.
Table 1  List of groups and their contributions towards the various properties
used in Example 2

group      id     Mi        Vwi        UHi*              Cpi        Ygi        ΔG°f
                  (g/mol)   (cm³/mol)  ((m/mol)^1/3·(m/s)) (J/mol/K)  (K·g/mol)  (J/mol)
CH3        X1     15        13.6       1130              30.9       6100       -4600 + 95T
Cl         X2     35.5      11.6       1000              27.1       17500      -49000 - 9T
C6H5       X3     77        45.9       3650              85.6       34200      87000 + 167T
COOH       X4     45        18.6       1100              50         13300      -393000 + 118T
CH3COO     X5     59        28.9       2030              75.9       18600      -383000 + 211T
CH2        X6     14        10.2       675               25.3       2700       -22000 + 102T
CH2COO     X7     58        25.4       1575              71.3       15200      359000 + 218T
CHNH2      X8     29        18         1920              36.5       9700       8800 + 222T
C6H4       X9     76        43.3       3300              78.8       29500      100000 + 180T
CH         X10    13        6.8        370               15.6       1900       -2700 + 120T
C6H4       X11    75        40.8       2900              72.0       24800      113000 + 178T
C          X12    12        3.3        35                6.2        5500       20000 + 140T
The problem formulated above was then solved using the proposed interval
analysis algorithm. Since all the optimization variables are integers, the toler-
ance on the width of the solution box ε was taken as 0 and the tolerance on the
objective function inclusion δ was, therefore, taken to be 0 as well. By applying
our interval algorithm, we obtained the following solution:

X1 = [1, 1],   X5 = [1, 1],   X6 = [1, 1],   X12 = [1, 1].
6 CONCLUSIONS
A general interval-based global optimization algorithm has been developed to
solve MINLPs. The algorithm utilizes integer discretization and search accel-
erating tools to eliminate sub-optimal sub-spaces from the search domain. The
solutions provided by this algorithm are guaranteed to be globally optimal.
Illustrative examples demonstrate the applicability of the proposed procedure
to nonconvex mixed integer nonlinear programs.
Acknowledgement
The financial support of the NSF (grant NSF-NYI-CTS-9457013) is gratefully
acknowledged.
REFERENCES
[1] Alefeld G. and J. Herzberger. 1983, "Introduction to Interval Computa-
tions", Academic Press, New York.
ABSTRACT
The problem of selecting processes and planning expansions of a chemical complex to
maximize net present value has been traditionally formulated as a multiperiod, mixed-
integer linear program. In this paper, the problem is approached using an entirely
continuous model. Compared to previous models, the proposed formulation allows
for more general objective functions. In solving the continuous model, minimizing its
nonconvex objective function poses the major obstacle. We overcome this obstacle by
means of a branch-and-bound global optimization algorithm that exploits the concav-
ity and separability of the objective function and the linearity of the constraint set.
The algorithm terminates with the exact global optimum in a finite number of itera-
tions. In addition, computational results demonstrate that the proposed algorithm is
very efficient as, for a number of problems from the literature, it outperforms OSL,
a popular integer programming package. We also develop a procedure for generating
test problems of this kind.
1 INTRODUCTION
The process industry, now a multi-billion dollar international enterprise, com-
prises enormous amounts of natural resources, chemicals, personnel and equip-
ment. Despite the expected growth in the demand of chemicals, the industry
is becoming increasingly competitive while customer demands impose a sig-
nificant complexity on production requirements. This trend necessitates the
development of efficient optimization techniques for planning process opera-
tions.
Each of the final and intermediate products may be output from one or more
processes that reflect different technological recipes. Choosing from among dif-
ferent technological alternatives leads to a problem that grows combinatorially
with the number of potential products and processes. An additional complicat-
ing factor is the matter of when to expand the capacities of the processes. As
market demands, investment and operating costs fluctuate, one would like to
time capacity expansions in a way that takes into account economies of scale
and market dynamics.
The approach taken in this paper is to model economies of scale directly, rep-
resenting costs by univariate concave functions. In this way, the formulations
avoid the use of binary variables. This reformulation allows us to solve the
problem using a concave programming (CP) approach. In addition to elimi-
2.1 Indices
i The network is composed of a set of NP processes (i = 1, ..., NP).
j Streams of NC chemicals (j = 1, ..., NC) may be exchanged by the processes.
l A set of NM markets are involved (l = 1, ..., NM).
t We consider production over the course of NT time periods (t = 1, ..., NT).
2.2 Parameters
The model allows for the purchase of between ajlt^L and ajlt^U units of
chemical j from market l during period t.
The model incorporates the prediction that we will be able to sell
between djlt^L and djlt^U units of chemical j to market l during period
t, at a forecasted price of γjlt per unit.
μij^I, μij^O are the input and output chemical proportionality constants used
for mass balances.
2.3 Variables
Eit The production capacity of process i is expanded by Eit units at the be-
ginning of period t.
Pjlt units of chemical j are purchased from market l at the beginning of period
t.
Qit The total capacity of process i during period t.
Sjlt units of chemical j are sold on market l at the end of period t.
Wit The actual operating level of process i during period t.
2.4 Functions
INVTit(Eit) The amount invested in process i during period t including estab-
lishment or expansion of the process, (but not operating costs).
The function may include fixed-charges for the establishment and
each subsequent expansion of the process, as well as variable costs
that depend on Eit.
OP ERit(Wit) The total cost of operating process i over period t as a function of
the operating level, Wit.
PURCjlt(Pjlt) The total value of purchases of chemical j from market l during
period t as a function of Pjlt.
Formulation I: General

max NPV = − Σ(i=1..NP) Σ(t=1..NT) INVTit(Eit) − Σ(i=1..NP) Σ(t=1..NT) OPERit(Wit)
           + Σ(j=1..NC) Σ(l=1..NM) Σ(t=1..NT) (γjlt Sjlt − PURCjlt(Pjlt))           (7.1)

subject to

ajlt^L ≤ Pjlt ≤ ajlt^U,    j = 1, ..., NC;  l = 1, ..., NM;  t = 1, ..., NT         (7.5)

djlt^L ≤ Sjlt ≤ djlt^U,    j = 1, ..., NC;  l = 1, ..., NM;  t = 1, ..., NT         (7.6)
The objective function seeks to maximize the net present value NPVof the
process plan, considering investment, operation and purchase costs as well as
sales revenues. The set of mass balances, (7.4), reflects the technological recipe
for producing chemical j by means of process i. For simplicity, we assume that
mass balances can be expressed as equations that are linear in terms of the
operating level of the process.
3 FIXED-CHARGE MODELING
The fixed-charge CP formulation of the planning problem is essentially equiv-
alent to the traditional MILP formulation. Thus we begin by describing the
MILP.
max NPV = − Σ(i=1..NP) Σ(t=1..NT) (αit Eit + βit Yit) − Σ(i=1..NP) Σ(t=1..NT) δit Wit
           + Σ(j=1..NC) Σ(l=1..NM) Σ(t=1..NT) (γjlt Sjlt − Γjlt Pjlt)               (7.8)

subject to Constraints (7.2)-(7.7) and

Yit Eit^L ≤ Eit ≤ Yit Eit^U,    i = 1, ..., NP;  t = 1, ..., NT                     (7.9)

Yit ∈ {0, 1},                   i = 1, ..., NP;  t = 1, ..., NT                     (7.10)
INVTit(Eit) = { 0,               when Eit = 0
              { αit Eit + βit,   when Eit > 0.                                      (7.11)
Note that this function is concave in E it . The other terms in the objective
function are retained from the MILP formulation, since these terms do not
min f = −NPV = Σ(i=1..NP) Σ(t=1..NT) INVTit(Eit) + Σ(i=1..NP) Σ(t=1..NT) δit Wit
               − Σ(j=1..NC) Σ(l=1..NM) Σ(t=1..NT) (γjlt Sjlt − Γjlt Pjlt)           (7.12)
subject to Constraints (7.2)-(7.7),
4.1 Reasons
To describe the net present value of a process plan, CP holds another option
which may in many instances reflect the economic reality of industrial oper-
ations better than model FCP. In particular, the individual functions in the
objective constitute three reasons why the use of continuous concave functions
to model costs and revenues enables us to solve a more realistic model. One
may safely assume that the costs of operating a process, expanding a process
capacity, and purchasing raw materials all involve economies of scale. MILP
models force one to assume that these costs are directly proportional to the
amount contracted, but in reality, the per unit cost decreases as the number of
units increases. Hence, the general form of the continuous concave objective is
the same as that of (7.1).
4.2 Model
In order to conduct computational experiments comparing a continuous CP
model to FCP, we have investigated the particular form:
NPV = − Σ(i=1..NP) Σ(t=1..NT) ait Eit^bit − Σ(i=1..NP) Σ(t=1..NT) δit Wit
       + Σ(j=1..NC) Σ(l=1..NM) Σ(t=1..NT) (γjlt Sjlt − Γjlt Pjlt),                  (7.13)

where the OPER and PURC terms match those found in FCP (7.12), but the
INVT term has been changed from a fixed-charge form to the continuous form

INVTit(Eit) = ait Eit^bit,                                                          (7.14)

where ait > 0, and 0 < bit < 1 for i = 1, ..., NP, and t = 1, ..., NT. Note that
(7.14) estimates investment by applying power-factor scaling to plant capacity.
We come to our working form of the continuous CP model of the planning
problem:
Remark 1: It is obvious that one can develop a CP model that involves any
combination of the objective function terms of models FCP and CCP. In this
way, fixed-charges and power functions can be brought together into a more
comprehensive CP model, since, e.g., expansion of a process capacity may
require a fixed reinvestment expense plus a variable cost that is itself concave
in the amount of the expansion.
Remark 2: In the above, the sales revenue term in the objective function has
been assumed to be linear for simplicity of the presentation. In reality, this
term is likely to exhibit diseconomies of scale as prices will fall with increased
amounts of production. This would lead to a nonlinear yet convex term in the
minimization problem to be solved.
Remark 3: As with model FCP, lower and upper limits on the size of expansions
can be enforced by the algorithm as bounds (Section 5.5.4).
5 BRANCH-AND-BOUND GLOBAL
OPTIMIZATION ALGORITHM
must lie between the least of the lower bounds and the least of the upper
bounds, the algorithm may delete any subproblem which has an associated
lower bound that is larger than or equal to the least upper bound.
The procedure will now be formally outlined. L will represent the least of
the lower bounds, U the least of the upper bounds. The algorithm will
view the problem constraints as the intersection of two distinct sets. D will
denote the problem constraints that are not orthogonal to variable axes, e.g., for
FCP and CCP, constraints (7.2)-(7.4). G will denote those problem constraints
which are simple bounds: (7.5)-(7.7) and any desired bounds on budget, the
number of expansions, or the size of individual expansions for FCP and CCP.
In general, G will symbolize a hyperrectangle orthogonal to the variable axes.
For convenience, x will represent the vector of all the problem variables, i.e.,
x = [E, W, S, P]. The major operations (preprocessing, selection, branching,
and bounding), which are italicized in the statement of the algorithm, will be
presented in full detail in the sequel.
5.2 Preprocessing
The algorithm begins by solving NP linear programs:

max Σ(t=1..NT) Wit    s.t.  x ∈ D ∩ G,      i = 1, ..., NP,

letting Wit* denote their respective solutions. For each process i, the method
computes a bound Bi and then sets

G⁰ = G ∩ ⋂(i=1..NP) ⋂(t=1..NT) {Wit : Wit ≤ Wit*} ∩ ⋂(i=1..NP) ⋂(t=1..NT) {Qit : Qit ≤ Bi},
5.3 Selection
In Step k.1. of each iteration k, the procedure selects for bounding a single sub-
problem from the list of open subproblems-specifically, a subproblem which
has the best bound, i.e., the algorithm employs the rule:
Of course, in Step 0.1., the initial problem {min f(x) s.t. x ∈ D ∩ G⁰} is selected
by default.
Using the well-known fact that git(Eit) is the convex envelope of INVTit(Eit)
on [lit^s, uit^s], the distributive property of convex envelopes applies to the entire
nonlinear portion of the objective. Hence, the convex envelope of Σ(i=1..NP) Σ(t=1..NT)
INVTit(Eit) over G^sk is g^sk(E) = Σ(i=1..NP) Σ(t=1..NT) git^sk(Eit). Let ω^sk = [E*, W*, S*,
P*] be a solution to the LP relaxation

min   g^sk(E) + Σ(i=1..NP) Σ(t=1..NT) δit Wit − Σ(j=1..NC) Σ(l=1..NM) Σ(t=1..NT) (γjlt Sjlt − Γjlt Pjlt)
s.t.  [E, W, S, P] ∈ D ∩ G^sk.

For the optimal value of the concave program {min f(x) s.t. x ∈ D ∩ G^sk}, l^sk
gives a lower bound, while u^sk = f(ω^sk) gives an upper bound.
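Although (7.15) itself is not reproduced above, for a concave univariate investment term the convex envelope over a bound interval is simply the secant through its end points, so a plausible form of git, written here only for illustration, is

\[
g^{s}_{it}(E_{it}) \;=\; INVT_{it}(l^{s}_{it}) \;+\;
\frac{INVT_{it}(u^{s}_{it}) - INVT_{it}(l^{s}_{it})}{u^{s}_{it} - l^{s}_{it}}\,
\bigl(E_{it} - l^{s}_{it}\bigr),
\qquad E_{it} \in [\,l^{s}_{it},\, u^{s}_{it}\,].
\]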
two stages. First, from among the set of variables that correspond to nonlinear
objective function terms, the partitioning rule selects a branching variable Eit.
The rule then determines a branching point p within the current domain
of the selected variable. The algorithm uses different branching point and
branching variable selection criteria for problems FCP and CCP.
Over the course of the search, the procedure keeps running averages LPCit
and RPCit of the pseudo-costs associated with each variable that is branched
on. Let S represent the set of subproblems that are no longer open.
Suppose that the algorithm has just computed bounds for the current subprob-
lem and must now select a variable on which to branch. Intuition suggests
splitting the subproblem domain in such a way that the two resultant sub-
problems exhibit smaller underestimation gaps than their parent. Figure 1(a)
illustrates the gap, or violation, between the concave objective term INVTit of
problem s and its linear estimator git^s for the Eit^s component of the relaxation
solution. To select a branching variable that will reduce the said violation, the
procedure can rely on its experience branching on various variables in other
portions of the search tree. In this regard, the average pseudo-costs measure
the potential for each branching variable to induce a gap-reduction. The metric
estimates the highest potential for a branching variable to result in two child
subproblems which both have reduced gaps. The branching rule must not rely
on precedent alone, however. Instead, the branching rule must also consider the
relative importance of each variable to the current subproblem s. The formula
(i, t)* = arg max(i,t) { [INVTit(Eit^s) − git^s(Eit^s)] × min{LPCit, RPCit} }        (7.16)
Figure 1  (a) Relaxation and violation at Eit^s; (c) relaxation after branching into Eit = 0 and Eit > 0 (cost versus expansion).
Branching variable selection: While we wish to reduce the net gap of the present
relaxation upon branching, at the same time, we wish to exploit the relative
potential of each individual variable to reduce the gap over its entire domain of
definition. We propose a rule that balances both considerations. Let mVit rep-
resent the maximum gap between the individual objective term corresponding
to variable Eit and its respective underestimator. Using (7.14) and (7.15), we
may analytically determine that:
(7.17)
is the said gap maximizer. For each variable, our composite variable selection
rule will weight the maximum gap over [lit, uit] by the gap contribution at Eit^s,
its respective component of the current LP solution ω^s:
(7.18)
Figure 2(a) illustrates the violation at Eit^s (drawn arbitrarily), while Figure
2(b) illustrates the maximum gap, which occurs at mVit.
Figure 2  (a) The violation at Eit^s; (b) division at the maximum violation point mVit (cost versus expansion).
Finiteness of FCP. The partitioning scheme for FCP branches on the appli-
cation of fixed-charge, which ensures that the original, nonconvex planning
problem will eventually be reduced to a binary tree of linear subproblems, hav-
ing at most 2^(NP×NT) nodes. Since LP is itself a finite procedure, we can show,
without recourse to the modifying branching rule, that FCP terminates finitely.
Actually, the modifying branching rule can only decelerate the convergence of
FCP, hence it is not employed in the algorithm for FCP.
6.1 Introduction
The importance of process planning to the chemical industries motivates the de-
velopment of exact algorithms and heuristics to obtain optimal or near-optimal
process plans. Test problems with a variety of sizes, structures, and parameters
must be employed in any rigorous testing of such algorithms, and an automatic
test problem generator greatly facilitates this endeavor. This section develops
such a generator. When the numbers of processes and products are input to the
generator, it produces a desired number of problem instances having random
network structure and model parameters.
For any bipartite graph G = (V1 ∪ V2, E), if every node in V1 has at least one
in-arc or out-arc joining it to a node in V2, and every node in V2 has at least
one in-arc joining it to a node in V1 and one out-arc joining it to a node in V1,
then this bipartite graph is a feasible process network.
6.3.1 Legend
The following symbols will be used to describe the generator.
i           index of processes.
j           index of chemicals.
k           counter for added arcs.
d           an indicator of arc direction. It takes a value of −1 or +1.
ARC(j, i)   the indicator of arcs:
            +1, if there is one arc from node j to node i
            0, if there is no arc between nodes j and i
            −1, if there is one arc from node i to node j.
CIN(j)      the indegree of node j in V1.
COUT(j)     the outdegree of node j in V1.
Density     a density control factor for the bipartite graph.
G = (V1 ∪ V2, E)  a bipartite graph with node sets V1 and V2 and arc set E.
MAXA        the maximum number of arcs for a feasible process network.
MINA        the minimum number of arcs for a feasible process network.
NA          the number of generated arcs.
NC          the number of chemicals, i.e., NC = |V1|.
NP          the number of processes, i.e., NP = |V2|.
PIN(i)      the indegree of node i in V2.
POUT(i)     the outdegree of node i in V2.
U(a, b)     a uniform distribution between a and b.
Figure  (a) Flow diagram of a small process network involving Chemicals 1-3 and Processes 1-3, and the corresponding bipartite representation of chemicals and processes.
The program is configured with several input parameters that control the size
of the process network as well as all the cost and price data for constructing a
problem described by the formulations of this paper. To conduct experiments of
a comparative nature, FCP problems are generated first and then transformed
into CCP form as described in the following Subsection.
Figure  Example bipartite graphs of chemicals and processes produced by the generator.
approximate one form of the cost function with the other, by minimizing the
Euclidean distance between them over a given range [l, u]. For convenience, let
us rewrite the fixed-charge cost function and the power cost function as follows:

φ(x) = { 0,         if x = 0,
       { αx + β,    if x ≠ 0,

π(x) = a x^b.

Suppose that α and β are known; the approximating a and b can then be found
by the method of least squares. We will find a and b so as to minimize the
least squares error

LSE = min ∫(l to u) (αx + β − a x^b)² dx.

Differentiating LSE with respect to a and b and setting the partial derivatives
equal to zero, we have

∂LSE/∂a = 0   and   ∂LSE/∂b = 0,                                                    (7.19)

which, for l = 0, give

(α u)/(b + 2) + β/(b + 1) − (a u^b)/(2b + 1) = 0

and a corresponding condition in b,                                                  (7.20)

from which it may not be easy to find closed forms for a and b, but a numerical
approximation is easy to develop. Figure 6 shows a fixed-charge cost function
φ(x) = 95 + 1.56x and its approximation π(x) = 43.5x^0.366 obtained by solving
(7.20) numerically.
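The numerical fit can be reproduced approximately with standard tools; in the sketch below the integration range [0, 100] and the starting point are assumptions, so the fitted a and b should only be expected to lie close to the quoted 43.5 and 0.366.

    import numpy as np
    from scipy.optimize import minimize

    alpha, beta = 1.56, 95.0                    # fixed-charge parameters of phi(x)
    xs = np.linspace(1e-6, 100.0, 2001)         # integration grid on (0, 100]

    def lse(params):
        a, b = params
        # integrated squared error between phi(x) and a*x**b, by the trapezoidal rule
        return np.trapz((alpha * xs + beta - a * xs**b) ** 2, xs)

    res = minimize(lse, x0=[40.0, 0.4], method="Nelder-Mead")
    a_fit, b_fit = res.x                         # expected to be near 43.5 and 0.366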
7 COMPUTATIONAL RESULTS
The proposed algorithm was implemented using BARON [6], a general-purpose
global optimization software for solving nonlinear and mixed integer-nonlinear
programs. BARON employs the branch-and-reduce optimization strategy [7,6]
which integrates conventional branch-and-bound with a wide variety of domain
reduction tools and branching rules. The implementation was done on an IBM
RS/6000 Power PC using the FORTRAN version of BARON. IBM's OSL (Re-
lease 2 [3]) was used to solve the relaxed LPs.
Table 1 describes sixteen example problems from [5], in terms of the numbers
of chemicals, processes, and time periods involved. The number of constraints
(m) and variables (n) for a concave formulation are in each instance substan-
tially less than those required for an MILP formulation. Note that a concave
formulation uses precisely N P x NT fewer variables and constraints because it
eliminates all binary variables and all constraints of the form (7.9).
Figure 6  A fixed-charge cost function and its power-function approximation over [0, 100].
Table 3 shows that for small problems, solving the FCP problem requires about
the same number of nodes and CPU time as its CCP approximation. Never-
theless, Problem 16 illustrates that problems of substantially larger size can be
solved more quickly using the FCP technique. This follows from the fact that
we always branch so that the objective term of the branching variable is no
longer violated in descendant subproblems. Whereas, for CCP, the algorithm
is likely to branch on each variable many times in a single path of the tree.
It should be mentioned that the results obtained here for CCP using the pro-
posed subdivision rule are several orders of magnitude faster than when the
traditional omega subdivision or bisection rule is used instead.
For Tables 5-7, we refer to four different sets of test problems. The first col-
umn of Table 5 gives the specifications for each set of problems in the form
NC-N P-NT. In the second column of this table, we give the number of bi-
nary and continuous variables, and the number of rows and nonzeroes in the
constraint matrix. Whereas the number of binaries and constraints is fixed
according to specifications for each problem set, we express the numbers of
continuous variables and nonzeroes in the constraint matrix using ranges, since
these quantities vary according to the network density of each randomly gen-
erated problem. Each set comprises 27 randomly generated problems.
For the four problem sets, average solution requirements are found in Tables 6
and 7. These tables provide comparisons between MILP and FCP, and between
CCP and FCP, respectively. First, in Table 6, we find nearly identical running
times solving the MILP with OSL versus solving the FCP with BARON. In
spite of this finding, the branch-and-reduce approach requires about one-half
as many nodes in the search tree.
Secondly, Table 7 shows that BARON solves randomly generated FCPs four
times faster than the corresponding CCPs. As with the example problems in
Table 3, we find that both approaches give the exact same solutions and nearly
the same solution values.
across two problem sizes, specified by NC-N P-NT. Each of the six rows
averages results for 27 randomly generated problems. Together with Figure
7, this table indicates a drastic decrease in computational requirements as the
network density of the problems increases. This is due to an increased number
of nonzeroes in the constraints of form (7.4), which shrinks the feasible region
so that it is explored more quickly by the algorithm.
8 CONCLUSIONS
1. The FCP approach to the process planning problem appears computationally more expedient than solving the equivalent MILP. For the examples of actual process planning problems, BARON requires about one-third of the time taken by OSL. Although running times of the two codes are, at present, roughly the same when solving randomly generated problems,
[Figure: CPU time (sec) versus network density (%) for problem sets 15-15-6 and 15-20-6.]
3. The novel subdivision rule proposed for CCP largely outperforms the bi-
section and omega subdivision rules traditionally used in concave program-
ming. As the proposed subdivision strategy is applicable to general concave
programs, its application to additional problems is clearly of interest.
9 ACKNOWLEDGEMENTS
Partial financial support from the EXXON Education Foundation and from the
National Science Foundation under grant DMII 94-14615 is gratefully acknowl-
edged.
REFERENCES
[1] A. Brooke, D. Kendrick and A. Meeraus. GAMS-A User's Guide. The
Scientific Press, Redwood City, CA, 1988.
[2] M. Benichou, J. M. Gauthier, P. Girodet, G. Hentges, G. Ribiere, and
O. Vincent. Experiments in mixed-integer linear programming. Mathe-
matical Programming, 1:76-94, 1971.
[5] M. L. Liu and N. V. Sahinidis. Long range planning in the process industries: A projection approach. To appear in Computers & Operations Research, 1995.
ABSTRACT
8.1 INTRODUCTION
Due to the lack of accurate process models and the variability of process and environmental data, it is indispensable to establish systematic ways to solve process planning, scheduling and design problems involving stochastic elements. At the same time, it is well understood that the consideration of uncertainty transforms the problem from a deterministic one, where standard methods of mathematical programming can be applied, to a stochastic problem where special techniques are required. Decomposition techniques (Bienstock and Shapiro, 1988; Pistikopoulos and Grossmann, 1989a,b; Dantzig, 1989; Straub and Grossmann, 1993) based on the exploitation of the stage-wise nature of the stochastic problem, as well as large-scale optimization approaches (Grossmann et al., 1983; Brauers and Weber, 1988; Sahinidis et al., 1989) based on the discretization of the uncertain parameter space, have appeared in the open literature to deal with the problem of process design/planning under uncertainty. Yet, in many model instances, only local solutions can be obtained, without any guarantee and/or proof of global optimality.
The main objectives of this work are to present mathematical models for the planning/design of continuous and batch plants for the case when some of the model parameters are random variables described by probability distribution functions or considered to vary within a specified range, and to develop suitable solution techniques to determine the globally optimal design/plan based on model reformulation and decomposition principles. In particular, we propose a two-stage stochastic programming formulation as the basis to model in a consistent way stochastic process design, planning and scheduling problems of continuous and/or multiproduct/multipurpose batch plants. Global optimization solution strategies are then presented for the efficient solution of these models.
The chapter is organised as follows. Section 8.2 deals with linear stochastic planning models, a decomposition-based global optimization algorithm for the resulting bilevel linear two-stage stochastic programming formulation, and a single-level reformulation based on the relaxation of demand requirements. A similar single-level reformulation is described in section 8.3 for the problem of scheduling of multiproduct continuous plants with uncertain demands. A global optimization algorithm for batch plant design and scheduling models is presented in section 8.4.
[Figure 8.1: a production network of processing units interconnected by streams.]
We consider production networks (similar to the one shown in Figure 8.1) con-
sisting of M existing processes which are interconnected by a set of streams
(arcs) denoting raw materials, intermediates and products which may be pur-
chased/sold in the market subject to prices, availabilities and production ca-
pacities. The general planning problem involves the determination of optimal
decisions regarding production profiles, sales and purchases of chemicals as well
as capacity expansion policies of existing processes.
Consider a time horizon of operation [0, T] which is divided into T time periods. We introduce the following notation:
Index sets:
i   chemical, i = 1,..,N
j   process, j = 1,..,M
t   time period, t = 1,..,T
Parameters:
a^L_{i,t}, a^U_{i,t}   lower and upper bounds on the availability of chemical i at period t
d^L_{i,t}, d^U_{i,t}   lower and upper bounds on the demands of chemical i at period t
CE^L_{j,t}, CE^U_{j,t}   lower and upper bounds on the capacity expansion of process j at period t
I^{min}_{i,t}, I^{max}_{i,t}   minimum and maximum required amount of inventory for chemical i at the end of each time period t
b_{i,j}   stoichiometric coefficient for chemical i in process j
γ_{i,t}   sale price of chemical i at time period t
Λ_{i,t}   purchase price of chemical i at time period t
ζ_{i,t}   value of the final inventory of chemical i at time period t
Λ_{i,t}   value of the starting inventory of chemical i at time period t (may be taken as the material purchase price)
C_{j,t}   operating cost of process j at time period t
α_{j,t}   variable-size cost coefficient for the investment cost of capacity expansion of process j at time period t
β_{j,t}   fixed-cost charge for the investment cost of capacity expansion of process j at time period t
Variables:
x_{j,t}   production of process j during time period t
CE_{j,t}   potential capacity expansion of process j at period t
y_{j,t}   binary variable representing the occurrence or not of an expansion of process j at period t
S_{i,t}   amount of chemical i sold at period t
P_{i,t}   amount of chemical i purchased at period t
I^s_{i,t}, I^f_{i,t}   initial and final inventory of chemical i at period t
Problem (P)

max Profit = Σ_{t=1}^{T} [ Σ_{i=1}^{N} γ_{i,t} S_{i,t} + Σ_{i=1}^{N} ζ_{i,t} I^f_{i,t} - Σ_{i=1}^{N} Λ_{i,t} P_{i,t} - Σ_{i=1}^{N} Λ_{i,t} I^s_{i,t} - Σ_{j=1}^{M} C_{j,t} x_{j,t} - Σ_{j=1}^{M} (α_{j,t} CE_{j,t} + β_{j,t} y_{j,t}) ]   (8.1)
Problem (SP)

max_{x_1}  c_1 x_1 + E_θ { max_{x_2} c_2(θ) x_2 }
s.t.  A_1 x_1 ≤ b_1
      B_1 x_1 + B_2(θ) x_2 ≤ b_2(θ)
      x_1, x_2 ≥ 0,   θ ∈ R

where x_1 = (x_{jt}, y_{jt}, CE_{jt}, S_{it}, P_{it}), t = 1,..,T1 (except for y_{jt}, CE_{jt}, where t = 1,..,T1+T2), is the first-stage decision vector; x_2 = (x_{jt}, S_{it}, P_{it}) corresponds to the second-stage decision vector; θ is the vector of uncertain parameters, which may involve d^L_{it}, d^U_{it}, a^L_{it}, a^U_{it}, Λ_{it}, γ_{it}, C_{jt}, t = T1+1,..,T1+T2, with associated distribution function J(θ); E_θ{.} is the expectation operator over θ; and R is the feasible region of (x_1, x_2), i.e. R = {θ | ∃ x_2 ≥ 0 : B_1 x_1 + B_2(θ) x_2 ≤ b_2(θ)}.
Note that capacity expansions are considered as first-stage decision variables, which typically have to be decided prior to the resolution of uncertainty. It should also be noted that in the above formulation the estimation of the expected profit is based on feasible planning decisions, i.e. we do not account for partially meeting customer orders, nor do we penalize unfilled orders (which may result in pessimistic planning decisions).
For the solution of problem (SP), Ierapetritou and Pistikopoulos (1994) pro-
posed an algorithm, which essentially transforms the two-stage stochastic op-
timization problem into a series of deterministic optimization subproblems ex-
ecuted in an iterative fashion. In particular, feasibility subproblems are first
solved to induce the feasible region of a selected plan. This allows profit maxi-
mization to be effectively performed providing a lower bound. A master prob-
lem is then computed from dual information of the feasibility and profit sub-
problems, the solution of which returns a new plan (to be examined) while
providing an upper bound. The algorithm converges to an optimal plan; i.e.
a plan with maximum expected profit and sufficient ("optimal") plan feasibil-
ity. The main steps of the proposed algorithmic procedure for the case of two
uncertain parameters are the following (see also Figure 8.2):
Step 1: Select an initial plan (x_1^1, y_1^1, y_2^1, CE_1^1, CE_2^1). Set the lower bound EP_L = -∞, k = 1, and select a tolerance ε.
Step 2: Solve the feasibility subproblems (B1), (B2^{q1}) to obtain the boundary of the feasible region and the corresponding Lagrange multipliers (Straub and Grossmann, 1993).
Step 4: Calculate the expected profit (EP) using the Gaussian quadrature formula:
Step 5b: Solve the following master problem to obtain a new plan (x_1^{k+1}, y_1^{k+1}, y_2^{k+1}, CE_1^{k+1}, CE_2^{k+1}) and an upper bound EP_U^k:

- Σ_{q1=1}^{Q1} Σ_{q2=1}^{Q2} η_{q1 q2} [ A_3 (x_1 + CE_1) + B_1(θ_1^{q1}, θ_2^{q1 q2}) x_2^{q1 q2} + B_2 CE_2 - b_2(θ_1^{q1}, θ_2^{q1 q2}) ]
- Σ_{q1=1}^{Q1} λ^{L q1} [ A_3 (x_1 + CE_1) + B_1(θ_1^{q1}, θ_2^{L q1}) x_2(.) + B_2 CE_2 - b_2(θ_1^{q1}, θ_2^{L q1}) ]
- Σ_{q1=1}^{Q1} λ^{U q1} [ A_3 (x_1 + CE_1) + B_1(θ_1^{q1}, θ_2^{U q1}) x_2(.) + B_2 CE_2 - b_2(θ_1^{q1}, θ_2^{U q1}) ]

Step 6: If EP_U^k ≤ EP_L + ε, stop; the solution is the plan (x_1^{k*}, y_1^{k*}, y_2^{k*}, CE_1^{k*}, CE_2^{k*}) with expected profit EP_L. Otherwise, set k = k+1 and return to Step 2.
[Figure 8.2: flowchart of the algorithm: select an initial plan, solve the feasibility subproblems, maximize the profit at each quadrature point, use the dual information to obtain a new plan, set plan_k = plan_new and k = k+1, and repeat until the optimal plan is obtained.]
Note also that the algorithm follows the general Benders decomposition principles; its main difference is that the primal problem involves the solution of a sequence of feasibility and profit subproblems, the dual information of which is then all incorporated in the master problem.
Problem (SP1)

max_{x_1}  c_1 x_1 + E_{θ∈R(x_1)} { max_{x_2} c_2 x_2 } + E_{θ∈R(x_1)} { max_{s_2} c_{2s} s_2 }
Other strategies that have appeared in the open literature with regard to uncertain demand considerations are the following:
[Figure: upper and lower bounds of the expected profit for the cases s_i = θ_i, θ_i ∈ R(x_1) and θ_i ∈ T(θ).]
Figure 8.4  Effect of penalty coefficient γ
Property - Any first-stage plan (x_1) satisfying the constraints of problem (P) for fixed product demands θ_i is feasible for the whole parameter range T(θ).
The important implication of this property is that the integration can be performed within the region defined by the bounds of the uncertain demand parameters, since the feasible region of any design obtained from the solution of problem (P), in the space of the uncertain demands, coincides with the uncertain parameter range. Therefore, the obstacle of unknown integrands is effectively overcome.
Problem (SP2)

max EP = c_1 x_1 + Σ_{q}^{Q} w_q c_2 x_2^q J(θ^q) + Σ_{q}^{Q} w_q c_{2s} s_2^q J(θ^q) - γ Σ_{q}^{Q} w_q p_2 (θ^q - s_2^q) J(θ^q)

subject to
A_1 x_1 ≤ b_1
B_1 x_1 + B_2 x_2^q + B_{2s} s_2^q ≤ b_2   ∀q
s_2^q ≤ θ^q   ∀q
x_1, x_2^q, s_2^q ≥ 0,   θ^q ∈ T(θ)
Notice that the location of quadrature points within the plant's feasible region is
fully determined from the optimization procedure. In this way, the postulation
of arbitrary scenarios is effectively avoided; instead, the number and location
of scenarios (integration points in this case) will only depend on the degree of
accuracy required for the integration. Moreover, since (SP2) corresponds to a
single level optimization problem, conventional solution algorithms can be used
to obtain the global optimal solution for linear and convex planning models.
Note also that the case of additional uncertain parameters appearing only in
the objective function, can be treated similarly in a straightforward manner
(Ierapetritou et al. 1995).
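As an illustration of the structure of (SP2), the following sketch assembles a small scenario-weighted LP with scipy.optimize.linprog: one first-stage variable plus one (x_2^q, s_2^q) block per quadrature point, with s_2^q ≤ θ^q and a penalty on unmet demand. All numerical data (quadrature points, weights, costs and the simple capacity constraints) are illustrative placeholders and not the refinery data of the example below; only the block structure follows the formulation.

# Sketch of the single-level quadrature reformulation: one LP whose scenarios
# are the quadrature points.  All numbers below are illustrative placeholders.
import numpy as np
from scipy.optimize import linprog

theta = np.array([11.0, 13.0, 15.0, 17.0, 19.0])   # quadrature points of the uncertain demand
w     = np.array([0.12, 0.23, 0.30, 0.23, 0.12])   # quadrature weights (assumed to absorb J(theta))
Q = len(theta)

c1, c2, c2s, gamma, p2 = 2.0, -0.5, 4.0, 1.0, 1.5  # capacity cost, operating cost, sale price, penalty data

# variable vector z = [x1, x2_1..x2_Q, s2_1..s2_Q]; linprog minimizes, so we minimize -profit
n = 1 + 2 * Q
obj = np.zeros(n)
obj[0] = c1                              # first-stage capacity cost
obj[1:1 + Q] = -w * c2                   # per-scenario operating term
obj[1 + Q:] = -w * (c2s + gamma * p2)    # sales revenue plus avoided penalty
const = gamma * p2 * np.dot(w, theta)    # constant part of the penalty term

A_ub, b_ub = [], []
for q in range(Q):
    row = np.zeros(n); row[1 + q] = 1.0; row[0] = -1.0          # x2_q <= x1
    A_ub.append(row); b_ub.append(0.0)
    row = np.zeros(n); row[1 + Q + q] = 1.0; row[1 + q] = -1.0  # s2_q <= x2_q
    A_ub.append(row); b_ub.append(0.0)
    row = np.zeros(n); row[1 + Q + q] = 1.0                     # s2_q <= theta_q
    A_ub.append(row); b_ub.append(theta[q])

res = linprog(obj, A_ub=np.array(A_ub), b_ub=np.array(b_ub), bounds=[(0, None)] * n)
expected_profit = -(res.fun + const)
print("first-stage decision x1 =", round(res.x[0], 3), " expected profit =", round(expected_profit, 3))

Because every scenario block is linear and coupled only through x1, the single-level model remains an ordinary LP whose size grows linearly with the number of quadrature points.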
Illustrating Example
and prices are also given). Two crude oils are available for purchase subject to
supply limitations. Four products are produced according to a yield matrix with
limits on the production capacities (see Table 8.1). Both crudes and products
can be stored subject to tank inventory limits. Products are available for sale according to market demands, which are considered uncertain for gasoline and fuel oil during the second stage, following normal distribution functions N(15, 1.25) and N(5, 1), respectively. The objective is the maximization of the expected value of the profit function, defined as the difference between income (from product sales) and cost (operating cost and cost of purchasing), over the two stages. The two-stage planning model consists of 28 inequalities and 36 equalities involving 40 variables.
Table 8.1
                    Yield                            Maximum
                    Crude oil 1     Crude oil 2      capacity
Gasoline            0.8             0.44             24
Kerosene            0.5             0.4              15
Fuel oil            0.2             0.36             1
Residual            0.05            0.1
Processing cost     0.5             1
(per thousand bbl)
(i) The application of the approach in section 8.2.3 leads to the following results:

q2 \ q1      1         2         3         4         5
1          420.31    432.98    446.13    459.27    468.25
2          418.95    444.76    477.99    498.34    512.23
3          467.13    491.24    514.51    532.57    540.82
4          499.05    513.89    535.63    547.70    555.40
5          515.75    525.68    540.21    554.75    560.13
The algorithm requires only 2 iterations to reach the optimal solution within a tolerance of 0.1 for the lower and upper bounds. The optimal solution is the plan that corresponds to purchasing and utilizing 20 and 12.5 thousand bbl/day of crude 1 and 2, respectively. This plan has an expected profit of 948.5 K$/day within a corresponding feasible region defined by the following bounds: for θ_1, 13 ≤ θ_1 ≤ 19.7, and, for each quadrature point θ_1^{q1}, by bounds for θ_2 as shown in Table 8.4.
q1              1        2        3        4        5
θ_1^{q1}       13.31    14.55    16.35    18.15    19.39
θ_2^{L,q1}      4.69     3.45     2.50     3.04     4.04
θ_2^{U,q1}      7.01     7.05     7.12     7.13     6.61
Using GAMS/MINOS for the solution of the linear subproblems and of the linear master problem requires a total of approximately 33 CPU s on a SPARC 2 (16.5 CPU s for each iteration, with approximately 15.5 CPU s spent on the solution of the subproblems and 1 CPU s on the solution of the master problem).
(ii) The solution of the same example by relaxing the demand requirement to s_i ≤ θ_i (instead of s_i = θ_i) results in the plan of purchasing and utilizing 20 and 15 thousand bbl/day of crude 1 and 18 and 15 thousand bbl/day of crude 2, respectively. Compared to the plan obtained in (i), it features a smaller feasible region and less expected profit; i.e. it represents a less risk-averse decision, since in this case we take into account partial demand satisfaction (whereas in the former case partial order fulfilment is not considered).
Here, we will address these scheduling problems for the case when uncertainty in product demands is involved. In order to provide more insight into the nature of the problem, we begin our analysis by first considering the simple case of a single-stage plant with a single production line. The additional complications introduced by considering several stages interconnected by storage tanks, as well as cyclic schedules, are discussed in the subsequent sections.
8.3.1 Single production line - one stage
Consider the production of N_P products within a given time horizon of H hr. The time horizon is discretized into NT time slots consisting of production, idle and changeover time. The problem is then to determine the sequencing of products, the amounts of products to be produced and the production times, along with the levels of intermediate storage inventories. In order to mathematically represent the scheduling problem, the following notation is introduced:
Index sets:
Parameters:
Variables:
The following mathematical model then formally describes the scheduling prob-
lem as previously posed.
I_{i0} = I_{i,NT} = I_i^s   ∀i   (8.10)

Σ_t T_t = H   (8.11)

T_t ≥ T_t^{idl} + Σ_i Σ_j z_{ijt} Q_{ij}   ∀t   (8.12)

Σ_i y_{it} = 1   ∀t   (8.13)

z_{ijt} ≥ y_{it} + y_{j,t+1} - 1   ∀i, j, t   (8.14)

d_i ≤ θ_i   ∀i   (8.15)
The objective function corresponds to the maximization of the expected profit (over the time horizon), represented by the difference between the revenue due to product sales and the overall cost (inventory cost and transition cost). A penalty term of the form γ Σ_i Σ_t p_i (θ_i - d_i) can also be introduced to penalize partial demand satisfaction (as discussed in the previous sections), where γ is a penalty coefficient used to control demand satisfaction. The mass balances for each product i at each time slot t are considered in equations (8.9), (8.10); equations (8.11) and (8.12) represent the timing constraints; constraint (8.13) ensures the assignment of only one product to each time slot, while constraint (8.14) establishes the link between the transition variables z_{ijt} and the assignment variables y_{it}; finally, constraint (8.15) corresponds to the relaxed demand constraint for each product.
The use of Gaussian Quadrature Formula to evaluate the expected profit mul-
tiple integral as well as the utilization of a similar feasibility property (see
Appendix C for the detailed proof) leads to the following equivalent reformu-
lation of problem (PC):
Problem (PC1)

max_{y_{it}, z_{ijt}, d_i^q, T_t^q}  Σ_q w_q { Σ_i p_i d_i^q - Σ_i Σ_t c_i^I [ (I_{it}^q + I_{i,t-1}^q)/2 ] T_t^q } - Σ_i Σ_j Σ_t c_{ij}^t z_{ijt}

s.t.  T_t^q ≥ T_t^{idl,q} + Σ_i Σ_j z_{ijt} Q_{ij}   ∀t, q

      Σ_i y_{it} = 1   ∀t

      z_{ijt} ≥ y_{it} + y_{j,t+1} - 1   ∀i, j, t

      d_i^q ≤ θ_i^q   ∀i, q,   θ^q ∈ T(θ)
Problem (PC1) corresponds to a single, yet nonconvex, optimization problem, due to the nonconvex objective function (bilinear terms in the inventory cost and revenue terms) and the inventory constraint (due to the introduction of uncertainty). However, problem convexification can be achieved based on the following ideas:
Problem (PC2)

max_{y_{it}, z_{ijt}, D_{it}, T_t^q}  Σ_q w_q Σ_i Σ_t u_i D_{it} - Σ_t c_t^I ( I_t^s + Σ_i I_{it} ) - Σ_i Σ_j Σ_t c_{ij}^t z_{ijt}

s.t.  Σ_i y_{it} = 1   ∀t

      z_{ijt} ≥ y_{it} + y_{j,t+1} - 1   ∀i, j, t

      D_{it} ≤ θ_i^q T_t   ∀i, q
Problem (PC2) still appears to be a nonconvex MINLP formulation due to the bilinear term in the demand constraint. However, due to the feasibility property in Appendix C, θ ∈ T(θ), i.e. the location of the quadrature points does not depend on the decision variables; consequently, the quadrature points can be fixed prior to the optimization based only on the desired accuracy for the integration. As a result, problem (PC2) eventually corresponds to an MILP formulation which can be solved to global optimality using conventional MILP solvers.
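A minimal sketch of this step is given below: the quadrature points and weights for a normally distributed demand are fixed in advance and used to approximate the expected value of a second-stage profit. The N(15, 5) distribution is the one quoted for product 1 in the example that follows (interpreting the second argument as the standard deviation, which is an assumption), while the truncation range and the placeholder profit function are purely illustrative.

# Sketch: fixing quadrature points/weights in advance and using them to
# approximate the expected value of a profit function over an uncertain demand.
import numpy as np
from math import sqrt, pi, exp

mean, std = 15.0, 5.0                      # assumed N(15, 5) demand distribution
lo, hi = mean - 3 * std, mean + 3 * std    # truncated integration range T(theta)

# 5-point Gauss-Legendre rule mapped from [-1, 1] onto [lo, hi]
nodes, weights = np.polynomial.legendre.leggauss(5)
theta_q = 0.5 * (hi - lo) * nodes + 0.5 * (hi + lo)
w_q = 0.5 * (hi - lo) * weights

def normal_pdf(x):
    return exp(-0.5 * ((x - mean) / std) ** 2) / (std * sqrt(2 * pi))

def profit(theta):
    """Placeholder second-stage profit as a function of the realized demand."""
    return 4.0 * min(theta, 15.0) - 10.0    # sell up to a capacity of 15 at price 4

expected = sum(wq * profit(tq) * normal_pdf(tq) for wq, tq in zip(w_q, theta_q))
print("quadrature points:", np.round(theta_q, 2))
print("expected profit  :", round(expected, 3))

Because the points θ_q are fixed before the optimization, the same weights and pdf values can simply be carried as constant coefficients into the MILP objective.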
Illustrating Example
A small scheduling problem of a continuous multiproduct plant having one production line and a single stage is considered here, involving the production of 2 products over a horizon of 72 hours (discretized in 4 slots). The mathematical model (PC) is used to describe the scheduling problem, whereas the problem data are given in Table 8.5. The demands of both products are considered as uncertain parameters described by normal distribution functions of the form N(15, 5) and N(10, 3) for products 1 and 2, respectively. Five quadrature points are used for each uncertain parameter.
The scheduling model in (PC2) consists of 802 constraints with 841 variables (40 binary variables). Using GAMS/CPLEX for the solution of problem (PC2) requires 1.3 CPU s to determine the optimal schedule with an expected profit
Index sets:
Parameters:
Variables:
Problem (PCM)

∀k, m   (8.32)

0 ≤ φ_{ikm}^2 - Tp_{ik,m+1} + Te_{ik,m+1} - Te_{ikm} ≤ U_{km} (1 - x_{ikm}^2)   ∀k, m   (8.38)

0 ≤ φ_{ikm}^3 ≤ U_{km} x_{ikm}^3   ∀k, m   (8.39)

I0_{km} = I3_{km}   ∀k, m   (8.40)

0 ≤ I1_{km} ≤ Imax_{km}   ∀k, m   (8.41)

0 ≤ I2_{km} ≤ Imax_{km}   ∀k, m   (8.42)

0 ≤ I3_{km} ≤ Imax_{km}   ∀k, m   (8.43)

Imax_{km} = Σ_i IP_{ikm}   ∀k, m   (8.44)
Based on the relaxation of the demand constraint (equation 8.49), the derived feasibility property (see Appendix C) and the use of the Gaussian quadrature formula to evaluate the expected profit, the stochastic formulation in (PCM) can be recast as the following MILP reformulation for the identification of a robust schedule (y_{ik}, z_{ijk}) able to meet the uncertain demand requirements.
Problem (PCM1)

Ts_{11}^q = Σ_i Σ_j τ_{ij1} z_{ij1}   ∀q

∀k, m, q

Te_{km}^q ≤ Ts_{k,m+1}^q   ∀k, m, q

∀k, m, q

∀k, m, q

Σ_k y_{ik} = 1   ∀i

z_{ijk} ≥ y_{ik} + y_{j,k+1} - 1   ∀i, j, k

Σ_k WP_{ikM}^q ≥ θ_i^q Tc^q   ∀i, q
Illustrating Example
The example considered here is a small scheduling problem of a continuous multiproduct plant involving the production of 3 products (A, B and C) with one production line and two stages, shown in Figure 8.7 (Pinto and Grossmann, 1994). The problem data are given in Tables 8.7 and 8.8. The demand of products
[Figure 8.7: continuous multiproduct plant with one production line and two stages.]
The scheduling model (PCM1) consists of 5674 constraints with 4587 variables (736 binary variables). Using GAMS/CPLEX for the solution of problem (PCM1) requires approximately 10 CPU min to determine the optimal schedule with an expected profit of 6704 ($) that corresponds to the following production
8.4 MULTIPRODUCT/MULTIPURPOSE
BATCH PLANT DESIGN &
SCHEDULING UNDER
UNCERTAINTY
For the problem of designing and scheduling batch plants under uncertainty,
Reinhart and Rippin (1986,1987) and later Fichtner et al. (1990) presented
a number of model variants and solution procedures (scenario-based, penalty
functions, two-stage approach) for the problem of multiproduct batch plant
design with uncertain demands assuming one piece of equipment per stage.
Wellons and Reklaitis (1989) considered staged plant expansion over time to
account for uncertainty in product demand; they also suggested a distinction
between "hard" and "soft" constraints, introducing penalty terms for the latter
type. Straub and Grossmann (1992) considered uncertainties in both product
[Figure: Gantt charts for stages 1 and 2 of two alternative schedules, with the corresponding inventory level profiles.]
Problem (PB)

max_{V_j, N_j}  E_{θ∈R(V_j,N_j)} Σ_p w^p Σ_i p_i Q_i^p - δ Σ_j α_j N_j V_j^{β_j} - γ E_{θ∈R(V_j,N_j)} Σ_p w^p ( Σ_i p_i θ_i - Σ_i p_i Q_i^p )

s.t.

T_{Li}^p ≥ t_{ij}^p / N_j   ∀i, j, p

B_i^p ≤ V_j / S_{ij}   ∀i, j, p

θ_i ∈ R(V_j, N_j)
where p = 1,..,P indexes the scenarios used to describe the variation of the process parameters t_{ij}, S_{ij}; θ is the vector of uncertain product demands; and R(V_j, N_j) is the feasible region of the design (V_j, N_j), i.e. R(V_j, N_j) = {θ | ∃ Q_i, T_{Li} satisfying the constraints of problem (PB)}. In the above formulation, the first set of constraints corresponds to the horizon constraint, the second set are the timing constraints and the third denotes the batch size constraints; finally, the last set represents the relaxed demand constraint.
Note that in (PB) the evaluation of the expected profit should be performed within the feasible region of the batch plant, R(V_j, N_j). However, the establishment of the feasibility property in this case, as described in Appendix D, removes this need (and thereby the bilevel nature of the problem); i.e., using for example a Gaussian quadrature formula to evaluate the expected profit, problem (PB) can be rewritten as follows:
Problem (PB1)

T_{Li}^p ≥ t_{ij}^p / N_j   ∀i, j, p

Q_i^{qp} ≤ θ_i^q   ∀i, q, p
Problem (PB2)

max_{v_j, b_i, y_{jr}, t_{Li}, Q_i^q}  -δ Σ_{j=1}^{M} α_j exp( Σ_r y_{jr} ln(r) + β_j v_j ) + Σ_{q=1}^{Q} w_q J^q ( Σ_{i=1}^{N} p_i Q_i^q ) - γ Σ_{q=1}^{Q} w_q J^q ( Σ_{i=1}^{N} p_i θ_i^q - Σ_{i=1}^{N} p_i Q_i^q )

s.t.

Q_i^q ≤ θ_i^q ;   i = 1,..,N,  q = 1,..,Q

ln(V_j^{LO}) ≤ v_j ≤ ln(V_j^{UP})

y_{jr} ∈ {0, 1},   θ^q ∈ T(θ),   Q_i^q ≥ 0
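The exponential transformation underlying (PB2) can be illustrated with the following small sketch (all numerical data are placeholders): with v_j = ln V_j, b_i = ln B_i and t_{Li} = ln T_{Li}, the batch-size constraint becomes linear and the horizon constraint becomes a convex sum of exponentials; the same exp(t_{Li} - b_i) terms appear in the qualifying constraints discussed below.

# Sketch of the log-variable (exponential) transformation for batch plant sizing.
# B_i <= V_j / S_ij  becomes  b_i <= v_j - ln S_ij  (linear), and the horizon
# constraint sum_i Q_i * T_Li / B_i <= H becomes sum_i Q_i * exp(t_Li - b_i) <= H,
# which is convex in (t_Li, b_i).  All numbers below are illustrative placeholders.
from math import log, exp

S = {("A", 1): 2.0, ("B", 1): 3.0}      # size factors S_ij
Q = {"A": 4.0e5, "B": 3.0e5}            # production requirements
T_L = {"A": 8.0, "B": 6.0}              # cycle times (h)
H, V1 = 6000.0, 3000.0                  # horizon (h) and a candidate volume for stage 1

# original-space check
B = {i: V1 / S[(i, 1)] for i in Q}                     # largest feasible batch sizes
horizon_used = sum(Q[i] * T_L[i] / B[i] for i in Q)

# log-space check of the same point
v1 = log(V1)
b = {i: v1 - log(S[(i, 1)]) for i in Q}                # b_i <= v_j - ln S_ij, taken at equality
horizon_used_log = sum(Q[i] * exp(log(T_L[i]) - b[i]) for i in Q)

print(f"horizon used: {horizon_used:.1f} h (original) vs {horizon_used_log:.1f} h (log variables), limit {H} h")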
Although the GOP algorithm can in principle be directly applied for the solution of problem (PB2) to global optimality, it would require a prohibitively high computational effort (2^{N x Q} subproblems per iteration). However, by exploiting the special structure of the batch plant design model in (PB2), a number of properties can be established (see Ierapetritou and Pistikopoulos, 1995 for details) with which the number of relaxed dual subproblems that have to be solved per iteration can be reduced by several orders of magnitude (scaling only with the number of products).
(i) The qualifying constraints (gradients of the Lagrange function with respect to the "connected" variables Q_i^q) to be added along with the Lagrange function in the relaxed dual problem are functions of T_{Li}, B_i only. As a result, the number of required relaxed dual subproblems per iteration is reduced from 2^{N x Q} to only 2^N, a reduction of many orders of magnitude (from 2^{50} to 2^2 subproblems) even for two uncertain parameters with five quadrature points each!
(ii) If at the kth iteration the qualifying constraint for product i is ≤ 0 (or ≥ 0) for every other product, i.e. μ^{qk} [exp(t_{Li} - b_i) - exp(t_{Li}^k - b_i^k)] ≤ 0 (or ≥ 0) ∀i = 1,..,N, implying that T_{Li}/B_i ≥ T_{Li}^k/B_i^k (or T_{Li}/B_i ≤ T_{Li}^k/B_i^k) ∀i = 1,..,N - this is true when T_{Li}^k/B_i^k corresponds to the lower (or upper) bound of T_{Li}/B_i - then the solution of the relaxed dual with the qualifying constraints μ^{qk} [exp(t_{Li} - b_i) - exp(t_{Li}^k - b_i^k)] ≥ 0 (or ≤ 0) can be effectively avoided.
(iii) If at the kth iteration the qualifying constraint of product i, μ^{qk} [exp(t_{Li} - b_i) - exp(t_{Li}^k - b_i^k)] = 0 ∀i = 1,..,N, which is satisfied when μ^{qk} = 0 ∀q = 1,..,Q, then it is sufficient to solve only one RD problem, at either the lower or the upper bounds of the Q_i variables.
Based on the properties described above, the following modified global optimization algorithm is proposed for the solution of problem (PB2):
Step 1  Select an initial design V_j, N_j. Set k = 1, the lower bound EP_L = -∞, the upper bound EP_U = +∞, and select a tolerance ε.
Step 2  Solve the primal problem to obtain the expected profit EP and the required dual information. Update the upper bound EP_U = min{EP, EP_U}.
Step 3  Construct and solve the required relaxed dual problems (at most 2^N) that correspond to the different bounds of the Q_i variables for each product i, and store the obtained solutions.
Step 4  Select as the new lower bound EP_L the lowest value of the stored solutions of the RD problems; set as the new design V_j^k, N_j^k the corresponding design variables.
where V_j is restricted to take values from the set SV_j = {V_{j1}, ..., V_{jS}}. In this way, problem (PB1) can be recast as follows:
Problem (PB3)

max  Σ_q w_q Σ_p w^p Σ_i p_i Q_i^{qp} - δ Σ_j Σ_s Σ_n n α_j v_{js}^{β_j} y_{jsn} - γ Σ_q w_q Σ_p w^p ( Σ_i p_i θ_i^q - Σ_i p_i Q_i^{qp} )

s.t.

n_i^{qp} ≥ Σ_s Σ_n ( Q_i^{qp} S_{ij} / v_{js} ) y_{jsn}   ∀i, j, p, q

Σ_i T_i^{qp} ≤ H   ∀q, p
Problem (PB3) is a mixed-integer linear programming problem, for which conventional MILP tools (such as SCICONIC or CPLEX) can be used for its solution.
Note that in this case the structure of the deterministic formulation is fully pre-
served despite the use of general continuous probability distribution forms for
the description of uncertain product demands.
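The effect of restricting V_j to a discrete set can be seen in the following small sketch (the candidate sizes are those of the case study below; the size factor is a placeholder): once the volume is chosen through binaries, the nonlinear factor 1/V_j becomes a set of constant coefficients 1/V_{js}, and the remaining products of continuous and binary variables can be linearized exactly, which is the route to the MILP form of (PB3).

# Sketch: why discrete equipment sizes make the batch-design constraints linear.
sizes = [1200.0, 2200.0, 3200.0, 4200.0]   # candidate volumes (the discrete set used in cases (i)-(iii))
S_ij = 2.5                                  # size factor of product i in stage j (placeholder)

def stage_coefficients(size_factor, candidate_sizes):
    """Constant coefficients multiplying the binaries y_js in the linearized constraint."""
    return [size_factor / v for v in candidate_sizes]

coeffs = stage_coefficients(S_ij, sizes)
print("coefficients S_ij / V_js:", [round(c, 5) for c in coeffs])
# A batch-size constraint such as  B_i <= V_j / S_ij  then reads
#   B_i <= sum_s (V_js / S_ij) * y_js,   sum_s y_js = 1,
# and any remaining products of continuous variables with the binaries y_js
# can be linearized with standard techniques, so only linear constraints remain.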
For comparison, the modified GOP algorithm, as outlined in the previous section and described in detail in Appendix E, is applied for the solution of the same problem. The results are summarized in Table 8.11.
Table 8.11  (γ = 0)
              Number of RD subproblems            Upper    Lower      Design
              without properties  with properties  bound    bound      (V_1, V_2, V_3)
iteration 1   2^50                1                146.7    -1394.5    (500, 500, 500)
iteration 2   2^50                1                146.7    -569.2     (883.7, 1325.5, 1767.4)
iteration 3   2^50                4 (2^2)          20.2     -323       (500, 1703.4, 937.8)
Optimal design (1800, 2700, 3600):  ε = 0.016, 8 iterations;  ε = 0.0002, 14 iterations

Different penalty values                     Different starting points
γ value   optimal design (V_1, V_2, V_3)     starting design (V_1, V_2, V_3)   iterations   CPU s per iteration
γ = 0     (1800, 2700, 3600)                 (1000, 1000, 1000)                13           0.8
γ = 4     (1907, 2861, 3815)                 (4500, 4500, 4500)                14           0.8
γ = 8     (1972, 2958, 3944)                 (500, 500, 500)                   15           0.8
Assuming that the equipment is only available in sizes from the following set of discrete values {1200, 2200, 3200, 4200}, problem (PB3) is solved for the following cases: (i) considering only short-term demand variation, i.e. 160 ≤ demand of product 1 ≤ 240, 60 ≤ demand of product 2 ≤ 140; (ii) accounting also for long-range variations, i.e. 410 ≤ demand of product 1 ≤ 490, 310 ≤ demand of product 2 ≤ 390, during a second period; and (iii) considering uncertain processing times and size factors through a multiperiod formulation with three time periods as shown in Table 8.12. The results are summarized in Table 8.13.
Case     Design (V_1, V_2, V_3)
(i)      (3200(1), 3200(1), 3200(1))
(ii)     (3200(2), 3200(2), 3200(2))
(iii)    (4200(1), 4200(1), 4200(1))
(1) one equipment item per stage;  (2) two equipment items per stage
that the consideration of more detailed scheduling models leads to more optimistic plant designs due to better utilization of the processing equipment; in this case the best utilization of the proposed plant is achieved by changing the schedule patterns to follow the demand realization.
8.5 CONCLUSIONS
We have presented stochastic models and algorithmic methods to determine the
global optimum solution for the problems of planning, scheduling and design of
continuous and batch plants for the case when some of the model parameters
are stochastic variables (described by any probability distribution function)
based on model reformulation and decomposition principles. In particular,
for production and capacity planning problems a decomposition-based global
optimization approach is developed to obtain the plan with the maximum ex-
pected profit by simultaneously considering future feasibility. The relaxation
of demand requirement enables the consideration of partial order fulfilment
while properly penalizing unfilled orders in the objective function. Based on
the relaxation of demand constraint, for the problem of scheduling of con-
tinuous plants, it was shown that the structure of the deterministic problem
is fully preserved enabling the determination of a robust schedule capable of
meeting stochastic demands. Finally, for the problem of design/scheduling of
multiproduct batch plants when uncertain demands and process parameters are
considered, global solution procedures were derived for the cases of continuous
and discrete equipment sizes by exploiting the special structure of the resulting
stochastic models.
APPENDIX A

max_{x_1}  c_1 x_1 + c_2 x_2 + c_{2s} s_2 - γ p_2 (θ - s_2)

max_{x_1}  c_1 x_1 + c_2 x_2 + c_{2s} s_2
s.t.  A_1 x_1 ≤ b_1
      B_1 x_1 + B_2 x_2 + B_{2s} s_2 ≤ b_2
      s_2 = θ

The KKT optimality conditions of this problem are:
APPENDIX B
Property - Any first-stage decision vector corresponding to a production plan and capacity policy, i.e. x_1 = (x_{jt}, y_{jt}, CE_{jt}, S_{it}, P_{it}), t = 1,..,T1 (except for y_{jt}, CE_{jt}, where t = 1,..,T1+T2), which satisfies the problem constraints (8.2)-(8.8), is feasible within T(θ).
Proof - For ease of representation, the proof is shown here for a plant of two processes used for the production of two products within a time horizon discretized into two time periods.
For a fixed first-period production plan (x_{j1}, S_{i1}, P_{i1}), i = 1, 2, j = 1, 2, and capacity expansion policy y_{jt}, CE_{jt}, t = 1, 2, j = 1, 2, the feasibility test problem has the following form after the elimination of the inventory and capacity variables:
ψ(x_1) = min_{u, x_{j,2}, S_{i,2}, P_{i,2}} u

subject to:
S_{12} - θ_1 ≤ u   (B.1)
-S_{12} ≤ u   (B.2)
S_{22} - θ_2 ≤ u   (B.3)
-S_{22} ≤ u   (B.4)
P_{12} - a^U_{12} ≤ u   (B.5)
-P_{12} + a^L_{12} ≤ u   (B.6)
P_{22} - a^U_{22} ≤ u   (B.7)
-P_{22} + a^L_{22} ≤ u   (B.8)
I_{10} + P_{11} - S_{11} + P_{12} - S_{12} + (b_{11}x_{11} + b_{12}x_{21}) + (b_{11}x_{12} + b_{12}x_{22}) - I^{max} ≤ u   (B.9)
-I_{10} - P_{11} + S_{11} - P_{12} + S_{12} - (b_{11}x_{11} + b_{12}x_{21}) - (b_{11}x_{12} + b_{12}x_{22}) + I^{min} ≤ u   (B.10)
I_{20} + P_{21} - S_{21} + P_{22} - S_{22} + (b_{21}x_{11} + b_{22}x_{21}) + (b_{21}x_{12} + b_{22}x_{22}) - I^{max} ≤ u   (B.11)
-I_{20} - P_{21} + S_{21} - P_{22} + S_{22} - (b_{21}x_{11} + b_{22}x_{21}) - (b_{21}x_{12} + b_{22}x_{22}) + I^{min} ≤ u   (B.12)
where:
λ_1^+ and λ_1^- are the Lagrange multipliers of constraints (B.1) and (B.2), respectively;
λ_2^+ and λ_2^- are the Lagrange multipliers of constraints (B.3) and (B.4), respectively;
μ_1^+ and μ_1^- are the Lagrange multipliers of constraints (B.5) and (B.6), respectively;
μ_2^+ and μ_2^- are the Lagrange multipliers of constraints (B.7) and (B.8), respectively;
η_1^+ and η_1^- are the Lagrange multipliers of constraints (B.9) and (B.10), respectively;
η_2^+ and η_2^- are the Lagrange multipliers of constraints (B.11) and (B.12), respectively.
Based on the above optimality conditions, the feasibility function can be rewritten in the following form:
Based on the KKT optimality conditions, the following two different cases with regard to potential active sets can be considered, which capture all possible combinations, namely the one that corresponds to satisfaction of the demand for both products and the one that corresponds to zero sales for both products. For the first case, constraints (B.1), (B.2), (B.7), (B.8), (B.9) and (B.10) are active, which results in u = 0; for the second case, constraints (B.3), (B.4), (B.5), (B.6), (B.11) and (B.12) are active, resulting in:
APPENDIX C
Property - Any schedule (y_{it}, z_{ijt}) satisfying the constraints (8.9)-(8.15) for fixed product demands θ_i, i = 1,..,N, is always feasible.
Proof - Let us consider the production of N_P products within the time horizon H consisting of NT time slots. For each product i the following inventory constraint holds:

Σ_t T_t^i - (H / r_i) d_i = 0   (C.1)

The feasibility test problem, with fixed schedule y_{it}, z_{ijt} and θ_i and after the elimination of the inventory variables, is:

ψ(y_{it}, z_{ijt}) = min u

s.t.  Σ_t T_t^i - (H / r_i) d_i = 0   ∀i   (C.2)

      d_i - θ_i ≤ u   (C.3)

      -d_i ≤ u   (C.4)

      -T_t^i ≤ u   (C.5)

with KKT conditions

      η - λ_t = 0,   t = 1,..,NT   (C.6)

      μ_i^1 - μ_i^2 - (H / r_i) η = 0,   i = 1,..,N_P   (C.7)

where λ_t is the Lagrange multiplier of constraint (C.5); μ_i^1, μ_i^2 are the Lagrange multipliers for the bounding constraints (C.3) and (C.4) on the production d_i of product i, respectively; and η is the Lagrange multiplier of constraint (C.2).
Notice that from (C.6), η = λ_t ≥ 0. Consequently, (C.7) implies that μ_i^1 ≠ 0 and μ_i^1 = (H / r_i) η = (H / r_i) λ_t. Based on these results, the following feasibility function can be derived:

ψ(y_{it}, z_{ijt}) = - λ_t H θ_i / r_i ≤ 0   ∀ θ_i
APPENDIX D

T_{Li} ≥ t_{ij} / N_j   ∀i, j

Q_i ≤ θ_i   ∀i
Property - Any design (V_j, N_j) satisfying the design constraints above for fixed product demands θ_i, i = 1,..,N, is always feasible.
Proof - The feasibility test problem - with fixed V_j, N_j and θ_i - is:
s.t.
where λ is the Lagrange multiplier of the production constraint; μ_i^1, μ_i^2 are the Lagrange multipliers for the bounding constraints on the production Q_i of product i. Since there are N control variables Q_i, the number of active constraints must be less than or equal to N+1. From the KKT conditions (D.1), (D.2) it can be
easily identified that the only potential active set consists of the production constraint and the lower bounds of the production rates, which results in the following, always negative, feasibility function:
This permanent feasibility implies that the feasible region of batch plant (in the
space of uncertain parameters) coincides with the considered range of uncertain
parameters independently of the design.
(b) V_j discrete

s.t.  n_i ≥ Σ_s Σ_n ( Q_i S_{ij} / V_{js} ) y_{jsn}   ∀i, j
Property - Any design (y_{jsn}) satisfying the design constraints above for fixed product demands θ_i, i = 1,..,N, is always feasible.
Proof - The feasibility test problem - with fixed y_{jsn} and θ_i - is:
s.t.   (D.3)

       (D.4)

       (D.5)

Q_i - θ_i ≤ u   ∀i   (D.6)

-Q_i ≤ u   ∀i   (D.7)
with the following KKT optimality conditions:

   (D.9)

   (D.10)

- Σ_j λ_{ij} + ν = 0   ∀i   (D.11)

where λ_{ij} are the Lagrange multipliers of the (D.3) constraints; μ_{ij} are the Lagrange multipliers for the (D.4) constraints; ν is the Lagrange multiplier of the (D.5) constraint; and k_{i1}, k_{i2} are the Lagrange multipliers for the bounding constraints (D.6) and (D.7) on the production Q_i of product i.
Since there are 3N control variables Q_i, n_i, T_i, i = 1,...,N, where N is the number of products, the number of active constraints must be less than or equal to 3N+1. From the KKT condition (D.11) it can be identified that (D.5), as well as one (D.4) constraint for each i, must be active. Following this result, (D.10) suggests that one (D.3) constraint for each i must also be active; this in turn points out that, from (D.9), the lower bounding constraint for the production of product i must be active. As constraint (D.6) is not included in the active set, it can be easily shown through algebraic manipulations that a negative feasibility function is always obtained. Note that similar proofs can be obtained for other batch plant design models (Ierapetritou, 1995).
APPENDIX E
Iteration 1
The solution of problem (PB2) has an objective function value of 146.7 units and provides a first upper bound on the global optimum. The Lagrange multipliers for the production constraints μ^q are all equal to zero, resulting in a zero gradient of the Lagrange function with respect to Q_i^q. Hence, it is only necessary to solve one RD problem with Q_i^q at either its upper or lower bounds. The solution of this problem is (V_1, V_2, V_3) = (500, 500, 500), with an objective function value of -1394.5 units, which is a first lower bound on the global optimum. The fixed design for the second iteration is (V_1, V_2, V_3) = (500, 500, 500).
Iteration 2
For (V_1, V_2, V_3) = (500, 500, 500), the solution of the primal problem yields a value of 297.3 units (no update of the upper bound). Since the qualifying constraints are b_1 - b_1^1, b_2 - b_2^1, which are greater than or equal to zero for any (b_1, b_2), only one RD problem is required at this iteration, with Q_i^q at its upper bounds. The Lagrange function from the first iteration is also added in the current relaxed dual problem, since the qualifying constraints for this Lagrange function are zero for any (b_1, b_2). The solution of the relaxed dual problem yields the design (V_1, V_2, V_3) = (883.7, 1325.5, 1767.4), with an objective value of -569.2 units, which corresponds to a new lower bound. The fixed design for the third iteration is (V_1, V_2, V_3) = (883.7, 1325.5, 1767.4).
Iteration 3
The solution of the primal problem yields an expected profit of 20.2 units, which provides a new upper bound on the global optimum, and the corresponding Lagrange multipliers, which are all nonzero. Four relaxed dual problems are then solved, for the combinations of bounds of (Q_1^{q1}, Q_2^{q1 q2}) equal to (0, 0), (θ_1^{q1}, 0), (θ_1^{q1}, θ_2^{q1 q2}) and (0, θ_2^{q1 q2}). Since the qualifying constraints of the Lagrange functions from the first and second iterations are satisfied for every (b_1, b_2), they are both added in the relaxed dual problem. The solutions of the four relaxed dual problems are summarized in Table E.1. At the end of this iteration these four solutions are stored, and the design (V_1, V_2, V_3) = (500, 703.4, 937.8) with the smallest objective is selected for the next iteration.
REFERENCES
6. Birge, J. R. and R. Wets (1989). Sublinear Upper Bounds for Stochastic Programs with Recourse. Math. Prog., 43, 131.
8. Borison, A. B., P.A. Morris and S.S. Oren (1984). A State-of-the-World Decomposition Approach to Dynamics and Uncertainty in Electric Utility Generation Expansion Planning. Oper. Res., 32, 1052.
9. Brauers, J. and M.A. Weber (1988). New Method of Scenario Analysis for
Strategic Planning. Jl. of Forecasting, 1, 31-47.
10. Clay R.L. and I.E. Grossmann (1994a). Optimization of Stochastic Plan-
ning Models I. Concepts and Theory. Submitted for publication.
11. Clay R.L. and I.E. Grossmann (1994b). Optimization of Stochastic Plan-
ning Models II. Two-Stage Successive Disaggregation Algorithm. Submit-
ted for publication.
14. Fichtner, G., H.J. Reinhart and D.W.T. Rippin (1990). The Design of Flexible Chemical Plants by the Application of Interval Mathematics. Comput. Chem. Engng., 14, 1311.
17. Friedman, Y. and G.V. Reklaitis (1975). Flexible Solutions to Linear Programs under Uncertainty: Inequality Constraints. AIChE Jl, 21, 77-83.
18. Grossmann, I.E., K.P. Halemane and R.E. Swaney (1983). Optimization Strategies for Flexible Chemical Processes. Comput. Chem. Engng., 7, 439-462.
22. Ierapetritou, M.G. and E.N. Pistikopoulos (1995). Batch Plant Design and Operations under Uncertainty. Accepted for publication in Ind. Eng. Chem. Res.
25. Kocis, G.R. and I.E. Grossmann (1988). Global Optimization of Nonconvex MINLP Problems in Process Synthesis. Ind. Eng. Chem. Res., 27, 1407.
26. Liu, M.L. and N.V. Sahinidis (1995). Process Planning in a Fuzzy Environment. Submitted for publication in Eur. J. Oper. Res.
27. Modiano, E.M. (1987). Derived Demand and Capacity Planning Under Uncertainty. Oper. Res., 35, 185-197.
28. Pinto J. and I.E. Grossmann (1994). Optimal Cyclic Scheduling of Multi-
stage Continuous Multiproduct Plants. Submitted for publication.
30. Pistikopoulos, E.N. and I.E. Grossmann (1989b). Optimal Retrofit Design for Improving Process Flexibility in Nonlinear Systems - II. Optimal Level of Flexibility. Comput. Chem. Engng., 13, 1087.
31. Pistikopoulos, E.N. and M.G. Ierapetritou (1995). A Novel Approach for
Optimal Process Design Under Uncertainty. Comput. chem. Engng., 19,
1089.
32. Reinhart, H.J. and D.W.T. Rippin, (1986). Design of flexible batch chem-
ical plants. AIChE Spring National Mtg, New Orleans, Paper No 50e.
33. Reinhart, H.J. and D.W.T. Rippin, (1987). Design of flexible batch chem-
ical plants. AIChE Annual Mtg, New York, Paper No 92f.
34. Rotstein, G.E., R. Lavie and D.R. Lewin (1994). Synthesis of Flexible and
Reliable Short-Term batch production Plans. Submitted for publication.
35. Sahinidis, N.V., I.E. Grossmann, R.E. Fornari and M. Chathrathi (1989). Optimization Model for Long-Range Planning in the Chemical Industry. Comput. Chem. Engng., 13, 1049.
36. Sahinidis, N.V. and I.E. Grossmann (1991). MINLP model for Cyclic
Multiproduct Scheduling on Continuous parallel lines. Comput. Chem.
Engng., 15, 85.
37. Schilling, G., Y.-E. Pineau, C.C. Pantelides and N. Shah (1994). Optimal Scheduling of Multipurpose Continuous Plants. AIChE 1994 Annual Meeting, San Francisco.
40. Straub, D.A. and I.E. Grossmann (1992). Evaluation and Optimization of Stochastic Flexibility in Multiproduct Batch Plants. Comput. Chem. Engng., 16, 69.
41. Straub, D.A. and I.E. Grossmann (1993). Design Optimization of Stochastic Flexibility. Comput. Chem. Engng., 17, 339.
42. Subrahmanyam, S., J.F. Pekny and G.V. Reklaitis (1994). Design of Batch Chemical Plants under Market Uncertainty. Ind. Eng. Chem. Res., 33, 2688.
43. Van Slyke, R.M. and R. Wets, (1969). L-Shaped Linear Programs with
Applications to Optimal Control and Stochastic Programming. SIAM J.
Appl. Math., 17, 573.
44. Voudouris, V.T. and I.E. Grossmann (1992). Mixed-Integer Linear Programming Reformulation for Batch Process Design with Discrete Equipment Sizes. Ind. Eng. Chem. Res., 31, 1315.
46. Wellons, H.S. and G.V. Reklaitis (1989). The Design of Multiproduct Batch Plants under Uncertainty with Staged Expansion. Comput. Chem. Engng., 13, 115-126.
9
GLOBAL OPTIMIZATION OF HEAT
EXCHANGER NETWORKS WITH
FIXED CONFIGURATION FOR
MULTIPERIOD DESIGN
Ramaswamy R. Iyer and Ignacio E. Grossmann
ABSTRACT
The algorithm for global optimization of heat exchanger networks by Quesada and Grossmann [17] has been extended to multiperiod operation for a fixed configuration. Under the assumptions of a linear cost function, arithmetic mean driving force and isothermal mixing, the multiperiod problem is an NLP with linear constraints and a nondifferentiable, nonconvex objective function involving linear fractional terms. A modified partitioning rule is used and the global optimization properties are retained. Exploiting the fact that an exact approximation is not required for exchangers in non-bottleneck periods leads to a reduction in the number of partitions required to reach the global optimum.
1 INTRODUCTION
There has been an increased interest in the design of flexible chemical processes
in the last decade. Changing plant conditions, as a result of changes in feedstock or product demands, provide a motivation to develop systematic methods for the design of flexible plants [9]. A major class of flexibility problems is the multi-
period design problem for plants operating under different specified conditions
for different time periods [10].
The design and synthesis of heat exchanger networks (HEN) at nominal condi-
tions in which the selection of configurations, areas and energy are optimized,
has received considerable attention [12]. The multiperiod HEN synthesis and
design problem, which is normally encountered in the industry for handling pro-
cess uncertainties and variations [14], has received much less attention. Floudas
and Grossmann [6] have proposed a modified transhipment model for synthesis
of HEN for multiperiod operation with the objective of minimizing utility costs
using the fewest number of units. Subsequently they extended this method to
perform automatic synthesis of network configurations [7]. Galli and Cerda [8]
have proposed an algorithm that is based on the idea of representing the un-
certainties with permanent and transient process streams. However, the above
techniques cannot account simultaneously for the tradeoffs in investment and
operating costs. Rigorous mathematical programming techniques have been
proposed for synthesis and design of HEN which can simultaneously account
for trade-offs in energy, area and unit costs [3, 19]. Papalexandri and Pistikopoulos [16] have studied the problem of introducing controllability together with
synthesis and design of flexible processes. However, a major limitation of these
methods is nonconvexities leading to multiple local optimal solutions. Thus,
local search techniques cannot guarantee global optimality of the solution.
There has been considerable interest lately in addressing the global optimization
of nonconvex nonlinear problems [13]. Global optimization of HEN has been
addressed using stochastic and deterministic methods. Dolan et al. [5] applied
simulated annealing to the synthesis of HEN. This method is computationally
intensive and only guarantees global optimality if an infinite number of trials are
performed. Quesada and Grossmann [17] developed an algorithm for HEN with
fixed configuration which guarantees global optimality. They used nonlinear
convex underestimators for the nonconvex terms to obtain tight bounds for the
objective value within a spatial branch and bound method.
The aim of this work is to extend the algorithm by Quesada and Grossmann [17] to the design of HEN for multiperiod operation for a fixed configuration. The configuration could be either selected using any of the synthesis methods referred to above ([6], [7], [8]), or else supplied by the user. Selecting a configuration is largely a discrete optimization problem, while optimizing a fixed structure corresponds to a nonconvex, continuous optimization problem in which areas and energy consumptions are optimized.
The formulation presented by Yee et al. [19] will be used to describe the
mathematical model for the given network configuration that is selected within
the superstructure. Aside from the fact that this provides a systematic way to
derive the model, it should be noted that most network configurations under
the assumptions used in this paper can be embedded in such a superstructure.
Figure 1 shows a typical configuration for a 2 hot-2 cold stream system with
all possible exchanges. The selection of the configuration is determined by
choosing a subset of these exchangers (shaded circles). The number of stages may be chosen either as max(N_h, N_c) or as the maximum number of temperature intervals, as described in [4].
[Figure 1: two-stage superstructure for a 2 hot-2 cold stream system; k denotes the temperature location.]
It may be noted that the above assumptions generally yield good results that are of practical use. The arithmetic temperature difference overestimates, but is usually close to, the logarithmic temperature difference, so the areas computed with it slightly underestimate the true values, while the linear cost of area provides an underestimation of the concave cost functions for area. The isothermal mixing assumption aids in avoiding nonlinear mixing constraints in the model. Finally, the assumption of by-passes ensures that feasible heat exchange for a given unit is guaranteed by considering the largest area required over all time periods.
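The following small sketch (stream data are placeholders, not taken from the examples) compares the arithmetic-mean and logarithmic-mean driving forces for a single counter-current exchanger and the corresponding areas Q/(U dT), illustrating the direction of the approximation noted above.

# Sketch: arithmetic-mean vs logarithmic-mean driving force and the resulting area.
from math import log

def arithmetic_mean_dt(dt_hot_end, dt_cold_end):
    return 0.5 * (dt_hot_end + dt_cold_end)

def log_mean_dt(dt_hot_end, dt_cold_end):
    if abs(dt_hot_end - dt_cold_end) < 1e-9:
        return dt_hot_end
    return (dt_hot_end - dt_cold_end) / log(dt_hot_end / dt_cold_end)

Q, U = 1500.0, 0.8            # duty (kW) and overall coefficient (kW/m2 K), placeholders
dt1, dt2 = 40.0, 10.0         # terminal temperature differences (K), placeholders

amtd, lmtd = arithmetic_mean_dt(dt1, dt2), log_mean_dt(dt1, dt2)
print(f"AMTD = {amtd:.2f} K, LMTD = {lmtd:.2f} K")
print(f"area with AMTD = {Q/(U*amtd):.1f} m2 (underestimate), with LMTD = {Q/(U*lmtd):.1f} m2")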
The following indices, parameters, variables and constraints are involved in the
multiperiod NLP model.
(A) Indices
(B) Parameters
where Zijk, ZCi , ZHj are predetermined by the particular network configura-
tion.
(9.7)

∀i, j, k, n  if Z_{ijk} = 1

DTCU_{in} = ((t_{i,NOK+1,n} - TOUT_{CU}) + (TOUT_{in} - TIN_{CU})) / 2   ∀i, n  if ZC_i = 1

DTHU_{jn} = ((TOUT_{HU} - t_{j,1,n}) + (TIN_{HU} - TOUT_{jn})) / 2   ∀j, n  if ZH_j = 1
Equations (9.1)-(9.7), which include the heat balances, driving force and approach temperature constraints for all N periods of operation, may be concisely represented as
where X_n is the set of all other state variables at period n in the set of equations (9.1)-(9.7).
The multiperiod HEN design problem (P1) then corresponds to the linearly constrained NLP of the form

min  Σ_i Σ_j Σ_k c_{ij} max_n { Q_{ijkn} / (U_{ij} DT_{ijkn}) } + Σ_i c_i^{CU} max_n { QC_{in} / (U_i^{CU} DTCU_{in}) } + Σ_j c_j^{HU} max_n { QH_{jn} / (U_j^{HU} DTHU_{jn}) } + Σ_n Σ_i ccu QC_{in} + Σ_n Σ_j chu QH_{jn}   (9.9)
subject to
The objective function has nondifferentiable 'max' operators to select the largest area for each exchanger over the n time periods of operation. Note that this implicitly assumes that by-passes are available at each exchanger, as was mentioned before. The bounds for the variables shown above may be obtained from a preanalysis of the network for each period of operation [17].
Define
We analogously define the areas for the hot and cold utility exchangers AHUN_{jn}, AHU_j and ACUN_{in}, ACU_i, respectively. Using the concept of convex underestimators developed in Quesada and Grossmann [17], the convex underestimator problem (NLP_L) may be formulated as
subject to

linear constraints that ensure feasibility of operation for all periods

f_j(QH_{jn}, DTHU_{jn}, AHUN_{jn}) ≥ 0   ∀j
f_i(QC_{in}, DTCU_{in}, ACUN_{in}) ≥ 0   ∀i   (9.10)

where
(Q_{ijkn}, DT_{ijkn}, AN_{ijkn}, QH_{jn}, DTHU_{jn}, AHUN_{jn}, QC_{in}, DTCU_{in}, ACUN_{in}) ∈ Ω

where
The key property on which the solution procedure is based is presented below.
3 SOLUTION PROCEDURE
It will be assumed below that the reader is familiar with the paper by Quesada and Grossmann [17]. Let the incumbent (current best) solution be denoted by (*) and the solution of (NLP_L) for subregion r be denoted by C^r. Let F be the set containing the subregions that need further examination.
min (9.12)
s.t.
g(Qijkn, DTijkn, QHjn , DTHUjn, QCin , DTCUin,Xn ) ~ 0 Vi,j,k
For obtaining bounds on area, linear fractional programming may be used [2].
Store the bounds to generate the region n°. Let F = F U {o}. Also obtain
projections of variables for each period of operation using solution of the LP's
solved during the procedure for obtaining bounds. However, it may not always
be possible to obtain projections in the convex direction. Evaluate the original
objective function and store the lowest value as the incumbent solution C*.
In the multiperiod case the value of the area chosen is the largest amongst all
periods of operation and thus feasibility is ensured for all periods of operation.
The value C· so obtained represents an upper bound to the global optimum of
PI.
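A minimal sketch of this bound evaluation is given below (duties and driving forces are illustrative placeholders; the unit area cost and heat-transfer coefficient are the values listed for Example 2): for each exchanger the per-period areas are computed, the largest one is installed, and the resulting investment cost gives a feasible upper bound of the kind used above.

# Sketch: installed area = largest per-period area Q_n/(U*dT_n) for each exchanger.
area_cost = 300.0          # $/m2, linear area cost (value listed for Example 2)
U = 1.0                    # overall heat-transfer coefficient, kW/(m2 K) (Example 2 value)

# per-period duty (kW) and arithmetic-mean driving force (K) for two exchangers (placeholders)
duties = {"E1": [800.0, 950.0, 700.0], "E2": [400.0, 420.0, 500.0]}
dts    = {"E1": [20.0, 18.0, 22.0],    "E2": [15.0, 16.0, 14.0]}

investment = 0.0
for ex in duties:
    per_period_area = [q / (U * dt) for q, dt in zip(duties[ex], dts[ex])]
    bottleneck = max(range(len(per_period_area)), key=per_period_area.__getitem__)
    installed = per_period_area[bottleneck]
    investment += area_cost * installed
    print(f"{ex}: areas per period = {[round(a, 1) for a in per_period_area]}, "
          f"bottleneck period = {bottleneck + 1}, installed area = {installed:.1f} m2")
print(f"total investment cost = ${investment:,.0f}")

Only the bottleneck period of each exchanger contributes to the objective, which is why exact approximations are unnecessary in the remaining periods.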
y ≤ y*  and  y ≥ y*
4 REMARKS
At this point it is important to note the differences between the solution proce-
dure for the single period case [17] and the multiperiod case considered in this
paper.
• 1) The partition rule only determines those exchangers for which there is
a bottleneck period where the calculated area is greater than the largest
underestimated area. Thus, no partitioning will be required in the space of
variables for other periods for which the calculated area is smaller than the
largest underestimated area leading to a reduction in computation time.
• 2) When the lower bound equals the upper bound then the solution is the
global optimum. However, with the above partition rule, it is possible to
have a nonexact approximation for an exchanger in non bottleneck periods.
This is because the solution corresponding to the upper bound yields fea-
sible areas for all periods of operation. Even if the area is underestimated
for a nonbottleneck period nnb , such that
'~Nnb
.,1.ijkn -
< Qnb
ijkn
/DTnb
ijkn
the feasible solution corresponding to the upper bound must be the lowest
feasible value of area for that exchanger such that
Qnb
ijkn
/DT ZJOk n nb <
0
- Qbijkn /DTbijkn
5 EXAMPLES
Two example problems were solved with the proposed global optimization solution procedure and the results were compared with the solutions obtained from the local NLP solver MINOS 5.3 [15]. The time taken to solve the convex NLPs was found to grow almost linearly with the number of periods. Therefore, no special decomposition strategy was used for solving the multiperiod NLP, because the NLP in the full space can be solved in reasonable time.
5.1 Example 1
An example problem (see Figure 2) from Quesada and Grossmann [17] is solved for a 10 period case with varying inlet temperatures and flowrates, as indicated in Table 1. The solution of this problem using MINOS 5.3 in GAMS [1] (with the default starting point) gave a local solution of $476,300. However, the global solution is $377,950, which clearly justifies the need for a global solution procedure, since a 20% reduction in costs is possible for this example.
[Figure 2: network configuration for Example 1 with hot streams H1, H2 and cold streams C1, C2.]
The solution procedure for obtaining the global solution results in an NLP problem with a larger number of variables and equations, as shown in Table 3. At the first iteration, the underestimator problem yields a solution of $347,320, as compared to the upper bound of $405,480. The global solution was obtained after partitioning the feasible region into 30 subregions.
Hot streams
Inlet temperature for each period (K)
1   575    575    576    576    565    565    575    575    575    576
2   718    708    718    708    718    708    708    708    718    718
Outlet temperature for each period (K)
1   395    385    396    386    395    395    395    395    395    395
2   398    388    398    388    388    398    398    398    398    398
Heat capacity flow rate (kW/K)
1   55.55  52.63  55.55  52.63  58.82  58.82  55.55  55.55  55.55  55.25
2   31.25  31.25  31.25  31.25  30.2   32.26  32.26  32.26  31.25  31.25

Cold streams
Inlet temperature for each period (K)
1   300    290    302    292    292    292    295    295    302    302
2   365    355    365    355    355    355    355    365    365    365
3   358    348    358    348    348    348    348    360    360    360
Outlet temperature for each period (K)
    400    390    400    400    400    400    400    400    400    400
Heat capacity flow rate (kW/K)
1   100    100    102    92.6   92.6   92.6   95.24  95.24  102    102
2   45.45  45.45  42     42     42     42     42     45     45     45
3   35.71  35.71  38     38     38     38     38     40     40     40

1   0.1    270
2   0.1    720
3   1.0    240
4   1.0    900
Note: DTmin = 5 K
exchanger with a much larger total investment cost. Table 5 shows the values for
calculated areas and underestimated areas at the global solution for exchanger
1. The largest value of calculated areas corresponds to the bottleneck period
(in this case 5, 6 and 10). For these periods, the calculated areas are equal
to the underestimated areas (since at the global optimum, the upper bound is
equal to the lower bound for objective function). However for non-bottleneck
periods (e.g. 1,2,3 etc.), the underestimated area is lower than the calculated
area. Thus, an exact approximation for the areas is not required in the non-
bottleneck periods because these areas do not affect the value of the objective
function. This is true because only the largest value of calculated area is used
in the objective function which corresponds to the bottleneck periods. It is also
possible that for nonbottleneck periods, the underestimated area may be larger
than the calculated area (e.g. period 8) because the variable ANijkn is free
to take any value less than A ijk and still not affect the value of the objective
function.
5.2 Example 2
The network considered in the structural flexibility analysis problem from
Grossmann and Floudas [11] with 4 hot and 3 cold streams (see Figure 3) and
uncertain inlet temperatures was solved using the proposed procedure. The
network has a structural flexibility index of 0.75 for a ±10°C change in the inlet temperatures. To determine the required areas, five periods of operation with values of the inlet temperatures perturbed by ±7.5°C were taken, and the data is
presented in Table 2. The global solution was obtained after only 2 partitions
and the solution is at one of the bounds. In this case, the NLP solver finds the
same solution as the global solution of $77,127.
Table 4 presents the calculated areas at the global solution, which is the same as
the suboptimal solution in this case. The utility costs were weighted equally
for all periods and are shown in Table 4. It can be observed from Table 3 that
the NLP has a larger number of equations and variables compared to Example
1. However, the time taken for solving each NLP is much smaller for Example 2.
This can be explained on the basis of the network structure. Note that Example
1 has 4 exchangers and 10 periods as compared to 8 exchangers and 5 periods in
Example 2. Thus, one would expect an equal number of nonlinear underestimating
equations for both problems. However, for Example 2, the upper and lower
bounds for the area are equal for exchangers 1, 2, 6, 7 and 8. Thus, the nonlinear
underestimators are redundant for these exchangers for all periods of operation.
As a result, Example 2 has fewer nonlinear equations (168) than Example 1 (280),
resulting in a lower computation time for the convex underestimator problem.
6 CONCLUSION
A global optimization procedure based on the approach of Quesada and Gross-
mann [17] has been extended to the multiperiod HEN design problem. The
multiperiod problem was formulated as an NLP with nonconvex, nondifferen-
tiable objective function with linear fractional terms and with linear constraints.
The model was reformulated using convex underestimators and a solution pro-
cedure with a modified partitioning rule was used to obtain the global optimum.
It was shown that an exact approximation for areas is not required for non-
bottleneck periods. Example problems indicate that for some instances there
might be a significant difference between the global optimum and suboptimal
solutions obtained using a local NLP solver justifying the need for a global
solution strategy for the multiperiod HEN design problem.
1    4     4     4     4     4
2    2     2     2     2     2
3    2     2     2     2     2
4    2.5   2.5   2.5   2.5   2.5
Notes: 1) Overall heat transfer coefficient (kW/K m2) = 1.0 for each exchanger.
2) Area cost coefficient ($/m2) = 300 for each exchanger.
3) Hot utility cost = $258/kW.
4) Utility costs for all periods weighted equally in the total cost.
5) DTmin = 1 K
REFERENCES
[1] Brooke, A., Kendrick, D. and Meeraus, A., GAMS: A User's Guide, Scientific
Press, Palo Alto (1992).
[2] Charnes, A. and Cooper, W.W., "Programming with Linear Fractional
Functionals", Naval Research Logistics Q., 1962, 9, 181-186.
[3] Ciric, A.R. and Floudas, C.A., "Heat Exchanger Network Synthesis without
Decomposition", Comput. & Chem. Eng., 1991, 6, 385-396.
[4] Daichendt, M.M. and Grossmann, I.E., "Preliminary Screening Procedures
for the MINLP Synthesis of Process Systems - II. Heat Exchanger
Networks", Comput. & Chem. Eng., 1994, 18, 679.
[5] Dolan, W.B., Cummings, P.T. and LeVan, M.D., "Process Optimization via
Simulated Annealing: Application to Network Design", AIChE J., 1989, 35,
725-736.
[6] Floudas, C.A. and Grossmann, I.E., "Synthesis of Flexible Heat Exchanger
Networks for Multiperiod Operation", Comput. & Chem. Eng., 1986, 10,
153.
Calculated and underestimated areas (m2) at the global solution for exchanger 1 of Example 1 (Table 5)
Period:                1      2      3      4      5      6      7      8      9      10
Calculated area:       727.4  701.9  733.   730.5  733.5  733.5  722.8  718.9  733.3  733.5
Underestimated area:   711.9  685.9  727.4  730.5  733.5  733.5  721.8  723.5  733.2  733.5
[12] Gundersen, T. and Naess, L., "The synthesis of cost optimal heat exchanger
networks - an industrial review of the state of the art", Comput. & Chem.
Eng., 1988, 12, 503-530.
[13] Horst, R., "Deterministic methods in constrained global optimization: Some
recent advances and fields of application", Nav. Res. Logistics, 1990, 37,
433-471.
[19] Yee, T.F. and Grossmann, I.E., "Simultaneous optimization models for heat
integration - II. Heat Exchanger Network Synthesis", Comput. & Chem.
Eng., 1990, 14, 1165-1184.
10
ALTERNATIVE BOUNDING
APPROXIMATIONS
FOR THE GLOBAL OPTIMIZATION
OF VARIOUS ENGINEERING
DESIGN PROBLEMS
I. Quesada and I.E. Grossmann
Department of Chemical Engineering
Carnegie Mellon University, Pittsburgh, PA 15213
ABSTRACT
This paper presents a general overview of the global optimization algorithm by
Quesada and Grossmann [6] for solving NLP problems involving linear fractional and
bilinear terms, and it explores the use of alternative bounding approximations.
These are applied in the global optimization of problems arising in different
engineering areas and for which different relaxations are proposed depending on the
mathematical structure of the models. These relaxations include linear and nonlinear
underestimator problems. Reformulations that generate additional estimator functions
are also employed. Examples from structural design, batch processes, portfolio
investment and layout design are presented.
INTRODUCTION
One of the difficulties in the application of continuous nonlinear optimization
techniques to engineering design problems is that one is often confronted with the
following dilemma. One can either apply fairly efficient gradient based techniques
(e.g. SQP or reduced gradient algorithms) or else one can apply direct or random
heuristic search procedures (e.g. complex method or simulated annealing). The
problem is that the former methods may only produce rigorous results when certain
convexity conditions hold, while the latter may in principle produce improved
solutions but at a computational expense that is unacceptably high. Also, if the
The objective of this paper is to first present an overview of the global optimization
algorithm proposed by Quesada and Grossmann [6] for solving nonconvex NLP
problems that have the special structure of involving linear fractional and
bilinear terms. These problems can be represented in general as follows:
min  g_0
s.t.  g_l ≤ 0,    l = 1, ..., L        (NLP)

where

g_l = Σ_{i∈I} Σ_{j∈J} c_ij^l (x_i / y_j) − Σ_{i∈I'} Σ_{j∈J'} c_ij^l x_i y_j + h_l(x, y, z),    l = 0, 1, ..., L

x^L ≤ x ≤ x^U
y^L ≤ y ≤ y^U
z ∈ Z
As shown above, the objective function and the constraints generally involve linear
fractional and bilinear terms corresponding to the two summation terms, while the
last term h_l(x, y, z) is assumed to correspond to a convex function. These types of
problems arise very often in engineering and management applications [1]. The
difficulty involved in solving these NLP optimization problems is that the
application of local search methods is generally not rigorous. Not only can a
conventional NLP algorithm produce local solutions that are suboptimal, but the
method may even fail to converge to a feasible solution due to the nonconvexities of
the constraints.
Another objective of this paper is to consider the application of the proposed methods
to problems from a variety of areas. The first includes a layout design model. In
this model a fixed layout configuration is given and the dimensions of the different
units are to be optimized. A portfolio investment model is also considered and in
this case, the percentage to be invested in each security is optimized to minimize the
total variance. Also, a model for the design of truss structures is presented. The
objective in this case is to minimize the total weight of the structure. Finally, two
models for batch process design are considered where the size of the equipment has to
be selected. Numerical results are reported for all these problems.
Algorithm Outline
The main idea behind the method proposed by Quesada and Grossmann [6] is to first
replace the bilinearities and linear fractional terms in (NLP) by valid under and
overestimators which will yield a convex NLP (or LP) whose solution provides a
lower bound to the global optimum. So for instance, for fractional terms with
positive coefficients, introducing the variables r_ij, the fractional term x_i/y_j can be expressed
as the constraint

x_i = r_ij y_j        i ∈ I, j ∈ J        (1)

together with linear estimators such as

x_i ≤ y_j^L r_ij + r_ij^U y_j − y_j^L r_ij^U        i ∈ I, j ∈ J        (2b)

where x_i^L, x_i^U, y_j^L, y_j^U, r_ij^L, r_ij^U are valid lower and upper bounds of the variables.
In addition, Quesada and Grossmann [6] showed that the following nonlinear convex
constraint,
i ∈ I, j ∈ J        (3)
and y_j. In fact when these bounds are obtained by the optimization of individual
variables in (NLP) it is also possible to generate projected bounding constraints
which can serve to tighten the representation of the NLP [6].
The proposed method then consists in reformulating problem (NLP) in terms of valid
linear and nonlinear bounding constraints such as in (2)-(3), giving rise to a convex
NLP (or LP) problem which predicts valid lower bounds to the global optimum. If
there is a difference between the current upper and lower bounds, the idea is to
partition the feasible region by performing a spatial branch and bound search as
outlined in the following steps:
(b) Bounds over the variables involved in the nonconvex terms are obtained. For
this purpose specific subproblems can be solved, or a relaxation of the original
problem is used. Update the upper bound f*.
(c) Define the space W0 as a valid relaxation of the feasible region in the space of the
nonconvex variables. The branch and bound search will be conducted over W0.
The list F is initially defined as the region W0.
(d) If (f* − f^L) ≤ ε f*, stop; the global solution corresponds to f*.
Step 2. Partition.
From the list F consider a subregion W_j (generally the region with the smallest f^L is
selected) and divide it into two new subregions W_{j+1} and W_{j+2}, which are added to the
list F, and subregion W_j is deleted from F.
Step 3. Bounding.
(a) Solve problem CUL for the two new subregions.
(b) If the solutions are feasible, evaluate the actual objective function. Otherwise
the original nonconvex problem can be solved according to a given criterion.
Step 4. Convergence.
Delete from list F any subregion with (f* − f^L) ≤ ε f*. If list F is empty then
stop and the global optimum is f*; otherwise go to step 2.
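The logic of steps 1-4 can be summarized in the following short Python sketch of a generic spatial branch and bound loop. It is only an illustration of the search structure described above, not the authors' implementation: the routines solve_underestimator (problem CUL), evaluate_original and branch_region are problem-specific placeholders.

def spatial_branch_and_bound(W0, solve_underestimator, evaluate_original,
                             branch_region, eps=1e-3, max_nodes=1000):
    # Generic skeleton of the spatial branch and bound search of steps 1-4.
    # solve_underestimator(W) -> (f_lower, point): convex underestimator problem CUL
    # evaluate_original(point) -> original objective value, or None if infeasible
    # branch_region(W)        -> two (or more) subregions partitioning W

    def is_open(f_L, f_best):
        # a subregion stays in the list F while f* - f_L > eps*|f*|
        return f_best == float('inf') or f_best - f_L > eps * abs(f_best)

    f_L0, point = solve_underestimator(W0)
    f_val = evaluate_original(point)
    f_best = f_val if f_val is not None else float('inf')   # upper bound f*
    best_point = point
    open_regions = [(f_L0, W0)]                              # list F

    nodes = 0
    while open_regions and nodes < max_nodes:
        nodes += 1
        # Step 2: pick the subregion with the smallest lower bound and partition it
        open_regions.sort(key=lambda item: item[0])
        _, W = open_regions.pop(0)
        for W_new in branch_region(W):
            # Step 3: solve CUL on the new subregion and try to improve f*
            f_L, point = solve_underestimator(W_new)
            f_val = evaluate_original(point)
            if f_val is not None and f_val < f_best:
                f_best, best_point = f_val, point
            if is_open(f_L, f_best):
                open_regions.append((f_L, W_new))
        # Step 4: delete subregions that can no longer contain a better solution
        open_regions = [(f_L, W) for (f_L, W) in open_regions if is_open(f_L, f_best)]
    return f_best, best_point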
REMARKS
The global optimization algorithm described in the previous section uses a spatial
branch and bound procedure (steps 2 to 4). As many of the branch and bound
methods, the algorithm consists of a set of branching rules, and upper bounding and
lower bounding procedures.
The branching rules include the node selection rule, the branching variable selection
and the level at which the variable is branched on. A simple branching strategy has
been followed in this work. The node with the smallest lower bound is the node
selected to branch on and two new nodes are generated using constraints of the type,
Xi ~ Xj* and Xi ~ Xj*
Different strategies can be used to do the branching. These include generating more
than two nodes from a parent node, using different type of branching constraints or
different node selection rules. For the latter, some type of degradation function
similar to the one used in branch and bound for MILP problems can be used.
Additional criteria used in branch and bound algorithms for MILP problems can be
extrapolated to the global optimization case. These include the fixing of variables,
tightening of bounds, range reduction, etc. [8]. One main difference between the
branch and bound for binary variables and the spatial branch and bound search used
here is the fact that it might be necessary to branch more than once on the same
variable. When in the selection rule there is more than one variable within a small
range it is often useful to branch on a variable that has not been used previously even
though it may not be the first candidate.
With respect to the upper bound there are two cases. The first one is when the
feasible region of the original problem is convex. In this case the evaluation of the
original objective function at the solution of the convex underestimator problem
often provides a good upper bound. For the case of a nonconvex feasible region it is
sometimes necessary to obtain an upper bound through a different procedure since the
solution of the convex underestimator problems might be infeasible for the original
problem. In some particular cases it may be better to use a specialized heuristic to
obtain a good upper bound. In general, however, it may be necessary to solve the
original nonconvex problem to generate an upper bound. As pointed out in Quesada
and Grossmann [6], [7] the solution of the convex underestimator problem provides a
good initial point to the nonconvex problem.
Our previous work has mainly concentrated on the generation of tight convex
relaxations, which are generally nonlinear, and that allow for an efficient lower
bounding of the global optimum. The major motivation has been to reduce the effort
in the spatial branch and bound search. The use of additional convex relaxations that
are somewhat different from the ones used in Quesada and Grossmann [6] is explored
for the models presented in this paper.
[Figure 1: relaxed feasible region for the bilinear term in the (x, y) space, bounded by the variable bounds and linear constraints of the form x + a1 y ≤ b1, with a > 0 and b > 0]
Lower and upper bounds over the variables x and y can be obtained through heuristics
or the solution of LP subproblems. In this case, the best possible bounds are given
by xL, xu, yL and yU. Consider the linear under and over estimators by McCormick
[5] for the bilinear term, xy, used in Quesada and Grossmann [6],
x y ≥ max [ x^L y + y^L x − x^L y^L ,  x^U y + y^U x − x^U y^U ]
x y ≤ min [ x^L y + y^U x − x^L y^U ,  x^U y + y^L x − x^U y^L ]
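As a small numerical illustration (not taken from the paper), the following functions evaluate these four linear estimators at a point, given the bounds on the variables; at any corner of the box the envelopes coincide with xy itself.

def mccormick_under(x, y, xL, xU, yL, yU):
    # Largest of the two convex (linear) underestimators of the bilinear term xy.
    return max(xL * y + yL * x - xL * yL,
               xU * y + yU * x - xU * yU)

def mccormick_over(x, y, xL, xU, yL, yU):
    # Smallest of the two concave (linear) overestimators of the bilinear term xy.
    return min(xL * y + yU * x - xL * yU,
               xU * y + yL * x - xU * yL)

print(mccormick_under(1.0, 2.0, 1.0, 4.0, 2.0, 5.0))  # 2.0 = 1.0 * 2.0 (exact at the corner)
print(mccormick_over(1.0, 2.0, 1.0, 4.0, 2.0, 5.0))   # 2.0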
These constraints correspond to the convex and concave envelopes of the bilinear
term over the relaxation of the feasible region defined by the lower and upper bounds
of the variables. As pointed out in Quesada and Grossmann [6], [7] these estimators
have the property of matching the actual function at the boundaries. However, these
equations do not always provide tight bounds since the relaxation of the actual
feasible region can be very loose. Consider the value of the bilinear term over the
boundary defined by the first constraint, x + a1 y ≤ b1, that is given by

x y = (b1 − a1 y) y = b1 y − a1 y²        (4)
(5)
The above is a concave overestimator and therefore a valid convex constraint that can
be included in the formulation. It is also tighter since it provides an exact
approximation of the bilinear term over the linear constraint. In the case that the
valid bound constraint, x^U − x ≥ 0, is used to generate additional constraints, the
following equation is obtained
(6)
The above inequality is a concave underestimator and the concave term, −x², has to be
linearized over the bounds, x^L and x^U. This corresponds to the approach followed by
Sherali and Alameddine [9]. With this reformulation-linearization a linear
underestimation of the bilinear term over that particular boundary is obtained. In
fact, (6) is the best approximation of the bilinear term on this boundary since it
projects in a concave form as in (4), and the approximation is a linear estimator that
matches the actual function at the extreme points A and B. Equation (5) corresponds
to the convex underestimator envelope of the bilinear term on that boundary and helps
to generate a tighter convex approximation.
In the case of constraints like the second one in Figure 1, x + a2 y ≥ b2, the bilinear
term behaves in a convex form. Convex quadratic underestimators that match the
function on the boundary and tighter linear overestimators can be obtained.
The introduction of these additional constraints yields a tighter convex underestimator
problem. However, there is a trade-off since the size of the underestimator problems
can become substantially large. Nevertheless, the use of projections or some
particular mathematical structures can be employed to identify the most relevant
additional constraints so as to avoid generating a large number of constraints. In the
following applications different types of relaxations are used which include linear
and/or nonlinear constraints.
Layout Design
In this example a floor layout is given in which the distribution of the rooms is
known. The dimensions of the rooms are to be optimized to minimize the total cost
that is a function of the area of the rooms.
[Figure 2: floor layout for Example 1, showing the living room, kitchen, dining room, bathroom, storage room and two bedrooms, with dimensions x1-x4 and y1-y7]
Example 1
Consider the layout given in Figure 2. Here the two bedrooms have the same length.
The storage room and the bathroom have also the same length. The complete
formulation is given by:

y7 − y5 ≥ 1
y6 − y3 ≥ 1
x1 + x2 ≥ 8
3 ≤ x1 ≤ 5,  4 ≤ x2 ≤ 6,  2 ≤ x3 ≤ 4,  4 ≤ x4 ≤ 6
5 ≤ y1 ≤ 7,  2 ≤ y2 ≤ 5,  2 ≤ y3 ≤ 5,  2 ≤ y4 ≤ 4,  3 ≤ y5 ≤ 5,  3 ≤ y6 ≤ 6,  4 ≤ y7 ≤ 6
The objective function consists of minimizing the total cost as a function of the area
of the rooms. The fifth and sixth constraints ensure that some hall space is left for
the doors. Bounds over the dimensions of the rooms are given. The feasible region
is linear and the nonconvexities are involved in the objective function. The bilinear
terms can be linearized (wij = xiYj) and the linear underestimators used in (2) and (3)
are included.
w_ij = x_i y_j ≥ x_i^L y_j + y_j^L x_i − x_i^L y_j^L        (7)

w_ij ≥ x_i^U y_j + y_j^U x_i − x_i^U y_j^U        (8)
Only underestimators are considered because the bilinear terms are only present in the
objective function with a positive coefficient. The nonlinear estimators are not used
since there are no bounds over the individual bilinear terms [6]. The linear
underestimator problem is solved and a solution of f^L = 130 is obtained. The
approximations are exact and this solution corresponds to the global solution with
x1 = 3, x2 = 5, x3 = 2, x4 = 4, y1 = 5, y2 = 3, y3 = 3, y4 = 2, y5 = 3, y6 = 4 and y7 = 4.
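A quick way to see why the linearization is exact at this solution is to evaluate the underestimators (7)-(8) for a single bilinear term at the reported optimum. The pairing of x1 and y1 below is purely illustrative, since the exact room-area terms of the objective are not reproduced here.

x1, y1 = 3.0, 5.0                      # values at the reported global solution
xL, xU, yL, yU = 3.0, 5.0, 5.0, 7.0    # bounds on x1 and y1 from the formulation

under_7 = xL * y1 + yL * x1 - xL * yL  # underestimator (7)
under_8 = xU * y1 + yU * x1 - xU * yU  # underestimator (8)

w = max(under_7, under_8)
print(w, x1 * y1)                       # 15.0 15.0 -> the linearization is exact at this point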
Example 2
A second layout example is considered using a similar configuration (see Figure 3).
In this case the dimensions of the bathroom are allowed to change independently.
Constraints over the aspect ratio and the size of the rooms are included. The
objective function contains an additional term that accounts for the perimeter of the
layout. The complete formulation is the following,
Figure 3 Layout for example 2
This new problem has a nonconvex objective function and nonlinear constraints.
The data for the ratio constants (a_i, b_i) and the area lower bounds (d_i) are given in
Table 1. The nonconvex terms in the objective function are linearized and linear
estimators are introduced. The nonlinear constraints over the area can be written in a
convex form as,

x_i ≥ d_i / y_i        (9)
Table 1 Data for Example 2
Room    1      2      3        4      5      6      7      8
a_i     1.25   1/3    1/1.5    1.25   1      1      1.25   1.25
b_i     1.5    1/2    1/1.25   1.5    1.25   1.25   1.5    1.5
d_i     16     40     10       20     4      4      20     20
(11)
which is a convex constraint. In the same form the other ratio constraints can be
multiplied by x_i ≥ 0 to obtain the following constraints:
(12)
b) Equilibrium equations
c) Compatibility equations

Σ_{k=1}^{n} b_ik d_jk = v_ij        for i = 1, ..., m;  j = 1, ..., L        (15)

d) Hooke's law

(E_i a_i / l_i) v_ij = s_ij        for i = 1, ..., m;  j = 1, ..., L        (16)

e) Stress equations

(E_i / l_i) v_ij = σ_ij        for i = 1, ..., m;  j = 1, ..., L        (17)

f) Bounds

d_jk^L ≤ d_jk ≤ d_jk^U        (18)
σ_ij^L ≤ σ_ij ≤ σ_ij^U        (19)
s_ij^L ≤ s_ij ≤ s_ij^U        (20)
v_ij^L ≤ v_ij ≤ v_ij^U        (21)
a_i^L ≤ a_i ≤ a_i^U        (22)
The objective function of this model is linear and the nonconvex terms in the form of
bilinearities are involved in Hooke's law equations (16).
Example 3
This example consists of the truss illustrated in Figure 4. The modulus of elasticity
is 1×10^7 psi, the density is 0.1 lb/in3 and the maximum stress is 20,000 psi in
compression or tension. The remaining data is given in Table 2.

Table 2 Data for Example 3
Bar      1           2           3           4           5
b_i1    -0.89443    -0.95783    -0.99504    -0.99504    -0.95783
b_i2    -0.44721    -0.28735    -0.0995      0.09950     0.28735
l_i     111.8034    104.4031    100.4988    100.4988    104.4031

-2 ≤ d_jk ≤ 2,  -200,000 ≤ s_ij ≤ 200,000,  -0.22 ≤ v_ij ≤ 0.22,  0 ≤ a_i ≤ 10,
-20,000 ≤ σ_ij ≤ 20,000

[Figure 4: truss for Example 3, with an applied load of 100,000 lb]
The bilinear terms are linearized by Wij =Vij ai and linear over and underestimators [5]
are included. In this case it is possible to exploit further the mathematical structure
of this problem. Additional constraints are generated using the stress equations (17),
(E_i / l_i) v_ij = σ_ij        (17)

Multiplying by a_i ≥ 0 yields,

(E_i / l_i) a_i v_ij = a_i σ_ij        (23)

(E_i / l_i) w_ij = z_ij        (24)
Linear over and underestimators are also included for z_ij = σ_ij a_i. The resulting LP
model includes the estimators for z_ij, w_ij and the equations (24). The solution of this
problem is f^L = 147.5 lb and the approximations are exact, corresponding to the
global solution with a = (7.102, 0, 0, 0, 6.525). If the additional equation (24) with the
corresponding linear estimators is not generated, the lower bound yields f^L = 144.0 lb,
which represents a 2.3% gap from the global optimum. When the original
nonconvex problem is solved with MINOS 5.2 providing zero values as an initial
point, no feasible solution is obtained.
Example 4
Consider the truss shown in Figure 5. The modulus of elasticity is 1×10^7 psi, the
density 0.1 lb/in3 and the maximum stress is 25,000 psi in compression or tension.
The remaining data are given in Table 3.
[Table 3: data for the 10 bars of Example 4]
-10 ≤ d_jk ≤ 10,  -250,000 ≤ s_ij ≤ 250,000,  -1.273 ≤ v_ij ≤ 1.273,  0 ≤ a_i ≤ 10,
-25,000 ≤ σ_ij ≤ 25,000
Portfolio Investment
A set of securities, i, is available for investing. The investment has to be done
achieving a target mean annual return according to the mean annual returns on the
individual securities, mi. The total variance of the investment has to be minimized.
By defining Xi as the fraction to be invested for each security i, the optimization
problem can be expressed as:
In this case the bilinear terms in the objective function are linearized by introducing
variables w_ij and the linear estimators. The quadratic terms x_i² remain in the convex
underestimator problem when the variance coefficient, v_ii, is positive. The upper
bounds on the investment fractions, x_i, can in some cases be tightened according to
the following equation

(25)
Example 5
The data for this example are given in Table 4. The initial lower bound is f1. = 5.22
and corresponds to an actual objective function of f = 5.429. Since the difference is
greater than the tolerance, e = 0.001, a branch and bound search is conducted. After 7
nodes the global optimal of f = 5.429 is obtained with x=(0.143,0.143, 0.714, 0.0).
Figure 6 Multiple stages batch process
The objective is to maximize the profit given by the income from the sales of the
products minus the investment cost. Lower bounds are specified for the demands of
the products and the investment cost is assumed to be given by a linear cost function.
Since the size of the vessels and the number of batches are assumed to be continuous,
this gives rise to the following NLP model:
max P = Σ_i p_i n_i B_i − Σ_j α_j V_j

s.t.  V_j ≥ S_ij B_i        i = 1, ..., N;  j = 1, ..., M        (NLP_P)

      Σ_i n_i T_i ≤ H

      Q_i^L − n_i B_i ≤ 0        i = 1, ..., N
where n_i and B_i are the number and size of the batches for product i, and V_j is the size
of the equipment at stage j. The first inequality is the capacity constraint in terms of
the size factors S_ij, the second is the horizon constraint in terms of the cycle times
for each product T_i and the total time H, and the last inequality is the specification of
lower bounds for the demands Q_i^L. Note that the objective function is nonconvex as
it involves bilinear terms, while the constraints are convex.
Example 6
The data for this example are given in Table 5. A maximum size of 5000 L is
specified for the units in each stage.
Table 5 Data for Example 6
Product   T_i (hrs)   p_i ($/kg)   Q_i^L (kg)   S_ij (L/kg), stages 1-3
A         16          15           80,000       2   3   4
B         12          13           50,000       4   6   3
C         13.6        14           50,000       3   2   5
D         18.4        17           25,000       4   3   4

α_1 = 50, α_2 = 80, α_3 = 60 ($/L);  H = 8,000 hrs
When a local search algorithm (MINOS 5.2) is used for solving this NLP_P problem
(default starting point in GAMS), the predicted optimum profit is $8,043,800/yr and
the corresponding batch sizes and their number are shown in Table 6.
Table 6 Local solution for Example 6
Product    A        B         C       D
B_i (kg)   1250     833.33    1000    1250
n_i        79.15    60        50      289.87
maximum equipment sizes. The only difference was in the number of batches
produced for products A and D.
Product    A        B         C       D
B_i (kg)   1250     833.33    1000    1250
n_i        389.5    60        50      20
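A simple check of the gap reported for this example (not part of the original solution procedure) is to evaluate the profit of NLP_P directly at the two solutions above, using the data of Table 5; the equipment sizes follow from the capacity constraints.

p = [15, 13, 14, 17]                                 # product prices ($/kg), products A-D
S = [[2, 3, 4], [4, 6, 3], [3, 2, 5], [4, 3, 4]]     # size factors S_ij (L/kg)
a = [50, 80, 60]                                     # stage cost coefficients ($/L)

def profit(n, B):
    # equipment sizes follow from the capacity constraints V_j >= S_ij * B_i
    V = [max(S[i][j] * B[i] for i in range(4)) for j in range(3)]
    return sum(p[i] * n[i] * B[i] for i in range(4)) - sum(a[j] * V[j] for j in range(3))

B = [1250, 833.33, 1000, 1250]
print(profit([79.15, 60, 50, 289.87], B))    # ~8.04e6 $/yr, the local (suboptimal) solution
print(profit([389.5, 60, 50, 20], B))        # ~8.13e6 $/yr, the solution with different n for A and D

The second solution, which differs only in the number of batches for products A and D, yields a profit of roughly $8.13 million/yr, compared with the $8,043,800/yr of the local solution.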
min f = Σ_j α_j V_j − Σ_i b_i Q_i

s.t.  V_j ≥ S_ij B_i        for i = 1, ..., N;  j = 1, ..., M

      Σ_i T_Li Q_i / B_i ≤ H

      V_j^L ≤ V_j ≤ V_j^U

      Q_i^L ≤ Q_i ≤ Q_i^U

      B_i ≥ 0
The first set of constraints corresponds to the volume requirements for each unit with
respect to all the products. The second constraint states that the total time of
production has to be smaller than the allocated time H. The third constraint
represents a raw material limitation. Bounds over the volumes, V_j, and the
production levels, Q_i, are given. Note that the nonconvexities appear in the time
constraint in the form of a sum of linear fractions. Nonlinear underestimators of these
terms are included and have the following form
It is necessary to have bounds over the batch sizes B_i. These are given by the
following valid relaxations of the original constraints in NLP_B:

B_i ≤ B_i^U = min_j [ V_j^U / S_ij ]        (30)

B_i ≥ B_i^L = Q_i^L T_Li / H        (31)
Example 7
This example involves 5 products and 6 stages, and the corresponding data are given
in Table 8. The following additional linear constraints are imposed:
(32)
(33)
Table 8 Data for Example 7
Product         A          B          C          D          E
Q_i^L (kg)      200,000    120,000    180,000    130,000    100,000
Q_i^U (kg)      300,000    180,000    200,000    160,000    150,000
b_i             0.8        0.7        0.6        0.4        0.5
-               0.1        0.15       0.15       0.2        0.2
T_Li (hr)       8.31       6.8        11.9       3.5        4.2

α_j = 2.5 $/L;  V_j^L = 3,000 L;  V_j^U = 6,000 L;  F = 550,000 kg
using the solution of the underestimator problems as the initial point. In this form
an upper bound of f = −73,270 is generated. It is necessary to perform a branch and
bound search, and after 7 nodes the initial upper bound is proven to be globally optimal with
tolerance ε = 0.01. The global solution yields V = (5737, 3600, 3776, 4983, 4430,
4014) (L).
CONCLUSIONS
This paper has presented a general overview of the global optimization algorithm by
Quesada and Grossmann [6] and outlined several alternative bounding approximations
which can be applied in layout design, truss structures, portfolio investment and
batch process design. As has been shown the use of some of these alternative
approximations can sometimes tighten the relaxations so that the solution of only
one convex programming problem is required.
ACKNOWLEDGMENT
The authors would like to acknowledge financial support from the Engineering Design
Research Center at Carnegie Mellon University.
REFERENCES
[1] Floudas, C.A. and Pardalos, P.M. (1990). A Collection of Test Problems for
Constrained Global Optimization Algorithms. Edited by G. Goos and J. Hartmanis,
Springer Verlag.
[2] Grossmann, I.E., Voudouris, V.T. and Ghattas, O. (1992). Mixed-Integer Linear
Programming Reformulation for Some Nonlinear Discrete Design Optimization
Problems. Recent Advances in Global Optimization (Floudas, C.A. and Pardalos,
P.M., eds.), Princeton University Press, Princeton, NJ, 478-512.
[6] Quesada, I. and Grossmann, I.E. (1995). A Global Optimization Algorithm for
Linear Fractional and Bilinear Programs. Journal of Global Optimization, 6, 39-76.
[7] Quesada, I. and Grossmann, I.E. (1993). Global Optimization Algorithm for Heat
Exchanger Networks. Ind. Eng. Chem. Research, 32, 487-499.
[8] Sahinidis, N.V. (1993). Accelerating Branch and Bound in Continuous Global
Optimization. TIMS/ORSA meeting, Phoenix, AZ, paper MA 36.2.
ABSTRACT
A municipal water distribution system is a network of underground pipes,
usually mirroring the city street network, that connects water supply sources
such as reservoirs and water towers with demand points such as residential
homes, industrial sites and fire hydrants. These systems are extremely
expensive to install, with costs running in the tens of millions of dollars.
Several optimization models and algorithms have been developed to generate
a least cost construction plan along with optimal flows and energy heads for a
given network configuration and demand pattern. However, in reality, such
models need to examine replacement and expansion decisions associated with
an existing distribution network, rather than generate a new design from
scratch. Moreover, several input parameters need to be determined via a
pipe reliability and cost analysis, which in turn depends on the usage of the
system as determined by the output of this model. Accordingly, we propose
in this paper a pipe reliability and cost submodel that uses statistical methods
to predict pipe breaks and hence to estimate future maintenance costs. This in
turn determines annualized costs and optimal economic lives, thereby
facilitating replacement decisions for relatively expensive-to-maintain or
undercapacitated pipes. This model is then integrated with the pipe network
optimization submodel in an overall design approach that uses a feedback
loop to reprocess the information that is generated by each model over a
number of stages, until a stable design is attained. The proposed approach
hence provides a holistic framework for designing reliable water distribution
systems. In particular, it identifies a nonconvex optimization subproblem that
global optimizers can focus on solving within this framework.
t This work has been partially supported by a research grant from the University of
Korea, Seoul, Korea.
1 INTRODUCTION
1.1 MOTIVATION
The most general WDS problem would require the modification and/or
expansion of an existing network to enable it to satisfy the varied anticipated
demand patterns for water at required pressure levels, even while experiencing
pipe breakages. If the network is designed with low energy heads and
undersized or rough pipes in a skeletal fashion, then flow and/or pressure
requirements will not be met during certain demand peaks or under various
pipe failure scenarios. On the other hand, if the energy sources and pipes are
overdesigned, or if there are too many redundant paths, then increased costs
may lead to an inefficient solution. Therefore, the problem at hand requires a
cost effective network design and replacement strategy that satisfies stated
hydraulic requirements under various likely demand patterns and failure
modes.
two nodes of the distribution system network, the lengths of which add up to
the required length of the pipeline between the nodes.
Pipe reliability and cost models provide reliability and annualized cost
information for new and existing pipe segments, along with replacement
recommendations based on a comparison of the estimated costs of either
retaining an existing pipe, or replacing it with a suitable new pipe. These
determinations require the computation of optimal lifetimes for both existing
pipes and new pipes, which, in turn, require estimating individual pipe segment
reliabilities. Several analytical methods have been proposed by researchers
that address such issues (see the books by Walski, 1984, 1987).
Stacha (1978) presents a simple replacement model for water mains based on
the premise that such a replacement should occur when the annual cost of
maintenance exceeds that of replacing the main. In a similar vein, Shamir and
Howard (1979) develop a model for determining the optimal year for
individual pipe replacement by considering the present value dollar costs of
replacement and maintenance. In two separate case studies, Clark et al.
(1982) investigate replacement cost and frequency data, and develop
regression equations to determine several coefficients required in the Shamir
and Howard break-rate equations. A financial analysis is then performed in
accordance with Shamir and Howard to determine an optimal replacement
year for each segment of pipe. Kettler and Goulter (1985) also investigate
historical pipe breakage data from a number of cities and develop statistical
regression model relationships to study the effect of time and pipe diameters
on failure rates. Andreou et al. (1987a, b) propose a statistical methodology
for analyzing break records using nonparametric methods that obviate the
need to hypothesize distributions. The prescribed model can be used to (a)
determine the future cost of various replace/repair strategies, (b) determine an
optimal pipeline replacement time, and (c) estimate the network reliability.
Karaa et al. (1987) present a linear programming formulation for a water
distribution system resource allocation problem. Pipes having similar
maintenance histories are grouped into bundles, and the decision variables
represent the percentages of these bundles that are chosen for replacement.
The second submodel referred to above, that is, the pipe network design
submodel, provides construction decisions for a fixed network configuration
under a number of demand scenarios, including the peak demand and various
firefighting demand patterns. (These demand patterns specify the flow rates
and hydraulic pressure levels required at each demand node.) This submodel
also includes some level of hydraulic redundancy in the network design which
ensures that demand can continue to be satisfied under various failure
scenarios.
m ≡ meters.
C_si: annualized cost per unit energy head ($/m/year) provided at source node
i ∈ S.
[H_i^L, H_i^U]: admissible interval for the energy head at demand node i ∈ D.
A: set of directed arcs or links (i, j) and (j, i) for each connected pair of nodes
i and j in the given network configuration, including source node slack arcs
(i, 0), i ∈ S.
Q_ij: decision variable representing the flow rate (m3/hr) on link (i, j) ∈ A.
x_ijk: decision variable representing the length (m) of a new segment of link
(i, j) ∈ A that is to be constructed, having a diameter d_k. (Note that
x_ijk = x_jik.)
c_ijk: annualized construction and maintenance cost per unit length ($/m/year)
of link (i, j) ∈ A that has a diameter of d_k.
ψ_ij(Q_ij, x_ij, X̄_ij): head loss (m) due to friction in link (i, j) based on rough flow conditions (see Walski,
1984), where x_ij ≡ (x_ijk, k = 1, ..., K) and X̄_ij ≡ (X̄_ijk, k = 1, ..., K).
NOP:  Minimize  Σ_{(i,j)∈A, i<j} Σ_{k=1}^{K} c_ijk x_ijk + Σ_{i∈S} C_si H_si        (1a)

subject to constraints that include head-loss relationships and length requirements of the form

ψ_ij(Q_ij, x_ij, X̄_ij) − [(H_i + E_i) − (H_j + E_j)] ≤ 0        (1f)

(1g)

Σ_{k=1}^{K} (x_ijk + X̄_ijk) = L_ij        for each (i, j) ∈ A, i < j        (1h)

(H_i + E_i) − (H_j + E_j) ≤ ψ_ij(Q_ij, x_ij, X̄_ij)
The foregoing model assumes several known and fixed entities. It assumes a
given network configuration that might be one of several alternative
configurations that need to be investigated. It assumes a given demand
pattern, while the performance of the design would need to be examined in
light of several demand scenarios, including peak demand and firefighting
requirements. It assumes that annualized cost coefficients c_ijk are available.
A pipe reliability and cost model that considers pipe breakages along with
capital and maintenance cost information is needed to compute these
coefficients. Furthermore, an analysis is needed to prescribe which existing
pipe segments should be retained (these show up as the X̄_ijk values above),
and which should be discarded or replaced. Since this analysis is based on
network performance, it requires an estimate of link flows, which are actually
part of the output of this pipe network optimization model. Based on the
decision to construct new pipe segments, or to retain existing pipe segments,
the associated age-dependent Hazen-Williams frictional head loss coefficients
could then be prescribed. These considerations are addressed within the
framework of an integrated approach that is developed in the following
section.
"2'"' -100
2 [180.2101
3 [190.2101
"5=-270 "4= -120 4 [185.210]
5 [180.210]
6 [195.210]
7 [190.210]
when maintenance events are expected to occur and how much they will cost,
and when the replaced pipe is itself to be replaced.
years), relative to the horizon of the design problem or budgetary cycle, then
the existing pipe segment is recommended for replacement by a new segment.
(A suggested diameter for an initial design solution is also available.)
Otherwise, the existing pipe segment is retained, and an accompanying Hazen-
Williams coefficient is prescribed.
In what follows, we assume that based on the structure of the existing network,
anticipated demand changes, the practical feasibility of constructing pipe
connections, and an analysis of providing adequate connectivity redundancy
so that no demand node is cut off from its principal source(s) if any link in
the network fails (see Loganathan et al., 1990), some network configuration
(N, A) is available. (Note that this overall methodology can be applied to
various alternative configurations, perhaps composed based on designs
analyzed over previous runs.) Given this, the foregoing two submodels can be
integrated using the following stepwise procedure.
I. Preprocessing Cost Analysis. First, the reliability and cost submodel is run
for new pipe segments using all commercially available diameters in order to
determine their respective optimal lives and annualized costs.
II. Preprocessing Flow Analysis. Using the annualized costs for the new
pipes from Step I, the pipe network optimization submodel is run for a
representative demand pattern, assuming tentatively that the network is being
designed from "scratch," that is, with all existing pipes also being replaced by
new pipes. The resulting solution suggests a baseline flow for each link in the
network, and provides an estimate of the hydraulic properties (flow and
pressure gradients) that are desirable in each of the pipe links.
III. Pipe Reliability and Cost Submodel. For each existing pipe segment, the
pipe reliability and cost analysis submodel is run using the current estimated
flows to compute the annualized expected cost over a, say, 40-year time
horizon. This cost is determined using a suggested replacement diameter that
does not reduce the hydraulic gradient in the pipe, along with an
accompanying computed optimal year of replacement. If the replacement
falls within the current budgetary cycle (say, 5 years), or if the pipe segment
satisfies any other criterion for replacement, such as being undercapacitated
with respect to expected flow requirements, the existing section is identified
for replacement in the network design. Otherwise, the existing section is
retained.
IV. Pipe Network Design Submodel. The pipe network design optimization
submodel is now run again, using the annualized costs computed in Step I for
the new pipes, and using the retained existing pipe segments as determined at
Step III, to prescribe a set of pipe section diameters for the remaining newly
constructed segments, as well as source energy head levels. (The current
network design, including the recommended replacement diameters for
existing pipes that have been identified for replacement at Step III, can be
used as an advanced-start solution for this optimization run.) A
corresponding set of resulting hydraulic pressures and flow rates are hence
determined for each node and link, respectively. Note that the pipe flow rates
prescribed in this step will not necessarily be the ones estimated by the
previous run of the optimization submodel. If this difference is substantial (as
determined subjectively by the decision maker), then the process could be
made to transfer back to Step III, using the new flows as determined at the
current iteration of Step IV. This can be repeated until an acceptable design
is attained.
The pipe reliability and cost submodel formulated in this section, as discussed
above, can be used to predict the annualized costs of installing new segments
of pipes having various standard diameters, as well as to ascertain when and
using what option each existing pipe segment in the water distribution
system network should be replaced. This analysis is conducted using an optimal
economic life for each alternative over a 40-year time horizon. For this
purpose, in order to project pipe failure rates, Hazen-Williams coefficients,
and maintenance and replacement costs, we will appropriately compose a set
of existing statistical models from Shamir and Howard (1979), Quindry et al.
(1981), Walski (1984, 1987), and Kettler and Goulter (1985).
N(t) = N(t_0) e^{b(t − t_0)}        (2)

where N(t) is the break rate (breaks/year/km) at time t, N(t_0) is the break rate at the reference year t_0, and b is the break rate growth coefficient, and

N_0(D) = 0.3 − 0.01 D        (3)

where D is the pipe diameter in inches.
We will combine the basic equation for the break rate given by Equation (2)
with the break rate versus diameter relationship of Equation (3), and add an
extension for larger diameter pipes, to formulate the following break rate
model as a function of time and diameter:
N(t, D) = N_0(D) e^{0.1(t − t_0)} =  (0.3 − 0.01 D) e^{0.1(t − t_0)}          if D ≤ 16
                                     0.14 e^{−(D − 16)/14} e^{0.1(t − t_0)}   if D ≥ 16
breaks/year/km.        (4)
Noting (2), the break rate coefficient b has been taken to be 0.1 as
recommended by Walski (1987). Furthermore, the initial break rate N_0 has
been designated to be a function of the diameter D, following Equation (3)
for D ≤ 16, and decreasing with a decreasing slope as suggested by Kettler
and Goulter as the diameter increases beyond 16". Note that the coefficient
of the expression for D ≥ 16 has been determined to make the function
smooth at D = 16.
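A direct transcription of the break rate model (4) into Python (D in inches; the result is in breaks/year/km) also confirms that the two branches match at D = 16:

import math

def break_rate(t, D, t0=0.0):
    # Break rate N(t, D) of Equation (4), in breaks/year/km.
    if D <= 16:
        N0 = 0.3 - 0.01 * D
    else:
        N0 = 0.14 * math.exp(-(D - 16) / 14.0)
    return N0 * math.exp(0.1 * (t - t0))

print(round(break_rate(0, 16), 3))   # 0.14 -> the two branches agree at D = 16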
∫_{t_now}^{t} L N_0(D) e^{0.1(τ − t_0)} dτ = n        (5)
We must also consider the case when the pipe for which we are estimating
costs has been in place for several years and may have experienced previous
breaks. We will assume that the break rate is still modeled by Equation (4)
regardless of the previous history of breaks. For example, suppose that our
16" diameter pipe segment has had two previous breaks and is 12 years old
(as compared with the expected 13.5 years until the second break). We wish
to compute the expected time of the next (third) break. We solve Equation
(6) with t_0 = 0, t_now = 12 and n = 1 for the time of the first future (next)
failure, giving t = 15.58 years. Notice that the expected time of the third
break is earlier than before, since the second break occurred earlier than
expected. Also, the time between the second and third breaks of 3.58 years is
longer than the previously computed 3.15 years, since the failure rate is lower
during the earlier years.
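Equation (5) can be solved in closed form for the expected time of the n-th future break, which is presumably the role played by Equation (6) in the text. The following sketch (with the segment length L in km, so that L·N_0(D) is the initial break rate of the whole segment) reproduces the 13.5- and 15.58-year figures quoted above:

import math

def initial_rate(D):
    # Initial break rate N_0(D) per km, from Equation (4) at t = t_0.
    return 0.3 - 0.01 * D if D <= 16 else 0.14 * math.exp(-(D - 16) / 14.0)

def next_break_time(n, D, L_km, t_now, t0=0.0, b=0.1):
    # Expected time t of the n-th break after t_now, solving Equation (5) for t.
    rate = L_km * initial_rate(D)            # breaks/year for the whole segment at t_0
    return t0 + math.log(math.exp(b * (t_now - t0)) + b * n / rate) / b

# 500 m long, 16-inch pipe installed at t0 = 0:
print(round(next_break_time(2, 16, 0.5, 0.0), 2))    # 13.5  years to the second break
print(round(next_break_time(1, 16, 0.5, 12.0), 2))   # 15.58 years: next break of a 12-year-old pipe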
To model the capital cost of installing a new pipe, Quindry et al. (1981)
recommended an exponential function of the diameter D, and Walski (1984)
refined this model to include the dependence of the model coefficients on the
pipe construction material and on various ranges of pipe diameters. This
relationship is given as follows, where
C_C(D) is the capital cost per unit length ($/m) as a function of the pipe
diameter D (inches):
C_C(D) =  14.1 e^{0.170 D}     if D ≤ 8
          3.00 D^{1.40}        if 8 ≤ D ≤ 24
          6.45 D^{1.16}        if 24 ≤ D ≤ 48        (7)
          0.656 D^{1.75}       if D ≥ 48.
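Equation (7) transcribed directly into Python (D in inches, cost in $/m):

import math

def capital_cost(D):
    # Capital cost per unit length C_C(D) in $/m, Equation (7); D in inches.
    if D <= 8:
        return 14.1 * math.exp(0.170 * D)
    if D <= 24:
        return 3.00 * D ** 1.40
    if D <= 48:
        return 6.45 * D ** 1.16
    return 0.656 * D ** 1.75

print(round(capital_cost(16), 1))   # ~145.5 $/m for the 16-inch pipe used in the examples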
Next, let us formulate a model for repair or maintenance costs. Small leaks
that are caused by a hole or a small crack can be fixed with a repair clamp
that wraps around the pipe, or for larger diameters, by welding a patch onto
the pipe. One model reported by Walski (1984) that was useful for
approximating the maintenance costs for repairing such a break in the
Buffalo District from a U.S. Army Corps of Engineers study is given by
600 D^{0.4} $/break, and includes allowances for crew cost, equipment, sleeve,
paving, and tools. Occasionally, larger longitudinal cracks or crushed pipes
might actually require the replacement of a physical section of pipe. We will
assume that such cracks requiring a replacement of the pipe section occur
f_sec ∈ (0, 1) fraction of the time, and that sections are L_sec = 10 m long
(these are variable parameters in the model). Hence, in this case, an
additional cost of L_sec C_C(D) would be incurred, where C_C(D) is given by
(7). Thus, the average or expected maintenance repair cost for a break is
given by

600 D^{0.4} + f_sec L_sec C_C(D)        (8)

For example, using our example of a 16" pipe, the estimated (noninflated)
cost of repairing a single break is given by
600 (16)^{0.4} + 0.1 (10) (3) (16^{1.4}) = $1964, assuming f_sec = 0.1.
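Reusing the capital_cost function sketched above, Equation (8) and the $1964 figure can be checked directly:

def repair_cost(D, f_sec=0.1, L_sec=10.0):
    # Expected (uninflated) cost of repairing one break, Equation (8), in $.
    return 600 * D ** 0.4 + f_sec * L_sec * capital_cost(D)

print(round(repair_cost(16)))   # ~1964 $ per break for a 16-inch pipe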
4.3 Annualized Costs for New Pipes
The annualized cost for a section of new pipe can now be computed for each
standard diameter based on its optimal lifetime with respect to capital and
maintenance costs. For each diameter of pipe, various candidate lifetimes are
considered, coinciding with the expected failure times given by Equation (6),
and each candidate lifetime is analyzed by computing the capital plus
maintenance costs using Equations (7) and (8), based on the assumed section
length and the expected number of breaks corresponding to the given
lifetime, and then annualizing all costs incurred over that lifetime. For
computing annualized costs, we use inflation-free real prices and real interest
rates (see Grant et al., 1987). As the lifetime is increased, the annualized costs
first decrease until maintenance costs begin to take over, and then the
annualized costs start to increase. We take this least-cost time to be the
optimal economic life of the pipe based on financial considerations. For
example, assuming a 4% inflation rate and an 8% market interest rate, the real
interest rate can be computed to be (1.08/1.04) - 1, or 3.85% (see Grant et
al., 1987). Using this rate and considering a 1000 m length section of pipe
for each of the twenty diameters considered to be commercially available, the
corresponding optimal lives and annualized costs can be computed, and are
listed in Table 1. These optimal lifetime calculations are slightly dependent
on length since we take into account the occasional section replacements
required for longitudinal cracks and crushed pipes in Equation (8).
(Otherwise, the length would simply have been a direct proportionality factor
in the total cost expression.) Since the dependence on length is slight, we
could assume that the annualized cost per meter as computed in Table 1 is
sufficiently representative for general use as the required coefficients c_ijk in
the pipe network design submodel.
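The procedure just described can be sketched as follows, reusing the helper functions given above. The discounting conventions (end-of-year breaks, a standard capital-recovery factor at the real interest rate, and whether the break at the replacement time is itself repaired) are assumptions made for illustration, so the numbers will only approximate those in Table 1.

def annualized_cost(D, length_m, real_rate=0.0385, t0=0.0, horizon_breaks=60):
    # Annualized capital plus maintenance cost ($/year) of a new pipe of diameter D,
    # minimized over candidate lifetimes taken at the expected break times.
    # A sketch of the procedure in the text; discounting conventions are assumed.
    L_km = length_m / 1000.0
    capital = length_m * capital_cost(D)
    best = None
    for n in range(1, horizon_breaks + 1):
        life = next_break_time(n, D, L_km, t_now=t0, t0=t0)     # candidate lifetime
        # present value of capital plus maintenance for breaks up to the candidate lifetime
        pv = capital
        for k in range(1, n + 1):
            t_k = next_break_time(k, D, L_km, t_now=t0, t0=t0)
            pv += repair_cost(D) / (1 + real_rate) ** t_k
        # annualize with the capital-recovery factor over the candidate lifetime
        crf = real_rate * (1 + real_rate) ** life / ((1 + real_rate) ** life - 1)
        annual = pv * crf
        if best is None or annual < best[0]:
            best = (annual, life)
    return best   # (annualized cost in $/year, optimal economic life in years)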
The smallest commercially available diameter of a new pipe that has a lower
hydraulic gradient than 0.0246 under the baseline flow of Q = 600 m3/hour
and with C_HW = 97 is D = 15". For this diameter, the hydraulic gradient is
computed via (10) as 0.023 (meters head loss/meter of pipe).
To illustrate, consider our example of a 16" pipe segment that is 500 m long
and currently 12 years old, for which we have prescribed a replacement
option given by a 15" new pipe as determined above. We will detail the
analysis for replacement in the candidate year corresponding to the fifteenth
break, given by Year 20 via Equation (6).
The expected maintenance times for the existing pipe are found by solving
Equation (6) using t_0 = −12, t_now = 0, and n = 1, 2, 3, 4, ..., yielding 3.58,
6.21, 8.29, 10.0, ... years, respectively, each costing $1964 as computed before
via Equation (8). The annualized capital plus maintenance costs from Table 1
for the new 15" pipe are $8.83/year/m, or $4415/year for the 500 m segment.
If we add up the discounted (present value) expected maintenance costs for
the existing pipe (including that for the break in Year 20), and discount the
annualized costs of $4415/year for the new pipe (occurring from Year 21
until the end of the 40 year time horizon), we find that the present value of
the option to replace during Year 20 at the 15th break is $45,270.
Performing this analysis for each such candidate lifetime, we determine that
the optimal least cost year of replacement is 23 years, with a present value cost
of $44,696. For this option that replaces the pipe in Year 23, there are 22
expected failures and corresponding maintenance actions for the existing 16"
pipe segment before its replacement in Year 23, followed by 2 expected
failures and corresponding maintenance actions for the new 15" pipe between
Year 23 and Year 40. Since the optimal replacement time does not occur
during the next five years, we would recommend the continued use of this
pipe segment, unless the subsequent network design phase determines that it
is hydraulically unacceptable.
6 References
Alperovits, E. and Shamir, U. "Design of Optimal Water Distribution Systems," Water
Resources Research, Vol. 13, December 1977, pp. 885-900.
Andreou, S. A., Marks, D. H. and Clark, R. M. "A New Methodology for Modeling
Break Failure Patterns in Deteriorating Water Distribution Systems: Theory," Advances
in Water Resources, Vol. 10, March 1987a, pp. 2-10.
Andreou, S. A., Marks, D. H. and Clark, R. M. "A New Methodology for Modeling
Break Failure Patterns in Deteriorating Water Distribution Systems: Applications,"
Advances in Water Resources, Vol. 10, March 1987b, pp. 11-21.
Collins, M., Cooper, L., Helgason, R., Kennington, J. and LeBlanc, L. "Solving the
Pipe Network Analysis Problem using Optimization Techniques," Management Science,
Vol. 24, No. 7, March 1978, pp. 747-760.
Kessler, A. and Shamir, U. "Analysis of the Linear Programming Gradient Method for
Optimal Design of Water Supply Networks," Water Resources Research, Vol. 27, No. 7,
July 1989, pp. 1469-1480.
Lansey, K. and Mays, L. "A Methodology for Optimal Network Design," Computer
Applications in Water Resources, ed. H. C. Torno, 1985, pp. 732-738.
Lasdon, L. S. Optimization Theory for Large Systems, Macmillan, New York, NY,
1970.
Stacha, J. H. "Criteria for Pipeline Replacement," Journal of the American Water Works
Association, May 1978, pp. 256-258.
12
GLOBAL OPTIMISATION OF
GENERAL PROCESS MODELS
Edward M.B. Smith and Constantinos C. Pantelides
ABSTRACT
This paper is concerned with the application of deterministic methods for global optimisation
to general process models of the type used routinely for other applications. A major difficulty
in this context is that the methods currently available are applicable only to rather restricted
classes of problems. We therefore present a symbolic manipulation algorithm for the automatic
reformulation of an algebraic constraint of arbitrary complexity involving the five basic
arithmetic operations of addition, subtraction, multiplication, division and exponentiation, as
well as any univariate function that is either convex or concave over the entire domain of
its argument. This class includes practically every constraint encountered in commonly used
process models.
The reformulation converts the original nonlinear constraint into a set of linear constraints and
a set of nonlinear constraints. Each of the latter involves a single nonlinear term of simple form
that can be handled using a spatial branch and bound algorithm.
The symbolic reformulation and spatial branch and bound algorithms have been implemented
within the gPROMS process modelling environment. An example illustrating its application is
presented.
1 INTRODUCTION
Many important process design and operation tasks may be expressed mathematically
as nonlinear programming problems (NLP) of the form:
min_x  φ(x)

subject to

g(x) = 0
h(x) ≤ 0

and

x^L ≤ x ≤ x^U
min_x  Φ^L(x)

subject to

g^L(x) ≤ 0
g^U(x) ≥ 0
h^L(x) ≤ 0

and

x^L ≤ x ≤ x^U
where Φ^L(x) is a convex underestimator of the objective function Φ(x), g^L(x) and
h^L(x) convex underestimators of the functions g(x) and h(x) respectively, and g^U(x)
a concave overestimator of g(x).
Convex relaxations have already been proposed for many special algebraic forms,
such as bilinear (xy) and linear fractional (x/y) terms [15, 21]. However, in general
engineering optimisation applications, we potentially have to deal with much more
general expressions. Consider, for instance, the nonlinear expression

[x ln(y) + z] / (z + x y)
where x, y and z are variables. This clearly does not correspond to any one of the
simple special forms for which convex bounds are available. However, by inspection,
we can produce the following reformulation:
w1 = ln(y)
w2 = x w1
w3 = w2 + z        [ = x ln(y) + z ]
w4 = x y
w5 = z + w4        [ = z + x y ]
w6 = w3 / w5
We note that the original constraint has been replaced by two linear and four nonlinear
constraints. Each of the latter involves a single term of special form that can, in fact,
be bounded using results already reported in the literature. Some extra variables (w)
have also been introduced in this process, with w6 being equivalent to the original
nonlinear expression.
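The same bookkeeping can be mechanized. The following much-simplified Python sketch performs this kind of recursive reformulation on nested tuples rather than on the binary trees used by the authors; it only distinguishes linear from nonlinear nodes (it does not track constant sub-expressions, which the full algorithm also handles), so the numbering of the intermediate variables differs slightly from the w1-w6 above.

import itertools

counter = itertools.count(1)
linear_constraints = []    # w_k = <linear expression>
nonlinear_defs = []        # w_k = <single bilinear / fractional / univariate term>

def new_var(expr, store):
    w = 'w%d' % next(counter)
    store.append((w, expr))
    return w

def as_variable(expr):
    # If expr is still a (linear) tuple, replace it by a new variable.
    return new_var(expr, linear_constraints) if isinstance(expr, tuple) else expr

def reformulate(expr):
    # Return a symbol (or linear tuple) equivalent to expr, emitting definitions.
    if not isinstance(expr, tuple):
        return expr                                   # leaf: variable or constant
    if len(expr) == 2:                                # univariate function, e.g. ('ln', y)
        op, arg = expr
        return new_var((op, as_variable(reformulate(arg))), nonlinear_defs)
    op, left, right = expr
    left, right = reformulate(left), reformulate(right)
    if op in '+-':                                    # linear node: keep it symbolic
        return (op, left, right)
    # nonlinear node (*, / or power): operands must be single variables
    return new_var((op, as_variable(left), as_variable(right)), nonlinear_defs)

# the worked example: (x ln(y) + z) / (z + x y)
top = reformulate(('/', ('+', ('*', 'x', ('ln', 'y')), 'z'),
                        ('+', 'z', ('*', 'x', 'y'))))
for w, d in linear_constraints + nonlinear_defs:
    print(w, '=', d)
print('expression =', top)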
The above reformulation was easily achieved by inspection. In this section, we seek
to establish a general algorithm that employs symbolic manipulation to carry out this
type of reformulation for expressions of arbitrary complexity. First, we review the
binary tree representation of algebraic expressions on which the symbolic algorithms
operate. We then describe in detail the symbolic reformulation algorithm itself.
left ⊗ right.
Albeit strictly correct, this is extremely inefficient in a number of different ways: the
definitions of w1 and w2 are unnecessary as they are both constant quantities; there is no
need for separate definitions of w3 and w4 as they actually represent the same quantity
multiplied by different constants; the introduction of the intermediate quantities w5,
w6 and w7 is also superfluous. In fact, the expression can be reformulated simply as:
w1 = α exp(β) (x + y)
w2 = x + γ y + δ z
w3 = w1 w2

i.e. as two linear constraints and a nonlinear one involving a single bilinear term.
The above example indicates that the reformulation algorithm should keep track of the
constancy or variability of quantities it encounters as it moves up the binary tree; and
also that it should avoid replacing linear sub-expressions by new variables unless this
is absolutely necessary, which is the case only if they become involved in nonlinear
terms that must themselves be replaced by new variables.
The above ideas are incorporated in the algorithm shown in pseudo-code form in
Figure 2 which reformulates a given binary tree b. It is worth first clarifying two
general points:
We now proceed to examine the algorithm in more detail, considering the treatment
of each type of tree separately.
b.type := Leaf
b.class := V
b.content := w(j)

Figure 3 Creation of New Linear Constraint
Once the function argument has been reformulated and assigned a class (denoted
by b.right.class), we have to examine whether we need to replace it by a
new variable. This will be so only if the argument has been determined to be a
linear expression (class X), in which case we need to create a new linear constraint
by invoking procedure CreateLinearConstraint, the definition of which is
shown in Figure 3. We note that this procedure creates a new variable Wj by increasing
a global variable count j by 1. It then proceeds to create a new linear constraint by
equating this new variable to the given binary tree b. The new constraint is added to
a list of linear constraints created by reformulation. Finally, b is replaced by the new
variable: its type is changed to Leaf, its class to V and its contents become Wj.
Having dealt with the univariate function's argument, we now come to consider the
function itself. In particular, if its argument has been determined to be anything other
than a constant, then we must replace it by a new variable. This is achieved by an
invocation of procedure CreateVariableDefinition shown in Figure 4. This
is very similar to the CreateLinearConstraint procedure discussed earlier,
except that in this case we create a definition of a new variable rather than a constraint,
and store this definition in a separate list. As we shall see later, we will use this list to
construct the problem relaxation by creating convex upper and lower bounds for each
one of its members.
We note that CreateVariableDefinition also sets the class of the binary
tree b under consideration to a simple variable V. However, if the argument of the
univariate function was determined to be a constant, then CreateVariableDefinition
will not be invoked, and the class of b must be set to a constant (C) in
ClassifyReformulate.
To illustrate the handling of univariate function operators, consider the expression:
exp(x + 2y)
PROCEDURE CreateVariableDefinition (b BinaryTree)
j := j + 1
which corresponds to a binary tree with a unary operator node at its root. In this
case, its argument would be classified as a linear expression, and therefore would be
replaced by a linear constraint involving a new variable w_j:

x + 2y − w_j = 0
Then the function itself would be replaced by another variable defined as:
w_{j+1} ≡ exp(w_j)
If, instead, the expression under consideration was simply exp(x), then no additional
linear constraint need be created, and the new variable definition
w_j ≡ exp(x)
would suffice.
In both of the above cases, the expression would be classified as variable (class V).
On the other hand, an expression of the form
exp(α + 2β)

where α and β are constants would not be reformulated at all, and would itself be
classified as constant (class C).
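As a purely illustrative check, the sketch given earlier reproduces the treatment of exp(x + 2y) described above, generating one new linear constraint and one new variable definition:

# exp(x + 2y): the argument is classified as a linear expression (X), so it is
# replaced via the linear constraint x + 2y - w1 = 0, and the exp(.) node by the
# nonlinear definition w2 == exp(w1).
arg = BinaryTree("BinaryOp", "+",
                 left=BinaryTree("Leaf", "x"),
                 right=BinaryTree("BinaryOp", "*",
                                  left=BinaryTree("Leaf", 2.0),
                                  right=BinaryTree("Leaf", "y")))
expr = BinaryTree("UnaryOp", "exp", right=arg)
classify_reformulate(expr)
print(len(linear_constraints), len(variable_definitions), expr.content)   # -> 1 1 w2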
As a further example, consider the expression:

(x + 2y + z) / (x + y)
This involves the ratio of two linear sub-expressions, and therefore corresponds to
the penultimate row of Table 1. This then indicates that we need to create linear
constraints for each sub-expression:
x + 2y + z - w_j = 0
x + y - w_{j+1} = 0

thus introducing two new variables w_j and w_{j+1}. We also need to replace the entire
expression by a new variable defined in terms of a linear fractional term:

w_{j+2} ≡ w_j / w_{j+1}
Finally, consider the expression:

x (2y + z)
In this case, the left sub-tree is simply a variable. We therefore need to define a
linear constraint to replace the right sub-tree by a new variable, and then introduce a
definition of another new variable in terms of a bilinear product:
2y + z - w_j = 0
w_{j+1} ≡ x w_j
As the entries in the fourth column of Table 1 indicate, only three types of new
variable definition may arise from the reformulation of a binary operator. These
correspond to the bilinear form xy, the linear fractional form x/y and the power form
x^y respectively. To these, we have to add a fourth type created by the reformulation
of unary operators.
The above reformulation of the original problem is exact. It is also completely linear
with the exception of the last item which has collected all the nonlinearities and
nonconvexities of the original problem in a single list. Each element of the latter
belongs to one of four special types, and we therefore need to consider the derivation
of convex upper and lower bounds for each of these.
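Purely as an illustration, one way of representing the entries of this list in code is sketched below; the record layout and field names are assumptions of the sketch, chosen only to mirror the four types just mentioned.

from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Definition:
    """One entry of the list of nonlinear definitions produced by the reformulation."""
    w: str                            # name of the defined variable, e.g. "w7"
    kind: str                         # "bilinear", "fractional", "power" or "univariate"
    y: str                            # first operand (a variable name)
    z: Optional[str] = None           # second operand, unused for univariate terms
    f: Optional[Callable] = None      # the univariate function itself, e.g. math.exp
    exponent: Optional[float] = None  # constant exponent for power terms y**c

# e.g. the bilinear definition w_{j+1} == x*w_j arising from x(2y + z) above
# could be stored as Definition(w="w2", kind="bilinear", y="x", z="w1")  (taking j = 1)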
We generally assume that the original variables x are supplied with physically
meaningful lower and upper bounds, x^l and x^u. Although no such bounds are
available for the variables w introduced by the reformulation procedure, these may
well be necessary for the construction of the convex relaxation of the original NLP.
The rest of this section is concerned with deriving convex bounds for each type of
nonlinear term, and obtaining upper and lower bounds for the W variables.
Figure 5  The function f(z) over [z^l, z^u]: (a) Concave Univariate Function; (b) Convex Univariate Function
We assume here that each univariate function f(·) is either concave or convex over the
range of interest. Although this excludes certain functions (e.g. trigonometric functions),
it actually includes most of the univariate functions that are commonly encountered
in process engineering problems (e.g. ln(·), exp(·), √(·)).
We also assume that the functions are well defined over the entire domain of their
argument z, a fact that may be exploited to tighten the bounds on z if necessary.
For concave univariate functions, the secant provides the lower bound for w_j while the
function f(z) itself acts as the upper bound (see Figure 5):

f(z^l) + [(f(z^u) - f(z^l)) / (z^u - z^l)] (z - z^l)  ≤  w_j  ≤  f(z)

On the other hand, for purely convex functions, we have the bounds

f(z)  ≤  w_j  ≤  f(z^l) + [(f(z^u) - f(z^l)) / (z^u - z^l)] (z - z^l)
The above bounds represent convex relaxations of the definition of w_j. The definition
can also be used to derive upper and lower bounds on the w_j variable itself. In fact,
most common univariate functions are monotonic, and in these cases we have simply:

w_j^l = min( f(z^l), f(z^u) ) ,   w_j^u = max( f(z^l), f(z^u) )
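As a minimal sketch (with hypothetical function names, and assuming f is concave or convex over [zl, zu] and monotonic for the end-point bounds), this construction can be written as:

import math

def secant(f, zl, zu):
    """Return (slope, intercept) of the chord of f between zl and zu."""
    slope = (f(zu) - f(zl)) / (zu - zl)
    return slope, f(zl) - slope * zl

def univariate_relaxation(f, zl, zu, concave):
    """Bounds for a term w == f(z), z in [zl, zu]."""
    a, c = secant(f, zl, zu)
    if concave:                      # chord lies below the function
        lower = ("secant", a, c)     # w >= a*z + c
        upper = ("function", f)      # w <= f(z)
    else:                            # convex: function below, chord above
        lower = ("function", f)      # w >= f(z)
        upper = ("secant", a, c)     # w <= a*z + c
    # for monotonic f, bounds on w itself come from the interval end-points
    wl, wu = min(f(zl), f(zu)), max(f(zl), f(zu))
    return lower, upper, (wl, wu)

# example: w == ln(z) on [1, 10] (concave, monotonically increasing)
print(univariate_relaxation(math.log, 1.0, 10.0, concave=True))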
Consider next bilinear definitions of the form w_j ≡ yz, where y, z ∈ {x, w} are single
variables. In this case, we employ the linear bounds proposed by McCormick [15]:

w_j ≥ y^l z + z^l y - y^l z^l
w_j ≥ y^u z + z^u y - y^u z^u
w_j ≤ y^l z + z^u y - y^l z^u
w_j ≤ y^u z + z^l y - y^u z^l

Bounds for the new variable w_j itself are derived from the bounds of y and z:

w_j^l = min( y^l z^l, y^l z^u, y^u z^l, y^u z^u )
w_j^u = max( y^l z^l, y^l z^u, y^u z^l, y^u z^u )
The convex nonlinear over- and underestimators recently proposed by Quesada and
Grossmann [21] could also be used for constructing the convex relaxation of Wj = yz.
However, because of the way the bounds on Wj are derived from the bounds of y and
z, these nonlinear estimators are initially weaker than their linear counterparts listed
above (see Property 3 and Corollary 2 in [21]). On the other hand, if the branch and
bound algorithm (see Section 4) branches on the Wj variable, thereby reducing its
range, then the convex nonlinear bounds may become non-redundant and should then
be included in the relaxation.
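For reference, a small sketch writing out these standard McCormick envelopes and the interval-product bounds is given below; the (coefficient of y, coefficient of z, constant) representation is an assumption of the illustration.

def mccormick_envelopes(yl, yu, zl, zu):
    """Linear estimators of w == y*z, as triples (cy, cz, c):
    the first two mean w >= cy*y + cz*z + c, the last two w <= cy*y + cz*z + c."""
    under = [(zl, yl, -yl * zl),    # w >= zl*y + yl*z - yl*zl
             (zu, yu, -yu * zu)]    # w >= zu*y + yu*z - yu*zu
    over = [(zu, yl, -yl * zu),     # w <= zu*y + yl*z - yl*zu
            (zl, yu, -yu * zl)]     # w <= zl*y + yu*z - yu*zl
    return under, over

def bilinear_bounds(yl, yu, zl, zu):
    """Bounds on w == y*z derived from the bounds of y and z."""
    corners = [yl * zl, yl * zu, yu * zl, yu * zu]
    return min(corners), max(corners)

print(bilinear_bounds(1.0, 10.0, 1.0, 10.0))    # -> (1.0, 100.0)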
A similar treatment applies to linear fractional definitions of the form w_j ≡ y/z. The
nonlinear estimators proposed by Quesada and Grossmann [21] take the form (written
here for positive bounds on y and z):

w_j ≥ y/z^u + y^l/z - y^l/z^u
w_j ≥ y/z^l + y^u/z - y^u/z^l
w_j ≤ y/z^l + y^l/z - y^l/z^l
w_j ≤ y/z^u + y^u/z - y^u/z^u
As noted in [21] (see Property 1 and Corollary 1), these are, in fact, stronger than
the linear estimators if the bounds on Wj are calculated in the manner shown above.
However, for a given combination of the signs of the bounds for y and z, only two
of these four nonlinear constraints are convex and can therefore be included in the
relaxation.
For power terms of the form w_j ≡ y^z in which the exponent z is a constant, bounds on
the new variable follow directly from the bounds on y:

w_j^l = min( (y^l)^z, (y^u)^z )
w_j^u = max( (y^l)^z, (y^u)^z )
The more general case in which both y and z are variables rarely occurs in practical
process models. It can be handled by writing w_j ≡ y^z as

ln(w_j) - z ln(y) = 0
and reformulating this constraint further using the algorithm of Section 2.
If a variable w_j appears with non-negative coefficients β_ij in all such linear inequality
constraints i without appearing in any other linear constraint, then it suffices to include
its underestimator(s) in the relaxed formulation. Similarly, if all β_ij are non-positive
and again, Wj does not appear in any equality constraint, then we only need to consider
its overestimator(s).
It is interesting to consider the relaxation of problems in which all the equality
constraints g(x) = 0 are linear, and all the inequality constraints h(x) ≤ 0 are convex.
Such problems are, of course, convex, and do not actually require the application of
global optimisation techniques. Fortunately, it can be shown that, in many such cases,
the relaxations derived by our techniques will be exact. Consider, for instance, any
constraint of the form

Σ_k [ α_k / L_k(x) + β_k F_k^cx( M_k(x) ) - γ_k F_k^cc( N_k(x) ) ] + P(x) ≤ c

where α_k, β_k and γ_k are non-negative constant coefficients, L_k(x), M_k(x), N_k(x)
and P(x) are general linear expressions, F_k^cx(·) and F_k^cc(·) are respectively convex
and concave univariate functions, and c is a constant.
It can be verified that the application of the symbolic reformulation procedure to such
a constraint will result in the following set of linear constraints
L_k(x) - w_k^(1) = 0
M_k(x) - w_k^(2) = 0
N_k(x) - w_k^(3) = 0
where the new variables w_k^(λ), w̄_k^(λ), λ = 1, 2, 3, are related through the definitions:

w̄_k^(1) ≡ 1 / w_k^(1)
w̄_k^(2) ≡ F_k^cx( w_k^(2) )
w̄_k^(3) ≡ F_k^cc( w_k^(3) )
Let us now consider relaxing this problem. We note that the first definition is a special
form of a linear fractional term in which the numerator is a constant. Thus, in terms
of the notation of Section 3.3, we have y = y^l = y^u = 1 and can identify z ≡ w_k^(1).
Therefore, both of the Quesada and Grossmann [21] nonlinear underestimators applied
to this case reduce simply to:

w̄_k^(1) ≥ 1 / w_k^(1)
Similarly, because of the signs of β_k and γ_k, it suffices to relax the definition of
w̄_k^(2) by the nonlinear underestimator of the convex function F_k^cx(·), and that of
w̄_k^(3) by the nonlinear overestimator of the concave function F_k^cc(·). In both cases, as
explained in Section 3.1, these estimators are the nonlinear functions themselves, and
we therefore have the relaxations:

w̄_k^(2) ≥ F_k^cx( w_k^(2) )
w̄_k^(3) ≤ F_k^cc( w_k^(3) )
Now, the above relaxations, together with the set of linear constraints generated by the
reformulation, are exactly equivalent to the original convex nonlinear constraint. We
therefore conclude that the application of our reformulation/relaxation technique to a
purely convex problem of this form will result in an exact relaxation. Consequently, a
branch and bound algorithm of the type detailed in Section 4 below, will converge to
the optimal point in one iteration without ever requiring any branching.
Step 1: Initialisation
Initialise the list of subregions L to a single region R covering the full domain of
the variables x, w: R ≡ [x^l, x^u] × [w^l, w^u].
Step 6: Branching
Apply a branching rule to subregion R to choose a variable and its corresponding
value on which to branch. Add the two new subregions generated by partitioning
R at this variable value to the list L.
Step 7: Delete subregion
Delete the current subregion R from the list L. Go to step 2.
Step 8: Termination
If Φ^U = ∞, the problem is infeasible.
Otherwise the solution is x* with an objective function value of Φ^U.
The basic concepts of this type of algorithm have already been described in detail
by several authors [21, 22, 23]. In our implementation, the subregion R selected to
be examined next at step 2 is the one with the lowest lower bound Φ^L_R in the list of
pending subregions L. Also, the rule used at step 6 to select the variable on which the
algorithm will branch next is that of Ryoo and Sahinidis [23]. This selects the variable
with the highest contribution to the gap between the objective function values of the
relaxed and exact problems.
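The fragment below sketches one plausible way of implementing such a rule, scoring each variable by the violation of the nonlinear definitions in which it appears at the solution of the relaxed problem; this is only an illustrative proxy for the rule of [23], not the authors' implementation.

def select_branching_variable(definitions, relaxed_point):
    """definitions: list of (w, expr_fn, variable_names) for each nonlinear definition;
    relaxed_point: dictionary mapping variable names to their values at the
    solution of the relaxed problem."""
    score = {}
    for w, expr_fn, variable_names in definitions:
        violation = abs(relaxed_point[w] - expr_fn(relaxed_point))
        for v in variable_names:
            score[v] = score.get(v, 0.0) + violation
    return max(score, key=score.get)

# e.g. a single bilinear definition w == x*y, violated at the relaxed point:
defs = [("w", lambda p: p["x"] * p["y"], ["x", "y"])]
print(select_branching_variable(defs, {"w": 2.0, "x": 1.0, "y": 1.0}))    # x and y tie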
Step 4 of the algorithm establishes a lower bound Φ^L_R on the objective function by
solving the relaxed problem. Depending on the form of the estimators used in the
relaxation, this may be either a linear or a convex nonlinear programming problem.
At step 5, we attempt to establish an upper bound Φ^U_R on the objective function. This
can be done cheaply if the solution of the relaxed problem, x^R, happens to be feasible
with respect to the original problem. In this case, Φ^U_R can be calculated simply by
evaluating the objective function of the original problem at x^R.
If the gap Φ^U_R - Φ^L_R still exceeds the optimality tolerance ε, there may be scope
for a better upper bound. This we try to obtain by solving the original nonconvex NLP
using an iterative local optimisation algorithm, with x^R as the initial point for the
iteration. Of course, given the nonconvexity of the problem, there is no guarantee that
such an algorithm will converge, or, even if it does, that the solution obtained will be
better than the current upper bound for this region.
In any case, once an upper bound is established, we check to see if it improves on the
global upper bound Φ^U. If so, we update the best solution x* found so far and prune
the list L to remove any clearly inferior regions.
Finally, if the lower and upper bounds for this region are within the given optimality
margin, no further branching within R is necessary and it can be deleted from the list
L (step 7).
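Putting the steps together, the overall loop might be organised as in the following sketch. The region object, the solver callbacks (tighten, solve_relaxation, solve_local, branch) and the tolerance handling are assumptions of the sketch rather than details of the actual implementation.

import math

def spatial_branch_and_bound(root_region, tighten, solve_relaxation, solve_local,
                             branch, eps):
    root_region.lower_bound = -math.inf
    regions = [root_region]                   # step 1: list of pending subregions
    best_x, best_ub = None, math.inf          # incumbent solution and global upper bound
    while regions:                            # step 8 is reached when the list is empty
        # step 2: select the pending region with the lowest lower bound
        R = min(regions, key=lambda r: r.lower_bound)
        tighten(R)                            # step 3: bounds tightening
        phi_L, x_R = solve_relaxation(R)      # step 4: lower bound over R
        R.lower_bound = phi_L
        if phi_L >= best_ub - eps:            # R cannot improve on the incumbent
            regions.remove(R)
            continue
        phi_U, x_feas = solve_local(R, x_R)   # step 5: upper bound (math.inf if none found)
        if phi_U < best_ub:                   # new incumbent: update and prune the list
            best_ub, best_x = phi_U, x_feas
            regions = [r for r in regions if r.lower_bound < best_ub - eps]
            if R not in regions:
                continue
        if phi_U - phi_L > eps:               # step 6: branch on R ...
            children = branch(R, x_R)
            for child in children:
                child.lower_bound = phi_L     # children inherit the parent's lower bound
            regions.extend(children)
        regions.remove(R)                     # step 7: ... and delete it from the list
    if best_ub == math.inf:
        return None                           # the problem is infeasible
    return best_x, best_ub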
The next subsection deals in detail with the bounds tightening procedure applied at
step 3.
from which we can attempt to tighten the bounds on z_k through relations of the form:

IF a_k > 0 THEN
    z_k^u := min ( z_k^u , (1/a_k) [ b - Σ_{j≠k} min( a_j z_j^l , a_j z_j^u ) ] )
The above tightening is based entirely on the linear constraints in the original problem.
One of the effects of the symbolic reformulation procedure presented in Section 2
is that the nonlinearities are extracted from the objective function and constraints,
and are all collected together in a list of simple nonlinear definitions. Here we are
interested in using this information for further bounds tightening.
Consider, for instance, the nonlinear constraint
xy+z=5
where x, y, and z are variables in the range [1, 10]. Applying the symbolic reformulation
procedure will result in the linear constraint:
w+z=5
where the new variable w := xy is in the range [1, 100] (cf. Section 3.2).
By applying feasibility-based tightening to the now linear constraint rearranged as
w = 5 - z, we can reduce w^u to 4. Similarly, from z = 5 - w, we reduce z^u also to 4.
Now, consider the definition w := xy rearranged as either x = w/y or y = w/x.
From these, we can easily deduce that the upper bounds of both x and y can be
reduced from 10 down to 4. The final result of the tightening procedure is therefore
x, y, z, w ∈ [1, 4].
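This worked example can be reproduced with a few lines of illustrative Python (the helper names and the bound representation are hypothetical, and the division steps assume non-zero denominator bounds):

def tighten_linear_sum(bnds, a, b):
    """Feasibility-based tightening of variable bounds from  sum_k a_k*v_k = b."""
    for k, ak in a.items():
        rest_lo = sum(min(aj * bnds[j][0], aj * bnds[j][1]) for j, aj in a.items() if j != k)
        rest_hi = sum(max(aj * bnds[j][0], aj * bnds[j][1]) for j, aj in a.items() if j != k)
        lo, hi = sorted(((b - rest_hi) / ak, (b - rest_lo) / ak))
        bnds[k] = (max(bnds[k][0], lo), min(bnds[k][1], hi))

def tighten_bilinear(bnds, w, y, z):
    """Tighten y and z from the definition w == y*z by interval division."""
    for num, den in ((y, z), (z, y)):
        q = [bnds[w][i] / bnds[den][j] for i in (0, 1) for j in (0, 1)]
        bnds[num] = (max(bnds[num][0], min(q)), min(bnds[num][1], max(q)))

bounds = {"x": (1.0, 10.0), "y": (1.0, 10.0), "z": (1.0, 10.0), "w": (1.0, 100.0)}
tighten_linear_sum(bounds, {"w": 1.0, "z": 1.0}, 5.0)   # w + z = 5  ->  w, z <= 4
tighten_bilinear(bounds, "w", "x", "y")                 # w == x*y   ->  x, y <= 4
print(bounds)                                           # all four ranges become [1, 4]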
Overall, the reformulation procedure not only has linearised the originally nonlinear
constraint, thereby making it amenable to the application of the standard feasibility-
based tightening techniques, but also has extracted the nonlinearity to a form that can
readily be used for further tightening. In particular, the following general tightening
DESIGNING WATER DISTRIBUTION SYSTEMS 377
rules can be deduced from bilinear product definitions w ≡ yz (cf. Section 3.2)¹:

y^l := max ( y^l , min ( w^l/z^l , w^l/z^u , w^u/z^l , w^u/z^u ) )
y^u := min ( y^u , max ( w^l/z^l , w^l/z^u , w^u/z^l , w^u/z^u ) )
while for linear fractional term definitions w ≡ y/z (cf. Section 3.3), the corresponding
expressions are:

z^u := min ( z^u , max ( y^l/w^l , y^l/w^u , y^u/w^l , y^u/w^u ) )
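Continuing the illustrative sketch above (same conventions, and again assuming non-zero denominator bounds), the fractional-term rule rearranges w == y/z as y = w*z and z = y/w:

def tighten_fractional(bnds, w, y, z):
    """Tighten y and z from the definition w == y/z."""
    prod = [bnds[w][i] * bnds[z][j] for i in (0, 1) for j in (0, 1)]       # y = w*z
    bnds[y] = (max(bnds[y][0], min(prod)), min(bnds[y][1], max(prod)))
    quot = [bnds[y][i] / bnds[w][j] for i in (0, 1) for j in (0, 1)]       # z = y/w
    bnds[z] = (max(bnds[z][0], min(quot)), min(bnds[z][1], max(quot)))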
5 IMPLEMENTATION
The constraint reformulation, NLP relaxation and branch and bound algorithms
detailed in Sections 2-4 have been implemented within the gPROMS process
modelling environment [4]. This software package supports the modelling of the
transient and steady-state behaviour of processes involving both lumped and distributed
operations. In general, process models may be described in terms of mixed sets of
integral, partial and ordinary differential and algebraic equations (IPDAEs) expressed
1 It is understood that evaluation of each of these expressions will be undertaken only if all variable
bounds appearing in their denominators are non-zero.
in a high-level symbolic language [16]. Moreover, any model may involve instances
of simpler models; this allows the establishment of model hierarchies of arbitrary
depth, and provides an effective mechanism for dealing with modelling complexity.
An example of the gPROMS language is shown in Appendix A. Two types of entity
can be distinguished in this particular case. MODELs describe the physical behaviour
of the system, and, to a large extent, can be defined independently of any specific
application. On the other hand, a PROCESS defines an "experiment" to be performed
in terms of the objects being investigated (i.e. instance(s) of the MODELs), the
experimental frame (i.e. the conditions under which the investigation is to take place),
and the results generated by its execution [17, 33]. Multipurpose process modelling
environments, such as gPROMS, support several different types of experiment (e.g.
process simulation, optimisation, parameter estimation etc.). The particular PROCESS
(C1P1_IsoSeries) shown in Appendix A defines an optimisation experiment to be
carried out in an instance P of the MODEL Iso_Series_C1P1 defined earlier in the
same input file.
The solution of a global optimisation problem defined in the gPROMS language
involves several steps, all of which are performed automatically and completely
transparently to the user. First the input is translated into an internal representation
of the various generic MODEL and PROCESS entities in it. The execution of a
PROCESS (corresponding to performing the "experiment" associated with it) causes
an instantiation of the specific MODEL(s) on which it operates, generating the actual
variables and constraints involved in them. At this stage, any spatial distribution of
variables and constraints is approximated through appropriate spatial discretisation
techniques, thereby reducing the IPDAE system to a system of ordinary differential
(with respect to time) and algebraic equations. For steady-state experiments, such as
those of interest to this paper, the time variation is subsequently removed by setting
all time derivatives to zero, thus further reducing the problem description to a set of
purely algebraic constraints. The latter are then differentiated symbolically to generate
their first partial derivatives with respect to all the variables occurring in them (see,
for instance, [18]).
The above steps are common to all types of gPROMS experiment [3, 16]. They result
in a large set of nonlinear constraints (and their partial derivatives) held in binary tree
form, together with other information (such as the constraint sparsity pattern) that is
typically required by numerical algorithms.
In the specific case of global optimisation, the next step involves the application of
the symbolic reformulation procedure of Section 2 to the binary tree representation
of the objective function and constraints. Our implementation of the
CreateVariableDefinition procedure is slightly more sophisticated than that shown
in Figure 4. Thus, for reasons of efficiency in subsequent operations, separate lists
are maintained for each of the four types of variable definition. Also, before a new
variable definition is actually created, a test is carried out to prevent any duplication
of previous definitions. Thus, for instance, only one w variable representing a bilinear
product xy is ever created, irrespective of the actual number of occurrences of xy or
yx terms in the problem.
Testing of the reformulation algorithm on problems constructed from standard process
engineering models (including various complex unit operations such as plug flow
reactors, distillation columns and absorbers) indicates that its current implementation
achieves an average of over 3000 constraint reformulations per second on a SUN
SPARC 10/51 workstation.
Once the problem reformulation is completed, gPROMS proceeds with the construc-
tion of the NLP relaxation using the estimators presented in Section 3, before applying
the branch and bound algorithm of Section 4. Our current implementation of the
latter uses a sequential linear programming (SLP) method [5] for the solution of the
convex relaxation and the attempted solution of the nonconvex NLP at steps 4 and 5
respectively. The CPLEX code [6] is used for the solution of the large, sparse linear
programming subproblems within the SLP algorithm.
6 ILLUSTRATIVE EXAMPLE
This example is concerned with the design of a reactor network involving a continuous
stirred tank reactor (CSTR) and a plug flow reactor (PFR) operating in series. The
reactors operate isothermally and reactions occur in the liquid phase according to the
Van de Vusse [32] scheme shown below:

A → B → C ,   2A → D

The feed to the CSTR is 100 l/s of a dilute solution of A at a concentration of 5.8 mol/l.
The objective is to determine the reactor volumes that maximise the concentration of
component B in the outlet stream of the PFR.
Standard steady-state CSTR and PFR models were used, and these are shown in the
gPROMS input file listed in Appendix A. The CSTR model is expressed simply as
a set of nonlinear algebraic equations, but the PFR model involves a mixed set of
differential and algebraic equations (DAEs) reflecting the variation of concentrations
of the four species A, B, C, D, and other quantities (e.g. reaction rates) over the length
of the reactor. For instance, the mass conservation equation for component A is
written as:
the first eight nodes, the remainder of the nodes being examined to confirm that this
was in fact the global optimum. The solution required 2908 CPU seconds on a SUN
SPARC 10/51 workstation. Of this, 37% was taken solving the relaxed problems,
58% solving exact problems, and the remaining 5% on bounds tightening and other
house-keeping activities.
7 CONCLUSIONS
Despite the very significant progress in global optimisation techniques over recent
years, these remain applicable to relatively limited classes of mathematical problems.
This paper has presented a general and efficient symbolic manipulation algorithm that
can reformulate most process engineering optimisation problems into a form that is
amenable to solution by spatial branch and bound techniques for global optimisation. In
fact, the availability of symbolic information also provides the potential for improving
certain aspects (e.g. bounds tightening) of the numerical algorithm itself.
Of course, the potentially poor performance of spatial branch and bound algorithms
applied to large problems remains an important issue. We have not, so far, examined
in detail the effects of various algorithmic decisions on the efficiency of the algorithm.
However, it is worth stressing that the symbolic manipulation, problem relaxation
and bounds tightening algorithms presented in this paper are complementary to, and
can be used in conjunction with, other branch and bound algorithms implemented on
either sequential or distributed architecture computers [2, 7, 21, 23].
The ultimate aim of our work is to enable the automatic application of global
optimisation techniques to engineering optimisation problems without the need for
special expertise in mathematical problem formulation and solution. As a first step
in this direction, the algorithms presented have been implemented in the gPROMS
multipurpose process modelling environment. This permits the use of standard process
models for global process optimisation and also facilitates the efficient implementation
of symbolic manipulation algorithms due to the availability of a high-level symbolic
representation of the mathematical constraints.
REFERENCES
[1] A. Aggarwal and C. A. Floudas. Synthesis of general distillation sequences -
Nonsharp separations. Comput. Chem. Engng., 14:631-653, 1990.
[6] CPLEX Optimization Inc., Incline Village, NV. Using the CPLEX Callable
Library and CPLEX Mixed Integer Library, 1993.
[31] H. Vaish and C. M. Shetty. A cutting plane algorithm for the bilinear program-
ming problem. Naval Research Logistics Quarterly, 24:83-94, 1977.
[32] J. G. Van de Vusse. Plug-flow type reactor vs. tank reactor. Chemical Engineering
Science, 19:994-999, 1964.
[33] B. P. Zeigler. The Theory of Modeling and Simulation. John Wiley, New York,
1976.
APPENDIX A
gPROMS INPUT FILE FOR EXAMPLE
# --------------------------------------------------------------------
#
# EXAMPLE 2
# ---------
# Isothermal Van de Vusse Reaction - 1 CSTR & 1 PFR - Max ConcB
#
# --------------------------------------------------------------------
#
# Model of an Isothermal CSTR
#
MODEL Iso_CSTR
PARAMETER
NoComp, NoReac AS INTEGER
VARIABLE
Rate AS ARRAY (NoReac) OF Positive
Flow AS Flowrate
Conc_In, Conc_Out AS ARRAY (NoComp) OF Concentration
Volume AS Positive
EQUATION
# rates
Rate(1) = 10.0*Conc_Out(1)
Rate(2) = 1.0*Conc_Out(2)
Rate(3) = 1.0*Conc_Out(1)^2
# mass balances
Flow*(Conc_In(1) - Conc_Out(1)) = Volume*(Rate(1) + Rate(3))
Flow*(Conc_In(2) - Conc_Out(2)) = Volume*(Rate(2) - Rate(1))
Flow*(Conc_In(3) - Conc_Out(3)) = - Volume*Rate(2)
Flow*(Conc_In(4) - Conc_Out(4)) = - Volume*Rate(3)
END # model
#
# Model of an Isothermal PFR
#
MODEL Iso_PFR
PARAMETER
NoComp, NoReac AS INTEGER
DISTRIBUTION_DOMAIN
Axial AS ( 0 : 1 )
VARIABLE
UNIT
PFR AS Iso_PFR
CSTR AS Iso_CSTR
EQUATION
# define flow into CSTR
Flow_In = CSTR.Flow
Conc_In = CSTR.Conc_In ;
# flow out of CSTR is flow into PFR
CSTR.Flow = PFR.Flow
CSTR.Conc_Out = PFR.Conc_In
# flow out of PFR is product
PFR.Flow = Flow_Out
PFR.Conc_Out = Conc_Out
END # model
#
# Process to describe the optimisation experiment
#
PROCESS C1P1_IsoSeries
UNIT
P AS Iso_Series_C1P1
SET
WITHIN P DO
NoReac := 3 ;
NoComp := 4 ;
WITHIN PFR DO
# use 3rd order collocation over 5 elements
Axial := [OCFEM,3,5]
END # within
END # within
ASSIGN
WITHIN P DO
# fix feed flowrate and concentrations
Flow_In := 100 ;
Conc_In := [5.8, 0.0, 0.0, 0.0] ;
END # within
PRESET
# specify upper bounds on variables
WITHIN P DO
WITHIN PFR DO
Volume := 100.0
Rate := 350.0
END # within
WITHIN CSTR DO
Volume := 100.0
Rate := 350.0
END # within
END # within
MAXIMISE
P.PFR.Conc_Out(2)
END # process