
Global Optimization in Engineering Design

Nonconvex Optimization and Its Applications


Volume 9

Managing Editors:
Panos Pardalos
University of Florida, U.S.A.

Reiner Horst
University of Trier, Germany

Advisory Board:
Ding-Zhu Du
University of Minnesota, U.S.A.

C. A. Floudas
Princeton University, U.S.A.

G. Infanger
Stanford University, U.S.A.

J. Mockus
Lithuanian Academy of Sciences, Lithuania

H. D. Sherali
Virginia Polytechnic Institute and State University, U.S.A.

I. E. Grossmann
Carnegie Mellon University

The titles published in this series are listed at the end of this volume.
Global Optimization in
Engineering Design
Edited by

Ignacio E. Grossmann
Carnegie Mellon University

SPRINGER-SCIENCE+BUSINESS MEDIA, B.V.


Library of Congress Cataloging-in-Publication Data

Global optimization in engineering design / edited by Ignacio E. Grossmann.
p. cm. -- (Nonconvex optimization and its applications ; v. 9)
ISBN 978-1-4419-4754-3    ISBN 978-1-4757-5331-8 (eBook)
DOI 10.1007/978-1-4757-5331-8
1. Chemical engineering--Mathematics. 2. Mathematical optimization. 3. Nonlinear programming. I. Grossmann, Ignacio E. II. Series.
TP149.G55 1996
620'.0042'015197--dc20    95-48887

ISBN 978-1-4419-4754-3

Printed on acid-free paper

All Rights Reserved
© 1996 Springer Science+Business Media Dordrecht
Originally published by Kluwer Academic Publishers in 1996
No part of the material protected by this copyright notice may be reproduced or
utilized in any form or by any means, electronic or mechanical,
including photocopying, recording or by any information storage and
retrieval system, without written permission from the copyright owner.
TABLE OF CONTENTS

Preface ..................................................... vii

1. Branch and Bound for Global NLP: New Bounding LP
   T. G. W. Epperly and R. E. Swaney ....................................... 1

2. Branch and Bound for Global NLP: Iterative LP Algorithm & Results
   T. G. W. Epperly and R. E. Swaney ...................................... 37

3. New Formulations and Branching Strategies for the GOP Algorithm
   V. Visweswaran and C. A. Floudas ....................................... 75

4. Computational Results for an Efficient Implementation of the GOP Algorithm and Its Variants
   V. Visweswaran and C. A. Floudas ...................................... 111

5. Solving Nonconvex Process Optimisation Problems Using Interval Subdivision Algorithms
   R. P. Byrne and I. D. L. Bogle ........................................ 155

6. Global Optimization of Nonconvex MINLP's by Interval Analysis
   R. Vaidyanathan and M. El-Halwagi ..................................... 175

7. Planning of Chemical Process Networks via Global Concave Minimization
   M.-L. Liu, N. V. Sahinidis and J. Parker Shectman ..................... 195

8. Global Optimization for Stochastic Planning, Scheduling and Design Problems
   M. G. Ierapetritou and E. N. Pistikopoulos ............................ 231

9. Global Optimization of Heat Exchanger Networks with Fixed Configuration for Multiperiod Design
   R. R. Iyer and I. E. Grossmann ........................................ 289

10. Alternative Bounding Approximations for the Global Optimization of Various Engineering Design Problems
    I. Quesada and I. E. Grossmann ....................................... 309

11. A Pipe Reliability and Cost Model for an Integrated Approach Toward Designing Water Distribution Systems
    H. D. Sherali, E. P. Smith and S. Kim ................................ 333

12. Global Optimisation of General Process Models
    E. M. B. Smith and C. C. Pantelides .................................. 355
PREFACE
Mathematical Programming has been of significant interest and relevance in
engineering, an area that is very rich in challenging optimization problems.
In particular, many design and operational problems give rise to nonlinear
and mixed-integer nonlinear optimization problems whose modeling and solution
are often nontrivial. Furthermore, with the increased computational power
and development of advanced analysis tools (e.g., process simulators, finite element
packages) and modeling systems (e.g., GAMS, AMPL, SPEEDUP, ASCEND,
gPROMS), the size and complexity of engineering optimization models are rapidly
increasing. While the application of efficient local solvers (nonlinear programming
algorithms) has become widespread, a major limitation is that there is
often no guarantee that the solutions that are generated correspond to global
optima. In some cases finding a local solution might be adequate, but in others
it might mean incurring a significant cost penalty, or even worse, getting an
incorrect solution to a physical problem. Thus, the need for finding global optima
in engineering is a very real one.

It is the purpose of this monograph to present recent developments of techniques
and applications of deterministic approaches to global optimization in
engineering. The present monograph is heavily represented by chemical engineers,
and to a large extent this is no accident. The reason is that mathematical
programming is an active and vibrant area of research in chemical engineering.
This trend has existed for about 15 years, and currently it even appears to be
increasing! In contrast, the interest in other engineering disciplines is generally not
equally strong. Part of the reason can be attributed to the greater use in those areas
of heuristics and non-traditional optimization tools, such as simulated annealing
and genetic algorithms, which are claimed to provide satisfactory answers in
selected applications.

To see how the need for global optimization has been motivated in chemical
engineering, it is instructive to briefly follow the development of nonlinear
optimization over the last 20 years in this engineering discipline. In the
70's, pioneering research on nonlinear programming algorithms, which were
applied to process design and optimal control problems, was performed at Imperial
College. Subsequently the advent of the successive quadratic programming
algorithm spurred a great deal of interest and was first applied to chemical process
simulators at Wisconsin in the late 70's. This algorithm was also adapted to
problems with many equations and few degrees of freedom (a common case in
engineering) with decomposition approaches developed in the early 80's at
Carnegie Mellon. New techniques for mixed-integer nonlinear programming
emerged also at Carnegie Mellon in the mid to late 80's and were applied for
the first time to chemical process synthesis problems. Nonlinear programming
techniques, especially successive quadratic programming algorithms, continue
to be of active interest at Carnegie Mellon, Clarkson and Imperial College, par-
ticularly for large scale applications, such as real time optimization. Likewise,
mixed-integer nonlinear programming algorithms continue to be of interest at
Åbo Akademi, Carnegie Mellon, Dundee, Maribor and Imperial College. Ini-
tial work in global optimization was performed at Stanford in the early 70's,
while the study of implications of nonconvexities in design and their handling
in decomposition strategies were developed at Florida in the mid-70's. Inter-
est in global optimization resurfaced in the late 80's with the development of
Benders type of algorithms at Princeton. Since that time global optimization
has attracted increased attention, and work is being pursued at a number of
universities (largely represented in this monograph). It is this level of research
activity that has motivated the creation of this monograph.

The chapters of this monograph are roughly divided into two major parts: chap-
ters 1 to 6 emphasize algorithms, and chapters 7 to 12 emphasize applications.
Chapters 1 and 2 describe a novel and elegant LP-based branch and bound
approach by Epperly and Swaney for solving nonlinear programs that are ex-
pressed in factorable form. Computational experience is reported on a set of
50 test problems. Chapters 3 and 4 describe several recent enhancements and
improvements to the GOP algorithm by Visweswaran and Floudas. Details of
the implementation of the algorithm are described, as well as results of process
design and general test problems. Chapters 5 and 6 describe implementations
of methods based on interval analysis. Chapter 5 emphasizes implementations
of existing methods and the effect of inclusion functions, while chapter 6 em-
phasizes strategies for accelerating the search and quickly identifying infeasible
solutions.

As for the applications, chapter 7 presents an exact and finite branch and bound
solution approach to the planning of process networks with separable concave
costs. Chapter 8 by Ierapetritou and Pistikopoulos deals with a number of
stochastic planning and scheduling models which are shown to obey convexity
properties when discretization and decomposition schemes are used on two-
stage programming formulations. Chapter 9 by Iyer and Grossmann presents
the extension of the global optimization method for heat exchanger networks by
Quesada and Grossmann to the case of multiperiod design problems. Chapter
10 by Quesada and Grossmann explores alternative bounding approximations
for their algorithm for linear fractional and bilinear functions which is applied
to problems in layout design, design of truss structures, and multiproduct batch
design. Chapter 11 by Sherali, Smith and Kim outlines a comprehensive
solution approach to the design of reliable water distribution systems in which a
number of nonconvex optimization subproblems are identified. Finally, Chap-
ter 12 by Smith and Pantelides presents a symbolic manipulation algorithm
for the systematic reformulation of large structured models, and its implemen-
tation in gPROMS for solution with branch and bound algorithms for global
optimization.

We believe that this monograph provides a good overview of the current state-
of-the-art of deterministic global optimization techniques for engineering de-
sign.

Ignacio E. Grossmann
Carnegie Mellon University
1
BRANCH AND BOUND FOR
GLOBAL NLP: NEW BOUNDING LP
Thomas G. W. Epperly
Ross E. Swaney
Department of Chemical Engineering
University of Wisconsin
Madison, Wisconsin

We present here a new method for bounding nonlinear programs which forms the
foundation for a branch and bound algorithm presented in the next chapter. The
bounding method is a generalization of the method proposed by Swaney [34] and
is applicable to NLPs in factorable form, which include problems with quadratic
objective functions and quadratic constraints as well as problems with twice differ-
entiable transcendental functions. This class of problems is wide enough to cover
many useful engineering applications including the following which have appeared
in the literature: phase and chemical equilibrium problems [5, 15, 16], complex re-
actor networks [5], heat exchanger networks [5, 23, 38], pool blending [36, 37], and
flowsheet optimization [5, 28]. Reviews of the applications of general nonlinear and
bilinear programs are available in [1, 5].

Although the problem of finding the global optimum of nonconvex NLPs has been
studied for over 30 years, still relatively few concrete algorithms have been proposed
and tested for solving problems where the feasible region is a general compact non-
convex set. Much of the research has focused on concave programs with a convex
feasible region, concave and indefinite quadratic programs, and bilinear programs.
Many of these methods have been summarized in the following review papers or
monographs [11, 6, 12, 10, 21, 22]. While these algorithms have many possible
applications, there are many engineering design problems that do not fit the re-
quirements of these methods because of general nonconvex objective functions or
nonconvex feasible regions.

Recent deterministic approaches to solving general global NLPs in engineering have
tended along one of two lines. The first approach involves application of generalized
Benders decomposition (GBD) to the global optimization problem [8]. Floudas and
Visweswaran [7] treated nonconvexities by splitting the variables into two sets such
that fixing one set of variables causes the problem to be convex in the other set. The
algorithm solves a sequence of subproblems and relaxed dual subproblems to obtain
the global solution. A possible limitation with this method is that the number of
relaxed dual subproblems required during each iteration may increase exponentially
with the number of variables appearing in bilinear terms. To address this difficulty,
Visweswaran and Floudas [37] introduced some properties to reduce the number of
relaxed dual subproblems required in most cases.

The second line of approaches involves various branch and bound algorithms applied
to the continuous variable domain. These are distinguished by how they obtain
bounds and how they partition the variable domain. Bounding approaches fall
into two main groups, using either interval mathematics or convex underestimating
programs. Interval mathematics provides tools for placing bounds on the objective
function and restricting the feasible region; these methods have been summarized
in the following review articles and monographs [9, 25, 26, 27]. Interval methods
have been applied recently to some engineering design problems in [35]. The tools
offered by interval mathematics have several uses in global optimization.

The other group of bounding techniques is based on convex underestimating pro-
grams. Falk and Soland [4] used convex envelopes of separable functions to provide
bounds for problems with nonconvex objective functions and convex feasible re-
gions, and Soland [33] extended their work to separable nonconvex constraints.
McCormick [13] then introduced a general method for constructing convex/concave
envelopes for factorable functions, thereby removing the need for separability, and
in [14] presented a branch and bound algorithm based on these envelopes. More
recently, Swaney [34] presented new bounding functions based on McCormick's
envelopes and positive definite combinations of quadratic terms to improve conver-
gence and to provide finite termination even when the minimum is not determined
by a full set of active constraints.

Sherali and Alameddine [31] developed the reformulation-linearization technique
(RLT) for developing a tight LP to bound bilinear programs, and Sherali and Tunc-
bilek [32] extended this work to NLPs constructed from polynomials. The RLT
produces tighter bounds than McCormick's envelopes at the cost of possible ex-
ponential growth in the number of constraints required. Quesada and Grossmann
[24] presented a bounding method for bilinear and fractional NLPs using a com-
bination of McCormick's envelopes and additional estimators based on projections
of the feasible space, and developed methods for determining which constraints are
nonredundant. The additional estimators are equivalent to RLT for bilinear terms,
but only a limited number of these types of constraints are included. Both linear
and convex nonlinear estimators are incorporated into their convex underestimating
NLP. Ryoo and Sahinidis [29] introduced methods to reduce the variable domain at
each iteration based on optimality and feasibility criteria.

The tightness of the bounding strategy used by a branch and bound algorithm is crit-
ical to its success. Tighter bounding functions reduce the need for partitioning,
decreasing the computational effort required. Swaney [34] identified and remedied
a problem that may occur when McCormick's envelopes are constructed at minima
with less than a full active set, but the covering program involved had to be con-
structed at a local minimum, where the projection of the Hessian of the Lagrangian
in the unconstrained directions is positive semi-definite. In this chapter, we relax
this requirement.

Branch and bound algorithms typically get an upper bound on the solution from the
best known feasible point, so some branch and bound algorithms use a local NLP
method to find good feasible points. The algorithm presented here incorporates
global bounding information into the search for local minima and feasible points.

The bounding method here is developed for branch and bound algorithms using
rectangular domain regions. An underestimating LP based on McCormick's convex
envelopes and additional constraints formed from positive definite combinations of
quadratic terms is used to provide a lower bound for the original problem over a
rectangular region. Two different approaches are used in the search for feasible
points, one using MINOS 5.4 [17] and a second one using the underestimating LP
in an iterative scheme. These aspects are treated in the next chapter.

The key features of the bounding method are its ease of evaluation, its wide appli-
cability, and its ability to provide the exact lower bound over a region of finite size.
Because the bounding method uses an LP, it can be reliably and efficiently solved
using existing algorithms. It can be applied to a large class of problems because it
only requires them to be in factorable form. Lastly, the underestimating LP is de-
signed to provide the exact lower bound when built around the global minimum of a
region of finite size, enabling finite termination of the branch and bound algorithm.

The development of the underestimating LP will be presented as follows: in Sec-
tion 1, an LP underestimator for a particular orthant (one of the 2^n sub-rectangles of
the current rectangular region) is developed using McCormick's convex envelopes.
The difficulty with unconstrained minima is demonstrated, and a variable transfor-
mation and new constraints based on positive definite combinations of quadratic
terms are presented to solve the problem.

In Section 2, a method to combine the 2^n orthants into a single underestimating LP
is presented. The combination of the orthant programs can be viewed as an LP with
interval coefficients, so a method for transforming an LP with interval coefficients
into an LP with real coefficients is developed. The interval form of the orthant pro-
gram is simplified and tightened by eliminating some of the variables, and then it is
transformed into a conventional LP with fixed (noninterval) coefficients. This trans-
formation avoids the solution of 2^n subproblems while retaining finite termination
characteristics for nonconvex NLPs.

The resulting underestimating LP will be used in the branch and bound algorithm
presented in the following chapter. It is used to provide lower bounds for each
region, and is also used iteratively in a search for local minima and feasible points.

1 THE ORTHANT PROGRAM


In this section, an LP to calculate a lower bound on an NLP for a particular orthant of
the rectangular variable domain will be developed and explained. Given a point in
the interior of a rectangular variable subdomain, the region is hypothetically split
into 2^n n-dimensional rectangles corresponding to the orthants formed by a set of
rectangular axes centered at the point. The set of these 2^n orthant LPs could be
used jointly to obtain a bound over a subdomain, but this is clearly impractical for
large problems. The orthant programs are mainly useful as a step in developing the
covering program presented in the following section.

The procedure to develop the orthant program from the NLP is summarized as
follows. The first step is to transform the original problem into a quadratic NLP,
an NLP with a quadratic objective and quadratic constraints. Next, using a variable
transformation, the quadratic terms are separated into two groups: those with
gradients in the range space of the constraint gradients, and those with gradients
in the null space of the constraint gradients. Linear bounds for the constraint
space quadratic terms are constructed using McCormick's convex envelopes for a
bilinearity, while linear bounds for the null space quadratic terms are constructed
from positive definite combinations. Both types of constraints are combined into an
LP which underestimates the original NLP for a particular orthant. The details of
each of these steps are presented below.

1.1 Transformation to Quadratic NLP

It is assumed that the problem is well scaled and written in the following factorable
form:

   min  x_0   (1.1)
   s.t. g_i(x) = b̄^{iT} x + x^T H^{(i)} x + Σ_{j∈F_i} f_j(x_j) ≤ 0,   i ∈ 1…m   (1.2)
        x ≥ x̲   (1.3)
        x ≤ x̄   (1.4)

McCormick [14] shows how many problems can be written in this form. Equality
constraints are written as two opposing inequality constraints. H^{(i)} is chosen to be a
symmetric matrix, and the functions f_j(x_j) are nonlinear, single variable functions.

For this algorithm, f_j(x_j) must be twice differentiable and able to be underestimated
over a specified domain interval by a quadratic function. This problem form allows
for a wide class of problems to be considered.
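As a small worked illustration (this specific constraint is invented here, not taken from the chapter), the constraint x₁x₂ + e^{x₃} ≤ 3 fits form (1.2) with b̄ = 0, H₁₂ = H₂₁ = ½ (all other entries zero) to capture the bilinear term, and the single variable nonlinear function f₃(x₃) = e^{x₃} − 3:

   g(x) = x^T H x + f₃(x₃) = x₁x₂ + e^{x₃} − 3 ≤ 0,   F = {3}.

Here f₃ is twice differentiable and can be underestimated by a quadratic on any bounded interval, as required.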

In addition to the problem definition, the transformation needs a specific region
defined by [x^L, x^U] ⊆ [x̲, x̄], a point x̂ ∈ [x^L, x^U], an estimate û of the Lagrange
multipliers at x̂, and an estimate of the active set, Î_A and I_A:

   Î_A = { i ∈ 1…n | (x̂_i = x_i^L, û_i > 0) or (x̂_i = x_i^U, û_i < 0) }   (1.5)
   I_A = { i ∈ 1…m | g_i(x̂) = 0, û_{i+n} > 0 }   (1.6)

These definitions assume that the first n components of û correspond to the vari-
able bounds, followed by m components corresponding to the general constraints.
Swaney's algorithm [34] required that x̂ be a local minimum of (1.1) and that û and
Î_A, I_A be the corresponding Lagrange multipliers and active set. The orthant pro-
gram presented here is more general because it can be applied at any point, including
infeasible ones; when x̂ is a local minimum, it is equivalent to the earlier version.

The first step in the transformation is to replace the single variable, nonlinear
functions with underestimating quadratics. The Taylor series expansion of f_j(x_j)
about x̂ can be written as

   f_j(x_j) = f_j(x̂_j) + (df_j/dx_j)|_{x̂} (x_j − x̂_j) + ½ (d²f_j/dx_j²)|_{x̂} (x_j − x̂_j)² + ζ_j(x_j)   (1.7)

where ζ_j(x_j) holds the higher order terms. Using the expansion and defining

   b̂^i = b̄^i + 2 H^{(i)} x̂ + Σ_{j∈F_i} (df_j/dx_j)|_{x̂} e_j,   (1.8)

g_i can be rewritten exactly as

   g_i(x) = g_i(x̂) + b̂^{iT} (x − x̂) + (x − x̂)^T ( H^{(i)} + ½ Σ_{j∈F_i} (d²f_j/dx_j²)|_{x̂} e_j e_j^T ) (x − x̂) + Σ_{j∈F_i} ζ_j(x_j)   (1.9)

where e_j is the j'th column of an n × n identity matrix. Because the functions f_j
are single variable functions, the matrix ½ (∂²f/∂x²)|_{x̂} is diagonal. A quadratic under-
estimator for g(x) can be constructed by replacing the functions ζ_j with quadratic
underestimators. This is accomplished by replacing the second order and all higher
order terms with the tightest quadratic underestimation

   ζ_j(x_j) ≥ δ̂_j (x_j − x̂_j)²   (1.10)

where δ̂_j is the largest value for which the right hand side underestimates ζ_j over
the range [x_j^L, x_j^U]. With the following definition,

   ĥ^{(i)} = H^{(i)} + Σ_{j∈F_i} ( ½ (d²f_j/dx_j²)|_{x̂} + δ̂_j ) e_j e_j^T,

the quadratic underestimation of g(x) can be written as

   ĝ_i(x) = g_i(x̂) + b̂^{iT} (x − x̂) + (x − x̂)^T ĥ^{(i)} (x − x̂) ≤ g_i(x).   (1.11)
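A coefficient like δ̂_j can be estimated numerically. The following is a minimal sketch by grid search, assuming the remainder ζ_j is available as a function; the name delta_hat, the grid resolution, and the exp example are illustrative assumptions, not the chapter's implementation.

import numpy as np

# Minimal sketch (not the chapter's code): grid search for the largest
# delta with delta*(x - xhat)**2 <= zeta(x) over [xL, xU].
def delta_hat(zeta, xhat, xL, xU, npts=1001):
    xs = np.linspace(xL, xU, npts)
    xs = xs[xs != xhat]                       # avoid division by zero
    return np.min(zeta(xs) / (xs - xhat) ** 2)

# Example: f(x) = exp(x) about xhat = 0; the remainder beyond the
# quadratic Taylor terms is zeta(x) = exp(x) - 1 - x - x**2/2.
print(delta_hat(lambda x: np.exp(x) - 1 - x - x**2 / 2, 0.0, -1.0, 1.0))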

1.2 Variable Transformation

Now the original nonlinear program has been transformed into a quadratic NLP, and
the next task is to provide bounding functions for the quadratic terms. McCormick
[14, pages 387-416] developed the convex/concave envelopes for quadratic terms,
and these can be used to provide bounding functions. However, in this context
these bounds are insufficient for two reasons explained below. These shortcomings
may be overcome by separating the bilinearities into two groups of components,
those whose gradients lie in the space spanned by the gradients of the set of active
constraints, and those whose gradients lie in the null space. This separation is
accomplished through a variable transformation.

The first problem with McCormick's envelopes is that the underestimators match
the actual quadratic functions only on the boundaries of the region, and for the
underestimating program to remain stationary when x̂ is a global minimum, there
must be no error in the underestimation at that point. This point is demonstrated
in Figures 1, 2.a, and 2.b, which show a bilinear function, its convex envelope, and
the estimation error as a function of position. Later, w and d will be used to denote
deviations from x̂, so the error at (0,0) prevents this bound from being tight at x̂.
This difficulty is addressed below in Section 1.3 by bounding each orthant separately
using piecewise convex envelopes.

Concave terms in the Lagrangian give rise to the other problem with convex en-
velopes. If piecewise convex envelopes are generated for a function which is the
sum of a convex part and a concave part, the addition of the two bounding func-
tions is not the tightest possible linear bound for the function. Figure 3 provides
a one dimensional illustration of this difficulty, which arises in multiple dimensions
through combination of bilinear terms having individual envelope functions. This
situation is a natural occurrence in the Lagrangian, and hence stationarity of the
underestimating program may be lost, with an attendant bound gap. The problem
is not solved by using smaller regions, and is addressed below in Section 1.4.

Figure 1  Graph of bilinearity w d

Figure 2  Convex envelope of w d and underestimation error

Figure 3  Problem with McCormick's envelopes

Both of these difficulties with convex envelopes can be solved after separating the
quadratic terms into constraint space contributions and null space contributions.
To perform the necessary separation, new coordinate variables d and p are defined
for the constraint space and null space respectively. To express the relation between
d and p and x, the constraint space basis G and the null space basis N are defined
as follows:

   G = [ …, ∇g_i, …, e_{i′}, … ],   i ∈ I_A,  i′ ∈ Î_A   (1.12)

Note that ∇g_i = b̂^i and e_{i′} refers to the i′'th column of an n × n identity matrix. This
basis matrix is generally not orthogonal. The null space basis N can be calculated
from G. The rows of G are permuted, so G can be partitioned G = [G₁; G₂] with
G₁ nonsingular. N may then be taken as

   N = [ −G₁^{-T} G₂^T ; I ].   (1.13)

This definition forces the columns of N to be orthogonal to those of G. The
relation between the original variables and the transformed coordinate variables is

   x − x̂ = G d + N p.   (1.14)
By substituting (1.14) into the quadratic terms of (1.9), for each i ∈ 1,…,m

   (x − x̂)^T ĥ^{(i)} (x − x̂) = (Gd + Np)^T ĥ^{(i)} (Gd + Np)
                              = (2Gd + 2Np − Gd)^T ĥ^{(i)} Gd + p^T N^T ĥ^{(i)} N p.   (1.15)

By redefining

   d ← Gd = (x − x̂) − Np   (1.16)

and

   w = 2(x − x̂) − d,   (1.17)

(1.15) becomes

   (x − x̂)^T ĥ^{(i)} (x − x̂) = w^T ĥ^{(i)} d + p^T N^T ĥ^{(i)} N p.

Then by defining

   t_jk = p^T (N^T e_j e_k^T N) p   ∀ j,k ∈ 1,…,n   (1.18)
   q_jk = w_j d_k   ∀ (j,k) ∈ 1,…,n   (1.19)

this can be written as

   (x − x̂)^T ĥ^{(i)} (x − x̂) = Σ_{j,k∈1,…,n} ĥ^{(i)}_{jk} (t_jk + q_jk).   (1.20)

The q_jk terms hold the constraint space terms and the cross terms, and the t_jk
terms hold the null space terms. Linear bounding functions for the q_jk and t_jk
terms will be developed in Sections 1.3 and 1.4.

Using definitions (1.16) and (1.17), d, w and p can be related directly to x − x̂:

   d = (I − N(N^T N)^{-1} N^T)(x − x̂)   (1.21)
   w = (I + N(N^T N)^{-1} N^T)(x − x̂)   (1.22)
   p = (N^T N)^{-1} N^T (x − x̂)
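As a numerical illustration of this construction, below is a small sketch with random data; the array names follow the chapter, but the data and the least-squares details are assumptions for illustration only.

import numpy as np

rng = np.random.default_rng(0)
n, na = 5, 2                          # n variables, na active constraints
G = rng.standard_normal((n, na))      # columns span the constraint space
G1, G2 = G[:na, :], G[na:, :]         # row partition with G1 nonsingular

# Null space basis (1.13): N = [-G1^{-T} G2^T ; I], so that G^T N = 0
N = np.vstack([-np.linalg.solve(G1.T, G2.T), np.eye(n - na)])
assert np.allclose(G.T @ N, 0.0)

# Coordinates (1.21)-(1.22) for a deviation dx = x - xhat
dx = rng.standard_normal(n)
P = N @ np.linalg.solve(N.T @ N, N.T)     # projector onto the null space
d = (np.eye(n) - P) @ dx                  # constraint space component
w = (np.eye(n) + P) @ dx                  # w = 2*dx - d
p = np.linalg.solve(N.T @ N, N.T @ dx)    # null space coordinates
assert np.allclose(d + N @ p, dx)         # consistent with (1.14), (1.16)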

1.3 Constraint space quadratic terms

Figure 2 demonstrated the problem with using the convex envelope to bound bilin-
earities when the point of interest is in the interior of the region. This problem will
be solved in this section by dividing the region into orthants and using the convex
and concave envelopes within each orthant. The envelopes for all of the orthants
will be combined into a single program in Section 2.

The quadratic terms q_jk to be bounded come from the constraint space and cross
terms as defined in (1.19). If w_j ∈ [w_j^L, w_j^U] and d_k ∈ [d_k^L, d_k^U], McCormick [14]
provides the convex and concave envelopes

   q_jk ≥ max{ d_k^L w_j + w_j^L d_k − w_j^L d_k^L,  d_k^U w_j + w_j^U d_k − w_j^U d_k^U }   (1.23)
   q_jk ≤ min{ d_k^U w_j + w_j^L d_k − w_j^L d_k^U,  d_k^L w_j + w_j^U d_k − w_j^U d_k^L }   (1.24)

The bounds on w and d are calculated by evaluating (1.22) and (1.21) with interval
arithmetic:

   [w^L, w^U] = (I + N(N^T N)^{-1} N^T)([x^L, x^U] − x̂)
   [d^L, d^U] = (I − N(N^T N)^{-1} N^T)([x^L, x^U] − x̂)

These bounds on w and d superscribe the region [x^L − x̂, x^U − x̂].
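For concreteness, here is a minimal sketch of these envelope cuts; the function name and the (a, b, c) cut representation are assumptions for illustration, not the chapter's code.

# Each cut (a, b, c) encodes the plane a*w + b*d - c; the first pair
# underestimates q = w*d as in (1.23), the second overestimates it (1.24).
def mccormick_cuts(wL, wU, dL, dU):
    under = [(dL, wL, wL * dL),    # q >= dL*w + wL*d - wL*dL
             (dU, wU, wU * dU)]    # q >= dU*w + wU*d - wU*dU
    over  = [(dU, wL, wL * dU),    # q <= dU*w + wL*d - wL*dU
             (dL, wU, wU * dL)]    # q <= dL*w + wU*d - wU*dL
    return under, over

under, over = mccormick_cuts(-2.0, 2.0, -2.0, 2.0)
w, d = 0.5, -1.0
assert all(a * w + b * d - c <= w * d for (a, b, c) in under)
assert all(a * w + b * d - c >= w * d for (a, b, c) in over)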


Figure 4  Piecewise convex envelope for bilinearity w d and its error

The next step in the bounding procedure is to split the variable region into orthants
around x̂ and then to develop convex and concave envelopes for each orthant. The
combined result of this is a piecewise convex/concave envelope. The resulting piece-
wise convex envelope is demonstrated in Figure 4 along with the estimation error.
As desired, there is no estimation error at x̂.

Below are the linear bound constraints that are added to the orthant underestimat-
ing program. σ_jk indicates the direction of support that is needed:

   σ_jk = sign( Σ_i û_i ĥ^{(i)}_{jk} ) = sign( γ̄_jk )   (1.25)
   c¹_jk w_j + c²_jk d_k ≤ σ_jk q_jk   (1.26)
   c³_jk w_j + c⁴_jk d_k ≤ σ_jk q_jk   (1.27)

The coefficients c¹_jk, …, c⁴_jk are the per-orthant envelope coefficients, with
c¹_jk, c³_jk drawn from the orthant bounds on d_k and c²_jk, c⁴_jk from the orthant
bounds on w_j: for σ_jk = 1 they are selected according to whether d′^U_k = d_k^U or
d′^L_k = d_k^L (1.28), and for σ_jk = −1 analogously (1.29).

Constraints (1.26) and (1.27) have first order gaps in the directions of the space
spanned by the gradients of the active constraints, so Lagrange multiplier adjust-
ments will compensate for those deviations in the stationarity conditions. This point
will be demonstrated in Section 1.5, where the first order optimality conditions for
the orthant program will be presented.

1.4 Null space quadratic terms

The remaining quadratic terms are those in the null space directions. These terms
are not bounded well by termwise convex envelopes because, as demonstrated in
Figure 3, problems with any concave terms will not be tightly bound even if the
combined function behavior is convex. The bounds for these variables are developed
from positive definite combinations of the quadratic terms, derived from the overall
behavior of the program at x̂ to make a tight bound. The quadratic terms under
consideration are defined as follows:

   t_jk = p^T (N^T e_j e_k^T N) p   ∀ (j,k) ∈ 1,…,n × 1,…,n   (1.18)

Given any n × n positive semidefinite (PSD) matrix γ, the following is true by
definition:

   p^T N^T γ N p = p^T ( Σ_{j,k} γ_jk N^T e_j e_k^T N ) p   (1.30)
                 = Σ_{j,k} γ_jk t_jk   (1.31)
                 ≥ 0   (1.32)

Legitimate constraints of this form can be written for any PSD matrix γ, but only
certain γ's will give a sufficiently tight bound to support stationarity as desired.

Writing the orthant program with a single constraint of the form (1.31) provides
some insight into the choice of γ. Below is a program with only the constraints
involving γ, and its first order optimality conditions for the t_jk variables, using

   Δx = x − x̂,   Δx^U = x^U − x̂,   Δx^L = x^L − x̂:

   min  Δx₀
   s.t.
   u_i:  ḡ_i + b̄^{iT} Δx + Σ_{j,k} H̄^{(i)}_{jk} (t_jk + q_jk) ≤ 0   ∀i
   χ:    Σ_{j,k} γ_jk t_jk ≥ 0   (1.33)

   (First Order Optimality Conditions:)
   Σ_i u_i H̄^{(i)}_{jk} − χ γ_jk = 0   ∀ j,k   (1.34)
If x̂ is a local minimum of (1.1), the second order optimality conditions require that
Σ_i û_i N^T ĥ^{(i)} N be PSD, and choosing

   γ̄ = Σ_i û_i ĥ^{(i)}   (1.35)

will satisfy (1.34) at Δx = 0 if H̄^{(i)} = ĥ^{(i)} and u = û. However, (1.33) is not
sufficient under general conditions where H̄^{(i)} ≠ ĥ^{(i)} or u ≠ û. Changes in u may
occur because of the gradient differences between the original definition of q_jk and
its linear bounding functions, or due to the interval relaxation explained in Section 2.

The solution to this difficulty is to generate a set of γ matrices and a corresponding
set of constraints to bound the null space quadratic terms. The set of γ's that
can accommodate the greatest allowable change in u would inscribe the space of
positive definite combinations of the H̄^{(i)}'s. It is not clear how that particular set
of constraints can be conveniently generated, so the method presented here tries
to span the largest space possible while requiring reasonable computational effort
and maintaining problem sparsity. The method takes γ̄ and perturbs it in each of
the bilinear directions which appear in the problem until the semi-definite limit is
reached. If γ̄ is not positive definite in the null space, it is adjusted with a diagonal
positive definite matrix. Here are the details of the method.
   Q = N^T γ̄ N   (1.36)

Q is perturbed by one symmetric dyad pair at a time in each direction which appears
in the problem:

   Q̃ = Q − ρ_jk N^T (e_j e_k^T + e_k e_j^T) N

By realizing that at the semidefinite limit Q̃ becomes singular, it is possible to
calculate a formula for the limiting value of ρ_jk:

   (Q − ρ_jk N^T (e_j e_k^T + e_k e_j^T) N) x = 0   for some x ≠ 0   (1.37)
   x = ρ_jk ( Q^{-1} N^T e_j e_k^T N x + Q^{-1} N^T e_k e_j^T N x )
   e_k^T N x = ρ_jk ( e_k^T N Q^{-1} N^T e_j e_k^T N x + e_k^T N Q^{-1} N^T e_k e_j^T N x )   (1.38)
   e_j^T N x = ρ_jk ( e_j^T N Q^{-1} N^T e_j e_k^T N x + e_j^T N Q^{-1} N^T e_k e_j^T N x )   (1.39)

Then define

   η_jk = e_j^T N Q^{-1} N^T e_k,
   a = e_k^T N x,
   b = e_j^T N x.

Substituting these definitions into (1.38) and (1.39) gives

   a = ρ_jk η_kj a + ρ_jk η_kk b
   b = ρ_jk η_jj a + ρ_jk η_jk b

Solving these two equations together gives the limiting values of ρ_jk:

   ρ_jk^{-1} = η_jk ± √(η_jj η_kk)   (1.40)

For each perturbation dyad jk, equation (1.40) will give either a positive and a zero
value for ρ_jk^{-1}, or two nonzero values of opposite signs. These two values give two
constraints via (1.30-1.32), each of the form

   0 ≤ −ρ_jk (t_jk + t_kj) + Σ_{r,s} γ̄_rs t_rs.   (1.41)

The two values of ρ_jk^{-1} are renamed separately as ρ̲_jk^{-1} and ρ̄_jk^{-1}, such that

   ρ̲_jk^{-1} < 0,   ρ̄_jk^{-1} > 0.

Then the two constraints from (1.41) can be rewritten as

   0 ≤ (t_jk + t_kj) + |ρ̲_jk^{-1}| Σ_{r,s} γ̄_rs t_rs   (1.42)
   0 ≤ −(t_jk + t_kj) + |ρ̄_jk^{-1}| Σ_{r,s} γ̄_rs t_rs.   (1.43)

If x̂ is not a local minimum or the Lagrange multipliers are not correct, Q in (1.36)
is potentially indefinite. In the development above, it is necessary to factorize Q.
A modified Cholesky algorithm [30] is used, and if necessary, a diagonal matrix is
added to make Q positive definite. γ̄ is modified to incorporate the adjustment to Q.
Given a diagonal adjustment E to Q, the equivalent adjustment Δγ to γ̄ is determined
from

   N^T (γ̄ + Δγ) N = N^T γ̄ N + E = Q + E.   (1.44)

Constraints (1.42) and (1.43) are included in the orthant program for every pair
jk that is used in the problem or added by the diagonal adjustment. When
||H̄^{(i)} − ĥ^{(i)}|| and ||u − û|| are not too large, this set of constraints derived from
perturbing Q in each direction is sufficient to support the first order optimality
conditions of (1.1).
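The limiting values (1.40) are cheap to compute once Q has been factorized. Below is a minimal sketch, assuming Q has already been made positive definite by the modified Cholesky step; the function name and data layout are illustrative assumptions.

import numpy as np

def rho_limits(Q, N, pairs):
    # eta_jk = e_j^T N Q^{-1} N^T e_k, collected as M = N Q^{-1} N^T
    M = N @ np.linalg.solve(Q, N.T)
    out = {}
    for (j, k) in pairs:
        root = np.sqrt(M[j, j] * M[k, k])
        # 1/rho = eta_jk +/- sqrt(eta_jj*eta_kk); by Cauchy-Schwarz for a
        # PSD M, one value is <= 0 and the other >= 0, as the text notes.
        out[(j, k)] = (M[j, k] - root, M[j, k] + root)
    return out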

1.5 Orthant Program

With the pieces derived, the complete orthant program can be written. This orthant
program is useful as an intermediate result, though the 2^n orthants are too many
to be used directly in an algorithm. This combinatorial difficulty is solved by the
covering program presented in the next section.

It is illuminating to compare the original quadratic NLP with the orthant program
and to compare their first order optimality conditions.

Quadratic NLP:

   min  Δx₀   (1.45)
   s.t.
   u_i:  ḡ_i + b̄^{iT} Δx + Σ_{j,k} H̄^{(i)}_{jk} (t_jk + q_jk) ≤ 0   (1.46)
   χ:    t_jk = p^T (N^T e_j e_k^T N) p   ∀ j,k   (1.47)
   α:    q_jk = w_j d_k   ∀ j,k   (1.48)
   β:    N p + d = Δx   (1.49)
   φ:    N^T d = 0   (1.50)
   π:    w = 2Δx − d   (1.51)
   λ:    Δx^L ≤ Δx ≤ Δx^U   (1.52)
   κ:    w′^L ≤ w ≤ w′^U   (1.53)
   μ:    d′^L ≤ d ≤ d′^U   (1.54)

Orthant Program: identical, except that the defining equalities for t_jk and q_jk
are replaced by their linear bounds:

   χ^A, χ^B:  0 ≤ (t_jk + t_kj) + |ρ̲_jk^{-1}| Σ_{r,s} γ̄_rs t_rs   (A)
              0 ≤ −(t_jk + t_kj) + |ρ̄_jk^{-1}| Σ_{r,s} γ̄_rs t_rs   (B)   ∀ j,k   (1.47)
   α^A, α^B:  c¹_jk w_j + c²_jk d_k ≤ σ_jk q_jk   (A)
              c³_jk w_j + c⁴_jk d_k ≤ σ_jk q_jk   (B)   ∀ j,k   (1.48)   (1.55)

The first order optimality conditions compare as follows (quadratic NLP first,
orthant program second):

   Δx:   e⁰ + Σ_i u_i b̄^i − β − 2π + λ = 0  |  e⁰ + Σ_i u_i b̄^i − β − 2π + λ = 0   (1.56)
   t_jk: Σ_i u_i H̄^{(i)}_{jk} − χ_jk = 0  |  Σ_i u_i H̄^{(i)}_{jk} − (χ^A_jk − χ^B_jk)
         − Σ_{r,s} (χ^A_rs |ρ̲_rs^{-1}| + χ^B_rs |ρ̄_rs^{-1}|) γ̄_jk = 0   (1.57)
   q_jk: Σ_i u_i H̄^{(i)}_{jk} − α_jk = 0  |  Σ_i u_i H̄^{(i)}_{jk} − σ_jk (α^A_jk + α^B_jk) = 0   (1.58)
   p:    −Σ_{j,k} 2 χ_jk N^T e_j e_k^T N p + N^T β = 0  |  N^T β = 0   (1.59)
   w_j:  −Σ_k α_jk d_k + π_j + κ_j = 0  |  Σ_k (α^A_jk c¹_jk + α^B_jk c³_jk) + π_j + κ_j = 0   (1.60)
   d_k:  −Σ_j α_jk w_j + β_k + π_k + N_k^T φ + μ_k = 0  |
         Σ_j (α^A_jk c²_jk + α^B_jk c⁴_jk) + β_k + π_k + N_k^T φ + μ_k = 0   (1.61)
         u ≥ 0  |  u, α^A, α^B, χ^A, χ^B ≥ 0   (1.62)

For the most part, the programs are very similar, differing only in how the quadratic
variables are treated, and if Δx = 0 satisfies the original program's optimality
conditions, Δx = 0 will also satisfy the optimality conditions of the orthant program
if the region is small enough. The changes in (1.48) create differences in (1.58),
(1.60), and (1.61). Constraint (1.58) can be satisfied by adjusted values of α^A_jk and
α^B_jk as long as sign(γ_jk) = sign(Σ_i u_i H̄^{(i)}_{jk}). These changes in α will ultimately
require changes in u via the variables π and β.

The changes in (1.47) create differences in (1.57) and (1.59). When Δx = 0, con-
straints (1.59) are identical because p = 0, so that will not cause the orthant program
to have a different solution. The orthant version of (1.57) may be satisfied by some
combination of χ^A_jk, χ^B_jk if Σ_i u_i H̄^{(i)} is in the space of positive semi-definite matrices
spanned by the set of γ developed above, which will be true when ||γ̄ − Σ_i u_i H̄^{(i)}||
is not too large.

2 THE COVERING PROGRAM

The orthant program derived in the previous section may be impractical as a means
of obtaining a bound on a region because it implies the solving of 2^n subproblems.
In this section, an interval-based relaxation will be developed and then applied to
the set of 2^n orthant programs to give a single program to solve for a lower bound
on a region.

In the orthant program, only constraints (1.48.A) and (1.48.B) and variable bounds
(1.53) and (1.54) are affected by the choice of orthant. To extend the orthant
program into a single program for the whole variable domain, the variable bounds
can be extended to their whole ranges, and the coefficients in constraints (1.48.A)
and (1.48.B) can be replaced with intervals.

2.1 Constraints from an interval linear system

For the derivation that follows, it is necessary to develop a set of linear constraints
from a linear system of equations with interval coefficients. Given a linear system

   A u = c,   u ≥ 0   (1.63)

with A ∈ [A̲, Ā], an interval matrix, the goal is to develop the tightest set of linear
constraints that can be written to limit the range of u ∈ [u̲, ū]. (The motivation here
is to be able to show u̲ ≥ 0 in order to demonstrate stationarity.) Use of LP and
linear constraints to approximate linear systems with interval coefficients has been
studied before [2, 3, 19, 20]. The method presented here includes the constraints
of this previous work plus an additional constraint to increase the tightness of the
bounds on u. Neumaier [18] presents other interval methods for solving this prob-
lem, but these methods are not applicable for the method presented here because
they cannot be used within an LP.

For any variable x, the notations x^{(+)} and x^{(−)} will refer to its positive and negative
parts, defined as

   x^{(+)} = max{0, x}
   x^{(−)} = max{0, −x}

Consider a row i of (1.63), and choose a particular multiplier, u_{r(i)}. Under the
condition that u ≥ 0, the following may be used to obtain one limit of a valid
interval [u̲_{r(i)}, ū_{r(i)}] containing the value of u_{r(i)} for some solution of (1.63), for all
A_{ir(i)} ∈ [A̲_{ir(i)}, Ā_{ir(i)}]:

   A_{ir(i)} u_{r(i)} ≤ c_i − Σ_{j≠r(i)} ( Ā_ij^{(+)} ū_j − Ā_ij^{(−)} u̲_j )   (1.64)

One method for selecting r(i) will be described in the following section. By choosing
the value of A_{ir(i)} which maximizes the left hand side and the value of u_{r(i)} which
minimizes it, the following is obtained:

   Ā_{ir(i)}^{(+)} u̲_{r(i)} − Ā_{ir(i)}^{(−)} ū_{r(i)} ≤ c_i − Σ_{j≠r(i)} ( Ā_ij^{(+)} ū_j − Ā_ij^{(−)} u̲_j )   (1.65)

This can be rewritten as

   Σ_j ( Ā_ij^{(+)} ū_j − Ā_ij^{(−)} u̲_j ) − |Ā_{ir(i)}| ( ū_{r(i)} − u̲_{r(i)} ) ≤ c_i   (1.66)

Similarly, the following condition may also be required

   A_{ir(i)} u_{r(i)} ≥ c_i − Σ_{j≠r(i)} ( A̲_ij^{(+)} u̲_j − A̲_ij^{(−)} ū_j )   (1.67)

for all A_{ir(i)} ∈ [A̲_{ir(i)}, Ā_{ir(i)}] and some u_{r(i)}. By choosing the A_{ir(i)} which mini-
mizes the left hand side and the value of u_{r(i)} which maximizes it, and performing
similar rearrangements, the following is similarly obtained:

   Σ_j ( A̲_ij^{(−)} ū_j − A̲_ij^{(+)} u̲_j ) − |A̲_{ir(i)}| ( ū_{r(i)} − u̲_{r(i)} ) ≤ −c_i   (1.68)

Combining (1.66) and (1.68) for all i gives a set of constraints which can be used
to determine intervals [u̲, ū] containing the values of the solution u to (1.63) for
A ∈ [A̲, Ā]. These constraints will be used below to develop the underestimating
linear program.
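As a sketch, the pair (1.66)/(1.68) for one row can be assembled mechanically from the elementwise bounds; here Al and Au hold A̲ and Ā, the decision vector is the stacked (ū, u̲), and the function name and layout are assumptions for illustration.

import numpy as np

def row_cuts(Al, Au, c, i, r):
    n = Al.shape[1]
    hi = np.zeros(2 * n)          # coefficients on (ubar, ulow) for (1.66)
    lo = np.zeros(2 * n)          # coefficients on (ubar, ulow) for (1.68)
    for j in range(n):
        hi[j]     += max(Au[i, j], 0.0)     #  Abar^(+) on ubar_j
        hi[n + j] -= max(-Au[i, j], 0.0)    # -Abar^(-) on ulow_j
        lo[j]     += max(-Al[i, j], 0.0)    #  Alow^(-) on ubar_j
        lo[n + j] -= max(Al[i, j], 0.0)     # -Alow^(+) on ulow_j
    # the -|A_{i r(i)}| (ubar_{r(i)} - ulow_{r(i)}) term of each cut
    hi[r] -= abs(Au[i, r]); hi[n + r] += abs(Au[i, r])
    lo[r] -= abs(Al[i, r]); lo[n + r] += abs(Al[i, r])
    return (hi, c[i]), (lo, -c[i])   # each: coeffs . (ubar, ulow) <= rhs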

2.2 Choosing r(i)


In the previous section, r(i) was an unspecified function assigning a particular
variable to each row. In this section, the method of assignment developed by
Swaney [34] will be presented. The choice of r(i) can greatly affect the quality of
the constraints developed in the previous section.

The following method requires an approximate solution ũ of the interval linear
system (1.63) for some value of A ∈ [A̲, Ā]. ũ can be taken from a previous
iteration or can be initialized to a vector of ones.

The function r(i) is developed in a sequential fashion. First a row i is chosen from
the list of unassigned rows, and then the column r(i) for that row is chosen from
the list of unassigned columns. This continues until all rows have been assigned.
The following should be calculated for use in choosing the rows and columns:

   X̄ = Ā ũ   (1.69)
   X̲ = A̲ ũ   (1.70)

   i_next = arg max_{i unassigned} ( min_{j unassigned} max{ X̄_i / (Ā_ij ũ_j),  X̲_i / (A̲_ij ũ_j) } )   (1.71)

   r(i_next) = arg min_{j unassigned} max{ X̄_{i_next} / (Ā_{i_next j} ũ_j),  X̲_{i_next} / (A̲_{i_next j} ũ_j) }   (1.72)

r(i) is determined by calculating i_next and then calculating r(i_next). Let s(j) be
defined as the inverse function of r(i), such that

   i = s(r(i)),   j = r(s(j)).
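Below is a sketch of this sequential assignment, assuming m ≤ n so that every row can receive a column; the guard eps and all names are illustrative assumptions, not the chapter's code.

import numpy as np

def assign_r(Al, Au, u_tilde, eps=1e-12):
    m, n = Au.shape
    Xu, Xl = Au @ u_tilde, Al @ u_tilde          # (1.69)-(1.70)
    rows, cols = set(range(m)), set(range(n))
    def ratio(i, j):
        # guard against zero denominators with a small eps
        return max(abs(Xu[i]) / (abs(Au[i, j] * u_tilde[j]) + eps),
                   abs(Xl[i]) / (abs(Al[i, j] * u_tilde[j]) + eps))
    r = {}
    while rows:
        i = max(rows, key=lambda i: min(ratio(i, j) for j in cols))  # (1.71)
        j = min(cols, key=lambda j: ratio(i, j))                     # (1.72)
        r[i] = j
        rows.discard(i); cols.discard(j)
    return r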

2.3 Interval Linear Program

Now that constraints for an interval linear system have been developed, it is possible
to develop an underestimating program for a linear program that has an interval
coefficient matrix. Given an LP of the form

   P(A) = min c^T x   (1.73)
          s.t. A x ≤ b

with A ∈ [A̲, Ā] ⊂ R^{m×n}, the set of all m × n matrices with interval coefficients.
The dual of this LP is:

   D(A) = min b^T u   (1.74)
          A^T u = −c   (1.75)
          u ≥ 0   (1.76)

Constraint (1.75) is an interval linear system, so the bounding constraints developed
in Section 2.1 above can be applied to it. Applying these constraints leads to the
following linear program:

   min [ b^{(+)T}, −b^{(−)T} ] [ ū ; u̲ ]   (1.77)
   s.t. Σ_j ( A̲_ji^{(−)} ū_j − A̲_ji^{(+)} u̲_j ) − |A̲_{r(i)i}| ( ū_{r(i)} − u̲_{r(i)} ) ≤ c_i   ∀i   (1.78)
        Σ_j ( Ā_ji^{(+)} ū_j − Ā_ji^{(−)} u̲_j ) − |Ā_{r(i)i}| ( ū_{r(i)} − u̲_{r(i)} ) ≤ −c_i   ∀i   (1.79)
        u̲ ≥ 0   (1.80)
        ū − u̲ ≥ 0   (1.81)

Constraints (1.78) and (1.79) define restrictions on u̲, ū such that they describe
a valid interval containing the solution to (1.74) for all A ∈ [A̲, Ā]. The interval
[u̲, ū] so defined may overestimate the true range of u values in (1.74), i.e., these
constraints may be somewhat overrestrictive in that role. Also, (1.80) is introduced
as an added restriction. Thus this LP representation of the interval dual will be
referred to as the restricted dual. The choice for the objective function will be
justified below. The order of the subscripts of A reflects the fact that constraint
(1.75) uses A^T.

Next the restricted dual is transformed back to corresponding primal variables:

   min c^T ( x^{(+)} − x^{(−)} )   (1.82)
   s.t. Σ_j ( A̲_ij^{(−)} x_j^{(+)} + Ā_ij^{(+)} x_j^{(−)} ) − |A̲_{is(i)}| x_{s(i)}^{(+)} − |Ā_{is(i)}| x_{s(i)}^{(−)} − y_i = −b_i^{(+)}   ∀i   (1.83)
        Σ_j ( −Ā_ij^{(+)} x_j^{(+)} − A̲_ij^{(−)} x_j^{(−)} ) + |A̲_{is(i)}| x_{s(i)}^{(+)} + |Ā_{is(i)}| x_{s(i)}^{(−)} − z_i + y_i = −b_i^{(−)}   ∀i   (1.84)
        x^{(+)} ≥ 0   (1.85)
        x^{(−)} ≥ 0   (1.86)
        z ≥ 0   (1.87)
        y ≥ 0   (1.88)

Equation (1.83) can be used to eliminate y, and z converts (1.84) into an inequality:

   min c^T ( x^{(+)} − x^{(−)} )
   s.t. Σ_j ( ( A̲_ij^{(+)} − A̲_ij^{(−)} ) x_j^{(+)} − ( Ā_ij^{(+)} − Ā_ij^{(−)} ) x_j^{(−)} ) ≤ b_i   ∀i
        |A̲_{is(i)}| x_{s(i)}^{(+)} + |Ā_{is(i)}| x_{s(i)}^{(−)} − Σ_j ( A̲_ij^{(−)} x_j^{(+)} + Ā_ij^{(+)} x_j^{(−)} ) ≤ b_i^{(+)}   ∀i

Simplifying gives the underestimating program:

   min c^T ( x^{(+)} − x^{(−)} )   (1.89)
   s.t. A̲ x^{(+)} − Ā x^{(−)} ≤ b   (1.90)
        |A̲_{is(i)}| x_{s(i)}^{(+)} + |Ā_{is(i)}| x_{s(i)}^{(−)} − Σ_j ( A̲_ij^{(−)} x_j^{(+)} + Ā_ij^{(+)} x_j^{(−)} ) ≤ b_i^{(+)}   ∀i   (1.91)
        x^{(+)} ≥ 0   (1.92)
        x^{(−)} ≥ 0   (1.93)
The choice of the objective function (1.77) for the restricted dual causes the right
hand side of (1.90) to be equivalent to the constraints in the original LP, and it
makes (1.91) as tight as it can be.

Summarizing, for constraints with interval coefficients of the form

   d ≤ [B̲, B̄] x,   (1.94)

the corresponding set of LP constraints is

   d ≤ B̄ x^{(+)} − B̲ x^{(−)}   (1.95)
   |B̄_{is(i)}| x_{s(i)}^{(+)} + |B̲_{is(i)}| x_{s(i)}^{(−)} − Σ_j ( B̄_ij^{(+)} x_j^{(+)} + B̲_ij^{(−)} x_j^{(−)} ) ≤ d_i^{(−)}   (1.96)

Equality constraints in the original program can be treated equivalently as a pair
of opposing inequality constraints using the above.
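Below is a sketch of the summary transformation (1.94)-(1.96) in the split variables (x_plus, x_minus); s maps each row to its chosen column as in Section 2.2, and the function name and return format are assumptions for illustration.

import numpy as np

def interval_rows(Bl, Bu, d, s):
    m, n = Bl.shape
    # (1.95): Bu x(+) - Bl x(-) >= d, one row per constraint
    main = (np.hstack([Bu, -Bl]), d)
    extra = []
    for i in range(m):
        row = np.zeros(2 * n)
        j = s[i]
        row[j] += abs(Bu[i, j]); row[n + j] += abs(Bl[i, j])
        row[:n] -= np.maximum(Bu[i], 0.0)     # -Bu^(+) on x(+)
        row[n:] -= np.maximum(-Bl[i], 0.0)    # -Bl^(-) on x(-)
        extra.append((row, max(-d[i], 0.0)))  # (1.96): lhs <= d_i^(-)
    return main, extra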

2.4 Proving the Underestimating Program

The objective function of (1.89) is identical to (1.73). To establish that (1.89)
provides a lower bound to the original linear program (1.73), it must be shown that
every feasible point of (1.73) has a feasible counterpart in (1.89). Given x̃, a feasible
point of (1.73), it will be shown that

   x^{(+)} = x̃^{(+)},   x^{(−)} = x̃^{(−)}

is feasible in (1.89). This choice of x^{(+)} and x^{(−)} clearly satisfies the positivity
constraints (1.92) and (1.93). The following shows that (1.90) is satisfied:

   A̲ x^{(+)} − Ā x^{(−)} = (A̲ − A) x̃^{(+)} + (A − Ā) x̃^{(−)} + A ( x̃^{(+)} − x̃^{(−)} )
                        = (A̲ − A) x̃^{(+)} + (A − Ā) x̃^{(−)} + A x̃
                            [≤ 0]           [≤ 0]          [≤ b]
                        ≤ b

To justify the right hand side of (1.91) and to illustrate that it is satisfied, it is
useful to consider a simple constraint

   x_j ≤ b_i.

Assuming that s(i) = j, applying constraint (1.91) to this gives

   x_j^{(+)} ≤ b_i^{(+)}.

If b_i were used in place of b_i^{(+)}, the underestimating program would be infeasible
when b_i < 0, because x_j^{(+)} ≥ 0 while x_j ≤ b_i < 0. When b_i ≥ 0, b_i^{(+)} is the tightest
upper bound for (1.91): anything less than b_i would be more restrictive than the
original constraint, and anything greater would be needlessly loose.

To prove that (1.91) is satisfied by x̃, two cases need to be considered. In the first
case, s(i) = ∅ or A_{is(i)} = 0, so (1.91) is reduced to

   −Σ_j ( A̲_ij^{(−)} x_j^{(+)} + Ā_ij^{(+)} x_j^{(−)} ) ≤ b_i^{(+)}

In this case, this constraint is redundant, so it need not be included when solving
the underestimating LP.

In the remaining case, s(i) ≠ ∅ and A_{is(i)} ≠ 0. From above, the following is true
for row i:

   b_i^{(+)} ≥ b_i ≥ A_i^T x̃ = Σ_j ( ( A_ij^{(+)} − A_ij^{(−)} ) x_j^{(+)} − ( A_ij^{(+)} − A_ij^{(−)} ) x_j^{(−)} )
   = ( A_{is(i)}^{(+)} − A_{is(i)}^{(−)} ) x_{s(i)}^{(+)} − ( A_{is(i)}^{(+)} − A_{is(i)}^{(−)} ) x_{s(i)}^{(−)} +
     Σ_{j≠s(i)} ( ( A_ij^{(+)} x_j^{(+)} + A_ij^{(−)} x_j^{(−)} ) − ( A_ij^{(−)} x_j^{(+)} + A_ij^{(+)} x_j^{(−)} ) )
                  [the first group of terms is ≥ 0]
   ≥ A̲_{is(i)}^{(+)} x_{s(i)}^{(+)} + Ā_{is(i)}^{(−)} x_{s(i)}^{(−)} − A̲_{is(i)}^{(−)} x_{s(i)}^{(+)} − Ā_{is(i)}^{(+)} x_{s(i)}^{(−)} −
     Σ_{j≠s(i)} ( A̲_ij^{(−)} x_j^{(+)} + Ā_ij^{(+)} x_j^{(−)} )

Because of the (+) and (−) designations and the fact that A̲ ≤ Ā, only one of
the four products involving A_{is(i)} and x_{s(i)} may be nonzero. This constraint must
be satisfied for each of these four terms alone. The third and fourth terms are
satisfied trivially, and the first and second terms give the following constraint, which
is equivalent to (1.91):

   b_i^{(+)} ≥ A̲_{is(i)}^{(+)} x_{s(i)}^{(+)} + Ā_{is(i)}^{(−)} x_{s(i)}^{(−)} − Σ_{j≠s(i)} ( A̲_ij^{(−)} x_j^{(+)} + Ā_ij^{(+)} x_j^{(−)} )

Thus, every feasible point in the original problem has a corresponding solution to
the underestimating program.

2.5 Modifying the Orthant Program


To finish the derivation of the covering program, an interval form of the orthant
program must be developed and the interval relaxation applied. In addition to the
interval changes, several modifications are made to the orthant program to improve
its bounding behavior.

The constraints (1.48.A) and (1.48.B) are the only general constraints to depend on
the choice of orthant. The values of c¹_jk, c²_jk and c³_jk, c⁴_jk are determined by
equations (1.28) and (1.29). If σ_jk = 1, c¹_jk ∈ [d_k^L, d_k^U] and c²_jk ∈ [w_j^L, w_j^U]. The
convex and concave envelopes of these quadratic terms can be written as (1.99) and
(1.100). The simple variable bounds on w and d become (1.111) and (1.112).

The general McCormick convex and concave envelopes for Δx_j Δx_k are added to
the program to give a bound which will work if the other bounding functions fail
to keep the LP stationary at Δx = 0. This requires the addition of constraints
(1.101-1.104).

   min c^T Δx   (1.97)
   s.t. ḡ_i + b̄^{iT} Δx + Σ_{j,k} H̄^{(i)}_{jk} ( t_jk + q_jk ) ≤ 0   ∀i   (1.98)
        0 ≤ [d_k^L, d_k^U] w_j − q_jk ≤ 0   ∀ j,k   (1.99)
        0 ≤ [w_j^L, w_j^U] d_k − q_jk ≤ 0   ∀ j,k   (1.100)
        Δx_j^U Δx_k + Δx_k^U Δx_j − ½ ( q_jk + t_jk + q_kj + t_kj ) ≤ Δx_j^U Δx_k^U   ∀ j,k   (1.101)
        Δx_j^L Δx_k + Δx_k^L Δx_j − ½ ( q_jk + t_jk + q_kj + t_kj ) ≤ Δx_j^L Δx_k^L   ∀ j,k   (1.102)
        Δx_j^L Δx_k + Δx_k^U Δx_j − ½ ( q_jk + t_jk + q_kj + t_kj ) ≥ Δx_j^L Δx_k^U   ∀ j,k   (1.103)
        Δx_j^U Δx_k + Δx_k^L Δx_j − ½ ( q_jk + t_jk + q_kj + t_kj ) ≥ Δx_j^U Δx_k^L   ∀ j,k   (1.104)
        0 ≤ ( t_jk + t_kj ) + |ρ̲_jk^{-1}| Σ_{r,s} γ̄_rs t_rs   ∀ j,k   (1.105)
        0 ≤ −( t_jk + t_kj ) + |ρ̄_jk^{-1}| Σ_{r,s} γ̄_rs t_rs   ∀ j,k   (1.106)
        N p + d = Δx   (1.107)
        N^T d = 0   (1.108)
        w + d = 2Δx   (1.109)
        Δx^L ≤ Δx ≤ Δx^U   (1.110)
        w^L ≤ w ≤ w^U   (1.111)
        d^L ≤ d ≤ d^U   (1.112)
Equations (1.107)-(1.109) are used to eliminate w and d from this formulation,
giving:

   min c^T Δx   (1.113)
   s.t. ḡ_i + b̄^{iT} Δx + Σ_{j,k} H̄^{(i)}_{jk} ( t_jk + q_jk ) ≤ 0   ∀i   (1.114)
        0 ≤ [d_k^L, d_k^U] e_j^T ( I + N (N^T N)^{-1} N^T ) Δx − q_jk ≤ 0   ∀ j,k   (1.115)
        0 ≤ [w_j^L, w_j^U] e_k^T ( I − N (N^T N)^{-1} N^T ) Δx − q_jk ≤ 0   ∀ j,k   (1.116)
        Δx_j^U Δx_k + Δx_k^U Δx_j − ½ ( q_jk + t_jk + q_kj + t_kj ) ≤ Δx_j^U Δx_k^U   ∀ j,k   (1.117)
        Δx_j^L Δx_k + Δx_k^L Δx_j − ½ ( q_jk + t_jk + q_kj + t_kj ) ≤ Δx_j^L Δx_k^L   ∀ j,k   (1.118)
        Δx_j^L Δx_k + Δx_k^U Δx_j − ½ ( q_jk + t_jk + q_kj + t_kj ) ≥ Δx_j^L Δx_k^U   ∀ j,k   (1.119)
        Δx_j^U Δx_k + Δx_k^L Δx_j − ½ ( q_jk + t_jk + q_kj + t_kj ) ≥ Δx_j^U Δx_k^L   ∀ j,k   (1.120)
        0 ≤ ( t_jk + t_kj ) + |ρ̲_jk^{-1}| Σ_{r,s} γ̄_rs t_rs   ∀ j,k   (1.121)
        0 ≤ −( t_jk + t_kj ) + |ρ̄_jk^{-1}| Σ_{r,s} γ̄_rs t_rs   ∀ j,k   (1.122)
        Δx^L ≤ Δx ≤ Δx^U   (1.123)

When the interval LP relaxation derived above is applied to this LP, it is able
to verify the global minimum for some finite region when there is no constraint
gradient null space. However, it is unable to verify the global minimum for finite
sized regions when there is a null space. This problem can be solved by introducing
a new variable z to deal with the rank deficiency of G.

z is defined by these two conditions:

   z + N p = Δx   (1.124)
   z_j = 0   ∀ j ∉ J   (1.125)

Inserting the definition of N and partitioning z and Δx according to J gives the
following result:

   z = ( I + [0 N] ) Δx   (1.126)

Bounds on z are calculated by evaluating (1.126) using interval arithmetic:

   [z^L, z^U] = ( I + [0 N] ) [Δx^L, Δx^U]   (1.127)

z has two important properties which allow it to replace Δx in (1.114) for i ∈ I_A
and in (1.116). First,

   b̄^{iT} z = b̄^{iT} Δx,

because b̄^{iT} N = 0 for i ∈ I_A. Second, the components of z and Δx in the constraint
space are identical:

   ( I − N (N^T N)^{-1} N^T ) Δx = ( I − N (N^T N)^{-1} N^T ) z,

since

   ( I − N (N^T N)^{-1} N^T )( I + [0 N] ) Δx = ( I − N (N^T N)^{-1} N^T ) Δx +
     ( I − N (N^T N)^{-1} N^T ) [0 N] Δx

and the second term vanishes because ( I − N (N^T N)^{-1} N^T ) N = 0.

Introducing z into the LP and replacing Δx with it where possible tightens the
bounds produced by the covering program.

An additional variable change is needed to improve the covering program's bounds.
It is possible to reduce the number of t variables by enforcing the symmetry t_jk =
t_kj. This is valid because of the definition of t in (1.18). However, q_jk is not in
general equal to q_kj, so similar treatment for q is not possible. In the LP
presented below, only variables t_jk with j ≤ k are included, and whenever t_jk with
j > k appears, t_kj is used in its place.

   min c^T Δx   (1.128)
   s.t. b̄^{iT} z + Σ_{j,k} H̄^{(i)}_{jk} ( t_jk + q_jk ) ≤ 0   ∀ i ∈ I_A   (1.129)
        ḡ_i + b̄^{iT} Δx + Σ_{j,k} H̄^{(i)}_{jk} ( t_jk + q_jk ) ≤ 0   ∀ i ∉ I_A   (1.130)
        0 ≤ [d_k^L, d_k^U] e_j^T ( I + N (N^T N)^{-1} N^T ) Δx − q_jk ≤ 0   ∀ j,k   (1.131)
        0 ≤ [w_j^L, w_j^U] e_k^T ( I − N (N^T N)^{-1} N^T ) z − q_jk ≤ 0   ∀ j,k   (1.132)
        Δx_j^U Δx_k + Δx_k^U Δx_j − ½ ( q_jk + q_kj + 2 t_jk ) ≤ Δx_j^U Δx_k^U   ∀ j,k   (1.133)
        Δx_j^L Δx_k + Δx_k^L Δx_j − ½ ( q_jk + q_kj + 2 t_jk ) ≤ Δx_j^L Δx_k^L   ∀ j,k   (1.134)
        Δx_j^L Δx_k + Δx_k^U Δx_j − ½ ( q_jk + q_kj + 2 t_jk ) ≥ Δx_j^L Δx_k^U   ∀ j,k   (1.135)
        Δx_j^U Δx_k + Δx_k^L Δx_j − ½ ( q_jk + q_kj + 2 t_jk ) ≥ Δx_j^U Δx_k^L   ∀ j,k   (1.136)
        0 ≤ 2 t_jk + |ρ̲_jk^{-1}| Σ_{r,s} γ̄_rs t_rs   ∀ j,k   (1.137)
        0 ≤ −2 t_jk + |ρ̄_jk^{-1}| Σ_{r,s} γ̄_rs t_rs   ∀ j,k   (1.138)
        z = ( I + [0 N] ) Δx   (1.139)
        Δx^L ≤ Δx ≤ Δx^U   (1.140)
        z^L ≤ z ≤ z^U   (1.141)
        q^L ≤ q ≤ q^U   (1.142)
        t^L ≤ t ≤ t^U   (1.143)

where

   z_j^L = z_j^U = 0   ∀ j ∉ J
   [q_jk^L, q_jk^U] = [w_j^L, w_j^U] × [d_k^L, d_k^U]   ∀ j,k
   [t_jk^L, t_jk^U] = ( e_j^T N (N^T N)^{-1} N^T [Δx^L, Δx^U] ) × ( e_k^T N (N^T N)^{-1} N^T [Δx^L, Δx^U] )
Applying the interval LP relaxation to this program gives the end result:

Covering Program

   min c^T ( Δx^{(+)} − Δx^{(−)} )   (1.144)
   s.t. b̄^{iT} ( z^{(+)} − z^{(−)} ) + Σ_{j,k} H̄^{(i)}_{jk} ( t_jk^{(+)} − t_jk^{(−)} + q_jk^{(+)} − q_jk^{(−)} ) ≤ 0   ∀ i ∈ I_A   (1.145)
        ḡ_i + b̄^{iT} ( Δx^{(+)} − Δx^{(−)} ) + Σ_{j,k} H̄^{(i)}_{jk} ( t_jk^{(+)} − t_jk^{(−)} + q_jk^{(+)} − q_jk^{(−)} ) ≤ 0   ∀ i ∉ I_A   (1.146)
        v̲^{jkT} Δx^{(+)} − v̄^{jkT} Δx^{(−)} − q_jk^{(+)} + q_jk^{(−)} ≤ 0   ∀ j,k   (1.147)
        v̄^{jkT} Δx^{(+)} − v̲^{jkT} Δx^{(−)} − q_jk^{(+)} + q_jk^{(−)} ≥ 0   ∀ j,k   (1.148)
        h̲^{jkT} z^{(+)} − h̄^{jkT} z^{(−)} − q_jk^{(+)} + q_jk^{(−)} ≤ 0   ∀ j,k   (1.149)
        h̄^{jkT} z^{(+)} − h̲^{jkT} z^{(−)} − q_jk^{(+)} + q_jk^{(−)} ≥ 0   ∀ j,k   (1.150)
        Δx_j^U ( Δx_k^{(+)} − Δx_k^{(−)} ) + Δx_k^U ( Δx_j^{(+)} − Δx_j^{(−)} ) −
           ½ ( q_jk^{(+)} − q_jk^{(−)} + q_kj^{(+)} − q_kj^{(−)} + 2 t_jk^{(+)} − 2 t_jk^{(−)} ) ≤ Δx_j^U Δx_k^U   ∀ j,k   (1.151)
        Δx_j^L ( Δx_k^{(+)} − Δx_k^{(−)} ) + Δx_k^L ( Δx_j^{(+)} − Δx_j^{(−)} ) −
           ½ ( q_jk^{(+)} − q_jk^{(−)} + q_kj^{(+)} − q_kj^{(−)} + 2 t_jk^{(+)} − 2 t_jk^{(−)} ) ≤ Δx_j^L Δx_k^L   ∀ j,k   (1.152)
        Δx_j^L ( Δx_k^{(+)} − Δx_k^{(−)} ) + Δx_k^U ( Δx_j^{(+)} − Δx_j^{(−)} ) −
           ½ ( q_jk^{(+)} − q_jk^{(−)} + q_kj^{(+)} − q_kj^{(−)} + 2 t_jk^{(+)} − 2 t_jk^{(−)} ) ≥ Δx_j^L Δx_k^U   ∀ j,k   (1.153)
        Δx_j^U ( Δx_k^{(+)} − Δx_k^{(−)} ) + Δx_k^L ( Δx_j^{(+)} − Δx_j^{(−)} ) −
           ½ ( q_jk^{(+)} − q_jk^{(−)} + q_kj^{(+)} − q_kj^{(−)} + 2 t_jk^{(+)} − 2 t_jk^{(−)} ) ≥ Δx_j^U Δx_k^L   ∀ j,k   (1.154)
        0 ≤ 2 ( t_jk^{(+)} − t_jk^{(−)} ) + |ρ̲_jk^{-1}| Σ_{r,s} γ̄_rs ( t_rs^{(+)} − t_rs^{(−)} )   ∀ j,k   (1.155)
        0 ≤ −2 ( t_jk^{(+)} − t_jk^{(−)} ) + |ρ̄_jk^{-1}| Σ_{r,s} γ̄_rs ( t_rs^{(+)} − t_rs^{(−)} )   ∀ j,k   (1.156)
        z^{(+)} − z^{(−)} = ( I + [0 N] ) ( Δx^{(+)} − Δx^{(−)} )   (1.157)
        Δx^L ≤ Δx^{(+)} − Δx^{(−)} ≤ Δx^U   (1.158)
        z^L ≤ z^{(+)} − z^{(−)} ≤ z^U   (1.159)
        q^L ≤ q^{(+)} − q^{(−)} ≤ q^U   (1.160)
        t^L ≤ t^{(+)} − t^{(−)} ≤ t^U   (1.161)
        b̄_{s(i)}^{i(+)} z_{s(i)}^{(+)} + b̄_{s(i)}^{i(−)} z_{s(i)}^{(−)} − Σ_{j≠s(i)} ( b̄_j^{i(−)} z_j^{(+)} + b̄_j^{i(+)} z_j^{(−)} ) −
           Σ_{j,k} ( H̄^{(i)(−)}_{jk} ( t_jk^{(+)} + q_jk^{(+)} ) + H̄^{(i)(+)}_{jk} ( t_jk^{(−)} + q_jk^{(−)} ) ) ≤ 0   ∀ i ∈ I_A   (1.162)
        −z_i^{(−)} + ( I + [0 N] )_{ii}^{(−)} Δx_i^{(+)} + ( I + [0 N] )_{ii}^{(+)} Δx_i^{(−)} −
           Σ_{j≠i} ( ( I + [0 N] )_{ij}^{(+)} Δx_j^{(+)} + ( I + [0 N] )_{ij}^{(−)} Δx_j^{(−)} ) ≤ 0   ∀ i ∈ J   (1.163)
        q_jk^{(−)} − h̲^{jk(−)T} z^{(+)} − h̄^{jk(+)T} z^{(−)} ≤ 0   ∀ (j,k) ∈ U   (1.164)
        q_jk^{(+)} − h̄^{jk(+)T} z^{(+)} − h̲^{jk(−)T} z^{(−)} ≤ 0   ∀ (j,k) ∉ U   (1.165)
        ( 2 + |ρ̲_jk^{-1}| γ̄_jk )^{(−)} t_jk^{(+)} + ( 2 + |ρ̲_jk^{-1}| γ̄_jk )^{(+)} t_jk^{(−)} −
           Σ_{r,s ≠ j,k} ( ( |ρ̲_jk^{-1}| γ̄_rs )^{(+)} t_rs^{(+)} + ( |ρ̲_jk^{-1}| γ̄_rs )^{(−)} t_rs^{(−)} ) ≤ 0   ∀ (j,k) ∈ C   (1.166)
        ( −2 + |ρ̄_jk^{-1}| γ̄_jk )^{(−)} t_jk^{(+)} + ( −2 + |ρ̄_jk^{-1}| γ̄_jk )^{(+)} t_jk^{(−)} −
           Σ_{r,s ≠ j,k} ( ( |ρ̄_jk^{-1}| γ̄_rs )^{(+)} t_rs^{(+)} + ( |ρ̄_jk^{-1}| γ̄_rs )^{(−)} t_rs^{(−)} ) ≤ 0   ∀ (j,k) ∉ C   (1.167)

where

   C = { (j,k) : |2 + |ρ̲_jk^{-1}| γ̄_jk| ≥ |−2 + |ρ̄_jk^{-1}| γ̄_jk| }
   v^{jkT} = [d_k^L, d_k^U] e_j^T ( I + N (N^T N)^{-1} N^T )
   h^{jkT} = [w_j^L, w_j^U] e_k^T ( I − N (N^T N)^{-1} N^T )
   U = { (j,k) : σ_jk ≥ 0 }
Although large and notationally complicated, this program is a straightforward
translation of (1.128-1.143). As the interval relaxation dictates, all of the variables
are replaced by their positive and negative components, and each original constraint
is transformed into the new program. Constraints (1.145) and (1.146) correspond
to constraints (1.129) and (1.130) in the interval orthant program, and likewise
constraints (1.147-1.150) correspond to constraints (1.131) and (1.132). Each of
the constraints (1.151-1.161) corresponds to a constraint in (1.128-1.143).

The interval relaxation also includes a constraint for each variable of the form (1.91).
For each variable, a particular constraint is chosen to make an extra constraint to
tighten the bound. Constraints (1.162) are the constraints for the z variables, and
s(i) is defined by the pivot sequence used in LU factoring G. The Δx constraints
(1.163) are taken from the equations (1.139) that relate Δx and z, and the q_jk con-
straints (1.164) and (1.165) are taken from constraint (1.132), using σ_jk to
indicate whether the upper or lower bound will be active. Lastly, the t constraints
(1.166) and (1.167) are chosen from (1.137) and (1.138) using the constraint with
the largest coefficient on t_jk.

When there is no null space, the interval LP can be simplified to the following:

   min c^T Δx   (1.168)
   s.t. ḡ_i + b̄^{iT} Δx + Σ_{j,k} H̄^{(i)}_{jk} q_jk ≤ 0   ∀i   (1.169)
        0 ≤ [Δx_k^L, Δx_k^U] Δx_j − q_jk ≤ 0   ∀ j,k   (1.170)
        Δx_j^U Δx_k + Δx_k^U Δx_j − q_jk ≤ Δx_j^U Δx_k^U   ∀ j,k   (1.171)
        Δx_j^L Δx_k + Δx_k^L Δx_j − q_jk ≤ Δx_j^L Δx_k^L   ∀ j,k   (1.172)
        Δx_j^L Δx_k + Δx_k^U Δx_j − q_jk ≥ Δx_j^L Δx_k^U   ∀ j,k   (1.173)
        Δx_j^U Δx_k + Δx_k^L Δx_j − q_jk ≥ Δx_j^U Δx_k^L   ∀ j,k   (1.174)
        Δx^L ≤ Δx ≤ Δx^U   (1.175)
        q^L ≤ q ≤ q^U   (1.176)

where

   [q_jk^L, q_jk^U] = [Δx_j^L, Δx_j^U] × [Δx_k^L, Δx_k^U]

Applying the interval LP relaxation gives the following useful result:

   min c^T ( Δx^{(+)} − Δx^{(−)} )   (1.177)
   s.t. ḡ_i + b̄^{iT} ( Δx^{(+)} − Δx^{(−)} ) + Σ_{j,k} H̄^{(i)}_{jk} ( q_jk^{(+)} − q_jk^{(−)} ) ≤ 0   ∀i   (1.178)
        Δx_k^L Δx_j^{(+)} − Δx_k^U Δx_j^{(−)} − q_jk^{(+)} + q_jk^{(−)} ≤ 0   ∀ j,k   (1.179)
        Δx_k^U Δx_j^{(+)} − Δx_k^L Δx_j^{(−)} − q_jk^{(+)} + q_jk^{(−)} ≥ 0   ∀ j,k   (1.180)
        Δx_j^U ( Δx_k^{(+)} − Δx_k^{(−)} ) + Δx_k^U ( Δx_j^{(+)} − Δx_j^{(−)} ) −
           q_jk^{(+)} + q_jk^{(−)} ≤ Δx_j^U Δx_k^U   ∀ j,k   (1.181)
        Δx_j^L ( Δx_k^{(+)} − Δx_k^{(−)} ) + Δx_k^L ( Δx_j^{(+)} − Δx_j^{(−)} ) −
           q_jk^{(+)} + q_jk^{(−)} ≤ Δx_j^L Δx_k^L   ∀ j,k   (1.182)
        Δx_j^L ( Δx_k^{(+)} − Δx_k^{(−)} ) + Δx_k^U ( Δx_j^{(+)} − Δx_j^{(−)} ) −
           q_jk^{(+)} + q_jk^{(−)} ≥ Δx_j^L Δx_k^U   ∀ j,k   (1.183)
        Δx_j^U ( Δx_k^{(+)} − Δx_k^{(−)} ) + Δx_k^L ( Δx_j^{(+)} − Δx_j^{(−)} ) −
           q_jk^{(+)} + q_jk^{(−)} ≥ Δx_j^U Δx_k^L   ∀ j,k   (1.184)
        Δx^L ≤ Δx^{(+)} − Δx^{(−)} ≤ Δx^U   (1.185)
        q^L ≤ q^{(+)} − q^{(−)} ≤ q^U   (1.186)
        b̄_{s(i)}^{i(+)} Δx_{s(i)}^{(+)} + b̄_{s(i)}^{i(−)} Δx_{s(i)}^{(−)} −
           Σ_{j≠s(i)} ( b̄_j^{i(−)} Δx_j^{(+)} + b̄_j^{i(+)} Δx_j^{(−)} ) −
           Σ_{j,k} ( H̄^{(i)(−)}_{jk} q_jk^{(+)} + H̄^{(i)(+)}_{jk} q_jk^{(−)} ) ≤ 0   ∀i   (1.187)
        q_jk^{(−)} − Δx_k^{L(−)} Δx_j^{(+)} − Δx_k^{U(+)} Δx_j^{(−)} ≤ 0   ∀ (j,k) ∈ U   (1.188)
        q_jk^{(+)} − Δx_k^{U(+)} Δx_j^{(+)} − Δx_k^{L(−)} Δx_j^{(−)} ≤ 0   ∀ (j,k) ∉ U   (1.189)

where U = { (j,k) : σ_jk ≥ 0 }.   (1.190)

The above give two forms of the covering program available to calculate a lower
bound for the NLP over a range of variables. LP (1.177) is much smaller and is
used when the solution point is completely constrained, while the larger (1.144) is
used when there is a constraint null space. This decision is based on the results
from a previous solution. In addition, the solution to these LPs gives a step for the x
variables, Δx, which is used in a line search in the algorithm.

2.6 Adding a Newton Constraint

One problematic feature of the null space covering program is that there is no
constraint relating t_jk to Δx except for the McCormick envelopes (1.151-1.154).
When the program can reduce its objective function by moving away from x̂ in the
direction of the null space, McCormick's constraints and variable bounds are the
only ones that keep it from moving infinitely in the null space direction, and these
constraints do not give a Δx which will lead the iterative algorithm to the global
minimum. In this section, a constraint which can be added to the covering program
to limit motion in the null space will be developed. It is designed to try to produce
a Newton step in the null space.

Assuming that the active set is correct, the LU factors of G used in calculating N
may also be used to calculate the Newton step in the constraint space:

$d^N = -\begin{bmatrix} G_I^{-T} \\ 0 \end{bmatrix} g$    (1.191)

The KKT conditions of the quadratic subproblem in successive quadratic programming are:

$\nabla f(x_k) + \nabla^2 L(x_k)\,\Delta x + \sum_{i=1}^m u_i \nabla g_i(x_k) = 0$    (1.192)

$u_i\left(g_i(x_k) + \nabla g_i(x_k)^T \Delta x\right) = 0$    (1.193)

$u \ge 0$    (1.194)
By premultiplying by $N^T$, the contributions of the active constraints in the first
equality are eliminated:

$N^T\left(\nabla f(x_k) + \nabla^2 L(x_k)\,\Delta x\right) = 0$    (1.195)

If x is broken into its constraint space and null space parts, the following requirement
for a Newton step in the null space is obtained:

$Q\, p^N = -N^T\left(\nabla f(x_k) + \gamma\, d^N\right)$    (1.196)
In constructing the null space quadratic term bounds, the Cholesky factors of Q
are calculated, so it costs little to use equation (1.196) to calculate the Newton
step in the null space. The constraint space and null space Newton steps can be
added to give the complete Newton step which will be used in the line search in the
algorithm:

$p^N = -Q^{-1} N^T\left(\nabla f(x_k) + \gamma\, d^N\right)$    (1.197)

$\Delta x^N = d^N + N p^N$    (1.198)

By assumption Q is a symmetric, positive semi-definite matrix, so

$(p - \bar p)^T Q (p - \bar p) \ge 0$    (1.199)

for all $\bar p$. Choosing the correct values of $\bar p$ will produce a constraint which is valid
and can give a Newton step in the null space direction. Expanding (1.199),

$p^T Q p - 2\bar p^T Q p + \bar p^T Q \bar p \ge 0$    (1.200)

and substituting the LP variables for the quadratic terms, $p^T Q p = \sum_{ij} \gamma_{ij} t_{ij}$, gives

$\sum_{ij} \gamma_{ij} t_{ij} - 2\bar p^T Q p + \bar p^T Q \bar p \ge 0$    (1.201)
The goal is to choose a set of $\bar p$'s such that when constraints (1.200) and (1.201) are
active, the Newton step in the null space given by (1.196) is obtained, so (1.200) and (1.201)
will be treated as equalities below. By scaling each row of (1.196) by $-\alpha_m$, the
value of $\bar p$ which makes (1.200) equivalent to a row of (1.196) when (1.201) is active
can be determined.
$-\alpha_m e^{mT} Q p = \alpha_m e^{mT} N^T\left(\nabla f(x_k) + \gamma\, d^N\right)$    (1.202)

$-2\bar p^T Q p = -\alpha_m e^{mT} Q p$    (1.203)

$\bar p = \frac{\alpha_m}{2}\, e^m$    (1.204)

$\alpha_m = \frac{-4\, e^{mT} N^T\left(\nabla f(x_k) + \gamma\, d^N\right)}{Q_{mm}}$    (1.205)
These values of $\alpha$ and $\bar p$ give a constraint of the following form for each null space
dimension:

$\sum_{ij} \gamma_{ij} t_{ij} - \alpha_m e^{mT} Q p \ge \alpha_m e^{mT} N^T\left(\nabla f(x_k) + \gamma\, d^N\right)$    (1.206)

The combination of constraints (1.206) and (1.137-1.138) constrains the LP's movement
in the null space.
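As an illustration of how the pieces of (1.205)-(1.206) fit together, the following C++ sketch computes the $\alpha_m$ values and the right-hand sides of the Newton constraints from Q and the precomputed vector $r = N^T(\nabla f(x_k) + \gamma d^N)$. All names and container types here are ours, not the authors' implementation.

#include <vector>

// For each null space dimension m, compute alpha_m = -4 r_m / Q_mm
// (equation 1.205), where r = N^T (grad f(x_k) + gamma * d^N), and the
// right-hand side alpha_m * r_m of the Newton constraint (1.206). The
// coefficients on the p variables in (1.206) are -alpha_m * Q[m][j].
void newtonConstraintData(const std::vector<std::vector<double>>& Q,
                          const std::vector<double>& r,
                          std::vector<double>& alpha,
                          std::vector<double>& rhs) {
    const std::size_t dim = r.size();
    alpha.resize(dim);
    rhs.resize(dim);
    for (std::size_t m = 0; m < dim; ++m) {
        alpha[m] = -4.0 * r[m] / Q[m][m];   // Q_mm > 0 after modified Cholesky
        rhs[m] = alpha[m] * r[m];           // alpha_m e_m^T N^T(grad f + gamma d^N)
    }
}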

2.7 Program Size


Bounding the original NLP requires additional variables and constraints, resulting
in a covering program that is a larger LP. In this section, the size of the covering
program including the Newton constraints is related to the problem characteristics.

Given a problem in factorable form (1.1), the following index sets are defined:

$K = \{(j,k) \mid \exists\, i \in 1 \ldots m :\ \bar H^{(i)}_{jk} \ne 0 \ \text{or}\ (j = k,\ j \in F_i)\}$    (1.207)

$T = \{(j,k) \in K \mid j \le k\}$    (1.208)

K is the index set required for $q_{jk}$, and T is the initial index set required for $t_{jk}$.
The cardinality of a set is denoted with the notation $|\cdot|$. The number of active
constraints is $|I_A| + |J_A|$, and the null space has dimension $n - (|I_A| + |J_A|)$. In
Section 1.4, some additional quadratic terms may be added to adjust the Q matrix;
let $a$ be the number of quadratic terms added in this step. It can be shown that
$a \le n - (|I_A| + |J_A|)$.
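As an illustration, these index sets can be assembled in one pass over the problem's sparsity data. The C++ sketch below uses hypothetical container types for the constraint data, not the authors' actual structures.

#include <set>
#include <utility>
#include <vector>

// Sparsity data for one constraint i: the nonzero (j,k) entries of H^(i)
// and the set F_i of variables appearing in single variable functions.
struct ConstraintSparsity {
    std::vector<std::pair<int,int>> hessianNonzeros; // (j,k) with H^(i)_jk != 0
    std::set<int> singleVariableTerms;               // F_i
};

// Build K (indices needed for q_jk) and T (indices needed for t_jk),
// following definitions (1.207) and (1.208).
void buildIndexSets(const std::vector<ConstraintSparsity>& cons,
                    std::set<std::pair<int,int>>& K,
                    std::set<std::pair<int,int>>& T) {
    for (const ConstraintSparsity& c : cons) {
        for (auto jk : c.hessianNonzeros) K.insert(jk);
        for (int j : c.singleVariableTerms) K.insert({j, j});
    }
    for (auto jk : K)
        if (jk.first <= jk.second) T.insert(jk);  // T = {(j,k) in K : j <= k}
}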
For the first form of the covering program (1.144), the number of variables and rows
can be calculated and bounded as follows:

Number Variables $= 2(n + |I_A| + |J_A| + |K| + |T| + a)$

Number Variables $\le 2(2n + |K| + |T|)$

Number Rows $= m + 2n + |I_A| + |J_A| + 9|K| + 3(|T| + a)$

Number Rows $\le m + 5n + 9|K| + 3|T|$

Figure 5 Number of extra constraints versus number of problem variables (log-log plot; curves show actual size, worst case, and best fit)

The second form of the covering program (1.177) always has the same number of
variables and rows because it does not depend on the null space. The number of
variables and rows required is given by

Number Variables $= 2(n + |K|)$

Number Rows $= m + n + 7|K|$.

A completely dense problem presents the worst case for covering program size,
giving $|K| = n^2$ and $|T| = \frac{1}{2}n(n+1)$. Figure 5 shows a comparison of the number
of extra constraints (in addition to the m original constraints) required in the null
space version of the covering program applied to the test problems used in the next
chapter against the worst case behavior. The slope of the line of best fit through the
test points is 1.03, indicating linear growth in the number of extra constraints with
respect to the number of original problem variables.

3 CONCLUSIONS

This chapter develops a covering LP for the underestimation of an NLP in factorable
form. Its application is presented in the following chapter. This bounding method
offers a way to apply branch and bound to a large class of problems. It is capable
of producing the exact lower bound for a region of finite size when constructed at
the region's global minimizer, and so can provide finite termination of branch and
bound for nonconvex NLPs without requiring $2^n$ subproblems.

The size of the covering program is a linear function of the number of variables, the
number of original constraints, and the number of unique nonlinear terms appearing
quadratically or in single variable functions. For dense problems, the number of
bilinear terms is proportional to $n^2$, so the covering program may be too large
for conventional solvers. However, the majority of engineering problems will be
sparse, and for these the number of additional constraints grows linearly with n, as
suggested by the test problem set.

As a secondary result, a method of treating linear equations with interval coefficients
by a system of linear inequalities is described. Our experience with these
systems shows that the supplementary constraints presented significantly improve
the tightness of the bounds obtained.

Acknowledgements

This work was supported by the Computational Science Graduate Fellowship Pro-
gram of the Office of Scientific Computing in the Department of Energy. The Na-
tional Science Foundation also provided partial support under grant DDM-8619582.

REFERENCES

[1] F. A. Al-Khayyal. Generalized bilinear programming: Part I. Models, applications and linear programming relaxation. European Journal of Operational Research, 60:306-314, 1992.

[2] J. E. Cope and B. W. Rust. Bounds on solutions of linear systems with inaccurate data. SIAM Journal on Numerical Analysis, 16(6):950-963, December 1979.

[3] A. Deif. Sensitivity Analysis in Linear Systems. Springer-Verlag, 1986.

[4] J. E. Falk and R. M. Soland. An algorithm for separable nonconvex programming problems. Management Science, 15(9):550-569, May 1969.

[5] C. A. Floudas and P. M. Pardalos. A Collection of Test Problems for Constrained Global Optimization Algorithms, volume 455 of Lecture Notes in Computer Science. Springer-Verlag, 1990.

[6] C. A. Floudas and P. M. Pardalos, editors. Recent Advances in Global Optimization. Princeton Series in Computer Science. Princeton University Press, 1992.

[7] C. A. Floudas and V. Visweswaran. A global optimization algorithm (GOP) for certain classes of nonconvex NLPs - I. Theory. Computers and Chemical Engineering, 14(12):1397-1417, 1990.

[8] A. M. Geoffrion. Generalized Benders decomposition. Journal of Optimization Theory and Applications, 10:237-260, 1972.

[9] E. R. Hansen. Global Optimization using Interval Analysis. Marcel Dekker, 1992.

[10] R. Horst. Deterministic methods in constrained global optimization: Some recent advances and new fields of application. Naval Research Logistics Quarterly, 37:433-471, 1990.

[11] R. Horst and P. M. Pardalos. Handbook of Global Optimization. Kluwer, 1995.

[12] R. Horst and H. Tuy. Global Optimization, Deterministic Approaches. Springer-Verlag, 1990.

[13] G. P. McCormick. Computability of global solutions to factorable nonconvex programs: Part I - Convex underestimating problems. Mathematical Programming, 10:147-175, 1976.

[14] G. P. McCormick. Nonlinear Programming, Theory, Algorithms, and Applications. John Wiley & Sons, 1983.

[15] C. M. McDonald and C. A. Floudas. Decomposition based and branch and bound global optimization approaches for the phase equilibrium problem. Journal of Global Optimization, 5:205-251, 1994.

[16] A. Mfayokurera. Nonconvex phase equilibria computations by global minimization. Master's thesis, University of Wisconsin-Madison, June 1989.

[17] B. A. Murtagh and M. A. Saunders. MINOS 5.1 user's guide. Technical Report SOL 83-20R, Systems Optimization Laboratory, Stanford University, Stanford, CA 94305-4022, January 1987.

[18] A. Neumaier. Interval Methods for Systems of Equations. Encyclopedia of Mathematics and Its Applications. Cambridge University Press, 1990.

[19] W. Oettli. On the solution set of a linear system with inaccurate coefficients. SIAM Journal on Numerical Analysis, 2(1):115-118, 1965.

[20] W. Oettli, W. Prager, and J. H. Wilkinson. Admissible solutions of linear systems with not sharply defined coefficients. SIAM Journal on Numerical Analysis, 2(2):291-299, 1965.

[21] P. M. Pardalos. Global optimization algorithms for linearly constrained indefinite quadratic problems. Computers and Mathematics with Applications, 21(6/7):87-97, 1991.

[22] P. M. Pardalos and J. B. Rosen. Constrained Global Optimization: Algorithms and Applications, volume 268 of Lecture Notes in Computer Science. Springer-Verlag, 1987.

[23] I. Quesada and I. E. Grossmann. Global optimization algorithm for heat exchanger networks. Industrial Engineering and Chemical Research, 32:487-499, 1993.

[24] I. Quesada and I. E. Grossmann. A global optimization algorithm for linear fractional and bilinear programs. Journal of Global Optimization, 6:39-76, 1995.

[25] H. Ratschek and J. Rokne. New Computer Methods For Global Optimization. Mathematics and its Applications. Ellis Horwood Limited, 1988.

[26] H. Ratschek and J. Rokne. Interval tools for global optimization. Computers and Mathematics with Applications, 21(6/7):41-50, 1991.

[27] H. Ratschek and R. L. Voller. What can interval analysis do for global optimization? Journal of Global Optimization, 1:111-130, 1991.

[28] G. V. Reklaitis, A. Ravindran, and K. M. Ragsdell. Engineering Optimization: Methods and Applications. John Wiley & Sons, Inc., 1983.

[29] H. S. Ryoo and N. V. Sahinidis. Global optimization of nonconvex NLPs and MINLPs with applications in process design. Computers and Chemical Engineering, 19(5):551-566, 1995.

[30] R. B. Schnabel and E. Eskow. A new modified Cholesky factorization. SIAM Journal on Scientific and Statistical Computing, 11(6):1136-1158, November 1990.

[31] H. D. Sherali and A. Alameddine. A new reformulation-linearization technique for bilinear programming problems. Journal of Global Optimization, 2:379-410, 1992.

[32] H. D. Sherali and C. H. Tuncbilek. A global optimization algorithm for polynomial programming problems using a reformulation-linearization technique. Journal of Global Optimization, 2:101-112, 1992.

[33] R. M. Soland. An algorithm for separable nonconvex programming problems II: Nonconvex constraints. Management Science, 17(11):759-773, July 1971.

[34] R. E. Swaney. Global solution of algebraic nonlinear programs. AIChE Annual Meeting (Chicago, IL 1990). Publication pending.

[35] R. Vaidyanathan and M. El-Halwagi. Global optimization of nonconvex nonlinear programs via interval analysis. Computers and Chemical Engineering, 18(10):889-897, 1994.

[36] V. Visweswaran and C. A. Floudas. A global optimization algorithm (GOP) for certain classes of nonconvex NLPs - II. Application of theory and test problems. Computers and Chemical Engineering, 14(12):1419-1434, 1990.

[37] V. Visweswaran and C. A. Floudas. New properties and computational improvement of the GOP algorithm for problems with quadratic objective functions and constraints. Journal of Global Optimization, 3:439-462, 1993.

[38] A. W. Westerberg and J. V. Shah. Assuring a global optimum by the use of an upper bound on the lower (dual) bound. Computers and Chemical Engineering, 2:83-92, 1978.
2
BRANCH AND BOUND FOR
GLOBAL NLP: ITERATIVE LP
ALGORITHM & RESULTS
Thomas G. W. Epperly
Ross E. Swaney
Department of Chemical Engineering
University of Wisconsin
Madison, Wisconsin

This chapter presents a branch and bound algorithm for global solution of nonconvex
nonlinear programs. The algorithm utilizes the covering program developed in
the previous chapter to compute bounds over rectangular domain partitions. An
adaptive rectangular partitioning strategy is employed to locate and verify a global
solution.

Two versions of the algorithm are presented which differ in how they search for
feasible points. Version 1 uses MINOS 5.4 [4] to search for local minima. Version
2 employs an iterative strategy using a search direction obtained from the covering
program as well as an approximate Newton step.

Section 1 describes the algorithm in detail. Section 2 reports our results in applying
the algorithm to a number of test problems and engineering design problems, and
we give some brief conclusions in Section 3.

1 THE ALGORITHM
The algorithm uses the covering program to calculate a lower bound for the NLP,
and uses the objective function at feasible points as an upper bound on the solution
of the problem. When the lower bound for the NLP equals the upper bound, the
problem is solved. If the algorithm is applied to a problem without a solution, the
algorithm determines this by eliminating all subsets of the variable domain because
of infeasibility. The overall goal during the algorithm is to increase the lower bound
and decrease the upper bound until they meet.

In difficult problems, the covering program applied to the original domain will
provide a lower bound significantly lower than the global optimum, so the branch
and bound algorithm must resort to an adaptive domain partitioning strategy to
increase the lower bound. The original variable domain is split into a list of subsets
which are bounded separately. The lower bound for the NLP is the least lower
bound of all of the regions in the list. Any member of the list can be removed if its
lower bound is greater than or equal to the current upper bound or if its covering
program is infeasible. The algorithm proceeds by bounding the region with the
least lower bound, and if it cannot be ruled out, it is removed from the list and split
into two subsets which are added to the list. As the region sizes decrease, the lower
bound increases, until all other regions can be ruled out and the global optimum is
verified, or all of the regions are infeasible.

Decreases in the upper bound are obtained in the course of searching for better
feasible points. The program stores the best feasible point and its objective function
value. Each time the problem constraints are evaluated at a point, the program
checks to see if it is a better feasible point than the stored one, and if it is, it replaces
the stored one. The systematic search for better feasible points is accomplished
by the region analysis procedure. Two different versions of the region analysis
procedure have been developed for comparison. The first is a modification of the
algorithm presented by Swaney [6] and uses MINOS 5.4 [4] to search for local
minima of the NLP. The second uses a line search in the direction provided by the
covering program solution, and the direction provided by a Newton step if available,
to search for new feasible points.

The algorithm presentation below will be organized in three main parts: the branch
and bound loop, the variable splitting method, and the region analysis procedure.

1.1 The Branch and Bound Loop

The branch and bound loop is the manager for the algorithm. It organizes and
maintains the region list, and it makes the choice of which region to analyze next.
It keeps track of the current upper and lower bounds, and prunes the region list
when needed. All of the other procedures operate under its control.

One of the primary functions of the branch and bound loop is to manage the region
list. Each element in the region list contains the specification of a region describing
the lower and upper bounds for each variable and a lower bound on the objective
function within that region. This list is kept sorted in order of increasing lower
bounds, and the next region to be analyzed is always taken from the top of the
list. This sorting and region selection corresponds to analyzing the region with
the lowest lower bound first. The algorithm has also been operated in a highest
lower bound first manner, in a last in first out (LIFO) manner, and first in first out
(FIFO) manner. However, the lowest lower bound appears to be the best based on
tests conducted early in the algorithm development.
The region list is initialized with the variable bounds from the problem specification,
and the best objective function value is initialized to a large value. The algorithm
terminates when the region list is empty.

The procedures that evaluate the objective function and constraints keep track of
the best feasible point. Each time the constraints are evaluated, the procedure
checks if the point is feasible (not violating any of the constraints by more than
$\epsilon^{0.55} = 2.4578 \times 10^{-9}$, where $\epsilon$ is the machine precision) and if it is better than the
current best point. The evaluation procedures also store the gradient of the
constraints at the best feasible point to avoid recalculating them. The branch and
bound loop can poll the evaluation routines to obtain the best objective function
value, the best point, and the number of feasible points found.

The objective function value of the best feasible point can be used to remove el-
ements from the region list. Any region list element with a lower bound greater
than or equal to the current best objective function value cannot contain a better
feasible point, so it is removed from the region list (pruned).

Here is a summary of the algorithm:

Branch and bound loop


Initialize search list
While Search list not empty
1. Remove the top of the search list to be treated as
the current region.
2. Choose a point inside the current region using the current
point or the midpoint of the region
3. Analyze the current region (using region analysis procedure)
4. Prune region list if a better feasible point was found
5. If the current region is not pruned due to bound or
infeasibility, choose a split variable and a split
location and add the two new regions to the region list.

These steps are described further below.
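The loop structure can be sketched in C++ as follows. This is an illustrative skeleton only: analyzeRegion and splitRegion are hypothetical stand-ins for the region analysis and splitting procedures described below, and only the pruning test is written out.

#include <list>
#include <vector>

struct Region {
    std::vector<double> lo, up; // variable bounds defining the region
    double lowerBound;          // lower bound on the objective in the region
};

// Stand-ins for the procedures described in Sections 1.2-1.4.
void analyzeRegion(Region& r, double& upperBound);
void splitRegion(const Region& r, Region& left, Region& right);

// Pruning test from Section 1.1: finite precision tolerance of 1e-4.
bool prunable(const Region& r, double upperBound) {
    return upperBound - r.lowerBound <= 1e-4;
}

// Keep the list sorted by increasing lower bound; the front is analyzed next.
void insertSorted(std::list<Region>& regions, const Region& r) {
    auto it = regions.begin();
    while (it != regions.end() && it->lowerBound < r.lowerBound) ++it;
    regions.insert(it, r);
}

void branchAndBound(std::list<Region>& regions, double& upperBound) {
    while (!regions.empty()) {
        Region current = regions.front();        // region with least lower bound
        regions.pop_front();
        analyzeRegion(current, upperBound);      // bound it; may improve upperBound
        regions.remove_if([&](const Region& r) { // prune dominated regions
            return prunable(r, upperBound);
        });
        if (!prunable(current, upperBound)) {    // otherwise split and re-insert
            Region left, right;
            splitRegion(current, left, right);
            insertSorted(regions, left);
            insertSorted(regions, right);
        }
    }
}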

Step 2: Point choosing scheme

The covering program requires a point around which to construct its bounds. In this
step, the algorithm first checks if the best feasible point is contained in the region
of interest; if so, it chooses the best feasible point. Otherwise, the algorithm adapts
the point used in the previous iteration (the "current point"). The algorithm checks
if each component Xi of the current point is inside the bounds for that variable in the
current region. If it is inside the bounds, its value remains unchanged; otherwise,

Xi is assigned the value of the average of the lower and upper bounds for Xi. If the
current point is inside the region, region analysis can be started without having to
reevaluate the objective function, constraints, and the constraint gradients.
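A sketch of this per-component adaptation (the function name is ours):

#include <vector>

// Adapt the previous iterate to the current region: components already
// inside the bounds are kept; others are moved to the bound midpoint.
void adaptPoint(std::vector<double>& x,
                const std::vector<double>& lo,
                const std::vector<double>& up) {
    for (std::size_t i = 0; i < x.size(); ++i)
        if (x[i] < lo[i] || x[i] > up[i])
            x[i] = 0.5 * (lo[i] + up[i]);
}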

Step 4: Prune the region list

When a new feasible point is found, the algorithm checks each element in the region
list to see if it can be pruned. Any region whose bound is greater than the current
upper bound can be removed from the list. For finite precision mathematics, a region
is pruned if (upper bound $-$ lower bound) $\le 1 \times 10^{-4}$. The problems are scaled using
a heuristic method, described below, which attempts to give the objective function an
order of magnitude of $10^0$. This pruning criterion guarantees the objective function
to a high enough tolerance for most engineering applications. However, problems
with many local minima with objective function values very close to each other may
require a tighter tolerance.

1.2 Step 5: Splitting the region

The method for choosing which variable to split and where it should be split has a
large effect on the performance of the overall algorithm. Each time the algorithm
splits it creates two more regions that may need to be stored and processed or
pruned, so it is important that each split be chosen judiciously. Otherwise, the
algorithm may just generate more work without improving the lower bound. The
goal of the method presented here is to choose the variable which will most greatly
improve the objective function bound.

The method used is analogous to the method employed in [6]. The starting idea is
to choose the variable whose bounds most greatly affect the value of the covering
program objective function. The effect on the objective function of changes in
the bounds is estimated from sensitivity analysis of the solution to the covering
program. Given the following program, which depends on certain parameters $a$,

$\min_x\ \varphi(x)$
$\text{s.t.}\quad g(x, a) \le 0$

and the solution $\varphi^* = \varphi(x^*)$, the effect of a change in the parameters, $\Delta a$, can be
estimated as follows:

$\Delta \varphi^* \approx u^{*T} \left.\dfrac{\partial g}{\partial a^T}\right|_{x = x^*} \Delta a$    (2.1)
where $u^*$ are the Lagrange multipliers at the solution. For the variable splitting
procedure, the parameters are the variable bounds and the program is either (1.144)
or (1.177). For the null space program, constraints (1.145-1.146) are affected by
changes in variable bounds through changes in $\bar H^{(i)}$, as are (1.147-1.154), (1.158-1.161),
and (1.164-1.165); similarly, for the full rank program, constraints (1.178-1.186)
and (1.188-1.189) are affected. To improve efficiency, $\partial g/\partial a^T|_{x=x^*}$ is only determined
for the active constraints, so the effect of a bound change may be estimated within
a reasonable effort.
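In code, the estimate (2.1) for a single bound parameter reduces to a weighted sum over the active constraints. A sketch under our own naming, where dgda holds the derivatives of each active constraint with respect to the bound being changed:

#include <vector>

// Estimate the change in the optimal objective caused by a bound change
// deltaA, per equation (2.1): sum over active constraints of
// u_i * (dg_i/da) * deltaA.
double estimateBoundEffect(const std::vector<double>& u,     // multipliers (active rows)
                           const std::vector<double>& dgda,  // dg_i/da (active rows)
                           double deltaA) {
    double effect = 0.0;
    for (std::size_t i = 0; i < u.size(); ++i)
        effect += u[i] * dgda[i] * deltaA;
    return effect;
}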

The process of splitting takes place in two steps. First the variable to split is chosen,
and then the actual location of the split is chosen. In the first step, it is necessary
to choose a hypothetical location for the split to occur. For each variable, a point
in its domain, $x_i^c$, is chosen, and the effects of changing the bounds from $[x_i^L, x_i^U]$ to
$[x_i^L, x_i^c]$ and $[x_i^c, x_i^U]$ are estimated. $x_i^c$ is chosen in the following manner:

$x_i^c = \begin{cases} x_i & \text{if } x_i^L < x_i < x_i^U \\ x_i + \Delta x_i & \text{otherwise, if } x_i^L < x_i + \Delta x_i < x_i^U \\ \frac{1}{2}(x_i^L + x_i^U) & \text{if neither of the above is satisfied} \end{cases}$    (2.2)

where $\Delta x_i$ comes from the solution of the appropriate covering program.

The effects of changing each variable's lower and upper bound are estimated and
labeled $\Delta_i^L$ and $\Delta_i^U$ respectively. For purposes of choosing the variable to split, it
is better to reduce the contribution of the simple variable bounds on the x and z
variables, constraints (1.158-1.159) or (1.185-1.186), to a small percentage of their
contribution. The goal of splitting is to improve the bounds on the quadratic terms,
and the effect of simple variable bounds should be secondary. The results presented
in Section 2 are for an algorithm using 1% of the contribution from the simple
variable bounds. The estimates with the reduced contribution from the simple
variable bounds are called $\Delta_i^{L'}$ and $\Delta_i^{U'}$. The variable to split on is chosen by the
following:

$\ell = \arg\max_i\ \tfrac{1}{2}\left(\Delta_i^{L'} + \Delta_i^{U'}\right)$    (2.3)

If $\Delta_i^L$ and $\Delta_i^U$ should be zero, the nonlinear variable with the widest domain is
bisected, and Step 5 ends, causing the algorithm to continue with Step 1.

When $\Delta_\ell^L$ and $\Delta_\ell^U$ are not zero, the algorithm switches to the task of choosing
the location of the split. If an upper bound for the problem is available, the
algorithm attempts to determine a split which will result in one region that can be
eliminated by a lower bound increase and one region that still might contain the
global minimum. This procedure may fail if the estimated bound increase is not
large enough to predict that part of the region may be eliminated, or if it requires
a division by a number near zero. If this procedure fails or if an upper bound
is not available, the algorithm will split at $x_\ell^c$ as long as it satisfies the following
conditions:

$x_\ell^L + 0.2(x_\ell^U - x_\ell^L) \le x_\ell^c \le x_\ell^U - 0.2(x_\ell^U - x_\ell^L)$

Otherwise, the variable is split at $\frac{1}{2}(x_\ell^L + x_\ell^U)$.
When an upper bound is available, the estimated increase of the covering program
objective function can be used to predict a split location yielding one region that
can be eliminated and another that may contain the solution. The lower bound
given by the covering program solution is $x_0 + \Delta x_0$, and the estimated new bound
as a function of the split location, $x_\ell^s$, can be written as

$x_0^{est} = x_0 + \Delta x_0 + \Delta_\ell^L\, \dfrac{x_\ell^s - x_\ell^L}{x_\ell^c - x_\ell^L}$

The goal is for $x_0^{est}$ to meet or exceed the upper bound, $UB$, so by substituting in
for $x_0^{est}$, the following equations for $x_\ell^s$ are obtained:

$x_\ell^s = \phi\, \dfrac{UB - (x_0 + \Delta x_0)}{\Delta_\ell^L}\,(x_\ell^c - x_\ell^L) + x_\ell^L$    (2.4)

$x_\ell^s = x_\ell^U - \phi\, \dfrac{UB - (x_0 + \Delta x_0)}{\Delta_\ell^U}\,(x_\ell^U - x_\ell^c)$    (2.5)

Here $\phi$ is a small multiplier ($1 \le \phi \le 1.2$) to overestimate the cut needed. If the
estimate for one of the bound changes lies outside the variable domain or if $\Delta_\ell$ is
zero, this analysis for that bound cannot predict a split that will eliminate part of
the region. The splitting algorithm chooses the value of $x_\ell^s$ which will eliminate
the largest piece of the region. For a lower bound change, $x_\ell^U - x_\ell^s$ is predicted to be
eliminated, and for an upper bound change, $x_\ell^s - x_\ell^L$ is predicted to be eliminated.
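As a sketch of this logic under our own naming (with gap standing for $UB - (x_0 + \Delta x_0)$ and xC for the candidate point $x_\ell^c$), the split location selection might be implemented as follows:

// Choose the split location for variable l per equations (2.4)-(2.5):
// try to place the split so one child region is predicted to be pruned,
// preferring the candidate that eliminates the larger piece. Fall back
// to the candidate point xC (if not too close to a bound) or bisection.
double splitLocation(double xL, double xU, double xC,
                     double deltaL, double deltaU,   // estimated bound increases
                     double gap,                     // UB - (x0 + dx0)
                     double phi = 1.1) {
    double sL = (deltaL > 0.0) ? xL + phi * gap / deltaL * (xC - xL) : xL; // (2.4)
    double sU = (deltaU > 0.0) ? xU - phi * gap / deltaU * (xU - xC) : xU; // (2.5)
    bool okL = sL > xL && sL < xU;   // predicted to eliminate [sL, xU]
    bool okU = sU > xL && sU < xU;   // predicted to eliminate [xL, sU]
    if (okL && okU) return (xU - sL >= sU - xL) ? sL : sU;
    if (okL) return sL;
    if (okU) return sU;
    double margin = 0.2 * (xU - xL); // keep the split away from the bounds
    if (xC >= xL + margin && xC <= xU - margin) return xC;
    return 0.5 * (xL + xU);          // bisect
}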

After the variable and location have been recommended, the algorithm checks the
ratio of the width of the recommended variable to the width of the widest nonlinear
variable normalized by their original widths. If the recommended variable's nor-
malized width is less than one one-hundredth of the widest nonlinear variable's nor-
malized width, the algorithm overrides the recommendation and bisects the widest
nonlinear variable.

After the splitting method has chosen a variable and a split location, it needs to
add the two new regions to the branch and bound list. The new regions take their
lower bound on the objective function from the region they subdivided because
their lower bounds have to be at least as high as the region from which they come.
If their bound is the same as the first element in the list, they are added to the
front of the list; otherwise, they are inserted in order of increasing lower bound.
The region which includes the current point (the point around which the covering
program was constructed) is added second (on top). If this region has the least
lower bound, the covering program of the next iteration can be constructed without
having to evaluate the objective function, constraints, and their gradients again.
This is not necessary for the algorithm to work, but reduces the number of function
evaluations.

1.3 The Region Analysis Procedure, Version 1

This region analysis procedure is a modification of the one presented by Swaney [6].
It is different because it uses MINOS 5.4 [4] as its local NLP solver and because it
uses the generalized covering program.

Region analysis procedure, Version 1


1. Call MINOS 5.4 to solve the NLP (1.1)
restricted to [XL, xU] using the point provided
by the branch and bound loop as a starting place.
2. Construct the covering problem at the point found by
MINOS 5.4, and return the results to the branch and bound
loop.

Because the generalized covering program is used, this region analysis loop can still
provide a lower bound even when MINOS 5.4 does not find a feasible point. If the
problem is infeasible, the covering program will become infeasible when the region
size is small enough. The details of constructing the covering program are shown
in the next section.

1.4 The Region Analysis Procedure, Version 2

The second version of the region analysis procedure is a new approach combining
a search direction derived from the covering program and a Newton like search
direction. The covering program provides a search direction based on the global
character of the problem which may provide an improved point when the local
methods fail. The local step is used to provide quick convergence when close to a
local minimum.

The region analysis procedure is designed to find a global minimizer of a particular


region and to provide a lower bound on the optimal objective function value in the
region. It is also desirable for the loop to find feasible points as soon as possible.

Region analysis procedure, Version 2


Loop
1. Construct and solve the covering program at the current point.
2. If the covering program is infeasible or if the lower bound
   allows this region to be pruned, exit the loop.
3. Update the region's lower bound estimate from the covering program.
4. If the covering program was stationary ($\|\Delta x\| = 0$):
   If the current point is feasible, exit the loop (optimum found).
   Else, exit because of zero search direction.
5. Perform a line search.
6. Check if the current region can be pruned.

Step 1: Constructing the covering program

To construct the appropriate covering program, a point and an estimate of the


Lagrange multipliers at that point are needed. If an estimate of the Lagrange
multipliers is not available (as on the first iteration), the interval LP with no null
space (1.177) can be used. After solving each covering program, the algorithm stores
the Lagrange multipliers for use by the next covering program. When working with
the null space covering program, the Lagrange multiplier estimates for the general
constraints come from the LP Lagrange multipliers from constraints (1.145-1.146)
and (1.162). When working with the full rank covering program, the Lagrange
multiplier estimates come from (1.178) and (1.187).

Given Lagrange multiplier estimates, the algorithm first counts the number of active
constraints as determined by the value of the multipliers and MINOS' basis array. If
the number of active constraints is less than the number of variables, the algorithm
constructs the null space covering program; otherwise, it constructs the full rank
covering program.

Constructing the full rank program (1.177) is relatively straightforward. Given the
current point and the bounds on the variables, constructing the LP does not require
any difficult operations.

Constructing the null space program (1.144) requires more computational effort.
First the algorithm constructs the matrix G, defined by (1.12), whose columns are
gradients of the active constraints determined from the Lagrange multipliers. Next
the LU factors of G are calculated using LUSOL [3], the sparse LU factorization
routine from MINOS 5.4 [4]. If G is not of full rank, the linearly dependent con-
straints are removed from the active set, and G is recalculated until a set of linearly
independent active constraint gradients is found.

Once a G of full rank has been found, the algorithm uses it to calculate new estimates
of the multipliers. Assuming that the current point is a Karush-Kuhn-Tucker point,

$c + Gu = 0$

which is solved for u using the LU factors of G. These updated u estimates are
checked to make sure they are of the correct sign. If a sign is incorrect, the associated
constraint is removed from the active set, and a new G is used.

Using the LU factors of G, a basis N for the null space of the active constraints is
calculated:

$N = \begin{bmatrix} -G_I^{-T} G_{\bar I}^T \\ I \end{bmatrix}$    (1.13)

where $G_I$ are the most linearly independent rows of G as determined by the row
permutation provided by the LU factorization routines. The Newton step in the
constraint space, $d^N$, is computed from the LU factors of G using (1.191).

Once N has been calculated, the projection matrix $N(N^T N)^{-1} N^T$ is calculated
using the QR factorization of N, from which $N(N^T N)^{-1} N^T = Q_N Q_N^T$. This
matrix is calculated and stored using full matrix routines.

Next ""y is calculated as Li UiH(i). Matrix Q is calculated from ""Y according to the
definition (1.36). Q is stored and processed as a full matrix. Q is factorized using
a modified Cholesky factorization enforcing positive definiteness [5] which adds the
smallest possible diagonal matrix needed to Q to make it positive definite. That
procedure was modified to add a larger diagonal matrix to avoid scaling problems
in the LP. The diagonal adjustment is propagated to ""Y using (1.44). From Q, r] is
calculated using full matrix computations using equation (1.39), and from r], the
required elements of p-l and p-l are calculated using (1.40). The Cholesky factors
of Q are also used to calculate the Newton step in the null space pN, using (1.197)
which gives the total Newton step of l1x N = d N + NpN. The a's for the Newton
constraints are calculated using equation (1. 205).

From this information, the null space covering program can be constructed and
solved using MINOS 5.4.

Step 5: The line search

The algorithm employs a special kind of line search that can search two directions
simultaneously. In some cases, the algorithm will have a search direction from the
covering program, $\Delta x^c$, and a Newton search direction, $\Delta x^N$; sometimes it
will have only one of the two. When the full rank bound is used, only $\Delta x^c$ is
available, and when the covering program gives a zero step or cannot be solved by
MINOS 5.4 (which occurs very rarely), only the Newton step is available.

The line search uses the same merit function for evaluating progress in either search
direction. The merit function contains the objective function and a weighted sum
of the infeasibility:

$M(x) = c^T x + \sum_i w_i \max\{0,\ g_i(x)\}$    (2.6)

The weighting factors $w$ are a kind of moving average of the Lagrange multipliers.
$w_{k+1}$ is calculated from $w_k$ using the first rule that applies in the following:

$w_{k+1,i} = \begin{cases} \ldots & \text{if } u_i = 0 \\ \ldots & \text{if } w_{k,i} = 0 \\ \ldots & \text{otherwise} \end{cases}$    (2.7)

This definition allows the weights to be adjusted without drastic changes that can
cause cycling.
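A minimal sketch of the merit function (2.6) as reconstructed above, assuming the linear objective $c^T x$ of form (1.1) has already been evaluated:

#include <algorithm>
#include <vector>

// Merit function (2.6): objective value plus a weighted sum of the
// constraint violations max{0, g_i(x)}.
double meritFunction(double objective,
                     const std::vector<double>& g,   // constraint values g_i(x)
                     const std::vector<double>& w) { // infeasibility weights
    double penalty = 0.0;
    for (std::size_t i = 0; i < g.size(); ++i)
        penalty += w[i] * std::max(0.0, g[i]);
    return objective + penalty;
}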

The line search first calculates the directional derivative for each available search
direction. If the Newton directional derivative is nonnegative, which may occur if
the active set is wrong, the Newton step is ignored. When the directional derivative
is negative, an Armijo criterion [1, Section 8.3] is used as a terminating condition.
Otherwise, a fixed decrease is required for the step to be accepted. The line search
takes a step with each available direction, and then pursues the one that has the
best merit function value.

After taking the first step, if the directional derivative is negative, the minimum
of a quadratic approximation of the merit function is used to calculate a new step
length. If the directional derivative is positive, the step is decreased by a factor of
0.4 each iteration. The line search continues until an improved point is found or
until a maximum of 8 iterations have been taken.
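The following C++ sketch shows only the backtracking skeleton for a single direction with the fixed-decrease acceptance test; the actual procedure also applies the Armijo test and quadratic interpolation described above, and runs both directions before committing to one. All names are ours.

#include <cstddef>
#include <vector>

// Backtrack along dx from x until the merit function improves, trying
// at most 8 step lengths. Returns true and updates x on success.
bool lineSearchDirection(std::vector<double>& x,
                         const std::vector<double>& dx,
                         double (*merit)(const std::vector<double>&)) {
    const double m0 = merit(x);
    double step = 1.0;
    for (int iter = 0; iter < 8; ++iter) {
        std::vector<double> trial(x.size());
        for (std::size_t i = 0; i < x.size(); ++i)
            trial[i] = x[i] + step * dx[i];
        if (merit(trial) < m0) { x = trial; return true; }
        step *= 0.4;  // fixed reduction; the real code may interpolate
    }
    return false;     // no improvement found along this direction
}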

1.5 Problem Scaling


As with many numerical algorithms, this algorithm can fail or perform poorly if the
problems are not well scaled. The termination criteria and matrix pivot elements
are chosen using absolute tolerances, so it is important that the variables, objective
function value, and constraint values be of approximately the same order of mag-
nitude from problem to problem. It is necessary to develop a method to scale the
variable values, objective function value, and constraint values.

This algorithm uses a simple, a priori procedure to scale the variables and function
values. After a problem has been defined as published, the scaling procedure is
applied to compute variable and function scales, which are added to the problem
definition file before executing the branch and bound algorithm.

The variable scale is based on the order of magnitude of the average of the upper
and lower bound. Here are the definitions of some of the functions used and how
the scale for variable i is determined:

$\operatorname{trunc}(y) = \begin{cases} \max\{i \in \text{Integers} \mid i \le y\} & \text{if } y \ge 0 \\ \min\{i \in \text{Integers} \mid i \ge y\} & \text{if } y < 0 \end{cases}$    (2.8)

$\operatorname{Magnitude}(y) = \begin{cases} 10^{\operatorname{trunc}(\log_{10}|y|)} & \text{if } y \ne 0 \\ 1 & \text{if } y = 0 \end{cases}$    (2.9)

$\operatorname{VarScale}(i) = \operatorname{Magnitude}\!\left(\dfrac{\underline{x}_i + \bar{x}_i}{2}\right)$    (2.10)

To determine the constraint scale, 100 points are chosen at random, uniformly distributed
in the variable domain determined by the upper and lower bounds, and
the magnitudes are estimated from the averages of the largest magnitude term in
each expression, as explained below. The random points are designated $x^k$ for k
ranging from 1 to 100. It is assumed that problems are in the form (1.1) with one
possible modification: if the objective function of the original problem specification
is linear, the linear objective function is used as published, and $x_0$ is not introduced
as a dummy objective variable.

$\operatorname{ObjectiveMax}(x) = \max_{j \in 1 \ldots n} |c_j x_j|$    (2.11)

$\operatorname{ObjectiveScale} = \operatorname{Magnitude}\!\left(\dfrac{1}{100} \sum_{k=1}^{100} \operatorname{ObjectiveMax}(x^k)\right)$    (2.12)

Maximums and absolute values give the scale of the largest term in the expression.
The method for the constraints is similar:

$\operatorname{ConstMax}(i, x) = \max\left\{\max_{j \in 1 \ldots n} |b_j^i x_j|,\ \max_{j,k \in 1 \ldots n} |x_j \bar H_{jk}^{(i)} x_k|,\ \max_{j \in F_i} |f_j(x_j)|\right\}$    (2.13)

$\operatorname{ConstScale}(i) = \operatorname{Magnitude}\!\left(\dfrac{1}{100} \sum_{k=1}^{100} \operatorname{ConstMax}(i, x^k)\right)$    (2.14)

Without this scaling procedure, the algorithm fails to solve several of the test problems.
The scales produced by this method are sufficient for the set of test problems,
and this is the only method that has been tested.
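The variable scaling reduces to a few lines of code. The sketch below implements trunc, Magnitude, and VarScale per (2.8)-(2.10); the constraint scales are computed analogously by averaging ConstMax over the 100 sample points. Function names are ours.

#include <cmath>

// Order of magnitude per (2.8)-(2.9). std::trunc rounds toward zero,
// which matches definition (2.8) exactly (floor for y >= 0, ceiling for y < 0).
double magnitude(double y) {
    if (y == 0.0) return 1.0;
    return std::pow(10.0, std::trunc(std::log10(std::fabs(y))));
}

// Variable scale (2.10): order of magnitude of the bound midpoint.
double varScale(double xL, double xU) {
    return magnitude(0.5 * (xL + xU));
}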

1.6 Nondifferentiable or undefined functions

Some of the single variable functions or their derivatives are undefined on some
subset of the variable domain, and to solve such problems with this algorithm, small
adjustments must be made to the problem definition. Of all the single variable functions
needed to define the 50 test problems, $p_0 x^{p_1}$, $p_0 \ln(p_1 x + p_2)$, and $p_0 x \ln(p_1 x + p_2)$
are the only three that may have undefined function values or derivatives. With
$\ln(\beta)$, $\beta \le 0$, undefined, the bounds on variable x may need to be adjusted to keep
the lower bound of $p_1 x + p_2$ greater than or equal to some small value strictly greater
than zero.

The power function $p_0 x^{p_1}$ is a more complicated problem. The restrictions on x
depend on the value of $p_1$. There are different conditions for $p_1$ being positive or
negative, integral or real, and greater than or less than one. The case that is actually
required in the test problems has $p_1 = 0.6$, resulting from the estimated cost of heat
exchangers taken as proportional to $(\text{Area})^{0.6}$. In this case, the power function is
defined for all nonnegative values of x, but the derivative approaches infinity as x
approaches zero from the right. To treat this infinite derivative, the power function
is replaced by a fourth order polynomial which goes smoothly to zero for $x \le 0.05$.
The fourth order polynomial chosen has the minimum integrated squared error such
that the value and derivative of the power function are matched at $x = 0.05$ and
the value of the power function is matched at $x = 0$.
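A sketch of the idea is shown below. The authors fit a least-squares quartic; as a simplified stand-in, this sketch uses a cubic that matches the value and derivative of $x^{0.6}$ at $x_0 = 0.05$ and has value and slope zero at the origin, which captures the same goal of removing the infinite derivative.

#include <cmath>

// Smoothed replacement for pow(x, 0.6) near zero. Below x0 the power
// function is replaced by the cubic a*x^3 + b*x^2 with p(0) = 0,
// p'(0) = 0, p(x0) = x0^0.6, and p'(x0) = 0.6 * x0^(-0.4).
// (Simplified stand-in for the authors' least-squares quartic.)
double smoothPow06(double x) {
    const double x0 = 0.05;
    if (x >= x0) return std::pow(x, 0.6);
    const double v = std::pow(x0, 0.6);        // value at x0
    const double s = 0.6 * std::pow(x0, -0.4); // derivative at x0
    const double b = (3.0 * v - s * x0) / (x0 * x0);
    const double a = (s * x0 - 2.0 * v) / (x0 * x0 * x0);
    return a * x * x * x + b * x * x;
}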

2 COMPUTATIONAL EXPERIENCE

The algorithm presented was implemented in C++ with calls to FORTRAN numer-
ical libraries, and it was tested on four different platforms, a DECstation 3100, a
NeXTstation Turbo, a HP 9000/735, and an IBM SP-2. This section will present a
summary of the results obtained by observing the behavior of the algorithm applied
to a variety of problems from the literature.

Tables 1, 2, 3, and 4 give an overview of the problem sizes and characteristics and
the runtimes obtained on a HP 9000 Series 700 Model 735 with a clock speed of 99
MHz and 80MB of RAM. Table 5 gives the benchmark results for this machine as
reported by HP.

The algorithm was able to solve 47 of the 50 test problems within the arbitrary
iteration limit of 25000. In 42 of these cases, it found a result agreeing with the
results reported in the problem source, and in 5 cases, it found either better solutions
or alternative minima. The problem definitions and locations of these minima are
presented in an Appendix available on request from the authors.

For the three unsolved problems, the algorithm provides mixed results. For problem
fp_3_1, both versions found the solution but were unable to reduce the bound
gap to zero within the iteration limit. In the case of problem fp_4_9, neither version
of the algorithm found a feasible point. Problem fp_5_1 is not solved within the
iteration limit with Version 1, and the runtime was prohibitive for Version 2. For
the unsolved problems, Table 6 shows the objective function value found by the algorithm,
the bound gap after 25000 iterations, and the number of regions remaining
on the list at the end.

Figures 1 and 2 show the runtime required to solve each problem versus the number
of problem variables and number of bilinearities respectively. Both show a high
variance, with the runtime correlating poorly with both. However, the runtime
correlates somewhat better with the number of quadratic terms than the number
of variables.
                                      Version 1           Version 2
Name       n^a  xixj^b  SVF^c  NSD^d  Time(s)    Iter.    Time(s)    Iter.   Soln.^e
fp_2_1      5     5      0      0        1.48       33       1.23       37     ✓
fp_2_2      6     5      0      0        0.03        1       0.02        1     ✓
fp_2_3     13     4      0      1        1.82       16       0.11        3     ✓
fp_2_4      6     1      0      1        0.11        3       0.08        7     ✓
fp_2_5     10     7      0      0        1.24       31       0.81       21     *
fp_2_6     10    10      0      0        3.91       31       2.73       27     ✓
fp_2_7_1   20    20      0      0      275.68      643     363.55      663     ✓
fp_2_7_2   20    20      0      0      265.86      615     275.10      515     ✓
fp_2_7_3   20    20      0      0      751.67      971     642.74      835     ✓
fp_2_7_4   20    20      0      0      198.26      623     213.15      473     ✓
fp_2_7_5   20    20      0      0     1168.83     1909     745.30     1611     *
fp_2_8     24    24      0      1       48.18       53     148.49      141     *
fp_3_1      8     5      0      2     1278.21    25000    4837.56    25000     ×
fp_3_2      5     8      0      2        1.26       27      21.15       40     ✓
fp_3_3      6     7      0      0        1.01       23       0.40        7     ✓
fp_3_4      3     9      0      0      245.90     9213     595.69     6269     *
fp_4_3      4     2      2      0        0.23       15       0.12        5     ✓
fp_4_4      4     2      2      0        0.34       25       0.30       19     ✓
fp_4_5      6     3      3      0        0.62       35       0.80       29     *
fp_4_6      2     1      2      0        0.15       13       0.32       11     ✓
fp_4_7      2     2      1      1        0.05        3       0.37       17     ✓
fp_4_9     11^f  16      8      0     9357.13    25000   19472.80    25000     ×
fp_5_1     48    44      0      9   192930.82    25000        n/a      n/a     ×

a The number of variables.
b The number of bilinearities.
c The number of single variable nonlinear functions. Each of these is also counted as a quadratic variable.
d The number of null space dimensions at the solution.
e ✓ means the solution agrees with published values, * means the algorithm found a better or different solution, and × means the algorithm did not find a solution within the iteration limit.
f Application of the algorithm to this problem required 11 additional variables.

Table 1 Problem Set 1 from Floudas and Pardalos


                                   Version 1          Version 2
Name      n   xixj  SVF  NSD  Time(s)   Iter.   Time(s)   Iter.   Soln.
f_a       2     1    0    1     0.46      35      0.92      23      ✓
f_b       2     1    0    0     0.03       1      0.01       1      ✓
f_c       4     4    0    0     0.04       1      0.03       1      ✓
f_d      20    20    0    0    49.93      87     30.67     103      ✓
f_e       6     6    0    0     1.38      23      0.17       3      ✓
f_f       2     2    0    1     0.21       9      0.27       9      ✓
f_g_I     9     2    0    1     0.08       3      0.52       6      ✓
f_g_II    9     2    0    1     0.31       3      0.29       3      ✓
f_g_III   9     2    0    0     0.35       7      1.91      13      ✓
f_wingo   1     1    3    1     0.32      27      0.24      27      ✓

a,b,c,d,e See Table 1 footnotes.

Table 2 Problem Set 2 from Floudas and Visweswaran

                                Version 1          Version 2
Name   n   xixj  SVF  NSD  Time(s)   Iter.   Time(s)   Iter.   Soln.
s_1    2     3    0    0     0.35      19      0.36      19      ✓
s_1b   2     3    0    0     0.39      19      0.29      21      ✓
s_1c   2     3    0    0     0.11       7      0.10       7      ✓
s_1d   2     3    0    0     0.31      17      0.26      13      ✓
s_2    2     1    0    1     0.02       1      0.16       5      ✓
s_2b   2     1    0    1     1.39     119      0.13       7      ✓
s_2c   2     1    0    1     0.02       1      0.04       1      ✓
s_2d   2     1    0    1     0.03       1      0.04       1      ✓
s_3    2     2    1    1     0.06       3      0.18       3      ✓
s_4    3     3    2    0     0.40      23      2.91      56      ✓
s_5    4     2    2    0     0.32      19      0.22      17      ✓
s_6    9     2    0    1     0.18       3      0.45       3      ✓

a,b,c,d,e See Table 1 footnotes.

Table 3 Problem Set 3 from Swaney


                                Version 1          Version 2
Name   n   xixj  SVF  NSD  Time(s)   Iter.   Time(s)   Iter.   Soln.
diet   6     0    0    0     0.02       1      0.01       1      ✓
e_1    3     3    0    2     0.14       7      0.12       7      ✓
w      6     8    5    0   278.13    4129    128.25     885      ✓
w_b    6     5    8    0    12.51     323     19.34     128      ✓
w_c    3     5    8    0     7.97     307     11.43     129      ✓

a,b,c,d,e See Table 1 footnotes.

Table 4 Problem Set 4: Miscellaneous Problems

SPECint92 109
SPECfp92 168
MIPS 125
MFLOPS (dp) 45

Table 5 Benchmarks for HP 9000 Series 700 Model 735 at 99 MHz

                  Version 1                       Version 2
Name      Objective    Gap       Regions   Objective    Gap      Regions
fp_3_1      7049.25   108.98       2443      7049.25   135.59      2181
fp_4_9         None      ∞         3693         None      ∞        2189
fp_5_1        1.567     0.132931   5868          n/a     n/a        n/a

Table 6 Results for unsolved problems

Figure 1 Runtime versus number of variables (scatter of runtime in seconds; points for Version One and Version Two)

Figure 2 Runtime versus number of bilinearities (runtime in seconds, log scale; points for Version One and Version Two)


Version 1 and Version 2 of the algorithm have comparable runtimes, and neither is
clearly better than the other. Version 1 required less time than Version 2 in 22 of
the 47 solvable problems. The combined time required to solve all of the problems
for Version 1 is 3322.09 seconds, and the combined time for Version 2 is 3211.76
seconds. Version 2 requires fewer iterations in 21 of the 47 solvable problems, and
they tie on 14 problems. Version 1 requires a total of 19478 iterations to solve all
of the problems, and Version 2 requires 12223 iterations.

It is also useful to compare function, constraint, and constraint gradient evaluations
for the two algorithms. Tables 7, 8, 9, and 10 show this comparison. Version 2 of
the algorithm requires fewer objective function evaluations in 32 of the 47 problems,
fewer constraint evaluations in 40 of the 47 problems, and fewer constraint gradient
evaluations in 44 of the 47 problems. Table 11 compares the sum of the evaluations
for all of the problems for each version, which highlights the observation that Version
2 requires almost an order of magnitude fewer constraint evaluations and more than
an order of magnitude fewer constraint gradient evaluations. Version 1 solves a larger
number of cheap subproblems, while Version 2 solves fewer, more expensive
subproblems.

The results of the runs of both algorithms have been grouped into characteristic
behaviors, and some examples of each characteristic behavior will be presented.
The runs have been grouped into the following categories: trivial runs, runs where
the log of the bound gap decreases nearly linearly, runs where the log of the bound
gap decreases superlinearly, and poor runs. All of the graphs are in terms of the
scaled objective function. Some problems appear in different groups for the different
algorithm versions.

The first category consists of runs which took fewer than fifteen iterations to solve for
a particular algorithm. Figures 3 and 4 show the bounds as a function of iteration
number for problem f_g_III for Versions 1 and 2 of the algorithm respectively. In
both cases, the bounds converge in a few iterations.

The second category consists of runs where the log of the bound gap decreases approximately
linearly with the iteration number. Applying Version 1 of the algorithm
to problem fp_2_7_5 demonstrates this behavior on a problem with 20 variables
and 20 quadratic terms, and applying Version 2 of the algorithm to problem s_1
demonstrates this performance on a problem with 2 variables and 1 quadratic term.
Figures 5, 6, and 7 show the bounds, bound gap, and region size versus iteration
number for fp_2_7_5 using Version 1, and Figures 8, 9, and 10 show the analogous
results for s_1.

The next category consists of runs where the log of the bound gap decreases superlinearly
with respect to the iteration number. Applying Version 2 of the algorithm to
problem fp_2_7_3 gives a typical example of this, as shown in Figures 11, 12, and 13.

                   Version 1                       Version 2
Name       Obj.^a   Const.^b   Grad.^c     Obj.     Const.    Grad.
fp_2_1      4162      4195      4224        172       181       54
fp_2_2        10        11        12          2         2        2
fp_2_3       218       234       243         22        23        4
fp_2_4       137       140       142         35        38       11
fp_2_5       229       273       290         64        65       20
fp_2_6      4463      4814      4836         74        74       24
fp_2_7_1    7647     40855     41183       3614      3760      679
fp_2_7_2   13347     38686     39014       3565      3775      697
fp_2_7_3   43323    154888    155372       4740      4886      770
fp_2_7_4   13425     35490     35818       2899      3028      490
fp_2_7_5   78500    245840    246859       5653      5735     1390
fp_2_8      4651      4709      4736        917       994      197
fp_3_2        67      1327      1339        624       673      118
fp_3_3       833      1989      2001         41        45       11
fp_3_4     87141    334185    341338      57562     64512    11849
fp_4_3        49        69        77         39        39        9
fp_4_4       145       175       192         93        98       22
fp_4_5       247       346       366        240       260       40
fp_4_6       121       213       220        241       261       46
fp_4_7         6        91        92         94       100       17

a Objective function evaluations.
b Constraint evaluations.
c Constraint gradient evaluations.

Table 7 Function evaluation comparison for Problem Set 1

                Version 1                 Version 2
Name       Obj.   Const.   Grad.    Obj.   Const.   Grad.
f_a         176     398     416      352     423      89
f_b          43      44      45        2       2       2
f_c          13      16      17        2       2       2
f_d        4454    8882    8925      331     336      70
f_e         933    3342    3354       17      19       6
f_f         252     440     446      106     122      24
f_g_I        24      84      86       39      42      10
f_g_II       19     155     157       13      15       9
f_g_III      46     499     503       69      72      19
f_wingo     202     406     420      129     134      20

a,b,c See Table 7 footnotes.

Table 8 Function evaluation comparison for Problem Set 2


              Version 1                 Version 2
Name     Obj.   Const.   Grad.    Obj.   Const.   Grad.
s_1       350     541     550      123     130      25
s_1b      520     768     778       97      99      19
s_1c       21      48      51       26      27       8
s_1d      279     422     430       62      65      13
s_2         3       3       4       75      85      15
s_2b      633    1038    1124       29      32       9
s_2c        3       3       4        6       8       4
s_2d        3       3       4        8      11       5
s_3         5     100     101       24      31       9
s_4        76     389     401      524     577      95
s_5        97     125     134       62      64      12
s_6        12      84      86       16      18       6

a,b,c See Table 7 footnotes.

Table 9 Function evaluation comparison for Problem Set 3

              Version 1                 Version 2
Name     Obj.   Const.   Grad.    Obj.    Const.   Grad.
diet       10      10      11        2        2       2
e_1       134     198     202       39       41      12
w        4683   61969   64406     9632    11105    2441
w_b      1223    6991    7160     1279     1462     301
w_c      1734    4619    4778     1439     1537     273

a,b,c See Table 7 footnotes.

Table 10 Function evaluation comparison for Problem Set 4

                                     Version 1   Version 2
Function evaluations                   274669       95194
Constraint evaluations                 960107      105010
Constraint gradient evaluations        972947       19950

Table 11 Cumulative totals for function evaluations


Figure 3 Bounds versus iteration number for typical rapid convergence problem (f_g_III, Version 1)

Figure 4 Bounds versus iteration number for typical rapid convergence problem (f_g_III, Version 2)
Figure 5 Bounds versus iteration number for typical linear convergence problem (fp_2_7_5, Version 1)

Figure 6 Bound gap versus iteration number for typical linear convergence problem (fp_2_7_5, Version 1)
Figure 7 $\prod_{i=1}^n (x_i^U - x_i^L)$ versus iteration number for typical linear convergence problem (fp_2_7_5, Version 1)

Figure 8 Bounds versus iteration number for typical linear convergence problem (s_1, Version 2)
Figure 9 Bound gap versus iteration number for typical linear convergence problem (s_1, Version 2)

Figure 10 $\prod_{i=1}^n (x_i^U - x_i^L)$ versus iteration number for typical linear convergence problem (s_1, Version 2)
Figure 11 Bounds versus iteration number for typical superlinear convergence problem (fp_2_7_3, Version 2)

Figure 12 Bound gap versus iteration number for typical superlinear convergence problem (fp_2_7_3, Version 2)
Figure 13 $\prod_{i=1}^n (x_i^U - x_i^L)$ versus iteration number for typical linear convergence problem (fp_2_7_3, Version 2)

Problem fp_3_4 displays another kind of superlinear decrease, and the behavior is
virtually the same for both algorithms. Because the behaviors are similar, only
Version 1 of the algorithm is shown. Figures 14, 15, and 16 show the bounds,
bound gap, and region size versus iteration number. The lower bound increases
slowly and linearly with respect to iteration number, which causes the semi-log
plot to have negative curvature. This was the only problem to display this kind of
behavior. The difficulty is not caused by problem size, because the problem has only
three variables and nine quadratic terms, and it has a linear objective function,
two linear constraints, and one reverse convex constraint.

Another interesting difference in this problem is the region size versus iteration
number graph, Figure 16. Most problems show a much greater reduction in the
region size over that many iterations, which suggests that the difficulty may lie in
the splitting method. Most of the problems are able to use the split recommended
by the sensitivity analysis, referred to here as the "best split", most of the time; the
alternative is referred to here as the "widest split". Table 12 shows the percentages
of best and widest splits for problems using more than 5% widest splits for one of
the versions. The runtimes in Tables 1, 2, 3, and 4 tend to be higher for problems
with a high percentage of widest splits. This suggests that when the algorithm has
to resort to a widest split for certain variables, the efficiency suffers.
Figure 14 Bounds versus iteration number for problem fp_3_4, Version 1

Figure 15 Bound gap versus iteration number for problem fp_3_4, Version 1
Figure 16 $\prod_{i=1}^n (x_i^U - x_i^L)$ versus iteration number for problem fp_3_4, Version 1

            Version 1            Version 2
Name     Widest     Best      Widest     Best
fp_3_1   0.3558    0.6442     0.3500    0.6500
fp_3_4   0.4870    0.5130     0.4650    0.5350
fp_4_9   0.5781    0.4219     0.3965    0.6035
fp_5_1   0.1438    0.8562        n/a       n/a
s_2b     0.1017    0.8983     0.0000    1.0000
s_4      0.0909    0.9091     0.4722    0.5278
w        0.5804    0.4196     0.6240    0.3760
w_b      0.3540    0.6460     0.4318    0.5682
w_c      0.4314    0.5686     0.4302    0.5698

Table 12 Problems which use the widest rule over 5% of the time
64 T. O. W. EPPERLY AND R. E. SWANEY

          Override = 0.001      Override = 0.01       Override = 0.1
Name      Time (s)  Regions     Time (s)  Regions     Time (s)  Regions
fp_3_1       4545     25000      4837.56    25000      2568.19     8563
fp_3_4    1454.04     25000       595.69     6269        16.96      187
fp_4_9   21316.41     25000     19472.80    25000     25309.63    25000
s_2b         0.13         7         0.13        7         0.13        7
s_4          3.06        60         2.91       56         2.92       52
w         2943.41     21407       128.25      885        46.61      271
w_b         17.48       111        19.34      128        13.35       91
w_c         12.15       150        11.43      129        10.60       96

Table 13 Effect of the override parameter on runtime for Version 2

One of the adjustable parameters in the algorithm is the ratio of variable widths
required to override the best split. The normal value is 0.01, and the best split is
overridden while the following inequality is satisfied:

    (x^U_best - x^L_best) / (x_best - e_best)  ≤  (Override) · (x^U_widest - x^L_widest) / (x_widest - e_widest)
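As a concrete sketch of this selection logic (function and variable names are ours, not the authors', and the widths are taken unscaled for simplicity):

# Sketch of split-variable selection with the widest-split override.
# "best" is the variable recommended by the sensitivity analysis; the
# override applies when its region is far narrower than the widest one.
def choose_split_variable(best, widest, width, override=0.01):
    # width[i] is the current bound width x_i^U - x_i^L of variable i;
    # override=0.01 is the normal value described above.
    if width[best] <= override * width[widest]:
        return widest   # best variable's region too narrow: use widest rule
    return best         # otherwise follow the sensitivity recommendation

# Example: the recommended variable is 1/200 the width of the widest one,
# so with override=0.01 the widest rule takes over.
print(choose_split_variable(best=0, widest=2, width=[0.005, 0.3, 1.0]))  # 2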
Table 13 shows how varying the widest override parameter affects the runtime and
the number of regions searched for the problems in Table 12. In the first two cases
of problem fp_3_1, the algorithm does not find a solution. For Override = 0.001,
problem fp_3_1 terminates with a bound gap of 0.158927, and for Override = 0.01,
it terminates with a bound gap of 0.135591. For the third case, Override = 0.1, the
algorithm solves fp_3_1. For Override = 0.001, problem fp_3_4 exceeds the iteration
limit before the solution is verified; however, as the override factor increases, the
problem solves faster. Problem fp_4_9 does not find a feasible point in any of the three
cases, so its bound gap is infinite. Most of these problems benefit from a
higher override factor, which branches using the widest rule more frequently, showing
that the sensitivity-based selection rule is not providing good splits for these
problems.

Table 14 shows the same information for the remaining problems for which the change
in override factor caused more than a negligible change in performance. Most
of the problems are unaffected by the change in override factor, and among those
that are affected, there is no consistent trend.

In the case of problem fp_3_4, the slow convergence is due to the splitting rule. The
best splitting rule almost always recommends variable one or two; however, the al-
gorithm requires variable three to be split to verify the solution. Generally, variable
three is only split when the best split is overridden (i.e., when variable three's re-
gion width is 100 times that of the recommended variable). The sensitivity analysis
usually misses the dependence on variable three's bounds because the McCormick

            Override = 0.001      Override = 0.01       Override = 0.1
Name      Time (s)   Regions    Time (s)   Regions    Time (s)   Regions
fp_2_7_1  361.03     663        363.55     663        362.35     713
fp_2_7_2  272.63     515        275.10     515        273.16     515
fp_2_7_3  617.85     835        642.74     835        610.08     753
fp_2_7_4  212.57     473        213.15     473        215.96     523
fp_2_7_5  747.42     1623       745.30     1611       790.77     1771
fp_2_8    145.50     141        148.49     141        163.71     227
fp_3_2    21.14      40         21.15      40         13.10      35
l_d       30.58      103        30.67      103        35.93      121

Table 14  Effect of the override parameter on runtime for Version 2

constraints (1.151-1.154) are active. For example, if constraint (1.151) is active for
bilinearity 2,3 with a Lagrange multiplier of u_{2,3}, the effect on the objective of a
change in x_3^U in this constraint is estimated by

    u_{2,3} (x_2 - x_2^U) (x_3^{U'} - x_3^U)

where x_3^{U'} is the new bound for x_3 and x_3^U is the old bound. When x_2 = x_2^U,
which occurs frequently in problem fp_3_4, the dependence on x_3^U disappears.
The sensitivity analysis predicts changes assuming that the LP basis remains the
same, so it cannot predict how a change in the bounds of x_3 will affect the LP basis
and consequently the objective function. This kind of problem keeps the sensitivity-
based selection rule from choosing variable three. If x̄, the point around which
the covering program is constructed, is in the interior of the region (i.e., not at a
variable bound), McCormick's bounds, constraints (1.151-1.154), will always have
a bound gap as shown in Figure 2, so ultimately those constraints must leave the
basis and be replaced by the constraint space or null space quadratic constraints.
It may be possible to improve the selection rule by using a more sophisticated
and computationally intensive sensitivity analysis which can account for potential
changes in the LP basis [2].
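A small numeric illustration of the vanishing sensitivity (using the estimate reconstructed above; all values here are invented):

# Predicted objective change when the bound x3^U moves to x3U_new, using
# the estimate u23 * (x2 - x2U) * (x3U_new - x3U) discussed above.
def predicted_change(u23, x2, x2U, x3U, x3U_new):
    return u23 * (x2 - x2U) * (x3U_new - x3U)

print(predicted_change(u23=0.5, x2=1.0, x2U=2.0, x3U=3.0, x3U_new=2.0))  # 0.5
print(predicted_change(u23=0.5, x2=2.0, x2U=2.0, x3U=3.0, x3U_new=2.0))  # 0.0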

Figures 17 and 18 illustrate how the algorithm can proceed for a number of itera-
tions without any improvement in the lower bound. The flat spaces in the bound
graph can be explained by a feature of the splitting strategy. The algorithm always
examines the region with the least lower bound. When it splits, the splitting strat-
egy may select a variable that produces one region with an increased lower bound
and another with no increase. This happens when the solution of the covering pro-
gram depends strongly on either the upper or lower variable bound but not both.
Because of this feature, the splitting strategy may have to split several times before
it picks a variable that will raise the least lower bound. The criterion for choosing a
variable, (2.3), favors variables where both bounds are significant, but it may still
choose variables which do not improve the least lower bound.

Figure 17 Bounds versus iteration number for problem l_d, Version 1

Figure 18 Bound gap versus iteration number for problem l_d, Version 1
Figure 19 Bound gap versus iteration number for problem w, Version 1

The last category of problems contains those on which one or both of the
algorithms performed poorly. Applying Version 1 to problems w and s_2b results
in slow convergence. In the case of problem w, shown in Figure 19, the difficulty
seems to be due to the unbounded derivative of x^{0.6} at x = 0. x^{0.6} is replaced
with a polynomial at small values of x, but the polynomial still has large first and
second derivatives, which cause the underestimating quadratic to fit poorly. The poor
underestimation causes the algorithm to require a very small region to verify the minimum.

Version 1 requires 119 iterations to solve problem s_2b, which is much larger than
the 7 iterations required by Version 2; the performance of Version 1 is shown in
Figure 20. Of the 119 regions that the algorithm examines, only 4 contain
the global minimum. The remaining 115 are needed to eliminate the area that does
not contain the solution, which is more than it ought to take considering the size
of the problem and the overall size of the variable domain. The excess is due to a
split very near the global optimum and looseness in the full rank program (1.177).
The split near the global optimum causes the regions to have lower bounds very
close to the global optimum, and the looseness in the full rank program is sufficient
to keep them from being pruned. The looseness in the full rank program comes
from the difficulty in choosing constraints (1.187). There are three variables and
only two constraints, so one of the variables does not have a constraint of the form
of (1.187) to help enforce complementarity of its positive and negative components.
This problem might be solved by generating more than one constraint of the form
(1.187) for a particular original constraint.

Figure 20 Bound gap versus iteration number for problem s_2b, Version 1

It is also interesting to look at an example of when the algorithm failed. Versions 1
and 2 of the algorithm have very similar performance for problem fp_3_1. Figures
21-23 show the performance of Version 2 of the algorithm. It appears that the lower
bound is asymptotically approaching the upper bound at a slow rate.

The splitting rule is not recommending the best sequence of splits to improve the
lower bound. The algorithm logs show that this problem is suffering from the same
difficulty as fp_3_4. The sensitivity with respect to some variables is disappearing
because the McCormick envelopes indicate no dependence on some variable bounds.
The sensitivity-based splitting rule usually recommends three of the eight vari-
ables, and the remaining five are usually split by the widest split override. Table 15
shows the frequency of different types of splits for each variable. It is also interesting
to note the small percentage of the best splits which are predicted to eliminate a
region.

Another difficulty with problem fp_3_1 is that the branch and bound list size keeps
growing. Figure 24 compares the list size behavior for fp_3_1 with the typical be-
havior shown by Version 1 applied to problem fp_2_7_5.

The Newton constraint turned out to be insignificant on many of the problems. For
Version 1, the constraint was only active in problems fp_3_1, fp_4_9, and e_1. For
Version 2, it was only active in problems fp_3_1, fp_3_4, fp_4_7, fp_4_9, l_a, l_1, s_1d,
and s_3. In the other 41 problems, it was never active in the solution of a covering
program. Rerunning these problems with the Newton constraint removed from the
Figure 21 Bounds versus iteration number for problem fp_3_1, Version 2

Figure 22 Bound gap versus iteration number for problem fp_3_1, Version 2

Figure 23 ∏_{i=1}^{n} (x_i^U - x_i^L) (generalized volume) versus iteration number for problem fp_3_1, Version 2

Variable   Widest Splits^a   Best Eliminate^b   Best Point^c   Best Bisect^d
1          0                 232                1204           245
2          0                 548                1846           403
3          0                 364                2628           422
4          1320              51                 14             11
5          1001              5                  29             0
6          826               7                  241            17
7          803               32                 515            0
8          811               1                  27             0

^a Chosen when the widest quadratic variable is 100 or more times the width of the recommended variable.
^b The variable with the highest sensitivity split at a location to eliminate one of the regions.
^c The variable with the highest sensitivity split at the current point.
^d The variable with the highest sensitivity bisected.

Table 15  Branch types for problem fp_3_1, Version 2



[Figure: two panels plotting branch-and-bound list size versus iteration number; left panel: linear convergence problem (fp_2_7_5, Version 1); right panel: problem fp_3_1]

Figure 24 Comparison of list size behavior

            Version 2               Version 2
            Without Newton          With Newton
Name      Time (s)   Regions      Time (s)   Regions
fp_3_1    4644.20    25000        4837.56    25000
fp_3_4    641.06     7129         595.69     6269
fp_4_7    0.32       17           0.37       17
fp_4_9    21639.35   25000        19472.80   25000
e_1       0.11       7            0.12       7
l_a       0.52       23           0.92       23
l_1       0.47       11           0.27       9
s_1d      0.24       13           0.26       13
s_3       0.13       3            0.18       3

Table 16  Effect of the Newton constraint on the performance of Version 2

covering program had a small effect on the overall performance of the algorithm, as
shown in Table 16.

3 CONCLUSIONS
Both versions of the algorithm have been shown to be successful at solving, to a high
tolerance, a variety of problems including concave and indefinite quadratic programs,
bilinear programs, polynomial programs, and quadratic NLPs with nonlinear tran-
scendental functions. The time required for solution is highly problem dependent,
but correlates somewhat with the number of quadratic terms.

Both versions of the algorithm perform about equally in terms of runtime, but
Version 2 of the algorithm requires far fewer constraint and constraint gradient eval-
uations. The evaluations required for Version 1 might be reduced by using a dif-
ferent local NLP solver such as successive quadratic programming. MINOS was
used primarily because of its availability and its ability to solve both NLPs and LPs
efficiently.

In the cases where the algorithm fails or performs poorly, the primary cause of the
poor performance is the branching rules. The problems with poor runtimes use a
higher percentage of widest-variable splits. In some cases, the widest split is needed
because the sensitivity analysis ignores the effect of the bounds of some variables,
and in other cases, the widest split rule is a hindrance to success because it causes
the algorithm to split on variables that do not matter. The majority of the problems
are able to succeed while using the best split over 95% of the time.

This branch and bound algorithm can be readily adapted to massively parallel
computers or distributed parallel computers, which is a subject currently under
study. Preliminary results show that problems that could not be solved with
one processor can be solved on a multiprocessor machine.

Acknowledgements
This work was supported by the Computational Science Graduate Fellowship Pro-
gram of the Office of Scientific Computing in the Department of Energy. The Na-
tional Science Foundation also provided partial support under grant DDM-8619582.

REFERENCES
[1] M. S. Bazaraa, H. D. Sherali, and C. M. Shetty. Nonlinear Programming: Theory and Algorithms. John Wiley & Sons, Inc., second edition, 1993.

[2] T. Gal. Postoptimal Analyses, Parametric Programming, and Related Topics. McGraw-Hill Inc., 1979.

[3] P. E. Gill, W. Murray, M. A. Saunders, and M. H. Wright. Maintaining LU factors of a general sparse matrix. Linear Algebra and Its Applications, 88/89:239-270, 1987.

[4] B. A. Murtagh and M. A. Saunders. MINOS 5.1 user's guide. Technical Report SOL 83-20R, Systems Optimization Laboratory, Stanford University, Stanford, CA 94305-4022, January 1987.

[5] R. B. Schnabel and E. Eskow. A new modified Cholesky factorization. SIAM Journal on Scientific and Statistical Computing, 11(6):1136-1158, November 1990.

[6] R. E. Swaney. Global solution of algebraic nonlinear programs. AIChE Annual Meeting (Chicago, IL, 1990). Publication pending.
3
NEW FORMULATIONS AND BRANCHING STRATEGIES FOR THE GOP ALGORITHM

V. Visweswaran* and C. A. Floudas**
* Mobil Research and Development Corporation, Princeton, NJ
** Department of Chemical Engineering, Princeton University, Princeton, NJ

ABSTRACT
In Floudas and Visweswaran (1990, 1993), a deterministic global optimization approach was
proposed for solving certain classes of nonconvex optimization problems. A global optimization
algorithm, GOP, was presented for the solution of the problem through a series of primal and
relaxed dual problems that provide valid upper and lower bounds respectively on the global
solution. The algorithm was proven to have finite convergence to an ε-global optimum. In this
paper, a branch-and-bound framework of the GOP algorithm is presented, along with several
reduction tests that can be applied at each node of the branch-and-bound tree. The effect of these
properties is to prune the tree and provide tighter underestimators for the relaxed dual problems.
We also present a mixed-integer linear programming (MILP) formulation for the relaxed dual
problem, which enables an implicit enumeration of the nodes in the branch-and-bound tree
at each iteration. Finally, an alternate branching scheme is presented for the solution of the
relaxed dual problem through a linear number of subproblems. Simple examples are presented
to illustrate the new approaches. Detailed computational results on the implementation of both
versions of the algorithm can be found in the companion paper in chapter 4.

1 INTRODUCTION
In recent years, the global optimization of constrained nonlinear problems has
received widespread attention. A considerable body of research has focused on the
theoretical, algorithmic and computational aspects for identifying the global solution.
Comprehensive reviews of the various existing approaches can be found in Dixon and
Szego (1975, 1978), Archetti and Schoen (1984), Pardalos and Rosen (1986, 1987),
Törn and Zilinskas (1989), Mockus (1989), Horst and Tuy (1990) and Floudas and
Pardalos (1990, 1992).

Floudas and Visweswaran (1990, 1993) proposed a deterministic primal-relaxed
dual global optimization approach for solving certain classes of smooth optimization
problems. A global optimization algorithm (GOP) was presented for the solution of
the nonconvex problem through a series of primal and relaxed dual subproblems that
provide upper and lower bounds on the global optimum. The algorithm was shown
to attain finite ε-convergence and ε-global optimality regardless of the starting point.
The application of the algorithm to several test problems was detailed in Visweswaran
and Floudas (1990). Visweswaran and Floudas (1993) presented properties that vastly
improve the efficiency of the algorithm.

The GOP algorithm presented in Floudas and Visweswaran (1990, 1993) follows a
cutting plane approach to the solution of the relaxed dual subproblems. While this
approach provides tight lower bounds by including all the valid cuts in the relaxed dual
subproblems, it renders the implementation of the actual relaxed dual problem more
complex. In particular, the identification of valid underestimators at each iteration of
the algorithm must be handled with care. Moreover, the algorithm leaves open the
questions of (i) an implicit enumeration of all the relaxed dual subproblems, and (ii)
the reduction of the number of relaxed dual subproblems from exponential to linear,
which would greatly improve the efficiency of the solution procedure.

This paper presents the GOP algorithm in the framework of a branch-and-bound


approach. At each node in the branch and bound tree, a primal problem is solved, and
the solution of this problem is used to provide a Lagrange function. By branching on
the first derivatives of this Lagrange function, several new children nodes are created.
This framework has several advantages over the original cutting plane approach,
including considerably simplifying the formulation and solution of the relaxed dual
problem and allowing for the incorporation of pruning and reduction tests at each
node in the tree. While the approach is derived from the same basic properties that
motivated the earlier algorithm, it differs sufficiently from the earlier approach so as
to merit a complete discussion, which is presented in Section 4.

One of the main advantages of the branch-and-bound framework for the GOP algorithm
is that it allows naturally for an implicit enumeration of the relaxed dual subproblems
at each level. The introduction of binary variables linked to the sign of the derivatives
of the Lagrange function results in mixed integer linear and nonlinear programming
formulations that offer considerable scope for incorporation of reduction tests on a per
node basis. The resulting GOP/MILP algorithm is discussed in detail in Section 5.

Due to the partitioning of the variable domain using the gradients of the Lagrange
function, the GOP algorithm can require, in the worst case, an exponential number of
dual subproblems at each iteration. This can lead to large CPU times as the number of
variables increases. Therefore, it is worth considering alternate partitioning schemes
that can reduce the number of subproblems that need to be solved at each iteration. In
Section 6, one such branching scheme is presented that requires only a linear number
of subproblems for the determination of the lower bound. A simple example is used
to illustrate the new scheme.

In a companion paper (Visweswaran and Floudas, 1995b), a complete implementation


of the algorithms presented here, along with comprehensive computational experience
on several problems in chemical process design and control, is described.

2 PROBLEM FORMULATION
The general form of the optimization problem addressed in this paper is given as
follows:

    min_{x,y}  F(x, y)
    s.t.       G(x, y) ≤ 0                                   (3.1)
               H(x, y) = 0
               x ∈ X
               y ∈ Y

where X and Y are non-empty, compact, convex sets, F(x, y) is the objective function
to be minimized, G(x, y) is a vector of inequality constraints and H(x, y) is a vector
of equality constraints. It is assumed that these functions are continuous and piecewise
differentiable over X × Y. For the sake of convenience, it will be assumed that the
set X is incorporated into the first two sets of constraints. In addition, the problem is
also assumed to satisfy the following conditions:

Conditions (A):

(a) F(x, y) and G(x, y) are convex in x for every fixed y, and convex in y for every
fixed x,

(b) H(x, y) is affine in x for every fixed y, and affine in y for every fixed x,

(c) Y ⊆ V, where V ≡ {y : G(x, y) ≤ 0, H(x, y) = 0, for some x ∈ X}, and

(d) an appropriate constraint qualification (e.g., Slater's qualification) is satisfied for
fixed y.

It has been shown (Floudas and Visweswaran, 1990) that the class of problems that
satisfies these conditions includes, but is not restricted to, bilinear problems, quadratic
problems with quadratic constraints, and polynomial and rational polynomial problems.
Recently, it has also been shown (Liu and Floudas, 1993; Liu and Floudas, 1995) that
a very large class of smooth optimization problems can be converted to a form where
they satisfy Conditions (A), and hence are solvable by the GOP algorithm.

3 PRIMAL AND RELAXED DUAL PROBLEMS

The GOP algorithm utilizes primal and relaxed dual subproblems to obtain upper and
lower bounds on the global solution. The primal problem results from fixing the y
variables to some value, say y^k, and is defined as follows:

    P^k(y^k) = min_x  F(x, y^k)
               s.t.   G(x, y^k) ≤ 0                          (3.2)
                      H(x, y^k) = 0

where y^k ∈ Y. It has been assumed here that any bounds on the x variables are
incorporated into the first set of constraints. Notice that because of the introduction of
additional constraints by fixing the y variables, this problem provides an upper bound
on the global optimum of (3.1). Moreover, P^k(y^k), the solution value of this problem,
yields a solution x^k for the x variables and Lagrange multipliers λ^k and μ^k for the
equality and inequality constraints respectively^1.

The Lagrange function constructed from the primal problem is given as:

    L^k(x, y, λ^k, μ^k) = F(x, y) + (λ^k)^T H(x, y) + (μ^k)^T G(x, y)        (3.3)

The x variables that are present in the linearization of the Lagrange function around
x^k, and for which the gradients of the Lagrange function with respect to x at x^k are

^1 It is assumed here that the primal problem is feasible for y = y^k. See Floudas and Visweswaran (1990,
1993) for the treatment of the cases when the primal problem is infeasible for a given value of y.
functions of the y variables, are called the connected variables. It can easily be shown
that the linearization of the Lagrange function around x^k can also be written in the
form:

    L^k(x, y, λ^k, μ^k)|_{lin}^{x^k} = L_0^k(y, λ^k, μ^k) + Σ_{i=1}^{NI_c^k} x_i g_i^k(y)        (3.4)

where NI_c^k is the number of connected variables at the k-th iteration (representing the
x variables that appear in the Lagrange function), and L_0^k(y, λ^k, μ^k) represents all the
terms in the linearized Lagrange function that depend only on y. The positivity and
negativity of the functions g_i^k(y) define a set of equations that are called the qualifying
constraints of the Lagrange function at the k-th iteration, and which partition the y
variable space into 2^{NI_c^k} subregions. In each of these subregions, a Lagrange function
can be constructed (using the bounds for the x variables) that underestimates the global
solution in the subregion, and can therefore be minimized to provide a lower bound
for the global solution in that region.
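A minimal sketch (our own illustration) of how the qualifying constraints generate the 2^{NI_c^k} subregions, pairing each choice of bound with the corresponding sign restriction:

from itertools import product

# Each connected variable is fixed at a bound ("L" or "U"); the sign of its
# qualifying constraint g_i(y) is restricted accordingly.
def enumerate_subregions(n_connected):
    for combo in product(("L", "U"), repeat=n_connected):
        signs = ["g_%d(y) >= 0" % (i + 1) if b == "L" else
                 "g_%d(y) <= 0" % (i + 1) for i, b in enumerate(combo)]
        yield combo, signs

for combo, signs in enumerate_subregions(2):   # 2 connected variables -> 4
    print(combo, signs)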

Consider the first iteration of the GOP algorithm. The initial parent region is the
entire space y ∈ Y from the original problem. This region is subdivided into 2^{NI_c^1}
subregions, and in each of these subregions, a subproblem of the following form is
solved:

    min_{y∈Y, μ_B} μ_B
    s.t. μ_B ≥ L^1(x^{B_l}, y, λ^1, μ^1)|_{lin}^{x^1}
         g_i^1(y) ≥ 0 if x_i^{B_l} = x_i^L  }
         g_i^1(y) ≤ 0 if x_i^{B_l} = x_i^U  }  ∀ i ∈ I_c^1

where I_c^1 is the set of connected variables at the first iteration, NI_c^1 is the number
of connected variables, and x_i^L and x_i^U are the lower and upper bounds on the i-th
connected variable respectively. This subproblem corresponds to the minimization
of the Lagrange function, with the connected variables replaced by a combination of
their lower and upper bounds. Note the presence of the qualifying constraints in the
problem. These constraints ensure that the minimization is carried out in a subregion
of the parent node. If this problem has a value of μ_B that is lower than the current
best upper bound obtained from the primal problem, then it is added to the set of
candidate lower bounds; otherwise, the solution is fathomed, that is, removed from
consideration for further refinement.

Consider a problem with two x and two y variables. In the first iteration, assuming
that both x_1 and x_2 are in the set of connected variables for the first iteration, there
are four relaxed dual subproblems solved. These problems are shown in Figure 1a. It

[Figure: partition of the y-space by the qualifying constraints g_1^1(y) = 0 and g_2^1(y) = 0 into four subregions, each containing one of the relaxed dual solution points y^A, y^B, y^C, y^D]

Figure 1a Partition in y for first iteration with two connected variables

Figure 1b Branch and bound tree for first iteration



Figure 2a Partition in y for second iteration with one connected variable

Figure 2b Branch and bound tree for second iteration


can be seen that the qualifying constraints partition the y-space into the four regions.
Each of the relaxed dual subproblems solved provides a valid underestimator for the
corresponding region, as well as a solution point (denoted in the figure by y^A, y^B, y^C
and y^D) in the region.

Figure 1b shows the corresponding branch-and-bound tree created by the solution of
these four problems. The starting point y^1 is the root node, and it spawns four leaf
nodes. The infimum of the four nodes provides the point for the next iteration, in this
case, say y^A.

In the second iteration, the relaxed dual problem is equivalent to further partitioning
the subregion that was selected for refinement. In each of these partitions, a relaxed
dual subproblem is solved. Figure 2a shows the subregions created in the example,
assuming that there was only one connected variable in this iteration. The two relaxed
dual subproblems solved in this iteration give new solutions y^E and y^F and are possible
candidates for entering at future iterations. Figure 2b shows the corresponding nodes
in the branch-and-bound tree created by this iteration.

The preceding discussion illustrates the key features of a branch and bound framework
for the algorithm. The framework is based upon the successive refinement of regions
by partitioning on the basis of the qualifying constraints. In the next section, the key
features of its implementation are discussed, based on which a formal statement of the
algorithm is then presented.

4 A BRANCH-AND-BOUND FRAMEWORK FOR THE GOP ALGORITHM

The terminology used in this section is as follows. Given a node j in the branch and
bound tree, P_j is its parent node, and I_j is the iteration at which node j is created. R_j
is the set of constraints defining the region corresponding to node j. At any point, N
denotes the total number of nodes in the tree, and C denotes the current node.

4.1 Root Node and Starting Region

At the beginning of the algorithm, there are no subdivisions in the y-space. Therefore,
the root node in the branch and bound tree is simply the starting point for the
algorithm, y^1. The region of application for this node (i.e., the current region) is the
entire y-space.
GOP ALGORITHM 83

4.2 Reduction Tests at Each Node

At each node, the current region of application is divided into several subregions using
the qualifying constraints of the current Lagrange function. It is possible to conduct
simple tests on the basis of the signs of the qualifying constraints that can be used to
reduce the number of connected variables. One such test, based upon the properties
first presented in Visweswaran and Floudas (1993), is presented below:

Reduction Test:

Suppose a node j is to be partitioned in the k-th iteration (i.e., I_j = k). Then,

(i) If g_i^k(y) ≥ 0 ∀y ∈ R_j, set x_i = x_i^L in L^k(x, y, λ^k, μ^k) and remove i from the
set of connected variables.

(ii) If g_i^k(y) ≤ 0 ∀y ∈ R_j, set x_i = x_i^U in L^k(x, y, λ^k, μ^k) and remove i from the
set of connected variables.

The proofs of the validity of these reductions can be easily obtained by considering
that the term x_i g_i^k(y) can be underestimated by x_i^L g_i^k(y) for all positive g_i^k(y) and
x_i^U g_i^k(y) for all negative g_i^k(y). For more details, the reader is referred to Visweswaran
and Floudas (1993).
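For an affine qualifying constraint g_i^k(y) = c^T y + d over a box region, the test reduces to interval-bounding g_i^k and checking its sign, as in the following sketch (our own illustration):

# Reduction test sketch: interval bounds of g(y) = c @ y + d over the box
# [y_lo, y_hi]; a provably constant sign lets x_i be fixed at a bound.
def reduction_test(c, d, y_lo, y_hi):
    g_lo = d + sum(min(ci * lo, ci * hi) for ci, lo, hi in zip(c, y_lo, y_hi))
    g_hi = d + sum(max(ci * lo, ci * hi) for ci, lo, hi in zip(c, y_lo, y_hi))
    if g_lo >= 0.0:
        return "fix x_i = x_i^L"    # case (i): g_i >= 0 over the region
    if g_hi <= 0.0:
        return "fix x_i = x_i^U"    # case (ii): g_i <= 0 over the region
    return "keep i connected"       # sign may change: partition on g_i

# g(y) = y - 1 is nonnegative over 1 <= y <= 1.5, so case (i) applies.
print(reduction_test(c=[1.0], d=-1.0, y_lo=[1.0], y_hi=[1.5]))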

4.3 Evaluation of Bounds for the x Variables

Often, the original problem contains linear and/or convex constraints in both x and y.
When the relaxed dual problem is being solved at a given iteration, the region for the
y variables is smaller than for the original problem. This can be exploited to provide
tighter bounds on the x variables.

Consider, for example, a problem where there is a one-to-one correspondence between
the x and y variable sets in the feasible region of the problem (i.e., y_i = x_i). Then,
consider the K-th iteration of the GOP algorithm, where node j is being partitioned.
R_j is the set of constraints defining the current region. Then, it is possible to obtain
tighter bounds on the x variables by the following procedure:

1. Choose an i ∈ I_c^K.

2. Solve the following two problems:

       min_{x,y} ± x_i
       s.t. x - y = 0
            y ∈ R_j

   Use the solutions of the two problems for the lower and upper bounds on x_i
   respectively. Note that the set R_j includes all linear and convex constraints from
   the original problem.

3. Repeat Steps 1 and 2 for all i ∈ I_c^K.

Similarly, when there are other convex constraints in x and y, these constraints can
be added to the above problem. This procedure can be very useful in obtaining the
tightest bounds on the connected x variables at each iteration, and consequently the
tightest underestimators for the relaxed dual subproblems. Note also that in the case
of nonconvex constraints, their convex underestimators can be incorporated in the
evaluation of the bounds problems. A sketch of such a bound-tightening step follows.
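The sketch below (assuming SciPy is available; the constraint data is borrowed from the Al-Khayyal and Falk (1983) example used in the illustration of Section 4.5, restricted to the region 0 ≤ y ≤ 1) shows the two bounds LPs for a single variable:

# Bound-tightening LPs: minimize x and -x over the linear constraints of
# the current region to obtain tighter lower and upper bounds on x.
from scipy.optimize import linprog

A_ub = [[-6.0, 8.0],   # -6x + 8y - 3 <= 0
        [3.0, -1.0]]   #  3x -  y - 3 <= 0
b_ub = [3.0, 3.0]
bounds = [(0.0, None), (0.0, 1.0)]   # x >= 0; current region 0 <= y <= 1

lo = linprog(c=[1.0, 0.0], A_ub=A_ub, b_ub=b_ub, bounds=bounds)    # min  x
hi = linprog(c=[-1.0, 0.0], A_ub=A_ub, b_ub=b_ub, bounds=bounds)   # min -x
print(lo.fun, -hi.fun)   # tightened bounds: 0 <= x <= 4/3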

4.4 Branch-and-Bound Algorithm

The major steps of the branch-and-bound version of the GOP algorithm are described
in this section. The terminology is the same as described above. In addition, F denotes
the set of iterations with a feasible primal problem, while I denotes the set of iterations
when the primal problem was infeasible.

STEP 0: Initialization

(a) Read in the data for the problem, including the tolerance for convergence, ε.
(b) Define initial upper and lower bounds (f^U, f^L) on the global optimum.
(c) Generate initial bounds for the x variables, x^L and x^U.
(d) Choose a starting point y^1 for the algorithm.
(e) Set K = 1, C = P_C = 1, N = 1.

STEP 1: Selection of Current Region

(a) If K = 1, set R_C = ∅ and go to Step 2.
(b) If K ≥ 2, set R_C = ∅, m = C. Then:
    (i) Add the Lagrange function and qualifying constraints for node m to R_C.
    (ii) Set m = P_m. If m = 1, then go to Step 2.
    (iii) Repeat steps (i) and (ii).

STEP 2: Primal Problem

(a) Solve the primal problem (3.2) to give P^K(y^K).
    (i) If feasible, set F = F ∪ {K} and update f^U = MIN[f^U, P^K(y^K)].
    (ii) If infeasible, solve a relaxed primal problem. Set I = I ∪ {K}.
(b) Store y^K, λ^K and μ^K.

STEP 3: Determination of Current Partitions

(a) Generate the current Lagrange function L^K(x, y, λ^K, μ^K).

(b) Determine the set of connected variables I_c^K and the corresponding partial
derivatives g_i^K(y) (i = 1, ..., NI_c^K) of the current Lagrange function.

(c) For each connected variable, determine (if possible) tight lower and upper bounds
x_i^L and x_i^U in the current region y ∈ R_C. Otherwise, use the original bounds.

(d) Evaluate lower and upper bounds on g_i^K(y) in the region y ∈ R_C.
    (i) If g_i^K(y) ≥ 0 ∀y ∈ R_C, set x_i = x_i^L in the current Lagrange function,
    and remove i from the set I_c^K.
    (ii) If g_i^K(y) ≤ 0 ∀y ∈ R_C, set x_i = x_i^U in the current Lagrange function,
    and remove i from the set I_c^K.

STEP 4: Relaxed Dual Problem

(a) Select a combination of the bounds B_l of the connected variables, say B_l = B^1.

(b) Find the solution (μ_B^j, y^j) to the following relaxed dual subproblem:

    min_{y, μ_B} μ_B
    s.t. μ_B ≥ L^K(x^{B_l}, y, λ^K, μ^K)|_{lin}^{x^K}
         g_i^K(y) ≥ 0 if x_i^{B_l} = x_i^L
         g_i^K(y) ≤ 0 if x_i^{B_l} = x_i^U
         (y, μ_B) ∈ R_C

    (i) If μ_B^j < f^U - ε, set j = N + 1, P(j) = C, N = N + 1, and store the
    solution in μ_B^j, y^j.
    (ii) If μ_B^j ≥ f^U - ε, fathom the solution.

(c) Select a new combination of bounds, say B_l = B^2, for the connected variables.
(d) Repeat steps (b) and (c) until all the combinations of bounds for the connected
variables have been considered.

STEP 5: Selecting a New Lower Bound

Select the infimum of all stored μ_B^j, say μ_B^p. Set C = p, y^{K+1} = y^p, f^L = μ_B^p.

STEP 6: Check for Convergence

If |f^U - f^L| / |f^L| < ε, STOP; otherwise, set K = K + 1 and return to Step 1.
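The loop structure of Steps 1-6 can be summarized schematically as follows. This is our own sketch, not the authors' implementation: the primal and relaxed dual solvers are abstracted as callables, and the infeasible-primal branch of Step 2 is omitted for brevity.

import math

# Skeleton of the branch-and-bound GOP loop. Each open node stores
# (lower bound, y point, accumulated region constraints R_j).
def gop_branch_and_bound(y1, solve_primal, solve_relaxed_duals,
                         eps=1e-3, max_iter=1000):
    f_upper, f_lower = math.inf, -math.inf
    open_nodes = [(f_lower, y1, [])]         # Step 0: root node at y^1
    for _ in range(max_iter):
        if not open_nodes:
            break
        # Step 1: select the node with the least lower bound.
        open_nodes.sort(key=lambda node: node[0])
        _, y_k, region = open_nodes.pop(0)
        # Step 2: primal problem at y_k updates the upper bound and yields
        # the data (multipliers) defining the current Lagrange function.
        f_k, lagrange = solve_primal(y_k)
        f_upper = min(f_upper, f_k)
        # Steps 3-4: partition the region; each relaxed dual subproblem
        # returns (mu_B, y*, qualifying constraints of its subregion).
        for mu_B, y_star, quals in solve_relaxed_duals(lagrange, region):
            if mu_B < f_upper - eps:         # Step 4(i): keep the node
                open_nodes.append((mu_B, y_star, region + quals))
            # Step 4(ii): otherwise the node is fathomed (dropped).
        # Steps 5-6: new least lower bound and relative convergence test.
        if open_nodes:
            f_lower = min(node[0] for node in open_nodes)
        if abs(f_upper - f_lower) <= eps * max(1.0, abs(f_lower)):
            break
    return f_upper, f_lower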

4.5 Illustration

Consider the application of the branch and bound algorithm to the following problem,
taken from Al-Khayyal and Falk (1983):

    min_{x,y}  -x + xy - y
    s.t.  -6x + 8y - 3 ≤ 0,
           3x - y - 3 ≤ 0,
           x, y ≥ 0.

Note that with these constraints, the bounds on both x and y are (0, 1.5). Consider a
starting point of y^1 = 1 for the algorithm.

Iteration 1

For y^1 = 1, the first primal problem has the solution x = 0, μ_1^1 = μ_2^1 = μ_3^1 = 0,
with the objective value of -1. The upper bound on the problem is therefore -1. The
Lagrange function is given by

    L^1(x, y, μ^1) = -x + xy - y = x g_1^1(y) - y

where g_1^1(y) = y - 1 is the first (and only) qualifying constraint. From the original
problem, the bounds on y are (0, 1.5). Therefore, -1.0 ≤ g_1^1(y) ≤ 0.5, implying that
two relaxed dual subproblems need to be solved. These problems, solved for positive
and negative g_1^1(y) (with x set to 0 and 1.5 respectively), are shown below:

    min_{y,μ_B} μ_B                      min_{y,μ_B} μ_B
    s.t. μ_B ≥ -y                        s.t. μ_B ≥ 0.5y - 1.5
         g_1^1(y) = y - 1 ≥ 0                 g_1^1(y) = y - 1 ≤ 0
         0 ≤ y ≤ 1.5                          0 ≤ y ≤ 1.5
    Solution: y = 1.5, μ_B = -1.5       Solution: y = 0.0, μ_B = -1.5
The solution of these two problems provides the first partition in the tree depicted
in Figure 3. R is the root node corresponding to the starting point y^1 = 1.0. At
this iteration, two nodes 1 and 2 are created by the solution of the two relaxed dual
subproblems. Both nodes have the root node R as their parent node. Both problems
have equal objective values, and are thus equal candidates for the best lower bound.
Suppose that node 2 is selected for further exploration.

Iteration 2

From node 2, the current value of y is 0.0. For this value, the primal problem has the
solution x = 1.0, μ_1^2 = 0, μ_2^2 = 1/3, μ_3^2 = 0, with the objective value of -1.0. The
Lagrange function from this problem is

    L^2(x, y, μ^2) = -x + xy - y + (1/3)(3x - y - 3) = x g_1^2(y) - (4/3)y - 1

where g_1^2(y) = y is the qualifying constraint for this Lagrange function.

For this iteration, the relaxed dual subproblems are solved in the region 0 ≤ y ≤ 1.
The tightest bounds on x for this region can be found by solving the following two

[Figure: branch and bound tree; nodes annotated with their y values (y = 1.0 at the root; y = 1.5, K=1; y < 1.5, K=2; y = 1.25, K=3; y = 0.333 and y = 0.047, K=4) and marked as explored, fathomed, or unexplored]

Figure 3 Branch and bound tree for illustrating example
problems:

    min_{x,y} ± x
    s.t. -6x + 8y - 3 ≤ 0,
          3x - y - 3 ≤ 0,
          0 ≤ y ≤ 1,
          x ≥ 0.

The solutions of these problems provide lower and upper bounds for x respectively.
Thus, for the region 0 ≤ y ≤ 1, this yields the bounds 0 ≤ x ≤ 4/3.

Since y ≥ 0, it is obvious that g_1^2(y) is nonnegative for all y in the current region. Therefore,
only one relaxed dual problem needs to be solved, with a valid underestimator to
L^2(x, y, μ^2) obtained by fixing x to its lower bound. Moreover, from the first
iteration, the Lagrange function corresponding to node 2 is also a valid cut for this
region. Note, however, that instead of using the original bounds on x in both these
Lagrange functions, the improved bounds can be used. This yields the following
relaxed dual problem:

    min_{y,μ_B} μ_B
    s.t. μ_B ≥ (1/3)y - 4/3
         g_1^1(y) = y - 1 ≤ 0
         μ_B ≥ -(4/3)y - 1
         0 ≤ y ≤ 1.5
    Solution: y = 0.2, μ_B = -1.26667.

At the end of this iteration, there are two candidate regions for further partitioning: (i)
the region 1 ≤ y ≤ 1.5 corresponding to node 1, with a lower bound of -1.5, and (ii)
the region 0 ≤ y ≤ 1 corresponding to node 3, with the lower bound of -1.26667.
Following the criterion of selecting the region with the best lower bound, node 1 is
chosen for further exploration.

Iteration 3

From node 1, the current value of y is 1.5. For this value, the primal problem has the
solution x = 1.5, μ_1^3 = 1/12, μ_2^3 = 0, μ_3^3 = 0, with the objective value of -0.75. The
Lagrange function for this iteration is

    L^3(x, y, μ^3) = -x + xy - y + (1/12)(-6x + 8y - 3) = x g_1^3(y) - (1/3)y - 1/4

where g_1^3(y) = y - 1.5 is the qualifying constraint for this Lagrange function.

For this iteration, the relaxed dual subproblems are solved in the region 1 ≤ y ≤ 1.5.
In this region, solving the bounds problems for x yields 5/6 ≤ x ≤ 1.5. Since g_1^3(y) ≤ 0
for all y, only one relaxed dual problem needs to be solved, with x fixed to its upper
bound. From the first iteration, the Lagrange function corresponding to node 1 is also
a valid cut for this region. Using the improved bounds on x shown above yields the
following relaxed dual problem:

    min_{y,μ_B} μ_B
    s.t. μ_B ≥ -(1/6)y - 5/6
         g_1^1(y) = y - 1 ≥ 0
         μ_B ≥ (7/6)y - 2.5
         0 ≤ y ≤ 1.5
    Solution: y = 1.25, μ_B = -1.04167.
Solution: y = 1.25,I"B = -1.04167.
Again, there is no partition of the region in this iteration, but the relaxed dual provides
a tighter lower bound for this region than was originally available.

At the end of this iteration, there are two candidate regions for further partitioning: (i)
the region 0 ~ y ~ 1, corresponding to the node 3, with the lower bound of -1.26667,
and (ii) the region 1 ~ y ~ 1.5, corresponding to node 4, with the lower bound of
-1.04167. Following the criterion of selecting the region with the best lower bound,
node 3 is chosen for further exploration.

Iteration 4

From node 3, the current value of y is 0.2. For this value, the primal problem has
the solution x = 1.0667, μ_1^4 = 0, μ_2^4 = 0.2667, μ_3^4 = 0, with the objective value
of -1.05333. Note that the solution of this problem has the immediate consequence
that it provides an upper bound that is lower than the lower bound for node 4 (which
is -1.04167). Therefore, node 4 can be immediately fathomed, i.e., removed from
consideration for any further refinement or exploration.

The Lagrange function from the current primal problem is

    L^4(x, y, μ^4) = -x + xy - y + 0.2667(3x - y - 3) = x g_1^4(y) - 1.2667y - 0.8

where g_1^4(y) = y - 0.2 is the qualifying constraint for this Lagrange function.

For this iteration, the relaxed dual subproblems are solved in the region 0 ≤ y ≤ 1.0,
and try to provide refined lower bounds by partitioning the region further. The tightest
bounds for x in this region are 0 ≤ x ≤ 4/3.

Unlike the previous two iterations, it is necessary to partition the current region
since -0.2 ≤ g_1^4(y) ≤ 1.3 and the reduction tests of Section 4.2 do not provide any
help. It is therefore necessary to solve two relaxed dual subproblems in the current
iteration. For both these problems, the Lagrange functions from nodes 2 and 3 are
valid underestimators. These two problems are shown below:

    Node 5                                Node 6
    min_{y,μ_B} μ_B                       min_{y,μ_B} μ_B
    s.t. μ_B ≥ (1/3)y - 4/3               s.t. μ_B ≥ (1/3)y - 4/3
         g_1^1(y) = y - 1 ≤ 0                  g_1^1(y) = y - 1 ≤ 0
         μ_B ≥ -(4/3)y - 1                     μ_B ≥ -(4/3)y - 1
         μ_B ≥ -1.2667y - 0.8                  μ_B ≥ 0.0667y - 1.0667
         g_1^4(y) = y - 0.2 ≥ 0                g_1^4(y) = y - 0.2 ≤ 0
         0 ≤ y ≤ 1.5                           0 ≤ y ≤ 1.5
    Solution: y = 0.333,                  Solution: y = 0.04762,
    μ_B = -1.2222.                        μ_B = -1.06349.

Together, these two problems provide a tighter lower bound (-1.2222) for the region
o ~ y ~ 1 than before (-1.26667).
92 V. VISWESWARAN AND C. A. FLOUDAS

At the end of this iteration, there are two candidate regions for further partitioning: (i)
the region 0 ≤ y ≤ 0.2, corresponding to node 6, with the lower bound of -1.06349,
and (ii) the region 0.2 ≤ y ≤ 1, corresponding to node 5, with the lower bound of
-1.2222. Therefore, node 5 is chosen for further refinement.

The algorithm continues in this fashion for 18 iterations, converging to the global
solution of -1.0833 at x = 1.1667, y = 0.5 with a tolerance of 0.001 between the
upper and lower bounds. It is interesting to note that the original GOP algorithm,
which does not compute the tightest bounds on the x variables at each iteration, takes
76 iterations to converge with the same tolerance. This indicates the importance of
having the tightest possible bounds on the connected variables at each iteration.

5 REFORMULATION OF THE RELAXED DUAL AS A SINGLE MILP PROBLEM

The solution of the relaxed dual subproblems at each node is the most time-consuming
step in the algorithm outlined in Section 4. The reduction test mentioned in Section
4.2 can help to prune the branch-and-bound tree at each node; however, it is still
necessary to solve a large number of subproblems at each iteration. It is very likely
that the solutions of most of these subproblems are useless as far as the succeeding
iterations are concerned, that is, most of the nodes will be fathomed as soon as they
are spawned. Naturally, this raises the question whether these subproblems can be
solved implicitly. This section presents one possible approach for reformulating
the relaxed dual problem at each iteration so that the implicit enumeration of all the
solutions can be achieved by the solution of an MILP problem.

At the K-th iteration, the Lagrange function has the form given by (3.4). Consider the
i-th term in the summation. In each of the 2^{NI_c^K} relaxed dual subproblems, this term
takes on either of two values:

    x_i^L g_i^K(y)   if g_i^K(y) ≥ 0
    x_i^U g_i^K(y)   if g_i^K(y) ≤ 0

Now, x_i can be implicitly expressed as a combination of its lower and upper bounds:

    x_i = α_i^K x_i^U + (1 - α_i^K) x_i^L                                    (3.5)

where α_i^K ∈ {0, 1}.


This leads to the following formulation for the i-th term in (3.4):

    x_i g_i^K(y) = t_i^K + x_i^L g_i^K(y)

where

    t_i^K ≥ α_i^K (x_i^U - x_i^L) g_i^{K,L}
    t_i^K ≥ (x_i^U - x_i^L) (g_i^K(y) - (1 - α_i^K) g_i^{K,U})
    α_i^K g_i^{K,L} ≤ g_i^K(y) ≤ (1 - α_i^K) g_i^{K,U}

where g_i^{K,L} and g_i^{K,U} are respectively the lower and upper bounds on the qualifying
constraints. As the following property shows, this can be used to reformulate the
relaxed dual problem as a mixed integer linear program (MILP):

Property 5.1 Suppose that, at the K-th iteration, C denotes the current node to be
partitioned, and R_C denotes the set of constraints defining the region associated with
C. Then, the best solution from all the relaxed dual subproblems at this iteration can
be obtained as the optimal solution of the following mixed-integer linear program:

    min_{y∈Y, μ_B, t, α}  μ_B                                                        (3.6)
    s.t.  μ_B ≥ Σ_{i=1}^{NI_c^K} t_i^K + Σ_{i=1}^{NI_c^K} x_i^L g_i^K(y) + L_0^K(y, λ^K, μ^K)   (3.7)
          t_i^K ≥ α_i^K (x_i^U - x_i^L) g_i^{K,L}                                    (3.8)
          t_i^K ≥ (x_i^U - x_i^L) (g_i^K(y) - (1 - α_i^K) g_i^{K,U})                 (3.9)
          α_i^K g_i^{K,L} ≤ g_i^K(y) ≤ (1 - α_i^K) g_i^{K,U}                         (3.10)
          t^K ∈ R^{NI_c^K},  α^K ∈ {0,1}^{NI_c^K},  y ∈ Y                            (3.11)
          (y, μ_B) ∈ R_C                                                             (3.12)

where g_i^{K,L} and g_i^{K,U} are the lower and upper bounds on g_i^K(y) over Y.

Proof. Since α_i^K is a binary variable, it can take on only two values in any solution,
either 0 or 1. Consider these two possible cases for α_i^K:

Case I (α_i^K = 0):

In this case, equations (3.8)-(3.10) reduce to

    t_i^K ≥ 0                                                   (3.13)
    t_i^K ≥ (x_i^U - x_i^L) (g_i^K(y) - g_i^{K,U})              (3.14)
    0 ≤ g_i^K(y) ≤ g_i^{K,U}                                    (3.15)

Since g_i^K(y) ≤ g_i^{K,U} for all y ∈ Y, (3.14) is redundant. Similarly, the second
inequality in (3.15) is also trivially satisfied. Therefore, if this set of constraints
is active in any solution, then t_i^K = 0, the contribution from the i-th components
of the first two terms in (3.7) to μ_B is x_i^L g_i^K(y), and in addition, we must also
have g_i^K(y) ≥ 0.

Case II (α_i^K = 1):

In this case, equations (3.8)-(3.10) reduce to

    t_i^K ≥ (x_i^U - x_i^L) g_i^{K,L}                           (3.16)
    t_i^K ≥ (x_i^U - x_i^L) g_i^K(y)                            (3.17)
    g_i^{K,L} ≤ g_i^K(y) ≤ 0                                    (3.18)

Since g_i^K(y) ≥ g_i^{K,L} for all y ∈ Y, (3.16) is redundant. Similarly, the first
inequality in (3.18) is trivially satisfied. Therefore, if this set of constraints is
active in any solution, then t_i^K = (x_i^U - x_i^L) g_i^K(y), the contribution from the i-th
components of the first two terms in (3.7) to μ_B is x_i^U g_i^K(y), and in addition,
g_i^K(y) ≤ 0.

Thus, it can be seen that any solution of the relaxed dual problem in Step 4 of the
algorithm in Section 4 is automatically embedded in the set of constraints described by
(3.7)-(3.12). Therefore, (3.6)-(3.12) is a valid formulation for obtaining the solution
of the relaxed dual problem. □

Remark 5.1 If the functions L_0^K(y, λ^K, μ^K) are convex in y, then (3.6)-(3.12) is a
convex MINLP, and can be solved with Generalized Benders Decomposition
(Geoffrion, 1972; Floudas et al., 1989) or the Outer Approximation algorithm (Duran
and Grossmann, 1986).

It should be noted that the reduction tests of Section 4.2 can also be applied to the
MILP formulation, as shown by the following property.

Property 5.2 At the K-th iteration,

(i) If g_i^K(y) ≥ 0 for all y (respectively g_i^K(y) ≤ 0 for all y), then variable α_i^K can be
fixed to 0 (respectively 1).

(ii) If g_i^K(y) = 0 for all y, then variable α_i^K vanishes from formulation (3.6)-(3.12).

Proof. (i) Suppose that g_i^K(y) ≥ 0 for all y ∈ Y. Then, to underestimate the Lagrange
function from the K-th iteration, x_i must be set to x_i^L. By the definition of α_i^K
in (3.5), this leads to α_i^K = 0. Conversely, if g_i^K(y) ≤ 0 for all y ∈ Y, then α_i^K must
be equal to 1.

(ii) If g_i^K(y) = 0 for all y ∈ Y, then this implies that

    g_i^{K,L} = g_i^K(y) = g_i^{K,U} = 0.

Therefore, in (3.6)-(3.12), t_i^K is always equal to zero, and the variable α_i^K vanishes
from the formulation. □

Backtracking

With the MILP reformulation, it is possible to solve the relaxed dual subproblems
implicitly for the best solution at each iteration. However, it is not sufficient to find
the best solution; it must also be determined whether any of the other partitions can
provide a useful solution for further refinement.

Consider the relaxed dual subproblems solved when node j is being partitioned.
Suppose that this node was partitioned during iteration K. Then, there are NI_c^K
binary variables, and 2^{NI_c^K} partitions to consider. Solving the problem (3.6)-(3.12)
gives the best solution among these partitions. Suppose that this solution corresponds
to the combination α^c. Suppose also that J^c is the set of binary variables that are
equal to 1 in this combination, and that there are NJ^c of them. Consider now the
following cut:

    Σ_{i∈J^c} α_i - Σ_{i∉J^c} α_i ≤ NJ^c - 1

If problem (3.6)-(3.12) is resolved with the above cut added to the problem, then the
solution will have a value for α different from α^c, and will therefore correspond to
a different subregion of the current problem. Note that the objective value of this
problem represents the "second" best possible solution. The best solution, of course,
is the one corresponding to the solution of the first MILP problem, with α = α^c.
Therefore, this methodology is sufficient to go back to a partitioned node at any point.

Note that although the size of the MILP problems increases slightly at each iteration
due to the accumulation of constraints from previous iterations, the number of binary
variables present in these problems is equal to the number of connected variables for
each iteration. In other words, the number of binary variables in the MILP problems
is bounded by the number of x variables in the original problem.

5.1 The GOP/MILP Algorithm

As before, given a node j in the branch and bound tree, P_j is its parent node, and I_j
is the iteration at which node j is created. R_j is the set of constraints defining the
region corresponding to node j. At any point, N denotes the total number of nodes
in the tree, and C denotes the current node. F denotes the set of iterations with a
feasible primal problem, while I denotes the set of iterations when the primal problem
was infeasible. A_j denotes the set of integer cuts to be used when solving the MILP
problem for node j.

STEP 0: Initialization

This step is the same as in Section 4.4, with the addition of setting A_1 = ∅.

STEP 1 - STEP 3:

Same as in Section 4.4.

STEP 4: Current Relaxed Dual Problem

Solve the MILP problem (3.6)-(3.12).

(i) If μ_B^j < f^U - ε, set j = N + 1, P(j) = C, N = N + 1, and store the solution
in μ_B^j, y^j.

(ii) If μ_B^j ≥ f^U - ε, fathom the solution.

Let the solution for the binary variables in this problem be α = α^C. Let J^C be the
set of variables which are 1 in this solution, and let NJ^C be the number of such binary
variables.

STEP 5: Selecting a New Lower Bound

Same as in Section 4.4.

STEP 6: Regenerating Solutions From Partitioned Nodes

Suppose that the solution selected in Step 5 corresponds to node C, and that this node
was originally partitioned at iteration k. Then, add the cut

    Σ_{i∈J^C} α_i - Σ_{i∉J^C} α_i ≤ NJ^C - 1

to the set of binary cuts A_C. Solve the MILP problem (3.6)-(3.12) with the added set
of binary cuts A_C. Suppose the solution of this problem is μ_B'.

(i) If μ_B' < f^U - ε, then set j = N + 1, P(j) = C, N = N + 1, and store the
solution in μ_B^j, y^j. Also set α^C to be the solution of the binary variables in this
formulation.

(ii) If μ_B' ≥ f^U - ε, fathom the node C.

STEP 7: Check for Convergence

Same as in Section 4.4.

Remark 5.2 After the MILP problem has been solved in either Step 4 or Step 6, an
integer cut is added to the corresponding formulation which ensures that that solution
cannot be repeated. This implies that the same MILP formulation might be solved
several times over the course of the iterations, with small differences arising from the
additional integer cuts. Consequently, there is considerable potential for storing the
tree information from these problems for use in future iterations.

Remark 5.3 At each iteration of the algorithm, there is a single MILP problem solved
in Step 4 or Step 6, as compared to the original algorithm, which needs to solve 2^{NI_c^K}
subproblems at the K-th iteration. This MILP problem contains NI_c^K binary variables
in the case of Step 4, or NI_c^k variables in Step 6. In either case, the number of binary
variables present in any MILP formulation during all the iterations is bounded by the
maximum number of x variables. However, it is usually the case that the number of
connected variables is a fraction of the total number of x variables, implying that the
MILP problems are likely to have few binary variables.

Remark 5.4 The major advantage of the MILP formulation appears when there are more
than about 15 connected variables at any iteration. In such cases, the original algorithm
would need to solve a very large number (2^{NI_c^K}) of subproblems at that iteration, the
vast majority of which would never be considered as candidate solutions for further
branching. In the case of the MILP algorithm, the implicit enumeration allows for far
fewer problems to be solved. The maximum number of MILP problems solved is twice
the number of iterations of the algorithm.

5.2 Illustration of the GOP/MILP Algorithm

Consider the example from Section 4.5, with a starting point of y^1 = 1 for the
algorithm.

Iteration 1

For y^1 = 1, the first primal problem has the solution x = 0, μ_1^1 = μ_2^1 = μ_3^1 = 0,
with the objective value of -1. The upper bound on the problem is therefore -1. The
Lagrange function is given by

    L^1(x, y, μ^1) = -x + xy - y = x g_1^1(y) - y

where g_1^1(y) = y - 1 is the first (and only) qualifying constraint.

The following MILP problem is solved first in Step 4:

    min_{y, μ_B} μ_B
    s.t. μ_B ≥ t_1^1 - y
         t_1^1 ≥ -1.5 α_1^1
         t_1^1 ≥ 1.5 (g_1^1 - 0.5 (1 - α_1^1))
         -α_1^1 ≤ g_1^1 ≤ 0.5 (1 - α_1^1)
         g_1^1 = y - 1
         0 ≤ y ≤ 1.5

The solution of this problem is y = 0.0, μ_B = -1.5, α_1^1 = 1. Note that this
corresponds to node 2 in the branch and bound tree in Figure 3. This solution is
chosen to be the next candidate for branching. However, in order to ensure that the
other regions are also considered for future reference, it is necessary to solve one more
problem, with the cut

    α_1^1 ≤ 0

added to the MILP. This problem has the solution y = 1.5, μ_B = -1.5 and α_1^1 = 0. It
is stored for future reference.
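This first MILP is small enough to state in code. The following sketch (our own, using the open-source PuLP modeler with its default CBC solver; any MILP solver would serve) reproduces the solution y = 0.0, μ_B = -1.5, α_1^1 = 1; adding the cut α_1^1 ≤ 0 (prob += a <= 0) and re-solving yields the second region's solution as described above.

# Iteration-1 GOP/MILP relaxed dual (3.6)-(3.12) for the illustrating
# example: x in [0, 1.5], g(y) = y - 1 with gL = -1.0, gU = 0.5, L0 = -y.
from pulp import LpProblem, LpMinimize, LpVariable, LpBinary, value

xL, xU = 0.0, 1.5
gL, gU = -1.0, 0.5

prob = LpProblem("relaxed_dual_iter1", LpMinimize)
y = LpVariable("y", 0.0, 1.5)
muB = LpVariable("muB")                # epigraph variable for the bound
t = LpVariable("t")                    # linearization variable t_1^1
a = LpVariable("alpha", cat=LpBinary)  # alpha = 1 <=> x at xU, g(y) <= 0

g = y - 1.0                            # qualifying constraint g_1^1(y)
prob += muB                            # objective (3.6)
prob += muB >= t + xL * g - y          # (3.7) with L0(y) = -y
prob += t >= (xU - xL) * gL * a        # (3.8)
prob += t >= (xU - xL) * (g - gU * (1 - a))   # (3.9)
prob += g >= gL * a                    # (3.10), lower half
prob += g <= gU * (1 - a)              # (3.10), upper half

prob.solve()
print(value(y), value(muB), value(a))  # expected: 0.0 -1.5 1.0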

Iteration 2

For y = 0.0, the primal problem has the solution x = 1.0, μ_1^2 = 0, μ_2^2 = 1/3, μ_3^2 = 0,
with the objective value of -1.0. The Lagrange function from this problem is

    L^2(x, y, μ^2) = -x + xy - y + (1/3)(3x - y - 3) = x g_1^2(y) - (4/3)y - 1

where g_1^2(y) = y is the qualifying constraint for this Lagrange function.

Since 0 ≤ y ≤ 1, tight bounds on x can be obtained as 0 ≤ x ≤ 4/3. Since y ≥ 0,
a valid underestimator to L^2(x, y, μ^2) for all y can be obtained by fixing x to its
lower bound. Therefore, there are no binary variables, and consequently the MILP
formulation reduces to the same formulation as in Section 4.4. The solution of the
resulting subproblem is y = 0.2, μ_B = -1.2667.

At the end of this iteration, there are two candidate regions for further branching: (i)
node 1 (1 ≤ y ≤ 1.5) with a lower bound of -1.5, and (ii) node 3 (0 ≤ y ≤ 1) with a
lower bound of -1.2667. The former node is selected for further exploration.

Iteration 3

For y = 1.5, the primal problem has the solution x = 1.5, μ_1^3 = 1/12, μ_2^3 = 0, μ_3^3 = 0,
with the objective value of -0.75. The Lagrange function from this problem is

    L^3(x, y, μ^3) = -x + xy - y + (1/12)(-6x + 8y - 3) = x g_1^3(y) - (1/3)y - 1/4

where g_1^3(y) = y - 1.5 is the qualifying constraint for this Lagrange function.

For the region 1 ≤ y ≤ 1.5, the tightest bounds on x are 5/6 ≤ x ≤ 1.5. Again, only one
relaxed dual problem needs to be solved, with a valid underestimator to L^3(x, y, μ^3)
obtained by fixing x to its upper bound. Therefore, the MILP is again identical to the
original algorithm formulation, and has the solution y = 1.25, μ_B = -1.04167.

At the end of this iteration, there are two candidate regions for further partitioning:
(i) the region 0 ≤ y ≤ 1, corresponding to node 3, with a lower bound of -1.26667,
and (ii) the region 1 ≤ y ≤ 1.5, corresponding to node 4, with the lower bound of
-1.04167. Following the criterion of selecting the region with the best lower bound,
node 3 is chosen for further exploration.

Iteration 4

For y = 0.2, the primal problem has the solution x = 1.0667, μ_1^4 = 0, μ_2^4 = 0.2667,
μ_3^4 = 0, with the objective value of -1.05333. Note that the solution of this problem
provides an upper bound that is lower than the lower bound for node 4 (which is
-1.04167). Therefore, node 4 can be immediately fathomed, i.e., removed from
consideration for any further refinement or exploration.

The Lagrange function from the current primal problem is

    L^4(x, y, μ^4) = -x + xy - y + 0.2667(3x - y - 3) = x g_1^4(y) - 1.2667y - 0.8

where g_1^4(y) = y - 0.2 is the qualifying constraint for this Lagrange function.

For this iteration, the relaxed dual subproblems are solved in the region 0 ≤ y ≤ 1.0,
and try to provide refined lower bounds by partitioning the region further. The tightest
bounds for x in this region are 0 ≤ x ≤ 4/3.

Unlike the previous two iterations, it is necessary to partition the current region since
-0.2 ≤ g_1^4(y) ≤ 1.3. Therefore, the MILP in this iteration takes the form:

    min_{y, μ_B} μ_B
    s.t. μ_B ≥ (1/3)y - 4/3
         g_1^1(y) = y - 1 ≤ 0
         μ_B ≥ -(4/3)y - 1
         μ_B ≥ t_1^4 - 1.2667y - 0.8
         t_1^4 ≥ -0.26667 α_1^4
         t_1^4 ≥ 1.3333 (g_1^4 - 0.8 (1 - α_1^4))
         -0.2 α_1^4 ≤ g_1^4 ≤ 0.8 (1 - α_1^4)
         g_1^4 = y - 0.2
         0 ≤ y ≤ 1.5

The solution of this problem is y = 0.333, μ_B = -1.2222, α_1^4 = 0.


Thus, the MILP algorithm produces the exact sequence of solutions given by the
original branch and bound algorithm. As in Section 4.5, this algorithm also takes 18
iterations to converge.

Remark 5.5 Note that in this example, there is no arguable advantage to using the
MILP formulation, since it needs to be solved for both combinations of α_1 at each
iteration. However, for problems with more than one connected variable, it is obvious
that this formulation can offer a major advantage over the original formulation. This is
because at each iteration, no more than 2 MILP problems need to be solved. Although
these problems are bigger in size and more complex than the original relaxed dual
subproblems, their structure is such that finding their solution is not really dependent
on the presence of the binary variables, and a good MILP solver can be expected to
solve them very efficiently. At the same time, they offer the key advantage of not
having to solve the full set of subproblems at each iteration.

It should be noted, however, that the convenience of solving just one compact problem
is achieved at the expense of problem size. Because all possible solutions of the relaxed
dual problem have to be incorporated in the GOP/MILP formulation, the result is a
much larger problem to solve. A number of constraints and variables need to be used
to implicitly represent all the possible bound combinations. For large problems, this
could cause difficulties, although the availability of increasingly fast MILP solvers
makes this less of a drawback.

6 A LINEAR BRANCHING SCHEME FOR THE GOP ALGORITHM
In both the GOP and GOP/MILP algorithms, the qualifying constraints (i.e., the
gradients of the Lagrange function) are used to partition the y-space. The reduction
properties presented in Section 4 can provide a significant reduction in the number of
connected variables and subsequently the number of partitions. However, in the worst
case, the number of subproblems solved still increases exponentially with the number
of connected variables. It is then natural to ask the following question: Is it possible to
develop a valid lower bound at each iteration using only a linearly increasing number

of relaxed dual subproblems? In this section, we present one branching scheme that
achieves this goal. This scheme originates from the study of Barmish et al. (1995a,
1995b) on the stability of polytopes of matrices in robust control systems.

6.1 Reformulation of Qualifying Constraints

Consider the relaxed dual problem at the kth iteration. This problem has the constraint

μB ≥ L^k(y, λ^k, μ^k) + Σ_{i=1}^{NI_c^k} z_i·g_i^k(y).

Suppose that all the z variables are bounded between -1 and 1. If this is not the
case, it can be achieved by use of the following linear transformation. Suppose that
z^L ≤ z ≤ z^U. Then, define z' such that -1 ≤ z' ≤ 1, and

z = a·z' + b

The substitution of the lower and upper bounds gives

z^L = a·(-1) + b,  and  z^U = a·(1) + b

leading to

a = (z^U - z^L)/2  and  b = (z^U + z^L)/2

The variables z' can then be substituted for z using the above transformation, leading
to a Lagrange function in y and z'. We will continue the presentation in this section
by considering the case -1 ≤ z ≤ 1.
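For concreteness, this rescaling can be written as a small C routine (C being the language of the cGOP implementation described in the companion paper); the function below is our illustration, not part of that package:

#include <stdio.h>

/* Coefficients of the affine map z = a*z' + b that carries
   z' in [-1, 1] onto z in [zl, zu]: a = (zu - zl)/2, b = (zu + zl)/2. */
void scale_to_unit_box(double zl, double zu, double *a, double *b)
{
    *a = 0.5 * (zu - zl);
    *b = 0.5 * (zu + zl);
}

int main(void)
{
    double a, b;
    scale_to_unit_box(0.0, 1.5, &a, &b);   /* e.g. 0 <= z <= 1.5 */
    printf("z = %g * z' + %g\n", a, b);    /* prints z = 0.75 * z' + 0.75 */
    return 0;
}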

The following observation is now made:

(a) If g_i^k(y) ≥ 0, then z_i·g_i^k(y) ≥ -g_i^k(y), since z_i ≥ -1.

(b) If g_i^k(y) ≤ 0, then z_i·g_i^k(y) ≥ g_i^k(y), since z_i ≤ 1.

Combining these two cases leads to the inequality

z_i·g_i^k(y) ≥ -|g_i^k(y)|

and

μB ≥ L^k(y, λ^k, μ^k) - Σ_{i=1}^{NI_c^k} |g_i^k(y)|     (3.19)

The first term on the right hand side is convex, and can remain unaltered. Consider
now the summation term. Using the concept of the infinity norm, (3.19) can be written
as

μB ≥ L^k(y, λ^k, μ^k) - NI_c^k · max_{i=1,...,NI_c^k} |g_i^k(y)|     (3.20)

For any value of y, there is some j ∈ {1, ..., NI_c^k} such that

|g_j^k(y)| = max_{i=1,...,NI_c^k} |g_i^k(y)|

implying that

|g_j^k(y)| ≥ |g_i^k(y)|,     i = 1, ..., NI_c^k     (3.21)
Consider the following two possibilities:

(a) If g_j^k(y) ≥ 0, then |g_j^k(y)| = g_j^k(y), and (3.21) reduces to the two inequalities

g_j^k(y) ≥ g_i^k(y)
g_j^k(y) ≥ -g_i^k(y)     i = 1, ..., NI_c^k, i ≠ j     (3.22)

and (3.20) becomes

μB ≥ L^k(y, λ^k, μ^k) - NI_c^k·g_j^k(y)

(b) If g_j^k(y) ≤ 0, then |g_j^k(y)| = -g_j^k(y), and (3.21) reduces to the two inequalities

g_j^k(y) ≤ g_i^k(y)
g_j^k(y) ≤ -g_i^k(y)     i = 1, ..., NI_c^k, i ≠ j     (3.23)

and (3.20) becomes

μB ≥ L^k(y, λ^k, μ^k) + NI_c^k·g_j^k(y)

The two cases presented above indicate how the summation in (3.19) can be replaced
by a linear term when g_j^k(y) represents the maximum of all the qualifying constraints
at a given value of y. This concept can then be extended to cover the entire region for
y. To do this, the above procedure needs to be repeated for all values of j, resulting
in 2 × NI_c^k subproblems that need to be solved in order to properly underestimate the
Lagrange function at all values of y.
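A minimal sketch of the resulting enumeration loop is given below; solve_region() is a hypothetical placeholder for building and solving one relaxed dual subproblem, not a routine of the actual implementation:

#include <stdio.h>

/* Hypothetical stand-in for building and solving the relaxed dual
   subproblem over the region where |g_j| dominates all other
   qualifying constraints and g_j has the indicated sign. */
static void solve_region(int j, int sign)
{
    printf("subproblem for j = %d with g_j %s 0\n",
           j, sign > 0 ? ">=" : "<=");
}

/* Linear branching: 2*NI subproblems per iteration (one pair per
   connected variable) instead of the 2^NI of the original scheme. */
static void linear_branching(int NI)
{
    for (int j = 0; j < NI; j++) {
        solve_region(j, +1);   /* g_j(y) >= 0: add (3.22) */
        solve_region(j, -1);   /* g_j(y) <= 0: add (3.23) */
    }
}

int main(void)
{
    linear_branching(3);   /* prints the six subproblem labels */
    return 0;
}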

Remark 6.1 It should be noted that with the use of the linear branching scheme, the
same space in y is now spanned by a linear number of underestimators (as opposed
to an exponential number in the original algorithm). These underestimators are
therefore less tight than those of the original algorithm, so that at the end of each
iteration, the lower bounds obtained from the dual problems with the linear branching
scheme will be looser than those obtained with the original algorithm, resulting in an
increase in the number of iterations required for convergence. At the same time, the
number of subproblems solved at each iteration is vastly reduced, so the total
computational effort required for the entire algorithm is likely to be much smaller
with the linear branching scheme.

6.2 Illustration

Consider the following problem:

min   -z₁y₁ - z₂y₂
s.t.   z₁ - y₁ = 0
       z₂ - y₂ = 0
       -1 ≤ z, y ≤ 1

Suppose that the GOP algorithm is applied to this problem, with the starting point of
y = 0. The first primal problem has the solution z = 0, λ₁¹ = 0 and λ₂¹ = 0. This
leads to the following constraint in the first relaxed dual problem:

μB ≥ z₁(0 - y₁) + z₂(0 - y₂)
   ≥ -|0 - y₁| - |0 - y₂|
where g₁¹(y) = 0 - y₁ and g₂¹(y) = 0 - y₂ are the two qualifying constraints. The
region in the y variables, as well as its division using these qualifying constraints as
used by the original GOP and GOP/MILP algorithms, is shown in Figure 4(a). Note
that the four regions A, B, C and D represent the four relaxed dual subproblems
solved by the original algorithms.

Suppose that |g₁¹(y)| ≥ |g₂¹(y)|. There are two possibilities:

(a) g₁¹(y) ≥ 0. Then, the use of (3.22) results in

y₁ - y₂ ≤ 0
y₁ + y₂ ≤ 0
y₁ ≤ 0

[Figure 4: the y₁-y₂ box is divided (a) into the four quadrants A, B, C and D by the
original qualifying constraints, and (b) into the four diagonal regions E, F, G and H
by the transformed constraints.]

(a) Original qualifying constraints    (b) Transformed qualifying constraints

Figure 4 Transformed qualifying constraints



The region of y described by these constraints is shown as region E in Figure 4(b).
The corresponding constraint for the relaxed dual problem is given by

μB ≥ -2g₁¹(y) = 2y₁

(b) g₁¹(y) ≤ 0. Then, the use of (3.23) results in

y₁ - y₂ ≥ 0
y₁ + y₂ ≥ 0
y₁ ≥ 0

These equations describe region F in Figure 4(b). The corresponding constraint
for the relaxed dual problem is given by

μB ≥ 2g₁¹(y) = -2y₁

Similarly, when |g₂¹(y)| ≥ |g₁¹(y)|, there are two possibilities:

(a) g₂¹(y) ≥ 0. Then, the use of (3.22) results in

y₂ - y₁ ≤ 0
y₂ + y₁ ≤ 0
y₂ ≤ 0

The region of y described by these constraints is shown as region G in Figure 4(b).
The corresponding constraint for the relaxed dual problem is given by

μB ≥ -2g₂¹(y) = 2y₂

(b) g₂¹(y) ≤ 0. Then, the use of (3.23) results in

y₂ - y₁ ≥ 0
y₂ + y₁ ≥ 0
y₂ ≥ 0

These equations describe region H in Figure 4(b). The corresponding constraint
for the relaxed dual problem is given by

μB ≥ 2g₂¹(y) = -2y₂

Thus, it can be seen that the use of the equations (3.22) and (3.23) results in a new
set of partitions of the region in y. For this example, there are still 4 partitions, so
there is no reduction in the number of subproblems to be solved. However, when
the number of connected variables is more than 2, the use of these transformations
will result in a linearly increasing (as opposed to exponentially increasing) number of
subproblems at each iteration. For example, when there are 10 connected variables,
the new partitioning scheme requires 20 relaxed dual subproblems as opposed to 1024
for the original GOP algorithm.

7 CONCLUSIONS
This paper has focussed on presenting the GOP Algorithm of Floudas and Visweswaran
(1990, 1993) in a branch and bound framework. This framework is based upon
branching on the gradients of the Lagrange function, and is considerably simpler
than the original cutting plane algorithm. The primary advantage of the framework
is in simplicity of implementation. In particular, the selection of previous Lagrange
functions as cuts for current dual problems is considerably simplified. Moreover, the
framework allows for the use of a mixed integer formulation that implicitly enumerates
the solutions of all the dual subproblems. This paper has also considered the issue
of reducing the number of subproblems at each iteration, and in Section 6, a new
partitioning scheme was presented that requires only a linear number of subproblems.
This is a significant reduction from the exponential number of subproblems required
by the original algorithm.

The new algorithms have been implemented in a package cGOP (Visweswaran and
Floudas, 1995a) and applied to a large number of problems. The results of these
applications can be found in the companion paper (Visweswaran and Floudas, 1995b).

Acknowledgements
Financial support from the National Science Foundation under grant CTS-9221411 is
gratefully acknowledged.

REFERENCES
[1] F. A. Al-Khayyal and J. E. Falk. Jointly constrained biconvex programming.
Math. ofOper. Res., 8(2):273, 1983.
[2] F. Archetti and F. Schoen. A Survey on the Global Optimization Problem: General
Theory and Computational Approaches. Annals of Operations Research, 1:87,
1984.

[3] B. R. Barmish, C. A. Floudas, H. V. Hollot, and R. Teinpo. A Global Linear


Programming Solution to Some Open Robustness Problems Including Matrix
Polytope Stability. IEEE Transactions on Automatic Control, 1995a. Submitted
for Publication.

[4] B. R. Barmish, C. A. Floudas, H. V. Hollot, and R. Tempo. A Global Linear


Programming Solution to Some Open Robustness Problems Including Matrix
Polytope Stability. Proceedings of the ACC 95, Seattle, June 21-23, 1995b. To
appear.

[5] L.C.W. Dixon and G.P. Szego. Towards global optimisation. North-Holland,
Amsterdam, 1975.

[6] L.C.W. Dixon and G.P. Szego. Towards global optimisation 2. North-Holland,
Amsterdam, 1978.

[7] M. A. Duran and I. E. Grossmann. An outer approximation algorithm for a class


of mixed-integer nonlinear programs. Mathematical Programming, 36:307,
1986.

[8] C. A. Floudas, A. Aggarwal, and A. R. Ciric. Global optimum search for
nonconvex NLP and MINLP problems. Comp. & Chem. Eng., 13(10):1117,
1989.

[9] C. A. Floudas and P. M. Pardalos. A Collection of Test Problems for Constrained


Global Optimization Algorithms, volume 455 of Lecture Notes in Computer
Science. Springer-Verlag, Berlin, Germany, 1990.
[10] C. A. Floudas and P. M Pardalos. Recent Advances in Global Optimization.
Princeton Series in Computer Science. Princeton University Press, Princeton,
New Jersey, 1992.

[11] C. A. Floudas and V. Visweswaran. A global optimization algorithm (GOP) for
certain classes of nonconvex NLPs: I. Theory. Comp. & Chem. Eng., 14:1397,
1990.

[12] C. A. Floudas and V. Visweswaran. A primal-relaxed dual global optimization
approach. J. Optim. Theory and Appl., 78(2):187, 1993.
[13] A. M. Geoffrion. Generalized Benders Decomposition. J. Optim. Theory and
Appl., 10(4):237,1972.
[14] R. Horst and H. Tuy. Global Optimization: DeterministicApproaches. Springer-
Verlag, Berlin, Germany, 1990.
[15] W. B. Liu and C. A. Floudas. A Remark on the GOP Algorithm for Global
Optimization. J. Global Optim., 3:519,1993.
[16] W. B. Liu and C. A. Floudas. Convergence of the GOP Algorithm for a Large
Class of Smooth Optimization Problems. Journal of Global Optimization, 6:207,
1995.
[17] J. Mockus. Bayesian Approach to Global Optimization. Kluwer Academic
Publishers, Amsterdam, Holland, 1989.
[18] P. M. Pardalos and J. B. Rosen. Constrained global optimization: Algorithms
and applications, volume 268 of Lecture Notes in Computer Science. Springer
Verlag, Berlin, Germany, 1987.
[19] P.M. Pardalos and J.B. Rosen. Methods for global concave minimization: A
bibliographic survey. SIAM Review, 28(3):367, 1986.
[20] A. Törn and A. Žilinskas. Global Optimization, volume 350 of Lecture Notes in
Computer Science. Springer-Verlag, Berlin, Germany, 1989.
[21] V. Visweswaran and C. A. Floudas. A global optimization algorithm (GOP) for
certain classes of nonconvex NLPs: II. Application of theory and test problems.
Comp. & Chem. Eng., 14:1419, 1990.
[22] V. Visweswaran and C. A. Floudas. New properties and computational improve-
ment of the GOP algorithm for problems with quadratic objective functions and
constraints. J. Global Optim., 3(3):439, 1993.
[23] V. Visweswaran and C. A. Floudas. cGOP: A User's Guide. Princeton University,
Princeton, New Jersey, 1995a.
[24] V. Visweswaran and C. A. Floudas. Computational Results For an Efficient
Implementation of the GOP Algorithm and Its Variants. In Global Optimization
in Engineering Design, (Ed.) I. E. Grossmann, Kluwer Book Series in Nonconvex
Optimization and Its Applications, Chapter 4, 1995b.
4
COMPUTATIONAL RESULTS FOR AN EFFICIENT IMPLEMENTATION OF THE GOP ALGORITHM AND ITS VARIANTS

V. Visweswaran* and C. A. Floudas**

* Mobil Research and Development Corporation, Princeton, NJ
** Department of Chemical Engineering, Princeton University, Princeton, NJ

ABSTRACT

Recently, Floudas and Visweswaran (1990, 1993) proposed a global optimization algorithm
(GOP) for the solution of a large class of nonconvex problems through a series of primal
and relaxed dual subproblems that provide upper and lower bounds on the global solution.
Visweswaran and Floudas (1995a) proposed a reformulation of the algorithm in the framework
of a branch and bound approach that allows for an easier implementation. They also proposed
an implicit enumeration of all the nodes in the resulting branch and bound tree using a mixed
integer linear (MILP) formulation, and a linear branching scheme that reduces the number
of subproblems from exponential to linear. In this paper, a complete implementation of the
new versions of the GOP algorithm, as well as detailed computational results of applying the
algorithm to various classes of nonconvex optimization problems, is presented. The problems
considered include pooling and blending problems, problems with separation and heat
exchanger networks, robust stability analysis with real parameter uncertainty, and concave and
indefinite quadratic problems of medium size.

1 INTRODUCTION

Floudas and Visweswaran (1990, 1993) proposed a global optimization algorithm
(GOP) for the solution of a large class of nonconvex problems. The algorithm
solves the original problem iteratively through a series of primal and relaxed dual
subproblems, which provide upper and lower bounds on the global solution. The
algorithm has a guarantee of finite convergence to an ε-optimal solution; however,
the nature of its cutting plane approach renders the implementation very difficult,
especially in the steps leading to the choice of underestimators to be used during

various iterations. To circumvent this problem, Visweswaran and Floudas (1995a)
proposed the reformulation of the algorithm in the framework of a branch and bound
approach. At each iteration, the gradients of the Lagrange function are used for
branching, with the primal and relaxed dual problems at each node used to
provide upper and lower bounds on the global solution. The paper also addressed
the question of implicit enumeration of all the nodes in the tree by using a mixed
integer linear (MILP) formulation for the relaxed dual problem, and proposed a new
branching scheme that only requires a linear number of relaxed dual subproblems at
each iteration.

In this paper, a complete implementation of the new versions of the GOP algorithm,
along with computational results, is discussed. The actual details of the implementation
can be found in Appendix A, which discusses the various aspects involved in the
implementation, including reduction tests and local enhancements at each node of the
tree. In particular, the movement of data from one part of the program to another is
discussed in detail. In the following sections, the results of applying the implementation
to various classes of nonconvex optimization problems, including pooling and blending
problems, problems with separation and heat exchanger networks, and quadratic
problems from the literature, are described.

2 COMPUTATIONAL RESULTS

A complete description of the GOP and GOP/MILP algorithms can be found in
Visweswaran and Floudas (1995a). These algorithms have been implemented in a
complete package, cGOP (Visweswaran and Floudas, 1995b). The details of the
implementation can be found in Appendix A. In this section, we present the results
of the application of the cGOP package to various problems in chemical engineering
design and control and in mathematical programming.

2.1 Heat Exchanger Network Problems

Heat exchanger network synthesis problems have traditionally been solved using a
decomposition strategy, where the aims of targeting, selection of matches and opti-
mization of the resulting network configuration are treated as independent problems.
Given the minimum utility requirements and a set of matches, a superstructure of
all the possible alternatives is formulated. The resulting optimization problem is
nonconvex. In this section, two such superstructures of heat exchanger networks are
solved using the GOP algorithm.

The problems solved in this section have the following form:

min  Σ_{(i,j)∈MA} c_ij·[ Q_ij / (U_ij·LMTD_ij) ]^β_ij

s.t.

(Initial splitter mass balance)

(Mixer balances at exchanger inlets)
f^I_k' + Σ_{k''∈S_k'} f_{k'',k'} - f^E_k' = 0,     ∀ k' ∈ HOT

(Splitter balances at exchanger outlets)
f^O_k' + Σ_{k''∈S_k'} f_{k',k''} - f^E_k' = 0,     ∀ k' ∈ HOT

(Energy balances at mixers)
T^k·f^I_k' + Σ_{k''∈S_k'} f_{k'',k'}·t^O_k'' - f^E_k'·t^I_k' = 0,     ∀ k' ∈ HOT

(Energy balances in exchangers)
Q_ij = f^E_{i,j}·(t^I_{i,j} - t^O_{i,j}),     ∀ (i,j) ∈ MA
Q_ij = f^E_{j,i}·(t^O_{j,i} - t^I_{j,i}),     ∀ (i,j) ∈ MA

(Approximation of the log-mean temperature difference)
LMTD_ij = (2/3)·(DT1_ij·DT2_ij)^(1/2) + (1/6)·(DT1_ij + DT2_ij)

Here, U_ij are the fixed heat transfer coefficients. It should be noted that for fixed
Q_ij, the objective function is convex. Therefore, by projecting on the flow rates
f, the primal problem becomes convex in the remaining variables (the temperatures
and temperature differences). Linearization of the Lagrange function ensures that the
relaxed dual subproblems are LP subproblems in the flowrates.
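For reference, the LMTD approximation in the last equation can be computed with the small helper below; this is our illustration, not a routine from the cGOP package:

#include <math.h>

/* Approximate log-mean temperature difference used in the objective:
   LMTD ~= (2/3)*(dt1*dt2)^(1/2) + (1/6)*(dt1 + dt2).
   Unlike the exact LMTD, this form is smooth when dt1 == dt2. */
double lmtd_approx(double dt1, double dt2)
{
    return (2.0 / 3.0) * sqrt(dt1 * dt2) + (dt1 + dt2) / 6.0;
}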

Example 2.1 This example is taken from Floudas and Ciric (1989). In this problem,
the objective is to determine the globally optimal network for a system of two hot
streams and one cold stream. The superstructure of all possible solutions is shown in
Figure 1. Based upon this superstructure, the model can be formulated as the following
optimization problem:

min  1300·[ 1000 / (0.05·((2/3)·(ΔT₁₁·ΔT₁₂)^(1/2) + (1/6)·(ΔT₁₁ + ΔT₁₂))) ]^0.6
   + 1300·[ 600 / (0.05·((2/3)·(ΔT₂₁·ΔT₂₂)^(1/2) + (1/6)·(ΔT₂₁ + ΔT₂₂))) ]^0.6

s.t.
f₁ᴵ + f₂ᴵ = 10
f₁ᴵ + f₂₁ - f₁ᴱ = 0
f₂ᴵ + f₁₂ - f₂ᴱ = 0
f₁ᴼ + f₁₂ - f₁ᴱ = 0
f₂ᴼ + f₂₁ - f₂ᴱ = 0
150·f₁ᴵ + t₂ᴼ·f₂₁ - t₁ᴵ·f₁ᴱ = 0
150·f₂ᴵ + t₁ᴼ·f₁₂ - t₂ᴵ·f₂ᴱ = 0
f₁ᴱ·(t₁ᴼ - t₁ᴵ) = 1000
f₂ᴱ·(t₂ᴼ - t₂ᴵ) = 600
ΔT₁₁ = 500 - t₁ᴼ,   ΔT₁₂ = 250 - t₁ᴵ
ΔT₂₁ = 350 - t₂ᴼ,   ΔT₂₂ = 200 - t₂ᴵ
ΔT₁₁, ΔT₁₂, ΔT₂₁, ΔT₂₂ ≥ 10
Considering the set of possible solutions inherent in Figure 1, it is obvious that the
bypass streams (f₁₂ and f₂₁) can never be simultaneously active, i.e., at least one of
these streams has to be zero. Therefore, two different problems can be solved, one
with f₁₂ = 0 and another with f₂₁ = 0. When the GOP algorithm is applied to the

II IE 110
1 1
tI to
1 1

f2~
10 10
fl~
150 0 310 0

/,1 /,E /,0


2 2 2
tI to
2 2

Figure 1 Heat Exchanger Network Superstructure For Example 2.1

10 10 10

Figure 2 Optimal Configuration For Example 2.1


116 v. VISWESWARAN AND C. A. FLOUDAS
problem in this form, the optimal solution (given in Figure 2) is found in 11 iterations,
needing 0.54 CPU seconds on an HP 730.

Example 2.2 This example is also taken from Floudas and Ciric (1989). It features
three hot streams and two cold streams.

min

s.t.   f₁ᴵ + f₂ᴵ + f₃ᴵ = 45
       f₁ᴵ + f₂₁ + f₃₁ - f₁ᴱ = 0
       f₂ᴵ + f₁₂ + f₃₂ - f₂ᴱ = 0
       f₃ᴵ + f₁₃ + f₂₃ - f₃ᴱ = 0
       f₁ᴼ + f₁₂ + f₁₃ - f₁ᴱ = 0
       f₂ᴼ + f₂₁ + f₂₃ - f₂ᴱ = 0
       f₃ᴼ + f₃₁ + f₃₂ - f₃ᴱ = 0
       100·f₁ᴵ + t₂ᴼ·f₂₁ + t₃ᴼ·f₃₁ - t₁ᴵ·f₁ᴱ = 0
       100·f₂ᴵ + t₁ᴼ·f₁₂ + t₃ᴼ·f₃₂ - t₂ᴵ·f₂ᴱ = 0
       100·f₃ᴵ + t₁ᴼ·f₁₃ + t₂ᴼ·f₂₃ - t₃ᴵ·f₃ᴱ = 0
       f₁ᴱ·(t₁ᴼ - t₁ᴵ) = 2000,   f₂ᴱ·(t₂ᴼ - t₂ᴵ) = 1000,   f₃ᴱ·(t₃ᴼ - t₃ᴵ) = 1500
       ΔT₁₁ = 210 - t₁ᴼ,   ΔT₂₁ = 210 - t₂ᴼ,   ΔT₃₁ = 210 - t₃ᴼ
       ΔT₁₂ = 130 - t₁ᴵ,   ΔT₂₂ = 160 - t₂ᴵ,   ΔT₃₂ = 180 - t₃ᴵ
       ΔT₁₁, ΔT₁₂, ΔT₂₁, ΔT₂₂, ΔT₃₁, ΔT₃₂ ≥ 10
       0 ≤ f

The superstructure for this example is shown in Figure 3. There are a total of 27
variables and 19 constraints (of which six are bilinear). With a projection on the flow
rates, there are six connected variables. The GOP algorithm requires a total of 39
iterations and 54.62 CPU seconds to solve this problem. The optimal solution found by
the algorithm is given in Figure 4.
Figure 3 Heat Exchanger Network Superstructure For Example 2.2

Figure 4 Optimal Configuration For Example 2.2



Figure 5 Heat Exchanger Example From Quesada and Grossmann (1993)

2.2 Heat Exchanger Problems With Linear Cost Functionals

In this section, we apply the GOP algorithm to the global optimization of several heat
exchanger networks with fixed topology. The problems are taken from Quesada and
Grossmann (1993) and assume linear cost functionals for the exchanger areas as well as
arithmetic mean driving forces for the temperature differences between the exchanging
streams. Under these assumptions, the problems reduce to the minimization of a sum
of linear fractional functions (which is nonconvex) over a set of linear constraints.

In order to reduce these problems to a form where the GOP algorithm could be applied,
we employ the ideas of Liu and Floudas (1993), which involve a difference of convex
functions transformation. This involves the use of eigenvalue analysis on the resulting
fractional objective functions in order to determine the smallest quadratic terms that
are needed to "convexify" the objective function. Since this method is very general
and can be of use in various problems of this type, it is outlined in some detail here for
one of the examples.

This example (Example 4 of Quesada and Grossmann, 1993) features a network of
three exchangers used to heat one cold stream and cool three hot streams. This network
is shown in Figure 5, with FCp = 10 for all the streams. The minimum temperature
of approach is 10 K.

The problem formulation, featuring constraints for the heat balances, minimum
temperature approaches and feasibility, is shown below:

min

Temperature Differences:

2ΔT₁ = 150 + T₁ - T₄
2ΔT₂ = 500 + T₂ - T₄ - T₅
2ΔT₃ = 150 + T₃ - T₅

Heat Balances:

Q₁ = 10(T₄ - 300) = 10(450 - T₁)
Q₂ = 10(T₅ - T₄) = 10(500 - T₂)
Q₃ = 10(400 - T₅) = 10(550 - T₃)

Minimum Temperature Approaches:

T₁ - 300 ≥ 10     450 - T₄ ≥ 10
T₂ - T₄ ≥ 10      500 - T₅ ≥ 10
T₃ - T₅ ≥ 10

Feasibility:

The three heat balance equations can be used to eliminate three of the variables in the
problem. Choosing the intermediate streams T₄ and T₅ as the independent variables
leads to

T₁ = 750 - T₄
T₂ = 500 + T₄ - T₅
T₃ = 150 + T₅

Using the minimum temperature approaches, tighter bounds on T₄ and T₅ are obtained:

T₁ ≥ 310  ⟹  750 - T₄ ≥ 310  ⟹  T₄ ≤ 440
T₂ ≥ 10 + T₄  ⟹  500 + T₄ - T₅ ≥ 10 + T₄  ⟹  T₅ ≤ 490

Similarly, the temperature differences reduce to

ΔT₁ = 450 - T₄
ΔT₂ = 500 - T₅
ΔT₃ = 150

Thus, the problem formulation reduces to

min  10000·[ (T₄ - 300)/(450 - T₄) + (T₅ - T₄)/(500 - T₅) + (400 - T₅)/150 ]

s.t.  300 ≤ T₄, T₅ ≤ 400


Consider now the three individual terms inside the parentheses. For the sake of clarity,
the factor of 10000 is omitted below.

• The first fractional term is

F₁ = (T₄ - 300)/(450 - T₄)

The Hessian of this function is given by

δ²F₁/δT₄² = 300/(450 - T₄)³

which is always positive, since T₄ ≤ 400. Therefore, this term is convex for all
values of T₄ and T₅.

• The third term,

F₃ = (400 - T₅)/150,

is a linear term and therefore always convex.

• The second term is

F₂ = (T₅ - T₄)/(500 - T₅)

The Hessian of F₂ is given by

H₂ = [ 0        -1/y²  ]
     [ -1/y²    2z/y³  ]

where z = 500 - T₄ and y = 500 - T₅. The eigenvalues of this Hessian are given by

λ = z/y³ ± [ (z/y³)² + 1/y⁴ ]^(1/2)

It can be seen that the second eigenvalue (for the negative value of the square root)
will always be negative. Thus, the Hessian has mixed eigenvalues, indicating
that the second term in the objective is nonconvex.

In order to "convexify" this term, a quadratic term in one or more of the variables
can be added. Suppose that the term αT₄² is added. Then, the term becomes

F₂' = (T₅ - T₄)/(500 - T₅) + αT₄²

The Hessian of this term is given by

H₂' = [ 2α       -1/y²  ]
      [ -1/y²    2z/y³  ]

where again z = 500 - T₄ and y = 500 - T₅. The eigenvalues of this Hessian
are given by

λ = (α + z/y³) ± [ (α - z/y³)² + 1/y⁴ ]^(1/2)

For the second eigenvalue to be positive for all values of T₄ and T₅, the term
outside the square root must dominate the square root term. In other words,

(α + z/y³)² ≥ (α - z/y³)² + 1/y⁴

This leads to the inequality

α ≥ 1/(4zy)

Since 100 ≤ z, y ≤ 200, we obtain

α ≥ 1/40000

Thus, adding the term (1/40000)·T₄² to F₂ is sufficient to make this term convex. The net
result of this is that the objective function can now be written as

min  10000·[ (T₄ - 300)/(450 - T₄) + (T₅ - T₄)/(500 - T₅) + (400 - T₅)/150 + T₄²/40000 ] - T₄²/4

where the first term is convex, and the second term is concave. By the addition of an
extra variable and renaming all the variables, the problem now becomes

min  10000·[ (y₁ - 300)/(450 - y₁) + (y₂ - y₁)/(500 - y₂) + (400 - y₂)/150 + y₁²/40000 ] - 0.25·z₁y₁

s.t.   z₁ - y₁ = 0
       300 ≤ z₁, y₁, y₂ ≤ 400

Now the problem satisfies the conditions of the GOP algorithm, being a convex
problem in y for all fixed z and a linear problem in z for all fixed y.

Problem      Problem Size               GOP Algorithm
Name         Variables   Constraints    Iterations   CPU (sec)
Example 1    12          13             4            0.09
Example 2    12          13             3            0.06
Example 4    11          9              3            0.10
Example 5    11          9              8            0.20
Example 7    26          30             4            0.11

Table 1 Heat Exchanger Network Problems from Quesada and Grossmann (1993) with
variables eliminated as detailed in Section 2.2
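The choice α = 1/40000 can also be verified numerically. The check below, which is ours and not part of the original example, grids over the box 100 ≤ z, y ≤ 200 and confirms that the smaller eigenvalue of the perturbed Hessian never becomes negative:

#include <stdio.h>
#include <math.h>

/* Smaller eigenvalue of H' = [[2a, -1/y^2], [-1/y^2, 2z/y^3]],
   the Hessian of (T5 - T4)/(500 - T5) + a*T4^2 with z = 500 - T4
   and y = 500 - T5. */
static double lambda_min(double a, double z, double y)
{
    double c = z / (y * y * y);
    return (a + c) - sqrt((a - c) * (a - c) + 1.0 / (y * y * y * y));
}

int main(void)
{
    const double alpha = 1.0 / 40000.0;
    double worst = 1.0e30;
    for (double z = 100.0; z <= 200.0; z += 1.0)
        for (double y = 100.0; y <= 200.0; y += 1.0) {
            double l = lambda_min(alpha, z, y);
            if (l < worst) worst = l;
        }
    /* worst is ~0 (attained at z = y = 100, where alpha = 1/(4zy)
       exactly) and is never negative beyond rounding error */
    printf("smallest eigenvalue over the box: %g\n", worst);
    return 0;
}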

Similar reductions were obtained for all the example problems given in Quesada and
Grossmann (1993). The results of applying the GOP algorithm to these problems are
given in Table 1. Note that in all the cases, the problems reduced to either one or
two variable unconstrained problems. Consequently, the subproblems solved by the
algorithm are very small in size, as reflected in the CPU times taken to converge to the
optimum.

2.3 Pooling and Blending Problems

Pooling and blending problems are a feature of models for most chemical processes.
In particular, for problems relating to refinery and petrochemical processing, it is
often necessary to model not only the product flows but the properties of intermediate
streams as well. These streams are usually combined in a tank or pool, and the pool
is used in downstream processing or blending. The presence of these streams in the
model introduces nonlinearities, often in a nonconvex manner. The nonconvexities
arise from the interactions between the qualities of the input streams and the blended
products.

Traditionally, pooling problems have been solved using successive linear programming
(SLP) techniques. The first SLP algorithm (Method of Approximation Programming)
was proposed by Griffith and Stewart (1961). Subsequently, SLP algorithms have

[Figure 6 shows components A (3% S), B (1% S) and C (2% S) being pooled and
blended into products z (max. 2.5% S) and y (max. 1.5% S).]

Figure 6 The Haverly Pooling Problem

been proposed by Lasdon et al. (1979), Palacios-Gomez et al. (1982) and Baker and
Lasdon (1985) among others. These algorithms have been applied to pooling problems
by Haverly (1978) and Lasdon et al. (1979). SLP algorithms have the advantage
that they can utilize existing LP codes and can handle large scale systems easily.
However, to guarantee convergence to the global solution, they require convexity in
the objective function and the constraints. For this reason, these methods cannot be
relied upon to determine the best solution for all pooling problems.

Various formulations have been proposed for pooling and blending problems. In
the following sections, we consider the application of the GOP algorithm to three of
these formulations, namely, the Haverly Pooling problem, two pooling problems from
Ben-Tal and Gershovitz (1992), and a multiperiod tankage quality problem commonly
occurring in refineries.

The Haverly Pooling Problem

In his studies of the recursive behavior of linear programming (LP) models, Haverly
(1978) defined a pooling problem as shown in Figure 6. Three substances A, B and
C with different sulfur contents are to be combined to form two products z and y
with specified maximum sulfur contents. In the absence of a pooling restriction, the
problem can be formulated and solved as an LP. However, when the streams need to
be pooled (as, for example, when there is only one tank to store A and B), the LP must
be modified. Haverly has shown that without the explicit incorporation of the effect
of the economics associated with the sulfur constraints on the feed selection process,
a recursive algorithm for solving a simple formulation having only a pool balance
cannot find the global solution. Lasdon et al. (1979) added a pool quality constraint
to the formulation. This complete NLP formulation is shown below:

min  6A + 16B + 10(Cx + Cy) - 9z - 15y

s.t.
Px + Py - A - B = 0                        (pool balance)

z - Px - Cx = 0
y - Py - Cy = 0                            (component balances)

p·(Px + Py) - 3A - B = 0                   (pool quality)

p·Px + 2·Cx - 2.5z ≤ 0
p·Py + 2·Cy - 1.5y ≤ 0                     (product quality constraints)

z ≤ zᵁ
y ≤ yᵁ                                     (upper bounds on products)

where p is the sulfur quality of the pool; its lower and upper bounds are 1 and 3
respectively. This problem was solved by both Haverly (1979) and Lasdon et al.
(1979). In all cases, however, the global optimum could not always be determined,
the final solution being dependent on the starting point.

More recently, Floudas and Aggarwal (1990) solved the problem using the Global
Optimum Search (Floudas et al., 1989). They had to reformulate the problem by
adding variables and constraints, and although they were successful in finding
the global minimum from 28 out of 30 starting points, they could not mathematically
guarantee that the algorithm would converge to the global minimum.

The GOP Algorithm

By projecting on the pool quality p, the problem becomes linear in the remaining
variables. Hence, p is chosen as the "y" variable. From the constraint set, it can
be seen that only Px and Py are the connected variables. Hence, four relaxed dual
subproblems need to be solved at each iteration. Three cases of the pooling problem
have been solved using the GOP and GOP/MILP algorithms. The data for these
three cases, as well as the number of iterations required by the algorithms to
converge, are given in Table 2. It can be seen that in all cases, the algorithms require
fewer than 15 iterations to identify and converge to the global solution.
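To see concretely how fixing p linearizes the problem, the sketch below assembles the coefficient rows of the two product quality constraints for a given value of p; the variable ordering and the routine itself are our illustration, not the cGOP data layout:

/* For fixed pool quality p, the product quality constraints
     p*Px + 2*Cx - 2.5*z <= 0
     p*Py + 2*Cy - 1.5*y <= 0
   are linear; variable order here is [Px, Py, Cx, Cy, z, y]. */
void haverly_quality_rows(double p, double row1[6], double row2[6])
{
    double r1[6] = { p, 0.0, 2.0, 0.0, -2.5, 0.0 };
    double r2[6] = { 0.0, p, 0.0, 2.0, 0.0, -1.5 };
    for (int i = 0; i < 6; i++) {
        row1[i] = r1[i];
        row2[i] = r2[i];
    }
}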

Case   Bounds          Cost    Optimal          GOP            GOP/MILP
       zᵁ      yᵁ      of B    Solution   p*    Iter.   CPU    Iter.   CPU
I      100     200     $16     -$400      1.0   12      0.22   12      0.49
II     600     200     $16     -$600      3.0   12      0.21   12      0.45
III    100     200     $13     -$750      1.5   14      0.26   14      0.56

Table 2 Data and results for the Haverly Pooling Problem

Pooling Problems From Literature

We have also applied the GOP algorithm to two pooling problems taken from Ben-Tal
and Gershovitz (1992). The following notation is used for these problem models:

{1, 2, ..., i, ..., I}  -  set of components
{1, 2, ..., j, ..., J}  -  set of products
{1, 2, ..., k, ..., K}  -  set of qualities
{1, 2, ..., l, ..., L}  -  set of pools

The following variable sets are present in the model:

x_il   amount of component i allocated to pool l
y_lj   amount going from pool l to product j
z_ij   amount of component i going directly to product j
p_lk   level of quality k in pool l

The parameters in the problem are:

A_i    upper bounds for component availabilities
D_j    upper bounds for product demands
S_l    upper bounds for pool sizes
Q_jk   upper bounds for product qualities
q_ik   level of quality k in component i
c_i    unit price of component i
d_j    unit price of product j

Using this notation, these pooling problems have the following form:

max  -Σ_i Σ_l c_i·x_il + Σ_l Σ_j d_j·y_lj + Σ_i Σ_j (d_j - c_i)·z_ij

s.t.   Σ_l x_il + Σ_j z_ij ≤ A_i                                ∀ i

       -Σ_i x_il + Σ_j y_lj = 0                                 ∀ l

       Σ_i x_il ≤ S_l                                           ∀ l

       -Σ_i q_ik·x_il + p_lk·Σ_j y_lj = 0                       ∀ l, k

       Σ_l y_lj + Σ_i z_ij ≤ D_j                                ∀ j

       Σ_l (p_lk - Q_jk)·y_lj + Σ_i (q_ik - Q_jk)·z_ij ≤ 0      ∀ j, k

Problem No.   Problem Size (I, J, K, L)   GOP Algorithm
                                          Iterations   CPU (HP730)
1             4, 2, 1, 1                  7            0.95
2             5, 5, 2, 1                  41           5.80

Table 3 Pooling Problems From Ben-Tal and Gershovitz (1992).

The data for these problems can be found in Ben-Tal and Gershovitz (1992). The
results of the application of the GOP algorithm to these problems are given in Table 3.

Multiperiod Tankage Quality Problem

This example concerns a multiperiod tankage quality problem that arises often in the
operations of refineries. The models for these problems are similar to the pooling
problem of the previous section.

In order to develop the mathematical formulation, the following sets are defined:

PR = {p}   set of products
CO = {c}   set of components
T  = {t}   set of time periods
QL = {l}   set of qualities

For this problem, there are 3 products (p₁, p₂, p₃), 2 components (c₁, c₂), and 3 time
periods (t₀, t₁, t₂). The following variables are defined:

z_{c,p,t}   amount of component c allocated to product p at period t
s_{p,t}     stock of product p at end of period t
q_{p,l,t}   quality l of product p at period t

The objective of the problem is to maximize the total value at the end of the last time
period. The terminal value of each product (v_p) is given. Also provided are lower and
upper bounds on the qualities of the products, the stocks at the start of each time
period (s_{p,t}), the qualities in each component (QU_{c,l}), and the product liftings
(LF_{p,t}) for every period. The data for this problem is provided in Table 4.

The complete mathematical formulation for this problem, consisting of 39 variables
and 22 inequality constraints (of which 12 are nonconvex), is given below:

max  Σ_{p∈PR} v_p·s_{p,t₂}

s.t.
Σ_{p∈PR} z_{c,p,t} ≤ AR_{c,t},     t ∈ {t₁, t₂}, c ∈ CO

s_{p,t} + Σ_{c∈CO} z_{c,p,t+1} - s_{p,t+1} ≥ LF_{p,t+1},     t ∈ {t₀, t₁}, p ∈ PR

s_{p,t}·q_{p,l,t} + Σ_{c∈CO} z_{c,p,t+1}·QU_{c,l} ≥
       (s_{p,t+1} + LF_{p,t+1})·q_{p,l,t+1},     t ∈ {t₀, t₁}, p ∈ PR, l ∈ QL

The sources of nonconvexities in this problem are the bilinear terms s_{p,t}·q_{p,l,t} in the
last set of constraints. Thus, fixing either the set of s or q variables makes the problem
linear in the remaining variables.

The GOP Algorithm: To apply the GOP algorithm to this problem, we can project
on the qualities (q₁, q₂). Then, the stocks are the connected variables. Since there are
six of them (corresponding to three products at two time periods), 2⁶ = 64 relaxed dual
problems need to be solved at every iteration. The results of solving this
problem using the branch-and-bound GOP and GOP/MILP algorithms are shown in
Table 5.

Component Arrivals and Qualities

Component    Arrivals (t₀, t₁, t₂)      Qualities (q₁, q₂)
c₁           0.20   0.25   0.15         40    80
c₂           0.20   0.15   0.25         100   50

Product Lifting and Limits on Stocks

Product    Lifting (t₁, t₂)     Stock Limits (t₀, t₁, t₂)
p₁         0.08   0.12          0.05   0.10   0.10
p₂         0.15   0.10          0.05   0.10   0.10
p₃         0.15   0.20          0.05   0.10   0.10

Bounds and Initial Values for Product Qualities

Product    Lower Bounds (q₁, q₂)    Upper Bounds (q₁, q₂)    Initial Values (q₁, q₂)
p₁         70    50                 100   100                70    50
p₂         80    70                 100   100                90    70
p₃         60    40                 100   100                60    40

Terminal value of products: v_p = (60, 90, 40).

Table 4 Data for the Multiperiod Tankage Quality Problem



Starting Point (y)           Original GOP                     GOP/MILP
                             Iter.   Subproblems   CPU        Iter.   CPU
Lower bound                  8       18            3.66       7       14.7
Upper bound                  9       19            3.68       9       13.1
q_{t₁} = 100, q_{t₂} = 70    11      18            3.95       13      22.4
q_{t₁} = 80, q_{t₂} = 100    9       19            3.23       13      16.5

Table 5 Multiperiod Tankage Quality Problem

2.4 Problems in Separation Sequences

As in the case of heat exchanger networks, problems involving separations (sharp and
nonsharp) can often be posed as a superstructure from which the best alternative is to
be selected. The following example considers one such formulation.

Example 2.3 This problem involves the separation of a three component mixture into
two multicomponent products using separators, splitters, blenders and pools. The
superstructure for the problem (Floudas and Aggarwal, 1990) is given in Figure 7.
The NLP formulation for the problem is given below:

min  0.9979 + 0.00432F₅ + 0.01517F₁₃

subject to

(Overall Mass Balances)
F₁ + F₂ + F₃ + F₄ = 300
F₆ - F₇ - F₈ = 0
F₉ - F₁₀ - F₁₁ - F₁₂ = 0
F₁₄ - F₁₅ - F₁₆ - F₁₇ = 0
F₁₈ - F₁₉ - F₂₀ = 0

(Splitter Component Balances)
F₅·z_{j,5} - F₆·z_{j,6} - F₉·z_{j,9} = 0,     j = A, B, C
F₁₃·z_{j,13} - F₁₄·z_{j,14} - F₁₈·z_{j,18} = 0,     j = A, B, C

(Inlet Mixer Balances)
0.333F₁ + F₁₅·z_{j,14} - F₅·z_{j,5} = 0,     j = A, B, C
0.333F₂ + F₁₀·z_{j,9} - F₁₃·z_{j,13} = 0,     j = A, B, C
0.333F₃ + F₇·z_{A,6} + F₁₁·z_{A,9} + F₁₆·z_{A,14} + F₁₉·z_{A,18} = 30
0.333F₃ + F₇·z_{B,6} + F₁₁·z_{B,9} + F₁₆·z_{B,14} + F₁₉·z_{B,18} = 50
0.333F₃ + F₇·z_{C,6} + F₁₁·z_{C,9} + F₁₆·z_{C,14} + F₁₉·z_{C,18} = 30

(Compositions)
z_{A,i} + z_{B,i} + z_{C,i} = 1,     i = 5, 6, 9, 13, 14, 18

(Sharp Split)
z_{B,6} = z_{C,6} = z_{A,9} = z_{C,14} = z_{A,18} = z_{B,18} = 0
By projecting on the compositions z_{A,·}, z_{B,·} and z_{C,·}, the primal and relaxed dual
subproblems become linear. There are a total of 38 variables and 32 equality constraints.
There are initially 20 connected variables (the flow rates). However, considering
Figure 7, it is obvious that the recycle streams cannot both be simultaneously active.
This leads to solving two independent problems, with F₁₀ = 0 in the first case and
F₁₅ = 0 in the second case. In each case, the resulting problem has 9 connected
variables. Application of the GOP algorithm to the problem identifies the optimal
solution (shown in Figure 8) in 17 iterations using the parallel configuration as a
starting point. The total CPU time taken was 3.84 seconds on an HP730.

2.5 Phase Equilibrium Problems

Phase and chemical equilibrium problems are of crucial importance in several process
separation applications. For conditions of constant pressure and temperature, a global
minimum of the Gibbs free energy function describes the equilibrium state. Moreover,
the Gibbs tangent plane criterion can be used to test the intrinsic thermodynamic
stability of solutions obtained via the minimization of the Gibbs free energy. Simply
stated, this criterion seeks the minimum of the distance between the Gibbs free energy
function at a given point and the tangent plane constructed from any other point in
the mole fraction space. If the minimum is positive, then the equilibrium solution is
stable.

The tangent plane criterion for phase stability of an n-component mixture can be
formulated as the following optimization problem (McDonald and Floudas, 1995):

min_y  F(y) = Σ_{i∈C} y_i·(μ_i(y) - μ_i⁰(z))
Figure 7 Superstructure for Example 2.3

Figure 8 Optimal Configuration For Example 2.3



s.t.   Σ_{i∈C} y_i = 1
       0 ≤ y_i ≤ 1

where y is the mole fraction vector for the various components, μ_i(y) is the chemical
potential of component i, and μ_i⁰(z) represents the tangent constructed to the Gibbs
free energy surface at mole fraction z. The use of the NRTL equation for the chemical
potential reduces the problem to the following formulation:

min  F(y) = C(y) + Σ_{i∈C} y_i · Σ_{j∈C} g_ij·τ_ij·x_j

s.t.   x_i · Σ_{j∈C} g_ji·y_j = y_i     ∀ i ∈ C
       Σ_{i∈C} y_i = 1
       0 ≤ y_i ≤ 1     ∀ i ∈ C

where τ_ij are non-symmetric binary interaction parameters, g_ij are parameters
introduced for convenience, and the function C(y) is a convex function. By projecting
on y_i, it can be seen that this problem satisfies Conditions (A).
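Note that the first constraint set determines x explicitly from y, which is what makes the projection convenient; a small illustrative routine (ours, not taken from GLOPEQ or cGOP) is:

#define NMAX 10   /* illustrative maximum number of components */

/* Solve x_i * (sum_j g_ji * y_j) = y_i for x:
   x_i = y_i / sum_j g[j][i] * y[j]. The denominator is positive
   for NRTL-type parameters g_ji > 0 and y on the unit simplex. */
void x_from_y(int n, double g[NMAX][NMAX], const double *y, double *x)
{
    for (int i = 0; i < n; i++) {
        double denom = 0.0;
        for (int j = 0; j < n; j++)
            denom += g[j][i] * y[j];
        x[i] = y[i] / denom;
    }
}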

The GOP algorithm was applied to solve several problems in this class. These
problems are taken from McDonald and Floudas (1995) and have been solved by them
using the GLOPEQ package (McDonald and Floudas, 1994). The results are shown
in Table 6. It can be seen that for most of the problems, the GOP algorithm performs
very well when compared to the specialized code in GLOPEQ, which is a package
specifically designed for phase equilibrium problems.

Problem    Problem Size       GOP                    GLOPEQ
Name       Nx   Ny   Nc       Iterations   CPU       Iterations   CPU
BAW2L      2    2    3        27           0.68      32           0.15
BAW2G      2    2    3        30           0.75      36           0.16
TWA3T      6    3    4        13           0.86      16           0.22
TWA3G      6    3    4        121          9.00      85           0.96
PBW3T1     6    3    4        82           6.33      53           0.63
PBW3G1     6    3    4        393          35.21     213          2.37
PBW3T6     6    3    4        1366         134.99    549          4.98
PBW3G6     6    3    4        1886         207.19    757          7.09

Table 6 Results for the Phase Stability Problem

2.6 An Example In Robust Stability Analysis

The following example was first studied by de Gaston and Safonov (1988). It
concerns the exact computation of the stability margin for a system with real parameter
uncertainty. This problem (shown in Figure 9) involves a single-input single-output
feedback system with a lead-lag element controller. The model for the problem is
given below:

min  k_m = x₆

[Figure 9 shows a unity feedback loop: controller (λ+2)/(λ+10) in series with the
plant q₁/(λ(λ+q₂)(λ+q₃)).]

Figure 9 Feedback Structure For Robust Stability Analysis Example



s.t.   (x₂ + x₃ + 10)·y₁ - 10x₄ - x₁ = 0
       x₄ - y₂x₃ = 0
       x₅ - y₁ = 0
       x₂ - y₂ = 0
       800 - 800x₆ ≤ x₁ ≤ 800 + 800x₆
       4 - 2x₆ ≤ x₂ ≤ 4 + 2x₆
       6 - 3x₆ ≤ x₃ ≤ 6 + 3x₆

Details of the development of the model can be found in Psarris and Floudas (1994).
The optimal solution for this problem is k_m = 0.3417. Application of the GOP
algorithm to this problem converges to the optimal solution in 45 iterations, requiring
1.5 seconds on an HP730.

2.7 Concave and Indefinite Quadratic Problems

The conditions under which the GOP algorithm can be applied make it highly attractive
for problems with quadratic functions in the objective and/or constraints. Of particular
interest are quadratic problems with linear constraints, which occur as subproblems in
successive quadratic programming (SQP) and other optimization techniques, as well
as being interesting global optimization problems in their own right. In this section,
the results of applying the GOP and GOP/MILP algorithms to various problems of
this type are discussed.

2.8 Problems from the literature

Eleven small-size concave quadratic problems from Phillips and Rosen (1988) have
been solved using the GOP algorithm. The problems have the following form:
Problem   Problem Size      GOP Algorithm                P&R
          m    n    k       Iterations   CPU (HP730)     CPU (CRAY2)
1         5    2    0       3            0.09            0.026
2         5    6    0       2            0.07            0.022
3         5    6    0       2            0.06            0.020
4         5    6    0       2            0.03            0.026
5         4    2    0       4            0.12            0.017
6         4    3    0       4            0.11            0.015
7         4    3    0       4            0.14            0.014
8         10   3    0       17           0.50            0.022
9         10   3    0       8            0.20            0.020
10        4    4    0       3            0.18            0.029
11        9    2    1       3            0.08            0.023

Table 7 Test Problems from Phillips and Rosen (1988) (ε = 0.001)

min_{z,y∈Ω}  ψ(z, y) = θ₁·φ(z) + θ₂·dᵀy

s.t.   φ(z) = 0.5 · Σ_{i=1}^{n} λ̄_i·(z_i - ω̄_i)²                    (4.1)
       Ω = {(z, y) : A₁z + A₂y ≤ b, z ≥ 0, y ≥ 0}

z, λ̄, ω̄ ∈ ℝⁿ,   y, d ∈ ℝᵏ
A₁ ∈ ℝ^{m×n},   A₂ ∈ ℝ^{m×k}
θ₁, θ₂ ∈ ℝ

Here, m is the number of linear constraints, n is the number of concave variables (z),
and k is the number of linear variables (y). The parameters θ₁ and θ₂ are -1 and 1
respectively, and the relative tolerance for convergence between the upper and lower
bounds (ε) is 0.001.
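For reference, a direct evaluation of the objective of (4.1) can be sketched as follows (the argument layout is our assumption):

/* psi(z, y) = theta1 * 0.5 * sum_i lambda[i]*(z[i] - w[i])^2
             + theta2 * sum_j d[j]*y[j],
   the objective of the test problems (4.1). */
double psi(int n, int k, const double *lambda, const double *w,
           const double *d, double theta1, double theta2,
           const double *z, const double *y)
{
    double phi = 0.0, lin = 0.0;
    for (int i = 0; i < n; i++) {
        double t = z[i] - w[i];
        phi += 0.5 * lambda[i] * t * t;
    }
    for (int j = 0; j < k; j++)
        lin += d[j] * y[j];
    return theta1 * phi + theta2 * lin;
}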

The results of the application of the algorithm to these problems are given in Table 7.
The CPU times for the GOP algorithm and the Phillips and Rosen algorithm (denoted
by P&R) are given in seconds. It should be noted that the P&R algorithm was run on
a CRAY2. As can be seen, the algorithm solves problems of this size very fast, taking
about 5 iterations to identify and converge to the optimal solution.

Problem   Problem Size       GOP                    Sherali & Tuncbilek
Name      Nz   Ny   Nc       Iterations   CPU       Iterations   CPU
CQP1      10   10   11       27           0.68      32           0.15
CQP3      20   20   10       11           10.84     3            3.29
CQP4      20   20   10       4            3.57      1            2.61
CQP5      20   20   10       11           10.91     1            2.55
CQP6      20   20   10       5            5.07      1            2.61
CQP7      20   20   10       229          177.04    11           15.94
IQP1      20   20   10       3            0.65      3            2.73

Table 8 Quadratic Problems from Sherali and Tuncbilek (1994).

Results from the application of the GOP algorithm to another set of concave and
indefinite quadratic test problems taken from Floudas and Pardalos (1990) are given in
Table 8. These problems have also been solved recently by Sherali and Tuncbilek
(1994), whose results are listed in the same table. Here, Nz, Ny and Nc refer to the
number of z and y variables and the number of linear constraints, respectively.

Run    Problem size       Iterations   CPU (sec)
       m     n     k                   GOP      GOP/MILP
CLR1   50    50    50     2.3          0.510    0.317
CLR2   50    50    100    3.0          5.736    2.254
CLR3   50    50    200    4.33         27.620   8.293
CLR4   50    50    300    5.0          ----     8.977
CLR5   50    100   50     3.5          32.07    5.665
CLR6   50    100   150    6.8          ----     38.892
CLR7   100   100   100    2.2          3.485    31.147
CLR8   100   200   100    3.8          ----     100.370
CLR9   100   250   100    3.6          ----     267.124

Table 9 Concave Quadratic Problems from Phillips and Rosen (1988), ε = 0.01

Randomly Generated Quadratic Problems

This section describes the application of the GOP and GOP/MILP algorithms to
randomly generated problems of the form (4.1). Such problems have earlier been
Run     Problem size       Iterations   CPU (sec)
        m     n     k                   GOP     GOP/MILP
CLR1    50    50    50     2.0          0.120   0.116
CLR2    50    50    100    2.0          0.145   0.141
CLR3    50    50    200    2.2          6.047   1.574
CLR4    50    50    500    3.0          ----    14.125
CLR5    50    100   100    2.0          0.217   1.373
CLR6    50    100   200    2.0          0.360   11.982
CLR7    100   100   100    2.0          0.305   0.306
CLR8    100   100   200    2.0          0.374   0.369
CLR9    100   100   200    2.0          0.374   0.369
CLR10   100   100   500    3.0          ----    80.028
CLR11   100   150   400    1.7          ----    182.208

Table 10 Concave Quadratic Problems from Phillips and Rosen (1988), ε = 0.1

Run     Problem size       ε = 0.1             ε = 0.01
        m     n     k      Iter    CPU         Iter     CPU
ILR1    25    25    25     2.0     0.232       2.200    0.312
ILR2    25    25    50     2.0     0.416       2.600    0.606
ILR3    25    25    100    2.2     1.522       3.000    2.030
ILR4    25    50    100    4.0     13.19       11.50    37.56
ILR5    50    50    50     2.0     0.864       2.400    1.504
ILR6    50    50    100    2.0     1.264       2.800    3.018
ILR7    25    75    100    3.0     68.86       30.00    294.3
ILR8    50    75    100    2.0     1.564       3.600    9.724
ILR9    75    75    100    2.0     2.120       2.800    6.304
ILR10   25    75    150    4.0     115.80      ----     ----
ILR11   50    75    150    2.2     9.5380      ----     ----
ILR12   75    75    150    2.0     2.9560      ----     ----
ILR13   25    100   50     3.6     23.21       23.50    118.6
ILR14   50    100   50     2.2     2.130       3.800    6.510
ILR15   75    100   50     2.2     3.544       2.800    5.244

Table 11 Indefinite Quadratic Problems from Phillips and Rosen (1988), ε = 0.1 and
0.01

studied by Phillips and Rosen (1988), and we generated the data for the constants
λ̄, ω̄, d, A₁, A₂ and b as they have used. The parameters θ₁ and θ₂ have been set to
values of -0.001 and 0.1 respectively. Depending on the values of λ̄_i, the problems
generated are either concave quadratic or indefinite quadratic problems. For the
case of indefinite quadratic problems, roughly as many positive λ̄_i as negative λ̄_i are
generated. For each problem size, 5-10 different problems (using various seeds) have
been generated and solved.

Tables 9 and 10 present the results for concave quadratic problems using tolerances of
0.01 and 0.1 respectively, while Table 11 presents the results for indefinite quadratic
problems using tolerances of 0.01 and 0.1 with the GOP algorithm. In all the cases, it
can be seen that the algorithm generally requires very few iterations for the upper and
lower bounds to be within 10% of the optimal solution; generally, convergence to
within 1% is achieved in a few more iterations. Moreover, certain trends are noticeable
in all cases. For example, as the number of constraints (m) grows, the problems
generally become easier to solve. Conversely, as the number of linear variables (k)
increases, the algorithm requires more time for the solution of the dual problems,
leading to larger overall CPU times. In general, these results indicate that the GOP
and GOP/MILP algorithms can be very effective in solving medium sized quadratic
problems with several hundred variables and constraints.

It should be noted that several sizes of these problems have also been solved on
a supercomputer using a specially parallelized version of the GOP algorithm. The
results can be found in Androulakis et al. (1995).

3 CONCLUSIONS

Visweswaran and Floudas (1995) proposed new formulations and branching strategies
for the GOP algorithm for solving nonconvex optimization problems. In this paper, a
complete implementation of the various versions of the algorithm has been discussed.
The new formulation as a branch and bound algorithm permits a simplified implemen-
tation. The resulting package cGOP has been applied to a large number of engineering
design and control problems as well as quadratic problems. It can be seen from the
results that the implementation permits very efficient solution of problems of medium
size.

Acknowledgments
Financial support from the National Science Foundation under grant CTS-9221411 is
gratefully acknowledged.

REFERENCES
[1] I. P. Androulakis, V. Visweswaran, and C. A. Floudas. Distributed
Decomposition-Based Approaches in Global Optimization. In Proceedings
of State of the Art in Global Optimization: Computational Methods and Appli-
cations (&is. C.A. Floudas and P.M. Pardalos), Kluwer Academic Series on
Nonconvex Optimization and Its Applications, 1995. To Appear.
[2] T.E. Baker and L.S. Lasdon. Successive linear programming at Exxon. Mgmt.
Sci., 31(3):264, 1985.
[3] A. Ben-Tal and V. Gershovitz. Computational Methods for the Solution of
the Pooling/Blending Problem. Technical report, Technion-Israel Institute of
Technology, Haifa, Israel, 1992.
[4] R. R. E. de Gaston and M. G. Safonov. Exact calculation of the multiloop
stability margin. IEEE Transactions on Automatic Control, 33(2):156, 1988.
[5] C. A. Floudas and A. Aggarwal. A decomposition strategy for global optimum
search in the pooling problem. ORSA, Journal on Computing, 2(3):225, 1990.
[6] C. A. Floudas, A. Aggarwal, and A. R. Ciric. Global optimum search for
nonconvex NLP and MINLP problems. C&ChE, 13(10): 1117, 1989.
[7] C. A. Floudas and A. R. Ciric. Strategies for overcoming uncertainties in heat
exchanger network synthesis. Compo & Chem. Eng., 13(10): 1133, 1989.
[8] C. A. Floudas and P. M. Pardalos. A Collection of Test ProblemsforConstrained
Global Optimization Algorithms, volume 455 of Lecture Notes in Computer
Science. Springer-Verlag, Berlin, Germany, 1990.
[9] C. A. Floudas and V. Visweswaran. A global optimization algorithm (GOP) for
certain classes of nonconvex NLPs: I. theory. C&ChE, 14:1397,1990.
[10] C. A. Floudas and V. Visweswaran. A primal-relaxed dual global optimization
approach. J. Optim. Theory and Appl., 78(2):187,1993.
[11] R. E. Griffith and R. A. Stewart. A nonlinear programming technique for the
optimization of continuous processesing systems. Manag. Sci., 7:379, 1961.

[12] C. A. Haverly. Studies of the Behaviour of Recursion for the Pooling Problem.
ACM SIGMAP Bulletin, 25:19, 1978.
[13] C. A. Haverly. Behaviour of Recursion Model - More Studies. ACM SIGMAP
Bulletin, 26:22, 1979.

[14] L.S. Lasdon, AD. Waren, S. Sarkar, and F. Palacios-Gomez. Solving the
Pooling Problem Using Generalized Reduced Gradient and Successive Linear
Programming Algorithms. ACM SIGMAP Bulletin, 27:9, 1979.

[15] W. B. Liu and C. A. Floudas. A Remark on the GOP Algorithm for Global
Optimization. J. Global Optim., 3:519, 1993.

[16] C. D. Maranas and C. A. Floudas. A Global Optimization Approach for
Lennard-Jones Microclusters. J. Chem. Phys., 97(10):7667, 1992.

[17] C.M. McDonald and C.A Floudas. A user guide to GLOPEQ. Computer Aided
Systems Laboratory, Chemical Engineering Department, Princeton University,
NJ,1994.

[18] C. M. McDonald and C. A. Floudas. Global Optimization for the Phase Stability
Problem. AIChE Journal, 41:1798, 1995.

[19] F. Palacios-Gomez, L. S. Lasdon, and M. Engquist. Nonlinear Optimization by
Successive Linear Programming. Mgmt. Sci., 28(10):1106, 1982.

[20] A. T. Phillips and J. B. Rosen. A parallel algorithm for constrained concave
quadratic global minimization. Mathematical Programming, 42:421, 1988.
[21] P. Psarris and C. A. Floudas. Robust Stability Analysis of Linear and
Nonlinear Systems with Real Parameter Uncertainty. Journal of Robust and
Nonlinear Control, 1994. Accepted for publication.
[22] I. Quesada and I. E. Grossmann. Global Optimization Algorithm for Heat
Exchanger Networks. I&EC Res., 32:487, 1993.
[23] H. Sherali and C. H. Tuncbilek. Tight Reformulation-Linearization Technique
Representations for Solving Nonconvex Quadratic Programming Problems.
Submitted for Publication, 1994.
[24] V. Visweswaran and C. A Floudas. New Formulations and Branching Strategies
for the GOP Algorithm. In Global Optimization in Engineering Design, (Ed.)
I. E. Grossmann, Kluwer Book Series in Nonconvex Optimization and Its
Applications, Chapter 3, 1995a.

[25] V. Visweswaran and C. A Floudas. cGOP: A User's Guide. Princeton University,


Princeton, New Jersey, 1995b.

Appendix A: Implementation of the GOP and GOP/MILP Algorithms

This section describes the key features of the implementation of the GOP and
GOP/MILP algorithms. In particular, the interaction of the various subroutines and the
storage and transfer of relevant data between these routines are crucial to the efficiency
of the algorithm, and are therefore discussed in some detail. The implementation has
been written so as to be a useful framework for the development of generic branch
and bound algorithms for global optimization.

Overview of the cGOP package

The cGOP package is written entirely in the C programming language, and consists
of approximately 8000 lines of source code, of which around 30% are comments.
The algorithms can be called either in standalone mode or as subroutines from within
another program. The primal and relaxed dual subproblems are solved using either
CPLEX 2.1 (for linear or mixed integer linear problems) or MINOS 5.4 (for nonlinear
problems). Various options are available to change the routines that are used, such as
obtaining tighter bounds on the x variables and on g_i^k(y) (the gradients of the Lagrange
function), as well as solving the full problem as a local optimization problem at each
node.

Data Structures

Since the cGOP package is written in C, it is highly convenient to aggregate the data
transferred from one routine to another using structures (equivalent to COMMON blocks
in Fortran). The primary data structures used in the package describe the problem
data, the solutions of the various primal problems, the data for the various Lagrange
functions, and the solutions of the relaxed dual subproblems at each iteration.

The most important group of data is obviously the problem data itself. In order to
facilitate easy and general use of this data, the implementation was written assuming
that the following types of problems would be solved:

min    c₀ᵀx + d₀ᵀy + xᵀQ₀y + F₀(x) + G₀(y)

s.t.   l_j ≤ c_jᵀx + d_jᵀy + xᵀQ_j·y ≤ u_j,     j = 1, ..., M₁          (4.2)
       F_j(x) + G_j(y) ≤ u_j,                    j = M₁ + 1, ..., M₂
       L ≤ (x, y) ≤ U

where j = 1, ..., M₁ are the set of bilinear constraints, and j = M₁ + 1, ..., M₂ are
the set of general nonlinear constraints. It is assumed that the functions F_j(x) and
G_j(y) are convex in x and y respectively. Under this assumption, it can easily be
shown that (4.2) satisfies Conditions (A). Note also that while the bilinear constraints
can be equalities or inequalities, the other nonlinear terms in the constraints are
assumed to lie in convex inequalities.

Given the formulation (4.2), the data for the problem can be separated into one part
containing the linear and bilinear terms, and another part containing the nonlinear
terms F_j(x) and G_j(y). The first part can be specified through a data file or as
arguments during the subroutine call that runs the algorithm. The nonlinear terms,
which in general cannot be specified using data files, can be given through user-defined
subroutines that compute the contribution to the objective function and constraints
from these terms, as well as their contribution to the Hessian of the objective function
and the Jacobian of the constraints. The problem data is therefore carried in one
data structure (called pdat from here on, and shown in Figure 10) that describes the
following items:

Control Data This refers to the type of the problem (bilinear, quadratic, nonlinear,
etc.), the number of x and y variables, the number of constraints, the type and value of
the starting point for the y variables, as well as tolerances for convergence.

Bilinear Data For reasons of convenience, the linear and bilinear terms in the
objective function and constraints are treated together. The data is stored in
sparse form, with only the nonzero terms being stored. For each term, the value
of the term as well as the indices of its x and/or y variables are stored (a sketch
of one possible layout is given after this list).

Bounds The global bounds on the variables (which can be changed before the start of
the algorithm, but thereafter remain constant) are stored in arrays.

Nonlinear Data The pointers to the functions that compute the nonlinear terms and
their gradients are stored in the data structure.

Iteration Data Various counters and loop variables that control and aid in the progress
of the iterations are stored in the main data structure. In addition, the best solution
obtained by the algorithm so far is also stored.
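
The pdat listing in Figure 10 refers to a TERMS structure that is not reproduced in
this chapter; the following is a minimal sketch of one plausible sparse layout that is
consistent with the description above. The field names are our illustration, not the
actual cGOP definitions (for those, see the cGOP User's Guide [25]):

/* Hypothetical sparse table of linear and bilinear terms: one
   coefficient per nonzero term, plus the indices of the x and/or y
   variables involved (-1 when a variable does not appear). */
typedef struct terms {
    int     nterms;   /* number of nonzero terms stored */
    double *coef;     /* coefficient of each term       */
    int    *xind;     /* index of the x variable, or -1 */
    int    *yind;     /* index of the y variable, or -1 */
} TERMS;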

It is important to note that almost all of the main data structure, once it has been
read in from the data file or passed to the main subroutine in the algorithm, remains
constant throughout the progress of the algorithm. The only exceptions are the iteration
variables and the best solution obtained by the algorithm so far.

The solution of the primal problem is stored together as another data structure, psol
(shown in Figure 11). This contains the value of $y^K$ for which the primal problem was
solved, the solution for the x variables, the marginals for all the constraints and variables
at their bounds, as well as an indicator of whether the primal was feasible or not.

Because of the form (4.2), the Lagrange function (for iterations with feasible primal
problems) can be written (after linearization of the terms with respect to $x$ and
substitution of the KKT optimality conditions for the primal problem) as

$$L(x, y, \lambda^K)\big|_{x^K}^{lin} = L_c + L_l^T y + \sum_{i=1}^{NI^K} x_i\, g_i^T (y - y^K) + G'(y)$$

where $NI^K$ is the number of connected $x$ variables at iteration $K$, and $G'(y)$
represents all the nonlinear terms weighted by the marginals, and can be written as

$$G'(y) = G_0(y) + \sum_{j=M_1+1}^{M_2} \lambda_j^K\, G_j(y)$$

By introducing new variables to represent the nonlinear constraints, the Lagrange
function can be rewritten as

$$L(x, y, \lambda^K)\big|_{x^K}^{lin} = L_c + L_l^T y + \sum_{i=1}^{NI^K} x_i\, g_i^T (y - y^K) + \sum_{j=M_1+1}^{M_2} \lambda_j^K z_j \qquad (4.3)$$

$$z_j \ge G_0(y) + G_j(y) \qquad (4.4)$$

Note that a simplistic implementation of the algorithm for the general nonlinear
problem in (4.2) leads to a problem with nonlinear terms in each Lagrange function,
making it much more computationally intensive. Given the fact that the nonlinear
terms are the same in each Lagrange function except for a factor due to the marginals
$\lambda_j^K$, it is far more efficient to group the terms together, and therefore to compute
their gradients only once. Moreover, the regrouping of the terms means that as far as

struct pdat {
    /* Control section */
    char *probname;        /* Name of original problem */
    char objtype;          /* Type of objective function */
    char contype;          /* Type of constraints */
    char primaltype;       /* Type of primal problems */
    char rdualtype;        /* Type of relaxed dual problems */
    int nxvar;             /* Number of x variables */
    int nyvar;             /* Number of y variables */
    int ncon;              /* Number of constraints */
    int nzcnt;             /* Total number of non-zeros */

    /* Data */
    char *ctype;           /* Type of X and Y variables */
    int *sense;            /* Sense of row: <=, ==, >= */
    double *rhs;           /* Right hand sides of the rows */
    int *count;            /* Number of entries in each row */
    int *begin;            /* Start of entries for each row */
    TERMS terms;           /* Bilinear terms in problem */
    double *xlbd, *xubd;   /* Bounds on X variables */
    double *ylbd, *yubd;   /* Bounds on Y variables */
    double objconst;       /* Constants in the objective */
    double epsa;           /* Absolute tolerance specified */
    double epsr;           /* Relative tolerance specified */
    int maxiter;           /* Maximum number of iterations */

    /* Various functions (pointers to user-defined routines) */
    void (*userobj)();     /* Nonlinear terms in objective */
    void (*usercon)();     /* Nonlinear terms in constraints */

    /* Solution */
    int niter;             /* Number of iterations so far */
    double primalubd;      /* Current upper bound from primals */
    double rdlbd;          /* Current lower bound from duals */
    double *x;             /* Starting point, solution for X */
    double *y;             /* Starting point, solution for Y */
    double abserror;       /* Absolute error between bounds */
    double relerror;       /* Relative error between bounds */
};

Figure 10 Main data structure for the GOP and GOP/MILP algorithms

struct psol {
int modstat; /* Feasible or infeasible */
int nxvar; /* Number of x variables */
int nyvar; /* Number of y variables */
int ncon; /* Number of constraints */
double *yval; /* Fixed values for Y variables */
double objval; /* Objective value for primal */
double *varval; /* Solution for X variables */
double *cmargval; /* Marginals for constraints */
double *bmargval; /* Marginals for bounds */
char *varstat; /* Status for each variable */
char *solver; /* Which solver was used */
};

Figure 11 Solution of the Primal Problem

/* Structure to hold the data for the Lagrange function */


struct lagdata {
    int NIc;               /* Number of connected X */
    int nyvar;             /* Number of Y variables */
    double *xlbd;          /* Lower bounds for connected X */
    double *xubd;          /* Upper bounds for connected X */
    int *xindex;           /* Indices of connected X */
    double *ylbd, *yubd;   /* Bounds on Y variables */
    double *glbd, *gubd;   /* Bounds on qualifying constraints */
    double **glin;         /* Terms in qualifying constraints */
    double *gconst;        /* Constants in qualifying const. */
    double *llin;          /* Terms in Lagrange function */
    double lconst;         /* Constants in Lagrange function */
};

Figure 12 Lagrange function data structure



each individual Lagrange function is concerned, only the data regarding (4.3) need to
be stored, i.e. the coefficients of the linear terms $L_l$, the bilinear terms $g_i$, and the
multipliers $\lambda_j^K$. Its structure is shown in Figure 12.

The solutions of the relaxed dual subproblems comprise the last major data structure.
Apart from the actual objective value for the solution and the values of the y variables,
this data includes information about which iteration and parent node generated each
child node in the branch and bound tree. Thus, the entire information about the tree is
stored in the array of relaxed dual solution structures, rdsol.

Based upon these various data units, the overall scheme of the implementation is now
presented. A pictorial view of the algorithm is given in Figure 13.

Initialization of parameters

At the start of the algorithm, the list of relaxed dual solutions rdsol is initialized to
contain the starting point for the y variables, indicating the root node for the whole
branch and bound tree. An initial local optimization problem can be solved to find a
good upper bound and starting point for the y variables, if desired. Various counters
and bookkeeping variables are initialized before the start of the iterations.

Selection Of Previous Lagrange Functions and Current Region

At any given iteration, the relaxed dual subproblems will contain a Lagrange function
from the current iteration, and one from each of the parent nodes of the current node
in the branch and bound tree. In order to select these functions, a backward search
is done through the list of solutions to the relaxed dual problems starting from the
current node (i.e. the node that has been chosen at the end of the previous iteration).
The following steps are repeated:

Step 0. Initialize lagsel[MAXITER], the array of parent nodes for the current node.

Step 1. Add the current node C to lagsel. Set lagsel[1] = C, and set the number of
Lagrange functions numlag = 1.

Step 2. Find the iteration P that generated the current node.

Step 3. Go to the node corresponding to iteration P (say node D) and add this node
to the list, i.e. set numlag = numlag + 1, lagsel[numlag] = D.

Step 4. Repeat Steps 2 and 3 until the root node has been reached.
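
In code, this backward search amounts to a short walk up the stored tree. The
following is a minimal sketch, assuming each relaxed dual solution records the index
of the node (iteration) that generated it; the type, field and function names here are
illustrative, not the actual cGOP identifiers:

/* Sketch: collect the current node and all of its ancestors in the
   branch-and-bound tree, as in Steps 0-4 above. */
typedef struct rdnode_info {
    int parent;             /* node that generated this one; -1 at root */
} RDNODE_INFO;

int select_lagrange_nodes(const RDNODE_INFO *rdsol, int current, int *lagsel)
{
    int numlag = 0;
    int node = current;                 /* Step 1: start from current node */
    lagsel[numlag++] = node;
    while (rdsol[node].parent != -1) {  /* Steps 2-4: walk up to the root  */
        node = rdsol[node].parent;
        lagsel[numlag++] = node;
    }
    return numlag;                      /* number of Lagrange functions    */
}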

Figure 13 Implementation of the GOP Algorithm in C. (Start of the algorithm:
initialize the data arrays, i.e. storage for the algorithm and the solvers and parameters
for the solvers; input the problem data, which is read from a file or passed via a
function call and held in a single structure, including the tolerances and the starting
point for the algorithm; invoke any solver-specific routines, e.g. reading option files
and, for CPLEX/OSL, loading dummy problems; then begin the iterations.)



Figure 13 (continued) Implementation of the GOP Algorithm in C. (Primal problem:
the problem data and the current y are set up, with the nonlinear terms supplied as a
black box by user subroutines; the primal problem is solved, via a pointer to its data,
by CPLEX/OSL for linear problems or NPSOL/MINOS for nonlinear ones, with
function evaluations for the nonlinear terms; from the pointer to the primal solution,
the Lagrange function is generated together with the number of connected variables,
the bounds for the X variables and the bounds for the gradients.)



Figure 13 (continued) Implementation of the GOP Algorithm in C. (Relaxed dual
problem: from the Lagrange data and the current and previous fixed y, the previous
Lagrange functions are selected, one Lagrange function per iteration, with the
gradients used as the criterion; this yields the set of constraints for the relaxed dual
subproblems. The relaxed dual problem is then solved either in MILP form via
CPLEX/OSL, one problem giving one solution, or in the original form as several
subproblems, branching on the gradients of the Lagrange function, with linear
subproblems solved by CPLEX/OSL, which can reuse bases from one problem to
another, and nonlinear ones by NPSOL/MINOS; the solutions are stored in a linked
list and the bounds are updated.)



Figure 13 (continued) Implementation of the GOP Algorithm in C. (Selecting the
best solution and lower bound: all solutions are stored in a single linked list; the best
stored solution provides the lower bound and the new value for the y variables; the
selected solution is then deleted from the stored set by going through the linked list,
removing the selected node and updating the list; if the bounds are within the
specified tolerance the algorithm cleans up and exits, otherwise it proceeds to the
next iteration.)



The list of nodes generated in the above steps provides a set of qualifying constraints
(one set per node) that define the region of operation for the current node.

Obtaining Tighter Bounds For The X Variables


If desired, a set of bounds problems is solved to find the tightest bounds on
the x and y variables given any linear and convex constraints in the original problem,
and the current region for the y variables as defined by the qualifying constraints
for the parent nodes of the current node. This is a very important step, because the
tightness of the bounds on the x variables is crucial to obtaining tight underestimators
for the relaxed dual problems.

Primal problem
The primal problem takes as data the pdat structure, along with the current vector
for yK. It is also given the current region for the problem as defined by the selected
qualifying constraints. There are several schemes that can be followed to solve the
primal problem, all of which involve various combinations of the primal, relaxed
primal or a local optimization problem solved in the current region. One possible
scheme is as follows:

1. Solve the primal problem at the current y^K.

2. If the primal problem is feasible, update the upper bound.

(a) Solve the full NLP as a local optimization problem in the current region.
(b) If the NLP solution is lower than the upper bound, replace yK with the NLP
solution and go to Step 1. Otherwise go to Step 4.

3. If the primal problem is infeasible:

(a) Solve the full NLP as a local optimization problem in the current region.
(b) If the NLP provides a feasible solution, then replace y^K with the new
solution from the NLP and go to Step 1. Otherwise, solve the relaxed primal
problem and go to Step 4.

4. Return the solution of the problem as a psol data structure.



Determination Of Connected Variables

The solution of the current primal (or relaxed primal) problem is used to determine
the set of connected variables. Several reduction tests are used to determine the set.
These include testing the lower and upper bounds on the gradients of the Lagrange
function and the tightness of the bounds on the x variables. If the lower and upper
bounds on an x variable are within a certain tolerance, that variable can be fixed at
its bound. Provision is also made for user-defined tests for reducing the number of
connected variables.

Generation of Lagrange Function Data

As mentioned earlier, only the data for the Lagrange functions (4.3) are stored. This
data is generated from the current psol structure. Once the data is generated, it can be
used again whenever the Lagrange functions from that iteration need to be generated.

Global Lagrange functions

If there are no connected variables in the Lagrange function generated at the current
iteration, then this function contains only the y variables. Therefore, it is a valid
underestimator for the entire y space, and can be included as a cut for all future relaxed
dual subproblems. In such a case, the current Lagrange function is added to the list of
"global" Lagrange functions.

Relaxed Dual Problem

Given the current region and a set of connected variables, the region is partitioned
using the qualifying constraints of the current Lagrange function. Then, a relaxed
dual subproblem is solved in each region, and the solutions are stored as part of rdsol
if feasible. The nonlinear terms in the objective function and constraints are again
incorporated through calls to the user-defined functions. In the case of the GOP/MILP
algorithm, only one MILP problem needs to be solved.

Selection of the Lower Bound

After the relaxed dual problem has been solved for every possible combination of the
bounds of the connected variables (in the case of the GOP/MILP algorithm, after the
MILP has been solved), a new lower bound needs to be determined for the global
solution. Since the solutions are all stored as a linked list, this permits a simple

search for the best solution. This solution is then removed by simply removing the
corresponding node from the linked list. At the same time, the corresponding value of
y is also extracted to use for the next iteration.
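
A minimal sketch of this selection step, assuming the solutions are held in a singly
linked list (the type and field names are illustrative, not the cGOP source):

/* Sketch: find and unlink the minimum-objective node from a singly
   linked list of relaxed dual solutions. Its objective value is the
   new lower bound, and its y vector seeds the next iteration. */
#include <stddef.h>

typedef struct sol_node {
    double objval;            /* relaxed dual objective (lower bound) */
    double *yval;             /* y values stored with this solution   */
    struct sol_node *next;
} SOL_NODE;

SOL_NODE *pop_best(SOL_NODE **head)
{
    SOL_NODE **best = head;
    for (SOL_NODE **p = head; *p != NULL; p = &(*p)->next)
        if ((*p)->objval < (*best)->objval)
            best = p;                 /* remember the link to the best node */
    SOL_NODE *node = *best;
    if (node != NULL)
        *best = node->next;           /* remove the node from the list      */
    return node;
}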

Resolving the MILP Formulation

In the case of the GOP/MILP formulation, after a solution has been selected from
the list of candidate solutions, the MILP formulation corresponding to the iteration
from which the solution was generated needs to be resolved. To accomplish this, a
binary cut that excludes the selected solution is generated and added to the MILP
formulation, which is then solved. Because the formulation for any given iteration
is likely to be solved several times, a number of such formulations are stored in
memory, so that resolving one is merely a matter of restarting the problem with the
additional binary cut. This saves valuable loading and startup time for the solution
of these problems.
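
The chapter does not write the cut out explicitly; for reference, the standard integer
cut that excludes a previously obtained 0-1 solution $\hat{b}$ (where $b$ denotes the
binary variables of the MILP formulation) has the form

$$\sum_{i:\,\hat{b}_i = 1} (1 - b_i) \;+\; \sum_{i:\,\hat{b}_i = 0} b_i \;\ge\; 1,$$

which cuts off exactly the point $b = \hat{b}$ while leaving every other 0-1 combination
feasible.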

Convergence

Finally, the check for convergence is done. The algorithm is deemed to have converged
if the relative difference between the upper bound from the primal problems and the
lower bound from the relaxed dual problems is less than $\epsilon$. Then, the algorithm
terminates (in the case of the standalone version) or returns to the calling routine (in
the case of the subroutine version). Otherwise, the algorithm continues with the new
fixed value of y for the primal problem found in the previous step.
5
SOLVING NONCONVEX PROCESS
OPTIMISATION PROBLEMS USING
INTERVAL SUBDIVISION
ALGORITHMS
R. P. Byrne and I. D. L. Bogle
Department of Chemical & Biochemical Engineering,†
University College London, London, England

ABSTRACT
Many Engineering Design problems are nonconvex. A particular approach to global
optimisation, the class of 'Covering Methods', is reviewed in a general framework.
The method can be used to solve general nonconvex problems and provides guaran-
tees that solutions are globally optimal. Aspects of the Interval Subdivision method
are presented with the results of their application to some illustrative test problems.
The results show the care that must be taken in constructing inclusion functions and
demonstrate the effects of some different implementation decisions. Some particular
difficulties of applying the method to constrained problems are brought to light by the
results.

1 INTRODUCTION
1.1 Motivation
The advantages of optimisation are well known: it provides the best possible
solution to a well defined problem. Thus the Design Engineer can be confident
that the design produced is the best one available for the problem. Traditionally,
however, this has only been the case if the problem is convex. When nonconvex
problems are attempted it can no longer be assumed that the design is the best
possible design because the solution to the optimisation problem may not be
the global solution.
†This work was done as part of the Centre for Process Systems Engineering and supported
by the EPSRC.


Techniques for determining the minimiser of a convex optimisation problem are
well established, but procedures for solving nonconvex optimisation problems
and determining global minima are not so well developed, widely used, or well
documented.

1.2 Problem statement


The general nonconvex optimisation problem is stated in the same way as a
convex problem. The difference is that there are no implicit assumptions about
convexity, continuity or differentiability:

$$\min_{x \in A} f(x) \qquad (5.1)$$

$$A = \{x \in \mathbb{R}^n \mid h_i(x) = 0,\; g_j(x) \le 0\}. \qquad (5.2)$$

This problem is an Unconstrained Optimisation Problem if $A = \mathbb{R}^n$ or, more
typically, if $A = X$ where $X$ is an interval or hyperrectangle.

A global solution, $x^*$, to this problem is defined by

$$x^* \in A \;\text{ such that }\; f(x^*) \le f(x) \quad \forall x \in A. \qquad (5.3)$$

As computer implementations cannot produce exact results, this requirement is
frequently relaxed to

$$|f(x^*) - f(x)| \le \epsilon \qquad (5.4)$$

where $\epsilon$ is a machine-precision dependent constant.

1.3 Aims and Objectives


This chapter aims to explore the application of interval analysis optimisation
methods to global optimisation problems. Emphasis is placed on issues that are
particularly relevant to process design.

In the absence of systematic approaches to global optimisation, Random Search
techniques have been used without any assumptions about the properties of
f(x), but they are not an effective way of locating minima and do not guarantee
global optima. The more sophisticated random search methods, Singlestart and
Multistart [1], Clustering [2] and Genetic Algorithms [3], make more assumptions

about the problem structure and are more successful, but still cannot guarantee
global optimality. In order to do so the problem space must be covered and
bounded. This is the basis for 'Covering Methods', which solve nonconvex global
optimisation problems (§2). Covering methods typically require more information
about the problem than Random Search methods, but global optimality
can be guaranteed.

To use Covering Methods there must be some way of obtaining a lower bound
on f(x) and an upper bound on the value of the global optimum, f(x*). A
methodology for Covering Methods, independent of the bounding procedure, is
discussed in §2. The bounds needed to cover the problem may be provided in
a number of different ways.

For functions which are Lipschitz, the Lipschitz constant, when known, can be
used to provide these bounds (§2.1). These are the simplest of the covering
methods. However, not all practical problems are Lipschitz and the constant
may not be available.

An alternative for f(x) which is not Lipschitz is to use Interval Analysis, an
analogue of Real Analysis for ranges. The relevant properties of Interval Analysis
and inclusion functions are described in §3 and §3.2. Extension to constrained
optimisation is discussed in §4, and the results of applying the Interval Optimisation
method to some illustrative test problems from the literature are presented
in §5.

2 COVERING/EXCLUSION METHODS
In order to ensure that the solution to a general nonconvex optimisation problem
is global it is necessary either to locate all the minima or to cover the region of
interest so that no minima are missed. These 'Covering Methods' are usually
based on excluding subregions until a region, or set of regions, that is sufficiently
small may be said to contain the global optimum. To exhaustively search a
region it is necessary to have some mechanism for obtaining lower, and perhaps
upper, bounds on the value of the objective over this region and an upper bound,
ȳ, on the value of f(x*).

The algorithms rely on bounding the objective function over subsets, Xk, of
the feasible region, and maintaining an upper bound on the value of the global
optimum. If the lower bound for a given Xk is greater than ȳ then Xk cannot
contain a global minimiser. This is the exclusion principle and is common to
all covering methods.

A general form of the algorithm for a Covering/Exclusion method is:

1. Initialise:
   (a) an upper bound on the value of f(x*): ȳ = ∞;
   (b) the initial region, X0 ⊇ A;
   (c) a list L with the initial pair (X0, ȳ).
2. Remove an element of the list, X.
3. Split X into subsets Xk.
4. For each subset Xk:
   (a) Obtain a lower bound, lk, on the value of f(x), x ∈ Xk.
   (b) Obtain an upper bound, ȳk, on the value of the global minimum in Xk.
   (c) Add the pair (Xk, lk) to the list.
   (d) Set ȳ = min(ȳ, ȳk).
5. Discard any pairs from L for which lk > ȳ.
6. If termination criteria apply, terminate.
7. Go to step 2.
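
As a concrete illustration of this scheme, the following is a minimal, self-contained
sketch in C for a one-dimensional example, f(x) = x⁴ − 2x² on [−2, 2] (two global
minimisers at x = ±1 with f* = −1). The lower bounds come from the natural interval
extension described in §3; the code illustrates the loop above under those assumptions
and omits refinements such as outward rounding — it is not production software:

#include <stdio.h>
#include <math.h>

typedef struct { double lo, hi; } ival;

/* Inclusion of x^2 over an interval */
static ival isqr(ival x) {
    ival r;
    if (x.lo >= 0.0)      { r.lo = x.lo*x.lo; r.hi = x.hi*x.hi; }
    else if (x.hi <= 0.0) { r.lo = x.hi*x.hi; r.hi = x.lo*x.lo; }
    else { r.lo = 0.0; r.hi = fmax(x.lo*x.lo, x.hi*x.hi); }
    return r;
}

/* Natural extension of f(x) = x^4 - 2x^2: F(X) = sqr(sqr(X)) - 2 sqr(X) */
static ival F(ival x) {
    ival s = isqr(x), q = isqr(s);
    ival r = { q.lo - 2.0*s.hi, q.hi - 2.0*s.lo };
    return r;
}

static double f(double x) { return x*x*x*x - 2.0*x*x; }

int main(void) {
    ival box[4096]; double lb[4096]; int n = 0;
    double ybar = f(0.0);                        /* 1(a): upper bound from a sample */
    box[n] = (ival){ -2.0, 2.0 };                /* 1(b): initial region            */
    lb[n] = F(box[n]).lo; n++;                   /* 1(c): initial pair              */
    for (int it = 0; it < 200; ++it) {
        int k = 0;                               /* 2: box with the lowest bound    */
        for (int i = 1; i < n; ++i) if (lb[i] < lb[k]) k = i;
        ival X = box[k]; box[k] = box[n-1]; lb[k] = lb[n-1]; n--;
        double m = 0.5*(X.lo + X.hi);
        ival half[2] = { {X.lo, m}, {m, X.hi} }; /* 3: bisect                       */
        for (int j = 0; j < 2; ++j) {            /* 4: bound each half              */
            double l = F(half[j]).lo;
            double fm = f(0.5*(half[j].lo + half[j].hi));
            if (fm < ybar) ybar = fm;            /* 4(d): update the upper bound    */
            if (l <= ybar) { box[n] = half[j]; lb[n] = l; n++; } /* 5: keep or cut  */
        }
    }
    printf("upper bound %.6f, %d boxes retained\n", ybar, n);
    return 0;
}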

Methods vary in the manner in which bounds are generated, in how the region is
divided and in the assumptions required of f(x) and A but, while many are more
sophisticated, all follow this general scheme.

In some algorithms which solve a relaxed problem to generate lower bounds ([4],
[6]) it is possible that the solution of the relaxed problem will be a solution to
the original problem. As the relaxed problem is convex, this solution
is global and so is also the global minimum of f(x) on Xk. In this case a slightly
different treatment may be applied whereby Xk is removed from L and added
to a secondary list of regions whose global minimum is known. Elements of this
list will also be discarded in step 5 subject to the criterion lk > ȳ.

Most covering methods are constructed so that they can take advantage of
additional information or properties where available. An interval algorithm due
to Hansen [7] takes advantage of differentiability to reduce Xk at step 2 using
an Interval Newton method, and to discard more sets from the list in step 5 with
a monotonicity test.

Given this common base, Covering methods can be split into two main groups:

1. Those that employ sophisticated, computationally expensive, bounding techniques
and aim to divide the region a small number of times. In addition,
these methods can often use complex strategies for dividing each region,
such as using nonlinear constraints to split Xk along a constraint
from the initial problem.

2. Methods that use an inexpensive mechanism for bounding, making a large


number of divisions of the region. Thus the computational expense of the
steps applied is low but they must, typically, be applied a large number of
times in order to refine the bounds on f(x), x E Xk.

The former class can take advantage of well developed convex optimisation
techniques applying them to relaxed problems or decompositions of the original
problem. Examples may be found in, amongst others, [4], [6] and [8].

Lipschitz Optimisation is an example of a method that uses a simple technique
for generating bounds. Because it is mathematically simpler than many other
methods, convergence can be proved [9], [10].

2.1 Lipschitz Methods


That the rate of change of an objective function is bounded is a common, and
not altogether unreasonable, assumption for practical functions. This means
that the region of interest can be adequately searched by the use of a grid with
sufficient density [11]. If the objective function $f : \mathbb{R} \to \mathbb{R}$ is Lipschitz with
known constant L on A,

$$|f(x) - f(y)| \le L\,|x - y| \quad \forall x, y \in A. \qquad (5.5)$$

The task, then, is to determine the density of the grid which will provide a
solution to the required accuracy. Evaluating f(x) at N points $x_i$ in A, given by

$$x_i = a + \frac{(2i - 1)\epsilon}{L}, \qquad (5.6)$$

$$N > \frac{L(b - a)}{2\epsilon}, \qquad (5.7)$$
will result in at least one point satisfying the $\epsilon$-global optimality criterion
(Eqn. 5.4) [9].

Lipschitz algorithms construct a piecewise linear bounding function, $\underline{f}(x)$, from
lower bounding functions $\underline{f}_i(x)$ at each sample point $x_i$:

$$\underline{f}_i(x) = f(x_i) - L\,\|x - x_i\| \qquad (5.8)$$

$$\underline{f}(x) = \max_i\, \{\underline{f}_i(x)\}. \qquad (5.9)$$

(Each $\underline{f}_i$ is a valid lower bound on $f$, so their pointwise maximum is the
tightest piecewise linear underestimator available from the samples.)

This underestimating function can be used to provide lower bounds on the
objective, which gives a method of exhaustively searching the feasible area
of a problem. The points at which the functions intersect provide lower bounds
on the value of f(x), and the sampled points $x_i$ provide the information needed
to calculate ȳ. Given a set, Xk, with a lower bound greater than the current upper
bound on the global optimum, Xk cannot contain a global minimiser and thus may be
discarded and excluded from any further search.
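
A minimal sketch of this saw-tooth lower bound in C (the function name and
argument layout are our own; outward rounding and multivariate norms are omitted):

/* Lower bound from Eqns 5.8-5.9: each sample xs[i] with value fs[i]
   contributes fs[i] - L*|x - xs[i]|; the envelope takes the tightest
   (largest) of these valid lower bounds at the point x. */
#include <math.h>

double lipschitz_lb(double x, const double *xs, const double *fs,
                    int n, double L)
{
    double best = fs[0] - L * fabs(x - xs[0]);
    for (int i = 1; i < n; ++i) {
        double li = fs[i] - L * fabs(x - xs[i]);
        if (li > best) best = li;   /* pointwise maximum of the cones */
    }
    return best;                    /* valid lower bound on f(x)      */
}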

The Lipschitz methods are, in general, effective for single variable problems if
f(x) is known to be Lipschitz and L can be found. Complications arise when
solving multivariate problems because the regions or sets which can be excluded
are hyper-spheres. Thus, as more hyper-spheres are excluded from the search,
the region remaining becomes increasingly difficult to describe computationally
[12].

Lipschitz Optimisation is described in [9], [12] and [13].

3 INTERVAL ANALYSIS, A BOUNDING PROCEDURE

While functions obeying a Lipschitz condition are quite common, the problems of
determining such a constant, and the fact that the maximum gradient of f(x) on
A will, necessarily, lie a long way from x*, mean that an inefficiently fine grid will

be used by the algorithm as it approaches the minimum [11]. Interval methods
may be used without the need for a global constant to bound the function,
and operate, in a sense, like an adaptive Lipschitz constant. These methods
have been successfully employed for convex minimisation and the solution of
nonlinear equations, and are particularly suited to computer implementation.

3.1 Interval Analysis

An interval is written as an ordered pair of real numbers [a, b] with a ≤ b; it is
the set of real numbers $\{x \in \mathbb{R} \mid a \le x \le b\}$.

A degenerate interval [a, a] is a real number in much the same way as a complex
number z = a + 0i is real. The set of all possible intervals is denoted by $\mathbb{I}$,
and $\mathbb{R} \subset \mathbb{I}$.

For intervals X = [a, b] and Y = [c, d] the operation

$$[a, b] \bullet [c, d] = \{x \bullet y \mid a \le x \le b,\; c \le y \le d\} \qquad (5.10)$$

can be defined for:

$$\begin{array}{lll}
[a, b] + [c, d] &=& [a + c,\; b + d] \\
[a, b] - [c, d] &=& [a - d,\; b - c] \\
[a, b] \times [c, d] &=& [\min(ac, ad, bc, bd),\; \max(ac, ad, bc, bd)] \\
[a, b] \div [c, d] &=& [a, b] \times [1/d,\; 1/c] \quad \text{iff } 0 \notin [c, d]
\end{array} \qquad (5.11)$$

Further, the width of X, w(X), is defined as b − a. Relational operators follow
the rule [a, b] < [c, d] iff b < c. Thus the sign of X may be positive (a > 0),
negative (b < 0) or both (0 ∈ [a, b]).

This algebraic system was first applied to bound the error in
finite arithmetic operations on computers. If an operation on a number is,
instead, posed as an operation on an interval which bounds the number, then
the resulting interval will bound the answer. This provides a representation of
the number and the absolute error incurred.
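
These operations translate directly into code. The following is a minimal C sketch
of the operations in Eqn. 5.11; outward rounding, which a rigorous implementation
needs so that machine arithmetic cannot shrink the enclosure, is omitted for brevity:

/* Basic interval arithmetic after Eqn. 5.11. */
#include <math.h>

typedef struct { double lo, hi; } interval;

interval iadd(interval x, interval y) {
    return (interval){ x.lo + y.lo, x.hi + y.hi };
}
interval isub(interval x, interval y) {
    return (interval){ x.lo - y.hi, x.hi - y.lo };
}
interval imul(interval x, interval y) {
    double p1 = x.lo*y.lo, p2 = x.lo*y.hi;
    double p3 = x.hi*y.lo, p4 = x.hi*y.hi;
    return (interval){ fmin(fmin(p1, p2), fmin(p3, p4)),
                       fmax(fmax(p1, p2), fmax(p3, p4)) };
}
interval idiv(interval x, interval y) {   /* defined iff 0 not in [y.lo, y.hi] */
    return imul(x, (interval){ 1.0/y.hi, 1.0/y.lo });
}
double width(interval x) { return x.hi - x.lo; }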

For a complete exposition of Interval Analysis and Interval Optimisation see


[14] and [15] respectively.

3.2 Inclusion Functions

The number of practical problems to which interval analysis could be applied
would be limited if only the functions in Eqn. 5.11 (and their combinations)
could be used. Thus the concept of an inclusion function is introduced, and the
properties of the different forms of inclusion function are described.

Define the range of a function $f : \mathbb{R} \to \mathbb{R}$ over an interval Y as $\hat{f}(Y)$:

$$\hat{f}(Y) = \{f(x) \mid x \in Y\}. \qquad (5.12)$$

A function $F : \mathbb{I} \to \mathbb{I}$ is an inclusion of $f : \mathbb{R} \to \mathbb{R}$ on V if¹

$$\hat{f}(Y) \subseteq F(Y), \quad \forall Y \in V. \qquad (5.13)$$

¹In the case of vector-valued functions Eqn. 5.13 must be satisfied componentwise.

This system is useful for global optimisation because an inclusion, F′(X), of
the gradient of f(x) collects information about f′(x) for all x ∈ X. Thus,
if $F'(X) \not\ni 0$ then $f'(x) \ne 0$ for all $x \in X$, and X does not contain a stationary
point. This is a consequence of what is often called the fundamental property
of interval arithmetic; that is, for $F : \mathbb{I} \to \mathbb{I}$ an inclusion of $f : \mathbb{R} \to \mathbb{R}$,

$$x \in X \Rightarrow f(x) \in F(X), \quad (x \in \mathbb{R},\; X \in \mathbb{I}). \qquad (5.14)$$

When an inclusion function is used in global optimisation it is important that
the bounds on f(x) get better as the interval, Xk, of interest becomes smaller.
This property is called the 'convergence order' of the inclusion function.

The order, a > 0, of an inclusion, F(Y), of f(x) is defined by

$$w(F(Y)) - w(\hat{f}(Y)) \le A\, w(Y)^{a}, \quad Y \in \mathbb{I}, \qquad (5.15)$$

where A is a positive constant. Thus for small intervals the higher order inclusion
functions will produce tighter bounds. However, for wide intervals the converse
is true and a lower order inclusion will produce tighter bounds. In practice, if
w(X) ≤ 2 a second order inclusion will be better than a first order inclusion
[15].

The quality of the inclusion function is often critical in optimisation applications,
and so it is important to choose an inclusion form which is suitable for
the problem. The choice of inclusion function directly affects the 'tightness'
of the bounds that can be generated. The better the bounds, the fewer
subdivisions need to be made and, consequently, the better the algorithm's
performance (see also Eqn. 5.23). In general the lower order inclusion functions
are better at the start of an algorithm, when the intervals Xk are large, and the
higher order inclusions are more suitable for small intervals.

In order to construct a class of these inclusion functions it must be assumed
that some standard functions are already known, so that others may be defined
recursively. In many cases the standard inclusion function can be derived from
simple information about f(x).

Take, for example, a function $s : \mathbb{R} \to \mathbb{R}$ which is monotonically increasing
on V, such as $e^x$ on $\mathbb{R}$. Inclusion functions $S : \mathbb{I} \to \mathbb{I}$ of s, and
EXP : $\mathbb{I} \to \mathbb{I}$, are easily defined:

$$S([a, b]) = [s(a),\; s(b)] \qquad (5.16)$$

$$\mathrm{EXP}([a, b]) = [e^a,\; e^b] \qquad (5.17)$$

S is an inclusion of s on V, and EXP(X) is an inclusion of $e^x$ on $\mathbb{R}$.

Once the base functions have been defined (e.g. EXP(X)), inclusion functions
may be constructed using natural extension, one of the centred forms, or a
combination of these and the rules of Interval Arithmetic.

The significance of choosing the best inclusion for a given problem is illustrated
by the Six Hump Camel problem (Eqn. 5.25), where different inclusion functions
have a dramatic effect on the convergence and attainable accuracy.

Natural Interval Extension

Natural Interval Extension constructs F(X) from f(x) by replacing x with the
appropriate interval X and each component function $f_i(x)$ with an inclusion
$F_i(X)$. These are then combined using the arithmetic operations defined in
Eqn. 5.11.

Some of the differences between Real Analysis and Interval Analysis are important
for the construction of inclusion functions by Natural Extension. Foremost
amongst these is that subtraction and division are not the inverses of addition
and multiplication, as is the case with real arithmetic, i.e. $A - A \ne 0$ and
$A/A \ne 1$. For example,

$$[0, 1] - [0, 1] = [-1, 1], \qquad [1, 2] \div [1, 2] = [\tfrac{1}{2},\, 2]. \qquad (5.18)$$

Thus the order of evaluation of A + B − C is significant and may affect the
quality of the bound produced. Also, the distributive law of Real Analysis
holds in certain cases only, meaning that AB + AC is typically not as good an
inclusion as A(B + C), and A² is better than A·A.
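
This dependency effect is easy to reproduce with the interval routines sketched in
§3.1; the following fragment, compiled together with that sketch, prints the widened
enclosures:

/* Natural extension of x - x over X = [0,1] gives [-1,1] although the
   true range is {0}; Y*Y over [-1,1] gives [-1,1] although the range
   of y^2 is [0,1], which is why A*A is inferior to a dedicated square. */
#include <stdio.h>

int main(void) {
    interval X = { 0.0, 1.0 };
    interval d = isub(X, X);          /* [-1, 1], not [0, 0]            */
    interval Y = { -1.0, 1.0 };
    interval s = imul(Y, Y);          /* [-1, 1]; range of y^2 is [0,1] */
    printf("X - X = [%g, %g]\n", d.lo, d.hi);
    printf("Y * Y = [%g, %g]\n", s.lo, s.hi);
    return 0;
}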

Mean Value Inclusion Functions

Mean Value inclusion functions are derived from the mean value theorem of
real analysis. For $f(x) \in C^1$ and F′(X) an inclusion of f′(x),

$$T(X) = f(c) + (X - c)^T F'(X), \quad c \in X. \qquad (5.19)$$

The constant c can be chosen as the midpoint of X. The Mean Value form of
an inclusion is of order two [15]. Therefore it will provide tighter bounds on
intervals with a small width.
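
A small numerical comparison makes the point. The sketch below (again assuming
the interval routines of §3.1) evaluates f(x) = x − x² over X = [0.4, 0.6], whose true
range is [0.24, 0.25]; the natural extension gives [0.04, 0.44], while the mean value
form with F′(X) = 1 − 2X tightens this to [0.23, 0.27]:

/* Natural extension vs. mean value form (Eqn. 5.19) for
   f(x) = x - x^2 on X = [0.4, 0.6]. */
#include <stdio.h>

int main(void) {
    interval X   = { 0.4, 0.6 };
    interval nat = isub(X, imul(X, X));          /* [0.04, 0.44]                 */

    double   c   = 0.5 * (X.lo + X.hi);          /* midpoint c = 0.5             */
    interval one = { 1.0, 1.0 }, two = { 2.0, 2.0 };
    interval dF  = isub(one, imul(two, X));      /* F'(X) = 1 - 2X = [-0.2, 0.2] */
    interval dx  = isub(X, (interval){ c, c });  /* X - c = [-0.1, 0.1]          */
    interval fc  = { c - c*c, c - c*c };         /* f(c) = 0.25                  */
    interval mv  = iadd(fc, imul(dF, dx));       /* [0.23, 0.27]                 */

    printf("natural    = [%g, %g]\n", nat.lo, nat.hi);
    printf("mean value = [%g, %g]\n", mv.lo, mv.hi);
    return 0;
}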

Taylor Form Inclusion Function

The Taylor Form inclusion is also a centred form. For $f(x) \in C^2$, given F″(X)
an inclusion for the Hessian matrix f″(x),

$$T_2(X) = f(c) + (X - c)^T f'(c) + \tfrac{1}{2}(X - c)^T F''(X)(X - c), \qquad (5.20)$$

$T_2$ is an inclusion of f(x). F″(X) may be obtained by automatic differentiation.
For functions with a bounded Hessian, the Taylor Form inclusion function is of
order two.

Specialised inclusion forms for univariate and/or rational f(x) can be found in
[7] and [16].

4 EXTENDING INTERVAL METHODS TO CONSTRAINED OPTIMISATION

The extension of interval optimisation methods to constrained problems is
relatively simple, as it builds on the principles used by the unconstrained
algorithm. Given the exclusion strategy used in the unconstrained case, where
regions (or boxes) are excluded according to bounds on the objective function,
an extra step can be included to allow exclusion on the basis of infeasibility.
Should a box be determined to be infeasible, it is discarded.

4.1 Testing Feasibility

Interval analysis can be used to examine the feasibility of a given region with
respect to equality and inequality constraints.

Inequality Constraints
For an inequality constraint, g(x) ≤ 0, with an inclusion function G(X), interval
analysis provides bounds on the value of the constraint function, and thus the
'status' of Xk may be determined to be feasible, infeasible or indeterminate³
with respect to g(x) ≤ 0.

$$\begin{array}{lll}
\text{Feasible} & G(X_k) \le 0 & (b \le 0) \\
\text{Infeasible} & G(X_k) > 0 & (a > 0) \\
\text{Indeterminate} & 0 \in G(X_k) &
\end{array} \qquad (5.21)$$

If an interval is indeterminate with respect to a constraint, it may contain only
feasible points, or both feasible and infeasible points.
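
In code this three-way test is a pair of comparisons on the bounds of G(Xk); a
minimal sketch, assuming the interval type sketched in §3.1:

/* Status of a box with respect to g(x) <= 0, after Eqn. 5.21,
   given an inclusion G = [a, b] of g over the box. */
typedef enum { FEASIBLE, INFEASIBLE, INDETERMINATE } box_status;

box_status classify(interval G)
{
    if (G.hi <= 0.0) return FEASIBLE;      /* b <= 0: whole box satisfies g  */
    if (G.lo >  0.0) return INFEASIBLE;    /* a > 0: no point satisfies g    */
    return INDETERMINATE;                  /* 0 in G: box must be subdivided */
}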

Equality Constraints
A box, Xk, may be determined to contain feasible points with respect to an
equality, h(x), iff 0 ∈ H(X), the inclusion of h(x).

Clearly it is not possible for the process of finite subdivision to result in a box
which is entirely feasible with respect to an equality, h(x) = 0. For this reason
some form of relaxation must be made with respect to equality constraints. It
³Relational expressions on $\mathbb{I}$ are outlined in §3.1.

is possible to relax the equality to two inequalities by choosing a relaxation
constant $\alpha$:

$$h(x) = 0 \;\text{ is relaxed to }\; \begin{cases} h(x) \le +\alpha \\ h(x) \ge -\alpha \end{cases} \qquad (5.22)$$

Alternatively, a maximum box width, $\beta$, is chosen, and boxes satisfying
$w(X_k) \le \beta$ are considered to be feasible. This results in a 'chain' of boxes,
below the acceptable size, along the equality. As with the treatment of inequality
constraints, this retains all feasible points.

4.2 Exclusion of Infeasible Regions

Infeasibility may be added to the general algorithm as an additional exclusion
criterion. However, it will not in general be possible to exclude regions such that
the remaining set, Z, is equal to the feasible region, A, within a finite number
of divisions. This is because the division of intervals is orthogonal whereas the
constraints, typically, will not be.

Further uncertainty is introduced because the bounds produced by the inclusion
functions are not necessarily 'tight'. This uncertainty is reduced as the width
of X is reduced.

If it holds that

$$w(G(X)) \to 0 \;\text{ and }\; w(H(X)) \to 0 \;\text{ as }\; w(X) \to 0, \qquad (5.23)$$

then the region Z can be reduced such that its complement with A is smaller
than any reasonable specified tolerance. It is always the case that A ⊆ Z, so
that no feasible points are discarded.

As boxes are divided and bounds on the objective function are accumulated,
they may be discarded if they do not contain any feasible points or if the lower
bound on f(x) over the box is greater than the current estimate of the global
optimum, ȳ.

For unconstrained problems, obtaining an upper estimate of the value of f(x*)
is simply a matter of sampling any point, xk ∈ A, on the graph of f(x), or
locating an unconstrained local minimum. To obtain this value when the problem
contains constraints, xk must be a feasible point. Thus only those xk that are
feasible contribute. This can be improved if a feasible point algorithm is used
to determine a feasible point in each indeterminate box, and improved further
if a constrained local search is performed. In the case of equality constrained
problems it is necessary to use a feasible point search, or constrained local
optimiser, in order to generate values for ȳ.

5 IMPLEMENTATION AND RESULTS

5.1 Implementation
When implementing an interval optimisation algorithm a number of decisions
must be made. These concern: choosing which box from the list of possible
boxes should be investigated first, how boxes should be partitioned, and how the
termination criteria are to be satisfied.

Ideally the box chosen from the list will be the one that, once partitioned
and bounded, will result in the removal of the most boxes from the list. In practice
it is not possible to know which box, Xk, will be the best before evaluating
F(Xk), so a choice may be made between the box with the lowest
upper bound and the box with the lowest lower bound. As the lower bound must be
stored with Xk regardless of which scheme is chosen, and boxes are excluded
on the basis of this lower bound, choosing the box with the lowest lower bound
seems to be the more practical choice.

Partitioning of boxes depends on two decisions: the direction of partition and the
point at which the partition will be made. The Moore-Skelboe algorithm [14]
bisects each box perpendicular to its longest edge. This prevents the production
of long thin boxes but can also result in a very even search of each box. A more
directed search is obtained by using a measure $F_j(X)$, $j = 1 \ldots n$, where

$$F_j(X) = F(x_1, \ldots, x_{j-1}, X_j, x_{j+1}, \ldots, x_n), \quad x_i \in X_i, \qquad (5.24)$$

that is, the inclusion of f(x) over X with all but the jth component of X
reduced to a point. This results in n extra evaluations of F(X) per iteration
but reduces the overall number of partitions that need to be made. The points
$x_j$ are usually the point at which the partition is to be made.

The point at which to partition is chosen here, somewhat arbitrarily, to be the
centre of the interval. This is the simplest method and results in bisection of
X. Progress can be improved by using a few steps of, say, a steepest descent
algorithm to choose the split point [15]. This is, in the main, due to the improved
upper bounds on f(x*) obtained during the descent, and not because the lowest
point in X is necessarily the best point at which to partition.

The results were obtained on an Intel 486DX using original code written in
C++.

5.2 Test Problems

The results of two unconstrained test problems are presented. The first, from
[15], is called 'The Six Hump Camel Back Function' and is given by

$$f(x_1, x_2) = 4x_1^2 - 2.1x_1^4 + \tfrac{1}{3}x_1^6 + x_1 x_2 - 4x_2^2 + 4x_2^4 \qquad (5.25)$$

with A = [(−10, 5), (−7, 4)].

A second order Taylor form inclusion function was used in addition to the
natural interval extension. If $c \in \mathbb{R}^2$ is a point in $X \in \mathbb{I}^2$, the Taylor form
inclusion, $F_T$, is given by

$$F_T(X) = f(c) + f_1(c)(X_1 - c_1) + f_2(c)(X_2 - c_2)
+ \tfrac{1}{2}H_1(X)(X_1 - c_1)^2 + (X_1 - c_1)(X_2 - c_2)
+ \tfrac{1}{2}H_2(X)(X_2 - c_2)^2, \qquad (5.26)$$

where

$$H_1(X) = 8 + X_1^2(-25.2 + 10X_1^2), \qquad H_2(X) = -8 + 48X_2^2. \qquad (5.27)$$

This problem has 15 stationary points in the region of interest. The two global
minima are at $(0.0898, -0.7126)^T$ and $(-0.0898, 0.7126)^T$. With $\epsilon < 10^{-4}$ the
Natural Interval Extension did not converge in 1000 subdivisions. For $\epsilon = 10^{-7}$
the Taylor form converges in fewer than 400 subdivisions, to a list with two
members:

$$\begin{pmatrix} [0.0898,\, 0.0900] \\ [-0.7130,\, -0.7119] \end{pmatrix}
\quad \text{and} \quad
\begin{pmatrix} [-0.0900,\, -0.0898] \\ [0.7119,\, 0.7130] \end{pmatrix}$$

This problem illustrates an interesting point. While Natural Interval Extension
is the most readily available way of developing an inclusion function, it is not
necessarily the best. The algorithm using a natural extension of this function
does not converge for values of $\epsilon < 10^{-4}$, independent of the maximum number
of iterations allowed. The Taylor form, however, is of a higher convergence order
and provides better bounds on the objective value over small boxes. These
results, along with the CPU time required, are summarised in Table 1.

Table 1 Result summary for problem 1.

Inclusion    Divisions    Time to solve (s)    Time/division (s × 10⁻³)
Eqn 5.26     374          53                   141.3
Eqn 5.25     1000         –                    140.0

The second problem (Eqn. 5.28) exhibits two global minima and demonstrates
how the interval optimisation locates both while discarding local minima

(5.28)

For a symmetrical problem, bisecting Xk perpendicular to its longest edge is
suitable, whereas for an asymmetrical problem bisecting using the measure
$F_j(X)$, as defined in Eqn 5.24, improves the overall performance. This difference
is not clearly seen with the Camel Hump problem, as it is symmetrical, while
in this case the problem is made deliberately asymmetrical.

Given $X_0 = [(-5, 8), (-2, 3), (-2, 3)]^T$, this problem converges to the same point
for both bisection methods, but requires 110 bisections using Eqn 5.24 as
compared to 174 with the traditional bisection mechanism.

$$x^* = \begin{pmatrix} \pm 2.7859 \\ [-3 \times 10^{-5},\; 2 \times 10^{-5}] \\ [-1.83 \times 10^{-4},\; 1.22 \times 10^{-5}] \end{pmatrix} \qquad (5.29)$$

Both these solutions have an objective function value of −1.809312. Note from
Table 2 that the penalty in CPU time for using the more advanced bisection
method is negligible in this case.

Table 2 Result summary for problem 2.

Bisection      Divisions    Time to solve (s)    Time/division (s × 10⁻³)
Widest edge    174          1.1                  6.4
Use Eqn 5.24   110          0.75                 6.8

A multiextremal constrained problem (Eqn. 5.30) from [5] has also been solved:

$$\begin{array}{rl}
\min\limits_{x \in A} & -x_1 + x_1 x_2 - x_2 \\[4pt]
\text{s.t.} & -6x_1 + 8x_2 \le 3 \\
& 3x_1 - x_2 \le 3 \\
& x_1 \ge 0, \quad x_2 \le 5
\end{array} \qquad (5.30)$$

This problem has two minima, as reported by [5], which occur at [0.916, 1.062]
and [1.167, 0.5], with f(x) = −1.0052 and −1.0833 respectively.

The problem is solved by modifying the objective function, f(x), and the inclusion
function, F(X), such that infeasible boxes are discarded while feasible and
indeterminate boxes are retained. Only feasible points contribute to the upper
bound.

Because boxes must be divided until they are considered feasible, this approach
can result in a large number of boxes if the maximum width, β (see §4), is small.

This problem required in excess of 2000 divisions (> 120 s) to achieve an
uncertainty on the value of f(x) of less than 10⁻⁴. At this point the algorithm
was stopped. There were 607 interval/bound pairs, all lying on the constraint,
on the list. All of these pairs fit into a box $K \in \mathbb{I}^2$ with a range of f(x),
$Y \in \mathbb{I}$:

$$K = \left( [1.1375,\, 1.1926],\; [0.418,\, 0.5752] \right), \qquad Y = [-1.0834,\, -1.0774].$$

This illustrates one of the difficulties of using interval methods for constrained
problems. The minima for this problem lie on the boundary of A. This boundary
is not orthogonal, and therefore a finite number of orthogonal subdivisions cannot
produce an entirely feasible box containing the global optimum. Thus, the

requirements for feasibility must be relaxed slightly, and indeterminate boxes
which are below a maximum acceptable size are treated as feasible. This
retains all feasible points, but also means that an indeterminate box, i.e. part
feasible, part infeasible, may be given as a solution.

In terms of practical application, any point in any of the 607 boxes will provide
a starting point for which f(x) < −1.0052. Furthermore, a local optimisation
procedure will converge to the global optimum given any one of these boxes as
a starting point.

In practice the maximum acceptable size for an indeterminate box can be made
very small. However, this has an appreciable impact on the computation time
required to solve the problem. It may not be necessary if a local optimisation
algorithm is to be used to refine the solution, as this will converge to the global
minimum feasible point within the box.

6 CONCLUSIONS
In this chapter we have shown an approach to solving nonconvex Nonlinear
Programming problems. This approach is a method of covering the feasible region,
obtaining bounds on f(x) using Interval Analysis, and locating the global minima
using the exclusion principle. The solutions provided are intervals containing
the global minimisers of f(x).

The Interval Analysis approach to global optimisation is very flexible. It
requires only that inclusion functions for the objective function and constraints
can be generated, and it provides a guarantee that the solutions are global
optimisers. The objective function and constraints may be convex, nonconvex or
concave without alteration of the algorithm.

The solutions provided by the interval method are all the global minimisers.
This is particularly useful in Engineering Design where factors that are not
included in the objective function, but are important in the final design, can be
used to make the final decision between globally optimal solutions.

Some aspects of this technique have been considered, and test problems used
to illustrate them. The first test problem shows how different inclusion functions
can affect performance and robustness. It was shown that a higher order
inclusion function can considerably increase the accuracy attainable in the
solution. This does not necessarily imply that the higher order inclusion will be
the best for all problems: the lower order inclusion functions produce tighter
bounds over large intervals. Moreover, the cost of evaluating, for example, a
Taylor Form inclusion, where automatic differentiation is performed, may
result in fewer divisions but could require more computation time per division,
possibly reducing the overall efficiency. A combination of these forms may be
appropriate, using a Natural Extension for large intervals and augmenting with
one of the centred forms close to the solution.

The same aspects of efficiency are relevant to the second problem. This problem
highlights the improvement that can be obtained by changing the partitioning
strategy, using more information about the problem than the standard bisection
method. Again, the performance improvement will depend on the scaling of the
variables and on the cost of evaluating the objective function f(x) as opposed to
the relatively inexpensive partitioning phase. A number of other 'accelerations'
are described in [17].

The third problem has a nonconvex objective with linear constraints and serves
to indicate the use of Interval Subdivision in constrained problems whose
global minimisers lie on the boundary of the feasible region. This argument
applies equally to equality constrained problems, where the solution must lie
on a constraint.

The Interval Optimisation algorithm must relax the feasibility criteria in order
to solve these problems. It was shown that, while the number of divisions may be
high, the algorithm can produce solutions to a prescribed accuracy. Therefore
the Interval Algorithm can certainly be used to supplement current convex
optimisation algorithms, allowing location of the global optimum.

Future Work.
It is clear that an appropriate choice of inclusion function and partitioning
strategy can reduce the number of subdivisions that must be generated, but it is
not clear how this affects performance when the objective function is expensive
to evaluate. A more extensive study, with an efficient implementation, to profile
the performance of these algorithms would provide insight into the optimal
implementation.

Interval Methods have successfully been applied to solving large scale nonlinear
equations using parallel computer architectures [18]. The Interval Optimisation
method is very similar and can probably be scaled in a similar fashion. This
is of great interest in process engineering because many large process design
optimisation problems are nonconvex.

Acknowledgements
This work was funded by the Engineering and Physical Sciences Research
Council and performed as part of the Centre for Process Systems Engineering.

The authors would like to thank Kevlin Henney for his consistently excellent
advice on the subject of C++.

REFERENCES
[1] Rinnooy Kan, A.H.G., Timmer, G.T. (1987). "Stochastic Global Optimization
Methods Part II: Multi-Level Methods." Math. Prog. 39 (1) 57-78.
[2] Rinnooy Kan, A.H.G., Timmer, G.T. (1987). "Stochastic Global Optimization
Methods Part I: Clustering Methods." Math. Prog. 39 (1) 27-56.
[3] Androulakis, I.P., Venkatasubramanian, V. (1991). "A Genetic Algorithmic
Framework for Process Design and Optimization." Comput. Chem.
Engng. 15 (4) 217-228.
[4] Floudas, C.A., Visweswaran, V. (1990). "A Global Optimization Algorithm
(GOP) for Certain Classes of Nonconvex NLPs. I. Theory." Comput. Chem.
Eng. 14 (12) 1397-1417.
[5] Floudas, C.A., Aggarwal, A., Ciric, A.R. (1989). "Global Optimum Search
for Nonconvex NLP and MINLP Problems." Comput. Chem. Eng. 13
(10) 1117-1132.
[6] Quesada, I., Grossmann, I.E. (1993). "Global Optimization Algorithm for
Heat-Exchanger Networks." Ind. Eng. Chem. Res. 32 (3) 487-499.
[7] Hansen, E. (1992). "Global Optimization Using Interval Analysis." Marcel
Dekker, New York.
[8] Kocis, G.R., Grossmann, I.E. (1991). "Global Optimization of Nonconvex
Mixed-Integer Nonlinear-Programming (MINLP) Problems in Process
Synthesis." Ind. Eng. Chem. Res. 27 (8) 1407-1421.
[9] Piyavskii, S.A. (1972). "An Algorithm for Finding the Absolute Extremum
of a Function." USSR Comput. Math. & Math. Phys. 12 57-67.
[10] Meewella, C.C., Mayne, D.Q. (1988). "An Algorithm for Global Optimization
of Lipschitz Continuous Functions." J. Optim. Theory Appl. 57 (2) 307-322.
[11] Torn, A., Zilinskas, A. (1989). "Global Optimization. Lecture Notes in
Computer Science." Springer-Verlag, Berlin.
[12] Hansen, P., Jaumard, B., Lu, S.H. (1992). "On Using Estimates of Lipschitz
Constants in Global Optimization." J. Optim. Theory Appl. 75 (1) 195-200.
[13] Hansen, P., Jaumard, B., Lu, S.H. (1992). "Global Optimization of Univariate
Lipschitz Functions. I. Survey and Properties." Math. Prog. 55
(3) 251-272.
[14] Moore, R.E. (1966). "Interval Analysis." Prentice-Hall, Englewood Cliffs,
NJ.
[15] Ratschek, H., Rokne, J. (1988). "New Computer Methods for Global
Optimization." Ellis Horwood Ltd., Chichester, West Sussex, England.
[16] Neumaier, A. "Interval Methods for Systems of Equations." Cambridge
University Press, London.
[17] Csendes, T., Pinter, J. (1993). "The Impact of Accelerating Tools on the
Subdivision Algorithm for Global Optimization." European J. of Ops. Res.
65 314-320.
[18] Schnepper, C.A., Stadtherr, M.A. (1993). "Application of a Parallel Interval
Newton/Generalized Bisection Algorithm to Equation-based Chemical
Process Flowsheeting." In Proc. International Conference on Numerical
Analysis with Automatic Result Verification, Lafayette, LA.
6
GLOBAL OPTIMIZATION OF
NONCONVEX MINLP'S BY
INTERVAL ANALYSIS
Ragavan Vaidyanathan*
and Mahmoud El-Halwagi
Department of Chemical Engineering, Auburn University, Auburn, AL 36849
*The M. W. Kellogg Company, Houston, TX 77210-4557

ABSTRACT
In this work, we introduce a global optimization algorithm based on interval analysis
for solving nonconvex Mixed Integer Nonlinear Programs (MINLPs). The algorithm
is a generalization of the procedure proposed by the authors (Vaidyanathan and El-
Halwagi, 1994a) for solving nonconvex Nonlinear Programs (NLPs) globally. The
algorithm features several tools for accelerating the convergence to the global solution.
A new discretization procedure is proposed within the framework of interval analysis
for partitioning the search space. Furthermore, infeasible search spaces are eliminated
without directly checking the constraints. Illustrative examples are solved to
demonstrate the applicability of the proposed algorithm to solve nonconvex MINLPs
efficiently.

1 INTRODUCTION
A large number of chemical engineering problems can be formulated as mixed-integer
nonlinear programs "MINLPs". These MINLPs are typically nonconvex
and hence possess multiple local optima. Over the past three decades, a number
of algorithms have been proposed to solve optimization programs globally (for
recent reviews the reader is referred to Floudas and Grossmann, 1994; Floudas
and Pardalos, 1992; and Horst, 1990). The proposed procedures have mainly
been developed utilizing branch and bound, outer approximation, primal-dual
decomposition and interval analysis principles. Swaney (1990), Maranas and
Floudas (1994) and Ryoo and Sahinidis (1994) have developed global optimization
algorithms based on branch and bound methods. An outer-approximation

algorithm was introduced by Kocis and Grossmann (1988) to solve nonconvex
MINLPs. The Generalized Benders Decomposition "GBD" originally proposed
by Geoffrion (1972) has been revised to be applicable to a large class of
optimization problems (e.g. Floudas and Visweswaran, 1993; Sahinidis and
Grossmann, 1991; Bagajewicz and Manousiouthakis, 1991).

Interval analysis can provide an attractive framework for the global solution of
nonconvex optimization problems. Interval analysis algorithms are based on the
concept of continually deleting sub-optimal portions of the search space until
the global solution alone is retained. Interval analysis algorithms have the
attractive property of guaranteed convergence to the global solution. The concept
of interval analysis was originally introduced to provide upper and lower
bounds on errors that occur in computations (Moore, 1966). Since then, the
scope of interval analysis has been significantly enhanced, particularly in the
area of global optimization. A number of implementations of interval-based
optimization procedures have been developed recently to solve nonlinear programs,
"NLPs" (e.g. Moore et al., 1992; Ratschek and Rokne, 1991; Hansen,
1980; Moore, 1979; Ichida and Fujii, 1979). However, they are all computationally
intensive for most problems. Recently, Vaidyanathan and El-Halwagi
(1994a) have introduced an interval-based global optimization procedure for the
solution of NLPs. In particular, they have introduced new techniques that
accelerate the solution and eliminate infeasible domains without directly checking
the constraints.

In this work, we propose a new algorithm for the global solution of MINLPs.
This algorithm is a generalization of the NLP-solving procedure developed by
Vaidyanathan and El-Halwagi (1994a). In addition to the accelerating tools, new
strategies for partitioning the search space in the presence of discrete variables
will be discussed. For computational economy, the procedure treats discrete
variables as continuous while applying the interval analysis tests. Case
studies have been solved to illustrate the efficacy of the algorithm.

2 PROBLEM STATEMENT
The problem to be addressed in this chapter can be stated as follows:

minimize (globally) $f(x, y)$,

subject to the nonlinear inequality constraints

$$p_i(x, y) \le 0, \quad i = 1, 2, \ldots, m$$

as well as the following box constraints,

$$a_i \le x_i \le b_i, \quad i = 1, 2, \ldots, k$$
$$c_i \le y_i \le d_i, \quad i = 1, 2, \ldots, n-k$$

which define the initial search box B. The domain (search space) is represented
by both continuous variables (x) and discrete variables (y), i.e. $x \in \mathbb{R}^k$ and
$y \in \mathbb{Z}^{n-k}$.

The objective function f(x, y) is assumed to be continuous and twice differentiable,
whereas each constraint $p_i(x, y)$ is assumed to be continuous and once
differentiable. Equality constraints can be handled as two inequalities. Alternatively,
an equality constraint may be eliminated by solving for some variables
that are separable.

3 INTERVAL ANALYSIS: BACKGROUND


In this section, some of the basic principles of interval analysis will be discussed.
For more details the readers are referred to Hansen (1992), Ratschek and Rokne
(1991), Alefeld and Herzberger (1983), and Moore (1979).

3.1 Intervals and Interval Arithmetic


An interval, X_i = [a_i, b_i], containing a real variable x_i is characterized by the
two scalars a_i and b_i such that a_i ≤ x_i ≤ b_i and a_i, b_i, x_i ∈ R.

Let B denote the set of real compact intervals such that:

B = {X_i | i = 1, 2, ..., n}. (6.1)

An interval vector X = (X_1, X_2, ..., X_i, ..., X_n)^T ∈ B^n represents a rectangular
region X_1 × X_2 × ... × X_n in the n-dimensional space R^n and is referred to as
a box. Let X and Y be two interval boxes. Then, X is said to be a sub-box of
Y if X_i ⊆ Y_i for each i = 1, 2, ..., n.

The width of an interval box is the maximum edge length over all the co-
ordinate directions, i.e.,

w(X) = max_{1≤i≤n} w(X_i), (6.2)

where

w(X_i) = b_i − a_i. (6.3)
Interval Arithmetic: The basic mathematical operations that are performed
with real numbers can also be performed with intervals. A set of rules has
been established to carry out the mathematical operations with intervals. Some
rules for performing interval operations are:

Addition
[a, b] + [c, d] = [a + c, b + d]. (6.4)

Negation
−[a, b] = [−b, −a]. (6.5)

Multiplication
[a, b] * [c, d] = [min(ac, ad, bc, bd), max(ac, ad, bc, bd)]. (6.6)

Division
[a, b] / [c, d] = [min(a/c, a/d, b/c, b/d), max(a/c, a/d, b/c, b/d)], (6.7)
if 0 ∉ [c, d] (other rules apply when 0 ∈ [c, d]). (6.8)
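As a concrete illustration, rules (6.4)-(6.8) translate into a few lines of code. The following Python sketch assumes exact real arithmetic; a rigorous implementation would round lower bounds down and upper bounds up (outward rounding) to retain guaranteed enclosures.

class Interval:
    """Closed real interval [lo, hi] with the arithmetic rules (6.4)-(6.8)."""
    def __init__(self, lo, hi):
        assert lo <= hi
        self.lo, self.hi = lo, hi

    def __add__(self, other):                       # rule (6.4)
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __neg__(self):                              # rule (6.5)
        return Interval(-self.hi, -self.lo)

    def __sub__(self, other):
        return self + (-other)

    def __mul__(self, other):                       # rule (6.6)
        p = [self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi]
        return Interval(min(p), max(p))

    def __truediv__(self, other):                   # rules (6.7)-(6.8)
        if other.lo <= 0 <= other.hi:
            raise ZeroDivisionError("0 in divisor; other rules apply")
        q = [self.lo / other.lo, self.lo / other.hi,
             self.hi / other.lo, self.hi / other.hi]
        return Interval(min(q), max(q))

    def __repr__(self):
        return f"[{self.lo}, {self.hi}]"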

3.2 Function Inclusion


There are several methods to evaluate an inclusion of a function over a given
interval (Ratschek and Rokne, 1984). Of these, the natural inclusion and the
centered form of inclusion are most commonly used. The natural inclusion of a
function f(x) is obtained by replacing each occurrence of the variable x with an
interval including it, X. Interval arithmetic, rather than real arithmetic, is then
used to compute the function. On the other hand, the centered-form inclusion
of f(x) is obtained by applying the natural inclusion to Taylor's expansion of
f(x). The centered form of inclusion gives tighter bounds on the function for
small intervals as compared to the natural inclusion and hence may be used
whenever possible in calculations. However, the natural inclusion is often very
useful because of its computational simplicity.
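As an illustration of the natural inclusion, and of the overestimation it may introduce, consider f(x) = x² − x over X = [0, 1], evaluated with the Interval class sketched above. The true range of f on [0, 1] is [−0.25, 0].

X = Interval(0.0, 1.0)
F = X * X - X      # replace each occurrence of x by the interval X
print(F)           # [-1.0, 1.0]: a valid but loose enclosure of [-0.25, 0]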

3.3 Inclusion Isotonicity


The inclusion isotonicity of intervals is an important property that makes in-
terval analysis useful for global optimization. Consider a real-valued function
f(x). The interval function F(X) is said to be an inclusion of f(x) if:

x ∈ X implies that f(x) ∈ F(X), (6.9)

and it is inclusion isotone if, in general:

Y ⊆ X implies that F(Y) ⊆ F(X). (6.10)

3.4 Current interval-based global-optimization methods

While there are several global-optimization procedures which are based on in-
terval analysis (e.g. Moore et al., 1992; Ratschek and Rokne, 1991; Sengupta,
1981; Hansen, 1980; Moore, 1979; Ichida and Fujii, 1979), they are all based
on the principle of successively deleting portions of the optimization space that
cannot contain the global solution. Invariably, four tests are used:

The upper-bound test:


Let the value of the objective function at an arbitrary feasible point be upbd.
Hence, upbd is a valid upper bound on the global minimum. Consider a sub-box
X of the original search box B. Let the inclusion of the objective function over
X be F(X) = [LBX, UBX]. Therefore, if:

LBX > upbd, (6.11)

one can completely delete the sub-box X.

The infeasibility test:


Suppose that X is a sub-box of B. Let F_i(X) = [F_i^L(X), F_i^U(X)] be the
inclusion of constraint p_i(x, y) over X. If for some i = 1, 2, ..., m:

F_i^L(X) > 0, (6.12)

then X is certainly infeasible and can be deleted.

Monotonicity test:

Consider a certainly strictly feasible sub-box X. By certainly strictly feasible,
we mean that F_i(X) < 0 for all i. Let g_i(x) denote the gradient of the objective
function f(x) with respect to x_i. Also, let G_i(X) be the inclusion of g_i(x) over
X. If:

0 ∉ G_i(X) for all i = 1, 2, ..., n, (6.13)

then the objective function is monotonic in all the coordinate directions. Hence,
only the end point that corresponds to the minimum objective function value
in the box should be retained and the rest of the box X can be deleted.

The nonconvexity test:

Let X be a certainly feasible box. In order for a minimum solution to be in the
interior of X, the Hessian H of the objective function must be positive semi-
definite in some neighborhood of the minimum. Hence, if H is not positive
semi-definite over the entire X, the interior of X can be deleted. The test is
typically carried out by evaluating the interval inclusion of each of the diagonal
elements of the Hessian, denoted by [H_ii^L(X), H_ii^U(X)]. If:

H_ii^U(X) < 0 for any i = 1, 2, ..., n, (6.14)

the interior of X can be deleted.
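All four tests reduce to sign checks on interval bounds. The sketch below is illustrative only; the helper callables (objective inclusion F, constraint inclusions Fs, gradient inclusions Gs, and diagonal Hessian inclusions Hdiag) are hypothetical and are assumed to return (lower, upper) pairs over a box X. Each test returns True when the box (or its interior) can be deleted.

def upper_bound_test(F, X, upbd):
    LBX, UBX = F(X)
    return LBX > upbd                        # (6.11): X cannot beat upbd

def infeasibility_test(Fs, X):
    # (6.12): some constraint inclusion lies entirely above zero
    return any(Fi(X)[0] > 0 for Fi in Fs)

def monotonicity_test(Gs, X):
    # (6.13): 0 outside every gradient inclusion; only one end point of X
    # needs to be retained
    return all(lo > 0 or hi < 0 for lo, hi in (Gi(X) for Gi in Gs))

def nonconvexity_test(Hdiag, X):
    # (6.14): a diagonal Hessian inclusion is entirely negative, so the
    # interior of X cannot contain an unconstrained minimum
    return any(Hii(X)[1] < 0 for Hii in Hdiag)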

Having discussed the basic principles of interval analysis and their applica-
tion in global optimization, we are now in a position to present our proposed
algorithm in the next section. It is based on generalizing the interval-based
algorithm proposed by Vaidyanathan and El-Halwagi (1994a,b) for tackling
NLPs. In addition, a discretization scheme is developed to tackle the special
aspects of searching over integer variables.

4 REVISED ALGORITHM
The revised interval analysis algorithm for global optimization, incorporating
the discretization procedure for treating integer variables, is discussed be-
low. In addition, the algorithm retains the tools developed earlier (Vaidyanathan
and El-Halwagi, 1994a) to significantly accelerate interval-based global opti-
mization algorithms. The integer constraints are relaxed while applying the
interval analysis tests for deleting sub-optimal and/or infeasible portions of the
search space. This relaxation is reconciled later when the search space is par-
titioned. Accordingly, the shifted partitioning strategy has been modified to
utilize the discrete nature of some of the variables in deleting infeasible portions
of the search space.

4.1 Lower bound test


One of the main limitations of the previously-described infeasibility test is the
need to examine the feasibility of the constraints one at a time until an infeasi-
ble constraint is identified. This can be computationally intensive. Therefore,
we propose the following new test to identify infeasible boxes without directly
checking the constraints.

Let a valid lower bound on the value of the objective function at the global
minimum be denoted by lwbd. Consider a sub-box X of the initial search box
B. Suppose that the inclusion of the objective function over X is given by [LBX,
UBX]. If the following condition holds:

UBX < lwbd, (6.15)

then the sub-box X is completely infeasible and can hence be deleted.

Several methods can be used to evaluate a lower bound on the value of the
objective function at the global minimum. For instance, if the optimization
program features an objective function and a set of constraints that are rational
polynomials, the procedure proposed by Manousiouthakis and Sourlas (1992)
can be used. For more general structures of NLPs, convex under-estimators
can be used to obtain a lower bound on the objective function (McCormick,
1976).

4.2 Distrust-region method


Once an infeasible point is located, it is desired to identify a completely infea-
sible box surrounding this point so as to delete it. Hence, we introduce the
following procedure that we call the "distrust-region" method.

Given an infeasible point x ∈ X, the distrust-region method will identify a
hypercube of side 2σ around the point x such that the hypercube is completely
infeasible. The scalar σ is called the distrust length.

The task of identifying the hypercube can be formulated as the following opti-
mization problem:

max σ,
subject to
P_i([x − σI, x + σI]) > 0 for some i = 1, 2, ..., m, (6.16)

σ ≥ 0, (6.17)

where P_i is the inclusion of the range of the constraint p_i and I is a unit vector.

This optimization program can be solved using any local optimization algo-
rithm, since a local solution is sufficient for the purpose. An alternative solution
procedure involves solving the optimization program by trial and error. To
begin with, a large value of σ is assumed and feasibility with respect to
the constraints described by (6.16) is checked. If one or more of the constraints
are satisfied, a solution has been obtained. Otherwise, σ can be scaled down
iteratively until at least one constraint in (6.16) is satisfied.
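A sketch of this trial-and-error procedure is given below; the list P of constraint inclusions and the starting and minimum distrust lengths are hypothetical inputs, and halving is used as the scaling-down rule.

def distrust_length(P, x, s_init, s_min=1e-8):
    """Find s such that the hypercube [x - s, x + s] is certainly infeasible."""
    s = s_init
    while s > s_min:
        box = [(xi - s, xi + s) for xi in x]
        if any(Pi(box)[0] > 0 for Pi in P):   # some p_i > 0 over the whole box
            return s                          # hypercube of side 2s can be deleted
        s *= 0.5                              # scale sigma down and retry
    return None                               # no usable distrust region found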

The solution to this program identifies a hypercube of side length 2σ surround-
ing x. This hypercube can be completely deleted from the search space. A good
starting infeasible point is the global solution to the unconstrained optimiza-
tion problem, which can be obtained via interval-analysis techniques. However,
any point that is infeasible with respect to the constraints of the problem may
be used.

In order to ensure the potential of the distrust-region method to delete rea-
sonably large portions of the space, it is useful to specify a scalar α which cor-
responds to the minimum width of the box X below which the distrust-region
method should not be implemented.

4.3 Discretization procedure for performing shifted partitioning around local solutions
In addition to the above-proposed methods, local minimization is employed to
accelerate the search. Throughout this paper, we employ the software GINO
(Liebman et al., 1986) to find the local minimum of the program over a box
X. This local solution can then be used as a partitioning point. If none of the
interval-analysis tests leads to deleting the box, we split the box around the
local minimum.

In general, splitting the search box at a point will yield 2^n sub-boxes. To
avoid such a tremendous increase in the number of sub-boxes, we split the
search box into two sub-boxes only. This is accomplished by partitioning in
only one direction. We arbitrarily assign this direction j as the one with the
largest width among all directions. This selection is aimed at quickly reducing
the size of the search space. Hence, for a search box

X^T = ([a_1, b_1], [a_2, b_2], ..., [a_i, b_i], ..., [a_n, b_n]), (6.18)

the partitioning direction, j, is characterized by:

b_j − a_j = max_{1≤i≤n} (b_i − a_i). (6.19)
In addition, we slightly shift the partitioning point from the local minimum.
Let x* be the local minimum and its component in the jth direction be x_j*. The
integer constraints were relaxed while applying the tests described above. This
relaxation will be reconciled while performing the partitioning of the search box
as discussed below. We propose the following rules for partitioning depending
on the discrete or continuous nature of x_j.

Case 1: If x_j is real and x_j* ≠ a_j or b_j:
The shifted partitioning around the local minimum will yield the two sub-boxes:

[a_1, b_1], ..., [a_{j−1}, b_{j−1}], [a_j, x_j* − β], [a_{j+1}, b_{j+1}], ..., [a_n, b_n]

and

[a_1, b_1], ..., [a_{j−1}, b_{j−1}], [x_j* − β, b_j], [a_{j+1}, b_{j+1}], ..., [a_n, b_n],

where β is a very small number.

Case 2: If x_j is real and x_j* = a_j or b_j:
The partitioning may be carried out at the midpoint of the interval representing
x_j. This will yield the following two sub-boxes after partition:

[a_1, b_1], ..., [a_{j−1}, b_{j−1}], [a_j, a_j + (b_j − a_j)/2], [a_{j+1}, b_{j+1}], ..., [a_n, b_n]

and

[a_1, b_1], ..., [a_{j−1}, b_{j−1}], [a_j + (b_j − a_j)/2, b_j], [a_{j+1}, b_{j+1}], ..., [a_n, b_n].

Case 3: If x_j is an integer variable and x_j* ≠ a_j or b_j, then the partitioning
yields the two sub-boxes:

[a_1, b_1], ..., [a_{j−1}, b_{j−1}], [a_j, int(x_j*)], [a_{j+1}, b_{j+1}], ..., [a_n, b_n]

and

[a_1, b_1], ..., [a_{j−1}, b_{j−1}], [int(x_j*) + 1, b_j], [a_{j+1}, b_{j+1}], ..., [a_n, b_n].

Case 4: If x_j is an integer variable and x_j* = b_j, then the partitioning yields
the two sub-boxes:
[a_1, b_1], ..., [a_{j−1}, b_{j−1}], [a_j, b_j − 1], [a_{j+1}, b_{j+1}], ..., [a_n, b_n]

and

[a_1, b_1], ..., [a_{j−1}, b_{j−1}], [b_j, b_j], [a_{j+1}, b_{j+1}], ..., [a_n, b_n].
Such a shifted partitioning is likely to induce strictly feasible sub-boxes over
which the objective function is monotonically increasing/decreasing and/or
nonconvex and which, hence, can be deleted. In addition, these local minima are
upper bounds on the global solution and, thus, can be used in the upper-bound
test. It is to be noted that the operation "int" rounds the real number down
to the nearest integer.
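The four cases translate into a short routine. In the sketch below, box is a list of (a_j, b_j) pairs, xstar is the local minimum, is_int flags the integer directions, and beta is the small shift of Case 1; the routine is illustrative rather than a definitive implementation.

import math

def shifted_partition(box, xstar, is_int, beta=1e-6):
    # branch on the direction of largest width, per (6.19)
    j = max(range(len(box)), key=lambda i: box[i][1] - box[i][0])
    a, b = box[j]
    xj = xstar[j]
    if not is_int[j]:
        if a < xj < b:                       # Case 1: shift off the minimum
            cut_lo = cut_hi = xj - beta
        else:                                # Case 2: bisect at the midpoint
            cut_lo = cut_hi = a + (b - a) / 2.0
    else:
        if xj < b:                           # Case 3: split between integers
            cut_lo, cut_hi = math.floor(xj), math.floor(xj) + 1
        else:                                # Case 4: peel off the end point
            cut_lo, cut_hi = b - 1, b
    left  = box[:j] + [(a, cut_lo)] + box[j + 1:]
    right = box[:j] + [(cut_hi, b)] + box[j + 1:]
    return left, right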

With these accelerating tools, we are now in a position to present the pro-
posed algorithm as illustrated in Fig. 1.

The details of the proposed interval algorithm for global optimization are pre-
sented next.

Step 1. In this step, input data are prepared in a suitable form. First, the initial
search box (B, which is given by the problem statement) is placed as the
first element in a list L. This list will acquire additional elements as the
algorithm proceeds. In addition, one has to specify ε (the desired width
of the final box), δ (the desired accuracy in the range of the objective
function over the final box) and α (the minimum width of a box below
which the distrust-region method will not be implemented). Furthermore,
lower and upper bounds on the value of the objective function at the global
solution (referred to as lwbd and upbd, respectively) are evaluated.
As has been previously described, lwbd can be obtained via the methods
proposed by Manousiouthakis and Sourlas (1992) or McCormick (1976).
On the other hand, upbd can be taken as the value of the objective function
at a local minimum.
If all of the optimization variables are integers, then ε and δ are each
specified to be zero for termination of the algorithm.

Step 2. Designate the largest box in list L as the active box. If it has a width less
than or equal to ε and the range of the objective function over the box is
less than or equal to δ, go to Step 13. Otherwise, go to Step 3.

Step 3. Relax the integer constraints and assume that the discrete variables can
take any real value within the bounds specified by the box. The discrete
nature of these variables will be reconciled while performing the box
partitioning in Step 11. Go to Step 4.

Figure 1 Proposed algorithm for global optimization of MINLP.

Step 4. Apply the upper bound test to the active box. If the active box is deleted,
remove it from list L and go to Step 2. If the active box is not deleted, go
to Step 5.
Step 5. Apply the lower bound test to the active box. If the active box is deleted,
remove it from list L and go to Step 2. If the active box is not deleted,
proceed to Step 6.

Step 6. Apply the infeasibility test. If the active box is completely infeasible, delete
it from list L and go to Step 2. Otherwise, go to Step 7.
Step 7. If the width of the active box is greater than α, go to Step 12. Otherwise
go to Step 8.
Step 8. If the active box is certainly strictly feasible, go to Step 9. Otherwise, go
to Step 11.
Step 9. Apply the monotonicity test. If the active box is monotonic, add the end
point (which yields the lower value of the objective function) to list L
while deleting the rest of the active box from the list and go to Step 2.
Otherwise, go to Step 10.

Step 10. Apply the nonconvexity test. If the interior of the active box can be
deleted, remove the active box from list L and add its exterior alone to the
list. Then, go to Step 2. Otherwise, go to the next step.
Step 11. Obtain the constrained local minimum (using a local optimizer), with the
integer constraints imposed, in the active box. Apply the discretization
procedure discussed earlier in Section 4.3 to partition the box around the
local minimum. Remove the active box from list L and add the two new
sub-boxes to it. If the objective function value of the constrained local
minimum is less than the current upbd, then update upbd. Go to Step 2.

Step 12. Choose an infeasible point in the active box and apply the distrust-region
method to it. Delete the active box from list L and add the sub-boxes that
are created after deleting the distrust sub-box in the active box. Go to
Step 2. If there is no infeasible point in the active box, go to Step 8.
Step 13. If all the variables involved in the problem are integers, then go to Step 14.
Otherwise, the remaining boxes in the list L contain the global solution
and the algorithm is, therefore, terminated.
Step 14. The remaining boxes in the list L are all of zero width and therefore,
actually represent a finite number of points in the search space. These
points are then screened for feasibility with respect to the constraints of
the original problem. The objective function is then evaluated at the
feasible points. The point(s) that gives the least value for the objective
function is the global solution. The algorithm is then terminated.
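A compact sketch of the overall loop is given below. The objective inclusion F, the local solver local_min and the integer flags is_int are assumed to be supplied by the user, Steps 6-10 are indicated only as comments, and the shifted_partition routine sketched in Section 4.3 is reused.

def width(box):
    return max(b - a for a, b in box)

def interval_minlp(B, F, local_min, is_int, eps, delta, upbd, lwbd):
    boxes = [B]                                   # Step 1: list L
    done = []
    while boxes:
        boxes.sort(key=width, reverse=True)
        X = boxes.pop(0)                          # Step 2: largest box first
        LBX, UBX = F(X)
        if width(X) <= eps and UBX - LBX <= delta:
            done.append(X)                        # Steps 13-14 screen these
            continue
        if LBX > upbd or UBX < lwbd:              # Steps 4 and 5: delete box
            continue
        # Steps 6-10 (infeasibility, distrust-region, monotonicity and
        # nonconvexity tests) would be applied here before partitioning.
        x, f_loc = local_min(X)                   # Step 11: local search
        upbd = min(upbd, f_loc)
        left, right = shifted_partition(X, x, is_int)
        boxes.extend([left, right])
    return done, upbd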

5 ILLUSTRATIVE EXAMPLES
In order to demonstrate the applicability of the proposed algorithm, the fol-
lowing example problems are tackled.

Example 1 (Kocis and Grossmann, 1988; Floudas et al., 1989)

A simple MINLP problem that has been reported in the literature will be solved
first. Since the initial search space is quite small for this problem, it was suffi-
cient to partition the box at the midpoint, thereby eliminating the need to solve
for a local solution.

min f = 2x + y,

subject to,

−x² − y ≤ −1.25
x + y ≤ 1.6
0 ≤ x ≤ 1.6
y ∈ {0, 1}

This problem has one real variable (x) and one integer variable (y). The first
constraint is nonlinear and nonconvex in the real variable. The initial box B =
[0, 1.6], [0, 1] was used to search for the global optimum. The interval analysis
algorithm was applied to the problem with tolerances ε and δ on the width
of the solution box and the objective function being 0.000001 and 0.00001, re-
spectively. The global solution was found to be x = [0.5, 0.5] and y = [1, 1]. The
corresponding range of the objective function is [2, 2]. The computing time
was 0.1 s on a SUN Sparcstation 10.

Example 2 (Vaidyanathan and El-Halwagi, 1994b)


Next, we will discuss a molecular design problem whose objective is to syn-
thesize a polymer that meets a set of target properties. The molecular design
problem is formulated as a mixed integer nonlinear optimization program
whose objective function is a performance criterion for the designed molecule.
Constraints based on target properties, structural feasibility and designer speci-
fications are included. Property predictions are based on the group contribution
theory. In this case, the target properties specified are those of polystyrene.
Properties that are considered include glass transition temperature, density,
specific heat capacity, modulus of elasticity and molecular weight. The objec-
tive function is a weighted average of glass transition temperature and specific
heat capacity in the proportion 1:3.5. Twelve groups (five uni-valent, three
bi-valent, three tri-valent and one tetravalent) were chosen to represent the
initial search space. The groups along with their contributions to the vari-
ous properties are shown in Table 1. The non-linearity and non-convexity in
the program are introduced by the structural feasibility constraints and some
property constraints. The optimization program is represented by:

max ( Σ_{i=1}^{12} Yg_i X_i ) / ( Σ_{i=1}^{12} M_i X_i ) + 3.5 ( Σ_{i=1}^{12} Cp_i X_i )

subject to,

90 ≤ Σ_{i=1}^{12} M_i X_i ≤ 104

0.95 ≤ ( Σ_{i=1}^{12} M_i X_i ) / ( Σ_{i=1}^{12} V_i X_i ) ≤ 1.25

1.15 ≤ ( Σ_{i=1}^{12} Cp_i X_i ) / ( Σ_{i=1}^{12} M_i X_i ) ≤ 1.45

300 ≤ ( Σ_{i=1}^{12} Yg_i X_i ) / ( Σ_{i=1}^{12} M_i X_i ) ≤ 380

{X_1 + X_2 + X_3 + X_4 + X_5 + 2(X_6 + X_7 + X_8 + X_9)
+ 3(X_10 + X_11) + 4 X_12 − 6}(X_10 + X_11) ≤ 0

X_5 + X_8 ≤ 1

X_i ∈ [0, 3] for i = 1, 2, ..., 12

X_i integer for i = 1, 2, ..., 12

Table 1 List of groups and their contributions towards the various properties
used in Example 2

group     id     M_i        Vw_i        U_Hi*    Cp_i        Yg_i        ΔG_i
                 (g/mole)   (cm³/mol)            (J/mol/K)   (K·g/mol)   (J/mol)
CH3       X1     15         13.6        1130     30.9        6100        −4600 + 95T
Cl        X2     35.5       11.6        1000     27.1        17500       −49000 − 9T
C6H5      X3     77         45.9        3650     85.6        34200       87000 + 167T
COOH      X4     45         18.6        1100     50          13300       −393000 + 118T
CH3COO    X5     59         28.9        2030     75.9        18600       −383000 + 211T
CH2       X6     14         10.2        675      25.3        2700        −22000 + 102T
CH2COO    X7     58         25.4        1575     71.3        15200       359000 + 218T
CHNH2     X8     29         18          1920     36.5        9700        8800 + 222T
C6H4      X9     76         43.3        3300     78.8        29500       100000 + 180T
CH        X10    13         6.8         370      15.6        1900        −2700 + 120T
C6H4      X11    75         40.8        2900     72.0        24800       113000 + 178T
C         X12    12         3.3         35       6.2         5500        20000 + 140T

The problem formulated above was then solved using the proposed interval
analysis algorithm. Since all the optimization variables are integers, the toler-
ance on the width of the solution box ε was taken as 0 and the tolerance of the
objective function inclusion δ was, therefore, taken to be 0 as well. By applying
our interval algorithm, we obtained the following solution:

X_1 = [1, 1], X_5 = [1, 1], X_6 = [1, 1], X_12 = [1, 1],

with all the other variables being zero.

The identified global solution is polymethylmethacrylate (PMMA) with an ob-
jective function value of 816.6 at the optimum. The computing time was 14 s on
a SUN Sparcstation 10. The same problem was then solved using the software
GINO (Liebman et al., 1986), which identified a local solution that is very
close to the global solution (objective function at the optimum is 815.8). This
local solution is polystyrene, which meets the required properties.

6 CONCLUSIONS
A general interval-based global optimization algorithm has been developed to
solve MINLPs. The algorithm utilizes integer discretization and search accel-
erating tools to eliminate sub-optimal sub-spaces from the search domain. The
solutions provided by this algorithm are guaranteed to be globally optimal.
Illustrative examples demonstrate the applicability of the proposed procedure
to nonconvex mixed integer nonlinear programs.

Acknowledgement
The financial support of the NSF (grant NSF-NYI-CTS-9457013) is gratefully
acknowledged.

REFERENCES
[1] Alefeld G. and J. Herzberger. 1983, "Introduction to Interval Computations", Academic Press, New York.

[2] Bagajewicz M. and V. Manousiouthakis. 1991, "On the Generalized Benders Decomposition", Computers chem. Engng, 15, 10.

[3] Floudas C. A. and P. M. Pardalos. 1990, "A Collection of Test Problems for Constrained Global Optimization Algorithms", Lecture Notes in Computer Science, Vol. 455, pp. 29-30, Springer-Verlag, New York.

[4] Floudas C. A. and P. M. Pardalos. 1992, "Recent Advances in Global Optimization", Princeton University Press, Princeton, New Jersey.

[5] Floudas C. A. and V. Visweswaran. 1993, "A Primal-Relaxed Dual Global Optimization Approach", J. Opt. Theory Applic., 78, 2, 87.

[6] Floudas C. A. and I. E. Grossmann. 1994, "Algorithmic Approaches to Process Synthesis: Logic and Global Optimization", Foundations of Computer-Aided Process Design, Snowmass Village, Colorado.

[7] Geoffrion A. M. 1972, "Generalized Benders Decomposition", J. Opt. Theory Applic., 10, 237.

[8] Hansen E. R. 1980, "Global Optimization Using Interval Analysis-the Multidimensional Case", Numer. Math., 34, 247.

[9] Hansen E. R. 1992, "Global Optimization Using Interval Analysis", Marcel Dekker, Inc., New York.

[10] Horst R. 1990, "Deterministic Methods in Constrained Global Optimization: Some Recent Advances and New Fields of Application", Naval Res. Logist., 37, 433.

[11] Ichida K. and Y. Fujii. 1979, "An Interval Arithmetic Method of Global Optimization", Computing, 23, 85.

[12] Kocis G. R. and I. E. Grossmann. 1988, "Global Optimization of Nonconvex Mixed-Integer Nonlinear Programming (MINLP) Problems in Process Synthesis", Ind. Eng. Chem. Res., 27, 1407.

[13] Liebman J., L. Lasdon, L. Schrage and A. Waren. 1986, "Modeling and Optimization with GINO", The Scientific Press, CA.
[14] Manousiouthakis V. and D. Sourlas. 1992, "A Global Optimization Approach to Rationally Constrained Rational Programming", Chem. Engng Commun., 115, 127.

[15] Maranas C. D. and C. A. Floudas. 1994, "Global Minimum Potential Energy Conformations of Small Molecules", Journal of Global Optimization, Vol. 4, No. 2, 135.

[16] McCormick G. P. 1976, "Computability of Global Solutions to Factorable Nonconvex Programs: Part I-Convex Underestimation Problems", Math. Prog., 10, 147.

[17] Moore R. E. 1966, "Interval Analysis", Prentice-Hall, Englewood Cliffs, NJ.

[18] Moore R. E. 1979, "Methods and Applications of Interval Analysis", SIAM, Philadelphia.

[19] Moore R., E. R. Hansen and A. Leclerc. 1992, "Rigorous Methods for Global Optimization", Recent Advances in Global Optimization (Edited by C. A. Floudas and P. M. Pardalos), p. 321, Princeton University Press, Princeton.

[20] Ratschek H. and J. Rokne. 1984, "Computer Methods for the Range of Functions", Ellis Horwood, Chichester.

[21] Ratschek H. and J. Rokne. 1991, "Interval Tools for Global Optimization", Computers Math. Applic., 21, 41.

[22] Ryoo H. S. and N. V. Sahinidis. 1993, "Global Optimization of Nonconvex NLPs and MINLPs with Applications in Process Design", Technical Report, Department of Mechanical and Industrial Engineering, University of Illinois at Urbana-Champaign, Urbana, IL.

[23] Sahinidis N. V. and I. E. Grossmann. 1991, "Convergence Properties of Generalized Benders Decomposition", Computers chem. Engng, 15, 481.

[24] Sengupta S. 1981, "Global Nonlinear Constrained Optimization", Ph.D. Dissertation, Washington State University.

[25] Stephanopoulos G. and A. W. Westerberg. 1975, "The Use of Hestenes' Method of Multipliers to Resolve Dual Gaps in Engineering System Optimization", JOTA, 15, 285.

[26] Swaney R. E. 1990, "Global Solution of Algebraic Nonlinear Programs", AIChE Annual Meeting, Chicago, IL.

[27] Vaidyanathan R. and M. El-Halwagi. 1994a, "Global Optimization of Nonconvex Nonlinear Programs via Interval Analysis", Computers chem. Engng, 18, 889.

[28] Vaidyanathan R. and M. El-Halwagi. 1994b, "Bounding Methods of Interval Analysis for Global Optimization", AIChE Annual Meeting, San Francisco.

[29] Visweswaran V. and C. A. Floudas. 1990, "A Global Optimization Procedure (GOP) for Certain Classes of Nonconvex NLPs-II. Application of Theory and Test Problems", Computers chem. Engng, 14, 1419.

[30] Zwart P. B. 1974, "Global Maximization of a Convex Function with Linear Inequality Constraints", Oper. Res., 22, 602-609.
7
PLANNING OF CHEMICAL
PROCESS NETWORKS VIA
GLOBAL CONCAVE
MINIMIZATION
Ming-Long Liu*,
Nikolaos V. Sahinidis** and J. Parker Shectman
Department of Mechanical & Industrial Engineering
The University of Illinois at Urbana-Champaign
1206 West Green Street
Urbana, Illinois 61801

* Department of Mathematical Science, National Chengchi University, Taipei, Taiwan, R.O.C.
** Address all correspondence to this author (e-mail: nikos@uiuc.edu).

ABSTRACT
The problem of selecting processes and planning expansions of a chemical complex to
maximize net present value has been traditionally formulated as a multiperiod, mixed-
integer linear program. In this paper, the problem is approached using an entirely
continuous model. Compared to previous models, the proposed formulation allows
for more general objective functions. In solving the continuous model, minimizing its
nonconvex objective function poses the major obstacle. We overcome this obstacle by
means of a branch-and-bound global optimization algorithm that exploits the concavity
and separability of the objective function and the linearity of the constraint set.
The algorithm terminates with the exact global optimum in a finite number of iterations.
In addition, computational results demonstrate that the proposed algorithm is
very efficient as, for a number of problems from the literature, it outperforms OSL,
a popular integer programming package. We also develop a procedure for generating
test problems of this kind.

1 INTRODUCTION
The process industry, now a multi-billion dollar international enterprise, com-
prises enormous amounts of natural resources, chemicals, personnel and equip-
ment. Despite the expected growth in the demand of chemicals, the industry
is becoming increasingly competitive while customer demands impose a sig-
nificant complexity on production requirements. This trend necessitates the
development of efficient optimization techniques for planning process opera-
tions.

Consider the problem of designing a profit-maximizing network of chemical
operations. In approaching the problem, we first compile a list of chemicals that
we consider producing for sale. To this list we add any salable by-products of
producing each chemical, as well as the ingredients necessary for the production
of each chemical. We might then contemplate the in-house production of some
of the required ingredients, forcing us to consider another tier of ingredients
and by-products. The listing continues until we have considered all processes
which may relate to the ultimate production of the products initially proposed
for sale. At this point the final list of chemicals will contain all raw materials we
consider purchasing from the market, all products we consider offering for sale
on the market, and all possible intermediates. In fact, the problem of process
planning may require us to account for the presence of multiple markets.

Each of the final and intermediate products may be output from one or more
processes that reflect different technological recipes. Choosing from among dif-
ferent technological alternatives leads to a problem that grows combinatorially
with the number of potential products and processes. An additional complicat-
ing factor is the matter of when to expand the capacities of the processes. As
market demands, investment and operating costs fluctuate, one would like to
time capacity expansions in a way that takes into account economies of scale
and market dynamics.

In existing literature, researchers have tackled the problem as a mixed-integer
linear program (MILP) [9, 8, 4]. The main reason for introducing 0-1 variables
in these formulations is to model economies of scale by adding fixed-charges in
the investment costs whenever capacity expansions take place.

The approach taken in this paper is to model economies of scale directly, rep-
resenting costs by univariate concave functions. In this way, the formulations
avoid the use of binary variables. This reformulation allows us to solve the
problem using a concave programming (CP) approach. In addition to elimi-
nating binary variables, the CP approach permits us to solve planning problems
with more realistic cost functions.

In the remainder of the paper, we describe two different CP approaches to the
problem. First, however, in Section 2 we introduce the major facets of process
planning by providing a general model. Section 3 describes a fixed-charge CP
approach, while Section 4 introduces a CP approach with continuous nonlinear
objective functions. In these sections, we answer the questions:

How is the CP model of process planning more realistic than the traditional MILP model?
Why are several terms in the proposed model concave?
What is the specific nature of the concavity in each objective function, and what are the economic forces which cause it?

Then, in Section 5, we explain the aspects of CP solution through branch-and-
bound global optimization. The algorithm is driven by novel branching rules,
in the framework of the more general CP approach proposed by Shectman and
Sahinidis [10]. In specializing the algorithm for the planning problem, we also
incorporate a number of techniques traditionally used in integer programming
theory, such as pseudo-costs. Section 6 offers a procedure for generating test
problems for planning models. In Section 7, we relate extensive computational
experience with the models and algorithms described in this paper. We report
test results for an implementation that uses BARON [6], a recently developed
general-purpose software package for branch-and-bound. Finally, conclusions
are drawn in Section 8.

2 GENERAL MODEL OF PLANNING PROBLEM WITH EXPANSIONS

The set of functions and parameters used in the model includes forecasts of
demands for final products, availability of raw materials, and sale and purchase
prices of chemicals, as well as forecasts of investment and operating costs over
a long-range horizon.

2.1 Indices
i The network is composed from a set of NP processes (i = 1, ..., NP).
j Streams of NC chemicals (j = 1, ..., NC) may be exchanged by the processes.
l A set of NM markets are involved (l = 1, ..., NM).
t We consider production over the course of NT time periods (t = 1, ..., NT).

2.2 Parameters
a^L_jlt, a^U_jlt The model allows for the purchase of between a^L_jlt and a^U_jlt units of
chemical j from market l during period t.
d^L_jlt, d^U_jlt, γ_jlt The model incorporates the prediction that we will be able to sell
between d^L_jlt and d^U_jlt units of chemical j to market l during period
t, at a forecasted price of γ_jlt per unit.
μ^I_ij, μ^O_ij The input and output chemical proportionality constants used
for mass balances.

2.3 Variables
E_it The production capacity of process i is expanded by E_it units at the beginning of period t.
P_jlt units of chemical j are purchased from market l at the beginning of period t.
Q_it The total capacity of process i during period t.
S_jlt units of chemical j are sold on market l at the end of period t.
W_it The actual operating level of process i during period t.

2.4 Functions
INVT_it(E_it) The amount invested in process i during period t including establishment or expansion of the process (but not operating costs). The function may include fixed-charges for the establishment and each subsequent expansion of the process, as well as variable costs that depend on E_it.
OPER_it(W_it) The total cost of operating process i over period t as a function of the operating level, W_it.
PURC_jlt(P_jlt) The total value of purchases of chemical j from market l during period t as a function of P_jlt.

2.5 Mathematical Programming Problem

Formulation 1-General

max NPV = − Σ_{i=1}^{NP} Σ_{t=1}^{NT} INVT_it(E_it) − Σ_{i=1}^{NP} Σ_{t=1}^{NT} OPER_it(W_it)
+ Σ_{j=1}^{NC} Σ_{l=1}^{NM} Σ_{t=1}^{NT} (γ_jlt S_jlt − PURC_jlt(P_jlt)) (7.1)

subject to

Q_it = Q_{i,t−1} + E_it, i = 1, ..., NP; t = 1, ..., NT (7.2)

W_it ≤ Q_it, i = 1, ..., NP; t = 1, ..., NT (7.3)

Σ_{l=1}^{NM} P_jlt + Σ_{i=1}^{NP} μ^O_ij W_it = Σ_{l=1}^{NM} S_jlt + Σ_{i=1}^{NP} μ^I_ij W_it,
j = 1, ..., NC; t = 1, ..., NT (7.4)

a^L_jlt ≤ P_jlt ≤ a^U_jlt, j = 1, ..., NC; l = 1, ..., NM; t = 1, ..., NT (7.5)

d^L_jlt ≤ S_jlt ≤ d^U_jlt, j = 1, ..., NC; l = 1, ..., NM; t = 1, ..., NT (7.6)

E_it, Q_it, W_it ≥ 0, i = 1, ..., NP; t = 1, ..., NT (7.7)

The objective function seeks to maximize the net present value NPV of the
process plan, considering investment, operation and purchase costs as well as
sales revenues. The set of mass balances, (7.4), reflects the technological recipe
for producing chemical j by means of process i. For simplicity, we assume that
mass balances can be expressed as equations that are linear in terms of the
operating level of the process.

3 FIXED-CHARGE MODELING
The fixed-charge CP formulation of the planning problem is essentially equiv-
alent to the traditional MILP formulation. Thus we begin by describing the
MILP.

3.1 MILP Model of Planning Problem with Expansions

The following parameters, variables, and functions are used in addition to those
employed in the general model.

3.1.1 Supplementary Parameters


α_it The per unit cost of expanding process i at the beginning of period t.
β_it The fixed cost of establishing or expanding process i at the beginning of period t.
Γ_jlt The forecasted price for purchasing a unit of product j from market l.
δ_it The unit production cost to operate process i during period t.
E^L_it, E^U_it The model constrains the capacity expansion of process i to be between E^L_it and E^U_it units during period t.
3.1.2 Additional Variables
Y_it A 0-1 integer variable. If process i is expanded during period t,
then Y_it = 1. If not, then Y_it = 0.

3.1.3 Formulation 2-MILP

max NPV = − Σ_{i=1}^{NP} Σ_{t=1}^{NT} (α_it E_it + β_it Y_it) − Σ_{i=1}^{NP} Σ_{t=1}^{NT} δ_it W_it
+ Σ_{j=1}^{NC} Σ_{l=1}^{NM} Σ_{t=1}^{NT} (γ_jlt S_jlt − Γ_jlt P_jlt) (7.8)

subject to Constraints (7.2)-(7.7) and

Y_it E^L_it ≤ E_it ≤ Y_it E^U_it, i = 1, ..., NP; t = 1, ..., NT (7.9)

Y_it ∈ {0, 1}, i = 1, ..., NP; t = 1, ..., NT (7.10)

Sahinidis et al. [9] explore branch-and-bound, cutting planes, and Benders
decomposition as solution techniques for this MILP formulation. Sahinidis
and Grossmann [8] develop alternative MILP formulations, which tighten the
bounds of the linear programming relaxation by introducing additional vari-
ables and constraints. Liu and Sahinidis [5] eliminate these reformulation vari-
ables by first using a projection approach before generating the cutting planes.
A branch-and-cut algorithm was subsequently suggested for utilizing these cut-
ting planes (Liu and Sahinidis [4]).

3.2 Fixed-Charge Concave Programming Approach
The fixed-charge CP approach eliminates the necessity of binary variables by
using affine functions to represent fixed-costs in the investment term of the
objective. Each affine investment function includes a fixed-charge β_it for the
initial construction and each subsequent expansion of the process, as well as a
cost that varies linearly with E_it by a coefficient α_it:

INVT_it(E_it) = { 0, when E_it = 0
              { α_it E_it + β_it, when E_it > 0. (7.11)

Note that this function is concave in E_it. The other terms in the objective
function are retained from the MILP formulation, since these terms do not
involve integer variables. The fixed-charge CP model also includes constraints
(7.2)-(7.7). Naturally, constraints (7.9) and (7.10) are dropped since they ad-
mit binary variables which are not used here. Instead, the proposed solution
algorithm directly handles lower and upper bounds on individual expansions
by enforcing them during branching (see Section 5.5.4). Hence, the complete
model of the planning problem using the fixed-charge CP approach is:

3.2.1 Formulation 3-FCP

min f = −NPV = Σ_{i=1}^{NP} Σ_{t=1}^{NT} INVT_it(E_it) + Σ_{i=1}^{NP} Σ_{t=1}^{NT} δ_it W_it
− Σ_{j=1}^{NC} Σ_{l=1}^{NM} Σ_{t=1}^{NT} (γ_jlt S_jlt − Γ_jlt P_jlt) (7.12)

subject to Constraints (7.2)-(7.7),

where INVT_it are as defined in (7.11). Note that minimizing f in (7.12) is
equivalent to maximizing NPV. In comparing the MILP and FCP formula-
tions one should observe that the FCP formulation involves fewer variables and
constraints due to the elimination of binary variables. Although the objective
function has now become nonlinear, a linear programming relaxation of this
formulation can be easily constructed as will be shown in Section 5.

4 MODELING OBJECTIVES WITH CONTINUOUS FUNCTIONS

4.1 Reasons
To describe the net present value of a process plan, CP holds another option
which may in many instances reflect the economic reality of industrial oper-
ations better than model FCP. In particular, the individual functions in the
objective constitute three reasons why the use of continuous concave functions
to model costs and revenues enables us to solve a more realistic model. One
may safely assume that the costs of operating a process, expanding a process
capacity, and purchasing raw materials all involve economies of scale. MILP
models force one to assume that these costs are directly proportional to the
amount contracted, but in reality, the per unit cost decreases as the number of
units increases. Hence, the general form of the continuous concave objective is
the same as that of (7.1).

4.2 Model
In order to conduct computational experiments comparing a continuous CP
model to FCP, we have investigated the particular form:

NPV = − Σ_{i=1}^{NP} Σ_{t=1}^{NT} a_it E_it^{b_it} − Σ_{i=1}^{NP} Σ_{t=1}^{NT} δ_it W_it
+ Σ_{j=1}^{NC} Σ_{l=1}^{NM} Σ_{t=1}^{NT} (γ_jlt S_jlt − Γ_jlt P_jlt), (7.13)

where the OPER and PURC terms match those found in FCP (7.12), but the
INVT term has been changed from a fixed-charge form to the continuous form

INVT_it(E_it) = a_it E_it^{b_it}, (7.14)

where a_it > 0, and 0 < b_it < 1 for i = 1, ..., NP, and t = 1, ..., NT. Note that
(7.14) estimates investment by applying power-factor scaling to plant capacity.
We come to our working form of the continuous CP model of the planning
problem:

4.2.1 Formulation 4-CCP

min f = −NPV from (7.13)

subject to Constraints (7.2)-(7.7).

Remark 1: It is obvious that one can develop a CP model that involves any
combination of the objective function terms of models FCP and CCP. In this
way, fixed-charges and power functions can be brought together into a more
comprehensive CP model, since, e.g., expansion of a process capacity may
require a fixed reinvestment expense plus a variable cost that is itself concave
in the amount of the expansion.

Remark 2: In the above, the sales revenue term in the objective function has
been assumed to be linear for simplicity of the presentation. In reality, this
term is likely to exhibit diseconomies of scale as prices will fall with increased
amounts of production. This would lead to a nonlinear yet convex term in the
minimization problem to be solved.

Remark 3: As with model FCP, lower and upper limits on the size of expansions
can be enforced by the algorithm as bounds (Section 5.5.4).

5 BRANCH-AND-BOUND GLOBAL OPTIMIZATION ALGORITHM

5.1 Outline of Algorithm

In this section we outline the branch-and-bound algorithm that we use for
concave programming, with particular detail to the novel branching operations.
In doing so, we will mention the use of pseudo-costs for selection of branching
variables. Finally, we will contrast the rules used for branching in the fixed-
charge and continuous approaches.

We refer to the algorithm as a branch-and-reduce algorithm, meaning one
which combines standard branch-and-bound with specialized acceleration de-
vices known as domain reduction techniques [7, 6]. As the algorithm progresses,
these techniques yield increasingly tighter reformulations of the subproblems
solved in the course of the branch-and-bound search.

'Branch-and-bound' denotes a family of global optimization methods which
operate by branching, that is by dividing the original problem into subproblems
of smaller domain, which are recorded in a list. In each iteration, the procedure
selects a set of these subproblems for bounding, a process that generates a
numerical interval, consisting of an upper and a lower bound, in which the
optimal value of a subproblem must lie. The algorithm can then utilize this
information in its search for the global minimum. Since the global minimum
must lie between the least of the lower bounds and the least of the upper
bounds, the algorithm may delete any subproblem which has an associated
lower bound that is larger than or equal to the least upper bound.

The procedure will now be formally outlined. L will represent the least of
the lower bounds, U the least of the upper bounds. The algorithm will
view the problem constraints as the intersection of two distinct sets. D will
denote the problem constraints that are not orthogonal to variable axes, e.g., for
FCP and CCP, constraints (7.2)-(7.4). G will denote those problem constraints
which are simple bounds, (7.5)-(7.7), and any desired bounds on budget, the
number of expansions, or the size of individual expansions for FCP and CCP.
In general, G will symbolize a hyperrectangle orthogonal to the variable axes.
For convenience, x will represent the vector of all the problem variables, i.e.,
x = [E, W, S, P]. The major operations (preprocessing, selection, branching,
and bounding), which are italicized in the statement of the algorithm, will be
presented in full detail in the sequel.

Initialization Preprocess the problem constraints D ∩ G to form a bounded
initial hyperrectangle G^0. Add the problem {min f(x) s.t. x ∈ D ∩ G^0} to
the list S of open subproblems. Choose a positive integer N < ∞ to be
used in branching.
Let k ← 0. At each iteration k of the algorithm,
do (Step k).
Step k.1. Select a subproblem s_k, defined as {min f(x) s.t. x ∈
D ∩ G^{s_k}}, from the list S of currently open subproblems. Set S ←
S \ {s_k}.
Step k.2. Bound the optimal value of subproblem s_k from above and
below, i.e., find f̄^{s_k} and f̲^{s_k} satisfying f̄^{s_k} ≥ {min f(x) s.t. x ∈
D ∩ G^{s_k}} ≥ f̲^{s_k}. By convention, f̄^{s_k} = f̲^{s_k} = +∞ if D ∩ G^{s_k} = ∅ (s_k
is infeasible). If f̄^{s_k} < +∞, a feasible point x^{s_k} ∈ D ∩ G^{s_k} such that
f(x^{s_k}) = f̄^{s_k} will be found in the process.
Step k.2.a. U ← min_{s∈S} f̄^s; L ← min_{s∈S} f̲^s.
Step k.2.b. If U = L, then terminate with optimal solution.
Step k.2.c. S ← S \ {s s.t. f̲^s ≥ U} (fathoming rule). If f̲^{s_k} ≥ U,
then goto Step k.1 (select another subproblem).
Step k.3. Branch, partitioning G^{s_k} into two subspaces G^{s_k1} and G^{s_k2}.
Partitioning means that G^{s_k1} ∪ G^{s_k2} = G^{s_k} and G^{s_k1} ∩ G^{s_k2} = ∂G^{s_k1} ∩
∂G^{s_k2}. S ← S ∪ {s_k1, s_k2}, (i.e., append the two subproblems
{min f(x) s.t. x ∈ D ∩ G^{s_k1}} and {min f(x) s.t. x ∈ D ∩ G^{s_k2}} to
the list of open subproblems).
For selection purposes, let f̲^{s_k1} = f̲^{s_k2} = f̲^{s_k}.
Let k ← k + 1, and goto Step k.1.
end do

5.2 Preprocessing
The algorithm begins by solving NP linear programs:

max Σ_{t=1}^{NT} W_it s.t. D ∩ G, i = 1, ..., NP,

letting W*_it denote their respective solutions. For each process i, the method
computes

B_i = max_{1≤t≤NT} W*_it,

then sets

G^0 = G ∩ ∩_{i=1}^{NP} ∩_{t=1}^{NT} {W_it : W_it ≤ W*_it} ∩ ∩_{i=1}^{NP} ∩_{t=1}^{NT} {Q_it : Q_it ≤ B_i},

which yields an equivalent, but bounded formulation of the initial problem.

5.3 Selection
In Step k.1. of each iteration k, the procedure selects for bounding a single sub-
problem from the list of open subproblems, specifically a subproblem which
has the best bound, i.e., the algorithm employs the rule:

Select any s_k ∈ S such that f̲^{s_k} = L.

Of course, in Step 0.1., the initial problem {min f(x) s.t. x ∈ D ∩ G^0} is selected
by default.

5.4 Bounding [Step k.2. of the Algorithm]


The algorithm determines bounds on each concave subproblem (Step k.2.) by
solving a linear programming relaxation. For each univariate concave term
INVT_it(E_it) in the objective, the procedure first constructs the linear under-
estimator, call it g_it(E_it), that intersects function INVT_it(E_it) at the current
bounds l^{s_k}_it and u^{s_k}_it of E_it. In other words,

g_it(E_it) = INVT_it(l^{s_k}_it)
+ [ (INVT_it(u^{s_k}_it) − INVT_it(l^{s_k}_it)) / (u^{s_k}_it − l^{s_k}_it) ] (E_it − l^{s_k}_it). (7.15)

Using the well-known fact that g_it(E_it) is the convex envelope of INVT_it(E_it)
on [l^{s_k}_it, u^{s_k}_it], the distributive property of convex envelopes applies to the entire
nonlinear portion of the objective. Hence, the convex envelope of Σ_{i=1}^{NP} Σ_{t=1}^{NT}
INVT_it(E_it) over G^{s_k} is g^{s_k}(E) = Σ_{i=1}^{NP} Σ_{t=1}^{NT} g^{s_k}_it(E_it). Let ω^{s_k} = [E*, W*, S*,
P*] be a solution to the LP relaxation

f̲^{s_k} = min g^{s_k}(E) + Σ_{i=1}^{NP} Σ_{t=1}^{NT} δ_it W_it − Σ_{j=1}^{NC} Σ_{l=1}^{NM} Σ_{t=1}^{NT} (γ_jlt S_jlt − Γ_jlt P_jlt)
s.t. [E, W, S, P] ∈ D ∩ G^{s_k}.

For the optimal value of the concave program {min f(x) s.t. x ∈ D ∩ G^{s_k}}, f̲^{s_k}
gives a lower bound, while f̄^{s_k} = f(ω^{s_k}) gives an upper bound.
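In code, the underestimator (7.15) is simply the secant of the concave term between the current bounds. The sketch below uses illustrative power-form data (a_it = 2.0, b_it = 0.6); the same construction applies to the fixed-charge form (7.11).

def secant_underestimator(invt, lo, hi):
    """Return the linear g of (7.15): g agrees with invt at lo and at hi."""
    slope = (invt(hi) - invt(lo)) / (hi - lo)
    return lambda E: invt(lo) + slope * (E - lo)

# illustrative power-form investment term a*E^b with a = 2.0, b = 0.6
invt = lambda E: 2.0 * E ** 0.6
g = secant_underestimator(invt, 1.0, 16.0)
print(invt(4.0), g(4.0))   # about 4.59 versus 3.71: invt lies above g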

5.5 Branching [Step k.3. of the Algorithm]


In the branching step of each iteration k, the partitioning rule splits the domain
G^{s_k} of subproblem s_k into two smaller subdomains. The rule devises the split in
two stages. First, from among the set of variables that correspond to nonlinear
objective function terms, the partitioning rule selects a branching variable E_{î t̂}.
The rule then determines a branching point p within the current domain
of the selected variable. The algorithm uses different branching point and
branching variable selection criteria for problems FCP and CCP.

5.5.1 FCP Branching Scheme


Branching variable selection: In an endeavor to select a branching variable
which will induce the largest change in the objective function, a well-known
method in integer programming (e.g., [2]) is to maintain a tally of pseudo-costs
in order to assist the algorithm in its navigation down the branch-and-bound
tree. Each time the algorithm branches on variable E_it at some subproblem s, it
calculates the left and right pseudo-costs lpc^s_it and rpc^s_it, defined as the absolute
difference between the lower bound of subproblem s and the respective lower
bounds of its left and right child subproblems, sl and sr:

lpc^s_it = |f̲^{sl} − f̲^s|, rpc^s_it = |f̲^{sr} − f̲^s|.

Over the course of the search, the procedure keeps running averages LPC_it
and RPC_it of the pseudo-costs associated with each variable that is branched
on. Let S̄ represent the set of subproblems no longer open. Hence B_it = {s ∈
S̄ : (î_s, t̂_s) = (i, t)} is the set of subproblems at which the algorithm has branched on
variable E_it so far. Therefore:

LPC_it = (1/|B_it|) Σ_{s∈B_it} lpc^s_it, RPC_it = (1/|B_it|) Σ_{s∈B_it} rpc^s_it,

where |B_it| is the cardinality of set B_it.

Suppose that the algorithm has just computed bounds for the current subprob-
lem and must now select a variable on which to branch. Intuition suggests
splitting the subproblem domain in such a way that the two resultant sub-
problems exhibit smaller underestimation gaps than their parent. Figure 1(a)
illustrates the gap, or violation, between the concave objective term INVT_it of
problem s and its linear underestimator g^s_it for the E^s_it component of the relaxation
solution. To select a branching variable that will reduce the said violation, the
procedure can rely on its experience branching on various variables in other
portions of the search tree. In this regard, the average pseudo-costs measure
the potential for each branching variable to induce a gap-reduction. The metric

max_{i,t} {min{LPC_it, RPC_it}}

estimates the highest potential for a branching variable to result in two child
subproblems which both have reduced gaps. The branching rule must not rely
on precedent alone, however. Instead, the branching rule must also consider the
relative importance of each variable to the current subproblem s. The formula

(î, t̂) = arg max_{i,t} { (1/t) [INVT_it(E*_it) − g^s_it(E*_it)] × min{LPC_it, RPC_it} } (7.16)

scales each pseudo-cost by the contribution of the corresponding variable to
the present underestimation gap. Equation (7.16) also accounts for the fact
that earlier time periods have greater importance in the planning problem by
introducing a penalty of 1/t into the formula. To summarize, the branching
rule proposed here adopts the notion of pseudo-costs and specializes the idea for
problem FCP. Computational testing has proven this rule superior to a number
of others.
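The pseudo-cost bookkeeping behind rule (7.16) amounts to keeping per-variable running sums and counts. The data layout and names in the sketch below are illustrative, not those of the actual implementation.

from collections import defaultdict

sums = defaultdict(lambda: [0.0, 0.0])   # (i, t) -> [sum of lpc, sum of rpc]
count = defaultdict(int)                 # (i, t) -> |B_it|

def record_branch(i, t, lb_parent, lb_left, lb_right):
    sums[(i, t)][0] += abs(lb_left - lb_parent)    # left pseudo-cost
    sums[(i, t)][1] += abs(lb_right - lb_parent)   # right pseudo-cost
    count[(i, t)] += 1

def select_branching_variable(gaps):
    """gaps maps (i, t) -> INVT_it(E*_it) - g_it(E*_it) at the LP solution."""
    def score(key):
        i, t = key
        if count[key] == 0:
            return gaps[key] / t                   # no history for E_it yet
        lpc, rpc = (s / count[key] for s in sums[key])
        return gaps[key] / t * min(lpc, rpc)       # rule (7.16)
    return max(gaps, key=score)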

Branching point selection: Suppose we wish to branch at the current subprob-
lem, and that we have selected variable E^s_it > 0 on which to branch. Intuition
dictates that we split the domain of the 'parent' subproblem in such a way that
the two resulting 'child' subproblems exhibit smaller underestimation gaps than
their parent. In fact, when solving FCP, we can always branch in such a way
that the INVT_it terms in the objective will not contribute to the underestimation
gap in either child. We accomplish this by fixing E_it at zero in the left child
subproblem while constraining E_it to be strictly positive in the right child sub-
problem, as depicted in Figures 1(b) and (c). It follows from (7.11) that INVT_it
will be linear over the entire range of E_it in the right child, while throughout
the left child INVT_it will take the value zero. Hence, our choice of branching
point ensures that INVT_it will not contribute to the underestimation gap in
any descendant subproblems. Although strict inequalities cannot be handled
by LP techniques, this choice of branching point can be effectively enforced
in the right subproblem by uniformly imposing the fixed-charge while allowing
the capacity expansion to vary freely.

5.5.2 CCP Branching Scheme


In developing a branching scheme for problem FCP, we employed a number
of intuitive principles. The branching scheme for CCP also appeals to these
principles.
Figure 1 Relaxation and branching for fixed-charge concave programs: (a) relaxation and violation; (b) search tree; (c) relaxation after branching.



Branching variable selection: While we wish to reduce the net gap of the present
relaxation upon branching, at the same time, we wish to exploit the relative
potential of each individual variable to reduce the gap over its entire domain of
definition. We propose a rule that balances both considerations. Let mv_it rep-
resent the maximizer of the gap between the individual objective term corresponding
to variable E_it and its respective underestimator. Using (7.14) and (7.15), we
may analytically determine that:

mv_it = [ (u_it^{b_it} − l_it^{b_it}) / (b_it (u_it − l_it)) ]^{1/(b_it − 1)} (7.17)

is the said gap maximizer. For each variable, our composite variable selection
rule will weight the maximum gap over [l_it, u_it] by the gap contribution at E*_it,
its respective component of the current LP solution ω^s:

(î, t̂) = arg max_{i,t} { [INVT_it(mv_it) − g^s_it(mv_it)] × [INVT_it(E*_it) − g^s_it(E*_it)] }. (7.18)

Figure 2(a) illustrates the violation at E*_it (drawn arbitrarily), while Figure
2(b) illustrates the maximum gap, which occurs at mv_it.

Branching point selection: We will branch at the point p = mv_{î t̂} of maximum
violation for the selected branching variable E_{î t̂}. Note that branching at the
gap-maximizing point minimizes the collective area of underestimation for both
child subproblems, as shown by the shaded region in Figure 2(b). Compare this
to the shaded region in Figure 2(a). Clearly, choice of mv_it for the branching point
reduces the total gap area more than branching at the LP solution E*_it.
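Equation (7.17) is cheap to evaluate, which is what makes this branching point practical. The sketch below computes it for the illustrative power-form data used earlier (a = 2.0, b = 0.6 over [1, 16]): the violation a*E^b − g(E) is maximized where the derivative a*b*E^(b−1) equals the secant slope.

def max_violation_point(a, b, lo, hi):
    slope = a * (hi ** b - lo ** b) / (hi - lo)      # slope of the secant g
    return (slope / (a * b)) ** (1.0 / (b - 1.0))    # equation (7.17)

mv = max_violation_point(a=2.0, b=0.6, lo=1.0, hi=16.0)
print(mv)   # about 6.42, the branching point of Section 5.5.2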

5.5.3 Finiteness and the Modifying Branching Rule


Finiteness of CCP. A central feature of the branch-and-reduce variant of branch-
and-bound is its unconventional modifying branching rule [6, 10]. The rule,
which may override any existing branching scheme at any iteration of the al-
gorithm, is best described in [10], where the authors prove for any CP that
the branch-and-reduce algorithm will terminate in a finite number of iterations
with the exact global minimum. In addition, the modifying branching rule also
accelerates convergence when solving problems of the form CCP.
Figure 2 Branching point selection for power-cost problems: (a) division at relaxation solution; (b) division at maximum violation point.



Finiteness of FCP. The partitioning scheme for FCP branches on the appli-
cation of the fixed-charge, which ensures that the original, nonconvex planning
problem will eventually be reduced to a binary tree of linear subproblems, hav-
ing at most 2^{NP×NT} nodes. Since LP is itself a finite procedure, we can show,
without recourse to the modifying branching rule, that FCP terminates finitely.
Actually, the modifying branching rule can only decelerate the convergence of
FCP, hence it is not employed in the algorithm for FCP.

5.5.4 Direct Enforcement of Bounds


Usually problems include lower and upper bounds on the sizes of individual ex-
pansions similar to (7.9). In particular, lower bounds are frequently
included in order to avoid solutions with too many expansions that are too
small. In such cases, an MILP approach must necessarily carry the variable
lower and upper bounds throughout the search tree as rows of the constraint
matrix. These rows must be updated when solving the LP subproblems and
increase the size of the basis. On the other hand, the proposed algorithms for
FCP and CCP merely impose capacity expansion bounds as simple bounds on
the subproblems to which they apply. For instance, to enforce a strictly posi-
tive lower bound on variable E_it when solving an FCP, after branching on this
variable as described in Section 5.5.1, the lower bound need only be applied to
the right child subproblem and its descendants.

6 GENERATING TEST PROBLEMS

6.1 Introduction
The importance of process planning to the chemical industries motivates the de-
velopment of exact algorithms and heuristics to obtain optimal or near-optimal
process plans. Test problems with a variety of sizes, structures, and parameters
must be employed in any rigorous testing of such algorithms, and an automatic
test problem generator greatly facilitates this endeavor. This section develops
such a generator. When the numbers of processes and products are input to the
generator, it produces a desired number of problem instances having random
network structure and model parameters.

6.2 Feasible Process Networks and Bipartite Graphs
Given a number of chemicals and processes, a process network can be represented as a flow diagram that consists of arcs representing the flow of chemicals and nodes representing processes. Alternatively, one can view a process network as a bipartite graph. Let G = (V1 ∪ V2, E) be the bipartite graph with node sets V1 and V2 corresponding to the chemicals and processes, respectively, and a set E of directed arcs such that every e ∈ E joins some node of V1 to some node of V2. Each arc in this bipartite graph represents the relationship between one chemical and one process. A chemical relates to a process either as an input or as an output; a directed arc between a chemical node and a process node in the bipartite graph can represent the appropriate relationship. In general, not every bipartite graph yields a feasible process network. The condition for a feasible process network can be stated as follows:

For any bipartite graph G = (V1 ∪ V2, E), if every node in V1 has at least one in-arc or out-arc joining it to a node in V2, and every node in V2 has at least one in-arc joining it to a node in V1 and one out-arc joining it to a node in V1, then this bipartite graph is a feasible process network.

An example of a process network involving three processes and three chemicals is shown as a flow diagram in Figure 3(a). Figure 3(b) shows the corresponding bipartite representation. In the bipartite graph, each chemical has at least one arc which connects it to a process, and each process has at least one arc connecting it to a chemical input and at least one arc connecting it to a chemical output; a simple check of this condition is sketched below.
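The feasibility condition is straightforward to verify programmatically; the following is a minimal sketch, with the arc encoding following the ARC(j, i) convention introduced in Section 6.3.1 below (names are illustrative, indices 0-based):

    def is_feasible_process_network(nc, np_, arcs):
        """arcs: iterable of (j, i, d) with chemical j in V1, process i in V2,
        and d = +1 for an arc j -> i (input) or d = -1 for i -> j (output)."""
        chem_deg = [0] * nc
        proc_in, proc_out = [0] * np_, [0] * np_
        for j, i, d in arcs:
            chem_deg[j] += 1
            if d == 1:
                proc_in[i] += 1                # chemical feeds the process
            else:
                proc_out[i] += 1               # process produces the chemical
        return (all(deg >= 1 for deg in chem_deg)      # every chemical connected
                and all(n >= 1 for n in proc_in)       # every process has an input
                and all(n >= 1 for n in proc_out))     # ... and an output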

6.3 Process Network Test Problem Generation

6.3.1 Legend
The following symbols will be used to describe the generator.

i  index of processes.
j  index of chemicals.
k  counter for added arcs.
d  an indicator of arc direction; it takes a value of -1 or +1.
ARC(j, i)  the indicator of arcs:
    ARC(j, i) = +1, if there is one arc from node j to node i;
    ARC(j, i) = 0, if there is no arc between nodes j and i;
    ARC(j, i) = -1, if there is one arc from node i to node j.
CIN(j)  the indegree of node j in V1.
COUT(j)  the outdegree of node j in V1.
Density  a density control factor for the bipartite graph.
G = (V1 ∪ V2, E)  a bipartite graph with node sets V1 and V2 and arc set E.
MAXA  the maximum number of arcs for a feasible process network.
MINA  the minimum number of arcs for a feasible process network.
NA  the number of generated arcs.
NC  the number of chemicals, i.e., NC = |V1|.
NP  the number of processes, i.e., NP = |V2|.
PIN(i)  the indegree of node i in V2.
POUT(i)  the outdegree of node i in V2.
U(a, b)  a uniform distribution between a and b.

6.3.2 Test Problem Generator


Method for Generating Random Test Problems

1. Generate Process Network

Step 1 Initialization: Calculate MINA = max(2 x NP, NC), MAXA = NC x NP, and NA = MINA + ⌊(MAXA - MINA) x Density⌋. Set ARC(j, i) = 0, CIN(j) = 0, COUT(j) = 0, PIN(i) = 0, and POUT(i) = 0 for all j = 1, ..., NC and i = 1, ..., NP. Set k = 0.

Step 2 Generate a smallest feasible process network: Repeat Steps 2.1, 2.2, and 2.3 until k = MINA.

Step 2.1 Generate an integer between 1 and NC; assign it to j. Generate an integer between 1 and NP; assign it to i. Generate a random value; denote it by r ~ U(0, 1). If r < 0.5, then d = -1; else d = 1.

Step 2.2 If ARC(j, i) ≠ 0, then go to Step 2.1. If CIN(j) + COUT(j) > 0, then go to Step 2.1. If PIN(i) + POUT(i) > 1, then go to Step 2.1. If PIN(i) = 1, then d = -1. If POUT(i) = 1, then d = 1.
216 M. -L. L1U, N. V. SAHINIDIS AND J. PARKER SHECTMAN

Figure 3 Flow diagram (a) and bipartite graph (b) of problems 1-9.



Step 2.3 Set ARC(j, i) = d. If d = 1, then set COUT(j) = COUT(j) + 1 and PIN(i) = PIN(i) + 1. If d = -1, then set CIN(j) = CIN(j) + 1 and POUT(i) = POUT(i) + 1. Set k = k + 1.

Step 3 Generate the remaining arcs of the process network: Repeat Steps 3.1, 3.2, and 3.3 until k = NA.

Step 3.1 Same as Step 2.1.

Step 3.2 If ARC(j, i) ≠ 0, then go to Step 3.1.

Step 3.3 Same as Step 2.3.

2. Given feasible ranges of parameters, randomly generate prices, availabilities, demands, capital requirements, operating costs, etc. (A transcription of the network-generation steps into code is sketched after this list.)
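The network-generation part of the method translates directly; the following is a minimal Python sketch of Steps 1-3 (the original implementation is in FORTRAN77; names here are illustrative and indices 0-based):

    import random

    def generate_network(nc, np_, density, rng=random):
        """Steps 1-3 of the generator; returns the arc set as {(j, i): d}."""
        mina = max(2*np_, nc)                         # MINA
        na = mina + int((nc*np_ - mina) * density)    # NA, with MAXA = NC*NP
        arc, k = {}, 0
        cdeg = [0]*nc                                 # CIN(j) + COUT(j)
        pin, pout = [0]*np_, [0]*np_
        while k < mina:                               # Step 2: smallest feasible network
            j, i = rng.randrange(nc), rng.randrange(np_)
            d = -1 if rng.random() < 0.5 else 1
            if (j, i) in arc or cdeg[j] > 0 or pin[i] + pout[i] > 1:
                continue                              # Step 2.2: resample
            if pin[i] == 1:
                d = -1                                # process already has an input
            if pout[i] == 1:
                d = 1                                 # process already has an output
            arc[(j, i)] = d                           # Step 2.3
            cdeg[j] += 1
            if d == 1:
                pin[i] += 1
            else:
                pout[i] += 1
            k += 1
        while k < na:                                 # Step 3: remaining arcs
            j, i = rng.randrange(nc), rng.randrange(np_)
            if (j, i) not in arc:
                arc[(j, i)] = -1 if rng.random() < 0.5 else 1
                k += 1
        return arc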

The above procedure has been programmed in the FORTRAN77 language.


The output file of the generator is in GAMS [1] format and can be solved
directly by GAMS. Generating a fairly large problem requires negligible CPU
time on an IBM RS/6000 Power PC. Two examples of random process networks
are shown in Figures 4 and 5. In Figure 4, we find that while chemical 3 is
produced by all of the processes in the complex, processes 1, 3, and 5 yield
chemical 2 as a by-product, which can then be utilized by processes 2 and 4.
Figure 5 demonstrates the possible complication of feedback, or recycling. In
this example, processes 3, 8, and 4 are part of the recycle loop.

The program is configured with several input parameters that control the size
of the process network as well as all the cost and price data for constructing a
problem described by the formulations of this paper. To conduct experiments of
a comparative nature, FCP problems are generated first and then transformed
into CCP form as described in the following Subsection.

6.4 Relation Between Fixed-Charge and Power Cost Functions
Let us assume that the CCP investment functions are power cost functions defined as in (7.14). In the cost function described by (7.11), a smaller ratio α_it/β_it indicates greater economy of scale, while in the cost function defined by (7.14), a smaller value of b_it > 0 indicates greater economy of scale. Since both of these functions are approximations of more general cost behavior, either FCP or CCP can be used to solve a process planning problem. Here, we present a way to transform an FCP to a CCP, or vice versa, by a best approximation. We

Figure 4 A generated process network with 3 chemicals, 5 processes and 15 arcs.

Figure 5 A generated process network with 5 chemicals, 8 processes and 18 arcs.

approximate one form of the cost function with the other by minimizing the Euclidean distance between them over a given range [l, u]. For convenience, let us rewrite the fixed-charge cost function and the power cost function as follows:

    φ(x) = 0, if x = 0;  φ(x) = αx + β, if x ≠ 0;
    π(x) = a x^b,

where α, β > 0, a > 0 and 0 < b < 1.

The problem of calculating an optimal transformation between the two forms over a given range [l, u] can be stated as follows:

    LSE = min ∫_l^u (αx + β - a x^b)^2 dx

Suppose that a and b are known; then the approximation of α and β can be found by the method of least squares. We will find α and β so as to minimize the least squares error (LSE). Differentiating LSE with respect to α and β and setting the partial derivatives equal to zero, we have

    ∂LSE/∂α = 0  and  ∂LSE/∂β = 0        (7.19)

Evaluating (7.19) and rearranging terms, we obtain the equations

    (u^2 - l^2)α + 2(u - l)β - (2a/(b+1))(u^{b+1} - l^{b+1}) = 0

and

    (2/3)(u^3 - l^3)α + (u^2 - l^2)β - (2a/(b+2))(u^{b+2} - l^{b+2}) = 0,

which may be solved simultaneously to yield formulas for α and β.

For the case of l = 0, the closed forms computed are

    α = 6ab u^{b-1} / ((b+1)(b+2))  and  β = 2a(1-b) u^b / ((b+1)(b+2)).
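The closed forms translate into a two-line routine; a minimal sketch (names illustrative):

    def fixed_charge_from_power(a, b, u):
        """Closed-form least-squares fit of alpha*x + beta to a*x**b on [0, u]."""
        denom = (b + 1.0) * (b + 2.0)
        alpha = 6.0 * a * b * u**(b - 1.0) / denom
        beta = 2.0 * a * (1.0 - b) * u**b / denom
        return alpha, beta

For example, fixed_charge_from_power(43.5, 0.366, 100.0) returns roughly (1.59, 92), consistent with the pair of functions shown in Figure 6.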
Similarly, for the case in which α and β are known, a and b can be obtained by solving the following equations:

    ∂LSE/∂a = 0  and  ∂LSE/∂b = 0        (7.20)

Rearranging the terms in (7.20) and setting l = 0, we obtain the equations

    (u^b/(2b+1)) a - (u/(b+2)) α - (1/(b+1)) β = 0

and

    (ln u/(b+2) - 1/(b+2)^2) u α + (ln u/(b+1) - 1/(b+1)^2) β - (ln u/(2b+1) - 1/(2b+1)^2) u^b a = 0,

from which it may not be easy to find closed forms for a and b, but a numerical approximation is easy to develop. Figure 6 shows a fixed-charge cost function φ(x) = 95 + 1.56x and its approximation π(x) = 43.5 x^{0.366} obtained by solving (7.20) numerically.
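Such a numerical approximation can be set up directly on the least-squares integral; the following is a minimal sketch with SciPy, assuming the optimum lies in 0 < b < 1 (all names are illustrative):

    from scipy.integrate import quad
    from scipy.optimize import minimize

    def power_from_fixed_charge(alpha, beta, u, b0=0.5):
        """Numerically fit a*x**b to alpha*x + beta on [0, u] by least squares."""
        def lse(p):
            a, b = p
            if a <= 0.0 or not 0.0 < b < 1.0:
                return 1e12                      # keep the search in the valid region
            return quad(lambda x: (alpha*x + beta - a*x**b)**2, 0.0, u)[0]
        a0 = (alpha*u + beta) / u**b0            # start by matching the curves at x = u
        res = minimize(lse, x0=[a0, b0], method="Nelder-Mead")
        return res.x                             # (a, b)

Applied to φ(x) = 95 + 1.56x on [0, 100], the fit should land near the (43.5, 0.366) reported for Figure 6.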

7 COMPUTATIONAL RESULTS
The proposed algorithm was implemented using BARON [6], a general-purpose global optimization software system for solving nonlinear and mixed-integer nonlinear programs. BARON employs the branch-and-reduce optimization strategy [7, 6], which integrates conventional branch-and-bound with a wide variety of domain reduction tools and branching rules. The implementation was done on an IBM RS/6000 Power PC using the FORTRAN version of BARON. IBM's OSL (Release 2 [3]) was used to solve the relaxed LPs.

Table 1 describes sixteen example problems from [5] in terms of the numbers of chemicals, processes, and time periods involved. The numbers of constraints (m) and variables (n) for a concave formulation are in each instance substantially less than those required for an MILP formulation. Note that a concave formulation uses precisely NP x NT fewer variables and constraints because it eliminates all binary variables and all constraints of the form (7.9).

Computational results for these example problems appear in Tables 2 and 3. These tables provide comparisons between MILP and FCP, and between CCP and FCP, respectively. In each table, the last several columns supply ratios of branch-and-bound nodes and required CPU times for the two approaches being compared. Table 2 shows that for these problems, an FCP approach using BARON is, on average, more than three times faster than solving the equivalent

Figure 6 Two typical capacity expansion cost functions (cost versus capacity expansion x): the fixed-charge function φ(x) = 1.56x + 95 and its power-cost approximation.

Table 1 MILP and CP model sizes.

Problem    Network Size    MILP size    CP size
No.    NC  NP  NT    n    m    n    m
1-3 3 3 3 55 37 46 28
4-6 3 3 6 91 73 73 55
7-9 3 3 8 121 97 97 73
10-11 6 10 4 225 145 185 105
12-13 6 10 6 277 217 217 157
14-15 6 10 8 369 289 289 208
16 28 38 4 897 569 745 417
MILP using OSL. The OSL search tree averages 4.6 times as many nodes as the branch-and-reduce algorithm. Moreover, in generating the MILP results, we first experimented with the OSL options to find the optimal MILP solution strategy. This strategy employs probing only on binaries that are satisfied (0 or 1 in the current solution) and updating pseudo-costs for all binary variables (both satisfied and unsatisfied) at each branching.

Table 2 MILP and FCP computational requirements.

Problem    MILP solved by OSL    FCP solved by BARON    OSL÷FCP
Number    Nodes    Time (sec)    Nodes    Time (sec)    Nodes    Time
1 7 .2 9 .2 0.8 1.0
2 7 .2 9 .2 0.8 1.0
3 9 .2 9 .2 1.0 1.0
4 23 .5 11 .3 2.1 1.7
5 32 .6 11 .3 2.9 2.0
6 23 .5 11 .3 2.1 1.7
7 42 .9 13 .3 3.2 3.0
8 52 .9 13 .3 4.0 3.0
9 50 .9 15 .3 3.3 3.0
10 103 2.4 19 .7 5.4 3.4
11 102 2.4 17 .7 6.0 3.4
12 327 8.6 65 2.4 5.0 3.6
13 205 5.6 39 1.7 5.3 3.3
14 2,495 74.8 117 5.3 21.3 14.1
15 1,102 30.2 153 7.0 7.2 4.3
16 13,289 770 5,237 638 2.5 1.2
Average 1,117 56.2 359 41.1 4.6 3.2

Table 3 shows that for small problems, solving the FCP problem requires about the same number of nodes and CPU time as its CCP approximation. Nevertheless, Problem 16 illustrates that problems of substantially larger size can be solved more quickly using the FCP technique. This follows from the fact that we always branch so that the objective term of the branching variable is no longer violated in descendant subproblems, whereas, for CCP, the algorithm is likely to branch on each variable many times in a single path of the tree. It should be mentioned that the results obtained here for CCP using the proposed subdivision rule are several orders of magnitude faster than when the traditional omega subdivision or bisection rule is used instead.

Table 3 CCP and FCP computational requirements.

Problem    CCP model    FCP model    CCP÷FCP
Number    Nodes    Time (sec)    Nodes    Time (sec)    Nodes    Time
1 5 .2 9 .2 0.6 1.0
2 5 .2 9 .2 0.6 1.0
3 5 .2 9 .2 0.6 1.0
4 7 .2 11 .3 0.6 0.7
5 9 .3 11 .3 0.8 1.0
6 7 .2 11 .3 0.6 0.7
7 13 .3 13 .3 1.0 1.0
8 15 .3 13 .3 1.2 1.0
9 15 .3 15 .3 1.0 1.0
10 37 1.1 19 .7 1.9 1.6
11 21 .7 17 .7 1.2 1.0
12 127 4.1 65 2.4 2.0 1.7
13 43 1.7 39 1.7 1.1 1.0
14 119 5.0 117 5.3 1.0 0.9
15 79 3.9 153 7.0 0.5 0.6
16 14,039 2,153 5,237 638 2.7 3.4
Average 909 135.7 359 41.1 1.1 1.2
In Table 4, we see that the transformation given in Section 6 causes a shift in solution value of only about 0.3% on average. Here again, Problem 16 is exceptional, showing a shift of nearly 4% in the optimum. Because this problem is so much larger than the rest, its objective contains a much greater proportion of terms that are concave, which magnifies the error in approximating the value of one formulation by the other. For all problems solved, both formulations gave exactly the same solution point. Note that, when all the capacity expansions are equal to zero, the FCP representation of CCP becomes exact. The small difference in the optimal solution values of the two models is then clearly seen to be a consequence of the fact that, due to economies of scale, solutions to planning problems typically involve very few capacity expansions.

Table 4 Performance of approximation by FCP and CCP models.

Problem    Optimal of    Optimal of    Difference of
Number    FCP Problems    CCP Problems    Two Models (%)
1 1774.8 1773.2 0.1
2 1123.3 1121.7 0.1
3 2440.8 2446.2 0.2
4 2196.5 2194.6 0.1
5 1387.9 1386.1 0.1
6 3019.1 3025.4 0.2
7 2451.4 2449.6 0.1
8 1583.9 1582.0 0.1
9 3395.7 3402.0 0.2
10 51031.0 51063.5 0.1
11 51450.1 51539.3 0.2
12 67322.7 67345.0 0.0
13 67746.2 67749.9 0.0
14 71919.9 71972.7 0.1
15 72369.9 72477.2 0.1
16 529.8 548.9 3.6
Average 0.3

For Tables 5-7, we refer to four different sets of test problems. The first column of Table 5 gives the specifications for each set of problems in the form NC-NP-NT. In the second column of this table, we give the number of binary and continuous variables, and the number of rows and nonzeroes in the constraint matrix. Whereas the number of binaries and constraints is fixed according to the specifications for each problem set, we express the numbers of continuous variables and nonzeroes in the constraint matrix using ranges, since these quantities vary according to the network density of each randomly generated problem. Each set comprises 27 randomly generated problems.

Table 5 The size of randomly generated problems.

           Network Size      MILP Model Size              CP Model Size
Problem    NC  NP  NT    0-1 Var.  Cont. Var.  Total Const.    Total Var.  Total Const.
10-10-6    10  10  6       60      319-349        241          259-289        181
10-15-4    10  15  4       60      305-313        221          245-253        161
15-15-6    15  15  6       90      493-529        361          403-439        271
15-20-6    15  20  6      120      625-649        451          505-529        331

For the four problem sets, average solution requirements are found in Tables 6 and 7. These tables provide comparisons between MILP and FCP, and between CCP and FCP, respectively. First, in Table 6, we find nearly identical running times when solving the MILP with OSL versus solving the FCP with BARON. In spite of this finding, the branch-and-reduce approach requires about one-half as many nodes in the search tree.

Table 6 Computational results for randomly generated problems: MILP and FCP models.

Problem    MILP solved by OSL    FCP solved by BARON    OSL÷FCP
Case    Nodes    Time (sec)    Nodes    Time (sec)    Nodes    Time
10-10-6 90 4.0 42 3.8 2.1 1.1
10-15-4 113 4.7 72 5.9 1.6 0.8
15-15-6 166 10.3 104 15.5 1.6 0.7
15-20-6 1,132 67.4 501 67.0 2.3 1.0
Average 375 21.6 180 23.1 1.9 0.9

Secondly, Table 7 shows that BARON solves the randomly generated FCPs four times faster than the corresponding CCPs. As with the example problems in Table 3, we find that both approaches give exactly the same solutions and nearly the same solution values.

Finally, Table 8 illustrates the effects of network density on the computational performance of the two CP approaches. Three different densities are studied

Table 7 Computational results for randomly generated problems: CCP and FCP models.

Problem    CCP model    FCP model    CCP÷FCP    % Diff. in
Case    Nodes    Time (sec)    Nodes    Time    Nodes    Time    Value
10-10-6 198 10.5 42 3.8 4.7 2.8 0.25
10-15-4 454 23.5 72 5.9 6.3 4.0 0.22
15-15-6 787 76.0 104 15.5 7.6 4.9 0.20
15-20-6 2,596 299.7 501 67.0 5.2 4.0 0.00
Average 1,009 102.4 180 23.1 6 4 0.17

across two problem sizes, specified by NC-NP-NT. Each of the six rows averages results for 27 randomly generated problems. Together with Figure 7, this table indicates a drastic decrease in computational requirements as the network density of the problems increases. This is due to an increased number of nonzeroes in the constraints of form (7.4), which shrinks the feasible region so that it is explored more quickly by the algorithm.

Table 8 The effect of network density on the computational requirements of CCP and FCP models.

Problem    Network    CCP    FCP
Case    Density    Nodes    Time    Nodes    Time
15-15-6 0.1 939 85.5 265 24.7
0.5 776 77.0 153 22.2
0.9 4 5.5 2 5.4
15-20-6 0.1 3653 469.0 1788 265.4
0.5 1668 238.3 349 54.1
0.9 8 9.7 4 9.3

8 CONCLUSIONS
1. The FCP approach to the process planning problem seems computationally more expedient than solving the equivalent MILP. For solving examples of actual process planning problems, BARON requires about one-third of the time taken by OSL. Although running times of the two codes are, at present, roughly the same when solving randomly generated problems,

Figure 7 The effect of network density on the computational requirements: CPU time (sec) versus network density (%) for problem cases 15-15-6 and 15-20-6.


BARON requires only about one-half as many iterations. This suggests that upon careful tuning, the BARON/CP approach will be even more competitive with existing MILP solvers. In short, concave programming proves an attractive, viable alternative to MILP for process planning.
2. The least-squares approximation of CCP by FCP results in identical solu-
tions and nearly identical solution values for the same problems, yet allows
them to be solved significantly faster. When given planning problems for-
mulated as CCPs, it seems desirable to transform them to FCP form, in
order to reduce solution time.

3. The novel subdivision rule proposed for CCP largely outperforms the bi-
section and omega subdivision rules traditionally used in concave program-
ming. As the proposed subdivision strategy is applicable to general concave
programs, its application to additional problems is clearly of interest.

9 ACKNOWLEDGEMENTS
Partial financial support from the EXXON Education Foundation and from the
National Science Foundation under grant DMII 94-14615 is gratefully acknowl-
edged.

REFERENCES
[1] A. Brooke, D. Kendrick and A. Meeraus. GAMS-A User's Guide. The
Scientific Press, Redwood City, CA, 1988.
[2] M. Benichou, J. M. Gauthier, P. Girodet, G. Hentges, G. Ribiere, and
O. Vincent. Experiments in mixed-integer linear programming. Mathe-
matical Programming, 1:76-94, 1971.

[3] IBM. Optimization Subroutine Library Guide and Reference Release 2.


International Business Machines Corporation, Kingston, NY, third edition,
July 1991.
[4] M. L. Liu and N. V. Sahinidis. Computational trends and effects of approximations in an MILP model for process planning. Industrial & Engineering Chemistry Research, 34:1662-1673, 1995.

[5] M. L. Liu and N. V. Sahinidis. Long range planning in the process industries: A projection approach. To appear in Computers & Operations Research, 1995.

[6] H. S. Ryoo and N. V. Sahinidis. A branch-and-reduce approach to global


optimization. Accepted for publication, Journal of Global Optimization,
1995.
[7] H. S. Ryoo and N. V. Sahinidis. Global optimization of nonconvex NLPs and MINLPs with applications in process design. Computers & Chemical Engineering, 19:551-566, 1995.

[8] N. V. Sahinidis and I. E. Grossmann. Reformulation of the multiperiod MILP model for capacity expansion of chemical processes. Operations Research, 40, Supp. No. 1:S127-S144, 1992.

[9] N. V. Sahinidis, I. E. Grossmann, R. E. Fornari, and M. Chathrathi. Op-


timization model for long range planning in the chemical industry. Com-
puters and Chemical Engineering, 13:1049-1063, 1989.

[10] J. P. Shectman and N. V. Sahinidis. A finite algorithm for global minimiza-


tion of separable concave programs. In C. A. Floudas and P. M. Pardalos,
editors, Proceedings of State of the Art in Global Optimization: Computa-
tional Methods and Applications, Princeton University, April 28-30, 1995,
1995.
8
GLOBAL OPTIMIZATION FOR
STOCHASTIC PLANNING,
SCHEDULING AND DESIGN
PROBLEMS
M. G. Ierapetritou, E. N. Pistikopoulos

Centre for Process Systems Engineering,


Department of Chemical Engineering,

Imperial College, London, SW7 2BY, UK

ABSTRACT

The work addresses the problem of including aspects of uncertainty in process parameters and product demands in the planning, scheduling and design of multiproduct/multipurpose plants operating in either continuous or batch mode. For stochastic linear planning models, it is shown that, based on a two-stage stochastic programming formulation, a decomposition-based global optimization approach can be developed to obtain the plan with the maximum expected profit by simultaneously considering future feasibility. An equivalent representation is also presented based on the relaxation of demand requirements, enabling the consideration of partial order fulfilment while properly penalizing unfilled orders in the objective function. A similar relaxation is shown for the problem of scheduling of continuous multiproduct plants, enabling the determination of a robust schedule capable of meeting stochastic demands. In both cases, it is shown that such relaxed reformulations can be solved to global optimality, since despite the presence of stochastic parameters the convexity properties of the original deterministic (i.e. without uncertainty) models are fully preserved. Finally, for the case of batch processes, global solution procedures are derived for the cases of continuous and discrete equipment sizes by exploiting the special structure of the resulting stochastic models. Examples are presented to illustrate the applicability of the proposed techniques.

8.1 INTRODUCTION
Due to the lack of accurate process models and the variability of process and environmental data, it becomes indispensable to establish systematic ways to solve process planning, scheduling and design problems involving stochastic elements. Nevertheless, it is well understood that the consideration of uncertainty transforms the problem from a deterministic one, where standard methods of mathematical programming can be applied, to a stochastic problem where special techniques are required. Decomposition techniques (Bienstock and Shapiro, 1988; Pistikopoulos and Grossmann, 1989a,b; Dantzig, 1989; Straub and Grossmann, 1993) based on the exploitation of the stagewise nature of the stochastic problem, as well as large-scale optimization approaches (Grossmann et al. 1983; Brauers and Weber, 1988; Sahinidis et al. 1989) based on the discretization of the uncertain parameter space, have appeared in the open literature to deal with the problem of process design/planning under uncertainty. Yet, in many model instances, only local solutions can be obtained without any guarantee and/or proof of global optimality.

In the field of global optimization, recent developments based on branch and bound and outer approximation methods (as well as combinations of the two) have enlarged the fields of optimization where these methods can be applied. Applications include global integer programming, systems of equations, d.c. and reverse convex programming (see Horst, 1990 for a complete survey), primarily focussing on deterministic nonlinear models (i.e. without any elements of uncertainty).

The main objectives of this work are to present mathematical models for the planning/design of continuous and batch plants for the case when some of the model parameters are random variables, described by probability distribution functions or considered to vary within a specified range, and to develop suitable solution techniques to determine the global optimal design/plan based on model reformulation and decomposition principles. In particular, we propose a two-stage stochastic programming formulation as the basis to model in a consistent way stochastic process design, planning and scheduling problems of continuous and/or multiproduct/multipurpose batch plants. Global optimization solution strategies are then presented for the efficient solution of these models.

The chapter is organised as follows. Section 8.2 deals with linear stochastic planning models, a decomposition-based global optimization algorithm for the resulting bilevel linear two-stage stochastic programming formulation, and a single-level reformulation based on the relaxation of demand requirements. A similar single-level reformulation is described in section 8.3 for the problem of scheduling of multiproduct continuous plants with uncertain demands. Global optimization algorithms for batch plant design and scheduling models are presented in section 8.4.

8.2 PRODUCTION AND CAPACITY PLANNING UNDER UNCERTAINTY
Solution approaches that have appeared in the open literature to deal with uncertainties in model parameters include: (a) the scenario analysis approach, which is characterised by discretization over the parameter space (Brauers and Weber, 1988); (b) the use of multiperiod models, which is characterised by discretization over the time horizon (Beale et al. 1980, Grossmann et al. 1983, Sahinidis et al. 1989); (c) stochastic programming models with recourse (Birge, 1982, 1985, Wallace, 1987, Birge and Wets, 1989); and (d) approximation techniques involving fuzzy programming (Inuiguchi et al., 1994, Liu and Sahinidis, 1995). Bloom (1983) and Bienstock and Shapiro (1988) described resource acquisition problems faced by an electric utility company; a Benders decomposition method was applied to solve the proposed two-stage programming with recourse model. For electric utility planning problems, Borison et al. (1984) presented a stochastic dynamic programming model for the determination of optimal purchase policies of generating technologies in the face of uncertainty; Modiano (1987) developed a stochastic programming with recourse model to analyze the impact of demand uncertainties on capacity expansion policies. Van Slyke and Wets (1969) and more recently Dantzig (1989) proposed a Benders decomposition algorithm to solve the two-stage stochastic programming formulation of the resource planning problem for large-scale electric power systems under uncertainty. A common feature of the above approaches is that they are based on an "a priori" discretization ("scenarios") of the uncertainty involved. Friedman and Reklaitis (1975) and Shimizu (1989) suggested an approach for incorporating flexibility in a system where, by allowing for possible future additive corrections on the current decisions, the system can be optimized by applying an appropriate cost-for-correction in the objective function. Clay and Grossmann (1994) proposed a successive disaggregation algorithm for production planning problems featuring discrete right-hand-side uncertainties.

8.2.1 Problem Definition

Figure 8.1 Production network (raw materials, intermediates and products connected by processing units).

We consider production networks (similar to the one shown in Figure 8.1) consisting of M existing processes which are interconnected by a set of streams (arcs) denoting raw materials, intermediates and products, which may be purchased/sold in the market subject to prices, availabilities and production capacities. The general planning problem involves the determination of optimal decisions regarding production profiles, sales and purchases of chemicals, as well as capacity expansion policies of the existing processes.

Consider a time horizon of operation [0, T] which is divided into T time periods. We introduce the following notation:

Index sets:

i  chemical, i = 1, .., N
j  process, j = 1, .., M
t  time period, t = 1, .., T

Parameters:

a^L_{i,t}, a^U_{i,t}  lower and upper bounds on the availability of chemical i at period t
d^L_{i,t}, d^U_{i,t}  lower and upper bounds on the demands of chemical i at period t
CE^L_{j,t}, CE^U_{j,t}  lower and upper bounds on the capacity expansion of process j at period t
I^min_{i,t}, I^max_{i,t}  minimum and maximum required amount of inventory for chemical i at the end of each time period t
b_{ij}  stoichiometric coefficient for chemical i in process j
γ_{i,t}  sale price of chemical i at time period t
Λ_{i,t}  purchase price of chemical i at time period t
ξ_{i,t}  value of final inventory of chemical i at time period t
Λ_{i,t}  value of starting inventory of chemical i at time period t (may be taken as the material purchase price)
C_{j,t}  operating cost of process j at time period t
α_{j,t}  variable-size cost coefficients for the investment cost of capacity expansion of process j at time period t
β_{j,t}  fixed-cost charges for the investment cost of capacity expansion of process j at time period t

Variables:

x_{j,t}  production of process j during time period t
CE_{j,t}  potential capacity expansion of process j at period t
Y_{j,t}  binary variables representing the occurrence or not of an expansion of process j at period t
S_{i,t}  amount of chemical i sold at period t
P_{i,t}  amount of chemical i purchased at period t
I^s_{i,t}, I^f_{i,t}  initial and final inventory of chemical i at period t

The following multiperiod mathematical model then formally describes the planning problem as previously posed.

Problem (P)

max Profit = Σ_{t=1}^{T} [ Σ_{i=1}^{N} γ_{i,t} S_{i,t} + Σ_{i=1}^{N} ξ_{i,t} I^f_{i,t} - Σ_{i=1}^{N} Λ_{i,t} P_{i,t}
             - Σ_{i=1}^{N} Λ_{i,t} I^s_{i,t} - Σ_{j=1}^{M} C_{j,t} x_{j,t} - Σ_{j=1}^{M} (α_{j,t} CE_{j,t} + β_{j,t} Y_{j,t}) ]        (8.1)

s.t.  x_{j,t} = x_{j,t-1} + CE_{j,t}   ∀ j, t        (8.2)
      CE^L_{j,t} Y_{j,t} ≤ CE_{j,t} ≤ CE^U_{j,t} Y_{j,t}   ∀ j, t        (8.3)
      d^L_{i,t} ≤ S_{i,t} ≤ d^U_{i,t}   ∀ i, t        (8.4)
      a^L_{i,t} ≤ P_{i,t} ≤ a^U_{i,t}   ∀ i, t        (8.5)
      I^min_{i,t} ≤ I^f_{i,t} ≤ I^max_{i,t}   ∀ i, t        (8.6)
      I^f_{i,t} = I^s_{i,t+1}   ∀ i, t        (8.7)
      P_{i,t} + I^s_{i,t} + Σ_{j=1}^{M} b_{ij} x_{j,t} - S_{i,t} - I^f_{i,t} = 0   ∀ i, t        (8.8)
The objective function (8.1) corresponds to the maximization of profit over the time horizon, represented by the difference between the revenue due to product sales and the overall cost (cost of raw material purchases, operating cost, inventory cost and investment cost of potential capacity expansions). Constraints (8.2) and (8.3) determine production capacity and capacity expansion limits. The material balances for each chemical are given by (8.8); constraints (8.6) and (8.7) describe inventory requirements, while (8.4) and (8.5) represent sales and purchasing limits.

8.2.2 Two-stage Stochastic Formulation


The inclusion of uncertainty transforms the deterministic problem into a two-stage stochastic formulation where at the first stage (which may consist of more than one period) the values of the model parameters are assumed known, whereas their values over the second stage (involving more than one period) can only be forecasted and are typically given either by a probability distribution or by a range within estimated bounds. The objective is then the maximization of the expected value of profit over the two stages. For simplicity of representation, by properly aggregating the variables and eliminating the inventory variables of problem (P), the planning model in (P) can be recast in the following compact two-stage stochastic program (SP):

Problem (SP)

max_{x_1}  c_1 x_1 + E_θ { max_{x_2} { c_2(θ) x_2 } }
s.t.  A_1 x_1 ≤ b_1
      B_1 x_1 + B_2(θ) x_2 ≤ b_2(θ)
      x_1, x_2 ≥ 0,  θ ∈ R

where x_1 = (x_{jt}, Y_{jt}, CE_{jt}, S_{it}, P_{it}), t = 1, .., T_1 (except for Y_{jt}, CE_{jt}, where t = 1, .., T_1 + T_2), is the first-stage decision vector; x_2 = (x_{jt}, S_{it}, P_{it}) corresponds to the second-stage decision vector; θ is the vector of uncertain parameters, which may involve d^L_{it}, d^U_{it}, a^L_{it}, a^U_{it}, Λ_{it}, γ_{it}, C_{jt}, t = T_1 + 1, .., T_1 + T_2, with associated distribution function J(θ); E_θ{.} is the expectation operator of {.} over θ; R is the feasible region of (x_1, x_2), i.e. R = {θ | ∃ x_2 ≥ 0 : B_1 x_1 + B_2(θ) x_2 ≤ b_2(θ)}. Note that capacity expansions are considered as first-stage decision variables, which typically have to be decided prior to the resolution of uncertainty. It should also be noted that in the above formulation the estimation of the expectancy is based on feasible planning decisions, i.e. we do not account for partially meeting customer orders, nor do we penalize unfilled orders (which may result in pessimistic planning decisions).

Problem (SP) mathematically describes the following two-stage planning strategy. At the first stage, the operating decision variables (x_{jt}, S_{it}, P_{it}) and/or the capacity expansion variables (Y_1, Y_2, CE_1, CE_2) should be selected to maximize the average profit and also to ensure future feasibility, i.e. that a vector of future operating variables can be found to satisfy the constraints. At the second stage, after a first-stage plan is implemented, the solution of the inner optimization problem results in the selection of a feasible second-stage plan that optimizes plant operation.

8.2.3 Proposed Approach


Problem (SP) corresponds to a two-stage stochastic embedded optimization problem, the direct solution of which is not possible for several reasons. First, the expected profit evaluation requires integration over the (inner) second-stage optimization subproblem. Moreover, the feasible region of the plan, within which the integration ought to be performed, is unknown. As will be shown later, the determination of the boundary of the feasible region, which also defines the (unknown) limits for the integration, requires the solution of a sequence of optimization subproblems.

For the solution of problem (SP), Ierapetritou and Pistikopoulos (1994) proposed an algorithm which essentially transforms the two-stage stochastic optimization problem into a series of deterministic optimization subproblems executed in an iterative fashion. In particular, feasibility subproblems are first solved to induce the feasible region of a selected plan. This allows the profit maximization to be effectively performed, providing a lower bound. A master problem is then constructed from dual information of the feasibility and profit subproblems, the solution of which returns a new plan (to be examined) while providing an upper bound. The algorithm converges to an optimal plan, i.e. a plan with maximum expected profit and sufficient ("optimal") plan feasibility. The main steps of the proposed algorithmic procedure for the case of two uncertain parameters are the following (see also Figure 8.2):

Step 1: Select an initial plan (x_1^1, y_1^1, y_2^1, CE_1^1, CE_2^1). Set the lower bound EP_L = -∞, set k = 1 and select a tolerance ε.

Step 2: Solve the feasibility subproblems (B1), (B2^q1) to obtain the boundary of the feasible region and the corresponding Lagrange multipliers (Straub and Grossmann, 1993):

    θ_1^L, θ_1^U = arg max {θ_1^U - θ_1^L}
    s.t.  A_3(x_1 + CE_1) + B_1(θ_1^L, θ_2(.)) x_2(.) + B_2 CE_2 ≤ b_2(θ_1^L, θ_2(.))        (B1)
          A_3(x_1 + CE_1) + B_1(θ_1^U, θ_2(.)) x_2(.) + B_2 CE_2 ≤ b_2(θ_1^U, θ_2(.))

    θ_2^{L,q1}, θ_2^{U,q1} = arg max {θ_2^{U,q1} - θ_2^{L,q1}}
    s.t.  A_3(x_1 + CE_1) + B_1(θ_1^{q1}, θ_2^{L,q1}) x_2(.) + B_2 CE_2 ≤ b_2(θ_1^{q1}, θ_2^{L,q1})        (B2^q1), q1 = 1, .., Q1
          A_3(x_1 + CE_1) + B_1(θ_1^{q1}, θ_2^{U,q1}) x_2(.) + B_2 CE_2 ≤ b_2(θ_1^{q1}, θ_2^{U,q1})

    Place the quadrature points inside the feasible region:
    θ_1^{q1} = 0.5[θ_1^U (1 + v_{q1}) + θ_1^L (1 - v_{q1})],  q1 = 1, .., Q1
    θ_2^{q1,q2} = 0.5[θ_2^{U,q1} (1 + v_{q2}) + θ_2^{L,q1} (1 - v_{q2})],  q1 = 1, .., Q1,  q2 = 1, .., Q2

Step 3: At each quadrature point (θ_1^{q1}, θ_2^{q1,q2}) solve problem (PQ) to evaluate the optimal value of the profit and obtain the corresponding Lagrange multipliers:

    max  c_2(θ_1^{q1}, θ_2^{q1,q2}) x_2^{q1,q2}
    s.t.  B_1(θ_1^{q1}, θ_2^{q1,q2}) x_2^{q1,q2} ≤ b_2(θ_1^{q1}, θ_2^{q1,q2}) - A_3(x_1 + CE_1) - B_2 CE_2        (PQ)

Step 4: Calculate the expected profit (EP) using the Gaussian quadrature formula and update the lower bound, EP_L = max{EP, EP_L}.

Step 5a: Calculate the correction factors (see Ierapetritou and Pistikopoulos, 1994) and obtain the corrected multipliers μ^{q1,q2}, λ_1^L, λ_1^U, λ_2^{L,q1}, λ_2^{U,q1}.

Step 5b: Solve the following master problem to obtain a new plan (x_1^{k+1}, y_1^{k+1}, y_2^{k+1}, CE_1^{k+1}, CE_2^{k+1}) and an upper bound EP_U^k:

    EP_U^k = max_{μ_E, θ, x_1, y_1, y_2, CE_1, CE_2}  μ_E

    s.t.  μ_E ≤ c_1 x_1 + α_1 CE_1 + β_1 y_1 + α_2 CE_2 + β_2 y_2
          + [(θ_1^U - θ_1^L)/2] Σ_{q1=1}^{Q1} w_{q1} [(θ_2^{U,q1} - θ_2^{L,q1})/2] Σ_{q2=1}^{Q2} w_{q2} c_2(θ_1^{q1}, θ_2^{q1,q2}) x_2^{q1,q2} J(θ_1^{q1}, θ_2^{q1,q2})
          - Σ_{q1=1}^{Q1} Σ_{q2=1}^{Q2} μ^{q1,q2} [A_3(x_1 + CE_1) + B_1(θ_1^{q1}, θ_2^{q1,q2}) x_2^{q1,q2} + B_2 CE_2 - b_2(θ_1^{q1}, θ_2^{q1,q2})]
          - λ_1^L [A_3(x_1 + CE_1) + B_1(θ_1^L, θ_2(.)) x_2(.) + B_2 CE_2 - b_2(θ_1^L, θ_2(.))]
          - λ_1^U [A_3(x_1 + CE_1) + B_1(θ_1^U, θ_2(.)) x_2(.) + B_2 CE_2 - b_2(θ_1^U, θ_2(.))]
          - Σ_{q1=1}^{Q1} λ_2^{L,q1} [A_3(x_1 + CE_1) + B_1(θ_1^{q1}, θ_2^{L,q1}) x_2(.) + B_2 CE_2 - b_2(θ_1^{q1}, θ_2^{L,q1})]
          - Σ_{q1=1}^{Q1} λ_2^{U,q1} [A_3(x_1 + CE_1) + B_1(θ_1^{q1}, θ_2^{U,q1}) x_2(.) + B_2 CE_2 - b_2(θ_1^{q1}, θ_2^{U,q1})]

          CE_2^L y_2 ≤ CE_2 ≤ CE_2^U y_2

Step 6: If EP_U^k ≤ EP_L + ε, stop; the solution is the plan (x_1^k, y_1^k, y_2^k, CE_1^k, CE_2^k) with expected profit EP_L. Otherwise, set k = k + 1 and return to Step 2. (A skeleton of this iteration is sketched below.)
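The iteration structure maps almost directly onto code. The following is a minimal Python skeleton, in which solve_B1, solve_B2, solve_PQ, expected_profit and solve_master are placeholders for the LP subproblems (B1), (B2^q1), (PQ), the quadrature formula of Step 4 and the master problem of Step 5b; these helper names are assumptions for illustration, not part of the original algorithm statement.

    import numpy as np

    def gauss_points(lo, hi, n):
        """Map Gauss-Legendre nodes from [-1, 1] into [lo, hi], as in Step 2."""
        v, w = np.polynomial.legendre.leggauss(n)
        return 0.5*(hi*(1.0 + v) + lo*(1.0 - v)), w

    def plan_under_uncertainty(plan0, eps=0.1, Q1=5, Q2=5, max_iter=50):
        plan, EP_L = plan0, -np.inf
        for k in range(max_iter):
            t1_lo, t1_hi, dual1 = solve_B1(plan)              # Step 2: range of theta1
            t1, w1 = gauss_points(t1_lo, t1_hi, Q1)
            rows = [solve_B2(plan, t) for t in t1]            # theta2 range at each theta1
            profit, duals = {}, {}
            for q1, (t2_lo, t2_hi, dual2) in enumerate(rows): # Step 3: profit LPs
                t2, w2 = gauss_points(t2_lo, t2_hi, Q2)
                for q2 in range(Q2):
                    profit[q1, q2], duals[q1, q2] = solve_PQ(plan, t1[q1], t2[q2])
            EP = expected_profit(plan, profit, t1_lo, t1_hi, rows)   # Step 4
            EP_L = max(EP_L, EP)
            plan, EP_U = solve_master(plan, duals, dual1, rows)      # Steps 5a-5b
            if EP_U <= EP_L + eps:                            # Step 6
                break
        return plan, EP_L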

Figure 8.2 Algorithm for stochastic planning models (flowchart: select initial plan; solve feasibility subproblems; maximize profit at each quadrature point; evaluate expected profit and update the lower bound; solve the master problem for a new plan and upper bound; check convergence).



The proposed algorithm is guaranteed to converge to the global optimal solution under the following convexity assumptions: convexity of the profit function, and convexity of the constraints in (x_1, x_2, y_1, y_2, CE_1, CE_2) for fixed θ and in (θ, x_2) for fixed (x_1, y_1, y_2, CE_1, CE_2). Under these assumptions the master problem of Step 5b provides a valid upper bound on the global optimal solution, whereas a valid lower bound is obtained through the evaluation of the expected profit in Step 4 (Pistikopoulos and Ierapetritou, 1995). Consequently, for linear planning models described by problem (P), where the above convexity assumptions hold, convergence to global optimality is guaranteed.

Note also that the algorithm follows the general Benders decomposition principles; its main difference is that the primal problem involves the solution of a sequence of feasibility and profit subproblems, the dual information of which is then all incorporated in the master problem.

Such an approach enables plan feasibility to be systematically embodied in planning decision making. However, it does not consider the possibility of partial order fulfilment, since the estimation of the expected profit is based on strictly feasible planning decisions, which results in rather "pessimistic" decisions. In the next section, this assumption is relaxed based on (i) the partial relaxation of the feasibility constraint and (ii) the incorporation of a penalty term in the objective function to monitor demand satisfaction.

8.2.4 Equivalent Formulation


As pictorially shown in Figure 8.3, the possibility of partial order fulfilment can be considered by (i) relaxing the demand constraint to the form S_i ≤ θ_i, and (ii) introducing a penalty term in the objective function to penalize the loss of expected revenue due to unfilled orders, γ E_{θ∈R(x_1)} { Σ_i P_i (θ_i - S_i) }, where γ is a penalty coefficient used to control the effect of the penalty term. Note that the priority of demand satisfaction among the different products is controlled by the relative values of the stoichiometric coefficients b_{ij} (representing the yield of process j in product i) and the product prices P_i. The incorporation of the above in problem (SP) results in the following relaxed reformulation:
242 M. G. IERAPETRITOU AND E. N. PISTIKOPOULOS

Problem (SP1)

max_{x_1}  c_1 x_1 + E_{θ∈R(x_1)} { max_{x_2} c_2 x_2 } + E_{θ∈R(x_1)} { max_{S_2} c_{2s} S_2 }
           - γ E_{θ∈R(x_1)} { max_{S_2} P_2 (θ - S_2) }
s.t.  A_1 x_1 ≤ b_1
      B_1 x_1 + B_2 x_2 + B_{2s} S_2 ≤ b_2
      S_2 ≤ θ
      x_1, x_2, S_2 ≥ 0,  θ ∈ R

where uncertainty is considered only in the product demands and the demand constraint is considered separately.

Other strategies that have appeared in the open literature with regard to uncertain demand considerations are the following:

- S_i = θ_i, θ_i ∈ R(x_1), representing the decision maker's choice to always meet the demand within the plant's feasible region (as proposed in section 8.2.3);

- S_i = θ_i, θ_i ∈ T(θ) (scenario approach), representing the decision maker's requirement of demand satisfaction within a specified range T defined by appropriate lower and upper bounds (see Clay and Grossmann, 1994a,b).

It is interesting to note that, as illustrated in Figure 8.4, the formulation in (SP1) embodies the above model instances regarding demand satisfaction, depending on the value of the penalty coefficient γ. In particular, as the value of γ increases (implying a more severe penalization of partial order fulfilment), the result of the decision strategy in (SP1) moves towards the direction of complete demand satisfaction, which coincides with the result of the scenario-based approach. In fact, there exists a critical value of γ for which (SP1) is equivalent to the multiperiod (scenario-based) formulation (see Appendix A).

Nevertheless, similarly to problem (SP), model (SP1) cannot be directly solved, since it involves the evaluation of the expected profit within the plant's feasible region (unknown at the design stage) and an integration (expectation evaluation) over an optimization problem (maximization of revenue with respect to the production variables). These difficulties can be overcome by evaluating the expected profit through a numerical integration scheme, such as the Gaussian quadrature formula, and then utilizing the following feasibility property arising due to the relaxation of the demand constraint.

Si = Oi
Upper 9i E R(xl)
Bound

Lower
Round
,,
,,

9.
Lower Upper
Bound Bound

Si = 9i 9i E T( 9)

~ 9 i 9 i E R(xll

Figure 8.3 Dilfnrnllt. d"lIlalld ("ollst.raillts



Figure 8.4 Effect of penalty coefficient γ.

Property. Any first-stage plan x_1 satisfying the constraints of problem (P) for fixed product demands θ̄_i is feasible for the whole parameter range T(θ).

Proof: see Appendix B.

The important implication of this property is that the integration can be performed within the region defined by the bounds of the uncertain demand parameters, since the feasible region of any design depicted from the solution of problem (P) in the space of the uncertain demands coincides with the uncertain parameter range. Therefore, the obstacle of unknown integrands is effectively overcome.

Based on the above property and by incorporating equation (8.1), problem (SP1) can be transformed into the following equivalent (single-level) optimization problem (SP2):

Problem (SP2)

max EP = c_1 x_1 + Σ_q w_q c_2 x_2^q J(θ^q) + Σ_q w_q c_{2s} S_2^q J(θ^q) - γ Σ_q w_q P_2 (θ^q - S_2^q) J(θ^q)

subject to
      A_1 x_1 ≤ b_1
      B_1 x_1 + B_2 x_2^q + B_{2s} S_2^q ≤ b_2   ∀ q
      S_2^q ≤ θ^q   ∀ q
      x_1, x_2^q, S_2^q ≥ 0,  θ^q ∈ T(θ)

Notice that the location of the quadrature points within the plant's feasible region is fully determined from the optimization procedure. In this way, the postulation of arbitrary scenarios is effectively avoided; instead, the number and location of the scenarios (integration points in this case) depend only on the degree of accuracy required for the integration. Moreover, since (SP2) corresponds to a single-level optimization problem, conventional solution algorithms can be used to obtain the global optimal solution for linear and convex planning models.

Note also that the case of additional uncertain parameters appearing only in the objective function can be treated similarly in a straightforward manner (Ierapetritou et al. 1995).
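To make the quadrature data in (SP2) concrete, the following is a minimal sketch that precomputes the points θ^q over T(θ) = [lo, hi] and the products w_q J(θ^q) that weight the second-stage terms, assuming a normal density J (names are illustrative):

    import numpy as np
    from scipy.stats import norm

    def quadrature_scenarios(lo, hi, mu, sigma, n=5):
        """Points theta_q in T(theta) = [lo, hi] and weights w_q * J(theta_q)."""
        v, w = np.polynomial.legendre.leggauss(n)
        theta = 0.5*(hi*(1.0 + v) + lo*(1.0 - v))        # map [-1, 1] -> [lo, hi]
        weight = 0.5*(hi - lo) * w * norm.pdf(theta, mu, sigma)
        return theta, weight

    # e.g. for a demand distributed N(15, 1.25) over the range [13, 19.7]:
    # theta, weight = quadrature_scenarios(13.0, 19.7, 15.0, 1.25)

For more than one uncertain parameter, tensor products of such one-dimensional rules give the multidimensional points and weights.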

Illustrating Example

Figure 8.5 Refinery input and output schematic: crude oil 1 (cost 24 $/bbl) and crude oil 2 (cost 15 $/bbl) are fed to the refinery, which produces gasoline, kerosene, fuel oil and residual, each with a sales price of 36 $/bbl.

This example is a variation of the refinery planning problem considered in Edgar and Himmelblau (1988) and Clay and Grossmann (1994a,b). Figure 8.5 is a simplified schematic of the feedstocks and products of the refinery (where costs and prices are also given). Two crude oils are available for purchase, subject to supply limitations. Four products are produced according to a yield matrix, with limits on the production capacities (see Table 8.1). Both crudes and products can be stored, subject to tank inventory limits. Products are available for sale according to market demands, which are considered uncertain for gasoline and fuel oil during the second stage, following the normal distribution functions N(15, 1.25) and N(5, 1), respectively. The objective is the maximization of the expected value of the profit function, defined as the difference between income (from product sales) and cost (operating cost and cost of purchasing), over the two stages. The two-stage planning model consists of 28 inequalities and 36 equalities involving 40 variables.

                            Yield                    Maximum
                   Crude oil 1    Crude oil 2        Capacity
Gasoline               0.8            0.44              24
Kerosene               0.5            0.4               15
Fuel oil               0.2            0.36               1
Residual               0.05           0.1
Processing cost        0.5            1
(per thousand bbl)

Table 8.1 Data for the refinery crudes and products

(i) The application of the approach of section 8.2.3 leads to the following results.

Step 1: An initial plan is selected to purchase 20 and 15 thousand bbl/day of crude oils 1 and 2, respectively, and to utilize 18 and 15 thousand bbl/day of crude oils 1 and 2, respectively, for the first stage; EP_L = -∞, k = 1 and ε = 0.1.

Step 2: The solution of the feasibility problem (B1) leads to the following bounds on θ_1: 13 ≤ θ_1 ≤ 19.7. Considering 5 quadrature points for θ_1 and solving (B2^q1) at each point, the results shown in Table 8.2 are obtained.
q_1              1       2       3       4       5
θ_1^{q1}       13.31   14.55   16.35   18.15   19.39
θ_2^{L,q1}      4.69    3.45    3.00    3.29    4.04
θ_2^{U,q1}      7.01    7.05    7.12    7.13    6.61

Table 8.2 Bounds for θ_2 at θ_1^{q1} for the initial plan



Step 3: Considering 5 quadrature points for θ_2 and solving problem (PQ) at each point, the results shown in Table 8.3 are obtained.

q_2 \ q_1        1        2        3        4        5
1 420.31 432.98 446.13 459.27 468.25
2 418.95 444.76 477.99 498.34 512.23
3 467.13 491.24 514.51 532.57 540.82
4 499.05 513.89 535.63 547.70 555.40
5 515.75 525.68 540.21 554.75 560.13

Table 8.3 Optimal profit at the quadrature points


Step 4: The expected profit is evaluated through the Gaussian quadrature formula, yielding a value of 939.98 K$/day; the lower bound is updated, EP_L = 939.98.

Steps 5a, 5b: After the evaluation of the corrected multipliers, the master problem is formulated and solved. A plan is then obtained that corresponds to purchasing and utilizing 20 and 12.5 thousand bbl/day of crudes 1 and 2, respectively; an upper bound of EP_U = 950.93 K$/day is found.

Step 6: Since the stopping criterion is not satisfied (EP_U = 950.93 > EP_L = 939.98), set k = k + 1 and return to Step 2.

The algorithm requires only 2 iterations to reach the optimal solution within a tolerance of 0.1 for the lower and upper bounds. The optimal solution is the plan that corresponds to purchasing and utilizing 20 and 12.5 thousand bbl/day of crudes 1 and 2, respectively. This plan has an expected profit of 948.5 K$/day within a corresponding feasible region defined by the following bounds: for θ_1, 13 ≤ θ_1 ≤ 19.7, and, for each quadrature point θ_1^{q1}, by the bounds for θ_2 shown in Table 8.4.
q_1              1       2       3       4       5
θ_1^{q1}       13.31   14.55   16.35   18.15   19.39
θ_2^{L,q1}      4.69    3.45    2.50    3.04    4.04
θ_2^{U,q1}      7.01    7.05    7.12    7.13    6.61

Table 8.4 Bounds for θ_2 at θ_1^{q1} for the optimal plan

Using GAMS/MINOS for the solution of the linear subproblems and of the linear master problem requires a total of approximately 33 CPU s on a SPARC 2 (16.5 CPU s for each iteration, with approximately 15.5 CPU s spent on the solution of the subproblems and 1 CPU s on the solution of the master problem).

(ii) The solution of the same example with the relaxed demand requirement S_i ≤ θ_i (instead of S_i = θ_i) results in the plan of purchasing 20 and 15 thousand bbl/day and utilizing 18 and 15 thousand bbl/day of crudes 1 and 2, respectively. Compared to the plan obtained in (i), it features a smaller feasible region and less expected profit; i.e. it represents a less risk-averse decision, since in this case we take into account partial demand satisfaction (whereas in the former case partial order fulfilment is not considered).

8.3 SCHEDULING OF MULTIPRODUCT CONTINUOUS PLANTS WITH UNCERTAIN DEMANDS
While the scheduling problem for batch processes has received much attention in the open literature, much less work has been reported for continuous plants. Sahinidis and Grossmann (1991) considered the cyclic scheduling of multiple continuous lines where each product is produced on a single line. Pinto and Grossmann (1993) addressed the problem of optimizing cyclic schedules of multiproduct continuous plants comprising a sequence of stages interconnected by storage tanks. Schilling et al. (1994) considered more general multipurpose plants implementing recipes of arbitrary complexity. In all previously reported work, the problem data are assumed deterministic.

Here, we address these scheduling problems for the case when uncertainty in the product demands is involved. In order to provide more insight into the nature of the problem, we begin our analysis by first considering the simple case of a single-stage plant with a single production line. The additional complications introduced by considering several stages interconnected by storage tanks, as well as cyclic schedules, are discussed in the subsequent sections.
8.3.1 Single production line - one stage
Consider the production of N_p products within a given time horizon of H hr. The time horizon is discretized into N_T time slots consisting of production, idle and changeover time. The problem is then to determine the sequencing of the products, the amounts of the products to be produced, and the production times, along with the levels of the intermediate storage inventories. In order to mathematically represent the scheduling problem, the following notation is introduced:

Index sets:

i, j  product sets, i = 1, .., N_p, j = 1, .., N_p
t  time slots, t = 1, .., N_T

Parameters:

r_i  production rate of product i
p_i  sale price of product i
Q_{ij}  transition time from product i to product j
c^I_i  inventory cost for product i
c^t_{ij}  transition cost between product i and product j
I^st_i  initial inventory for product i

Variables:

d_i  sales rate for product i
I_{it}  inventory level of product i in slot t
T_t  time duration of slot t
T^idl_t  idle time of slot t
y_{it}  0-1 variable to denote the assignment of product i to slot t
z_{ijt}  0-1 variable to denote if a changeover from product i to j occurs at the end of time slot t
θ_i  continuous demand rate for product i (uncertainty)

The following mathematical model (PC) then formally describes the scheduling problem as previously posed.

Problem (PC)

max_{y_{it}, z_{ijt}}  E_{θ∈R} { max_{d_i, T_t, I_{it}} Σ_i Σ_t p_i d_i T_t - Σ_i Σ_t c^I_i ((I_{it} + I_{it-1})/2) T_t } - Σ_i Σ_j Σ_t c^t_{ij} z_{ijt}

s.t.  I_{it} = I_{it-1} + (T_t - T^idl_t - Σ_i Σ_j z_{ijt} Q_{ij}) r_i y_{it} - T_t d_i   ∀ i, t        (8.9)
      I_{i0} = I_{iN_T} = I^st_i   ∀ i        (8.10)
      Σ_t T_t = H        (8.11)
      T_t ≥ T^idl_t + Σ_i Σ_j z_{ijt} Q_{ij}   ∀ t        (8.12)
      Σ_i y_{it} = 1   ∀ t        (8.13)
      z_{ijt} ≥ y_{it} + y_{jt+1} - 1   ∀ i, j, t        (8.14)
      d_i ≤ θ_i   ∀ i        (8.15)
The objective function corresponds to the maximization of the expected profit (over the time horizon), represented by the difference between the revenue due to product sales and the overall cost (inventory cost and transition cost). A penalty term of the form γ Σ_i Σ_t p_i (θ_i - d_i) T_t can also be introduced to penalize partial demand satisfaction (as discussed in the previous sections), where γ is a penalty coefficient used to control demand satisfaction. The mass balances for each product i at each time slot t are considered in equations (8.9) and (8.10); equations (8.11) and (8.12) represent the timing constraints; constraint (8.13) ensures the assignment of only one product to each time slot, while constraint (8.14) establishes the link between the transition variables z_{ijt} and the assignment variables y_{it}; finally, constraint (8.15) corresponds to the relaxed demand constraint for each product. (A small consistency check of these constraints is sketched below.)
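As a small illustration of how constraints (8.9)-(8.13) interact, the following sketch verifies a candidate schedule against the balances as reconstructed above (names, indexing and the tolerance are illustrative):

    def check_schedule(y, z, T, T_idl, r, d, Q, I_st, H, tol=1e-6):
        """Verify (8.9)-(8.13) for a candidate schedule; 0-based indices."""
        NP, NT = len(r), len(T)
        if abs(sum(T) - H) > tol:                                  # (8.11)
            return False
        I = list(I_st)                                             # I_{i,0} = I_i^st
        for t in range(NT):
            trans = sum(z[i][j][t]*Q[i][j] for i in range(NP) for j in range(NP))
            if T[t] + tol < T_idl[t] + trans:                      # (8.12)
                return False
            if sum(y[i][t] for i in range(NP)) != 1:               # (8.13)
                return False
            I = [I[i] + (T[t] - T_idl[t] - trans)*r[i]*y[i][t] - T[t]*d[i]
                 for i in range(NP)]                               # (8.9)
        return all(abs(I[i] - I_st[i]) <= tol for i in range(NP))  # (8.10)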

The use of the Gaussian quadrature formula to evaluate the expected-profit multiple integral, as well as the utilization of a similar feasibility property (see Appendix C for the detailed proof), leads to the following equivalent reformulation of problem (PC):

Problem (PC1)

max_{y, z, d, T}  Σ_q w_q Σ_i Σ_t p_i d_i^q T_t^q - Σ_q w_q Σ_i Σ_t c^I_i ((I^q_{it} + I^q_{it-1})/2) T_t^q - Σ_i Σ_j Σ_t c^t_{ij} z_{ijt}
                  - γ Σ_q w_q Σ_i Σ_t p_i (θ_i^q - d_i^q) T_t^q

s.t.  T_t^q ≥ T^{idl,q}_t + Σ_i Σ_j z_{ijt} Q_{ij}   ∀ t, q
      Σ_i y_{it} = 1   ∀ t
      z_{ijt} ≥ y_{it} + y_{jt+1} - 1   ∀ i, j, t
      d_i^q ≤ θ_i^q   ∀ i, q,  θ ∈ T(θ)
Problem (PC1) corresponds to a single-level yet nonconvex optimization problem, due to the nonconvex objective function (bilinear terms in the inventory-cost and revenue terms) and the inventory constraint (due to the introduction of uncertainty). However, problem convexification can be achieved based on the following ideas:

- introduction of a new variable D^q_{it} = d_i^q T_t^q representing the sales of product i during time slot t;

- approximation of the inventory cost (Schilling et al., 1994).

Problem (PC1) can then be rewritten in the following way:



Problem (PC2)

max_{y, z, D, T}  Σ_q w_q Σ_i Σ_t p_i D^q_{it} - (H/N_T) Σ_i c^I_i (I^st_i + Σ_t I_{it}) - Σ_i Σ_j Σ_t c^t_{ij} z_{ijt}
                  - γ Σ_q w_q Σ_i Σ_t p_i (θ_i^q T_t^q - D^q_{it})

s.t.  I^q_{it} = I^q_{it-1} + (T_t^q - T^{idl,q}_t - Σ_i Σ_j z_{ijt} Q_{ij}) r_i y_{it} - D^q_{it}   ∀ i, t, q
      Σ_i y_{it} = 1   ∀ t
      z_{ijt} ≥ y_{it} + y_{jt+1} - 1   ∀ i, j, t
      D^q_{it} ≤ θ_i^q T_t^q   ∀ i, t, q
Problem (PC2) still appears to be a nonconvex MINLP formulation, due to the bilinear term in the demand constraint. However, due to the feasibility property in Appendix C, θ ∈ T(θ), i.e. the location of the quadrature points does not depend on the decision variables; consequently, the quadrature points can be fixed prior to the optimization, based only on the desired accuracy of the integration. As a result, problem (PC2) eventually corresponds to an MILP formulation which can be solved to global optimality using conventional MILP solvers.
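To illustrate this point, the following sketch assembles a fragment of (PC2) — the sales-revenue term together with the horizon, assignment and (now linear) demand constraints — with the quadrature scenarios fixed in advance. It uses the PuLP modelling library for brevity; it is not the authors' GAMS/CPLEX implementation, and the inventory, changeover and penalty terms are omitted (all names are illustrative):

    import pulp

    def build_pc2_fragment(NP, NT, theta, weight, p, H):
        """theta[q][i]: fixed demand-rate scenarios; weight[q]: w_q * J(theta_q);
        p[i]: sale prices; H: horizon length."""
        Q = len(theta)
        m = pulp.LpProblem("PC2_fragment", pulp.LpMaximize)
        T = pulp.LpVariable.dicts("T", (range(NT), range(Q)), lowBound=0)
        D = pulp.LpVariable.dicts("D", (range(NP), range(NT), range(Q)), lowBound=0)
        y = pulp.LpVariable.dicts("y", (range(NP), range(NT)), cat="Binary")
        m += pulp.lpSum(weight[q]*p[i]*D[i][t][q]            # expected sales revenue
                        for i in range(NP) for t in range(NT) for q in range(Q))
        for q in range(Q):
            m += pulp.lpSum(T[t][q] for t in range(NT)) == H
            for i in range(NP):
                for t in range(NT):
                    m += D[i][t][q] <= theta[q][i]*T[t][q]   # linear once theta is fixed
        for t in range(NT):
            m += pulp.lpSum(y[i][t] for i in range(NP)) == 1
        return m

With θ fixed, every constraint above is linear, which is exactly why the full model collapses to an MILP.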

Illustrating Example
A small scheduling problem of a continuous multiproduct plant having one production line and a single stage is considered here, involving the production of 2 products over a horizon of 72 hours (discretized into 4 slots). The mathematical model (PC) is used to describe the scheduling problem, and the problem data are given in Table 8.5. The demands of both products are considered as uncertain parameters described by normal distribution functions of the form N(15, 5) and N(10, 3) for products 1 and 2, respectively. Five quadrature points are used for each uncertain parameter. The scheduling model in (PC2) consists of 802 constraints with 841 variables (40 binary variables). Using GAMS/CPLEX for the solution of problem (PC2) requires 1.3 CPU s to determine the optimal schedule with an expected profit

    Product   r_i (kg/hr)   I_i0 (kg)   Q_ij (hr)   c_i^I ($/kg/hr)   c_ij^tr ($)
    1         40            50          0.2         0.07              10
    2         18            50          0.3         0.075             5

    Table 8.5  Data for Example

of 1.1E+4 ($), which corresponds to the following production sequence: pro-


duction of product 2 at time slots 1,3 and production of product 1 at time slots
2, 4. Notice that the optimal schedule does not change in order to meet vary-
ing production requirements, i.e. it corresponds to a robust schedule capable
of meeting demand variations within the ranges [0, 30] and [1, 19] for products
1 and 2, respectively. Production times on the other hand change according
to demand variations (see Table 8.6 for production times at different demand
rates).

    Demand rate (kg/hr)    Production times (h)
                           Period 1   Period 2   Period 3   Period 4
    (15, 10)               6.665      0.33       0.3        64.7
    (23.077, 14.846)       4.289      0.473      0.3        68.3

    Table 8.6  Results of Example

8.3.2 Single Production line - multistage

Figure 8.6 Multiproduct Continuous plant with several stages

In this section we consider the additional complications arising in the stochastic
scheduling problem from the introduction of several stages that are interconnected
by intermediate inventory tanks and the consideration of cyclic schedules,
as discussed in Pinto and Grossmann (1994) (see Figure 8.6). The following

notation is adopted in order to mathematically describe this scheduling problem:

Index sets:

i, j      product sets, i=1,..,Np, j=1,..,Np
k         time slots, k=1,..,Np
m         stages, m=1,..,M

Parameters:

γp_im     processing rate of product i at stage m
a_im      mass balance coefficient for product i at stage m
p_i       sale price of product i
τ_ijm     transition time from product i to product j at stage m
c_im^I    inventory cost for product i at stage m
c_ij^tr   transition cost between product i and product j
U_im, U_ikm^1, U_ikm^2, U_ikm^3   large numbers (used in the problem reformulation)

Variables:

IP_ikm    inventory level of product i in slot k between stages m and m+1
I0_km, I1_km   break points for the inventory level between stages m and m+1
I2_km, I3_km   break points for the inventory level between stages m and m+1
Imax_km   maximum inventory level at slot k between stages m and m+1
a_km      mass balance coefficient at stage m in slot k
Ts_km     start time of stage m at slot k
Tsp_ikm   start time of product i at stage m in slot k
Te_km     end time of stage m at slot k
Tep_ikm   end time of product i at stage m in slot k
Tp_km     processing time of stage m at slot k
Tpp_ikm   processing time of product i at stage m at slot k
Tc        cycle time
WP_ikm    amount produced of product i at stage m in slot k
y_ik      0-1 variable to denote the assignment of product i to slot k
z_ijk     0-1 variable to denote if a changeover from product i to j
          occurs at the end of time slot k
x_ikm^1, x_ikm^2, x_ikm^3   0-1 variables introduced to remove nondifferentiabilities
θ_i       continuous demand rate for product i
φ_ikm^1, φ_ikm^2, φ_ikm^3   variables introduced to model the inventory profiles

The scheduling problem is then formulated as follows:

Problem (PCM)

max_{y_ik, z_ijk}  E_{θ∈R} { max_{IP_ikm, Tc, WP_ikm, Tpp_ikm} { Σ_i Σ_k p_i WP_ikM − Σ_i Σ_k Σ_m c_im^I IP_ikm
    − (Σ_i Σ_k c_iM^I γp_iM Tpp_ikM − Σ_i Σ_k c_iM^I WP_ikM) } } − Σ_i Σ_j Σ_k c_ij^tr z_ijk     (8.16)
subject to:
WP_ikm = γp_im Tpp_ikm   ∀ i, k, m     (8.17)
WP_ikm = a_i,m+1 WP_ik,m+1   ∀ i, k, m     (8.18)
Tpp_ikm ≤ U_im y_ik   ∀ i, k, m     (8.19)
Tp_km = Σ_i Tpp_ikm   ∀ k, m     (8.20)
Tp_km = Te_km − Ts_km   ∀ k, m     (8.21)
Ts_k+1,m = Te_km + Σ_i Σ_j τ_ijm z_ij,k+1   ∀ k, m     (8.22)
Ts_11 = Σ_i Σ_j τ_ij1 z_ij1     (8.23)
Ts_km ≤ Ts_k,m+1   ∀ k, m     (8.24)
Te_km ≤ Te_k,m+1   ∀ k, m     (8.25)
Tc ≥ Σ_k (Tp_km + Σ_i Σ_j τ_ijm z_ijk)   ∀ m     (8.26)
Ts_km = Σ_i Tsp_ikm   ∀ k, m     (8.27)
Tsp_ikm ≤ U_im y_ik   ∀ i, k, m     (8.28)
Te_km = Σ_i Tep_ikm   ∀ k, m     (8.29)
Tep_ikm ≤ U_im y_ik   ∀ i, k, m     (8.30)
I1_km = Σ_i [γ_im (Tpp_ikm − φ_ikm^1)] + I0_km   ∀ k, m     (8.31)
∀ k, m     (8.32)
0 ≤ φ_ikm^1 ≤ U_ikm^1 x_ikm^1   ∀ k, m     (8.33)
I2_km = Σ_i [(γ_im − a_k,m+1 γ_i,m+1) φ_ikm^2] + I1_km   ∀ k, m     (8.34)
0 ≤ φ_ikm^2 − Tep_ikm + Tsp_ik,m+1 ≤ U_ikm^2 (1 − x_ikm^2)   ∀ k, m     (8.35)
0 ≤ φ_ikm^2 ≤ U_ikm^2 x_ikm^2   ∀ k, m     (8.36)
I3_km = −Σ_i [a_k,m+1 γ_i,m+1 (Tpp_ik,m+1 − φ_ikm^3)] + I2_km   ∀ k, m     (8.37)
0 ≤ φ_ikm^3 − Tpp_ik,m+1 + Tep_ik,m+1 − Tep_ikm ≤ U_ikm^3 (1 − x_ikm^3)   ∀ k, m     (8.38)
0 ≤ φ_ikm^3 ≤ U_ikm^3 x_ikm^3   ∀ k, m     (8.39)
I0_km = I3_km   ∀ k, m     (8.40)
0 ≤ I1_km ≤ Imax_km   ∀ k, m     (8.41)
0 ≤ I2_km ≤ Imax_km   ∀ k, m     (8.42)
0 ≤ I3_km ≤ Imax_km   ∀ k, m     (8.43)
Imax_km = Σ_i IP_ikm   ∀ k, m     (8.44)
IP_ikm ≤ U_im y_ik   ∀ i, k, m     (8.45)
Σ_i y_ik = 1   ∀ k     (8.46)
Σ_k y_ik = 1   ∀ i     (8.47)
z_ijk ≥ y_ik + y_j,k+1 − 1   ∀ i, j, k     (8.48)
Σ_k WP_ikM ≤ θ_i Tc   ∀ i     (8.49)

Mass balances and amounts produced are considered in equations 8.17-8.20.
Equations 8.21-8.26 represent the timing constraints whereas equations 8.27-8.30
are introduced to serve the linearization technique based on variable aggregation.
Inventory levels are represented through equations 8.31-8.40 after
the introduction of 0-1 variables to remove nondifferentiabilities (see Pinto and
Grossmann, 1994 for details). These values are bounded in equations 8.41-8.45.
Equations 8.46-8.48 correspond to assignment constraints whereas equation
8.49 states the demand constraint. Finally, the objective function corresponds
to the maximization of expected profit over the cycle time Tc, represented by
the difference between the revenue due to product sales and the total cost
consisting of inventory and transition costs.

Based on the relaxation of the demand constraint (equation 8.49), the derived
feasibility property (see Appendix C) and the use of the Gaussian Quadrature
formula to evaluate the expected profit, the stochastic formulation in (PCM)
can be recast as the following MILP reformulation for the identification of a
robust schedule (y_ik, z_ijk) able to meet the uncertain demand requirements.

Problem (PCM1)

max  Σ_q w^q [ Σ_i Σ_k p_i WP_ikM^q − Σ_i Σ_k Σ_m c_im^I IP_ikm^q
    − (Σ_i Σ_k c_iM^I γp_iM Tpp_ikM^q − Σ_i Σ_k c_iM^I WP_ikM^q) ] − Σ_i Σ_j Σ_k c_ij^tr z_ijk

subject to:
WP_ikm^q = γp_im Tpp_ikm^q   ∀ i, k, m, q
WP_ikm^q = a_i,m+1 WP_ik,m+1^q   ∀ i, k, m, q
Tpp_ikm^q ≤ U_im y_ik   ∀ i, k, m, q
Tp_km^q = Σ_i Tpp_ikm^q   ∀ k, m, q
Tp_km^q = Te_km^q − Ts_km^q   ∀ k, m, q
Ts_k+1,m^q = Te_km^q + Σ_i Σ_j τ_ijm z_ij,k+1   ∀ k, m, q
Ts_11^q = Σ_i Σ_j τ_ij1 z_ij1   ∀ q
Ts_km^q ≤ Ts_k,m+1^q   ∀ k, m, q
Te_km^q ≤ Te_k,m+1^q   ∀ k, m, q
Tc^q ≥ Σ_k (Tp_km^q + Σ_i Σ_j τ_ijm z_ijk)   ∀ m, q
Ts_km^q = Σ_i Tsp_ikm^q   ∀ k, m, q
Tsp_ikm^q ≤ U_im y_ik   ∀ i, k, m, q
Te_km^q = Σ_i Tep_ikm^q   ∀ k, m, q
Tep_ikm^q ≤ U_im y_ik   ∀ i, k, m, q
I1_km^q = Σ_i [γ_im (Tpp_ikm^q − φ_ikm^1q)] + I0_km^q   ∀ k, m, q
∀ k, m, q
0 ≤ φ_ikm^1q ≤ U_ikm^1 x_ikm^1q   ∀ k, m, q
I2_km^q = Σ_i [(γ_im − a_k,m+1 γ_i,m+1) φ_ikm^2q] + I1_km^q   ∀ k, m, q
0 ≤ φ_ikm^2q − Tep_ikm^q + Tsp_ik,m+1^q ≤ U_ikm^2 (1 − x_ikm^2q)   ∀ k, m, q
0 ≤ φ_ikm^2q ≤ U_ikm^2 x_ikm^2q   ∀ k, m, q
I3_km^q = −Σ_i [a_k,m+1 γ_i,m+1 (Tpp_ik,m+1^q − φ_ikm^3q)] + I2_km^q   ∀ k, m, q
0 ≤ φ_ikm^3q − Tpp_ik,m+1^q + Tep_ik,m+1^q − Tep_ikm^q ≤ U_ikm^3 (1 − x_ikm^3q)   ∀ k, m, q
0 ≤ φ_ikm^3q ≤ U_ikm^3 x_ikm^3q   ∀ k, m, q
I0_km^q = I3_km^q   ∀ k, m, q
0 ≤ I1_km^q ≤ Imax_km^q   ∀ k, m, q
0 ≤ I2_km^q ≤ Imax_km^q   ∀ k, m, q
0 ≤ I3_km^q ≤ Imax_km^q   ∀ k, m, q
Imax_km^q = Σ_i IP_ikm^q   ∀ k, m, q
IP_ikm^q ≤ U_im y_ik   ∀ i, k, m, q
Σ_i y_ik = 1   ∀ k
Σ_k y_ik = 1   ∀ i
z_ijk ≥ y_ik + y_j,k+1 − 1   ∀ i, j, k
Σ_k WP_ikM^q ≤ θ_i^q Tc^q   ∀ i, q

Notice that a penalty term γ Σ_q Σ_i Σ_k p_i (θ_i^q Tc^q − WP_ikM^q) can also be
incorporated in the above formulation to penalize partial demand satisfaction and
control customer order fulfilment.

Illustrating Example
The example considered here is a small scheduling problem of a continuous
multiproduct plant involving the production of 3 products (A, B and C) with
one production line and two stages, shown in Figure 8.7 (Pinto and Grossmann,
1994). The problem data are given in Tables 8.7 and 8.8. The demands of products

Figure 8.7  Multiproduct Continuous plant - example

A and B are considered as uncertain parameters described by normal distri-


bution functions of the form N(50,10) and N(10,2.5) for products A and B,
respectively. Five quadrature points are used for each uncertain parameter.

    Product   p_i ($/ton)   Stage 1                        Stage 2
                            γp_i (kg/h)   c_im^I ($/ton)   γp_i (kg/h)   c_im^I ($/ton)
    A         800           1200          50               600           500
    B         150           800           50               900           500
    C         1100          1000          50               1100          500

    Table 8.7  Data for Example

    Transition times (h)
                 Stage 1           Stage 2
    Product      A    B    C       A    B    C
    A            -    10   3       -    7    3
    B            3    -    6       3    -    10
    C            8    3    -       4    0    -

    Table 8.8  Data for Example

The scheduling model (PCM1) consists of 5674 constraints with 4587 variables
(736 binary variables). Using GAMS/CPLEX for the solution of problem (PCM1)
requires approximately 10 CPU min to determine the optimal schedule with
an expected profit of 6704 ($) that corresponds to the following production

sequence: B → C → A. Notice that the optimal schedule does not change in
order to meet production requirements, corresponding in this way to a robust
schedule capable of meeting demand variations within the ranges [20, 80] and
[2.5, 17.5] for products A and B, respectively, and a demand of 100 for product
C. Production times on the other hand change according to demand variations,
as shown in Table 8.9 and Figure 8.8 for different demand values.

Demand (50, 10, 100)


Product Stage 1 Stage 2
Tp Ts Te Tp Ts Te
A 1.3 28.3 29.6 2.5 28.3 30.8
B 0.3 10 10.4 0.4 23.3 23.6
C 2.9 17.4 20.3 2.7 24.6 27.3
Tc=29.6
Demand (66.2, 14, 100)
Product Stage 1 Stage 2
Tp Ts Te Tp Ts Te
A 1.7 28.5 30.2 3.3 28.5 31.8
B 0.5 10 10.5 0.5 23.3 23.8
C 3.0 17.4 20.5 2.7 24.8 27.5
Tc=30.2

Table 8.9 Results of Example

8.4 MULTIPRODUCT/MULTIPURPOSE
BATCH PLANT DESIGN &
SCHEDULING UNDER
UNCERTAINTY
For the problem of designing and scheduling batch plants under uncertainty,
Reinhart and Rippin (1986,1987) and later Fichtner et al. (1990) presented
a number of model variants and solution procedures (scenario-based, penalty
functions, two-stage approach) for the problem of multiproduct batch plant
design with uncertain demands assuming one piece of equipment per stage.
Wellons and Reklaitis (1989) considered staged plant expansion over time to
account for uncertainty in product demand; they also suggested a distinction
between "hard" and "soft" constraints, introducing penalty terms for the latter
type. Straub and Grossmann (1992) considered uncertainties in both product

Figure 8.8  Gantt charts for different demand values



demands and equipment availability, and presented a framework to maximize


an expected profit by considering separately economic optimality and design
feasibility. Shah and Pantelides (1992) presented a scenario-based approach and
an approximate solution strategy for the design of multipurpose batch plants
considering different schedules for different sets of production requirements. Rot-
stein et al. (1994) presented an MILP formulation for the evaluation of a lower
bound of stochastic flexibility of multipurpose batch plants based on the as-
sumption of independent uncertain parameters. Subrahmanyam et al. (1994)
presented a scenario-based approach and a decomposition solution strategy in
which batch plant scheduling aspects are simultaneously considered in design
optimization. In this section, we present a unified approach to address the
problem of design/scheduling of multiproduct batch plant considering uncer-
tain product demands and process parameters.

8.4.1 Problem Definition


We consider here the simplified model of a multiproduct batch plant design
(similar to Reinhart and Rippin and Straub and Grossmann) for the production
of N products (in single campaigns without intermediate storage) in M
stages (comprising N_j identical pieces of batch equipment of size V_j, j=1,..,M),
involving uncertainty in (i) processing times and size factors, reflecting process
variability and model inaccuracy, and (ii) product demands, reflecting changing
market conditions and/or variations of forecasted customer orders (the proposed
approach, however, is general to deal with other batch operating models;
see Ierapetritou and Pistikopoulos, 1995). The resulting stochastic two-stage
mathematical model for this case is as follows:

Problem (PB)

max_{V_j, N_j}  E_{θ∈R(V_j,N_j)} Σ_p w_p Σ_i p_i Q_i^p − δ Σ_j N_j α_j V_j^β_j
                − γ E_{θ∈R(V_j,N_j)} Σ_p w_p (Σ_i p_i θ_i − Σ_i p_i Q_i^p)

s.t.  Σ_i (Q_i^p / B_i^p) T_Li^p ≤ H   ∀ p

      T_Li^p ≥ t_ij^p / N_j   ∀ i, j, p

      B_i^p ≤ V_j / S_ij^p   ∀ i, j, p

      Q_i^p ≤ θ_i   ∀ i, p

      θ_i ∈ R(V_j, N_j)

where p=1,..,P is the set of scenarios used to describe the variation of the process
parameters t_ij, S_ij; θ is the vector of uncertain product demands; R(V_j, N_j) is
the feasible region of the design (V_j, N_j), i.e. R(V_j, N_j) = {θ | ∀ θ ∈ R ∃ Q_i, T_Li
satisfying the constraints of problem (PB)}. In the above formulation, the
first set of constraints corresponds to the horizon constraint, the second set are
the timing constraints and the third denotes the batch size constraints; finally,
the last set represents the relaxed demand constraint.

Note that in (PB) the evaluation of the expected profit should be performed
within the feasible region of the batch plant, R(V_j, N_j). However, the establishment
of the feasibility property in this case, as described in Appendix D, removes
this need (and thereby the bilevel nature of the problem); i.e. using for example
a Gaussian Quadrature formula to evaluate the profit expectancy, problem
(PB) can be rewritten as follows:

Problem (PB1)

max_{V_j, N_j, Q_i^qp}  Σ_q w^q Σ_p w_p Σ_i p_i Q_i^qp − δ Σ_j N_j α_j V_j^β_j
                        − γ Σ_q w^q Σ_p w_p Σ_i p_i (θ_i^q − Q_i^qp)

s.t.  Σ_i (Q_i^qp / B_i^p) T_Li^p ≤ H   ∀ q, p

      T_Li^p ≥ t_ij^p / N_j   ∀ i, j, p

      B_i^p ≤ V_j / S_ij^p   ∀ i, j, p

      Q_i^qp ≤ θ_i^q   ∀ i, q, p

      V_j^LO ≤ V_j ≤ V_j^UP,   θ^q ∈ T(θ),   γ ≥ 0

Nevertheless, problem (PB1) still corresponds to a nonconvex nonlinear mixed-integer
optimization problem due to the presence of the nonconvex objective
function (investment cost) and the horizon constraint. In the following sections
we propose global optimization solution approaches for problem (PB1) for the
cases of continuous and discrete equipment sizes.

8.4.2 Continuous Equipment Sizes


The introduction of the exponential transformations of Kocis and Grossmann
(1988), leads to the following single nonlinear optimization problem involving
non convexities in the horizon constraint:

Problem (PB2)

max_{v_j, b_i, y_jr, t_Li, Q_i^q}  −δ Σ_{j=1..M} α_j exp(Σ_r y_jr ln r + β_j v_j) + Σ_{q=1..Q} w^q Σ_{i=1..N} p_i Q_i^q
                                   − γ Σ_{q=1..Q} w^q (Σ_{i=1..N} p_i θ_i^q − Σ_{i=1..N} p_i Q_i^q)

subject to:  v_j ≥ ln(S_ij) + b_i ;   i = 1,..,N, j = 1,..,M

      t_Li ≥ ln(t_ij) − Σ_r y_jr ln r ;   i = 1,..,N, j = 1,..,M

      Σ_{i=1..N} Q_i^q exp(t_Li − b_i) ≤ H ;   q = 1,..,Q

      Q_i^q ≤ θ_i^q ;   i = 1,..,N, q = 1,..,Q

      ln(V_j^LO) ≤ v_j ≤ ln(V_j^UP)

      y_jr ∈ {0,1},   θ^q ∈ T(θ),   γ ≥ 0

Floudas and Visweswaran (1990, 1993) have presented a decomposition-based
global optimization algorithm (GOP algorithm) to address this type of problem
involving biconvex constraints. Based on the following variable partition:

y = {v_j, b_i, y_jr, t_Li},   x = {Q_i^q},   j=1,..,M, i=1,..,N, q=1,..,Q


problem (PB2) satisfies Conditions (A) of the GOP algorithm, since both the
objective and the constraints are convex in {v_j, b_i, y_jr, t_Li} for every fixed Q_i^q
and linear in Q_i^q for each fixed {v_j, b_i, y_jr, t_Li}. Similar partitions can be
derived for other design and scheduling formulations.

Although the GOP algorithm can in principle be directly applied for the solution of
problem (PB2) to global optimality, it would require a prohibitively high
computational effort (2^{N×Q} subproblems per iteration). However, by exploiting the
special structure of the batch plant design model in (PB2), a number of properties
can be established (see Ierapetritou and Pistikopoulos, 1995 for details)
with which the number of relaxed dual subproblems that have to be solved per
iteration can be reduced by several orders of magnitude (scaling only with the
number of products).

Properties of multiproduct batch plant design problem

(i) The qualifying constraints (gradients of the Lagrange function with respect
    to the "connected" variables Q_i^q) to be added along with the Lagrange
    function in the relaxed dual problem are only functions of T_Li, B_i. As a
    result the number of required relaxed dual subproblems per iteration
    is reduced from 2^{N×Q} to only 2^N, a reduction of many orders of
    magnitude (2^50 versus 2^2) even for two uncertain parameters with five
    quadrature points each!

(ii) If at the kth iteration the qualifying constraint for product i is ≤ 0 (or
    ≥ 0) for every other product, i.e. μ^qk [exp(t_Li − b_i) − exp(t_Li^k − b_i^k)] ≤
    0 (or ≥ 0) ∀ i = 1,..,N, implying that T_Li/B_i ≥ T_Li^k/B_i^k (or T_Li/B_i ≤
    T_Li^k/B_i^k) ∀ i = 1,..,N (this is true when T_Li^k/B_i^k corresponds to the lower (or
    upper) bound of T_Li/B_i), then the solution of the relaxed dual
    with the qualifying constraints μ^qk [exp(t_Li − b_i) − exp(t_Li^k − b_i^k)] ≥
    0 (or ≤ 0) can be effectively avoided.

(iii) If at the kth iteration the qualifying constraint of product i, μ^qk [exp(t_Li −
    b_i) − exp(t_Li^k − b_i^k)] = 0 ∀ i = 1,..,N, which is satisfied when μ^qk = 0,
    ∀ q = 1,..,Q, then it is sufficient to solve only one RD problem at either
    the lower or upper bounds of the Q_i^q variables.

Based on the properties described above, the following modified global optimization
algorithm is proposed for the solution of problem (PB2):

Step 1  Select an initial design V_j, N_j. Set K=1, the lower bound EP_L =
        −∞, the upper bound EP_U = +∞ and select a tolerance ε.

Step 2  Solve the primal problem to obtain the expected profit EP and
        the required dual information. Update the upper bound EP_U =
        min{EP, EP_U}.

Step 3  Construct and solve the required relaxed dual problems (at most
        2^N) that correspond to different bounds of the Q_i^q variables for each
        product i and store the obtained solutions.

Step 4  Select as a new lower bound EP_L the lowest value of the stored
        solutions of the RD problems; set as the new design V_j^K, N_j^K the
        corresponding design variables.

Step 5  Check for convergence: if EP_U ≤ EP_L + ε, stop; V_j^K, N_j^K is
        the global optimal design. Otherwise, set K=K+1 and return to
        Step 2.
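The flow of these steps can be summarized in the following skeleton, an illustrative
Python sketch only: solve_primal and solve_relaxed_duals are placeholders for the
NLP primal and (at most 2^N) relaxed dual subproblem solvers, e.g. GAMS calls, and
are assumptions of this illustration rather than part of the original implementation.

    def modified_gop(design, eps=2e-4, max_iter=100):
        # Skeleton of Steps 1-5: upper bound from the primal, lower bound
        # from the lowest stored relaxed dual solution (as in the text).
        EP_L, EP_U = float("-inf"), float("inf")      # Step 1
        stored = []                                   # pool of (objective, design)
        for _ in range(max_iter):
            EP, duals = solve_primal(design)          # Step 2
            EP_U = min(EP_U, EP)
            stored += solve_relaxed_duals(design, duals)   # Step 3
            obj, cand = min(stored, key=lambda s: s[0])    # Step 4
            stored.remove((obj, cand))                # selected candidate not revisited
            EP_L, design = obj, cand
            if EP_U <= EP_L + eps:                    # Step 5
                break
        return design, EP_U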

8.4.3 Discrete equipment sizes


For the case of discrete equipment sizes, nonconvexities can be effecti\"el~·
avoided based OIl the reformulations proposed by Voudouris and Grossmanll
(1992):

I if unit at stage j has size s


{
Yjs = 0 otherwise

where l'i is restricted to take values from the set SVj = {Vj1' ... , Vjs}. In this
way, problem (PBl) can be recasted as follows:
Problem (PB3)

max  Σ_q w^q Σ_p w_p Σ_i p_i Q_i^qp − δ Σ_j Σ_s Σ_n n α_j V_js^β_j y_jsn
     − γ Σ_q w^q Σ_p w_p (Σ_i p_i θ_i^q − Σ_i p_i Q_i^qp)

s.t.  n_i^qp ≥ Σ_s Σ_n (Q_i^qp S_ij^p / V_js) y_jsn   ∀ i, j, p, q

      Q_i^qp = Σ_s Σ_n w_ijsn^qp   ∀ i, j, p, q

      w_ijsn^qp ≤ U_ijsn y_jsn   ∀ i, j, s, n, p, q

      Σ_i T_i^qp ≤ H   ∀ q, p

Problem (PB3) is a mixed-integer linear programming problem for which conventional
MILP tools (such as SCICONIC, CPLEX) can be used for its solution.
Note that in this case the structure of the deterministic formulation is fully
preserved despite the use of general continuous probability distribution forms for
the description of the uncertain product demands.
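The reason the bilinear products Q_i^qp y_jsn can be written away exactly is the
standard linearization for the product of a continuous and a 0-1 variable (a general
identity, stated here for completeness rather than taken from the text above): with
U a valid upper bound on Q,

    w = Q y   if and only if   w ≤ U y,   w ≤ Q,   w ≥ Q − U(1 − y),   w ≥ 0,

so replacing each product by a variable w_ijsn^qp subject to these constraints
preserves the feasible set, and (PB3) is an exact MILP rather than a relaxation.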

8.4.4 Illustrating Example


Figure 8.9  Batch Plant


Consider the batch plant design of Figure 8.9 involving two products to be
processed in three stages with one unit per stage. Size factors, processing
times and cost data are given in Table 8.10. The demands of both products are
considered as uncertain parameters described by normal distribution functions
of the form N(200,10) and N(100,10) for products 1 and 2, respectively. Five
quadrature points are used for each uncertain parameter.

    (a) Size factors                     (b) Processing times
             Stage                                Stage
    Product  1    2    3                 Product  1    2    3
    1        2    3    4                 1        8    20   8
    2        4    6    3                 2        16   4    4

    (c) Investment cost coefficients     (d) Prices of products
    Stage j  α_j  β_j                    Product  p_i
    1        5    0.6                    1        5.5
    2        5    0.6                    2        7.0
    3        5    0.6

    Table 8.10  Data for Example Problem

Using GAMS/MINOS for the solution of problem (PB1) (without considering
any penalty term, γ=0) results in different solutions if different starting points
are considered. For example, considering (V1, V2, V3) = (1000, 1000, 1000) as a
starting point, the design (V1, V2, V3) = (800, 1200, 600) with expected profit
equal to 87.1 units is obtained, whereas if (V1, V2, V3) = (4500, 4500, 4500) is
used as a starting point a different design (V1, V2, V3) = (1800, 2700, 3600) with
a larger expected profit of 298.5 units is determined.
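For a fixed design and fixed demand realization the inner (primal) problem is just
a small LP in the productions Q_i, which makes such candidate designs easy to
evaluate. Below is a minimal Python sketch for the data of Table 8.10 (one unit
per stage, so N_j = 1); the horizon value H and the use of scipy are assumptions
of this illustration, since H is not listed in this excerpt.

    import numpy as np
    from scipy.optimize import linprog

    S = np.array([[2.0, 3.0, 4.0],       # size factors S_ij (Table 8.10)
                  [4.0, 6.0, 3.0]])
    t = np.array([[8.0, 20.0, 8.0],      # processing times t_ij
                  [16.0, 4.0, 4.0]])
    p = np.array([5.5, 7.0])             # product prices p_i
    H = 6000.0                           # horizon (assumed value)

    def revenue(V, theta):
        # with one unit per stage: B_i = min_j V_j/S_ij, T_Li = max_j t_ij
        B = (V / S).min(axis=1)
        TL = t.max(axis=1)
        # max p'Q  s.t.  sum_i (T_Li/B_i) Q_i <= H,  0 <= Q_i <= theta_i
        res = linprog(-p, A_ub=[TL / B], b_ub=[H],
                      bounds=list(zip([0.0, 0.0], theta)))
        return -res.fun, res.x

    rev, Q = revenue(np.array([1800.0, 2700.0, 3600.0]), np.array([200.0, 100.0]))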

For comparison, the steps of the modified GOP algorithm, as outlined in the
previous section and described in detail in Appendix E, are applied for the
solution of the same problem. The results are summarized in Table 8.11.

The following points should be highlighted: (a) orders of magnitude reductions
of the required relaxed dual problems are achieved by applying the derived
properties, (b) convergence of the algorithm does not depend on different starting
points, (c) the effect of the penalty coefficient γ: the larger the value of γ the more
"conservative" the design, (d) the slow convergence of the algorithm to yield

                 Number of RD problems       Upper    Lower      Design
                 Without      With           bound    bound      (V1, V2, V3)
                 properties   properties
    iteration 1  2^50         1              146.7    -1394.5    (500, 500, 500)
    iteration 2  2^50         1              146.7    -569.2     (883.7, 1325.5, 1767.4)
    iteration 3  2^50         4 (2^2)        20.2     -323       (500, 703.4, 937.8)

    Optimal design (1800, 2700, 3600):   ε = 0.016, 8 iterations;   ε = 0.0002, 14 iterations

    Different penalty values                 Different starting points
    γ value   Optimal design (V1, V2, V3)    Starting design (V1, V2, V3)   Iterations   CPU s per iteration
    γ = 0     (1800, 2700, 3600)             (1000, 1000, 1000)             13           0.8
    γ = 4     (1907, 2861, 3815)             (4500, 4500, 4500)             14           0.8
    γ = 8     (1972, 2958, 3944)             (500, 500, 500)                15           0.8

    Table 8.11  Results of Global Optimization Algorithm

highly accurate results (with an optimality stopping criterion of ε = 2 × 10^-4,
the algorithm takes almost twice as many iterations compared to the solution
with ε = 0.016), (e) increasing the number of quadrature points per θ does not
increase the number of relaxed dual problems that have to be solved per iteration,
e.g. for a 9×9 grid again the maximum number of relaxed dual problems per iteration
is 4; yet, since the primal problem is of much larger size in this case, its CPU
solution time increases (0.25 s versus 0.1 s required for the 5×5 grid).

Assuming that the equipment is only available from the following set of discrete
sizes {1200, 2200, 3200, 4200}, problem (PB3) is solved for the following
cases: (i) considering only short-term demand variation, i.e. 160 ≤ demand of
product 1 ≤ 240, 60 ≤ demand of product 2 ≤ 140, (ii) accounting also for
long-range variations, i.e. 410 ≤ demand of product 1 ≤ 490, 310 ≤ demand
of product 2 ≤ 390, during a second period, and (iii) considering uncertain
processing times and size factors through a multiperiod formulation with three
time periods as shown in Table 8.12. The results are summarized in Table 8.13.

Finally, more involved scheduling models concerning plant operation in multiple
product campaigns with zero wait (ZW) or unlimited intermediate storage
(UIS) are considered, where the possibility of having more than one unit per
stage is also incorporated. The results are summarized in Table 8.14. Note

Periods Product tij Sij


Stage 1 Stage 2 Stage 3 Stage 1 Stage 2 Stage 3
1 1 7 19 7 2.5 3.5 4.5
2 15 3 3 4.5 6.5 3.5
2 1 9 21 9 1.5 2.5 3.5
2 17 5 5 3.5 5.5 2.5
3 1 8 20 8 2 3 4
2 16 4 4 4 6 3

Table 8.12 Uncertain process parameters

    Case   Design (V1, V2, V3)
    (i)    (3200(1), 3200(1), 3200(1))
    (ii)   (3200(2), 3200(2), 3200(2))
    (iii)  (4200(1), 4200(1), 4200(1))
    (1) one unit per stage
    (2) two units per stage

    Table 8.13  Results for discrete equipment sizes



that the consideration of more detailed scheduling models leads to more optimistic
plants due to better utilization of the processing equipment; in this case
the best utilization of the proposed plant is achieved by changing the schedule
patterns to follow the demand realization.

                      Single Product   Multiple Product Campaign
                      Campaign         Zero Wait (ZW)        Unlimited Intermediate
                                       (No clean-up times)   Storage (UIS)
    Stage             units  size      units  size           units  size
    J1                3      1200      3      800            2      1200
    J2                3      1200      3      800            2      1200
    J3                3      1200      3      800            2      1200
    Capital Cost:     7.9E+4           6.3E+4                5.3E+4
    Expected Profit:  1.2E+5           1.4E+5                1.5E+5

    Table 8.14  Results for detailed scheduling models

8.5 CONCLUSIONS
We have presented stochastic models and algorithmic methods to determine the
global optimum solution for the problems of planning, scheduling and design of
continuous and batch plants for the case when some of the model parameters
are stochastic variables (described by any probability distribution function)
based on model reformulation and decomposition principles. In particular,
for production and capacity planning problems a decomposition-based global
optimization approach is developed to obtain the plan with the maximum ex-
pected profit by simultaneously considering future feasibility. The relaxation
of demand requirement enables the consideration of partial order fulfilment
while properly penalizing unfilled orders in the objective function. Based on
the relaxation of demand constraint, for the problem of scheduling of con-
tinuous plants, it was shown that the structure of the deterministic problem
is fully preserved enabling the determination of a robust schedule capable of
meeting stochastic demands. Finally, for the problem of design/scheduling of
multiproduct batch plant when uncertain demand and process parameters are
considered, global solution procedures were derived for the cases of continuous

and discrete equipment sizes by exploiting the special structure of the resulting
stochastic models.

Acknowledgements: Financial support from EPSRC (IRC grant) and the


Commission of European Communities (grant ERB CRC CT93 0484) is grate-
fully acknowledged.

APPENDIX A

Evaluation of the critical value of γ

Consider the following planning problem for the case of specific values of demand:

max_{x1, x2, S2}  c1 x1 + c2 x2 + c2s S2 − γ p2 (θ − S2)

s.t.  A1 x1 ≤ b1

      B1 x1 + B2 x2 + B2s S2 ≤ b2

      S2 ≤ θ
The KKT optimality conditions of this problem are:

−c2 + B2^T λ = 0     (A.1)

−(c2s + γ p2) + B2s^T λ + μ = 0     (A.2)

where λ is the vector of Lagrange multipliers for the constraints B1 x1 + B2 x2 +
B2s S2 ≤ b2, and μ is the vector of the Lagrange multipliers for the demand
constraints.
Consider now the multiperiod formulation of the same problem:

max_{x1, x2, S2}  c1 x1 + c2 x2 + c2s S2

s.t.  A1 x1 ≤ b1

      B1 x1 + B2 x2 + B2s S2 ≤ b2

      S2 = θ

The KKT optimality conditions of this problem are:

−c2 + B2^T λ' = 0     (A.3)

−c2s + B2s^T λ' + μ' = 0     (A.4)

where λ' is the vector of Lagrange multipliers for the constraints B1 x1 + B2 x2 +
B2s S2 ≤ b2 and μ' is the vector of the Lagrange multipliers for the demand
constraints.
By comparing (A.1), (A.3) we find that λ' = λ. Based on this result and by
comparing (A.2) with (A.4) we obtain μ = μ' + γ p2; hence the critical value of γ
(γc), above which both formulations share the same optimal solution (μ ≥ 0), is
γc = max_i (−μ'_i / p2,i).

APPENDIX B

Feasibility property for production & capacity


planning
Consider the production and capacity planning problem described by model
(P). Then the following feasibility property holds:

Property- Any first stage decision vector corresponding to the production plan and
capacity policy, i.e. x_1 = (x_jt, y_jt, CE_jt, S_it, P_it), t = 1,..,T1, except for y_jt, CE_jt
where t = 1,..,T1+T2, which satisfies problem constraints 8.2-8.8, is feasible within
T(θ).

Proof- For ease of representation, the proof is shown here for a plant of
two processes used for the production of two products within a time horizon
discretized into two time periods.

For a fixed first period production plan (x_j1, S_i1, P_i1), i = 1,2, j = 1,2, and
capacity expansion policy y_jt, CE_jt, t = 1,2, j = 1,2, the feasibility test problem
has the following form after the elimination of the inventory I_it and capacity
variables:

ψ(x1) = min_{u, x_j2, S_i2, P_i2} u

subject to:
S12 − θ1 ≤ u     (B.1)
−S12 ≤ u     (B.2)
S22 − θ2 ≤ u     (B.3)
−S22 ≤ u     (B.4)
P12 − a12^U ≤ u     (B.5)
−P12 + a12^L ≤ u     (B.6)
P22 − a22^U ≤ u     (B.7)
−P22 + a22^L ≤ u     (B.8)
I10 + P11 − S11 + P12 − S12 + (b11 x11 + b12 x21) + (b11 x12 + b12 x22) − I^max ≤ u     (B.9)
−I10 − P11 + S11 − P12 + S12 − (b11 x11 + b12 x21) − (b11 x12 + b12 x22) + I^min ≤ u     (B.10)
I20 + P21 − S21 + P22 − S22 + (b21 x11 + b22 x21) + (b21 x12 + b22 x22) − I^max ≤ u     (B.11)
−I20 − P21 + S21 − P22 + S22 − (b21 x11 + b22 x21) − (b21 x12 + b22 x22) + I^min ≤ u     (B.12)

Thus the feasibility function ψ(x1) has the following form:

ψ(x1) = λ1^1(S12 − θ1) + λ1^2(−S12) + λ2^1(S22 − θ2) + λ2^2(−S22)
 + μ1^1(P12 − a12^U) + μ1^2(−P12 + a12^L) + μ2^1(P22 − a22^U) + μ2^2(−P22 + a22^L)
 + η1^1(I10 + P11 − S11 + P12 − S12 + (b11 x11 + b12 x21) + (b11 x12 + b12 x22) − I^max)
 + η1^2(−I10 − P11 + S11 − P12 + S12 − (b11 x11 + b12 x21) − (b11 x12 + b12 x22) + I^min)
 + η2^1(I20 + P21 − S21 + P22 − S22 + (b21 x11 + b22 x21) + (b21 x12 + b22 x22) − I^max)
 + η2^2(−I20 − P21 + S21 − P22 + S22 − (b21 x11 + b22 x21) − (b21 x12 + b22 x22) + I^min)

where:

λ1^1, λ1^2 are the Lagrange multipliers of constraints (B.1) and (B.2), respectively;
λ2^1, λ2^2 are the Lagrange multipliers of constraints (B.3) and (B.4), respectively;
μ1^1, μ1^2 are the Lagrange multipliers of constraints (B.5) and (B.6), respectively;
μ2^1, μ2^2 are the Lagrange multipliers of constraints (B.7) and (B.8), respectively;
η1^1, η1^2 are the Lagrange multipliers of constraints (B.9) and (B.10), respectively;
η2^1, η2^2 are the Lagrange multipliers of constraints (B.11) and (B.12), respectively.

The KKT conditions of the feasibility problem are:

λ1^1 − λ1^2 − η1^1 + η1^2 = 0
λ2^1 − λ2^2 − η2^1 + η2^2 = 0
μ1^1 − μ1^2 + η1^1 − η1^2 = 0
μ2^1 − μ2^2 + η2^1 − η2^2 = 0

Based on the above optimality conditions the feasibility function can be rewritten
in the following form:

ψ(x1) = λ1^1(−θ1) + λ2^1(−θ2)
 + μ1^1(a12^L − a12^U) + μ2^1(a22^L − a22^U)
 + η1^1(I10 + P11 − S11 + (b11 x11 + b12 x21) + (b11 x12 + b12 x22) − I^max + a12^U)
 + η1^2(−I10 − P11 + S11 − (b11 x11 + b12 x21) − (b11 x12 + b12 x22) + I^min − a12^L)
 + η2^1(I20 + P21 − S21 + (b21 x11 + b22 x21) + (b21 x12 + b22 x22) − I^max + a22^U)
 + η2^2(−I20 − P21 + S21 − (b21 x11 + b22 x21) − (b21 x12 + b22 x22) + I^min − a22^L)

Based on the KKT optimality conditions the following two different cases with
regard to potential active sets can be considered that capture all possible
combinations, namely the one that corresponds to satisfaction of demand for both
products and the one that corresponds to zero sales for both products. For
the first case, constraints (B.1), (B.2), (B.7), (B.8), (B.9) and (B.10) are active,
which results in u = 0; for the second case, constraints (B.3), (B.4), (B.5),
(B.6), (B.11) and (B.12) are active, resulting in a feasibility function which is
negative ∀ θ ∈ T(θ) (summation of negative terms).

APPENDIX C

Feasibility property for multiproduct


continuous plants
Consider the scheduling problem described by model (PC), i.e. the determination
of the sequencing of products and the amounts produced along with the production
times for the plant operating in continuous mode involving one stage with a
single production line. Then the following feasibility property holds:

Property- Any schedule (y_it, z_ijt) satisfying the constraints 8.9-8.15 for fixed
product demands θ_i, i=1,..,Np, is always feasible.

Proof- Let us consider the production of Np products within the time horizon
H consisting of NT time periods. For each product i the following inventory
constraints hold:

I_i1 = I_i0 + T_1^p r_i y_i1 − T_1 d_i
I_i2 = I_i1 + T_2^p r_i y_i2 − T_2 d_i
...
I_iNT = I_iNT-1 + T_NT^p r_i y_iNT − T_NT d_i
I_i0 = I_iNT

where T_t^p = T_t − T_t^idl − Σ_i Σ_j z_ijt Q_ij. We can eliminate the inventory variables
by summing the above equations over t for each product and then over all products
(using Σ_t T_t = H and Σ_i y_it = 1), which results in the following equation:

Σ_t T_t^p − H Σ_i (d_i / r_i) = 0     (C.1)

The feasibility test problem with fixed schedule y_it, z_ijt and θ_i, after the
elimination of the inventory variables, is:

ψ(y_it, z_ijt) = min_{u, T_t^p, d_i} u

s.t.  Σ_t T_t^p − H Σ_i (d_i / r_i) = 0     (C.2)

      d_i − θ_i ≤ u     (C.3)

      −d_i ≤ u     (C.4)

      −T_t^p ≤ u     (C.5)

The KKT conditions of the above problem are:

η − λ_t = 0,   t = 1,..,NT     (C.6)

−η (H / r_i) − μ_i^2 + μ_i^1 = 0,   i = 1,..,Np     (C.7)

where λ_t is the Lagrange multiplier of constraint (C.5); μ_i^1, μ_i^2 are the
Lagrange multipliers for the bounding constraints (C.3) and (C.4) on the demand
d_i of product i, respectively; η is the Lagrange multiplier of constraint (C.2).

Notice that from (C.6) η = λ_t ≥ 0. Consequently (C.7) implies that μ_i^1 ≠ 0
and μ_i^1 = (H / r_i) η = (H / r_i) λ_t. Based on these results the following
feasibility function can be derived:

ψ(y_it, z_ijt) = Σ_t λ_t (−T_t^p) + Σ_i μ_i^1 (d_i − θ_i)

Substituting equation (C.1):

ψ(y_it, z_ijt) = Σ_t λ_t (−T_t^p) + λ_t Σ_t T_t^p − λ_t H Σ_i (θ_i / r_i)

Using equation (C.6) we get:

ψ(y_it, z_ijt) = −λ_t H Σ_i (θ_i / r_i) ≤ 0   ∀ θ_i
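Since the feasibility test above is a small LP, the property is easy to check
numerically for the example of Section 8.3.1. The sketch below is illustrative only
(data from Table 8.5 with horizon H = 72 h; the scipy solver is an assumption of
this illustration) and confirms that the optimal u is nonpositive:

    import numpy as np
    from scipy.optimize import linprog

    H, r, theta, NT = 72.0, np.array([40.0, 18.0]), np.array([15.0, 10.0]), 4
    n = 1 + NT + 2                       # variables: u, Tp_1..Tp_NT, d_1, d_2
    c = np.zeros(n); c[0] = 1.0          # minimize u

    A_eq = np.zeros((1, n))              # (C.2): sum_t Tp_t - H sum_i d_i/r_i = 0
    A_eq[0, 1:1 + NT] = 1.0
    A_eq[0, 1 + NT:] = -H / r

    rows, rhs = [], []
    for i in range(2):
        e = np.zeros(n); e[0], e[1 + NT + i] = -1.0, 1.0    # (C.3): d_i - theta_i <= u
        rows.append(e); rhs.append(theta[i])
        e = np.zeros(n); e[0], e[1 + NT + i] = -1.0, -1.0   # (C.4): -d_i <= u
        rows.append(e); rhs.append(0.0)
    for t in range(NT):
        e = np.zeros(n); e[0], e[1 + t] = -1.0, -1.0        # (C.5): -Tp_t <= u
        rows.append(e); rhs.append(0.0)

    res = linprog(c, A_ub=np.array(rows), b_ub=np.array(rhs),
                  A_eq=A_eq, b_eq=[0.0], bounds=[(None, None)] * n)
    assert res.fun <= 1e-9               # psi <= 0: the fixed schedule stays feasible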

APPENDIX D

Feasibility property for Multiproduct Batch


plant operating in Single Product Campaign
(a.) Vj continuous

The mathematical formulation in this case is as follows:

max_{V_j, N_j, Q_i}  Σ_i p_i Q_i − δ Σ_j N_j α_j V_j^β_j

s.t.  Σ_i (Q_i / B_i) T_Li ≤ H

      T_Li ≥ t_ij / N_j   ∀ i, j

      Q_i ≤ θ_i   ∀ i

Property- Any design (V_j, N_j) satisfying the design constraints above for fixed
product demands θ_i, i=1,..,N, is always feasible.

Proof- The feasibility test problem, with fixed V_j, N_j and θ_i, is:

ψ = min_{u, Q_i} u

s.t.  Σ_i (Q_i / B_i) T_Li − H ≤ u

      Q_i − θ_i ≤ u   ∀ i

      −Q_i ≤ u   ∀ i

The KKT conditions of the above problem are:

λ (T_Li / B_i) + μ_i^1 − μ_i^2 = 0,   i = 1,..,N     (D.1)

λ + Σ_{i=1..N} μ_i^1 + Σ_{i=1..N} μ_i^2 = 1     (D.2)

where λ is the Lagrange multiplier of the production (horizon) constraint; μ_i^1, μ_i^2
are the Lagrange multipliers for the bounding constraints of the production Q_i of
product i.

Since there are N control variables Q_i the number of active constraints must
be less than or equal to N+1. From the KKT conditions (D.1), (D.2) it can be
easily identified that the only potential active set consists of the production
constraint and the lower bounds of production, which results in the following,
always negative feasibility function:

ψ = −λ H < 0

This permanent feasibility implies that the feasible region of the batch plant (in the
space of the uncertain parameters) coincides with the considered range of uncertain
parameters independently of the design.

(b.) V_j discrete

In this case, the problem formulation is as follows:

s.t.  n_i ≥ Σ_s Σ_n (Q_i S_ij / V_js) y_jsn   ∀ i, j

Property- Any design (y_jsn) satisfying the design constraints above for fixed
product demands θ_i, i=1,..,N, is always feasible.

Proof- The feasibility test problem, with fixed y_jsn and θ_i, is:

min_{u, Q_i, n_i, T_i} u

s.t.  Σ_s Σ_n (Q_i S_ij / V_js) y_jsn − n_i ≤ u   ∀ i, j     (D.3)

      Σ_s Σ_n (t_ij / n) y_jsn − T_i ≤ u   ∀ i, j     (D.4)

      Σ_i n_i T_i − H ≤ u     (D.5)

      Q_i − θ_i ≤ u   ∀ i     (D.6)

      −Q_i ≤ u   ∀ i     (D.7)
with the following KKT optimality conditions:

Σ_i Σ_j λ_ij + Σ_i Σ_j μ_ij + ν + Σ_i k_i1 + Σ_i k_i2 = 1     (D.8)

Σ_j λ_ij [Σ_s Σ_n (S_ij / V_js) y_jsn] + k_i1 − k_i2 = 0   ∀ i     (D.9)

−Σ_j λ_ij + ν T_i = 0   ∀ i     (D.10)

−Σ_j μ_ij + ν n_i = 0   ∀ i     (D.11)

where λ_ij are the Lagrange multipliers of the (D.3) constraints; μ_ij are the
Lagrange multipliers for the (D.4) constraints; ν is the Lagrange multiplier of
the (D.5) constraint and k_i1, k_i2 are the Lagrange multipliers for the bounding
constraints (D.6) and (D.7) of the production Q_i of product i.

Since there are 3N control variables Q_i, n_i, T_i, i = 1,...,N, where N is the
number of products, the number of active constraints must be less than or
equal to 3N+1. From the KKT condition (D.11) it can be identified that
(D.5) as well as one (D.4) constraint ∀i must be active. Following this result, (D.10)
suggests that also one (D.3) constraint ∀i must be active; this in turn points
out that, from (D.9), the lower bounding constraint for the production of product
i must be active. As constraint (D.6) is not included in the active set, it can
be easily shown through algebraic manipulations that a negative feasibility
function is always obtained. Note that similar proofs can be obtained for other
batch plant design models (Ierapetritou, 1995).

APPENDIX E

Steps of Modified Global Optimization Algorithm

The global optimization algorithm of Section 8.4.2 is implemented here for the
example of Section 8.4.4. Consider as a starting point (V1, V2, V3) = (3000, 4500,
4500).

Iteration 1

The solution of problem (PB2) has an objective function of 146.7 units and
provides a first upper bound on the global optimum. The Lagrange multipliers
for the production constraints μ^q are all equal to zero, resulting in a zero Lagrange
function gradient with respect to Q_i^q. Hence, it is only necessary to solve one
RD problem with Q_i^q at either its upper or lower bounds. The solution of this
problem is (V1, V2, V3) = (500, 500, 500), with a value of the objective function
of −1394.5 units, which is a first lower bound to the global optimum. The fixed
value of y for the second iteration is (V1, V2, V3) = (500, 500, 500).

Iteration 2

For (V1, V2, V3) = (500, 500, 500) the solution of the primal problem yields a
value of 297.3 units (no update of the upper bound). Since the qualifying
constraints are b1 − b1^1, b2 − b2^1, which are greater than or equal to zero for any
(b1, b2), only one RD problem is also required at this iteration, with Q_i^q at its
upper bounds. The Lagrange function from the first iteration is also added in the
current relaxed dual problem since the qualifying constraints for this Lagrange
function are zero for any (b1, b2). The solution of the relaxed dual problem
yields the design (V1, V2, V3) = (883.7, 1325.5, 1767.4), with an objective value
of −569.2 units, which corresponds to a new lower bound. The fixed design for
the next iteration is (V1, V2, V3) = (883.7, 1325.5, 1767.4).

Iteration 3

The solution of the primal problem yields an expected profit of 20.2 units, which
provides a new upper bound to the global optimum, and the corresponding
Lagrange multipliers, which are all nonzero. Four relaxed dual problems are
then solved, for the combinations of bounds (Q_1^q1, Q_2^q1q2) equal to (0, 0),
(θ_1^q1, 0), (θ_1^q1, θ_2^q1q2) and (0, θ_2^q1q2). Since the qualifying constraints
of the Lagrange functions from the first and second iterations are satisfied for
every (b1, b2), they are both added in the relaxed dual problem. The solutions of
the four relaxed dual problems are summarized in Table E.1. At the end of this
iteration these four solutions are stored, from which the design (V1, V2, V3) =
(500, 703.4, 937.8) with the smallest objective is selected for the next iteration.

    Relaxed Dual   Optimal Design (V1, V2, V3)   Objective function
    1              (1800, 2700, 3600)            -317.1
    2              (1298, 1948, 2597)            -85.4
    3              (500, 703.4, 937.8)           -323
    4              (1276.7, 1915.1, 957.5)       -53.8

    Table E.1  Solutions of the Relaxed Dual Problems


The algorithm requires 14 iterations to converge to the global optimum design
(V1, V2, V3) = (1800, 2700, 3600) with an expected profit equal to 298.5 units
(within a tolerance limit ε = 0.0002). The computational results for the application
of GOP from four different starting points with γ = 0 and considering
a 5×5 quadrature grid are summarized in Table E.2.

Starting Design Number of CPU s per


(VI, V2 , V3 ) Iterations iteration
(1000, 1000, 1000) 13 0.8
(3000,4500,4500) 14 0.8
(4500,4500,4500) 14 0.8
(500, 500, 500) 15 0.8

Table E.2 Results of GOP application



REFERENCES

1. Acevedo, J. and E. N. Pistikopoulos (1995). Computational Studies of
   Stochastic Optimization Techniques for Process Synthesis under Uncertainty.
   Manuscript in preparation.

2. Beale, E.M., J.J.H. Forrest and C.J. Taylor. Multi-time-period Stochastic
   Programming, Stochastic Programming; Academic Press: New York, 1980.

3. Bienstock, D. and J.F. Shapiro (1988). Optimizing Resource Acquisition
   Decisions by Stochastic Programming. Manag. Sci., 34, 215.

4. Birge, J. R. (1982). The Value of the Stochastic Solution in Stochastic
   Linear Programs with Fixed Recourse. Math. Prog., 24, 314.

5. Birge, J. R. (1985). Aggregation Bounds in Stochastic Linear Programming.
   Math. Prog., 25, 31.

6. Birge, J. R. and R. Wets (1989). Sublinear Upper Bounds for Stochastic
   Programs with Recourse. Math. Prog., 43, 131.

7. Bloom, J. A. (1983). Solving an Electricity Generating Capacity Expansion
   Planning Problem by Generalized Benders Decomposition. Oper. Res., 31, 84.

8. Borison, A. B., P.A. Morris and S.S. Oren (1984). A State-of-the-World
   Decomposition Approach to Dynamics and Uncertainty in Electric Utility
   Generation Expansion Planning. Oper. Res., 32, 1052.

9. Brauers, J. and M.A. Weber (1988). A New Method of Scenario Analysis for
   Strategic Planning. Jl. of Forecasting, 7, 31-47.

10. Clay, R.L. and I.E. Grossmann (1994a). Optimization of Stochastic Planning
    Models I. Concepts and Theory. Submitted for publication.

11. Clay, R.L. and I.E. Grossmann (1994b). Optimization of Stochastic Planning
    Models II. Two-Stage Successive Disaggregation Algorithm. Submitted for
    publication.

12. Dantzig, G. B. (1989). Decomposition Techniques for Large-Scale Electric
    Power Systems Planning Under Uncertainty. Annals of Operations Research.

13. Edgar, T.F. and D.M. Himmelblau. Optimization of Chemical Processes;
    McGraw-Hill: New York, 1988.

14. Fichtner, G., H.J. Reinhart and D.W.T. Rippin (1990). The Design of
    Flexible Chemical Plants by the Application of Interval Mathematics.
    Comput. Chem. Engng., 14, 1311.

15. Floudas, C.A. and V. Visweswaran (1990). A Global Optimization Algorithm
    (GOP) for Certain Classes of Nonconvex NLPs - I. Theory. Comput. Chem.
    Engng., 14, 1397.

16. Floudas, C.A. and V. Visweswaran (1993). Primal-Relaxed Dual Global
    Optimization Approach. JOTA, 78, 187.

17. Friedman, Y. and G.V. Reklaitis (1975). Flexible Solutions to Linear
    Programs under Uncertainty: Inequality Constraints. AIChE Jl, 21, 77-83.

18. Grossmann, I.E., K.P. Halemane and R.E. Swaney (1983). Optimization
    Strategies for Flexible Chemical Processes. Comput. Chem. Engng., 7,
    439-462.

19. Horst, R. (1990). Deterministic Methods in Constrained Global Optimization:
    Some Recent Advances and New Fields of Application. Nav. Res. Log., 37,
    433-471.

20. Ierapetritou, M.G. (1995). Optimization Approaches for Process Engineering
    Problems Under Uncertainty. PhD Thesis, University of London.

21. Ierapetritou, M.G. and E.N. Pistikopoulos (1994). Novel Optimization
    Approach of Stochastic Planning Models. Ind. Eng. Chem. Res., 33, 1930.

22. Ierapetritou, M.G. and E.N. Pistikopoulos (1995). Batch Plant Design
    and Operations under Uncertainty. Accepted for publication in Ind. Eng.
    Chem. Res.

23. Ierapetritou, M.G., J. Acevedo and E.N. Pistikopoulos (1995). An
    Optimization Approach for Process Engineering Problems Under Uncertainty.
    Accepted for publication in Comput. Chem. Engng.

24. Inuiguchi, M., M. Sakawa and Y. Kume (1994). The Usefulness of
    Possibilistic Programming in Production Planning Problems. Inter. J. Prod.
    Econ., 33, 42.

25. Kocis, G.R. and I.E. Grossmann (1988). Global Optimization of Nonconvex
    MINLP Problems in Process Synthesis. Ind. Eng. Chem. Res., 27, 1407.

26. Liu, M.L. and N.V. Sahinidis (1995). Process Planning in a Fuzzy
    Environment. Submitted for publication in Eur. J. Oper. Res.

27. Modiano, E.M. (1987). Derived Demand and Capacity Planning Under
    Uncertainty. Oper. Res., 35, 185-197.

28. Pinto, J. and I.E. Grossmann (1994). Optimal Cyclic Scheduling of
    Multistage Continuous Multiproduct Plants. Submitted for publication.

29. Pistikopoulos, E.N. and I.E. Grossmann (1989a). Optimal Retrofit Design
    for Improving Process Flexibility in Nonlinear Systems: I. Fixed Degree of
    Flexibility. Comput. Chem. Engng., 13, 1003-1016.

30. Pistikopoulos, E.N. and I.E. Grossmann (1989b). Optimal Retrofit Design
    for Improving Process Flexibility in Nonlinear Systems: II. Optimal Level
    of Flexibility. Comput. Chem. Engng., 13, 1087.

31. Pistikopoulos, E.N. and M.G. Ierapetritou (1995). A Novel Approach for
    Optimal Process Design Under Uncertainty. Comput. Chem. Engng., 19, 1089.

32. Reinhart, H.J. and D.W.T. Rippin (1986). Design of Flexible Batch Chemical
    Plants. AIChE Spring National Mtg, New Orleans, Paper No 50e.

33. Reinhart, H.J. and D.W.T. Rippin (1987). Design of Flexible Batch Chemical
    Plants. AIChE Annual Mtg, New York, Paper No 92f.

34. Rotstein, G.E., R. Lavie and D.R. Lewin (1994). Synthesis of Flexible and
    Reliable Short-Term Batch Production Plans. Submitted for publication.

35. Sahinidis, N.V., I.E. Grossmann, R.E. Fornari and M. Chathrathi (1989).
    Optimization Model for Long-Range Planning in Chemical Industry. Comput.
    Chem. Engng., 13, 1049.

36. Sahinidis, N.V. and I.E. Grossmann (1991). MINLP Model for Cyclic
    Multiproduct Scheduling on Continuous Parallel Lines. Comput. Chem.
    Engng., 15, 85.

37. Schilling, G., Y.-E. Pineau, C.C. Pantelides and N. Shah (1994). Optimal
    Scheduling of Multipurpose Continuous Plants. AIChE 1994 Annual Meeting,
    San Francisco.

38. Shah, N. and C.C. Pantelides (1992). Design of Multipurpose Batch Plants
    with Uncertain Production Requirements. Ind. Eng. Chem. Res., 31, 1325.

39. Shimizu, Y. (1989). Application of Flexibility Analysis for Compromise
    Solution in Large-Scale Linear Systems. Jl of Chem. Engng of Japan, 22,
    189-193.

40. Straub, D.A. and I.E. Grossmann (1992). Evaluation and Optimization of
    Stochastic Flexibility in Multiproduct Batch Plants. Comput. Chem. Engng.,
    16, 69.

41. Straub, D.A. and I.E. Grossmann (1993). Design Optimization of Stochastic
    Flexibility. Comput. Chem. Engng., 17, 339.

42. Subrahmanyam, S., J.F. Pekny and G.V. Reklaitis (1994). Design of Batch
    Chemical Plants under Market Uncertainty. Ind. Eng. Chem. Res., 33, 2688.

43. Van Slyke, R.M. and R. Wets (1969). L-Shaped Linear Programs with
    Applications to Optimal Control and Stochastic Programming. SIAM J.
    Appl. Math., 17, 573.

44. Voudouris, V.T. and I.E. Grossmann (1992). Mixed-Integer Linear
    Programming Reformulation for Batch Process Design with Discrete Equipment
    Sizes. Ind. Eng. Chem. Res., 31, 1315.

45. Wallace, S. W. (1987). A Piecewise Linear Upper Bound on the Network
    Recourse Function. Math. Prog., 38, 133.

46. Wellons, H.S. and G.V. Reklaitis (1989). The Design of Multiproduct Batch
    Plants under Uncertainty with Staged Expansion. Comput. Chem. Engng., 13,
    115-126.
9
GLOBAL OPTIMIZATION OF HEAT
EXCHANGER NETWORKS WITH
FIXED CONFIGURATION FOR
MULTIPERIOD DESIGN
Ramaswamy R. Iyer and Ignacio E. Grossmann

Department of Chemical Engineering,


Carnegie Mellon University, Pittsburgh, Pa 15213
USA

ABSTRACT
The algorithm for global optimization of heat exchanger networks by Quesada and
Grossmann [17] has been extended to multiperiod operation for fixed configuration.
Under the assumptions of linear cost function, arithmetic mean driving force and
isothermal mixing, the multiperiod problem is an NLP with linear constraints and
a nondifferentiable, nonconvex objective function involving linear fractional terms.
A modified partitioning rule is used and global optimization properties are retained.
Exploiting the fact that an exact approximation is not required for exchangers in
non-bottleneck periods leads to a reduction in the number of partitions required to
reach the global optimum.

1 INTRODUCTION
There has been an increased interest in the design of flexible chemical processes
in the last decade. Changing plant conditions as a result of change in feedstock
or product demands provides a motivation to develop systematic methods for
design of flexible plants [9]. A major class of flexibility problems is the multi-
period design problem for plants operating under different specified conditions
for different time periods [10].

The conventional approach used in practice is to use empirical overdesign fac-


tors on sizes that are determined at a nominal parameter value [18]. With this
approach, however, it is difficult to gain insight on the degree of flexibility and
justify on economic grounds the extent of overdesign.

The design and synthesis of heat exchanger networks (HEN) at nominal condi-
tions in which the selection of configurations, areas and energy are optimized,
has received considerable attention [12]. The multiperiod HEN synthesis and
design problem, which is normally encountered in the industry for handling pro-
cess uncertainties and variations [14], has received much less attention. Floudas
and Grossmann [6] have proposed a modified transhipment model for synthesis
of HEN for multiperiod operation with the objective of minimizing utility costs
using the fewest number of units. Subsequently they extended this method to
perform automatic synthesis of network configurations [7]. Galli and Cerda [8]
have proposed an algorithm that is based on the idea of representing the un-
certainties with permanent and transient process streams. However, the above
techniques cannot account simultaneously for the tradeoffs in investment and
operating costs. Rigorous mathematical programming techniques have been
proposed for synthesis and design of HEN which can simultaneously account
for trade-offs in energy, area and unit costs [3, 19]. Papalexandri and Pistikopoulos
[16] have studied the problem of introducing controllability together with
synthesis and design of flexible processes. However, a major limitation of these
methods is nonconvexities leading to multiple local optimal solutions. Thus,
local search techniques cannot guarantee global optimality of the solution.

There has been considerable interest lately in addressing the global optimization
of nonconvex nonlinear problems [13]. Global optimization of HEN has been
addressed using stochastic and deterministic methods. Dolan et al. [5] applied
simulated annealing to the synthesis of HEN. This method is computationally
intensive and only guarantees global optimality if an infinite number of trials are
performed. Quesada and Grossmann [17] developed an algorithm for HEN with
fixed configuration which guarantees global optimality. They used nonlinear
convex underestimators for the nonconvex terms to obtain tight bounds for the
objective value within a spatial branch and bound method.

The aim of this work is to extend the algorithm by Quesada and Grossmann
[17] to the design of HEN for multiperiod operation for a fixed configuration. The
configuration could be either selected using any of the synthesis methods referred
to above ([6], [7], [8]), or else supplied by the user. Selecting a configuration
is largely a discrete optimization problem, while optimizing a fixed structure
corresponds to a nonconvex, continuous optimization problem in which areas
and energy consumptions are optimized.

2 PROBLEM STATEMENT AND MATHEMATICAL


MODEL
Given is a set of hot process streams, HP = {i | i = 1,...,Nh} that must be
cooled, and a set of cold process streams CP = {j | j = 1,...,Nc} to be heated.
Also given are a hot utility HU and a cold utility CU and their corresponding
temperatures. Assuming a configuration has been found that is structurally
feasible for all N periods of operation, the problem consists in determining
the areas and utility loads at each period of operation. The parameters that
change over different time periods are the inlet temperatures, flowrates and
outlet temperatures of the process streams. The objective is to minimize the
total annual cost including the utility and exchanger area costs.

The formulation presented by Yee et al. [19] will be used to describe the
mathematical model for the given network configuration that is selected within
the superstructure. Aside from the fact that this provides a systematic way to
derive the model, it should be noted that most network configurations under
the assumptions used in this paper can be embedded in such a superstructure.
Figure 1 shows a typical configuration for a 2 hot-2 cold stream system with
all possible exchanges. The selection of the configuration is determined by
choosing a subset of these exchangers (shaded circles). The number of stages
may be chosen either as max(Nh, Nc) or the maximum number of temperature
intervals as described in [4].

Figure 1  A two-stage configuration for 2-hot 2-cold streams



The major assumptions in the proposed model are 1) Isothermal mixing of


streams between stages. 2) Area cost is given by a linear cost function. 3)
Arithmetic mean temperature difference driving force. 4) All exchangers can
have bypasses on the hot and/or cold side.

It may be noted that the assumptions above generally yield good results that
are of practical use. The arithmetic mean temperature difference is never below
the logarithmic mean temperature difference and is usually close to it, so the
resulting exchanger areas are slightly underestimated, while the linear cost
of area provides an underestimation of the concave cost functions for area. The
isothermal mixing assumption aids in avoiding nonlinear mixing constraints
in the model. Finally, the assumption of by-passes ensures that feasible heat
exchange for a given unit is guaranteed by considering the largest area required
over all time periods.
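The quality of this driving-force approximation is easy to check numerically; the
short sketch below is a generic illustration, not taken from the paper:

    import math

    def lmtd(dt1, dt2):                  # logarithmic mean temperature difference
        return dt1 if dt1 == dt2 else (dt1 - dt2) / math.log(dt1 / dt2)

    def amtd(dt1, dt2):                  # arithmetic mean temperature difference
        return 0.5 * (dt1 + dt2)

    for dts in [(30.0, 20.0), (50.0, 5.0), (10.0, 10.0)]:
        print(dts, round(lmtd(*dts), 2), round(amtd(*dts), 2))
    # AMTD >= LMTD in every case, with near equality for balanced approach
    # temperatures, so the area Q/(U*AMTD) is a (usually tight) underestimate.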

The following indices, parameters, variables and constraints are involved in the
multiperiod NLP model.

(A) Indices

i = hot process streams
j = cold process streams
k = index of stages 1,..,NOK and temperature locations 1,..,NOK+1
n = periods of operation 1,..,N

(B) Parameters

TIN = inlet temperature of stream
TOUT = outlet temperature of stream
C = cost coefficient of exchanger
F = heat capacity flowrate
U = overall heat transfer coefficient
Z, ZH, ZC = binary values defining the existence of an exchanger between
            process streams, cold stream and hot utility, and hot stream
            and cold utility, respectively
Ω = upper bound for heat duty of exchanger
d1, d2 = unit cost for hot and cold utility, respectively
(C) Variables

DT_ijkn = arithmetic mean temperature difference for match (i,j) at
          temperature location k in period n
t_ikn, t_jkn = temperature of hot stream i / cold stream j at inlet of stage k in
          period n
DTHU_jn = arithmetic mean temperature difference for cold stream j and
          hot utility in period n
DTCU_in = arithmetic mean temperature difference for hot stream i and
          cold utility in period n
Q_ijkn  = heat exchanged between (i,j) in stage k in period n
QH_jn   = heat exchanged between cold stream j and hot utility in period n
QC_in   = heat exchanged between hot stream i and cold utility in period n

Overall heat balance for each stream for each period:

(TIN_in − TOUT_in) F_in = Σ_k Σ_j Q_ijkn + QC_in   ∀ i, n     (9.1)
(TOUT_jn − TIN_jn) F_jn = Σ_k Σ_i Q_ijkn + QH_jn   ∀ j, n

Heat balance at each stage for each period:

(t_ikn − t_ik+1n) F_in = Σ_j Q_ijkn   ∀ i, k, n     (9.2)
(t_jkn − t_jk+1n) F_jn = Σ_i Q_ijkn   ∀ j, k, n

Assignment of inlet temperatures for each period:

TIN_in = t_i1n   ∀ i, n     (9.3)
TIN_jn = t_jNOK+1n   ∀ j, n

Feasibility of temperatures for each period:

t_ikn ≥ t_ik+1n   ∀ i, k, n     (9.4)
t_jkn ≥ t_jk+1n   ∀ j, k, n
TOUT_in ≤ t_iNOK+1n   ∀ i, n
TOUT_jn ≥ t_j1n   ∀ j, n

Utility load determination for each period:

(t_iNOK+1n − TOUT_in) F_in = QC_in   ∀ i, n     (9.5)
(TOUT_jn − t_j1n) F_jn = QH_jn   ∀ j, n

Existence of an exchanger:

Q_ijkn − Ω Z_ijk ≤ 0   ∀ i, j, k, n     (9.6)
QC_in − Ω ZC_i ≤ 0   ∀ i, n
QH_jn − Ω ZH_j ≤ 0   ∀ j, n

where Z_ijk, ZC_i, ZH_j are predetermined by the particular network configuration.

Arithmetic mean temperature difference calculation for each period:

DT_ijkn = ((t_ikn − t_jkn) + (t_ik+1n − t_jk+1n))/2
          ∀ i, j, k, n if Z_ijk = 1     (9.7)
DTCU_in = ((t_iNOK+1n − TOUT_CU) + (TOUT_in − TIN_CU))/2
          ∀ i, n if ZC_i = 1
DTHU_jn = ((TOUT_HU − t_j1n) + (TIN_HU − TOUT_jn))/2
          ∀ j, n if ZH_j = 1

The equations 9.1-9.7 that include the heat balances, driving force and approach
temperature constraints for all N periods of operation may be concisely
represented as

h_n(Q_ijkn, DT_ijkn, QH_jn, DTHU_jn, QC_in, DTCU_in, x_n) = 0,   n = 1,..,N     (9.8)

where x_n is the set of all other state variables at period n in the set of equations
9.1-9.7.

The multiperiod HEN design problem (P1) then corresponds to the linearly constrained NLP of the form

$\min C = \sum_{(i,j,k)} \frac{C_{ij}\,\max_n \{Q_{ijkn}/DT_{ijkn}\}}{U_{ij}} + \sum_j \frac{C_{j,HU}\,\max_n \{QH_{jn}/DTHU_{jn}\}}{U_{j,HU}} + \sum_i \frac{C_{i,CU}\,\max_n \{QC_{in}/DTCU_{in}\}}{U_{i,CU}} + \sum_n \sum_j d_1\,QH_{jn} + \sum_n \sum_i d_2\,QC_{in}$  (9.9)
subject to

the constraints (9.8) and the bounds

$Q^L_{ijkn} \le Q_{ijkn} \le Q^U_{ijkn}$
$QH^L_{jn} \le QH_{jn} \le QH^U_{jn}$
$QC^L_{in} \le QC_{in} \le QC^U_{in}$
$DT^L_{ijkn} \le DT_{ijkn} \le DT^U_{ijkn}$
$DTHU^L_{jn} \le DTHU_{jn} \le DTHU^U_{jn}$
$DTCU^L_{in} \le DTCU_{in} \le DTCU^U_{in}$

The objective function has nondifferentiable 'max' operators to select the largest area for each exchanger over the N time periods of operation. Note that this implicitly assumes that by-passes are available at each exchanger, as was mentioned before. The bounds for the variables shown above may be obtained from a preanalysis of the network for each period of operation [17].

Problem (PI) can be reformulated by defining scaled area variables (scaled to


product of area and overall heat transfer coefficient) for each period. This allows
us to remove the maximum operator in the objective function by adding linear
constraints that choose the maximum areas amongst all periods of operation.

Define

ANijkn = Scaled Area of exchanger (i,j, k) for period n.


A ijk = Largest scaled Area of exchanger (i, j, k) amongst all periods of
operation.

We analogously define areas AHUNjn, AHUj for the hot utility exchangers and ACUNin, ACUi for the cold utility exchangers. Using the concept of convex underestimators developed in Quesada and Grossmann [17], the convex underestimator problem (NLP_L) may be formulated as

$\min C_L = \sum_{(i,j,k)} \frac{C_{ij}}{U_{ij}} A_{ijk} + \sum_j \frac{C_{j,HU}}{U_{j,HU}} AHU_j + \sum_i \frac{C_{i,CU}}{U_{i,CU}} ACU_i + \sum_n \sum_j d_1\,QH_{jn} + \sum_n \sum_i d_2\,QC_{in}$

subject to the linear constraints that ensure feasibility of operation for all periods,

$A_{ijk} \ge AN_{ijkn} \quad \forall i,j,k,n$
$AHU_j \ge AHUN_{jn} \quad \forall j,n$
$ACU_i \ge ACUN_{in} \quad \forall i,n$

and constraints for underestimators

$AN_{ijkn} \ge Q_{ijkn}/DT^U_{ijkn} + Q^L_{ijkn}(1/DT_{ijkn} - 1/DT^U_{ijkn})$
$AN_{ijkn} \ge Q_{ijkn}/DT^L_{ijkn} + Q^U_{ijkn}(1/DT_{ijkn} - 1/DT^L_{ijkn})$
$Q_{ijkn} \le AN^L_{ijkn}DT_{ijkn} + AN_{ijkn}DT^U_{ijkn} - AN^L_{ijkn}DT^U_{ijkn}$
$Q_{ijkn} \le AN^U_{ijkn}DT_{ijkn} + AN_{ijkn}DT^L_{ijkn} - AN^U_{ijkn}DT^L_{ijkn}$
$DT_{ijkn} \ge Q_{ijkn}/AN^U_{ijkn} + Q^L_{ijkn}(1/AN_{ijkn} - 1/AN^U_{ijkn})$
$DT_{ijkn} \ge Q_{ijkn}/AN^L_{ijkn} + Q^U_{ijkn}(1/AN_{ijkn} - 1/AN^L_{ijkn})$
$AN_{ijkn} \ge Q_{ijkn}/\phi^U_{ijkn}(Q_{ijkn}) + Q^L_{ijkn}(1/DT_{ijkn} - 1/\phi^U_{ijkn}(Q_{ijkn}))$
$AN_{ijkn} \ge Q_{ijkn}/\phi^L_{ijkn}(Q_{ijkn}) + Q^U_{ijkn}(1/DT_{ijkn} - 1/\phi^L_{ijkn}(Q_{ijkn}))$

where $\phi_{ijkn}(Q_{ijkn})$ is a linear function with negative slope obtained as a projection of $DT_{ijkn}$. The corresponding sets of convex underestimators for hot and cold utility exchangers are similar to the constraints above, and they are concisely represented as

$f_j(QH_{jn}, DTHU_{jn}, AHUN_{jn}) \le 0 \quad \forall j$
$f_i(QC_{in}, DTCU_{in}, ACUN_{in}) \le 0 \quad \forall i$  (9.10)
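To make the construction concrete, the following sketch (our own illustration; the function and variable names are not from the original chapter) evaluates the fractional-term and bilinear estimators above for a single match (i,j,k,n), given bounds on Q, DT and AN:

    def an_underestimate(Q, DT, QL, QU, DTL, DTU):
        # Largest of the two convex underestimators of the scaled area
        # AN = Q/DT at the point (Q, DT) -- cf. the first two inequalities above.
        cut1 = Q / DTU + QL * (1.0 / DT - 1.0 / DTU)
        cut2 = Q / DTL + QU * (1.0 / DT - 1.0 / DTL)
        return max(cut1, cut2)

    def q_overestimate(AN, DT, ANL, ANU, DTL, DTU):
        # Tightest McCormick overestimator of the bilinear product
        # Q = AN*DT -- cf. the third and fourth inequalities above.
        over1 = ANL * DT + AN * DTU - ANL * DTU
        over2 = ANU * DT + AN * DTL - ANU * DTL
        return min(over1, over2)

    # Validity check on a sample box (all numbers are illustrative):
    QL, QU, DTL, DTU = 100.0, 1000.0, 10.0, 50.0
    ANL, ANU = QL / DTU, QU / DTL          # implied bounds on the scaled area
    Q, DT = 400.0, 25.0
    AN = Q / DT                            # true scaled area = 16.0
    assert an_underestimate(Q, DT, QL, QU, DTL, DTU) <= AN
    assert q_overestimate(AN, DT, ANL, ANU, DTL, DTU) >= Q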

Also, the following constraints apply from (9.8):

$g(Q_{ijkn}, DT_{ijkn}, QH_{jn}, DTHU_{jn}, QC_{in}, DTCU_{in}, x_n) \le 0 \quad \forall n$

where

$(Q_{ijkn}, DT_{ijkn}, AN_{ijkn}, QH_{jn}, DTHU_{jn}, AHUN_{jn}, QC_{in}, DTCU_{in}, ACUN_{in}) \in \Omega$

and $\Omega$ is defined by the bounds

$\Omega_{ijkn} = \{Q^L_{ijkn} \le Q_{ijkn} \le Q^U_{ijkn},\ DT^L_{ijkn} \le DT_{ijkn} \le DT^U_{ijkn},\ AN^L_{ijkn} \le AN_{ijkn} \le AN^U_{ijkn}\}$  (9.11)
$\Omega^H_{jn} = \{QH^L_{jn} \le QH_{jn} \le QH^U_{jn},\ DTHU^L_{jn} \le DTHU_{jn} \le DTHU^U_{jn},\ AHUN^L_{jn} \le AHUN_{jn} \le AHUN^U_{jn}\}$
$\Omega^C_{in} = \{QC^L_{in} \le QC_{in} \le QC^U_{in},\ DTCU^L_{in} \le DTCU_{in} \le DTCU^U_{in},\ ACUN^L_{in} \le ACUN_{in} \le ACUN^U_{in}\}$

The key property on which the solution procedure is based is presented below.

PROPERTY: Any feasible solution to the convex multiperiod problem (NLP_L) provides a valid lower bound to the objective function of (P1). Furthermore, any optimal solution of (NLP_L) provides a valid lower bound to the global optimum of (P1).

From the above property it follows that if the optimal solution C_L from (NLP_L) equals the objective function value for (P1), then the solution is the global optimum of (P1). The proof is omitted here as it is identical to the one given in [17].

3 SOLUTION PROCEDURE
It will be assumed below that the reader is familiar with the paper by Quesada and Grossmann [17]. Let the incumbent (current best) solution be denoted by (*) and let the solution of (NLP_L) for subregion r be C_L^r. Let F be the set containing the subregions that need further examination.

• Step 0: Initialize. Set C* = ∞ and F = ∅. Let r = 0.


• Step 1: For each exchanger and at each time period, generate lower and upper bounds for area, driving force and heat duty for each period of operation by solving a sequence of linear programs. For instance, for determining a lower bound for DTijkn, the LP is given by

$\min\, DT_{ijkn}$  (9.12)
s.t. $g(Q_{ijkn}, DT_{ijkn}, QH_{jn}, DTHU_{jn}, QC_{in}, DTCU_{in}, x_n) \le 0 \quad \forall i,j,k$

For obtaining bounds on area, linear fractional programming may be used [2]. Store the bounds to generate the region Ω^0. Let F = F ∪ {0}. Also obtain projections of variables for each period of operation using the solutions of the LPs solved during the procedure for obtaining bounds. However, it may not always be possible to obtain projections in the convex direction. Evaluate the original objective function and store the lowest value as the incumbent solution C*. In the multiperiod case the value of the area chosen is the largest amongst all periods of operation, and thus feasibility is ensured for all periods of operation. The value C* so obtained represents an upper bound to the global optimum of (P1).

• Step 2: Solve the convex underestimator problem (NLP_L) for Ω^0 to obtain C_L^0. Evaluate the original objective function C^0.
If C^0 < C*, set C* = C^0 and store the solution as the incumbent solution (*).
• Step 3: For each subregion s in F, if C_L^s ≥ C*, delete that region s from F.
If F = ∅ the global optimum solution has been obtained and is given by the incumbent solution.
• Step 4: Take the last region s in F, and apply the following selection rule to partition the region Ω^s.
A straightforward extension of the partitioning rule from the single period case to the multiperiod case would be to take the exchanger (i,j,k) and period n corresponding to the largest difference between underestimated and calculated area. However, since the area of any exchanger should be the largest area calculated amongst all periods of operation, the rule should be modified for the multiperiod problem as follows.
Rule: For every (i,j,k), determine the largest argument

$\arg\max_{i,j,k,n} \{\, C_{ij}(Q_{ijkn}/DT_{ijkn} - A_{ijk}),\; C_{j,HU}(QH_{jn}/DTHU_{jn} - AHU_j),\; C_{i,CU}(QC_{in}/DTCU_{in} - ACU_i) \,\}$

where $A_{ijk}$, $AHU_j$, $ACU_i$ are the largest underestimated areas of the exchangers over all n.
Thus, the partition rule determines the largest difference between the calculated area cost and the underestimated area cost amongst all periods of operation. Two subregions are created by addition of the partition constraint for the (say) variable y of the form

$y \le y^* \quad \text{and} \quad y \ge y^*$

Thus subregion Ω^s is deleted from F and subregions Ω^{s+1} and Ω^{s+2} are added to F. Update the bounds for Ω^{s+1} and Ω^{s+2} for the exchanger chosen for partitioning.
• Step 5: Solve (NLP_L) for Ω^{s+1} and Ω^{s+2} and calculate C_L^{s+1} and C_L^{s+2}. Evaluate the original objective function for each subregion.
If C < C*, set C* = C and store that solution as the incumbent solution.
If C_L^{s+1} < C_L^{s+2}, interchange Ω^{s+1} and Ω^{s+2}. Go to Step 3.
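The procedure above is a depth-first spatial branch and bound. A minimal Python sketch of Steps 0-5 follows (our own illustration; the model-specific pieces -- the convex (NLP_L) solve, the evaluation of (P1), and the partition rule -- are abstracted as user-supplied callbacks):

    import math

    def spatial_branch_and_bound(solve_nlpl, evaluate_original, partition,
                                 root_region, eps=1e-4):
        # solve_nlpl(region)       -> (C_L, point): lower bound and solution of (NLP_L)
        # evaluate_original(point) -> objective value of (P1), a valid upper bound
        # partition(region, point) -> two subregions split by the partition rule
        C_star, incumbent = math.inf, None                  # Step 0
        C_L, point = solve_nlpl(root_region)                # Steps 1-2
        C = evaluate_original(point)
        if C < C_star:
            C_star, incumbent = C, point
        F = [(C_L, point, root_region)]
        while F:
            F = [node for node in F if node[0] < C_star - eps]   # Step 3: fathom
            if not F:
                break                                       # global optimum found
            _, point, region = F.pop()                      # Step 4: last region in F
            children = []
            for child in partition(region, point):          # Step 5: bound subregions
                C_L, pt = solve_nlpl(child)
                C = evaluate_original(pt)
                if C < C_star:
                    C_star, incumbent = C, pt               # update the incumbent
                children.append((C_L, pt, child))
            # Place the child with the smaller lower bound last so that it is
            # examined first (the interchange of regions in Step 5).
            children.sort(key=lambda node: -node[0])
            F.extend(children)
        return C_star, incumbent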

4 REMARKS
At this point it is important to note the differences between the solution procedure for the single period case [17] and the multiperiod case considered in this paper.

• 1) The partition rule only determines those exchangers for which there is
a bottleneck period where the calculated area is greater than the largest
underestimated area. Thus, no partitioning will be required in the space of
variables for other periods for which the calculated area is smaller than the
largest underestimated area leading to a reduction in computation time.
• 2) When the lower bound equals the upper bound then the solution is the global optimum. However, with the above partition rule, it is possible to have a nonexact approximation for an exchanger in nonbottleneck periods. This is because the solution corresponding to the upper bound yields feasible areas for all periods of operation. Even if the area is underestimated for a nonbottleneck period $n_{nb}$, such that

$AN_{ijk,n_{nb}} < Q_{ijk,n_{nb}}/DT_{ijk,n_{nb}}$

the feasible solution corresponding to the upper bound must be the lowest feasible value of area for that exchanger such that

$Q_{ijk,n_{nb}}/DT_{ijk,n_{nb}} \le Q_{ijk,n_b}/DT_{ijk,n_b}$

where $n_b$ denotes the bottleneck period.

5 EXAMPLES
Two example problems were solved with the proposed global optimization solution procedure, and the results were compared with the solutions obtained from the local NLP solver MINOS 5.3 [15]. The time taken to solve the convex NLPs was found to grow almost linearly with the number of periods. Therefore no special decomposition strategy was used for solving the multiperiod NLP, because the NLP in the full space can be solved in reasonable time.

5.1 Example 1
An example problem (see Figure 2) from Quesada and Grossmann [17] is solved for a 10 period case with varying inlet temperatures and flowrates as indicated in Table 1.

Figure 2 Example 1 configuration (streams H1, H2, C1, C2)

The solution of this problem using MINOS 5.3 in GAMS [1] (with the default starting point) gave a local solution of $476,300. However, the global solution is $377,950, which clearly justifies the need for a global solution procedure, since a 20% reduction in costs is possible for this example.

The solution procedure for obtaining the global solution results in an NLP problem with a larger number of variables and equations, as shown in Table 3. At the first iteration, the underestimator problem yields a solution of $347,320 as compared to the upper bound of $405,480. The global solution was obtained after partitioning the feasible region into 30 subregions.

Table 4 shows the values of exchanger areas obtained by both procedures.


Clearly, the suboptimal solution results in a completely different design for the

Hot streams - inlet temperature for each period (K)
Stream    1      2      3      4      5      6      7      8      9      10
1         575    575    576    576    565    565    575    575    575    576
2         718    708    718    708    718    708    708    708    718    718

Hot streams - outlet temperature for each period (K)
1         395    385    396    386    395    395    395    395    395    395
2         398    388    398    388    388    398    398    398    398    398

Hot streams - heat capacity flow rate (kW/K)
1         55.55  52.63  55.55  52.63  58.82  58.82  55.55  55.55  55.55  55.25
2         31.25  31.25  31.25  31.25  30.2   32.26  32.26  32.26  31.25  31.25

Cold streams - inlet temperature for each period (K)
Stream    1      2      3      4      5      6      7      8      9      10
1         300    290    302    292    292    292    295    295    302    302
2         365    355    365    355    355    355    355    365    365    365
3         358    348    358    348    348    348    348    360    360    360

Cold streams - outlet temperature for each period (K)
          400    390    400    400    400    400    400    400    400    400

Cold streams - heat capacity flow rate (kW/K)
1         100    100    102    92.6   92.6   92.6   95.24  95.24  102    102
2         45.45  45.45  42     42     42     42     42     45     45     45
3         35.71  35.71  38     38     38     38     38     40     40     40

Exchanger   Overall heat transfer coefficient (kW/K·m²)   Area cost coefficient ($/m²)
1           0.1                                            270
2           0.1                                            720
3           1.0                                            240
4           1.0                                            900

Note: DTmin = 5 K

Table 1 Data for Example 1



exchanger with a much larger total investment cost. Table 5 shows the values for
calculated areas and underestimated areas at the global solution for exchanger
1. The largest value of calculated areas corresponds to the bottleneck period
(in this case 5, 6 and 10). For these periods, the calculated areas are equal
to the underestimated areas (since at the global optimum, the upper bound is
equal to the lower bound for objective function). However for non-bottleneck
periods (e.g. 1,2,3 etc.), the underestimated area is lower than the calculated
area. Thus, an exact approximation for the areas is not required in the non-
bottleneck periods because these areas do not affect the value of the objective
function. This is true because only the largest value of calculated area is used
in the objective function which corresponds to the bottleneck periods. It is also
possible that for nonbottleneck periods, the underestimated area may be larger
than the calculated area (e.g. period 8) because the variable ANijkn is free
to take any value less than A ijk and still not affect the value of the objective
function.

5.2 Example 2
The network considered in the structural flexibility analysis problem from Grossmann and Floudas [11], with 4 hot and 3 cold streams (see Figure 3) and uncertain inlet temperatures, was solved using the proposed procedure.

Figure 3 Example 2 configuration

The network has a structural flexibility index of 0.75 for a ±10°C change in inlet temperatures. To determine the required areas, five periods of operation with values of inlet temperatures perturbed by ±7.5°C were taken; the data is presented in Table 2. The global solution was obtained after only 2 partitions and the solution is at one of the bounds. In this case, the NLP solver finds the same solution as the global solution of $77,127.

Table 4 presents the calculated areas at the global solution, which is the same as for the suboptimal solution in this case. The utility costs were weighed equally for all periods and are shown in Table 4. It can be observed from Table 3 that the NLP has a larger number of equations and variables as compared to Example 1. However, the time taken for solving each NLP is much smaller for Example 2. This can be explained on the basis of the network structure. Note that Example 1 has 4 exchangers and 10 periods as compared to 8 exchangers and 5 periods in Example 2. Thus, one would expect an equal number of nonlinear underestimating equations for both problems. However, for Example 2, the upper and lower bounds for area are equal for exchangers 1, 2, 6, 7 and 8. Thus, the nonlinear underestimators are redundant for these exchangers for all periods of operation. As a result, Example 2 has fewer nonlinear equations (168) than Example 1 (280), resulting in a lower computation time for the convex underestimator problem.

6 CONCLUSION
A global optimization procedure based on the approach of Quesada and Gross-
mann [17] has been extended to the multiperiod HEN design problem. The
multiperiod problem was formulated as an NLP with nonconvex, nondifferen-
tiable objective function with linear fractional terms and with linear constraints.
The model was reformulated using convex underestimators and a solution pro-
cedure with a modified partitioning rule was used to obtain the global optimum.
It was shown that an exact approximation for areas is not required for non-
bottleneck periods. Example problems indicate that for some instances there might be a significant difference between the global optimum and suboptimal solutions obtained using a local NLP solver, justifying the need for a global solution strategy for the multiperiod HEN design problem.

ACKNOWLEDGMENT: The authors would like to acknowledge funding from the Department of Energy under Grant DE-FG02-85ER13396.

Hot streams - inlet temperature for each period (K)
Stream    1      2      3      4      5
1         400    405    402    395    407.5
2         450    447.5  450    455    455
3         400    393    393    405    407
4         430    432.5  435    432    428

Hot streams - outlet temperature for each period (K)
1         325    325    325    325    325
2         350    350    350    350    350
3         360    360    360    360    360
4         298    298    298    298    298

Hot streams - heat capacity flow rate (kW/K)
1         4      4      4      4      4
2         2      2      2      2      2
3         2      2      2      2      2
4         2.5    2.5    2.5    2.5    2.5

Cold streams - inlet temperature for each period (K)
Stream    1      2      3      4      5
1         310    317    305    303    312
2         290    290    295    285    295
3         285    292.5  278.5  292    278

Cold streams - outlet temperature for each period (K)
1         380    380    380    380    380
2         410    410    410    410    410
3         340    340    340    340    340

Cold streams - heat capacity flow rate (kW/K)
1         5      5      5      5      5
2         6      6      6      6      6
3         2      2      2      2      2

Note: 1) Overall heat transfer coefficient (kW/K·m²) = 1.0 for each exchanger.
2) Area cost coefficient ($/m²) = 300 for each exchanger.
3) Hot utility cost = $258/kW.
4) Utility costs for all periods weighed equally in total cost.
5) DTmin = 1 K

Table 2 Data for Example 2



Table 3 Computational results for example problems

Example   Problem size (variables, equations)   Initial objective     Global     Subregions   LP(a)   NLP(a)
          Original       NLP_L                  C_L        C*         optimum                 time    time
1         (440, 731)     (515, 1333)            347,320    405,480    377,950    30           75      990
2         (635, 981)     (710, 1778)            77,117     77,138     77,122     2            4       9

(a) On IBM/R6000-530, CPU time in seconds

REFERENCES
[1] Brooke, A., Kendrick, D. and Meeraus, A., GAMS: A User's Guide, Scientific Press, Palo Alto (1992).
[2] Charnes, A. and Cooper, W.W., "Programming with Linear Fractional Functionals", Naval Research Logistics Q., 1962, 9, 181-186.
[3] Ciric, A.R. and Floudas, C.A., "Heat Exchanger Network Synthesis without Decomposition", Comput. & Chem. Eng., 1991, 6, 385-396.
[4] Daichendt, M.M. and Grossmann, I.E., "Preliminary Screening Procedures for the MINLP Synthesis of Process Systems-II. Heat Exchanger Networks", Comput. & Chem. Eng., 1994, 18, 679.
[5] Dolan, W.B., Cummings, P.T. and LeVan, M.D., "Process Optimization via Simulated Annealing: Application to Network Design", AIChE J., 1989, 35, 725-736.
[6] Floudas, C.A. and Grossmann, I.E., "Synthesis of Flexible Heat Exchanger Networks for Multiperiod Operation", Comput. & Chem. Eng., 1986, 10, 153.

Table 4 Computed areas and utility loads for examples

Exchanger areas (m²)
Example          1      2      3     4      5    6    7    8
1 (optimal)      733.5  92.0   2.6   125.6
1 (suboptimal)   331.1  471.6  170   7.1
2                3.3    5.1    3.3   8.0    8.4  7.2  1.4  2.7

Hot utility load for Example 2 (kW)
Period           1      2      3     4      5
                 270    213    271   316    195

[7] Floudas, C.A. and Grossmann, I.E., "Automatic Generation of Multiperiod Heat Exchanger Network Configurations", Comput. & Chem. Eng., 1987, 11, 123-142.
[8] Galli, M.R. and Cerda, J., "Synthesis of Flexible Heat Exchanger Networks-III. Temperature and Flowrate Variations", Comput. & Chem. Eng., 1991, 15, 7-24.
[9] Grossmann, I.E. and Straub, D.A., "Recent Developments in the Evaluation and Optimization of Flexible Chemical Processes", Proceedings Computer-Oriented Process Engineering (eds. L. Puigjaner and A. Espuna), Elsevier, 49-59, 1991.
[10] Grossmann, I.E. and Sargent, R.W.H., "Optimal Design of Multipurpose Chemical Plants", Ind. Eng. Chem. Proc. Des. Dev., 1979, 18, 343.
[11] Grossmann, I.E. and Floudas, C.A., "Active Constraint Strategy for Flexibility Analysis in Chemical Processes", Comput. & Chem. Eng., 1987, 11, 675.

Table 5 Calculated and underestimated areas for exchanger #1 in Example 1, showing the gap for non-bottleneck periods (areas in m²)

Period                1      2      3      4      5*     6*     7      8      9      10*
Calculated area       727.4  701.9  733.   730.5  733.5  733.5  722.8  718.9  733.3  733.5
Underestimated area   711.9  685.9  727.4  730.5  733.5  733.5  721.8  723.5  733.2  733.5

* denotes a bottleneck period with exact approximation for the exchanger

[12] Gundersen, T. and Naess, L., "The Synthesis of Cost Optimal Heat Exchanger Networks - An Industrial Review of the State of the Art", Comput. & Chem. Eng., 1988, 12, 503-530.
[13] Horst, R., "Deterministic Methods in Constrained Global Optimization: Some Recent Advances and Fields of Application", Nav. Res. Logistics, 1990, 37, 433-471.
[14] Marselle, D.F., Morari, M. and Rudd, D.F., "Design of Resilient Processing Plants-II. Design and Control of Energy Management Systems", Chem. Eng. Sci., 1982, 37, 259-270.
[15] Murtagh, B.A. and Saunders, M.A., MINOS User's Guide, Systems Optimization Laboratory, Department of Operations Research, Stanford University (1985).
[16] Papalexandri, K.P. and Pistikopoulos, E.N., "Synthesis and Retrofit Design of Operable Heat Exchanger Networks-I. Flexibility and Structural Controllability Aspects", Ind. Eng. Chem. Res., 1994, 33, 1718.
[17] Quesada, I. and Grossmann, I.E., "Global Optimization Algorithm for Heat Exchanger Networks", Ind. Eng. Chem. Research, 1993, 32, 487.
[18] Rudd, D.F. and Watson, C.C. (1968), Strategy of Process Engineering, John Wiley, New York.

[19] Yee, T.F. and Grossmann, I.E., "Simultaneous Optimization Models for Heat Integration-II. Heat Exchanger Network Synthesis", Comput. & Chem. Eng., 1990, 14, 1165-1184.
10
ALTERNATIVE BOUNDING
APPROXIMATIONS
FOR THE GLOBAL OPTIMIZATION
OF VARIOUS ENGINEERING
DESIGN PROBLEMS
I. Quesada and I.E. Grossmann
Department of Chemical Engineering
Carnegie Mellon University, Pittsburgh, PA 15213

ABSTRACT
This paper presents a general overview of the global optimization algorithm by
Quesada and Grossmann [6] for solving NLP problems involving linear fractional and
bilinear terms, and it explores the use of alternative bounding approximations.
These are applied in the global optimization of problems arising in different
engineering areas and for which different relaxations are proposed depending on the
mathematical structure of the models. These relaxations include linear and nonlinear
underestimator problems. Reformulations that generate additional estimator functions
are also employed. Examples from structural design, batch processes, portfolio
investment and layout design are presented.

INTRODUCTION
One of the difficulties in the application of continuous nonlinear optimization
techniques to engineering design problems is that one is often confronted with the
following dilemma. One can either apply fairly efficient gradient based techniques
(e.g. SQP or reduced gradient algorithms) or else one can apply direct or random
heuristic search procedures (e.g. complex method or simulated annealing). The
problem is that the former methods may only produce rigorous results when certain
convexity conditions hold, while the latter may in principle produce improved
solutions but at a computational expense that is unacceptably high. Also, if the

nonlinear programming (NLP) problem at hand is known to be nonconvex, the first alternative is generally inconsistent with the goal of finding a global optimum. While the second alternative may offer greater hope to globally optimize a design, the heuristic nature of these methods may produce results that are in fact worse than the ones obtained by a local search technique. Despite these difficulties, rigorous deterministic methods for nonconvex NLP models have been developed, especially over the last five years (see Horst [3]; Horst and Tuy [4] for a recent review). In this way it is increasingly possible to find global optimum solutions with reasonable computational expense. The specific structure of design problems is also being identified and better understood given the increased trend towards the use of equation based modeling systems.

The objective of this paper is to first present an overview of the global optimization algorithm proposed by Quesada and Grossmann [6] for solving nonconvex NLP problems that have the special structure that they involve linear fractional and bilinear terms. These problems can be represented in general as follows:

$\min g_0$
s.t. $g_l \le 0 \quad l = 1,\dots,L$  (NLP)

where $g_l = \sum_{i \in I}\sum_{j \in J} c^l_{ij}\, \frac{x_i}{y_j} - \sum_{i \in I'}\sum_{j \in J'} \hat{c}^l_{ij}\, x_i y_j + h_l(x, y, z), \quad l = 0, 1,\dots,L$
$x^L \le x \le x^U$
$y^L \le y \le y^U$
$z \in Z$

As shown above, the objective function and the constraints generally involve linear fractional and bilinear terms corresponding to the two summation terms, while the last term $h_l(x, y, z)$ is assumed to correspond to a convex function. These types of problems arise very often in engineering and management applications [1]. The difficulty involved in solving these NLP optimization problems is that the application of local search methods is generally not rigorous. Not only can a conventional NLP algorithm produce local solutions that are suboptimal, but the method may even fail to converge to a feasible solution due to the nonconvexities of the constraints.

A major objective of this paper is to explore the possibility of using alternative


bounding approximations for deriving valid relaxations. Different relaxations are
proposed depending on the mathematical structure of the model to be solved. Linear

and/or nonlinear estimator functions as the ones considered in Quesada and


Grossmann [6], [7] are included. In some cases, additional approximating functions
are obtained by reformulating and linearizing the original models. These constraints,
that are redundant for the original nonconvex problem, can often help to obtain a
tight convex relaxation.

Another objective of this paper is to consider the application of the proposed methods
to problems from a variety of areas. The first includes a layout design model. In
this model a fixed layout configuration is given and the dimensions of the different
units are to be optimized. A portfolio investment model is also considered and in
this case, the percentage to be invested in each security is optimized to minimize the
total variance. Also, a model for the design of truss structures is presented. The
objective in this case is to minimize the total weight of the structure. Finally, two
models for batch process design are considered where the size of the equipment has to
be selected. Numerical results are reported for all these problems.

Algorithm Outline
The main idea behind the method proposed by Quesada and Grossmann [6] is to first replace the bilinearities and linear fractional terms in (NLP) by valid under and overestimators which will yield a convex NLP (or LP) whose solution provides a lower bound to the global optimum. For instance, for fractional terms with positive coefficients, introducing the variables $r_{ij}$, the fractional term can be expressed as the constraint

$x_i = r_{ij}\, y_j \quad i \in I,\; j \in J$  (1)

which is nonconvex. Valid linear underestimators, which were suggested by McCormick [5] for this constraint, are given by

$x_i \le y_j^U r_{ij} + r_{ij}^L y_j - y_j^U r_{ij}^L \quad i \in I,\; j \in J$  (2a)
$x_i \le y_j^L r_{ij} + r_{ij}^U y_j - y_j^L r_{ij}^U \quad i \in I,\; j \in J$  (2b)

where $x_i^L, x_i^U, y_j^L, y_j^U, r_{ij}^L, r_{ij}^U$ are valid lower and upper bounds of the variables. In addition, Quesada and Grossmann [6] showed that the following nonlinear convex constraint

$r_{ij} \ge \frac{x_i}{y_j^U} + x_i^L\left(\frac{1}{y_j} - \frac{1}{y_j^U}\right) \quad i \in I,\; j \in J$  (3)

can be used as a valid underestimator. The interesting feature of (3) is that it is a stronger constraint than (2a) and (2b) provided $r_{ij}^L, r_{ij}^U$ are given by the bounds of $x_i$ and $y_j$. In fact, when these bounds are obtained by the optimization of individual variables in (NLP) it is also possible to generate projected bounding constraints which can serve to tighten the representation of the NLP [6].

The proposed method then consists in reformulating problem (NLP) in terms of valid
linear and nonlinear bounding constraints such as in (2)-(3), giving rise to a convex
NLP (or LP) problem which predicts valid lower bounds to the global optimum. If
there is a difference between the current upper and lower bounds, the idea is to
partition the feasible region by performing a spatial branch and bound search as
outlined in the following steps:

Step 0. Initialization step.

(a) Set the upper bound to f* = ∞, and select the tolerance ε.

(b) Bounds over the variables involved in the nonconvex terms are obtained. For
this purpose specific subproblems can be solved or a relaxation of the original
problem is used. Update the upper bound f*.

(c) Define space W0 as a valid relaxation of the feasible region in the space of the
nonconvex variables. The branch and bound search will be conducted over Wo.
The list F is initially defined as the region Wo.

(d) Construct a convex underestimator problem (CU_L) by replacing the nonconvex terms in the original problem with additional variables and introducing valid convex approximations of these nonconvex terms (e.g. equations (2) and (3)). Constraints that are valid but were not present in the original problem because they were redundant can be included to tighten the convex relaxation.

Step 1. Convex underestimator problem.


(a) Solve problem CU_L over the relaxed feasible region W0. The solution corresponds to a valid lower bound (fL) of the global optimum. The actual objective function is evaluated if this is a feasible solution; otherwise the original problem (NLP) is solved using the convex solution as the initial point. Update the upper bound.

(b) If (f* − fL) ≤ ε f*, stop; the global solution corresponds to f*.

Step 2. Partition.
From the list F consider a subregion Wj (generally the region with the smallest fL is selected) and divide it into two new subregions Wj+1 and Wj+2, which are added to the list F; subregion Wj is deleted from F.

Step 3. Bounding.
(a) Solve problem CUL for the two new subregions.

(b) If the solutions are feasible evaluate the actual objective function. Otherwise
the original nonconvex problem can be solved according to a given criterion.

Step 4. Convergence.
Delete from the list F any subregion with (f* − fL) ≤ ε f*. If the list F is empty then stop; the global optimum is f*. Otherwise go to Step 2.

REMARKS
The global optimization algorithm described in the previous section uses a spatial
branch and bound procedure (steps 2 to 4). As many of the branch and bound
methods, the algorithm consists of a set of branching rules, and upper bounding and
lower bounding procedures.

The branching rules include the node selection rule, the branching variable selection and the level at which the variable is branched on. A simple branching strategy has been followed in this work. The node with the smallest lower bound is the node selected to branch on, and two new nodes are generated using constraints of the type $x_i \le x_i^*$ and $x_i \ge x_i^*$.

Different strategies can be used to do the branching. These include generating more
than two nodes from a parent node, using different type of branching constraints or
different node selection rules. For the latter, some type of degradation function
similar to the one used in branch and bound for MILP problems can be used.

Additional criteria used in branch and bound algorithms for MILP problems can be
extrapolated to the global optimization case. These include the fixing of variables,
tightening of bounds, range reduction, etc. [8]. One main difference between the
branch and bound for binary variables and the spatial branch and bound search used

here is the fact that it might be necessary to branch more than once on the same
variable. When in the selection rule there is more than one variable within a small
range it is often useful to branch on a variable that has not been used previously even
though it may not be the first candidate.

Information of the convex underestimator problem can be employed to select the


branching variables. At this point only the difference between the convex solution
and the actual value of the functions is used. It is also possible to consider dual
information, second order information or to generate small selection subproblems
[10].

With respect to the upper bound there are two cases. The first one is when the
feasible region of the original problem is convex. In this case the evaluation of the
original objective function at the solution of the convex underestimator problem
often provides a good upper bound. For the case of a nonconvex feasible region it is
sometimes necessary to obtain an upper bound through a different procedure since the
solution of the convex underestimator problems might be infeasible for the original
problem. In some particular cases it may be better to use a specialized heuristic to
obtain a good upper bound. In general, however, it may be necessary to solve the
original nonconvex problem to generate an upper bound. As pointed out in Quesada
and Grossmann [6], [7] the solution of the convex underestimator problem provides a
good initial point to the nonconvex problem.

Our previous work has mainly concentrated on the generation of tight convex
relaxations, which are generally nonlinear, and that allow for an efficient lower
bounding of the global optimum. The major motivation has been to reduce the effort
in the spatial branch and bound search. The use of additional convex relaxations that
are somewhat different from the ones used in Quesada and Grossmann [6] is explored
for the models presented in this paper.

To be able to obtain a tight convex relaxation it is necessary to obtain a good


approximation of the convex envelope of the nonconvex function. The linear and
nonlinear estimator functions used in Quesada and Grossmann [6] correspond to the
convex envelope over the boundaries defined by lower and upper bounds of the
nonconvex variables. These bounds are a relaxation of the actual feasible region. It
is often the case, however, that they do not yield a tight convex relaxation of the
feasible region (see Figure 1). The use of projections such as the ones described in
Quesada and Grossmann [6,7] help to obtain tighter relaxations of the feasible region.
Moreover, reformulation and generation of additional constraints can also improve the
approximation of the convex envelope over a tighter feasible region. To illustrate
these points consider the linear constrained feasible region in Figure 1.

Figure 1 Linear constrained feasible region and relaxations (feasible region bounded by $x + a_1 y \le b_1$ and $x + a_2 y \ge b_2$, with $a_1, a_2, b_1, b_2 > 0$)

Lower and upper bounds over the variables x and y can be obtained through heuristics or the solution of LP subproblems. In this case, the best possible bounds are given by $x^L, x^U, y^L$ and $y^U$. Consider the linear under and over estimators by McCormick [5] for the bilinear term xy, used in Quesada and Grossmann [6]:

$xy \ge \max\,[\,x^L y + y^L x - x^L y^L,\; x^U y + y^U x - x^U y^U\,]$
$xy \le \min\,[\,x^L y + y^U x - x^L y^U,\; x^U y + y^L x - x^U y^L\,]$

These constraints correspond to the convex and concave envelopes of the bilinear
term over the relaxation of the feasible region defined by the lower and upper bounds
of the variables. As pointed out in Quesada and Grossmann [6], [7] these estimators
have the property of matching the actual function at the boundaries. However, these
equations do not always provide tight bounds since the relaxation of the actual
feasible region can be very loose. Consider the value of the bilinear term over the boundary defined by the first constraint, $x + a_1 y = b_1$, which is given by

$xy = \frac{x(b_1 - x)}{a_1}$  (4)

This is a concave term, and better approximations of it can be obtained by reformulating the problem. Take that particular inequality, $b_1 - x - a_1 y \ge 0$, and multiply it by the valid bound constraint, $x - x^L \ge 0$, obtaining

$xy \le \frac{b_1 x - x^2 - b_1 x^L + x^L x + a_1 x^L y}{a_1}$  (5)

The above is a concave overestimator and therefore a valid convex constraint that can be included in the formulation. It is also tighter since it provides an exact approximation of the bilinear term over the linear constraint. In the case that the valid bound constraint, $x^U - x \ge 0$, is used to generate additional constraints, the following equation is obtained:

$xy \ge \frac{b_1 x + x^U x - x^2 + a_1 x^U y - b_1 x^U}{a_1}$  (6)

The above inequality is a concave underestimator, and the concave term $-x^2$ has to be linearized over the bounds $x^L$ and $x^U$. This corresponds to the approach followed by Sherali and Alameddine [9]. With this reformulation-linearization a linear underestimation of the bilinear term over that particular boundary is obtained. In fact, (6) is the best approximation of the bilinear term on this boundary, since it projects in a concave form as in (4) and the approximation is a linear estimator that matches the actual function at the extreme points A and B. Equation (5) corresponds to the concave overestimator envelope of the bilinear term on that boundary and helps to generate a tighter convex approximation.
In the case of constraints like the second one in Figure 1, $x + a_2 y \ge b_2$, the bilinear term behaves in a convex form. Convex quadratic underestimators that match the function on the boundary, and tighter linear overestimators, can be obtained.
The introduction of these additional constraints yields a tighter convex underestimator
problem. However, there is a trade-off since the size of the underestimator problems
can become substantially large. Nevertheless, the use of projections or some
particular mathematical structures can be employed to identify the most relevant
additional constraints so as to avoid generating a large number of constraints. In the
following applications different types of relaxations are used which include linear
and/or nonlinear constraints.
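The reformulation step above is easy to verify symbolically. The following sketch (our own illustration) expands the product of the two valid inequalities and confirms that the resulting overestimator (5) is exact on the boundary $x + a_1 y = b_1$:

    import sympy as sp

    x, y, xL, a1, b1 = sp.symbols('x y xL a1 b1', positive=True)

    # Product of the two valid inequalities (b1 - x - a1*y) >= 0 and (x - xL) >= 0:
    product = sp.expand((b1 - x - a1*y) * (x - xL))

    # Rearranged as an overestimator of the bilinear term (equation (5)):
    overestimate = (b1*x - x**2 - b1*xL + xL*x + a1*xL*y) / a1
    assert sp.simplify(product - a1*(overestimate - x*y)) == 0

    # On the boundary y = (b1 - x)/a1 the cut is exact: it equals x*y there.
    on_boundary = overestimate.subs(y, (b1 - x)/a1)
    assert sp.simplify(on_boundary - x*(b1 - x)/a1) == 0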

Layout Design
In this example a floor layout is given in which the distribution of the rooms is
known. The dimensions of the rooms are to be optimized to minimize the total cost
that is a function of the area of the rooms.

Figure 2 Layout for example 1 (living room, dining room, kitchen, bathroom, storage, and bedrooms 1 and 2, with widths $x_1,\dots,x_4$ and lengths $y_1,\dots,y_7$)

Example 1
Consider the layout given in Figure 2. Here the two bedrooms have the same length. The storage room and the bathroom also have the same length. The complete formulation is given by

$\min f = 2\,(x_1 y_1 + x_3 y_5) + x_1 y_2 + x_2 y_T + x_3 y_4 + x_4 y_6 + x_4 y_7 + 0.5\,(x_3 y_3)$
s.t. $y_T = y_1 + y_2$
$y_T = y_3 + y_4 + y_5$
$y_T = y_6 + y_7$
$y_7 - y_5 \ge 1$
$y_6 - y_3 \ge 1$
$x_1 + x_2 \ge 8$
$3 \le x_1 \le 5,\; 4 \le x_2 \le 6,\; 2 \le x_3 \le 4,\; 4 \le x_4 \le 6$
$5 \le y_1 \le 7,\; 2 \le y_2 \le 5,\; 2 \le y_3 \le 5,\; 2 \le y_4 \le 4,\; 3 \le y_5 \le 5,\; 3 \le y_6 \le 6,\; 4 \le y_7 \le 6$
$13 \le x_T \le 21,\; 8 \le y_T \le 12$

318 I. QUESADA AND I. E. GROSSMANN

The objective function consists of minimizing the total cost as a function of the areas of the rooms. The fifth and sixth constraints ensure that some hall space is left for the doors. Bounds over the dimensions of the rooms are given. The feasible region is linear and the nonconvexities are involved in the objective function. The bilinear terms can be linearized ($w_{ij} = x_i y_j$) and the linear underestimators used in (2) and (3) are included:

$w_{ij} = x_i y_j \ge x_i^L y_j + y_j^L x_i - x_i^L y_j^L$  (7)
$w_{ij} = x_i y_j \ge x_i^U y_j + y_j^U x_i - x_i^U y_j^U$  (8)

Only underestimators are considered because the bilinear terms are only present in the objective function with a positive coefficient. The nonlinear estimators are not used since there are no bounds over the individual bilinear terms [6]. The linear underestimator problem is solved and a solution of fL = 130 is obtained. The approximations are exact and this solution corresponds to the global solution with x1 = 3, x2 = 5, x3 = 2, x4 = 4, y1 = 5, y2 = 3, y3 = 3, y4 = 2, y5 = 3, y6 = 4 and y7 = 4.
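For readers who wish to reproduce this bound, a sketch of the LP underestimator using scipy.optimize.linprog is given below (our own encoding of the formulation above; the variable ordering and helper names are ours):

    import numpy as np
    from scipy.optimize import linprog

    # Variable order: x1..x4 (0-3), y1..y7 and yT (4-11), then one w per bilinear term.
    xb = [(3, 5), (4, 6), (2, 4), (4, 6)]
    yb = [(5, 7), (2, 5), (2, 5), (2, 4), (3, 5), (3, 6), (4, 6), (8, 12)]
    # (x-index, y-index, objective coefficient); yT is y-index 7.
    terms = [(0, 0, 2.0), (2, 4, 2.0), (0, 1, 1.0), (1, 7, 1.0),
             (2, 3, 1.0), (3, 5, 1.0), (3, 6, 1.0), (2, 2, 0.5)]
    nx, ny = len(xb), len(yb)
    n = nx + ny + len(terms)

    def mkrow(entries):
        r = np.zeros(n)
        for idx, val in entries:
            r[idx] = val
        return r

    c = np.zeros(n)
    A_ub, b_ub = [], []
    for k, (i, j, cost) in enumerate(terms):
        c[nx + ny + k] = cost
        for xB, yB in ((xb[i][0], yb[j][0]), (xb[i][1], yb[j][1])):
            # McCormick cuts (7)-(8):  w >= xB*y + yB*x - xB*yB
            A_ub.append(mkrow([(i, yB), (nx + j, xB), (nx + ny + k, -1.0)]))
            b_ub.append(xB * yB)

    A_ub += [mkrow([(nx + 4, 1), (nx + 6, -1)]),   # y7 - y5 >= 1
             mkrow([(nx + 2, 1), (nx + 5, -1)]),   # y6 - y3 >= 1
             mkrow([(0, -1), (1, -1)])]            # x1 + x2 >= 8
    b_ub += [-1, -1, -8]

    A_eq = [mkrow([(nx + 7, 1), (nx + 0, -1), (nx + 1, -1)]),                # yT = y1+y2
            mkrow([(nx + 7, 1), (nx + 2, -1), (nx + 3, -1), (nx + 4, -1)]),  # yT = y3+y4+y5
            mkrow([(nx + 7, 1), (nx + 5, -1), (nx + 6, -1)])]                # yT = y6+y7
    b_eq = [0, 0, 0]

    bounds = xb + yb + [(xb[i][0] * yb[j][0], xb[i][1] * yb[j][1]) for i, j, _ in terms]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    print(res.fun)   # 130.0 under this reading -- the reported lower bound fL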

Example 2
A second layout example is considered using a similar configuration (see Figure 3).
In this case the dimensions of the bathroom are allowed to change independently.
Constraints over the aspect ratio and the size of the rooms are included. The
objective function contains an additional term that accounts for the perimeter of the
layout. The complete formulation is the following,

Figure 3 Layout for example 2 (rooms 1-8, with horizontal dimension x and vertical dimension y)

$\min f = 10\,x_1 y_1 + 11\,x_3 y_3 + 1.5\,x_7 y_7 + 1.5\,x_8 y_8 + (x_2 y_2 - x_3 y_3) + x_4 y_4 + x_5 y_5 + 0.5\,(x_6 y_6) + 1.5\,(y_1 + y_2) + x_T$
s.t. $y_T = y_1 + y_2$
$y_T = y_7 + y_8$
$y_1 = y_4$
$y_4 = y_5 + y_6$
$y_7 - y_3 \ge 1$
$y_8 - y_6 \ge 1$
$y_3 \le y_2$  (NLPLAY2)
$x_T = x_2 + x_3 + x_7$
$x_7 = x_8$
$x_2 = x_1 + x_4 + x_5$
$x_6 = x_5$
$x_3 \le x_2$
$x_3 \le x_5$
$a_i x_i \le y_i$
$y_i \le b_i x_i$
$x_i y_i \ge d_i$

This new problem has a nonconvex objective function and nonlinear constraints. The data for the ratio constants ($a_i$, $b_i$) and the area lower bounds ($d_i$) are given in Table 1. The nonconvex terms in the objective function are linearized and linear estimators are introduced. The nonlinear constraints over the area can be written in a convex form as

$x_i \ge \frac{d_i}{y_i}$  (9)

Table 1 Data for the second layout example

Room    1      2      3       4      5      6      7      8
a_i     1.25   1/3    1/1.5   1.25   1      1      1.25   1.25
b_i     1.5    1/2    1/1.25  1.5    1.25   1.25   1.5    1.5
d_i     16     40     10      20     4      4      20     20

Additional convex nonlinear approximations can be generated. These nonlinear constraints are obtained using the aspect ratio constraints. Consider the inequality

$b_i x_i - y_i \ge 0$  (10)

Multiplying by the constraint $y_i \ge 0$ and linearizing ($w_i = x_i y_i$) yields

$w_i \ge \frac{y_i^2}{b_i}$  (11)

which is a convex constraint. In the same form the other ratio constraints, $y_i - a_i x_i \ge 0$, can be multiplied by $x_i \ge 0$ to obtain the following constraints:

$w_i \ge a_i x_i^2$  (12)

In this form a convex nonlinear underestimator problem is obtained by introducing equations (7), (8), (9), (11) and (12) in model (NLPLAY2) and linearizing the bilinear terms. The convex nonlinear underestimator problem has a solution of fL = 440.6. This solution is feasible and has an actual objective function value of f = 440.99. Since the difference is ε = 0.07%, it is considered the global optimal solution.

Optimal Design of Structures


An application in civil engineering is the design of a truss structure [2]. It is
assumed that a truss consists of a given number of bars, m, with a fixed location and
that are subject to a number of different loading conditions (see Figure 4). The
objective is to determine the cross section areas of the bars to minimize the weight of
the truss structure. The NLP formulation is the following,
a) Objective function: minimize the total weight

$\min f = \sum_{i=1}^m \rho_i \lambda_i\, a_i$  (13)

b) Equilibrium equations

$\sum_{i=1}^m b_{ik}\, s_{ij} = P_{jk}$ for j = 1,...,L, k = 1,...,n  (14)

c) Compatibility equations

$\sum_{k=1}^n b_{ik}\, d_{jk} = v_{ij}$ for i = 1,...,m, j = 1,...,L  (15)

d) Hooke's law

$\frac{E_i}{\lambda_i}\, a_i v_{ij} = s_{ij}$ for i = 1,...,m, j = 1,...,L  (16)

e) Stress equations

$\frac{E_i}{\lambda_i}\, v_{ij} = \sigma_{ij}$ for i = 1,...,m, j = 1,...,L  (17)

f) Bounds

$d_{jk}^L \le d_{jk} \le d_{jk}^U$  (18)
$\sigma_{ij}^L \le \sigma_{ij} \le \sigma_{ij}^U$  (19)
$s_{ij}^L \le s_{ij} \le s_{ij}^U$  (20)
$v_{ij}^L \le v_{ij} \le v_{ij}^U$  (21)
$a_i^L \le a_i \le a_i^U$  (22)

where n is the number of degrees of freedom and L the number of loading conditions. The parameters are the following: $\lambda_i$ is the length of bar i, $E_i$ is the modulus of elasticity of bar i, $\rho_i$ is the density of bar i, $P_{jk}$ is the kth component of the load at condition j, and $b_{ik}$ is the direction cosine relating the force in bar i to degree of freedom k. The variables are $\sigma_{ij}$, the stress of bar i for condition j; $a_i$, the cross section area of bar i; $s_{ij}$, the force in bar i for condition j; $v_{ij}$, the elongation of bar i for condition j; and $d_{jk}$, the displacement at degree of freedom k for condition j.

The objective function of this model is linear and the nonconvex terms in the form of
bilinearities are involved in Hooke's law equations (16).

Example 3
This example consists of the truss illustrated in Figure 4. The modulus of elasticity is 1×10⁷ psi, the density is 0.1 lb/in³ and the maximum stress is 20,000 psi in compression or tension. The remaining data are given in Table 2.

Table 2 Data for example 3

Bar       1          2          3          4          5
b_i1     -0.89443   -0.95783   -0.99504   -0.99504   -0.95783
b_i2     -0.44721   -0.28735   -0.0995     0.09950    0.28735
λ_i      111.8034   104.4031   100.4988   100.4988   104.4031

-2 ≤ d_jk ≤ 2, -200,000 ≤ s_ij ≤ 200,000, -0.22 ≤ v_ij ≤ 0.22, 0 ≤ a_i ≤ 10, -20,000 ≤ σ_ij ≤ 20,000

Figure 4 Structure for example 3 (five-bar truss under a 100,000 lb load)

The bilinear terms are linearized by $w_{ij} = v_{ij} a_i$ and linear over and under estimators [5] are included. In this case it is possible to exploit further the mathematical structure of this problem. Additional constraints are generated using the stress equations (17):

$\frac{E_i}{\lambda_i}\, v_{ij} = \sigma_{ij}$  (17)

Multiplying by $a_i \ge 0$ yields

$\frac{E_i}{\lambda_i}\, v_{ij}\, a_i = \sigma_{ij}\, a_i$  (23)

which can be linearized with $z_{ij} = \sigma_{ij} a_i$ to obtain

$\frac{E_i}{\lambda_i}\, w_{ij} = z_{ij}$  (24)

Linear over and under estimators are also included for $z_{ij} = \sigma_{ij} a_i$. The resulting LP model includes the estimators for $z_{ij}$ and $w_{ij}$ and the equations (24). The solution of this problem is fL = 147.5 lb and the approximations are exact, corresponding to the global solution with a = (7.102, 0, 0, 0, 6.525). If the additional equations (24) with the corresponding linear estimators are not generated, the lower bound yields fL = 144.0 lb, which represents a 2.3% gap from the global optimum. When the original nonconvex problem is solved with MINOS 5.2 providing zero values as an initial point, no feasible solution is obtained.

Example 4

Consider the truss shown in Figure 5. The modulus of elasticity is 1×10⁷ psi, the density 0.1 lb/in³ and the maximum stress is 25,000 psi in compression or tension. The remaining data are given in Table 3.

The same reformulation as in example 3 is used. The LP solution is fL = 1,584 lb and it corresponds to the global solution with a = (8, 0, 8, 4, 0, 0, 5.657, 5.657, 5.657, 0). It is important to notice that in both examples the reformulated LP converges in one iteration. The non-reformulated LP has a solution of fL = 1,373 lb, which is still 15% below the global optimum.

Table 3 Data for example 4

Bar    1    2    3    4    5    6    7        8        9        10
b_i1   1   -1    0    0    0    0    0       -.7071   -.7071    0
b_i2   0    0    0    0    0    0    0       -.7071    .7071    0
b_i3   0    0    1   -1    0    0    .7071    0        0       -.7071
b_i4   0    0    0    0   -1    0   -.7071    0        0       -.7071
b_i5   0    1    0    0    0    0    0        0        0        .7071
b_i6   0    0    0    0    0    0    0        0        0        .7071
b_i7   0    0    0    1    0    0    0        0        .7071    0
b_i8   0    0    0    0    0   -1    0        0       -.7071    0
λ_i    360  360  360  360  360  360  509.1    509.1    509.1    509.1

-10 ≤ d_jk ≤ 10, -250,000 ≤ s_ij ≤ 250,000, -1.273 ≤ v_ij ≤ 1.273, 0 ≤ a_i ≤ 10, -25,000 ≤ σ_ij ≤ 25,000

Figure 5 Structure for example 4



Portfolio Investment
A set of securities, i, is available for investing. The investment has to achieve a target mean annual return according to the mean annual returns on the individual securities, $m_i$. The total variance of the investment has to be minimized. By defining $x_i$ as the fraction to be invested in each security i, the optimization problem can be expressed as:

$\min f = \sum_i \sum_{i'} x_i\, V_{ii'}\, x_{i'}$
s.t. $\sum_i x_i = 1$
$\sum_i m_i x_i = \text{target}$

In this case the bilinear terms in the objective function are linearized by introducing variables $w_{ii'}$ and the linear estimators. The quadratic terms $x_i^2$ remain in the convex underestimator problem when the variance coefficient $V_{ii}$ is positive. The upper bounds on the investment fractions $x_i$ can in some cases be tightened according to the following equation

(25)

Example 5
The data for this example are given in Table 4. The initial lower bound is fL = 5.22 and corresponds to an actual objective function value of f = 5.429. Since the difference is greater than the tolerance, ε = 0.001, a branch and bound search is conducted. After 7 nodes the global optimum of f = 5.429 is obtained with x = (0.143, 0.143, 0.714, 0.0).

Table 4 Data for example 5

i        1    2    3    4
V_i1     4    3   -1    0
V_i2     3    6    1    0
V_i3    -1    1   10    0
V_i4     0    0    0    0
m_i      8    9   12    7

target = 11
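As a quick check (our own, not part of the original text), the reported solution can be verified directly; the fractions are exactly (1/7, 1/7, 5/7, 0):

    import numpy as np

    V = np.array([[ 4, 3, -1, 0],
                  [ 3, 6,  1, 0],
                  [-1, 1, 10, 0],
                  [ 0, 0,  0, 0]], dtype=float)
    m = np.array([8, 9, 12, 7], dtype=float)

    x = np.array([1/7, 1/7, 5/7, 0.0])   # reported optimum (0.143, 0.143, 0.714, 0.0)
    print(x.sum())                        # 1.0  -- budget constraint
    print(m @ x)                          # 11.0 -- target return constraint
    print(x @ V @ x)                      # 5.4286 -- the reported optimal variance 5.429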

Batch Process Design


Consider the design and production planning of a multiproduct batch plant with one
unit per stage (see Figure 6).

Figure 6 Multiple stage batch process

The objective is to maximize the profit given by the income from the sales of the
products minus the investment cost. Lower bounds are specified for the demands of
the products and the investment cost is assumed to be given by a linear cost function.
Since the size of the vessels and the number of batches are assumed to be continuous,
this gives rise to the following NLP model:

$\max P = \sum_i p_i\, n_i B_i - \sum_j a_j V_j$
s.t. $V_j \ge S_{ij} B_i \quad i = 1,\dots,N,\; j = 1,\dots,M$  (NLP_P)
$\sum_i n_i T_i \le H$
$\frac{Q_i^L}{n_i} - B_i \le 0 \quad i = 1,\dots,N$

where $n_i$ and $B_i$ are the number and size of the batches for product i, and $V_j$ is the size of the equipment at stage j. The first inequality is the capacity constraint in terms of the size factors $S_{ij}$, the second is the horizon constraint in terms of the cycle times $T_i$ for each product and the total time H, and the last inequality is the specification of lower bounds for the demands $Q_i^L$. Note that the objective function is nonconvex as it involves bilinear terms, while the constraints are convex.

Example 6
The data for this example are given in Table 5. A maximum size of 5000 L is
specified for the units in each stage.

Table 5 Data for Example 6

Product   T_i (hrs)   p_i ($/kg)   Q_i^L (kg)   S_ij (L/kg), j = 1, 2, 3
A         16          15           80,000       2   3   4
B         12          13           50,000       4   6   3
C         13.6        14           50,000       3   2   5
D         18.4        17           25,000       4   3   4

a_1 = 50, a_2 = 80, a_3 = 60 ($/L); H = 8,000 hrs

When a local search algorithm (MINOS 5.2) is used for solving this (NLP_P) problem (default starting point in GAMS), the predicted optimum profit is $8,043,800/yr, and the corresponding batch sizes and their number are shown in Table 6.

Table 6 Suboptimal solution for Example 6

Product   A       B        C      D
B_i       1250    833.33   1000   1250
n_i       79.15   60       50     289.87

Since the formulation (NLP_P) is nonconvex, there is no guarantee that this solution is the global optimum. This problem can be reformulated by replacing the nonconvex terms $n_i B_i$ in the objective function by variables $q_i$, bounded by estimator functions, to generate a valid NLP underestimator problem with the following constraints:

$q_i \le n_i^L B_i + B_i^U n_i - n_i^L B_i^U$  (26)
$q_i \le n_i^U B_i + B_i^L n_i - n_i^U B_i^L$  (27)

The estimator functions require the solution of LP subproblems to obtain tight bounds on the variables, and yield a convex NLP problem with 8 additional constraints.

The optimal profit predicted by the nonlinear underestimator problem is $8,128,100/yr, with the variables given in Table 7. When the objective function of the original problem (NLP_P) is evaluated at this feasible point, the same value of the objective function is obtained, proving that it corresponds to the global optimal solution. It is interesting to note that both the local and global solutions had the maximum equipment sizes. The only difference was in the number of batches produced for products A and D.

Table 7 Global optimum solution for Example 6

Product    A       B        C      D
B_i (kg)   1250    833.33   1000   1250
n_i        389.5   60       50     20

Alternative Model for Batch Process


The next example corresponds to an alternative formulation of the batch process design problem considered in the previous section. A process with one line per stage is also considered, operating with single product campaigns. All the products require the same sequence of processing stages. The sizes of the equipment, $V_j$, and the outputs of the products, $Q_i$, are optimized to minimize the cost. Removing the number of batches $n_i$ as variables, the NLP formulation becomes

$\min f = \sum_j a_j V_j - \sum_i b_i Q_i$
s.t. $V_j \ge S_{ij} B_i \quad \text{for } i = 1,\dots,N,\; j = 1,\dots,M$  (NLP_B)
$\sum_i T_{Li}\, \frac{Q_i}{B_i} \le H$
$V_j^L \le V_j \le V_j^U$
$Q_i^L \le Q_i \le Q_i^U$
$B_i \ge 0$

The first set of constraints corresponds to the volume requirements for each unit with respect to all the products. The second constraint states that the total time of production has to be smaller than the allocated time H. The third constraint represents a raw material limitation on the total feed F. Bounds over the volumes, $V_j$, and the production levels, $Q_i$, are given. Note that the nonconvexities appear in the time constraint in the form of a sum of linear fractions. Nonlinear underestimators of these terms are included and have the following form:

$R_i = \frac{Q_i}{B_i} \ge \frac{Q_i}{B_i^L} + Q_i^U\left(\frac{1}{B_i} - \frac{1}{B_i^L}\right)$  (28)
$R_i = \frac{Q_i}{B_i} \ge \frac{Q_i}{B_i^U} + Q_i^L\left(\frac{1}{B_i} - \frac{1}{B_i^U}\right)$  (29)

It is necessary to have bounds over the batch sizes $B_i$. These are given by the following valid relaxations of the original constraints in (NLP_B):

$B_i \le \min_j \left[\frac{V_j^U}{S_{ij}}\right] = B_i^U$  (30)
$B_i \ge \frac{Q_i^L\, T_{Li}}{H} = B_i^L$  (31)
Example 7
This example involves 5 products and 6 stages, and the corresponding data are given
in Table 8. The following additional linear constraints are imposed;

(32)
(33)

Table 8 Data for Example 7

Product      A         B         C         D         E
Q_i^L (kg)   200,000   120,000   180,000   130,000   100,000
Q_i^U (kg)   300,000   180,000   200,000   160,000   150,000
b_i          0.8       0.7       0.6       0.4       0.5
             0.1       0.15      0.15      0.2       0.2
T_Li (hr)    8.31      6.8       11.9      3.5       4.2

a_j = 2.5 $/L; V_j^L = 3,000 L; V_j^U = 6,000 L; F = 550,000 kg

The initial lower bound is fL = −74,448 and it corresponds to an infeasible solution of (NLP_B). The original nonconvex problem is solved to generate an upper bound, using the solution of the underestimator problem as the initial point. In this form an upper bound of f = −73,270 is generated. It is necessary to perform a branch and bound search, and after 7 nodes the initial upper bound is proven to be globally optimal with tolerance ε = 0.01. The global solution yields V = (5737, 3600, 3776, 4983, 4430, 4014) (L).

CONCLUSIONS
This paper has presented a general overview of the global optimization algorithm by
Quesada and Grossmann [6] and outlined several alternative bounding approximations
which can be applied in layout design, truss structures, portfolio investment and
batch process design. As has been shown the use of some of these alternative
approximations can sometimes tighten the relaxations so that the solution of only
one convex programming problem is required.

ACKNOWLEDGMENT
The authors would like to acknowledge financial support from the Engineering Design Research Center at Carnegie Mellon University.

REFERENCES
[1] Floudas, C.A. and Pardalos, P.M. (1990). A Collection of Test Problems for Constrained Global Optimization Algorithms (eds. G. Goos and J. Hartmanis), Springer Verlag.

[2] Grossmann, I.E., Voudouris, V.T. and Ghattas, O. (1992). "Mixed-Integer Linear Programming Reformulations for Some Nonlinear Discrete Design Optimization Problems". Recent Advances in Global Optimization (Floudas, C.A. and Pardalos, P.M., eds.), Princeton University Press, Princeton, NJ, 478-512.

[3] Horst, R. (1990). "Deterministic Methods in Constrained Global Optimization: Some Recent Advances and Fields of Application". Naval Research Logistics, 37, 433-471.

[4] Horst, R. and Tuy, H. (1990). Global Optimization: Deterministic Approaches. Springer-Verlag, Berlin, New York.

[5] McCormick, G.P. (1976). "Computability of Global Solutions to Factorable Nonconvex Programs: Part I - Convex Underestimating Problems". Mathematical Programming, 10, 146-175.

[6] Quesada, I. and Grossmann, I.E. (1995). "A Global Optimization Algorithm for Linear Fractional and Bilinear Programs". Journal of Global Optimization, 6, 39-76.

[7] Quesada, I. and Grossmann, I.E. (1993). "Global Optimization Algorithm for Heat Exchanger Networks". Ind. Eng. Chem. Research, 32, 487-499.

[8] Sahinidis, N.V. (1993). "Accelerating Branch and Bound in Continuous Global Optimization". TIMS/ORSA Meeting, Phoenix, AZ, paper MA 36.2.

[9] Sherali, H.D. and Alameddine, A. (1992). "A New Reformulation-Linearization Technique for Bilinear Programming Problems". Journal of Global Optimization, 2, 379-410.

[10] Swaney, R.E. (1990). "Global Solution of Algebraic Nonlinear Programs". AIChE Meeting, Paper No. 22f.
11
A PIPE RELIABILITY AND COST
MODEL FOR AN INTEGRATED
APPROACH TOWARD DESIGNING
WATER DISTRIBUTION SYSTEMS
Hanif D. Sherali, Ernest P. Smith, Seong-in Kim
Respectively: Virginia Polytechnic Institute and State University,†
Air Force Institute of Technology,
and Korea University

ABSTRACT
A municipal water distribution system is a network of underground pipes,
usually mirroring the city street network, that connects water supply sources
such as reservoirs and water towers with demand points such as residential
homes, industrial sites and fire hydrants. These systems are extremely
expensive to install, with costs running in the tens of millions of dollars.
Several optimization models and algorithms have been developed to generate
a least cost construction plan along with optimal flows and energy heads for a
given network configuration and demand pattern. However, in reality, such models need to examine replacement and expansion decisions associated with an existing distribution network, rather than generate a new design from scratch. Moreover, several input parameters need to be determined via a pipe reliability and cost analysis, which in turn is dependent on the usage of the system as determined by the output of this model. Accordingly, we propose in this paper a pipe reliability and cost submodel that uses statistical methods to predict pipe breaks and hence to estimate future maintenance costs. This in turn determines annualized costs and optimal economic lives, thereby facilitating replacement decisions for relatively expensive-to-maintain or
undercapacitated pipes. This model is then integrated with the pipe network
optimization submodel in an overall design approach that uses a feedback
loop to reprocess the information that is generated by each model over a
number of stages, until a stable design is attained. The proposed approach
hence provides a holistic framework for designing reliable water distribution
systems. In particular, it identifies a nonconvex optimization subproblem that
global optimizers can focus on solving within this framework.

† This work has been partially supported by a research grant from the University of Korea, Seoul, Korea.

1 INTRODUCTION

1.1 MOTIVATION

The quality of water distribution systems (WDS) plays a crucial role in a


society because of its strong contribution to community health, firefighting
capabilities, quality of life, and the potential for future economic growth.
Aging, deteriorating water systems have long been recognized as a critical
national problem. Many existing systems are old, and pipes are heavily
tuberculated or undersized for present day use, and hence unable to provide
the required discharge at adequate pressure heads. Pipe failures occur
frequently in many cities, and recent catastrophic failures in Chicago and
Washington, D.C., and impending fears with respect to the New York City
distribution network, have raised an awareness for the need to upgrade the
water distribution infrastructure to an acceptable level of serviceability and
reliability. Meanwhile, new sectors are being continually added to existing
networks in order to accommodate growing communities and new industries.
The costs associated with the overhaul and expansion of these systems are
prohibitive and can easily escalate into the tens of millions of dollars.

As traditional methods for designing WDS are heuristic in nature and
incorporate high levels of redundancy, they tend to be overly conservative,
and therefore expensive. Savings of 10-30% have regularly been
accomplished through the use of computerized algorithms, but most
approaches have not adequately addressed the issue of replacement and that
of expanding an existing system, rather than designing one from scratch.
This paper presents a methodology to assess the existing condition of a given
WDS network, and to prescribe a cost effective construction plan for its
renovation and expansion, while retaining or designing for an acceptable level
of reliability and redundancy.

The most general WDS problem would require the modification and/or
expansion of an existing network to enable it to satisfy the varied anticipated
demand patterns for water at required pressure levels, even while experiencing
pipe breakages. If the network is designed with low energy heads and
undersized or rough pipes in a skeletal fashion, then flow and/or pressure
requirements will not be met during certain demand peaks or under various
pipe failure scenarios. On the other hand, if the energy sources and pipes are
overdesigned, or if there are too many redundant paths, then increased costs
may lead to an inefficient solution. Therefore, the problem at hand requires a
cost effective network design and replacement strategy that satisfies stated
hydraulic requirements under various likely demand patterns and failure
modes.

To clarify terminology in what follows, we will define sections of pipes to be
the short (20'-30') lengths of pipe that are used to physically construct a
pipeline. A segment is defined to be a length of pipe (perhaps many sections)
of constant properties of diameter, roughness and annualized cost. Links are
defined to be collections of segments (having different diameters) between

two nodes of the distribution system network, the lengths of which add up to
the required length of the pipeline between the nodes.

Traditionally, pipe-reliability-and-cost models and network-optimization
models are presented separately in the literature. For each type of model,
inputs that are required from the other type are assumed to be known. We
propose an approach for designing and generating an expansion plan for a
pipe network WDS that integrates these two models, treated as submodels, into
a single comprehensive design approach. This is the principal consideration
in the present study.

Pipe reliability and cost models provide reliability and annualized cost
information for new and existing pipe segments, along with replacement
recommendations based on a comparison of estimated costs of either
retaining an existing pipe, or replacing it with a suitable new pipe. These
determinations require the computation of optimal lifetimes for both existing
pipes and new pipes, which, in turn, require estimating individual pipe segment
reliabilities. Several analytical methods have been proposed by researchers
that address such issues (see the books by Walski, 1984, 1987).

Stacha (1978) presents a simple replacement model for water mains based on
the premise that such a replacement should occur when the annual cost of
maintenance exceeds that of replacing the main. In a similar vein, Shamir and
Howard (1979) develop a model for determining the optimal year for
individual pipe replacement by considering the present value dollar costs of
replacement and maintenance. In two separate case studies, Clark et al.
(1982) investigate replacement cost and frequency data, and develop
regression equations to determine several coefficients required in the Shamir
and Howard break-rate equations. A financial analysis is then performed in
accordance with Shamir and Howard to determine an optimal replacement
year for each segment of pipe. Kettler and Goulter (1985) also investigate
historical pipe breakage data from a number of cities and develop statistical
regression model relationships to study the effect of time and pipe diameters
on failure rates. Andreou et al. (1987a, b) propose a statistical methodology
for analyzing break records using nonparametric methods that obviate the
need to hypothesize distributions. The prescribed model can be used to (a)
determine the future cost of various replace/repair strategies, (b) determine an
optimal pipeline replacement time, and (c) estimate the network reliability.
Karaa et al. (1987) present a linear programming formulation for a water
distribution system resource allocation problem. Pipes having similar
maintenance histories are grouped into bundles, and the decision variables
represent the percentages of these bundles that are chosen for replacement.

Another linear programming approach is proposed by Li and Haimes
(1992a) for determining an optimal repair/replace time for a single pipe
segment. A semi-Markov process is defined whose states represent whether
the pipe is operating or under repair, and the number of failures that have
occurred. The steady state probabilities from the Markov model are used as
parameters in the linear optimization problem, which maximizes the
availability of the pipe subject to an expected cost constraint. Li and Haimes
(1992b) have also extended this model to analyze the tradeoff between system
level availability and cost.

As one contribution in this paper, we will design a model that composes
suitable existing statistical methodologies based on historical records of pipe
breaks to estimate individual pipe segment reliabilities in order to predict
future annualized costs. As a side result, failure prone, undercapacitated, or
high-cost existing pipe segments will be identified for removal or
replacement.

The second submodel referred to above, that is, the pipe network design
submodel, provides construction decisions for a fixed network configuration
under a number of demand scenarios, including the peak demand and various
firefighting demand patterns. (These demand patterns specify the flow rates
and hydraulic pressure levels required at each demand node.) This submodel
also includes some level of hydraulic redundancy in the network design which
ensures that demand can continue to be satisfied under various failure
scenarios.

In regard to this submodel, our focus in this paper will be to present a
formulation that ties in with the input provided by the pipe reliability and cost
submodel, and to suggest a methodology that would integrate these two
submodels within a holistic design framework. As far as solving this model
formulation is concerned, we refer the reader to the various promising
approaches described/surveyed in Alperovits and Shamir (1977), Collins et al.
(1978), Quindry et al. (1979, 1981), Rowell (1979), Fujiwara et al. (1987),
Kessler and Shamir (1989), Bhave (1985), Jeppson (1985), Walski (1984,
1985), Lansey and Mays (1985), Loubser and Gessler (1993), Morgan and
Goulter (1985), Mays (1989), Hobbs and Hepenstal (1989), Fujiwara and
Khang (1990), Loganathan et al. (1990), Fujiwara and Tung (1991), Sherali
and Smith (1993, 1995), and Eiger et al. (1994). Although these papers
provide a wealth of approaches for the network design problem, none of them
integrates the methods of pipe failure analysis, network reliability, and design
optimization. Moreover, except for the last two cited papers above, all other
approaches are only local optimization techniques. One motivation for this
paper is to identify a nonconvex optimization subproblem (the pipe network
design submodel) that can benefit from the development of global
optimization solution techniques, and to indicate how such a subproblem can
then be imbedded within an overall approach for designing practical water
distribution systems.

The remainder of this paper is organized as follows. Section 2 provides a
formulation of the single-stage pipe network design submodel and identifies
its input requirements and analytical characteristics. This is then embedded
within an integrated approach proposed in Section 3, that also delineates the
role played by a suitable reliability and cost submodel within this framework.
Section 4 provides the details of one such viable pipe reliability and cost
submodel, and Section 5 concludes the paper with suggestions for further
analytical work in this area.

2 Formulation of a Pipe Network Design Submodel


The pipe network design problem seeks to determine an optimal network
configuration that would be capable of satisfying various anticipated demand
patterns at a least cost, while maintaining an acceptable level of reliability.
The demand patterns specify flow rates and hydraulic pressure levels required
at each demand node. The reliability constraints typically include a degree of
redundancy in connectivity between source and demand nodes that would
ensure a continued minimal level of service under various failure scenarios.

The core driver in such a design approach is a single-stage network
optimization problem that determines a least cost pipe sizing (diameters and
lengths), and energy requirements (elevated source head levels), for a fixed
demand pattern (see Remark 1 below). The problem formulated is a
nonlinear program to minimize the cost of designing pipes and elevating
energy heads of sources subject to satisfying hydraulic flow and pressure
requirements at the different nodes of the distribution network. Pipe links are
limited to be composed of existing pipe segments in the network and/or new
pipe segments selected from commercially available pipe diameters. Existing
pipe links may be retained intact, or may be replaced either partially
(segment-wise) or completely. This problem turns out to be a hard
nonconvex optimization problem that has many local optima, different from a
globally minimum cost design, and has hence proven to be difficult to solve.
Researchers have studied this problem for three decades (see the references
cited in Section 1), proposing solution methodologies that yield better and
better approximate solutions for various test problems.

Below, we provide a recommended formulation for this single-stage
optimization problem that assumes a given set of input data and information,
and then proceed to describe how this might fit in an overall design approach.
Toward this end, consider the following notation.

m ≡ meters.

N = {1, ..., n}: set of nodes in the network.

S ⊂ N, D ≡ N − S: sets of source and demand nodes, respectively.

b_i: net water supply or demand rate (m³/hr) corresponding to node i ∈ N.
(By convention, this is taken as positive for source nodes and negative for
demand nodes.)

b_0: demand (m³/hr) corresponding to a special dummy demand node 0,
where b_0 = −∑_{i∈N} b_i is assumed to be nonpositive.

E_i: ground elevation (m) of node i.

H_i: decision variable representing the established head (m) at node i, above
the level E_i.

F_i: available fixed energy head (m) at source node i ∈ S.

H_{si}: decision variable representing the additional head (m) provided at
source node i ∈ S.

c_{si}: annualized cost per unit energy head ($/m/year) provided at source node
i ∈ S.

[H_{iL}, H_{iU}]: admissible interval for the energy head at demand node i ∈ D.

A: set of directed arcs or links (i, j) and (j, i) for each connected pair of nodes
i and j in the given network configuration, including source node slack arcs
(i, 0), i ∈ S.

Q_{ij}: decision variable representing the flow rate (m³/hr) on link (i, j) ∈ A.

Q_{ij}^U: specified, redundant, upper bound on Q_{ij}.

L_{ij} (= L_{ji}): pipe length (m) corresponding to link (i, j) (or (j, i)) ∈ A.

{d_k, k = 1, ..., K}: set of standard available pipe diameters (inches).

x̄_{ijk}: length (m) of existing segment of link (i, j) ∈ A having a diameter d_k
that is selected for continued use. (Note that x̄_{ijk} ≡ x̄_{jik}.)

x_{ijk}: decision variable representing the length (m) of a new segment of link
(i, j) ∈ A that is to be constructed, having a diameter d_k. (Note that
x_{ijk} = x_{jik}.)

c_{ijk}: annualized construction and maintenance cost per unit length ($/m/year)
of link (i, j) ∈ A that has a diameter of d_k.

C_{HWN}: assumed Hazen-Williams coefficient for new pipes, in order to
determine frictional head losses.

C_{HWE(i,j,k)}: assumed Hazen-Williams coefficient for the existing pipe
segment of length x̄_{ijk}, corresponding to link (i, j) ∈ A of diameter d_k.

ψ_{ij}(Q_{ij}, x̄_{ij}, x_{ij}): head loss (m) due to friction in link (i, j) based on rough
flow conditions (see Walski, 1984), where x̄_{ij} ≡ (x̄_{ijk}, k = 1, ..., K) and
x_{ij} ≡ (x_{ijk}, k = 1, ..., K).

The network optimization problem (NOP) can then be formulated as follows:

NOP: Minimize

$$\sum_{\substack{(i,j)\in A \\ i<j}} \sum_{k=1}^{K} c_{ijk}\, x_{ijk} \;+\; \sum_{i\in S} c_{si}\, H_{si}$$   (1a)

subject to

$$\sum_{j:\,(i,j)\in A} Q_{ij} \;-\; \sum_{j:\,(j,i)\in A} Q_{ji} \;+\; Q_{i0} \;=\; b_i \quad \text{for each } i \in S$$   (1b)

$$\sum_{j:\,(i,j)\in A} Q_{ij} \;-\; \sum_{j:\,(j,i)\in A} Q_{ji} \;=\; b_i \quad \text{for each } i \in D$$   (1c)

$$-\sum_{i\in S} Q_{i0} \;=\; b_0 \quad \text{for node } 0$$   (1d)

$$(H_i + E_i) - (H_j + E_j) \;\begin{cases} = \psi_{ij}(Q_{ij}, \bar{x}_{ij}, x_{ij}) & \text{if } Q_{ij} > 0 \\ \leq \psi_{ij}(Q_{ij}, \bar{x}_{ij}, x_{ij}) & \text{if } Q_{ij} = 0 \end{cases} \quad \text{for each } (i, j) \in A$$   (1e)

$$H_i = F_i + H_{si} \quad \text{for each } i \in S$$   (1f)

$$H_{iL} \leq H_i \leq H_{iU} \quad \text{for each } i \in D$$   (1g)

$$\sum_{k=1}^{K} (\bar{x}_{ijk} + x_{ijk}) = L_{ij} \quad \text{for each } (i, j) \in A,\; i < j$$   (1h)

$$0 \leq Q_{ij} \leq Q_{ij}^U \;\; \forall (i, j) \in A, \quad H_{si} \geq 0 \;\; \forall i \in S, \quad x_{ijk} \geq 0 \;\; \forall (i, j) \in A,\; i < j,\; k = 1, \ldots, K.$$   (1i)


The objective function, Equation (1a), in the above model denotes the total
annualized construction plus maintenance costs. The constraints (1b)-(1d)
enforce the conservation or continuity of flow at each node in the network.
The constraints (1e) represent the conservation of energy or head loss
constraints for each pipe in the direction of positive flow. Noting the form of
ψ_{ij}, observe that these constraints can be mathematically modeled as the
following pair of constraints for each (i, j) ∈ A:

$$(H_i + E_i) - (H_j + E_j) \;\leq\; \psi_{ij}(Q_{ij}, \bar{x}_{ij}, x_{ij})$$

$$Q_{ij}\left[(H_i + E_i) - (H_j + E_j)\right] \;\geq\; Q_{ij}\, \psi_{ij}(Q_{ij}, \bar{x}_{ij}, x_{ij}).$$

Furthermore, note that these constraints imply that H_i + E_i > H_j + E_j
whenever Q_{ij} > 0, and so, we will never have both Q_{ij} and Q_{ji} positive in
any feasible solution. The constraints (1f) represent the head available at each
source node i ∈ S, constraints (1g) represent bounds on the head levels
enforced at each demand node, and constraints (1h) establish the appropriate
constructed pipe link lengths. Finally, constraints (1i) represent logical
restrictions.
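To make the algebraic structure of NOP concrete, the following is a minimal sketch (ours, not the authors' implementation) of how (1a)-(1i) could be assembled with the Pyomo modelling library. All data containers (nodes, S, arcs, diams, b, ...) are hypothetical placeholders; a per-diameter cost c_pipe[d] stands in for c_{ijk} (cf. Table 1 in Section 4.3), single aged Hazen-Williams coefficients stand in for C_{HWN} and C_{HWE(i,j,k)}, and ψ_{ij} is taken to be additive over the segments of a link with the per-metre gradient of Equation (10) in Section 4.4.

```python
import pyomo.environ as pyo

def build_nop(nodes, S, arcs, diams, b, E, F, L, c_pipe, c_src,
              x_bar, HL, HU, C_new=97.0, C_old=79.0, QU=5000.0):
    """Sketch of NOP (1a)-(1i); arcs holds both (i, j) and (j, i), and
    x_bar[(i, j, d)] gives retained existing segment lengths (absent => 0)."""
    m = pyo.ConcreteModel()
    D = [i for i in nodes if i not in S]
    links = [(i, j) for (i, j) in arcs if i < j]
    m.Q = pyo.Var(arcs, bounds=(0, QU))                        # flows, (1i)
    m.Q0 = pyo.Var(S, within=pyo.NonNegativeReals)             # slack arcs (i, 0)
    m.H = pyo.Var(nodes)                                       # heads above E_i
    m.Hs = pyo.Var(S, within=pyo.NonNegativeReals)             # added source head
    m.x = pyo.Var(links, diams, within=pyo.NonNegativeReals)   # new segments

    def psi(i, j):  # head loss over all segments of link (i, j), cf. Eq. (10)
        a, c = min(i, j), max(i, j)
        g = lambda C, d: 1.52e4 * C**-1.852 * (2.54*d)**-4.87 * m.Q[i, j]**2
        return sum(x_bar.get((a, c, d), 0.0)*g(C_old, d) + m.x[a, c, d]*g(C_new, d)
                   for d in diams)

    m.cost = pyo.Objective(expr=sum(c_pipe[d]*m.x[i, j, d] for (i, j) in links
                                    for d in diams)
                                + sum(c_src[i]*m.Hs[i] for i in S))         # (1a)
    m.flow = pyo.Constraint(nodes, rule=lambda m, i:                        # (1b)-(1c)
        sum(m.Q[i, j] for (a, j) in arcs if a == i)
        - sum(m.Q[j, i] for (j, a) in arcs if a == i)
        + (m.Q0[i] if i in S else 0) == b[i])
    m.dummy = pyo.Constraint(expr=-sum(m.Q0[i] for i in S) == b[0])         # (1d)
    m.loss_ub = pyo.Constraint(arcs, rule=lambda m, i, j:                   # (1e)
        (m.H[i] + E[i]) - (m.H[j] + E[j]) <= psi(i, j))
    m.loss_lb = pyo.Constraint(arcs, rule=lambda m, i, j:
        m.Q[i, j]*((m.H[i] + E[i]) - (m.H[j] + E[j])) >= m.Q[i, j]*psi(i, j))
    m.src = pyo.Constraint(S, rule=lambda m, i: m.H[i] == F[i] + m.Hs[i])   # (1f)
    m.hbnd = pyo.Constraint(D, rule=lambda m, i:
        pyo.inequality(HL[i], m.H[i], HU[i]))                               # (1g)
    m.length = pyo.Constraint(links, rule=lambda m, i, j:                   # (1h)
        sum(x_bar.get((i, j, d), 0.0) + m.x[i, j, d] for d in diams) == L[i, j])
    return m
```

Note that a global, rather than local, NLP solver would still be required for such a model, since the constraints embedding ψ_{ij} are nonconvex.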

The foregoing model assumes several known and fixed entities. It assumes a
given network configuration that might be one of several alternative
configurations that need to be investigated. It assumes a given demand
pattern, while the performance of the design would need to be examined in
light of several demand scenarios, including peak demand and firefighting
requirements. It assumes that annualized cost coefficients c_{ijk} are available.
A pipe reliability and cost model that considers pipe breakages along with
capital and maintenance cost information is needed to compute these
coefficients. Furthermore, an analysis is needed to prescribe which existing
pipe segments should be retained (these show up as the x̄_{ijk} values above),
and which should be discarded or replaced. Since this analysis is based on
network performance, it requires an estimate of link flows, which are actually
part of the output of this pipe network optimization model. Based on the
decision to construct new pipe segments, or to retain existing pipe segments,
the associated age-dependent Hazen-Williams frictional head loss coefficients
could then be prescribed. These considerations are addressed within the
framework of an integrated approach that is developed in the following
section.

Remark 1. As mentioned above, the foregoing model prescribes a network
design based on a single representative demand pattern. In the following
section, we address the treatment of multiple alternative demand scenarios,
essentially by re-designing under the alternative scenarios and retaining the
largest pipe sizes that are prescribed over the various cases. A more accurate
alternative approach, although at the expense of creating a more complex
model, would be to incorporate these alternative scenarios within NOP itself,

using a "two-stage optimization with recourse strategy" (see Lasdon, 1970,


for example). Here the design variables x and Hs ' would be treated as the
"first-stage" set of variables, and the ensuing head and flow variables H and
Q, respectively, would be treated as the "second-stage recourse" variables,
being defined as separate sets of variables for each possible demand scenario,
where the latter is identified via the specification of the flow requirements b
and the pressure head requirements [HL ' H U]. The flow constraints (1 b) -
(ld), and the pressure head constraints (1e) - (1g) would then be replicated
within the model for each such scenario. Such a model might then be
amenable to some decomposition approach (see Lasdon, 1970 and Geoffrion,
1972, for example), in which the design variables are manipulated in a master
program, and the recourse variables are accordingly adjusted over
subproblems, one for each scenario. However, such an approach would need
to contend with the inherent nonconvexity of the problem, developing
suitable lower and upper bounding schemes, perhaps within the framework of
some branch-and-bound algorithm.
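As an illustration of this scenario-replication idea (a hypothetical sketch only, extending the build_nop sketch above rather than reproducing any formulation given in this paper), the first-stage and recourse variables might be declared as follows, with constraints (1b)-(1g) then written once per scenario:

```python
import pyomo.environ as pyo

def build_two_stage(scenarios, nodes, S, arcs, diams):
    m = pyo.ConcreteModel()
    links = [(i, j) for (i, j) in arcs if i < j]
    # First-stage design variables, shared by every demand scenario.
    m.x = pyo.Var(links, diams, within=pyo.NonNegativeReals)
    m.Hs = pyo.Var(S, within=pyo.NonNegativeReals)
    # Second-stage recourse variables: one copy of Q and H per scenario s,
    # to be constrained by (1b)-(1g) with scenario data b[s], HL[s], HU[s],
    # while (1h)-(1i) and the cost objective use the shared design variables.
    m.Q = pyo.Var(scenarios, arcs, within=pyo.NonNegativeReals)
    m.H = pyo.Var(scenarios, nodes)
    return m
```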

Example: To illustrate the complicated nature of Problem NOP itself, under a
single demand scenario, we briefly present an illustrative test example taken
from Alperovits and Shamir (1977). The network configuration along with
some relevant data are shown in Figure 1. All links are 1000 m in length, and
the Hazen-Williams coefficient is taken as 130. (Other data are specified in
Alperovits and Shamir, 1977. Also, see Section 4 below for a discussion on
incorporating reliability/pipe failure probability considerations in deriving
annualized cost data.)

The local optimization scheme proposed by Alperovits and Shamir (1977)
obtained a best solution for this problem having an objective function value
of $479,525. The first improvement on this solution (using the same standard
available set of pipe diameters) was reported in Sherali and Smith (1993),
where a solution having an objective function value of $441,674 was obtained
using a special decomposition (heuristic) procedure. Recently, using a new
global optimization procedure, Sherali and Smith (1995) have reportedly
obtained an optimum for this problem, having an objective function value of
$425,821. The literature reports on several such sequences of improved
solutions obtained for various test problems, for which determining a global
optimum remains as an open challenge. □

3 An Integrated Cost Analysis and Design Approach


Traditionally, pipe breakage and cost analysis models are run with the
assumption that if a pipe is considered for replacement, then the length,
diameter and type of a new pipe is known. However, these characteristics
depend on hydraulic requirements within the context of an overall network
design, which is determined via some optimization process. Likewise,
traditional optimization models presuppose the knowledge of annualized
capital and maintenance costs for the various sizes and types of new pipes.
But these inputs are not usually available unless a reliability analysis is
performed to determine when a pipe will be replaced, how much it will cost,
when maintenance events are expected to occur and how much they will cost,
and when the replaced pipe is itself to be replaced.

[Figure 1. Alperovits and Shamir's (1977) Test Problem Network
Configuration: the network layout, the node demands (e.g., b_2 = -100,
b_4 = -120, b_5 = -270 m³/hr), and the admissible head intervals
[H_{iL}, H_{iU}] for the demand nodes i ∈ D: 2 [180, 210], 3 [190, 210],
4 [185, 210], 5 [180, 210], 6 [195, 210], 7 [190, 210].]

To address this issue, we develop in Section 4 below a reliability and cost
analysis model that takes estimated link flows that are expected to occur as
input, and then for each segment of link (i, j) having a particular diameter d_k,
it provides the following information. If this is a new segment, it prescribes a
it provides the following information. If this is a new segment, it prescribes a
Hazen-Williams coefficient and computes an annualized construction cost
coefficient. If this is an existing segment, it recommends an optimal
replacement age based on predicted maintenance costs and based on a
suitable replacement option that provides at least a comparable hydraulic
performance. If the optimal replacement time is imminent (say, within five

years), relative to the horizon of the design problem or budgetary cycle, then
the existing pipe segment is recommended for replacement by a new segment.
(A suggested diameter for an initial design solution is also available.)
Otherwise, the existing pipe segment is retained, and an accompanying Hazen-
Williams coefficient is prescribed.

In what follows, we assume that based on the structure of the existing network,
anticipated demand changes, the practical feasibility of constructing pipe
connections, and an analysis of providing adequate connectivity redundancy
so that no demand node is cut off from its principal source(s) if any link in
the network fails (see Loganathan et al., 1990), some network configuration
(N, A) is available. (Note that this overall methodology can be applied to
various alternative configurations, perhaps composed based on designs
analyzed over previous runs.) Given this, the foregoing two submodels can be
integrated using the following stepwise procedure.

I. Preprocessing Cost Analysis. First, the reliability and cost submodel is run
for new pipe segments using all commercially available diameters in order to
determine their respective optimal lives and annualized costs.

II. Preprocessing Flow Analysis. Using the annualized costs for the new
pipes from Step I, the pipe network optimization submodel is run for a
representative demand pattern, assuming tentatively that the network is being
designed from "scratch," that is, with all existing pipes also being replaced by
new pipes. The resulting solution suggests a baseline flow for each link in the
network, and provides an estimate of the hydraulic properties (flow and
pressure gradients) that are desirable in each of the pipe links.

III. Pipe Reliability and Cost Submodel. For each existing pipe segment, the
pipe reliability and cost analysis submodel is run using the current estimated
flows to compute the annualized expected cost over a, say, 40-year time
horizon. This cost is determined using a suggested replacement diameter that
does not reduce the hydraulic gradient in the pipe, along with an
accompanying computed optimal year of replacement. If the replacement
falls within the current budgetary cycle (say, 5 years), or if the pipe segment
satisfies any other criterion for replacement, such as being undercapacitated
with respect to expected flow requirements, the existing section is identified
for replacement in the network design. Otherwise, the existing section is
retained.

IV. Pipe Network Design Submodel. The pipe network design optimization
submodel is now run again, using the annualized costs computed in Step I for
the new pipes, and using the retained existing pipe segments as determined at
Step III, to prescribe a set of pipe section diameters for the remaining newly
constructed segments, as well as source energy head levels. (The current
network design, including the recommended replacement diameters for
existing pipes that have been identified for replacement at Step III, can be
used as an advanced-start solution for this optimization run.) A
corresponding set of resulting hydraulic pressures and flow rates are hence
determined for each node and link, respectively. Note that the pipe flow rates
prescribed in this step will not necessarily be the ones estimated by the
previous run of the optimization submodel. If this difference is substantial (as
determined subjectively by the decision maker), then the process could be
made to transfer back to Step III, using the new flows as determined at the
current iteration of Step IV. This can be repeated until an acceptable design
is attained.

V. Design Adjustments. The available network can now be subjected to
alternate peak-load and firefighting demand scenarios. The maximum pipe
sizes required across these demand patterns could be retained for each link, in
order to ensure that hydraulic requirements are met under the specified
conditions. Let us refer to this network design as the first stage design. Now,
in order to ensure a reliable degree of service under link failure scenarios,
each pipe link in the network could be sequentially removed under a
representative load pattern (perhaps using a reduced pressure requirement
such as an upper 80 percentile level), and the network re-designed, hence
updating the pipe sizes for each such stage based on the solution obtained
from the previous stages. At each step, the largest pipe sizes from the current
and previous stages should be retained to ensure that the design is feasible to
all the conditions imposed thus far. In this process, links having the largest
diameter pipe segments would be selected first to be removed during the
initial stages in order to accelerate the impact on the network design, thereby
simplifying later iterations in the sequence. The result would be an
economical network design that satisfies all the requirements for pressure,
flow and reliability over the anticipated demand patterns and pipe failure
scenarios.

VI. Implementation. The actual implementation of the new and replacement
pipes can now be prioritized, depending on the hydraulic needs of the
evolving system, the costs involved, budgetary limitations, and management
objectives.
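The feedback loop over Steps II-IV can be summarized by the following driver sketch. The two submodels are passed in as callables, and the 5% stability tolerance and iteration limit are illustrative choices of ours rather than values prescribed above:

```python
def integrated_design(solve_network_design, reliability_analysis,
                      max_iters=10, tol=0.05):
    # Step II: tentative from-scratch design gives baseline link flows.
    design, flows = solve_network_design(retained={})
    for _ in range(max_iters):
        # Step III: retain/replace decisions from the current flow estimates.
        retained = reliability_analysis(flows)
        # Step IV: re-optimize the network with the retained segments fixed.
        design, new_flows = solve_network_design(retained=retained)
        change = max(abs(new_flows[a] - flows[a]) / max(abs(flows[a]), 1.0)
                     for a in flows)
        flows = new_flows
        if change <= tol:      # flows stable: accept design, go on to Step V
            break
    return design, flows
```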

4 Pipe Reliability and Cost Submodel


Water lines that are constructed and installed properly in low stress areas can
easily last over one hundred years. In some cities where case studies have
been performed, many pipes were found to be very reliable with little or no
maintenance history. However, several water mains and smaller pipes in use
today were manufactured with many defects, were poorly installed, have
already aged considerably, or are in high stress environments. These pipes can
require frequent and expensive maintenance, sometimes at an early age.
When a pipe can no longer be maintained at an annual cost less than that
under an alternative option, the pipe should be considered for replacement.
However, other more subjective factors need to be considered as well, such as
the availability of funds and labor, convenience to the public, and safety.

The pipe reliability and cost submodel formulated in this section, as discussed
above, can be used to predict the annualized costs of installing new segments
of pipes having various standard diameters, as well as to ascertain when and
using what option should each existing pipe segment in the water distribution
system network be replaced. This analysis is conducted using an optimal

economic life for each alternative over a 40-year time horizon. For this
purpose, in order to project pipe failure rates, Hazen-Williams coefficients,
and maintenance and replacement costs, we will appropriately compose a set
of existing statistical models from Shamir and Howard (1979), Quindry et al.
(1981), Walski (1984, 1987), and Kettler and Goulter (1985).

4.1 Pipe Failure Model

The basic pipe reliability/failure regression equation model proposed by
Shamir and Howard (1979) in analyzing historical data, and recommended by
Walski (1984, 1987) as a useful approximation for projecting future breaks, is
the hazard function given by

$$N(t) = N_0\, e^{b(t - t_0)}$$   (2)

where

N(t) = break rate in year t (breaks/year/km)

N_0 = initial break rate in year t_0 (breaks/year/km)

b = rate coefficient (year⁻¹)

t = time (year)

t_0 = base (installation) year.
Several quality, environmental, installation, and service factors influence the
break rate. Kettler and Goulter (1985) found that the pipe diameter exhibits a
strong linear tendency if the analysis is confined to a single city, with the
failure rate decreasing with an increase in diameter. For the city of
Philadelphia, the failure rate for pipes between 4" and 16" approximately
followed the relationship

$$N(D) = 0.3 - 0.01D$$   (3)

where

D = diameter of pipe (4 to 16 inches),

N(D) = break rate (breaks/year/km) as a function of D.


The authors hypothesize that the relationship becomes nonlinear for larger
pipe diameters, with decreasing slope as the diameter increases.

We will combine the basic equation for the break rate given by Equation (2),
with the break rate versus diameter relationship of Equation (3), and add an
extension for larger diameter pipes, to formulate the following break rate
model as a function of time and diameter:

$$N(t, D) = N_0(D)\, e^{0.1(t - t_0)} = \begin{cases} (0.3 - 0.01D)\, e^{0.1(t - t_0)} & \text{if } D \leq 16 \\ 0.14\, e^{-(D - 16)/14}\, e^{0.1(t - t_0)} & \text{if } D \geq 16 \end{cases} \quad \text{breaks/year/km}.$$   (4)

Noting (2), the break rate coefficient b has been taken to be 0.1 as
recommended by Walski (1987). Furthermore, the initial break rate N_0 has
been designated to be a function of the diameter D, following Equation (3)
for D ≤ 16, and decreasing with a decreasing slope as suggested by Kettler
and Goulter as the diameter increases beyond 16". Note that the coefficient
of the expression for D ≥ 16 has been determined to make the function
smooth at D = 16.

The expected time to n future failures can now be found by integrating
Equation (4) from t_{now} to t, after multiplying this by the length L of the pipe
segment, and equating this to n, where t_{now} is the current time (year). Hence,
solving the equation

$$\int_{t_{now}}^{t} L\, N_0(D)\, e^{0.1(\tau - t_0)}\, d\tau = n$$   (5)

for t, we obtain the time (years) to n future failures as

$$t = \begin{cases} t_0 + 10 \ln\!\left(e^{0.1(t_{now} - t_0)} + \dfrac{n}{10L(0.3 - 0.01D)}\right) & \text{if } D \leq 16 \\[2ex] t_0 + 10 \ln\!\left(e^{0.1(t_{now} - t_0)} + \dfrac{n\, e^{(D - 16)/14}}{1.4L}\right) & \text{if } D \geq 16. \end{cases}$$   (6)

As an example of determining expected break times, consider a 16" diameter
pipe segment of length 0.5 km. The expected time from installation to the
first failure is found by solving Equation (6) with t_0 = t_{now} = 0 and n = 1,
yielding t = 8.87 years. Now, given that the first failure occurred at 8.87
years, the expected time of failure for the second break is found by solving
(6) with t_0 = 0, t_{now} = 8.87 and n = 1 (or t_0 = t_{now} = 0 and n = 2) to give t
= 13.50 years, with a 4.63 year expected time between the first and second
breaks. Likewise, the expected time to failure for the third break is 16.65
years, with a 3.15 year expected time between the second and third breaks.

We must also consider the case when the pipe for which we are estimating
costs has been in place for several years and may have experienced previous
breaks. We will assume that the break rate is still modeled by Equation (4)
regardless of the previous history of breaks. For example, suppose that our
16" diameter pipe segment has had two previous breaks and is 12 years old
(as compared with the expected 13.5 years until the second break). We wish
to compute the expected time of the next (third) break. We solve Equation
(6) with t_0 = 0, t_{now} = 12 and n = 1 for the time of the first future (next)
failure, giving t = 15.58 years. Notice that the expected time of the third
break is earlier than before, since the second break occurred earlier than
expected. Also, the time between the second and third breaks of 3.58 years is
longer than the previously computed 3.15 years, since the failure rate is lower
during the earlier years.
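These break-rate calculations are easy to reproduce; the short sketch below implements Equations (4) and (6) and checks the worked examples above (the function names are ours):

```python
import math

def n0(D):
    """Initial break rate N0(D) in breaks/year/km, from Eq. (4)."""
    return 0.3 - 0.01*D if D <= 16 else 0.14*math.exp(-(D - 16)/14.0)

def time_to_n_breaks(n, D, L, t0=0.0, t_now=0.0):
    """Expected time (years) to n future failures: Eq. (6), i.e., Eq. (5)
    solved for t; the single expression covers both branches of Eq. (6)."""
    return t0 + 10.0*math.log(math.exp(0.1*(t_now - t0)) + n/(10.0*L*n0(D)))

# 16" pipe, 0.5 km long, reproducing the text's numbers:
print(round(time_to_n_breaks(1, 16, 0.5), 2))                   # 8.87
print(round(time_to_n_breaks(2, 16, 0.5), 2))                   # 13.5
print(round(time_to_n_breaks(1, 16, 0.5, t0=0, t_now=12), 2))   # 15.58
```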

4.2 Capital and Maintenance Cost Models

To model the capital cost of installing a new pipe, Quindry et al. (1981)
recommended an exponential function of the diameter D, and Walski (1984)
refined this model to include the dependence of the model coefficients on the
pipe construction material and on various ranges of pipe diameters. This
relationship is given as follows, where

C_c(D) is the capital cost per unit length ($/m) as a function of the pipe
diameter D (inches):

$$C_c(D) = \begin{cases} 14.1\, e^{0.170D} & \text{if } D \leq 8 \\ 3.00\, D^{1.40} & \text{if } 8 \leq D \leq 24 \\ 6.45\, D^{1.16} & \text{if } 24 \leq D \leq 48 \\ 0.656\, D^{1.75} & \text{if } D \geq 48. \end{cases}$$   (7)

We have slightly modified Walski's equations to include four diameter ranges,
versus the original three, in order to allow for a more accurate representation
of Walski's actual cost data as represented in his plot. The cost for small pipes
(D ≤ 8) is based on PVC pipes, the cost for medium pipes (8 ≤ D ≤ 48) is
based on ductile iron pipes, and the cost for large pipes (D ≥ 48) is based on
concrete pipes. The commercially available diameters are 4, 6, 8, 9, 10, 12,
14, 15, 16, 18, 21, 24, 30, 36, 42, 48, 54, 60, 66, and 72 inches.

Next, let us formulate a model for repair or maintenance costs. Small leaks
that are caused by a hole or a small crack can be fixed with a repair clamp
that wraps around the pipe, or for larger diameters, by welding a patch onto
the pipe. One model reported by Walski (1984) that was useful for
approximating the maintenance costs for repairing such a break in the
Buffalo District from a U.S. Army Corps of Engineers study is given by
600 D^{0.4} $/break, and includes allowances for crew cost, equipment, sleeve,
paving, and tools. Occasionally, larger longitudinal cracks or crushed pipes
might actually require the replacement of a physical section of pipe. We will

assume that such cracks requiring a replacement of the pipe section occur
f_{sec} ∈ (0, 1) fraction of the time, and that sections are L_{sec} = 10 m long
(these are variable parameters in the model). Hence, in this case, an
additional cost of L_{sec} C_c(D) would be incurred, where C_c(D) is given by
(7). Thus, the average or expected maintenance repair cost for a break is
given by

$$600\, D^{0.4} + f_{sec}\, L_{sec}\, C_c(D) \quad (\$/\text{break}).$$   (8)

For example, using our example of a 16" pipe, the estimated (noninflated)
cost of repairing a single break is given by
600(16)^{0.4} + 0.1(10)(3.00)(16)^{1.40} = $1964, assuming f_{sec} = 0.1.
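These two cost equations translate directly into code; the sketch below (function names ours) reproduces the $1964 repair estimate for the 16" example:

```python
import math

def capital_cost(D):
    """Capital cost per metre of new pipe ($/m), Eq. (7); D in inches."""
    if D <= 8:
        return 14.1*math.exp(0.170*D)
    if D <= 24:
        return 3.00*D**1.40
    if D <= 48:
        return 6.45*D**1.16
    return 0.656*D**1.75

def repair_cost(D, f_sec=0.1, L_sec=10.0):
    """Expected repair cost per break ($), Eq. (8): clamp/patch repair plus
    the occasional replacement of an L_sec-metre section of pipe."""
    return 600.0*D**0.4 + f_sec*L_sec*capital_cost(D)

print(round(repair_cost(16)))   # 1964
```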
4.3 Annualized Costs for New Pipes

The annualized cost for a section of new pipe can now be computed for each
standard diameter based on its optimal lifetime with respect to capital and
maintenance costs. For each diameter of pipe, various candidate lifetimes are
considered, coinciding with the expected failure times given by Equation (6),
and each candidate lifetime is analyzed by computing the capital plus
maintenance costs using Equations (7) and (8), based on the assumed section
length and the expected number of breaks corresponding to the given
lifetime, and then annualizing all costs incurred over that lifetime. For
computing annualized costs, we use inflation-free real prices and real interest
rates (see Grant et al., 1987). As the lifetime is increased, the annualized costs
first decrease until maintenance costs begin to take over, and then the
annualized costs start to increase. We take this least-cost time to be the
optimal economic life of the pipe based on financial considerations. For
example, assuming a 4% inflation rate and an 8% market interest rate, the real
interest rate can be computed to be (1.08/1.04) - 1, or 3.85% (see Grant et
al., 1987). Using this rate and considering a 1000 m length section of pipe
for each of the twenty diameters considered to be commercially available, the
corresponding optimal lives and annualized costs can be computed, and are
listed in Table 1. These optimal lifetime calculations are slightly dependent
on length since we take into account the occasional section replacements
required for longitudinal cracks and crushed pipes in Equation (8).
(Otherwise, the length would simply have been a direct proportionality factor
in the total cost expression.) Since the dependence on length is slight, we
could assume that the annualized cost per meter as computed in Table 1 is
sufficiently representative for general use as the required coefficients c_{ijk} in
the pipe network design submodel.

Table 1. Optimal Life Characteristics for New 1000 m Pipe Segments

Diameter    Annualized Cost    Optimal Life    Expected Breaks
(inches)    ($/m)              (years)         (per km)
4           2.77               22.9            23
6           3.59               24.7            26
8           4.61               26.8            30
9           5.22               27.9            32
10          5.82               28.9            34
12          7.03               31.2            39
14          8.23               33.1            42
15          8.83               34.1            44
16          9.42               35.2            46
18          10.60              37.2            49
21          12.40              40.3            54
24          14.24              43.1            58
30          17.21              48.1            63
36          20.12              53.0            67
42          23.04              57.9            71
48          25.99              62.7            75
54          30.92              68.0            83
60          36.30              73.2            91
66          42.07              78.2            98
72          48.25              83.3            106
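One plausible coding of this lifetime search is sketched below, reusing n0, time_to_n_breaks, capital_cost and repair_cost from the earlier sketches. The exact cash-flow conventions behind Table 1 are not spelled out above, so this reading reproduces the table only approximately (e.g., it recovers the optimal lives closely and the annualized costs to within a few percent):

```python
def annualized_cost(D, life, L_km=1.0, r=0.0385):
    """Present value of capital plus expected repairs over `life` years,
    annualized with the capital-recovery factor A/P = r/(1-(1+r)^-n);
    returned in $/m/year, using the 3.85% real interest rate."""
    pv = capital_cost(D)*1000.0*L_km            # install 1000*L_km metres
    n = 1
    while True:
        t = time_to_n_breaks(n, D, L_km)
        if t > life:
            break
        pv += repair_cost(D)/(1.0 + r)**t       # discounted expected repairs
        n += 1
    return pv*r/(1.0 - (1.0 + r)**(-life))/(1000.0*L_km)

def optimal_life(D, L_km=1.0):
    # Candidate lifetimes coincide with the expected failure times of Eq. (6).
    cands = (time_to_n_breaks(n, D, L_km) for n in range(1, 150))
    return min((annualized_cost(D, t, L_km), round(t, 1)) for t in cands)
```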

4.4 Replacement Analysis for Existing Pipe Segments


As discussed in the foregoing two sections, for each existing pipe segment, we
need to determine whether to retain this segment or to replace it via the
optimization pipe network design model. There are (at least) four important
reasons for considering the replacement of a pipe segment. First, a segment
may be considered for replacement when the anticipated annualized costs of
continuing to maintain it exceed the capital and future maintenance costs for
a newly replaced pipe that has a comparable hydraulic performance. Second,
there might be a need to increase the hydraulic capacity of the pipe based on
an expanded network and/or increased demand pattern. Third, a utility may
choose to replace pipe segments when the failure rate reaches a certain
threshold level, regardless of economic consequences. Fourth, a segment may
be removed based on the number of breaks in its life, regardless of time. This
may simply be a conservative management practice that some utilities might
use. In this section, we will focus on the first of these considerations. The
other three considerations can be subjectively exercised as an option by the
designer. (Note that for the second of these considerations, the expected
hydraulic performance as suggested by the Step II output of the integrated
approach of Section 3, can be used as an indicator to decide whether this
existing segment should be replaced, or examined further for continued use.)

For each existing segment that remains as a candidate to be considered for
continued use, we will compute an optimal lifetime, based on related capital
plus maintenance costs with respect to a suitable replacement option. If this
optimal lifetime is too short relative to the design horizon or budgetary cycle,
then the existing segment can be recommended for immediate replacement
via the optimization design submodel. Otherwise, it can be retained within this
submodel for further design analysis.

Now, in order to determine the diameter of pipe to be considered for
replacement, we choose the smallest pipe in the list of commercially available
diameters that has less hydraulic gradient with respect to the baseline flow
than the existing pipe would have with the same flow. In order to compute
the hydraulic gradient, however, we need to estimate the Hazen-Williams
coefficients C_{HW} for the existing and new pipes. Walski (1984) determined
that although several factors such as pipe material, diameter, flow rate, and
water composition affect C_{HW}, this coefficient decreases approximately
linearly with time (the roughness of the pipe being inversely proportional to
C_{HW}), with a change in slope at around thirty years. To simplify the model,
we will use a typical corrosion curve under severe condition as given by
Walski (1984), and derive a general formula for C_{HW} as follows:

$$C_{HW} = \begin{cases} 130 - 1.67t & \text{for } 0 \leq t \leq 30 \text{ years} \\ 80 - 0.286(t - 30) & \text{for } 30 \leq t \leq 100 \text{ years}. \end{cases}$$   (9)

Assuming a 40-year financial time-horizon, we will use this relationship to
"age" all pipes under new or continue-to-maintain strategies for a period of 20
years in order to derive the coefficients C_{HWN} or C_{HWE(i,j,k)} to be used in
the hydraulic analysis (see the form of ψ_{ij} used in Equation (1e)). This
would ensure that the WDS will continue to operate as designed at least
halfway through the 40-year horizon. For example, for our illustrative 16"
pipe that is twelve years old, we would age this pipe to 32 years for hydraulic
analysis under a continue-to-maintain plan, using C_{HW} = 80 − 0.286(32 − 30)
= 79 for this pipe. Likewise, a new replacement pipe would not be analyzed
with its initial C_{HW} = 130, but with an aged C_{HW} = 130 − 1.67(20) = 97.
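A one-line coding of this ageing rule (Eq. (9)) confirms the two coefficients used in the example:

```python
def c_hw(age):
    """Hazen-Williams coefficient as a function of pipe age (years), Eq. (9)."""
    return 130.0 - 1.67*age if age <= 30 else 80.0 - 0.286*(age - 30.0)

print(round(c_hw(12 + 20)))   # 79: CHWE for the 12-year-old pipe aged 20 years
print(round(c_hw(0 + 20)))    # 97: CHWN for a new pipe aged 20 years
```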

We are now prepared to determine the diameter of pipe to be considered for
replacement. Suppose that for our example, we determine that the new
baseline flow for the link under consideration is 600 m³/hour. Note from
above that the Hazen-Williams coefficients for the existing and new pipes are
taken as 79 and 97, respectively. The hydraulic gradient due to friction in the
existing D = 16" pipe with a flow of Q = 600 m³/hour and C_{HW} = 79 as
determined by the formula for ψ_{ij} of Equation (1e) is given by

$$\psi / \text{length} = (1.52) 10^4\, Q^2 (C_{HW})^{-1.852} D^{-4.87} = 0.0246 \;\; \text{(meters head loss/meter of pipe)}.$$   (10)

The smallest commercially available diameter of a new pipe that has a lower
hydraulic gradient than 0.0246 under the baseline flow of Q = 600 m³/hour
and with C_{HW} = 97 is D = 15". For this diameter, the hydraulic gradient is
computed via (10) as 0.023 (meters head loss/meter of pipe).
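The diameter selection can then be automated as below. The units of D in Equation (10) are not stated explicitly; taking D in centimetres (converting from inches) reproduces the printed gradients of 0.0246 and 0.023 to within rounding, so that conversion is our inference:

```python
DIAMETERS = [4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 21, 24,
             30, 36, 42, 48, 54, 60, 66, 72]    # inches, from Section 4.2

def gradient(Q, C_hw, D_in):
    """Frictional head loss per metre of pipe, Eq. (10); Q in m3/hr,
    D converted from inches to cm (our assumed unit for the constant)."""
    return 1.52e4 * Q**2 * C_hw**-1.852 * (2.54*D_in)**-4.87

g_old = gradient(600, 79, 16)                       # ~0.0246 m/m, existing pipe
D_new = min(d for d in DIAMETERS if gradient(600, 97, d) < g_old)
print(D_new, round(gradient(600, 97, D_new), 3))    # 15  0.023
```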

Similar to the analysis in Section 4.3, we now determine an optimal
replacement year for the given section of existing pipe under consideration.
For this purpose, we examine the various candidate replacement times as the
times corresponding to various numbers of expected failures as obtained by
applying Equation (6) to the existing pipe section. For each such candidate
lifetime, we compute the total annualized capital plus maintenance costs using
Equations (7) and (8) over the (40 year) financial time-horizon, of
maintaining the existing pipe up to the candidate lifetime, and then investing
in the new pipe that has a lower hydraulic gradient as determined above, and
maintaining this pipe for the remainder of the time horizon, again using
Equation (6) to compute the failure times for the new pipe section. The
candidate lifetime that yields the least total present value cost is then
prescribed as the time of replacement. If this recommended replacement time
occurs earlier than some budgetary cycle duration (say, five years), then the
existing pipe section is scheduled for replacement via the optimization model
in the current cycle itself. (Note that the optimization process might then
prescribe some other replacement diameter pipe than that given by the above
analysis, based on the overall network design consideration.)

To illustrate, consider our example of a 16" pipe segment that is 500 m long
and currently 12 years old, for which we have prescribed a replacement
option given by a 15" new pipe as determined above. We will detail the
analysis for replacement in the candidate year corresponding to the fifteenth
break, given by Year 20 via Equation (6).

The expected maintenance times for the existing pipe are found by solving
Equation (6) using t_0 = −12, t_{now} = 0, and n = 1, 2, 3, 4, ..., yielding 3.58,
6.21, 8.29, 10.0, ... years, respectively, each costing $1964 as computed before
via Equation (8). The annualized capital plus maintenance costs from Table 1
for the new 15" pipe are $8.83/year/m, or $4415/year for the 500 m segment.
If we add up the discounted (present value) expected maintenance costs for
the existing pipe (including that for the break in Year 20), and discount the
annualized costs of $4415/year for the new pipe (occurring from Year 21
until the end of the 40 year time horizon), we find that the present value of
the option to replace during Year 20 at the 15th break is $45,270.
Performing this analysis for each such candidate lifetime, we determine that
the optimal least cost year of replacement is 23 years, with a present value cost
of $44,696. For this option that replaces the pipe in Year 23, there are 22
expected failures and corresponding maintenance actions for the existing 16"
pipe segment before its replacement in Year 23, followed by 2 expected
failures and corresponding maintenance actions for the new 15" pipe between
Year 23 and Year 40. Since the optimal replacement time does not occur
during the next five years, we would recommend the continued use of this
pipe segment, unless the subsequent network design phase determines that it
is hydraulically unacceptable.
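The replacement-year search just described can be sketched as follows, reusing time_to_n_breaks and repair_cost from the Section 4.1/4.2 sketches and the $8.83/m/year figure from Table 1. The precise discounting conventions are not detailed above, so this simplified reading brackets, rather than exactly reproduces, the reported $45,270 and $44,696 present values:

```python
def pv_replace_at(T, D_old=16, new_cost_per_m=8.83, L_m=500.0,
                  age=12.0, horizon=40.0, r=0.0385):
    """PV of maintaining the existing pipe up to year T, then paying the
    new pipe's annualized cost for the remainder of the horizon."""
    L_km = L_m/1000.0
    pv, n = 0.0, 1
    while True:
        t = time_to_n_breaks(n, D_old, L_km, t0=-age, t_now=0.0)
        if t > T:
            break
        pv += repair_cost(D_old)/(1.0 + r)**t       # breaks at 3.58, 6.21, ...
        n += 1
    annual_new = new_cost_per_m*L_m                 # $4415/year for 500 m of 15"
    pv += sum(annual_new/(1.0 + r)**y
              for y in range(int(T) + 1, int(horizon) + 1))
    return pv

candidates = [time_to_n_breaks(n, 16, 0.5, t0=-12.0) for n in range(1, 35)]
best_T = min(candidates, key=pv_replace_at)
print(round(best_T, 1), round(pv_replace_at(best_T)))
# the paper reports an optimum near Year 23 with PV $44,696
```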

5 Summary and Conclusions


In this paper, we have presented a holistic approach for designing a water
distribution system by integrating two principal submodels, namely, a pipe
reliability and cost submodel, and an optimization network design submodel.
The pipe reliability and cost submodel incorporates various models for pipe
failure, capital replacement and repair or maintenance costs, and for the
Hazen-Williams frictional head loss coefficients for new and existing pipe
segments. These models are in turn used for determining optimal lifetimes of
new pipe sections along with the corresponding annualized costs per unit
length for different standard, commercially available pipe diameters.
Furthermore, this analysis prescribes an optimal replacement time for each
existing pipe segment, along with a recommendation for a new standard pipe
diameter that can be used as an initial value in the subsequent optimization
submodel, in case the analysis identifies this existing segment for replacement.
However, this submodel requires estimated baseline flows as input, which it
obtains from the network design submodel. The formulation of the latter
submodel, in turn, is constructed using the outputs of the reliability and cost
submodel, and the solution of this model yields a design of the network using
commercially available pipe diameter sections, along with accompanying
source head elevation decisions and ensuing flow rates and pressure heads at
various demand nodes. This particular single stage, nonconvex, optimization
model that is formulated to interact with the foregoing reliability and cost
model is hence suggested to be studied from the viewpoint of developing
effective (global optimization) algorithmic solution procedures. The more
complex overall design process can then be conducted via the integrated
approach proposed herein, where the two foregoing submodels are
coordinated in a feedback loop until a stable design results, perhaps for
different alternate network configurations. Further adjustments in these
designs can be effected via the consideration of the overall network reliability
through the suggested redundancy assessment, and while evaluating various
alternate demand pattern scenarios. Using such a process, decision makers
can design cost effective, functional, and practical water distribution systems.

6 References
Alperovits, E. and Shamir, U. "Design of Optimal Water Distribution Systems," Water
Resources Research, Vol. 13, December 1977, pp. 885-900.

Andreou, S. A., Marks, D. H. and Clark, R. M. "A New Methodology for Modeling
Break Failure Patterns in Deteriorating Water Distribution Systems: Theory," Advances
in Water Resources, Vol. 10, March 1987a, pp. 2-10.

Andreou, S. A., Marks, D. H. and Clark, R. M. "A New Methodology for Modeling
Break Failure Patterns in Deteriorating Water Distribution Systems: Applications,"
Advances in Water Resources, Vol. 10, March 1987b, pp. 11-21.

Bhave, P. R. "Optimal Expansion of Water Distribution Systems," Journal of the
Environmental Engineering Division, ASCE, Vol. 111, No. EE2, 1985, pp. 177-197.

Clark, R. M., Stafford, C. L. and Goodrich, J. A. "Water Distribution Systems: A
Spatial and Cost Evaluation," Journal of the Water Resources Planning and Management
Division, ASCE, Vol. 108, No. WR3, October 1982, pp. 243-256.

Collins, M., Cooper, L., Helgason, R., Kennington, J. and LeBlanc, L. "Solving the
Pipe Network Analysis Problem using Optimization Techniques," Management Science,
Vol. 24, No. 7, March 1978, pp. 747-760.

Eiger, G., Shamir, U. and Ben-Tal, A. "Optimal Design of Water Distribution
Networks," Water Resources Research, Vol. 30, No. 9, 1994, pp. 2637-2646.

Fujiwara, O., Jenchaimahakoon, B. and Edirisinghe, N. C. P. "A Modified Linear
Programming Gradient Method for Optimal Design of Looped Water Distribution
Networks," Water Resources Research, Vol. 23, No. 6, June 1987, pp. 977-982.

Fujiwara, O. and Khang, D. B. "A Two-Phase Decomposition Method for Optimal
Design of Looped Water Distribution Networks," Water Resources Research, Vol. 26,
No. 4, April 1990, pp. 539-549.

Fujiwara, O. and Tung, H. D. "Reliability Improvement for Water Distribution Networks
Through Increasing Pipe Size," Water Resources Research, Vol. 27, No. 7, July 1991,
pp. 1395-1402.

Geoffrion, A. M. "Generalized Benders Decomposition," Journal of Optimization Theory
and Applications, 10, 1972, pp. 237-260.

Grant, E. L., Ireson, W. G. and Leavenworth, R. S. Principles of Engineering
Economy, seventh edition, John Wiley and Sons, New York, 1982.

Hobbs, B. F. and Hepenstal, A. "Is Optimization Optimistically Biased?," Water
Resources Research, Vol. 25, No. 2, February 1989, pp. 152-160.

Jeppson, R. W. "Practical Optimization of Looped Water Systems," in Computer
Applications in Water Resources, ed. H. C. Torno, 1985, pp. 723-731.

Karaa, F. A., Marks, D. H. and Clark, R. M. "Budgeting of Water Distribution
Improvement Projects," Journal of Water Resources Planning and Management, ASCE,
Vol. 113, No. 3, May 1987, pp. 378-391.

Kessler, A. and Shamir, U. "Analysis of the Linear Programming Gradient Method for
Optimal Design of Water Supply Networks," Water Resources Research, Vol. 25, No. 7,
July 1989, pp. 1469-1480.

Kettler, A. J. and Goulter, I. C. "An Analysis of Pipe Breakage in Urban Water
Distribution Networks," Canadian Journal of Civil Engineering, Vol. 12, 1985, pp.
286-293.

Lansey, K. and Mays, L. "A Methodology for Optimal Network Design," Computer
Applications in Water Resources, ed. H. C. Torno, 1985, pp. 732-738.
Lasdon, L. S. Optimization Theory for Large Systems, Macmillan, New York, NY,
1970.

Li, D. and Haimes, Y. Y. "Optimal Maintenance-Related Decision Making for
Deteriorating Water Distribution Systems, 1. Semi-Markovian Model for a Water Main,"
Water Resources Research, Vol. 28, No. 4, April 1992a, pp. 1053-1061.

Li, D. and Haimes, Y. Y. "Optimal Maintenance-Related Decision Making for
Deteriorating Water Distribution Systems, 2. Multilevel Decomposition Approach,"
Water Resources Research, Vol. 28, No. 4, April 1992b, pp. 1063-1070.

Loganathan, G. V., Sherali, H. D. and Shah, M. P. "A Two-Phase Network Design
Heuristic for Minimum Cost Water Distribution System Under a Reliability Constraint,"
Engineering Optimization, Vol. 15(4), 1990, pp. 311-336.

Loubser, B. F. and Gessler, J. M. "Computer Aided Optimization of Water Distribution
Networks," Integrated Computer Applications in Water Supply, Vol. 1, Research Studies
Press Ltd., Somerset, England, 1993, pp. 103-120.

Mays, L. W. Reliability Analysis of Water Distribution Systems, American Society of
Civil Engineers, New York, NY, 1989.

Morgan, D. R. and Goulter, I. C. "Optimal Urban Water Distribution Design," Water
Resources Research, Vol. 21, No. 5, May 1985, pp. 642-652.

Quindry, G., Brill, E. D. and Liebman, J. C. "Optimization of Looped Water
Distribution Systems," Journal of the Environmental Engineering Division, ASCE, Vol.
107, No. EE4, August 1981, pp. 665-679.

Quindry, G., Brill, E. D., Liebman, J. C. and Robinson, A. "Comments on 'Design of
Optimal Water Distribution Systems' by E. Alperovits and U. Shamir," Water Resources
Research, Vol. 15, No. 6, December 1979, pp. 1651-1654.

Rowell, W. F. "A Methodology for Optimal Design of Water Distribution Systems,"
Ph.D. thesis, University of Texas at Austin, 1979.

Shamir, U. and Howard, C. D. D. "An Analytic Approach to Scheduling Pipe
Replacement," Journal of the American Water Works Association, May 1979, pp.
248-258.

Sherali, H. D. and Smith, E. P. "An Optimal Replacement-Design Model for a Reliable
Water Distribution Network System," Integrated Computer Applications in Water
Supply, Vol. 1, Research Studies Press Ltd., Somerset, England, 1993, pp. 61-75.

Sherali, H. D. and Smith, E. P. "A Global Optimization Approach to a Water
Distribution Network Design Problem," Report #HDS95-6, Department of Industrial and
Systems Engineering, Virginia Polytechnic Institute and State University, Blacksburg,
VA 24061-0118, 1995.

Stacha, J. H. "Criteria for Pipeline Replacement," Journal of the American Water Works
Association, May 1978, pp. 256-258.
12
GLOBAL OPTIMISATION OF
GENERAL PROCESS MODELS
Edward M.B. Smith and Constantinos C. Pantelides

Centre for Process Systems Engineering
Imperial College of Science, Technology and Medicine
London SW7 2BY, United Kingdom

ABSTRACT
This paper is concerned with the application of deterministic methods for global optimisation
to general process models of the type used routinely for other applications. A major difficulty
in this context is that the methods currently available are applicable only to rather restricted
classes of problems. We therefore present a symbolic manipulation algorithm for the automatic
reformulation of an algebraic constraint of arbitrary complexity involving the five basic
arithmetic operations of addition, subtraction, multiplication, division and exponentiation, as
well as any univariate function that is either convex or concave over the entire domain of
its argument. This class includes practically every constraint encountered in commonly used
process models.
The reformulation converts the original nonlinear constraint into a set of linear constraints and
a set of nonlinear constraints. Each of the latter involves a single nonlinear term of simple form
that can be handled using a spatial branch and bound algorithm.
The symbolic reformulation and spatial branch and bound algorithms have been implemented
within the gPROMS process modelling environment. An example illustrating its application is
presented.

1 INTRODUCTION
Many important process design and operation tasks may be expressed mathematically
as nonlinear programming problems (NLP) of the form:
$$\min_{x} \; \varphi(x)$$

subject to

$$g(x) = 0, \qquad h(x) \leq 0$$

and

$$x^L \leq x \leq x^U,$$

where x is a vector of continuous variables. Problems which may be posed in this
manner include reactor network optimisation [12], separation network optimisation
[1], and plant wide design [28, 29].
Most of the established numerical methods for the solution of this problem are
concerned with the determination of a local minimum. Many process engineering
applications, however, lead to nonconvex NLPs that may possess multiple local
minima. Local optimisation methods applied to such problems do not provide any
guarantee of determining a global optimum -- in fact, they may fail to determine even
a local one even if the original problem is feasible.
The problem of establishing global optima for nonconvex NLPs has been receiving
increasing attention in both the mathematical and the process engineering literature
in recent years. The methods currently available for the solution of such problems
can be classified as either stochastic (see the review by Schoen [24]) or deterministic.
The latter, which are the ones of interest to the current paper, include a variety of
techniques such as interval analysis [30], primal-dual problem decomposition [9, 13],
cutting plane algorithms [31], and branch and bound techniques [2, 14, 20, 23, 26].
One common feature of all currently available deterministic global optimisation
techniques is that they are applicable only to relatively restricted problem classes.
Some engineering problems already naturally fall within these classes, and others
can be reformulated manually to do so, often using a combination of mathematical
analysis and physical intuition. However, these restrictions pose a major obstacle to
the wider use of global optimisation in engineering optimisation.
This paper is concerned with the application of global optimisation to general
process models, such as those which can be defined routinely within general-purpose
process modelling packages. The latter provide high-level symbolic languages for the
description of arbitrarily complex process models under both steady-state and transient
conditions. One important characteristic of this type of software is that it clearly and
effectively separates the task of model definition (which is the responsibility of the user)
from that of the mathematical solution (which is largely undertaken automatically).
This leads to the concept of multipurpose process modelling environments in which
the same model is used for a wide variety of applications, ranging from steady-state
and dynamic simulation to plant data reconciliation and control system design [19].
The steady-state process models of interest to this paper are typically described in
terms of systems of nonlinear algebraic equations. However, taking account of spatial
variations of properties within the process may also lead to sets of partial differential
and algebraic equations in one or more space dimensions. Current process modelling
tools [16] support the direct modelling of such distributed systems, automatically

applying spatial discretisation techniques to reduce the corresponding models to sets


of nonlinear algebraic equations.
A major difficulty with introducing global optimisation capabilities within general-
purpose process modelling tools is the fact that most process models do not naturally
belong to the classes of problem that are tractable using the currently available methods.
On the other hand, it would be unreasonable to expect the users to reformulate their
problems to a "suitable" form as this would seem to negate many of the advantages
of using this type of software.
This paper presents a symbolic manipulation algorithm that can automatically refor-
mulate a very wide range of problems to a form that can be solved using branch and
bound global optimisation techniques. The problem objective function and constraints
may be of arbitrary complexity, involving any combination of binary arithmetic
operations (addition, subtraction, multiplication, division and exponentiation) and
univariate functions that are either everywhere convex or everywhere concave over
the entire domain of their argument. In fact, this class of problems includes most
process engineering models of practical interest.
The next section provides a detailed description of the above symbolic manipulation
algorithm. Section 3 is concerned with the use of the results of the symbolic
reformulation for the generation of a convex relaxation of the original problem while
Section 4 discusses the branch and bound algorithm used for the global optimisation.
Section 5 provides an outline of the implementation of these techniques within the
gPROMS process modelling package. An example illustrating the algorithm and its implementation is presented in Section 6. We conclude with some general remarks
on the work presented in this paper.

2 GENERAL CONSTRAINT REFORMULATION


The main prerequisite for the application of branch and bound global optimisation
methods (see Section 4) is a convex relaxation of the original problem. This is of the
form

$$\min_x\ \Phi^L(x)$$

subject to

$$g^L(x) \le 0 \le g^U(x), \qquad h^L(x) \le 0$$

and

$$x^l \le x \le x^u$$

where $\Phi^L(x)$ is a convex underestimator of the objective function $\Phi(x)$, $g^L(x)$ and $h^L(x)$ are convex underestimators of the functions $g(x)$ and $h(x)$ respectively, and $g^U(x)$ is a concave overestimator of $g(x)$.
Convex relaxations have already been proposed for many special algebraic forms, such as bilinear ($xy$) and linear fractional ($x/y$) terms [15, 21]. However, in general engineering optimisation applications, we potentially have to deal with much more general expressions. Consider, for instance, the nonlinear expression

$$\frac{x\ln(y) + z}{z + xy}$$
where x, y and z are variables. This clearly does not correspond to any one of the
simple special forms for which convex bounds are available. However, by inspection,
we can produce the following reformulation:

$$\begin{aligned}
w_1 &\equiv \ln(y) \\
w_2 &\equiv x\,w_1 \\
w_3 &= w_2 + z \quad [\,= x\ln(y) + z\,] \\
w_4 &\equiv xy \\
w_5 &= z + w_4 \quad [\,= z + xy\,] \\
w_6 &\equiv w_3/w_5
\end{aligned}$$
We note that the original constraint has been replaced by two linear and four nonlinear
constraints. Each of the latter involves a single term of special form that can, in fact,
be bounded using results already reported in the literature. Some extra variables (w) have also been introduced in this process, with $w_6$ being equivalent to the original nonlinear expression.
The above reformulation was easily achieved by inspection. In this section, we seek
to establish a general algorithm that employs symbolic manipulation to carry out this
type of reformulation for expressions of arbitrary complexity. First, we review the
binary tree representation of algebraic expressions on which the symbolic algorithms
operate. We then describe in detail the symbolic reformulation algorithm itself.

2.1 Binary Tree Representation of Algebraic Expressions


The application of symbolic manipulation algorithms is greatly facilitated by the adop-
tion of the binary tree representation (see, for instance, [11]) for algebraic expressions,
instead of the standard "infix" representation used in ordinary mathematical notation.
The leaf nodes in the binary tree correspond to either constants or variables. All other
nodes correspond to binary expressions of the form

left ⊗ right.

Figure 1 Binary Tree Representation of Algebraic Expressions

These are characterised by a binary operator ⊗, representing addition, subtraction, multiplication, division or exponentiation (raising to a power), and have exactly two children denoted as left and right. Both left and right are themselves
binary trees, and consequently the binary tree representation is recursive, being able
to describe expressions of arbitrary complexity in a completely unambiguous fashion.
In reality, some of the operators occurring in algebraic expressions are unary, rather
than binary, in nature. This includes the negation operator (unary minus) and the
common univariate functions such as ln(·), exp(·) and so on. In the interests of
simplicity and uniformity, we treat negation as a binary operation equivalent to
subtraction from zero (i.e. -x == 0 - x). Univariate functional dependence is handled
by introducing a different unary operator for each type of function. By convention, we
refer to the sub-tree defining the argument of the function as right although there is
no left in this case.
As an illustration of the above concepts, the binary tree corresponding to the nonlinear
expression used in the reformulation example above is shown in Figure 1. The
conversion of symbolic algebraic expressions from the standard infix notation to the
equivalent binary tree is a straightforward matter and is already used routinely by
process modelling environments (see, for instance, [18]).
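The data structure involved is simple. The following is a minimal Python sketch of such a binary tree (an illustration only, not the authors' implementation; the Node type and its field names are assumptions), shown holding the expression of Figure 1:

from dataclasses import dataclass
from typing import Optional, Union

@dataclass
class Node:
    # Leaves hold a variable name or a constant; interior nodes hold a binary
    # operator; univariate function nodes carry only a right child (their argument).
    content: Union[str, float]
    left: Optional["Node"] = None
    right: Optional["Node"] = None

# [x*ln(y) + z] / [z + x*y], mirroring Figure 1:
expr = Node("/",
            Node("+", Node("*", Node("x"), Node("ln", right=Node("y"))), Node("z")),
            Node("+", Node("z"), Node("*", Node("x"), Node("y"))))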

2.2 Symbolic Reformulation Algorithm


We now consider the symbolic reformulation of an algebraic expression given in the
form of a binary tree to a set of linear constraints, potentially involving some newly
introduced variables. Some of the latter will be defined as products or ratios of pairs of
other variables, or as one variable raised to a power of a constant or another variable,
or as a constant raised to a variable, or as univariate functions of a single variable.
Achieving this aim in a simple but not particularly efficient manner is, in fact,
straightforward if one is provided with the binary tree representation of the algebraic
expression to be reformulated. Thus, all one has to do is start from the bottom of the tree
and replace each binary operator node left ⊗ right where both left and right
are leaf nodes by a new variable defined accordingly. Binary addition/subtraction
operators will be replaced by linear constraints defining the new variable, while binary
multiplication and division operators will lead to the introduction of new variables
corresponding to bilinear and linear fractional terms respectively. Similarly, binary
exponentiation operators and univariate function operators will be replaced by new
variables defined in an appropriate manner.
Once the bottom level binary and unary operators are replaced by new variables, the
corresponding nodes become leaves and the procedure may be repeated to produce
further simplifications, and so on until the root of the binary tree is reached.
It can be verified that this simple strategy will result in the correct reformulation of
the example expression considered earlier. However, for more general expressions,
it may be unacceptably inefficient, leading to the introduction of many unnecessary
constraints and variables. Consider, for instance, the expression

$$\alpha\,\exp(\beta)\,(x + y)(x + \gamma y + \delta z)$$

where $\alpha$, $\beta$, $\gamma$ and $\delta$ are constants and x, y and z variables. Applying the simple algorithm outlined above to the corresponding binary tree would result in the following reformulation:
$$\begin{aligned}
w_1 &\equiv \exp(\beta) \\
w_2 &= \alpha\,w_1 \\
w_3 &= x + y \\
w_4 &\equiv w_2 w_3 \\
w_5 &= \gamma y \\
w_6 &= x + w_5 \\
w_7 &= \delta z \\
w_8 &= w_6 + w_7 \\
w_9 &\equiv w_4 w_8
\end{aligned}$$

Albeit strictly correct, this is extremely inefficient in a number of different ways: the definitions of $w_1$ and $w_2$ are unnecessary as they are both constant quantities; there is no need for separate definitions of $w_3$ and $w_4$ as they actually represent the same quantity multiplied by different constants; and the introduction of the intermediate quantities $w_5$, $w_6$ and $w_7$ is also superfluous. In fact, the expression can be reformulated simply as:

=
=
a exp((J) (x + Y)
x + 'YY + 6z
1
= WIW2

i.e. as two linear constraints and a nonlinear one involving a single bilinear term.
The above example indicates that the reformulation algorithm should keep track of the
constancy or variability of quantities it encounters as it moves up the binary tree; and
also that it should avoid replacing linear sub-expressions by new variables unless this
is absolutely necessary, which is the case only if they become involved in nonlinear
terms that must themselves be replaced by new variables.
The above ideas are incorporated in the algorithm shown in pseudo-code form in
Figure 2 which reformulates a given binary tree b. It is worth first clarifying two
general points:

• A binary tree b is characterised by its type (denoted as b.type) which can be a Leaf, a UnaryOperator or a BinaryOperator. The algorithm takes different actions depending on this type.

• In addition to the reformulation of a given binary tree b, the algorithm determines its class. The latter can take one of three values: a constant, C; a simple variable, V; and a linear expression, X. No other value is possible as nonlinear sub-expressions are replaced by new variables as soon as they are encountered.

We now proceed to examine the algorithm in more detail, considering the treatment
of each type of tree separately.

Handling of Leaf Nodes


Consider first the case of a binary tree b that is simply a leaf node. Clearly no
reformulation is necessary, and the class of the tree can be determined by examining
the contents of this leaf (b. content): if this is a problem variable, then the class is
V; otherwise it must be a constant, and is therefore assigned class C.

Handling of Unary Operator Nodes


Now consider the case of a tree b that is a unary operator corresponding to a univariate
function. Our first task is to reformulate the argument of the function, denoted by
b.right, which can itself be arbitrarily complex. This is done by a recursive invocation of the ClassifyReformulate procedure, this time applied to the binary tree b.right.
PROCEDURE ClassifyReformulate (b : BinaryTree)
CASE b.type OF

Leaf:
    IF b.content is a ProblemVariable THEN
        b.class := V
    ELSE
        b.class := C

UnaryOperator:
    ClassifyReformulate (b.right)
    IF ( b.right.class = X ) THEN
        CreateLinearConstraint (b.right)
    IF ( b.right.class <> C ) THEN
        CreateVariableDefinition (b)
    ELSE
        b.class := C

BinaryOperator:
    ClassifyReformulate (b.left)
    ClassifyReformulate (b.right)
    Apply rules of Table 1 to determine b.class and decide whether to:
        i)   CreateLinearConstraint (b.left) ?
        ii)  CreateLinearConstraint (b.right) ?
        iii) CreateVariableDefinition (b) ?

Figure 2 Symbolic Reformulation Algorithm



Left     Right    Binary    New Variable        New Linear         Binary
Subtree  Subtree  Operator  Definition          Constraint         Tree
Class    Class                                  Creation           Class
-------------------------------------------------------------------------
C        C        ±         -                   -                  C
                  ×         -                   -                  C
                  ÷         -                   -                  C
                  ↑         -                   -                  C
V        C        ±         -                   -                  X
                  ×         -                   -                  X
                  ÷         -                   -                  X
                  ↑         Power               -                  V
X        C        ±         -                   -                  X
                  ×         -                   -                  X
                  ÷         -                   -                  X
                  ↑         Power               Left               V
C        V        ±         -                   -                  X
                  ×         -                   -                  X
                  ÷         Linear Fractional   -                  V
                  ↑         Power               -                  V
V        V        ±         -                   -                  X
                  ×         Bilinear            -                  V
                  ÷         Linear Fractional   -                  V
                  ↑         Power               -                  V
X        V        ±         -                   -                  X
                  ×         Bilinear            Left               V
                  ÷         Linear Fractional   Left               V
                  ↑         Power               Left               V
C        X        ±         -                   -                  X
                  ×         -                   -                  X
                  ÷         Linear Fractional   Right              V
                  ↑         Power               Right              V
V        X        ±         -                   -                  X
                  ×         Bilinear            Right              V
                  ÷         Linear Fractional   Right              V
                  ↑         Power               Right              V
X        X        ±         -                   -                  X
                  ×         Bilinear            Left, Right        V
                  ÷         Linear Fractional   Left, Right        V
                  ↑         Power               Left, Right        V

Table 1 Binary Operator Interaction Rules


PROCEDURE CreateLinearConstraint (b : BinaryTree)
    j := j + 1

    Create new constraint:  b - w(j) = 0

    Add new constraint to list of linear constraints

    b.type    := Leaf
    b.class   := V
    b.content := w(j)

Figure 3 Creation of New Linear Constraint

Once the function argument has been reformulated and assigned a class (denoted
by b.right.class), we have to examine whether we need to replace it by a
new variable. This will be so only if the argument has been determined to be a
linear expression (class X), in which case we need to create a new linear constraint
by invoking procedure CreateLinearConstraint, the definition of which is
shown in Figure 3. We note that this procedure creates a new variable $w_j$ by increasing a global variable count $j$ by 1. It then proceeds to create a new linear constraint by equating this new variable to the given binary tree b. The new constraint is added to a list of linear constraints created by reformulation. Finally, b is replaced by the new variable: its type is changed to Leaf, its class to V and its content becomes $w_j$.
Having dealt with the univariate function's argument, we now come to consider the
function itself. In particular, if its argument has been determined to be anything other
than a constant, then we must replace it by a new variable. This is achieved by an
invocation of procedure CreateVariableDefinition shown in Figure 4. This
is very similar to the CreateLinearConstraint procedure discussed earlier,
except that in this case we create a definition of a new variable rather than a constraint,
and store this definition in a separate list. As we shall see later, we will use this list to
construct the problem relaxation by creating convex upper and lower bounds for each
one of its members.
We note that CreateVariableDefinition also sets the class of the binary tree b under consideration to a simple variable V. However, if the argument of the univariate function was determined to be a constant, then CreateVariableDefinition will not be invoked, and the class of b must be set to a constant (C) in ClassifyReformulate.
To illustrate the handling of univariate function operators, consider the expression:

exp(x + 2y)
PROCEDURE CreateVariableDefinition (b : BinaryTree)
    j := j + 1

    Create new variable definition:  w(j) == b

    Add new definition to list of variable definitions

    b.type    := Leaf
    b.class   := V
    b.content := w(j)

Figure 4 Creation of New Variable Definition in Terms of Simple Nonlinear Expressions

which corresponds to a binary tree with a unary operator node at its root. In this
case, its argument would be classified as a linear expression, and therefore would be
replaced by a linear constraint involving a new variable $w_j$:

$$x + 2y - w_j = 0$$

Then the function itself would be replaced by another variable defined as:

$$w_{j+1} \equiv \exp(w_j)$$

If, instead, the expression under consideration was simply $\exp(x)$, then no additional linear constraint need be created, and the new variable definition

$$w_j \equiv \exp(x)$$

would suffice.
In both of the above cases, the expression would be classified as variable (class V).
On the other hand, an expression of the form

exp(a + 2(3)

where a and f3 are constants would not be reformulated at all, and would itself be
classified as constant (class C).

Handling of Binary Operator Nodes


We finally come to consider the classification and reformulation of binary trees
with a binary operator root node. In this case, the algorithm starts by classifying
and reformulating their left and right sub-trees by recursive applications of the
ClassifyReformulate procedure to b.left and b.right.
Once the two sub-trees have been processed in this way, we come to consider the tree
b itself. The classification of the tree and the precise actions that need to be taken
depend on both the classification of the sub-trees and the binary operator involved, as
shown in Table 1.
Consider, for instance, the expression

$$\frac{x + 2y + z}{x + y}$$

This involves the ratio of two linear sub-expressions, and therefore corresponds to the penultimate row of Table 1. This then indicates that we need to create linear constraints for each sub-expression:

$$x + 2y + z - w_j = 0$$
$$x + y - w_{j+1} = 0$$

thus introducing two new variables $w_j$ and $w_{j+1}$. We also need to replace the entire expression by a new variable defined in terms of a linear fractional term:

$$w_{j+2} \equiv \frac{w_j}{w_{j+1}}$$

Finally, the root node is now classified as variable (class V).


Consider, on the other hand, an expression of the form

$$x(2y + z)$$

In this case, the left sub-tree is simply a variable. We therefore need to define a linear constraint to replace the right sub-tree by a new variable, and then introduce a definition of another new variable in terms of a bilinear product:

$$2y + z - w_j = 0$$
$$w_{j+1} \equiv x\,w_j$$

As the entries in the fourth column of Table 1 indicate, only three types of new variable definition may arise from the reformulation of a binary operator. These correspond to the bilinear form $xy$, the linear fractional form $x/y$ and the power form $x^y$ respectively. To these, we have to add a fourth type created by the reformulation of unary operators.
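To make the control flow of Figure 2 and Table 1 concrete, the following condensed Python sketch applies the same classification logic to the Node type of the earlier sketch. It is an illustration under stated assumptions, not the gPROMS code, and Table 1 is abridged to its essential cases:

linear_constraints, definitions, count = [], [], 0   # reformulation results

def new_w():
    global count
    count += 1
    return "w%d" % count

def replace_by_linear_constraint(sub):
    w = new_w()
    linear_constraints.append((sub, w))              # records  sub - w = 0
    return Node(w)

def replace_by_definition(b):
    w = new_w()
    definitions.append((w, b.content, b.left, b.right))   # e.g.  w == left * right
    b.content, b.left, b.right = w, None, None            # node becomes leaf w, class V
    return "V"

def classify_reformulate(b):
    if b.left is None and b.right is None:           # Leaf
        return "V" if isinstance(b.content, str) else "C"
    if b.left is None:                               # UnaryOperator (univariate function)
        cls = classify_reformulate(b.right)
        if cls == "C":
            return "C"                               # function of a constant is constant
        if cls == "X":
            b.right = replace_by_linear_constraint(b.right)
        return replace_by_definition(b)
    cls_l = classify_reformulate(b.left)             # BinaryOperator
    cls_r = classify_reformulate(b.right)
    if cls_l == "C" and cls_r == "C":
        return "C"
    if (b.content in "+-"                            # sums/differences stay linear
            or (b.content == "*" and "C" in (cls_l, cls_r))   # constant multiple
            or (b.content == "/" and cls_r == "C")):          # division by a constant
        return "X"
    if cls_l == "X":                                 # nonlinear term: first replace any
        b.left = replace_by_linear_constraint(b.left)         # linear operand (Table 1)
    if cls_r == "X":
        b.right = replace_by_linear_constraint(b.right)
    return replace_by_definition(b)

Running classify_reformulate(expr) on the tree built earlier yields the same two linear constraints and four nonlinear definitions obtained by inspection in Section 2, up to the numbering of the w variables.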

3 RELAXED PROBLEM FORMATION


As has already been noted, branch and bound algorithms for global optimisation
require a convex relaxation of the original nonconvex NLP. In this section, we
consider how such a relaxation can be obtained in a systematic fashion from the results
of the symbolic reformulation algorithm presented in the previous section.
The application of the reformulation procedure to the objective function $\Phi(x)$ and
each of the equality and inequality constraints g(x) and h(x) will generally introduce
a number of new variables W in addition to the original variables x. The reformulated
problem will comprise the following:

• A linear objective function (possibly a single variable).


• A set of linear constraints replacing the original equality and inequality constraints. It should be noted that constraints that are linear in the original problem are not affected by the reformulation procedure. On the other hand, nonlinear constraints that are not expressed as the sum or difference of nonlinear terms in the original problem will be replaced by single variables. For instance, a constraint of the form $x(1 - x) \le 0$ will be replaced by a single variable $w_j \le 0$ defined in an appropriate fashion (see below).

• A list of linear equality constraints constructed by procedure CreateLinearConstraint.

• A list of new variable definitions in terms of simple nonlinear functions, constructed by procedure CreateVariableDefinition.

The above reformulation of the original problem is exact. It is also completely linear
with the exception of the last item which has collected all the nonlinearities and
nonconvexities of the original problem in a single list. Each element of the latter
belongs to one of four special types, and we therefore need to consider the derivation
of convex upper and lower bounds for each of these.
We generally assume that the original variables x are supplied with physically meaningful lower and upper bounds, $x^l$ and $x^u$. Although no such bounds are available for the variables w introduced by the reformulation procedure, these may well be necessary for the construction of the convex relaxation of the original NLP.
The rest of this section is concerned with deriving convex bounds for each type of
nonlinear term, and obtaining upper and lower bounds for the W variables.

3.1 Univariate Function Nonlinearities


These are of the form
$$w_j \equiv f(z)$$
where $f(\cdot)$ is a univariate function, and $z \in \{x, w\}$ is a single variable. For the purposes of this paper, we assume that the functions $f(\cdot)$ are either concave or convex. Although this might appear to be somewhat restrictive (e.g. it excludes

(a) Concave Univariate Function        (b) Convex Univariate Function

Figure 5 Upper and Lower Bounds for Univariate Functions

trigonometric functions), it actually includes most of the univariate functions that are commonly encountered in process engineering problems (e.g. $\ln(\cdot)$, $\exp(\cdot)$, $\sqrt{\cdot}$). We also assume that the functions are well defined over the entire domain of their argument z, a fact that may be exploited to tighten the bounds on z if necessary.
For concave univariate functions, the secant

$$f(z^l) + \frac{f(z^u) - f(z^l)}{z^u - z^l}\,(z - z^l)$$

provides the lower bound for $w_j$ while the function $f(z)$ itself acts as the upper bound (see Figure 5):

$$f(z^l) + \frac{f(z^u) - f(z^l)}{z^u - z^l}\,(z - z^l) \;\le\; w_j \;\le\; f(z)$$

On the other hand, for purely convex functions, we have the bounds

$$f(z) \;\le\; w_j \;\le\; f(z^l) + \frac{f(z^u) - f(z^l)}{z^u - z^l}\,(z - z^l)$$

The above bounds represent convex relaxations of the definition of $w_j$. The definition can also be used to derive upper and lower bounds on the $w_j$ variable itself. In fact, most common univariate functions are monotonic, and in these cases we have simply:

$$w_j^l = \min\left(f(z^l), f(z^u)\right), \qquad w_j^u = \max\left(f(z^l), f(z^u)\right)$$
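These estimator pairs can be assembled mechanically. The following small Python sketch (an illustration with assumed names, not library code) returns the under- and overestimating functions for $w_j = f(z)$ on $[z^l, z^u]$:

import math

def univariate_estimators(f, zl, zu, convex):
    # Secant through (zl, f(zl)) and (zu, f(zu)).
    slope = (f(zu) - f(zl)) / (zu - zl)
    secant = lambda z: f(zl) + slope * (z - zl)
    # Convex f: f underestimates, the secant overestimates; roles swap if concave.
    return (f, secant) if convex else (secant, f)

under, over = univariate_estimators(math.exp, 0.0, 2.0, convex=True)
# Bounds on w itself, exploiting monotonicity of most common functions:
wl, wu = min(math.exp(0.0), math.exp(2.0)), max(math.exp(0.0), math.exp(2.0))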
3.2 Bilinear Product Terms

These are of the form
$$w_j \equiv yz$$
where $y, z \in \{x, w\}$ are single variables. In this case, we employ the linear bounds proposed by McCormick [15]:

$$\begin{aligned}
w_j &\ge y^l z + z^l y - y^l z^l \\
w_j &\ge y^u z + z^u y - y^u z^u \\
w_j &\le y^l z + z^u y - y^l z^u \\
w_j &\le y^u z + z^l y - y^u z^l
\end{aligned}$$

Bounds on $w_j$ itself may be derived from the bounds on y and z:

$$w_j^l = \min(y^l z^l, y^l z^u, y^u z^l, y^u z^u), \qquad w_j^u = \max(y^l z^l, y^l z^u, y^u z^l, y^u z^u)$$

The convex nonlinear over- and underestimators recently proposed by Quesada and Grossmann [21] could also be used for constructing the convex relaxation of $w_j = yz$. However, because of the way the bounds on $w_j$ are derived from the bounds of y and z, these nonlinear estimators are initially weaker than their linear counterparts listed above (see Property 3 and Corollary 2 in [21]). On the other hand, if the branch and bound algorithm (see Section 4) branches on the $w_j$ variable, thereby reducing its range, then the convex nonlinear bounds may become non-redundant and should then be included in the relaxation.
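For concreteness, a small Python sketch of the linear McCormick construction above (an illustrative encoding, not the gPROMS data structures):

def mccormick(yl, yu, zl, zu):
    # Each dict d encodes  d["w"]*w + d["y"]*y + d["z"]*z + d["const"] >= 0.
    ineqs = [
        {"w":  1, "y": -zl, "z": -yl, "const":  yl*zl},   # w >= yl*z + zl*y - yl*zl
        {"w":  1, "y": -zu, "z": -yu, "const":  yu*zu},   # w >= yu*z + zu*y - yu*zu
        {"w": -1, "y":  zu, "z":  yl, "const": -yl*zu},   # w <= yl*z + zu*y - yl*zu
        {"w": -1, "y":  zl, "z":  yu, "const": -yu*zl},   # w <= yu*z + zl*y - yu*zl
    ]
    corners = (yl*zl, yl*zu, yu*zl, yu*zu)                # interval bounds on w itself
    return ineqs, min(corners), max(corners)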

3.3 Linear Fractional Terms


These are of the form
$$w_j \equiv \frac{y}{z}$$
where $y, z \in \{x, w\}$ are single variables. For the term to be defined across the entire bounded region, z must be either strictly positive or strictly negative. Using a simple reformulation to $w_j z = y$, we are left with a bilinear term and again may use the linear bounds derived by McCormick [15] (see above).

The bounds on variable $w_j$ may be derived from those on y and z as follows:

$$w_j^l = \min\left(\frac{y^l}{z^l}, \frac{y^l}{z^u}, \frac{y^u}{z^l}, \frac{y^u}{z^u}\right), \qquad w_j^u = \max\left(\frac{y^l}{z^l}, \frac{y^l}{z^u}, \frac{y^u}{z^l}, \frac{y^u}{z^u}\right)$$
It is also possible to derive nonlinear over- and underestimators for linear fractional terms [21]:

$$\begin{aligned}
w_j &\ge \frac{y}{z^u} + \frac{y^l}{z} - \frac{y^l}{z^u} \\
w_j &\ge \frac{y}{z^l} + \frac{y^u}{z} - \frac{y^u}{z^l} \\
w_j &\le \frac{y}{z^l} + \frac{y^l}{z} - \frac{y^l}{z^l} \\
w_j &\le \frac{y}{z^u} + \frac{y^u}{z} - \frac{y^u}{z^u}
\end{aligned}$$

As noted in [21] (see Property 1 and Corollary 1), these are, in fact, stronger than the linear estimators if the bounds on $w_j$ are calculated in the manner shown above. However, for a given combination of the signs of the bounds for y and z, only two of these four nonlinear constraints are convex and can therefore be included in the relaxation.

3.4 Simple Power Terms


In general, these are of the form
$$w_j \equiv y^z$$
where either y or z (or both) is a variable. Most often in practice, either y or z is actually a constant.
If z is constant, then $y^z$ is either convex or concave over the entire domain of y, and estimators may be constructed in a manner very similar to that used for univariate functions (see Section 3.1). In particular, when $0 < z < 1$, the secant acts as the convex lower bound while $y^z$ itself provides the concave overestimator. On the other hand, for $z > 1$ and $z < 0$, the secant is the convex overestimator and $y^z$ acts as the underestimator, provided y is restricted to non-negative values. For y to be allowed to take negative values, z must be an integer constant. In this case, if z is even, then $y^z$ is convex everywhere and the previous rule applies. If z is odd, then $y^z$ is convex for $y > 0$ and concave for $y < 0$.

When y is constant and z is a single variable, the term is similar to the exponential function. For the term to be defined across the entire domain of z, y must be positive. Here, $y^z$ always provides the underestimator while the secant acts as the overestimator.

Upper and lower bounds for the variable $w_j$ for the two cases where either y or z are constant can almost always be obtained in a manner similar to that presented in Section 3.1. The one exception is when z is an even valued integer and y is allowed to take both positive and negative values, which destroys the monotonicity of the function. In this case, we have the bounds

$$w_j^l = 0, \qquad w_j^u = \max\left((y^l)^z, (y^u)^z\right)$$

The more general case in which both y and z are variables rarely occurs in practical process models. It can be handled by writing $w_j \equiv y^z$ as

$$\ln(w_j) - z\,\ln(y) = 0$$

and reformulating this constraint further using the algorithm of Section 2.

3.5 Deriving Bounds for w's Introduced in Linear Constraints

In addition to the w variables corresponding to nonlinear term definitions, the reformulation procedure introduces w variables associated with linear constraints (cf. procedure CreateLinearConstraint) of the form:

$$w_j = \sum_k \beta_k y_k$$

where $y \subseteq \{x, w\}$ is a vector of variables and $\beta$ a vector of constants. The following bounds on $w_j$ are readily derived from the above:

$$w_j^l = \sum_k \min\left(\beta_k y_k^l,\ \beta_k y_k^u\right), \qquad w_j^u = \sum_k \max\left(\beta_k y_k^l,\ \beta_k y_k^u\right)$$
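These interval sums are trivially computed; a one-function Python sketch (the names are illustrative):

def linear_bounds(betas, bounds):
    # Bounds on w = sum(beta_k * y_k) given per-variable intervals (yl_k, yu_k).
    lo = sum(min(b*yl, b*yu) for b, (yl, yu) in zip(betas, bounds))
    hi = sum(max(b*yl, b*yu) for b, (yl, yu) in zip(betas, bounds))
    return lo, hi

linear_bounds([2.0, -1.0], [(0.0, 1.0), (3.0, 5.0)])   # -> (-5.0, -1.0)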

3.6 Remarks on the Relaxation of Inequality Constraints


We have seen above how over- and underestimators can be derived for different types
of nonlinearity. It is worth mentioning that in some cases, it is not necessary to
introduce both of these estimators into the relaxation. In particular, the inequality
constraints
$$h_i(x) \le 0 \quad \forall i$$

are, in general, reformulated into linear constraints of the form:

$$\sum_j \alpha_{ij} x_j + \sum_j \beta_{ij} w_j \le 0 \quad \forall i$$

If a variable $w_j$ appears with non-negative coefficients $\beta_{ij}$ in all such linear inequality constraints i without appearing in any other linear constraint, then it suffices to include its underestimator(s) in the relaxed formulation. Similarly, if all $\beta_{ij}$ are non-positive and, again, $w_j$ does not appear in any equality constraint, then we only need to consider its overestimator(s).
It is interesting to consider the relaxation of problems in which all the equality constraints $g(x) = 0$ are linear, and all the inequality constraints $h(x) \le 0$ are convex.
Such problems are, of course, convex, and do not actually require the application of global optimisation techniques. Fortunately, it can be shown that, in many such cases, the relaxations derived by our techniques will be exact. Consider, for instance, any constraint of the form

$$\sum_k \frac{\alpha_k}{\mathcal{L}_k(x)} + \sum_k \beta_k\,F_k^{cx}(\mathcal{M}_k(x)) - \sum_k \gamma_k\,F_k^{cc}(\mathcal{N}_k(x)) + P(x) \le c$$

where $\alpha_k$, $\beta_k$ and $\gamma_k$ are non-negative constant coefficients, $\mathcal{L}_k(x)$, $\mathcal{M}_k(x)$, $\mathcal{N}_k(x)$ and $P(x)$ are general linear expressions, $F_k^{cx}(\cdot)$ and $F_k^{cc}(\cdot)$ are respectively convex and concave univariate functions, and c a constant.

It can be verified that the application of the symbolic reformulation procedure to such a constraint will result in the following set of linear constraints

$$\mathcal{L}_k(x) - w_k^{(1)} = 0$$
$$\mathcal{M}_k(x) - w_k^{(2)} = 0$$
$$\mathcal{N}_k(x) - w_k^{(3)} = 0$$
$$\sum_k \alpha_k\,\bar{w}_k^{(1)} + \sum_k \beta_k\,\bar{w}_k^{(2)} - \sum_k \gamma_k\,\bar{w}_k^{(3)} + P(x) \le c$$

where the new variables $w_k^{(\lambda)}$, $\bar{w}_k^{(\lambda)}$, $\lambda = 1, 2, 3$ are related through the definitions:

$$\bar{w}_k^{(1)} \equiv \frac{1}{w_k^{(1)}}, \qquad \bar{w}_k^{(2)} \equiv F_k^{cx}(w_k^{(2)}), \qquad \bar{w}_k^{(3)} \equiv F_k^{cc}(w_k^{(3)})$$

Let us now consider relaxing this problem. We note that the first definition is a special form of a linear fractional term in which the numerator is a constant. Thus, in terms of the notation of Section 3.3, we have $y = y^l = y^u = 1$ and can identify $z \equiv w_k^{(1)}$. Therefore, both of the Quesada and Grossmann [21] nonlinear underestimators applied to this case reduce simply to:

$$\bar{w}_k^{(1)} \ge \frac{1}{w_k^{(1)}}$$

Similarly, because of the signs of $\beta_k$ and $\gamma_k$, it suffices to relax the definition of $\bar{w}_k^{(2)}$ by the nonlinear underestimator of the convex function $F_k^{cx}(\cdot)$, and that of $\bar{w}_k^{(3)}$ by the nonlinear overestimator of the concave function $F_k^{cc}(\cdot)$. In both cases, as explained in Section 3.1, these estimators are the nonlinear functions themselves, and we therefore have the relaxations:

$$\bar{w}_k^{(2)} \ge F_k^{cx}(w_k^{(2)}), \qquad \bar{w}_k^{(3)} \le F_k^{cc}(w_k^{(3)})$$

Now, the above relaxations, together with the set of linear constraints generated by the reformulation, are exactly equivalent to the original convex nonlinear constraint. We therefore conclude that the application of our reformulation/relaxation technique to a purely convex problem of this form will result in an exact relaxation. Consequently, a branch and bound algorithm of the type detailed in Section 4 below will converge to the optimal point in one iteration without ever requiring any branching.

4 SPATIAL BRANCH AND BOUND ALGORITHM


Having constructed the convex relaxation of the original nonconvex NLP in the
manner described in the previous section, we can now attempt the solution of the
global optimisation problem. A spatial branch and bound algorithm [10] with the
extensions proposed by Quesada and Grossmann [21] and Ryoo and Sahinidis [22, 23]
is used. In this section, we outline the overall structure of the algorithm, before
considering in more detail one of the steps in it.

4.1 Algorithm Structure


Given an absolute optimality margin $\varepsilon$:

Step 1: Initialise search

Set the upper bound for the objective function $\Phi^U := \infty$.

Initialise the list of subregions $\mathcal{L}$ to a single region $R$ covering the full domain of the variables x, w: $R \equiv [x^l, x^u] \times [w^l, w^u]$.

Step 2: Choose a subregion

If $\mathcal{L} = \emptyset$, go to step 8; otherwise choose a subregion $R$ from the list of subregions $\mathcal{L}$.

Step 3: Bounds tightening

Attempt to tighten the variable bounds for subregion $R$. If these bounds consequently become inconsistent, go to step 7.

Step 4: Generate lower bound for objective function in R

Form and solve the relaxed problem for subregion $R$ to yield a relaxed solution $(x_R^*, w_R^*)$ and a lower bound for the objective function $\Phi_R^L$. If $\Phi_R^L \ge \Phi^U - \varepsilon$, or the relaxed problem is infeasible, go to step 7.

Step 5: Generate upper bound for objective function in R

Set $\Phi_R^U := \infty$.

If $x_R^*$ is a feasible point of the exact problem, then calculate the upper bound $\Phi_R^U := \Phi(x_R^*)$.

If $\Phi_R^L < \Phi_R^U - \varepsilon$ then: attempt to solve the original problem over subregion $R$; if a feasible solution $\hat{x}_R$ is obtained with an objective function value of $\hat{\Phi}_R < \Phi_R^U$, then set $x_R^* := \hat{x}_R$ and $\Phi_R^U := \hat{\Phi}_R$.

If $\Phi_R^U < \Phi^U$ then: update the best feasible solution found so far, $x^* := x_R^*$, and set $\Phi^U := \Phi_R^U$; remove all subregions $R_i$ such that $\Phi_{R_i}^L \ge \Phi^U - \varepsilon$ from the list $\mathcal{L}$.

If $\Phi_R^L \ge \Phi_R^U - \varepsilon$, go to step 7.

Step 6: Branching

Apply a branching rule to subregion $R$ to choose a variable and its corresponding value on which to branch. Add the two new subregions generated by partitioning $R$ at this variable value to the list $\mathcal{L}$.

Step 7: Delete subregion

Delete the current subregion $R$ from the list $\mathcal{L}$. Go to step 2.

Step 8: Termination

If $\Phi^U = \infty$, the problem is infeasible. Otherwise the solution is $x^*$ with an objective function value of $\Phi^U$.

The basic concepts of this type of algorithm have already been described in detail by several authors [21, 22, 23]. In our implementation, the subregion $R$ selected to be examined next at step 2 is the one with the lowest lower bound $\Phi_R^L$ in the list of pending subregions $\mathcal{L}$. Also, the rule used at step 6 to select the variable on which the algorithm will branch next is that of Ryoo and Sahinidis [23]. This selects the variable with the highest contribution to the gap between the objective function values of the relaxed and exact problems.
Step 4 of the algorithm establishes a lower bound $\Phi_R^L$ on the objective function by solving the relaxed problem. Depending on the form of the estimators used in the relaxation, this may be either a linear or a convex nonlinear programming problem. At step 5, we attempt to establish an upper bound $\Phi_R^U$ on the objective function. This can be done cheaply if the solution of the relaxed problem, $x_R^*$, happens to be feasible with respect to the original problem. In this case, $\Phi_R^U$ can be calculated simply by evaluating the objective function of the original problem at $x_R^*$.
If $\Phi_R^L < \Phi_R^U - \varepsilon$, there may still be scope for a better upper bound. This we try to obtain by solving the original nonconvex NLP using an iterative local optimisation algorithm, with $x_R^*$ as the initial point for the iteration. Of course, given the nonconvexity of the problem, there is no guarantee that such an algorithm will converge, or, even if it does, that the solution obtained will be better than the current upper bound for this region. In any case, once an upper bound is established, we check to see if it improves on the global upper bound $\Phi^U$. If so, we update the best solution $x^*$ found so far and prune the list $\mathcal{L}$ to remove any clearly inferior regions.

Finally, if the lower and upper bounds for this region are within the given optimality margin, no further branching within $R$ is necessary and it can be deleted from the list $\mathcal{L}$ (step 7).
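The overall control flow of steps 1 to 8 can be summarised in the following Python skeleton. This is a structural sketch only: tighten_bounds, solve_relaxation, local_solve and branch stand for the operations described above and are not gPROMS routines.

import heapq, math

def spatial_bb(root_region, eps):
    best_x, best_ub = None, math.inf
    heap, tie = [(-math.inf, 0, root_region)], 0      # keyed by lower bound
    while heap:
        lb, _, region = heapq.heappop(heap)           # lowest lower bound first
        if lb >= best_ub - eps:
            continue                                  # pruned (step 7)
        region = tighten_bounds(region)               # step 3
        if region is None:
            continue                                  # inconsistent bounds
        sol = solve_relaxation(region)                # step 4
        if sol is None or sol.lb >= best_ub - eps:
            continue
        x = local_solve(region, start=sol.x)          # step 5
        if x is not None and x.obj < best_ub:
            best_x, best_ub = x, x.obj
        if sol.lb < best_ub - eps:                    # step 6
            for child in branch(region, sol):
                tie += 1
                heapq.heappush(heap, (sol.lb, tie, child))
    return best_x, best_ub                            # step 8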
The next subsection deals in detail with the bounds tightening procedure applied at
step 3.

4.2 Bounds Tightening Procedure


Clearly the efficiency of the spatial branch and bound algorithm depends very much
on the gap between the relaxed and the exact problems. The quality of the over-
and underestimators used for the construction of the relaxation is, in turn, largely
determined by the bounds on the variables. Thus the procedure for tightening these
bounds at step 3 of the algorithm is particularly important.
One way of obtaining a tight lower (upper) bound for a variable $z \in \{x, w\}$ is to minimise (maximise) z subject to all the linear constraints in the NLP relaxation. The problem with this approach is that it requires the solution of 2n linear programming problems, where n is the total number of variables in the problem, and this may be unjustifiably expensive for large problems.
A computationally cheaper, albeit not always so effective, approach considers the
problem constraints individually and attempts to tighten the bounds of the variables
occurring in them. This technique, called "feasibility-based tightening" by some
authors [25], is readily applicable to linear constraints of the form:

For any k such that ak i 0, this can he re-arranged to


376 H. D. SHERALI, E. P. SMTIH AND S. KIM

from which we can attempt to tighten the bounds on Zk through the relations:

IF ak >0
ZUk .- min (Zk' (b-
a1k j1k min(ajZ},ajzj')))

zlk .- max(zL :. (b -;1:nax(a;zj, a;zj)) )


IF ak <0
ZUk .- min (z" !. (b- j1, max(ajzj,a;zy)))
zlk .- max (zL:' (b- ;1, min(a;zj,a jzj )))
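In code, one pass of these relations over a single equality is straightforward; a Python sketch follows (illustrative names, with bounds stored in dicts lb/ub keyed by variable):

def tighten_linear(a, b, lb, ub):
    # One pass over the equality  sum(a_k * z_k) = b.
    for k, ak in a.items():
        if ak == 0:
            continue
        rest_lo = sum(min(aj*lb[j], aj*ub[j]) for j, aj in a.items() if j != k)
        rest_hi = sum(max(aj*lb[j], aj*ub[j]) for j, aj in a.items() if j != k)
        if ak > 0:
            ub[k] = min(ub[k], (b - rest_lo)/ak)
            lb[k] = max(lb[k], (b - rest_hi)/ak)
        else:
            ub[k] = min(ub[k], (b - rest_hi)/ak)
            lb[k] = max(lb[k], (b - rest_lo)/ak)
    return all(lb[k] <= ub[k] for k in a)   # False signals infeasibility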

The above tightening is based entirely on the linear constraints in the original problem.
One of the effects of the symbolic reformulation procedure presented in Section 2
is that the nonlinearities are extracted from the objective function and constraints,
and are all collected together in a list of simple nonlinear definitions. Here we are
interested in using this information for further bounds tightening.
Consider, for instance, the nonlinear constraint

$$xy + z = 5$$

where x, y, and z are variables in the range [1, 10]. Applying the symbolic reformulation procedure will result in the linear constraint:

$$w + z = 5$$

where the new variable $w \equiv xy$ is in the range [1, 100] (cf. Section 3.2). By applying feasibility-based tightening to the now linear constraint rearranged as $w = 5 - z$, we can reduce $w^u$ to 4. Similarly, from $z = 5 - w$, we reduce $z^u$ also to 4. Now, consider the definition $w \equiv xy$ rearranged as either $x = w/y$ or $y = w/x$. From these, we can easily deduce that the upper bounds of both x and y can be reduced from 10 down to 4. The final result of the tightening procedure is therefore $x, y, z, w \in [1, 4]$.
Overall, the reformulation procedure not only has linearised the originally nonlinear
constraint, thereby making it amenable to the application of the standard feasibility-
based tightening techniques, but also has extracted the nonlinearity to a form that can
readily be used for further tightening. In particular, the following general tightening

rules can be deduced from bilinear product definitions $w \equiv yz$ (cf. Section 3.2)¹:

$$y^l := \max\left(y^l,\ \min\left(\tfrac{w^l}{z^l}, \tfrac{w^l}{z^u}, \tfrac{w^u}{z^l}, \tfrac{w^u}{z^u}\right)\right), \qquad y^u := \min\left(y^u,\ \max\left(\tfrac{w^l}{z^l}, \tfrac{w^l}{z^u}, \tfrac{w^u}{z^l}, \tfrac{w^u}{z^u}\right)\right)$$

$$z^l := \max\left(z^l,\ \min\left(\tfrac{w^l}{y^l}, \tfrac{w^l}{y^u}, \tfrac{w^u}{y^l}, \tfrac{w^u}{y^u}\right)\right), \qquad z^u := \min\left(z^u,\ \max\left(\tfrac{w^l}{y^l}, \tfrac{w^l}{y^u}, \tfrac{w^u}{y^l}, \tfrac{w^u}{y^u}\right)\right)$$

¹ It is understood that evaluation of each of these expressions will be undertaken only if all variable bounds appearing in their denominators are non-zero.

while for linear fractional term definitions $w \equiv y/z$ (cf. Section 3.3), the corresponding expressions are:

$$y^l := \max\left(y^l,\ \min(w^l z^l,\ w^l z^u,\ w^u z^l,\ w^u z^u)\right), \qquad y^u := \min\left(y^u,\ \max(w^l z^l,\ w^l z^u,\ w^u z^l,\ w^u z^u)\right)$$

$$z^l := \max\left(z^l,\ \min\left(\tfrac{y^l}{w^l}, \tfrac{y^l}{w^u}, \tfrac{y^u}{w^l}, \tfrac{y^u}{w^u}\right)\right), \qquad z^u := \min\left(z^u,\ \max\left(\tfrac{y^l}{w^l}, \tfrac{y^l}{w^u}, \tfrac{y^u}{w^l}, \tfrac{y^u}{w^u}\right)\right)$$
Analogous expressions can be derived for univariate function nonlinearities, provided the function is monotonic, and for most cases of simple power term definitions.
All these expressions, together with the bounds on w already presented in Sections
3.1--3.4, can be used in the context of a bounds tightening procedure at step 3 of the
branch and bound algorithm. This may involve multiple passes through the set of
problem constraints: each constraint is considered in turn with a view to tightening the
bounds of each variable occurring in it; if any such tightening is found to be possible, a
further pass through the set of constraints is initiated, and so on until no more changes
in the bounds take place.
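A sketch of this multi-pass propagation, combining the linear rule of the earlier tighten_linear sketch with the bilinear rules above (assuming that routine is available and that variable bounds are non-zero wherever a division occurs):

def propagate(linears, bilinears, lb, ub, max_passes=20):
    for _ in range(max_passes):
        before = (dict(lb), dict(ub))
        for a, b in linears:                          # constraints sum(a_k*z_k) = b
            if not tighten_linear(a, b, lb, ub):
                return False                          # region infeasible
        for w, y, z in bilinears:                     # definitions w = y*z
            for var, num, den in ((w, None, None), (y, w, z), (z, w, y)):
                if num is None:                       # w from the product interval
                    vals = (lb[y]*lb[z], lb[y]*ub[z], ub[y]*lb[z], ub[y]*ub[z])
                else:                                 # y = w/z  and  z = w/y
                    vals = (lb[num]/lb[den], lb[num]/ub[den],
                            ub[num]/lb[den], ub[num]/ub[den])
                lb[var] = max(lb[var], min(vals))
                ub[var] = min(ub[var], max(vals))
                if lb[var] > ub[var]:
                    return False
        if (dict(lb), dict(ub)) == before:
            return True                               # fixed point: no more changes
    return True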
This procedure sometimes results in the lower bound for one or more variables
exceeding the upper bound. Such inconsistency clearly implies that the problem
is infeasible in the region under consideration, and this is therefore discarded from
further consideration (cf. step 3 of the algorithm).

5 IMPLEMENTATION
The constraint reformulation, NLP relaxation and branch and bound algorithms
detailed in Sections 2--4 have been implemented within the gPROMS process
modelling environment [4]. This software package supports the modelling of the
transient and steady-state behaviour of processes involving both lumped and distributed
operations. In general, process models may be described in terms of mixed sets of
integral, partial and ordinary differential and algebraic equations (IPDAEs) expressed
in a high-level symbolic language [16]. Moreover, any model may involve instances
of simpler models; this allows the establishment of model hierarchies of arbitrary
depth, and provides an effective mechanism for dealing with modelling complexity.
An example of the gPROMS language is shown in Appendix A. Two types of entity
can be distinguished in this particular case. MODELs describe the physical behaviour
of the system, and, to a large extent, can be defined independently of any specific
application. On the other hand, a PROCESS defines an "experiment" to be performed
in terms of the objects being investigated (i.e. instance(s) of the MODELs), the
experimental frame (i.e. the conditions under which the investigation is to take place),
and the results generated by its execution [17, 33]. Multipurpose process modelling environments, such as gPROMS, support several different types of experiment (e.g. process simulation, optimisation, parameter estimation etc.). The particular PROCESS (C1P1_IsoSeries) shown in Appendix A defines an optimisation experiment to be carried out in an instance P of the MODEL Iso_Series_C1P1 defined earlier in the same input file.
same input file.
The solution of a global optimisation problem defined in the gPROMS language
involves several steps, all of which are performed automatically and completely
transparently to the user. First the input is translated into an internal representation
of the various generic MODEL and PROCESS entities in it. The execution of a
PROCESS (corresponding to performing the "experiment" associated with it) causes
an instantiation of the specific MODEL(s) on which it operates, generating the actual
variables and constraints involved in them. At this stage, any spatial distribution of
variables and constraints is approximated through appropriate spatial discretisation
techniques, thereby reducing the IPDAE system to a system of ordinary differential
(with respect to time) and algebraic equations. For steady-state experiments, such as
those of interest to this paper, the time variation is subsequently removed by setting
all time derivatives to zero, thus further reducing the problem description to a set of
purely algebraic constraints. The latter are then differentiated symbolically to generate
their first partial derivatives with respect to all the variables occurring in them (see,
for instance, [18]).
The above steps are common to all types of gPROMS experiment [3, 16]. They result
in a large set of nonlinear constraints (and their partial derivatives) held in binary tree
form, together with other information (such as the constraint sparsity pattern) that is
typically required by numerical algorithms.
In the specific case of global optimisation, the next step involves the application of
the symbolic reformulation procedure of Section 2 to the binary tree representation
of the objective function and constraints. Our implementation of the CreateVariableDefinition procedure is slightly more sophisticated than that shown
in Figure 4. Thus, for reasons of efficiency in subsequent operations, separate lists
are maintained for each of the four types of variable definition. Also, before a new
variable definition is actually created, a test is carried out to prevent any duplication
of previous definitions. Thus, for instance, only one w variable representing a bilinear
product xy is ever created, irrespective of the actual number of occurrences of xy or
yx terms in the problem.
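A sketch of that duplicate test (illustrative Python only): each candidate definition is reduced to a canonical key before a new w variable is created, so that commutative pairs such as xy and yx share one entry.

definition_cache = {}

def lookup_or_create(op, a, b, make_w):
    # Sort the operands of commutative operators to canonicalise the key.
    key = (op,) + tuple(sorted((a, b))) if op == "*" else (op, a, b)
    if key not in definition_cache:
        definition_cache[key] = make_w()   # create a w variable only once per term
    return definition_cache[key]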
Testing of the reformulation algorithm on problems constructed from standard process
engineering models (including various complex unit operations such as plug flow
reactors, distillation columns and absorbers) indicates that its current implementation
achieves an average of over 3000 constraint reformulations per second on a SUN
SPARC 10/51 workstation.
Once the problem reformulation is completed, gPROMS proceeds with the construc-
tion of the NLP relaxation using the estimators presented in Section 3, before applying
the branch and bound algorithm of Section 4. Our current implementation of the
latter uses a sequential linear programming (SLP) method [5] for the solution of the
convex relaxation and the attempted solution of the nonconvex NLP at steps 4 and 5
respectively. The CPLEX code [6] is used for the solution of the large, sparse linear
programming subproblems within the SLP algorithm.

6 ILLUSTRATIVE EXAMPLE
This example is concerned with the design of a reactor network involving a continuous
stirred tank reactor (CSTR) and a plug flow reactor (PFR) operating in series. The
reactors operate isothermally and reactions occur in the liquid phase according to the
Van de Vusse [32] scheme shown below:

$$A \xrightarrow{k_1} B \xrightarrow{k_2} C \qquad\qquad A \xrightarrow{k_3} D$$

where $k_1 = 10.0\ \mathrm{s^{-1}}$ (first-order), $k_2 = 1.0\ \mathrm{s^{-1}}$ (first-order) and $k_3 = 1.0\ \mathrm{l/(mol\ s)}$ (second-order), the values used in the rate equations of the models in Appendix A.

The feed to the CSTR is 100 l/s of a dilute solution of A at a concentration of 5.8 mol/l.
The objective is to determine the reactor volumes that maximise the concentration of
component B in the outlet stream of the PFR.
Standard steady-state CSTR and PFR models were used, and these are shown in the
gPROMS input file listed in Appendix A. The CSTR model is expressed simply as
a set of nonlinear algebraic equations, but the PFR model involves a mixed set of
differential and algebraic equations (DAEs) reflecting the variation of concentrations
of the four species A, B, C, D, and other quantities (e.g. reaction rates) over the length
of the reactor. For instance, the mass conservation equation for component A is
written as:
$$F\,\frac{\partial C_A}{\partial v} = -V\,(r_1 + r_3)$$

Figure 6 Optimal Solution for the Example (CSTR and PFR in series: the optimal reactor volumes, the CSTR outlet concentrations, and the concentration profiles of components A, B, C and D along the PFR)

where F is the volumetric flowrate, $r_j$ is the rate of reaction j, $v \in [0, 1]$ denotes normalised volume, and V is the total PFR volume. Also, the rate of the third reaction is given by:

$$r_3 = k_3 C_A^2$$
gPROMS applies a third-order orthogonal collocation method over five finite elements
of equal length to discretise the DAE system into a set of nonlinear algebraic constraints
[8]. Simulation experiments [27] demonstrated that this discretisation method gives
a sufficiently accurate representation of plug flow behaviour while producing fewer
constraints/variables than the more commonly used finite difference based schemes
for the same level of accuracy.
Following the spatial discretisation of the PFR model, the overall process model
involves 160 variables and 158 equality constraints. Its symbolic reformulation takes
less than 0.1 CPU seconds on a SUN SPARC 10/51 and results in a reformulated
problem involving 309 variables, 209 linear constraints and 98 unique bilinear and
power terms.
Figure 6 shows the global optimum solution, including the reactor volumes, the component concentration values in the CSTR, and the spatial concentration profiles in the PFR. The optimal objective function value corresponds to a maximum product concentration of B of 3.682 mol/l. This solution is practically identical to the design obtained by Kokossis and Floudas [12] using local optimisation techniques and approximating the PFR as a sequence of 200 CSTRs.
Overall, 14074 nodes were examined by the branch and bound algorithm. It is
interesting to note that the final solution was actually obtained after the inspection of

the first eight nodes, the remainder of the nodes being examined to confirm that this
was in fact the global optimum. The solution required 2908 CPU seconds on a SUN
SPARC 10/51 workstation. Of this, 37% was taken solving the relaxed problems,
58% solving exact problems, and the remaining 5% on bounds tightening and other
house-keeping activities.

7 CONCLUSIONS
Despite the very significant progress in global optimisation techniques over recent
years, these remain applicable to relatively limited classes of mathematical problems.
This paper has presented a general and efficient symbolic manipulation algorithm that
can reformulate most process engineering optimisation problems into a form that is
amenable to solution by spatial branch and bound techniques for global optimisation. In
fact, the availability of symbolic information also provides the potential for improving
certain aspects (e.g. bounds tightening) of the numerical algorithm itself.
Of course, the potentially poor performance of spatial branch and bound algorithms
applied to large problems remains an important issue. We have not, so far, examined
in detail the effects of various algorithmic decisions on the efficiency of the algorithm.
However, it is worth stressing that the symbolic manipulation, problem relaxation
and bounds tightening algorithms presented in this paper are complementary to, and
can be used in conjunction with, other branch and bound algorithms implemented on
either sequential or distributed architecture computers [2, 7, 21, 23].
The ultimate aim of our work is to enable the automatic application of global
optimisation techniques to engineering optimisation problems without the need for
special expertise in mathematical problem formulation and solution. As a first step
in this direction, the algorithms presented have been implemented in the gPROMS
multipurpose process modelling environment. This permits the use of standard process
models for global process optimisation and also facilitates the efficient implementation
of symbolic manipulation algorithms due to the availability of a high-level symbolic
representation of the mathematical constraints.
REFERENCES

[1] A. Aggarwal and C. A. Floudas. Synthesis of general distillation sequences -- nonsharp separations. Comput. Chem. Engng., 14:631-653, 1990.

[2] I. P. Androulakis, C. D. Maranas, and C. A. Floudas. αBB: A global optimization method for general constrained nonconvex problems. Submitted to Journal of Global Optimization, 1995.

[3] P. I. Barton. The Modelling and Simulation of Combined Discrete/Continuous Processes. PhD thesis, University of London, 1992.

[4] P. I. Barton and C. C. Pantelides. Modeling of combined discrete/continuous processes. AIChE Journal, 40:966-979, 1994.

[5] M. S. Bazaraa, H. D. Sherali, and C. M. Shetty. Nonlinear Programming: Theory and Algorithms. Wiley & Sons, New York, 2nd edition, 1993.

[6] CPLEX Optimization Inc., Incline Village, NV. Using the CPLEX Callable Library and CPLEX Mixed Integer Library, 1993.

[7] T. G. W. Epperly. Global Optimization of Nonconvex Nonlinear Programs Using Parallel Branch and Bound. PhD thesis, University of Wisconsin, Madison, 1995.

[8] B. A. Finlayson. Nonlinear Analysis in Chemical Engineering. McGraw-Hill, New York, 1980.

[9] C. A. Floudas and V. Visweswaran. A global optimization algorithm (GOP) for certain classes of nonconvex NLPs -- I. Theory. Comput. Chem. Engng., 14:1397-1417, 1990.

[10] R. Horst and H. Tuy. Global Optimization: Deterministic Approaches. Springer-Verlag, Berlin, 2nd rev. edition, 1993.

[11] D. E. Knuth. The Art of Computer Programming -- 1. Fundamental Algorithms. Computer Science and Information Processing. Addison-Wesley, Reading, Mass., 2nd edition, 1973.

[12] A. C. Kokossis and C. A. Floudas. Optimization of complex reactor networks -- I. Isothermal operation. Chemical Engineering Science, 45:595-614, 1990.

[13] W. B. Liu and C. A. Floudas. A remark on the GOP algorithm for global optimization. Journal of Global Optimization, 3:519-521, 1993.

[14] C. D. Maranas and C. A. Floudas. Global minimum potential energy conformations of small molecules. Journal of Global Optimization, 4:135-170, 1994.

[15] G. P. McCormick. Computability of global solutions to factorable nonconvex programs: Part I -- Convex underestimating problems. Mathematical Programming, 10:146-175, 1976.

[16] M. Oh. Modelling and Simulation of Combined Lumped and Distributed Processes. PhD thesis, University of London, 1995.

[17] T. I. Oren and B. P. Zeigler. Concepts for advanced simulation methodologies. Simulation, 32:69-82, 1979.

[18] C. C. Pantelides. SpeedUp -- recent advances in process simulation. Comput. Chem. Engng., 12:745-755, 1988.

[19] C. C. Pantelides and H. I. Britt. Multipurpose process modelling environments. In L. T. Biegler and M. F. Doherty, editors, Proceedings of Conference on Foundations of Computer-Aided Design '94. CACHE Publications, 1994.

[20] I. Quesada and I. E. Grossmann. Global optimization algorithm for heat exchanger networks. Ind. Eng. Chem. Res., 32:487-499, 1993.

[21] I. Quesada and I. E. Grossmann. A global optimization algorithm for linear fractional and bilinear programs. Journal of Global Optimization, 6:39-76, 1995.

[22] H. S. Ryoo and N. V. Sahinidis. A branch-and-reduce approach to global optimization. Journal of Global Optimization, to appear, 1995.

[23] H. S. Ryoo and N. V. Sahinidis. Global optimization of nonconvex NLPs and MINLPs with applications in process design. Comput. Chem. Engng., 19(5):551-566, 1995.

[24] F. Schoen. Stochastic techniques for global optimization: A survey of recent advances. Journal of Global Optimization, 1:207-228, 1991.

[25] J. P. Shectman and N. V. Sahinidis. A finite algorithm for global minimization of separable concave programs. In C. A. Floudas and P. M. Pardalos, editors, Proceedings of Workshop on State of the Art in Global Optimization: Computational Methods and Applications, Princeton University, April 1995.

[26] H. D. Sherali and A. Alameddine. A new reformulation-linearization technique for bilinear programming problems. Journal of Global Optimization, 2:379-410, 1992.

[27] E. M. B. Smith and C. C. Pantelides. Design of reactor networks using rigorous models. In Proceedings of the IChemE Annual Research Event, Edinburgh, U.K., 1995.

[28] E. M. B. Smith and C. C. Pantelides. Design of reactor/separation networks using detailed models. Comput. Chem. Engng., 19:S83-S88, 1995.

[29] T. Umeda, A. Hirai, and A. Ichikawa. Synthesis of optimal processing system by an integrated approach. Chemical Engineering Science, 27:795-804, 1972.

[30] R. Vaidyanathan and M. M. El-Halwagi. Global optimization of nonconvex nonlinear programs via interval analysis. Comput. Chem. Engng., 18:889-897, 1994.

[31] H. Vaish and C. M. Shetty. A cutting plane algorithm for the bilinear programming problem. Naval Research Logistics Quarterly, 24:83-94, 1977.

[32] J. G. Van de Vusse. Plug-flow type reactor vs. tank reactor. Chemical Engineering Science, 19:994-999, 1964.

[33] B. P. Zeigler. The Theory of Modeling and Simulation. John Wiley, New York, 1976.

APPENDIX A
gPROMS INPUT FILE FOR EXAMPLE
# --------------------------------------------------------------------
#
# EXAMPLE 2
# ---------
# Isothermal Van de Vusse Reaction - 1 CSTR & 1 PFR - Max ConcB
#
# --------------------------------------------------------------------
#
# Model of an Isothermal CSTR
#
MODEL Iso_CSTR
  PARAMETER
    NoComp, NoReac     AS INTEGER
  VARIABLE
    Rate               AS ARRAY (NoReac) OF Positive
    Flow               AS Flowrate
    Conc_In, Conc_Out  AS ARRAY (NoComp) OF Concentration
    Volume             AS Positive
  EQUATION
    # rates
    Rate(1) = 10.0*Conc_Out(1)
    Rate(2) = 1.0*Conc_Out(2)
    Rate(3) = 1.0*Conc_Out(1)^2

    # mass balances
    Flow*(Conc_In(1) - Conc_Out(1)) =  Volume*(Rate(1) + Rate(3))
    Flow*(Conc_In(2) - Conc_Out(2)) =  Volume*(Rate(2) - Rate(1))
    Flow*(Conc_In(3) - Conc_Out(3)) = -Volume*Rate(2)
    Flow*(Conc_In(4) - Conc_Out(4)) = -Volume*Rate(3)
END # model
#
# Model of an Isothermal PFR
#
MODEL Iso_PFR
  PARAMETER
    NoComp, NoReac     AS INTEGER
  DISTRIBUTION_DOMAIN
    Axial              AS ( 0 : 1 )
  VARIABLE
    Conc               AS DISTRIBUTION (NoComp, Axial) OF Concentration
    Rate               AS DISTRIBUTION (NoReac, Axial) OF Positive
    Conc_In, Conc_Out  AS ARRAY (NoComp) OF Concentration
    Flow               AS Flowrate
    Volume             AS Positive
  BOUNDARY
    # inlet
    Conc_In = Conc(,0)
    # outlet
    Conc_Out = Conc(,1)
  EQUATION
    # rates
    FOR z := 0 TO 1 DO
      Rate(1,z) = 10.0*Conc(1,z)
      Rate(2,z) = 1.0*Conc(2,z)
      Rate(3,z) = 1.0*Conc(1,z)^2
    END # for
    # mass balances
    FOR z := 0|+ TO 1 DO
      # component A
      Flow*PARTIAL(Conc(1,z),Axial) = -Volume*(Rate(1,z) + Rate(3,z))
      # component B
      Flow*PARTIAL(Conc(2,z),Axial) = -Volume*(Rate(2,z) - Rate(1,z))
      # component C
      Flow*PARTIAL(Conc(3,z),Axial) =  Volume*Rate(2,z)
      # component D
      Flow*PARTIAL(Conc(4,z),Axial) =  Volume*Rate(3,z)
    END # for
END # model
#
# Model of CSTR & PFR in series
#
MODEL Iso_Series_C1P1
  PARAMETER
    NoComp, NoReac     AS INTEGER
  VARIABLE
    Flow_In, Flow_Out  AS Flowrate
    Conc_In, Conc_Out  AS ARRAY (NoComp) OF Concentration
  UNIT
    PFR  AS Iso_PFR
    CSTR AS Iso_CSTR
  EQUATION
    # define flow into CSTR
    Flow_In = CSTR.Flow
    Conc_In = CSTR.Conc_In ;
    # flow out of CSTR is flow into PFR
    CSTR.Flow = PFR.Flow
    CSTR.Conc_Out = PFR.Conc_In
    # flow out of PFR is product
    PFR.Flow = Flow_Out
    PFR.Conc_Out = Conc_Out
END # model

#
# Process to describe the optimisation experiment
#
PROCESS C1P1_IsoSeries
  UNIT
    P AS Iso_Series_C1P1
  SET
    WITHIN P DO
      NoReac := 3 ;
      NoComp := 4 ;
      WITHIN PFR DO
        # use 3rd order collocation over 5 elements
        Axial := [OCFEM,3,5]
      END # within
    END # within
  ASSIGN
    WITHIN P DO
      # fix feed flowrate and concentrations
      Flow_In := 100 ;
      Conc_In := [5.8, 0.0, 0.0, 0.0] ;
    END # within
  PRESET
    # specify upper bounds on variables
    WITHIN P DO
      WITHIN PFR DO
        Volume := 100.0
        Rate   := 350.0
      END # within
      WITHIN CSTR DO
        Volume := 100.0
        Rate   := 350.0
      END # within
    END # within
  MAXIMISE
    P.PFR.Conc_Out(2)
END # process