JOTA: VOL. 64, NO. 2, FEBRUARY 1990
Communicated by R. W. H. Sargent
Nomenclature
ci = penalty probability;
cp = precision parameter on constraints;
D = variation domain of the variable x;
f(·) = objective function;
g(·) = constraints;
i, j = indexes;
k = iteration number;
0022-3239/90/0200-0331$06.00/0 © 1990 Plenum Publishing Corporation
N = number of actions;
P = probability distribution vector;
p_i(k) = ith component of the vector P at iteration k;
r = number of reactors in the flowsheet;
u(k) = discrete value or action chosen by the algorithm at iteration k;
u_i = discrete value of the optimization variable in [u_min, u_max];
u_min = lowest value of the optimization variable;
u_max = largest value of the optimization variable;
z = random number;
x = variable of the criterion function;
xp = precision parameter on criterion function;
W(k) = performance evaluation unit output at iteration k;
β0, β1 = reinforcement scheme parameters;
Σp = sum of the probability distribution vector components.
1. Introduction
Among these near-optimal solutions, one solution can be more interesting than the
strictly optimal solution from a technical point of view, because it allows,
for example, the use of existing equipment or equipment of standard
dimensions.
Furthermore, the cost functions are most often implicit, i.e., they are
computed from design routines. Consequently, they can be nonconvex
and/or exhibit several local optima. In some cases, these local optima can
even be identical (Ref. 4), and thus the above-mentioned methods can reach
only one local optimum.
The present work is concerned with the application of a stochastic
automaton with variable structure to solve optimization problems in the
presence of constraints, without making any assumption such as linearity
of constraints, convexity of the cost function, etc. A learning automaton
operates in a random medium which models the optimization problem to
be solved. The learning system collects pertinent information during its
operation and processes it according to a certain rule, so as to optimize a
prespecified cost function under some constraints. A probability distribution
is associated to the set of feasible solutions. Each probability represents
the weight given by the learning automaton to a process structure. The
automaton collects and processes the costs of the process structures and
updates the probability distribution by the use of a reinforcement scheme
to achieve predefined goals (i.e., a decrease of the cost function). The
minimum search is made according to this probability distribution. The
goals to be reached are contained in a performance evaluation unit which
evaluates the quality of a specific structure by a technique of reward or
penalty. The reinforcement scheme uses this response (reward or penalty)
to increase or decrease the weight (i.e., the probability) associated to this
structure and generates a new probability distribution P(k+1) from the
previous one P(k).
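In outline, the loop just described (sample an action from P(k), evaluate its cost in the random environment, reward or penalize it, and generate P(k+1)) can be sketched as follows. This is only an illustration: the reward test standing in for the performance evaluation unit, the linear update, and the small exploration floor are all assumptions of this sketch, not the paper's exact scheme.

```python
import random

def learning_search(cost, actions, n_iter=2000, beta0=0.05, beta1=0.02,
                    floor=0.01, seed=0):
    """Sketch of a variable-structure learning-automaton search.

    `cost` plays the role of the random environment (design routines);
    the comparison against the best cost seen so far stands in for the
    performance evaluation unit.  Placeholder logic, not the authors'
    reinforcement scheme.
    """
    rng = random.Random(seed)
    N = len(actions)
    p = [1.0 / N] * N                          # P(0) = [1/N, ..., 1/N]
    best = float("inf")
    for _ in range(n_iter):
        i = rng.choices(range(N), weights=p)[0]   # sample action per P(k)
        f = cost(actions[i])
        if f <= best:                             # reward: reinforce u_i
            best = f
            p = [(1 - beta0) * q for q in p]
            p[i] += beta0
        else:                                     # penalty: damp u_i
            p[i] *= (1 - beta1)
        p = [max(q, floor) for q in p]            # exploration floor (sketch)
        s = sum(p)                                # Sigma_p round-off guard
        p = [q / s for q in p]
    return actions[max(range(N), key=p.__getitem__)], p
```

For instance, minimizing (x - 2)^2 over the discrete set {0, 1, 2, 3, 4} concentrates the distribution on the action 2.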
Such a learning technique has already been used for control purposes
and has been applied to the control of a pulsed liquid-liquid extraction
pilot plant (Ref. 5). The technique of reward-penalty was applied to select
the control actions.
2. Learning Algorithm
Fig. 1. [Block diagram: the stochastic automaton with varying structure selects an action u_i; the random environment (design routines) computes the cost f(u_i); the performance evaluation unit decides whether the selected action is a good choice and returns the response w(k) to the automaton.]
The index j corresponds to discrete variables which have not been selected
at iteration k.
The quantity Σp = p_1(k) + · · · + p_N(k) is introduced to avoid round-off
errors and to guarantee that p_i(k) ∈ [0, 1].
The adaptation parameters β0 and β1 are such that 0 < β0 ≤ 1 and 0 < β1 ≤ 1.
The two penalty levels correspond to taking two different values for β1; for
example, β1 = 1 for the first level (strict penalty) and β1 < 1 for the second
one (slight penalty).
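Equations (1)-(6) defining the reinforcement scheme are not reproduced in this excerpt; the sketch below therefore uses a standard linear reward-penalty form only to illustrate how β0, β1, and the normalization sum Σp enter the update of P(k) into P(k+1). The exact expressions should be taken from the paper itself.

```python
def update(p, i, reward, beta0=0.1, beta1=1.0):
    """Illustrative linear reward-penalty update P(k) -> P(k+1).

    i is the index of the selected action; `reward` is the response of
    the performance evaluation unit.  beta1 = 1 corresponds to the
    strict penalty level, beta1 < 1 to the slight one.  Assumed form,
    not the paper's Eqs. (1)-(6).
    """
    N = len(p)
    if reward:
        q = [(1 - beta0) * pj for pj in p]        # shrink the others
        q[i] = p[i] + beta0 * (1 - p[i])          # reinforce p_i
    else:
        q = [pj + beta1 * p[i] / (N - 1) for pj in p]
        q[i] = (1 - beta1) * p[i]                 # penalize p_i
    s = sum(q)                                    # Sigma_p round-off guard
    return [qj / s for qj in q]
```

With the strict penalty β1 = 1, the selected action's probability drops to zero and its mass is spread evenly over the other actions; with β1 < 1 only part of it is redistributed.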
The algorithm selects randomly an action u_i at each iteration according
to the following procedure, based on the probability distribution. The index
i is the least value of the index j verifying the following constraint:

    Σ_{j=1}^{i} p_j(k) ≥ z.    (7)
Fig. 2. Flowchart of the procedure performed by the learning algorithm at every iteration. [Flowchart residue: selection procedure → random environment → performance evaluation unit → first or second level of penalty, or reward → new iteration.]
3. Results
The first example deals with the unconstrained and constrained optimization
of a multimodal function. No prior information is taken into consideration.
All the probability vectors are initialized as follows: P(0) =
[1/N, 1/N, . . . , 1/N]. A specific machine subroutine (RANDU) is used
to generate a uniformly distributed random variable z.
Fig. 3. Minimization of the function cos(πx) + sin(πx) + cos(2πx) + sin(2πx) (24 discrete values in the interval [0, 6]). [Plot residue: function and final probabilities, with a band near the minimum.]
Fig. 4. Time updating of the probability distribution [minimization of the cost function cos(πx) + sin(πx) + cos(2πx) + sin(2πx)]. [Plot residue: distributions at initialization and at iterations 20, 60, 100, and 500, over the discrete value numbers.]
The results are illustrated in Fig. 5, where the function f(x) and the probability distribution
at the end of the search are plotted. As has been previously mentioned,
the probability distribution peaks yield six values of x leading to a minimum
value of f(x) with f(x) = -1, namely, x = 0.75, 1.75, 2.75, 3.75, 4.75, 5.75,
which correspond to u3, u7, u11, u15, u19, u23.
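These six values can be checked directly: each of the quoted x indeed gives f(x) = -1 for the cost function of this example.

```python
import math

def f(x):
    """Cost function of the first example."""
    return (math.cos(math.pi * x) + math.sin(math.pi * x)
            + math.cos(2 * math.pi * x) + math.sin(2 * math.pi * x))

minima = [0.75, 1.75, 2.75, 3.75, 4.75, 5.75]
values = [f(x) for x in minima]      # each equals -1 (to rounding)
```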
Fig. 5. [Plot residue: function f(x) and final probabilities over the 24 discrete values in the interval [0, 6], constrained case (constraint satisfied within ± 2%).]
Fig. 6. General flow sheet considered for process synthesis. [Diagram residue: reactant feed → set of reactors → separation sequence, with recycles.]
[Table residue; header lost. Component number, name, and feed fraction:]
1  Ethane     0.05
2  Propylene  0.15
3  Propane    0.35
4  Isobutane  0.20
5  n-Butane   0.15
6  Pentane    0.10
Fig. 7. Minimization of costs for distillation sequence: reflux = 1.3; pressures = 10, 9, 8, 7, 6 atm. [Plot residue: costs and final probabilities over structure numbers 0-45; band at minimum ± 10%.]
Figure 7 shows the annual costs of the different sequences and the probabilities associated
with these sequences at the end of the search. It can be noted that, in spite
of the multimodal character of the cost function, analysis of the probability
distribution yields five optimal sequences: Nos. 1, 2, 13, 15, 17.
In Fig. 8, the sequences No. 1 and No. 13 are represented as examples.
Figure 9, which gives the probability distribution at different iterations,
allows us to point out how this distribution is updated during the search.
The first hundred iterations yield a subset of 15 possibilities, which is
reduced, as the search goes on, to 6 and at the end, to only 5.
The number of near-optimal solutions depends, of course, on the
precision required for the determination of the minimum (i.e., the value of
the precision parameter xp).
Fig. 8. [Diagram residue: separation structures No. 1 and No. 13, splitting the component set 123456 stepwise into products 1-6.]
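The role of the precision parameter can be made concrete with a small helper that keeps every structure whose cost falls within a relative band of the minimum (the "minimum ± 10%" or "± 5%" bands drawn in the cost figures). The function name and the relative-band form are assumptions of this sketch, not the paper's definition.

```python
def near_optimal(costs, xp=0.10):
    """Indices of structures whose cost lies within a relative band xp
    of the minimum cost; xp plays the role of the precision parameter."""
    fmin = min(costs)
    return [i for i, c in enumerate(costs) if c <= fmin * (1 + xp)]
```

Widening xp enlarges the set of near-optimal structures, which is exactly the dependence noted in the text.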
Fig. 9. Time updating of the probability distribution (minimization of costs for distillation sequence). [Plot residue: distributions at initialization and at iteration 300, over structure numbers 0-45.]
Fig. 10. Minimization of costs for distillation sequence: reflux = 1.5; pressures = 5.5, 5.4, 5.3, 5.2, 5.1 atm. [Plot residue: costs and final probabilities.]
Fig. 11. Minimization of costs for distillation sequence: reflux = 1.5; pressures = 5.5, 5.4, 5.3, 5.2, 5.1 atm. [Plot residue: costs and final probabilities.]
Fig. 12. Minimization of cost function for chlorination of benzene: conversion rate = 0.8; temperature = 320 K. [Plot residue: costs (millions $/year) and final probabilities over structure numbers 2-10; band at minimum ± 5%.]

4. Conclusions
References