JOTA: VOL. 64, NO. 2, FEBRUARY 1990
Communicated by R. W. H. Sargent
Nomenclature
ci = penalty probability;
cp = precision parameter on constraints;
D = variation domain of the variable x;
f(·) = objective function;
g(·) = constraints;
i, j = indexes;
k = iteration number;
0022-3239/90/0200-0331$06.00/0 © 1990 Plenum Publishing Corporation
N = number of actions;
P = probability distribution vector;
p_i(k) = ith component of the vector P at iteration k;
r = number of reactors in the flowsheet;
u(k) = discrete value or action chosen by the algorithm at iteration k;
u_i = discrete value of the optimization variable in [u_min, u_max];
u_min = lowest value of the optimization variable;
u_max = largest value of the optimization variable;
z = random number;
x = variable of the criterion function;
xp = precision parameter on criterion function;
W(k) = performance evaluation unit output at iteration k;
β0, β1 = reinforcement scheme parameters;
Σp = sum of the probability distribution vector components.
1. Introduction
Among these near-optimal solutions, one solution can be more interesting than the
strictly optimal solution from a technical point of view, because it allows,
for example, the use of existing equipment or equipment of standard
dimensions.
Furthermore, the cost functions are most often implicit, i.e., they are
computed from design routines. Consequently, they can be nonconvex
and/or exhibit several local optima. In some cases, these local optima can
even be identical (Ref. 4), and thus the above-mentioned methods can reach
only one local optimum.
The present work is concerned with the application of a stochastic
automaton with variable structure to solve optimization problems in the
presence of constraints, without making any assumption such as linearity
of constraints, convexity of the cost function, etc. A learning automaton
operates in a random medium which models the optimization problem to
be solved. The learning system collects pertinent information during its
operation and processes it according to a certain rule, so as to optimize a
prespecified cost function under some constraints. A probability distribution
is associated to the set of feasible solutions. Each probability represents
the weight given by the learning automaton to a process structure. The
automaton collects and processes the costs of the process structures and
updates the probability distribution by the use of a reinforcement scheme
to achieve predefined goals (i.e., a decrease of the cost function). The
minimum search is made according to this probability distribution. The
goals to be reached are contained in a performance evaluation unit which
evaluates the quality of a specific structure by a technique of reward or
penalty. The reinforcement scheme uses this response (reward or penalty)
to increase or decrease the weight (i.e., the probability) associated to this
structure and generates a new probability distribution P(k+1) from the
previous one P(k).
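In outline, the loop just described (sample an action from P(k), evaluate its cost in the random environment, reward or penalize it, and generate P(k+1)) can be sketched as follows. This is only an illustration: the reward test standing in for the performance evaluation unit, the linear update, and the small exploration floor are all assumptions of this sketch, not the paper's exact scheme.

```python
import random

def learning_search(cost, actions, n_iter=2000, beta0=0.05, beta1=0.02,
                    floor=0.01, seed=0):
    """Sketch of a variable-structure learning-automaton search.

    `cost` plays the role of the random environment (design routines);
    the comparison against the best cost seen so far stands in for the
    performance evaluation unit.  Placeholder logic, not the authors'
    reinforcement scheme.
    """
    rng = random.Random(seed)
    N = len(actions)
    p = [1.0 / N] * N                          # P(0) = [1/N, ..., 1/N]
    best = float("inf")
    for _ in range(n_iter):
        i = rng.choices(range(N), weights=p)[0]   # sample action per P(k)
        f = cost(actions[i])
        if f <= best:                             # reward: reinforce u_i
            best = f
            p = [(1 - beta0) * q for q in p]
            p[i] += beta0
        else:                                     # penalty: damp u_i
            p[i] *= (1 - beta1)
        p = [max(q, floor) for q in p]            # exploration floor (sketch)
        s = sum(p)                                # Sigma_p round-off guard
        p = [q / s for q in p]
    return actions[max(range(N), key=p.__getitem__)], p
```

For instance, minimizing (x - 2)^2 over the discrete set {0, 1, 2, 3, 4} concentrates the distribution on the action 2.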
Such a learning technique has already been used for control purposes
and has been applied to the control of a pulsed liquid-liquid extraction
pilot plant (Ref. 5). The technique of reward-penalty was applied to select
the control actions.
2. Learning Algorithm
Fig. 1. [Block diagram: the stochastic automaton with varying structure selects an action u_i; the random environment (design routines) computes the cost f(u_i); the performance evaluation unit decides whether the selected action is a good choice and returns the response w(k) to the automaton.]
The index j corresponds to discrete variables which have not been selected
at iteration k.
The quantity Σp = p_1(k) + · · · + p_N(k) is introduced to avoid round-off
errors and to guarantee that p_i(k) ∈ [0, 1].
The adaptation parameters β0 and β1 are such that 0 < β0 ≤ 1 and 0 < β1 ≤ 1.
The two penalty levels correspond to taking two different values for β1; for
example, β1 = 1 for the first level (strict penalty) and β1 < 1 for the second
one (slight penalty).
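Equations (1)-(6) defining the reinforcement scheme are not reproduced in this excerpt; the sketch below therefore uses a standard linear reward-penalty form only to illustrate how β0, β1, and the normalization sum Σp enter the update of P(k) into P(k+1). The exact expressions should be taken from the paper itself.

```python
def update(p, i, reward, beta0=0.1, beta1=1.0):
    """Illustrative linear reward-penalty update P(k) -> P(k+1).

    i is the index of the selected action; `reward` is the response of
    the performance evaluation unit.  beta1 = 1 corresponds to the
    strict penalty level, beta1 < 1 to the slight one.  Assumed form,
    not the paper's Eqs. (1)-(6).
    """
    N = len(p)
    if reward:
        q = [(1 - beta0) * pj for pj in p]        # shrink the others
        q[i] = p[i] + beta0 * (1 - p[i])          # reinforce p_i
    else:
        q = [pj + beta1 * p[i] / (N - 1) for pj in p]
        q[i] = (1 - beta1) * p[i]                 # penalize p_i
    s = sum(q)                                    # Sigma_p round-off guard
    return [qj / s for qj in q]
```

With the strict penalty β1 = 1, the selected action's probability drops to zero and its mass is spread evenly over the other actions; with β1 < 1 only part of it is redistributed.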
The algorithm selects randomly an action u_i at each iteration according
to the following procedure, based on the probability distribution. The index
i is the least value of the index j verifying the following constraint:

    Σ_{j=1}^{i} p_j(k) ≥ z.    (7)
Fig. 2. Flowchart of the procedure performed by the learning algorithm at every iteration. [Flowchart residue: selection procedure → random environment → performance evaluation unit → first or second level of penalty, or reward → new iteration.]
3. Results
The first example deals with the unconstrained and constrained optimization
of a multimodal function. No prior information is taken into consideration.
All the probability vectors are initialized as follows: P(0) =
[1/N, 1/N, . . . , 1/N]. A specific machine subroutine (RANDU) is used
to generate a uniformly distributed random variable z.
Fig. 3. Minimization of the function cos(πx) + sin(πx) + cos(2πx) + sin(2πx) (24 discrete values in the interval [0, 6]). [Plot residue: function and final probabilities, with a band near the minimum.]
Fig. 4. Time updating of the probability distribution [minimization of the cost function cos(πx) + sin(πx) + cos(2πx) + sin(2πx)]. [Plot residue: distributions at initialization and at iterations 20, 60, 100, and 500, over the discrete value numbers.]
The results are illustrated in Fig. 5, where the function f(x) and the probability distribution
at the end of the search are plotted. As has been previously mentioned,
the probability distribution peaks yield six values of x leading to a minimum
value of f(x) with f(x) = -1, namely, x = 0.75, 1.75, 2.75, 3.75, 4.75, 5.75,
which correspond to u3, u7, u11, u15, u19, u23.
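These six values can be checked directly: each of the quoted x indeed gives f(x) = -1 for the cost function of this example.

```python
import math

def f(x):
    """Cost function of the first example."""
    return (math.cos(math.pi * x) + math.sin(math.pi * x)
            + math.cos(2 * math.pi * x) + math.sin(2 * math.pi * x))

minima = [0.75, 1.75, 2.75, 3.75, 4.75, 5.75]
values = [f(x) for x in minima]      # each equals -1 (to rounding)
```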
Fig. 5. [Plot residue: function f(x) and final probabilities over the 24 discrete values in the interval [0, 6], constrained case (constraint satisfied within ± 2%).]
Fig. 6. General flow sheet considered for process synthesis. [Diagram residue: reactant feed → set of reactors → separation sequence, with recycles.]
[Table residue; header lost. Component number, name, and feed fraction:]
1  Ethane     0.05
2  Propylene  0.15
3  Propane    0.35
4  Isobutane  0.20
5  n-Butane   0.15
6  Pentane    0.10
Fig. 7. Minimization of costs for distillation sequence: reflux = 1.3; pressures = 10, 9, 8, 7, 6 atm. [Plot residue: costs and final probabilities over structure numbers 0-45; band at minimum ± 10%.]
Figure 7 shows the annual costs of the different sequences and the probabilities associated
with these sequences at the end of the search. It can be noted that, in spite
of the multimodal character of the cost function, analysis of the probability
distribution yields five optimal sequences: Nos. 1, 2, 13, 15, 17.
In Fig. 8, the sequences No. 1 and No. 13 are represented as examples.
Figure 9, which gives the probability distribution at different iterations,
allows us to point out how this distribution is updated during the search.
The first hundred iterations yield a subset of 15 possibilities, which is
reduced, as the search goes on, to 6 and at the end, to only 5.
The number of near-optimal solutions depends, of course, on the
precision required for the determination of the minimum (i.e., the value of
the precision parameter xp).
Fig. 8. [Diagram residue: separation structures No. 1 and No. 13, splitting the component set 123456 stepwise into products 1-6.]
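The role of the precision parameter can be made concrete with a small helper that keeps every structure whose cost falls within a relative band of the minimum (the "minimum ± 10%" or "± 5%" bands drawn in the cost figures). The function name and the relative-band form are assumptions of this sketch, not the paper's definition.

```python
def near_optimal(costs, xp=0.10):
    """Indices of structures whose cost lies within a relative band xp
    of the minimum cost; xp plays the role of the precision parameter."""
    fmin = min(costs)
    return [i for i, c in enumerate(costs) if c <= fmin * (1 + xp)]
```

Widening xp enlarges the set of near-optimal structures, which is exactly the dependence noted in the text.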
Fig. 9. Time updating of the probability distribution (minimization of costs for distillation sequence). [Plot residue: distributions at initialization and at iteration 300, over structure numbers 0-45.]
Fig. 10. Minimization of costs for distillation sequence: reflux = 1.5; pressures = 5.5, 5.4, 5.3, 5.2, 5.1 atm. [Plot residue: costs and final probabilities.]
Fig. 11. Minimization of costs for distillation sequence: reflux = 1.5; pressures = 5.5, 5.4, 5.3, 5.2, 5.1 atm. [Plot residue: costs and final probabilities.]
Fig. 12. Minimization of cost function for chlorination of benzene: conversion rate = 0.8; temperature = 320 K. [Plot residue: costs (millions $/year) and final probabilities over structure numbers 2-10; band at minimum ± 5%.]

4. Conclusions
References