
Journal of Petroleum Science and Engineering 76 (2011) 217–223


Committee neural networks with fuzzy genetic algorithm


S.A. Jafari a,⁎, S. Mashohor a, M. Jalali Varnamkhasti b

a Computer and Communication Systems, Faculty of Engineering, University Putra Malaysia, 43400, Serdang, Selangor, Malaysia
b Laboratory of Applied and Computational Statistic, Institute for Mathematical Research, UPM, 43400, Serdang, Selangor, Malaysia

Article info

Article history:
Received 17 October 2009
Accepted 10 January 2011
Available online 28 January 2011
Keywords:
back propagation neural network
committee neural network
fuzzy genetic algorithm
reservoir properties

Abstract
Combining numerous appropriate experts can improve the generalization performance of the group when compared to a single network alone. There are different ways of combining the intelligent systems' outputs in the combiner in the committee neural network, such as simple averaging, gating network, stacking, support vector machine, and genetic algorithm. Premature convergence is a classical problem in finding the optimal solution in genetic algorithms. In this paper, we propose a new technique for choosing the female chromosome during sexual selection to avoid premature convergence in a genetic algorithm. A bi-linear allocation lifetime approach is used to label the chromosomes based on their fitness value, which will then be used to characterize the diversity of the population. The label of the selected male chromosome and the population diversity of the previous generation are then applied within a set of fuzzy rules to select a suitable female chromosome for recombination. Finally, we use fuzzy genetic algorithm methods for combining the output of experts to predict a reservoir parameter in the petroleum industry. The results show that the proposed method (fuzzy genetic algorithm) gives the smallest error and highest correlation coefficient compared to the five members and the genetic algorithm, and produces significant information on the reliability of the permeability predictions.
© 2011 Elsevier B.V. All rights reserved.

1. Introduction
There are several reasons for distributing a learning task among a number of individual networks. The main reason is to improve the generalization ability, because the generalization of individual networks is not unique. The combination of several Artificial Neural Networks (ANNs) performing the same task is called an ensemble of neural networks or a committee of neural networks. When the networks are different, it is called a committee of machines. In ensemble methods, the ensemble candidates are different. There are a number of ways to create different individuals: different training data, initial conditions, network topologies, and training algorithms. After selecting individuals and training them, their generated results are combined by some method. The committee machine structure can be viewed in Fig. 1. In the committee machine, the expectation is that different experts converge to different local minima on the error surface, and the overall output improves the performance (Wolpert, 1992; Efron and Tibshirani, 1993; Rezaee, 2001). The mean square error (MSE) between the individual output and the expected output (target) can be expressed in terms of the bias squared plus the variance (Haykin, 1999). The MSE equation makes it clear that we can reduce either the bias or the variance to reduce the neural network error. Unfortunately, it is found that for the concerned
Corresponding author. Tel.: +60 124422445.
E-mail addresses: sajkenari@yahoo.com (S.A. Jafari), syamsiah@eng.upm.edu.my
(S. Mashohor), jalali@inspem.upm.edu.my (M.J. Varnamkhasti).
doi:10.1016/j.petrol.2011.01.006

individual ANN, the bias is reduced at the cost of a large variance. However, the variance can be reduced by an ensemble of ANNs. From the MSE equation, Naftaly et al. (1997) obtained two conclusions:
(1) The bias of the ensemble-averaged function is exactly the same as that of the function realized by a single NN.
(2) The variance of the ensemble-averaged function is less than that of the function realized by a single NN.
These theoretical findings indicate that ensembles of ANNs can reduce the variance at little cost in bias. Therefore, an effective approach is to select or create a set of nets or experts that show high variance but low bias, because by combining we can reduce the variance. Several methods have been employed for creating committee members. Generally, these methods can be divided into three categories:
(1) Methods to select diverse training data sets from the original source data
(2) Methods to create different experts or individual neural networks
(3) Methods to combine these individuals and their results
2. Some methods for constructing committee members
In this section, we introduce some methods for committee member construction, as mentioned in Section 1. Several approaches have been used for selecting the training data set by

Fig. 1. Committee neural network with k members (the input X(n) feeds k networks NN 1, ..., NN k, whose outputs y1(n), ..., yk(n) are combined into the output Y(n)).

varying the source data sets. Bagging, noise injection, cross-validation, stacking and boosting are the most common techniques. Other methods that have been used by researchers to construct committee members include Fuzzy Logic (FL) with different fuzzy inference systems, Genetic Algorithm (GA), Neuro-Fuzzy, empirical formulas, etc. Kadkhodaie-Ilkhchi et al. (2009a) have used a neural network, fuzzy logic and a fuzzy neural network as committee members. Chen and Lin (2006) have used three empirical formulas as committee members. Kadkhodaie-Ilkhchi et al. (2009b) and Rezaee et al. (2009) have used back-propagation neural networks with different training algorithms for constructing a committee neural network. In this paper, we have used five significant training algorithms in artificial neural networks as committee members: Levenberg-Marquardt (LM), Bayesian Regularization (BR), One Step Secant (OSS), Resilient Back Propagation (RP), and Scaled Conjugate Gradient (SCG). The following is a brief description of some important methods to create committee members.
2.1. Bagging (Breiman, 1996)
One important method of manipulating the data set to create M training sets is bootstrap aggregation, or bagging. The basic idea of bagging is to generate a collection of experts such that every expert uses a bootstrap training data set. Given a data set X = (x1, x2, ..., xn), bootstrap sampling means creating N new data sets X1, X2, X3, ..., XN such that every Xi is generated by randomly picking n data points xi of X with replacement. Clearly, in creating Xi, some xi of X may be repeated and some xi may be left out. In bagging, we repeat this procedure to create M different training sets for M experts. The bagging method is designed to reduce the error variance, and it is very efficient in constructing a set of training data when the source data size is small.
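As a concrete illustration, the bootstrap sampling described above can be sketched in a few lines of Python; the function name and toy data are ours, not from the original work:

```python
import random

def bagging_datasets(X, M, seed=0):
    """Create M bootstrap training sets, each the same size as the source
    data X, by sampling from X uniformly with replacement."""
    rng = random.Random(seed)
    n = len(X)
    return [[X[rng.randrange(n)] for _ in range(n)] for _ in range(M)]

# Toy source data: each bootstrap set repeats some points and omits others.
source = [("x%d" % i, float(i)) for i in range(10)]
sets = bagging_datasets(source, M=5)
assert len(sets) == 5 and all(len(s) == len(source) for s in sets)
```

Each of the M sets would then train one expert, as in the bagging scheme above.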
2.2. Noise injection (Raviv and Intrator, 1996)
As mentioned before, the simple bootstrap generates several training sets from the source data, all with the same size. Efron and Tibshirani (1993) noted that the bootstrap can also be viewed as a method for simulating the noise inherent in the data and thus effectively increasing the number of training patterns. Raviv and Intrator (1996) presented another bootstrap algorithm, Bootstrap Ensemble with Noise (BEN). In this method, a variable amount of noise is added to the input data before bootstrap sampling is used to assemble the training sets. This method can effectively reduce the variance, since the injection of noise increases the independence between the different training sets derived from the source data. Bhatt and Helle (2002) have used noise injection for constructing committee members.
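A minimal sketch of the BEN idea, assuming scalar (input, target) pairs and Gaussian input noise; the noise level and names are illustrative, not taken from the cited paper:

```python
import random

def ben_datasets(X, M, noise_std=0.1, seed=0):
    """Bootstrap Ensemble with Noise (sketch): add Gaussian noise to the
    inputs, then bootstrap-sample to build M training sets."""
    rng = random.Random(seed)
    n = len(X)
    sets = []
    for _ in range(M):
        # Fresh noise for every replicate increases independence between sets.
        noisy = [(x + rng.gauss(0.0, noise_std), y) for x, y in X]
        sets.append([noisy[rng.randrange(n)] for _ in range(n)])
    return sets

data = [(float(i), 2.0 * i) for i in range(8)]  # toy (input, target) pairs
sets = ben_datasets(data, M=3)
assert len(sets) == 3 and all(len(s) == 8 for s in sets)
```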
2.3. Cross-validation (Krogh and Vedelsby, 1995)
In this method, the available data set is partitioned into M disjoint and equal subsets. We then select one of these subsets as the test set and the (M−1) remaining subsets as the training data. After M repetitions, we have M overlapping training sets and M independent test sets. Since the training sets are different, the errors generated after training are expected to fall into different local error minima and therefore lead to different results. The performance of each expert is measured on the corresponding test data set. Breiman, Friedman, Olshen and Stone used cross-validation to prune classification-tree algorithms.
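The partition-and-rotate procedure can be sketched as follows (a simple deterministic split for clarity; a real application would shuffle the data first):

```python
def cross_validation_splits(data, M):
    """Partition data into M disjoint, near-equal subsets; fold i uses
    subset i as the test set and the other M-1 subsets as training data."""
    folds = [data[i::M] for i in range(M)]
    splits = []
    for i in range(M):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        splits.append((train, folds[i]))
    return splits

splits = cross_validation_splits(list(range(12)), M=4)
assert len(splits) == 4
assert all(len(train) + len(test) == 12 for train, test in splits)
```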
2.4. Stacking (Wolpert, 1992)
The first part of the stacking method is similar to the cross-validation method. As mentioned above, there are M training sets and M test sets. We then use the M training sets to train two generalizers, G1 and G2, and the M test sets are put into G1 and G2 (these outputs will be used as inputs to a second-space generalizer). The outputs of G1 and G2 and the target value, (g1i, g2i, yi), will be used as the training set of a generalizer G, the second-space generalizer.
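A toy sketch of how the second-space training set is assembled; g1 and g2 here are stand-ins for trained level-0 generalizers, not real models:

```python
# Stand-in level-0 generalizers; in practice G1 and G2 are trained models.
def g1(x):
    return 2.0 * x

def g2(x):
    return x + 1.0

def stacked_training_set(test_inputs, targets):
    """Build the second-space training set (g1(x), g2(x), y) from the
    held-out test partitions, as in stacked generalization."""
    return [(g1(x), g2(x), y) for x, y in zip(test_inputs, targets)]

level1 = stacked_training_set([1.0, 2.0], [3.0, 5.0])
assert level1 == [(2.0, 2.0, 3.0), (4.0, 3.0, 5.0)]
```

The generalizer G is then trained on these triples rather than on the raw inputs.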
2.5. Boosting by filtering (Schapire, 1990), AdaBoost (Freund and Schapire, 1995)
In this method, there are three experts. The first expert is trained with M training data points from the source training data set, and the result of the first expert is applied to the second expert, which is then trained with this data set. After training the second expert, the training data of the source data set are passed through the first and second experts. Finally, the third expert is trained only on the data on which the outputs of the first and second experts disagree: if there is a disagreement between the first and second experts on a certain data point, this point is passed to the third expert. The final result is related to the outputs of the three experts. Freund and Schapire (1995) and Drucker et al. (1994) have shown that the boosting algorithm is very effective in many experiments. Another method of boosting is adaptive boosting (AdaBoost). In this method, the training data are selected according to a probability: for each data point, if the predicted value is close to the target value, the probability of choosing that point is low; otherwise, the probability is high. This method gives hard data points more chances to be used for retraining. For a classification problem we can use majority voting, and for a regression problem the result with the lowest error rate is selected. AdaBoost is sensitive to noisy data and outliers, but it is less sensitive to overfitting than most learning algorithms.
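The probability-based selection described above can be sketched as a reweighting step; this is a simplified stand-in for the full AdaBoost update (which uses an exponential rule), meant only to show the "hard points get sampled more" idea:

```python
def sampling_probabilities(errors):
    """Sketch of the adaptive selection rule: points whose prediction is far
    from the target receive a higher probability of being drawn again."""
    weights = [e + 1e-12 for e in errors]  # guard against an all-zero case
    total = sum(weights)
    return [w / total for w in weights]

probs = sampling_probabilities([0.1, 0.4, 0.5])
assert abs(sum(probs) - 1.0) < 1e-9
assert probs[2] > probs[0]  # the hardest example is resampled most often
```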
3. Combination methods
The last stage of designing a Committee Machine (CM) is the combination of the expert outputs. Many investigations have been carried out to find combining methods that merge the expert outputs and produce the final outputs. In this section, we introduce some traditional combining methods used in CMs. Some of them are suitable for classifiers and some of them perform well in regression.
3.1. Simple averaging (Lincoln and Skrzypek, 1990)
One of the most frequently used combination methods is simple averaging. In this method, after training the committee members, the final output is obtained by averaging the outputs of the committee members. It is easy to show by Cauchy's inequality that the Mean Square Error (MSE) of a committee machine with the simple averaging method is less than or equal to the average of the MSEs of the individual experts. This method is most useful when the variances of the ensemble members are different, because simple averaging can reduce the variance of the nets. The disadvantage of simple averaging is the equal weight given to every committee member, i.e. there is no difference between the weights of two committee members with low and high generalization ability.
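A small numeric check of the averaging property on toy expert outputs (invented values, not the paper's data):

```python
def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

# Hypothetical outputs of three trained members on the same three targets.
experts = [[1.0, 2.2, 2.9], [1.2, 1.8, 3.1], [0.8, 2.1, 3.3]]
target = [1.0, 2.0, 3.0]

committee = [sum(col) / len(experts) for col in zip(*experts)]
avg_member_mse = sum(mse(e, target) for e in experts) / len(experts)

# By Cauchy's inequality the committee MSE never exceeds the average MSE.
assert mse(committee, target) <= avg_member_mse + 1e-12
```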


3.2. Weighted averaging (Jacobs, 1995)
In this method, every committee member has a suitable weight related to its generalization ability. Jacobs (1995) introduced a gating method to determine the weight of every expert. Opitz and Shavlik (1996) used a Genetic Algorithm (GA) to determine the weight of each member. To obtain the optimal weights for combining with the GA, the fitness function is defined as:

MSE_GA = (1/n) Σ_{i=1}^{n} (w_1 y_{1i} + w_2 y_{2i} + ... + w_k y_{ki} − T_i)²,   with Σ_{i=1}^{k} w_i = 1      (1)

where y_{1i} is the output of the first network on the i-th input (i-th training pattern), w_i is the weight of the i-th member, T_i is the target value of the i-th input, and n is the number of training data.
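The fitness function above translates directly into a routine that a GA could minimize; this is a sketch with toy data, not the MATLAB toolbox code used later in the paper:

```python
def mse_ga(weights, expert_outputs, targets):
    """Fitness of a candidate weight vector: mean squared error of the
    weighted sum of the expert outputs (weights assumed to sum to 1)."""
    n = len(targets)
    total = 0.0
    for i in range(n):
        combined = sum(w * y[i] for w, y in zip(weights, expert_outputs))
        total += (combined - targets[i]) ** 2
    return total / n

outputs = [[1.0, 2.0], [3.0, 4.0]]  # two experts, two training patterns
assert mse_ga([0.5, 0.5], outputs, [2.0, 3.0]) == 0.0
```

A GA would evaluate this fitness for each candidate weight chromosome and evolve toward the weight vector with the smallest MSE.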
3.3. Majority voting (Hansen and Salamon, 1990)
This combination method is most popular for classification problems. If more than half of the individuals vote for a prediction, majority voting selects this prediction as the final output. Majority voting ignores the fact that networks in the minority can sometimes produce the correct result. At the combination stage, it ignores the existence of the diversity that is the motivation for ensembles.
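The voting rule itself is a one-liner; this sketch uses hypothetical class labels:

```python
from collections import Counter

def majority_vote(predictions):
    """Final class = the label predicted by the largest number of members."""
    return Counter(predictions).most_common(1)[0][0]

assert majority_vote(["oil", "water", "oil", "oil", "water"]) == "oil"
```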
3.4. Ranking (Ho et al., 1994; Al-Ghoneim and Kumar, 1996)
This method uses experimental results obtained by a set of experts on a set of datasets to generate a ranking of those experts (each expert has a rank related to an input dataset). The ranks of each expert are then aggregated by methods such as average rank, success rate ratio, and significant wins to generate a final ranking of the experts. The final rank can be used to select one or more suitable experts for test (unseen) data (Brazdil and Soares, 2000).
There are no unique criteria for the selection of the mentioned combination methods. The choice mainly depends on the characteristics of the particular application at hand, e.g. the nature of the application (classification or regression), the size and quality of the training data, and the errors generated over the region of the input space. The same combination method used on an ensemble for a regression problem may generate good results but may not work on a classification problem, and vice versa. Much work has been done on combining methods in ensemble approaches. Major contributions are weighted majority voting (Kuncheva, 2004), decision templates (Kuncheva et al., 2001), naive Bayesian fusion (Xu et al., 1992), Dempster–Shafer combination (Ahmadzadeh and Petrou, 2003) and the fuzzy integral (Cho and Kim, 1995).
4. Fuzzy genetic algorithm (FGA) for combining
The Genetic Algorithm (GA) is a search optimization technique that mimics some of the processes of natural selection and evolution. In optimization, when a GA fails to find the global optimum, the problem is often attributed to premature convergence, which means that the sampling process converges on a local optimum rather than the global optimum. Sexual selection by means of female preferences has promoted the evolution of complex male ornaments in many animal groups. A sex-determination system is a biological system that determines the development of sexual characteristics in an organism. Most sexual organisms have two sexes. In many cases, sex determination is genetic: males and females have different alleles or even different genes that specify their sexual morphology. In a classical GA, chromosomes reproduce asexually: any two chromosomes may be parents in crossover. Gender division and sexual selection inspired a model of gendered GA in which crossover takes place only between chromosomes of opposite sexes. In this study, a relation between age and fitness, as in biological systems, affecting the selection procedure is proposed. A bi-linear allocation lifetime approach is used to label the chromosomes based on their fitness value, which will then be used to characterize the diversity of the population. Inspired by the non-genetic sex-determination system that exists in some species of reptiles, including alligators and some turtles, where sex is determined by the temperature at which the egg is incubated, we divided the population into two groups, male and female, so that male and female can be selected in an alternating way. In each generation, the layout of the selection of male and female is different. During the sexual selection, the male chromosome is selected randomly. The label of the selected male chromosome and the population diversity of the previous generation are then applied within a set of fuzzy rules to select a suitable female chromosome (Jalali and Lee, 2009). Fuzzy systems are encountered in numerous areas of application. Fuzzy rules, for example, viewed as a generic mechanism of granular knowledge representation, are positioned at the center of knowledge-based systems. A fuzzy IF-THEN rule consists of an IF part (antecedent) and a THEN part (consequent). The antecedent is a combination of terms, whereas the consequent is exactly one term. In the antecedent, the terms can be combined by using fuzzy conjunction, disjunction and negation. A term is an expression of the form X = T, where X is a linguistic variable and T is one of its linguistic terms. In this paper, we use a linguistic variable age for chromosomes. Fig. 2 describes the linguistic variable age, where Infant, Teenager, Adult and Elderly are the linguistic values.
The system applied in our study uses triangular membership functions, the (minimum) intersection operator and the correlation-product inference procedure. Defuzzification of the outputs was performed using the fuzzy centroid method described by Kosko (1992). To find the membership function, we use the fitness value of each chromosome and the minimum, maximum and average fitness values of the population in each generation. Each chromosome has its own label determined by the age function. Let

μ = (f_i − f_min)/(f_avr − f_min),   ρ = (f_i − f_avr)/(f_max − f_avr),   σ = f_avr − f_i      (2)

where f_i = fitness value of chromosome i, f_avr = average fitness value, f_min = minimum fitness value, and f_max = maximum fitness value of the population. Then the age function is:

age(c_i) = L + Δμ/n   if σ ≥ 0;   age(c_i) = Λ + Δρ/n   if σ < 0      (3)

or

age(c_i) = U − Δμ/n   if σ ≥ 0;   age(c_i) = Λ − Δρ/n   if σ < 0      (4)

where c_i = chromosome i, L = minimum age, U = maximum age, n = population size, Δ = (U − L)/2, Λ = (U + L)/2, and μ, ρ and σ are defined in Eq. (2).
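For illustration, the age assignment for a maximization problem can be written as a short function. Because the printed equations are partly garbled, the 1/n scaling below is our reading of the layout and should be checked against Jalali and Lee (2009):

```python
def age_label(f_i, f_min, f_avr, f_max, L=2, U=10, n=20):
    """Bi-linear age assignment (sketch of Eqs. (2)-(3), maximization case).
    The 1/n scaling is an assumption from the printed layout."""
    delta = (U - L) / 2.0
    lam = (U + L) / 2.0
    sigma = f_avr - f_i
    if sigma >= 0:  # at or below average fitness
        mu = (f_i - f_min) / (f_avr - f_min)
        return L + delta * mu / n
    rho = (f_i - f_avr) / (f_max - f_avr)  # above average fitness
    return lam + delta * rho / n

assert age_label(2.0, 2.0, 5.0, 9.0) == 2.0  # worst chromosome gets age L
assert abs(age_label(9.0, 2.0, 5.0, 9.0) - 6.2) < 1e-9
```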

Fig. 2. The linguistic variable age, with syntactic rules mapping it to the linguistic terms Infant, Teenager, Adult and Elderly.


Fig. 3. The age linguistic variable for male and female, with triangular membership functions Infant, Teenager, Adult and Elderly centred near 0.25, 0.45, 0.65 and 0.85 on the age(ci) axis.

Eq. (3) is suited for maximization problems, which relate to higher fitness values, while Eq. (4) is suited for minimization problems, which relate to lower fitness values. This idea is inspired by the lifetime concept proposed in Arabas et al. (1994). The fuzzification interface defines the possibilities of the four linguistic values for each chromosome: {Infant, Teenager, Adult, and Elderly}. These values determine the degree of truth for each rule premise. This computation takes into account all chromosomes in each generation and relies on the triangular membership functions shown in Fig. 3, with L = 2 and U = 10. A bi-linear allocation lifetime approach proposed in Kosko (1992) is used to label the chromosomes based on their fitness value, which will then be used to characterize the diversity of the population:

D(c_i) = L + Δμ   if σ ≥ 0;   D(c_i) = Λ + Δρ   if σ < 0      (5)
Let λ = the label of half of the population; then the population can be divided into four diversity levels, Very Low, Low, Medium and High, as follows:

Population Diversity = High if λ ≤ L + t;   Medium if L + t < λ ≤ L + t + 1;   Low if L + t + 1 < λ ≤ L + t + 2;   Very Low if λ > L + t + 2      (6)

where t = (L + U)/n is a parameter that has a correlation with the domain of labels in the population, and ε = [n/10] (where [x] means the nearest integer to x; for example, [2.3] = 2 and [2.8] = 3). This
computation is performed in every generation and relies on the triangular membership functions shown in Fig. 4. The inputs are combined logically using the AND operator to produce output response values for all expected inputs. A firing strength for each output membership function is computed. All that remains is to combine these logical sums in a defuzzification process to produce the crisp output. The fuzzy outputs for all rules are finally aggregated into one fuzzy set. To obtain a crisp decision from this fuzzy output, we have to defuzzify the fuzzy set. Defuzzification of the outputs was performed using the fuzzy centroid method of the firing behavior (Kosko, 1992), which may show that some of the rules are unnecessary. The number of fuzzy rules in the rule base is 16. Table 1 lists the fuzzy rules for selecting the female chromosome. Although we can obtain Fage, we may not be able to find a female chromosome that has exactly this Fage. We will select the female chromosome having the nearest fitness value to Fage to be the parent. In case there is more than one female chromosome satisfying the Fage condition, we will choose the female chromosome with the highest fitness value to

Fig. 4. The population diversity linguistic variable, with triangular membership functions High, Medium, Low and Very Low centred near 2.5, 4.5, 6.5 and 8.5 on the D(ci) axis.

Table 1
Fuzzy rules for selecting the female chromosome.

Male age (Mage)   Diversity   Female age (Fage)
Infant            High        Elderly or adult
Infant            Medium      Adult
Infant            Low         Adult or teenager
Infant            Very low    Teenager or infant
Teenager          High        Elderly or adult
Teenager          Medium      Elderly
Teenager          Low         Adult or teenager
Teenager          Very low    Teenager or infant
Adult             High        Elderly or adult
Adult             Medium      Adult or teenager
Adult             Low         Teenager or infant
Adult             Very low    Infant
Elderly           High        Adult or teenager
Elderly           Medium      Teenager or infant
Elderly           Low         Infant
Elderly           Very low    Infant

be the parent. This technique is called the Complement Method (Jalali and Lee, 2009).
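The Complement Method can be sketched as a nearest-match search with a fitness tie-break; the candidate names and fitness values below are invented for illustration:

```python
def complement_method(females, fage, fitness_of):
    """Complement Method (sketch): choose the female chromosome nearest to
    the fuzzy output Fage; break ties by the highest fitness value."""
    best = min(abs(fitness_of(f) - fage) for f in females)
    tied = [f for f in females if abs(fitness_of(f) - fage) == best]
    return max(tied, key=fitness_of)

fitness = {"c1": 4.0, "c2": 6.0, "c3": 8.0}
# c2 and c3 are equally close to Fage = 7.0; c3 wins on fitness.
assert complement_method(["c1", "c2", "c3"], 7.0, fitness.__getitem__) == "c3"
```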
5. Case study
In this section, we use a data set from oil wells in Iran. First, several crossplots were generated between well log data and core permeability to find which logs have a good relationship with permeability. With this method, we found a logical relationship between five inputs, Sonic transit time (DT), Neutron log (NPHI), Density log (RHOB), Gamma Ray (GR), and True Formation Resistivity (Rt), and rock permeability (K) as the target. The data points were divided randomly into three parts: sixty percent for training, twenty percent for validation and twenty percent for testing. Five training algorithms of the back propagation neural network were selected as committee members: Levenberg-Marquardt (LM), Bayesian Regularization (BR), One Step Secant (OSS), Resilient Back Propagation (RP), and Scaled Conjugate Gradient (SCG). As mentioned above, we used five wireline logs as input data and core permeability as output data for the analysis of our combining methods. A brief description of this data set is provided here.
Sonic log (DT): The sonic tool measures the time required for the transmission of an acoustic wave through a unit of formation thickness. Sonic transit time (DT) is used both in porosity determination and to compute secondary porosity in carbonate reservoirs (Service, 1999).
Neutron log (NPHI): A radioactivity well log used to determine formation porosity. The logging tool bombards the formation with neutrons. When the neutrons strike the hydrogen atoms in water or oil, gamma rays are released. Since water or oil exists only in pore spaces, a measurement of the gamma rays indicates formation porosity. See radioactivity well logging (Service, 1999).
Density log (RHOB): A special radioactivity log for open-hole surveying that responds to variations in the specific gravity of formations. It is a contact log (i.e., the logging tool is held against the wall of the hole). It emits neutrons and then measures the secondary gamma radiation that is scattered back to the detector in the instrument. The density log is an excellent porosity-measuring device, especially for shaley sands (Service, 1999).
Gamma Ray (GR): A type of radioactivity well log that records the natural radioactivity around the wellbore. Shales generally produce higher levels of gamma radiation and can be detected and studied with the gamma ray tool. See radioactivity well logging (Service, 1999).
True formation resistivity (Rt): With reference to log analysis, the resistivity of the undisturbed formation. It is derived from a resistivity log that has been corrected as far as possible for all environmental effects, such as borehole, invasion and surrounding-bed effects. Hence, it is taken as the true resistivity of the undisturbed formation in situ and is called Rt. With reference to core analysis, the resistivity

Fig. 5. The technique for a two-point cut in the offspring (male and female parts combined with random numbers).


Fig. 6. Variation of the female chromosome's age as the male chromosome's age and the population diversity change.

of a sample that is only partially filled with water. Called Rt, it is used in contrast to the resistivity of a sample completely filled with water, Ro. The water may be replaced by any nonconductive fluid, usually air or dead oil (Schlumberger Oilfield Glossary).
Permeability (K): The ability of a rock or sediment to transmit fluids is defined as permeability, and it is one of the most important rock parameters for the evaluation of hydrocarbon reservoirs. Permeability can be determined by three methods: well test, core analysis and well logging. Core analysis may not be available for all boreholes and can be obtained only in some cored intervals. Well test interpretation provides a reliable in situ measure of average permeability. However, both well test and core data are expensive to obtain and time-consuming to implement.
5.1. Combiner construction using fuzzy genetic algorithm
Now we are ready to construct a CM for the overall prediction of permeability by combining the results of all training algorithms. In the first step, we use the MATLAB genetic algorithm toolbox to find the weight of each committee member. In this method, every expert has a weight value and the permeability from the CM is obtained by the following equation:
K_GA = w_1 y_1 + w_2 y_2 + w_3 y_3 + w_4 y_4 + w_5 y_5      (7)

With this equation, the R² value is 0.8438 and the MSE is 0.001. In the next step, we use the FGA to obtain optimal weights for combining the results of the experts. The fitness function for the FGA is defined as below:

MSE_FGA = (1/n) Σ_{i=1}^{n} (w_1 y_{1i} + w_2 y_{2i} + w_3 y_{3i} + w_4 y_{4i} + w_5 y_{5i} − T_i)²      (8)

In this function, y_{ji} is the output of the j-th expert, where j = 1, 2, ..., 5, T_i is the target value of the i-th input, and n is the number of training data. The parameters of the applied FGA are described as follows. We divided the population into two groups, male and female, so that male and female can be selected in an alternating way. In each generation, the layout of this selection is different. During the sexual selection, the male chromosome is selected randomly, and the label of the selected chromosome and the population diversity of the previous generation are then applied within the set of fuzzy rules (as in Table 1) to select a suitable female chromosome. For crossover we consider the K-Point and Random number (KPR) method. In this method, when a male and a female are selected, a positive integer k is selected randomly, and these parents are divided into (k + 1) parts. Then k of these parts are used for the offspring, as in the k-point cut method, and the remaining part of the offspring is completed with random numbers of 0 or 1. Fig. 5 shows this technique for a two-point cut in the offspring. The places of these parts can be changed randomly (Jalali and Lee, 2009).

Fig. 7. Variation of the female's age when the male's age is changing.

Fig. 8. Variation of the female's age when the population diversity is changing.
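The KPR crossover described above can be sketched as follows; the part placement here is fixed rather than randomly shuffled, which is a simplification of the full method:

```python
import random

def kpr_crossover(male, female, k, rng=None):
    """K-Point and Random number (KPR) crossover sketch: cut both parents at
    the same k points, alternate the parts as in k-point crossover, then
    overwrite one randomly chosen part with fresh random bits."""
    rng = rng or random.Random(0)
    n = len(male)
    bounds = [0] + sorted(rng.sample(range(1, n), k)) + [n]
    parts = []
    for j in range(k + 1):
        parent = male if j % 2 == 0 else female  # alternate parents
        parts.append(list(parent[bounds[j]:bounds[j + 1]]))
    j_rand = rng.randrange(k + 1)  # this part is filled with random 0/1 genes
    parts[j_rand] = [rng.randint(0, 1) for _ in parts[j_rand]]
    return [bit for part in parts for bit in part]

child = kpr_crossover([0] * 8, [1] * 8, k=2)
assert len(child) == 8 and set(child) <= {0, 1}
```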


Mutation is performed in four steps:

(1) A random real number from the interval (0, 1) is generated for each chromosome and compared with the probability of mutation.
(2) Based on this probability, some chromosomes are selected.
(3) For each selected chromosome, a random natural number k, varying from 1 to the number of genes in the chromosome, is generated.
(4) Gene number k is replaced by another randomly generated gene.
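The four steps above can be sketched directly for binary chromosomes (the selection and replacement details are our reading of the description):

```python
import random

def mutate(population, pm=0.02, rng=None):
    """Mutation following steps (1)-(4): each chromosome is selected with
    probability pm, then one randomly chosen gene is replaced by a new
    random bit."""
    rng = rng or random.Random(1)
    for chrom in population:
        if rng.random() < pm:               # steps (1)-(2): select chromosome
            k = rng.randrange(len(chrom))   # step (3): pick gene number k
            chrom[k] = rng.randint(0, 1)    # step (4): replace it
    return population

pop = mutate([[0, 1, 0, 1] for _ in range(5)], pm=1.0)
assert len(pop) == 5 and all(set(c) <= {0, 1} for c in pop)
```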

A standard GA is used in this experiment with a population size of 20, and the total length of the chromosomes is 85 bits. The crossover probability is pc = 0.50 and the mutation probability is pm = 0.02. Each test function is run on the GA 30 times with a maximum of 5000 generations per run. Fig. 6 shows the variation of the female's age when the male's age and the diversity are changing. Fig. 7 shows that when the male chromosome's age increases, the system decreases the age considered for the female chromosome. Fig. 8 shows that when the diversity of the population increases, the system likewise decreases the age considered for the female chromosome. This technique will

Fig. 9. (a–f). Crossplots showing the correlation between core and predicted permeability using the five training algorithms and the FGA.

Table 2
The comparison of MSE and R² for test data using the five training algorithms, GA and FGA.

Algorithm   R²       MSE
LM          0.8274   0.0012
BR          0.8239   0.0012
OSS         0.7257   0.0015
RP          0.751    0.0016
SCG         0.7885   0.0015
GA          0.8438   0.001
FGA         0.8523   0.00092

maintain the diversity of the population so that the GA cannot converge too soon, and premature convergence will be avoided. Fig. 9(a–f) shows the correlation coefficient between the core and predicted permeabilities from the five training algorithms and the FGA method. Table 2 shows both the MSE and R² values for the overall data points using the five training algorithms, the GA and weighted averaging (FGA). This table helps us decide which combining model performs better. A good combining scheme should have a higher R² and a lower MSE.
6. Conclusion
There are different ways of combining the intelligent systems' outputs in the combiner of a committee neural network; one of these methods is the Genetic Algorithm (GA). The failure to find good results when a GA is used in a CM is largely due to premature convergence, and the population diversity of a GA is an important parameter in premature convergence. A technique for controlling the population diversity using fuzzy rules and sexual selection is proposed in this paper. In conclusion, female choice by fuzzy logic is a suitable way of improving the performance of GAs by maintaining the diversity of the population, so that premature convergence can be eliminated. In this paper, we used the FGA method for combining the output of experts for the prediction of permeability in the oil industry. From the simulation results, the correlation coefficients and MSEs for the five training algorithms are shown in Table 2. The R² and MSE for the GA combining method are 0.8438 and 0.001 respectively, which are better than those of all the individual training algorithms. By applying the FGA method to the combining, the correlation coefficient and MSE are further improved, to 0.8523 and 0.00092 respectively.
References
Ahmadzadeh, M.R., Petrou, M., 2003. Use of Dempster–Shafer theory to combine classifiers which use different class boundaries. Pattern Anal. Appl. 6 (1), 41–46.
Al-Ghoneim, K.A., Kumar, B.V., 1996. Combining neural networks using the ranking figure of merit. Proc. SPIE 2760, 213.
Arabas, J., Michalewicz, Z., et al., 1994. GAVaPS: a genetic algorithm with varying population size. Evolutionary Computation, 1994: IEEE World Congress on Computational Intelligence, vol. 1, pp. 73–78.
Bhatt, A., Helle, H.B., 2002. Committee neural networks for porosity and permeability prediction from well logs. Geophys. Prospect. 50 (6), 645–660.
Brazdil, P., Soares, C., 2000. A Comparison of Ranking Methods for Classification Algorithm Selection, pp. 63–75.
Breiman, L., 1996. Bagging predictors. Mach. Learn. 24 (2), 123–140.
Chen, C.-H., Lin, Z.-S., 2006. A committee machine with empirical formulas for permeability prediction. Comput. Geosci. 32 (4), 485–496.
Cho, S.-B., Kim, J.H., 1995. An HMM/MLP architecture for sequence recognition. Neural Comput. 7 (2), 358–369.
Drucker, H., Cortes, C., et al., 1994. Boosting and other ensemble methods. Neural Comput. 6 (6), 1289–1301.
Efron, B., Tibshirani, R.J., 1993. An Introduction to the Bootstrap. Chapman & Hall, New York.
Freund, Y., Schapire, R., 1995. A decision-theoretic generalization of on-line learning and an application to boosting. European Conference on Computational Learning Theory, pp. 23–37.
Hansen, L.K., Salamon, P., 1990. Neural network ensembles. IEEE Trans. Pattern Anal. Mach. Intell. 12 (10), 993–1001.
Haykin, S., 1999. Neural Networks: A Comprehensive Foundation. Prentice-Hall, Upper Saddle River, NJ.
Ho, T.K., Hull, J.J., et al., 1994. Decision combination in multiple classifier systems. IEEE Trans. Pattern Anal. Mach. Intell. 16 (1), 66–75.
Jacobs, R.A., 1995. Methods for combining experts' probability assessments. Neural Comput. 7 (5), 867–888.
Jalali, M., Lee, L.S., 2009. Fuzzy genetic algorithm with sexual selection (FGASS). Second Int. Conf. and Workshop on Basic and Applied Science, 2–4 June, Johor Bahru, Malaysia.
Kadkhodaie-Ilkhchi, A., Rahimpour-Bonab, H., et al., 2009a. A committee machine with intelligent systems for estimation of total organic carbon content from petrophysical data: an example from Kangan and Dalan reservoirs in South Pars Gas Field, Iran. Comput. Geosci. 35 (3), 459–474.
Kadkhodaie-Ilkhchi, A., Rezaee, M.R., et al., 2009b. A committee neural network for prediction of normalized oil content from well log data: an example from South Pars Gas Field, Persian Gulf. J. Petrol. Sci. Eng. 65 (1–2), 23–32.
Kosko, B., 1992. Neural Networks and Fuzzy Systems: A Dynamical Systems Approach to Machine Intelligence. Prentice-Hall, Inc., p. 449.
Krogh, A., Vedelsby, J., 1995. Neural network ensembles, cross validation, and active learning. Adv. Neural Inf. Process. Syst. 7, 231–238.
Kuncheva, L.I., 2004. Classifier Ensembles for Changing Environments, pp. 1–15.
Kuncheva, L.I., Bezdek, J.C., et al., 2001. Decision templates for multiple classifier fusion: an experimental comparison. Pattern Recognit. 34 (2), 299–314.
Lincoln, W., Skrzypek, J., 1990. Synergy of clustering multiple back propagation networks. Adv. Neural Inf. Process. Syst. 2, 650–657.
Naftaly, U., Intrator, N., Horn, D., 1997. Optimal ensemble averaging of neural networks. Network 8, 283–296.
Opitz, D.W., Shavlik, J.W., 1996. Actively searching for an effective neural network ensemble. Connect. Sci. 8 (3), 337–354.
Raviv, Y., Intrator, N., 1996. Bootstrapping with noise: an effective regularization technique. Connect. Sci. 8 (3), 355–372.
Rezaee, M.R., 2001. Petroleum Geology. Alavi Publications, Tehran, Iran.
Schapire, R.E., 1990. The strength of weak learnability. Mach. Learn. 5 (2), 197–227.
Service, U. o. T. a. A. P. E., 1999. A Dictionary for the Petroleum Industry. Petroleum Extension Service.
Wolpert, D.H., 1992. Stacked generalization. Neural Netw. 5, 241–259.
Xu, L., Krzyzak, A., et al., 1992. Methods of combining multiple classifiers and their applications to handwriting recognition. IEEE Trans. Syst. Man Cybern. 22 (3), 418–435.
