
Journal of Petroleum Science and Engineering 53 (2006) 149–157

www.elsevier.com/locate/petrol

Facies identification from well logs: A comparison of discriminant analysis and naïve Bayes classifier
Yumei Li, Richard Anderson-Sprecher
University of Wyoming, Laramie, WY 82071-3332, USA
Received 19 March 2005; received in revised form 1 June 2006; accepted 6 June 2006

Abstract
The performance of a naïve Bayes classifier is compared with a well-established statistical classification approach, linear
discriminant analysis, by considering core and log data from marine–eolian sediments. The results indicate that both methods
perform adequately, and the Gaussian naïve Bayes classifier provides estimates as good as those based on the linear discriminant
analysis for the given data set. Quadratic discriminant analysis, a more conventional Bayesian analysis, and kernel-based density
estimation methods perform unexpectedly poorly, probably because of overfitting. We conclude that the normal distribution is
appropriate to fit the distribution of log readings in the present data, and the simplifications of naïve Bayes provide a robust, simple
approach for facies identification.
© 2006 Elsevier B.V. All rights reserved.
Keywords: Facies; Well logs; Discriminant analysis; Naïve Bayes classifier

1. Introduction
Facies identification is important in oil exploration
and development because facies often control the variation of petrophysical properties. Identification of
facies is generally based on core samples and outcrop
characteristics. Because available core and outcrop are
usually limited, establishing relationships between
facies and more readily available data sources, in particular well logs, is highly desirable.
Some efforts have been made to use statistical methods such as discriminant analysis (Sakurai and Melvin,
1988; Avseth et al., 2001; Tang et al., 2004) to identify facies from well logs. The past decade has also seen
applications of artificial neural networks (ANN) (Derek et al., 1990; Wong et al., 1995; Siripitayananon et al.,
2001; Bhatt and Helle, 2002) and fuzzy logic (Cuddy, 2000; Saggaf and Nebrija, 2003) in facies classification.
Initial successes of ANN for facies prediction have inspired enthusiasm, leading to claims that it has the
potential to dominate or take over other analytical tools used in the exploration and production industry
(Iloghalu, 2003). However, the reliable use of neural networks requires experience for adjusting parameters
and a large amount of training time, especially for large data sets (Wong et al., 1995; Avseth et al., 2001).

Corresponding author.
E-mail address: liyumei@uwyo.edu (Y. Li).
0920-4105/$ - see front matter © 2006 Elsevier B.V. All rights reserved.
doi:10.1016/j.petrol.2006.06.001

All methods use a training data set consisting of
observed cases with full information about both predictors (in our application, well-log readings) and groups
(in our case, facies). Based on the training data set, one creates a rule (called a classifier) by which future
observations of predictors can be used to infer probable
group memberships. The ideal classifier would be easy
to implement and would give reliable results. Among
statistical classification methods, discriminant analysis is
robust and powerful (Wong et al., 1995; Avseth et al.,
2001). Like other statistical methods, discriminant analysis does, however, need a large training data set (> 100
cases in the training set). The classification rule maximizes the separation of the pre-defined groups in the
multi-dimensional space formed by variables or predictors in the training set. It is widely accepted that the
success of discriminant analysis depends on the validity
of certain statistical assumptions such as multivariate
normality and homogeneity (Wong et al., 1995).
However, experience shows that the technique is fairly
robust when data size is adequate (at least 20 cases in the
smallest group in the training set) and when there are
relatively few (five or fewer) predictors (Tabachnick and
Fidell, 1996).
Bayesian classifiers provide another alternative. Conceptually, a Bayesian approach to classification is
appealing because it allows one to incorporate known
information or expert opinions, it explicitly leads to probabilities of new cases falling into different classes, and it
is easily updated as new information is obtained.
Multivariate Bayesian analyses are sometimes problematic, however, in that computations may be difficult and
modeling multiple correlations between variables is
potentially both delicate and unwieldy. To take advantage
of positive aspects of the Bayesian approach while avoiding some of the negative aspects, a modified Bayesian
method, known as naïve Bayes, is gaining acceptance.
Naïve Bayes is easy to implement, and is thus appealing,
provided that it gives good results. At first glance the
approach seems dubious because it assumes independence, and in many settings proper treatment of correlations is known to be important for good inference. For
naïve Bayes, however, the impact of this simplification is
often surprisingly small, and early experience with naïve
Bayes suggests that it may give facies predictions that are
at least as accurate as those from neural networks without
the burden of lengthy training required by neural networks
(Kapur et al., 2000).
Little work has been done on naïve Bayes for facies
identification, probably for three reasons: First, the
choice of prior probability distribution can greatly affect
classification results; although prior probabilities are
used in other classification methods, including discriminant analysis, the problem of priors is particularly associated with Bayesian methods. Prior information
originates from local geological knowledge. In hetero-

geneous formations like fluvial deposits, the prior distribution may change from one well to another (Coudert
et al., 1994). The heterogeneity of deposits makes the
choice of prior a challenge. Second, probabilities required by a fully Bayesian method are hard to obtain for
more than one predictor. The naïve Bayes classifier
assumes independence among predictors, but well logs
are often dependent. It is not clear whether violation of
the independence assumption will affect the facies classification. Third, it is still unknown what distributions are
appropriate to fit different log readings and how different
distributions affect the facies prediction. Kapur et al.
(2000) discretized values of predictor variables and used
a counting rule to calculate probabilities. They emphasized the importance of picking appropriate bin sizes: If
too few bins are selected, the FOP (facies occurrence
probability) lacks the ability to discriminate between
adjacent log readings. If there are too many bins, the FOP
will not be estimated precisely.
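The bin-size trade-off described above can be illustrated with a small sketch. It is not the implementation of Kapur et al. (2000): the function name `binned_facies_probability`, the bin limits and the toy readings are assumptions made only for illustration. For a single log, it counts how often each facies falls in each equal-width bin and converts the counts to proportions.

```python
from collections import Counter, defaultdict

def binned_facies_probability(readings, facies, lo, hi, n_bins):
    """Counting-rule estimate of P(facies | bin) for one well log.

    readings and facies are equal-length lists; lo, hi and n_bins define
    equal-width bins (hypothetical choices, not taken from the paper).
    """
    width = (hi - lo) / n_bins
    counts = defaultdict(Counter)
    for x, f in zip(readings, facies):
        # Clamp the index so that a reading equal to hi lands in the last bin.
        b = min(int((x - lo) / width), n_bins - 1)
        counts[b][f] += 1
    return {b: {f: c / sum(cnt.values()) for f, c in cnt.items()}
            for b, cnt in counts.items()}

# Toy bulk-density readings (g/cm^3); the values are invented for illustration.
rhob = [2.21, 2.26, 2.33, 2.42, 2.47, 2.51, 2.62, 2.66]
labels = ["SD", "SD", "SD", "ID", "ID", "ID", "SB", "SB"]

coarse = binned_facies_probability(rhob, labels, 2.0, 2.8, 2)
fine = binned_facies_probability(rhob, labels, 2.0, 2.8, 16)
```

With two bins, the ID and SB readings share one bin and cannot be told apart by bin membership alone. With sixteen bins, every occupied bin holds a single sample, so each estimated proportion rests on one observation.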
This study evaluates the performance of discriminant
analysis and a normal-based naïve Bayes classifier in
facies identification from well logs by applying the log–facies correlation derived from the training set to three
hold-out wells.
2. Methodology
2.1. Naïve Bayes classifier
Bayes' theorem aims to determine the conditional
probability of parameter values given the data by combining expectations based on previous experience (prior
probabilities) with information from available data. In
this study, Bayes' theorem is used to calculate the probability of the occurrence of a certain facies given the
well-log readings and to assign the facies of the highest
posterior probability to that observation depth.
The application of Bayes' theorem in facies classification can be written as follows:
$$P(f_j \mid X = x) = P(f_j)\,\frac{P(X = x \mid f_j)}{P(X = x)} \tag{1}$$

Here $P(f_j \mid X = x)$ is the posterior probability of the jth facies $f_j$ given that a random log reading X is equal to x;
$P(f_j)$ is the probability of the jth facies obtained from previous experience or from our initial belief of the
facies distribution before we have observed any data; and $P(X = x \mid f_j)$ is the conditional probability density for
a random log reading x given the occurrence of the jth facies $f_j$. $P(X = x)$ is the probability density for a random
log reading x, without conditioning on the facies. We predict that a new case X will come from the facies $f_j$
that achieves the highest posterior probability. If there are n well logs $(X_1, X_2, X_3, \ldots, X_n)$, then the above
formula can be modified as:
$$P(f_j \mid X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n) = \frac{P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n \mid f_j)\, P(f_j)}{P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n)} \tag{2}$$
By assuming independence among well logs given
certain types of facies, we get what is called a naïve
Bayes or simple Bayes classifier given by:
$$P(f_j \mid X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n) = \frac{P(f_j) \prod_{i=1}^{n} P(X_i = x_i \mid f_j)}{\sum_{j=1}^{m} P(f_j) \prod_{i=1}^{n} P(X_i = x_i \mid f_j)} \tag{3}$$
where m is the number of facies.
The above posterior probability is computed for each
facies and the prediction is made for the facies associated
with the largest posterior probability. This classification
rule requires preliminary knowledge of univariate probability distributions of well logs, which can be extracted
from training data for each facies. Note that Eq. (3)
differs from Eq. (2) in that Eq. (3) treats values of well
logs as though they were independently distributed.
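A minimal sketch of Eq. (3) is given here for concreteness. It assumes normal class-conditional densities (the choice made in Section 2.2); the function names, the two-facies setup and all numerical values are hypothetical and are not taken from the study.

```python
import math

def normal_pdf(x, mean, var):
    """Univariate normal density with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def naive_bayes_posterior(x, priors, params):
    """Posterior P(f_j | x_1..x_n) under the independence assumption of Eq. (3).

    x: list of log readings; priors: {facies: P(f_j)};
    params: {facies: [(mean, variance), ...]} with one pair per log.
    """
    joint = {}
    for f, prior in priors.items():
        likelihood = 1.0
        for xi, (mean, var) in zip(x, params[f]):
            likelihood *= normal_pdf(xi, mean, var)   # product over logs
        joint[f] = prior * likelihood                 # numerator of Eq. (3)
    total = sum(joint.values())                       # denominator of Eq. (3)
    return {f: v / total for f, v in joint.items()}

# Hypothetical two-facies, two-log example (RHOB, LOGRT); values are invented.
priors = {"SD": 0.3, "ID": 0.7}
params = {"SD": [(2.30, 0.01), (3.0, 0.25)],
          "ID": [(2.55, 0.01), (4.0, 0.25)]}
posterior = naive_bayes_posterior([2.35, 3.2], priors, params)
predicted = max(posterior, key=posterior.get)
```

In this toy case the reading lies close to the SD means on both logs, so SD receives the highest posterior even though its prior is smaller than that of ID.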
The naïve Bayes classifier is simple and computationally efficient. The independence assumption simplifies the classification task dramatically by allowing the
conditional densities to be calculated separately for each
well log. Although the independence assumption is
almost certainly violated, the classifier has been shown to
be robust to the violation of independence in classification
and to exhibit surprisingly good performance in many
domains that contain clear attribute dependences (Clark
and Niblett, 1989; Langley et al., 1992). A goal of the
present study is to see whether facies identification is one
of these domains.
2.2. Probability density estimation
Normal probability distributions are often assumed
for data in practical situations. In this study, we assume
log readings x (or, in some cases, natural logarithms of
log readings) given a certain facies f are normally
distributed, with a probability density function given
by:
$$P(x \mid f) = \frac{1}{\sqrt{2\pi\sigma_f^2}}\, e^{-\frac{(x - \mu_f)^2}{2\sigma_f^2}} \tag{4}$$

where $\sigma_f^2$ is the variance of log readings given facies f, and $\mu_f$ is the mean of log readings given facies f.
Estimates of parameters in the above probability density function can be derived from the training set.
Estimation of parameters in Eq. (4) was done using standard unbiased univariate estimators, the sample
mean for $\mu_f$ and the sample variance for $\sigma_f^2$. The sample mean is both the maximum likelihood
estimator and the least squares estimator. The sample variance is slightly larger than the maximum likelihood
estimator (MLE) of variance, and in this situation either the sample variance or the MLE may be used with little
difference in results. Notice that correlations need not be estimated because they do not enter into the naïve
Bayes approach to classification.

Fig. 1. Location of seven wells in Teapot Dome, Powder River Basin, Wyoming.

Fig. 2. The matrix plot of GR, NPHI, RHOB and LOGRT shows moderately strong pairwise correlations among NPHI, RHOB and LOGRT.
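The estimators discussed in this section can be sketched as follows. The helper name `fit_normal_by_facies` and the toy readings are assumptions for illustration only. The sketch computes, for each facies, the sample mean, the unbiased sample variance (divisor n - 1) and the maximum likelihood variance (divisor n).

```python
from collections import defaultdict
from statistics import mean, pvariance, variance

def fit_normal_by_facies(readings, facies):
    """Per-facies mean, unbiased variance (n - 1) and MLE variance (n)."""
    groups = defaultdict(list)
    for x, f in zip(readings, facies):
        groups[f].append(x)
    return {f: {"mean": mean(v),
                "var_unbiased": variance(v),   # divides by n - 1
                "var_mle": pvariance(v)}       # divides by n
            for f, v in groups.items()}

# Toy readings for one log; the values are invented.
rhob = [2.2, 2.3, 2.4, 2.5, 2.6, 2.7]
labels = ["SD", "SD", "SD", "ID", "ID", "ID"]
estimates = fit_normal_by_facies(rhob, labels)
```

The two variance estimates differ by the factor n/(n - 1). With three samples per group, as in this toy example, that factor is 1.5; for groups of several dozen samples it is within a few percent of 1, which is why either estimator can be used with little difference in results.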

An alternative approach, which we also consider, is to use a nonparametric estimate of the density for each
facies based on kernel density estimation (KDE). In KDE the density function is approximated by the superposition
of a set of kernels (Kraaijveld, 1996). As in most applications, the Gaussian (normal) kernel, a particularly
popular choice, was used (Duda and Hart, 1973; Specht, 1990). In keeping with naïve Bayes, we applied univariate
kernel density estimation to evaluate conditional probability densities given different types of facies. A program
written in Matlab estimated the kernel density, and the optimal bandwidth for kernel density estimates (the default
bandwidth in Matlab) was calculated on the basis of estimated integrated squared error (Martinez and Martinez, 2002).

Fig. 3. Boxplots of GR, NPHI, RHOB and LOGRT grouped by facies show that overlap of well-log responses is common among the five facies, and
that the most discriminating individual well logs are RHOB and LOGRT.

Table 1
Facies description of Upper Tensleep Formation, Wyoming

Facies           Description                                                                  Frequency
Sand dune        Fine- to medium-grained sandstone; high-angle cross-bedding                  160
Interdune        Siltstone to very fine-grained sandstone; burrowed, crinkly laminations      200
Sand sheet       Dolomitic sandstone; horizontal or low-angle laminations                     110
Shallow marine   Dolomite or sandy dolomite; massive, fossil (crinoids)                       38
Sabkha           Dolomite, vugs (molds after evaporite crystals) and fractures                85

2.3. Discriminant analysis
Discriminant analysis may take the form of either
linear or quadratic discriminant analysis. Both forms
assume each well log and their linear combinations are
normally distributed for each facies, an assumption that is
seldom true in practice. Linear discriminant analysis
additionally assumes homogeneity of the variance–covariance
structures for the different classes (facies).
This assumption is also violated for the given data set
according to Box's M test. Violation of the homogeneity
assumption may lead to overclassification, which means
cases tend to be assigned to facies with higher variance
due to higher posterior probability. Tabachnick and Fidell
(1996) recommend quadratic discriminant analysis as an
alternative to avoid overclassification. However, due to overfitting, quadratic discriminant analysis can have poor
classification capability in hold-out data sets, especially when a hold-out data distribution deviates far from the
training set distributions.

Fig. 4. The naïve Bayes posterior probabilities, LDA-predicted facies, and observed facies columns of well 55 (f1 = SD, f2 = ID, f3 = SM, f4 = SB,
f5 = SS, LDA = linear discriminant analysis). For clarity, probabilities of classification are split into two figures, Fig. 4 for naïve Bayes (BAY) and Fig.
5 for linear discriminant analysis (LDA). Probabilities give more detailed information than class identification (the highest probability class).
Probability curves indicate uncertainty in identification. See also Fig. 5.

Fig. 5. The posterior probability, observed facies and BAY-predicted facies columns of well 55 (f1 = SD, f2 = ID, f3 = SM, f4 = SB, f5 = SS,
BAY = naïve Bayes classifier). See also Fig. 4. The agreement between naïve Bayes and LDA is close. Both methods locate the economically
important stratum f1 but identify a narrower band of f1 than is actually present. f3 is erratically identified by LDA, with similar but slightly superior
performance by naïve Bayes. The dominant facies f2 is identified by both naïve Bayes and LDA, although other facies are sometimes labeled as f2 by
both naïve Bayes and LDA.

The steps of a discriminant analysis may be summarized as: (1) create discriminant functions from the
training set; (2) use discriminant functions to calculate discriminant scores; (3) convert discriminant scores to
Mahalanobis distances and associated posterior probabilities; (4) classify observations to the facies associated
with the highest posterior probability. We performed
discriminant analysis using the statistical program SPSS.
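The steps listed above can be sketched in a few lines. This is not the SPSS procedure used in the study: it is a simplified illustration restricted to two predictors so that the pooled covariance matrix can be inverted by hand, and it goes directly from the class means to squared Mahalanobis distances rather than forming explicit discriminant functions (under the equal-covariance normal model, both routes give the same posterior). The function names and the toy data are assumptions.

```python
import math

def lda_fit(X, y):
    """Class means, priors and inverse pooled 2x2 covariance for two predictors."""
    classes = sorted(set(y))
    n = len(y)
    means, priors = {}, {}
    s11 = s12 = s22 = 0.0
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        m = (sum(r[0] for r in rows) / len(rows),
             sum(r[1] for r in rows) / len(rows))
        means[c] = m
        priors[c] = len(rows) / n              # prior = training proportion
        for r in rows:                         # accumulate within-class scatter
            d0, d1 = r[0] - m[0], r[1] - m[1]
            s11 += d0 * d0
            s12 += d0 * d1
            s22 += d1 * d1
    dof = n - len(classes)
    s11, s12, s22 = s11 / dof, s12 / dof, s22 / dof
    det = s11 * s22 - s12 * s12
    inverse = (s22 / det, -s12 / det, s11 / det)   # (a, b, c) of [[a, b], [b, c]]
    return means, priors, inverse

def lda_posterior(x, means, priors, inverse):
    """Posterior probabilities from squared Mahalanobis distances to class means."""
    score = {}
    for c, m in means.items():
        d0, d1 = x[0] - m[0], x[1] - m[1]
        d2 = inverse[0] * d0 * d0 + 2.0 * inverse[1] * d0 * d1 + inverse[2] * d1 * d1
        score[c] = priors[c] * math.exp(-0.5 * d2)
    total = sum(score.values())
    return {c: s / total for c, s in score.items()}

# Toy training set with two logs (RHOB, LOGRT); the values are invented.
X = [(2.2, 3.0), (2.3, 3.4), (2.4, 3.1), (2.5, 4.0), (2.6, 4.3), (2.7, 3.9)]
y = ["SD", "SD", "SD", "ID", "ID", "ID"]
means, priors, inverse = lda_fit(X, y)
post = lda_posterior((2.35, 3.3), means, priors, inverse)
label = max(post, key=post.get)
```

Because a single pooled covariance matrix is shared by all classes, the resulting decision boundaries are linear in the predictors; estimating a separate covariance matrix per class would give the quadratic variant.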

2.4. Cross-validation
Cross-validation evaluates classification performance by using two independent samples of data, one to learn the
rule and another to test it. In this study, seven wells (Fig. 1) were selected on the basis of stratigraphic and
geographic coverage, availability of appropriate well logs, and availability of core analysis data. Due to limited
data and limited facies types in some wells, instead of leaving out a randomly selected well as the test set, the
hold-out well was chosen so that there would be enough data for each type of facies in the remaining wells (i.e.,
the training set). Three wells (51, 55, 56) were each held out in turn as test sets to study the consistency of the
two classification methods. Multiple analyses are performed: for each analysis one well is held out as a test set
from the beginning and the other wells are taken as the training set.
3. Results and analyses
The geological data, consisting of 593 core readings
and log signatures, were obtained from seven wells in the
Upper Tensleep Formation in Teapot Dome, Powder

River Basin, Wyoming (Fig. 1). In the Powder River Basin, the 150-foot-thick Upper Tensleep Formation at
depth 5300–5800 ft is composed of eolian–marine sequences, characterized by sandstones, dolomitic sandstones,
sandy dolomite, and dolomite.
The well-log data consist of gamma-ray (GR), neutron porosity (NPHI), formation density (RHOB),
and deep resistivity (LLD). The resistivity data are lognormally distributed, so a natural log transform of these
data was taken and designated LOGRT. Among the four well logs, the variables NPHI, RHOB, and LOGRT show
moderately strong pairwise correlations with each other (Fig. 2).
Different facies have different responses in well logs, but overlap of well-log responses is very common among
different facies (Fig. 3). The most discriminating individual well logs are RHOB and LOGRT. The least
discriminating log is GR.
Five facies were identified based on descriptions of well cores: sand dune (SD), interdune (ID), shallow marine
(SM), sabkha (SB) and sand sheet (SS). The descriptions and frequencies of the five facies are presented in
Table 1.
Both linear discriminant analysis and the naïve Bayes classifier are applied to three hold-out wells, with priors
set to the percentages of facies in the training data set. For each method, a predicted facies column is produced
with a corresponding posterior probability column for each well. The classification results of the two methods
in one of the three hold-out wells are illustrated in Figs. 4 and 5. The cross-validation results (Tables 2 and 3)
suggest that: (1) interdune, the most prevalent facies, is mostly correctly classified; (2) although less than 50%
of sand dune, the main hydrocarbon reservoir, is correctly identified, no other facies are misclassified as sand
dune. Also, misclassifications of sand dune typically occur physically adjacent to correct classifications of sand
dune.
In the current data, the normal-based Bayes classifier achieved a higher success rate than did the KDE-based
Bayes classifier (Fig. 6), with increases in classification rate of up to 20%. Thus the normality assumption is
appropriate for probability density estimation of the specified well logs when using the naïve Bayes classifier.
Both linear discriminant analysis and the normal-distribution-based naïve Bayes classifier perform consistently
in three wells, with an average success rate of 74% (Fig. 7).

Fig. 7. Comparison of discriminant analysis and the naïve Bayes analysis suggests that both approaches perform consistently in the
three analyzed wells. (LDA = linear discriminant analysis, BAY = naïve Bayes classifier).

Table 2
Classification results of linear discriminant analysis in well 55

Observed    Predicted                        Percent correct
            SD    ID    SM    SB    SS
SD           8     9     0     0     0       47.1%
ID           0    66     0     0     2       97.1%
SM           0     4     8     2     2       50.0%

Overall percent correct: 81.2%.

Table 3
Classification results of naïve Bayes classifier in well 55

Observed    Predicted                        Percent correct
            SD    ID    SM    SB    SS
SD           5    12     0     0     0       29.4%
ID           0    60     0     0     8       88.2%
SM           0     4     8     4     0       50.0%

Overall percent correct: 72%.
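The percent-correct figures in Tables 2 and 3 follow directly from the confusion counts. The sketch below recomputes them; the counts are transcribed from the two tables, while the function name `percent_correct` and the nested-dictionary layout are illustrative choices.

```python
def percent_correct(confusion):
    """Row-wise and overall percent correct from {observed: {predicted: count}}."""
    per_class = {}
    hits = total = 0
    for observed, row in confusion.items():
        n = sum(row.values())
        correct = row.get(observed, 0)     # diagonal entry of the table
        per_class[observed] = 100.0 * correct / n
        hits += correct
        total += n
    return per_class, 100.0 * hits / total

# Counts transcribed from Table 2 (LDA) and Table 3 (naive Bayes), well 55.
lda = {"SD": {"SD": 8, "ID": 9, "SM": 0, "SB": 0, "SS": 0},
       "ID": {"SD": 0, "ID": 66, "SM": 0, "SB": 0, "SS": 2},
       "SM": {"SD": 0, "ID": 4, "SM": 8, "SB": 2, "SS": 2}}
bay = {"SD": {"SD": 5, "ID": 12, "SM": 0, "SB": 0, "SS": 0},
       "ID": {"SD": 0, "ID": 60, "SM": 0, "SB": 0, "SS": 8},
       "SM": {"SD": 0, "ID": 4, "SM": 8, "SB": 4, "SS": 0}}

lda_rows, lda_overall = percent_correct(lda)
bay_rows, bay_overall = percent_correct(bay)
```

Both tables cover the same 101 observations in well 55 (17 sand dune, 68 interdune, 16 shallow marine), so the overall rates are 82/101 for linear discriminant analysis and 73/101 for naïve Bayes.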
Fig. 6. Comparison of kernel density estimation and normal density estimation of well logs suggests the normal assumption is more
appropriate than is kernel density estimation. (KDE = kernel density estimation, NOR = normal distribution).

4. Discussion and conclusions

The performance of the naïve Bayes method in facies identification from well logs primarily depends on how
the probability densities are estimated and how priors are distributed. Estimation of probability densities is
important for the calculation of the likelihood and thus for estimation of the posterior distribution of facies.
Comparison of KDE and the normal distribution surprisingly indicates that a normal distribution gives better results
than does KDE in terms of prediction. The success of the normal assumption over KDE implies that additional
effort to find the actual distribution does not necessarily improve prediction. Optimal bandwidths
probably follow the data too closely, and a broader bandwidth with smoother density estimates could be expected
to perform better. Under the normality assumption, the bandwidth goes to infinity, which leads to an increased
robustness of the classifier, as the location of the decision surface is less affected by noise and outliers in the data
(Kraaijveld, 1996). Furthermore, compared with other density estimation methods, fitting log readings to a
normal distribution is simple, computationally efficient
and reliable for purposes of facies identification.
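The effect of bandwidth can be illustrated with a small sketch. It does not reproduce the Matlab bandwidth selection used in the study; the two bandwidths, the function names and the toy sample are assumptions. The sketch evaluates a Gaussian kernel density estimate at a reading that falls in a gap between training values and compares it with a single fitted normal density.

```python
import math
from statistics import mean, stdev

def normal_density(x, centre, sd):
    """Normal density with the given centre and standard deviation."""
    z = (x - centre) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def kernel_density(x, sample, bandwidth):
    """Gaussian kernel density estimate: the average of one kernel per sample point."""
    return sum(normal_density(x, xi, bandwidth) for xi in sample) / len(sample)

# Invented training readings for one facies and one log.
sample = [2.21, 2.24, 2.30, 2.31, 2.38, 2.45]
x_new = 2.34   # lies in a gap between training readings

narrow = kernel_density(x_new, sample, bandwidth=0.01)       # follows the data closely
broad = kernel_density(x_new, sample, bandwidth=0.10)        # smoother estimate
fitted = normal_density(x_new, mean(sample), stdev(sample))  # single fitted normal
```

With the narrow bandwidth, the estimated density at the new reading is far smaller than with the broad bandwidth or the fitted normal, even though the reading sits well inside the range of the training data. A classifier built on such a spiky estimate would penalize this reading simply because no training value happened to fall next to it.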
The choice of priors also plays a role in classification.
The cross-validation in well 55 (Table 3) indicates that most
sand dune samples are misclassified as interdune. This is
probably due to interdune's much higher prior probability,
the larger variance in well logs for interdune over sand
dune, or a combination of these two influences. Classification based on an alternative prior distribution, which
takes the average of the prior from the training set and an
equal prior (all probabilities = 0.2), failed to improve
results. We conclude for the given data that the difference
in the variance of well logs among facies plays a more
important role than does the prior distribution. In
heterogeneous deposits like fluvial deposits where the
prior distribution plays a more important role, the
performance of the naïve Bayes classifier may be less
consistent than that in homogeneous marine deposits.
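The alternative prior described above, an average of the training-set prior and an equal prior, can be sketched as follows. The training proportions and the likelihood values are invented for illustration and are not the values used in the study; the function names are likewise assumptions.

```python
def blend_priors(training_prior, weight=0.5):
    """Mix the training-set prior with an equal prior over the same facies."""
    equal = 1.0 / len(training_prior)
    return {f: weight * p + (1.0 - weight) * equal
            for f, p in training_prior.items()}

def posterior(prior, likelihood):
    """Posterior proportional to prior times likelihood, normalized to sum to 1."""
    joint = {f: prior[f] * likelihood[f] for f in prior}
    total = sum(joint.values())
    return {f: v / total for f, v in joint.items()}

# Hypothetical training proportions for the five facies.
training_prior = {"SD": 0.25, "ID": 0.35, "SS": 0.20, "SM": 0.05, "SB": 0.15}
blended = blend_priors(training_prior)

# Hypothetical likelihoods for one reading that slightly favours interdune.
likelihood = {"SD": 1.0, "ID": 2.0, "SS": 0.1, "SM": 0.1, "SB": 0.1}
with_training = posterior(training_prior, likelihood)
with_blended = posterior(blended, likelihood)
```

In this toy case the blended prior lowers the interdune prior from 0.35 to 0.275, yet interdune still receives the highest posterior because the likelihood already favours it. This is the kind of situation in which changing the prior leaves the classification unchanged.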
Although linear discriminant analysis requires multivariate normality and equal variances across groups, past
experience shows that violation of these assumptions does
not generally lead to poor prediction, a finding that is
supported by this study. How the degree of violation of the
normality assumption affects the prediction is hard to
characterize precisely and is still unknown. On the other
hand, violation of the homogeneity assumption is known to
lead to overclassification. Our cross-validation (Table 2)
demonstrates that some sand dune samples are misclassified as
interdune, which is probably the result of overclassification. This explanation is consistent with the observation
that the two most discriminating well logs, RHOB and
NPHI, show substantial overlap between interdune and
sand dune, and interdune has a larger spread than sand dune.
Quadratic discriminant analysis, which is a natural remedy
to this problem, was also tested, but, with a success rate of
67.3%, we judged it to be inferior to linear discriminant
analysis for the present application. The probable difficulty
with quadratic discriminant analysis in the current setting is
overfitting of the training set coupled with heterogeneity of
distributions within facies across physical sites.

The naïve Bayes classifier assumes independence among predictors. Violation of the independence
assumption is substantial but does not adversely affect
the classification in this study. One possible reason is
that although the estimated posteriors are not necessarily correct, the group associated with the highest
$\prod_i P(x_i \mid f_j)/P(x_i)$ is the group associated with the highest $P(X \mid f_j)/P(X)$. This slightly weaker condition relaxes
the importance of the strict independence assumption.
An attempt to replace the four well-log variables with
four corresponding principal components in the naïve
Bayes classifier yielded a 42% success rate in
hold-out well 55. This initially surprising result may be
explained by noting that: (1) estimation of too many
parameters in the variance–covariance matrix for each
facies may introduce error; (2) the difference in the
correlation among well logs from one facies to another
complicates principal component analysis; (3)
although principal components in the training set are
independent of each other, the principal components of
the test set, which are calculated based on the principal
component functions derived from the training set, are
not necessarily independent due to the difference in
distribution between the test set and the training set.
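Point (3) can be illustrated with a two-variable sketch. The study used four well logs; two variables are used here only because the principal axes of a 2x2 covariance matrix have a closed form. The function names and all data values are assumptions. The principal axes are derived from the training data and then applied to test data with a different correlation structure.

```python
import math

def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def principal_angle(x, y):
    """Rotation angle that diagonalizes the 2x2 covariance matrix of (x, y)."""
    return 0.5 * math.atan2(2.0 * covariance(x, y),
                            covariance(x, x) - covariance(y, y))

def project(x, y, angle, centre):
    """Scores on the two principal axes defined by angle, after centring."""
    c, s = math.cos(angle), math.sin(angle)
    dx = [v - centre[0] for v in x]
    dy = [v - centre[1] for v in y]
    pc1 = [c * a + s * b for a, b in zip(dx, dy)]
    pc2 = [-s * a + c * b for a, b in zip(dx, dy)]
    return pc1, pc2

# Invented training and test data with different slopes between the two logs.
x_train = [1.0, 2.0, 3.0, 4.0, 5.0]
y_train = [1.1, 1.9, 3.2, 3.8, 5.0]
x_test = [1.0, 2.0, 3.0, 4.0, 5.0]
y_test = [2.0, 2.6, 3.1, 3.4, 4.0]

angle = principal_angle(x_train, y_train)
centre = (sum(x_train) / len(x_train), sum(y_train) / len(y_train))
train_cov = covariance(*project(x_train, y_train, angle, centre))
test_cov = covariance(*project(x_test, y_test, angle, centre))
```

The training scores are uncorrelated by construction, so `train_cov` is zero up to rounding. The test scores, rotated with the training axes, keep a clearly non-zero covariance, so the independence that motivated the transformation does not carry over to the hold-out data.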
In this study, the naïve Bayes classifier performs the
classification as well as does linear discriminant analysis in
terms of efficiency and consistency. Although we selected
normal likelihoods, naïve Bayes requires no assumption
on data distribution, which makes it a more universal
technique than discriminant analysis. We conclude that the
naïve Bayes classifier is worthy of consideration in general
for problems of facies identification.
Acknowledgements
The authors would like to thank Dr. P.G. Yin for providing the data and professional advice, Q.S. Zhang for
his contribution to facies analysis, Huaiyu Yuan for valuable discussion and insight, and an anonymous reviewer,
whose comments substantively improved the paper.
References
Avseth, P., Mukerji, T., Jorstad, A., Mavko, G., Veggeland, T., 2001.
Seismic reservoir mapping from 3-D AVO in a North Sea turbidite
system. Geophysics 66 (4), 1157–1176.
Bhatt, A., Helle, H., 2002. Determination of facies from well logs
using modular neural networks. Pet. Geosci. 8 (3), 217–228.
Clark, P., Niblett, T., 1989. The CN2 induction algorithm. Mach.
Learn. 3 (4), 261–283.
Coudert, L., Frappa, M., Arias, R., 1994. A statistical method for lithofacies identification. J. Appl. Geophys. 32, 257–267.
Cuddy, S., 2000. Litho-facies and permeability prediction from electrical
logs using fuzzy logic. SPE Reserv. Evalu. Eng. 3 (4), 319–324.
Derek, H., Johns, R., Pasternack, E., 1990. Comparative study of a
backpropagation neural network and statistical pattern recognition
techniques in identifying sandstone lithofacies. Proceedings 1990
Conference on Artificial Intelligence in Petroleum Exploration and
Production. Texas A and M University, College Station, TX, pp.
41–49.
Duda, R., Hart, P., 1973. Pattern Classification and Scene Analysis.
John Wiley and Sons Inc, New York. 482 pp.
Iloghalu, E., 2003. Application of neural networks technique in
lithofacies classifications used for 3-D reservoir geological modeling
and exploration studies. AAPG Annual Meeting Abstract.
Kapur, L., Lake, L., Sepehrnoori, K., 2000. Probability logs for facies
classification. In Situ 24 (1), 57–58.
Kraaijveld, M., 1996. A Parzen classifier with an improved robustness
against deviations between training and test data. Pattern Recogn.
Lett. 17 (7), 679–689.
Langley, P., Iba, W., Thompson, K., 1992. An analysis of Bayesian
classifiers. Proceedings of the Tenth National Conference on
Artificial Intelligence. AAAI Press, San Jose, CA.
Martinez, W., Martinez, A., 2002. Computational Statistics Handbook with
MATLAB. Chapman and Hall/CRC, Boca Raton, Florida. 616 pp.
Saggaf, M., Nebrija, E., 2003. A fuzzy logic approach for the estimation
of facies from wire-line logs. AAPG Bull. 87 (7), 1223–1240.
Sakurai, S., Melvin, J., 1988. Facies discrimination and permeability
estimation from well logs for the Endicott field. 29th Annual
SPWLA Symposium. San Antonio, Texas.
Siripitayananon, P., Chen, H., Hart, B., 2001. A new technique for
lithofacies prediction: back-propagation neural network. Proceedings of the 39th Annual ACM-SE Conference.
Specht, D., 1990. Probabilistic neural networks. Neural Netw. 3, 110–118.
Tabachnick, B., Fidell, L., 1996. Using Multivariate Statistics.
HarperCollins College Publishers, New York.
Tang, H., White, C., Zeng, X., Gani, M., Bhattacharya, J., 2004.
Comparison of multivariate statistical algorithms for wireline log
facies classification. AAPG Annual Meeting Abstract, vol. 88, p. 13.
Wong, P., Jian, F., Taggart, I., 1995. A critical comparison of neural
networks and discriminant analysis in lithofacies, porosity and
permeability predictions. J. Pet. Geol. 18 (2), 191–206.