
Application of Computers and Operations Research in the Mineral Industry –

Dessureault, Ganguli, Kecojevic & Dwyer (eds)


© 2005 Taylor & Francis Group, London, ISBN 04 1537 449 9

Comparing ordinary kriging interpolation variance and indicator kriging conditional variance for assessing uncertainties at unsampled locations

J.K. Yamamoto
Department of Environmental and Sedimentary Geology, Institute of Geosciences,
University of São Paulo, Brazil

ABSTRACT: This paper investigates measures of reliability associated with both ordinary
kriging and indicator kriging. Ordinary kriging is a well-known estimation method based on minimization
of the error variance. It provides the traditional kriging variance as well as the interpolation
variance. Indicator kriging computes, at an unsampled location, the conditional probability
that the unknown value is no greater than a given cutoff value. Indicator variables are determined for
each cutoff by means of a nonlinear transform. From the conditional cumulative distribution function, built by assembling
the conditional probabilities computed for all cutoff values, one can derive the conditional expectation (E-type
estimate) and the conditional variance. Interpolation variance and conditional variance were computed from a
sample data set drawn from an exhaustive data set, from which true variances were calculated.
The sample variances were then compared with the true variances. Both sample variances present good correlation with
the true variance, but the conditional variance was slightly better than the interpolation variance.

1 INTRODUCTION

Ordinary kriging is a well-known estimation method which minimizes the error variance and provides an uncertainty measure as well. However, the kriging variance is homoscedastic, i.e., it is independent of the data values used to obtain the estimator $Z^*(x_o)$ (Olea 1991). The ordinary kriging variance is not data dependent because it is based on a global semivariogram. To overcome this limitation, Yamamoto (2000) proposed the use of the interpolation variance, which accounts for both the data configuration and the dispersion of the data values.

Indicator kriging, based on a nonlinear transformation of the raw data, is used to estimate the conditional cumulative distribution function (ccdf) at an unsampled location. From the ccdf one can derive both the conditional expectation (E-type estimate) and the conditional variance.

This paper presents results of a comparative study between ordinary kriging interpolation variance and indicator kriging conditional variance.

2 ORDINARY KRIGING INTERPOLATION VARIANCE

Yamamoto (2000) proposed an alternative measure of the reliability of ordinary kriging estimates, which is computed as follows:

$$s_o^2 = \sum_{i=1}^{n} \lambda_i \left[ z(x_i) - z^*(x_o) \right]^2$$

where $s_o^2$ is the interpolation variance, $\{\lambda_i, i = 1, \ldots, n\}$ are the kriging weights computed from a normal system of equations derived from minimization of the error variance, $\{z(x_i), i = 1, \ldots, n\}$ are the $n$ neighboring data closest to the unsampled location $x_o$, and $z^*(x_o)$ is the ordinary kriging estimate. According to Yamamoto (2000), the interpolation variance is heteroscedastic and, therefore, increases with the dispersion of the $n$ neighboring data. Besides, it recognizes the proportional effect when data values follow a lognormal distribution, i.e., the interpolation variance is proportional to the ordinary kriging estimate.

Journel & Rao (1996) interpreted the kriging weights $\{\lambda_i, i = 1, \ldots, n\}$ as conditional probabilities attached to the $n$ local data. To build the conditional cumulative distribution function at any given location $x_o$, the $n$ neighboring data are first sorted in increasing order, $z(x_1) \le z(x_2) \le \cdots \le z(x_n)$, and the ccdf is then modeled as:

$$F(x_o; z(x_k) \mid (n)) = \sum_{i=1}^{k} \lambda_i, \qquad k = 1, \ldots, n$$
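As a hedged illustration of Section 2, the following minimal sketch computes the interpolation variance and the weight-based ccdf for a single location. It assumes the kriging weights are already available, are non-negative, and sum to one; the function names and the four weights and data values are hypothetical and are not taken from the paper.

```python
import numpy as np

def interpolation_variance(weights, values):
    """Return the weighted estimate and the interpolation variance.

    Assumes `weights` are ordinary kriging weights that sum to one.
    """
    w = np.asarray(weights, dtype=float)
    z = np.asarray(values, dtype=float)
    estimate = np.sum(w * z)                    # z*(x_o)
    variance = np.sum(w * (z - estimate) ** 2)  # s_o^2
    return estimate, variance

def weights_ccdf(weights, values):
    """Sort the data in increasing order and accumulate their weights.

    Assumes non-negative weights, so the cumulative sum is a valid ccdf.
    """
    w = np.asarray(weights, dtype=float)
    z = np.asarray(values, dtype=float)
    order = np.argsort(z, kind="stable")
    return z[order], np.cumsum(w[order])

# Hypothetical weights and neighboring data values (illustration only).
weights = [0.4, 0.3, 0.2, 0.1]
values = [2.0, 1.0, 4.0, 3.0]

est, var = interpolation_variance(weights, values)
z_sorted, F = weights_ccdf(weights, values)
print(est, var)
print(z_sorted, F)
```

The variance returned here is the variance of the discrete distribution defined by the sorted data and their cumulative weights, which is why the two functions share the same inputs.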


Copyright © 2005 Taylor & Francis Group plc, London, UK


Important properties of these ccdfs are the mean and the variance. The mean is:

$$E[Z(x_o) \mid (n)] = \sum_{i=1}^{n} \lambda_i \, z(x_i) = z^*(x_o)$$

which is the ordinary kriging estimate, and the variance is:

$$\mathrm{Var}[Z(x_o) \mid (n)] = \sum_{i=1}^{n} \lambda_i \left[ z(x_i) - z^*(x_o) \right]^2$$

which is none other than the interpolation variance (Yamamoto 2000). The conditional variance derived from the conditional cumulative distribution function depends on the data values and, therefore, is a better measure of accuracy (Journel & Rao 1996).

3 INDICATOR KRIGING CONDITIONAL VARIANCE

Indicator kriging (Journel 1983) is a nonparametric approach for building a conditional cumulative distribution function at an unsampled location. First the data distribution is analyzed and a set of cutoffs is established. Then, for each cutoff $z_k$, the indicator variable is determined by a nonlinear transform:

$$i(x; z_k) = \begin{cases} 1, & \text{if } z(x) \le z_k \\ 0, & \text{otherwise} \end{cases}$$

The number of cutoffs ($K$) is chosen to obtain a reasonable description of the frequencies below or above each cutoff (Hohn 1999). For each cutoff $z_k$, an experimental semivariogram has to be computed and modeled. After that, unsampled locations can be estimated using the indicator approach, which gives at each cutoff the probability that the unknown value is no greater than the cutoff value:

$$F^*(x_o; z_k \mid (n)) = \mathrm{Prob}^*\{ Z(x_o) \le z_k \mid (n) \}$$

Assembling the $K$ probabilities computed for each unsampled location, one obtains the conditional cumulative distribution function (Deutsch & Journel 1992). The conditional cumulative distribution function comes at a price: the time spent computing and modeling experimental semivariograms and solving the kriging system for each cutoff value. Thus, Deutsch & Journel (1992) suggested the use of median indicator kriging, which relies on the median indicator semivariogram to define the spatial correlation for all indicators. According to Hohn (1999), if the semivariograms computed for several cutoffs are very similar, then median indicator kriging should give results very similar to those of indicator kriging.

From the conditional cumulative distribution function we can derive both the conditional expectation (E-type estimate) and the conditional variance. The conditional expectation can be computed, according to Olea (1999), as:

$$E[Z(x_o) \mid (n)]^* = \sum_{k} \bar{z}_k \left[ F^*(x_o; z_k \mid (n)) - F^*(x_o; z_{k-1} \mid (n)) \right] \qquad (2)$$

where $\bar{z}_k$ is the mean value of the class $(z_{k-1}, z_k)$, and the conditional variance associated with the ccdf as follows:

$$\mathrm{Var}[Z(x_o) \mid (n)]^* = \sum_{k} \left( \bar{z}_k - E[Z(x_o) \mid (n)]^* \right)^2 \left[ F^*(x_o; z_k \mid (n)) - F^*(x_o; z_{k-1} \mid (n)) \right] \qquad (3)$$

Another uncertainty measure derived from the conditional cumulative distribution function was proposed by Hohn (1999). According to this author, the estimation standard deviation is computed from the lower and upper 16 percent quantiles. Hohn (1999) has probably assumed a normal cumulative distribution. Under that assumption, the standard deviation can be computed as:

$$\sigma = \frac{\phi_{84} - \phi_{16}}{2} \qquad (4)$$

where $\phi_{84}$ is the 84th percentile and $\phi_{16}$ is the 16th percentile.

This expression was originally used to compute the standard deviation of grain size distributions by the graphic method proposed by Inman (1952).

Observe that this last approach uses only two points on the conditional cumulative distribution function, whereas the conditional variance is based on $(K - 1)$ points. The greater the number of points on the curve, the more accurate the derived statistics.

This paper aims at comparing ordinary kriging interpolation variance and indicator kriging conditional variance as uncertainty measurements. This comparison is carried out through a case study.

4 CASE STUDY

For the case study, let us consider a sample of 224 points drawn by stratified random sampling from a synthetic data set composed of 48 × 42 values on a regular grid. Summary statistics for both the sample and the exhaustive data set are presented in Table 1.

For this data set, let us compute ordinary kriging estimates and the associated measures of uncertainty: kriging variance and interpolation variance.
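The paper does not describe how the strata were defined for the stratified random sampling. One design that happens to reproduce the stated counts is to divide the 48 × 42 grid into 3 × 3 blocks (16 × 14 = 224 blocks) and draw one node at random from each block. The sketch below illustrates that assumed design only; the function name, block size, and seed are hypothetical.

```python
import numpy as np

def stratified_sample(n_cols=48, n_rows=42, block=3, seed=0):
    """Draw one random grid node from each block-by-block stratum.

    The 3 x 3 stratum size is an assumption chosen so that the number of
    strata equals 224; it is not stated in the paper.
    """
    rng = np.random.default_rng(seed)
    picks = []
    for r0 in range(0, n_rows, block):
        for c0 in range(0, n_cols, block):
            r = r0 + rng.integers(block)  # random row inside the stratum
            c = c0 + rng.integers(block)  # random column inside the stratum
            picks.append((r, c))
    return np.array(picks)

samples = stratified_sample()
fraction = len(samples) / (48 * 42)
print(len(samples), round(fraction, 3))
```

Under this assumption the sample covers 224 of the 2016 nodes, i.e. about 11.1 percent of the exhaustive data set.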



Table 1. Summary statistics for the sample and the exhaustive data sets.

Summary statistics          Sample    Exhaustive

No. of data                    224          2016
Mean                         3.137         3.113
Standard deviation           2.339         2.204
Coefficient of variation     0.746         0.708
Maximum                     14.577        15.924
Upper quartile               4.191         4.120
Median                       2.735         2.641
Lower quartile               1.469         1.470
Minimum                      0.290         0.178

The estimation procedure was carried out using 8 neighboring data selected by the quadrant search method. Figure 1 displays the ordinary kriging estimates and the interpolation standard deviations.

Figure 1. Images of ordinary kriging estimates (top) and interpolation standard deviation (bottom).

Median indicator kriging was chosen to model conditional cumulative distribution functions at unsampled locations. It was chosen for this study to avoid order relation problems, as noted by Goovaerts (1994), Hohn (1999), and Deutsch & Journel (1992), among others. E-type estimates (expression 2) and conditional standard deviations (expressions 3 and 4) were computed based on 19 cutoff values. Results are displayed in Figure 2.

Figure 2. Images of E-type estimates (top) and conditional standard deviation (bottom).

4.1 Local precision of estimates

Since the sample data set was drawn from the exhaustive data set, we know the actual values on the nodes of a regular grid. Furthermore, for these nodes we have values estimated by either ordinary kriging or indicator kriging, which can be checked by comparing them with the actual values. Table 2 presents correlation coefficients between estimated and actual values. Estimates present good correlation with actual values, because the sample data set represents 11.1 percent of the total data. E-type estimates present a higher correlation than ordinary kriging estimates.
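As a hedged illustration of the indicator transform and of how the E-type estimate and the two conditional standard deviations could be obtained from a ccdf known at K cutoffs, consider the sketch below. It rests on assumptions that are not in the paper: class means are approximated by class midpoints, the tails are closed with user-supplied bounds `zmin` and `zmax`, and percentiles are read from the ccdf by linear interpolation. The four cutoffs and probabilities are hypothetical (the study itself used 19 cutoffs).

```python
import numpy as np

def indicator_transform(data, cutoffs):
    """Return a (n_data, K) array with 1 where z <= z_k and 0 otherwise."""
    z = np.asarray(data, dtype=float)[:, None]
    return (z <= np.asarray(cutoffs, dtype=float)).astype(float)

def ccdf_mean_variance(cutoffs, probs, zmin, zmax):
    """E-type mean and conditional variance from a discretized ccdf.

    Assumptions: class means are class midpoints; zmin and zmax close the
    lower and upper tails.
    """
    z = np.concatenate(([zmin], cutoffs, [zmax]))
    F = np.concatenate(([0.0], probs, [1.0]))
    class_prob = np.diff(F)               # probability of each class
    class_mean = 0.5 * (z[:-1] + z[1:])   # midpoint approximation
    mean = np.sum(class_mean * class_prob)
    var = np.sum(class_prob * (class_mean - mean) ** 2)
    return mean, var

def percentile_std(cutoffs, probs, zmin, zmax):
    """Half the distance between the 84th and 16th percentiles."""
    z = np.concatenate(([zmin], cutoffs, [zmax]))
    F = np.concatenate(([0.0], probs, [1.0]))
    p16, p84 = np.interp([0.16, 0.84], F, z)  # linear interpolation on the ccdf
    return 0.5 * (p84 - p16)

# Hypothetical ccdf at one location (illustration only).
cutoffs = [1.0, 2.0, 3.0, 4.0]
probs = [0.10, 0.40, 0.80, 0.95]

ind = indicator_transform([0.5, 2.0, 5.0], cutoffs)
mean, var = ccdf_mean_variance(cutoffs, probs, zmin=0.0, zmax=6.0)
sigma = percentile_std(cutoffs, probs, zmin=0.0, zmax=6.0)
print(mean, var ** 0.5, sigma)
```

The percentile-based value uses only two points of the ccdf, while the class-based variance uses every class, which is the contrast the paper draws between expressions 4 and 3.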



Table 2. Correlation coefficient between estimates and actual values.

Estimates            Correlation coefficient

Ordinary kriging     0.829
Indicator kriging    0.856

4.2 Proportional effect

The proportional effect is a heteroscedastic condition in which the variance of the error is proportional to a function of the local mean of the data values (Olea 1991). Natural phenomena with skewed distributions may exhibit a proportional effect (Yamamoto 2000). According to the summary statistics presented in Table 1, both the sample and the exhaustive data sets present positively skewed distributions, as indicated by the mean exceeding the median in both cases. For the exhaustive data set, in which we know all values on the nodes of a regular grid, we can compute a true variance for a given node by considering its surrounding neighbor nodes. Thus, given a central node, we can consider 8 surrounding nodes, 24 nodes, or more, depending on the neighborhood. For this study, we considered a neighborhood composed of the 24 nodes surrounding a central one. The true variance $s_{i,j}^2$ of a central node $z_{i,j}$ is computed from the surrounding nodes $z_{i_2,j_2}$ of that neighborhood.

To check whether the computed errors are proportional to the estimated values, we computed the correlation coefficients presented in Table 3. As the table shows, only the kriging standard deviation fails to recognize the proportional effect. All the others present a positive linear relationship between estimates and errors. Therefore, they are reliable error measurements.

Table 3. Correlation coefficients measuring the proportional effect.

Estimate           Error                               Correlation coefficient

OK estimate        Interpolation standard deviation    0.637
OK estimate        Kriging standard deviation          0.030
E-type estimate    Conditional standard deviation      0.634
E-type estimate    Estimation standard deviation       0.630
True value         True standard deviation             0.699

Figure 3 illustrates the proportional effect as recognized by the interpolation standard deviation and the conditional standard deviation.

Figure 3. Scattergrams displaying the proportional effect as recognized by interpolation standard deviation (top) and conditional standard deviation (bottom).

4.3 Comparing estimation errors with true errors

Now we can compare ordinary kriging and indicator kriging estimation errors with true errors. Table 4 presents correlation coefficients between estimation errors and true errors.

Table 4. Correlation coefficients between estimation errors and true errors.

Estimation error                    Correlation coefficient

Interpolation standard deviation     0.666
Kriging standard deviation          −0.011
Conditional standard deviation       0.703
Estimation standard deviation        0.673

Once again, only the kriging standard deviation shows no correlation with the true error, because of its homoscedastic character. The conditional standard deviation presents a higher correlation than the others, probably because it is based on a cumulative distribution function built from 19 cutoff grades. The correlation coefficient calculated between the interpolation standard deviation and the conditional standard deviation was 0.920, showing that both approaches give very similar results. Figure 4 shows scattergrams of estimation errors and true errors. Comparing the scattergrams, we conclude that the conditional standard deviation gives better results than the interpolation standard deviation, since the former is almost unbiased relative to the 45° line.

Figure 4. Comparison between true deviation and interpolation standard deviation (top) and between true deviation and conditional standard deviation (bottom).

5 CONCLUSIONS

Ordinary kriging interpolation variance and indicator kriging conditional variance are very similar uncertainty measurements associated with their respective estimates. Since both measurements are derived from conditional cumulative distribution functions, this paper has indirectly shown the similarity between these cumulative functions. Indeed, it validates the interpretation of ordinary kriging weights as conditional probabilities, as proposed by Journel & Rao (1996). In other words, the conditional cumulative distribution function derived directly from ordinary kriging weights is equivalent to the one built by assembling conditional probabilities computed from indicator kriging.

ACKNOWLEDGEMENTS

I am very grateful to the National Council for Scientific and Technological Development-CNPq (Process 304612/89-8) as well as to the Foundation for Research Sponsorship of State of São Paulo-FAPESP (Process 01/10948-4), which gave financial support to develop this research. Finally, I wish to thank an anonymous reviewer who examined this manuscript and gave suggestions which helped me to improve it.

REFERENCES

Deutsch, C.V. & Journel, A.G. GSLIB Geostatistical software library and user’s guide. New York, Oxford University Press, 1992. 340p.
Goovaerts, P. Comparative performance of indicator algorithms for modeling conditional probability distribution functions. Math. Geology, vol. 26, no. 3, 1994. pp. 389–411.
Hohn, M.E. Geostatistics and petroleum geology. Dordrecht, Kluwer Academic Publishers, 1999. 235p.
Inman, D.L. Measures for describing the size distribution of sediments. J. Sed. Pet., vol. 22, no. 3, 1952. pp. 125–145.
Journel, A.G. Nonparametric estimation of spatial distributions. Math. Geology, vol. 15, no. 3, 1983. pp. 445–468.
Journel, A.G. & Rao, S.E. Deriving conditional distributions from ordinary kriging. Stanford, Stanford Center for Reservoir Forecasting, 1996. 25p. (Report #9)
Olea, R.A. Geostatistical glossary and multilingual dictionary. New York, Oxford University Press, 1991. 175p.
Olea, R.A. Geostatistics for engineers and earth scientists. Boston, Kluwer Academic Publishers, 1999. 303p.
Yamamoto, J.K. An alternative measure of the reliability of ordinary kriging estimates. Math. Geology, vol. 32, no. 4, 2000. pp. 489–509.
