
Journal of Integrative Agriculture

2013, 12(9): 1673-1683 September 2013


Spatial Interpolation of Soil Texture Using Compositional Kriging and

Regression Kriging with Consideration of the Characteristics of Compositional
Data and Environment Variables

ZHANG Shi-wen1, 2, SHEN Chong-yang1, CHEN Xiao-yang2, YE Hui-chun1, HUANG Yuan-fang1 and LAI

1 China Agricultural University/Key Laboratory of Arable Land Conservation (North China), Ministry of Agriculture/Key Laboratory of
Agricultural Land Quality Monitoring, Ministry of Land and Resources, Beijing 100193, P.R.China
2 School of Earth and Environment, Anhui University of Science and Technology, Huainan 232001, P.R.China

3 Afforestation Management Office, Sichuan Forestry Department, Chengdu 610081, P.R.China

Spatial interpolation of soil texture does not necessarily satisfy the constant sum and nonnegativity constraints.
Meanwhile, although numeric and categorical variables have been used as auxiliary variables to improve the prediction
accuracy of soil attributes such as soil organic matter, they (especially the categorical variables) are rarely used in the spatial
prediction of soil texture. The objective of our study was to compare the performance of methods for spatial
prediction of soil texture that take into account the characteristics of compositional data and auxiliary variables. These
methods include ordinary kriging with the symmetry logratio transform, regression kriging with the symmetry logratio
transform, and compositional kriging (CK). The root mean squared error (RMSE), the relative improvement
value of RMSE and Aitchison’s distance (DA) were used to assess prediction accuracy, and the mean squared
deviation ratio was used to evaluate the goodness of fit of the theoretical estimate of error. The results showed that all the
prediction methods used in this paper enabled the interpolated soil texture to satisfy the constant sum and
nonnegativity constraints. The prediction accuracy and model fitting effect of the CK approach were better, suggesting that
the CK method was more appropriate for predicting soil texture. The CK method interpolates soil texture directly,
which ensures that it is an optimal unbiased estimator. If the environment variables are appropriately selected as auxiliary
variables, the spatial variability of soil texture can be predicted reasonably and accordingly the predicted results will be

Key words: compositional kriging, auxiliary variables, regression kriging, symmetry logratio transform

Received 16 October, 2012 Accepted 4 March, 2013

ZHANG Shi-wen, Tel: +86-554-6668430, E-mail: mamin1190@126.com; Correspondence HUANG Yuan-fang, Tel: +86-10-62732963, Fax: +86-10-62733596, E-mail:

© 2013, CAAS. All rights reserved. Published by Elsevier Ltd.

INTRODUCTION

Soil texture is a kind of compositional data, which is very common in the earth sciences (Walvoort and de Gruijter 2001). The spatial prediction of soil texture has been studied extensively in the literature (Gobin et al. 2001; Meul and Meirvenne 2003; Zhao et al. 2009; Ließ et al. 2012). However, the spatial interpolation of soil texture does not necessarily satisfy the constant sum and nonnegativity constraints. Incorporating information about soil variability and increasing the resolution of soil data in simulation models have often been shown to improve model predictions (McBratney et al. 1992; Lathrop et al. 1995; Lilburne and Webb 2002; Chaplot 2005). The spatial variability of soil texture is not an independent process and is usually correlated with other environment variables. Studies using numeric and categorical variables to improve the prediction accuracy of soil attributes such as soil organic matter have been reported. For example, Hengl et al. (2004) proposed a methodological framework for spatial prediction of organic matter, pH in topsoil and topsoil thickness by comparing regression kriging with ordinary kriging and plain regression. Chai et al. (2008) compared the performance of the empirical best linear unbiased predictor (E-BLUP) with residual maximum likelihood (REML) with that of regression kriging for predicting soil organic matter (SOM) in the presence of different external drifts. Zhang et al. (2011) examined whether inclusion of categorical variables can improve the accuracy of SOM prediction through systematic analyses of variability. However, reports on the use of numeric and categorical variables to improve the prediction accuracy of soil texture are very few. The objective of the paper is to find an appropriate interpolation method for soil texture prediction by testing various spatial prediction methods which take the interpolation requirements of compositional data and auxiliary information into account. Compositional kriging (CK), first described and applied in de Gruijter et al. (1997), was introduced as a straightforward extension of ordinary kriging that complies with these constraints. Also, CK appears to be a promising alternative to indicator kriging, because the order relation is implicitly taken into account in CK (Isaaks and Srivastava 1989; Walvoort and de Gruijter 2001; Tan et al. 2009; Zhang et al. 2011). Taking the numeric variables (e.g., SOM, elevation and bulk density) and categorical variables (e.g., land use types, soil types and parent material) as auxiliary variables, soil texture was predicted using ordinary kriging with the symmetry logratio transform (OK-SLR), regression kriging with the symmetry logratio transform (RK-SLR) and CK methods. The root mean squared errors (RMSE), the relative improvement (RI) values of RMSE, the mean squared deviation ratio (MSDR) and Aitchison’s distance (DA) were adopted to evaluate the performance of the different prediction methods.

Prediction of spatial distribution of soil texture using RK-SLR and OK-SLR methods

Soil texture is affected by many environment variables (e.g., parent material, climate, topography, geology and hydrology, human activities) during the long-term geochemical process. This article selected some environment variables, including numeric variables (e.g., SOM, elevation and soil bulk density) and categorical variables (e.g., land use types, soil types and parent material), to carry out analysis of variance (ANOVA) using SPSS ver. 20.0 software (SPSS Institute 2012). The original soil texture data were transformed by the SLR approach, and the predicted results were backtransformed by means of the antisymmetric logratio transform.

As shown in Table 1, results of one-way ANOVA indicated that sandSLR had significant correlations with soil types, parent material, soil bulk density, elevation and land use types, siltSLR had significant correlations with soil bulk density and elevation, and claySLR had significant correlations with soil types, parent material and soil bulk density. Because these environment variables are significantly related to different soil particles, they were selected to carry out multiple linear stepwise regressions by ordinary least squares. Land use types, soil types and parent material were taken as categorical variables.

The assignment of categorical variables in the regression was as follows: if a categorical variable includes n categories, then n-1 dummy variables are produced. One category is taken as the control group, and each of the other categories is coded 1 or 0. This assignment method ensures the independence of the regression for the independent variable. Details about the assignment method can be found in previous studies (SPSS Institute 2012; Zhang et al. 2012). The dummy variables derived from the soil type, parent material and land use type were denoted as “X11, X12, X13, X14”, “X21, X22, X23, X24, X25, X26”, and “X31, X32, X33, X34”, respectively.

Fitted equations and related parameters were as follows:


Table 1 Results of analysis of variance (ANOVA)

| Variable | Classification 1) | Parameter | SandSLR | SiltSLR | ClaySLR |
|---|---|---|---|---|---|
| SOM (g kg-1) | <10 (26), 10-15 (44), 15-20 (67), >20 (83) | F value | 2.19 | 1.35 | 0.94 |
| | | Significance 2) | 0.09 | 0.24 | 0.46 |
| Soil type | Aquic soil (23), cinnamon soil (18), brown soil (72), paddy soil (34), marsh soil (15) | F value | 6.01 | 0.96 | 6.85 |
| | | Significance | 0.00 ** | 0.43 | 0.00 ** |
| Parent material | AD (7), DAD (99), SR (12), CR (22), FR (27), LR (10), others (43) | F value | 2.95 | 1.38 | 3.93 |
| | | Significance | 0.00 * | 0.21 | 0.00 ** |
| Soil bulk density (g cm-3) | <1.36 (54), 1.36-1.38 (55), 1.38-1.40 (59), >1.40 (52) | F value | 6.64 | 2.11 | 3.63 |
| | | Significance | 0.00 ** | 0.04 * | 0.00 ** |
| Elevation (m) | <40 (80), 40-60 (69), 60-80 (18), 80-100 (21), >100 (32) | F value | 3.72 | 2.57 | 1.56 |
| | | Significance | 0.00 * | 0.02 * | 0.16 |
| Land use type | Paddy (10), irrigated (5), dryland (34), vegetable land (128), orchard (43) | F value | 2.96 | 1.97 | 1.00 |
| | | Significance | 0.02 * | 0.10 | 0.41 |

1) Numbers in brackets are the numbers of samples in each class. AD, DAD, SR, CR, FR and LR represent alluvial deposit, diluvial-alluvial deposit, siliceous rock, calcareous rock, feldspathic rock and leaf rock, respectively. SandSLR, SiltSLR and ClaySLR represent values of sand, silt and clay transformed by SLR.
2) * and **, correlation is significant at the 0.05 and 0.01 levels (2-tailed), respectively.
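The dummy-variable assignment used for the categorical auxiliary variables (n categories give n-1 indicator variables, with one category as the control group) can be illustrated with a minimal sketch. The function name `dummy_code` and the five land-use labels are hypothetical and are not the paper's data or its SPSS procedure.

```python
def dummy_code(values, control):
    """Encode a categorical variable with n categories as n-1 0/1 dummy variables.

    `control` is the reference (control-group) category; it is coded as all zeros.
    """
    categories = sorted(set(values))
    if control not in categories:
        raise ValueError("control category not present in the data")
    kept = [c for c in categories if c != control]
    # One row per sample, one 0/1 column per non-control category.
    rows = [[1 if v == c else 0 for c in kept] for v in values]
    return kept, rows


# Hypothetical land-use labels for five sampling sites (illustration only).
land_use = ["orchard", "dryland", "paddy", "orchard", "vegetable land"]
names, X = dummy_code(land_use, control="orchard")
for site, row in zip(land_use, X):
    print(site, dict(zip(names, row)))
```

Each row contains at most one 1, and the control category is identified by a row of zeros, which is what keeps the dummy columns from being linearly dependent on the regression intercept.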


Results of regression analyses showed that the sandSLR content was proportional to aquic soil, while inversely proportional to paddy and dryland; the siltSLR content was proportional to bulk density, and the claySLR content was inversely proportional to aquic soil and brown soil. There was a significant linear relationship between siltSLR and the related auxiliary variables (P<0.05), and there was an extremely significant linear relationship between sandSLR and claySLR and the related auxiliary variables (P<0.01), which indicated that the selected auxiliary variables could explain the variability of sandSLR, siltSLR and claySLR to some extent.

We calculated the regression values and residuals of sandSLR, siltSLR and claySLR using eq. (1). Semivariogram models were obtained using the ArcGIS GA function modules (ESRI 2010), and the models with the minimal residual sum were selected as the best fitting models. In order to better describe the spatial distribution of soil texture, anisotropy and trend parameters were obtained by taking the semivariogram model and the interpolation process into account.

Fig. 1 and Table 2 show the semivariogram models and corresponding parameters of sandSLR, siltSLR, claySLR, and their residuals. The nugget to sill ratio C0/(C0+C1) shown in Table 2 expresses the share of the spatial heterogeneity arising from random components in the total spatial heterogeneity (Cambardella et al. 1994; Goovaerts 1997). The C0/(C0+C1) ratios of siltSLR, claySLR and siltSLR residuals were smaller than 50% (Table 2), demonstrating that their spatial heterogeneities were mainly caused by systematic variability, while the C0/(C0+C1) ratios for sandSLR and claySLR residuals were larger than 50% (Table 2), which demonstrated that their spatial heterogeneities were mainly caused by random components. The spatial correlations of sandSLR, siltSLR and claySLR were stronger than those of the corresponding residuals. The ranges of spatial correlation in the east (E)-west (W) direction (15.84, 19.30 and 13.60 km for sandSLR, siltSLR and claySLR, respectively) were larger than or equal to those in the north (N)-south (S) direction, while for the sandSLR, siltSLR and claySLR residuals the ranges in the E-W direction were almost all smaller than those in the N-S direction, indicating that the spatial variability of sandSLR, siltSLR and claySLR caused by stochastic factors was stronger than that of their corresponding residuals. The C0/(C0+C1) ratio of siltSLR was the smallest (C0/(C0+C1)=36.36%), and the C0/(C0+C1) ratio of sandSLR residual was the largest (C0/(C0+C1)=68.29%). The C0/(C0+C1) ratios of the various types of data were between 25 and 75%. Therefore, the various types of data had a medium spatial correlation, which is in agreement with the decreasing trend in range.
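The SLR transform and its back-transform, on which the OK-SLR and RK-SLR results in this section rely, can be sketched as follows. This is a minimal sketch that assumes the usual centred (symmetric) logratio form, y_j = ln(x_j + δ) - mean_k ln(x_k + δ); the paper's own transform equations are not legible in this copy, so the exact placement of the zero-replacement constant `delta` is an assumption, and the function names and the example composition are hypothetical.

```python
import math


def slr_transform(parts, delta=0.0):
    """Centred logratio: y_j = ln(x_j + delta) - mean_k ln(x_k + delta).

    `delta` is an assumed small offset used only to avoid ln(0) for zero parts.
    """
    logs = [math.log(p + delta) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def slr_backtransform(y, total=100.0):
    """Inverse transform: exponentiate and rescale so the parts sum to `total`."""
    expy = [math.exp(v) for v in y]
    s = sum(expy)
    return [total * e / s for e in expy]


# Hypothetical sand/silt/clay percentages (illustration only).
texture = [55.0, 35.0, 10.0]
y = slr_transform(texture)
back = slr_backtransform(y)
print(y, back)
```

Because the back-transform exponentiates and renormalises, any interpolated values on the transformed scale map back to strictly positive parts with a constant sum, which is how the transform-based methods meet the two constraints. With a nonzero `delta` the round trip recovers the offset composition rather than the original one.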


Fig. 1 Semivariogram models of different soil particles and corresponding parameters. The blue lines are the fitted models that provide
the best fit through the points, i.e., the line for which the weighted squared difference between each point and the line is as small
as possible. Binned values are shown as red dots and are generated by grouping (binning) empirical semivariogram points together using
square cells that are one lag wide. Average points are shown as blue crosses and are generated by binning empirical semivariogram points
that fall within angular sectors. Binned points show local variation in the semivariogram values, whereas average values show smooth
semivariogram value variation. sandSLR, siltSLR and claySLR represent values of sand, silt and clay transformed by SLR.
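The binning step described in the caption, and the spherical model named in the Table 2 footnote, can be sketched in a few lines. This is a simplified omnidirectional illustration with hypothetical function names and toy data. It does not reproduce the ArcGIS procedure, its angular sectors, or the weighted fit.

```python
import math


def empirical_semivariogram(coords, values, lag, n_lags):
    """Average 0.5*(z_i - z_j)^2 over point pairs grouped into distance bins of width `lag`."""
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            h = math.dist(coords[i], coords[j])
            b = int(h // lag)
            if b < n_lags:
                sums[b] += 0.5 * (values[i] - values[j]) ** 2
                counts[b] += 1
    gamma = [s / c if c else None for s, c in zip(sums, counts)]
    return gamma, counts


def spherical(h, nugget, psill, a):
    """Spherical semivariogram with nugget C0, autocorrelated variance C1 and range a."""
    if h == 0:
        return 0.0
    if h >= a:
        return nugget + psill
    r = h / a
    return nugget + psill * (1.5 * r - 0.5 * r ** 3)


# Toy transect of three points (illustration only).
gamma, counts = empirical_semivariogram([(0, 0), (1, 0), (2, 0)], [1.0, 2.0, 4.0], lag=1.0, n_lags=3)
print(gamma, counts)
```

Fitting then amounts to choosing the nugget, autocorrelated variance and range so that `spherical` passes as closely as possible through the binned values.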

Prediction of spatial distribution of soil texture using CK method

Compositional kriging is introduced as a straightforward extension of ordinary kriging (OK) that complies with these constraints. The CK procedure utilized in this article is John T version which contains six buttons and five edit boxes. For details about CK see Walvoort and de Gruijter (2001).

Table 3 is the model file of the CK system, which contains the variogram models. The CK program supports the following models: Spherical, Exponential, Linear with sill, and Gaussian. Nested variograms are also allowed (up to three structures). Furthermore, anisotropic structures can be modeled. Note that parameter α is identical to the range for the Spherical and Linear models. However, for the Exponential and Gaussian models the relationship between the practical range α′ and parameter α is given by α′=3α and α′=√3α, respectively (Journel and Huijbregts 1978). The practical range is the lag distance at which the semivariance reaches 95% of the sill. The C0/(C0+C1) ratios for sand and silt were between 25 and 75%, while the C0/(C0+C1) ratio for clay was 16.87%, which showed that sand and silt had a medium spatial correlation and clay had a strong spatial correlation. The C0/(C0+C1) ratios of sand, silt and clay were smaller than 50% (Table 3), demonstrating that their spatial heterogeneities were mainly due to systematic variability. Thus, the spatial structure caused by variation of parent material played a dominant and decisive role in the total spatial variability, which illustrates that soil texture, as an important soil physical property, is relatively stable during the long-term processes of soil formation, migration, deposition and decomposition, as was also concluded by Zhang et al. (2011).

Validation

We utilized the semivariogram parameters shown in Table 3 to predict the sand, silt and clay content of the validation points. Regression values of sandSLR, siltSLR and claySLR for the validation sites were predicted by the regression equations. For the RK-SLR method, predicted values at the validation sites were obtained by adding the regression values to the residuals. The prediction results for the OK-SLR and RK-SLR methods were backtransformed to the original scale. Values of RMSE and MSDR are shown in Table 4 to evaluate prediction accuracy and model fitting effect.

Table 4 shows the values of RMSE and MSDR for the different spatial prediction methods. The values of RMSE for the CK method were the smallest for every kind of soil particle (RMSE values of 1.73, 1.71 and 0.62 for sand, silt and clay, respectively, are listed in Table 4), demonstrating that the prediction accuracy of the CK method is the highest. The values of MSDR for the CK method were closer to 1, which demonstrates that the fitting effects of the variograms for the CK method are better. The value of RMSE for clay was the smallest, while the value of MSDR for silt was smaller and closer to 1. Our results are in agreement with the findings from other studies (Walvoort et al. 2001; Zhang et al. 2011).

Compared to the OK-SLR method, the predicted

Table 2 Semivariogram models of different soil particle compositions and corresponding parameters 1)

| Variable | C0 | C1 | C0/(C0+C1) (%) | Major range (km) | Minor range (km) | phi (°) |
|---|---|---|---|---|---|---|
| SandSLR | 0.040 | 0.048 | 45.45 | 15.84 | 15.84 | 0 |
| SiltSLR | 0.012 | 0.021 | 36.36 | 19.30 | 10.21 | 19.16 |
| ClaySLR | 0.041 | 0.047 | 46.59 | 13.60 | 13.60 | 0 |
| SandSLR residual | 0.029 | 0.056 | 34.12 | 19.69 | 18.69 | 43.06 |
| SiltSLR residual | 0.014 | 0.019 | 42.42 | 10.40 | 20.16 | 111.45 |
| ClaySLR residual | 0.047 | 0.026 | 64.38 | 4.67 | 7.00 | 14.06 |

1) C0 is the nugget variance; C1 is the autocorrelated variance; C0/(C0+C1) is the nugget (C0) to sill (C0+C1) ratio (%); phi (°) is the angle of anisotropy, i.e., the angle
between the major axis of the ellipse and the North, taken in clockwise direction (range: 0-180 degrees). All fitted models used here were spherical models.
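The nugget-to-sill ratio in Table 2 follows directly from C0 and C1, as the short sketch below shows. The class boundaries follow the text for the "strong" (below 25%) and "medium" (25 to 75%) classes; the "weak" label above 75% is an assumption added here for completeness, and the function names are hypothetical.

```python
def nugget_ratio(c0, c1):
    """Nugget-to-sill ratio C0/(C0+C1), in percent."""
    return 100.0 * c0 / (c0 + c1)


def dependence_class(ratio_percent):
    """Spatial dependence class; the 'weak' class above 75% is an assumed convention."""
    if ratio_percent < 25.0:
        return "strong"
    if ratio_percent <= 75.0:
        return "medium"
    return "weak"


# (C0, C1) pairs taken from the first three rows of Table 2.
table2 = {"SandSLR": (0.040, 0.048), "SiltSLR": (0.012, 0.021), "ClaySLR": (0.041, 0.047)}
for name, (c0, c1) in table2.items():
    r = nugget_ratio(c0, c1)
    print(name, round(r, 2), dependence_class(r))
```

For these three rows the computed ratios round to the percentages listed in the table.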

Table 3 The model file of compositional kriging system

| Str | C0 | C1 | Model | smj1 | smn1 | smj | smn | phi (°) | Max | Min |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 60.93 | 70.76 | 2 | 27.2 | 20.01 | 81.6 | 60.02 | 12.30 | 30 | 4 |
| 1 | 40.59 | 61.99 | 2 | 21.81 | 15.77 | 65.43 | 47.3 | 9.49 | 30 | 4 |
| 1 | 0.42 | 2.07 | 2 | 4.30 | 1.44 | 12.9 | 4.32 | 83.32 | 30 | 4 |

Str, number of variogram structures; Model, 1=Spherical, 2=Exponential, 3=Linear with sill, 4=Gaussian; smj1, parameter α of structure 1 in major direction; smn1,
parameter α of structure 1 in minor direction; smj, the major search radius; smn, the minor search radius; Min, the minimum number of conditioning points; Max, the
maximum number of conditioning points within the search ellipse.
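The relationship between parameter α and the practical range can be checked against Table 3 with a small helper. This is a sketch with a hypothetical function name; the factor 3 for the Exponential model is stated in the text, while the factor √3 for the Gaussian model is the standard value for reaching 95% of the sill and is stated here as an assumption about the intended formula.

```python
import math


def practical_range(model, alpha):
    """Lag distance at which the semivariance reaches (about) 95% of the sill."""
    if model in ("spherical", "linear_with_sill"):
        return alpha  # parameter alpha is the range itself
    if model == "exponential":
        return 3.0 * alpha
    if model == "gaussian":
        return math.sqrt(3.0) * alpha  # assumed standard convention
    raise ValueError(f"unknown model: {model}")


# smj1 values from Table 3 (all three rows use Model 2 = Exponential).
for smj1 in (27.2, 21.81, 4.30):
    print(smj1, round(practical_range("exponential", smj1), 2))
```

In Table 3 the major search radii (smj) equal three times smj1, which suggests that the search ellipse was set to the practical range of the exponential models.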


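Table 4 Validation results for different spatial prediction methods

| Validation parameter | CK Sand | CK Silt | CK Clay | RK-SLR Sand | RK-SLR Silt | RK-SLR Clay | OK-SLR Sand | OK-SLR Silt | OK-SLR Clay |
|---|---|---|---|---|---|---|---|---|---|
| RMSE | 1.74 | 1.71 | 0.62 | 2.83 | 1.72 | 0.63 | 3.25 | 3.16 | 0.67 |
| MSDR | 6.38 | 3.54 | 8.88 | 7.89 | 5.75 | 66.69 | 13.18 | 3.97 | 35.04 |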
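The validation statistics can be computed with a few short functions. This is a sketch under stated assumptions: RMSE and MSDR are written in their standard forms, RI is taken as the percentage reduction of RMSE relative to the reference method, and Aitchison's distance is written as the Euclidean distance between centred logratio vectors. Function names are hypothetical, and the paper's own equations are not legible in this copy.

```python
import math


def rmse(obs, pred):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def msdr(obs, pred, pred_var):
    """Mean squared deviation ratio: squared errors scaled by the prediction variances."""
    return sum((o - p) ** 2 / v for o, p, v in zip(obs, pred, pred_var)) / len(obs)


def relative_improvement(rmse_x, rmse_ref):
    """Percentage reduction in RMSE of method x relative to the reference method."""
    return 100.0 * (rmse_ref - rmse_x) / rmse_ref


def aitchison_distance(x, y):
    """Euclidean distance between the centred logratio vectors of two compositions."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    return math.sqrt(sum(((a - mx) - (b - my)) ** 2 for a, b in zip(lx, ly)))


# RI of CK over OK-SLR for silt, using the rounded RMSE values in Table 4.
print(round(relative_improvement(1.71, 3.16), 2))
```

Recomputing RI from the rounded RMSE values in Table 4 gives numbers close to, but not always identical with, those quoted in the text, presumably because the published values were derived from unrounded RMSEs.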

accuracy of the RK-SLR and CK methods is improved. Specifically, the relative improvement values of RMSE of sand, silt and clay for the CK approach reached 46.64, 45.89 and 7.83%, while the relative improvement values of RMSE of sand, silt and clay for the RK-SLR method reached 13.06, 45.75 and 6.17% (Fig. 2).

Aitchison’s distance (DA) was computed between the predicted ẑ(xi) and observed z(xi) compositions for all validation points xi. The scatter plots of DA among the different prediction methods are shown in Fig. 3. A one-tailed paired difference t test showed that the null hypothesis of no difference between the average DA for the CK, RK-SLR and OK-SLR methods should be rejected (pCK-(OK-SLR)=0.042<0.05; pRK-(OK-SLR)=0.048<0.05); a one-tailed paired difference t test showed that the null hypothesis of no difference between the average DA for the CK and RK-SLR methods should not be rejected (p(RK-SLR)-CK=0.564>0.05). It can be concluded that predictions obtained with the CK and RK-SLR methods were more accurate than those obtained with the OK-SLR approach, while predictions obtained with the CK method were not significantly different from those obtained with the RK-SLR method. The spatial variability of soil texture is not an independent process; it is correlated with other soil properties. If the auxiliary variables explain enough of the variability of the various soil particle types, the prediction results will be satisfactory. This result can also be found in other studies (Chai et al. 2008; Zhang et al. 2012).

Fig. 2 The relative improvement of accuracy of RK-SLR and CK based on the reference method OK-SLR.

Fig. 3 Aitchison’s distance for the OK-SLR vs. RK-SLR and CK method and for RK-SLR vs. CK for the study. The solid line is the 1:1 line.

The prediction methods utilized in this article enable the interpolation results to satisfy the four requirements for spatial interpolation of compositional data. By comparison of the RMSE, RI and MSDR of the various prediction methods, CK was more appropriate for soil texture, and its prediction accuracy and model fitting effect for compositional data were better. Values of RMSE, RI and MSDR of sand for the CK method were 1.73, 46.64% and 6.38, respectively; values of RMSE, RI and MSDR of silt for the CK method were 1.71, 45.89% and 3.55, respectively; values of RMSE, RI and MSDR of clay for the CK method were 0.62, 7.83% and 8.88, respectively. Scatter plots of DA showed that predictions obtained with the CK and RK-SLR methods were more accurate than those obtained with the OK-SLR approach, while predictions obtained with the CK method were not significantly different from those obtained with the RK-SLR method. The CK method interpolates soil texture directly and is an unbiased predictor that minimizes the prediction error variance and that complies fully with the nonnegativity and constant sum constraints of compositional data. Obviously, if the active inequality constraints were known in advance, the solution of the compositional kriging system would be rather straightforward. Wismer and Chattergy (1978) provided an efficient iterative algorithm to find these active constraints. This algorithm, known as the method of Theil and van de Panne (1960), starts with solving the compositional kriging system with all inequality constraints removed. Its solution is optimal if no inequality constraints are violated. Otherwise, combinations of the violated inequality constraints are added iteratively as equality constraints to the CK system until the optimal solution is obtained.

The spatial variability of soil texture is not an independent process. The introduction of the auxiliary variables (especially the categorical variables) can better explain the spatial variability of soil texture and give satisfactory prediction results.

Study area

The study was conducted in the plain area of Fangshan District with an area of 805 km2 (39°30´-39°50´N and 115°41´-116°14´E), located in the southeast of Beijing City (Fig. 4). In the study area, the topography slopes slightly from the southwest to the northeast, with the relative elevation varying between 27 and 390 m. Orchards and arable land are the main types of land use (Fig. 5). The soil types include brown soil, cinnamon soil, and fluvo-aquic soil, of which cinnamon soil and fluvo-aquic soil are dominant, occupying 85.47% of the total study area.

Data collection and analysis

Soil samples were collected in August 2010. The longitude and latitude of each sampling site were recorded using a global positioning system receiver. At each site, three to five soil samples were collected from the 0-20 cm layer within a diameter of 10 m around the sampling location and then mixed thoroughly. A total of 1.5 kg of soil per sampling site

Fig. 4 The map of the study position, sampling sites and elevation.


Fig. 5 The map of land use types and parent materials for the study area.

was taken from the mixed samples by the quartering method to perform chemical analysis. The samples were air-dried and ground to pass a 2-mm sieve. SOM content was determined using the potassium dichromate wet combustion procedure (NSS 1995). Soil particles were measured using laser grain analyzing equipment (Mastersizer 2000, Malvern Instruments Ltd., UK), and soil texture was classified based on the International system. Soil bulk density was calculated by dividing the mass of the oven-dried soil (105°C) by the core volume. We obtained soil types, types of parent material, land use types and elevation of the sampling points from maps of soil types, land use types in 2007 and the DEM using the extraction function of the ArcGIS 10.0 platform. To validate the performance of the different prediction methods, the data were randomly split into 220 sites as a prediction set and 52 sites as a validation set (Fig. 4).

Methods

Transformation for compositional data

As a type of compositional data, soil texture must be transformed before interpolation in order to meet the requirements of nonnegativity, constant sum, minimum error and unbiased estimation. The transform methods for compositional data mainly include the asymmetry logratio transform (ALR), the symmetry logratio transform (SLR), the multiplicative logratio transform (Aitchison 1986) and the isometric logratio transform (Egozcue et al. 2003). The SLR method has been utilized most widely (Walvoort and de Gruijter 2001; Tan et al. 2009; Zhang et al. 2011), and this study also utilized the SLR method to transform the soil texture data. The equation is as follows:

(2)

After performing semivariogram analysis and interpolation on the SLR-transformed data, predicted results are backtransformed via the antisymmetric logratio transform:

(3)

Where ij(x) is the relative content of the jth kind of soil particle at sampling site i, ij(x) is the transformed value of the relative content of the jth kind of particle at sampling site i. The constant i takes 1/2 of the smallest nonzero percentage of the jth kind of soil particle in the study area. k is the number of components.

Theory and prediction methods

Regression kriging

OK, RK and CK methods were utilized to predict the spatial variations of sand, silt and clay content in this article. RK is a geostatistical method which has been found to give more precise local predictions than OK (McBratney et al. 2000; Simbahan et al. 2006). The RK approach is based on the idea that the deterministic component of the target variable is explained by a regression model, whereas the residuals are assumed to describe the spatially varying but dependent component (Bishop and McBratney 2001). In the RK procedure used here the regression model is first fitted to the available data. At each location the residual from the fitted model, e, is calculated. At a given location, where the covariates are known, an RK prediction, Z*, is obtained by summing the regression prediction from the covariates, Z*pr, and the ordinary kriging prediction of the residual, e*, interpolated from the observed residuals at all sampled locations (Sumfleth and Duttmann 2008):

$$Z^{*} = Z^{*}_{pr} + e^{*} \qquad (4)$$

More complete descriptions of the theory and applications of RK and OK can be found in Odeh et al. (1995), McBratney et al. (2000, 2003), Hengl et al. (2004), and Goovaerts (1999). RK includes the model “B” (RK-B) and model “C” (RK-C). The details of the RK-C model utilized in the paper can be found in Odeh et al. (1995).

Compositional kriging

Compositional kriging is a straightforward extension of ordinary kriging. Therefore ordinary kriging can be taken as a starting point for the derivation of the compositional kriging system (Walvoort and de Gruijter 2001). The aim of ordinary kriging is to minimize the prediction error variance subject to the unbiasedness constraint (Isaaks and Srivastava 1989):

(5)

Where σk2 is the estimation variance of the kth component of z(xi), Wk is the kth column of the weight matrix W, Ck is the n×n matrix containing the covariances between the data points for component k, and dk is the vector of dimension n containing the covariances between the data points and the prediction point for component k. This constrained optimization problem can be converted into an unconstrained one by adding the unbiasedness constraint with a Lagrange multiplier to the objective function. The obtained objective function, i.e., the Lagrangian, can be minimized by setting its partial first derivatives with respect to the weights and the Lagrange multiplier equal to zero (Walvoort and de Gruijter 2001). This results in the ordinary kriging system:

(6)

Solving this system for Wk and the Lagrange multiplier yields an optimal set of weights for each component k. Optimal in this case refers to weights that lead to an unbiased predictor with minimum prediction error variance (Walvoort and de Gruijter 2001). However, since the constant sum and nonnegativity constraints are not guaranteed, these weights are only optimal for each component k separately and not necessarily for the composition as a whole. Therefore compositional kriging considers all components simultaneously by minimizing the sum of their prediction error variances, and by taking the unbiasedness, nonnegativity, and constant sum constraints into account:

(7)

Where Zk represents the kth column of Z, and tr(·) gives the trace of its argument. Since this optimization problem also contains inequality constraints, its solution is more complicated than that of ordinary kriging. Fortunately, the concept of active constraints (Wismer and Chattergy 1978) is very useful in this respect. It can be illustrated by means of a simple univariate optimization problem:

$$\min_{x} f(x) \quad \text{subject to } x \ge 0 \qquad (8)$$

Where f(x) is a convex quadratic function of x. At the point x* satisfying min f(x)=f(x*) subject to x≥0:

$$f'(x^{*}) = 0 \quad \text{or} \quad f'(x^{*}) > 0 \ \text{and} \ x^{*} = 0 \qquad (9)$$

Where f′(x)=df(x)/dx. In other words, the following conditions must hold at the minimum:

$$f'(x^{*}) + \alpha^{*} = 0, \qquad \alpha^{*} x^{*} = 0, \qquad \alpha^{*} \le 0, \qquad x^{*} \ge 0 \qquad (10)$$

Where α* is a Lagrange multiplier. These results are called the Kuhn-Tucker stationary conditions (Wismer and Chattergy 1978). The inequality constraint is said to be active if α<0 and consequently f′(x)>0 and x=0. On the other hand, it is inactive if α=0 and f′(x)=0. Hence, active inequality constraints can be considered as equality constraints, whereas inactive inequality constraints can be left out of consideration. Analogously, the Kuhn-Tucker conditions for the compositional kriging optimization problem are given by:

(11)

Here three sets of Lagrange multipliers appear; they pertain to the nonnegativity, the constant sum (β), and the unbiasedness constraints, respectively. The resulting set of equations is the compositional kriging system (Walvoort and de Gruijter 2001).

Prediction accuracy verification

In order to measure the performance of ordinary kriging


with the symmetry logratio transform (OK-SLR), regression kriging with the symmetry logratio transform (RK-SLR), and CK, the values of RMSE, MSDR, RI and DA were calculated. The RMSE indicates the accuracy of prediction: the smaller the RMSE, the more accurate the prediction results. The MSDR measures the goodness of fit of the theoretical estimate of error (Bishop and Lark 2008). If the correct variogram models are used, the MSDR values should be close to 1 (Lark 2000; Kerry and Oliver 2007; Chai et al. 2008). The formulas are as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left[Z(x_i)-\hat{Z}(x_i)\right]^{2}} \tag{12}$$

$$\mathrm{MSDR} = \frac{1}{n}\sum_{i=1}^{n}\frac{\left[Z(x_i)-\hat{Z}(x_i)\right]^{2}}{\sigma_i^{2}} \tag{13}$$

where $Z(x_i)$ is the measured value, $\hat{Z}(x_i)$ is the predicted value, $\sigma_i^{2}$ is the prediction (kriging) variance at $x_i$, and $n$ is the number of samples in the validation set. The relative improvement (RI) values of RMSE, used to measure the improvement in prediction accuracy of method x over the reference method (OK-SLR), were calculated following Pang et al. (2009) and Zhang et al. (2012):

$$\mathrm{RI} = \frac{\mathrm{RMSE}_{ref}-\mathrm{RMSE}_{x}}{\mathrm{RMSE}_{ref}} \times 100\% \tag{14}$$

where $\mathrm{RMSE}_{x}$ and $\mathrm{RMSE}_{ref}$ are the RMSE values of method x and the reference method, respectively. Taking the OK-SLR method as the reference method, the RI values of RMSE for the CK and RK-SLR methods were calculated in our study. DA between the predicted $z(x_i)$ and observed $\hat{z}(x_i)$ was computed for all validation points $x_i$, for CK and RK-SLR versus OK-SLR and for RK-SLR versus CK. It is defined following Martin-Fernandez et al. (1998):

$$D_A = \sqrt{\sum_{k=1}^{p}\left[\ln\frac{z_k(x_i)}{g\left(z(x_i)\right)}-\ln\frac{\hat{z}_k(x_i)}{g\left(\hat{z}(x_i)\right)}\right]^{2}} \tag{15}$$

where $z(x_i)$ and $\hat{z}(x_i)$ are the predicted and observed compositions at validation point $x_i$, respectively; $z_k(x_i)$ and $\hat{z}_k(x_i)$ are the predicted and observed values of component $k$ at $x_i$, respectively; $g(\cdot)$ is the geometric mean of the components; and $p$ is the number of components, which is equal to 3 in our study.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (41071152), the Special Fund for Land and Resources Scientific Research in the Public Interest, China (201011006-3), and the Special Fund for Agro-Scientific Research in the Public Interest, China (201103005-01-01).

References

Aitchison J. 1986. The Statistical Analysis of Compositional Data. Chapman and Hall, London.
Bishop T F A, McBratney A B. 2001. A comparison of prediction methods for the creation of field-extent soil property maps. Geoderma, 103, 149-160.
Bishop T F A, Lark R M. 2008. Reply to “Standardized vs. customary ordinary cokriging…” by A. Papritz. Geoderma, 146, 397-399.
Cambardella C A, Moorman T B, Novak J M, Parkin T B, Karlen D L, Turco R F, Konopka A E. 1994. Field-scale variability of soil properties in central Iowa soils. Soil Science Society of America Journal, 58, 1501-1511.
Chai X R, Shen C Y, Yuan X Y, Huang Y F. 2008. Spatial prediction of soil organic matter in the presence of different external trends with REML-EBLUP. Geoderma, 148, 159-166.
Chaplot V. 2005. Impact of DEM mesh size and soil map scale on SWAT runoff, sediment and NO3-N loads predictions. Journal of Hydrology, 312, 207-222.
Egozcue J J, Pawlowsky-Glahn V, Mateu-Figueras G, Barceló-Vidal C. 2003. Isometric logratio transformations for compositional data analysis. Mathematical Geology, 35, 279-300.
de Gruijter J J, Walvoort D J J, van Gaans P F M. 1997. Continuous soil maps - a fuzzy set approach to bridge the gap between aggregation levels of process and distribution models. Geoderma, 77, 169-195.
Gobin A, Campling P, Feyen J. 2001. Soil-landscape modeling to quantify spatial variability of soil texture. Physics and Chemistry of the Earth (Part B, Hydrology, Oceans and Atmosphere), 26, 41-45.
Goovaerts P. 1999. Geostatistics in soil science: state-of-the-art and perspectives. Geoderma, 89, 1-45.
Hengl T, Heuvelink G B M, Stein A. 2004. A generic framework for spatial prediction of soil variables based on regression-kriging. Geoderma, 120, 75-93.
Isaaks E H, Srivastava R M. 1989. An Introduction to Applied Geostatistics. Oxford University Press, New York. p. 561.
Journel A G, Huijbregts C J. 1978. Mining Geostatistics. Academic Press, London. p. 600.
Kerry R, Oliver M A. 2007. Comparing sampling needs for variograms of soil properties computed by the method of moments and residual maximum likelihood. Geoderma, 140, 383-396.
Lark R M. 2000. A comparison of some robust estimators of the variogram for use in soil survey. European Journal of Soil Science, 51, 137-157.
Lathrop Jr R G, Aber J D, Bognar J A. 1995. Spatial variability of digital soil maps and its impact on regional ecosystem modeling. Ecological Modelling, 82, 1-10.
Ließ M, Glaser B, Huwe B. 2012. Uncertainty in the spatial prediction of soil texture: comparison of regression tree
and Random Forest models. Geoderma, 170, 70-79.
Lilburne L R, Webb T H. 2002. Effect of soil variability, within and between soil taxonomic units, on simulated nitrate leaching under arable farming, New Zealand. Australian Journal of Soil Research, 40, 1187-1199.
Martin-Fernandez J A, Barcelo-Vidal C, Pawlowsky-Glahn V. 1998. Measures of difference for compositional data and hierarchical clustering methods. In: Buccianti A, Nardi G, Potenza R, eds., Proceedings of IAMG’98. Italy. pp. 526-539.
McBratney A B, de Gruijter J J, Brus D J. 1992. Spatial prediction and mapping of continuous soil classes. Geoderma, 54, 39-64.
McBratney A B, Mendonca Santos M L, Minasny B. 2003. On digital soil mapping. Geoderma, 117, 3-52.
McBratney A B, Odeh I O A, Bishop T F A, Dunbar M S, Shatar T M. 2000. An overview of pedometric techniques for use in soil survey. Geoderma, 97, 293-327.
Meul M, Meirvenne M V. 2003. Kriging soil texture under different types of nonstationarity. Geoderma, 112, 217-233.
NSS (National Soil Survey Office). 1995. Chinese Soil Genus Records. vol. 1-6. China Agriculture Press, Beijing. (in Chinese)
Odeh I O A, McBratney A B, Chittleborough D J. 1995. Further results on prediction of soil properties from terrain attributes: heterotopic cokriging and regression-kriging. Geoderma, 67, 215.
Pang S, Li T X, Wang Y D, Yu H Y, Li X. 2009. Spatial interpolation and sample size optimization for soil copper (Cu) investigation in cropland soil at county scale using cokriging. Agricultural Sciences in China, 8, 1369-1377. (in Chinese)
Simbahan G C, Dobermann A, Goovaerts P, Ping J, Haddix M L. 2006. Fine-resolution mapping of soil organic carbon based on multivariate secondary data. Geoderma, 132, 471-489.
SPSS Institute. 2012. SPSS Software. ver. 20. SPSS, New York, Armonk.
Sumfleth K, Duttmann D. 2008. Prediction of soil property distribution in paddy soil landscapes using terrain data and satellite information as indicators. Ecological Indicators, 8, 485-501.
Tan M Z, Mi S X, Li K L, Chen J. 2009. Influences of different interpolation methods on spatial prediction of compositional data - A case of fuzzy membership values of soil continuous classification. Soils, 41, 998-1003. (in Chinese)
Theil H, van De Panne C. 1960. Quadratic programming as an extension of classical quadratic maximization. Management Science, 7, 1-20.
Walvoort D J J, de Gruijter J J. 2001. Compositional kriging: A spatial interpolation method for compositional data. Mathematical Geology, 33, 951-966.
Wismer D A, Chattergy R. 1978. Introduction to Nonlinear Optimization: A Problem Solving Approach. Elsevier North-Holland, Amsterdam, The Netherlands. p. 395.
Zhang S W, Huang Y F, Shen C Y, Ye H C, Du Y C. 2012. Spatial prediction of soil organic matter using terrain indices and categorical variables as auxiliary information. Geoderma, 171-172, 35-43.
Zhang S W, Wang S T, Liu N, Ye H C, Huang Y F. 2011. Comparison of spatial prediction method for soil texture. Transactions of the Chinese Society of Agricultural Engineering, 27, 333-339. (in Chinese)
Zhao Z, Chow T L, Rees H W, Yang Q, Xing Z, Meng F R. 2009. Predict soil texture distributions using an artificial neural network model. Computers and Electronics in Agriculture, 65, 36-48.
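
The validation measures described under "Prediction accuracy verification" (RMSE, MSDR, RI and DA) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the sample values are hypothetical, RI is assumed to be expressed as a percentage, and DA is computed through the centred logratio, which requires strictly positive components.

```python
import math


def rmse(measured, predicted):
    """Root mean squared error over the validation points."""
    n = len(measured)
    return math.sqrt(sum((z - zh) ** 2 for z, zh in zip(measured, predicted)) / n)


def msdr(measured, predicted, variances):
    """Mean squared deviation ratio: squared errors divided by prediction variances."""
    n = len(measured)
    return sum((z - zh) ** 2 / v for z, zh, v in zip(measured, predicted, variances)) / n


def relative_improvement(rmse_x, rmse_ref):
    """Reduction in RMSE of method x relative to the reference, in percent (assumed)."""
    return (rmse_ref - rmse_x) / rmse_ref * 100.0


def clr(composition):
    """Centred logratio: log of each part divided by the geometric mean of all parts."""
    logs = [math.log(part) for part in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def aitchison_distance(a, b):
    """Aitchison distance between two compositions with strictly positive parts."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(clr(a), clr(b))))


if __name__ == "__main__":
    # Made-up illustration values, not data from the study.
    sand_measured = [40.0, 55.0, 30.0]
    sand_predicted = [42.0, 52.0, 33.0]
    kriging_variances = [6.0, 8.0, 7.0]
    print("RMSE:", rmse(sand_measured, sand_predicted))
    print("MSDR:", msdr(sand_measured, sand_predicted, kriging_variances))
    print("RI (%):", relative_improvement(rmse_x=2.5, rmse_ref=3.0))
    # Sand, silt and clay percentages at one validation point.
    print("DA:", aitchison_distance([42.0, 38.0, 20.0], [40.0, 40.0, 20.0]))
```

Because the centred logratio divides by the geometric mean, the distance is unchanged when a composition is rescaled, so percentages and fractions give the same DA.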

(Managing editor SUN Lu-juan)
