
SARMA 2012 Sharp statistical tools

2012-10-17

Computer exercise Kriging (geoR)


Loading R-packages
Load the needed packages:
> library(geoR)
> library(fields)
> library(maps)

Loading the data


As an example we'll study seasonal temperature over the US from 1997 (data
from http://www.image.ucar.edu/public/Data/RData.USmonthlyMet.bin).
First we load the data,
> temp <- read.csv("temp1997.csv")
and study the resulting data structure
> head(temp)
which contains coordinates and elevation for each of the 250 sites, along with
the distance to coast and average temperature for the winter, spring, summer
and fall of 1997 (winter includes December 1996).

Creating a geodata-object
Our first step is to construct a geodata object containing locations, observations
(fall temperatures) and covariates.
> temp.geodata <- as.geodata(temp, coords.col=1:2,
data.col=8, covar.col=1:4)
We then examine the data structure and plot the observations
> summary(temp.geodata)
> plot(temp.geodata)
The initial plot has modeled the mean using a constant; let's try some more
interesting mean-models.
> #1st trend using long and lat
> plot(temp.geodata, trend="1st")
> #add elevation and distance to coast
> plot(temp.geodata, trend=~long+lat+elevation+dCoast)

Computing variograms
Having decided on a mean-model we now examine the residual dependence,
both for a model with only a constant mean and for a more advanced model.
> ##largest distance that makes sense is in the North-South direction.
> D.max <- 1.5*diff(range(temp$lat))
> ##compute variograms
> estV.const <- variog(temp.geodata, option="bin",
trend="cte", bin.cloud=TRUE,
max.dist=D.max)
> ##use uvec to specify more bins
> estV.const.50 <- variog(temp.geodata, option="bin",
trend="cte", bin.cloud=TRUE,
max.dist=D.max, uvec=50)
> ##with different trend
> estV.trend <- variog(temp.geodata, option="bin",
trend=~long+lat+elevation+dCoast,
bin.cloud=TRUE, max.dist=D.max)
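To see what a binned empirical variogram is, the following minimal sketch computes one by hand in base R. It assumes simulated toy data (not the temperature data), and the names emp.u, emp.v and emp.n are illustrative only; the binning is a simple equal-width choice and is not meant to reproduce the defaults of variog exactly.

```r
## Toy data (simulated, NOT the temperature data): 100 random sites
set.seed(1)
n <- 100
coords <- cbind(runif(n), runif(n))
z <- rnorm(n)                       # independent values, for illustration only

## All pairwise distances and half squared differences
d  <- as.matrix(dist(coords))
g  <- 0.5 * outer(z, z, "-")^2
up <- upper.tri(d)                  # keep each pair once
d  <- d[up]
g  <- g[up]

## Bin the pairs by distance and average within each bin
max.dist <- 0.7                     # assumed cut-off, plays the role of max.dist
breaks <- seq(0, max.dist, length.out = 14)    # 13 equal-width bins
bin    <- cut(d, breaks, include.lowest = TRUE)
emp.v  <- as.vector(tapply(g, bin, mean))      # classical (Matheron) estimator
emp.n  <- as.vector(table(bin))                # number of pairs per bin
emp.u  <- (breaks[-1] + breaks[-length(breaks)]) / 2   # bin mid-points

plot(emp.u, emp.v, xlab = "distance", ylab = "semivariance")
```

Since the toy values are independent, the resulting variogram should be roughly flat; pairs further apart than max.dist are simply dropped.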
2012-10-14, 13:55:12Z, rev.3326

The computed empirical variograms are isotropic; it is often interesting to also
investigate any anisotropic behaviour in the data
> variog4.const <- variog4(temp.geodata, option="bin",
trend="cte", bin.cloud=TRUE,
max.dist=D.max)
> variog4.trend <- variog4(temp.geodata, option="bin",
trend=~long+lat+elevation+dCoast,
bin.cloud=TRUE, max.dist=D.max)
Another possibility is to examine what variograms for independent data would
look like. This can be done by permuting the observation locations and computing
an envelope of the resulting empirical variograms.
> estV.const.env <- variog.mc.env(temp.geodata,
obj.var=estV.const, nsim=100)
> estV.trend.env <- variog.mc.env(temp.geodata,
obj.var=estV.trend, nsim=100)
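The permutation idea can be sketched in base R as follows. This is an illustration on simulated toy data, not a reimplementation of variog.mc.env; the helper binned.variog and the names env.lower and env.upper are hypothetical.

```r
## Toy data (simulated): 80 sites with independent values
set.seed(2)
n <- 80
coords <- cbind(runif(n), runif(n))
z <- rnorm(n)

## Fixed distance bins; dist() returns pairs in a fixed order
breaks <- seq(0, 0.7, length.out = 11)          # 10 bins
bin <- cut(as.vector(dist(coords)), breaks, include.lowest = TRUE)

## Binned empirical variogram for a vector of values at the fixed sites
binned.variog <- function(values) {
  g <- 0.5 * as.vector(dist(values))^2          # same pair order as above
  as.vector(tapply(g, bin, mean))
}

obs.v <- binned.variog(z)

## Permute the values over the sites and recompute the variogram each time
nsim <- 99
perm.v <- replicate(nsim, binned.variog(sample(z)))   # bins x nsim matrix
env.lower <- apply(perm.v, 1, min)
env.upper <- apply(perm.v, 1, max)
```

An observed variogram that leaves the band between env.lower and env.upper at short distances suggests spatial dependence that independent data would not produce.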
Let's plot the estimated variograms and uncertainty envelopes.
> par(mfrow=c(2,2))
> plot(estV.const, envelope=estV.const.env)
> points(estV.const.50$u, estV.const.50$v,
col=2, pch=19, cex=.1)
> plot(variog4.const)
> plot(estV.trend, envelope=estV.trend.env)
> plot(variog4.trend)
We can also plot more detailed directional results for the last case.
> plot(variog4.trend, omni=TRUE, same=FALSE)
Study the variograms; what can you say about the directional dependence for
the two models? Does this make sense (what's being ignored in the first mean
model)?

Estimation of covariance parameters


Two options for estimating covariance parameters now exist. We can either use a
least-squares (LS) fit to the empirical variograms computed above or fit the
full model (mean+covariance) using maximum-likelihood (ML). We start with
LS for four different covariance models. Recall that the LS-estimates will be
affected by the number of bins, uvec, used.
> ##initial parameters (sill,range)
> initial.pars <- c(1,5)
> ##fit variograms - LS using 4 different covariance functions
> estV.LS.exp <- variofit(estV.trend, cov.model="exponential",
ini.cov.pars=initial.pars,
weights="cressie")
> estV.LS.gau <- variofit(estV.trend, cov.model="gaussian",
ini.cov.pars=initial.pars,
weights="cressie")
> estV.LS.sph <- variofit(estV.trend, cov.model="spherical",
ini.cov.pars=initial.pars,
weights="cressie")
> estV.LS.mat <- variofit(estV.trend, cov.model="matern",
ini.cov.pars=initial.pars,
weights="cressie", fix.kappa=FALSE)
Compare the results for the Gaussian and Matern covariances.
> print(estV.LS.gau)
> print(estV.LS.mat)

What's the limiting case of a Matern as the smoothness parameter (kappa in geoR) tends to infinity?
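To make the least-squares step concrete, here is a minimal base R sketch of a weighted LS fit of an exponential variogram, using weights of the form n/gamma^2 as suggested by Cressie. The binned values are synthetic, and the function names exp.variog and cressie.loss are hypothetical; this illustrates the criterion and is not the code used inside variofit.

```r
## Toy binned variogram (synthetic values, for illustration only)
set.seed(3)
u <- seq(1, 25, by = 2)                       # bin mid-points
n.pairs <- rep(200, length(u))                # number of pairs per bin
true.gamma <- 0.2 + 1.5 * (1 - exp(-u / 5))   # nugget 0.2, sill 1.5, range 5
v <- true.gamma * exp(rnorm(length(u), sd = 0.05))   # noisy "empirical" values

## Exponential variogram: nugget + sill * (1 - exp(-u / range))
exp.variog <- function(par) par[1] + par[2] * (1 - exp(-u / par[3]))

## Weighted LS criterion with weights n_j / gamma_j^2:
## sum_j n_j * (v_j / gamma_j - 1)^2
cressie.loss <- function(par) {
  gam <- exp.variog(par)
  sum(n.pairs * (v / gam - 1)^2)
}

start <- c(0.1, 1, 5)                         # initial (nugget, sill, range)
fit <- optim(start, cressie.loss, method = "L-BFGS-B",
             lower = c(0, 1e-6, 1e-6))
fit$par                                       # estimated (nugget, sill, range)
```

Changing the number or placement of the bins changes u, v and n.pairs, and therefore the estimates, which is the sensitivity to uvec mentioned above.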


We can now also estimate the same four models using ML (this may take a few
minutes)
> estV.ML.exp <- likfit(temp.geodata, cov.model="exponential",
trend=estV.trend$trend,
ini.cov.pars=initial.pars)
> estV.ML.gau <- likfit(temp.geodata, cov.model="gaussian",
trend=estV.trend$trend,
ini.cov.pars=initial.pars)
> estV.ML.sph <- likfit(temp.geodata, cov.model="spherical",
trend=estV.trend$trend,
ini.cov.pars=initial.pars)
> estV.ML.mat <- likfit(temp.geodata, cov.model="matern",
trend=estV.trend$trend, fix.kappa=FALSE,
kappa=1, ini.cov.pars=initial.pars)
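As a rough illustration of what an ML fit of the full model involves, the sketch below minimises a Gaussian negative log-likelihood for an exponential covariance with a linear trend in the coordinates, profiling out the trend coefficients by generalised least squares. It uses simulated toy data and a hypothetical function neg.loglik; it is a sketch under these assumptions and not a description of how likfit is implemented.

```r
## Toy data (simulated): linear trend plus exponentially correlated noise
set.seed(6)
n <- 60
coords <- cbind(runif(n, 0, 10), runif(n, 0, 10))
D <- as.matrix(dist(coords))
X <- cbind(1, coords)                          # intercept + two coordinates
Sigma.true <- 1.5 * exp(-D / 2) + diag(0.2, n)
z <- as.vector(X %*% c(10, 0.5, -1) + t(chol(Sigma.true)) %*% rnorm(n))

## Negative log-likelihood; par = log(c(nugget, sill, range)) keeps them positive
neg.loglik <- function(par) {
  p <- exp(par)
  Sigma <- p[2] * exp(-D / p[3]) + diag(p[1], n)
  R  <- chol(Sigma)                            # Sigma = t(R) %*% R
  Xw <- backsolve(R, X, transpose = TRUE)      # whitened design matrix
  zw <- backsolve(R, z, transpose = TRUE)      # whitened observations
  res <- lm.fit(Xw, zw)$residuals              # GLS residuals (whitened)
  sum(log(diag(R))) + 0.5 * sum(res^2) + 0.5 * n * log(2 * pi)
}

start <- log(c(0.1, 1, 5))
fit <- optim(start, neg.loglik)                # Nelder-Mead by default
exp(fit$par)                                   # estimated (nugget, sill, range)
```

Each likelihood evaluation needs a Cholesky factorisation of an n-by-n matrix, which is why ML is slower than the LS fits.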
Plot the estimated parametric covariance functions and compare them with
the empirical variograms
> plot(estV.trend, envelope=estV.trend.env,
ylim=c(0,2*max(estV.trend$v)))
> ##variograms for LS
> lines.variomodel(estV.LS.exp, col="red")
> lines.variomodel(estV.LS.gau, col="green")
> lines.variomodel(estV.LS.sph, col="blue")
> lines.variomodel(estV.LS.mat, col="cyan")
> ##and for ML
> lines.variomodel(estV.ML.exp, col="red", lty=2)
> lines.variomodel(estV.ML.gau, col="green", lty=2)
> lines.variomodel(estV.ML.sph, col="blue", lty=2)
> lines.variomodel(estV.ML.mat, col="cyan", lty=2)
An alternative to the envelope computed above is to simulate from the estimated
covariance model and compute an envelope for the empirical variogram. A minor
detail here is that this simulation code in geoR does not handle trends in the
data correctly.
> ##Let's create a constant trend.
> variog.tmp <- estV.trend
> variog.tmp$trend <- "cte"
> model.tmp <- estV.ML.sph
> model.tmp$beta <- 0
> ##and estimate envelope for the well fitting spherical variogram
> estV.ML.sph.env <- variog.model.env(temp.geodata,
obj.variog=variog.tmp,
model.pars=model.tmp, nsim=100)
> ##and similarly for the not so well fitting Matern
> variog.tmp <- estV.trend
> variog.tmp$trend <- "cte"
> model.tmp <- estV.ML.mat
> model.tmp$beta <- 0
> estV.ML.mat.env <- variog.model.env(temp.geodata,
obj.variog=variog.tmp,
model.pars=model.tmp, nsim=100)
We can now plot the results and see if our observed empirical variogram falls
within the uncertainty envelope of the estimated variograms.
> par(mfrow=c(1,2))
> plot(estV.trend, envelope.obj=estV.ML.sph.env)

> lines(estV.ML.sph, col="red")


> plot(estV.trend, envelope.obj=estV.ML.mat.env)
> lines(estV.ML.mat, col="red")

Cross-validation
geoR provides several features for cross-validation. Both leave-one-out
cross-validation and validation using held-out data can be done using xvalid. To
save time, we have chosen here not to reestimate the covariance parameters.
> cv.ML.sph <- xvalid(temp.geodata, reestimate=FALSE,
model=estV.ML.sph)
The cross-validation produces a large number of diagnostic plots that can be
examined (these contain both residuals and standardised residuals).
> par(mfrow=c(3,4), mar=c(3,3,0,1), mgp=c(2,1,0))
> plot(cv.ML.sph)
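The idea behind leave-one-out cross-validation with fixed covariance parameters can be sketched in base R as below. The sketch assumes a zero-mean field with a known exponential covariance and uses simple kriging on simulated toy data, so it is simpler than what xvalid does for the model with a trend; all names are illustrative.

```r
## Toy data (simulated): zero-mean field with known covariance
set.seed(4)
n <- 60
coords <- cbind(runif(n, 0, 10), runif(n, 0, 10))
D <- as.matrix(dist(coords))
Sigma <- 1.0 * exp(-D / 2) + diag(0.1, n)   # assumed: sill 1, range 2, nugget 0.1
z <- as.vector(t(chol(Sigma)) %*% rnorm(n))

## Leave each site out in turn and predict it from the remaining sites,
## keeping the covariance parameters fixed (no reestimation)
loo <- t(sapply(seq_len(n), function(i) {
  w <- solve(Sigma[-i, -i], Sigma[-i, i])            # simple-kriging weights
  c(pred = sum(w * z[-i]),                           # prediction at site i
    var  = Sigma[i, i] - sum(w * Sigma[-i, i]))      # kriging variance
}))

err     <- z - loo[, "pred"]                # cross-validation residuals
std.err <- err / sqrt(loo[, "var"])         # standardised residuals
c(RMSE = sqrt(mean(err^2)), sd.std.err = sd(std.err))
```

If the covariance model is adequate, the standardised residuals should have a standard deviation close to one.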

Prediction on a grid
A common goal of spatial modelling is to compute predictions at unobserved
target locations. Here this will be illustrated by predicting temperature over
the entire US. First let's load the gridded covariates and plot them as images.
> grid <- read.csv("temp1997_grid.csv")
> ##convert data to images for plotting
> elev.Im <- as.image(grid$elevation, grid[,c("long","lat")],
nrow=115, ncol=50)
> dCoast.Im <- as.image(grid$dCoast, grid[,c("long","lat")],
nrow=115, ncol=50)
> ##and plot, adding a map for reference
> par(mfrow=c(1,2))
> image.plot(elev.Im)
> map("world", add=TRUE, col="magenta", lwd=2)
> image.plot(dCoast.Im)
> map("world", add=TRUE, col="magenta", lwd=2)
We need to collect the prediction grid into a geodata-object; since the grid has
no observations, a constant 1 is used as placeholder data.
> grid.geodata <- as.geodata(cbind(1,grid), data.col=1,
coords.col=2:3, covar.col=2:5)
Using the gridded data and estimated parameters we can compute predictions,
> kc <- krige.conv(geodata=temp.geodata,
locations=grid.geodata$coords,
krige=krige.control(type.krige="OK",
trend.d=trend.spatial(estV.trend$trend,
temp.geodata),
trend.l=trend.spatial(estV.trend$trend,
grid.geodata),
obj.model=estV.ML.sph)
)
We can also compute predictions based on the mean value in the Kriging model
> mu <- (trend.spatial(estV.trend$trend, grid.geodata) %*%
estV.ML.sph$beta)
or based only on a regression model
> OLS.temp <- lm(SON ~ long+lat+elevation+dCoast, temp)
> OLS.pred <- predict(OLS.temp, grid)
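The relation between the three predictions (kriging, its mean component, and OLS) can be sketched in matrix form with base R. The example assumes simulated toy data, a known exponential covariance and a single prediction location; the names pred0, mu0 and ols0 are illustrative and the code is not taken from krige.conv.

```r
## Toy data (simulated): linear trend in the coordinates plus correlated noise
set.seed(5)
n <- 50
coords <- cbind(runif(n, 0, 10), runif(n, 0, 10))
X <- cbind(1, coords)                                  # design matrix
Sigma <- exp(-as.matrix(dist(coords)) / 2) + diag(0.1, n)   # assumed known
z <- as.vector(X %*% c(10, 0.5, -1) + t(chol(Sigma)) %*% rnorm(n))

## One prediction location with its covariates
s0 <- c(5, 5)
x0 <- c(1, s0)
## Covariance between the observations and the field at s0
c0 <- exp(-sqrt(colSums((t(coords) - s0)^2)) / 2)

## GLS estimate of the trend coefficients (the "beta" of the kriging model)
Si.X <- solve(Sigma, X)
beta.gls <- solve(t(X) %*% Si.X, t(Si.X) %*% z)

mu0   <- sum(x0 * beta.gls)                            # mean component
pred0 <- mu0 + sum(c0 * solve(Sigma, z - X %*% beta.gls))   # kriging prediction

## OLS ignores the spatial dependence entirely
beta.ols <- solve(t(X) %*% X, t(X) %*% z)
ols0 <- sum(x0 * beta.ols)

c(kriging = pred0, mean = mu0, OLS = ols0)
```

The kriging prediction equals the mean component plus a correction that interpolates the residuals at nearby sites, which is why the two maps differ most close to the observation locations.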
Plot predicted values, prediction uncertainties, and observation locations.

> par(mfrow=c(2,2))
> ##predictions
> image.plot( as.image(kc$predict, grid[,c("long","lat")],
nrow=115, ncol=50))
> map("world", add=TRUE, col="black", lwd=2)
> points(temp$long, temp$lat, pch=19, col="magenta", cex=.5)
> ##standard deviations
> image.plot( as.image(sqrt(kc$krige.var), grid[,c("long","lat")],
nrow=115, ncol=50))
> map("world", add=TRUE, col="black", lwd=2)
> points(temp$long, temp$lat, pch=19, col="magenta", cex=.5)
> ##mean component
> image.plot( as.image(mu, grid[,c("long","lat")],
nrow=115, ncol=50))
> map("world", add=TRUE, col="black", lwd=2)
> points(temp$long, temp$lat, pch=19, col="magenta", cex=.5)
> ##and OLS predictions
> image.plot( as.image(OLS.pred, grid[,c("long","lat")],
nrow=115, ncol=50))
> map("world", add=TRUE, col="black", lwd=2)
> points(temp$long, temp$lat, pch=19, col="magenta", cex=.5)

The end!