The meuse data set: a brief tutorial
for the gstat R package
Edzer Pebesma
March 30, 2016
1 Introduction
The meuse data set provided by package sp comprises measurements of four
heavy metals in the top soil of a flood plain along the river Meuse,
along with a handful of covariates. The process governing heavy metal
distribution seems to be that polluted sediment is carried by the river and
mostly deposited close to the river bank and in areas with low elevation. This
document shows a geostatistical analysis of this data set. The data set was
introduced by Burrough and McDonnell (1998).
This tutorial introduces the functionality of the R package gstat, used in
conjunction with package sp. Package gstat provides a wide range of
univariable and multivariable geostatistical modelling, prediction and simulation
functions, while package sp provides general purpose classes and methods for
defining, importing/exporting and visualizing spatial data.
2 R geostatistics packages
Package gstat (Pebesma, 2004) is an R package that provides basic functionality
for univariable and multivariable geostatistical analysis, including
- variogram modelling, residual variogram modelling, and cross variogram
  modelling by fitting parametric models to sample variograms
- geometric anisotropy specified for each partial variogram model
- restricted maximum likelihood fitting of partial sills
- variogram and cross variogram maps
- simple, ordinary, universal and external drift (co)kriging
- (sequential) Gaussian (co)simulation equivalents for each of the kriging
  varieties
- indicator (co)kriging and sequential indicator (co)simulation
- kriging in a local or global neighbourhood
- block (co)kriging or simulation for each of the varieties, for rectangular or
  irregular blocks
Other geostatistical packages for R usually lack some of these options (e.g. block
kriging, local kriging, or cokriging) but provide others. For example, packages
geoR and geoRglm (by Paulo Ribeiro and Ole Christensen) provide the model-based
geostatistics framework described in Diggle et al. (1998), and package fields
(Doug Nychka and others) provides thin plate spline interpolation, covariance
functions for spherical coordinates (unprojected data), and routines for network
design optimization.
3 Spatial data frames
Package gstat assumes that data are projected, i.e. they should not be provided
as latitude/longitude. As an example, we will look at the meuse data set, which
is a regular data frame that comes with package sp (remove the 88 from the
colour strings to make a plot without alpha transparency on Windows or X11):
> library(sp)
> data(meuse)
> class(meuse)
[1] "data.frame"
> names(meuse)
[1] "x" "y" "cadmium" "copper" "lead" "zinc" "elev"
[8] "dist" "om" "ffreq" "soil" "lime" "landuse" "dist.m"
> coordinates(meuse) = ~x+y
> class(meuse)
[1] "SpatialPointsDataFrame"
attr(,"package")
[1] "sp"
> summary(meuse)
Object of class SpatialPointsDataFrame
Coordinates:
min max
x 178605 181390
y 329714 333611
Is projected: NA
proj4string : [NA]
Number of points: 155
Data attributes:
cadmium copper lead zinc
Min. : 0.200 Min. : 14.00 Min. : 37.0 Min. : 113.0
1st Qu.: 0.800 1st Qu.: 23.00 1st Qu.: 72.5 1st Qu.: 198.0
Median : 2.100 Median : 31.00 Median :123.0 Median : 326.0
Mean : 3.246 Mean : 40.32 Mean :153.4 Mean : 469.7
3rd Qu.: 3.850 3rd Qu.: 49.50 3rd Qu.:207.0 3rd Qu.: 674.5
Max. :18.100 Max. :128.00 Max. :654.0 Max. :1839.0
elev dist om ffreq soil lime
Min. : 5.180 Min. :0.00000 Min. : 1.000 1:84 1:97 0:111
1st Qu.: 7.546 1st Qu.:0.07569 1st Qu.: 5.300 2:48 2:46 1: 44
Median : 8.180 Median :0.21184 Median : 6.900 3:23 3:12
Mean : 8.165 Mean :0.24002 Mean : 7.478
3rd Qu.: 8.955 3rd Qu.:0.36407 3rd Qu.: 9.000
Max. :10.520 Max. :0.88039 Max. :17.000
NA's :2
landuse dist.m
W :50 Min. : 10.0
Ah :39 1st Qu.: 80.0
Am :22 Median : 270.0
Fw :10 Mean : 290.3
Ab : 8 3rd Qu.: 450.0
(Other):25 Max. :1000.0
NA's : 1
> coordinates(meuse)[1:5,]
x y
1 181072 333611
2 181025 333558
3 181165 333537
4 181298 333484
5 181307 333330
> bubble(meuse, "zinc",
+ col=c("#00ff0088", "#00ff0088"), main = "zinc concentrations (ppm)")
[Figure: bubble plot "zinc concentrations (ppm)"; legend classes 113, 198, 326, 674.5, 1839]
and note the following:
1. the function coordinates, when assigned (i.e. on the left-hand side of an
= or <- sign), promotes the data.frame meuse into a SpatialPointsDataFrame,
which knows about its spatial coordinates; coordinates may
be specified by a formula, a character vector, or a numeric matrix or data
frame with the actual coordinates
2. the function coordinates, when not assigned, retrieves the spatial
coordinates from a SpatialPointsDataFrame.
3. plotting functions such as plot and bubble assume that the x- and
y-axis are the spatial coordinates, and choose the aspect ratio of the axes
such that one unit in x equals one unit in y (i.e., data are map data,
projected).
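As a small sketch of points 1 and 2, the following promotes two fresh copies of the meuse data frame, once with a formula and once with a character vector of column names, and then retrieves the coordinates again. The object names m1, m2 and same are illustrative only.

```r
library(sp)

data(meuse)                      # plain data.frame with columns x and y

m1 <- meuse
coordinates(m1) <- ~x+y          # coordinates given as a formula

m2 <- meuse
coordinates(m2) <- c("x", "y")   # coordinates given as column names

# Retrieve the coordinates again and compare the two results
same <- isTRUE(all.equal(unname(coordinates(m1)), unname(coordinates(m2))))
print(class(m1))
print(same)
```

Both forms remove the coordinate columns from the attribute table and store them as the spatial coordinates of the object.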
4 Spatial data on a regular grid
> data(meuse.grid)
> summary(meuse.grid)
x y part.a part.b
Min. :178460 Min. :329620 Min. :0.0000 Min. :0.0000
1st Qu.:179420 1st Qu.:330460 1st Qu.:0.0000 1st Qu.:0.0000
Median :179980 Median :331220 Median :0.0000 Median :1.0000
Mean :179985 Mean :331348 Mean :0.3986 Mean :0.6014
3rd Qu.:180580 3rd Qu.:332140 3rd Qu.:1.0000 3rd Qu.:1.0000
Max. :181540 Max. :333740 Max. :1.0000 Max. :1.0000
dist soil ffreq
Min. :0.0000 1:1665 1: 779
1st Qu.:0.1193 2:1084 2:1335
Median :0.2715 3: 354 3: 989
Mean :0.2971
3rd Qu.:0.4402
Max. :0.9926
> class(meuse.grid)
[1] "data.frame"
> coordinates(meuse.grid) = ~x+y
> class(meuse.grid)
[1] "SpatialPointsDataFrame"
attr(,"package")
[1] "sp"
> gridded(meuse.grid) = TRUE
> class(meuse.grid)
[1] "SpatialPixelsDataFrame"
attr(,"package")
[1] "sp"
> image(meuse.grid["dist"])
> title("distance to river (red = 0)")
> library(gstat)
> zinc.idw = idw(zinc~1, meuse, meuse.grid)
[inverse distance weighted interpolation]
> class(zinc.idw)
[1] "SpatialPixelsDataFrame"
attr(,"package")
[1] "sp"
> spplot(zinc.idw["var1.pred"], main = "zinc inverse distance weighted interpolations")
[Figure: image of meuse.grid["dist"], titled "distance to river (red = 0)"]
[Figure: map "zinc inverse distance weighted interpolations"; colour scale from 200 to 1800]
If you compare the bubble plot of zinc measurements with the map with
distances to the river, it becomes evident that the larger concentrations are
measured at locations close to the river. This relationship can be linearized by
log-transforming the zinc concentrations, and taking the square root of distance
to the river:
> plot(log(zinc)~sqrt(dist), meuse)
> abline(lm(log(zinc)~sqrt(dist), meuse))
[Figure: scatter plot of log(zinc) against sqrt(dist), with the fitted regression line]
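One hedged way to put numbers on the scatter plot is to compare the linear correlation before and after the transformation and to inspect the regression coefficients. The sketch below works on a fresh copy of the plain data frame; the names r_raw, r_trans and fit are illustrative.

```r
library(sp)

data(meuse)   # fresh copy of the plain data frame

# Linear correlation on the original and on the transformed scale
r_raw   <- cor(meuse$zinc, meuse$dist)
r_trans <- cor(log(meuse$zinc), sqrt(meuse$dist))

# The regression line drawn by abline() in the plot
fit <- lm(log(zinc) ~ sqrt(dist), data = meuse)

print(c(raw = r_raw, transformed = r_trans))
print(coef(fit))
```

If the transformation linearizes the relationship as described, the transformed correlation should be the stronger of the two; both are negative because concentrations fall with distance from the river.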
5 Variograms
Variograms are calculated using the function variogram, which takes a formula
as its first argument: log(zinc)~1 means that we assume a constant trend for
the variable log(zinc).
> lzn.vgm = variogram(log(zinc)~1, meuse)
> lzn.vgm
np dist gamma dir.hor dir.ver id
1 57 79.29244 0.1234479 0 0 var1
2 299 163.97367 0.2162185 0 0 var1
3 419 267.36483 0.3027859 0 0 var1
4 457 372.73542 0.4121448 0 0 var1
5 547 478.47670 0.4634128 0 0 var1
6 533 585.34058 0.5646933 0 0 var1
7 574 693.14526 0.5689683 0 0 var1
8 564 796.18365 0.6186769 0 0 var1
9 589 903.14650 0.6471479 0 0 var1
10 543 1011.29177 0.6915705 0 0 var1
11 500 1117.86235 0.7033984 0 0 var1
12 477 1221.32810 0.6038770 0 0 var1
13 452 1329.16407 0.6517158 0 0 var1
14 457 1437.25620 0.5665318 0 0 var1
15 415 1543.20248 0.5748227 0 0 var1
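To make the gamma column concrete, the sketch below computes one semivariance by hand: half the mean squared difference of log(zinc) over all point pairs separated by at most 100 map units. The 100-unit limit and the object names are assumptions chosen for illustration; gstat's default distance classes are different, so the result is not expected to reproduce a row of the table exactly.

```r
library(sp)

data(meuse)                               # plain data frame
xy <- as.matrix(meuse[, c("x", "y")])
z  <- log(meuse$zinc)

d  <- as.matrix(dist(xy))                 # pairwise separation distances
sq <- outer(z, z, "-")^2                  # squared differences of log(zinc)

# Use each pair once (upper triangle) and keep pairs within 100 map units
sel <- upper.tri(d) & d > 0 & d <= 100

np_hand    <- sum(sel)                    # number of point pairs
gamma_hand <- sum(sq[sel]) / (2 * np_hand)
print(c(np = np_hand, gamma = gamma_hand))
```

The same recipe, applied per distance class, is what the np and gamma columns summarize.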
> lzn.fit = fit.variogram(lzn.vgm, model = vgm(1, "Sph", 900, 1))
> lzn.fit
model psill range
1 Nug 0.05066243 0.0000
2 Sph 0.59060780 897.0209
> plot(lzn.vgm, lzn.fit)
[Figure: sample variogram of log(zinc) with the fitted spherical model; x-axis: distance, y-axis: semivariance]
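The fitted model is a nugget plus a spherical component. As a sketch, the code below writes the spherical model out by hand, using the parameter values printed in the lzn.fit output, and compares it with the values gstat returns from variogramLine. The helper sph and the three distances are illustrative choices.

```r
library(gstat)

# Parameter values copied from the lzn.fit output
nug   <- 0.05066243
psill <- 0.59060780
a     <- 897.0209
m <- vgm(psill, "Sph", a, nug)

# Spherical model: rises as 1.5(h/a) - 0.5(h/a)^3 and is flat beyond the range a
sph <- function(h) {
  ifelse(h < a, nug + psill * (1.5 * h / a - 0.5 * (h / a)^3), nug + psill)
}

h <- c(100, 500, 1000)
by_hand  <- sph(h)
by_gstat <- variogramLine(m, dist_vector = h)$gamma
print(cbind(h, by_hand, by_gstat))
```

Beyond the range the semivariance equals the total sill, i.e. the nugget plus the partial sill.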
Instead of the constant mean, denoted by ~1, we can specify a mean function,
e.g. using ~sqrt(dist) as a predictor variable:
> lznr.vgm = variogram(log(zinc)~sqrt(dist), meuse)
> lznr.fit = fit.variogram(lznr.vgm, model = vgm(1, "Exp", 300, 1))
> lznr.fit
model psill range
1 Nug 0.05712231 0.0000
2 Exp 0.17641559 340.3201
> plot(lznr.vgm, lznr.fit)
[Figure: residual sample variogram of log(zinc) with the fitted exponential model; x-axis: distance, y-axis: semivariance]
In this case, the variogram of residuals with respect to a fitted mean function
is shown. Residuals were calculated using ordinary least squares.
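A hedged way to check this is to compute the ordinary least squares residuals explicitly with lm and compare their variogram with the one obtained from the formula interface. The column name res and the object names below are illustrative.

```r
library(sp)
library(gstat)

data(meuse)
# OLS residuals of the mean function, computed on the plain data frame
meuse$res <- residuals(lm(log(zinc) ~ sqrt(dist), data = meuse))
coordinates(meuse) <- ~x+y

v_trend <- variogram(log(zinc) ~ sqrt(dist), meuse)  # residual variogram via the formula
v_resid <- variogram(res ~ 1, meuse)                 # variogram of the precomputed residuals

print(all.equal(v_trend$gamma, v_resid$gamma, tolerance = 1e-6))
```

The two should agree up to numerical precision, because the residuals already have mean zero and the variogram depends only on differences.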
6 Kriging
> lzn.kriged = krige(log(zinc)~1, meuse, meuse.grid, model = lzn.fit)
[using ordinary kriging]
> spplot(lzn.kriged["var1.pred"])
[Figure: map of ordinary kriging predictions of log(zinc), var1.pred; colour scale from 5.0 to 7.5]
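Besides the predictions in var1.pred, the object returned by krige also carries the kriging variance in var1.var. The self-contained sketch below repeats the steps of the earlier sections and derives a kriging standard error on the log scale; the column name lzn.se is an illustrative choice.

```r
library(sp)
library(gstat)

data(meuse)
data(meuse.grid)
coordinates(meuse) <- ~x+y
coordinates(meuse.grid) <- ~x+y
gridded(meuse.grid) <- TRUE

# Same variogram fit and ordinary kriging call as in the text
lzn.fit <- fit.variogram(variogram(log(zinc) ~ 1, meuse), vgm(1, "Sph", 900, 1))
lzn.kriged <- krige(log(zinc) ~ 1, meuse, meuse.grid, model = lzn.fit)

# Kriging standard error of log(zinc)
lzn.kriged$lzn.se <- sqrt(lzn.kriged$var1.var)
print(names(lzn.kriged))
# spplot(lzn.kriged["lzn.se"], main = "kriging standard error, log(zinc)")
```

A map of the standard error typically shows small values near the observation locations and larger values away from them.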
7 Conditional simulation
> lzn.condsim = krige(log(zinc)~1, meuse, meuse.grid, model = lzn.fit,
+ nmax = 30, nsim = 4)
drawing 4 GLS realisations of beta...
[using conditional Gaussian simulation]
> spplot(lzn.condsim, main = "four conditional simulations")
[Figure: "four conditional simulations"; panels sim1, sim2, sim3, sim4; colour scale from 3 to 8]
For universal kriging (UK), using the residual variogram:
> lzn.condsim2 = krige(log(zinc)~sqrt(dist), meuse, meuse.grid, model = lznr.fit,
+ nmax = 30, nsim = 4)
drawing 4 GLS realisations of beta...
[using conditional Gaussian simulation]
> spplot(lzn.condsim2, main = "four UK conditional simulations")
[Figure: "four UK conditional simulations"; panels sim1, sim2, sim3, sim4; colour scale from 4 to 8]
8 Directional variograms
The following command calculates a directional sample variogram, where
directions are binned by direction angle alone. For a point pair, Z(s) and
Z(s+h), the separation vector is h, and it has a direction. Here, we will classify
directions into four direction intervals:
> lzn.dir = variogram(log(zinc)~1, meuse, alpha = c(0, 45, 90, 135))
> lzndir.fit = vgm(.59, "Sph", 1200, .05, anis = c(45, .4))
> plot(lzn.dir, lzndir.fit, as.table = TRUE)
[Figure: directional sample variograms of log(zinc) with the anisotropic model; panels 0, 45, 90, 135; x-axis: distance, y-axis: semivariance]
Looking at directions between 180 and 360 degrees will repeat the image
shown above, because the variogram is a symmetric measure:
$(Z(s) - Z(s+h))^2 = (Z(s+h) - Z(s))^2$.
The first plot gives the variogram in the zero direction, which is North; 90
degrees is East. By default, point pairs are assigned to the directional variogram
panel with their nearest direction, so North contains everything between -22.5
and 22.5 degrees (North-West to North-East). After classifying by direction,
point pairs are binned by separation distance class, as is done in the usual
omnidirectional case.
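The nearest-direction rule can be sketched in a few lines of base R. The helper direction_class below is hypothetical and is not gstat's internal code; it measures angles clockwise from North, treats h and -h as the same direction, and breaks exact ties in favour of the first direction listed.

```r
# Assign a separation vector (dx, dy) to the nearest of the given directions.
# Angles are in degrees clockwise from North; h and -h share a class.
direction_class <- function(dx, dy, alpha = c(0, 45, 90, 135)) {
  ang <- (atan2(dx, dy) * 180 / pi) %% 180
  gap <- abs(outer(ang, alpha, "-"))
  gap <- pmin(gap, 180 - gap)          # wrap around, e.g. 170 is 10 away from 0
  alpha[apply(gap, 1, which.min)]
}

# North, North-East, East, North-West and South separation vectors
classes <- direction_class(dx = c(0, 1, 1, -1, 0), dy = c(1, 1, 0, 1, -1))
print(classes)
```

The South-pointing vector lands in the 0 class, which is why directions between 180 and 360 degrees add no new information.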
In the figure, the partial sill, nugget and model type of the model are equal to
those of the omnidirectional model fitted above; the range is that in the direction
with the largest range (45°), and the anisotropy ratio (0.4) is the range in the 135
direction divided by the range in the 45 direction, estimated “by eye” by comparing
the 45 and 135 degrees sample variograms. Gstat does not fit anisotropy parameters
automatically.
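Under geometric anisotropy the ranges in all directions trace an ellipse. As a sketch, the hypothetical helper effective_range below evaluates that ellipse for the model above (major range 1200 in direction 45, ratio 0.4); it is a geometric illustration in base R, not a call into gstat.

```r
# Effective range in direction theta (degrees clockwise from North) for a
# geometrically anisotropic model with major range a in direction phi and
# anisotropy ratio r (minor range divided by major range)
effective_range <- function(theta, a = 1200, phi = 45, r = 0.4) {
  delta <- (theta - phi) * pi / 180
  a * r / sqrt(r^2 * cos(delta)^2 + sin(delta)^2)
}

ranges <- effective_range(c(0, 45, 90, 135))
names(ranges) <- c("0", "45", "90", "135")
print(ranges)   # 1200 in direction 45, 480 in direction 135
```

The 0 and 90 directions lie symmetrically on either side of the major axis and therefore share the same intermediate range.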
We do not claim that the model fitted here is “best” in some way; in order to
get to a better model we may want to look at more directions, at other directions
(e.g. try alpha = c(22, 67, 112, 157)), and at variogram maps (see below).
More elaborate approaches may use directions in three dimensions, and may
further control the direction tolerance (which may be set such that direction
intervals overlap).
For the residual variogram from the linear regression model using sqrt(dist)
as covariate, the directional dependence is much less obvious; the fitted model
here is the fitted isotropic model (equal in all directions).
> lznr.dir = variogram(log(zinc)~sqrt(dist), meuse, alpha = c(0, 45, 90, 135))
> plot(lznr.dir, lznr.fit, as.table = TRUE)
[Figure: directional residual sample variograms with the fitted isotropic model; panels 0, 45, 90, 135; x-axis: distance, y-axis: semivariance]
9 Variogram maps
Another means of looking at directional dependence in semivariograms is
obtained by looking at variogram maps. Instead of classifying point pairs Z(s)
and Z(s + h) by direction and distance class separately, we can classify them
jointly. If h = (x, y) denotes the two-dimensional coordinates of the separation
vector, then in the variogram map the semivariance contribution of each point pair,
$(Z(s) - Z(s+h))^2$, is attributed to the grid cell in which h lies. The map is
centered around (0, 0), as h is geographical distance rather than geographical
location. Cutoff and width correspond to some extent to map extent and cell
size; the semivariance map is point symmetric around (0, 0), as
$\gamma(h) = \gamma(-h)$.
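The joint classification can be sketched in base R by binning every separation vector into a grid cell and averaging half the squared differences per cell. The cell layout below (100-unit cells centred on multiples of 100, out to 14 cells in each direction) is an assumption for illustration and need not coincide with the cells gstat uses.

```r
library(sp)

data(meuse)   # plain data frame

# OLS residuals of log(zinc) on sqrt(dist), as in the residual variogram
z <- residuals(lm(log(zinc) ~ sqrt(dist), data = meuse))

# Separation vector h = (dx, dy) and squared difference for every ordered pair
dx <- outer(meuse$x, meuse$x, "-")
dy <- outer(meuse$y, meuse$y, "-")
sq <- outer(z, z, "-")^2

# Cell index of h on a grid of 100-unit cells centred on (0, 0)
width <- 100
ix <- round(dx / width)
iy <- round(dy / width)
keep <- row(dx) != col(dx) & abs(ix) <= 14 & abs(iy) <= 14

cells <- -14:14
gamma_map <- tapply(0.5 * sq[keep],
                    list(factor(ix[keep], levels = cells),
                         factor(iy[keep], levels = cells)),
                    mean)

# Point symmetry: rotating the map by 180 degrees should leave it unchanged
symmetric <- isTRUE(all.equal(unname(gamma_map), unname(gamma_map[29:1, 29:1])))
print(symmetric)
```

Because each pair contributes once as h and once as -h, the cell at (dx, dy) and the cell at (-dx, -dy) average the same values.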
> vgm.map = variogram(log(zinc)~sqrt(dist), meuse, cutoff = 1500, width = 100,
+ map = TRUE)
> plot(vgm.map, threshold = 5)
[Figure: variogram map of the log(zinc) residuals (var1); axes dx and dy from −1000 to 1000; colour scale from 0.0 to 0.8]
The threshold ensures that only semivariogram map values based on at least
5 point pairs are shown, removing estimates that are too noisy.
10 Cross variography
The following fits a linear model of coregionalization to the four
log-transformed metal concentrations, each with sqrt(dist) as predictor:
> g = gstat(NULL, "log(zn)", log(zinc)~sqrt(dist), meuse)
> g = gstat(g, "log(cd)", log(cadmium)~sqrt(dist), meuse)
> g = gstat(g, "log(pb)", log(lead)~sqrt(dist), meuse)
> g = gstat(g, "log(cu)", log(copper)~sqrt(dist), meuse)
> v = variogram(g)
> g = gstat(g, model = vgm(1, "Exp", 300, 1), fill.all = TRUE)
> g.fit = fit.lmc(v, g)
> g.fit
data:
log(zn) : formula = log(zinc)`~`sqrt(dist) ; data dim = 155 x 12
log(cd) : formula = log(cadmium)`~`sqrt(dist) ; data dim = 155 x 12
log(pb) : formula = log(lead)`~`sqrt(dist) ; data dim = 155 x 12
log(cu) : formula = log(copper)`~`sqrt(dist) ; data dim = 155 x 12
variograms:
model psill range
log(zn)[1] Nug 0.05141798 0
log(zn)[2] Exp 0.17556219 300
log(cd)[1] Nug 0.39996573 0
log(cd)[2] Exp 0.47893816 300
log(pb)[1] Nug 0.04770893 0
log(pb)[2] Exp 0.21323027 300
log(cu)[1] Nug 0.04577523 0
log(cu)[2] Exp 0.07827374 300
log(zn).log(cd)[1] Nug 0.09190848 0
log(zn).log(cd)[2] Exp 0.24542024 300
log(zn).log(pb)[1] Nug 0.04528367 0
log(zn).log(pb)[2] Exp 0.18407011 300
log(cd).log(pb)[1] Nug 0.06425412 0
log(cd).log(pb)[2] Exp 0.25525359 300
log(zn).log(cu)[1] Nug 0.02912806 0
log(zn).log(cu)[2] Exp 0.10438748 300
log(cd).log(cu)[1] Nug 0.09441635 0
log(cd).log(cu)[2] Exp 0.13073936 300
log(pb).log(cu)[1] Nug 0.02369778 0
log(pb).log(cu)[2] Exp 0.10267516 300
> plot(v, g.fit)
> vgm.map = variogram(g, cutoff = 1500, width = 100, map = TRUE)
> plot(vgm.map, threshold = 5, col.regions = bpy.colors(), xlab = "", ylab = "")
[Figure: direct and cross sample variograms with the fitted linear model of coregionalization; panels log(zn), log(cd), log(pb), log(cu) and all pairwise cross variograms; x-axis: distance, y-axis: semivariance]
[Figure: direct and cross variogram maps for log(zn), log(cd), log(pb) and log(cu); axes from −1000 to 1000; colour scale from 0.0 to 2.0]
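A linear model of coregionalization is only valid if, for each model component, the matrix of partial sills is positive semi-definite. As a hedged check, the sketch below rebuilds the nugget and exponential sill matrices from the values printed in the g.fit output (variable order zn, cd, pb, cu) and inspects their eigenvalues; the helper sill_matrix is hypothetical.

```r
vars <- c("zn", "cd", "pb", "cu")

# Build a symmetric 4 x 4 matrix from direct sills (diagonal) and cross sills,
# given in the order zn.cd, zn.pb, zn.cu, cd.pb, cd.cu, pb.cu
sill_matrix <- function(direct, cross) {
  m <- diag(direct)
  m[upper.tri(m)] <- cross[c(1, 2, 4, 3, 5, 6)]  # column-wise upper triangle
  m[lower.tri(m)] <- t(m)[lower.tri(m)]
  dimnames(m) <- list(vars, vars)
  m
}

# Values copied from the g.fit output
nug <- sill_matrix(
  direct = c(0.05141798, 0.39996573, 0.04770893, 0.04577523),
  cross  = c(0.09190848, 0.04528367, 0.02912806, 0.06425412, 0.09441635, 0.02369778))
exp_sill <- sill_matrix(
  direct = c(0.17556219, 0.47893816, 0.21323027, 0.07827374),
  cross  = c(0.24542024, 0.18407011, 0.10438748, 0.25525359, 0.13073936, 0.10267516))

ev_nug <- eigen(nug, symmetric = TRUE)$values
ev_exp <- eigen(exp_sill, symmetric = TRUE)$values
print(rbind(nugget = ev_nug, exponential = ev_exp))
```

Eigenvalues that are all non-negative (up to rounding of the printed values) indicate that both components are admissible.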
References
- Burrough, P.A., R.A. McDonnell, 1998. Principles of Geographical
  Information Systems, 2nd Edition. Oxford University Press.
- Diggle, P.J., J.A. Tawn, R.A. Moyeed, 1998. Model-based geostatistics.
  Applied Statistics 47(3), pp 299-350.
- Pebesma, E.J., 2004. Multivariable geostatistics in S: the gstat package.
  Computers & Geosciences 30: 683-691.
- Wackernagel, H., 1998. Multivariate Geostatistics; an introduction with
  applications, 2nd edn., Springer, Berlin, 291 pp.

microwave assisted reaction. General introduction
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application )
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
