Advancing Spatio-temporal Analysis of
Ecological Data: Examples in R
Tomislav Hengl¹, Emiel van Loon¹, Henk Sierdsema², and Willem Bouten¹
¹ Research Group on Computational Geo-Ecology (CGE), University of Amsterdam,
Amsterdam, The Netherlands
hengl@science.uva.nl
http://www.science.uva.nl/ibed-cge
² SOVON Dutch Centre for Field Ornithology, Beek-Ubbergen, The Netherlands
Abstract. The article reviews the main principles of running geo-computations
in ecology, as illustrated with case studies from the EcoGRID and FlySafe
projects, and emphasizes the advantages of using the R computing environment
as the most attractive programming/scripting environment. Three case studies
(including R code) of interest to ecological applications are described:
(a) analysis of GPS trajectory data for two gull species; (b) species
distribution mapping in space and time for a bird species (sedge warbler;
EcoGRID project); and (c) change detection using time-series of maps. The case
studies demonstrate that R, together with its numerous packages for spatial
and geostatistical analysis, is a well-suited tool to produce quality outputs
(maps, statistical models) of interest in Geo-Ecology. Moreover, due to the
recent implementation of the maptools and sp packages, such outputs can be
easily exported to popular geographical browsers such as Google Earth. The key
computational challenges recognized for Computational Geo-Ecology were:
(1) solving the problem of input data quality (filtering techniques),
(2) solving the problem of computing with large data sets, (3) improving the
over-simplistic statistical models, and (4) producing outputs of increasingly
higher level of detail.
1 Introduction
Computational Geo-Ecology is an emerging scientific sub-field of Ecology that
focuses on development and testing of computational tools that can be used to
extract spatio-temporal information on the dynamics of complex geo-ecosystems.
It evolved as a combination of three scientific fields: (a) Ecology, as it focuses
on interactions between species and abiotic factors; (b) Statistics, as it implies
quantitative analysis of field and remote sensing data; and (c) Geoinformation
Science, as all variables are spatially referenced and outputs of analyses are
commonly maps. The importance of this topic has been recognized at the Institute
for Biodiversity and Ecosystem Dynamics, University of Amsterdam, where a
research group on Computational Geo-Ecology (CGE) has been established. It
comprises about 20 researchers, PhD students and supporting staff mainly with
backgrounds in physical geography, computer sciences, ecology and geosciences.
O. Gervasi et al. (Eds.): ICCSA 2008, Part I, LNCS 5072, pp. 692–707, 2008.
© Springer-Verlag Berlin Heidelberg 2008
[Figure 1: workflow diagram of ECOGRID.NL. Inputs: NDFF observations (species:
dragonflies, plants, fish, fungi, mollusca, mammals, butterflies, moss &
lichens, birds; auxiliary data: geographical location, date, landscape,
taxonomy, socio-economic data; metadata: taxonomy, lineage, contact
information, data quality), query parameters (species, period, area, type of
analysis, outputs) and base maps of ecological conditions (distance to
man-made objects, distance to water and food supplies, land use, hydrology,
climate, geology). Processing: spatio-temporal data mining, density
estimation, geostatistical analysis, trend analysis and change detection,
habitat mapping, error propagation, interactive visualization. Reports:
summary statistics, distribution maps, change indices, biodiversity indices,
home range, scenario testing.]
Fig. 1. Workflow scheme and main components of the EcoGRID. Concrete case
studies from the EcoGRID follow in Sec. 2.3 and 2.4.
The key objective of this group is to develop and apply computational tools¹
that implement theoretical models of complex geo-ecosystems calibrated by field
observations and remote sensing data, and that can be used to perform various
tasks: from spatio-temporal data mining to analysis and decision making.
CGE is, at the moment, actively involved in two research projects:
EcoGRID and ESA-Flysafe. EcoGRID (www.ecogrid.nl) is a national project
that supports the growing Dutch Flora and Fauna Database (NDFF), which
contains about 20 million field records of more than 3000 species registered
in the Dutch Species Catalogue (www.nederlandsesoorten.nl). EcoGRID aims at
providing researchers, policy-makers and stakeholders with relevant
information, including distribution maps, distribution change indices,
biodiversity indices and estimated outcomes of scenario-testing models [1].
To achieve this, a set of general analysis procedures is being implemented
and tested, ranging from spatio-temporal data mining, density estimation,
geostatistical analysis, trend analysis and change detection to habitat
mapping, error propagation and interactive visualization techniques (Fig. 1).
EcoGRID is the Dutch segment of the recent pan-European initiative called
“LifeWatch” (www.lifewatch.eu), which aims at building a very large
infrastructure (virtual laboratories) to support sharing of the knowledge and
tools to monitor biodiversity over Europe.
¹ By ‘tools’ we mainly refer to various software solutions: stand-alone
packages, plug-ins/packages and toolboxes, software-based scripts,
web-applications and computational schemes.
ESA-Flysafe is a project precursor to the Avian Alert initiative (www.avian-
alert.eu), a potential integrated application promotion programme (IAP) of
the European Space Agency. CGE has already successfully implemented a
national project called BAMBAS (www.bambas.ecogrid.nl), which is now used
as a decision support tool by the Royal Netherlands Air Force to reduce the risk
of bird-aircraft collisions [2]. The objective of Flysafe is to integrate
multi-source data into a virtual laboratory, in order to provide predictions
and forecasts of bird migration (bird densities, species structure, altitudes,
vectors and velocities) at different scales in space and time [2].
This paper reviews the most recent activities of the CGE group, discusses
limitations and opportunities of using various algorithms and sets a research
agenda for the coming years. This is all illustrated with a selection of real case
studies, as implemented in the R computing environment. Our idea was not to
produce an R tutorial for spatial data analysis, but to demonstrate some common
processing steps and then emphasize advantages of running computations in R.
2 Examples in R
2.1 Why R?
“From a period in which geographic information systems, and later
geocomputation and geographical information science, have been agenda setters,
there seems to be interest in trying things out, in expressing ideas in code, and
in encouraging others to apply the coded functions in teaching and applied
research settings.”
Roger Bivand [3]
The three most attractive computing environments to develop and implement
computational schemes used in Computational Geo-Ecology are R
(www.r-project.org), MATLAB (www.mathworks.com) and Python (www.python.org).
The first offers less support and instruction to beginners, the second has
more basic utilities and is easier to use, and the third is the most popular
environment for software development. Although all three are high-level
languages with extensive user communities that interact and share code
willingly, R seems to be the most attractive candidate for implementation of
algorithms of interest to CGE. Bivand [3] recognizes three main opportunities
for using R: (1) vitality and high speed of development of R, (2) academic
openness of developers and their willingness to collaborate, and
(3) increasing sympathy for spatial data analysis and visualization. Our main
reasons to select R for our projects are:
⋆ R supports various GIS formats via the rgdal package, including the export
functionality of vector layers and plots to Google Earth (maptools package).
⋆ R offers a much larger family of methods for spatio-temporal analysis (point
pattern analysis, spatial interpolation and simulations, spatio-temporal trend
analysis) than MATLAB.
⋆ Unlike MATLAB, R is open-source software; it requires no additional
investment and is easy to install and update.
Several authors have recently drawn attention to new R packages used for
spatial data analysis: [4, 5] promote the gstat and sp packages, which
together offer a variety of geostatistical analyses; [6, 3] review spatial
data analysis packages in general, with special focus on the maptools and
GRASS packages, which have been established as the most comprehensive links
between statistical and GIS computing; and [7] presents the spatstat package
for analysis of point patterns. We should also add to this list: RSAGA (link
to the SAGA GIS), spsurvey (spatial sampling), geoR (geostatistical analysis),
splancs (spatial point pattern analysis), and the specialized ecological data
analysis packages adehabitat [8], GRASP and BIOMOD, which support spatial
prediction of point-sampled variables using GLM/GAMs and export to GIS. For an
update on the most recent activities connected with the development of spatial
analysis tools in R, you can at any time subscribe to the R-sig-Geo mailing
list and witness the evolution.
A limitation of R is that it does not provide dynamically linked visualization
and user-friendly data exploration. This might frustrate users who wish to zoom
into spatial layers, visually explore patterns and overlay multiple layers.
However, due to the recent implementation of the maptools, rgdal and sp
packages, outputs of spatial/statistical analysis in R can be exported to free
geographic browsers such as Google Earth. Google Earth is freeware built around
an XML-based language (KML) that has, with its intelligent indexing of very
large datasets combined with an open architecture for integrating and
customizing new data, revolutionized the meaning of the word “geoinformation”.
By combining the computational power of R with the visualization possibilities
of Google Earth, one creates a complete system.
The following sections demonstrate the use of R scripting to perform various
analyses, including the export to Google Earth. We are not able to display the
complete scripts; instead we zoom into specific processing steps that might
be of interest to researchers unfamiliar with R. For a detailed introduction to
spatial analysis in R, please refer to the recent books by [9] and [10], and
various lecture notes [11; 12].
2.2 Analysis of GPS Trajectory Data
The objective of this exercise is to analyze the movement of two gull species:
lesser black-backed gull (Larus fuscus), further referred to as LBG, and
European herring gull (Larus argentatus Pontoppidan), further referred to as
HG. For the analysis, we use the GPS readings of receivers attached to a total
of 23 individual birds. The birds were released on 1 June 2007 in the region of
Vlieland, the Netherlands, and recordings were collected until 24 October 2007.
A map of trajectories is shown in Fig. 2a. We are interested to see where gulls
forage and rest, whether they follow specific paths, how fast they move over an
area, and whether there is a relationship between activity centers and
landscape.
We can import the raw table data to R using:
> gulls <- read.delim("gulls.txt")
This shows the following structure:
'data.frame': 13530 obs. of 6 variables:
 $ BIRDID   : Factor w/ 23 levels "ID41745","ID41747",..: 1 1 1 1 1 1 ...
 $ LATITUDE : num 53.2 53.3 53.3 53.2 53.2 ...
 $ LONGITUDE: num 4.93 4.95 4.96 5.00 5.04 ...
 $ SPECIES  : Factor w/ 2 levels "HG","LBG": 2 2 2 2 2 2 2 2 2 2 ...
 $ SEX      : Factor w/ 2 levels "F","M": 2 2 2 2 2 2 2 2 2 2 ...
 $ TIME     : POSIXct, format: "2007-06-01 07:00:00" "2007-06-01 09:00:00"
We first want to calculate the velocity of birds moving from one location to
the next. To do this, we need to reproject the geographical coordinates to a
cartesian system, so that distances in the x and y directions are expressed in
the same metric units. We start by attaching the coordinates and the coordinate
system (sp package):
> library(sp)
> coordinates(gulls) <- ~LONGITUDE+LATITUDE
> proj4string(gulls) <- CRS("+proj=longlat +datum=WGS84")
which will coerce the table into a SpatialPointsDataFrame, i.e. an R spatial
layer. Now we can transform the coordinates to the European Terrestrial
Reference System (www.euref.eu) using:
> gulls.laea <- spTransform(gulls, CRS("+init=epsg:3035"))
where spTransform is the rgdal method to reproject the coordinates, and epsg
is the European Petroleum Survey Group’s registry code (see
www.epsg-registry.org) of the ETRS coordinate system. Once the coordinates are
in a metric system, we can derive distances and velocities (in km/h) from point
to point, by dividing the distance vector by time. This will attach to each
point location an estimate of the velocity (VELOC) for an individual bird.
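The velocity step itself is not listed in the text. A minimal sketch in base R, assuming a plain data frame of already projected coordinates (the table `fixes`, the helper `step_speed` and all values are hypothetical and not part of the original script), could look like this:

```r
# Hypothetical projected fixes (metres) for two birds; values are made up.
fixes <- data.frame(
  BIRDID = c("A", "A", "A", "B", "B"),
  x = c(0, 3000, 3000, 10000, 16000),
  y = c(0, 4000, 10000, 5000, 13000),
  TIME = as.POSIXct(c("2007-06-01 07:00:00", "2007-06-01 09:00:00",
                      "2007-06-01 10:00:00", "2007-06-01 07:00:00",
                      "2007-06-01 08:00:00"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# Speed (km/h) of the step that ends at each fix; NA for the first fix.
step_speed <- function(x, y, time) {
  dist_km <- sqrt(diff(x)^2 + diff(y)^2) / 1000
  dt_h <- as.numeric(diff(time), units = "hours")
  c(NA, dist_km / dt_h)
}

# Sort by bird and time, then compute the speeds bird by bird so that
# no step is measured between two different individuals.
fixes <- fixes[order(fixes$BIRDID, fixes$TIME), ]
fixes$VELOC <- NA_real_
for (id in unique(fixes$BIRDID)) {
  sel <- fixes$BIRDID == id
  fixes$VELOC[sel] <- step_speed(fixes$x[sel], fixes$y[sel], fixes$TIME[sel])
}
```

Computing the steps per bird matters: a plain `diff` over the whole table would also produce a spurious "step" between the last fix of one bird and the first fix of the next.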
At this stage, we want to separate the analysis for the two species. This can
be done in a few steps, e.g. for the LBG species:
> gulls.laea$LBG <- ifelse(gulls.laea$SPECIES=="LBG",
    ifelse(is.na(gulls.laea@data$VELOC), NA, TRUE), NA)
> LBG <- !is.na(gulls.laea$LBG)
We proceed with interpolating the velocities estimated at point locations. For
this, we can use the gstat package [4]. The variograms can be visualized and
fitted using:
> library(gstat)
> LBG.points <- remove.duplicates(gulls.laea[LBG,], zero=0.0, remove.second=TRUE)
> plot(variogram(log1p(log1p(VELOC))~1, LBG.points, cutoff=40000))
> LBG.ovar <- variogram(log1p(log1p(VELOC))~1, LBG.points, cutoff=40000)
> LBG.ovgm <- fit.variogram(LBG.ovar, vgm(0.4, "Exp", range=30000, 0.5))
where remove.duplicates is the function that removes duplicate points and
prevents the package from running into computational problems, variogram is the
gstat command to calculate semivariances for a given target variable (VELOC),
log1p(x) computes log(1+x) (applied twice in this case), fit.variogram is the
gstat command used to fit the variogram using re-weighted least squares, and
cutoff is the maximum distance of interest.

[Figure 2: four map panels (a) to (d); legends: velocity in km/h, 0.366 to
14.747; HSI, 0.0 to 100.0.]
Fig. 2. Spatio-temporal analysis of GPS trajectory data for lesser black-backed
gull (Larus fuscus): (a) observed trajectories for a total of 14 individual
birds; (b) interpolated velocities; (c) Habitat Suitability Index derived using
mean annual EVI, topographic wetness index and night lights image; (d) 3D
kernel density aggregated for the period June–August 2007.

Once we have fitted the variogram, we can interpolate the values over the whole
area of interest using e.g. ordinary kriging:
> LBG.g <- gstat(id=c("VELOC"), formula=log1p(log1p(VELOC))~1,
    data=LBG.points, model=LBG.ovgm, nmax=60)
> LBGspeed.OK <- predict(LBG.g, newdata=maskLBG)
where gstat is the function that builds the object used to run predictions and
simulations, maskLBG is a SpatialGridDataFrame defining the prediction
locations, and nmax=60 is the maximum number of nearest observations used for
each prediction (it is set in the gstat call because predict ignores it). The
resulting map can be seen in Fig. 2b.
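To make the variogram parameters more concrete, the exponential model can be written out by hand. This is only an illustrative sketch in base R: the function `exp_variogram` is hypothetical, the parameter values are the starting values passed to vgm in the text (not the fitted ones), and the parameterization is assumed to be the one gstat uses for its "Exp" model.

```r
# Exponential semivariogram: nugget + partial sill * (1 - exp(-h / range)),
# with semivariance 0 at zero distance.
exp_variogram <- function(h, psill, range, nugget = 0) {
  ifelse(h > 0, nugget + psill * (1 - exp(-h / range)), 0)
}

h <- c(0, 10000, 30000, 40000)   # lag distances in metres
gamma0 <- exp_variogram(h, psill = 0.4, range = 30000, nugget = 0.5)
```

Under this parameterization the range parameter is not the distance at which the curve levels off: the model reaches about 95% of the partial sill only at three times the range parameter, which is worth keeping in mind when comparing it with the cutoff of 40 km.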
We are next interested to see how the movement of the gulls is connected with
environmental conditions, i.e. relief, urban development and landscape in
general. For this, we use three 1 km maps: (1) mean annual EVI image
(pcevi1lbg) representing the mean biomass over a terrain, (2) topographic
wetness index derived from the 1 km DEM (twilbg) representing relief, and
(3) night lights image (nlightslbg) representing urban development [11]. These
can be loaded to R via the adehabitat package:
> library(adehabitat)
> pcevi1lbg.asc <- import.asc("pcevi1lbg.asc")
> twilbg.asc <- import.asc("twilbg.asc")
> nlightslbg.asc <- import.asc("nlightslbg.asc")
> LBG.maps <- as.kasc(list(dem=demlbg.asc, pcevi1lbg=pcevi1lbg.asc,
    twilbg=twilbg.asc, nlightslbg=nlightslbg.asc))
The maps and locations where the LBG species has been observed can be packed
together and used to run Ecological Niche Factor Analysis [8]:
> LBG.hab <- data2enfa(LBG.maps, gulls.laea[LBG,]@coords)
> enfa.LBG <- enfa(dudi.pca(LBG.hab$tab, scannf=FALSE), LBG.hab$pr, nf=2)
> LBG.dist <- predict(enfa.LBG, LBG.hab$index, LBG.hab$attr)
where data2enfa will combine the grids and observation locations, enfa is a
method to run Ecological Niche Factor Analysis and LBG.dist are the output
distances from the barycentre of the niche [8]. The final Habitat Suitability
Index (0–100%) can be seen in Fig. 2c. This shows that LBG birds, based on this
trajectory data, systematically avoid mountain chains and big urban areas.
By using the package splancs [13; 3], we can also estimate the space-time
kernel density for different time intervals. A space-time (3D) kernel filter
can be run by defining: coordinates of the points and time reference (x, y, z),
grid of interest, and search radius.
> LBG.densnoTime <- kernel3d(pts=LBG.points@coords, times=LBG.points$CTIME,
    xgr=seq(maskLBG@bbox["x","min"], maskLBG@bbox["x","max"],
    maskLBG@grid@cellsize[1]), ygr=seq(maskLBG@bbox["y","min"],
    maskLBG@bbox["y","max"], maskLBG@grid@cellsize[2]),
    zgr=seq(3608,8000,72), hxy=20000, hz=168)
which will produce 61 maps of kernel-smoothed densities for 3-day periods. The
output is basically a space-time cube, i.e. a series of 61 grid maps. Note that
such calculations can be time-consuming, thus we recommend that you test the
script using relatively small data sets first, and then proceed with the real
case studies. The summary density map estimated using the kernel3d method,
aggregated over the period June to August, is shown in Fig. 2d.
2.3 Generation of Distribution Maps
The following section demonstrates how to connect to an on-line ecological
database (ecogrid.nl), run queries, generate distribution maps and export them
to geographical browsers such as Google Earth. We start by setting up a new
ODBC connection on our Windows machine², and then connecting to it from R
using the RODBC package:
> library(RODBC)
> ecogrid.conn <- odbcConnect(dsn="ecogrid.nl", connection="sovon-ecogrid",
    case="postgresql")
which will create an R definition of the connection. We can now run a query e.g.
to fetch all observations of breeding pair counts of the bird species sedge
warbler (Acrocephalus schoenobaenus):
> Acrocephalus.tbl <- sqlQuery(ecogrid.conn, query=paste("SELECT o.countmin,
    o.countmax, o.counttype, x(centroid(l.the_geom)), y(centroid(l.the_geom)),
    o.timestart, o.timestop, o.timetype FROM survey.observations o,
    survey.locations l, taxonomy.taxa t, taxonomy.taxa p WHERE o.locid=l.locid
    AND o.taxid=t.taxid AND t.parent_id=p.taxid AND p.taxon ILIKE
    'Acrocephalus' AND t.taxon ILIKE 'schoenobaenus';"))
where o.countmin, o.countmax are the observed counts of the breeding pairs,
x(centroid(l.the_geom)), y(centroid(l.the_geom)) are the coordinates of
the center of the observation plots and o.timestart, o.timestop are the times
of the beginning and the end of observation. As a result of the query, we
obtain 16,028 observations (Fig. 3a):
> str(Acrocephalus.tbl)
'data.frame': 16028 obs. of 8 variables:
 $ countmin : int 0 0 0 0 0 0 0 0 5 8 ...
 $ countmax : int 0 0 0 0 0 0 0 0 5 8 ...
 $ counttype: Factor w/ 1 level "n": 1 1 1 1 1 1 1 1 1 1 ...
 $ x        : num 115663 115663 115663 115663 115663 ...
 $ y        : num 425971 425971 425971 425971 425971 ...
 $ timestart: POSIXct, format: "1984-04-20" "1985-04-20" ...
 $ timestop : POSIXct, format: "1984-08-16" "1985-08-16" ...
 $ timetype : Factor w/ 1 level "f": 1 1 1 1 1 1 1 1 1 1 ...
² Under Control panel → Administrative tools → ODBC Data Source Administration
→ Add, then enter the server address, port, username and password used to
connect to the server.
[Figure 3: panels (a) to (c); panel title "Acrocephalus schoenobaenus in 2000";
panel (c) shows maps for 1984, 1992 and 2000 with a legend in no/km², 0.0 to
5.0.]
Fig. 3. Automated generation of distribution maps: (a) observed number of
breeding pairs of Acrocephalus schoenobaenus in year 2000; (b) as shown in
Google Earth; (c) breeding pair densities for years 1984, 1992 and 2000 derived
using regression-kriging over a 1 km grid.
Before we can proceed with generation of the distribution maps, we need to
convert the time-coordinates to a linear system. For example, we can consider
using the cumulative number of days since 1970-01-01:
> Acrocephalus.tbl$ctime <- floor(unclass(Acrocephalus.tbl$timestart)/86400) +
    ( floor(unclass(Acrocephalus.tbl$timestop)/86400) -
    floor(unclass(Acrocephalus.tbl$timestart)/86400) )
Here we use the command unclass to get the time as a numeric vector³. For
example, the date-time value 1987-04-20 (W. Europe Standard Time) corresponds
to a value of 6317 (days since 1970-01-01).
³ This will convert time values to the number of seconds since the beginning
of 1970; dividing by 86400 (seconds per day) gives days. This way we can run
statistical analysis with such data.
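The conversion can be checked on a single value. The following sketch uses only base R and fixes the time zone to UTC so that the result does not depend on the local settings of the machine; the variable names are illustrative.

```r
# Midnight of 1987-04-20 in UTC
t_utc <- as.POSIXct("1987-04-20 00:00:00", tz = "UTC")

secs <- as.numeric(t_utc)     # seconds since 1970-01-01 00:00:00 UTC
days <- floor(secs / 86400)   # whole days since 1970-01-01
days
# 6318
```

In UTC the result is 6318. The same calendar date taken at local midnight in a time zone ahead of UTC still falls on the previous UTC day, which is consistent with the value 6317 quoted above for W. Europe Standard Time.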
Next, we import the predictor maps of the Netherlands that can be used to
map the distribution of this bird:
> gridmaps <- readGDAL("dheight.asc")
> names(gridmaps)[[1]] <- "dheight"
> gridmaps$dtm <- readGDAL("dtm.asc")$band1
> gridmaps$freat1 <- readGDAL("freat1.asc")$band1
> gridmaps$lgn3dsee <- readGDAL("lgn3dsee.asc")$band1
> gridmaps$sltdch1 <- readGDAL("sltdch1.asc")$band1
> gridmaps$t10nhuis <- readGDAL("t10nhuis.asc")$band1
> gridmaps$t10ntree <- readGDAL("t10ntree.asc")$band1
> proj4string(gridmaps) <- CRS("+init=epsg:28992")
where dheight is the layer showing the height of canopy, dtm is the
LiDAR-derived Digital Elevation Model, freat1 is the map showing duration of
drainage, lgn3dsee is the distance from the coast line, sltdch1 is the density
of the primary water course, t10nhuis is the density of buildings from the
1:10k topomaps, t10ntree is the density of trees, and epsg:28992 is the EPSG ID
of the Dutch coordinate system. Note that this gridmaps data set is fairly
large as each layer consists of 910,000 grids.
We can select a specific year/period and subset the original data set, e.g. to
year 2000:
> Acrocephalus.2000 <- subset(Acrocephalus.tbl,
    as.integer(format(Acrocephalus.tbl$timestop, "%Y"))==2000)
and plot the values of bird counts using (Fig. 3a):
> coordinates(Acrocephalus.2000) <- ~x+y
> bubble(Acrocephalus.2000["count"], scales=list(draw=TRUE),
    main="Acrocephalus schoenobaenus in 2000",
    sp.layout=list("sp.lines", col="black", NLborders))
The point maps (also lines and polygons) can be exported and viewed in
Google Earth using the writeOGR command:
> Acrocephalus.2000.latlong <- spTransform(Acrocephalus.2000, CRS("+proj=longlat"))
> writeOGR(Acrocephalus.2000.latlong["count"], "Acrocephalus_2000.kml",
    "Acrocephalus_schoenobaenus2000", "KML")
which will produce an image as shown in Fig. 3b.
A list of distribution maps can be generated by using the regression-kriging
technique, as implemented in the gstat package [4; 11]. We first need to
overlay rasters and points and then attach the values of predictors to the
original data frame. To speed up data processing, we can also loop the
operations: linear regression, then step-wise regression, then variogram
modeling, and final interpolation using regression-kriging. Note that, because
all operations are automated, you might get some strange results, either in the
whole map or at specific grid nodes, so it is a good idea to inspect/visualize
all fitted models and output maps before you proceed with interpretation of the
final outputs.
The final results of interpolation (Fig. 3c) show that the density of
Acrocephalus schoenobaenus in the Netherlands has been increasing over the
last 10–15 years. How distinct the change in breeding pair counts is will be
examined in the following exercise.
2.4 Trend Analysis and Change Detection
In the last example we demonstrate how robust linear regression can be used
to map the rate of change using a time-series of distribution maps. For this we
use a series of 17 distribution maps derived in the previous exercise:
Acrocephalus schoenobaenus in the Netherlands from 1984 through 2000 (Fig. 3c).
We can run the analysis by selecting only the grid nodes with an attached value:
> count01 <- rk.Acrocephalus.1984[mask,]$count
> count02 <- rk.Acrocephalus.1985[mask,]$count
...
> count17 <- rk.Acrocephalus.2000[mask,]$count
where mask is the selection of grid nodes in the area1km map that have a
defined value. The distribution values can be packed together to a new matrix:
> counts <- cbind(count01, count02, ... , count17)
To view how values change at an individual pixel, we can use:
> plot(counts[1112,])
> abline(coef(line(1:17, counts[1112,])))
which will produce a plot as shown in Fig. 4b. This shows a clear increase of
breeding pair counts over the period 1984–2000.
Next we proceed with fitting the trend models for each pixel in the map. First,
we make an empty data frame that will later be filled with fitted values:
> linefits <- as.data.frame(rep(0, length(counts[,1])), optional=TRUE)
> linefits$beta0 <- rep(0, length(counts[,1]))
> linefits$beta1 <- rep(0, length(counts[,1]))
> linefits$sumres <- rep(0, length(counts[,1]))
> linefits[1] <- NULL
and then run a loop that fits a robust linear regression model for each pixel
and copies the results to the linefits data frame:
> for (i in 1:length(counts[,1])) {
    fit <- line(1:17, counts[i,])   # resistant line fit for pixel i
    linefits$beta0[i] <- fit$coefficients[1]
    linefits$beta1[i] <- fit$coefficients[2]
    linefits$sumres[i] <- sum(fit$residuals^2)
  }
where beta0 is the intercept coefficient, beta1 is the slope coefficient, and
sumres is the sum of squared residuals. The last parameter will be used to
quantify the quality of the fit: if sumres tends to zero, we speak of an obvious
trend; the larger sumres becomes, the less reliable the trend. Note that, in
this case study, there are only 43,207 grid nodes with values (from 91,000) but
the processing can still take up to 30 minutes on a standard PC.

[Figure 4: panels (a) to (d); panel (b) title: "Change in count at 52.315525 N,
6.294985 E", period 1984–2000, x axis 5 to 15, y axis 1 to 6; legends: slope
index, -0.50 to 1.50; density in no/km², -1.0 to 4.0.]
Fig. 4. Trend analysis and change detection: (a) absolute change in
distributions of Acrocephalus schoenobaenus breeding pairs (2000 minus 1984);
(b) example of distribution dynamics at a specific (grid node) location;
(c) mapped slope (beta) index for a robustly-fit linear model for the period
1984–2000, where positive values indicate an increase and negative values a
decrease in counts; (d) delineated areas where the rate of change > 0.3.
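The per-pixel fit can also be expressed without an explicit loop. The sketch below uses base R only; the matrix `counts` here is a small synthetic stand-in (three pixels with made-up, exactly linear series), and the helper `fit_pixel` is hypothetical. `line()` from the stats package fits Tukey's resistant line, which is the function used in the loop in the text.

```r
years <- 1:17

# Hypothetical counts: 3 pixels (rows) x 17 years (columns).
counts <- rbind(
  0.5 + 0.40 * years,   # steady increase
  2.0 - 0.05 * years,   # slight decrease
  rep(1.5, 17)          # no change
)

# Fit a resistant line to one pixel's series and return the
# intercept, the slope and the sum of squared residuals.
fit_pixel <- function(y) {
  fit <- line(years, y)
  c(beta0 = fit$coefficients[[1]],
    beta1 = fit$coefficients[[2]],
    sumres = sum(fit$residuals^2))
}

# One row per pixel, one column per fitted parameter.
linefits <- as.data.frame(t(apply(counts, 1, fit_pixel)))

# Pixels where the fitted rate of change exceeds 0.3 per year.
increasing <- linefits$beta1 > 0.3
```

Applying a function row by row avoids creating temporary objects for each pixel; whether it is noticeably faster than the loop on the full grid was not measured here.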
We further convert the raster map to a point map, so that the grid nodes in
the linefits can be exported to a viewer such as Google Earth:
> counts.ll <- spTransform(rk.Acrocephalus.1984, CRS("+proj=longlat"))
and then copy the fitted parameters (beta0, beta1 and sumres) to the counts.ll
(SpatialPointsDataFrame):
> counts.ll$beta0 <- linefits$beta0
> counts.ll$beta1 <- linefits$beta1
> counts.ll$sumres <- linefits$sumres
Export of raster maps from R to Google Earth is somewhat more complicated
because we first need to create a grid in the longlat coordinate system. We
start by determining the width correction factor⁴ based on the latitude of the
center of the study area:
> corrf <- (1+cos((counts.ll@bbox["y","max"]+counts.ll@bbox["y","min"])/
    2*pi/180))/2
and then estimate the grid cell size in arcdegrees:
> geogrd.cell <- corrf*(counts.ll@bbox["x","max"]-counts.ll@bbox["x","min"])/
    counts@grid@cells.dim[1]
which gives a cell size of 0.0113152 arc-degrees. Now, we can generate a new
grid definition using the spsample method of the sp package:
> geoarc <- spsample(counts.ll, type="regular", cellsize=c(geogrd.cell,geogrd.cell))
> gridded(geoarc) <- TRUE
> gridparameters(geoarc)
   cellcentre.offset   cellsize cells.dim
x1          3.316779 0.01131520       347
x2         50.752184 0.01131520       251
which shows that the new grid will have approximately the same number of grid
nodes as the original map in the Dutch coordinate system (87,348 compared to
91,000 pixels). Further steps needed to generate a PNG of an R plot and then
export to KML are explained in [11]. The fitted values of beta1, visualized in
the Google Earth viewer, are shown in Fig. 4c. Fig. 4d shows locations where
beta1 > 0.3, which are typical maps of interest for decision making.
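The arithmetic behind the cell size can be followed on a small stand-alone example. The bounding box and the number of columns below are assumptions chosen only to resemble the extent of the Netherlands; they are not taken from the data used in the text.

```r
# Hypothetical bounding box in decimal degrees (rows: x, y; columns: min, max).
bbox <- matrix(c(3.3, 50.7, 7.3, 53.6), nrow = 2,
               dimnames = list(c("x", "y"), c("min", "max")))
ncols <- 280   # assumed number of columns of the projected grid

# Latitude of the centre of the study area.
lat_mid <- (bbox["y", "max"] + bbox["y", "min"]) / 2

# Width correction factor as used in the text: the mean of 1 and cos(latitude).
corrf <- (1 + cos(lat_mid * pi / 180)) / 2

# Cell size in arc-degrees for a square longlat grid.
cell_deg <- corrf * (bbox["x", "max"] - bbox["x", "min"]) / ncols
```

The factor lies between 0.5 and 1 for any latitude and shrinks the cell size towards the poles, where a degree of longitude covers less ground. It appears to be a compromise between the east-west and north-south spacing of a cell, so the resulting grid only approximately preserves the original number of nodes.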
⁴ For datasets in geographical coordinates, a cell size correction factor can
be estimated as a function of the latitude and spacing at the equator:
$\Delta x_{\mathrm{metric}} = F \cdot \cos(\varphi) \cdot \Delta x^{0}_{\mathrm{degree}}$,
where $\Delta x_{\mathrm{metric}}$ is the East/West grid spacing estimated for a
given latitude ($\varphi$), $\Delta x^{0}_{\mathrm{degree}}$ is the grid spacing
in degrees at the equator, and $F$ is the empirical constant used to convert
from degrees to metres [14].
3 Discussion and Conclusions
The case studies listed previously demonstrate that the R computing environment
is a well-suited tool to produce quality outputs (maps, statistical models) of
interest in Geo-Ecology. In principle, all operations listed before are
completely automated. This allows us to combine various operations, ranging
from general point pattern analysis and geostatistics to habitat suitability
mapping, via R scripting and to develop complex automated mapping frameworks.
Moreover, due to the recent implementation of the maptools and sp packages,
such outputs can be easily exported to popular geographical browsers such as
Google Earth and shared with the wider community [3; 10].
Automated mapping and interactive data exploration have completely
changed the perspective of what is possible in Computational Geo-Ecology.
In addition, outputs of such analysis add significant value to (dynamic)
Geographical Information Systems used for analysis of patterns and processes
of (geo-)ecosystems [15]. However, there are also a number of research topics
that will need to be tackled in the coming years. These are the key ones:
⋆ Spatio-temporal visualization and data mining: The largest percent-
age of tools developed for CGE applications are basically visualization and
data mining tools [ 16]
. When one such tool is being developed, a range of
research questions need to be answered — how does a certain tool helps users
complete various data mining tasks e.g. to analyze dependencies, detect out-
liers, discover trends, visualize uncertainties? how well does it generalizes
spatio-temporal patterns, and how easy is to zoom in into the data? how
accurate are the final outputs?
⋆ Automated mapping and change detection: Because the quantity of
both field and remote sensing data in ecology is exponentially increasing, it
is also increasingly important to work with algorithms that do not require
(much of) human labour/intervention. Automation is especially important
to be able to generate large numbers of target variables over dense time-
intervals, and to rapidly detect changes in ecosystems.
⋆ Multi-scale data integration: The input data that feeds the CGE mod-
els often comes with large differences in temporal and spatial support size
and effective scale. On the other hand, there are many benefits of running
analysis that takes into account all possible correlations and dependencies.
Can multi-scale/multi-source data be automatically filtered and integrated
how to achieve this?
⋆ Modeling and management of the uncertainties: It is increasingly
important to accompany the data analysis report with a summary of the
uncertainty budget. Such analysis then allows us to distinguish between
conceptual (model), data (survey) errors and natural variation, i.e. between
the true spatio-temporal patterns and artefacts/noise. In many cases, in-
formation about the inherent uncertainties in the input data can be used
to adjust or filter the data accordingly, pick the right effective scale and
generalize/downscale where necessary.
⋆ Implementation of algorithms and software development: The quality
of computational frameworks becomes apparent when they achieve implementation
in applied fields, especially outside their fields of origin [3]. Here
a range of issues needs to be addressed: how many operations does a
programming language accommodate? What is the processing speed of the
software? How compatible is it with various GIS formats (vector, raster)? How
compatible is it with various environmental applications? How easy to use
will it be? Who will maintain the software and provide support?
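As an illustration of the uncertainty budget mentioned in the list above, the following is a minimal sketch in base R, not a method taken from the case studies. It assumes a balanced design with repeated visits to the same plots, and all data, variable names (site_effect, obs, budget) and parameter values are hypothetical and simulated. It only separates natural (between-plot) variation from survey error; conceptual (model) errors would need a separate treatment.

```r
# Hypothetical, simulated data: repeated measurements at a set of plots
set.seed(42)
n_sites <- 50   # number of plots (assumed)
n_rep   <- 4    # repeated visits per plot (assumed, balanced design)

# Natural (between-plot) variation, simulated with sd = 2
site_effect <- rnorm(n_sites, mean = 0, sd = 2)

obs <- data.frame(
  site  = factor(rep(seq_len(n_sites), each = n_rep)),
  # Survey (measurement) error, simulated with sd = 1
  value = 10 + rep(site_effect, each = n_rep) +
    rnorm(n_sites * n_rep, mean = 0, sd = 1)
)

# One-way ANOVA gives the mean squares between and within plots
fit <- anova(lm(value ~ site, data = obs))
ms_between <- fit["site", "Mean Sq"]
ms_within  <- fit["Residuals", "Mean Sq"]

# Method-of-moments variance components for a balanced design
var_error <- ms_within
var_site  <- max((ms_between - ms_within) / n_rep, 0)

# Uncertainty budget: share of each component in the total variance (%)
budget <- c(natural_variation = var_site, survey_error = var_error)
budget_share <- round(100 * budget / sum(budget), 1)
print(budget_share)
```

A budget of this kind indicates how much of the observed variability could be removed by better surveying, and how much is part of the pattern one actually wants to map.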
Our special focus in the coming years will be the development of automated
spatio-temporal analysis algorithms that can be used to generate interactive
(Google Earth-compatible) visualizations from large quantities of field and
remote sensing data in near real-time. Although the tool is already there, our
experience is that many challenges remain to be solved in the coming
years:
• Solving the problem of low quality input data (field observations):
This includes low precision of spatial referencing (size of the plots),
imprecise quantities/counts, (mis)classification errors, preferential sampling
(complete omission of some areas) etc. At this moment it is impossible to
foresee how these inherent uncertainties (biased sampling, species
classification errors, location errors, poor spatial/temporal coverage etc.)
will affect the final outputs, but it is on our agenda to report on this in
the coming years.
• Solving the problem of computing with large data sets: The Dutch
National Database of Flora and Fauna contains observations of about 3000
species, collected over 25 years at many thousands of locations. To produce
maps using such a large quantity of data, automated mapping tools will need
to be developed. In addition, in order to generate maps in near real-time,
super-computing will become unavoidable.
• Improving the over-simplistic statistical models: Even fundamental
statistical issues still need to be resolved. For example, R
currently does not support a combination of non-linear regression models and
geostatistics (see footnote 5). This area of geostatistics is fairly
speculative and fresh, so we can expect much development in the coming
years [15]. We can only agree with [15]: better predictions of geographical
distributions of organisms and effects of impacts on biological communities
can emerge only from more robust species’ distribution models.
• Producing outputs of increasingly higher level of detail: The level of
detail required by decision-makers is increasingly high. This again
asks for more powerful, faster and more robust statistical models. The
question remains whether there are ways to make predictions at fine
resolution using more effective computations.
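The two-step procedure described in footnote 5 (fitting a model without correlation and then modelling the residuals) can be sketched in base R as follows. This is only a sketch under stated assumptions: it does not use geoRglm, the data are simulated, and the covariate elev, the covariance parameters and the lag bins are all hypothetical. The second step here is limited to an empirical semivariogram of the residuals; fitting a variogram model and kriging the residuals would follow.

```r
# Hypothetical, simulated data: counts at random locations in a unit square
set.seed(1)
n    <- 200
x    <- runif(n)
y    <- runif(n)
elev <- rnorm(n)                       # hypothetical covariate

# Spatially correlated latent field (exponential covariance, assumed values)
d      <- as.matrix(dist(cbind(x, y)))
Sigma  <- 0.5 * exp(-d / 0.2)
latent <- drop(t(chol(Sigma + diag(1e-8, n))) %*% rnorm(n))
counts <- rpois(n, lambda = exp(1 + 0.5 * elev + latent))

# Step 1: generalized linear model that ignores spatial correlation
m   <- glm(counts ~ elev, family = poisson)
res <- residuals(m, type = "pearson")

# Step 2: empirical semivariogram of the residuals
pairs  <- which(upper.tri(d), arr.ind = TRUE)   # all unique point pairs
h      <- d[pairs]                              # separation distances
g      <- 0.5 * (res[pairs[, 1]] - res[pairs[, 2]])^2
breaks <- seq(0, 0.5, by = 0.05)                # lag bins (assumed)
bin    <- cut(h, breaks = breaks)

vgm <- data.frame(
  dist  = as.vector(tapply(h, bin, mean)),      # mean distance per bin
  gamma = as.vector(tapply(g, bin, mean)),      # semivariance per bin
  np    = as.vector(table(bin))                 # number of pairs per bin
)
print(vgm)
```

If the semivariance in vgm rises with distance, the residuals still carry spatial structure that the regression step alone has not captured.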
The gull data was provided by Bruno Ens (SOVON, the Netherlands) and
Michael Exo (Institute of Avian Research, Germany). This project is made
possible in part by the European Space Agency FlySafe initiative. The EcoGRID
project is carried out in the context of the Virtual Laboratory for e-Science
project (www.vl-e.nl). This project is supported by a BSIK grant from the
Dutch Ministry of Education, Culture and Science and Dutch Ministry of
Agriculture, Nature and Food Quality.
5 Fitting a GLGM (generalized linear geostatistical model) is possible in the
geoRglm package, but it requires two steps: fitting a model without
correlation and then modelling the residuals.
References
[1] Shamoun, J.Z., Sierdsema, H., van Loon, E.E., van Gasteren, H., Bouten, W.,
Sluiter, F.: Linking Horizontal and Vertical Models to Predict 3D + time
Distributions of Bird Densities. In: International Bird Strike Committee,
Athens, p. 12 (2005)
[2] Van Belle, J., Bouten, W., Shamoun-Baranes, J., van Loon, E.E.: An operational
model predicting autumn bird migration intensities for flight safety. Journal of
Applied Ecology 11, 864–874 (2007)
[3] Bivand, R.: Implementing Spatial Data Analysis Software Tools in R. Geographical
Analysis 38, 23–40 (2006)
[4] Pebesma, E.J.: Multivariable geostatistics in S: the gstat package. Computers &
Geosciences 30(7), 683–691 (2004)
[5] Pebesma, E.J., Bivand, R.S.: Classes and methods for spatial data in R. R
News 5(2), 9–13 (2005)
[6] Bivand, R.S.: Interfacing GRASS 6 and R. Status and development directions.
GRASS Newsletter 3, 11–16 (2005)
[7] Baddeley, A., Turner, R.: Spatstat: an R package for analyzing spatial point
patterns. Journal of Statistical Software 12(6), 1–42 (2005)
[8] Calenge, C.: The package “adehabitat” for the R software: A tool for the analysis of
space and habitat use by animals. Ecological Modelling 197(3–4), 516–519 (2006)
[9] Waller, L.A., Gotway, C.A.: Applied Spatial Statistics for Public Health Data, p.
520. Wiley, Hoboken (2004)
[10] Bivand, R., Pebesma, E., Rubio, V.: Applied Spatial Data Analysis with R. Use
R Series, p. 400. Springer, Heidelberg (2008)
[11] Hengl, T.: A Practical Guide to Geostatistical Mapping of Environmental
Variables. In: EUR 22904 EN. Office for Official Publications of the European
Communities, Luxembourg, p. 143 (2007)
[12] Rossiter, D.G.: Introduction to the R Project for Statistical Computing for use at
ITC. In: International Institute for Geo-information Science and Earth Observation
(ITC), Enschede, Netherlands, p. 136 (2007)
[13] Rowlingson, B., Diggle, P.: Splancs: spatial point pattern analysis code in S-Plus.
Computers & Geosciences 19, 627–655 (1993)
[14] Guth, P.L.: Slope and aspect calculations on gridded digital elevation models:
Examples from a geomorphometric toolbox for personal computers. Zeitschrift für
Geomorphologie 101, 31–52 (1995)
[15] Scott, J.M., Heglund, P.J., Morrison, M.L.: Predicting Species Occurrences: Issues
Of Accuracy And Scale. Habitat (Ecology), p. 840. Island Press, Washington, DC
(2002)
[16] Compieta, P., Di Martino, S., Bertolotto, M., Ferrucci, F., Kechadi, T.:
Exploratory spatio-temporal data mining and visualization. Journal of Visual
Languages and Computing 18(3), 255–279 (2007)

Advancing Spatio-temporal Analysis of Ecological Data Examples in R.pdf

  • 1. Advancing Spatio-temporal Analysis of Ecological Data: Examples in R Tomislav Hengl1 , Emiel van Loon1 , Henk Sierdsema2 , and Willem Bouten1 1 Research Group on Computational Geo-Ecology (CGE), University of Amsterdam, Amsterdam, The Netherlands hengl@science.uva.nl http://www.science.uva.nl/ibed-cge 2 SOVON Dutch Centre for Field Ornithology, Beek-Ubbergen, The Netherlands Abstract. The article reviews main principles of running geo-computa- tions in ecology, as illustrated with case studies from the EcoGRID and FlySafe projects, and emphasizes the advantages of using R computing environment as the most attractive programming/scripting environment. Three case studies (including R code) of interest to ecological applications are described: (a) analysis of GPS trajectory data for two gull-birds species; (b) species distribution mapping in space and time for a bird species (sedge warbler; EcoGRID project); and (c) change detection using time-series of maps. The case studies demonstrate that R, together with its numerous packages for spatial and geostatistical analysis, is a well-suited tool to pro- duce quality outputs (maps, statistical models) of interest in Geo-Ecology. Moreover, due to the recent implementation of the maptools and sp pack- ages, such outputs can be easily exported to popular geographical browsers such as Google Earth and similar. The key computational challenges for Computational Geo-Ecology recognized were: (1) solving the problem of input data quality (filtering techniques), (2) solving the problem of com- puting with large data sets, (3) improving the over-simplistic statistical models, and (4) producing outputs of increasingly higher level of detail. 1 Introduction Computational Geo-Ecology is an emerging scientific sub-field of Ecology that focuses on development and testing of computational tools that can be used to extract spatio-temporal information on the dynamics of complex geo-ecosystems. 
It evolved as a combination of three scientific fields: (a) Ecology, as it focuses on interactions between species and abiotic factors; (b) Statistics, as it implies quantitative analysis of field and remote sensing data; and (c) Geoinformation Science, as all variables are spatially referenced and outputs of analyzes are com- monly maps. The importance of this topic has been recognized at the Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, where a research group on Computational Geo-Ecology (CGE) has been established. It comprises about 20 researchers, PhD students and supporting staff mainly with backgrounds in physical geography, computer sciences, ecology and geosciences. O. Gervasi et al. (Eds.): ICCSA 2008, Part I, LNCS 5072, pp. 692–707, 2008. c Springer-Verlag Berlin Heidelberg 2008
  • 2. Advancing Spatio-temporal Analysis of Ecological Data 693 ECOGRID.NL SPECIES: dragonflies, plants, fish fungi, mollusca, mammals, butterflies, moss lichens, birds AUXILIARY DATA: geographical location, date, landscape, taxonomy, socio- economic data METADATA: taxonomy, lineage, contact information, data quality NDFF observations QUERY PARAMETERS: species, period, area, type of analysis, outputs... Spatio-temporal data mining Density estimation Geostatistical analysis Trend analysis and change detection Habitat mapping Error propagation Interactive visualization Summary statistics Distribution maps Change indices Biodiversity indices Home range Scenario testing REPORTS BASE MAPS ECOLOGICAL CONDITIONS: distance to man-made objects, distance to water and food supplies, land use, hydrology, climate, geology Fig. 1. Workflow scheme and main components of the EcoGRID. See further some concrete case studies from the EcoGRID in Sec. 2.3 and 2.4. The key objective of this group is to develop and apply computational tools1 that implement theoretical models of complex geo-ecosystems calibrated by field observations and remote sensing data, and that can be used to perform various tasks: from spatio-temporal data mining to analysis and decision making. CGE is, at the moment, actively involved with two research projects: EcoGRID and ESA-Flysafe. EcoGRID (www.ecogrid.nl) is a national project currently being applied in supporting the functioning of the growing Dutch Flora and Fauna Database (NDFF), which contains about 20 million field records of more than 3000 species registered in the Dutch Species Catalogue (www.neder- landsesoorten.nl). EcoGRID aims at providing researchers, policy-makers and stake-holders with relevant information, including distribution maps, distribu- tion change indices, biodiversity indices, estimated outcomes for scenario-testing models [ 1] . 
To achieve this, a set of general analysis procedures is being implemented and tested, ranging from spatio-temporal data mining, density estimation, geostatistical analysis, trend analysis and change detection, and habitat mapping to error propagation and interactive visualization techniques (Fig. 1). EcoGRID is the Dutch segment of the recent pan-European initiative called “LifeWatch” (www.lifewatch.eu), which aims at building a very large infrastructure (virtual laboratories) to support sharing of the knowledge and tools to monitor biodiversity over Europe.

¹ By ‘tools’ we mainly refer to various software solutions: stand-alone packages, plug-ins/packages and toolboxes, software-based scripts, web-applications and computational schemes.
ESA-Flysafe is a precursor project to the Avian Alert initiative (www.avian-alert.eu), a potential integrated application promotion programme (IAP) of the European Space Agency. CGE has already successfully implemented a national project called BAMBAS (www.bambas.ecogrid.nl), which is now used as a decision support tool by the Royal Netherlands Air Force to reduce the risk of bird-aircraft collisions [2]. The objective of Flysafe is to integrate multi-source data into a virtual laboratory, in order to provide predictions and forecasts of bird migration (bird densities, species structure, altitudes, vectors and velocities) at different scales in space and time [2].

This paper reviews the most recent activities of the CGE group, discusses limitations and opportunities of using various algorithms and sets a research agenda for the coming years. This is all illustrated with a selection of real case studies, as implemented in the R computing environment. Our idea was not to produce an R tutorial for spatial data analysis, but to demonstrate some common processing steps and then emphasize the advantages of running computations in R.

2 Examples in R

2.1 Why R?

“From a period in which geographic information systems, and later geocomputation and geographical information science, have been agenda setters, there seems to be interest in trying things out, in expressing ideas in code, and in encouraging others to apply the coded functions in teaching and applied research settings.” Roger Bivand [3]

The three most attractive computing environments to develop and implement computational schemes used in Computational Geo-Ecology are R (www.r-project.org), MATLAB (www.mathworks.com) and Python (www.python.org). The first offers less support and instruction to beginners, the second has more basic utilities and is easier to use, and the third is the most popular environment for software development.
Although all three are high-level languages with extensive user communities that interact and share code willingly, R seems to be the most attractive candidate for implementation of algorithms of interest to CGE. Bivand [3] recognizes three main opportunities for using R: (1) vitality and high speed of development of R, (2) academic openness of developers and their willingness to collaborate, and (3) increasing sympathy for spatial data analysis and visualization. Our main reasons to select R for our projects are:
⋆ R supports various GIS formats via the rgdal package, including export of vector layers and plots to Google Earth (maptools package).
⋆ R offers a much larger family of methods for spatio-temporal analysis (point pattern analysis, spatial interpolation and simulations, spatio-temporal trend analysis) than MATLAB.
⋆ Unlike MATLAB, R is open-source software and hence does not require additional investments and is easy to install and update.
Several authors have recently drawn attention to new R packages used for spatial data analysis. Pebesma [4] and Pebesma and Bivand [5] promote the gstat and sp packages, which together offer a variety of geostatistical analyses; Bivand [6, 3] reviews spatial data analysis packages in general, with special focus on the maptools and GRASS packages, which have been established as the most comprehensive links between statistical and GIS computing; Baddeley and Turner [7] present the spatstat package for analysis of point patterns. We should also add to this list: RSAGA (link to SAGA GIS), spsurvey (spatial sampling), geoR (geostatistical analysis), splancs (spatial point pattern analysis), and the specialized ecological data analysis packages adehabitat [8], GRASP and BIOMOD, which support spatial prediction of point-sampled variables using GLM/GAMs and export to GIS. For an update on the most recent activities connected with the development of spatial analysis tools in R, you can at any time subscribe to the R-sig-Geo mailing list and witness the evolution.

A limitation of R is that it does not provide dynamically linked visualization and user-friendly data exploration. This might frustrate users who wish to zoom into spatial layers, visually explore patterns and open multiple layers over each other. However, due to the recent implementation of the maptools, rgdal and sp packages, outputs of spatial/statistical analysis in R can be exported to free geographic browsers such as Google Earth. Google Earth is an HTML-language based freeware that has, with its intelligent indexing of very large datasets combined with an open architecture for integrating and customizing new data, revolutionized the meaning of the word “geoinformation”. By combining the computational power of R and the visualization possibilities of Google Earth, one creates a complete system.
The following sections demonstrate the use of R scripting to perform various analyses, including the export to Google Earth. We are not able to display the complete scripts, so we instead zoom into specific processing steps that might be of interest to researchers unfamiliar with R. For a detailed introduction to spatial analysis in R, please refer to the recent books by Waller and Gotway [9] and Bivand et al. [10], and various lecture notes [11, 12].

2.2 Analysis of GPS Trajectory Data

The objective of this exercise is to analyze the movement of two gull species: lesser black-backed gull (Larus fuscus), further referred to as LBG, and European herring gull (Larus argentatus Pontoppidan), further referred to as HG. For the analysis, we use the GPS readings of the receivers attached to a total of 23 individual birds. The birds were released on 1 June 2007 in the region of Vlieland, the Netherlands, and recordings were then collected until 24 October 2007. A map of trajectories is shown in Fig. 2a. We are interested to see where the gulls forage and rest, whether they follow specific paths, how fast they move over an area, and whether there is a relationship between activity centers and landscape. We can import the raw table data to R using:

    gulls <- read.delim("gulls.txt")
The imported table has the following structure:

    'data.frame': 13530 obs. of 6 variables:
     $ BIRDID   : Factor w/ 23 levels "ID41745","ID41747",..: 1 1 1 1 1 1 ...
     $ LATITUDE : num 53.2 53.3 53.3 53.2 53.2 ...
     $ LONGITUDE: num 4.93 4.95 4.96 5.00 5.04 ...
     $ SPECIES  : Factor w/ 2 levels "HG","LBG": 2 2 2 2 2 2 2 2 2 2 ...
     $ SEX      : Factor w/ 2 levels "F","M": 2 2 2 2 2 2 2 2 2 2 ...
     $ TIME     : POSIXct, format: "2007-06-01 07:00:00" "2007-06-01 09:00:00" ...

We first want to calculate the velocity of birds moving from one location to the next. To do this, we need to reproject the geographical coordinates to a cartesian system, so that distances in the x and y directions are expressed in the same metric units. We start by attaching the coordinates and the coordinate system (sp package):

    library(sp)
    coordinates(gulls) <- ~LONGITUDE+LATITUDE
    proj4string(gulls) <- CRS("+proj=longlat +datum=WGS84")

which will coerce the table into a SpatialPointsDataFrame, i.e. an R spatial layer. Now we can transform the coordinates to the European Terrestrial Reference System (www.euref.eu) using:

    library(rgdal)  # provides the spTransform method
    gulls.laea <- spTransform(gulls, CRS("+init=epsg:3035"))

where spTransform is the rgdal method to reproject the coordinates, and epsg is the European Petroleum Survey Group's registry code (see www.epsg-registry.org) of the ETRS coordinate system. Once the coordinates are in a metric system, we can derive distances and velocities (in km/h) from point to point, by dividing the distance vector by time. This will attach to each point location an estimate of the velocity for an individual bird (column VELOC).

At this stage, we want to separate the analysis for the two species. This can be done in a few steps, e.g. for the LBG species:

    # TRUE for LBG records with a known velocity, NA otherwise
    gulls.laea$LBG <- ifelse(gulls.laea$SPECIES=="LBG",
        ifelse(is.na(gulls.laea@data$VELOC), NA, TRUE), NA)
    LBG <- !is.na(gulls.laea$LBG)

We proceed with interpolating the velocities estimated at point locations. For this, we can use the gstat package [4].
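The velocity step described above (distance between consecutive fixes divided by elapsed time) is not listed in this paper. A minimal sketch of how it might be done in base R is given here, assuming a plain data frame with projected coordinates in metres; the toy values and the helper name step_velocity are hypothetical and only serve to illustrate the arithmetic.

```r
# Toy track: projected coordinates (metres) and time stamps; values are made up
track <- data.frame(
  BIRDID = c("A", "A", "A", "B", "B"),
  x = c(0, 3000, 3000, 0, 6000),
  y = c(0, 4000, 10000, 0, 8000),
  TIME = as.POSIXct(c("2007-06-01 07:00:00", "2007-06-01 09:00:00",
                      "2007-06-01 12:00:00", "2007-06-01 07:00:00",
                      "2007-06-01 08:00:00"), tz = "UTC")
)

# Hypothetical helper: velocity (km/h) between consecutive fixes of one bird
step_velocity <- function(d) {
  d <- d[order(d$TIME), ]
  dist_km <- c(NA, sqrt(diff(d$x)^2 + diff(d$y)^2) / 1000)
  dt_h <- c(NA, as.numeric(diff(d$TIME), units = "hours"))
  d$VELOC <- dist_km / dt_h   # first fix of each bird has no velocity
  d
}

# Apply per bird, so that no step is computed across two different birds
track <- do.call(rbind, lapply(split(track, track$BIRDID), step_velocity))
track$VELOC   # NA 2.5 2.0 NA 10.0
```

Splitting by BIRDID before differencing matters: otherwise the last fix of one bird and the first fix of the next would be treated as a single (meaningless) step.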
The variograms can be visualized and fitted using:

    library(gstat)
    LBG.points <- remove.duplicates(gulls.laea[LBG,], zero=0.0, remove.second=TRUE)
    plot(variogram(log1p(log1p(VELOC))~1, LBG.points, cutoff=40000))
    LBG.ovar <- variogram(log1p(log1p(VELOC))~1, LBG.points, cutoff=40000)
    LBG.ovgm <- fit.variogram(LBG.ovar, vgm(0.4, "Exp", range=30000, 0.5))

where remove.duplicates is the function that removes duplicate points and prevents the package from running into computational problems, variogram is the gstat command to calculate semivariances for a given target variable (VELOC), log1p is the log(1+x) transformation (applied twice in this case), fit.variogram is the gstat command used to fit the variogram using re-weighted least squares, and cutoff is the maximum distance of interest.

Fig. 2. Spatio-temporal analysis of GPS trajectory data for lesser black-backed gull (Larus fuscus): (a) observed trajectories for a total of 14 individual birds; (b) interpolated velocities [km/h]; (c) Habitat Suitability Index [HSI, 0–100] derived using mean annual EVI, topographic wetness index and night lights image; (d) 3D kernel density aggregated for the period June–August 2007.

Once we have fitted the variogram, we
can interpolate the values over the whole area of interest using e.g. ordinary kriging:

    # nmax is an argument of gstat(), so it is set here rather than in predict()
    LBG.g <- gstat(id=c("VELOC"), formula=log1p(log1p(VELOC))~1,
        data=LBG.points, model=LBG.ovgm, nmax=60)
    LBGspeed.OK <- predict(LBG.g, newdata=maskLBG)

where gstat is the generic function to set up predictions and simulations, maskLBG is a SpatialGridDataFrame showing the prediction locations, and nmax=60 is the maximum number of nearest observations that will be used to make each prediction. The resulting map can be seen in Fig. 2b.

We are next interested to see how the movement of the gulls is connected with environmental conditions, i.e. relief, urban development and landscape in general. For this, we use three 1 km maps: (1) mean annual EVI image (pcevi1lbg) representing the mean biomass over a terrain, (2) topographic wetness index derived from the 1 km DEM (twilbg) representing relief, and (3) night lights image (nlightslbg) representing urban development [11]. These can be loaded to R via the adehabitat package:

    library(adehabitat)
    pcevi1lbg.asc <- import.asc("pcevi1lbg.asc")
    twilbg.asc <- import.asc("twilbg.asc")
    nlightslbg.asc <- import.asc("nlightslbg.asc")
    LBG.maps <- as.kasc(list(dem=demlbg.asc, pcevi1lbg=pcevi1lbg.asc,
        twilbg=twilbg.asc, nlightslbg=nlightslbg.asc))

The maps and the locations where the LBG species has been observed can be packed together and used to run Ecological Niche Factor Analysis [8]:

    LBG.hab <- data2enfa(LBG.maps, gulls.laea[LBG,]@coords)
    enfa.LBG <- enfa(dudi.pca(LBG.hab$tab, scannf=FALSE), LBG.hab$pr, nf=2)
    LBG.dist <- predict(enfa.LBG, LBG.hab$index, LBG.hab$attr)

where data2enfa combines the grids and observation locations, enfa is a method to run Ecological Niche Factor Analysis, and LBG.dist are the output distances from the barycentre of the niche [8]. The final Habitat Suitability Index (0–100%) can be seen in Fig. 2c.
This shows that LBG birds, based on this trajectory data, systematically avoid mountain chains and big urban areas.

By using the splancs package [13, 3], we can also estimate the space-time kernel density for different time intervals. A space-time (3D) kernel filter can be run by defining the coordinates of the points and their time reference (x, y, z), the grid of interest, and the search radius:

    library(splancs)
    LBG.densnoTime <- kernel3d(pts=LBG.points@coords, times=LBG.points$CTIME,
        xgr=seq(maskLBG@bbox["x","min"], maskLBG@bbox["x","max"], maskLBG@grid@cellsize[1]),
        ygr=seq(maskLBG@bbox["y","min"], maskLBG@bbox["y","max"], maskLBG@grid@cellsize[2]),
        zgr=seq(3608,8000,72), hxy=20000, hz=168)

which will produce 61 maps of kernel smoother densities for 3-day periods. The output is basically a space-time cube, i.e. a series of 61 grid maps. Note that such calculations can be time-consuming, thus we recommend that you test the script using relatively small data sets first, and then proceed with the real case studies. The summary density map estimated using the kernel3d method, aggregated over the period June to August, is shown in Fig. 2d.

2.3 Generation of Distribution Maps

The following section demonstrates how to connect to an on-line ecological database (ecogrid.nl), run queries, generate distribution maps and export them to geographical browsers such as Google Earth. We start by setting up a new ODBC connection on our Windows machine², and then connecting to it from R using the RODBC package:

    library(RODBC)
    ecogrid.conn <- odbcConnect(dsn="ecogrid.nl", connection="sovon-ecogrid", case="postgresql")

which will create an R definition of the connection. We can now run a query, e.g. to fetch all observations of breeding pair counts of the bird species sedge warbler (Acrocephalus schoenobaenus):

    Acrocephalus.tbl <- sqlQuery(ecogrid.conn, query=paste(
        "SELECT o.countmin, o.countmax, o.counttype,",
        "x(centroid(l.the_geom)), y(centroid(l.the_geom)),",
        "o.timestart, o.timestop, o.timetype",
        "FROM survey.observations o, survey.locations l,",
        "taxonomy.taxa t, taxonomy.taxa p",
        "WHERE o.locid=l.locid AND o.taxid=t.taxid AND t.parent_id=p.taxid",
        "AND p.taxon ILIKE 'Acrocephalus' AND t.taxon ILIKE 'schoenobaenus';"))

where o.countmin and o.countmax are the observed counts of the breeding pairs, x(centroid(l.the_geom)) and y(centroid(l.the_geom)) are the coordinates of the center of the observation plots, and o.timestart and o.timestop are the times of the beginning and the end of observation. As a result of the query, we obtain 16,028 observations (Fig. 3a):

    str(Acrocephalus.tbl)
    'data.frame': 16028 obs. of 8 variables:
     $ countmin : int 0 0 0 0 0 0 0 0 5 8 ...
     $ countmax : int 0 0 0 0 0 0 0 0 5 8 ...
     $ counttype: Factor w/ 1 level "n": 1 1 1 1 1 1 1 1 1 1 ...
     $ x        : num 115663 115663 115663 115663 115663 ...
     $ y        : num 425971 425971 425971 425971 425971 ...
     $ timestart: POSIXct, format: "1984-04-20" "1985-04-20" ...
     $ timestop : POSIXct, format: "1984-08-16" "1985-08-16" ...
     $ timetype : Factor w/ 1 level "f": 1 1 1 1 1 1 1 1 1 1 ...

² Under Control panel → Administrative tools → ODBC Data Source Administration → Add, then enter the server address, port, username and password used to connect to the server.
Fig. 3. Automated generation of distribution maps: (a) observed number of breeding pairs of Acrocephalus schoenobaenus in year 2000; (b) as shown in Google Earth; (c) breeding pair densities [no/km²] for years 1984, 1992 and 2000 derived using regression-kriging over a 1 km grid.

Before we can proceed with generation of the distribution maps, we need to convert the time coordinates to a linear system. For example, we can consider using the cumulative number of days since 1970-01-01:

    Acrocephalus.tbl$ctime <- floor(unclass(Acrocephalus.tbl$timestart)/86400) +
        ( floor(unclass(Acrocephalus.tbl$timestop)/86400) -
          floor(unclass(Acrocephalus.tbl$timestart)/86400) )

Here we use the command unclass to get the time as a numeric vector³. For example, the date-time value 1987-04-20 in W. Europe Standard Time corresponds to a value of 6317 (days since 1970-01-01).

³ This converts time values to the number of seconds since the beginning of 1970; dividing by 86400 then gives the number of days. This way we can run statistical analysis with such data.
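To make the time conversion concrete, the following sketch applies the same arithmetic to a single, made-up observation period. The variable names are hypothetical, and the time zone name "Europe/Amsterdam" is an assumption about the local setup. Note that midnight in a zone ahead of UTC falls on the previous UTC day, which would explain the value of 6317 quoted above rather than 6318.

```r
# Hypothetical start/stop of one observation period, given in UTC
timestart <- as.POSIXct("1987-04-20 00:00:00", tz = "UTC")
timestop  <- as.POSIXct("1987-08-16 00:00:00", tz = "UTC")

# A POSIXct value is stored as seconds since 1970-01-01 00:00 UTC;
# dividing by 86400 (seconds per day) and flooring gives whole days
day_start <- floor(as.numeric(timestart) / 86400)
day_stop  <- floor(as.numeric(timestop) / 86400)

day_start              # 6318 for midnight UTC
day_stop - day_start   # 118, the length of the period in days

# The same calendar date at local midnight (zone name is an assumption):
# local midnight is still the previous day in UTC, so the count drops by one
day_local <- floor(as.numeric(as.POSIXct("1987-04-20 00:00:00",
                                         tz = "Europe/Amsterdam")) / 86400)
```

If day counts need to be independent of the session's time zone, it may be safer to convert with an explicit tz, as done here, than to rely on the system default.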
Next, we import the predictor maps of the Netherlands that can be used to map the distribution of this bird:

    gridmaps <- readGDAL("dheight.asc")
    names(gridmaps)[[1]] <- "dheight"
    gridmaps$dtm <- readGDAL("dtm.asc")$band1
    gridmaps$freat1 <- readGDAL("freat1.asc")$band1
    gridmaps$lgn3dsee <- readGDAL("lgn3dsee.asc")$band1
    gridmaps$sltdch1 <- readGDAL("sltdch1.asc")$band1
    gridmaps$t10nhuis <- readGDAL("t10nhuis.asc")$band1
    gridmaps$t10ntree <- readGDAL("t10ntree.asc")$band1
    proj4string(gridmaps) <- CRS("+init=epsg:28992")

where dheight is the layer showing the height of canopy, dtm is the LiDAR-derived Digital Elevation Model, freat1 is the map showing duration of drainage, lgn3dsee is the distance from the coast line, sltdch1 is the density of the primary water courses, t10nhuis is the density of buildings from the 1:10k topo-maps, t10ntree is the density of trees, and epsg:28992 is the EPSG ID of the Dutch coordinate system. Note that this gridmaps data set is fairly large as each layer consists of 910,000 grids. We can select a specific year/period and subset the original data set, e.g. to year 2000:

    Acrocephalus.2000 <- subset(Acrocephalus.tbl,
        as.integer(format(Acrocephalus.tbl$timestop, "%Y"))==2000)

and plot the values of bird counts using (Fig. 3a):

    coordinates(Acrocephalus.2000) <- ~x+y
    # the plot centroids are in the Dutch coordinate system; spTransform needs this set
    proj4string(Acrocephalus.2000) <- CRS("+init=epsg:28992")
    bubble(Acrocephalus.2000["count"], scales=list(draw=TRUE),
        main="Acrocephalus schoenobaenus in 2000",
        sp.layout=list("sp.lines", col="black", NLborders))

The point maps (also lines and polygons) can be exported and viewed in Google Earth using the writeOGR command:

    Acrocephalus.2000.latlong <- spTransform(Acrocephalus.2000, CRS("+proj=longlat"))
    writeOGR(Acrocephalus.2000.latlong["count"], "Acrocephalus_2000.kml",
        "Acrocephalus_schoenobaenus2000", "KML")

which will produce an image as shown in Fig. 3b.

A list of distribution maps can be generated by using the regression-kriging technique, as implemented in the gstat package [4, 11].
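For readers unfamiliar with the technique, regression-kriging is commonly written as the sum of a regression prediction and a kriged residual. The notation below is ours and follows the usual textbook form; it is a sketch of the general idea rather than of the exact model fitted in this case study:

```latex
\hat{z}(s_0) \;=\; \hat{m}(s_0) + \hat{e}(s_0)
            \;=\; \sum_{k=0}^{p} \hat{\beta}_k \, q_k(s_0)
            \;+\; \sum_{i=1}^{n} \lambda_i \, e(s_i)
```

Here $q_k(s_0)$ are the values of the $p$ predictor maps at the prediction location $s_0$ (with $q_0(s_0) = 1$ for the intercept), $\hat{\beta}_k$ are the estimated regression coefficients, $e(s_i)$ are the regression residuals at the $n$ observation locations, and $\lambda_i$ are the kriging weights derived from the variogram of the residuals.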
We first need to overlay rasters and points and then attach the values of predictors to the original data frame. To speed up data processing, we can also loop the operations: linear regression, then step-wise regression, then variogram modeling, and final interpolation using regression-kriging. Note that, because all operations are automated, you might get some strange results, either in the whole map or at specific grid nodes, so it is a good idea to inspect and visualize all fitted models and output maps before you proceed with interpretation of the final outputs.

The final results of interpolation (Fig. 3c) show that the density of Acrocephalus schoenobaenus in the Netherlands has been increasing over the period of the last 10–15 years. How distinct the change in breeding pair counts is will be answered in the following exercise.

2.4 Trend Analysis and Change Detection

In the last example we demonstrate how robust linear regression can be used to map the rate of change using a time-series of distribution maps. For this we use a series of 17 distribution maps derived in the previous exercise: Acrocephalus schoenobaenus in the Netherlands from 1984 through 2000 (Fig. 3c). We can run the analysis by selecting only the grid nodes with an attached value:

    count01 <- rk.Acrocephalus.1984[mask,]$count
    count02 <- rk.Acrocephalus.1985[mask,]$count
    ...
    count17 <- rk.Acrocephalus.2000[mask,]$count

where mask is the selection of grid nodes in the area1km map that are not undefined. The distribution values can be packed together into a new data frame:

    counts <- cbind(count01, count02, ... , count17)

To view how values change at an individual pixel, we can use:

    plot(counts[1112,])
    abline(coef(line(1:17, counts[1112,])))

which will produce a plot as shown in Fig. 4b. This shows a clear increase of breeding pair counts over the period 1984–2000. Next we proceed with fitting the trend models for each pixel in the map. First, we make an empty data frame that will later be filled with fitted values:

    linefits <- as.data.frame(rep(0, length(counts[,1])), optional=TRUE)
    linefits$beta0 <- rep(0, length(counts[,1]))
    linefits$beta1 <- rep(0, length(counts[,1]))
    linefits$sumres <- rep(0, length(counts[,1]))
    linefits[1] <- NULL

and then run a loop that will fit robust linear regression models for each pixel.
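As a small self-contained illustration of the per-pixel fit, the sketch below applies line() (Tukey's resistant line, from the base stats package) to a synthetic stand-in for the counts matrix. The object names counts_demo, fit_pixel and linefits_demo are hypothetical, and the two rows are exactly linear so that the fitted coefficients are easy to check.

```r
# Synthetic stand-in for 'counts': one row per pixel, 17 yearly values
years <- 1:17
counts_demo <- rbind(
  2.0 + 0.5 * years,   # steadily increasing pixel
  5.0 - 0.2 * years    # steadily decreasing pixel
)

# Fit a resistant line to one pixel and return the three summary values
fit_pixel <- function(y) {
  fit <- line(years, y)
  c(beta0  = unname(coef(fit)[1]),       # intercept
    beta1  = unname(coef(fit)[2]),       # slope (change per year)
    sumres = sum(residuals(fit)^2))      # sum of squared residuals
}

# apply() visits every row; t() puts pixels back in rows
linefits_demo <- as.data.frame(t(apply(counts_demo, 1, fit_pixel)))
linefits_demo
```

Using apply() with a small function avoids creating one named object per pixel, which keeps memory use low when tens of thousands of grid nodes are processed.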
We can then copy each result to the linefits data frame using a loop:

    for (i in 1:nrow(counts)) {
        fit <- line(1:17, counts[i,])            # robust line for pixel i
        linefits$beta0[i] <- coef(fit)[1]
        linefits$beta1[i] <- coef(fit)[2]
        linefits$sumres[i] <- sum(residuals(fit)^2)
    }

where beta0 is the intercept coefficient, beta1 is the slope coefficient, and sumres is the sum of squared residuals. The last parameter will be used to quantify the quality of the fit: if sumres tends to zero, we speak of an obvious trend; if sumres → ∞, the trend is less reliable. Note that, in this case study, there are only 43,207 grid nodes with values (from 91,000) but the processing can still take up to 30 minutes on a standard PC.

Fig. 4. Trend analysis and change detection: (a) absolute change in distributions of Acrocephalus schoenobaenus breeding pairs [no/km²] (2000 − 1984); (b) example of distribution dynamics at a specific (grid node) location (change in count at 52.315525 N, 6.294985 E, period 1984–2000); (c) mapped slope (beta) index for a robustly-fit linear model for period 1984–2000, where positive values indicate an increase and negative values a decrease in counts; (d) delineated areas where the rate of change > 0.3.

We further convert the raster map to a point map, so that the grid nodes in linefits can be exported to a viewer such as Google Earth:

    counts.ll = spTransform(rk.Acrocephalus.1984, CRS("+proj=longlat"))

and then copy the fitted parameters (beta0, beta1 and sumres) to counts.ll (a SpatialPointsDataFrame):
    counts.ll$beta0 = linefits$beta0
    counts.ll$beta1 = linefits$beta1
    counts.ll$sumres = linefits$sumres

Export of raster maps from R to Google Earth is somewhat more complicated because we first need to create a grid in the longlat coordinate system. We start by determining the width correction factor⁴ based on the latitude of the center of the study area:

    corrf <- (1+cos((counts.ll@bbox["y","max"]+counts.ll@bbox["y","min"])/2*pi/180))/2

and then estimate the grid cell size in arc-degrees:

    geogrd.cell <- corrf*(counts.ll@bbox["x","max"]-counts.ll@bbox["x","min"])/
        counts@grid@cells.dim[1]

which gives a cell size of 0.0113152 arc-degrees. Now, we can generate a new grid definition using the spsample method of the sp package:

    geoarc <- spsample(counts.ll, type="regular", cellsize=c(geogrd.cell,geogrd.cell))
    gridded(geoarc) <- TRUE
    gridparameters(geoarc)
       cellcentre.offset   cellsize cells.dim
    x1          3.316779 0.01131520       347
    x2         50.752184 0.01131520       251

which shows that the new grid will have approximately the same number of grid nodes as the original map in the Dutch coordinate system (87,348 compared to 91,000 pixels). Further steps needed to generate a PNG of an R plot and then export it to KML are explained in [11]. The fitted values of beta1, visualized in the Google Earth viewer, are shown in Fig. 4c. Fig. 4d shows locations where beta1 > 0.3; such maps are typically of interest for decision making.

3 Discussion and Conclusions

The case studies listed previously demonstrate that the R computing environment is a well-suited tool to produce quality outputs (maps, statistical models) of interest in Geo-Ecology. In principle, all operations listed before are completely automated. This allows us to combine various operations, ranging from general point pattern analysis and geostatistics to habitat suitability mapping, via R scripting and to develop complex automated mapping frameworks.
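The cell size arithmetic used for the longlat grid can be tried in isolation. The sketch below repeats the two formulas on a hypothetical bounding box that only roughly resembles the Netherlands; the bbox values and the number of columns (ncols) are assumptions made for illustration, not values taken from the case study.

```r
# Hypothetical bounding box in geographic coordinates (degrees)
bbox <- matrix(c(3.3, 50.75, 7.2, 53.6), nrow = 2,
               dimnames = list(c("x", "y"), c("min", "max")))
ncols <- 280   # assumed number of columns of the original metric grid

# Width correction factor: the mean of 1 and the cosine of the central latitude
mid_lat <- (bbox["y", "max"] + bbox["y", "min"]) / 2
corrf <- (1 + cos(mid_lat * pi / 180)) / 2

# Cell size in arc-degrees for a square longlat cell
cell_deg <- corrf * (bbox["x", "max"] - bbox["x", "min"]) / ncols

corrf      # about 0.81 at this latitude
cell_deg   # about 0.011 arc-degrees
```

Because the factor averages 1 and cos(latitude), the resulting square cell is a compromise between the east-west and north-south spacing, which is why the new grid has a similar but not identical number of nodes.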
⁴ For datasets in geographical coordinates, a cell size correction factor can be estimated as a function of the latitude and spacing at the equator: $\Delta x_{\mathrm{metric}} = F \cdot \cos(\varphi) \cdot \Delta x^{0}_{\mathrm{degree}}$, where $\Delta x_{\mathrm{metric}}$ is the East/West grid spacing estimated for a given latitude ($\varphi$), $\Delta x^{0}_{\mathrm{degree}}$ is the grid spacing in degrees at the equator, and $F$ is the empirical constant used to convert from degrees to metres [14].

Moreover, due to the
recent implementation of the maptools and sp packages, such outputs can be easily exported to popular geographical browsers such as Google Earth and shared with the wider community [3, 10].

Automated mapping and interactive data exploration have completely changed the perspective of what is possible in Computational Geo-Ecology. In addition, outputs of such analysis add significant value to (dynamic) Geographical Information Systems used for analysis of patterns and processes of (geo-)ecosystems [15]. However, there are also a number of research topics that will need to be tackled in the coming years. These are the key ones:
⋆ Spatio-temporal visualization and data mining: The largest percentage of tools developed for CGE applications are basically visualization and data mining tools [16]. When one such tool is being developed, a range of research questions needs to be answered: How does the tool help users complete various data mining tasks, e.g. to analyze dependencies, detect outliers, discover trends, visualize uncertainties? How well does it generalize spatio-temporal patterns, and how easy is it to zoom into the data? How accurate are the final outputs?
⋆ Automated mapping and change detection: Because the quantity of both field and remote sensing data in ecology is increasing exponentially, it is also increasingly important to work with algorithms that do not require much (or any) human labour or intervention. Automation is especially important to be able to generate large numbers of target variables over dense time intervals, and to rapidly detect changes in ecosystems.
⋆ Multi-scale data integration: The input data that feed the CGE models often come with large differences in temporal and spatial support size and effective scale. On the other hand, there are many benefits of running analyses that take into account all possible correlations and dependencies. Can multi-scale/multi-source data be automatically filtered and integrated, and how can this be achieved?
⋆ Modeling and management of the uncertainties: It is increasingly important to accompany the data analysis report with a summary of the uncertainty budget. Such analysis then allows us to distinguish between conceptual (model) errors, data (survey) errors and natural variation, i.e. between the true spatio-temporal patterns and artefacts/noise. In many cases, information about the inherent uncertainties in the input data can be used to adjust or filter the data accordingly, pick the right effective scale and generalize/downscale where necessary.
⋆ Implementation of algorithms and software development: The quality of computational frameworks becomes apparent when they achieve implementation in applied fields, especially outside their fields of origin [3]. Here a range of issues needs to be addressed: How many operations does a programming language accommodate? What is the processing speed of the software? How compatible is it with various GIS formats (vector, raster)? How compatible is it with various environmental applications? How easy to use will it be? Who will maintain the software and provide support?
Our special focus in the coming years will be the development of automated spatio-temporal analysis algorithms that can be used to generate interactive (Google Earth-compatible) visualizations from large quantities of field and remote sensing data in near real-time. Although the tool is already there, our experience is that there are still many challenges to be solved in the coming years:
• Solving the problem of low quality input data (field observations): This includes low precision of spatial referencing (size of the plots), imprecise quantities/counts, (mis)classification errors, preferential sampling (complete omission of some areas) etc. At this moment it is impossible to foresee how these inherent uncertainties (biased sampling, species classification errors, location errors, poor spatial/temporal coverage etc.) will affect the final outputs, but it is on our agenda to report on this in the coming years.
• Solving the problem of computing with large data sets: The Dutch National Database of Flora and Fauna contains observations of about 3000 species, collected over 25 years at many thousands of locations. To produce maps using such a large quantity of data, automated mapping tools will need to be developed. In addition, in order to be able to generate maps in near real-time, super-computing will become unavoidable.
• Improving the over-simplistic statistical models: There are still fundamental statistical issues that need to be answered. For example, R currently does not support a combination of non-linear regression models and geostatistics⁵. This area of geostatistics is all fairly speculative and fresh, so we can expect much development in the coming years [15]. We can only agree with [15]: better predictions of geographical distributions of organisms and effects of impacts on biological communities can emerge only from more robust species' distribution models.
• Producing outputs of increasingly higher level of detail: The level of detail required by decision-makers is increasingly high. This again asks for more powerful, faster and more robust statistical models. The question remains whether predictions at fine resolution can be made using more efficient computations.

The gull data was provided by Bruno Ens (SOVON, the Netherlands) and Michael Exo (Institute of Avian Research, Germany). This project is made possible in part by the European Space Agency FlySafe initiative. The EcoGRID project is carried out in the context of the Virtual Laboratory for e-Science project (www.vl-e.nl). This project is supported by a BSIK grant from the Dutch Ministry of Education, Culture and Science and the Dutch Ministry of Agriculture, Nature and Food Quality.

⁵ Fitting a GLGM (generalized linear geostatistical model) is possible in the geoRglm package, but it requires two steps: fitting a model without correlation and then modelling the residuals.
References

[1] Shamoun, J.Z., Sierdsema, H., van Loon, E.E., van Gasteren, H., Bouten, W., Sluiter, F.: Linking Horizontal and Vertical Models to Predict 3D + time Distributions of Bird Densities. In: International Bird Strike Committee, Athens, p. 12 (2005)
[2] Van Belle, J., Bouten, W., Shamoun-Baranes, J., van Loon, E.E.: An operational model predicting autumn bird migration intensities for flight safety. Journal of Applied Ecology 11, 864–874 (2007)
[3] Bivand, R.: Implementing Spatial Data Analysis Software Tools in R. Geographical Analysis 38, 23–40 (2006)
[4] Pebesma, E.J.: Multivariable geostatistics in S: the gstat package. Computers & Geosciences 30(7), 683–691 (2004)
[5] Pebesma, E.J., Bivand, R.S.: Classes and methods for spatial data in R. R News 5(2), 9–13 (2005)
[6] Bivand, R.S.: Interfacing GRASS 6 and R. Status and development directions. GRASS Newsletter 3, 11–16 (2005)
[7] Baddeley, A., Turner, R.: Spatstat: an R package for analyzing spatial point patterns. Journal of Statistical Software 12(6), 1–42 (2005)
[8] Calenge, C.: The package “adehabitat” for the R software: A tool for the analysis of space and habitat use by animals. Ecological Modelling 197(3–4), 516–519 (2006)
[9] Waller, L.A., Gotway, C.A.: Applied Spatial Statistics for Public Health Data, p. 520. Wiley, Hoboken (2004)
[10] Bivand, R., Pebesma, E., Rubio, V.: Applied Spatial Data Analysis with R. Use R Series, p. 400. Springer, Heidelberg (2008)
[11] Hengl, T.: A Practical Guide to Geostatistical Mapping of Environmental Variables. In: EUR 22904 EN. Office for Official Publications of the European Communities, Luxembourg, p. 143 (2007)
[12] Rossiter, D.G.: Introduction to the R Project for Statistical Computing for use at ITC. In: International Institute for Geo-information Science and Earth Observation (ITC), Enschede, Netherlands, p. 136 (2007)
[13] Rowlingson, B., Diggle, P.: Splancs: spatial point pattern analysis code in S-Plus. Computers & Geosciences 19, 627–655 (1993)
[14] Guth, P.L.: Slope and aspect calculations on gridded digital elevation models: Examples from a geomorphometric toolbox for personal computers. Zeitschrift für Geomorphologie 101, 31–52 (1995)
[15] Scott, J.M., Heglund, P.J., Morrison, M.L.: Predicting Species Occurrences: Issues of Accuracy and Scale. Habitat (Ecology), p. 840. Island Press, Washington, DC (2002)
[16] Compieta, P., Di Martino, S., Bertolotto, M., Ferrucci, F., Kechadi, T.: Exploratory spatio-temporal data mining and visualization. Journal of Visual Languages and Computing 18(3), 255–279 (2007)