Finding Meaning in Points, Areas and Surfaces: Spatial Analysis in R

        Revolution Analytics
    Wednesday 13th June 1300 EST
The instructor

• Dave Unwin
• Retired Geography
  professor
• University of London,
  UK
• Spatial analysis & GIS in
  environmental sciences
Geography is everywhere?




• Everything happens somewhere
• Interest is on geo-spatial data at scales from a
  few meters to the planet Earth
Spatial analysis is the name given to a
variety of methods of analysis in which we
use LOCATION as an explanatory variable
NB: Not all spatial analysis is spatial
statistical analysis and not all spatial
analysis is geospatial
Typical Questions

• Is there an unusual clustering of point objects such
  as crimes/cases of a disease/trees/whatever here
  that we need to worry about? If so does the point
  pattern help explain why?
• Does this phenomenon in these areas (counties,
  states, countries) show spatial variation I need to
  know about? Does the pattern help explain why?
• What is the most probable value for a continuous
  variable at this location?
Characteristics of spatial data?



• Almost always given: typically the analyst has no
  choice in their acquisition, sometimes even their
  formatting;
• They have additional structure that defines their
  geometry (point, line/network, area/lattice,
  surface/field/geostatistical)
Types of spatial data



                         Objects
can be points, lines/networks or areas/lattices, with dimensions
of length L^0, L^1 and L^2 respectively
                          Fields
are self-defining and spatially continuous: everywhere
has a value (e.g. temperature, mean annual rainfall, …)
Locating things on Planet Earth

• There are many ways by which we measure our location (place name,
  address, ZIP/post code, latitude/longitude, grid reference, etc.)
• How we locate depends on context and scale
• Spatial resolution of location measurements varies
• For analysis we (usually) need (x, y) co-ordinates in a projected system
• Need for keys to provide these data, often added after the data have been
  collected
• GPS & GPS-enabled devices are changing this, and LBS (location-based
  services) is a massive and growing industry that is changing our spatial behaviour
Why R?

• A consistent environment for statistical computing
  and graphics
• Relative proximity to the data
• Easy links to code in numerous languages and to
  DBMS
• Easier development of new methods
• Packages available to perform most analyses
• Immensely supportive community
The sp Spatial class and its subclasses
> library(sp)
> getClass("Spatial")
Class "Spatial" [package "sp"]

Slots:

Name:     bbox        proj4string
Class:    matrix      CRS

Known Subclasses:
Class "SpatialPoints", directly
Class "SpatialLines", directly
Class "SpatialPolygons", directly
Class "SpatialPointsDataFrame", by class "SpatialPoints", distance 2
Class "SpatialPixels", by class "SpatialPoints", distance 2
Class "SpatialLinesDataFrame", by class "SpatialLines", distance 2
Class "SpatialGrid", by class "SpatialPoints", distance 3
Class "SpatialPixelsDataFrame", by class "SpatialPoints", distance 3
Class "SpatialGridDataFrame", by class "SpatialPoints", distance 4
Class "SpatialPolygonsDataFrame", by class "SpatialPolygons", distance 2
What extra?
• A data matrix called turbines:
> turbine_df
      lon lat
1 -0.8716027, 52.39353
2 -0.8781694, 52.39340
3 -0.8656111, 52.39398
4 -0.8795611, 52.39626
5 -0.8804666, 52.39913
6 -0.8726833, 52.39631
7 -0.8643472, 52.39723
• A spatial data frame called turbines_spdf that adds three bits of
  ‘geography’:
  1. lon/lat become spatial coordinates
  2. A coordinate reference system (CRS) to which these relate, and
  3. A bounding box (for display)
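The three additions listed above can be sketched with the sp constructors. This is a minimal sketch, not necessarily how turbines_spdf was built for the slide: the WGS84 longitude/latitude CRS string and the `id` attribute column are assumptions made for illustration, and only the coordinates come from the slide.

```r
library(sp)

# Coordinates as printed on the slide
turbine_df <- data.frame(
  lon = c(-0.8716027, -0.8781694, -0.8656111, -0.8795611,
          -0.8804666, -0.8726833, -0.8643472),
  lat = c(52.39353, 52.39340, 52.39398, 52.39626,
          52.39913, 52.39631, 52.39723)
)

# Assumption: the lon/lat values are WGS84 geographic coordinates
wgs84 <- CRS("+proj=longlat +datum=WGS84")

turbines_spdf <- SpatialPointsDataFrame(
  coords      = as.matrix(turbine_df),   # 1. lon/lat become spatial coordinates
  data        = data.frame(id = 1:7),    # hypothetical attribute column
  proj4string = wgs84                    # 2. the CRS to which they relate
)

bbox(turbines_spdf)                      # 3. the bounding box, derived automatically
```

Printing `bbox(turbines_spdf)` shows the minimum and maximum of each coordinate, which is what the display functions use to set the plotting extent.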
Why bother?
You can do a lot of spatial
analysis using a simple
Cartesian co-ordinate
system such as a unit
square, but what happens
when you want to merge
with other geographic
data?
Here is a simple example in
which turbines_spdf has
been written out in KML
and then ‘mashed’ onto
Google Earth to create a
‘pin’ map
Packages for spatial data

Contributed packages with spatial statistics
applications:

•   Utilities: rgdal, sp, maptools;
•   Point patterns: spatstat, VR:spatial, splancs;
•   Geostatistics: gstat, geoR, geoRglm, fields, spBayes,
    RandomFields, VR:spatial, sgeostat, vardiag;
•   Lattice/area data: spdep, DCluster, spgwr, ade4.
Making sense of it all …

• This is the standard work,
  written by the authors of sp
  and some of the packages
• It contains just about all you
  might want to know about
  spatial analysis in R circa
  2008
• Useful new packages have
  emerged since then
For spatial and spatial statistical analysis?
Three use case examples




• Each illustrates the analysis of a particular class of
  spatial data: points (L^0), areas (L^2) and surfaces (L^3)
Patterns in drumlins?
[Figure: a ‘drumlin’; a ‘swarm’ of them in NI, with ‘Our bit’ marked]
Adding an ‘edge’ ….




Is the pattern CSR as predicted by Smalley
  and Unwin (1968) over forty years ago?
Visualizing the pattern using kernel density estimation
Simple tests against CSR ….

Using Baddeley’s spatstat package ….

> # nearest neighbor tests for comparison
> clarkevans(drumlin_ppp)
   naive Donnelly      cdf
1.249917 1.215380 1.233599
> clarkevans(drumlin_rr)
   naive Donnelly      cdf
1.238626       NA 1.215134
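To make the ‘naive’ figure less of a black box, here is a minimal base-R sketch of the uncorrected Clark-Evans index: the mean nearest-neighbor distance divided by its expected value under CSR, 1/(2*sqrt(lambda)), where lambda is the point density. The function name and the grid test data are hypothetical, and no edge correction is attempted (the Donnelly and cdf variants in spatstat do that).

```r
# Naive Clark-Evans index: observed mean nearest-neighbor distance divided
# by the value expected under CSR. R > 1 suggests regularity, R < 1 clustering.
clark_evans_naive <- function(x, y, area) {
  d <- as.matrix(dist(cbind(x, y)))   # all pairwise distances
  diag(d) <- Inf                      # a point is not its own neighbor
  nn <- apply(d, 1, min)              # nearest-neighbor distance per point
  lambda <- length(x) / area          # intensity (points per unit area)
  mean(nn) / (1 / (2 * sqrt(lambda)))
}

# Hypothetical test pattern: a perfectly regular 10 x 10 lattice
# in a 10 x 10 window, so every nearest-neighbor distance is 1
grid <- expand.grid(x = seq(0.5, 9.5, by = 1), y = seq(0.5, 9.5, by = 1))
R_grid <- clark_evans_naive(grid$x, grid$y, area = 100)
R_grid
```

For the lattice the index is 2, well above 1, which is the direction (more regular than random) reported for the drumlins, although their values of about 1.2 are much closer to CSR.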
Ripley’s K(d) function …




NB: Modification to L(est) on RHS due to Mark Rosenstein
In this case we conclude that the pattern is
more regular than random at short range,
but then we have no evidence that it is
other than CSR at longer ranges

    The generic question is
 Is there an unusual clustering of point objects
 such as crimes/cases of a disease/trees/
 whatever here that we need to worry about? If
 so does the point pattern help explain why?
Patterns in disease incidence



• Where does this disease occur?
• Although disease affects individuals, almost always
  the available information will be aggregated into
  some areal unit such as a postal code, electoral
  district, county, state or country
• Such data are called lattice data and they are
  visualized using choropleth (‘area-value’) maps
• Our questions are essentially the same as before
Lip cancer incidence
in the Districts and
Islands of Scotland
(Clayton and Kaldor,
1987)

> library(maptools)   # provides readShapePoly()
> lips <- readShapePoly("C:/scotlip", IDvar="RECORD_ID")
> plot(lips)

Note this is an ESRI
‘shapefile’ a de facto
standard for such
lattice data
Plotting the raw
numbers?

> library(sp)
> spplot(lips, "CANCER")

This is a complete NO NO NO
Plotting the rates?
The data are basically
Poisson and the numbers
are low, which means that
these rates are unstable to
quite small changes
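A little arithmetic shows why low counts make rates unstable. This is a sketch with made-up numbers (a hypothetical district of 10,000 people), not the Scottish data: one extra case moves the rate by half, and the relative standard error of a Poisson count, 1/sqrt(count), only shrinks slowly as counts grow.

```r
# Rate per 100,000 population
rate_per_100k <- function(cases, pop) 1e5 * cases / pop

# Hypothetical small district: 2 cases versus 3 cases in 10,000 people
r2 <- rate_per_100k(2, 10000)
r3 <- rate_per_100k(3, 10000)
r3 / r2                      # one extra case raises the rate by 50%

# Relative standard error of a Poisson count is 1 / sqrt(count)
counts <- c(2, 20, 200)
rel_se <- 1 / sqrt(counts)
round(rel_se, 2)
```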
Two alternatives




Probabilities               Bayesian weighting
Chi-square mapping using ‘Pearsonian’ Residuals

> sum(lips$CANCER)
[1] 536
> sum(lips$POP)
[1] 14979894
> pop_exp <- 536 * (lips$POP / 14979894)
> chisq <- (lips$CANCER - pop_exp) / sqrt(pop_exp)
> lips_chi <- spCbind(lips, chisq)
> spplot(lips_chi, "chisq")
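The same calculation can be tried without the shapefile. The sketch below uses four invented districts (the counts and populations are hypothetical, not taken from the Scottish data) to show the two steps: share the total cases out in proportion to population, then scale each observed-minus-expected difference by the square root of the expectation.

```r
# Hypothetical districts: observed cases and population at risk
cases <- c(9, 39, 11, 2)
pop   <- c(28324, 231337, 83190, 53627)

# Expected cases if every district shared the same overall rate
expected <- sum(cases) * pop / sum(pop)

# Signed chi ('Pearsonian') residuals: positive means more cases than expected
chisq <- (cases - expected) / sqrt(expected)

data.frame(cases, expected = round(expected, 1), chisq = round(chisq, 2))
```

Because the residuals are signed, the map distinguishes areas with an excess of cases from those with a deficit, which a plot of raw counts cannot do.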
But does it have a ‘geography’?
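                  Moran’s I is used globally

W = \begin{pmatrix}
w_{11} & w_{12} & \cdots & w_{1n} \\
w_{21} & w_{22} & \cdots & w_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
w_{n1} & w_{n2} & \cdots & w_{nn}
\end{pmatrix}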
We conclude that we are not fooling ourselves!

Geographic structure scheme   Moran’s I     Expected value         Variance of I   z-score
Simple contiguity             0.363263693   -0.019230769 (n=52)    0.006769752     4.6488
Delaunay                      0.519599336   -0.018181818           0.005068704     7.5537
Distance k=3                  0.543587908   -0.018181818           0.008287442     6.1709
Sphere of influence           0.483547126   -0.018181818           0.006087487     6.4306
Gabriel graph                 0.371846634   -0.022222222 (n=45)    0.007022745     4.7024
Relative neighbors            0.38126027    -0.02500000 (n=40)     0.01206414      3.6988
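For readers who want to see what sits behind these numbers, here is a base-R sketch in two parts. The first computes Moran’s I from a weights matrix W using the usual definition, I = (n / sum(W)) * sum_ij w_ij z_i z_j / sum_i z_i^2 with z the deviations from the mean, on a hypothetical toy example of four areas in a row. The second recomputes the z-score column from the tabulated I, expected value and variance as z = (I - E) / sqrt(Var). The function name and toy data are illustrative only; in practice spdep does this work.

```r
# Moran's I for values x and a spatial weights matrix W
moran_i <- function(x, W) {
  z <- x - mean(x)                       # deviations from the mean
  n <- length(x)
  (n / sum(W)) * sum(W * outer(z, z)) / sum(z^2)
}

# Toy example (hypothetical): four areas in a row, neighbors share an edge
x <- c(1, 2, 3, 4)
W <- matrix(0, 4, 4)
W[cbind(1:3, 2:4)] <- 1                  # link area i to area i + 1
W <- W + t(W)                            # make the weights symmetric
I_toy <- moran_i(x, W)
E_toy <- -1 / (length(x) - 1)            # expectation under no autocorrelation

# Standardising the tabulated values from the slide
I_obs <- c(0.363263693, 0.519599336, 0.543587908,
           0.483547126, 0.371846634, 0.38126027)
E_obs <- c(-0.019230769, -0.018181818, -0.018181818,
           -0.018181818, -0.022222222, -0.02500000)
V_obs <- c(0.006769752, 0.005068704, 0.008287442,
           0.006087487, 0.007022745, 0.01206414)
z_obs <- (I_obs - E_obs) / sqrt(V_obs)
round(z_obs, 4)
```

Every scheme gives a z-score above 3.5, so the conclusion does not depend on which definition of ‘neighbor’ is used to build W.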
We conclude that the pattern is
    ‘real’, the disease has a
    geography of interest

   The generic question is:
Does this phenomenon in these areas (counties,
states, countries) show spatial variation I need to
know about? Does the pattern help explain why?
Spatial interpolation of a continuous field




 In effect we take a sample of ‘heights’ and use these to
  estimate the value EVERYWHERE across the surface
Spatial interpolation

•   The key property of the variable is that it is spatially continuous (everywhere has a
    value and the gradient is likewise a continuous vector field)
•   Given a scatter of sample measurements of the ‘height’ of some continuous
    variable, what is the value of this field variable at this location?
•   There are domain-dependent sub-questions such as: what is the gradient of the
    field at this point? Or: how much of the variable is below the surface (e.g. rainfall
    totals)?
•   Examples might be air temperature, rainfall over some period, values of some
    mineral resource, ground height etc., etc.
•   Sometimes results can be verified by further sampling, but equally often there is
    no external way to test the results
•   The process is called spatial interpolation and there are a great many ways of
    doing it automatically
Interpolation by Inverse Distance Weighting (IDW)

• Estimate each and every location on a very
  fine grid using an inverse distance weighted
  sum of the height values of neighboring
  control points
• Uses the gstat package:
• A parameter ‘e’ controls the degree of
  smoothing
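The weighting itself is only a few lines. The sketch below is a hand-rolled base-R version for a single prediction location, written to show what the exponent e does; it is not the gstat implementation (where, as far as I recall, the exponent is passed to idw() as idp). The function name and the two control points are hypothetical.

```r
# Inverse distance weighted estimate at one location xy0,
# from control points xy (two-column matrix) with heights z
idw_predict <- function(xy, z, xy0, e = 2) {
  d <- sqrt((xy[, 1] - xy0[1])^2 + (xy[, 2] - xy0[2])^2)
  if (any(d == 0)) return(mean(z[d == 0]))   # exactly on a control point
  w <- 1 / d^e                               # nearer points weigh more
  sum(w * z) / sum(w)
}

# Hypothetical control points: heights 10 and 20, two units apart
xy <- rbind(c(0, 0), c(2, 0))
z  <- c(10, 20)

est_e1 <- idw_predict(xy, z, c(0.5, 0), e = 1)
est_e2 <- idw_predict(xy, z, c(0.5, 0), e = 2)
est_e3 <- idw_predict(xy, z, c(0.5, 0), e = 3)
c(e1 = est_e1, e2 = est_e2, e3 = est_e3)
```

As e increases the estimate at (0.5, 0) moves towards the height of the nearer control point, so larger exponents give a surface dominated by local values while smaller ones smooth more.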
Rendering
[Figure: IDW surfaces rendered with e = 1.0, e = 2.0 and e = 3.0]
Issues in IDW

• Produces ring contours or bull’s eyes
• No way of assessing the likely errors involved
• No theoretical reason for the choice of the
  distance exponent to be used
• Undesirable side effects if the control data are
  clustered
• But it corresponds fairly well to what a human
  might draw
Geostatistics: making use of spatial dependence in
                    interpolation



• For points and areas spatial dependence can
  complicate any statistical analysis using
  standard methods
• Can we characterise the spatial dependence
  across a field and use it to produce better
  interpolations?
Variography: the semi-variogram ‘cloud’
Summary semi-variogram



We fit one or other of
the plausible models
to these data to derive
a function that
describes the spatial
dependence
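The summary semi-variogram is obtained by binning the cloud: for every pair of control points take half the squared difference in height, group the pairs by separation distance, and average within each group. A base-R sketch follows, under the assumption of an isotropic field (direction ignored); the function name, the bin breaks and the four-point transect are hypothetical, and gstat’s variogram() is the tool to use on real data.

```r
# Empirical semi-variogram: mean of (z_i - z_j)^2 / 2 within distance bins
empirical_semivariogram <- function(xy, z, breaks) {
  d   <- as.vector(dist(xy))           # separation of every pair
  g   <- as.vector(dist(z))^2 / 2      # semi-variance of every pair (the 'cloud')
  bin <- cut(d, breaks)                # assign pairs to distance classes
  data.frame(
    dist  = as.vector(tapply(d, bin, mean)),
    gamma = as.vector(tapply(g, bin, mean)),
    np    = as.vector(table(bin))      # number of pairs per class
  )
}

# Hypothetical transect: heights rise steadily along a line
xy <- cbind(0:3, 0)
z  <- c(0, 1, 2, 3)
sv <- empirical_semivariogram(xy, z, breaks = c(0.5, 1.5, 2.5, 3.5))
sv
```

A model (spherical, exponential, Gaussian and so on) is then fitted to the `dist` and `gamma` columns, and it is that fitted function which kriging uses to set its weights.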
Interpolation by Kriging

Error of the estimates can also
be mapped:
We have our estimates over the
           entire area


 The generic question is:
What is the most probable value for a
continuous variable at this location?
Some R Fun (1): using dismo
To find places and plot them using Google Earth and Maps™
> library(XML)    # needs this
> library(rgdal)  # and this
> library(dismo)
> place <- geocode("Maidwell, Northamptonshire, UK")  # the address needs to have enough to be recognized
> place  # the place object is a vector of length 7 with a bounding box:
  ID        lon      lat    lonmin     lonmax   latmin   latmax                       location
1  1 -0.9030642 52.38524 -0.938073 -0.8710494 52.37016 52.40107 Maidwell, Northamptonshire, UK
> size <- extent(unlist(place[4:7]))  # what does this do?
> map <- gmap(size, type="satellite")
> plot(map)
> map <- gmap(size, type="roadmap")
> plot(map)
Where I live …
[Figure: Google Maps™ road map and aerial photography of the location]
Or (slightly) better known?


> place <- geocode("The White House, Washington, USA")
> size <- extent(unlist(place[4:7]))
> map <- gmap(size, type="satellite")
> plot(map)
Some R Fun (2): exporting KML
• Due to James Cheshire
  UCL
• The London Bicycle Hire
  system

> library(maptools)
> library(rgdal)
> cycle <- read.csv("London_cycle_hire_locs.csv", header=TRUE)
> plot(cycle$X, cycle$Y)
Some R Fun (2): exporting KML (continued)
> coordinates(cycle) <- c("X", "Y")   # promote the data frame to a spatial object
> BNG <- CRS("+init=epsg:27700")      # British National Grid
> proj4string(cycle) <- BNG
> p4s <- CRS("+proj=longlat +ellps=WGS84 +datum=WGS84")
> cycle_wgs84 <- spTransform(cycle, CRSobj=p4s)   # reproject to lon/lat for KML
> writeOGR(cycle_wgs84,
    dsn="london_cycle_docks.kml",
    layer="cycle_wgs84",
    driver="KML",
    dataset_options=c("NameField=name"))
The End
• Taking it further:
• Applied Spatial Data Analysis with R (Bivand,
  Pebesma and Gómez-Rubio, 2008)
• Spatial Statistics with R commences 14th
  December 2012 at Statistics.com ™


QUESTIONS ARE WELCOME

The Business Economics and Opportunity of Open Source Data ScienceRevolution Analytics
 
Taking R Analytics to SQL and the Cloud
Taking R Analytics to SQL and the CloudTaking R Analytics to SQL and the Cloud
Taking R Analytics to SQL and the CloudRevolution Analytics
 
The Network structure of R packages on CRAN & BioConductor
The Network structure of R packages on CRAN & BioConductorThe Network structure of R packages on CRAN & BioConductor
The Network structure of R packages on CRAN & BioConductorRevolution Analytics
 
The network structure of cran 2015 07-02 final
The network structure of cran 2015 07-02 finalThe network structure of cran 2015 07-02 final
The network structure of cran 2015 07-02 finalRevolution Analytics
 
Simple Reproducibility with the checkpoint package
Simple Reproducibilitywith the checkpoint packageSimple Reproducibilitywith the checkpoint package
Simple Reproducibility with the checkpoint packageRevolution Analytics
 

More from Revolution Analytics (20)

Speeding up R with Parallel Programming in the Cloud
Speeding up R with Parallel Programming in the CloudSpeeding up R with Parallel Programming in the Cloud
Speeding up R with Parallel Programming in the Cloud
 
Migrating Existing Open Source Machine Learning to Azure
Migrating Existing Open Source Machine Learning to AzureMigrating Existing Open Source Machine Learning to Azure
Migrating Existing Open Source Machine Learning to Azure
 
R in Minecraft
R in Minecraft R in Minecraft
R in Minecraft
 
The case for R for AI developers
The case for R for AI developersThe case for R for AI developers
The case for R for AI developers
 
Speed up R with parallel programming in the Cloud
Speed up R with parallel programming in the CloudSpeed up R with parallel programming in the Cloud
Speed up R with parallel programming in the Cloud
 
The R Ecosystem
The R EcosystemThe R Ecosystem
The R Ecosystem
 
R Then and Now
R Then and NowR Then and Now
R Then and Now
 
Predicting Loan Delinquency at One Million Transactions per Second
Predicting Loan Delinquency at One Million Transactions per SecondPredicting Loan Delinquency at One Million Transactions per Second
Predicting Loan Delinquency at One Million Transactions per Second
 
Reproducible Data Science with R
Reproducible Data Science with RReproducible Data Science with R
Reproducible Data Science with R
 
The Value of Open Source Communities
The Value of Open Source CommunitiesThe Value of Open Source Communities
The Value of Open Source Communities
 
The R Ecosystem
The R EcosystemThe R Ecosystem
The R Ecosystem
 
R at Microsoft (useR! 2016)
R at Microsoft (useR! 2016)R at Microsoft (useR! 2016)
R at Microsoft (useR! 2016)
 
Building a scalable data science platform with R
Building a scalable data science platform with RBuilding a scalable data science platform with R
Building a scalable data science platform with R
 
R at Microsoft
R at MicrosoftR at Microsoft
R at Microsoft
 
The Business Economics and Opportunity of Open Source Data Science
The Business Economics and Opportunity of Open Source Data ScienceThe Business Economics and Opportunity of Open Source Data Science
The Business Economics and Opportunity of Open Source Data Science
 
Taking R Analytics to SQL and the Cloud
Taking R Analytics to SQL and the CloudTaking R Analytics to SQL and the Cloud
Taking R Analytics to SQL and the Cloud
 
The Network structure of R packages on CRAN & BioConductor
The Network structure of R packages on CRAN & BioConductorThe Network structure of R packages on CRAN & BioConductor
The Network structure of R packages on CRAN & BioConductor
 
The network structure of cran 2015 07-02 final
The network structure of cran 2015 07-02 finalThe network structure of cran 2015 07-02 final
The network structure of cran 2015 07-02 final
 
Simple Reproducibility with the checkpoint package
Simple Reproducibilitywith the checkpoint packageSimple Reproducibilitywith the checkpoint package
Simple Reproducibility with the checkpoint package
 
R at Microsoft
R at MicrosoftR at Microsoft
R at Microsoft
 

Recently uploaded

Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 

Recently uploaded (20)

Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 

Finding Meaning in Points, Areas and Surfaces: Spatial Analysis in R

  • 1. Finding Meaning in Points, Areas and Surfaces: Spatial Analysis in R Revolution Analytics Wednesday 13th June 1300 EST
  • 2. The instructor • Dave Unwin • Retired Geography professor • University of London, UK • Spatial analysis & GIS in environmental sciences
  • 3. Geography is everywhere? • Everything happens somewhere • Interest is on geo-spatial data at scales from a few meters to the planet Earth
  • 4. Spatial analysis is the name given to a variety of methods of analysis in which we use LOCATION as an explanatory variable NB: Not all spatial analysis is spatial statistical analysis and not all spatial analysis is geospatial
  • 5. Typical Questions • Is there an unusual clustering of point objects such as crimes/cases of a disease/trees/whatever here that we need to worry about? If so does the point pattern help explain why? • Does this phenomenon in these areas (counties, states, countries) show spatial variation I need to know about? Does the pattern help explain why? • What is the most probable value for a continuous variable at this location?
  • 6. Characteristics of spatial data? • Almost always given: typically the analyst has no choice in their acquisition, sometimes even their formatting; • They have additional structure that defines their geometry (point, line/network, area/lattice, surface/field/geostatistical)
  • 7. Types of spatial data Objects can be points, lines/networks or areas/lattices with L0, L1 and L2 dimension of length Fields are self-defining and spatially continuous: everywhere has a value (e.g. temperature, mean annual rainfall, …)
  • 8. Locating things on Planet Earth • There are many ways by which we measure our location (place name, address, ZIP/Post code, latitude/longitude, grid reference etc.) • How we locate depends on context and scale • Spatial resolution of location measurements varies • For analysis we (usually) need (x, y) co-ordinates in a projected system • Need for keys to provide these data, often added after the data have been collected • GPS & GPS-enabled devices are changing this and LBS is a massive and growing industry that is changing our spatial behaviour
  • 9. Why R? • A consistent environment for statistical computing and graphics • Relative proximity to the data • Easy links to code in numerous languages and to DBMS • Easier development of new methods • Packages available to perform most analyses • Immensely supportive community
  • 10. The sp Spatial Class and its subclasses
  • 11. > library(sp)
        > getClass("Spatial")
        Class "Spatial" [package "sp"]
        Slots:
        Name:         bbox proj4string
        Class:      matrix         CRS
        Known Subclasses:
        Class "SpatialPoints", directly
        Class "SpatialLines", directly
        Class "SpatialPolygons", directly
        Class "SpatialPointsDataFrame", by class "SpatialPoints", distance 2
        Class "SpatialPixels", by class "SpatialPoints", distance 2
        Class "SpatialLinesDataFrame", by class "SpatialLines", distance 2
        Class "SpatialGrid", by class "SpatialPoints", distance 3
        Class "SpatialPixelsDataFrame", by class "SpatialPoints", distance 3
        Class "SpatialGridDataFrame", by class "SpatialPoints", distance 4
        Class "SpatialPolygonsDataFrame", by class "SpatialPolygons", distance 2
  • 12. What extra?
        • A data matrix called turbines:
          > turbine_df
                   lon       lat
          1 -0.8716027, 52.39353
          2 -0.8781694, 52.39340
          3 -0.8656111, 52.39398
          4 -0.8795611, 52.39626
          5 -0.8804666, 52.39913
          6 -0.8726833, 52.39631
          7 -0.8643472, 52.39723
        • A spatial data frame called turbines_spdf that adds three bits of ‘geography’:
          1. lon/lat become spatial coordinates
          2. A coordinate reference system (CRS) to which these relate, and
          3. A bounding box (for display)
  • 13. Why bother? You can do a lot of spatial analysis using a simple Cartesian co-ordinate system such as a unit square, but what happens when you want to merge with other geographic data? Here is a simple example in which turbines_spdf has been written out in KML and then ‘mashed’ onto Google Earth to create a ‘pin’ map
  • 14. Packages for spatial data Contributed packages with spatial statistics applications: • Utilities: rgdal, sp, maptools • Point patterns: spatstat, VR:spatial, splancs • Geostatistics: gstat, geoR, geoRglm, fields, spBayes, RandomFields, VR:spatial, sgeostat, vardiag • Lattice/area data: spdep, DCluster, spgwr, ade4
  • 15. Making sense of it all … • This is the standard work, written by the authors of sp and some of the packages • It contains just about all you might want to know about spatial analysis in R circa 2008 • Useful new packages have emerged since then
  • 16. For spatial and spatial statistical analysis?
  • 17. Three use case examples • Each illustrates the analysis of a particular class of spatial data -- points L0, area L2 and surfaces L3
  • 18. Patterns in drumlins? [Figures: a ‘drumlin’; a ‘swarm’ of them in NI; our bit]
  • 19. Adding an ‘edge’ …. Is the pattern CSR as predicted by Smalley and Unwin (1968) over forty years ago?
  • 20. Visualizing the pattern using kernel density estimation
  • 21. Simple tests against CSR …. Using Baddeley’s spatstat package ….
        > # nearest neighbor tests for comparison
        > clarkevans(drumlin_ppp)
           naive Donnelly      cdf
        1.249917 1.215380 1.233599
        > clarkevans(drumlin_rr)
           naive Donnelly      cdf
        1.238626       NA 1.215134
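The idea behind the naive index can be sketched in base R: it is the ratio of the observed mean nearest-neighbour distance to its expected value under CSR, 1/(2*sqrt(lambda)), where lambda is the number of points per unit area. This is a minimal sketch with no edge correction, not spatstat's own code; the function name clark_evans_naive and the grid data are hypothetical and only for illustration.

```r
# Naive Clark-Evans index: observed mean nearest-neighbour distance
# divided by the value expected under complete spatial randomness (CSR).
# No edge correction is applied (assumption: a plain sketch only).
clark_evans_naive <- function(x, y, area) {
  n <- length(x)
  d <- as.matrix(dist(cbind(x, y)))   # all pairwise distances
  diag(d) <- Inf                      # ignore each point's distance to itself
  nn <- apply(d, 1, min)              # nearest-neighbour distance per point
  lambda <- n / area                  # intensity: points per unit area
  mean(nn) / (1 / (2 * sqrt(lambda)))
}

# Hypothetical data: a perfectly regular 10 x 10 grid in the unit square
g <- expand.grid(x = (1:10 - 0.5) / 10, y = (1:10 - 0.5) / 10)
r_grid <- clark_evans_naive(g$x, g$y, area = 1)
print(r_grid)
```

Values above 1 point towards a regular pattern and values below 1 towards clustering, which is how the drumlin results of about 1.2 are read.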
  • 22. Ripley’s K(d) function … NB: Modification to L(est) on RHS due to Mark Rosenstein
  • 23. In this case we conclude that the pattern is more regular than random at short range, but then we have no evidence that it is other than CSR at longer ranges The generic question is Is there an unusual clustering of point objects such as crimes/cases of a disease/trees/ whatever here that we need to worry about? If so does the point pattern help explain why?
  • 24. Patterns in disease incidence • Where does this disease occur? • Although disease affects individuals, almost always the available information will be aggregated into some areal unit such as a postal code, electoral district, county, state or country • Such data are called lattice data and they are visualized using choropleth (‘area-value’) maps • Our questions are essentially the same as before
  • 25. Lip cancer incidence in the Districts and Islands of Scotland (Clayton and Kaldor, 1987)
        > library(maptools) # provides readShapePoly
        > lips <- readShapePoly("C:s cotlip", IDvar="RECORD_ID")
        > plot(lips)
        Note this is an ESRI ‘shapefile’, a de facto standard for such lattice data
  • 26. Plotting the raw numbers?
        > library(sp)
        > spplot(lips, "CANCER")
        This is a complete NO NO NO
  • 27. Plotting the rates? The data are basically Poisson and the numbers are low, which means that these rates are unstable to quite small changes
  • 28. Two alternatives: • Probabilities • Bayesian weighting
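One way to read the ‘probabilities’ alternative is as a Poisson probability map: for each area, how likely is a count at least as extreme as the one observed if every area shared the overall rate? The slide does not say which method was used, so this is an assumption. The function name poisson_prob and the three-area data are hypothetical.

```r
# Poisson probability for each area, assuming a common overall rate.
# Assumption: expected count = total cases * share of total population.
poisson_prob <- function(obs, pop) {
  expected <- sum(obs) * pop / sum(pop)
  ifelse(obs >= expected,
         ppois(obs - 1, expected, lower.tail = FALSE),  # P(X >= obs)
         ppois(obs, expected))                          # P(X <= obs)
}

# Hypothetical counts and populations for three areas
obs <- c(0, 5, 5)
pop <- c(1000, 1000, 8000)
p <- poisson_prob(obs, pop)
print(round(p, 4))
```

Small probabilities flag areas whose counts are hard to explain by chance alone, which is less sensitive to small populations than a raw rate.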
  • 29. Chi-square mapping using ‘Pearsonian’ Residuals
        > sum(lips$CANCER)
        [1] 536
        > sum(lips$POP)
        [1] 14979894
        > pop_exp <- 536*(lips$POP/14979894)
        > chisq <- (lips$CANCER - pop_exp)/sqrt(pop_exp)
        > lips_chi <- spCbind(lips, chisq)
        > spplot(lips_chi, "chisq")
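  • 30. But does it have a ‘geography’? Moran’s I is used globally, with a spatial weights matrix
        W = \begin{pmatrix} w_{11} & w_{12} & \cdots & w_{1n} \\ w_{21} & w_{22} & \cdots & w_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ w_{n1} & w_{n2} & \cdots & w_{nn} \end{pmatrix}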
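As a rough illustration of how the weights matrix enters the statistic, Moran's I can be computed in base R from a variable x and a matrix W. This is a minimal sketch of the standard formula, not the code behind the slides; the function name morans_i and the four-area example are hypothetical.

```r
# Moran's I = (n / S0) * sum_ij w_ij (x_i - xbar)(x_j - xbar) / sum_i (x_i - xbar)^2
# where S0 is the sum of all weights.
morans_i <- function(x, W) {
  n  <- length(x)
  z  <- x - mean(x)                  # deviations from the mean
  s0 <- sum(W)                       # sum of all weights
  (n / s0) * sum(W * outer(z, z)) / sum(z^2)
}

# Hypothetical example: four areas in a row, neighbours share an edge
W <- matrix(0, 4, 4)
for (i in 1:3) {
  W[i, i + 1] <- 1
  W[i + 1, i] <- 1
}
x <- c(1, 2, 3, 4)

i_obs <- morans_i(x, W)
i_exp <- -1 / (length(x) - 1)        # expected value under no autocorrelation
print(c(observed = i_obs, expected = i_exp))
```

An observed value well above the expected value suggests that neighbouring areas tend to have similar values. Different definitions of ‘neighbour’ give different W and therefore different values of I.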
  • 31. We conclude that we are not fooling ourselves!
        Geographic Structure Scheme   Moran’s I     Expected value (E)    Variance      z-score
        Simple contiguity             0.363263693   -0.019230769 (n=52)   0.006769752   4.6488
        Delaunay                      0.519599336   -0.018181818          0.005068704   7.5537
        Distance k=3                  0.543587908   -0.018181818          0.008287442   6.1709
        Sphere of influence           0.483547126   -0.018181818          0.006087487   6.4306
        Gabriel graph                 0.371846634   -0.022222222 (n=45)   0.007022745   4.7024
        Relative neighbors            0.38126027    -0.02500000 (n=40)    0.01206414    3.6988
  • 32. We conclude that the pattern is ‘real’, the disease has a geography of interest The generic question is: Does this phenomenon in these areas (counties, states, countries) show spatial variation I need to know about? Does the pattern help explain why?
  • 33. Spatial interpolation of a continuous field In effect we take a sample of ‘heights’ and use these to estimate the value EVERYWHERE across the surface
  • 34. Spatial interpolation • The key property of the variable is that it is spatially continuous (everywhere has a value and the gradient is likewise a continuous vector field) • Given a scatter of sample measurements of the ‘height’ of some continuous variable, what is the value of this field variable at this location? • There are domain-dependent sub-questions such as: what is the gradient of the field at this point? Or: how much of the variable is below the surface (e.g. rainfall totals)? • Examples might be air temperature, rainfall over some period, values of some mineral resource, ground height etc. • Sometimes results can be verified by further sampling, but equally often there is no external way to test the results • The process is called spatial interpolation and there are a great many ways of doing it automatically
  • 35. Interpolation by Inverse Distance Weighting (IDW) • Estimate each and every location on a very fine grid using an inverse distance weighted sum of the height values of neighboring control points • Uses the gstat package: • A parameter ‘e’ controls the degree of smoothing
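The weighted sum described above can be sketched in base R for a single target location. This is a minimal illustration of the idea, not the gstat call used for the slides; the function name idw_predict, the argument names and the two control points are hypothetical.

```r
# Inverse distance weighted estimate at (x0, y0):
#   z_hat = sum(w_i * z_i) / sum(w_i),  with  w_i = d_i^(-e)
# where d_i is the distance from control point i to the target.
idw_predict <- function(x, y, z, x0, y0, e = 2) {
  d <- sqrt((x - x0)^2 + (y - y0)^2)
  if (any(d == 0)) return(mean(z[d == 0]))  # target coincides with a control point
  w <- d^(-e)                               # larger e gives nearer points more weight
  sum(w * z) / sum(w)
}

# Hypothetical control points: heights 10 and 20, two units apart
x <- c(0, 2); y <- c(0, 0); z <- c(10, 20)

mid   <- idw_predict(x, y, z, x0 = 1,   y0 = 0)          # halfway between them
near1 <- idw_predict(x, y, z, x0 = 0.5, y0 = 0, e = 2)   # closer to the first point
print(c(mid = mid, near1 = near1))
```

Applying the function to every node of a fine grid gives the interpolated surface. Raising e pulls each estimate towards its nearest control point, which is where the ‘bull’s eye’ look comes from.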
  • 36. Rendering [Figure panels: IDW e=2.0; IDW e=1.0; IDW e=3.0]
  • 37. Issues in IDW • Produces ring contours or bull’s eyes • No way of assessing the likely errors involved • No theoretical reason for the choice of the distance exponent to be used • Undesirable side effects if the control data are clustered • But it corresponds fairly well to what a human might draw
  • 38. Geostatistics: making use of spatial dependence in interpolation • For points and areas spatial dependence can complicate any statistical analysis using standard methods • Can we characterise the spatial dependence across a field and use it to produce better interpolations?
  • 40. Summary semi-variogram We fit one or other of the plausible models to these data to derive a function that describes the spatial dependence
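The points to which a model is fitted come from an empirical semi-variogram: for each distance band, half the mean squared difference between pairs of values separated by roughly that distance. A minimal base R sketch follows, assuming isotropy and simple distance bins. The function name empirical_variogram and the four-point transect are hypothetical, and this is not the gstat routine.

```r
# Empirical semi-variogram: gamma(h) = mean over pairs in bin h of (z_i - z_j)^2 / 2
empirical_variogram <- function(x, y, z, breaks) {
  d   <- as.vector(dist(cbind(x, y)))   # separation distance of every pair
  g   <- as.vector(dist(z))^2 / 2       # half squared difference of every pair
  bin <- cut(d, breaks = breaks)        # assign each pair to a distance band
  data.frame(
    lag   = as.vector(tapply(d, bin, mean)),   # mean distance in each band
    gamma = as.vector(tapply(g, bin, mean)),   # semi-variance in each band
    pairs = as.vector(table(bin))              # number of pairs in each band
  )
}

# Hypothetical transect: values rise steadily along a line
x <- 0:3; y <- rep(0, 4); z <- 0:3
v <- empirical_variogram(x, y, z, breaks = c(0.5, 1.5, 2.5, 3.5))
print(v)
```

In this toy case the semi-variance keeps rising with distance because the data contain a trend. With real data one looks for the curve to level off, and the fitted model summarises that shape.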
  • 41. Interpolation by Kriging Error of the estimates can also be mapped:
  • 42. We have our estimates over the entire area The generic question is: What is the most probable value for a continuous variable at this location?
  • 43. Some R-fun (1): using dismo. To find places and plot them using Google Earth and Maps™
        > library(XML) #needs this
        > library(rgdal) #and this
        > library(dismo)
        > place<-geocode("Maidwell, Northamptonshire, UK") #the address needs to have enough to be recognized
        > place # the place object is a vector of length 7 with a bounding box:
          ID        lon      lat    lonmin     lonmax   latmin   latmax
        1  1 -0.9030642 52.38524 -0.938073 -0.8710494 52.37016 52.40107
                                location
        1 Maidwell, Northamptonshire, UK
        > size<-extent(unlist(place[4:7])) #what does this do?
        > map<-gmap(size,type="satellite")
        > plot(map)
        > map<-gmap(size,type="roadmap")
        > plot(map)
  • 44. Where I live … Google Maps™ Aerial photography
  • 45. Or (slightly) better known?
        > place<-geocode("The White House, Washington, USA")
        > size<-extent(unlist(place[4:7]))
        > map<-gmap(size,type="satellite")
        > plot(map)
  • 46. Some R Fun (2): exporting KML • Due to James Cheshire, UCL • The London Bicycle Hire system
        > library(maptools)
        > library(rgdal)
        > cycle <- read.csv("London_cycle_hire_locs.csv", header=TRUE)
        > plot(cycle$X,cycle$Y)
  • 47. Some R Fun (2): exporting KML (continued)
        > coordinates(cycle) <- c("X","Y")
        > BNG <- CRS("+init=epsg:27700")
        > proj4string(cycle) <- BNG
        > p4s <- CRS("+proj=longlat +ellps=WGS84 +datum=WGS84")
        > cycle_wgs84 <- spTransform(cycle, CRS=p4s)
        > writeOGR(cycle_wgs84, dsn="london_cycle_docks.kml", layer="cycle_wgs84", driver="KML", dataset_options=c("NameField=name"))
  • 48. The End • Taking it further: • Applied Spatial Data Analysis with R (Bivand, Pebesma and Gomez-Rubio, 2008) • Spatial Statistics with R commences 14th December 2012 at Statistics.com™ QUESTIONS ARE WELCOME