SlideShare a Scribd company logo
1
Intro to Spatial Data Science
with R
Alí Santacruz
amsantac.co
JULY 2016
2
About me
• Expert in geomatics with a background in environmental sciences
• R geek
• PhD candidate in Geography
• Interested in Spatial Data Science
• Author of several R packages (available on CRAN)
3
Purpose of this talk
• Discuss what Spatial Data Science is
• Give an introductory explanation about how to conduct Spatial Data
Science with R
4
What is Spatial Data Science?
Spatial Data Scientist (n.):
Statistician GIS/RS expertGIS developer Software
engineer
Spatial Data
Scientist
Spatial Data Science
Data Science
Spatial
Person that is better in spatial data analysis than a GIS developer and
better in software engineering than a GIS/RS expert
5
Spatial Data Science
All they are combined for data
analysis in order to …
Support a better decision
making
"The key word in data science is not data; it is science"
Jeff Leek. Data Science Specialization. Coursera.
6
Spatial
Data
Scientist
Modified from
gettingsmart.com
7
Hacking skills
• Programming languages: Python and R (and others)
http://www.kdnuggets.com/2016/06/r-python-top-analytics-data-mining-data-science-software.html
8
Why should we use R?
• Free and open-source
• A large and comprehensive set of packages (> 8600)
• Data access
• Data cleaning
• Analysis
• Visualization and report generation
• Excellent development environments – RStudio IDE
• An active and friendly developers community
• A huge users community: > 2 million
9
Why R for spatial analysis
• 160+ packages in CRAN Task View: Analysis of Spatial Data
• Classes for spatial (and spatio-temporal) data
• Spatial data import/export
• Exploratory spatial data analysis
• Support for vector and raster operations
• Spatial statistics
• Data visualization through static and dynamic (web) graphics
• Integration with GIS software
• Easy integration with techniques from non-spatial packages
10
R classes for spatial data
• Before 2003:
• Several packages with different assumptions on how spatial data was structured
• From 2003:
• ‘sp’ package: extends R classes and methods for spatial data (vector and raster)
• From 2010:
• ‘raster’ package: deals with raster files stored in disk that are too large to be
loaded on memory (RAM)
11
R classes for spatial data
SpatialPointsDataFrame SpatialLinesDataFrame SpatialPolygonsDataFrame
SpatialPixelsDataFrame
SpatialGridDataFrame
sp package
RasterLayer
RasterStack
RasterBrick
raster package
(recommended)
12
The Data
Science
Process
Modified from science2knowledge
Reproducibility
13
MODELAR
los datos
MODEL
the data
EXPLORAR
los datos
EXPLORE
the data
PREPARAR
los datos
PREPARE
the data
• Is this A or B or C? :: classification
• Is this weird? :: anomaly detection
• How much/how many? :: regression
• How is it organized? :: clustering
• How will it change? :: prediction
"The key word in data science is not data; it is science"
Jeff Leek. Data Science Specialization. Coursera.
OBTENER
los datos
GET
the data
Domain expertise
PLANTEAR la
pregunta correcta
PLANTEAR la
pregunta correcta
ASK the right
question
COMUNICAR
los resultados
COMMUNICATE
the results
14
• Import vector layers: rgdal, raster packages
• Import raster layers: raster package
• Get geocoded data from APIs: twitteR package, see example
• Download satellite images/geographic data: raster, modis, MODISTools packages
For this slide and following ones see code
and examples in in this webpage
MODELAR
los datos
MODEL
the data
EXPLORAR
los datos
EXPLORE
the data
PREPARAR
los datos
PREPARE
the data
GET
the data
PLANTEAR la
pregunta correcta
PLANTEAR la
pregunta correcta
ASK the right
question
COMUNICAR
los resultados
COMMUNICATE
the results
15
• Data cleaning, subset, etc.
• Manipulate data with “verbs” from dplyr and other Hadley-verse packages
• Spatial subset (sp, raster packages)
• Vector operations:
• Operations on the attribute table (sp package)
• Overlay: union, intersection, clip, extract values from raster data using
points/polygons (raster, rgeos packages)
• Dissolve (sp, rgeos packages), buffer (rgeos package)
• Rasterize vector data (raster package)
• Raster operations:
• Map algebra, spatial filters, resampling, … (raster package)
• Vectorize raster data (rgdal, raster packages)
For slides 14 - 18 see code and examples
in this webpage
MODELAR
los datos
MODEL
the data
EXPLORAR
los datos
EXPLORE
the data
PREPARE
the data
OBTENER
los datos
GET
the data
PLANTEAR la
pregunta correcta
PLANTEAR la
pregunta correcta
ASK the right
question
COMUNICAR
los resultados
COMMUNICATE
the results
16
• Descriptive statistics: central tendency and spread measures
• Exploratory graphics (2D, 3D): scatter plot, box plot, histogram, …
• Spatial autocorrelation:
• Global spatial autocorrelation statistics: Moran’s I, Geary’s C, Getis and Ord’s
G(d) (spdep package)
• Local spatial autocorrelation statistics: Moran’s Ii, Getis and Ord’s Gi y Gi*(d)
(spdep package)
MODELAR
los datos
MODEL
the data
EXPLORE
the data
PREPARAR
los datos
PREPARE
the data
OBTENER
los datos
GET
the data
PLANTEAR la
pregunta correcta
PLANTEAR la
pregunta correcta
ASK the right
question
COMUNICAR
los resultados
COMMUNICATE
the results
For slides 14 - 18 see code and examples
in this webpage
17
• Regression:
• Spatial autoregressive models (spdep package)
• Geographically weighted regression (spgwr package)
• Classification (Machine Learning):
• Supervised: RandomForests, SVM, boosting, … (caret package)
• Non-supervised: k-means clustering (stats package)
• Spatial statistics:
• Geostatistics (gstat, geoR, geospt packages and others)
• Spatial point patterns (spatstat package)
MODEL
the data
EXPLORAR
los datos
EXPLORE
the data
PREPARAR
los datos
PREPARE
the data
OBTENER
los datos
GET
the data
PLANTEAR la
pregunta correcta
PLANTEAR la
pregunta correcta
ASK the right
question
COMUNICAR
los resultados
COMMUNICATE
the results
For slides 14 - 18 see code and examples
in this webpage
18
• Static or interactive maps: tmap, leaflet, mapview packages
• Interactive graphics, web apps and dashboards:
• plotly (example), rcharts, googleVis (example) packages
• shiny, see example
• flexdashboard, see example
MODELAR
los datos
MODEL
the data
EXPLORAR
los datos
EXPLORE
the data
PREPARAR
los datos
PREPARE
the data
OBTENER
los datos
GET
the data
PLANTEAR la
pregunta correcta
PLANTEAR la
pregunta correcta
ASK the right
question
COMMUNICATE
the results
For slides 14 - 18 see code and examples
in this webpage
19
Don’t forget: Reproducibility!
• R code and output for examples shown in this webinar (slides 17-21) can
be reproduced with this .Rmd document using RMarkdown
• See this example about reproducible spatial analysis using interactive
notebooks
• Learn more about reproducible geoscientific research
20
Integrating R with GIS software
• QGIS: see example in this post
• ArcGIS: arcgisbinding package, see example in this post
• GRASS GIS: version 6, spgrass6 package; version 7, rgrass7 package
• gvSIG: more info in this post
• SAGA: RSAGA package
• GME (Geospatial Modelling Environment): more info in this webpage
21
References / Online resources
• Bivand, R., Pebesma, E., Gómez-Rubio, V. 2013. Applied Spatial Data
Analysis with R. New York: Springer. 2nd ed.
• R-SIG-Geo mailing list
• CRAN Task View: Analysis of Spatial Data
• Facebook groups: GIS with R, R project en Español
• Google+ groups: Statistics and R, R Programming for Data Analysis
• My blog: amsantac.co/blog.html
If you have any question feel free to contact me:
amsantac.co/contact.html
Thanks!

More Related Content

What's hot

Indoor Mapping: Why Precision Matters
Indoor Mapping: Why Precision MattersIndoor Mapping: Why Precision Matters
Indoor Mapping: Why Precision Matters
CARTO
 
spatial databases ADBMS ppt
spatial databases ADBMS pptspatial databases ADBMS ppt
spatial databases ADBMS ppt
RitaThakkar1
 
Remote Sensing ppt
Remote Sensing pptRemote Sensing ppt
Remote Sensing ppt
Ravina Dadhich
 
Role of GIS in Health Care Management by Dr. Dipti Mukherji
Role of GIS in Health Care Management by Dr. Dipti MukherjiRole of GIS in Health Care Management by Dr. Dipti Mukherji
Role of GIS in Health Care Management by Dr. Dipti Mukherji
Priyanka_vshukla
 
Geographic information system(gis)
Geographic information system(gis)Geographic information system(gis)
Geographic information system(gis)
MapQuest.Inc
 
DATA in GIS and DATA Query
DATA in GIS and DATA QueryDATA in GIS and DATA Query
DATA in GIS and DATA Query
KU Leuven
 
Remote Sensing Imagery & Artificial Intelligence
Remote Sensing Imagery & Artificial IntelligenceRemote Sensing Imagery & Artificial Intelligence
Remote Sensing Imagery & Artificial Intelligence
Esri Ireland
 
hyperspectral remote sensing and its geological applications
hyperspectral remote sensing and its geological applicationshyperspectral remote sensing and its geological applications
hyperspectral remote sensing and its geological applications
abhijeet_banerjee
 
Digital Elevation Model, Its derivatives and applications
Digital Elevation Model, Its derivatives and applicationsDigital Elevation Model, Its derivatives and applications
Digital Elevation Model, Its derivatives and applicationsShadaab .
 
Components of Spatial Data Quality in GIS
Components of Spatial Data Quality in GISComponents of Spatial Data Quality in GIS
Components of Spatial Data Quality in GIS
Kaium Chowdhury
 
Spatial Databases
Spatial DatabasesSpatial Databases
Spatial Databases
Pratibha Chaudhary
 
A Journey to the World of GIS
A Journey to the World of GISA Journey to the World of GIS
A Journey to the World of GIS
Nishant Sinha
 
Spatial analysis and Analysis Tools
Spatial analysis and Analysis ToolsSpatial analysis and Analysis Tools
Spatial analysis and Analysis Tools
Swapnil Shrivastav
 
Spatial data analysis
Spatial data analysisSpatial data analysis
Spatial data analysis
Johan Blomme
 
Developing Efficient Web-based GIS Applications
Developing Efficient Web-based GIS ApplicationsDeveloping Efficient Web-based GIS Applications
Developing Efficient Web-based GIS Applications
Swetha A
 
TYBSC IT PGIS Unit I Chapter I- Introduction to Geographic Information Systems
TYBSC IT PGIS Unit I  Chapter I- Introduction to Geographic Information SystemsTYBSC IT PGIS Unit I  Chapter I- Introduction to Geographic Information Systems
TYBSC IT PGIS Unit I Chapter I- Introduction to Geographic Information Systems
Arti Parab Academics
 
7 Reasons Why CPG Marketers Are Turning To Location Analytics
7 Reasons Why CPG Marketers Are Turning To Location Analytics7 Reasons Why CPG Marketers Are Turning To Location Analytics
7 Reasons Why CPG Marketers Are Turning To Location Analytics
CARTO
 

What's hot (20)

Indoor Mapping: Why Precision Matters
Indoor Mapping: Why Precision MattersIndoor Mapping: Why Precision Matters
Indoor Mapping: Why Precision Matters
 
spatial databases ADBMS ppt
spatial databases ADBMS pptspatial databases ADBMS ppt
spatial databases ADBMS ppt
 
Remote Sensing ppt
Remote Sensing pptRemote Sensing ppt
Remote Sensing ppt
 
Role of GIS in Health Care Management by Dr. Dipti Mukherji
Role of GIS in Health Care Management by Dr. Dipti MukherjiRole of GIS in Health Care Management by Dr. Dipti Mukherji
Role of GIS in Health Care Management by Dr. Dipti Mukherji
 
Web Based GIS
Web Based GISWeb Based GIS
Web Based GIS
 
Geographic information system(gis)
Geographic information system(gis)Geographic information system(gis)
Geographic information system(gis)
 
DATA in GIS and DATA Query
DATA in GIS and DATA QueryDATA in GIS and DATA Query
DATA in GIS and DATA Query
 
Remote Sensing Imagery & Artificial Intelligence
Remote Sensing Imagery & Artificial IntelligenceRemote Sensing Imagery & Artificial Intelligence
Remote Sensing Imagery & Artificial Intelligence
 
hyperspectral remote sensing and its geological applications
hyperspectral remote sensing and its geological applicationshyperspectral remote sensing and its geological applications
hyperspectral remote sensing and its geological applications
 
Digital Elevation Model, Its derivatives and applications
Digital Elevation Model, Its derivatives and applicationsDigital Elevation Model, Its derivatives and applications
Digital Elevation Model, Its derivatives and applications
 
Components of Spatial Data Quality in GIS
Components of Spatial Data Quality in GISComponents of Spatial Data Quality in GIS
Components of Spatial Data Quality in GIS
 
Spatial Databases
Spatial DatabasesSpatial Databases
Spatial Databases
 
Remote sensing
Remote sensingRemote sensing
Remote sensing
 
A Journey to the World of GIS
A Journey to the World of GISA Journey to the World of GIS
A Journey to the World of GIS
 
Spatial analysis and Analysis Tools
Spatial analysis and Analysis ToolsSpatial analysis and Analysis Tools
Spatial analysis and Analysis Tools
 
Spatial data analysis
Spatial data analysisSpatial data analysis
Spatial data analysis
 
Developing Efficient Web-based GIS Applications
Developing Efficient Web-based GIS ApplicationsDeveloping Efficient Web-based GIS Applications
Developing Efficient Web-based GIS Applications
 
TYBSC IT PGIS Unit I Chapter I- Introduction to Geographic Information Systems
TYBSC IT PGIS Unit I  Chapter I- Introduction to Geographic Information SystemsTYBSC IT PGIS Unit I  Chapter I- Introduction to Geographic Information Systems
TYBSC IT PGIS Unit I Chapter I- Introduction to Geographic Information Systems
 
7 Reasons Why CPG Marketers Are Turning To Location Analytics
7 Reasons Why CPG Marketers Are Turning To Location Analytics7 Reasons Why CPG Marketers Are Turning To Location Analytics
7 Reasons Why CPG Marketers Are Turning To Location Analytics
 
LANDSAT
LANDSATLANDSAT
LANDSAT
 

Similar to Spatial Data Science with R

Essentials of R
Essentials of REssentials of R
Essentials of R
ExternalEvents
 
Big Data and Geospatial with HPCC Systems
Big Data and Geospatial with HPCC SystemsBig Data and Geospatial with HPCC Systems
Big Data and Geospatial with HPCC Systems
HPCC Systems
 
ESWC 2019 - A Software Framework and Datasets for the Analysis of Graphs Meas...
ESWC 2019 - A Software Framework and Datasets for the Analysis of Graphs Meas...ESWC 2019 - A Software Framework and Datasets for the Analysis of Graphs Meas...
ESWC 2019 - A Software Framework and Datasets for the Analysis of Graphs Meas...
Matthäus Zloch
 
Geospatial Options in Apache Spark
Geospatial Options in Apache SparkGeospatial Options in Apache Spark
Geospatial Options in Apache Spark
Databricks
 
Using R to Visualize Spatial Data: R as GIS - Guy Lansley
Using R to Visualize Spatial Data: R as GIS - Guy LansleyUsing R to Visualize Spatial Data: R as GIS - Guy Lansley
Using R to Visualize Spatial Data: R as GIS - Guy Lansley
Guy Lansley
 
R spatial presentation
R spatial presentationR spatial presentation
R spatial presentation
Todd Barr
 
What is the "Big Data" version of the Linpack Benchmark? ; What is “Big Data...
What is the "Big Data" version of the Linpack Benchmark?; What is “Big Data...What is the "Big Data" version of the Linpack Benchmark?; What is “Big Data...
What is the "Big Data" version of the Linpack Benchmark? ; What is “Big Data...
Geoffrey Fox
 
Magellen: Geospatial Analytics on Spark by Ram Sriharsha
Magellen: Geospatial Analytics on Spark by Ram SriharshaMagellen: Geospatial Analytics on Spark by Ram Sriharsha
Magellen: Geospatial Analytics on Spark by Ram Sriharsha
Spark Summit
 
Apache con big data 2015 magellan
Apache con big data 2015 magellanApache con big data 2015 magellan
Apache con big data 2015 magellan
Ram Sriharsha
 
ESTA-LD exploring spatio-temporal linked statistical data
ESTA-LD exploring spatio-temporal linked statistical dataESTA-LD exploring spatio-temporal linked statistical data
ESTA-LD exploring spatio-temporal linked statistical data
geoknow
 
Esta ld -exploring-spatio-temporal-linked-statistical-data
Esta ld -exploring-spatio-temporal-linked-statistical-dataEsta ld -exploring-spatio-temporal-linked-statistical-data
Esta ld -exploring-spatio-temporal-linked-statistical-data
geoknow
 
RasterFrames: Enabling Global-Scale Geospatial Machine Learning
RasterFrames: Enabling Global-Scale Geospatial Machine LearningRasterFrames: Enabling Global-Scale Geospatial Machine Learning
RasterFrames: Enabling Global-Scale Geospatial Machine Learning
Astraea, Inc.
 
RasterFrames - FOSS4G NA 2018
RasterFrames - FOSS4G NA 2018RasterFrames - FOSS4G NA 2018
RasterFrames - FOSS4G NA 2018
Simeon Fitch
 
Introduction to R for data science
Introduction to R for data scienceIntroduction to R for data science
Introduction to R for data science
Long Nguyen
 
Aag 2017
Aag 2017Aag 2017
Aag 2017
Michael DeMers
 
Exploring Abandoned GIS Research to Augment Applied Geography Education
Exploring Abandoned GIS Research to Augment Applied Geography EducationExploring Abandoned GIS Research to Augment Applied Geography Education
Exploring Abandoned GIS Research to Augment Applied Geography Education
Michael DeMers
 
Spatial Data, KML, and the University Web
Spatial Data, KML, and the University WebSpatial Data, KML, and the University Web
Spatial Data, KML, and the University Web
Glennon Alan
 
rworldmap: A New R package for Mapping Global Data
rworldmap: A New R package for Mapping Global Datarworldmap: A New R package for Mapping Global Data
rworldmap: A New R package for Mapping Global Data
Dr. Volkan OBAN
 
Giving MongoDB a Way to Play with the GIS Community
Giving MongoDB a Way to Play with the GIS CommunityGiving MongoDB a Way to Play with the GIS Community
Giving MongoDB a Way to Play with the GIS Community
MongoDB
 
An R primer for SQL folks
An R primer for SQL folksAn R primer for SQL folks
An R primer for SQL folks
Thomas Hütter
 

Similar to Spatial Data Science with R (20)

Essentials of R
Essentials of REssentials of R
Essentials of R
 
Big Data and Geospatial with HPCC Systems
Big Data and Geospatial with HPCC SystemsBig Data and Geospatial with HPCC Systems
Big Data and Geospatial with HPCC Systems
 
ESWC 2019 - A Software Framework and Datasets for the Analysis of Graphs Meas...
ESWC 2019 - A Software Framework and Datasets for the Analysis of Graphs Meas...ESWC 2019 - A Software Framework and Datasets for the Analysis of Graphs Meas...
ESWC 2019 - A Software Framework and Datasets for the Analysis of Graphs Meas...
 
Geospatial Options in Apache Spark
Geospatial Options in Apache SparkGeospatial Options in Apache Spark
Geospatial Options in Apache Spark
 
Using R to Visualize Spatial Data: R as GIS - Guy Lansley
Using R to Visualize Spatial Data: R as GIS - Guy LansleyUsing R to Visualize Spatial Data: R as GIS - Guy Lansley
Using R to Visualize Spatial Data: R as GIS - Guy Lansley
 
R spatial presentation
R spatial presentationR spatial presentation
R spatial presentation
 
What is the "Big Data" version of the Linpack Benchmark? ; What is “Big Data...
What is the "Big Data" version of the Linpack Benchmark?; What is “Big Data...What is the "Big Data" version of the Linpack Benchmark?; What is “Big Data...
What is the "Big Data" version of the Linpack Benchmark? ; What is “Big Data...
 
Magellen: Geospatial Analytics on Spark by Ram Sriharsha
Magellen: Geospatial Analytics on Spark by Ram SriharshaMagellen: Geospatial Analytics on Spark by Ram Sriharsha
Magellen: Geospatial Analytics on Spark by Ram Sriharsha
 
Apache con big data 2015 magellan
Apache con big data 2015 magellanApache con big data 2015 magellan
Apache con big data 2015 magellan
 
ESTA-LD exploring spatio-temporal linked statistical data
ESTA-LD exploring spatio-temporal linked statistical dataESTA-LD exploring spatio-temporal linked statistical data
ESTA-LD exploring spatio-temporal linked statistical data
 
Esta ld -exploring-spatio-temporal-linked-statistical-data
Esta ld -exploring-spatio-temporal-linked-statistical-dataEsta ld -exploring-spatio-temporal-linked-statistical-data
Esta ld -exploring-spatio-temporal-linked-statistical-data
 
RasterFrames: Enabling Global-Scale Geospatial Machine Learning
RasterFrames: Enabling Global-Scale Geospatial Machine LearningRasterFrames: Enabling Global-Scale Geospatial Machine Learning
RasterFrames: Enabling Global-Scale Geospatial Machine Learning
 
RasterFrames - FOSS4G NA 2018
RasterFrames - FOSS4G NA 2018RasterFrames - FOSS4G NA 2018
RasterFrames - FOSS4G NA 2018
 
Introduction to R for data science
Introduction to R for data scienceIntroduction to R for data science
Introduction to R for data science
 
Aag 2017
Aag 2017Aag 2017
Aag 2017
 
Exploring Abandoned GIS Research to Augment Applied Geography Education
Exploring Abandoned GIS Research to Augment Applied Geography EducationExploring Abandoned GIS Research to Augment Applied Geography Education
Exploring Abandoned GIS Research to Augment Applied Geography Education
 
Spatial Data, KML, and the University Web
Spatial Data, KML, and the University WebSpatial Data, KML, and the University Web
Spatial Data, KML, and the University Web
 
rworldmap: A New R package for Mapping Global Data
rworldmap: A New R package for Mapping Global Datarworldmap: A New R package for Mapping Global Data
rworldmap: A New R package for Mapping Global Data
 
Giving MongoDB a Way to Play with the GIS Community
Giving MongoDB a Way to Play with the GIS CommunityGiving MongoDB a Way to Play with the GIS Community
Giving MongoDB a Way to Play with the GIS Community
 
An R primer for SQL folks
An R primer for SQL folksAn R primer for SQL folks
An R primer for SQL folks
 

Recently uploaded

一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
ocavb
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
ewymefz
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
slg6lamcq
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
rwarrenll
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
yhkoc
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
TravisMalana
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
balafet
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
jerlynmaetalle
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
oz8q3jxlp
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
u86oixdj
 
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptxData_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
AnirbanRoy608946
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
John Andrews
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
ewymefz
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Subhajit Sahu
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
74nqk8xf
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
NABLAS株式会社
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
Oppotus
 

Recently uploaded (20)

一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
 
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptxData_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
 

Spatial Data Science with R

  • 1. 1 Intro to Spatial Data Science with R Alí Santacruz amsantac.co JULY 2016
  • 2. 2 About me • Expert in geomatics with a background in environmental sciences • R geek • PhD candidate in Geography • Interested in Spatial Data Science • Author of several R packages (available on CRAN)
  • 3. 3 Purpose of this talk • Discuss what Spatial Data Science is • Give an introductory explanation about how to conduct Spatial Data Science with R
  • 4. 4 What is Spatial Data Science? Spatial Data Scientist (n.): Statistician GIS/RS expertGIS developer Software engineer Spatial Data Scientist Spatial Data Science Data Science Spatial Person that is better in spatial data analysis than a GIS developer and better in software engineering than a GIS/RS expert
  • 5. 5 Spatial Data Science All they are combined for data analysis in order to … Support a better decision making "The key word in data science is not data; it is science" Jeff Leek. Data Science Specialization. Coursera.
  • 7. 7 Hacking skills • Programming languages: Python and R (and others) http://www.kdnuggets.com/2016/06/r-python-top-analytics-data-mining-data-science-software.html
  • 8. 8 Why should we use R? • Free and open-source • A large and comprehensive set of packages (> 8600) • Data access • Data cleaning • Analysis • Visualization and report generation • Excellent development environments – RStudio IDE • An active and friendly developers community • A huge users community: > 2 million
  • 9. 9 Why R for spatial analysis • 160+ packages in CRAN Task View: Analysis of Spatial Data • Classes for spatial (and spatio-temporal) data • Spatial data import/export • Exploratory spatial data analysis • Support for vector and raster operations • Spatial statistics • Data visualization through static and dynamic (web) graphics • Integration with GIS software • Easy integration with techniques from non-spatial packages
  • 10. 10 R classes for spatial data • Before 2003: • Several packages with different assumptions on how spatial data was structured • From 2003: • ‘sp’ package: extends R classes and methods for spatial data (vector and raster) • From 2010: • ‘raster’ package: deals with raster files stored in disk that are too large to be loaded on memory (RAM)
  • 11. 11 R classes for spatial data SpatialPointsDataFrame SpatialLinesDataFrame SpatialPolygonsDataFrame SpatialPixelsDataFrame SpatialGridDataFrame sp package RasterLayer RasterStack RasterBrick raster package (recommended)
  • 12. 12 The Data Science Process Modified from science2knowledge Reproducibility
  • 13. 13 MODELAR los datos MODEL the data EXPLORAR los datos EXPLORE the data PREPARAR los datos PREPARE the data • Is this A or B or C? :: classification • Is this weird? :: anomaly detection • How much/how many? :: regression • How is it organized? :: clustering • How will it change? :: prediction "The key word in data science is not data; it is science" Jeff Leek. Data Science Specialization. Coursera. OBTENER los datos GET the data Domain expertise PLANTEAR la pregunta correcta PLANTEAR la pregunta correcta ASK the right question COMUNICAR los resultados COMMUNICATE the results
  • 14. 14 • Import vector layers: rgdal, raster packages • Import raster layers: raster package • Get geocoded data from APIs: twitteR package, see example • Download satellite images/geographic data: raster, modis, MODISTools packages For this slide and following ones see code and examples in in this webpage MODELAR los datos MODEL the data EXPLORAR los datos EXPLORE the data PREPARAR los datos PREPARE the data GET the data PLANTEAR la pregunta correcta PLANTEAR la pregunta correcta ASK the right question COMUNICAR los resultados COMMUNICATE the results
  • 15. 15 • Data cleaning, subset, etc. • Manipulate data with “verbs” from dplyr and other Hadley-verse packages • Spatial subset (sp, raster packages) • Vector operations: • Operations on the attribute table (sp package) • Overlay: union, intersection, clip, extract values from raster data using points/polygons (raster, rgeos packages) • Dissolve (sp, rgeos packages), buffer (rgeos package) • Rasterize vector data (raster package) • Raster operations: • Map algebra, spatial filters, resampling, … (raster package) • Vectorize raster data (rgdal, raster packages) For slides 14 - 18 see code and examples in this webpage MODELAR los datos MODEL the data EXPLORAR los datos EXPLORE the data PREPARE the data OBTENER los datos GET the data PLANTEAR la pregunta correcta PLANTEAR la pregunta correcta ASK the right question COMUNICAR los resultados COMMUNICATE the results
  • 16. 16 • Descriptive statistics: central tendency and spread measures • Exploratory graphics (2D, 3D): scatter plot, box plot, histogram, … • Spatial autocorrelation: • Global spatial autocorrelation statistics: Moran’s I, Geary’s C, Getis and Ord’s G(d) (spdep package) • Local spatial autocorrelation statistics: Moran’s Ii, Getis and Ord’s Gi y Gi*(d) (spdep package) MODELAR los datos MODEL the data EXPLORE the data PREPARAR los datos PREPARE the data OBTENER los datos GET the data PLANTEAR la pregunta correcta PLANTEAR la pregunta correcta ASK the right question COMUNICAR los resultados COMMUNICATE the results For slides 14 - 18 see code and examples in this webpage
  • 17. 17 • Regression: • Spatial autoregressive models (spdep package) • Geographically weighted regression (spgwr package) • Classification (Machine Learning): • Supervised: RandomForests, SVM, boosting, … (caret package) • Non-supervised: k-means clustering (stats package) • Spatial statistics: • Geostatistics (gstat, geoR, geospt packages and others) • Spatial point patterns (spatstat package) MODEL the data EXPLORAR los datos EXPLORE the data PREPARAR los datos PREPARE the data OBTENER los datos GET the data PLANTEAR la pregunta correcta PLANTEAR la pregunta correcta ASK the right question COMUNICAR los resultados COMMUNICATE the results For slides 14 - 18 see code and examples in this webpage
  • 18. 18 • Static or interactive maps: tmap, leaflet, mapview packages • Interactive graphics, web apps and dashboards: • plotly (example), rcharts, googleVis (example) packages • shiny, see example • flexdashboard, see example MODELAR los datos MODEL the data EXPLORAR los datos EXPLORE the data PREPARAR los datos PREPARE the data OBTENER los datos GET the data PLANTEAR la pregunta correcta PLANTEAR la pregunta correcta ASK the right question COMMUNICATE the results For slides 14 - 18 see code and examples in this webpage
  • 19. 19 Don’t forget: Reproducibility! • R code and output for examples shown in this webinar (slides 17-21) can be reproduced with this .Rmd document using RMarkdown • See this example about reproducible spatial analysis using interactive notebooks • Learn more about reproducible geoscientific research
  • 20. 20 Integrating R with GIS software • QGIS: see example in this post • ArcGIS: arcgisbinding package, see example in this post • GRASS GIS: version 6, spgrass6 package; version 7, rgrass7 package • gvSIG: more info in this post • SAGA: RSAGA package • GME (Geospatial Modelling Environment): more info in this webpage
  • 21. 21 References / Online resources • Bivand, R., Pebesma, E., Gómez-Rubio, V. 2013. Applied Spatial Data Analysis with R. New York: Springer. 2nd ed. • R-SIG-Geo mailing list • CRAN Task View: Analysis of Spatial Data • Facebook groups: GIS with R, R project en Español • Google+ groups: Statistics and R, R Programming for Data Analysis • My blog: amsantac.co/blog.html
  • 22. If you have any question feel free to contact me: amsantac.co/contact.html Thanks!