R programming language in spatial analysis
Presentation Transcript
B Aneesha Satya
1. Introduction to R programming language.
2. Spatial analysis in ‘R’
3. R and GIS
4. Literature review
5. Case studies
Introduction to R language
Environment for statistical computing and graphics
- Free software
Built around a simple, interpreted programming language
Versions of R exist for Windows, macOS, Linux and other Unix systems
Easy to create your own functions in R
Simple GIS tasks such as topological overlay and raster algebra
can be carried out
R language includes
an effective data handling and storage facility,
a suite of operators for calculations on arrays, in particular matrices,
a large, coherent, integrated collection of intermediate tools
for data analysis,
graphical facilities for data analysis and display either on-screen
or on hardcopy, and
a well-developed, simple and effective programming
language which includes conditionals, loops, user-defined
recursive functions and input and output facilities.
R Function Libraries
Implement many common statistical procedures.
Provide excellent graphics functionality.
A convenient starting point for many data analysis projects.
maps: allows you to make maps of the world, the US, and
mapproj: allows you to do cartographic projections
Fig 1 R project for statistical computing
Overview of packages used in R
Table 1 packages for spatial analysis in R
R developers have written the R package ‘sp’ to extend R with
classes and methods for spatial data.
- Classes specify a structure and define how spatial data
are organised and stored.
- Methods are instances of functions specialised for a
particular data class.
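The distinction between classes and methods can be illustrated outside R as well. The following is a minimal Python sketch, not code from the sp package: the class names Points and Line and the generic function describe are hypothetical and chosen only for illustration. Each class fixes how the data are stored, and the generic function picks a specialised method according to the class of its argument.

```python
from dataclasses import dataclass
from functools import singledispatch
from math import hypot


@dataclass
class Points:
    """Hypothetical class: stores a list of (x, y) coordinate pairs."""
    coords: list


@dataclass
class Line:
    """Hypothetical class: stores an ordered list of (x, y) vertices."""
    coords: list


@singledispatch
def describe(obj):
    # Fallback used when no method is registered for the class of obj.
    raise TypeError(f"no describe() method for {type(obj).__name__}")


@describe.register
def _(obj: Points):
    # Method specialised for Points: report how many points are stored.
    return f"{len(obj.coords)} points"


@describe.register
def _(obj: Line):
    # Method specialised for Line: sum the lengths of consecutive segments.
    length = sum(
        hypot(x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(obj.coords, obj.coords[1:])
    )
    return f"line of length {length:.1f}"


print(describe(Points([(0, 0), (1, 1)])))
print(describe(Line([(0, 0), (3, 4)])))
```

The same call, describe(x), behaves differently for each class, which is the idea behind specialised methods such as plotting or summarising spatial objects.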
Analysis of spatial data in R using points:
Points are pairs of coordinates (x, y), representing events, observation posts,
individuals, cities or any other discrete object defined in space.
Let's take a look at the dataset crime, which is just a table of geographic
coordinates (decimal degrees) for crime locations in Baltimore, MD.
ID LONG LAT
1 -76.65159 39.23941
2 -76.47434 39.35274
3 -76.51726 39.25874
4 -76.52607 39.40707
5 -76.51001 39.33571
6 -76.70375 39.26605
Fig 1 Baltimore crime locations
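As a small illustration of treating such a table as point data, the Python sketch below (not part of the original slides) stores the six rows listed above as coordinate pairs and derives two simple summaries, the bounding box and the mean centre. The function names are assumptions made for this example.

```python
from statistics import mean

# The six rows listed in the table above: (ID, LONG, LAT) in decimal degrees.
crime = [
    (1, -76.65159, 39.23941),
    (2, -76.47434, 39.35274),
    (3, -76.51726, 39.25874),
    (4, -76.52607, 39.40707),
    (5, -76.51001, 39.33571),
    (6, -76.70375, 39.26605),
]


def bounding_box(points):
    """Return (min_x, min_y, max_x, max_y) for (id, x, y) records."""
    xs = [x for _, x, _ in points]
    ys = [y for _, _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def mean_centre(points):
    """Return the arithmetic mean of the x and y coordinates."""
    xs = [x for _, x, _ in points]
    ys = [y for _, _, y in points]
    return mean(xs), mean(ys)


bbox = bounding_box(crime)
centre = mean_centre(crime)
print("bounding box:", bbox)
print("mean centre:", centre)
```

Averaging longitude and latitude directly is only reasonable over a small area such as a single city.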
Polygons and lines:
Polygons can be thought of as sequences of connected points, where the
first point is the same as the last.
- An open polygon, where the sequence of points does not result in a
closed shape with a defined area, is called a line.
In the R environment, line and polygon data are stored in objects of classes
SpatialPolygons and SpatialLines.
Class "Polygon" [package "sp"]
Name:   labpt    area     hole     ringDir  coords
Class:  numeric  numeric  logical  integer  matrix
The data are stored as a SpatialPolygonsDataFrame, which is a subclass of
SpatialPolygons containing a data frame of attributes.
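To make the slots listed above more concrete, here is a hedged Python sketch (not sp code) that derives analogous values from a closed ring of coordinates: the area via the shoelace formula, the ring direction, and a label point taken as the centroid. The function name ring_summary and the dictionary keys are illustrative, and the direction coding (1 for clockwise, -1 for anticlockwise) is an assumption meant to mirror sp's convention.

```python
def ring_summary(coords):
    """Summarise a closed ring given as a list of (x, y) tuples."""
    if coords[0] != coords[-1]:
        raise ValueError("not a closed ring: first and last point differ")

    twice_area = 0.0
    cx = cy = 0.0
    # Shoelace formula: the signed area is positive for anticlockwise rings.
    for (x0, y0), (x1, y1) in zip(coords, coords[1:]):
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross

    signed_area = twice_area / 2.0
    if signed_area == 0:
        raise ValueError("degenerate ring with zero area")

    return {
        "labpt": (cx / (6 * signed_area), cy / (6 * signed_area)),  # centroid
        "area": abs(signed_area),
        "ring_dir": 1 if signed_area < 0 else -1,  # assumed: 1 = clockwise
        "coords": coords,
    }


# A unit square digitised clockwise; the first point is repeated at the end.
square = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
summary = ring_summary(square)
print(summary["area"], summary["ring_dir"], summary["labpt"])
```

Passing an open sequence of points raises an error, which reflects the distinction drawn above between a polygon and a line.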
Preparation of a simple map in R
Fig 4: A simple map
R and GIS
The aim of integrating R and ArcGIS is to provide
an automated way of offering R scripts as ArcGIS
In some cases the analysis is composed of several
steps that demand different capabilities; in such cases
this kind of interface is most suitable.
Examples of R packages providing an interface to
GRASS GIS can be accessed through the R package spgrass6.
R can access SAGA GIS modules through the R
package RSAGA (currently Windows, Linux, FreeBSD and probably
others); SAGA GIS is an open-source GIS with mainly raster processing
capabilities such as terrain analysis.
R can also run ArcGIS geoprocessing tools through the R
package RPyGeo (Windows only).
- RPyGeo uses Python scripts to communicate with ArcGIS.
- RPyGeo/ArcGIS operates on files (rasters and shapefiles).
Figure 5: The workflow required to expose an R script as a Python toolbox, and how the
toolbox communicates with the original R script in order to run the algorithm.
Agriculture and soil science
Mathematics and statistics
The exploitation of fossil fuels, and
Bivand (2001) sketches the key modes of spatial data
analysis (point pattern, continuous surface, areal/lattice) and
how they integrate into legacy GIS data models.
Roger (2007) gave a brief description of how to access data
and also covered how coordinate reference systems are
handled, because they are the foundation for spatial data
Bajat (2012) presents the possibilities of applying the
geographically weighted regression method to mapping the
population change index in the spatial modelling of population
Shane (2013) described some statistical and mapping
techniques developed for handling and interrogating large-scale
multi-media geochemical datasets using the R and
Python scripting languages along with GIS
CASE STUDY 1
Kate (2013) used open-source programming languages to
statistically and spatially analyse regional-scale
Making best use of open-source programming languages such
as R in analysing regional-scale geoenvironmental datasets and
developing a web mapping service and online viewer for the
The border region of Northern Ireland and interior of
Fig 5: Graphical plots produced in R after quality assurance and quality
control assessment of analytical data
- R is employed initially to output a range of graphical plots for quality
assurance and quality control assessment of analytical data with respect to
laboratory reference materials (as shown in Fig 5).
Exploratory data analyses are carried out to assess the data
Multivariate analytical techniques such as robust factor analyses
and hierarchical cluster analyses are used to investigate statistical
and spatial correlations between elements.
R and Python code have been developed to automate the
process of exploratory data analysis, spatial data analysis, data
Map production is done using the ArcPy mapping module.
Finally, a web mapping service and online viewer for the
mapped datasets, with live links to a managed database, is
CASE STUDY 2
Acta Silvae (2013) illustrated the use of the R programming language
in the analysis of spatial data.
The aim of this article is to demonstrate R’s potential for
spatial data processing and presentation.
The study area is a 20 ha forest at Snežnik (south Slovenia).
Altitude increases from 820 m to 880 m.
Silver fir and European beech are the dominant tree species. The
terrain is characterized by abundant sinkholes.
Manipulation of vector data:
Coordinates were recorded using GPS devices and exported to
a text file.
This text file was imported into the R environment using the
library ‘maptools’.
Geostatistical spatial interpolation:
A spatial interpolation (kriging) of the temperature throughout
the research area was carried out using the library gstat.
A variogram model, which is a function of the spatial
dependence of random variables, has to be selected.
The point measurements were used to create a continuous
temperature field in raster format.
Fig 7 continuous surface of point measurements
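Before a variogram model can be selected, the spatial dependence in the measurements is usually summarised as an empirical semivariogram: for each distance class, half the mean squared difference between pairs of values. The Python sketch below illustrates that calculation only; it is not gstat code and not the procedure used in the article, and the sample temperatures are hypothetical.

```python
from math import hypot


def empirical_variogram(samples, bin_width, n_bins):
    """Empirical semivariogram for samples given as (x, y, value) tuples.

    Returns one (mean_distance, semivariance, n_pairs) tuple per non-empty
    distance bin, where semivariance = sum((z_i - z_j)^2) / (2 * n_pairs).
    """
    sq_diff = [0.0] * n_bins
    dist_sum = [0.0] * n_bins
    counts = [0] * n_bins

    for i in range(len(samples)):
        xi, yi, zi = samples[i]
        for j in range(i + 1, len(samples)):
            xj, yj, zj = samples[j]
            h = hypot(xi - xj, yi - yj)
            b = int(h // bin_width)
            if b >= n_bins:
                continue  # pair is farther apart than the last bin
            sq_diff[b] += (zi - zj) ** 2
            dist_sum[b] += h
            counts[b] += 1

    return [
        (dist_sum[b] / counts[b], sq_diff[b] / (2 * counts[b]), counts[b])
        for b in range(n_bins)
        if counts[b] > 0
    ]


# Hypothetical temperature readings (x, y, degrees) along a transect.
readings = [(0.0, 0.0, 10.0), (1.0, 0.0, 11.0), (2.0, 0.0, 13.0)]
variogram = empirical_variogram(readings, bin_width=1.5, n_bins=2)
for distance, gamma, pairs in variogram:
    print(distance, gamma, pairs)
```

A model curve (for example spherical or exponential) would then be fitted to these values and used by kriging to weight neighbouring measurements.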
LiDAR data processing:
R is used as a tool for processing large amounts of data,
here the raw LiDAR data for 1 km².
The file has a size of 539,468 KB (539 MB) and contains 20,736,221
rows and 62,208,663 data values (three per row).
In R, an algorithm was written to eliminate the points that represent
forest trees from the whole point cloud, yielding a point of the
Fig. 8: The 3D point cloud (grey) of a longitudinal profile in the research area. The red points
mark the ground, as determined by the algorithm written in R.
- A digital elevation model (DEM) was produced based on these classified
Fig. 9: 3D elevation model based on LiDAR data. The surface is coloured according to
altitude.
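The slides do not describe the ground-filtering algorithm itself. As a rough illustration of the general idea only, the Python sketch below uses a simple stand-in technique: the lowest return in each grid cell is taken as ground level, points within a height tolerance of it are kept as ground, and the kept points are averaged per cell to form a coarse DEM. The function names, the cell size, the tolerance and the sample points are all assumptions for this example, not the author's method.

```python
from collections import defaultdict


def _cell(x, y, cell_size):
    """Grid cell index (column, row) containing the point (x, y)."""
    return int(x // cell_size), int(y // cell_size)


def classify_ground(points, cell_size, tolerance):
    """Return one boolean per (x, y, z) point: True if classified as ground."""
    lowest = {}
    for x, y, z in points:
        key = _cell(x, y, cell_size)
        if key not in lowest or z < lowest[key]:
            lowest[key] = z
    # A point counts as ground if it lies close to the lowest return in its cell.
    return [z - lowest[_cell(x, y, cell_size)] <= tolerance for x, y, z in points]


def ground_dem(points, cell_size, tolerance):
    """Mean elevation of the ground points in each cell: {(col, row): z}."""
    flags = classify_ground(points, cell_size, tolerance)
    cells = defaultdict(list)
    for (x, y, z), is_ground in zip(points, flags):
        if is_ground:
            cells[_cell(x, y, cell_size)].append(z)
    return {key: sum(zs) / len(zs) for key, zs in cells.items()}


# Hypothetical returns (x, y, z in metres): ground near 820 m plus canopy hits.
cloud = [
    (0.5, 0.5, 820.0),
    (0.6, 0.4, 835.0),
    (0.7, 0.7, 820.2),
    (1.5, 0.5, 821.0),
    (1.6, 0.5, 840.0),
]
flags = classify_ground(cloud, cell_size=1.0, tolerance=0.5)
dem = ground_dem(cloud, cell_size=1.0, tolerance=0.5)
print(flags)
print(dem)
```

A filter this simple would misclassify points on steep slopes or around sinkholes, so a real workflow would need a more robust approach.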
R has become a high-quality open-source software environment
for statistical computing and graphics
It offers high-performance GIS tools that can be used for
geospatial data production, analysis, and mapping.
R supports control-flow constructs, loops and user-defined
functions, and multiple input and output data formats.
It gives the opportunity to codify the existing data and
The entire process of analysing data within R is run through a
written script, which means that it is simple to rerun
these analyses if needed.
Bivand, R. and Gebhardt, A. (2000) ‘Implementing functions for spatial
statistical analysis using the R language’, Journal of Geographical
Systems, 2:307-317.
Bajat, B. (2012) ‘Spatial modelling of population concentration
using geographically weighted regression method’, Journal of
the Geographical Institute SASA, 01/2011; 61:151-167.
Howarth (1983) ‘Statistics and Data Analysis in
Geochemical Prospecting’, in Handbook of Exploration
Geochemistry, vol. 2, pages 69-73, Elsevier, Amsterdam.
McClelland, M. and Wang (2010) ‘A Python package for using R
in Python’, Journal of Statistical Software, Code Snippets.
Thibaul et al. ‘Using an R package for exploratory spatial data
analysis’, April 2012, volume 47, issue 2.
(2012) ‘StatDA: Statistical Analysis for Environmental Data’.
URL: http://CRAN.R-project.org/package=statda. R package