R programming language in spatial analysis


Published in: Education, Technology

  2. CONTENTS
     1. Introduction to the R programming language
     2. Spatial analysis in R
     3. R and GIS
     4. Literature review
     5. Case studies
     6. Summary
     7. References
  3. Introduction to R language
     • Environment for statistical computing and graphics; free software
     • Built around a simple, interpreted programming language
     • Versions of R exist for Windows, macOS, Linux and other Unix flavours
     • Easy to create your own functions in R
     • Simple GIS tasks such as topological overlay and raster algebra can be carried out
  4. R language includes
     • an effective data handling and storage facility,
     • a suite of operators for calculations on arrays, in particular matrices,
     • a large, coherent, integrated collection of intermediate tools for data analysis,
     • graphical facilities for data analysis and display, either on-screen or on hardcopy, and
     • a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities.
  5. R Function Libraries
     • Implement many common statistical procedures.
     • Provide excellent graphics functionality.
     • A convenient starting point for many data analysis projects.
     • Examples:
       - maps: allows you to make maps of the world, the US, and smaller areas
       - mapproj: allows you to do cartographic projections
  6. Fig 1: R Project for Statistical Computing (www.r-project.org)
  7. Overview of packages used in R
     Table 1: Packages for spatial analysis in R
  8. The R package ‘sp’
     • R developers have written the R package ‘sp’ to extend R with classes and methods for spatial data.
       - Classes specify a structure and define how spatial data are organised and stored.
       - Methods are instances of functions specialised for a particular data class.
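The class/method distinction can be illustrated with R's S4 system from the base `methods` package. This is only a minimal sketch: `ToyPoints` and `toyBBox` are hypothetical names invented for this example and are not part of ‘sp’, whose real classes are considerably richer.

```r
# Hypothetical toy class for illustration; NOT the real 'sp' implementation.
library(methods)

# The class fixes the structure: a two-column coordinate matrix plus a label.
setClass(
  "ToyPoints",
  representation(coords = "matrix", label = "character"),
  validity = function(object) {
    if (ncol(object@coords) != 2) "coords must have exactly two columns" else TRUE
  }
)

# A generic function, and a method specialised for the ToyPoints class.
setGeneric("toyBBox", function(obj) standardGeneric("toyBBox"))
setMethod("toyBBox", "ToyPoints", function(obj) {
  rng <- apply(obj@coords, 2, range)          # 2 x 2 matrix: rows are min and max
  dimnames(rng) <- list(c("min", "max"), c("x", "y"))
  rng
})

pts <- new("ToyPoints",
           coords = cbind(c(0, 2, 5), c(1, 4, 3)),
           label  = "demo")
bb <- toyBBox(pts)
print(bb)
```

Calling `toyBBox()` on an object of any other class fails because no method is defined for it, which is the sense in which a method is specialised for a data class.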
  9. Analysis of spatial data in R using points:
     • Points are pairs of coordinates (x, y), representing events, observation posts, individuals, cities or any other discrete object defined in space.
     • Let's take a look at the dataset crime, which is just a table of geographic coordinates (decimal degrees) for crime locations in Baltimore, MD.
     • head(crime)
         ID      LONG       LAT
          1 -76.65159  39.23941
          2 -76.47434  39.35274
          3 -76.51726  39.25874
          4 -76.52607  39.40707
          5 -76.51001  39.33571
          6 -76.70375  39.26605
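As a rough sketch of how such a point table can be handled, the six rows printed on the slide are re-entered by hand below (the full crime dataset is not available here, so everything is limited to these rows). Only base R is used; the variable name `bbox` is an assumption of this example.

```r
# The six rows printed by head(crime) on the slide, typed in by hand.
crime <- data.frame(
  ID   = 1:6,
  LONG = c(-76.65159, -76.47434, -76.51726, -76.52607, -76.51001, -76.70375),
  LAT  = c(39.23941, 39.35274, 39.25874, 39.40707, 39.33571, 39.26605)
)

# Bounding box of the points: the range of each coordinate column.
bbox <- sapply(crime[, c("LONG", "LAT")], range)
rownames(bbox) <- c("min", "max")
print(bbox)

# Quick scatter plot of the locations (only drawn in an interactive session).
if (interactive()) {
  plot(crime$LONG, crime$LAT, pch = 19,
       xlab = "Longitude (decimal degrees)",
       ylab = "Latitude (decimal degrees)",
       main = "Crime locations (first six rows)")
}
```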
  10. Points
      Fig 1: Baltimore crime locations
  11. Polygons and lines:
      • Polygons can be thought of as sequences of connected points, where the first point is the same as the last.
        - An open polygon, where the sequence of points does not result in a closed shape with a defined area, is called a line.
      • In the R environment, line and polygon data are stored in objects of classes SpatialPolygons and SpatialLines.
      • Class "Polygon" [package "sp"]
          Name:    labpt    area     hole  ringDir  coords
          Class: numeric numeric  logical  integer  matrix
      • The data are stored as a SpatialPolygonsDataFrame, which is a subclass of SpatialPolygons containing a data frame of attributes.
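The closed-ring rule and the notion of a defined area can be sketched in base R. The helper names `is_closed` and `ring_area` are hypothetical and written for this example; they only mirror the idea behind the `area` slot listed above and are not how ‘sp’ computes it internally.

```r
# A ring is closed when its first and last coordinate pairs coincide.
is_closed <- function(coords) {
  n <- nrow(coords)
  n >= 4 && all(coords[1, ] == coords[n, ])
}

# Shoelace formula: the signed area is positive for counter-clockwise rings.
ring_area <- function(coords) {
  stopifnot(is_closed(coords))
  x <- coords[, 1]
  y <- coords[, 2]
  i <- seq_len(nrow(coords) - 1)
  sum(x[i] * y[i + 1] - x[i + 1] * y[i]) / 2
}

square <- cbind(x = c(0, 2, 2, 0, 0), y = c(0, 0, 2, 2, 0))  # closed ring
path   <- square[1:4, ]                                      # open: a line

print(is_closed(square))       # TRUE
print(is_closed(path))         # FALSE
print(abs(ring_area(square)))  # 4
```

Dropping the final vertex turns the same coordinates into an open sequence, which is why `path` has no area and would be treated as a line.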
  12. Preparation of a simple map in R
      library(maps)     # provides the map() plotting function
      library(mapdata)  # provides the high-resolution "worldHires" database
      map("worldHires", "Canada", xlim = c(-141, -53), ylim = c(40, 85),
          col = "gray90", fill = TRUE)
      Fig 4: A simple map
      Source: http://www.r-bloggers.com/maps-with-r-and-polygon-boundaries/
  13. R and GIS
      The aim of integrating R and ArcGIS is to provide an automated way of offering R scripts as ArcGIS geoprocessing tools. Some analyses are composed of several steps that demand different capabilities; in such cases this kind of interface is most suitable.
  14. Examples of R packages providing an interface to GIS:
      • GRASS GIS can be connected through the R package spgrass6.
      • R can access SAGA GIS modules through the R package RSAGA (currently Windows, Linux, FreeBSD and probably others). SAGA GIS is an open-source GIS with mainly raster processing capabilities such as terrain analysis.
      • R can also run ArcGIS geoprocessing tools through the R package RPyGeo (Windows only).
        - RPyGeo uses Python scripts to communicate with ArcGIS.
        - RPyGeo/ArcGIS operates on files (rasters and shapefiles).
  15. Figure 5: The workflow required to expose an R script as a Python toolbox, and how the toolbox communicates with the original R script in order to run the algorithm.
  16. Applications:
      • Geosciences
      • Water resources
      • Environmental science
      • Agriculture and soil science
      • Mathematics and statistics
      • Ecology
      • Geodesy
      • The exploitation of fossil fuels
      • Meteorology
  17. LITERATURE REVIEW
      • Bivand (2001) sketches the key modes of spatial data analysis (point pattern, continuous surface, areal/lattice) and how they integrate into legacy GIS data models.
      • Roger (2007) gave a brief description of how to access data and also covered how coordinate reference systems are handled, because they are the foundation for spatial data integration.
  18. LITERATURE REVIEW (continued)
      • Bajat (2012) presents possibilities of applying the geographically weighted regression method in mapping the population change index in the spatial modelling of population concentration.
      • Shane (2013) described some statistical and mapping techniques developed for handling and interrogating large-scale multi-media geochemical datasets using the R and Python scripting languages along with GIS.
  19. CASE STUDY 1
      Kate (2013) used open-source programming languages to statistically and spatially analyse regional-scale geoenvironmental datasets.
      Objective
      • Making best use of open-source programming languages such as R in analysing regional-scale geoenvironmental datasets, and developing a web mapping service and online viewer for the datasets.
      Study area
      • The border region of Northern Ireland and the interior of Northern Ireland.
  20. Methodology
      • R statistical analyses:
        - R is employed initially to output a range of graphical plots for quality assurance and quality control assessment of analytical data with respect to laboratory reference materials (as shown in Fig 5).
      Fig 5: Graphical plots produced in R after quality assurance and quality control assessment of analytical data
  21. Methodology (continued)
      • Exploratory data analyses are carried out to assess the data distribution. Multivariate analytical techniques such as robust factor analysis and hierarchical cluster analysis are used to investigate statistical and spatial correlations between elements.
      Mapping
      • R and Python code has been developed to automate exploratory data analysis, spatial data analysis and data interpolation.
      • Maps are produced using the ArcPy mapping module.
      Online viewer
      • Finally, a web mapping service and online viewer for the mapped datasets, with live links to a managed database, is developed.
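As a hedged illustration of the hierarchical cluster analysis step, the sketch below groups elements by how strongly their concentrations correlate. The data are synthetic and the element names (Cu, Zn, Pb, As) are placeholders chosen for this example; nothing here comes from the case-study dataset, and the study's actual workflow is not described in enough detail to reproduce.

```r
# Synthetic data: two hidden factors, each driving two placeholder elements.
set.seed(1)
n <- 200
factor1 <- rnorm(n)
factor2 <- rnorm(n)
geochem <- data.frame(
  Cu = factor1 + rnorm(n, sd = 0.2),
  Zn = factor1 + rnorm(n, sd = 0.2),
  Pb = factor2 + rnorm(n, sd = 0.2),
  As = factor2 + rnorm(n, sd = 0.2)
)

# Turn correlations between elements into distances: strongly correlated
# elements end up close together.
d  <- as.dist(1 - abs(cor(geochem)))
hc <- hclust(d, method = "average")

# Cut the dendrogram into two groups of elements.
groups <- cutree(hc, k = 2)
print(groups)

if (interactive()) plot(hc, main = "Element clustering (synthetic data)")
```

Using `1 - abs(cor)` as the distance is one common choice among several; with real geochemical data a log or log-ratio transformation would usually be considered first.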
  22. Fig 6: Tellus Border online viewer
      Source: http://spatial.dcenr.gov.ie/GeologicalSurvey/TellusBorder/index.html
  23. CASE STUDY 2
      Acta Silvae (2013) illustrated the use of the R programming language in the analysis of spatial data.
      Objective
      • The aim of this article is to demonstrate R's potential for spatial data processing and presentation.
      Study area
      • A 20 ha forest at Snežnik (south Slovenia).
      • Altitude increases from 820 m to 880 m.
      • Silver fir and European beech are the dominant tree species.
      • The terrain is characterized by abundant sinkholes.
  24. Methodology
      Manipulation of vector data:
      • Coordinates were recorded using GPS devices and exported to a text file.
      • This text file was imported into the R environment using the library ‘maptools’.
      Spatial interpolation:
      • A spatial interpolation (kriging) of the temperature throughout the research area was carried out using the library ‘gstat’.
      • A variogram model, which is a function describing the spatial dependence of random variables, has to be selected.
      • The point measurements were used to create a continuous temperature field in raster format.
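Before a variogram model can be selected, the spatial dependence is usually summarised as an empirical semivariogram, gamma(h) = sum((z_i - z_j)^2) / (2 N(h)) over the N(h) point pairs separated by roughly h. The base-R sketch below shows that computation on synthetic temperatures along a made-up transect; `empirical_variogram` is a hypothetical helper written for this example. In gstat the corresponding steps are handled by `variogram()`, `fit.variogram()` and `krige()`, which this sketch does not attempt to replicate.

```r
# Empirical semivariogram, binned by separation distance.
empirical_variogram <- function(x, y, z, breaks) {
  d   <- as.matrix(dist(cbind(x, y)))            # pairwise distances
  sq  <- outer(z, z, function(a, b) (a - b)^2)   # pairwise squared differences
  up  <- upper.tri(d)                            # count each pair only once
  bin <- cut(d[up], breaks = breaks, include.lowest = TRUE)
  data.frame(
    dist  = as.vector(tapply(d[up], bin, mean)),      # mean distance per bin
    gamma = as.vector(tapply(sq[up], bin, mean)) / 2, # semivariance per bin
    pairs = as.vector(table(bin))                     # number of pairs per bin
  )
}

# Synthetic example: ten stations on a straight line, with temperature
# rising steadily along the line (values invented for illustration).
x <- 1:10
y <- rep(0, 10)
temp <- 5 + 0.5 * x

vg <- empirical_variogram(x, y, temp, breaks = c(0, 2, 4, 6, 9))
print(vg)
```

The semivariance grows with distance here because nearby stations have similar temperatures; a variogram model fitted to such points is what kriging then uses to weight neighbouring measurements.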
  25. Fig 7: Continuous surface created from point measurements
  26. LiDAR data processing:
      • R is used as a tool for processing large amounts of data, here the raw LiDAR data for 1 km².
      • The data set has a size of 539,468 KB (539 MB) and contains 20,736,221 rows and 62,208,663 data points.
      • In R, an algorithm is written to eliminate the points that represent forest trees from the whole point cloud, yielding the points of the terrain.
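The slides do not describe the filtering algorithm itself, so the following is not the authors' method. It is a sketch of one simple and common heuristic, a grid-minimum filter: within each grid cell, only points close to the lowest return are kept as terrain. The function name `ground_filter`, the parameters `cell` and `tol`, and the six-point cloud are all assumptions made for this example.

```r
# Grid-minimum ground filter: keep points within `tol` metres of the
# lowest point in their grid cell (cell size `cell`, in metres).
ground_filter <- function(pts, cell = 1, tol = 0.2) {
  cell_id <- paste(floor(pts$x / cell), floor(pts$y / cell))
  zmin <- ave(pts$z, cell_id, FUN = min)   # lowest z per cell, aligned to rows
  pts[pts$z - zmin <= tol, , drop = FALSE]
}

# Tiny synthetic point cloud: two cells, each with two ground returns
# and one return from the tree canopy (values invented for illustration).
cloud <- data.frame(
  x = c(0.2, 0.4, 0.6, 1.3, 1.5, 1.8),
  y = c(0.1, 0.5, 0.9, 0.2, 0.6, 0.7),
  z = c(820.0, 835.5, 820.1, 821.0, 842.3, 821.1)
)

ground <- ground_filter(cloud, cell = 1, tol = 0.2)
print(ground)
```

A filter this simple misclassifies points on steep slopes or under dense canopy, so production workflows generally use more elaborate, iterative approaches; the sketch only shows the general shape of the task on a table of x, y, z rows.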
  27. Fig. 8: The 3D point cloud (gray) of a longitudinal profile in the research area. The red points mark the ground, as determined by the algorithm written in R.
  28. A digital elevation model (DEM) was produced based on these classified points.
      Fig. 9: 3D elevation model based on LiDAR data. The surface is coloured according to altitude.
  29. SUMMARY
      • R has become a high-quality open-source software environment for statistical computing and graphics.
      • It provides high-performance GIS tools that can be used for geospatial data production, analysis and mapping.
      • R allows the use of control flow, loops and user-defined functions, and supports multiple input and output data formats.
      • It gives the opportunity to codify the existing data and functions.
      • The entire process of analysing data within R is run through a written script, which means that it is simple to rerun these analyses if needed.
  30. References
      • Bivand, R. and Gebhardt, A. (2000) ‘Implementing functions for spatial statistical analysis using the R language’, Journal of Geographical Systems, 2:307-317.
      • Bajat, B. (2012) ‘Spatial modelling of population concentration using geographically weighted regression method’, Journal of the Geographical Institute SASA, 01/2011; 61:151-167.
      • Howarth (1983) Vol. 2: ‘Statistics and Data Analysis in Geochemical Prospecting’, in Handbook of Exploration Geochemistry, pages 69-73, Elsevier, Amsterdam.
  31. References (continued)
      • M. McClelland and Wang (2010) ‘A Python package for using R in Python’, Journal of Statistical Software, Code Snippets.
      • Thibaul et al. (2012) ‘Using an R package for exploratory spatial data analysis’, April 2012, Volume 47, Issue 2.
      • (2012) ‘Statda: Statistical Analysis for Environmental Data’. R package version 1.6.2. URL: http://CRAN.R-project.org/package=statda
  32. Any queries?
  33. THANK YOU