A Journey to the World of GIS



GIS Training Module, Location Based Services, Introduction to GIS

Published in: Education, Technology


  1. 1. A journey to Geographical Information System Dr. Nishant Sinha
  2. 2. Journey Expectations ▪ GIS – Basics of GIS – Components of GIS – GIS Data Models (Raster and Vector) – GIS Data Types and Metadata – Various GIS Data Formats and GIS Data Products – Process of GIS Data Generation/Creation to Analysis – Data Conversions – Web GIS – WMS, WFS
  3. 3. Spatial is Special ▪ “Everything is related to everything else, but near things are more related than distant things” Tobler, W. 1970. A computer movie simulating urban growth in the Detroit region. Economic Geography 46, 234–40 ▪ Sometimes called the First Law of Geography (because it is generally true!).
  4. 4. How do we describe geographical features? ▪ by recognizing two types of data: – Spatial data which describes location (where) – Attribute data which specifies characteristics at that location (what, how much, and when) How do we represent these digitally in a GIS? ▪ by grouping into layers based on similar characteristics (e.g hydrography, elevation, water lines, sewer lines, grocery sales) and using either: – vector data model – raster data model ▪ by selecting appropriate data properties for each layer with respect to: – projection, scale, accuracy, and resolution How do we incorporate into a computer application system? ▪ by using a relational Data Base Management System (RDBMS) Representing Geographic Features
  5. 5. ▪ Continuous ▪ Elevation ▪ Rainfall ▪ Ocean salinity ▪ Discrete – Polygon areas: ▪ unbounded: landuse, market areas, soils, rock type ▪ bounded: city/county/state boundaries, ownership parcels, zoning – Line networks ▪ roads, transmission lines, streams – Points Location : ▪ fixed: wells, street lamps, addresses Spatial Data Types
  6. 6. Attribute data types Categorical (name) – nominal ▪ no inherent ordering ▪ land use types, county names – ordinal ▪ inherent order ▪ road class; stream class ▪ categories are often coded as numbers (e.g. SSN), but arithmetic on the codes is meaningless Numerical (known difference between values) – interval ▪ no natural zero ▪ can’t say ‘twice as much’ ▪ temperature (Celsius or Fahrenheit) – ratio ▪ natural zero ▪ ratios make sense (e.g. twice as much) ▪ income, age, rainfall ▪ may be expressed as integer [whole number] or floating point [decimal fraction] Attribute data tables can contain locational information, such as addresses or a list of X,Y coordinates. ArcView refers to these as event tables. However, these must be converted to true spatial data (a shapefile), for example by geocoding, before they can be displayed as a map.
  7. 7. Data Base Management Systems (DBMS) Contain tables or feature classes in which: – rows: entities, records, observations, features: ▪ ‘all’ information about one occurrence of a feature – columns: attributes, fields, data elements, variables, items (ArcInfo) ▪ one type of information for all features The key field is an attribute whose values uniquely identify each row. Parcel Table (each row is an entity, each column an attribute, Parcel # is the key field):
     Parcel # | Address      | Block | $ Value
     8        | 501 N Hi     | 1     | 105,450
     9        | 590 N Hi     | 2     | 89,780
     36       | 1001 W. Main | 4     | 101,500
     75       | 1175 W. 1st  | 12    | 98,000
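
The key-field idea can be sketched in a few lines of Python, using the parcel rows from the slide. This is only an illustration: the dictionary layout and the names `ROWS`, `build_index`, and `parcels` are assumptions made here, not part of any DBMS.

```python
# Parcel table rows as given on the slide; "parcel_no" plays the role of key field.
ROWS = [
    {"parcel_no": 8, "address": "501 N Hi", "block": 1, "value": 105450},
    {"parcel_no": 9, "address": "590 N Hi", "block": 2, "value": 89780},
    {"parcel_no": 36, "address": "1001 W. Main", "block": 4, "value": 101500},
    {"parcel_no": 75, "address": "1175 W. 1st", "block": 12, "value": 98000},
]


def build_index(rows, key_field):
    """Index rows by key field; a repeated value means it is not a valid key."""
    index = {}
    for row in rows:
        key = row[key_field]
        if key in index:
            raise ValueError(f"duplicate key {key!r}: values must be unique")
        index[key] = row
    return index


parcels = build_index(ROWS, "parcel_no")
print(parcels[36]["address"])
```

Indexing on `block` would also succeed for these four rows, but only because the sample is tiny; a key field has to stay unique as the table grows.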
  8. 8. Geographic Information System A system that doesn't hold maps or pictures but holds a database
  9. 9. GIS Defined ….. ▪ A computer-based system for the manipulation and analysis of geospatial information in which there is an automated link between a data object and its spatial location. http://www.spatialanalysisonline.com/ (Free on-line textbook)
  10. 10. Roger F. Tomlinson, CM (born 17 November 1933), is an English geographer and the primary originator of modern computerized geographic information systems (GIS); he has been acknowledged as the "father of GIS".
  11. 11. What is GIS? One word at a time…
  12. 12. G Information S ▪ Data is a fact or collection of facts ▪ Data that has been processed, organized, structured or presented in a given context to make it useful is called Information
  13. 13. G Information System A set of components for: Storing Displaying Analyzing DATA Information System Data Storage Query Information One example of an Information System: Microsoft Access database
  14. 14. What is the S in GIS? ▪ 1980s: Geographic Information Systems – technology for the acquisition and management of spatial information – software for professional users, e.g. cartographers – Example: MapInfo ▪ 1990s: Geographic Information Science – comprehending the underlying conceptual issues of representing data and processes in space-time – the science (or theory and concepts) behind the technology – Example: design spatial data types and operations for querying ▪ 1990s: Geographic Information Studies – understanding the social, legal and ethical issues associated with the application of GISy and GISc ▪ 2000s: Geographic Information Services – Websites and service centers for casual users, e.g. travelers – Services (e.g. GPS, MapQuest) for route planning
  15. 15. What is GIS?
  16. 16. Geographic Information System A means of: Storing Mapping Analyzing Spatial Data Information System Geographic Position
  17. 17. Geographic Information System Leaving us with a simple way to start learning about GIS: A tool for deriving information from any data with a geographic / spatial component
  18. 18. What is GIS? Basics of Storing, Mapping, and Analyzing Spatial Data…
  19. 19. What is GIS? and answers the following…
  20. 20. Location - What is at………….? The first of these questions seeks to find out what exists at a particular location. A location can be described in many ways, using, for example place name, post code, or geographic reference such as longitude/latitude or x/y.
  21. 21. Condition - Where is it………….? The second question is the converse of the first and requires spatial data to answer. Instead of identifying what exists at a given location, one may wish to find the location(s) where certain conditions are satisfied (e.g. an unforested section at least 2000 square meters in size, within 100 meters of a road, and with soils suitable for supporting buildings).
  22. 22. Trends - What has changed since…………..? The third question might involve both the first two and seeks to find the differences (e.g. in land use or elevation) over time.
  23. 23. Patterns - What spatial patterns exist….? This question is more sophisticated. One might ask it to determine whether landslides are mostly occurring near streams. It might be just as important to know how many anomalies there are that do not fit the pattern and where they are located.
  24. 24. Modelling - What if………..? "What if…" questions are posed to determine what happens if, for example, a new road is added to a network or a toxic substance seeps into the local ground water supply. Answering this type of question requires both geographic and other information (as well as specific models). GIS permits such spatial operations.
  25. 25. Aspatial Questions "What's the average number of people working with GIS in each location?" is an aspatial question: answering it does not require the stored values of latitude and longitude, nor does it describe where the places are in relation to each other.
  26. 26. Spatial Questions "How many people work with GIS in the major centres of Delhi?" OR "Which centres lie within 10 km of each other?" OR "What is the shortest route passing through all these centres?" These are spatial questions that can only be answered using latitude and longitude data and other information such as the radius of the earth. Geographic Information Systems can answer such questions.
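
A rough sketch of how the "within 10 km" question can be answered from latitude, longitude and the radius of the earth, using the haversine great-circle formula. The centre names and coordinates below are hypothetical placeholders, and a spherical earth with a mean radius of 6371 km is assumed.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # mean radius; assumes a spherical earth


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical centres: name -> (latitude, longitude). Not real survey data.
CENTRES = {
    "Centre A": (28.60, 77.20),
    "Centre B": (28.65, 77.25),
    "Centre C": (28.90, 77.60),
}


def pairs_within(centres, max_km):
    """Return every pair of centre names lying within max_km of each other."""
    return [
        (name_a, name_b)
        for (name_a, pos_a), (name_b, pos_b) in combinations(centres.items(), 2)
        if haversine_km(*pos_a, *pos_b) <= max_km
    ]


print(pairs_within(CENTRES, 10))
```

The aspatial question (an average head count per location) needs none of this; only the attribute column is read.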
  27. 27. Storing Geographic Data One GIS data layer combines both Geographic Features and their Attributes Geographic Features indicate “where”
  28. 28. Storing “Everyday” Geographical Objects ▪ Points ▪ The fundamental primitive is the point, a 0-dimensional (0-D) object that has a position in space but no length. – home, day-care, health clinics, schools, retail and tobacco outlets, crimes & graffiti, bus stops, neighborhood anchor institutions, community assets, resources and risks ▪ Lines ▪ A line is a 1-D geographic object having a length and is composed of two or more 0-D point objects. – roads, railway, pathways, walking or bus routes, rivers ▪ Areas (Polygons) ▪ A polygon is a geographic object bounded by at least three 1-D line objects or segments with the requirement that they must start and end at the same location (i.e., node) – census unit, ZIP code, school district, police precinct, health service areas, counties, states, provinces, watersheds
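
The three primitives map naturally onto simple coordinate structures. The sketch below is one possible representation, assuming planar X,Y coordinates; the function names and sample coordinates are illustrative only.

```python
import math


def line_length(vertices):
    """Length of a 1-D line built from two or more 0-D points."""
    if len(vertices) < 2:
        raise ValueError("a line needs at least two points")
    return sum(math.dist(p, q) for p, q in zip(vertices, vertices[1:]))


def polygon_area(ring):
    """Area of a polygon whose boundary starts and ends at the same node (shoelace formula)."""
    if len(ring) < 4 or ring[0] != ring[-1]:
        raise ValueError("ring must be closed and bounded by at least three segments")
    twice_area = sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice_area) / 2


point = (2.0, 3.0)                               # position only, no length
line = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]     # two segments
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]

print(line_length(line), polygon_area(polygon))
```

Note how the closure requirement from the slide (start and end at the same node) becomes an explicit check in `polygon_area`.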
  29. 29. Mapping Geographic Data – India States India Airports (point layer) India States (polygon layer)
  30. 30. Analyzing Geographic Data • Query GIS data layers based on attributes or geography, or both  Which states’ population was more than 75 million in 2011?
  31. 31. Analyzing Geographic Data • Query GIS data layers based on attributes or geography, or both  Which are the neighboring states of Madhya Pradesh?
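
Both kinds of query can be sketched against a toy layer. Everything in the layer below is hypothetical (state names, populations, and the stored neighbour sets); a real GIS would derive adjacency from the polygon geometry rather than store it as an attribute.

```python
# Hypothetical polygon layer: names, populations and neighbours are made up.
STATES = [
    {"name": "State A", "pop_2011": 90_000_000, "neighbours": {"State B", "State C"}},
    {"name": "State B", "pop_2011": 60_000_000, "neighbours": {"State A"}},
    {"name": "State C", "pop_2011": 80_000_000, "neighbours": {"State A"}},
]


def attribute_query(layer, field, threshold):
    """Attribute query: names of features whose field value exceeds threshold."""
    return [feature["name"] for feature in layer if feature[field] > threshold]


def neighbours_of(layer, name):
    """Geographic query (simplified): names of features adjacent to the named one."""
    for feature in layer:
        if feature["name"] == name:
            return sorted(feature["neighbours"])
    raise KeyError(name)


print(attribute_query(STATES, "pop_2011", 75_000_000))
print(neighbours_of(STATES, "State A"))
```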
  32. 32. What is GIS? In more detail…
  33. 33. Representing Spatial Elements
  34. 34. Scale of GIS data Global to Local
  35. 35. What is Scale? ▪ Ratio of distance on a map to the equivalent distance on the earth's surface. – Large scale: large detail, small area covered (1”=200’ or 1:2,400) – Small scale: small detail, large area covered (1:250,000) – A given object (e.g. land parcel) appears larger on a large scale map – Scale can never be constant everywhere on a map because of map projection – Scale representation ▪ Verbal (good for interpretation): one inch equals one statute mile ▪ Representative fraction (RF) (good for measurement): 1:63,360 (smaller fraction = smaller scale: 1:2,000,000 is smaller than 1:2,000) ▪ Scale bar (good if enlarged/reduced): Miles 0 1 2
  36. 36. Scale Examples Common Scales: 1:200 (1”=16.7ft) 1:2,000 (1”=56 yards; 1cm=20m) 1:20,000 (5cm=1km) 1:24,000 (1”=2,000ft) 1:25,000 (1cm=.25km) 1:50,000 (2cm=1km) 1:62,500 (1.6cm=1km; 1”=.986mi) 1:63,360 (1”=1mile; 1cm=.634km) 1:100,000 (1”=1.58mi; 1cm=1km) 1:500,000 (1”=7.9mi; 1cm=5km) 1:1,000,000 (1”=15.8mi; 1cm=10km) 1:7,500,000 (1”=118mi; 1cm=75km) Large versus Small: large: above 1:12,500 medium: 1:13,000 - 1:126,720 small: 1:130,000 - 1:1,000,000 very small: below 1:1,000,000 (really, relative to what’s available for a given area; Maling 1989) Map sheet examples: 1:24,000: 7.5 minute USGS Quads (17 by 22 inches; 6 by 8 miles) 1:7,500,000: US wall map (26 by 16 inches) 1:20,000,000: US 8.5” X 11”
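
Every equivalence in a list like this follows from the representative fraction: one map unit corresponds to `denominator` ground units of the same kind. A small sketch of that arithmetic follows; the function names are illustrative assumptions.

```python
INCHES_PER_FOOT = 12
INCHES_PER_MILE = 63_360
CM_PER_KM = 100_000


def ground_feet_per_map_inch(denominator):
    """Ground distance in feet represented by one map inch at scale 1:denominator."""
    return denominator / INCHES_PER_FOOT


def ground_miles_per_map_inch(denominator):
    """Ground distance in statute miles represented by one map inch."""
    return denominator / INCHES_PER_MILE


def ground_km_per_map_cm(denominator):
    """Ground distance in kilometres represented by one map centimetre."""
    return denominator / CM_PER_KM


for denominator in (24_000, 50_000, 63_360, 100_000):
    print(
        f"1:{denominator:,}  1 in = {ground_miles_per_map_inch(denominator):.3f} mi, "
        f"1 cm = {ground_km_per_map_cm(denominator):.3f} km"
    )
```

Because the ratio is unit-free, the same denominator serves both the imperial and the metric conversions; only the unit constants change.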
  37. 37. Precision or Resolution - it’s not the same as scale or accuracy! Precision: the exactness of measurement or description ▪ the “size” of the “smallest” feature which can be displayed, recognized, or described ▪ Can apply to space, time (e.g. daily versus annual), or attribute (Douglas fir vs. conifer) ▪ For raster data, it is the size of the pixel (resolution) – e.g. for NTGISC digital orthos it is 1.6ft (half meter) ▪ raster data can be resampled by combining adjacent cells; this decreases resolution but saves storage – e.g. 1.6 ft to 3.2 ft (1/4 storage); to 6.4 ft (1/16 storage) ▪ Resolution and scale – generally, increasing to larger scale allows features to be observed better and requires higher resolution – but, because of the human eye’s ability to recognize patterns, features in a lower resolution data set can sometimes be observed better by decreasing the scale (6.4 ft resolution shown at 1:400 rather than 1:200) ▪ Resolution and positional accuracy – you can see a feature (resolution), but it may not be in the right place (accuracy) – Higher accuracy generally costs much more to obtain than higher resolution – Accuracy cannot be greater (but may be much less) than resolution ▪ e.g. if pixel size is one meter, then the best accuracy possible is one meter
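
The storage saving from resampling can be seen in a minimal sketch that combines adjacent cells by averaging. Averaging is only one possible rule (an assumption here); categorical rasters would more likely use a majority rule. The grid values are made up.

```python
def aggregate(grid, factor):
    """Combine factor x factor blocks of cells into one cell holding their mean."""
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of the factor")
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [grid[r + i][c + j] for i in range(factor) for j in range(factor)]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


def cell_count(grid):
    """Number of stored cells, a proxy for storage."""
    return sum(len(row) for row in grid)


# Made-up 4 x 4 raster; doubling the cell size leaves 2 x 2 cells.
fine = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 4, 4],
]
coarse = aggregate(fine, 2)
print(coarse, cell_count(coarse) / cell_count(fine))
```

Doubling the cell size once keeps a quarter of the cells; doing it twice keeps a sixteenth, which matches the 1.6 ft to 3.2 ft to 6.4 ft progression.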
  38. 38. Accuracy: Rests on at least four legs, not one! Positional Accuracy (sometimes called Quantitative accuracy) – Spatial ▪ horizontal accuracy: distance from true location ▪ vertical accuracy: difference from true height – Temporal ▪ Difference from actual time and/or date Attribute Accuracy or Consistency: the validity concept in experimental design/statistical inference – a feature is what the GIS/map purports it to be – a railroad is a railroad, and not a road Completeness: the reliability concept from experimental design/statistical inference – Are all instances of a feature the GIS/map claims to include, in fact, there? – Partially a function of the criteria for including features: when does a road become a track? – Simply put, how much data is missing? Logical Consistency: the absence of contradictory relationships in the database; examples of inconsistency: – Non-Spatial ▪ Data for one country is for 2000, for another it’s for 2001 ▪ Data uses different source or estimation technique for different years (again, lineage) – Spatial ▪ Overshoots and gaps in road networks or parcel polygons
  39. 39. Data Types: Vector Data ▪ Consists of discrete coordinates that store the geographic position of – Points ▪ People or Cities (center) – Lines ▪ Roads or Other Linkages – Polygons ▪ Census Tract ▪ Vector Data Model – Geographic features stored as X,Y coordinate pairs – Each vector layer has an attribute table – Each feature corresponds to a row in the table
  40. 40. Data Types: Raster Data ▪ Raster data represents a continuous surface divided into a regular grid of cells ▪ Often used as background map layer ▪ Points: People or Cities (center) – Lines ▪ Roads or Other Linkages – Polygons ▪ Census Tract ▪ Raster Data Model – Stores images as rows and columns of numbers, forming a regular grid structure – Great for computational analysis or modeling – Bad for mapping precise locations ▪ Raster Attribute Tables
  41. 41. Vector vs Raster Vector • Low data volume • Faster display • Can also store attributes • Less pleasing to the eye • Does not dictate how features should look in the GIS Raster • High data volume • Slower display • Has no attribute information • More pleasing to the eye • Inherently stores how features should look in the GIS
  42. 42. Coordinate Systems ▪ Describing the correct location and shape of features requires a framework for defining real-world locations ▪ A geographic coordinate system is used to assign geographic locations to objects. ▪ GIS data layers must have a coordinate system defined to integrate with other layers
  43. 43. Map Projections Transforming 3-dimensional space (Earth) onto a 2-dimensional map (GIS) Mercator Azimuthal Equidistant Albers Equal Area Conic Lambert Conformal Conic Robinson
  44. 44. Map Projection is important ▪ Small-scale (large area) maps – Interested in comparing shapes, areas, distances, or directions of map features? – Measurement errors can be quite substantial. Example, New York to Los Angeles: Projection: Mercator, Distance: 3,124.67 miles; Projection: Albers Equal Area, Distance: 2,455.03 miles; Actual distance: 2,451 miles
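
The size of the Mercator effect can be explored with a short sketch that compares a great-circle distance with a straight-line distance measured on spherical Mercator coordinates. Several assumptions apply: a spherical earth with a mean radius of 3958.8 miles, approximate city-centre coordinates, and a plain spherical Mercator formula. It is not expected to reproduce the slide's figures exactly, since those depend on the software, datum and end points used.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean radius; spherical-earth assumption

# Approximate city-centre coordinates (latitude, longitude), assumed for illustration.
NEW_YORK = (40.7128, -74.0060)
LOS_ANGELES = (34.0522, -118.2437)


def great_circle_miles(a, b):
    """Haversine distance along the sphere between two (lat, lon) points."""
    phi1, phi2 = math.radians(a[0]), math.radians(b[0])
    dphi = phi2 - phi1
    dlmb = math.radians(b[1] - a[1])
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def mercator_xy(point):
    """Project (lat, lon) to spherical Mercator x, y in miles."""
    lat, lon = point
    x = EARTH_RADIUS_MILES * math.radians(lon)
    y = EARTH_RADIUS_MILES * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return (x, y)


def mercator_planar_miles(a, b):
    """Straight-line distance measured on the projected Mercator plane."""
    return math.dist(mercator_xy(a), mercator_xy(b))


sphere = great_circle_miles(NEW_YORK, LOS_ANGELES)
plane = mercator_planar_miles(NEW_YORK, LOS_ANGELES)
print(f"great circle: {sphere:.0f} mi, Mercator plane: {plane:.0f} mi")
```

Mercator stretches east-west distances more and more away from the equator, so a naive planar measurement at these latitudes overstates the true distance by a wide margin.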
  45. 45. Editing Errors in GIS
  46. 46. Data Storage and Editing Collected data may need to be reorganized and checked for errors before being used for spatial analysis or a mapping project. Error detection and correction may include: - Compare data with input document - Check topology of spatial objects - Check attributes of spatial objects - Check for missing spatial objects
  47. 47. Three major types of error: (1) Entity error (positional error). Entity error can take three different forms: missing entities, incorrectly placed entities, and disordered entities. (2) Attribute error. Attribute error occurs in both vector and raster systems. (3) Entity-attribute agreement error (logical consistency). Of the three basic types of error found in GIS databases, the last two are the most difficult to find.
  48. 48. Detecting and Editing Errors of Different Types ▪ An error exists whenever any of the following statements is false: 1. All entities that should have been entered are present. 2. No extra entities have been digitized. 3. The entities are in the right place and are of the correct shape and size. 4. All entities that are supposed to be connected to each other are connected. 5. All entities are within the outside boundary identified with registration marks.
  49. 49. Spatial Errors ▪ A dangling node can be defined as a single node connected to a single line entity. Dangling nodes are also called dangles. ▪ Dangles can result from three possible mistakes: (1) Failure to close a polygon (2) Failure to connect the node to the object it was supposed to be connected to (called an undershoot) (3) Going beyond the entity you were supposed to connect to (called an overshoot).
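
Since a dangling node is a node connected to exactly one line, dangles can be found by counting how many line ends meet at each node. The sketch below assumes lines are already split at their intersections and stored as (from-node, to-node) coordinate pairs; the function name and sample coordinates are illustrative.

```python
from collections import Counter


def dangling_nodes(lines):
    """Return the nodes touched by exactly one line end, sorted for stable output."""
    degree = Counter()
    for from_node, to_node in lines:
        degree[from_node] += 1
        degree[to_node] += 1
    return sorted(node for node, count in degree.items() if count == 1)


# A closed square plus a spur that sticks out from one corner.
square_with_spur = [
    ((0, 0), (1, 0)),
    ((1, 0), (1, 1)),
    ((1, 1), (0, 1)),
    ((0, 1), (0, 0)),
    ((1, 0), (2, 0)),  # spur ending at a free node
]

# The same square with its last side missing: an unclosed polygon.
unclosed_polygon = square_with_spur[:3]

print(dangling_nodes(square_with_spur))
print(dangling_nodes(unclosed_polygon))
```

A dangle is not always a mistake (a dead-end street is a legitimate one), so a check like this flags candidates for review rather than confirmed errors.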
  50. 50. Source of Errors ▪ Dangles can also result from incorrect placement of the digitizing puck or an improper fuzzy tolerance setting. Example: the distance between the left dangle and the line segment above it is 0.25 mm. With a fuzzy tolerance of 0.1 mm the dangle remains; if you change the tolerance to 0.3 mm, the dangle will disappear.
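
The effect of the tolerance setting can be sketched as a snapping rule: an end point is moved onto a nearby segment only if the gap is no larger than the tolerance. This is a simplified stand-in for what GIS software does, not any particular package's algorithm; the function names and coordinates (in mm) are assumptions chosen to mirror the 0.25 mm gap described in the slide.

```python
import math


def nearest_point_on_segment(p, a, b):
    """Closest point to p on the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return a  # degenerate segment: a and b coincide
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))  # clamp so the result stays on the segment
    return (ax + t * dx, ay + t * dy)


def snap_endpoint(p, a, b, tolerance):
    """Move p onto segment a-b if the gap is within tolerance; otherwise leave it."""
    target = nearest_point_on_segment(p, a, b)
    return target if math.dist(p, target) <= tolerance else p


dangle_end = (5.0, 0.75)                     # end of the dangling line
seg_start, seg_end = (0.0, 1.0), (10.0, 1.0)  # segment 0.25 mm above it

print(snap_endpoint(dangle_end, seg_start, seg_end, 0.1))  # gap exceeds tolerance
print(snap_endpoint(dangle_end, seg_start, seg_end, 0.3))  # gap within tolerance
```

Too large a tolerance has its own cost: features that are genuinely separate can be merged, so the setting has to suit the scale of the data.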
  51. 51. Spatial Errors ▪ Sliver polygons ▪ This occurs when the software uses a vector model that treats each polygon as a separate entity. (or spatial object) ▪ Solution: Use a GIS that does not require digitizing the same line twice. ▪ Weird polygons ▪ Polygons with missing nodes. ▪ Missing Arcs/segments
  52. 52. Labeling Errors
  53. 53. Attribute Errors: Raster and Vector Missing attributes. For raster: A. Missing row B. Incorrect or misplaced attributes. For vector: incorrect attribute values are very difficult to detect.
  54. 54. Checklist to Avoid Errors As a geospatial analyst, you should always approach a project with the obvious sources of error firmly in your mind. Therefore, when given a task to perform, and the associated data, the following should act as a good checklist: – Is the data current? – Were the data mapped at the correct scale? Do they have the same accuracies? – What is the resolution of the data? Will it support the kinds of analysis we want to perform? – Do we have all the data for the project areas, or is there some data missing? – If we need other data sets, are they available, or will we have trouble getting them?
  55. 55. Obvious Errors ▪ The statement “to err is human” is very applicable to creating spatial data. Humans make a lot of errors. Typing the wrong value into a computer is a common mistake. However, there are other sources of obvious error besides human error: – Age: a map is a representation of real-world objects at a given point in time. The reliability of a dataset typically goes down as it gets older. This is especially true of data that changes frequently, such as housing within a city. Many GIS projects take years to complete, and it is entirely possible that much of the data collected at the beginning of a project will be out of date by the end of the project. – Map Scale: In general, larger scale maps show more detail than smaller scale maps. Also, larger scale maps tend to have greater accuracy than smaller scale maps, especially maps within the “same family”, such as the differences between 1:250,000, 1:100,000 and 1:24,000. GIS will process any of your data, whether the processing is appropriate or not. You can therefore combine data from different scales rather easily; however, doing so may not be a good idea because of the different accuracies of the products. – Data Format: The way we represent data also presents an obvious source of error. For example, a raster map of land use represented by 10 meter grid cells will differ significantly from a raster map of land use represented by 100 meter grid cells. The following is a grid of land use values around Ithaca, New York. You can see the differences in representation between a map with 10 meter grid cells, 30 meter grid cells, and 100 meter grid cells.
  56. 56. Problems with Age The following maps show the different land cover types in 1968 and 1995. You can see how the data has changed over nearly 30 years, and why using older data might present a problem.
  57. 57. Components of Data Quality ▪ Positional Accuracy ▪ Attribute Accuracy ▪ Resolution ▪ Completeness
  58. 58. Spatial Accuracy ▪ Positional accuracy relates to the coordinate values of the geographic objects. Positional accuracy is itself divided into two categories: – Absolute accuracy: refers to the actual X,Y coordinates of a geographic object. If the correct position of the geographic object is known, it can be compared with the position represented in the geographic database. Typically, absolute accuracy is measured as the total difference between the two positions, or as the difference in the X coordinate and the difference in the Y coordinate. – Relative accuracy: refers to the displacement of two or more points on a map (in both distance and angle), compared to the displacement of those same points in the real world.
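
Both measures reduce to simple coordinate arithmetic. The sketch below uses hypothetical check points; the function names are illustrative, the root-mean-square summary is an added convenience rather than something the slide prescribes, and the relative measure covers only the distance component (not the angle).

```python
import math


def absolute_errors(true_points, stored_points):
    """Per-point (dx, dy, total) differences between stored and true positions."""
    errors = []
    for (tx, ty), (sx, sy) in zip(true_points, stored_points):
        dx, dy = sx - tx, sy - ty
        errors.append((dx, dy, math.hypot(dx, dy)))
    return errors


def rmse(errors):
    """Root-mean-square of the total positional errors."""
    return math.sqrt(sum(total ** 2 for _, _, total in errors) / len(errors))


def relative_distance_error(true_a, true_b, stored_a, stored_b):
    """Stored separation of two points minus their true separation."""
    return math.dist(stored_a, stored_b) - math.dist(true_a, true_b)


# Hypothetical check points in map units: surveyed truth vs. database values.
TRUE_POINTS = [(100.0, 200.0), (50.0, 50.0)]
STORED_POINTS = [(103.0, 204.0), (50.0, 50.0)]

errors = absolute_errors(TRUE_POINTS, STORED_POINTS)
print(errors, rmse(errors))
print(relative_distance_error(*TRUE_POINTS, *STORED_POINTS))
```

A dataset can be shifted as a whole (poor absolute accuracy) while the distances between its features remain correct (good relative accuracy), which is why the two are reported separately.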
  59. 59. Errors Associated with Spatial Analysis ▪ Errors in Digitizing a Map – Source errors ▪ Distortion ▪ Boundaries drawn on a map have a “thickness” – a 1 mm line is ▪ 1.25 m wide on a 1:1,250 map ▪ 100 m wide on a 1:100,000 map ▪ Estimates show that 10% of a 1:24,000 soil map may represent the boundary lines alone – Digital Representation ▪ Curves are approximated by many vertices ▪ Boundaries are not absolute, but should have a confidence interval
  60. 60. Errors Resulting from Natural Variations from Original Measurements ▪ Measurement Error – Accuracy vs. Precision ▪ Accuracy: extent to which an estimated value approaches the true value ▪ Precision: measure of dispersion of observations about a mean
  61. 61. Accuracy and Precision ▪ Accuracy is measured by the displacement of a plotted point from its true position in relation to an established standard, while precision is the degree of perfection, or repeatability, of a measurement. ▪ For mapping, accuracy is associated with the position of an object relative to its true position. ▪ Precision is then the ability to repeat a measurement, or how likely you are to return to the same location time and time again. ▪ The figures to the right illustrate the differences between accuracy and precision. ▪ Therefore, if there are natural variations in either the instruments used for measurement or the object you are measuring, the accuracy or precision may be affected.
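
One way to put numbers on the distinction is to treat accuracy as the offset of the average measurement from the true position, and precision as the spread of the repeated measurements around their own average. The sketch below follows that reading; the measurements are hypothetical and the function name is illustrative.

```python
import math
import statistics


def accuracy_and_precision(measurements, true_position):
    """Return (bias, spread): mean offset from truth, and RMS scatter about the mean."""
    mean_point = (
        statistics.fmean(x for x, _ in measurements),
        statistics.fmean(y for _, y in measurements),
    )
    bias = math.dist(mean_point, true_position)
    spread = math.sqrt(
        statistics.fmean(math.dist(m, mean_point) ** 2 for m in measurements)
    )
    return bias, spread


# Hypothetical repeated fixes of one location whose true position is (0, 0).
tight_but_offset = [(10.0, 10.1), (10.1, 10.0), (9.9, 10.0), (10.0, 9.9)]

bias, spread = accuracy_and_precision(tight_but_offset, (0.0, 0.0))
print(f"bias {bias:.2f}, spread {spread:.2f}")
```

This sample is precise but not accurate: the fixes agree closely with each other, yet all of them sit far from the true position.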
  62. 62. Digitizing errors from duplicate lines include slivers and missing labels for the sliver polygons. Slivers are exaggerated for the purpose of illustration. Digitizing errors
  63. 63. Digitizing errors of overshoot (left) and undershoot (right) Digitizing Errors- Overshoot & Undershoots
  64. 64. Digitizing errors of an unclosed polygon Digitizing errors-Unclosed Polygon
  65. 65. Pseudo nodes, shown by the diamond symbol, are nodes that are not located at line intersections Digitizing errors- Pseudo Nodes
  66. 66. The from-node and to-node of an arc determine the arc’s direction. Digitizing Arc
  67. 67. Digitizing error of multiple labels due to unclosed polygons Digitizing Unclosed Polygon –Multi labels
  68. 68. The dangle length specified by the CLEAN command can remove an overshoot if the overextension is smaller than the specified length. In this diagram, the overshoot a is removed and the overshoot b remains. Removing Dangles - Using Clean Command
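
The dangle-length rule can be sketched as a filter over arcs: an arc with a free end is dropped only if it is shorter than the specified length. This is a simplified illustration of the rule as described, not the CLEAN command's actual implementation; the function name, arc layout and coordinates are assumptions.

```python
import math
from collections import Counter


def remove_short_dangles(arcs, dangle_length):
    """Drop arcs that end at a free node and are shorter than dangle_length."""
    degree = Counter()
    for from_node, to_node in arcs:
        degree[from_node] += 1
        degree[to_node] += 1
    kept = []
    for from_node, to_node in arcs:
        has_free_end = degree[from_node] == 1 or degree[to_node] == 1
        if has_free_end and math.dist(from_node, to_node) < dangle_length:
            continue  # short overshoot: remove it
        kept.append((from_node, to_node))
    return kept


ARCS = [
    ((0.0, 0.0), (5.0, 0.0)),
    ((5.0, 0.0), (10.0, 0.0)),
    ((5.0, -5.0), (5.0, 0.0)),
    ((5.0, 0.0), (5.0, 0.2)),    # overshoot a: 0.2 units past the intersection
    ((10.0, 0.0), (10.0, 1.5)),  # overshoot b: 1.5 units long
]

cleaned = remove_short_dangles(ARCS, dangle_length=1.0)
print(len(ARCS), "->", len(cleaned))
```

With a dangle length of 1.0, overshoot a is removed and overshoot b remains. The long arcs that also end at free nodes are kept, which is how legitimate dead ends survive the cleanup.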
  69. 69. Typical Digitizing Situations this is ideal, but... overshoot, and what to do with it undershoot, and what to do with it
  70. 70. Acknowledgement These slides are an aggregation of material intended to give a better understanding of GIS. I acknowledge the contributions of all the authors and photographers whose material I have drawn on for this presentation.
  71. 71. Author’s Coordinates: Dr. Nishant Sinha Pitney Bowes Software, India mr.nishantsinha@gmail.com