Abdisalam Issa-Salwe, Taibah University
Michael G. Wing and Pete Bettinger (2008): Geographic Information Systems: Applications in Natural Resource Management,
2nd Edition. Oxford University Press.
Acquiring, Creating, and
Editing GIS Databases
(IS344)
Chapter 3
Abdisalam Issa-Salwe
Information Systems Department
College of Computer Science & Engineering
Taibah University
Chapter 3 Objectives
 Methods to acquire GIS databases,
particularly via the Internet
 Methods to create new GIS databases
 Processes to edit existing GIS databases
 Types and sources of error potentially
associated with GIS databases
Four general cases of GIS databases at
most organizations
 The data necessary for project work
 Don’t exist
 Do exist but were created for other general uses and
may not be completely suitable for your project
 Do exist but were created for other specific uses and
may not be completely suitable for your project
 The data are in place and in good order for your
project!
Typically, you’ll have to acquire data
 Hire a contractor to create or edit GIS data
 Use a GPS or other device to capture data
 Modify an existing database
 Create a new GIS database
Digitizing
Using/buying data from another organization
Downloading data from the Internet
Acquisition processes
 Using an Internet browser to select and
download
 FTP (File Transfer Protocol)
 Transfer on hardware
 Tape
 External hard drive
 CD or DVD
 USB key
 Floppy
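Downloading over HTTP or FTP can also be scripted instead of done by hand in a browser. The following is a minimal sketch using only the Python standard library; the function name `fetch` and the file names are hypothetical, and the demonstration copies a local file through a `file://` URL so that it runs without a network connection.

```python
import shutil
import tempfile
import urllib.request
from pathlib import Path


def fetch(url, dest):
    """Download url to the local path dest; return the number of bytes written."""
    dest = Path(dest)
    # urlopen handles http://, https://, ftp:// and file:// URLs
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest.stat().st_size


# Offline demonstration: "download" a stand-in file through a file:// URL.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "stands.zip"        # hypothetical data file
    source.write_bytes(b"example bytes")
    size = fetch(source.as_uri(), Path(tmp) / "copy.zip")

print(size)
```

For a real acquisition, the same call would be given the data provider's own http(s):// or ftp:// address.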
Figure 3.1. Measurement reference points for the Daniel
Pickett forest to enable digitizing of additional
landscape features or the creation of new GIS
databases.
[Map legend: roads; stands; reference points (with associated
X and Y coordinates). Coordinate values shown in the figure:
490000, 500000, 500435, 510436.]
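Reference points such as those in Figure 3.1 allow digitizer (table or screen) coordinates to be registered to map coordinates. The following is a minimal sketch, assuming a six-parameter affine transformation fitted by least squares; the function names and the control-point values are hypothetical and are not read from the figure.

```python
def solve3(a, b):
    """Solve a 3x3 linear system a*x = b by Gaussian elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_affine(digitizer_pts, map_pts):
    """Least-squares fit of X = a*u + b*v + c and Y = d*u + e*v + f."""
    rows = [(u, v, 1.0) for u, v in digitizer_pts]
    # Normal equations: (A^T A) p = A^T t, where each row of A is (u, v, 1)
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for k in (0, 1):  # k = 0 fits X, k = 1 fits Y
        atb = [sum(r[i] * p[k] for r, p in zip(rows, map_pts)) for i in range(3)]
        params.append(solve3(ata, atb))
    return params  # [[a, b, c], [d, e, f]]


def apply_affine(params, pt):
    """Transform one digitizer point into map coordinates."""
    u, v = pt
    (a, b, c), (d, e, f) = params
    return (a * u + b * v + c, d * u + e * v + f)


# Hypothetical control points: digitizer units -> map coordinates
digitizer = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
ground = [(490000.0, 500000.0), (500000.0, 500000.0),
          (500000.0, 510000.0), (490000.0, 510000.0)]

params = fit_affine(digitizer, ground)
center = apply_affine(params, (5.0, 5.0))
print(center)
```

With more than three reference points, the residuals between the transformed and the known coordinates give an indication of how well the map was registered.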
Figure 3.2. A landslide drawn on a map with a regular
sharpened pencil (upper left); with a marker (upper right);
with a sharpened pencil but sloppily, so that the landslide
area is not closed (lower left); and with a marker but
sloppily, so that the landslide area is barely closed (lower right).
Figure 3.3. A timber stand (a) in vector format, from the
Brown Tract, scanned (b) or converted to a raster
format using 25 m grid cells, then converted back to
vector format (c) by connecting lines to the center of
each grid cell.
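The positional shift caused by the round trip in Figure 3.3 can be illustrated by moving each vertex to the center of the grid cell that contains it. This is a minimal sketch and a simplification of a real vector-to-raster-to-vector conversion; it assumes a grid aligned with the coordinate origin, and the stand coordinates are made up.

```python
import math

CELL = 25.0  # grid cell size in metres, as in Figure 3.3


def snap_to_cell_center(x, y, cell=CELL):
    """Return the centre of the grid cell that contains (x, y)."""
    col = math.floor(x / cell)
    row = math.floor(y / cell)
    return ((col + 0.5) * cell, (row + 0.5) * cell)


# Hypothetical stand boundary vertices (metres)
stand = [(3.0, 4.0), (118.0, 9.0), (131.0, 96.0), (12.0, 88.0)]

snapped = [snap_to_cell_center(x, y) for x, y in stand]
shifts = [math.hypot(sx - x, sy - y)
          for (x, y), (sx, sy) in zip(stand, snapped)]

for original, moved, shift in zip(stand, snapped, shifts):
    print(original, "->", moved, f"shift {shift:.1f} m")
```

No vertex can move farther than half the cell diagonal (about 17.7 m for 25 m cells), which is one reason the cell size bounds the positional accuracy of the converted database.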
Editing GIS databases
 Reasons for editing
 Changing a spatial projection
 Edge matching GIS databases to other databases
 Generalization and transformation processes
necessary to convert a GIS database to a specific
format or resolution
 In natural resources, updates may occur
annually to keep pace with timber inventory
 Growth, disturbance, harvesting
Updating processes
 Can be laborious and error-prone
 Verification protocols can help
Ensure that data variables are reasonable or
meet some standard
Should be in place for spatial and attribute
characteristics
Should involve multiple people
Figure 3.4. A generalized process for updating a forest
inventory GIS database.
[Flowchart. Participants: inventory forester; information
systems analysts. Steps: delineate changes to be made to
inventory; digitize changes to spatial data and encode
inventory; integrate into GIS database. Maps, data files, and
GIS databases pass between the steps, and each of four
verification processes (#1 to #4) checks the data for
mistakes and omissions.]
Editing attributes
 Attributes are the values used to describe
landscape features in a GIS database
 Fields, variables, columns, data, etc.
 Attributes may need to be updated over time
 Vegetation type, basal area, age, volume (mbf)
 Easy to make mistakes, particularly with major
updates
 Verification processes can check whether values
are in the appropriate range
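A range check of this kind is straightforward to automate. The following is a minimal sketch; the field names, units, limits, and vegetation codes are all hypothetical and would in practice come from the organization's own inventory standards.

```python
# Hypothetical valid ranges for stand attributes
VALID_RANGES = {
    "age": (0, 500),               # years
    "basal_area": (0.0, 400.0),    # assumed unit: square feet per acre
    "volume_mbf": (0.0, 150.0),    # assumed unit: thousand board feet per acre
}
VALID_VEG_TYPES = {"conifer", "hardwood", "mixed", "nonforest"}


def check_record(record):
    """Return a list of problems found in one attribute record."""
    problems = []
    for field, (low, high) in VALID_RANGES.items():
        value = record.get(field)
        if value is None:
            problems.append(f"{field}: missing")
        elif not (low <= value <= high):
            problems.append(f"{field}: {value} outside {low}-{high}")
    if record.get("veg_type") not in VALID_VEG_TYPES:
        problems.append(f"veg_type: unknown value {record.get('veg_type')!r}")
    return problems


good = {"veg_type": "conifer", "age": 45, "basal_area": 180.0, "volume_mbf": 22.5}
typo = {"veg_type": "conifer", "age": 4500, "basal_area": 180.0, "volume_mbf": 22.5}

print(check_record(good))
print(check_record(typo))
```

A check like this catches out-of-range values such as the extra digit in the second record, but it cannot detect a wrong value that still falls inside the allowed range.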
Editing spatial position
 As the locations or shapes of spatial features
change, their coordinates will need to be
changed within the GIS database
 Editing procedures for this purpose vary widely
among software products
 Typically, a database is first made “editable”
 The user then makes edits
 Points, lines, and/or polygons moved, copied, created, or
deleted
 The edits are saved
 Often, a time-consuming procedure
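The make-editable, edit, save sequence can be sketched independently of any GIS product. The following is a toy illustration only; the class `EditSession` and its methods are hypothetical and do not correspond to the API of any particular software.

```python
import copy


class EditSession:
    """Toy edit session: edits go to a working copy until save() is called."""

    def __init__(self, features):
        self._features = features   # dict: feature id -> list of (x, y) vertices
        self._working = None

    def start(self):
        """Make the database editable by creating a working copy."""
        self._working = copy.deepcopy(self._features)

    def move_vertex(self, fid, index, new_xy):
        self._require_editable()
        self._working[fid][index] = new_xy

    def delete_feature(self, fid):
        self._require_editable()
        del self._working[fid]

    def save(self):
        """Write the working copy back to the database and end the session."""
        self._require_editable()
        self._features.clear()
        self._features.update(self._working)
        self._working = None

    def _require_editable(self):
        if self._working is None:
            raise RuntimeError("database is not editable; call start() first")


roads = {"r1": [(0.0, 0.0), (100.0, 0.0)], "r2": [(0.0, 50.0), (100.0, 50.0)]}

session = EditSession(roads)
session.start()
session.move_vertex("r1", 1, (100.0, 10.0))
session.delete_feature("r2")
before_save = roads["r1"][1]   # stored data is unchanged until save()
session.save()

print(before_save, roads)
```

Keeping edits in a working copy until they are saved means that an interrupted or mistaken editing session does not alter the stored database.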
Consistency in spatial position
 When updating or creating new data,
inconsistencies may result as the data are
incorporated into existing databases
Figure 3.5. A timber stand drawn more
precisely (top) and less precisely (bottom).
Note that the lines on the southern and eastern
portions of the figures are different.
Figure 3.6. Spatial inconsistency between a timber
stand GIS database and a roads GIS database.
[Figure labels: roads; timber stands; inconsistency.]
Error in GIS databases
 Errors arise from many sources:
Editing, encoding, hardware, and others
 Three primary sources of error in GIS data
Systematic
Human
Random
Systematic errors
 Caused by problems in the processes
and/or tools used to measure spatial
locations or attribute data
 Sometimes called cumulative errors since
they add up during data collection
 Sometimes called instrumental errors
 Can be removed if identified and
quantified
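Because a systematic error repeats in the same way with every measurement, it can be removed once its size is known. The following is a minimal sketch with two hypothetical cases: a distance instrument that reads 0.2 percent long, and a position source with a constant offset. All values are made up for illustration.

```python
def correct_distance(measured, scale_error):
    """Remove a proportional error, where measured = true * (1 + scale_error)."""
    return measured / (1.0 + scale_error)


def correct_position(xy, offset):
    """Remove a constant (dx, dy) offset from a measured position."""
    x, y = xy
    dx, dy = offset
    return (x - dx, y - dy)


SCALE_ERROR = 0.002                  # instrument reads 0.2 percent long
legs = [100.2, 250.5, 75.15]         # measured traverse legs, metres

corrected = [correct_distance(d, SCALE_ERROR) for d in legs]
accumulated = sum(legs) - sum(corrected)   # error built up over the traverse

fixed = correct_position((500003.0, 4900002.0), (3.0, 2.0))

print(corrected, accumulated, fixed)
```

The accumulated value shows why these are also called cumulative errors: each leg adds its share, so the uncorrected total drifts farther from the truth as more measurements are taken.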
Human errors
 Sometimes called gross errors or blunders
 As the name suggests, these are
introduced through carelessness or
inattention
 Verification processes can be used to
control human errors
 Data collection and editing protocols can
also assist in limiting human errors
Random errors
 An almost unavoidable by-product of measuring
and describing landscape data
 No matter how careful we are in data collection
procedures, there will almost always be some
slight variance from the true measurement
 Random errors are the errors that remain after
systematic and human errors have been
removed
Managing random errors
 We assume that random errors follow a normal
(Gaussian) distribution
 They cluster around a mean value or center
 Least squares adjustments can distribute and
minimize the error among all measurements in a
feature
 More frequently, and especially in forestry, we
assume that random errors will cancel each
other out
 For this reason, random errors are sometimes called
compensating errors
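The compensating behavior can be seen in a small simulation. The following is a minimal sketch, assuming normally distributed errors with a hypothetical standard deviation of 5 cm around a hypothetical true length of 100 m; the seed is fixed only so that the run is repeatable.

```python
import random
import statistics

random.seed(42)

TRUE_LENGTH = 100.0   # metres (hypothetical true value)
SIGMA = 0.05          # standard deviation of the random error, metres

# Repeated measurements of the same line, each with its own random error
measurements = [TRUE_LENGTH + random.gauss(0.0, SIGMA) for _ in range(1000)]
errors = [m - TRUE_LENGTH for m in measurements]

mean_error = statistics.mean(errors)          # positive and negative errors offset
largest_error = max(abs(e) for e in errors)   # a single measurement can be far off
best_estimate = statistics.mean(measurements)

print(f"mean error    {mean_error:+.4f} m")
print(f"largest error {largest_error:.4f} m")
print(f"best estimate {best_estimate:.4f} m")
```

For repeated measurements of a single quantity, the arithmetic mean is the least squares estimate, so averaging is the simplest case of the adjustment described above.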
Types of errors in GIS databases
 Positional errors occur when things are in the
wrong place
 Can result from poor registration or inaccurate
coordinate input during the digitizing process
 Are sometimes handled with accuracy
statements: “90 percent of landscape features
are within 150 meters of their true position”
 A root mean square error (RMSE) is sometimes
used to set or describe an accuracy standard
 An RMSE assesses the error between a mapped point
and its on-the-ground (true) equivalent
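Both kinds of accuracy description can be computed from a set of check points. The following is a minimal sketch, assuming the horizontal RMSE is taken as the square root of the mean squared distance between each mapped point and its true position; the coordinates and the 5 m tolerance are hypothetical.

```python
import math


def rmse(mapped, true):
    """Root mean square of the distances between mapped and true points."""
    squared = [(mx - tx) ** 2 + (my - ty) ** 2
               for (mx, my), (tx, ty) in zip(mapped, true)]
    return math.sqrt(sum(squared) / len(squared))


def share_within(mapped, true, tolerance):
    """Fraction of points lying within tolerance of their true position."""
    distances = [math.hypot(mx - tx, my - ty)
                 for (mx, my), (tx, ty) in zip(mapped, true)]
    return sum(1 for d in distances if d <= tolerance) / len(distances)


# Hypothetical check points (metres): surveyed positions and mapped positions
true_pts = [(1000.0, 2000.0), (1500.0, 2500.0), (2000.0, 2200.0), (2600.0, 2900.0)]
mapped_pts = [(1003.0, 2004.0), (1500.0, 2500.0), (2006.0, 2208.0), (2600.0, 2905.0)]

value = rmse(mapped_pts, true_pts)
share = share_within(mapped_pts, true_pts, 5.0)

print(f"RMSE = {value:.2f} m")
print(f"{share:.0%} of points are within 5 m of their true position")
```

The RMSE summarizes the typical size of the error in a single number, while the percentage statement says how many features meet a stated tolerance; the two describe the same check points in different ways.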
Figure 3.7. Uncertainty of the local shape of a road
segment (after Schneider 2001).
[Figure labels: digitized road segment; real-world
representations #1, #2, and #3.]
Other types of errors
 Attribute errors
 Incorrect values assigned to features
 Can result from keyboard entry
 Verification processes can help alleviate these
 Computational errors
 Can be introduced during procedures
 Generalization
 Vector-to-raster transformations
 Interpolations
 Results should be carefully considered to judge
appropriateness of procedures
