
Int. J. of Computers, Communications and Control, Vol. 3 (2008), Suppl. issue: Proceedings of ICCCC 2008, pp. 11-16

Justifying GIS Modeling Uncertainty: A Practical Approach

Gabriela Droj

Abstract

The use of Geographic Information Systems (GIS) has increased rapidly, and GIS has become the main tool for analyzing spatial data in many engineering applications and decision-making activities. These applications and activities increasingly require a three-dimensional reference surface, known as a Digital Terrain Model (DTM). Obviously, taking correct decisions requires high-quality DTMs with known inaccuracy. The objective of this article is to compare a number of different algorithms involved in producing a DTM for GIS modeling applications, and to establish a representative set of parameters necessary to quantify the uncertainty of these DTMs.

Keywords: GIS, Spatial Data Accuracy, Data Quality, Modeling Uncertainty.

1 Introduction

In recent years, the use of Geographic Information Systems (GIS) has increased rapidly, and GIS has become the main tool for analyzing spatial data in a wide range of scientific fields and real-world activities. Geographic (or geographical) information systems are used not only to assist daily work, but also as a decision-support tool, especially for planning (urban planning, investment planning, infrastructure planning, economic development, taxation, SWOT analysis, etc.), resource management and fiscal impact analysis [2,8,11]. GIS are also used in critical systems, such as civil protection, which need high performance and fast computing times [4].

These applications increasingly require a three-dimensional reference space (actually a 2.5D surface), although temporal aspects demand four dimensions, with time as the additional (temporal) reference parameter [12,13]. Using GIS technology, contemporary maps have taken on radically new forms of display, beyond the usual 2D planimetric paper map. Today, users expect to be able to drape spatial information over a 3D view of the terrain. This 3D view of the terrain is called a Digital Terrain Model (DTM) or Digital Elevation Model (DEM) [3].

Digital terrain models are frequently used to support important decisions, for example to answer hydrological questions concerning flooding. In order to judge such decisions, the quality of the DTM must be known. Unfortunately, the quality of DTMs is rarely checked: while the development of GIS advances, research on DTM quality has so far been neglected.

The objectives of this paper are: (a) to compare the different algorithms involved in producing a DTM, and (b) to establish a representative set of parameters necessary to quantify the uncertainty of DTMs.

2 DTM for GIS Modeling Applications

A DTM is a digital representation of the ground surface, topography or terrain; it is also known as a Digital Elevation Model (DEM). A DTM can be represented as a raster (a grid of squares), as contour lines, or as a triangular irregular network (TIN) [5,7,10], as illustrated by the sketch below.
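As an illustration of the TIN representation (a minimal sketch under our own assumptions, not code from the paper), a TIN can be built from scattered (x, y, z) points by triangulating their planimetric coordinates; the sample coordinates below are invented purely for the example:

```python
import numpy as np
from scipy.spatial import Delaunay

# Scattered survey points: columns are x, y (position) and z (elevation).
# These five points are invented purely for illustration.
points = np.array([
    [0.0,   0.0,   125.0],
    [500.0, 0.0,   142.0],
    [0.0,   500.0, 160.0],
    [500.0, 500.0, 180.0],
    [250.0, 250.0, 171.5],
])

# Triangulate in the (x, y) plane only: a DTM is a 2.5D surface,
# carrying a single elevation z per planimetric location.
tin = Delaunay(points[:, :2])

# Each row of `simplices` indexes the three vertices of one triangle;
# together with the z values, this is the TIN form of the terrain.
for tri in tin.simplices:
    print("triangle vertices:\n", points[tri])
```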
In GIS applications, and according to the relevant literature, the following two methodologies are frequently used for 2.5D DTM GIS modeling: (i) the cartographic, interpolation-based digitizing method, and (ii) the image-based automatic measurement method [9].

1. The cartographic, interpolation-based digitizing method is widely used because topographic maps are usually available. The input data, consisting of points, form the basis for the computation of the DTM; the computation itself consists of spatial interpolation algorithms.

2. The image-based automatic measurement method is based on close-range photogrammetry or airborne laser scanning and outputs bulk points with a high density. The DTM for 2.5D GIS modeling applications is produced in the post-processing phase, usually by creating the TIN or by interpolation.

2.1 Interpolation-based DTMs

Current research and industrial projects in GIS require higher standards for the accuracy of spatial data. Data in geographic information systems (GIS) are usually collected as points, where a point is a triplet (x, y, z): x and y are the coordinates of the point and z is the specific information. This specific information can be, for example, the altitude at the point (x, y), the quantity of precipitation, the level of pollution, the type of soil, socio-economic parameters, etc. Mapping and spatial analysis often require converting such field measurements into continuous space. Interpolation is one of the methods most frequently used to transform field data, especially point samples, into a continuous form such as a grid or raster data format.

Several interpolation methods are frequently used in GIS. The following eight widely used methods are compared and studied in this paper: inverse distance weighting (IDW), biquadratic spline interpolation, bicubic spline interpolation, B-spline interpolation, nearest neighbours (Voronoi diagrams), Delaunay triangulation, quadratic Shepard interpolation and Kriging interpolation [1,6,14]. A minimal sketch of IDW is given at the end of this section.

2.2 Image-based DTMs

The fastest way of obtaining a digital elevation model is to use remotely sensed data. When data are collected with close-range photogrammetry or airborne laser scanning, the result is a high-density set of points with three coordinates (x, y, z). By computing the TIN model of these points, a DTM can be obtained quickly; this approach is used especially in civil protection, in case of disaster. The quality of such a DTM depends on the quality of the image, and especially on the coordinate z, which always has a lower accuracy than the pair (x, y). The method is not recommended for urban areas, because there the value of z equals the sum of the terrain altitude and the height of the object standing on it.
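As a concrete example of the first of the eight methods listed above, here is a minimal inverse distance weighting sketch (our own illustration, not the implementation used in the paper); the exponent `power` and the tie-breaking constant `eps` are assumptions of this sketch:

```python
import numpy as np

def idw(xy_known, z_known, xy_query, power=2.0, eps=1e-12):
    """Basic inverse distance weighting.

    xy_known : (n, 2) sample coordinates, z_known : (n,) sample values,
    xy_query : (m, 2) locations to estimate. `power` controls how fast
    the influence of a sample decays with distance.
    """
    xy_known = np.asarray(xy_known, dtype=float)
    xy_query = np.asarray(xy_query, dtype=float)
    z_known = np.asarray(z_known, dtype=float)

    # Pairwise distances between query points and samples, shape (m, n).
    d = np.linalg.norm(xy_query[:, None, :] - xy_known[None, :, :], axis=2)
    w = 1.0 / (d + eps) ** power  # eps avoids division by zero on samples

    # A query point sitting exactly on a sample gets almost all the weight,
    # so IDW honours the original data values.
    return (w * z_known).sum(axis=1) / w.sum(axis=1)
```

With power = 2 this reproduces the classical IDW behaviour: sample points keep their original values, while estimates between samples are smooth distance-weighted averages.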
3 Quantifying Uncertainty in GIS Modeling

Direct measurement of the errors in a result is often impossible, because the true value of every geographic feature or phenomenon represented in a geographic data set is rarely determinable. Therefore uncertainty, instead of error, should be used to describe the quality of an interpolation result. Quantifying uncertainty in these cases requires comparing the known data with the created surface [15].

To analyze the pattern of deviation between two sets of elevation data, the conventional approach is to compute statistical expressions of accuracy such as the root mean square error, standard deviation, mean, variance and coefficient of variation. In fact, any statistical measure that is effective for describing a frequency distribution, including measures of central tendency and dispersion, may be used, as long as the assumptions of the specific method are satisfied [9,15].

For the evaluation of DTMs the most widely used measure, and usually the only one, is the well-known Root Mean Square Error (RMSE). It measures the dispersion of the frequency distribution of the deviations between the original elevation data and the DTM data, and is mathematically expressed as

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(z_{d,i} - z_{r,i}\right)^{2}}$$

where $z_{d,i}$ is the i-th elevation value measured on the DTM surface, $z_{r,i}$ is the corresponding original elevation, and $n$ is the number of elevation points checked.

4 DTM Quality Evaluation: Experiments with Real-World Data

DTMs are the most popular products of interpolation. In the following, we test different methods and algorithms for DTM creation, in order to establish a minimum set of parameters for comparing and evaluating the quality of the resulting data [8,11]. To test and compare the methods on real data, we selected an area in the northern hills of the Oradea municipality. For the first DTM we used photogrammetric measurements of spot elevations from an orthorectified airborne image of the area. The TIN of the area was generated using ArcGIS Desktop 9.1; the resulting 3D model is presented in Fig. 1.

Figure 1: The Oradea 3D model

This model is considered the reference for testing and evaluating the eight most popular interpolation algorithms used for DTM creation (see Section 2). The first step was to create a regular grid with a step of 500 m, for a total of 60 points. On this set of points we tested the algorithms specified above. The results were analyzed by direct observation, i.e. visual comparison of the models, and by means of statistical parameters; the statistics themselves can be computed as in the sketch below.
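The following sketch (our own, assuming NumPy; not the authors' code) mirrors the RMSE formula of Section 3 and the dispersion measures reported in the tables of this section:

```python
import numpy as np

def rmse(z_dtm, z_ref):
    """Root mean square error between elevations measured on the DTM
    surface (z_d) and the corresponding reference elevations (z_r),
    exactly as defined in Section 3."""
    z_dtm = np.asarray(z_dtm, dtype=float)
    z_ref = np.asarray(z_ref, dtype=float)
    return np.sqrt(np.mean((z_dtm - z_ref) ** 2))

def dispersion_stats(values):
    """Median absolute deviation, standard deviation and coefficient
    of variation of an array of values (elevations of a created
    surface, or elevation deviations at control points)."""
    v = np.asarray(values, dtype=float)
    mad = np.median(np.abs(v - np.median(v)))  # median absolute deviation
    std = v.std()                              # standard deviation
    cv = std / v.mean()                        # coefficient of variation
    return mad, std, cv
```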
The visual comparisons show a high similarity to the reference for the Delaunay triangulation and the Shepard interpolation. For the statistical evaluation we determined the median absolute deviation, the standard deviation, the coefficient of variation and the RMSE; Table 1 presents these statistics for all the created surfaces. Comparing the statistical evaluation with the real-world situation reveals an inconsistency: the highest statistical accuracy, given by the lowest RMSE value, is obtained in this case by the bicubic interpolation.

Method       | Time | Min   | Mean   | Max    | Median abs. dev. | Std. dev. | Coef. of var. | RMSE
-------------|------|-------|--------|--------|------------------|-----------|---------------|-------
Initial data |  -   | 125   | 180    | 245    | 25               | 30.9      | 0.172         |  -
IDW          | 0.3  | 125   | 180    | 245    | 11.92            | 20.05     | 0.110         | 182.66
Biquadratic  | 0.1  | 129   | 181.7  | 222.9  | 13.96            | 20.17     | 0.110         | 183.11
Bicubic      | 0.1  | 161.7 | 179.5  | 197.3  | 6.99             | 8.39      | 0.46          | 179.75
B-spline     | 0.8  | 110   | 177.9  | 246.4  | 19.2             | 27.73     | 0.151         | 184.96
Voronoi      | 0.08 | 125   | 180    | 245    | 25               | 29.69     | 0.16          | 184.77
Delaunay     | 0.03 | 125   | 179    | 244.6  | 16.11            | 24.6      | 0.135         | 183.9
Shepard      | 0.3  | 100   | 173    | 245    | 20.3             | 27.6      | 0.152         | 183
Kriging      | 0.14 | 125   | 180    | 245    | 20.6             | 28.67     | 0.160         | 180.82

Table 1: The statistics for all the created surfaces, with 60 known points (500 m grid)

In the second step we created a regular grid with a step of 250 m, for a total of 121 points, and tested the same algorithms on this set of points. Here the visual comparisons show a high similarity to the reference for the Shepard interpolation, the Kriging interpolation and the Delaunay triangulation. The statistics of the created surfaces show a decrease in the values of all parameters, as can be seen in Table 2.

Method       | Time | Min   | Mean   | Max    | Median abs. dev. | Std. dev. | Coef. of var. | RMSE
-------------|------|-------|--------|--------|------------------|-----------|---------------|-------
Initial data |  -   | 125   | 180    | 245    | 25               | 30.2      | 0.166         |  -
IDW          | 0.11 | 125.5 | 176.09 | 242.87 | 14.30            | 21.82     | 0.121         | 181.11
Biquadratic  | 0.3  | 122.4 | 176    | 227    | 17.89            | 22.91     | 0.129         | 179
Bicubic      | 0.3  | 164.8 | 180    | 195    | 6.06             | 7.2       | 0.04          | 180.1
B-spline     | 3.92 | 134.7 | 172.7  | 238.85 | 20.5             | 29        | 0.163         | 179.9
Voronoi      | 0.11 | 125   | 175    | 245    | 25               | 30        | 0.16          | 181.01
Delaunay     | 0.11 | 125   | 177    | 244.8  | 20.3             | 27.6      | 0.152         | 183
Shepard      | 0.14 | 125   | 177.84 | 245    | 19.09            | 27.1      | 0.148         | 185
Kriging      | 0.08 | 125   | 180    | 245    | 17.17            | 25.63     | 0.140         | 184.26

Table 2: The statistics for all the created surfaces, with 121 known points (250 m grid)
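The overall evaluation workflow of this section can be sketched as follows (an illustration under our own assumptions, not the authors' code): sample a reference surface on a regular grid, rebuild the surface with an interpolator, and score it on independent random control points, as done below in Table 3. scipy.interpolate.griddata provides `nearest`, Delaunay-based `linear` and `cubic` variants; the paper's Shepard, Kriging and spline methods would come from other libraries. The synthetic `reference_surface` merely stands in for the photogrammetric TIN.

```python
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(0)

# Stand-in for the reference terrain (the paper uses a photogrammetric TIN).
def reference_surface(x, y):
    return 180 + 40 * np.sin(x / 800) * np.cos(y / 600)

# Step 1: a regular sample grid, e.g. 500 m step over a ~3 x 3 km area.
gx, gy = np.meshgrid(np.arange(0, 3500, 500), np.arange(0, 3500, 500))
known_xy = np.column_stack([gx.ravel(), gy.ravel()])
known_z = reference_surface(known_xy[:, 0], known_xy[:, 1])

# Step 2: independent random control points, never used for interpolation.
ctrl_xy = rng.uniform(0, 3000, size=(100, 2))
ctrl_true = reference_surface(ctrl_xy[:, 0], ctrl_xy[:, 1])

# Step 3: interpolate the surface at the control points and evaluate.
for method in ("nearest", "linear", "cubic"):
    ctrl_est = griddata(known_xy, known_z, ctrl_xy, method=method)
    dev = ctrl_est - ctrl_true
    score = np.sqrt(np.mean(dev ** 2))  # RMSE at the control points
    print(f"{method:8s} RMSE at control points: {score:6.2f} m")
```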
In the second case, as in the first one, the statistical parameters indicate that the optimal surface is generated by the biquadratic interpolation, whereas direct observation shows a higher similarity to the reference for the Shepard interpolation, the Kriging interpolation and the Delaunay triangulation. The results obtained in both cases show that the statistical analysis is inconsistent and insufficient if it is performed on the same data used by the interpolation algorithms. This suggests the need to test the quality of the surfaces on an independent set of data that was not considered in the interpolation.

In the following we evaluate the generated surfaces on a random set of points for which the real value of the altitude was determined. For this independent random set we determined the following statistical parameters: the variance, the median absolute deviation, the standard deviation and the root mean square error. The determined values are presented in Table 3.

Method      | Variance 250 | Variance 500 | Median abs. dev. 250 | Median abs. dev. 500 | Std. dev. 250 | Std. dev. 500 | RMSE 250 | RMSE 500
------------|--------------|--------------|----------------------|----------------------|---------------|---------------|----------|---------
IDW         | 231          | 167          | 11.6                 | 10.17                | 15.21         | 12.29         | 13.48    | 10.93
Biquadratic | 390          | 426          | 16.17                | 16.56                | 19.75         | 20.64         | 15.35    | 13.84
Bicubic     | 947          | 1044         | 26.53                | 26                   | 30.77         | 32            | 19.14    | 17.14
B-spline    | 166.2        | 145          | 9.67                 | 9.9                  | 12.89         | 12            | 12.54    | 10.37
Voronoi     | 246          | 94.5         | 11.38                | 7.3                  | 15.71         | 9.75          | 13.79    | 9.27
Delaunay    | 218          | 36.63        | 10.85                | 3.41                 | 14.78         | 6.05          | 13.34    | 7.99
Shepard     | 164          | 71.78        | 9.56                 | 6.40                 | 12.81         | 8.47          | 11.97    | 8.49
Kriging     | 160.06       | 78.97        | 9.74                 | 6.81                 | 12.65         | 8.88          | 11.75    | 8.6

Table 3: The statistics for all the created surfaces, determined on a set of random control points

5 Summary and Conclusions

Evaluating the statistical data, we noticed that the most accurate surfaces are generated, in the first case (500 m grid), by the Kriging, Shepard and B-spline algorithms, and in the second case (250 m grid) by the Delaunay triangulation, followed by the Shepard and Kriging interpolations. If we evaluate all the statistical data (presented in all three tables), the Delaunay triangulation represents the optimal method. Similar results can be obtained by Kriging and Shepard interpolation, and these methods are sometimes even more efficient than the Delaunay triangulation. Nevertheless, the Delaunay algorithm is recommended because it needs less computing time and does not change the original values of the points. The B-spline algorithm also gives good results, but its computing time is much higher and it smooths the surface, which makes it inadequate for surfaces with large altitude differences.

The first conclusion, highlighted by the values presented in Table 3, is that all the statistical indices point to a similar ordering of accuracy. Another conclusion is that the statistical evaluation on the random set of data showed a higher accuracy for a higher number of initial known points. The last conclusion is that an accurate evaluation requires testing the surface against an independent set of data. In this case the RMSE mirrors the quality of the surface, but for a correct evaluation we recommend analyzing at least one additional statistical index, such as the standard deviation or the median absolute deviation.
References

[1] M. R. Asim, M. Ghulam, K. Brodlie, "Constrained Visualization of 2D Positive Data using Modified Quadratic Shepard Method," WSCG'2004, Plzen, Czech Republic, 2004.
[2] G. Benwell, "A Land Speed Record? Some Management Issues to Address," Int'l Conference on Managing GIS for Success, Melbourne, Australia, pp. 70-75, ISBN 0 7325 1359 6, 1996.
[3] J. K. Berry and Associates, "The GIS Primer - An Introduction to Geographic Information Systems," http://www.innovativegis.com/basis/, May 2006.
[4] Tai On Chan, Ian Williamson, "A Holistic, Cost-Benefit Approach to Justifying Organization-Wide GIS," Int'l Conference on Managing GIS for Success, Melbourne, Australia, pp. 27-35, ISBN 0 7325 1359 6, 1996.
[5] Du Chongjiang, "An Interpolation Method for Grid-Based Terrain Modeling," The Computer Journal, Vol. 39, No. 10, 1996.
[6] ESRI (Environmental Systems Research Institute), "Arc/Info 8.0.1," ESRI, Inc., 1999.
[7] L. De Floriani, E. Puppo, P. Magillo, "Application of Computational Geometry to Geographic Information Systems," Handbook of Computational Geometry, Elsevier Science, pp. 333-388, 1999.
[8] Gary J. Hunter, "Management Issues in GIS: Accuracy and Data Quality," Int'l Conference on Managing GIS for Success, Melbourne, Australia, pp. 95-101, ISBN 0 7325 1359 6, 1996.
[9] W. Karel, N. Pfeifer, C. Briese, "DTM Quality Assessment," ISPRS Technical Commission II Symposium, Vol. XXXVI/2, pp. 7-12, 2006.
[10] M. van Kreveld, "Algorithms for Triangulated Terrains," Conference on Current Trends in Theory and Practice of Informatics, 1997.
[11] A. D. Styliadis, "Risk Management for GIS Projects - Lessons Learnt," Int'l Conference on Managing GIS for Success, Melbourne, Australia, pp. 102-111, ISBN 0 7325 1359 6, 1996.
[12] A. D. Styliadis, M. Gr. Vassilakopoulos, "A Spatio-temporal Geometric-based Model for Digital Documentation of Historical Living Systems," Information and Management (Elsevier), Vol. 42, No. 2, pp. 349-359, 2005.
[13] A. D. Styliadis, "GIS: From Layers to Features," Australian Magazine for CAD and MicroStation, Pen Brush Publishers, Vol. 6, No. 3, pp. 18-29, 1996.
[14] R. T. Trîmbiţaş, "Analiză numerică. Note de curs" (Numerical Analysis: Lecture Notes), http://math.ubbcluj.ro/~tradu/narom.html, 2003.
[15] Qihao Weng, "Quantifying Uncertainty of Digital Elevation Model," http://isu.indstate.edu/qweng, December 2006.

Gabriela Droj
University of Oradea, Department of Geodesy and Topography
City Hall of Oradea, Head of the GIS Department
E-mail: gaby@oradea.ro