Model elevation attribute with geostatistical procedures


Geisa Bugs. Final project for the geostatistics course at UNL, 2007.


Geostatistics
Master of Science on Geospatial Technologies
Professor: Carlos A. Felgueiras
Geisa Bugs

This article reports the procedures, problems and results of the proposed geostatistics exercise. The goal is to build a model from freely chosen data in order to understand how the data are distributed and what their spatial dependency is across the study area.

1. Presentation of the spatial data to be used

The study area comprises seven census sectors (the urban area and its surroundings) of the São Gabriel municipality, located in the southern Brazilian state of Rio Grande do Sul at latitude 30°20'9" S and longitude 54°19'12" W. The whole municipality covers 5,019.646 km² and the selected study area covers 618.15 km².

The modeling exercise used a set of 189 elevation sample points from the cartographic service of the Brazilian Army, at a scale of 1:50,000. The exercise was carried out with the ArcGIS and SPRING software packages.

Figure 1: São Gabriel location. Figure 2: Sample points location.

2. Exploratory analysis

During the exploratory analysis, graphical and visual methods and numerical techniques are used to look for patterns in the data, detect outliers, and formulate hypotheses from the data, for example about its symmetry.
• Descriptive statistics

Sample points         189
Minimum               9
Maximum               286
Mean                  146.96
Standard deviation    50.739
Skewness              1.0259
Kurtosis              3.5291
1st quartile          114
Median                129
3rd quartile          164.5
Global variance       2560.86

Table 1: exploratory analysis.

The mean and median are measures of location and show where the center of the distribution lies. When the two are approximately equal, the data may follow a normal distribution. In the elevation data they differ noticeably (146.96 versus 129), possibly because an erratic value is affecting the mean.

The standard deviation describes the variability. Skewness and kurtosis are measures of shape and give information about symmetry. All three coefficients are also sensitive to erratic values, since they are computed from the mean.

• Histogram: a histogram plots the range of data values (x-axis) against the number of points that have those values (y-axis).

Figure 3: histogram of the elevation points (x-axis: data values × 10⁻²; y-axis: frequency; transformation: none; data source layer: sg_topo_pontos_cotados_Clip selection, attribute: CODIGO).

The histogram shows that the data are roughly symmetric, apart from one very low, isolated value. The longer tail on the right (positive skewness, 1.0259) indicates that most values are concentrated at the lower end of the range, with fewer high values stretching the distribution to the right.
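The statistics in Table 1 can be reproduced with a few lines of code. The sketch below is only an illustration: the `describe` function and the sample values are hypothetical (the 189 study points are not listed in this article), and it assumes the moment-based definitions of skewness and kurtosis, in which a normal distribution has kurtosis 3.

```python
import math
import statistics

def describe(values):
    """Return summary statistics of the kind listed in Table 1."""
    n = len(values)
    mean = sum(values) / n
    # Central moments about the mean (population form, divided by n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "n": n,
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "median": statistics.median(values),
        "variance": m2,                 # global (population) variance
        "std_dev": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,     # > 0 means a longer right tail
        "kurtosis": m4 / m2 ** 2,       # non-excess form: 3 for a normal distribution
    }

# Hypothetical elevations (NOT the study data): mostly moderate values, a few high ones.
sample = [110, 115, 120, 125, 130, 135, 140, 200, 250, 280]
stats = describe(sample)
print(stats["mean"], stats["median"], stats["skewness"])
```

In this made-up sample the mean lies above the median and the skewness is positive, the same pattern as in Table 1.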
The sequence of figures below helps to see where these values are located. Visual analysis makes it clear that the low values are mainly located in the northern part of the study area, the values that occur most often are mainly located in the central area, and the higher values are mainly located in the southern part. Selecting the very low value shows that it is a single point located between the low values and the values that occur most often. It may be considered an outlier.

Figure 4: location of the values that occur most often. Figure 5: location of the higher values.
Figure 6: location of the lower values. Figure 7: location of the lowest point value.

• Normal probability graph: helps to see how close the distribution is to a Gaussian.

The graph below shows that the elevation data are close to Gaussian behavior, but not exactly Gaussian. The points roughly follow the normal line, but they lie above it for the low values, sag below it (a "down belly") for the values close to the mean, and lie above it again for the high values.
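A normal probability graph of this kind pairs each sorted data value with the standard normal quantile of its rank. The following is a minimal sketch, not the ArcGIS implementation: the function name `normal_qq_points` is hypothetical, and the plotting position (i - 0.5)/n is an assumption, since several conventions exist.

```python
from statistics import NormalDist

def normal_qq_points(values):
    """Pair each sorted value with the standard normal quantile of its rank."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for i, value in enumerate(ordered, start=1):
        p = (i - 0.5) / n      # plotting position (assumed convention)
        points.append((std_normal.inv_cdf(p), value))
    return points

# Hypothetical values, only to show the shape of the output.
for z, value in normal_qq_points([5, 1, 3, 2, 4]):
    print(round(z, 3), value)
```

If the data were exactly Gaussian, the resulting points would fall on a straight line. Systematic departures above or below that line are what the paragraph above describes.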
Figure 8: the normal probability graph (normal QQ plot; x-axis: standard normal value; y-axis: data quantile × 10⁻¹; transformation: none; data source layer: sg_topo_pontos_cotados_Clip selection, attribute: CODIGO).

3. Experimental omnidirectional semivariogram

The semivariogram makes it possible to examine the spatial autocorrelation between the measured sample points and to define the parameters needed to estimate values at unsampled locations. The principle of spatial autocorrelation states that pairs of locations that are close in distance should also be close in value.

In the ArcGIS semivariogram cloud, each red dot represents a pair of locations.

Figure 9: ArcGIS semivariogram cloud in the 90° direction.

To improve the semivariogram evaluation and actually be able to see the "curve", the semivariogram was also computed in SPRING.

The SPRING semivariogram results clearly showed that the data contain a trend, because no stabilized behavior was found. In the usual behavior, as the distance between the point pairs increases, the semivariogram values also increase until they stabilize and reach a plateau (the sill). In the elevation data the semivariogram values increased continually without reaching a defined sill, indicating the presence of a trend in the data.

Figure 10: SPRING semivariogram in the 90° direction.

4. Removing the trend from the data

In this step a program was used to remove the trend from the data, and the semivariogram was computed again. The results are now very satisfactory: the sill is approximately equal to the global variance of the detrended sample (174); the range is noticeably smaller than in the previous results; and the nugget effect is very close to zero.

The sill, range and nugget effect are the important parameters that a semivariogram shows. The sill represents the variability in the absence of spatial dependency; in the typical behavior of a fitted semivariogram, the sill is approximately equal to the global variance. The range is the separation between point pairs at which the sill is reached, in other words, the distance beyond which there is no evidence of spatial dependency. The nugget effect is the semivariogram value as the separation distance approaches zero.

Figure 11: semivariogram and numeric result for the detrended points.
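The experimental semivariogram discussed in sections 3 and 4 is half the average squared difference between the values of all point pairs whose separation falls in a given lag interval. The sketch below is a generic illustration, not the ArcGIS or SPRING implementation. The function name, the binning rule and the four sample points are assumptions made for the example.

```python
import math

def experimental_semivariogram(points, lag_width, n_lags):
    """Omnidirectional experimental semivariogram.

    points: list of (x, y, z) tuples.
    Returns (mean_distance, gamma, n_pairs) for every non-empty lag bin
    [k * lag_width, (k + 1) * lag_width), k = 0 .. n_lags - 1.
    """
    dist_sum = [0.0] * n_lags
    sq_sum = [0.0] * n_lags
    count = [0] * n_lags
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            d = math.hypot(xi - xj, yi - yj)
            k = int(d // lag_width)
            if k < n_lags:
                dist_sum[k] += d
                sq_sum[k] += (zi - zj) ** 2
                count[k] += 1
    # gamma(h) = sum of squared differences / (2 * number of pairs)
    return [
        (dist_sum[k] / count[k], sq_sum[k] / (2 * count[k]), count[k])
        for k in range(n_lags) if count[k] > 0
    ]

# Hypothetical transect whose value is a pure linear trend (z = x).
trend_points = [(float(x), 0.0, float(x)) for x in range(4)]
result = experimental_semivariogram(trend_points, lag_width=1.0, n_lags=4)
print(result)
```

For this pure-trend example the semivariance keeps growing with distance (0.5, 2.0, 4.5) and never levels off, which is the behavior the article attributes to a trend in the data.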
5. Theoretical semivariograms

Modeled semivariograms are mathematical models that represent the experimental semivariograms. The goal is to find the best fit to the experimental semivariogram (the lowest fitting error). Comparing the spherical and Gaussian models, the range, sill and contribution are very similar, but according to the Akaike criterion the Gaussian model fits slightly better.

Figure 12: spherical modeled semivariogram. Figure 13: Gaussian modeled semivariogram.
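The two models compared above have simple closed forms. The sketch below writes them out and compares their fit with an Akaike-type score. It rests on several assumptions: the Gaussian model uses the "practical range" convention (factor 3 in the exponent), the score is the common least-squares form n·ln(RSS/n) + 2k, and the experimental points are hypothetical. SPRING's own formulas may differ in detail.

```python
import math

def spherical(h, nugget, contribution, a):
    """Spherical model: reaches nugget + contribution exactly at range a."""
    if h <= 0:
        return 0.0
    if h >= a:
        return nugget + contribution
    r = h / a
    return nugget + contribution * (1.5 * r - 0.5 * r ** 3)

def gaussian(h, nugget, contribution, a):
    """Gaussian model, practical-range convention (assumed)."""
    if h <= 0:
        return 0.0
    return nugget + contribution * (1.0 - math.exp(-3.0 * (h / a) ** 2))

def aic_least_squares(observed, predicted, n_params):
    """Akaike-type score for a least-squares fit; lower is better.
    Undefined for a perfect fit (RSS = 0)."""
    n = len(observed)
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return n * math.log(rss / n) + 2 * n_params

# Hypothetical experimental points (lag distance, semivariance), NOT the study data.
lags = [1.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
observed = [0.16, 0.30, 0.58, 0.80, 0.94, 1.00, 1.00]
sph_fit = [spherical(h, 0.0, 1.0, 10.0) for h in lags]
gau_fit = [gaussian(h, 0.0, 1.0, 10.0) for h in lags]
aic_sph = aic_least_squares(observed, sph_fit, n_params=3)
aic_gau = aic_least_squares(observed, gau_fit, n_params=3)
print(aic_sph, aic_gau)
```

The made-up points were chosen to lie near a spherical curve, so the spherical model scores lower here. With the real experimental semivariogram the ranking can differ, as it did in the article.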
6. Kriging

With the parameters determined in SPRING, the work returns to ArcGIS. The surface plot suggests that the data can be treated as anisotropic, since they behave differently in different directions: the data values are more continuous along certain directions than along others. ArcGIS offers to calculate the anisotropy automatically.

The first step is to define which interpolation method will be used. In this case the ordinary kriging stochastic interpolator was used. Stochastic interpolators use weighted averages and probability models to make predictions at points where there are no sample values. Ordinary kriging assumes that the mean is a constant, which means that there is no trend in the data. So, similarly to the trend removal done earlier, a second-order polynomial trend removal was selected in ArcGIS. The input parameters were those obtained in SPRING from the modeled semivariogram for the detrended points: number of lags: 9, nugget effect: 43, sill value: 131, model: spherical, range: 2,665.

Figure 14: ordinary kriging with anisotropy. Figure 15: ordinary kriging with isotropy.
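To make the weighted-average idea concrete, here is a minimal sketch of an ordinary kriging prediction at a single location. It is not the ArcGIS procedure: it is isotropic, uses every sample as a neighbor and does no trend removal. The sample coordinates and values are hypothetical. The semivariogram parameters are the ones listed above, with the value 131 read as the contribution (partial sill), an assumption suggested by 43 + 131 = 174, the variance reported for the detrended sample.

```python
import math

def spherical(h, nugget, contribution, a):
    """Spherical semivariogram model."""
    if h <= 0:
        return 0.0
    if h >= a:
        return nugget + contribution
    r = h / a
    return nugget + contribution * (1.5 * r - 0.5 * r ** 3)

def solve(matrix, rhs):
    """Solve a dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def ordinary_kriging(samples, target, model):
    """samples: list of (x, y, z); target: (x, y); model: h -> gamma(h).
    Returns (prediction, kriging_variance, weights)."""
    n = len(samples)
    # Semivariances between samples, bordered by the unbiasedness constraint.
    matrix = [[model(math.hypot(xi - xj, yi - yj)) for xj, yj, _ in samples] + [1.0]
              for xi, yi, _ in samples]
    matrix.append([1.0] * n + [0.0])
    rhs = [model(math.hypot(x - target[0], y - target[1])) for x, y, _ in samples] + [1.0]
    solution = solve(matrix, rhs)
    weights, mu = solution[:n], solution[n]   # mu is the Lagrange multiplier
    prediction = sum(w * z for w, (_, _, z) in zip(weights, samples))
    variance = sum(w * g for w, g in zip(weights, rhs[:n])) + mu
    return prediction, variance, weights

def model(h):
    # Parameters from the article; 131 is assumed to be the contribution.
    return spherical(h, 43.0, 131.0, 2665.0)

# Hypothetical samples at the corners of a 1000 x 1000 square (NOT the study data).
samples = [(0.0, 0.0, 120.0), (1000.0, 0.0, 140.0),
           (0.0, 1000.0, 130.0), (1000.0, 1000.0, 150.0)]
prediction, variance, weights = ordinary_kriging(samples, (500.0, 500.0), model)
print(prediction, variance, weights)
```

Because the target sits at the center of the square, the four weights come out equal and the prediction is simply the average of the four values. The weights always sum to one, which is what the extra row and column of ones in the system enforce.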
7. Cross validation

In the cross validation procedure one datum is removed and the rest of the data are used to predict the removed datum.

The cross validation graphs below show that the result is very satisfactory: almost all the points follow the line, and the root mean squared (RMS) error is about 16, which represents an error of only about 5% relative to the range of the data. Comparing anisotropy and isotropy, the RMS error is slightly lower for the isotropic model.

Figure 16: cross validation - anisotropy. Figure 17: cross validation - isotropy.
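The leave-one-out procedure described above is independent of the interpolator. The sketch below shows the loop and the RMS error calculation. To stay short it uses inverse-distance weighting as a stand-in predictor, which is not the ordinary kriging used in the article, and the sample points and function names are hypothetical.

```python
import math

def idw_predict(samples, target, power=2.0):
    """Inverse-distance-weighted prediction; a stand-in for the kriging predictor."""
    num = den = 0.0
    for x, y, z in samples:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0:
            return z
        w = 1.0 / d ** power
        num += w * z
        den += w
    return num / den

def leave_one_out(samples, predict):
    """Return (errors, rms), where errors[i] = predicted - measured for sample i."""
    errors = []
    for i, (x, y, z) in enumerate(samples):
        others = samples[:i] + samples[i + 1:]   # every sample except the i-th
        errors.append(predict(others, (x, y)) - z)
    rms = math.sqrt(sum(e * e for e in errors) / len(errors))
    return errors, rms

# Hypothetical samples along a line (NOT the study data).
samples = [(0.0, 0.0, 0.0), (1.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
errors, rms = leave_one_out(samples, idw_predict)
print(errors, rms)
```

Any function with the same `(samples, target)` signature can be passed as `predict`, so the same loop could wrap a kriging predictor to compare isotropic and anisotropic models by their RMS error.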
Figure 18: cross validation comparison: anisotropy on the left and isotropy on the right.

8. Graph result

The graph result shows the ordinary kriging map of the elevation data for the seven census sectors of the São Gabriel municipality. Kriging works under the hypothesis of minimizing the error variance and creates a smooth model that can filter out some details of the original surface. The difference between isotropy and anisotropy is quite small, but the anisotropic result is still preferable because it takes the differences between directions into account.

Figure 19: anisotropy. Figure 20: isotropy.
9. Conclusions

Different software packages offer different possibilities for data prediction using geostatistical analysis. It is really important to know the theory beforehand in order to accept or reject the automatic parameters proposed by the software, and to enter the ones that best fit the expected results.

An important point to keep in mind is to check whether there is a trend in the data. If there is, it is very useful to remove it in order to improve the semivariogram and the resulting parameters that are fed into the modeled semivariogram and the interpolation procedure. It is also important to check whether the behavior is isotropic or anisotropic. If there is anisotropy, the information from the two directions needs to be combined in one model.

Finally, the interactive work of generating reliable variograms really requires prior knowledge of geostatistical concepts. Without it, good results are quite difficult to obtain.

10. References

Isaaks, E. H. and Srivastava, R. M., 1989, An Introduction to Applied Geostatistics, New York: Oxford University Press.

ESRI Training and Education, 2007, Introduction to ArcGIS 9 Geostatistical Analyst. Available online (last accessed 11 November 2007).

GPS Global, 2007, Artigos geostatística. Available online (last accessed 11 November 2007).