Data Usability Assessment for Remote Sensing Data: Accuracy of Interactive Data Quality Interpretation


Erik Borg, Bernd Fichtelmann - German Aerospace Center, German Remote Sensing Data Center
Hartmut Asche - Department of Geography, University of Potsdam

Published in: Technology, Business
  • The increased availability of remote sensing data is accompanied by an increasing demand for additional information. Metadata facilitate the search for appropriate data within the remote sensing databases of a variety of providers (e.g. European Space Agency, ESA). Typical metadata are geographical coordinates, acquisition time, calibration parameters, and data quality.
  • The ground stations Kiruna (Sweden), Fucino (Italy), and Neustrelitz (Germany) constitute the ESA Earthnet for the LANDSAT 7 / ETM+ satellite. The stations are compatibly equipped. The national ground segment for LANDSAT 7 in Neustrelitz is described in detail as representative of all stations of the ESA network. Cloud cover degree is an accepted indicator for quantifying data quality, but this simple criterion is inadequate as a quality measure: the quality of optical remote sensing data depends on both the number of cloud pixels and their distribution within a scene. Taking this into account, data quality assessment within the ESA LANDSAT 7 ground segment was performed interactively by interpreters. Such data quality assessment is labour-intensive and subjective.
  • The investigations presented are based on the evaluation of 2,957 quick-look data sets (11,828 quadrants) and the corresponding metadata for the period from 13.03.2002 to 15.01.2003. The ground resolution of quick-look data is usually strongly reduced relative to that of the original data. The figure shows the spatial distribution of the LANDSAT 7 / ETM+ database. The colour code in the figure represents the number of available scenes for the marked geographical location of the scene (path/row). Definitions: quick-look data are preview images derived from original remote sensing data; quadrants are the evaluation units of a remote sensing data scene.
  • The figure illustrates the influence of cloud distribution on data usability for a cloud cover degree of 25 percent. Top: concentrated clouds. Bottom: distributed clouds.
  • Sample quick-look product for interactive visual data usability assessment by interpreters (the quadrants are numbered 1 to 4).
  • A stratified sample was collected from the population of 11,828 quadrants. A stratified sample captures the characteristic behaviour when the population exhibits strong differences. A representative sample size is necessary to characterise the population; the minimum sample size calculated by eq. 1 is n ≥ 384 quadrants. The test was performed by 6 interpreters in 3 test series (400 quadrants), giving 18 usability values for each quadrant. Their arithmetic average is denoted the mean data usability (see eq. 2). Assuming that the mean data usability delivers an objective assessment, the subjectivity of an individual interpreter can be quantified: the absolute error δi between the mean data usability and the assessment mi by an individual interpreter is computed with eq. 3. Besides the maximum of the absolute error δi, its distribution over all data usability classes is of particular interest. The variability over all 18 assessment results can be described by the standard deviation (eq. 4).
  • Interactive visual evaluation of data quality performed by interpreters is subjectively biased, so the data usability assessment provided by a single interpreter has to be compared with the mean data usability. The frequency distribution and the cumulative frequency curve of the modulus |δi| are plotted. Top: mean data usability vs. single interpretation. The distribution shows |δi| = 0 for ≈ 60 % of all quadrants, |δi| ≤ 10 for ≈ 90 %, and |δi| ≤ 20 for ≈ 98 %. Bottom: mean data usability vs. metadata assessment. The distribution shows |δi| = 0 for ≈ 50 % of all quadrants, |δi| ≤ 10 for ≈ 79 %, and |δi| ≤ 20 for ≈ 90 %. The results demonstrate the level of subjectivity: for 50 % of all quadrants there is a difference of at least +/- one class.
  • The figure shows a diagram in which the data usability mi is plotted against the standard deviation. In the intervals [0; 10] and [80; 90] a small standard deviation is computed; in the interval [20; 70] the standard deviation is considerably higher. Each point of the diagram can represent several assessments, so the trend is drawn as a 3rd-degree polynomial for better orientation. In the interval [20; 70] the average standard deviation is in the order of 10 to 15, which corresponds to a difference in data usability assessment of approximately one or two classes. This is of particular interest, since evaluation differences of this order of magnitude are contained in the metadata.
  • Scatter diagram: the interpreter assessment extracted from the metadata is plotted against the mean data usability derived from the 18 data usability values for each quadrant. The graph shows the relation between both data sets. Each point in the scatter diagram can represent more than one evaluation. A dashed 1:1 line has been plotted into the diagram for better orientation. The regression line is relatively close to the 1:1 line, but the coefficient of determination of the linear regression amounts to only 0.8553. The data usability in the metadata is underestimated in the interval [0, 50] and overestimated in the interval [70, 90].
  • The comparison of mean data usability and data usability in the metadata is shown in the figure. In this context it is of special interest to quantify the differences between the single interpreter assessments. The average absolute error of the assessment results and its standard deviation were computed for each single interpreter (3 series; 400 quadrants). The diagram supports the following statements: the average deviation of all interpreter results varies in the interval [-7; +6] and the standard deviation varies in the interval [+6; +12]; the results of a single interpreter are relatively concentrated in a small evaluation interval; the maximum range of the average deviation for a single interpreter is [-4.5; -2.5] and the maximum range of the standard deviation is [9.9; 11.9].
  • Considering the results above, a harmonisation function could be developed for adjusting the interpretation results of a single interpreter. Problem: the metadata contain no information identifying the interpreter who did the data quality assessment. Without this information there is no possibility to minimise the subjective influence of the individual interpreter on the data usability provided in the metadata. Alternatively, interactive visual interpretation can be supported by an automated support system. Samples of modified quick-look data are shown here.
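The quantities described in the notes above (minimum sample size, mean data usability, absolute error, and standard deviation) can be sketched in Python. This is only a minimal sketch under stated assumptions: eq. 1 to 4 are not reproduced in this transcript, so the sample-size formula used here is the common large-population formula n = z²·P(1−P)/ε², which yields roughly the 384 quoted in the notes. The sign convention of δi (mean minus individual assessment), the use of the sample standard deviation, the usability classes in steps of 10, and all function names and example votes are hypothetical.

```python
import statistics


def min_sample_size(z=1.96, p=0.5, error=0.05):
    """Large-population sample size for a proportion: z^2 * P(1-P) / error^2.

    Assumed form of eq. 1; P(1-P) reaches its maximum of 0.25 at P = 0.5.
    """
    return z ** 2 * p * (1.0 - p) / error ** 2


def mean_data_usability(votes):
    """Arithmetic mean of all usability votes for one quadrant (eq. 2)."""
    return statistics.fmean(votes)


def absolute_errors(votes):
    """Mean minus each single vote (assumed sign convention for eq. 3)."""
    mean = mean_data_usability(votes)
    return [mean - m for m in votes]


def usability_std(votes):
    """Sample standard deviation over all votes of one quadrant (eq. 4, n-1 assumed)."""
    return statistics.stdev(votes)


# Hypothetical quadrant: 6 interpreters x 3 series = 18 votes,
# usability classes assumed to run 0, 10, ..., 90.
votes = [40, 50, 40, 40, 30, 40, 50, 40, 40, 40, 40, 30, 40, 50, 40, 40, 40, 30]

n_min = min_sample_size()
mean_du = mean_data_usability(votes)
errors = absolute_errors(votes)
sd_du = usability_std(votes)

print(f"minimum sample size: {n_min:.2f}")
print(f"mean data usability: {mean_du:.1f}")
print(f"largest |delta_i|:   {max(abs(e) for e in errors):.1f}")
print(f"standard deviation:  {sd_du:.2f}")
```

With the defaults, the sample-size function returns a value just above 384, which is consistent with the 400 quadrants actually evaluated. If the original eq. 1 includes a finite-population correction for the 11,828 quadrants, the required minimum would be somewhat smaller.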

    1. Data usability assessment for remote sensing data: Accuracy of interactive data quality interpretation. Erik Borg, Bernd Fichtelmann, Hartmut Asche. DFD | German Aerospace Centre | Neustrelitz. Dept of Geography | University of Potsdam | Germany. ICCSA 2011 | GEOG-AN-MOD 2011 | University of Santander | 20-23/06/2011
    2. Motivation. Increasing availability of remote sensing (RS) data is complicating orientation in RS databases. To facilitate orientation in RS databases, data providers make available additional data information, such as geographic location, acquisition time, and data quality. To assess RS data quality, cloud cover degree is a frequently used quality parameter, but it records data quality insufficiently. ESA has defined a new quality measure, data usability, which is interactively interpreted by operators based on: technical data errors (lost image lines/segments, scan mirror anomalies) and unusable image segments (clouds, haze, shadow, and their distribution within a scene).
    3. Search in databases | Data acquisition. DESCW retrieval system (Display Earth Remote Sensing Swath Coverage). Data selection criteria of LANDSAT 7 / ETM+ data include: Mission, Orbit, Track, Frame, Data, Time, Data quality, Station, Coordinates. Metadata excerpt: .. .. SCENE_LL_CORNER_LON = 25.4800; SCENE_LR_CORNER_LAT = 43.6500; SCENE_LR_CORNER_LON = 27.8100; HORIZONTAL_DISPLAY_SHIFT = 0; SCENE_CCA = 50; UL_QUAD_CCA = 60; UR_QUAD_CCA = 60; LL_QUAD_CCA = 30; LR_QUAD_CCA = 50; SUN_AZIMUTH_ANGLE = 138.5; SUN_ELEVATION_ANGLE = 59.8. Legend: CCA = Cloud Cover Assessment; QUAD = quadrant; UL = upper left; UR = upper right; LL = lower left; LR = lower right.
    4. Processing chain: National ground segment Landsat 7. Objectives: decoding and data synchronisation; de-commutation; production of browse data; status information; storage of raw data. [Figure: schematic representation of the Landsat ground segment (modified from Beruti 2002), showing two demodulators (Alcatel), the front-end handler, two SGI Origin 200 ingestion servers with disk arrays and DLT 7000 drives, the quality control, monitor & control and catalogue workstations (SGI O2), and the station data server, connected via Fast Ethernet. Marked in red: interactive data usability estimation.]
    5. Derivation of test data. Receiving circle of the Neustrelitz ground station. Spatial distribution of the Landsat 7/ETM+ database.
    6. Definitions of data quality. Assessment of cloud cover degree: detection and identification of cloud-covered pixels; ratio of the number of cloud-covered pixels to the total number of pixels of the assessment unit. Data usability assessment: unusable image segments (detection of clouds and cloud shadows, their distribution and configuration within the scene); technically induced image errors (lost lines and sectors, scan mirror anomalies). Figure legend: Cloud | No cloud | 25 % cloudiness.
    7. Quick-look data for quality assessment. Quality assessment by interpreters; 4 quadrants; real-colour coded data.
    8. Equations: Assessment of evaluation subjectivity. Legend: acceptable error = 0.05 (significance level of 5 %); z-value of the standard normal distribution = 1.96; P, population = 11,828 quadrants; P(1-P), maximum value = 0.25; result n ≥ 384, sample used: 400 quadrants; δi = absolute error; SD_DU = standard deviation.
    9. Equations: Mean error vs single interpreter | metadata. Modulus of absolute error of mean data usability with regard to single interpreter assessment. Modulus of absolute error of mean data usability with regard to metadata assessment.
    10. Mean data usability vs standard deviation. Interpreter assessment plotted vs. data usability class standard deviation. Dashed line: trend.
    11. Mean data usability vs data usability within metadata. Comparison of mean data usability and data usability (metadata) for 400 quadrants. Dotted line is the 1:1 line, solid line is the regression line.
    12. Single interpretation assessment vs standard deviation. Evaluation by interpreters plotted against standard deviation. Results of each individual interpreter are marked by a separate symbol and colour.
    13. Operational aspects. Extended quick-look product with cloud mask, operator vote and automaton vote.
    14. Operational aspects. Extended quick-look product for monitoring the actual processing status.
    15. Thank you for your attention! Questions, comments, feedback?
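The metadata excerpt on slide 3 and the cloud cover definition on slide 6 can be illustrated with a short Python sketch. It parses `KEY = value` pairs like those in the excerpt and computes the cloud cover degree as the ratio of cloud pixels to all pixels of an assessment unit. The parser, the function names, and the two tiny cloud masks are hypothetical illustrations and not the ground segment's actual software; the sketch assumes that metadata lines follow the plain `KEY = value` form shown in the excerpt.

```python
def parse_metadata(text):
    """Parse 'KEY = value' lines into a dict, converting numeric values to float."""
    record = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank or malformed lines
        key, value = (part.strip() for part in line.split("=", 1))
        try:
            record[key] = float(value)
        except ValueError:
            record[key] = value  # keep non-numeric values as text
    return record


def cloud_cover_degree(mask):
    """Percentage of cloud pixels (truthy cells) in a 2-D mask."""
    total = sum(len(row) for row in mask)
    cloudy = sum(1 for row in mask for cell in row if cell)
    return 100.0 * cloudy / total


# Cloud cover values copied from the metadata excerpt on slide 3.
excerpt = """\
SCENE_CCA = 50
UL_QUAD_CCA = 60
UR_QUAD_CCA = 60
LL_QUAD_CCA = 30
LR_QUAD_CCA = 50
"""
meta = parse_metadata(excerpt)
quad_keys = ["UL_QUAD_CCA", "UR_QUAD_CCA", "LL_QUAD_CCA", "LR_QUAD_CCA"]
quad_mean = sum(meta[k] for k in quad_keys) / len(quad_keys)

# Two hypothetical 4x4 masks with identical cloud cover but different layout.
concentrated = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
distributed = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]

print("quadrant mean vs scene CCA:", quad_mean, meta["SCENE_CCA"])
print("concentrated:", cloud_cover_degree(concentrated))
print("distributed: ", cloud_cover_degree(distributed))
```

Both masks give the same cloud cover degree although the clouds are arranged differently. This is the limitation of the cloud cover measure that motivates the data usability measure, which also considers the distribution of clouds within the scene.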