Trust as a Proxy Measure for the Quality of VGI in the Case of OSM

Paper presented at AGILE 2013 in Leuven, Belgium. The paper is available from http://carsten.io/kessler-de_groot-agile-2013.pdf

Published in: Technology, Education

  1. 1. Carsten Keßler (a, b) and René de Groot (a)
     (a) Institute for Geoinformatics, University of Münster | (b) soon: Hunter College, CUNY
     http://carsten.io | @carstenkessler
     Trust as a Proxy Measure for the Quality of VGI in the Case of OSM
  4. 4. The Idea
     ‣ Develop a measure to assess the degree to which a data consumer can trust the quality of a feature
     ‣ Trust measure is based on a feature’s editing history
     ‣ Benefits
       ‣ Works at feature level
       ‣ Filter features by quality
       ‣ Spot problematic features
  8. 8. Does this work?
     Can we reliably assess the quality of a feature in OpenStreetMap based on its editing history?
     v1: amenity = university; name = Institute for Geoinformatics
     v2: amenity = university; building = yes; name = Institute for Geoinformatics
     v3: addr:city = Münster; addr:country = DE; addr:housenumber = 253; addr:street = Weseler Straße; building = yes; wheelchair = limited
     …
  9. 9. OSM Heatmap Kudos: Johannes Trame
  10. 10. OSM Provenance Ontology
      http://carsten.io/osm/osm-provenance.rdf
      [Diagram; labels shown: Tag, includesEdit, Changeset, prv:CreationGuideline, Edit, prv:createdBy, prv:precededBy, prv:usedData, NodeState, WayState, prv:DataCreation, User, prv:performedBy, changesGeometry, addsTag, removesTag, changesValueOfKey, rdfs:Literal, prv:DataItem, prv:HumanActor, subClassOf, hasTag, FeatureState]
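The ontology labels addsTag, removesTag and changesValueOfKey suggest that each edit can be described by comparing the tags of two consecutive versions. The following is a minimal sketch of that comparison, not the authors' implementation. The function name `diff_tags` and the dictionary layout are assumptions made for illustration; the tag sets are v1 and v2 from the example slide.

```python
def diff_tags(old, new):
    """Classify tag changes between two consecutive versions of a feature.

    `old` and `new` are plain dicts mapping tag keys to values
    (a hypothetical representation, not an OSM API structure).
    """
    return {
        "addsTag": sorted(k for k in new if k not in old),
        "removesTag": sorted(k for k in old if k not in new),
        "changesValueOfKey": sorted(
            k for k in new if k in old and old[k] != new[k]
        ),
    }


# v1 and v2 as shown on the example slide
v1 = {"amenity": "university", "name": "Institute for Geoinformatics"}
v2 = {
    "amenity": "university",
    "building": "yes",
    "name": "Institute for Geoinformatics",
}

changes = diff_tags(v1, v2)
print(changes)
```

Geometry changes (changesGeometry) would need coordinates for each version and are left out of this sketch.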
  11. 11. Does this work?
      ‣ Get a first idea whether this is a viable approach
      ‣ Compare results of
        ‣ a simple trust measure and
        ‣ observed feature quality
      ‣ Is there a correlation between the two?
  12. 12. Study area: Münster’s old town
  17. 17. Feature Selection
      ‣ Re-mapping the whole district was not feasible
      ‣ Up to 100 features were manageable
      ‣ Selection based on minimum number of versions
      ‣ 74 features with 6+ versions
  18. 18. 74 features selected
  21. 21. Trust measure
      ‣ Positive factors:
        ‣ Versions
        ‣ Users
        ‣ Indirect confirmations = edits in the direct vicinity (50 m)
      ‣ Negative factors:
        ‣ Tag corrections
        ‣ Rollbacks
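As a rough illustration of how the three positive factors could be counted, the sketch below works on a hypothetical in-memory history. The record layout, the function names and the coordinates are all made up for this example, and the great-circle (haversine) distance is only one possible reading of "direct vicinity (50 m)". The negative factors are not covered, since the slides do not say how tag corrections and rollbacks are detected.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def positive_factors(history, feature_pos, other_edits, radius_m=50.0):
    """Count versions, distinct users and nearby edits for one feature.

    history:     list of dicts with a "user" key, one per version (assumed layout)
    feature_pos: (lat, lon) of the feature
    other_edits: list of (lat, lon) positions of edits to other features
    """
    confirmations = sum(
        1
        for lat, lon in other_edits
        if haversine_m(*feature_pos, lat, lon) <= radius_m
    )
    return {
        "versions": len(history),
        "users": len({v["user"] for v in history}),
        "indirect_confirmations": confirmations,
    }


# Made-up example data, not taken from the study
history = [{"user": "alice"}, {"user": "bob"}, {"user": "alice"}]
feature_pos = (51.9690, 7.5958)
other_edits = [
    (51.9691, 7.5958),  # about 11 m north
    (51.9690, 7.5962),  # about 27 m east
    (51.9700, 7.5958),  # about 111 m north, outside the radius
]

factors = positive_factors(history, feature_pos, other_edits)
print(factors)
```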
  22. 22. Trust measure (contd.)
      ‣ Classification for each factor: 5 equal classes
      ‣ Combined into one classification
      ‣ Equal weights
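The classification and combination step could look roughly like the sketch below. Several points are assumptions, because the slides do not spell them out: "5 equal classes" is read as equal-interval classes over each factor's observed range, negative factors are inverted so that class 5 always means "most trustworthy", and the equal-weight combination is taken to be the plain mean of the factor classes. All names and numbers are illustrative only.

```python
def equal_interval_class(value, lo, hi, n_classes=5):
    """Map a value in [lo, hi] to a class 1..n_classes of equal width."""
    if hi == lo:
        return 1
    idx = int((value - lo) * n_classes / (hi - lo))
    # Clamp so that value == hi falls into the top class
    return max(0, min(idx, n_classes - 1)) + 1


def combined_trust(factors, ranges, negative=(), n_classes=5):
    """Equal-weight mean of the per-factor classes.

    Factors listed in `negative` are inverted (assumption), so a high
    count of corrections or rollbacks lowers the combined score.
    """
    classes = []
    for name, value in factors.items():
        lo, hi = ranges[name]
        c = equal_interval_class(value, lo, hi, n_classes)
        if name in negative:
            c = n_classes + 1 - c
        classes.append(c)
    return sum(classes) / len(classes)


# Made-up factor values and ranges, not taken from the study
factors = {
    "versions": 11,
    "users": 3,
    "indirect_confirmations": 30,
    "tag_corrections": 1,
    "rollbacks": 0,
}
ranges = {
    "versions": (6, 16),
    "users": (1, 8),
    "indirect_confirmations": (0, 40),
    "tag_corrections": (0, 4),
    "rollbacks": (0, 3),
}

score = combined_trust(factors, ranges, negative=("tag_corrections", "rollbacks"))
print(score)
```

How the mean is turned back into a single class (rounding, or a second classification step) is left open here.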
  23. 23. Trust measure
  25. 25. Field Survey
      ‣ Thematic accuracy, 4 classes, with results:
        1. Main tag wrong: 6 features (~8%)
        2. Other tags wrong: 2 features (~3%)
        3. Thematic ambiguities: 9 features (~12%)
        4. Thematically correct: 57 features (~77%)
  30. 30. Field Survey (contd.)
      ‣ Topological consistency
        ‣ Is the feature correctly positioned relative to the surrounding features?
        ‣ Results: 73 out of 74 features (~99%)
      ‣ Information completeness
        ‣ TF-IDF measure to identify relevant tags per main tag
        ‣ ~37% of tags missing (avg.)
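One way to read "TF-IDF measure to identify relevant tags per main tag" is to treat each main tag as a document and the other tag keys found on its features as terms. The sketch below follows that reading. It is an assumption about the setup, not the procedure from the paper, and the feature data, the function names and the relevance threshold are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def tfidf_per_main_tag(features):
    """Score tag keys per main tag.

    features: list of (main_tag, set_of_other_keys) pairs (assumed layout).
    tf  = share of features with this main tag that carry the key
    idf = log(number of main tags / number of main tags where the key occurs)
    """
    key_counts = defaultdict(Counter)
    group_sizes = Counter()
    for main_tag, keys in features:
        group_sizes[main_tag] += 1
        key_counts[main_tag].update(keys)

    doc_freq = Counter()
    for counts in key_counts.values():
        doc_freq.update(counts.keys())

    n_groups = len(group_sizes)
    return {
        main_tag: {
            key: (count / group_sizes[main_tag]) * math.log(n_groups / doc_freq[key])
            for key, count in counts.items()
        }
        for main_tag, counts in key_counts.items()
    }


def missing_share(keys, key_scores, threshold):
    """Share of relevant keys (score >= threshold) missing from a feature."""
    relevant = {k for k, s in key_scores.items() if s >= threshold}
    if not relevant:
        return 0.0
    return len(relevant - set(keys)) / len(relevant)


# Made-up features, not taken from the study
features = [
    ("amenity=university", {"name", "wheelchair", "addr:street"}),
    ("amenity=university", {"name", "wheelchair"}),
    ("amenity=restaurant", {"name", "cuisine"}),
    ("amenity=restaurant", {"name", "cuisine", "opening_hours"}),
]

scores = tfidf_per_main_tag(features)
share = missing_share({"name", "wheelchair"}, scores["amenity=university"], threshold=0.3)
print(scores["amenity=university"], share)
```

With this weighting, a key such as name that occurs under every main tag gets a score of zero, so it does not count as specific to any one main tag. Whether that is desirable depends on how completeness is meant to be judged.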
  31. 31. Observed quality: combined results
  32. 32. Trust measure
  33. 33. Mean quality class: ~4.2
      Mean trust class: ~2.8
  35. 35. Do we get the trend right?
      ‣ Removed outliers
      ‣ Kendall’s τ: 0.52
      ‣ Moderate but significant positive correlation
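Kendall’s τ compares every pair of features and checks whether trust class and quality class rank them in the same order. The sketch below computes the tau-b variant, which corrects for ties. The slides do not say which variant was used, so that choice is an assumption, and the class values are made up rather than taken from the study.

```python
import math


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equally long sequences of ranks or classes."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: excluded from both counts
            if dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denom = math.sqrt(
        (concordant + discordant + ties_x) * (concordant + discordant + ties_y)
    )
    return (concordant - discordant) / denom if denom else float("nan")


# Made-up class values for six features
trust = [1, 2, 2, 3, 4, 5]
quality = [2, 2, 3, 3, 5, 4]

tau = kendall_tau_b(trust, quality)
print(round(tau, 3))
```

Because both variables are coarse classes, ties are frequent, which is why a tie-corrected variant seems a reasonable default here. A significance test is not included in this sketch.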
  41. 41. Conclusions
      ‣ Initial study
      ‣ A feature’s history can determine its trustworthiness
      ‣ Trust values correlate with observed quality
      ‣ Even with a very simple model
      ‣ Outliers cannot be explained yet
  47. 47. Tons of Future Work
      ‣ Extend and refine the trust model: classification, weighting, positive vs. negative aspects, …
      ‣ Social aspects: Who has edited a feature?
      ‣ Repeat study without spatial focus
      ‣ How to scale the data collection?
      ‣ Learn the trust model from the data
  48. 48. Thank you!
      All data used in this research © OpenStreetMap contributors.
      Carsten Keßler | René de Groot
      carsten.kessler@uni-muenster.de | http://carsten.io | @carstenkessler
