Assessing Trust in OSM Features Using Edit History
1. Carsten Keßler a,b and René de Groot a
a Institute for Geoinformatics, University of Münster | b soon: Hunter College, CUNY
http://carsten.io | @carstenkessler
Trust as a Proxy Measure for the Quality of VGI in the Case of OSM
4. The Idea
‣ Develop a measure to assess the degree to which a data consumer can trust the quality of a feature
‣ Trust measure is based on a feature’s editing history
‣ Benefits
‣ Works at feature level
‣ Filter features by quality
‣ Spot problematic features
8. Does this work?
Can we reliably assess the quality of a feature in OpenStreetMap based on its editing history?
v1:
amenity = university
name = Institute for Geoinformatics
v2:
amenity = university
building = yes
name = Institute for Geoinformatics
v3:
addr:city = Münster
addr:country = DE
addr:housenumber = 253
addr:street = Weseler Straße
building = yes
wheelchair = limited
…
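The version history above can be read as a sequence of tag changes. The following is a minimal sketch of how two consecutive versions could be compared. The function name `diff_tags` and the dictionary representation of tags are assumptions made for illustration, not part of the study.

```python
from typing import Dict, List

Tags = Dict[str, str]


def diff_tags(old: Tags, new: Tags) -> Dict[str, List[str]]:
    """Classify tag keys as added, removed, or changed between two versions.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    return {
        "added": sorted(k for k in new if k not in old),
        "removed": sorted(k for k in old if k not in new),
        "changed": sorted(k for k in new if k in old and old[k] != new[k]),
    }


# Tags of v1 and v2 as shown on the slide.
v1 = {"amenity": "university", "name": "Institute for Geoinformatics"}
v2 = {
    "amenity": "university",
    "building": "yes",
    "name": "Institute for Geoinformatics",
}

print(diff_tags(v1, v2))
# → {'added': ['building'], 'removed': [], 'changed': []}
```

Per-version differences of this kind are one possible input to a history-based trust measure.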
11. Does this work?
‣ Get a first idea whether this is a viable approach
‣ Compare results of
‣ a simple trust measure and
‣ observed feature quality
‣ Is there a correlation between the two?
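One way to check for such a correlation is a correlation coefficient over per-feature trust and quality scores. The sketch below uses Pearson's coefficient as an assumption; the slides do not state which statistic was used. The input values are placeholders, not data from the study.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)


# Placeholder scores (NOT study data): one trust and one quality value per feature.
trust_scores = [1.0, 2.0, 3.0, 4.0]
quality_scores = [2.0, 4.0, 6.0, 8.0]
r = pearson(trust_scores, quality_scores)  # perfectly linear, so r is close to 1
```

For ordinal quality classes, a rank-based coefficient may be the better fit.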
17. Feature Selection
‣ Re-mapping the whole district was not feasible
‣ Up to 100 features were manageable
‣ Selection based on minimum number of versions
‣ 74 features with 6+ versions
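The selection step can be sketched as a simple filter on version counts. The feature identifiers and the `version_counts` mapping below are hypothetical; only the threshold of 6 versions comes from the slide.

```python
from typing import Dict, List


def select_by_min_versions(
    version_counts: Dict[str, int], min_versions: int = 6
) -> List[str]:
    """Return the IDs of features that have at least `min_versions` versions."""
    return sorted(
        fid for fid, count in version_counts.items() if count >= min_versions
    )


# Hypothetical feature IDs mapped to their number of versions.
version_counts = {"node/1": 2, "way/2": 6, "way/3": 11, "node/4": 5}

selected = select_by_min_versions(version_counts)
print(selected)
# → ['way/2', 'way/3']
```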
25. Field Survey
‣ Thematic accuracy, 4 classes with results:
1. Main tag wrong: 6 features (~8%)
2. Other tags wrong: 2 features (~3%)
3. Thematic ambiguities: 9 features (~12%)
4. Thematically correct: 57 features (~77%)
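The percentages follow from the 74 surveyed features. The short check below assumes that the four results are listed in the same order as the four classes.

```python
# Counts per class, assuming the results are listed in class order.
counts = {
    "Main tag wrong": 6,
    "Other tags wrong": 2,
    "Thematic ambiguities": 9,
    "Thematically correct": 57,
}

total = sum(counts.values())  # 74 surveyed features

# Share of each class in percent, rounded to whole numbers as on the slide.
shares = {label: round(100 * n / total) for label, n in counts.items()}

for label, pct in shares.items():
    print(f"{label}: {counts[label]} of {total} (~{pct}%)")
```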
30. Field Survey (contd.)
‣ Topological consistency
‣ Is the feature correctly positioned relative to the surrounding features?
‣ Results:
‣ 73 out of 74 features (~99%)
‣ Information completeness
‣ TF-IDF measure to identify relevant tags per main tag
‣ ~37% of tags missing (avg.)
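The slides do not spell out how TF-IDF was applied. The sketch below shows one plausible reading, under these assumptions: each main tag is treated as a "document", the tag keys found on its features are the "terms", term frequency is a key's share of all key occurrences under that main tag, and inverse document frequency is the log of the number of main tags divided by the number of main tags using the key. All names and sample features are hypothetical.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def tfidf_per_main_tag(
    features: Iterable[Tuple[str, List[str]]]
) -> Dict[str, Dict[str, float]]:
    """Score tag keys per main tag; higher scores mark more specific keys."""
    # For each main tag, count on how many of its features each key occurs.
    key_counts: Dict[str, Counter] = defaultdict(Counter)
    for main_tag, keys in features:
        key_counts[main_tag].update(set(keys))

    # Document frequency: number of main tags under which a key occurs.
    n_docs = len(key_counts)
    doc_freq: Counter = Counter()
    for counts in key_counts.values():
        doc_freq.update(counts.keys())

    scores: Dict[str, Dict[str, float]] = {}
    for main_tag, counts in key_counts.items():
        total = sum(counts.values())
        scores[main_tag] = {
            key: (n / total) * math.log(n_docs / doc_freq[key])
            for key, n in counts.items()
        }
    return scores


# Hypothetical sample: (main tag, other tag keys on the feature).
features = [
    ("amenity=university", ["name", "wheelchair"]),
    ("amenity=university", ["name"]),
    ("highway=residential", ["name", "maxspeed"]),
]
scores = tfidf_per_main_tag(features)
```

In this sample, `name` occurs under every main tag and scores 0, while `wheelchair` and `maxspeed` score above 0 because each is specific to one main tag.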
42. Conclusions
‣ Initial study
‣ A feature’s history can determine its trustworthiness
‣ Trust values correlate with observed quality
‣ Even with a very simple model
‣ Outliers cannot be explained yet
48. Tons of Future Work
‣ Extend and refine the trust model:
Classification, weighting, positive vs negative aspects, …
‣ Social aspects: Who has edited a feature?
‣ Repeat study without spatial focus
‣ How to scale the data collection?
‣ Learn the trust model from the data