Carsten Keßler a,b and René de Groot a
a Institute for Geoinformatics, University of Münster | b soon: Hunter College, CUNY
http://carsten.io | @carstenkessler
Trust as a Proxy Measure for the Quality of VGI in the Case of OSM
The Idea
‣ Develop a measure to assess the degree to which a data consumer can trust the quality of a feature
‣ Trust measure is based on a feature’s editing history
‣ Benefits
‣ Works at feature level
‣ Filter features by quality
‣ Spot problematic features
Does this work?
Can we reliably assess the quality of a feature in OpenStreetMap based on its editing history?
v1:
amenity = university
name = Institute for Geoinformatics
v2:
amenity = university
building = yes
name = Institute for Geoinformatics
v3:
addr:city = Münster
addr:country = DE
addr:housenumber = 253
addr:street = Weseler Straße
building = yes
wheelchair = limited
…
OSM Heatmap Kudos: Johannes Trame
OSM Provenance Ontology
http://carsten.io/osm/osm-provenance.rdf
(Ontology diagram; only the node and edge labels are recoverable)
‣ Classes: Changeset, Edit, FeatureState, NodeState, WayState, User, prv:DataCreation, prv:DataItem, prv:HumanActor, prv:CreationGuideline, prv:Tag, rdfs:Literal
‣ Properties: includesEdit, hasTag, changesGeometry, addsTag, removesTag, changesValueOfKey, prv:createdBy, prv:precededBy, prv:usedData, prv:performedBy, subClassOf
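The edit properties named in the ontology (addsTag, removesTag, changesValueOfKey) can be illustrated by comparing the tags of two consecutive feature states. The following is a minimal sketch, not the authors' implementation: the function name `classify_tag_edits` and the dictionary representation of a feature state are assumptions made for illustration, and geometry changes are not covered.

```python
from typing import Dict, List, Tuple

Tags = Dict[str, str]


def classify_tag_edits(before: Tags, after: Tags) -> List[Tuple[str, str]]:
    """Label each tag difference between two consecutive feature states.

    The labels reuse the property names from the ontology slide; the
    function itself is a hypothetical helper.
    """
    edits: List[Tuple[str, str]] = []
    # Keys that only exist in the newer state were added.
    for key in sorted(after.keys() - before.keys()):
        edits.append(("addsTag", key))
    # Keys that only exist in the older state were removed.
    for key in sorted(before.keys() - after.keys()):
        edits.append(("removesTag", key))
    # Keys present in both states with a different value were changed.
    for key in sorted(before.keys() & after.keys()):
        if before[key] != after[key]:
            edits.append(("changesValueOfKey", key))
    return edits


# Tag sets of v1 and v2 as shown on the earlier slide.
v1 = {"amenity": "university", "name": "Institute for Geoinformatics"}
v2 = {
    "amenity": "university",
    "building": "yes",
    "name": "Institute for Geoinformatics",
}
print(classify_tag_edits(v1, v2))  # [('addsTag', 'building')]
```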
Does this work?
‣ Get a first idea whether this is a viable approach
‣ Compare results of
‣ a simple trust measure and
‣ observed feature quality
‣ Is there a correlation between the two?
Study area: Münster’s old town
Feature Selection
‣ Re-mapping the whole district was not feasible
‣ Up to 100 features were manageable
‣ Selection based on minimum number of versions
‣ 74 features with 6+ versions
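Selecting features by a minimum number of versions could look roughly like this. It is a sketch under assumptions: the input format (one `(feature_id, version)` pair per stored version) and the name `select_features` are hypothetical, and the ids below are made up.

```python
from collections import Counter


def select_features(history_rows, min_versions=6):
    """Return the ids of features with at least `min_versions` versions.

    history_rows: iterable of (feature_id, version) pairs, one per stored
    version of a feature (hypothetical input format).
    """
    version_counts = Counter(feature_id for feature_id, _version in history_rows)
    return sorted(fid for fid, n in version_counts.items() if n >= min_versions)


# Made-up history: "way/1" has 7 versions, "node/2" has 3.
rows = [("way/1", v) for v in range(1, 8)] + [("node/2", v) for v in range(1, 4)]
print(select_features(rows))  # ['way/1']
```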
74 features selected
Trust measure
‣ Positive factors:
‣ Versions
‣ Users
‣ Indirect confirmations = edits in the direct vicinity (50 m)
‣ Negative factors:
‣ Tag corrections
‣ Rollbacks
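Counting indirect confirmations, i.e. edits within 50 m of a feature, can be sketched with a great-circle distance. The slides do not say how distance was computed or which edits qualify (for example, whether only edits made after the feature's last change count), so the haversine formula, the point representation, and the function names are assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def count_indirect_confirmations(feature, nearby_edits, radius_m=50.0):
    """Count edit locations within `radius_m` metres of the feature.

    feature: (lat, lon); nearby_edits: iterable of (lat, lon).
    Both representations are hypothetical simplifications.
    """
    lat, lon = feature
    return sum(
        1
        for edit_lat, edit_lon in nearby_edits
        if haversine_m(lat, lon, edit_lat, edit_lon) <= radius_m
    )


# Illustrative coordinates only: one edit about 11 m away, one about 111 m away.
feature = (51.9625, 7.6256)
edits = [(51.9626, 7.6256), (51.9635, 7.6256)]
print(count_indirect_confirmations(feature, edits))  # 1
```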
Trust measure (contd.)
‣ Classification for each factor: 5 equal classes
‣ Combined into one classification
‣ Equal weights
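One way to read the classification step is sketched below. Several points are assumptions, since the slides do not spell them out: "5 equal classes" is taken to mean equal-interval classes over each factor's observed range, negative factors are inverted so that fewer corrections and rollbacks give a higher class, and the combined class is the rounded unweighted mean. All factor values and ranges are made up.

```python
import math


def equal_interval_class(value, lo, hi, n_classes=5):
    """Map a value in [lo, hi] to a class 1..n_classes of equal width."""
    if hi == lo:
        # Degenerate range: fall back to the middle class (assumption).
        return (n_classes + 1) // 2
    width = (hi - lo) / n_classes
    index = int((value - lo) / width)
    # The maximum value would land in class n_classes + 1; clamp it.
    return min(index, n_classes - 1) + 1


def trust_class(factors, ranges, negative=("tag_corrections", "rollbacks"), n_classes=5):
    """Combine per-factor classes with equal weights into one trust class."""
    classes = []
    for name, value in factors.items():
        lo, hi = ranges[name]
        cls = equal_interval_class(value, lo, hi, n_classes)
        if name in negative:
            # Invert negative factors: many corrections -> low trust class.
            cls = n_classes + 1 - cls
        classes.append(cls)
    mean = sum(classes) / len(classes)
    # Round half up (Python's round() would round half to even).
    return math.floor(mean + 0.5)


# Hypothetical factor values for one feature and observed ranges per factor.
factors = {"versions": 12, "users": 5, "confirmations": 30,
           "tag_corrections": 1, "rollbacks": 0}
ranges = {"versions": (6, 26), "users": (1, 11), "confirmations": (0, 50),
          "tag_corrections": (0, 5), "rollbacks": (0, 5)}
print(trust_class(factors, ranges))  # 4
```

With these numbers the per-factor classes are 2, 3, 4, 4 and 5, whose mean of 3.6 rounds to trust class 4.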
Trust measure
Field Survey
‣ Thematic accuracy, 4 classes with results:
1. Main tag wrong: 6 features (~8%)
2. Other tags wrong: 2 features (~3%)
3. Thematic ambiguities: 9 features (~12%)
4. Thematically correct: 57 features (~77%)
Field Survey (contd.)
‣ Topological consistency
‣ Is the feature correctly positioned relative to the surrounding features?
‣ Results: 73 out of 74 features (~99%)
‣ Information completeness
‣ TF-IDF measure to identify relevant tags per main tag
‣ ~37% tags missing (avg.)
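The slides do not give the exact TF-IDF formulation, so the following is only a sketch of one plausible reading: each main tag is treated as a "document" made of the tag keys used by its features, term frequency is the share of features in that group carrying a key, and inverse document frequency is the log of the number of groups divided by the number of groups in which the key occurs. The function names, the sample data, and the idea of a relevance threshold are assumptions.

```python
import math
from collections import Counter, defaultdict


def tfidf_by_main_tag(features):
    """Score tag keys per main tag.

    features: list of (main_tag, set_of_other_keys), a hypothetical format.
    Returns {main_tag: {key: tf * idf}}.
    """
    group_counts = defaultdict(Counter)  # main tag -> key -> number of features
    group_sizes = Counter()              # main tag -> number of features
    for main_tag, keys in features:
        group_sizes[main_tag] += 1
        group_counts[main_tag].update(set(keys))

    n_groups = len(group_sizes)
    doc_freq = Counter()                 # key -> number of main-tag groups using it
    for counts in group_counts.values():
        doc_freq.update(counts.keys())

    return {
        main_tag: {
            key: (n / group_sizes[main_tag]) * math.log(n_groups / doc_freq[key])
            for key, n in counts.items()
        }
        for main_tag, counts in group_counts.items()
    }


def missing_share(feature_keys, relevant_keys):
    """Share of relevant keys that a feature does not carry."""
    if not relevant_keys:
        return 0.0
    missing = [key for key in relevant_keys if key not in feature_keys]
    return len(missing) / len(relevant_keys)


# Made-up sample: two main tags with two features each.
sample = [
    ("amenity=university", {"name", "building", "wheelchair"}),
    ("amenity=university", {"name", "wheelchair"}),
    ("amenity=restaurant", {"name", "cuisine"}),
    ("amenity=restaurant", {"name", "cuisine", "opening_hours"}),
]
scores = tfidf_by_main_tag(sample)
# Keys scoring above a (hypothetical) threshold count as relevant.
relevant = {k for k, s in scores["amenity=university"].items() if s > 0.3}
print(sorted(relevant))
print(missing_share({"name", "wheelchair"}, relevant))
```

In this formulation a key such as `name`, which occurs under every main tag, scores zero: it is common everywhere and therefore not characteristic of any particular main tag.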
Observed quality: combined results
Trust measure
mean quality class: ~4.2
mean trust class: ~2.8
Do we get the trend right?
‣ Removed outliers
‣ Kendall’s τ: 0.52
‣ Moderate, but significant positive correlation
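For reference, Kendall's τ on two ordinal class lists can be computed as sketched below. The slides do not state which variant was used; this sketch assumes τ-b, which corrects for the ties that class data inevitably contains. The class values are made up and are not the study data.

```python
import math


def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation for two equally long sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: excluded from both factors
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    # Pairs not tied in x times pairs not tied in y.
    not_tied_x = concordant + discordant + tied_y_only
    not_tied_y = concordant + discordant + tied_x_only
    denominator = math.sqrt(not_tied_x * not_tied_y)
    if denominator == 0:
        return float("nan")
    return (concordant - discordant) / denominator


# Hypothetical trust and quality classes for six features.
trust = [1, 2, 2, 3, 4, 5]
quality = [2, 3, 4, 4, 4, 5]
print(round(kendall_tau_b(trust, quality), 2))
```

A value of 1 means every pair of features is ordered the same way by both measures, and -1 means every pair is ordered oppositely.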
Conclusions
‣ Initial study
‣ A feature’s history can determine its trustworthiness
‣ Trust values correlate with observed quality
‣ Even with a very simple model
‣ Outliers cannot be explained yet
Tons of Future Work
‣ Extend and refine the trust model: classification, weighting, positive vs. negative aspects, …
‣ Social aspects: Who has edited a feature?
‣ Repeat study without spatial focus
‣ How to scale the data collection?
‣ Learn the trust model from the data
Thank you!
All data used in this research © OpenStreetMap contributors.
carsten.kessler@uni-muenster.de | http://carsten.io | @carstenkessler
Carsten Keßler | René de Groot
