A step towards the improvement of spatial data quality of Web 2.0 geo-applications: The case of OpenStreetMap. Vyron Antoniou, Muki Haklay, Jeremy Morley. Department of Civil, Environmental and Geomatic Engineering
A fundamental GIS problem Information System Real World http://www.bing.com/maps Google Earth
 
OSM Map Features
Wiki Democracy +
OSM Data Geometry Attributes (Tags) +
OSM’s Geometry Haklay et al. Antoniou et al. Completeness Positional Accuracy
Tags?
 
Unique Tags vs Total Tags for each OSM Feature Category (GB) Sum: 2.25M tags
How many tags do we have for each entity?
How often is a new tag introduced? Residential (2826), Primary (623)
How often is a new tag introduced?
Unique Tags vs Popular Tags (95% of population)
From OSM wiki-pages to XML Schema XML Schema = OSM Specification
From OSM wiki-pages to XML Schema
Merkaartor Potlatch JOSM Freedom, Formalization and Quality Standards?
Freedom, Formalization and Quality Standards?
 
 
Final Points
Thank you

GISRUK 2010 - A step towards the improvement of spatial data quality of Web 2.0 geo-applications. The case of OpenStreetMap

Editor's Notes

  1. The subject of the presentation is improving the spatial data quality (SDQ) of Web 2.0 geo-applications, examining in particular the case of OpenStreetMap (OSM).
  2. Well, maybe the most fundamental problem of GIS is how we can put the real world into an information system: how we can model reality in such a way that it fits into a GIS.
  3. To deal with that problem, the Ordnance Survey has published a catalogue of real-world objects (RWOs), which actually serves as a specification for their OS MasterMap product. The purpose of this catalogue, which runs to 566 pages, is to provide a list of the RWOs of the product and a list of the features and attributes of each RWO.
  4. In fact, OSM has something similar to that. Well, it is not a catalogue per se but rather a wiki page, yet it serves the same purpose: to provide a list of entities and possible attributes (or tags) that users can assign to these entities.
  5. The thing is, though, that this list has not simply been published; it has been created through democratic procedures with the help of wiki technology. In brief, OSM users can suggest, through a voting system, which entities or tags need to be deleted, altered or added on the Map Features page.
  6. So, in fact, when we speak of OSM data we actually speak of the geometry and the attributes, or tags, that users have assigned to the real-world entities.
  7. Now, regarding the quality of the OSM geometry, there has been some research examining either completeness or positional accuracy against the OS Meridian2 dataset. But what we haven’t seen up to now is an examination of the quality of the tags.
  8. So the question is what is going on with the tags in OSM?
  9. After all, tags are what actually transform spaghetti-like digitized data into a proper map.
  10. So, we looked into what is going on with the OSM tags for GB. This graph shows two things. The first is the number of tags recorded for each of these 18 categories for GB. We see that the population of tags ranges from just a few thousand for motorway_links up to 900k tags for the residential roads category. In total, these 18 categories have more than 2.2 million tags. The second thing shown here is the number of unique tags recorded for each category. The interesting thing in this line graph is that we obviously don’t need more than 300 unique tags to describe a residential road, nor even 50 unique tags to describe a motorway_link.
  11. Now, the thing is that, despite the huge number of tags generated by the OSM contributors, the average number of tags per recorded entity is quite small, with the majority of the categories having between 1 and 3 tags per feature. This gives us an indication of OSM completeness in terms of entity attribution, and it certainly indicates that the tag population will keep growing, both because new entities will be digitized and because the average number of tags per entity will grow.
  12. Now, what we wanted to see is how often a new tag is introduced. Well, the answer is that this depends on the total tag population of each category. So, for example, for residential roads we have, on average, a new tag for almost every 3000 tags, whereas for primary roads we get a new tag for every 600 tags. So, the question now is: is this good or bad? Well, in order to answer that question we translated these figures into percentages of growth.
  13. So, this graph shows how much the tag population has to grow in each category in order to have a new tag introduced for that category. The interesting thing here is that, after a threshold of about 40,000 tags, an increase of 0.3%-0.5% creates a new tag in each category.
  14. Now, the next question is: OK, we do not need that many tags per feature category, but how many tags are actually enough? We see here that just a small fraction of the unique tags covers 95% of the tag population in each case. So, actually, we need only the tip of the iceberg to correctly model the real world and not the whole iceberg itself. So, is there something we can do about that?
  15. Now, our initial aim was to examine the quality of OSM tags. But examine the quality against what? OSM is a product that literally has no specification, and it captures reality in much more detail than any other product. So what we wanted was first to create an XML Schema that would work as the OSM specification. We did that both by manually gathering information from the OSM wiki pages and by examining the tags that were included in the tip of the iceberg that I showed you earlier. So, just to give you an example of what the schema looks like:
  16. So, when we had finished some fragments of that Schema, we started performing all sorts of comparisons with the actual data. Here is some of the interesting stuff we found. When we examined the entities of some of the OSM feature categories, we found that the percentage of Schema violations was really high. We noticed that the majority of the entities violated the Schema because they had the ‘created_by’ tag, which had been deprecated. When we didn’t take the created_by tag into account, we saw that the percentage of features in violation was considerably smaller.
  17. We performed the same evaluation on larger categories, including all OSM Highways, Nodes and Places, but this time we examined only the specific Schema principle that says that tags should not have the created_by key, both for all the entities and for entities created after 30 April last year, when this rule was adopted. The interesting thing here is that when new guidelines are announced for the OSM dataset, users don’t implement them immediately; rather, they continue providing data as they used to.
  18. So, what we suggest is that, in order to improve the quality of OSM, and why not of other VGI geo-applications, some sort of formalization should exist under the hood. By putting the XML Schema as a layer between the editors that users use and the database, we can have Freedom and Formalization and preserve the Quality Standards of the dataset.
  19. This schema could also be used, again under the hood, with the voting system, in such a way that any changes decided would be automatically propagated to the schema and finally to the data.
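
The tag measures discussed in notes 10 to 17 can be sketched in a few lines of Python. This is only a minimal illustration, not the authors' actual analysis: the `features` list is made-up toy data, the function names `tag_statistics` and `violation_share` are hypothetical, and the sketch assumes that a "unique tag" means a distinct tag key within one feature category. It computes the total and unique tag counts, the average number of tags per feature, the average number of tags recorded per unique tag, the most-used keys needed to cover a given share of the tag population, and the share of features that carry a deprecated key such as created_by.

```python
from collections import Counter

# Hypothetical toy data: each feature is a dict of OSM tags (key -> value).
features = [
    {"highway": "residential", "name": "High Street", "created_by": "JOSM"},
    {"highway": "residential", "name": "Mill Lane"},
    {"highway": "residential"},
    {"highway": "residential", "name": "Park Road", "maxspeed": "30"},
]


def tag_statistics(features, coverage=0.95):
    """Summarise tag usage for the features of one category."""
    # Count how many times each tag key occurs across all features.
    key_counts = Counter(key for tags in features for key in tags)
    total_tags = sum(key_counts.values())
    unique_keys = len(key_counts)

    # Take the most-used keys, one by one, until together they
    # account for the requested share of the tag population.
    running, popular_keys = 0, []
    for key, count in key_counts.most_common():
        if running >= coverage * total_tags:
            break
        popular_keys.append(key)
        running += count

    return {
        "total_tags": total_tags,
        "unique_keys": unique_keys,
        # Average attribution per entity (note 11).
        "tags_per_feature": total_tags / len(features),
        # Average number of tags recorded per unique key (note 12).
        "tags_per_new_key": total_tags / unique_keys,
        # The "tip of the iceberg" (note 14).
        "popular_keys": popular_keys,
    }


def violation_share(features, deprecated_key="created_by"):
    """Fraction of features that still carry a deprecated tag key."""
    offenders = sum(deprecated_key in tags for tags in features)
    return offenders / len(features)


# The toy data is far too small for a 95% threshold to be meaningful,
# so a lower coverage value is used here.
stats = tag_statistics(features, coverage=0.75)
print(stats)
print(violation_share(features))
```

On real data, the same counts would be taken per feature category over the full set of entities, which is where the long tail of rarely used tags becomes visible.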