Data coherence between OpenStreetMap and Wikipedia - Presentation @ State of the Map 2013 Birmingham

Cristian Consonni
Student at the Università degli Studi di Milano-Bicocca; freelance developer and web designer at the Università degli Studi di Trento (University of Trento)
Data coherence between OSM and Wikipedia
Cristian Consonni
Fondazione Bruno Kessler
State of the Map 2013 - Birmingham
September 2013
Cristian Consonni Data coherence between OSM and Wikipedia 1 / 16
Outline
1 Introduction
2 The Problem
3 Proposing a Solution
Wikipedia-OSM comparator
Nuts4Nuts
4 Conclusions
5 Questions
Collecting Information About the Real World
Wikipedia and OpenStreetMap are:
collaborative
volunteer-driven
free (as in freedom and as in beer)
Both projects collect information about the real world.
Different Processes and Communities
Wikipedia
anonymous users can edit
entries consist of text (or media)
only encyclopedic subjects
content can be protected from editing in case of problems
OpenStreetMap
only registered users can edit
entries consist of data
everything can be described
content is always editable
Inconsistencies in the data
Data in Wikipedia can be inconsistent with data from OpenStreetMap.
We should compare the data and reconcile the differences.
On Wikipedia, the metro station “Colosseum” is placed inside the Colosseum itself.
On OpenStreetMap, the metro station is correctly placed outside the monument.
The OpenStreetMap maps on Wikipedia are provided by the WIWOSM tool by User:Master and User:Kolossos; check it out at:
http://wiki.openstreetmap.org/wiki/WIWOSM
Proposal of the Solution
Two steps towards a solution:
1 Compare the data
Identify links between Wikipedia pages and OSM entities
Extract all the available geographical information
Define metrics to calculate if the data are “close” or not
2 Reconcile the differences
Provide the communities with the results of the previous analysis
Create tools to facilitate the reconciliation
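One possible metric for deciding whether two positions are “close” is the great-circle distance between the coordinates given by Wikipedia and those given by OpenStreetMap, compared against a threshold. The following is only a sketch under that assumption: the slides do not say which metric was actually used, and the function names and the 100 m default threshold are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_close(wiki_coords, osm_coords, threshold_m=100.0):
    """True if the two (lat, lon) pairs lie within threshold_m metres.

    threshold_m is an illustrative default, not a value from the talk.
    """
    return haversine_m(*wiki_coords, *osm_coords) <= threshold_m
```

A pair of coordinates flagged as not close would then be a candidate for manual review by either community. A point-to-point distance is the simplest choice; for large features mapped as areas, a point-in-polygon test may be more appropriate.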
Comparing the data
Wikipedia-OpenStreetMap comparator
Proof-of-concept: comparing data about churches in Italy:
Wikipedia-OpenStreetMap comparator
source code: https://github.com/CristianCantoro/WOcomparator
Easy case:
pre-defined category of items (selection on a set of features in OSM, articles with a given template in Wikipedia)
only entities with an (it:)Wikipedia attribute, i.e. an OSM wikipedia tag pointing to the Italian Wikipedia, were selected
⇒ linking is straightforward.
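To illustrate why linking is straightforward in the easy case, here is a minimal sketch that reads the article title directly from the OSM `wikipedia` tag. It assumes the OSM entities are available as dictionaries with `type`, `id` and `tags` keys (similar to JSON exports of OSM data); that input shape and the function name are assumptions and are not taken from the WOcomparator source.

```python
def link_osm_to_itwiki(osm_elements):
    """Map it.wikipedia article titles to (type, id) of the tagged OSM entity.

    Only elements whose wikipedia tag has the form "it:<title>" are kept.
    """
    links = {}
    for element in osm_elements:
        tag = element.get("tags", {}).get("wikipedia", "")
        # Split on the first colon only, so titles containing ":" survive.
        lang, sep, title = tag.partition(":")
        if sep and lang == "it" and title:
            links[title.replace("_", " ")] = (element["type"], element["id"])
    return links
```

Entities without the tag, or tagged with another language edition, are skipped; those belong to the hard case.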
Comparing the data
Wikipedia-OpenStreetMap comparator
http://it.wikipedia.org/wiki/Utente:CristianCantoro/Georeferenziazione
Comparing the data
Nuts4Nuts
For the hard case (trying to link every possible entity), another tool:
Nuts4Nuts
source code: https://github.com/SpazioDati/Nuts4Nuts
http://nuts4nutsrecon.spaziodati.eu/reconcile?queries={%22q0%22:%20{%22query%22:%20%22Palazzo%20Vecchio%22}}
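The URL above passes a JSON object in the `queries` parameter, with one entry per name to reconcile. A small sketch of how such a URL could be assembled for several names follows. The endpoint and the `q0`/`query` layout are copied from the slide; the helper names are hypothetical, no request is sent, and nothing is assumed about the response format or about whether the service is still reachable.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint as shown on the slide.
RECONCILE_ENDPOINT = "http://nuts4nutsrecon.spaziodati.eu/reconcile"


def build_reconcile_url(names):
    """Build a reconcile URL with one query (q0, q1, ...) per name."""
    queries = {f"q{i}": {"query": name} for i, name in enumerate(names)}
    # urlencode percent-encodes the JSON payload for use in a query string.
    return RECONCILE_ENDPOINT + "?" + urlencode({"queries": json.dumps(queries)})


def decode_queries(url):
    """Recover the JSON payload from a URL built by build_reconcile_url."""
    return json.loads(parse_qs(urlsplit(url).query)["queries"][0])
```

Calling `build_reconcile_url(["Palazzo Vecchio"])` yields a URL carrying the same query as the one on the slide, with the payload fully percent-encoded.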
Known limitations:
limited to Italy
relies on external services
Dandelion
Nuts4Nuts is built using the infrastructure provided by Dandelion (http://dandelion.eu), a data market by SpazioDati srl.
Future Work
Nuts4Nuts is a step towards finding geographical information for Wikipedia articles that have no explicit coordinates.
Future work:
study new approaches to link entities between Wikipedia and OpenStreetMap
an application to fix inconsistencies or fill in missing data (an example was shown on the slide)
Conclusions
Wikipedia and OSM collect information about the real world
Comparing data between the two projects can highlight inconsistencies
We should fix them
Questions & Contacts
Questions?
mail: consonni@fbk.eu
twitter: @CristianCantoro
github: https://github.com/CristianCantoro
Thank you
Thank you!
This work was supported by:
A project by:
SpazioDati srl
Edizioni Curcu & Genovese
with funds from the European Regional Development Fund.
More information: http://trentino.dandelion.eu
Copyright notice
This presentation is released under the CC BY-SA 3.0 licence.
Further info:
http://creativecommons.org/licenses/by-sa/3.0/
Logos and trademarks belong to their respective owners.