Fusing OpenStreetMap with Wikipedia

Ulmon's recipe for a travel guide is to fuse multiple open data sources that you might otherwise consult individually to plan your vacation, and to present them as one coherent package. We try to fuse this data in such a way that the resulting whole is more valuable than the sum of its parts.
Our main sources of map data and of knowledge about places are OpenStreetMap and Wikipedia, respectively. This talk is about the challenges posed by connecting the two, and our strategies for coping with them.


Transcript

  • 1. Fusing OpenStreetMap with Wikipedia. Ulmon GmbH, 08/05/2014, Linuxwochen Wien
  • 2. Hello from
  • 3. Ulmon’s recipe for a travel guide: fuse sources of data to create a whole more valuable than its parts
  • 4. Wikipedia and OSM in CityMaps2Go
  • 5. What about unmatchable Wiki entries?
  • 6. Wikipedia tag in OpenStreetMap http://taginfo.openstreetmap.org
  • 7. Wikipedia tag statistics
      Tag name        Number of values
      wikipedia       339,148
      wikipedia:ru     30,457
      wikipedia:en     16,432
      wikipedia:de     13,923
      wikipedia:es      4,706
      Total           404,666
      Wikipedia entries with location: 1,621,704 in 15 languages (798,965 English)
  • 8. The Confusion of Tongues
  • 9. Multiple OSM candidates for one Wiki
  • 10. Multiple fitting Wiki entries
  • 11. Wiki articles with no OSM object
  • 12. What data to include? … for an offline guide 178 MB!
  • 13. Ulmon’s matching algorithm … Stephansdom Ströck Stephansplatz Stephansplatz (U3 station) Stock-im-Eisen-Platz Café Weinwurm DO&CO am Stephansplatz Haas-Haus Aida … Distance: 0.9, Name: 1.0, Type: 0.0
  • 14. Comparing Names
      • Edit distance (Levenshtein distance)
      • Soundex
      • Dice coefficient
  • 15. Type score
      • Compare OSM tags with DBpedia types
          – Manual rules
          – Word similarity
          – Future: synonym analysis based on WordNet
  • 16. Decision tree
      • Generated using the J48 algorithm of the Weka toolkit
      • How to get learning data?
          – Manual creation
          – Parsing wikipedia tags from OSM
  • 17. Ulmon’s matching performance
      • Current
          – Total wiki entries: 810K (674K English)
          – Matched entries: 429K
      • Future
          – Total wiki entries: 1.6M
          – Matched entries (extrapolation): 850K
  • 18. Multiple OSM candidates for one Wiki
  • 19. Multiple fitting Wiki entries
  • 20. Open questions
      • Reduce false positives
          – Current: 10%, desired: < 3%
      • Get more matches!
      • Reduce the amount of data
  • 21. Thank you for your attention! Come visit us at www.ulmon.com
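
Two of the name-comparison measures listed on slide 14, edit distance and the Dice coefficient, can be sketched in Python as below. This is only an illustration under stated assumptions, not Ulmon's implementation: the function names, the scaling of the edit distance to a 0..1 similarity, and the use of character bigrams for the Dice coefficient are all choices made here and do not come from the slides. Soundex is left out.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or
    substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Edit distance scaled to 0..1, where 1.0 means identical strings.
    The scaling by the longer length is an assumption of this sketch."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def bigrams(text: str) -> list:
    """All overlapping two-character substrings of text."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def dice_coefficient(a: str, b: str) -> float:
    """Dice coefficient over character bigrams (counted as multisets):
    twice the number of shared bigrams divided by the total number of bigrams."""
    grams_a, grams_b = Counter(bigrams(a)), Counter(bigrams(b))
    total = sum(grams_a.values()) + sum(grams_b.values())
    if total == 0:
        # Both strings are shorter than two characters.
        return 1.0 if a == b else 0.0
    shared = sum((grams_a & grams_b).values())
    return 2.0 * shared / total


if __name__ == "__main__":
    # Names taken from the example on slide 13.
    for candidate in ["Stephansdom", "Stephansplatz", "Haas-Haus"]:
        print(candidate,
              levenshtein_similarity("Stephansdom", candidate),
              dice_coefficient("Stephansdom", candidate))
```

In practice both names would probably be normalised first (case, diacritics, language-specific suffixes) before such scores are fed into a classifier like the decision tree on slide 16; that preprocessing is not shown here.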