DBpedia i18n - Amsterdam Meeting (30/01/2014)


DBpedia internationalization presentation at the 1st DBpedia Meeting in Amsterdam (30/01/2014)


Transcript

  • 1. Creating Knowledge out of Interlinked Data DBpedia Community Meeting DBpedia Internationalization+ 30/01/2014 Amsterdam Dimitris Kontokostas DBpedia is a community project, please see http://dbpedia.org for a List of contributors LOD2 Presentation . 02.09.2010 . Page AKSW, Universität Leipzig http://lod2.eu
  • 2. Structure in Wikipedia: Title, Abstract, Infoboxes, Geo-coordinates, Categories, Images, Links (to other language versions, to other Wikipedia pages, to the Web), Redirects, Disambiguations
  • 9. Infobox Templates
    Wikitext syntax:
        {{Infobox Korean settlement
        | title = Busan Metropolitan City
        | img = Busan.jpg
        | imgcaption = A view of the [[Geumjeong]] district in Busan
        | hangul = 부산 광역시
        ...
        | area_km2 = 763.46
        | pop = 3635389
        | popyear = 2006
        | mayor = Hur Nam-sik
        | divs = 15 wards (Gu), 1 county (Gun)
        | region = [[Yeongnam]]
        | dialect = [[Gyeongsang]]
        }}
    RDF representation:
        dbp:Busan dbp:title "Busan Metropolitan City"
        dbp:Busan dbp:hangul "부산 광역시"@Hang
        dbp:Busan dbp:area_km2 "763.46"^^xsd:float
        dbp:Busan dbp:pop "3635389"^^xsd:int
        dbp:Busan dbp:region dbp:Yeongnam
        dbp:Busan dbp:dialect dbp:Gyeongsang
        ...
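The step from infobox wikitext to RDF shown on this slide can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `infobox_to_triples`, the one-property-per-line parsing, and the datatype guessing rules are hypothetical and are not the DBpedia extraction framework, which handles many more cases.

```python
import re

# A shortened copy of the infobox from the slide.
INFOBOX = """{{Infobox Korean settlement
| title = Busan Metropolitan City
| area_km2 = 763.46
| pop = 3635389
| region = [[Yeongnam]]
| dialect = [[Gyeongsang]]
}}"""


def infobox_to_triples(subject, wikitext):
    """Turn `| key = value` lines into (subject, predicate, object) strings.

    Hypothetical helper: assumes one property per line and guesses the
    datatype from the shape of the value.
    """
    triples = []
    for line in wikitext.splitlines():
        match = re.match(r"\|\s*(\w+)\s*=\s*(.+?)\s*$", line)
        if not match:
            continue  # skip the template name and the closing braces
        key, value = match.groups()
        link = re.fullmatch(r"\[\[([^\]|]+)\]\]", value)
        if link:
            # A wiki link becomes a reference to another resource.
            obj = "dbp:" + link.group(1).replace(" ", "_")
        elif re.fullmatch(r"-?\d+", value):
            obj = f'"{value}"^^xsd:int'
        elif re.fullmatch(r"-?\d+\.\d+", value):
            obj = f'"{value}"^^xsd:float'
        else:
            obj = f'"{value}"'
        triples.append((f"dbp:{subject}", f"dbp:{key}", obj))
    return triples


for triple in infobox_to_triples("Busan", INFOBOX):
    print(" ".join(triple))
```

Every property name is copied verbatim into the `dbp:` namespace, which is why differently spelled infobox keys end up as different predicates.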
  • 10. A closer look at infoboxes (KAIST – LOD2, 16.8.2011)
  • 13. A closer look at infoboxes: three people, three property naming schemes
    Björk (Musician): Occupation = Musician, Actor; Born = 21.12.1965, Reykjavík
    Brown (Prime Minister): office = Prime Minister of the UK; birth_date = 20.4.1951; birth_place = Govan
    Romero (Actor): occupation = Actor, Editor; birthdate = 4.2.1940; birthplace = New York
  • 16. DBpedia – Collaborative Ontology Engineering
    • Mappings Wiki
        • http://mappings.dbpedia.org/
        • Everybody can contribute new mappings or improve existing ones
        • ~170 editors
    • Correct semantics:
        • Combine what belongs together (birth_place, birthplace)
        • Separate what is different (bornIn, birthplace)
    • Big boost for precision
    • Recall is crowdsourced - Help us! :)
    DBpedia Community Meeting / Amsterdam 30.01.2014 http://lod2.eu
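The idea of combining what belongs together can be illustrated with a small lookup table that sends differently spelled infobox keys to one ontology property. This is a minimal sketch: the table `MAPPINGS` and the function `map_infobox` are hypothetical, the real mappings live in the Mappings Wiki in its own template syntax, and only `dbo:birthDate` appears in the slides (the other `dbo:` names are assumed here for illustration).

```python
# Illustrative mapping table: infobox key (lower-cased) -> ontology property.
MAPPINGS = {
    "birth_place": "dbo:birthPlace",
    "birthplace": "dbo:birthPlace",
    "birth_date": "dbo:birthDate",
    "birthdate": "dbo:birthDate",
    "occupation": "dbo:occupation",
}


def map_infobox(properties):
    """Split infobox properties into mapped and unmapped ones.

    Returns (mapped, unmapped): a dict keyed by ontology property and a
    list of infobox keys for which no mapping exists yet.
    """
    mapped, unmapped = {}, []
    for key, value in properties.items():
        target = MAPPINGS.get(key.strip().lower())
        if target is None:
            unmapped.append(key)
        else:
            mapped[target] = value
    return mapped, unmapped


# Infobox values taken from the Brown and Romero examples in the slides.
brown = {
    "office": "Prime Minister of the UK",
    "birth_date": "20.4.1951",
    "birth_place": "Govan",
}
romero = {
    "occupation": "Actor, Editor",
    "birthdate": "4.2.1940",
    "birthplace": "New York",
}

print(map_infobox(brown))
print(map_infobox(romero))
```

Keys that fall into the unmapped list are the recall gap that the slide asks the community to close by writing more mappings.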
  • 17. DBpedia – Collaborative Ontology Engineering
  • 18. DBpedia Internationalization: the effort to port DBpedia to local (non-English) Wikipedias
  • 19. DBpedia Internationalization (I18n)
    • Local Wikipedias provide more information on local resources, e.g.:
        • The French article on the Eiffel Tower is better than the English one
        • Articles of local importance might not exist in English
    • Multilingual extraction was limited to basic page structure
        • Labels, categories, links, raw infobox extraction
    • DBpedia I18n started in 2009 with German and Korean (PHP)
    • Extended in 2010 with Greek (Scala); many languages followed
    • Default multilingual extraction is now enhanced, and some languages are tailored for even better extraction
        • Extractor tweaking / mapping definitions
  • 20. DBpedia I18n – Overview
    • DBpedia 3.7 (08/2011) was the first release to introduce internationalized datasets
        • Language-based namespaces (http://{lang}.DBpedia.org)
        • Most are not dereferenceable (except local chapters)
    • DBpedia 3.9 (08/2013) provides data in 191 languages
        • Mappings are enabled for 28 languages (24 active)
    • 15 local DBpedia chapters: http://dbpedia.org/Internationalization/Chapters
        • 12 from Europe, plus Indonesian, Japanese & Korean
        • Provide dereferenceable URIs / IRIs
        • Maintain their own domain and community
        • Mappings coordination
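The language-based namespace pattern and the URI / IRI distinction can be made concrete with a short sketch. The helper names `resource_iri` and `resource_uri` are hypothetical, and two details are assumptions not stated on the slide: the `/resource/` path segment and the English edition living on the bare `dbpedia.org` host.

```python
from urllib.parse import quote


def _host(lang):
    # Assumption: English uses the bare host, every other language a subdomain.
    return "dbpedia.org" if lang == "en" else f"{lang}.dbpedia.org"


def resource_iri(lang, title):
    """Build an IRI: non-ASCII characters in the title are kept as they are."""
    return f"http://{_host(lang)}/resource/" + title.replace(" ", "_")


def resource_uri(lang, title):
    """Build a URI: the title is percent-encoded as UTF-8."""
    return f"http://{_host(lang)}/resource/" + quote(title.replace(" ", "_"), safe="")


print(resource_iri("el", "Αθήνα"))
print(resource_uri("el", "Αθήνα"))
print(resource_uri("en", "Eiffel Tower"))
```

For titles in non-Latin scripts the IRI form stays readable, while the URI form is safe for clients that only accept ASCII.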
  • 22. DBpedia I18n – Mapping Activity
  • 23. DBpedia I18n – Mapping Statistics (2013.02)
  • 25. DBpedia I18n – Mapping Statistics (v3.8)
  • 26. DBpedia I18n – Mapping Statistics (v3.8)
    To keep our Dutch audience happy :) v3.9 (NL): 211,927 People, 861,633 Places, 16,733 Organizations, 92,314 Works
  • 29. DBpedia I18n – Further information:
    ● http://wiki.dbpedia.org/Internationalization
    ● Jens Lehmann, Robert Isele, Max Jakob, Anja Jentzsch, Dimitris Kontokostas, Pablo N. Mendes, Sebastian Hellmann, Mohamed Morsey, Patrick van Kleef, Sören Auer, Christian Bizer. DBpedia – A Large-scale, Multilingual Knowledge Base Extracted from Wikipedia. To appear in the Semantic Web Journal.
    ● Dimitris Kontokostas, Charalampos Bratsas, Sören Auer, Sebastian Hellmann, Ioannis Antoniou, George Metakides. Internationalization of Linked Data: The case of the Greek DBpedia edition. Web Semantics: Science, Services and Agents on the World Wide Web, Volume 15, September 2012, Pages 51–61, ISSN 1570–8268, 10.1016/j.websem.2012.01.001.
  • 30. DBpedia @ GSoC
    5 successful GSoC DBpedia projects this year: http://wiki.dbpedia.org/gsoc2013
    ● Type inference based on categories (Kasun Perera)
        ● Available at https://github.com/dbpedia/dbpedia-links
    ● New interactive DBpedia interface (Denis Lukovnikov)
        ● Available at http://live.dbpedia.org
    ● Wikidata integration (Hady ElHasar)
        ● Live Wikidata2DBpedia endpoint (2014)
    ● Power tool for DBpedia testing metadata (Lazaros Ioannidis)
        ● Using Databugger output: http://databugger.aksw.org
    ● Input format generalization for DBpedia Spotlight
    Do you know any students for DBpedia @ GSoC 2014?
  • 31. Quality @ DBpedia (soon)
    Databugger + GSoC Power tool
    SPARQL quality queries (more than one birth date) =>
        SELECT ?s WHERE { ?s dbo:birthDate ?d . }
        GROUP BY ?s
        HAVING (COUNT(?d) > 1)
    Results:
        dbr:Phil_Cuzzi => Wikipedia error
        dbr:Ivan_Cattaneo
        dbr:Vijay_Ghate
        dbr:William_Tempest
        dbr:Cliff_Speegle
        dbr:Arnold,_Duke_of_Guelders
        dbr:Schuyler_Grant
        dbr:Vlas_Chubar
        dbr:Adrian_Peterson
        ...
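The quality check on this slide groups triples by subject and keeps the subjects that have more than one birth date. The same logic over an in-memory list of triples might look like the sketch below. The function name `subjects_with_multiple_values` is hypothetical, and the `ex:` resources and their dates are invented sample data, not DBpedia content.

```python
from collections import defaultdict


def subjects_with_multiple_values(triples, predicate):
    """Return the subjects that have more than one distinct value for `predicate`."""
    values = defaultdict(set)
    for subject, pred, obj in triples:
        if pred == predicate:
            values[subject].add(obj)
    return sorted(s for s, objs in values.items() if len(objs) > 1)


# Invented sample data: ex:Alice has two conflicting birth dates.
SAMPLE = [
    ("ex:Alice", "dbo:birthDate", "1950-01-01"),
    ("ex:Alice", "dbo:birthDate", "1951-01-01"),
    ("ex:Bob", "dbo:birthDate", "1960-05-05"),
    ("ex:Bob", "dbo:birthPlace", "ex:Amsterdam"),
]

print(subjects_with_multiple_values(SAMPLE, "dbo:birthDate"))
```

A hit does not say which value is wrong. As the Phil_Cuzzi result shows, the conflict can originate in Wikipedia itself rather than in the extraction.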
  • 32. DBpedia @ GSoC
  • 33. Thank you for your attention! DBpedia is a community project; please see http://dbpedia.org for a list of contributors.