1. Creating Knowledge out of Interlinked Data
DBpedia Community Meeting
DBpedia Internationalization+
30/01/2014 Amsterdam
Dimitris Kontokostas
DBpedia is a community project; please see http://dbpedia.org for a list of contributors.
LOD2 Presentation . 02.09.2010 . Page
AKSW, Universität Leipzig
http://lod2.eu
9. Infobox Templates
Wikitext-Syntax
{{Infobox Korean settlement
| title = Busan Metropolitan City
| img = Busan.jpg
| imgcaption = A view of the [[Geumjeong]] district in Busan
| hangul = 부산 광역시
...
| area_km2 = 763.46
| pop = 3635389
| popyear = 2006
| mayor = Hur Nam-sik
| divs = 15 wards (Gu), 1 county (Gun)
| region = [[Yeongnam]]
| dialect = [[Gyeongsang]]
}}
RDF representation
dbp:Busan dbp:title "Busan Metropolitan City"
dbp:Busan dbp:hangul "부산 광역시"@Hang
dbp:Busan dbp:area_km2 "763.46"^^xsd:float
dbp:Busan dbp:pop "3635389"^^xsd:int
dbp:Busan dbp:region dbp:Yeongnam
dbp:Busan dbp:dialect dbp:Gyeongsang
...
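The step from infobox wikitext to triples can be sketched in a few lines of Python. This is a minimal illustration, not the DBpedia extraction framework: the function names `parse_infobox` and `to_triples` are hypothetical, only simple `| key = value` lines are handled, and values are emitted as plain literals or links without the datatype and language handling shown on the slide.

```python
import re

WIKITEXT = """{{Infobox Korean settlement
| title = Busan Metropolitan City
| area_km2 = 763.46
| pop = 3635389
| region = [[Yeongnam]]
| dialect = [[Gyeongsang]]
}}"""

# Matches a value that consists of exactly one wiki link, e.g. [[Yeongnam]].
LINK = re.compile(r"^\[\[([^\]|]+)(?:\|[^\]]*)?\]\]$")


def parse_infobox(wikitext):
    """Return (template_name, {key: raw_value}) for simple '| key = value' lines."""
    lines = wikitext.strip().splitlines()
    name = lines[0].lstrip("{").strip()
    props = {}
    for line in lines[1:]:
        line = line.strip()
        if line.startswith("|") and "=" in line:
            key, value = line[1:].split("=", 1)
            props[key.strip()] = value.strip()
    return name, props


def to_triples(subject, props):
    """Turn each infobox property into a (subject, predicate, object) tuple."""
    triples = []
    for key, value in props.items():
        match = LINK.match(value)
        if match:
            # A wiki link becomes a reference to another resource.
            obj = "dbp:" + match.group(1).strip().replace(" ", "_")
        else:
            # Everything else is kept as a plain string literal.
            obj = '"' + value + '"'
        triples.append((subject, "dbp:" + key, obj))
    return triples


name, props = parse_infobox(WIKITEXT)
triples = to_triples("dbp:Busan", props)
for s, p, o in triples:
    print(s, p, o)
```

The property names are taken verbatim from the infobox keys, which is why this kind of raw extraction inherits every naming inconsistency of the underlying templates.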
10. Creating Knowledge out of Interlinked Data
A closer look at infoboxes
KAIST – LOD2 16.8..2011 . Page
13. Creating Knowledge out of Interlinked Data
Björk (Musician)
Occupation = Musician, Actor
Born = 21.12.1965, Reykjavík
Brown (Prime Minister)
office = Prime Minister of the UK
birth_date = 20.4.1951
birth_place = Govan
Romero (Actor)
occupation = Actor, Editor
birthdate = 4.2.1940
birthplace = New York
16. Creating Knowledge out of Interlinked Data
DBpedia – Collaborative Ontology Engineering
• Mappings Wiki
• http://mappings.dbpedia.org/
• Everybody can contribute new mappings or improve existing ones
• ~170 editors
• Correct Semantics:
• Combine what belongs together (birth_place, birthplace)
• Separate what is different (bornIn, birthplace)
• Big boost for Precision
• Recall is crowdsourced - Help us! :)
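What a mapping does to differently named infobox keys can be sketched as a lookup table. The sketch below is an illustration under assumptions, not the Mappings Wiki format: the dictionary `MAPPINGS`, the function `apply_mapping`, and the template names are hypothetical, and `dbo:birthDate`, `dbo:birthPlace` and `dbo:occupation` serve as example ontology properties.

```python
# Hypothetical per-template mappings: raw infobox key -> ontology property.
MAPPINGS = {
    "Infobox prime minister": {
        "birth_date": "dbo:birthDate",
        "birth_place": "dbo:birthPlace",
    },
    "Infobox actor": {
        "occupation": "dbo:occupation",
        "birthdate": "dbo:birthDate",
        "birthplace": "dbo:birthPlace",
    },
}


def apply_mapping(template, raw_props):
    """Rename mapped keys to ontology properties; unmapped keys are dropped."""
    rules = MAPPINGS.get(template, {})
    return {rules[k]: v for k, v in raw_props.items() if k in rules}


brown = apply_mapping(
    "Infobox prime minister",
    {"office": "Prime Minister of the UK", "birth_date": "20.4.1951", "birth_place": "Govan"},
)
romero = apply_mapping(
    "Infobox actor",
    {"occupation": "Actor, Editor", "birthdate": "4.2.1940", "birthplace": "New York"},
)
print(brown)
print(romero)
```

Both `birth_place` and `birthplace` end up as the same property, while the unmapped `office` key is dropped. In this sketch, that is the trade-off the slide describes: mapped data is precise, and coverage grows only as more mappings are contributed.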
DBpedia Community Meeting / Amsterdam 30.01.2014
18. Creating Knowledge out of Interlinked Data
DBpedia – Internationalization
An effort to port DBpedia to local (non-English) Wikipedias
19. Creating Knowledge out of Interlinked Data
DBpedia Internationalization (I18n)
• Local Wikipedias provide more information on local resources, e.g.:
  • The French article on the Eiffel Tower is better than the English one
  • Articles of local importance might not exist in English
• Multilingual extraction was limited to basic page structure
  • Labels, categories, links, raw infobox extraction
• DBpedia I18n started in 2009 with German and Korean (PHP).
• It was extended in 2010 with Greek (Scala), and many languages followed.
• The default multilingual extraction is now enhanced, and some languages are tailored for even better extraction.
  • Extractor tweaking / mapping definitions
20. Creating Knowledge out of Interlinked Data
DBpedia I18n – Overview
• DBpedia 3.7 (08/2011) was the first release to introduce internationalized datasets
  • Language-based namespaces (http://{lang}.dbpedia.org)
  • Most are not dereferenceable (except local chapters)
• DBpedia 3.9 (08/2013) provides data in 191 languages
• Mappings are enabled for 28 languages (24 active)
• 15 local DBpedia chapters: http://dbpedia.org/Internationalization/Chapters
  • 12 from Europe, plus Indonesian, Japanese & Korean
  • Provide dereferenceable URIs / IRIs
  • Maintain their own domain and community
  • Mappings coordination
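The language-based namespace scheme can be sketched as a small helper. This is an illustration under assumptions: the function name `resource_iri` is hypothetical, the `/resource/` path segment follows the convention of the main DBpedia site, and treating English as the un-prefixed `http://dbpedia.org` namespace is an assumption of this sketch.

```python
from urllib.parse import quote


def resource_iri(lang, title, as_uri=False):
    """Build a resource identifier in the namespace http://{lang}.dbpedia.org."""
    # Assumption: English lives in the un-prefixed namespace.
    host = "dbpedia.org" if lang == "en" else lang + ".dbpedia.org"
    local = title.replace(" ", "_")
    if as_uri:
        # Percent-encode non-ASCII characters to get a URI instead of an IRI.
        local = quote(local, safe="_(),:'")
    return "http://" + host + "/resource/" + local


print(resource_iri("de", "Berlin"))
print(resource_iri("el", "Αθήνα"))
print(resource_iri("el", "Αθήνα", as_uri=True))
```

The `as_uri` switch shows the difference between the two identifier forms mentioned on the slide: an IRI keeps the local script readable, while a URI percent-encodes it.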
22. Creating Knowledge out of Interlinked Data
DBpedia I18n – Mapping Activity
23. Creating Knowledge out of Interlinked Data
DBpedia I18n – Mapping Statistics (2013.02)
25. Creating Knowledge out of Interlinked Data
DBpedia I18n – Mapping Statistics (v3.8)
26. Creating Knowledge out of Interlinked Data
To keep our Dutch audience happy :)
v3.9 (NL): 211.927 People, 861.633 Places, 16.733 Organizations, 92.314 Works
DBpedia I18n – Mapping Statistics (v3.8)
29. Creating Knowledge out of Interlinked Data
DBpedia I18n
Further information:
● http://wiki.dbpedia.org/Internationalization
● Jens Lehmann, Robert Isele, Max Jakob, Anja Jentzsch, Dimitris Kontokostas, Pablo N. Mendes, Sebastian Hellmann, Mohamed Morsey, Patrick van Kleef, Sören Auer, Christian Bizer. DBpedia – A Large-scale, Multilingual Knowledge Base Extracted from Wikipedia. To appear in the Semantic Web Journal.
● Dimitris Kontokostas, Charalampos Bratsas, Sören Auer, Sebastian Hellmann, Ioannis Antoniou, George Metakides. Internationalization of Linked Data: The case of the Greek DBpedia edition. Web Semantics: Science, Services and Agents on the World Wide Web, Volume 15, September 2012, Pages 51–61, ISSN 1570–8268, 10.1016/j.websem.2012.01.001.
30. Creating Knowledge out of Interlinked Data
DBpedia @ GSoC
5 successful GSoC DBpedia projects this year:
http://wiki.dbpedia.org/gsoc2013
● Type inference based on categories (Kasun Perera)
● New interactive DBpedia interface (Denis Lukovnikov)
● Power tool for DBpedia testing metadata (Lazaros Ioannidis)
● Wikidata integration (Hady ElHasar)
● Input format generalization for DBpedia Spotlight
Notes on the projects:
● Live Wikidata2DBpedia endpoint (2014)
● Available at http://live.dbpedia.org
● Available at https://github.com/dbpedia/dbpedia-links
● Using Databugger output: http://databugger.aksw.org
Do you know any students for DBpedia @ GSoC 2014?
31. Creating Knowledge out of Interlinked Data
Quality @ DBpedia (soon)
Databugger + GSoC Power tool
SPARQL quality queries
(more than one birth date)
SELECT ?s WHERE {
  ?s dbo:birthDate ?d .
}
GROUP BY ?s
HAVING (COUNT(?d) > 1)
=>
dbr:Phil_Cuzzi => Wikipedia error
dbr:Ivan_Cattaneo
dbr:Vijay_Ghate
dbr:William_Tempest
dbr:Cliff_Speegle
dbr:Arnold,_Duke_of_Guelders
dbr:Schuyler_Grant
dbr:Vlas_Chubar
dbr:Adrian_Peterson
...
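The logic of this quality check can also be sketched outside a SPARQL endpoint. The Python below is a minimal illustration, not the Databugger implementation: the function name `subjects_with_multiple_values` and the sample data are made up, and the grouping and filtering mirror the GROUP BY and HAVING steps of the query.

```python
from collections import defaultdict


def subjects_with_multiple_values(triples, predicate):
    """Return subjects that have more than one value for the given predicate."""
    values = defaultdict(list)
    for s, p, o in triples:
        if p == predicate:
            values[s].append(o)  # corresponds to GROUP BY ?s
    # corresponds to HAVING (COUNT(?d) > 1)
    return sorted(s for s, objs in values.items() if len(objs) > 1)


# Made-up sample data for illustration only.
SAMPLE = [
    ("ex:PersonA", "dbo:birthDate", "1950-01-01"),
    ("ex:PersonA", "dbo:birthDate", "1951-01-01"),
    ("ex:PersonB", "dbo:birthDate", "1940-02-04"),
    ("ex:PersonB", "dbo:birthPlace", "ex:SomeCity"),
]

suspects = subjects_with_multiple_values(SAMPLE, "dbo:birthDate")
print(suspects)
```

A hit is only a candidate for review: as the first result on the slide shows, the conflicting values may come from an error in Wikipedia itself rather than from the extraction.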
33. Creating Knowledge out of Interlinked Data
Thank you for your attention!
DBpedia is a community project; please see http://dbpedia.org for a list of contributors.