DBpedia Mappings Wiki

Anja Jentzsch - @anjeve	

Hasso-Plattner-Institute, Potsdam, Germany	


SMWCon Fall 2013	

DBpedia Mappings Wiki, SMWCon Fall 2013, Berlin


  1. DBpedia Mappings Wiki
     Anja Jentzsch - @anjeve
     Hasso-Plattner-Institute, Potsdam, Germany
     SMWCon Fall 2013, 2013/10/30
  2. Linked Data Principles
     Set of best practices for publishing structured data on the Web in accordance with the general architecture of the Web.
     1. Use URIs as names for things.
     2. Use HTTP URIs so that people can look up those names.
     3. When someone looks up a URI, provide useful RDF information.
     4. Include RDF statements that link to other URIs so that they can discover related things.
     Tim Berners-Lee, http://www.w3.org/DesignIssues/LinkedData.html, 2006
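Principles 2 and 3 can be illustrated with a small sketch of a URI lookup that asks for RDF through HTTP content negotiation. The request is only built, not sent. The expansion of dbpedia:Berlin to http://dbpedia.org/resource/Berlin and the choice of text/turtle as media type are assumptions made for this example, and `rdf_lookup_request` is a hypothetical helper name.

```python
from urllib.request import Request


def rdf_lookup_request(uri: str, media_type: str = "text/turtle") -> Request:
    """Build (but do not send) an HTTP GET that asks for RDF about `uri`."""
    # The Accept header tells the server that an RDF serialisation is wanted
    # instead of an HTML page.
    return Request(uri, headers={"Accept": media_type})


# Assumed resource URI for the Berlin example used later in the slides.
req = rdf_lookup_request("http://dbpedia.org/resource/Berlin")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual lookup; what comes back depends on the server.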
  3. Properties of the Web of Linked Data
     • Global, distributed dataspace built on a simple set of standards
       • RDF, URIs, HTTP
     • Entities are connected by links
       • creating a global data graph that spans data sources and enables the discovery of new data sources
     • Provides for data-coexistence
       • Everyone can publish data to the Web of Linked Data
       • Everyone can express their personal view on things
       • Everybody can use the vocabularies/schemas that they like
  4. W3C Linking Open Data Project [2007]
     • Grassroots community effort to
       • publish existing open license datasets as Linked Data on the Web
       • interlink things between different data sources
  5. LOD Data Sets on the Web: September 2011
     • 295 data sets
     • Over 31 billion RDF triples
     • Over 504 million RDF links between data sources
     http://lod-cloud.net
  6. LOD Data Set statistics
     LOD Cloud Data Catalog on the Data Hub
     • http://datahub.io/group/lodcloud
     More statistics
     • http://lod-cloud.net/state/
  7. DBpedia [2007]
     • DBpedia is a joint project with the following goals
       • extracting structured information from Wikipedia
       • publishing this information under an open license on the Web
       • setting links to other data sources
     • Partners
       • Universität Mannheim (Germany)
       • Universität Leipzig (Germany)
       • OpenLink Software (UK)
  8. Extracting structured data from Wikipedia
  9. Extracting structured data from Wikipedia
     dbpedia:Berlin rdf:type dbpedia-owl:City ,
                             dbpedia-owl:PopulatedPlace ,
                             dbpedia-owl:Place ;
                    rdfs:label "Berlin"@en , "Berlino"@it ;
                    dbpedia-owl:population 3499879 ;
                    wgs84:lat 52.500557 ;
                    wgs84:long 13.398889 .
     dbpedia:SoundCloud dbpedia-owl:location dbpedia:Berlin .
     Access to DBpedia data:
     • Dumps
     • SPARQL endpoint
     • Linked Data interface
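To make the graph structure of this example concrete, here is a minimal Python sketch that holds the slide's statements as (subject, predicate, object) tuples and answers simple triple patterns. The values are as read from the slide. The tuple representation and the `match` helper are illustrative assumptions, not part of DBpedia; a SPARQL endpoint evaluates comparable patterns on the server side.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, object]

# The statements from the slide, written as (subject, predicate, object).
# Language-tagged labels are stored as (text, language) pairs.
TRIPLES: List[Triple] = [
    ("dbpedia:Berlin", "rdf:type", "dbpedia-owl:City"),
    ("dbpedia:Berlin", "rdf:type", "dbpedia-owl:PopulatedPlace"),
    ("dbpedia:Berlin", "rdf:type", "dbpedia-owl:Place"),
    ("dbpedia:Berlin", "rdfs:label", ("Berlin", "en")),
    ("dbpedia:Berlin", "rdfs:label", ("Berlino", "it")),
    ("dbpedia:Berlin", "dbpedia-owl:population", 3499879),
    ("dbpedia:Berlin", "wgs84:lat", 52.500557),
    ("dbpedia:Berlin", "wgs84:long", 13.398889),
    ("dbpedia:SoundCloud", "dbpedia-owl:location", "dbpedia:Berlin"),
]


def match(s: Optional[str] = None, p: Optional[str] = None,
          o: object = None) -> List[Triple]:
    """Return all triples that agree with every position that is not None."""
    return [
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Which classes does Berlin belong to, and what is located in Berlin?
berlin_types = [o for _, _, o in match(s="dbpedia:Berlin", p="rdf:type")]
located_in_berlin = [s for s, _, _ in
                     match(p="dbpedia-owl:location", o="dbpedia:Berlin")]
print(berlin_types)
print(located_in_berlin)
```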
  10. The DBpedia Data Set
      Information on more than 4 million “things”
      • 832,000 persons
      • 209,000 organisations
      • 639,000 places
      • 116,000 music albums
      • 78,000 movies
      • 226,000 species
      • overall more than 2.4 billion RDF triples
      • localised versions in 119 languages
      • 24.6 million links to images
      • 27.6 million links to external web pages
      • 45 million links to other Linked Data sets
  11. DBpedia Use Cases
      1. Hub for the growing Web of Data
      2. Data source for applications and mashups
      3. Improvement of Wikipedia search
      4. Text analysis and annotation
  12. DBpedia Mobile
      • displays Wikipedia data on map
      • aggregates different data sources
  13. Faceted Wikipedia Search
      • faceted browsing and free text search
  14. http://spotlight.dbpedia.org
  15. DBpedia Information Extraction Framework (DIEF)
      Open source: http://github.com/dbpedia
      • More than 30 developers
      • Written in Scala & Java
      • Can be adapted to other MediaWikis
        • adaption to Wiktionary http://wiktionary.dbpedia.org
  16. DIEF Architecture
  17. DIEF
      Simple approach, huge generality
      • Inconsistency in property naming
        • Different infobox properties can have different names for the same meaning (e.g. born vs birth_date vs birthDate)
      • Inconsistency in property data types
        • Data types are determined by resource with a simple greedy algorithm
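The two inconsistencies named on this slide can be reproduced with a toy extractor. The following Python sketch is an assumption about what a "simple greedy" type guess could look like (try int, then float, then ISO date, else string); it is not DIEF's actual algorithm, and the infobox values are made up for illustration.

```python
from datetime import date


def guess_value(raw: str):
    """Greedy guess: the first parser that accepts the text wins."""
    text = raw.strip()
    for parse in (int, float, date.fromisoformat):
        try:
            return parse(text)
        except ValueError:
            continue
    return text  # nothing matched, keep the plain string


def generic_extract(infobox: dict) -> dict:
    """Keep every infobox property name as-is and guess a type per value."""
    return {name: guess_value(value) for name, value in infobox.items()}


# Two hypothetical person infoboxes that express the same fact.
a = generic_extract({"born": "1879-03-14"})
b = generic_extract({"birth_date": "14 March 1879"})
print(a)  # property "born", value parsed as a date
print(b)  # property "birth_date", value left as a string
```

The same fact ends up under two property names with two different data types, which is the problem the mapping-based extraction addresses.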
  18. Mapping-Based Infobox Extraction
      • Correct semantics
        • Combine what belongs together (birth_place, Geburtsort)
        • Divide what is different (born, Geburtsort)
      • Huge impact on precision & recall
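A minimal Python sketch of the idea: a lookup table sends infobox properties from different templates and languages to one ontology property, and keeps different meanings apart. The template names, the target property names and the decision to treat "born" as a date are assumptions chosen for illustration; the real mappings live in the Mappings Wiki.

```python
# (template name, infobox property) -> ontology property.
# All entries are illustrative assumptions, not copied from the wiki.
MAPPINGS = {
    ("en:Infobox person", "birth_place"): "dbpedia-owl:birthPlace",
    ("en:Infobox person", "born"): "dbpedia-owl:birthDate",
    ("de:Infobox Person", "Geburtsort"): "dbpedia-owl:birthPlace",
}


def map_infobox(template: str, infobox: dict) -> dict:
    """Translate infobox properties to ontology properties; drop unmapped ones."""
    result = {}
    for prop, value in infobox.items():
        target = MAPPINGS.get((template, prop))
        if target is not None:
            result[target] = value
    return result


english = map_infobox("en:Infobox person",
                      {"birth_place": "Ulm", "born": "1879-03-14"})
german = map_infobox("de:Infobox Person",
                     {"Geburtsort": "Ulm", "Spitzname": "unmapped"})
print(english)
print(german)
```

Both language editions now yield the same dbpedia-owl:birthPlace property, while "born" stays separate from "Geburtsort".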
  19. DBpedia Mappings Wiki
      • since March 2010 collaborative editing of
        • DBpedia ontology
        • mappings from Wikipedia infoboxes and tables to DBpedia ontology
      • curated in a public wiki with instant validation methods
        • http://mappings.dbpedia.org
      • multilingual mappings to the DBpedia ontology:
        • ar, bg, bn, ca, cs, de, el, en, es, et, eu, fr, ga, hi, hr, hu, it, ja, ko, nl, pl, pt, ru, sl, tr
      • allows for a significant increase of the extracted data’s quality
        • each domain has its experts
      • ~ 170 active editors
  20. DBpedia Mappings Wiki Details
      MediaWiki plus
      • Extensions for
        • validating mappings
        • storing and validating the ontology
      • Templates for
        • ontology definition
        • mapping infoboxes to the ontology
        • custom templates: date intervals, conditions, geo coordinates etc.
      • DBpedia Server
        • Ontology storage
        • Mapping validation
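What "validating mappings" against a stored ontology might involve can be sketched in a few lines of Python. This is a hypothetical check written for illustration, not the code of the wiki extensions or the DBpedia server: the ontology contents, the dictionary layout of a mapping and the function name `validate_mapping` are all assumptions.

```python
from typing import List

# A tiny stand-in for the stored ontology (assumed contents).
ONTOLOGY_CLASSES = {"Person", "Place", "PopulatedPlace", "City"}
ONTOLOGY_PROPERTIES = {"birthPlace", "birthDate", "population", "location"}


def validate_mapping(mapping: dict) -> List[str]:
    """Return a list of problems; an empty list means the mapping is valid."""
    errors = []
    if mapping["map_to_class"] not in ONTOLOGY_CLASSES:
        errors.append("unknown class: " + mapping["map_to_class"])
    for template_prop, ontology_prop in mapping["properties"].items():
        if ontology_prop not in ONTOLOGY_PROPERTIES:
            errors.append("unknown property for " + template_prop
                          + ": " + ontology_prop)
    return errors


good = {"map_to_class": "Person",
        "properties": {"birth_place": "birthPlace", "born": "birthDate"}}
bad = {"map_to_class": "Persn",
       "properties": {"birth_place": "birthplace"}}
print(validate_mapping(good))
print(validate_mapping(bad))
```

Running such a check on every edit is one way to give editors the instant feedback that a public, collaboratively edited wiki needs.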
  21. Classes and Properties
  22. Test Mappings
  23. Validate Mappings
  24. DBpedia 3.9 Mapping Statistics
      • 3,177 template mappings
      • 529 classes
      • 927 object properties
      • 1,290 datatype properties
      • 116 specialized datatype properties
      • 46 owl:equivalentClass and 31 owl:equivalentProperty mappings to http://schema.org
  25. DBpedia Mapping Edits
  26. DBpedia Mapping Coverage
  27. Google Summer of Code [2013]
      Mapping from DBpedia to Wikidata properties
      • Dump from Wikidata facts with mapped properties and datatypes
      • http://wiki.dbpedia.org/gsoc2013/ideas/WikidataMappings
  28. Ongoing & Future Work
      • Multilingual data integration and fusion
      • Community-driven data quality improvement
      • Inline extraction
      • DBpedia and NLP
        • structured background knowledge for e.g. named entity recognition and disambiguation
      • Collaboration between Wikidata and DBpedia
  29. Thanks!
      Email: anja@anjeve.de
      Twitter: @anjeve
      References:
      • DBpedia http://dbpedia.org
      • DBpedia Mappings Wiki http://mappings.dbpedia.org
      • LOD Cloud http://lod-cloud.net
      • LOD Data Set Catalogue http://www.datahub.io/group/lodcloud
