20150209 improving the_d_bpedia_ontology_v2


What could be made better with regard to DBpedia's ontology?


  1. Advancing the DBpedia Ontology. Presentation, DBpedia Community Meeting, Trinity College, Dublin, February 9, 2015. Gerard Kuys
  2. What is the problem with the DBpedia ontology?
     • Grown out of the need to accommodate extracted resources and to type them
     • Basic structure organised mostly along the dimensions:
        • Agents, Roles and Activities
        • Place, but not Time
           • Someone’s working area reduced to the spatial dimension
        • Instruments (Device and the like)
        • Works
     • As an organised collection of Types, the DBpedia ontology is more a partial and rather haphazard taxonomy than a consistent encyclopedic description of knowledge domains
     • Editors and mappers add classes and properties for single-purpose actions, and for single-language mappings (if for mappings at all)
     • DBpedia is basically an extraction mechanism; however, being also a hub in the Linked Data cloud, it could assume functions that can no longer rely on just extracting Wikipedia data
  3. A statement of flaws
  4. What is definitely NOT the problem with the DBpedia ontology?
     • Being a Wikipedia wrapping mechanism, it represents a common understanding of things that a community deems relevant to document
     • It allows for all kinds of variations over language editions
     • It allows for enriching language editions with information from other language editions
  5. What should be the course of action?
     • The DBpedia Ontology should remain a community-driven structure for encyclopedic knowledge
     • However, we want more, and better, structure for the ontology
     • At the same time, we do not want to impose a canonical model of the world
     • Nor do we want to disrupt existing mappings
     • Therefore, we should aim at a process of gradually improving the existing framework and offer non-intrusive ways of ‘remapping mappings’
     • This might break some eggs, but, please, not all eggs at the same time ;-)
     As a basis for every course of action, we should take stock of DBpedia classes and how they are being used across language editions:
     • How many classes have no individuals?
     • Which classes were added for reference, not because extracted notions need to be accommodated, but because the DBpedia ontology can be used as a kind of 'universal reference'?
     • Use tools like RDFUnit to bring to light obvious deficiencies
        • And indeed embark on a process of quality improvement
     • Prepare a set of guidelines on how to use and how to extend the DBpedia ontology
  6. Approaches suggested so far
     Procedure: a DBpedia Committee responsible for improvement of the DBpedia Ontology, monitoring changes and publisher guidelines (Dimitris Kontokostas)
     Style: open, bottom-up collaborative ontology development (Agnieszka Ławrynowicz)
     Actions to be taken:
     • Mapping to upper ontologies, implementing more of owl:disjointWith (Daniel Fleischhacker)
     • Pruning the classes’ tree, implementing corrections found by the RDFUnit tool (Peter Patel-Schneider, Dimitris Kontokostas)
     • Using the Universal Decimal Classification with its auxiliary tables as a ‘connected structural backbone’ (Gerard Kuys)
  7. Connecting fields of knowledge
     • Is it really necessary to pull external mappings into the DBpedia ontology?
     • If not, how best to connect? (More than one answer allowed)
     • How about using the wealth of links in Wikidata? We’re connected already!
     • In my view, DBpedia should remain a stronghold of the encyclopedic approach:
        • Not very deep, not the newest details and insights
        • Instead, document the connection to adjacent phenomena, and the nature of that connection
     • How about a matrix with ‘objects of knowledge’ as one dimension, and ‘disciplines of knowledge’ as the other? This type of approach is what makes the UDC so interesting!
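
The stock-taking proposed on slide 5 (classes with no individuals) and the owl:disjointWith action on slide 6 could be prototyped roughly as follows. This is only a minimal sketch over a hand-written list of triples, not how RDFUnit or the DBpedia extraction framework works. The toy data, the resource dbr:Example_Thing, the disjointness axiom between dbo:Person and dbo:Place, and both function names are assumptions made for illustration. Subclass inheritance is deliberately ignored.

```python
RDF_TYPE = "rdf:type"
OWL_CLASS = "owl:Class"
OWL_DISJOINT = "owl:disjointWith"

# Hypothetical toy data, written by hand for illustration (not extracted from DBpedia).
triples = [
    ("dbo:Person", RDF_TYPE, OWL_CLASS),
    ("dbo:Place", RDF_TYPE, OWL_CLASS),
    ("dbo:Device", RDF_TYPE, OWL_CLASS),
    ("dbo:Person", OWL_DISJOINT, "dbo:Place"),  # assumed axiom
    ("dbr:Dublin", RDF_TYPE, "dbo:Place"),
    ("dbr:James_Joyce", RDF_TYPE, "dbo:Person"),
    ("dbr:Example_Thing", RDF_TYPE, "dbo:Person"),
    ("dbr:Example_Thing", RDF_TYPE, "dbo:Place"),
]


def classes_without_individuals(triples):
    """Return declared classes that no resource is directly typed with."""
    declared = {s for s, p, o in triples if p == RDF_TYPE and o == OWL_CLASS}
    used = {o for s, p, o in triples if p == RDF_TYPE and o != OWL_CLASS}
    return sorted(declared - used)


def disjointness_violations(triples):
    """Return (individual, class_a, class_b) where both classes are declared disjoint."""
    types_of = {}
    for s, p, o in triples:
        if p == RDF_TYPE and o != OWL_CLASS:
            types_of.setdefault(s, set()).add(o)
    disjoint_pairs = [(s, o) for s, p, o in triples if p == OWL_DISJOINT]
    return sorted(
        (individual, a, b)
        for individual, types in types_of.items()
        for a, b in disjoint_pairs
        if a in types and b in types
    )


empty_classes = classes_without_individuals(triples)
violations = disjointness_violations(triples)
print(empty_classes)
print(violations)
```

On real data the same two questions would more plausibly be asked per language edition, so that a class unused in one edition but populated in another is not pruned by mistake.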