Improving the Performance of the DL-Learner SPARQL Component for Semantic Web Applications

Presentation at JIST 2012. I forgot to add a link to http://en.wikipedia.org/wiki/Knowledge_extraction, which I mentioned during the presentation because some of the output of the approaches listed there would be compatible with SPARQL.

  1. Creating Knowledge out of Interlinked Data (JIST 2012 – Page 1, http://lod2.eu)
     Improving the Performance of the DL-Learner SPARQL Component for Semantic Web Applications
     Didier Cherix, Sebastian Hellmann, Jens Lehmann
     http://slideshare.net/kurzum http://dl-learner.org http://lod2.eu
     AKSW, Universität Leipzig
     LOD2 Presentation . 02.09.2010 . http://lod2.eu
  2. Motivation: 2007 - 2012
     • DL-Learner has been developed in parallel to DBpedia at Universität Leipzig since 2007.
     • DL-Learner is a tool for learning concepts in Description Logics (DLs) from user-provided examples.
     • It worked very well for small to medium-sized data sets, e.g. Carcinogenesis and other ML problems from the UCI ML repository.
     • The limit is the capacity of current OWL-DL reasoners.
     • The challenge was (and is) to do reasoning-based, supervised machine learning on the DBpedia dataset (> 200 million triples) or larger datasets.
  3. Introduction DL-Learner
  4. Introduction DL-Learner: very large search space; reasoner instance checks
  5. Introduction DL-Learner
  6. Introduction DL-Learner
     DL-Learner relies heavily on instance checks for machine learning, so the OWL reasoner is the bottleneck.
     Underlying idea: select only the data relevant to the machine learning problem, based on user-given examples.
     → Reduces the number of triples that have to be given to a reasoner
     → Reduces the complexity and size of the OWL schema
     Brute-force approach: load all data into the OWL reasoner, then do instance checks.
     → Infeasible for DBpedia
     Iterative approach (old component): iterate over all instances and fetch the data recursively.
     → Inefficient even with caching
  7. Introduction DL-Learner
  8. Introduction DL-Learner
  9. Introduction DL-Learner
  10. Introduction DL-Learner
  11. Introduction DL-Learner
  12. Introduction DL-Learner
      Challenge: What is the most efficient way to retrieve such a fragment?
  13. Improvements of the New Component
      • Step 1: Indexing the T-Box
        • Download the OWL schema and index it in memory
        • either via SPARQL or from an OWL file
  14. Improvements of the New Component
      • Step 2: A-Box queries
        Parameter recursion depth: retrieve newly discovered bindings to ?o until a certain depth is reached.
  15. Improvements of the New Component
      • Step 3: Typing the retrieved instances
  16. Improvements of the New Component
      • Step 4: T-Box index: all "relevant" T-Box information is added to the fragment via the index. For each class already in the fragment, all superclasses and their equivalentClass axioms are added.
  17. Benchmarking - Speed
      For each class in the DBpedia Ontology:
      - 30 instances as positives
      - 30 negatives from a sister class
  18. Benchmarking - F-Measure on the training data
      70% of the results for each class had an F-measure of 90-100% on the training data.
  19. SPARQL Retrieval Component Impact
      • DL-Learner - http://dl-learner.org
      • DBpedia Navigator
      • Tiger Corpus Navigator
      • AutoSPARQL - http://autosparql.dl-learner.org/
      • HANNE - http://hanne.aksw.org
      • ORE - http://aksw.org/Projects/ORE
      Sebastian Hellmann, Jens Lehmann and Sören Auer: Learning of OWL Class Descriptions on Very Large Knowledge Bases. In: International Journal on Semantic Web and Information Systems, 2009
      Web Applications
      Active Learning → User Interaction and Feedback
  20. Future Work
      • Research paper in Session 4b (tomorrow at 15:10): Navigation-induced Knowledge Engineering by Example
      • Caching + more sophisticated options
      • Large-scale learning problems
      http://slideshare.net/kurzum
      Homepage: http://dl-learner.org
      Source code: http://sourceforge.net/projects/dl-learner/
  21. Example
      Sebastian Hellmann, Jens Lehmann, Jörg Unbehauen, Claus Stadler, Thanh Nghia Lam and Markus Strohmaier: Navigation-induced Knowledge Engineering by Example. In: JIST 2012
  22. Example
      Sebastian Hellmann, Jens Lehmann and Sören Auer: Learning of OWL Class Descriptions on Very Large Knowledge Bases. In: International Journal on Semantic Web and Information Systems, 2009
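
Steps 1 to 4 of the new component (slides 13 to 16) can be sketched roughly as follows. This is only an illustration under stated assumptions, not DL-Learner's actual code: the function names (index_tbox, extract_fragment), the prefixed-name strings, and the toy data are all hypothetical, and an in-memory list of triples stands in for the SPARQL endpoint. The sketch also assumes that a recursion depth of 1 means "fetch only the triples of the examples themselves"; the slides do not define this precisely.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"
EQUIVALENT_CLASS = "owl:equivalentClass"


def index_tbox(schema):
    """Step 1: index the schema in memory (class -> superclasses / equivalents)."""
    supers, equivalents = defaultdict(set), defaultdict(set)
    for s, p, o in schema:
        if p == SUBCLASS_OF:
            supers[s].add(o)
        elif p == EQUIVALENT_CLASS:
            equivalents[s].add(o)
    return supers, equivalents


def extract_fragment(examples, abox, schema, depth=1):
    """Return the set of triples judged relevant for the given example instances."""
    supers, equivalents = index_tbox(schema)

    # Stand-in for the endpoint: group A-Box triples by subject,
    # keeping rdf:type statements apart from all other properties.
    outgoing, types = defaultdict(set), defaultdict(set)
    for s, p, o in abox:
        (types if p == RDF_TYPE else outgoing)[s].add((p, o))

    fragment = set()
    instances = set(examples)
    frontier = set(examples)

    # Step 2: follow ?s ?p ?o from the examples; each round continues from
    # the newly discovered bindings of ?o (assumed meaning of "depth").
    for _ in range(depth):
        discovered = set()
        for s in frontier:
            for p, o in outgoing.get(s, ()):
                fragment.add((s, p, o))
                if o not in instances:
                    discovered.add(o)
        instances |= discovered
        frontier = discovered

    # Step 3: type every retrieved instance.
    classes = set()
    for inst in instances:
        for _, cls in types.get(inst, ()):
            fragment.add((inst, RDF_TYPE, cls))
            classes.add(cls)

    # Step 4: walk the T-Box index upwards from the classes in the fragment,
    # adding superclass and equivalentClass axioms of every class visited.
    todo, visited = list(classes), set()
    while todo:
        cls = todo.pop()
        if cls in visited:
            continue
        visited.add(cls)
        for sup in supers.get(cls, ()):
            fragment.add((cls, SUBCLASS_OF, sup))
            todo.append(sup)
        for eq in equivalents.get(cls, ()):
            fragment.add((cls, EQUIVALENT_CLASS, eq))
    return fragment


# Toy data, invented for illustration only.
abox = [
    (":Leipzig", ":country", ":Germany"),
    (":Leipzig", RDF_TYPE, ":City"),
    (":Germany", ":capital", ":Berlin"),
    (":Germany", RDF_TYPE, ":Country"),
    (":Berlin", RDF_TYPE, ":City"),
    (":Paris", RDF_TYPE, ":City"),
]
schema = [
    (":City", SUBCLASS_OF, ":Place"),
    (":Country", SUBCLASS_OF, ":Place"),
    (":Place", EQUIVALENT_CLASS, ":Location"),
    (":Person", SUBCLASS_OF, ":Agent"),
]

fragment = extract_fragment([":Leipzig"], abox, schema, depth=1)
for triple in sorted(fragment):
    print(triple)
```

With the toy data, :Paris and the :Person axiom never enter the fragment because nothing reachable from the example refers to them. This is the point of the approach: the reasoner only sees triples and schema axioms connected to the user-given examples.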
