
PAGOdA Presentation

PAGOdA (Pay-as-you-go OWL Query Answering Using a Triple Store) presentation by Bernardo Cuenca Grau

Abstract: We present an enhanced hybrid approach to OWL query answering that combines an RDF triple-store with an OWL reasoner in order to provide scalable pay-as-you-go performance. The enhancements presented here include an extension to deal with arbitrary OWL ontologies, and optimisations that significantly improve scalability. We have implemented these techniques in a prototype system, a preliminary evaluation of which has produced very encouraging results.

Published in: Technology

  1. 1. Pay-as-you-go Query Answering with PAGOdA (Bernardo Cuenca Grau)
  2. 2. Ontology-mediated Query Answering
     [Figure labels: query Q; ontology nodes A, C, T, B, D; RDF data nodes a, b]
     • (Meta)data is published in RDF
     • RDF resources reference an OWL 2 ontology
     • The ontology describes the meaning of the data
     RDF and OWL 2 are well established:
     • Thousands of OWL 2 ontologies are available
     • RDF is ubiquitous on the Web
  3. 3. Ontology-mediated Query Answering
     Ontology languages offer a wide range of modeling constructs.
     High expressive power → high worst-case complexity of reasoning.
     How can we provide scalable query answering?
     • Restrict the ontology to a lightweight fragment of OWL: the EL, QL or RL profiles
     • Tolerate incompleteness
     • Rely on highly optimised pay-as-you-go systems:
       • Worst-case optimal for lightweight fragments
       • Rapidly compute easy answers
       • Performance degrades gracefully on harder instances
  4. 4. Datalog and the OWL 2 Profiles
     Datalog is the quintessential rule-based KR language:
     • Reasoning is typically implemented via materialisation
     • Our in-house system RDFox shows excellent performance
     Query answering within the OWL 2 profiles:
     • RL ontologies are equivalent to Datalog programs
     • EL and QL ontologies can be strengthened using Datalog
     Query answering requires an additional filtration step.
  5. 5. Incomplete Reasoning
     • RL / EL reasoning w.r.t. an arbitrary OWL ontology O, dataset D and query q gives (in general) an incomplete answer L:
       $L = \mathrm{cert}(q, \langle O', D \rangle) \subseteq \mathrm{cert}(q, \langle O, D \rangle)$ with $O \models O'$
     ✓ Profile-specific reasoning via Datalog is (relatively) scalable
     ✗ Answers may be incomplete
     ✗ The degree of incompleteness is unknown
     ✗ Incompleteness may be pathological (empty answers)
  6. 6. The idea behind PAGOdA
     Redistribute the reasoning workload between a Datalog reasoner and a fully-fledged OWL 2 reasoner:
     • Resort to expensive OWL 2 reasoning as little as possible (if at all)
     • Ensure sound and complete answers
     • Do not restrict the ontology language
     [Figure labels: Datalog reasoner, OWL 2 reasoner]
  7. 7. Step 1: Lower and Upper Bounds
     [Figure labels: Ontology, Data, Query, Datalog Engine (twice), ELHO Lower, Lower, Upper]
     Profile-specific reasoning via Datalog gives a lower bound: L is a subset of $\mathrm{cert}(q, \langle O, D \rangle)$.
     We transform O into a strictly stronger Datalog ontology $O^u$:
     • Normalise the ontology into $\text{Datalog}^{\pm,\vee}$ rules
     • Eliminate ∨ by transforming it to ∧
     • Replace existential variables with Skolem constants
     Datalog reasoning w.r.t. $O^u$ gives an upper bound answer U:
       $\mathrm{cert}(q, \langle O, D \rangle) \subseteq \mathrm{cert}(q, \langle O^u, D \rangle) = U$
  8. 8. Step 2: Module extraction
     [Figure labels: Datalog Engine, U, D, Fragment]
     Checking the possible answers in $U \setminus L$ is expensive.
     Compute a fragment of the ontology + data that is sufficient to check each answer in $U \setminus L$:
     • Fragment computation involves proof tracing in $O^u$
     • This is also achieved using Datalog materialisation
     Relevant fragments are typically much smaller, so the size of the problem is substantially reduced.
  9. 9. Step 3: Summarisation
     [Figure labels: Fragment, Summarisation, Summary, Full Reasoner, Q]
     Further reduce the problem size by summarising the fragment:
     • Technique introduced by the SHER team at IBM
     • "Merge" constants that are instances of the same concepts
     • Check answers against the summary using an OWL 2 reasoner
     • The summary of the fragment is typically very small
     This is an over-approximation orthogonal to the previous ones.
     We further reduce the size of $U \setminus L$; sometimes we even make it empty!
  10. 10. Step 4: Dependency analysis
      [Figure labels: F, Dependency Analysis, Full Reasoner, Q, Output]
      Group the remaining candidate answers:
      • If a and b are in the same group, then a is an answer iff b is
      • We can also establish dependencies between groups
      Check group representatives against the fragment using the fully-fledged reasoner.
  11. 11. Features of PAGOdA
      PAGOdA provides pay-as-you-go (PAYG) query answering for OWL 2:
      • Uses a Datalog reasoner "out of the box"
      • Efficiently computes sound partial answers
      • In "easy" cases, efficiently computes complete answers
      • In "harder" cases, applies increasingly powerful but less scalable reasoning techniques as needed to completely answer the query
      • The last step, involving the full reasoner, is rarely needed in practice
      • Recent improvements:
        • Better and better upper bounds
        • Smaller and smaller modules
  12. 12. Queries answered by each technique

               LUBM  UOBM  FLY  DBPedia  NPD
      Total      24    15    6      441  329
      Bounds     22    12    5      439  326
      Sum        22    14    5      440  329
      Full       24    15    6      441  329

      Scalability for lower and upper bound computation

                 Importing  Lower Mat  Upper Mat  Ave QA
      LUBM1000        313s       190s       269s     12s
      UOBM500         356s       346s       734s      4s
  13. 13. Queries that require full reasoning

                     Lower  Upper  Gap  Sum  Groups
      LUBM100_q20        0     26   26   26       1
      LUBM100_q22        0     14   14   14       1
      UOBM1_q14       6271   6535  264  264       1
      FLY_q5             0    344  344  344       1
      DBPedia_q404       0      2    2    2       1
  14. 14. Time distribution and fragment size

                    Lower  Upper   Frag   Size (%)   Sum    Full
      LUBM100_q20    0.2s   0.3s  14.5s  .005/.04   1.2s  190.1s
      LUBM100_q22    0.3s   0.2s  10.0s  .005/.04   0.8s   46.1s
      UOBM1_q14      0.1s   0.1s   0.7s  .17/.076   0.5s    5.4s
      FLY_q5         0.0s   0.0s  16.0s   .34/.01   0.1s    0.2s
  15. 15. PAGOdA Team
      • Yujiao Zhou
      • Yavor Nenov
      • Bernardo Cuenca Grau
      • Ian Horrocks
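
The way Steps 1 and 4 fit together can be illustrated with a small Python sketch. It is not PAGOdA's implementation: the function `pay_as_you_go_answers`, the `full_check` callback (standing in for the fully-fledged OWL 2 reasoner) and the `group_of` callback (standing in for dependency analysis) are all hypothetical names introduced here. The sketch assumes the lower bound L and upper bound U have already been computed, and that members of one group are either all answers or all non-answers, as the Step 4 slide states.

```python
from collections import defaultdict
from typing import Callable, Hashable, Iterable, Set, Tuple


def pay_as_you_go_answers(
    lower: Iterable[Hashable],
    upper: Iterable[Hashable],
    full_check: Callable[[Hashable], bool],
    group_of: Callable[[Hashable], Hashable] = lambda a: a,
) -> Tuple[Set[Hashable], int]:
    """Return (answers, number of calls made to the expensive checker).

    lower, upper: precomputed bounds L and U with L a subset of U.
    full_check:   hypothetical stand-in for the full OWL 2 reasoner.
    group_of:     hypothetical stand-in for dependency analysis; by default
                  every candidate forms its own group.
    """
    lower_set, upper_set = set(lower), set(upper)
    if not lower_set <= upper_set:
        raise ValueError("the lower bound must be a subset of the upper bound")

    # Only candidates in the gap U \ L need the expensive check.
    gap = upper_set - lower_set

    # Group the gap candidates so that one check decides a whole group.
    groups = defaultdict(set)
    for candidate in gap:
        groups[group_of(candidate)].add(candidate)

    answers = set(lower_set)  # everything in L is already a sound answer
    calls = 0
    for members in groups.values():
        representative = min(members, key=repr)  # deterministic choice
        calls += 1
        if full_check(representative):
            answers |= members
    return answers, calls


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    answers, calls = pay_as_you_go_answers(
        lower={"a", "b"},
        upper={"a", "b", "c1", "c2", "d"},
        full_check=lambda x: x.startswith("c"),
        group_of=lambda x: x[0],
    )
    print(sorted(answers), calls)  # prints ['a', 'b', 'c1', 'c2'] 2
```

When L and U coincide the gap is empty and the expensive checker is never called, which is the "easy" case on the features slide. In the table of queries that require full reasoning, UOBM1_q14 has a gap of 6535 − 6271 = 264 candidates that fall into a single group.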
