Semantic Data Box



  1. HPPS project work: Semantic Data Box
     G. Orsi, Computing Laboratory, Oxford University
     March 2, 2011
  2. Introduction
     Ontology-Based Data Access (OBDA)
       • A new type of database management system (DBMS) combining classic databases with advanced reasoning and query processing capabilities.
       • The marriage of ontologies and databases is a rich commercial opportunity:
           • Oracle Spatial 11g Semantic Technologies
           • IBM IODT
           • Data-Grid Inc.
           • Ontotext BigOWLIM
       • and a source of many research papers and systems:
           • QuOnto
           • Pellet
           • Presto
           • Requiem
           • IRIS±
  3. Databases and constraints
     Query answering under constraints
       • In OBDA, an extensional relational database D is combined with a set of ontological constraints Σ.
       • A query Q is then answered against D ∪ Σ instead of D only.
     The chase
       • Answering a query Q over D ∪ Σ has been shown to be equivalent to answering Q over the chase expansion of D w.r.t. Σ.
       • The expansion chase(D, Σ) can be obtained through the well-known chase procedure.
       • Whenever the chase procedure does not fail, chase(D, Σ) is a universal model for D ∪ Σ and can be used to answer Q.
  4. Example: the chase
     Weakly-acyclic constraints
       R: {teaches, professor, student}
       Σ: σ1 : professor(X) → ∃Z teaches(X,Z)
          σ2 : teaches(X,Y) → student(Y)
       Q: q(A) ← teaches(A,B), student(B)
       D: professor(santa)
          teaches(giorgio,cheng)
     Chase expansion
       D: professor(santa)
          teaches(giorgio,cheng)
       Q = ∅
       apply σ1 : teaches(santa,z0)
       apply σ2 : student(z0)
                  student(cheng)
       Q = {santa,giorgio}
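The chase on this slide can be sketched in a few lines of Python. This is a minimal illustration and not the project's implementation: the two constraints are hard-coded, facts are plain tuples, and labelled nulls are written as strings with a `_:` prefix (a naming convention assumed here, so `z0` on the slide appears as `_:z0`).

```python
from itertools import count

def is_null(term):
    # Assumption of this sketch: labelled nulls are strings prefixed with "_:".
    return term.startswith("_:")

def chase(db):
    """Apply sigma1 and sigma2 from the slide until no new fact is produced."""
    facts = set(db)
    fresh = count()  # supplies indices for the nulls _:z0, _:z1, ...
    changed = True
    while changed:
        changed = False
        for pred, *args in sorted(facts):  # sorted() copies, so adding facts is safe
            if pred == "professor":
                x = args[0]
                # sigma1: professor(X) -> exists Z teaches(X, Z)
                # Fire only if X has no teaches-witness yet.
                if not any(f[0] == "teaches" and f[1] == x for f in facts):
                    facts.add(("teaches", x, f"_:z{next(fresh)}"))
                    changed = True
            elif pred == "teaches":
                # sigma2: teaches(X, Y) -> student(Y)
                new_fact = ("student", args[1])
                if new_fact not in facts:
                    facts.add(new_fact)
                    changed = True
    return facts

def answer(facts):
    """q(A) <- teaches(A, B), student(B); answers that are nulls are dropped."""
    students = {f[1] for f in facts if f[0] == "student"}
    return {f[1] for f in facts
            if f[0] == "teaches" and f[2] in students and not is_null(f[1])}

D = {("professor", "santa"), ("teaches", "giorgio", "cheng")}
print(answer(D))         # query over D alone
print(answer(chase(D)))  # query over the chase expansion
```

Over D alone the query has no answers, because D contains no student facts; over the expansion both santa and giorgio qualify, matching the slide.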
  5. Databases and constraints (cont.)
     So... where is the problem?
     Critical issue: chase(D, Σ) might be infinite, also for a very simple Σ.
       R: {person, father}
       Σ: σ1 : person(X) → ∃Y father(Y,X), person(Y)
       Q: q(A) ← person(A)
       D: person(santa)
          father(roberto,giorgio)
     Chase expansion
       D: person(santa)
          father(roberto,giorgio)
       Q = person(santa)
       apply σ1 : person(roberto)
                  father(z0,santa)
                  father(z1,roberto)
                  person(z0)
                  person(z1)
                  father(z2,z0)
                  father(z3,z1)
                  ...
       Q = ?
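To see why this chase never terminates, the sketch below applies σ1 exactly as written for a fixed number of rounds and records how many facts exist after each round. It is an illustration under assumptions: the function name `chase_rounds`, the tuple encoding of facts, and the `_:z` naming of nulls are all hypothetical, and only σ1 is applied.

```python
def chase_rounds(db, k):
    """Apply sigma1: person(X) -> exists Y father(Y, X), person(Y) for k rounds.

    Every person fact fires once; returns the number of facts after each round.
    """
    facts = set(db)
    # Persons whose father has not been invented yet.
    frontier = [f[1] for f in sorted(facts) if f[0] == "person"]
    sizes = [len(facts)]
    fresh = 0
    for _ in range(k):
        next_frontier = []
        for x in frontier:
            y = f"_:z{fresh}"  # fresh labelled null standing for the unknown father
            fresh += 1
            facts.add(("father", y, x))
            facts.add(("person", y))
            next_frontier.append(y)  # the new person needs a father too
        frontier = next_frontier
        sizes.append(len(facts))
    return sizes

D = {("person", "santa"), ("father", "roberto", "giorgio")}
print(chase_rounds(D, 5))  # [2, 4, 6, 8, 10, 12]
```

Each round invents a new person, who in turn requires a father in the next round, so the fact count keeps growing for any k.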
  6. Answering queries when the chase is infinite
     The bounded derivation depth (BDD) property
       • Johnson and Klug proved that queries can be answered even when the chase is infinite, provided that all the needed tuples are produced in a finite initial fragment of chase(D, Σ).
       • Query answering under a set Σ enjoys the BDD property if, given a query Q, it is possible to compute the chase up to k steps (chase^k(D, Σ)) such that:
           • chase^k(D, Σ) |= Q iff chase(D, Σ) |= Q iff D ∪ Σ |= Q
           • k depends only on Q and Σ.
       • All the Σs with the BDD property also enjoy the property of FO-rewritability.
     The first-order rewritability property
       • A set of constraints Σ is first-order rewritable (FO-rewritable) if, given a query Q, there exists a first-order query QΣ (the reformulation of Q) such that D ∪ Σ |= Q iff D |= QΣ for any database D.
       • The reformulated query does not depend on the size of D!
       • Answer QΣ over D, instead of Q over chase(D, Σ).
  7. Chase vs Rewriting
     Example
       R: {person, father}
       Σ: σ1 : person(X) → ∃Y father(Y,X), person(Y)
       Q: q(A) ← person(A)
       D: person(santa)
          father(roberto,giorgio)
     Rewriting
       D: person(santa)
          father(roberto,giorgio)
       Q = person(santa)
       use σ1 : q(A) ← person(A)
                q(A) ← father(A,B)
       Q = {santa,roberto}
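The point of the rewriting is that it runs directly on D, with no chase. A minimal sketch, assuming facts are encoded as tuples and the two conjunctive queries from the slide are written by hand (the function names are hypothetical):

```python
D = {("person", "santa"), ("father", "roberto", "giorgio")}

def cq_person(db):
    # q(A) <- person(A)
    return {f[1] for f in db if f[0] == "person"}

def cq_father(db):
    # q(A) <- father(A, B)
    return {f[1] for f in db if f[0] == "father"}

def answer_rewriting(db):
    # The rewriting is the union of the two conjunctive queries.
    return cq_person(db) | cq_father(db)

print(answer_rewriting(D))
```

Both queries are evaluated over the original two facts only, so the cost depends on D and the size of the rewriting, not on how far a chase would have to run.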
  8. The hard(ware) part...
     Size of the rewriting
       • QΣ has the form of a union of conjunctive queries (UCQ), i.e.,
           select a,b from R1,R2 where cond1
           UNION
           select a,b from R1,R2 where cond2
           UNION
           ...
           UNION
           select a,b from R1,R2 where condn
       • Johnson and Klug forgot to mention that the number of queries in QΣ is exponential in the number of atoms of Q and in the size of Σ.
       • We have two ways to go:
           1. generate another form of rewriting instead of UCQs (e.g., Datalog queries) to eliminate at least the exponentiality in Σ → no more good properties of UCQs :-(
           2. go brute-force in hardware :-)
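A rough way to see the exponential growth in the number of atoms is the simplified model below. It assumes that each atom of Q can be replaced independently by any one of a fixed list of alternatives; real rewriting algorithms also unify and prune, so this only illustrates the combinatorics. The atom names `a0`, `b0`, ... are placeholders.

```python
from itertools import product

def expand(alternatives_per_atom):
    """Build one conjunctive query per combination of per-atom alternatives."""
    return [" AND ".join(choice) for choice in product(*alternatives_per_atom)]

def ucq_size(num_atoms, alternatives=2):
    """Number of conjunctive queries when every atom has the same number of alternatives."""
    atoms = [[f"a{i}", f"b{i}"][:alternatives] for i in range(num_atoms)]
    return len(expand(atoms))

for n in (1, 2, 4, 8, 16):
    print(n, ucq_size(n))  # the size doubles with every extra atom
```

With two alternatives per atom, a 16-atom query already yields 65536 conjunctive queries in this model, which is the blow-up the slide refers to.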
  9. The hard(ware) part...
     In-hardware databases (e.g., Glacier)
       • Starting from a query Q and a set of constraints Σ:
       • reformulate the query in software and generate a query plan
       • synthesize the query plan in hardware
       • execute the queries in hardware
     Figure: Glacier: A query-to-hardware compiler (SIGMOD 2010)
  10. Problems
      Reconfigurability
        • Current approaches implement the general SQL operators in hardware.
        • If we synthesize the reformulation, we go faster than ever!
        • but...
        • every time the query changes, the reformulation changes
        • does reconfigurability help?