Semantic Data Box



  1. 1. HPPS project work: Semantic Data Box. G. Orsi, Computing Laboratory, Oxford University. March 2, 2011
  2. 2. Introduction: Ontology-Based Data Access (OBDA)
     • A new type of database management system (DBMS) combining classic databases with advanced reasoning and query processing capabilities.
     • The marriage of ontologies and databases is a rich commercial opportunity:
       • Oracle Spatial 11g Semantic Technologies
       • IBM IODT
       • Data-Grid Inc.
       • Ontotext BigOWLIM
     • and the source of many research papers and systems:
       • QuOnto
       • Pellet
       • Presto
       • Requiem
       • IRIS±
  3. 3. Databases and constraints
     Query answering under constraints
     • In OBDA, an extensional relational database D is combined with a set of ontological constraints Σ.
     • A query Q is then answered against D ∪ Σ instead of D alone.
     The chase
     • Answering a query Q over D ∪ Σ has been shown to be equivalent to answering Q over the chase expansion of D w.r.t. Σ.
     • The expansion chase(D, Σ) can be obtained through the well-known chase procedure.
     • Whenever the chase procedure does not fail, chase(D, Σ) is a universal model of D ∪ Σ and can be used to answer Q.
  4. 4. Example: the chase (weakly-acyclic constraints)
     R: {teaches, professor, student}
     D: professor(santa), teaches(giorgio,cheng)
     Σ: σ1 : professor(X) → ∃Z teaches(X,Z)
        σ2 : teaches(X,Y) → student(Y)
     Q: q(A) ← teaches(A,B), student(B)
     • Over D alone: Q = ∅
     • Chase expansion:
       • apply σ1 : teaches(santa,z0)
       • apply σ2 : student(z0), student(cheng)
     • Over the chase expansion: Q = {santa, giorgio}
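
The chase example above can be traced in a few lines of Python. This is only a minimal sketch under assumptions not taken from the slides: facts are encoded as tuples, labelled nulls are strings prefixed with "_" (so z0 becomes "_z0"), and σ1 and σ2 are hard-coded rather than parsed from a rule language.

```python
from itertools import count

# Database D from the example, encoded as tuples (an assumed encoding).
D = {("professor", "santa"), ("teaches", "giorgio", "cheng")}

def chase(db):
    """Apply sigma1 and sigma2 until no new fact can be derived."""
    facts = set(db)
    fresh = (f"_z{i}" for i in count())  # generator of labelled nulls
    changed = True
    while changed:
        changed = False
        for fact in list(facts):  # iterate over a copy, facts grows below
            if fact[0] == "professor":
                # sigma1: professor(X) -> exists Z. teaches(X, Z)
                x = fact[1]
                if not any(f[0] == "teaches" and f[1] == x for f in facts):
                    facts.add(("teaches", x, next(fresh)))
                    changed = True
            elif fact[0] == "teaches":
                # sigma2: teaches(X, Y) -> student(Y)
                new = ("student", fact[2])
                if new not in facts:
                    facts.add(new)
                    changed = True
    return facts

def answers(facts):
    """q(A) <- teaches(A, B), student(B); nulls are not returned as answers."""
    students = {f[1] for f in facts if f[0] == "student"}
    return {f[1] for f in facts
            if f[0] == "teaches" and f[2] in students
            and not f[1].startswith("_")}

print(answers(D))         # query over D alone
print(answers(chase(D)))  # query over the chase expansion
```

The existential check in σ1 makes this a restricted chase: a null is invented only when the professor has no teaches fact yet.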
  5. 5. Databases and constraints (cont.): so... where is the problem?
     • Critical issue: chase(D, Σ) might be infinite, even for a very simple Σ.
     R: {person, father}
     D: person(santa), father(roberto,giorgio)
     Σ: σ1 : person(X) → ∃Y father(Y,X), person(Y)
     Q: q(A) ← person(A)
     • Over D alone: Q = person(santa)
     • Chase expansion (apply σ1): person(roberto), father(z0,santa), father(z1,roberto), person(z0), person(z1), father(z2,z0), father(z3,z1), ...
     • Over the chase expansion: Q = ?
  6. 6. Answering queries when the chase is infinite
     The bounded derivation depth (BDD) property
     • Johnson and Klug proved that queries can be answered even when the chase is infinite, provided that all the needed tuples are produced in a finite initial fragment of chase(D, Σ).
     • Query answering under a set Σ enjoys the BDD property if, given a query Q, it is possible to compute the chase up to k steps (chase_k(D, Σ)) such that:
       • chase_k(D, Σ) |= Q iff chase(D, Σ) |= Q iff D ∪ Σ |= Q
       • k depends only on Q and Σ.
     • Every Σ with the BDD property is also FO-rewritable.
     The first-order rewritability property
     • A set of constraints Σ is first-order rewritable (FO-rewritable) if, given a query Q, there exists a first-order query QΣ (the reformulation of Q) such that D ∪ Σ |= Q iff D |= QΣ for any database D.
     • The reformulated query does not depend on the size of D!
     • Answer QΣ over D, instead of Q over chase(D, Σ).
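
To make the idea of a finite initial fragment concrete, here is a rough sketch of a chase bounded to k rounds on the person/father database from the previous slide. The tuple encoding, the "_z" null names, and the Boolean query q() ← father(B,A), person(B) are illustrative assumptions, not part of the slides; only σ1 exactly as written is applied.

```python
# Database D from the person/father example (assumed tuple encoding).
D = {("person", "santa"), ("father", "roberto", "giorgio")}

def chase_k(db, k):
    """Apply sigma1: person(X) -> exists Y. father(Y, X), person(Y)
    for at most k rounds and return the facts derived so far."""
    facts = set(db)
    n = 0  # counter used to name fresh labelled nulls
    for _ in range(k):
        new = set()
        for f in facts:
            if f[0] != "person":
                continue
            x = f[1]
            # Restricted chase: skip X if it already has a father who is a person.
            has_witness = any(
                g[0] == "father" and g[2] == x and ("person", g[1]) in facts
                for g in facts
            )
            if not has_witness:
                y = f"_z{n}"
                n += 1
                new |= {("father", y, x), ("person", y)}
        if not new:
            break  # fixpoint reached (never happens for this sigma1)
        facts |= new
    return facts

def entails_q(facts):
    """Hypothetical Boolean query: q() <- father(B, A), person(B)."""
    return any(f[0] == "father" and ("person", f[1]) in facts for f in facts)

for k in range(4):
    fragment = chase_k(D, k)
    print(k, len(fragment), entails_q(fragment))
```

Each round adds two facts, so the expansion never stops growing, yet the answer to the hypothetical query is settled after the first round and no deeper fragment changes it.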
  7. 7. Chase vs Rewriting: example
     R: {person, father}
     D: person(santa), father(roberto,giorgio)
     Σ: σ1 : person(X) → ∃Y father(Y,X), person(Y)
     Q: q(A) ← person(A)
     • Over D alone: Q = person(santa)
     • Rewriting (use σ1):
       • q(A) ← person(A)
       • q(A) ← father(A,B)
     • Rewriting evaluated over D: Q = {santa, roberto}
  8. 8. The hard(ware) part... Size of the rewriting
     • QΣ has the form of a union of conjunctive queries (UCQ), i.e.,
         select a,b from R1,R2 where cond1
         UNION select a,b from R1,R2 where cond2
         UNION ...
         UNION select a,b from R1,R2 where condn
     • Johnson and Klug forgot to mention that the number of queries in QΣ is exponential in the number of atoms of Q and in the size of Σ.
     • We have two ways to go:
       1. generate another form of rewriting instead of UCQs (e.g., Datalog queries) to eliminate at least the exponentiality in Σ → no more good properties of UCQs :-(
       2. go brute-force in hardware :-)
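
As a small illustration of the UCQ form, the two-query reformulation from the Chase vs Rewriting example can be run as a plain SQL UNION over D. The sketch below uses Python's sqlite3 module; the table layout and the column names (name, parent, child) are assumptions made for the example.

```python
import sqlite3

# In-memory database holding D: person(santa), father(roberto, giorgio).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person(name TEXT);
    CREATE TABLE father(parent TEXT, child TEXT);
    INSERT INTO person VALUES ('santa');
    INSERT INTO father VALUES ('roberto', 'giorgio');
""")

# The reformulation as a UCQ:
#   q(A) <- person(A)   UNION   q(A) <- father(A, B)
ucq = """
    SELECT name AS a FROM person
    UNION
    SELECT parent AS a FROM father
"""

result = {row[0] for row in conn.execute(ucq)}
print(result)
```

The database engine only ever sees D and the UNION query: no chase is computed, and the query text stays the same however large D becomes.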
  9. 9. The hard(ware) part... In-hardware databases (e.g., Glacier)
     • Starting from a query Q and a set of constraints Σ:
       • reformulate the query in software and generate a query plan
       • synthesize the query plan in hardware
       • execute the queries in hardware
     Figure: Glacier: A query-to-hardware compiler (SIGMOD 2010)
  10. 10. Problems: Reconfigurability
     • Current approaches implement the general SQL operators in hardware.
     • If we synthesize the reformulation itself, we go faster than ever!
     • but...
       • every time the query changes, the reformulation changes
       • does reconfigurability help?