Datalog+-Track Introduction & Reasoning on UML Class Diagrams via Datalog+-

UML class diagrams (UCDs) are a widely adopted formalism
for modeling the intensional structure of a software system. Although
UCDs typically guide the implementation of a system, it is common
in practice that developers need to recover the class diagram from an
implemented system. This process is known as reverse engineering. A
fundamental property of reverse engineered (or simply re-engineered)
UCDs is consistency, showing that the system is realizable in practice.
In this work, we investigate the consistency of re-engineered UCDs, and
we show that it is PSPACE-complete. The upper bound is obtained by exploiting
algorithmic techniques developed for conjunctive query answering under
guarded Datalog+/-, that is, a key member of the Datalog+/- family
of KR languages, while the lower bound is obtained by simulating the
behavior of a polynomial space Turing machine.

  1. 1. Datalog± Track Introduction & Reasoning on UML Class Diagrams via Datalog±. Georg Gottlob, Department of Computer Science, University of Oxford. Joint work with G. Orsi, A. Pieris et al.
  2. 2. Datalog+/- Track, Tue, 4 August. 09:00 – 09:30: Datalog Track Introduction; Georg Gottlob, Giorgio Orsi, and Andreas Pieris: Consistency Checking of Re-Engineered UML Class Diagrams via Datalog+/- (Invited). 09:30 – 10:00: Graal: A Toolkit for Query Answering with Existential Rules (Jean-François Baget, Michel Leclère, Marie-Laure Mugnier, Swan Rocher and Clément Sipieter). 10:00 – 10:30: Binary Frontier-guarded ASP with Function Symbols (Mantas Simkus). 11:00 – 11:30: Ontology-Based Multidimensional Contexts with Applications to Quality Data Specification and Extraction (Mostafa Milani and Leopoldo Bertossi). 11:30 – 12:00: Existential rules and Bayesian Networks for Probabilistic Ontological Data Exchange (Thomas Lukasiewicz, Maria Vanina Martinez, Livia Predoiu, Gerardo Ignacio Simari). Wed, 09:00 – 09:10: Datalog+, RuleML and OWL 2: Formats and Translations for Existential Rules (JF Baget, A Gutierrez, M Leclère, ML Mugnier, S Rocher and C Sipieter).
  3. 3. The Need for Modeling Data and Objects. UML Class Diagrams: [UML class diagram: classes Company, Executive, Person and Stock (attribute Index[0..1]:Str, operation getIndex():List); associations Member, Owns, Issues and Competes]. Description Logics: Speaker ⊑ Person ⊓ ∃gives.Talk; Tutorial ⊑ ∀attendedBy.Student; attendedBy ⊑ attended⁻¹; Speaker ⊓ Student ⊑ ⊥. F-Logic: student::person; person[age *⇒ number]; person[age {0:1} *⇒ number]; person[name {1:*} *⇒ string]. SPARQL: PREFIX abc: <http://example.com/exampleOntology#> SELECT ?capital ?country WHERE { ?x abc:cityname ?capital ; abc:isCapitalOf ?y . ?x abc:countryname ?country ; abc:isInContinent abc:Africa . }
  4. 4. A Unifying Framework for Reasoning over Data. [Same overview figure as slide 3: UML class diagram, Description Logics axioms, F-Logic signatures and SPARQL query.] Datalog± • Logical foundations, semantics and decidability • Complexity results for ontological query answering • Identification of tractable fragments
  5. 5. The Datalog± Family • Datalog: ∀X∀Y φ(X,Y) → R(X) • Extend Datalog by allowing in the head: • Existential quantification (TGDs) – ∀X∀Y φ(X,Y) → ∃Z ψ(X,Z) • Equality predicate (EGDs) – ∀X φ(X) → Xi = Xj • Constant false (Negative Constraints) – ∀X φ(X) → ⊥ • But query answering under Datalog[∃] is already undecidable; see, e.g., [Beeri & Vardi, ICALP 1981] and [Calì, G. & Kifer, KR 2008] • Datalog[∃,=,⊥] is syntactically restricted → Datalog±
  6. 6. Ontological Query Answering. A database D and an ontology O form the knowledge base ⟨D,O⟩. ⟨D,O⟩ ⊨ Query ⇔ D ∧ O ⊨ Query
  7. 7. Guarded Datalog[∃] ∀X∀Y∀Z R(X,Y,Z) ∧ S(Y) ∧ P(X,Z) → ∃W Q(X,W) (guard: R(X,Y,Z)) • All ∀-variables occur in one body atom – the guard atom • Query answering is PTIME-complete in data complexity • Models of finite treewidth ⇒ decidability of query answering; related to [Andréka, Németi & van Benthem, J. Philosophical Logic 1998] and [Grädel, J. Symb. Log. 1999] [Calì, G. & Kifer, KR 2008]
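The guardedness condition on this slide can be checked mechanically. The following is a minimal sketch, assuming a hypothetical encoding in which an atom is a pair of predicate name and argument tuple and every argument is a variable; the names `find_guard` and `is_linear` are illustrative and not taken from the talk.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical encoding: an atom is (predicate, arguments); all arguments are variables.
Atom = Tuple[str, Tuple[str, ...]]


def body_variables(body: List[Atom]) -> Set[str]:
    """Collect every variable that occurs in the rule body."""
    return {arg for _, args in body for arg in args}


def find_guard(body: List[Atom]) -> Optional[Atom]:
    """Return a body atom that contains all body variables, or None if there is none."""
    all_vars = body_variables(body)
    for atom in body:
        if all_vars <= set(atom[1]):
            return atom
    return None


def is_linear(body: List[Atom]) -> bool:
    """A rule is linear when its body consists of exactly one atom."""
    return len(body) == 1


# Body of the slide's rule: R(X,Y,Z), S(Y), P(X,Z)
slide_body = [("R", ("X", "Y", "Z")), ("S", ("Y",)), ("P", ("X", "Z"))]
# Body without a guard: P(X,Y), P(W,X)
unguarded_body = [("P", ("X", "Y")), ("P", ("W", "X"))]

print(find_guard(slide_body))
print(find_guard(unguarded_body))
```

Under this encoding every linear rule is trivially guarded, because its single body atom contains all body variables.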
  8. 8. Linear Datalog[∃] • Rules with just one body-atom, e.g., ∀X supervisorOf(X,X) → manager(X) • Query answering is first-order rewritable [Calì, G. & Lukasiewicz, J. Web. Sem. 2012]
  9. 9. First-Order Rewritability. ∀D (D ∪ O ⊨ Q ⇔ D ⊨ Q*). Query answering is in AC0 in data complexity. [Figure labels: compilation of Q and O into a positive first-order query Q_FO; translation into an SQL query Q*; evaluation over D]
  10. 10. Linear Datalog[∃] • Rules with just one body-atom, e.g., ∀X supervisorOf(X,X) → manager(X) • Query answering is first-order rewritable • … and thus, query answering is in AC0 in data complexity [Calì, G. & Lukasiewicz, J. Web. Sem. 2012]
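To illustrate the idea of first-order rewriting in the simplest possible setting, here is a sketch restricted to linear rules of the form sub(X) → sup(X) and atomic queries. This is a deliberately narrow special case, not the rewriting algorithm of the cited work; the predicate `Manager` and the function names are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Inclusion = Tuple[str, str]   # (sub, sup) encodes the linear rule sub(X) -> sup(X)
Fact = Tuple[str, str]        # (predicate, constant) encodes a unary fact


def rewrite_atomic(query_pred: str, inclusions: List[Inclusion]) -> Set[str]:
    """Backward-chain over the inclusions: every returned predicate entails query_pred."""
    result = {query_pred}
    frontier = [query_pred]
    while frontier:
        current = frontier.pop()
        for sub, sup in inclusions:
            if sup == current and sub not in result:
                result.add(sub)
                frontier.append(sub)
    return result


def answer(query_pred: str, inclusions: List[Inclusion], database: Iterable[Fact]) -> Set[str]:
    """Evaluate the rewritten query (a union of atoms) directly over the database."""
    preds = rewrite_atomic(query_pred, inclusions)
    return {const for pred, const in database if pred in preds}


rules = [("Executive", "Person"), ("Manager", "Executive")]
db = {("Manager", "ann"), ("Person", "bob"), ("Stock", "BAY")}
print(sorted(answer("Person", rules, db)))
```

The rules are consulted only while building the rewritten query; the database is then scanned once, which is the point of compiling the ontology into the query.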
  11. 11. Datalog[∃,=,⊥] • Every decidable Datalog[∃] language can be enriched with: • Non-Conflicting EGDs – no interaction with TGDs • Negative constraints • Unsatisfiability can be reduced to query answering • If the theory is satisfiable, then consider only the TGDs; see, e.g., [Calì, G. & Pieris, Artif. Intell. 2012]
  12. 12. A Unifying Framework for Reasoning over Data. [Same overview figure as slide 3: UML class diagram, Description Logics axioms, F-Logic signatures and SPARQL query.] Guarded Datalog[∃,=,⊥] • Logical foundations, semantics and decidability • Complexity results for ontological query answering • Identification of tractable fragments
  13. 13. A Unifying Framework for Reasoning over Data. [Same overview figure as slide 3: UML class diagram, Description Logics axioms, F-Logic signatures and SPARQL query.]
  14. 14. UML Class Diagrams • Modeling the intensional structure of a software system • Reasoning services limited to diagram satisfiability checking • Verify properties at runtime by reasoning over a (possibly incomplete) state of the system – query answering
  15. 15. Querying UML Class Diagrams: Example I. [UML class diagram as on slide 3.] Database: Executive(john), Member(john,LU), Stock(BAY), Issues(BA,BAY), Owns(john,BAY), Competes(LU,BA). Does anybody have a potential conflict of interest?
  16. 16. Querying UML Class Diagrams: Example I. [UML class diagram as on slide 3.] Database: Executive(john), Member(john,LU), Stock(BAY), Issues(BA,BAY), Owns(john,BAY), Competes(LU,BA). Query: Conflict ← Person(P), Company(C1), Company(C2), Stock(S), Owns(P,S), Member(P,C1), Issues(C2,S), Competes(C1,C2)
  17. 17. Querying UML Class Diagrams: Example I. [UML class diagram as on slide 3.] Database: Executive(john), Member(john,LU), Stock(BAY), Issues(BA,BAY), Owns(john,BAY), Competes(LU,BA). Derived: Person(john). Query: Conflict ← Person(P), Company(C1), Company(C2), Stock(S), Owns(P,S), Member(P,C1), Issues(C2,S), Competes(C1,C2)
  18. 18. Querying UML Class Diagrams: Example I. [UML class diagram as on slide 3.] Database: Executive(john), Member(john,LU), Stock(BAY), Issues(BA,BAY), Owns(john,BAY), Competes(LU,BA). Derived: Person(john), Company(LU), Company(BA). Query: Conflict ← Person(P), Company(C1), Company(C2), Stock(S), Owns(P,S), Member(P,C1), Issues(C2,S), Competes(C1,C2)
  19. 19. Querying UML Class Diagrams: Example I. [UML class diagram as on slide 3.] Database: Executive(john), Member(john,LU), Stock(BAY), Issues(BA,BAY), Owns(john,BAY), Competes(LU,BA). Derived: Person(john), Company(LU), Company(BA). Query: Conflict ← Person(P), Company(C1), Company(C2), Stock(S), Owns(P,S), Member(P,C1), Issues(C2,S), Competes(C1,C2). Answer: {P → john, C1 → LU, C2 → BA, S → BAY}
  20. 20. Querying UML Class Diagrams: Example II. [UML class diagram: classes Student, Professor, Member and Group, a {disjoint} constraint, associations WorksIn and Leads, and CLeads with attribute since: Date.] Database: Group(DB). Is there a professor who works in the database group?
  21. 21. Querying UML Class Diagrams: Example II. [UML class diagram as on slide 20.] Database: Group(DB). Query: Ans ← Professor(P), WorksIn(P,DB)
  22. 22. Querying UML Class Diagrams: Example II. [UML class diagram as on slide 20.] Database and derived atoms: Group(DB), Leads(z1,DB), Professor(z1), glue(z1,DB,z2), CLeads(z2). Query: Ans ← Professor(P), WorksIn(P,DB)
  23. 23. Querying UML Class Diagrams: Example II. [UML class diagram as on slide 20, here with SINCE and attribute since: Date.] Database and derived atoms: Group(DB), Leads(z1,DB), Professor(z1), WorksIn(z1,DB). Query: Ans ← Professor(P), WorksIn(P,DB)
  24. 24. Querying UML Class Diagrams: Example II. [UML class diagram as on slide 20, here with SINCE and attribute since: Date.] Database and derived atoms: Group(DB), Leads(z1,DB), Professor(z1), WorksIn(z1,DB), … Query: Ans ← Professor(P), WorksIn(P,DB). Answer: {P → z1, DB → DB}
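Example II relies on an existential rule: the diagram guarantees a leader for every group even though none is stored. The sketch below runs a small chase with labelled nulls. The three rules are assumptions inferred from the derived atoms shown on the slides (Group(G) → ∃Z Leads(Z,G); Leads(P,G) → Professor(P); Leads(P,G) → WorksIn(P,G)); the glue/CLeads atoms are omitted, and the function names are illustrative.

```python
from itertools import count
from typing import Set, Tuple

Fact = Tuple[str, Tuple[str, ...]]


def chase_example(database: Set[Fact]) -> Set[Fact]:
    """Chase the database with the three assumed rules, inventing nulls z1, z2, ..."""
    nulls = (f"z{i}" for i in count(1))
    facts = set(database)
    changed = True
    while changed:
        changed = False
        for pred, args in list(facts):
            new: Set[Fact] = set()
            if pred == "Group":
                # Group(G) -> exists Z Leads(Z,G); fire only if G has no leader yet
                if not any(p == "Leads" and a[1] == args[0] for p, a in facts):
                    new.add(("Leads", (next(nulls), args[0])))
            elif pred == "Leads":
                # Leads(P,G) -> Professor(P) and Leads(P,G) -> WorksIn(P,G)
                new.add(("Professor", (args[0],)))
                new.add(("WorksIn", args))
            if not new <= facts:
                facts |= new
                changed = True
    return facts


def professor_works_in(facts: Set[Fact], group: str) -> bool:
    """Boolean query Ans <- Professor(P), WorksIn(P, group)."""
    return any(
        pred == "WorksIn" and args[1] == group and ("Professor", (args[0],)) in facts
        for pred, args in facts
    )


result = chase_example({("Group", ("DB",))})
print(sorted(result))
print(professor_works_in(result, "DB"))
```

The query holds in the chased instance although the witness is the null z1 rather than a named professor, which matches the substitution shown on slide 24.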
  25. 25. From Diagrams to First-Order Logic (Datalog±). [UML class diagram as on slide 3.]
  26. 26. From Diagrams to First-Order Logic (Datalog±). [UML class diagram as on slide 3.] ∀X∀Y Member(X,Y) → Company(X) ∧ Executive(Y); ∀X Company(X) → ∃Y∃Z Member(X,Y) ∧ Member(X,Z) ∧ Y ≠ Z; ∀X Executive(X) → ∃Y Member(Y,X) [Berardi et al., Artif. Intell. 2005]
  27. 27. From Diagrams to First-Order Logic (Datalog±). [UML class diagram as on slide 3.] ∀X Company(X) → ∃Y Issues(X,Y); ∀X∀Y∀Z Stock(X) ∧ Issues(Y,X) ∧ Issues(Z,X) → Y = Z; ∀X∀Y Stock(X) ∧ Index(X,Y) → Str(Y); ∀X∀Y Stock(X) ∧ getIndex(X,Y) → List(Y); ∀X Stock(X) → ∃Y Issues(Y,X) [Berardi et al., Artif. Intell. 2005]
  28. 28. From Diagrams to First-Order Logic (Datalog±). [UML class diagram as on slide 3.] ∀X Executive(X) → Person(X) [Berardi et al., Artif. Intell. 2005]
  29. 29. Complexity of Query Answering • EXPTIME-complete in combined complexity (implicit in [Berardi et al., Artif. Intell. 2005] and [Lutz, IJCAR 2008]) • coNP-complete in data complexity (implicit in [Ortiz, Calvanese & Eiter, AAAI 2006]) • Undecidable when diagrams are combined with arbitrary OCL (Object Constraint Language) constraints [folklore]
  30. 30. Research Challenge: Reduce High Complexity • Diagrams often have very large instantiations • Some applications require very large diagrams • OCL constraints, which are not expressible diagrammatically, lead to undecidability or high complexity
  31. 31. Aims and Objectives • Restrict UML class diagrams to achieve tractability of query answering in data complexity • Better understanding of combined complexity • Add relevant OCL constraints without losing tractability of query answering in data complexity
  32. 32. Lean UML Class Diagrams [Calì, G., Orsi & Pieris, FOSSACS 2012] 1. For each attribute assertion Attribute[i..j]:Type, j ∈ {1,∞}
  33. 33. Lean UML Class Diagrams [Calì, G., Orsi & Pieris, FOSSACS 2012] 1. For each attribute assertion Attribute[i..j]:Type, j ∈ {1,∞} 2. For each association A [diagram: association A between C1 and C2 with multiplicities mL..mU and nL..nU]: • upper bounds mU, nU ∈ {1,∞}, • if A generalizes some other association, then mU = nU = ∞
  34. 34. Lean UML Class Diagrams [Calì, G., Orsi & Pieris, FOSSACS 2012] 1. For each attribute assertion Attribute[i..j]:Type, j ∈ {1,∞} 2. For each association A [diagram: association A between C1 and C2 with multiplicities mL..mU and nL..nU]: • upper bounds mU, nU ∈ {1,∞}, • if A generalizes some other association, then mU = nU = ∞ 3. Completeness constraints are forbidden, e.g., class C with subclasses C1 and C2 marked {complete}: ∀X C(X) → C1(X) ∨ C2(X)
  35. 35. Lean UML Class Diagrams: Example I. [UML class diagram as on slide 3: classes Company, Executive, Person and Stock (attribute Index[0..1]:Str, operation getIndex():List); associations Member, Owns, Issues and Competes.]
  36. 36. Lean UML Class Diagrams: Example II. [UML class diagram as on slide 20: classes Student, Professor, Member and Group, a {disjoint} constraint, associations WorksIn and Leads, and CLeads with attribute since: Date.]
  37. 37. Non-Diagrammatic Constraints (OCL): disjoint classes. [Diagram: classes C, C1, C2 and C3.] ∀X C2(X) ∧ C3(X) → ⊥. We need negative constraints of the form ∀X C1(X) ∧ … ∧ Cn(X) → ⊥
  38. 38. Non-Diagrammatic Constraints (OCL): most-specific class. [Diagram: classes C1, C2 and C.] ∀X C1(X) ∧ C2(X) → C(X). We need most-specific class constraints of the form ∀X C1(X) ∧ … ∧ Cn(X) → C(X)
  39. 39. Non-Diagrammatic Constraints (OCL): type the domain of Enrolled. [Diagram: Student with attribute Enrolled[0..1]:Course; CS-Student with attribute Enrolled[0..1]:CS-Course.] ∀X∀Y CS-Course(X) ∧ Enrolled(Y,X) → CS-Student(Y). We need domain-type constraints of the form ∀X∀Y C(X) ∧ Attribute(Y,X) → T(Y)
  40. 40. OCL (Object Constraint Language). Context C1 inv: C1.allInstances -> forAll ( x1: C1 | C2.allInstances -> forAll ( x2: C2 | x1=x2 implies x2.oclIsTypeOf(C) ) ). Context C1 inv: C1.allInstances -> forAll ( x1: C1 | C2.allInstances -> forAll ( x2: C2 | x1<>x2 ) ). Context Object inv: Object.allInstances -> forAll ( y: Object | y.A.oclIsTypeOf(C) implies y.oclTypeOf(T) ). Rules: ∀X C1(X) ∧ C2(X) → ⊥; ∀X C1(X) ∧ C2(X) → C(X); ∀X∀Y C(X) ∧ A(Y,X) → T(Y)
  41. 41. Lean UML Class Diagrams as Rules. Classes: ∀X∀Y C(X) ∧ Attr(X,Y) → T(Y); ∀X C(X) → ∃Y1…∃Yn Attr(X,Y1) ∧ … ∧ Attr(X,Yn); ∀X C1(X) → C2(X). Associations: ∀X1∀X2 A(X1,X2) → C1(X1) ∧ C2(X2); ∀X1∀X2∀Y A(X1,X2) ∧ glue(X1,X2,Y) → CA(Y); ∀X1∀X2 A(X1,X2) → ∃Y glue(X1,X2,Y); ∀X C(X) → ∃Y1…∃Yn A(X,Y1) ∧ … ∧ A(X,Yn); ∀X C(X) → ∃Y1…∃Yn A(Y1,X) ∧ … ∧ A(Yn,X); ∀X1∀X2 A1(X1,X2) → A2(X1,X2). Additional constraints: ∀X C1(X) ∧ … ∧ Cn(X) → C(X); ∀X∀Y C(X) ∧ Attr(Y,X) → T(Y)
  42. 42. Lean UML Class Diagrams as Rules. Additional constraints: ∀X C1(X) ∧ … ∧ Cn(X) → C(X); ∀X∀Y C(X) ∧ Attr(Y,X) → T(Y). Classes: ∀X∀Y C(X) ∧ Attr(X,Y) → T(Y); ∀X C(X) → ∃Y1…∃Yn Attr(X,Y1) ∧ … ∧ Attr(X,Yn); ∀X C1(X) → C2(X). Associations: ∀X1∀X2 A(X1,X2) → C1(X1) ∧ C2(X2); ∀X1∀X2∀Y A(X1,X2) ∧ glue(X1,X2,Y) → CA(Y); ∀X1∀X2 A(X1,X2) → ∃Y glue(X1,X2,Y); ∀X C(X) → ∃Y1…∃Yn A(X,Y1) ∧ … ∧ A(X,Yn); ∀X C(X) → ∃Y1…∃Yn A(Y1,X) ∧ … ∧ A(Yn,X); ∀X1∀X2 A1(X1,X2) → A2(X1,X2)
  43. 43. Data Complexity of Query Answering [Calì, G., Orsi & Pieris, FOSSACS 2012] Theorem: Query answering under Lean UML class diagrams + negative & most-specific class & domain-type constraints is PTIME-complete
  44. 44. Data Complexity of Query Answering. Theorem: Query answering under Lean UML class diagrams + negative & most-specific class & domain-type constraints is PTIME-complete [Calì, G., Orsi & Pieris, FOSSACS 2012]. Proof: • in PTIME: reduction to query answering under guarded TGDs • PTIME-hardness (even without domain-type constraints): reduction from Path System Accessibility
  45. 45. Combined Complexity of Query Answering Theorem: Query answering under Lean UML class diagrams + negative & most-specific class constraints is PSPACE-complete
  46. 46. Combined Complexity of Query Answering Proof: • in PSPACE: there exists a tree-like universal model (chase procedure) [Figure: chase tree rooted at the database D, with atoms C(x) and A(y,z).]
  47. 47. Combined Complexity of Query Answering Proof: • in PSPACE: there exists a tree-like universal model (chase procedure) [Figure: chase tree rooted at the database D, with atoms C(x) and A(y,z).] The part of the tree below A(y,z) can be computed from A(y,z) ∪ type(A(y,z)), where type(A(y,z)) is the set of atoms whose arguments are among {y,z}. Rules: ∀X C(X) ∧ A(X,Y) → T(Y); ∀X∀Y A(X,Y) → C1(X) ∧ C2(Y); ∀X∀Y A(X,Y) → A′(X,Y)
  48. 48. Combined Complexity of Query Answering Proof: • in PSPACE: there exists a tree-like universal model (chase procedure) [Figure: chase tree rooted at the database D, with atoms C(x) and A(y,z).] The part of the tree below A(y,z) can be computed from A(y,z) ∪ type(A(y,z)), where type(A(y,z)) is the set of atoms whose arguments are among {y,z} • type(A(y,z)) can be computed from A(y,z) ∪ type(C(x)) in NP. Rules: ∀X C(X) ∧ A(X,Y) → T(Y); ∀X∀Y A(X,Y) → C1(X) ∧ C2(Y); ∀X∀Y A(X,Y) → A′(X,Y)
  49. 49. Combined Complexity of Query Answering Proof: • in PSPACE: there exists a tree-like universal model (chase procedure) [Figure: chase tree rooted at the database D, with atoms C(x) and A(y,z).] The part of the tree below A(y,z) can be computed from A(y,z) ∪ type(A(y,z)), where type(A(y,z)) is the set of atoms whose arguments are among {y,z} • type(A(y,z)) can be computed from A(y,z) ∪ type(C(x)) in NP • construct non-deterministically a finite part with at most |Q| atoms. Rules: ∀X C(X) ∧ A(X,Y) → T(Y); ∀X∀Y A(X,Y) → C1(X) ∧ C2(Y); ∀X∀Y A(X,Y) → A′(X,Y)
  50. 50. Combined Complexity of Query Answering Proof: • in PSPACE: there exists a tree-like universal model (chase procedure) [Figure: chase tree rooted at the database D, with atoms C(x), A(y,z), R(a,v), S(v,w), B(y,z) and T(z).] Q ← R(X1,X2), S(X2,X3), B(X4,X5), T(X5)
  61. 61. Combined Complexity of Query Answering Proof: • PSPACE-hardness: simulation of the computation of a PSPACE Turing machine on input I = α0α1...αn-1, assuming that it uses m = n^k cells • Initialization rules: ∀X initial(X) → initial-state(X); ∀X initial(X) → cell0[α0,1](X); ∀X initial(X) → celli[αi,0](X), for each i ∈ {1,…,n-1}; ∀X initial(X) → celli[0,0](X), for each i ∈ {n,…,m-1}. [Diagram: class initial as a subclass of initial-state, cell0[α0,1], celli[αi,0], …, celln-1[αn-1,0], celln[0,0], …, cellm-1[0,0].]
  62. 62. Combined Complexity of Query Answering Proof: • PSPACE-hardness: simulation of the computation of a PSPACE Turing machine on input I = α0α1...αn-1, assuming that it uses m = n^k cells • Configuration generation rules: ∀X initial(X) → config(X); ∀X config(X) → ∃Y succ(X,Y); ∀X∀Y config(X) ∧ succ(X,Y) → config(Y). [Diagram: class initial as a subclass of config; attribute succ[1..1]:config.]
  63. 63. Combined Complexity of Query Answering Proof: • PSPACE-hardness: simulation of the computation of a PSPACE Turing machine on input I = α0α1...αn-1, assuming that it uses m = n^k cells • Rules to describe the transition from one configuration to another, e.g., state transition rules: for each δ(⟨s1,α1⟩) = ⟨s2,α2,d⟩: ∀X∀Y s1-celli[α1,1](X) ∧ succ(X,Y) → state-s2(Y), for each i ∈ {0,…,m-1}. Here s1-celli[α1,1](X) states that in configuration X, which has state s1, the i-th cell contains α1 and the cursor is over cell i. [Diagram: class s1-celli[α1,1] with attribute succ[0..1]:state-s2.]
  64. 64. Combined Complexity of Query Answering Proof: • PSPACE-hardness: simulation of the computation of a PSPACE Turing machine on input I = α0α1...αn-1, assuming that it uses m = n^k cells • Acceptance rule: ∀X accept-state(X) → accept(X) • Initial database D = {initial(c)} – c is the initial configuration • Boolean CQ: yes ← accept(c) • Turing machine accepts I iff D ∪ Σ ⊨ yes. [Diagram: class accept-state as a subclass of accept.]
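The initialization rules of the hardness proof are easy to generate, which makes it visible that the rule set stays polynomial in the tape length m. This is a rough sketch of that one step only, assuming that rules are rendered as plain strings, that the input is non-empty with m at least its length, and that the symbol 0 stands for an empty cell as written on the slide; the function name is hypothetical.

```python
from typing import List


def initialization_rules(tape_input: str, m: int) -> List[str]:
    """Build the rules that describe the initial configuration on a tape of m cells."""
    n = len(tape_input)
    rules = ["initial(X) -> initial-state(X)"]
    for i in range(m):
        # Cells beyond the input hold 0 (empty); the cursor flag is 1 only for cell 0.
        symbol = tape_input[i] if i < n else "0"
        cursor = 1 if i == 0 else 0
        rules.append(f"initial(X) -> cell{i}[{symbol},{cursor}](X)")
    return rules


for rule in initialization_rules("ab", 4):
    print(rule)
```

The function emits m + 1 rules, one for the initial state and one per tape cell.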
  65. 65. Combined Complexity of Query Answering Theorem: Query answering under Lean UML class diagrams + negative & most-specific class & domain-type constraints is EXPTIME-complete
  66. 66. Combined Complexity of Query Answering Theorem: Query answering under Lean UML class diagrams + negative & most-specific class & domain-type constraints is EXPTIME-complete Proof: • in EXPTIME: reduction to guarded TGDs of bounded arity
  67. 67. Combined Complexity of Query Answering Proof: • EXPTIME-hardness: simulation of an alternating PSPACE Turing machine • Acceptance rules – q ∈ {∃,∀} and i ∈ {1,2}: ∀X accept-stateq(X) → accept(X); ∀X∀Y accept(X) ∧ succi(Y,X) → accepti(Y) (domain-type); ∀X state∃(X) ∧ accepti(X) → accept(X) and ∀X state∀(X) ∧ accept1(X) ∧ accept2(X) → accept(X) (most-specific class). [Diagram: class accept-stateq as a subclass of accept.]
  68. 68. Further Restrictions – Restricted Lean (RLean): different classes have disjoint sets of attributes and operations. Instead of ∀X∀Y Student(X) ∧ Enrolled(X,Y) → Course(Y) we have ∀X∀Y Student-Enrolled(X,Y) → Course(Y). Data complexity in AC0 and combined complexity NP-complete. [Diagram: Student with attribute Enrolled[0..1]:Course; CS-Student with attribute Enrolled[0..1]:CS-Course.]
  69. 69. Overview. Linear (in AC0): RLean UML. Guarded (PTIME-c): Lean UML.
  70. 70. Complexity of Querying Lean UML Class Diagrams
      UML Formalism | Additional Constraints | Data Complexity | Combined Complexity
      Lean | negative, specific-class, domain-type | PTIME-c | EXPTIME-c
      Lean | negative, specific-class | PTIME-c | PSPACE-c
      RLean | negative, specific-class | in AC0 | NP-c
  71. 71. A Unifying Framework for Reasoning over Data. [Same overview figure as slide 3: UML class diagram, Description Logics axioms, F-Logic signatures and SPARQL query.]
  72. 72. Description Logics (DLs) • Logics that model the domain of interest in terms of: • Concepts → sets of objects • Roles → binary relations on sets of objects • Used for conceptual reasoning in the Semantic Web context
  73. 73. Description Logics (DLs) • Logics that model the domain of interest in terms of: • Concepts → sets of objects • Roles → binary relations on sets of objects • Used for conceptual reasoning in the Semantic Web context. [Figure: concept person with instances p1, p2, p3; role father with pairs (f1,p2), (f2,p3), (f3,p1).] person ⊑ ∃father⁻¹ (each person has a father); ∃father ⊑ person (each father is a person)
  76. 76. The DL-Lite Family. Popular family of DLs with AC0 data complexity [Calvanese, De Giacomo, Lembo, Lenzerini & Rosati, J. Autom. Reasoning 2007]. Syntax: B ::= A | ∃R | ∃R⁻¹ (basic concept); C ::= B | ¬B (general concept); R ::= P | P⁻¹ (basic role); E ::= R | ¬R (general role). Axioms: DL-Lite_core: B ⊑ C; DL-Lite_F: (funct R); DL-Lite_R (OWL 2 QL): R ⊑ E
  77. 77. The DL-Lite Family. Popular family of DLs with AC0 data complexity. DL-Lite TBox and its first-order representation (Datalog±): DL-Lite_core: professor ⊑ ∃teachesTo becomes ∀X professor(X) → ∃Y teachesTo(X,Y); professor ⊑ ¬student becomes ∀X professor(X) ∧ student(X) → ⊥. DL-Lite_R: hasTutor⁻ ⊑ teachesTo becomes ∀X∀Y hasTutor(X,Y) → teachesTo(Y,X). DL-Lite_F: funct(hasTutor) becomes ∀X∀Y∀Z hasTutor(X,Y) ∧ hasTutor(X,Z) → Y = Z
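The translation on this slide is purely syntactic, so it can be sketched as a small function. The axiom encoding (tagged tuples) and the textual rule syntax are assumptions made for this illustration; only the four axiom shapes shown on the slide are covered.

```python
from typing import Tuple


def translate(axiom: Tuple[str, ...]) -> str:
    """Translate one DL-Lite axiom (hypothetical tagged-tuple encoding) into a rule string."""
    kind = axiom[0]
    if kind == "exists":        # A ⊑ ∃R
        _, concept, role = axiom
        return f"{concept}(X) -> exists Y {role}(X,Y)"
    if kind == "disjoint":      # A ⊑ ¬B
        _, left, right = axiom
        return f"{left}(X), {right}(X) -> false"
    if kind == "inverse-sub":   # R⁻ ⊑ P
        _, role, super_role = axiom
        return f"{role}(X,Y) -> {super_role}(Y,X)"
    if kind == "funct":         # funct(R)
        _, role = axiom
        return f"{role}(X,Y), {role}(X,Z) -> Y = Z"
    raise ValueError(f"unsupported axiom kind: {kind}")


tbox = [
    ("exists", "professor", "teachesTo"),
    ("disjoint", "professor", "student"),
    ("inverse-sub", "hasTutor", "teachesTo"),
    ("funct", "hasTutor"),
]
for axiom in tbox:
    print(translate(axiom))
```

Each produced rule has a single body atom, except the negative constraint and the functionality EGD, which is in line with the slide's claim that DL-Lite reduces to the linear fragment enriched with EGDs and negative constraints.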
  78. 78. The EL Family. Popular family of DLs with PTIME data complexity (biological applications); see, e.g., [Baader, IJCAI 2003], [Rosati, DL 2007] and [Pérez-Urbina et al., J. Applied Logic 2010]. Languages: EL, ELH, ELHI, ELHI¬. Syntax: C ::= ⊤ | A | C1 ⊓ C2 | ∃P.C; R ::= P | P⁻¹; E ::= R | ¬R. Axioms: C1 ⊑ C2; P1 ⊑ P2; R ⊑ P; R ⊑ E
  79. 79. The EL Family. Popular family of DLs with PTIME data complexity (biological applications). ELHI¬ TBox and its first-order representation (Datalog±): A ⊓ B ⊑ C becomes ∀X A(X) ∧ B(X) → C(X); A ⊑ ∃R becomes ∀X A(X) → ∃Y R(X,Y); A ⊑ ∃R.B becomes ∀X A(X) → ∃Y R(X,Y) ∧ B(Y); ∃R ⊑ A becomes ∀X∀Y R(X,Y) → A(X); ∃R.A ⊑ B becomes ∀X∀Y R(X,Y) ∧ A(Y) → B(X); A ⊑ ¬B becomes ∀X A(X) ∧ B(X) → ⊥; R ⊑ ¬P becomes ∀X∀Y R(X,Y) ∧ P(X,Y) → ⊥
  80. 80. DLs vs. Datalog[∃,=,⊥] • Theorem: For query answering, DL-Lite ≤L Linear and ELHI¬ ≤L Guarded • Remark: the data complexity is preserved • Theorem: Linear/Guarded TGDs are strictly more expressive, e.g., ∀X supervisorOf(X,X) → manager(X)
  81. 81. Overview. Linear (in AC0): DL-Lite and RLean UML. Guarded (PTIME-c): Lean UML and ELHI¬.
  82. 82. A Unifying Framework for Reasoning over Data. [Same overview figure as slide 3: UML class diagram, Description Logics axioms, F-Logic signatures and SPARQL query.]
  83. 83. Frame Logic (F-Logic) • Originally developed for object-oriented deductive databases • Now used as an ontology language – Semantic Web • F-Logic is in general undecidable; see, e.g., [Kifer, Lausen & Wu, J. ACM 1995]
  84. 84. F-Logic Lite • An expressive decidable fragment of F-Logic • Obtained from F-Logic by: • Excluding negation and default inheritance • Allowing only a limited form of cardinality constraints • Query answering is PTIME-complete in data complexity [Calì & Kifer, VLDB 2006]
  85. 85. From F-Logic Lite to First-Order Logic (Datalog±) [Calì & Kifer, VLDB 2006]. data(O,A,V) ∧ data(O,A,W) ∧ funct(A,O) → V = W; type(O,A,T) ∧ data(O,A,V) → member(V,T); sub(C1,C3) ∧ sub(C3,C2) → sub(C1,C2); member(O,C) ∧ sub(C,C1) → member(O,C1); mandatory(A,O) → ∃V data(O,A,V); member(O,C) ∧ type(C,A,T) → type(O,A,T); sub(C,C1) ∧ type(C1,A,T) → type(C,A,T); type(C,A,T1) ∧ sub(T1,T) → type(C,A,T); sub(C,C1) ∧ mandatory(A,C1) → mandatory(A,C); member(O,C) ∧ mandatory(A,C) → mandatory(A,O); sub(C,C1) ∧ funct(A,C1) → funct(A,C); member(O,C) ∧ funct(A,C) → funct(A,O)
  87. 87. From F-Logic Lite to First-Order Logic (Datalog±) [Calì & Kifer, VLDB 2006]. Non-guarded rules: more expressive Datalog[∃] language? type(O,A,T) ∧ data(O,A,V) → member(V,T); sub(C1,C3) ∧ sub(C3,C2) → sub(C1,C2); member(O,C) ∧ sub(C,C1) → member(O,C1); mandatory(A,O) → ∃V data(O,A,V); member(O,C) ∧ type(C,A,T) → type(O,A,T); sub(C,C1) ∧ type(C1,A,T) → type(C,A,T); type(C,A,T1) ∧ sub(T1,T) → type(C,A,T); sub(C,C1) ∧ mandatory(A,C1) → mandatory(A,C); member(O,C) ∧ mandatory(A,C) → mandatory(A,O); sub(C,C1) ∧ funct(A,C1) → funct(A,C); member(O,C) ∧ funct(A,C) → funct(A,O)
  88. 88. Weakly-Guarded Datalog[∃] [Calì, G. & Kifer, KR 2008]. ∀X∀Y S(X,Y) ∧ P(X,Y) → ∃Z P(Y,Z); ∀X∀Y∀W P(X,Y) ∧ P(W,X) → S(Y,X). Affected positions (positions where a null can occur during the chase) = ? • All ∀-variables at affected positions occur in one body atom
  89. 89. Weakly-Guarded Datalog[∃] [Calì, G. & Kifer, KR 2008]. ∀X∀Y S(X,Y) ∧ P(X,Y) → ∃Z P(Y,Z); ∀X∀Y∀W P(X,Y) ∧ P(W,X) → S(Y,X). Affected positions (positions where a null can occur during the chase) = {P[2], … • All ∀-variables at affected positions occur in one body atom
  90. 90. Weakly-Guarded Datalog[∃] [Calì, G. & Kifer, KR 2008]. ∀X∀Y S(X,Y) ∧ P(X,Y) → ∃Z P(Y,Z); ∀X∀Y∀W P(X,Y) ∧ P(W,X) → S(Y,X). Affected positions (positions where a null can occur during the chase) = {P[2], S[1], … • All ∀-variables at affected positions occur in one body atom
  91. 91. Weakly-Guarded Datalog[∃] [Calì, G. & Kifer, KR 2008]. ∀X∀Y S(X,Y) ∧ P(X,Y) → ∃Z P(Y,Z); ∀X∀Y∀W P(X,Y) ∧ P(W,X) → S(Y,X). Affected positions (positions where a null can occur during the chase) = {P[2], S[1]} • All ∀-variables at affected positions occur in one body atom
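The set of affected positions can be computed as a least fixpoint. The sketch below follows the usual inductive definition: a head position holding an existential variable is affected, and a head position is affected if its variable occurs in the body only at affected positions. The rule encoding (body atoms, head atoms, set of existential variables) is an assumption made for this illustration.

```python
from typing import Dict, List, Set, Tuple

Atom = Tuple[str, Tuple[str, ...]]
Position = Tuple[str, int]                       # (predicate, 1-based argument index)
Rule = Tuple[List[Atom], List[Atom], Set[str]]   # (body, head, existential variables)


def affected_positions(rules: List[Rule]) -> Set[Position]:
    """Compute the positions at which a null may appear during the chase."""
    affected: Set[Position] = set()
    # Base case: head positions that hold an existentially quantified variable.
    for _, head, existential in rules:
        for pred, args in head:
            for i, var in enumerate(args, start=1):
                if var in existential:
                    affected.add((pred, i))
    # Inductive case: propagate through variables seen only at affected body positions.
    changed = True
    while changed:
        changed = False
        for body, head, existential in rules:
            body_positions: Dict[str, Set[Position]] = {}
            for pred, args in body:
                for i, var in enumerate(args, start=1):
                    body_positions.setdefault(var, set()).add((pred, i))
            for pred, args in head:
                for i, var in enumerate(args, start=1):
                    if var in existential or (pred, i) in affected:
                        continue
                    if var in body_positions and body_positions[var] <= affected:
                        affected.add((pred, i))
                        changed = True
    return affected


slide_rules: List[Rule] = [
    # S(X,Y), P(X,Y) -> exists Z P(Y,Z)
    ([("S", ("X", "Y")), ("P", ("X", "Y"))], [("P", ("Y", "Z"))], {"Z"}),
    # P(X,Y), P(W,X) -> S(Y,X)
    ([("P", ("X", "Y")), ("P", ("W", "X"))], [("S", ("Y", "X"))], set()),
]
print(sorted(affected_positions(slide_rules)))
```

On the slide's two rules the fixpoint is reached after one propagation round: P[2] comes from the existential variable Z, and S[1] follows because Y occurs in the second rule's body only at P[2].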
  92. 92. Weakly-Guarded Datalog[∃] [Calì, G. & Kifer, KR 2008] • All ∀-variables at affected positions (positions where a null can occur during the chase) occur in one body atom • Models of finite treewidth ⇒ decidability of query answering; related to [Andréka, Németi & van Benthem, J. Philosophical Logic 1998] and [Grädel, J. Symb. Log. 1999] • Query answering is EXPTIME-complete in data complexity
  93. 93. F-Logic Lite is Weakly-Guarded [Calì & Kifer, VLDB 2006]
      type(O,A,T) ∧ data(O,A,V) → member(V,T)
      sub(C1,C3) ∧ sub(C3,C2) → sub(C1,C2)
      member(O,C) ∧ sub(C,C1) → member(O,C1)
      mandatory(A,O) → ∃V data(O,A,V)
      member(O,C) ∧ type(C,A,T) → type(O,A,T)
      sub(C,C1) ∧ type(C1,A,T) → type(C,A,T)
      type(C,A,T1) ∧ sub(T1,T) → type(C,A,T)
      sub(C,C1) ∧ mandatory(A,C1) → mandatory(A,C)
      member(O,C) ∧ mandatory(A,C) → mandatory(A,O)
      sub(C,C1) ∧ funct(A,C1) → funct(A,C)
      member(O,C) ∧ funct(A,C) → funct(A,O)
      Affected positions: data[1], data[3], type[1], member[1], mandatory[2], funct[2]
  94. 94. F-Logic Lite is Weakly-Guarded [Calì & Kifer, VLDB 2006]
      type(O,A,T) ∧ data(O,A,V) → member(V,T)
      sub(C1,C3) ∧ sub(C3,C2) → sub(C1,C2)
      member(O,C) ∧ sub(C,C1) → member(O,C1)
      mandatory(A,O) → ∃V data(O,A,V)
      member(O,C) ∧ type(C,A,T) → type(O,A,T)
      sub(C,C1) ∧ type(C1,A,T) → type(C,A,T)
      type(C,A,T1) ∧ sub(T1,T) → type(C,A,T)
      sub(C,C1) ∧ mandatory(A,C1) → mandatory(A,C)
      member(O,C) ∧ mandatory(A,C) → mandatory(A,O)
      sub(C,C1) ∧ funct(A,C1) → funct(A,C)
      member(O,C) ∧ funct(A,C) → funct(A,O)
      Affected positions: data[1], data[3], type[1], member[1], mandatory[2], funct[2]
      Tractability?
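The claim that F-Logic Lite is weakly guarded but not guarded can be checked with a short script. This is a sketch under assumptions: the textual rule syntax (`&`, `->`), the helper names, and the encoding are invented for illustration. It also assumes that the rule with body member(O,C) ∧ mandatory(A,C) has head mandatory(A,O), which is the reading the listed affected position mandatory[2] requires. A rule is taken to be weakly guarded when a single body atom contains every variable that occurs in the body only at affected positions.

```python
def atom(text):
    """Parse 'pred(A,B)' into ('pred', ('A', 'B'))."""
    pred, args = text.strip().rstrip(")").split("(")
    return pred, tuple(a.strip() for a in args.split(","))


def rule(text, exist=()):
    """Parse 'b1 & b2 -> h' into (body atoms, head atom, existential vars)."""
    body, head = text.split("->")
    return [atom(a) for a in body.split("&")], atom(head), set(exist)


RULES = [
    rule("type(O,A,T) & data(O,A,V) -> member(V,T)"),
    rule("sub(C1,C3) & sub(C3,C2) -> sub(C1,C2)"),
    rule("member(O,C) & sub(C,C1) -> member(O,C1)"),
    rule("mandatory(A,O) -> data(O,A,V)", exist=["V"]),
    rule("member(O,C) & type(C,A,T) -> type(O,A,T)"),
    rule("sub(C,C1) & type(C1,A,T) -> type(C,A,T)"),
    rule("type(C,A,T1) & sub(T1,T) -> type(C,A,T)"),
    rule("sub(C,C1) & mandatory(A,C1) -> mandatory(A,C)"),
    rule("member(O,C) & mandatory(A,C) -> mandatory(A,O)"),  # assumed head
    rule("sub(C,C1) & funct(A,C1) -> funct(A,C)"),
    rule("member(O,C) & funct(A,C) -> funct(A,O)"),
]


def positions_of(var, body):
    """All (predicate, index) positions at which var occurs in the body."""
    return {(p, j) for p, args in body for j, a in enumerate(args, 1) if a == var}


def affected_positions(rules):
    affected = {(p, i) for _, (p, args), ex in rules
                for i, v in enumerate(args, 1) if v in ex}
    changed = True
    while changed:
        changed = False
        for body, (hp, hargs), _ in rules:
            for i, v in enumerate(hargs, 1):
                occ = positions_of(v, body)
                if occ and occ <= affected and (hp, i) not in affected:
                    affected.add((hp, i))
                    changed = True
    return affected


def is_weakly_guarded(rules):
    affected = affected_positions(rules)
    for body, _, _ in rules:
        body_vars = {v for _, args in body for v in args}
        # Variables that occur in the body only at affected positions.
        risky = {v for v in body_vars if positions_of(v, body) <= affected}
        if not any(risky <= set(args) for _, args in body):
            return False
    return True


def is_guarded(rules):
    return all(
        any({v for _, a in body for v in a} <= set(args) for _, args in body)
        for body, _, _ in rules
    )


print(sorted(affected_positions(RULES)))
print(is_weakly_guarded(RULES), is_guarded(RULES))
```

Under these assumptions the computed affected positions coincide with the six listed on the slide, every rule has a weak guard, and the first rule already fails plain guardedness because no single atom contains O, A, T and V.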
  95. 95. Polynomial Cloud Criterion (PCC) [Calì, G. & Kifer, KR 2008]
      cloud(a): all atoms in the chase whose arguments occur in a or are domain constants
      • For every database, the chase contains only polynomially many clouds (up to isomorphism), and they can be computed in PTIME
  96. 96. Polynomial Cloud Criterion (PCC) [Calì, G. & Kifer, KR 2008]
      cloud(a): all atoms in the chase whose arguments occur in a or are domain constants
      • For every database, the chase contains only polynomially many clouds (up to isomorphism), and they can be computed in PTIME
      • Theorem: Query answering under fixed weakly-guarded TGDs that satisfy the PCC is in PTIME in data complexity
      • A useful tool for identifying tractable cases, e.g. F-Logic Lite
  97. 97. Overview [diagram; languages: Linear, Guarded, Weakly-Guarded, PCC; labels: Lean UML and ELHI, F-Logic Lite, in AC0, PTIME-c, EXPTIME-c, DL-Lite and RLean UML]
  98. 98. A Unifying Framework for Reasoning over Data
      UML Class Diagrams: [class diagram; labels: Company, Executive, Person, Stock, Competes, Member, Owns, Issues, Index[0..1]:Str, getIndex():List]
      Description Logics:
          Speaker ⊑ Person ⊓ ∃gives.Talk
          Tutorial ⊑ ∀attendedBy.Student
          attendedBy ⊑ attended⁻¹
          Speaker ⊓ Student ⊑ ⊥
      F-Logic:
          student::person
          person[age *=> number]
          person[age {0:1} *=> number]
          person[name {1:*} *=> string]
      SPARQL:
          PREFIX abc: <http://example.com/exampleOntology#>
          SELECT ?capital ?country
          WHERE {
            ?x abc:cityname ?capital ;
               abc:isCapitalOf ?y .
            ?y abc:countryname ?country ;
               abc:isInContinent abc:Africa .
          }
  99. 99. SPARQL Protocol and RDF Query Language • RDF – data model for representing information on the Web • An RDF graph is, in fact, a finite set of triples (subject, predicate, object), or equivalently a relational database over the schema {triple(·,·,·)} • SPARQL – the standard language for querying RDF data
  100. 100. Some SPARQL Queries • P = (?X, name, ?Y) – list of pairs (o1,o2) such that o2 is the name of o1 • P = (?X, name, B) – list of elements that have a name • P = (?X, name, ?Y) OPT (?X, phone, ?Z) – for every object o, return o, the name of o, and the phone number of o if it is available; otherwise, return o and its name
  101. 101. From SPARQL to First-Order Logic (Datalog±)
      P = (?X, name, ?Y) – list of pairs (o1,o2) such that o2 is the name of o1
      ∀X∀Y triple(X,name,Y) → query_P(X,Y)
  102. 102. From SPARQL to First-Order Logic (Datalog±)
      P = (?X, name, B) – list of elements that have a name
      ∀X∀Y triple(X,name,Y) → query_P(X)
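The two translations above can be read as queries over a single ternary relation. The following sketch evaluates them over a small, made-up set of triples. The data and the function names `query_pairs` and `query_named` are assumptions for illustration only; they stand for the two versions of query_P.

```python
# RDF data viewed as a relational database over the schema {triple(.,.,.)}.
# The triples below are hypothetical sample data.
TRIPLES = {
    ("ann", "name", "Ann"),
    ("bob", "name", "Bob"),
    ("ann", "phone", "555-0100"),
}


def query_pairs(triples):
    """triple(X, name, Y) -> query_P(X, Y)"""
    return {(x, y) for (x, p, y) in triples if p == "name"}


def query_named(triples):
    """triple(X, name, Y) -> query_P(X); Y is projected away, like the blank node B."""
    return {x for (x, p, _) in triples if p == "name"}


print(sorted(query_pairs(TRIPLES)))
print(sorted(query_named(TRIPLES)))
```

Both rules have a single body atom, so they are linear, and the second one differs from the first only in which variables survive into the head.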
  103. 103. From SPARQL to First-Order Logic (Datalog±)
      P = (?X, name, ?Y) OPT (?X, phone, ?Z) – for every object o, return o, the name of o, and the phone number of o if it is available; otherwise, return o and its name
      ∀X∀Y∀Z triple(X,name,Y) ∧ triple(X,phone,Z) → query_P(X,Y,Z) ∧ compatible_P(X)
      ∀X∀Y triple(X,name,Y) ∧ ¬compatible_P(X) → query_{P,3}(X,Y)
      compatible_P: list of individuals with a phone number
      query_{P,3}: the third argument (i.e., the phone no.) is missing
  104. 104. From SPARQL to First-Order Logic (Datalog±)
      P = (?X, name, ?Y) OPT (?X, phone, ?Z) – for every object o, return o, the name of o, and the phone number of o if it is available; otherwise, return o and its name
      ∀X∀Y∀Z triple(X,name,Y) ∧ triple(X,phone,Z) → query_P(X,Y,Z) ∧ compatible_P(X)
      ∀X∀Y triple(X,name,Y) ∧ ¬compatible_P(X) → query_{P,3}(X,Y)
      The rules are non-guarded, and negation is also needed
  105. 105. From SPARQL to First-Order Logic (Datalog±)
      P = (?X, name, ?Y) OPT (?X, phone, ?Z) – for every object o, return o, the name of o, and the phone number of o if it is available; otherwise, return o and its name
      ∀X∀Y∀Z triple(X,name,Y) ∧ triple(X,phone,Z) → query_P(X,Y,Z) ∧ compatible_P(X)
      ∀X∀Y triple(X,name,Y) ∧ ¬compatible_P(X) → query_{P,3}(X,Y)
      The rules are non-guarded, and negation is also needed
      Stratified Weakly-Guarded Datalog[∃,¬,⊥]
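The OPT translation relies on stratified negation: the positive rule must be fully evaluated before the rule with the negated atom is applied. A minimal sketch of that two-stratum evaluation follows. The sample triples and the function name `evaluate_opt` are hypothetical, and the sketch assumes that compatible_P collects the individuals X that have a phone number, so that the negated atom ¬compatible_P(X) in the second rule ranges over individuals.

```python
# Hypothetical sample data over the schema {triple(.,.,.)}.
TRIPLES = {
    ("ann", "name", "Ann"),
    ("bob", "name", "Bob"),
    ("ann", "phone", "555-0100"),
}


def evaluate_opt(triples):
    """Evaluate (?X, name, ?Y) OPT (?X, phone, ?Z) in two strata."""
    # Stratum 1 (positive rule):
    # triple(X,name,Y) & triple(X,phone,Z) -> query_P(X,Y,Z) & compatible_P(X)
    with_phone = {
        (x, y, z)
        for (x, p, y) in triples if p == "name"
        for (x2, q, z) in triples if q == "phone" and x2 == x
    }
    compatible = {x for (x, _, _) in with_phone}

    # Stratum 2 (uses negation on the completed compatible_P relation):
    # triple(X,name,Y) & not compatible_P(X) -> query_{P,3}(X,Y)
    without_phone = {
        (x, y)
        for (x, p, y) in triples
        if p == "name" and x not in compatible
    }
    return with_phone, without_phone


with_phone, without_phone = evaluate_opt(TRIPLES)
print(sorted(with_phone))
print(sorted(without_phone))
```

The answer to the OPT pattern is the union of both relations: full rows where a phone number exists, and rows with the third argument missing otherwise.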
  106. 106. Additional Functionalities • Reasoning capabilities – deal with RDFS and OWL vocabularies • Navigational capabilities – exploit the graph structure of RDF data • General form of recursion – express natural queries
  107. 107. Additional Functionalities • Reasoning capabilities – deal with RDFS and OWL vocabularies • Navigational capabilities – exploit the graph structure of RDF data • General form of recursion – express natural queries • Theorem: Stratified weakly-guarded Datalog[∃,¬,⊥] is strictly more expressive than SPARQL enriched with the above functionalities (under the OWL 2 QL and OWL 2 RL profiles of OWL 2)
  108. 108. Overview [diagram; languages: Linear, Guarded, Weakly-Guarded, PCC; labels: Lean UML and ELHI, F-Logic Lite, in AC0, PTIME-c, EXPTIME-c, DL-Lite and RLean UML, SPARQL]
  109. 109. Complexity of Datalog[∃,=,¬,⊥]
      Program     Query               Linear     Guarded      Weakly-Guarded
      Arbitrary   Arbitrary           PSPACE-c   2EXPTIME-c   2EXPTIME-c
      Arbitrary   Fixed/Atomic        PSPACE-c   2EXPTIME-c   2EXPTIME-c
      Fixed       Arbitrary           NP-c       NP-c         EXPTIME-c
      Fixed       Fixed/Atomic        in AC0     PTIME-c      EXPTIME-c
  110. 110. Complexity of Datalog[∃,=,¬,⊥]
      Program     Query               Linear     Guarded      Weakly-Guarded
      Arbitrary   Arbitrary           PSPACE-c   2EXPTIME-c   2EXPTIME-c
      Arbitrary   Fixed/Atomic        PSPACE-c   2EXPTIME-c   2EXPTIME-c
      Fixed       Arbitrary           NP-c       NP-c         EXPTIME-c
      Fixed       Fixed/Atomic        in AC0     PTIME-c      EXPTIME-c      (Data Complexity)
  111. 111. Complexity of Datalog[∃,=,¬,⊥]
      Polynomial Cloud Criterion
      Program     Query               Linear     Guarded      Weakly-Guarded
      Arbitrary   Arbitrary           PSPACE-c   2EXPTIME-c   2EXPTIME-c
      Arbitrary   Fixed/Atomic        PSPACE-c   2EXPTIME-c   2EXPTIME-c
      Fixed       Arbitrary           NP-c       NP-c         NP-c
      Fixed       Fixed/Atomic        in AC0     PTIME-c      PTIME-c
  112. 112. A Unifying Framework for Reasoning over Data
      UML Class Diagrams: [class diagram; labels: Company, Executive, Person, Stock, Competes, Member, Owns, Issues, Index[0..1]:Str, getIndex():List]
      Description Logics:
          Speaker ⊑ Person ⊓ ∃gives.Talk
          Tutorial ⊑ ∀attendedBy.Student
          attendedBy ⊑ attended⁻¹
          Speaker ⊓ Student ⊑ ⊥
      F-Logic:
          student::person
          person[age *=> number]
          person[age {0:1} *=> number]
          person[name {1:*} *=> string]
      SPARQL:
          PREFIX abc: <http://example.com/exampleOntology#>
          SELECT ?capital ?country
          WHERE {
            ?x abc:cityname ?capital ;
               abc:isCapitalOf ?y .
            ?y abc:countryname ?country ;
               abc:isInContinent abc:Africa .
          }
      Datalog±
      • Logical foundations, semantics and decidability
      • Complexity results for ontological query answering
      • Identification of tractable fragments
      Thank you!
