Querying UML Class Diagrams - FoSSaCS 2012


UML Class Diagrams (UCDs) are the best known class-based formalism for conceptual modeling. They are used by software engineers to model the intensional structure of a system in terms of classes, attributes and operations, and to express constraints that must hold for every instance of the system. Reasoning over UCDs is of paramount importance in design, validation, maintenance and system analysis; however, for medium and large software projects, reasoning over UCDs may be impractical. Query answering, in particular, can be used to verify whether a (possibly incomplete) instance of the system modeled by the UCD, i.e., a snapshot, enjoys a certain property. In this work, we study the problem of querying UCD instances, and we relate it to query answering under guarded Datalog +/-, that is, a powerful Datalog-based language for ontological modeling. We present an expressive and meaningful class of UCDs, named UCDLog, under which conjunctive query answering is tractable in the size of the instances.

Published in: Technology

  1. 1. Querying UML Class Diagrams. Georg Gottlob, Department of Computer Science, University of Oxford. Joint work with Andrea Calì, Giorgio Orsi and Andreas Pieris
  2. 2. Major Modeling Formalisms for Data and Objects
      [Figure: one small example for each of Relational Data Dependencies, UML Class Diagrams, XML Schemas, Description Logics, ER Diagrams and the Object Constraint Language (OCL)]
  3. 3. Datalog± : A Unifying Logical Framework Datalog± Description Logics (DL-Lite, EL,…) Relational Constraints (IDs, FKDs,…) Datalog Conceptual Models (UML, ER,…) … providing: Logical foundations, semantics, decidability and complexity results for reasoning and query-answering, identification of tractable fragments, …
  4. 4. Datalog±
      Datalog rules: ∀X∀Y φ(X,Y) → ψ(X)
      • Extend Datalog with additional features such as:
        • Existential quantification (∃): TGDs ∀X∀Y φ(X,Y) → ∃Z ψ(X,Z)
        • Equality atoms (=): EGDs ∀X φ(X) → Xi = Xj
        • Constant false (⊥): Negative constraints ∀X φ(X) → ⊥
      • But query answering under Datalog[∃] is undecidable [see, e.g., Beeri & Vardi, ICALP 81]
      • Datalog[∃,=,⊥] is syntactically restricted → Datalog±
  5. 5. Restriction: Guardedness
      • All ∀-variables occur in one body atom, the guard atom:
        ∀X∀Y∀Z R(X,Y,Z) ∧ S(Y) ∧ P(X,Z) → ∃W Q(X,W), with guard R(X,Y,Z)
      • Models of finite treewidth ⇒ decidability of query answering [Calì, G. & Kifer, KR 08]; related to work by [Andréka, Németi & van Benthem] and [Grädel]
      • Query answering is PTIME-complete in data complexity [Calì, G. & Lukasiewicz, PODS 09]
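
The guardedness condition on slide 5 is purely syntactic, so it can be checked mechanically. The following is a minimal sketch, not part of the talk: the tuple encoding of atoms and the helper names `guard_atoms`, `is_guarded` and `is_linear` are assumptions made for illustration. The first rule body is the one from slide 5; the second is the join rule that appears later in the deck.

```python
from typing import List, Tuple

# Hypothetical encoding: an atom is (predicate, tuple of variable names).
Atom = Tuple[str, Tuple[str, ...]]


def guard_atoms(body: List[Atom]) -> List[Atom]:
    """Return the body atoms that contain every variable of the body."""
    body_vars = set().union(*(set(args) for _, args in body))
    return [atom for atom in body if body_vars <= set(atom[1])]


def is_guarded(body: List[Atom]) -> bool:
    """A TGD is guarded if at least one body atom is a guard."""
    return bool(guard_atoms(body))


def is_linear(body: List[Atom]) -> bool:
    """A TGD is linear if its body is a single atom (hence trivially guarded)."""
    return len(body) == 1


# Body of the slide-5 rule: R(X,Y,Z) ∧ S(Y) ∧ P(X,Z) → ∃W Q(X,W)
SLIDE_RULE_BODY = [("R", ("X", "Y", "Z")), ("S", ("Y",)), ("P", ("X", "Z"))]
# Body of the join rule: runs(D,P) ∧ area(P,A) → ∃E employee(E,D,P,A)
JOIN_RULE_BODY = [("runs", ("D", "P")), ("area", ("P", "A"))]

if __name__ == "__main__":
    print(guard_atoms(SLIDE_RULE_BODY))
    print(is_guarded(JOIN_RULE_BODY))
```

Only the rule body matters for this check, which is why the head is not represented at all.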
  6. 6. Reasoning over UML Class Diagrams
      [Figure: UML class diagram with classes Company, Stock (attribute Index[0..1]:Str, operation getIndex():List), Person and Executive, and associations Competes, Issues, Member and Owns]
      Satisfiability: Does the diagram admit at least one instantiation?
  7. 7. Reasoning over UML Class Diagrams
      • Satisfiability - the diagram has a (possibly infinite) non-empty instantiation
      • Full Satisfiability - the diagram has a (possibly infinite) instantiation where each class and association is non-empty
      • Finite Satisfiability - the diagram has a finite instantiation
  8. 8. Reasoning over UML Class Diagrams
      [Figure: classes Person, Student and Worker with a {disjoint} constraint]
      Finitely satisfiable, e.g., {Worker(john), Person(john)}, but not fully satisfiable: the Student class is necessarily empty
  9. 9. Complexity of Reasoning over UML Class Diagrams • Satisfiability is EXPTIME-complete [Berardi, Calvanese & De Giacomo, Artificial Intelligence 05] • Full Satisfiability is EXPTIME-complete [Artale, Calvanese & Ibánez-García, ER 10] • Finite Satisfiability is EXPTIME-complete [implicit in Berardi, Calvanese & De Giacomo, Artificial Intelligence 05]
  10. 10. Querying UML Class Diagrams
      [Figure: class diagram over Company, Stock, Person and Executive]
      Instance: Executive(john), Member(john,LU), Stock(BAY), Issues(BA,BAY), Owns(john,BAY), Competes(LU,BA)
      Which persons have a potential conflict of interest?
  11. 11. Querying UML Class Diagrams
      [Figure: class diagram over Company, Stock, Person and Executive]
      Instance: Executive(john), Member(john,LU), Stock(BAY), Issues(BA,BAY), Owns(john,BAY), Competes(LU,BA)
      Conflict(P) ← Person(P), Company(C1), Company(C2), Stock(S), Owns(P,S), Member(P,C1), Issues(C2,S), Competes(C1,C2)
  12. 12. Querying UML Class Diagrams
      [Figure: class diagram over Company, Stock, Person and Executive]
      Instance: Executive(john), Member(john,LU), Stock(BAY), Issues(BA,BAY), Owns(john,BAY), Competes(LU,BA)
      Does anybody have a potential conflict of interest?
  13. 13. Querying UML Class Diagrams
      [Figure: class diagram over Company, Stock, Person and Executive]
      Instance: Executive(john), Member(john,LU), Stock(BAY), Issues(BA,BAY), Owns(john,BAY), Competes(LU,BA)
      Conflict ← Person(P), Company(C1), Company(C2), Stock(S), Owns(P,S), Member(P,C1), Issues(C2,S), Competes(C1,C2)
  14. 14. Querying UML Class Diagrams
      [Figure: class diagram over Company, Stock, Person and Executive]
      Instance: Executive(john), Member(john,LU), Stock(BAY), Issues(BA,BAY), Owns(john,BAY), Competes(LU,BA)
      Added atom: Person(john)
      Conflict ← Person(P), Company(C1), Company(C2), Stock(S), Owns(P,S), Member(P,C1), Issues(C2,S), Competes(C1,C2)
  15. 15. Querying UML Class Diagrams
      [Figure: class diagram over Company, Stock, Person and Executive]
      Instance: Executive(john), Member(john,LU), Stock(BAY), Issues(BA,BAY), Owns(john,BAY), Competes(LU,BA)
      Added atoms: Person(john), Company(LU), Company(BA)
      Conflict ← Person(P), Company(C1), Company(C2), Stock(S), Owns(P,S), Member(P,C1), Issues(C2,S), Competes(C1,C2)
  16. 16. Querying UML Class Diagrams
      [Figure: class diagram over Company, Stock, Person and Executive]
      Instance: Executive(john), Member(john,LU), Stock(BAY), Issues(BA,BAY), Owns(john,BAY), Competes(LU,BA)
      Added atoms: Person(john), Company(LU), Company(BA)
      {P → john, C1 → LU, C2 → BA, S → BAY}
      Conflict ← Person(P), Company(C1), Company(C2), Stock(S), Owns(P,S), Member(P,C1), Issues(C2,S), Competes(C1,C2)
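
The substitution shown on slide 16 is a homomorphism from the query body into the instance once Person(john), Company(LU) and Company(BA) have been added. A minimal sketch of that search follows; it is not taken from the talk. The `?`-prefix for variables and the function name `homomorphisms` are assumptions made for illustration, and the three added atoms are supplied by hand rather than derived from the diagram.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple

# Hypothetical encoding: (predicate, terms); terms starting with "?" are variables.
Atom = Tuple[str, Tuple[str, ...]]


def homomorphisms(query: List[Atom], facts: Set[Atom],
                  binding: Optional[Dict[str, str]] = None) -> Iterator[Dict[str, str]]:
    """Yield every variable assignment that maps all query atoms onto facts."""
    binding = binding or {}
    if not query:
        yield dict(binding)
        return
    (pred, args), rest = query[0], query[1:]
    for fact_pred, fact_args in facts:
        if fact_pred != pred or len(fact_args) != len(args):
            continue
        extended = dict(binding)
        consistent = True
        for term, value in zip(args, fact_args):
            if term.startswith("?"):
                # Bind the variable, or check it against its earlier binding.
                if extended.setdefault(term, value) != value:
                    consistent = False
                    break
            elif term != value:
                consistent = False
                break
        if consistent:
            yield from homomorphisms(rest, facts, extended)


# Instance of slide 16, including the three added atoms.
FACTS: Set[Atom] = {
    ("Executive", ("john",)), ("Member", ("john", "LU")), ("Stock", ("BAY",)),
    ("Issues", ("BA", "BAY")), ("Owns", ("john", "BAY")), ("Competes", ("LU", "BA")),
    ("Person", ("john",)), ("Company", ("LU",)), ("Company", ("BA",)),
}

# Body of the Conflict query.
CONFLICT: List[Atom] = [
    ("Person", ("?P",)), ("Company", ("?C1",)), ("Company", ("?C2",)),
    ("Stock", ("?S",)), ("Owns", ("?P", "?S")), ("Member", ("?P", "?C1")),
    ("Issues", ("?C2", "?S")), ("Competes", ("?C1", "?C2")),
]

if __name__ == "__main__":
    for answer in homomorphisms(CONFLICT, FACTS):
        print(answer)
```

For the Boolean query of slide 13 it is enough to ask whether the generator yields anything; for the non-Boolean query of slide 11 the answer is the value bound to `?P`.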
  17. 17. Querying UML Class Diagrams: Existential Rules
      [Figure: class diagram with classes Member, Student, Professor ({disjoint}) and Group, associations WorksIn and Leads, and association class CLeads (attribute since: Date)]
      Instance: Group(DB)
      Is there a professor who works in the database group?
  18. 18. Querying UML Class Diagrams: Existential Rules
      [Figure: class diagram over Member, Student, Professor, Group and CLeads]
      Instance: Group(DB)
      Ans ← Professor(P), WorksIn(P,DB)
  19. 19. Querying UML Class Diagrams: Existential Rules
      [Figure: class diagram over Member, Student, Professor, Group and CLeads]
      Instance: Group(DB)
      Added atoms: Leads(z1,DB), Professor(z1), R*(z1,DB,z2), CLeads(z2)
      Ans ← Professor(P), WorksIn(P,DB)
  20. 20. Querying UML Class Diagrams: Existential Rules
      [Figure: class diagram over Member, Student, Professor, Group and CLeads]
      Instance: Group(DB)
      Added atoms: Leads(z1,DB), Professor(z1), R*(z1,DB,z2), CLeads(z2), WorksIn(z1,DB)
      Ans ← Professor(P), WorksIn(P,DB)
  21. 21. Querying UML Class Diagrams: Existential Rules
      [Figure: class diagram over Member, Student, Professor, Group and CLeads]
      Instance: Group(DB)
      Added atoms: Leads(z1,DB), Professor(z1), R*(z1,DB,z2), CLeads(z2), WorksIn(z1,DB), …
      {P → z1, DB → DB}
      Ans ← Professor(P), WorksIn(P,DB)
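  22. 22. Querying UML Class Diagrams
      Given an instance D, a diagram Σ and a query Q:
      D ∪ Σ ⊨ Q ⇔ ∀M (M ⊨ D ∪ Σ → M ⊨ Q)
  23. 23. Querying UML Class Diagrams
      Given an instance D, a diagram Σ and a query Q:
      D ∪ Σ ⊨ Q ⇔ ∀M (M ⊨ D ∪ Σ → M ⊨ Q)
      where M ⊨ D ∪ Σ stands for M ⊇ D ∧ M ⊨ Σ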
  24. 24. From Diagrams to First-Order Logic (Datalog±)
      [Figure: class diagram over Company, Stock, Person and Executive]
  25. 25. From Diagrams to First-Order Logic (Datalog±)
      [Figure: class diagram over Company, Stock, Person and Executive]
      ∀X∀Y Member(X,Y) → Company(X) ∧ Executive(Y)
      ∀X Company(X) → ∃Y∃Z Member(X,Y) ∧ Member(X,Z) ∧ Y ≠ Z
      ∀X Executive(X) → ∃Y Member(Y,X)
      [Satoh & Kaneiwa, TCS 10 and Berardi et al., AI 05]
  26. 26. From Diagrams to First-Order Logic (Datalog±)
      [Figure: class diagram over Company, Stock, Person and Executive]
      ∀X Company(X) → ∃Y Issues(X,Y)
      ∀X Stock(X) → ∃Y Issues(Y,X)
      ∀X∀Y∀Z Stock(X) ∧ Issues(Y,X) ∧ Issues(Z,X) → Y = Z
      ∀X∀Y Stock(X) ∧ Index(X,Y) → Str(Y)
      ∀X∀Y Stock(X) ∧ getIndex(X,Y) → List(Y)
      [Satoh & Kaneiwa, TCS 10 and Berardi et al., AI 05]
  27. 27. From Diagrams to First-Order Logic (Datalog±)
      [Figure: class diagram over Company, Stock, Person and Executive]
      ∀X Executive(X) → Person(X)
      [Satoh & Kaneiwa, TCS 10 and Berardi et al., AI 05]
  28. 28. Complexity of Query Answering
      • EXPTIME-complete in combined complexity (everything is part of the input) [implicit in Berardi et al., AI 05 and Lutz, IJCAR 08]
      • coNP-complete in data complexity (the diagram and the query are fixed) [implicit in Ortiz, Calvanese & Eiter, AAAI 06]
      • Undecidable when diagrams are combined with arbitrary OCL (Object Constraint Language) constraints [folklore]
  29. 29. Research Challenge: Reduce High Complexity • Diagrams often have very large instantiations • Some applications require very large diagrams • OCL constraints, that are not expressible diagrammatically, lead to undecidability or high complexity
  30. 30. Our Goals • Restrict UML class diagrams to achieve tractability of query answering in data complexity • Better understanding of combined complexity • Add relevant OCL constraints without losing tractability of query answering in data complexity
  31. 31. Lean UML Class Diagrams
      • For each attribute assertion Attribute[i..j]:Type, j ∈ {1,∞}
  32. 32. Lean UML Class Diagrams
      • For each attribute assertion Attribute[i..j]:Type, j ∈ {1,∞}
      • For each association A between classes C1 and C2 with multiplicities mL..mU and nL..nU:
        - upper bounds mU, nU ∈ {1,∞},
        - if A generalizes some other association, then mU = nU = ∞
  33. 33. Lean UML Class Diagrams
      • For each attribute assertion Attribute[i..j]:Type, j ∈ {1,∞}
      • For each association A between classes C1 and C2 with multiplicities mL..mU and nL..nU:
        - upper bounds mU, nU ∈ {1,∞},
        - if A generalizes some other association, then mU = nU = ∞
      • Completeness constraints are forbidden, e.g., a {complete} generalization of C into C1 and C2: ∀X C(X) → C1(X) ∨ C2(X)
  35. 35. Lean UML Class Diagrams: Example
      [Figure: the class diagram over Company, Stock, Person and Executive]
  36. 36. Lean UML Class Diagrams: Example
      [Figure: the class diagram over Member, Student, Professor, Group and CLeads]
  37. 37. Add Some Non-Diagrammatic Constraints (OCL)
      [Figure: classes C, C1, C2 and C3]
      Disjoint classes: ∀X C2(X) ∧ C3(X) → ⊥
      We need negative constraints of the form ∀X C1(X) ∧ … ∧ Cn(X) → ⊥
  38. 38. Add Some Non-Diagrammatic Constraints (OCL)
      [Figure: classes C1, C2 and C]
      Most-specific class: ∀X C1(X) ∧ C2(X) → C(X)
      We need most-specific class constraints of the form ∀X C1(X) ∧ … ∧ Cn(X) → C(X)
  39. 39. Add Some Non-Diagrammatic Constraints (OCL)
      [Figure: class Student with attribute Enrolled[0..1]:Course and class CS-Student with attribute Enrolled[0..1]:CS-Course]
      Type the domain of Enrolled: ∀X∀Y CS-Course(X) ∧ Enrolled(Y,X) → CS-Student(Y)
      We need domain-type constraints of the form ∀X∀Y C(X) ∧ Attr(Y,X) → T(Y)
  40. 40. Add Some Non-Diagrammatic Constraints (OCL)
      [Figure: class Student with attribute Enrolled[0..1]:Course and class CS-Student with attribute Enrolled[0..1]:CS-Course]
      Type the domain of Enrolled: ∀X∀Y CS-Course(X) ∧ Enrolled(Y,X) → CS-Student(Y)
      We need domain-type constraints of the form ∀X∀Y C(X) ∧ Attr(Y,X) → T(Y) (pullback rule)
  41. 41. Add Some Non-Diagrammatic Constraints (OCL)
      [Figure: class Student with attribute Enrolled[0..1]:Course and class CS-Student with attribute Enrolled[0..1]:CS-Course]
      Type the domain of Enrolled: ∀X∀Y CS-Course(X) ∧ Enrolled(Y,X) → CS-Student(Y)
      We need domain-type constraints of the form ∀X∀Y C(X) ∧ Attr(Y,X) → T(Y) (pullback rule: will make a difference!)
  42. 42. OCL (Object Constraint Language)
      • ∀X C1(X) ∧ C2(X) → ⊥
        Context C1 inv: C1.allInstances -> forAll ( x1: C1 | C2.allInstances -> forAll ( x2: C2 | x1<>x2 ) )
      • ∀X C1(X) ∧ C2(X) → C(X)
        Context C1 inv: C1.allInstances -> forAll ( x1: C1 | C2.allInstances -> forAll ( x2: C2 | x1=x2 implies x2.oclIsTypeOf(C) ) )
      • ∀X∀Y C(X) ∧ a(Y,X) → T(Y)
        Context Object inv: Object.allInstances -> forAll ( y: Object | y.a.oclIsTypeOf(C) implies y.oclIsTypeOf(T) )
  43. 43. Lean UML Class Diagrams as Rules
      ∀X∀Y C(X) ∧ Attr(X,Y) → T(Y)
      ∀X C(X) → ∃Y1…∃Yn Attr(X,Y1) ∧ … ∧ Attr(X,Yn) ∧ ⋀(1 ≤ i < j ≤ n) Yi ≠ Yj
      ∀X∀Y∀Z C(X) ∧ Attr(X,Y) ∧ Attr(X,Z) → Y = Z
      ∀X∀Y1…∀Yn∀Z C(X) ∧ Op(X,Y1,…,Yn,Z) → T1(Y1) ∧ … ∧ Tn(Yn) ∧ T(Z)
      ∀X∀Y1…∀Yn∀Z1∀Z2 C(X) ∧ Op(X,Y1,…,Yn,Z1) ∧ Op(X,Y1,…,Yn,Z2) → Z1 = Z2
      ∀X C1(X) → C2(X)
      ∀X C1(X) ∧ … ∧ Cn(X) → ⊥
      ∀X1…∀Xn A(X1,…,Xn) → C1(X1) ∧ … ∧ Cn(Xn); binary case: ∀X1∀X2 A(X1,X2) → C1(X1) ∧ C2(X2)
      ∀X1…∀Xn∀Y A(X1,…,Xn) ∧ R*(X1,…,Xn,Y) → CA(Y); binary case: ∀X1∀X2∀Y A(X1,X2) ∧ R*(X1,X2,Y) → CA(Y)
      ∀X1…∀Xn A(X1,…,Xn) → ∃Y R*(X1,…,Xn,Y); binary case: ∀X1∀X2 A(X1,X2) → ∃Y R*(X1,X2,Y)
      ∀X1…∀Xn∀Y∀Z A(X1,…,Xn) ∧ R*(X1,…,Xn,Y) ∧ R*(X1,…,Xn,Z) → Y = Z
      ∀X1…∀Xn∀Y1…∀Yn∀Z R*(X1,…,Xn,Z) ∧ R*(Y1,…,Yn,Z) ∧ CA(Z) → X1 = Y1 ∧ … ∧ Xn = Yn
      ∀X C(X) → ∃Y1…∃Yn A(X,Y1) ∧ … ∧ A(X,Yn) ∧ ⋀(1 ≤ i < j ≤ n) Yi ≠ Yj
      ∀X∀Y∀Z C(X) ∧ A(X,Y) ∧ A(X,Z) → Y = Z
      ∀X C(X) → ∃Y1…∃Yn A(Y1,X) ∧ … ∧ A(Yn,X) ∧ ⋀(1 ≤ i < j ≤ n) Yi ≠ Yj
      ∀X∀Y∀Z C(X) ∧ A(Y,X) ∧ A(Z,X) → Y = Z
      ∀X1…∀Xn A1(X1,…,Xn) → A2(X1,…,Xn); binary case: ∀X1∀X2 A1(X1,X2) → A2(X1,X2)
  44. 44. Lean UML Class Diagrams as Rules
      The rule list of slide 43, annotated with the labels "pre-process" and "check via query" [the rules each label points to are not recoverable from the transcript]
  45. 45. Lean UML Class Diagrams as Rules
      Classes:
        ∀X∀Y C(X) ∧ Attr(X,Y) → T(Y)
        ∀X C(X) → ∃Y1…∃Yn Attr(X,Y1) ∧ … ∧ Attr(X,Yn)
        ∀X C1(X) → C2(X)
      Associations:
        ∀X1∀X2 A(X1,X2) → C1(X1) ∧ C2(X2)
        ∀X1∀X2∀Y A(X1,X2) ∧ R*(X1,X2,Y) → CA(Y)
        ∀X1∀X2 A(X1,X2) → ∃Y R*(X1,X2,Y)
        ∀X C(X) → ∃Y1…∃Yn A(X,Y1) ∧ … ∧ A(X,Yn)
        ∀X C(X) → ∃Y1…∃Yn A(Y1,X) ∧ … ∧ A(Yn,X)
        ∀X1∀X2 A1(X1,X2) → A2(X1,X2)
      Additional constraints:
        ∀X C1(X) ∧ … ∧ Cn(X) → C(X)
        ∀X∀Y C(X) ∧ Attr(Y,X) → T(Y)
  46. 46. Lean UML Class Diagrams as Rules
      The rule list of slide 45, with the guard atom of each rule highlighted
  47. 47. Data Complexity of Query Answering
      Theorem: Query answering under Lean UML class diagrams + negative & most-specific class & domain-type constraints is PTIME-complete
  48. 48. Data Complexity of Query Answering
      Theorem: Query answering under Lean UML class diagrams + negative & most-specific class & domain-type constraints is PTIME-complete
      Proof:
      • in PTIME: reduction to assertions ∀X∀Y body(X,Y) → ∃Z head(X,Z), where body(X,Y) has a guard atom [Calì, G. & Lukasiewicz, PODS 09]
      • PTIME-hardness (even without domain-type constraints): reduction from Path System Accessibility
  49. 49. Combined Complexity of Query Answering Theorem: Query answering under Lean UML class diagrams + negative & most-specific class constraints is PSPACE-complete
  50. 50. Combined Complexity of Query Answering
      Proof:
      • in PSPACE: there exists a tree-like universal model
      Rules: ∀X∀Y C(X) ∧ A(X,Y) → T(Y); ∀X∀Y A(X,Y) → C1(X) ∧ C2(Y); ∀X∀Y A(X,Y) → B(X,Y)
      [Figure: tree-like model of polynomial depth rooted at D; relevant part with atoms C(x), A(y,z), B(y,z), T(z), R(a,v), S(v,w)]
  51. 51. Combined Complexity of Query Answering
      Proof:
      • in PSPACE: there exists a tree-like universal model
      Rules: ∀X∀Y C(X) ∧ A(X,Y) → T(Y); ∀X∀Y A(X,Y) → C1(X) ∧ C2(Y); ∀X∀Y A(X,Y) → B(X,Y)
      [Figure: tree-like model of polynomial depth rooted at D; relevant part with atoms C(x), A(y,z), B(y,z), T(z), R(a,v), S(v,w)]
      Q = R(X1,X2) ∧ S(X2,X3) ∧ B(X4,X5) ∧ T(X5)
  52. 52. Combined Complexity of Query Answering
      Proof:
      • in PSPACE: there exists a tree-like universal model (Chase)
      Rules: ∀X∀Y C(X) ∧ A(X,Y) → T(Y); ∀X∀Y A(X,Y) → C1(X) ∧ C2(Y); ∀X∀Y A(X,Y) → B(X,Y)
      [Figure: tree-like model of polynomial depth rooted at D; relevant part with atoms C(x), A(y,z), B(y,z), T(z), R(a,v), S(v,w)]
      Q = R(X1,X2) ∧ S(X2,X3) ∧ B(X4,X5) ∧ T(X5)
  53. 53. Combined Complexity of Query Answering
      Proof:
      • in PSPACE: there exists a tree-like universal model
      Rules: ∀X∀Y C(X) ∧ A(X,Y) → T(Y); ∀X∀Y A(X,Y) → C1(X) ∧ C2(Y); ∀X∀Y A(X,Y) → B(X,Y)
      [Figure: tree-like model rooted at D with atoms C(x), A(y,z), B(y,z), T(z), R(a,v), S(v,w)]
      Q = R(X1,X2) ∧ S(X2,X3) ∧ B(X4,X5) ∧ T(X5)
      With each current atom A we need to compute and memorize its type(A). This is in NP
  63. 63. Combined Complexity of Query Answering
      Proof:
      • in PSPACE: there exists a tree-like universal model
      Rules: ∀X∀Y C(X) ∧ A(X,Y) → T(Y); ∀X∀Y A(X,Y) → C1(X) ∧ C2(Y); ∀X∀Y A(X,Y) → B(X,Y)
      [Figure: tree-like model rooted at D with atoms C(x), A(y,z), B(y,z), T(z), R(a,v), S(v,w)]
      Q = R(X1,X2) ∧ S(X2,X3) ∧ B(X4,X5) ∧ T(X5)
      With each current atom A we need to compute and memorize its type(A). This is in NP
  64. 64. Combined Complexity of Query Answering
      Proof:
      • PSPACE-hardness: simulation of the computation of a PSPACE Turing machine on input I = α0α1...αn-1, assuming that it uses m = n^k cells
        • Initialization rules
          ∀X initial(X) → initial-state(X)
          ∀X initial(X) → cell0[α0,1](X)
          ∀X initial(X) → celli[αi,0](X), for each i ∈ {1,…,n-1}
          ∀X initial(X) → celli[0,0](X), for each i ∈ {n,…,m-1}
      [Figure: class initial below the classes initial-state, cell0[α0,1], cell1[α1,0], …, celln-1[αn-1,0], celln[0,0], …, cellm-1[0,0]]
  65. 65. Combined Complexity of Query Answering
      Proof:
      • PSPACE-hardness: simulation of the computation of a PSPACE Turing machine on input I = α0α1...αn-1, assuming that it uses m = n^k cells
        • Configuration generation rules
          ∀X initial(X) → config(X)
          ∀X config(X) → ∃Y succ(X,Y)
          ∀X∀Y config(X) ∧ succ(X,Y) → config(Y)
      [Figure: class config with attribute succ[1..1]:config, and class initial below it]
  66. 66. Combined Complexity of Query Answering
      Proof:
      • PSPACE-hardness: simulation of the computation of a PSPACE Turing machine on input I = α0α1...αn-1, assuming that it uses m = n^k cells
        • Rules to describe the transition from one configuration to another, e.g., state transition rules for each δ(⟨s1,α1⟩) = ⟨s2,α2,d⟩:
          ∀X∀Y s1-celli[α1,1](X) ∧ succ(X,Y) → state-s2(Y), for each i ∈ {0,…,m-1}
          where s1-celli[α1,1](X) reads: in configuration X, which has state s1, the i-th cell contains α1, and the cursor is over cell i
      [Figure: class s1-celli[α1,1] with attribute succ[0..1]:state-s2]
  67. 67. Combined Complexity of Query Answering
      Proof:
      • PSPACE-hardness: simulation of the computation of a PSPACE Turing machine on input I = α0α1...αn-1, assuming that it uses m = n^k cells
        • Acceptance rule: ∀X accept-state(X) → accept(X)
        • Initial database D = {initial(c)} - c is the initial configuration
        • Boolean CQ Q = accept(X)
        • Turing machine accepts I iff D ∪ Σ ⊨ Q
      [Figure: class accept-state below class accept]
  68. 68. Combined Complexity of Query Answering
      Theorem: Query answering under Lean UML class diagrams + negative & most-specific class & domain-type constraints is EXPTIME-complete
  69. 69. Combined Complexity of Query Answering
      Theorem: Query answering under Lean UML class diagrams + negative & most-specific class & domain-type constraints is EXPTIME-complete
      Proof:
      • in EXPTIME: reduction to assertions ∀X∀Y body(X,Y) → ∃Z head(X,Z), where body(X,Y) has a guard atom, and all the predicates are of bounded arity [Calì, G. & Kifer, KR 08]
  70. 70. Combined Complexity of Query Answering
      Proof:
      • EXPTIME-hardness: simulation of an alternating PSPACE Turing machine
        • Acceptance rules - q ∈ {∃,∀} and i ∈ {1,2}:
          ∀X q-accept-state(X) → accept(X)
          Domain-type constraints (pullback rules):
            ∀X∀Y accept(X) ∧ succ1(Y,X) → accept1(Y)
            ∀X∀Y accept(X) ∧ succ2(Y,X) → accept2(Y)
          Most-specific class constraints:
            ∀X ∃-state(X) ∧ accept1(X) → accept(X)
            ∀X ∃-state(X) ∧ accept2(X) → accept(X)
            ∀X ∀-state(X) ∧ accept1(X) ∧ accept2(X) → accept(X)
      [Figure: class q-accept-state below class accept]
  71. 71. Further Restrictions Student Enrolled [0..1]:Course CS-Student Enrolled [0..1]:CS-Course
  72. 72. Further Restrictions
      [Figure: class Student with attribute Enrolled[0..1]:Course and class CS-Student with attribute Enrolled[0..1]:CS-Course]
      Different classes have disjoint sets of attributes and operations:
      instead of ∀X∀Y Student(X) ∧ Enrolled(X,Y) → Course(Y)
      we have ∀X∀Y Student-Enrolled(X,Y) → Course(Y)
      Data complexity in AC0 and combined complexity NP-complete
  73. 73. Complexity of Querying UML Class Diagrams
      UML Formalism   | Additional Constraints                    | Data Complexity | Combined Complexity
      Full            | none                                      | coNP-complete   | EXPTIME-complete
      Lean            | + negative + specific-class + domain-type | PTIME-complete  | EXPTIME-complete
      Lean            | + negative + specific-class               | PTIME-complete  | PSPACE-complete
      Restricted Lean | + negative + specific-class               | in AC0          | NP-complete
  74. 74. Datalog± : A Unifying Logical Framework Datalog± Description Logics (DL-Lite, EL,…) Relational Constraints (IDs, FKDs,…) Datalog Conceptual Models (UML, ER,…) … without losing tractable data complexity
  76. 76. Ontological Reasoning and Datalog
                               | DL Assertion             | Datalog Rule
      Concept Inclusion        | emp ⊑ person             | emp(X) → person(X)
      Concept Product          | sen-emp × emp ⊑ moreThan | sen-emp(X) ∧ emp(Y) → moreThan(X,Y)
      (Inverse) Role Inclusion | reports⁻ ⊑ mgr           | reports(X,Y) → mgr(Y,X)
      Role Transitivity        | trans(mgr)               | mgr(X,Y) ∧ mgr(Y,Z) → mgr(X,Z)
  77. 77. Ontological Reasoning and Datalog
                               | DL Assertion             | Datalog Rule
      Concept Inclusion        | emp ⊑ person             | emp(X) → person(X)
      Concept Product          | sen-emp × emp ⊑ moreThan | sen-emp(X) ∧ emp(Y) → moreThan(X,Y)
      (Inverse) Role Inclusion | reports⁻ ⊑ mgr           | reports(X,Y) → mgr(Y,X)
      Role Transitivity        | trans(mgr)               | mgr(X,Y) ∧ mgr(Y,Z) → mgr(X,Z)
      Participation            | emp ⊑ ∃report            | emp(X) → ∃Y report(X,Y)
      Disjointness             | emp ⊓ customer ⊑ ⊥       | emp(X) ∧ customer(X) → ⊥
      Functionality            | funct(reports)           | reports(X,Y) ∧ reports(X,Z) → Y = Z
  78. 78. Guardedness
      • All ∀-variables occur in one body atom, the guard atom:
        ∀X∀Y∀Z R(X,Y,Z) ∧ S(Y) ∧ P(X,Z) → ∃W Q(X,W), with guard R(X,Y,Z)
      • Models of finite treewidth ⇒ decidability of query answering [Calì, G. & Kifer, KR 08]
      • Query answering is PTIME-complete in data complexity [Calì, G. & Lukasiewicz, PODS 09]
      • Properly extends ELH (same data complexity)
  79. 79. Ontology Querying
      ELH: Popular DL with PTIME data complexity [Baader, IJCAI 03 and Rosati, DL 07]
      ELH TBox  | Datalog± Representation
      A ⊑ B     | ∀X A(X) → B(X)
      A ⊓ B ⊑ C | ∀X A(X) ∧ B(X) → C(X)
      A ⊑ ∃R.B  | ∀X A(X) → ∃Y R(X,Y) ∧ B(Y)
      ∃R.A ⊑ B  | ∀X∀Y R(X,Y) ∧ A(Y) → B(X)
      R ⊑ P     | ∀X∀Y R(X,Y) → P(X,Y)
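  80. 80. First-Order Rewritable Datalog± Fragments
      [Figure: Q and Σ are compiled into a rewritten query]
  81. 81. First-Order Rewritable Datalog± Fragments
      [Figure: Q, Σ and the database D]
  82. 82. First-Order Rewritable Datalog± Fragments
      [Figure: Q and Σ are compiled into a rewritten query Q* (FO+), which is translated into SQL and posed to D]
  83. 83. First-Order Rewritable Datalog± Fragments
      [Figure: Q and Σ are compiled into a rewritten query Q*, which is translated into SQL and evaluated over D]
      ∀D: D ∪ Σ ⊨ Q ⇔ D ⊨ Q*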
  84. 84. Linearity
      • Just one atom in the body, which is the guard: ∀X∀Y R(X,Y) → ∃Z (X,Z)
      • Linear TGDs are trivially guarded
      • Linear TGDs are first-order rewritable [Calì, G. & Lukasiewicz, PODS 09] ⇒ Query answering in AC0 data complexity
      • Polynomial-size rewriting (SQL DL = NR Datalog) [G. & Schwentick, KR 12]
      • Properly extends DL-Lite (same data complexity)
  85. 85. Ontology Querying
      DL-Lite: Popular family of DLs with AC0 data complexity (OWL 2 QL) [Calvanese, De Giacomo, Lembo, Lenzerini & Rosati, JAR 07]
      DL-Lite TBox | Datalog± Representation
      A ⊑ B        | ∀X A(X) → B(X)
      A ⊑ ∃R       | ∀X A(X) → ∃Y R(X,Y)
      ∃R ⊑ A       | ∀X∀Y R(X,Y) → A(X)
      R ⊑ P        | ∀X∀Y R(X,Y) → P(X,Y)
  86. 86. Finite Controllability
      D ∪ Σ ⊨ Q ⇔ D ∪ Σ ⊨fin Q ?
      • Holds for inclusion dependencies [Rosati, PODS 06]
      • Holds for guarded Datalog± (in fact, for the guarded fragment) [Bárány, G. & Otto, LICS 10]
      • Different from the finite-model property of the guarded fragment: if D ∪ Σ ∪ ¬Q has a model, then it has a finite one.
  87. 87. Finite Controllability and Lean UML Class Diagrams
      • For each attribute assertion Attribute[i..j]:Type: j = ∞
      • For each association A between classes C1 and C2 with multiplicities mL..mU and nL..nU: mU = nU = ∞
      • Disjointness and negative constraints are forbidden
      • Most-specific class and domain-type constraints are allowed
      Set of guarded TGDs ⇒ finite controllability holds
  88. 88. Datalog±: Overview
      [Figure: the classes Guarded and Linear]
      ∀X∀Y R(X,Y,Y) → ∃Z P(X,Z)
  89. 89. Datalog±: Overview
      [Figure: the classes Guarded, Linear and Sticky-join]
      ∀X∀Y R(X,Y,Y) → ∃Z P(X,Z)
      ∀X∀Y∀Z R(X,Y) ∧ S(Y,Z) → ∃W P(Y,W)
  90. 90. Datalog±: Overview
      [Figure: the classes Guarded, Linear, Sticky and Sticky-join, labelled PTIME-complete or FO-rewritable, together with DL-Lite, ELH, IDs + FKDs and Lean UCDs]
  91. 91. Datalog±: Summary of Complexity Results
                  | Data           | Fixed Σ     | Combined
      Guarded     | PTIME-complete | NP-complete | 2EXPTIME-complete
      Linear      | in AC0         | NP-complete | PSPACE-complete
      Sticky-join | in AC0         | NP-complete | EXPTIME-complete
      Same complexity with negative constraints and non-conflicting EGDs
  92. 92. Datalog§: Next Steps PTIME-complete PTIME-complete FO-rewritable Guarded Linear Sticky-join ? …with disjunction in the head … finite-model reasoning
  93. 93. Thank you!
  94. 94. But…
      • What about joins in rule bodies?
        ∀A∀D∀P runs(D,P) ∧ area(P,A) → ∃E employee(E,D,P,A)
      • What about the DL assertion concept product?
        ∀E∀M elephant(E) ∧ mouse(M) → biggerThan(E,M)
  95. 95. But…
      • What about joins in rule bodies?
        ∀A∀D∀P runs(D,P) ∧ area(P,A) → ∃E employee(E,D,P,A)
      • What about the DL assertion concept product?
        ∀E∀M elephant(E) ∧ mouse(M) → biggerThan(E,M)
      No tree-like models guaranteed:
        ∀X∀Y R(X,Y) → ∃Z R(Y,Z)
        ∀X∀Y R(X,Y) → S(X) (infinitely many symbols in S)
        ∀X∀Y S(X) ∧ S(Y) → P(X,Y) (P forms an infinite clique)
  96. 96. Stickiness
      ∀A∀D∀P runs(D,P) ∧ area(P,A) → ∃E employee(E,D,P,A)
      ∀E∀M elephant(E) ∧ mouse(M) → biggerThan(E,M)
  97. 97. Stickiness
      First rule set:
        ∀X∀Y∀Z R(X,Y) ∧ P(Y,Z) → ∃W T(X,Y,W)
        ∀X∀Y∀Z T(X,Y,Z) → ∃W S(Y,W)
      Second rule set:
        ∀X∀Y∀Z R(X,Y) ∧ P(Y,Z) → ∃W T(X,Y,W)
        ∀X∀Y∀Z T(X,Y,Z) → ∃W S(X,W)
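
The transcript does not state the stickiness condition itself. The sketch below follows the variable-marking procedure as it is usually described for sticky sets of TGDs, restricted to single-atom heads; that procedure, the tuple encoding and the names `marked_variables` and `is_sticky` are assumptions for illustration, not material from the slides. Under this reading, a set is sticky when no marked variable occurs more than once in a rule body.

```python
from typing import List, Set, Tuple

# Hypothetical encoding: an atom is (predicate, variables); a TGD is
# (body atoms, single head atom). Head variables absent from the body
# are the existentially quantified ones.
Atom = Tuple[str, Tuple[str, ...]]
TGD = Tuple[List[Atom], Atom]


def body_variables(body: List[Atom]) -> Set[str]:
    return {v for _, args in body for v in args}


def marked_variables(tgds: List[TGD]) -> List[Set[str]]:
    """Compute, per TGD, the set of marked body variables."""
    # Initial step: mark body variables that do not occur in the head.
    marked = [
        {v for v in body_variables(body) if v not in head_args}
        for body, (_, head_args) in tgds
    ]
    # Propagation step: if all head positions of a variable are marked
    # positions of some body atom with the same predicate, mark the variable.
    changed = True
    while changed:
        changed = False
        for i, (body, (head_pred, head_args)) in enumerate(tgds):
            for v in body_variables(body) - marked[i]:
                positions = [k for k, a in enumerate(head_args) if a == v]
                if not positions:
                    continue
                hit = any(
                    pred == head_pred
                    and len(args) == len(head_args)
                    and all(args[k] in marked[j] for k in positions)
                    for j, (other_body, _) in enumerate(tgds)
                    for pred, args in other_body
                )
                if hit:
                    marked[i].add(v)
                    changed = True
    return marked


def is_sticky(tgds: List[TGD]) -> bool:
    """Sticky iff no marked variable occurs more than once in a body."""
    marked = marked_variables(tgds)
    for i, (body, _) in enumerate(tgds):
        occurrences = [v for _, args in body for v in args]
        if any(occurrences.count(v) > 1 for v in marked[i]):
            return False
    return True


# The two rule sets of slide 97.
FIRST_SET: List[TGD] = [
    ([("R", ("X", "Y")), ("P", ("Y", "Z"))], ("T", ("X", "Y", "W"))),
    ([("T", ("X", "Y", "Z"))], ("S", ("Y", "W"))),
]
SECOND_SET: List[TGD] = [
    ([("R", ("X", "Y")), ("P", ("Y", "Z"))], ("T", ("X", "Y", "W"))),
    ([("T", ("X", "Y", "Z"))], ("S", ("X", "W"))),
]

if __name__ == "__main__":
    print(is_sticky(FIRST_SET), is_sticky(SECOND_SET))
```

In the second set the join variable Y of the first rule is dropped by the second rule, so the marking reaches a variable that occurs twice in a body; in the first set Y is kept all the way to S.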
  98. 98. Stickiness
      • Properly generalizes inclusion dependencies
      • Backward-resolution terminates
      • Query answering is in AC0 in data complexity (first-order rewritability) [Calì, G. & Pieris, VLDB 10]
      • Properly extends DL-Lite (same data complexity)
  99. 99. Additional Features
      • EGDs, e.g., ∀X∀Y∀Z reports(X,Y) ∧ reports(X,Z) → Y = Z
        Non-Conflicting EGDs: do not interact with TGDs
        Preliminary check without adding complexity
      • Negative constraints, e.g., ∀X emp(X) ∧ customer(X) → ⊥
        Check without adding complexity
      Finite controllability does not hold:
        D = {R(a,b)}
        Σ = { ∀X∀Y R(X,Y) → ∃Z R(Y,Z), ∀X∀Y∀Z R(Y,X) ∧ R(Z,X) → Y = Z }
        Q ← R(A,a)
        D ∪ Σ ⊭ Q but D ∪ Σ ⊨fin Q
  100. 100. Additional Features
      • EGDs, e.g., ∀X∀Y∀Z reports(X,Y) ∧ reports(X,Z) → Y = Z
        Non-Conflicting EGDs: do not interact with TGDs
        Preliminary check without adding complexity
      • Negative constraints, e.g., ∀X emp(X) ∧ customer(X) → ⊥
        Check without adding complexity
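
The two example constraints on slides 99 and 100 can each be checked by matching a rule body against a set of atoms. The sketch below is one way to picture that and is not taken from the talk: it checks a plain set of facts that is assumed to already contain whatever atoms the TGDs would derive, it treats every term as a constant (so it reports EGD violations rather than unifying nulls), and the facts about `ann`, `bob` and `carl` are made up for illustration.

```python
from typing import Dict, Iterator, List, Set, Tuple

# Hypothetical encoding: (predicate, terms). Rule atoms contain only
# variables, facts contain only constants.
Atom = Tuple[str, Tuple[str, ...]]


def matches(atoms: List[Atom], facts: Set[Atom],
            binding: Dict[str, str]) -> Iterator[Dict[str, str]]:
    """Yield every extension of `binding` that maps all atoms onto facts."""
    if not atoms:
        yield binding
        return
    (pred, args), rest = atoms[0], atoms[1:]
    for fact_pred, fact_args in facts:
        if fact_pred != pred or len(fact_args) != len(args):
            continue
        extended = dict(binding)
        if all(extended.setdefault(a, f) == f for a, f in zip(args, fact_args)):
            yield from matches(rest, facts, extended)


def violates_negative_constraint(body: List[Atom], facts: Set[Atom]) -> bool:
    """body → ⊥ is violated as soon as the body matches."""
    return next(matches(body, facts, {}), None) is not None


def egd_violations(body: List[Atom], left: str, right: str,
                   facts: Set[Atom]) -> List[Dict[str, str]]:
    """body → left = right is violated by matches binding them differently."""
    return [b for b in matches(body, facts, {}) if b[left] != b[right]]


# emp(X) ∧ customer(X) → ⊥
NC_BODY: List[Atom] = [("emp", ("X",)), ("customer", ("X",))]
# reports(X,Y) ∧ reports(X,Z) → Y = Z
EGD_BODY: List[Atom] = [("reports", ("X", "Y")), ("reports", ("X", "Z"))]

# Made-up facts for illustration only.
FACTS: Set[Atom] = {
    ("emp", ("ann",)), ("customer", ("bob",)),
    ("reports", ("ann", "bob")), ("reports", ("ann", "carl")),
}

if __name__ == "__main__":
    print(violates_negative_constraint(NC_BODY, FACTS))
    print(len(egd_violations(EGD_BODY, "Y", "Z", FACTS)))
```

The negative-constraint check is simply a Boolean conjunctive query over the body.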
  101. 101. Comparison with ER± Schemata
      ER±: Extended ER formalism with AC0 data complexity [Calì, G. & Pieris, Information Systems 2012]
      Feature                 | Lean UML | ER±
      IS-A among classes      |          |
      IS-A among associations |          |
      ≥n participation        | n ≥ 0    | n ∈ {0,1}
      ≤1 participation        |          |
      Permutation on IS-A     |          |
      Values of attributes    | complex  | atomic
      Operations              |          |
      Attribute re-use        |          |
      [The tick marks of the empty cells are not recoverable from the transcript]
  102. 102. The Chase Procedure
      Input: Database D, set of TGDs Σ
      Output: A model of D ∪ Σ
      D = {person(john)}
      Σ: person(P) → ∃F father(F,P); father(F,P) → person(F)
      chase(D,Σ) = D ∪ ?
  103. 103. The Chase Procedure
      Input: Database D, set of TGDs Σ
      Output: A model of D ∪ Σ
      D = {person(john)}
      Σ: person(P) → ∃F father(F,P); father(F,P) → person(F)
      chase(D,Σ) = D ∪ {father(z1,john)
  104. 104. The Chase Procedure
      Input: Database D, set of TGDs Σ
      Output: A model of D ∪ Σ
      D = {person(john)}
      Σ: person(P) → ∃F father(F,P); father(F,P) → person(F)
      chase(D,Σ) = D ∪ {father(z1,john), person(z1)
  105. 105. The Chase Procedure
      Input: Database D, set of TGDs Σ
      Output: A model of D ∪ Σ
      D = {person(john)}
      Σ: person(P) → ∃F father(F,P); father(F,P) → person(F)
      chase(D,Σ) = D ∪ {father(z1,john), person(z1), father(z2,z1)
  106. 106. The Chase Procedure
      Input: Database D, set of TGDs Σ
      Output: A model of D ∪ Σ
      D = {person(john)}
      Σ: person(P) → ∃F father(F,P); father(F,P) → person(F)
      chase(D,Σ) = D ∪ {father(z1,john), person(z1), father(z2,z1), …}
  107. 107. The Chase Procedure
      Input: Database D, set of TGDs Σ
      Output: A model of D ∪ Σ
      D = {person(john)}
      Σ: person(P) → ∃F father(F,P); father(F,P) → person(F)
      chase(D,Σ) = D ∪ {father(z1,john), person(z1), father(z2,z1), …} (infinite instance)
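
The step-by-step expansion on slides 102 to 107 can be reproduced with a small round-based chase. This is a minimal sketch under stated assumptions, not the procedure as defined in the cited papers: it handles TGDs only, applies a rule only when its head is not already satisfied, names fresh nulls z1, z2, … as on the slides, and stops after `max_rounds` rounds because the full chase of this example is infinite. The function names and the tuple encoding are hypothetical.

```python
from itertools import count
from typing import Dict, Iterable, Iterator, List, Set, Tuple

# Hypothetical encoding: (predicate, terms). Rule atoms contain only variables.
Atom = Tuple[str, Tuple[str, ...]]
TGD = Tuple[List[Atom], List[Atom]]  # (body atoms, head atoms)


def matches(atoms: List[Atom], facts: Iterable[Atom],
            binding: Dict[str, str]) -> Iterator[Dict[str, str]]:
    """Yield every extension of `binding` that maps all atoms onto facts."""
    if not atoms:
        yield binding
        return
    facts = list(facts)
    (pred, args), rest = atoms[0], atoms[1:]
    for fact_pred, fact_args in facts:
        if fact_pred != pred or len(fact_args) != len(args):
            continue
        extended = dict(binding)
        if all(extended.setdefault(a, f) == f for a, f in zip(args, fact_args)):
            yield from matches(rest, facts, extended)


def bounded_chase(database: Set[Atom], tgds: List[TGD], max_rounds: int) -> Set[Atom]:
    """Apply the TGDs round by round, inventing nulls for existential variables."""
    facts = set(database)
    fresh = count(1)
    for _ in range(max_rounds):
        new_facts: Set[Atom] = set()
        for body, head in tgds:
            # Sorting makes the order of rule applications deterministic.
            for binding in list(matches(body, sorted(facts), {})):
                # Skip the trigger if the head is already satisfied.
                if next(matches(head, facts | new_facts, dict(binding)), None) is not None:
                    continue
                full = dict(binding)
                for _, args in head:
                    for var in args:
                        if var not in full:
                            full[var] = f"z{next(fresh)}"  # fresh labelled null
                new_facts.update((p, tuple(full[v] for v in args)) for p, args in head)
        if not new_facts:
            break
        facts |= new_facts
    return facts


D: Set[Atom] = {("person", ("john",))}
SIGMA: List[TGD] = [
    ([("person", ("P",))], [("father", ("F", "P"))]),   # person(P) → ∃F father(F,P)
    ([("father", ("F", "P"))], [("person", ("F",))]),   # father(F,P) → person(F)
]

if __name__ == "__main__":
    for rounds in range(1, 4):
        print(rounds, sorted(bounded_chase(D, SIGMA, rounds)))
```

With three rounds the result matches the partial expansion shown on slide 105; raising `max_rounds` keeps adding one atom per round, which is the infinite instance of slide 107.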
  108. 108. Query Answering via Chase
      [Figure: the query Q maps via a homomorphism h into C = chase(D,Σ), which extends D and maps via homomorphisms h1, h2, … into the models M1, M2, …]
      D ∪ Σ ⊨ Q ⇔ chase(D,Σ) ⊨ Q [see, e.g., Deutsch, Nash & Remmel, PODS 08]
  109. 109. Bounded Derivation-Depth Property (BDDP)
      [Figure: D, Q, and the part P of chase(D,Σ) of constant depth w.r.t. D]
      chase(D,Σ) ⊨ Q ⇒ P ⊨ Q [Calì, G. & Lukasiewicz, PODS 09]
  110. 110. Bounded Derivation-Depth Property (BDDP)
      [Figure: D, Q, and the part P of chase(D,Σ) of constant depth w.r.t. D]
      BDDP ⇒ First-Order Rewritability [Calì, G. & Lukasiewicz, PODS 09]
  111. 111. First-Order Rewritable TGDs
      [Figure: Σ and Q are rewritten into a query QΣ]
      ∀D: D ∪ Σ ⊨ Q ⇔ D ⊨ QΣ
      Query answering is in AC0 in data complexity [Vardi, PODS 95]
  112. 112. OMG UML (Unified Modeling Language)
      Standard conceptual modeling tool for software design
      [Figure: class diagram over Company, Stock, Person and Executive, captioned "Class Diagrams"]
      Reasoning over UML models: Model checking, Specification recovery, Software maintenance
