Upcoming SlideShare
×

# Exchanging More than Complete Data

364 views

Published on

Published in: Technology
0 Likes
Statistics
Notes
• Full Name
Comment goes here.

Are you sure you want to Yes No
• Be the first to comment

• Be the first to like this

Views
Total views
364
On SlideShare
0
From Embeds
0
Number of Embeds
34
Actions
Shares
0
2
0
Likes
0
Embeds 0
No embeds

No notes for slide

### Exchanging More than Complete Data

1. 1. Exchanging More than Complete Data Jorge P´rez e Universidad de Chilejoint work with Marcelo Arenas (PUC-Chile) and Juan Reutter (Univ. Edinburgh)
2. 2. FatherAndy BobBob Danny Father(x, y ) → Parent(x, y )Danny Eddie Mother(x, y ) → Parent(x, y ) Parent(x, y ) ∧ Parent(y , z) → Gr-Parent(x, z) Mother Carrie Bob
3. 3. FatherAndy BobBob Danny Father(x, y ) → Parent(x, y )Danny Eddie Mother(x, y ) → Parent(x, y ) Parent(x, y ) ∧ Parent(y , z) → Gr-Parent(x, z) Mother Carrie Bob Father(x, y ) → Pateras(x, y ) Source Target Gr-Parent(x, y ) → Pappoi(x, y )
4. 4. FatherAndy BobBob Danny Father(x, y ) → Parent(x, y )Danny Eddie Mother(x, y ) → Parent(x, y ) Parent(x, y ) ∧ Parent(y , z) → Gr-Parent(x, z) Mother Carrie Bob Father(x, y ) → Pateras(x, y ) Source Target Gr-Parent(x, y ) → Pappoi(x, y ) Pateras Pappoi Andy Bob Andy Danny Bob Danny Carrie Danny Danny Eddie Bob Eddie
5. 5. FatherAndy BobBob Danny Father(x, y ) → Parent(x, y )Danny Eddie Mother(x, y ) → Parent(x, y ) Parent(x, y ) ∧ Parent(y , z) → Gr-Parent(x, z) Mother Carrie Bob Father(x, y ) → Pateras(x, y ) Source Target Gr-Parent(x, y ) → Pappoi(x, y ) Pateras(x, y ) ∧ Pateras(y , z) → Pappoi(x, z)
6. 6. FatherAndy BobBob Danny Father(x, y ) → Parent(x, y )Danny Eddie Mother(x, y ) → Parent(x, y ) Parent(x, y ) ∧ Parent(y , z) → Gr-Parent(x, z) Mother Carrie Bob Father(x, y ) → Pateras(x, y ) Source Target Gr-Parent(x, y ) → Pappoi(x, y ) PaterasAndy BobBob DannyDanny Eddie Pateras(x, y ) ∧ Pateras(y , z) → Pappoi(x, z) PappoiCarrie Danny
7. 7. Exchanging More than Complete Data Jorge P´rez e Universidad de Chilejoint work with Marcelo Arenas (PUC-Chile) and Juan Reutter (Univ. Edinburgh)
8. 8. We can exchange more than complete data ◮ In data exchange we begin with a database instance we begin with complete data ◮ What if we have a representation of a set of possible instances? ◮ We propose a new general formalism to exchange representations of possible instances and apply it to incomplete instances and knowledge bases
9. 9. Outline Formalism for exchanging representations systems Applications to incomplete instances Applications to knowledge bases What is the complexity of testing kb-solutions? How can we compute a good kb-solution? Concluding remarks
10. 10. Outline Formalism for exchanging representations systems Applications to incomplete instances Applications to knowledge bases What is the complexity of testing kb-solutions? How can we compute a good kb-solution? Concluding remarks
11. 11. Representation systems A representation system is composed of: ◮ a set W of representatives ◮ a function rep from W to sets of instances rep V −→ {I1 , I2 , I3 , . . .} = rep(V)
12. 12. Representation systems A representation system is composed of: ◮ a set W of representatives ◮ a function rep from W to sets of instances rep V −→ {I1 , I2 , I3 , . . .} = rep(V) ◮ Incomplete instances and knowledge bases are representation systems
13. 13. In classical data exchangewe consider only complete data M = (S, T, Σ) a schema mapping from S to T I ∈ Inst(S), J ∈ Inst(T) J is a solution for I under M iﬀ (I , J) |= Σ J ∈ SolM (I )
14. 14. In classical data exchangewe consider only complete data M = (S, T, Σ) a schema mapping from S to T I ∈ Inst(S), J ∈ Inst(T) J is a solution for I under M iﬀ (I , J) |= Σ J ∈ SolM (I ) We can extend this to set of instances: given a set X ⊆ Inst(S) SolM (X ) = SolM (I ) I ∈X
15. 15. Extending the deﬁnition to representation systems Given: ◮ a mapping M = (S, T, Σ) ◮ a representation system R = (W, rep) ◮ U, V ∈ W
16. 16. Extending the deﬁnition to representation systems Given: ◮ a mapping M = (S, T, Σ) ◮ a representation system R = (W, rep) ◮ U, V ∈ W Deﬁnition U is an R-solution of V if rep(U) ⊆ SolM (rep(V))
17. 17. Extending the deﬁnition to representation systems Given: ◮ a mapping M = (S, T, Σ) ◮ a representation system R = (W, rep) ◮ U, V ∈ W Deﬁnition U is an R-solution of V if rep(U) ⊆ SolM (rep(V)) or equivalently, ∀J ∈ rep(U), ∃I ∈ rep(V) such that J ∈ SolM (I )
18. 18. Universal solutions and strong representation systems Deﬁnition U is a universal R-solution of V if rep(U) = SolM (rep(V))
19. 19. Universal solutions and strong representation systems Deﬁnition U is a universal R-solution of V if rep(U) = SolM (rep(V)) Let C be a class of mappings, Deﬁnition R = (W, rep) is a strong representation system for C if for every M ∈ C, and V ∈ W, there exists a U ∈ W such that: rep(U) = SolM (rep(V))
20. 20. Universal solutions and strong representation systems Deﬁnition U is a universal R-solution of V if rep(U) = SolM (rep(V)) Let C be a class of mappings, Deﬁnition R = (W, rep) is a strong representation system for C if for every M ∈ C, and V ∈ W, there exists a U ∈ W such that: rep(U) = SolM (rep(V)) Strong representation systems ⇒ universal solutions for representatives in R can always be represented in the same system.
21. 21. Outline Formalism for exchanging representations systems Applications to incomplete instances Applications to knowledge bases What is the complexity of testing kb-solutions? How can we compute a good kb-solution? Concluding remarks
22. 22. Incomplete Instances We consider: ◮ Naive instances: instances with null values R(1, N1 ) R(N1 , 2)
23. 23. Incomplete Instances We consider: ◮ Naive instances: instances with null values R(1, N1 ) R(N1 , 2) ◮ Positive conditional instances: naive instances plus conditions for tuples ◮ equalities between nulls, and between nulls and constants ◮ conjunctions and disjunctions T (1, N1 ) true R(1, N1 , N2 ) N1 = N2 ∨ (N1 = 2 ∧ N2 = 3)
24. 24. Incomplete Instances We consider: ◮ Naive instances: instances with null values R(1, N1 ) R(N1 , 2) ◮ Positive conditional instances: naive instances plus conditions for tuples ◮ equalities between nulls, and between nulls and constants ◮ conjunctions and disjunctions T (1, N1 ) true R(1, N1 , N2 ) N1 = N2 ∨ (N1 = 2 ∧ N2 = 3) Semantics given by valuations of nulls (+ open world): rep(I) = {K | µ(I) ⊆ K for some valuation µ}
25. 25. Strong representation systems for st-tgds Let M = (S, T, Σ) be speciﬁed by source-to-target tgds (st-tgd: φS (¯ ) → ∃¯ψT (¯ , y ), φS and ψT are CQs) x y x ¯ (FKMP03) For every complete instance I there exists a naive instance J s.t. rep(J ) = SolM (I )
26. 26. Strong representation systems for st-tgds Let M = (S, T, Σ) be speciﬁed by source-to-target tgds (st-tgd: φS (¯ ) → ∃¯ψT (¯ , y ), φS and ψT are CQs) x y x ¯ (FKMP03) For every complete instance I there exists a naive instance J s.t. rep(J ) = SolM (I ) Proposition Naive instances are not a strong representation system for st-tgds Idea Mng(N, Alice) Mng(x, y ) → Reports(x, y ) Mng(x, x) → Self-Mng(x)
27. 27. Strong representation systems for st-tgds Let M = (S, T, Σ) be speciﬁed by source-to-target tgds (st-tgd: φS (¯ ) → ∃¯ψT (¯ , y ), φS and ψT are CQs) x y x ¯ (FKMP03) For every complete instance I there exists a naive instance J s.t. rep(J ) = SolM (I ) Proposition Naive instances are not a strong representation system for st-tgds Idea Mng(N, Alice) Mng(x, y ) → Reports(x, y ) ? Mng(x, x) → Self-Mng(x)
28. 28. Positive conditional instances are enough Theorem Positive conditional instances are a strong representation system for st-tgds.
29. 29. Positive conditional instances are enough Theorem Positive conditional instances are a strong representation system for st-tgds. Mng(N, Alice) Mng(x, y ) → Reports(x, y ) Mng(x, x) → Self-Mng(x)
30. 30. Positive conditional instances are enough Theorem Positive conditional instances are a strong representation system for st-tgds. Mng(N, Alice) Mng(x, y ) → Reports(x, y ) Reports(N, Alice) true Mng(x, x) → Self-Mng(x)
31. 31. Positive conditional instances are enough Theorem Positive conditional instances are a strong representation system for st-tgds. Mng(N, Alice) Mng(x, y ) → Reports(x, y ) Reports(N, Alice) true Mng(x, x) → Self-Mng(x) Self-Mng(Alice)
32. 32. Positive conditional instances are enough Theorem Positive conditional instances are a strong representation system for st-tgds. Mng(N, Alice) Mng(x, y ) → Reports(x, y ) Reports(N, Alice) true Mng(x, x) → Self-Mng(x) Self-Mng(Alice) N = Alice
33. 33. Positive conditional instances are enough Theorem Positive conditional instances are a strong representation system for st-tgds. Mng(N, Alice) Mng(x, y ) → Reports(x, y ) Reports(N, Alice) true Mng(x, x) → Self-Mng(x) Self-Mng(Alice) N = Alice Corollary Positive conditional instances are a strong representation system for SO-tgds.
34. 34. Positive conditional instances areexactly the needed representation system Theorem Positive conditional instances are minimal: ◮ equalities between nulls ◮ equalities between constant and nulls ◮ conjunctions and disjunctions are all needed for a strong representation system of st-tgds
35. 35. Positive conditional instancesmatch the complexity bounds of classical data exchange Theorem For positive conditional instances and (ﬁxed) st-tgds ◮ Universal solutions can be constructed in polynomial time. ◮ Certain answers of unions of conjunctive queries can be computed in polynomial time.
36. 36. Outline Formalism for exchanging representations systems Applications to incomplete instances Applications to knowledge bases What is the complexity of testing kb-solutions? How can we compute a good kb-solution? Concluding remarks
37. 37. The semantics of knowledge basesis given by sets of instances Knowledge base over S: (I , Γ) s.t. I ∈ Inst(S) Γ a set of rules over S Semantics: ﬁnite models Mod(I , Γ) = {K ∈ Inst(S) | I ⊆ K and K |= Γ}
38. 38. We can apply our formalismto knowledge bases (I2 , Γ2 ) is a kb-solution for (I1 , Γ1 ) under M iﬀ Mod(I2 , Γ2 ) ⊆ SolM (Mod(I1 , Γ1 ))
39. 39. Outline Formalism for exchanging representations systems Applications to incomplete instances Applications to knowledge bases What is the complexity of testing kb-solutions? How can we compute a good kb-solution? Concluding remarks
40. 40. We study the following decision problem Check-KB-Sol: Input: M = (S, T, Σ) with Σ st-tgds (I1 , Γ1 ) knowledge base over S with Γ1 tgds (I2 , Γ2 ) knowledge base over T with Γ2 tgds Output: Is (I2 , Γ2 ) a kb-solution of (I1 , Γ1 ) under M?
41. 41. We study the following decision problem Check-KB-Sol: Input: M = (S, T, Σ) with Σ st-tgds (I1 , Γ1 ) knowledge base over S with Γ1 tgds (I2 , Γ2 ) knowledge base over T with Γ2 tgds Output: Is (I2 , Γ2 ) a kb-solution of (I1 , Γ1 ) under M? Theorem Check-KB-Sol is undecidable (even for ﬁxed M).
42. 42. We study the following decision problem Check-KB-Sol: Input: M = (S, T, Σ) with Σ st-tgds (I1 , Γ1 ) knowledge base over S with Γ1 tgds (I2 , Γ2 ) knowledge base over T with Γ2 tgds Output: Is (I2 , Γ2 ) a kb-solution of (I1 , Γ1 ) under M? Theorem Check-KB-Sol is undecidable (even for ﬁxed M). Undecidability is a consequence of using ∃ in knowledge bases Check-Full-KB-Sol: Γ1 , Γ2 full-tgds (no ∃)
43. 43. The complexity of Check-Full-KB-Sol Theorem Check-Full-KB-Sol is EXPTIME-complete.
44. 44. The complexity of Check-Full-KB-Sol Theorem Check-Full-KB-Sol is EXPTIME-complete. Theorem If M = (S, T, Σ) is ﬁxed Check-Full-KB-Sol is ∆P [O(log n)]-complete. 2 ∆P [O(log n)] 2 ≡ P NP with only a logarithmic number of calls to the NP oracle.
45. 45. Hardness is obtained by reducingthe Max-Odd-Clique problem Given a graph G with n nodes: S: E (·, ·), Succ(·, ·), Clique(·)
46. 46. Hardness is obtained by reducingthe Max-Odd-Clique problem Given a graph G with n nodes: S: E (·, ·), Succ(·, ·), Clique(·) T: E ′ (·, ·), Succ ′ (·, ·), Clique ′ (·)
47. 47. Hardness is obtained by reducingthe Max-Odd-Clique problem Given a graph G with n nodes: S: E (·, ·), Succ(·, ·), Clique(·) T: E ′ (·, ·), Succ ′ (·, ·), Clique ′ (·) I1 and I2 are copies of G , plus a successor relation over [1, n]
48. 48. Hardness is obtained by reducingthe Max-Odd-Clique problem Given a graph G with n nodes: S: E (·, ·), Succ(·, ·), Clique(·) T: E ′ (·, ·), Succ ′ (·, ·), Clique ′ (·) I1 and I2 are copies of G , plus a successor relation over [1, n] Γ1 : “E has a clique of even size x ≤ n” → Clique(x)
49. 49. Hardness is obtained by reducingthe Max-Odd-Clique problem Given a graph G with n nodes: S: E (·, ·), Succ(·, ·), Clique(·) T: E ′ (·, ·), Succ ′ (·, ·), Clique ′ (·) I1 and I2 are copies of G , plus a successor relation over [1, n] Γ1 : “E has a clique of even size x ≤ n” → Clique(x) Γ2 : “E ′ has a clique of odd size x ≤ n” → Clique ′ (x)
50. 50. Hardness is obtained by reducingthe Max-Odd-Clique problem Given a graph G with n nodes: S: E (·, ·), Succ(·, ·), Clique(·) T: E ′ (·, ·), Succ ′ (·, ·), Clique ′ (·) I1 and I2 are copies of G , plus a successor relation over [1, n] Γ1 : “E has a clique of even size x ≤ n” → Clique(x) Γ2 : “E ′ has a clique of odd size x ≤ n” → Clique ′ (x) Σ: Clique(x) ∧ Succ(x, y ) → Clique ′ (y )
51. 51. Hardness is obtained by reducingthe Max-Odd-Clique problem Given a graph G with n nodes: S: E (·, ·), Succ(·, ·), Clique(·) T: E ′ (·, ·), Succ ′ (·, ·), Clique ′ (·) I1 and I2 are copies of G , plus a successor relation over [1, n] Γ1 : “E has a clique of even size x ≤ n” → Clique(x) Γ2 : “E ′ has a clique of odd size x ≤ n” → Clique ′ (x) Σ: Clique(x) ∧ Succ(x, y ) → Clique ′ (y ) (I2 , Γ2 ) is a kb-solution of (I1 , Σ1 ) iﬀ the maximum size of a clique in G is odd.
52. 52. Outline Formalism for exchanging representations systems Applications to incomplete instances Applications to knowledge bases What is the complexity of testing kb-solutions? How can we compute a good kb-solution? Concluding remarks
53. 53. Minimal knowledge bases as good kb-solutions What are good knowledge-base solutions? ◮ ﬁrst choice: universal kb-solutions, but ◮ there are some other kb-solutions desirable to materialize
54. 54. Minimal knowledge bases as good kb-solutions What are good knowledge-base solutions? ◮ ﬁrst choice: universal kb-solutions, but ◮ there are some other kb-solutions desirable to materialize X ≡min Y: X and Y coincide in the minimal instances
55. 55. Minimal knowledge bases as good kb-solutions What are good knowledge-base solutions? ◮ ﬁrst choice: universal kb-solutions, but ◮ there are some other kb-solutions desirable to materialize X ≡min Y: X and Y coincide in the minimal instances (I2 , Γ2 ) is a minimal kb-solution of (I1 , Γ1 ) under M if Mod(I2 , Γ2 ) ≡min SolM (Mod(I1 , Γ1 )) We consider minimal kb-solutions as the good solutions
56. 56. Two requirements to constructminimal knowledge-base solutions Given (I1 , Γ1 ) and Σ, when constructing a minimal-knowledge base solution (I2 , Γ2 ) we would want: 1. Γ2 to only depend on Γ1 and Σ Γ2 is safe for Γ1 and Σ
57. 57. Two requirements to constructminimal knowledge-base solutions Given (I1 , Γ1 ) and Σ, when constructing a minimal-knowledge base solution (I2 , Γ2 ) we would want: 1. Γ2 to only depend on Γ1 and Σ Γ2 is safe for Γ1 and Σ Deﬁnition Γ2 is safe for Γ1 and Σ, if for every I1 there exists I2 s.t. (I2 , Γ2 ) is a minimal kb-solution of (I1 , Γ1 )
58. 58. Two requirements to constructminimal knowledge-base solutions Given (I1 , Γ1 ) and Σ, when constructing a minimal-knowledge base solution (I2 , Γ2 ) we would want: 2. Γ2 to be as informative as possible thus minimizing the size of I2
59. 59. Two requirements to constructminimal knowledge-base solutions Given (I1 , Γ1 ) and Σ, when constructing a minimal-knowledge base solution (I2 , Γ2 ) we would want: 2. Γ2 to be as informative as possible thus minimizing the size of I2 Γ2 is optimum-safe if for every other safe set Γ′ Γ2 |= Γ′
60. 60. Safe sets: data independent target knowledge bases Unfortunately, optimum-safe sets are not (always) FO-expressible Theorem There exist sets Γ1 (full & acyclic) and Σ (full) s.t. no FO sentence is optimum-safe for Γ1 and Σ
61. 61. Safe sets: data independent target knowledge bases Unfortunately, optimum-safe sets are not (always) FO-expressible Theorem There exist sets Γ1 (full & acyclic) and Σ (full) s.t. no FO sentence is optimum-safe for Γ1 and Σ In our paper: An algorithm to compute optimum-safe set in SO
62. 62. Safe sets: data independent target knowledge bases Unfortunately, optimum-safe sets are not (always) FO-expressible Theorem There exist sets Γ1 (full & acyclic) and Σ (full) s.t. no FO sentence is optimum-safe for Γ1 and Σ In our paper: An algorithm to compute optimum-safe set in SO Can we do better? not optimum but useful set? (non-trivial safe set in a good language)
63. 63. We can compute a good safe setfor acyclic tgds knowledge bases Input: Γ1 acyclic full-tgds, Σ full-tgds (from S to T)
64. 64. We can compute a good safe setfor acyclic tgds knowledge bases Input: Γ1 acyclic full-tgds, Σ full-tgds (from S to T) 1. Unfold Γ1 to get Γ+ 1
65. 65. We can compute a good safe setfor acyclic tgds knowledge bases Input: Γ1 acyclic full-tgds, Σ full-tgds (from S to T) 1. Unfold Γ1 to get Γ+ 1 2. Construct Σ− (from T to S) as follows: for every α(¯ ) → R(¯) in Γ+ x x 1
66. 66. We can compute a good safe setfor acyclic tgds knowledge bases Input: Γ1 acyclic full-tgds, Σ full-tgds (from S to T) 1. Unfold Γ1 to get Γ+ 1 2. Construct Σ− (from T to S) as follows: for every α(¯ ) → R(¯) in Γ+ x x 1 Rewrite α(¯) in the target of Σ to obtain β(¯ ) x x
67. 67. We can compute a good safe setfor acyclic tgds knowledge bases Input: Γ1 acyclic full-tgds, Σ full-tgds (from S to T) 1. Unfold Γ1 to get Γ+ 1 2. Construct Σ− (from T to S) as follows: for every α(¯ ) → R(¯) in Γ+ x x 1 Rewrite α(¯) in the target of Σ to obtain β(¯ ) x x Add β(¯) → R(¯) to Σ− x x
68. 68. We can compute a good safe setfor acyclic tgds knowledge bases Input: Γ1 acyclic full-tgds, Σ full-tgds (from S to T) 1. Unfold Γ1 to get Γ+ 1 2. Construct Σ− (from T to S) as follows: for every α(¯ ) → R(¯) in Γ+ x x 1 Rewrite α(¯) in the target of Σ to obtain β(¯ ) x x Add β(¯) → R(¯) to Σ− x x 3. Let Γ2 = Σ− ◦ Σ
69. 69. We can compute a good safe setfor acyclic tgds knowledge bases Input: Γ1 acyclic full-tgds, Σ full-tgds (from S to T) 1. Unfold Γ1 to get Γ+ 1 2. Construct Σ− (from T to S) as follows: for every α(¯ ) → R(¯) in Γ+ x x 1 Rewrite α(¯) in the target of Σ to obtain β(¯ ) x x Add β(¯) → R(¯) to Σ− x x 3. Let Γ2 = Σ− ◦ Σ Theorem Γ2 is safe for Γ1 and Σ (Γ2 is a set of full-tgds with =)
70. 70. In the paper: a complete strategy for computingminimal knowledge-base solutions Given Γ1 and Σ: ◮ We compute Γ2 safe for Γ1 and Σ ◮ We then compute minimal set Γ∗ (implied by Γ1 ) such that for every I1
71. 71. In the paper: a complete strategy for computingminimal knowledge-base solutions Given Γ1 and Σ: ◮ We compute Γ2 safe for Γ1 and Σ ◮ We then compute minimal set Γ∗ (implied by Γ1 ) such that for every I1 chaseΓ∗ (I1 )
72. 72. In the paper: a complete strategy for computingminimal knowledge-base solutions Given Γ1 and Σ: ◮ We compute Γ2 safe for Γ1 and Σ ◮ We then compute minimal set Γ∗ (implied by Γ1 ) such that for every I1 chaseΣ ( chaseΓ∗ (I1 ))
73. 73. In the paper: a complete strategy for computingminimal knowledge-base solutions Given Γ1 and Σ: ◮ We compute Γ2 safe for Γ1 and Σ ◮ We then compute minimal set Γ∗ (implied by Γ1 ) such that for every I1 chaseΣ ( chaseΓ∗ (I1 )), Γ2
74. 74. In the paper: a complete strategy for computingminimal knowledge-base solutions Given Γ1 and Σ: ◮ We compute Γ2 safe for Γ1 and Σ ◮ We then compute minimal set Γ∗ (implied by Γ1 ) such that for every I1 chaseΣ ( chaseΓ∗ (I1 )), Γ2 is a minimal kb-solution for (I1 , Γ1 ).
75. 75. Outline Formalism for exchanging representations systems Applications to incomplete instances Applications to knowledge bases What is the complexity of testing kb-solutions? How can we compute a good kb-solution? Concluding remarks
76. 76. We can exchange more than complete data We propose a general formalism to exchange representation systems ◮ Applications to incomplete instances ◮ Applications to knowledge bases
77. 77. We can exchange more than complete data We propose a general formalism to exchange representation systems ◮ Applications to incomplete instances ◮ Applications to knowledge bases Next step: apply our general setting to the Semantic Web ◮ Semantic Web data have nulls (blank nodes) ◮ Semantic Web speciﬁcations have rules (RDFS, OWL)
78. 78. Exchanging More than Complete Data Jorge P´rez e Universidad de Chilejoint work with Marcelo Arenas (PUC-Chile) and Juan Reutter (Univ. Edinburgh)
79. 79. Outline Formalism for exchanging representations systems Applications to incomplete instances Applications to knowledge bases What is the complexity of testing kb-solutions? How can we compute a good kb-solution? Concluding remarks