SlideShare a Scribd company logo
1 of 42
Download to read offline
Properties Semantics Insensitivity Stability Conclusions 
Reexamining some Holy Grails of Provenance 
Boris Glavic Renee J. Miller 
University of Toronto 
UofT:DB Group 
TaPP 2011, November 5, 2014
Properties Semantics Insensitivity Stability Conclusions 
Insensitivity 
UofT:DB Group 
Insensitivity to Query Rewrite 
Equivalent queries have the same provenance 
Q  Q0 ) P(Q; I ; t) = P(Q0; I ; t) 
Holy Grails
Properties Semantics Insensitivity Stability Conclusions 
Insensitivity 
UofT:DB Group 
Insensitivity to Query Rewrite 
Equivalent queries have the same provenance 
Q  Q0 ) P(Q; I ; t) = P(Q0; I ; t) 
Caveat: Which queries are equivalent? 
Set vs. Bag semantics 
Query language / Operators 
Holy Grails
Properties Semantics Insensitivity Stability Conclusions 
Stability 
UofT:DB Group 
Stability with Respect to Query Language Extension 
Extend query language with new operators ) no change to 
provenance of queries that do not use new operators 
Holy Grails
Properties Semantics Insensitivity Stability Conclusions 
Where 
UofT:DB Group 
Where [Buneman et al., 2003] 
Captures which attribute values in the result of a query have been 
copied from which attribute values in the instance. 
Representation: P(Attr (I )) 
Where: Operator-level syntax-based annotation propagation 
IWhere: Insensitive variant: Union of Where for all Q0 with 
Q0  Q 
Holy Grails
Properties Semantics Insensitivity Stability Conclusions 
Where 
UofT:DB Group 
Where 
Sensitive, traditionally attributed to being based on query 
syntax 
Depends on the internal data-
ow inside the query 
How values are routed through the query 
IWhere 
Insensitive by combining Where for all equivalent queries 
Counterintuitive eect that if (R; t;A) is in the provenance 
then all (R; t0; A) with t:A = t0:A are in the provenance too. 
Reason: Can construct equivalent query adding self-join on A 
Holy Grails
Properties Semantics Insensitivity Stability Conclusions 
Where 
UofT:DB Group 
Example 
Qa = R 
Qb = A;B(R ./A=C A!C;B!D(R)) 
R 
A B 
r1 1 2 
r2 1 3 
r3 2 3 
r4 2 5 
Qa  Qb 
A B 
a1 1 2 
a2 1 3 
a3 2 3 
a4 2 5 
Provenance 
Where(Qa; a1; A) = f(r1;A)g 
Where(Qb; a1; A) = f(r1;A); (r2;A)g 
IWhere(Qa; a1; A) = IWhere(Qb; a1; A) = f(r1;A); (r2;A)g 
Holy Grails
Properties Semantics Insensitivity Stability Conclusions 
Arguments for Insensitivity 
UofT:DB Group 
1 Traditionally observed as advantageous in database research 
) Tradition not a solid argument 
2 External implementation of sensitive semantics. Computing 
provenance for a query dierent from the one that will be 
executed by DBMS 
) No way to solve this, but provenance based on user query 
seems to be reasonable 
3 Implementation of sensitive semantics in DB-engine limits 
optimizer search space 
) Insensitive semantics may be harder to compute 
) Lack of practical experience 
) Realistic sensitivity example? 
Holy Grails
Properties Semantics Insensitivity Stability Conclusions 
Instability of IWhere 
UofT:DB Group 
IWhere is union of Where for all equivalent queries 
Holy Grails
Properties Semantics Insensitivity Stability Conclusions 
Instability of IWhere 
UofT:DB Group 
IWhere is union of Where for all equivalent queries 
e.g., SPJ and USPJ equivalences are dierent 
Holy Grails
Properties Semantics Insensitivity Stability Conclusions 
Instability of IWhere 
UofT:DB Group 
IWhere is union of Where for all equivalent queries 
e.g., SPJ and USPJ equivalences are dierent 
e.g., union Q with a join of Q with some other relation 
Holy Grails
Properties Semantics Insensitivity Stability Conclusions 
Instability of IWhere 
UofT:DB Group 
IWhere is union of Where for all equivalent queries 
e.g., SPJ and USPJ equivalences are dierent 
e.g., union Q with a join of Q with some other relation 
Let UWhere be IWhere for USPJ queries 
Holy Grails
Properties Semantics Insensitivity Stability Conclusions 
Instability of IWhere 
UofT:DB Group 
IWhere is union of Where for all equivalent queries 
e.g., SPJ and USPJ equivalences are dierent 
e.g., union Q with a join of Q with some other relation 
Let UWhere be IWhere for USPJ queries 
) UWhere an attribute value is annotated with all 
annotations from attribute positions that have the same value 
Holy Grails
Properties Semantics Insensitivity Stability Conclusions 
Instability of IWhere 
UofT:DB Group 
Example 
Qa = R 
R 
A B 
r1 1 2 
r2 1 3 
r3 2 3 
r4 2 5 
S 
C 
s1 2 
s2 3 
Qa  Qb 
A B 
a1 1 2 
a2 1 3 
a3 2 3 
a4 2 5 
Provenance 
Where(Qa; a3;A) = f(r3; A)g 
IWhere(Qa; a3;A) = f(r3; A); (r4; A)g 
UWhere(Qa; a3;A) = f(r3; A); (r4; A); (r1;B); (s1; C)g 
Holy Grails
Properties Semantics Insensitivity Stability Conclusions 
Conclusions 
UofT:DB Group 
Take Away Messages 
Be careful how to achieve a property 
Insensitivity less applicable to semantics that address internal 
data-
ow 
Queries with the same external but possibly dierent internal 
behaviour have the same provenance 
Some Things I'd Like to See 
Declarative Semantics ) derive operator-level construction 
Semantics model processing, but have a insensitive core 
The never-ending quest: Deal with Negation 
Other data-models (order) 
Holy Grails
Questions Overview Properties Semantics Insensitivity 
Questions 
UofT:DB Group 
Semantics 
Sound 
Complete 
Responsible 
Insensitive (set) 
Insensitive (bag) 
Stable 
Why 
Wit - X - X X X 
Why - X - - ? X 
IWhy - X X X X X 
Where 
Where - - - - ? X 
IWhere - - - X X - 
How - X - - X X 
Lineage-based 
Lineage X X - - - X 
PI-CS X X - - - X 
C-CS X - - - - X 
Causality - X X X X X 
Reexamining some Holy Grails of Provenance
Questions Overview Properties Semantics Insensitivity 
Semantics Summary 
UofT:DB Group 
Representation 
Used by 
P(Attr (I )) Where, 
IWhere 
P(P(Tuple(I ))) Wit, 
Why, 
IWhy 
N[Tuple(I )] How 
f R 
1 ; : : : ; R 
n j R 
i  Ri (Q)g Lineage 
P(f t1; : : : ; tn j ti 2 Ri (Q) _ ti =?g) PI-CS, C- 
CS 
P(Tuple(I )) Causality 
Reexamining some Holy Grails of Provenance
Questions Overview Properties Semantics Insensitivity 
Sound, Complete, Responsible 
UofT:DB Group 
Sound, Complete, Responsible 
Sound: Provenance of t produces nothing dierent from t. 
t06= t ) t062 Q(P(Q; I ; t)) 
Reexamining some Holy Grails of Provenance
Questions Overview Properties Semantics Insensitivity 
Sound, Complete, Responsible 
UofT:DB Group 
Sound, Complete, Responsible 
Sound: Provenance of t produces nothing dierent from t. 
t06= t ) t062 Q(P(Q; I ; t)) 
Caveat: Semantics of evaluating query over provenance 
Reexamining some Holy Grails of Provenance
Questions Overview Properties Semantics Insensitivity 
Sound, Complete, Responsible 
UofT:DB Group 
Sound, Complete, Responsible 
Sound: Provenance of t produces nothing dierent from t. 
t06= t ) t062 Q(P(Q; I ; t)) 
Complete: Provenance of t produces at least t 
t 2 Q(P(Q; I ; t)) 
Reexamining some Holy Grails of Provenance
Questions Overview Properties Semantics Insensitivity 
Sound, Complete, Responsible 
UofT:DB Group 
Sound, Complete, Responsible 
Sound: Provenance of t produces nothing dierent from t. 
t06= t ) t062 Q(P(Q; I ; t)) 
Complete: Provenance of t produces at least t 
t 2 Q(P(Q; I ; t)) 
Caveat: Semantics of evaluating query over provenance 
Reexamining some Holy Grails of Provenance
Questions Overview Properties Semantics Insensitivity 
Sound, Complete, Responsible 
UofT:DB Group 
Sound, Complete, Responsible 
Sound: Provenance of t produces nothing dierent from t. 
t06= t ) t062 Q(P(Q; I ; t)) 
Complete: Provenance of t produces at least t 
t 2 Q(P(Q; I ; t)) 
Responsible: Every tuple in the provenance of t is necessary 
to derive t 
Reexamining some Holy Grails of Provenance
Questions Overview Properties Semantics Insensitivity 
Sound, Complete, Responsible 
UofT:DB Group 
Sound, Complete, Responsible 
Sound: Provenance of t produces nothing dierent from t. 
t06= t ) t062 Q(P(Q; I ; t)) 
Complete: Provenance of t produces at least t 
t 2 Q(P(Q; I ; t)) 
Responsible: Every tuple in the provenance of t is necessary 
to derive t 
Caveat: . . . from every alternative derivation in the 
provenance . . . 
) factor provenance into alterative derivations 
Caveat: Dierent ways to model that. 
E.g., 8t0 2 P(Q; I ; t) : t =2 Q(P(Q; I ; t)  ft0g) 
Reexamining some Holy Grails of Provenance
Questions Overview Properties Semantics Insensitivity 
Sound, Complete, Responsible 
UofT:DB Group 
Sound, Complete, Responsible 
Example 
Qb = A;B(R ./A=C A!C;B!D(R)) 
R 
A B 
r1 1 2 
r2 1 3 
r3 2 3 
r4 2 5 
Qb 
A B 
a1 1 2 
a2 1 3 
a3 2 3 
a4 2 5 
Provenance 
P(Qb; I ; a1) = fr1; r2g 
Reexamining some Holy Grails of Provenance
Questions Overview Properties Semantics Insensitivity 
Lineage-based 
UofT:DB Group 
Lineage-based [Cui et al., 2000] 
Operator-level declarative semantics similar to Why. Provenance is 
modeled as a list of subsets of the relations accessed by the query 
(leafs of the algebra tree of Q) 
Representation: f R 
1 ; : : : ; R 
n j R 
i  Ri (Q)g 
Lineage: List of subsets of the algebra-tree nodes 
Reexamining some Holy Grails of Provenance
Questions Overview Properties Semantics Insensitivity 
Lineage-based 
UofT:DB Group 
Lineage-based [Glavic et al., 2009] 
Provenance is modeled as a set of witness lists. A witness list is a 
list of tuples - one from each relation accessed by the query. 
Representation: P(f t1; : : : ; tn j ti 2 Ri (Q) _ ti =?g) 
PI-CS: Lineage with dierent representation and broader 
query language coverage 
C-CS: Similar to Where but with tuple granularity 
Reexamining some Holy Grails of Provenance
Questions Overview Properties Semantics Insensitivity 
Lineage-based 
UofT:DB Group 
Example 
Qc = A(R) [ B(R) 
Qd = A(R ./B=C S) 
R 
A B 
r1 1 2 
r2 1 3 
r3 2 3 
r4 2 5 
S 
C 
s1 2 
s2 3 
Qc 
A 
c1 1 
c2 2 
c3 3 
c4 5 
Qd  Qe 
A 
d1 1 
d2 2 
Provenance 
PI  CS(Qc ; c2) = f?; r1 ; r3;?; r4;?g 
PI  CS(Qd ; d1) = f r1; s1 ; r2; s2 g 
C  CS(Qd ; d1) = f r1;?; r2;?g 
Lineage(Qd ; d1) = fr1; r2g; fs1; s2g  
Reexamining some Holy Grails of Provenance
Questions Overview Properties Semantics Insensitivity 
Why 
UofT:DB Group 
Why [Buneman et al., 2003] 
Why-provenance models provenance as a set of witnesses. A 
witness w for a tuple t is a subset of the instance I where 
t 2 Q(w). 
Representation: P(P(Tuple(I ))) 
Wit: Set of all witnesses 
Why: Query-syntax based proof-witnesses 
IWhy: Minimal elements from Wit resp. Why 
Reexamining some Holy Grails of Provenance
Questions Overview Properties Semantics Insensitivity 
Why 
UofT:DB Group 
Example 
Qa = R 
R 
A B 
r1 1 2 
r2 1 3 
r3 2 3 
r4 2 5 
Qa 
A B 
a1 1 2 
a2 1 3 
a3 2 3 
a4 2 5 
Provenance 
Wit(Qa; a1) = fJ j J  R ^ r1 2 Jg 
Why(Qa; a1) = ffr1gg 
IWhy(Qa; a1) = ffr1gg 
Reexamining some Holy Grails of Provenance
Questions Overview Properties Semantics Insensitivity 
How 
UofT:DB Group 
Provenance Semirings [Green et al., 2007] 
Tuples of relations annotated with elements from a semiring. 
Annotation propagation de
ned for positive relational algebra as 
operations of the semiring (set dierence and aggregration later). 
Representation: N(Tuple(I )) 
How: Most general form of annotations: polynomials over 
variable representing the instance tuples. Addition indicates 
alternative use of tuples; multiplication conjunctive use. 
Reexamining some Holy Grails of Provenance
Questions Overview Properties Semantics Insensitivity 
How 
UofT:DB Group 
Example 
Qb = A;B(R ./A=C A!C;B!D(R)) 
R 
A B 
r1 1 2 
r2 1 3 
r3 2 3 
r4 2 5 
Qb 
A B 
a1 1 2 
a2 1 3 
a3 2 3 
a4 2 5 
Provenance 
How(Qb; a1) = r 2 
1 + r1  r2 
Reexamining some Holy Grails of Provenance
Questions Overview Properties Semantics Insensitivity 
Causality 
UofT:DB Group 
Causality [Meliou et al., 2010] 
Provenance is modeled as a set of causes. A cause c 2 I for a 
tuple t is de
ned as follows: 
1 t62 Q(I  fcg) 
2 there exists a set C  I called contingency so that 
t 2 Q(I  C) and t62 Q(I  C  fcg) 
Representation: P(Tuple(I )) 
Causality: Set of all causes 
Reexamining some Holy Grails of Provenance
Questions Overview Properties Semantics Insensitivity 
Causality 
UofT:DB Group 
Example 
Qa = R 
R 
A B 
r1 1 2 
r2 1 3 
r3 2 3 
r4 2 5 
Qa 
A B 
a1 1 2 
a2 1 3 
a3 2 3 
a4 2 5 
Provenance 
Causality(Qa; a1) = fr1g 
Reexamining some Holy Grails of Provenance
Questions Overview Properties Semantics Insensitivity 
Insensitivity to Query Rewrite 
UofT:DB Group 
Insensitivity to Query Rewrite 
Equivalent queries have the same provenance 
Q  Q0 ) P(Q; I ; t) = P(Q0; I ; t) 
Set resp. Bag semantics 
Overview 
Insensitive (Set, Bag) Wit, IWhy, IWhere, Causality 
Insensitive (Bag) How 
Sensitive: Lineage, PI-CS, C-CS, Where 
Reexamining some Holy Grails of Provenance
Questions Overview Properties Semantics Insensitivity 
Why 
UofT:DB Group 
Wit 
De
ned over black-box behaviour of query ) trivially 
insensitive 
Why 
Sensitive, traditionally attributed to being based on query 
syntax 
Why may contain tuples that do not contribute to t 
) Equivalent queries that apply redundant computations may 
contain larger provenance 
Caveat: But why does this argument not apply to Wit? 
Positive queries: super-set of a witness is also a witness ) 
tuples used by redundant computations are in Wit 
IWhy 
Reexamining some Holy Grails of Provenance 
Minimal elements of Why
Questions Overview Properties Semantics Insensitivity 
Why 
UofT:DB Group 
Example 
Qa = R 
Qb = A;B(R ./A=C A!C;B!D(R)) 
R 
A B 
r1 1 2 
r2 1 3 
r3 2 3 
r4 2 5 
Qa  Qb 
A B 
a1 1 2 
a2 1 3 
a3 2 3 
a4 2 5 
Provenance 
Wit(Qa; a1) = Wit(Qb; a1) = fJ j J  R ^ r1 2 Jg 
Why(Qa; a1) = ffr1gg 
Why(Qb; a1) = ffr1g; fr1; r2gg 
IWhy(Qa; a1) = IWhy(Qb; a1) = ffr1gg 
Reexamining some Holy Grails of Provenance
Questions Overview Properties Semantics Insensitivity 
Lineage-based 
UofT:DB Group 
Provenance representation based on query syntax 
Trivial examples for sensitivity based on reordering of the 
arguments of commutative operators 
Reexamining some Holy Grails of Provenance
Questions Overview Properties Semantics Insensitivity 
Lineage-based 
UofT:DB Group 
Example 
Qd = A(R ./B=C S) 
Qe = A(S ./B=C R) 
R 
A B 
r1 1 2 
r2 1 3 
r3 2 3 
r4 2 5 
S 
C 
s1 2 
s2 3 
Qd  Qe 
A 
d1 1 
d2 2 
Provenance 
Lineage(Qd ; d1) = fr1; r2g; fs1; s2g  
Lineage(Qe ; d1) = fs1; s2g; fr1; r2g  
PI  CS(Qd ; d1) = f r1; s1 ; r2; s2 g 
PI  CS(Qe ; d1) = f s1; r1 ; s2; r2 g 
Reexamining some Holy Grails of Provenance
Questions Overview Properties Semantics Insensitivity 
How 
UofT:DB Group 
Sensitive (Set): 
Insensitive (Bag): Operator semantics de

More Related Content

Viewers also liked

2015 TaPP - Towards Constraint-based Explanations for Answers and Non-Answers
2015 TaPP - Towards Constraint-based Explanations for Answers and Non-Answers2015 TaPP - Towards Constraint-based Explanations for Answers and Non-Answers
2015 TaPP - Towards Constraint-based Explanations for Answers and Non-Answers
Boris Glavic
 
Administracion del rrhh
Administracion del rrhhAdministracion del rrhh
Administracion del rrhh
sdiaz5
 
TaPP 2015 - Towards Constraint-based Explanations for Answers and Non-Answers
TaPP 2015 - Towards Constraint-based Explanations for Answers and Non-AnswersTaPP 2015 - Towards Constraint-based Explanations for Answers and Non-Answers
TaPP 2015 - Towards Constraint-based Explanations for Answers and Non-Answers
Boris Glavic
 
Anh trang trinh chieu
Anh trang  trinh chieuAnh trang  trinh chieu
Anh trang trinh chieu
huongvuduy
 

Viewers also liked (16)

2015 TaPP - Towards Constraint-based Explanations for Answers and Non-Answers
2015 TaPP - Towards Constraint-based Explanations for Answers and Non-Answers2015 TaPP - Towards Constraint-based Explanations for Answers and Non-Answers
2015 TaPP - Towards Constraint-based Explanations for Answers and Non-Answers
 
Lkti aaaaa
Lkti aaaaaLkti aaaaa
Lkti aaaaa
 
Administracion del rrhh
Administracion del rrhhAdministracion del rrhh
Administracion del rrhh
 
Bahan tayang-pkn-3
Bahan tayang-pkn-3Bahan tayang-pkn-3
Bahan tayang-pkn-3
 
TaPP 2015 - Towards Constraint-based Explanations for Answers and Non-Answers
TaPP 2015 - Towards Constraint-based Explanations for Answers and Non-AnswersTaPP 2015 - Towards Constraint-based Explanations for Answers and Non-Answers
TaPP 2015 - Towards Constraint-based Explanations for Answers and Non-Answers
 
Ipaw14 presentation Quan, Tanu, Ian
Ipaw14 presentation Quan, Tanu, IanIpaw14 presentation Quan, Tanu, Ian
Ipaw14 presentation Quan, Tanu, Ian
 
2015 TaPP - Interoperability for Provenance-aware Databases using PROV and JSON
2015 TaPP - Interoperability for Provenance-aware Databases using PROV and JSON2015 TaPP - Interoperability for Provenance-aware Databases using PROV and JSON
2015 TaPP - Interoperability for Provenance-aware Databases using PROV and JSON
 
Akm ch 17 obligasi
Akm ch 17 obligasi Akm ch 17 obligasi
Akm ch 17 obligasi
 
2016 QDB VLDB Workshop - Towards Rigorous Evaluation of Data Integration Syst...
2016 QDB VLDB Workshop - Towards Rigorous Evaluation of Data Integration Syst...2016 QDB VLDB Workshop - Towards Rigorous Evaluation of Data Integration Syst...
2016 QDB VLDB Workshop - Towards Rigorous Evaluation of Data Integration Syst...
 
Akm ch 15 saham
Akm ch 15 sahamAkm ch 15 saham
Akm ch 15 saham
 
Proctor & gamble pampers
Proctor & gamble pampersProctor & gamble pampers
Proctor & gamble pampers
 
Anh trang trinh chieu
Anh trang  trinh chieuAnh trang  trinh chieu
Anh trang trinh chieu
 
R DATE TIME and FACTORS
R DATE TIME and FACTORSR DATE TIME and FACTORS
R DATE TIME and FACTORS
 
R DATA STRUCTURES 2
R DATA STRUCTURES 2R DATA STRUCTURES 2
R DATA STRUCTURES 2
 
Akm ch 7 kas
Akm ch 7 kasAkm ch 7 kas
Akm ch 7 kas
 
2016 VLDB - Messing Up with Bart: Error Generation for Evaluating Data-Cleani...
2016 VLDB - Messing Up with Bart: Error Generation for Evaluating Data-Cleani...2016 VLDB - Messing Up with Bart: Error Generation for Evaluating Data-Cleani...
2016 VLDB - Messing Up with Bart: Error Generation for Evaluating Data-Cleani...
 

Similar to TaPP 2011 Talk Boris - Reexamining some Holy Grails of Provenance

[ABDO] Data Integration
[ABDO] Data Integration[ABDO] Data Integration
[ABDO] Data Integration
Carles Farré
 
Mathematical foundations of computer science
Mathematical foundations of computer scienceMathematical foundations of computer science
Mathematical foundations of computer science
BindhuBhargaviTalasi
 
CPSC 125 Ch 1 sec 2
CPSC 125 Ch 1 sec 2CPSC 125 Ch 1 sec 2
CPSC 125 Ch 1 sec 2
David Wood
 
多媒體資料庫(New)3rd
多媒體資料庫(New)3rd多媒體資料庫(New)3rd
多媒體資料庫(New)3rd
Kevingo Tsai
 
A Distributed Tableau Algorithm for Package-based Description Logics
A Distributed Tableau Algorithm for Package-based Description LogicsA Distributed Tableau Algorithm for Package-based Description Logics
A Distributed Tableau Algorithm for Package-based Description Logics
Jie Bao
 
[ABDO] Logic As A Database Language
[ABDO] Logic As A Database Language[ABDO] Logic As A Database Language
[ABDO] Logic As A Database Language
Carles Farré
 

Similar to TaPP 2011 Talk Boris - Reexamining some Holy Grails of Provenance (20)

[ABDO] Data Integration
[ABDO] Data Integration[ABDO] Data Integration
[ABDO] Data Integration
 
DISMATH_Part1
DISMATH_Part1DISMATH_Part1
DISMATH_Part1
 
defense
defensedefense
defense
 
Mathematical foundations of computer science
Mathematical foundations of computer scienceMathematical foundations of computer science
Mathematical foundations of computer science
 
Dissertation Defense - Managing and Consuming Completeness Information for RD...
Dissertation Defense - Managing and Consuming Completeness Information for RD...Dissertation Defense - Managing and Consuming Completeness Information for RD...
Dissertation Defense - Managing and Consuming Completeness Information for RD...
 
Parallel Datalog Reasoning in RDFox Presentation
Parallel Datalog Reasoning in RDFox PresentationParallel Datalog Reasoning in RDFox Presentation
Parallel Datalog Reasoning in RDFox Presentation
 
CPSC 125 Ch 1 sec 2
CPSC 125 Ch 1 sec 2CPSC 125 Ch 1 sec 2
CPSC 125 Ch 1 sec 2
 
Discrete Math Lecture 02: First Order Logic
Discrete Math Lecture 02: First Order LogicDiscrete Math Lecture 02: First Order Logic
Discrete Math Lecture 02: First Order Logic
 
多媒體資料庫(New)3rd
多媒體資料庫(New)3rd多媒體資料庫(New)3rd
多媒體資料庫(New)3rd
 
Logic
LogicLogic
Logic
 
Uwe Friedrichsen – Extreme availability and self-healing data with CRDTs - No...
Uwe Friedrichsen – Extreme availability and self-healing data with CRDTs - No...Uwe Friedrichsen – Extreme availability and self-healing data with CRDTs - No...
Uwe Friedrichsen – Extreme availability and self-healing data with CRDTs - No...
 
Fun never stops. introduction to haskell programming language
Fun never stops. introduction to haskell programming languageFun never stops. introduction to haskell programming language
Fun never stops. introduction to haskell programming language
 
Intelligent Methods in Models of Text Information Retrieval: Implications for...
Intelligent Methods in Models of Text Information Retrieval: Implications for...Intelligent Methods in Models of Text Information Retrieval: Implications for...
Intelligent Methods in Models of Text Information Retrieval: Implications for...
 
Test Sequence Generation with Cayley Graphs (Talk @ A-MOST 2021)
Test Sequence Generation with Cayley Graphs (Talk @ A-MOST 2021)Test Sequence Generation with Cayley Graphs (Talk @ A-MOST 2021)
Test Sequence Generation with Cayley Graphs (Talk @ A-MOST 2021)
 
Semantics and optimisation of the SPARQL1.1 federation extension
Semantics and optimisation of the SPARQL1.1 federation extensionSemantics and optimisation of the SPARQL1.1 federation extension
Semantics and optimisation of the SPARQL1.1 federation extension
 
A Distributed Tableau Algorithm for Package-based Description Logics
A Distributed Tableau Algorithm for Package-based Description LogicsA Distributed Tableau Algorithm for Package-based Description Logics
A Distributed Tableau Algorithm for Package-based Description Logics
 
Clojure made simple - Lightning talk
Clojure made simple - Lightning talkClojure made simple - Lightning talk
Clojure made simple - Lightning talk
 
[ABDO] Logic As A Database Language
[ABDO] Logic As A Database Language[ABDO] Logic As A Database Language
[ABDO] Logic As A Database Language
 
Pwl rewal-slideshare
Pwl rewal-slidesharePwl rewal-slideshare
Pwl rewal-slideshare
 
5 csp
5 csp5 csp
5 csp
 

More from Boris Glavic

WBDB 2012 - "Big Data Provenance: Challenges and Implications for Benchmarking"
WBDB 2012 - "Big Data Provenance: Challenges and Implications for Benchmarking"WBDB 2012 - "Big Data Provenance: Challenges and Implications for Benchmarking"
WBDB 2012 - "Big Data Provenance: Challenges and Implications for Benchmarking"
Boris Glavic
 
SIGMOD 2013 - Patricia's talk on "Value invention for Data Exchange"
SIGMOD 2013 - Patricia's talk on "Value invention for Data Exchange"SIGMOD 2013 - Patricia's talk on "Value invention for Data Exchange"
SIGMOD 2013 - Patricia's talk on "Value invention for Data Exchange"
Boris Glavic
 

More from Boris Glavic (11)

2019 - SIGMOD - Uncertainty Annotated Databases - A Lightweight Approach for ...
2019 - SIGMOD - Uncertainty Annotated Databases - A Lightweight Approach for ...2019 - SIGMOD - Uncertainty Annotated Databases - A Lightweight Approach for ...
2019 - SIGMOD - Uncertainty Annotated Databases - A Lightweight Approach for ...
 
2019 - SIGMOD - Going Beyond Provenance: Explaining Query Answers with Patter...
2019 - SIGMOD - Going Beyond Provenance: Explaining Query Answers with Patter...2019 - SIGMOD - Going Beyond Provenance: Explaining Query Answers with Patter...
2019 - SIGMOD - Going Beyond Provenance: Explaining Query Answers with Patter...
 
2016 VLDB - The iBench Integration Metadata Generator
2016 VLDB - The iBench Integration Metadata Generator2016 VLDB - The iBench Integration Metadata Generator
2016 VLDB - The iBench Integration Metadata Generator
 
EDBT 2009 - Provenance for Nested Subqueries
EDBT 2009 - Provenance for Nested SubqueriesEDBT 2009 - Provenance for Nested Subqueries
EDBT 2009 - Provenance for Nested Subqueries
 
ICDE 2009 - Perm: Processing Provenance and Data on the same Data Model throu...
ICDE 2009 - Perm: Processing Provenance and Data on the same Data Model throu...ICDE 2009 - Perm: Processing Provenance and Data on the same Data Model throu...
ICDE 2009 - Perm: Processing Provenance and Data on the same Data Model throu...
 
2010 VLDB - TRAMP: Understanding the Behavior of Schema Mappings through Prov...
2010 VLDB - TRAMP: Understanding the Behavior of Schema Mappings through Prov...2010 VLDB - TRAMP: Understanding the Behavior of Schema Mappings through Prov...
2010 VLDB - TRAMP: Understanding the Behavior of Schema Mappings through Prov...
 
WBDB 2012 - "Big Data Provenance: Challenges and Implications for Benchmarking"
WBDB 2012 - "Big Data Provenance: Challenges and Implications for Benchmarking"WBDB 2012 - "Big Data Provenance: Challenges and Implications for Benchmarking"
WBDB 2012 - "Big Data Provenance: Challenges and Implications for Benchmarking"
 
DEBS 2013 - "Ariadne: Managing Fine-Grained Provenance on Data Streams"
DEBS 2013 - "Ariadne: Managing Fine-Grained Provenance on Data Streams"DEBS 2013 - "Ariadne: Managing Fine-Grained Provenance on Data Streams"
DEBS 2013 - "Ariadne: Managing Fine-Grained Provenance on Data Streams"
 
SIGMOD 2013 - Patricia's talk on "Value invention for Data Exchange"
SIGMOD 2013 - Patricia's talk on "Value invention for Data Exchange"SIGMOD 2013 - Patricia's talk on "Value invention for Data Exchange"
SIGMOD 2013 - Patricia's talk on "Value invention for Data Exchange"
 
TaPP 2013 - Provenance for Data Mining
TaPP 2013 - Provenance for Data MiningTaPP 2013 - Provenance for Data Mining
TaPP 2013 - Provenance for Data Mining
 
TaPP 2014 Talk Boris - A Generic Provenance Middleware for Database Queries, ...
TaPP 2014 Talk Boris - A Generic Provenance Middleware for Database Queries, ...TaPP 2014 Talk Boris - A Generic Provenance Middleware for Database Queries, ...
TaPP 2014 Talk Boris - A Generic Provenance Middleware for Database Queries, ...
 

Recently uploaded

Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
PirithiRaju
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
PirithiRaju
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Sérgio Sacani
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Sérgio Sacani
 
The Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxThe Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptx
seri bangash
 

Recently uploaded (20)

Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.
 
Factory Acceptance Test( FAT).pptx .
Factory Acceptance Test( FAT).pptx       .Factory Acceptance Test( FAT).pptx       .
Factory Acceptance Test( FAT).pptx .
 
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRLKochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
 
Introduction to Viruses
Introduction to VirusesIntroduction to Viruses
Introduction to Viruses
 
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate ProfessorThyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
 
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
The Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxThe Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptx
 
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Young
Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai YoungDubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Young
Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Young
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdf
 

TaPP 2011 Talk Boris - Reexamining some Holy Grails of Provenance

  • 1. Properties Semantics Insensitivity Stability Conclusions Reexamining some Holy Grails of Provenance Boris Glavic Renee J. Miller University of Toronto UofT:DB Group TaPP 2011, November 5, 2014
  • 2. Properties Semantics Insensitivity Stability Conclusions Insensitivity UofT:DB Group Insensitivity to Query Rewrite Equivalent queries have the same provenance Q Q0 ) P(Q; I ; t) = P(Q0; I ; t) Holy Grails
  • 3. Properties Semantics Insensitivity Stability Conclusions Insensitivity UofT:DB Group Insensitivity to Query Rewrite Equivalent queries have the same provenance Q Q0 ) P(Q; I ; t) = P(Q0; I ; t) Caveat: Which queries are equivalent? Set vs. Bag semantics Query language / Operators Holy Grails
  • 4. Properties Semantics Insensitivity Stability Conclusions Stability UofT:DB Group Stability with Respect to Query Language Extension Extend query language with new operators ) no change to provenance of queries that do not use new operators Holy Grails
  • 5. Properties Semantics Insensitivity Stability Conclusions Where UofT:DB Group Where [Buneman et al., 2003] Captures which attribute values in the result of a query have been copied from which attribute values in the instance. Representation: P(Attr (I )) Where: Operator-level syntax-based annotation propagation IWhere: Insensitive variant: Union of Where for all Q0 with Q0 Q Holy Grails
  • 6. Properties Semantics Insensitivity Stability Conclusions Where UofT:DB Group Where Sensitive, traditionally attributed to being based on query syntax Depends on the internal data- ow inside the query How values are routed through the query IWhere Insensitive by combining Where for all equivalent queries Counterintuitive eect that if (R; t;A) is in the provenance then all (R; t0; A) with t:A = t0:A are in the provenance too. Reason: Can construct equivalent query adding self-join on A Holy Grails
  • 7. Properties Semantics Insensitivity Stability Conclusions Where UofT:DB Group Example Qa = R Qb = A;B(R ./A=C A!C;B!D(R)) R A B r1 1 2 r2 1 3 r3 2 3 r4 2 5 Qa Qb A B a1 1 2 a2 1 3 a3 2 3 a4 2 5 Provenance Where(Qa; a1; A) = f(r1;A)g Where(Qb; a1; A) = f(r1;A); (r2;A)g IWhere(Qa; a1; A) = IWhere(Qb; a1; A) = f(r1;A); (r2;A)g Holy Grails
  • 8. Properties Semantics Insensitivity Stability Conclusions Arguments for Insensitivity UofT:DB Group 1 Traditionally observed as advantageous in database research ) Tradition not a solid argument 2 External implementation of sensitive semantics. Computing provenance for a query dierent from the one that will be executed by DBMS ) No way to solve this, but provenance based on user query seems to be reasonable 3 Implementation of sensitive semantics in DB-engine limits optimizer search space ) Insensitive semantics may be harder to compute ) Lack of practical experience ) Realistic sensitivity example? Holy Grails
  • 9. Properties Semantics Insensitivity Stability Conclusions Instability of IWhere UofT:DB Group IWhere is union of Where for all equivalent queries Holy Grails
  • 10. Properties Semantics Insensitivity Stability Conclusions Instability of IWhere UofT:DB Group IWhere is union of Where for all equivalent queries e.g., SPJ and USPJ equivalences are dierent Holy Grails
  • 11. Properties Semantics Insensitivity Stability Conclusions Instability of IWhere UofT:DB Group IWhere is union of Where for all equivalent queries e.g., SPJ and USPJ equivalences are dierent e.g., union Q with a join of Q with some other relation Holy Grails
  • 12. Properties Semantics Insensitivity Stability Conclusions Instability of IWhere UofT:DB Group IWhere is union of Where for all equivalent queries e.g., SPJ and USPJ equivalences are dierent e.g., union Q with a join of Q with some other relation Let UWhere be IWhere for USPJ queries Holy Grails
  • 13. Properties Semantics Insensitivity Stability Conclusions Instability of IWhere UofT:DB Group IWhere is union of Where for all equivalent queries e.g., SPJ and USPJ equivalences are dierent e.g., union Q with a join of Q with some other relation Let UWhere be IWhere for USPJ queries ) UWhere an attribute value is annotated with all annotations from attribute positions that have the same value Holy Grails
  • 14. Properties Semantics Insensitivity Stability Conclusions Instability of IWhere UofT:DB Group Example Qa = R R A B r1 1 2 r2 1 3 r3 2 3 r4 2 5 S C s1 2 s2 3 Qa Qb A B a1 1 2 a2 1 3 a3 2 3 a4 2 5 Provenance Where(Qa; a3;A) = f(r3; A)g IWhere(Qa; a3;A) = f(r3; A); (r4; A)g UWhere(Qa; a3;A) = f(r3; A); (r4; A); (r1;B); (s1; C)g Holy Grails
  • 15. Properties Semantics Insensitivity Stability Conclusions Conclusions UofT:DB Group Take Away Messages Be careful how to achieve a property Insensitivity less applicable to semantics that address internal data- ow Queries with the same external but possibly dierent internal behaviour have the same provenance Some Things I'd Like to See Declarative Semantics ) derive operator-level construction Semantics model processing, but have a insensitive core The never-ending quest: Deal with Negation Other data-models (order) Holy Grails
  • 16. Questions Overview Properties Semantics Insensitivity Questions UofT:DB Group Semantics Sound Complete Responsible Insensitive (set) Insensitive (bag) Stable Why Wit - X - X X X Why - X - - ? X IWhy - X X X X X Where Where - - - - ? X IWhere - - - X X - How - X - - X X Lineage-based Lineage X X - - - X PI-CS X X - - - X C-CS X - - - - X Causality - X X X X X Reexamining some Holy Grails of Provenance
  • 17. Questions Overview Properties Semantics Insensitivity Semantics Summary UofT:DB Group Representation Used by P(Attr (I )) Where, IWhere P(P(Tuple(I ))) Wit, Why, IWhy N[Tuple(I )] How f R 1 ; : : : ; R n j R i Ri (Q)g Lineage P(f t1; : : : ; tn j ti 2 Ri (Q) _ ti =?g) PI-CS, C- CS P(Tuple(I )) Causality Reexamining some Holy Grails of Provenance
  • 18. Questions Overview Properties Semantics Insensitivity Sound, Complete, Responsible UofT:DB Group Sound, Complete, Responsible Sound: Provenance of t produces nothing dierent from t. t06= t ) t062 Q(P(Q; I ; t)) Reexamining some Holy Grails of Provenance
  • 19. Questions Overview Properties Semantics Insensitivity Sound, Complete, Responsible UofT:DB Group Sound, Complete, Responsible Sound: Provenance of t produces nothing dierent from t. t06= t ) t062 Q(P(Q; I ; t)) Caveat: Semantics of evaluating query over provenance Reexamining some Holy Grails of Provenance
  • 20. Questions Overview Properties Semantics Insensitivity Sound, Complete, Responsible UofT:DB Group Sound, Complete, Responsible Sound: Provenance of t produces nothing dierent from t. t06= t ) t062 Q(P(Q; I ; t)) Complete: Provenance of t produces at least t t 2 Q(P(Q; I ; t)) Reexamining some Holy Grails of Provenance
  • 21. Questions Overview Properties Semantics Insensitivity Sound, Complete, Responsible UofT:DB Group Sound, Complete, Responsible Sound: Provenance of t produces nothing dierent from t. t06= t ) t062 Q(P(Q; I ; t)) Complete: Provenance of t produces at least t t 2 Q(P(Q; I ; t)) Caveat: Semantics of evaluating query over provenance Reexamining some Holy Grails of Provenance
  • 22. Questions Overview Properties Semantics Insensitivity Sound, Complete, Responsible UofT:DB Group Sound, Complete, Responsible Sound: Provenance of t produces nothing dierent from t. t06= t ) t062 Q(P(Q; I ; t)) Complete: Provenance of t produces at least t t 2 Q(P(Q; I ; t)) Responsible: Every tuple in the provenance of t is necessary to derive t Reexamining some Holy Grails of Provenance
  • 23. Questions Overview Properties Semantics Insensitivity Sound, Complete, Responsible UofT:DB Group Sound, Complete, Responsible Sound: Provenance of t produces nothing dierent from t. t06= t ) t062 Q(P(Q; I ; t)) Complete: Provenance of t produces at least t t 2 Q(P(Q; I ; t)) Responsible: Every tuple in the provenance of t is necessary to derive t Caveat: . . . from every alternative derivation in the provenance . . . ) factor provenance into alterative derivations Caveat: Dierent ways to model that. E.g., 8t0 2 P(Q; I ; t) : t =2 Q(P(Q; I ; t) ft0g) Reexamining some Holy Grails of Provenance
  • 24. Questions Overview Properties Semantics Insensitivity Sound, Complete, Responsible UofT:DB Group Sound, Complete, Responsible Example Qb = A;B(R ./A=C A!C;B!D(R)) R A B r1 1 2 r2 1 3 r3 2 3 r4 2 5 Qb A B a1 1 2 a2 1 3 a3 2 3 a4 2 5 Provenance P(Qb; I ; a1) = fr1; r2g Reexamining some Holy Grails of Provenance
  • 25. Questions Overview Properties Semantics Insensitivity Lineage-based UofT:DB Group Lineage-based [Cui et al., 2000] Operator-level declarative semantics similar to Why. Provenance is modeled as a list of subsets of the relations accessed by the query (leafs of the algebra tree of Q) Representation: f R 1 ; : : : ; R n j R i Ri (Q)g Lineage: List of subsets of the algebra-tree nodes Reexamining some Holy Grails of Provenance
  • 26. Questions Overview Properties Semantics Insensitivity Lineage-based UofT:DB Group Lineage-based [Glavic et al., 2009] Provenance is modeled as a set of witness lists. A witness list is a list of tuples - one from each relation accessed by the query. Representation: P(f t1; : : : ; tn j ti 2 Ri (Q) _ ti =?g) PI-CS: Lineage with dierent representation and broader query language coverage C-CS: Similar to Where but with tuple granularity Reexamining some Holy Grails of Provenance
  • 27. Questions Overview Properties Semantics Insensitivity Lineage-based UofT:DB Group Example Qc = A(R) [ B(R) Qd = A(R ./B=C S) R A B r1 1 2 r2 1 3 r3 2 3 r4 2 5 S C s1 2 s2 3 Qc A c1 1 c2 2 c3 3 c4 5 Qd Qe A d1 1 d2 2 Provenance PI CS(Qc ; c2) = f?; r1 ; r3;?; r4;?g PI CS(Qd ; d1) = f r1; s1 ; r2; s2 g C CS(Qd ; d1) = f r1;?; r2;?g Lineage(Qd ; d1) = fr1; r2g; fs1; s2g Reexamining some Holy Grails of Provenance
  • 28. Questions Overview Properties Semantics Insensitivity Why UofT:DB Group Why [Buneman et al., 2003] Why-provenance models provenance as a set of witnesses. A witness w for a tuple t is a subset of the instance I where t 2 Q(w). Representation: P(P(Tuple(I ))) Wit: Set of all witnesses Why: Query-syntax based proof-witnesses IWhy: Minimal elements from Wit resp. Why Reexamining some Holy Grails of Provenance
  • 29. Questions Overview Properties Semantics Insensitivity Why UofT:DB Group Example Qa = R R A B r1 1 2 r2 1 3 r3 2 3 r4 2 5 Qa A B a1 1 2 a2 1 3 a3 2 3 a4 2 5 Provenance Wit(Qa; a1) = fJ j J R ^ r1 2 Jg Why(Qa; a1) = ffr1gg IWhy(Qa; a1) = ffr1gg Reexamining some Holy Grails of Provenance
  • 30. Questions Overview Properties Semantics Insensitivity How UofT:DB Group Provenance Semirings [Green et al., 2007] Tuples of relations annotated with elements from a semiring. Annotation propagation de
  • 31. ned for positive relational algebra as operations of the semiring (set dierence and aggregration later). Representation: N(Tuple(I )) How: Most general form of annotations: polynomials over variable representing the instance tuples. Addition indicates alternative use of tuples; multiplication conjunctive use. Reexamining some Holy Grails of Provenance
  • 32. Questions Overview Properties Semantics Insensitivity How UofT:DB Group Example Qb = A;B(R ./A=C A!C;B!D(R)) R A B r1 1 2 r2 1 3 r3 2 3 r4 2 5 Qb A B a1 1 2 a2 1 3 a3 2 3 a4 2 5 Provenance How(Qb; a1) = r 2 1 + r1 r2 Reexamining some Holy Grails of Provenance
  • 33. Questions Overview Properties Semantics Insensitivity Causality UofT:DB Group Causality [Meliou et al., 2010] Provenance is modeled as a set of causes. A cause c 2 I for a tuple t is de
  • 34. ned as follows: 1 t62 Q(I fcg) 2 there exists a set C I called contingency so that t 2 Q(I C) and t62 Q(I C fcg) Representation: P(Tuple(I )) Causality: Set of all causes Reexamining some Holy Grails of Provenance
  • 35. Questions Overview Properties Semantics Insensitivity Causality UofT:DB Group Example Qa = R R A B r1 1 2 r2 1 3 r3 2 3 r4 2 5 Qa A B a1 1 2 a2 1 3 a3 2 3 a4 2 5 Provenance Causality(Qa; a1) = fr1g Reexamining some Holy Grails of Provenance
  • 36. Questions Overview Properties Semantics Insensitivity Insensitivity to Query Rewrite UofT:DB Group Insensitivity to Query Rewrite Equivalent queries have the same provenance Q Q0 ) P(Q; I ; t) = P(Q0; I ; t) Set resp. Bag semantics Overview Insensitive (Set, Bag) Wit, IWhy, IWhere, Causality Insensitive (Bag) How Sensitive: Lineage, PI-CS, C-CS, Where Reexamining some Holy Grails of Provenance
  • 37. Questions Overview Properties Semantics Insensitivity Why UofT:DB Group Wit De
  • 38. ned over black-box behaviour of query ) trivially insensitive Why Sensitive, traditionally attributed to being based on query syntax Why may contain tuples that do not contribute to t ) Equivalent queries that apply redundant computations may contain larger provenance Caveat: But why does this argument not apply to Wit? Positive queries: super-set of a witness is also a witness ) tuples used by redundant computations are in Wit IWhy Reexamining some Holy Grails of Provenance Minimal elements of Why
  • 39. Questions Overview Properties Semantics Insensitivity Why UofT:DB Group Example Qa = R Qb = A;B(R ./A=C A!C;B!D(R)) R A B r1 1 2 r2 1 3 r3 2 3 r4 2 5 Qa Qb A B a1 1 2 a2 1 3 a3 2 3 a4 2 5 Provenance Wit(Qa; a1) = Wit(Qb; a1) = fJ j J R ^ r1 2 Jg Why(Qa; a1) = ffr1gg Why(Qb; a1) = ffr1g; fr1; r2gg IWhy(Qa; a1) = IWhy(Qb; a1) = ffr1gg Reexamining some Holy Grails of Provenance
  • 40. Questions Overview Properties Semantics Insensitivity Lineage-based UofT:DB Group Provenance representation based on query syntax Trivial examples for sensitivity based on reordering of the arguments of commutative operators Reexamining some Holy Grails of Provenance
  • 41. Questions Overview Properties Semantics Insensitivity Lineage-based UofT:DB Group Example Qd = A(R ./B=C S) Qe = A(S ./B=C R) R A B r1 1 2 r2 1 3 r3 2 3 r4 2 5 S C s1 2 s2 3 Qd Qe A d1 1 d2 2 Provenance Lineage(Qd ; d1) = fr1; r2g; fs1; s2g Lineage(Qe ; d1) = fs1; s2g; fr1; r2g PI CS(Qd ; d1) = f r1; s1 ; r2; s2 g PI CS(Qe ; d1) = f s1; r1 ; s2; r2 g Reexamining some Holy Grails of Provenance
  • 42. Questions Overview Properties Semantics Insensitivity How UofT:DB Group Sensitive (Set): Insensitive (Bag): Operator semantics de
  • 43. ned to take bag semantics into account Reexamining some Holy Grails of Provenance
  • 44. Questions Overview Properties Semantics Insensitivity How UofT:DB Group Example Qa = R Qb = A;B(R ./A=C A!C;B!D(R)) R A B r1 1 2 r2 1 3 r3 2 3 r4 2 5 Qa Qb A B a1 1 2 a2 1 3 a3 2 3 a4 2 5 Provenance How(Qa; a1) = r1 How(Qb; a1) = r 2 1 + r1 r2 Reexamining some Holy Grails of Provenance
  • 45. Questions Overview Properties Semantics Insensitivity Causality UofT:DB Group Trivially insensitive: De
  • 46. ned over the black-box behaviour of a query Reexamining some Holy Grails of Provenance
  • 47. Questions Overview Properties Semantics Insensitivity Causality UofT:DB Group Example Qa = R Qb = A;B(R ./A=C A!C;B!D(R)) R A B r1 1 2 r2 1 3 r3 2 3 r4 2 5 Qa Qb A B a1 1 2 a2 1 3 a3 2 3 a4 2 5 Provenance Causality(Qa; a1) = Causality(Qb; a1) = fr1g Reexamining some Holy Grails of Provenance