From Provenance Standards and Tools to Queries and Actionable Provenance
Constraint Provenance Games Unify Why and Why-Not
1. Towards Constraint Provenance Games
Sean Riddle, Sven Köhler, Bertram Ludäscher
University of California, Davis
2. Constraint Provenance: Game Plan
• Overture:
  – Provenance Semirings & Database Queries
• Act I: From Games to Provenance
  – Leitmotiv: Define Game Provenance!
  – Why is position X won, lost, or drawn in a game?
    • win(X) :- move(X,Y), not win(Y).
• Act II: Provenance Games:
  – Queries are Games => apply Game Provenance!
  – Uniform Notion for Why & Why-Not Provenance!
TaPP'2014 @ DLR, Cologne
3. Constraint Provenance: Game Plan
• Act III: Constraint Provenance Games:
  – Why-Not Provenance: The many ways to fail …
    • Domain dependence of Provenance Games
  – Constraints to the rescue!
• Conclusions:
  – … to be continued …
4. Constraint Provenance: Game Plan
• Overture
• Act I: From Games to Provenance
• Act II: Provenance Games
• Act III: Constraint Provenance Games
• Conclusions
5. Example: Go from X to Y in 3 hops! (a = DLR, b = Hotel Quelle, c = Downtown)
• hop(X,Y) := edges of the input graph: a -p-> a, a -q-> b, b -r-> a, b -s-> c
• 3hop(X,Y) :- hop(X, Z1), hop(Z1, Z2), hop(Z2,Y).
3hop edges with their provenance:
  a -> a: ppp + pqr + qrp
  a -> b: ppq + qrq
  a -> c: pqs
  b -> a: ppr + qrr
  b -> b: rpq
  b -> c: rqs
Note: Can't go from c to a in 3 hops!
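The 3hop rule above can be sketched in Python to see where the polynomials come from. This is an illustrative sketch, not the authors' implementation: the names `HOP`, `three_hop_provenance` and `show` are assumptions, monomials are written as sorted label strings (so qrp becomes pqr), and coefficients count the number of distinct paths.

```python
from collections import Counter
from itertools import product

# Annotated input graph from the slide: hop(x, y) with its edge label.
HOP = {("a", "a"): "p", ("a", "b"): "q", ("b", "a"): "r", ("b", "c"): "s"}
NODES = ["a", "b", "c"]


def three_hop_provenance(hop):
    """Map each derivable pair (x, y) to a Counter {monomial: coefficient}."""
    prov = {}
    for x, z1, z2, y in product(NODES, repeat=4):
        edges = [(x, z1), (z1, z2), (z2, y)]
        if all(e in hop for e in edges):
            # One derivation = product of the three edge labels (commutative).
            monomial = "".join(sorted(hop[e] for e in edges))
            prov.setdefault((x, y), Counter())[monomial] += 1
    return prov


def show(poly):
    """Render a polynomial such as 'ppp + 2pqr'."""
    return " + ".join((str(k) if k > 1 else "") + m for m, k in sorted(poly.items()))


prov = three_hop_provenance(HOP)
for pair in sorted(prov):
    print(pair, show(prov[pair]))
```

No entry is produced for the pair (c, a), which matches the note that c cannot reach a in 3 hops.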
6. Provenance Polynomials
One Semiring to Rule them all!
Green, Karvounarakis, Tannen. Provenance semirings, PODS, 2007
7. Provenance Polynomials
„Mein Schatz!“ ("My precious!")
[Figure: hierarchy of provenance semirings, illustrated on the provenance of 3hop(a,a), from the most informative form to coarser ones: p^3 + 2pqr, p^3 + pqr, p + 2pqr, p + pqr, pqr, p]
8. Problem: Negation & Why-Not
• Provenance Semirings work well for:
  – Positive Queries (e.g., RA+)
• Challenges: Handling of
  – set difference (~ negation)
  – Why-not provenance
• A fresh look at provenance!
• … using an old idea: Game semantics!
9. Constraint Provenance: Game Plan
• Overture
• Act I: From Games to Provenance
• Act II: Provenance Games
• Act III: Constraint Provenance Games
• Conclusions
10. A Game
[Figure: game graph over the positions a, b, c, d, e, f, g, h, k, l, m, n]
11. Solving the Game
[Figure: game graph over the positions a, b, c, d, e, f, g, h, k, l, m, n]
All successors won ⇒ position lost
Some successor lost ⇒ position won
12. Solving the Game
[Figure: game graph over the positions a, b, c, d, e, f, g, h, k, l, m, n]
All leaves (dead-ends) are immediately lost!
13. Solving the Game
[Figure: game graph over the positions a, b, c, d, e, f, g, h, k, l, m, n]
X is won if there exists a move to a lost Y
14. Solving the Game
[Figure: game graph over the positions a, b, c, d, e, f, g, h, k, l, m, n]
X is lost if all moves lead to a won Y
15. Solving the Game
[Figure: game graph over the positions a, b, c, d, e, f, g, h, k, l, m, n]
Repeat until no change => drawn positions remain
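The solving procedure from the preceding slides can be sketched in Python as a simple fixpoint loop. The edges of the slide's game graph are not recoverable from this transcript, so the move graph below is a hypothetical example, and `solve_game` is an illustrative name. The sketch rescans all positions in every round and makes no attempt to be linear in |Move|.

```python
def solve_game(moves):
    """moves: dict position -> list of successor positions.

    Returns a dict position -> 'won' | 'lost' | 'drawn'.
    """
    positions = set(moves) | {y for ys in moves.values() for y in ys}
    status = {}
    changed = True
    while changed:  # repeat until no change
        changed = False
        for x in positions:
            if x in status:
                continue
            succ = moves.get(x, [])
            if any(status.get(y) == "lost" for y in succ):
                status[x] = "won"   # some successor is lost
                changed = True
            elif all(status.get(y) == "won" for y in succ):
                status[x] = "lost"  # all successors won; also covers dead ends
                changed = True
    # Whatever could not be decided is drawn.
    return {x: status.get(x, "drawn") for x in positions}


# Hypothetical move graph: a chain a -> b -> c, a cycle d <-> e, and f above both.
example_moves = {"a": ["b"], "b": ["c"], "c": [], "d": ["e"], "e": ["d"], "f": ["d", "a"]}
solved = solve_game(example_moves)
print(sorted(solved.items()))
```

In this example the dead end c is lost, b is won, a is lost, f is won (it can move to the lost a), and the cycle d, e stays drawn.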
16. Game Provenance
[Figure: solved game graph over the positions a to n; nodes are annotated with 1, 2, 3, or oo]
• Game can be solved in time linear in |Move|
• One rule to rule them all! win(X) :- move(X,Y), not win(Y)
• node color => edge color
  – good vs bad moves
• good moves = natural, new notion of provenance!
Aside: Games ~ Argumentation Frameworks
  win(X) :- move(X,Y), not win(Y)
  def(X) :- attacks(Y,X), not def(Y)
17. Game Provenance
[Figure: solved game graph over the positions a to n; nodes are annotated with 1, 2, 3, or oo]
Move types (row = color of source position, column = color of target position):
        W          D          L
  W     bad        bad        winning
  D     bad        drawing    n/a
  L     delaying   n/a        n/a
Extracting Provenance:
✓ Why/how win(x)?
  • [x] –G.(R.G)*–> [y]
✓ Why-not win(x)?
  • [x] –(R.G)*–> [y]
  • [x] –(Y+)–> [y]
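The move-type table and the path expressions above can be illustrated with a short Python sketch. It assumes a game that has already been solved (the `status` dict) and reuses a hypothetical move graph, since the slide's own graph is not recoverable here. The function names are illustrative, and reading G, R, Y as winning, delaying and drawing moves is an assumption based on the table.

```python
# Good move types by (status of source, status of target); everything else is bad.
MOVE_TYPE = {
    ("won", "lost"): "winning",    # G
    ("lost", "won"): "delaying",   # R
    ("drawn", "drawn"): "drawing", # Y
}


def classify_moves(moves, status):
    """Label every edge of the move graph with its move type."""
    return {
        (x, y): MOVE_TYPE.get((status[x], status[y]), "bad")
        for x, succ in moves.items()
        for y in succ
    }


def provenance_edges(moves, status, start):
    """Collect all edges reachable from start by following good moves only."""
    labeled = classify_moves(moves, status)
    seen, stack, result = {start}, [start], set()
    while stack:
        x = stack.pop()
        for y in moves.get(x, []):
            if labeled[(x, y)] != "bad":
                result.add((x, y))
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
    return result


# Hypothetical solved game (not the graph from the slide).
example_moves = {"a": ["b"], "b": ["c"], "c": [], "d": ["e"], "e": ["d"], "f": ["d", "a"]}
example_status = {"a": "lost", "b": "won", "c": "lost", "d": "drawn", "e": "drawn", "f": "won"}

labels = classify_moves(example_moves, example_status)
why_f = provenance_edges(example_moves, example_status, "f")
print(labels)
print(sorted(why_f))
```

Starting from the won position f, the good moves form the alternating path f -G-> a -R-> b -G-> c, while the move from f to the drawn position d is bad and is left out.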
18. Game Provenance
[Figure: solved game graph over the positions a to n; nodes are annotated with 1, 2, 3, or oo]
Extracting Provenance:
✓ Why/how win(x)?
  • [x] –G.(R.G)*–> [y]
✓ Why-not win(x)?
  • [x] –(R.G)*–> [y]
  • [x] –(Y+)–> [y]
• Next: play a query evaluation game
• => new why-(not) provenance via games!
19. Constraint Provenance: Game Plan
• Overture
• Act I: From Games to Provenance
• Act II: Provenance Games
• Act III: Constraint Provenance Games
• Conclusions
20. Provenance (or Query Evaluation) Games
"SLD-resolution game"
Next (Example): A(X) :– B(X,Y,Z) … not C(X,Y) …
21. Translation: Q(I) => G_Q(I)
[Figure (a): Game template for Q_ABC: A(X) :- B(X, Y), ¬C(Y). Positions: A(X), ¬A(X), r2(X, Y), g_2^1(X, Y), g_2^2(Y), B(X, Y), ¬B(X, Y), rB(X, Y), C(X), ¬C(X), rC(X); edge labels: ∃Y, X:=Y, and the conditions B(X, Y), C(X).]
[Figure (b): Instantiated Q_ABC game on I = {B(a, b), B(b, a), C(a)}, with the template positions grounded over a and b and edge labels ∃a, ∃b.]
22. Solve G_Q(I) => Provenance!
[Figure (b): Instantiated Q_ABC game on I = {B(a, b), B(b, a), C(a)}.]
[Figure (c): Solved game: lost positions are (dark) red; won positions are (light) green. Provenance edges (= good moves) are solid. Bad moves are dashed and not part of the provenance. A(a) is true (A(b) is false) as it is won (lost) in the solved game; the game provenance explains why (why-not).]
Figure 3: Provenance game for Q_ABC. The well-founded model of …
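The translation and solving steps for Q_ABC can be sketched in Python as follows. The edge set is my reading of the game template in the figure (tuple positions, rule positions r2, goal positions g1 and g2, and negated positions), so it should be taken as an assumption rather than the authors' exact construction. All function names are illustrative.

```python
def build_qabc_game(B, C, domain):
    """Move graph for A(X) :- B(X, Y), not C(Y) over the given domain."""
    moves = {}

    def add(src, dst):
        moves.setdefault(src, []).append(dst)
        moves.setdefault(dst, [])

    for x in domain:
        add(("notA", x), ("A", x))
        add(("notC", x), ("C", x))
        add(("g2", x), ("C", x))          # goal 'not C(x)': challenge C(x)
        if x in C:
            add(("C", x), ("rC", x))      # fact node exists only if C(x) is in I
        for y in domain:
            add(("A", x), ("r2", x, y))   # 'exists y': pick a rule instance
            add(("r2", x, y), ("g1", x, y))
            add(("r2", x, y), ("g2", y))
            add(("g1", x, y), ("notB", x, y))
            add(("notB", x, y), ("B", x, y))
            if (x, y) in B:
                add(("B", x, y), ("rB", x, y))
    return moves


def solve(moves):
    """Fixpoint solver: won if some successor is lost, lost if all are won."""
    status = {}
    changed = True
    while changed:
        changed = False
        for x, succ in moves.items():
            if x in status:
                continue
            if any(status.get(y) == "lost" for y in succ):
                status[x], changed = "won", True
            elif all(status.get(y) == "won" for y in succ):
                status[x], changed = "lost", True
    return {x: status.get(x, "drawn") for x in moves}


# Instance from the slide: I = {B(a, b), B(b, a), C(a)}.
game = build_qabc_game(B={("a", "b"), ("b", "a")}, C={"a"}, domain=["a", "b"])
status = solve(game)
print(status[("A", "a")], status[("A", "b")])
```

Under these assumptions A(a) comes out won and A(b) lost, in line with the caption of the solved game.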
23. Happy End (1 of 3)
(a) Input I: graph with edges a -p-> a, a -q-> b, b -r-> a, b -s-> c.
(b) ... annotated:
    hop
    a  a  p
    a  b  q
    b  a  r
    b  c  s
(c) 3hop with provenance:
    3hop
    a  a  p^3 + 2pqr
    a  b  p^2 q + q^2 r
    a  c  pqs
    b  a  p^2 r + q r^2
    b  b  pqr
    b  c  qrs
(d) The game provenance of 3hop(a, a) ... [Figure: solved game graph below 3hop(a, a), with rule positions r1(a, a, z1, z2), goal positions g_1^1, g_1^2, g_1^3, and hop, ¬hop, rhop positions]
(e) ... is p^3 + 2pqr. [Figure: the same graph read as a tree of + and × operations over the leaves p, q, r]
Figure 1: Each edge hop(x, y) in the input graph I in (a) is annotated …
Provenance Game on G_Q(I) = Provenance Polynomials … for positive queries!
24. Happy End (2 of 3)
… but also works for Why-Not provenance & non-monotonic queries (i.e., Q can have negation)!!
Here: not 3hop(c,a) – can't go back from Downtown (c) to DLR (a)
[Figure 2: Why-not provenance for 3hop(c, a) using provenance games. The graph shows ¬3hop(c, a), 3hop(c, a), the nine rule positions r1(c, a, z1, z2), goal positions g_1^1, g_1^2, g_1^3, and hop / ¬hop positions, with edges labeled ∃ z1,z2.]
Paper excerpt (cropped): … g_1^i in the body of r1, thus claiming that g_1^i is false and hence that the r1 instance doesn't derive t. The first player can counter and demonstrate that g_1^i is true by selecting a rule instance or fact as evidence for g_1^i. The game proceeds in rounds until some player cannot move and thus loses (the opponent wins). In [KLZ13] it …
25. Happy End (2 of 3)
5 leaf nodes ~ 5 missing ("hypothetical") edges
Insert those => 3hop(c,a) will be true!
=> What-If provenance!
[Figure 2: Why-not provenance for 3hop(c, a) using provenance games.]
Paper excerpt (cropped): … g_1^i in the body of r1, thus claiming that g_1^i is false and hence that the r1 instance doesn't derive t. The first player can counter and demonstrate that g_1^i is true by selecting a rule instance or fact as evidence for g_1^i. The game proceeds in rounds until some player cannot move and thus loses (the opponent wins). In [KLZ13] it was shown how the provenance of a tuple t can be obtained via a regular path query over a solved game graph like the one in Fig. 1d: e.g., p^3 + 2pqr for 3hop(a, a) is represented by a solved game as shown in Fig. 1e: for positive queries, solved games represent semiring provenance by noting that won (green) and lost (red) positions correspond to "+" and "×" operations, respectively (leaves represent input annotations, here: p, q, r, s) [KLZ13].
Paper excerpt (cropped): … with labels t, u, v, w, and x. These missing edges correspond to the failed leaf nodes in Fig. 2. The table in Fig. 6 contains the why-not provenance, with different combinations of missing edges as preconditions for a derivation of 3hop(c, a).
[Figure 7: Input graph I (edges p, q, r, s) with five additional, hypothetical edges t, u, v, w, x (dashed).]
Paper excerpt (cropped on the right, fragments only): "… Game Construction … Q_ABC. To build the game, each ground tu[ple] … currently 'at' a rule node is … firing is satisfied … claim, the player moves to … The goal, if unsatisfied, … at least one goal is unsatisfied … for the rule node. A detailed example … next section. Constraint provenance games … the firing r2(b, c) was not … Example: Consider the … constraint game in Fig. 5. …"
Paper excerpt (cropped on the right): … domain, also captures the rule non-satisfaction of an infinite set of possible variable bindings to elements possibly outside the active domain. Any constraint that has a variable that is only disequality constrained represents an infinite set of firings. Consider the … node: R1: X≠a, X≠b, Z1=a, Z2=a, Y=a. This corresponds to the (hypothetical) 3hop path c -t-> a -p-> a -p-> a and the situation in which the edge t exists (see first row of Fig. 6). However, it … explains why the rule firing d → a → a → a is not successful. The explanation is the failure of the first goal of the rule. In the case of X=c, it represents that there are no outgoing edges from … the case of X=d or any other invented value this is trivially true. This shows that constraint provenance games do not suffer from the same problems as their fully-grounded counterparts. Provenance can be queried for any imaginable tuple, including one not in the active domain, and the provenance presented is still correct in the presence of a growing active domain.
Figure 6 (fourth column cropped in the source):
    r1(X, Y, Z1, Z2)   X → Z1 → Z2 → Y           Why-Not Provenance
    [Fig. 2]           [Fig. 7]
    r1(c, a, a, a)     c -t-> a -p-> a -p-> a    t    ⇒ t·p·p
    r1(c, a, a, b)     c -t-> a -q-> b -r-> a    t    ⇒ t·q·r
    r1(c, a, a, c)     c -t-> a -u-> c -t-> a    t, u ⇒ t·u·t
    r1(c, a, c, a)     c -v-> c -t-> a -p-> a    t, v ⇒ v·t·p
    r1(c, a, b, c)     c -w-> b -s-> c -t-> a    t, w ⇒ w·s·t
    r1(c, a, c, c)     c -v-> c -v-> c -t-> a    t, v ⇒ v·v·t
    r1(c, a, c, b)     c -v-> c -w-> b -r-> a    v, w ⇒ v·w·r
    r1(c, a, b, a)     c -w-> b -r-> a -p-> a    w    ⇒ w·r·p
    r1(c, a, b, b)     c -w-> b -x-> b -r-> a    w, x ⇒ w·x·r
Figure 6: The nine r1-instances in the first column correspond to … in Fig. 2 from left to right. The 3hop-path is shown in the second column …
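The what-if reading of the why-not provenance can be reproduced with a few lines of Python. For every rule instance r1(c, a, Z1, Z2) the sketch lists the hop edges that are missing from the input graph. The names `HOP`, `DOMAIN` and `why_not_3hop` are illustrative assumptions, and the result is keyed by (Z1, Z2).

```python
from itertools import product

# Input graph from the slides: existing hop edges with their labels.
HOP = {("a", "a"): "p", ("a", "b"): "q", ("b", "a"): "r", ("b", "c"): "s"}
DOMAIN = ["a", "b", "c"]


def why_not_3hop(x, y, hop=HOP, domain=DOMAIN):
    """Map each (z1, z2) to the sorted list of edges missing on x -> z1 -> z2 -> y."""
    result = {}
    for z1, z2 in product(domain, repeat=2):
        path = [(x, z1), (z1, z2), (z2, y)]
        result[(z1, z2)] = sorted({e for e in path if e not in hop})
    return result


missing = why_not_3hop("c", "a")
all_missing = sorted({e for edges in missing.values() for e in edges})

for (z1, z2), edges in missing.items():
    print(f"r1(c, a, {z1}, {z2}) needs {edges}")
print("hypothetical edges:", all_missing)
```

The nine rule instances together mention five distinct missing edges, (c,a), (a,c), (c,c), (c,b) and (b,b), which correspond to the hypothetical edges t, u, v, w and x in the table above.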
26. Are there more ways to fail?
[Figure 3 (b), (c): instantiated and solved Q_ABC game on I = {B(a, b), B(b, a), C(a)}. Lost positions are (dark) red; won positions are (light) green. Provenance edges (= good moves) are solid. Bad moves are dashed and not part of the provenance. A(a) is true (A(b) is false) as it is won (lost) in the solved game; the game provenance explains why (why-not).]
Figure 3: Provenance game for Q_ABC. The well-founded model of win(X) :- M(X, Y), ¬win(Y), applied to move graph M, solves the game.
Two branches that explain Why-not A(b)
[Figure 4: Altered subgraph of Fig. 3c after adding c to the active domain. Below A(b) there are now three branches ∃a, ∃b, ∃c via r2(b, a), r2(b, b), r2(b, c).]
Adding a new constant c to the domain => new why-not answer!
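The domain dependence of the why-not answer can be made concrete with a small Python sketch. It enumerates the rule instances r2(b, Y) for every Y in the domain and records which goal of A(X) :- B(X, Y), ¬C(Y) fails. The function name `why_not_A` and the textual reasons are illustrative assumptions; this is a direct enumeration, not the game construction itself.

```python
# Instance from the slides: I = {B(a, b), B(b, a), C(a)}.
B = {("a", "b"), ("b", "a")}
C = {"a"}


def why_not_A(x, domain):
    """For each y in the domain, list the failed goals of r2(x, y)."""
    reasons = {}
    for y in domain:
        failed = []
        if (x, y) not in B:
            failed.append(f"B({x},{y}) is false")
        if y in C:
            failed.append(f"C({y}) is true")
        reasons[y] = failed
    return reasons


small = why_not_A("b", ["a", "b"])
large = why_not_A("b", ["a", "b", "c"])
print(small)
print(large)
```

With the active domain {a, b} there are two failing branches for A(b). Adding c yields a third one, so the enumerated answer changes whenever the domain grows.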
27. Constraint Provenance: Game Plan
• Overture
• Act I: From Games to Provenance
• Act II: Provenance Games
• Act III: Constraint Provenance Games
• Conclusions
28. Problem: Domain Dependence
• Provenance Game semantics [KLZ13] is (active) domain-dependent!
  – enumerating all (finitely many) ways to fail
    • cf. failed SLD branches
• Key Problem:
  – why-not p(X) ⇔ why/how not p(X)! ⇔ anything not in p! (usually infinite…)
29. Domain Dependence, Infinite Domains
• Key Problem:
  – Why-not p(X) ⇔ why/how not p(X)! ⇔ anything not in p! (may be infinite…)
• Key Idea:
  – Use constraints (equalities, dis-equalities)
    (inspired by Chan's Constructive Negation and CLP)
  3hop(X, Y): X≠a, X≠b, Y=a  =  { 3hop(x, a) | x ∈ D \ {a, b} }
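A constraint such as X≠a, X≠b, Y=a stands for a possibly infinite set of tuples, but membership can still be checked one binding at a time. The following Python sketch shows this with a hypothetical `Constraint` class; it illustrates the idea only and is not the representation used in the authors' prototype.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    """Conjunction of equalities and disequalities over named variables."""
    eq: tuple = ()   # pairs (variable, constant) meaning variable = constant
    neq: tuple = ()  # pairs (variable, constant) meaning variable != constant

    def holds(self, binding):
        """Check whether a concrete variable binding satisfies the constraint."""
        return (all(binding[v] == c for v, c in self.eq)
                and all(binding[v] != c for v, c in self.neq))


# 3hop(X, Y): X != a, X != b, Y = a
node = Constraint(eq=(("Y", "a"),), neq=(("X", "a"), ("X", "b")))

print(node.holds({"X": "c", "Y": "a"}))  # a constant from the active domain
print(node.holds({"X": "d", "Y": "a"}))  # a new constant outside the active domain
print(node.holds({"X": "a", "Y": "a"}))  # excluded by X != a
```

The same node covers 3hop(c, a), 3hop(d, a) and any other new constant for X, without enumerating the domain.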
30. Happy End (3 of 3)
Why-not provenance complete only for adom(I) = {a, b}!
Constraint why-not provenance also captures new constants, i.e., for an unlimited domain D = {a, b, c, …}
=> Constraint Provenance answer is domain independent!
[Figure 3: Provenance game for Q_ABC. (a) Game template for Q_ABC: A(X) :- B(X, Y), ¬C(Y). (b) Instantiated Q_ABC game on I = {B(a, b), B(b, a), C(a)}. (c) Solved game: lost positions are (dark) red; won positions are (light) green. Provenance edges (= good moves) are solid. Bad moves are dashed and not part of the provenance. A(a) is true (A(b) is false) as it is won (lost) in the solved game; the game provenance explains why (why-not). The well-founded model of win(X) :- M(X, Y), ¬win(Y), applied to move graph M, solves the game.]
[Figure 4: Altered subgraph of Fig. 3c after adding c to the active domain.]
[Figure 5: Constraint provenance game for Q_ABC. Unlike in Figure 3, nodes may represent finite or infinite sets here. Example nodes: A: x1 = a; A: x1 = b; ¬A: x1 ≠ a, x1 ≠ b; B: x1 = a, x2 = b; ¬B: x1 ≠ a, x1 ≠ b, x2 = a; C: x1 = a; ¬C: x1 ≠ a; R2: X ≠ a, X ≠ b, Y = a; G_2^1: B: X ≠ a, X ≠ b, Y = a; G_2^2: ¬C: Y ≠ a; RB: x1 = a, x2 = b; RC: x1 = a.]
Paper excerpt (cropped on the left): … the new binding for X; a condition "B(X, Y)" means that a move is possible only if B(X, Y) is true in I for the current X, Y values. Given database I, a template can be instantiated yielding a game graph G_Q(I) as in Fig. 3b. Note how template variables (e.g., Y) have been replaced by domain values (a or b), and that conditional edges (e.g., labeled "C(X)") became unconditional edges (e.g., C(a) → rC(a)) or no edge at all (e.g., from C(b)), depending on whether or not the condition holds in I. To extract why(-not) provenance from a game graph G_Q(I) as in Fig. 3b, we need to solve the game first, i.e., determine which positions are won (light …
Paper excerpt (cropped on the right): … G_Q(I) thus consists only of edges that are matched by the regular path queries (g.r)+ and r.(g.r)*, i.e., alternating sequences of green (winning) and red (delaying) moves [KLZ13].
3. Constraint Provenance Games
Consider the solved game graph of Fig. 3c. If the value c were added to the active domain, the provenance would be incomplete: e.g., to explain why-not A(b) there are two ∃a, ∃b branches emanating from A(b). However, with c in the active domain there is a third ∃c branch via r2(b, c): see Fig. 4. We show that a modified …
31. Why-Not: The Full Story Emerges…
5 missing edges
9 minimal combinations
+ … ?
Constraints imply 15 disjoint relations over key variables X, Z1, Z2, Y
[Figure 9: The why-not provenance of 3hop(c, a). The provenance is represented in the failure of the claim that 3hop(c, a) is in the answer. This is argued over the Boolean expression defining 3hop(x, y). A move from the source node to a child represents the choice of a Boolean expression that is sufficient to … Example nodes: 3Hop: x1 ≠ a, x1 ≠ b, x2 = a; R1: X ≠ a, X ≠ b, Z1 = a, Z2 = a, Y = a; R1: X ≠ a, X ≠ b, Z1 = c, Z2 = b, Y = a; G_1^1: hop: X ≠ a, X ≠ b, Z1 = c; hop: x1 ≠ a, x1 ≠ b, x2 = c; ¬hop: x1 ≠ a, x1 ≠ b, x2 = c.]
[Figure 2: Why-not provenance for 3hop(c, a) using provenance games.]
[Figure 7: Input graph I with five additional, hypothetical edges t, u, v, w, x.]
Paper excerpts (cropped on the right; "…" marks text lost in the crop):
Why-Not Provenance and the Many Ways to Fail. Since games are inherently symmetric (one player's win is the opponent's loss and vice versa), the approach yields an elegant provenance model that unifies why and why-not provenance. Consider the (dark, red) node 3hop(c, a) in Fig. 2. The color coding indicates that the posi… 
Constraint Provenance Games. We propose to solve the problem of domain dependence by modifying provenance games so that they can handle certain infinite relations that can be … represented. For example, in addition to the finitely many … why 3hop(c, a) fails over the active domain adom(I), there are infinitely many others, if we consider new constants d, e, … of adom(I). For example, let relation R = {a, b} have two … R(a) and R(b). If we want to know why-not R(c), we just … c ∉ R. But we could also return a more general answer for why-not R(x) and say that ¬R(x) is true for all x with x ≠ a ∧ x ≠ b (not just for x = c). This approach is inspired by Chan's Constructive Negation [Cha88], a form of constraint logic programming […]. The key idea is to represent (potentially infinite) relations … constraints, i.e., Boolean combinations of equalities x = c … disequalities x ≠ c.
Overview and Contributions. Section 2 briefly explains how first-order queries are translated into games and how provenance is extracted from solved games. In Section 3 we describe the construction of constraint provenance games; additional details and examples are contained in the appendix. Our main contributions …: (i) game provenance provides a uniform treatment of why and why-not provenance for first-order logic (= relational algebra with set-difference); (ii) for positive queries, the approach captures the … informative semiring provenance [GKT07, KG12]; (iii) we … a constraint provenance framework which yields domain independent provenance expressions, extending prior results [KLZ13]; (iv) we implemented a prototype of constraint provenance games …
A. Why-Not 3hop(c, a) Dissected
Consider the input graph in Fig. 1a and its why-not … for 3hop(c, a) in Fig. 2. The graph encodes the re… 3hop(c, a) is not in the answer. Moving from the lost 3hop(c, a) … Fig. 2, there are nine possible rule instantiations r1(c, a, …) … of which represent a reason why there is no 3hop(c, a) … intermediate nodes z1, z2 ∈ {a, b, c}. To better understand the … explanations, consider the input graph in Fig. 7. It contains the original database instance I plus a number of hypothetical edges (dotted), with labels t, u, v, w, and x. These missing edges correspond to the failed leaf nodes in Fig. 2. The table in Fig. 6 contains the why-not provenance, with different combinations of missing edges as preconditions for a derivation of 3hop(c, a).
B. Constraint Game Construction
Consider the query Q_ABC. To build the game, each … tuple in the program such as B(a, b) is replaced by … B: x1=a, x2=b (a conjunction). First, the subgraph for EDB predicates is created. The … of the game is constructed iteratively similar to quer… For rules whose subgoals are all on EDB predicates … nodes/edges are generated. For IDB predicates that … the head of EDB-only rules, tuple nodes are generated …
32. Summary
• Opening [Aufschlag, "serve"]
• Game Provenance [Zwischenspiel, "interlude"]
  – What's the provenance of win-move?
• Provenance Games [Mittelspiel, "middlegame"]
  – Query evaluation is a game => uniform why + why-not provenance
• Constraint Provenance [Endspiel, "endgame"]
  – Domain independent (infinite domains OK)
  – Prototypically implemented
    • WF-Datalog + Z3 + Plumbing
• Outlook [Nachspiel, "postlude"]
  – More work needed!
    • Theoretical properties
    • Relation to Argumentation Frameworks
    • …