Hendrik van Antwerpen
IN4303 Compiler Construction
TU Delft
September 2017
Declare Your Language
Chapter 8: Constraint Resolution II
Reading Material
Unification
Baader, F., and W. Snyder. "Unification Theory". Ch. 8 of Handbook of Automated Deduction. Springer Verlag, Berlin (2001)
http://www.cs.bu.edu/~snyder/publications/UnifChapter.pdf
Efficient Unification with Union-Find
Complexity of Unification (recap)
Space complexity
- Exponential
- Representation of unifier
Time complexity
- Exponential
- Recursive calls on terms
Solution
- Union-Find algorithm
- Amortized cost per operation grows so slowly that it can be considered constant

h(a_1, …, a_n, f(b_0,b_0), …, f(b_{n-1},b_{n-1}), a_n) ==
h(f(a_0,a_0), …, f(a_{n-1},a_{n-1}), b_1, …, b_{n-1}, b_n)

Fully applied unifier
a_1 -> f(a_0,a_0)
a_2 -> f(f(a_0,a_0), f(a_0,a_0))
a_i -> … 2^(i+1) - 1 subterms …
b_1 -> f(a_0,a_0)
b_2 -> f(f(a_0,a_0), f(a_0,a_0))
b_i -> … 2^(i+1) - 1 subterms …

Triangular unifier
a_1 -> f(a_0,a_0)
a_2 -> f(a_1,a_1)
a_i -> … 3 subterms …
b_1 -> f(a_0,a_0)
b_2 -> f(a_1,a_1)
b_i -> … 3 subterms …
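The size difference between the two representations can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and "size" is assumed to mean the number of nodes (function symbols plus variable occurrences) in the term bound to a_i.

```python
def fully_applied_size(i: int) -> int:
    """Count the nodes of the term bound to a_i after substituting everything."""
    size = 1  # a_0 is a single variable node
    for _ in range(i):
        size = 1 + 2 * size  # f(t, t): one f node plus two copies of t
    return size


def triangular_size(i: int) -> int:
    """Count the nodes of the term bound to a_i when variables are left in place."""
    return 1 if i == 0 else 3  # f(a_{i-1}, a_{i-1}) is always three nodes


for i in (1, 2, 10, 20):
    print(i, fully_applied_size(i), triangular_size(i))
```

The fully applied size doubles with every step, while the triangular binding stays at three nodes per variable.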
Set Representatives
a == b
c == a
u == w
v == u
x == v
x == c
[Figure: the variables drawn as two trees, one over a, b, c and one over u, w, v, x; the root of each tree is the representative of its set]

FIND(a):
    b := rep(a)
    if b == a:
        return a
    else:
        return FIND(b)

UNION(a1,a2):
    b1 := FIND(a1)
    b2 := FIND(a2)
    LINK(b1,b2)

LINK(a1,a2):
    rep(a1) := a2
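The FIND, UNION and LINK operations above can be sketched in Python as follows. This is a minimal, unoptimized version; storing the `rep` table in a dictionary and registering unseen variables on first lookup are assumptions of this sketch, not part of the slides.

```python
rep: dict[str, str] = {}  # maps each variable to its parent; roots map to themselves


def find(a: str) -> str:
    """Follow parent links until the representative (root) is reached."""
    b = rep.setdefault(a, a)  # an unseen variable starts as its own representative
    return a if b == a else find(b)


def link(a1: str, a2: str) -> None:
    """Make a2 the parent of a1; both are expected to be roots."""
    rep[a1] = a2


def union(a1: str, a2: str) -> None:
    """Merge the sets containing a1 and a2."""
    b1, b2 = find(a1), find(a2)
    if b1 != b2:
        link(b1, b2)


for lhs, rhs in [("a", "b"), ("c", "a"), ("u", "w"), ("v", "u"), ("x", "v"), ("x", "c")]:
    union(lhs, rhs)

print({v: find(v) for v in "abcuwvx"})
```

After the last constraint, every variable reports the same representative, which is what makes the equality test a simple comparison of roots.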
Path Compression
…
x == b
x == c
x == w
x == v
[Figure: the trees over a, b, c and u, w, v, x, with the nodes visited by FIND re-linked directly to the representative]

FIND(a):
    b := rep(a)
    if b == a:
        return a
    else:
        b := FIND(b)
        rep(a) := b
        return b

UNION(a1,a2):
    b1 := FIND(a1)
    b2 := FIND(a2)
    LINK(b1,b2)

LINK(a1,a2):
    rep(a1) := a2
Tree Balancing
…
x == c
[Figure: the tree rooted at a (size 3) and the tree rooted at u (size 4). Linking u below a gives a longest lookup of 3 steps; linking a below u gives 2 steps.]

FIND(a):
    b := rep(a)
    if b == a:
        return a
    else:
        b := FIND(b)
        rep(a) := b
        return b

UNION(a1,a2):
    b1 := FIND(a1)
    b2 := FIND(a2)
    if b1 != b2:
        LINK(b1,b2)

LINK(a1,a2):
    if size(a2) > size(a1):
        rep(a1) := a2
        size(a2) += size(a1)
    else:
        rep(a2) := a1
        size(a1) += size(a2)
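Both optimizations can be combined in one small Python class. The sketch below follows the pseudocode above; the class name `UnionFind`, the dictionary-based storage, and the guard for already-merged sets are assumptions made for illustration.

```python
class UnionFind:
    """Union-find with path compression and size-based tree balancing."""

    def __init__(self) -> None:
        self.rep: dict[str, str] = {}   # parent link per element
        self.size: dict[str, int] = {}  # tree size, meaningful for roots only

    def find(self, a: str) -> str:
        self.rep.setdefault(a, a)
        self.size.setdefault(a, 1)
        b = self.rep[a]
        if b == a:
            return a
        root = self.find(b)
        self.rep[a] = root  # path compression: point directly at the root
        return root

    def union(self, a1: str, a2: str) -> str:
        b1, b2 = self.find(a1), self.find(a2)
        if b1 == b2:
            return b1  # already in the same set; sizes must not be added twice
        if self.size[b2] > self.size[b1]:
            b1, b2 = b2, b1  # make b1 the root of the larger tree
        self.rep[b2] = b1    # link the smaller tree below the larger one
        self.size[b1] += self.size[b2]
        return b1


uf = UnionFind()
for lhs, rhs in [("a", "b"), ("c", "a"), ("u", "w"), ("v", "u"), ("x", "v"), ("x", "c")]:
    uf.union(lhs, rhs)
print(uf.find("c"), uf.size[uf.find("c")])
```

In this run the three-element tree rooted at a ends up below the four-element tree rooted at u, matching the balancing choice on the slide.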
The Complex Case
h(a_1, …, a_n, f(b_0,b_0), …, f(b_{n-1},b_{n-1}), a_n) ==
h(f(a_0,a_0), …, f(a_{n-1},a_{n-1}), b_1, …, b_{n-1}, b_n)
[Figure: graph with variable nodes a_0, a_1, …, a_{n-1}, a_n and b_0, b_1, …, b_{n-1}, b_n, and term nodes f(a_0,a_0), …, f(a_{n-2},a_{n-2}), f(a_{n-1},a_{n-1}) and f(b_0,b_0), …, f(b_{n-2},b_{n-2}), f(b_{n-1},b_{n-1})]
a_n == b_n
f(a_{n-1},a_{n-1}) == f(b_{n-1},b_{n-1})
a_{n-1} == b_{n-1}    a_{n-1} == b_{n-1}
f(a_{n-2},a_{n-2}) == f(b_{n-2},b_{n-2})
⋮
a_1 == b_1    a_1 == b_1
f(a_0,a_0) == f(b_0,b_0)
a_0 == b_0    a_0 == b_0
How about occurs checks? Postpone!
Union-Find
Main idea
- Represent the unifier as a graph
- One variable represents each equivalence class
- Replace substitution by union & find operations
- Testing equality becomes testing node identity
Optimizations
- Path compression makes recurring lookups fast
- Tree balancing keeps paths short
Complexity
- Linear in space and almost linear in time (each operation costs amortized inverse-Ackermann time)
- Easy to extract a triangular unifier from the graph
- Postpone occurs checks to avoid traversing (potentially) large terms
Martelli, Montanari. An Efficient Unification Algorithm. TOPLAS, 1982
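To make the main idea concrete, here is a hedged Python sketch of unification on top of union-find. It is not the algorithm from the cited paper. The term encoding (variables as strings, compound terms as tuples such as `("f", "a0", "a0")`), the function name `unify`, and the use of path compression without tree balancing are all assumptions chosen to keep the example short.

```python
def unify(equations):
    """Solve a list of (term, term) pairs; return (find, bound) or None on failure."""
    rep, bound = {}, {}  # parent links, and the compound term bound to each root

    def find(v):
        rep.setdefault(v, v)
        if rep[v] != v:
            rep[v] = find(rep[v])  # path compression
        return rep[v]

    work = list(equations)
    while work:
        s, t = work.pop()
        s = find(s) if isinstance(s, str) else s
        t = find(t) if isinstance(t, str) else t
        if isinstance(s, str) and isinstance(t, str):
            if s == t:
                continue               # same node: already equal
            rep[s] = t                 # union of the two classes
            if s in bound:
                term = bound.pop(s)
                if t in bound:
                    work.append((term, bound[t]))
                else:
                    bound[t] = term
        elif isinstance(s, str):
            if s in bound:
                work.append((bound[s], t))
            else:
                bound[s] = t
        elif isinstance(t, str):
            work.append((t, s))        # retry with the variable on the left
        elif s[0] != t[0] or len(s) != len(t):
            return None                # constructor clash
        else:
            work.extend(zip(s[1:], t[1:]))

    # Postponed occurs check: one depth-first search for cycles at the end.
    state = {}

    def acyclic(term):
        if not isinstance(term, str):
            return all(acyclic(arg) for arg in term[1:])
        r = find(term)
        if r in state:
            return state[r] == "done"  # "visiting" means we found a cycle
        state[r] = "visiting"
        ok = r not in bound or acyclic(bound[r])
        state[r] = "done"
        return ok

    if not all(acyclic(v) for v in list(rep)):
        return None
    return find, bound


n = 20  # the complex case, given as the argument-wise equations of the two h terms
equations = [(f"a{i}", ("f", f"a{i-1}", f"a{i-1}")) for i in range(1, n + 1)]
equations += [(f"b{i}", ("f", f"b{i-1}", f"b{i-1}")) for i in range(1, n + 1)]
equations.append((f"a{n}", f"b{n}"))
find, bound = unify(equations)
print(find("a0") == find("b0"))
```

The `bound` table together with `find` is a triangular unifier: each class points at one term whose arguments are again variables. The duplicated equalities produced by decomposing f(t, t) cost only one extra lookup each, because the second copy hits the same node.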
Error Reporting
Reporting Type Errors
Type errors
- Types are supposed to prevent us from making mistakes
- Error messages are supposed to help us fix mistakes
- This does not always work out so well
‣ Wrong error location
‣ Unclear error message
‣ Not always clear where inferred types come from
- Users expect concise, informative messages, at the right place
Type errors = Unsatisfiable constraints
- Constraint satisfaction is a binary property
- Language designers want to influence error messages
Error Example
let
  function f(x) =
    x + 1
in
  (f "Hello!") * 2

Constraints and the program fragments they come from
ty1 == FUN(ty2, ty3)    [function f(x)]
ty2 == INT()            [x]
ty3 == INT()            [_ + _]
ty1 == FUN(ty4, ty5)    [f _]
ty4 == STRING()         ["Hello!"]
ty5 == INT()            [_ * 2]

Solving order 1
ty1 == FUN(ty2, ty3)    ✔
ty1 == FUN(ty4, ty5)    ✔
ty2 == INT()            ✔
ty3 == INT()            ✔
ty5 == INT()            ✔
ty4 == STRING()         ❌

Solving order 2
ty1 == FUN(ty2, ty3)    ✔
ty3 == INT()            ✔
ty4 == STRING()         ✔
ty5 == INT()            ✔
ty2 == INT()            ✔
ty1 == FUN(ty4, ty5)    ❌

Solving order 3
ty1 == FUN(ty4, ty5)    ✔
ty4 == STRING()         ✔
ty5 == INT()            ✔
ty2 == INT()            ✔
ty1 == FUN(ty2, ty3)    ❌
ty3 == INT()

Order of constraint solving is relevant!
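The effect of the solving order can be reproduced with a small Python experiment. The sketch below uses a deliberately simple substitution-based unifier without an occurs check; the term encoding (variables as strings, constructors as tuples) and the helper names are assumptions for illustration, and each constraint is keyed by the program fragment it comes from.

```python
def walk(t, subst):
    """Resolve a variable through the substitution until it is unbound or a term."""
    while isinstance(t, str) and t in subst:
        t = subst[t]
    return t


def unify(s, t, subst):
    """Extend subst so that s and t become equal; return False on a clash."""
    s, t = walk(s, subst), walk(t, subst)
    if s == t:
        return True
    if isinstance(s, str):
        subst[s] = t
        return True
    if isinstance(t, str):
        subst[t] = s
        return True
    return (s[0] == t[0] and len(s) == len(t)
            and all(unify(a, b, subst) for a, b in zip(s[1:], t[1:])))


CONSTRAINTS = {
    "function f(x)": ("ty1", ("FUN", "ty2", "ty3")),
    "x": ("ty2", ("INT",)),
    "_ + _": ("ty3", ("INT",)),
    "f _": ("ty1", ("FUN", "ty4", "ty5")),
    '"Hello!"': ("ty4", ("STRING",)),
    "_ * 2": ("ty5", ("INT",)),
}


def first_failure(order):
    """Solve the constraints in the given order; return the origin that fails."""
    subst = {}
    for origin in order:
        if not unify(*CONSTRAINTS[origin], subst):
            return origin
    return None


orders = [
    ["function f(x)", "f _", "x", "_ + _", "_ * 2", '"Hello!"'],
    ["function f(x)", "_ + _", '"Hello!"', "_ * 2", "x", "f _"],
    ["f _", '"Hello!"', "_ * 2", "x", "function f(x)", "_ + _"],
]
for order in orders:
    print(first_failure(order))
```

Each order blames a different program fragment for the same underlying inconsistency.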
Conflict Sets
let
  function f(x) =
    x + 1
in
  (f "Hello!") * 2

Constraints and the program fragments they come from
ty1 == FUN(ty2, ty3)    [function f(x)]
ty2 == INT()            [x + _]
ty3 == INT()            [_ + _]
ty1 == FUN(ty4, ty5)    [f _]
ty4 == STRING()         ["Hello!"]
ty5 == INT()            [_ * 2]

Conflict set
ty1 == FUN(ty2, ty3)
ty2 == INT()
ty1 == FUN(ty4, ty5)
ty4 == STRING()

A conflict set is an inconsistent subset of the constraint set
Conflict Sets and Errors
Conflict Sets
- Sets of constraints that are inconsistent together
- Multiple conflict sets might appear in one constraint set
- Conflict sets can intersect
How to find conflict sets?
- Find minimal sets that are in conflict
‣ Combinatorial explosion (Min-SAT)
- Approximate conflict sets during constraint solving
‣ Conflict set may be too large
Where to report errors?
- Report the conflicting program slice (using all conflicting constraints)
- Report on a few constraints in the conflict set
‣ Remove constraints until inconsistency disappears
‣ Use heuristics to efficiently select constraints
There is not always a best place to report the error!
Constraint Graphs
Zhang, Myers. Toward General Diagnosis of Static Errors. POPL’14
let
  function f(x) =
    x + 1
in
  (f "Hello!") * 2

ty1 == FUN(ty2, ty3)
ty2 == INT()
ty3 == INT()
ty1 == FUN(ty4, ty5)
ty4 == STRING()
ty5 == INT()
[Figure: constraint graph over the nodes ty1, ty2, ty3, ty4, ty5, INT(), STRING(), FUN(ty2, ty3) and FUN(ty4, ty5); the edges at each FUN node are labeled 1 and 2]
- Consider paths between non-variable terms
- Trust edges involved in many correct paths
- Suspect edges involved in many wrong paths
- Restricts constraint language and solver implementation
Use approximated conflict sets
Loncaric et al. A practical framework for type inference error explanation. OOPSLA’16
1. ty1 == FUN(ty2, ty3)
2. ty2 == INT()
3. ty3 == INT()
4. ty1 == FUN(ty4, ty5)
5. ty4 == STRING()
6. ty5 == INT()
[Figure: union-find graph over ty1, ty2, ty3, ty4, ty5, INT() and FUN(ty2, ty3); each edge is labeled with the constraints that introduced it (1, 2, 3, and twice 1,4)]
FIND(ty4) = INT() because of {1,2,4}
constraint 5 gives conflict set {1,2,4,5}
- Track constraints during unification
- Conflict set may be an over-approximation
- Iteratively drop constraints until inconsistency is gone
- Solver is used as an oracle, no requirements on its implementation
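The step of iteratively dropping constraints while using the solver as an oracle can be sketched as a deletion-based minimization. This is an illustrative Python sketch, not the framework from the cited paper. The helper names, the term encoding, and the simple substitution-based oracle (no occurs check) are assumptions; the constraint numbers match the list above.

```python
def walk(t, subst):
    while isinstance(t, str) and t in subst:
        t = subst[t]
    return t


def unify(s, t, subst):
    s, t = walk(s, subst), walk(t, subst)
    if s == t:
        return True
    if isinstance(s, str):
        subst[s] = t
        return True
    if isinstance(t, str):
        subst[t] = s
        return True
    return (s[0] == t[0] and len(s) == len(t)
            and all(unify(a, b, subst) for a, b in zip(s[1:], t[1:])))


CONSTRAINTS = {
    1: ("ty1", ("FUN", "ty2", "ty3")),
    2: ("ty2", ("INT",)),
    3: ("ty3", ("INT",)),
    4: ("ty1", ("FUN", "ty4", "ty5")),
    5: ("ty4", ("STRING",)),
    6: ("ty5", ("INT",)),
}


def is_consistent(ids):
    """Oracle: can all constraints with these numbers be solved together?"""
    subst = {}
    return all(unify(*CONSTRAINTS[i], subst) for i in ids)


def minimize(conflict, oracle):
    """Drop every constraint whose removal keeps the set inconsistent."""
    core = list(conflict)
    for c in list(core):
        rest = [d for d in core if d != c]
        if not oracle(rest):
            core = rest  # still inconsistent without c, so c is not needed
    return core


print(minimize([1, 2, 3, 4, 5, 6], is_consistent))
```

Starting from the over-approximation of all six constraints, the loop calls the oracle once per constraint, which is why fast failure in the solver matters.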
Which constraint is to blame?
Error reporting from conflict sets
- Select constraints that are likely causes
- Fewer error messages are usually preferred over many
- Avoid combinatorial explosion by using heuristics
Language independent heuristics
- Select constraints shared between multiple conflict sets
- Select constraints coming from smaller program fragments
‣ Function definition type comes from the (larger) body expression
‣ Function use type comes from the (smaller) use expression
Language specific heuristics
- Select constraints based on weights assigned in the specification
‣ Assign more weight to constraints from definitions than to constraints from uses
- Select constraints with explicit error messages
‣ Avoids reporting on implicit equalities that arise during constraint generation
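As one possible reading of the first language independent heuristic, constraints can be ranked by how many conflict sets they appear in. The Python sketch below is only an illustration: the function name is hypothetical, and the second conflict set is invented example data rather than something taken from the slides.

```python
from collections import Counter


def rank_by_sharing(conflict_sets):
    """Order constraint ids by the number of conflict sets that contain them."""
    counts = Counter(c for conflict in conflict_sets for c in set(conflict))
    # Most shared first; ties are broken by constraint id for a stable order.
    return sorted(counts, key=lambda c: (-counts[c], c))


# {1, 2, 4, 5} is the conflict set from the running example;
# {4, 7, 8} is a made-up second conflict set for illustration only.
ranked = rank_by_sharing([{1, 2, 4, 5}, {4, 7, 8}])
print(ranked)
```

Reporting on the top-ranked constraint addresses several conflicts with a single message, in line with the preference for fewer error messages.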
More Aspects
Requirements on the solver
- Report (approximated) conflict sets
‣ Conflict sets should be over-approximations
‣ The closer to the minimal conflict set, the better
- Fast failure is important
‣ Solver is called repeatedly with smaller constraint sets
How to report errors?
- Reporting ‘<long function type> expected, but got <long but slightly different function type>’ can be improved
- Can we report that first argument is different, or that number of arguments is different?
Exercises
Conclusion
Summary
Unification with Union-Find
- Represent unifier as a graph
- Testing term equality becomes testing node identity
- Really fast with path compression and tree balancing
Error Reporting
- Selecting constraints to report errors on
- Use conflict sets to find cause of inconsistencies
- Report on
‣ all conflicting constraints (program slice)
‣ subset of constraints that removes inconsistency
- Use heuristics to select constraints to blame

Declare Your Language: Constraint Resolution 2

  • 1.
    Hendrik van Antwerpen IN4303Compiler Construction TU Delft September 2017 Declare Your Language Chapter 8: Constraint Resolution II
  • 2.
  • 3.
    Unification 3 Baader, F., andW. Snyder
 "Unification Theory”
 Ch. 8 of Handbook of Automated Deduction
 Springer Verlag, Berlin (2001) http://www.cs.bu.edu/~snyder/publications/UnifChapter.pdf
  • 4.
  • 5.
    Space complexity - Exponential -Representation of unifier Time complexity - Exponential - Recursive calls on terms Solution - Union-Find algorithm - Complexity growth can be considered constant
 Complexity of Unification (recap) 5 h(a1 , …,an , f(b0,b0), …, f(bn-1,bn-1), an) == h(f(a0,a0), …,f(an-1,an-1), b1, …, bn-1 , bn) a1 -> f(a0,a0) a2 -> f(f(a0,a0), f(a0,a0)) ai -> … 2i+1-1 subterms … b1 -> f(a0,a0) b2 -> f(f(a0,a0), f(a0,a0)) bi -> … 2i+1-1 subterms … a1 -> f(a0,a0) a2 -> f(a1,a1) ai -> … 3 subterms … b1 -> f(a0,a0) b2 -> f(a1,a1) bi -> … 3 subterms … fully applied triangular
  • 6.
  • 7.
  • 8.
    Set Representatives 6 a ==b
 c == a a b c
  • 9.
    Set Representatives 6 a ==b
 c == a u == w
 v == u
 x == v a b c
  • 10.
    Set Representatives 6 a ==b
 c == a u == w
 v == u
 x == v a b c u w v x
  • 11.
    Set Representatives 6 a ==b
 c == a u == w
 v == u
 x == v a b c u w v x representative
  • 12.
    Set Representatives 6 a ==b
 c == a u == w
 v == u
 x == v a b c u w v x representative
  • 13.
    Set Representatives 6 a ==b
 c == a u == w
 v == u
 x == v a b c u w v x
  • 14.
    Set Representatives 6 a ==b
 c == a u == w
 v == u
 x == v x == c a b c u w v x
  • 15.
    Set Representatives 6 a ==b
 c == a u == w
 v == u
 x == v x == c a b c u w v x
  • 16.
    Set Representatives 6 a ==b
 c == a u == w
 v == u
 x == v x == c a b c u w v x
  • 17.
    Set Representatives 6 a ==b
 c == a u == w
 v == u
 x == v x == c a b c u w v x
  • 18.
    Set Representatives 6 FIND(a):
 b :=rep(a)
 if b == a:
 return a
 else
 return FIND(b)
 
 
 UNION(a1,a2):
 b1 := FIND(a1)
 b2 := FIND(a2)
 LINK(b1,b2)
 LINK(a1,a2):
 rep(a1) := a2
 
 
 
 
 a == b
 c == a u == w
 v == u
 x == v x == c a b c u w v x
  • 19.
  • 20.
    Path Compression 7 … x ==b x == c x == w x == v a b c u w v x
  • 21.
    Path Compression 7 … x ==b x == c x == w x == v a b c u w v x
  • 22.
    Path Compression 7 … x ==b x == c x == w x == v a b c u w v x
  • 23.
    Path Compression 7 … x ==b x == c x == w x == v a b c u w v x
  • 24.
    Path Compression 7 … x ==b x == c x == w x == v a b c u w v x
  • 25.
    Path Compression 7 FIND(a): b :=rep(a) if b == a: return a else b := FIND(b) rep(a) := b return b UNION(a1,a2):
 b1 := FIND(a1) b2 := FIND(a2) LINK(b1,b2)
 LINK(a1,a2):
 rep(a1) := a2 … x == b x == c x == w x == v a b c u w v x
  • 26.
  • 27.
  • 28.
  • 29.
    Tree Balancing 8 … x ==c a b c u w v x 3 steps
  • 30.
  • 31.
    Tree Balancing 8 … x ==c a b c u w v x 2 steps
  • 32.
    Tree Balancing 8 … x ==c a b c u w v x ?
  • 33.
    Tree Balancing 8 … x ==c a b c u w v x 1 111 ?
  • 34.
    Tree Balancing 8 … x ==c a b c u w v x 1 2111 ?
  • 35.
    Tree Balancing 8 … x ==c a b c u w v x 1 21 4 11 3 ?
  • 36.
    Tree Balancing 8 … x ==c a b c u w v x 1 21 4 11 3
  • 37.
    Tree Balancing 8 FIND(a): b :=rep(a) if b == a: return a else b := FIND(b) rep(a) := b return b UNION(a1,a2): b1 := FIND(a1) b2 := FIND(a2) LINK(b1,b2)
 LINK(a1,a2):
 if size(a2) > size(a1): rep(a1) := a2 size(a2) += size(a1) else: rep(a2) := a1 size(a1) += size(a2) … x == c a b c u w v x 1 21 4 11 3
  • 53.
    The Complex Case

    h(a_1, …, a_n, f(b_0,b_0), …, f(b_{n-1},b_{n-1}), a_n) ==
    h(f(a_0,a_0), …, f(a_{n-1},a_{n-1}), b_1, …, b_{n-1}, b_n)

    [Figure: graph with variable nodes a_0, a_1, …, a_{n-1}, a_n and b_0, b_1, …, b_{n-1}, b_n, and term nodes f(a_0,a_0), …, f(a_{n-2},a_{n-2}), f(a_{n-1},a_{n-1}) and f(b_0,b_0), …, f(b_{n-2},b_{n-2}), f(b_{n-1},b_{n-1})]

    Unification steps:
      a_n == b_n
      f(a_{n-1},a_{n-1}) == f(b_{n-1},b_{n-1})
      a_{n-1} == b_{n-1}
      a_{n-1} == b_{n-1}
      f(a_{n-2},a_{n-2}) == f(b_{n-2},b_{n-2})
      ⋮
      a_1 == b_1
      a_1 == b_1
      f(a_0,a_0) == f(b_0,b_0)
      a_0 == b_0
      a_0 == b_0

    How about occurrence checks? Postpone!
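To see why this case becomes cheap, the sketch below unifies terms over a union-find structure and counts the equations it processes. Several things are assumptions of the example: variables are strings, constructor terms are tuples `(name, arg, ...)`, the class `Unifier` and the helper `complex_case` are hypothetical, and the right-hand side is built as b_1 … b_n followed by a final b_n so that both sides have 2n+1 arguments. No occurrence check is performed, and tree balancing is left out for brevity.

```python
def is_var(t):
    return isinstance(t, str)             # assumption: variables are strings

class Unifier:
    def __init__(self):
        self.rep = {}     # variable -> parent variable (union-find links)
        self.term = {}    # root variable -> constructor term bound to its class
        self.steps = 0    # number of equations processed

    def find(self, v):
        self.rep.setdefault(v, v)
        root = v
        while self.rep[root] != root:
            root = self.rep[root]
        while self.rep[v] != root:        # path compression
            parent = self.rep[v]
            self.rep[v] = root
            v = parent
        return root

    def unify(self, s, t):
        """Return True if s and t unify (occurrence check postponed)."""
        work = [(s, t)]
        while work:
            s, t = work.pop()
            self.steps += 1
            if is_var(s):
                s = self.find(s)
            if is_var(t):
                t = self.find(t)
            if is_var(s) and is_var(t):
                if s == t:
                    continue              # same node: nothing to do
                self.rep[s] = t           # union of the two classes
                bound = self.term.pop(s, None)
                if bound is not None:
                    if t in self.term:
                        work.append((bound, self.term[t]))
                    else:
                        self.term[t] = bound
            elif is_var(s) or is_var(t):
                var, other = (s, t) if is_var(s) else (t, s)
                if var in self.term:
                    work.append((self.term[var], other))
                else:
                    self.term[var] = other
            elif s[0] != t[0] or len(s) != len(t):
                return False              # constructor clash
            else:
                work.extend(zip(s[1:], t[1:]))
        return True

def complex_case(n):
    a = [f"a{i}" for i in range(n + 1)]
    b = [f"b{i}" for i in range(n + 1)]
    fa = [("f", a[i], a[i]) for i in range(n)]
    fb = [("f", b[i], b[i]) for i in range(n)]
    return ("h", *a[1:], *fb, a[n]), ("h", *fa, *b[1:], b[n])

n = 50
u = Unifier()
solved = u.unify(*complex_case(n))        # True, after a number of steps linear in n
```

Every second `a_i == b_i` equation is answered by a node-identity test, which is why the work stays proportional to n instead of doubling at each level.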
  • 54.
    Union-Find

    Main idea
    - Represent the unifier as a graph
    - One variable represents each equivalence class
    - Replace substitution by union & find operations
    - Testing equality becomes testing node identity

    Optimizations
    - Path compression makes recurring lookups fast
    - Tree balancing keeps paths short

    Complexity
    - Linear in space and almost linear in time (technically inverse Ackermann)
    - Easy to extract a triangular unifier from the graph
    - Postpone occurrence checks to prevent traversing (potentially) large terms

    Martelli, Montanari. An Efficient Unification Algorithm. TOPLAS, 1982
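A postponed occurrence check can be run once over the triangular unifier instead of at every binding. The sketch below assumes the unifier has already been extracted as a dictionary from variable names to terms, with terms written as tuples `(name, arg, ...)`; the function name `has_cycle` is hypothetical. It is a depth-first search for a cycle in the "variable depends on variable" graph, so each binding is inspected once.

```python
def vars_of(t):
    """Yield the variables that occur in a term as written (no substitution applied)."""
    if isinstance(t, str):
        yield t
    else:
        for arg in t[1:]:
            yield from vars_of(arg)

def has_cycle(subst):
    """Return True if some variable is bound, directly or indirectly, to a term containing it."""
    state = {}                            # variable -> "active" or "done"

    def visit(v):
        state[v] = "active"
        if v in subst:
            for w in vars_of(subst[v]):
                if state.get(w) == "active":
                    return True           # back edge: v depends on itself
                if w not in state and visit(w):
                    return True
        state[v] = "done"
        return False

    return any(v not in state and visit(v) for v in subst)

triangular = {"a2": ("f", "a1", "a1"), "a1": ("f", "a0", "a0")}
direct = {"x": ("f", "x")}
indirect = {"x": ("f", "y"), "y": ("g", "x")}
```

Here `has_cycle(triangular)` is false, while both `direct` and `indirect` are rejected.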
  • 56.
    Reporting Type Errors

    Type errors
    - Types are supposed to prevent us from making mistakes
    - Error messages are supposed to help us fix mistakes
    - This does not always work out so well
      ‣ Wrong error location
      ‣ Unclear error message
      ‣ Not always clear where inferred types come from
    - Users expect concise, informative messages, at the right place

    Type errors = Unsatisfiable constraints
    - Constraint satisfaction is a binary property
    - Language designers want to influence error messages
  • 70.
    Error Example

    let
      function f(x) =
        x + 1
    in
      (f "Hello!") * 2

    Constraints and their origins:
      ty1 == FUN(ty2, ty3)    [function f(x)]
      ty2 == INT()            [x]
      ty3 == INT()            [_ + _]
      ty1 == FUN(ty4, ty5)    [f _]
      ty4 == STRING()         ["Hello!"]
      ty5 == INT()            [_ * 2]

    Solving order 1:
      ty1 == FUN(ty2, ty3)    ✔
      ty1 == FUN(ty4, ty5)    ✔
      ty2 == INT()            ✔
      ty3 == INT()            ✔
      ty5 == INT()            ✔
      ty4 == STRING()         ❌

    Solving order 2:
      ty1 == FUN(ty2, ty3)    ✔
      ty3 == INT()            ✔
      ty4 == STRING()         ✔
      ty5 == INT()            ✔
      ty2 == INT()            ✔
      ty1 == FUN(ty4, ty5)    ❌

    Solving order 3:
      ty1 == FUN(ty4, ty5)    ✔
      ty4 == STRING()         ✔
      ty5 == INT()            ✔
      ty2 == INT()            ✔
      ty1 == FUN(ty2, ty3)    ❌
      ty3 == INT()

    Order of constraint solving is relevant!
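The order dependence can be reproduced with a few lines of code. The sketch below numbers the six constraints 1 to 6 in the order they are generated and uses a simple substitution-based unifier rather than union-find, purely to keep the example short; the term encoding (strings for type variables, tuples for constructors) and the function `first_failure` are assumptions of the example.

```python
def walk(t, subst):
    """Resolve a variable through the substitution."""
    while isinstance(t, str) and t in subst:
        t = subst[t]
    return t

def unify(s, t, subst):
    s, t = walk(s, subst), walk(t, subst)
    if s == t:
        return True
    if isinstance(s, str):
        subst[s] = t
        return True
    if isinstance(t, str):
        subst[t] = s
        return True
    if s[0] != t[0] or len(s) != len(t):
        return False                      # constructor clash, e.g. INT vs STRING
    return all(unify(x, y, subst) for x, y in zip(s[1:], t[1:]))

INT, STRING = ("INT",), ("STRING",)
constraints = {
    1: ("ty1", ("FUN", "ty2", "ty3")),    # function f(x)
    2: ("ty2", INT),                      # x
    3: ("ty3", INT),                      # _ + _
    4: ("ty1", ("FUN", "ty4", "ty5")),    # f _
    5: ("ty4", STRING),                   # "Hello!"
    6: ("ty5", INT),                      # _ * 2
}

def first_failure(order):
    """Solve the constraints in the given order; return the id that fails, or None."""
    subst = {}
    for cid in order:
        if not unify(*constraints[cid], subst):
            return cid
    return None

blamed = [first_failure(order) for order in
          ([1, 4, 2, 3, 6, 5], [1, 3, 5, 6, 2, 4], [4, 5, 6, 2, 1, 3])]
# blamed == [5, 4, 1]: the string literal, the call, and the function definition
```

The three orders are the ones shown on the slide; each reports the error at a different program location although the constraint set is the same.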
  • 78.
    Conflict Sets

    let
      function f(x) =
        x + 1
    in
      (f "Hello!") * 2

    Constraints and their origins:
      ty1 == FUN(ty2, ty3)    [function f(x)]
      ty2 == INT()            [x + _]
      ty3 == INT()            [_ + _]
      ty1 == FUN(ty4, ty5)    [f _]
      ty4 == STRING()         ["Hello!"]
      ty5 == INT()            [_ * 2]

    Conflict set:
      ty1 == FUN(ty2, ty3)
      ty2 == INT()
      ty1 == FUN(ty4, ty5)
      ty4 == STRING()

    A conflict set is an inconsistent subset of the constraint set
  • 79.
    Conflict Sets and Errors

    Conflict sets
    - Sets of constraints that are inconsistent together
    - Multiple conflict sets might appear in one constraint set
    - Conflict sets can intersect

    How to find conflict sets?
    - Find minimal sets that are in conflict
      ‣ Combinatorial explosion (Min-SAT)
    - Approximate conflict sets during constraint solving
      ‣ Conflict set may be too large

    Where to report errors?
    - Report the conflicting program slice (using all conflicting constraints)
    - Report on a few constraints in the conflict set
      ‣ Remove constraints until inconsistency disappears
      ‣ Use heuristics to efficiently select constraints

    There is not always a best place to report the error!

    let
      function f(x) =
        x + 1
    in
      (f "Hello!") * 2
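Finding minimal conflict sets by enumeration shows where the combinatorial explosion comes from: in the worst case every one of the 2^n subsets is handed to the solver. The sketch below is a brute-force search under the same illustrative encoding as before (constraints numbered 1 to 6, strings for type variables, tuples for constructors); the function names are hypothetical.

```python
from itertools import combinations

def walk(t, subst):
    while isinstance(t, str) and t in subst:
        t = subst[t]
    return t

def unify(s, t, subst):
    s, t = walk(s, subst), walk(t, subst)
    if s == t:
        return True
    if isinstance(s, str):
        subst[s] = t
        return True
    if isinstance(t, str):
        subst[t] = s
        return True
    if s[0] != t[0] or len(s) != len(t):
        return False
    return all(unify(x, y, subst) for x, y in zip(s[1:], t[1:]))

def consistent(constraints, ids):
    subst = {}
    return all(unify(*constraints[cid], subst) for cid in ids)

def minimal_conflict_sets(constraints):
    """Enumerate subsets by increasing size; keep inconsistent ones with no smaller conflict inside."""
    found = []
    ids = sorted(constraints)
    for size in range(1, len(ids) + 1):
        for subset in combinations(ids, size):
            chosen = set(subset)
            if any(known <= chosen for known in found):
                continue                  # not minimal: contains a known conflict set
            if not consistent(constraints, subset):
                found.append(chosen)
    return found

INT, STRING = ("INT",), ("STRING",)
constraints = {
    1: ("ty1", ("FUN", "ty2", "ty3")),
    2: ("ty2", INT),
    3: ("ty3", INT),
    4: ("ty1", ("FUN", "ty4", "ty5")),
    5: ("ty4", STRING),
    6: ("ty5", INT),
}
conflicts = minimal_conflict_sets(constraints)   # [{1, 2, 4, 5}]
```

For six constraints this is instantaneous, but the number of subsets doubles with every added constraint, which motivates approximating conflict sets during solving.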
  • 86.
    Constraint Graphs

    Zhang, Myers. Toward General Diagnosis of Static Errors. POPL'14

    ty1 == FUN(ty2, ty3)
    ty2 == INT()
    ty3 == INT()
    ty1 == FUN(ty4, ty5)
    ty4 == STRING()
    ty5 == INT()

    [Figure: constraint graph with variable nodes ty1, ty2, ty3, ty4, ty5 and term nodes INT(), INT(), INT(), STRING(), FUN(ty2, ty3), FUN(ty4, ty5); the edges from each FUN term are labeled 1 and 2]

    - Consider paths between non-variable terms
    - Trust edges involved in many correct paths
    - Suspect edges involved in many wrong paths
    - Restricts constraint language and solver implementation
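The trust and suspect intuition can be illustrated with a toy score. This is not the ranking used in the cited paper; it only counts, for each constraint, how many correct and how many wrong paths it lies on. The three paths are enumerated by hand for the running example and the constraint numbering 1 to 6 follows the order in which the constraints are listed, both of which are assumptions of this sketch.

```python
from collections import Counter

# Each path: (constraints used along the path, whether its end points are compatible).
paths = [
    ({2, 1, 4, 5}, False),   # INT() - ty2 - ty4 - STRING(): end points clash
    ({3, 1, 4, 6}, True),    # INT() - ty3 - ty5 - INT(): end points agree
    ({1, 4}, True),          # FUN(ty2, ty3) - ty1 - FUN(ty4, ty5): same constructor
]

def rank_suspects(paths):
    """Order constraints on wrong paths: fewest correct paths first, then most wrong paths."""
    wrong, correct = Counter(), Counter()
    for used, ok in paths:
        (correct if ok else wrong).update(used)
    return sorted(wrong, key=lambda c: (correct[c], -wrong[c], c))

ranking = rank_suspects(paths)   # [2, 5, 1, 4]
```

Constraints 1 and 4 also appear on correct paths and are therefore trusted more than constraints 2 and 5, which occur only on the wrong path.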
  • 100.
    Use approximated conflict sets

    Loncaric et al. A practical framework for type inference error explanation. OOPSLA'16

    1. ty1 == FUN(ty2, ty3)
    2. ty2 == INT()
    3. ty3 == INT()
    4. ty1 == FUN(ty4, ty5)
    5. ty4 == STRING()
    6. ty5 == INT()

    [Figure: unification graph over ty1, ty2, ty3, ty4, ty5 with term nodes FUN(ty2, ty3), INT(), INT(); edges are labeled with the constraints that created them: 1, 2, 3, {1,4}, {1,4}]

    FIND(ty4) = INT() because of {1,2,4}
    Constraint 5 gives conflict set {1,2,4,5}

    - Track constraints during unification
    - Conflict set may be an over-approximation
    - Iteratively drop constraints until inconsistency is gone
    - Solver is used as an oracle, no requirements on its implementation
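Both ideas on this slide, tracking constraints during unification and dropping constraints with the solver as an oracle, fit in one short sketch. It uses a substitution whose bindings carry the set of constraint numbers that justify them. The encoding and the names `solve` and `shrink` are assumptions of the example, not the interface of the cited framework.

```python
def walk(t, subst):
    """Resolve a variable, collecting the ids of the constraints that bound it."""
    why = set()
    while isinstance(t, str) and t in subst:
        t, used = subst[t]
        why |= used
    return t, why

def unify(s, t, why, subst):
    """Return None on success, else the set of constraint ids behind the clash."""
    s, why_s = walk(s, subst)
    t, why_t = walk(t, subst)
    why = why | why_s | why_t
    if s == t:
        return None
    if isinstance(s, str):
        subst[s] = (t, why)
        return None
    if isinstance(t, str):
        subst[t] = (s, why)
        return None
    if s[0] != t[0] or len(s) != len(t):
        return why
    for x, y in zip(s[1:], t[1:]):
        conflict = unify(x, y, why, subst)
        if conflict is not None:
            return conflict
    return None

def solve(constraints, ids):
    """Solve the selected constraints in order; return None or an approximated conflict set."""
    subst = {}
    for cid in ids:
        conflict = unify(*constraints[cid], {cid}, subst)
        if conflict is not None:
            return conflict
    return None

def shrink(constraints, conflict):
    """Drop constraints one at a time while the remainder stays inconsistent."""
    kept = set(conflict)
    for cid in sorted(conflict):
        if solve(constraints, sorted(kept - {cid})) is not None:
            kept.discard(cid)             # still inconsistent without cid
    return kept

INT, STRING = ("INT",), ("STRING",)
constraints = {
    1: ("ty1", ("FUN", "ty2", "ty3")),
    2: ("ty2", INT),
    3: ("ty3", INT),
    4: ("ty1", ("FUN", "ty4", "ty5")),
    5: ("ty4", STRING),
    6: ("ty5", INT),
}
tracked = solve(constraints, [1, 2, 3, 4, 5, 6])       # {1, 2, 4, 5}
minimal = shrink(constraints, {1, 2, 3, 4, 5, 6})      # {1, 2, 4, 5}
```

In this example the tracked set is already minimal. `shrink` only needs a yes or no answer from `solve`, so it would work with any solver, at the cost of one solver call per constraint.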
  • 101.
    Which constraint is to blame?

    Error reporting from conflict sets
    - Select constraints that are likely causes
    - Few error messages are usually preferred over many
    - Avoid combinatorial explosion by using heuristics

    Language independent heuristics
    - Select constraints shared between multiple conflict sets
    - Select constraints coming from smaller program fragments
      ‣ Function definition type comes from the (larger) body expression
      ‣ Function use type comes from the (smaller) use expression

    Language specific heuristics
    - Select constraint based on weights assigned in specification
      ‣ Assign more weight to constraints from definitions than to constraints from use
    - Select constraints with explicit error messages
      ‣ Avoids reporting on implicit equalities that arise during constraint generation
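Two of these heuristics can be combined in a few lines. The sketch assumes that a higher weight means a constraint is trusted more, so among the most widely shared constraints the one with the lowest weight is blamed; that reading of the weights, the weight values, and the second conflict set `{5, 7}` are hypothetical and serve only to exercise the function.

```python
from collections import Counter

def blame(conflict_sets, weight):
    """Pick one constraint: shared by the most conflict sets, then lowest weight, then lowest id."""
    shared = Counter(c for conflict in conflict_sets for c in conflict)
    return min(shared, key=lambda c: (-shared[c], weight[c], c))

# Hypothetical weights: constraints 1-3 come from the definition of f, 4-7 from uses.
weight = {1: 2, 2: 2, 3: 2, 4: 1, 5: 1, 6: 1, 7: 1}

single = blame([{1, 2, 4, 5}], weight)            # 4: a use-site constraint
overlap = blame([{1, 2, 4, 5}, {5, 7}], weight)   # 5: shared by both conflict sets
```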
  • 102.
    More Aspects

    Requirements on the solver
    - Report (approximated) conflict sets
      ‣ Conflict sets should be over-approximations
      ‣ The closer to the minimal conflict set, the better
    - Fast failure is important
      ‣ Solver is called repeatedly with smaller constraint sets

    How to report errors?
    - Reporting '<long function type> expected, but got <long but slightly different function type>' can be improved
    - Can we report that first argument is different, or that number of arguments is different?
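One way to answer the last question is to compare the expected and the actual type structurally and report only the first place where they differ. This is a sketch under the assumption that types are tuples `(constructor, component, ...)`; the function names and the message wording are hypothetical.

```python
def show(t):
    """Render a type term, e.g. ("FUN", ("INT",), ("INT",)) as FUN(INT(), INT())."""
    if isinstance(t, str):
        return t
    return f"{t[0]}({', '.join(show(c) for c in t[1:])})"

def first_difference(expected, actual, path=()):
    """Return (path, message) for the first structural difference, or None if equal.

    path is a tuple of 1-based component positions leading to the difference.
    """
    if expected == actual:
        return None
    both_terms = isinstance(expected, tuple) and isinstance(actual, tuple)
    if both_terms and expected[0] == actual[0]:
        if len(expected) != len(actual):
            return path, f"expected {len(expected) - 1} components, got {len(actual) - 1}"
        for i, (e, a) in enumerate(zip(expected[1:], actual[1:]), start=1):
            diff = first_difference(e, a, path + (i,))
            if diff is not None:
                return diff
    return path, f"expected {show(expected)}, got {show(actual)}"

INT, STRING = ("INT",), ("STRING",)
nested = first_difference(("FUN", INT, ("FUN", INT, INT)),
                          ("FUN", INT, ("FUN", STRING, INT)))
arity = first_difference(("FUN", INT, INT, INT), ("FUN", INT, INT))
```

Here `nested` is `((2, 1), "expected INT(), got STRING()")`, pointing at the first component of the second component, and `arity` reports a component-count mismatch at the top level instead of printing both full types.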
  • 105.
    Summary

    Unification with Union-Find
    - Represent unifier as a graph
    - Testing term equality becomes testing node identity
    - Really fast with path compression and tree balancing

    Error Reporting
    - Selecting constraints to report errors on
    - Use conflict sets to find cause of inconsistencies
    - Report on
      ‣ all conflicting constraints (program slice)
      ‣ subset of constraints that removes inconsistency
    - Use heuristics to select constraints to blame