2. Available Expression
Aim
An expression e is available at program point p if e has
already been computed, and its operands have not been modified
since, on all paths to p.
Why do we need AE analysis?
Before:
x = a + b;
y = a * b;
while y > a + b do
    a = a + 1;
    x = a + b;
end
After (a + b is available at the loop test, so x is reused):
x = a + b;
y = a * b;
while y > x do
    a = a + 1;
    x = a + b;
end
Testing it at run time is not efficient.
Miller Lee Dataflow Analysis
3. Let’s analyze it
1 List all expressions: a + b, a * b, a + 1.
2 How long is the life span of an expression?
When is an expression generated?
When is an expression killed?
3 How to automate the analysis?
7. Here come the problems
Why does it work?
How can we prove that this algorithm is correct?
Does this algorithm always terminate?
"It works, so it is correct" is not an acceptable argument.
There are two things to look at:
1 The relation among the data-flow values
2 The characteristics of the transfer functions
9. Partial Order
Definition
Given a pair (P, R), where
P is a set of elements and
R is a relation over P (R ⊆ P × P),
(P, R) is a partial order if R is reflexive, anti-symmetric, and transitive.
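As a minimal sketch, the three properties can be checked by brute force on a small finite set. The function name is_partial_order and the two sample relations are assumptions for illustration only.

```python
from itertools import product

def is_partial_order(P, R):
    """Check that R (a set of pairs over P) is reflexive,
    anti-symmetric, and transitive."""
    reflexive = all((x, x) in R for x in P)
    antisymmetric = all(x == y for (x, y) in R if (y, x) in R)
    transitive = all((x, z) in R
                     for (x, y) in R for (y2, z) in R if y == y2)
    return reflexive and antisymmetric and transitive

# P = all subsets of {a, b}
P = [frozenset(s) for s in ([], ["a"], ["b"], ["a", "b"])]
subset = {(x, y) for x, y in product(P, P) if x <= y}        # x ⊆ y
strict_subset = {(x, y) for x, y in product(P, P) if x < y}  # x ⊂ y

print(is_partial_order(P, subset))         # ⊆ satisfies all three properties
print(is_partial_order(P, strict_subset))  # ⊂ is not reflexive
```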
11. Least Upper Bound
Given a partial order (D, ≤) and a subset S ⊆ D:
x is an upper bound of S if x ∈ D and ∀y ∈ S : y ≤ x
g ∈ D is the least upper bound of x, y ∈ D if
1 x ≤ g
2 y ≤ g
3 If z is any element such that x ≤ z and y ≤ z, then g ≤ z
Join Operation ∨
x ∨ y is the least upper bound of {x, y}
12. Greatest Lower Bound
Given a partial order (D, ≤) and a subset S ⊆ D:
x is a lower bound of S if x ∈ D and ∀y ∈ S : x ≤ y
g ∈ D is the greatest lower bound of x, y ∈ D if
1 g ≤ x
2 g ≤ y
3 If z is any element such that z ≤ x and z ≤ y, then z ≤ g
Meet Operation ∧
x ∧ y is the greatest lower bound of {x, y}
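The least-upper-bound and greatest-lower-bound definitions can be turned directly into a brute-force search over a finite poset. This is only a sketch: the names powerset, lub, and glb, and the divisibility example, are assumptions added for illustration.

```python
from itertools import combinations

def powerset(items):
    """All subsets of `items` as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def lub(D, leq, S):
    """Least upper bound of S in the finite poset (D, leq), or None."""
    uppers = [x for x in D if all(leq(y, x) for y in S)]
    least = [g for g in uppers if all(leq(g, z) for z in uppers)]
    return least[0] if least else None

def glb(D, leq, S):
    """Greatest lower bound of S in the finite poset (D, leq), or None."""
    lowers = [x for x in D if all(leq(x, y) for y in S)]
    greatest = [g for g in lowers if all(leq(z, g) for z in lowers)]
    return greatest[0] if greatest else None

# Powerset of {a, b, c} ordered by ⊆: join is union, meet is intersection.
D = powerset("abc")
subset = lambda x, y: x <= y
x, y = frozenset("a"), frozenset("b")
print(sorted(lub(D, subset, [x, y])), sorted(glb(D, subset, [x, y])))

# Divisors of 12 ordered by divisibility: join is lcm, meet is gcd.
divisors = [1, 2, 3, 4, 6, 12]
divides = lambda m, n: n % m == 0
print(lub(divisors, divides, [4, 6]), glb(divisors, divides, [4, 6]))
```

When no upper (or lower) bound exists in D, the functions return None, which is how a poset can fail to be a lattice.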
15. Lattice
Definition
A partial order (P, ≤) is a lattice if every two elements of P
have a unique least upper bound and greatest lower bound.
The meet operator is:
1 idempotent: x ∧ x = x
2 commutative: x ∧ y = y ∧ x
3 associative: x ∧ (y ∧ z) = (x ∧ y) ∧ z
Special Elements
top element ⊤: for all x ∈ P, ⊤ ∧ x = x.
bottom element ⊥: for all x ∈ P, ⊥ ∧ x = ⊥.
Semilattice
A join semi-lattice (meet semi-lattice) has only the join (meet)
operator defined.
16. Examples
Ordering ⊆, join ∨ = ∪
⊤ = {a, b, c}
⊥ = {}
For any x ∈ P({a, b, c}),
{} ∪ x = x,
{a, b, c} ∪ x = {a, b, c}.
Figure: P({a,b,c})
17. Examples
(2^S, ⊆) forms a lattice for any set S.
2^S is the powerset of S, the set of all subsets of S.
If (S, ≤) is a semilattice, so is (S, ≥),
i.e., we can "flip" the lattice.
Lattice for constant propagation
18. Algorithm for a forward data-flow problem
OUT[ENTRY] = v_ENTRY
for each basic block B other than ENTRY do
    OUT[B] = ⊤
end
while changes to any OUT occur do
    for each basic block B other than ENTRY do
        IN[B] = ∧_{S a predecessor of B} OUT[S];
        OUT[B] = f_B(IN[B]);
    end
end
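The iterative algorithm above might be sketched in Python as follows. The function solve_forward, its parameter names, and the four-block diamond example (with ∩ as the meet and the full expression set as ⊤) are assumptions chosen for illustration, not the author's implementation.

```python
def solve_forward(blocks, preds, transfer, meet, top, v_entry, entry="ENTRY"):
    """Round-robin iteration for a forward data-flow problem."""
    out = {b: top for b in blocks}      # OUT[B] = ⊤
    out[entry] = v_entry                # OUT[ENTRY] = v_ENTRY
    in_ = {}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b == entry:
                continue
            in_b = top                  # ⊤ ∧ x = x, so ⊤ is a neutral start
            for p in preds[b]:
                in_b = meet(in_b, out[p])
            in_[b] = in_b
            new_out = transfer[b](in_b)
            if new_out != out[b]:
                out[b] = new_out
                changed = True
    return in_, out

# Hypothetical diamond: ENTRY -> B1 -> {B2, B3} -> B4
U = frozenset({"a+b", "c+d"})
blocks = ["ENTRY", "B1", "B2", "B3", "B4"]
preds = {"B1": ["ENTRY"], "B2": ["B1"], "B3": ["B1"], "B4": ["B2", "B3"]}
transfer = {
    "B1": lambda x: x | {"a+b"},   # B1 computes a+b
    "B2": lambda x: x | {"c+d"},   # B2 computes c+d
    "B3": lambda x: x,             # B3 computes nothing
    "B4": lambda x: x,
}
in_sets, out_sets = solve_forward(
    blocks, preds, transfer,
    meet=lambda x, y: x & y, top=U, v_entry=frozenset())
print(sorted(in_sets["B4"]))  # only a+b is computed on both branches
```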
19. The Dataflow Equation of Available Expression
Let s be a statement
succs(s) = { immediate successor stmts of s }
preds(s) = { immediate predecessor stmts of s }
In(s) = expressions available at the program point just before executing s
Out(s) = expressions available at the program point just after executing s
In(s) = ∩_{s' ∈ preds(s)} Out(s')
Out(s) = Gen(s) ∪ (In(s) − Kill(s))
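As a worked sketch, the two equations can be iterated to a fixed point on the loop program from the Available Expression slide. The statement labels s1 to s5, the hand-written Gen/Kill tables, and the function name available_expressions are assumptions for this illustration; s3 stands for the loop test `y > a + b`.

```python
U = {"a+b", "a*b", "a+1"}
stmts = ["s1", "s2", "s3", "s4", "s5"]
# s1: x = a + b   s2: y = a * b   s3: while y > a + b
# s4: a = a + 1   s5: x = a + b   (s5 loops back to s3)
preds = {"s1": [], "s2": ["s1"], "s3": ["s2", "s5"], "s4": ["s3"], "s5": ["s4"]}
gen = {"s1": {"a+b"}, "s2": {"a*b"}, "s3": {"a+b"}, "s4": set(), "s5": {"a+b"}}
kill = {"s1": set(), "s2": set(), "s3": set(), "s4": set(U), "s5": set()}

def available_expressions(stmts, preds, gen, kill, universe):
    """Iterate In/Out until nothing changes (optimistic start: Out = U)."""
    out = {s: set(universe) for s in stmts}
    in_ = {s: set() for s in stmts}
    changed = True
    while changed:
        changed = False
        for s in stmts:
            if preds[s]:
                # In(s) = intersection of Out over all predecessors
                in_[s] = set.intersection(*(out[p] for p in preds[s]))
            else:
                in_[s] = set()          # nothing is available at entry
            new_out = gen[s] | (in_[s] - kill[s])
            if new_out != out[s]:
                out[s], changed = new_out, True
    return in_, out

in_sets, out_sets = available_expressions(stmts, preds, gen, kill, U)
for s in stmts:
    print(s, "In =", sorted(in_sets[s]), "Out =", sorted(out_sets[s]))
```

In this sketch a+b ends up in In(s3), which is what justifies replacing the loop test with `y > x`, while a*b does not, because the assignment to a kills it inside the loop.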
20. Monotone Functions
Definition
If (D, ⊆) and (D', ⊆') are two posets and F : D → D' is a
function, then F is called monotone if and only if
F(x) ⊆' F(y) for any x, y ∈ D with x ⊆ y.
Observation
The functions for computing In(s) and Out(s) are monotonic:
In(s) = ∩_{s' ∈ preds(s)} Out(s')
Out(s) = Gen(s) ∪ (In(s) − Kill(s))
Extensivity
People sometimes think f is monotonic if x ≤ f(x). That is a
different property, called extensivity.
21. Examples
Monotonic functions
x → {}
x → x ∪ {a}
x → x − {a}
Not monotone
x → {a} − x
Extensive
x → x ∪ {a}
Not extensive
x → {a} − x (for x = {a} the result {} does not contain x)
Figure: P({a,b,c})
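Because P({a,b,c}) has only eight elements, monotonicity and extensivity of the example functions can be checked exhaustively. The helper names below are assumptions for this sketch.

```python
from itertools import combinations

def powerset(items):
    """All subsets of `items` as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

D = powerset("abc")
A = frozenset("a")

def is_monotone(f):
    """x ⊆ y implies f(x) ⊆ f(y), checked for every pair in D."""
    return all(f(x) <= f(y) for x in D for y in D if x <= y)

def is_extensive(f):
    """x ⊆ f(x), checked for every x in D."""
    return all(x <= f(x) for x in D)

examples = {
    "x -> {}":      lambda x: frozenset(),
    "x -> x ∪ {a}": lambda x: x | A,
    "x -> x − {a}": lambda x: x - A,
    "x -> {a} − x": lambda x: A - x,
}
for name, f in examples.items():
    print(f"{name:14} monotone={is_monotone(f)} extensive={is_extensive(f)}")
```

The check treats the two properties independently, which makes it easy to see that a function can have one without the other.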
22. Chain
Chain
Given a poset (D, ⊆), a chain in D is an infinite sequence
d_0 ⊆ d_1 ⊆ d_2 ⊆ · · · ⊆ d_n ⊆ . . . of elements in D, also written
using set notation as {d_n | n ∈ N}.
Stationary
A chain is called stationary when there is some n ∈ N such
that d_m = d_{m+1} for all m > n.
23. Examples
The sets N, Z, Q, R of natural numbers, integers,
rationals, and real numbers form chains under their usual
order.
24. The Fixed-Point Theorem
Theorem
Let (D, ⊆, ⊥) be a semilattice, let
F : (D, ⊆, ⊥) → (D, ⊆, ⊥) be a continuous function, and let
fix(F) be the lub of the chain {F^n(⊥) | n ∈ N}. Then fix(F) is
the least fixed point of F.
25. Examples
For D = P({a,b,c}), ⊥ is {}.
Identity function: the sequence is {}, {}, {}, . . . so the least
fixed point is {}. Every element is a fixed point.
x → x ∪ {a}: the sequence is {}, {a}, {a}, {a}, . . . so the least
fixed point is {a}. {a}, {a,b}, {a,c}, and {a,b,c} are all fixed points.
x → {a} − x: no fixed points.
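The sequences in these examples can be reproduced by iterating each function from ⊥ until it stops changing. This is a minimal sketch: the function name iterate_from_bottom and the step limit are assumptions, and a result of None only means the sequence did not become stationary within that limit.

```python
def iterate_from_bottom(f, bottom, max_steps=100):
    """Compute bottom, f(bottom), f(f(bottom)), ... and return the first
    value x with f(x) == x, or None if none is reached in max_steps."""
    x = bottom
    for _ in range(max_steps):
        nxt = f(x)
        if nxt == x:
            return x
        x = nxt
    return None

BOTTOM = frozenset()
A = frozenset("a")

identity = lambda x: x
add_a = lambda x: x | A
a_minus_x = lambda x: A - x      # not monotone: oscillates {} -> {a} -> {}

print(iterate_from_bottom(identity, BOTTOM))
print(iterate_from_bottom(add_a, BOTTOM))
print(iterate_from_bottom(a_minus_x, BOTTOM))
```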
28. Observation
1 If the algorithm converges, the result is a solution to the
data-flow equations.
2 If the framework is monotone, then the solution found is
the maximum fixed point of the data-flow equations.
3 If the framework is monotone and its semilattice has
finite height, then the algorithm is guaranteed to
converge.
29. Distributive
Definition
f(x ∧ y) = f(x) ∧ f(y) for all x and y in V and f in F.
Benefit
Merging at join points loses no information:
k(h(f(T) ∧ g(T)))
= k(h(f(T)) ∧ h(g(T)))
= k(h(f(T))) ∧ k(h(g(T)))
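The distributivity condition can be checked exhaustively on P({a,b,c}) with ∩ as the meet. The sketch below assumes gen/kill-style transfer functions of the form x → gen ∪ (x − kill); the function nonempty_flag is a made-up counterexample that is monotone but not distributive.

```python
from itertools import combinations

def powerset(items):
    """All subsets of `items` as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

D = powerset("abc")

def is_distributive(f):
    """f(x ∧ y) == f(x) ∧ f(y) for all x, y in D, with ∧ = ∩."""
    return all(f(x & y) == f(x) & f(y) for x in D for y in D)

def gen_kill(gen, kill):
    """Transfer function x -> gen ∪ (x − kill)."""
    gen, kill = frozenset(gen), frozenset(kill)
    return lambda x: gen | (x - kill)

def nonempty_flag(x):
    """Hypothetical function: {c} for any nonempty input, {} otherwise."""
    return frozenset("c") if x else frozenset()

print(all(is_distributive(gen_kill(g, k)) for g in D for k in D))
print(is_distributive(nonempty_flag))  # fails for x = {a}, y = {b}
```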
30. Meaning of a Data-Flow Solution
The Ideal Solution
The Meet-Over-Paths Solution
Maximum Fixed Point (MFP): the result of the iterative algorithm.
31. The Ideal Solution
Property
IDEAL[B] = ∧_{P, a possible path from ENTRY to B} f_P(v_ENTRY)
In terms of the lattice-theoretic partial order ≤ for the
framework in question:
Any answer that is greater than IDEAL is incorrect.
Any value smaller than or equal to IDEAL is
conservative, i.e., safe.
32. The Meet-Over-Paths Solution
Property
MOP[B] = ∧_{P, a path from ENTRY to B} f_P(v_ENTRY).
For all B we have MOP[B] ≤ IDEAL[B].
Too much information
If the transfer functions are distributive, the MFP
solution equals the MOP solution.
34. MFP vs. MOP
The number of paths considered is unbounded if the flow
graph contains cycles.
→ The MOP does not lend itself to a direct algorithm.
Since MOP ≤ IDEAL and MFP ≤ MOP, we know that
MFP ≤ IDEAL, so MFP is safe.
MFP ≤ MOP because the MOP definition applies the meet
operator only at the end of each path, while the iterative
algorithm applies it early, at every merge point.
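A small constant-propagation sketch shows how applying the meet early can lose information. The two-branch program, the UNDEF/NAC encoding, and all helper names are assumptions for this illustration. On this acyclic graph, the value the iterative algorithm reaches after the merge is computed directly as "meet first, then transfer".

```python
UNDEF, NAC = "UNDEF", "NAC"   # top and bottom of the per-variable lattice

def meet_val(u, v):
    """UNDEF ∧ v = v; c ∧ c = c; anything else is NAC (not a constant)."""
    if u == UNDEF:
        return v
    if v == UNDEF:
        return u
    return u if u == v else NAC

def meet_env(e1, e2):
    return {k: meet_val(e1[k], e2[k]) for k in e1}

def assign_consts(**consts):
    """Transfer function for a block that assigns constants."""
    return lambda env: {**env, **consts}

def add_xy(env):
    """Transfer function for z = x + y."""
    x, y = env["x"], env["y"]
    if NAC in (x, y):
        z = NAC
    elif UNDEF in (x, y):
        z = UNDEF
    else:
        z = x + y
    return {**env, "z": z}

entry = {"x": UNDEF, "y": UNDEF, "z": UNDEF}
then_branch = assign_consts(x=1, y=2)   # if-branch:   x = 1; y = 2
else_branch = assign_consts(x=2, y=1)   # else-branch: x = 2; y = 1

# MOP: follow each path to the end, then take the meet.
mop = meet_env(add_xy(then_branch(entry)), add_xy(else_branch(entry)))
# MFP: take the meet at the merge point, then apply z = x + y.
mfp = add_xy(meet_env(then_branch(entry), else_branch(entry)))

print("MOP:", mop)
print("MFP:", mfp)
```

In this sketch MOP keeps z as the constant 3, while MFP reports NAC for z, a safe but less precise answer, consistent with MFP ≤ MOP.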
36. The Drawback of Dataflow Analysis
We need to keep track of lots of irrelevant details at every
program point.
We need to consider all the paths that the program may
execute.
Pointers are hard to analyze.