Lecture 10: Data-Flow Analysis
Jeff Smits
CS4200 Compiler Construction
TU Delft
November 2018
Title Text
2
Source
Code
Editor
Parse
Abstract
Syntax
Tree
Type Check
Check that names are used correctly and that expressions are well-typed
Errors
Earlier Lecture
Title Text
3
Source
Code
Editor
Parse
Abstract
Syntax
Tree
Other Check
Check that variables are initialised, statements are reached, etc.
Errors
This Lecture
Title Text
4
Source
Code
Editor
Parse
Abstract
Syntax
Tree
Optimise
Eliminate common subexpressions, reduce loop strength, etc.
Optimised
Abstract
Syntax
Tree
This Lecture
Reading Material
5
The following papers add background, conceptual exposition,
and examples to the material from the slides. Some notation and
technical details have been changed; check the documentation.
6
This paper introduces FlowSpec, the declarative data-flow
analysis specification language in Spoofax. Although the
design of the language described in this paper is still current,
the syntax used is already dated, i.e. the current FlowSpec
syntax in Spoofax is slightly different.
https://doi.org/10.1145/3136014.3136029
SLE 2017
7
Documentation for FlowSpec
at the metaborg.org website.
http://www.metaborg.org/en/latest/source/langdev/meta/lang/flowspec/index.html
Control-Flow
8
What is Control-Flow?
- “Order of evaluation”
Discuss a series of example programs
- What is the control flow?
- What constructs in the program determine that?
Control-Flow
9
What is Control-Flow?
10
function id(x) { return x; }
id(4); id(true);
- Calling a function passes control to that function

- A return passes control back to the caller
Function calls
What is Control-Flow?
11
if (c) { a = 5 } else { a = "four" }
- Control is passed to one of the two branches

- This is dependent on the value of the condition
Branching
What is Control-Flow?
12
while (c) { a = 5 }
- Control is passed to the loop body depending on the condition

- After the body we start over
Looping
What is Control-Flow?
13
float distance = 12.0;
float velocity = time / distance;
- No conditions or anything complicated

- But still order of execution
Sequence
What is Control-Flow?
14
distance = distance + 1;
- The expression needs to be evaluated, 

before we can save its result
Writes and reads
What is Control-Flow?
15
counter.next() / counter.next()
- Order in sub-expressions is usually undefined

- Side-effects make sub-expression order relevant
Expressions & side-effects
- Sequential	 statements

- Conditional	 if / switch / case

- Looping	 while / do while / for / foreach / loop

- Exceptions	 throw / try / catch / finally

- Continuations	 call/cc

- Async-await	 threading

- Coroutines / Generators	 yield

- Dispatch	 function calls / method calls

- Loop jumps	 break / continue

- ... many more ...	
What kind of Control-Flow?
16
Shorter code
- No need to repeat the same statement 10 times
Parametric code
- Extract reusable patterns
- Let user decide repetition amount
Expressive power
- Playing with the Turing Machines
Reason about program execution
- What happens when?
- In what order?
Why Control-Flow?
17
Imperative programming
- Explicit control-flow constructs
Declarative programming
- What, not how
- Less explicit control-flow

- More options for compilers to choose order
- Great if your compiler is often smarter than the programmer
Control-Flow and language design
18
Representation
- Represent control-flow of a program
- Conduct and represent results of data-flow analysis
Declarative Rules
- To define control-flow of a language
- To define data-flow of a language
Language-Independent Tooling
- Data-Flow Analysis
- Code completion
- Refactoring
- Optimisation
- …
Separation of Concerns in Data-Flow Analysis
19
Control-Flow Graphs
20
What is a Control-Flow Graph?
21
A control flow graph (CFG) in computer science is a representation, using graph notation, of
all paths that might be traversed through a program during its execution.
https://en.wikipedia.org/wiki/Control_flow_graph
let
var x : int := a + b
var y : int := a * b
in
while y > a + b do
(
a := a + 1;
x := a + b
)
end
Control-Flow Graph example
22
Control-Flow Graph example
23
y > a + b
var y : int := a * b
var x : int := a + b
a := a + 1
x := a + b
Basic blocks
24
var x : int := a + b
var y : int := a * b
y > a + b
a := a + 1
x := a + b
Nodes
- Usually innermost statements and expressions
- Or blocks for consecutive statements (basic blocks)

Edges
- Back edges: show loops
- Splits: conditionally split the control flow
- Merges: combine previously split control flow
Control Flow Graphs
25
Equivalent to unstructured control-flow
26
a ← 0
L1: b ← a + 1
c ← c + b
a ← 2 * b
if a < N goto L1
return c
a ← 0
b ← a + 1
c ← c + b
a ← 2 * b
a < N
return c
Representation
- Represent control-flow of a program
- Conduct and represent results of data-flow analysis
Declarative Rules
- To define control-flow of a language
- To define data-flow of a language
Language-Independent Tooling
- Data-Flow Analysis
- Code completion
- Refactoring
- Optimisation
- …
Separation of Concerns in Data-Flow Analysis
27
Representation
-
- Conduct and represent results of data-flow analysis
Declarative Rules
- To define control-flow of a language
- To define data-flow of a language
Language-Independent Tooling
- Data-Flow Analysis
- Code completion
- Refactoring
- Optimisation
- …
Separation of Concerns in Data-Flow Analysis
28
Control Flow Graphs
Data-Flow
29
What is Data-Flow?
- Possible values (data) that flow through the program
- Relations between that data (data dependence)
Discuss a series of example programs
- What is wrong or can be optimised?
- What is the flow we can use for this?
Data-Flow
30
Source
Code
Editor
Parse
Abstract
Syntax
Tree
Other Check
Check that code is reachable or observable
Errors
31
public int ComputeFac(int num) {
return num;
int num_aux;
if (num < 1)
num_aux = 1;
else
num_aux = num * this.ComputeFac(num-1);
return num_aux;
}
What is wrong here?
32
- Most of the code is never reached because of the early return

- This is usually considered an error by compilers
Dead code (control-flow)
x := 2;
y := 4;
x := 1;
// x and y used later
What is “wrong” here?
33
- The first value of x is never observed

- This is sometimes warned about by compilers
Dead code (data-flow)
Live variable analysis
Source
Code
Editor
Parse
Abstract
Syntax
Tree
Optimise
Eliminate common subexpressions, reduce loop strength, etc.
Optimised
Abstract
Syntax
Tree
34
let
var x : int := a + b
var y : int := a * b
in
if y > a + b then
(
a := a + 1;
x := a + b
)
end
What is suboptimal here?
35
- a + b is already computed when you get to the condition

- There is no need to compute it again
Common subexpression elimination
Available expression analysis
What is suboptimal here?
for i := 1 to 100 do
(
x[i] := y[i];
if w > 0 then
y[i] := 0
)
- The if condition is not dependent on i, x or y

- Still it is checked in the loop, which is slowing the loop
Loop unswitching
36
Data-dependence analysis
Tiger in FlowSpec
37
Representation
-
- Conduct and represent results of data-flow analysis
Declarative Rules
- To define control-flow of a language
- To define data-flow of a language
Language-Independent Tooling
- Data-Flow Analysis
- Code completion
- Refactoring
- Optimisation
- …
Separation of Concerns in Data-Flow Analysis
38
Control Flow Graphs
Representation
-
-
Declarative Rules
- To define control-flow of a language
- To define data-flow of a language
Language-Independent Tooling
- Data-Flow Analysis
- Code completion
- Refactoring
- Optimisation
- …
Separation of Concerns in Data-Flow Analysis
39
Control Flow Graphs
Data-flow information on CFG nodes
Representation
-
-
Declarative Rules
- To define control-flow of a language
- To define data-flow of a language
Language-Independent Tooling
- Data-Flow Analysis
- Code completion
- Refactoring
- Optimisation
- …
Separation of Concerns in Data-Flow Analysis
40
Control Flow Graphs
Data-flow information on CFG nodes
A domain-specific meta-language for Spoofax: FlowSpec
Map abstract syntax to control-flow (sub)graphs
- Match an AST pattern
- List all edges of that AST
- Use special “context” nodes to connect a subgraph to the outside graph
Control-Flow rules
x := 1;
if y > x then
z := y;
else
z := y * y;
y := a * b;
while y > a + b do
(a := a + 1;
x := a + b)
Control-flow graphs in FlowSpec
root Mod(s) =
start -> s -> end
start
end
While(c, b) =
entry -> node c -> b -> node c,
node c -> exit
IfThenElse(c, t, e) =
entry -> node c -> t -> exit,
node c -> e -> exit
Seq(s1, s2) =
entry -> s1 -> s2 -> exit
node Assign(_, _)
FlowSpec Example program
Data-Flow rules
Define effect of control-flow nodes
- Match an AST pattern on one side of a CFG edge
- Propagate the information from the other side of the edge
- Adapt that information as the effect of the matched CFG node
{z,x}
{z,x}
{z,x}
{z,x}
{z}
{}
{}
{}
{}
{}
{x}
{z}
{}
{}
x := 2;
y := 4;
x := 1;
z := x;
x := z;
{}
{}
{}
{}
{}
{}
{}
Live Variables in FlowSpec
properties
live: MaySet(name)
live(Assign(n, _) -> next) =
{ m | m <- live(next), n != m }
live(Ref(n) -> next) =
live(next) / {n}
live(_.end) =
{}
A variable is live if the current value
of the variable may be read further
along in the program
end
start
property rules
x := 2;
y := 4;
x := 1;
if y > 0 then
z := x;
else
z := y * y;
x := z;
{x,y}
{x}
{y}
{z}
{}
{}
{z}
{}
{}
{x}
{y}
{z}
{}
{}end
start
Live Variables in FlowSpec
A variable is live if the current value
of the variable may be read further
along in the program
properties
live: MaySet(name)
live(Assign(n, _) -> next) =
{ m | m <- live(next), n != m }
live(Ref(n) -> next) =
live(next) / {n}
live(_.end) =
{}
property rules
Conclusion
46
Control-Flow
- Order of execution
- Reasoning about what is reachable
Data-Flow
- Flow of data through a program
- Reasoning about data, and dependencies between data
FlowSpec
- Control-Flow rules to construct the graph
- Annotate with information from analysis by Data-Flow rules
Summary
47
Except where otherwise noted, this work is licensed under

Compiler Construction | Lecture 10 | Data-Flow Analysis

  • 1.
    Lecture 10: Data-FlowAnalysis Jeff Smits CS4200 Compiler Construction TU Delft November 2018
  • 2.
    Title Text 2 Source Code Editor Parse Abstract Syntax Tree Type Check Checkthat names are used correctly and that expressions are well-typed Errors Earlier Lecture
  • 3.
    Title Text 3 Source Code Editor Parse Abstract Syntax Tree Other Check Checkthat variables are initialised, statements are reached, etc. Errors This Lecture
  • 4.
    Title Text 4 Source Code Editor Parse Abstract Syntax Tree Optimise Eliminate commonsubexpressions, reduce loop strength, etc. Optimised Abstract Syntax Tree This Lecture
  • 5.
    Reading Material 5 The followingpapers add background, conceptual exposition, and examples to the material from the slides. Some notation and technical details have been changed; check the documentation.
  • 6.
    6 This paper introducesFlowSpec, the declarative data-flow analysis specification language in Spoofax. Although the design of the language described in this paper is still current, the syntax used is already dated, i.e. the current FlowSpec syntax in Spoofax is slightly different. https://doi.org/10.1145/3136014.3136029 SLE 2017
  • 7.
    7 Documentation for FlowSpec atthe metaborg.org website. http://www.metaborg.org/en/latest/source/langdev/meta/lang/flowspec/index.html
  • 8.
  • 9.
    What is Control-Flow? -“Order of evaluation” Discuss a series of example programs - What is the control flow? - What constructs in the program determine that? Control-Flow 9
  • 10.
    What is Control-Flow? 10 functionid(x) { return x; } id(4); id(true); - Calling a function passes control to that function - A return passes control back to the caller Function calls
  • 11.
    What is Control-Flow? 11 if(c) { a = 5 } else { a = "four" } - Control is passed to one of the two branches - This is dependent on the value of the condition Branching
  • 12.
    What is Control-Flow? 12 while(c) { a = 5 } - Control is passed to the loop body depending on the condition - After the body we start over Looping
  • 13.
    What is Control-Flow? 13 floatdistance = 12.0; float velocity = time / distance; - No conditions or anything complicated - But still order of execution Sequence
  • 14.
    What is Control-Flow? 14 distance= distance + 1; - The expression needs to be evaluated, 
 before we can save its result Writes and reads
  • 15.
    What is Control-Flow? 15 counter.next()/ counter.next() - Order in sub-expressions is usually undefined - Side-effects make sub-expression order relevant Expressions & side-effects
  • 16.
    - Sequential statements -Conditional if / switch / case - Looping while / do while / for / foreach / loop - Exceptions throw / try / catch / finally - Continuations call/cc - Async-await threading - Coroutines / Generators yield - Dispatch function calls / method calls - Loop jumps break / continue - ... many more ... What kind of Control-Flow? 16
  • 17.
    Shorter code - Noneed to repeat the same statement 10 times Parametric code - Extract reusable patterns - Let user decide repetition amount Expressive power - Playing with the Turing Machines Reason about program execution - What happens when? - In what order? Why Control-Flow? 17
  • 18.
    Imperative programming - Explicitcontrol-flow constructs Declarative programming - What, not how - Less explicit control-flow
 - More options for compilers to choose order - Great if your compiler is often smarter than the programmer Control-Flow and language design 18
  • 19.
    Representation - Represent control-flowof a program - Conduct and represent results of data-flow analysis Declarative Rules - To define control-flow of a language - To define data-flow of a language Language-Independent Tooling - Data-Flow Analysis - Code completion - Refactoring - Optimisation - … Separation of Concerns in Data-Flow Analysis 19
  • 20.
  • 21.
    What is aControl-Flow Graph? 21 A control flow graph (CFG) in computer science is a representation, using graph notation, of all paths that might be traversed through a program during its execution. https://en.wikipedia.org/wiki/Control_flow_graph
  • 22.
    let var x :int := a + b var y : int := a * b in while y > a + b do ( a := a + 1; x := a + b ) end Control-Flow Graph example 22
  • 23.
    Control-Flow Graph example 23 y> a + b var y : int := a * b var x : int := a + b a := a + 1 x := a + b
  • 24.
    Basic blocks 24 var x: int := a + b var y : int := a * b y > a + b a := a + 1 x := a + b
  • 25.
    Nodes - Usually innermoststatements and expressions - Or blocks for consecutive statements (basic blocks) Edges - Back edges: show loops - Splits: conditionally split the control flow - Merges: combine previously split control flow Control Flow Graphs 25
  • 26.
    Equivalent to unstructuredcontrol-flow 26 a ← 0 L1: b ← a + 1 c ← c + b a ← 2 * b if a < N goto L1 return c a ← 0 b ← a + 1 c ← c + b a ← 2 * b a < N return c
  • 27.
    Representation - Represent control-flowof a program - Conduct and represent results of data-flow analysis Declarative Rules - To define control-flow of a language - To define data-flow of a language Language-Independent Tooling - Data-Flow Analysis - Code completion - Refactoring - Optimisation - … Separation of Concerns in Data-Flow Analysis 27
  • 28.
    Representation - - Conduct andrepresent results of data-flow analysis Declarative Rules - To define control-flow of a language - To define data-flow of a language Language-Independent Tooling - Data-Flow Analysis - Code completion - Refactoring - Optimisation - … Separation of Concerns in Data-Flow Analysis 28 Control Flow Graphs
  • 29.
  • 30.
    What is Data-Flow? -Possible values (data) that flow through the program - Relations between that data (data dependence) Discuss a series of example programs - What is wrong or can be optimised? - What is the flow we can use for this? Data-Flow 30
  • 31.
  • 32.
    public int ComputeFac(intnum) { return num; int num_aux; if (num < 1) num_aux = 1; else num_aux = num * this.ComputeFac(num-1); return num_aux; } What is wrong here? 32 - Most of the code is never reached because of the early return - This is usually considered an error by compilers Dead code (control-flow)
  • 33.
    x := 2; y:= 4; x := 1; // x and y used later What is “wrong” here? 33 - The first value of x is never observed - This is sometimes warned about by compilers Dead code (data-flow) Live variable analysis
  • 34.
    Source Code Editor Parse Abstract Syntax Tree Optimise Eliminate common subexpressions,reduce loop strength, etc. Optimised Abstract Syntax Tree 34
  • 35.
    let var x :int := a + b var y : int := a * b in if y > a + b then ( a := a + 1; x := a + b ) end What is suboptimal here? 35 - a + b is already computed when you get to the condition - There is no need to compute it again Common subexpression elimination Available expression analysis
  • 36.
    What is suboptimalhere? for i := 1 to 100 do ( x[i] := y[i]; if w > 0 then y[i] := 0 ) - The if condition is not dependent on i, x or y - Still it is checked in the loop, which is slowing the loop Loop unswitching 36 Data-dependence analysis
  • 37.
  • 38.
    Representation - - Conduct andrepresent results of data-flow analysis Declarative Rules - To define control-flow of a language - To define data-flow of a language Language-Independent Tooling - Data-Flow Analysis - Code completion - Refactoring - Optimisation - … Separation of Concerns in Data-Flow Analysis 38 Control Flow Graphs
  • 39.
    Representation - - Declarative Rules - Todefine control-flow of a language - To define data-flow of a language Language-Independent Tooling - Data-Flow Analysis - Code completion - Refactoring - Optimisation - … Separation of Concerns in Data-Flow Analysis 39 Control Flow Graphs Data-flow information on CFG nodes
  • 40.
    Representation - - Declarative Rules - Todefine control-flow of a language - To define data-flow of a language Language-Independent Tooling - Data-Flow Analysis - Code completion - Refactoring - Optimisation - … Separation of Concerns in Data-Flow Analysis 40 Control Flow Graphs Data-flow information on CFG nodes A domain-specific meta-language for Spoofax: FlowSpec
  • 41.
    Map abstract syntaxto control-flow (sub)graphs - Match an AST pattern - List all edges of that AST - Use special “context” nodes to connect a subgraph to the outside graph Control-Flow rules
  • 42.
    x := 1; ify > x then z := y; else z := y * y; y := a * b; while y > a + b do (a := a + 1; x := a + b) Control-flow graphs in FlowSpec root Mod(s) = start -> s -> end start end While(c, b) = entry -> node c -> b -> node c, node c -> exit IfThenElse(c, t, e) = entry -> node c -> t -> exit, node c -> e -> exit Seq(s1, s2) = entry -> s1 -> s2 -> exit node Assign(_, _) FlowSpec Example program
  • 43.
    Data-Flow rules Define effectof control-flow nodes - Match an AST pattern on one side of a CFG edge - Propagate the information from the other side of the edge - Adapt that information as the effect of the matched CFG node
  • 44.
    {z,x} {z,x} {z,x} {z,x} {z} {} {} {} {} {} {x} {z} {} {} x := 2; y:= 4; x := 1; z := x; x := z; {} {} {} {} {} {} {} Live Variables in FlowSpec properties live: MaySet(name) live(Assign(n, _) -> next) = { m | m <- live(next), n != m } live(Ref(n) -> next) = live(next) / {n} live(_.end) = {} A variable is live if the current value of the variable may be read further along in the program end start property rules
  • 45.
    x := 2; y:= 4; x := 1; if y > 0 then z := x; else z := y * y; x := z; {x,y} {x} {y} {z} {} {} {z} {} {} {x} {y} {z} {} {}end start Live Variables in FlowSpec A variable is live if the current value of the variable may be read further along in the program properties live: MaySet(name) live(Assign(n, _) -> next) = { m | m <- live(next), n != m } live(Ref(n) -> next) = live(next) / {n} live(_.end) = {} property rules
  • 46.
  • 47.
    Control-Flow - Order ofexecution - Reasoning about what is reachable Data-Flow - Flow of data through a program - Reasoning about data, and dependencies between data FlowSpec - Control-Flow rules to construct the graph - Annotate with information from analysis by Data-Flow rules Summary 47
  • 48.
    Except where otherwisenoted, this work is licensed under