Static Analysis Lecture 2

Loading...

Flash Player 9 (or above) is needed to view presentations.
We have detected that you do not have it on your computer. To install it, go here.

0 comments

Post a comment

    Post a comment
    Embed Video
    Edit your comment Cancel

    Favorites, Groups & Events

    Static Analysis Lecture 2 - Presentation Transcript

    1. Lattice Theory Control Flow Graphs Dataflow Analysis Static Analysis 2009 Michael I. Schwartzbach Computer Science, University of Aarhus
    2. Syllabus 1. Motivations 2. Lattice Theory 3. Control Flow Graph 4. Control Flow Analysis Examples 5. Dimension of Control Flow Analysis 6. Q & A
    3. Motivations We no longer deal with yes / no questions. Ex: var a,b,c; a = 1; b = 2; c = 3; //Which variables are greater than 1?
    4. Motivations The answer is Flow Sensitive. Ex: var a,b; a = input; b = 1; if ( a > 0 ) b = 2; //What’s the value of b?
    5. Motivations We no longer deal with yes / no questions. - Lattice Theory The answer is Flow Sensitive. - Control Flow Graph Analysis MIT
    6. Partial Orders ! A partial order is a structure L = (S, ) ! S is a set ! is a binary relation that satisfies: • reflexivity: ∀x∈S: x x • transitivity: ∀x,y,z∈S: x y ∧ y z ! x z • anti-symmetry: ∀x,y∈S: x y ∧ y x ! x = y Static Analysis 2
    7. Upper and Lower Bounds ! Let X ⊆ S be a subset ! We say that y∈S is an upper bound (X y) when: ∀ x∈X: x y ! We say that y∈S is a lower bound (y X) when: ∀ x∈X: y x ! A least upper bound X is defined by: X X ∧ ∀y∈S: X y ! X y ! A greatest lower bound X is defined by: X X ∧ ∀y∈S: y X ! y X Static Analysis 3
    8. Lattices ! A lattice is a partial order where: X and X exist for all X ! S ! A lattice must have: • a unique largest element, = S • a unique smallest element, = S ! If S is a finite set, then it is a lattice iff: • and exist • x y and y x exist for all x,y "S Static Analysis 4
    9. These Partial Orders Are Lattices Static Analysis 5
    10. These Partial Orders Are Not Lattices Static Analysis 6 (click) Least Upper Bound (click) Least Upper Bound
    11. These Partial Orders Are Not Lattices No Unique Least Upper Bound Static Analysis 6 (click) Least Upper Bound (click) Least Upper Bound
    12. These Partial Orders Are Not Lattices No Unique Least Upper Bound No Unique Least Upper Bound Static Analysis 6 (click) Least Upper Bound (click) Least Upper Bound
    13. The Subset Lattice ! Every finite set A defines a lattice (2A,⊆), where: • =∅ • =A • x y=x∪y • x y=x∩y {0,1,2,3} {0,1,2} {0,1,3} {0,2,3} {1,2,3} {0,1} {0,2} {0,3} {1,2} {1,3} {2,3} {0} {1} {2} {3} {} Static Analysis 7 Subset
    14. Lattice Height ! The height of a lattice is the length of the longest path from to ! The lattice (2A,⊆) has height |A| {0,1,2,3} {0,1,2} {0,1,3} {0,2,3} {1,2,3} {0,1} {0,2} {0,3} {1,2} {1,3} {2,3} {0} {1} {2} {3} {} Static Analysis 8 Lattice Fixed Point Lattice ( )
    15. Monotone and Increasing Functions ! A function f: L !L is monotone when: "x,y # S: x y ! f(x) f(y) ! Monotone functions are closed under composition ! As functions, and are both monotone ! A function is increasing when: "x # S: x f(x) ! Monotone is different from increasing • e.g. all constant functions are monotone Static Analysis 9
    16. The Fixed-Point Theorem ! In a lattice with finite height, every monotone function f has a unique least fixed-point: fix(f) = fi( ) i ≥0 such that f(fix(f)) = fix(f) Static Analysis 10
    17. Proof of Existence ! Clearly, f( ) ! Since f is monotone, we also have f( ) f2( ) ! By induction, fi( ) fi+1( ) ! This means that: f( ) f2( ) ... fi( ) ... is an increasing chain ! L has finite height, so for some k: fk( ) = fk+1( ) ! But then fix(f) = fk( ) Static Analysis 11
    18. Proof of Unique Least ! Assume that x is another fixed-point: x = f(x) ! Clearly, x ! By induction, fi( ) fi(x) = x ! In particular, fix(f) = fk( ) x ! Uniqueness then follows from anti-symmetry Static Analysis 12
    19. Computing Fixed-Points ! The time complexity of fix(f) depends on: • the height of the lattice • the cost of computing f • the cost of testing equality x= ; do { t = x; x = f(x); } while (x ≠ t); Static Analysis 13 Fixed-Point ,
    20. Product Lattice ! If L1, L2, ..., Ln are lattices, then so is the product: L1×L2× ... ×Ln = { (x1,x2,...,xn) | xi ∈ Li } where is defined pointwise ! Note that and can be computed pointwise ! height(L1×L2× ... ×Ln) = height(L1)+ ... + height(Ln) Static Analysis 14 Lattice Lattice Lattice
    21. Sum Lattice ! If L1, L2, ..., Ln are lattices, then so is the sum: L1+L2+ ... +Ln = { (i,xi) | xi ! Li{ , }} " { , } where: • and are as expected • (i,x) (j,y) if and only if i=j and x y ! height(L1+L2+ ... +Ln) = max{height(Li)} Static Analysis 15
    22. Lift Lattice ! If L is a lattice, then so is lift(L), which is: ! height(lift(L)) = height(L)+1 Static Analysis 16
    23. Flat Lattice ! If A is a finite set, the flat(A) is a lattice: a1 a2 ... an ! height(flat(A)) = 2 Static Analysis 17
    24. Map Lattice ! If A is a finite set and L is a lattice, then we obtain the map lattice: A → L = { [a1→x1, ..., an→xn] | xi ∈ Li } ordered pointwise ! height(A → L) = |A|⋅height(L) Static Analysis 18
    25. Lattice Equations ! Let L be a lattice with finite height ! A equation system is of the form: x1 = F1(x1, ..., xn) x2 = F2(x1, ..., xn) ... xn = Fn(x1, ..., xn) where xi are variables and Fi: Ln→L is monotone Static Analysis 19
    26. Solving Equations ! Every equation system has a unique least solution, which is the least fixed-point of the function F: Ln→Ln defined by: F(x1,...,xn) = (F1(x1,...,xn), ..., Fn(x1,...,xn)) ! The F function plugs into the right-hand sides ! A solution is always a fixed-point • this is true for any kind of equation Static Analysis 20
    27. Summary of Lattice Theory 1. There’s no simple yes/no questions but need to answer with a set. 2. Set with ⊆ relation is a partial order and form a lattice. 3. There exist least fixed point in lattice with monotone framework. 4. We can compute fixed point by work-list algorithm.
    28. Control Flow Graphs ! A control flow graph (CFG) is a directed graph: • nodes correspond to program points • edges represent possible flow of control ! A CFG always has: • a single point of entry • a single point of exit ! Let v be a node in a CFG: • pred(v) is the set of predecessor nodes • succ(v) is the set of successor nodes Static Analysis 21
    29. CFG Construction (1/3) ! For the simpel while-fragment, CFGs are constructed inductively ! CFGs for simple statements: id = E output E return E var id Static Analysis 22
    30. CFG Construction (2/3) ! For a statement sequence S1 S2: • eliminate the exit node of S1 and the entry node of S2 • glue the statements together S1 S1 S2 S2 Static Analysis 23
    31. CFG Construction (3/3) ! Similarly for the other control structures: E E E S S1 S2 S Static Analysis 24
    32. An Example CFG ite(n) { var f; var f f = 1; f=1 while (n>0) { f = f*n; n>0 n = n-1; } f=f*n return f; n=n-1 } return f Static Analysis 25
    33. The Monotone Framework (1/2) ! A CFG to be analyzed, nodes V = {v1,v2, ..., vn} ! A finite-height lattice L of possible answers • fixed or parametrized by the given program ! A variable [[v]]∈L for every CFG node v ! A dataflow constraint for each syntactic construct • relates the value of [[vi]] to the variables for other nodes • typically a node is related to its neighbors • the constraints must be monotone functions: [[vi]] = Fi([[v1]], [[v2]], ..., [[vn]]) Static Analysis 26
    34. The Monotone Framework (2/2) ! Extract all constraints for the complete CFG ! Solve constraints using the fixed-point algorithm: • we work in the lattice Ln • computing the fixed-point of the combined function: F(x1,...,xn) = (F1(x1,...,xn), ..., Fn(x1,...,xn)) ! This solutions gives an answer from L for each program point Static Analysis 27
    35. Generating and Solving Constraints fixed-point [[v]] = {x,yz} [[w]] = {x,y} [[q]] = {} [[p]] = {y} solution CFG constraints Static Analysis 28
    36. Lattice Points as Answers the trivial, useless answer our answer (the fixed-point) safe answers unsafe answers the true answer Static Analysis 29 Fixed Point True Answer, 1. True Answer 2. (non-trivial)
    37. The Naive Algorithm x = ( , , ..., ); do { t = x; x = F(x); } while (x≠t); ! Does not exploit any special structure Static Analysis 30
    38. Chaotic Iteration ! Exploits the special structure of Ln x1 = ; ... xn = ; do { t1 = x1; ... tn = xn; x1 = F1(x1, ..., xn); ... xn = Fn(x1, ..., xn); } while (x1≠t1 ∨ ... ∨ xn≠tn); Static Analysis 31 while short-circuit( improvement)
    39. The Worklist Algorithm (1/2) ! Exploits the special structure of right-hand sides ! Most right-hand sides are quite sparse: • constraints on CFG nodes do not involve all others ! Use a map: dep: V ! 2V that for v"V gives the variables w where v occurs on the right-hand side of the constraint for w Static Analysis 32 Work-list algorithm
    40. The Worklist Algorithm (2/2) x1 = ; ... xn = ; q = [v1, ..., vn]; while (q![]) { assume q = [vi, ...]; y = Fi(x1, ..., xn); q = q.tail(); if (y!xi) { for (v " dep(vi)) q.append(v); xi = y; } } Static Analysis 33 (click) dep(v)
    41. The Worklist Algorithm (2/2) x1 = ; ... xn = ; q = [v1, ..., vn]; while (q![]) { dep (v): assume q = [vi, ...]; The set of nodes y = Fi(x1, ..., xn); whose information q = q.tail(); may depend on the if (y!xi) { information of v for (v " dep(vi)) q.append(v); xi = y; } } Static Analysis 33 (click) dep(v)
    42. Further Improvements ! Use a priority queue instead of a FIFO queue: • find clever heuristics for priorities ! Look at the graph of dependency edges: • build strongly-connected components • solve constraints bottom-up in the resulting DAG Static Analysis 34
    43. Liveness Analysis ! A variable is live at a program point if its current value may be read in the remaining execution ! This is clearly undecidable, but the property can be conservatively approximated ! The answer ”dead” must be the true one • dead variables may be ignored Static Analysis 35 A May Question A Future Behavior Question
    44. A Lattice for Liveness ! A subset lattice of program variables var x,y,z; L = (2{x,y,z}, ⊆) x = input; while (x>1) { the trivial answer y = x/2; if (y>3) x = x-y; {x,y,z} z = x-4; {x,y} {y,z} {x,z} if (z>0) x = x/2; {x} {y} {z} z = z-1; } {} output x; Static Analysis 36 CFG
    45. The Control Flow Graph x = input x = input x>1 x > 1 y = x/2 y = x/2 y>3 y > 3 x = x-y x = x-y var x,y,z var x,y,z z = x-4 z = x-4 z > 0 z> 0 x = x/2 x = x/2 z = z-1 z = z-1 output x output x Static Analysis 37
    46. Setting Up ! For every CFG node, v, we have a variable [[v]]: • the subset of program variables that are live at the program point before v ! Since the analysis is conservative, the computed sets may be too large v ! Auxiliary definition: JOIN(v) = ∪ [[w]] w∈succ(v) w1 w2 wk Static Analysis 38
    47. Liveness Constraints ! For the exit node: vars(E) = variables occurring in E [[exit]] = {} ! For conditions and output: [[v]] = JOIN(v) ∪ vars(E) if (E) , while (E) , output E ! For assignments: [[v]] = JOIN(v) {id} ∪ vars(E) id = E ! For variable declarations: [[v]] = JOIN(v) {id1, ..., idn} var id1, ... ,idn ! For all other nodes: right-hand sides are monotone [[v]] = JOIN(v) since JOIN is monotone Static Analysis 39
    48. Generated Constraints [[var x,y,z]] = [[z=input]] {x,y,z} [[x=input]] = [[x>1]] {x} [[x>1]] = ([[y=x/2]] ! [[output x]]) ! {x} [[y=x/2]] = ([[y>3]] {y}) ! {x} [[y>3]] = [[x=x-y]] ! [[z=x-4]] ! {y} [[x=x-y]] = ([[z=x-4]] {x}) ! {x} [[z>0]] = [[x=x/2]] ! [[z=z-1]] ! {z} [[x=x/2]] = ([[z=z-1]] {x}) ! {z} [[output x]] = [[exit]] ! {x} [[exit]] = {} Static Analysis 40 equation successor , equation work-list algorithm , CFG (click)
    49. The Control Flow Graph x = input x = input x>1 x > 1 y = x/2 y = x/2 y>3 y > 3 x = x-y x = x-y var x,y,z var x,y,z z = x-4 z = x-4 z > 0 z> 0 x = x/2 x = x/2 z = z-1 z = z-1 output x output x Static Analysis 37 dep(v) node parent( equation children ) ( ) dep(v) ( parent)
    50. Least Solution [[entry]] = {} [[z>0]] = {x,z} [[var x,y,z]] = {} [[x=x/2]] = {x,z} [[x=input]] = {} [[z=z-1]] = {x,z} [[x>1]] = {x} [[output x]] = {x} [[y=x/2]] = {x} [[exit]] = {} [[y>3]] = {x,y} [[x=x-y]] = {x,y} [[z=x-4]] = {x} ! Many non-trivial answers! Static Analysis 41
    51. Optimizations ! Variables y and z are never simultaneously live ! they can share the same variable location ! The value assigned in z=z-1 is never read ! the assigment can be skipped var x,yz; x = input; while (x>1) { yz = x/2; • better register allocation if (yz>3) x = x-yz; • a few clock cycles saved yz = x-4; if (yz>0) x = x/2; } output x; Static Analysis 42 x y ( register)
    52. Time Complexity ! With n CFG nodes and k variables: • the lattice has height k⋅n ! Subsets can be represented as bitvectors: • each lattice element has size k • each ∪, , = operation takes time O(k) ! Each iteration uses O(n) operations: • each iteration takes time O(k⋅n) ! There are at most k⋅n iterations ! Total time complexity: O(k2n2) Static Analysis 43
    53. Available Expressions Analysis ! A (nontrivial) expression is available at a program point if its current value has already been computed earlier in the execution ! The approximation includes too few expressions • the answer ”available” must be the true one • available expression may not be re-computed Static Analysis 44 A Must Question A Past Behavior Question
    54. A Lattice for Available Expressions ! A reverse subset lattice of nontrivial expressions var x,y,z,a,b; z = a+b; y = a*b; L = (2{a+b, a*b, y>a+b, a+1}, ⊇) while (y > a+b) { a = a+1; x = a+b; } Static Analysis 45
    55. Reverse Subset Lattice the trivial answer {} {a+b} {a*b} {y>a+b} {a+1} {a+b, a*b} {a+b, y>a+b} {a+b, a+1} {a*b, y>a+b} {a*b, a+1} {y>a+b, a+1} {a+b, a*b, y>a+b} {a+b, a*b, a+1} {a+b, y>a+b, a+1} {a*b, y>a+b, a+1} {a+b, a*b, y>a+b, a+1} Static Analysis 46
    56. The Flow Graph var x,y,z,a,b z=a+b y=a*b y>a+b a=a+1 x=a+b Static Analysis 47
    57. Setting Up ! For every CFG node, v, we have a variable [[v]]: • the subset of program variables that are available at the program point after v ! Since the analysis is conservative, the computed sets may be too small w2 wk ! Auxiliary definition: w1 JOIN(v) = ∩ [[w]] w∈pred(v) v Static Analysis 48
    58. Auxiliary Functions ! The function ↓id removes all expressions that contain a reference to the variable id ! The function exps(E) is defined as: • exps(intconst) = ∅ • exps(id) = ∅ • exps(input) = ∅ • exps(E1 op E2) = {E1 op E2} ∪ exps(E1) ∪ exps(E2) but don’t include expressions containing input Static Analysis 49
    59. Availability Constraints ! For the entry node: [[entry]] = {} ! For conditions and output: [[v]] = JOIN(v) ! exps(E) ! For assignments: [[v]] = (JOIN(v) ! exps(E))"id ! For all other nodes: [[v]] = JOIN(v) Static Analysis 50
    60. Generated Constraints [[entry]] = {} [[var x,y,z,a,b]] = [[entry]] [[z=a+b]] = exps(a+b)↓z [[y=a*b]] = ([[z=a+b]] ∪ exps(a*b))↓y [[y>a+b]] = ([[y=a*b]] ∩ [[x=a+b]]) ∪ exps(y>a+b) [[a=a+1]] = ([[y>a+b]] ∪ exps(a+1))↓a [[x=a+b]] = ([[a=a+1]] ∪ exps(a+b))↓x [[exit]] = [[y>a+b]] Static Analysis 51
    61. Least Solution [[entry]] = {} [[var x,y,z,a,b]] = {} [[z=a+b]] = {a+b} [[y=a*b]] = {a+b, a*b} [[y>a+b]] = {a+b, y>a+b} [[a=a+1]] = {} [[x=a+b]] = {a+b} [[exit]] = {a+b} ! Many nontrivial answers! Static Analysis 52 (click) (click)
    62. Least Solution ! A reverse subset lattice [[entry]] = {} var x,y,z,a,b; [[var x,y,z,a,b]] = {} z = a+b; [[z=a+b]] = {a+b} y = a*b; [[y=a*b]] = {a+b, a*b} while (y > a+b) { [[y>a+b]] = {a+b, y>a+b} a = a+1; [[a=a+1]] = {} [[x=a+b]] = {a+b} x = a+b; [[exit]] = {a+b} } ! Many nontrivial answers! Static Analysis Static Analysis 52 (click) (click)
    63. Optimizations ! We notice that a+b is available before the loop ! The program can be optimized (slightly): var x,y,x,a,b,aplusb; aplusb = a+b; z = aplusb; a+b y = a*b; a+b while (y > aplusb) { a = a+1; aplusb = a+b; a+b x = aplusb; } Static Analysis 53 a+b , (click) a+b ( ) Loops a (click) a+b
    64. Optimizations ! We notice that a+b is available before the loop ! The program can be optimized (slightly): var x,y,x,a,b,aplusb; aplusb = a+b; z = aplusb; y = a*b; while (y > aplusb) { a = a+1; aplusb = a+b; x = aplusb; } Static Analysis 53 a+b , (click) a+b ( ) Loops a (click) a+b
    65. Optimizations ! We notice that a+b is available before the loop ! The program can be optimized (slightly): var x,y,x,a,b,aplusb; aplusb = a+b; z = aplusb; y = a*b; while (y > aplusb) { a = a+1; aplusb = a+b; x = aplusb; } Static Analysis 53 a+b , (click) a+b ( ) Loops a (click) a+b
    66. Very Busy Expressions Analysis ! A (nontrivial) expression is very busy if it will definitely be evaluated before its value changes ! The approximation includes too few expressions • the answer ”very busy” must be the true one • very busy expressions may be pre-computed ! Same lattice as for available expressions Static Analysis 54
    67. Setting Up ! For every CFG node, v, we have a variable [[v]]: • the subset of program variables that are very busy at the program point before v ! Since the analysis is conservative, the computed sets may be too small v ! Auxiliary definition: JOIN(v) = ! [[w]] w"succ(v) w1 w2 wk Static Analysis 55
    68. Very Busy Constraints ! For the exit node: [[exit]] = {} ! For conditions and output: [[v]] = JOIN(v) ∪ exps(E) ! For assignments: [[v]] = JOIN(v)↓id ∪ exps(E) ! For all other nodes: [[v]] = JOIN(v) Static Analysis 56
    69. An Example Program var x,a,b; x = input; a = x-1; b = x-2; while (x > 0) { output a*b-x; x = x-1; } output a*b; ! The analysis shows that a*b is very busy Static Analysis 57
    70. Code Hoisting var x,a,b; var x,a,b,atimesb; x = input; x = input; a = x-1; a = x-1; b = x-2; b = x-2; while (x > 0) { atimesb = a*b; output a*b-x; while (x > 0) { x = x-1; output atimesb-x; } x = x-1; output a*b; } output atimesb; ! The analysis shows that a*b is very busy Static Analysis 58
    71. Reaching Definitions Analysis ! The reaching definitions for a program point are reaching definitions those assignments that may define the current values of variables ! The conservative approximation may include too many possible assignments Static Analysis 59
    72. A Lattice for Reaching Definitions ! The subset lattice of assignments var x,y,z; x = input; while (x > 1) { y = x/2; if (y>3) x = x-y; z = x-4; if (z>0) x = x/2; z = z-1; } output x; L = (2{x=input, y=x/2, x=x-y, z=x-4, x=x/2, z=z-1},⊆) Static Analysis 60
    73. Reaching Definitions Constraints ! For assignments: [[v]] = JOIN(v)↓id ∪ {v} ! For all other nodes: [[v]] = JOIN(v) w2 wk w1 ! Auxiliary definition: JOIN(v) = ∪ [[w]] w∈pred(v) v ! The function ↓id removes assignments to id Static Analysis 61
    74. Def-Use Graph ! Reaching definitions define the def-use graph: • like a CFG but with edges from def to use nodes • basis for dead code elimination and code motion x=input x=x-y z=z-1 x>1 y=x/2 z=x-4 output x y>3 z>0 x=x/2 Static Analysis 62
    75. Forwards vs. Backwards ! A forwards analysis: • computes information about the past behavior • available expressions, reaching definitions ! A backwards analysis: • computes information about the future behavior • liveness, very busy expressions Static Analysis 63 Forwards / Backwards May / Must
    76. May vs. Must ! A may analysis: • describes information that possibly is true • an upper approximation • liveness, reaching definitions ! A must analysis: • describes information that definitely is true • a lower approximation • available expressions, very busy expressions Static Analysis 64
    77. Classifying Analyses forwards backwards reaching definitions liveness [[v]] describes state after v [[v]] describes state before v may JOIN(v) = [[w]] = w∈pred(v) ∪ [[w]] w∈pred(v) JOIN(v) = [[w]] = w∈succ(v) ∪ [[w]] w∈succ(v) available expressions very busy expressions [[v]] describes state after v [[v]] describes state before v must JOIN(v) = [[w]] = w∈pred(v) ∩ [[w]] w∈pred(v) JOIN(v) = [[w]] = w∈succ(v) ∩ [[w]] w∈succ(v) Static Analysis 65 1. Past / Future Behavior
    78. Initialized Variables Analysis ! Compute for each program point those variables that have definitely been initialized in the past ! ! forwards must analysis ! Reverse subset lattice of all variables ! JOIN(v) = ∩ [[w]] w∈pred(v) ! [[entry]] = {} ! For assignments: [[v]] = JOIN(v) ∪ {id} ! For all others: [[v]] = JOIN(v) Static Analysis 66
    SlideShare Zeitgeist 2009

    + Wen-Kai HuangWen-Kai Huang Nominate

    custom

    71 views, 0 favs, 0 embeds more stats

    More info about this document

    © All Rights Reserved

    Go to text version

    • Total Views 71
      • 71 on SlideShare
      • 0 from embeds
    • Comments 0
    • Favorites 0
    • Downloads 1
    Most viewed embeds

    more

    All embeds

    less

    Flagged as inappropriate Flag as inappropriate
    Flag as inappropriate

    Select your reason for flagging this presentation as inappropriate. If needed, use the feedback form to let us know more details.

    Cancel
    File a copyright complaint
    Having problems? Go to our helpdesk?

    Categories