Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Type analysis

933 views

Published on

Lecture 8 of compiler construction course

Published in: Software
  • Be the first to comment

  • Be the first to like this

Type analysis

  1. 1. IN4303 2016-2017 Compiler Construction Type Checking static analysis Eelco Visser
  2. 2. Name Resolution
  3. 3. [ESOP 2015]
  4. 4. Shadowing S0 S1 S2 D < P.p s.p < s.p’ p < p’ S1 S2 x1 y1 y2 x2 z1 x3S0z2 def x3 = z2 5 7 def z1 = fun x1 { fun y1 { x2 + y2 } } S1 S2 x1 y1 y2 x2 z1 x3S0z2 R P P D S1 S2 x1 y1 y2 x2 z1 x3S0z2 D P R R P P D R.P.D < R.P.P.D
  5. 5. Blocks in Java class Foo { void foo() { int x = 1; { int y = 2; } x = y; } } What is the scope graph for this program? Is the y declaration visible to the y reference?
  6. 6. A Calculus for Name Resolution S R1 R2 SR SRS I(R1) S’S S’S P Sx Sx R xS xS D I(_).p’ < P.p D < I(_).p’ D < P.p s.p < s.p’ p < p’ Reachability VisibilityWell formed path: R.P*.I(_)*.D
  7. 7. Alpha Equivalence
  8. 8. Language-independent 𝜶-equivalence Program similarity Equivalence define ↵-equivalence using scope graphs. Except for the leaves rep ifiers, two ↵-equivalent programs must have the same abstract write P ' P’ (pronounced “P and P’ are similar”) when the AS re equal up to identifiers. To compare two programs we first c T structures; if these are equal then we compare how identifiers programs. Since two potentially ↵-equivalent programs are simi s occur at the same positions. In order to compare the identifiers’ efine equivalence classes of positions of identifiers in a program: p me equivalence class are declarations of or reference to the same ract position ¯x identifies the equivalence class corresponding to x. n a program P, we write P for the set of positions correspon s and declarations and PX for P extended with the artificial p We define the P ⇠ equivalence relation between elements of PX if have same AST ignoring identifier names
  9. 9. Language-independent 𝜶-equivalence Position equivalence e same equivalence class are declarations of or reference to the same enti abstract position ¯x identifies the equivalence class corresponding to the fr ble x. Given a program P, we write P for the set of positions corresponding ences and declarations and PX for P extended with the artificial positio ¯x). We define the P ⇠ equivalence relation between elements of PX as t xive symmetric and transitive closure of the resolution relation. nition 7 (Position equivalence). ` p : r i x 7 ! di0 x i P ⇠ i0 i0 P ⇠ i i P ⇠ i0 i P ⇠ i0 i0 P ⇠ i00 i P ⇠ i00 i P ⇠ i his equivalence relation, the class containing the abstract free variable d ion can not contain any other declaration. So the references in a particu are either all free or all bound. mma 6 (Free variable class). The equivalence class of a free variable do contain any other declaration, i.e. 8 di x s.t. i P ⇠ ¯x =) i = ¯x xi xi' Program similarity Equivalence define ↵-equivalence using scope graphs. Except for the leaves rep ifiers, two ↵-equivalent programs must have the same abstract write P ' P’ (pronounced “P and P’ are similar”) when the AS re equal up to identifiers. To compare two programs we first c T structures; if these are equal then we compare how identifiers programs. Since two potentially ↵-equivalent programs are simi s occur at the same positions. In order to compare the identifiers’ efine equivalence classes of positions of identifiers in a program: p me equivalence class are declarations of or reference to the same ract position ¯x identifies the equivalence class corresponding to x. n a program P, we write P for the set of positions correspon s and declarations and PX for P extended with the artificial p We define the P ⇠ equivalence relation between elements of PX if have same AST ignoring identifier names
  10. 10. Language-independent 𝜶-equivalence Position equivalence e same equivalence class are declarations of or reference to the same enti abstract position ¯x identifies the equivalence class corresponding to the fr ble x. Given a program P, we write P for the set of positions corresponding ences and declarations and PX for P extended with the artificial positio ¯x). We define the P ⇠ equivalence relation between elements of PX as t xive symmetric and transitive closure of the resolution relation. nition 7 (Position equivalence). ` p : r i x 7 ! di0 x i P ⇠ i0 i0 P ⇠ i i P ⇠ i0 i P ⇠ i0 i0 P ⇠ i00 i P ⇠ i00 i P ⇠ i his equivalence relation, the class containing the abstract free variable d ion can not contain any other declaration. So the references in a particu are either all free or all bound. mma 6 (Free variable class). The equivalence class of a free variable do contain any other declaration, i.e. 8 di x s.t. i P ⇠ ¯x =) i = ¯x xi xi' Program similarity Equivalence define ↵-equivalence using scope graphs. Except for the leaves rep ifiers, two ↵-equivalent programs must have the same abstract write P ' P’ (pronounced “P and P’ are similar”) when the AS re equal up to identifiers. To compare two programs we first c T structures; if these are equal then we compare how identifiers programs. Since two potentially ↵-equivalent programs are simi s occur at the same positions. In order to compare the identifiers’ efine equivalence classes of positions of identifiers in a program: p me equivalence class are declarations of or reference to the same ract position ¯x identifies the equivalence class corresponding to x. n a program P, we write P for the set of positions correspon s and declarations and PX for P extended with the artificial p We define the P ⇠ equivalence relation between elements of PX if have same AST ignoring identifier names ed proof is in appendix A.5, we first prove: r i x 7 ! d ¯x x ) =) 8 p di0 x , p ` r i x 7 ! di0 x =) i0 = ¯x ^ p = eed by induction on the equivalence relation. ce classes defined by this relation contain references to me entity. Given this relation, we can state that two p f the identifiers at identical positions refer to the same e he same equivalence class: (↵-equivalence). Two programs P1 and P2 are ↵-equi 2) when they are similar and have the same ⇠-equivalen P1 ↵ ⇡ P2 , P1 ' P2 ^ 8 e e0 , e P1 ⇠ e0 , e P2 ⇠ e0 is an equivalence relation since ' and , are equivalenc(with some further details about free variables) Alpha equivalence
  11. 11. Preserving ambiguity 25 module A1 { def x2 := 1 } module B3 { def x4 := 2 } module C5 { import A6 B7 ; def y8 := x9 } module D10 { import A11 ; def y12 := x13 } module E14 { import B15 ; def y16 := x17 } P1 module AA1 { def z2 := 1 } module BB3 { def z4 := 2 } module C5 { import AA6 BB7 ; def s8 := z9 } module D10 { import AA11 ; def u12 := z13 } module E14 { import BB15 ; def v16 := z17 } P2 module A1 { def z2 := 1 } module B3 { def x4 := 2 } module C5 { import A6 B7 ; def y8 := z9 } module D10 { import A11 ; def y12 := z13 } module E14 { import B15 ; def y16 := x17 } P3 Fig. 23. ↵-equivalence and duplicate declarationP1 Lemma 6 (Free variable class). The equivalence class of a fr ot contain any other declaration, i.e. 8 di x s.t. i P ⇠ ¯x =) i = ¯x Proof. Detailed proof is in appendix A.5, we first prove: 8 r i x, (` > : r i x 7 ! d ¯x x ) =) 8 p di0 x , p ` r i x 7 ! di0 x =) i0 = ¯x nd then proceed by induction on the equivalence relation. The equivalence classes defined by this relation contain reference ions of the same entity. Given this relation, we can state that t ↵-equivalent if the identifiers at identical positions refer to the s s belong to the same equivalence class: Definition 8 (↵-equivalence). Two programs P1 and P2 are ↵ oted P1 ↵ ⇡ P2) when they are similar and have the same ⇠-equi P1 ↵ ⇡ P2 , P1 ' P2 ^ 8 e e0 , e P1 ⇠ e0 , e P2 ⇠ e0P2 P2 Lemma 6 (Free variable class). The e not contain any other declaration, i.e. 8 d Proof. Detailed proof is in appendix A.5, 8 r i x, (` > : r i x 7 ! d ¯x x ) =) 8 p di0 x , p ` and then proceed by induction on the equ The equivalence classes defined by this rel tions of the same entity. Given this relatio ↵-equivalent if the identifiers at identical is belong to the same equivalence class: Definition 8 (↵-equivalence). Two pro noted P1 ↵ ⇡ P2) when they are similar and P1 ↵ ⇡ P2 , P1 ' P2 ^ 8P3
  12. 12. Names and Types
  13. 13. Types from Declaration def x : int = 6 def x : int = 6 def f = fun (y : int) { x + y } Static type-checking (or inference) is one obvious client for name resolution In many cases, we can perform resolution before doing type analysis
  14. 14. Types from Declaration def x : int = 6 def f = fun (y : int) { x + y } def f = fun (y : int) { x + y } def x : int = 6 def f = fun (y : int) { x + y } Static type-checking (or inference) is one obvious client for name resolution In many cases, we can perform resolution before doing type analysis
  15. 15. Types from Declaration def x : int = 6 def f = fun (y : int) { x + y } def x : int = 6 def f = fun (y : int) { x + y } Static type-checking (or inference) is one obvious client for name resolution In many cases, we can perform resolution before doing type analysis
  16. 16. Type-Dependent Name Resolution But sometimes we need types before we can do name resolution record A1 { x1 : int } record B1 { a1 : A2 ; x2 : bool} def z1 : B2 = ... def y1 = z2.x3 def y2 = z3.a2.x4
  17. 17. Type-Dependent Name Resolution But sometimes we need types before we can do name resolution record A1 { x1 : int } record B1 { a1 : A2 ; x2 : bool} def z1 : B2 = ... def y1 = z2.x3 def y2 = z3.a2.x4
  18. 18. Type-Dependent Name Resolution But sometimes we need types before we can do name resolution record A1 { x1 : int } record B1 { a1 : A2 ; x2 : bool} def z1 : B2 = ... def y1 = z2.x3 def y2 = z3.a2.x4
  19. 19. Type-Dependent Name Resolution But sometimes we need types before we can do name resolution record A1 { x1 : int } record B1 { a1 : A2 ; x2 : bool} def z1 : B2 = ... def y1 = z2.x3 def y2 = z3.a2.x4
  20. 20. Type-Dependent Name Resolution But sometimes we need types before we can do name resolution record A1 { x1 : int } record B1 { a1 : A2 ; x2 : bool} def z1 : B2 = ... def y1 = z2.x3 def y2 = z3.a2.x4
  21. 21. Type-Dependent Name Resolution But sometimes we need types before we can do name resolution record A1 { x1 : int } record B1 { a1 : A2 ; x2 : bool} def z1 : B2 = ... def y1 = z2.x3 def y2 = z3.a2.x4
  22. 22. Type-Dependent Name Resolution But sometimes we need types before we can do name resolution record A1 { x1 : int } record B1 { a1 : A2 ; x2 : bool} def z1 : B2 = ... def y1 = z2.x3 def y2 = z3.a2.x4
  23. 23. Type-Dependent Name Resolution But sometimes we need types before we can do name resolution record A1 { x1 : int } record B1 { a1 : A2 ; x2 : bool} def z1 : B2 = ... def y1 = z2.x3 def y2 = z3.a2.x4
  24. 24. Type-Dependent Name Resolution But sometimes we need types before we can do name resolution record A1 { x1 : int } record B1 { a1 : A2 ; x2 : bool} def z1 : B2 = ... def y1 = z2.x3 def y2 = z3.a2.x4
  25. 25. Type-Dependent Name Resolution But sometimes we need types before we can do name resolution record A1 { x1 : int } record B1 { a1 : A2 ; x2 : bool} def z1 : B2 = ... def y1 = z2.x3 def y2 = z3.a2.x4
  26. 26. Type-Dependent Name Resolution But sometimes we need types before we can do name resolution record A1 { x1 : int } record B1 { a1 : A2 ; x2 : bool} def z1 : B2 = ... def y1 = z2.x3 def y2 = z3.a2.x4 Our approach: interleave partial name resolution with type resolution (also using constraints) See PEPM 2016 paper / talk
  27. 27. Constraint Language
  28. 28. [PEPM 2016]
  29. 29. Architecture
  30. 30. Syntax of Constraints C := CG | CTy | CRes | C ^ C | True CG := R S | S D | S l S | D S | S l R CRes := R 7! D | D S | !N | N ⇢ ⇠ N CTy := T ⌘ T | D : T D := | xD i R := xR i S := & | n T := ⌧ | c(T, ..., T) with c 2 CT N := D(S) | R(S) | V(S) Figure 7. Syntax of constraints scope graph resolution calculus (described in Section 3.3). Finally, we apply |= with G set to CG .
  31. 31. LMR: Language with Modules and Records prog = decl⇤ decl = module id {decl⇤ } | import id | def bind | record id {fdecl⇤ } fdecl = id : ty ty = Int | Bool | id | ty ! ty exp = int | true | false | id | exp exp | if exp then exp else exp | fun ( id : ty ) {exp} | exp exp | letrec tbind in exp | new id {fbind⇤ } | with exp do exp | exp . id bind = id = exp | tbind tbind = id : ty = exp fbind = id = exp Figure 5. Syntax of LMR. [[ds]]prog := [[module xi {ds}]]decl s := [[import xi]]decl s := [[def b]]decl s := [[record xi {fs}]]decl s := [[xi = e]]bind s := [[xi : t = e]]bind s := [[xi:t]]fdecl sr,sd := [[Int]]ty s,t := [[Bool]]ty s,t := [[t1 ! t2]]ty s,t := [[xi]]ty s,t := [[fun (xi:t1){e}]]exp s,t :=
  32. 32. Title
  33. 33. Lexical Scope
  34. 34. Imports
  35. 35. Type Dependent Name Resolution
  36. 36. Constraints for Declarations p [[ds]]prog := !D(s) ^ [[ds]]decl⇤ s [[module xi {ds}]]decl s := s xD i ^ xD i s0 ^ s0 P s ^ !D(s0) ^ [[ds]]decl⇤ s0 [[import xi]]decl s := xR i s ^ s I xR i [[def b]]decl s := [[b]]bind s [[record xi {fs}]]decl s := s xD i ^ xD i s0 ^ s0 P s ^ !D(s0) ^ [[fs]]fdecl⇤ s,s0 [[xi = e]]bind s := s xD i ^ xD i : ⌧ ^ [[e]]exp s,⌧ [[xi : t = e]]bind s := s xD i ^ xD i : ⌧ ^ [[t]]ty s,⌧ ^ [[e]]exp s,⌧ [[xi:t]]fdecl sr,sd := sd xD i ^ xD i : ⌧ ^ [[t]]ty sr,⌧ [[Int]]ty s,t := t ⌘ Int [[Bool]]ty s,t := t ⌘ Bool [[t1 ! t2]]ty s,t := t ⌘ Fun[⌧1,⌧2] ^ [[t1]]ty s,⌧1 ^ [[t2]]ty s,⌧2 [[xi]]ty := t ⌘ Rec( ) ^ xR s ^ xR 7!
  37. 37. Constraints for Expressions LMR. be the algo- R’s concrete defined by a by syntactic ach function nt and possi- parameters, sibly involv- ables or new s are defined ach possible gory. For ex- rules, and is e s in which the expres- ected type t the notation ms of syntac- result of ap- and return- esulting con- pty sequence. 1 2 s,t 1 2 1 s,⌧1 2 s,⌧2 [[xi]]ty s,t := t ⌘ Rec( ) ^ xR i s ^ xR i 7! [[fun (xi:t1){e}]]exp s,t := t ⌘ Fun[⌧1,⌧2] ^ s0 P s ^ !D(s0) ^ s0 xD i ^ xD i : ⌧1 ^ [[t1]]ty s,⌧1 ^ [[e]]exp s0,⌧2 [[letrec bs in e]]exp s,t := s0 P s ^ !D(s0) ^ [[bs]]bind s0 ^ [[e]]exp s0,t [[n]]exp s,t := t ⌘ Int [[true]]exp s,t := t ⌘ Bool [[false]]exp s,t := t ⌘ Bool [[e1 e2]]exp s,t := t ⌘ t3 ^ ⌧1 ⌘ t1 ^ ⌧2 ⌘ t2 ^ [[e1]]exp s,⌧1 ^ [[e2]]exp s,⌧2 (where has type t1 ⇥ t2 ! t3) [[if e1 then e2 else e3]]exp s,t := ⌧1 ⌘ Bool ^ [[e1]]exp s,⌧1 ^ [[e2]]exp s,t ^ [[e3]]exp s,t [[xi]]exp s,t := xR i s ^ xR i 7! ^ : t [[e1 e2]]exp s,t := ⌧ ⌘ Fun[⌧1,t] ^ [[e1]]exp s,⌧ ^ [[e2]]exp s,⌧1 [[e.xi]]exp s,t := [[e]]exp s,⌧ ^ ⌧ ⌘ Rec( ) ^ & ^ s0 I & ^ [[xi]]exp s0,t [[with e1 do e2]]exp s,t := [[e1]]exp s,⌧ ^ ⌧ ⌘ Rec( ) ^ & ^ s0 P s ^ s0 I & ^ [[e2]]exp s0,t [[new xi {bs}]]exp s,t := xR i s ^ xR i 7! ^ s0 I xR i ^ [[bs]]fbind⇤ s,s0 ^ V(s0) ⇡ R(s0) ^ t ⌘ Rec( ) [[xi = e]]fbind s,s0 := xR i s0 ^ xR i 7! ^ : ⌧ ^ [[e]]exp s,⌧
  38. 38. Constraint Generation with NaBL2
  39. 39. Constraint Generation module constraint-gen imports signatures/- rules [[ Constr(t1, ..., tn) ^ (s1, …, sn) : ty ]] := C1, ..., Cn. one rule per abstract syntax constructor
  40. 40. Generic Constraints false // always fails true // always succeeds C1, C2 // conjunction of constraints [[ e ^ (s) : ty ]] // generate constraints for sub-term C | error $[something is wrong] // custom error message C | warning $[does not look right] // custom warning
  41. 41. Visibility Policy and Scope Graph Constraints new s // generate a new scope NS{x} <- s // declaration of name x in namespace NS in scope s NS{x} -> s // reference of name x in namespace NS in scope s s ---> s' // unlabeled scope edge from s to s’ s -L-> s' // scope edge from s to s' labeled L NS{x} |-> d // resolve reference x in namespace NS to declaration d name resolution namespaces NS1 NS2 labels P I well-formedness P* . I* order D < P, D < I, I < P
  42. 42. Name Set Constraints distinct D(s)/NS, // declarations for NS in s should be distinct D(s1) subseteq R(s2), // name set D(s_rec)/Field subseteq R(s_use)/Field | error $[Field [NAME] not initialized] @r Note: incomplete
  43. 43. Type Signature and Type Constraints signature types TC1() TC2(type) TC3(scope) TC4(type, scope, type) [[ e ^ (s) : ty ]] // subterm e has type ty under scope s o : ty // occurrence o has type ty o : ty ! // with priority ty1 == ty2 // ty1 and ty2 should unify ty <! ty2 // declare ty1 a subtype of ty2 ty1 <? ty2 // is ty1 a subtype of ty1?
  44. 44. Tiger in NaBL2
  45. 45. Visibility Policy and Type Signature signature name resolution namespaces Type Var Field Loop labels P I well-formedness P* . I* order D < P, D < I, I < P signature types UNIT() INT() STRING() NIL() RECORD(scope) ARRAY(type, scope) FUN(List(type), type)
  46. 46. Declaration of Built-in Types and Functions init ^ (s) : ty_init := new s, // the root scope Type{"int"} <- s, // declare primitive type int Type{"int"} : INT() !!, Type{"string"} <- s, // declare primitive type string Type{"string"} : STRING() !!, // standard library Var{"print"} <- s, Var{"print"} : FUN([STRING()], UNIT()) !!, … Var{"exit"} <- s, Var{"exit"} : FUN([INT()], UNIT()) !!.
  47. 47. module nabl-lib rules [[ None() ^ (s) ]] := true. [[ Some(e) ^ (s) ]] := [[ e ^ (s) ]]. Map[[ [] ^ (s) ]] := true. Map[[ [ x | xs ] ^ (s) ]] := [[ x ^ (s) ]], Map[[ xs ^ (s) ]]. Map2[[ [] ^ (s, s') ]] := true. Map2[[ [ x | xs ] ^ (s, s') ]] := [[ x ^ (s, s') ]], Map2[[ xs ^ (s, s') ]]. MapT2[[ [] ^ (s, s') : [] ]] := true. MapT2[[ [ x | xs ] ^ (s, s') : [ty | tys] ]] := [[ x ^ (s, s') : ty ]], MapT2[[ xs ^ (s, s') : tys ]]. MapT[[ [] ^ (s) : ty ]] := true. MapT[[ [ x | xs ] ^ (s) : ty ]] := [[ x ^ (s) : ty ]], MapT[[ xs ^ (s) : ty ]]. MapTs[[ [] ^ (s) : [] ]] := true. MapTs[[ [ x | xs ] ^ (s) : [ty | tys] ]] := [[ x ^ (s) : ty ]], MapTs[[ xs ^ (s) : tys ]]. MapTs2[[ [] ^ (s1, s2) : [] ]] := true. MapTs2[[ [ x | xs ] ^ (s1, s2) : [ty | tys] ]] := [[ x ^ (s1, s2) : ty ]], MapTs2[[ xs ^ (s1, s2) : tys ]]. Constraint Generation for Lists of Terms
  48. 48. Non-Binding Constructs
  49. 49. Literals and Operators rules // literals [[ Int(i) ^ (s) : INT() ]] := true. [[ String(str) ^ (s) : STRING() ]] := true. [[ NilExp() ^ (s) : NIL() ]] := true. rules // operators [[ Uminus(e) ^ (s) : INT() ]] := [[ e ^ (s) : INT() ]]. [[ Divide(e1, e2) ^ (s) : INT() ]] := [[ e1 ^ (s) : INT() ]], [[ e2 ^ (s): INT() ]]. [[ Times(e1, e2) ^ (s) : INT() ]] := [[ e1 ^ (s) : INT() ]], [[ e2 ^ (s): INT() ]]. // etc.
  50. 50. Assignment rules [[ Assign(e1, e2) ^ (s) : UNIT() ]] := [[ e1 ^ (s) : ty1 ]], [[ e2 ^ (s) : ty2 ]], ty2 <? ty1 | error $[type mismatch] @ e2.
  51. 51. Sequences rules [[ Seq(es) ^ (s) : ty ]] := Seq[[ es ^ (s) : ty ]]. Seq[[ [] ^ (s) : UNIT() ]] := true. Seq[[ [e] ^ (s) : ty ]] := [[ e ^ (s) : ty ]]. Seq[[ [ e | es@[_|_] ] ^ (s) : ty ]] := [[ e ^ (s) : ty' ]], Seq[[ es ^ (s) : ty ]].
  52. 52. Control-Flow rules [[ If(e1, e2, e3) ^ (s) : ty2 ]] := [[ e1 ^ (s) : INT() ]], [[ e2 ^ (s) : ty2 ]], [[ e3 ^ (s) : ty3 ]], ty2 == ty3 | error $[branches should have same type]. [[ IfThen(e1, e2) ^ (s) : UNIT() ]] := [[ e1 ^ (s) : INT() ]], [[ e2 ^ (s) : ty ]]. [[ While(e1, e2) ^ (s) : UNIT() ]] := new s', s' -P-> s, [[ e1 ^ (s) : INT() ]], [[ e2 ^ (s') : ty ]].
  53. 53. Binding Constructs
  54. 54. Lets Bind Sequentially rules // let [[ Let(blocks, exps) ^ (s) : ty ]] := new s_body, distinct D(s_body), Decs[[ blocks ^ (s, s_body) ]], Seq[[ exps ^ (s_body) : ty ]]. Decs[[ [] ^ (s_outer, s_body) ]] := s_body -P-> s_outer. Decs[[ [block] ^ (s_outer, s_body) ]] := s_body -P-> s_outer, Dec[[ block ^ (s_body, s_outer) ]]. Decs[[ [block | blocks@[_|_]] ^ (s_outer, s_body) ]] := new s_dec, s_dec -P-> s_outer, Dec[[ block ^ (s_dec, s_outer) ]], Decs[[ blocks ^ (s_dec, s_body) ]]. let var x : int := 0 + z // z not in scope var y : int := x + 1 var z : int := x + y + 1 in x + y + z end
  55. 55. Variable Declarations and References rules // variable declarations Dec[[ VarDec(x, t, e) ^ (s, s_outer) ]] := Var{x} <- s, Var{x} : ty1 !, [[ t ^ (s_outer) : ty1 ]], [[ e ^ (s_outer) : ty2 ]], ty2 <? ty1 | error $[type mismatch] @ e. Dec[[ VarDecNoInit(x, t) ^ (s, s_outer) ]] := Var{x} <- s, Var{x} : ty !, [[ t ^ (s_outer) : ty ]]. rules // variable references [[ Var(x) ^ (s) : ty ]] := Var{x} -> s, Var{x} |-> d, d : ty. let var x : int := 5 var f : int := 1 in for y := 1 to x do ( f := f * y ) end
  56. 56. Type Declarations let type foo = int function foo(x : foo) : foo = 3 var foo : foo := foo(4) in foo(56) + foo // both refer to the variable foo end rules Dec[[ TypeDecs(tdecs) ^ (s, s_outer) ]] := Map[[ tdecs ^ (s) ]]. [[ TypeDec(x, t) ^ (s) ]] := Type{x} <- s, Type{x} : ty !, [[ t ^ (s) : ty ]]. rules // types [[ Tid(x) ^ (s) : ty ]] := Type{x} -> s, Type{x} |-> d | error $[Type [x] not declared], d : ty.
  57. 57. Functions
  58. 58. Adjacent Functions are Mutually Recursive let function odd(x : int) : int = if x > 0 then even(x - 1) else false function even(x : int) : int = if x > 0 then odd(x - 1) else true in even(34) end let function odd(x : int) : int = if x > 0 then even(x - 1) else false var x : int function even(x : int) : int = if x > 0 then odd(x - 1) else true in even(34) end
  59. 59. Function Definitions rules Dec[[ FunDecs(fdecs) ^ (s, s_outer) ]] := Map2[[ fdecs ^ (s, s_outer) ]]. [[ FunDec(f, args, t, e) ^ (s, s_outer) ]] := Var{f} <- s, Var{f} : FUN(tys, ty) !, new s_fun, s_fun -P-> s, distinct D(s_fun) | error $[duplicate argument] @ NAMES, MapTs2[[ args ^ (s_fun, s_outer) : tys ]], [[ t ^ (s_outer) : ty ]], [[ e ^ (s_fun) : ty_body ]], ty == ty_body| error $[return type does not match body] @ t. [[ FArg(x, t) ^ (s_fun, s_outer) : ty ]] := Var{x} <- s_fun, Var{x} : ty !, [[ t ^ (s_outer) : ty ]]. let function fact(n : int) : int = if n < 1 then 1 else (n * fact(n - 1)) in fact(10) end
  60. 60. let function fact(n : int) : int = if n < 1 then 1 else (n * fact(n - 1)) in fact(10) end Function Calls rules [[ Call(Var(f), exps) ^ (s) : ty ]] := Var{f} -> s, Var{f} |-> d | error $[Function [f] not declared], d : FUN(tys, ty) | error $[Function expected] , MapTs[[ exps ^ (s) : tys ]].
  61. 61. Records
  62. 62. Type Dependent Name Resolution let type point = {x : int, y : int} var origin : point := point { x = 1, y = 2 } in origin.x end
  63. 63. Errors in Record Declaration and Creation let type point = {x : int, y : int} type errpoint = {x : int, x : int} var p : point var e : errpoint in p := point{ x = 3, y = 3, z = "a" } p := point{ x = 3 } end Field “y” not initialized Reference “z” not resolved Duplicate Declaration of Field “x”
  64. 64. Recursive Types let type intlist = {hd : int, tl : intlist} type tree = {key : int, children : treelist} type treelist = {hd : tree, tl : treelist} var l : intlist var t : tree var tl : treelist in l := intlist { hd = 3, tl = l }; t := tree { key = 2, children = treelist { hd = tree{ key = 3, children = 3 }, tl = treelist{ } } }; t.children.hd.children := t.children end type mismatch Field "tl" not initialized Field "hd" not initialized
  65. 65. NIL is a Subtype of RECORD let type intlist = {hd : int, tl : intlist} var l : intlist := nil in l := intlist{ hd = 1, tl = l }; l := intlist{ hd = 2, tl = l }; l := intlist{ hd = 3, tl = l } end
  66. 66. Record Types rules [[ RecordTy(fields) ^ (s) : ty ]] := new s_rec, ty == RECORD(s_rec), NIL() <! ty, distinct D(s_rec)/Field | error $[Duplicate declaration of field [NAME]] @ NAMES, Map2[[ fields ^ (s_rec, s) ]]. [[ Field(x, t) ^ (s_rec, s_outer) ]] := Field{x} <- s_rec, Field{x} : ty !, [[ t ^ (s_outer) : ty ]]. let type point = {x : int, y : int} var origin : point := point { x = 1, y = 2 } in origin.x end
  67. 67. Record Creation rules [[ r@Record(t, inits) ^ (s) : ty ]] := [[ t ^ (s) : ty ]], ty == RECORD(s_rec) | error $[record type expected], new s_use, s_use -I-> s_rec, D(s_rec)/Field subseteq R(s_use)/Field | error $[Field [NAME] not initialized] @r, Map2[[ inits ^ (s_use, s) ]]. [[ InitField(x, e) ^ (s_use, s) ]] := Field{x} -> s_use, Field{x} |-> d, d : ty1, [[ e ^ (s) : ty2 ]], ty2 <? ty1 | error $[type mismatch]. let type point = {x : int, y : int} var origin : point := point { x = 1, y = 2 } in origin.x end
  68. 68. Record Field Access rules [[ FieldVar(e, f) ^ (s) : ty ]] := [[ e ^ (s) : ty_e ]], ty_e == RECORD(s_rec), new s_use, s_use -I-> s_rec, Field{f} -> s_use, Field{f} |-> d, d : ty. let type point = {x : int, y : int} var origin : point := point { x = 1, y = 2 } in origin.x end
  69. 69. On GitHub • Arrays • For loops
  70. 70. Formal Framework (with labeled scope edges)
  71. 71. Issues with the Reachability Calculus S R1 R2 SR SRS I(R1) S’S S’S P Sx Sx R xS xS D I(_).p’ < P.p D < I(_).p’ D < P.p s.p < s.p’ p < p’ Well formed path: R.P*.I(_)*.D Disambiguating import paths Fixed visibility policy Cyclic Import Paths Multi-import interpretation
  72. 72. Resolution Calculus with Edge Labels s the resolution relation for graph G. collection JNKG is the multiset defined DG(S)), JR(S)KG = ⇡(RG(S)), and `G p : S 7 ! xD i }) where ⇡(A) is ojecting the identifiers from a set A of iven a multiset M, 1M (x) denotes the nes the resolution of a reference to a as a most specific, well-formed path n through a sequence of edges. A path ng the atomic scope transitions in the of steps: l, S2) is a direct transition from the pe S2. This step records the label of s used. , yR , S) requires the resolution of ref- n with associated scope S to allow a rrent scope and scope S. nds with a declaration step D(xD ) that path is leading to. ion in the graph from reference xR i `G p : xR i 7 ! xD i according to . These rules all implicitly apply to omit to avoid clutter. The calculus n in terms of edges in the scope graph, isible declarations. Here I is the set of vice needed to avoid “out of thin air” Well-formed paths WF(p) , labels(p) 2 E Visibility ordering on paths label(s1) < label(s2) s1 · p1 < s2 · p2 p1 < p2 s · p1 < s · p2 Edges in scope graph S1 l S2 I ` E(l, S2) : S1 ! S2 (E) S1 l yR i yR i /2 I I ` p : yR i 7 ! yD j yD j S2 I ` N(l, yR i , S2) : S1 ! S2 (N) Transitive closure I, S ` [] : A ⇣ A (I) B /2 S I ` s : A ! B I, {B} [ S ` p : B ⇣ C I, S ` s · p : A ⇣ C (T) Reachable declarations I, {S} ` p : S ⇣ S0 WF(p) S0 xD i I ` p · D(xD i ) : S ⇢ xD i (R) Visible declarations I ` p : S ⇢ xD i 8j, p0 (I ` p0 : S ⇢ xD j ) ¬(p0 < p)) I ` p : S 7 ! xD i (V ) Reference resolution xR i S {xR i } [ I ` p : S 7 ! xD j I ` p : xR i 7 ! xD j (X) G, |= !N JN1KG ✓ JN2KG G, |= N1 ⇢ ⇠ N2 (C-SUBNAME) t1 = t2 G, |= t1 ⌘ t2 (C-EQ) Figure 8. Interpretation of resolution and typing constraints Resolution paths s := D(xD i ) | E(l, S) | N(l, xR i , S) p := [] | s | p · p (inductively generated) [] · p = p · [] = p (p1 · p2) · p3 = p1 · (p2 · p3) Well-formed paths WF(p) , labels(p) 2 E Visibility ordering on paths label(s1) < label(s2) s1 · p1 < s2 · p2 p1 < p2 s · p1 < s · p2 Edges in scope graph S1 l S2 I ` E(l, S2) : S1 ! S2 (E) S1 l yR i yR i /2 I I ` p : yR i 7 ! yD j yD j S2 I ` N(l, yR i , S2) : S1 ! S2 (N) Transitive closure I, S ` [] : A ⇣ A (I) B /2 S I ` s : A ! B I, {B} [ S ` p : B ⇣ C I, S ` s · p : A ⇣ C (T) Reachable declarations I, {S} ` p : S ⇣ S0 WF(p) S0 xD i I ` p · D(xD i ) : S ⇢ xD i (R) Visible declarations I ` p : S ⇢ xD i 8j, p0 (I ` p0 : S ⇢ xD j ) ¬(p0 < p)) I ` p : S 7 ! xD i (V ) Reference resolution xR S {xR } [ I ` p : S 7 ! xD C := CG | CTy | CRes | C ^ C | True CG := R S | S D | S l S | D S | S l R CRes := R 7! D | D S | !N | N ⇢ ⇠ N CTy := T ⌘ T | D : T D := | xD i R := xR i S := & | n T := ⌧ | c(T, ..., T) with c 2 CT N := D(S) | R(S) | V(S) Figure 7. Syntax of constraints scope graph resolution calculus (described in Section 3.3). Finally, we apply |= with G set to CG . To lift this approach to constraints with variables, we simply apply a multi-sorted substitution , mapping type variables ⌧ to ground types, declaration variables to ground declarations and scope variables & to ground scopes. Thus, our overall definition of satisfaction for a program p is: (CG ), |= (CRes ) ^ (CTy ) (⇧)
  73. 73. Visibility Policies Lexical scope L := {P} E := P⇤ D < P Non-transitive imports L := {P, I} E := P⇤ · I? D < P, D < I, I < P Transitive imports L := {P, TI} E := P⇤ · TI⇤ D < P, D < TI, TI < P Transitive Includes L := {P, Inc} E := P⇤ · Inc⇤ D < P, Inc < P Transitive includes and imports, and non-transitive imports L := {P, Inc, TI, I} E := P⇤ · (Inc | TI)⇤ · I? D < P, D < TI, TI < P, Inc < P, D < I, I < P, Figure 10. Example reachability and visibility policies by instan- tiation of path well-formedness and visibility. 3.4 Parameterization R Envre [ EnvL re [ EnvD re [ Envl re [ I
  74. 74. Seen Imports hout import tracking. ficity order on paths is e visibility policies can e and specificity order. module A1 { module A2 { def a3 = ... } } import A4 def b5 = a6 Fig. 11. Self im- port tion the dule far, few ! t A4 ody ? 13 AD 2 :SA2 2 D(SA1 ) AR 4 2 I(Sroot) AR 4 2 R(Sroot) AD 1 :SA1 2 D(Sroot) AR 4 7 ! AD 1 :SA1 Sroot ! SA1 (⇤) Sroot ⇢ AD 2 :SA2 AR 4 2 R(Sroot) Sroot 7 ! AD 2 :SA2 AR 4 7 ! AD 2 :SA2 Fig. 10. Derivation for AR 4 7 ! AD 2 :SA2 in a calculus without import tracking. The complete definition of well-formed paths and specificity order on paths is given in Fig. 2. In Section 2.5 we discuss how alternative visibility policies can be defined by just changing the well-formedness predicate and specificity order. module A1 { module A2 { def a3 = ... } } import A4 def b5 = a6 Seen imports. Consider the example in Fig. 11. Is declaration a3 reachable in the scope of reference a6? This reduces to the question whether the import of A4 can resolve to module A2. Surprisingly, it can, in the calculus as discussed so far, as shown by the derivation in Fig. 10 (which takes a few shortcuts). The conclusion of the derivation is that AR 4 7 ! AD :S . This conclusion is obtained by using the import at A as shown by the derivation in Fig. 10 (which takes shortcuts). The conclusion of the derivation is that AR 4 AD 2 :SA2 . This conclusion is obtained by using the import to conclude at step (*) that Sroot ! SA1 , i.e. that the of module A1 is reachable! In other words, the import is used in its own resolution. Intuitively, this is nonsen To rule out this kind of behavior we extend the cal to keep track of the set of seen imports I using judgem of the form I ` p : xR i 7 ! xD j . We need to extend all ru pass the set I, but only the rules for resolution and im are truly affected: xR i 2 R(S) {xR i } [ I ` p : S 7 ! xD j I ` p : xR i 7 ! xD j yR i 2 I(S1) I I ` p : yR i 7 ! yD j :S2 I ` I(yR i , yD j :S2) : S1 ! S2 With this final ingredient, we reach the full calcul
  75. 75. 4 def b5 = a6 Fig. 11. Self im- port module A1 { module B2 { def x3 = 1 } } module B4 { module A5 { def y6 = 2 } } module C7 { import A8 import B9 def z10 = x11 + y12 } Fig. 12. Anoma- lous resolution . The conclusion of the derivation is that AR 4 7 ! This conclusion is obtained by using the import at A4 de at step (*) that Sroot ! SA1 , i.e. that the body A1 is reachable! In other words, the import of A4 its own resolution. Intuitively, this is nonsensical. e out this kind of behavior we extend the calculus ack of the set of seen imports I using judgements m I ` p : xR i 7 ! xD j . We need to extend all rules to et I, but only the rules for resolution and import affected: xR i 2 R(S) {xR i } [ I ` p : S 7 ! xD j I ` p : xR i 7 ! xD j (X) yR i 2 I(S1) I I ` p : yR i 7 ! yD j :S2 I ` I(yR i , yD j :S2) : S1 ! S2 (I) this final ingredient, we reach the full calculus in is not hard to see that the resolution relation is ded. The only recursive invocation (via the I rule) ictly larger set I of seen imports (via the X rule); since the set R(G) Anomaly 4 def b5 = a6 Fig. 11. Self im- port module A1 { module B2 { def x3 = 1 } } module B4 { module A5 { def y6 = 2 } } module C7 { import A8 import B9 def z10 = x11 + y12 } Fig. 12. Anoma- lous resolution R 4 7 ! at A4 body of A4 sical. culus ments les to mport (X) (I) us in on is rule) rule); since the set R(G) 2 1 3 1 3 2
  76. 76. Resolution Algorithm < P P, nstan- R[I](xR ) := let (r, s) = EnvE [{xR } [ I, ;](Sc(xR ))} in ( U if r = P and {xD |xD 2 s} = ; {xD |xD 2 s} Envre [I, S](S) := ( (T, ;) if S 2 S or re = ? Env L[{D} re [I, S](S) EnvL re [I, S](S) := [ l2Max(L) ⇣ Env {l0 2L|l0 <l} re [I, S](S) Envl re [I, S](S) ⌘ EnvD re [I, S](S) := ( (T, ;) if [] /2 re (T, D(S)) Envl re [I, S](S) := 8 >< >: (P, ;) if S I l contains a variable or ISl[I](S) = U S S02 ⇣ ISl[I](S)[S I l ⌘ Env(l 1re)[I, {S} [ S](S0) ISl [I](S) := ( U if 9yR 2 (S B l I) s.t. R[I](yR ) = U {S0 | yR 2 (S B l I) ^ yD 2 R[I](yR ) ^ yD S0}
  77. 77. Semantics of Constraints
  78. 78. Constraint Resolution
  79. 79. Summary
  80. 80. Q1: Compiler Front-End • Syntax Definition - from formal grammars to syntax definition - derivation of parsers, abstract syntax trees, syntax-aware editors, … • Syntax Techniques (1) - automata for lexical analysis • Term Rewriting - simple syntactic transformations using term rewrite rules - desugaring, outline view • Imperative and OO Languages - behavior and types • Static Semantics - name resolution using scope graphs - type analysis using type constraints
  81. 81. Q2: Compiler Back-End • Dynamic Semantics - what is the meaning of programs in a language • Target Machine - instruction set is API for programming (virtual) machine • Code Generation - from (typed) abstract syntax trees to (virtual) machine code instructions • Garbage Collection - techniques for safe automated memory management • Register Allocation - mapping unlimited set of variables to limited set of registers • Dataflow Analysis - basis for optimizations such as constant propagation, dead code elimination, … • Syntax Techniques (2) - LL and LR parsing
  82. 82. First … http://2016.splashcon.org
  83. 83. Name Resolution copyrights 69
  84. 84. Name Resolution 70
  85. 85. copyrights Name Resolution Pictures Slide 1: How glorious a greeting the sun gives the mountains! by Justin Kern, some rights reserved 71

×