Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Declare Your Language: Name Resolution

766 views

Published on

Lecture 5 of compiler construction course about name resolution using scope graphs

Published in: Software
  • Be the first to comment

  • Be the first to like this

Declare Your Language: Name Resolution

  1. 1. Eelco Visser IN4303 Compiler Construction TU Delft September 2017 Declare Your Language Chapter 5: Name Resolution
  2. 2. 2 Source Code Editor Parse Abstract Syntax Tree Type Check Check that names are used correctly and that expressions are well-typed Errors
  3. 3. Reading Material 3
  4. 4. 4 Type checkers are algorithms that check names and types in programs. This paper introduces the NaBL Name Binding Language, which supports the declarative definition of the name binding and scope rules of programming languages through the definition of scopes, declarations, references, and imports associated. http://dx.doi.org/10.1007/978-3-642-36089-3_18 SLE 2012 Background
  5. 5. 5 This paper introduces scope graphs as a language-independent representation for the binding information in programs. http://dx.doi.org/10.1007/978-3-662-46669-8_9 ESOP 2015
  6. 6. 6 Separating type checking into constraint generation and constraint solving provides more declarative definition of type checkers. This paper introduces a constraint language integrating name resolution into constraint resolution through scope graph constraints. This is the basis for the design of the NaBL2 static semantics specification language. https://doi.org/10.1145/2847538.2847543 PEPM 2016
  7. 7. 7 Documentation for NaBL2 at the metaborg.org website. http://www.metaborg.org/en/latest/source/langdev/meta/lang/nabl2/index.html
  8. 8. Name Resolution Examples in Tiger 8
  9. 9. Name Resolution 9 let var x : int := 0 + z var y : int := x + 1 var z : int := x + y + 1 in x + y + z end
  10. 10. Name Resolution: Scope 10 let var x : int := 0 + z // z not in scope var y : int := x + 1 var z : int := x + y + 1 in x + y + z end
  11. 11. Name Resolution 11 let function odd(x : int) : int = if x > 0 then even(x - 1) else false function even(x : int) : int = if x > 0 then odd(x - 1) else true in even(34) end let function odd(x : int) : int = if x > 0 then even(x - 1) else false var x : int function even(x : int) : int = if x > 0 then odd(x - 1) else true in even(34) end
  12. 12. Mutually Recursive Functions should be Adjacent 12 let function odd(x : int) : int = if x > 0 then even(x - 1) else false function even(x : int) : int = if x > 0 then odd(x - 1) else true in even(34) end let function odd(x : int) : int = if x > 0 then even(x - 1) else false var x : int function even(x : int) : int = if x > 0 then odd(x - 1) else true in even(34) end
  13. 13. Name Resolution 13 let type foo = int function foo(x : foo) : foo = 3 var foo : foo := foo(4) in foo(56) + foo end
  14. 14. Name Resolution: Name Spaces 14 let type foo = int function foo(x : foo) : foo = 3 var foo : foo := foo(4) in foo(56) + foo // both refer to the variable foo end Functions and variables are in the same namespace
  15. 15. Name Resolution 15 let type point = {x : int, y : int} var origin : point := point { x = 1, y = 2 } in origin.x end
  16. 16. Type Dependent Name Resolution 16 let type point = {x : int, y : int} var origin : point := point { x = 1, y = 2 } in origin.x end Resolving origin.x requires the type of origin
  17. 17. Name Resolution 17 let type point = {x : int, y : int} type errpoint = {x : int, x : int} var p : point var e : errpoint in p := point{ x = 3, y = 3, z = "a" } p := point{ x = 3 } end
  18. 18. Name Resolution: Name Set Correspondence 18 let type point = {x : int, y : int} type errpoint = {x : int, x : int} var p : point var e : errpoint in p := point{ x = 3, y = 3, z = "a" } p := point{ x = 3 } end Field “y” not initialized Reference “z” not resolved Duplicate Declaration of Field “x”
  19. 19. Name Resolution 19 let type intlist = {hd : int, tl : intlist} type tree = {key : int, children : treelist} type treelist = {hd : tree, tl : treelist} var l : intlist var t : tree var tl : treelist in l := intlist { hd = 3, tl = l }; t := tree { key = 2, children = treelist { hd = tree{ key = 3, children = 3 }, tl = treelist{ } } }; t.children.hd.children := t.children end
  20. 20. Name Resolution: Recursive Types 20 let type intlist = {hd : int, tl : intlist} type tree = {key : int, children : treelist} type treelist = {hd : tree, tl : treelist} var l : intlist var t : tree var tl : treelist in l := intlist { hd = 3, tl = l }; t := tree { key = 2, children = treelist { hd = tree{ key = 3, children = 3 }, tl = treelist{ } } }; t.children.hd.children := t.children end type mismatch Field "tl" not initialized Field "hd" not initialized
  21. 21. Testing Static Analysis 21
  22. 22. Testing Name Resolution 22 test inner name [[ let type t = u type u = int var x: u := 0 in x := 42 ; let type [[u]] = t var y: [[u]] := 0 in y := 42 end end ]] resolve #2 to #1 test outer name [[ let type t = u type [[u]] = int var x: [[u]] := 0 in x := 42 ; let type u = t var y: u := 0 in y := 42 end end ]] resolve #2 to #1
  23. 23. Testing Type Checking 23 test variable reference [[ let type t = u type u = int var x: u := 0 in x := 42 ; let type u = t var y: u := 0 in y := [[x]] end end ]] run get-type to INT() test integer constant [[ let type t = u type u = int var x: u := 0 in x := 42 ; let type u = t var y: u := 0 in y := [[42]] end end ]] run get-type to INT()
  24. 24. Testing Errors 24 test undefined variable [[ let type t = u type u = int var x: u := 0 in x := 42 ; let type u = t var y: u := 0 in y := [[z]] end end ]] 1 error test type error [[ let type t = u type u = string var x: u := 0 in x := 42 ; let type u = t var y: u := 0 in y := [[x]] end end ]] 1 error
  25. 25. Test Corner Cases 25 context-free superset language
  26. 26. Implementing a Type Checker 26
  27. 27. Compute Type of Expression 27 rules type-check : Mod(e) -> t where <type-check> e => t type-check : String(_) -> STRING() type-check : Int(_) -> INT() type-check : Plus(e1, e2) -> INT() where <type-check> e1 => INT(); <type-check> e2 => INT() type-check : Times(e1, e2) -> INT() where <type-check> e1 => INT(); <type-check> e2 => INT() 1 + 2 * 3 Mod( Plus(Int("1"), Times(Int("2"), Int("3"))) ) INT()
  28. 28. Compute Type of Variable? 28 rules type-check : Let([VarDec(x, e)], e_body) -> t where <type-check> e => t_e; <type-check> e_body => t type-check : Var(x) -> t // ??? let var x := 1 in x + 1 end Mod( Let( [VarDec("x", Int("1"))] , [Plus(Var("x"), Int("1"))] ) ) ?
  29. 29. Type Checking Variable Bindings 29 rules type-check(|env) : Let([VarDec(x, e)], e_body) -> t where <type-check(|env)> e => t_e; <type-check(|[(x, t_e) | env])> e_body => t type-check(|env) : Var(x) -> t where <fetch(?(x, t))> env let var x := 1 in x + 1 end INT() Store association between variable and type in type environment Mod( Let( [VarDec("x", Int("1"))] , [Plus(Var("x"), Int("1"))] ) )
  30. 30. Pass Environment to Sub-Expressions 30 rules type-check : Mod(e) -> t where <type-check(|[])> e => t type-check(|env) : String(_) -> STRING() type-check(|env) : Int(_) -> INT() type-check(|env) : Plus(e1, e2) -> INT() where <type-check(|env)> e1 => INT(); <type-check(|env)> e2 => INT() type-check(|env) : Times(e1, e2) -> INT() where <type-check(|env)> e1 => INT(); <type-check(|env)> e2 => INT() let var x := 1 in x + 1 end INT() Mod( Let( [VarDec("x", Int("1"))] , [Plus(Var("x"), Int("1"))] ) )
  31. 31. Name Resolution 31
  32. 32. Language = Set of Trees Fun AddArgDecl VarRefId Int “1” VarDeclId “x” TXINT “x” Tree is a convenient interface for transforming programs
  33. 33. Language = Set of Graphs Fun AddArgDecl VarRefId Int “1” VarDeclId TXINT Edges from references to declarations
  34. 34. Fun AddArgDecl VarRefId Int “1” VarDeclId “x” TXINT “x” Language = Set of Graphs Names are placeholders for edges in linear / tree representation
  35. 35. Variables 35 let function fact(n : int) : int = if n < 1 then 1 else n * fact(n - 1) in fact(10) end
  36. 36. Function Calls 36 let function fact(n : int) : int = if n < 1 then 1 else n * fact(n - 1) in fact(10) end
  37. 37. 37 function prettyprint(tree: tree) : string = let var output := "" function write(s: string) = output := concat(output, s) function show(n: int, t: tree) = let function indent(s: string) = (write("n"); for i := 1 to n do write(" "); output := concat(output, s)) in if t = nil then indent(".") else (indent(t.key); show(n+1, t.left); show(n+1, t.right)) end in show(0, tree); output end Nested Scopes (Shadowing)
  38. 38. 38 let type point = { x : int, y : int } var origin := point { x = 1, y = 2 } in origin.x := 10; origin := nil end Type References
  39. 39. 39 let type point = { x : int, y : int } var origin := point { x = 1, y = 2 } in origin.x := 10; origin := nil end Record Fields
  40. 40. 40 let type point = { x : int, y : int } var origin := point { x = 1, y = 2 } in origin.x := 10; origin := nil end Type Dependent Name Resolution
  41. 41. Used in many different language artifacts • compiler • interpreter • semantics • IDE • refactoring Binding rules encoded in many different and ad-hoc ways • symbol tables • environments • substitutions No standard approach to formalization • there is no BNF for name binding No reuse of binding rules between artifacts • how do we know substitution respects binding rules? Name Resolution is Pervasive 41
  42. 42. Approaches to representing binding within an (extended) AST • Numeric indexing [deBruijn72] • Higher-order abstract syntax [PfenningElliott88] • Nominal logic approaches [GabbayPitts02] Support binding-sensitive AST manipulation But • Do not say how to resolve identifiers in the first place! • Can not represent ill-bound programs • Tend to be biased towards lambda-like bindings Representing Bound Programs 42
  43. 43. Name Binding Languages 43
  44. 44. 44 name binding ? How to define the rules of a language
  45. 45. 45 name binding ? What is the BNF of
  46. 46. DSLs for specifying binding structure of a (target) language • Ott [Sewell+10] • Romeo [StansiferWand14] • Unbound [Weirich+11] • Cαml [Pottier06] • NaBL [Konat+12] Generate code to do resolution and record results What is the semantics of such a name binding language? Name Binding Languages 46
  47. 47. NaBL Name Binding Language binding rules // variables Param(t, x) : defines Variable x of type t Let(bs, e) : scopes Variable Bind(t, x, e) : defines Variable x of type t Var(x) : refers to Variable x Declarative Name Binding and Scope Rules Gabriël D. P. Konat, Lennart C. L. Kats, Guido Wachsmuth, Eelco Visser SLE 2012 Declarative specification Abstracts from implementation Incremental name resolution But: How to explain it to Coq? What is the semantics of NaBL?
  48. 48. NaBL Name Binding Language Declarative Name Binding and Scope Rules Gabriël D. P. Konat, Lennart C. L. Kats, Guido Wachsmuth, Eelco Visser SLE 2012 Especially: What is the semantics of imports? binding rules // classes Class(c, _, _, _) : defines Class c of type ClassT(c) scopes Field, Method, Variable Extends(c) : imports Field, Method from Class c ClassT(c) : refers to Class c New(c) : refers to Class c
  49. 49. What is semantics of binding language? - the meaning of a binding specification for language L should be given by - a function from L programs to their “resolution structures” So we need - a (uniform, language-independent) method for describing such resolution structures ... - … that can be used to compute the resolution of each program identifier - (or to verify that a claimed resolution is valid) That supports - Handle broad range of language binding features ... - … using minimal number of constructs - Make resolution structure language-independent - Handle named collections of names (e.g. modules, classes, etc.) within the theory - Allow description of programs with resolution errors Approach 49
  50. 50. Representation - To conduct and represent the results of name resolution Declarative Rules - To define name binding rules of a language Language-Independent Tooling - Name resolution - Code completion - Refactoring - … Separation of Concerns in Name Binding 50
  51. 51. Representation - ? Declarative Rules - To define name binding rules of a language Language-Independent Tooling - Name resolution - Code completion - Refactoring - … Separation of Concerns in Name Binding 51 Scope Graphs
  52. 52. 52 let function fact(n : int) : int = if n < 1 then 1 else n * fact(n - 1) in fact(10) end fact S1 fact S2n n fact nn Scope GraphProgram Name Resolution
  53. 53. A Calculus for Name Resolution 53 S R1 R2 SR SRS I(R1 ) S’S S’S P Sx Sx R xS xS D Path in scope graph connects reference to declaration Scopes, References, Declarations, Parents, Imports Neron, Tolmach, Visser, Wachsmuth A Theory of Name Resolution ESOP 2015
  54. 54. Simple Scopes 54 Reference Declaration Scope Reference Step Declaration Step def y1 = x2 + 1 def x1 = 5 S0 def y1 = x2 + 1 def x1 = 5 Sx Sx R xS xS D S0 y1 x1S0 y1 x1S0x2 R y1 x1S0x2 R D y1 x1S0x2 S x x
  55. 55. Lexical Scoping 55 S1 S0 Parent def x1 = z2 5 def z1 = fun y1 { x2 + y2 } S’S S’S P z1 x1S0z2 S1 y1y2 x2 z1 x1S0z2 S1 y1y2 x2 z1 x1S0z2 S1 y1y2 x2 z1 x1S0z2 R S1 y1y2 x2 z1 x1S0z2 R D S1 y1y2 x2 z1 x1S0z2 R S1 y1y2 x2 z1 x1S0z2 R P S1 y1y2 x2 z1 x1S0z2 R P D S’S Parent Step
  56. 56. Imports 56 Associated scope Import S0 SB SA module A1 { def z1 = 5 } module B1 { import A2 def x1 = 1 + z2 } A1 SA z1 B1 SB z2 S0 A2 x1 S R1 R2 SR A1 SA z1 B1 SB z2 S0 A2 x1 A1 SA z1 B1 SB z2 S0 A2 x1 A1 SA z1 B1 SB z2 S0 A2 x1 R A1 SA z1 B1 SB z2 S0 A2 x1 R R A1 SA z1 B1 SB z2 S0 A2 x1 R R P A1 SA z1 B1 SB z2 S0 A2 x1 R R P D A1 SA z1 B1 SB z2 S0 A2 x1 I(A2 )R R P D A1 SA z1 B1 SB z2 S0 A2 x1 I(A2 )R R D P D S R1 R2 SR SRS I(R1 ) Import Step
  57. 57. Qualified Names 57 module N1 { def s1 = 5 } module M1 { def x1 = 1 + N2.s2 } S0 N1 SN s2 S0 N2 R D R I(N2) D X1 s1 N1 SN s2 S0 N2 R D X1 s1 N1 SN s2 S0 N2 X1 s1
  58. 58. A Calculus for Name Resolution 58 S R1 R2 SR SRS I(R1 ) S’S S’S P Sx Sx R xS xS D Reachability of declarations from references through scope graph edges How about ambiguities? References with multiple paths
  59. 59. A Calculus for Name Resolution 59 S R1 R2 SR SRS I(R1 ) S’S S’S P Sx Sx R xS xS D I(_).p’ < P.p D < I(_).p’ D < P.p s.p < s.p’ p < p’ Visibility Well formed path: R.P*.I(_)*.D Reachability
  60. 60. Ambiguous Resolutions 60 match t with | A x | B x => … z1 x2 x1S0x3 z1 x2 x1S0x3 R z1 x2 x1S0x3 R D z1 x2 x1S0x3 R D z1 x2 x1S0x3 R D D S0def x1 = 5 def x2 = 3 def z1 = x3 + 1
  61. 61. Shadowing 61 S0 S1 S2 D < P.p s.p < s.p’ p < p’ S1 S2 x1 y1 y2 x2 z1 x3S0z2 def x3 = z2 5 7 def z1 = fun x1 { fun y1 { x2 + y2 } } S1 S2 x1 y1 y2 x2 z1 x3S0z2 R P P D S1 S2 x1 y1 y2 x2 z1 x3S0z2 D P R R P P D R.P.D < R.P.P.D
  62. 62. Imports shadow Parents 62 I(_).p’ < P.p R.I(A2).D < R.P.D A1 SA z1 B1 SB z2 S0 A2 x1 z3 A1 SA z1 B1 SB z2 S0 A2 x1 I(A2)R D z3 A1 SA z1 B1 SB z2 S0 A2 x1 I(A2)R D P D z3 R S0def z3 = 2 module A1 { def z1 = 5 } module B1 { import A2 def x1 = 1 + z2 } SA SB
  63. 63. Imports vs. Includes 63 S0def z3 = 2 module A1 { def z1 = 5 } import A2 def x1 = 1 + z2 SA A1 SA z1 z2 S0 A2 x1 I(A2) R D z3 R D D < I(_).p’ R.D < R.I(A2).D A1 SA z1 z2 S0 A2 x1 R z3 D A1 SA z1 z2 S0 A2 x1z3 S0def z3 = 2 module A1 { def z1 = 5 } include A2 def x1 = 1 + z2 SA X
  64. 64. Import Parents 64 def s1 = 5 module N1 { } def x1 = 1 + N2.s2 S0 SN def s1 = 5 module N1 { } def x1 = 1 + N2.s2 S0 SN Well formed path: R.P*.I(_)*.D N1 SN s2 S0 N2 X1 s1 R I(N2 ) D P N1 SN s2 S0 N2 X1 s1
  65. 65. Transitive vs. Non-Transitive 65 With transitive imports, a well formed path is R.P*.I(_)*.D With non-transitive imports, a well formed path is R.P*.I(_)?.D A1 SA z1 B1 SB S0 A2 C1 SCz2 x1 B2 A1 SA z1 B1 SB S0 A2 I(A2 ) D C1 SCz2 I(B2 ) R x1 B2 ?? module A1 { def z1 = 5 } module B1 { import A2 } module C1 { import B2 def x1 = 1 + z2 } SA SB SC
  66. 66. A Calculus for Name Resolution 66 S R1 R2 SR SRS I(R1 ) S’S S’S P Sx Sx R xS xS D I(_).p’ < P.p D < I(_).p’ D < P.p s.p < s.p’ p < p’ Visibility Well formed path: R.P*.I(_)*.D Reachability
  67. 67. Visibility Policies 67 Lexical scope L := {P} E := P⇤ D < P Non-transitive imports L := {P, I} E := P⇤ · I? D < P, D < I, I < P Transitive imports L := {P, TI} E := P⇤ · TI⇤ D < P, D < TI, TI < P Transitive Includes L := {P, Inc} E := P⇤ · Inc⇤ D < P, Inc < P Transitive includes and imports, and non-transitive imports L := {P, Inc, TI, I} E := P⇤ · (Inc | TI)⇤ · I? D < P, D < TI, TI < P, Inc < P, D < I, I < P, Figure 10. Example reachability and visibility policies by instan- Envre EnvL re EnvD re Envl re
  68. 68. More Examples 68 From TUD-SERG-2015-001
  69. 69. Let Bindings 69 1 2 b2a1 c3 a4 b6 c12a10 b11 c5 b9 a7 c8 def a1 = 0 def b2 = 1 def c3 = 2 letpar a4 = c5 b6 = a7 c8 = b9 in a10+b11+c12 1 b2a1 c3 a4 b6 c12a10 b11 c5 b9 a7 c8 2 def a1 = 0 def b2 = 1 def c3 = 2 letrec a4 = c5 b6 = a7 c8 = b9 in a10+b11+c12 1 b2a1 c3 a4 b6 c12a10 b11 c5 b9 a7 c8 2 4 3 def a1 = 0 def b2 = 1 def c3 = 2 let a4 = c5 b6 = a7 c8 = b9 in a10+b11+c12
  70. 70. Definition before Use / Use before Definition 70 class C1 { int a2 = b3; int b4; void m5 (int a6) { int c7 = a8 + b9; int b10 = b11 + c12; } int c12; } 0 C1 1 2 a2 b4 c12 b3 m5 a6 3 4 c7 b10 b9 a8 b11 c12
  71. 71. Inheritance 71 class C1 { int f2 = 42; } class D3 extends C4 { int g5 = f6; } class E7 extends D8 { int f9 = g10; int h11 = f12; } 32 1 C4 C1 4D3 E7 D8 f2 g5 f6 f9 g10 f12 h11
  72. 72. Java Packages 72 package p3; class D4 {} package p1; class C2 {} 1 p3p1 2 p1 p3 3 D4C2
  73. 73. Java Import 73 package p1; imports r2.*; imports q3.E4; public class C5 {} class D6 {} 4 p1 D6C5 3 2r2 1 p1 E4 q3
  74. 74. C# Namespaces and Partial Classes 74 namespace N1 { using M2; partial class C3 { int f4; } } namespace N5 { partial class C6 { int m7() { return f8; } } } 1 3 6 4 7 8 C3 C6 N1 N5 f4 m7 N1 N5 C3 C6 f8 2 M2 5
  75. 75. Formal Framework (with labeled scope edges) 75
  76. 76. Issues with the Reachability Calculus 76 S R1 R2 SR SRS I(R1 ) S’S S’S P Sx Sx R xS xS D I(_).p’ < P.p D < I(_).p’ D < P.p s.p < s.p’ p < p’ Well formed path: R.P*.I(_)*.D Disambiguating import paths Fixed visibility policy Cyclic Import Paths Multi-import interpretation
  77. 77. Resolution Calculus with Edge Labels 77 ! xD j is the resolution relation for graph G. f a name collection JNKG is the multiset defined G = ⇡(DG(S)), JR(S)KG = ⇡(RG(S)), and | 9p, `G p : S 7 ! xD i }) where ⇡(A) is ed by projecting the identifiers from a set A of ations. Given a multiset M, 1M (x) denotes the M. alculus ulus defines the resolution of a reference to a pe graph as a most specific, well-formed path eclaration through a sequence of edges. A path epresenting the atomic scope transitions in the ee kinds of steps: step E(l, S2) is a direct transition from the the scope S2. This step records the label of ion that is used. step N(l, yR , S) requires the resolution of ref- eclaration with associated scope S to allow a en the current scope and scope S. always ends with a declaration step D(xD ) that ation the path is leading to. d resolution in the graph from reference xR i uch that `G p : xR i 7 ! xD i according to n Fig. 9. These rules all implicitly apply to which we omit to avoid clutter. The calculus n relation in terms of edges in the scope graph, ns, and visible declarations. Here I is the set of nical device needed to avoid “out of thin air” Well-formed paths WF(p) , labels(p) 2 E Visibility ordering on paths label(s1) < label(s2) s1 · p1 < s2 · p2 p1 < p2 s · p1 < s · p2 Edges in scope graph S1 l S2 I ` E(l, S2) : S1 ! S2 (E) S1 l yR i yR i /2 I I ` p : yR i 7 ! yD j yD j S2 I ` N(l, yR i , S2) : S1 ! S2 (N) Transitive closure I, S ` [] : A ⇣ A (I) B /2 S I ` s : A ! B I, {B} [ S ` p : B ⇣ C I, S ` s · p : A ⇣ C (T) Reachable declarations I, {S} ` p : S ⇣ S0 WF(p) S0 xD i I ` p · D(xD i ) : S ⇢ xD i (R) Visible declarations I ` p : S ⇢ xD i 8j, p0 (I ` p0 : S ⇢ xD j ) ¬(p0 < p)) I ` p : S 7 ! xD i (V ) Reference resolution xR i S {xR i } [ I ` p : S 7 ! xD j I ` p : xR i 7 ! xD j (X) JN1KG ✓ JN2KG G, |= N1 ⇢ ⇠ N2 (C-SUBNAME) t1 = t2 G, |= t1 ⌘ t2 (C-EQ) Figure 8. Interpretation of resolution and typing constraints Resolution paths s := D(xD i ) | E(l, S) | N(l, xR i , S) p := [] | s | p · p (inductively generated) [] · p = p · [] = p (p1 · p2) · p3 = p1 · (p2 · p3) Well-formed paths WF(p) , labels(p) 2 E Visibility ordering on paths label(s1) < label(s2) s1 · p1 < s2 · p2 p1 < p2 s · p1 < s · p2 Edges in scope graph S1 l S2 I ` E(l, S2) : S1 ! S2 (E) S1 l yR i yR i /2 I I ` p : yR i 7 ! yD j yD j S2 I ` N(l, yR i , S2) : S1 ! S2 (N) Transitive closure I, S ` [] : A ⇣ A (I) B /2 S I ` s : A ! B I, {B} [ S ` p : B ⇣ C I, S ` s · p : A ⇣ C (T) Reachable declarations I, {S} ` p : S ⇣ S0 WF(p) S0 xD i I ` p · D(xD i ) : S ⇢ xD i (R) C := CG | CTy | CRes | C ^ C | True CG := R S | S D | S l S | D S | S l R CRes := R 7! D | D S | !N | N ⇢ ⇠ N CTy := T ⌘ T | D : T D := | xD i R := xR i S := & | n T := ⌧ | c(T, ..., T) with c 2 CT N := D(S) | R(S) | V(S) Figure 7. Syntax of constraints scope graph resolution calculus (described in Section 3.3). Finally, we apply |= with G set to CG . To lift this approach to constraints with variables, we simply apply a multi-sorted substitution , mapping type variables ⌧ to ground types, declaration variables to ground declarations and scope variables & to ground scopes. Thus, our overall definition of satisfaction for a program p is: G Res Ty F
  78. 78. Visibility Policies 78 Lexical scope L := {P} E := P⇤ D < P Non-transitive imports L := {P, I} E := P⇤ · I? D < P, D < I, I < P Transitive imports L := {P, TI} E := P⇤ · TI⇤ D < P, D < TI, TI < P Transitive Includes L := {P, Inc} E := P⇤ · Inc⇤ D < P, Inc < P Transitive includes and imports, and non-transitive imports L := {P, Inc, TI, I} E := P⇤ · (Inc | TI)⇤ · I? D < P, D < TI, TI < P, Inc < P, D < I, I < P, Figure 10. Example reachability and visibility policies by instan- tiation of path well-formedness and visibility. 3.4 Parameterization R[I](xR ) Envre [I, S](S) EnvL re [I, S](S) EnvD re [I, S](S) Envl re [I, S](S) ISl [I](S)
  79. 79. Seen Imports 79 :SA2 s without import tracking. specificity order on paths is native visibility policies can edicate and specificity order. module A1 { module A2 { def a3 = ... } } import A4 def b5 = a6 Fig. 11. Self im- port claration ces to the o module ed so far, kes a few t AR 4 7 ! port at A4 the body ? 13 AD 2 :SA2 2 D(SA1 ) AR 4 2 I(Sroot) AR 4 2 R(Sroot) AD 1 :SA1 2 D(Sroot) AR 4 7 ! AD 1 :SA1 Sroot ! SA1 (⇤) Sroot ⇢ AD 2 :SA2 AR 4 2 R(Sroot) Sroot 7 ! AD 2 :SA2 AR 4 7 ! AD 2 :SA2 Fig. 10. Derivation for AR 4 7 ! AD 2 :SA2 in a calculus without import tracking. The complete definition of well-formed paths and specificity order on paths is given in Fig. 2. In Section 2.5 we discuss how alternative visibility policies can be defined by just changing the well-formedness predicate and specificity order. module A1 { module A2 { def a3 = ... } } import A4 def b5 = a6 Seen imports. Consider the example in Fig. 11. Is declaration a3 reachable in the scope of reference a6? This reduces to the question whether the import of A4 can resolve to module A2. Surprisingly, it can, in the calculus as discussed so far, as shown by the derivation in Fig. 10 (which takes a few shortcuts). The conclusion of the derivation is that AR 4 7 ! AD :S . This conclusion is obtained by using the import at A as shown by the derivation in Fig. 10 (which takes a few shortcuts). The conclusion of the derivation is that AR 4 7 ! AD 2 :SA2 . This conclusion is obtained by using the import at A4 to conclude at step (*) that Sroot ! SA1 , i.e. that the body of module A1 is reachable! In other words, the import of A4 is used in its own resolution. Intuitively, this is nonsensical. To rule out this kind of behavior we extend the calculus to keep track of the set of seen imports I using judgements of the form I ` p : xR i 7 ! xD j . We need to extend all rules to pass the set I, but only the rules for resolution and import are truly affected: xR i 2 R(S) {xR i } [ I ` p : S 7 ! xD j I ` p : xR i 7 ! xD j (X) yR i 2 I(S1) I I ` p : yR i 7 ! yD j :S2 I ` I(yR i , yD j :S2) : S1 ! S2 (I) With this final ingredient, we reach the full calculus in
  80. 80. Anomaly 80 4 def b5 = a6 Fig. 11. Self im- port module A1 { module B2 { def x3 = 1 } } module B4 { module A5 { def y6 = 2 } } module C7 { import A8 import B9 def z10 = x11 + y12 } Fig. 12. Anoma- lous resolution tcuts). The conclusion of the derivation is that AR 4 7 ! SA2 . This conclusion is obtained by using the import at A4 onclude at step (*) that Sroot ! SA1 , i.e. that the body module A1 is reachable! In other words, the import of A4 sed in its own resolution. Intuitively, this is nonsensical. To rule out this kind of behavior we extend the calculus eep track of the set of seen imports I using judgements he form I ` p : xR i 7 ! xD j . We need to extend all rules to the set I, but only the rules for resolution and import truly affected: xR i 2 R(S) {xR i } [ I ` p : S 7 ! xD j I ` p : xR i 7 ! xD j (X) yR i 2 I(S1) I I ` p : yR i 7 ! yD j :S2 I ` I(yR i , yD j :S2) : S1 ! S2 (I) With this final ingredient, we reach the full calculus in 3. It is not hard to see that the resolution relation is -founded. The only recursive invocation (via the I rule) a strictly larger set I of seen imports (via the X rule); since the set R(G) 4 def b5 = a6 Fig. 11. Self im- port module A1 { module B2 { def x3 = 1 } } module B4 { module A5 { def y6 = 2 } } module C7 { import A8 import B9 def z10 = x11 + y12 } Fig. 12. Anoma- lous resolution hat AR 4 7 ! mport at A4 at the body mport of A4 onsensical. he calculus judgements all rules to and import (X) (I) calculus in relation is the I rule) he X rule); since the set R(G) 2 1 3 1 3 2
  81. 81. Resolution Algorithm 81 I < P TI < P c < P mports · I? I < P, es by instan- R[I](xR ) := let (r, s) = EnvE [{xR } [ I, ;](Sc(xR ))} in ( U if r = P and {xD |xD 2 s} = ; {xD |xD 2 s} Envre [I, S](S) := ( (T, ;) if S 2 S or re = ? Env L[{D} re [I, S](S) EnvL re [I, S](S) := [ l2Max(L) ⇣ Env {l0 2L|l0 <l} re [I, S](S) Envl re [I, S](S) ⌘ EnvD re [I, S](S) := ( (T, ;) if [] /2 re (T, D(S)) Envl re [I, S](S) := 8 >< >: (P, ;) if S I l contains a variable or ISl[I](S) = U S S02 ⇣ ISl[I](S)[S I l ⌘ Env(l 1re)[I, {S} [ S](S0) ISl [I](S) := ( U if 9yR 2 (S B l I) s.t. R[I](yR ) = U {S0 | yR 2 (S B l I) ^ yD 2 R[I](yR ) ^ yD S0}
  82. 82. Alpha Equivalence 82
  83. 83. Language-independent 𝜶-equivalence 83 Position equivalence in the same equivalence class are declarations of or reference to the same entity. The abstract position ¯x identifies the equivalence class corresponding to the free variable x. Given a program P, we write P for the set of positions corresponding to references and declarations and PX for P extended with the artificial positions (e.g. ¯x). We define the P ⇠ equivalence relation between elements of PX as the reflexive symmetric and transitive closure of the resolution relation. Definition 7 (Position equivalence). ` p : r i x 7 ! di0 x i P ⇠ i0 i0 P ⇠ i i P ⇠ i0 i P ⇠ i0 i0 P ⇠ i00 i P ⇠ i00 i P ⇠ i In this equivalence relation, the class containing the abstract free variable dec- laration can not contain any other declaration. So the references in a particular class are either all free or all bound. Lemma 6 (Free variable class). The equivalence class of a free variable does not contain any other declaration, i.e. 8 di x s.t. i P ⇠ ¯x =) i = ¯x xi xi' Program similarity ↵-Equivalence now define ↵-equivalence using scope graphs. Except for the leaves represen identifiers, two ↵-equivalent programs must have the same abstract synt . We write P ' P’ (pronounced “P and P’ are similar”) when the ASTs o P’ are equal up to identifiers. To compare two programs we first compa r AST structures; if these are equal then we compare how identifiers beha hese programs. Since two potentially ↵-equivalent programs are similar, t tifiers occur at the same positions. In order to compare the identifiers’ beha we define equivalence classes of positions of identifiers in a program: positio he same equivalence class are declarations of or reference to the same enti abstract position ¯x identifies the equivalence class corresponding to the fr able x. Given a program P, we write P for the set of positions corresponding rences and declarations and PX for P extended with the artificial positio ¯x). We define the P ⇠ equivalence relation between elements of PX as t if have same AST ignoring identifier names tailed proof is in appendix A.5, we first prove: ` > : r i x 7 ! d ¯x x ) =) 8 p di0 x , p ` r i x 7 ! di0 x =) i0 = ¯x ^ p = > proceed by induction on the equivalence relation. alence classes defined by this relation contain references to or d he same entity. Given this relation, we can state that two progra ent if the identifiers at identical positions refer to the same entit to the same equivalence class: n 8 (↵-equivalence). Two programs P1 and P2 are ↵-equivale ↵ ⇡ P2) when they are similar and have the same ⇠-equivalence c P1 ↵ ⇡ P2 , P1 ' P2 ^ 8 e e0 , e P1 ⇠ e0 , e P2 ⇠ e0 . ↵ ⇡ is an equivalence relation since ' and , are equivalence rel(with some further details about free variables) Alpha equivalence
  84. 84. Preserving ambiguity 84 25 module A1 { def x2 := 1 } module B3 { def x4 := 2 } module C5 { import A6 B7 ; def y8 := x9 } module D10 { import A11 ; def y12 := x13 } module E14 { import B15 ; def y16 := x17 } P1 module AA1 { def z2 := 1 } module BB3 { def z4 := 2 } module C5 { import AA6 BB7 ; def s8 := z9 } module D10 { import AA11 ; def u12 := z13 } module E14 { import BB15 ; def v16 := z17 } P2 module A1 { def z2 := 1 } module B3 { def x4 := 2 } module C5 { import A6 B7 ; def y8 := z9 } module D10 { import A11 ; def y12 := z13 } module E14 { import B15 ; def y16 := x17 } P3 Fig. 23. ↵-equivalence and duplicate declarationP1 Lemma 6 (Free variable class). The equivalence class of a free va not contain any other declaration, i.e. 8 di x s.t. i P ⇠ ¯x =) i = ¯x Proof. Detailed proof is in appendix A.5, we first prove: 8 r i x, (` > : r i x 7 ! d ¯x x ) =) 8 p di0 x , p ` r i x 7 ! di0 x =) i0 = ¯x ^ p = and then proceed by induction on the equivalence relation. The equivalence classes defined by this relation contain references to tions of the same entity. Given this relation, we can state that two p ↵-equivalent if the identifiers at identical positions refer to the same e is belong to the same equivalence class: Definition 8 (↵-equivalence). Two programs P1 and P2 are ↵-equi noted P1 ↵ ⇡ P2) when they are similar and have the same ⇠-equivalen P1 ↵ ⇡ P2 , P1 ' P2 ^ 8 e e0 , e P1 ⇠ e0 , e P2 ⇠ e0P2 P2 Lemma 6 (Free variable class). The equiva not contain any other declaration, i.e. 8 di x s.t Proof. Detailed proof is in appendix A.5, we fi 8 r i x, (` > : r i x 7 ! d ¯x x ) =) 8 p di0 x , p ` r i x 7 and then proceed by induction on the equivale The equivalence classes defined by this relation tions of the same entity. Given this relation, w ↵-equivalent if the identifiers at identical posit is belong to the same equivalence class: Definition 8 (↵-equivalence). Two program noted P1 ↵ ⇡ P2) when they are similar and hav P1 ↵ ⇡ P2 , P1 ' P2 ^ 8 e e0 ,P3
  85. 85. Closing 85
  86. 86. Representation: Scope Graphs - Standardized representation for lexical scoping structure of programs - Path in scope graph relates reference to declaration - Basis for syntactic and semantic operations Formalism: Name Binding Constraints - References + Declarations + Scopes + Reachability + Visibility - Language-specific rules map AST to constraints Language-Independent Interpretation - Resolution calculus: correctness of path with respect to scope graph - Name resolution algorithm - Alpha equivalence - Mapping from graph to tree (to text) - Refactorings - And many other applications Summary: A Theory of Name Resolution 86
  87. 87. We have modeled a large set of example binding patterns - definition before use - different let binding flavors - recursive modules - imports and includes - qualified names - class inheritance - partial classes Next goal: fully model some real languages - In progress: Go, Rust, TypeScript - Java, ML Validation 87
  88. 88. Scope graph semantics for binding specification languages - starting with NaBL - or rather: a redesign of NaBL based on scope graphs Resolution-sensitive program transformations - renaming, refactoring, substitution, … Dynamic analogs to static scope graphs - how does scope graph relate to memory at run-time? Supporting mechanized language meta-theory - relating static and dynamic bindings Future Work 88
  89. 89. Representation - To conduct and represent the results of name resolution Declarative Rules - To define name binding rules of a language Language-Independent Tooling - Name resolution - Code completion - Refactoring - … Separation of Concerns in Name Binding 89
  90. 90. Representation - ? Declarative Rules - To define name binding rules of a language Language-Independent Tooling - Name resolution - Code completion - Refactoring - … Separation of Concerns in Name Binding 90 Scope Graphs
  91. 91. Representation - ? Declarative Rules - ? Language-Independent Tooling - Name resolution - Code completion - Refactoring - … Separation of Concerns in Name Binding 91 Scope & Type Constraint Rules [PEPM16] Scope Graphs
  92. 92. Next: Constraint-Based Type Checkers 92 Program AST Parse Constraints Generate Constraints Solution Resolve Constraints Language Specific Language Independent

×