An Annotation Framework for Statically-Typed Syntax Trees

  1. An Annotation Framework for Statically-Typed Syntax Trees
     Loren Abrams, Ray Toal
     Loyola Marymount University, Los Angeles, CA, USA
     IASTED SEA 2009, 2009-11-03
  2. Outline
     - Overview
     - Previous Work
     - Motivating Example
     - Some Theoretical Contributions
     - An Annotation Framework
     - Implementation of a Parser Generator
     - Conclusions
  3. Goals
     - To contribute to parser generation theory and practice with a grammar annotation framework that is
       - Terse (more convention, less markup)
       - Fully declarative
       - Grammar-independent
       - Supportive of statically typed host languages
     - To demonstrate feasibility with a prototype parser generator
  4. Contributions
     - A grammar-independent annotation framework (not "just another generator")
     - Distillation of embedded abstract syntax tree specification (useful for understanding)
     - Definition of statically-typed AST specification
     - Prototype parser generator
       - very lightweight
       - self-contained, easy to integrate
  5. Previous Research
     - Grammars: CFG, (E)BNF, XBNF, SDF, PEG
     - Parser Generators: Lex/Yacc, Flex/Bison, JavaCC, ANTLR, SableCC, Rats!
     - Tree Builders: JTB, JJTree
     - Parser Generation Design Axes
       - Concrete vs. Abstract Tree Production
       - Static vs. Dynamic Typing
       - Inline vs. External Specification
  6. Motivating Example (1 of 2)
     - Given a grammar such as this one...

         ID      => @[A-Za-z][A-Za-z0-9_]*
         NUMLIT  => @\d+(\.\d+([Ee][+-]?\d+)?)?
         STRLIT  => @"[^"\p{Cc}]*"
         SKIP    => @\s+
         Program => Block
         Block   => (Dec ";")* (Stmt ";")+
         Dec     => "var" ID ("=" Exp)?
                 |  "fun" ID "(" IdList? ")" "=" Exp
         Stmt    => ID "=" Exp
                 |  "read" IdList
                 |  "write" ExpList
                 |  "while" Exp "do" Block "end"
         IdList  => ID ("," ID)*
         ExpList => Exp ("," Exp)*
         Exp     => Term (("+" | "-") Term)*
         Term    => Factor (("*" | "/") Factor)*
         Factor  => NUMLIT | STRLIT | ID | Call | "(" Exp ")"
         Call    => ID "(" ExpList? ")"
  7. Motivating Example (2 of 2)
     - ...we want to annotate the grammar to produce statically-typed ASTs

         var y;
         fun half(x) = x / 2;
         while x - (5 * x) do
           write half(10.4), x+2;
           read x;
         end;
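The token rules of the motivating grammar can be tried out directly with `java.util.regex`. The following is a minimal sketch, not the generator's actual scanner. It assumes the regex backslashes lost in the slide transcript (`\d`, `\.`, `\p{Cc}`, `\s`) and allows single-letter identifiers so that `x` and `y` scan as `ID`. The class name `ScannerSketch`, the `KEYWORD` and `SYMBOL` rules, and the `NAME:text` token format are hypothetical; the slides leave keywords and punctuation implicit as quoted literals in the grammar.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ScannerSketch {
    /** One token rule: a name and a precompiled Java regex. */
    private static final class Rule {
        final String name;
        final Pattern pattern;
        Rule(String name, String regex) {
            this.name = name;
            this.pattern = Pattern.compile(regex);
        }
    }

    // Rules are tried in order; the first one matching at the current position wins.
    // KEYWORD and SYMBOL are assumptions derived from the quoted literals in the grammar.
    private static final Rule[] RULES = {
        new Rule("SKIP", "\\s+"),
        new Rule("NUMLIT", "\\d+(\\.\\d+([Ee][+-]?\\d+)?)?"),
        new Rule("STRLIT", "\"[^\"\\p{Cc}]*\""),
        new Rule("KEYWORD", "(var|fun|read|write|while|do|end)\\b"),
        new Rule("ID", "[A-Za-z][A-Za-z0-9_]*"),
        new Rule("SYMBOL", "[;=(),+\\-*/]"),
    };

    /** Splits the source into "NAME:text" tokens, dropping whitespace. */
    public static List<String> scan(String source) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < source.length()) {
            boolean matched = false;
            for (Rule rule : RULES) {
                Matcher m = rule.pattern.matcher(source).region(pos, source.length());
                if (m.lookingAt()) {
                    if (!rule.name.equals("SKIP")) {
                        tokens.add(rule.name + ":" + m.group());
                    }
                    pos = m.end();
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                throw new IllegalArgumentException("Unexpected character at offset " + pos);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String program = "var y; fun half(x) = x / 2; "
            + "while x - (5 * x) do write half(10.4), x+2; read x; end;";
        for (String token : scan(program)) {
            System.out.println(token);
        }
    }
}
```

Trying `KEYWORD` before `ID`, with a trailing word boundary, keeps a name such as `variable` from being split into `var` plus `iable`.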
  8. Describing the ASTs
     - Each node in the generated AST is an object of some AST node class
     - The fields of each object are name-value pairs, with values we define recursively as being
       - The value null
       - Strings (which come from token literals)
       - References to nodes
       - Lists of values
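The recursive definition of values can be made concrete with a small renderer that handles exactly the four cases: null, string, node reference, and list. This is only an illustration of the value model. The `Node` class here is a hypothetical, untyped stand-in (a class name plus named fields) and is not how the statically-typed generated classes would look.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AstValues {
    /** Hypothetical generic node: a node class name plus ordered name-value fields. */
    public static class Node {
        private final String type;
        private final Map<String, Object> fields = new LinkedHashMap<>();

        public Node(String type) {
            this.type = type;
        }

        /** Adds a field and returns this node so calls can be chained. */
        public Node with(String name, Object value) {
            fields.put(name, value);
            return this;
        }
    }

    /** Renders a value by following the recursive definition case by case. */
    public static String render(Object value) {
        if (value == null) {
            return "null";
        }
        if (value instanceof String) {
            return "\"" + value + "\"";
        }
        if (value instanceof List) {
            StringBuilder sb = new StringBuilder("[");
            String separator = "";
            for (Object item : (List<?>) value) {
                sb.append(separator).append(render(item));
                separator = ", ";
            }
            return sb.append("]").toString();
        }
        if (value instanceof Node) {
            Node node = (Node) value;
            StringBuilder sb = new StringBuilder("(" + node.type);
            for (Map.Entry<String, Object> field : node.fields.entrySet()) {
                sb.append(" ").append(field.getKey()).append("=").append(render(field.getValue()));
            }
            return sb.append(")").toString();
        }
        throw new IllegalArgumentException("Not an AST value: " + value);
    }

    public static void main(String[] args) {
        // "var y;" has no initializer, so its exp field is null.
        Node var = new Node("Var").with("id", "y").with("exp", null);
        Node block = new Node("Block").with("decs", Arrays.asList(var));
        System.out.println(render(block));
    }
}
```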
  9. Annotating the Grammar
     - Our contribution is to exhibit a high-level approach to annotation
     - The approach must be fully declarative and support statically typed ASTs
     - The key idea is to ensure each type of value (on the previous slide) is producible
     - Our current work is only for embedded annotations, but it should extend to AST descriptions external to the grammar
  10. Annotation Highlights
      - Each rule execution produces a value (null, string, node-ref, list)
      - We tag syntax elements and AST node expressions: tags become field names
      - Expressions not tagged get the name of the construct (convention!)
      - Different tag binding symbols for scalar and list values
      - Simple notation for node class hierarchies
  11. Annotation Example (1 of 3)
      - Value of rule is last value produced
      - params is a scalar variable; decs and stmts are list variables; block, id, and exp are implicit variables
      - Var and Fun are subclasses of Dec
      - Note how some values can be null

          Program => Block {Program block}
          Block   => (decs*:Dec ";")* (stmts*:Stmt ";")+ {Block decs stmts}
          Dec     => "var" ID ("=" Exp)? {Var:Dec id exp}
                  |  "fun" ID "(" params:IdList? ")" "=" Exp {Fun:Dec id params exp}
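As an illustration of what the node expressions `{Var:Dec id exp}` and `{Fun:Dec id params exp}` describe, here is a hand-written sketch of statically-typed node classes. It is a guess at the shape of such classes, not the generator's real output: the outer class `DecNodes`, the constructor signatures, the getter names, and the placeholder `Ref` expression are all assumptions.

```java
import java.util.Arrays;
import java.util.List;

public class DecNodes {
    /** Superclass named on the right of the colon in {Ref:Exp ...}. */
    public abstract static class Exp {}

    /** Placeholder expression so the example has something to store in exp. */
    public static class Ref extends Exp {
        private final String id;
        public Ref(String id) { this.id = id; }
        public String getId() { return id; }
    }

    /** Superclass named on the right of the colon in {Var:Dec ...} and {Fun:Dec ...}. */
    public abstract static class Dec {}

    /** {Var:Dec id exp}: exp is null when the optional ("=" Exp)? part is absent. */
    public static class Var extends Dec {
        private final String id;
        private final Exp exp;
        public Var(String id, Exp exp) { this.id = id; this.exp = exp; }
        public String getId() { return id; }
        public Exp getExp() { return exp; }
    }

    /** {Fun:Dec id params exp}: params is null when IdList? is absent. */
    public static class Fun extends Dec {
        private final String id;
        private final List<String> params;
        private final Exp exp;
        public Fun(String id, List<String> params, Exp exp) {
            this.id = id;
            this.params = params;
            this.exp = exp;
        }
        public String getId() { return id; }
        public List<String> getParams() { return params; }
        public Exp getExp() { return exp; }
    }

    public static void main(String[] args) {
        Dec y = new Var("y", null);                              // var y;
        Dec half = new Fun("half", Arrays.asList("x"), new Ref("x"));
        System.out.println(((Var) y).getExp() == null);
        System.out.println(((Fun) half).getParams());
    }
}
```

Each field has a fixed Java type (a string for a token, a node class for a rule value, a list for a list variable), which is the practical meaning of "statically typed" here.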
  12. Annotation Example (2 of 3)
      - Value of IdList is just the value of id
      - left is a scalar variable; note how it gets "reassigned"

          Stmt    => ID "=" Exp {Assign:Stmt id exp}
                  |  "read" IdList {Read:Stmt idList}
                  |  "write" ExpList {Write:Stmt expList}
                  |  "while" Exp "do" Block "end" {While:Stmt exp block}
          IdList  => id*:ID ("," id*:ID)*
          ExpList => exp*:Exp ("," exp*:Exp)*
          Exp     => left:Term (op:("+" | "-") right:Term left:{Bin:Exp op left right})*
          Term    => left:Factor (op:("*" | "/") right:Factor left:{Bin:Exp op left right})*
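The "reassignment" of left corresponds to the usual loop for building left-associative binary trees in a recursive-descent parser. The sketch below is a hand-written analogue of the Exp and Term rules, not generated code. The class `LeftAssocSketch`, the `Leaf` node standing in for Factor, and the pre-split token list are assumptions made to keep the example short.

```java
import java.util.Arrays;
import java.util.List;

public class LeftAssocSketch {
    public abstract static class Exp {}

    /** Stand-in for whatever Factor would produce; holds the token text. */
    public static class Leaf extends Exp {
        final String text;
        Leaf(String text) { this.text = text; }
        @Override public String toString() { return text; }
    }

    /** {Bin:Exp op left right} */
    public static class Bin extends Exp {
        final String op;
        final Exp left;
        final Exp right;
        Bin(String op, Exp left, Exp right) { this.op = op; this.left = left; this.right = right; }
        @Override public String toString() { return "(" + left + " " + op + " " + right + ")"; }
    }

    private final List<String> tokens;
    private int pos = 0;

    public LeftAssocSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    private boolean nextIs(String a, String b) {
        return pos < tokens.size() && (tokens.get(pos).equals(a) || tokens.get(pos).equals(b));
    }

    // Exp => left:Term (op:("+" | "-") right:Term left:{Bin:Exp op left right})*
    public Exp parseExp() {
        Exp left = parseTerm();
        while (nextIs("+", "-")) {
            String op = tokens.get(pos++);
            Exp right = parseTerm();
            left = new Bin(op, left, right);   // the "reassignment" of left
        }
        return left;                           // value of the rule: last value produced
    }

    // Term => left:Factor (op:("*" | "/") right:Factor left:{Bin:Exp op left right})*
    public Exp parseTerm() {
        Exp left = parseFactor();
        while (nextIs("*", "/")) {
            String op = tokens.get(pos++);
            Exp right = parseFactor();
            left = new Bin(op, left, right);
        }
        return left;
    }

    private Exp parseFactor() {
        if (pos >= tokens.size()) {
            throw new IllegalStateException("Unexpected end of input");
        }
        return new Leaf(tokens.get(pos++));
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("a", "-", "b", "-", "c");
        System.out.println(new LeftAssocSketch(tokens).parseExp());
    }
}
```

Because each iteration wraps the previous left in a new Bin, `a - b - c` groups as `((a - b) - c)`, and no recursion on the left operand is needed.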
  13. Annotation Example (3 of 3)
      - Used value since numlit and strlit would not be nice field names
      - Type of Factor rule is Exp (most general superclass)
      - ^exp required to avoid ")" as the value

          Factor => value:NUMLIT {NumLit:Exp value}
                 |  value:STRLIT {StrLit:Exp value}
                 |  ID {Ref:Exp id}
                 |  Call
                 |  "(" Exp ")" ^exp
          Call   => ID "(" args:ExpList? ")" {Call:Exp id args}
  14. Parser Generator Implementation
      - A parser generator reads a description (like the one on the last three slides) and outputs
        - A scanner
        - A parser, producing an AST (only)
        - A set of AST node classes, each with setters, getters, and (possibly) visit methods
        - A visitor framework for using the generated AST (without touching the tree classes, of course)
      - Interesting implementation: token set, types, etc. are computed (inference algorithm)
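To show how a visitor lets client code use the tree without modifying the node classes, here is a minimal hand-written sketch over three of the Exp node classes (NumLit, Ref, Bin). The interface `Visitor<R>`, the `accept` and `visitX` method names, and the `Printer` client are assumptions for illustration; the slides do not show the API the prototype generates.

```java
public class VisitorSketch {
    /** One method per node class; R is the result type chosen by the client. */
    public interface Visitor<R> {
        R visitNumLit(NumLit node);
        R visitRef(Ref node);
        R visitBin(Bin node);
    }

    public abstract static class Exp {
        public abstract <R> R accept(Visitor<R> visitor);
    }

    /** {NumLit:Exp value}: value is the token text. */
    public static class NumLit extends Exp {
        private final String value;
        public NumLit(String value) { this.value = value; }
        public String getValue() { return value; }
        @Override public <R> R accept(Visitor<R> visitor) { return visitor.visitNumLit(this); }
    }

    /** {Ref:Exp id} */
    public static class Ref extends Exp {
        private final String id;
        public Ref(String id) { this.id = id; }
        public String getId() { return id; }
        @Override public <R> R accept(Visitor<R> visitor) { return visitor.visitRef(this); }
    }

    /** {Bin:Exp op left right} */
    public static class Bin extends Exp {
        private final String op;
        private final Exp left;
        private final Exp right;
        public Bin(String op, Exp left, Exp right) { this.op = op; this.left = left; this.right = right; }
        public String getOp() { return op; }
        public Exp getLeft() { return left; }
        public Exp getRight() { return right; }
        @Override public <R> R accept(Visitor<R> visitor) { return visitor.visitBin(this); }
    }

    /** Client code: prints an expression using only getters and accept. */
    public static class Printer implements Visitor<String> {
        @Override public String visitNumLit(NumLit node) { return node.getValue(); }
        @Override public String visitRef(Ref node) { return node.getId(); }
        @Override public String visitBin(Bin node) {
            return "(" + node.getLeft().accept(this) + " " + node.getOp() + " "
                + node.getRight().accept(this) + ")";
        }
    }

    public static void main(String[] args) {
        Exp half = new Bin("/", new Ref("x"), new NumLit("2"));   // x / 2
        System.out.println(half.accept(new Printer()));
    }
}
```

A second client, for example an evaluator or a type checker, would be another `Visitor` implementation and would leave the node classes unchanged.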
  15. Prototype Implementation (1 of 2)
      - Initial implementation is a proof of concept
      - Description elements fixed, not yet pluggable
        - : for scalar binding
        - *: for list binding
        - { } for node expressions
        - : for subclassing
      - Produces incomplete parsers, though scanner, tree classes, and navigation are fully implemented
  16. Prototype Implementation (2 of 2)
      - Java only
      - Microsyntax specification uses Java regexes (nice)
      - Packaged as altgen-m-n.jar (m and n are version numbers), under 50KB
      - Further info at http://xlg.cs.lmu.edu/altgen
      - Planned open source distribution at Google Code
  17. Summary
      - Presentation of a terse, declarative, grammar-independent annotation framework for the generation of statically typed abstract syntax trees
      - Presentation of a prototype parser generator using the framework
      - Java implementation of the prototype is only 50KB
  18. Questions?
