
An Annotation Framework for Statically-Typed Syntax Trees



Published in: Technology

  1. An Annotation Framework for Statically-Typed Syntax Trees
     Loren Abrams, Ray Toal
     Loyola Marymount University, Los Angeles, CA, USA
     IASTED SEA 2009, 2009-11-03
  2. Outline
       • Overview
       • Previous Work
       • Motivating Example
       • Some Theoretical Contributions
       • An Annotation Framework
       • Implementation of a Parser Generator
       • Conclusions
  3. Goals
       • To contribute to parser generation theory and practice with a grammar annotation framework that is
           • Terse (more convention, less markup)
           • Fully declarative
           • Grammar-independent
           • Supportive of statically typed host languages
       • To demonstrate feasibility with a prototype parser generator
  4. Contributions
       • A grammar-independent annotation framework (not “just another generator”)
       • Distillation of embedded abstract syntax tree specification (useful for understanding)
       • Definition of statically-typed AST specification
       • Prototype parser generator
           • Very lightweight
           • Self-contained, easy to integrate
  5. Previous Research
       • Grammars: CFG, (E)BNF, XBNF, SDF, PEG
       • Parser Generators: Lex/Yacc, Flex/Bison, JavaCC, ANTLR, SableCC, Rats!
       • Tree Builders: JTB, JJTree
       • Parser Generation Design Axes
           • Concrete vs. Abstract Tree Production
           • Static vs. Dynamic Typing
           • Inline vs. External Specification
  6. Motivating Example (1 of 2)
       • Given a grammar such as this one...

         ID      => @[A-Za-z][A-Za-z0-9_]*
         NUMLIT  => @\d+(\.\d+([Ee][+-]?\d+)?)?
         STRLIT  => @"[^"\p{Cc}]*"
         SKIP    => @\s+

         Program => Block
         Block   => (Dec ";")* (Stmt ";")+
         Dec     => "var" ID ("=" Exp)?
                 |  "fun" ID "(" IdList? ")" "=" Exp
         Stmt    => ID "=" Exp
                 |  "read" IdList
                 |  "write" ExpList
                 |  "while" Exp "do" Block "end"
         IdList  => ID ("," ID)*
         ExpList => Exp ("," Exp)*
         Exp     => Term (("+" | "-") Term)*
         Term    => Factor (("*" | "/") Factor)*
         Factor  => NUMLIT | STRLIT | ID | Call | "(" Exp ")"
         Call    => ID "(" ExpList? ")"
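The token rules are ordinary Java regexes, so they can be tried out directly with `java.util.regex`. The following is a minimal sketch, not the scanner the prototype generates: the class `TokenSketch`, the method `scan`, and the catch-all `SYMBOL` rule for quoted literals are hypothetical names introduced here. The patterns use Java's usual escapes (`\d`, `\s`, `\p{Cc}`), `ID` is written with `*` so that one-letter names such as `x` scan, and keywords such as `var` are simply reported as `ID` because this sketch does not separate them.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenSketch {
    // Token patterns from the grammar, tried in insertion order.
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("SKIP", Pattern.compile("\\s+"));
        RULES.put("NUMLIT", Pattern.compile("\\d+(\\.\\d+([Ee][+-]?\\d+)?)?"));
        RULES.put("STRLIT", Pattern.compile("\"[^\"\\p{Cc}]*\""));
        RULES.put("ID", Pattern.compile("[A-Za-z][A-Za-z0-9_]*"));
        // Hypothetical rule standing in for the quoted literals ("=", ";", ...).
        RULES.put("SYMBOL", Pattern.compile("[=;,()+\\-*/]"));
    }

    // Returns tokens as "RULE:text" strings, dropping SKIP matches.
    static List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            boolean matched = false;
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                if (m.lookingAt()) {
                    if (!rule.getKey().equals("SKIP")) {
                        tokens.add(rule.getKey() + ":" + m.group());
                    }
                    pos = m.end();
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                throw new IllegalArgumentException("Unexpected character at " + pos);
            }
        }
        return tokens;
    }
}
```

For example, `TokenSketch.scan("var y = 10.4;")` yields `ID:var`, `ID:y`, `SYMBOL:=`, `NUMLIT:10.4`, `SYMBOL:;`.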
  7. Motivating Example (2 of 2)
       • ...we want to annotate the grammar to produce statically-typed ASTs

         var y;
         fun half(x) = x / 2;
         while x - (5 * x) do
           write half(10.4), x+2;
           read x;
         end;
  8. Describing the ASTs
       • Each node in the generated AST is an object of some AST node class
       • The fields of each object are name-value pairs, with values we define recursively as being
           • The value null
           • Strings (which come from token literals)
           • References to nodes
           • Lists of values
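The four kinds of field value map naturally onto Java types. This is a rough sketch under assumptions: the base class `Node`, the example node classes, and the helper `kindOf` are hypothetical and are not taken from the prototype.

```java
import java.util.List;

class ValueKinds {
    // Hypothetical common base class for AST nodes.
    static abstract class Node {}

    static class Ref extends Node {
        final String id;            // a string taken from a token
        Ref(String id) { this.id = id; }
    }

    static class Call extends Node {
        final String id;            // string value
        final List<Node> args;      // list value, or null when no arguments are given
        Call(String id, List<Node> args) { this.id = id; this.args = args; }
    }

    // Classifies a field value into one of the four kinds named on the slide.
    static String kindOf(Object value) {
        if (value == null) return "null";
        if (value instanceof String) return "string";
        if (value instanceof Node) return "node";
        if (value instanceof List) return "list";
        throw new IllegalArgumentException("not an AST value: " + value);
    }
}
```

Because lists hold values, a list may itself contain strings, nodes, or further lists, which is what makes the definition recursive.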
  9. Annotating the Grammar
       • Our contribution is to exhibit a high-level approach to annotation
       • The approach must be fully declarative and support statically typed ASTs
       • The key idea is to ensure each type of value (on the previous slide) is producible
       • Our current work is only for embedded annotations, but it should extend to AST descriptions external to the grammar
  10. Annotation Highlights
       • Each rule execution produces a value (null, string, node-ref, list)
       • We tag syntax elements and AST node expressions: tags become field names
       • Expressions that are not tagged get the name of the construct (convention!)
       • Scalar and list values use different tag binding symbols
       • Simple notation for node class hierarchies
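The naming convention for untagged expressions can be guessed from the deck's examples (`Block` becomes `block`, `IdList` becomes `idList`, `ID` becomes `id`). The sketch below encodes that guess; the rule itself and the name `implicitName` are assumptions inferred from those examples, not a documented part of the framework.

```java
import java.util.Locale;

class ImplicitNameSketch {
    // Derives a field name from an untagged construct name.
    static String implicitName(String construct) {
        if (construct.equals(construct.toUpperCase(Locale.ROOT))) {
            // All-uppercase token names: ID -> id
            return construct.toLowerCase(Locale.ROOT);
        }
        // Rule names: lowercase only the first letter, IdList -> idList
        return Character.toLowerCase(construct.charAt(0)) + construct.substring(1);
    }
}
```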
  11. Annotation Example (1 of 3)
       • Value of rule is last value produced
       • params is a scalar variable; decs and stmts are list variables; block, exp are implicit variables
       • Var and Fun are subclasses of Dec
       • Note how some values can be null

         Program => Block {Program block}
         Block   => (decs*:Dec ";")* (stmts*:Stmt ";")+ {Block decs stmts}
         Dec     => "var" ID ("=" Exp)? {Var:Dec id exp}
                 |  "fun" ID "(" params:IdList? ")" "=" Exp {Fun:Dec id params exp}
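To make the node expressions concrete, here is a sketch of Java classes that `{Var:Dec id exp}` and `{Fun:Dec id params exp}` might correspond to. It is an illustration only: the wrapper `DecSketch`, the field types, and the accessor names are assumptions, and the classes the prototype actually emits may differ.

```java
import java.util.List;

class DecSketch {
    static abstract class Exp {}

    static class NumLit extends Exp {
        private final String value;
        NumLit(String value) { this.value = value; }
        String getValue() { return value; }
    }

    static abstract class Dec {}

    // From {Var:Dec id exp}
    static class Var extends Dec {
        private String id;
        private Exp exp;              // null when the optional ("=" Exp) part is absent
        Var(String id, Exp exp) { this.id = id; this.exp = exp; }
        String getId() { return id; }
        void setId(String id) { this.id = id; }
        Exp getExp() { return exp; }
        void setExp(Exp exp) { this.exp = exp; }
    }

    // From {Fun:Dec id params exp}
    static class Fun extends Dec {
        private String id;
        private List<String> params;  // null when IdList? matches nothing
        private Exp exp;
        Fun(String id, List<String> params, Exp exp) {
            this.id = id; this.params = params; this.exp = exp;
        }
        String getId() { return id; }
        List<String> getParams() { return params; }
        Exp getExp() { return exp; }
    }
}
```

With these classes, `var y;` would become `new DecSketch.Var("y", null)`, which shows how an absent optional part surfaces as a null field.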
  12. Annotation Example (2 of 3)
       • Value of IdList is just the value of id
       • left is a scalar variable; note how it gets “reassigned”

         Stmt    => ID "=" Exp {Assign:Stmt id exp}
                 |  "read" IdList {Read:Stmt idList}
                 |  "write" ExpList {Write:Stmt expList}
                 |  "while" Exp "do" Block "end" {While:Stmt exp block}
         IdList  => id*:ID ("," id*:ID)*
         ExpList => exp*:Exp ("," exp*:Exp)*
         Exp     => left:Term (op:("+" | "-") right:Term left:{Bin:Exp op left right})*
         Term    => left:Factor (op:("*" | "/") right:Factor left:{Bin:Exp op left right})*
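The “reassignment” of left is what makes the resulting tree left-associative. A hand-written loop in the same spirit is sketched below; the class names, the pre-split token list, and the use of `Ref` as a stand-in for a full Term are simplifying assumptions, and the input is assumed to be well formed.

```java
import java.util.Iterator;
import java.util.List;

class LeftFoldSketch {
    static abstract class Exp {}

    static class Ref extends Exp {
        final String id;
        Ref(String id) { this.id = id; }
        @Override public String toString() { return id; }
    }

    static class Bin extends Exp {
        final String op;
        final Exp left;
        final Exp right;
        Bin(String op, Exp left, Exp right) { this.op = op; this.left = left; this.right = right; }
        @Override public String toString() { return "(" + left + " " + op + " " + right + ")"; }
    }

    // Mimics: left:Term (op:("+" | "-") right:Term left:{Bin:Exp op left right})*
    // Tokens alternate operand, operator, operand, ... for example [a, -, b, +, c].
    static Exp parseExp(List<String> tokens) {
        Iterator<String> it = tokens.iterator();
        Exp left = new Ref(it.next());
        while (it.hasNext()) {
            String op = it.next();
            Exp right = new Ref(it.next());
            left = new Bin(op, left, right);   // the new node replaces the old value of left
        }
        return left;
    }
}
```

Parsing `a - b + c` this way gives `((a - b) + c)`: each pass through the loop wraps the tree built so far as the left operand of a new Bin node.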
  13. Annotation Example (3 of 3)
       • The tag value is used because numlit and strlit would not be nice field names
       • Type of Factor rule is Exp (most general superclass)
       • ^exp is required to avoid “)” as the value

         Factor => value:NUMLIT {NumLit:Exp value}
                |  value:STRLIT {StrLit:Exp value}
                |  ID {Ref:Exp id}
                |  Call
                |  "(" Exp ")" ^exp
         Call   => ID "(" args:ExpList? ")" {Call:Exp id args}
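One plausible way to arrive at Exp as the type of Factor is to intersect the superclass chains of its alternatives. The slides do not describe the actual inference algorithm, so the following is only a sketch of that idea; `RuleTypeSketch`, `PARENT`, and `commonType` are hypothetical names, and the parent table is filled in by hand from the `{Name:Super ...}` annotations.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class RuleTypeSketch {
    // child -> parent, copied by hand from the annotated grammar
    static final Map<String, String> PARENT = new HashMap<>();
    static {
        for (String c : Arrays.asList("NumLit", "StrLit", "Ref", "Call", "Bin")) {
            PARENT.put(c, "Exp");
        }
    }

    // The class itself followed by its superclasses, nearest first.
    static Set<String> ancestors(String cls) {
        Set<String> chain = new LinkedHashSet<>();
        for (String c = cls; c != null; c = PARENT.get(c)) {
            chain.add(c);
        }
        return chain;
    }

    // Nearest class shared by every alternative, or null if there is none.
    static String commonType(List<String> alternatives) {
        Set<String> common = ancestors(alternatives.get(0));
        for (String alt : alternatives.subList(1, alternatives.size())) {
            common.retainAll(ancestors(alt));
        }
        return common.isEmpty() ? null : common.iterator().next();
    }
}
```

For the five alternatives of Factor (NumLit, StrLit, Ref, Call, Exp), `commonType` returns `Exp`.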
  14. Parser Generator Implementation
       • A parser generator reads a description (like the one on the last three slides) and outputs
           • A scanner
           • A parser, producing an AST (only)
           • A set of AST node classes, each with setters, getters, and (possibly) visit methods
           • A visitor framework for using the generated AST (without touching the tree classes, of course)
       • Interesting implementation detail: the token set, types, etc. are computed by an inference algorithm
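A visitor lets client code add operations without editing the node classes. The sketch below shows the general pattern on two node types; the interface `Visitor`, the `accept` methods, and the `Evaluator` are illustrative assumptions and not the framework the prototype generates.

```java
class VisitorSketch {
    interface Visitor<R> {
        R visitNumLit(NumLit n);
        R visitBin(Bin b);
    }

    static abstract class Exp {
        abstract <R> R accept(Visitor<R> v);
    }

    static class NumLit extends Exp {
        final String value;
        NumLit(String value) { this.value = value; }
        @Override <R> R accept(Visitor<R> v) { return v.visitNumLit(this); }
    }

    static class Bin extends Exp {
        final String op;
        final Exp left;
        final Exp right;
        Bin(String op, Exp left, Exp right) { this.op = op; this.left = left; this.right = right; }
        @Override <R> R accept(Visitor<R> v) { return v.visitBin(this); }
    }

    // A client-side operation that lives entirely outside the tree classes.
    static class Evaluator implements Visitor<Double> {
        @Override public Double visitNumLit(NumLit n) {
            return Double.parseDouble(n.value);
        }
        @Override public Double visitBin(Bin b) {
            double l = b.left.accept(this);
            double r = b.right.accept(this);
            switch (b.op) {
                case "+": return l + r;
                case "-": return l - r;
                case "*": return l * r;
                case "/": return l / r;
                default: throw new IllegalArgumentException("unknown operator " + b.op);
            }
        }
    }
}
```

Evaluating the tree for `10.5 - 2 * 4` with `new VisitorSketch.Evaluator()` gives 2.5, and a second operation (say, a pretty-printer) would be another `Visitor` implementation rather than a change to `NumLit` or `Bin`.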
  15. Prototype Implementation (1 of 2)
       • Initial implementation is a proof of concept
       • Description elements fixed, not yet pluggable
           • : for scalar binding
           • *: for list binding
           • { } for node expressions
           • : for subclassing
       • Produces incomplete parsers, though scanner, tree classes, and navigation are fully implemented
  16. Prototype Implementation (2 of 2)
       • Java only
       • Microsyntax specification uses Java regexes (nice)
       • Packaged as altgen-m-n.jar (m and n are version numbers), under 50KB
       • Further info at
       • Planned open source distribution at Google Code
  17. Summary
       • Presentation of a terse, declarative, grammar-independent annotation framework for the generation of statically typed abstract syntax trees
       • Presentation of a prototype parser generator using the framework
       • Java implementation of the prototype is only 50KB
  18. Questions?