Implementing External DSLs Using Scala Parser Combinators


Slides from a talk I gave at the St. Louis Lambda Lounge for the Dec. 2009 meeting.

Published in: Technology
Slide notes:
  • ( | ) = OR logic; { } = repetition
  • The multiline string notation is used so escaping “\\” in the regex is not needed
  • Internal DSLs in Scala leverage implicit conversion to a great degree to allow a more flexible “syntax”
  • ~ is a case class.
  • Because statement and repeat invoke each other, and can therefore be invoked recursively, the return type for one of them must be specified
  • There is also rep1sep
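  • Demonstrate a not-so-precise message: “RT” -> “ET”
  • The Scala compiler rejects { Turn(-_) } because the placeholder binds to the innermost expression, so it expands to Turn(x => -x); hence { x => Turn(-x) }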
  • Tail Recursive !!

    1. 1. Implementing External DSLs Using Scala Parser Combinators St. Louis Lambda Lounge Sept. 3, 2009 Tim Dalton Senior Software Engineer Object Computing Inc.
    2. 2. External vs Internal DSL <ul><li>Internal DSLs are implemented using the syntax of the “host” programming language </li></ul><ul><ul><li>Examples </li></ul></ul><ul><ul><ul><li>Fluent APIs in Java </li></ul></ul></ul><ul><ul><ul><li>RSpec and ScalaSpec </li></ul></ul></ul><ul><ul><li>Constrained by features of the programming language </li></ul></ul><ul><li>External DSL syntax is limited only by the capabilities of the parser </li></ul>
    3. 3. What is a Combinator ? <ul><li>Combinators are functions that can be combined to perform more complex operations </li></ul><ul><ul><li>Concept originates in Lambda Calculus </li></ul></ul><ul><ul><li>Mostly comes from the Haskell community </li></ul></ul><ul><ul><ul><li>Haskell implementations use Monads </li></ul></ul></ul><ul><ul><ul><li>Scala implementation “almost Monadic” </li></ul></ul></ul>
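To make the idea of "functions that can be combined" concrete, here is a minimal hand-rolled sketch. It does not use the Scala library at all; the names `P`, `lit`, `or`, and `seq` are hypothetical and chosen only for illustration.

```scala
// Hand-rolled sketch: every name here is hypothetical and unrelated to the Scala library.
object MiniCombinators {
  // A parser consumes a prefix of the input and, on success,
  // returns the result together with the remaining input.
  type P[A] = String => Option[(A, String)]

  // Primitive parser: matches a literal prefix.
  def lit(s: String): P[String] =
    in => if (in.startsWith(s)) Some((s, in.substring(s.length))) else None

  // Combinator for alternatives: try p, and fall back to q if p fails.
  def or[A](p: P[A], q: P[A]): P[A] =
    in => p(in).orElse(q(in))

  // Combinator for sequencing: run p, then run q on whatever p left over.
  def seq[A, B](p: P[A], q: P[B]): P[(A, B)] =
    in => p(in).flatMap { case (a, rest) =>
      q(rest).map { case (b, rest2) => ((a, b), rest2) }
    }

  // A larger parser built purely by combining smaller ones.
  val forwardThenSpace: P[(String, String)] =
    seq(or(lit("FD"), lit("FORWARD")), lit(" "))
}
```

The library's `|` and `~` play the roles of `or` and `seq` here, with richer result and error types.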
    4. 4. Scala’s Parser Implementation <ul><li>Context-free LL grammar </li></ul><ul><ul><li>Left to right </li></ul></ul><ul><ul><li>Leftmost derivation </li></ul></ul><ul><li>Recursive descent </li></ul><ul><li>Backtracking </li></ul><ul><ul><li>There are ways to prevent backtracking </li></ul></ul><ul><li>Advances planned for Scala 2.8 </li></ul><ul><ul><li>Support for Packrat parsing </li></ul></ul><ul><ul><li>Parsing Expression Grammars (PEG) </li></ul></ul><ul><ul><li>More predictive with less recursion and backtracking </li></ul></ul>
    5. 5. Scala Combinator Parser Hierarchy <ul><li>scala.util.parsing.combinator.Parsers </li></ul><ul><ul><li>scala.util.parsing.combinator.syntactical.TokenParsers </li></ul></ul><ul><ul><ul><li>scala.util.parsing.combinator.syntactical.StdTokenParsers </li></ul></ul></ul><ul><ul><li>scala.util.parsing.combinator.RegexParsers </li></ul></ul><ul><ul><ul><li>scala.util.parsing.combinator.JavaTokenParsers </li></ul></ul></ul>
    6. 6. A Simple Logo(-Like) Interpreter <ul><li>Only a few commands: </li></ul><ul><ul><li>Right Turn <angle-degrees> </li></ul></ul><ul><ul><li>Left Turn <angle-degrees> </li></ul></ul><ul><ul><li>Forward <number-of-pixels> </li></ul></ul><ul><ul><li>Repeat <nested sequence of other commands> </li></ul></ul>
    7. 7. Grammar for Simple Logo <ul><ul><li>forward = (“FORWARD” | “FD”) positive-integer </li></ul></ul><ul><ul><li>right = (“RIGHT” | “RT”) positive-integer </li></ul></ul><ul><ul><li>left = (“LEFT” | “LT”) positive-integer </li></ul></ul><ul><ul><li>repeat = “REPEAT” positive-integer “[“{statement}”]” </li></ul></ul><ul><ul><li>statement = right | left | forward | repeat </li></ul></ul><ul><ul><li>program = { statement } </li></ul></ul>
    8. 8. Scala Code to Implement Parser <ul><li>object LogoParser extends RegexParsers { </li></ul><ul><li>def positiveInteger = &quot;&quot;&quot;\d+&quot;&quot;&quot;r </li></ul><ul><li>def forward = (&quot;FD&quot;|&quot;FORWARD&quot;)~positiveInteger </li></ul><ul><li>def right = (&quot;RT&quot;|&quot;RIGHT&quot;)~positiveInteger </li></ul><ul><li>def left = (&quot;LT&quot;|&quot;LEFT&quot;)~positiveInteger </li></ul><ul><li>def repeat = &quot;REPEAT&quot; ~ positiveInteger ~ &quot;[&quot; ~ rep(statement) ~ &quot;]&quot; </li></ul><ul><li>def statement:Parser[Any] = forward | right | left | repeat </li></ul><ul><li>def program = rep(statement) </li></ul><ul><li>} </li></ul>
    9. 9. Scala Code to Implement Parser <ul><li>An internal DSL is used to implement an external one </li></ul><ul><li>Methods on the preceding slide are referred to as parser generators </li></ul><ul><li>RegexParsers is a subtrait of the Parsers trait, which provides the generic parser combinators </li></ul>
    10. 10. A Closer Look <ul><li>def positiveInteger = &quot;&quot;&quot;\d+&quot;&quot;&quot;r </li></ul><ul><li>The trailing “r” is a method call that converts the string to a Regex object </li></ul><ul><ul><li>More verbose syntax: </li></ul></ul><ul><li>&quot;&quot;&quot;\d+&quot;&quot;&quot;.r </li></ul><ul><li>String does not have an r method !! </li></ul><ul><li>Class RichString does, so an implicit conversion is done </li></ul>
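The difference between an ordinary string literal and the triple-quoted form can be checked directly. This is a small sketch using only the standard library; the object name `RegexLiterals` and the helper `leadingDigits` are hypothetical.

```scala
import scala.util.matching.Regex

object RegexLiterals {
  // Ordinary string literal: the backslash itself must be escaped.
  val escaped: Regex = "\\d+".r

  // Triple-quoted string literal: the backslash is taken literally.
  val raw: Regex = """\d+""".r

  // findPrefixOf returns a match only if it starts at the beginning of the input.
  def leadingDigits(s: String): Option[String] = raw.findPrefixOf(s)
}
```

Both values compile to the same pattern; the triple-quoted form is simply easier to read once a regex contains several backslashes.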
    11. 11. Implicit Conversions <ul><li>One of the more powerful / dangerous features of Scala is implicit conversions </li></ul><ul><ul><li>RichString.r method signature </li></ul></ul><ul><ul><li>def r : Regex </li></ul></ul><ul><ul><li>scala.Predef implicit converter </li></ul></ul><ul><ul><li>implicit def stringWrapper( x : java.lang.String) : RichString </li></ul></ul><ul><li>The Scala compiler will look for implicit converters in scope and insert them automatically </li></ul><ul><li>“With great power comes great responsibility” </li></ul>
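The same mechanism can be reproduced with a user-defined wrapper. The sketch below is an illustration only: `Shouter`, `shout`, and `stringToShouter` are hypothetical names standing in for RichString, `r`, and the Predef converter. (In Scala 2.8 and later the standard wrapper class is `StringOps` rather than `RichString`.)

```scala
import scala.language.implicitConversions

object ImplicitDemo {
  // Hypothetical wrapper class, playing the role RichString plays for String.
  class Shouter(s: String) {
    def shout: String = s.toUpperCase + "!"
  }

  // Hypothetical implicit converter, analogous to the one in scala.Predef.
  implicit def stringToShouter(s: String): Shouter = new Shouter(s)

  // String has no shout method, so the compiler rewrites this call to
  // stringToShouter("forward").shout
  val result: String = "forward".shout
}
```

Nothing at the call site hints that a conversion happened, which is exactly why the feature is both powerful and dangerous.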
    12. 12. Back to the Parser <ul><li>def forward = (&quot;FD&quot;|&quot;FORWARD&quot;)~positiveInteger </li></ul><ul><li>The “|” and “~” are methods of class Parsers.Parser[T] !! </li></ul><ul><li>RegexParsers has implicit conversions: </li></ul><ul><ul><li>implicit def literal(s : String) : Parser[String] </li></ul></ul><ul><li>implicit def regex(r : Regex) : Parser[String] </li></ul><ul><li>Parser generator methods should return something that can at least be converted to Parser[T] </li></ul>
    13. 13. Parser[T]’s and ParseResult[T]’s <ul><li>Parsers.Parser[T] </li></ul><ul><ul><li>Extends Reader => ParseResult[T] </li></ul></ul><ul><ul><ul><li>This makes it a function object </li></ul></ul></ul><ul><li>ParseResult[T] hierarchy: </li></ul><ul><ul><li>Parsers.Success </li></ul></ul><ul><ul><li>Parsers.NoSuccess </li></ul></ul><ul><ul><ul><li>Parsers.Failure </li></ul></ul></ul><ul><ul><ul><li>Parsers.Error </li></ul></ul></ul><ul><li>Invoking a Parser[T] function object returns an instance of one of the above subclasses </li></ul>
    14. 14. Combining Parser[T]’s <ul><li>Signature for Parser[T].| method: </li></ul><ul><li>def |[U >: T](q : => Parser[U]) : Parser[U] </li></ul><ul><ul><li>Parser combinator for alternative composition (OR) </li></ul></ul><ul><ul><li>Succeeds (returns Parsers.Success) if either “this” Parser[T] succeeds or “q” Parser[U] succeeds </li></ul></ul><ul><ul><li>Type U must be the same as, or a supertype of, type T. </li></ul></ul>
    15. 15. Combining Parser[T]’s <ul><li>Signature of Parser[T].~ method: </li></ul><ul><li>def ~[U](p : => Parser[U]) : Parser[~[T, U]] </li></ul><ul><ul><li>Parser combinator for sequential composition </li></ul></ul><ul><ul><li>Succeeds only if “this” Parser succeeds and then “p” Parser succeeds </li></ul></ul><ul><ul><li>Returns an instance of “~” that contains both results </li></ul></ul><ul><ul><ul><li>Yes, “~” is also a class ! </li></ul></ul></ul><ul><ul><ul><li>Like a Pair, but easier to pattern match on </li></ul></ul></ul>
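A small sketch of `|` and `~` working together, and of pattern matching on the resulting `~` instance. It assumes the scala-parser-combinators module is on the classpath (it ships separately from the standard library since Scala 2.11); the names `TildeDemo`, `keyword`, `number`, and `describe` are hypothetical.

```scala
import scala.util.parsing.combinator.RegexParsers

object TildeDemo extends RegexParsers {
  // Alternative composition: either literal may match.
  def keyword: Parser[String] = "FD" | "FORWARD"

  def number: Parser[String] = """\d+""".r

  // Sequential composition: the result is a ~ holding both sub-results.
  def forward: Parser[String ~ String] = keyword ~ number

  // Pattern match on the ~ case class to pull the two results apart.
  def describe(input: String): String = parseAll(forward, input) match {
    case Success(kw ~ n, _) => s"$kw then $n"
    case failure: NoSuccess => s"no parse: ${failure.msg}"
  }
}
```

Writing `kw ~ n` in the pattern mirrors the way the parser itself was written, which is the main ergonomic advantage of `~` over a plain pair.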
    16. 16. Forward March <ul><li>Back to the specification of forward: </li></ul><ul><li>def forward = (&quot;FD&quot;|&quot;FORWARD&quot;)~positiveInteger </li></ul><ul><ul><li>For this combinator to succeed, </li></ul></ul><ul><ul><ul><li>Either the Parser for literal “FD” or the Parser for literal “FORWARD” must succeed </li></ul></ul></ul><ul><ul><ul><li>And then the Parser for the positiveInteger Regex must succeed </li></ul></ul></ul><ul><ul><li>Both the literal strings and the Regex result of positiveInteger are implicitly converted to Parser[String] </li></ul></ul>
    17. 17. Repetition <ul><li>Next lines of note: </li></ul><ul><li>def repeat = </li></ul><ul><li>&quot;REPEAT&quot; ~ positiveInteger ~ &quot;[&quot; ~ rep(statement) ~ &quot;]&quot; </li></ul><ul><li>def statement:Parser[Any] = </li></ul><ul><li> forward | right | left | repeat </li></ul><ul><ul><li>The type of either repeat or statement needs to be explicitly specified because they are mutually recursive </li></ul></ul><ul><li>The rep method specifies that a Parser can be repeated </li></ul>
    18. 18. Repetition <ul><li>Signature for Parsers.rep method: </li></ul><ul><li>def rep[T](p : => Parser[T]) : Parser[List[T]] </li></ul><ul><ul><li>Parser Combinator for repetitions </li></ul></ul><ul><ul><li>Parses input until Parser, p, fails. Returns consecutive successful results as List. </li></ul></ul>
    19. 19. Other Forms of Repetition <ul><li>def repsep[T](p: => Parser[T], q: => Parser[Any]) : Parser[List[T]] </li></ul><ul><ul><li>Specifies a Parser to be interleaved in the repetition </li></ul></ul><ul><ul><li>Example: repsep(term, &quot;,&quot;) </li></ul></ul><ul><li>def rep1[T](p: => Parser[T]): Parser[List[T]] </li></ul><ul><ul><li>Parses non-empty repetitions </li></ul></ul><ul><li>def repN[T](n : Int, p : => Parser[T]) : Parser[List[T]] </li></ul><ul><ul><li>Parses a specified number of repetitions </li></ul></ul>
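The repetition combinators can be compared side by side. This sketch assumes the scala-parser-combinators module is on the classpath; the object `RepetitionDemo` and the helper `run` are hypothetical names introduced for illustration.

```scala
import scala.util.parsing.combinator.RegexParsers

object RepetitionDemo extends RegexParsers {
  def number: Parser[String] = """\d+""".r

  def zeroOrMore: Parser[List[String]]     = rep(number)         // may be empty
  def oneOrMore: Parser[List[String]]      = rep1(number)        // at least one
  def commaSeparated: Parser[List[String]] = repsep(number, ",") // separator is dropped
  def exactlyThree: Parser[List[String]]   = repN(3, number)     // fixed count

  // Helper: Some(result) if the whole input parses, None otherwise.
  def run(p: Parser[List[String]], input: String): Option[List[String]] =
    parseAll(p, input) match {
      case Success(result, _) => Some(result)
      case _                  => None
    }
}
```

Note that `repsep` returns only the repeated items; the separator results are discarded.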
    20. 20. Execution <ul><li>Root Parser Generator: </li></ul><ul><li>def program = rep(statement) </li></ul><ul><li>To execute the Parser: </li></ul><ul><li>parseAll(program, &quot;REPEAT 4 [FD 100 RT 90]&quot;) </li></ul><ul><ul><li>Returns Parsers.Success[List[Parsers.~[…]]] </li></ul></ul><ul><ul><ul><li>Remember, Parsers.Success[T] is a subclass of ParseResult[T] </li></ul></ul></ul><ul><ul><ul><li>toString: </li></ul></ul></ul><ul><li>[1.24] parsed: List(((((REPEAT~4)~[)~List((FD~100), (RT~90)))~])) </li></ul><ul><ul><ul><li>The “…” stands for many levels of nested “~” instances </li></ul></ul></ul>
    21. 21. Not-so-Happy Path <ul><li>Example of failed parsing: </li></ul><ul><li>parseAll(program, &quot;REPEAT 4 [FD 100 RT 90)&quot;) </li></ul><ul><ul><li>Returns Parsers.Failure </li></ul></ul><ul><ul><ul><li>Subclass of ParseResult[Nothing] </li></ul></ul></ul><ul><ul><li>toString: </li></ul></ul><ul><ul><li>[1.23] failure: `]' expected but `)' found </li></ul></ul><ul><ul><li>REPEAT 4 [FD 100 RT 90) </li></ul></ul><ul><ul><li>^ </li></ul></ul><ul><ul><ul><li>Failure message not always so “precise” </li></ul></ul></ul>
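Rather than relying on toString, calling code can match on the result and read the failure position and message itself. This is a sketch under the assumption that the scala-parser-combinators module is on the classpath; `FailureDemo`, `command`, `block`, and `report` are hypothetical names, and the exact message text may differ between library versions.

```scala
import scala.util.parsing.combinator.RegexParsers

object FailureDemo extends RegexParsers {
  def number: Parser[String] = """\d+""".r
  def command: Parser[Any]   = ("FD" | "RT") ~ number
  def block: Parser[Any]     = "[" ~ rep(command) ~ "]"

  // Summarise a parse as "ok" or as a position plus the library's message.
  def report(input: String): String = parseAll(block, input) match {
    case Success(_, _) => "ok"
    case ns: NoSuccess =>
      s"line ${ns.next.pos.line}, column ${ns.next.pos.column}: ${ns.msg}"
  }
}
```

Matching on `NoSuccess` covers both `Failure` and `Error`, so the caller does not need to distinguish them unless it wants to.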
    22. 22. Making Something Useful <ul><li>Successful parse results need to be transformed into something that can be evaluated </li></ul><ul><ul><li>Enter the “eye brows” method of Parser[T]: </li></ul></ul><ul><ul><li>def ^^[U](f : (T) => U) : Parser[U] </li></ul></ul><ul><ul><ul><li>Parser combinator for function application </li></ul></ul></ul>
    23. 23. Eye Brows Example <ul><ul><li>Example of “^^” method: </li></ul></ul><ul><ul><ul><li>def positiveInteger = &quot;&quot;&quot;\d+&quot;&quot;&quot;.r ^^ </li></ul></ul></ul><ul><ul><ul><li>{ x:String => x.toInt } </li></ul></ul></ul><ul><ul><li>Now positiveInteger generates Parser[Int] instead of Parser[String] </li></ul></ul><ul><ul><li>Transformer can be shortened to “{ _.toInt }” </li></ul></ul>
    24. 24. Implementing Commands <ul><li>For the statements we need a hierarchy of command classes: </li></ul><ul><li>sealed abstract class LogoCommand </li></ul><ul><li>case class Forward(x: Int) extends LogoCommand </li></ul><ul><li>case class Turn(x: Int) extends LogoCommand </li></ul><ul><li>case class Repeat(i: Int, e: List[LogoCommand]) extends LogoCommand </li></ul>
    25. 25. Transforming into Commands <ul><li>The Forward command: </li></ul><ul><li>def forward = (&quot;FD&quot;|&quot;FORWARD&quot;)~positiveInteger ^^ </li></ul><ul><li>{ case _~value => Forward(value) } </li></ul><ul><ul><li>A ~[String, Int] is passed to the transformer </li></ul></ul><ul><ul><li>Pattern matching is used to extract the Int value and construct a Forward instance </li></ul></ul><ul><ul><ul><li>Forward is a case class, so “new” is not needed </li></ul></ul></ul><ul><ul><li>Case constructs can be partial functions themselves. Longer form: </li></ul></ul><ul><li>… ^^ { tilde => tilde match </li></ul><ul><li>{ case _~value => Forward(value) }} </li></ul>
    26. 26. Derivatives of “~” <ul><li>Two methods related to “~”: </li></ul><ul><ul><li>def <~ [U](p: => Parser[U]): Parser[T] </li></ul></ul><ul><ul><ul><li>Parser combinator for sequential composition which keeps only the left result </li></ul></ul></ul><ul><ul><li>def ~> [U](p: => Parser[U]): Parser[U] </li></ul></ul><ul><ul><ul><li>Parser combinator for sequential composition which keeps only the right result </li></ul></ul></ul><ul><ul><li>Note that neither returns a “~” instance </li></ul></ul><ul><li>The forward method can be simplified: </li></ul><ul><li>def forward = (&quot;FD&quot;|&quot;FORWARD&quot;)~>positiveInteger ^^ </li></ul><ul><li>{ Forward(_) } </li></ul>
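The effect of `^^`, `~>`, and `<~` on result types is easiest to see with explicit type annotations. A minimal sketch, assuming the scala-parser-combinators module is on the classpath; `TransformDemo`, `bracketed`, `forwardPixels`, and `run` are hypothetical names.

```scala
import scala.util.parsing.combinator.RegexParsers

object TransformDemo extends RegexParsers {
  // ^^ maps the String matched by the regex to an Int.
  def number: Parser[Int] = """\d+""".r ^^ { _.toInt }

  // ~> drops the left result and <~ drops the right one, so only the Int survives.
  def bracketed: Parser[Int] = "[" ~> number <~ "]"

  // The keyword is required for the parse to succeed but is not kept.
  def forwardPixels: Parser[Int] = ("FD" | "FORWARD") ~> number

  // Helper: Some(result) if the whole input parses, None otherwise.
  def run(p: Parser[Int], input: String): Option[Int] =
    parseAll(p, input) match {
      case Success(result, _) => Some(result)
      case _                  => None
    }
}
```

Because operators starting with `~` bind more tightly than those starting with `<`, `"[" ~> number <~ "]"` groups as `("[" ~> number) <~ "]"`.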
    27. 27. Updated Parser <ul><li>def positiveInteger = &quot;&quot;&quot;\d+&quot;&quot;&quot;.r ^^ { _.toInt } </li></ul><ul><li>def forward = (&quot;FD&quot;|&quot;FORWARD&quot;)~>positiveInteger ^^ </li></ul><ul><li>{ Forward(_) } </li></ul><ul><li>def right = (&quot;RT&quot;|&quot;RIGHT&quot;)~>positiveInteger ^^ </li></ul><ul><li>{ x => Turn(-x) } </li></ul><ul><li>def left = (&quot;LT&quot;|&quot;LEFT&quot;)~>positiveInteger ^^ </li></ul><ul><li>{ Turn(_) } </li></ul><ul><li>def repeat = &quot;REPEAT&quot; ~> positiveInteger ~ &quot;[&quot; ~ rep(statement) <~ &quot;]&quot; ^^ </li></ul><ul><li>{ case number~_~statements => Repeat(number, statements)} </li></ul>
    28. 28. Updated Parser Results <ul><li>Executing the Parser now: </li></ul><ul><li>parseAll(program, &quot;REPEAT 4 [FD 100 RT 90]&quot;) </li></ul><ul><ul><li>Results: </li></ul></ul><ul><li>[1.24] parsed: List(Repeat(4,List(Forward(100), Turn(-90)))) </li></ul><ul><ul><li>Returns Parsers.Success[List[Repeat]] </li></ul></ul><ul><ul><li>This can be evaluated !! </li></ul></ul>
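For reference, the pieces from the preceding slides can be assembled into one self-contained sketch. It assumes the scala-parser-combinators module is on the classpath and makes two choices that are not in the slides: every generator is annotated as `Parser[LogoCommand]` (so that `Repeat` receives a `List[LogoCommand]`), and the brackets in `repeat` are dropped with `~>` and `<~`. The wrapper object `LogoSketch` and the helper `parseProgram` are hypothetical.

```scala
import scala.util.parsing.combinator.RegexParsers

object LogoSketch {
  sealed abstract class LogoCommand
  case class Forward(x: Int) extends LogoCommand
  case class Turn(x: Int) extends LogoCommand
  case class Repeat(i: Int, e: List[LogoCommand]) extends LogoCommand

  object LogoParser extends RegexParsers {
    def positiveInteger: Parser[Int] = """\d+""".r ^^ { _.toInt }

    def forward: Parser[LogoCommand] =
      ("FD" | "FORWARD") ~> positiveInteger ^^ { Forward(_) }
    def right: Parser[LogoCommand] =
      ("RT" | "RIGHT") ~> positiveInteger ^^ { x => Turn(-x) }
    def left: Parser[LogoCommand] =
      ("LT" | "LEFT") ~> positiveInteger ^^ { Turn(_) }

    // Assumption: brackets are discarded, leaving count ~ statements.
    def repeat: Parser[LogoCommand] =
      "REPEAT" ~> positiveInteger ~ ("[" ~> rep(statement) <~ "]") ^^ {
        case count ~ statements => Repeat(count, statements)
      }

    def statement: Parser[LogoCommand] = forward | right | left | repeat
    def program: Parser[List[LogoCommand]] = rep(statement)

    // Hypothetical helper: Some(commands) on success, None on any failure.
    def parseProgram(input: String): Option[List[LogoCommand]] =
      parseAll(program, input) match {
        case Success(commands, _) => Some(commands)
        case _                    => None
      }
  }
}
```

With the explicit annotations, the static type of a successful parse is `List[LogoCommand]`, which is what an evaluator needs.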
    29. 29. Evaluation <ul><li>class LogoEvaluationState { </li></ul><ul><li>var x = 0 </li></ul><ul><li>var y = 0 </li></ul><ul><li>var heading = 0 </li></ul><ul><li>} </li></ul><ul><li>implicit def dblToInt(d: Double):Int = if (d > 0) (d+0.5).toInt else (d-0.5).toInt </li></ul><ul><li>def parse(s: String) : List[LogoCommand] = LogoParser.parse(s).get </li></ul><ul><li>def evaluate(parseResult: LogoParser.ParseResult[List[LogoCommand]], g:Graphics2D) { </li></ul><ul><li>var state = new LogoEvaluationState </li></ul><ul><li>if (parseResult.successful) { </li></ul><ul><li>evaluate(parseResult.get, g, state) </li></ul><ul><li>} </li></ul><ul><li>// draw turtle </li></ul><ul><li>evaluate(parse(&quot;RT 90 FD 3 LT 110 FD 10 LT 140 FD 10 LT 110 FD 3&quot;), g, state) </li></ul><ul><li>} // Continued... </li></ul>
    30. 30. Evaluation (More Functional) <ul><li>private def evaluate(list: List[LogoCommand], g:Graphics2D, state:LogoEvaluationState) { </li></ul><ul><li>if (!list.isEmpty) { </li></ul><ul><li>val head :: tail = list </li></ul><ul><li>head match { </li></ul><ul><li> case Forward(distance) => { </li></ul><ul><li>val (nextX, nextY) = </li></ul><ul><li>(state.x + distance * sin(toRadians(state.heading)), </li></ul><ul><li> state.y + distance * cos(toRadians(state.heading))) </li></ul><ul><li>g.drawLine(state.x, state.y, nextX, nextY) </li></ul><ul><li>state.x = nextX </li></ul><ul><li>state.y = nextY </li></ul><ul><li>evaluate(tail, g, state) </li></ul><ul><li> } </li></ul><ul><li> case Turn(degrees) => { </li></ul><ul><li>state.heading += degrees </li></ul><ul><li>evaluate(tail, g, state) </li></ul><ul><li> } </li></ul><ul><li> case Repeat(0, _) => evaluate(tail, g, state) </li></ul><ul><li> case Repeat(count, statements) => </li></ul><ul><li>evaluate(statements ::: Repeat(count-1, statements)::tail, g, state) </li></ul><ul><li>} </li></ul><ul><li>} </li></ul><ul><li>} </li></ul>
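The tail-recursive shape of this evaluator can be checked by the compiler with `@tailrec`. The sketch below is a simplified variant, not the code from the talk: it drops Graphics2D and replaces the mutable state with a hypothetical immutable `State` that records only the heading and the lengths of the segments that would have been drawn.

```scala
import scala.annotation.tailrec

object EvalSketch {
  sealed abstract class LogoCommand
  case class Forward(x: Int) extends LogoCommand
  case class Turn(x: Int) extends LogoCommand
  case class Repeat(i: Int, e: List[LogoCommand]) extends LogoCommand

  // Hypothetical stand-in for LogoEvaluationState; lengths holds the
  // segment lengths "drawn" so far, most recent first.
  case class State(heading: Int = 0, lengths: List[Int] = Nil)

  // @tailrec makes the compiler reject the method if any recursive call
  // is not in tail position.
  @tailrec
  def evaluate(commands: List[LogoCommand], state: State): State = commands match {
    case Nil => state
    case Forward(d) :: tail =>
      evaluate(tail, state.copy(lengths = d :: state.lengths))
    case Turn(degrees) :: tail =>
      evaluate(tail, state.copy(heading = state.heading + degrees))
    case Repeat(n, _) :: tail if n <= 0 =>
      evaluate(tail, state)
    // Unroll one iteration and push the remaining repeats back onto the work list.
    case Repeat(n, body) :: tail =>
      evaluate(body ::: Repeat(n - 1, body) :: tail, state)
  }
}
```

Unrolling `Repeat` onto the front of the command list is what keeps every recursive call in tail position; a nested call to evaluate the body would not be.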
    31. 31. Evaluation (More Imperative) <ul><li>def evaluate(list: List[LogoCommand], g:Graphics2D, state:LogoEvaluationState) { list.foreach(evaluate(_, g, state)) </li></ul><ul><li>} </li></ul><ul><li>def evaluate(command:LogoCommand, g:Graphics2D, state:LogoEvaluationState) { </li></ul><ul><li>command match { </li></ul><ul><li> case Forward(distance) => { </li></ul><ul><li> val (nextX, nextY) = (state.x + distance * Math.sin(Math.toRadians(state.heading)), </li></ul><ul><li>state.y + distance * Math.cos(Math.toRadians(state.heading))) </li></ul><ul><li>g.drawLine(state.x, state.y, nextX, nextY) </li></ul><ul><li>state.x = nextX </li></ul><ul><li>state.y = nextY </li></ul><ul><li> } </li></ul><ul><li> case Turn(degrees) => state.heading += degrees </li></ul><ul><li> case Repeat(count, statements) => (0 until count).foreach { _ => </li></ul><ul><li>evaluate(statements, g, state) </li></ul><ul><li> } </li></ul><ul><li>} </li></ul><ul><li>} </li></ul>
    32. 32. Demonstration