Implementing External DSLs Using Scala Parser Combinators

Slides from talk I gave at St. Louis Lambda Lounge (http://lambdalounge.org/) for the Dec. 2009 meeting.

  • (|) = OR logic { } = repetition
  • The multiline string notation is used so escaping “\\” in the regex is not needed
  • Internal DSLs in Scala leverage implicit conversion to a great degree to allow a more flexible “syntax”
  • ~ is a case class.
  • Because statement and repeat invoke each other, and can therefore be invoked recursively, the return type for one of them must be specified
  • There is also rep1sep
  • Demonstrate a not-so-precise message: “RT” -> “ET”
  • For some reason the Scala compiler didn’t like { Turn(-_) }
  • Tail Recursive !!

Implementing External DSLs Using Scala Parser Combinators: Presentation Transcript

  • Implementing External DSLs Using Scala Parser Combinators St. Louis Lambda Lounge Sept. 3, 2009 Tim Dalton Senior Software Engineer Object Computing Inc.
  • External vs Internal DSL
    • Internal DSLs are implemented using syntax of “host” programming language
      • Examples
        • Fluent APIs in Java
        • RSpec and ScalaSpec
      • Constrained by features of programming language
    • External DSL syntax is limited only by the capabilities of the parser
  • What is a Combinator ?
    • Combinators are functions that can be combined to perform more complex operations
      • Concept originates in Lambda Calculus
      • Mostly comes from the Haskell community
        • Haskell implementations use Monads
        • Scala implementation “almost Monadic”
  • Scala’s Parser Implementation
    • Context-free LL grammar
      • Left to right
      • Leftmost derivation
    • Recursive descent
    • Backtracking
      • There are ways to prevent backtracking
    • Advances planned for Scala 2.8
      • Support for Packrat parsing
      • Parsing Expression Grammar (PEG)
      • More predictive with less recursion and backtracking
  • Scala Combinator Parser Hierarchy
    • scala.util.parsing.combinator.Parsers
      • scala.util.parsing.combinator.syntactical.TokenParsers
        • scala.util.parsing.combinator.syntactical.StdTokenParsers
      • scala.util.parsing.combinator.RegexParsers
        • scala.util.parsing.combinator.JavaTokenParsers
  • A Simple Logo(-Like) Interpreter
    • Only a few commands:
      • Right Turn <angle-degrees>
      • Left Turn <angle-degrees>
      • Forward <number-of-pixels>
      • Repeat <nested sequence of other commands>
  • Grammar for Simple Logo
      • forward = (“FORWARD” | “FD”) positive-integer
      • right = (“RIGHT” | “RT”) positive-integer
      • left = (“LEFT” | “LT”) positive-integer
      • repeat = “REPEAT” positive-integer “[” { statement } “]”
      • statement = right | left | forward | repeat
      • program = { statement }
  • Scala Code to Implement Parser
    • object LogoParser extends RegexParsers {
    • def positiveInteger = """\d+"""r
    • def forward = ("FD"|"FORWARD")~positiveInteger
    • def right = ("RT"|"RIGHT")~positiveInteger
    • def left = ("LT"|"LEFT")~positiveInteger
    • def repeat = "REPEAT" ~ positiveInteger ~ "[" ~ rep(statement) ~ "]"
    • def statement:Parser[Any] = forward | right | left | repeat
    • def program = rep(statement)
    • }
  • Scala Code to Implement Parser
    • An internal DSL is used to implement an External One
    • The methods on the preceding slide are referred to as parser generators
    • RegexParsers extends the Parsers trait, which provides the generic parser combinators
  • A Closer Look
    • def positiveInteger = """\d+"""r
    • The trailing “r” is a method call that converts the string to a Regex object
      • More verbose syntax:
    • """\d+""".r
    • String does not have an r method !!
    • Class RichString does, so an implicit conversion is done
  • Implicit Conversions
    • One of the more powerful / dangerous features of Scala is implicit conversions
      • RichString.r method signature
      • def r : Regex
      • scala.Predef implicit convertor
      • implicit def stringWrapper( x : java.lang.String) : RichString
    • The Scala compiler will look for implicit convertors in scope and insert them implicitly
    • “With great power comes great responsibility”
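The mechanism described on this slide can be tried in isolation. The following is a minimal sketch, assuming a current Scala 2 or Scala 3 compiler; the names ImplicitDemo, Shouty, and stringToShouty are hypothetical and exist only for this illustration. It mirrors how a String literal acquires the r method.

```scala
import scala.language.implicitConversions

object ImplicitDemo {
  // Hypothetical wrapper class, invented for this illustration only.
  class Shouty(s: String) {
    def shout: String = s.toUpperCase + "!"
  }

  // The compiler inserts a call to this method whenever a String is used
  // where a member of Shouty is required.
  implicit def stringToShouty(s: String): Shouty = new Shouty(s)

  def main(args: Array[String]): Unit = {
    // String has no `shout` method, so this is compiled as
    // stringToShouty("forward").shout
    println("forward".shout)

    // The same mechanism supplies `r` on String values.
    val digits = """\d+""".r
    println(digits.findFirstIn("FD 100"))
  }
}
```

Because the conversion is applied silently, a reader of "forward".shout cannot see from the call site where shout comes from, which is the "dangerous" half of the feature.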
  • Back to the Parser
    • def forward = ("FD"|"FORWARD")~positiveInteger
    • The “|” and “~” are methods of class Parsers.Parser[T] !!
    • RegexParsers has implicit conversions:
      • implicit def literal(s : String) : Parser[String]
    • implicit def regex(r : Regex) : Parser[String]
    • Parser generator methods should return something that can at least be converted to Parser[T]
  • Parser[T]’s and ParseResult[T]’s
    • Parsers.Parser[T]
      • Extends Reader => ParseResult[T]
        • This makes it a function object
    • ParseResult[T] Hierarchy:
      • Parsers.Success
      • Parsers.NoSuccess
        • Parsers.Failure
        • Parsers.Error
    • Invoking a Parser[T] function object returns an instance of one of the above subclasses
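A small sketch of the hierarchy above in use. It assumes the scala-parser-combinators module is on the classpath (it has been a separate library since Scala 2.11); ResultDemo, number, direct, and describe are hypothetical names chosen for this example.

```scala
import scala.util.parsing.combinator.RegexParsers
import scala.util.parsing.input.CharSequenceReader

object ResultDemo extends RegexParsers {
  def number: Parser[String] = """\d+""".r

  // A Parser is a function from input to ParseResult, so it can be applied
  // directly to a Reader.
  def direct(input: String): ParseResult[String] =
    number(new CharSequenceReader(input))

  // Pattern match on the three concrete ParseResult subclasses.
  def describe(result: ParseResult[String]): String = result match {
    case Success(value, _)  => "parsed " + value
    case Failure(msg, next) => "failure at " + next.pos + ": " + msg
    case Error(msg, next)   => "error at " + next.pos + ": " + msg
  }
}
```

Failure allows an enclosing alternative to try another branch, while Error aborts the parse, so callers usually want to report both.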
  • Combining Parser[T]’s
    • Signature for Parser[T].| method:
    • def |[U >: T](q : => Parser[U]) : Parser[U]
      • Parser Combinator for alternative composition (OR)
      • Succeeds (returns Parsers.Success) if either “this” Parser[T] succeeds or “q” Parser[U] succeeds
      • Type U must be the same as, or a supertype of, type T.
  • Combining Parser[T]’s
    • Signature of Parser[T].~ method:
    • def ~[U](p : => Parser[U]) : Parser[~[T, U]]
      • Parser Combinator for sequential composition
      • Succeeds only if “this” Parser succeeds and then the “p” Parser succeeds
      • Returns an instance of “~” that contains both results
        • Yes, “~” is also a class !
        • Like a Pair, but easier to pattern match on
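The two combinators can be exercised together in a short sketch. It assumes the scala-parser-combinators module is available; CombineDemo, keyword, digits, and show are hypothetical names used only here.

```scala
import scala.util.parsing.combinator.RegexParsers

object CombineDemo extends RegexParsers {
  // Alternative composition: try "FD" first, then "FORWARD".
  def keyword: Parser[String] = "FD" | "FORWARD"
  def digits: Parser[String] = """\d+""".r

  // Sequential composition: the result type is the case class ~[String, String].
  def forward: Parser[String ~ String] = keyword ~ digits

  // The ~ result can be taken apart with an infix pattern.
  def show(input: String): String =
    parseAll(forward, input) match {
      case Success(kw ~ n, _) => kw + " then " + n
      case other              => other.toString
    }
}
```

Writing the pattern as kw ~ n mirrors the shape of the grammar rule, which is what makes ~ more convenient to match on than a plain pair.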
  • Forward March
    • Back to the specification of forward:
    • def forward = ("FD"|"FORWARD")~positiveInteger
      • For this combinator to succeed,
        • Either the Parser for the literal “FD” or the one for “FORWARD” must succeed
        • And then the Parser for the positiveInteger Regex must succeed
      • Both the literal strings and the Regex returned by positiveInteger are implicitly converted to Parser[String]
  • Repetition
    • Next lines of note:
    • def repeat =
    • "REPEAT" ~ positiveInteger ~ "[" ~ rep(statement) ~ "]"
    • def statement:Parser[Any] =
    • forward | right | left | repeat
      • The type of either repeat or statement needs to be explicitly specified because the two are mutually recursive
    • The rep method specifies that a Parser can be repeated
  • Repetition
    • Signature for Parsers.rep method:
    • def rep[T](p : => Parser[T]) : Parser[List[T]]
      • Parser Combinator for repetitions
      • Parses input until Parser, p, fails. Returns consecutive successful results as List.
  • Other Forms of Repetition
    • def repsep[T](p: => Parser[T], q: => Parser[Any]) : Parser[List[T]]
      • Specifies a Parser to be interleaved in the repetition
      • Example: repsep(term, ",")
    • def rep1[T](p: => Parser[T]): Parser[List[T]]
      • Parses non-empty repetitions
    • def repN[T](n : Int, p : => Parser[T]) : Parser[List[T]]
      • Parses a specified number of repetitions
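The repetition combinators listed on the last two slides behave as follows in a minimal sketch. It assumes the scala-parser-combinators module is on the classpath; RepDemo and its member names are hypothetical.

```scala
import scala.util.parsing.combinator.RegexParsers

object RepDemo extends RegexParsers {
  def num: Parser[String] = """\d+""".r

  // Zero or more occurrences.
  def many: Parser[List[String]] = rep(num)
  // Zero or more occurrences separated by commas; the separators are dropped.
  def csv: Parser[List[String]] = repsep(num, ",")
  // One or more occurrences.
  def atLeastOne: Parser[List[String]] = rep1(num)
  // Exactly two occurrences.
  def exactlyTwo: Parser[List[String]] = repN(2, num)
}
```

With these definitions, parseAll(RepDemo.many, "") succeeds with an empty list, whereas parseAll(RepDemo.atLeastOne, "") fails.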
  • Execution
    • Root Parser Generator:
    • def program = rep(statement)
    • To Execute the Parser
    • parseAll(program, "REPEAT 4 [FD 100 RT 90]")
      • Returns Parsers.Success[List[Parsers.~[…]]]
        • Remember, Parsers.Success[T] is a subclass of ParseResult[T]
        • toString:
    • [1.24] parsed: List(((((REPEAT~4)~[)~List((FD~100), (RT~90)))~]))
        • The “…” stands for many levels of nested Parsers.~ types
  • Not-so-Happy Path
    • Example of failed Parsing:
    • parseAll(program, "REPEAT 4 [FD 100 RT 90)")
      • Returns Parsers.Failure
        • Subclass of ParseResult[Nothing]
      • toString:
      • [1.23] failure: `]' expected but `)' found
      • REPEAT 4 [FD 100 RT 90)
      • ^
        • Failure message not always so “precise”
  • Making Something Useful
    • Successful parse results need to be transformed into something that can be evaluated
      • Enter the “eye brows” method of Parser[T]:
      • def ^^[U](f : (T) => U) : Parser[U]
        • Parser combinator for function application
  • Eye Brows Example
      • Example of “^^” method:
        • def positiveInteger = """\d+""".r ^^
        • { x:String => x.toInt }
      • Now positiveInteger generates Parser[Int] instead of Parser[String]
      • Transformer can be shortened to “{ _.toInt }”
  • Implementing Commands
    • For the statements we need a hierarchy of command classes:
    • sealed abstract class LogoCommand
    • case class Forward(x: Int) extends LogoCommand
    • case class Turn(x: Int) extends LogoCommand
    • case class Repeat(i: Int, e: List[LogoCommand]) extends LogoCommand
  • Transforming into Commands
    • The Forward command:
    • def forward = ("FD"|"FORWARD")~positiveInteger ^^
    • { case _~value => Forward(value) }
      • A ~[String, Int] is passed into the transformer
      • Pattern matching extracts the Int, value, and constructs a Forward instance
        • Forward is a case class, so “new” not needed
      • Case constructs can be partial functions themselves. Longer form:
    • … ^^ { tilde => tilde match
    • { case _~value => Forward(value) }}
  • Derivatives of “~”
    • Two methods related to “~”:
      • def <~ [U](p: => Parser[U]): Parser[T]
        • Parser combinator for sequential composition which keeps only the left result
      • def ~> [U](p: => Parser[U]): Parser[U]
        • Parser combinator for sequential composition which keeps only the right result
      • Note, neither returns a “~” instance
    • The forward method can be simplified:
    • def forward = ("FD"|"FORWARD")~>positiveInteger ^^
    • { Forward(_) }
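A brief sketch of the two derived combinators, assuming the scala-parser-combinators module is on the classpath; KeepDemo, digits, and bracketed are hypothetical names for this example.

```scala
import scala.util.parsing.combinator.RegexParsers

object KeepDemo extends RegexParsers {
  def digits: Parser[String] = """\d+""".r

  // "[" ~> digits keeps the right result (the digits);
  // ... <~ "]" then keeps the left result (still the digits).
  // Both brackets must be present, but neither appears in the result.
  def bracketed: Parser[String] = "[" ~> digits <~ "]"
}
```

One point to watch: Scala derives operator precedence from an operator's first character, so <~ binds less tightly than ~ and ~>. An expression such as a <~ b ~ c therefore groups as a <~ (b ~ c) and discards both b and c; parentheses make the intent explicit.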
  • Updated Parser
    • def positiveInteger = """\d+""".r ^^ { _.toInt }
    • def forward = ("FD"|"FORWARD")~>positiveInteger ^^
    • { Forward(_) }
    • def right = ("RT"|"RIGHT")~>positiveInteger ^^
    • { x => Turn(-x) }
    • def left = ("LT"|"LEFT")~>positiveInteger ^^
    • { Turn(_) }
    • def repeat = "REPEAT" ~> positiveInteger ~ "[" ~ rep(statement) <~ "]" ^^
    • { case number~_~statements => Repeat(number, statements)}
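For readers who want to run the updated parser, here is one way the pieces from the slides might be assembled into a single file. It is a sketch under assumptions: the scala-parser-combinators module must be on the classpath, and the explicit Parser[LogoCommand] return types, as well as the statement and program definitions typed that way, are additions not shown on this slide.

```scala
import scala.util.parsing.combinator.RegexParsers

sealed abstract class LogoCommand
case class Forward(x: Int) extends LogoCommand
case class Turn(x: Int) extends LogoCommand
case class Repeat(i: Int, e: List[LogoCommand]) extends LogoCommand

object LogoParser extends RegexParsers {
  def positiveInteger: Parser[Int] = """\d+""".r ^^ { _.toInt }

  def forward: Parser[LogoCommand] =
    ("FD" | "FORWARD") ~> positiveInteger ^^ { Forward(_) }

  // A right turn is stored as a negative angle, as on the slide.
  def right: Parser[LogoCommand] =
    ("RT" | "RIGHT") ~> positiveInteger ^^ { x => Turn(-x) }

  def left: Parser[LogoCommand] =
    ("LT" | "LEFT") ~> positiveInteger ^^ { Turn(_) }

  def repeat: Parser[LogoCommand] =
    "REPEAT" ~> positiveInteger ~ "[" ~ rep(statement) <~ "]" ^^ {
      case number ~ _ ~ statements => Repeat(number, statements)
    }

  // Assumed typing: statement and repeat are mutually recursive, so at
  // least one of them needs an explicit result type.
  def statement: Parser[LogoCommand] = forward | right | left | repeat

  def program: Parser[List[LogoCommand]] = rep(statement)
}
```

Typing statement as Parser[LogoCommand] rather than Parser[Any] is what lets Repeat(number, statements) type-check, since Repeat expects a List[LogoCommand].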
  • Updated Parser Results
    • Executing the Parser now:
    • parseAll(program, "REPEAT 4 [FD 100 RT 90]")
      • Results:
    • [1.24] parsed: List(Repeat(4,List(Forward(100), Turn(-90))))
      • Returns Parsers.Success[List[Repeat]]
      • This can be evaluated !!
  • Evaluation
    • class LogoEvaluationState {
    • var x = 0
    • var y = 0
    • var heading = 0
    • }
    • implicit def dblToInt(d: Double):Int = if (d > 0) (d+0.5).toInt else (d-0.5).toInt
    • def parse(s: String) : List[LogoCommand] = LogoParser.parse(s).get
    • def evaluate(parseResult: LogoParser.ParseResult[List[LogoCommand]], g:Graphics2D) {
    • var state = new LogoEvaluationState
    • if (parseResult.successful) {
    • evaluate(parseResult.get, g, state)
    • }
    • // draw turtle
    • evaluate(parse("RT 90 FD 3 LT 110 FD 10 LT 140 FD 10 LT 110 FD 3"), g, state)
    • } // Continued...
  • Evaluation (More Functional)
    • private def evaluate(list: List[LogoCommand], g:Graphics2D, state:LogoEvaluationState) {
    • if (!list.isEmpty) {
    • val head :: tail = list
    • head match {
    • case Forward(distance) => {
    • val (nextX, nextY) =
    • (state.x + distance * sin(toRadians(state.heading)),
    • state.y + distance * cos(toRadians(state.heading)))
    • g.drawLine(state.x, state.y, nextX, nextY)
    • state.x = nextX
    • state.y = nextY
    • evaluate(tail, g, state)
    • }
    • case Turn(degrees) => {
    • state.heading += degrees
    • evaluate(tail, g, state)
    • }
    • case Repeat(0, _) => evaluate(tail, g, state)
    • case Repeat(count, statements) =>
    • evaluate(statements ::: Repeat(count-1, statements)::tail, g, state)
    • }
    • }
    • }
  • Evaluation (More Imperative)
    • def evaluate(list: List[LogoCommand], g:Graphics2D, state:LogoEvaluationState) { list.foreach(evaluate(_, g, state))
    • }
    • def evaluate(command:LogoCommand, g:Graphics2D, state:LogoEvaluationState) {
    • command match {
    • case Forward(distance) => {
    • val (nextX, nextY) = (state.x + distance * Math.sin(Math.toRadians(state.heading)),
    • state.y + distance * Math.cos(Math.toRadians(state.heading)))
    • g.drawLine(state.x, state.y, nextX, nextY)
    • state.x = nextX
    • state.y = nextY
    • }
    • case Turn(degrees) => state.heading += degrees
    • case Repeat(count, statements) => (0 until count).foreach { _ =>
    • evaluate(statements, g, state)
    • }
    • }
    • }
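The evaluators above draw on a Graphics2D, which makes them awkward to check without a window. As a rough illustration only, the same movement rules can be applied without any drawing by collecting the line segments instead. This is not the talk's implementation: TraceDemo, Turtle, Segment, and trace are hypothetical names, the state is immutable rather than a mutable LogoEvaluationState, and rounding uses math.round.

```scala
sealed abstract class LogoCommand
case class Forward(x: Int) extends LogoCommand
case class Turn(x: Int) extends LogoCommand
case class Repeat(i: Int, e: List[LogoCommand]) extends LogoCommand

object TraceDemo {
  // Hypothetical immutable stand-in for LogoEvaluationState.
  final case class Turtle(x: Int, y: Int, heading: Int)
  type Segment = ((Int, Int), (Int, Int))

  // Returns the final turtle state and the segments that would be drawn.
  def trace(commands: List[LogoCommand], start: Turtle): (Turtle, List[Segment]) =
    commands.foldLeft((start, List.empty[Segment])) {
      case ((t, segs), Forward(distance)) =>
        val rad = math.toRadians(t.heading.toDouble)
        val nextX = t.x + math.round(distance * math.sin(rad)).toInt
        val nextY = t.y + math.round(distance * math.cos(rad)).toInt
        val seg: Segment = ((t.x, t.y), (nextX, nextY))
        (t.copy(x = nextX, y = nextY), segs :+ seg)
      case ((t, segs), Turn(degrees)) =>
        (t.copy(heading = t.heading + degrees), segs)
      case ((t, segs), Repeat(count, statements)) =>
        // Run the body exactly `count` times.
        (1 to count).foldLeft((t, segs)) { case ((t2, s2), _) =>
          val (t3, s3) = trace(statements, t2)
          (t3, s2 ++ s3)
        }
    }
}
```

Tracing Repeat(4, List(Forward(100), Turn(-90))) from Turtle(0, 0, 0) yields four segments and brings the turtle back to the origin, a quick check that the repeat count is applied the intended number of times.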
  • Demonstration