Parsers Combinators in Scala, Ilya @lambdamix Kliuchnikov
Talk on Parsers Combinators in Scala by Ilya @lambdamix Kliuchnikov at scalaby#8 (scala.by)


Published in: Technology, Education
Transcript

  • 1. Parser Combinators in Scala. Ilya Kliuchnikov, @lambdamix
  • 2. Combinator libraries
      ● Actors
      ● Parsers
      ● ScalaCheck, Specs
      ● Scalaz
      ● SBT
      ● EDSLs
      ● ...
  • 3. 33/35 11/14 8/9 4/13
  • 4. Intro: combinators, parsers; Scala Parser Combinators from the Ground Up; How to write a typical parser; Pros, cons; Advanced techniques
  • 5. Parser?
      ● Transforms text into a structure
      [Diagram: the expression 2*3 + 4 and its syntax tree]
  • 6. Hello, parser

      import scala.util.parsing.combinator._
      import syntactical.StandardTokenParsers

      sealed trait Expr
      case class Num(i: Int) extends Expr
      case class Var(n: String) extends Expr
      case class Plus(e1: Expr, e2: Expr) extends Expr
      case class Mult(e1: Expr, e2: Expr) extends Expr

      object ArithParsers extends StandardTokenParsers with ImplicitConversions {
        lexical.delimiters += ("(", ")", "+", "*")
        def expr: Parser[Expr] = term ~ ("+" ~> expr) ^^ Plus | term
        def term: Parser[Expr] = factor ~ ("*" ~> term) ^^ Mult | factor
        def factor: Parser[Expr] =
          numericLit ^^ { s => Num(s.toInt) } | ident ^^ Var | "(" ~> expr <~ ")"
        def parseExpr(s: String) = phrase(expr)(new lexical.Scanner(s))
      }

      scala> ArithParsers.parseExpr("1")
      res1: ArithParsers.ParseResult[parsers2.Expr] = [1.2] parsed: Num(1)

      scala> ArithParsers.parseExpr("1 + 1 * 2")
      res2: ArithParsers.ParseResult[parsers2.Expr] = [1.10] parsed: Plus(Num(1),Mult(Num(1),Num(2)))

      scala> ArithParsers.parseExpr("a * (a * a)")
      res3: ArithParsers.ParseResult[parsers2.Expr] = [1.12] parsed: Mult(Var(a),Mult(Var(a),Var(a)))
  • 7. Example 2: Lambda calculus

      t ::=          terms:
            x        variable
            λx.t     abstraction
            t t      application

      x y z = ((x y) z)
      λx.λy.y = λx.(λy.y)
  • 8. Example 2

      sealed trait Term
      case class Var(n: String) extends Term
      case class Lam(v: Var, body: Term) extends Term
      case class App(t1: Term, t2: Term) extends Term

      object LamParsers extends StandardTokenParsers with ImplicitConversions with PackratParsers {
        // "\\" is the backslash that stands for λ in the input
        lexical.delimiters += ("(", ")", ".", "\\")
        lazy val term: PackratParser[Term] = appTerm | lam
        lazy val vrb: PackratParser[Var] = ident ^^ Var
        lazy val lam: PackratParser[Term] = ("\\" ~> vrb) ~ ("." ~> term) ^^ Lam
        lazy val appTerm: PackratParser[Term] = appTerm ~ aTerm ^^ App | aTerm
        lazy val aTerm: PackratParser[Term] = vrb | "(" ~> term <~ ")"
        def parseTerm(s: String) = phrase(term)(new lexical.Scanner(s))
      }

      scala> LamParsers.parseTerm("x y z")
      res1: LamParsers.ParseResult[parsers.Term] = [1.6] parsed: App(App(Var(x),Var(y)),Var(z))

      scala> LamParsers.parseTerm("""\x.\y.x y""")
      res2: LamParsers.ParseResult[parsers.Term] = [1.10] parsed: Lam(Var(x),Lam(Var(y),App(Var(x),Var(y))))

      scala> LamParsers.parseTerm("""(\x.x x) (\x. x x)""")
      res3: LamParsers.ParseResult[parsers.Term] = [1.19] parsed: App(Lam(Var(x),App(Var(x),Var(x))),Lam(Var(x),App(Var(x),Var(x))))
  • 9. Combinators
  • 10. Combinator libraries
      ● Actors
      ● Parsers
      ● ScalaCheck, Specs
      ● Scalaz
      ● SBT
      ● EDSLs
      ● ...
  • 11. Principles of combinator libraries
      ● The library's terminology matches the terminology of the problem domain.
      ● Components:
          ● types,
          ● primitives,
          ● first-order combinators,
          ● higher-order combinators.
      ● Closure property (compositionality).
      ● Possibility of an efficient implementation.
      E. Kirpichev. Elements of Functional Languages. Practice of Functional Programming, No. 3.
  • 12. Parsers
  • 13. Problem domain
      ● Grammars
          ● Regular
          ● Context-free
          ● Left-recursive
          ● Right-recursive
          ● Attribute
          ● Boolean
          ● PEG
          ● ...
      ● Parsers
          ● LL parsers
          ● LR parsers
          ● Top-down
          ● Bottom-up
          ● GLL
          ● Packrat parsers
          ● Parsing with derivatives
  • 14. Problem domain
  • 15. Approaches to building parsers
      ● Parser generators
          ● Yacc
          ● Lex
          ● JavaCC
          ● ANTLR
          ● Rats!
      ● Hand-written
          ● Low-level
          ● High-level
  • 16. Parsers in Scala
      ● C9 Lectures: Dr. Erik Meijer - Functional Programming Fundamentals, Chapter 8 of 13
      ● A. Moors, F. Piessens, M. Odersky. Parser Combinators in Scala. Report CW 49 // Feb 2008
  • 17. Scala parser combinators are a form of recursive descent parsing with infinite backtracking.
  • 18. Parsers in Scala are functional
      Background:
      ● W. Burge. Recursive Programming Techniques. Addison-Wesley, 1975.
      ● Ph. Wadler. How to Replace Failure by a List of Successes. A method for exception handling, backtracking, and pattern matching in lazy functional languages // 1985
      ● G. Hutton. Higher-order functions for parsing // Journal of functional programming. 1992/2
      ● J. Fokker. Functional Parsers // 1995
  • 19. Parser?
      ● Transforms text into a structure
      [Diagram: the expression 2*3 + 4 and its syntax tree]
  • 20. A parser is a function

      type Parser[A] = String => A

      Functions of this type do not compose, and a parser need not consume the whole string:

      type Parser[A] = String => (A, String)

      Parsing may fail:

      type Parser[A] = String => Option[(A, String)]
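A minimal sketch of the last signature on this slide, using only the standard library. The names `satisfy`, `seq`, `digit` and `twoDigits` are illustrative assumptions and do not come from the talk.

```scala
object FunctionParsers {
  // A parser consumes a prefix of the input and returns the value plus the
  // leftover input, or None when it cannot match.
  type Parser[A] = String => Option[(A, String)]

  // Primitive: accept one character that satisfies a predicate.
  def satisfy(p: Char => Boolean): Parser[Char] =
    in => if (in.nonEmpty && p(in.head)) Some((in.head, in.tail)) else None

  // Sequencing: run `first`, then run `second` on what `first` left over.
  def seq[A, B](first: Parser[A], second: Parser[B]): Parser[(A, B)] =
    in => first(in) match {
      case Some((a, rest1)) => second(rest1).map { case (b, rest2) => ((a, b), rest2) }
      case None             => None
    }

  val digit: Parser[Char] = satisfy(_.isDigit)
  val twoDigits: Parser[(Char, Char)] = seq(digit, digit)
}
```

Returning the leftover input is what makes `seq` possible: the second parser simply starts where the first one stopped.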
  • 21. Attempt #1
  • 22. Results

      trait SimpleResults {
        type Input
        trait Result[+T] { def next: Input }
        case class Success[+T](result: T, next: Input) extends Result[T]
        case class Failure(msg: String, next: Input) extends Result[Nothing]
      }

      object XParser extends SimpleResults {
        type Input = String
        val acceptX: Input => Result[Char] = { (in: String) =>
          if (in.charAt(0) == 'x') Success('x', in.substring(1))
          else Failure("expected an x", in)
        }
      }

      scala> XParser.acceptX("xyz")
      res0: parsers.XParser.Result[Char] = Success(x,yz)

      scala> XParser.acceptX("yz")
      res1: parsers.XParser.Result[Char] = Failure(expected an x,yz)
  • 23. The basis: Parser, |, ~, accept

      trait SimpleParsers extends SimpleResults {
        trait Parser[+T] extends (Input => Result[T]) {
          def apply(in: Input): Result[T]

          // alternation: try p only if this parser fails
          def |[U >: T](p: => Parser[U]): Parser[U] = new Parser[U] {
            def apply(in: Input) = Parser.this(in) match {
              case Failure(_, _) => p(in)
              case Success(x, n) => Success(x, n)
            }
          }

          // sequencing: run p on the input left over by this parser
          def ~[U](p: => Parser[U]): Parser[(T, U)] = new Parser[(T, U)] {
            def apply(in: Input) = Parser.this(in) match {
              case Success(x, next) => p(next) match {
                case Success(x2, next2) => Success((x, x2), next2)
                case Failure(m, n) => Failure(m, n)
              }
              case Failure(m, n) => Failure(m, n)
            }
          }
        }
      }

      trait StringParsers extends SimpleParsers {
        type Input = String
        private val EOI = 0.toChar

        def accept(expected: Char) = new Parser[Char] {
          def apply(in: String) =
            if (in == "") {
              if (expected == EOI) Success(expected, "")
              else Failure("no more input", in)
            } else if (in.charAt(0) == expected) Success(expected, in.substring(1))
            else Failure("expected '" + expected + "'", in)
        }

        def eoi = accept(EOI)
      }
  • 24. The simplest parser

      object OXOParser extends StringParsers {
        def oxo = accept('o') ~ accept('x') ~ accept('o')
        def oxos: Parser[Any] = (oxo ~ accept(' ') ~ oxos | oxo)
      }

      scala> OXOParser.oxos("123")
      res2: parsers.OXOParser.Result[Any] = Failure(expected 'o',123)

      scala> OXOParser.oxos("oxo")
      res3: parsers.OXOParser.Result[Any] = Success(((o,x),o),)

      scala> OXOParser.oxos("oxo oxo")
      res4: parsers.OXOParser.Result[Any] = Success(((((o,x),o), ),((o,x),o)),)

      scala> OXOParser.oxos("oxo oxo 1")
      res5: parsers.OXOParser.Result[Any] = Success(((((o,x),o), ),((o,x),o)), 1)

      scala> (OXOParser.oxos ~ OXOParser.eoi)("oxo oxo 1")
      res6: parsers.OXOParser.Result[(Any, Char)] = Failure(expected '?', 1)
  • 25. Be careful!

      The parameters of | and ~ are call-by-name (bodies as on slide 23):

      def |[U >: T](p: => Parser[U]): Parser[U]       // call-by-name param
      def ~[U](p: => Parser[U]): Parser[(T, U)]       // call-by-name param

      object OXOParser extends StringParsers {
        def oxo = accept('o') ~ accept('x') ~ accept('o')
        def oxos: Parser[Any] = (oxo ~ accept(' ') ~ oxos | oxo)
      }

  • 26. Be careful!

      With call-by-value parameters (bodies otherwise as on slide 23), building the recursive oxos parser never terminates:

      def |[U >: T](p: Parser[U]): Parser[U]          // call-by-value param
      def ~[U](p: Parser[U]): Parser[(T, U)]          // call-by-value param

      scala> OXOParser.oxos("123")
      java.lang.StackOverflowError
        at parsers.OXOParser$.oxo(stepbystep.scala:67)
        at parsers.OXOParser$.oxos(stepbystep.scala:69)
        at parsers.OXOParser$.oxos(stepbystep.scala:69)
        at parsers.OXOParser$.oxos(stepbystep.scala:69)
        at parsers.OXOParser$.oxos(stepbystep.scala:69)
        at parsers.OXOParser$.oxos(stepbystep.scala:69)
        at parsers.OXOParser$.oxos(stepbystep.scala:69)
        ...
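The effect of call-by-name parameters can be reproduced with a tiny stand-alone combinator set. This is a sketch rather than the talk's code: `P`, `orElse`, `andThen`, `char` and `countAs` are hypothetical names. Because the arguments are by-name, the recursive reference to `countAs` is only expanded when a running parser actually reaches it.

```scala
object ByNameDemo {
  type Result[A] = Option[(A, String)]

  abstract class P[A] { self =>
    def apply(in: String): Result[A]

    def map[B](f: A => B): P[B] = new P[B] {
      def apply(in: String): Result[B] =
        self(in).map { case (a, rest) => (f(a), rest) }
    }

    // `alt` is by-name: it is evaluated only when `self` fails.
    def orElse(alt: => P[A]): P[A] = new P[A] {
      def apply(in: String): Result[A] = self(in) match {
        case None => alt(in)
        case ok   => ok
      }
    }

    // `next` is by-name: it is evaluated only after `self` succeeds.
    def andThen[B](next: => P[B]): P[(A, B)] = new P[(A, B)] {
      def apply(in: String): Result[(A, B)] = self(in) match {
        case Some((a, r1)) => next(r1).map { case (b, r2) => ((a, b), r2) }
        case None          => None
      }
    }
  }

  def char(c: Char): P[Char] = new P[Char] {
    def apply(in: String): Result[Char] =
      if (in.nonEmpty && in.head == c) Some((c, in.tail)) else None
  }

  // Recursive grammar  as ::= 'a' as | 'a' ; the result counts the a's.
  def countAs: P[Int] = {
    val more = char('a').andThen(countAs).map { case (_, n) => n + 1 }
    val one  = char('a').map(_ => 1)
    more.orElse(one)
  }
}
```

If `andThen` took `next: P[B]` instead, the call `andThen(countAs)` would evaluate `countAs` eagerly, which calls `andThen(countAs)` again, and so on until the stack overflows.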
  • 27. Attempt #2 (Factoring out Plumbing)
  • 28. Where is the problem? (The SimpleParsers code of slide 23 and the OXOParser of slide 24 are shown again.)
  • 29. Too much “threading” (The same code is shown again.)
  • 30. Improved Results

      trait SimpleResults {
        type Input

        trait Result[+T] {
          def next: Input
          def map[U](f: T => U): Result[U]
          def flatMapWithNext[U](f: T => Input => Result[U]): Result[U]
          def append[U >: T](alt: => Result[U]): Result[U]
        }

        case class Success[+T](result: T, next: Input) extends Result[T] {
          def map[U](f: T => U) = Success(f(result), next)
          def flatMapWithNext[U](f: T => Input => Result[U]) = f(result)(next)
          def append[U >: T](alt: => Result[U]) = this
        }

        case class Failure(msg: String, next: Input) extends Result[Nothing] {
          def map[U](f: Nothing => U) = this
          def flatMapWithNext[U](f: Nothing => Input => Result[U]) = this
          def append[U](alt: => Result[U]) = alt
        }
      }

      ● map - ...
      ● flatMapWithNext - ...
      ● append - for multiple results (we do not consider it here)

  • 31. Parser is a function with many results

      type Parser[A] = String => A
      type Parser[A] = String => (A, String)
      type Parser[A] = String => Option[(A, String)]
      type Parser[A] = String => List[(A, String)]
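The last signature corresponds to the "list of successes" idea from the Wadler paper cited on slide 18. A small sketch under that signature follows; `char`, `alt` and `many` are hypothetical helper names, not part of the talk.

```scala
object ListParsers {
  // Every way of parsing a prefix of the input, paired with the leftover input.
  type Parser[A] = String => List[(A, String)]

  def char(c: Char): Parser[Char] =
    in => if (in.nonEmpty && in.head == c) List((c, in.tail)) else Nil

  // Alternation keeps the results of both branches instead of picking one.
  def alt[A](p: Parser[A], q: Parser[A]): Parser[A] =
    in => p(in) ++ q(in)

  // Zero or more repetitions: returns every possible split, longest match first.
  def many[A](p: Parser[A]): Parser[List[A]] =
    in => {
      val longer = for {
        (a, r1)    <- p(in)
        (rest, r2) <- many(p)(r1)
      } yield (a :: rest, r2)
      longer ++ List((List.empty[A], in))
    }
}
```

An empty list plays the role of failure, so no separate failure case is needed; the price is that all alternatives are carried along.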
  • 32. After improving

      SimpleResults is unchanged from slide 30. The parsers become:

      trait SimpleParsers extends SimpleResults {
        abstract class Parser[+T] extends (Input => Result[T]) {
          def apply(in: Input): Result[T]

          def flatMap[U](f: T => Parser[U]): Parser[U] = new Parser[U] {
            def apply(in: Input) = Parser.this(in) flatMapWithNext (f)
          }

          def map[U](f: T => U): Parser[U] = new Parser[U] {
            def apply(in: Input) = Parser.this(in) map (f)
          }

          def |[U >: T](p: => Parser[U]): Parser[U] = new Parser[U] {
            def apply(in: Input) = Parser.this(in) append p(in)
          }

          // Hey!
          def ~[U](p: => Parser[U]): Parser[(T, U)] =
            for (a <- this; b <- p) yield (a, b)
        }
      }

  • 33. So, Parser is a Monad!!
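To make the monad remark concrete, here is a self-contained sketch in which a parser with `map` and `flatMap` is used in a for-comprehension. It deliberately uses a simplified `Option`-based result instead of the talk's `Result` type; `Parser`, `char`, `digit` and `sum` are assumptions made for this illustration.

```scala
object MonadicParsers {
  final case class Parser[A](run: String => Option[(A, String)]) {
    def map[B](f: A => B): Parser[B] =
      Parser[B](in => run(in).map { case (a, rest) => (f(a), rest) })

    // Run this parser, then the parser chosen by its result, on the leftover input.
    def flatMap[B](f: A => Parser[B]): Parser[B] =
      Parser[B](in => run(in).flatMap { case (a, rest) => f(a).run(rest) })
  }

  def char(c: Char): Parser[Char] = Parser[Char] { in =>
    if (in.nonEmpty && in.head == c) Some((c, in.tail)) else None
  }

  val digit: Parser[Int] = Parser[Int] { in =>
    if (in.nonEmpty && in.head.isDigit) Some((in.head.asDigit, in.tail)) else None
  }

  // "d+d": the compiler rewrites the for-comprehension into flatMap and map calls.
  val sum: Parser[Int] = for {
    a <- digit
    _ <- char('+')
    b <- digit
  } yield a + b
}
```

The threading of the leftover input lives entirely inside `flatMap`, so the grammar itself no longer mentions it.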
  • 34. Where is my “withFilter”?
      ● In Scala 2.10
      ● It was not easy...
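Why `withFilter` matters: an `if` guard inside a for-comprehension is translated into a `withFilter` call, so a parser type without that method cannot be used with guards. The sketch below adds it to a simplified, hypothetical `Parser` (not the library class) to show the translation.

```scala
object FilterParsers {
  final case class Parser[A](run: String => Option[(A, String)]) {
    def map[B](f: A => B): Parser[B] =
      Parser[B](in => run(in).map { case (a, rest) => (f(a), rest) })

    def flatMap[B](f: A => Parser[B]): Parser[B] =
      Parser[B](in => run(in).flatMap { case (a, rest) => f(a).run(rest) })

    // Called for `if` guards: the parse fails when the predicate rejects the value.
    def withFilter(p: A => Boolean): Parser[A] =
      Parser[A](in => run(in).filter { case (a, _) => p(a) })
  }

  val digit: Parser[Int] = Parser[Int] { in =>
    if (in.nonEmpty && in.head.isDigit) Some((in.head.asDigit, in.tail)) else None
  }

  // Translated by the compiler to digit.withFilter(d => d % 2 == 0).map(d => d).
  val evenDigit: Parser[Int] = for (d <- digit if d % 2 == 0) yield d
}
```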
  • 35. Removing noise...

      trait SimpleParsers extends SimpleResults {
        // Removing boilerplate: this factory replaces "new Parser { def apply ... }"
        def Parser[T](f: Input => Result[T]) = new Parser[T] { def apply(in: Input) = f(in) }

        abstract class Parser[+T] extends (Input => Result[T]) {
          def apply(in: Input): Result[T]
          def flatMap[U](f: T => Parser[U]): Parser[U] = Parser { in => Parser.this(in) flatMapWithNext (f) }
          def map[U](f: T => U): Parser[U] = Parser { in => Parser.this(in) map (f) }
          def |[U >: T](p: => Parser[U]): Parser[U] = Parser { in => Parser.this(in) append p(in) }
          def ~[U](p: => Parser[U]): Parser[(T, U)] = for (a <- this; b <- p) yield (a, b)
        }
      }

  • 36. Real Parsers
  • 37. Real Parsers

      package scala.util.parsing.combinator

      trait Parsers {
        type Elem
        type Input = Reader[Elem]   // stream annotated with coordinates

        sealed abstract class ParseResult[+T]
        case class Success[+T](result: T, override val next: Input) extends ParseResult[T]
        sealed abstract class NoSuccess(val msg: String, override val next: Input) extends ParseResult[Nothing]
        // controlling backtracking
        case class Failure(override val msg: String, override val next: Input) extends NoSuccess(msg, next)
        case class Error(override val msg: String, override val next: Input) extends NoSuccess(msg, next)
        ...
        abstract class Parser[+T] extends (Input => ParseResult[T]) {
          ...
        }
        // deconstructing sequencing
        case class ~[+a, +b](_1: a, _2: b) { override def toString = "("+ _1 +"~"+ _2 +")" }
      }

      package scala.util.parsing.input

      abstract class Reader[+T] {
        def first: T
        def rest: Reader[T]
      }

  • 38. Simplified picture

      package scala.util.parsing.combinator

      trait Parsers {
        type Elem
        type Input = Reader[Elem]
        sealed abstract class ParseResult[+T]
        abstract class Parser[+T] extends (Input => ParseResult[T]) {
          combinators
        }
        combinators
      }

  • 39. Combinators
  • 40. Basic Combinators

      package scala.util.parsing.combinator

      trait Parsers {
        def elem(kind: String, p: Elem => Boolean): Parser[Elem]
        def elem(e: Elem): Parser[Elem]
        implicit def accept(e: Elem): Parser[Elem]

        abstract class Parser[+T] extends (Input => ParseResult[T]) {
          def ~ [U](q: => Parser[U]): Parser[~[T, U]]
          def <~ [U](q: => Parser[U]): Parser[T]
          def ~! [U](p: => Parser[U]): Parser[~[T, U]]
          def | [U >: T](q: => Parser[U]): Parser[U]
          def ||| [U >: T](q0: => Parser[U]): Parser[U]
          def ^^ [U](f: T => U): Parser[U]
          def ^^^ [U](v: => U): Parser[U]
          def ^? [U](f: PartialFunction[T, U], error: T => String): Parser[U]
          def ^? [U](f: PartialFunction[T, U]): Parser[U]
          def >>[U](fq: T => Parser[U])
          def *: Parser[List[T]]
          def +: Parser[List[T]]
          def ?: Parser[Option[T]]
        }
      }

  • 41. Swiss army knife Combinators

      package scala.util.parsing.combinator

      trait Parsers {
        def commit[T](p: => Parser[T]): Parser[T]
        def accept[ES <% List[Elem]](es: ES): Parser[List[Elem]]
        def accept[U](expected: String, f: PartialFunction[Elem, U]): Parser[U]
        def failure(msg: String): Parser[Nothing]
        def err(msg: String): Parser[Nothing]
        def success[T](v: T): Parser[T]
        def rep[T](p: => Parser[T]): Parser[List[T]]
        def repsep[T](p: => Parser[T], q: => Parser[Any]): Parser[List[T]]
        def rep1[T](p: => Parser[T]): Parser[List[T]]
        def rep1[T](first: => Parser[T], p0: => Parser[T]): Parser[List[T]]
        def repN[T](num: Int, p: => Parser[T]): Parser[List[T]]
        def rep1sep[T](p : => Parser[T], q : => Parser[Any]): Parser[List[T]]
        def chainl1[T](p: => Parser[T], q: => Parser[(T, T) => T]): Parser[T]
        def chainl1[T, U](first: => Parser[T], p: => Parser[U], q: => Parser[(T, U) => T]): Parser[T]
        def chainr1[T, U](p: => Parser[T], q: => Parser[(T, U) => U], combine: (T, U) => U, first: U): Parser[U]
        def opt[T](p: => Parser[T]): Parser[Option[T]]
        def not[T](p: => Parser[T]): Parser[Unit]
        def guard[T](p: => Parser[T]): Parser[T]
        def positioned[T <: Positional](p: => Parser[T]): Parser[T]
        def phrase[T](p: Parser[T]): Parser[T]
      }

      Inspired by G. Hutton and E. Meijer. Monadic Parser Combinators.
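As an illustration of what a combinator such as `chainl1` does, here is a hand-rolled sketch over a simplified `Option`-based parser type. It is not the library implementation; only the name `chainl1` and the shape of its first signature are borrowed from the slide, and `digit`, `minus` and `expr` are hypothetical helpers.

```scala
object ChainDemo {
  type Parser[A] = String => Option[(A, String)]

  val digit: Parser[Int] =
    in => if (in.nonEmpty && in.head.isDigit) Some((in.head.asDigit, in.tail)) else None

  private val sub: (Int, Int) => Int = _ - _

  // Parses '-' and yields the function that combines the two operands.
  val minus: Parser[(Int, Int) => Int] =
    in => if (in.nonEmpty && in.head == '-') Some((sub, in.tail)) else None

  // p (op p)*, folded to the left: a - b - c is read as (a - b) - c.
  def chainl1[A](p: Parser[A], op: Parser[(A, A) => A]): Parser[A] = in => {
    @annotation.tailrec
    def loop(acc: A, input: String): (A, String) = {
      val step = for {
        (f, r1) <- op(input)
        (b, r2) <- p(r1)
      } yield (f(acc, b), r2)
      step match {
        case Some((next, rest)) => loop(next, rest)
        case None               => (acc, input) // no further "op p": stop here
      }
    }
    p(in).map { case (first, rest) => loop(first, rest) }
  }

  val expr: Parser[Int] = chainl1(digit, minus)
}
```

Folding to the left gives left-associative operators without writing a left-recursive grammar rule.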
  • 42. Lexing
  • 43. The simplest (low-level) parser

      trait SimplestParsers extends Parsers {
        type Elem = Char
        def whitespaceChar: Parser[Char] = elem("space char", ch => ch <= ' ' && ch != EofCh)
        def letter: Parser[Char] = elem("letter", _.isLetter)
        def whitespace: Parser[List[Char]] = rep(whitespaceChar)
        def ident: Parser[List[Char]] = rep1(letter)
        def parse[T](p: Parser[T], in: String): ParseResult[T] = p(new CharSequenceReader(in))
      }

      scala> val p1 = new SimplestParsers{}
      p1: java.lang.Object with parsers.SimplestParsers = $anon$1@17d59ff0

      scala> import p1._
      import p1._

      scala> parse(letter, "foo bar")
      res0: p1.ParseResult[Char] = [1.2] parsed: f

      scala> parse(ident, "foo bar")
      res1: p1.ParseResult[List[Char]] = [1.4] parsed: List(f, o, o)

      scala> parse(ident, "123")
      res2: p1.ParseResult[List[Char]] =
      [1.1] failure: letter expected

      123
      ^
  • 44. Towards AST

      trait Token
      case class Id(n: String) extends Token
      case class Num(n: String) extends Token
      case object ErrorToken extends Token

      trait TokenParsers extends Parsers {
        type Elem = Char
        private def whitespaceChar: Parser[Char] = elem("space char", ch => ch <= ' ' && ch != EofCh)
        def letter: Parser[Char] = elem("letter", _.isLetter)
        def digit: Parser[Char] = elem("digit", _.isDigit)
        def whitespace: Parser[List[Char]] = rep(whitespaceChar)
        def idLit: Parser[String] = rep1(letter) ^^ { _.mkString("") }
        def numLit: Parser[String] = rep1(digit) ^^ { _.mkString("") }
        def id: Parser[Token] = idLit ^^ Id
        def num: Parser[Token] = numLit ^^ Num
        def token = id | num
        def parse[T](p: Parser[T], in: String): ParseResult[T] = p(new CharSequenceReader(in))
      }
  • 45. Lexer/Scanner

      trait Scanners extends TokenParsers {
        class Scanner(in: Reader[Char]) extends Reader[Token] {
          def this(in: String) = this(new CharArrayReader(in.toCharArray()))

          // skip whitespace, then read one token; rest2 is the input after that token
          private val (tok, rest1, rest2) = whitespace(in) match {
            case Success(_, in1) => token(in1) match {
              case Success(tok, in2) => (tok, in1, in2)
              case ns: NoSuccess => (ErrorToken, ns.next, ns.next.rest)
            }
            case ns: NoSuccess => (ErrorToken, ns.next, ns.next.rest)
          }

          def first = tok
          def rest = new Scanner(rest2)
        }
      }

      scala> val scs = new Scanners {}
      scs: java.lang.Object with Scanners = $anon$1@68a750a

      scala> val reader = new scs.Scanner("foo 123")
      reader: scs.Scanner = Scanners$Scanner@6a75863f

      scala> reader.first
      res0: Token = Id(foo)

      scala> reader.rest.first
      res1: Token = Num(123)

      scala> reader.rest.rest.first
      res2: Token = ErrorToken

  • 46. Lexing: Reader[Char] → Low-level Parsing → Reader[Token]
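The two-stage pipeline can be sketched without the library. The version below is a simplification that produces a plain `List[Token]` instead of a lazy `Reader[Token]`; the token names mirror slide 44, while `MiniLexer` and `tokenize` are hypothetical.

```scala
object MiniLexer {
  sealed trait Token
  case class Id(n: String) extends Token
  case class Num(n: String) extends Token
  case object ErrorToken extends Token

  // Split the input into tokens, skipping whitespace between them.
  def tokenize(in: String): List[Token] = {
    val s = in.dropWhile(_.isWhitespace)
    if (s.isEmpty) Nil
    else if (s.head.isLetter) {
      val (word, rest) = s.span(_.isLetter)
      Id(word) :: tokenize(rest)
    } else if (s.head.isDigit) {
      val (digits, rest) = s.span(_.isDigit)
      Num(digits) :: tokenize(rest)
    } else List(ErrorToken) // stop at the first unrecognised character
  }
}
```

A grammar written over tokens can then match `Id` and `Num` values directly and never has to deal with whitespace.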
  • 47. Typical Parser
  • 48. RAM++
  • 49. AST
  • 50. Parser: implicit magic, “~” magic
  • 51. So, ...
      ● Parser combinators in Scala let you describe executable grammars in a form close to BNF.
      ● The internals of parser combinators are a genuine programming pearl.
      ● Internal DSL for External DSLs.
  • 52. Discussion (Parser Combinators vs Parser Generator)
  • 53. PROS
      ● The same language (Scala): no new tool to learn.
      ● An executable grammar is code that is always up to date.
      ● Conciseness plus rich expressiveness: LL(*) and beyond (including context-sensitive grammars).
      ● Parsing can be fused with something else.
      ● Modularity
  • 54. CONS
      ● Some simple things can be very hard to encode.
      ● Performance.
  • 55. Performance: the hand-written lift-json parser is 350 times faster than the version based on parser combinators (proof link)
  • 56. Packrat Parsers
  • 57. Parsing “9”: Too much backtracking

      The ArithParsers code of slide 6 is shown again; the rules in question are:

      def expr: Parser[Expr] = term ~ ("+" ~> expr) ^^ Plus | term
      def term: Parser[Expr] = factor ~ ("*" ~> term) ^^ Mult | factor
      def factor: Parser[Expr] =
        numericLit ^^ { s => Num(s.toInt) } | ident ^^ Var | "(" ~> expr <~ ")"

  • 58. Idea: Memoization (Really, Laziness)
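A rough sketch of the memoization idea, independent of the library's `PackratParsers`: results are cached per input offset, so an alternative that backtracks does not re-run a sub-parser at the same position. The grammar `sum ::= num '+' num | num` and the names `memo` and `numCalls` are assumptions chosen for the demonstration.

```scala
import scala.collection.mutable

object MemoDemo {
  // A parser over a fixed input, addressed by offset: (value, next offset).
  type Parser[A] = Int => Option[(A, Int)]

  // Wrap a parser so that each offset is parsed at most once.
  def memo[A](p: Parser[A]): Parser[A] = {
    val cache = mutable.Map.empty[Int, Option[(A, Int)]]
    pos => cache.getOrElseUpdate(pos, p(pos))
  }

  // Counts how often the underlying `num` parser runs while parsing `input`
  // with the grammar  sum ::= num '+' num | num.
  def numCalls(input: String, useMemo: Boolean): Int = {
    var calls = 0
    val rawNum: Parser[Int] = pos => {
      calls += 1
      if (pos < input.length && input(pos).isDigit) Some((input(pos).asDigit, pos + 1))
      else None
    }
    val num: Parser[Int] = if (useMemo) memo(rawNum) else rawNum
    val plus: Parser[Char] = pos =>
      if (pos < input.length && input(pos) == '+') Some(('+', pos + 1)) else None
    val sum: Parser[Int] = pos => {
      val first = for {
        (a, p1) <- num(pos)
        (_, p2) <- plus(p1)
        (b, p3) <- num(p2)
      } yield (a + b, p3)
      first.orElse(num(pos)) // backtrack to the second alternative
    }
    sum(0)
    calls
  }
}
```

On the input `"9"` the first alternative parses the number, fails on the missing `+`, and the second alternative asks for the same number again; with the cache that second request is answered without re-parsing.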
  • 59. + Left Recursion

      The LamParsers code of slide 8 is shown again. With PackratParsers the left-recursive rule is allowed:

      lazy val appTerm: PackratParser[Term] = appTerm ~ aTerm ^^ App | aTerm

  • 60. + Left Recursion (the same code again, with "lazy val" highlighted)
  • 61. Without Left Recursion

      sealed trait Term
      case class Var(n: String) extends Term
      case class Lam(v: Var, body: Term) extends Term
      case class App(t1: Term, t2: Term) extends Term

      object LamParsers extends StandardTokenParsers with ImplicitConversions {
        lexical.delimiters += ("(", ")", ".", "\\")
        lazy val term: Parser[Term] = appTerm | lam
        lazy val vrb: Parser[Var] = ident ^^ Var
        lazy val lam: Parser[Term] = ("\\" ~> vrb) ~ ("." ~> term) ^^ Lam
        // one or more atoms, folded into left-nested applications
        lazy val appTerm: Parser[Term] = (aTerm +) ^^ { _.reduceLeft(App) }
        lazy val aTerm: Parser[Term] = vrb | "(" ~> term <~ ")"
        def parseTerm(s: String) = phrase(term)(new lexical.Scanner(s))
      }
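The `reduceLeft` step is what restores the left-associative reading `x y z = ((x y) z)` from slide 7 once the left-recursive rule has been replaced by repetition. A stand-alone sketch of just that fold, with a hypothetical helper `applyAll`:

```scala
object AppFold {
  sealed trait Term
  case class Var(n: String) extends Term
  case class App(t1: Term, t2: Term) extends Term

  // Fold a non-empty list of atoms into left-nested applications:
  // List(x, y, z) becomes App(App(x, y), z).
  def applyAll(atoms: List[Term]): Term =
    atoms.reduceLeft[Term]((fn, arg) => App(fn, arg))
}
```

`reduceLeft` throws on an empty list, which is why the grammar uses `+` (one or more) rather than `*`.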
  • 62. Packrat Performance
  • 63. Other Parsers
      ● Parboiled parser (PEG parser)
      ● GLL parser
      ● Derivative combinators
      http://stackoverflow.com/questions/4423514/scala-parsers-availability-differences-and-combining
  • 64. Trends
      ● Merging two worlds
          ● Compositionality (Functional programming)
          ● Performance
  • 65. Thank you!
  • 66. https://github.com/ilya-klyuchnikov/tapl-scala
