Parsing Expression Grammars and Treetop

Published in: Technology

  1. Parsing expression grammars
  2. HELLO: @joaomilho (juan maiz, joão milho, john corn, יוחנן תירס)
  3. DSL:
       describe DSL do
         it { should be_valid_ruby }
       end
  4. LANGUAGE:
       Feature: Polish notation calculator
         Scenario: Sum
           Given I’ve pressed 4
           And then pressed space
           And then pressed 5
           When I press +
           Then it should display 9
  5. WRITE A LANGUAGE? WTF? In fact, you do it all the time: open files, parse stuff, turn it into strings, numbers, and arrays, and link it to code. Like when you wrote your first CSV parser. (Yeah, I know you did it...) You just haven’t used the right tool.
  6. CSV:
       grammar CSV
         rule file
           (line [\n])* line
         end
         rule line
           (value ',')* value
         end
         rule value
           [^,\n]+
         end
       end
  7. PARSE: moshe@interwebz.co.il
  8. PARSE: moshe@interwebz.co.il [USER]
  9. PARSE: moshe@interwebz.co.il [AT]
  10. PARSE: moshe@interwebz.co.il [DOMAIN]
  11. PARSE: moshe@interwebz.co.il [DOT]
  12. PARSE: moshe@interwebz.co.il [COMPL]
  13. PARSE: moshe@interwebz.co.il [DOT]
  14. PARSE: moshe@interwebz.co.il [COMPL]
  15. PARSE: moshe@interwebz.co.il [USER AT DOMAIN DOT COMPL DOT COMPL]
  16. PARSE: moshe@interwebz.co.il [USER AT DOMAIN DOT COMPL DOT COMPL]
  17. PARSE:
        grammar Email
          rule full_address
            user at domain dot compl dot? compl?
          end
          rule compl
            'co' / 'info' / 'br' / 'il'
          end
        end
  18. INTERPRET:
        rule compl
          'com' / 'info' / 'br' {
            def brazil?
              text_value == 'br'
            end
          }
        end
  19. RECURSION: 1
  20. RECURSION: (1)
  21. RECURSION: ((1))
  22. RECURSION: (((((((((((((1)))))))))))))
  23. RECURSION:
        grammar Recursion
          rule parens
            '(' parens ')' / '1'
          end
        end
  24. INTERPRET:
        grammar Personnummer
          rule birthdate
            [0-9] 6..8 <Birthdate>
          end
        end
  25. INTERPRET:
        grammar Molecule
          include Atom
          include Quantity
          rule molecule
            (atom quantity?)+
          end
        end
  26. CFG: Defined by Noam Chomsky in the 50s. (Yeah, that weird talking guy from the World Social Forum. For your info, he is a great linguist.)
        • Lex, Yacc
        • Idea: generates a language
  27. CFG OR: The dangling else. Consider this:
        'if(' exp ')' stmt 'else' stmt | 'if(' exp ')' stmt
  28. CFG: Which alert will be executed?
        if(false)
        if(true) alert('The second if')
        else alert('The else')
  29. CFG: Gotcha. None.
        • Generally, languages prefer to associate an else with the nearest if.
        • But this has to be done in a “meta” rule.
  30. PEG: Bryan Ford, 2004. http://pdos.csail.mit.edu/~baford/packrat/
        • Priorities, not ambiguities: (a / a b) is different from (a b / a)
        • Syntactic predicates (& and !)
        • Regular-expression-like
        • No left recursion
        • Idea: recognizes a language
  31. PEG PRIORITY OPERATOR: The dangling else fixed.
        'if(' exp ')' stmt 'else' stmt / 'if(' exp ')' stmt
  32. PEG:
        Result
        • linear time parsing using memoization
        • one and only one valid parse tree
        • no dangling else cases
        • easier to learn and use
        Drawbacks
        • still new
        • memory eater
        • not suitable for NLP
  33. TREETOP: treetop.rubyforge.org, rubyconf2007.confreaks.com/d1t1p5_treetop.html. Nathan Sobo (Pivotal Labs), Clifford Heath, Nick Kallen & more.
  34. TREETOP: arithmetic + lambda calculus + sql 0xCOFFEE full lang compiled to LLVM html css json odp hapag dcf scheme apache conf trxl lolcode card sloc hairball (haml) lordbishop (names)
  35. OUT THERE:
        • Ruby: Citrus, Parsley, NEG
        • C: peg/leg (tinyrb)
        • Java: Rats!
        • Smalltalk: PetitParser
        • Python: ppeg, pyPEG, pyparsing
        • Haskell: Frisby, Leafy
        • Perl 6: Perl6 rules (native!)
  36. Brainfuck
  37. BRAINFUCK: 3000 slots (tape: 0 0 0 0 0 0 0 0 0)
  38. BRAINFUCK: > (tape: 0 0 0 0 0 0 0 0 0)
  39. BRAINFUCK: (tape: 0 0 0 0 0 0 0 0 0)
  40. BRAINFUCK: < (tape: 0 0 0 0 0 0 0 0 0)
  41. BRAINFUCK: + (tape: 1 0 0 0 0 0 0 0 0)
  42. BRAINFUCK: - (tape: 0 0 0 0 0 0 0 0 0)
  43. BRAINFUCK: , (tape: 0 0 0 0 0 0 0 0 0)
  44. BRAINFUCK: , B (tape: 66 0 0 0 0 0 0 0 0)
  45. BRAINFUCK: . B (tape: 66 0 0 0 0 0 0 0 0)
  46. BRAINFUCK: [] (tape: 66 0 0 0 0 0 0 0 0)
  47. BRAINFUCK: [>+] (tape: 66 0 0 0 0 0 0 0 0)
  48. BRAINFUCK: [>+] (tape: 66 1 1 1 1 1 1 1 1)
  49. +++[>+++[>+++++++<-] <-] >>< +++[>++++++<-] >. <+++[>+++++ +++++++<-] >. <++[>-------- <-] >. <+++++++[>++<-] >. <+ [>+<-] >. <+++++[>--<-] >-. < +++[>++<-] >. <+[>-<-] >. <++ [>++<-] >+. <+++++++++++++ [>----<-] >.
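
The Brainfuck slides walk through the eight commands one tape state at a time. As a rough illustration, here is a minimal interpreter sketch in plain Ruby. It is not the Treetop-based version the talk presumably built. The names `run_brainfuck` and `TAPE_SIZE` are hypothetical, and the 8-bit cell wraparound, the wrapping tape pointer, and the "0 on end of input" behaviour are assumptions that the slides do not specify.

```ruby
# Minimal Brainfuck interpreter sketch (plain Ruby, no Treetop).
# TAPE_SIZE and run_brainfuck are hypothetical names for this example.
TAPE_SIZE = 3000 # the slides mention 3000 slots

def run_brainfuck(source, input = "")
  # Keep only the eight command characters; everything else is a comment.
  program = source.scan(/[<>+\-.,\[\]]/)

  # Pre-compute matching bracket positions so loops can jump directly.
  jumps = {}
  stack = []
  program.each_with_index do |op, i|
    if op == "["
      stack.push(i)
    elsif op == "]"
      start = stack.pop
      raise ArgumentError, "unmatched ] at #{i}" if start.nil?
      jumps[start] = i
      jumps[i] = start
    end
  end
  raise ArgumentError, "unmatched [" unless stack.empty?

  tape = Array.new(TAPE_SIZE, 0)
  ptr = 0              # current slot
  pc = 0               # index of the current command
  bytes = input.bytes  # pending input for ","
  output = +""

  while pc < program.size
    case program[pc]
    when ">"
      ptr = (ptr + 1) % TAPE_SIZE            # assumption: pointer wraps around
    when "<"
      ptr = (ptr - 1) % TAPE_SIZE
    when "+"
      tape[ptr] = (tape[ptr] + 1) % 256      # assumption: 8-bit cells
    when "-"
      tape[ptr] = (tape[ptr] - 1) % 256
    when ","
      tape[ptr] = bytes.shift || 0           # assumption: 0 when input runs out
    when "."
      output << tape[ptr].chr
    when "["
      pc = jumps[pc] if tape[ptr].zero?      # skip the loop body
    when "]"
      pc = jumps[pc] unless tape[ptr].zero?  # repeat the loop body
    end
    pc += 1
  end
  output
end

# 8 * 8 + 2 = 66, the byte value the slides show for "B".
puts run_brainfuck("++++++++[>++++++++<-]>++.") # prints "B"
```

Under these assumptions, feeding the program from slide 49 to `run_brainfuck` yields the string "Questions?".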

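
Slide 30 says that in a PEG `(a / a b)` differs from `(a b / a)`, and slide 23 defines a recursive `parens` rule. The sketch below models both with a few hand-rolled combinators in plain Ruby. This is only an illustration of ordered choice, not how Treetop is implemented. The helper names (`lit`, `seq`, `choice`, `full_match?`) and the constants are hypothetical, and there is no memoization, so it is not a packrat parser.

```ruby
# Each parser is a lambda: (input, pos) -> new position, or nil on failure.
# lit, seq, choice and full_match? are hypothetical helper names.

# Match a literal string at pos.
def lit(str)
  ->(input, pos) { input[pos, str.size] == str ? pos + str.size : nil }
end

# Match every parser in order; fail as soon as one fails.
def seq(*parsers)
  lambda do |input, pos|
    parsers.reduce(pos) { |p, parser| p && parser.call(input, p) }
  end
end

# Ordered choice: commit to the FIRST alternative that succeeds.
def choice(*parsers)
  lambda do |input, pos|
    result = nil
    parsers.each do |parser|
      result = parser.call(input, pos)
      break if result
    end
    result
  end
end

# True only when the parser consumes the whole input.
def full_match?(parser, input)
  parser.call(input, 0) == input.size
end

A_THEN_AB = choice(lit("a"), seq(lit("a"), lit("b"))) # a / a b
AB_THEN_A = choice(seq(lit("a"), lit("b")), lit("a")) # a b / a

# rule parens: '(' parens ')' / '1'
# The inner lambda defers the lookup of PARENS until call time.
PARENS = choice(
  seq(lit("("), ->(input, pos) { PARENS.call(input, pos) }, lit(")")),
  lit("1")
)

puts full_match?(A_THEN_AB, "ab")   # false: 'a' wins first and leaves 'b' unread
puts full_match?(AB_THEN_A, "ab")   # true
puts full_match?(PARENS, "((1))")   # true
```

The same ordering idea is what slide 31 relies on: putting the longer `if ... else` alternative first makes the parser try to attach an `else` before falling back to the plain `if`.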