parsing
expression
grammars
@joaomilho



juan maiz
joão milho
john corn
‫תירס‬ ‫יוחנן‬
-H E L L O
describe DSL do
it { should be_valid_ruby }
end
-D S L
Feature Polish notation calculator


Scenario Sum

Given I’ve pressed 4
And then pressed space
And then pressed 5
When I p...
In fact, you do it all the time.
Open files, parse stuff, turn it into strings,
numbers, arrays and link it to code.
Like wh...
-C S V
grammar CSV
  rule file
    (line [\n])* line
  end

  rule line
    (value ',')* value
  end

  rule value
    [^,\n]+
  end
end
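To make the three rules concrete, here is a hand-written sketch of the same recognizer in plain Ruby, using only the standard library's StringScanner. It is not what Treetop generates; the module name CsvSketch and its methods are made up for this illustration, and it assumes that lines are separated by a newline and that values contain neither commas nor newlines.

```ruby
require 'strscan'

# Hand-written sketch of the CSV grammar above; not Treetop output.
# Assumes lines are separated by "\n" and values contain no commas.
module CsvSketch
  # value <- [^,\n]+
  def self.value(s)
    s.scan(/[^,\n]+/)
  end

  # line <- (value ',')* value
  def self.line(s)
    values = []
    loop do
      v = value(s) or return nil
      values << v
      break unless s.scan(/,/)
    end
    values
  end

  # file <- (line "\n")* line
  def self.parse(text)
    s = StringScanner.new(text)
    rows = []
    loop do
      row = line(s) or return nil
      rows << row
      break unless s.scan(/\n/)
    end
    s.eos? ? rows : nil
  end
end

p CsvSketch.parse("a,b\nc,d") # prints [["a", "b"], ["c", "d"]]
```

Each method mirrors one rule, which is the point of the slide: the grammar is the parser, minus the bookkeeping.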
-
moshe@interwebz.co.il
PA R S E
-
moshe@interwebz.co.il
PA R S E
U S E R
-
moshe@interwebz.co.il
PA R S E
AT
moshe@interwebz.co.il
D O M A I N
-PA R S E
moshe@interwebz.co.il
D O T
-PA R S E
moshe@interwebz.co.il
C O M P L
-PA R S E
moshe@interwebz.co.il
D O T
-PA R S E
moshe@interwebz.co.il
C O M P L
-PA R S E
moshe@interwebz.co.il
U S E R
AT
D O M A I N
D O T
C O M P L
D O T
C O M P L
-PA R S E
moshe@interwebz.co.il
U S E R
AT
D O M A I N
D O T
C O M P L
D O T
C O M P L
-PA R S E
grammar Email
  rule full_address
    user at domain dot compl dot? compl?
  end

  rule compl
    'co' / 'info' / 'br' / 'il'
  end
end
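The slide only defines full_address and compl, so the following plain-Ruby sketch has to guess the rest: the patterns used for user, at, domain and dot are assumptions, and the names COMPL, compl and full_address are hypothetical helpers, not part of Treetop.

```ruby
require 'strscan'

# Sketch of: full_address <- user at domain dot compl dot? compl?
# The slide does not define user, at, domain or dot; the patterns
# below are assumptions made only for this illustration.
COMPL = %w[co info br il].freeze # tried in this order, like 'co' / 'info' / 'br' / 'il'

# compl <- 'co' / 'info' / 'br' / 'il'  (first alternative that matches wins)
def compl(s)
  COMPL.find { |c| s.scan(Regexp.new(Regexp.escape(c))) }
end

def full_address(text)
  s = StringScanner.new(text)
  parts = {}
  parts[:user] = s.scan(/[a-z0-9_]+/) or return nil   # user (assumed pattern)
  s.scan(/@/) or return nil                           # at
  parts[:domain] = s.scan(/[a-z0-9-]+/) or return nil # domain (assumed pattern)
  s.scan(/\./) or return nil                          # dot
  first = compl(s) or return nil                      # compl
  s.scan(/\./)                                        # dot?   (optional)
  second = compl(s)                                   # compl? (optional)
  parts[:compl] = [first, second].compact
  s.eos? ? parts : nil
end

p full_address('moshe@interwebz.co.il')
p full_address('moshe@interwebz.com') # nil: 'com' is not among the alternatives
```

The second call fails because 'co' matches and leaves an "m" that nothing in the rule can consume.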
-...
-I N T E R P R E T
rule compl
  'com' / 'info' / 'br' {
    def brazil?
      text_value == 'br'
    end
  }
end
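One way to picture the block is as a module of methods mixed into the node that the rule produced. The sketch below imitates that in plain Ruby; Node, ComplMethods and parse_compl are hypothetical stand-ins, not Treetop classes, and the sketch simply mixes the module into whichever alternative matched without claiming that Treetop scopes the block the same way.

```ruby
# Sketch of how a method written in a rule's block can end up on a parse node.
# Node, ComplMethods and parse_compl are hypothetical, not Treetop classes.
Node = Struct.new(:text_value)

module ComplMethods
  def brazil?
    text_value == 'br'
  end
end

# compl <- 'com' / 'info' / 'br'
def parse_compl(text)
  return nil unless %w[com info br].include?(text)
  Node.new(text).extend(ComplMethods) # the node gains brazil?
end

puts parse_compl('br').brazil?  # prints true
puts parse_compl('com').brazil? # prints false
```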
1
-R E C U R S I O N
(1)
-R E C U R S I O N
((1))
-R E C U R S I O N
(((((((((((((1)))))))))))))
-R E C U R S I O N
grammar Recursion
  rule parens
    '(' parens ')' / '1'
  end
end
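The recursive rule maps directly onto a recursive method. This is a minimal plain-Ruby sketch, with parens and nested? as illustrative names; it returns the position after the match so the caller can check that the whole input was consumed.

```ruby
# parens <- '(' parens ')' / '1'
# Returns the position just after the match, or nil on failure.
def parens(text, pos = 0)
  # First alternative: '(' parens ')'
  if text[pos] == '('
    inner = parens(text, pos + 1)
    return inner + 1 if inner && text[inner] == ')'
  end
  # Second alternative, tried only if the first one failed: '1'
  text[pos] == '1' ? pos + 1 : nil
end

def nested?(text)
  parens(text) == text.length
end

puts nested?('((1))') # prints true
puts nested?('((1)')  # prints false
```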
-R E C U R S I O N
-I N T E R P R E T
grammar Personnummer
  rule birthdate
    [0-9] 6..8 <Birthdate>
  end
end
-I N T E R P R E T
grammar Molecule
  include Atom
  include Quantity

  rule molecule
    (atom quantity?)+
  end
end
Defined by Noam Chomsky in the 50s.

(Yeah, that weird talking guy from the World Social Forum.
For your info he is a great...
'if(' exp ')' stmt 'else' stmt |
'if(' exp ')' stmt
-C F G
O R
The dangling else. Consider this:
Which alert will be executed?
-C F G
if(false)
if(true)
alert('The second if')
else

alert('The else')
Gotcha. None.
!
Generally, languages prefer to associate else
to the nearest if.
!
But this has to be made in a “meta” rul...
Brian Ford, 2004

http://pdos.csail.mit.edu/~baford/packrat/


Priorities, Not Ambiguities (a / a b) is different from (a b /...
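Why the order of alternatives matters can be shown with a few lines of plain Ruby. The helper name choice is made up for this sketch, and it only handles literal strings.

```ruby
# Ordered choice sketch: alternatives are tried left to right and the
# first one that matches wins; later alternatives are never considered.
def choice(text, alternatives)
  alternatives.find { |alt| text.start_with?(alt) }
end

p choice('ab', ['a', 'ab']) # prints "a"
p choice('ab', ['ab', 'a']) # prints "ab"
```

With (a / a b) the second alternative can never fire, because anything it would match already starts with a. Putting the longer alternative first fixes that.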
-P E G
P R I O R I T Y O P E R AT O R
The dangling else fixed.
'if(' exp ')' stmt 'else' stmt /
'if(' exp ')' stmt
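To see the prioritized rule resolve the dangling else, here is a small backtracking parser and evaluator in plain Ruby. It is a sketch under assumptions: exp is limited to true and false, the only other statement is alert('...'), and IfSketch, run and the array-based node shapes are hypothetical.

```ruby
require 'strscan'

# Sketch of a PEG-style parser for the if/else rule above.
# exp (true/false only) and the alert statement are simplified assumptions.
class IfSketch
  def initialize(src)
    @s = StringScanner.new(src)
  end

  # stmt <- 'if(' exp ')' stmt 'else' stmt / 'if(' exp ')' stmt / alert
  def stmt
    start = @s.pos
    if_else(start) || if_only(start) || alert(start)
  end

  private

  def tok(re)
    @s.skip(/\s*/)
    @s.scan(re)
  end

  # Shared prefix of both if alternatives: 'if(' exp ')' stmt
  def if_head
    return nil unless tok(/if\(/)
    cond = tok(/true|false/) or return nil
    tok(/\)/) or return nil
    body = stmt or return nil
    [cond == 'true', body]
  end

  def if_else(start)
    @s.pos = start
    cond, body = if_head
    return nil if body.nil?
    tok(/else/) or return nil
    alt = stmt or return nil
    [:if, cond, body, alt]
  end

  def if_only(start)
    @s.pos = start # backtrack, then try the shorter alternative
    cond, body = if_head
    body && [:if, cond, body, nil]
  end

  def alert(start)
    @s.pos = start
    tok(/alert\('([^']*)'\)/) or return nil
    [:alert, @s[1]]
  end
end

# Walks the tree and collects the messages that would be alerted.
def run(node, out = [])
  case node[0]
  when :alert then out << node[1]
  when :if
    branch = node[1] ? node[2] : node[3]
    run(branch, out) if branch
  end
  out
end

src = "if(false) if(true) alert('The second if') else alert('The else')"
p run(IfSketch.new(src).stmt) # prints []
```

Because the inner if tries the alternative with else first, it claims the else before the outer if gets a chance, so nothing is alerted. The sketch re-parses the shared prefix on backtracking, which is exactly the repeated work that memoization removes.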
Result
• linear time parsing using memoization
• one and only one valid parse tree
• no dangling else cases
• easier to le...
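The memoization behind the linear-time claim can be sketched in plain Ruby: every (rule, position) pair is computed at most once and then served from a table. MemoParser, its tiny expr/term grammar and the calls counter are all invented for this illustration and are not taken from Treetop.

```ruby
# Packrat-style memoization sketch: each (rule, position) pair is
# evaluated at most once and its result is cached.
class MemoParser
  attr_reader :calls

  def initialize(text)
    @text = text
    @memo = {}
    @calls = 0 # counts real rule evaluations, not cache hits
  end

  def apply(rule, pos)
    key = [rule, pos]
    return @memo[key] if @memo.key?(key)
    @calls += 1
    @memo[key] = send(rule, pos)
  end

  # expr <- term '+' expr / term
  def expr(pos)
    t = apply(:term, pos)
    if t && @text[t] == '+'
      e = apply(:expr, t + 1)
      return e if e
    end
    apply(:term, pos) # second alternative asks for term again: cache hit
  end

  # term <- [0-9]
  def term(pos)
    @text[pos].to_s.match?(/\A\d\z/) ? pos + 1 : nil
  end
end

parser = MemoParser.new('1+2+3')
p parser.apply(:expr, 0) # prints 5 (position after the match)
p parser.calls           # prints 6
```

The table is also where the "memory eater" drawback comes from: it can grow with the number of rules times the length of the input.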
treetop.rubyforge.org
rubyconf2007.confreaks.com/d1t1p5_treetop.html


Nathan Sobo (Pivotal Labs), Clifford Heath,
Nick Kal...
arithmetic + lambda calculus + sql
0xCOFFEE full lang compiled to LLVM
html
css
json
odp
hapag
dcf
scheme
apache conf
trxl...
Citrus, Parsley, NEG Ruby
tynirb: peg/leg C
Rats! Java
PetitParser SmallTalk
ppeg, pyPEG, pyparsing Python
Frisby, Leafy H...
brain
fuck
3000 slots
-B R A I N F U C K
0 0 0 0 0 0 0 0 0
-B R A I N F U C K
0 0 0 0 0 0 0 0 0
>
-B R A I N F U C K
0 0 0 0 0 0 0 0 0
-B R A I N F U C K
0 0 0 0 0 0 0 0 0
<
-B R A I N F U C K
1 0 0 0 0 0 0 0 0
+
-B R A I N F U C K
0 0 0 0 0 0 0 0 0
-
-B R A I N F U C K
0 0 0 0 0 0 0 0 0
,
-B R A I N F U C K
66 0 0 0 0 0 0 0 0
,
B
-B R A I N F U C K
66 0 0 0 0 0 0 0 0
.
B
-B R A I N F U C K
66 0 0 0 0 0 0 0 0
[]
-B R A I N F U C K
[>+]
66 0 0 0 0 0 0 0 0
-B R A I N F U C K
66 1 1 1 1 1 1 1 1
[>+]
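The eight commands walked through above fit in a short interpreter. This is a minimal plain-Ruby sketch: the 3000-cell tape follows the slides, while wrapping cell values at 256, reading 0 at end of input and raising when the pointer leaves the tape are assumptions of this sketch.

```ruby
# Minimal Brainfuck interpreter sketch. The 3000-cell tape follows the
# slides; wrapping cell values at 256 and treating end of input as 0
# are assumptions.
def brainfuck(code, input = '')
  tape = Array.new(3000, 0)
  ptr = 0
  pc = 0
  out = +''
  inp = input.bytes

  # Precompute the matching position for every bracket.
  jumps = {}
  stack = []
  code.each_char.with_index do |c, i|
    if c == '['
      stack << i
    elsif c == ']'
      j = stack.pop or raise ArgumentError, "unmatched ] at #{i}"
      jumps[i] = j
      jumps[j] = i
    end
  end
  raise ArgumentError, 'unmatched [' unless stack.empty?

  while pc < code.length
    case code[pc]
    when '>' then ptr += 1
    when '<' then ptr -= 1
    when '+' then tape[ptr] = (tape[ptr] + 1) % 256
    when '-' then tape[ptr] = (tape[ptr] - 1) % 256
    when ',' then tape[ptr] = inp.shift || 0
    when '.' then out << tape[ptr].chr
    when '[' then pc = jumps[pc] if tape[ptr].zero?
    when ']' then pc = jumps[pc] unless tape[ptr].zero?
    end
    raise IndexError, 'pointer left the tape' unless ptr.between?(0, tape.size - 1)
    pc += 1
  end
  out
end

puts brainfuck(',.', 'B') # prints B
```

Run on a non-zero cell, [>+] never finds a zero to stop on, so in this sketch it walks to the end of the tape and raises IndexError.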
+++[>+++[>+++++++<-] <-] >><
+++[>++++++<-] >. <+++[>+++++
+++++++<-] >. <++[>--------
<-] >. <+++++++[>++<-] >. <+
[>+<-]...
Parsing Expression Grammars and Treetop

Published in: Technology

Transcript of "Parsing Expression Grammars and Treetop"

  1. 1. parsing expression grammars
  2. 2. @joaomilho
 
 juan maiz joão milho john corn ‫תירס‬ ‫יוחנן‬ -H E L L O
  3. 3. describe DSL do it { should be_valid_ruby } end -D S L
  4. 4. Feature Polish notation calculator 
 Scenario Sum
 Given I’ve pressed 4 And then pressed space And then pressed 5 When I press + Then it should display 9 -L A N G U A G E
  5. 5. In fact, you do it all the time. Open files, parse stuff, turn it into strings, numbers, arrays and link it to code. Like when you wrote your first CSV parser. (yeah, i know you did it...) You just haven’t used the right tool. -W R I T E A L A N G U A G E ? W T F ?
  6. 6. -C S V grammar CSV rule file (line [\n])* line end rule line (value ',')* value end
 rule value [^,\n]+ end end
  7. 7. - moshe@interwebz.co.il PA R S E
  8. 8. - moshe@interwebz.co.il PA R S E U S E R
  9. 9. - moshe@interwebz.co.il PA R S E AT
  10. 10. moshe@interwebz.co.il D O M A I N -PA R S E
  11. 11. moshe@interwebz.co.il D O T -PA R S E
  12. 12. moshe@interwebz.co.il C O M P L -PA R S E
  13. 13. moshe@interwebz.co.il D O T -PA R S E
  14. 14. moshe@interwebz.co.il C O M P L -PA R S E
  15. 15. moshe@interwebz.co.il U S E R AT D O M A I N D O T C O M P L D O T C O M P L -PA R S E
  16. 16. moshe@interwebz.co.il U S E R AT D O M A I N D O T C O M P L D O T C O M P L -PA R S E
  17. 17. grammar Email rule full_address user at domain dot compl dot? compl? end
 rule compl 'co' / 'info' / 'br' / 'il' end end -PA R S E
  18. 18. -I N T E R P R E T rule compl 'com' / 'info' / 'br' { def brazil?
 text_value == 'br' end } end
  19. 19. 1 -R E C U R S I O N
  20. 20. (1) -R E C U R S I O N
  21. 21. ((1)) -R E C U R S I O N
  22. 22. (((((((((((((1))))))))))))) -R E C U R S I O N
  23. 23. grammar Recursion rule parens '(' parens ')' / '1' end end -R E C U R S I O N
  24. 24. -I N T E R P R E T grammar Personnummer rule birthdate [0-9] 6..8 <Birthdate> end end
  25. 25. -I N T E R P R E T grammar Molecule include Atom include Quantity
 
 rule molecule (atom quantity?)+ end end
  26. 26. Defined by Noam Chomsky in the 50s.
 (Yeah, that weird talking guy from the World Social Forum. For your info he is a great linguist.) ! LEX, Yacc
 Idea: Generates a language -C F G
  27. 27. 'if(' exp ')' stmt 'else' stmt | 'if(' exp ')' stmt -C F G O R The dangling else. Consider this:
  28. 28. Which alert will be executed? -C F G if(false) if(true) alert('The second if') else
 alert('The else')
  29. 29. Gotcha. None. ! Generally, languages prefer to associate else to the nearest if. ! But this has to be made in a “meta” rule. -C F G
  30. 30. Brian Ford, 2004
 http://pdos.csail.mit.edu/~baford/packrat/ 
 Priorities, Not Ambiguities (a / a b) is different from (a b / a) Syntactic Predicates (& and !) Regular Expression-Like No left recursion
 Idea: Recognizes a language -P E G
  31. 31. -P E G P R I O R I T Y O P E R AT O R The dangling else fixed. 'if(' exp ')' stmt 'else' stmt / 'if(' exp ')' stmt
  32. 32. Result • linear time parsing using memoization • one and only one valid parse tree • no dangling else cases • easier to learn and use Drawbacks • still new • memory eater • not suitable for NLP -P E G
  33. 33. treetop.rubyforge.org rubyconf2007.confreaks.com/d1t1p5_treetop.html 
 Nathan Sobo (Pivotal Labs), Clifford Heath, Nick Kallen & more. -T R E E T O P
  34. 34. arithmetic + lambda calculus + sql 0xCOFFEE full lang compiled to LLVM html css json odp hapag dcf scheme apache conf trxl lolcode card sloc hairball (haml) lordbishop (names) -T R E E T O P
  35. 35. Citrus, Parsley, NEG Ruby tynirb: peg/leg C Rats! Java PetitParser SmallTalk ppeg, pyPEG, pyparsing Python Frisby, Leafy Haskell Perl6 rules Perl6 native! -O U T T H E R E
  36. 36. brain fuck
  37. 37. 3000 slots -B R A I N F U C K 0 0 0 0 0 0 0 0 0
  38. 38. -B R A I N F U C K 0 0 0 0 0 0 0 0 0 >
  39. 39. -B R A I N F U C K 0 0 0 0 0 0 0 0 0
  40. 40. -B R A I N F U C K 0 0 0 0 0 0 0 0 0 <
  41. 41. -B R A I N F U C K 1 0 0 0 0 0 0 0 0 +
  42. 42. -B R A I N F U C K 0 0 0 0 0 0 0 0 0 -
  43. 43. -B R A I N F U C K 0 0 0 0 0 0 0 0 0 ,
  44. 44. -B R A I N F U C K 66 0 0 0 0 0 0 0 0 , B
  45. 45. -B R A I N F U C K 66 0 0 0 0 0 0 0 0 . B
  46. 46. -B R A I N F U C K 66 0 0 0 0 0 0 0 0 []
  47. 47. -B R A I N F U C K [>+] 66 0 0 0 0 0 0 0 0
  48. 48. -B R A I N F U C K 66 1 1 1 1 1 1 1 1 [>+]
  49. 49. +++[>+++[>+++++++<-] <-] >>< +++[>++++++<-] >. <+++[>+++++ +++++++<-] >. <++[>-------- <-] >. <+++++++[>++<-] >. <+ [>+<-] >. <+++++[>--<-] >-. < +++[>++<-] >. <+[>-<-] >. <++ [>++<-] >+. <+++++++++++++ [>----<-] >.