Of Rats And Dragons
 


Reviews grammars and parsers, and discusses my personal path toward writing neotoma, my own packrat parser-generator for Erlang.

Given to "Evil Robot Conference" on September 12, 2009.

    Of Rats And Dragons Presentation Transcript

    • Of Rats And Dragons: Achieving Parsing Sanity. Sean Cribbs, Web Consultant, Ruby and Erlang Hacker. Saturday, September 12, 2009
    • Quick Review
    • context-free grammars
    • G = (V, Σ, R, S)
    • V → w
    • non-deterministic pushdown automata
    • stack machine w/ backtracking
    • massage it
    • ε productions
    • produce nothing, never produced
    • cycles
    • ambiguities: “dangling else”
    • if A then if B then C else D if A then if B then C else D if A then if B then C else D
    • parsing expression grammars
    • top-down parsing language (70’s)
    • direct representation of parsing functions
    • Brian Ford 2002
    • focused on recognizing
    • computer languages
    • V ← e
    • e1 e2
    • e1 / e2
    • e+
    • e*
    • &e
    • !e
    • e?
    • “string” .
    • PEG > regexps
    • combined lex+parse
    • no ambiguity
    • choice is ordered
    • dangling else obviated
    • greedy repetition
    • unlimited lookahead with predicates
    • no left-recursion! (use *,+)
    • Parsing Techniques
    • Tabular: test every rule
    • Recursive-descent: call & consume
    • Predictive: yacc/yecc
    • Packrat: RD with memo
    • sacrifice memory for speed
    • supports PEGs and some CFGs
    • Treetop, Pappy, neotoma
    • neotoma: Behind the Code™
    • can:has(cukes) -> false.
    • Cucumber uses Treetop
    • PEG → leex/yecc FAIL
    • parsec → eParSec
    • HOF protocol
    • % Implements "?" PEG operator
      optional(P) ->
        fun(Input, Index) ->
          case P(Input, Index) of
            {fail, _} -> {[], Input, Index};
            {_,_,_} = Success -> Success % {Parsed, RemainingInput, NewIndex}
          end
        end.
    • % PEG
      optional_space <- space?;
      % Erlang
      optional_space(Input,Index) ->
        (optional(fun space/2))(Input, Index).
    • Yay! RD! make it memo
    • ets: Erlang Term Storage
    • {key, value}
    • key = Index
    • value = dict
    • % Memoization wrapper
      p(Inp, StartIndex, Name, ParseFun, TransformFun) ->
        % Grab the memo table from ets
        Memo = get_memo(StartIndex),
        % See if the current reduction is memoized
        case dict:find(Name, Memo) of
          % If it is, return the result
          {ok, Result} -> Result;
          % If not, attempt to parse
          _ ->
            case ParseFun(Inp, StartIndex) of
              % If it fails, memoize the failure
              {fail,_} = Failure ->
                memoize(StartIndex, dict:store(Name, Failure, Memo)),
                Failure;
              % If it passes, transform and memoize the result.
              {Result, InpRem, NewIndex} ->
                Transformed = TransformFun(Result, StartIndex),
                memoize(StartIndex, dict:store(Name, {Transformed, InpRem, NewIndex}, Memo)),
                {Transformed, InpRem, NewIndex}
            end
        end.
    • parse_transform
    • alternative(Input, Index) ->
        peg:p(Input, Index, alternative,
              fun(I,P) -> (peg:choose([fun sequence/2, fun primary/2]))(I,P) end).

      rule(alternative) ->
        peg:choose([fun sequence/2, fun primary/2]);
    • rules <- space? declaration_sequence space?;
      declaration_sequence <- head:declaration tail:(space declaration)*;
      declaration <- nonterminal space '<-' space parsing_expression space? ';';
      parsing_expression <- choice / sequence / primary;
      choice <- head:alternative tail:(space '/' space alternative)+;
      alternative <- sequence / primary;
      primary <- prefix atomic / atomic suffix / atomic;
      sequence <- head:labeled_sequence_primary tail:(space labeled_sequence_primary)+;
      labeled_sequence_primary <- label? primary;
      label <- alpha_char alphanumeric_char* ':';
      suffix <- repetition_suffix / optional_suffix;
      optional_suffix <- '?';
      repetition_suffix <- '+' / '*';
      prefix <- '&' / '!';
      atomic <- terminal / nonterminal / parenthesized_expression;
      parenthesized_expression <- '(' space? parsing_expression space? ')';
      nonterminal <- alpha_char alphanumeric_char*;
      terminal <- quoted_string / character_class / anything_symbol;
      quoted_string <- single_quoted_string / double_quoted_string;
      double_quoted_string <- '"' string:(!'"' ("\\\\" / '\\"' / .))* '"';
      single_quoted_string <- "'" string:(!"'" ("\\\\" / "\\'" / .))* "'";
      character_class <- '[' characters:(!']' ('\\\\' . / !'\\\\' .))+ ']';
      anything_symbol <- '.';
      alpha_char <- [a-z_];
      alphanumeric_char <- alpha_char / [0-9];
      space <- (white / comment_to_eol)+;
      comment_to_eol <- '%' (!"\n" .)*;
      white <- [ \t\n\r];
    • self-hosting
    • Future directions
    • self-contained parsers
    • inline code in PEG
    • Reia, retem, sedate
    • questions?
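
The slides show only the "?" combinator. As a rough companion, here is a minimal sketch of how some of the other PEG operators listed earlier (sequence, ordered choice, * and + repetition, and the ! predicate) might be written against the same HOF protocol, where a parser is a fun(Input, Index) that returns {fail, Reason} or {Parsed, RemainingInput, NewIndex}. The module name peg_sketch, the failure reasons, and every function name other than optional are hypothetical and chosen for illustration. This is not neotoma's actual source, and it leaves out memoization entirely.

```erlang
-module(peg_sketch).
-export([string/1, optional/1, seq/1, choose/1, zero_or_more/1,
         one_or_more/1, not_predicate/1, demo/0]).

%% Assumed parser protocol (taken from the slides):
%%   fun(Input, Index) -> {fail, Reason} | {Parsed, RemainingInput, NewIndex}

%% "string": match a literal at the front of the input.
string(S) ->
    Len = length(S),
    fun(Input, Index) ->
        case lists:prefix(S, Input) of
            true  -> {S, lists:nthtail(Len, Input), Index + Len};
            false -> {fail, {expected, S, Index}}
        end
    end.

%% e?: as on the slide, a failure becomes an empty match.
optional(P) ->
    fun(Input, Index) ->
        case P(Input, Index) of
            {fail, _} -> {[], Input, Index};
            {_, _, _} = Success -> Success
        end
    end.

%% e1 e2: run the parsers in order; the first failure fails the sequence.
seq(Parsers) ->
    fun(Input, Index) -> seq(Parsers, Input, Index, []) end.

seq([], Input, Index, Acc) ->
    {lists:reverse(Acc), Input, Index};
seq([P | Rest], Input, Index, Acc) ->
    case P(Input, Index) of
        {fail, _} = Failure -> Failure;
        {Result, Rem, NewIndex} -> seq(Rest, Rem, NewIndex, [Result | Acc])
    end.

%% e1 / e2: ordered choice; the first alternative that succeeds wins.
choose(Parsers) ->
    fun(Input, Index) -> choose(Parsers, Input, Index) end.

choose([], _Input, Index) ->
    {fail, {no_alternative, Index}};
choose([P | Rest], Input, Index) ->
    case P(Input, Index) of
        {fail, _} -> choose(Rest, Input, Index);
        {_, _, _} = Success -> Success
    end.

%% e*: greedy repetition; never fails.
zero_or_more(P) ->
    fun(Input, Index) -> many(P, Input, Index, []) end.

many(P, Input, Index, Acc) ->
    case P(Input, Index) of
        {fail, _} -> {lists:reverse(Acc), Input, Index};
        %% Stop if P matched without consuming input, to avoid looping forever.
        {_, _, Index} -> {lists:reverse(Acc), Input, Index};
        {Result, Rem, NewIndex} -> many(P, Rem, NewIndex, [Result | Acc])
    end.

%% e+: one mandatory match followed by e*.
one_or_more(P) ->
    fun(Input, Index) ->
        case P(Input, Index) of
            {fail, _} = Failure -> Failure;
            {First, Rem, NewIndex} ->
                {Others, Rem2, Index2} = (zero_or_more(P))(Rem, NewIndex),
                {[First | Others], Rem2, Index2}
        end
    end.

%% !e: succeed only if P fails; consume nothing either way.
not_predicate(P) ->
    fun(Input, Index) ->
        case P(Input, Index) of
            {fail, _} -> {[], Input, Index};
            {_, _, _} -> {fail, {unexpected, Index}}
        end
    end.

%% Hypothetical rule: greeting <- "hi" " "? ("bob" / "alice") !"s"
demo() ->
    Greeting = seq([string("hi"),
                    optional(string(" ")),
                    choose([string("bob"), string("alice")]),
                    not_predicate(string("s"))]),
    Greeting("hi alice!", 0).
```

After c(peg_sketch) in the Erlang shell, peg_sketch:demo() returns {["hi"," ","alice",[]], "!", 8}. The &e predicate would mirror not_predicate with the two clauses swapped. Wrapping each rule in a memoizing function like the p/5 wrapper on the slides is the step that would turn this plain recursive descent into a packrat parser.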