Hacking parse.y
     Tatsuhiro UJIHISA
     ujihisa@gmail.com
 http://ujihisa.blogspot.com/

           @ujm
hi



• I'm from Vancouver, Canada
Trick or Treat!
Hacking parse.y
Fixing the Ruby parser to understand Ruby
 • Introducing new syntax
  • {:key :-) "value"}
  • 'symbol
  • ++i
  • def A#b(c)
  • {1}
MRI Inside

• MRI (Matz Ruby Implementation)
• $ ruby -v
  ruby 1.9.2dev (2009-08-05 trunk 24397) [i386-darwin9.7.0]


• Written in C
 • array.c, vm.c, gc.c, etc...
ruby 1.8 vs 1.9

• ~1.8
 • Parser: parse.y
 • Evaluator: eval.c
• 1.9~
 • Parser: parse.y
 • Evaluator: YARV (vm*.c)
Matz said

• Ugly: eval.c and parse.y
 RubyConf2006
• Now the original evaluator
 has been entirely replaced by YARV
MRI Parser

• MRI uses yacc
  (parser generator for C)
• parse.y
  bison -d -o y.tab.c parse.y
  sed -f ./tool/ytab.sed -e "/^#/s!y.tab.c!parse.c!" y.tab.c > parse.c.new
  ...
parse.y

• One of the darkest sides
• $ wc -l *{c,h,y} | sort -n
  ...
 9261 io.c
 10350 parse.y
 16352 parse.c # (automatically generated)
 183370 total
(Broad) Parser

• Lexer (yylex)
 • Bytes → Symbols
• Parser (yyparse)
 • Symbols → Syntax Tree
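
The two stages can be observed from Ruby itself. This is a minimal sketch, assuming the standard library's Ripper (a Ruby-level interface generated from the same parse.y). Ripper's event names such as on_ident are its own and are not the token names used inside parse.y. The helper names lexer_events and syntax_tree are made up for this example.

```ruby
require 'ripper'

# Stage 1 (lexer): source bytes -> a flat list of tokens.
# Each Ripper.lex entry starts with [line, column], event, text; keep the event.
def lexer_events(source)
  Ripper.lex(source).map { |token| token[1] }
end

# Stage 2 (parser): tokens -> a nested syntax tree made of arrays.
def syntax_tree(source)
  Ripper.sexp(source)
end

p lexer_events("a = 1 + 2")
p syntax_tree("a = 1 + 2")
```

The first call prints a flat sequence of events, the second a nested array rooted at :program.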
Tokens in Lexer
%token tUPLUS          /* unary+ */
%token tUMINUS         /* unary- */
%token tPOW            /* ** */
%token tCMP            /* <=> */
%token tEQ             /* == */
%token tEQQ            /* === */
%token tNEQ            /* != */
%token tGEQ            /* >= */
%token tLEQ            /* <= */
%token tANDOP tOROP    /* && and || */
%token tMATCH tNMATCH  /* =~ and !~ */
%token tDOT2 tDOT3     /* .. and ... */
%token tAREF tASET     /* [] and []= */
%token tLSHFT tRSHFT   /* << and >> */
%token tCOLON2         /* :: */
%token tCOLON3         /* :: at EXPR_BEG */
%token <id> tOP_ASGN   /* +=, -= et...
%token tASSOC          /* => */
%token tLPAREN         /* ( */
%token tLPAREN_ARG     /* ( */
%token tRPAREN         /* ) */
%token tLBRACK         /* [ */
%token tLBRACE         /* { */
%token tLBRACE_ARG     /* { */
%token tSTAR           /* * */
%token tAMPER          /* & */
%token tLAMBDA         /* -> */
%token tSYMBEG tSTRING_BEG tXSTRING_...
       tWORDS_BEG tQWORDS_BEG
%token tSTRING_DBEG tSTRING_DVAR tST...
(detour)

• MRI: parse.y (10350 lines)

• JRuby: src/org/jruby/parser/{DefaultRubyParser.y,Ruby19Parser.y}
  (1886, 2076 lines)

• Rubinius: lib/ruby_parser.y (1795 lines)
Case 1:
             :-)

• Hash literal
  {:key => 'value'}
  {:key :-) 'value'}
• :-) is just an alias of =>
Mastering “Colon”
Colons in Ruby

• A::B, ::C
• :symbol, :"sy-m-bol"
• a ? b : c
• {a: b}
• when 1: something (in 1.8)
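
How a stock (unpatched) Ruby classifies these colons can be checked with Ripper. This is a sketch only: colon_event is a hypothetical helper, the event names are Ripper's rather than parse.y's token names, and the 1.8-only `when 1:` form is left out because later versions reject it.

```ruby
require 'ripper'

# Return the Ripper event of the first token whose text contains a colon.
def colon_event(source)
  token = Ripper.lex(source).find { |t| t[2].include?(":") }
  token && token[1]
end

{
  "A::B"      => "scope operator",
  ":symbol"   => "symbol literal",
  "1 ? 2 : 3" => "ternary else",
  "{a: b}"    => "hash label",
}.each do |source, role|
  puts "#{source.ljust(10)} #{role.ljust(15)} #{colon_event(source).inspect}"
end
```

The same character ends up in three different kinds of token depending on what surrounds it, which is why the lexer needs lex_state.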
static int
parser_yylex(struct parser_params *parser) {
    ...
    switch (c = nextc()) {
      ...
      case '#': /* it's a comment */
      ...
      case ':':
        c = nextc();
        if (c == ':') {
            if (IS_BEG() ||...
      ...
    }
    ... (about 1300 lines)
How does the parser deal
    with colons?

• :: → tCOLON2 or tCOLON3
 • tCOLON2 Net::URI
 • tCOLON3 ::Kernel
lex_state
enum lex_state_e {
    EXPR_BEG,        /* ignore newline, +/- is a sign. */
    EXPR_END,        /* newline significant, +/- is an operator. */
    EXPR_ENDARG,     /* ditto, and unbound braces. */
    EXPR_ARG,        /* newline significant, +/- is an operator. */
    EXPR_CMDARG,     /* newline significant, +/- is an operator. */
    EXPR_MID,        /* newline significant, +/- is an operator. */
    EXPR_FNAME,      /* ignore newline, no reserved words. */
    EXPR_DOT,        /* right after `.' or `::', no reserved words. */
    EXPR_CLASS,      /* immediate after `class', no here document. */
    EXPR_VALUE       /* alike EXPR_BEG but label is disallowed. */
};
case ':':
  c = nextc();
  if (c == ':') {
      if (IS_BEG() ||
          lex_state == EXPR_CLASS ||
          (IS_ARG() && space_seen)) {
          lex_state = EXPR_BEG;
          return tCOLON3;
      }
      lex_state = EXPR_DOT;
      return tCOLON2;
  }
...
  if (lex_state == EXPR_END ||
      lex_state == EXPR_ENDARG ||
      (c != -1 && ISSPACE(c))) {
      pushback(c);
      lex_state = EXPR_BEG;
      return ':';
  }
  switch (c) {
    case '\'':
      lex_strterm = NEW_STRTERM(str_ssym, c, 0);
      break;
    case '"':
      lex_strterm = NEW_STRTERM(str_dsym, c, 0);
      break;
    default:
      pushback(c);
      break;
  }
  lex_state = EXPR_FNAME;
  return tSYMBEG;
How does the parser deal
with colons? (summary)
• :: → tCOLON2 or tCOLON3
• EXPR_END, EXPR_ENDARG, or a space follows → ':' (the "else" of ? :)
• otherwise → tSYMBEG
 • :' → str_ssym
 • :" → str_dsym
So,
• :-) → tASSOC
• :: → tCOLON2 or tCOLON3
• EXPR_END, EXPR_ENDARG, or a space follows → ':' (the "else" of ? :)
• otherwise → tSYMBEG
 • :' → str_ssym
 • :" → str_dsym
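
The decision list above can be modeled in a few lines of Ruby. This is a simplified sketch, not MRI code: classify_colon, BEG_STATES and ARG_STATES are names invented here, the state sets are my assumption about what IS_BEG() and IS_ARG() cover, and placing the `:-)` check first simply follows the order of the list.

```ruby
# Simplified model of the ':' branch of the lexer; not MRI code.
# rest       - source text right after the ':' that was just read
# lex_state  - current lexer state as a symbol, e.g. :EXPR_BEG
# space_seen - whether whitespace preceded the ':'
BEG_STATES = %i[EXPR_BEG EXPR_MID EXPR_VALUE EXPR_CLASS].freeze # assumed IS_BEG() set, plus EXPR_CLASS
ARG_STATES = %i[EXPR_ARG EXPR_CMDARG].freeze                    # assumed IS_ARG() set

def classify_colon(rest, lex_state, space_seen: false)
  return :tASSOC if rest.start_with?("-)") # proposed rule: ':-)' acts like '=>'

  if rest.start_with?(":") # '::'
    top_level = BEG_STATES.include?(lex_state) ||
                (ARG_STATES.include?(lex_state) && space_seen)
    return top_level ? :tCOLON3 : :tCOLON2
  end

  next_char = rest[0] # nil at end of input
  if %i[EXPR_END EXPR_ENDARG].include?(lex_state) ||
     (next_char && next_char.match?(/\s/))
    return :colon # plain ':' as in a ? b : c
  end

  :tSYMBEG # start of a symbol literal
end

p classify_colon("-) 'value'}", :EXPR_END)
p classify_colon(":Kernel", :EXPR_BEG)
p classify_colon(":URI", :EXPR_END)
p classify_colon(" c", :EXPR_END)
p classify_colon("symbol", :EXPR_BEG)
```

The real lexer also updates lex_state and sets lex_strterm for the quoted symbol forms; the model leaves both out.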
:-)
Case 2:
    Lisp Like Symbol
• Symbol Literal
 :vancouver
 'vancouver
• Ad-hoc
 p :a, :b
 p 'a, 'b
Single Quote
(in parser_yylex)
...
case '\'':
      lex_strterm = NEW_STRTERM(str_squote, '\'', 0);
      return tSTRING_BEG;
...
Single Quote
(in parser_yylex)
...
case '\'':
      if (??? condition ???) {
           lex_state = EXPR_FNAME;
           return tSYMBEG;
      }
      lex_strterm = NEW_STRTERM(str_squote, '\'', 0);
      return tSTRING_BEG;
...
(loop
  (lambda (p 'good)))
Case 3:
Pre-Increment Operator

• ++i
• i = i.succ
  (NOT i = i + 1)
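
The difference between succ and + 1 shows up as soon as i is not a number. A small sketch in stock Ruby; pre_increment and stock_double_plus are hypothetical helper names, and pre_increment only imitates what the proposed ++i would compute.

```ruby
# What the proposed ++i would evaluate: the successor of the current value.
def pre_increment(value)
  value.succ
end

# In stock Ruby, ++i parses as two unary pluses and leaves i unchanged.
def stock_double_plus(i)
  ++i
end

p pre_increment(41)      # Integer#succ
p pre_increment("az")    # String#succ carries over like an odometer
p stock_double_plus(41)  # still 41

begin
  "az" + 1               # + 1 is not defined for strings
rescue TypeError => e
  puts "TypeError: #{e.message}"
end
```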
Lexer
@@ -685,6 +685,7 @@ static void token_info_pop(struct parser_params*, const char *token);
 %type <val> program reswords then do dot_or_colon
 %*/
 %token tUPLUS  /* unary+ */
+%token tINCR   /* ++var */
 %token tUMINUS /* unary- */
 %token tPOW    /* ** */
 %token tCMP    /* <=> */
               (Actually, a few more trivial fixes are needed)
regenerate id.h

• id.h is generated automatically
 from parse.y during make
• $ rm id.h
 $ make
parser example
variable     : tIDENTIFIER
        |   tIVAR
        |   tGVAR
        |   tCONSTANT
        |   tCVAR
        |   keyword_nil {ifndef_ripper($$ = keyword_nil);}
        |   keyword_self {ifndef_ripper($$ = keyword_self);}
        |   keyword_true {ifndef_ripper($$ = keyword_true);}
        |   keyword_false {ifndef_ripper($$ = keyword_false);}
        |   keyword__FILE__ {ifndef_ripper($$ = keyword__FILE__);}
        |   keyword__LINE__ {ifndef_ripper($$ = keyword__LINE__);}
        |   keyword__ENCODING__ {ifndef_ripper($$ = keyword__ENCODING_
        ;
lhs     : variable
         {
         /*%%%*/
        if (!($$ = assignable($1, 0))) $$ = NEW_BEGIN(0);
         /*%
        $$ = dispatch1(var_field, $1);
         %*/
         }
     | primary_value '[' opt_call_args rbracket
         {
         /*%%%*/
        $$ = aryset($1, $3);
         /*%
        $$ = dispatch2(aref_field, $1, escape_Qundef($3));
         %*/
         }
  ...
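
Each action above has two halves: the code after /*%%%*/ builds NODEs for the interpreter, and the code between /*% and %*/ is what runs when the same grammar is built as Ripper. The Ripper half can be observed from Ruby. A minimal sketch, assuming a stock Ruby with the ripper standard library; lhs_node is a hypothetical helper.

```ruby
require 'ripper'

# Return the node built for the left-hand side of the first statement,
# which is expected to be an assignment: [:assign, lhs, rhs].
def lhs_node(source)
  statement = Ripper.sexp(source)[1][0]
  statement[1]
end

p lhs_node("a = 1")     # built by dispatch1(var_field, ...)
p lhs_node("a[0] = 1")  # built by dispatch2(aref_field, ...)
```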
BNF (part)
program    : compstmt

compstmt   : stmts opt_terms

stmts      : none
           | stmt
           | stmts terms stmt

stmt       : kALIAS fitem fitem
           | kALIAS tGVAR tGVAR
           ...
           | expr

expr       : kRETURN call_args
           | kBREAK call_args
           ...
           | '!' command_call
           | arg

arg        : lhs '=' arg
           | var_lhs tOP_ASGN arg
           | primary_value '[' aref_args ']' tOP...
           ...
           | arg '?' arg ':' arg
           | primary

primary    : literal
           | strings
           ...
           | tLPAREN_ARG expr ')'
           | tLPAREN compstmt ')'
           ...
           | kREDO
           | kRETRY
Assign
stmt : ...
 | mlhs '=' command_call
     {
     /*%%%*/
         value_expr($3);
         $1->nd_value = $3;
         $$ = $1;
     /*%
         $$ = dispatch2(massign, $1, $3);
     %*/
     }
mlhs
mlhs: mlhs_basic | ...
mlhs_basic: mlhs_head | ...
mlhs_head: mlhs_item ',' | ...
mlhs_item: mlhs_node | ...
mlhs_node: variable {
  $$ = assignable($1, 0); }
Method call
block_command        : block_call
| block_call '.' operation2 command_args
    {
    /*%%%*/
        $$ = NEW_CALL($1, $3, $4);
    /*%
        $$ = dispatch3(call, $1, ripper_id2sym('.'),
        $$ = method_arg($$, $4);
    %*/
    }
Mix!
var_ref: ...
| tINCR variable
    {
    /*%%%*/
        $$ = assignable($2, 0);
        $$->nd_value = NEW_CALL(gettable($$->nd_vid),
                                rb_intern("succ"), 0);
    /*%
        $$ = dispatch2(unary, ripper_intern("++@"), $2);
    %*/
    }
++ruby
Case 4:
         def A#b

• A#b
 instance method b of class A
• A.b
 class method b of class A
A#b
class A
  def b
    ...
  end
end

def A.b
  ...
end
A#b
def A#b
  ...
end

def A.b
  ...
end
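
For comparison, this is how the two kinds of method are written and called in stock Ruby, where the def A#b form does not exist. A small runnable sketch; the return values :instance and :class are placeholders chosen for the example.

```ruby
class A
  def b          # instance method, written A#b in documentation
    :instance
  end
end

def A.b          # class (singleton) method, written A.b
  :class
end

p A.new.b                     # calls the instance method
p A.b                         # calls the class method
p A.instance_methods(false)   # methods defined for instances of A
p A.singleton_methods         # methods defined on A itself
```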
#
(in parser_yylex)
case '#':                 /* it's a comment */
 /* no magic_comment in shebang line */
 if (!parser_magic_comment(parser, lex_p, lex_pend - lex_p)) {
     if (comment_at_top(parser)) {
            set_file_encoding(parser, lex_p, lex_pend);
     }
 }
 lex_p = lex_pend;
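
With this unpatched rule, everything from # to the end of the line is swallowed, so def A#b(c) never reaches the parser as a method name. That can be checked with Ripper in a stock Ruby. A sketch; visible_tokens is a hypothetical helper and the event names are Ripper's.

```ruby
require 'ripper'

# List [event, text] pairs for every token except whitespace.
def visible_tokens(source)
  Ripper.lex(source).reject { |t| t[1] == :on_sp }.map { |t| [t[1], t[2]] }
end

p visible_tokens("def A#b(c)")
```

The lexer sees the keyword, the constant A, and then a single comment token holding the rest of the line.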
#
(in parser_yylex)
case '#':                 /* it's a comment */
 c = nextc();
 pushback(c);
 if (lex_state == EXPR_END && ISALNUM(c)) return '#';
 /* no magic_comment in shebang line */
 if (!parser_magic_comment(parser, lex_p, lex_pend - lex_p)) {
     if (comment_at_top(parser)) {
            set_file_encoding(parser, lex_p, lex_pend);
Primary
primary: literal | ...
       | k_def singleton dot_or_colon {lex_state = EXPR_FNAME;} fname
           {
               in_single++;
               lex_state = EXPR_END; /* force for args */
           /*%%%*/
               local_push(0);
           /*%
           %*/
           }
         f_arglist
         bodystmt
         k_end
           {
           /*%%%*/
               NODE *body = remove_begin($8);
               reduce_nodes(&body);
               $$ = NEW_DEFS($2, $5, $7, body);
               fixpos($$, $2);
               local_pop();
           /*%
               $$ = dispatch5(defs, $2, $3, $5, $7, $8);
           %*/
               in_single--;
           }
| k_def cname '#' {lex_state = EXPR_FNAME;} fname
    {
        $<id>$ = cur_mid;
        cur_mid = $5;
        in_def++;
    /*%%%*/
        local_push(0);
    /*%
    %*/
    }
  f_arglist
  bodystmt
  k_end
    {
    /*%%%*/
        NODE *body = remove_begin($8);
        reduce_nodes(&body);
        $$ = NEW_DEFN($5, $7, body, NOEX_PRIVATE);
        fixpos($$, $7);
        fixpos($$->nd_defn, $7);
        $$ = NEW_CLASS(NEW_COLON3($2), $$, 0);
        nd_set_line($$, $<num>6);
        local_pop();
    /*%
        $$ = dispatch4(defi, $2, $5, $7, $8);
    %*/
        in_def--;
        cur_mid = $<id>6;
    }
Reference
Ruby




Minero AOKI, Yukihiro MATSUMOTO
"Ruby Hacking Guide"

HTML version is available
Reference


• My blog
 http://ujihisa.blogspot.com
• All patches I showed are there
end
Appendix:
Imaginary Numbers
• Matz wrote a patch in
 [ruby-dev:38843]
• translation:
 [ruby-core:24730]
• It won't be accepted
Appendix:
 Imaginary Numbers

> 3i
=> (0 + 3i)
> 3i.class
=> Complex
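
Without any parser patch, the same value can be built through Kernel#Complex. A minimal sketch; imaginary is a hypothetical helper name. As a side note, Ruby 2.1 and later accept an i suffix on numeric literals, so the 3i literal itself works there.

```ruby
# Build the value that the literal 3i stands for: real part 0, imaginary part n.
def imaginary(n)
  Complex(0, n)
end

p imaginary(3)
p imaginary(3).class
```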
Hacking parse.y (RubyKansai38)

 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
 
OpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - AuthorizationOpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - Authorization
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 
Project Management Semester Long Project - Acuity
Project Management Semester Long Project - AcuityProject Management Semester Long Project - Acuity
Project Management Semester Long Project - Acuity
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
Webinar: Designing a schema for a Data Warehouse
Webinar: Designing a schema for a Data WarehouseWebinar: Designing a schema for a Data Warehouse
Webinar: Designing a schema for a Data Warehouse
 

Hacking parse.y (RubyKansai38)

  • 1. Hacking parse.y Tatsuhiro UJIHISA ujihisa@gmail.com http://ujihisa.blogspot.com/ @ujm
  • 2. hi • I'm from Vancouver, Canada
  • 4. Hacking parse.y Fixing ruby parser to understand ruby • Introducing new syntax • {:key :-) "value"} • 'symbol • ++i • def A#b(c) • {1}
  • 5. MRI Inside • MRI (Matz Ruby Implementation) • $ ruby -v ruby 1.9.2dev (2009-08-05 trunk 24397) [i386-darwin9.7.0] • Written in C • array.c, vm.c, gc.c, etc...
  • 6. ruby 1.8 vs 1.9 • ~1.8 • Parser: parse.y • Evaluator: eval.c • 1.9~ • Parser: parse.y • Evaluator: YARV (vm*.c)
  • 7. Matz said • Ugly: eval.c and parse.y (RubyConf 2006) • The original evaluator has since been entirely replaced by YARV
  • 8. MRI Parser • MRI uses yacc (parser generator for C) • bison -d -o y.tab.c parse.y • sed -f ./tool/ytab.sed -e "/^#/s!y.tab.c!parse.c!" y.tab.c > parse.c.new ...
  • 9. parse.y • One of the darkest sides • $ wc -l *{c,h,y} | sort -n ... 9261 io.c 10350 parse.y 16352 parse.c # (automatically generated) 183370 total
  • 10. (Broad) Parser • Lexer (yylex) • Bytes → Symbols • Parser (yyparse) • Symbols → Syntax Tree
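The two stages on slide 10 can be observed from Ruby itself. The sketch below uses Ripper, the standard-library parser interface that is generated from the same parse.y (the `/*% ... %*/` sections on later slides are its hooks). The variable names are only illustrative, and the token and tree shapes shown are those of current Ruby releases, not necessarily of the 1.9.2dev build used in the talk.

```ruby
require 'ripper'

source = "a = 1"

# Lexer stage (yylex): source bytes -> tokens.
# Each entry begins with [[line, column], token_type, token_text].
tokens = Ripper.lex(source)
token_types = tokens.map { |entry| entry[1] }
tokens.each { |entry| puts "#{entry[1]} #{entry[2].inspect}" }

# Parser stage (yyparse): tokens -> syntax tree, returned as nested arrays.
tree = Ripper.sexp(source)
p tree
```

The first call corresponds to "Bytes → Symbols" and the second to "Symbols → Syntax Tree".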
  • 11. Tokens in Lexer %token tUPLUS /* unary+ */ %token tUMINUS /* unary- */ %token tPOW /* ** */ %token tCMP /* <=> */ %token tEQ /* == */ %token tEQQ /* === */ %token tNEQ /* != */ %token tGEQ /* >= */ %token tLEQ /* <= */ %token tANDOP tOROP /* && and || */ %token tMATCH tNMATCH /* =~ and !~ */ %token tDOT2 tDOT3 /* .. and ... */ %token tAREF tASET /* [] and []= */ %token tLSHFT tRSHFT /* << and >> */ %token tCOLON2 /* :: */ %token tCOLON3 /* :: at EXPR_BEG */ %token <id> tOP_ASGN /* +=, -= et %token tASSOC /* => */ %token tLPAREN /* ( */ %token tLPAREN_ARG /* ( */ %token tRPAREN /* ) */ %token tLBRACK /* [ */ %token tLBRACE /* { */ %token tLBRACE_ARG /* { */ %token tSTAR /* * */ %token tAMPER /* & */ %token tLAMBDA /* -> */ %token tSYMBEG tSTRING_BEG tXSTRING_ tWORDS_BEG tQWORDS_BEG %token tSTRING_DBEG tSTRING_DVAR tST
  • 12. (detour) • MRI: parse.y (10350 lines) • JRuby: src/org/jruby/parser/{DefaultRubyParser.y, Ruby19Parser.y} (1886, 2076 lines) • Rubinius: lib/ruby_parser.y (1795 lines)
  • 13. Case 1: :-) • Hash literal {:key => 'value'} {:key :-) 'value'} • :-) is just an alias of =>
  • 15. Colons in Ruby • A::B, ::C • :symbol, :"sy-m-bol" • a ? b : c • {a: b} • when 1: something (in 1.8)
  • 16. static int parser_yylex(struct parser_params *parser) { ... switch (c = nextc()) { ... case '#': /* it's a comment */ ... case ':': c = nextc(); if (c == ':') { if (IS_BEG() ||... ... } ... (about 1300 lines)
  • 17. How does parser deal with colon? • :: → tCOLON2 or tCOLON3 • tCOLON2 Net::URI • tCOLON3 ::Kernel
  • 18. lex_state enum lex_state_e { EXPR_BEG, /* ignore newline, +/- is a sign. */ EXPR_END, /* newline significant, +/- is an operator. * EXPR_ENDARG, /* ditto, and unbound braces. */ EXPR_ARG, /* newline significant, +/- is an operator. * EXPR_CMDARG, /* newline significant, +/- is an operator. * EXPR_MID, /* newline significant, +/- is an operator. * EXPR_FNAME, /* ignore newline, no reserved words. */ EXPR_DOT, /* right after `.' or `::', no reserved words EXPR_CLASS, /* immediate after `class', no here document. EXPR_VALUE /* alike EXPR_BEG but label is disallowed. */ };
  • 19. case ':': c = nextc(); if (c == ':') { if (IS_BEG() || lex_state == EXPR_CLASS || (IS_ARG() && space_seen)) { lex_state = EXPR_BEG; return tCOLON3; } lex_state = EXPR_DOT; return tCOLON2; }
  • 20. ... if (lex_state == EXPR_END || lex_state == EXPR_ENDARG || (c != -1 && ISSPACE(c))) { pushback(c); lex_state = EXPR_BEG; return ':'; } switch (c) { case '\'': lex_strterm = NEW_STRTERM(str_ssym, c, 0); break; case '"': lex_strterm = NEW_STRTERM(str_dsym, c, 0); break; default: pushback(c); break; } lex_state = EXPR_FNAME; return tSYMBEG;
  • 21. How does parser deal with colon? (summary) • :: → tCOLON2 or tCOLON3 • EXPR_END, EXPR_ENDARG, or followed by whitespace → ':' • otherwise → tSYMBEG • :' → str_ssym • :" → str_dsym
  • 22. So, • :-) → tASSOC • :: → tCOLON2 or tCOLON3 • EXPR_END, EXPR_ENDARG, or followed by whitespace → ':' • otherwise → tSYMBEG • :' → str_ssym • :" → str_dsym
  • 23. :-)
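The colon decision in Case 1 can be condensed into a small Ruby model. This is a sketch, not MRI code: `classify_colon` and its arguments are hypothetical names, the `IS_BEG()` and `IS_ARG()` macros are approximated by a few state names, and checking for `:-)` before the other branches is an assumption about where the talk's patch hooks in.

```ruby
# Toy model of the ':' branch of parser_yylex (slides 19-20) plus the
# ":-)" alias from slide 22. `rest` is the text following the ':' just read.
def classify_colon(rest, lex_state:, space_seen: false)
  # New rule from the talk (placement is an assumption): ":-)" acts as "=>".
  return :tASSOC if rest.start_with?('-)')

  if rest.start_with?(':')
    # Simplified stand-in for IS_BEG() || EXPR_CLASS || (IS_ARG() && space_seen).
    at_beginning = [:EXPR_BEG, :EXPR_CLASS].include?(lex_state) ||
                   (lex_state == :EXPR_ARG && space_seen)
    return at_beginning ? :tCOLON3 : :tCOLON2
  end

  c = rest[0]
  # A plain ':' (ternary, label) after an expression end or before whitespace.
  if [:EXPR_END, :EXPR_ENDARG].include?(lex_state) || (c && c =~ /\s/)
    return ':'
  end

  # Anything else starts a symbol (:sym, :'sym', :"sym").
  :tSYMBEG
end

p classify_colon(':Kernel', lex_state: :EXPR_BEG)     # :tCOLON3
p classify_colon(':URI', lex_state: :EXPR_END)        # :tCOLON2
p classify_colon(' c', lex_state: :EXPR_END)          # ":"
p classify_colon('symbol', lex_state: :EXPR_BEG)      # :tSYMBEG
p classify_colon('-) "value"', lex_state: :EXPR_END)  # :tASSOC
```

The real lexer also updates `lex_state` on every return, which this model leaves out.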
  • 24. Case 2: Lisp Like Symbol • Symbol Literal :vancouver 'vancouver • Ad-hoc p :a, :b p 'a, 'b
  • 25. Single Quote (in parser_yylex) ... case '\'': lex_strterm = NEW_STRTERM(str_squote, '\'', 0); return tSTRING_BEG; ...
  • 26. Single Quote (in parser_yylex) ... case '\'': if (??? condition ???) { lex_state = EXPR_FNAME; return tSYMBEG; } lex_strterm = NEW_STRTERM(str_squote, '\'', 0); return tSTRING_BEG; ...
  • 27. (loop (lambda (p 'good)))
  • 28. Case 3: Pre-Increment Operator • ++i • i = i.succ (NOT i = i + 1)
  • 29. Lexer @@ -685,6 +685,7 @@ static void token_info_pop(struct parser_params*, const char *token); %type <val> program reswords then do dot_or_colon %*/ %token tUPLUS /* unary+ */ +%token tINCR /* ++var */ %token tUMINUS /* unary- */ %token tPOW /* ** */ %token tCMP /* <=> */ (Actually there are more trivial fixes)
  • 30. regenerate id.h • id.h is automatically generated by parse.y in make • $ rm id.h $ make
  • 31. parser example variable : tIDENTIFIER | tIVAR | tGVAR | tCONSTANT | tCVAR | keyword_nil {ifndef_ripper($$ = keyword_nil);} | keyword_self {ifndef_ripper($$ = keyword_self);} | keyword_true {ifndef_ripper($$ = keyword_true);} | keyword_false {ifndef_ripper($$ = keyword_false);} | keyword__FILE__ {ifndef_ripper($$ = keyword__FILE__);} | keyword__LINE__ {ifndef_ripper($$ = keyword__LINE__);} | keyword__ENCODING__ {ifndef_ripper($$ = keyword__ENCODING_ ;
  • 32. lhs : variable { /*%%%*/ if (!($$ = assignable($1, 0))) $$ = NEW_BEGIN(0); /*% $$ = dispatch1(var_field, $1); %*/ } | primary_value '[' opt_call_args rbracket { /*%%%*/ $$ = aryset($1, $3); /*% $$ = dispatch2(aref_field, $1, escape_Qundef($3)); %*/ } ...
  • 33. BNF (part) program : compstmt compstmt : stmts opt_terms stmts : none | stmt | stmts terms stmt stmt : kALIAS fitem fitem | kALIAS tGVAR tGVAR | expr expr : kRETURN call_args | kBREAK call_args | kREDO | kRETRY | '!' command_call | arg arg : lhs '=' arg | var_lhs tOP_ASGN arg | primary_value '[' aref_args ']' tOP | arg '?' arg ':' arg | primary primary : literal | strings | tLPAREN_ARG expr ')' | tLPAREN compstmt ')'
  • 34. Assign stmt : ... | mlhs '=' command_call { /*%%%*/ value_expr($3); $1->nd_value = $3; $$ = $1; /*% $$ = dispatch2(massign, $1, $3); %*/ }
  • 35. mlhs mlhs: mlhs_basic | ... mlhs_basic: mlhs_head | ... mlhs_head: mlhs_item ',' | ... mlhs_item: mlhs_node | ... mlhs_node: variable { $$ = assignable($1, 0); }
  • 36. Method call block_command : block_call | block_call '.' operation2 command_args { /*%%%*/ $$ = NEW_CALL($1, $3, $4); /*% $$ = dispatch3(call, $1, ripper_id2sym('.'), $$ = method_arg($$, $4); %*/ }
  • 37. Mix! var_ref: ... | tINCR variable { /*%%%*/ $$ = assignable($2, 0); $$->nd_value = NEW_CALL(gettable($$->nd_vid), rb_intern("succ"), 0); /*% $$ = dispatch2(unary, ripper_intern("++@"), $2); %*/ }
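Why slide 28 insists on `i = i.succ` rather than `i = i + 1` shows up as soon as the variable is not a number. The following sketch runs on an unpatched Ruby and spells out by hand the assignment that the `tINCR variable` rule builds; the variable names are illustrative.

```ruby
# What the patched parser would build for `++i`: an assignment of i.succ.
i = 41
i = i.succ
p i                      # 42

# succ is defined for more than integers, so the same rewrite works here.
label = "az"
label = label.succ
p label                  # "ba"

# The rejected alternative, i + 1, fails for strings.
plus_one_failed = begin
  "az" + 1
  false
rescue TypeError
  true
end
p plus_one_failed        # true

# In stock Ruby, ++j already parses, as two unary pluses, and changes nothing.
j = 5
k = ++j
p [j, k]                 # [5, 5]
```

The last lines are a reminder that the patch changes the meaning of syntax that is already valid.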
  • 39. Case 4: def A#b • A#b: instance method b of class A • A.b: class method b of class A
  • 40. A#b • class A def b ... end end • def A.b ... end
  • 41. A#b • def A#b ... end • def A.b ... end
  • 42. # (in parser_yylex) case '#': /* it's a comment */ /* no magic_comment in shebang line */ if (!parser_magic_comment(parser, lex_p, lex_pend - lex_p)) { if (comment_at_top(parser)) { set_file_encoding(parser, lex_p, lex_pend); } } lex_p = lex_pend;
  • 43. # (in parser_yylex) case '#': /* it's a comment */ c = nextc(); pushback(c); if(lex_state == EXPR_END && ISALNUM(c)) return '#'; /* no magic_comment in shebang line */ if (!parser_magic_comment(parser, lex_p, lex_pend - lex_p)) { if (comment_at_top(parser)) { set_file_encoding(parser, lex_p, lex_pend);
  • 44. Primary primary: literal | ... | k_def singleton dot_or_colon {lex_state = EXPR_FNAME;} fname { in_single++; lex_state = EXPR_END; /* force for args */ /*%%%*/ local_push(0); /*% %*/ } f_arglist bodystmt k_end { /*%%%*/ NODE *body = remove_begin($8); reduce_nodes(&body); $$ = NEW_DEFS($2, $5, $7, body); fixpos($$, $2); local_pop(); /*% $$ = dispatch5(defs, $2, $3, $5, $7, $8); %*/ in_single--; }
  • 45. | k_def cname '#' {lex_state = EXPR_FNAME;} fname { $<id>$ = cur_mid; cur_mid = $5; in_def++; /*%%%*/ local_push(0); /*% %*/ } f_arglist bodystmt k_end { /*%%%*/ NODE *body = remove_begin($8); reduce_nodes(&body); $$ = NEW_DEFN($5, $7, body, NOEX_PRIVATE); fixpos($$, $7); fixpos($$->nd_defn, $7); $$ = NEW_CLASS(NEW_COLON3($2), $$, 0); nd_set_line($$, $<num>6); local_pop(); /*% $$ = dispatch4(defi, $2, $5, $7, $8); %*/ in_def--; cur_mid = $<id>6; }
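The node built on slide 45, `NEW_CLASS(NEW_COLON3($2), NEW_DEFN(...), 0)`, corresponds to reopening the top-level constant and defining an ordinary instance method inside it. A minimal sketch in unpatched Ruby follows; the class name `A`, the method bodies, and the return strings are illustrative.

```ruby
class A
  # A.b: a singleton (class) method, which `def A.b(c)` also defines.
  def self.b(c)
    "class method got #{c}"
  end
end

# Hand-written equivalent of the proposed `def A#b(c) ... end`:
# reopen ::A (the NEW_COLON3 part) and define b inside it (the NEW_DEFN part).
class ::A
  def b(c)
    "instance method got #{c}"
  end
end

puts A.b(1)
puts A.new.b(2)
```

With the patch, the second definition would be written on one line as `def A#b(c)`, at the cost of `#` no longer starting a comment in that position.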
  • 48. Reference • My blog http://ujihisa.blogspot.com • All patches I showed are there
  • 49. end
  • 50. Appendix: Imaginary Numbers • Matz wrote a patch in [ruby-dev:38843] • translation: [ruby-core:24730] • It won't be accepted
  • 51. Appendix: Imaginary Numbers > 3i => (0 + 3i) > 3i.class => Complex
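On a Ruby without that patch, the same value can be built through the `Complex` constructor. A short sketch, with illustrative variable names; the printed form is that of current Ruby releases and differs slightly in spacing from the slide.

```ruby
# The value the slide writes as 3i, built without literal syntax.
z = Complex(0, 3)
p z                          # (0+3i)
p z.class                    # Complex

# Complex::I is the imaginary unit, so scaling it gives the same number.
p Complex::I * 3 == z        # true

# i squared is -1, hence (3i)**2 is -9.
p z * z                      # (-9+0i)
```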