Ppcrslidesannotated


Though yacc-like LR parser generators provide ways to resolve shift-reduce
conflicts using token precedences, no mechanisms are provided
for the resolution of difficult reduce-reduce conflicts. To resolve this kind
of conflict the language designer has to modify the grammar.


Transcript

  • 1. Solving Difficult LR Parsing Conflicts by Postponing Them
      Luis Garcia-Forte and Casiano Rodriguez-Leon, Universidad de La Laguna
      CoRTA2010, September 09-10, 2010
  • 2. Outline Introduction Context Dependent Parsing Nested Parsing Conclusions
  • 6. Yacc Limitations
      • Yacc-like LR parser generators provide ways to resolve shift-reduce conflicts through mechanisms based on token precedence.
      • No mechanisms are provided for the resolution of reduce-reduce conflicts or difficult shift-reduce conflicts.
      • To resolve such conflicts the language designer has to modify the grammar.
  • 9. Postponed Conflict Resolution Approach
      • The strategy presented here extends the yacc conflict resolution mechanisms with new ones, supplying ways to resolve conflicts that cannot be solved using static precedences.
      • The algorithm for the generation of the LR tables remains unchanged, but the programmer can modify the parsing tables at run time.
  • 18. PPCR Approach: Identify the Conflict
      • Identify the conflict: which LR(0) items/productions and tokens are involved?
      • Tools must support this stage, for example via the .output file generated by eyapp.
      • Suppose we identify that the participants are the two LR(0) items A → α↑ and B → β↑ when the lookahead token is @.
  • 21. PPCR Approach: Identify the Conflict
      • The software must allow the use of symbolic labels to refer by name to the productions involved in the conflict.
      • In Eyapp, names and labels can be given explicitly using the %name directive, with a syntax similar to this:
            %name :rA A → α
            %name :rB B → β
  • 24. PPCR Approach: Identify the Conflict
      • Give a symbolic name to the conflict. In this case we choose IsAorB as the name of the conflict.
      • Inside the body section of the grammar, mark the points of conflict using the new reserved word %PREC followed by the conflict name:
            %name :rA A → α %PREC IsAorB
            %name :rB B → β %PREC IsAorB
  • 31. Conflict Identification: Shift-Reduce Conflicts
      The procedure is similar for shift-reduce conflicts.
      • Let us assume we have identified a shift-reduce conflict between the LR(0) items A → α↑ and B → β ↑ γ for some token '@'.
      • Again, we must give a symbolic name to A → α and mark with the new %PREC directive the places where the conflict occurs:
            %name :rA A → α %PREC IsAorB
            B → β %PREC IsAorB γ
  • 35. PPCR Approach: The Conflict Handler
      • Introduce a %conflict directive in the head section of the translation scheme to specify how the conflict will be resolved.
      • The directive is followed by some code, known as the conflict handler, whose mission is to modify the parsing tables.
      • This code is executed each time the associated conflict state is reached.
  • 38. PPCR Approach: The Conflict Handler
      • This is the usual layout of the conflict handler for a reduce-reduce conflict:
            %conflict IsAorB {
              if (is_A) { $self->YYSetReduce('@', ':rA' ); }
              else { $self->YYSetReduce('@', ':rB' ); }
            }
      Inside a conflict handler the Perl default variable $_ refers to the input and $self refers to the parser object.
  • 39. PPCR Approach: The Conflict Handler
      • This is the usual layout of the conflict handler for a shift-reduce conflict:
            %conflict IsAorB {
              if (is_A) { $self->YYSetReduce('@', ':rA' ); }
              else { $self->YYSetShift('@'); }
            }
  • 42. PPCR Approach: The Conflict Handler
      • The method YYSetReduce receives a token (or a reference to a list of tokens) and a production label, like :rA. The call $self->YYSetReduce('@', ':rA' ) sets the parsing action for the state associated with the conflict IsAorB to reduce by the production :rA when the current lookahead is @.
      • The method YYSetShift receives a token (or a reference to a list of tokens). The call $self->YYSetShift('@') sets the parsing action for the state associated with the conflict IsAorB to shift when the current lookahead is @.
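The effect of the two methods described above can be pictured with a toy action table. This is only a sketch in plain Perl: the %action hash, the state number 11 and the helpers set_reduce, set_shift and is_a_or_b_handler are hypothetical names chosen for illustration and are not part of Parse::Eyapp, whose internal table layout is not shown in these slides.

```perl
use strict;
use warnings;

# Hypothetical miniature of an LR action table: state => { token => action }.
# An action is ['shift'] or ['reduce', $label]. These names only mimic the
# idea behind YYSetReduce/YYSetShift; they are not Parse::Eyapp internals.
my %action = ( 11 => { '@' => undef } );   # state 11 holds the unresolved conflict

# Accept one token or a reference to a list of tokens.
sub tokens_of { my ($t) = @_; return ref $t ? @$t : ($t) }

sub set_reduce {
    my ($state, $tokens, $label) = @_;
    $action{$state}{$_} = ['reduce', $label] for tokens_of($tokens);
}

sub set_shift {
    my ($state, $tokens) = @_;
    $action{$state}{$_} = ['shift'] for tokens_of($tokens);
}

# A conflict handler: consult dynamic context, then patch the table.
sub is_a_or_b_handler {
    my ($state, $is_a) = @_;
    if ($is_a) { set_reduce($state, '@', ':rA') }
    else       { set_reduce($state, '@', ':rB') }
}

is_a_or_b_handler(11, 1);
print join(' ', @{ $action{11}{'@'} }), "\n";   # reduce :rA
is_a_or_b_handler(11, 0);
print join(' ', @{ $action{11}{'@'} }), "\n";   # reduce :rB
```

The point of the sketch is that the table entry is rewritten every time the handler runs, so the same state can resolve the same lookahead differently at different moments of the parse.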
  • 44. PPCR Approach: The Conflict Handler
            %conflict IsAorB {
              if (is_A) { $self->YYSetReduce('@', ':rA' ); }
              else { $self->YYSetReduce('@', ':rB' ); }
            }
      • The call to is_A represents the context-dependent dynamic knowledge that allows us to take the right decision.
      • It is usually a call to a nested parser to check that A → α applies, but it can also be any other contextual information that determines which production is the right one.
  • 51. A Simple Example: Context Dependent Associativity
            $ cat input_for_dynamicgrammar.txt
            2-1-1 # left: 0 = (2-1)-1
            RIGHT
            2-1-1 # right: 2 = 2-(1-1)
            LEFT
            3-1-1 # left: 1 = (3-1)-1
            RIGHT
            3-1-1 # right: 3 = 3-(1-1)
  • 56. A Simple Example: Body
            p: c * {}
            ;
            c: $expr { print "$expr\n" }
              | RIGHT { $reduce = 0}
              | LEFT { $reduce = 1}
            ;
            expr: '(' $expr ')' { $expr }
              | NUM
              | %name :M expr.left %PREC leftORright '-' expr.right %PREC leftORright { $left - $right }
            ;
            %%
  • 61. A Simple Example: Head Section
            $ cat dynamicgrammar.eyp
            %{
            my $reduce = 1;
            %}
            %whites /(\s*(?:#.*)?\s*)/
            %token NUM = /(\d+)/
            %conflict leftORright {
              if ($reduce) { $self->YYSetReduce('-', ':M') }
              else { $self->YYSetShift('-') }
            }
            %expect 1
            %%
      Slide annotation: the %whites and %token declarations produce an automatically generated lexical analyzer.
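The behaviour expected from this example can be reproduced without a parser generator. The following is a minimal sketch in plain Perl, not the code Eyapp generates: the functions evaluate, reduce_top and run are hypothetical, and the stack-based loop only imitates the single decision the leftORright handler makes (reduce or shift when '-' is the lookahead and "expr - expr" is on the stack).

```perl
use strict;
use warnings;

my $reduce = 1;   # 1: reduce on '-' (left associative); 0: shift (right associative)

# Replace "a - b" on top of the stack by its value.
sub reduce_top {
    my ($stack) = @_;
    my ($left, undef, $right) = splice @$stack, -3;
    push @$stack, $left - $right;
}

# Evaluate a chain such as "2-1-1" with a tiny shift-reduce loop.
sub evaluate {
    my ($text) = @_;
    my @tokens = $text =~ /(\d+|-)/g;
    my @stack;
    for my $tok (@tokens) {
        if ($tok eq '-') {
            # Conflict point: the flag decides between reducing first and shifting.
            reduce_top(\@stack) while $reduce && @stack >= 3;
        }
        push @stack, $tok;                     # shift the token
    }
    reduce_top(\@stack) while @stack >= 3;     # end of input: reduce what is left
    return $stack[0];
}

# Process lines like those of input_for_dynamicgrammar.txt.
sub run {
    my (@lines) = @_;
    my @results;
    for my $line (@lines) {
        (my $text = $line) =~ s/#.*//;         # drop comments
        if    ($text =~ /RIGHT/) { $reduce = 0 }
        elsif ($text =~ /LEFT/)  { $reduce = 1 }
        elsif ($text =~ /\d/)    { push @results, evaluate($text) }
    }
    return @results;
}

my @out = run('2-1-1', 'RIGHT', '2-1-1', 'LEFT', '3-1-1', 'RIGHT', '3-1-1');
print "@out\n";   # 0 2 1 3
```

The four results match the comments in the sample input: the same expression is evaluated as (2-1)-1 or 2-(1-1) depending on the directive seen before it.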
  • 66. Enumerated and Subrange Types in Pascal
      • The problem is taken from the Bison manual.
            type subrange = lo .. hi;
            type enum = (a, b, c);
      • The Pascal standard allows only numeric literals and constant identifiers for the subrange bounds (lo and hi), but Extended Pascal and many other implementations allow arbitrary expressions there.
            type subrange = (a) .. b;
            type enum = (a);
        Slide annotation: the parser cannot decide between the two until it has read past (a).
      • To aggravate the conflict we have added the C comma operator to the language:
            type subrange = (a, b, c) .. (d, e);
            type enum = (a, b, c);
  • 71. Enumerated vs Range: Full Grammar
            %token ID = /([A-Za-z]\w*)/
            %token NUM = /(\d+)/
            %left ','
            %left '-' '+'
            %left '*' '/'
            %%
            type_decl : 'TYPE' ID '=' type ';'
            ;
            type : '(' id_list ')'
              | expr '..' expr
            ;
            id_list : ID
              | id_list ',' ID
            ;
            expr : ID
              | NUM
              | '(' expr ')'
              | expr ',' expr /* new */
              | expr '+' expr
              | expr '-' expr
              | expr '*' expr
              | expr '/' expr
            ;
      Slide annotation: "These are the productions involved in the conflict."
  • 73. Identifying the Conflict
            $ eyapp -v pascalenumeratedvsrange.eyp
            2 reduce/reduce conflicts

            State 11:
              id_list -> ID . (Rule 4)
              expr -> ID . (Rule 12)

              ')' [reduce using rule 12 (expr)]
              ')' reduce using rule 4 (id_list)
              ',' [reduce using rule 12 (expr)]
              ',' reduce using rule 4 (id_list)
              '*' reduce using rule 12 (expr)
              '+' reduce using rule 12 (expr)
              '-' reduce using rule 12 (expr)
              '/' reduce using rule 12 (expr)
  • 76. Enumerated vs Range: Body
            type_decl : 'type' ID '=' type ';'
            ;
            type : %name ENUM Lookahead '(' id_list ')'
              | %name RANGE Lookahead expr '..' expr
            ;
            Lookahead: /* empty */ { $is_range = $_[0]->YYPreParse('Range'); }
            ;
  • 77. Nested Parsing: Future Work
            $ cat Range2.eyp
            %from pascalnestedeyapp2 import expr
            %%
            range: expr '..' expr ';'
            ;
            %%
  • 78. Nested Parsing: Current State
            $ cat Range.eyp
            %token ID = /([A-Za-z]\w*)/
            %token NUM = /(\d+)/
            %left ','
            %left '-' '+'
            %left '*' '/'
            %%
            range: expr '..' expr ';'
            ;
            expr : '(' expr ')'
              | expr '+' expr
              | expr '-' expr
              | expr '*' expr
              | expr '/' expr
              | expr ',' expr
              | ID
              | NUM
            ;
            %%
  • 81. Enumerated vs Range: Body
            id_list : %name ID:ENUM ID %PREC rangeORenum
              | id_list ',' ID
            ;
            expr : '(' expr ')' { $_[2] } /* bypass */
              | %name ID:RANGE ID %PREC rangeORenum
              | %name PLUS expr '+' expr
              | %name MINUS expr '-' expr
              | %name TIMES expr '*' expr
              | %name DIV expr '/' expr
              | %name COMMA expr ',' expr
              | %name NUM NUM
            ;
            %%
  • 86. Enumerated vs Range: Header
            %{
            my $is_range = 0;
            %}
            %conflict rangeORenum {
              if ($is_range) { $self->YYSetReduce([',', ')'], 'ID:RANGE' ); }
              else { $self->YYSetReduce([',', ')'], 'ID:ENUM' ); }
            }
            %token ID = /([A-Za-z]\w*)/
            %token NUM = /(\d+)/
            %left ','
            %left '-' '+'
            %left '*' '/'
            %expect-rr 2
            %%
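The decision stored in $is_range comes from running a nested parser over the text that follows '='. As a rough illustration of what such a pre-parse has to establish, here is a sketch in plain Perl. It is an assumption-laden stand-in, not Eyapp's YYPreParse: looks_like_range, expr and primary are hypothetical hand-written recognizers for the Range.eyp grammar (range: expr '..' expr ';'), and they ignore operator precedence because only acceptance matters here.

```perl
use strict;
use warnings;

# The recognizers work on $_ and advance pos($_) with \G-anchored matches.
# The /c flag keeps pos($_) unchanged when a match attempt fails.

sub primary {
    return 1 if /\G\s*(?:[A-Za-z]\w*|\d+)/gc;   # ID or NUM
    return 0 unless /\G\s*\(/gc;                # otherwise '(' expr ')'
    return 0 unless expr();
    return /\G\s*\)/gc ? 1 : 0;
}

sub expr {
    primary() or return 0;
    while (/\G\s*[-+*\/,]/gc) {                 # any binary operator, comma included
        primary() or return 0;
    }
    return 1;
}

# Hypothetical stand-in for the nested Range parser:
# does the text start with  expr '..' expr ';'  ?
sub looks_like_range {
    local $_ = shift;
    pos($_) = 0;
    return 0 unless expr();
    return 0 unless /\G\s*\.\./gc;
    return 0 unless expr();
    return /\G\s*;/gc ? 1 : 0;
}

for my $src ('(a, b, c) .. (d, e);', '(a, b, c);', '(a) .. b;', '(a);') {
    printf "%-22s => %s\n", $src, looks_like_range($src) ? 'RANGE' : 'ENUM';
}
# Classifications, in order: RANGE, ENUM, RANGE, ENUM
```

A handler can then use the answer exactly as the rangeORenum handler uses $is_range: the ID inside the parentheses is reduced as an expression when the pre-parse succeeds and as an id_list element otherwise.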
  • 90. Enumerated vs Range: Execution
            $ eyapp -P Range.eyp
            $ eyapp -TC pascalnestedeyapp2.eyp
            $ ./pascalnestedeyapp2.pm -t -i -m 1 -c 'type e = (x, y, z);'
            typeDecl_is_type_ID_type(
              TERMINAL[e],
              ENUM(
                idList_is_idList_ID(
                  idList_is_idList_ID(ID(TERMINAL[x]), TERMINAL[y]),
                  TERMINAL[z]
                )
              )
            )
  • 97. Conclusions
      • The new conflict resolution mechanisms provide ways to resolve conflicts that cannot be solved using static precedences.
      • They also provide finer control over the conflict resolution process than existing alternatives, like GLR and backtracking LR.
      • There are no limitations to PPCR parsing, since the conflict handler is implemented in a universal language and can therefore resort to any kind of nested parsing algorithm.
      • The conflict resolution mechanisms presented here can be introduced in any LR parsing tool.
      • The approach certainly involves more programmer work than branching methods like GLR or backtracking LR.
  • 99. http://parse-eyapp.googlecode.com/svn/branches/PPCR Thanks!