This document summarizes a lecture on syntactic analysis and parsing. It introduces context-free grammars, derivations, parse trees, ambiguity, and notation for grammars including BNF and EBNF. Examples are provided to illustrate grammar rules and derivations for simple expressions. Syntax diagrams and parse trees are presented as tools for visualizing grammars and parsing structures. Homework is assigned to build a parse tree for a given expression.
201506 - CSE340 Lecture 08
1. CSE340 - Principles of
Programming Languages
Lecture 08:
Syntactic Analysis II
Javier Gonzalez-Sanchez
javiergs@asu.edu
BYENG M1-38
Office Hours: By appointment
2. Javier Gonzalez-Sanchez | CSE340 | Summer 2015 | 2
Outline
Language
Syntactic Analysis (Parser)
Grammar (Rules)
Non-terminal
Terminal
Derivation
Parse Tree
Tools
BNF
Syntax Diagrams
Grammar | Derivation
E → E OP E
E → integer
OP → + | - | * | /
E → ( E )
E
⇒ E OP E
⇒ integer OP E
⇒ integer * E
⇒ integer * (E)
⇒ integer * (E OP E)
⇒ integer * (integer OP E)
⇒ integer * (integer + E)
⇒ integer * (integer + integer)
5 * ( 7 + 20 )
Integer operator delimiter integer operator integer delimiter
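The derivation above always rewrites the leftmost nonterminal. A minimal Python sketch of that process follows. The names `GRAMMAR`, `leftmost_step`, `derive_leftmost` and `CHOICES` are assumptions made for illustration and do not come from the lecture.

```python
# Grammar from the slide: E and OP are nonterminals, everything else is a terminal.
GRAMMAR = {
    "E": [["E", "OP", "E"], ["integer"], ["(", "E", ")"]],
    "OP": [["+"], ["-"], ["*"], ["/"]],
}

def leftmost_step(form, rhs):
    """Rewrite the leftmost nonterminal of `form` using the right-hand side `rhs`."""
    for i, symbol in enumerate(form):
        if symbol in GRAMMAR:
            if rhs not in GRAMMAR[symbol]:
                raise ValueError(f"{symbol} -> {' '.join(rhs)} is not a rule")
            return form[:i] + rhs + form[i + 1:]
    raise ValueError("no nonterminal left to rewrite")

def derive_leftmost(choices, start="E"):
    """Return every sentential form produced by applying `choices` in order."""
    forms = [[start]]
    for rhs in choices:
        forms.append(leftmost_step(forms[-1], rhs))
    return forms

# One right-hand side per derivation step, in the order used on the slide.
CHOICES = [
    ["E", "OP", "E"],   # E => E OP E
    ["integer"],        # => integer OP E
    ["*"],              # => integer * E
    ["(", "E", ")"],    # => integer * ( E )
    ["E", "OP", "E"],   # => integer * ( E OP E )
    ["integer"],        # => integer * ( integer OP E )
    ["+"],              # => integer * ( integer + E )
    ["integer"],        # => integer * ( integer + integer )
]

for form in derive_leftmost(CHOICES):
    print("=>", " ".join(form))
```

Each printed line is one sentential form. The last one contains only terminals and matches the token classes of 5 * ( 7 + 20 ).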
5 * ( 7 + 20 )
Integer operator delimiter integer operator integer delimiter
Grammar | Derivation
E
⇒ E OP E
⇒ E OP (E)
⇒ E OP (E OP E)
⇒ E OP (E OP integer)
⇒ E OP (E + integer)
⇒ E OP (integer + integer)
⇒ E * (integer + integer)
⇒ integer * (integer + integer)
E → E OP E
E → integer
OP → + | - | * | /
E → ( E )
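This second derivation always rewrites the rightmost nonterminal and still ends in the same sentence. A short sketch of that strategy, under the assumption that each step supplies a valid right-hand side (the helper does not check it against the grammar). `NONTERMINALS`, `rightmost_step` and `STEPS` are hypothetical names.

```python
NONTERMINALS = {"E", "OP"}

def rightmost_step(form, rhs):
    """Rewrite the rightmost nonterminal of `form` with `rhs` (rule validity is not checked)."""
    for i in range(len(form) - 1, -1, -1):
        if form[i] in NONTERMINALS:
            return form[:i] + rhs + form[i + 1:]
    raise ValueError("no nonterminal left to rewrite")

# Right-hand sides in the order used on the slide.
STEPS = [
    ["E", "OP", "E"],   # E => E OP E
    ["(", "E", ")"],    # => E OP ( E )
    ["E", "OP", "E"],   # => E OP ( E OP E )
    ["integer"],        # => E OP ( E OP integer )
    ["+"],              # => E OP ( E + integer )
    ["integer"],        # => E OP ( integer + integer )
    ["*"],              # => E * ( integer + integer )
    ["integer"],        # => integer * ( integer + integer )
]

form = ["E"]
for rhs in STEPS:
    form = rightmost_step(form, rhs)
    print("=>", " ".join(form))
```

The same eight rules are used as in the leftmost derivation, only in a different order.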
Parse Tree
• A parse tree is a tree that encodes the steps in a derivation.
• Internal nodes represent the nonterminal symbols used in the productions.
• An in-order walk of the leaves yields the generated string.
• It encodes which productions are used, not the order in which those productions are applied.
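These properties can be checked on a small hand-built tree. The sketch below assumes a hypothetical `Node` class (not part of the lecture) and builds the tree for integer * ( integer + integer ) by hand. Walking the leaves from left to right gives back the string.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    symbol: str                                   # grammar symbol stored at this node
    children: list = field(default_factory=list)  # empty for leaves (terminals)

def leaves(node):
    """Collect the leaf symbols from left to right."""
    if not node.children:
        return [node.symbol]
    result = []
    for child in node.children:
        result.extend(leaves(child))
    return result

def integer():
    """Subtree for the rule E -> integer."""
    return Node("E", [Node("integer")])

def op(symbol):
    """Subtree for the rule OP -> symbol."""
    return Node("OP", [Node(symbol)])

# E -> E OP E at the root; the right operand uses E -> ( E ).
inner = Node("E", [integer(), op("+"), integer()])
paren = Node("E", [Node("("), inner, Node(")")])
tree = Node("E", [integer(), op("*"), paren])

print(" ".join(leaves(tree)))  # integer * ( integer + integer )
```

Nothing in the tree records whether the left or the right operand was expanded first, which is why both derivations shown earlier correspond to this one tree.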
Goal
Goal of syntax analysis:
• Recover the structure described by a series of tokens.
• Recover a parse tree for the given input.
The problem
E → E OP E
E → integer
OP → + | - | * | /
E → ( E )
5 * 7 + 20
Integer operator integer operator integer
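Without parentheses, the rule E → E OP E can group 5 * 7 + 20 in two ways. The sketch below uses nested tuples as a hypothetical stand-in for the two parse trees and evaluates each one, to show that the choice of tree changes the result.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def evaluate(tree):
    """Evaluate a tree that is either an int or a (left, op, right) tuple."""
    if isinstance(tree, int):
        return tree
    left, op, right = tree
    return OPS[op](evaluate(left), evaluate(right))

# Two trees the grammar allows for the same token sequence.
tree_a = ((5, "*", 7), "+", 20)   # * grouped first
tree_b = (5, "*", (7, "+", 20))   # + grouped first

print(evaluate(tree_a))  # 55
print(evaluate(tree_b))  # 135
```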
Ambiguity
• A grammar is said to be ambiguous if there is at least one string with two or more parse trees.
• Note that ambiguity is a property of grammars, not languages.
We will review this topic in the next lecture.
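The definition can be tested mechanically for a given string. The following sketch counts the parse trees that the expression grammar from the earlier slides (E → E OP E, E → integer, E → ( E )) allows for a token sequence. The function name `count_parse_trees` and the span-based counting approach are assumptions chosen for illustration, not a method presented in the lecture.

```python
from functools import lru_cache

OPERATORS = {"+", "-", "*", "/"}

def count_parse_trees(tokens):
    """Count the parse trees rooted at E for the given list of token classes."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways E can derive tokens[i:j].
        total = 0
        if j - i == 1 and tokens[i] == "integer":                   # E -> integer
            total += 1
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":
            total += count(i + 1, j - 1)                            # E -> ( E )
        for k in range(i + 1, j - 1):                               # E -> E OP E
            if tokens[k] in OPERATORS:
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))

print(count_parse_trees("integer * integer + integer".split()))      # 2
print(count_parse_trees("integer * ( integer + integer )".split()))  # 1
```

A count of 2 or more for any single string is enough to call the grammar ambiguous. The parenthesized string has only one tree, but the grammar is still ambiguous because of the first string.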
Our Tools
Backus-Naur Form (BNF)
Extended Backus-Naur Form (EBNF)
Syntax Diagrams
BNF (Backus-Naur Form)
A formal, mathematical way to specify grammars. All the previous examples use it:
• → or ::= means "is defined as"
• | means "or"
• Nonterminals are written as <nonterminal> or in uppercase
• Terminals are written in lowercase
* John Backus and Peter Naur
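To make the notation concrete, here is a small sketch that reads rules written with ::= and | into a Python dictionary. The helper names `parse_bnf` and `terminals` are hypothetical, and the reader assumes one rule per line with symbols separated by spaces, so it cannot express | as a terminal.

```python
def parse_bnf(text):
    """Map each left-hand side to its list of alternatives (each a list of symbols)."""
    grammar = {}
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        head, body = line.split("::=", 1)
        alternatives = [alt.split() for alt in body.split("|")]
        grammar.setdefault(head.strip(), []).extend(alternatives)
    return grammar

def terminals(grammar):
    """Symbols that never appear on a left-hand side are treated as terminals."""
    return {
        symbol
        for alternatives in grammar.values()
        for alt in alternatives
        for symbol in alt
        if symbol not in grammar
    }

RULES = """
E ::= E OP E
E ::= integer
OP ::= + | - | * | /
E ::= ( E )
"""

grammar = parse_bnf(RULES)
print(grammar["OP"])               # [['+'], ['-'], ['*'], ['/']]
print(sorted(terminals(grammar)))  # ['(', ')', '*', '+', '-', '/', 'integer']
```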
EBNF
Extended BNF includes notation to indicate:
• 0 or more occurrences {…}
• 1 or more occurrences +
• 0 or 1 occurrences […]
• Use of parentheses for grouping ( )
* Niklaus Wirth
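For rules without recursion, these EBNF operators line up with the familiar regular-expression operators. The sketch below uses a hypothetical rule for a signed whole number, which is not taken from the lecture, to show the correspondence with Python's `re` module.

```python
import re

# Hypothetical EBNF rule (an assumption for illustration):
#   number -> [ "-" ] digit { digit }
#
#   [ x ]   0 or 1 occurrences     ~  x?
#   { x }   0 or more occurrences  ~  x*
#   x+      1 or more occurrences  ~  x+
NUMBER = re.compile(r"-?[0-9][0-9]*")   # direct transcription of the rule
NUMBER_PLUS = re.compile(r"-?[0-9]+")   # same language, written with +

def is_number(text):
    """True when the whole text matches the hypothetical number rule."""
    return NUMBER.fullmatch(text) is not None

for sample in ["42", "-7", "-", "4-2"]:
    print(repr(sample), is_number(sample))
```

The analogy stops at recursive rules such as E → ( E ), which a regular expression of this kind cannot describe.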
Example
Grammar rule for calling a method:
• draw(x, y, z);
• print (a, b, c, d);
• done();
• foobar(one, two, three, four, five);
• sqrt(x);
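One EBNF rule that covers all five calls is sketched below. The rule itself is an assumption, since the transcript does not include the rule given in class:

call → identifier "(" [ identifier { "," identifier } ] ")" ";"

The Python recognizer that follows mirrors it directly: the optional part [ … ] becomes an `if`, and the repetition { … } becomes a `while`. The names `tokenize`, `is_identifier` and `is_call` are hypothetical.

```python
import re

# One token: an identifier or one of the punctuation marks ( ) , ;
TOKEN = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*|[(),;])")

def tokenize(text):
    """Split the text into tokens; raise ValueError on anything unexpected."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def is_identifier(token):
    return token[0].isalpha() or token[0] == "_"

def is_call(text):
    """Check the text against the assumed rule for a method call."""
    try:
        toks = tokenize(text) + ["<end>"]   # sentinel so lookups never run off the end
    except ValueError:
        return False
    i = 0
    if not is_identifier(toks[i]):
        return False
    i += 1
    if toks[i] != "(":
        return False
    i += 1
    if is_identifier(toks[i]):              # [ ... ]: the argument list is optional
        i += 1
        while toks[i] == ",":               # { ... }: zero or more ", identifier"
            i += 1
            if not is_identifier(toks[i]):
                return False
            i += 1
    if toks[i] != ")":
        return False
    i += 1
    if toks[i] != ";":
        return False
    i += 1
    return toks[i] == "<end>"

for call in ["draw(x, y, z);", "print (a, b, c, d);", "done();", "sqrt(x);"]:
    print(call, is_call(call))
```

In this sketch arguments are limited to identifiers, which is all the examples above need. Literals or nested calls would require extending the rule.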
Homework
Create a parse tree for the following expression, using the rules stated in the previous lecture:
while ( 5 ) { if ( 6 ) { } }
23. CSE340 - Principles of Programming Languages
Javier Gonzalez-Sanchez
javiergs@asu.edu
Summer 2015
Disclaimer. These slides can only be used as study material for the class CSE340 at ASU. They cannot be distributed or used for another purpose.