COMPILER DESIGN

Compiler Design
Swarnalatha Prathipati
Assistant Professor
Department of CSE
GITAM Institute of Technology (GIT)
Visakhapatnam – 530045
Email: sprathip2@gitam.edu
Ph no:7893210891
2/15/2023
Department of Computer Science & Engineering, GIT
1

Module-II
Syntax Analysis (Part-I):

Syllabus
Syntax Analysis (Part-I):
 Introduction
 Context free grammars
 Top down parsing : Brute force parsing, recursive descent parsing,
predictive parsing, error recovery in predictive parsing
 Bottom up parsing, shift reduce parsing, operator precedence parsing
 Error recovery in operator precedence parsing.

 Learning Outcomes:
 Explore both Top-Down and Bottom-Up Parsing techniques
 Prescribed Textbook:
Compilers: Principles, Techniques, and Tools, Alfred V. Aho, J. D. Ullman, Ravi Sethi; 2nd
Edition, Pearson Education

Introduction
 The syntax analyzer checks that the source program follows the syntax of the language.
 It takes the tokens from the lexical analyzer and groups them so that
programming structures can be recognized.
 If the tokens cannot be grouped into any valid structure, a syntax error is
reported.
 Parsing (syntax analysis) is the process that takes an input string w and
produces either a parse tree or syntax errors.
 The output of the syntax analyzer is a tree-like structure called the parse tree.

Role of Parser
 The parser obtains a string of tokens from the lexical analyzer and verifies
that the string can be generated by the grammar for the source language.
 The parser reports any syntax errors it finds in the source program.

Basic Issues in Parsing
 There are two important issues in parsing
1) Specifications of syntax
2) Representation of input after parsing

Specification of syntax
 The specification of the syntax of a programming language must have
certain characteristics:
1) It should be precise and unambiguous.
2) It should be detailed (cover all the details of the programming language).
3) It should be complete.
 Such a specification is written as a context-free grammar.

Representation of the input after parsing
 The representation matters because all subsequent phases of the compiler take
their information from the parse tree that is generated.
 The meaning conveyed by an input programming statement must not change
when the syntax tree is built for it.

Grammars
 A grammar is a collection of production rules used to generate a set of strings.
 A grammar is denoted by the symbol G.
 A grammar is defined as a 4-tuple G = (V, T, P, S)
where V - set of variables (non-terminals)
T - set of terminals
P - set of production rules
S - start symbol

Context free Grammars
 Context-free grammars are the Type 2 grammars; the languages they define are
accepted by pushdown automata.
 The productions are of the form α -> β
such that |α| = 1,
where α ∈ V and β ∈ (V ∪ T)*
Example: S -> abc
A -> bA | ε
B -> Bb

Derivations
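 The sequence of substitutions used to obtain a string is called a derivation.
 Each derivation step produces a new string from a given string by replacing
a variable with the right-hand side of one of its productions.
 A string obtained as the result of a derivation that contains only terminal
symbols is called a sentence of the grammar.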

Types of Derivations
There are two types:
1.Left most derivation
2.Right most derivation

Left most derivation
 If at each step of a derivation a production is applied to the leftmost variable,
the derivation is called a leftmost derivation (LMD).
 Example: for the grammar E -> E+E | E*E | id, derive the string id+id*id using LMD.
E => E+E
=> id+E
=> id+E*E
=> id+id*E
=> id+id*id
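The leftmost derivation above can be replayed mechanically. The following is a minimal Python sketch, not part of the original slides: the names `leftmost_step` and `leftmost_derivation` are hypothetical, and sentential forms are assumed to be lists of symbols.

```python
# Grammar from the slide: E -> E+E | E*E | id  (E is the only non-terminal)
NONTERMINALS = {"E"}

def leftmost_step(form, body):
    """Replace the leftmost non-terminal in `form` with the production body `body`."""
    for i, sym in enumerate(form):
        if sym in NONTERMINALS:
            return form[:i] + body + form[i + 1:]
    raise ValueError("no non-terminal left to expand")

def leftmost_derivation(start, bodies):
    """Apply the chosen production bodies one after another, always at the leftmost variable."""
    forms = [[start]]
    for body in bodies:
        forms.append(leftmost_step(forms[-1], body))
    return forms

steps = leftmost_derivation("E", [
    ["E", "+", "E"],   # E -> E+E
    ["id"],            # E -> id
    ["E", "*", "E"],   # E -> E*E
    ["id"],            # E -> id
    ["id"],            # E -> id
])
print(" => ".join("".join(form) for form in steps))
# prints: E => E+E => id+E => id+E*E => id+id*E => id+id*id
```

A rightmost derivation can be replayed the same way by scanning the sentential form from the right instead of from the left.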

Right most derivation
 If at each step of a derivation a production is applied to the rightmost variable,
the derivation is called a rightmost derivation (RMD).
 Example: for the grammar E -> E+E | E*E | id, derive the string id+id*id using RMD.
E => E+E
=> E+E*E
=> E+E*id
=> E+id*id
=> id+id*id

Parse Tree
 A parse tree is a hierarchical representation of how the terminals and
non-terminals of a derivation fit together.
 The start symbol of the grammar is the root of the parse tree.
 Each interior node is a non-terminal, and its children, read left to right, form
the right-hand side of the production applied to it.
 Each leaf is a terminal (or ε).
 Reading the leaves from left to right gives the original input string.

Example
 Construct a parse tree for the CFG given below to derive the string
“acbd”.
S -> AB
A -> c | aA
B -> d | bB

Left most & Right most derivation trees

Examples
1. Consider the grammar given below:
E -> E+E | E*E | E-E | E/E | a | b
Find the LMD and RMD that obtain the string a+b*a+b.
2. Consider the grammar given below:
S -> 0A | 1B | 0 | 1
A -> 0S | 1B | 1
B -> 0A | 1S
Construct the LMD and parse tree for the following sentences:
i) 0101 ii) 1100101

Ambiguous Grammars
 A grammar G is said to be ambiguous if some string has two or more leftmost
derivations or two or more rightmost derivations.
 Example: for the grammar E -> E+E | E*E | id, derive the string id+id*id using LMD.
 LMD 1:           LMD 2:
E => E+E          E => E*E
=> id+E           => E+E*E
=> id+E*E         => id+E*E
=> id+id*E        => id+id*E
=> id+id*id       => id+id*id

 RMD 1:           RMD 2:
E => E+E          E => E*E
=> E+E*E          => E*id
=> E+E*id         => E+E*id
=> E+id*id        => E+id*id
=> id+id*id       => id+id*id
The string has two distinct derivations of the same kind, so the grammar is ambiguous.
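Ambiguity of this small grammar can also be checked by counting parse trees. The sketch below is an illustration only: `count_trees` is a hypothetical helper, and it assumes a grammar without ε-productions, so every symbol covers at least one token.

```python
from functools import lru_cache

# Grammar from the slide: E -> E+E | E*E | id
GRAMMAR = {"E": [("E", "+", "E"), ("E", "*", "E"), ("id",)]}

def count_trees(tokens, start="E"):
    """Count the distinct parse trees of `tokens` (assumes no ε-productions)."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def derive(symbol, i, j):
        # Number of parse trees rooted at `symbol` whose leaves are tokens[i:j].
        if symbol not in GRAMMAR:                       # terminal symbol
            return 1 if j - i == 1 and tokens[i] == symbol else 0
        return sum(match(body, i, j) for body in GRAMMAR[symbol])

    @lru_cache(maxsize=None)
    def match(body, i, j):
        # Number of ways the symbol sequence `body` can cover tokens[i:j].
        if not body:
            return 1 if i == j else 0
        head, rest = body[0], body[1:]
        total = 0
        # `head` takes tokens[i:k]; each remaining symbol needs at least one token.
        for k in range(i + 1, j - len(rest) + 1):
            left = derive(head, i, k)
            if left:
                total += left * match(rest, k, j)
        return total

    return derive(start, 0, len(tokens))

print(count_trees(["id", "+", "id", "*", "id"]))   # prints: 2
```

A count greater than 1 for any string shows that the grammar is ambiguous; the two trees counted here correspond to the two leftmost derivations shown above.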

Practice problems
1. Show that the following grammar is ambiguous by considering the string “abab”:
S -> aSbS
S -> bSaS
S -> ε
2. Show that the following grammar is ambiguous by considering the string “aab”:
S -> AB
B -> ab
A -> aa
A -> a
B -> b

Classification of Parsing techniques

 Parsing techniques work on the following principles:
1. The parser scans the input string from left to right and constructs either a
leftmost or a rightmost derivation.
2. The parser uses the production rules to choose the appropriate
derivation step.
 Different parsing techniques select the rules for a derivation in different
ways; in every case a parse tree is finally constructed.

Top down Parsing
 When the parse tree is constructed from the root and expanded towards the leaves, the
parser is called a top-down parser. The tree is generated from top to bottom.
 The derivation terminates when the required input string has been derived.
 Top-down parsing can be viewed as finding a leftmost derivation for an input string.
 The main task in top-down parsing is to find the appropriate production rule in order to
produce the correct input string.
 Selecting the proper rule is therefore very important. In the simplest parsers this
selection is based on trial and error.

Problems with Top down Parsing
 There are certain problems in top-down parsing. To implement the
parser we need to eliminate them.
1. Backtracking
2. Left recursion
3. Left factoring (alternatives that share a common prefix)
4. Ambiguity

Backtracking
 Backtracking is a technique in which, to expand a non-terminal symbol,
we choose one alternative and, if a mismatch occurs, try the next
alternative, if any.
 If a non-terminal has multiple production rules beginning with the
same input symbol, then to get the correct derivation we may need to try all of these
alternatives.
 In backtracking we need to move some levels upward in the tree in order to check
other possibilities. This increases the overhead of the parser.
 It is therefore desirable to eliminate backtracking by modifying the grammar.
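The trial-and-error behaviour can be sketched in a few lines of Python. This is only an illustration: the grammar S -> cAd, A -> ab | a is an assumed textbook-style example (it is not taken from these slides), and `expand` and `accepts` are hypothetical names.

```python
# Assumed example grammar: S -> cAd, A -> ab | a
GRAMMAR = {
    "S": [["c", "A", "d"]],
    "A": [["a", "b"], ["a"]],
}

def expand(symbols, text, pos):
    """Yield every input position reachable after matching `symbols` starting at `pos`."""
    if not symbols:
        yield pos
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        # Non-terminal: try each alternative in order. When one fails to lead
        # to a full match, the loop simply moves on to the next (backtracking).
        for body in GRAMMAR[head]:
            yield from expand(body + rest, text, pos)
    elif pos < len(text) and text[pos] == head:
        # Terminal that matches the current input symbol: advance the input.
        yield from expand(rest, text, pos + 1)
    # Terminal mismatch: this branch yields nothing, so the caller backtracks.

def accepts(text, start="S"):
    """True if some sequence of choices consumes the whole input."""
    return any(end == len(text) for end in expand([start], text, 0))

print(accepts("cad"))    # prints: True  (A -> ab fails on 'd', then A -> a succeeds)
print(accepts("cabd"))   # prints: True
print(accepts("cab"))    # prints: False
```

Note that this sketch would recurse forever on a left recursive grammar, which is one reason left recursion has to be removed before top-down parsing.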

Left Recursion
 A production has left recursion when its L.H.S is equal to the first symbol of
its R.H.S, i.e., A -> Aα | β. A grammar with such a production is called a left
recursive grammar. A top-down parser can loop forever on such a grammar.
 To eliminate left recursion, rewrite the grammar as
A -> βA'
A' -> αA' | ε
where A' is a new non-terminal.
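The rewriting rule for immediate left recursion can be sketched as a small function. This is a minimal sketch under stated assumptions: production bodies are lists of symbols, the empty list stands for ε, the new non-terminal is named by appending a prime to the original name, and `eliminate_left_recursion` is a hypothetical helper name.

```python
def eliminate_left_recursion(head, bodies):
    """Rewrite A -> A a1 | ... | b1 | ... as A -> b A' and A' -> a A' | ε."""
    new_head = head + "'"
    # α parts: bodies that start with A itself, with that leading A removed.
    recursive = [body[1:] for body in bodies if body and body[0] == head]
    # β parts: bodies that do not start with A.
    others = [body for body in bodies if not body or body[0] != head]
    if not recursive:
        return {head: bodies}              # nothing to do
    return {
        head: [beta + [new_head] for beta in others],
        new_head: [alpha + [new_head] for alpha in recursive] + [[]],  # [] is ε
    }

# E -> E + T | T   becomes   E -> T E',  E' -> + T E' | ε
print(eliminate_left_recursion("E", [["E", "+", "T"], ["T"]]))
```

The sketch handles only immediate left recursion in the productions of a single non-terminal; indirect left recursion needs an extra substitution step that is not shown here.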

Left factoring
 Left factoring makes a grammar suitable for predictive parsing. It
is used when it is not clear which of two alternatives should be
used to expand a non-terminal, because both begin with the same prefix α:
A -> αβ1 | αβ2
 To left factor the grammar we rewrite it as
A -> αA'
A' -> β1 | β2
where A' is a new non-terminal.
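One round of left factoring can be sketched as follows. The helper names `common_prefix` and `left_factor` are hypothetical, bodies are assumed to be lists of symbols with the empty list standing for ε, and the dangling-else style grammar in the usage line is an assumed example rather than one from the slides.

```python
def common_prefix(a, b):
    """Longest common prefix of two symbol lists."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return a[:n]

def left_factor(head, bodies):
    """One round: A -> α β1 | α β2 | γ  becomes  A -> α A' | γ  and  A' -> β1 | β2."""
    # Find the longest prefix shared by at least two alternatives.
    best = []
    for i in range(len(bodies)):
        for j in range(i + 1, len(bodies)):
            prefix = common_prefix(bodies[i], bodies[j])
            if len(prefix) > len(best):
                best = prefix
    if not best:
        return {head: bodies}              # already left factored
    new_head = head + "'"
    n = len(best)
    factored = [body[n:] for body in bodies if body[:n] == best]   # β parts, [] is ε
    rest = [body for body in bodies if body[:n] != best]           # γ parts
    return {head: [best + [new_head]] + rest, new_head: factored}

# S -> iEtS | iEtSeS | a   becomes   S -> iEtSS' | a,  S' -> ε | eS
print(left_factor("S", [["i", "E", "t", "S"],
                        ["i", "E", "t", "S", "e", "S"],
                        ["a"]]))
```

If the new non-terminal still has alternatives with a common prefix, the function can be applied again to its productions.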

Ambiguity
 A grammar G is said to be ambiguous if there exist two or more derivation trees (equivalently, two or
more leftmost or two or more rightmost derivations) for some input string.
 An ambiguous grammar is not suitable for compiler construction. No general method can
automatically detect and remove ambiguity, but ambiguity can often be removed by rewriting the
grammar.
 Example:
S -> aSb | SS
S -> ε
For the string aabb, the above grammar generates more than one parse tree.

Removing Ambiguity by Precedence & Associativity Rules
 An ambiguous grammar may be converted into an unambiguous grammar by
implementing
1) Precedence constraints
2) Associativity constraints

Precedence Constraints
The precedence constraint is implemented using the following rules-
 The level at which the production is present defines the priority of the
operator contained in it.
 The higher the level of the production, the lower the priority of operator.
 The lower the level of the production, the higher the priority of operator.

Associativity Constraints
The associativity constraint is implemented using the following rules:
 If the operator is left associative, use left recursion in its production.
 If the operator is right associative, use right recursion in its production.

Example
 Convert the following ambiguous grammar into an unambiguous grammar:
R → R + R | R . R | R* | a | b
where * is Kleene closure and . is concatenation.
Solution:
To convert the given grammar into its corresponding unambiguous grammar, we
implement the precedence and associativity constraints.
The given grammar has the operators +, . and *
and the operands a and b.

 The priority order is (a, b) > * > . > +
 where
the . operator is left associative
the + operator is left associative
 Using the precedence and associativity rules, we write the corresponding
unambiguous grammar as
E → E + T | T
T → T . F | F
F → F* | G
G → a | b

OR
Unambiguous Grammar
E → E + T | T
T → T . F | F
F → F* | a | b

 There are two types of top-down parsing:
1. Backtracking
2. Predictive parsing
 Predictive parsers are of two types:
1. Recursive descent parser
2. LL(1) parser

Recursive Descent Parser
 A parser that uses a collection of recursive procedures to parse the given input
string is called a “recursive descent parser”.
 The CFG is used to build the recursive routines: there is one procedure per non-terminal.
 The R.H.S of each production rule is directly converted into program code.

Procedure
 If the next symbol in the R.H.S is a non-terminal, then a call to the procedure corresponding to that
non-terminal is made.
 If the next symbol in the R.H.S is a terminal, then it is matched with the lookahead symbol from the input.
The lookahead pointer is advanced when the symbol matches.
 If a non-terminal has several alternatives, then all of these alternatives have to be
combined into a single procedure body.
 The parser is activated by calling the procedure corresponding to the start symbol.
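The procedure above can be sketched for a small expression grammar. The grammar E -> T E', E' -> + T E' | ε, T -> F T', T' -> * F T' | ε, F -> ( E ) | id is an assumed example (already free of left recursion), and the class and method names are hypothetical.

```python
class ParseError(Exception):
    pass

class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # $ marks the end of the input
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, terminal):
        # Terminal in the R.H.S: compare with the lookahead, then advance.
        if self.lookahead != terminal:
            raise ParseError(f"expected {terminal!r}, found {self.lookahead!r}")
        self.pos += 1

    def E(self):                 # E -> T E'
        self.T()
        self.E_prime()

    def E_prime(self):           # E' -> + T E' | ε
        if self.lookahead == "+":
            self.match("+")
            self.T()
            self.E_prime()
        # any other lookahead: take the ε alternative and return

    def T(self):                 # T -> F T'
        self.F()
        self.T_prime()

    def T_prime(self):           # T' -> * F T' | ε
        if self.lookahead == "*":
            self.match("*")
            self.F()
            self.T_prime()

    def F(self):                 # F -> ( E ) | id
        if self.lookahead == "(":
            self.match("(")
            self.E()
            self.match(")")
        elif self.lookahead == "id":
            self.match("id")
        else:
            raise ParseError(f"unexpected {self.lookahead!r}")

    def parse(self):
        self.E()                 # the procedure for the start symbol activates the parser
        self.match("$")          # all input must be consumed

def accepts(tokens):
    try:
        Parser(tokens).parse()
        return True
    except ParseError:
        return False

print(accepts(["id", "+", "id", "*", "id"]))   # prints: True
print(accepts(["id", "+"]))                    # prints: False
```

Each procedure combines all alternatives of its non-terminal in one body and picks one by looking at a single lookahead token, so no backtracking is needed for this grammar.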

Advantages & Limitations of Recursive Descent Parser
Advantages
 Recursive descent parsers are simple to build.
 They can be constructed with the help of the parse tree.
Limitations
 They are not very efficient compared with other parsing techniques.
 The parser may enter an infinite loop for some input, for example when the
grammar is left recursive.
 It is hard to produce good error messages.
 It is difficult to parse the string if the required lookahead is arbitrarily long.

LL(1) Parser

 LL(1):
- The first L means the input is scanned from left to right.
- The second L means the parser produces a leftmost derivation of the input string.
- The number 1 means the parser uses only one input symbol of lookahead to
predict the parsing action.
 The parser program works with the following three components:
 INPUT: contains the string to be parsed, with $ as its end marker.
 STACK: contains a sequence of grammar symbols, with $ as its bottom marker. Initially
the stack contains the start symbol on top of $.
 PARSING TABLE: a two-dimensional array M[A, a], where A is a non-terminal and a is a
terminal or $.

Procedure for constructing LL(1) Parser
1. Compute the FIRST and FOLLOW functions.
2. Construct the predictive parsing table using the FIRST and FOLLOW functions.
3. Parse the input string with the help of the predictive parsing table.

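Rules used to compute FIRST function
 If a is a terminal symbol, then FIRST(a) = {a}.
 If there is a rule X -> ε, then ε is in FIRST(X).
 For the rule A -> X1 X2 X3 ... Xk:
everything in FIRST(X1) except ε is in FIRST(A);
if ε is in FIRST(X1), everything in FIRST(X2) except ε is also in FIRST(A), and so on;
if ε is in FIRST(Xi) for every i = 1..k, then ε is in FIRST(A).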
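The FIRST rules are usually applied repeatedly until no set changes. The following is a minimal Python sketch of that fixed-point computation; the expression grammar is an assumed example, and `first_of_string` and `compute_first` are hypothetical names.

```python
EPS = "ε"

# Assumed example grammar; the empty list [] stands for an ε-production.
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}

def first_of_string(symbols, first):
    """FIRST of the string X1 X2 ... Xk, using the FIRST sets found so far."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in first else {sym}   # FIRST(a) = {a} for a terminal
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result          # Xi cannot vanish, so later symbols do not contribute
    result.add(EPS)                # every Xi can derive ε (or the string is empty)
    return result

def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                 # repeat until no FIRST set grows
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                new = first_of_string(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first

for nt, symbols in compute_first(GRAMMAR).items():
    print(nt, sorted(symbols))
```

For this grammar the sketch gives FIRST(E) = FIRST(T) = FIRST(F) = {(, id}, FIRST(E') = {+, ε} and FIRST(T') = {*, ε}.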

Rules used to compute FOLLOW function
 FOLLOW(A) is the set of terminal symbols that can appear immediately to the right
of A in some sentential form.
 FOLLOW(A) = {a | S =>* αAaβ}, where α and β are strings of grammar symbols
(terminals or non-terminals).
1. For the start symbol S, place $ in FOLLOW(S).
2. If there is a production A -> αBβ, then everything in FIRST(β) except ε is
placed in FOLLOW(B).
3. If there is a production A -> αB, or a production A -> αBβ where FIRST(β)
contains ε, then everything in FOLLOW(A) is in FOLLOW(B).
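The three FOLLOW rules can be applied in the same repeat-until-stable style. In this sketch the expression grammar is an assumed example, its FIRST sets are hard-coded (worked out by hand from the FIRST rules), and `compute_follow` is a hypothetical name.

```python
EPS, END = "ε", "$"

# Assumed example grammar; [] stands for an ε-production.
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}
# FIRST sets for the grammar above, assumed to be already computed.
FIRST = {
    "E": {"(", "id"}, "E'": {"+", EPS},
    "T": {"(", "id"}, "T'": {"*", EPS},
    "F": {"(", "id"},
}

def first_of_string(symbols):
    """FIRST of a string of grammar symbols (ε included if the string can vanish)."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result

def compute_follow(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                          # rule 1
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():        # production head -> body
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in grammar:
                        continue                    # FOLLOW is defined for non-terminals only
                    beta_first = first_of_string(body[i + 1:])
                    new = beta_first - {EPS}        # rule 2
                    if EPS in beta_first:           # rule 3: β is empty or can derive ε
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

for nt, symbols in compute_follow(GRAMMAR, "E").items():
    print(nt, sorted(symbols))
```

For this grammar the sketch gives FOLLOW(E) = FOLLOW(E') = {), $}, FOLLOW(T) = FOLLOW(T') = {+, ), $} and FOLLOW(F) = {+, *, ), $}.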

Algorithm for predictive parsing table:
 For each rule A -> α of the grammar G:
 For each terminal a in FIRST(α), create the entry M[A, a] = A -> α.
 If ε is in FIRST(α), create the entry M[A, b] = A -> α for each symbol b in
FOLLOW(A).
 If ε is in FIRST(α) and $ is in FOLLOW(A), also create the entry
M[A, $] = A -> α.
 All the remaining entries in the table M are marked as SYNTAX ERROR.
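The table construction and the table-driven parse can be sketched together. The expression grammar, its hard-coded FIRST and FOLLOW sets (worked out by hand), and the names `build_table` and `parse` are all assumptions made for this illustration.

```python
EPS, END = "ε", "$"

# Assumed example grammar; [] stands for an ε-production.
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}
# FIRST and FOLLOW sets for this grammar, assumed to be already computed.
FIRST = {"E": {"(", "id"}, "E'": {"+", EPS}, "T": {"(", "id"},
         "T'": {"*", EPS}, "F": {"(", "id"}}
FOLLOW = {"E": {")", END}, "E'": {")", END}, "T": {"+", ")", END},
          "T'": {"+", ")", END}, "F": {"+", "*", ")", END}}

def first_of_string(symbols):
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result

def build_table(grammar):
    table = {}
    for head, bodies in grammar.items():
        for body in bodies:
            f = first_of_string(body)
            targets = f - {EPS}                  # entries for terminals in FIRST(α)
            if EPS in f:
                targets |= FOLLOW[head]          # entries for FOLLOW(A), including $
            for a in targets:
                if (head, a) in table:
                    raise ValueError(f"not LL(1): conflict at M[{head}, {a}]")
                table[head, a] = body
    return table                                 # missing entries mean SYNTAX ERROR

def parse(tokens, start="E"):
    table = build_table(GRAMMAR)
    stack = [END, start]                         # start symbol on top of $
    tokens = list(tokens) + [END]
    pos = 0
    while stack:
        top, a = stack.pop(), tokens[pos]
        if top in GRAMMAR:
            body = table.get((top, a))
            if body is None:
                return False                     # blank table entry
            stack.extend(reversed(body))         # leftmost body symbol ends up on top
        elif top == a:
            pos += 1                             # terminal (or $) matched
        else:
            return False
    return pos == len(tokens)

print(parse(["id", "+", "id", "*", "id"]))   # prints: True
print(parse(["id", "id"]))                   # prints: False
```

The conflict check in `build_table` is a simple way to notice that a grammar is not LL(1): two productions would compete for the same table entry.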

Bottom-up Parsers
 In bottom-up parsing, the input string is taken first and we try to reduce
it, with the help of the grammar, to the start symbol.
 The parse tree is constructed from the bottom up, that is, from the leaves to the root.
 Starting from the leaves, the leaf nodes are reduced to internal nodes, these
internal nodes are reduced further, and eventually the root node is obtained.
 In this process the parser tries to identify the R.H.S of a production rule and
replace it by the corresponding L.H.S. This activity is called reduction.
 The sentential forms produced during the reductions trace out a
rightmost derivation in reverse.

Handle Pruning
 Handle:
A handle is a substring that matches the right side of a production and whose
reduction to the non-terminal on the left side of that production is one step
along the reverse of a rightmost derivation.
 Handle pruning:
The process of detecting handles and using them in reductions is called handle
pruning.

Example
 Consider the grammar
E -> E+E | id and derive the string “id+id+id” using rightmost derivation.
E => E + E
=> E + E + E
=> E + E + id
=> E + id + id
=> id + id + id
Right sentential form    Handle    Production
id + id + id             id        E -> id
E + id + id              id        E -> id
E + E + id               id        E -> id
E + E + E                E + E     E -> E + E
E + E                    E + E     E -> E + E
E

Shift Reduce Parser
 A shift reduce parser attempts to construct the parse tree from the leaves to the root.
 A shift reduce parser requires the following data structures:
1. An input buffer storing the input string.
2. A stack holding grammar symbols, on which the R.H.S of a rule is recognized and replaced by its L.H.S.

The parser performs the following basic operations.
1. Shift: move the next symbol from the input buffer onto the stack.
2. Reduce: if a handle appears on the top of the stack, it is reduced by the
appropriate rule. That means the R.H.S of the rule is popped off and the L.H.S is pushed
onto the stack.
3. Accept: if the stack contains only the start symbol and the input buffer is empty at the
same time, the parser accepts the string.
4. Error: a situation in which the parser can neither shift nor reduce, and
cannot perform the accept action either, is called an error.

Example on SRP
(The stack is written with its bottom marker $ on the left and its top on the right.)
Stack            Input buffer     Parsing Action
$                id - id * id $   Shift
$ id             - id * id $      Reduce by E -> id
$ E              - id * id $      Shift
$ E -            id * id $        Shift
$ E - id         * id $           Reduce by E -> id
$ E - E          * id $           Shift
$ E - E *        id $             Shift
$ E - E * id     $                Reduce by E -> id
$ E - E * E      $                Reduce by E -> E * E
$ E - E          $                Reduce by E -> E - E
$ E              $                Accept
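The trace above can be reproduced with a short shift reduce loop. This is a sketch under assumptions: the grammar E -> E * E | E - E | id is inferred from the reductions in the table, the rule "shift when the next operator binds tighter than the operator in the handle" is an assumed way to choose between shifting and reducing, and `find_handle` and `shift_reduce` are hypothetical names.

```python
PRODUCTIONS = [
    ("E", ["E", "*", "E"]),
    ("E", ["E", "-", "E"]),
    ("E", ["id"]),
]
PRECEDENCE = {"-": 1, "*": 2}    # assumption: * binds tighter than -

def find_handle(stack):
    """Return the production whose R.H.S is on top of the stack, if any."""
    for head, body in PRODUCTIONS:
        if stack[-len(body):] == body:
            return head, body
    return None

def shift_reduce(tokens):
    stack, buffer, actions = ["$"], list(tokens) + ["$"], []
    while True:
        if stack == ["$", "E"] and buffer == ["$"]:
            actions.append("accept")
            return actions
        handle = find_handle(stack)
        nxt = buffer[0]
        # Prefer shifting when the next operator binds tighter than the
        # operator inside the handle (for example E - E followed by *).
        if handle and len(handle[1]) == 3 and PRECEDENCE.get(nxt, 0) > PRECEDENCE[handle[1][1]]:
            handle = None
        if handle:
            head, body = handle
            del stack[-len(body):]           # pop the R.H.S
            stack.append(head)               # push the L.H.S
            actions.append(f"reduce by {head} -> {' '.join(body)}")
        elif nxt != "$":
            stack.append(buffer.pop(0))      # shift
            actions.append("shift")
        else:
            actions.append("error")          # neither shift, reduce nor accept applies
            return actions

for action in shift_reduce(["id", "-", "id", "*", "id"]):
    print(action)
```

For the input id - id * id the printed actions follow the same order as the Parsing Action column of the table: three reductions by E -> id interleaved with five shifts, then the reductions by E -> E * E and E -> E - E, then accept.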

Operator Precedence Parser
 A grammar G is said to be an operator precedence grammar if it possesses the following properties:
1. No production has ε on its right side.
2. No production has two adjacent non-terminals
on its right hand side.
 Example: E -> EAE | (E) | -E | id
A -> + | - | / | ^ | *
This is not an operator precedence grammar,
because the production E -> EAE contains two consecutive non-terminals. We
convert it into an equivalent operator precedence grammar by
substituting the alternatives of A:
E -> E + E | E - E | E * E | E / E | E ^ E
E -> (E) | -E | id

Simple operator precedence Parser
 In operator precedence parsing we first define three disjoint precedence relations
between pairs of terminals and construct the operator precedence table.
a <. b if b has higher precedence than a
a ≐ b if b has the same precedence as a
a .> b if b has lower precedence than a
 Rules to determine precedence relations:
The precedence relations between terminals are based on the
traditional notions of associativity and precedence of operators.
1) id has higher precedence than any other symbol.
2) $ has the lowest precedence.
3) If two operators have equal precedence, then the associativity of the
operator decides the relation.

Rules for parsing the string
 Step 1: Insert
* the $ symbol at the start and at the end of the input string
* the precedence relation between every two adjacent symbols of the string, by
referring to the precedence table (for example, <. id .>)
 Step 2: Scan the string from the left until the first .> is seen and put a pointer at its location.
Then scan backward (from right to left) until a <. is seen. Everything
between the <. and the .> forms the handle. Replace the handle with
the head of the respective production.
 Step 3: Repeat Step 2 until the start symbol is reached.

Example
 Construct an operator precedence parser for the following grammar:
E -> EAE | id
A -> + | *
and parse the string id + id * id.
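One possible way to work this example is sketched below in Python. It assumes the grammar is first rewritten as E -> E + E | E * E | id (by substituting the alternatives of A), that * has higher precedence than + and both are left associative, and it uses a stack-based form of the scanning rules; `TABLE`, `top_terminal` and `parse` are hypothetical names.

```python
LT, GT = "<.", ".>"

# Assumed precedence table: TABLE[a][b] is the relation between a (on the stack)
# and b (next input symbol). Missing entries are errors.
TABLE = {
    "id": {"+": GT, "*": GT, "$": GT},
    "+":  {"id": LT, "+": GT, "*": LT, "$": GT},
    "*":  {"id": LT, "+": GT, "*": GT, "$": GT},
    "$":  {"id": LT, "+": LT, "*": LT},
}
HANDLES = {("id",), ("E", "+", "E"), ("E", "*", "E")}   # R.H.S of E -> id | E+E | E*E

def top_terminal(stack):
    """Topmost terminal on the stack (the non-terminal E is skipped)."""
    return next(sym for sym in reversed(stack) if sym != "E")

def parse(tokens):
    """Return the list of handles reduced, or None on a syntax error."""
    stack, buffer, reductions = ["$"], list(tokens) + ["$"], []
    while True:
        a, b = top_terminal(stack), buffer[0]
        if a == "$" and b == "$":
            return reductions if stack == ["$", "E"] else None
        relation = TABLE[a].get(b)
        if relation == LT:
            stack.append(buffer.pop(0))                  # shift
        elif relation == GT:
            # Pop back to the terminal that is <. the most recently popped terminal.
            handle = []
            while True:
                sym = stack.pop()
                handle.insert(0, sym)
                if sym != "E" and TABLE[top_terminal(stack)].get(sym) == LT:
                    break
            if stack[-1] == "E":                         # E just left of the handle belongs to it
                handle.insert(0, stack.pop())
            if tuple(handle) not in HANDLES:
                return None
            reductions.append(" ".join(handle))
            stack.append("E")                            # replace the handle by the head
        else:
            return None                                  # no relation: error

print(parse(["id", "+", "id", "*", "id"]))
# prints: ['id', 'id', 'id', 'E * E', 'E + E']
```

The order of the reductions shows that E * E is reduced before E + E, which is what the higher precedence of * requires.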

Advantages and disadvantages of simple operator precedence Parsing
Advantages:
 This type of parsing is simple to implement.
Disadvantages:
 An operator like minus has two different precedences (unary and binary). Hence it is
hard to handle tokens like the minus sign.
 This kind of parsing is applicable to only a small class of grammars.

Operator Precedence Parsing
 To construct an operator precedence parser, follow these steps:
1. Compute the leading and trailing symbols.
2. Construct the operator precedence table using the leading and trailing functions.
3. Parse the input string with the help of the operator precedence table.

Leading function rules:
Rule-1:
 If the production rule is of the form A -> YaB, i.e., its R.H.S starts with a single
non-terminal, then the terminal that follows it (a) is taken as a lead of A.
 If the R.H.S of the production starts with a terminal, that terminal is taken directly as a lead of A.
Rule-2:
 If the R.H.S of a production of A starts with a non-terminal B (for example A -> B),
then everything in the lead of B is added to the lead of A.

Trailing Function rules
Rule-1
 If the production rule is of the form A -> YaB, i.e., its R.H.S ends with a single non-terminal,
then the terminal that precedes it (a) is taken as a trail of A.
 If the R.H.S of the production ends with a terminal, that terminal is taken directly as a trail
of A.
Rule-2
 If the R.H.S of a production of A ends with a non-terminal B (for example A -> B), then
everything in the trail of B is added to the trail of A.

Example
 Consider the following grammar and find the leading and trailing functions.
E -> E+T
E -> T
T -> T*F
T -> F
F -> (E) | id
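The leading and trailing sets for this grammar can be computed by applying the two rules until nothing changes. The sketch below is an illustration only: `edge_sets` is a hypothetical helper that computes the leading sets when it reads each R.H.S from the left and the trailing sets when it reads it from the right, and it assumes an operator grammar (no ε-productions, no adjacent non-terminals).

```python
GRAMMAR = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}

def edge_sets(grammar, from_left=True):
    """Leading sets when from_left is True, trailing sets otherwise."""
    sets = {nt: set() for nt in grammar}
    changed = True
    while changed:                                   # repeat until no set grows
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                seq = body if from_left else body[::-1]
                new = set()
                if seq[0] in grammar:
                    new |= sets[seq[0]]              # Rule-2: inherit from the edge non-terminal
                    if len(seq) > 1 and seq[1] not in grammar:
                        new.add(seq[1])              # Rule-1: terminal next to that non-terminal
                else:
                    new.add(seq[0])                  # Rule-1: terminal at the edge
                if not new <= sets[head]:
                    sets[head] |= new
                    changed = True
    return sets

LEADING = edge_sets(GRAMMAR, from_left=True)
TRAILING = edge_sets(GRAMMAR, from_left=False)
for nt in GRAMMAR:
    print(nt, "leading:", sorted(LEADING[nt]), "trailing:", sorted(TRAILING[nt]))
```

For this grammar the sketch gives leading(E) = {+, *, (, id}, leading(T) = {*, (, id}, leading(F) = {(, id}, and trailing(E) = {+, *, ), id}, trailing(T) = {*, ), id}, trailing(F) = {), id}.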