1. DEFINITION OF PARSING
A parser is a compiler or interpreter
component that breaks data into smaller
elements for easy translation into another
language.
A parser takes input in the form of a
sequence of tokens or program
instructions and usually builds a data
structure in the form of a parse tree or an
abstract syntax tree.
3. In the compiler model, the parser obtains a string of tokens from the lexical
analyser and verifies that the string can be generated by the grammar for the
source language.
The parser reports any syntax errors in the source program.
It collects a sufficient number of tokens and builds a parse tree.
5. There are basically two types of parsers:
Top-down parser:
starts at the root of derivation tree and fills in
picks a production and tries to match the input
may require backtracking
some grammars are backtrack-free (predictive)
Bottom-up parser:
starts at the leaves and fills in
starts in a state valid for legal first tokens
uses a stack to store both state and sentential forms
6. TOP DOWN PARSING
A top-down parser starts with the root of the parse tree, labeled with the start or
goal symbol of the grammar.
To build a parse, it repeats the following steps until the fringe of the parse tree
matches the input string:
STEP 1: At a node labeled A, select a production A → α and construct the appropriate
child for each symbol of α.
STEP 2: When a terminal is added to the fringe that doesn't match the input string,
backtrack.
STEP 3: Find the next node to be expanded.
The key is selecting the right production in step 1.
7. EXAMPLE FOR TOP DOWN PARSING
Suppose the given production rules are as follows:
S-> aAd|aB
A-> b|c
B->ccd
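The top-down procedure of the previous slide can be sketched as a small backtracking recognizer for exactly this grammar (a Python sketch; the dictionary encoding of productions and the function names are my own):

```python
# Backtracking top-down recognizer for the slide's grammar:
#   S -> aAd | aB,  A -> b | c,  B -> ccd
# Terminals are single characters; non-terminals are dictionary keys.
GRAMMAR = {
    "S": [["a", "A", "d"], ["a", "B"]],
    "A": [["b"], ["c"]],
    "B": [["c", "c", "d"]],
}

def match(symbols, s, pos):
    """Try to derive s[pos:] from the symbol list; yield possible end positions."""
    if not symbols:
        yield pos
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:                      # non-terminal: try each alternative
        for alt in GRAMMAR[head]:
            for mid in match(alt, s, pos):   # backtracking happens here
                yield from match(rest, s, mid)
    elif pos < len(s) and s[pos] == head:    # terminal: must match the input
        yield from match(rest, s, pos + 1)

def parse(s):
    return any(end == len(s) for end in match(["S"], s, 0))
```

parse("accd") succeeds through S -> aB, while parse("acd") needs the S -> aAd alternative; trying one alternative and falling back to the next is the backtracking described on slide 8.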
8. PROBLEMS WITH TOPDOWN
PARSING
1) BACKTRACKING
Backtracking is a technique in which, for the expansion of a non-terminal
symbol, we choose one alternative; if a mismatch occurs, we then try
another alternative, if any.
If for a non-terminal there are multiple production rules beginning
with the same input symbol then to get the correct derivation we
need to try all these alternatives.
10. 2) LEFT RECURSION
Left recursion is a case where the left-most non-terminal in a production of a non-
terminal is that non-terminal itself (direct left recursion) or, through some other
non-terminal definitions, rewrites to that non-terminal again (indirect left
recursion). Consider these examples -
(1) A -> Aq (direct)
(2) A -> Bq
B -> Ar (indirect)
Left recursion has to be removed if the parser performs top-down parsing
11. REMOVING LEFT RECURSION
To eliminate left recursion we need to modify the grammar. Let G be a
grammar having a production rule with left recursion:
A -> Aa
A -> B
We eliminate the left recursion by rewriting the production rules as:
A -> BA'
A' -> aA'
A' -> ε
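This rewriting can be done mechanically. Below is a sketch for the direct case (Python; the encoding of productions as symbol lists and the 'eps' marker for ε are my own conventions):

```python
# Sketch of direct left-recursion elimination for one non-terminal.
# A -> A a1 | ... | A am | b1 | ... | bn    becomes
# A -> b1 A' | ... | bn A'
# A' -> a1 A' | ... | am A' | eps
def remove_direct_left_recursion(nt, productions):
    recursive = [p[1:] for p in productions if p and p[0] == nt]   # the a_i tails
    rest = [p for p in productions if not p or p[0] != nt]         # the b_j bodies
    if not recursive:
        return {nt: productions}        # nothing to do
    new_nt = nt + "'"                   # fresh non-terminal A'
    return {
        nt: [b + [new_nt] for b in rest],
        new_nt: [a + [new_nt] for a in recursive] + [["eps"]],
    }
```

For the slide's grammar, A -> Aa | B becomes A -> BA' with A' -> aA' | eps.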
12. 3) LEFT FACTORING
Left factoring is removing the common left factor that appears in two productions
of the same non-terminal. It is done to avoid backtracking by the parser. Suppose
the parser has a one-symbol look-ahead; consider this example -
A -> qB | qC
where A, B, C are non-terminals and q is a string of terminals. In this case, the parser will be
confused as to which of the two productions to choose and it might have to
backtrack. After left factoring, the grammar is converted to-
A -> qD
D -> B | C
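One step of this transformation can be sketched as follows (Python; productions are lists of symbols, 'eps' stands for ε, and the fresh non-terminal name nt + "'" is my own convention, playing the role of the slide's D):

```python
# Sketch of one left-factoring step (single-symbol common prefixes, as on the slide).
from collections import defaultdict

def left_factor(nt, productions):
    """A -> q B | q C   becomes   A -> q A' ; A' -> B | C."""
    by_first = defaultdict(list)
    for p in productions:
        by_first[p[0]].append(p)        # group alternatives by first symbol
    result = {nt: []}
    for first_sym, group in by_first.items():
        if len(group) == 1:
            result[nt].append(group[0])          # unique prefix: keep as-is
        else:
            new_nt = nt + "'"                    # fresh non-terminal (the slide's D)
            result[nt].append([first_sym, new_nt])
            result[new_nt] = [p[1:] if len(p) > 1 else ["eps"] for p in group]
    return result
```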
13. RECURSIVE DESCENT PARSING
A recursive descent parser is a kind of top-down parser built from a
set of mutually recursive procedures (or a non-recursive equivalent)
where each such procedure usually implements one of the
productions of the grammar.
8. EXAMPLE OF RECURSIVE DESCENT PARSING
Suppose the given grammar is as follows:
E -> iE'
E' -> +iE' | ε
Program (l holds the current lookahead symbol, obtained with l = getchar()):
E()
{
  if (l == 'i') {
    match('i');
    Eprime();      /* E'() */
  }
}
Eprime()
{
  if (l == '+') {
    match('+');
    match('i');
    Eprime();
  }
  /* otherwise E' -> ε : return without consuming input */
}
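A runnable version of this pseudocode might look like the following (a Python sketch; the lookahead handling mirrors the slide's l variable and match() helper):

```python
# Recursive-descent parser for  E -> iE',  E' -> +iE' | eps
# i.e. the language i, i+i, i+i+i, ...
def parse(s):
    pos = 0
    def look():                       # current lookahead, '$' at end of input
        return s[pos] if pos < len(s) else '$'
    def match(t):                     # consume one expected terminal
        nonlocal pos
        if look() != t:
            raise SyntaxError(f"expected {t!r} at position {pos}")
        pos += 1
    def E():                          # E  -> i E'
        match('i')
        E_prime()
    def E_prime():                    # E' -> + i E' | eps
        if look() == '+':
            match('+')
            match('i')
            E_prime()
        # else: the eps alternative, consume nothing
    E()
    return pos == len(s)              # accept only if all input is consumed
```

parse returns True only when the whole input is derived; on a token mismatch it raises SyntaxError, standing in for an error report.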
17. PREDICTIVE LL(1) PARSING
The first “L” in LL(1) refers to the fact that the input is processed from
left to right.
The second “L” refers to the fact that LL(1) parsing determines a
leftmost derivation for the input string.
The “1” in parentheses implies that LL(1) parsing uses only one
symbol of input to predict the next grammar rule that should be
used.
The data structures used by LL(1) are 1. Input buffer 2. Stack
3. Parsing table
18. The construction of a predictive LL(1) parser is based
on two very important functions: First and Follow.
To construct a predictive LL(1) parser we follow these steps:
STEP 1: compute the FIRST and FOLLOW functions.
STEP 2: construct the predictive parsing table using the FIRST and
FOLLOW functions.
STEP 3: parse the input string with the help of the predictive
parsing table
19. FIRST
If X is a terminal then FIRST(X) is just {X}.
If there is a production X → ε then add ε to FIRST(X).
If there is a production X → Y1Y2..Yk then add
FIRST(Y1Y2..Yk) to FIRST(X).
FIRST(Y1Y2..Yk) is either
FIRST(Y1) (if FIRST(Y1) doesn't contain ε)
OR (if FIRST(Y1) does contain ε) everything in
FIRST(Y1) <except for ε> as well as everything in
FIRST(Y2..Yk).
If FIRST(Y1), FIRST(Y2), .., FIRST(Yk) all contain ε then add ε to
FIRST(Y1Y2..Yk) as well.
20. FOLLOW
First, put $ (the end-of-input marker) in FOLLOW(S) (S is
the start symbol).
If there is a production A → αBβ (where α and β can be
whole strings), then everything in FIRST(β) except
for ε is placed in FOLLOW(B).
If there is a production A → αB, then everything in
FOLLOW(A) is in FOLLOW(B).
If there is a production A → αBβ, where FIRST(β)
contains ε, then everything in FOLLOW(A) is in
FOLLOW(B).
21. EXAMPLE OF FIRST AND FOLLOW
The Grammar
E → TE'
E' → +TE'
E' → ε
T → FT'
T' → *FT'
T' → ε
F → (E)
F → id
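The FIRST and FOLLOW sets for this grammar can be computed by iterating the rules of the previous two slides to a fixed point. Below is a sketch (Python; the production encoding and the 'eps' marker for ε are my own conventions):

```python
# FIRST/FOLLOW computation for the slide's expression grammar.
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], ["eps"]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], ["eps"]],
    "F":  [["(", "E", ")"], ["id"]],
}
NONTERMS = set(GRAMMAR)

def first_of(seq, first):
    """FIRST of a sequence of grammar symbols (empty sequence gives {eps})."""
    out = set()
    for sym in seq:
        f = first[sym] if sym in NONTERMS else {sym}
        out |= f - {"eps"}
        if "eps" not in f:
            return out
    out.add("eps")                  # every symbol in seq can derive eps
    return out

def compute_first_follow(start="E"):
    first = {nt: set() for nt in NONTERMS}
    follow = {nt: set() for nt in NONTERMS}
    follow[start].add("$")          # rule 1 of FOLLOW
    changed = True
    while changed:                  # iterate until nothing changes
        changed = False
        for nt, prods in GRAMMAR.items():
            for p in prods:
                body = [s for s in p if s != "eps"]
                f = first_of(body, first)
                if not f <= first[nt]:
                    first[nt] |= f
                    changed = True
                trailer = set(follow[nt])   # FOLLOW propagation, right to left
                for sym in reversed(body):
                    if sym in NONTERMS:
                        if not trailer <= follow[sym]:
                            follow[sym] |= trailer
                            changed = True
                        fs = first[sym]
                        if "eps" in fs:
                            trailer = trailer | (fs - {"eps"})
                        else:
                            trailer = fs - {"eps"}
                    else:
                        trailer = {sym}
    return first, follow
```

This reproduces the standard sets, e.g. FIRST(E) = { (, id } and FOLLOW(E) = { ), $ }.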
22. PROPERTIES OF LL(1) GRAMMARS
1. No left-recursive grammar is LL(1)
2. No ambiguous grammar is LL(1)
3. Some languages have no LL(1) grammar
4. An ε-free grammar where each alternative expansion for A begins with
a distinct terminal is a simple LL(1) grammar.
Example:
S → aS | a
is not LL(1) because FIRST(aS) = FIRST(a) = { a }
S → aS´
S´ → aS | ε
accepts the same language and is LL(1)
23. PREDICTIVE PARSING TABLE
Method:
1. ∀ production A → α:
a) ∀a ∈ FIRST(α), add A → α to M[A, a]
b) If ε ∈ FIRST(α):
I. ∀b ∈ FOLLOW(A), add A → α to M[A, b]
II. If $ ∈ FOLLOW(A), add A → α to M[A, $]
2. Set each undefined entry of M to error.
If ∃ M[A, a] with multiple entries then G is not LL(1).
24. EXAMPLE OF PREDICTIVE PARSING
LL(1) TABLE
The given grammar is as follows
S → E
E → TE´
E´ → +E | -E | ε
T → FT´
T´ → * T | / T | ε
F → num | id
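A table-driven predictive parser is then a loop over an explicit stack. The sketch below (Python) uses a parsing table filled in by hand for the simpler expression grammar of slide 21 rather than the one above; tokens are pre-split strings, 'id' is a single token, and '$' is the end marker (all my own encoding choices):

```python
# Table-driven LL(1) parser for  E -> TE', E' -> +TE' | eps,
# T -> FT', T' -> *FT' | eps, F -> (E) | id.
TABLE = {
    ("E", "id"): ["T", "E'"],      ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"],      ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"],           ("F", "("): ["(", "E", ")"],
}
NONTERMS = {"E", "E'", "T", "T'", "F"}

def ll1_parse(tokens):
    tokens = tokens + ["$"]
    stack = ["$", "E"]                 # start symbol on top of the stack
    i = 0
    while stack:
        top = stack.pop()
        a = tokens[i]
        if top in NONTERMS:
            body = TABLE.get((top, a))
            if body is None:
                return False           # undefined entry: syntax error
            stack.extend(reversed(body))
        elif top == a:
            i += 1                     # match terminal (and the final $)
        else:
            return False
    return i == len(tokens)
```

An empty body ([]) is an ε-production; a missing entry such as (E, +) is an error entry.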
25. BOTTOM UP PARSING
Bottom-up parsing starts from the leaf nodes of a tree and
works in the upward direction until it reaches the root node.
We start from a sentence and then apply production rules in
reverse in order to reach the start symbol.
Here, the parser tries to identify the R.H.S. of a production rule and
replace it with the corresponding L.H.S. This activity is known as
reduction.
It is also known as an LR parser, where L means tokens are read
from left to right and R means that it constructs a rightmost
derivation (in reverse).
26. EXAMPLE OF BOTTOM-UP PARSER
E → T + E | T
T → int * T | int | (E)
Consider the string: int * int + int
int * int + int T → int
int * T + int T → int * T
T + int T → int
T + T E → T
T + E E → T + E
E
27. SHIFT REDUCE PARSING
Bottom-up parsing uses two kinds of actions: 1. Shift 2. Reduce
Shift: move | one place to the right; this shifts a terminal onto the left
string:
ABC|xyz ⇒ ABCx|yz
Reduce: apply an inverse production at the right end of the left
string. If A → xy is a production, then:
Cbxy|ijk ⇒ CbA|ijk
28. EXAMPLE OF SHIFT REDUCE PARSING
|int * int + int shift
int | * int + int shift
int * | int + int shift
int * int | + int reduce T → int
int * T | + int reduce T → int * T
T | + int shift
T + | int shift
T + int | reduce T → int
T + T | reduce E → T
T + E | reduce E → T + E
E |
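The shift and reduce moves of this trace can be replayed mechanically. In the sketch below (Python), the action sequence is simply the one chosen on the slide; a real LR parser would pick these actions via a parsing table:

```python
# Replay a shift-reduce trace: 'shift' moves one token across the |,
# a reduce replaces the right end of the left string by the production's LHS.
def replay(tokens, actions):
    left, right = [], list(tokens)
    trace = []
    for act in actions:
        if act == "shift":
            left.append(right.pop(0))
        else:                                        # act = (lhs, body)
            lhs, body = act
            assert left[-len(body):] == body, "handle not at right end"
            del left[-len(body):]
            left.append(lhs)
        trace.append(" ".join(left) + " | " + " ".join(right))
    return trace

# The slide's action sequence for  int * int + int :
ACTIONS = [
    "shift", "shift", "shift",
    ("T", ["int"]),
    ("T", ["int", "*", "T"]),
    "shift", "shift",
    ("T", ["int"]),
    ("E", ["T"]),
    ("E", ["T", "+", "E"]),
]
```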
29. OPERATOR PRECEDENCE PARSING
Operator grammars have the property that no production right side
is empty or has two adjacent nonterminals.
This property enables the implementation of efficient operator-
precedence parsers.
These parsers rely on the following three precedence relations:
Relation Meaning
a <· b a yields precedence to b
a =· b a has the same precedence as b
a ·> b a takes precedence over b
30. These operator precedence relations allow us to delimit the handles in the right
sentential forms: <· marks the left end, =· appears in
the interior of the handle, and ·> marks the right end.
Suppose that $ is the end of the string. Then for all terminals b we can write:
$ <· b and b ·> $
If we remove all nonterminals and place the correct precedence relation (<·, =·,
·>) between the remaining terminals, there remain strings that can be analyzed
by an easily developed parser.
31. EXAMPLE OF OPERATOR PRECEDENCE PARSING
For example, the following operator precedence relations can
be introduced for simple expressions (rows: left symbol, columns: right symbol):

      id    +    *    $
id          ·>   ·>   ·>
+     <·    ·>   <·   ·>
*     <·    ·>   ·>   ·>
$     <·    <·   <·

Example: the input string
id1 + id2 * id3
after inserting precedence relations becomes
$ <· id1 ·> + <· id2 ·> * <· id3 ·> $
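Inserting the relations can be sketched as a single pass over the token string (Python; the token-list form and the RELATION dictionary transcribing the table above are my own):

```python
# Insert the <. / .> relations of the slide's table between adjacent terminals.
RELATION = {
    ("id", "+"): "·>", ("id", "*"): "·>", ("id", "$"): "·>",
    ("+", "id"): "<·", ("+", "+"): "·>", ("+", "*"): "<·", ("+", "$"): "·>",
    ("*", "id"): "<·", ("*", "+"): "·>", ("*", "*"): "·>", ("*", "$"): "·>",
    ("$", "id"): "<·", ("$", "+"): "<·", ("$", "*"): "<·",
}

def insert_relations(tokens):
    seq = ["$"] + tokens + ["$"]           # bracket the input with end markers
    out = []
    for left, right in zip(seq, seq[1:]):
        out.append(left)
        out.append(RELATION[(left, right)])
    out.append("$")
    return " ".join(out)
```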
32. UNIT-III
Syntax Directed Translations
We can associate information with a language construct by
attaching attributes to the grammar symbols.
A syntax directed definition specifies the values of attributes
by associating semantic rules with the grammar productions.
Production Semantic Rule
E->E1+T E.code=E1.code||T.code||’+’
• We may alternatively insert the semantic actions inside the grammar
E -> E1+T {print ‘+’}
33. Syntax Directed Definitions
1. We associate information with the programming language
constructs by attaching attributes to grammar symbols.
2. Values of these attributes are evaluated by the semantic rules
associated with the production rules.
3. Evaluation of these semantic rules:
may generate intermediate codes
may put information into the symbol table
may perform type checking, may issue error messages
may perform some other activities
in fact, they may perform almost any activities.
4. An attribute may hold almost anything:
a string, a number, a memory location, a complex record.
34. Syntax-Directed Definitions and Translation Schemes
1. When we associate semantic rules with productions, we use two
notations:
Syntax-Directed Definitions
Translation Schemes
A. Syntax-Directed Definitions:
give high-level specifications for translations
hide many implementation details such as order of evaluation of
semantic actions.
We associate a production rule with a set of semantic actions, and we
do not say when they will be evaluated.
B. Translation Schemes:
indicate the order of evaluation of semantic actions associated with a
production rule.
In other words, translation schemes give a little bit of information about
implementation details.
35. Syntax-Directed Translation
Conceptually with both the syntax directed translation and translation
scheme we
Parse the input token stream
Build the parse tree
Traverse the tree to evaluate the semantic rules at the parse tree
nodes.
Input string → parse tree → dependency graph → evaluation order for semantic rules
Conceptual view of syntax-directed translation
36. Syntax-Directed Definitions
1. A syntax-directed definition is a generalization of a context-free grammar in
which:
Each grammar symbol is associated with a set of attributes.
This set of attributes for a grammar symbol is partitioned into two subsets
called
synthesized and
inherited attributes of that grammar symbol.
Each production rule is associated with a set of semantic rules.
2. The value of an attribute at a parse tree node is defined by the semantic rule
associated with the production at that node.
3. The value of a synthesized attribute at a node is computed from the values of
attributes at the children of that node in the parse tree.
4. The value of an inherited attribute at a node is computed from the values of
attributes at the siblings and parent of that node in the parse tree.
37. Syntax-Directed Definitions
Examples:
Synthesized attribute : E→E1+E2 { E.val =E1.val + E2.val}
Inherited attribute :A→XYZ {Y.val = 2 * A.val}
1. Semantic rules set up dependencies between attributes which can be
represented by a dependency graph.
2. This dependency graph determines the evaluation order of these
semantic rules.
3. Evaluation of a semantic rule defines the value of an attribute. But a
semantic rule may also have some side effects such as printing a value.
38. Syntax Trees
Syntax-Tree
an intermediate representation of the compiler’s input.
A condensed form of the parse tree.
Syntax tree shows the syntactic structure of the program while
omitting irrelevant details.
Operators and keywords are associated with the interior
nodes.
Chains of simple productions are collapsed.
Syntax directed translation can be based on syntax tree as well as parse tree.
39. Syntax Tree - Examples
Expression: 5 + 3 * 4

      +
     / \
    5   *
       / \
      3   4

Leaves: identifiers or constants.
Internal nodes: labelled with operations.
Children of a node are its operands.

Statement: if B then S1 else S2

    if-then-else
     /    |    \
    B     S1    S2

A node's label indicates what kind of statement it is.
Children of a node correspond to the components of the statement.
40. Intermediate representation and code generation
Two possibilities:
1. ..... → semantic routines → code generation → machine code
(+) no extra pass for code generation
(+) allows simple 1-pass compilation
2. ..... → semantic routines → IR → code generation → machine code
(+) allows higher-level operations, e.g. open block, call procedures
(+) better optimization because IR is at a higher level
(+) machine dependence is isolated in code generation
42. Intermediate code
1. postfix form
Example
a+b ab+
(a+b)*c ab+c*
a+b*c abc*+
a:=b*c+b*d abc*bd*+:=
(+) simple and concise
(+) good for driving an interpreter
(-) not good for optimization or code generation
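The postfix forms above can be produced with the classic shunting-yard method. Below is a sketch (Python; operands are single letters and ':=' is treated as the lowest-precedence operator, both assumptions of this sketch):

```python
# Infix-to-postfix conversion (shunting yard) for the slide's examples.
PREC = {":=": 0, "+": 1, "-": 1, "*": 2, "/": 2}

def to_postfix(expr):
    # crude tokenizer: put spaces around operators and parentheses
    tokens = expr.replace(":=", " := ").replace("(", " ( ").replace(")", " ) ")
    for op in "+-*/":
        tokens = tokens.replace(op, f" {op} ")
    out, ops = [], []
    for tok in tokens.split():
        if tok == "(":
            ops.append(tok)
        elif tok == ")":
            while ops[-1] != "(":
                out.append(ops.pop())
            ops.pop()                       # discard the "("
        elif tok in PREC:
            # pop operators of higher or equal precedence first
            while ops and ops[-1] != "(" and PREC[ops[-1]] >= PREC[tok]:
                out.append(ops.pop())
            ops.append(tok)
        else:
            out.append(tok)                 # operand
    while ops:
        out.append(ops.pop())
    return "".join(out)
```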
43. INTERMEDIATE CODE
2. 3-addr code
Triple: op arg1 arg2
Quadruple: op arg1 arg2 result
Triples are more concise,
but what if instructions are deleted,
moved or added during optimization?
Triples and quadruples
are more similar to machine code.
44. More detailed 3-addr code
Add type information.
Example: a := b*c + b*d
Suppose b, c are integer type and d is float type.
Triples (left) and quadruples (right):
(1) (I*    b   c  )    (I*    b   c   t1)
(2) (FLOAT b   _  )    (FLOAT b   t2  _ )
(3) (F*    (2) d  )    (F*    t2  d   t3)
(4) (FLOAT (1) _  )    (FLOAT t1  t4  _ )
(5) (F+    (4) (3))    (F+    t4  t3  t5)
(6) (:=    (5) a  )    (:=    t5  a   _ )
45. PARSE TREES
Parsing: build the parse tree.
Non-terminals for operator precedence and associativity are included.
[Slide diagram: a parse tree rooted at <target> := <exp>, where <exp> expands
through <exp> + <term>, <term> through <term> * <factor>, and the <factor>
leaves derive id and const.]
47. BOOLEAN EXPRESSIONS
Control flow translation of boolean
expressions:
Basic idea: generate the jumping code without evaluating the whole
boolean expression.
Example:
Let E = a < b. We will generate the code as:
(1) if a < b then goto E.true
(2) goto E.false
Grammar:
E->E or E | E and E | not E | (E) | id relop id |
true | false.
49. Example: a < b or (c < d and e < f)
Example:
while a < b do
  if c < d then
    x := y + z;
  else
    x := y - z;
50. Three address code
In three-address code there is at most one operator on the right side of
an instruction.
Example (the slide draws the expression a + a*(b-c) + (b-c)*d as a tree,
with the b-c node shared):

t1 = b - c
t2 = a * t1
t3 = a + t2
t4 = t1 * d
t5 = t3 + t4
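Emitting this code from the expression's tree (really a DAG, since b - c is shared) can be sketched as a post-order walk (Python; the tuple encoding of nodes is my own):

```python
import itertools

# Nodes are either leaf names (strings) or tuples (op, left, right).
# A shared node (the same tuple object) is emitted only once.
def gen_tac(node, code, seen, counter):
    """Return the name (leaf or temporary) holding the node's value."""
    if isinstance(node, str):
        return node                       # leaf: a variable name
    if id(node) in seen:
        return seen[id(node)]             # shared subexpression: reuse temp
    op, left, right = node
    lv = gen_tac(left, code, seen, counter)
    rv = gen_tac(right, code, seen, counter)
    t = f"t{next(counter)}"
    code.append(f"{t} = {lv} {op} {rv}")
    seen[id(node)] = t
    return t

bc = ("-", "b", "c")                      # the shared subexpression b - c
expr = ("+", ("+", "a", ("*", "a", bc)), ("*", bc, "d"))
code = []
gen_tac(expr, code, {}, itertools.count(1))
```

Running this yields exactly the slide's five instructions, with t1 = b - c computed once and reused.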
51. Forms of three address instructions
x = y op z
x = op y
x = y
goto L
if x goto L and ifFalse x goto L
if x relop y goto L
Procedure calls using:
param x
call p,n
y = call p,n
x = y[i] and x[i] = y
x = &y and x = *y and *x =y
52. Example
Source statement: do i = i+1; while (a[i] < v);

Symbolic labels:           Position numbers:
L: t1 = i + 1              100: t1 = i + 1
   i = t1                  101: i = t1
   t2 = i * 8              102: t2 = i * 8
   t3 = a[t2]              103: t3 = a[t2]
   if t3 < v goto L        104: if t3 < v goto 100
53. Data structures for three address codes
Quadruples
Has four fields: op, arg1, arg2 and result
Triples
Temporaries are not used and instead references to
instructions are made
Indirect triples
In addition to triples we use a list of pointers to triples
54. Example
b * minus c + b * minus c
Three address code:
t1 = minus c
t2 = b * t1
t3 = minus c
t4 = b * t3
t5 = t2 + t4
a = t5

Quadruples:
      op     arg1  arg2  result
(0)   minus  c           t1
(1)   *      b     t1    t2
(2)   minus  c           t3
(3)   *      b     t3    t4
(4)   +      t2    t4    t5
(5)   =      t5          a

Triples:
      op     arg1  arg2
(0)   minus  c
(1)   *      b     (0)
(2)   minus  c
(3)   *      b     (2)
(4)   +      (1)   (3)
(5)   =      a     (4)

Indirect triples: an instruction list of pointers (numbered 35-40 on the slide),
each pointing to one of the triples (0)-(5) above.
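The conversion from quadruples to triples can be sketched as follows (Python; the tuple encodings are my own, and the assignment triple keeps its target explicit):

```python
# Quadruples for  b * minus c + b * minus c ; a = result
quads = [
    ("minus", "c", None, "t1"),
    ("*",     "b", "t1", "t2"),
    ("minus", "c", None, "t3"),
    ("*",     "b", "t3", "t4"),
    ("+",     "t2", "t4", "t5"),
    ("=",     "t5", None, "a"),
]

def quads_to_triples(quads):
    where = {}                       # temporary name -> index of its triple
    triples = []
    for op, a1, a2, res in quads:
        a1 = where.get(a1, a1)       # replace temporaries by instruction refs
        a2 = where.get(a2, a2)
        if op == "=":
            triples.append(("=", res, a1))   # assignment target stays a name
        else:
            where[res] = len(triples)
            triples.append((op, a1, a2))
    return triples
```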
55. ASSIGNMENT STATEMENTS
The assignment statement mainly deals with expressions, which
can be of type integer, real, array, or record.
Consider the following grammar:
S -> id := E
E -> E1 + E2
E -> E1 * E2
E -> -E1
E -> (E1)
E -> id
56. The translation scheme of the above grammar is given below:

Production Rule    Semantic Actions
S -> id := E       { p = look_up(id.name);
                     if p ≠ nil then
                       emit(p ':=' E.place)
                     else
                       error; }
E -> E1 + E2       { E.place = newtemp();
                     emit(E.place '=' E1.place '+' E2.place) }
E -> E1 * E2       { E.place = newtemp();
                     emit(E.place '=' E1.place '*' E2.place) }
E -> -E1           { E.place = newtemp();
                     emit(E.place '=' 'uminus' E1.place) }
E -> (E1)          { E.place = E1.place }
E -> id            { p = look_up(id.name);
                     if p ≠ nil then
                       E.place = p
                     else
                       error; }
59. Control flow (or alternatively, flow of control) is the order in which
individual statements, instructions, or function calls of an imperative
program are executed or evaluated.
A control flow statement is a statement whose execution results in a choice
being made as to which of two or more paths should be followed.
A set of statements is in turn generally structured as a block,
which in addition to grouping also defines a lexical scope.
60. Postfix notation
The postfix notation for an expression E can be defined:-
1. If E is a variable or constant, then the postfix notation for E is E itself.
2. If E is an expression of the form E1 op E2, where op is any binary
operator, then the postfix notation for E is E1' E2' op, where E1' and E2'
are the postfix notations for E1 and E2, respectively.
3. If E is a parenthesized expression of the form (E1), then the postfix
notation for E is the same as the postfix notation for E1.
61. UNIT-IV SYMBOL TABLE
Symbol table: a data structure used by a compiler to keep track of the
semantics of variables:
What it is: data type.
Where it is valid: scope, the effective context where a name is valid.
Where it is stored: storage address.
Possible implementations:
Unordered list: for a very small set of variables.
Ordered linear list: insertion is expensive, but implementation is
relatively easy.
62. Data structure for symbol tables
Possible entries in a symbol table:
Name: a string.
Attribute:
Reserved word
Variable name
Type name
Procedure name
Constant name
Data type.
Scope information: where it can be used.
Storage allocation, size
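A minimal scoped symbol table along these lines can be sketched as follows (Python; the class and method names are my own): a stack of dictionaries, one per lexical scope, searched innermost-first on lookup.

```python
# Minimal scoped symbol table: each entry maps a name to its attributes
# (e.g. type, size, address), and each lexical scope is one dictionary.
class SymbolTable:
    def __init__(self):
        self.scopes = [{}]                  # the global scope

    def enter_scope(self):
        self.scopes.append({})              # opening a block

    def exit_scope(self):
        self.scopes.pop()                   # closing a block drops its names

    def insert(self, name, **attrs):        # e.g. type="int", size=4
        self.scopes[-1][name] = attrs

    def lookup(self, name):
        for scope in reversed(self.scopes): # innermost scope wins
            if name in scope:
                return scope[name]
        return None                         # undeclared name
```

An inner declaration shadows an outer one until its scope is exited, which is exactly the scope behaviour described above.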