Chapter 4: Syntax Analysis 
Principles of Programming Languages
Contents 
•Non-regular languages 
•Context-free grammars and BNF 
•Formal Methods of Describing Syntax 
•Parsing problems 
•Top-down parsing and LL grammar 
•Combinator Parser in Scala
Non-Regular Language 
•Write a regular expression for the following language
L = {a^n b^n : n ≥ 0}
Context-Free Grammars 
•Context-Free Grammars 
–Developed by Noam Chomsky in the mid-1950s 
–Language generators, meant to describe the syntax of natural languages 
–Define a class of languages called context-free languages
Chomsky Hierarchy 
Grammar | Language | Automaton | Restrictions on rules (w1 → w2)
Type-0 | Phrase-structure | Turing machine | w1 = any string with at least 1 non-terminal; w2 = any string
Type-1 | Context-sensitive | Bounded Turing machine | w1 = any string with at least 1 non-terminal; w2 = any string at least as long as w1
Type-2 | Context-free | Non-deterministic pushdown automaton | w1 = one non-terminal; w2 = any string
Type-3 | Regular | Finite state automaton | w1 = one non-terminal; w2 = tA or t (t = terminal, A = non-terminal)
Backus-Naur Form (BNF) 
•Backus-Naur Form (1959) 
–Invented by John Backus to describe ALGOL 58 
–Revised by Peter Naur in ALGOL 60 
–BNF is equivalent to context-free grammars 
–BNF is a metalanguage used to describe another language
BNF Fundamentals 
•Non-terminals: BNF abstractions 
•Terminals: lexemes and tokens 
•Grammar: a collection of rules
Rules 
•A rule has a left-hand side (LHS) and a right-hand side (RHS), and consists of terminal and nonterminal symbols
•A nonterminal symbol can have more than one RHS
stmt → single_stmt
| "begin" stmt_list "end"
Regular vs. Context-Free 
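•Regular languages are a subset of context-free languages
•Languages are generated by grammars
•Regular grammars have rules of the form
A → α | αB (α is a string of terminals; at most one nonterminal, at the right end)
•Context-free grammars have rules of the form
A → γ, where γ is any string of terminals and nonterminals, e.g. A → αBβ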
Regular vs. Context-Free 
•Context-Free languages are recognized by Nondeterministic Pushdown Automata 
•A Pushdown Automaton is a Finite Automaton with “pushdown store”, or “stack” 
e.g., L = {a^n b^n : n ≥ 0}
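A minimal Scala sketch of the stack idea, assuming nothing beyond the standard library. It is not a general pushdown automaton: it pushes a marker for every a and pops one for every b. The object name `AnBn` and the marker symbol `'A'` are illustrative choices, not part of the slides. This particular language happens to need no nondeterminism.

```scala
object AnBn {
  // Accepts exactly the strings a^n b^n (n >= 0).
  // State = (stack, seenB); None means the machine is stuck and rejects.
  def accepts(input: String): Boolean = {
    val start: Option[(List[Char], Boolean)] = Some((Nil, false))
    val end = input.foldLeft(start) {
      // Reading 'a' is only allowed before the first 'b': push a marker.
      case (Some((stack, false)), 'a') => Some(('A' :: stack, false))
      // Reading 'b' needs a marker on the stack: pop it.
      case (Some((_ :: rest, _)), 'b') => Some((rest, true))
      // Anything else (extra 'b', 'a' after 'b', other characters): reject.
      case _ => None
    }
    // Accept only if every pushed marker was popped again.
    end.exists { case (stack, _) => stack.isEmpty }
  }
}
```

For example, `AnBn.accepts("aabb")` holds while `AnBn.accepts("aab")` does not. A finite automaton has no such unbounded store, which is why no regular expression describes this language.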
Nondeterministic Pushdown Automata
Describing Lists 
•List syntax: a,b,c,d,… 
•Syntactic lists are described using recursion 
•To describe a comma-separated list of IDENT tokens
ident_list → IDENT
| IDENT "," ident_list
Derivation 
•A derivation is a repeated application of rules, starting with the start symbol and ending with a sentence (all terminal symbols)
An Example Grammar 
program → stmts
stmts → stmt | stmt ";" stmts
stmt → var "=" expr
var → "a" | "b" | "c" | "d"
expr → term "+" term | term "-" term
term → var | CONST
An Example Derivation 
<program> => <stmts> => <stmt> 
=> <var> = <expr> => a = <expr>
=> a = <term> + <term> 
=> a = <var> + <term> 
=> a = b + <term> 
=> a = b + CONST
Derivation 
•A leftmost (rightmost) derivation is one in which the leftmost (rightmost) nonterminal in each sentential form is the one that is expanded 
•A derivation may be neither leftmost nor rightmost
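The example derivation can be replayed mechanically. The sketch below encodes the example grammar as a map and always expands the leftmost nonterminal. The names `Derivation`, `step` and `derive` are assumptions made for illustration. Each number in `choices` selects which alternative of that nonterminal to apply, counting from 0, and the list is assumed to be no longer than the derivation.

```scala
object Derivation {
  // Nonterminals are the keys of this map; every other symbol is a terminal.
  // Each nonterminal maps to its alternative right-hand sides.
  val grammar: Map[String, List[List[String]]] = Map(
    "program" -> List(List("stmts")),
    "stmts"   -> List(List("stmt"), List("stmt", ";", "stmts")),
    "stmt"    -> List(List("var", "=", "expr")),
    "var"     -> List(List("a"), List("b"), List("c"), List("d")),
    "expr"    -> List(List("term", "+", "term"), List("term", "-", "term")),
    "term"    -> List(List("var"), List("CONST"))
  )

  // One leftmost step: replace the leftmost nonterminal by alternative `choice`.
  // Assumes the sentential form still contains a nonterminal.
  def step(form: List[String], choice: Int): List[String] = {
    val i = form.indexWhere(grammar.contains)
    form.take(i) ++ grammar(form(i))(choice) ++ form.drop(i + 1)
  }

  // All sentential forms of the leftmost derivation selected by `choices`.
  def derive(choices: List[Int]): List[String] =
    choices.scanLeft(List("program"))(step).map(_.mkString(" "))
}
```

With `Derivation.derive(List(0, 0, 0, 0, 0, 0, 1, 1))` the forms run from `program` through `a = term + term` to the sentence `a = b + CONST`, matching the derivation on the earlier slide.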
Parse Tree 
•A hierarchical representation of a derivation 
<program>
  <stmts>
    <stmt>
      <var>
        a
      =
      <expr>
        <term>
          <var>
            b
        +
        <term>
          CONST
<program> => <stmts> => <stmt> 
=> <var> = <expr> 
=> a = <expr> 
=> a = <term> + <term> 
=> a = <var> + <term> 
=> a = b + <term> 
=> a = b + CONST
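A parse tree is easy to hold as a data structure. The following sketch uses hypothetical names (`Tree`, `Node`, `Leaf`, `leaves`) and builds the tree for the derivation above by hand. Reading its leaves from left to right gives back the derived sentence.

```scala
object ParseTreeDemo {
  // Interior nodes are nonterminals, leaves are terminals.
  sealed trait Tree
  case class Node(label: String, children: List[Tree]) extends Tree
  case class Leaf(token: String) extends Tree

  // The leaves of a tree, read left to right.
  def leaves(t: Tree): List[String] = t match {
    case Leaf(tok)     => List(tok)
    case Node(_, kids) => kids.flatMap(leaves)
  }

  // The parse tree of "a = b + CONST" under the example grammar.
  val tree: Tree =
    Node("program", List(
      Node("stmts", List(
        Node("stmt", List(
          Node("var", List(Leaf("a"))),
          Leaf("="),
          Node("expr", List(
            Node("term", List(Node("var", List(Leaf("b"))))),
            Leaf("+"),
            Node("term", List(Leaf("CONST")))
          ))
        ))
      ))
    ))
}
```

Here `ParseTreeDemo.leaves(ParseTreeDemo.tree).mkString(" ")` evaluates to `a = b + CONST`. Different derivation orders (leftmost, rightmost, mixed) of the same sentence all correspond to this one tree.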
Exercises
•Consider the BNF
S → "(" L ")" | "a"
L → L "," S | S
Draw parse trees for the derivation of: 
(a, a) 
(a, ((a, a), (a, a)))
Ambiguity in Grammars 
•A grammar is ambiguous if and only if it generates a sentential form that has two or more distinct parse trees
An Ambiguous Expression Grammar 
expr → expr op expr | CONST
op → "/" | "-"
Two distinct parse trees for CONST - CONST / CONST:
<expr>
  <expr>
    <expr> CONST
    <op> -
    <expr> CONST
  <op> /
  <expr> CONST

<expr>
  <expr> CONST
  <op> -
  <expr>
    <expr> CONST
    <op> /
    <expr> CONST
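Why ambiguity matters can be seen by evaluating every parse tree the grammar allows. The sketch below is an illustration only (the names `Ambiguity`, `applyOp` and `allValues` are assumptions). It treats each operator in turn as the root of the tree and combines the values of all left and right subtrees, using integer arithmetic.

```scala
object Ambiguity {
  // The grammar has only the operators "-" and "/".
  def applyOp(op: Char, l: Int, r: Int): Int = op match {
    case '-' => l - r
    case '/' => l / r
  }

  // Input is split into operands and the operators between them,
  // e.g. 8 - 4 / 2 is (List(8, 4, 2), List('-', '/')).
  // Returns one value per parse tree of expr -> expr op expr | CONST.
  def allValues(operands: List[Int], ops: List[Char]): List[Int] =
    if (operands.size == 1) operands
    else
      for {
        i <- ops.indices.toList                               // operator at the root
        l <- allValues(operands.take(i + 1), ops.take(i))     // left subtrees
        r <- allValues(operands.drop(i + 1), ops.drop(i + 1)) // right subtrees
      } yield applyOp(ops(i), l, r)
}
```

For 8 - 4 / 2 this yields two values: 6 for the tree 8 - (4 / 2) and 2 for the tree (8 - 4) / 2. A compiler that builds code from the parse tree needs exactly one tree per sentence, so the grammar has to be rewritten.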
Exercises 
•Is the following grammar ambiguous? 
A → A "and" A | "not" A | "0" | "1"
Exercises 
•Write BNF for well-formed parentheses
e.g., ( ) and ((())) are well-formed; ))(( and ((() are not
Precedence of Operators 
•Use the parse tree to indicate precedence levels of the operators 
expr → expr "-" term | term
term → term "/" CONST | CONST
Parse tree for CONST - CONST / CONST:
<expr>
  <expr>
    <term> CONST
  -
  <term>
    <term> CONST
    /
    CONST
Associativity of Operators
•Operator associativity can also be indicated by a grammar
expr → expr "+" expr | CONST (ambiguous)
expr → expr "+" CONST | CONST (unambiguous)
[Figure: parse tree for CONST + CONST + CONST]
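The direction of the recursion in a rule fixes the grouping. As a sketch, the two hypothetical functions below mirror a left-recursive and a right-recursive rule for "-", where the grouping changes the result. Both assume a non-empty list of constants.

```scala
object Associativity {
  // expr -> expr "-" CONST | CONST   (left recursive: groups to the left)
  def leftAssoc(consts: List[Int]): Int =
    if (consts.size == 1) consts.head
    else leftAssoc(consts.init) - consts.last

  // expr -> CONST "-" expr | CONST   (right recursive: groups to the right)
  def rightAssoc(consts: List[Int]): Int =
    if (consts.size == 1) consts.head
    else consts.head - rightAssoc(consts.tail)
}
```

For 10 - 4 - 3, `leftAssoc` computes (10 - 4) - 3 = 3 and `rightAssoc` computes 10 - (4 - 3) = 9.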
Exercises 
•Rewrite the grammar to fulfill the following requirements: 
–operator “*” takes lower precedence than “+” 
–operator “-” is right-associative
Extended BNF 
•Optional parts are placed in ( )?
proc_call → IDENT ("(" expr_list ")")?
•Alternative parts of RHSs are placed inside parentheses and separated by vertical bars
term → term ("+"|"-") CONST
•Repetitions (0 or more) are placed in ( )*
IDENT → letter (letter|digit)*
BNF and EBNF 
•BNF 
expr → expr "+" term
| expr "-" term
| term
term → term "*" factor
| term "/" factor
| factor
•EBNF
expr → term (("+" | "-") term)*
term → factor (("*" | "/") factor)*
Exercises 
•Write EBNF descriptions for a function header as follows:
•The function declaration begins with a function keyword, then the function name, an opening parenthesis '(', a semicolon-separated parameter list, a closing parenthesis ')', a colon ':', a return type, a semi-colon and the body of the function. A function declaration is terminated by a semi-colon ';'. 
•The parameter list of a function declaration may contain zero or more parameters. A parameter consists of an identifier, a colon and a parameter type. If two or more consecutive parameters have the same type, they could be reduced to a shorter form: a comma delimited list of these parameter names, followed by a colon ':' and the shared parameter type. 
•For example,
function area(a: real; b: real; c: real): real;
could be rewritten as
function area(a,b,c: real): real;
•A return type must be a primitive type. A parameter type could be a primitive type or an array type. The body of a function is simply a block statement.
Parsing 
•Goals of the parser, given an input program: 
–Find all syntax errors; for each, produce an appropriate diagnostic message, and recover quickly 
–Produce the parse tree, or at least a trace of the parse tree, for the program
Parser Generator 
•Parser generator: Produces a parser from a given grammar 
•Works in conjunction with lexical analyzer 
•Examples: yacc, ANTLR, JavaCC
•Programmer specifies actions to be taken during parsing
•No free lunch: need to know parsing theory to build efficient parsers
•Scala provides a “combinator parser” library (later slides); since Scala 2.11 it ships as the separate scala-parser-combinators module
Parsing 
•Two categories of parsers 
–Top down - produce the parse tree, beginning at the root 
–Bottom up - produce the parse tree, beginning at the leaves (not covered here)
•Practical parsers look only one token ahead in the input
The Parsing Problem 
•The Complexity of Parsing 
–Parsers that work for any unambiguous grammar are complex and inefficient (O(n^3), where n is the length of the input)
–Compilers use parsers that only work for a subset of all unambiguous grammars, but do it in linear time (O(n), where n is the length of the input)
Top-down Parsers 
•Top-down Parsers 
–Given a sentential form, xAα, the parser must choose the correct A-rule to get the next sentential form in the leftmost derivation, using only the first token produced by A
•The most common top-down parsing algorithms: 
–Recursive descent - a coded implementation 
–LL parsers - table driven implementation
Recursive-Descent Parsing 
•There is a subprogram for each nonterminal in the grammar, which can parse sentences that can be generated by that nonterminal 
•EBNF is ideally suited for being the basis for a recursive-descent parser, because EBNF minimizes the number of nonterminals
Recursive-Descent Parsing (cont.) 
•A grammar for simple expressions: 
expr → term (("+" | "-") term)*
term → factor (("*" | "/") factor)*
factor → ID | "(" expr ")"
Recursive-Descent Parsing (cont.) 
•Assume we have a lexical analyzer which puts the next token code in a variable 
•The coding process when there is only one RHS: 
–For each terminal symbol in the RHS, compare it with the next input token; if they match, continue, else there is an error 
–For each nonterminal symbol in the RHS, call its associated parsing subprogram
Recursive-Descent Parsing (cont.) 
expr → term (("+" | "-") term)*
Recursive-Descent Parsing (cont.) 
•A nonterminal that has more than one RHS requires an initial process to determine which RHS it is to parse 
–The correct RHS is chosen on the basis of the next token of input (the lookahead) 
–The next token is compared with the first token that can be generated by each RHS until a match is found 
–If no match is found, it is a syntax error
Recursive-Descent Parsing (cont.) 
factor → ID | "(" expr ")"
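The coding process described on the preceding slides can be sketched for the whole expression grammar as follows. This is one possible hand-written version, not the code from the original slides. It assumes the input has already been split into tokens, that an ID is any token made of letters only, and it returns a fully parenthesized string so the grouping is visible. The names `RecursiveDescent`, `Parser`, `next`, `advance` and `parseAll` are hypothetical.

```scala
object RecursiveDescent {
  // Hypothetical helper class: holds the token list and the current position.
  class Parser(tokens: List[String]) {
    private var pos = 0
    // The lookahead: the next unread token, or "<eof>" at the end of input.
    private def next: String = if (pos < tokens.size) tokens(pos) else "<eof>"
    private def advance(): Unit = pos += 1
    // Assumption: an ID is a non-empty token consisting of letters only.
    private def isId(t: String): Boolean = t.nonEmpty && t.forall(_.isLetter)

    // expr -> term (("+" | "-") term)*   (the EBNF repetition becomes a loop)
    def expr(): String = {
      var left = term()
      while (next == "+" || next == "-") {
        val op = next; advance()
        left = s"($left $op ${term()})"
      }
      left
    }

    // term -> factor (("*" | "/") factor)*
    def term(): String = {
      var left = factor()
      while (next == "*" || next == "/") {
        val op = next; advance()
        left = s"($left $op ${factor()})"
      }
      left
    }

    // factor -> ID | "(" expr ")"   (the lookahead selects the RHS)
    def factor(): String = {
      if (isId(next)) { val id = next; advance(); id }
      else if (next == "(") {
        advance()
        val inner = expr()
        if (next != ")") sys.error(s"expected ) but found $next")
        advance()
        inner
      }
      else sys.error(s"unexpected token $next")
    }

    // Parse a complete input: an expr followed by the end of input.
    def parseAll(): String = {
      val result = expr()
      if (next != "<eof>") sys.error(s"unexpected token $next")
      result
    }
  }

  def parse(tokens: List[String]): String = new Parser(tokens).parseAll()
}
```

`RecursiveDescent.parse(List("a", "+", "b", "*", "c"))` gives `(a + (b * c))`, so the layering of expr over term encodes precedence, and `a - b - c` comes out as `((a - b) - c)`. A malformed input such as `a +` ends in a runtime error, which stands in for real diagnostics here.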
Top-down Problems 1 
•The LL Grammar Class 
–The Left Recursion Problem 
•If a grammar has left recursion, either direct or indirect, it cannot be the basis for a top-down parser 
–A grammar can be modified to remove left recursion
Left Recursion Elimination 
•For each nonterminal, A, 
1. Group the A-rules as 
A → Aα1 | … | Aαm | β1 | β2 | … | βn
(where no βi begins with A)
2. Replace the original A-rules with
A → β1A’ | β2A’ | … | βnA’
A’ → α1A’ | α2A’ | … | αmA’ | ε
Example 
E → E "+" T | T
T → T "*" F | F
F → "(" E ")" | id
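The two-step procedure for direct left recursion can be written down almost literally. In this sketch an RHS is a list of symbols, the empty list stands for ε, and the new nonterminal is named by appending a prime. The names `LeftRecursion`, `Rhs` and `eliminate` are assumptions for illustration, and indirect left recursion is not handled.

```scala
object LeftRecursion {
  type Rhs = List[String]

  // Removes direct left recursion from the rules of nonterminal `a`.
  // Returns the new rules for `a` and for the fresh nonterminal a'.
  def eliminate(a: String, rules: List[Rhs]): Map[String, List[Rhs]] = {
    // Step 1: group the rules into A -> A alpha_i and A -> beta_j.
    val (recursive, others) = rules.partition(_.headOption.contains(a))
    if (recursive.isEmpty) Map(a -> rules) // nothing to do
    else {
      val fresh  = a + "'"
      val alphas = recursive.map(_.tail)
      // Step 2: A -> beta_j A'  and  A' -> alpha_i A' | epsilon.
      Map(
        a     -> others.map(_ :+ fresh),
        fresh -> (alphas.map(_ :+ fresh) :+ Nil)
      )
    }
  }
}
```

Applied to the example, `eliminate("E", List(List("E", "+", "T"), List("T")))` produces E → T E' and E' → "+" T E' | ε. The T-rules transform the same way, and the F-rules are returned unchanged because they are not left recursive.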
Top-down Problems 2 
•The RHSs of each nonterminal have to be pairwise disjoint
–Have to determine the correct RHS on the basis of one token of lookahead
–Def: FIRST(α) = {a | α =>* aβ}
–For each nonterminal, A, in the grammar that has more than one RHS, for each pair of rules, A → αi and A → αj, it must be true that
FIRST(αi) ∩ FIRST(αj) = ∅
Pairwise Disjointness 
•Examples: 
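A → a | bB | cAb (pairwise disjoint: the FIRST sets are {a}, {b}, {c})
A → a | aB (not pairwise disjoint: both RHSs begin with a)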
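The pairwise disjointness test can be sketched with a simplified FIRST computation. The version below assumes that no RHS is empty and that the grammar has no left recursion, so it is not the full textbook algorithm. Symbols without rules in the map count as terminals. The names `PairwiseDisjoint`, `first` and `disjoint` are hypothetical.

```scala
object PairwiseDisjoint {
  type Rhs = List[String]

  // FIRST of one RHS, simplified: no empty RHSs, no left recursion.
  def first(g: Map[String, List[Rhs]], rhs: Rhs): Set[String] =
    g.get(rhs.head) match {
      case None       => Set(rhs.head)                   // terminal
      case Some(alts) => alts.flatMap(first(g, _)).toSet // nonterminal
    }

  // True if the FIRST sets of all RHSs of `a` are pairwise disjoint.
  def disjoint(g: Map[String, List[Rhs]], a: String): Boolean = {
    val sets = g(a).map(first(g, _))
    val pairs =
      for (i <- sets.indices; j <- sets.indices if i < j)
        yield (sets(i), sets(j))
    pairs.forall { case (x, y) => (x & y).isEmpty }
  }
}
```

With the first example grammar, A → a | bB | cAb, `disjoint` returns true. With A → a | aB it returns false, because a single token of lookahead cannot tell the two RHSs apart.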
Pairwise Disjointness 
•Left factoring can resolve the problem 
Replace 
variable → IDENTIFIER | IDENTIFIER "[" expression "]"
with
variable → IDENTIFIER new
new → ε | "[" expression "]"
or
variable → IDENTIFIER ("[" expression "]")?
Example 
A → a | aB
Building Grammar Techniques 
•Repetition 
–[10][10][20] 
•Repetition with separators 
–id(arg1, arg2, arg3,…) 
•Operator precedence 
•Operator associativity 
•Avoiding left recursion 
•Left factoring 
•Disambiguation
Scala Combinator Parser 
•Each nonterminal becomes a function 
•Terminals (strings) and nonterminals (functions) are combined with operators 
–Sequence ~ 
–Alternative | 
–0 or more rep(...) 
–0 or 1 opt(...)
Combinator Parser 
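expr → term (("+" | "-") expr)?
term → factor (("*" | "/") term)?
factor → wholeNumber | "(" expr ")"
import scala.util.parsing.combinator._
// Each grammar rule becomes a method; JavaTokenParsers supplies wholeNumber
class SimpleLanguageParser extends JavaTokenParsers {
  // ~ is sequence, | is alternative, opt(...) is the optional part ( )?
  def expr: Parser[Any] = term ~ opt(("+" | "-") ~ expr)
  def term: Parser[Any] = factor ~ opt(("*" | "/" ) ~ term)
  def factor: Parser[Any] = wholeNumber | "(" ~ expr ~ ")"
}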
•Will replace Parser[Any] with something more useful later
Combinator Parser Results 
•String returns itself 
•opt(P) returns Option: Some of the result of P, or None 
•rep(P) returns List of the results of P 
•P ~ Q returns instance of class ~ (similar to a pair)
Example 
val parser = new SimpleLanguageParser 
val result = parser.parse(parser.expr, "3 - 4 * 5") 
sets result to 
((3~None)~Some((-~((4~Some((*~(5~None))))~None)))) 
•After changing (x~None) to x, Some(y) to y, and ~ to spaces: (3 (- (4 (* 5)))) 
•Way to do it: wait until labs
Summary 
•BNF and context-free grammars are equivalent meta-languages 
–Well-suited for describing the syntax of programming languages 
•Syntax analyzers: 
–Detect syntax errors
–Produce a parse tree
•A recursive-descent parser is an LL parser
–EBNF is a natural basis for writing one

More Related Content

What's hot (20)

Language for specifying lexical Analyzer
Language for specifying lexical AnalyzerLanguage for specifying lexical Analyzer
Language for specifying lexical Analyzer
 
Lexical analyzer
Lexical analyzerLexical analyzer
Lexical analyzer
 
A Role of Lexical Analyzer
A Role of Lexical AnalyzerA Role of Lexical Analyzer
A Role of Lexical Analyzer
 
Lexical
LexicalLexical
Lexical
 
A simple approach of lexical analyzers
A simple approach of lexical analyzersA simple approach of lexical analyzers
A simple approach of lexical analyzers
 
Parsing (Automata)
Parsing (Automata)Parsing (Automata)
Parsing (Automata)
 
Syntax analysis
Syntax analysisSyntax analysis
Syntax analysis
 
Syntactic analysis in NLP
Syntactic analysis in NLPSyntactic analysis in NLP
Syntactic analysis in NLP
 
02. chapter 3 lexical analysis
02. chapter 3   lexical analysis02. chapter 3   lexical analysis
02. chapter 3 lexical analysis
 
Types of Parser
Types of ParserTypes of Parser
Types of Parser
 
2_2Specification of Tokens.ppt
2_2Specification of Tokens.ppt2_2Specification of Tokens.ppt
2_2Specification of Tokens.ppt
 
Parsing
ParsingParsing
Parsing
 
Lexical analysis - Compiler Design
Lexical analysis - Compiler DesignLexical analysis - Compiler Design
Lexical analysis - Compiler Design
 
Lexical Analyzer Implementation
Lexical Analyzer ImplementationLexical Analyzer Implementation
Lexical Analyzer Implementation
 
Lexical Analysis - Compiler design
Lexical Analysis - Compiler design Lexical Analysis - Compiler design
Lexical Analysis - Compiler design
 
1.Role lexical Analyzer
1.Role lexical Analyzer1.Role lexical Analyzer
1.Role lexical Analyzer
 
Role-of-lexical-analysis
Role-of-lexical-analysisRole-of-lexical-analysis
Role-of-lexical-analysis
 
4 lexical and syntax
4 lexical and syntax4 lexical and syntax
4 lexical and syntax
 
Syntax analysis
Syntax analysisSyntax analysis
Syntax analysis
 
System Programming Unit IV
System Programming Unit IVSystem Programming Unit IV
System Programming Unit IV
 

Viewers also liked

Programming Haskell Chapter8
Programming Haskell Chapter8Programming Haskell Chapter8
Programming Haskell Chapter8Kousuke Ruichi
 
Algorithms Vs Meta Language
Algorithms Vs Meta LanguageAlgorithms Vs Meta Language
Algorithms Vs Meta LanguageKelly Bauer
 
Untitled Presentation
Untitled PresentationUntitled Presentation
Untitled Presentationbaran19901990
 
Nhập môn công tác kỹ sư
Nhập môn công tác kỹ sưNhập môn công tác kỹ sư
Nhập môn công tác kỹ sưbaran19901990
 
Config websocket on apache
Config websocket on apacheConfig websocket on apache
Config websocket on apachebaran19901990
 
Principles of programming languages. Detail notes
Principles of programming languages. Detail notesPrinciples of programming languages. Detail notes
Principles of programming languages. Detail notesVIKAS SINGH BHADOURIA
 
Bus tracking application in Android
Bus tracking application in AndroidBus tracking application in Android
Bus tracking application in Androidyashonil
 
Meta Languages Bnf Ebnf Student Version
Meta Languages Bnf Ebnf Student VersionMeta Languages Bnf Ebnf Student Version
Meta Languages Bnf Ebnf Student VersionKelly Bauer
 
09 implementing+subprograms
09 implementing+subprograms09 implementing+subprograms
09 implementing+subprogramsbaran19901990
 
How to build a news website use CMS wordpress
How to build a news website use CMS wordpressHow to build a news website use CMS wordpress
How to build a news website use CMS wordpressbaran19901990
 
How to install nginx vs unicorn
How to install nginx vs unicornHow to install nginx vs unicorn
How to install nginx vs unicornbaran19901990
 
Tìm đường đi xe buýt trong TPHCM bằng Google Map
Tìm đường đi xe buýt trong TPHCM bằng Google MapTìm đường đi xe buýt trong TPHCM bằng Google Map
Tìm đường đi xe buýt trong TPHCM bằng Google Mapbaran19901990
 
NS2: Binding C++ and OTcl variables
NS2: Binding C++ and OTcl variablesNS2: Binding C++ and OTcl variables
NS2: Binding C++ and OTcl variablesTeerawat Issariyakul
 
Memory allocation
Memory allocationMemory allocation
Memory allocationsanya6900
 

Viewers also liked (20)

Programming Haskell Chapter8
Programming Haskell Chapter8Programming Haskell Chapter8
Programming Haskell Chapter8
 
Algorithms Vs Meta Language
Algorithms Vs Meta LanguageAlgorithms Vs Meta Language
Algorithms Vs Meta Language
 
Subprogram
SubprogramSubprogram
Subprogram
 
Untitled Presentation
Untitled PresentationUntitled Presentation
Untitled Presentation
 
Nhập môn công tác kỹ sư
Nhập môn công tác kỹ sưNhập môn công tác kỹ sư
Nhập môn công tác kỹ sư
 
Config websocket on apache
Config websocket on apacheConfig websocket on apache
Config websocket on apache
 
Principles of programming languages. Detail notes
Principles of programming languages. Detail notesPrinciples of programming languages. Detail notes
Principles of programming languages. Detail notes
 
Bus tracking application in Android
Bus tracking application in AndroidBus tracking application in Android
Bus tracking application in Android
 
Meta Languages Bnf Ebnf Student Version
Meta Languages Bnf Ebnf Student VersionMeta Languages Bnf Ebnf Student Version
Meta Languages Bnf Ebnf Student Version
 
Hadoop
HadoopHadoop
Hadoop
 
09 implementing+subprograms
09 implementing+subprograms09 implementing+subprograms
09 implementing+subprograms
 
How to build a news website use CMS wordpress
How to build a news website use CMS wordpressHow to build a news website use CMS wordpress
How to build a news website use CMS wordpress
 
How to install nginx vs unicorn
How to install nginx vs unicornHow to install nginx vs unicorn
How to install nginx vs unicorn
 
Introduction to HBase
Introduction to HBaseIntroduction to HBase
Introduction to HBase
 
08 subprograms
08 subprograms08 subprograms
08 subprograms
 
Datatype
DatatypeDatatype
Datatype
 
Chapter2
Chapter2Chapter2
Chapter2
 
Tìm đường đi xe buýt trong TPHCM bằng Google Map
Tìm đường đi xe buýt trong TPHCM bằng Google MapTìm đường đi xe buýt trong TPHCM bằng Google Map
Tìm đường đi xe buýt trong TPHCM bằng Google Map
 
NS2: Binding C++ and OTcl variables
NS2: Binding C++ and OTcl variablesNS2: Binding C++ and OTcl variables
NS2: Binding C++ and OTcl variables
 
Memory allocation
Memory allocationMemory allocation
Memory allocation
 

Similar to Control structure

Similar to Control structure (20)

Syntax
SyntaxSyntax
Syntax
 
Syntax Analyzer.pdf
Syntax Analyzer.pdfSyntax Analyzer.pdf
Syntax Analyzer.pdf
 
3. Syntax Analyzer.pptx
3. Syntax Analyzer.pptx3. Syntax Analyzer.pptx
3. Syntax Analyzer.pptx
 
CS-4337_03_Chapter3- syntax and semantics.pdf
CS-4337_03_Chapter3- syntax and semantics.pdfCS-4337_03_Chapter3- syntax and semantics.pdf
CS-4337_03_Chapter3- syntax and semantics.pdf
 
Predictive parser
Predictive parserPredictive parser
Predictive parser
 
Unitiv 111206005201-phpapp01
Unitiv 111206005201-phpapp01Unitiv 111206005201-phpapp01
Unitiv 111206005201-phpapp01
 
LVEE 2014: Text parsing with Python and PLY
LVEE 2014: Text parsing with Python and PLYLVEE 2014: Text parsing with Python and PLY
LVEE 2014: Text parsing with Python and PLY
 
CH 2.pptx
CH 2.pptxCH 2.pptx
CH 2.pptx
 
Theory of computing
Theory of computingTheory of computing
Theory of computing
 
Regular Expression to Finite Automata
Regular Expression to Finite AutomataRegular Expression to Finite Automata
Regular Expression to Finite Automata
 
Pcd question bank
Pcd question bank Pcd question bank
Pcd question bank
 
UNIT 1 part II.ppt
UNIT 1 part II.pptUNIT 1 part II.ppt
UNIT 1 part II.ppt
 
Chapter-3 compiler.pptx course materials
Chapter-3 compiler.pptx course materialsChapter-3 compiler.pptx course materials
Chapter-3 compiler.pptx course materials
 
Module 11
Module 11Module 11
Module 11
 
8074448.ppt
8074448.ppt8074448.ppt
8074448.ppt
 
9781284077247_PPTx_CH01.pptx
9781284077247_PPTx_CH01.pptx9781284077247_PPTx_CH01.pptx
9781284077247_PPTx_CH01.pptx
 
Lisp and scheme i
Lisp and scheme iLisp and scheme i
Lisp and scheme i
 
Ch2 (1).ppt
Ch2 (1).pptCh2 (1).ppt
Ch2 (1).ppt
 
Lexicalanalyzer
LexicalanalyzerLexicalanalyzer
Lexicalanalyzer
 
Lexicalanalyzer
LexicalanalyzerLexicalanalyzer
Lexicalanalyzer
 

More from baran19901990

More from baran19901990 (12)

Introduction
IntroductionIntroduction
Introduction
 
10 logic+programming+with+prolog
10 logic+programming+with+prolog10 logic+programming+with+prolog
10 logic+programming+with+prolog
 
07 control+structures
07 control+structures07 control+structures
07 control+structures
 
How to install git on ubuntu
How to install git on ubuntuHow to install git on ubuntu
How to install git on ubuntu
 
Ruby notification
Ruby notificationRuby notification
Ruby notification
 
Rails notification
Rails notificationRails notification
Rails notification
 
Linux notification
Linux notificationLinux notification
Linux notification
 
Lab4
Lab4Lab4
Lab4
 
Lab5
Lab5Lab5
Lab5
 
09
0909
09
 
Báo cáo mô hình quản lý khách sạn
Báo cáo mô hình quản lý khách sạnBáo cáo mô hình quản lý khách sạn
Báo cáo mô hình quản lý khách sạn
 
MDA Framework
MDA FrameworkMDA Framework
MDA Framework
 

Recently uploaded

定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一Fs
 
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)Christopher H Felton
 
Film cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasaFilm cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasa494f574xmv
 
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一z xss
 
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170Sonam Pathan
 
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书rnrncn29
 
Top 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxTop 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxDyna Gilbert
 
Contact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New DelhiContact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New Delhimiss dipika
 
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书rnrncn29
 
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012rehmti665
 
Blepharitis inflammation of eyelid symptoms cause everything included along w...
Blepharitis inflammation of eyelid symptoms cause everything included along w...Blepharitis inflammation of eyelid symptoms cause everything included along w...
Blepharitis inflammation of eyelid symptoms cause everything included along w...Excelmac1
 
SCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is prediSCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is predieusebiomeyer
 
Intellectual property rightsand its types.pptx
Intellectual property rightsand its types.pptxIntellectual property rightsand its types.pptx
Intellectual property rightsand its types.pptxBipin Adhikari
 
Git and Github workshop GDSC MLRITM
Git and Github  workshop GDSC MLRITMGit and Github  workshop GDSC MLRITM
Git and Github workshop GDSC MLRITMgdsc13
 
Call Girls Near The Suryaa Hotel New Delhi 9873777170
Call Girls Near The Suryaa Hotel New Delhi 9873777170Call Girls Near The Suryaa Hotel New Delhi 9873777170
Call Girls Near The Suryaa Hotel New Delhi 9873777170Sonam Pathan
 
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一Fs
 
Elevate Your Business with Our IT Expertise in New Orleans
Elevate Your Business with Our IT Expertise in New OrleansElevate Your Business with Our IT Expertise in New Orleans
Elevate Your Business with Our IT Expertise in New Orleanscorenetworkseo
 

Recently uploaded (20)

定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
 
young call girls in Uttam Nagar🔝 9953056974 🔝 Delhi escort Service
young call girls in Uttam Nagar🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Uttam Nagar🔝 9953056974 🔝 Delhi escort Service
young call girls in Uttam Nagar🔝 9953056974 🔝 Delhi escort Service
 
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
 
Film cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasaFilm cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasa
 
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
 
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
 
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
 
Top 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxTop 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptx
 
Contact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New DelhiContact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New Delhi
 
Hot Sexy call girls in Rk Puram 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in  Rk Puram 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in  Rk Puram 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Rk Puram 🔝 9953056974 🔝 Delhi escort Service
 
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
 
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
 
Blepharitis inflammation of eyelid symptoms cause everything included along w...
Blepharitis inflammation of eyelid symptoms cause everything included along w...Blepharitis inflammation of eyelid symptoms cause everything included along w...
Blepharitis inflammation of eyelid symptoms cause everything included along w...
 
SCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is prediSCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is predi
 
Intellectual property rightsand its types.pptx
Intellectual property rightsand its types.pptxIntellectual property rightsand its types.pptx
Intellectual property rightsand its types.pptx
 
Git and Github workshop GDSC MLRITM
Git and Github  workshop GDSC MLRITMGit and Github  workshop GDSC MLRITM
Git and Github workshop GDSC MLRITM
 
Call Girls Near The Suryaa Hotel New Delhi 9873777170
Call Girls Near The Suryaa Hotel New Delhi 9873777170Call Girls Near The Suryaa Hotel New Delhi 9873777170
Call Girls Near The Suryaa Hotel New Delhi 9873777170
 
Model Call Girl in Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in  Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in  Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
 
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
 
Elevate Your Business with Our IT Expertise in New Orleans
Elevate Your Business with Our IT Expertise in New OrleansElevate Your Business with Our IT Expertise in New Orleans
Elevate Your Business with Our IT Expertise in New Orleans
 

Control structure

  • 1. Chapter 4: Syntax Analysis Principles of Programming Languages
  • 2. Contents •Non-regular languages •Context-free grammars and BNF •Formal Methods of Describing Syntax •Parsing problems •Top-down parsing and LL grammar •Combinator Parser in Scala
  • 3. Non-Regular Language •Write regular expression for following language L = {anbn: n ≥ 0}
  • 4. Context-Free Grammars •Context-Free Grammars –Developed by Noam Chomsky in the mid-1950s –Language generators, meant to describe the syntax of natural languages –Define a class of languages called context-free languages
  • 5. Chomsky Hierarchy Grammars Languages Automaton Restrictions (w1  w2) Type-0 Phrase-structure Turing machine w1 = any string with at least 1 non-terminal w2 = any string Type-1 Context-sensitive Bounded Turing machine w1 = any string with at least 1 non-terminal w2 = any string at least as long as w1 Type-2 Context-free Non-deterministic pushdown automaton w1 = one non-terminal w2 = any string Type-3 Regular Finite state automaton w1 = one non-terminal w2 = tA or t (t = terminal A = non-terminal)
  • 6. Backus-Naur Form (BNF) •Backus-Naur Form (1959) –Invented by John Backus to describe ALGOL 58 –Revised by Peter Naur in ALGOL 60 –BNF is equivalent to context-free grammars –BNF is a metalanguage used to describe another language
  • 7. BNF Fundamentals •Non-terminals: BNF abstractions •Terminals: lexemes and tokens •Grammar: a collection of rules
  • 8. Rules •A rule has a left-hand side (LHS) and a right- hand side (RHS), and consists of terminal and nonterminal symbols •A nonterminal symbol can have more than one RHS stmt  single_stmt | "begin" stmt_list "end"
  • 9. Regular vs. Context-Free •Regular languages are subset of context-free languages •Languages are generated by grammars •Regular grammars have form A  α | αB •Context-free grammars have form A  αBβ
  • 10. Regular vs. Context-Free •Context-Free languages are recognized by Nondeterministic Pushdown Automata •A Pushdown Automaton is a Finite Automaton with “pushdown store”, or “stack” e.g., L = {anbn: n ≥ 0}
  • 12. Describing Lists •List syntax: a,b,c,d,… •Syntactic lists are described using recursion •To describe comma-separated list of IDENT ident_list  IDENT | IDENT "," ident_list
  • 13. Derivation •A derivation is a repeated application of rules, starting with the start symbol and ending with a sentence (all terminal symbols)
  • 14. An Example Grammar program  stmts stmts  stmt | stmt ";" stmts stmt  var "=" expr var  "a" | "b" | "c" | "d" expr  term "+" term | term "-" term term  var | CONST An Example Derivation <program> => <stmts> => <stmt> => <var> = <expr> => a =<expr> => a = <term> + <term> => a = <var> + <term> => a = b + <term> => a = b + CONST
  • 15. Derivation •A leftmost (rightmost) derivation is one in which the leftmost (rightmost) nonterminal in each sentential form is the one that is expanded •A derivation may be neither leftmost nor rightmost
  • 16. Parse Tree •A hierarchical representation of a derivation <program> <stmts> <stmt> CONST a <var> = <expr> <var> b <term> + <term> <program> => <stmts> => <stmt> => <var> = <expr> => a = <expr> => a = <term> + <term> => a = <var> + <term> => a = b + <term> => a = b + CONST
  • 17. Exerices •Consider the BNF S  "(" L ")" | "a" L  L "," S | S Draw parse trees for the derivation of: (a, a) (a, ((a, a), (a, a)))
  • 18. Ambiguity in Grammars •A grammar is ambiguous if and only if it generates a sentential form that has two or more distinct parse trees
  • 19. An Ambiguous Expression Grammar
      expr → expr op expr | CONST
      op → "/" | "-"
    (Figure: two distinct parse trees for the same sentence, which has three CONST operands and the operators "-" and "/")
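To see the ambiguity concretely, the sketch below enumerates every parse tree that the grammar expr → expr op expr | CONST allows for a token sequence and evaluates each one. The names are illustrative, and the input is assumed to be pre-tokenized with numeric literals standing in for CONST.

```scala
object Ambiguity {
  sealed trait Tree
  case class Const(value: Double) extends Tree
  case class BinOp(left: Tree, op: String, right: Tree) extends Tree

  // All parse trees for `tokens` under: expr -> expr op expr | CONST
  def parses(tokens: Vector[String]): List[Tree] =
    if (tokens.length == 1) List(Const(tokens(0).toDouble))
    else for {
      i <- (1 until tokens.length by 2).toList   // candidate operator positions
      if tokens(i) == "-" || tokens(i) == "/"
      l <- parses(tokens.take(i))                // every tree for the left part
      r <- parses(tokens.drop(i + 1))            // every tree for the right part
    } yield BinOp(l, tokens(i), r)

  def eval(t: Tree): Double = t match {
    case Const(v)         => v
    case BinOp(l, "-", r) => eval(l) - eval(r)
    case BinOp(l, _, r)   => eval(l) / eval(r)
  }
}
```

For the tokens 8 - 4 / 2 this yields two trees, one meaning 8 - (4 / 2) and one meaning (8 - 4) / 2, so the same sentence can be given two different values.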
  • 20. Exercises •Is the following grammar ambiguous? A → A "and" A | "not" A | "0" | "1"
  • 21. Exercises •Write BNF for well-formed parentheses: ( ) and ((())) are well-formed; ))(( and ((() are not
  • 22. Precedence of Operators •Use the parse tree to indicate precedence levels of the operators
      expr → expr "-" term | term
      term → term "/" CONST | CONST
    (Figure: parse tree in which "/" appears lower in the tree than "-", so division is applied first)
  • 23. Associativity of Operators •Operator associativity can also be indicated by a grammar
      expr → expr "+" expr | CONST (ambiguous)
      expr → expr "+" CONST | CONST (unambiguous, left-associative)
    (Figure: parse tree for CONST + CONST + CONST)
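The direction of recursion in a rule decides how operands are grouped. The sketch below prints the grouping produced by a left-recursive rule and, for contrast, by a right-recursive rule; the right-recursive form expr → CONST "+" expr | CONST does not appear on the slide and is included here only for comparison. The names are illustrative.

```scala
object Associativity {
  // expr -> expr "+" CONST | CONST : left recursion groups operands to the left.
  def leftAssoc(operands: List[String]): String =
    operands.reduceLeft((acc, x) => s"($acc + $x)")

  // expr -> CONST "+" expr | CONST : right recursion groups operands to the right.
  def rightAssoc(operands: List[String]): String =
    operands.reduceRight((x, acc) => s"($x + $acc)")
}
```

For the operands 1, 2, 3 the first function produces ((1 + 2) + 3) and the second (1 + (2 + 3)). Both functions assume a non-empty operand list.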
  • 24. Exercises •Rewrite the grammar to fulfill the following requirements: –operator “*” has lower precedence than “+” –operator “-” is right-associative
  • 25. Extended BNF •Optional parts are placed in ( )?: proc_call → IDENT ("(" expr_list ")")? •Alternative parts of RHSs are placed inside parentheses and separated by vertical bars: term → term ("+"|"-") CONST •Repetitions (0 or more) are placed in ( )*: IDENT → letter (letter|digit)*
  • 26. BNF and EBNF •BNF
      expr → expr "+" term | expr "-" term | term
      term → term "*" factor | term "/" factor | factor
    •EBNF
      expr → term (("+" | "-") term)*
      term → factor (("*" | "/") factor)*
  • 27. Exercises •Write EBNF descriptions for a function header as follows: •The function declaration begins with the function keyword, then the function name, an opening parenthesis '(', a semicolon-separated parameter list, a closing parenthesis ')', a colon ':', a return type, a semicolon and the body of the function. A function declaration is terminated by a semicolon ';'. •The parameter list of a function declaration may contain zero or more parameters. A parameter consists of an identifier, a colon and a parameter type. If two or more consecutive parameters have the same type, they can be reduced to a shorter form: a comma-delimited list of these parameter names, followed by a colon ':' and the shared parameter type. •For example, function area(a: real; b: real; c: real): real; can be rewritten as function area(a,b,c: real): real; •A return type must be a primitive type. A parameter type can be a primitive type or an array type. The body of a function is simply a block statement.
  • 28. Parsing •Goals of the parser, given an input program: –Find all syntax errors; for each, produce an appropriate diagnostic message, and recover quickly –Produce the parse tree, or at least a trace of the parse tree, for the program
  • 29. Parser Generator •Parser generator: produces a parser from a given grammar •Works in conjunction with a lexical analyzer •Examples: yacc, ANTLR, JavaCC •Programmer specifies actions to be taken during parsing •No free lunch: need to know parsing theory to build efficient parsers •Scala has a built-in “combinator parser” (later slides)
  • 30. Parsing •Two categories of parsers –Top down: produce the parse tree, beginning at the root –Bottom up: produce the parse tree, beginning at the leaves (not covered here) •Parsers typically look only one token ahead in the input
  • 31. The Parsing Problem •The Complexity of Parsing –Parsers that work for any unambiguous grammar are complex and inefficient (O(n^3), where n is the length of the input) –Compilers use parsers that only work for a subset of all unambiguous grammars, but do it in linear time (O(n), where n is the length of the input)
  • 32. Top-down Parsers •Top-down Parsers –Given a sentential form, xAα, the parser must choose the correct A-rule to get the next sentential form in the leftmost derivation, using only the first token produced by A •The most common top-down parsing algorithms: –Recursive descent: a coded implementation –LL parsers: a table-driven implementation
  • 33. Recursive-Descent Parsing •There is a subprogram for each nonterminal in the grammar, which can parse sentences that can be generated by that nonterminal •EBNF is ideally suited for being the basis for a recursive-descent parser, because EBNF minimizes the number of nonterminals
  • 34. Recursive-Descent Parsing (cont.) •A grammar for simple expressions:
      expr → term (("+" | "-") term)*
      term → factor (("*" | "/") factor)*
      factor → id | "(" expr ")"
  • 35. Recursive-Descent Parsing (cont.) •Assume we have a lexical analyzer which puts the next token code in a variable •The coding process when there is only one RHS: –For each terminal symbol in the RHS, compare it with the next input token; if they match, continue, else there is an error –For each nonterminal symbol in the RHS, call its associated parsing subprogram
  • 36. Recursive-Descent Parsing (cont.) expr → term (("+" | "-") term)*
  • 37. Recursive-Descent Parsing (cont.) •A nonterminal that has more than one RHS requires an initial process to determine which RHS it is to parse –The correct RHS is chosen on the basis of the next token of input (the lookahead) –The next token is compared with the first token that can be generated by each RHS until a match is found –If no match is found, it is a syntax error
  • 38. Recursive-Descent Parsing (cont.) factor → ID | "(" expr ")"
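The coding rules from the preceding slides can be sketched as one subprogram per nonterminal of the simple-expression grammar. This is a minimal recognizer rather than the code from the original slides: it works on a pre-split token list, treats the literal token "id" as the identifier token, and all class and method names are hypothetical.

```scala
object RecursiveDescent {
  class Parser(tokens: List[String]) {
    private var rest = tokens                        // remaining input
    private def peek: Option[String] = rest.headOption   // one-token lookahead
    private def advance(): Unit = { rest = rest.tail }
    private def expect(t: String): Unit =
      if (peek.contains(t)) advance()
      else throw new IllegalArgumentException(
        s"expected $t but found ${peek.getOrElse("end of input")}")

    // expr -> term (("+" | "-") term)*
    def expr(): Unit = {
      term()
      while (peek.exists(t => t == "+" || t == "-")) { advance(); term() }
    }

    // term -> factor (("*" | "/") factor)*
    def term(): Unit = {
      factor()
      while (peek.exists(t => t == "*" || t == "/")) { advance(); factor() }
    }

    // factor -> id | "(" expr ")" : the lookahead token selects the RHS.
    def factor(): Unit = peek match {
      case Some("id") => advance()
      case Some("(")  => advance(); expr(); expect(")")
      case other      => throw new IllegalArgumentException(
        s"unexpected ${other.getOrElse("end of input")}")
    }

    def atEnd: Boolean = rest.isEmpty
  }

  // True iff the whole token list is a sentence of expr.
  def accepts(tokens: List[String]): Boolean =
    try { val p = new Parser(tokens); p.expr(); p.atEnd }
    catch { case _: IllegalArgumentException => false }
}
```

The EBNF repetition ( )* becomes a while loop, and the choice between the two RHSs of factor is made by looking at the next token only.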
  • 39. Top-down Problems 1 •The LL Grammar Class –The Left Recursion Problem •If a grammar has left recursion, either direct or indirect, it cannot be the basis for a top-down parser –A grammar can be modified to remove left recursion
  • 40. Left Recursion Elimination •For each nonterminal, A, 1. Group the A-rules as A → Aα1 | … | Aαm | β1 | β2 | … | βn, where none of the β's begins with A 2. Replace the original A-rules with A → β1A’ | β2A’ | … | βnA’ and A’ → α1A’ | α2A’ | … | αmA’ | ε
  • 41. Example
      E → E "+" T | T
      T → T "*" F | F
      F → "(" E ")" | id
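The two elimination steps can be written as a small function over the rules of one nonterminal. The sketch below handles direct left recursion only; the representation (each RHS as a list of symbols, the empty list for the empty string, and the new nonterminal named by appending an apostrophe) is an assumption made for this example.

```scala
object LeftRecursion {
  type Rhs = List[String]

  // Removes direct left recursion from the rules of nonterminal `a`.
  def eliminate(a: String, rules: List[Rhs]): Map[String, List[Rhs]] = {
    // Step 1: split into A -> A alpha rules and A -> beta rules.
    val (recursive, others) = rules.partition(_.headOption.contains(a))
    if (recursive.isEmpty) Map(a -> rules)
    else {
      val fresh = a + "'"
      Map(
        // Step 2a: A -> beta A'
        a -> others.map(beta => beta :+ fresh),
        // Step 2b: A' -> alpha A' | (empty string)
        fresh -> (recursive.map(r => r.tail :+ fresh) :+ List.empty[String]))
    }
  }
}
```

Applied to the rules E → E "+" T | T, it returns E → T E' together with E' → "+" T E' and the empty alternative. The rules for T can be treated the same way, while F is returned unchanged because it is not left-recursive.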
  • 42. Top-down Problems 2 •The RHSs of each nonterminal have to be pairwise disjoint –The parser has to determine the correct RHS on the basis of one token of lookahead –Def: FIRST(α) = {a | α =>* aβ} –For each nonterminal, A, in the grammar that has more than one RHS, for each pair of rules, A → αi and A → αj, it must be true that FIRST(αi) ∩ FIRST(αj) = Ø
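  • 43. Pairwise Disjointness •Examples: A → a | bB | cAb (pairwise disjoint: the FIRST sets are {a}, {b}, {c}) A → a | aB (not pairwise disjoint: both RHSs begin with a)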
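The pairwise disjointness test can be checked mechanically once FIRST is computable. The sketch below uses a simplified FIRST that assumes no empty RHSs and no left recursion, and treats any symbol without rules in the grammar map as a terminal. All names are illustrative.

```scala
object Disjointness {
  type Rhs = List[String]
  type Grammar = Map[String, List[Rhs]]

  // FIRST of one RHS under the simplifying assumptions stated above.
  def first(g: Grammar, rhs: Rhs): Set[String] = rhs.head match {
    case nt if g.contains(nt) => g(nt).flatMap(r => first(g, r)).toSet
    case terminal             => Set(terminal)
  }

  // True iff FIRST(alpha_i) and FIRST(alpha_j) are disjoint for every pair of RHSs of `a`.
  def pairwiseDisjoint(g: Grammar, a: String): Boolean = {
    val firsts = g(a).map(r => first(g, r)).toVector
    val pairs = for (i <- firsts.indices; j <- firsts.indices if i < j)
      yield (firsts(i), firsts(j))
    pairs.forall { case (x, y) => (x & y).isEmpty }
  }
}
```

With the two example grammars, the rules A → a | bB | cAb pass the test and the rules A → a | aB fail it, because both RHSs of the second grammar start with a.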
  • 44. Pairwise Disjointness •Left factoring can resolve the problem. Replace
      variable → IDENTIFIER | IDENTIFIER "[" expression "]"
    with
      variable → IDENTIFIER new
      new → ε | "[" expression "]"
    or
      variable → IDENTIFIER ("[" expression "]")?
  • 45. Example A → a | aB
  • 46. Building Grammar Techniques •Repetition –[10][10][20] •Repetition with separators –id(arg1, arg2, arg3,…) •Operator precedence •Operator associativity •Avoiding left recursion •Left factoring •Disambiguation
  • 47. Scala Combinator Parser •Each nonterminal becomes a function •Terminals (strings) and nonterminals (functions) are combined with operators –Sequence ~ –Alternative | –0 or more rep(...) –0 or 1 opt(...)
  • 48. Combinator Parser
      expr → term (("+" | "-") expr)?
      term → factor (("*" | "/") term)?
      factor → wholeNumber | "(" expr ")"
      import scala.util.parsing.combinator.JavaTokenParsers
      class SimpleLanguageParser extends JavaTokenParsers {
        def expr: Parser[Any] = term ~ opt(("+" | "-") ~ expr)
        def term: Parser[Any] = factor ~ opt(("*" | "/") ~ term)
        def factor: Parser[Any] = wholeNumber | "(" ~ expr ~ ")"
      }
    •Will replace Parser[Any] with something more useful later
  • 49. Combinator Parser Results •String returns itself •opt(P) returns Option: Some of the result of P, or None •rep(P) returns List of the results of P •P ~ Q returns instance of class ~ (similar to a pair)
  • 50. Example
      val parser = new SimpleLanguageParser
      val result = parser.parse(parser.expr, "3 - 4 * 5")
    sets result to ((3~None)~Some((-~((4~Some((*~(5~None))))~None)))) •After changing (x~None) to x, Some(y) to y, and ~ to spaces: (3 (- (4 (* 5)))) •Way to do it: wait until labs
  • 51. Summary •BNF and context-free grammars are equivalent meta-languages –Well-suited for describing the syntax of programming languages •Syntax analyzers: –Detect syntax errors –Produce a parse tree •A recursive-descent parser is an LL parser –EBNF