Syntax analysis


Published in: Education, Technology

  1. 1. Structure of Programming Languages SYNTAX ANALYSIS VSRivera
  2. 2. Syntax <ul><li>“The arrangement of words as elements in a sentence to show their relationship.” In a programming language, syntax describes the sequences of symbols that make up valid programs. </li></ul><ul><li>It covers the form of expressions, statements and program units. </li></ul>
  3. 3. The General Problem of Describing Syntax: Terminology <ul><li>A sentence is a string of characters over some alphabet </li></ul><ul><li>A language is a set of sentences </li></ul><ul><li>A lexeme is the lowest-level syntactic unit of a language (e.g., *, sum, begin) </li></ul><ul><li>A token is a category of lexemes (e.g., identifier) </li></ul>
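The lexeme/token distinction can be illustrated with a minimal sketch. The token category names and the regular expressions below are assumptions chosen for illustration; they do not come from the slides.

```python
import re

# Hypothetical token categories (assumed for this sketch, not from the slides).
# Order matters: KEYWORD must be tried before IDENTIFIER.
TOKEN_SPEC = [
    ("KEYWORD", r"\b(?:begin|end)\b"),
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("INT_LITERAL", r"\d+"),
    ("ASSIGN_OP", r":="),
    ("ARITH_OP", r"[+\-*/]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (token, lexeme) pairs for the input text."""
    pairs = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates lexemes but is not a token
            pairs.append((match.lastgroup, match.group()))
        pos = match.end()
    return pairs


for token, lexeme in tokenize("begin sum := sum * 2 end"):
    print(f"{token:12} {lexeme}")
```

Here `sum` appears twice as a lexeme, and both occurrences belong to the same token category, `IDENTIFIER`.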
  4. 4. Syntactic elements of the Language <ul><li>Character set – ASCII, Unicode </li></ul><ul><li>Identifiers – restrictions on length reduce readability </li></ul><ul><li>Operator symbols – + and - represent two basic arithmetic operations. </li></ul>
  5. 5. Syntactic elements of the Language <ul><li>Keywords and reserved words – a keyword is an identifier used as a fixed part of the syntax of a statement. It is a reserved word if it may not be used as a programmer-chosen identifier. </li></ul><ul><li>Noise words – optional words that are inserted in statements to improve readability. </li></ul>
  6. 6. Syntactic elements of the Language <ul><li>Comments – an important part of the documentation, e.g. REM, /* */, or // </li></ul><ul><li>Blanks (spaces) </li></ul><ul><li>Delimiters – syntactic elements used simply to mark the beginning or end of some syntactic unit such as a statement or expression, e.g. “begin” … “end”, or { }. </li></ul>
  7. 7. Syntactic elements of the Language <ul><li>Expressions – functions that access data objects in a program and return some value. </li></ul><ul><li>Statements </li></ul>
  8. 8. Syntactic Analysis (parsing) <ul><li>2nd stage in translation </li></ul><ul><li>Determines whether the program being compiled is a valid sentence in the syntactic model of the programming language. </li></ul>
  9. 9. Role of the Parser <ul><li>Where lexical analysis splits the input into tokens, the purpose of syntax analysis (also known as parsing) is to recombine these tokens into a form that reflects the structure of the text. </li></ul><ul><li>The parser must also reject invalid texts by reporting syntax errors, and recover from commonly occurring errors so that it can continue processing the remainder of its input. </li></ul>
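  10. 10. Role of the Parser [Figure: source program → lexical analyzer → parser → rest of front end → intermediate representation. The parser requests the next token from the lexical analyzer, receives a token, and passes a parse tree to the rest of the front end.]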
  11. 11. Formal Methods of Describing Syntax <ul><li>Grammars </li></ul><ul><li>Parse Trees </li></ul><ul><li>Syntax Diagrams </li></ul>
  12. 12. Grammars <ul><li>Formal definition of the syntax of a programming language. </li></ul><ul><li>Collection of rules that define, mathematically, which strings of symbols are valid sentences. </li></ul>
  13. 13. Parts of Grammar <ul><li>Set of tokens/terminal symbols </li></ul><ul><ul><li>symbols that are atomic / non-divisible </li></ul></ul><ul><ul><li>can be combined to form valid constructs in the language </li></ul></ul><ul><li>Set of non-terminal symbols </li></ul><ul><ul><li>symbols used to represent intermediate definitions within the language </li></ul></ul><ul><ul><li>defined by productions </li></ul></ul><ul><ul><li>syntactic classes or categories </li></ul></ul>
  14. 14. Parts of Grammar <ul><li>Set of rules called productions </li></ul><ul><ul><li>a definition of a non-terminal symbol </li></ul></ul><ul><ul><li>has the form </li></ul></ul><ul><ul><li>x ::= y </li></ul></ul><ul><ul><li>where x is a non-terminal symbol and y is a sequence of symbols (non-terminal or terminal) </li></ul></ul>
  15. 15. Parts of Grammar <ul><ul><li>LHS: abstraction being defined </li></ul></ul><ul><ul><li>RHS: tokens, lexemes, references to other abstractions </li></ul></ul><ul><li>Goal symbol </li></ul><ul><ul><li>one of the set of non-terminal symbols </li></ul></ul><ul><ul><li>also referred to as the start symbol </li></ul></ul>
  16. 16. Rules to form Grammar <ul><li>Every non-terminal symbol must appear to the left of the ::= of at least one production </li></ul><ul><li>The goal symbol must not appear to the right of the ::= of any production </li></ul><ul><li>A rule is recursive if its LHS appears in its RHS </li></ul>
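The parts of a grammar and the three rules above can be checked mechanically. The following is a minimal sketch, assuming a grammar is stored as a Python dictionary that maps each non-terminal to its list of alternatives. The representation and the function names are illustrative choices, and the sample grammar is a small begin/end language.

```python
# Each key is a non-terminal (LHS); each value lists its alternatives (RHS),
# and each alternative is a sequence of terminal and non-terminal symbols.
GRAMMAR = {
    "<program>": [["begin", "<stmt_list>", "end"]],
    "<stmt_list>": [["<stmt>"], ["<stmt>", "<stmt_list>"]],
    "<stmt>": [["<var>", ":=", "<expression>"]],
    "<var>": [["A"], ["B"], ["C"]],
    "<expression>": [["<var>", "+", "<var>"]],
}


def is_nonterminal(symbol):
    """Assumed convention: non-terminals are written in angle brackets."""
    return symbol.startswith("<") and symbol.endswith(">")


def undefined_nonterminals(grammar):
    """Rule 1: non-terminals used on some RHS but never defined on an LHS."""
    used = {s for alts in grammar.values() for alt in alts for s in alt if is_nonterminal(s)}
    return used - set(grammar)


def goal_appears_on_rhs(grammar, goal):
    """Rule 2: True if the goal symbol occurs on the RHS of any production."""
    return any(goal in alt for alts in grammar.values() for alt in alts)


def recursive_rules(grammar):
    """Rule 3: non-terminals whose LHS appears in one of their own alternatives."""
    return {lhs for lhs, alts in grammar.items() if any(lhs in alt for alt in alts)}


print(undefined_nonterminals(GRAMMAR))
print(goal_appears_on_rhs(GRAMMAR, "<program>"))
print(recursive_rules(GRAMMAR))
```

This sketch only detects direct recursion; a rule that reaches itself through another non-terminal would need a reachability check.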
  17. 17. Context Free Grammar (CFG) <ul><li>Backus-Naur Form (BNF) Grammar </li></ul><ul><ul><li>originally presented by John Backus (to describe ALGOL 58) and later modified by Peter Naur </li></ul></ul><ul><li>Composed of a finite set of grammar rules which define a programming language. </li></ul>
  18. 18. Examples <ul><li><conditional stmt> ::= </li></ul><ul><li>if <boolean expr> then </li></ul><ul><li><stmt> </li></ul><ul><li>else </li></ul><ul><li><stmt> </li></ul><ul><li>| if <boolean expr> then </li></ul><ul><li><stmt> </li></ul>
  19. 19. Examples <ul><li><unsigned int> ::= </li></ul><ul><li><digit> | <unsigned int> <digit> </li></ul><ul><li>A rule is recursive if its LHS appears in its RHS </li></ul>
  20. 20. Examples <ul><li><assign> ::= <id> := <expr> </li></ul><ul><li><id> ::= A | B | C </li></ul><ul><li><expr> ::= <id> + <expr> </li></ul><ul><li>| <id> * <expr> </li></ul><ul><li>| ( <expr> ) </li></ul><ul><li>| <id> </li></ul>
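One way to see what the `<assign>` grammar accepts is a small recursive-descent recognizer, with one function per non-terminal. This is a sketch under assumptions: the tokenizer, the function names and the error handling are illustrative, not part of the slides.

```python
import re


def tokenize(text):
    """Split input into ':=', runs of letters, and single other characters."""
    return re.findall(r":=|[A-Za-z]+|\S", text)


def accepts(text):
    """Return True if text is a sentence of the <assign> grammar."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(token):
        nonlocal pos
        if peek() != token:
            raise SyntaxError(f"expected {token!r}, found {peek()!r}")
        pos += 1

    def parse_id():
        # <id> ::= A | B | C
        nonlocal pos
        if peek() not in ("A", "B", "C"):
            raise SyntaxError(f"expected an identifier, found {peek()!r}")
        pos += 1

    def parse_expr():
        # <expr> ::= <id> + <expr> | <id> * <expr> | ( <expr> ) | <id>
        if peek() == "(":
            expect("(")
            parse_expr()
            expect(")")
            return
        parse_id()
        if peek() in ("+", "*"):
            expect(peek())
            parse_expr()

    try:
        # <assign> ::= <id> := <expr>
        parse_id()
        expect(":=")
        parse_expr()
    except SyntaxError:
        return False
    return pos == len(tokens)  # reject trailing input


print(accepts("A := B * ( A + C )"))
print(accepts("A := ( A ) + B"))
```

The second call is rejected because, in this grammar, a parenthesized expression is a complete `<expr>` and cannot be followed by an operator.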
  21. 21. Examples <ul><li><program> ::= begin </li></ul><ul><li><stmt_list> </li></ul><ul><li>end </li></ul><ul><li><stmt_list> ::=<stmt> | <stmt> <stmt_list> </li></ul><ul><li><stmt> ::= <var> := <expression> </li></ul><ul><li><var> ::= A | B | C </li></ul><ul><li><expression> ::= <var> + <var> </li></ul>
  22. 22. Grammar Derivation <ul><li>BNF is a generative device for defining language. </li></ul><ul><li>The sentences of the language are generated through a sequence of applications of the rules, beginning with a special non-terminal (start symbol) of the grammar. </li></ul>
  23. 23. Example <ul><li><program> => begin </li></ul><ul><li> <stmt_list> </li></ul><ul><li> end </li></ul><ul><li>=> begin <stmt> end </li></ul><ul><li>=> begin <var> := <expression> end </li></ul><ul><li>=> begin <var> := <var> + <var> end </li></ul><ul><li>=> begin A := B + C end </li></ul>
  24. 24. Example <ul><li>A := B * ( A + C ) </li></ul><ul><li><assign> => <id> := <expr> </li></ul><ul><li>=> A := <expr> </li></ul><ul><li>=> A := <id> * <expr> </li></ul><ul><li>=> A := B * <expr> </li></ul><ul><li>=> A := B * ( <expr> ) </li></ul><ul><li>=> A := B * ( <id> + <expr> ) </li></ul><ul><li>=> A := B * ( A + <expr> ) </li></ul><ul><li>=> A := B * ( A + <id> ) </li></ul><ul><li>=> A := B * ( A + C ) </li></ul>
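A leftmost derivation like this one can be replayed step by step. The sketch below assumes the `<assign>` grammar is stored as a dictionary and that each step is given as the index of the alternative to apply to the leftmost non-terminal. The function name and the `choices` encoding are illustrative assumptions.

```python
GRAMMAR = {
    "<assign>": [["<id>", ":=", "<expr>"]],
    "<id>": [["A"], ["B"], ["C"]],
    "<expr>": [
        ["<id>", "+", "<expr>"],
        ["<id>", "*", "<expr>"],
        ["(", "<expr>", ")"],
        ["<id>"],
    ],
}


def leftmost_derivation(grammar, start, choices):
    """Apply each chosen alternative to the leftmost non-terminal; return all sentential forms."""
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        index = next((i for i, symbol in enumerate(form) if symbol in grammar), None)
        if index is None:
            raise ValueError("no non-terminal left to expand")
        form[index:index + 1] = grammar[form[index]][choice]
        steps.append(" ".join(form))
    return steps


# Alternative indices that derive A := B * ( A + C ).
steps = leftmost_derivation(GRAMMAR, "<assign>", [0, 0, 1, 1, 2, 0, 0, 3, 2])
print("\n=> ".join(steps))
```

Each printed line is one sentential form; the last one contains only terminal symbols.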
  25. 25. When does derivation stop? <ul><li>A derivation stops when the sentential form contains only terminal symbols. </li></ul><ul><li>By exhaustively choosing all combinations of choices, the entire language can be generated. </li></ul>
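Exhaustive generation can be sketched as a breadth-first search over sentential forms. Because the statement-list rule is recursive, the language is infinite, so this sketch assumes a cap on the number of symbols per form (`max_symbols`, a hypothetical parameter). The cap is only safe because no production in this grammar has an empty right-hand side, so forms never shrink.

```python
from collections import deque

GRAMMAR = {
    "<program>": [["begin", "<stmt_list>", "end"]],
    "<stmt_list>": [["<stmt>"], ["<stmt>", "<stmt_list>"]],
    "<stmt>": [["<var>", ":=", "<expression>"]],
    "<var>": [["A"], ["B"], ["C"]],
    "<expression>": [["<var>", "+", "<var>"]],
}


def sentences(grammar, start, max_symbols):
    """Yield every sentence whose leftmost derivation stays within max_symbols symbols."""
    queue = deque([(start,)])
    seen = {(start,)}
    while queue:
        form = queue.popleft()
        index = next((i for i, symbol in enumerate(form) if symbol in grammar), None)
        if index is None:
            # Only terminals remain: the derivation stops and the form is a sentence.
            yield " ".join(form)
            continue
        for alternative in grammar[form[index]]:
            new_form = form[:index] + tuple(alternative) + form[index + 1:]
            if len(new_form) <= max_symbols and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)


one_statement_programs = list(sentences(GRAMMAR, "<program>", max_symbols=7))
print(len(one_statement_programs))
print(one_statement_programs[0])
```

With a cap of 7 symbols only single-statement programs fit; raising the cap admits programs with more statements.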
  26. 26. Exercise <ul><li>BNF of signed integer? </li></ul><ul><li>begin A := B + C; B := C; end </li></ul>
  27. 27. Extended BNF (EBNF) <ul><li>Does not enhance the descriptive power of BNF: any EBNF grammar can be rewritten in plain BNF </li></ul><ul><li>Increases the readability and writability of BNF </li></ul>
  28. 28. Extended BNF (EBNF) <ul><li>Notational Extensions </li></ul><ul><ul><li>An optional element may be indicated by enclosing the element in square brackets, </li></ul></ul><ul><ul><li>[ … ]. </li></ul></ul><ul><ul><li>A choice of alternatives may use the symbol | within a single rule, optionally enclosed in parentheses, ( … ), if needed. </li></ul></ul><ul><ul><li>An arbitrary sequence of instances of an element (zero or more) may be indicated by enclosing the element in braces followed by an asterisk, { … }*. A plus sign in place of the asterisk, { … }+, requires at least one instance. </li></ul></ul>
  29. 29. Example <ul><li>BNF </li></ul><ul><ul><li><expr> ::= <expr> + <term> </li></ul></ul><ul><ul><li>| <expr> - <term> </li></ul></ul><ul><ul><li>| <term> </li></ul></ul><ul><ul><li><term> ::= <term> * <factor> </li></ul></ul><ul><ul><li>| <term> / <factor> </li></ul></ul><ul><ul><li>| <factor> </li></ul></ul>
  30. 30. Example <ul><li>EBNF </li></ul><ul><ul><li><expr> ::= <term> { (+|-) <term> } </li></ul></ul><ul><ul><li><term> ::= <factor> { (*|/) <factor>} </li></ul></ul>
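The EBNF form maps directly onto loops in a recursive-descent parser: each `{ … }` becomes a `while` loop, which also avoids the left recursion of the BNF version. The sketch below makes one assumption the slides leave open: `<factor>` is taken to be an integer literal or a parenthesized `<expr>`. The tuple-based tree format is likewise an illustrative choice.

```python
import re


def parse(text):
    """Parse an arithmetic expression into nested (operator, left, right) tuples."""
    tokens = re.findall(r"\d+|\S", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def factor():
        # Assumed rule: <factor> ::= integer | ( <expr> )
        token = peek()
        if token is not None and token.isdecimal():
            return int(advance())
        if token == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        raise SyntaxError(f"unexpected token {token!r}")

    def term():
        # <term> ::= <factor> { (*|/) <factor> }
        node = factor()
        while peek() in ("*", "/"):
            operator = advance()
            node = (operator, node, factor())
        return node

    def expr():
        # <expr> ::= <term> { (+|-) <term> }
        node = term()
        while peek() in ("+", "-"):
            operator = advance()
            node = (operator, node, term())
        return node

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected trailing token {peek()!r}")
    return tree


print(parse("1 - 2 - 3"))
print(parse("1 + 2 * 3"))
```

Building the tree inside the loop keeps the operators left-associative, matching the grouping that the left-recursive BNF rules imply.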
  31. 31. Example <ul><li>BNF </li></ul><ul><ul><li><program> ::= begin </li></ul></ul><ul><li> <stmt_list> </li></ul><ul><li> end </li></ul>
  32. 32. Example <ul><li>EBNF </li></ul><ul><ul><li><program> ::= begin </li></ul></ul><ul><li> <stmt> {<stmt>} </li></ul><ul><li> end </li></ul><ul><ul><li><program> ::= begin </li></ul></ul><ul><li> {<stmt>} + </li></ul><ul><li> end </li></ul>
  33. 33. Example <ul><li>BNF </li></ul><ul><ul><li><signed int> ::= + <int> | - <int> </li></ul></ul><ul><ul><li><int> ::= <digit> | <int> <digit> </li></ul></ul><ul><li>EBNF </li></ul><ul><ul><li><signed int> ::= [+|-] <digit> {<digit>}* </li></ul></ul>
  34. 34. Exercise <ul><li>EBNF of identifier? </li></ul>
  35. 35. Solution <ul><li>EBNF of identifier </li></ul><ul><ul><li><identifier> ::= <letter> {<letter> | <digit>}* </li></ul></ul>
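Rules of this shape, with no nesting or self-reference, describe regular languages, so they can be cross-checked against regular expressions. The sketch below assumes that `<letter>` means an ASCII letter and `<digit>` means 0 to 9, and it reads the repetition as zero or more trailing characters, so a single letter is a valid identifier and a single digit is a valid integer.

```python
import re

# <identifier> ::= <letter> followed by zero or more letters or digits
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9]*")

# <signed int> ::= optional sign, then one digit followed by zero or more digits
SIGNED_INT = re.compile(r"[+-]?[0-9][0-9]*")


def is_identifier(text):
    return IDENTIFIER.fullmatch(text) is not None


def is_signed_int(text):
    return SIGNED_INT.fullmatch(text) is not None


for sample in ["x", "sum1", "1sum", "-42", "7", "+"]:
    print(sample, is_identifier(sample), is_signed_int(sample))
```

`fullmatch` is used so that the whole string must satisfy the rule, not just a prefix.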
  36. 36. <ul><li>Get ½ sheet of yellow pad. </li></ul><ul><li>Prepare for a quiz. </li></ul><ul><li>Open Notes. </li></ul>
  37. 37. Midterm Quiz #1 <ul><li>Using the following English Grammar: </li></ul><ul><li><sentence> ::= <noun phrase> <verb phrase> . </li></ul><ul><li><noun phrase> ::= <determiner> <noun> | <determiner> <noun> <prepositional phrase> </li></ul><ul><li><verb phrase> ::= <verb> | <verb> <noun phrase> | <verb> <noun phrase> <prepositional phrase> </li></ul><ul><li><prepositional phrase> ::= <preposition> <noun phrase> </li></ul><ul><li><noun> ::= boy | girl | cat | telescope | song | feather </li></ul><ul><li><determiner> ::= a | the </li></ul><ul><li><verb> ::= saw | touched | surprised | sang </li></ul><ul><li><preposition> ::= by | with </li></ul><ul><li>Write the leftmost derivation of the sentence “the girl touched the cat with a feather” </li></ul>