The document discusses syntax analysis and parsing. It covers context-free grammars, Backus-Naur Form (BNF), Extended BNF, and different parsing techniques like recursive descent parsing and LL parsing. It also discusses Scala's combinator parser, which uses parser combinators to parse input based on a grammar.
Its the first phase of the compiler,useful in generating lexemes ,tokens and matching of the pattern.Its helpful in solving GATE/ UGCNET problems.For more insight refer http://tutorialfocus.net/
In this slide you will explore more about how to make derivations ,design parse tree ,what is ambiguity and how to remove ambiguity ,left recursion ,left factoring .
Its the first phase of the compiler,useful in generating lexemes ,tokens and matching of the pattern.Its helpful in solving GATE/ UGCNET problems.For more insight refer http://tutorialfocus.net/
In this slide you will explore more about how to make derivations ,design parse tree ,what is ambiguity and how to remove ambiguity ,left recursion ,left factoring .
Lexical Analysis, Tokens, Patterns, Lexemes, Example pattern, Stages of a Lexical Analyzer, Regular expressions to the lexical analysis, Implementation of Lexical Analyzer, Lexical analyzer: use as generator.
This is the presentation on Syntactic Analysis in NLP.It includes topics like Introduction to parsing, Basic parsing strategies, Top-down parsing, Bottom-up
parsing, Dynamic programming – CYK parser, Issues in basic parsing methods, Earley algorithm, Parsing
using Probabilistic Context Free Grammars.
Lexical Analysis, Tokens, Patterns, Lexemes, Example pattern, Stages of a Lexical Analyzer, Regular expressions to the lexical analysis, Implementation of Lexical Analyzer, Lexical analyzer: use as generator.
This is the presentation on Syntactic Analysis in NLP.It includes topics like Introduction to parsing, Basic parsing strategies, Top-down parsing, Bottom-up
parsing, Dynamic programming – CYK parser, Issues in basic parsing methods, Earley algorithm, Parsing
using Probabilistic Context Free Grammars.
For college going students, employees of a company or for a common man, bus is the most comprehensive and affordable mode of transport. The use of bus for transport reduces private vehicle usage and thus reduces fuel consumption. It also curbs traffic congestion. The user usually wants to know the accurate arrival time of the bus. Long time waiting at the bus stops discourage the use of buses. Also many unpredictable factors delay the schedule of a bus like harsh weather situation, traffic conditions etc. Towards this aim of reducing this problem, we are proposing a project which will assist the bus travellers in predicting bus timings. The system described uses DriverSideApp, ClientSideApp and a server.
In this project, we propose an innovative method for predicting the bus information. No specific device is required for this purpose. Our sole objective is to build a application that will help student to access the current location of the bus. Moving forward our application focus on to providing them more convenience with bus schedules, bus location information so that they may not get delayed. . Further, the recent advent and popularity of Android technology motivates us to create an Android application for the same.
How to build a news website use CMS wordpressbaran19901990
This tutorial wil show you, how to build a news website use CMS Wordpress step by step. From basic to advantage. I think that this tutorial is amazing and very basic for newbie.
This 7-second Brain Wave Ritual Attracts Money To You.!nirahealhty
Discover the power of a simple 7-second brain wave ritual that can attract wealth and abundance into your life. By tapping into specific brain frequencies, this technique helps you manifest financial success effortlessly. Ready to transform your financial future? Try this powerful ritual and start attracting money today!
1.Wireless Communication System_Wireless communication is a broad term that i...JeyaPerumal1
Wireless communication involves the transmission of information over a distance without the help of wires, cables or any other forms of electrical conductors.
Wireless communication is a broad term that incorporates all procedures and forms of connecting and communicating between two or more devices using a wireless signal through wireless communication technologies and devices.
Features of Wireless Communication
The evolution of wireless technology has brought many advancements with its effective features.
The transmitted distance can be anywhere between a few meters (for example, a television's remote control) and thousands of kilometers (for example, radio communication).
Wireless communication can be used for cellular telephony, wireless access to the internet, wireless home networking, and so on.
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024APNIC
Ellisha Heppner, Grant Management Lead, presented an update on APNIC Foundation to the PNG DNS Forum held from 6 to 10 May, 2024 in Port Moresby, Papua New Guinea.
Italy Agriculture Equipment Market Outlook to 2027harveenkaur52
Agriculture and Animal Care
Ken Research has an expertise in Agriculture and Animal Care sector and offer vast collection of information related to all major aspects such as Agriculture equipment, Crop Protection, Seed, Agriculture Chemical, Fertilizers, Protected Cultivators, Palm Oil, Hybrid Seed, Animal Feed additives and many more.
Our continuous study and findings in agriculture sector provide better insights to companies dealing with related product and services, government and agriculture associations, researchers and students to well understand the present and expected scenario.
Our Animal care category provides solutions on Animal Healthcare and related products and services, including, animal feed additives, vaccination
Meet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to.pdfFlorence Consulting
Quattordicesimo Meetup di Milano, tenutosi a Milano il 23 Maggio 2024 dalle ore 17:00 alle ore 18:30 in presenza e da remoto.
Abbiamo parlato di come Axpo Italia S.p.A. ha ridotto il technical debt migrando le proprie APIs da Mule 3.9 a Mule 4.4 passando anche da on-premises a CloudHub 1.0.
2. Contents
•Non-regular languages
•Context-free grammars and BNF
•Formal Methods of Describing Syntax
•Parsing problems
•Top-down parsing and LL grammar
•Combinator Parser in Scala
4. Context-Free Grammars
•Context-Free Grammars
–Developed by Noam Chomsky in the mid-1950s
–Language generators, meant to describe the syntax of natural languages
–Define a class of languages called context-free languages
5. Chomsky Hierarchy
Grammars
Languages
Automaton
Restrictions
(w1 w2)
Type-0
Phrase-structure
Turing machine
w1 = any string with at least 1 non-terminal
w2 = any string
Type-1
Context-sensitive
Bounded Turing machine
w1 = any string with at least 1 non-terminal
w2 = any string at least as long as w1
Type-2
Context-free
Non-deterministic pushdown automaton
w1 = one non-terminal
w2 = any string
Type-3
Regular
Finite state automaton
w1 = one non-terminal
w2 = tA or t
(t = terminal
A = non-terminal)
6. Backus-Naur Form (BNF)
•Backus-Naur Form (1959)
–Invented by John Backus to describe ALGOL 58
–Revised by Peter Naur in ALGOL 60
–BNF is equivalent to context-free grammars
–BNF is a metalanguage used to describe another language
8. Rules
•A rule has a left-hand side (LHS) and a right- hand side (RHS), and consists of terminal and nonterminal symbols
•A nonterminal symbol can have more than one RHS
stmt single_stmt
| "begin" stmt_list "end"
9. Regular vs. Context-Free
•Regular languages are subset of context-free languages
•Languages are generated by grammars
•Regular grammars have form
A α | αB
•Context-free grammars have form
A αBβ
10. Regular vs. Context-Free
•Context-Free languages are recognized by Nondeterministic Pushdown Automata
•A Pushdown Automaton is a Finite Automaton with “pushdown store”, or “stack”
e.g., L = {anbn: n ≥ 0}
12. Describing Lists
•List syntax: a,b,c,d,…
•Syntactic lists are described using recursion
•To describe comma-separated list of IDENT
ident_list IDENT
| IDENT "," ident_list
13. Derivation
•A derivation is a repeated application of rules, starting with the start symbol and ending with a sentence (all terminal symbols)
14. An Example Grammar
program stmts
stmts stmt | stmt ";" stmts
stmt var "=" expr
var "a" | "b" | "c" | "d"
expr term "+" term | term "-" term
term var | CONST
An Example Derivation
<program> => <stmts> => <stmt>
=> <var> = <expr> => a =<expr>
=> a = <term> + <term>
=> a = <var> + <term>
=> a = b + <term>
=> a = b + CONST
15. Derivation
•A leftmost (rightmost) derivation is one in which the leftmost (rightmost) nonterminal in each sentential form is the one that is expanded
•A derivation may be neither leftmost nor rightmost
16. Parse Tree
•A hierarchical representation of a derivation
<program>
<stmts>
<stmt>
CONST
a
<var>
= <expr>
<var> b
<term>
+
<term>
<program> => <stmts> => <stmt>
=> <var> = <expr>
=> a = <expr>
=> a = <term> + <term>
=> a = <var> + <term>
=> a = b + <term>
=> a = b + CONST
17. Exerices
•Consider the BNF
S "(" L ")" | "a"
L L "," S | S
Draw parse trees for the derivation of:
(a, a)
(a, ((a, a), (a, a)))
18. Ambiguity in Grammars
•A grammar is ambiguous if and only if it generates a sentential form that has two or more distinct parse trees
22. Precedence of Operators
•Use the parse tree to indicate precedence levels of the operators
expr expr "-" term | term
term term "/" CONST| CONST
<expr> <expr>
<term>
<term> <term>
CONST CONST
const
/
-
23. 1-23
•Operator associativity can also be indicated by a grammar
expr expr "+" expr | const (ambiguous)
expr expr "+" CONST | CONST (unambiguous)
<expr>
<expr>
<expr>
<expr>
CONST CONST
CONST
+ +
Associativity of Operators
24. Exercises
•Rewrite the grammar to fulfill the following requirements:
–operator “*” takes lower precedence than “+”
–operator “-” is right-associativity
25. Extended BNF
•Optional parts are placed in ( )?
proc_call IDENT ("(" expr_list ")")?
•Alternative parts of RHSs are placed inside parentheses and separated via vertical bars
term term ("+"|"-") CONST
•Repetitions (0 or more) are placed inside braces ( )*
IDENT letter (letter|digit)*
26. BNF and EBNF
•BNF
expr expr "+" term
| expr "-" term
| term
term term "*" factor
| term "/" factor
| factor
•EBNF
expr term (("+" | "-") term)*
term factor (("*" | "/") factor)*
27. Exercises
•Write EBNF descriptions for a function header as follow:
•The function declaration begins with a function keyword, then the function name, an opening parenthesis '(', a semicolon-separated parameter list, a closing parenthesis ')', a colon ':', a return type, a semi-colon and the body of the function. A function declaration is terminated by a semi-colon ';'.
•The parameter list of a function declaration may contain zero or more parameters. A parameter consists of an identifier, a colon and a parameter type. If two or more consecutive parameters have the same type, they could be reduced to a shorter form: a comma delimited list of these parameter names, followed by a colon ':' and the shared parameter type.
•For example
•function area(a: real; b: real; c: real): real; could be rewritten as follows
•function area(a,b,c: real): real;
•A return type must be a primitive type. A parameter type could be a primitive type or an array type. The body of a function is also simply a block statement.
28. Parsing
•Goals of the parser, given an input program:
–Find all syntax errors; for each, produce an appropriate diagnostic message, and recover quickly
–Produce the parse tree, or at least a trace of the parse tree, for the program
29. Parser Generator
•Parser generator: Produces a parser from a given grammar
•Works in conjunction with lexical analyzer
•Examples: yacc, Antlr, JavaCC
•Programmer specifies actions to be taken during parsing
•No free lunch: need to know parsing theory to build efficient parsers
•Scala has a built-in “combinator parser” (later slides)
30. Parsing
•Two categories of parsers
–Top down - produce the parse tree, beginning at the root
–Bottom up - produce the parse tree, beginning at the leaves (not mention here)
•Parsers look only one token ahead in the input
31. The Parsing Problem
•The Complexity of Parsing
–Parsers that work for any unambiguous grammar are complex and inefficient ( O(n3), where n is the length of the input )
–Compilers use parsers that only work for a subset of all unambiguous grammars, but do it in linear time ( O(n), where n is the length of the input )
32. Top-down Parsers
•Top-down Parsers
–Given a sentential form, xA , the parser must choose the correct A-rule to get the next sentential form in the leftmost derivation, using only the first token produced by A
•The most common top-down parsing algorithms:
–Recursive descent - a coded implementation
–LL parsers - table driven implementation
33. Recursive-Descent Parsing
•There is a subprogram for each nonterminal in the grammar, which can parse sentences that can be generated by that nonterminal
•EBNF is ideally suited for being the basis for a recursive-descent parser, because EBNF minimizes the number of nonterminals
34. Recursive-Descent Parsing (cont.)
•A grammar for simple expressions:
expr term (("+" | "-") term)*
term factor (("*" | "/") factor)*
factor id | "(" expr ")"
35. Recursive-Descent Parsing (cont.)
•Assume we have a lexical analyzer which puts the next token code in a variable
•The coding process when there is only one RHS:
–For each terminal symbol in the RHS, compare it with the next input token; if they match, continue, else there is an error
–For each nonterminal symbol in the RHS, call its associated parsing subprogram
37. Recursive-Descent Parsing (cont.)
•A nonterminal that has more than one RHS requires an initial process to determine which RHS it is to parse
–The correct RHS is chosen on the basis of the next token of input (the lookahead)
–The next token is compared with the first token that can be generated by each RHS until a match is found
–If no match is found, it is a syntax error
39. Top-down Problems 1
•The LL Grammar Class
–The Left Recursion Problem
•If a grammar has left recursion, either direct or indirect, it cannot be the basis for a top-down parser
–A grammar can be modified to remove left recursion
40. Left Recursion Elimination
•For each nonterminal, A,
1. Group the A-rules as
A Aα1,|…|Aαm|β1|β2|…|βn
2. Replace the original A-rules with
A β1A’|β2A’|…|βnA’
A’ α1A’|α2A’|…|αmA’|e
41. Example
E E "+" T | T
T T "*" F | F
F "(" E ")" | id
42. Top-down Problems 2
•Have to be pairwise disjointness
–Have to determine the correct RHS on the basis of one token of lookahead
–Def: FIRST() = {a | =>* a }
–For each nonterminal, A, in the grammar that has more than one RHS, for each pair of rules, A i and A j, it must be true that
FIRST(i) ∩ FIRST(j) = Ø
44. Pairwise Disjointness
•Left factoring can resolve the problem
Replace
variable IDENTIFIER | IDENTIFIER "[" expression "]"
with
variable IDENTIFIER new
new e | "[" expression "]"
or
variable IDENTIFIER ("["expression"]")?
46. Building Grammar Techniques
•Repetition
–[10][10][20]
•Repetition with separators
–id(arg1, arg2, arg3,…)
•Operator precedence
•Operator associativity
•Avoiding left recursion
•Left factoring
•Disambiguation
47. Scala Combinator Parser
•Each nonterminal becomes a function
•Terminals (strings) and nonterminals (functions) are combined with operators
–Sequence ~
–Alternative |
–0 or more rep(...)
–0 or 1 opt(...)
49. Combinator Parser Results
•String returns itself
•opt(P) returns Option: Some of the result of P, or None
•rep(P) returns List of the results of P
•P ~ Q returns instance of class ~ (similar to a pair)
50. Example
val parser = new SimpleLanguageParser
val result = parser.parse(parser.expr, "3 - 4 * 5")
sets result to
((3~None)~Some((-~((4~Some((*~(5~None))))~None))))
•After changing (x~None) to x, Some(y) to y, and ~ to spaces: (3 (- (4 (* 5))))
•Way to do it: wait until labs
51. Summary
•BNF and context-free grammars are equivalent meta-languages
–Well-suited for describing the syntax of programming languages
•Syntax analyzers:
–Detects syntax errors
–Produces a parse tree
•A recursive-descent parser is an LL parser
–EBNF