This lecture covers parsing and turning syntax definitions into parsers. It discusses context-free grammars and derivations. Grammars can be ambiguous, allowing multiple parse trees for the same sentence. Grammar transformations such as disambiguation, elimination of left recursion, and left factoring can address these issues while preserving the language. Associativity and priority can be defined through such transformations. The reading material covers parsing schemata, classical compiler textbooks, and papers on disambiguation filters and parsing algorithms.
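The transformations the summary mentions are what make simple top-down parsing workable: rewriting a left-recursive rule such as E -> E + T | T into an iterative form lets a recursive-descent parser consume input left to right without looping forever. A minimal sketch in Python; the grammar, token format, and tree encoding are illustrative, not taken from the lecture:

```python
# Recursive-descent parser for the left-recursion-free grammar
#   E -> T ('+' T)*     T -> F ('*' F)*     F -> number
# The Kleene-star loops replace the original left-recursive rules.

def tokenize(s):
    return [c for c in s if not c.isspace()]

class Parser:
    def __init__(self, tokens):
        self.toks, self.pos = tokens, 0

    def peek(self):
        return self.toks[self.pos] if self.pos < len(self.toks) else None

    def eat(self):
        t = self.toks[self.pos]
        self.pos += 1
        return t

    def expr(self):                      # E -> T ('+' T)*
        node = self.term()
        while self.peek() == '+':
            self.eat()
            node = ('+', node, self.term())
        return node

    def term(self):                      # T -> F ('*' F)*
        node = self.factor()
        while self.peek() == '*':
            self.eat()
            node = ('*', node, self.factor())
        return node

    def factor(self):                    # F -> number
        return int(self.eat())

tree = Parser(tokenize("1 + 2 * 3")).expr()
print(tree)   # ('+', 1, ('*', 2, 3)) -- '*' binds tighter than '+'
```

Because the '*' loop sits below the '+' loop, priority falls out of the grammar's shape, and the left-folding loops give both operators left associativity.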
This document provides an overview of parsing in compiler construction. It discusses context-free grammars and how they are used to generate sentences and parse trees through derivations. It also covers ambiguity that can arise from grammars and various grammar transformations used to eliminate ambiguity, including defining associativity and priority. The dangling else problem is presented as an example of an ambiguous grammar.
Declare Your Language: Syntax Definition (Eelco Visser)
This document provides information about syntax definition and the lab organization for a compiler construction course. It includes links to papers on declarative syntax definition and the SDF3 syntax definition formalism. It also provides details about submitting lab assignments through GitHub, including instructions to fork a repository template and submit solutions as pull requests. Grades will be published on a web lab and early feedback may be provided on pull requests and pushed changes.
This document summarizes and discusses type checking algorithms for programming languages. It introduces constraint-based type checking, which separates type checking into constraint generation and constraint solving. This provides a more declarative way to specify type checkers. The document discusses using variables and constraints to represent types during type checking. It introduces NaBL2, a domain-specific language for writing constraint generators to specify name and type constraints for programming language static semantics. NaBL2 uses scope graphs to represent name binding structures and supports features like type equality, subtyping, and type-dependent name resolution through constraint rules. An example scope graph and constraint rule for let-bindings is provided.
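The split the summary describes — constraint generation followed by constraint solving — can be sketched in a few lines: a generator walks the term and emits equalities over type variables, and a solver unifies them. This is a toy rendering of the idea, not NaBL2 itself; the term encoding is invented for illustration and the occurs check is omitted:

```python
# Phase 1: generate equality constraints over type variables.
# Phase 2: solve them by unification. No occurs check (toy code).

fresh = iter(range(1000))
def tv():
    return ('var', next(fresh))          # fresh type variable

def gen(term, env, t):
    """Constraints making `term` have type `t` under `env`."""
    tag = term[0]
    if tag == 'num':                     # numbers have type int
        return [(t, 'int')]
    if tag == 'id':                      # variable: env type equals t
        return [(t, env[term[1]])]
    if tag == 'lam':                     # \x. body  :  a -> b
        a, b = tv(), tv()
        return [(t, ('fun', a, b))] + gen(term[2], {**env, term[1]: a}, b)
    if tag == 'app':                     # f e : f must have type a -> t
        a = tv()
        return gen(term[1], env, ('fun', a, t)) + gen(term[2], env, a)

def walk(t, subst):
    while isinstance(t, tuple) and t[0] == 'var' and t in subst:
        t = subst[t]
    return t

def solve(cs):
    subst, cs = {}, list(cs)
    while cs:
        s, t = cs.pop()
        s, t = walk(s, subst), walk(t, subst)
        if s == t:
            continue
        if isinstance(s, tuple) and s[0] == 'var':
            subst[s] = t
        elif isinstance(t, tuple) and t[0] == 'var':
            subst[t] = s
        elif isinstance(s, tuple) and isinstance(t, tuple) and s[0] == t[0] == 'fun':
            cs += [(s[1], t[1]), (s[2], t[2])]   # decompose arrow types
        else:
            raise TypeError(f"cannot unify {s} and {t}")
    return subst

# (\x. x) 1  should come out as int
t0 = tv()
subst = solve(gen(('app', ('lam', 'x', ('id', 'x')), ('num', 1)), {}, t0))
print(walk(t0, subst))   # int
```

The point of the separation is visible even here: `gen` knows nothing about solving order, and `solve` knows nothing about the language's syntax.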
Declare Your Language: Name Resolution (Eelco Visser)
Scope graphs are used to represent the binding information in programs. They provide a language-independent representation of name resolution that can be used to conduct and represent the results of name resolution. Separating the representation of resolved programs from the declarative rules that define name binding allows language-independent tooling to be developed for name resolution and other tasks.
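The core idea can be made concrete: scopes hold local declarations and carry edges to enclosing scopes, and a language-independent resolution function walks those edges. The Python encoding below is a deliberately minimal sketch — the real scope-graph calculus also covers imports, visibility policies, and path ordering, none of which appear here:

```python
# Minimal scope-graph-style resolution: each scope has local
# declarations and an optional parent edge; a reference resolves
# to the nearest enclosing declaration, and the path of scopes
# crossed is recorded as the "result" of resolution.

class Scope:
    def __init__(self, parent=None):
        self.parent = parent
        self.decls = {}              # name -> declaration site

def resolve(scope, name, path=()):
    """Return (declaration, tuple of scopes crossed) or raise KeyError."""
    if name in scope.decls:
        return scope.decls[name], path + (scope,)
    if scope.parent is not None:     # follow the parent edge outward
        return resolve(scope.parent, name, path + (scope,))
    raise KeyError(f"unresolved reference: {name}")

glob = Scope()
glob.decls['x'] = 'x@global'
inner = Scope(parent=glob)
inner.decls['y'] = 'y@inner'

decl, path = resolve(inner, 'x')
print(decl)        # x@global  (found via the parent edge)
```

Because `resolve` depends only on the graph, not on any particular language's binding rules, the same function serves any language whose rules produce such a graph — which is the separation the paragraph above describes.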
Declare Your Language: Syntactic (Editor) Services (Eelco Visser)
Lecture 3 of the compiler construction course, on the definition of lexical syntax and on syntactic services that can be derived from syntax definitions, such as formatting and syntactic completion.
Compiler Construction | Lecture 8 | Type Constraints (Eelco Visser)
This lecture covers type checking with constraints. It introduces the NaBL2 meta-language for writing type specifications as constraint generators that map a program to constraints. The constraints are then solved to determine if a program is well-typed. NaBL2 supports defining name binding and type structures through scope graphs and constraints over names, types, and scopes. Examples show type checking patterns in NaBL2 including variables, functions, records, and name spaces.
Dynamic Semantics Specification and Interpreter Generation (Eelco Visser)
(1) The document describes a domain specific language called DynSem for specifying dynamic semantics of programming languages. DynSem allows defining semantics in a modular way using semantic rules.
(2) DynSem specifications can be used to generate high-performance interpreters. The document outlines various language features that can be modeled in DynSem, including arithmetic, booleans, control flow, functions, and mutable state.
(3) DynSem specifications are composed of modules that import language signatures and define semantic rules over them. Rules are used to reduce expressions to values in an environment and store. This allows modeling features like variables, functions, and mutable boxes.
This document discusses syntactic editor services including formatting, syntax coloring, and syntactic completion. It describes how syntactic completion can be provided generically based on a syntax definition. The document also discusses how context-free grammars can be extended with templates to specify formatting layout when pretty-printing abstract syntax trees to text. Templates are used to insert whitespace, line breaks, and indentation to produce readable output.
Compiler Construction | Lecture 3 | Syntactic Editor Services (Eelco Visser)
The document discusses syntactic editor services for programming languages. It covers formatting specifications that define how abstract syntax trees are mapped to text using templates. It also discusses syntactic completion, which proposes valid completions in an editor by using the syntax definition. The lecture focuses on defining lexical and syntactic syntax for Tiger using SDF, and generating editor services like formatting and coloring from the syntax definitions.
Compiler Construction | Lecture 9 | Constraint Resolution (Eelco Visser)
This document provides an overview of constraint resolution in the context of a compiler construction lecture. It discusses unification, which is the basis for many type inference and constraint solving approaches. It also describes separating type checking into constraint generation and constraint solving, and introduces a constraint language that integrates name resolution into constraint resolution through scope graph constraints. Finally, it discusses papers on further developments with this approach, including addressing expressiveness and staging issues in type systems through the Statix DSL for defining type systems.
This document provides an overview of Lecture 2 on Declarative Syntax Definition for the CS4200 Compiler Construction course. The lecture covers syntax definitions from which parsers can be derived, the perspective on declarative syntax definition using SDF, and reading material on the SDF3 syntax definition formalism and papers on testing syntax definitions and declarative syntax. It also discusses what syntax is, both in linguistics and in programming languages, and how programs can be described in terms of syntactic categories and language constructs. An example Tiger program for solving the n-queens problem illustrates the syntactic categories of Tiger.
Separation of Concerns in Language Definition (Eelco Visser)
Slides for keynote at the Modularity 2014 conference in Lugano on April 23, 2014. (A considerable part of the talk consisted of live programming the syntax definition and name binding rules for a little language.)
This document discusses top-down parsing and predictive parsing. It begins with an overview of top-down parsing, noting that the parse tree is constructed from the top-down and left-to-right. It then discusses recursive descent parsing and its limitations for ambiguous, left-recursive, or non-left-factored grammars. Finally, it introduces LL(1) parsing and how LL(1) parsing tables can be used to predict the next production without backtracking, making parsing more efficient.
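The "predict without backtracking" claim is easy to demonstrate: an LL(1) parser keeps a stack of grammar symbols and consults a table indexed by (nonterminal, lookahead) to pick the next production. The grammar and table below are an illustrative toy, not taken from the document:

```python
# Table-driven LL(1) parsing for the grammar
#   S -> a S b | epsilon
# TABLE maps (nonterminal, lookahead) to the production body to push.

TABLE = {
    ('S', 'a'): ['a', 'S', 'b'],   # S -> a S b   when lookahead is 'a'
    ('S', 'b'): [],                # S -> epsilon ('b' is in FOLLOW(S))
    ('S', '$'): [],                # S -> epsilon at end of input
}

def ll1_parse(tokens):
    tokens = list(tokens) + ['$']
    stack = ['$', 'S']                    # end marker under start symbol
    i = 0
    while stack:
        top = stack.pop()
        look = tokens[i]
        if top == look:                   # match a terminal (or '$')
            i += 1
        elif (top, look) in TABLE:        # predict: expand the nonterminal
            stack.extend(reversed(TABLE[(top, look)]))
        else:
            return False                  # no table entry: syntax error
    return i == len(tokens)

print(ll1_parse("aabb"))   # True
print(ll1_parse("aab"))    # False
```

Every step is decided by one table lookup on a single lookahead token, which is exactly why no backtracking is needed.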
Parsing is the process of analyzing a string of symbols according to the rules of a formal grammar to determine its syntactic structure. It involves expanding a parse tree from the start symbol to the leaves using either a top-down or bottom-up approach. Top-down parsers expand the leftmost non-terminal at each step, while bottom-up parsers start at the leaves and grow the tree toward the root. Bottom-up, or LR, parsing is preferred in practice.
This document discusses operator precedence parsing. It describes operator grammars that can be parsed efficiently using an operator precedence parser. It explains how precedence relations are defined between terminal symbols and how these relations are used during the shift-reduce parsing process to determine whether to shift or reduce at each step. It also addresses handling unary minus operators and recovering from shift/reduce errors during parsing.
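The shift-or-reduce decision driven by precedence relations can be rendered compactly with precedence climbing, a closely related technique (not necessarily the exact algorithm in the document): each operator has a binding power, and a subexpression only absorbs operators at least as strong as its minimum level. The precedence values are illustrative:

```python
# Precedence climbing over a token list. Calling the right operand
# with PREC[op] + 1 makes binary operators left-associative.

PREC = {'+': 1, '-': 1, '*': 2, '/': 2}

def parse_expr(tokens, pos=0, min_prec=1):
    value = int(tokens[pos])
    pos += 1
    while pos < len(tokens) and tokens[pos] in PREC and PREC[tokens[pos]] >= min_prec:
        op = tokens[pos]
        # the right operand must bind tighter than the current operator
        rhs, pos = parse_expr(tokens, pos + 1, PREC[op] + 1)
        value = {'+': value + rhs, '-': value - rhs,
                 '*': value * rhs, '/': value // rhs}[op]
    return value, pos

value, _ = parse_expr("8 - 2 * 3".split())
print(value)   # 2, because '*' takes precedence over '-'
```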
ANTLR v3 is an improved version of ANTLR that provides more robust grammars, error recovery, attributes, tree construction and code generation capabilities compared to version 2. Some key features include single element EBNF grammar syntax, support for parameters and return values in rules, dynamic scoping of attributes, automatic and rewrite-based tree construction, tree grammars, and internationalization through string templates. The runtime is also better organized and separated into modules for parsing, trees, and debugging.
The document discusses syntax analysis and parsing. It defines a syntax analyzer as creating the syntactic structure of a source program in the form of a parse tree. A syntax analyzer, also called a parser, checks if a program satisfies the rules of a context-free grammar and produces the parse tree if it does, or error messages otherwise. It describes top-down and bottom-up parsing methods and how parsers use grammars to analyze syntax.
Declarative Type System Specification with Statix (Eelco Visser)
In this talk I present the design of Statix, a new constraint-based language for the executable specification of type systems. Statix specifications consist of predicates that define the well-formedness of language constructs in terms of built-in and user-defined constraints. Statix has a declarative semantics that defines whether a model satisfies a constraint. The operational semantics of Statix is defined as a sound constraint solving algorithm that searches for a solution for a constraint. The aim of the design is that Statix users can ignore the execution order of constraint solving and think in terms of the declarative semantics.
A distinctive feature of Statix is its use of scope graphs, a language parametric framework for the representation and querying of the name binding facts in programs. Since types depend on name resolution and name resolution may depend on types, it is typically not possible to construct the entire scope graph of a program before type constraint resolution. In (algorithmic) type system specifications this leads to explicit staging of the construction and querying of the type environment (class table, symbol table). Statix automatically stages the construction of the scope graph of a program such that queries are never executed when their answers may be affected by future scope graph extension. In the talk, I will explain the design of Statix by means of examples.
https://eelcovisser.org/post/309/declarative-type-system-specification-with-statix
Compiler Construction | Lecture 14 | Interpreters (Eelco Visser)
This document summarizes a lecture on interpreters for programming languages. It discusses how operational semantics can be used to define the meaning of a program through state transitions in an interpreter. It provides examples of defining the semantics of a simple language using DynSem, a domain-specific language for specifying operational semantics. DynSem specifications can be compiled to interpreters that execute programs in the defined language.
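The style of interpreter that big-step ("natural semantics") rules describe — each rule reduces a construct to a value in an environment — maps directly onto a recursive evaluator. A sketch in Python; the term encoding is invented for illustration and is not DynSem syntax:

```python
# Big-step evaluator: one case per language construct, each
# reducing a term to a value under an environment `env`.

def eval_(term, env):
    tag = term[0]
    if tag == 'num':
        return term[1]
    if tag == 'var':
        return env[term[1]]
    if tag == 'add':
        return eval_(term[1], env) + eval_(term[2], env)
    if tag == 'let':                          # Let(x, e1, e2)
        _, x, e1, e2 = term
        return eval_(e2, {**env, x: eval_(e1, env)})
    if tag == 'if':
        _, c, t, e = term
        return eval_(t, env) if eval_(c, env) else eval_(e, env)
    raise ValueError(f"unknown construct: {tag}")

# let x = 2 in x + 3
prog = ('let', 'x', ('num', 2), ('add', ('var', 'x'), ('num', 3)))
print(eval_(prog, {}))   # 5
```

Each `if tag == ...` branch plays the role of one semantic rule; generating such an evaluator from rules is the "specification to interpreter" step the summary describes.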
A Language Designer’s Workbench. A one-stop shop for implementation and verif... (Eelco Visser)
The document describes a language designer's workbench that provides tools to support the implementation and verification of language designs. It allows language designers to define the syntax, name binding, type system, and semantics of a language and provides tools like parsers, type checkers, interpreters, and debuggers. The workbench integrates existing tools like SDF3, NaBL, TS (a type specification language), and Stratego under a common framework. A prototype implementation of the workbench is demonstrated by defining the syntax, types, and semantics of a simple functional language called PCF.
The document discusses the role of the parser in compiler design. It explains that the parser takes a stream of tokens from the lexical analyzer and checks if the source program satisfies the rules of the context-free grammar. If so, it creates a parse tree representing the syntactic structure. Parsers are categorized as top-down or bottom-up based on the direction they build the parse tree. The document also covers context-free grammars, derivations, parse trees, ambiguity, and techniques for eliminating left-recursion from grammars.
The document describes static name resolution in programming languages. It discusses how names are bound to declarations through lexical scoping and how references are resolved to declarations by following paths through a scope graph representation. It presents the concepts of scopes, declarations, references, resolution paths, imports, and parent scopes. It also discusses how name resolution can be formalized using a calculus based on scope graphs, separation reachability and visibility, and how this supports name resolution, disambiguation, and program transformations.
Compiler Construction | Lecture 2 | Declarative Syntax Definition (Eelco Visser)
The document describes a lecture on declarative syntax definition. It discusses the perspective on declarative syntax definition explained in an Onward! 2010 essay. It also mentions an OOPSLA 2011 paper that introduced the Spoofax Testing (SPT) language used in the section on testing syntax definitions. Finally, it provides a link to documentation on the SDF3 syntax definition formalism.
Syntax-Directed Translation into Three Address Code (sanchi29)
The document discusses syntax-directed translation of code into three-address code. It defines semantic rules for generating three-address code for expressions, boolean expressions, and control flow statements. Temporary variables are generated for subexpressions and intermediate values. The semantic rules specify generating three-address code statements using temporary variables. Backpatching is also discussed as a technique to replace symbolic names in goto statements with actual addresses after code generation.
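The temporary-variable scheme the summary describes is a short recursive walk: each operator node emits one three-address instruction into a fresh temporary. A sketch with an invented tree encoding and `t0, t1, ...` naming chosen for illustration:

```python
# Syntax-directed translation of an expression tree into
# three-address code: leaves are names/constants, each operator
# node gets a fresh temporary holding its intermediate value.

def tac(node, code, counter):
    """Emit code for `node`; return the place holding its value."""
    if isinstance(node, (int, str)):
        return str(node)                   # constant or variable name
    op, left, right = node
    l = tac(left, code, counter)
    r = tac(right, code, counter)
    t = f"t{counter[0]}"                   # fresh temporary
    counter[0] += 1
    code.append(f"{t} = {l} {op} {r}")
    return t

code = []
tac(('+', ('*', 'a', 'b'), 4), code, [0])
print(code)   # ['t0 = a * b', 't1 = t0 + 4']
```

Backpatching enters the picture once control flow is translated the same way: a `goto _` is emitted with a hole and the target address is filled in later.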
Compiler Construction | Lecture 6 | Introduction to Static Analysis (Eelco Visser)
Lecture introducing the need for static analysis in addition to parsing, the complications that names cause, and name resolution with scope graphs.
The document discusses top-down and bottom-up parsing techniques. Top-down parsing constructs a parse tree starting from the root node and progresses depth-first. It can require backtracking. Bottom-up parsing uses shift-reduce parsing, shifting input symbols onto a stack until they can be reduced based on grammar rules.
The document discusses top-down and bottom-up parsing techniques, with top-down parsing constructing a parse tree starting from the root node and working downward, while bottom-up parsing uses shift-reduce parsing to shift input symbols onto a stack and reduce the stack based on grammar rules. It also covers topics like recursive descent parsing, predictive parsing tables, LL(1) grammars, and error recovery techniques used in parsing.
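The shift-reduce mechanism both summaries describe can be sketched for a toy grammar: shift tokens onto a stack, and reduce whenever the top of the stack matches a production body. A real parser consults a parsing table to decide between shifting and reducing; the greedy reduce-first policy below only works because this particular grammar has no shift/reduce conflicts:

```python
# Shift-reduce parsing for the (illustrative) grammar
#   S -> ( S )  |  ( )
# Accept when the input is consumed and the stack holds just S.

def sr_parse(tokens):
    stack = []
    tokens = list(tokens)
    while tokens or stack != ['S']:
        if stack[-2:] == ['(', ')']:
            stack[-2:] = ['S']             # reduce by  S -> ( )
        elif stack[-3:] == ['(', 'S', ')']:
            stack[-3:] = ['S']             # reduce by  S -> ( S )
        elif tokens:
            stack.append(tokens.pop(0))    # shift the next token
        else:
            return False                   # stuck: syntax error
    return True

print(sr_parse("(())"))   # True
print(sr_parse("(()"))    # False
```

Reading the trace bottom-up is instructive: the parse tree is built from the leaves, with each reduction creating an interior node — the opposite direction from the top-down examples above.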
The document discusses lexical analysis in compilers. It describes how a lexical analyzer groups characters into tokens by recognizing patterns in the input based on regular expressions. It provides examples of token classes and structures. It also explains how lexical analysis is implemented using a lexical analyzer generator called LEX, which translates a LEX source file into a C program that performs lexical analysis.
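The pattern-to-token scheme that LEX generates can be sketched directly with Python's `re` module: one master regular expression with a named group per token class, applied repeatedly to the input. The token classes below are illustrative, not LEX syntax:

```python
# LEX-style lexer sketch: alternation of named-group patterns;
# the name of the group that matched is the token class.

import re

TOKEN_SPEC = [
    ('NUMBER', r'\d+'),
    ('IDENT',  r'[A-Za-z_]\w*'),
    ('OP',     r'[+\-*/=]'),
    ('SKIP',   r'\s+'),        # whitespace: matched but not emitted
]
MASTER = re.compile('|'.join(f'(?P<{name}>{pat})' for name, pat in TOKEN_SPEC))

def lex(text):
    tokens, pos = [], 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if not m:
            raise SyntaxError(f"illegal character at {pos}: {text[pos]!r}")
        if m.lastgroup != 'SKIP':
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

print(lex("x1 = 42 + y"))
# [('IDENT', 'x1'), ('OP', '='), ('NUMBER', '42'), ('OP', '+'), ('IDENT', 'y')]
```

As in LEX, earlier alternatives win ties, so keyword patterns would be listed before the general identifier pattern.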
The document discusses syntax analysis and predictive parsing. It defines the role of the parser and describes techniques for dealing with ambiguity and left recursion in grammars. It also covers computing FIRST and FOLLOW sets, constructing predictive parsing tables, and using a table-driven predictive parsing algorithm. The algorithm uses the parsing table and the lookahead token to decide which production to expand, matching terminals until it reaches the end symbol.
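The FIRST-set computation that underlies the parsing table is a small fixed-point iteration. A sketch with an invented grammar encoding (`EPS` for epsilon; symbols that appear as keys are nonterminals, everything else is a terminal):

```python
# FIRST sets by fixed-point iteration, for the classic
# left-factored expression fragment:
#   E -> T E'      E' -> + T E' | epsilon      T -> id

EPS = 'eps'
GRAMMAR = {
    'E':  [['T', "E'"]],
    "E'": [['+', 'T', "E'"], [EPS]],
    'T':  [['id']],
}

def first_sets(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, prods in grammar.items():
            for prod in prods:
                before = len(first[nt])
                for sym in prod:
                    # FIRST of a terminal is just itself
                    f = first[sym] if sym in grammar else {sym}
                    first[nt] |= f - {EPS}
                    if EPS not in f:       # sym cannot vanish: stop here
                        break
                else:
                    first[nt].add(EPS)     # every symbol can vanish
                if len(first[nt]) != before:
                    changed = True
    return first

print(first_sets(GRAMMAR))
# {'E': {'id'}, "E'": {'+', 'eps'}, 'T': {'id'}}
```

FOLLOW sets are computed by a similar iteration over production right-hand sides; together they fill the (nonterminal, lookahead) cells of the predictive parsing table.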
Declare Your Language: Dynamic Semantics (Eelco Visser)
This document provides an overview of DynSem, a domain-specific language for specifying the dynamic semantics of programming languages. It discusses the design goals of DynSem in being concise, modular, executable, portable, and high-performance. The document then provides an example DynSem specification for the semantics of a simple language with features like arithmetic, booleans, functions, and mutable variables. It describes how DynSem specifications are organized into modules that can be composed to define the semantics.
This document discusses syntax definition and provides examples using various syntax definition formalisms including Backus-Naur Form (BNF), Extended Backus-Naur Form (EBNF), and SDF3. It introduces concepts of lexical syntax, context-free syntax, abstract syntax, disambiguation, and testing syntax definitions. Specific examples are provided for defining the syntax of an expression language using BNF, EBNF, and SDF3. Testing syntax definitions using Spoofax is also discussed with examples of test cases for lexical and context-free syntax.
The document discusses syntax definition in programming languages. It provides examples of lexical syntax, context-free syntax, abstract syntax trees, disambiguation, and testing syntax definitions using SDF3 and Spoofax. Key topics covered include regular expressions, Backus-Naur Form, Extended Backus-Naur Form, SDF3 syntax, Spoofax architecture for language implementation, and syntax processing tools like parsers, pretty-printers and compilers.
Many developers will be familiar with lex, flex, yacc, bison, ANTLR, and other related tools to generate parsers for use inside their own code. For recognizing computer-friendly languages, however, context-free grammars and their parser-generators leave a few things to be desired. This is about how the seemingly simple prospect of parsing some text turned into a new parser toolkit for Erlang, and why functional programming makes parsing fun and awesome
Open course(programming languages) 20150225JangChulho
This document discusses programming language parsers and abstract syntax trees. It contains the following key points:
1) A parser takes a sequence of tokens as input and outputs a parse tree representing the syntactic structure according to a grammar. The tree structure is convenient for computers to interpret even if cumbersome for humans.
2) Parsers can build the parse tree either top-down or bottom-up. An abstract syntax tree then simplifies the parse tree, removing details not needed for semantic analysis.
3) Tools like Python Lex-Yacc can generate parsers from grammar rules and token definitions, allowing the construction and use of parsers and abstract syntax trees for programming languages.
This document discusses programming language parsers and abstract syntax trees. It covers topics like:
- The process of syntactical analysis and turning a sequence of tokens into a parse tree.
- How parse trees represent the syntactic structure of a string according to a grammar and why they are useful for computers.
- The two main approaches for building a parse tree - top-down and bottom-up parsing.
- How Python's lex-yacc module can be used to generate a parser from grammar rules and build an abstract syntax tree.
This document provides an outline and overview of dynamic semantics and operational semantics. It discusses defining the meaning of programs through execution and transition systems. It introduces DynSem, a domain-specific language for specifying dynamic semantics in a modular way. DynSem specifications generate interpreters from language definitions. The document uses examples from arithmetic expressions and a language with boxes to illustrate DynSem specifications.
This document discusses F# and FParsec. It provides examples of parsing expressions in FParsec using lazy evaluation and references, as opposed to NParsec which uses bindings. FParsec allows defining recursive parsers in a natural way in F#.
Perl6 regular expression ("regex") syntax has a number of improvements over the Perl5 syntax. The inclusion of grammars as first-class entities in the language makes many uses of regexes clearer, simpler, and more maintainable. This talk looks at a few improvements in the regex syntax and also at how grammars can help make regex use cleaner and simpler.
Slaying the Dragon: Implementing a Programming Language in RubyJason Yeo Jie Shun
The document provides an overview of creating your own programming language. It discusses implementing a basic Lisp dialect called MAL to demonstrate core concepts like parsing, the abstract syntax tree, and defining functions. The examples show how to tokenize, parse, and evaluate simple MAL programs that perform operations like addition and defining functions to square values.
The document describes intermediate code generation during compilation.
1) An intermediate language is used to translate source programs into a form that is CPU independent yet close to machine language. This facilitates code optimization and retargeting of compilers.
2) Common intermediate languages include syntax trees, postfix notation, and three-address code using quadruples. Three-address code breaks down expressions into single assignment statements to simplify optimization.
3) Semantic rules for syntax-directed translation are described to generate three-address code for expressions, assignments, procedures, and other language constructs. Attributes track temporary variables and generated code.
Scala presentation by Aleksandar ProkopecLoïc Descotte
This document provides an introduction to the Scala programming language. It discusses how Scala runs on the Java Virtual Machine, supports both object-oriented and functional programming paradigms, and provides features like pattern matching, immutable data structures, lazy evaluation, and parallel collections. Scala aims to be concise, expressive, and extensible.
The document discusses top-down parsing and predictive parsing. It defines key concepts like scanners, parsers, syntax analyzers, parse trees, top-down parsing, and predictive parsing. It also describes the process of constructing a predictive parsing table using the First and Follow sets of a grammar's productions.
The document discusses intermediate code generation in compiler construction. It covers several intermediate representations including postfix notation, three-address code, and quadruples. It also discusses generating three-address code through syntax-directed translation and the use of symbol tables to handle name resolution and scoping.
This document provides an overview of building a simple one-pass compiler to generate bytecode for the Java Virtual Machine (JVM). It discusses defining a programming language syntax, developing a parser, implementing syntax-directed translation to generate intermediate code targeting the JVM, and generating Java bytecode. The structure of the compiler includes a lexical analyzer, syntax-directed translator, and code generator to produce JVM bytecode from a grammar and language definition.
The document discusses intermediate representation in compilers. It states that compilers use a front end to translate source programs into an intermediate representation and a back end to generate target code from the intermediate representation. This allows for machine-independent optimization and retargeting of compilers to different architectures. It then describes three-address code as a common intermediate representation and shows syntax-directed translation of expressions and control flow structures into three-address code. Finally, it discusses various aspects of compiler implementation like symbol tables, type checking, and backpatching.
Similar to Compiler Construction | Lecture 4 | Parsing (20)
This document provides an overview of the CS4200 Compiler Construction course at TU Delft. It discusses the course organization, structure, and assessment. The course is split into two parts - CS4200-A which covers concepts and techniques through lectures, papers, and homework assignments, and CS4200-B which involves building a compiler for a subset of Java as a semester-long project. Students will use the Spoofax language workbench to implement their compiler and will submit assignments through a private GitLab repository.
A Direct Semantics of Declarative Disambiguation RulesEelco Visser
This document discusses research into providing a direct semantics for declarative disambiguation of expression grammars. It aims to define what disambiguation rules mean, ensure they are safe and complete, and provide an effective implementation strategy. The document outlines key research questions around the meaning, safety, completeness and coverage of disambiguation rules. It also presents contributions around using subtree exclusion patterns to define safe and complete disambiguation for classes of expression grammars, and implementing this in SDF3.
Compiler Construction | Lecture 17 | Beyond Compiler ConstructionEelco Visser
Compiler construction techniques are applied beyond general-purpose languages through domain-specific languages (DSLs). The document discusses several DSLs developed using Spoofax including:
- WebDSL for web programming with sub-languages for entities, queries, templates, and access control.
- IceDust for modeling information systems with derived values computed on-demand, incrementally, or eventually consistently.
- PixieDust for client-side web programming with views as derived values updated incrementally.
- PIE for defining software build pipelines as tasks with dynamic dependencies computed incrementally.
The document also outlines several research challenges in compiler construction like high-level declarative language definition, verification of
Domain Specific Languages for Parallel Graph AnalytiX (PGX)Eelco Visser
This document discusses domain-specific languages (DSLs) for parallel graph analytics using PGX. It describes how DSLs allow users to implement graph algorithms and queries using high-level languages that are then compiled and optimized to run efficiently on PGX. Examples of DSL optimizations like multi-source breadth-first search are provided. The document also outlines the extensible compiler architecture used for DSLs, which can generate code for different backends like shared memory or distributed memory.
Compiler Construction | Lecture 15 | Memory ManagementEelco Visser
The document discusses different memory management techniques:
1. Reference counting counts the number of pointers to each record and deallocates records with a count of 0.
2. Mark and sweep marks all reachable records from program roots and sweeps unmarked records, adding them to a free list.
3. Copying collection copies reachable records to a "to" space, allowing the original "from" space to be freed without fragmentation.
4. Generational collection focuses collection on younger object generations more frequently to improve efficiency.
Compiler Construction | Lecture 13 | Code GenerationEelco Visser
The document discusses code generation and optimization techniques, describing compilation schemas that define how language constructs are translated to target code patterns, and covers topics like ensuring correctness of generated code through type checking and verification of static constraints on the target format. It also provides examples of compilation schemas for Tiger language constructs like arithmetic expressions and control flow and discusses generating nested functions.
Compiler Construction | Lecture 12 | Virtual MachinesEelco Visser
The document discusses the architecture of the Java Virtual Machine (JVM). It describes how the JVM uses threads, a stack, heap, and method area. It explains JVM control flow through bytecode instructions like goto, and how the operand stack is used to perform operations and hold method arguments and return values.
Compiler Construction | Lecture 7 | Type CheckingEelco Visser
This document summarizes a lecture on type checking. It discusses using constraints to separate the language-specific type checking rules from the language-independent solving algorithm. Constraint-based type checking collects constraints as it traverses the AST, then solves the constraints in any order. This allows type information to be learned gradually and avoids issues with computation order.
Compiler Construction | Lecture 1 | What is a compiler?Eelco Visser
This document provides an overview of the CS4200 Compiler Construction course at TU Delft. It discusses the organization of the course into two parts: CS4200-A which covers compiler concepts and techniques through lectures, papers, and homework assignments; and CS4200-B which involves building a compiler for a subset of Java as a semester-long project. Key topics covered include the components of a compiler like parsing, type checking, optimization, and code generation; intermediate representations; and different types of compilers.
Declare Your Language: Virtual Machines & Code GenerationEelco Visser
The document summarizes virtual machines and code generation. It discusses how high-level programming languages are abstracted from low-level machine details through virtual machines. The Java Virtual Machine architecture and bytecode instructions are described, including its stack-based design, threads, heap, and method area. Code generation mechanics like string operations are also covered.
Declare Your Language: Constraint Resolution 2Eelco Visser
1) The document discusses unification and the union-find algorithm for efficiently solving unification problems.
2) Unification involves resolving constraints between terms to find a substitution that makes the terms equal. This can have exponential time and space complexity.
3) The union-find algorithm uses set representatives and linking to maintain disjoint sets in near-constant time complexity, improving on naive recursive algorithms.
4) An example shows how union-find can be applied to fully unify a complex expression by progressively linking variables and subterms between the two sides.
Declare Your Language: Constraint Resolution 1Eelco Visser
This document provides an overview of constraint resolution and type checking with constraints. It discusses unification theory and the semantics and implementation of constraint solving. Specifically, it covers:
- Unification algorithms and their properties of soundness, completeness, and principality
- The meaning of constraints and how constraint satisfaction is formally defined using constraint semantics
- How to write a constraint solver using the Constraint Handling Rules (CHR) formalism and the semantics of solving subtyping constraints
- The evaluation rules for CHR-based constraint solving, including solving built-in constraints, introducing new constraints, and simplifying/propagating constraints using rewrite rules
- How the CHR rules for subtyping constraints are evaluated using
Building API data products on top of your real-time data infrastructureconfluent
This talk and live demonstration will examine how Confluent and Gravitee.io integrate to unlock value from streaming data through API products.
You will learn how data owners and API providers can document, secure data products on top of Confluent brokers, including schema validation, topic routing and message filtering.
You will also see how data and API consumers can discover and subscribe to products in a developer portal, as well as how they can integrate with Confluent topics through protocols like REST, Websockets, Server-sent Events and Webhooks.
Whether you want to monetize your real-time data, enable new integrations with partners, or provide self-service access to topics through various protocols, this webinar is for you!
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSISTier1 app
Are you ready to unlock the secrets hidden within Java thread dumps? Join us for a hands-on session where we'll delve into effective troubleshooting patterns to swiftly identify the root causes of production problems. Discover the right tools, techniques, and best practices while exploring *real-world case studies of major outages* in Fortune 500 enterprises. Engage in interactive lab exercises where you'll have the opportunity to troubleshoot thread dumps and uncover performance issues firsthand. Join us and become a master of Java thread dump analysis!
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...kalichargn70th171
In today's fiercely competitive mobile app market, the role of the QA team is pivotal for continuous improvement and sustained success. Effective testing strategies are essential to navigate the challenges confidently and precisely. Ensuring the perfection of mobile apps before they reach end-users requires thoughtful decisions in the testing plan.
What to do when you have a perfect model for your software but you are constrained by an imperfect business model?
This talk explores the challenges of bringing modelling rigour to the business and strategy levels, and talking to your non-technical counterparts in the process.
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdfBaha Majid
IBM watsonx Code Assistant for Z, our latest Generative AI-assisted mainframe application modernization solution. Mainframe (IBM Z) application modernization is a topic that every mainframe client is addressing to various degrees today, driven largely from digital transformation. With generative AI comes the opportunity to reimagine the mainframe application modernization experience. Infusing generative AI will enable speed and trust, help de-risk, and lower total costs associated with heavy-lifting application modernization initiatives. This document provides an overview of the IBM watsonx Code Assistant for Z which uses the power of generative AI to make it easier for developers to selectively modernize COBOL business services while maintaining mainframe qualities of service.
Nashik's top web development company, Upturn India Technologies, crafts innovative digital solutions for your success. Partner with us and achieve your goals
Voxxed Days Trieste 2024 - Unleashing the Power of Vector Search and Semantic...Luigi Fugaro
Vector databases are redefining data handling, enabling semantic searches across text, images, and audio encoded as vectors.
Redis OM for Java simplifies this innovative approach, making it accessible even for those new to vector data.
This presentation explores the cutting-edge features of vector search and semantic caching in Java, highlighting the Redis OM library through a demonstration application.
Redis OM has evolved to embrace the transformative world of vector database technology, now supporting Redis vector search and seamless integration with OpenAI, Hugging Face, LangChain, and LlamaIndex. This talk highlights the latest advancements in Redis OM, focusing on how it simplifies the complex process of vector indexing, data modeling, and querying for AI-powered applications. We will explore the new capabilities of Redis OM, including intuitive vector search interfaces and semantic caching, which reduce the overhead of large language model (LLM) calls.
How GenAI Can Improve Supplier Performance Management.pdfZycus
Data Collection and Analysis with GenAI enables organizations to gather, analyze, and visualize vast amounts of supplier data, identifying key performance indicators and trends. Predictive analytics forecast future supplier performance, mitigating risks and seizing opportunities. Supplier segmentation allows for tailored management strategies, optimizing resource allocation. Automated scorecards and reporting provide real-time insights, enhancing transparency and tracking progress. Collaboration is fostered through GenAI-powered platforms, driving continuous improvement. NLP analyzes unstructured feedback, uncovering deeper insights into supplier relationships. Simulation and scenario planning tools anticipate supply chain disruptions, supporting informed decision-making. Integration with existing systems enhances data accuracy and consistency. McKinsey estimates GenAI could deliver $2.6 trillion to $4.4 trillion in economic benefits annually across industries, revolutionizing procurement processes and delivering significant ROI.
Orca: Nocode Graphical Editor for Container OrchestrationPedro J. Molina
Tool demo on CEDI/SISTEDES/JISBD2024 at A Coruña, Spain. 2024.06.18
"Orca: Nocode Graphical Editor for Container Orchestration"
by Pedro J. Molina PhD. from Metadev
How Can Hiring A Mobile App Development Company Help Your Business Grow?ToXSL Technologies
ToXSL Technologies is an award-winning Mobile App Development Company in Dubai that helps businesses reshape their digital possibilities with custom app services. As a top app development company in Dubai, we offer highly engaging iOS & Android app solutions. https://rb.gy/necdnt
4. The perspective of this lecture on declarative syntax definition is
explained more elaborately in this Onward! 2010 essay. It uses an
older version of SDF than the one used in these slides. Production rules have
the form
X1 … Xn -> N {cons(“C”)}
instead of
N.C = X1 … Xn
http://swerl.tudelft.nl/twiki/pub/Main/TechnicalReports/TUD-SERG-2010-019.pdf
https://doi.org/10.1145/1932682.1869535
5. Compilers: Principles, Techniques, and Tools, 2nd Edition
Alfred V. Aho, Columbia University
Monica S. Lam, Stanford University
Ravi Sethi, Avaya Labs
Jeffrey D. Ullman, Stanford University
2007 | Pearson
Classical compiler textbook
Chapter 4: Syntax Analysis
Read Sections 4.1, 4.2, 4.3, 4.5, 4.6
Pictures in these slides are copies from the book
6. Sikkel, N. (1993). Parsing Schemata.
PhD thesis. Enschede: Universiteit Twente
This PhD thesis presents a uniform framework for describing
a wide range of parsing algorithms.
https://research.utwente.nl/en/publications/parsing-schemata
“Parsing schemata provide a general framework for
specification, analysis and comparison of (sequential and/or
parallel) parsing algorithms. A grammar specifies implicitly what
the valid parses of a sentence are; a parsing algorithm specifies
explicitly how to compute these. Parsing schemata form a
well-defined level of abstraction in between grammars and parsing
algorithms. A parsing schema specifies the types of
intermediate results that can be computed by a parser, and the
rules that allow to expand a given set of such results with new
results. A parsing schema does not specify the data structures,
control structures, and (in case of parallel processing)
communication structures that are to be used by a parser.”
For the interested
11. Terminals
- Basic symbols from which strings are formed
Nonterminals
- Syntactic variables that denote sets of strings
Start Symbol
- Denotes the nonterminal that generates the strings of the language
Productions
- A = X1 … Xn
- Head/left side (A) is a nonterminal
- Body/right side (X1 … Xn) is zero or more terminals and nonterminals
Context-Free Grammars
13. Abbreviated Grammar
grammar
productions
S = E
E = E "+" T
E = T
T = T "*" F
T = F
F = "(" E ")"
F = ID
Nonterminals, terminals can be derived from productions
First production defines start symbol
grammar
start S
non-terminals E T F
terminals "+" "*" "(" ")" ID
productions
S = E
E = E "+" T
E = T
T = T "*" F
T = F
F = "(" E ")"
F = ID
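As the slide notes, the non-terminals, terminals, and start symbol can all be derived from the productions. A minimal Python sketch of that derivation (the dictionary encoding is mine, not part of the slides):

```python
# Productions of the expression grammar, as head -> list of bodies.
GRAMMAR = {
    "S": [["E"]],
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["ID"]],
}

START = next(iter(GRAMMAR))   # first production defines the start symbol
NONTERMINALS = set(GRAMMAR)   # symbols that occur as the head of a production
TERMINALS = {sym              # every other symbol occurring in a body
             for bodies in GRAMMAR.values()
             for body in bodies
             for sym in body} - NONTERMINALS
```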
14. Notation
A, B, C: non-terminals
l: terminals
a, b, c: strings of non-terminals and terminals
(alpha, beta, gamma in math)
w, v: strings of terminal symbols
17. Derivations
grammar
productions
E = E "+" E
E = E "*" E
E = "-" E
E = "(" E ")"
E = ID
// derivation step: replace symbol by rhs of production
// E = E "+" E
// replace E by E "+" E
//
// derivation:
// repeatedly apply derivations
derivation
E
=> "-" E
=> "-" "(" E ")"
=> "-" "(" ID ")"
derivation // derives in zero or more steps
E =>* "-" "(" ID "+" ID ")"
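A derivation step replaces a non-terminal in the sentential form by the right-hand side of one of its productions. The derivation above can be replayed in a few lines of Python (a sketch; sentential forms are lists of symbols):

```python
# One derivation step: replace the first occurrence of non-terminal `nt`
# in the sentential form by the body of one of its productions.
def derive(form, nt, body):
    i = form.index(nt)
    return form[:i] + body + form[i + 1:]

# E => "-" E => "-" "(" E ")" => "-" "(" ID ")"
form = ["E"]
form = derive(form, "E", ["-", "E"])
form = derive(form, "E", ["(", "E", ")"])
form = derive(form, "E", ["ID"])
assert form == ["-", "(", "ID", ")"]
```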
19. Left-Most Derivation
grammar
productions
E = E "+" E
E = E "*" E
E = "-" E
E = "(" E ")"
E = ID
derivation // left-most derivation
E
=> "-" E
=> "-" "(" E ")"
=> "-" "(" E "+" E ")"
=> "-" "(" ID "+" E ")"
=> "-" "(" ID "+" ID ")"
Left-most derivation: Expand left-most non-terminal at each step
20. Right-Most Derivation
grammar
productions
E = E "+" E
E = E "*" E
E = "-" E
E = "(" E ")"
E = ID
derivation // left-most derivation
E
=> "-" E
=> "-" "(" E ")"
=> "-" "(" E "+" E ")"
=> "-" "(" ID "+" E ")"
=> "-" "(" ID "+" ID ")"
Right-most derivation: Expand right-most non-terminal at each step
derivation // right-most derivation
E
=> "-" E
=> "-" "(" E ")"
=> "-" "(" E "+" E ")"
=> "-" "(" E "+" ID ")"
=> "-" "(" ID "+" ID ")"
22. Left-Most Tree Derivation
grammar
productions
E.A = E "+" E
E.T = E "*" E
E.N = "-" E
E.P = "(" E ")"
E.V = ID
derivation // left-most derivation
E
=> "-" E
=> "-" "(" E ")"
=> "-" "(" E "+" E ")"
=> "-" "(" ID "+" E ")"
=> "-" "(" ID "+" ID ")"
tree derivation // left-most
E
=> E["-" E]
=> E["-" E["(" E ")"]]
=> E["-" E["(" E[E "+" E] ")"]]
=> E["-" E["(" E[E[ID] "+" E] ")"]]
=> E["-" E["(" E[E[ID] "+" E[ID]] ")"]]
23. Left-Most Tree Derivation
tree derivation // left-most
E
=> E["-" E]
=> E["-" E["(" E ")"]]
=> E["-" E["(" E[E "+" E] ")"]]
=> E["-" E["(" E[E[ID] "+" E] ")"]]
=> E["-" E["(" E[E[ID] "+" E[ID]] ")"]]
24. Ambiguity: Deriving Multiple Parse Trees
grammar
productions
E.A = E "+" E
E.T = E "*" E
E.N = "-" E
E.P = "(" E ")"
E.V = ID
derivation
E =>* ID "+" ID "*" ID
Ambiguous grammar: produces >1 parse tree for a sentence
derivation
E
=> E "+" E
=> ID "+" E
=> ID "+" E "*" E
=> ID "+" ID "*" E
=> ID "+" ID "*" ID
tree derivation
E =>* E[E[ID] "+" E[E[ID] "*" E[ID]]]
derivation
E
=> E "*" E
=> E "+" E "*" E
=> ID "+" E "*" E
=> ID "+" ID "*" E
=> ID "+" ID "*" ID
tree derivation
E =>* E[E[E[ID] "+" E[ID]] "*" E[ID]]
26. Ambiguity: Deriving Abstract Syntax Terms
grammar
productions
E.A = E "+" E
E.T = E "*" E
E.N = "-" E
E.P = "(" E ")"
E.V = ID
derivation
E =>* ID "+" ID "*" ID
derivation
E
=> E "+" E
=> ID "+" E
=> ID "+" E "*" E
=> ID "+" ID "*" E
=> ID "+" ID "*" ID
derivation
E
=> E "*" E
=> E "+" E "*" E
=> ID "+" E "*" E
=> ID "+" ID "*" E
=> ID "+" ID "*" ID
term derivation
E
=> A(E, E)
=> A(V(ID), E)
=> A(V(ID), T(E, E))
=> A(V(ID), T(V(ID), E))
=> A(V(ID), T(V(ID), V(ID)))
term derivation
E
=> T(E, E)
=> T(A(E, E), E)
=> T(A(V(ID), E), E)
=> T(A(V(ID), V(ID)), E)
=> T(A(V(ID), V(ID)), V(ID))
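The two term derivations above produce different terms for the same sentence ID "+" ID "*" ID, and the difference matters semantically: interpreting the constructors numerically shows the two trees denote different values. A sketch (terms as nested tuples; the numeric reading of ID is mine):

```python
# Evaluate a term built from the constructors A (addition), T (times),
# and V (value); the two ASTs for ID "+" ID "*" ID disagree.
def eval_term(t):
    if t[0] == "V":
        return t[1]
    left, right = eval_term(t[1]), eval_term(t[2])
    return left + right if t[0] == "A" else left * right

# reading the three IDs as 1, 2, 3:
t1 = ("A", ("V", 1), ("T", ("V", 2), ("V", 3)))   # A(V(ID), T(V(ID), V(ID)))
t2 = ("T", ("A", ("V", 1), ("V", 2)), ("V", 3))   # T(A(V(ID), V(ID)), V(ID))
assert eval_term(t1) == 7    # 1 + (2 * 3)
assert eval_term(t2) == 9    # (1 + 2) * 3
```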
28. Why?
- Disambiguation
- For use by a particular parsing algorithm
Transformations
- Eliminating ambiguities
- Eliminating left recursion
- Left factoring
Properties
- Does transformation preserve the language (set of strings, trees)?
- Does transformation preserve the structure of trees?
Grammar Transformations
29. Ambiguous Expression Grammar
derivation
E =>* ID "*" ID "+" ID
term derivation
E
=> A(E, E)
=> A(T(E, E), E)
=> A(T(V(ID), E), E)
=> A(T(V(ID), V(ID)), E)
=> A(T(V(ID), V(ID)), V(ID))
term derivation
E
=> T(E, E)
=> T(V(ID), E)
=> T(V(ID), A(E, E))
=> T(V(ID), A(V(ID), E))
=> T(V(ID), A(V(ID), V(ID)))
grammar
productions
E.A = E "+" E
E.T = E "*" E
E.M = "-" E
E.B = "(" E ")"
E.V = ID
30. Associativity and Priority Filter Ambiguities
grammar
productions
E.A = E "+" E
E.T = E "*" E
E.M = "-" E
E.B = "(" E ")"
E.V = ID
derivation
E =>* ID "*" ID "+" ID
term derivation
E
=> A(E, E)
=> A(T(E, E), E)
=> A(T(V(ID), E), E)
=> A(T(V(ID), V(ID)), E)
=> A(T(V(ID), V(ID)), V(ID))
term derivation
E
=> T(E, E)
=> T(V(ID), E)
=> T(V(ID), A(E, E))
=> T(V(ID), A(V(ID), E))
=> T(V(ID), A(V(ID), V(ID)))
grammar
productions
E.A = E "+" E {left}
E.T = E "*" E {left}
E.M = "-" E
E.B = "(" E ")"
E.V = ID
priorities
E.M > E.T > E.A
31. Define Associativity and Priority by Transformation
grammar
productions
E.A = E "+" T
E = T
T.T = T "*" F
T = F
F.V = ID
F.B = "(" E ")"
derivation
E =>* ID "*" ID "+" ID
term derivation
E
=> A(E, E)
=> A(T(E, E), E)
=> A(T(V(ID), E), E)
=> A(T(V(ID), V(ID)), E)
=> A(T(V(ID), V(ID)), V(ID))
term derivation
E
=> T(E, E)
=> T(V(ID), E)
=> T(V(ID), A(E, E))
=> T(V(ID), A(V(ID), E))
=> T(V(ID), A(V(ID), V(ID)))
grammar
productions
E.A = E "+" E {left}
E.T = E "*" E {left}
E.M = "-" E
E.B = "(" E ")"
E.V = ID
priorities
E.M > E.T > E.A
32. Define Associativity and Priority by Transformation
grammar
productions
E.A = E "+" T
E = T
T.T = T "*" F
T = F
F.V = ID
F.B = "(" E ")"
grammar
productions
E.A = E "+" E {left}
E.T = E "*" E {left}
E.M = "-" E
E.B = "(" E ")"
E.V = ID
priorities
E.M > E.T > E.A
Define a new non-terminal for each priority level:
E, T, F
Add ‘injection’ productions to include priority
level n+1 in level n:
E = T
T = F
Transform productions according to their associativity:
Left: E = E “+” T
Right: E = T “+” E
Change the head of each production to reflect its
priority level:
T = T “*” F
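The effect of the transformed grammar can be made visible in a recursive-descent sketch: the left-recursive production E.A = E "+" T becomes a loop that nests A terms to the left, and parse_T sits one priority level down, so "*" binds tighter than "+" (token lists and tuple constructors are my encoding):

```python
# Recursive-descent sketch for the stratified grammar E/T/F; the
# left-recursive E.A = E "+" T production is implemented as a loop.
def parse_E(tokens):
    tree, tokens = parse_T(tokens)
    while tokens and tokens[0] == "+":
        right, tokens = parse_T(tokens[1:])
        tree = ("A", tree, right)        # left-nested: A(A(.., ..), ..)
    return tree, tokens

def parse_T(tokens):
    tree, tokens = parse_F(tokens)
    while tokens and tokens[0] == "*":
        right, tokens = parse_F(tokens[1:])
        tree = ("T", tree, right)
    return tree, tokens

def parse_F(tokens):
    if tokens[0] == "(":
        tree, rest = parse_E(tokens[1:])
        return tree, rest[1:]            # skip the closing ")"
    return ("V", tokens[0]), tokens[1:]

# ID "*" ID "+" ID parses as A(T(V, V), V): "*" wins the priority
tree, rest = parse_E(["ID", "*", "ID", "+", "ID"])
assert tree == ("A", ("T", ("V", "ID"), ("V", "ID")), ("V", "ID"))
```

Repeated "+" nests to the left, matching the {left} annotation: ID "+" ID "+" ID parses as A(A(V, V), V).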
33. Dangling Else Grammar
grammar
sorts S E
productions
S.If = if E then S
S.IfE = if E then S else S
S = other
derivation
S =>* if E1 then S1 else if E2 then S2 else S3
term derivation
S =>* IfE(E1, S1, IfE(E2, S2, S3))
term derivation
S
=> IfE(E1, S1, S)
=> IfE(E1, S1, IfE(E2, S2, S3))
34. Dangling Else Grammar is Ambiguous
grammar
sorts S E
productions
S.If = if E then S
S.IfE = if E then S else S
S = other
term derivation
S
=> If(E1, S)
=> If(E1, IfE(E2, S1, S2))
derivation
S
=> if E1 then S
=> if E1 then if E2 then S1 else S2
derivation
S =>* if E1 then if E2 then S1 else S2
term derivation
S
=> IfE(E1, S, S2)
=> IfE(E1, If(E2, S1), S2)
derivation
S
=> if E1 then S else S2
=> if E1 then if E2 then S1 else S2
35. Eliminating Dangling Else Ambiguity
grammar
sorts S E
productions
S.If = if E then S
S.IfE = if E then S else S
S = other
grammar
productions
S = MS
S = OS
MS = if E then MS else MS
MS = other
OS = if E then S
OS = if E then MS else OS
Generalization of this transformation: contextual grammars
36. This paper defines a declarative semantics for associativity
and priority declarations for disambiguation.
The paper provides a safe semantics and extends it to
deep priority conflicts.
The result of disambiguation is a contextual grammar,
which generalises the disambiguation for the dangling-else
grammar.
The paper is still in production. Ask us for a copy of the draft.
37. Eliminating Left Recursion
grammar
productions
E = E "+" T
E = T
T = T "*" F
T = F
F = "(" E ")"
F = ID
grammar
productions
A = A a
A = b
grammar
productions
A = b A'
A' = a A'
A' = // empty
grammar
productions
E = T E'
E' = "+" T E'
E' =
T = F T'
T' = "*" F T'
T' =
F = "(" E ")"
F = ID
// b followed by a list of a’s
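The general scheme above (A = A a | b becomes A = b A', A' = a A' | empty) can be sketched as a grammar transformation in Python (the list-of-bodies encoding is mine; only immediate left recursion is handled):

```python
# Eliminate immediate left recursion:
#   A = A a1 | ... | A am | b1 | ... | bn
# becomes
#   A  = b1 A' | ... | bn A'
#   A' = a1 A' | ... | am A' | (empty)
def eliminate_left_recursion(nt, bodies):
    recursive = [b[1:] for b in bodies if b and b[0] == nt]   # the a_i
    rest = [b for b in bodies if not b or b[0] != nt]         # the b_j
    if not recursive:
        return {nt: bodies}
    new = nt + "'"
    return {nt: [b + [new] for b in rest],
            new: [a + [new] for a in recursive] + [[]]}

g = eliminate_left_recursion("E", [["E", "+", "T"], ["T"]])
assert g == {"E": [["T", "E'"]],
             "E'": [["+", "T", "E'"], []]}
```

Applied to both E and T, this yields exactly the E'/T' grammar shown on the slide.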
38. Left Factoring
grammar
productions
S.If = if E then S
S.IfE = if E then S else S
S = other
grammar
productions
A = a b1
A = a b2
A = c
grammar
productions
A = a A'
A' = b1
A' = b2
A = c
grammar
sorts S E
productions
S.If = if E then S S'
S'.Else = else S
S'.NoElse = // empty
S = other
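The left-factoring step (A = a b1 | a b2 | c becomes A = a A' | c with A' = b1 | b2) can likewise be sketched as a transformation; this simplified version factors only a single shared leading symbol and introduces one fresh non-terminal (both simplifications are mine):

```python
from collections import defaultdict

# Left-factor productions that share a common first symbol:
#   A = a b1 | a b2 | c   becomes   A = a A' | c ; A' = b1 | b2
def left_factor(nt, bodies):
    by_first = defaultdict(list)
    for b in bodies:
        by_first[b[0] if b else None].append(b)
    result, new = {nt: []}, nt + "'"
    for first, group in by_first.items():
        if len(group) > 1 and first is not None:
            result[nt].append([first, new])          # A = a A'
            result[new] = [b[1:] for b in group]     # A' = b1 | b2
        else:
            result[nt].extend(group)                 # A = c unchanged
    return result

g = left_factor("A", [["a", "b1"], ["a", "b2"], ["c"]])
assert g == {"A": [["a", "A'"], ["c"]], "A'": [["b1"], ["b2"]]}
```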
39. Preservation
- Preserves set of sentences
- Preserves set of trees
- Preserves tree structure
Systematic
- Algorithmic
- Heuristic
Properties of Grammar Transformations
41. Top-Down Parse
grammar
sorts E T E' F T'
productions
E = T E'
E' = "+" T E'
E' =
T = F T'
T' = "*" F T'
T' =
F = "(" E ")"
F = ID
tree derivation // top-down parse
E
=> E[T E']
=> E[T[F T'] E']
=> E[T[F[ID] T'] E']
=> E[T[F[ID] T'[]] E']
=> E[T[F[ID] T'[]] E'["+" T E']]
=> E[T[F[ID] T'[]] E'["+" T[F T'] E']]
=> E[T[F[ID] T'[]] E'["+" T[F[ID] T'] E']]
=> E[T[F[ID] T'[]] E'["+" T[F[ID] T'["*" F T']] E']]
=> E[T[F[ID] T'[]] E'["+" T[F[ID] T'["*" F[ID] T']] E']]
=> E[T[F[ID] T'[]] E'["+" T[F[ID] T'["*" F[ID] T'[]]] E']]
=> E[T[F[ID] T'[]] E'["+" T[F[ID] T'["*" F[ID] T'[]]] E'[]]]
derivation
E =>* ID "+" ID "*" ID
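The top-down parse above maps directly onto a recursive-descent parser: one function per non-terminal, each returning a parse-tree node plus the remaining input. A sketch (tree nodes as ("Symbol", children) tuples, my encoding):

```python
# Recursive-descent parser for the non-left-recursive grammar above.
def E(toks):
    t, toks = T(toks)
    e1, toks = E1(toks)
    return ("E", [t, e1]), toks

def E1(toks):                       # E' = "+" T E'  |  (empty)
    if toks and toks[0] == "+":
        t, rest = T(toks[1:])
        e1, rest = E1(rest)
        return ("E'", ["+", t, e1]), rest
    return ("E'", []), toks

def T(toks):
    f, toks = F(toks)
    t1, toks = T1(toks)
    return ("T", [f, t1]), toks

def T1(toks):                       # T' = "*" F T'  |  (empty)
    if toks and toks[0] == "*":
        f, rest = F(toks[1:])
        t1, rest = T1(rest)
        return ("T'", ["*", f, t1]), rest
    return ("T'", []), toks

def F(toks):                        # F = "(" E ")"  |  ID
    if toks[0] == "(":
        e, rest = E(toks[1:])
        assert rest[0] == ")"
        return ("F", ["(", e, ")"]), rest[1:]
    assert toks[0] == "ID"
    return ("F", ["ID"]), toks[1:]

tree, rest = E(["ID", "+", "ID", "*", "ID"])
assert rest == []
```

The resulting tree has exactly the shape of the final line of the tree derivation: E[T[F[ID] T'[]] E'["+" T[F[ID] T'["*" F[ID] T'[]]] E'[]]].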
44. Use LL(1) grammar
- Not left recursive
- Left factored
Top-down backtracking parsing
- Predict symbol
- If terminal: corresponds to next input symbol?
- Try productions for non-terminal in turn
Predictive parsing
- Predict symbol to parse
- Use lookahead to deterministically choose the production for a non-terminal
Variants
- Parser combinators, PEG, packrat, …
LL Parsing
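The deterministic choice by lookahead rests on FIRST sets (and, for nullable symbols such as E' and T', also FOLLOW sets, omitted here). A fixed-point sketch of the FIRST computation for the grammar of the previous slide, with "" standing for the empty string:

```python
# Compute FIRST sets by fixed-point iteration over the productions.
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["ID"]],
}

def first_sets(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, bodies in grammar.items():
            for body in bodies:
                for sym in body:
                    add = first[sym] - {""} if sym in grammar else {sym}
                    if not add.issubset(first[nt]):
                        first[nt] |= add
                        changed = True
                    if sym not in grammar or "" not in first[sym]:
                        break            # sym cannot derive the empty string
                else:                    # every symbol nullable (or body empty)
                    if "" not in first[nt]:
                        first[nt].add("")
                        changed = True
    return first

FIRST = first_sets(GRAMMAR)
assert FIRST["E"] == {"(", "ID"}
assert FIRST["E'"] == {"+", ""}
```

On lookahead "+", for instance, the parser picks E' = "+" T E'; on any other token it takes the empty production.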
47. A Reduction is an Inverse Derivation
grammar
sorts A
productions
A = b
reduction
a b c <= a A c
48. Reducing to Symbols
grammar
sorts E T F ID
productions
E.P = E "+" T
E.E = T
T.M = T "*" F
T.T = F
F.B = "(" E ")"
F.V = ID
reduction
ID "*" ID
<= F "*" ID
<= T "*" ID
<= T "*" F
<= T
<= E
49. Reducing to Parse Trees
grammar
sorts E T F ID
productions
E.P = E "+" T
E.E = T
T.M = T "*" F
T.T = F
F.B = "(" E ")"
F.V = ID
reduction
ID "*" ID
<= F "*" ID
<= T "*" ID
<= T "*" F
<= T
<= E
tree reduction
ID "*" ID
<= F[ID] "*" ID
<= T[F[ID]] "*" ID
<= T[F[ID]] "*" F[ID]
<= T[T[F[ID]] "*" F[ID]]
<= E[T[T[F[ID]] "*" F[ID]]]
50. Reducing to Abstract Syntax Terms
grammar
sorts E T F ID
productions
E.P = E "+" T
E.E = T
T.M = T "*" F
T.T = F
F.B = "(" E ")"
F.V = ID
reduction
ID "*" ID
<= F "*" ID
<= T "*" ID
<= T "*" F
<= T
<= E
tree reduction
ID "*" ID
<= F[ID] "*" ID
<= T[F[ID]] "*" ID
<= T[F[ID]] "*" F[ID]
<= T[T[F[ID]] "*" F[ID]]
<= E[T[T[F[ID]] "*" F[ID]]]
term reduction
ID "*" ID
<= V(ID) "*" ID
<= T(V(ID)) "*" ID
<= T(V(ID)) "*" V(ID)
<= M(T(V(ID)), V(ID))
<= E(M(T(V(ID)), V(ID)))
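The term reduction above can be read as tree building: reducing by a labeled production applies that production's constructor to the reduced children. A small sketch, using tuples as stand-in abstract syntax terms (this encoding is an assumption, not from the slides):

```python
# Constructors named after the production labels in the grammar above.
def V(x): return ("V", x)          # F.V = ID
def T(x): return ("T", x)          # T.T = F
def M(x, y): return ("M", x, y)    # T.M = T "*" F
def E(x): return ("E", x)          # E.E = T

# ID "*" ID  <=*  E(M(T(V(ID)), V(ID)))
term = E(M(T(V("ID")), V("ID")))
```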
51. Handles
grammar
productions
A = b
derivation // right-most derivation
S =>* a A w => a b w
// Handle
// sentential form: string of non-terminal and terminal symbols
// that can be derived from start symbol
// S =>* a
// right sentential form: a sentential form derived by a right-most derivation
// handle: the part of a right sentential form that, if reduced,
// would produce the previous right sentential form in a right-most derivation
reduction
a b w <= a A w <=* S
53. Shift-Reduce Parsing Machine
shift-reduce parse
$ stack | input $ action
$ a | l w $ shift
=> $ a l | w $
// shift input symbol on the stack
$ a b | w $ reduce by A = b
=> $ a A | w $
// reduce n symbols of the stack to symbol A
$ S | $ accept
// reduced to start symbol; accept
$ a | w $ error
// no action possible in this state
54. Shift-Reduce Parsing
grammar
productions
E.P = E "+" T
E.E = T
T.M = T "*" F
T.T = F
F.B = "(" E ")"
F.V = ID
shift-reduce parse
$ | ID "*" ID $ shift
=> $ ID | "*" ID $ reduce by F = ID
=> $ F | "*" ID $ reduce by T = F
=> $ T | "*" ID $ shift
=> $ T "*" | ID $ shift
=> $ T "*" ID | $ reduce by F = ID
=> $ T "*" F | $ reduce by T = T "*" F
=> $ T | $ reduce by E = T
=> $ E | $ accept
Shift-reduce parsing constructs a right-most derivation in reverse
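The shift-reduce machine can be sketched as a small interpreter over a stack and remaining input. Here the action sequence is supplied by hand, mirroring the trace above; choosing between shift and reduce automatically is exactly what the LR construction below provides. The tuple encoding of labeled productions is an illustrative assumption:

```python
GRAMMAR = {
    "P": ("E", ["E", "+", "T"]),   # E.P = E "+" T
    "E": ("E", ["T"]),             # E.E = T
    "M": ("T", ["T", "*", "F"]),   # T.M = T "*" F
    "T": ("T", ["F"]),             # T.T = F
    "B": ("F", ["(", "E", ")"]),   # F.B = "(" E ")"
    "V": ("F", ["ID"]),            # F.V = ID
}

def run(tokens, actions):
    stack, rest = [], list(tokens)
    for act in actions:
        if act == "shift":
            stack.append(rest.pop(0))      # shift input symbol onto the stack
        else:                              # act is ("reduce", label)
            lhs, rhs = GRAMMAR[act[1]]
            assert stack[-len(rhs):] == rhs, "handle must be on top of stack"
            del stack[-len(rhs):]          # reduce |rhs| symbols to lhs
            stack.append(lhs)
    return stack, rest

# the shift-reduce parse of ID "*" ID from the slide
stack, rest = run(["ID", "*", "ID"],
                  ["shift", ("reduce", "V"), ("reduce", "T"), "shift", "shift",
                   ("reduce", "V"), ("reduce", "M"), ("reduce", "E")])
```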
55. Shift-Reduce Conflicts
grammar
sorts S E
productions
S.If = if E then S
S.IfE = if E then S else S
S = other
shift-reduce parse
$ | if E then S else S $ shift
=> $ if | E then S else S $ shift
=>* $ if E then S | else S $ shift
=> $ if E then S else | S $ shift
=> $ if E then S else S | $ reduce by S = if E then S else S
=> $ S | $ accept
shift-reduce parse
$ | if E then S else S $ shift
=> $ if | E then S else S $ shift
=>* $ if E then S | else S $ reduce by S = if E then S
=> $ S | else S $ error
56. Shift-Reduce Conflicts
grammar
sorts S E
productions
S.If = if E then S
S.IfE = if E then S else S
S = other
shift-reduce parse
$ | if E then if E then S else S $ ...
=>* $ if E then if E then S | else S $ shift
=> $ if E then if E then S else | S $ shift
=> $ if E then if E then S else S | $ reduce by S = if E then S else S
=> $ if E then S | $ reduce by S = if E then S
=> $ S | $ accept
shift-reduce parse
$ | if E then if E then S else S $ ...
=>* $ if E then if E then S | else S $ reduce by S = if E then S
=> $ if E then S | else S $ shift
=> $ if E then S else | S $ shift
=> $ if E then S else S | $ reduce by S = if E then S else S
=> $ S | $ accept
58. How can we make shift-reduce parsing deterministic?
grammar
productions
E.P = E "+" T
E.E = T
T.M = T "*" F
T.T = F
F.B = "(" E ")"
F.V = ID
shift-reduce parse
$ | ID "*" ID $ shift
=> $ ID | "*" ID $ reduce by F = ID
=> $ F | "*" ID $ reduce by T = F
=> $ T | "*" ID $ shift
=> $ T "*" | ID $ shift
=> $ T "*" ID | $ reduce by F = ID
=> $ T "*" F | $ reduce by T = T "*" F
=> $ T | $ reduce by E = T
=> $ E | $ accept
Is there a production in the grammar that matches the top of the stack?
How to choose between possible shift and reduce actions?
59. LR(k) Parsing
- L: left-to-right scanning
- R: constructing a right-most derivation
- k: the number of lookahead symbols
Motivation for LR
- LR grammars for many programming languages
- Most general non-backtracking shift-reduce parsing method
- Early detection of parse errors
- Class of grammars superset of predictive/LL methods
SLR: Simple LR
LR Parsing
60. Items
[E = . E "+" T]
[E = E . "+" T]
[E = E "+" . T]
[E = E "+" T .]
E = E "+" T
Production Items
Item indicates how far we have progressed in parsing a production
We expect to see an expression
We have seen an expression
and may see a “+”
61. Item Set
{
[F = "(" . E ")"]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
Item set used to keep track of where we are in a parse
grammar
productions
S = E
E = E "+" T
E = T
T = T "*" F
T = F
F = "(" E ")"
F = ID
62. Closure of an Item Set
SetOfItems Closure(I) {
J := I;
repeat
for(each [A = a . B b] in J)
for(each production B = c in G)
if([B = . c] is not in J)
J := Add([B = . c], J);
until Not(Changed(J));
return J;
}
{
[F = "(" . E ")"]
}
{
[F = "(" . E ")"]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
grammar
productions
S = E
E = E "+" T
E = T
T = T "*" F
T = F
F = "(" E ")"
F = ID
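Closure transcribes directly to Python if an item [A = a . B b] is represented as a (lhs, rhs, dot) triple; the grammar encoding is an illustrative assumption. Applied to { [F = "(" . E ")"] } it yields the seven-item set shown above:

```python
GRAMMAR = [
    ("S", ("E",)),
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("(", "E", ")")), ("F", ("ID",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def closure(items):
    result = set(items)
    changed = True
    while changed:                          # repeat ... until Not(Changed(J))
        changed = False
        for lhs, rhs, dot in list(result):
            if dot < len(rhs) and rhs[dot] in NONTERMINALS:
                for b_lhs, b_rhs in GRAMMAR:           # each B = c in G
                    if b_lhs == rhs[dot] and (b_lhs, b_rhs, 0) not in result:
                        result.add((b_lhs, b_rhs, 0))  # add [B = . c]
                        changed = True
    return frozenset(result)

I = closure({("F", ("(", "E", ")"), 1)})    # closure of { [F = "(" . E ")"] }
```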
63. Goto
SetOfItems Goto(I, X) {
J := {};
for(each [A = a . X b] in I)
J := Add([A = a X . b], J);
return Closure(J);
}
{
[F = "(" . E ")"]
}
{
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
grammar
productions
S = E
E = E "+" T
E = T
T = T "*" F
T = F
F = "(" E ")"
F = ID
{
[F = "(" . E ")"]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
64. Computing LR(0) Automaton
Items(G) {
C := {Closure({[Sp = . S]})};
repeat
for(each I in C)
for(each X in G) {
if(NotEmpty(Goto(I, X)) and Not(In(Goto(I, X), C)))
C := Add(Goto(I, X), C);
}
until Not(Changed(C));
return C;
}
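Goto and the canonical LR(0) collection can be sketched alongside Closure; for the expression grammar this produces exactly the twelve states (0-11) built up on the following slides. Item and grammar encodings are illustrative assumptions:

```python
GRAMMAR = [
    ("S", ("E",)),                       # augmented start production
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("(", "E", ")")), ("F", ("ID",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}
SYMBOLS = {s for _, rhs in GRAMMAR for s in rhs}

def closure(items):
    result, work = set(items), list(items)
    while work:
        lhs, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for b_lhs, b_rhs in GRAMMAR:
                item = (b_lhs, b_rhs, 0)
                if b_lhs == rhs[dot] and item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)

def goto(I, X):
    # move the dot over X in every item that expects X, then close
    return closure({(lhs, rhs, dot + 1)
                    for lhs, rhs, dot in I
                    if dot < len(rhs) and rhs[dot] == X})

def items():
    C = {closure({("S", ("E",), 0)})}    # state 0
    changed = True
    while changed:
        changed = False
        for I in list(C):
            for X in SYMBOLS:
                J = goto(I, X)
                if J and J not in C:
                    C.add(J)
                    changed = True
    return C

states = items()
```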
65. Computing LR(0) Automaton
grammar
productions
S = E
E = E "+" T
E = T
T = T "*" F
T = F
F = "(" E ")"
F = ID
state 0 {
[S = . E]
}
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
66.
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
67.
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
68.
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
69.
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
shift F to 3
state 3 {
[T = F . ]
}
reduce T = F
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
70.
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
shift F to 3
shift "(" to 4
state 4 {
[F = "(" . E ")"]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
state 3 {
[T = F . ]
}
reduce T = F
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
71.
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
shift F to 3
shift "(" to 4
state 4 {
[F = "(" . E ")"]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift T to 2
shift F to 3
shift "(" to 4
state 3 {
[T = F . ]
}
reduce T = F
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
72.
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
shift "+" to 6
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 4 {
[F = "(" . E ")"]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 5 {
[F = ID .]
}
reduce F = ID
state 3 {
[T = F . ]
}
reduce T = F
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
shift "*" to 7
73.
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
shift "+" to 6
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 4 {
[F = "(" . E ")"]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 5 {
[F = ID .]
}
reduce F = ID
state 6 {
[E = E "+" . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
state 3 {
[T = F . ]
}
reduce T = F
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
shift "*" to 7
74.
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
shift "+" to 6
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 4 {
[F = "(" . E ")"]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 5 {
[F = ID .]
}
reduce F = ID
state 6 {
[E = E "+" . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift F to 3
shift "(" to 4
shift ID to 5
state 3 {
[T = F . ]
}
reduce T = F
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
shift "*" to 7
75.
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
shift "+" to 6
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 4 {
[F = "(" . E ")"]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 5 {
[F = ID .]
}
reduce F = ID
state 6 {
[E = E "+" . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift F to 3
shift "(" to 4
shift ID to 5
state 7 {
[T = T "*" . F]
[F = . "(" E ")"]
[F = . ID]
}
shift ID to 5
state 3 {
[T = F . ]
}
reduce T = F
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
shift "*" to 7
76.
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
shift "+" to 6
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 4 {
[F = "(" . E ")"]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 8
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 5 {
[F = ID .]
}
reduce F = ID
state 6 {
[E = E "+" . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift F to 3
shift "(" to 4
shift ID to 5
state 7 {
[T = T "*" . F]
[F = . "(" E ")"]
[F = . ID]
}
shift ID to 5
state 8 {
[F = "(" E . ")"]
[E = E . "+" T]
}
state 3 {
[T = F . ]
}
reduce T = F
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
shift "*" to 7
77.
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
shift "+" to 6
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 4 {
[F = "(" . E ")"]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 8
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 5 {
[F = ID .]
}
reduce F = ID
state 6 {
[E = E "+" . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift T to 9
shift F to 3
shift "(" to 4
shift ID to 5
state 7 {
[T = T "*" . F]
[F = . "(" E ")"]
[F = . ID]
}
shift ID to 5
state 8 {
[F = "(" E . ")"]
[E = E . "+" T]
}
state 9 {
[E = E "+" T .]
[T = T . "*" F]
}
reduce E = E "+" T
shift "*" to 7
state 3 {
[T = F . ]
}
reduce T = F
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
shift "*" to 7
78.
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
shift "+" to 6
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 4 {
[F = "(" . E ")"]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 8
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 5 {
[F = ID .]
}
reduce F = ID
state 6 {
[E = E "+" . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift T to 9
shift F to 3
shift "(" to 4
shift ID to 5
state 7 {
[T = T "*" . F]
[F = . "(" E ")"]
[F = . ID]
}
shift F to 10
shift "(" to 4
shift ID to 5
state 8 {
[F = "(" E . ")"]
[E = E . "+" T]
}
shift ")" to 11
shift "+" to 6
state 9 {
[E = E "+" T .]
[T = T . "*" F]
}
reduce E = E "+" T
shift "*" to 7
state 10 {
[T = T "*" F .]
}
reduce T = T "*" F
state 3 {
[T = F . ]
}
reduce T = F
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
shift "*" to 7
79.
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
shift "+" to 6
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 4 {
[F = "(" . E ")"]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 8
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 5 {
[F = ID .]
}
reduce F = ID
state 6 {
[E = E "+" . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift T to 9
shift F to 3
shift "(" to 4
shift ID to 5
state 7 {
[T = T "*" . F]
[F = . "(" E ")"]
[F = . ID]
}
shift F to 10
shift "(" to 4
shift ID to 5
state 8 {
[F = "(" E . ")"]
[E = E . "+" T]
}
shift ")" to 11
shift "+" to 6
state 9 {
[E = E "+" T .]
[T = T . "*" F]
}
reduce E = E "+" T
shift "*" to 7
state 10 {
[T = T "*" F .]
}
reduce T = T "*" F
state 11 {
[F = "(" E ")" .]
}
reduce F = "(" E ")"
state 3 {
[T = F . ]
}
reduce T = F
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
shift "*" to 7
80. SLR Parse
$ | 0 | ID "*" ID $ shift ID to 5
=> $ ID | 0 5 | "*" ID $ reduce F = ID
=> $ | 0 | F "*" ID $ shift F to 3
=> $ F | 0 3 | "*" ID $ reduce T = F
=> $ | 0 | T "*" ID $ shift T to 2
=> $ T | 0 2 | "*" ID $ shift "*" to 7
=> $ T "*" | 0 2 7 | ID $ shift ID to 5
=> $ T "*" ID | 0 2 7 5 | $ reduce F = ID
=> $ T "*" | 0 2 7 | F $ shift F to 10
=> $ T "*" F | 0 2 7 10 | $ reduce T = T "*" F
=> $ | 0 | T $ shift T to 2
=> $ T | 0 2 | $ reduce E = T
=> $ | 0 | E $ shift E to 1
=> $ E | 0 1 | $ accept
grammar
productions
S = E
E = E "+" T
E = T
T = T "*" F
T = F
F = "(" E ")"
F = ID
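The same parse can be driven by explicit ACTION/GOTO tables derived from the automaton, with reduce entries placed on the FOLLOW set of each production's left-hand side (the SLR rule). A sketch; the table encoding and driver are illustrative assumptions:

```python
# productions used in reduces: label -> (lhs, length of rhs)
PROD = {
    "E = E + T": ("E", 3), "E = T": ("E", 1),
    "T = T * F": ("T", 3), "T = F": ("T", 1),
    "F = ( E )": ("F", 3), "F = ID": ("F", 1),
}

ACTION = {
    (0, "ID"): ("shift", 5), (0, "("): ("shift", 4),
    (1, "+"): ("shift", 6), (1, "$"): ("accept",),
    (2, "*"): ("shift", 7),
    (2, "+"): ("reduce", "E = T"), (2, ")"): ("reduce", "E = T"),
    (2, "$"): ("reduce", "E = T"),
    (3, "+"): ("reduce", "T = F"), (3, "*"): ("reduce", "T = F"),
    (3, ")"): ("reduce", "T = F"), (3, "$"): ("reduce", "T = F"),
    (4, "ID"): ("shift", 5), (4, "("): ("shift", 4),
    (5, "+"): ("reduce", "F = ID"), (5, "*"): ("reduce", "F = ID"),
    (5, ")"): ("reduce", "F = ID"), (5, "$"): ("reduce", "F = ID"),
    (6, "ID"): ("shift", 5), (6, "("): ("shift", 4),
    (7, "ID"): ("shift", 5), (7, "("): ("shift", 4),
    (8, ")"): ("shift", 11), (8, "+"): ("shift", 6),
    (9, "*"): ("shift", 7),
    (9, "+"): ("reduce", "E = E + T"), (9, ")"): ("reduce", "E = E + T"),
    (9, "$"): ("reduce", "E = E + T"),
    (10, "+"): ("reduce", "T = T * F"), (10, "*"): ("reduce", "T = T * F"),
    (10, ")"): ("reduce", "T = T * F"), (10, "$"): ("reduce", "T = T * F"),
    (11, "+"): ("reduce", "F = ( E )"), (11, "*"): ("reduce", "F = ( E )"),
    (11, ")"): ("reduce", "F = ( E )"), (11, "$"): ("reduce", "F = ( E )"),
}
GOTO = {(0, "E"): 1, (0, "T"): 2, (0, "F"): 3,
        (4, "E"): 8, (4, "T"): 2, (4, "F"): 3,
        (6, "T"): 9, (6, "F"): 3, (7, "F"): 10}

def slr_parse(tokens):
    stack = [0]                            # stack of states
    tokens = tokens + ["$"]
    i = 0
    while True:
        act = ACTION.get((stack[-1], tokens[i]))
        if act is None:
            raise SyntaxError(f"unexpected {tokens[i]!r}")
        if act[0] == "accept":
            return True
        if act[0] == "shift":
            stack.append(act[1]); i += 1
        else:                              # reduce by PROD[act[1]]
            lhs, n = PROD[act[1]]
            del stack[-n:]
            stack.append(GOTO[(stack[-1], lhs)])
```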
81.
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 3 {
[T = F . ]
}
reduce T = F
state 5 {
[F = ID .]
}
reduce F = ID
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
shift "*" to 7
state 7 {
[T = T "*" . F]
[F = . "(" E ")"]
[F = . ID]
}
shift F to 10
shift "(" to 4
shift ID to 5
state 10 {
[T = T "*" F .]
}
reduce T = T "*" F
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
shift "*" to 7
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
shift "+" to 6
state 5 {
[F = ID .]
}
reduce F = ID
$ | 0 | ID "*" ID $ shift ID to 5
=> $ ID | 0 5 | "*" ID $ reduce F = ID
=> $ | 0 | F "*" ID $ shift F to 3
=> $ F | 0 3 | "*" ID $ reduce T = F
=> $ | 0 | T "*" ID $ shift T to 2
=> $ T | 0 2 | "*" ID $ shift "*" to 7
=> $ T "*" | 0 2 7 | ID $ shift ID to 5
=> $ T "*" ID | 0 2 7 5 | $ reduce F = ID
=> $ T "*" | 0 2 7 | F $ shift F to 10
=> $ T "*" F | 0 2 7 10 | $ reduce T = T "*" F
=> $ | 0 | T $ shift T to 2
=> $ T | 0 2 | $ reduce E = T
=> $ | 0 | E $ shift E to 1
=> $ E | 0 1 | $ accept
82.
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 3 {
[T = F . ]
}
reduce T = F
state 5 {
[F = ID .]
}
reduce F = ID
state 7 {
[T = T "*" . F]
[F = . "(" E ")"]
[F = . ID]
}
shift F to 10
shift "(" to 4
shift ID to 5
state 10 {
[T = T "*" F .]
}
reduce T = T "*" F
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
shift "+" to 6
state 5 {
[F = ID .]
}
reduce F = ID
Parsing: ID "+" ID "*" ID .
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
shift "+" to 6
state 6 {
[E = E "+" . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift T to 9
shift F to 3
shift "(" to 4
shift ID to 5
state 9 {
[E = E "+" T .]
[T = T . "*" F]
}
reduce E = E "+" T
shift "*" to 7
// figure: parse tree for ID "+" ID "*" ID built bottom-up over the automaton states
87. Context-free grammars
- Productions define how to generate sentences of language
- Derivation: generate sentence from (start) symbol
- Reduction: reduce sentence to (start) symbol
Parse tree
- Represents structure of derivation
- Abstracts from derivation order
Parser
- Algorithm to reconstruct derivation
Parsing
88. First/Follow
- Selecting between actions in LR parse table
Other algorithms
- Top-down: LL(k) table
- Generalized parsing: Earley, Generalized-LR
- Scannerless parsing: characters as tokens
Disambiguation
- Semantics of declarative disambiguation
- Deep priority conflicts
More Topics in Syntax and Parsing
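For reference, FIRST and FOLLOW sets for the expression grammar can be computed by a simple fixpoint; FOLLOW is what determines on which lookahead symbols SLR places reduce actions. A sketch (representations are assumptions; the grammar has no empty productions, which simplifies FIRST):

```python
GRAMMAR = [
    ("S", ("E",)),
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("(", "E", ")")), ("F", ("ID",)),
]
NT = {lhs for lhs, _ in GRAMMAR}

def first_sets():
    first = {A: set() for A in NT}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            # no empty productions, so FIRST of the first RHS symbol suffices
            s = rhs[0]
            add = first[s] if s in NT else {s}
            if not add <= first[lhs]:
                first[lhs] |= add
                changed = True
    return first

def follow_sets(first):
    follow = {A: set() for A in NT}
    follow["S"].add("$")                  # end-of-input follows the start symbol
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            for i, s in enumerate(rhs):
                if s not in NT:
                    continue
                if i + 1 < len(rhs):      # symbol after s contributes its FIRST
                    nxt = rhs[i + 1]
                    add = first[nxt] if nxt in NT else {nxt}
                else:                     # s at the end: FOLLOW(lhs) flows in
                    add = follow[lhs]
                if not add <= follow[s]:
                    follow[s] |= add
                    changed = True
    return follow

first = first_sets()
follow = follow_sets(first)
```

FOLLOW(E) = { $, "+", ")" } explains why state 2 above reduces by E = T on those symbols but shifts on "*".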