Things to Do
• Start brushing up your C++ skills
• Install VS Code and learn how to develop C++ code with it
• Set up the GCC/g++ compiler for practice. It will be used
throughout this course for assignments and the final project.
Course Overview
Welcome to the course on Compiler Construction Using C++!
• This course is designed for an intermediate learner with a solid
grasp of C++, data structures, algorithms and finite automata.
• During this course, you’ll build a functional compiler from scratch,
understanding its core components and implementing them in C++.
• We’ll balance theory, hands-on coding, and practical exercises to
ensure you master the construction of a compiler.
• The course is structured into 14 sessions, each with bite-sized
lessons, real-world examples, checkpoints, and optional challenges
to deepen your understanding.
Course Goals
• Understand the architecture and phases of a compiler.
• Implement a compiler for a simple language using C++.
• Apply compiler construction concepts to real-world
problems.
• Develop problem-solving skills through practical coding and
debugging.
Course Curriculum
1. Introduction to Compilers and Lexical Analysis
• Overview of compilers, their phases, and real-world applications.
• Focus on lexical analysis: tokenizing source code.
• Tools: C++ Standard Library, Flex (lexer generator).
2. Syntax Analysis and Parsing
• Syntax analysis: understanding grammars and building parsers.
• Implement recursive descent and table-driven parsers in C++.
• Tools: Bison (parser generator), C++ data structures.
3. Semantic Analysis and Intermediate Code Generation
• Semantic analysis: type checking and symbol tables.
• Generating intermediate representations (IR) like three-address code.
• Tools: C++ containers, LLVM IR (optional exploration).
4. Code Optimization and Code Generation
• Optimize intermediate code for performance.
• Generate target machine code or assembly.
• Tools: C++ for optimization algorithms, basic assembly exploration.
5. Integration and Testing
• Integrate all compiler phases into a working system.
• Test and debug the compiler with sample programs.
• Tools: GDB, unit testing frameworks (e.g., Google Test).
Course Curriculum
• Quizzes: 7-2 (best 6, each worth 2.5 marks, i.e., 2.5 × 6 = 15 Marks)
• Assignment/Exercise: 10 Marks
• Final Project: 15 Marks
• Mid-Term Exam: 20 Marks
• Final Exam: 40 Marks
Compilers
• What is a Compiler?
• A program that translates an executable program in one language into
an executable program in another language.
• Compilers translate high-level code to machine code (e.g., C++ to x86
assembly).
• We expect the program produced by the compiler to be better, in some
way, than the original.
• What is an Interpreter?
• A program that reads an executable program and produces the results
of running that program.
• Usually, this involves executing the source program in some fashion.
Importance of Studying Compiler Construction
Compiler construction integrates various aspects of computer science,
including algorithms, systems, and architecture, making it a rich field of study.
It provides insights into how programming languages are translated into
machine code, which is essential for understanding computer operations.
• Compiler construction encompasses multiple disciplines: artificial
intelligence, algorithms, theory, systems, and architecture.
• It serves as a microcosm* of computer science, illustrating the interplay
between different concepts.
• Understanding compilers is crucial for grasping how computers execute
programs.
* Encapsulating in miniature the characteristics of something much larger.
Building Compilers – Real-world Context
• Compilers power everything from operating systems (compiled
with GCC) to machine learning frameworks (TensorFlow’s XLA
compiler).
• Your course will have you build a compiler for a simple language,
giving you hands-on experience with these concepts using C++.
• Performance Optimization
• Cross-Platform Compatibility
• Machine Learning
Building Compilers – Interest
Compiler construction is a microcosm* of computer science.
Inside a compiler, all these things come together:
• Artificial Intelligence: greedy algorithms, learning algorithms
• Algorithms: graph algorithms, union-find, dynamic programming
• Theory: DFAs for scanning, parser generators, lattice theory for analysis
• Systems: allocation and naming, locality, synchronization
• Architecture: pipeline management, hierarchy management, instruction set use
* Encapsulating in miniature the characteristics of something much larger.
Intrinsic Value of Compiler Construction
Compiler construction is both challenging and enjoyable,
presenting interesting problems that significantly impact
performance and computer usage. The complexity of compilers
leads to real-world results and insights into computer operations.
• Building compilers is a complex task that involves intricate
interactions and problem-solving.
• Compilers play a crucial role in determining the performance of
programs.
• The field poses some of the most interesting problems in
computing, making it a rewarding area of study.
Qualities of Effective Compilers
Effective compilers must exhibit several important qualities to
ensure they produce high-quality code and facilitate the
development process. These qualities shape the curriculum and
focus of compiler construction courses.
• Correctness of generated code is paramount.
• The speed of both the generated code and the compiler itself is
critical for efficiency.
• Good diagnostics for errors and compatibility with debuggers
enhance usability.
• Support for separate compilation and cross-language calls is
essential for modern programming needs.
Abstract Representation in Compiler Design
Compilers utilize an abstract representation of code,
transitioning from source code to intermediate
representation (IR) and finally to machine code.
This process simplifies the retargeting of compilers
and allows for multiple front ends.
• The compiler translates source code into an
intermediate representation (IR) before generating
machine code.
• The front end of the compiler recognizes legal
programs and generates IR.
• The back end maps IR to the target machine,
facilitating easier retargeting and optimization.
[Figure: Source Code → Compiler → Machine Code, with Errors as a second output of the compiler]
• Recognize legal (and illegal) programs
• Generate correct code
• Manage storage of all variables and code
• Agree on the format of object (or assembly) code
Compiler Phases and Their Functions
The compilation process consists of several distinct phases, each with specific
responsibilities that contribute to the overall function of the compiler.
Understanding these phases is crucial for grasping how compilers operate.
Analysis Part:
• Lexical analysis breaks the source file into tokens.
• Parsing analyses the structure of the program and builds abstract syntax trees.
• Semantic analysis checks variable definitions and types.
• Intermediate representation (IR) is produced for further translation into machine
code.
Synthesis Part:
• Optimization phases improve the efficiency of the generated code.
• Final Code Generation.
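To make the intermediate representation step more concrete, here is a minimal sketch of lowering an expression tree to three-address code. The names Expr, leaf, binary, emit and lowerAssignment are hypothetical and chosen only for illustration; a real compiler's AST and IR would carry types, source locations and much more.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical expression node: a leaf (variable name) or a binary operation.
struct Expr {
    std::string name;                // used only when the node is a leaf
    char op = 0;                     // '+', '*', ...; 0 marks a leaf
    std::unique_ptr<Expr> lhs, rhs;  // children of a binary operation
};

std::unique_ptr<Expr> leaf(std::string n) {
    auto e = std::make_unique<Expr>();
    e->name = std::move(n);
    return e;
}

std::unique_ptr<Expr> binary(char op, std::unique_ptr<Expr> l, std::unique_ptr<Expr> r) {
    auto e = std::make_unique<Expr>();
    e->op = op;
    e->lhs = std::move(l);
    e->rhs = std::move(r);
    return e;
}

// Emits one instruction per operator and returns the name that holds the result.
std::string emit(const Expr& e, std::vector<std::string>& code, int& nextTemp) {
    if (e.op == 0) return e.name;  // a leaf needs no instruction
    std::string l = emit(*e.lhs, code, nextTemp);
    std::string r = emit(*e.rhs, code, nextTemp);
    std::string t = "t" + std::to_string(nextTemp++);
    code.push_back(t + " = " + l + " " + e.op + " " + r);
    return t;
}

// Lowers the assignment `target = expr` to a list of three-address instructions.
std::vector<std::string> lowerAssignment(const std::string& target, const Expr& e) {
    std::vector<std::string> code;
    int nextTemp = 1;
    std::string result = emit(e, code, nextTemp);
    code.push_back(target + " = " + result);
    return code;
}
```

For the tree of a = b + c * d, built as binary('+', leaf("b"), binary('*', leaf("c"), leaf("d"))), lowerAssignment("a", *tree) yields the three instructions t1 = c * d, t2 = b + t1 and a = t2. Each instruction has at most one operator, which is what makes this form convenient for the later optimization and code generation phases.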
Lexical Analysis (Scanner)
• Lexical analysis, or scanning, is the process where:
• The source program is read from left to right and grouped into
tokens.
• Tokens are sequences of characters with a collective meaning.
• In any programming language, tokens may be constants,
operators, reserved words, punctuation, etc.
What do lexers look like?
Lexers are specified as regex-to-token mappings
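One possible way to express a regex-to-token mapping directly in C++ is a table of std::regex rules tried in order at the current position. This is only a sketch under assumptions: the Token and Rule structs, the tokenize function and the string token kinds are hypothetical, and the rule set is a toy one. Generated lexers (e.g., from Flex) usually compile the rules into a single DFA and pick the longest match, whereas this sketch simply takes the first rule that matches.

```cpp
#include <regex>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical token: a kind such as "IF" or "IDENT" plus the matched text.
struct Token {
    std::string kind;
    std::string text;
};

// One regex-to-token rule. An empty kind means "match and skip".
struct Rule {
    std::regex pattern;
    std::string kind;
};

std::vector<Token> tokenize(const std::string& src) {
    // Rules are tried top to bottom; keywords come before IDENT and use \b
    // so that a word like "iffy" is not split into "if" + "fy".
    static const std::vector<Rule> rules = {
        {std::regex(R"([ \t\n]+)"), ""},
        {std::regex(R"(if\b)"), "IF"},
        {std::regex(R"(then\b)"), "THEN"},
        {std::regex(R"(else\b)"), "ELSE"},
        {std::regex(R"([a-zA-Z]+)"), "IDENT"},
        {std::regex(R"([0-9]+)"), "INT"},
        {std::regex(R"(=)"), "EQUAL"},
    };

    std::vector<Token> out;
    auto pos = src.cbegin();
    while (pos != src.cend()) {
        bool matched = false;
        for (const Rule& rule : rules) {
            std::smatch m;
            // match_continuous anchors the match at the current position.
            if (std::regex_search(pos, src.cend(), m, rule.pattern,
                                  std::regex_constants::match_continuous)) {
                if (!rule.kind.empty()) out.push_back({rule.kind, m.str()});
                pos = m[0].second;  // continue right after the matched text
                matched = true;
                break;
            }
        }
        if (!matched) {
            throw std::runtime_error(std::string("unexpected character: ") + *pos);
        }
    }
    return out;
}
```

Calling tokenize("if a = 3\nthen b else c") returns eight tokens, starting with IF, IDENT "a", EQUAL, INT "3". Keeping the rules in a table keeps the specification declarative: adding a token kind means adding one line rather than changing the scanning loop.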
Checkpoint #1
• Write a C++ program to read a text file and print each character with its ASCII value.
• What are the main phases of a compiler?
• Why is lexical analysis the first step?
• How does a compiler differ from an interpreter (e.g., C++ vs. Python)?
• Home Assignment
• Learn
• C++ <vector> and how to use it
• How to add items to and remove items from a vector (e.g., push_back and pop_back)
• Practice writing regular expressions in C++
• Modify the program to count whitespace, digits, and letters, and store the counts as key-value pairs
in a vector
Editor's Notes
#10 Why Learning Compiler Construction is Valuable
1. Enables Creation of New Languages and Tools
Design Domain-Specific Languages (DSLs) for areas like finance, AI workflows, or parallel computing.
Build custom tooling such as static analyzers, interpreters, or code generators.
2. Deepens Systems-Level Understanding
Get hands-on with lexing, parsing, and semantic analysis, bridging high-level code with low-level execution.
Understand how assembly instructions, memory models, and CPU pipelines impact performance.
3. Strengthens Problem-Solving Skills
Apply advanced algorithms (e.g., graph coloring for register allocation, dynamic programming in parsing).
Use sophisticated data structures (e.g., abstract syntax trees, symbol tables, intermediate representations).
4. Improves Debugging and Optimization
Gain insight into compiler error reporting and why certain C++ bugs occur.
Learn how optimizers work (constant folding, loop unrolling, inlining) and apply similar techniques to your own projects.
5. Enhances Career and Academic Opportunities
Compilers are core to systems programming, embedded software, AI accelerators, and GPU programming.
Organizations like NVIDIA, Apple, Intel, and Google actively seek engineers with compiler expertise.
Strong foundation for graduate studies and research in programming languages, formal methods, or high-performance computing.
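As an illustration of one optimization named above, constant folding, here is a minimal sketch that evaluates operations on constant operands at compile time. The Node struct and the helpers constant, variable, binary and fold are hypothetical names for this example only. The sketch handles just +, - and * and assumes the values are small enough that int overflow is not a concern.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical AST node: a constant leaf, a variable leaf, or a binary operation.
struct Node {
    char op = 0;           // '+', '-', '*'; 0 marks a leaf
    bool isConst = false;  // true for a constant leaf
    int value = 0;         // valid when isConst is true
    std::string name;      // valid for a variable leaf
    std::unique_ptr<Node> lhs, rhs;
};

std::unique_ptr<Node> constant(int v) {
    auto n = std::make_unique<Node>();
    n->isConst = true;
    n->value = v;
    return n;
}

std::unique_ptr<Node> variable(std::string name) {
    auto n = std::make_unique<Node>();
    n->name = std::move(name);
    return n;
}

std::unique_ptr<Node> binary(char op, std::unique_ptr<Node> l, std::unique_ptr<Node> r) {
    auto n = std::make_unique<Node>();
    n->op = op;
    n->lhs = std::move(l);
    n->rhs = std::move(r);
    return n;
}

// Folds bottom-up: children first, then this node if both children are constants.
std::unique_ptr<Node> fold(std::unique_ptr<Node> n) {
    if (n->op == 0) return n;  // leaves are already as simple as possible
    n->lhs = fold(std::move(n->lhs));
    n->rhs = fold(std::move(n->rhs));
    if (n->lhs->isConst && n->rhs->isConst) {
        int a = n->lhs->value;
        int b = n->rhs->value;
        switch (n->op) {
            case '+': return constant(a + b);
            case '-': return constant(a - b);
            case '*': return constant(a * b);
            default:  break;  // unknown operator: leave the node unchanged
        }
    }
    return n;
}
```

Folding x + (2 * 3) leaves the addition in place but replaces its right operand with the constant 6, while (2 + 3) * 4 collapses to the single constant 20.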
#11 Real-World Applications & Importance
Software Development:
Compilers are fundamental to creating executable programs, enabling the use of high-level programming languages that are more efficient for developers.
Error Detection:
By analyzing the source code, compilers can identify syntax and semantic errors early in the development process, saving time and preventing crashes.
Performance Optimization:
Compiler designers strive to generate code that runs as efficiently as possible, reducing execution time and resource consumption.
Cross-Platform Compatibility:
Compilers can ensure that software can run on different hardware architectures or operating systems through features like cross-compilation.
Machine Learning:
Compiler design plays a crucial role in modern machine learning by optimizing model performance and enabling deployment across diverse hardware.
#18 The main purposes of a lexical analyzer are:
* It reads and analyzes the source code.
* It removes the comments and whitespace present in the source.
* It groups the characters into tokens for easy access by later phases.
* It begins to fill in the symbol table.
#19 Lexing converts a sequence of characters into tokens.
Example: Characters "i f a = 3 \n t h e n b e l s e c" become tokens: [if, ident "a", equal, int "3", then, ident "b", else, ident "c"].
Visual: A table showing characters above and tokens below. Explanation: Whitespace is ignored, and patterns are matched to categories. This is why your C++ lexer in Session 3 skips spaces and identifies types.
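The example above can be sketched as a small hand-written scanner that walks the characters once, skips whitespace and groups letters and digits into tokens. The Kind enum, the Token struct and the lex function are hypothetical names for illustration; the lexer built later in the course may be organized differently.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical token categories for the toy language in the example.
enum class Kind { If, Then, Else, Ident, Int, Equal };

struct Token {
    Kind kind;
    std::string text;  // the characters that formed the token
};

std::vector<Token> lex(const std::string& src) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {         // whitespace separates tokens and is dropped
            ++i;
        } else if (std::isalpha(c)) {  // a run of letters: keyword or identifier
            std::size_t start = i;
            while (i < src.size() && std::isalpha(static_cast<unsigned char>(src[i]))) ++i;
            std::string word = src.substr(start, i - start);
            Kind kind = Kind::Ident;
            if (word == "if") kind = Kind::If;
            else if (word == "then") kind = Kind::Then;
            else if (word == "else") kind = Kind::Else;
            out.push_back({kind, word});
        } else if (std::isdigit(c)) {  // a run of digits: integer literal
            std::size_t start = i;
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
            out.push_back({Kind::Int, src.substr(start, i - start)});
        } else if (c == '=') {
            out.push_back({Kind::Equal, "="});
            ++i;
        } else {
            throw std::runtime_error(std::string("unexpected character: ") + src[i]);
        }
    }
    return out;
}
```

For the input "if a = 3\nthen b else c", lex returns the eight tokens listed in the example: if, ident "a", equal, int "3", then, ident "b", else, ident "c".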
#20 Lexers are specified as regex-to-token mappings.
Examples: "if" → IF, "then" → THEN, "[a-zA-Z]+ as s" → IDENT s, "[0-9]+ as i" → INT i, "[ \t\n]" → skip.
Token datatype: type token = INT of int | IDENT of string | EQUAL | IF | THEN | ELSE | ...
Question posed: How do we turn this declarative spec into a program? This is the core challenge we'll solve. In your course, this is like defining your Token struct in C++ and using patterns to match.
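The token datatype above is written in an ML-style notation. One way to mirror it in C++ (assuming C++17 for std::variant) is sketched below: each alternative becomes a small struct, and only INT and IDENT carry a payload. The struct names and the describe helper are hypothetical, and a plain enum plus a text field is an equally reasonable design.

```cpp
#include <string>
#include <variant>

// One struct per alternative of the token datatype.
struct Int   { int value; };         // INT of int
struct Ident { std::string name; };  // IDENT of string
struct Equal {};
struct If    {};
struct Then  {};
struct Else  {};

// A token is exactly one of the alternatives.
using Token = std::variant<Int, Ident, Equal, If, Then, Else>;

// Renders a token in the notation used on the slide, e.g. "INT 3".
std::string describe(const Token& t) {
    if (const Int* i = std::get_if<Int>(&t)) return "INT " + std::to_string(i->value);
    if (const Ident* s = std::get_if<Ident>(&t)) return "IDENT " + s->name;
    if (std::holds_alternative<Equal>(t)) return "EQUAL";
    if (std::holds_alternative<If>(t)) return "IF";
    if (std::holds_alternative<Then>(t)) return "THEN";
    return "ELSE";  // the only remaining alternative
}
```

For example, describe(Token{Int{3}}) gives "INT 3" and describe(Token{Ident{"a"}}) gives "IDENT a". The variant makes it impossible to read an integer payload from a token that does not have one, which is the same guarantee the ML-style datatype provides.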