Compiler Construction
Sajjad Raza Abidi
0321-3783573
sajjadabidi@gmail.com
Fall 2025
Things to Do
• Start brushing up your skills in C++
• Install VS Code and learn how to develop C++ code with it
• Set up the GCC/g++ compiler for practice. It will be
used throughout this course for assignments and the final project.
Course Overview
Welcome to the course on Compiler Construction Using C++!
• This course is designed for an intermediate learner with a solid
grasp of C++, data structures, algorithms and finite automata.
• During this course, you’ll build a functional compiler from scratch,
understanding its core components and implementing them in C++.
• We’ll balance theory, hands-on coding, and practical exercises so
that you master the construction of a compiler.
• The course is structured into 14 sessions, each with bite-sized
lessons, real-world examples, checkpoints, and optional challenges
to deepen your understanding.
Course Goals
• Understand the architecture and phases of a compiler.
• Implement a compiler for a simple language using C++.
• Apply compiler construction concepts to real-world
problems.
• Develop problem-solving skills through practical coding and
debugging.
Course Curriculum
1. Introduction to Compilers and Lexical Analysis
• Overview of compilers, their phases, and real-world applications.
• Focus on lexical analysis: tokenizing source code.
• Tools: C++ Standard Library, Flex (lexer generator).
2. Syntax Analysis and Parsing
• Syntax analysis: understanding grammars and building parsers.
• Implement recursive descent and table-driven parsers in C++.
• Tools: Bison (parser generator), C++ data structures.
3. Semantic Analysis and Intermediate Code Generation
• Semantic analysis: type checking and symbol tables.
• Generating intermediate representations (IR) like three-address code.
• Tools: C++ containers, LLVM IR (optional exploration).
4. Code Optimization and Code Generation
• Optimize intermediate code for performance.
• Generate target machine code or assembly.
• Tools: C++ for optimization algorithms, basic assembly exploration.
5. Integration and Testing
• Integrate all compiler phases into a working system.
• Test and debug the compiler with sample programs.
• Tools: GDB, unit testing frameworks (e.g., Google Test).
Course Assessment
• Quizzes: 7-2 (best 6 count, 2.5 marks each, i.e. 2.5 × 6 = 15 marks)
• Assignment/Exercise : 10 Marks
• Final Project: 15 Marks
• Mid-Term Exam: 20 Marks
• Final Exam: 40 Marks
Compilers
• What is a compiler?
• A program that translates an executable program in one language into
an executable program in another language.
• Compilers typically translate high-level code to low-level code (e.g., C++ to x86
assembly or machine code).
• We expect the program produced by the compiler to be better in some way
than the original.
• What is an interpreter?
• A program that reads an executable program and produces the results
of running that program.
• Usually, this involves executing the source program in some fashion.
Importance of Studying Compiler Construction
Compiler construction integrates various aspects of computer science,
including algorithms, systems, and architecture, making it a rich field of study.
It provides insights into how programming languages are translated into
machine code, which is essential for understanding computer operations.
• Compiler construction encompasses multiple disciplines: artificial
intelligence, algorithms, theory, systems, and architecture.
• It serves as a microcosm* of computer science, illustrating the interplay
between different concepts.
• Understanding compilers is crucial for grasping how computers execute
programs.
* Encapsulating in miniature the characteristics of something much larger.
Building Compilers
Building compilers is a powerful endeavour with practical,
intellectual, and technical benefits.
Building Compilers – Real-world Context
• Compilers power everything from operating systems (compiled
with GCC) to machine learning frameworks (TensorFlow’s XLA
compiler).
• In this course you will build a compiler for a simple language,
giving you hands-on experience with these concepts in C++.
• Performance Optimization
• Cross-Platform Compatibility
• Machine Learning
Building Compilers – Interest
Compiler construction is a microcosm* of computer science.
Inside a compiler, all these things come together:
• Artificial Intelligence: greedy algorithms, learning algorithms
• Algorithms: graph algorithms, union-find, dynamic programming
• Theory: DFAs for scanning, parser generators, lattice theory for analysis
• Systems: allocation and naming, locality, synchronization
• Architecture: pipeline management, hierarchy management, instruction set use
* Encapsulating in miniature the characteristics of something much larger.
Intrinsic Value of Compiler Construction
Compiler construction is both challenging and enjoyable,
presenting interesting problems that significantly impact
performance and computer usage. The complexity of compilers
leads to real-world results and insights into computer operations.
• Building compilers is a complex task that involves intricate
interactions and problem-solving.
• Compilers play a crucial role in determining the performance of
programs.
• The field poses some of the most interesting problems in
computing, making it a rewarding area of study.
Qualities of Effective Compilers
Effective compilers must exhibit several important qualities to
ensure they produce high-quality code and facilitate the
development process. These qualities shape the curriculum and
focus of compiler construction courses.
• Correctness of generated code is paramount.
• The speed of both the generated code and the compiler itself is
critical for efficiency.
• Good diagnostics for errors and compatibility with debuggers
enhance usability.
• Support for separate compilation and cross-language calls is
essential for modern programming needs.
Abstract Representation in Compiler Design
Compilers utilize an abstract representation of code,
transitioning from source code to intermediate
representation (IR) and finally to machine code.
This process simplifies the retargeting of compilers
and allows for multiple front ends.
• The compiler translates source code into an
intermediate representation (IR) before generating
machine code.
• The front end of the compiler recognizes legal
programs and generates IR.
• The back end maps IR to the target machine,
facilitating easier retargeting and optimization.
[Figure: Source Code → Compiler → Machine Code, with Errors reported by the compiler]
A compiler must:
• Recognize legal (and illegal) programs
• Generate correct code
• Manage storage of all variables and code
• Agree on the format of object (or assembly) code
Compiler Phases and Their Functions
The compilation process consists of several distinct phases, each with specific
responsibilities that contribute to the overall function of the compiler.
Understanding these phases is crucial for grasping how compilers operate.
Analysis Part:
• Lexical analysis breaks the source file into tokens.
• Parsing analyses the structure of the program and builds abstract syntax trees.
• Semantic analysis checks variable definitions and types.
• Intermediate representation (IR) is produced for further translation into machine
code.
Synthesis Part:
• Optimization phases improve the efficiency of the generated code.
• Final Code Generation.
Phases of a Compiler
Lexical Analysis (Scanner)
• Lexical analysis, or scanning, is the process where:
• The source program is read from left to right and grouped into
tokens.
• Tokens are sequences of characters with a collective meaning.
• In a programming language, tokens may be constants,
operators, reserved words, punctuation, etc.
What is lexing? Example
if a = 3
then b else
c
What do lexers look like?
Lexers are specified as regex-to-token mappings
Check Point # 1
• Write a C++ program to read a text file and print each character with its ASCII value.
• What are the main phases of a compiler?
• Why is lexical analysis the first step?
• How does a compiler differ from an interpreter (e.g., Python vs. C++)?
• Home Assignment
• Learn
• C++ <vector> and how to use it
• How to push items onto and pop items off a vector
• Practice writing regular expressions in C++
• Modify the program to count whitespace, digits, and letters, and store the counts as key-value pairs
in a vector


Editor's Notes

  • #10 Why Learning Compiler Construction is Valuable
    1. Enables Creation of New Languages and Tools
       • Design Domain-Specific Languages (DSLs) for areas like finance, AI workflows, or parallel computing.
       • Build custom tooling such as static analyzers, interpreters, or code generators.
    2. Deepens Systems-Level Understanding
       • Get hands-on with lexing, parsing, and semantic analysis, bridging high-level code with low-level execution.
       • Understand how assembly instructions, memory models, and CPU pipelines impact performance.
    3. Strengthens Problem-Solving Skills
       • Apply advanced algorithms (e.g., graph coloring for register allocation, dynamic programming in parsing).
       • Use sophisticated data structures (e.g., abstract syntax trees, symbol tables, intermediate representations).
    4. Improves Debugging and Optimization
       • Gain insight into compiler error reporting and why certain C++ bugs occur.
       • Learn how optimizers work (constant folding, loop unrolling, inlining) and apply similar techniques to your own projects.
    5. Enhances Career and Academic Opportunities
       • Compilers are core to systems programming, embedded software, AI accelerators, and GPU programming.
       • Organizations like NVIDIA, Apple, Intel, and Google actively seek engineers with compiler expertise.
       • Strong foundation for graduate studies and research in programming languages, formal methods, or high-performance computing.
  • #11 Real-World Applications & Importance
    • Software Development: Compilers are fundamental to creating executable programs, enabling the use of high-level programming languages that are more efficient for developers.
    • Error Detection: By analyzing the source code, compilers can identify syntax and semantic errors early in the development process, saving time and preventing crashes.
    • Performance Optimization: Compiler designers strive to generate code that runs as efficiently as possible, reducing execution time and resource consumption.
    • Cross-Platform Compatibility: Compilers can ensure that software can run on different hardware architectures or operating systems through features like cross-compilation.
    • Machine Learning: Compiler design plays a crucial role in modern machine learning by optimizing model performance and enabling deployment across diverse hardware.
  • #18 The main purposes of the lexical analyzer are:
    • It analyzes the source code.
    • It removes the comments and whitespace present in the input.
    • It groups the input into tokens for easy access by later phases.
    • It begins to fill in the symbol table.
  • #19 Lexing converts a sequence of characters into tokens.
    Example: Characters "i f a = 3 \n t h e n b e l s e c" become tokens: [if, ident "a", equal, int "3", then, ident "b", else, ident "c"].
    Visual: A table showing characters above and tokens below.
    Explanation: Whitespace is ignored, and patterns are matched to categories. This is why your C++ lexer in Session 3 skips spaces and identifies types.
  • #20 Lexers are specified as regex-to-token mappings.
    Examples:
      "if" → IF
      "then" → THEN
      "[a-zA-Z]+ as s" → IDENT s
      "[0-9]+ as i" → INT i
      "[ \t\n]" → skip
    Token datatype: type token = INT of int | IDENT of string | EQUAL | IF | THEN | ELSE | ...
    Question posed: How do we turn this declarative spec into a program? This is the core challenge we'll solve.
    In your course, this is like defining your Token struct in C++ and using patterns to match.