4. What is a Compiler?
A compiler translate texts from one language to other
language.
Programming
Language
(Source)
Compiler
Machine
Language
(Target)
5. A compiler acts as a translator,
transforming human-oriented programming languages
into computer-oriented machine languages.
Ignore machine-dependent details for programmer
What Do Compilers Do ?
Compilers may generate three types of code:
Pure Machine Code
Machine instruction set without assuming the existence of any operating system or library.
Mostly being OS or embedded applications.
Augmented Machine Code
Code with OS routines and runtime support routines.
More often
Virtual Machine Code
Virtual instructions, can be run on any architecture with a virtual machine interpreter or a just-in-time comp
Ex. Java
7. Major Tasks
• Any compiler must perform two major tasks
Compiler
Analysis Synthesis
• Analysis of the source program
• Synthesis of a machine-language program
8. The Structure of a Compiler
8
Scanner Parser
Semantic
Routines
Code
Generator
Optimizer
Source
Program Tokens Syntactic
Structure
Symbol and
Attribute
Tables
(Used by all Phases of The Compiler)
(Character Stream)
Intermediate
Representation
Target machine code
9. The Structure of a Compiler
9
Scanner Parser
Semantic
Routines
Code
Generator
Optimizer
Source
Program Tokens Syntactic
Structure
Symbol and
Attribute
Tables
(Used by all
Phases of
The Compiler)
Scanner
The scanner begins the analysis of the source program
by reading the input, character by character, and
grouping characters into individual words and symbols
(tokens)
RE ( Regular expression )
NFA ( Non-deterministic Finite Automata )
DFA ( Deterministic Finite Automata )
(Character Stream)
Intermediate
Representation
Target machine code
10. The Structure of a Compiler
10
Scanner Parser
Semantic
Routines
Code
Generator
Optimizer
Source
Program Tokens Syntactic
Structure
Symbol and
Attribute
Tables
(Used by all
Phases of
The Compiler)
Parser
Given a formal syntax specification (typically as a context-
free grammar [CFG] ), the parse reads tokens and groups
them into units as specified by the productions of the
CFG being used.
As syntactic structure is recognized, the parser either calls
corresponding semantic routines directly or builds a
syntax tree.
CFG ( Context-Free Grammar )
GAA ( Grammar Analysis Algorithms )
(Character Stream)
Intermediate
Representation
Target machine code
11. The Structure of a Compiler
11
Scanner Parser
Semantic
Routines
Code
Generator
Optimizer
Source
Program
(Character Stream)
Tokens Syntactic
Structure
Intermediate
Representation
Symbol and
Attribute
Tables
(Used by all
Phases of
The Compiler)
Semantic Routines
Perform two functions
Check the static semantics of each construct
Do the actual translation
The heart of a compiler
Syntax Directed Translation
Semantic Processing Techniques
IR (Intermediate Representation)
Target machine code
12. The Structure of a Compiler
12
Scanner Parser
Semantic
Routines
Code
Generator
Optimizer
Source
Program Tokens Syntactic
Structure
Symbol and
Attribute
Tables
(Used by all
Phases of
The Compiler)
Optimizer
The IR code generated by the semantic routines is
analyzed and transformed into functionally equivalent but
improved IR code
This phase can be very complex and slow
Peephole optimization
loop optimization, register allocation, code scheduling
Register and Temporary Management
Peephole Optimization
(Character Stream)
Intermediate
Representation
Target machine code
13. The Structure of a Compiler
13
Source
Program
(Character Stream)
Scanner
Tokens
Parser
Syntactic
Structure
Semantic
Routines
Intermediate
Representation
Optimizer
Code
Generator
Code Generator
Interpretive Code Generation
Generating Code from Tree/Dag
Grammar-Based Code Generator
Target machine code
14. The Structure of a Compiler (Example)
14
Scanner
[Lexical Analyzer]
Parser
[Syntax Analyzer]
Semantic Process
[Semantic analyzer]
Code Generator
[Intermediate Code Generator]
Code Optimizer
Tokens
Parse tree
Abstract Syntax Tree Attributes
Non-optimized Intermediate Code
Optimized Intermediate Code
Code Generator
Target machine code
LDF R2,id3
MULF R2,R2,#60.0
LDF R1,id2
ADDF R1,R1,R2
MOVF id1,R1
15. What is NFA and DFA
What is NFA?
An NFA is a Nondeterministic Finite
Automaton. Nondeterministic means it
can transition to, and be in, multiple
states at once
What is DFA?
A DFA is a Deterministic Finite
Automaton. Deterministic means that it
can only be in, and transition to, one
state at a time
16. Formal Definition of Nondeterministic Finite Automata.
An NFA is a five-tuple: N = (S, ∑, δ, S0, F)
A Finite set of state (S)
A Set of input symbol (∑)
A Transition Function (δ)
The initial/starting state, S0
A Set of final state F
17. Regular Expression:
State a b
1 {1,2} {1}
2 ϕ {3}
3 ϕ {4}
4 ϕ ϕ
(a|b)* abb
An NFA can easily implemented using a transition table
18. Now we convert the following NFA to DFA
State a b
{1} {1,2} {1}
{1,2} {1,2} {1,3}
{1,3} {1,2} {1,4}
{1,4} {1,2} {1}
Transition Table for DFA
19. Difference Between NFA and DFA
NFA
Multiple states Transition at once
Operator Zero or More
DFA
only one state transition in DFA.
Operator One or More