1. Compiler Design
CSE - 353
UNIT-1
Dr. Bharat Bhushan, School of Engineering and Technology, Sharda University,
Greater Noida, India.
2. Syllabus
Unit 1: Introduction
Introduction to Compiler, Phases and passes, Bootstrapping,
Cross-Compiler
Finite state machines and regular expressions and their
applications to lexical analysis
lexical-analyzer generator, Lexical Phase errors
3. Definition of
Compiler:
A compiler is software that translates a program written in a high-level language (the source language) into a low-level language (object/target/machine language).
4. Types of Compiler
A Cross Compiler runs on a machine ‘A’ and produces code for another machine ‘B’; it is capable of creating code for a platform other than the one on which it is running.
A Source-to-source Compiler (also transcompiler or transpiler) translates source code written in one programming language into source code of another programming language.
5. Language processing systems
(using Compiler)
6. Continued..
High Level Language
• A program containing directives such as #include or #define is written in a high-level language (HLL).
• High-level languages are closer to humans but far from machines.
• These # tags are called pre-processor directives; they direct the pre-processor about what to do.
Pre-Processor
The pre-processor removes all #include directives by splicing in the named files (file inclusion) and all #define directives by macro expansion.
It performs file inclusion, augmentation, macro-processing etc.
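The two jobs above can be sketched in Python (a toy model with hypothetical helper names; a real pre-processor such as C's cpp handles far more, e.g., conditional compilation and function-like macros):

```python
import re

def expand_macros(source: str, macros: dict) -> str:
    """Macro expansion: textually replace each recorded macro name
    with its definition (whole-word matches only)."""
    for name, body in macros.items():
        source = re.sub(rf"\b{re.escape(name)}\b", body, source)
    return source

def preprocess(source: str, files: dict) -> str:
    """File inclusion for #include lines and macro recording for
    object-like #define lines, followed by macro expansion."""
    macros = {}
    out_lines = []
    for line in source.splitlines():
        if line.startswith("#include"):
            # File inclusion: splice in the named file's contents.
            name = line.split()[1].strip('"<>')
            out_lines.append(files[name])
        elif line.startswith("#define"):
            # Record an object-like macro for later expansion.
            _, name, body = line.split(maxsplit=2)
            macros[name] = body
        else:
            out_lines.append(line)
    return expand_macros("\n".join(out_lines), macros)
```

For example, `#define MAX 100` followed by `int a[MAX];` becomes `int a[100];`, and an `#include` line is replaced by the included file's text.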
7. Continued..
Assembly Language
It is neither in binary form nor high level.
It is an intermediate form: a combination of machine instructions and some other useful data needed for execution.
Assembler
For every platform (hardware + OS) there is a separate assembler; assemblers are not universal.
The assembler translates assembly language to machine code; its output is called an object file.
8. Continued..
Interpreter
An interpreter converts high level language into low level machine
language, just like a compiler.
But they are different in the way they read the input.
The compiler reads the input in one go: it scans the entire program and translates it as a whole into machine code. The interpreter instead translates the program one statement at a time, line by line.
Interpreted programs are usually slower than compiled ones.
9. Continued..
Relocatable Machine Code
It can be loaded at any memory address and run.
Addresses within the program are expressed (e.g., as relocatable offsets) so that the code can be moved in memory without breaking.
Loader/Linker
The linker combines multiple object files into a single executable file; the loader then loads it into memory and executes it.
10. Phases of Compiler
11. Continued..
1. Lexical analyser
It is also called scanner.
It takes the output of preprocessor (which performs file inclusion and
macro expansion) as the input which is in pure high level language.
It reads the characters from source program and groups them into
lexemes (sequence of characters that “go together”).
Each lexeme corresponds to a token. Tokens are defined by regular
expressions which are understood by the lexical analyzer.
It also detects lexical errors (e.g., erroneous characters) and strips out
comments and white space.
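A minimal scanner in Python illustrates the ideas above: tokens are defined by regular expressions, input characters are grouped into lexemes, and white space is stripped (the token names and patterns below are illustrative, not from the slides):

```python
import re

# Each token class is defined by a regular expression.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=]"),
    ("SKIP",   r"\s+"),        # white space is stripped, not emitted
]
MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))

def tokenize(code: str):
    """Group input characters into lexemes and emit (token, lexeme) pairs."""
    tokens = []
    for m in MASTER.finditer(code):
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
    return tokens
```

For instance, `x = 42 + y` yields the pairs (ID, x), (OP, =), (NUMBER, 42), (OP, +), (ID, y).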
12. Continued..
2. Syntax Analyser
It is sometimes called the parser.
It constructs the parse tree.
It takes all the tokens one by one and uses Context Free
Grammar to construct the parse tree.
Syntax error can be detected at this level if the input is not in
accordance with the grammar.
14. Continued..
3. Semantic Analyser
It verifies the parse tree, checking whether it is meaningful.
It does type checking, label checking, and flow-control checking.
15. Continued..
4. Intermediate Code Generator
It generates intermediate code: a representation between the source and target
languages that is easy to translate into machine code.
Example – Three address code
5. Code Optimizer
It transforms the code so that it consumes fewer resources and runs faster.
6. Target Code Generator
Its main purpose is to produce code the machine can understand, performing
register allocation, instruction selection, etc.
The optimized code is converted into relocatable machine code, which then
forms the input to the linker and loader.
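Three-address code (the intermediate form named above) limits each instruction to at most one operator, storing results in compiler-invented temporaries. A small sketch in Python, lowering a nested-tuple AST (the tuple representation and helper name are illustrative assumptions):

```python
import itertools

def to_tac(node, code, temps=None):
    """Lower a nested-tuple AST such as ("+", "b", ("*", "c", "d"))
    into three-address code: each emitted instruction applies one
    operator and stores its result in a fresh temporary t1, t2, ..."""
    if temps is None:
        temps = itertools.count(1)
    if isinstance(node, str):      # a leaf: variable or constant
        return node
    op, left, right = node
    l = to_tac(left, code, temps)
    r = to_tac(right, code, temps)
    t = f"t{next(temps)}"
    code.append(f"{t} = {l} {op} {r}")
    return t
```

Lowering `a = b + c * d` this way produces `t1 = c * d`, `t2 = b + t1`, `a = t2`.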
17. Compiler Construction Tools
The compiler writer can use some specialized tools that
help in implementing various phases of a compiler.
These tools assist in the creation of an entire compiler or
its parts.
18. Continued..
1. Parser Generator
It produces syntax analyzers (parsers) from input based on a grammatical
description of a programming language, i.e., a context-free grammar.
It is useful because the syntax-analysis phase is highly complex and
consumes much time and manual effort.
19. Continued..
2. Scanner Generator
It generates lexical analyzers from input consisting of regular-expression
descriptions of a language's tokens.
It generates a finite automaton to recognize the regular expressions.
20. Difference between Compiler and Interpreter
Compiler:
• Converts the entire source code of a program into executable machine code for a CPU.
• Takes a large amount of time to analyze the entire source code, but the overall execution time of the program is comparatively faster.
• Generates error messages only after scanning the whole program, so debugging is comparatively hard.
• Generates intermediate object code.
• Examples: C, C++, Java.
Interpreter:
• Takes a source program and runs it line by line, translating each line as it comes to it.
• Takes less time to analyze the source code, but the overall execution time of the program is slower.
• Debugging is easier, as it translates the program only until an error is met.
• No intermediate object code is generated.
• Examples: Python, Perl.
21. Bootstrapping and Cross Compiler
Bootstrapping is a process in which a simple language is used to translate a
more complicated program, which in turn may handle a far more complicated
program, and so on.
A cross compiler is a compiler capable of creating executable code for
a platform other than the one on which the compiler is running. For
example, a compiler that runs on a Windows 7 PC but generates code
that runs on Android smartphone is a cross compiler.
22. Continued..
Suppose we want to write a cross compiler for a new language X.
Say the implementation language of this compiler is Y and the target code
being generated is in language Z; that is, we create XYZ.
Now if an existing compiler Y runs on machine M and generates code for M,
it is denoted YMM.
If we run XYZ using YMM, we get a compiler XMZ: a compiler for source
language X that generates target code in language Z and runs on machine M.
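The composition above can be modelled by representing each compiler as a triple (source, implementation, target); running one compiler through another rewrites its implementation language. A sketch (the triple encoding is an illustrative assumption, not standard notation):

```python
def run_through(prog, host):
    """Compile the compiler 'prog' = (source, impl, target) using the
    host compiler 'host' = (src2, impl2, tgt2). Valid only when the
    host accepts prog's implementation language; the result is prog
    re-expressed in the host's target language."""
    s, i, t = prog
    s2, i2, t2 = host
    assert i == s2, "host must accept prog's implementation language"
    return (s, t2, t)

# XYZ run through YMM yields XMZ, as on the slide.
XYZ = ("X", "Y", "Z")
YMM = ("Y", "M", "M")
```

Running `run_through(XYZ, YMM)` gives `("X", "M", "Z")`: a compiler for X that runs on M and emits Z.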
24. Difference between Native Compiler and
Cross Compiler
Native Compiler:
• Translates programs for the same hardware/platform/machine on which it is running.
• Used to build programs for the same system/machine and OS on which it is installed.
• Dependent on the system/machine and OS.
• Can generate an executable file such as .exe.
• Examples: TurboC, GCC.
Cross Compiler:
• Translates programs for a hardware/platform/machine other than the one on which it is running.
• Used to build programs for other systems/machines, such as AVR/ARM.
• Independent of the host system/machine and OS.
• Can generate raw code, e.g., .hex.
• Example: Keil.
25. Finite state machine
A finite state machine is used to recognize patterns.
A finite automaton takes a string of symbols as input and changes its state
accordingly; when a desired symbol is found in the input, a transition occurs.
During a transition, the automaton can either move to the next state or stay in
the same state.
An input string is either accepted or rejected: if the string is successfully
processed and the automaton ends in a final (accepting) state, it is accepted;
otherwise it is rejected.
27. Continued..
[Diagram: DFA, NFA, and regular expressions]
28. Lexical Analysis
Lexical analysis is the process of converting a sequence of characters
from source program into a sequence of tokens.
A program which performs lexical analysis is termed as a lexical
analyzer (lexer), tokenizer or scanner.
Lexical analysis consists of two stages of processing which are as
follows:
• Scanning
• Tokenization
29. Continued..
Token:
A token is a class of valid character sequences (lexemes). In a
programming language,
• keywords,
• constants,
• identifiers,
• numbers,
• operators and
• punctuation symbols
are possible tokens to be identified.
30. Continued..
Pattern
A pattern describes a rule that must be matched by a sequence of characters
(a lexeme) to form a token. It can be defined by regular expressions or
grammar rules.
Lexeme
A lexeme is a sequence of characters that matches the pattern for a token.
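The three notions fit together as follows: the pattern is the rule, a lexeme is a concrete character sequence, and matching assigns the lexeme to a token class. A small sketch with Python's re module (the identifier pattern is a typical textbook example, not from the slides):

```python
import re

# Pattern: the rule for one token class, written as a regular expression.
ID_PATTERN = re.compile(r"[A-Za-z_]\w*")

def classify(lexeme: str) -> str:
    """If the lexeme matches the pattern in full, it belongs to the
    token class 'identifier'; otherwise it does not."""
    if ID_PATTERN.fullmatch(lexeme):
        return "identifier"
    return "not an identifier"
```

So `count_1` is an identifier lexeme, while `1count` fails the pattern.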
32. Role of Lexical Analyzer
33. Continued..
Lexical analyzer performs the following tasks:
• Reads the source program, scans the input characters, groups them into
lexemes, and produces tokens as output.
• Enters the identified token into the symbol table.
• Strips out white spaces and comments from source program.
• Correlates error messages with the source program, i.e., displays an error
message with its occurrence by specifying the line number.
• Expands the macros if it is found in the source program.
34. Issues in Lexical Analysis
Lexical analysis is the process of producing tokens from the
source program. It has the following issues:
• Lookahead
• Ambiguities
35. Continued..
Lookahead
Lookahead is required to decide where one token ends and the next begins.
Simple examples that raise lookahead issues are i vs. if and = vs. ==.
Therefore a way to describe the lexemes of each token is required, together
with a way to resolve ambiguities:
• Is if two variables i and f, or the keyword if?
• Is == two signs = =, or one operator ==?
Hence, the amount of lookahead to be considered and a way to describe the
lexemes of each token are both needed.
36. Continued..
Ambiguities
The lexical analysis programs written with lex accept ambiguous
specifications and choose the longest match possible at each input point.
Lex can handle ambiguous specifications. When more than one expression
can match the current input, lex chooses as follows:
• The longest match is preferred.
• Among rules which matched the same number of characters, the rule given
first is preferred.
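Both tie-breaking rules can be sketched in Python (a simplified model of lex's behaviour; the rule list and helper name are illustrative):

```python
import re

# Rules in the order they are "given" in the specification:
# IF is listed before ID, so it wins length ties.
RULES = [
    ("IF",     r"if"),
    ("ID",     r"[A-Za-z]+"),
    ("EQ",     r"=="),
    ("ASSIGN", r"="),
]

def next_token(text: str, pos: int):
    """Among all rules matching at pos, prefer the longest match;
    rules are scanned in order, so on equal lengths the rule given
    first is kept (lex's tie-breaking)."""
    best_name, best_lexeme = None, ""
    for name, pattern in RULES:
        m = re.compile(pattern).match(text, pos)
        if m and len(m.group()) > len(best_lexeme):
            best_name, best_lexeme = name, m.group()
    return (best_name, best_lexeme) if best_name else None
```

So `ifx` scans as one identifier (longest match beats the keyword rule), `if` scans as the keyword (equal lengths, first rule wins), and `==` scans as one operator rather than two `=` signs.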
37. Example of Lexical Analysis, Tokens, Non-Tokens
39. Lexical Errors
A character sequence that cannot be scanned into any valid token is a
lexical error. Important facts about lexical errors:
• Lexical errors are not very common, but they should be managed by the
scanner.
• Misspellings of identifiers, operators, or keywords are considered
lexical errors.
• Generally, a lexical error is caused by the appearance of some illegal
character, mostly at the beginning of a token.
40. Error Recovery in Lexical Analyzer
Most common error-recovery techniques:
• Remove one character from the remaining input.
• Panic mode: successive characters are ignored until we reach a
well-formed token.
• Insert a missing character into the remaining input.
• Replace a character with another character.
• Transpose two adjacent characters.
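Panic-mode recovery can be sketched as: skip characters until one appears that could begin a well-formed token. The "can start a token" test below (letter, digit, or underscore) is an illustrative stand-in for a real lexer's check:

```python
def panic_mode_skip(text: str, pos: int) -> int:
    """Panic mode: ignore successive illegal characters until reaching
    one that can begin a well-formed token (here approximated as a
    letter, digit, or underscore)."""
    while pos < len(text) and not (text[pos].isalnum() or text[pos] == "_"):
        pos += 1
    return pos
```

Given the garbage prefix in `@#$x = 1`, scanning would resume at the `x`.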
41. Lexical Analyzer vs. Parser
Lexical Analyser:
• Scans the input program.
• Identifies tokens.
• Inserts tokens into the symbol table.
• Generates lexical errors.
Parser:
• Performs syntax analysis.
• Creates an abstract representation of the code.
• Updates symbol-table entries.
• Generates a parse tree of the source code.
42. THANK YOU