Srikanth Vanama, Computer Science Graduate Student, Clemson University
Name: Srikanth Vanama
School: Clemson University
Major: Computer Science
Qualification: Master's in Computer Science, Clemson, South Carolina, USA
This project: implemented during the second semester of my graduate study
January 29, 2010
University Profile; System Environment; What Is a Compiler?; Reasons to Choose This Project; Compiler Developed For?; Lexical Analyzer; Parser; Semantic Analyzer; Code Generation; Conclusion; Questions
Clemson University offers countless opportunities for students, faculty and community members to participate in decades of tradition, improve quality of life for their surrounding communities and pursue academic challenges. It is ranked the 22nd best national public university by U.S. News & World Report. Computer Science at Clemson offers students the opportunity for classroom study and research in the underlying theory of computation, algorithms, compilers, software engineering, cyberinfrastructure and other core areas of traditional computer science.
Hardware configuration
  Processor: Intel Pentium 4 (2.23 GHz)
  Hard disk: 64 GB
  RAM: 512 MB
Software configuration
  Operating systems: Windows, Mac OS X, Linux
  Technologies: Perl, JavaScript
  Tools: ActivePerl, Komodo
  Assembler/linker: HLA
A compiler is a computer program that translates source code from a high-level programming language into a lower-level language (e.g. assembly language or machine code). The goal is to produce an executable program that generates the output of the source program. Compilation involves lexical analysis, parsing, semantic analysis, code generation and code optimization.
Reasons for choosing this project: it required a high level of programming knowledge; it was an opportunity to apply a newly learnt programming language in a full-scale, real-world project; it was a challenging project to take on; it let me build my own compiler; and it let me analyze performance differences between Intel and SPARC architectures.
The compiler was developed for a new programming language with predefined syntax and semantics. A context-free grammar (CFG) is associated with the new language for which the compiler was built. The CFG is defined with various constraints and reduction rules.
The lexical analyzer is a procedure that is called every time the parser needs a token. It builds tokens from the character stream of the source code according to the rules for each token type: integers, real values, reserved words, comments, identifiers, single ASCII characters and multiple-ASCII-character sequences. It returns only lexically correct entities and detects certain error conditions. If an error is found, it prints an informative message and continues scanning until a valid token is found and returned. Example input: a+b-(c*d)
Examples of tokens:
  Identifiers: start with a lower-case letter and may contain up to 31 additional lower-case letters, digits, -s and _s.
  Integers: d+, +d+ or -d+, where d+ denotes a continuous sequence of 9 or fewer significant digits.
  Real values: d+.d+, +d+.d+ or -d+.d+, where the total number of significant digits does not exceed 7.
  ASCII characters: ;,_,-,,,[,],(,),:,!,<,>,+,-,*,/,{.}
  Multiple ASCII characters: ==, !=, <=, >=, <-, ::, ||, &&
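
The token rules above can be sketched as a small regular-expression scanner. This is an illustrative sketch in Python rather than the project's Perl code: the token-kind names, the `tokenize` function and the reduced set of single-character tokens are assumptions, reserved words and comments are left out, and the digit limits are checked by simply counting every digit. Because a sign may begin a number and a hyphen may appear inside an identifier, the scanner does not try to resolve those ambiguities; inputs should separate operators with spaces.

```python
import re
from typing import List, Tuple

# Order matters: reals before integers, two-character operators before single ones.
TOKEN_SPEC = [
    ("REAL", r"[+-]?\d+\.\d+"),
    ("INTEGER", r"[+-]?\d+"),
    ("IDENTIFIER", r"[a-z][a-z0-9_-]*"),
    ("MULTI", r"==|!=|<=|>=|<-|::|\|\||&&"),
    ("SINGLE", r"[;,\[\]():!<>+\-*/{}]"),  # assumed subset of the single characters
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

# Maximum number of digits per numeric token kind (simplified: every digit counts).
DIGIT_LIMITS = {"INTEGER": 9, "REAL": 7}
MAX_IDENTIFIER_LENGTH = 32  # one leading letter plus up to 31 more characters


def tokenize(source: str) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Return (tokens, error messages); scanning continues after each error."""
    tokens: List[Tuple[str, str]] = []
    errors: List[str] = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            # Report the bad character and keep going until a valid token appears.
            errors.append(f"illegal character {source[pos]!r} at offset {pos}")
            pos += 1
            continue
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind == "SKIP":
            continue
        digits = sum(ch.isdigit() for ch in text)
        if kind in DIGIT_LIMITS and digits > DIGIT_LIMITS[kind]:
            errors.append(f"{kind.lower()} {text!r} has too many digits")
            continue
        if kind == "IDENTIFIER" and len(text) > MAX_IDENTIFIER_LENGTH:
            errors.append(f"identifier {text!r} is too long")
            continue
        tokens.append((kind, text))
    return tokens, errors
```

Calling `tokenize("total <- a1 + 3.25 * (b_2 - 10) ;")` yields only valid tokens and an empty error list, while an input such as `"x @ 1234567890"` reports two errors and still returns the identifier `x`.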
A simple precedence parsing mechanism is employed for the language. A grammar is a simple precedence grammar if all right-hand sides are unique, there are no empty right-hand sides, and at most one simple precedence relation holds between any pair of symbols in the grammar.
Example of a simple precedence grammar:
  E → E+T
  E → T
  T → T*P
  T → P
  P → (E)
  P → V
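
As a rough check of the three conditions, the sketch below computes the Wirth-Weber precedence relations for the example grammar. It is written in Python for illustration (the project itself used Perl), and `GRAMMAR`, `closure`, `precedence_relations`, `conflicts` and `rhs_conditions_hold` are hypothetical names. Under the strict textbook definitions, the left-recursive productions make two pairs, (+, T) and ((, E), carry both the "equal" and the "less than" relation, so a parser for this grammar has to treat those two relations alike when shifting (weak precedence) or the grammar has to be rewritten. The slides do not say which approach the project took.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Grammar = Dict[str, List[List[str]]]

GRAMMAR: Grammar = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "P"], ["P"]],
    "P": [["(", "E", ")"], ["V"]],
}


def rhs_conditions_hold(grammar: Grammar) -> bool:
    """Conditions 1 and 2: right-hand sides are unique and none is empty."""
    all_rhs = [tuple(rhs) for rules in grammar.values() for rhs in rules]
    return all(all_rhs) and len(all_rhs) == len(set(all_rhs))


def closure(grammar: Grammar, index: int) -> Dict[str, Set[str]]:
    """Symbols that can begin (index=0) or end (index=-1) a derivation of each nonterminal."""
    sets = {nt: {rhs[index] for rhs in rules} for nt, rules in grammar.items()}
    changed = True
    while changed:
        changed = False
        for nt in sets:
            for sym in list(sets[nt]):
                if sym in grammar and not sets[sym] <= sets[nt]:
                    sets[nt] |= sets[sym]
                    changed = True
    return sets


def precedence_relations(grammar: Grammar) -> Dict[Tuple[str, str], Set[str]]:
    """Map each symbol pair to the set of relations ('=', '<', '>') that hold for it."""
    head, tail = closure(grammar, 0), closure(grammar, -1)
    rel: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for rules in grammar.values():
        for rhs in rules:
            for x, y in zip(rhs, rhs[1:]):
                rel[(x, y)].add("=")            # adjacent symbols in a right-hand side
                if y in grammar:                # x is followed by the start of a handle
                    for h in head[y]:
                        rel[(x, h)].add("<")
                if x in grammar:                # end of a handle is followed by a terminal
                    firsts = {h for h in head[y] if h not in grammar} if y in grammar else {y}
                    for t in tail[x]:
                        for f in firsts:
                            rel[(t, f)].add(">")
    return rel


def conflicts(rel: Dict[Tuple[str, str], Set[str]]) -> Dict[Tuple[str, str], Set[str]]:
    """Condition 3 fails for every pair that carries more than one relation."""
    return {pair: kinds for pair, kinds in rel.items() if len(kinds) > 1}
```

Running `conflicts(precedence_relations(GRAMMAR))` lists the pairs for which the third condition fails, which makes it easy to see where a candidate grammar needs attention before a precedence table is built from it.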
The parser takes as input the tokens from the lexical analyzer and the relations defined by the context-free grammar. It examines each token and reduces according to the rules specified in the grammar, constructing a syntax tree for the source program being parsed.
The semantic routine is called after the parser has identified which reduction to make and before that reduction is made. The reduction number is passed to the semantic routine, which performs the semantics associated with that production and then returns to the parser, which makes the syntactic reduction and the corresponding stack reduction.
The semantic analyzer constructs the semantic structure of the grammar. It has three important tasks:
  Store variables in the symbol table and set the attributes of those variables there.
  Maintain a semantic stack that has a 1:1 correspondence with the syntax stack; it is used to remember information from one reduction to the next.
  Produce intermediate code, such as 4-tuples that indicate an arithmetic operation, a label, a jump, a conditional test, etc.
For the production decstat → decstat, var, the stacks look like: [Figure: syntax and semantic stacks for this production]. An example statement for this production: VECTOR aqz ELEMENTS 50, bxw, lm
A 4-tuple has the format (Result, Operator, Operand, Operand). Examples: (S_i, SUM, S_{i-2}, S_{i-3}) and (S_{i-2}, MEMORY, S_{i-1}, S_{i-3})
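
The interplay of reduction numbers, the semantic stack and 4-tuples can be sketched as follows. This is a minimal Python illustration, not the project's Perl code. It assumes the six productions of the expression grammar shown earlier are numbered 1 to 6 in order, it uses hypothetical operator names `ADD` and `MUL` and temporaries `T1`, `T2`, and it replays a hand-written list of parser actions for `a+b*c` instead of running a real precedence parser.

```python
from typing import List, NamedTuple, Tuple, Union


class FourTuple(NamedTuple):
    result: str
    operator: str
    operand1: str
    operand2: str


class SemanticRoutine:
    """Keeps a semantic stack in step with the syntax stack and emits 4-tuples."""

    def __init__(self) -> None:
        self.stack: List[str] = []      # one entry per symbol on the syntax stack
        self.code: List[FourTuple] = []
        self._temps = 0

    def shift(self, text: str) -> None:
        self.stack.append(text)

    def _new_temp(self) -> str:
        self._temps += 1
        return f"T{self._temps}"

    def _binary(self, operator: str) -> None:
        right = self.stack.pop()
        self.stack.pop()                # the operator symbol itself
        left = self.stack.pop()
        result = self._new_temp()
        self.code.append(FourTuple(result, operator, left, right))
        self.stack.append(result)

    def _parenthesised(self) -> None:
        self.stack.pop()                # ")"
        inner = self.stack.pop()
        self.stack.pop()                # "("
        self.stack.append(inner)

    def reduce(self, number: int) -> None:
        """Called with the reduction number before the parser reduces its own stack."""
        if number == 1:                 # E -> E+T
            self._binary("ADD")
        elif number == 3:               # T -> T*P
            self._binary("MUL")
        elif number == 5:               # P -> (E)
            self._parenthesised()
        # Productions 2, 4 and 6 replace one symbol by another; the entry is kept.


Action = Tuple[str, Union[str, int]]

# Actions a shift-reduce parser would take for the input a+b*c (written by hand).
ACTIONS: List[Action] = [
    ("shift", "a"), ("reduce", 6), ("reduce", 4), ("reduce", 2),
    ("shift", "+"),
    ("shift", "b"), ("reduce", 6), ("reduce", 4),
    ("shift", "*"),
    ("shift", "c"), ("reduce", 6),
    ("reduce", 3), ("reduce", 1),
]


def translate(actions: List[Action]) -> List[FourTuple]:
    semantics = SemanticRoutine()
    for kind, value in actions:
        if kind == "shift":
            semantics.shift(str(value))
        else:
            semantics.reduce(int(value))
    return semantics.code
```

`translate(ACTIONS)` produces one 4-tuple for the multiplication and one for the addition, in that order, because the parser reduces `b*c` before it reduces the sum.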
At this stage of the project the source program has been translated into a sequence of 4-tuples, called intermediate code because they sit between the source code and the assembly or machine code. The idea behind code generation is to take the 4-tuples and generate assembly code for a specific hardware platform and operating system (Intel or SPARC). This is the first place where we get specific about the computer system on which the source program will execute.
Code generation performs register allocation appropriate to the target machine’s architecture and memory constraints and produces assembly code for the source program. The generated assembly code is given as input to the assembler/linker to finally generate the output of the source program. The assembler/linker used for this project is HLA.
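
A very small code generator along these lines might look like the sketch below. It is an assumption-laden illustration in Python, not the project's generator: the mnemonics `load`, `store`, `add`, `sub` and `mul` are generic pseudo-assembly rather than HLA or SPARC syntax, the operator names `ADD`, `SUB` and `MUL` are hypothetical, and register allocation is reduced to remembering which value is currently held in a single register `r1`.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical mapping from 4-tuple operators to pseudo-assembly mnemonics.
OPCODES = {"ADD": "add", "SUB": "sub", "MUL": "mul"}
COMMUTATIVE = {"ADD", "MUL"}


def generate(tuples: Iterable[Tuple[str, str, str, str]]) -> List[str]:
    """Translate (result, operator, operand, operand) tuples to pseudo-assembly."""
    lines: List[str] = []
    in_register: Optional[str] = None   # name of the value currently held in r1
    for result, operator, left, right in tuples:
        # For commutative operators, reuse the register if it holds the right operand.
        if operator in COMMUTATIVE and right == in_register:
            left, right = right, left
        if left != in_register:
            lines.append(f"load r1, {left}")
        lines.append(f"{OPCODES[operator]} r1, {right}")
        lines.append(f"store {result}, r1")
        in_register = result
    return lines
```

For the tuples `("T1", "MUL", "b", "c")` and `("T2", "ADD", "a", "T1")`, the second addition needs no `load` because `T1` is still in the register. A real back end would map the same decisions onto the target's actual registers and instruction set.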
The compiler was implemented in Perl, which made scripting easy. Optimized use of stacks resulted in efficient memory management. Innovative parsing techniques were employed to generate tokens.
The most challenging project I have done. I came to understand the entire SDLC process, which includes requirement analysis, design, coding, testing, implementation and code optimization.
I acquired new skills from the different phases of the project. Lexical analysis: patience. Parsing: problem-solving skills, time management. Semantic analysis: algorithmic approach. Code generation: memory management, efficiency.
Thank you. Any questions? Srikanth Vanama

