Compiler Engineering Lab #1


Introduction to Compiler Engineering
How to start building a lexical analyzer

Published in: Education, Technology

  1. University of Dammam, Girls’ College of Science, Department of Computer Science, Compiler Engineering Lab
     COMPILER ENGINEERING LAB #1: INTRODUCTION & LEXICAL ANALYSIS
  2. WHAT IS A COMPILER?
     • A compiler is a program that reads a program written in one language (the source language) and translates it into an equivalent program in another language (the target language).
     • An important part of this translation process is that the compiler reports the presence of errors in the source program to its user.
     Department of Computer Science, 25-29/2/12, Compiler Engineering Lab
  3. COMPILER THEORY
     [Diagram: source program → Compiler → target program; the compiler also produces error messages.]
  4. COMPILER ENVIRONMENT TOOLS
     • Many software tools that manipulate source programs first perform some kind of analysis.
     • Some examples of such tools include:
  5. 1- STRUCTURE EDITOR
     • Takes as input a sequence of commands to build a source program.
     • Performs the text-creation and modification functions of a text editor.
     • Analyzes the program text, imposing an appropriate hierarchical structure on the source program.
     • Checks that the input is correctly formed.
     • Can supply keywords automatically.
     • Can jump from a begin or left parenthesis to its matching end or right parenthesis.
  6. 2- PRETTY PRINTERS
     • Analyze a program and print it in such a way that the structure of the program becomes clearly visible.
  7. 3- STATIC CHECKERS
     • Read a program.
     • Analyze it.
     • Discover potential bugs without running the program.
     • Catch logical errors.
  8. 4- INTERPRETERS
     • Perform the operations implied by the source program.
     • What is the difference between a compiler and an interpreter?
  9. COMPILER PHASES
  10. PARTS OF COMPILATION
     1. Analysis: breaks up the source program into constituent pieces and creates an intermediate representation of the source program.
     2. Synthesis: constructs the desired target program from the intermediate representation.
  11. PROCESSING ENDS OF A COMPILER
     1. Front end: consists of the phases that depend primarily on the source language and are largely independent of the target machine (lexical analysis, syntactic analysis, symbol table, semantic analysis, intermediate code).
     2. Back end: includes those portions of the compiler that depend on the target machine and do not depend on the source language (code optimization, code generation).
  12. COMPILER PHASES
     [Diagram: Source Program → Lexical Analysis (“scanning”) → Syntax (hierarchical) Analysis (“parsing”) → Contextual Analysis (“semantic analysis”) → Intermediate Code Synthesis → Object Code (machine language). The phases are grouped into a front end and a back end.]
     • Phases are important because they simplify the compiler’s structure.
  13. COMPILER PHASES INTERACTION (VIA DATA STRUCTURES)
     [Diagram: Source Program (text) → Lexical Analysis → tokens → Syntax Analysis → abstract syntax tree (AST) → Contextual Analysis → decorated AST + symbol table → Intermediate Code Synthesis → intermediate code → Object Code → machine language. The phases are grouped into a front end and a back end.]
  15. COMPILER CONSTRUCTION TOOLS
     • A compiler can be written like any other program.
     • A programmer can use general software development tools such as:
       • Debuggers
       • Version managers
       • Profilers
     • More specialized tools have been developed to help implement the various phases of a compiler.
  16. 1- SCANNER GENERATORS
     • Generate a lexical analyzer from a specification based on regular expressions.
  17. 2- PARSER GENERATORS
     • Produce syntax analyzers from input that is based on a context-free grammar.
     • In early compilers, syntax analysis consumed a large fraction of the running time and a large fraction of the intellectual effort of writing a compiler.
     • Using a parser generator makes it possible to implement this phase in a few days.
  18. 3- SYNTAX-DIRECTED TRANSLATION ENGINES
     • Produce collections of routines that walk the parse tree, generating the intermediate code.
  19. 4- AUTOMATIC CODE GENERATORS
     • Take a collection of rules that define the translation of each operation of the intermediate language into the machine language of the target machine.
  20. 5- DATA-FLOW ENGINES
     • Much of the information needed to perform good code optimization involves “data-flow analysis”:
     • the gathering of information about how values are transmitted from one part of a program to each other part.
  21. LEXICAL ANALYSIS: FIRST PHASE OF A COMPILER
  22. INSERTING A LEXICAL ANALYZER BETWEEN THE INPUT AND THE PARSER
     [Diagram: Input → (read character) → Lexical Analyzer → (pass token and its attribute) → Parser; the lexical analyzer can also push a character back onto the input.]
  23. LEXICAL ANALYZER MECHANISM
     • Read the characters from the input.
     • Group them into lexemes.
     • Pass the tokens formed by the lexemes, together with their attribute values, to the later stages.
     • In some situations the lexical analyzer has to read some characters ahead before it can decide on the token to be returned to the parser.
     • The extra character then has to be pushed back onto the input, because it can be the beginning of the next lexeme.
  24. IMPLEMENTING THE INTERACTION
     [Diagram: the lexical analyzer, Lexan(), reads a character using getchar(), pushes a character F back using ungetc(F, stdin), and passes the token and its attribute to the parser.]
  25. LEX
     • A particular tool that has been widely used to specify lexical analyzers for a variety of languages.
     • Using such a tool allows us to show how the specification of patterns using regular expressions can be combined with actions.
  26. REGULAR EXPRESSION PATTERNS FOR TOKENS
     Regular expression   Token   Attribute value
     ws                   -       -
     if                   if      -
     then                 then    -
     else                 else    -
     id                   id      pointer to table entry
     num                  num     pointer to table entry
     <                    relop   LT
     <=                   relop   LE
     =                    relop   EQ
     <>                   relop   NE
     >                    relop   GT
     >=                   relop   GE
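The relop rows of the table above are a good case for the one-character lookahead described earlier: after reading `<`, the analyzer cannot choose between LT, LE and NE until it has seen the next character. The following is only a minimal hand-written sketch in C, not Lex output. The function name `scan_relop`, the extra value `NOT_RELOP`, and the choice to read from a `FILE *` parameter instead of stdin are assumptions made for illustration.

```c
#include <stdio.h>

/* Attribute values for the relop token, named as in the table.
 * NOT_RELOP is a hypothetical extra value meaning "no operator here". */
enum relop_attr { LT, LE, EQ, NE, GT, GE, NOT_RELOP };

/* Hypothetical helper: read one relational operator from `in`.
 * Any character read ahead that is not part of the operator is
 * pushed back so it can start the next lexeme. */
enum relop_attr scan_relop(FILE *in)
{
    int c = getc(in);
    int next;

    switch (c) {
    case '<':
        next = getc(in);                 /* look one character ahead */
        if (next == '=') return LE;
        if (next == '>') return NE;
        if (next != EOF) ungetc(next, in);
        return LT;
    case '=':
        return EQ;
    case '>':
        next = getc(in);
        if (next == '=') return GE;
        if (next != EOF) ungetc(next, in);
        return GT;
    default:
        if (c != EOF) ungetc(c, in);     /* not an operator: leave input intact */
        return NOT_RELOP;
    }
}
```

Calling `scan_relop(stdin)` gives the behaviour the slides describe with getchar() and ungetc(c, stdin).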
  27. LEX SPECIFICATION
     • A Lex program consists of three parts:
       1. Declarations
       2. Translation rules
       3. Auxiliary procedures
  28. 1- DECLARATIONS SECTION
     • Includes declarations of variables, manifest constants, and regular definitions.
     • A manifest constant is an identifier that is declared to represent a constant.
  30. REGULAR DEFINITIONS
     delim    [ \t\n]
     ws       {delim}+
     letter   [A-Za-z]
     digit    [0-9]
     id       {letter}({letter}|{digit})*
     number   {digit}+(\.{digit}+)?(E[+\-]?{digit}+)?
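To make the `number` definition concrete, here is a hedged sketch of a hand-coded matcher in C that returns the length of the longest prefix matching digits, an optional fraction, and an optional exponent. The function name `match_number` and the string-based interface are assumptions for illustration. A Lex-generated scanner would use a table-driven automaton instead.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: length of the longest prefix of s that matches
 *   number = {digit}+(\.{digit}+)?(E[+\-]?{digit}+)?
 * Returns 0 if s does not start with a digit. */
size_t match_number(const char *s)
{
    size_t i = 0, j;

    while (isdigit((unsigned char)s[i]))          /* {digit}+ */
        i++;
    if (i == 0)
        return 0;

    /* Optional fraction: a dot counts only if a digit follows it. */
    if (s[i] == '.' && isdigit((unsigned char)s[i + 1])) {
        i++;
        while (isdigit((unsigned char)s[i]))
            i++;
    }

    /* Optional exponent: E, optional sign, then at least one digit. */
    if (s[i] == 'E') {
        j = i + 1;
        if (s[j] == '+' || s[j] == '-')
            j++;
        if (isdigit((unsigned char)s[j])) {
            while (isdigit((unsigned char)s[j]))
                j++;
            i = j;                                /* accept the exponent */
        }
    }
    return i;
}
```

For input such as `7.x` the sketch accepts only `7`, because the fraction part requires at least one digit after the dot.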
  31. 2- TRANSLATION RULES
     Translation rules are statements of the form:
       p1   {action1}
       p2   {action2}
       ...
       pn   {actionn}
     • where each p is a regular expression and each {action} is a program fragment describing what the lexical analyzer should do when pattern p matches a lexeme.
  32. 2- TRANSLATION RULES
     {ws}       { /* no action and no return */ }
     if         { return(IF); }
     then       { return(THEN); }
     else       { return(ELSE); }
     "<"        { val = LT; return(RELOP); }
     (and similarly for the other relational operators)
     {id}       { val = install_id(); return(ID); }
     {number}   { val = install_num(); return(NUM); }
  33. 3- AUXILIARY PROCEDURES
     • Holds whatever auxiliary procedures are needed by the actions.
     • A lexical analyzer created by Lex works in concert with a parser as follows: when activated by the parser, the lexical analyzer begins reading its remaining input, one character at a time, until it has found the longest prefix of the input that is matched by one of the regular expressions p. It then executes the corresponding action.
  34. CONT.
     • Typically the action returns control to the parser. If it does not, the lexical analyzer proceeds to find more lexemes until an action causes control to return to the parser.
     • The lexical analyzer returns a single quantity to the parser: the token.
     • To pass an attribute value with information about the lexeme, we can set a global variable called val.
  35. AUXILIARY PROCEDURES
     • install_id( ): procedure to install the lexeme in the table and return a pointer to its entry.
     • install_num( ): similar procedure to install a lexeme that is a number.
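The slides do not show a body for install_id( ), so the following is only one possible sketch in C. It assumes a small fixed-size table, takes the lexeme as a parameter (the slides' version takes no arguments), and returns a table index in place of a pointer. The names `symtable`, `symcount`, `MAX_SYMBOLS` and `MAX_LEXEME` are hypothetical.

```c
#include <string.h>

#define MAX_SYMBOLS 100   /* hypothetical table capacity */
#define MAX_LEXEME  32    /* hypothetical maximum lexeme length, including '\0' */

static char symtable[MAX_SYMBOLS][MAX_LEXEME];
static int  symcount = 0;

/* Sketch of install_id: return the index of the entry holding `lexeme`,
 * adding a new entry if it is not already in the table.
 * Returns -1 if the table is full or the lexeme is too long. */
int install_id(const char *lexeme)
{
    int i;

    for (i = 0; i < symcount; i++)            /* already installed? */
        if (strcmp(symtable[i], lexeme) == 0)
            return i;

    if (symcount == MAX_SYMBOLS || strlen(lexeme) >= MAX_LEXEME)
        return -1;

    strcpy(symtable[symcount], lexeme);       /* fits: length was checked above */
    return symcount++;
}
```

Installing the same identifier twice returns the same index, which is what lets later phases treat every occurrence of a name as one table entry.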
  36. WRITING A LEXICAL ANALYZER
     • Write a lexical analyzer using the C++ language.
     • Write it as a function called from inside main( ).
     • Call that function Lexan.
     • The Lexan function returns the value of the token.
  37. WHAT THE LEXICAL ANALYZER WILL DO
     • Read a character from the user.
     • If the character is a blank (space) or a tab (written '\t'), no token is returned to the parser; exit the function.
     • If the character is a newline (written '\n'), the line number is incremented and no token is returned.
     • If the character is a single digit, compute Tokenval from it.
  38. MORE THAN ONE DIGIT
     • Allow the user to enter a sequence of characters.
     • While the user keeps entering digits after the first digit, the analyzer accepts more digits.
     • Each time, the analyzer recomputes Tokenval.
     • If the next character is not a digit, push the character back.
     • Each time, print the result of each part to see the output.
  39. TOKENVAL
     • First digit: Tokenval = t - '0'
     • Each following digit: Tokenval = Tokenval * 10 + (t - '0')
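Putting the previous slides together, a lexical analyzer along these lines might look like the sketch below. It is written in C, which also compiles as C++, and makes several assumptions: the token codes `NUM` and `DONE` and the variable `lineno` are hypothetical names, the function reads from a `FILE *` parameter (pass stdin to get the getchar() behaviour in the slides), and it keeps reading after white space instead of exiting the function. The printing step asked for in the lab is left out.

```c
#include <ctype.h>
#include <stdio.h>

#define NUM  256   /* hypothetical token code for a number */
#define DONE 257   /* hypothetical token code for end of input */

int Tokenval = 0;  /* attribute value of the most recent NUM token */
int lineno   = 1;  /* current line number */

/* Sketch of Lexan: return the next token read from `in`. */
int Lexan(FILE *in)
{
    int t;

    for (;;) {
        t = getc(in);
        if (t == ' ' || t == '\t') {
            continue;                         /* white space: no token */
        } else if (t == '\n') {
            lineno++;                         /* newline: count it, no token */
        } else if (t == EOF) {
            return DONE;
        } else if (isdigit(t)) {
            Tokenval = t - '0';               /* first digit */
            t = getc(in);
            while (t != EOF && isdigit(t)) {  /* each following digit */
                Tokenval = Tokenval * 10 + (t - '0');
                t = getc(in);
            }
            if (t != EOF)
                ungetc(t, in);                /* push back the non-digit */
            return NUM;
        } else {
            return t;                         /* any other character is its own token */
        }
    }
}
```

For the input `12 + 345`, successive calls return NUM with Tokenval 12, then the character `+`, then NUM with Tokenval 345.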
  40. READING A CHARACTER FROM THE USER
     #include <stdio.h>
     int getchar(void);
     • Gets a character from stdin.
     • getchar, which may be implemented as a macro, returns the next character on the standard input stream stdin.
     • On success, getchar returns the character read, converted to an int without sign extension, so the result is the character’s code (its ASCII code on most systems).
     • On end of file or error, getchar returns EOF.
  41. PUSHING BACK CHARACTERS
     #include <stdio.h>
     int ungetc(int c, FILE *stream);   /* for example: ungetc(c, stdin) */
     • Pushes a character back onto an input stream.
     • ungetc pushes the character c back onto the named input stream, which must be open for reading. This character will be returned by the next read from that stream, for example the next call to getchar when the stream is stdin.
     • Only one character of pushback is guaranteed in all situations.
     • On success, ungetc returns the character pushed back.
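A common way to combine reading and pushing back is a small "peek" helper that looks at the next character without consuming it. The sketch below is an illustration only. The name `peek_char` is hypothetical, and it takes a `FILE *` so it works with stdin or any other input stream.

```c
#include <stdio.h>

/* Hypothetical helper: return the next character of `in` without
 * consuming it, or EOF if there is no more input. */
int peek_char(FILE *in)
{
    int c = getc(in);        /* int, not char, so EOF can be told apart */

    if (c != EOF)
        ungetc(c, in);       /* one character of pushback is always allowed */
    return c;
}
```

Because the character is pushed back straight away, calling `peek_char` twice in a row returns the same character.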
  42. TESTING WHETHER A CHARACTER IS A DIGIT
     #include <ctype.h>
     int isdigit(int t);
     • Tests for a decimal-digit character.
     • isdigit, which may be implemented as a macro, typically classifies character codes by table lookup.
     • isdigit returns nonzero if t is a digit ('0' to '9') and zero otherwise.
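One detail worth knowing when isdigit is applied to characters stored in a `char` array, as opposed to values returned by getchar: the argument must be representable as an unsigned char (or be EOF), so a cast is the usual precaution. The short C sketch below shows the idiom. The function name `count_digits` is a hypothetical example.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: count the decimal digits in a string. */
size_t count_digits(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        /* Cast to unsigned char: passing a negative char value to
         * isdigit is undefined behaviour. */
        if (isdigit((unsigned char)*s))
            n++;
    }
    return n;
}
```

Values returned by getchar or getc need no cast, since they are already either an unsigned char value or EOF.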
  43. QUESTIONS? Thank you for listening.