Compiler Engineering Lab#5 : Symbol Table, Flex Tool

Discussion of Symbol Table

Flex/Bison Interaction
Flex File Specification
Cygwin Installation

Presentation Transcript

  • COMPILER ENGINEERING LAB # 5: SYMBOL TABLE, FLEX
  • WHY DO I NEED A SYMBOL TABLE?
    • A symbol table answers the following questions:
      1. Does a given identifier name have multiple declarations in different scopes? If YES, keep track of each of these cases.
      2. For a USE of an identifier name, to which scope does it correspond (using the "most closely nested" rule)?
      3. How can the various logical structures of the language be represented?
    • One of the main purposes of the symbol table is to keep track of the IDENTIFIERS recognized in the input stream.
    • All subsequent references to an identifier refer to the appropriate symbol table index.
    Department of Computer Science -1-4/4/12 2 Compiler Engineering Lab
  • WHY DO I NEED A SYMBOL TABLE?
    • So, a symbol table is a group of linked hash tables that either:
      • manage the attributes of IDENTIFIERS,
      • or convey the structure of the statement/expression that they reflect.
  • BUILDING COMPILER WITH
    [Diagram: Lexical Analysis (Lex/Flex) → Syntax-Semantic Analysis (Yacc/Bison) → Assembly, Linking (LLVM)]
  • First: Lexical Analysis with Flex
    • Reminder: What is the goal of the lexical analysis phase? In other words, why do I use TOKENS?
    • TOKENS ARE NUMERICAL REPRESENTATIONS OF STRINGS, AND SIMPLIFY PROCESSING
  • BUILDING A COMPILER WITH FLEX (LEX) & BISON (YACC)
  • FLEX & BISON INPUT-OUTPUT INTERACTION
    • bas.y (parser input)
    • y.tab.h (parser output):
      • is a header file that contains, among other things,
      • the definitions of the token names NUM, OPA, etc.,
      • and the variable yylval, which is used to pass the semantic values of tokens to the Bison code
    • y.tab.c (parser output):
      • contains the C/C++ code for the parser (which is a function called yyparse())
    • bas.l (lexical analyzer input):
      • needs the definitions in y.tab.h so that the yylex() function it defines for Bison can pass back the information the parser needs
    • lex.yy.c (lexical analyzer output):
      • generated by Flex; contains, among other things, the definition of the yylex() function
  • FLEX
    • Flex (Fast Lexical Analyzer) is an open-source tool for generating scanners; it can be downloaded from the Flex download webpage
    • Flex reads user-specified input files
    • If no input file is given, Flex reads its standard input
    • Flex INPUT file:
      • is a description of a scanner to generate
      • The description consists of pairs of regular expressions and C code, called RULES
    Source: Flex Website
  • FLEX
    • When the generated executable runs, it analyzes its input for occurrences of text matching the regular expressions (RegExp) of each rule.
    • Whenever it finds a match, it executes the corresponding C code
    • Flex generates a C source file named "lex.yy.c", which defines the function yylex()
    • The file "lex.yy.c" can be compiled and linked to produce an executable
  • BISON (1)
    • Bison is a general-purpose parser generator that converts an annotated context-free grammar into a deterministic LR or generalized LR (GLR) parser.
    • You can use Bison to develop a wide range of language parsers.
    • Bison is compatible with YACC
    • It is based on C/C++ and also works with Java.
    (1) Source: Bison Webpage
  • FIRST: LEXICAL ANALYSIS, FLEX TOOL
  • BISON/YACC & PARSING
    • The grammar in the previous diagram (Flex/Bison Interaction Model Diagram) is a text file you create with a text editor
    • Yacc reads your grammar and generates C code for a syntax analyzer, or parser
    • The syntax analyzer uses the grammar rules to analyze tokens from the lexical analyzer and create a syntax tree
    • The syntax tree imposes a hierarchical structure on the tokens
      • e.g. operator precedence and associativity are apparent in the syntax tree
  • LEXICAL ANALYSIS: FLEX IN DETAIL
    • The Flex-generated scanner code tries to match characters from the current input stream against the regular expressions of the rules, and when a match is found, it executes the associated action code.
    • The variable yytext contains the string (in the C sense, i.e. a \0-terminated char*) of characters that were matched.
    • When more than one match is possible, it breaks ties by going with the longest match, then with the rule listed first.
  • FLEX FILES FORMAT
    The general format of Lex source is:
      {definitions}
      %%
      {rules}
      %%
      {user subroutines}
  • FLEX FILES FORMAT
    • Any line that is not part of a Lex rule or action and that begins with a blank or tab is copied into the Lex-generated program (used for comments).
      • This means each code line must start in column one.
    • Source input prior to the first %% delimiter will be external to any function in the code; if it appears immediately after the first %%, it appears in an appropriate place for declarations in the function written by Lex.
    • Anything included between lines containing only %{ and %} is copied out as above. The delimiters are discarded. This format permits entering text like preprocessor statements that must begin in column 1.
  • EXAMPLE # 1 : CALCULATOR FLEX FILE
    Write a Flex file for a calculator that is able to recognize the following:
    • numbers,
    • (+, -, *, /) operators,
    • and parentheses
  • EXAMPLE # 1 : CALCULATOR (.L)
      %{
      #include "ex1.tab.hpp"
      #include <cstdlib>    /* atoi, exit */
      #include <iostream>
      using namespace std;
      %}
      %option noyywrap
      %%
      [0-9]+    { yylval.val = atoi(yytext); return NUM; }
      [+-]      { yylval.sym = yytext[0]; return OPA; }
      [*/]      { yylval.sym = yytext[0]; return OPM; }
      "("       { return LP; }
      ")"       { return RP; }
      ";"       { return STOP; }
      <<EOF>>   { return 0; }
      [ \t\n]+  { /* skip whitespace */ }
      .         { cerr << "Unrecognized token!" << endl; exit(1); }
      %%
  • EXAMPLE # 2 : "BASIC LANGUAGE" FILE
    Write a Flex file for the "Basic Language" that is able to recognize the following:
    • Numbers,
    • Identifiers,
    • Keywords,
    • Relational operators,
    • Arithmetic operators,
    • and Delimiters
  • EXAMPLE # 2 : (BAS.L) FILE FOR BASIC LANGUAGE
      %{
      #include <stdio.h>    /* printf */
      #include <stdlib.h>   /* atoi */
      #include <string.h>   /* strdup */
      #include "y.tab.h"
      %}
      digit   [0-9]
      letter  [a-zA-Z]
      %%
      "+"   { return PLUS; }
      "-"   { return MINUS; }
      "*"   { return TIMES; }
      "/"   { return SLASH; }
      "("   { return LPAREN; }
      ")"   { return RPAREN; }
      ";"   { return SEMICOLON; }
      ","   { return COMMA; }
      "."   { return PERIOD; }
  • EXAMPLE # 2 : (BAS.L) FILE FOR BASIC LANGUAGE
      ":="  { return BECOMES; }
      "="   { return EQL; }
      "<>"  { return NEQ; }
      "<"   { return LSS; }
      ">"   { return GTR; }
      "<="  { return LEQ; }
      ">="  { return GEQ; }
  • EXAMPLE # 2 : (BAS.L) FILE FOR BASIC LANGUAGE
      "begin"      { return BEGINSYM; }
      "call"       { return CALLSYM; }
      "const"      { return CONSTSYM; }
      "do"         { return DOSYM; }
      "end"        { return ENDSYM; }
      "if"         { return IFSYM; }
      "odd"        { return ODDSYM; }
      "procedure"  { return PROCSYM; }
      "then"       { return THENSYM; }
      "var"        { return VARSYM; }
      "while"      { return WHILESYM; }
  • EXAMPLE # 2 : (BAS.L) FILE FOR BASIC LANGUAGE
      {letter}({letter}|{digit})*  { yylval.id = (char *)strdup(yytext); return IDENT; }
      {digit}+                     { yylval.num = atoi(yytext); return NUMBER; }
      [ \t\n\r]                    /* skip whitespace */
      .                            { printf("Unknown character [%c]\n", yytext[0]); return UNKNOWN; }
      %%
      int yywrap(void) { return 1; }
  • INSTALLING FLEX & BISON
    • Flex & Bison are Unix-based tools that can be installed directly under a Unix/Linux environment.
    • Under a Windows environment, Flex & Bison can be installed under Cygwin (a Linux-like terminal).
    • Cygwin can be downloaded from the Cygwin website.
    • Cygwin is a Linux API layer providing substantial Linux API functionality.
  • INSTALLING FLEX & BISON
    • Steps to download Cygwin:
      1. Click the link for the Cygwin Setup program
      2. Click Run → Next → choose the option "Install from Internet" → Next → make sure the installation directory is correct, then click Next
      3. Choose "Direct Connection", then click Next
      4. Choose any of the available HTTP download sites and click Next
      5. In the "Select Packages" page, search for both the "Flex" and "Bison" packages through the search textbox, then ensure the "install" option is selected for each
      6. Finish the installation process by clicking Next to the end.
  • OVERVIEW OF ANTLR (1)
    • ANTLR, ANother Tool for Language Recognition, is a language tool that provides a framework for constructing recognizers, interpreters, compilers, and translators from grammatical descriptions containing actions in a variety of target languages.
    • ANTLR provides excellent support for tree construction, tree walking, translation, error recovery, and error reporting
    • ANTLR can be downloaded from the ANTLR (.jar) file download link
    (1) Source: ANTLR Webpage
  • QUESTIONS? Thank you for listening