Compiler Engineering Lab#5 : Symbol Table, Flex Tool
1. LAB # 5 : SYMBOL TABLE, FLEX
COMPILER ENGINEERING
2. WHY DO I NEED A SYMBOL TABLE?
• Symbol Table answers the following questions:
1. For a given declaration of an identifier name, does it have
multiple declarations in different scopes?
• If YES, keep track of these different cases
2. For a USE of an identifier name, to which scope does it
correspond? (using the "most closely nested" rule)
3. How can the various logical structures of the language be represented?
• One of the main purposes of the symbol table is to keep
track of the IDENTIFIERS recognized in the input stream.
• All subsequent references to an identifier refer to its
symbol table index.
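The idea that later references use a symbol table index can be sketched in a few lines of C. This is only an illustration: the names `intern`, `names`, and `symbol_count` and the fixed capacities are hypothetical, and a real compiler would normally use a hash table instead of a linear scan.

```c
#include <string.h>

#define MAX_SYMBOLS 64   /* assumed capacity for this sketch */
#define MAX_NAME    32   /* assumed maximum identifier length, including '\0' */

static char names[MAX_SYMBOLS][MAX_NAME];  /* the table: one slot per identifier */
static int  symbol_count = 0;

/* Return the table index of `name`, adding it the first time it is seen.
   Returns -1 if the table is full or the name is too long. */
int intern(const char *name) {
    for (int i = 0; i < symbol_count; i++)
        if (strcmp(names[i], name) == 0)
            return i;                 /* already recorded: reuse its index */
    if (symbol_count == MAX_SYMBOLS || strlen(name) >= MAX_NAME)
        return -1;
    strcpy(names[symbol_count], name);
    return symbol_count++;
}
```

Calling `intern("x")` twice returns the same index both times, so later phases can compare and store small integers instead of strings.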
1-4/4/12
Department of Computer Science -
Compiler Engineering Lab
2
3. WHY DO I NEED A SYMBOL TABLE?
• So, a symbol table is a group of linked hash tables that either:
• manages the attributes of IDENTIFIERS,
• or conveys the structure of the statement/expression that they
reflect.
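A minimal sketch of "a group of linked hash tables", assuming one hash table per scope with a pointer to the enclosing scope. All names (`Scope`, `Entry`, `scope_open`, `scope_declare`, `scope_lookup`), the bucket count, and the integer `type` attribute are hypothetical choices for illustration; memory is never freed, to keep the sketch short.

```c
#include <stdlib.h>
#include <string.h>

#define BUCKETS 16  /* assumed bucket count for this sketch */

typedef struct Entry {
    char *name;
    int type;              /* one attribute, e.g. a hypothetical type code */
    struct Entry *next;    /* chain of entries within one bucket */
} Entry;

typedef struct Scope {
    Entry *buckets[BUCKETS];
    struct Scope *parent;  /* link to the hash table of the enclosing scope */
} Scope;

static unsigned hash(const char *s) {
    unsigned h = 0;
    while (*s) h = h * 31u + (unsigned char)*s++;
    return h % BUCKETS;
}

/* Open a new scope nested inside `parent` (NULL for the global scope). */
Scope *scope_open(Scope *parent) {
    Scope *s = calloc(1, sizeof *s);
    if (s) s->parent = parent;
    return s;
}

/* Declare `name` in scope `s` only; returns 0 on success, -1 on allocation failure. */
int scope_declare(Scope *s, const char *name, int type) {
    Entry *e = malloc(sizeof *e);
    if (!e) return -1;
    e->name = malloc(strlen(name) + 1);
    if (!e->name) { free(e); return -1; }
    strcpy(e->name, name);
    e->type = type;
    unsigned h = hash(name);
    e->next = s->buckets[h];
    s->buckets[h] = e;
    return 0;
}

/* "Most closely nested" rule: search the innermost scope first, then walk outward. */
Entry *scope_lookup(Scope *s, const char *name) {
    for (; s; s = s->parent)
        for (Entry *e = s->buckets[hash(name)]; e; e = e->next)
            if (strcmp(e->name, name) == 0) return e;
    return NULL;
}
```

If `x` is declared in both a global and an inner scope, a lookup from the inner scope finds the inner declaration, while a name declared only globally is still found by walking the `parent` link.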
4. BUILDING COMPILER WITH
Lexical Analysis (Lex/Flex) → Syntax-Semantic Analysis (Yacc/Bison) → Assembly (LLVM) → Linking
5. First: Lexical Analysis with Flex
Reminder:
What is the goal of the Lexical Analysis phase?
In other words, why do I use TOKENS?
TOKENS ARE NUMERICAL REPRESENTATIONS OF STRINGS, AND SIMPLIFY PROCESSING
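A small sketch of why numeric tokens simplify processing: the scanner compares strings once, and every later phase works with integers. The token names, the helper functions `classify` and `starts_statement`, and the keyword set are hypothetical; the starting value 258 mirrors the numbering Bison conventionally uses for named tokens, but is only an assumption here.

```c
#include <string.h>

/* Hypothetical token codes; in a Flex/Bison project Bison would generate them. */
enum Token { TOK_IF = 258, TOK_THEN, TOK_PRINT, TOK_IDENT };

/* Done once by the scanner: turn a lexeme into its numeric code. */
int classify(const char *lexeme) {
    if (strcmp(lexeme, "if") == 0)    return TOK_IF;
    if (strcmp(lexeme, "then") == 0)  return TOK_THEN;
    if (strcmp(lexeme, "print") == 0) return TOK_PRINT;
    return TOK_IDENT;                 /* anything else is treated as an identifier */
}

/* Later phases compare integers, not strings. */
int starts_statement(int token) {
    switch (token) {
    case TOK_IF:
    case TOK_PRINT:
        return 1;
    default:
        return 0;
    }
}
```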
6. BUILDING A COMPILER WITH FLEX (LEX) & BISON (YACC)
7. FLEX & BISON INPUT-OUTPUT INTERACTION
• bas.y (parser input)
• y.tab.h (parser output):
• is a header file that contains, among other things,
• the definitions of the token names NUM, OPA, etc.,
• and the variable yylval, which is used to pass the semantic values of tokens
to the Bison code
• y.tab.c (parser output):
• contains the C/C++ code for the parser (a function called
yyparse())
• bas.l (lexical analyzer input):
• needs the definitions in y.tab.h so that the yylex() function it defines for
Bison can pass back the information the parser needs.
• lex.yy.c (lexical analyzer output): generated by Flex; it
contains, among other things, the definition of the yylex() function
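The contract between the two generated files can be sketched by hand: `yylex()` returns a token code and leaves the semantic value in `yylval`, and the parser side calls it until it returns 0. The code below is a hand-written stand-in, not what Flex or Bison would actually generate. The numeric values of `NUM` and `OPA`, the reading of `OPA` as an additive operator, the union layout, and the helpers `set_input` and `sum_of_numbers` are all assumptions made for illustration.

```c
#include <ctype.h>

/* What y.tab.h would typically provide (written by hand here, values assumed). */
enum { NUM = 258, OPA = 259 };
typedef union { int num; char op; } YYSTYPE;
YYSTYPE yylval;

static const char *input = "";   /* hypothetical in-memory input for this sketch */
void set_input(const char *s) { input = s; }

/* Stand-in for the yylex() that Flex would generate from bas.l. */
int yylex(void) {
    while (*input == ' ') input++;
    if (*input == '\0') return 0;              /* 0 tells the parser: end of input */
    if (isdigit((unsigned char)*input)) {
        int v = 0;
        while (isdigit((unsigned char)*input)) v = v * 10 + (*input++ - '0');
        yylval.num = v;                        /* semantic value travels in yylval */
        return NUM;
    }
    yylval.op = *input;
    if (*input == '+' || *input == '-') { input++; return OPA; }
    return (unsigned char)*input++;            /* any other character is its own code */
}

/* Stand-in for the consumer side: a parser calls yylex() repeatedly. */
int sum_of_numbers(void) {
    int tok, total = 0;
    while ((tok = yylex()) != 0)
        if (tok == NUM) total += yylval.num;
    return total;
}
```

A real `yyparse()` does far more than add numbers, but it pulls tokens through the same interface.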
8. FLEX
• Flex (Fast Lexical Analyzer) is an open-source tool for generating scanners; it can
be downloaded from the Flex Download Webpage
• Flex reads user-specified input files
• If no input file is given, Flex reads its standard input
• Flex INPUT file:
• is a description of the scanner to generate
• The description takes the form of pairs of Regular Expressions and C code,
called RULES
Source: Flex Website
9. FLEX
• When the executable runs it analyzes its input for
occurrences of text matching the Regular Expressions
(RegExp) for each rule.
• Whenever it finds a match it executes the corresponding C
code
• Flex generates a C source file named “lex.yy.c” which
defines the function yylex()
• The file “lex.yy.c” can be compiled and linked to produce
an executable
10. BISON(1)
• Bison is a general-purpose parser generator that
converts an annotated context-free grammar into a
deterministic [LR or generalized LR (GLR)] parser.
• You can use Bison to develop a wide range of
language parsers.
• Bison is compatible with YACC
• It is based on C/C++ and also works with Java.
(1) Source: Bison Webpage
11. FIRST : LEXICAL ANALYSIS
FLEX TOOL
12. BISON/YACC & PARSING
• The grammar in the previous diagram (Flex/Bison
Interaction Model Diagram) is a text file you create with a
text editor
• Yacc will read your grammar and generate C code for a
syntax analyzer or parser
• The syntax analyzer uses grammar rules that allow it to
analyze tokens from the lexical analyzer and create a
syntax tree
• The syntax tree imposes a hierarchical structure on the
tokens
• e.g. operator precedence and associativity are apparent in the
syntax tree
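How precedence and associativity show up in the tree can be sketched with a small hand-written parser. This is a recursive-descent sketch, not the LR parser that Yacc/Bison would generate, and it reads characters directly instead of taking tokens from a lexical analyzer. The names `Node`, `parse`, and `eval` are hypothetical, error handling is omitted, and the nodes are never freed.

```c
#include <ctype.h>
#include <stdlib.h>

typedef struct Node {
    char op;                    /* '+', '-', '*', '/' or 'n' for a number leaf */
    int value;                  /* used only when op == 'n' */
    struct Node *left, *right;
} Node;

static const char *src;         /* hypothetical cursor into the expression text */

static Node *make(char op, int value, Node *l, Node *r) {
    Node *n = malloc(sizeof *n);
    if (!n) abort();
    n->op = op; n->value = value; n->left = l; n->right = r;
    return n;
}

static void skip(void) { while (*src == ' ') src++; }

static Node *number(void) {
    skip();
    int v = 0;
    while (isdigit((unsigned char)*src)) v = v * 10 + (*src++ - '0');
    return make('n', v, NULL, NULL);
}

/* term: number (('*' | '/') number)*  -- binds tighter, so it sits lower in the tree */
static Node *term(void) {
    Node *left = number();
    for (skip(); *src == '*' || *src == '/'; skip()) {
        char op = *src++;
        left = make(op, 0, left, number());   /* looping leftwards gives left associativity */
    }
    return left;
}

/* expr: term (('+' | '-') term)* */
static Node *expr(void) {
    Node *left = term();
    for (skip(); *src == '+' || *src == '-'; skip()) {
        char op = *src++;
        left = make(op, 0, left, term());
    }
    return left;
}

Node *parse(const char *text) { src = text; return expr(); }

/* Walk the tree bottom-up; division by zero is not checked in this sketch. */
int eval(const Node *n) {
    switch (n->op) {
    case 'n': return n->value;
    case '+': return eval(n->left) + eval(n->right);
    case '-': return eval(n->left) - eval(n->right);
    case '*': return eval(n->left) * eval(n->right);
    default:  return eval(n->left) / eval(n->right);
    }
}
```

For `2 + 3 * 4` the root is `+` and its right child is the `*` node, so the multiplication is evaluated first; for `10 - 4 - 3` the left child of the root is itself a `-` node, which is what left associativity looks like in the tree.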
13. LEXICAL ANALYSIS: FLEX IN DETAIL
• The Flex-generated scanner code tries to match
characters from the current input stream against the regular
expressions of the rules, and when a match is found, it executes the
associated action code.
• The variable yytext contains the string (in the C sense,
i.e. a '\0'-terminated char *) of characters that were
matched.
• When more than one match is possible, it breaks ties by
choosing the longest match, then the rule listed first
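The tie-breaking policy can be sketched in plain C. Each hypothetical rule is a function that reports how many leading characters it matches; the selector keeps the longest match and, on a tie, the rule listed first. This only illustrates the policy: Flex itself compiles the regular expressions into a single automaton rather than trying rules one by one, and the rule set and names here are assumptions.

```c
#include <ctype.h>
#include <string.h>

/* Each hypothetical rule reports how many leading characters of `s` it matches. */
static size_t match_if(const char *s) { return strncmp(s, "if", 2) == 0 ? 2 : 0; }

static size_t match_ident(const char *s) {
    size_t n = 0;
    if (isalpha((unsigned char)s[0]))
        while (isalnum((unsigned char)s[n])) n++;
    return n;
}

static size_t match_number(const char *s) {
    size_t n = 0;
    while (isdigit((unsigned char)s[n])) n++;
    return n;
}

typedef size_t (*Rule)(const char *);
static const Rule rules[] = { match_if, match_ident, match_number };  /* listing order matters */
enum { RULE_NONE = -1, RULE_IF, RULE_IDENT, RULE_NUMBER };

/* Pick the rule with the longest match; on a tie the earlier rule wins.
   Stores the match length in *len and returns the rule index. */
int pick_rule(const char *s, size_t *len) {
    int best = RULE_NONE;
    *len = 0;
    for (int i = 0; i < (int)(sizeof rules / sizeof rules[0]); i++) {
        size_t n = rules[i](s);
        if (n > *len) {          /* strictly longer only, so ties keep the earlier rule */
            *len = n;
            best = i;
        }
    }
    return best;
}
```

On the input `if (x)` both the keyword rule and the identifier rule match two characters, so the keyword rule wins because it is listed first; on `iffy = 1` the identifier rule matches four characters and wins on length.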
14. FLEX FILES FORMAT
The general format of Lex source is:
{definitions}
%%
{rules}
%%
{user subroutines}
16. FLEX FILES FORMAT
• Any line that begins with a blank or tab and is not part of a Lex rule or action
is copied into the Lex-generated program.
(this can be used for comments)
• This means that each rule must start in column one.
• Such source input placed before the first %% delimiter will be external to
any function in the generated code; if it appears immediately after the
first %%, it is placed where declarations belong in
the function written by Lex
• Anything included between lines containing only %{ and %}
is copied out as above. The delimiters are discarded. This
format permits entering text like preprocessor statements that
must begin in column 1.
18. EXAMPLE # 1 : CALCULATOR FLEX FILE
Write a Flex file for a calculator that is able
to recognize the following:
• numbers,
• (+,-,*,/) operators,
• and parentheses
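Before writing the Flex rules, it can help to see what the resulting scanner has to do. The following is a hand-written C sketch of the same recognition task, not a Flex file and not Flex output; the type and function names (`Kind`, `Token`, `next_token`) are hypothetical.

```c
#include <ctype.h>

typedef enum { T_END, T_NUMBER, T_OP, T_LPAREN, T_RPAREN, T_UNKNOWN } Kind;
typedef struct { Kind kind; int value; char text; } Token;

/* Read one token starting at *p and advance *p past it. */
Token next_token(const char **p) {
    Token t = { T_END, 0, '\0' };
    while (**p == ' ' || **p == '\t' || **p == '\n' || **p == '\r') (*p)++;
    if (**p == '\0') return t;                       /* end of input */
    if (isdigit((unsigned char)**p)) {               /* numbers: one or more digits */
        t.kind = T_NUMBER;
        while (isdigit((unsigned char)**p)) t.value = t.value * 10 + (*(*p)++ - '0');
        return t;
    }
    t.text = *(*p)++;                                /* single-character tokens */
    switch (t.text) {
    case '+': case '-': case '*': case '/': t.kind = T_OP; break;
    case '(': t.kind = T_LPAREN; break;
    case ')': t.kind = T_RPAREN; break;
    default:  t.kind = T_UNKNOWN; break;
    }
    return t;
}
```

Each branch corresponds to one rule the Flex file would need: a digit pattern for numbers, one pattern for the four operators, patterns for the two parentheses, whitespace skipping, and a catch-all for unknown characters.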
20. EXAMPLE # 2 : “BASIC LANGUAGE” FILE
Write a Flex file for “Basic Language” that is
able to recognize the following:
• Numbers,
• Identifiers,
• Keywords,
• Relational operators,
• Arithmetic operators,
• and Delimiters
24. EXAMPLE # 2 : (BAS.L) FILE FOR BASIC LANGUAGE
{letter}({letter}|{digit})*  { yylval.id = strdup(yytext);
                               return IDENT; }
{digit}+                     { yylval.num = atoi(yytext);
                               return NUMBER; }
[ \t\n\r]                    /* skip whitespace */
.                            { printf("Unknown character [%c]\n", yytext[0]);
                               return UNKNOWN; }
%%
int yywrap(void) { return 1; }
25. INSTALLING FLEX & BISON
• Flex & Bison are Unix-based tools that can be installed
directly in a Unix/Linux environment.
• Under a Windows environment, Flex & Bison can be
installed under Cygwin (a Linux-like terminal).
• Cygwin can be downloaded from here.
• Cygwin is a Linux API layer providing substantial Linux
API functionality.
26. INSTALLING FLEX & BISON
• Steps to download Cygwin:
1. Click the following link for Setup
2. Click Run → Next → choose the option “Install from Internet”
→ Next → make sure the installation directory is correct, then
Next
3. Choose “Direct Connection” then Next
4. Choose any of the available HTTP download sites and click
Next
5. In the “Select Packages” page: search for both “Flex” and
“Bison” packages through the search textbox then ensure
the “install” option is selected
6. Finish the installation process by clicking Next to the end.
27. OVERVIEW OF ANTLR(1)
• ANTLR, ANother Tool for Language Recognition, is a
language tool that provides a framework for constructing
recognizers, interpreters, compilers, and translators from
grammatical descriptions containing actions in a variety
of target languages.
• ANTLR provides excellent support for tree construction,
tree walking, translation, error recovery, and error
reporting
• ANTLR can be downloaded from the following link
ANTLR (.jar) file download link
(1) Source: ANTLR Webpage
28. QUESTIONS?
Thank you for listening