This document discusses Lex and Yacc, which are tools used to generate lexical analyzers and parsers. Lex reads input specifying a lexical analyzer and outputs C code implementing it. Yacc reads a grammar and generates a C parser. The document provides examples of Lex and Yacc code and explains how they are linked and used together, with Lex generating tokens that are passed to a Yacc parser. Key aspects covered include the structure and sections of Lex and Yacc programs, and how they are compiled and linked to recognize formal languages.
Lex and Yacc: Tools for Specifying Lexical and Syntactic Analyzers
1. By
Dr. Prem Nath
Associate Professor
Computer Science & Engineering Department,
H N B Garhwal University
(A Central University)
Lex And Yacc
2. Outline
▪ Lex:
▪ Theory
▪ Execution
▪ Example
▪ Yacc:
▪ Theory
▪ Description
▪ Example
▪ Lex & Yacc linking.
3. Lex
Lex is a tool, with its own specification language, that generates lexical
analyzers (widely used on Unix).
It is mostly used together with the Yacc parser generator.
Written by Mike Lesk and Eric Schmidt.
It reads an input file (specifying the lexical analyzer) and
outputs source code implementing the lexical analyzer in the C
programming language.
Lex reads patterns (regular expressions) and produces C
code that recognizes them.
4. Lex…
A Language for Specifying Lexical Analyzers
• Not only a table generator: it also allows “actions”
to be associated with regular expressions (REs).
• Lex is widely used in the Unix community.
• Lex is, however, not efficient enough for production
compilers.
5. Lex…
General Information:
• Input is stored in a file with *.l extension
• File consists of three main sections
• lex generates C function stored in lex.yy.c
Using Lex:
1) Specify words to be used as tokens (Extension of regular
expressions)
2) Run the lex utility on the source file to generate yylex( ),
a C function
3) The generated code declares the global variables char* yytext (the text
of the current match) and int yyleng (its length)
9. Lex…
A simple pattern: letter(letter|digit)*
Regular expressions are translated by Lex to a computer program that
mimics an FA.
This pattern matches a string of characters that begins with a single
letter followed by zero or more letters or digits.
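To make the "program that mimics an FA" concrete, here is a minimal hand-written sketch of a two-state automaton for letter(letter|digit)*. It is not the code Lex actually emits (Lex generates table-driven code); the function name match_identifier is a hypothetical helper introduced only for illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not produced by Lex): runs a two-state automaton
   for letter(letter|digit)* and returns the length of the longest
   matching prefix of s, or 0 if s does not start with a letter. */
size_t match_identifier(const char *s)
{
    enum { START, IN_ID } state = START;
    size_t i = 0;

    for (;;) {
        unsigned char c = (unsigned char)s[i];
        if (state == START) {
            if (!isalpha(c))
                return 0;       /* no transition from START: reject */
            state = IN_ID;      /* first letter seen */
        } else {                /* IN_ID is the accepting state */
            if (!isalnum(c))
                return i;       /* stop at the first non-matching character */
        }
        i++;
    }
}
```

Calling match_identifier on a string that starts with "x1" followed by a space reports a match of length 2, while a string starting with a digit is rejected with 0.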
10. Lex…
Lex has some limitations: it cannot be used to recognize nested structures such
as parentheses, since it only has states and transitions between states.
So, Lex is good at pattern matching, while Yacc handles more challenging
tasks.
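The limitation can be illustrated with a small sketch: checking balanced parentheses needs a counter that may grow without bound, which a fixed set of states cannot represent. The function parens_balanced below is a hypothetical helper written for this illustration, not part of Lex or Yacc.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true if every '(' in s has a matching ')'.
   The depth counter can grow without bound, which is what a finite
   number of automaton states cannot model. */
bool parens_balanced(const char *s)
{
    long depth = 0;

    for (; *s != '\0'; s++) {
        if (*s == '(') {
            depth++;
        } else if (*s == ')') {
            if (depth == 0)
                return false;   /* closing parenthesis with nothing open */
            depth--;
        }
    }
    return depth == 0;          /* everything opened was also closed */
}
```

A parser generated by Yacc keeps this kind of information on its stack instead of in a single counter, which is why it can handle nesting.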
14. Lex…
Whitespace must separate the defined name and its associated expression.
C code in the definitions section is copied as-is to the top of the generated C file and
must be bracketed with “%{” and “%}” markers.
Substitutions in the rules section are surrounded by braces ({letter}) to distinguish them
from literals.
15. Lex…
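%{
/* Lex program to recognize identifiers, keywords, and numbers */
#include <stdio.h>
%}
letter  [a-zA-Z]
digit   [0-9]
id      {letter}({letter}|{digit})*
number  {digit}+(\.{digit}+)?(E[+-]?{digit}+)?
%%
if        { printf("%s is a Keyword\n", yytext); }     /* listed before {id} so the keyword wins */
{id}      { printf("%s is an Identifier\n", yytext); }
{number}  { printf("%s is a Number\n", yytext); }
%%
/* Auxiliary functions */
int yywrap(void) { return 1; }          /* no further input files */
int main(void) { yylex(); return 0; }   /* scan standard input */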
16. Yacc
➢ Yacc reads a grammar and generates C code for a
parser.
➢ Parser generator (a compiler-compiler: Yet Another Compiler-Compiler)
➢ Input files use the .y extension
◦ Grammars are written in Backus-Naur Form (BNF)
◦ A BNF grammar is used to express context-free languages.
◦ e.g. to parse an expression, perform the reverse of derivation
(reduce the expression back to the start symbol)
◦ LALR(1) parser (Look-Ahead LR, with one token of lookahead)
◦ Uses a stack (LIFO) for storage
◦ Written by Stephen C. Johnson
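The ideas of "reducing" and "using a stack" can be sketched by hand. The code below is a simplified shift-reduce recognizer for the assumed example grammar S → a S b | a b. It is not the table-driven LALR(1) code that Yacc generates; the name accepts_anbn and the limit STACK_MAX are hypothetical choices made for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define STACK_MAX 256   /* arbitrary limit chosen for this sketch */

/* Hypothetical helper: shift each input symbol onto a stack, then reduce
   whenever the body of a rule (a S b, or a b) sits on top of the stack.
   The input is accepted if exactly the start symbol S remains. */
bool accepts_anbn(const char *input)
{
    char stack[STACK_MAX];
    size_t top = 0;                       /* number of symbols on the stack */

    for (; *input != '\0'; input++) {
        if ((*input != 'a' && *input != 'b') || top == STACK_MAX)
            return false;                 /* unknown symbol or stack full */
        stack[top++] = *input;            /* shift */

        for (;;) {                        /* reduce while a rule body is on top */
            if (top >= 3 && stack[top - 3] == 'a' &&
                stack[top - 2] == 'S' && stack[top - 1] == 'b') {
                top -= 3;
                stack[top++] = 'S';       /* reduce by S -> a S b */
            } else if (top >= 2 && stack[top - 2] == 'a' &&
                       stack[top - 1] == 'b') {
                top -= 2;
                stack[top++] = 'S';       /* reduce by S -> a b */
            } else {
                break;
            }
        }
    }
    return top == 1 && stack[0] == 'S';
}
```

A generated parser makes the same shift and reduce moves, but chooses between them from precomputed tables and the lookahead token instead of inspecting the stack contents directly.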
17. Yacc…
Input to Yacc is divided into three sections.
... definitions ...
%%
... rules ...
%%
... subroutines ...
18. Yacc…
The definitions section consists of:
◦ token declarations.
◦ C code bracketed by “%{” and “%}”.
The rules section consists of:
◦ the BNF grammar.
The subroutines section consists of:
◦ user subroutines.
20. Linking lex and Yacc…
[Figure: build pipeline]
scan.l → Lex → lex.yy.c
parse.y → Yacc → y.tab.c (yyparse), with lex.yy.c included
y.tab.c → C compiler → a.out
source program → a.out → output
21. Linking lex and Yacc…
• yylval: value associated with the token
• yytext: pointer to the matched input text
22. Linking lex and Yacc…
$$: value of the head (left-hand side) of the rule
$i: value of the ith symbol of the body
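In Yacc, $$ denotes the value of a rule's left-hand side and $i the value of the ith symbol of its body. As a rough sketch of what an action such as { $$ = $1 + $3; } on an assumed rule E : E '+' E amounts to, the code below keeps a value stack by hand. The names vstack, push_value and reduce_add are hypothetical and are not identifiers from Yacc's generated code.

```c
#define VSTACK_MAX 64            /* arbitrary size; the sketch does no bounds checks */

static int vstack[VSTACK_MAX];   /* one value per grammar symbol on the parse stack */
static int vtop = 0;             /* number of values currently on the stack */

/* Push the value of a symbol that was just shifted. */
void push_value(int v)
{
    vstack[vtop++] = v;
}

/* Reduce by the assumed rule E : E '+' E with action { $$ = $1 + $3; }.
   The three body values are popped and the result is pushed as the
   value of the new E. The caller must have pushed three values first. */
int reduce_add(void)
{
    int v1 = vstack[vtop - 3];   /* $1: value of the first E */
    int v3 = vstack[vtop - 1];   /* $3: value of the second E */
    int result = v1 + v3;        /* $$ */

    vtop -= 3;                   /* pop the three body symbols */
    vstack[vtop++] = result;     /* push the value of the head */
    return result;
}
```

For example, pushing 2, a placeholder value for '+', and 5, then calling reduce_add, leaves 7 on the stack as the value of the reduced E.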
23. Linking lex and Yacc…
❑ Construct a Lexical Analyzer and Parser for the
Language L = {a^n b^n : n ≥ 1}.
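%{
/* Lex specification: ab.l */
#include "y.tab.h"   /* token codes A and B, as declared with %token in cd.y */
%}
%%
a     { return A; }
b     { return B; }
\n    { return '\n'; }        /* end of the input line */
.     { return yytext[0]; }   /* any other character reaches the parser as an error */
%%
/* No main function here: main( ) is defined in the Yacc file cd.y */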
24. Linking lex and Yacc…
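%{
/* Yacc specification: cd.y */
#include <stdio.h>
#include <stdlib.h>
int yylex(void);
void yyerror(const char *msg);
%}
%token A B
%%
start : S '\n'   { return 0; }   /* a complete line was recognized */
      ;
S     : A S B
      | A B                      /* base case, n = 1 */
      ;
%%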
25. Linking lex and Yacc…
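/* Yacc specification cd.y (continued): subroutines section */
/* Main function */
int main(void)
{
    printf("Enter a String: ");
    if (yyparse() == 0)
        printf("Valid String\n");
    return 0;
}

/* Called by the parser when the input does not match the grammar */
void yyerror(const char *msg)
{
    printf("Invalid String\n");
    exit(0);
}

/* Tells the scanner that there is no further input file */
int yywrap(void)
{
    return 1;
}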