Manish M. Goswami et al, Int. Journal of Computer Science and Mobile Computing Vol.2 Issue. 11, November- 2013, pg. 303-309
© 2013, IJCSMC All Rights Reserved 303
Available Online at www.ijcsmc.com
International Journal of Computer Science and Mobile Computing
A Monthly Journal of Computer Science and Information Technology
ISSN 2320–088X
IJCSMC, Vol. 2, Issue 11, November 2013, pg. 303–309
RESEARCH ARTICLE
Stack Based Implementation of Ordered Choice in Packrat Parsing
Manish M. Goswami, Research Scholar, G.H. Raisoni College of Engg., Nagpur, India
M.M. Raghuwanshi, Professor, Rajiv Gandhi College of Engg., Nagpur, India
Latesh Malik, Professor, Dept. of CSE, G.H. Raisoni College of Engineering, Nagpur, India
ABSTRACT: Packrat parsing is a variant of recursive descent parsing with memoization: intermediate
parsing results are saved as they are computed so that no result is evaluated more than once. It is
extremely useful because it allows unlimited lookahead without compromising the power and flexibility
of backtracking. However, packrat parsers need storage proportional to the input size for memoization.
This makes them unsuitable for parsing input streams that have a simple format but contain a large
amount of data.
In this paper, instead of translating productions into procedure calls with memoization, an attempt is
made to eliminate the calls by using an explicit stack, without memoization, to implement the ordered
choice operator of Parsing Expression Grammars (PEGs). The experimental results suggest the possibility
of using this stack based algorithm to eliminate the storage needed for memoization while keeping the
guarantee of linear parse time.
Keywords: Parsing; PEG; Packrat; CFG
I. INTRODUCTION
Recursive-descent parsing is a top-down method of syntax analysis in which a set of recursive
procedures is executed to process the input. It was suggested as early as 1961 by Lucas [1]. The great
advantage of a recursive-descent parser is its simplicity and clear relationship to the grammar. For
smaller grammars, the parser can be easily produced and maintained by hand. This is in contrast to
bottom-up parsers, which are normally driven by large tables that have no obvious relationship to the
grammar; these tables must be mechanically generated. The problem with constructing recursive-descent
parsers from a classical context-free grammar is that the grammar must have the so-called LL(1)
property. Forcing the language into the LL(1) mold can make the grammar, and the parser, unreadable.
The LL(1) restriction can be circumvented by the use of backtracking. However, full backtracking may
require exponential time. A reasonable compromise is limited backtracking: never try another
alternative after one alternative has already succeeded on a portion of input. Recently, Ford [2–4]
introduced a language for writing recursive-descent parsers with limited backtracking. It is called
Parsing Expression Grammar (PEG) and has the form of a grammar that can be easily transcribed into a
set of recursive procedures. In addition to backtracking, PEG can directly define structures that
normally require a separate "lexer" or "scanner". Together with the lifting of the LL(1) restriction,
this gives a very convenient tool when we need an ad-hoc parser for some application.
Theoretically, even limited backtracking may require a lot of time. In [2, 3], PEG was introduced
together with a technique called packrat parsing. Packrat parsing handles backtracking by extensive
memoization: storing all results of parsing procedures. It guarantees linear parsing time at a huge
memory cost. There exists a complete parser generator named Rats! [5, 6] that produces packrat parsers
from PEG. Excessive backtracking does not matter in small interactive applications where the input is
short and performance is not critical. Moreover, experiments reported in [7, 8] demonstrated only
moderate backtracking activity in PEG parsers for the programming languages Java 1.5 and C.
In this paper a stack based approach is adopted: the grammar is transcribed into explicit call-stack
Push and Pop statements in place of the procedure calls used in packrat parsing. The algorithm is
inspired by the Generalized LL (GLL) parsing algorithm [12]. Each nonterminal with which a production
begins is associated with a label marking the code that evaluates that nonterminal in the parse tree.
The stack data structure is used for two purposes: first, to store the label of the next nonterminal
to be evaluated within a choice, and second, to store the label of the alternative choice if the
production has more than one choice. The performance of the resulting parser is compared with the GLL
parsing algorithm (non-optimized version). Experimental results show that packrat parsing implemented
in this way (only the implementation of ordered choice is considered) leaves further scope for
optimization that is not available when it is implemented using procedure calls. Also, compared with
the GLL parsing algorithm, the algorithm shows improvement on some inputs. This is expected, because
the GLL algorithm explores all the choices, while the packrat parsing algorithm explores an
alternative choice only if the first choice fails.
II. BACKGROUND
Parsing Expression Grammars (PEGs)
PEGs are a recognition-based formal syntactic foundation, first presented by Ford [10], for
describing the syntax of formal languages. PEGs are a formalization of recursive descent parsing with
backtracking. A PEG consists of rules of the form N <- e, where N is a nonterminal symbol and e is an
expression (called a parsing expression). A parsing expression is built from the following elements,
as shown in Figure 1:
ε : Empty string
" " : String literal
[ ] : Character class
. : Wildcard (any character)
(e) : Grouping
N : Nonterminal
e1 e2 : Sequence
e1 / e2 : Ordered choice
e* : Zero-or-more repetition
&e : And-predicate (positive lookahead)
!e : Not-predicate (negative lookahead)
e+ : One-or-more repetition
e? : Zero-or-one (optional)
Figure 1. Expressions Constituting a Parsing Expression
PEGs appear similar to Extended Backus-Naur Form (EBNF) [9]. However, e1 / e2 is an ordered choice,
not the unordered choice e1 | e2 of EBNF. e1 / e2 does not denote the strings expressed by e1 or e2;
it denotes an action in which e1 is evaluated first and e2 is evaluated only when e1 fails. Therefore,
in general, e1 / e2 ≠ e2 / e1. In addition, e* is not the same as the operator used in regular
expressions (REs): it is not a greedy match as in REs but a possessive match. For example, "a"* "a" in
PEG does not express {"a", "aa", ...} but Ø, because once "a"* succeeds, the parser does not backtrack
into it even if the following expression "a" does not succeed. &e and !e are lookahead expressions:
!e succeeds if e does not succeed, and &e succeeds if e succeeds. Note that !e and &e do not change
the parser's position in the input even when they succeed. PEGs can express all deterministic LR(k)
languages and some non-context-free languages [10].
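The difference between ordered choice and unordered choice, and between possessive and greedy
repetition, can be illustrated with a few parser combinators. The following is a minimal Python
sketch, not part of the original paper: the helper names lit, seq, choice, star and not_pred are
assumptions made for illustration, and each parser returns the new input position on success or None
on failure.

```python
def lit(s):
    """Match the literal string s at the current position."""
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def seq(*parsers):
    """e1 e2 ...: every part must succeed, one after the other."""
    def parse(text, pos):
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def choice(*parsers):
    """e1 / e2 ...: the first alternative that succeeds wins."""
    def parse(text, pos):
        for p in parsers:
            end = p(text, pos)
            if end is not None:
                return end  # later alternatives are never tried
        return None
    return parse

def star(p):
    """e*: possessive repetition, consumes as much as it can and never gives back."""
    def parse(text, pos):
        while True:
            end = p(text, pos)
            if end is None or end == pos:
                return pos
            pos = end
    return parse

def not_pred(p):
    """!e: succeeds without consuming input exactly when e fails."""
    def parse(text, pos):
        return pos if p(text, pos) is None else None
    return parse
```

With these helpers, choice(lit("a"), lit("ab")) stops after one character of "ab" while
choice(lit("ab"), lit("a")) consumes both, and seq(star(lit("a")), lit("a")) fails on "aaa" because
star has already consumed every "a".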
Packrat Parsing
Roughly speaking, packrat parsing is a combination of PEG-based recursive descent parsing with
memoization. A packrat parser treats each nonterminal of a PEG as a parsing function and parses an
input by calling these functions. A parsing function takes a start position in the input as an
argument and returns a parse result, which is either failure or success. A parsing function must be
pure in a packrat parser; that is, the same parsing function called at the same position returns the
same result. This fact allows parsing functions to be memoized. Memoization of all parsing functions
guarantees that a packrat parser parses any input in linear time. Despite this guarantee, packrat
parsing is powerful: it can handle the wide range of grammars that PEGs can express, including all
deterministic LR(k) languages and some non-context-free languages. In addition, packrat parsing does
not require a separate lexer because when one of the choices fails in parsing, a packrat parser can
backtrack to the other choice, unlike traditional LL or LR predictive parsers. Furthermore, because
packrat parsers are simple, they can be implemented easily. Although packrat parsing is a simple,
powerful, and linear time parsing algorithm, it has a major disadvantage: packrat parsers require O(n)
space for memoization because they memoize all intermediate results. Because of this disadvantage,
packrat parsers are considered unsuitable for parsing large files (e.g. XML streams).
Rats! [5], which generates packrat parsers in Java, supports several optimizations to improve
execution performance and memory efficiency. For example, Rats! merges successive memoized fields into
objects called chunks to decrease the heap size. With chunks and many other optimizations, parsers
generated by Rats! achieve an execution performance comparable to that of LL parsers generated by
ANTLR [11]. However, Rats! does not resolve the fundamental problem that packrat parsers require O(n)
space.
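The relationship between memoization and the O(n) space cost can be made concrete with a small
recognizer. The following Python sketch is an illustration only, not the output of Rats! or any other
generator. It assumes the PEG S <- A S d / B S / ε, A <- a / c, B <- a / b, and the names
packrat_recognize and apply are hypothetical. Each (rule, position) pair is evaluated at most once,
so the memo table holds at most one entry per rule per input position.

```python
def packrat_recognize(text):
    """Return (accepted, memo_size) for S <- A S 'd' / B S / ε."""
    memo = {}  # (rule name, position) -> end position, or None on failure

    def apply(rule, pos):
        # Evaluate rule at pos only if this pair has not been seen before.
        key = (rule.__name__, pos)
        if key not in memo:
            memo[key] = rule(pos)
        return memo[key]

    def A(pos):
        return pos + 1 if text[pos:pos + 1] in ("a", "c") else None

    def B(pos):
        return pos + 1 if text[pos:pos + 1] in ("a", "b") else None

    def S(pos):
        # First alternative: A S 'd'
        p = apply(A, pos)
        if p is not None:
            p = apply(S, p)
            if p is not None and text[p:p + 1] == "d":
                return p + 1
        # Second alternative, tried from the original position: B S
        p = apply(B, pos)
        if p is not None:
            p = apply(S, p)
            if p is not None:
                return p
        # Third alternative: ε always succeeds without consuming input
        return pos

    end = apply(S, 0)
    return end == len(text), len(memo)
```

Because the sketch is recursive, its depth grows with the input length, so very long inputs would
exceed Python's default recursion limit; this is one practical motivation for replacing calls with an
explicit stack.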
Call Stacks and elementary descriptors
A traditional recursive-descent parser for the grammar Γ0 given below is composed of parse functions,
one per nonterminal, operating on an input array I terminated by $:
S -> ASd | BS | ε
A -> a | c
B -> a | b
main()
{
i := 0
if (pS() and I[i] = $) report success
else error()
}
pS() { if (pA() and pS() and I[i] = d) { i := i + 1; return true }
else if (pB() and pS()) return true
else return true }    // S -> ε always succeeds
pA() { if (I[i] = a) { i := i + 1; return true }
else if (I[i] = c) { i := i + 1; return true } else return false }
pB() { if (I[i] = a) { i := i + 1; return true }
else if (I[i] = b) { i := i + 1; return true } else return false }
Of course, Γ0 is not an LL(1) grammar, so this algorithm will not behave correctly without some
additional mechanism for dealing with non-determinism: for example, pS() does not restore i when an
alternative fails part-way. This is addressed by converting the function calls into explicit call
stack operations using stack push and goto statements in the usual way. We also partition the body of
those functions whose corresponding nonterminal is not LL(1) and separately label each partition.
In practice, then, some goto statements will have several target labels, corresponding to these
multiple partitions: for example, this will be the case for the nonterminal S in Γ0. We use
descriptors to record each possible choice, and replace termination in the recursive-descent (RD)
algorithm with a restart of execution from the point recorded in the next descriptor on the stack.
Instead of calling the error function, the algorithm simply processes the next descriptor of an
alternative choice, if one is available on the stack, and it terminates when there are no further
descriptors to be processed. Finally, it signals an error when no descriptor is available and not all
of the input has been processed.
Two stacks are maintained. The parse stack stores the label of the next nonterminal in the current
choice, and the backtrack stack stores the starting label of the next choice if the nonterminal has
more than one production.
For example, when a node corresponding to the nonterminal S in S -> ASd | BS is evaluated, the
starting label for evaluating the choice BS is stored on the backtrack stack, to be used for
backtracking if the choice ASd fails, and control is transferred to the starting label of the choice
ASd. There A is evaluated, but first the label for evaluating the next node, corresponding to S in
ASd, is pushed onto the parse stack so that execution can return to it after A is evaluated.
In detail, an elementary descriptor is a triple (L, s, j) where L is a line label, s is a parse stack
and j is a position in the input array I. We maintain a placeholder R for the current descriptor. At
the end of a parse function and at points of non-determinism in the grammar we create a new descriptor
using the label at the top of the current parse stack. When a particular execution of the algorithm
stops, at input I[i] say, the top element L is popped from the stack s = [s', L] and (L, s', i) is
added to R (if it has not already been added). We use POP(s, i, R) to denote this action. Then the
next descriptor (L', t, j) is removed from R and execution starts at line L' with call stack t and
input symbol I[j]. The overall execution terminates when R is empty. To allow backtracking we record
both the line label L and the current input buffer index k on the stack, using the notation L^k. At
this interim stage we treat the stack as a bracketed list, where [ ] denotes the empty stack, and we
assume a function PUSH(s, L^k) which simply updates the stack s by pushing the element L^k onto it.
The details of the algorithm are as follows:
i := 0; R := ∅; s := [L0^0]
LS: add (LS1, s, i) to R
L0: if (R ≠ ∅) { remove (L, s1, j) from R
        if (L = L0 and s1 = [ ] and j = |I| and b = [ ]) report success
        else { s := s1; i := j; goto L } }
    else report failure
LS1: PUSH(s, L1^i); PUSH(b, LS2^i); goto LA
L1: PUSH(s, L2^i); goto LS
L2: if (I[i] = d) { i := i + 1; POP(s, i, R) } else POP(b, i, R); goto L0
LS2: PUSH(s, L3^i); PUSH(b, LS3^i); goto LB
L3: PUSH(s, L4^i); goto LS
L4: POP(s, i, R); goto L0
LS3: POP(s, i, R); goto L0
LA: if (I[i] = a) { i := i + 1; POP(s, i, R); goto L0 }
    else { if (I[i] = c) { i := i + 1; POP(s, i, R); goto L0 }
           else { POP(b, i, R); goto L0 } }
LB: if (I[i] = a) { i := i + 1; POP(s, i, R); goto L0 }
    else { if (I[i] = b) { i := i + 1; POP(s, i, R); goto L0 }
           else { POP(b, i, R); goto L0 } }
P: Parse Stack, B: Backtrack Stack
Fig. 2: When the input pointer is at pos = 0 and label LS1 is evaluated. P holds L1^0; B holds LS2^0.
Fig. 3: When input a is matched and label L1 is evaluated. P holds L2^1; B holds LS2^0.
Fig. 4: When the input pointer is at pos = 1 and label LS1 is evaluated. P holds L2^1 and L1^1; B
holds LS2^0 and LS2^1.
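The two-stack scheme can be sketched in Python as a label-driven loop. This is a minimal sketch under
stated assumptions rather than the authors' Java implementation: it assumes the grammar
S <- A S d / B S / ε, A <- a / c, B <- a / b, keeps a single running position instead of the
descriptor set R, and removes (commits) a backtrack entry as soon as its alternative succeeds. The
function name recognize, the pseudo-labels RET and FAIL, and the way pushes and pops are counted are
all assumptions made for illustration, so the counts are not expected to reproduce the figures
reported in the experiments.

```python
def recognize(text):
    """Recognize text with S <- A S 'd' / B S / ε, A <- 'a' / 'c', B <- 'a' / 'b'.

    Returns (accepted, pushes, pops); the counters cover both stacks.
    """
    inp = text + "$"   # '$' is an end-of-input sentinel
    pos = 0
    parse = ["L0"]     # parse stack: labels to continue at after a nonterminal
    back = []          # backtrack stack: (alternative label, position, parse height)
    pushes = pops = 0
    label = "S"
    while True:
        if label == "S":          # first alternative of S: A S 'd'
            back.append(("S2", pos, len(parse)))
            parse.append("L1")
            pushes += 2
            label = "A"
        elif label == "L1":       # A matched, now evaluate S
            parse.append("L2")
            pushes += 1
            label = "S"
        elif label == "L2":       # S matched, now expect 'd'
            if inp[pos] == "d":
                pos += 1
                back.pop()        # commit: drop this S's pending alternative
                pops += 1
                label = "RET"
            else:
                label = "FAIL"
        elif label == "S2":       # second alternative of S: B S
            back.append(("S3", pos, len(parse)))
            parse.append("L3")
            pushes += 2
            label = "B"
        elif label == "L3":       # B matched, now evaluate S
            parse.append("L4")
            pushes += 1
            label = "S"
        elif label == "L4":       # B S matched: commit and return
            back.pop()
            pops += 1
            label = "RET"
        elif label == "S3":       # third alternative of S: ε
            label = "RET"
        elif label == "A":
            if inp[pos] in ("a", "c"):
                pos += 1
                label = "RET"
            else:
                label = "FAIL"
        elif label == "B":
            if inp[pos] in ("a", "b"):
                pos += 1
                label = "RET"
            else:
                label = "FAIL"
        elif label == "RET":      # success: continue at the label on top of the parse stack
            label = parse.pop()
            pops += 1
        elif label == "FAIL":     # failure: restore the most recent pending alternative
            if not back:
                return False, pushes, pops
            label, pos, height = back.pop()
            pops += 1
            del parse[height:]
        else:                     # label == "L0": the start symbol has returned
            return pos == len(text) and not back, pushes, pops
```

For example, recognize("abd") accepts, while recognize("abdd") rejects because S commits after
consuming "abd" and the trailing "d" is left over. Since nothing is memoized, an input such as
"a" * 10 + "d" makes the loop re-evaluate S at the same position many times.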
III. EXPERIMENTATION AND ANALYSIS
The algorithm is implemented in Java. The grammar that is translated into explicit stack calls is the
one given in the section above, reproduced here:
S -> ASd | BS | ε
A -> a | c
B -> a | b
Though the grammar seems simple, it reveals many facts pertaining to parsing. The resulting program
is run on a Core 2 Duo 1.8 GHz machine with 2 GB RAM. The algorithm is run 50 times on each input
string for each parameter and the average is taken. The parameters considered to measure performance
are:
1) No. of Push
2) No. of Pop
3) Time required to recognize the input string.
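The averaging procedure described above can be sketched as a small timing helper. The paper's
measurements were taken with a Java implementation; the Python function below, including the name
average_ns and its parameters, is a hypothetical illustration of the same idea of running a
recognizer 50 times on one input and averaging the elapsed time.

```python
import time

def average_ns(fn, arg, runs=50):
    """Call fn(arg) `runs` times and return the mean elapsed time in nanoseconds."""
    total = 0
    for _ in range(runs):
        start = time.perf_counter_ns()
        fn(arg)
        total += time.perf_counter_ns() - start
    return total / runs
```

Push and pop counts do not vary between runs for a deterministic recognizer, so only the timing
needs to be averaged in this way.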
The performance of the above algorithm is measured and compared with the GLL algorithm (the
non-optimized version of GLL, i.e. without the graph structured stack, is considered here).
As can be seen from the following table, the push count, pop count, and parsing time differ between
the two algorithms for each input. The explanation is given case by case:
Case I: For input a^65 b d^65, the stack based implementation of the packrat parsing algorithm
requires less time than the GLL algorithm. This is because the GLL algorithm evaluates both choices,
ASd and BS, for every a, whereas only S -> ASd is the correct choice for recognizing a^65 b d^65. The
proposed algorithm evaluates only the choice ASd for every a, which reduces the push and pop counts
and thereby the time required to parse the input.
Case II: For input b^130, the GLL algorithm takes less time. This is because for every b the correct
choice, BS, is picked up directly, while the proposed algorithm, by the ordered-choice property of
packrat parsing, first evaluates ASd for every b, which necessarily fails, and only then evaluates
BS. This results in more pushes and pops and therefore more time.
Case III: For input a^10 d, GLL again requires more time than the proposed algorithm. This case is
similar to Case I.
Sr. No.  Input         Parameter           Stack based implementation of   GLL Parsing
                                           packrat parsing Algorithm       Algorithm
1        a^65 b d^65   Push_Count          204                             6140
                       Pop_Count           205                             15358
                       Parsing Time (ns)   674655                          14449780
2        b^130         Push_Count          654                             260
                       Pop_Count           655                             261
                       Parsing Time (ns)   1231788                         1132462
3        a^10 d        Push_Count          6650                            4092
                       Pop_Count           6651                            6129
                       Parsing Time (ns)   6119207                         7073932
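The relative parsing times implied by the table can be checked with a short calculation. The Python
snippet below is an illustrative aid only: the timing values are copied from the table, while the
names TIMES_NS, gll_to_stack_ratio and RATIOS are assumptions introduced here.

```python
# Parsing times in nanoseconds from the table: (stack based packrat, GLL).
TIMES_NS = {
    "a^65 b d^65": (674655, 14449780),
    "b^130": (1231788, 1132462),
    "a^10 d": (6119207, 7073932),
}

def gll_to_stack_ratio(times):
    """Return GLL time divided by stack based packrat time for each input."""
    return {inp: gll / stack for inp, (stack, gll) in times.items()}

RATIOS = gll_to_stack_ratio(TIMES_NS)
```

A ratio above 1 means GLL took longer on that input, which is the situation in Cases I and III; a
ratio below 1 corresponds to Case II.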
IV. CONCLUSION AND FUTURE WORK
In this paper, an implementation of the packrat parsing algorithm is proposed that replaces function
calls with explicit stack push and pop operations. Its performance, in terms of the time required to
recognize the input, is compared with the GLL parsing algorithm. The advantage of this algorithm is
the opportunity for optimization (for example, modeling the stack as a graph structured stack) to
reduce the time required to recognize the input, which is not possible when the algorithm is
implemented in recursive descent style. It would be interesting to see the impact of the stack used
in the proposed algorithm on the use of memoization in the packrat parsing algorithm. If this somehow
reduces the memory required for memoization, then the packrat parsing algorithm will have linear
parsing time without sacrificing storage.
REFERENCES
[1] Lucas, P. The structure of formula-translators. ALGOL Bulletin Supplement 16 (September 1961), 1–27.
[2] Ford, B. Packrat parsing: a practical linear-time algorithm with backtracking. Master's thesis, Massachusetts Institute of Technology, September 2002.
[3] Ford, B. Packrat parsing: Simple, powerful, lazy, linear time. In Proceedings of the 2002 International Conference on Functional Programming (October 2002).
[4] Ford, B. Parsing expression grammars: A recognition-based syntactic foundation. In Proceedings of the 31st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (Venice, Italy, 14–16 January 2004), N. D. Jones and X. Leroy, Eds., ACM, pp. 111–122.
[5] Grimm, R. Rats! – an easily extensible parser generator. http://www.cs.nyu.edu/rgrimm/xtc/rats.html.
[6] Grimm, R. Practical packrat parsing. Tech. Rep. TR2004-854, Dept. of Computer Science, New York University, March 2004.
[7] Redziejowski, R. R. Parsing Expression Grammar as a primitive recursive-descent parser with backtracking. Fundamenta Informaticae 79, 3–4 (2007), 513–524.
[8] Redziejowski, R. R. Some aspects of Parsing Expression Grammar. Fundamenta Informaticae 85, 1–4 (2008), 441–454.
[9] I. S. Organization. Syntactic metalanguage – Extended BNF, 1996. ISO/IEC 14977.
[10] Ford, B. Parsing expression grammars: A recognition-based syntactic foundation. In Symposium on Principles of Programming Languages, January 2004.
[11] Parr, T. J. and Quong, R. W. ANTLR: A predicated-LL(k) parser generator. Software Practice and Experience, 25:789–810, 1994.
[12] Scott, E. and Johnstone, A. GLL Parsing. Electronic Notes in Theoretical Computer Science 253 (2010), 177–189.

Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxwendy cai
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingrakeshbaidya232001
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSRajkumarAkumalla
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝soniya singh
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Christo Ananth
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...srsj9000
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Dr.Costas Sachpazis
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 

Recently uploaded (20)

main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfid
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 

Packrat parsing

Manish M. Goswami et al, Int. Journal of Computer Science and Mobile Computing, Vol. 2, Issue 11, November 2013, pg. 303-309
© 2013, IJCSMC All Rights Reserved

Available Online at www.ijcsmc.com
International Journal of Computer Science and Mobile Computing
A Monthly Journal of Computer Science and Information Technology
ISSN 2320–088X
IJCSMC, Vol. 2, Issue 11, November 2013, pg. 303–309

RESEARCH ARTICLE

Stack Based Implementation of Ordered Choice in Packrat Parsing

Manish M. Goswami, M.M. Raghuwanshi, Latesh Malik
Research Scholar; Professor; Professor, Dept. of CSE
G.H. Raisoni College of Engg.; Rajiv Gandhi College of Engg.; G.H. Raisoni College of Engineering
Nagpur, India

ABSTRACT: Packrat parsing is a variant of the recursive descent parsing technique with memoization: intermediate parsing results are saved as they are computed so that no result is evaluated more than once. It is extremely useful because it allows unlimited lookahead without compromising the power and flexibility of backtracking. However, packrat parsers need storage proportional to the input size (a constant multiple of it) for memoization. This makes packrat parsers unsuitable for parsing input streams that have a simple format but contain a large amount of data.

In this paper, instead of translating productions into procedure calls with memoization, an attempt is made to eliminate the calls by using a stack, without memoization, to implement the ordered choice operator of Parsing Expression Grammars (PEGs). The experimental results show the possibility of using this stack-based algorithm to eliminate the need for memoization storage while guaranteeing linear parse time.

Keywords: Parsing; PEG; Packrat; CFG

I. INTRODUCTION

Recursive-descent parsing is a top-down method of syntax analysis in which a set of recursive procedures is executed to process the input. It was suggested as early as 1961 by Lucas [1]. The great advantage of a recursive-descent parser is its simplicity and clear relationship to the grammar. For smaller grammars, the parser can easily be produced and maintained by hand. This is in contrast to bottom-up parsers, which are normally driven by large tables that have no obvious relationship to the grammar; these tables must be mechanically generated.

The problem with constructing recursive-descent parsers from a classical context-free grammar is that the grammar must have the so-called LL(1) property. Forcing the language into the LL(1) mold can make the grammar, and the parser, unreadable. The LL(1) restriction can be circumvented by the use of backtracking. However, full backtracking may require exponential time. A reasonable compromise is limited backtracking: never try another alternative after one alternative has already succeeded on a portion of the input.

Recently, Ford [2–4] introduced a language for writing recursive-descent parsers with limited backtracking. It is called Parsing Expression Grammar (PEG) and has the form of a grammar that can easily be transcribed into a set of recursive procedures. In addition to backtracking, PEG can directly define structures that normally require a separate "lexer" or "scanner". Together with the lifting of the LL(1) restriction, this gives a very convenient tool when an ad-hoc parser is needed for some application.

Theoretically, even limited backtracking may require a lot of time. In [2, 3], PEG was introduced together with a technique called packrat parsing. Packrat parsing handles backtracking by extensive memoization: storing all results of parsing procedures. It guarantees linear parsing time at a huge memory cost. There exists a complete parser generator named Rats! [4, 5] that produces packrat parsers from PEG. Excessive backtracking does not matter in small interactive applications where the input is short and performance is not critical. Moreover, experiments reported in [7, 8] demonstrated only moderate backtracking activity in PEG parsers for the programming languages Java 1.5 and C.

In this paper, a stack-based approach is adopted: the grammar is transcribed into explicit call-stack push and pop statements in place of the procedure calls used in packrat parsing.
The algorithm is inspired by the Generalized LL (GLL) parsing algorithm [12]. Each nonterminal with which a production begins is associated with a label marking the code that evaluates that nonterminal in the parse tree. The stack is used for two purposes: first, to store the label for the next nonterminal to be evaluated in a choice, and second, to store the label of the alternative choice if the production has more than one alternative. The performance of the resulting parser is compared with the GLL parsing algorithm (non-optimized version). Experimental results show that packrat parsing implemented in this way (only the implementation of ordered choice is considered) leaves further scope for optimization that is not available when it is implemented using procedure calls. In addition, the algorithm outperforms the GLL parsing algorithm on some inputs, which is expected because GLL explores all the choices, while the packrat parsing algorithm explores an alternative choice only if the first choice fails.

II. BACKGROUND

Parsing Expression Grammars (PEGs)

PEGs are a recognition-based formal syntactic foundation first presented by Ford [10] to describe the syntax of formal languages. PEGs are a formalization of recursive descent parsing with backtracking. A PEG consists of rules of the form N <- e, where N is a nonterminal symbol and e is an expression (called a parsing expression). A parsing expression consists of the following elements, as shown in Figure 1:

ε : Empty string
" " : String literal
[ ] : Character class
. : Wildcard (any character)
(e) : Grouping
N : Nonterminal
e1 e2 : Sequence
e1 / e2 : Ordered choice
e* : Zero-or-more repetition
&e : And-predicate (positive lookahead)
!e : Not-predicate (negative lookahead)
e+ : One-or-more repetition
e? : Zero-or-one (optional)

Figure 1. Expressions Constituting a Parsing Expression

PEGs appear similar to Extended Backus-Naur Forms (EBNFs) [9]. However, e1 / e2 is an ordered choice, not the unordered choice e1 | e2 of EBNF. e1 / e2 does not denote the strings expressed by e1 or e2; it denotes an action in which e1 is evaluated first and e2 is evaluated only when e1 fails. Therefore, in general, e1 / e2 ≠ e2 / e1. In addition, e* is not the same as the corresponding operator in regular expressions (REs): it is not a greedy match as in REs but a possessive match. For example, "a"* "a" in PEG does not express {"a", "aa", ...} but Ø, because once "a"* succeeds, the parser does not backtrack even if the following expression "a" does not succeed. &e and !e are lookahead expressions: !e succeeds if e does not succeed, and &e succeeds if e succeeds. Note that !e and &e do not change the parser's position in the input even if they succeed. PEGs can express all deterministic LR(k) languages and some non-context-free languages [10].

Packrat Parsing

Roughly speaking, packrat parsing is a combination of PEG-based recursive descent parsing with memoization. A packrat parser treats each nonterminal of a PEG as a parsing function and carries out parsing on an input by calling that function. A parsing function takes a start position in the input as an argument and returns a parse result, which is either failure or success. A parsing function in a packrat parser must be pure; that is, the same parsing function called at the same position returns the same result.
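Both the parsing-function view and the PEG operator semantics described above can be sketched in Python. This is only an illustration under stated assumptions: the combinator names (lit, seq, choice, star, and_pred, not_pred) and the convention that a parsing expression is a function from (text, pos) to a new position, or None on failure, are hypothetical and not taken from the paper.

```python
from typing import Callable, Optional

# Assumed convention: a parsing expression maps (text, pos) to the new
# position on success, or to None on failure.
Expr = Callable[[str, int], Optional[int]]

def lit(s: str) -> Expr:
    """String literal "s"."""
    def parse(text: str, pos: int) -> Optional[int]:
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def seq(*exprs: Expr) -> Expr:
    """Sequence e1 e2 ...: every part must succeed, left to right."""
    def parse(text: str, pos: int) -> Optional[int]:
        for e in exprs:
            pos = e(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def choice(*exprs: Expr) -> Expr:
    """Ordered choice e1 / e2 ...: commit to the first alternative that succeeds."""
    def parse(text: str, pos: int) -> Optional[int]:
        for e in exprs:
            result = e(text, pos)
            if result is not None:
                return result
        return None
    return parse

def star(e: Expr) -> Expr:
    """Possessive e*: consume as much as possible and never give any back."""
    def parse(text: str, pos: int) -> Optional[int]:
        while True:
            result = e(text, pos)
            if result is None or result == pos:  # stop on failure or empty match
                return pos
            pos = result
    return parse

def and_pred(e: Expr) -> Expr:
    """&e: succeeds if e succeeds, without consuming input."""
    def parse(text: str, pos: int) -> Optional[int]:
        return pos if e(text, pos) is not None else None
    return parse

def not_pred(e: Expr) -> Expr:
    """!e: succeeds if e fails, without consuming input."""
    def parse(text: str, pos: int) -> Optional[int]:
        return pos if e(text, pos) is None else None
    return parse

# "a"* "a" from the text: the star consumes every "a", so the trailing "a" fails.
a_star_a = seq(star(lit("a")), lit("a"))
```

With these definitions, a_star_a("aaa", 0) fails, and choice(lit("a"), lit("ab")) and choice(lit("ab"), lit("a")) consume different amounts of the input "ab", which is the sense in which e1 / e2 ≠ e2 / e1.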
Because parsing functions are pure, they can be memoized. Memoization of all parsing functions in a packrat parser guarantees that the parser handles any input in linear time.

Despite the guarantee of linear-time parsing, packrat parsing is powerful. It can efficiently handle the wide range of grammars that PEGs can express, including all deterministic LR(k) languages and some non-context-free languages. In addition, packrat parsing does not require a separate lexer because, when one of the choices fails during parsing, a packrat parser can backtrack to the other choice, unlike traditional predictive LL or LR parsers. Furthermore, because packrat parsers are simple, they can be implemented easily.

Although packrat parsing is a simple, powerful, linear-time parsing algorithm, it has a major disadvantage: packrat parsers require O(n) space for memoization because they memoize all intermediate results. Because of this disadvantage, packrat parsers are considered unsuitable for parsing large files (e.g. XML streams).

Rats! [5], which generates packrat parsers in Java, supports several optimizations to improve execution performance and memory efficiency. For example, Rats! merges successive memoized fields into objects called chunks to decrease the heap size. With chunks and many other optimizations, parsers generated by Rats! achieve an execution performance comparable to that of LL parsers generated by ANTLR [11]. However, Rats! does not resolve the fundamental problem that packrat parsers require O(n) space.

Call Stacks and Elementary Descriptors

A traditional parser for the grammar Γ0 given below is composed of parse functions:

S -> ASd | BS
A -> a | c
B -> a | b
Main() {
    i := 0
    if pS() and I[i] = $ report success
    else error()
}

pS() {
    Evaluate pA() followed by pS(), and then evaluate 'if (I[i] = d) { i := i + 1 }'.
    If all of these conditions hold, return true; otherwise evaluate the following.
    Evaluate pB() followed by pS(). If either condition is false, return false; otherwise return true.
}

pA() {
    if (I[i] = a) { i := i + 1; return true }
    else if (I[i] = c) { i := i + 1; return true }
    else return false
}

pB() {
    if (I[i] = a) { i := i + 1; return true }
    else if (I[i] = b) { i := i + 1; return true }
    else return false
}

Of course, Γ0 is not an LL(1) grammar, so this algorithm will not behave correctly without some additional mechanism for dealing with non-determinism. This is addressed by converting the function calls into explicit call stack operations using stack push and goto statements in the usual way. We also partition the body of those functions whose corresponding nonterminal is not LL(1) and separately label each partition. In practice, then, some goto statements will have several target labels, corresponding to these multiple partitions; for example, this is the case for the nonterminal S in Γ0. We use descriptors to record each possible choice, and replace termination in the recursive-descent (RD) algorithm with a restart of execution from the point recorded in the next descriptor on the stack. Instead of calling the error function, the algorithm simply processes the next descriptor of an alternative choice if one is available on the stack, and it terminates when there are no further descriptors to be processed. Finally, it signals an error when no descriptor is available and the input has not been fully processed.
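The parse functions above can be sketched as runnable Python under PEG-style limited backtracking: each alternative restarts from the position at which the nonterminal was entered, and a later alternative is tried only if the earlier one fails. The sketch assumes the grammar as it is stated in Section III (S -> ASd | BS | ε, A -> a | c, B -> a | b), since without the ε alternative S could never succeed. The function names and the use of None to signal failure are illustrative choices, not taken from the paper.

```python
from typing import Optional

def recognize_recursive(text: str) -> bool:
    """Recursive-descent recognizer with ordered choice (no memoization)."""

    def p_a(i: int) -> Optional[int]:
        # A <- 'a' / 'c'  (a tuple is used so that "" at end of input never matches)
        return i + 1 if text[i:i + 1] in ("a", "c") else None

    def p_b(i: int) -> Optional[int]:
        # B <- 'a' / 'b'
        return i + 1 if text[i:i + 1] in ("a", "b") else None

    def p_s(i: int) -> Optional[int]:
        # First alternative: A S d, starting at position i.
        j = p_a(i)
        if j is not None:
            j = p_s(j)
            if j is not None and text[j:j + 1] == "d":
                return j + 1
        # Second alternative: B S, restarting from the same position i.
        j = p_b(i)
        if j is not None:
            j = p_s(j)
            if j is not None:
                return j
        # Third alternative: epsilon, which consumes nothing.
        return i

    # Success means S matched and the whole input was consumed.
    return p_s(0) == len(text)
```

On the input "a", for example, the first alternative matches A but then fails to find d, so the function falls back to B S from position 0 and succeeds. On "aad" the inner call p_s(1) is evaluated twice, once under each alternative; this repeated work is what memoization in a packrat parser avoids.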
Two stacks are maintained. The parse stack stores the label for the next nonterminal in the current choice, and the backtrack stack stores the starting label of the next choice if the nonterminal has more than one production. For example, when a node corresponding to the nonterminal S in S -> ASd | BS is evaluated, the starting label for evaluating the choice BS is stored on the backtrack stack, to be used for backtracking if the choice ASd fails, and control is transferred to the starting label of the choice ASd. There, A is evaluated, but before that the label for evaluating the next node, corresponding to S in ASd, is pushed onto the parse stack so that execution can return to it after A is evaluated.

In detail, an elementary descriptor is a triple (L, s, j) where L is a line label, s is a parse stack and j is a position in the input array I. We maintain a placeholder R for the current descriptor. At the end of a parse function and at points of non-determinism in the grammar we create a new descriptor using the label at the top of the current parse stack. When a particular execution of the algorithm stops, at input I[i] say, the top element L is popped from the stack s = [s', L] and (L, s', i) is added to R (if it has not already been added). We use POP(s, i, R) to denote this action. Then the next descriptor (L', t, j) is removed from R and execution starts at line L' with call stack t and input symbol I[j]. The overall execution terminates when R is empty. To allow backtracking, we record both the line label L and the current input buffer index k on the stack using the notation L^k. At this interim stage we treat the stack as a bracketed list, where [ ] denotes the empty stack, and we assume a function PUSH(s, L^k) which simply updates the stack s by pushing the element L^k onto it. The details of the algorithm are as follows:

i := 0; R := ∅; s := [L0^0]
LS:  add (LS1, s, i) to R
L0:  if (R ≠ ∅) {
         remove (L, s1, j) from R
         if (L = L0 and s1 = [ ] and j = |I| and b = [ ]) report success
         else { s := s1; i := j; goto L }
     }
     else report failure
LS1: PUSH(s, L1^i); PUSH(b, LS2^i); goto LA
L1:  PUSH(s, L2^i); goto LS
L2:  if (I[i] = d) { i := i + 1; POP(s, i, R) } else POP(b, i, R); goto L0
LS2: PUSH(s, L3^i); PUSH(b, LS3^i); goto LB
L3:  PUSH(s, L4^i); goto LS
L4:  POP(s, i, R); goto L0
LS3: POP(s, i, R); goto L0
LA:  if (I[i] = a) { i := i + 1; POP(s, i, R); goto L0 }
     else { if (I[i] = c) { i := i + 1; POP(s, i, R); goto L0 } else POP(b, i, R); goto L0 }
LB:  if (I[i] = a) { i := i + 1; POP(s, i, R); goto L0 }
     else { if (I[i] = b) { i := i + 1; POP(s, i, R); goto L0 } else POP(b, i, R); goto L0 }

In the figures, P denotes the parse stack and B the backtrack stack.

Fig. 2: When the input pointer is at pos=0 and label LS1 is evaluated (P holds L1^0; B holds LS2^0).
Fig. 3: When input a is matched and label L1 is evaluated (P holds L2^1; B holds LS2^0).
Fig. 4: When the input pointer is at pos=1 and label LS1 is evaluated (P holds L2^1 and L1^1; B holds LS2^0 and LS2^1).
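The idea of replacing procedure calls with an explicit parse stack and a backtrack stack can be sketched in Python. This is a simplified illustration rather than the authors' exact algorithm: it assumes the grammar S -> ASd | BS | ε with A -> a | c and B -> a | b, inlines A and B as single-character tests, stores the parse-stack height in each backtrack entry instead of maintaining the descriptor set R, and discards a pending alternative as soon as an earlier one succeeds (the PEG commit rule). All label names are hypothetical.

```python
def recognize_with_stacks(text: str) -> bool:
    """Ordered choice for S <- A S d / B S / epsilon using explicit stacks."""
    parse = ["DONE"]   # parse stack: labels to return to
    back = []          # backtrack stack: (alternative label, input pos, parse height)
    i = 0              # current input position
    label = "S"
    while True:
        if label == "S":
            # Try A S d first; remember where to resume if it fails.
            back.append(("S_ALT2", i, len(parse)))
            if text[i:i + 1] in ("a", "c"):          # inline A <- a / c
                i += 1
                parse.append("ALT1_AFTER_S")          # return point after nested S
                label = "S"
            else:
                label = "FAIL"
        elif label == "ALT1_AFTER_S":
            if text[i:i + 1] == "d":
                i += 1
                back.pop()                            # commit: drop pending B S
                label = "RET"
            else:
                label = "FAIL"
        elif label == "S_ALT2":
            # Try B S; remember the epsilon alternative.
            back.append(("S_ALT3", i, len(parse)))
            if text[i:i + 1] in ("a", "b"):          # inline B <- a / b
                i += 1
                parse.append("ALT2_AFTER_S")
                label = "S"
            else:
                label = "FAIL"
        elif label == "ALT2_AFTER_S":
            back.pop()                                # commit: drop pending epsilon
            label = "RET"
        elif label == "S_ALT3":
            label = "RET"                             # epsilon always succeeds
        elif label == "RET":
            label = parse.pop()                       # "return" to the caller's label
        elif label == "FAIL":
            if not back:
                return False
            # Resume at the next alternative with the saved input position
            # and the parse stack cut back to its saved height.
            label, i, height = back.pop()
            del parse[height:]
        else:  # label == "DONE": the top-level S has succeeded
            return i == len(text)
```

Because no host-language recursion is involved, deeply nested inputs such as a long run of a followed by b and an equally long run of d are handled without hitting a recursion limit, and the number of push and pop operations can be counted directly on the two lists.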
III. EXPERIMENTATION AND ANALYSIS

The algorithm is implemented in Java. The grammar from the section above, which is translated into explicit stack calls, is reproduced here:

S -> ASd | BS | ε
A -> a | c
B -> a | b

Though the grammar appears simple, it reveals many facts pertaining to parsing. The resulting program was run on a 1.8 GHz Core 2 Duo machine with 2 GB of RAM. For each parameter, the algorithm was run 50 times on a given input string and the average was taken. The parameters used to measure performance are:

1) Number of pushes
2) Number of pops
3) Time required to recognize the input string

The performance of the above algorithm is measured and compared with the GLL algorithm (the non-optimized version of GLL, i.e. without a graph-structured stack, is considered here). As can be seen from the following table, the push count, pop count, and parsing time differ between the two algorithms for a particular input. The explanation is given case by case:

Case I: For the input a^65 b d^65, the stack-based implementation of the packrat parsing algorithm requires less time than the GLL algorithm. This is because the GLL algorithm evaluates both choices, ASd and BS, for every a, whereas only S -> ASd is the correct choice for recognizing a^65 b d^65. In the proposed algorithm, only the choice ASd is evaluated for every a, which reduces the push and pop counts and thereby the time required to parse the input.

Case II: Here the GLL algorithm takes less time. This is because for every b the correct choice, BS, is picked, while in the proposed algorithm ASd is evaluated first for every b by the ordered choice property of packrat parsing. This attempt necessarily fails, after which BS is evaluated, resulting in more pushes and pops and therefore more time.
Case III: Here again GLL requires more time than the proposed algorithm. This case is similar to Case I.

Sr. No.  Input         Parameter           Stack-based packrat parsing   GLL parsing
1        a^65 b d^65   Push_Count          204                           6140
                       Pop_Count           205                           15358
                       Parsing Time (ns)   674655                        14449780
2        b^130         Push_Count          654                           260
                       Pop_Count           655                           261
                       Parsing Time (ns)   1231788                       1132462
3        a^10 d        Push_Count          6650                          4092
                       Pop_Count           6651                          6129
                       Parsing Time (ns)   6119207                       7073932

IV. CONCLUSION AND FUTURE WORK

In this paper, an implementation of the packrat parsing algorithm is proposed that replaces function calls with explicit stack push and pop operations. Its performance, in terms of the time required to recognize the input, is compared with the GLL parsing algorithm. The advantage of this algorithm is the opportunity for optimization (for example, modeling the stack as a graph-structured stack) to reduce the time required to recognize the input, which is not possible when the algorithm is implemented in the recursive descent style. It would be interesting to see the impact of the stack used in the proposed algorithm on the use of memoization in the packrat parsing algorithm. If this reduces the memory required for memoization, the packrat parsing algorithm will have linear parsing time without sacrificing storage.

REFERENCES

[1] Lucas, P. The structure of formula-translators. ALGOL Bulletin Supplement 16 (September 1961), 1–27.
[2] Ford, B. Packrat parsing: a practical linear-time algorithm with backtracking. Master's thesis, Massachusetts Institute of Technology, September 2002.
[3] Ford, B. Packrat parsing: Simple, powerful, lazy, linear time. In Proceedings of the 2002 International Conference on Functional Programming (October 2002).
[4] Ford, B. Parsing expression grammars: A recognition-based syntactic foundation. In Proceedings of the 31st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (Venice, Italy, 14–16 January 2004), N. D. Jones and X. Leroy, Eds., ACM, pp. 111–122.
[5] Grimm, R. Rats! – an easily extensible parser generator. http://www.cs.nyu.edu/rgrimm/xtc/rats.html.
[6] Grimm, R. Practical packrat parsing. Tech. Rep. TR2004-854, Dept. of Computer Science, New York University, March 2004.
[7] Redziejowski, R. R. Parsing Expression Grammar as a primitive recursive-descent parser with backtracking. Fundamenta Informaticae 79, 3–4 (2007), 513–524.
[8] Redziejowski, R. R. Some aspects of Parsing Expression Grammar. Fundamenta Informaticae 85, 1–4 (2008), 441–454.
[9] I. S. Organization. Syntactic metalanguage – Extended BNF, 1996. ISO/IEC 14977.
[10] B. Ford. Parsing expression grammars: A recognition-based syntactic foundation. In Symposium on Principles of Programming Languages, January 2004.
[11] T. J. Parr and R. W. Quong. Antlr: A predicated-LL(k) parser generator. Software Practice and Experience, 25:789–810, 1994.
[12] Elizabeth Scott and Adrian Johnstone. GLL Parsing. Electronic Notes in Theoretical Computer Science 253 (2010), 177–189.