Csr2011 june17 15_15_kaminski


Published on

Published in: Technology
  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Csr2011 june17 15_15_kaminski

  1. 1. LR(0) Conjunctive Grammars and Deterministic SynchronizedAlternating Pushdown AutomataTamar Aizikowitz and Michael KaminskiTechnion – Israel Institute of Technology<br />June 17<br />CSR 2011, St. Petersburg, Russia<br />
  2. 2. Context-Free Languages<br />Combine expressiveness with polynomial parsing, which is appealing for practical applications.<br />One of the most widely used language class in Computer Science.<br />At the theoretical basis of Programming Languages, Computational Linguistics, Formal Verification, Computational Biology, and more.<br />
  3. 3. Extended Models<br />Goal: Models of computation that generate a slightly stronger language class without sacrificing polynomial parsing.<br />Why? Such models seem to have great potential for practical applications.<br />In fact, several fields (e.g.,Computational Linguistics) have already voiced their need for a stronger language class.<br />
  4. 4. Conjunctive Grammars<br /><ul><li>Conjunctive Grammars (CG) [Okhotin, 2001] are an extension of context-free grammars. </li></ul>Add explicit intersection rules<br />S (A & B)⋯ (w & w) w<br />Semantics:L(A & B) = L(A) ⋂ L(B)<br />Since context-free languages are not closed under intersection, CG is a wider class of languages<br />Retain polynomial parsing and therefore can be useful for practical applications<br />
  5. 5. Synchronized Alternating PDA<br />Synchronized Alternating Pushdown Automata (SAPDA) [Aizikowitz & Kaminski, 2008] extend PDA.<br />Stack modeled as tree in which all braches must accept.<br />A limited form of synchronization createslocalized parallel computations.<br />Firstautomaton counterpart shown for Conjunctive Grammars.<br />
  6. 6. Outline<br />Model Definitions<br />Equivalence Results<br />Linear-time LR(0) Parsing<br />Summary<br />
  7. 7. Conjunctive Grammars<br />Synchronized Alternating Pushdown Automata<br />Model Definitions<br />
  8. 8. Conjunctive Grammars<br />A CG is a quadruple: G=(V , Σ , P , S )<br />Rules:A -> (α1&⋯&αn), where A∊V,αi∊(V⋃Σ)*<br />Examples: A-> (aAB & Bc & aD) ; A->abC<br />Derivation steps:<br />s1As2s1(α1&⋯&αn)s2,where A->(α1&⋯&αn)∊P<br />s1(w&⋯&w)s2s1 w s2, where w∊ Σ*<br />non-terminals terminals derivation rules start symbol<br />The case of all n=1 isan ordinary CFG<br />
  9. 9. Grammar Language<br />Language:L(G) = {w∊Σ*| S*w}, i.e., L(G) consists of all terminal words w derivable from the start symbol S.<br />Note: ( , ) , and & are not terminal symbols. Therefore, all conjunctions must be collapsed in order to derive a terminal word.<br />Semantics:(A & B) *w if and only if A *wand B *w. Therefore, L(A & B) = L(A) ⋂ L(B).<br />
  10. 10. Example: Multiple Agreement<br />Example: a CG for the multiple-agreement language {anbncn | n∊ℕ}:<br />C->Cc | D ; D->aDb | ε:L(C) = {anbncm | n,m∊ℕ}<br />A->aA | E ; E->bEc | ε:L(A) = {ambncn | n,m∊ℕ}<br />S-> (C & A) :L(S ) = L(C) ⋂ L(A)<br />S (C&A)  (Cc&A)  (Dc&A)  (aDbc&A) <br /> (abc&A) ⋯ (abc&abc) abc<br />
  11. 11. Synchronized Alternating Pushdown Automata (SAPDA)<br />An extension of classical PDA.<br />Transitions are made to conjunctions of ( state , stack-word ) pairs, e.g.,<br />δ(q , σ, X) = { (p1 , XX)∧(p2 ,Y) , (p3 , Z) }<br />Note:if all conjunctions are of one pair only, the automaton is an “ordinary” PDA.<br />non-deterministic model, i.e., many possible transitions<br />
  12. 12. SAPDA Stack Tree <br />The stack of an SAPDA is a tree.A transition to n pairs splits the current branch into n branches.<br />Branches are processed independently.<br />Empty sibling branches can be collapsed if they are synchronized, i.e., are in the same state and have read the same portion of the input.<br />p<br />D<br />q<br />q<br />q<br />q<br />A<br />C<br />ε<br />ε<br />A<br />q<br />B<br />(q,A)(p,DC)δ(q,σ,A)<br />B<br />B<br />collapse<br />B<br />
  13. 13. SAPDA Definition<br />An SAPDA is a sextuple <br />A = ( Q , Σ , Γ , q0 , δ , ⊥ )<br />Transition function:<br />δ(q,σ,X) ⊆ {(q1,α1)∧⋯∧(qn,αn) | qi∊Q, αi∊Γ*, n∊ℕ}<br />states terminals<br />initial <br />state<br />transition <br />function<br />initial stack <br />symbol<br />stack <br />symbols<br />
  14. 14. SAPDA Computation and Language<br />Computation: <br />Each step, a transition is applied to one stack-branch<br />If a stack-branch is empty, it cannot be selected<br />Synchronous empty sibling branches are collapsed<br />Initial Configuration: <br />Accepting configuration:<br />Language:L(A)={w∊Σ*| ∃q∊Q, (q0,w,⊥)⊢*(q,ε,ε)} <br />Note: all branches must empty, i.e., must “agree”.<br />have the same state and remaining input<br />q0<br />⊥<br />q<br />ε<br />
  15. 15. SAPDA and Conjunctive Grammars<br />Equivalence Results<br />
  16. 16. Equivalence of SAPDA and CG<br />Theorem 1.A language is generated by an CG if and only if it is accepted by an SAPDA.<br />The equivalence is analogous to the classical equivalence between CFG and PDA.<br />The proofs of the equivalence are extended versions of the classical proofs.<br />Corollary:Single-state SAPDA and multi-state SAPDA are equivalent (like ordinary PDA).<br />
  17. 17. Deterministic SAPDA and LR(0) CG<br />Linear-time Implementation of DSAPDA and LR(0) Parsing<br />Linear-time LR(0) Parsing<br />
  18. 18. Deterministic Context-free Languages<br />DCFL are a sub-family of context-free languages, accepted by Deterministic PDA (DPDA).<br />LR(k) languages were introduced by Knuth, and shown to be equivalent to DPDA, [Knuth, 1965].<br />Knuth developed a linear-time parsing algorithm for DCFL  the basis of Compilation Theory.<br />Knuth also proved that LR(0) grammars are equivalent to DPDA which accept by empty stack.<br />
  19. 19. Deterministic SAPDA<br />Deterministic SAPDA are defined according to the classical notion of a deterministic model.<br />That is, an SAPDA is deterministic if it has only one possible step from any given configuration.<br />Remark:A deterministic SAPDA has exactly one computation on any given input word w.<br />
  20. 20. LR(0) Conjunctive Grammars<br />To define LR(0) conjunctive grammars, we extend the notions of viable prefixes, valid items ([A->]), etc…<br />A conjunctive grammar is LR(0) if a conflict-free item automaton can be constructed for it.<br />Main issue: How to build the item automaton so that it supports conjunctive rules...<br />Solution: Read the paper <br />
  21. 21. Equivalence Results<br />Theorem 2.A language is generated by an LR(0) CG if and only if it is accepted by a DSAPDA.<br />The equivalence is analogous to the classical equivalence between LR(0) CFG and DPDA.<br />The proofs of the equivalence are extended versions of the classical proofs.<br />
  22. 22. The DSAPDA Membership Problem<br />By using a naive implementation of a DSAPDA, we have an exponential time solution for the membership problem. <br />Each branch is linear, but there can be an exponential number of branches.<br />
  23. 23. Do we need so many branches?<br />
  24. 24. DSAPDA Efficient Implementation<br />There are at most M = |Q|  |Γ| different combinations for branch head configurations.<br />Thus, at any given time, we need at most M heads.<br />
  25. 25. Linear-time Parsing<br />Based on the efficient implementation of a DSAPDA, we can build an LR(0) parser.<br />The parser runs in linear time for the boolean closure of LR(0) Conjunctive Languages<br />This class strictly Includes the boolean closure of classical LR(0) languages.<br />{ ai1 bai2 b2 ⋯ ainbn$ bai1 bai2 ⋯ bain$ | n ≥ 1  ij≥ 1 }<br />That is, we have linear-time parsing for a wider class of languages. <br />
  26. 26. Summary<br />Conjunctive Languages are interesting because:<br />They are a strong, rich class of languages.<br />They are polynomially parsable.<br />Their models of computation are intuitive and easy to understand; highly resemble classical CFG and PDA.<br />SAPDA are the first automaton model presented for Conjunctive Languages.<br />They are a natural extension of PDA.<br />They lend new intuition on Conjunctive Languages.<br />They are the basis of a linear parsing algorithm for a wide class of languages.<br />
  27. 27. Thank you.<br />
  28. 28. Thesis Overview<br />Conjunctive grammars and alternating pushdown automata. T. Aizikowitz and M. Kaminski.WoLLIC 2008, LNAI vol. 5110.<br />Linear conjunctive grammars and one-turn synchronized alternating pushdown automata. T. Aizikowitz and M. Kaminski.FG 2009, LNAI vol. 5591.<br />LR(0) conjunctive grammars and deterministic synchronized alternating pushdown automata. T. Aizikowitz and M. Kaminski.submitted.<br />
  29. 29. References<br />Aizikowitz, T., Kaminski, M.: Conjunctive grammars and alternating pushdown automata. WoLLIC’09. LNAI 5110 (2008) 30 – 41.<br />Aizikowitz, T., Kaminski, M.: Linear conjunctive grammars and one-turn synchronized alternating pushdown automata.Formal Grammars: Bordeaux 2009. LNAI 5591. To appear.<br />Ginsburg, S., Spanier, E.H.: Finite-turn pushdown automata.SIAM Journal on Control. 4(3) (1966) 429 – 453<br />Knuth, D.E.: On the translation of languages from left to right. Information and Control. 8 (1965) 114 – 133.<br />Okhotin, A.: Conjunctive grammars. Journal of Automata, Languages and Combinatorics. 6(4) (2001) 519 – 535<br />Okhotin, A.: LR parsing for conjunctive grammars. Grammars. 5(2) (2002) 21 – 40<br />Okhotin, A.: Conjunctive languages are closed under inverse homomorphism. Technical Report 2003-468. School of Computing, Queens Univ., Kingston, Ontario, Canada.<br />Okhotin, A.: On the equivalence of linear conjunctive grammars and trellis automata.RAIRO Theoretical Informatics and Applications. 38(1) (2004) 69 – 88<br />Vijay-Shanker, K., Weir, D.J.: The equivalence of four extensions of context-free grammars. Mathematical Systems Theory. 27(6) (1994) 511 – 546.<br />
  30. 30. Reduplication with a Center Marker<br />The reduplication with a center markerlanguage (RCM) , {w$w | w∊Σ*}, describes structures in various fields, e.g.,<br />Copying phenomena in natural languages: “deal or no deal”, “boys will be boys”, “is she beautiful or is she beautiful?”<br />Biology:microRNA patterns in DNA, tandem repeats<br />We will construct an SAPDA for RCM. <br />Note: it is not known whether reduplicationwithout a center marker can be derived by a CG.<br />
  31. 31. Example: SAPDA for RCM<br />We consider an SAPDA for {w$uw | w,u∊Σ*}, which can easily be modified to accept RCM.<br />The SAPDA is especially interesting, as it utilizes recursive conjunctive transitions.<br />Construction Idea:if σ in the nth letter before the $, check that the nth letter from the end is also σ.<br />n - 1<br />n - 1<br />σ<br />σ<br />$<br />w<br />u<br />w<br />
  32. 32. Construction of SAPDA for RCM<br />A=(Q,{a,b,$},{⊥,#},q0,δ,⊥)<br />Q = {q0,qe}⋃{qσi| σ∊{a,b} , i∊{1,2}}<br />δ(q0,σ,⊥) = {(qσ1,⊥) ∧(q0,⊥)}<br />δ(qσ1,τ,X) = {(qσ1,#X)}<br />δ(q0,$,⊥) = {(qe,ε)}<br />δ(qσ1,$,X) = {(qσ2,X)}<br />δ(qσ2,τσ,X) = {(qσ2,X)}<br />δ(qσ2,σ,X) = {(qe,X),(qσ2,X)}<br />δ(qe,τ,#) = {(qe,ε)}<br />δ(qe,ε,⊥) = {(qe,ε)}<br />recursively open branch to verify σ is nth from the $ and from the end<br />“count” symbols between σ and $<br />$ reached, stop recursion<br />$ reached, stop “counting” symbols<br />look for σ<br />“guess” that this is the σwe are looking for, or keep looking<br />“count” symbols from σ to the end<br />all done: empty stack <br />
  33. 33. Computation of SAPDA for RCM<br />qb1<br />qb2<br />qe<br />qe<br />qb1<br />qb2<br />qe<br />word accepted!<br />qa1<br />qa2<br />qe<br />qe<br />q0<br />⊥<br />ε<br />⊥<br />ε<br />#<br />qe<br />#<br />q0<br />ε<br />⊥<br />⊥<br />ε<br />qa1<br />qe<br />#<br />q0<br />qa1<br />⊥<br />⊥<br />ε<br />ε<br />abb$babb<br />q0<br />⊥<br />ε<br />
  34. 34. What CG can do for Programming Language Specification and Parsing<br />Usage Example<br />
  35. 35. A Toy Programming Language<br />A program in PrintVars has three parts:<br />Definition of variables<br />Assignment of values<br />Printing of variable values to the screen<br />Example:<br />VARS a , b , c<br />VALS b = 2 , a = 1 , c = 3<br />PRNT b a a c b<br />Output:2 1 1 3 2<br />
  36. 36. PrintVars Specification<br />A PrintVars program is well-formed if:<br />(1) It has the correct structure<br />(2) All used variables are defined<br />(3) All defined variables are used<br />(4) All defined variables are assigned a value<br />(5) All variables assigned a value are defined<br />Item (1) is easily defined by a CF Grammar.<br />However, items (2) – (5) amount to a language reducible to RCM, which is not CF.<br />
  37. 37. S-> ( structure & defined_used & used_defined & defined_assigned & assigned_defined )<br />structure->varsvalsprnt<br />vars->VARS… ; vals->VALS… ; prnt->PRNT…<br />defined_used->VARS check_du<br />check_du-> ( a X vals X a X & acheck_du) | ( b X vals X b X & bcheck_du) | ⋯ | vals X )<br />X->a X | ⋯ | z X | 0 X | ⋯ | 9 X | = X | ε<br />A (partial) CG for PrintVars<br />Semantic Checks via Syntactic Parsing<br />
  38. 38. Linear CG and One-turn SAPDA<br />Linear Conjunctive Grammars (LCG) [Okhotin, 2001] are an interesting sub-class of CG.<br />More efficient parsing [Okhotin, 2003]<br />Equivalent to Trellis Automata [Okhotin, 2004] <br />It is a well known classical result [Ginsburg et.al, 1966] that Linear Grammars are equivalent to one-turn Pushdown Automata.<br />We define a sub-class of SAPDA, one-turn SAPDA, analogously to one-turn PDA.<br />
  39. 39. One-turn SAPDA ~ LCG<br />Theorem 2.A language is generated by an LCG if and only if it is accepted by a one-turn SAPDA.<br />This result mirrors the classical equivalence between Linear Grammars and one-turn PDA, strengthening the claim of SAPDA as a natural automaton counterpart for CG.<br />Corollary:One-turn SAPDA are equivalent to Trellis automata.<br />
  40. 40. Generative Power<br />Closure Properties<br />Decidability Problems<br />Conjunctive Languages<br />
  41. 41. Generative Power<br />CG<br />⋂CFG<br />LCG<br />⋂ LG<br />
  42. 42. Closure Properties<br />Union, concatenation, intersection, Kleene star<br />Proven quite easily using grammars<br />Homomorphism<br />Inverse homomorphism<br />We’ll touch on the proof of this in the next slide…<br />Linear CL are closed under complement.<br />It is an open question whether general CG are closed under complement.<br /><br /><br /><br /><br />?<br />
  43. 43. Inverse Homomorphism<br /><ul><li>92% reduction in proof length </li></ul>For the grammar based proof, see [Okhotin, 2003].<br />
  44. 44. Decidability Problems<br />Linear CG Membership: <br /> O(n2) time and O(n) space.<br />General CG Membership:<br /> O(n3) time and O(n2) space.<br />Emptiness, finiteness, equivalence, inclusion, regularity<br /><br /><br /><br />
  45. 45. “only if” Proof Sketch<br />Given an CG, we construct an single-state SAPDA using an extension of the classical construction.<br />A simulation of the derivation is run in the stack:<br />If the top stack symbol is a non-terminal, it is replaced with the r.h.s. of one of its rules.<br />If the top stack symbol is a terminal, it is emptied while reading the same terminal symbol from the input.<br />A correlation is achieved between the stack contents and the grammar sentential forms.<br />
  46. 46. “only if” Proof Simulation<br />S *wAα<br />w uxv z<br />⋯<br /><br />w(uBβ&⋯)α<br /><br />x<br />⋯uxCγβ⋯<br />switch variable with r.h.s of rule<br />empty terminals<br />u<br />C<br />*<br />B<br />γ<br />⋯ux(v& ⋯&v)γβ⋯<br />β<br />⋯<br />*<br />A<br />⋯uxv γβ⋯<br />α<br />
  47. 47. “if” Proof Sketch<br />Given an SAPDA we construct an CG.<br />The proof is an extension of the classical one. <br />However, due to the added complexity of the extended models, it is more involved.<br />Therefore, we won’t get into it now…<br />
  48. 48. Mildly Context Sensitive Languages<br />Computational Linguistics pursues a computational model which exactly describesnatural languages.<br />Originally, context-free models were considered. <br />However, non-CF natural language structures led to interest in a slightly extended class of languages – Mildly Context-sensitive Languages (MSCL).<br />Several formalisms (e.g., Tree Adjoining Grammars) are known to converge to MCSL. [Vijay-Shanker, 1994]<br />
  49. 49. Conjunctive Languages and MCSL<br />We explore the correlation between Conjunctive Languages and MCSL. <br />MCSL are loosely categorized as follows:<br />(1) They contain the context-free languages<br />(2) They contain multiple-agreement, cross-agreement and reduplication <br />(3) They are polynomially parsable <br />(4) They are semi-linear  a CG exists for the language {ba2ba4 ⋯ ba2nb | n∊ℕ} <br /> Not an exact characterization of natural languages, but still with applicative potential.<br /><br /><br /><br /><br />
  50. 50. LR Parsing for Conjunctive Grammars<br />Okhotin presented an extension of Tomita’s Generalized LR parsing algorithm for Conjunctive Grammars, [Okhotin, 2002].<br />The algorithm runs in polynomial time for all conjunctive grammars.<br />When applied to the boolean closure of deterministic context free languages, it runs in linear time.<br />
  51. 51. One-turn SAPDA<br />We introduce a sub-family of SAPDA, one-turn SAPDA, analogously to one-turn PDA.<br />An SAPDA is one-turn if all stack-branches make exactly one turn in all accepting computations.<br />Note:the requirement of a turn is not limiting as we are considering acceptance by empty stack.<br />
  52. 52. Single-state SAPDA<br />The “if” proof translates a general SAPDA into a Conjunctive Grammar.<br />The “only if” proof translates a Conjunctive Grammar into a single-stateSAPDA.<br />Corollary:Single-state SAPDA and multi-state SAPDA are equivalent.<br />This characterizes classical PDA as well.<br />
  53. 53. Background<br />It is a well known result [Ginsburg et.al, 1966] that Linear Grammars are equivalent to one-turn PDA.<br />A turn is a computation step where the stack height changes from increasing to decreasing.<br />A one-turn PDA is a PDA s.t. all accepting computations, have only one turn.<br />
  54. 54. One-turn SAPDA<br />An SAPDA is one-turn if all stack-branches make exactly one turn in all accepting computations.<br />Note: if the automaton is a PDA, then there is only one branch with no second phase and therefore the automaton is a classical one-turn PDA.<br />phase 2<br />phase 1<br />phase 3<br />
  55. 55. Summary<br />Future Directions<br />Acknowledgments<br />Concluding Remarks<br />
  56. 56. Future Directions<br />Broadening the theory of SAPDA<br />Finite-turn Automata<br />LR(k) Grammars<br />Considering possible applications<br />Formal verification<br />Programming Languages / Compilation<br />…<br />
  57. 57. Acknowledgments<br />Prof. Michael Kaminski<br />Prof. Nissim Francez<br />Prof. Shmuel Zaks<br />EitanYaakobi, Yaniv Carmeli,<br />Sonya Liberman, Keren Censor<br />My Parents,<br />Nava and Jacob<br />Sivan Bercovici<br />