# MELJUN CORTES Automata Theory (Automata9)

Published in: Technology, Business

0 Likes
Statistics
Notes
• Full Name
Comment goes here.

Are you sure you want to Yes No
Your message goes here
• Be the first to comment

• Be the first to like this

Views
Total Views
51
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
3
0
Likes
0
Embeds 0
No embeds

No notes for slide

### Transcript

• 1. MELJUN P. CORTES, MBA, MPA, BSCS, ACS. Fall 2008. CSC 3130: Automata theory and formal languages. Normal forms and parsing.
• 2. Testing membership and parsing
  • Given a grammar S → 0S1 | 1S0S1 | T; T → S | ε
  • How can we know if a string x is in its language?
  • If so, can we reconstruct a parse tree for x?
• 3. First attempt
  • Grammar: S → 0S1 | 1S0S1 | T; T → S | ε. Input: x = 00111
  • Maybe we can try all possible derivations:
    – S ⇒ 0S1, 1S0S1, T
    – 0S1 ⇒ 00S11, 01S0S11, 0T1
    – 1S0S1 ⇒ 10S10S1, ...
    – T ⇒ S, ε
  • When do we stop?
• 4. Problems
  • Grammar: S → 0S1 | 1S0S1 | T; T → S | ε. Input: x = 00111
  • How do we know when to stop?
  • The derivation tree keeps growing:
    – S ⇒ 0S1, 1S0S1, ...
    – 0S1 ⇒ 00S11, 01S0S11, 0T1
    – 1S0S1 ⇒ 10S10S1, ...
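• 5. Problems
  • Grammar: S → 0S1 | 1S0S1 | T; T → S | ε. Input: x = 01011
  • Idea: Stop derivation when length exceeds |x|
  • Not right because of ε-productions: the derivation S ⇒ 0S1 ⇒ 01S0S11 ⇒ 01S011 ⇒ 01011 passes through strings of length 1, 3, 7, 6, 5, so it exceeds |x| = 5 before shrinking back
  • We might want to eliminate ε-productions too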
• 6. Problems S → 0S1 | 1S0S1 | T T→S|ε x = 00111 • Loops among the variables (S → T → S) might make us go forever • We might want to eliminate such loops
• 7. Unit productions
  • A unit production is a production of the form A1 → A2 where A1 and A2 are both variables
  • Example grammar: S → 0S1 | 1S0S1 | T; T → S | R | ε; R → 0SR
  • Unit productions: S → T, T → S, T → R
• 8. Removal of unit productions
  • If there is a cycle of unit productions A1 → A2 → ... → Ak → A1, delete it and replace A2, ..., Ak with A1 everywhere
  • Example: S → 0S1 | 1S0S1 | T; T → S | R | ε; R → 0SR becomes S → 0S1 | 1S0S1; S → R | ε; R → 0SR
  • T is replaced by S in the {S, T} cycle
• 9. Removal of unit productions
  • For other unit productions, replace every chain A1 → A2 → ... → Ak → α by productions A1 → α, ..., Ak → α
  • Example: S → 0S1 | 1S0S1 | R | ε; R → 0SR becomes S → 0S1 | 1S0S1 | 0SR | ε; R → 0SR
  • The chain S → R → 0SR is replaced by S → 0SR, R → 0SR
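A minimal Python sketch of unit-production removal on the example grammar from slides 7 to 9. It is not the slides' exact procedure: instead of merging cycles first, it uses the common "unit closure" variant, where each variable inherits the non-unit productions of every variable it can reach through unit productions. The function name `remove_unit_productions` and the dict-of-sets grammar encoding (right-hand sides as tuples, ε as the empty tuple) are assumptions made for illustration.

```python
def remove_unit_productions(grammar):
    """grammar maps each variable to a set of right-hand sides (tuples of symbols)."""
    variables = set(grammar)

    def is_unit(rhs):
        # A unit production has a single variable on its right-hand side.
        return len(rhs) == 1 and rhs[0] in variables

    result = {}
    for start in grammar:
        # Collect every variable reachable from `start` through unit productions only.
        reachable, stack = {start}, [start]
        while stack:
            current = stack.pop()
            for rhs in grammar[current]:
                if is_unit(rhs) and rhs[0] not in reachable:
                    reachable.add(rhs[0])
                    stack.append(rhs[0])
        # `start` inherits every non-unit production of the variables it reaches.
        result[start] = {
            rhs for v in reachable for rhs in grammar[v] if not is_unit(rhs)
        }
    return result


# Example grammar from slide 7 (the empty tuple stands for ε).
grammar = {
    "S": {("0", "S", "1"), ("1", "S", "0", "S", "1"), ("T",)},
    "T": {("S",), ("R",), ()},
    "R": {("0", "S", "R")},
}
cleaned = remove_unit_productions(grammar)
```

In this sketch T is kept as a variable with the same productions as S rather than being merged into S, so the productions of S and R match the slide's result while T becomes redundant.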
• 10. Removal of ε-productions
  • A variable N is nullable if there is a derivation N ⇒* ε
  • How to remove ε-productions (except from S):
    – Find all nullable variables N1, ..., Nk
    – For i = 1 to k: for every production of the form A → αNiβ, add another production A → αβ; if Ni → ε is a production, remove it
    – If S is nullable, add the special production S → ε
• 11. Example
  • Find the nullable variables (the first step: find all nullable variables N1, ..., Nk)
  • Grammar: S → ACD; A → a; B → ε; C → ED | ε; D → BC | b; E → b
  • Nullable variables: B, C, D
• 12. Finding nullable variables
  • To find nullable variables, we work backwards
    – First, mark all variables A such that A → ε as nullable
    – Then, as long as there are productions of the form A → A1…Ak where all of A1, …, Ak are marked as nullable, mark A as nullable
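The marking procedure above can be sketched as a fixed-point loop in Python. The function name `find_nullable` and the grammar encoding (a dict from variable to a set of tuples, with the empty tuple standing for ε) are assumptions for illustration; the grammar is the one from slide 11.

```python
def find_nullable(grammar):
    """Return the set of variables N with N =>* ε."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for variable, bodies in grammar.items():
            if variable in nullable:
                continue
            # all(...) over an empty body is True, which covers A -> ε directly.
            # Terminals are never in `nullable`, so bodies containing them fail.
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(variable)
                changed = True
    return nullable


# Grammar from slide 11 (the empty tuple stands for ε).
grammar = {
    "S": {("A", "C", "D")},
    "A": {("a",)},
    "B": {()},
    "C": {("E", "D"), ()},
    "D": {("B", "C"), ("b",)},
    "E": {("b",)},
}
nullable = find_nullable(grammar)
```

The loop stops once a full pass marks nothing new, which takes at most one pass per variable.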
• 13. Eliminating ε-productions
  • Grammar: S → ACD; A → a; B → ε; C → ED | ε; D → BC | b; E → b
  • Nullable variables: B, C, D
  • For i = 1 to k: for every production of the form A → αNiβ, add another production A → αβ; if Ni → ε is a production, remove it
  • Productions added along the way: D → C, S → AD, D → B, D → ε, S → AC, S → A, C → E
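A hedged Python sketch of ε-elimination on this example. Rather than looping over the nullable variables one at a time as the slide does, it generates, for each production, every variant obtained by keeping or dropping each nullable symbol, and never emits an empty body; under that assumption the resulting productions should coincide with the slide's. The name `eliminate_epsilon`, its parameters, and the grammar encoding are illustrative choices, and the nullable set is taken from the slide.

```python
from itertools import product


def eliminate_epsilon(grammar, nullable, start="S"):
    """Return a grammar without ε-productions, except possibly start -> ε."""
    result = {variable: set() for variable in grammar}
    for variable, bodies in grammar.items():
        for body in bodies:
            # A nullable symbol may be kept or dropped; other symbols are always kept.
            choices = [((sym,), ()) if sym in nullable else ((sym,),) for sym in body]
            for pick in product(*choices):
                new_body = tuple(sym for part in pick for sym in part)
                if new_body:  # skip empty bodies so no ε-production is created
                    result[variable].add(new_body)
    if start in nullable:
        result[start].add(())  # the special production S -> ε
    return result


# Grammar and nullable set from the slide (the empty tuple stands for ε).
grammar = {
    "S": {("A", "C", "D")},
    "A": {("a",)},
    "B": {()},
    "C": {("E", "D"), ()},
    "D": {("B", "C"), ("b",)},
    "E": {("b",)},
}
without_eps = eliminate_epsilon(grammar, nullable={"B", "C", "D"})
```

Here B ends up with no productions at all, since its only production was B → ε.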
• 14. Recap
  • After eliminating ε-productions and unit productions, we know that every derivation S ⇒* a1…ak, where a1, …, ak are terminals, doesn’t shrink in length and doesn’t go into cycles
  • Exception: S → ε
    – We will not use this rule at all, except to check if ε ∈ L
  • Note: ε-productions must be eliminated before unit productions
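• 15. Example: testing membership
  • Grammar: S → 0S1 | 1S0S1 | T; T → S | ε
  • After eliminating unit and ε-productions: S → ε | 01 | 101 | 0S1 | 10S1 | 1S01 | 1S0S1
  • Input: x = 00111
  • Derivations from S:
    – 01, 101
    – 0S1 ⇒ 0011, 01011, 00S11 (only strings of length ≥ 6), other strings of length ≥ 6
    – 10S1 ⇒ 10011, strings of length ≥ 6
    – 1S01 ⇒ 10101, strings of length ≥ 6
    – 1S0S1 ⇒ only strings of length ≥ 6
  • No branch produces 00111, and every remaining branch is longer than |x| = 5, so x is not in the language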
• 16. Algorithm 1 for testing membership
  • We can now use the following algorithm to check if a string x is in the language of G
    – Eliminate all ε-productions and unit productions
    – If x = ε and S → ε, accept; else delete S → ε
    – Let X := S
    – While some new production P can be applied to X: apply P to X; if X = x, accept; if |X| > |x|, backtrack
    – If no more productions can be applied to X, reject
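One possible reading of Algorithm 1 as a Python sketch: an exhaustive search over sentential forms that discards any form longer than x. Two simplifications are assumptions of this sketch rather than part of the slide: it expands only the leftmost variable (every derivable string has a leftmost derivation), and it keeps a set of forms already seen instead of backtracking explicitly. The name `derives` and the `accepts_empty` flag (standing for the rule S → ε) are hypothetical.

```python
def derives(grammar, start, x, accepts_empty=False):
    """Brute-force membership test.

    Assumes the grammar has no unit productions and no ε-productions;
    the special rule S -> ε is represented by `accepts_empty`.
    """
    if x == "":
        return accepts_empty
    target = tuple(x)
    seen, stack = {(start,)}, [(start,)]
    while stack:
        form = stack.pop()
        if form == target:
            return True
        # Index of the leftmost variable in the current sentential form, if any.
        pos = next((i for i, sym in enumerate(form) if sym in grammar), None)
        if pos is None:
            continue  # only terminals left, but the string is not x
        for body in grammar[form[pos]]:
            new_form = form[:pos] + body + form[pos + 1:]
            # Forms never shrink, so anything longer than x can be discarded.
            if len(new_form) <= len(target) and new_form not in seen:
                seen.add(new_form)
                stack.append(new_form)
    return False


# Grammar from slide 15 after eliminating unit and ε-productions.
grammar = {
    "S": {
        ("0", "1"), ("1", "0", "1"), ("0", "S", "1"),
        ("1", "0", "S", "1"), ("1", "S", "0", "1"), ("1", "S", "0", "S", "1"),
    }
}
in_language = derives(grammar, "S", "01011", accepts_empty=True)
not_in_language = derives(grammar, "S", "00111", accepts_empty=True)
```

The search terminates because only finitely many sentential forms have length at most |x|, but that number can still grow exponentially with |x|.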
• 17. Practical limitations of Algorithm 1
  • The previous algorithm can be very slow if x is long
    – G = CFG of the Java programming language
    – x = code for a 200-line Java program
    – The algorithm might take about 10^200 steps!
  • There is a faster algorithm, but it requires that we do some more transformations on the grammar
• 18. Chomsky Normal Form
  • A grammar is in Chomsky Normal Form if every production (except possibly S → ε) is of the type A → BC or A → a
  • Conversion to Chomsky Normal Form is easy:
    – Start with a production such as A → BcDE
    – Replace terminals with new variables: A → BCDE; C → c
    – Break up sequences with new variables: A → BX1; X1 → CX2; X2 → DE; C → c
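The two conversion steps can be sketched in Python as follows, assuming ε- and unit productions have already been removed. The function name `to_cnf` and the fresh-variable naming scheme (`T_c` for the variable that derives terminal c, `X1`, `X2`, ... for the break-up variables) are assumptions of this sketch; it also assumes those names do not clash with existing variables and that every variable appears as a key of the grammar dict.

```python
def to_cnf(grammar):
    """Convert productions to the forms A -> BC and A -> a."""
    result = {variable: set() for variable in grammar}
    counter = 0
    for variable, bodies in grammar.items():
        for body in bodies:
            if len(body) <= 1:
                result[variable].add(body)  # A -> a (or S -> ε) is already fine
                continue
            # Step 1: replace each terminal by a new variable that derives it.
            symbols = []
            for sym in body:
                if sym in grammar:
                    symbols.append(sym)
                else:
                    name = "T_" + sym
                    result.setdefault(name, set()).add((sym,))
                    symbols.append(name)
            # Step 2: break up bodies longer than two with fresh variables.
            head = variable
            while len(symbols) > 2:
                counter += 1
                fresh = "X" + str(counter)
                result.setdefault(head, set()).add((symbols[0], fresh))
                head, symbols = fresh, symbols[1:]
            result.setdefault(head, set()).add(tuple(symbols))
    return result


# The production A -> BcDE from the slide. B, D and E are listed with no
# productions only so that the sketch recognises them as variables.
grammar = {"A": {("B", "c", "D", "E")}, "B": set(), "D": set(), "E": set()}
cnf = to_cnf(grammar)
```

With this input the result corresponds to the slide's A → BX1; X1 → CX2; X2 → DE; C → c, with `T_c` playing the role of C.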
• 19. Exercise
  • Convert this CFG into Chomsky Normal Form: S → ε | ADDA; A → a; C → c; D → bCb
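• 20. Algorithm 2 for testing membership
  • Grammar: S → AB | BC; A → BA | a; B → CC | b; C → AB | a. Input: x = baaba
  • Table of variables deriving each substring (one row per substring length, entries ordered by start position):
    – length 5: SAC
    – length 4: –, SAC
    – length 3: –, B, B
    – length 2: SA, B, SC, SA
    – length 1: B, AC, AC, B, AC
    – input: b, a, a, b, a
  • Idea: We generate each substring of x bottom up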
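• 21. Parse tree reconstruction
  • Grammar: S → AB | BC; A → BA | a; B → CC | b; C → AB | a. Input: x = baaba
  • Table of variables deriving each substring (one row per substring length, entries ordered by start position):
    – length 5: SAC
    – length 4: –, SAC
    – length 3: –, B, B
    – length 2: SA, B, SC, SA
    – length 1: B, AC, AC, B, AC
    – input: b, a, a, b, a
  • Tracing back the derivations, we obtain the parse tree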
• 22. Cocke-Younger-Kasami algorithm
  • Input: Grammar G in CNF, string x = x1…xk
  • For i = 1 to k: if there is a production A → xi, put A in table cell ii
  • For b = 2 to k: for s = 1 to k – b + 1:
    – Set t = s + b – 1
    – For j = s to t – 1: if there is a production A → BC where B is in cell sj and C is in cell (j+1)t, put A in cell st
  • Table layout: cells 11, 22, …, kk sit directly above x1, x2, …, xk, cells 12, 23, … are one row up, and cell 1k is at the top; b is the substring length, s and t are its start and end, and j is the split point
  • Cell ij remembers every variable that can derive the substring xi…xj
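A compact Python sketch of the table-filling procedure, run on the grammar and input from slide 20. It uses 0-based inclusive indices, so cell (s, t) holds the variables that derive x[s..t]. The function name `cyk_table` and the grammar encoding are assumptions, and the sketch expects a non-empty input string (the case x = ε is handled separately through S → ε).

```python
def cyk_table(grammar, x):
    """Fill the CYK table for a CNF grammar and a non-empty string x."""
    k = len(x)
    table = {}
    # Substrings of length 1: productions of the form A -> a.
    for i, ch in enumerate(x):
        table[(i, i)] = {A for A, bodies in grammar.items() if (ch,) in bodies}
    # Longer substrings, shortest first.
    for b in range(2, k + 1):            # b = substring length
        for s in range(k - b + 1):       # s = start index
            t = s + b - 1                # t = end index (inclusive)
            cell = set()
            for j in range(s, t):        # split into x[s..j] and x[j+1..t]
                for A, bodies in grammar.items():
                    for body in bodies:
                        if (len(body) == 2
                                and body[0] in table[(s, j)]
                                and body[1] in table[(j + 1, t)]):
                            cell.add(A)
            table[(s, t)] = cell
    return table


# Grammar and input from slide 20.
grammar = {
    "S": {("A", "B"), ("B", "C")},
    "A": {("B", "A"), ("a",)},
    "B": {("C", "C"), ("b",)},
    "C": {("A", "B"), ("a",)},
}
x = "baaba"
table = cyk_table(grammar, x)
accepted = "S" in table[(0, len(x) - 1)]
```

The string is accepted exactly when the start variable appears in the cell covering the whole input. The three nested loops over b, s and j give a running time cubic in the length of x, times the size of the grammar.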