
# Theory of Computation: CNF Parse Trees & Yield String Lengths, CFL Pumping Lemma & Its Use, Finite Automata, CFGs, & Compilation



1. Theory of Computation: CNF Parse Trees & Yield String Lengths, CFL Pumping Lemma & Its Use, Finite Automata, CFGs, & Compilation. Vladimir Kulyukin
2. Outline ● CNF Parse Trees & Yield String Lengths ● CFL Pumping Lemma & Its Use ● Finite Automata, CFGs, and Compilation ● Tokenization ● Syntactic Analysis
3. CNF Parse Trees & Yield String Lengths
4. Review: Chomsky Normal Form (CNF). A grammar G = (V, T, S, P) is said to be in Chomsky Normal Form (CNF) if each production in P has one of the following forms: 1) A → BC; 2) A → a, where A, B, C are in V and a is in T.
5. Sample CNF Derivation (slides 5 to 9 build the parse tree for the string bab step by step). G's productions: 1) S → AB | BC; 2) A → BA | a; 3) B → CC | b; 4) C → AB | a. The root S expands by S → BC; the left child B yields b by B → b; the right child C expands by C → AB, where A → a and B → b. Read as a leftmost derivation: S ⇒ BC ⇒ bC ⇒ bAB ⇒ baB ⇒ bab.
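The sample derivation can be replayed mechanically. The following is a minimal Python sketch, assuming a dictionary encoding of G's productions; the names `PRODUCTIONS`, `leftmost_step`, `derive`, and `BAB_STEPS` are hypothetical and do not come from the slides.

```python
# Grammar G from the slides, as a map from each variable to its right-hand sides.
# Each right-hand side is a tuple of symbols; symbols that are not keys of
# PRODUCTIONS (here: "a" and "b") are treated as terminals.
PRODUCTIONS = {
    "S": [("A", "B"), ("B", "C")],
    "A": [("B", "A"), ("a",)],
    "B": [("C", "C"), ("b",)],
    "C": [("A", "B"), ("a",)],
}


def leftmost_step(form, rhs):
    """Rewrite the leftmost variable of a sentential form using rhs."""
    for pos, symbol in enumerate(form):
        if symbol in PRODUCTIONS:
            if rhs not in PRODUCTIONS[symbol]:
                raise ValueError(f"{symbol} -> {' '.join(rhs)} is not in G")
            return form[:pos] + rhs + form[pos + 1:]
    raise ValueError("no variable left to rewrite")


def derive(steps, start=("S",)):
    """Apply the right-hand sides in order; return every sentential form."""
    forms = [start]
    for rhs in steps:
        forms.append(leftmost_step(forms[-1], rhs))
    return forms


# Leftmost derivation S => BC => bC => bAB => baB => bab
BAB_STEPS = [("B", "C"), ("b",), ("A", "B"), ("a",), ("b",)]

if __name__ == "__main__":
    for form in derive(BAB_STEPS):
        print("".join(form))
```

Each step is checked against G, so a step that uses a production outside the grammar raises an error instead of silently producing a string.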
10. CNF Parse Tree Lemma. Suppose that L is a CFL and G is a CNF grammar for L – {ɛ}. If a string z is in L and the parse tree T for z has no path of length greater than i, then $|z| \le 2^{i-1}$.
11. Example: Parse Tree & String Length. The parse tree for bab (S → BC, B → b, C → AB, A → a, B → b) has no path of length greater than 3, and the length of the string satisfies $|bab| = 3 \le 2^{3-1} = 4$.
12. CNF Parse Tree Lemma: Proof. The proof is by induction on the length i of the longest path in the CNF parse tree. Base case: if i = 1, the parse tree must consist of the root S with a single leaf child a, where a is some terminal symbol (production S → a). Then $|z| = 1 \le 2^{0}$, so the statement of the lemma is true.
13. CNF Parse Tree Lemma: Proof. Inductive case: if i > 1, the root must use a production of the form S → AB, so the CNF parse tree consists of the root S with two subtrees: T1, rooted at A, and T2, rooted at B.
14. CNF Parse Tree Lemma: Proof. By the inductive hypothesis, the yield of T1 (the substring covered by the A-rooted tree) has length no greater than $2^{i-2}$, because T1 has no path of length greater than i-1. The same is true for T2. Thus the yield of the S-rooted tree has length no greater than $2 \cdot 2^{i-2} = 2^{i-1}$.
15. Equivalent Formulation of CNF Parse Tree Lemma. Suppose that L is a CFL and G is a CNF grammar for L – {ɛ}. If a string z is in L and $|z| > 2^{i-1}$, then the parse tree T for z has a path of length greater than i.
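The bound in the CNF Parse Tree Lemma can be checked on concrete trees. The sketch below assumes a nested-tuple encoding of parse trees, which is an illustrative choice and not something the slides prescribe; `BAB_TREE`, `tree_yield`, `longest_path`, and `satisfies_lemma` are hypothetical names.

```python
# A parse tree is encoded as (label, child, ...) for a variable vertex and as
# a plain string for a terminal leaf. This encoding is an assumption made for
# illustration only.
BAB_TREE = ("S", ("B", "b"), ("C", ("A", "a"), ("B", "b")))


def tree_yield(tree):
    """Concatenate the terminal leaves from left to right."""
    if isinstance(tree, str):
        return tree
    return "".join(tree_yield(child) for child in tree[1:])


def longest_path(tree):
    """Number of edges on the longest root-to-leaf path."""
    if isinstance(tree, str):
        return 0
    return 1 + max(longest_path(child) for child in tree[1:])


def satisfies_lemma(tree):
    """Check |z| <= 2^(i-1), where i is the longest path length of the tree."""
    i = longest_path(tree)
    return len(tree_yield(tree)) <= 2 ** (i - 1)


if __name__ == "__main__":
    print(tree_yield(BAB_TREE), longest_path(BAB_TREE), satisfies_lemma(BAB_TREE))
```

For the tree of bab, the longest path is S, C, A, a (three edges), which matches the example slide: a yield of length 3 against a bound of 4.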
16. Pumping Lemma for CFLs (CFLPL)
17. Review: Pumping Lemma for Regular Languages. Let $L = L(M)$, where $M$ is a DFA with $n$ states. Let $x \in L$ and $|x| \ge n$. Then we can write $x = uvw$, where $|v| \ge 1$ and $uv^{i}w \in L$ for $i \ge 0$.
18. The Pumping Lemma for CFLs. If L is a CFL, there exists a constant n (that depends on the number of variables in a CNF grammar for L) such that if z is in L and |z| ≥ n, then z can be written as z = uvwxy such that: 1) |vx| ≥ 1; 2) |vwx| ≤ n; 3) for all i ≥ 0, $uv^{i}wx^{i}y$ is in L.
19. The Pumping Lemma for CFLs: Proof. Suppose that G is a CNF grammar for L – {ɛ} with k variables, i.e., |V| = k > 1. If |V| = 1, the grammar is trivial. Let $n = 2^{k}$. Suppose that z is in L and $|z| \ge n = 2^{k}$. Since $2^{k} > 2^{k-1}$, the equivalent formulation of the CNF Parse Tree Lemma (with i = k) implies that the parse tree for z has a path of length at least k+1. Such a path must have at least k+2 vertices. Only the last vertex on that path is a terminal symbol. The remaining k+1 vertices are variables. What does the last statement imply?
20. The Pumping Lemma for CFLs: Proof. Some variable must appear at least twice on that path: the path has at least k+1 variable vertices, but G has only k variables.
21. The Pumping Lemma for CFLs: Proof. Let P be a path that is as long as or longer than any other path in the parse tree for z. Then the following three statements are true: 1) P has two vertices v1 and v2 that have the same label A; 2) vertex v1 is closer to the root S than v2; 3) the path from v1 to the leaf is of length at most k+1.
22. The Pumping Lemma for CFLs: Proof. Both v1 and v2 can be found as follows: start at the leaf of P and keep going up toward the root. Among the first k+1 variable vertices encountered, two must have the same label; call the one closer to the root v1 and the other v2. Thus, the path from v1 to v2 has length at most k. Hence, the path from v1 to the leaf has length at most k+1.
23. The Pumping Lemma for CFLs: Proof. Suppose T1 is the parse tree rooted at v1 and T2 is the parse tree rooted at v2. We know that T2 must be a subtree of T1. Assume that v1 and v2 are both labeled X. (Figure: T2 nested inside T1, joined by the path from v1 = X to v2 = X.)
24. The Pumping Lemma for CFLs: Proof. Suppose z1 is the yield of T1 and z2 is the yield of T2. (Figure: T1 with yield z1 contains T2 with yield z2, joined by the path from v1 = X to v2 = X.)
25. The Pumping Lemma for CFLs: Proof. z1 = z3z2z4, where z3 and z4 cannot both be empty. Why?
26. The Pumping Lemma for CFLs: Proof. z1 = z3z2z4, where z3 and z4 cannot both be empty. Why? Because the first production used in the derivation of z1 must have been of the form X → ZY. Why?
27. The Pumping Lemma for CFLs: Proof. z1 = z3z2z4, where z3 and z4 cannot both be empty. Why? Because the first production used in the derivation of z1 must have been of the form X → ZY. Why? Because |z1| > 1 and G is a CNF grammar. T2 lies inside the subtree rooted at Z or the subtree rooted at Y, and the other subtree has a non-empty yield, since a CNF grammar has no ɛ-productions.
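28. The Pumping Lemma for CFLs: Proof. We can pump! $X \Rightarrow^{*} z_3 X z_4 \Rightarrow^{*} z_3 z_3 X z_4 z_4 \Rightarrow^{*} \cdots \Rightarrow^{*} z_3^{i} X z_4^{i}$, and the last X can be rewritten to $z_2$. Writing $z = u z_1 y$ and taking $v = z_3$, $w = z_2$, $x = z_4$ gives $uv^{i}wx^{i}y \in L$ for all $i \ge 0$. Moreover, $|vwx| = |z_1| \le 2^{k} = n$ by the CNF Parse Tree Lemma, because the path from v1 to the leaf has length at most k+1.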
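To see the pumped strings concretely, here is a small Python sketch. It assumes the language {a^k b^k | k ≥ 1}, a standard CFL (generated by S → aSb | ab) that does not appear in the slides, and an arbitrary stand-in value n = 4 for the pumping constant; `pump`, `in_anbn`, and the chosen decomposition are hypothetical illustrations.

```python
def pump(u, v, w, x, y, i):
    """Return u v^i w x^i y."""
    return u + v * i + w + x * i + y


def in_anbn(s):
    """Membership test for {a^k b^k | k >= 1} (an assumed example language)."""
    k = len(s) // 2
    return k >= 1 and s == "a" * k + "b" * k


# n = 4 is an arbitrary stand-in for the pumping constant, chosen for illustration.
n = 4
z = "a" * n + "b" * n

# One decomposition z = uvwxy with |vx| >= 1 and |vwx| <= n:
# v is the last a, x is the first b, and w is empty.
u, v, w, x, y = "a" * (n - 1), "a", "", "b", "b" * (n - 1)

if __name__ == "__main__":
    for i in range(4):
        pumped = pump(u, v, w, x, y, i)
        print(i, pumped, in_anbn(pumped))
```

Because v and x are pumped the same number of times, the counts of a's and b's stay equal, which is the behavior the lemma describes for strings of a CFL.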
29. Example of the CFL Pumping Lemma Use. Claim: $L = \{a^{k}b^{k}c^{k} \mid k \ge 1\}$ is not a CFL. Proof: Suppose L is a CFL. Let n be the constant of the CFL Pumping Lemma. Let $z = a^{n}b^{n}c^{n}$. By the CFL Pumping Lemma, z = uvwxy with |vwx| ≤ n and |vx| ≥ 1. Since |vwx| ≤ n, it is impossible for the substring vx to contain both a's and c's. There are five cases to consider: vx may contain 1) only a's; 2) a's and b's; 3) b's and c's; 4) only b's; 5) only c's.
30. Example of the CFL Pumping Lemma Use. Proof continued: Consider case 1, where vx contains only a's. Then $uv^{0}wx^{0}y = uwy$ has fewer a's than b's and c's, so it is not in L. The same type of argument can be used for cases 4 and 5. Consider case 2, where vx contains a's and b's. Then uwy contains fewer a's than c's and fewer b's than c's, so it is not in L. The same type of argument holds for case 3, where vx contains b's and c's. In every case the pumped string is outside L, which contradicts the CFL Pumping Lemma, so L is not a CFL.
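The case analysis above can be cross-checked by brute force for a small n. This is a sketch under the assumption that n = 4 stands in for the (unknown) pumping constant; `in_language`, `decompositions`, and `survives_pumping_down` are hypothetical helper names. It enumerates every split z = uvwxy allowed by the lemma and tests the i = 0 string.

```python
def in_language(s):
    """Membership test for L = {a^k b^k c^k | k >= 1}."""
    k = len(s) // 3
    return k >= 1 and s == "a" * k + "b" * k + "c" * k


def decompositions(z, n):
    """Yield every split z = uvwxy with |vx| >= 1 and |vwx| <= n."""
    length = len(z)
    for p in range(length + 1):                 # u = z[:p]
        for q in range(p, length + 1):          # v = z[p:q]
            for r in range(q, length + 1):      # w = z[q:r]
                for s in range(r, length + 1):  # x = z[r:s], y = z[s:]
                    if s - p <= n and (q - p) + (s - r) >= 1:
                        yield z[:p], z[p:q], z[q:r], z[r:s], z[s:]


def survives_pumping_down(z, n):
    """Return the decompositions whose i = 0 string u w y is still in L."""
    return [d for d in decompositions(z, n)
            if in_language(d[0] + d[2] + d[4])]


if __name__ == "__main__":
    n = 4  # assumed stand-in for the pumping constant
    z = "a" * n + "b" * n + "c" * n
    print(survives_pumping_down(z, n))
```

An empty result means that no admissible decomposition survives pumping down, in line with the proof. The check covers only one value of n, so it illustrates the argument and does not replace it.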
31. Two Pumping Lemmas Side By Side ● The Pumping Lemma for regular languages states that every sufficiently long string in a given regular set has a non-empty substring that can be pumped ● The Pumping Lemma for CFLs states that every sufficiently long string in a given CFL has two substrings, not both empty, that can be pumped the same number of times ● In both cases, new strings obtained through pumping remain in the same language from which the original string comes
32. Two Pumping Lemmas Side By Side ● The Pumping Lemmas are not used to prove that specific languages are regular or CF ● The Pumping Lemmas state that if a language is regular or CF, then its sufficiently long strings have specific properties ● A typical use is to assume that a language is regular/CF and then show that some sufficiently long string does not have those properties, which is a contradiction
33. Summary ● The pumping lemma for regular languages can be used to show that there are languages that are not regular ● The pumping lemma for CFLs can be used to show that there are languages that are not CF ● In summary, there are languages that are neither regular nor CF
34. Finite Automata, Context-Free Grammars, & Compilation
35. Three Stages of Compilation ● Syntactic Analysis: The source program is processed to determine its structure and its conformity to the language grammar ● Contextual Analysis: The output of the syntactic analysis (a parse tree) is checked for its conformity to the language's contextual constraints ● Code Generation: The checked parse tree is used to generate the target code, e.g., Java byte code, assembly, or some other target language
36. Components of Syntactic Analysis ● Syntactic Analysis consists of Tokenization and Parsing ● Tokenization: We have to define a set of FAs (or, equivalently, regular expressions) to tokenize input statements (primitive instructions) ● Parsing: We have to define a CFG to map tokenized input statements (primitive instructions) into parse trees
37. Tokenization: Two Basic Design Principles ● Zero Token Ambiguity: Each sequence of non-white-space characters must be mapped to at most one token ● Zero Statement (Instruction) Ambiguity: Each sequence of tokens recognized between the beginning of a line and a newline character must have at most one parse tree
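The zero-token-ambiguity principle can be sketched with Python's `re` module. The instruction syntax, token class names, and patterns below are hypothetical, since the slides do not define a concrete instruction language; the point is only that each whitespace-separated chunk must match exactly one regular expression in full.

```python
import re

# Hypothetical token classes for a tiny instruction language. The names and
# patterns are assumptions for illustration, not taken from the slides.
TOKEN_PATTERNS = {
    "IF": re.compile(r"IF"),
    "GOTO": re.compile(r"GOTO"),
    "VAR": re.compile(r"[XYZ][0-9]+"),
    "LABEL": re.compile(r"[A-E][0-9]+"),
    "NUM": re.compile(r"[0-9]+"),
    "ASSIGN": re.compile(r"<-"),
    "OP": re.compile(r"[+-]"),
    "NEQ": re.compile(r"!="),
}


def tokenize(line):
    """Map each non-white-space chunk of a line to exactly one token."""
    tokens = []
    for chunk in line.split():
        kinds = [kind for kind, pattern in TOKEN_PATTERNS.items()
                 if pattern.fullmatch(chunk)]
        if len(kinds) != 1:
            # Zero matches is a lexical error; two or more is token ambiguity.
            raise ValueError(f"{chunk!r} matches {len(kinds)} token classes")
        tokens.append((kinds[0], chunk))
    return tokens


if __name__ == "__main__":
    print(tokenize("X1 <- X1 + 1"))
    print(tokenize("IF X1 != 0 GOTO A1"))
```

Using `fullmatch` on each chunk, instead of scanning for the first pattern that matches a prefix, makes an overlap between token classes show up as an error rather than being resolved silently by pattern order. The resulting token sequence is what a CFG-based parser would consume in the parsing step.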
38. References & Reading Suggestions ● Hopcroft and Ullman. Introduction to Automata Theory, Languages, and Computation, Narosa Publishing House ● Moll, Arbib, and Kfoury. An Introduction to Formal Language Theory ● Davis, Weyuker, Sigal. Computability, Complexity, and Languages, 2nd Edition, Academic Press ● Brooks Webber. Formal Language: A Practical Introduction, Franklin, Beedle & Associates, Inc.