3. CHOMSKY NORMAL FORM
A context-free grammar is in Chomsky normal
form if every rule is of the form:
A → BC
A → a
S → ε
B and C aren’t start variables
a is a terminal
S is the start variable
Any variable A (except the start variable)
can only generate strings of length > 0
4. CHOMSKY NORMAL FORM
A context-free grammar is in Chomsky normal
form if every rule is of the form:
A → BC
A → a
S → ε
B and C aren’t start variables
a is a terminal
S is the start variable
S → 0S1
S → TT
T → ε S → TU | TV
T → 0
U → SV
S0 → TU | TV | ε
V → 1
5. 1. Add a new start variable S0
and add the rule S0 → S S → 0S1
S → T#T
T → ε
S0 → S
2. Remove all A → ε rules
(where A is not S0)
For each occurrence of A on right
hand side of a rule, add a new rule
with the occurrence deleted
If we have the rule B → A, add
B → ε, unless we have
previously removed B → ε
S → T
3. Remove unit rules A → B
Whenever B → w appears, add
the rule A → w unless this was
a unit rule previously removed
S → T#
S → #T
S → #
S → ε
S → 01
S0 → ε
S0 → 0S1
6. S → 0S1
S → T#T
S → T#
S → #T
S → #
S → 01
S0 → ε
S0 → 0S1
S0 → T#T
S0 → T#
S0 → #T
S0 → #
S0 → 01
4. Convert all remaining rules into the
proper form:
S0 → 0S1
S0 → A1A2
A1 → 0
A2 → SA3
A3 → 1
S0 → 01
S0 → A1A3
S → 01
S → A1A3
7. Convert the following into Chomsky normal form:
A → BAB | B | ε
B → 00 | ε
A → BAB | B | ε
B → 00 | ε
S0 → A
A → BAB | B | BB | AB | BA
B → 00
S0 → A | ε
A → BAB | 00 | BB | AB | BA
B → 00
S0 → BAB | 00 | BB | AB | BA | ε
S0 → BC | DD | BB | AB | BA | ε, C → AB,
A → BC | DD | BB | AB | BA , B → DD, D → 0
9. 0 → 0, R
read write move
→ , R
qaccept
qreject
0 → 0, R
→ , R
0 → 0, R
→ , L
10. 0 → 0, R
read write move
→ , R
qaccept
0 → 0, R
→ , R
0 → 0, R
→ , L
11. TMs VERSUS FINITE AUTOMATA
TM can both write to and read from the tape
The head can move left and right
The input doesn’t have to be read entirely,
Accept and Reject take immediate effect
and the computation can continue after all
the input has been read
12. Definition: A Turing Machine is a 7-tuple
T = (Q, Σ, Γ, , q0, qaccept, qreject), where:
Q is a finite set of states
Γ is the tape alphabet, where Γ and Σ Γ
q0 Q is the start state
Σ is the input alphabet, where Σ
: Q Γ → Q Γ {L, R}
qaccept Q is the accept state
qreject Q is the reject state, and qreject qaccept
14. A TM recognizes a language iff it accepts all
and only those strings in the language
A TM decides a language L iff it accepts all
strings in L and rejects all strings not in L
A language L is called Turing-recognizable
or recursively enumerable
iff some TM recognizes L
A language L is called decidable or recursive
iff some TM decides L
15. A language is called Turing-recognizable or
recursively enumerable (r.e.) if some TM
recognizes it
A language is called decidable or recursive
if some TM decides it
recursive
languages
r.e.
languages
16. Theorem: If A and A are r.e. then A is recursive
Given:
a TM that recognizes A and
a TM that recognizes A,
we can build a new machine that decides A
17. { 0 | n ≥ 0 }
2n
1. Sweep from left to right, cross out every other 0
2. If in stage 1, the tape had only one 0, accept
3. If in stage 1, the tape had an odd number of 0’s,
reject
4. Move the head back to the first input symbol.
5. Go to stage 1.
PSEUDOCODE:
18. 0 → , R
→ , R
qaccept
qreject
0 → x, R
x → x, R
→ , R
x → x, R
0 → 0, L
x → x, L
x → x, R
→ , L
→ , R
0 → x, R
0 → 0, R
→ , R
x → x, R
{ 0 | n ≥ 0 }
2n
q0 q1
q2
q3
q4
19. 0 → , R
→ , R
qaccept
qreject
0 → x, R
x → x, R
→ , R
x → x, R
0 → 0, L
x → x, L
x → x, R
→ , L
→ , R
0 → x, R
0 → 0, R
→ , R
x → x, R
{ 0 | n ≥ 0 }
2n
q0 q1
q2
q3
q4
q00000
q1000
xq300
x0q40
x0xq3
x0q2x
xq20x
q2x0x
q2x0x
20. C = {aibjck | k = ij, and i, j, k ≥ 1}
1. If the input doesn’t match a*b*c*, reject.
2. Move the head back to the leftmost symbol.
3. Cross off an a, scan to the right until b.
Sweep between b’s and c’s, crossing off one of
each until all b’s are crossed off.
4. Uncross all the b’s.
If there’s another a left, then repeat stage 3.
If all a’s are crossed out,
Check if all c’s are crossed off.
If yes, then accept, else reject.
PSEUDOCODE:
21. C = {aibjck | k = ij, and i, j, k ≥ 1}
aabbbcccccc
xabbbcccccc
xayyyzzzccc
xabbbzzzccc
xxyyyzzzzzz
23. Theorem: Every Multitape Turing Machine can be
transformed into a single tape Turing Machine
FINITE
STATE
CONTROL
0 0
1
FINITE
STATE
CONTROL 0 0
1 # # #
. . .
24. Theorem: Every Multitape Turing Machine can be
transformed into a single tape Turing Machine
FINITE
STATE
CONTROL
0 0
1
FINITE
STATE
CONTROL 0 0
1 # # #
. . .
26. We can encode a TM as a string of 0s and 1s
0n10m10k10s10t10r10u1…
n states
m tape symbols
(first k are input
symbols)
start
state
accept
state
reject
state
blank
symbol
( (p, a), (q, b, L) ) = 0p10a10q10b10
( (p, a), (q, b, R) ) = 0p10a10q10b11
27. Similarly, we can encode DFAs, NFAs,
CFGs, etc. into strings of 0s and 1s
ADFA = { (B, w) | B is a DFA that accepts string w }
ANFA = { (B, w) | B is an NFA that accepts string w }
ACFG = { (G, w) | G is a CFG that generates string w }
So we can define the following languages:
28. ADFA = { (B, w) | B is a DFA that accepts string w }
Theorem: ADFA is decidable
Proof Idea: Simulate B on w
ACFG = { (G, w) | G is a CFG that generates string w }
Theorem: ACFG is decidable
Proof Idea: Transform G into Chomsky Normal
Form. Try all derivations of length 2|w|-1
ANFA = { (B, w) | B is an NFA that accepts string w }
Theorem: ANFA is decidable
Chomsky normal form: a special way of writing a CFG.
So each rule either has two variables produced, or a single terminal produced.
And only the start can produce epsilon.
Note this doesn’t mean that the grammar must generate epsilon: it just says that the production rule that can generate epsilon must have the start variable on the left hand side.
Now I’ll give a procedure for converting a grammar into CNF.
...
We can just remove S -> T, no rules of the form T -> w
…
S_0 -> S: replace with S_0 -> 0S1, once we do this replacement for all rules we’ll get… (next slide)
After applying these rules some more, we get this grammar on the right.
No more eps-productions, no more unit-productions.
We can convert S0 -> 0S1 as follows.
Note that all the rules involving a T can be removed, there’s nothing you can do with them!
So what does the final grammar look like? What are the rules?
We’d also have to convert S -> 0S1, just as in the other case.
1. Get rid of epsilon-productions.
2. Get rid of the unit productions.
3. Then clean up the remaining rules so they have two variables or one symbol on each RHS.
So that’s how you get a CNF grammar from an arbitrary one.
That concludes our discussion of context-free languages.
Now we’ll shift to a much more general computational model, called the Turing machine.
Named after its inventor…
It has a finite control, like that of a DFA, but it can read and write to its input, which we think of as being presented on an infinite tape.
Only a finite part of the tape has input symbols on it, and the rest of the tape is filled with special “blank” symbols in both directions.
This pointer here is the TAPE HEAD. It begins at the leftmost input symbol.
Here’s what a “step” of a Turing machine looks like:
from a state q0, it will read the symbol under the head, write a symbol, change its state, and move its head left or right by one.
Now the state diagrams for a TM look something like this.
We have a TM here. We have both accept states and reject states.
What language does it recognize?
Suppose we take away the reject state and replace it with this loop.
We will say that this new TM still recognizes this language,
even though there are inputs for which it will run forever.
But note that a Turing machine need not accept OR reject on an input:
it could run forever.
We can completely describe a Turing machine at any step of its computation using strings called configurations.
These are strings over the tape alphabet and the states (treating the states as symbols).
For example, this string means that the Turing machine has
these 1’s and 0’s on its tape,
the state is q7, and
its tape head is reading the 0 to the right of it.
Configurations are like “snapshots” of the TM at a given point.
With configurations, we can give a formal definition of TM accepting a string. (Read Sipser.)
On input w, the “start” configuration is ‘q0 w’
From any particular configuration, there is a “next” configuration we can get to by applying the transition function.
And if we reach a configuration that has an accept state in it, then the machine accepts.
Now, since a TM can run forever on some inputs,
we have two different ways for a TM to “compute” a language.
So what does this theorem mean?
(Click)
How might we prove this?
(Don’t go on until it’s proved.)
How can we get a TM that decides this language?
As with the last lecture, we’ll want to use precise pseudocode whenever we can, because it’s just so much easier to understand.
But for every line of pseudocode you write, you need to think about how you’d convert that into a formal specification.
(Every execution of stage 1 halves the number of 0’s.
If we get to “only one 0” and we never got an odd number of 0’s greater than one,
then we must have had a power of two.)
This is a state diagram corresponding to the pseudocode of the previous slide.
Let’s think about how the two correspond.
We have an extra symbol “x” that is used to cross out symbols.
We start by writing the first 0 over with a blank. If there was just a single 0, we accept.
(This corresponds to stage 2.)
In states q3 and q4, we cross out every other 0, until we see a blank. (Stage 1)
If we see a blank in state q4, then the number of 0’s left was odd, so we reject. (Stage 3)
If we see a blank in state q3, then the number was even, so now in q2 we move left until we hit blank
(Stage 4).
We move back to q1.
Let’s look at what this TM does by looking at its configurations on an input with four zeroes.
….
….
And the process will continue from q1 at this point.
(If not much time left, skip this.)
We can have a TM that does some basic arithmetic.
If we run out of c’s in stage 3, we’ll reject in that case too.
Here’s an example.
One amazing aspect of Turing machines is that you can define all sorts of variations on them, give them more complicated ways to store information, faster ways to access information, whatever, but so long as your new TM only reads and writes a finite number of symbols in each step, it doesn’t matter: an old TM can simulate it!
Here’s an example.
We can define Turing machines with multiple tapes.
We can imagine the input coming in (on one tape), and using the other tapes for scratch work.
This can easily be stuck into the formalism, by redefining the transition function: now it takes k tape symbols, and outputs k tape symbols, and k tape directions.
We can simulate these multiple tapes with one tape, in the following way.
We write down the contents of all these tapes in order, and put a # between two tapes.
To keep track of where all the heads of the multitape machine are,
we double the size of the alphabet to include “marked” symbols,
and a mark above a symbol indicates the head position.
When some tape head moves, we move the mark.
When the multi-tape machine makes one of its tape contents “longer” (by writing over some blanks),
the simulation of it moves all the tape content beyond over by one.
Like, suppose the multitape machine adds a 0 to the end of the first tape.
Then everything from the # on is shifted over to the right, and 0 is written.
The facts that:
(1) we can change the definition of a TM so much, without changing what can be decided by them,
AND
(2) everything we can think of that can be computed with an intuitive idea of an algorithm can be done with TMs,
led mathematicians to state the Church-Turing thesis.
If some language can be computed,
in our intuitive sense of what it means to “compute”,
then we can build a Turing machine that decides it.
An important observation that is used time and time again in computability theory and complexity theory,
is that we can encode TMs.
This means that we can have TMs compute on the code of other TMs.
…
Now if we think of the states and tape symbols as integers, then we can encode transitions as follows…