Basic Foundations of Automata Theory

BY: DABBAL S. MAHARA
Unit – 1 Basic Foundations
Motivation
Theory of computation is a course about computation itself. By computation we mean any task
performed by a computer or any other machine. The fundamental idea behind computation is the
execution of a program on some computer.
Programs are algorithms expressed in some programming language. An algorithm is a recipe for
transforming input into output. The theory of computation is therefore, indirectly, a theory of
programs or a theory of algorithms. Every algorithm tells us how to compute a function.
A function is an abstract notion: it tells us that there is a mapping between input and output, or
domain and range. An algorithm specifies how to obtain the output for a specific input. It might
seem that once we can define a function, the definition itself should be an algorithm. However,
that is not the case. As an illustration, consider the function:
is_prime : number → {yes, no}
i.e. is_prime(n) = yes if n is prime, and no otherwise.
This definition does not tell us how to actually obtain the right answer 'yes' or 'no' for a given
input n. For a given n, we have to write an algorithm to figure out whether it is prime or not. The
point is that it may be possible to define a function, yet the definition of the function does not
in all cases immediately point to an algorithm that computes it.
It is therefore possible to define a function, or to describe what the output should be, without
having any idea how to obtain the correct answer. In fact, if we consider the class of all
functions, only a tiny fraction of them admit algorithms to compute them.
Therefore, the primary goal of this course, Theory of Computation, is to give insight into the
computability of problems. The course will not talk about computability of functions directly.
Instead, it will talk about different hierarchies of computational problems and their computational
models in terms of the set membership problem.
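To make the gap between a definition and an algorithm concrete, here is a minimal sketch of one possible algorithm for is_prime, using trial division. The choice of trial division and the treatment of numbers below 2 as "no" are assumptions of this sketch; many other algorithms compute the same function.

```python
def is_prime(n: int) -> str:
    """Answer 'yes' if n is prime and 'no' otherwise, by trial division."""
    if n < 2:
        return "no"          # assumption: 0, 1 and negatives are not prime
    d = 2
    while d * d <= n:        # a composite n has a divisor d with d*d <= n
        if n % d == 0:
            return "no"
        d += 1
    return "yes"


# The definition says WHAT the answer is; the loop above says HOW to find it.
print([n for n in range(20) if is_prime(n) == "yes"])
```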
Objectives of the Course
The objectives of this course are as follows:
• To understand the basic concepts in the theory of computation through simple models of
computational devices.
• To apply these models in practice to solve problems in diverse areas such as string searching,
pattern matching, cryptography, and language design.
• To understand the limitations of computing, the relative power of formal languages and the
inherent complexity of many computational problems.
• To be familiar with standard tools and notation for formal reasoning about machines and
programs.
Components of TOC
TOC is the study of the power and limits of computing devices. The first question it answers is
whether a computer or system can solve a given problem at all. The second question is, if the
problem can be solved, how efficiently it can be solved. Therefore, the primary concern of TOC
is computability; algorithms are secondary. Depending on the complexity of the problem to be
solved, there are different types of computational models or machines, from simple finite state
machines to the far more powerful Turing machine.
TOC has the following three interacting components:
1. Computability Theory
2. Complexity Theory
3. Automata Theory and formal languages
1. Computability Theory: Computability theory answers the questions: What can or cannot be
computed by a computer? Are there problems that no program can solve? The objective of
computability theory is to classify problems into those that are solvable and those that are
not solvable.
2. Complexity Theory: It answers the questions: What can be computed efficiently? Are there
problems that no program can solve in a limited amount of time or space? The objective of this
theory is to classify problems as easy or hard.
3. Automata Theory and Formal Languages: It is the study of abstract machines and their
properties, providing a mathematical notion of a “computer”. Automata are abstract
mathematical models of machines that perform computations on an input by moving through
a series of states or configurations. The objective of automata theory is to deal with the
definitions and properties of mathematical models of computation.
A formal language is a language defined mathematically. These languages are closely related to
automata theory, as automata are used to recognize formal languages and grammars are used to
generate them. A language is defined by a grammar. Noam Chomsky introduced the concept of
formal grammars in 1959; his classification is popularly known as the Chomsky hierarchy of grammars.
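The idea of a machine "moving through a series of states" can be sketched with a very small finite state machine. The machine below is a hypothetical example, not one taken from these notes: it reads a binary string and accepts it when the number of 1s is even. The state names and the transition table are assumptions made for illustration.

```python
# Hypothetical two-state machine: tracks whether the number of 1s seen so far is even.
TRANSITIONS = {
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "odd",
    ("odd", "1"): "even",
}


def accepts(w: str, start: str = "even", accepting=frozenset({"even"})) -> bool:
    """Run the machine on w, one symbol at a time, and report acceptance."""
    state = start
    for symbol in w:
        state = TRANSITIONS[(state, symbol)]   # one move per input symbol
    return state in accepting


for w in ["", "1", "11", "1011"]:
    print(repr(w), accepts(w))
```

The set of strings this machine accepts is a formal language, which is the sense in which automata recognize languages.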

Why Study Automata Theory?
• For software that designs and checks the behavior of digital circuits, and that verifies
communication protocols or protocols for the secure exchange of information.
• For designing software that scans large bodies of text, such as collections of web pages,
to find occurrences of words, phrases and patterns (i.e. pattern recognition, string matching, etc.).
• For designing the “lexical analyzer” of a compiler, which breaks the input text into logical
units called “tokens”.
• It is a useful concept in software for natural language processing.
Abstract Model
An abstract model is a model of a computer system (considered either as hardware or
software) constructed to allow a detailed and precise analysis of how the computer system works.
Such a model usually consists of input, output and the operations that can be performed, and so
can be thought of as a processor. E.g. an abstract machine that models a banking system can have
operations like “deposit”, “withdraw”, “transfer”, etc.
Brief History
In 1936, before any computers existed, the British mathematician, logician and computer scientist
Alan Turing wrote a paper that defined an abstract machine now called the Turing machine. It
captures the power and limitations of any computational machine. The point is that whatever a
Turing machine can do, a computer can do; and whatever a Turing machine cannot do, no computer
can do. In other words, whatever can be done by an algorithm can be done by a Turing machine.
So, just by studying Turing machines, we can find the power and limitations of any computer.
Later, in the 1940s and 1950s, a number of researchers introduced simpler kinds of machines called
finite automata. [See References for Chapter 2 of Textbook]
In the late 1950s the linguist N. Chomsky began the study of formal grammars, which are closely
related to abstract automata.
In 1969 S. Cook extended Turing’s study of what could and what could not be computed,
and problems came to be classified as follows:
Decidable/undecidable: Problems that can be solved by a computer are called decidable. In
computability theory, an undecidable problem is a decision problem for which it is impossible to
construct a single algorithm that always leads to a correct “yes” or “no” answer. An undecidable
problem consists of a family of instances, each requiring a particular yes/no answer, such that
there is no computer program that, given any problem instance as input, terminates and outputs
the required answer after a finite number of steps.

Tractable/intractable: Problems that can be solved by some computational model, typically a
Turing machine, using no more time than some slowly growing function of the size of the input are
called “tractable”, i.e. they are solvable within reasonable time and space constraints
(polynomial time). Problems that cannot be solved in polynomial time but require
super-polynomial (e.g. exponential) time are called intractable or hard problems. There are many
problems for which no algorithm with running time better than exponential is known; examples
include the traveling salesman problem, the Hamiltonian cycle problem, and circuit satisfiability.
Mathematical Preliminaries
Set, Relation and Function
• Sets
A set is a well-defined collection of objects treated as a unit. Sets may contain any type
of object, including numbers, symbols and even other sets. The objects in a set are called its
elements or members. Usually the elements of a set share common properties.
Sets may be described formally in several ways.
Examples:
‣ All the students who enroll in the course “Theory of Computation” make
up a set.
‣ The set of even positive integers less than 20 can be expressed by: E = {2,
4, 6, 8, 10, 12, 14, 16, 18}
Or E = {x | x is even and 0 < x < 20}
The symbols ∈ and ∉ denote set membership and non-membership respectively. For example:
2 ∈ E and 3 ∉ E.
• Finite and Infinite Sets
A set is finite if it contains a finite number of elements, and infinite otherwise. The empty set
has no elements and is denoted by ∅.
• Cardinality of a set:
It is the number of elements in a set. For example, the cardinality of the set E above is |E| = 9.
• Subset:
For two sets A and B, A is a subset of B if each element of A is also an element of B;
this is denoted by A ⊆ B. There are two kinds of subsets: proper and improper. If
A is a subset of B and not equal to B, then A is a proper subset of B, written A ⊂ B.
• Power Set:
The power set of a set A is the set of all subsets of A. For example: if A = {0,1}, the power set
of A is the set {∅, {0}, {1}, {0,1}}.

Set operations
‣ Union:
For two sets A and B, the union of A and B, denoted by A ∪ B, is the set we get by
combining all the elements in A and B into a single set.
‣ Intersection:
The intersection of two sets A and B, denoted by A ∩ B, is the set of all elements that are
common to both A and B.
‣ Difference:
The difference of two sets A and B, denoted by A - B, is the set of all elements that are in the
set A but not in the set B.
‣ Complement:
The complement of a set A, written as Ā, is the set of all elements under consideration that
are not in A.
• Sequences and Tuples
A sequence of objects is a list of objects in some order. We usually designate a sequence
by writing the list within parentheses. For example, the sequence 5, 4, 11 would be
written as (5, 4, 11). In a set the order does not matter, but in a sequence it does. Repetition
is not permitted in a set but is allowed in a sequence. Like sets, sequences may be finite or
infinite. Finite sequences are often called tuples. A sequence with k elements is called a
k-tuple. Thus, (5, 4, 11) is a 3-tuple. A 2-tuple is called a pair.
• Cartesian Product:
If A and B are two sets, the Cartesian product or cross product of A and B, written as A × B,
is the set of all pairs wherein the first element is a member of A and the second element is a
member of B. If A = {1, 2} and B = {x, y, z}, then A × B = {(1, x), (1, y), (1, z), (2, x),
(2, y), (2, z)}. We can also take the Cartesian product of k sets A1, A2, A3, ..., Ak, written as
A1 × A2 × A3 × ... × Ak. It consists of all k-tuples (a1, a2, a3, ..., ak) where each ai ∈ Ai.
• Relations And Functions
A binary relation on two sets A and B is a subset of A × B. For example, if A = {1, 3, 9} and
B = {x, y}, then {(1, x), (3, y), (9, x)} is a binary relation on these two sets.
A binary relation R on a set A (i.e. a subset of A × A) is an equivalence relation if R satisfies:
‣ R is reflexive, i.e. for every x, (x, x) ∈ R.
‣ R is symmetric, i.e. for every x and y, (x, y) ∈ R implies (y, x) ∈ R.
‣ R is transitive, i.e. for every x, y and z, (x, y) ∈ R and (y, z) ∈ R imply (x, z) ∈ R.
A function is an object that sets up an input-output relationship, i.e. a function takes an input
and produces the required output. For a function f with input x and output y, we write f(x) = y.
We also say that f maps x to y.
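The set notions in this section map fairly directly onto Python's built-in set type, so they can be tried out in a few lines. The sketch below is only an illustration: the universe U used for the complement, the helper names power_set and is_equivalence, and the two sample relations are all assumptions introduced here.

```python
from itertools import combinations, product

E = {x for x in range(1, 20) if x % 2 == 0}    # E = {x | x is even and 0 < x < 20}
print(len(E), 2 in E, 3 in E)                  # cardinality and membership

A, B = {0, 1}, {1, 2}
print(A | B, A & B, A - B)                     # union, intersection, difference
U = {0, 1, 2, 3}                               # assumed universe of discourse
print(U - A)                                   # complement of A relative to U


def power_set(s):
    """All subsets of s, returned as a set of frozensets."""
    items = list(s)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}


print(len(power_set(A)))                       # a set with n elements has 2**n subsets
print(sorted(product({1, 2}, "xyz")))          # Cartesian product {1, 2} x {x, y, z}


def is_equivalence(relation, domain):
    """Check reflexivity, symmetry and transitivity of a relation on domain."""
    reflexive = all((x, x) in relation for x in domain)
    symmetric = all((y, x) in relation for (x, y) in relation)
    transitive = all((x, w) in relation
                     for (x, y) in relation
                     for (z, w) in relation if y == z)
    return reflexive and symmetric and transitive


domain = {0, 1, 2, 3}
same_parity = {(x, y) for x, y in product(domain, repeat=2) if x % 2 == y % 2}
less_equal = {(x, y) for x, y in product(domain, repeat=2) if x <= y}
print(is_equivalence(same_parity, domain), is_equivalence(less_equal, domain))
```

The relation "x ≤ y" fails the check because it is not symmetric, while "same parity" satisfies all three properties.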

Boolean Logic
Boolean logic is a mathematical system built around the two values TRUE and FALSE. The values
TRUE and FALSE are called Boolean values and are often represented as 1 and 0. We use Boolean
values in situations with two possibilities, such as a wire that may have a high or low voltage, a
proposition that may be true or false, or a question that may be answered yes or no.
Boolean values are manipulated with specially designed operations, called Boolean operations.
The simplest such operation is negation, or the NOT operation, designated with the symbol ¬. The
negation of a Boolean value is the opposite value.
The conjunction, or AND operation, is designated with the symbol ∧. The conjunction of two
Boolean values is 1 if both values are 1, and 0 otherwise.
The disjunction, or OR operation, is designated with the symbol ∨. The disjunction of two Boolean
values is 1 if either of the values is 1, and 0 otherwise.
We can summarize this information as:
0 ∧ 0 = 0    0 ∨ 0 = 0    ¬0 = 1
0 ∧ 1 = 0    0 ∨ 1 = 1    ¬1 = 0
1 ∧ 0 = 0    1 ∨ 0 = 1
1 ∧ 1 = 1    1 ∨ 1 = 1
Several other Boolean operations occasionally appear. The exclusive OR, or XOR, operation,
designated with the symbol ⊕, is 1 if either but not both of its operands are 1. The implication
operation, designated with the symbol →, is 0 if its first operand is 1 and its second operand is
0; otherwise it is 1. Finally, the equality operation, written with the symbol ↔, is 1 if both of
its operands have the same value.
We can summarize the behavior of these operations as:
0 ⊕ 0 = 0    0 ↔ 0 = 1    0 → 0 = 1
0 ⊕ 1 = 1    0 ↔ 1 = 0    0 → 1 = 1
1 ⊕ 0 = 1    1 ↔ 0 = 0    1 → 0 = 0
1 ⊕ 1 = 0    1 ↔ 1 = 1    1 → 1 = 1
We can establish various relationships among these operations. In fact, we can express all the
Boolean operations in terms of AND and NOT. The two expressions in each row below are
equivalent. Each row expresses the operation on the left-hand side in terms of the operations
above it and AND and NOT.
P ∨ Q    ¬(¬P ∧ ¬Q)
P → Q    ¬P ∨ Q
P ↔ Q    (P → Q) ∧ (Q → P)
P ⊕ Q    ¬(P ↔ Q)
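The claim that every Boolean operation can be built from AND and NOT can be checked exhaustively, since two operands give only four input combinations. The sketch below builds OR, implication, equality and XOR from AND and NOT using one standard choice of identities (De Morgan's law for OR, and so on) and compares each against its direct definition. The function names are assumptions of this sketch.

```python
from itertools import product


def NOT(p):
    return 1 - p


def AND(p, q):
    return p & q


def OR(p, q):
    return NOT(AND(NOT(p), NOT(q)))          # De Morgan: p or q = not(not p and not q)


def IMPLIES(p, q):
    return OR(NOT(p), q)


def EQUAL(p, q):
    return AND(IMPLIES(p, q), IMPLIES(q, p))


def XOR(p, q):
    return NOT(EQUAL(p, q))


# Compare every derived operation with its direct definition on all four inputs.
for p, q in product((0, 1), repeat=2):
    assert OR(p, q) == (p | q)
    assert XOR(p, q) == (p ^ q)
    assert IMPLIES(p, q) == (0 if (p, q) == (1, 0) else 1)
    assert EQUAL(p, q) == (1 if p == q else 0)
    print(p, q, OR(p, q), XOR(p, q), IMPLIES(p, q), EQUAL(p, q))
```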

Methods of Proofs
A proof is a convincing logical argument that a statement is true. The only way to determine the
truth or falsity of a mathematical statement is with a mathematical proof. Unfortunately, finding
proofs is not always easy. There are a number of formal proof methods in computer science.
Some proof techniques are as follows:
Mathematical Induction
Mathematical induction can be used to prove statements asserting that P(n) is true for all positive
integers n, where P(n) is a propositional function. A proof by induction has two parts: a basis
step, where we show that P(1) is true, and an inductive step, where we show that for all positive
integers k, if P(k) is true, then P(k+1) is true.
To complete the inductive step of a proof using the principle of mathematical induction, we assume
that P(k) is true for an arbitrary positive integer k (this assumption is called the inductive
hypothesis) and show that, under this assumption, P(k+1) must be true.
Expressed as a rule of inference, this proof technique can be stated as
[P(1) ∧ ∀k (P(k) → P(k+1))] → ∀n P(n), when the domain is the set of positive integers.
When the domain is the set of non-negative integers, so that we need to prove P(n) is true for n =
0, 1, 2, ..., the basis step is P(0).
Example:
Prove that if n is a positive integer, then 1 + 2 + 3 + ... + n = n(n+1)/2.
Let us prove this by induction:
Let P(n) be the proposition that the sum of the first n positive integers is n(n+1)/2. To prove
that P(n) is true for n = 1, 2, 3, ..., we must show that P(1) is true and that the conditional
statement P(k) implies P(k+1) is true for k = 1, 2, 3, ...
Basis step: P(1) is true, since 1= 1(1+1)/2.
Inductive step: Let us assume that P(k) holds for an arbitrary positive integer k, i.e. we assume that,
1+2+... + k = k(k+1)/2.
By using this assumption, let us show that P(k+1) is true, i.e. we show 1+2+.... + k + (k+1) =
(k+1)[(k+1)+1]/2 = (k+1)(k+2)/2.
For this, let us use P(k),
1+2+.. + k + ( k+1) = k(k+1)/2 + (k+1)
= [k(k+1)+2(k+1)]/2
= (k+1) (k+2)/2
This last equation shows that P(k+1) is true under the assumption that P(k) is true. This completes
the proof.

Exercise: Use mathematical induction to show that 1 + 2 + 2^2 + ... + 2^n = 2^(n+1) - 1 for all
non-negative integers n.
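Before attempting an induction proof it can help to test the claimed formula numerically. The sketch below checks both identities from this section for a range of n, and also checks the algebraic step used in the inductive argument for the first one. It is only a sanity check over finitely many values, not a proof; the helper names and the upper bound of 100 are assumptions of this sketch.

```python
def sum_first_n(n: int) -> int:
    """1 + 2 + ... + n, computed term by term."""
    return sum(range(1, n + 1))


def sum_powers_of_two(n: int) -> int:
    """1 + 2 + 2^2 + ... + 2^n, computed term by term."""
    return sum(2 ** i for i in range(n + 1))


for n in range(1, 101):
    assert sum_first_n(n) == n * (n + 1) // 2

# Inductive step of the worked example: k(k+1)/2 + (k+1) equals (k+1)(k+2)/2.
for k in range(1, 101):
    assert k * (k + 1) // 2 + (k + 1) == (k + 1) * (k + 2) // 2

for n in range(0, 101):
    assert sum_powers_of_two(n) == 2 ** (n + 1) - 1

print("both identities hold for every tested n")
```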
Deductive Proofs
A deductive proof consists of a sequence of statements whose truth leads us from some initial
statement, called the hypothesis or the given statement(s), to a conclusion statement. Each step of
the proof must follow, by some accepted logical principle, from either the given facts, or some of
the previous statements in the deductive proof, or a combination of these.
Example:
If a and b are odd integers, then a + b is an even integer.
Proof:
We know that an even number can be represented as 2k, where k is an integer, and
that an odd number can be written as 2m + 1, where m is an integer.
Assume that a = 2k + 1 and b = 2m + 1 for some integers k and m. Then a + b = 2k
+ 1 + 2m + 1 = 2(k + m + 1), where (k + m + 1) is an integer. Hence a + b is an even integer.
Proof by Contrapositive
The contrapositive of the statement "if H then C" is "if not C then not H", i.e. p → q ≡ ¬q →
¬p: the contrapositive of an implication is equivalent to the implication. We prove the implication
p → q by assuming that the conclusion q is false and, using known facts, showing that
the hypothesis p is also false.
Example:
If the product of two integers a and b is even, then either a is even or b is even.
Proof:
Suppose both a and b are odd. Then a = 2k + 1 and b = 2l + 1 for some integers k and l. So ab =
(2k + 1)(2l + 1) = 4kl + 2k + 2l + 1 = 2(2kl + k + l) + 1, i.e. ab is an
odd number.
Hence a and b both being odd implies that ab is odd. By the contrapositive, if ab is even, then
either a is even or b is even.

Proof by Contradiction
In this method, we assume that the theorem is false and then show that this assumption leads to
an obviously false consequence, called a contradiction. For example, Jack sees Jill, who has just
come in from outdoors. Observing that she is completely dry, he knows that it is not raining. His
proof that it is not raining is that, if it were raining (the assumption that the statement is false),
Jill would be wet (the obviously false consequence). Therefore, it must not be raining.
Mathematically, we prove the implication p → q by assuming that p ∧ ¬q is true and showing that
this assumption leads to a contradiction.
Example: If a^2 is an even number, then a is an even number.
Proof:
Assume that a^2 is an even number and a is an odd number. Since a is an odd number,
we have a = 2k + 1 for some integer k. So a^2 = (2k + 1)^2 = 4k^2 + 4k + 1 =
2(2k^2 + 2k) + 1. Here 2k^2 + 2k is some integer, say l, so a^2 = 2l + 1, i.e. a^2
is an odd number. This contradicts our assumption that a^2 is even. Hence the proof.
Basic Concepts of Formal Languages and Automata Theory
• Symbol: A symbol is the basic building block of TOC. It can be any character such as 0, 1, a,
b, c or any other notation.
• An alphabet is a finite non-empty set of symbols. For example, {0, 1} is an alphabet with two
symbols, {a, b} is another alphabet with two symbols, and the English alphabet is also an
alphabet. An alphabet is denoted by Σ.
• A string is a finite sequence of symbols taken from some alphabet. b, a and aabab are
examples of strings over the alphabet {a, b}, and 0, 10 and 001 are examples of strings over
the alphabet {0, 1}.
• An empty or null string is a string with no symbols, usually denoted by ε (epsilon).
• Length of a string: The length of a string w, denoted by |w|, is the number of positions for
symbols in w. For example |00100| = 5, |aab| = 3, |ε| = 0.

• Power of an alphabet: The set of all strings of a certain length k over an alphabet Σ is the kth
power of that alphabet, i.e. Σ^k = { w | w is a string over Σ and |w| = k }
If Σ = {0, 1} then,
Σ^0 = { ε }
Σ^1 = { 0, 1 }
Σ^2 = { 00, 01, 10, 11 }
Σ^3 = { 000, 001, 010, 011, 100, 101, 110, 111 }
• Kleene Closure: The set of all strings over an alphabet Σ, including the empty string, is called
the Kleene closure of Σ and is denoted by Σ*. That is,
Σ* = Σ^0 ∪ Σ^1 ∪ Σ^2 ∪ Σ^3 ∪ ...
E.g. if A = {0}, then
A* = {0^n | n = 0, 1, 2, …}
• Positive Closure: The set of all strings over an alphabet Σ except the empty string is called
the positive closure of Σ and is denoted by Σ+. That is,
Σ+ = Σ^1 ∪ Σ^2 ∪ Σ^3 ∪ ...
• Language: A language L over an alphabet Σ is a subset of all the strings that can be formed
from Σ; i.e. a language is a subset of the Kleene closure of Σ: L ⊆ Σ*.
(A set of strings chosen from Σ* defines a language.) For example:
i. The set of all strings over Σ = {0, 1} with an equal number of 0’s and 1’s: L = {ε,
01, 10, 0011, 0101, 1001, …}
ii. ∅ is the empty language, a language containing no strings; it is a language
over any alphabet.
iii. {ε} is the language consisting of only the empty string. Note that ∅ ≠ {ε}: the
former contains no strings, the latter contains one.
iv. The set of binary numbers whose value is a prime: L = {10, 11, 101, 111,
1011, …}
• Concatenation of Strings
Let x and y be strings. Then xy denotes the concatenation of x and y, i.e. the string formed by
making a copy of x and following it by a copy of y. More precisely, if x is the string of i symbols
x = a1a2a3…ai and y is the string of j symbols y = b1b2b3…bj, then xy is the string of i + j
symbols xy = a1a2a3…aib1b2b3…bj. For example, if x = 000 and y = 111, then xy = 000111 and
yx = 111000.
Note: ‘ε’ is identity for concatenation; i.e. for any w, εw = wε = w.

• Suffix of a string
A string s is called a suffix of a string w if it is obtained by removing 0 or more
leading symbols of w. For example, if w = abcd, then s = bcd is a suffix of w. A suffix s is a
proper suffix if s ≠ w.
• Prefix of a string
A string s is called a prefix of a string w if it is obtained by removing 0 or more trailing symbols
of w. For example, if w = abcd, then s = abc is a prefix of w. A prefix s is a proper prefix
if s ≠ w.
• Substring
A string s is called a substring of a string w if it is obtained by removing 0 or more leading
symbols and 0 or more trailing symbols of w. It is a proper substring of w if s ≠ w.
If s is a string, then Substr(s, i, j) is the substring of s beginning at the ith position and
ending at the jth position, both inclusive.
• Problem
A problem is the question of deciding whether a given string is a member of some particular
language. In other words, if Σ is an alphabet and L is a language over Σ, then the problem is:
given a string w in Σ*, decide whether or not w is in L.
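The string and language definitions above can be sketched in a few lines of code. The helper names below (power, closure_up_to, prefixes, suffixes, substr, in_equal_zeros_ones) are hypothetical, introduced only for this illustration. Since Σ* is infinite, the sketch enumerates only strings up to a chosen length, and it assumes that positions in Substr are counted from 1.

```python
from itertools import product


def power(sigma, k):
    """Sigma^k: all strings of length exactly k over the alphabet sigma."""
    return {"".join(p) for p in product(sorted(sigma), repeat=k)}


def closure_up_to(sigma, max_len, include_empty=True):
    """Finite slice of Sigma* (or of Sigma+): all strings of length <= max_len."""
    result = set()
    for k in range(0 if include_empty else 1, max_len + 1):
        result |= power(sigma, k)
    return result


def prefixes(w):
    """All prefixes of w: remove 0 or more trailing symbols."""
    return {w[:i] for i in range(len(w) + 1)}


def suffixes(w):
    """All suffixes of w: remove 0 or more leading symbols."""
    return {w[i:] for i in range(len(w) + 1)}


def substr(s, i, j):
    """Substring from the i-th to the j-th position (1-based, both inclusive)."""
    return s[i - 1:j]


def in_equal_zeros_ones(w):
    """Membership problem for L = strings over {0, 1} with equally many 0s and 1s."""
    return set(w) <= {"0", "1"} and w.count("0") == w.count("1")


sigma = {"0", "1"}
print(sorted(power(sigma, 2)))
print(sorted(prefixes("abcd"), key=len), sorted(suffixes("abcd"), key=len))
print(substr("abcd", 2, 3))
print(sorted(w for w in closure_up_to(sigma, 4) if in_equal_zeros_ones(w)))
```

The last line solves the membership problem for every string of length at most 4, which gives a finite view of an infinite language.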
Note: Read Chapter 1, from the textbook [ Hopcroft, Motwani and Ullman]
Formal Grammars and Languages
Grammar
A grammar is a set of rules that defines the structure of the strings in a language. It specifies
whether a string belongs to the language or not. Grammars are language generators. A language is
the collection of all the valid strings that are defined by the grammar.
Formally, a grammar G is a 4-tuple (N, T, P, S), where N is a finite set of non-terminals, T is a
finite set of terminals, P is a finite set of production rules and S ∈ N is the start symbol.
E.g. let grammar G = (N, T, P, S), where N = {S}, T = {a, b}, start symbol = S and P = { S →
aSb, S → ε }.

Chomsky Hierarchy of Grammars
Noam Chomsky gave a formal definition, or mathematical model, of grammars. There are
several classes of formal languages, each allowing more complex language specifications than the
one before it. The figure given below shows the Chomsky hierarchy; each class has a
corresponding class of automata that recognizes it.
Fig. Chomsky Hierarchy
The hierarchy has the following set of grammars:
1. Type - 0 grammar ( or Unrestricted Grammar)
2. Type - 1 grammar ( or Context Sensitive Grammar)
3. Type - 2 grammar ( or Context Free Grammar )
4. Type - 3 grammar ( or Regular Grammar)
1. Type - 0
This is the most general type of grammar. It is also known as an unrestricted or phrase structure
grammar. The grammar G = (N, T, P, S) is called an unrestricted grammar if its productions
are of the form u → v, where u ∈ (N ∪ T)* N (N ∪ T)* and v ∈ (N ∪ T)*.
2. Type - 1
The production rules are of the same form as in type - 0, but with |u| ≤ |v|. Such a grammar is
called a non-contracting (length non-decreasing) grammar. This type of grammar is also called a
Context Sensitive Grammar.
The context sensitive definition is as follows:
αAβ → αγβ, that is, A is replaced with γ in the context of α and β, where
α, β ∈ (N ∪ T)*, A ∈ N and γ ∈ (N ∪ T)+.

3. Type - 2 or Context Free Grammars
The production rules in this grammar are of the form A → α, where A ∈ N and α ∈ (N ∪ T)*.
This grammar is called a context free grammar.
4. Type - 3 or Regular Grammars
The production rules in this grammar are of the form A → aB and A → b | ε, where A, B ∈ N
and a, b ∈ T. This grammar is called a regular grammar.
Languages:
The language L defined by a grammar G is the set of all terminal strings derivable in the
grammar from the start symbol, i.e. L(G) = { w | w ∈ T*, S ⇒* w }, where ⇒* denotes derivation
in zero or more steps.
E.g. let grammar G = (N, T, P, S), where N = {S}, T = {a, b}, start symbol = S and P = { S →
aSb, S → ε }.
Then, S ⇒ aSb (rule 1)
⇒ aεb = ab (rule 2)
i.e. S ⇒* ab. Hence, ab ∈ L(G).
Similarly,
S ⇒ aSb (rule 1)
⇒ aaSbb (rule 1)
⇒ aaεbb = aabb (rule 2)
i.e. S ⇒* aabb. Hence, aabb ∈ L(G).
We see that L(G) = { a^n b^n | n ≥ 0 }.
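The derivations above can be replayed mechanically. The following sketch applies rule 1 (S → aSb) n times and then rule 2 (S → ε), recording every sentential form, and compares the resulting terminal string with a direct membership test for the language of n a's followed by n b's. The function names derive and in_language are assumptions of this sketch, and the code handles only this particular grammar, not grammars in general.

```python
def derive(n: int) -> list[str]:
    """Sentential forms obtained by applying S -> aSb n times, then S -> ε."""
    current = "S"
    forms = [current]
    for _ in range(n):
        current = current.replace("S", "aSb", 1)   # rule 1: S -> aSb
        forms.append(current)
    current = current.replace("S", "", 1)          # rule 2: S -> ε
    forms.append(current)
    return forms


def in_language(w: str) -> bool:
    """Direct membership test: is w equal to n a's followed by n b's?"""
    n = len(w) // 2
    return w == "a" * n + "b" * n


for n in range(4):
    steps = derive(n)
    print(" => ".join(s if s else "ε" for s in steps))
    assert in_language(steps[-1])                  # every derived string is in L(G)
```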
