2. Chpt1.Introduction to Data Structures and Algorithm
Analysis
1.1 Introduction
1.1.1 Need of Data Structure
1.1.2 Definitions - Data and information, Data type, Data
object, ADT, Data Structure
1.1.3 Types of Data Structures
1.2 Algorithm analysis
1.2.1 Space and time complexity, Graphical understanding
of the relation between different functions of n, examples of
linear loop, logarithmic loop, quadratic loop etc.
1.2.2 Best, Worst, Average case analysis, Asymptotic
notations (Big O, Omega Ω, Theta Θ),
Problems on time complexity calculation
3. Introduction
The term data structure (DS) describes the way data is stored, and the term
algorithm describes the way data is processed.
DS and algorithm are interrelated. Choosing a DS affects the kind of
algorithm we might use, and choosing an algorithm affects the DS we use.
To develop a program from an algorithm, we should select an appropriate DS
for that algorithm.
Therefore a program is represented as:
Program=algorithm+DS
Data Structure is represented as:
DS=Organized Data + allowed operations
In Simple terms DS can be defined as a representation of data along with
its associated operations.
DS is the way of organizing and storing data in a computer system so that it
can be used efficiently.
An algorithm is a step-by-step procedure that provides a solution to a given
problem.
The analysis of an algorithm is the process of finding its computational
complexity: the amount of time, storage, or other resources
needed to execute it.
4. Need of Data Structure
DS shows how data are stored in a computer so that
operations can be implemented efficiently.
DS is especially important when we have a large amount of
information/data.
It gives conceptual and concrete way to organize data for
efficient storage and manipulation.
With the help of DS data can be stored in such a manner
that the program can easily access it.
Various operations ( insertion, deletion, traversing,
searching, sorting etc.) can be easily performed as data is
well organized and accessible with help of data structure.
Effective use of principles of data structure increases
efficiency of algorithms to solve problems like searching,
sorting and so on.
5. Advantages of Data structure
A data structure allows easier processing of data.
It allows information to be stored on disk efficiently.
DS is necessary for designing an efficient algorithm.
Data can be maintained more easily by encouraging
a better design or implementation of real-life
problems.
DS provides a controlled way of storing data, since the data is
changed only through the allowed operations.
DS allows processing of data in a software system.
6. Definition
A computer is a data processing machine. Raw data is given as input
and operations are performed on this raw data. The transformed
data is called useful data. Hence we can say that Computer
Science is a study of data.
The following factors are to be considered while studying data.
Factor : Example
1) Machine that holds the data : 8086, Pentium etc.
2) Language required for describing data manipulation : C, C++, Pascal etc.
3) Functions that describe what kind of refined data can be produced from the raw data : Logic written in English-like words
4) Structure for representing data in the computer : Bits, bytes or words
7. Definition : Data
Data: Data refers to a value or a set of values. It is defined as raw facts and
figures before they have been processed.
Or: it is defined as known facts that can be recorded and that have implicit
meaning.
Data is collected from different sources. It is collected for different
purposes.
Data may consist of numbers, characters, symbols, pictures etc.
Data is either atomic or non-atomic (composite).
Atomic Data :
It is a single, non-decomposable entity. For example, the integer 3241 is
considered a single integer value. Although 3241 can be decomposed into
digits, the decomposed digits do not have the same characteristics as the
original integer.
Non-Atomic Data or Composite Data :
This is also called structured data. Composite data can be broken into
subfields that have meaning. For example, a date consists of day, month and year.
Composite data is also referred to as structured data because it is
implemented using a structure statement such as struct in C.
8. Definition : Information
When data is processed it becomes information. In
short, meaningful, logical and processed data is called
information.
Data is raw facts from which the required information
is produced. Information is an organized and
processed form of data.
9. Definition: Data Types
Data Types :
A data type describes the kind of information that can be processed
by the computer and that is supported by the programming language.
It is also defined as the kind of data that a variable may hold
in a programming language.
In C the standard data types are:
1) Built-in - char, int, float, double
2) void
3) User defined - typedef, enum, structure, union
4) Derived - array, function, pointer
10. Built-in data types :
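The data types provided by the programming language are known
as built-in data types. The sizes below are typical of 16-bit C compilers;
the C language fixes only the size of char at 1 byte, and on most current
compilers int is 4 bytes.
   Type            Bytes   Example
1) char            1       'A', 'Z', 's'
2) unsigned char   1
3) int             2
4) unsigned int    2       324
5) signed int      2       -34
6) short int       2
7) long int        4
8) float           4       1.2
9) double          8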
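Since the byte counts depend on the compiler and machine, they are best checked with sizeof. A minimal sketch, assuming any standard C compiler; the names type_sizes and builtin_sizes are illustrative and do not come from the slides:

```c
#include <stddef.h>

/* Hypothetical helper type: holds the size in bytes of each built-in
   type on the compiler actually in use. */
struct type_sizes {
    size_t c, s, i, l, f, d;
};

struct type_sizes builtin_sizes(void)
{
    struct type_sizes t;
    t.c = sizeof(char);       /* always 1 by definition */
    t.s = sizeof(short int);
    t.i = sizeof(int);        /* 2 on old 16-bit compilers, commonly 4 today */
    t.l = sizeof(long int);
    t.f = sizeof(float);
    t.d = sizeof(double);
    return t;
}
```

Printing the fields with printf("%zu", ...) shows the values for a given platform.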
11. Void
void :
The void type specifies an empty set of values. It is
used as the type returned by functions that generate
no value.
12. User- defined data types
User- defined data types
Those data types which are defined by the user
as per his/her requirement are called as user
defined data types.
The typedef keyword allows the user to define a new type name in C.
Eg. typedef int count;
count is the new user-defined type name and can be used
in place of int.
If we write the declaration count min, max; then
min and max are declared as integers.
It must be noted that a typedef declaration does not
create a new type; it just adds a new name for an
existing type.
13. Enumeration
Enumeration (enum)
Enumeration is a way of defining named integer constants.
Example- enum colors {black, green, red, blue};
In this example, colors is the name of the enumeration and
black, green etc. are the members of the enumeration,
which are called enumeration constants.
Enumeration constants are automatically assigned
integer values, starting with 0 for the first member
and increasing by one for each successive member.
Enumeration data types help make program
listings more readable and also help
reduce programming errors.
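The typedef and enum declarations described above can be sketched in use as follows. The helper functions span and color_name are hypothetical names added only for illustration:

```c
typedef int count;                        /* a new name for int, not a new type */
enum colors { black, green, red, blue };  /* black=0, green=1, red=2, blue=3 */

/* Hypothetical helper: number of values from min to max inclusive. */
count span(count min, count max)
{
    return max - min + 1;
}

/* Hypothetical helper: maps an enumeration constant to its name. */
const char *color_name(enum colors c)
{
    switch (c) {
    case black: return "black";
    case green: return "green";
    case red:   return "red";
    case blue:  return "blue";
    }
    return "unknown";
}
```

Because count is only another name for int, a count value can be passed wherever an int is expected.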
14. Structure and Union
Structure
A structure is a tool for packing together logically related data items of different types
(heterogeneous).
Eg.
struct employee {
    char empname[20];
    int empcode;
    float basic;
    char address[20];
};
Once the structure is defined, we can declare structure variables and access their
members by using the dot (.) operator or, through a pointer, the arrow (->) operator.
Unions
Unions are very similar to structures in that both are used to group a number of
different elements. Unlike structures, the members of a union share the same
storage area, even though the individual members may differ in type.
Eg. union id {
    char color[12];
    int size;
} car;
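A short sketch of the difference between the two access operators and of the shared storage of a union. The member name empcode and the helper functions set_empcode, get_empcode and union_members_overlap are assumptions made for this illustration:

```c
#include <stddef.h>

struct employee {
    char  empname[20];
    int   empcode;
    float basic;
    char  address[20];
};

union id {
    char color[12];
    int  size;
};

/* Member access through a pointer uses the arrow operator. */
void set_empcode(struct employee *e, int code)
{
    e->empcode = code;
}

/* Member access on a structure variable uses the dot operator. */
int get_empcode(struct employee e)
{
    return e.empcode;
}

/* All members of a union start at the same address, so both offsets are 0. */
int union_members_overlap(void)
{
    return offsetof(union id, color) == 0 && offsetof(union id, size) == 0;
}
```

The union needs only as much storage as its largest member, while the structure needs room for all of its members.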
15. Derived Data Types
Derived data types are data types that are built from
primitive data types.
Arrays
An array assigns a single name to a whole group of elements of the same type (homogeneous).
Eg. int a[10]; defines an array a of 10 integers.
Pointers
A pointer is a variable that holds the address of another
variable.
Pointers are efficient for handling complex data
structures and data tables, and in many cases they improve execution speed.
There are two special pointer operators, * and &.
Eg. int *p; means p is a pointer to an integer.
We can initialize the pointer after declaring the pointer variable,
using an assignment statement such as
p=&a; which causes p to point to a.
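The two pointer operators can be seen together in a small sketch. The function names swap and read_through_pointer are illustrative only:

```c
/* Exchanges the two integers that x and y point to. */
void swap(int *x, int *y)
{
    int tmp = *x;   /* * reads the value at the address held in x */
    *x = *y;
    *y = tmp;
}

/* Returns the value of a, read through a pointer. */
int read_through_pointer(int a)
{
    int *p;     /* p is a pointer to an integer */
    p = &a;     /* & takes the address of a, so p now points to a */
    return *p;  /* dereferencing p yields the value stored in a */
}
```

A call such as swap(&m, &n) passes the addresses of m and n, which is what lets the function change the caller's variables.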
16. Data Object
A data object represents a container for data values – a place where data
values may be stored and later retrieved.
A data object is characterized by a set of attributes, one of the most
important of which is its data type.
Data object refers to the set of elements say D, which may be finite or
infinite.
Eg. The data object “alphabet” is defined as D={A,B,……,Z}
The data object integer is defined as D={…..-2,-1,0,1,2…..}
A data object is a runtime instance of a data structure.
Some of the data objects that exist during program execution are
programmer-defined, such as variables, constants, arrays, files etc.
System-defined data objects are ordinarily created during program execution
without explicit specification by the programmer.
Eg. The figure shows a simple variable data object.
X: 00000000 00001001
A data object X containing the data value 9.
A data object is elementary if it contains a single value. If a data object
contains a number of heterogeneous (structure) or homogeneous (array)
elements, it is referred to as a "data structure".
17. Abstract data types (ADT) :
Each programming language provides a set of built-in data types. However, these
data types are not enough, since present-day programming problems are more
complex.
Thus there is a need for structured data types, which may be a combination or
collection of basic data types together with a set of properties and legal operations
that the programmer may perform on them.
A programmer's own data type of this kind is termed an Abstract Data Type (ADT).
ADT=Type + Function Names + Behaviour of each Function
With an ADT, users are not concerned with how the task is done but rather with
what it can do.
In other words, the ADT consists of a set of definitions that allow programmers to
use the functions while hiding the implementation.
The generalization of operations with hidden implementation is known as
abstraction. We abstract the essence of the process and leave the implementation
details hidden.
To understand the concept of abstraction, consider the example of a list.
At least 3 data structures can support a list: we can use an array, a linked list or a file.
If we place our list in an ADT, users should not need to be aware of the structure we use. As
long as they can insert and retrieve data, it should make no difference how we store
the data. All the user needs to know is how to store and retrieve data in the
list. Hence we know what the data type can do; how it is done is hidden.
18. Abstract data types (ADT) :
An ADT is a data declaration packaged together with the
operations that are meaningful for the data type. In other words,
we encapsulate the data and the operations on the data, and we hide
the implementation from the user.
An ADT includes:
Declaration of data.
A structure defined by a set of rules that put components (data)
together.
A set of operations (functions) and encapsulation of data.
An ADT is a mathematical model with a set of operations defined
on that model.
Eg- consider a rational number, which can be expressed as the
quotient of two integers. The operations on rational numbers are
addition, multiplication and testing for equality.
We can say that rational numbers together with the addition,
multiplication and equality operations are an example of an ADT of rational
numbers.
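One possible way to express the rational number ADT in C is sketched below. The type name rational and the functions rat_make, rat_add, rat_mul and rat_equal are assumptions for illustration; the sketch also assumes a non-zero denominator and values small enough not to overflow long:

```c
#include <stdlib.h>

typedef struct {
    long num;   /* numerator */
    long den;   /* denominator, kept positive */
} rational;

/* Greatest common divisor of |a| and |b|. */
static long gcd(long a, long b)
{
    a = labs(a);
    b = labs(b);
    while (b != 0) {
        long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Builds a rational in lowest terms; assumes den != 0. */
rational rat_make(long num, long den)
{
    long g = gcd(num, den);
    if (den < 0) {
        num = -num;
        den = -den;
    }
    rational r = { num / g, den / g };
    return r;
}

rational rat_add(rational a, rational b)
{
    return rat_make(a.num * b.den + b.num * a.den, a.den * b.den);
}

rational rat_mul(rational a, rational b)
{
    return rat_make(a.num * b.num, a.den * b.den);
}

/* Two rationals are equal when their cross products match. */
int rat_equal(rational a, rational b)
{
    return a.num * b.den == b.num * a.den;
}
```

A user of these functions needs only their names and behaviour; whether the value is stored as two long fields or in some other way stays hidden behind the operations.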
19. Abstract data types (ADT)
Each ADT has two properties :
Generalization
ADTs are generalizations of primitive data types. Each ADT
can be applied further to a specific data object of a specific data
type.
For e.g. the ADT of array can be further applied to an integer array,
string array or character array.
Encapsulation
The ADT encapsulates a data type such that the definition of the
type and all operations on that type are put together as one
section of the program.
Eg. the language C++ provides the class declaration for the
purpose of defining an ADT from which objects are created.
20. Example of ADT in C
In C, the data type named FILE is an abstract
model of a file.
We can perform operations on it such as fopen to open a
file, fclose to close a file, fgetc to read a character from a file, and
fputc to write a character to a file.
The definition of FILE is in stdio.h, the header for the file input-output
routines in the C runtime library. The library enables us to view a file as a
stream of bytes and
to perform various operations on this stream by calling
I/O routines.
So the FILE definition, together with the functions that operate on it,
becomes a new data type. We do not need to know the C structure
internally; instead, we rely on functions and macros that
hide the inner details of the file. This is called information hiding.
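A minimal sketch of using FILE purely through its operations. It uses the standard tmpfile() function in place of fopen so that no file name has to be assumed; the function name roundtrip_char is hypothetical:

```c
#include <stdio.h>

/* Writes one character to a temporary file and reads it back.
   Returns the character read, or EOF on any failure. */
int roundtrip_char(int ch)
{
    FILE *fp = tmpfile();   /* opaque handle; we never look inside FILE */
    if (fp == NULL)
        return EOF;
    if (fputc(ch, fp) == EOF) {
        fclose(fp);
        return EOF;
    }
    rewind(fp);             /* move back to the start of the stream */
    int got = fgetc(fp);
    fclose(fp);
    return got;
}
```

Nothing in the function depends on how FILE is laid out in memory, which is the point of treating it as an ADT.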
21. Advantages of ADT
Encapsulation
Information hiding
Implementation details hiding
The ADT is a useful guideline to implement and a
useful tool to programmers who wish to use the data
type correctly.
22. Abstract Data Types (ADT) :
A data structure is a set of domains D, a designated domain d ∈ D, a
set of functions F and a set of axioms A.
The triple (D,F,A) denotes the data structure d, and it is usually
written as d=(D,F,A), where-
- Domain (D):- denotes the data objects
- Function(F):- denotes the set of operation that can be carried
out on the data objects.
- Axioms(A):- Denotes the properties and rules of the operations.
i.e. semantics of the operations.
23. Data structure
A data structure describes a set of data objects and the set of operations
which can be performed on elements of those data objects. It describes the
operations and shows how they work.
Eg: On integers, operations like +,-,*,/,%,>,< etc. can be performed.
The data object integers together with the description of how +,-,*,/ etc. work makes a D.S.
The D.S. "natural numbers", abbreviated as natno={0,1,2,3,...}, is defined with 3
operations-
Test for zero, Addition, Equality
Definition -
A data structure is a set of domains D, a designated domain d ∈ D, a set of functions F
and a set of axioms A.
The triple (D,F,A) denotes the data structure d, and it is usually written as
d=(D,F,A), where-
Domain (D):- denotes the data objects.
Function (F):- denotes the set of operations that can be carried out on the data objects.
Axioms (A):- denotes the properties and rules of the operations, i.e. the semantics of the
operations.
This definition is also called an Abstract Data Type (ADT). The implementation details of
an ADT are hidden; that is why it is called abstract.
To represent the mathematical model underlying an ADT, we use a DS, which is a
collection of variables, possibly of different data types, connected in different ways.
24. Advantages of data structure
1. To manipulate and access the information.
2. Various operations can be performed on structured data.
3. To improve the program efficiency.
4. To improve the memory utilization or storage.
5. Related data can be stored together and in the required format
25. Application area
1. Compiler Design
2. Operating System
3. Database Management Systems
4. Statistical Analysis
5. Numerical analysis
6. Graphics
7. Artificial Intelligence
8. Simulation
9. Games
26. Types of Data structure
1)Primitive and non-primitive data structure.
2)Linear and Non-linear data structure.
3)Static and dynamic D.S
4)Persistent and ephemeral D.S
27. Types of Data structure
Primitive (Atomic) and non-primitive(Non-Atomic) data
structure.
Primitive data structure: -
It is a set of atomic or primitive elements which do not involve any
other elements as subparts (non-decomposable).
e.g.- The D.S. defined for basic built-in data types like integer, float, character etc.
Non-Primitive D.S.
It is a set of derived elements such as arrays, files, structures etc.
An array D.S. in C consists of a set of elements of the same type. A file D.S. in C
may consist of a set of elements of different types. Hence arrays, structures and files are
user-defined data types and non-primitive D.S.
Primitive        Non-Primitive
- Integer        - array
- Float          - files
- Character      - structure
- Pointer
28. Types of Data structure
Linear and Non-linear data structure
Linear D.S
In this, all the elements form a sequence or maintain a linear ordering.
Every data element except the last has a unique successor (front), and every
element except the first has a unique predecessor (back).
Non-linear data structure
It is used to represent data containing hierarchical or network
relationships between the elements. In this data structure a data element may
have more than one predecessor as well as more than one successor. In this type of data
structure the elements do not form a sequence.
Linear Non-linear
- Arrays - trees
- Linked lists - tables
- Stacks - sets
- Queues - graphs
29. Types of Data structure
Static and Dynamic Data Structure
If a variable is defined inside a function, its value is kept only while the
function executes. We say that the variable is local to the function. When
the function execution is complete, the storage allocated for that variable
is freed. A variable has either static or dynamic lifetime.
In static lifetime, memory is allocated at the beginning of program execution
and freed only after the program terminates, e.g. array.
In dynamic lifetime, memory is allocated for the variable dynamically,
meaning it is created and destroyed during the execution of the program.
In C the library functions malloc(), calloc() and realloc() are used for
dynamic allocation, and free() releases the memory.
A DS which is created at runtime is called a dynamic D.S., e.g. linked
lists, trees, graphs.
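A minimal sketch of a dynamic data structure: a singly linked list whose nodes are created with malloc() and released with free() while the program runs. The names node, push_front, list_length and free_list are illustrative:

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Allocates a node at run time and links it in front of head.
   Returns the new head, or the old head unchanged if malloc fails. */
struct node *push_front(struct node *head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return head;
    n->value = value;
    n->next = head;
    return n;
}

/* Counts the nodes by following the next pointers. */
int list_length(const struct node *head)
{
    int len = 0;
    for (; head != NULL; head = head->next)
        len++;
    return len;
}

/* Releases every node of the list. */
void free_list(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

Unlike a fixed-size array, the list grows by one node per push_front call, so its size is decided only during execution.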
30. Types of Data structure
Persistent and Ephemeral D.S
A D.S. that supports operations on the most recent
version as well as on previous versions is called a persistent
D.S.
A D.S. that supports operations only on the most recent
version is called an ephemeral D.S.
The usual D.S. in languages like C, C++, Java, Fortran and Pascal are
ephemeral.
31. Algorithm- Definition
In simple words algorithm is a method that can be used by a
computer for the solution of a problem.
Definition
An algorithm is a finite set of instructions that, if followed,
accomplishes a particular task. It is a method that can be used
by a computer for the solution of a problem.
An algorithm is independent of the programming language.
Or
An algorithm is a sequence of instructions such that:-
- The sequence is finite.
- Each instruction is executed only a finite number of times.
- Each instruction is unambiguous.
32. Algorithm- Example
Algorithm to find whether the given number is odd or even
step 1: start
step 2: read n
step 3: if (n mod 2) == 0 then
print "n is an even number" otherwise
print "n is an odd number"
step 4: stop.
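Step 3 of the algorithm above might be written in C as follows, where the % operator plays the role of mod. The function name is_even is illustrative:

```c
/* Returns 1 if n is even and 0 if n is odd (step 3 of the algorithm). */
int is_even(int n)
{
    return n % 2 == 0;
}
```

Comparing the remainder with 0 rather than with 1 also gives the right answer for negative numbers, since in C a negative odd number has remainder -1.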
33. Algorithm- Characteristics / Criteria
Characteristics of an algorithm / criteria of algorithm
All algorithms must satisfy the following criteria.
Input-
An algorithm takes zero or more inputs: quantities that are given to it initially before the
algorithm begins, or dynamically as the algorithm runs.
Output-
An algorithm generates one or more outputs: quantities that have a specified relation to the
inputs. At least one quantity is produced.
Definiteness-
Each instruction must be clear and unambiguous, i.e. each step must be precisely defined.
e.g.- "add 5 to x or y" is not a clear statement.
Effectiveness-
Every instruction must be very basic, so that it can be carried out in principle by a person using
only paper and pencil.
e.g.- "add num1 to num2" is an effective operation if num1 and num2 are integers, but if they are real
numbers with infinite decimal expansions, the instruction is not effective.
Finiteness-
An algorithm must halt. If we trace out the instructions of an algorithm, the algorithm must
terminate after a finite number of steps in all cases.
e.g.- An operating system is an exception. It is made to run indefinitely until the power is switched off.
34. Algorithm- Characteristics / Criteria example
To find the largest of 3 numbers.
step 1: start
step 2: Read 3 positive numbers, say x, y, z.
step 3: if (x>=y) and (x>=z)
then "x is the largest number", go to step 5
step 4: if (y>=z) and (y>=x)
then "y is the largest number"
else
"z is the largest number"
Go to step 5
step 5: stop.
We can see that this algorithm satisfies all the above criteria.
Input :-
Here the three positive numbers x, y and z are the input.
Output:-
One output, the largest number.
Definiteness:-
All steps are clear and unambiguous.
Finiteness:-
The algorithm terminates at step 5. After step 3 or step 4, whichever condition is satisfied,
the algorithm goes to step 5 and terminates.
Effectiveness:-
The comparison operators are sufficiently basic, so the steps are effective.
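The steps for finding the largest of three numbers can be sketched in C as below. The function name largest_of_three is illustrative, and the sketch compares with >= so that inputs containing equal values are also handled:

```c
/* Returns the largest of x, y and z. */
int largest_of_three(int x, int y, int z)
{
    if (x >= y && x >= z)   /* step 3: x is at least as large as both */
        return x;
    if (y >= z)             /* step 4: x is not the largest, so compare y and z */
        return y;
    return z;
}
```

Each call performs at most three comparisons and then returns, which mirrors the finiteness and effectiveness criteria.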
35. Algorithm Analysis
There can be several organizations of data and/or
algorithms for a given problem. Comparing one
algorithm with another is called analysis of algorithms.
Analysis of algorithm has several purposes.
1) It helps one choose among different solution to
problems.
2)We can predict the performance of a program before we
take the time to write code
3) By analyzing we gain a better understanding of where
the fast and slow parts are and what to work on or work
around in order to speed it up
36. Algorithm Analysis - Criteria
There are many criteria upon which we can analyze a
program such as:-
1) Does it perform the desired task?
2) Does it work correctly according to the original
specification of that task?
3) Is there documentation that describes how to use it and
how it works?
4) Are procedures created in such a way that they perform
logical sub-function? i.e. Is program modular?
5) Is the code readable?
6) How is performance of program?
37. Algorithm Analysis – Space Complexity
Definition-
The space complexity of an algorithm is the
amount of memory it needs to run to
completion.
We can use the space complexity to estimate the size
of the largest problem that a program can solve.
38. Algorithm Analysis – Components of space
complexity:-
Components of space complexity:-
The space needed by an algorithm is the sum of the following two
components-:
1) Fixed part 2) Variable part
1) Fixed part
It is the memory space required by that part of the program which is
independent of the input and output characteristics. It includes the space required for
instructions (code), simple and fixed-size variables, constants etc.
2) Variable part:-
It is the memory space required by that part of the program which depends
on a particular problem instance.
It includes the space required for referenced variables whose size depends on the instance
characteristics, and for the stack in case of recursion, which also depends on the
instance characteristics.
In short, it includes dynamically allocated space and the recursion stack space.
The space requirement of an algorithm p is denoted by S(p)
S(p)=c+sp
Where c = a constant, which is the fixed part of memory.
sp = the variable part, which depends on the instance characteristics.
39. Space Complexity : Example
float area(float r)
{
return(3.142*r*r);
}
The space needed is independent of the i/p and o/p
characteristics, so sp=0.
It uses only the value of r, so c=1.
S(p)=c+sp = 1+0 = 1
40. Space Complexity : Example
int prod(int a,int b,int c)
{
return(a*b*c);
}
This algorithm uses only the values of a, b and c. Assume
that each variable needs one word. The space required
is independent of i/p and o/p.
sp=0 and c=3
S(prod)=c+sp = 3+0 = 3
41. Space Complexity : Example
float sum(float a[],int n)
{
float s=0.0;
int i;
for (i=0;i<n;i++)
s=s+a[i];
return (s);
}
This algorithm is characterized by n which is number of
elements to be added. Assume that space needed by n, i and s
is 1 word each.
The space needed by a is n words.
S(sum) >= (n+3)
42. Space Complexity : Example
float recsum(float a[],int n)
{
    if(n<=0)
        return 0.0;
    else
        return(recsum(a,n-1)+a[n-1]);
}
S(recsum) >= 3(n+1)
(words of memory per call * number of calls)
The space needed by this algorithm depends on the recursion depth. For
each function call 3 words of memory are needed, i.e. 1 word
each for the base address of a, the value of n and the return address of the function.
The function is called (n+1) times.
43. Time Complexity
The amount of time taken by a program for execution
is the running time of the program.
The total time taken by an algorithm or program is
calculated as the sum of the time taken by each
executable statement in the algorithm or program.
The time required by each statement depends on-
- The time required for executing it once.
- The number of times the statement is executed.
The product of the above two gives the time required for that
particular statement.
The total time is denoted by t(p).
44. Time Complexity : Example
int sum(int a[],int n)      // no. of times executed
{
    int i,s=0;              // 1
    for(i=0;i<n;i++)        // n+1
        s=s+a[i];           // n
    return s;               // 1
}
t(sum) = 1+(n+1)+n+1 = 2n+3
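The count of 2n+3 can be checked by instrumenting the function with a step counter. This is only a sketch: the name sum_steps and the out_sum parameter are hypothetical, and one step is charged for the initialisation, for each evaluation of the loop test, for each execution of the loop body and for the return:

```c
/* Returns the number of steps executed while summing a[0..n-1];
   the sum itself is stored in *out_sum. */
long sum_steps(const int a[], int n, int *out_sum)
{
    long steps = 0;
    int i, s = 0;
    steps++;                 /* int i,s=0;  executed once */
    for (i = 0; ; i++) {
        steps++;             /* test i<n:   executed n+1 times */
        if (!(i < n))
            break;
        s = s + a[i];
        steps++;             /* loop body:  executed n times */
    }
    *out_sum = s;
    steps++;                 /* return s;   executed once */
    return steps;
}
```

For an array of 5 elements the function reports 2*5+3 = 13 steps, and for an empty array it reports 3.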
45. Time Complexity : Example
// C99 variable-length array parameters: a, b and c are m x n matrices
void addition(int m, int n, int a[m][n], int b[m][n], int c[m][n])
{
    int i,j;
    for(i=0;i<m;i++)                    // m+1
        for(j=0;j<n;j++)                // m(n+1)
            c[i][j]=a[i][j]+b[i][j];    // mn
}
t(addition) = (m+1)+m(n+1)+mn
= m+1+mn+m+mn
= 2mn+2m+1
47. Measures of times:-
There are different cases of time complexity which can be
analyzed for an algorithm.
1) Best case complexity
The best case is a measure of the minimum time that the
algorithm will require for an input of size n.
2) Average case complexity:-
The amount of time the algorithm takes on average over the possible
inputs of size n.
3) Worst case complexity:-
The amount of time the algorithm takes on the worst
possible input, i.e. the maximum time required for an input of size n.
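The three cases can be illustrated with linear search, an example chosen here for illustration and not taken from the slides. The sketch counts key comparisons; the function name linear_search and the comparisons parameter are assumptions:

```c
/* Searches a[0..n-1] for key. Returns its index, or -1 if absent.
   *comparisons receives the number of elements examined. */
int linear_search(const int a[], int n, int key, int *comparisons)
{
    *comparisons = 0;
    for (int i = 0; i < n; i++) {
        (*comparisons)++;
        if (a[i] == key)
            return i;
    }
    return -1;
}
```

The best case is 1 comparison (key in the first position) and the worst case is n comparisons (key in the last position or absent). If the key is equally likely to be at any position, the average is about (n+1)/2 comparisons.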
49. Asymptotic notation(Big-Oh Notation - O)
Big O notation:-
This notation is used to denote an upper bound on an
algorithm's time complexity.
It indicates the maximum rate at which the time required by an
algorithm can grow with the input size.
It is most commonly used to describe the worst case of an algorithm's time
complexity.
Definition : f(n)=O(g(n)) [read as f of n is big
O of g of n] iff there exist positive constants c and n0 such
that f(n)<=c*g(n) for all n, n>=n0
50. Asymptotic notation(Big-Oh Notation - O)
In the above graph, after a particular input value n0, c*g(n) is always greater
than or equal to f(n), which indicates the algorithm's upper bound.
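As a worked example, take f(n) = 2n+3 and g(n) = n. Since 2n+3 <= 3n exactly when n >= 3, the constants c = 3 and n0 = 3 satisfy the definition, so 2n+3 = O(n). The sketch below checks the inequality numerically over a finite range; it is not a proof, and the function name bound_holds is hypothetical:

```c
/* Returns 1 if 2n+3 <= c*n for every n in [n0, n_max], otherwise 0. */
int bound_holds(long c, long n0, long n_max)
{
    for (long n = n0; n <= n_max; n++) {
        if (2 * n + 3 > c * n)
            return 0;
    }
    return 1;
}
```

With c = 3 the bound fails at n = 1 and n = 2 but holds from n0 = 3 onward, which is why the definition only requires the inequality for n >= n0. With c = 2 it never holds, so the choice of constant matters.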
51. Most commonly used Big O descriptions are
1) Constant f(n)=O(1)
The running time of the program is constant (i.e. it is
independent of the size of the input), e.g. find the 5th element in an array.
2) Logarithmic f(n)=O(log n)
Typically achieved by dividing the problem into smaller segments and only
looking at one input element in each segment, e.g. binary search.
3) Linear f(n)=O(n)
Typically achieved by examining each element in the input once,
e.g. find the minimum element.
4) Linear log f(n)=O(n log n)
Typically achieved by dividing the problem into subproblems, solving the
subproblems independently and then combining the results,
e.g. merge sort.
5) Quadratic f(n)=O(n^2)
Typically achieved by examining all pairs of data elements, e.g. selection
sort.
6) Cubic f(n)=O(n^3)
Typically achieved by nesting three loops over the input, e.g. straightforward
multiplication of two n x n matrices.
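The linear, logarithmic and quadratic cases correspond to three simple loop shapes. The sketch below returns how many times each loop body runs; the function names are illustrative:

```c
/* Linear loop: the body runs n times. */
long linear_loop(long n)
{
    long count = 0;
    for (long i = 0; i < n; i++)
        count++;
    return count;
}

/* Logarithmic loop: i doubles each time, so the body runs
   about log2(n) times. */
long logarithmic_loop(long n)
{
    long count = 0;
    for (long i = 1; i < n; i *= 2)
        count++;
    return count;
}

/* Quadratic loop: two nested linear loops, so the body runs n*n times. */
long quadratic_loop(long n)
{
    long count = 0;
    for (long i = 0; i < n; i++)
        for (long j = 0; j < n; j++)
            count++;
    return count;
}
```

For n = 16 the three functions return 16, 4 and 256 respectively, which shows how differently the counts grow for the same n.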
52. Asymptotic notation(Big Omega Notation Ω)
This notation is used to denote a lower bound on an
algorithm's time complexity.
It indicates the minimum rate at which the time required by an
algorithm grows with the input size.
It is most commonly used to describe the best case of an algorithm's time complexity.
Definition-
f(n)=Ω(g(n)) [read as f of n is big
Omega of g of n] iff there exist positive constants c and
n0 such that -
f(n)>=c*g(n) for all n, n>=n0
53. Asymptotic notation(Big Omega Notation Ω)
In the above graph, after a particular input value n0, c*g(n) is always less than or equal to f(n),
which indicates the algorithm's lower bound.
54. Asymptotic notation(Big theta)
It is the formal method of expressing a
tight bound on an algorithm's running time.
g(n) bounds f(n) both from above and from below,
so f(n) grows at the same rate as g(n).
Definition:-
f(n)=Θ(g(n)) iff there exist
positive constants c1, c2 and n0 such that
c1*g(n)<=f(n)<=c2*g(n) for all n, n>=n0
55. Asymptotic notation(Big theta)
In the above graph, after a particular input value n0, c1*g(n) is always less than or equal to f(n)
and c2*g(n) is always greater than or equal to f(n), which indicates the algorithm's tight bound.
56. Common Asymptotic Notations
Following is a list of some common asymptotic notations −
constant      Ο(1)
logarithmic   Ο(log n)
linear        Ο(n)
n log n       Ο(n log n)
quadratic     Ο(n^2)
cubic         Ο(n^3)
polynomial    n^Ο(1)
exponential   2^Ο(n)