3. Data are values or sets of values.
A data item refers to a single unit of values.
Data items are of two kinds:
Group item: a data item that can be subdivided into sub-items.
Ex: Name: First Name, Middle Initial and Last Name
Elementary item: a data item that cannot be subdivided into sub-items.
Ex: a PAN card number or bank passbook number is treated as a single item.
Collections of data are frequently organized into a hierarchy of fields, records and files.
4. Entity:
Something that has certain attributes or properties which may be assigned values.
Values may be numeric or non-numeric.
Ex: an employee of an organization
Attributes:  Name   Age   Sex   Employee Code
Values:      John   33    M     13472
5. Entities with similar attributes (e.g. all employees of an organization) form an entity set.
Each attribute of an entity set has a range of values (the set of possible values that could be assigned to that attribute).
Information: data with given attributes, or processed data.
Information = Instruction + Data
6. A field is a single elementary unit of information representing an attribute of an entity.
A file is the collection of records of the entities in a given entity set.
7. A record is the collection of field values of a given entity.
Based on length, records can be classified as:
Fixed length: all records contain the same data items, with the same amount of space assigned to each data item.
Variable length: records may differ in length, usually between a minimum and a maximum length.
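A fixed-length record maps naturally onto a C struct, in which every field has a fixed amount of space. The sketch below is illustrative only: the Employee type, the 20-byte name field and the employee_make helper are assumptions built around the employee example given earlier (Name, Age, Sex, Employee Code).

```c
#include <assert.h>
#include <string.h>

#define NAME_LEN 20   /* assumed field width, not taken from the text */

/* One fixed-length record: every Employee occupies sizeof(Employee) bytes. */
typedef struct {
    char name[NAME_LEN];  /* field: Name (truncated if too long) */
    int  age;             /* field: Age */
    char sex;             /* field: Sex */
    long code;            /* field: Employee Code */
} Employee;

/* Builds a record from its field values (hypothetical helper). */
Employee employee_make(const char *name, int age, char sex, long code)
{
    Employee e;
    assert(name != NULL);
    memset(&e, 0, sizeof e);              /* zero every field first */
    strncpy(e.name, name, NAME_LEN - 1);  /* last byte stays '\0' */
    e.age = age;
    e.sex = sex;
    e.code = code;
    return e;
}
```

An array such as Employee staff[100]; would then play the role of a file: a collection of records for one entity set.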
8. Data Types
A data type refers to the kind of data that may appear in a computation.
Ex: in C
int, float, char, double, long double, etc.
9. An organized collection of data is known as a data structure.
Data structure = Organized data + Operations
"Data Structures" deals with the study of how data is organized in memory, how efficiently it can be retrieved and manipulated, and the possible ways in which different data items are logically related.
10. Application of Data Structure
Data structures are necessary for the following reasons:
They provide functions that can be used to retrieve individual data elements.
They make it possible to express the relationships between the data elements that are relevant to the solution of the problem.
They help to describe the operations that must be performed on logically related data elements, such as creating, displaying, inserting, deleting and retrieving.
They leave the programmer free to choose whichever programming language best suits a particular problem.
They describe the physical and logical relationships between data items, and provide a mode of access to each element in the data structure.
12. Primitive data structures
These are data structures that are operated upon directly by machine-level instructions.
The storage structure (memory representation) of these data structures varies from one machine to another.
Examples: integer, real, character, Boolean and pointer.
13. Operations on primitive data structures
The operations that can be performed on primitive data structures are:
Create: The create operation creates a new data structure by reserving memory space for the program elements. It can be carried out at compile time or at run time. For example, int x;
Destroy: The destroy operation removes a data structure from memory. When program execution ends, the data structure is automatically destroyed and the memory allocated to it is de-allocated.
14. Select: The select operation is used by programmers to access the data within a data structure, for example to read a value before it is updated.
Update: The update operation changes the data in a data structure.
For example, int x = 2;
Here, 2 is assigned to x.
Again, x = 4; assigns 4 to x. The value of x is now 4 because 2 has been replaced by 4, i.e. updated.
15. Non-primitive data structures
Data structures that are not primitive are called non-primitive data structures. This means that they cannot be operated upon directly by machine-level instructions.
Depending on the type of relationship between the data elements, they can be classified as:
a. Linear data structures
b. Non-linear data structures
16. Linear data structure:
In a linear data structure the data elements are arranged and traversed sequentially, so that from any element only one other element can be reached directly.
Ex: Arrays, Linked Lists, Stacks, Queues.
Arrays: An array is a finite ordered set of homogeneous elements stored in adjacent cells in memory. The data are represented by a single name, and each element is identified by a subscript.
Stacks: A stack is a linear data structure in which data is inserted and deleted at one end, called the top of the stack, i.e. data is stored and retrieved in last-in first-out (LIFO) order.
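The LIFO behaviour of a stack can be sketched with an array and a top index. This is a minimal illustration, not a definitive implementation: the Stack type, the fixed capacity of 8 and the function names are all assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define STACK_CAPACITY 8   /* assumed fixed capacity */

typedef struct {
    int items[STACK_CAPACITY];
    int top;               /* index of the top element, -1 when empty */
} Stack;

void stack_init(Stack *s)
{
    assert(s != NULL);
    s->top = -1;
}

bool stack_is_empty(const Stack *s)
{
    return s->top < 0;
}

/* Push: insert at the top. Returns false if the stack is full. */
bool stack_push(Stack *s, int value)
{
    if (s->top == STACK_CAPACITY - 1)
        return false;
    s->items[++s->top] = value;
    return true;
}

/* Pop: remove from the top. Returns false if the stack is empty. */
bool stack_pop(Stack *s, int *out)
{
    if (stack_is_empty(s))
        return false;
    *out = s->items[s->top--];
    return true;
}
```

Pushing 10, 20 and 30 and then popping three times yields 30, 20 and 10: the last item in is the first one out.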
17. Queue: A queue is an ordered collection of items in which insertion and removal always take place at different ends.
Insertion and deletion are performed according to the first-in first-out (FIFO) principle.
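A queue can be sketched as a circular array, with insertion at the rear and removal at the front. The Queue type, the capacity of 8 and the function names below are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAPACITY 8   /* assumed fixed capacity */

typedef struct {
    int items[QUEUE_CAPACITY];
    size_t head;    /* index of the front element */
    size_t count;   /* number of stored elements */
} Queue;

void queue_init(Queue *q)
{
    assert(q != NULL);
    q->head = 0;
    q->count = 0;
}

/* Insert at the rear. Returns false if the queue is full. */
bool queue_enqueue(Queue *q, int value)
{
    if (q->count == QUEUE_CAPACITY)
        return false;
    q->items[(q->head + q->count) % QUEUE_CAPACITY] = value;
    q->count++;
    return true;
}

/* Remove from the front. Returns false if the queue is empty. */
bool queue_dequeue(Queue *q, int *out)
{
    if (q->count == 0)
        return false;
    *out = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;  /* wrap around the array */
    q->count--;
    return true;
}
```

Enqueuing 10, 20 and 30 and then dequeuing three times yields 10, 20 and 30, i.e. FIFO order.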
Linked lists: A linked list is a sequence of data objects of the same type, each linked to the next. A list may be singly linked, doubly linked or circular.
18. Non-linear data structure:
Every data item may be attached to several other data items in a way that reflects the relationships between them. The data items are not arranged in a sequential structure. Ex: Trees, Graphs
Trees: A tree is a finite set of vertices in which one vertex is called the root and the remaining vertices are partitioned into sub-trees.
Graphs: A graph is a collection of nodes, called vertices, and the connections between them, called edges.
20. Traversing: The process of accessing each data item exactly once to perform some operation is called traversing.
Searching: The process of finding the location of the element with a given key value in a particular data structure.
Insertion: The process of adding a new element to the structure. The position must first be identified, and only then can the new element be inserted.
21. Deletion: The operation of removing one of the elements from the collection of data elements.
Sorting: The process of arranging the elements of a data structure in some logical order. The order may be ascending, descending or alphabetical, depending on the type of data present.
Merging: The process of combining the elements of two different structures into a single structure, for example combining two sorted lists into a single sorted list.
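Merging two sorted lists can be sketched as follows, assuming the lists are int arrays in ascending order. The function name merge_sorted and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Merges sorted arrays a (na elements) and b (nb elements) into out,
   which must have room for na + nb elements. Ascending order is assumed. */
void merge_sorted(const int *a, size_t na, const int *b, size_t nb, int *out)
{
    size_t i = 0, j = 0, k = 0;
    assert(out != NULL);

    /* Repeatedly take the smaller of the two front elements. */
    while (i < na && j < nb)
        out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];

    /* Copy whatever remains in either list. */
    while (i < na)
        out[k++] = a[i++];
    while (j < nb)
        out[k++] = b[j++];
}
```

Each element is copied exactly once, so the number of steps grows linearly with na + nb.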
22. Abstract data type (ADT)
A data type that is defined entirely by a set of operations is referred to as an abstract data type, or simply ADT.
Abstract data types are a way of separating the specification and the representation of data types.
An ADT is a combination of interface and implementation. The interface defines the logical properties of the ADT, and especially the signatures of its operations.
The implementation defines the representation of the data structure and the algorithms that implement the operations.
23. Abstract data types (ADTs)
The basic difference between ADTs and primitive data types is that the latter allow us to look at the representation, whereas the former hide the representation from us.
An ADT consists of a collection of values and operations, with the values deriving their meaning solely through the operations that can be performed upon them.
Benefits of using ADTs:
Code is easier to understand.
Implementations of ADTs can be changed without requiring changes to the programs that use them.
24. Algorithms
An algorithm is a step by step recipe for solving an
instance of a problem.
Every single procedure that a computer performs is an
algorithm.
An algorithm is a precise procedure for solving a
problem in finite number of steps.
An algorithm states the actions to be executed and the
order in which these actions are to be executed.
An algorithm is a well-ordered collection of clear, simple and effectively computable instructions that, when executed, produces a result and halts in a finite amount of time rather than running forever.
25. Algorithm Properties
An algorithm possesses the following properties:
It must be correct.
It must be composed of a series of concrete steps.
There can be no ambiguity as to which step will be
performed next.
It must be composed of a finite number of steps.
It must terminate.
It takes zero or more inputs.
It should be efficient and flexible.
It should use as little memory space as possible.
It produces one or more outputs.
26. Efficiency of an algorithm
Algorithms are programs in a general form. An
algorithm is an idea upon which a program is built.
An algorithm should meet three requirements:
It should be independent of the programming language in which the idea is realized.
Every programmer with enough knowledge and experience should be able to understand it.
It should be applicable to inputs of all sizes.
27. Efficiency of an algorithm
Efficiency of an algorithm denotes the rate at which an
algorithm solves a problem of size n.
It is measured by the amount of resources the algorithm uses: time and space.
Time refers to the number of steps the algorithm executes.
Space refers to the number of units of memory storage it requires.
An algorithm’s complexity is measured by calculating
the time taken and space required for performing the
algorithm.
The input size, denoted by n, is one parameter , used
to characterize the instance of the problem.
The input size n is the number of registers needed to
hold the input (data segment size).
28. Time Complexity of an Algorithm
The time complexity of an algorithm is the amount of time (or the number of steps) needed by a program to complete its task (to execute a particular algorithm).
It describes the way in which the number of steps required by an algorithm varies with the size of the problem it is solving.
29. Space Complexity
The space complexity of a program is the amount of memory consumed by the algorithm (apart from input and output, if required by the specification) until it completes its execution.
It describes the way in which the amount of storage space required by an algorithm varies with the size of the problem to be solved.
Space complexity is normally expressed as an order of magnitude, e.g. O(n^2) means that if the size of the problem n doubles, then four times as much working storage will be needed.
30. Time Complexity of an Algorithm
The time complexity of an algorithm that computes a function f() can be defined as the total number of statements executed in computing the value of f(n).
Time complexity is a function of the value of n. In practice it is often more convenient to consider it as a function of |n|, the size of the input.
The time complexity of an algorithm is generally classified into three types:
(i) Worst case
(ii) Average case
(iii) Best case
31. Mathematical notations and functions
1. Floor and Ceiling Functions:
If x is a real number, then x lies between two integers, which are called the floor and the ceiling of x, i.e.
⌊x⌋ is called the floor of x. It is the greatest integer that is not greater than x.
⌈x⌉ is called the ceiling of x. It is the smallest integer that is not less than x.
32. If x is itself an integer, then ⌊x⌋ = ⌈x⌉; otherwise ⌊x⌋ + 1 = ⌈x⌉.
E.g.
⌊3.14⌋ = 3, ⌊-8.5⌋ = -9, ⌊7⌋ = 7
⌈3.14⌉ = 4, ⌈-8.5⌉ = -8, ⌈7⌉ = 7
33. 2. Remainder Function (Modular Arithmetic):
If k is any integer and M is a positive integer, then:
k (mod M)
gives the integer remainder when k is divided by M.
E.g.
25(mod 7) = 4
25(mod 5) = 0
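In C the % operator gives the same results as above for non-negative k, but it truncates toward zero, so for negative k it can return a negative remainder (for example, -26 % 7 is -5). A small wrapper, here given the hypothetical name mod, keeps the result in the range 0 to M-1.

```c
#include <assert.h>

/* Returns k (mod m) in the range 0 .. m-1 for any integer k and m > 0. */
int mod(int k, int m)
{
    assert(m > 0);
    int r = k % m;               /* may be negative when k is negative */
    return (r < 0) ? r + m : r;  /* shift negative remainders into range */
}
```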
34. Integer and Absolute Value Functions:
If x is a real number, then the integer function INT(x) converts x into an integer by removing the fractional part.
E.g.
INT(3.14) = 3
INT(-8.5) = -8
The absolute value function ABS(x), or |x|, gives the absolute value of x, i.e. the non-negative magnitude of x even if x is negative.
E.g.
ABS(-15) = 15 or |-15| = 15
ABS(7) = 7 or |7| = 7
ABS(-3.33) = 3.33 or |-3.33| = 3.33
35. Summation Symbol (Sums):
The symbol used to denote summation is the Greek letter sigma, Σ.
Let a_1, a_2, a_3, ..., a_n be a sequence of numbers. Then the sum a_1 + a_2 + a_3 + ... + a_n is written as:
Σ_{j=1}^{n} a_j
where j is called the dummy index or dummy variable.
E.g.
Σ_{j=1}^{n} j = 1 + 2 + 3 + ... + n
36. Factorial Function:
n! denotes the product of the positive integers from 1
to n. n! is read as ‘n factorial’, i.e.
n! = 1 * 2 * 3 * ….. * (n-2) * (n-1) * n
E.g.
4! = 1 * 2 * 3 * 4 = 24
5! = 5 * 4! = 120
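The definition translates directly into a loop. In this sketch the function name factorial is an assumption, 0! is taken to be 1 by the usual convention, and n is limited to 20 because 21! no longer fits in a 64-bit unsigned integer.

```c
#include <assert.h>

/* Returns n! for 0 <= n <= 20. */
unsigned long long factorial(unsigned int n)
{
    assert(n <= 20);  /* 21! would overflow 64 bits */
    unsigned long long result = 1;
    for (unsigned int i = 2; i <= n; i++)
        result *= i;  /* multiply in 2, 3, ..., n */
    return result;
}
```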
37. Permutations:
Suppose we have a set of n elements. A permutation of this set is an arrangement of the elements of the set in some order.
E.g.
Suppose the set contains a, b and c. The various
permutations of these elements can be: abc, acb, bac,
bca, cab, cba.
If there are n elements in the set then there will be n!
permutations of those elements. It means if the set
has 3 elements then there will be 3! = 1 * 2 * 3 = 6
permutations of the elements.
38. Exponents and Logarithms:
An exponent tells how many times a number is multiplied by itself. If m is a positive integer, then:
a^m = a * a * a * ... * a (m times)
and
a^(-m) = 1 / a^m
E.g.
2^4 = 2 * 2 * 2 * 2 = 16
2^(-4) = 1 / 2^4 = 1 / 16
39. The concept of logarithms is related to exponents. If b is a positive number other than 1, then the logarithm of any positive number x to the base b is written as log_b x. It represents the exponent to which b must be raised to get x, i.e. y = log_b x means b^y = x.
E.g.
log_2 8 = 3, since 2^3 = 8
log_10 0.001 = -3, since 10^(-3) = 0.001
log_b 1 = 0, since b^0 = 1
log_b b = 1, since b^1 = b
41. Complexity of algorithms
Algorithm complexity is a rough approximation of the
number of steps, which will be executed depending
on the size of the input data. Complexity gives the
order of steps count, not their exact count.
42. Time Complexity of Algorithms
The time complexity of an algorithm signifies the total time required by the program to run to completion.
It is most commonly expressed using big O notation.
It is most commonly estimated by counting the number of elementary operations performed by the algorithm.
43. Time Complexity of an Algorithm
The time taken by an algorithm is made up of two parts:
Compilation time
Run time
Compilation time is the time taken to compile the program that implements the algorithm. While compiling, the compiler checks for syntax and semantic errors in the program and links it with the standard libraries that the program has asked for.
44. Time Complexity of an Algorithm
Run time: the time taken to execute the compiled program.
The run time of an algorithm depends on the number of instructions in the algorithm. Usually we count one unit of time for executing one instruction.
The run time is in the control of the programmer, since the compiler compiles the same number of statements irrespective of the type of compiler used.
Note that run time is calculated only for executable statements and not for declaration statements.
45. Space Complexity
The space complexity of a program is the amount of memory consumed by the algorithm (apart from input and output, if required by the specification) until it completes its execution.
It describes the way in which the amount of storage space required by an algorithm varies with the size of the problem to be solved.
The space occupied by a program generally consists of the following:
A fixed amount of memory, occupied by the program code and by the variables used in the program.
A variable amount of memory, occupied by the component variables whose size depends on the problem being solved.
This space increases or decreases depending upon whether the
46. Space Complexity
The memory taken by the instructions is not in the control of the programmer, as it is totally dependent upon the compiler to assign this memory.
But the memory space taken by the variables is in the control of the programmer. The more variables are used, the more space they take in memory.
47. Order of growth
The time complexity of an algorithm is generally
some function of the instance characteristics . This
function is very useful in determining how the time
requirements vary as the instance characteristics
change.
Let T(n) be the complexity function for input size n.
The complexity function grows with the instance characteristic n, i.e. the value of T(n) increases when n increases and decreases when n decreases.
48. Order of Growth    Big O notation
Constant      O(1)
Logarithmic   O(log n)
Linear        O(n)
Loglinear     O(n log n)
Quadratic     O(n^2)
Cubic         O(n^3)
Exponential   O(2^n), O(10^n)
The order of growth is
O(1) < O(log n) < O(n) < O(n log n) < O(n^2) < O(n^3) < O(2^n) < O(n!)
49. Worst case, best case and average case efficiency
Worst case: the longest time (or most space) that an algorithm will use over all instances of size n for a given problem to produce a desired result.
Average case: the average time (or average space) that the algorithm will use over all instances of size n for a given problem to produce a desired result. It depends on the probability distribution of instances of the problem.
Best case: the shortest time (or least space) that the algorithm will use over all instances of size n for a given problem to produce a desired result.
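Linear search is a convenient illustration of the three cases. The sketch below counts element comparisons; the function name linear_search and its comparisons out-parameter are assumptions. The best case is a key in the first position (1 comparison), the worst case is a key in the last position or absent (n comparisons), and if the key is present and equally likely to be at any position the average is (n + 1) / 2 comparisons.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of key in a[0..n-1], or -1 if it is absent.
   *comparisons receives the number of element comparisons made. */
long linear_search(const int *a, size_t n, int key, size_t *comparisons)
{
    assert(comparisons != NULL);
    *comparisons = 0;
    for (size_t i = 0; i < n; i++) {
        (*comparisons)++;
        if (a[i] == key)
            return (long)i;   /* found: stop early */
    }
    return -1;                /* not found after n comparisons */
}
```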
50. Asymptotic Notations
The following asymptotic notations are commonly used in calculating the running-time complexity of an algorithm:
Ο Notation
Ω Notation
θ Notation
51. Big Oh Notation, Ο
Ο(g(n)) is the formal way to express an upper bound on an algorithm's running time.
It is most often used to describe the worst-case time complexity, i.e. the longest amount of time an algorithm can possibly take to complete.
For a function f(n),
f(n) = O(g(n)) if and only if there exist two positive constants c and n0 such that
|f(n)| ≤ c|g(n)| for all n ≥ n0
52. Omega Notation, Ω
Ω(g(n)) is the formal way to express a lower bound on an algorithm's running time.
It is most often used to describe the best-case time complexity, i.e. the least amount of time an algorithm can possibly take to complete.
For a function f(n),
f(n) = Ω(g(n)) if and only if there exist two positive constants c and n0 such that
|f(n)| ≥ c|g(n)| for all n ≥ n0
53. Theta Notation, θ
θ(g(n)) is the formal way to express both a lower bound and an upper bound on an algorithm's running time.
It is defined as follows:
f(n) = θ(g(n)) if and only if there exist three positive constants c1, c2 and n0 such that
c1|g(n)| ≤ |f(n)| ≤ c2|g(n)| for all n ≥ n0
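As a worked example of these definitions, take the hypothetical running-time function f(n) = 3n + 2 and g(n) = n. The constants below are one valid choice, not the only one.

```latex
\begin{align*}
f(n) &= 3n + 2, \qquad g(n) = n \\
3n &\le 3n + 2 \le 4n \quad \text{for all } n \ge 2 \\
\text{so } f(n) &= \Theta(n) \text{ with } c_1 = 3,\ c_2 = 4,\ n_0 = 2
\end{align*}
```

The right-hand inequality alone shows f(n) = O(n) with c = 4, and the left-hand inequality alone shows f(n) = Ω(n) with c = 3.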
54. String processing
The character set consists of letters, digits and special characters. A finite sequence S of zero or more characters is called a string.
The number of characters in a string is called its length.
The string with zero characters is called the empty string or null string.
In C, each string is terminated by a null character ('\0'), which indicates the end of the string.
A string constant is denoted by any sequence of characters enclosed in double quotation marks.
55. Storing strings
Strings are generally stored in three types of structures:
Fixed length storage
Variable length storage with fixed maximum
Linked storage
56. Fixed length storage
Each string is allocated the same number of memory cells, so the number of characters available to each string is fixed.
The length of the string is known when the string is created.
Ex.
char str[25];
This means that 25 memory locations are allotted for the string str.
57. Advantages
Implementation is easy.
Updating is also easy.
Disadvantages
Memory is not utilized properly when the string length is less than the fixed length.
Time is wasted in reading the entire string, which may have some blank spaces.
58. Variable length storage with fixed maximum
Here strings are stored in fixed-length memory locations, but the actual length of each string is also known.
If the string is padded with blank spaces and its length is known, we need not read the entire memory location.
There are two ways of storing variable-length strings in memory with a fixed maximum length:
A special character, such as '\0', can be used to signal the end of the string.
The length of each string can be listed as an additional item.
59. Advantage
This method saves space in memory and can be used when strings are relatively permanent.
Disadvantage
This method is inefficient when the strings and their lengths are frequently being changed.
60. Linked storage
A linked list is a linearly ordered sequence of memory cells, called nodes, where each node contains an item, called a link, which points to the next node.
The characters are stored in the data field of the nodes.
Advantages
Modification, insertion and deletion operations are easier.
Storage representation is efficient.
Disadvantages
Extra memory is required for the link field.
One cannot directly access a character in the middle of the string.
61. String as ADT
In most programming languages, strings are a built-in data type or part of the standard library.
String functions such as finding the length, concatenating two strings, finding the index of a character in a string and copying one string to another are already available as library functions, so there is usually no need to create a string ADT.
If we want our own implementation of the string functions, then we can create our own string ADT.
62. Properties of string ADT
The component characters are from the ASCII character set.
They are comparable in lexicographic order.
They have a length, from 0 to the specified length.
63. String operations
Finding length of string
Copying one string into another
Concatenation of two strings
Extracting substring from a given string
String comparison
Indexing or pattern matching
64. Length of string
The length of a string is the number of characters in the string.
strlen() is the built-in function to find the length of a string.
If str is a character array variable and length is an integer variable, then the length of the string can be determined by using
length = strlen(str);
65. Copying one string into another
In C, one string cannot be assigned to another as we do with numbers. There is a built-in function called strcpy() to copy one string to another:
strcpy(dest, source)
where dest is the destination string and source is the source string.
66. String concatenation
Adding or appending one string at the end of another
string is called concatenation.
Two strings str1 and str2 can be concatenated using
the library function strcat().
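The three library functions named so far (strlen, strcpy and strcat, all declared in string.h) can be combined into a small helper. The name concat_copy and the choice to return freshly allocated memory are assumptions made for this sketch; strcpy and strcat themselves do not check that the destination is large enough, which is why the buffer is sized with strlen first.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated string holding str1 followed by str2,
   or NULL if allocation fails. The caller frees the result. */
char *concat_copy(const char *str1, const char *str2)
{
    assert(str1 != NULL && str2 != NULL);
    size_t length = strlen(str1) + strlen(str2);
    char *dest = malloc(length + 1);   /* +1 for the terminating '\0' */
    if (dest == NULL)
        return NULL;
    strcpy(dest, str1);                /* copy the first string */
    strcat(dest, str2);                /* append the second string */
    return dest;
}
```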
67. Substring
The substring operation extracts some characters from the given string.
Accessing a substring of a given string requires the following:
The name of the string or the string itself
The position of the first character of the substring in the given string
The length of the substring
SUBSTR(str, pos, n) extracts n characters from the string str, starting at the specified position pos.
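The C standard library has no substring function, so one possible sketch is given below. The function name substr, the 1-based position (matching the examples in these slides), the clamping of n to the end of the string and the NULL return for an out-of-range position are all assumptions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of up to n characters of str, starting
   at 1-based position pos. Returns NULL if pos is out of range or
   allocation fails. The caller frees the result. */
char *substr(const char *str, size_t pos, size_t n)
{
    assert(str != NULL);
    size_t len = strlen(str);
    if (pos < 1 || pos > len)
        return NULL;

    size_t avail = len - pos + 1;      /* characters from pos to the end */
    if (n > avail)
        n = avail;                     /* clamp to the end of the string */

    char *out = malloc(n + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, str + pos - 1, n);
    out[n] = '\0';
    return out;
}
```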
68. String comparison
This function compares two strings to find out whether they are the same or different.
The two strings are compared character by character until there is a mismatch or the end of one of the strings is reached, whichever occurs first.
strcmp(str1, str2) compares the two strings str1 and str2 and returns 0 if they are the same.
69. Pattern matching
Pattern matching means finding a pattern, which is relatively small, in a text, which is assumed to be very large.
It refers to finding the position where a string pattern P first appears in the given text T. If pattern P does not appear in text T, then INDEX = 0.
The text and pattern can be either string constants or string variables.
70. Algorithm 1. Pattern matching algorithm
j := 1;
while j <= tlen - plen + 1 do begin
    i := 1;
    while (i <= plen) and (pat[i] = text[j]) do begin
        i := i + 1;
        j := j + 1
    end;
    if i <= plen then
        j := j - i + 2    { mismatch: shift the pattern one place right }
    else
        write('found at ', j - i + 1)
end
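A C rendering of the same brute-force idea is sketched below. Unlike the pseudocode, it stops at the first match and returns its 1-based position, or 0 if the pattern does not occur, following the INDEX convention in the text. The function name index_of is an assumption, as is the decision to return 0 for an empty pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the 1-based position of the first occurrence of pat in text,
   or 0 if pat does not occur (or is empty). */
size_t index_of(const char *text, const char *pat)
{
    assert(text != NULL && pat != NULL);
    size_t tlen = strlen(text);
    size_t plen = strlen(pat);

    if (plen == 0 || plen > tlen)
        return 0;

    /* Try every alignment j of the pattern against the text. */
    for (size_t j = 0; j + plen <= tlen; j++) {
        size_t i = 0;
        while (i < plen && pat[i] == text[j + i])
            i++;                       /* extend the match */
        if (i == plen)
            return j + 1;              /* whole pattern matched */
    }
    return 0;
}
```

In the worst case every alignment is compared almost to the end of the pattern, so the number of character comparisons is proportional to plen * (tlen - plen + 1).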
71. Word processing
The operations that can be performed in word processing are:
Replacement: Replacement operation involves
replacing one string in the text by another.
Insertion: Insertion involves inserting a string in the
middle of the text.
Deletion:Deletion operation involves deleting a string
from the text.
72. Replacement
Suppose that in a given text T, the first occurrence of pattern P1 is to be replaced by a pattern P2.
This operation can be denoted by the function
REPLACE(text, pattern1, pattern2)
For example:
REPLACE('RAMESH', 'M', 'J')
In the text RAMESH, the pattern M is replaced by the pattern J. The resulting text is RAJESH.
73. Algorithm
REPLACE(STR, P1, P2)
Given a string STR, we want to replace the first occurrence of pattern P1 by a pattern P2.
Step 1: Find the index of P1 in the given string STR.
POS := INDEX(STR, P1)
Step 2: Delete the pattern P1 from the string STR using the DELETE operation.
STR := DELETE(STR, POS, LENGTH(P1))
Step 3: Insert P2 into the string STR at position POS.
STR := INSERT(STR, POS, P2)
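One way to sketch REPLACE in C is shown below. It is not a literal transcription of the three steps: it uses the standard strstr function in place of INDEX and copies the three pieces (text before P1, then P2, then text after P1) directly instead of calling separate DELETE and INSERT routines. The function name str_replace and the choice to return an unchanged copy when P1 is empty or absent are assumptions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of str in which the first occurrence
   of p1 is replaced by p2. If p1 is empty or not found, the copy is
   unchanged. Returns NULL if allocation fails. The caller frees it. */
char *str_replace(const char *str, const char *p1, const char *p2)
{
    assert(str != NULL && p1 != NULL && p2 != NULL);
    size_t n = strlen(str), n1 = strlen(p1), n2 = strlen(p2);
    const char *hit = (n1 > 0) ? strstr(str, p1) : NULL;  /* Step 1 */

    if (hit == NULL) {
        char *copy = malloc(n + 1);
        if (copy != NULL)
            memcpy(copy, str, n + 1);
        return copy;
    }

    size_t before = (size_t)(hit - str);   /* characters before P1 */
    char *out = malloc(n - n1 + n2 + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, str, before);                        /* text before P1 */
    memcpy(out + before, p2, n2);                    /* P2 replaces P1 */
    memcpy(out + before + n2, hit + n1,
           n - before - n1 + 1);                     /* rest and '\0' */
    return out;
}
```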
74. Insertion
Suppose that in a given text T, a string S is to be inserted at position K. This operation can be denoted by
INSERT(text, position, string)
Example:
INSERT('MAHALAKSHMI', 5, 'RANI') = 'MAHARANILAKSHMI'
75. Algorithm
INSERT(STR, POS, NEWSTR)
Given a string STR, we want to insert the string NEWSTR at position POS. The insert operation can be implemented using the SUBSTRING operation.
Step 1: FIRST := SUBSTRING(STR, 1, POS-1) gives the substring from the first letter up to position POS-1.
Step 2: Concatenate FIRST and NEWSTR.
CONSTR := FIRST + NEWSTR
Step 3: SECOND := SUBSTRING(STR, POS, LENGTH(STR)-POS+1) gives the second part of the string.
Step 4: Concatenate the strings as shown below.
FINALSTRING := CONSTR + SECOND
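The four steps can be sketched in C by copying FIRST, NEWSTR and SECOND into one new buffer. The function name str_insert, the 1-based position, and the NULL return for an out-of-range position are assumptions; position LENGTH(STR)+1 is allowed here so that a string can be appended at the end.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of str with newstr inserted so that it
   starts at 1-based position pos. Returns NULL if pos is out of range
   or allocation fails. The caller frees the result. */
char *str_insert(const char *str, size_t pos, const char *newstr)
{
    assert(str != NULL && newstr != NULL);
    size_t len = strlen(str);
    size_t newlen = strlen(newstr);

    if (pos < 1 || pos > len + 1)
        return NULL;

    char *out = malloc(len + newlen + 1);
    if (out == NULL)
        return NULL;

    size_t first = pos - 1;                       /* length of FIRST */
    memcpy(out, str, first);                      /* Step 1: FIRST */
    memcpy(out + first, newstr, newlen);          /* Step 2: + NEWSTR */
    memcpy(out + first + newlen, str + first,
           len - first + 1);                      /* Steps 3-4: + SECOND and '\0' */
    return out;
}
```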
76. Deletion
Suppose that in a given text T, a substring which begins at position K and has length L is to be deleted. This operation can be denoted by
DELETE(text, position, length)
Example:
DELETE('MAHARANI', 1, 4) = 'RANI'
77. Algorithm
DELETE(STR, POS, LEN)
Given a string STR, we want to delete LEN characters starting at position POS. The delete operation can be implemented using the SUBSTRING operation.
Step 1: FIRST := SUBSTRING(STR, 1, POS-1) gives the substring from the first letter up to position POS-1.
Step 2: SECOND := SUBSTRING(STR, POS+LEN, LENGTH(STR)-POS-LEN+1) gives the second part of the string.
Step 3: Concatenate the strings as shown below.
FINALSTRING := FIRST + SECOND
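The same approach gives a sketch of DELETE in C: copy FIRST, skip LEN characters, then copy SECOND. The function name str_delete, the 1-based position, and the NULL return when the requested range does not lie inside the string are assumptions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of str with len characters removed,
   starting at 1-based position pos. Returns NULL if the range is not
   inside the string or allocation fails. The caller frees the result. */
char *str_delete(const char *str, size_t pos, size_t len)
{
    assert(str != NULL);
    size_t n = strlen(str);

    if (pos < 1 || pos > n || len > n - pos + 1)
        return NULL;

    char *out = malloc(n - len + 1);
    if (out == NULL)
        return NULL;

    size_t first = pos - 1;                       /* length of FIRST */
    memcpy(out, str, first);                      /* Step 1: FIRST */
    memcpy(out + first, str + first + len,
           n - first - len + 1);                  /* Steps 2-3: + SECOND and '\0' */
    return out;
}
```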
Editor's Notes
A primary concern for this course is efficiency.
You might believe that faster computers make it unnecessary to be concerned with efficiency. However…
So we need special training.
We frequently interchange use of “algorithm” and “program” though they are actually different concepts.