4. Objective
At the end of the session students should have basic
understanding of
Data structure
Classification of data structure
Algorithm
Characteristics of algorithms
Algorithm analysis
4
5. Why Are Data Structures and
Algorithms So Important?
5
6. Why Are Data Structures and
Algorithms So Important?
are the fundamentals of programming
in order to become a good developer (professional)
– one that analyzes the problems in depth and proposes efficient
solutions.
6
7. Why Data
Structures?
They are essential ingredients in creating fast and
powerful algorithms.
They help to manage and organize data.
They make code cleaner and easier to understand.
7
8. “Bad programmers worry about the code. Good programmers
worry about data structures and their relationships.”
Linus Torvalds
8
9. Introduction
Program = Algorithm + Data structure
Data Structure:
The way data is organized in a computer’s memory so that
it can be used effectively.
Algorithm :
The sequence of computational steps to solve a
problem.
9
10. Con’t…
All of you have programmed;
already been exposed to algorithms and data structure.
Perhaps
you didn't see them as separate entities;
you saw data structures as simple programming constructs
(provided by STL--standard template library).
However, data structures are quite distinct from
algorithms, and very important in their own right.
10
11. Con’t…
Model
defines an abstract view to the
problem
define the properties of the problem
Properties of the problem include
The data which are affected and
The operations that are involved in
the problem.
Define Data
structure
Properties
11
13. Abstract Data Type (ADT)
is an abstraction of a data structure which provides only the interface to
which a data structure must follow.
The interface doesn’t give any specific details about how something
should be implemented or in what programming language.
13
14. Abstract Data Type (ADT)
Consists of an abstract data structure and
operations
Specifies:
What can be stored in the ADT
What operations can be done on/by the ADT
E.g., if we are going to model employees of an organization:
ADT stores employees with
their relevant attributes and
discarding irrelevant attributes.
ADT supports
hiring, firing, retiring, … operations.
14
15. Data structure (DS)
is a language construct that the programmer has defined in order
to implement an abstract data type.
There are lots of formalized and standard Abstract data types
such as Stacks, Queues, Trees, etc.
Do all characteristics need to be modeled?
Not at all it depends on
the scope of the model
the reason for developing the model
Abstraction
is a process of classifying characteristics as relevant and irrelevant for
the particular purpose at hand and ignoring the irrelevant ones.
15
16. Abstract Data Types Vs. Data
Structures
Examples
16
Abstraction (ADT) Implementation (DS)
List Dynamic array
Linked list
Queue Linked list Based
Array Based
Map Tree Map
Hash Table
Vehicle Bicycle
Car
Train
17. CLASSIFICATION OF DATA
STRUCTURE
What is data structure :
is basically a technique of organizing and storing of different
types of data items in computer memory. It is considered as
not only the storing of data elements but also the maintaining
of the logical relationship existing between individual data
elements.
also be defined as a mathematical or logical model, which
relates to a particular organization of different data elements.
17
19. CLASSIFICATION OF DATA
STRUCTURE
Primitive data structure:
also known as basic data structures.
predefined types of data, which are supported by the
programming language.
Non-Primitive data structure:
are highly developed complex data structures.
are not defined by the programming language, but are instead
created by the programmer.
Are responsible for organizing the group of homogeneous and
heterogeneous data elements.
19
21. CLASSIFICATION OF DATA
STRUCTURE
is the
The linear and non-linear data structure
subclassification of Non-primitive data structure.
Difference:
linear data structure arranges the data into a sequence
and follow some sort of order.
non-linear data structure does not organize the data in a
sequential manner.
21
25. Algorithm
An algorithm can be thought of as a set of instructions that
specifies how to solve a particular problem.
is a well-defined computational procedure that takes some
value or a set of values as input and produces some value or a
set of values as output
An algorithm transforms data structures from one state to
another state in two ways:
An algorithm may change the value held by a data
structure
An algorithm may change the data structure itself
25
27. Algorithm Analysis
Refers to the process of determining the amount of computing time
and storage space required by different algorithms.
A process of predicting the resource requirement of algorithms in a
given environment.
Main resources are:
Running Time
Memory Usage
Communication Bandwidth
28
28. Approaches to measure the efficiency of
algorithms
Empirical:
competing algorithms and trying them on different
instances.
Theoretical:
Determining the quantity of resources required
mathematically (Execution time, memory space, etc.)
needed by each algorithm.
29
29. Specific processor speed, Current processor load
Specific data for a particular run of the program
Input Size, Input Properties
Operating Environment
30
However, it is difficult to use actual clock-time as a consistent measure of
an algorithm‘s efficiency, because clock-time can vary based on many
things. For example,
30. Complexity Analysis
Is the systematic study of the cost of computation, measured
either
in time units or
in operations performed, or
in the amount of storage space required
The goal is to have a meaningful measure that permits comparison
of algorithms independent of operating platform.
31
31. Example Algorithms
Two algorithms for computing the Factorial
Which one is better?
int factorial (int n)
{ if (n <= 1) return 1;
else return n * factorial(n-1);
}
int factorial (int n) {
return 1;
if (n<=1)
else {
fact = 1;
for (k=2; k<=n; k++)
fact *= k;
return fact;
}
}
32
32. Complexity: A measure of the
performance of an algorithm
An algorithm’s performance depends on internal and external factors
Complexity measures the internal factors
(usually more interested in time than space)
Internal
The algorithm’s efficiency, in
terms of:
• Time required to run
• Space (memory storage)
required to run
External
• Size of the input to the algorithm
• Speed of the computer on which it is run
• Quality of the compiler
33
33. Con’t…
Two things to consider:
Time Complexity:
Determine the approximate number of operations required to
solve a problem of size n.
Space Complexity:
Determine the approximate memory required to solvea
problem of size n.
34
34. Benefits
Provides the short hand method, to find how much time
algorithm takes (approx.)
Its useful way of estimating the efficiency of analgorithm,
without having to measure actual run times.
Gives the choice to select among the best algorithms
available
36
35. Program running time – Why?
When is the running time (waiting time for user) noticeable/important?
37
36. Program running time – Why?
When is the running time (waiting time for user) noticeable/important?
• web search
• database search
• real-time systems with time constraints
38
37. How to Measure Algorithm Performance
What metric should be used to judge algorithms?
Length of the program (lines of code)
Ease of programming (bugs, maintenance)
Memory required
Running time
Running time is the dominant standard.
Quantifiable and easy to compare
Often the critical bottleneck
39
38. Con’t…
• An algorithm may run differently depending on:
the hardware platform (PC, Cray, Sun)
the programming language (C, Java, C++)
the programmer (you, me)
40
39. Complexity analysis involves two distinct phases
Algorithm Analysis:
Analysis of the algorithm or data structure to produce a
function T (n) that describes the algorithm in terms of the
operations performed in order to measure the complexity of
the algorithm.
Order of Magnitude Analysis:
Analysis of the function T (n) to determine the general
complexity category to which it belongs.
41
40. Analysis
Rules:
1. We assume an arbitrary time unit.
2. Execution of one of the following operations takes time 1:
• Assignment Operation
• Single Input/output Operation
• Single Boolean Operations
• Single Arithmetic Operations
• Function Return
42
41. Con’t…
3. Running time of a selection statement (if, switch)
is the time for the condition evaluation +
the maximum of the running times for the individual clauses in the
selection.
43
42. Con’t….
4. Loops: Running time for a loop is equal to
the running time for the statements inside the loop *
number of iterations.
Remark:
The total running time of a statementinside a
group of nested loops is
the running time of the statements multiplied by the
product of the sizes of all the loops.
For nested loops, analyze inside out.
Always assume that the loop executes the maximumnumber of
iterations possible.
44
43. Con’t…..
5. Running time of a function call is
1 for setup +
the time for any parameter calculations +
the time required for the execution of the functionbody.
45
44. Example
int count(){
int k=0;
cout<< ―Enter an integer‖;
cin>>n;
for (i=0;i<n;i++)
k=k+1;
return 0;}
Time Units to
Compute
1 for the assignment statement: int
k=0
1 for the output statement 1
for the input statement. In
the for loop:
1 assignment, n+1 tests, and n
increments.
n loops of 2 units for an assignment,
and an addition.
1 for the return statement.
T (n)= 1+1+1+(1+n+1+n)+2n+1 = 4n+6
= O(n)
46
45. void func()
{
int x=0;
int i=0;
int j=1;
cout<< ―Enter an Integer value‖;
cin>>n;
while (i<n){
x++;
i++;
}
while (j<n)
{
j++;
}
}
Time Units to Compute
-------------------------------------------------
1 for the first assignment statement: x=0;
1 for the second assignment statement: i=0;
1 for the third assignment statement: j=1;
1 for the output statement. 1
for the input statement.
In the first while loop:
n+1 tests
n loops of 2 units for the two increment
(addition) operations
In the second while loop:
n tests
n-1 increments
-------------------------------------------------------
T (n)= 1+1+1+1+1+n+1+2n+n+n-1 = 5n+5 = O(n)
47
46. Example
for (i=1; i<n-1; i++) {
for (j=n; j>= i+1; j--) {
if (A(j-1) > A(j)) {
temp = A(j-1);
A(j-1) = A(j);
A(j) = tmp;
}
}
}
Complexity Analysis is ?
48
47. Asymptotic
Analysis
The term asymptotic means approaching a value or curve arbitrarily
closely (i.e., as some sort of limit is taken).
Asymptotic analysis refers to computing the running time of any
operation in mathematical units of computation.
Asymptotic analysis of an algorithm refers to defining the mathematical
boundary/framing of its run-time performance.
Asymptotic Notations are languages that allow us to analyze an
algorithm's running time by identifying its behavior as the input size for the
algorithm increases. This is also known as an algorithm's growth rate.
49
48. Simplifying the Bound
T(n) = ck nk + ck-1 nk-1 + ck-2 nk-2 + … + c1 n + co
too complicated
too many terms
Difficult to compare two expressions, each with 10 or 20 terms
Do we really need that many terms?
50
49. Simplifications
Keep just one term!
the fastest growing term (dominates the runtime)
No constant coefficients are kept
Constant coefficients affected by machines, languages,
etc.
Asymptotic behavior (as n gets large) is determined
entirely by the leading term.
Example. T(n) = 10 n3 + n2 + 40n + 800
If n = 1,000, then T(n) = 10,001,040,800
error is 0.01% if we drop all but the n3 term
51
50. T(n) keep one drop coef
3n2+4n+1 3 n2 n2
101 n2+102 101 n2 n2
15 n2+6n 15 n2 n2
a n2+bn+c a n2 n2
Asymptotic
They all have the same “growth” rate
52
51. Asymptotic Notations
Big Oh Notation, Ο
The notation Ο(n) is the formal way to express the upper bound of an
algorithm's running time. It measures the worst case time complexity or
the longest amount of time an algorithm can possibly take to complete.
Omega Notation, Ω
The notation Ω(n) is the formal way to express the lower bound of an
algorithm's running time. It measures the best case time complexity or the
best amount of time an algorithm can possibly take to complete.
Theta Notation, θ
The notation θ(n) is the formal way to express both the lower bound and
the upper bound of an algorithm's running time. It measures the average
case time complexity.
53
52. Asymptotic Notations
Big-O, “bounded above by”: T(n) = O(f(n))
For some c and N, T(n) c·f(n) whenever n > N.
Big-Omega, “bounded below by”: T(n) = (f(n))
For some c>0 and N, T(n) c·f(n) whenever n > N.
Same as f(n) = O(T(n)).
Big-Theta, “bounded above and below”: T(n) = (f(n))
T(n) = O(f(n)) and also T(n) = (f(n))
Little-o, “strictly bounded above”: T(n) = o(f(n))
T(n)/f(n) 0 as n
54
53. By Pictures
Big-Oh (most commonly used)
bounded above
Big-Omega
bounded below
Big-Theta
exactly
Small-o
not as expensive as ...
55
54. Average, Best, and Worst-Case
The time required by an algorithm falls under the following three
types −
Average case:
— Average time required for program execution
Real world distributions difficult to predict
Best case:
— Minimum time required for program execution
Seems unrealistic
Worst case:
— Maximum time required for program execution
Gives an absolute guarantee
On which input instances should the algorithm’s performance be
Judged?
We will use the worst-case measure.
56
56. Summary (Why O(n)?)
n + co
T(n) = ck nk + ck-1 nk-1 + ck-2 nk-2 + … + c1
Too complicated
O(nk )
a single term with constant coefficient dropped
Much simpler, extra terms and coefficients do not matter
asymptotically
Other criteria hard to quantify
58
57. Runtime Analysis (summary)
Useful rules
simple statements (read, write, assign)
O(1) (constant)
simple operations (+ - * / == > >= < <=
O(1)
sequence of simple statements/operations
rule of sums
for, do, while loops
rules of products
59
58. Runtime Analysis (cont.)
Two important rules
Rule of sums
if you do a number of operations in sequence, the runtime is
dominated by the most expensive operation
Rule of products
if you repeat an operation a number of times, the total runtime
is the runtime of the operation multiplied by the iteration count
60
60. Runtime Analysis (cont.)
Method calls
A calls B
B calls C
etc.
A sequence of operations when call sequences are flattened
T(n) = max(TA(n), TB(n), TC(n))
62
61. Example
int count()
{
int k=0;
cout<< “Enter an
integer”;
cin>>n;
for (i=0;i<n;i++)
k=k+1;
return 0;
}
T (n)= 1+1+1+(1+n+1+n)+2n+1 = 4n+6 = O(n)
In the for loop:
• 1 assignment, n+1 tests, and n
increments.
• n loops of 2 units for an assignment,
and an addition.
• 1 for the return statement.
Time Units to Compute
-------------------------------------------------
• 1 for the assignment statement: int
k=0
• 1 for the output statement.
• 1 for the input statement.
63
62. Example
for (i=1; i<n; i++)
if A(i) > maxVal then
maxVal= A(i);
maxPos= i;
Asymptotic Complexity: ?
64
63. Example
for (i=1; i<n; i++)
if A(i) > maxVal then
maxVal= A(i);
maxPos= i;
Asymptotic Complexity: O(n)
65
64. Example
for (i=1; i<n-1; i++)
for (j=n; j>= i+1; j--)
if (A(j-1) > A(j)) then
temp = A(j-1);
A(j-1) = A(j);
A(j) = tmp;
endif
endfor
endfor
Asymptotic Complexity is ?
66
65. Example
for (i=1; i<n-1; i++)
for (j=n; j>= i+1; j--)
if (A(j-1) > A(j)) then
temp = A(j-1);
A(j-1) = A(j);
A(j) = tmp;
endif
endfor
endfor
Asymptotic Complexity is O(n2)
67
66. Run Time for Recursive Programs
T(n) is defined recursively in terms of T(k), k<n
The recurrence relations allow T(n) to be “unwound” recursively into
some base cases (e.g., T(0) or T(1)).
Examples:
Factorial
Hanoi towers
68
A computer program is a collection of instructions that can be executed by a computer to perform a specific task.
Abstraction (from the Latin abs, meaning away from and trahere , meaning to draw) is the process of taking away or removing characteristics from something in order to reduce it to a set of essential characteristics