DAA notes by Pallavi Joshi
Chapter 1: The Role of Algorithms in Computing
Algorithms
Informally, an algorithm is …
A well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output:
input → algorithm → output
A sequence of computational steps that transform the input into output.
Algorithms
In practice, an algorithm is …
A tool for solving a well-specified computational problem.
The problem specification states what the input is and what the desired output should be.
The algorithm describes a specific computational procedure for producing the desired output from a given input.
Algorithms
The Sorting Problem:
Input: A sequence of n numbers ⟨a1, a2, …, an⟩.
Output: A permutation (reordering) ⟨a'1, a'2, …, a'n⟩ of the input sequence such that a'1 ≤ a'2 ≤ … ≤ a'n.
An instance of the Sorting Problem:
Input: A sequence of 6 numbers [31, 41, 59, 26, 41, 58].
Expected output for the given instance: the permutation [26, 31, 41, 41, 58, 59].
Algorithms
Some definitions …
An algorithm is said to be correct if, for every input instance, it halts with the correct output.
A correct algorithm solves the given computational problem.
Our focus will be on correct algorithms, although incorrect algorithms can sometimes be useful.
An algorithm may be specified in English, as a computer program, or even as a hardware design.
Gallery of Problems
Algorithms (many of them novel) are needed to solve problems such as these …
The Human Genome Project seeks to identify all the roughly 100,000 genes in human DNA, determine the sequences of the 3 billion chemical base pairs that make up human DNA, store this information in databases, and develop tools for data analysis.
The huge network that is the Internet, and the huge amount of data that courses through it, require algorithms to manage and manipulate this data efficiently.
Gallery of Problems
E-commerce enables goods and services to be negotiated and exchanged electronically. Maintaining the privacy and security of all transactions is crucial.
Traditional manufacturing and commerce require allocating scarce resources in the most beneficial way. Linear programming algorithms are used extensively in commercial optimization problems.
Some algorithms
• Shortest path algorithm
– Given a weighted graph and two distinguished vertices -- the source and the destination -- compute the most efficient way to get from one to the other
• Matrix multiplication algorithm
– Given a sequence of conformable matrices, compute the most efficient way of forming the product of the matrix sequence
• Convex hull algorithm
– Given a set of points in the plane, compute the smallest convex body that contains the points
• String matching algorithm
– Given a sequence of characters, compute where (if at all) a second sequence of characters occurs in the first
Hard problems
• The usual measure of efficiency is speed
– How long does an algorithm take to produce its result?
– We will define formal measures of efficiency
• Problems exist that, in all probability, will take a long time to solve
– Exponential complexity
– NP-complete problems
• Problems exist that are unsolvable
Hard problems
• NP-complete problems are interesting in and of themselves
– Some of them arise in real applications
– Some of them look very similar to problems for which efficient solutions do exist
– Knowing the difference is crucial
• It is not known whether NP-complete problems really are as hard as they seem, or whether the machinery for solving them efficiently simply has not been developed yet
Hard problems
• The P ≠ NP conjecture
– The fundamental open problem in the theory of computational complexity
– Open now for 30+ years
Algorithms as a technology
• Even if computers were infinitely fast and memory were plentiful and free
– The study of algorithms would still be important – we would still need to establish algorithm correctness
– If time and space were truly infinite, any correct algorithm would do
• Real-world computers are fast but not infinitely so
• Memory is cheap but not unlimited
Efficiency
• Time and space efficiency are the goal
• Algorithms often differ dramatically in their efficiency
– Example: two sorting algorithms
• INSERTION-SORT – running time grows like c1·n^2
• MERGE-SORT – running time grows like c2·n·lg n
– For which problem instances would one algorithm be preferable to the other?
Efficiency
– The answer depends on several factors:
• Speed of the machine performing the computation
– Internal clock speed
– Shared environment
– I/O needed by the algorithm
• Quality of the implementation (coding)
– Compiler optimization
– Implementation details (e.g., data structures)
• Size of the problem instance
– The most stable parameter – used as the independent variable
Efficiency
• INSERTION-SORT
– Implemented by an ace programmer and run on machine A, which performs 10^9 instructions per second, with time efficiency given by:
tA(n) = 2n^2 instructions (i.e., c1 = 2)
• MERGE-SORT
– Implemented by a novice programmer and run on machine B, which performs 10^7 instructions per second, with time efficiency given by:
tB(n) = 50·n·lg n instructions (i.e., c2 = 50)
Efficiency
Problem size n | Machine A: Insertion Sort, 2n^2/10^9 (s) | Machine B: Merge Sort, 50·n·lg n/10^7 (s)
    10,000     |         0.20                             |       0.66
    50,000     |         5.00                             |       3.90
   100,000     |        20.00                             |       8.30
   500,000     |       500.00                             |      47.33
 1,000,000     |     2,000.00                             |      99.66
 5,000,000     |    50,000.00                             |     556.34
10,000,000     |   200,000.00                             |   1,162.67
50,000,000     | 5,000,000.00                             |   6,393.86
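The table can be reproduced directly; a minimal Python sketch (assuming base-2 logarithms, as the table values indicate; function names are illustrative):

import math

def time_insertion_on_A(n):
    # Machine A: 10^9 instructions/second; insertion sort costs 2*n^2 instructions
    return 2 * n**2 / 1e9

def time_merge_on_B(n):
    # Machine B: 10^7 instructions/second; merge sort costs 50*n*lg(n) instructions
    return 50 * n * math.log2(n) / 1e7

for n in (10_000, 100_000, 1_000_000, 50_000_000):
    print(f"n = {n:>10,}   A: {time_insertion_on_A(n):>12,.2f} s   B: {time_merge_on_B(n):>8,.2f} s")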
Efficiency
• Graphical comparison
[Figure: "Time Efficiency Comparison" – running time in seconds vs. problem size (in 1000s) for Insertion Sort and Merge Sort]
The Sorting Problem
Input: A sequence of n numbers ⟨a1, a2, …, an⟩.
Output: A permutation (reordering) ⟨a'1, a'2, …, a'n⟩ of the input sequence such that a'1 ≤ a'2 ≤ … ≤ a'n.
An instance of the Sorting Problem:
Input: A sequence of 6 numbers [31, 41, 59, 26, 41, 58].
Expected output for the given instance: the permutation [26, 31, 41, 41, 58, 59].
Insertion Sort
The main idea …
[Figures: step-by-step trace of insertion sort inserting each element into the sorted prefix]
Insertion Sort
The algorithm …
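The pseudocode itself appeared only as an image in the original slides; as a stand-in, here is a minimal Python sketch of the same algorithm (the comments map to lines 1-8 of the CLRS pseudocode that the loop invariant and cost analysis below refer to):

def insertion_sort(A):
    """Sort list A in place, as in CLRS INSERTION-SORT (0-based indices here)."""
    for j in range(1, len(A)):            # line 1: for j = 2 to n
        key = A[j]                        # line 2: key = A[j]
        i = j - 1                         # line 4: i = j - 1
        while i >= 0 and A[i] > key:      # line 5: while i > 0 and A[i] > key
            A[i + 1] = A[i]               # line 6: shift the larger element right
            i = i - 1                     # line 7
        A[i + 1] = key                    # line 8: insert key into its slot

data = [31, 41, 59, 26, 41, 58]
insertion_sort(data)
print(data)  # [26, 31, 41, 41, 58, 59]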
Loop Invariant
• Property of A[1 .. j-1]
At the start of each iteration of the for loop of lines 1-8, the subarray A[1 .. j-1] consists of the elements originally in A[1 .. j-1], but in sorted order.
• Need to establish the following about the invariant:
– Initialization: it is true prior to the first iteration
– Maintenance: if it is true before an iteration, it remains true after the iteration
– Termination: at loop termination, the invariant implies the correctness of the algorithm
Analyzing Algorithms
• Has come to mean predicting the resources that the algorithm requires
• Usually computational time is the resource of primary importance
• Aims to identify the best choice among several alternative algorithms
• Requires an agreed-upon "model" of computation
• We shall use a generic, one-processor, random-access machine (RAM) model of computation
Random-Access Machine
• Instructions are executed one after another (no concurrency)
• Admits the instructions commonly found in "real" computers: arithmetic, data movement, and control
• Uses common data types (integer and floating point)
• Other properties will be discussed as needed
• Care must be taken, since the model of computation has great implications for the resulting analysis
Analysis of Insertion Sort
• The time resource requirement depends on input size
• Input size depends on the problem being studied; frequently, it is the number of items in the input
• Running time: the number of primitive operations or "steps" executed for an input
• Assume a constant amount of time ci for each line i of the pseudocode
Analysis of Insertion Sort
Time efficiency analysis …
[Figure: INSERTION-SORT pseudocode annotated with per-line costs c1..c8 and execution counts]
Best Case Analysis
• The least amount of (time) resource ever needed by the algorithm
• Achieved when the incoming list is already sorted in increasing order
• The inner loop is never iterated
• The cost is given by:
T(n) = c1·n + c2·(n-1) + c4·(n-1) + c5·(n-1) + c8·(n-1)
     = (c1+c2+c4+c5+c8)·n - (c2+c4+c5+c8)
     = an + b
• A linear function of n
Worst Case Analysis
• The greatest amount of (time) resource ever needed by the algorithm
• Achieved when the incoming list is in reverse order
• The inner loop is iterated the maximum number of times, i.e., tj = j
• Therefore, the cost will be:
T(n) = c1·n + c2·(n-1) + c4·(n-1) + c5·(n(n+1)/2 - 1) + c6·(n(n-1)/2) + c7·(n(n-1)/2) + c8·(n-1)
     = (c5/2 + c6/2 + c7/2)·n^2 + (c1 + c2 + c4 + c5/2 - c6/2 - c7/2 + c8)·n - (c2 + c4 + c5 + c8)
     = an^2 + bn + c
• A quadratic function of n
Future Analyses
• For the most part, subsequent analyses will
focus on:
– Worst-case running time
• Upper bound on running time for any input
– Average-case analysis
• Expected running time over all inputs
• Often, worst-case and average-case have the
same “order of growth”
Order of Growth
• Simplifying abstraction: we are interested in the rate of growth, or order of growth, of the running time of the algorithm
• Allows us to compare algorithms without worrying about implementation performance
• Usually only the highest-order term, without its constant coefficient, is kept
• Uses "theta" notation
– The best case of insertion sort is Θ(n)
– The worst case of insertion sort is Θ(n^2)
Designing Algorithms
• Several techniques/patterns for designing algorithms
exist
• Incremental approach: builds the solution one
component at a time
• Divide-and-conquer approach: breaks original problem
into several smaller instances of the same problem
– Results in recursive algorithms
– Easy to analyze complexity using proven techniques
Divide-and-Conquer
• The technique (or paradigm) involves:
– "Divide" stage: Express the problem in terms of several smaller subproblems
– "Conquer" stage: Solve the smaller subproblems by applying the solution recursively – the smallest subproblems may be solved directly
– "Combine" stage: Construct the solution to the original problem from the solutions of the smaller subproblems
Merge Sort Strategy
• Divide stage: Split the n-element sequence into two subsequences of n/2 elements each
• Conquer stage: Recursively sort the two subsequences
• Combine stage: Merge the two sorted subsequences into one sorted sequence (the solution)
[Figure: n (unsorted) splits into two n/2 (unsorted) halves; MERGE SORT each half into two n/2 (sorted) halves; MERGE them into n (sorted)]
Merging Sorted Sequences
• Combines the sorted subarrays A[p..q] and A[q+1..r] into one sorted array A[p..r]
• Makes use of two working arrays L and R, which initially hold copies of the two subarrays
• Makes use of a sentinel value (∞) as the last element of each working array to simplify the logic
• Runs in Θ(n) time overall (the copying loops cost Θ(n), the remaining bookkeeping Θ(1))
Merge Sort Algorithm
[Figure: MERGE-SORT pseudocode annotated with costs Θ(1) for the divide step, T(n/2) for each recursive call, and Θ(n) for MERGE]
T(n) = 2T(n/2) + Θ(n)
Analysis of Merge Sort
Analysis of recursive calls …
[Figure: recursion tree for MERGE-SORT, with cost cn at each of its lg n + 1 levels]
T(n) = cn(lg n + 1)
     = cn·lg n + cn
T(n) is Θ(n lg n)
Chapter 3: Growth of Functions
Overview
• Order of growth of functions provides a simple characterization of efficiency
• Allows for comparison of the relative performance of alternative algorithms
• Concerned with the asymptotic efficiency of algorithms
• The algorithm with the best asymptotic efficiency is usually the best choice, except for small inputs
• Several standard methods simplify the asymptotic analysis of algorithms
Asymptotic Notation
• Applies to functions whose domains are the set of natural numbers:
N = {0, 1, 2, …}
• If a time resource T(n) is being analyzed, the function's range is usually the set of non-negative real numbers:
T(n) ∈ R+
• If a space resource S(n) is being analyzed, the function's range is usually the set of natural numbers:
S(n) ∈ N
Asymptotic Notation
• Depending on the textbook, asymptotic
categories may be expressed in terms of --
a. set membership (our textbook): functions
belong to a family of functions that exhibit
some property; or
b. function property (other textbooks): functions
exhibit the property
• Caveat: we will formally use (a) and
informally use (b)
The Θ-Notation
[Figure: f(n) sandwiched between c1·g(n) and c2·g(n) for all n ≥ n0]
Θ(g(n)) = { f(n) : ∃c1, c2 > 0, n0 > 0 s.t. ∀n ≥ n0: c1·g(n) ≤ f(n) ≤ c2·g(n) }
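A worked membership proof (an illustrative example, not from the slides): to show 2n² + 3n ∈ Θ(n²), exhibit the constants directly:

% Claim: 2n^2 + 3n \in \Theta(n^2).
% Take c_1 = 2, c_2 = 5, n_0 = 1. For all n \ge 1:
\[
  2n^2 \le 2n^2 + 3n \le 2n^2 + 3n^2 = 5n^2 ,
\]
% i.e., c_1 g(n) \le f(n) \le c_2 g(n) with g(n) = n^2.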
The O-Notation
[Figure: f(n) bounded above by c·g(n) for all n ≥ n0]
O(g(n)) = { f(n) : ∃c > 0, n0 > 0 s.t. ∀n ≥ n0: f(n) ≤ c·g(n) }
The Ω-Notation
[Figure: f(n) bounded below by c·g(n) for all n ≥ n0]
Ω(g(n)) = { f(n) : ∃c > 0, n0 > 0 s.t. ∀n ≥ n0: f(n) ≥ c·g(n) }
The o-Notation
[Figure: f(n) eventually falls below c·g(n) for every choice of c, with the crossover point n0 growing as c shrinks]
o(g(n)) = { f(n) : ∀c > 0 ∃n0 > 0 s.t. ∀n ≥ n0: f(n) ≤ c·g(n) }
The ω-Notation
[Figure: f(n) eventually rises above c·g(n) for every choice of c, with the crossover point n0 growing as c grows]
ω(g(n)) = { f(n) : ∀c > 0 ∃n0 > 0 s.t. ∀n ≥ n0: f(n) ≥ c·g(n) }
Comparison of Functions
Transitivity:
• f(n) = O(g(n)) and g(n) = O(h(n)) ⇒ f(n) = O(h(n))
• f(n) = Ω(g(n)) and g(n) = Ω(h(n)) ⇒ f(n) = Ω(h(n))
• f(n) = Θ(g(n)) and g(n) = Θ(h(n)) ⇒ f(n) = Θ(h(n))
Reflexivity:
• f(n) = O(f(n))
• f(n) = Ω(f(n))
• f(n) = Θ(f(n))
Comparison of Functions
Symmetry:
• f(n) = Θ(g(n)) ⇐⇒ g(n) = Θ(f(n))
Transpose symmetry:
• f(n) = O(g(n)) ⇐⇒ g(n) = Ω(f(n))
• f(n) = o(g(n)) ⇐⇒ g(n) = ω(f(n))
Theorem 3.1:
• f(n) = O(g(n)) and f(n) = Ω(g(n)) ⇒ f(n) = Θ(g(n))
Asymptotic Analysis and Limits
[Figure: limit-based tests relating lim f(n)/g(n) to the asymptotic classes]
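The slide content was an image; the standard limit tests it presumably summarized are (assuming the limit exists):

\[
\lim_{n\to\infty}\frac{f(n)}{g(n)} =
\begin{cases}
0 & \Rightarrow\ f(n) = o(g(n)) \\
c,\; 0 < c < \infty & \Rightarrow\ f(n) = \Theta(g(n)) \\
\infty & \Rightarrow\ f(n) = \omega(g(n))
\end{cases}
\]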
Comparison of Functions
• f1(n) = O(g1(n)) and f2(n) = O(g2(n)) ⇒
f1(n) + f2(n) = O(g1(n) + g2(n))
• f(n) = O(g(n)) ⇒ f(n) + g(n) = O(g(n))
Standard Notation and Common Functions
• Monotonicity
A function f(n) is monotonically increasing if m ≤ n implies f(m) ≤ f(n).
A function f(n) is monotonically decreasing if m ≤ n implies f(m) ≥ f(n).
A function f(n) is strictly increasing if m < n implies f(m) < f(n).
A function f(n) is strictly decreasing if m < n implies f(m) > f(n).
Standard Notation and Common Functions
• Floors and ceilings
For any real number x, the greatest integer less than or equal to x is denoted by ⌊x⌋.
For any real number x, the least integer greater than or equal to x is denoted by ⌈x⌉.
For all real numbers x,
x - 1 < ⌊x⌋ ≤ x ≤ ⌈x⌉ < x + 1.
Both functions are monotonically increasing.
Standard Notation and Common Functions
• Exponentials
For all n and a ≥ 1, the function a^n is the exponential function with base a and is monotonically increasing.
• Logarithms
The textbook adopts the following conventions:
lg n = log2 n (binary logarithm),
ln n = loge n (natural logarithm),
lg^k n = (lg n)^k (exponentiation),
lg lg n = lg(lg n) (composition),
lg n + k = (lg n) + k (precedence of lg).
Standard Notation and Common Functions
• Important relationships
For all real constants a and b such that a > 1,
n^b = o(a^n)
that is, any exponential function with a base strictly greater than unity grows faster than any polynomial function.
For all real constants a and b such that a > 0,
lg^b n = o(n^a)
that is, any positive polynomial function grows faster than any polylogarithmic function.
Standard Notation and Common Functions
• Factorials
For all n ≥ 0, the function n! or "n factorial" is given by
n! = n · (n-1) · (n-2) · (n-3) · … · 2 · 1
It can be established that
n! = o(n^n)
n! = ω(2^n)
lg(n!) = Θ(n lg n)
Standard Notation and Common Functions
• Functional iteration
The notation f^(i)(n) represents the function f(n) iteratively applied i times to an initial value of n, or, recursively:
f^(0)(n) = n
f^(i)(n) = f(f^(i-1)(n)) if i > 0
Example:
If f(n) = 2n,
then f^(2)(n) = f(2n) = 2(2n) = 2^2·n
then f^(3)(n) = f(f^(2)(n)) = 2(2^2·n) = 2^3·n
then f^(i)(n) = 2^i·n
Standard Notation and Common Functions
• Iterated logarithm function
The notation lg* n, which reads "log star of n", is defined as
lg* n = min { i ≥ 0 : lg^(i) n ≤ 1 }
Example:
lg* 2 = 1
lg* 4 = 2
lg* 16 = 3
lg* 65536 = 4
lg* 2^65536 = 5
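A direct Python sketch of this definition (illustrative only):

import math

def lg_star(n):
    """Iterated logarithm: how many times lg must be applied before the value drops to <= 1."""
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count

print([lg_star(x) for x in (2, 4, 16, 65536)])  # [1, 2, 3, 4]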
Asymptotic Running Time
of Algorithms
• We consider algorithm A better than algorithm B if
TA(n) = o(TB(n))
• Why is it acceptable to ignore the behavior of
algorithms for small inputs?
• Why is it acceptable to ignore the constants?
• What do we gain by using asymptotic notation?
Things to Remember
• Asymptotic analysis studies how the values of
functions compare as their arguments grow
without bounds.
• Ignores constants and the behavior of the function
for small arguments.
• Acceptable because all algorithms are fast for small
inputs and growth of running time is more important
than constant factors.
Things to Remember
• Ignoring the usually unimportant details, we obtain a
representation that succinctly describes the growth of
a function as its argument grows and thus allows us to
make comparisons between algorithms in terms of
their efficiency.
Chapter 4: Recurrences
Overview
Define what a recurrence is
Discuss three methods of solving recurrences:
Substitution method
Recursion-tree method
Master method
Examples of each method
Definition
A recurrence is an equation or inequality that describes a function in terms of its value on smaller inputs.
Example from MERGE-SORT:
T(n) = Θ(1)             if n = 1
T(n) = 2T(n/2) + Θ(n)   if n > 1
Technicalities
Normally, independent variables assume only integral values.
Example from MERGE-SORT revisited:
T(n) = Θ(1)                         if n = 1
T(n) = T(⌈n/2⌉) + T(⌊n/2⌋) + Θ(n)   if n > 1
For simplicity, floors and ceilings are ignored – often insignificant.
Technicalities
Boundary conditions (small n) are also glossed over:
T(n) = 2T(n/2) + Θ(n)
The value of T(n) is assumed to be a small constant for small n.
Substitution Method
Involves two steps:
1. Guess the form of the solution.
2. Use mathematical induction to find the constants and show the solution works.
Drawback: applicable only in cases where it is easy to guess the solution.
Useful in estimating bounds on the true solution even if the latter is unidentified.
Substitution Method
Example:
T(n) = 2T(⌊n/2⌋) + n
Guess:
T(n) = O(n lg n)
Prove by induction:
T(n) ≤ cn lg n
for suitable c > 0.
Inductive Proof
We'll not worry about the basis case for the moment – we'll choose it as needed. (Note that at n = 1 the bound cn lg n = 0 fails, so the induction must start from a larger base case, e.g., n = 2.)
Inductive hypothesis:
For all values n < k the inequality holds, i.e., T(n) ≤ cn lg n.
We need to show that it holds for n = k as well.
Inductive Proof
In particular, for n = ⌊k/2⌋, the inductive hypothesis should hold, i.e.,
T(⌊k/2⌋) ≤ c⌊k/2⌋ lg ⌊k/2⌋
The recurrence gives us:
T(k) = 2T(⌊k/2⌋) + k
Substituting the inequality above yields:
T(k) ≤ 2[c⌊k/2⌋ lg ⌊k/2⌋] + k
Inductive Proof
Because of the non-decreasing nature of the functions involved, we can drop the floors and obtain:
T(k) ≤ 2[c(k/2) lg(k/2)] + k
which simplifies to:
T(k) ≤ ck(lg k - lg 2) + k
Or, since lg 2 = 1, we have:
T(k) ≤ ck lg k - ck + k = ck lg k + (1 - c)k
So if c ≥ 1, T(k) ≤ ck lg k. Q.E.D.
Recursion-Tree Method
A straightforward technique for coming up with a good guess
Can help the Substitution Method
Recursion tree: a visual representation of the recursive call hierarchy, where each node represents the cost of a single subproblem
Recursion-Tree Method
T(n) = 3T(⌊n/4⌋) + Θ(n^2)
[Figure: successive expansions of the recursion tree for this recurrence]
Recursion-Tree Method
Gathering all the costs together:
T(n) = Σ_{i=0}^{log_4 n - 1} (3/16)^i cn^2 + Θ(n^{log_4 3})
T(n) ≤ Σ_{i=0}^{∞} (3/16)^i cn^2 + o(n)
T(n) ≤ (1/(1 - 3/16)) cn^2 + o(n)
T(n) ≤ (16/13) cn^2 + o(n)
T(n) = O(n^2)
Recursion-Tree Method
T(n) = T(n/3) + T(2n/3) + O(n)
[Figure: recursion tree; the longest root-to-leaf path has length about log_{3/2} n]
Recursion-Tree Method
An overestimate of the total cost:
T(n) = Σ_{i=0}^{log_{3/2} n - 1} cn + Θ(n^{log_{3/2} 2})
Counter-indications:
T(n) = O(n lg n) + ω(n lg n)
Notwithstanding this, use as the "guess":
T(n) = O(n lg n)
Substitution Method
Recurrence:
T(n) = T(n/3) + T(2n/3) + cn
Guess:
T(n) = O(n lg n)
Prove by induction:
T(n) ≤ dn lg n
for suitable d > 0 (we already use c).
Inductive Proof
Again, we'll not worry about the basis case.
Inductive hypothesis:
For all values n < k the inequality holds, i.e., T(n) ≤ dn lg n.
We need to show that this holds for n = k as well.
In particular, for n = k/3 and n = 2k/3, the inductive hypothesis should hold…
Inductive Proof
That is:
T(k/3) ≤ d(k/3) lg(k/3)
T(2k/3) ≤ d(2k/3) lg(2k/3)
The recurrence gives us:
T(k) = T(k/3) + T(2k/3) + ck
Substituting the inequalities above yields:
T(k) ≤ [d(k/3) lg(k/3)] + [d(2k/3) lg(2k/3)] + ck
Inductive Proof
Expanding, we get:
T(k) ≤ [d(k/3) lg k - d(k/3) lg 3] + [d(2k/3) lg k - d(2k/3) lg(3/2)] + ck
Rearranging, we get:
T(k) ≤ dk lg k - d[(k/3) lg 3 + (2k/3) lg(3/2)] + ck
T(k) ≤ dk lg k - dk[lg 3 - 2/3] + ck
When d ≥ c/(lg 3 - (2/3)), we have the desired:
T(k) ≤ dk lg k
Master Method
Provides a "cookbook" method for solving recurrences.
The recurrence must be of the form:
T(n) = aT(n/b) + f(n)
where a ≥ 1 and b > 1 are constants and f(n) is an asymptotically positive function.
Master Method
Theorem 4.1:
Given the recurrence previously defined, we have:
1. If f(n) = O(n^{log_b a - ε}) for some constant ε > 0, then T(n) = Θ(n^{log_b a})
2. If f(n) = Θ(n^{log_b a}), then T(n) = Θ(n^{log_b a} lg n)
Master Method
3. If f(n) = Ω(n^{log_b a + ε}) for some constant ε > 0, and if
a·f(n/b) ≤ c·f(n)
for some constant c < 1 and all sufficiently large n, then T(n) = Θ(f(n))
Example
Estimate bounds on the following recurrence:
[Figure: the example recurrence, given as an image in the original slides]
Use the recursion tree method to arrive at a "guess", then verify using induction.
Point out which case of the Master Method this falls in.
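As a worked illustration, the theorem applies immediately to the MERGE-SORT recurrence:

% T(n) = 2T(n/2) + \Theta(n): here a = 2, b = 2, so n^{\log_b a} = n^{\log_2 2} = n.
% f(n) = \Theta(n) matches case 2, hence
\[
  T(n) = \Theta\!\left(n^{\log_2 2} \lg n\right) = \Theta(n \lg n).
\]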
Recursion Tree
The recurrence produces the following tree:
[Figure: recursion tree for the example recurrence]
Cost Summation
Collecting the level-by-level costs:
[Figure: level-by-level cost sums]
A geometric series with base less than one converges to a finite sum; hence, T(n) = Θ(n^2).
Exact Calculation
If an exact solution is preferred:
[Figure: derivation using the formula for a partial geometric series]
Master Theorem (Simplified)
[Figure: simplified statement of the Master Theorem]
Divide and Conquer
Recursive in structure:
Divide the problem into sub-problems that are similar to the original but smaller in size.
Conquer the sub-problems by solving them recursively. If they are small enough, just solve them in a straightforward manner.
Combine the solutions to create a solution to the original problem.
An Example: Merge Sort
Sorting Problem: Sort a sequence of n elements into non-decreasing order.
Divide: Divide the n-element sequence to be sorted into two subsequences of n/2 elements each.
Conquer: Sort the two subsequences recursively using merge sort.
Combine: Merge the two sorted subsequences to produce the sorted answer.
Merge Sort – Example
[Figure: full divide-and-merge trace on the original sequence 18 26 32 6 43 15 9 1 – the sequence is halved repeatedly down to single elements, then merged back up through 6 18 26 32 and 1 9 15 43 to the sorted sequence 1 6 9 15 18 26 32 43]
Merge-Sort (A, p, r)
INPUT: a sequence of n numbers stored in array A
OUTPUT: an ordered sequence of n numbers
MergeSort (A, p, r) // sort A[p..r] by divide & conquer
1 if p < r
2    then q ← ⌊(p+r)/2⌋
3         MergeSort (A, p, q)
4         MergeSort (A, q+1, r)
5         Merge (A, p, q, r) // merges A[p..q] with A[q+1..r]
Initial Call: MergeSort(A, 1, n)
Procedure Merge
Merge(A, p, q, r)
1  n1 ← q – p + 1
2  n2 ← r – q
3  for i ← 1 to n1
4      do L[i] ← A[p + i – 1]
5  for j ← 1 to n2
6      do R[j] ← A[q + j]
7  L[n1 + 1] ← ∞
8  R[n2 + 1] ← ∞
9  i ← 1
10 j ← 1
11 for k ← p to r
12     do if L[i] ≤ R[j]
13           then A[k] ← L[i]
14                i ← i + 1
15           else A[k] ← R[j]
16                j ← j + 1
Input: Array containing sorted subarrays A[p..q] and A[q+1..r].
Output: Merged sorted subarray in A[p..r].
The sentinels (∞) avoid having to check whether either subarray is fully copied at each step.
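For reference, a minimal runnable Python transcription of MERGE and MERGE-SORT (0-based indexing rather than the pseudocode's 1-based):

import math

def merge(A, p, q, r):
    """Merge sorted A[p..q] and A[q+1..r] (inclusive) using infinity sentinels."""
    L = A[p:q + 1] + [math.inf]       # left run plus sentinel
    R = A[q + 1:r + 1] + [math.inf]   # right run plus sentinel
    i = j = 0
    for k in range(p, r + 1):
        if L[i] <= R[j]:              # the sentinel guarantees neither run underflows
            A[k] = L[i]; i += 1
        else:
            A[k] = R[j]; j += 1

def merge_sort(A, p, r):
    if p < r:
        q = (p + r) // 2
        merge_sort(A, p, q)
        merge_sort(A, q + 1, r)
        merge(A, p, q, r)

data = [18, 26, 32, 6, 43, 15, 9, 1]
merge_sort(data, 0, len(data) - 1)
print(data)  # [1, 6, 9, 15, 18, 26, 32, 43]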
Merge – Example
[Figure: step-by-step merge of sorted runs L = ⟨6, 8, 26, 32, ∞⟩ and R = ⟨1, 9, 42, 43, ∞⟩ into A[p..r] = ⟨1, 6, 8, 9, 26, 32, 42, 43⟩, showing the movement of the indices i, j, and k]
Merge – Loop Invariant
Loop invariant for the for loop of lines 11-16. At the start of each iteration:
Subarray A[p..k – 1] contains the k – p smallest elements of L and R, in sorted order.
L[i] and R[j] are the smallest elements of L and R that have not been copied back into A.
Initialization:
Before the first iteration:
• A[p..k – 1] is empty.
• i = j = 1.
• L[1] and R[1] are the smallest elements of L and R not copied to A.
Merge – Loop Invariant
Maintenance:
Case 1: L[i] ≤ R[j]
• By the LI, A contains the k – p smallest elements of L and R, in sorted order.
• By the LI, L[i] and R[j] are the smallest elements of L and R not yet copied into A.
• Line 13 results in A containing the k – p + 1 smallest elements (again in sorted order).
• Incrementing i and k reestablishes the LI for the next iteration.
Similarly for L[i] > R[j].
Termination:
• On termination, k = r + 1.
• By the LI, A contains the r – p + 1 smallest elements of L and R, in sorted order.
• L and R together contain r – p + 3 elements. All but the two sentinels have been copied back into A.
Analysis of Merge Sort
Running time T(n) of Merge Sort:
Divide: computing the middle takes Θ(1)
Conquer: solving 2 subproblems takes 2T(n/2)
Combine: merging n elements takes Θ(n)
Total:
T(n) = Θ(1)             if n = 1
T(n) = 2T(n/2) + Θ(n)   if n > 1
⇒ T(n) = Θ(n lg n) (CLRS, Chapter 4)
Recurrences – I
Recurrence Relations
An equation or an inequality that characterizes a function by its values on smaller inputs.
Solution Methods (Chapter 4):
Substitution Method.
Recursion-tree Method.
Master Method.
Recurrence relations arise when we analyze the running time of iterative or recursive algorithms.
Ex: Divide and Conquer.
T(n) = Θ(1)                     if n ≤ c
T(n) = a·T(n/b) + D(n) + C(n)   otherwise
Substitution Method
Guess the form of the solution, then use mathematical induction to show it correct.
Substitute the guessed answer for the function when the inductive hypothesis is applied to smaller values – hence the name.
Works well when the solution is easy to guess.
There is no general way to guess the correct solution.
Example – Exact Function
Recurrence:
T(n) = 1             if n = 1
T(n) = 2T(n/2) + n   if n > 1
Guess: T(n) = n lg n + n.
Induction:
• Basis: n = 1 ⇒ n lg n + n = 1 = T(n).
• Hypothesis: T(k) = k lg k + k for all k < n.
• Inductive Step: T(n) = 2T(n/2) + n
= 2((n/2) lg(n/2) + (n/2)) + n
= n lg(n/2) + 2n
= n lg n – n + 2n
= n lg n + n
Recursion-tree Method
Making a good guess is sometimes difficult with the substitution method.
Use recursion trees to devise good guesses.
Recursion Trees:
Show successive expansions of recurrences using trees.
Keep track of the time spent on the subproblems of a divide and conquer algorithm.
Help organize the algebraic bookkeeping necessary to solve a recurrence.
Recursion Tree – Example
Running time of Merge Sort:
T(n) = Θ(1)             if n = 1
T(n) = 2T(n/2) + Θ(n)   if n > 1
Rewrite the recurrence as
T(n) = c                if n = 1
T(n) = 2T(n/2) + cn     if n > 1
c > 0: the running time for the base case, and the time per array element for the divide and combine steps.
Recursion Tree for Merge Sort
For the original problem, we have a cost of cn, plus two subproblems, each of size n/2 and running time T(n/2).
[Figure: root cn with children T(n/2) and T(n/2)]
Each of the size-n/2 problems has a cost of cn/2 plus two subproblems, each costing T(n/4).
[Figure: root cn; children cn/2, cn/2; grandchildren T(n/4) × 4 – the upper nodes are the cost of divide and merge, the leaves the cost of sorting the subproblems]
Recursion Tree for Merge Sort
Continue expanding until the problem size reduces to 1.
[Figure: fully expanded tree of depth lg n – levels of cost cn; cn/2 + cn/2; four cn/4; …; n leaves of cost c; each level totals cn]
Total: cn lg n + cn
Recursion Tree for Merge Sort
Continue expanding until the problem size reduces to 1.
• Each level has total cost cn.
• Each time we go down one level, the number of subproblems doubles, but the cost per subproblem halves ⇒ the cost per level remains the same.
• There are lg n + 1 levels; the height is lg n. (Assuming n is a power of 2.)
• Can be proved by induction.
• Total cost = sum of costs at each level = (lg n + 1)cn = cn lg n + cn = Θ(n lg n).
Other Examples
Use the recursion-tree method to determine a guess for the recurrences:
T(n) = 3T(⌊n/4⌋) + Θ(n^2).
T(n) = T(n/3) + T(2n/3) + O(n).
Recursion Trees – Caution Note
Recursion trees only generate guesses.
Verify guesses using the substitution method.
A small amount of "sloppiness" can be tolerated. Why?
If one is careful when drawing out a recursion tree and summing the costs, it can be used as a direct proof.
The Master Method
Based on the Master theorem.
"Cookbook" approach for solving recurrences of the form
T(n) = aT(n/b) + f(n)
• a ≥ 1, b > 1 are constants.
• f(n) is asymptotically positive.
• n/b may not be an integer, but we ignore floors and ceilings. Why?
Requires memorization of three cases.
The Master Theorem
Theorem 4.1
Let a ≥ 1 and b > 1 be constants, let f(n) be a function, and let T(n) be defined on the nonnegative integers by the recurrence T(n) = aT(n/b) + f(n), where we can replace n/b by ⌊n/b⌋ or ⌈n/b⌉. T(n) can be bounded asymptotically in three cases:
1. If f(n) = O(n^{log_b a – ε}) for some constant ε > 0, then T(n) = Θ(n^{log_b a}).
2. If f(n) = Θ(n^{log_b a}), then T(n) = Θ(n^{log_b a} lg n).
3. If f(n) = Ω(n^{log_b a + ε}) for some constant ε > 0, and if, for some constant c < 1 and all sufficiently large n, we have a·f(n/b) ≤ c·f(n), then T(n) = Θ(f(n)).
We'll return to recurrences as we need them…
Heap Sort Algorithm
Special Types of Trees
• Def: Full binary tree = a binary tree in which each node is either a leaf or has degree exactly 2.
• Def: Complete binary tree = a binary tree in which all leaves are on the same level and all internal nodes have degree 2.
[Figures: an example full binary tree and an example complete binary tree]
Definitions
• Height of a node = the number of edges on the longest simple path from the node down to a leaf
• Level of a node = the length of the path from the root to the node
• Height of the tree = height of the root node
[Figure: example tree with height of root = 3, height of node (2) = 1, level of node (10) = 2]
Useful Properties
• A binary tree has at most 2^l nodes at level l.
• A binary tree of height d has at most n = Σ_{l=0}^{d} 2^l = 2^{d+1} − 1 nodes; consequently, the height of a complete binary tree with n nodes is Θ(lg n).
[Figure: the example tree again, annotated with heights and levels]
(see Ex 6.1-2, page 129)
The Heap Data Structure
• Def: A heap is a nearly complete binary tree with the following two properties:
– Structural property: all levels are full, except possibly the last one, which is filled from left to right
– Order (heap) property: for any node x, Parent(x) ≥ x
[Figure: example max-heap]
From the heap property, it follows that: "The root is the maximum element of the heap!"
A heap is a binary tree that is filled in level order, from left to right.
Array Representation of Heaps
• A heap can be stored as an array A:
– Root of the tree is A[1]
– Left child of A[i] = A[2i]
– Right child of A[i] = A[2i + 1]
– Parent of A[i] = A[⌊i/2⌋]
– heap-size[A] ≤ length[A]
• The elements in the subarray A[(⌊n/2⌋+1) .. n] are leaves
Heap Types
• Max-heaps (largest element at root), have the
max-heap property:
– for all nodes i, excluding the root:
A[PARENT(i)] ≥ A[i]
• Min-heaps (smallest element at root), have the
min-heap property:
– for all nodes i, excluding the root:
A[PARENT(i)] ≤ A[i]
Adding/Deleting Nodes
• New nodes are always inserted at the bottom
level (left to right)
• Nodes are removed from the bottom level
(right to left)
Operations on Heaps
• Maintain/Restore the max-heap property
– MAX-HEAPIFY
• Create a max-heap from an unordered array
– BUILD-MAX-HEAP
• Sort an array in place
– HEAPSORT
• Priority queues
Maintaining the Heap Property
• Suppose a node is smaller than a child
– Left and Right subtrees of i are max-heaps
• To eliminate the violation:
– Exchange with larger child
– Move down the tree
– Continue until node is not smaller than
children
Example
MAX-HEAPIFY(A, 2, 10)
[Figures: A[2] violates the heap property; exchange A[2] ↔ A[4]; then A[4] violates the heap property; exchange A[4] ↔ A[9]; heap property restored]
Maintaining the Heap Property
• Assumptions:
– The left and right subtrees of i are max-heaps
– A[i] may be smaller than its children
Alg: MAX-HEAPIFY(A, i, n)
1. l ← LEFT(i)
2. r ← RIGHT(i)
3. if l ≤ n and A[l] > A[i]
4.    then largest ← l
5.    else largest ← i
6. if r ≤ n and A[r] > A[largest]
7.    then largest ← r
8. if largest ≠ i
9.    then exchange A[i] ↔ A[largest]
10.        MAX-HEAPIFY(A, largest, n)
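A 0-based Python sketch of the same procedure (the children of index i become 2i+1 and 2i+2):

def max_heapify(A, i, n):
    """Sift A[i] down until the subtree rooted at i is a max-heap; n is the heap size."""
    l, r = 2 * i + 1, 2 * i + 2
    largest = l if l < n and A[l] > A[i] else i
    if r < n and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        max_heapify(A, largest, n)   # continue sifting down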
MAX-HEAPIFY Running Time
• Intuitively: the element may travel down a path whose length is at most the height of the heap
• Running time of MAX-HEAPIFY is O(lg n)
• Can be written in terms of the height h of the heap, as O(h)
– since the height of the heap is ⌊lg n⌋
Building a Heap
Alg: BUILD-MAX-HEAP(A)
1. n = length[A]
2. for i ← ⌊n/2⌋ downto 1
3.    do MAX-HEAPIFY(A, i, n)
• Converts an array A[1 … n] into a max-heap (n = length[A])
• The elements in the subarray A[(⌊n/2⌋+1) .. n] are leaves
• Apply MAX-HEAPIFY to the elements between 1 and ⌊n/2⌋
[Figure: array A = 4 1 3 2 16 9 10 14 8 7 and its binary-tree view]
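Continuing the 0-based Python sketch from above:

def build_max_heap(A):
    """Convert list A into a max-heap in place (0-based BUILD-MAX-HEAP)."""
    n = len(A)
    for i in range(n // 2 - 1, -1, -1):   # internal nodes, bottom-up
        max_heapify(A, i, n)

A = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
build_max_heap(A)
print(A)  # [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]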
Example: A = 4 1 3 2 16 9 10 14 8 7
[Figure: trace of BUILD-MAX-HEAP for i = 5, 4, 3, 2, 1, ending with the max-heap 16 14 10 8 7 9 3 2 4 1]
Running Time of BUILD-MAX-HEAP
Alg: BUILD-MAX-HEAP(A)
1. n = length[A]
2. for i ← ⌊n/2⌋ downto 1        O(n) iterations
3.    do MAX-HEAPIFY(A, i, n)    O(lg n) each
⇒ Running time: O(n lg n)
• This is not an asymptotically tight upper bound
Running Time of BUILD-MAX-HEAP
• HEAPIFY takes O(h) ⇒ the cost of HEAPIFY on a node i is proportional to the height of node i in the tree
Level    Height           No. of nodes
i = 0    h0 = 3 (= lg n)  2^0
i = 1    h1 = 2           2^1
i = 2    h2 = 1           2^2
i = 3    h3 = 0           2^3
hi = h – i: height of the nodes at level i
ni = 2^i:   number of nodes at level i
T(n) = Σ_i ni·hi = Σ_{i=0}^{h} 2^i (h – i) = O(n)
Running Time of BUILD-MAX-HEAP
T(n) = Σ_{i=0}^{h} ni·hi                       Cost of HEAPIFY at level i × number of nodes at that level
     = Σ_{i=0}^{h} 2^i (h – i)                 Replace the values of ni and hi computed before
     = Σ_{i=0}^{h} (h – i) · 2^h / 2^{h–i}     Multiply and divide by 2^h, writing 2^i as 2^h / 2^{h–i}
     = 2^h · Σ_{k=0}^{h} k / 2^k               Change variables: k = h – i
     ≤ n · Σ_{k=0}^{∞} k / 2^k                 The sum is at most the infinite series, and 2^h ≤ n (h = ⌊lg n⌋)
     = O(n)                                    The series Σ_{k≥0} k/2^k converges (to 2)
Running time of BUILD-MAX-HEAP: T(n) = O(n)
Heapsort
• Goal:
– Sort an array using heap representations
• Idea:
– Build a max-heap from the array
– Swap the root (the maximum element) with the last
element in the array
– “Discard” this last node by decreasing the heap size
– Call MAX-HEAPIFY on the new root
– Repeat this process until only one node remains
Example: A = [7, 4, 3, 1, 2]
[Figure: successive root-and-last swaps followed by MAX-HEAPIFY(A, 1, 4), MAX-HEAPIFY(A, 1, 3), MAX-HEAPIFY(A, 1, 2), MAX-HEAPIFY(A, 1, 1), shrinking the heap until the array is sorted]
Alg: HEAPSORT(A)
1. BUILD-MAX-HEAP(A)               O(n)
2. for i ← length[A] downto 2      n-1 times
3.    do exchange A[1] ↔ A[i]
4.       MAX-HEAPIFY(A, 1, i - 1)  O(lg n)
• Running time: O(n lg n) – can be shown to be Θ(n lg n)
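A runnable Python sketch built on the max_heapify/build_max_heap sketches above:

def heapsort(A):
    """In-place heapsort: repeatedly move the max to the end and re-heapify the prefix."""
    build_max_heap(A)
    for end in range(len(A) - 1, 0, -1):
        A[0], A[end] = A[end], A[0]   # move the current max to its final slot
        max_heapify(A, 0, end)        # restore the heap property on A[0..end-1]

A = [7, 4, 3, 1, 2]
heapsort(A)
print(A)  # [1, 2, 3, 4, 7]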
Priority Queues
Operations on Priority Queues
• Max-priority queues support the following operations:
– INSERT(S, x): inserts element x into set S
– EXTRACT-MAX(S): removes and returns the element of S with the largest key
– MAXIMUM(S): returns the element of S with the largest key
– INCREASE-KEY(S, x, k): increases the value of element x's key to k (assume k ≥ x's current key value)
HEAP-MAXIMUM
Goal:
– Return the largest element of the heap
Alg: HEAP-MAXIMUM(A)
1. return A[1]
Running time: O(1)
[Figure: example heap A; Heap-Maximum(A) returns 7]
HEAP-EXTRACT-MAX
Goal:
– Extract the largest element of the heap (i.e., return the max value and also remove that element from the heap)
Idea:
– Exchange the root element with the last
– Decrease the size of the heap by 1 element
– Call MAX-HEAPIFY on the new root, on a heap of size n-1
[Figure: heap A – the root is the largest element]
Example: HEAP-EXTRACT-MAX
[Figure: max = 16 is returned; the last element (1) replaces the root; the heap size decreases by 1; MAX-HEAPIFY(A, 1, n-1) sifts 1 down, leaving 14 at the root]
HEAP-EXTRACT-MAX
Alg: HEAP-EXTRACT-MAX(A, n)
1. if n < 1
2.    then error "heap underflow"
3. max ← A[1]
4. A[1] ← A[n]
5. MAX-HEAPIFY(A, 1, n-1)   // remakes the heap
6. return max
Running time: O(lg n)
HEAP-INCREASE-KEY
• Goal:
– Increase the key of an element i in the heap
• Idea:
– Increment the key of A[i] to its new value
– If the max-heap property no longer holds: traverse a path toward the root to find the proper place for the newly increased key
[Figure: node i with Key[i] ← 15 in the example heap]
Example: HEAP-INCREASE-KEY
[Figure: Key[i] ← 15 at node i (initially 4); 15 is exchanged upward past 8 and then 14 until its parent (16) is larger, restoring the max-heap]
HEAP-INCREASE-KEY
Alg: HEAP-INCREASE-KEY(A, i, key)
1. if key < A[i]
2.    then error "new key is smaller than current key"
3. A[i] ← key
4. while i > 1 and A[PARENT(i)] < A[i]
5.    do exchange A[i] ↔ A[PARENT(i)]
6.       i ← PARENT(i)
• Running time: O(lg n)
MAX-HEAP-INSERT
• Goal:
– Insert a new element into a max-heap
• Idea:
– Expand the max-heap with a new element whose key is -∞
– Call HEAP-INCREASE-KEY to set the key of the new node to its correct value and maintain the max-heap property
[Figure: a new -∞ leaf is added, then its key is increased to 15]
Example: MAX-HEAP-INSERT
[Figure: insert value 15 – start by inserting -∞ as a new leaf; increase the key to 15 by calling HEAP-INCREASE-KEY on A[11]; 15 bubbles up past 7 and 14; the restored heap contains the newly added element]
MAX-HEAP-INSERT
Alg: MAX-HEAP-INSERT(A, key, n)
1. heap-size[A] ← n + 1
2. A[n + 1] ← -∞
3. HEAP-INCREASE-KEY(A, n + 1, key)
Running time: O(lg n)
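A 0-based Python sketch of the two priority-queue operations (the parent of index i becomes (i-1)//2):

import math

def heap_increase_key(A, i, key):
    """0-based HEAP-INCREASE-KEY: raise A[i] to key, then bubble it up toward the root."""
    if key < A[i]:
        raise ValueError("new key is smaller than current key")
    A[i] = key
    parent = (i - 1) // 2
    while i > 0 and A[parent] < A[i]:
        A[i], A[parent] = A[parent], A[i]
        i, parent = parent, (parent - 1) // 2

def max_heap_insert(A, key):
    """0-based MAX-HEAP-INSERT: grow the heap with a -inf leaf, then increase its key."""
    A.append(-math.inf)
    heap_increase_key(A, len(A) - 1, key)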
Summary
• We can perform the following operations on heaps:
– MAX-HEAPIFY         O(lg n)
– BUILD-MAX-HEAP      O(n)
– HEAP-SORT           O(n lg n)
– MAX-HEAP-INSERT     O(lg n)
– HEAP-EXTRACT-MAX    O(lg n)
– HEAP-INCREASE-KEY   O(lg n)
– HEAP-MAXIMUM        O(1)
Ch. 7 - QuickSort
Quick but not Guaranteed
Another Divide-and-Conquer sorting algorithm…
As it turns out, MERGESORT and HEAPSORT, although O(n lg n) in their time complexity, have fairly large constants and tend to move data around more than desirable (e.g., equal-key items may not maintain their relative position from input to output).
We introduce another algorithm with better constants, but a flaw: its worst case is O(n^2). Fortunately, the worst case is "rare enough" that the speed advantages win an overwhelming amount of the time… and it is O(n lg n) on average.
Like MERGESORT, we use Divide-and-Conquer:
1. Divide: partition A[p..r] into two subarrays A[p..q-1] and A[q+1..r] such that each element of A[p..q-1] is ≤ A[q], and each element of A[q+1..r] is ≥ A[q]. Compute q as part of this partitioning.
2. Conquer: sort the subarrays A[p..q-1] and A[q+1..r] by recursive calls to QUICKSORT.
3. Combine: the partitioning and recursive sorting leave us with a sorted A[p..r] – no work needed here.
An obvious difference is that we do most of the work in the divide stage, with no work at the combine one.
The Pseudo-Code
[Figures: QUICKSORT(A, p, r) and PARTITION(A, p, r) pseudocode, with the pivot chosen as x = A[r]]
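The pseudocode appeared only as an image; a minimal 0-based Python transcription of the standard CLRS QUICKSORT and PARTITION is:

def partition(A, p, r):
    """Partition A[p..r] around the pivot x = A[r]; return the pivot's final index."""
    x = A[r]
    i = p - 1
    for j in range(p, r):          # the loop the correctness proof below (l. 3-6) refers to
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1

def quicksort(A, p, r):
    if p < r:
        q = partition(A, p, r)
        quicksort(A, p, q - 1)
        quicksort(A, q + 1, r)

data = [2, 8, 7, 1, 3, 5, 6, 4]
quicksort(data, 0, len(data) - 1)
print(data)  # [1, 2, 3, 4, 5, 6, 7, 8]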
Proof of Correctness: PARTITION
We look for a loop invariant, and we observe that at the beginning of each iteration of the loop (l. 3-6), for any array index k:
1. If p ≤ k ≤ i, then A[k] ≤ x;
2. If i+1 ≤ k ≤ j-1, then A[k] > x;
3. If k = r, then A[k] = x;
4. If j ≤ k ≤ r-1, then we don't know anything about A[k].
The Invariant
• Initialization. Before the first iteration: i = p-1, j = p. There are no values between p and i, and no values between i+1 and j-1, so the first two conditions are trivially satisfied; the initial assignment satisfies 3.
• Maintenance. Two cases:
– 1. A[j] > x: the loop simply increments j.
– 2. A[j] ≤ x: the loop increments i, exchanges A[i] and A[j], and increments j.
The Invariant
• Termination. j = r. Every entry in the array is in one of the three sets described by the invariant: we have partitioned the values in the array into three sets – less than or equal to x, greater than x, and a singleton containing x.
The running time of PARTITION on A[p..r] is Θ(n), where n = r – p + 1.
QUICKSORT: Performance – a quick look.
• We first look at (apparent) worst-case partitioning:
T(n) = T(n-1) + T(0) + Θ(n) = T(n-1) + Θ(n).
It is easy to show – using substitution – that T(n) = Θ(n^2).
• We next look at (apparent) best-case partitioning:
T(n) = 2T(n/2) + Θ(n).
It is also easy to show (case 2 of the Master Theorem) that T(n) = Θ(n lg n).
• Since the disparity between the two is substantial, we need to look further…
QUICKSORT: Performance – Balanced Partitioning
[Figure: recursion tree for a constant-proportion split, showing Θ(lg n) depth with O(n) cost per level, for O(n lg n) total]
QUICKSORT: Performance – the Average Case
As long as the number of "good splits" is bounded below by a fixed percentage of all the splits, we maintain logarithmic depth and so O(n lg n) time complexity.
QUICKSORT: Performance – Randomized QUICKSORT
We would like to ensure that the choice of pivot does not critically impair the performance of the sorting algorithm – the discussion to this point would indicate that randomizing the choice of the pivot should provide us with good behavior (if at all possible with the data-set we are trying to sort). We introduce:
[Figure: RANDOMIZED-PARTITION pseudocode]
And the recursive procedure becomes:
[Figure: RANDOMIZED-QUICKSORT pseudocode]
Every call to RANDOMIZED-PARTITION has introduced the (constant) extra overhead of a call to RANDOM.
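A sketch of the standard randomized variant, reusing the partition() sketch above:

import random

def randomized_partition(A, p, r):
    """Swap a uniformly random element into position r, then partition as before."""
    i = random.randint(p, r)
    A[i], A[r] = A[r], A[i]
    return partition(A, p, r)   # the partition() sketch from above

def randomized_quicksort(A, p, r):
    if p < r:
        q = randomized_partition(A, p, r)
        randomized_quicksort(A, p, q - 1)
        randomized_quicksort(A, q + 1, r)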
QUICKSORT: Performance – Rigorous Worst-Case Analysis
Since we do not, a priori, have any idea of what the splits of the subarrays will be, we have to represent a possible "worst case" (we already have an O(n^2) bound from the "bad split" example – so it could be worse… although we hope not). The worst case leads to the recurrence
T(n) = max_{0≤q≤n-1} (T(q) + T(n – q - 1)) + Θ(n),
where we remember that the pivot does not appear at the next level (down) of the recursion.
QUICKSORT: Performance – Rigorous Worst-Case Analysis
We have to come up with a "guess", and the basis for the guess is our likely "bad split" case: it tells us we cannot hope for any better than Ω(n^2). So we just hope it is no worse… Guess T(n) ≤ cn^2 for some c > 0 and start doing algebra for the induction:
T(n) ≤ max_{0≤q≤n-1} (T(q) + T(n – q - 1)) + Θ(n)
     ≤ max_{0≤q≤n-1} (cq^2 + c(n – q - 1)^2) + Θ(n).
Differentiating cq^2 + c(n – q - 1)^2 twice with respect to q gives 4c > 0 for all values of q.
QUICKSORT: Performance – Rigorous Worst-Case Analysis
Since the expression represents a quadratic curve, concave up, it reaches its maximum at one of the endpoints q = 0 and q = n – 1. Evaluating, we find:
max_{0≤q≤n-1} (cq^2 + c(n – q - 1)^2) + Θ(n)
  ≤ c·max_{0≤q≤n-1} (q^2 + (n – q - 1)^2) + Θ(n)
  ≤ c(n – 1)^2 + Θ(n) = cn^2 – 2cn + c + Θ(n) ≤ cn^2
by choosing c large enough to overcome the positive constant in Θ(n).
QUICKSORT: Performance – Expected Run Time
Understanding partitioning:
1. Each time PARTITION is called, it selects a pivot element, and this pivot element is never included in successive calls: the total number of calls to PARTITION is at most n.
2. Each call to PARTITION costs O(1) plus an amount of time proportional to the number of iterations of the for loop.
3. Each iteration of the for loop (in line 4) performs a comparison, comparing the pivot to another element in A.
4. We need to count the number of times line 4 is executed.
QUICKSORT: Performance – Expected Run Time
Lemma 7.1. Let X be the number of comparisons performed in line 4 of PARTITION over the entire execution of QUICKSORT on an n-element array. Then the running time of QUICKSORT is O(n + X).
Proof: the observations on the previous slide.
We need to find X, the total number of comparisons performed over all calls to PARTITION.
QUICKSORT: Performance – Expected Run Time
1. Rename the elements of A as z1, z2, …, zn, so that zi is the ith smallest element of A.
2. Define the set Zij = {zi, zi+1, …, zj}.
3. Question: when does the algorithm compare zi and zj?
4. Answer: at most once – notice that all elements in every (sub)array are compared to the pivot once, and will never be compared to the pivot again (since the pivot is removed from the recursion).
5. Define Xij = I{zi is compared to zj}, the indicator variable of this event. Comparisons are counted over the full run of the algorithm.
QUICKSORT: Performance – Expected Run Time
6. Since each pair is compared at most once, we can write:
X = Σ_{i=1}^{n-1} Σ_{j=i+1}^{n} Xij
7. Taking expectations of both sides:
E[X] = E[ Σ_{i=1}^{n-1} Σ_{j=i+1}^{n} Xij ] = Σ_{i=1}^{n-1} Σ_{j=i+1}^{n} E[Xij] = Σ_{i=1}^{n-1} Σ_{j=i+1}^{n} Pr{zi is compared to zj}.
8. We need to compute Pr{zi is compared to zj}.
9. We will assume all zi and zj are distinct.
10. For any pair zi, zj, once a pivot x is chosen so that zi < x < zj, zi and zj will never be compared again (why?).
QUICKSORT: Performance – Expected Run Time
11. If zi is chosen as a pivot before any other item in Zij, then zi will be compared to every other item in Zij.
12. The same holds for zj.
13. zi and zj are compared if and only if the first element to be chosen as a pivot from Zij is either zi or zj.
14. What is that probability? Until a point of Zij is chosen as a pivot, the whole of Zij is in the same partition, so every element of Zij is equally likely to be the first one chosen as a pivot.
QUICKSORT: Performance – Expected Run Time
15. Because Zij has j – i + 1 elements, and because pivots are chosen randomly and independently, the probability that any given element is the first one chosen as a pivot is 1/(j-i+1). It follows that:
16. Pr{zi is compared to zj}
= Pr{zi or zj is first pivot chosen from Zij}
= Pr{zi is first pivot chosen from Zij} + Pr{zj is first pivot chosen from Zij}
= 1/(j-i+1) + 1/(j-i+1) = 2/(j-i+1).
QUICKSORT: Performance – Expected Run Time
17. Replacing the right-hand side in 7, and grinding through some algebra:
E[X] = Σ_{i=1}^{n-1} Σ_{j=i+1}^{n} 2/(j - i + 1)
     = Σ_{i=1}^{n-1} Σ_{k=1}^{n-i} 2/(k + 1)     (substituting k = j - i)
     < Σ_{i=1}^{n-1} Σ_{k=1}^{n} 2/k
     = Σ_{i=1}^{n-1} 2·Hn = Σ_{i=1}^{n-1} O(lg n)   (harmonic series)
     = O(n lg n).
And the result follows.
Dynamic Programming
Outline
⦁ Assembly-line scheduling
⦁ Matrix-chain multiplication
⦁ Elements of dynamic programming
⦁ Longest common subsequence
⦁ Optimal binary search trees
Dynamic Programming (1/2)
Not a specific algorithm, but a technique, like divide-and-conquer.
Dynamic programming is applicable when the subproblems are not independent.
A dynamic-programming algorithm solves every subsubproblem just once and then saves its answer in a table.
"Programming" in this context refers to a tabular method, not to writing computer code.
Used for optimization problems:
Find a solution with the optimal value.
Minimization or maximization.
Dynamic Programming (2/2)
⦁ Four-step method
1. Characterize the structure of an optimal solution.
2. Recursively define the value of an optimal solution.
3. Compute the value of an optimal solution in a bottom-up fashion.
4. Construct an optimal solution from computed information.
Assembly-line scheduling (1/2)
⦁ An automobile factory with two assembly lines.
⦁ Each line has n stations: S1,1,…,S1,n and S2,1,…,S2,n.
⦁ S1,j and S2,j perform the same function, with times a1,j and a2,j, respectively.
⦁ Entry times e1 and e2; exit times x1 and x2.
⦁ After going through a station, the chassis can either
⦁ stay on the same line (no cost), or
⦁ transfer to the other line (the cost after Si,j is ti,j).
[Figure: two parallel lines of stations S1,1..S1,n and S2,1..S2,n with station times ai,j, entry times e1, e2, exit times x1, x2, and transfer costs ti,j between the lines]
Assembly-line scheduling (2/2)
Problem: Given all these costs (time = cost), which stations should be chosen from line 1 and from line 2 for the fastest way through the factory?
[Figure: the same two-line station diagram]
Structure of an optimal solution
Step 1: Characterize the structure of an optimal solution.
The fastest way through S1,j is either
the fastest way through S1,j−1 and then directly through S1,j, or
the fastest way through S2,j−1, a transfer from line 2 to line 1, and then through S1,j.
Example:
If fastest(S1,4) = (S2,1, S1,2, S2,3, S1,4), then fastest(S2,3) = (S2,1, S1,2, S2,3).
An optimal solution to a problem contains within it an optimal solution to subproblems.
This is optimal substructure.
Recursive solution
⦁ Step 2: Recursively define the value of an optimal solution.
⦁ Let fi[j] = the fastest time through Si,j, where i = 1, 2 and j = 1,…,n.
⦁ Let f* = the fastest time through the factory.
⦁ Then we have the following recursive equations:
f1[j] = e1 + a1,1                                       if j = 1,
f1[j] = min(f1[j−1] + a1,j, f2[j−1] + t2,j−1 + a1,j)    if j ≥ 2.
f2[j] = e2 + a2,1                                       if j = 1,
f2[j] = min(f2[j−1] + a2,j, f1[j−1] + t1,j−1 + a2,j)    if j ≥ 2.
It follows that
f* = min(f1[n] + x1, f2[n] + x2).
An instance of assembly-line scheduling
li[j] = the line number whose station j−1 is used in the fastest way through Si,j.
l* = the line number whose station n is used in the fastest way through the entire factory.
[Figure: instance with entry times e1 = 2, e2 = 4; station times a1,j = 7, 9, 3, 4, 8, 4 and a2,j = 8, 5, 6, 4, 5, 7; transfer costs t1,j = 2, 3, 1, 3, 4 and t2,j = 2, 1, 2, 2, 1; exit times x1 = 3, x2 = 2]
j      1   2   3   4   5   6
f1[j]  9  18  20  24  32  35
f2[j] 12  16  22  25  30  37
f* = 38
j      2   3   4   5   6
l1[j]  1   2   1   1   2
l2[j]  1   2   1   2   2
l* = 1
Compute an optimal solution
Step 3: Compute the value of an optimal solution in a bottom-up fashion.
⦁ One could write a recursive algorithm based on the above recurrences.
⦁ Let ri(j) = the number of references made to fi[j]:
r1(n) = r2(n) = 1.
r1(j) = r2(j) = r1(j+1) + r2(j+1) for j = 1,…,n−1.
⦁ One can show that ri(j) = 2^{n−j}, and the total number of references to all fi[j] is Θ(2^n). (Exercises 15.1-2 and 15.1-3)
⦁ Observation:
⦁ fi[j] depends only on f1[j−1] and f2[j−1] for j ≥ 2.
⦁ So compute in order of increasing j.
FASTEST-WAY procedure
FASTEST-WAY(a, t, e, x, n)
1.  f1[1] ← e1 + a1,1
2.  f2[1] ← e2 + a2,1
3.  for j ← 2 to n
4.      do if f1[j − 1] + a1,j ≤ f2[j − 1] + t2,j−1 + a1,j
5.            then f1[j] ← f1[j − 1] + a1,j
6.                 l1[j] ← 1
7.            else f1[j] ← f2[j − 1] + t2,j−1 + a1,j
8.                 l1[j] ← 2
9.         if f2[j − 1] + a2,j ≤ f1[j − 1] + t1,j−1 + a2,j
10.           then f2[j] ← f2[j − 1] + a2,j
11.                l2[j] ← 2
12.           else f2[j] ← f1[j − 1] + t1,j−1 + a2,j
13.                l2[j] ← 1
14. if f1[n] + x1 ≤ f2[n] + x2
15.    then f* = f1[n] + x1
16.         l* = 1
17.    else f* = f2[n] + x2
18.         l* = 2
⦁ Time: Θ(1) + (n−1)·Θ(1) + Θ(1) = O(n).
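A 0-based Python sketch of the same procedure, checked against the instance above (the list names mirror the pseudocode's parameters):

def fastest_way(a, t, e, x, n):
    """Assembly-line DP: a[i][j] station times, t[i][j] transfer costs after
    station j on line i, e[i]/x[i] entry/exit times, lines i in {0, 1}."""
    f = [[0] * n for _ in range(2)]   # f[i][j]: fastest time through station j on line i
    l = [[0] * n for _ in range(2)]   # l[i][j]: line used for station j-1 on that path
    f[0][0], f[1][0] = e[0] + a[0][0], e[1] + a[1][0]
    for j in range(1, n):
        for i in (0, 1):
            other = 1 - i
            stay = f[i][j - 1] + a[i][j]
            switch = f[other][j - 1] + t[other][j - 1] + a[i][j]
            f[i][j], l[i][j] = (stay, i) if stay <= switch else (switch, other)
    if f[0][n - 1] + x[0] <= f[1][n - 1] + x[1]:
        return f[0][n - 1] + x[0], 0
    return f[1][n - 1] + x[1], 1

a = [[7, 9, 3, 4, 8, 4], [8, 5, 6, 4, 5, 7]]
t = [[2, 3, 1, 3, 4], [2, 1, 2, 2, 1]]
print(fastest_way(a, t, e=[2, 4], x=[3, 2], n=6))  # (38, 0), i.e., f* = 38 via line 1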
Construct the fastest way
Step 4: Construct an optimal solution from computed information.
The following procedure prints out the stations used, in decreasing order of station number.
PRINT-STATIONS(l, n)
1. i ← l*
2. print "line" i ", station" n
3. for j ← n downto 2
4.    do i ← li[j]
5.       print "line" i ", station" j − 1
⦁ Time: O(n).
Matrix‐chain multiplication
172
When we multiply two matrices A and B, if A is a p × q matrix
and B is a q × r matrix, the resulting matrix C is a p × r matrix.
The number of scalar multiplications is pqr.
Matrix‐chain multiplication problem
Input: A chain 〈A1, A2,..., An〉 of n matrices. (matrix Ai has
dimension pi − 1 × pi )
Output: A fully parenthesized product A1, A2,..., An that minimizes
the number of scalar multiplications.
For example: The dimensions of the matrices A1, A2, and A3
are 10 × 100, 100 × 5, and 5 × 50, respectively.
((A1A2)A3) = 10 ∙ 100 ∙ 5 + 10 ∙ 5 ∙ 50 = 7500.
(A1(A2A3)) = 100 ∙ 5 ∙ 50 + 10 ∙ 100 ∙ 50 = 75000.
Counting the number of parenthesizations
Denote the number of alternative parenthesizations of a sequence of n matrices by P(n).
A fully parenthesized matrix product is the product of two fully parenthesized matrix subproducts.
The split between the two subproducts may occur between the kth and (k + 1)st matrices for any k = 1, 2, …, n − 1.
⦁ Thus, we have

P(n) = 1                              if n = 1,
P(n) = Σ (k = 1 to n−1) P(k) P(n−k)   if n ≥ 2.

Brute-force algorithm: checking all possible parenthesizations takes Ω(2^n) time. (Exercise 15.2-3)
Step 1: The structure of an optimal solution
An optimal solution to an instance contains optimal solutions to subproblem instances.
For example:
If ((A1A2)A3)(A4(A5A6)) is an optimal parenthesization of A1A2 ⋯ A6, then ((A1A2)A3) is an optimal parenthesization of A1A2A3 and (A4(A5A6)) is an optimal parenthesization of A4A5A6.
Step 2: A recursive solution
Define m[i, j] = the minimum number of scalar multiplications needed to compute Ai Ai+1 ⋯ Aj.

m[i, j] = 0                                                     if i = j,
m[i, j] = min over i ≤ k < j of ( m[i, k] + m[k+1, j] + pi−1 pk pj )   if i < j.

⦁ The recursion tree for the computation of m[1,4]:
[Recursion-tree residue: the root 1..4 branches into the pairs (1..1, 2..4), (1..2, 3..4), and (1..3, 4..4); subproblems such as 2..3, 3..4, and 1..2 appear repeatedly along different paths.]
Step 3: Computing the optimal costs
Based on the recursive formula, we could easily write an exponential-time recursive algorithm to compute the minimum cost m[1, n] for multiplying A1A2 ⋯ An.
⦁ There are only C(n, 2) + n = Θ(n²) distinct subproblems, one problem for each choice of i and j satisfying 1 ≤ i ≤ j ≤ n.
⦁ We can use dynamic programming to compute the solutions bottom up.
Dependencies between the subproblems

matrix     A1       A2       A3      A4      A5       A6
dimension  30 × 35  35 × 15  15 × 5  5 × 10  10 × 20  20 × 25

For example,
m[2,5] = min( m[2,2] + m[3,5] + p1 p2 p5 = 0 + 2500 + 35 ∙ 15 ∙ 20 = 13000,
              m[2,3] + m[4,5] + p1 p3 p5 = 2625 + 1000 + 35 ∙ 5 ∙ 20 = 7125,
              m[2,4] + m[5,5] + p1 p4 p5 = 4375 + 0 + 35 ∙ 10 ∙ 20 = 11375 )
       = 7125.

⦁ s[i, j]: the index k that achieved the optimal cost in computing m[i, j].

The m table (upper triangle; m[i, i] = 0 for all i):
m[1,2] = 15,750   m[2,3] = 2,625   m[3,4] = 750     m[4,5] = 1,000   m[5,6] = 5,000
m[1,3] = 7,875    m[2,4] = 4,375   m[3,5] = 2,500   m[4,6] = 3,500
m[1,4] = 9,375    m[2,5] = 7,125   m[3,6] = 5,375
m[1,5] = 11,875   m[2,6] = 10,500
m[1,6] = 15,125

The s table:
s[1,2] = 1   s[2,3] = 2   s[3,4] = 3   s[4,5] = 4   s[5,6] = 5
s[1,3] = 1   s[2,4] = 3   s[3,5] = 3   s[4,6] = 5
s[1,4] = 3   s[2,5] = 3   s[3,6] = 3
s[1,5] = 3   s[2,6] = 3
s[1,6] = 3
MATRIX‐CHAIN‐ORDER pseudocode
MATRIX-CHAIN-ORDER(p)
1.  n ← length[p] − 1
2.  for i ← 1 to n
3.     do m[i, i] ← 0
4.  for l ← 2 to n                 /* l is the chain length */
5.     do for i ← 1 to n − l + 1
6.        do j ← i + l − 1
7.           m[i, j] ← ∞
8.           for k ← i to j − 1
9.              do q ← m[i, k] + m[k + 1, j] + pi−1 pk pj
10.                if q < m[i, j]
11.                   then m[i, j] ← q
12.                        s[i, j] ← k
13. return m and s

The loops are nested three deep, and each loop index (l, i, and k) takes on at most n − 1 values. Time: O(n³).
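As a cross-check of the recurrence and the pseudocode, here is a minimal Python sketch (0-based lists holding a 1-based table; the names are mine, not the slides'), run on the six-matrix instance above.

import math

def matrix_chain_order(p):
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]   # m[i][j]: min scalar multiplications
    s = [[0] * (n + 1) for _ in range(n + 1)]   # s[i][j]: optimal split point k
    for length in range(2, n + 1):              # chain length l
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = math.inf
            for k in range(i, j):
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j], s[i][j] = q, k
    return m, s

def print_optimal_parens(s, i, j):
    if i == j:
        return f"A{i}"
    k = s[i][j]
    return "(" + print_optimal_parens(s, i, k) + print_optimal_parens(s, k + 1, j) + ")"

m, s = matrix_chain_order([30, 35, 15, 5, 10, 20, 25])
print(m[1][6], print_optimal_parens(s, 1, 6))   # 15125 ((A1(A2A3))((A4A5)A6))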
Step 4: Constructing an optimal solution
Each entry s[i, j] records the value of k such that the optimal parenthesization of Ai Ai+1 ∙∙∙ Aj splits the product between Ak and Ak+1.

PRINT-OPTIMAL-PARENS(s, i, j)
1. if i = j
2.    then print "Ai"
3.    else print "("
4.         PRINT-OPTIMAL-PARENS(s, i, s[i, j])
5.         PRINT-OPTIMAL-PARENS(s, s[i, j] + 1, j)
6.         print ")"

The call PRINT-OPTIMAL-PARENS(s, 1, n) prints the parenthesization ((A1(A2A3))((A4A5)A6)) for the instance above.
[The s table computed on the previous slide supplies the split points.]
Elements of dynamic programming (1/2)
Optimal substructure
An optimal solution to a problem contains an optimal solution to
subproblems.
If ((A1A2)A3)(A4(A5A6)) is an optimal solution to A1, A2,..., A6, then
((A1A2)A3) is an optimal solution to A1, A2, A3 and (A4(A5A6)) is an
optimal solution to A4, A5, A6.
Overlapping subproblems
A recursive algorithm revisits the same problem over and over
again.
Typically, the total number of distinct subproblems is a polynomial
in the input size.
In contrast, a problem for which a divide‐and‐conquer approach is
suitable usually generates brand‐new problems at each step of the
recursion.
Elements of dynamic programming (2/2)
⦁ Example: merge sort. The recursion tree for 1..8 splits into disjoint subproblems (1..4 and 5..8, then 1..2, 3..4, 5..6, 7..8, and so on); every subproblem generated is brand-new.
⦁ Example: matrix-chain. The recursion tree for 1..4 generates the same subproblems (such as 2..3, 3..4, and 1..2) over and over again; the subproblems overlap.
RECURSIVE‐MATRIX‐CHAIN procedure
⦁ We shall prove that T(n) = Ω(2^n). Specifically, T(n) ≥ 2^(n−1).

RECURSIVE-MATRIX-CHAIN(p, i, j)
1.  if i = j
2.     then return 0                                  (Θ(1))
3.  m[i, j] ← ∞                                       (Θ(1))
4.  for k ← i to j − 1
5.     do q ← RECURSIVE-MATRIX-CHAIN(p, i, k)
6.            + RECURSIVE-MATRIX-CHAIN(p, k + 1, j)
7.            + pi−1 pk pj
8.        if q < m[i, j]
9.           then m[i, j] ← q
10. return m[i, j]

The recurrence: T(1) ≥ 1, and T(n) ≥ 1 + Σ (k = 1 to n−1) ( T(k) + T(n−k) + 1 ) for n > 1.
Using the substitution method, assume T(i) ≥ 2^(i−1) for all i < n. Then
T(n) ≥ 1 + Σ (k = 1 to n−1) ( T(k) + T(n−k) + 1 ) = 2 Σ (i = 1 to n−1) T(i) + n
     ≥ 2 Σ (i = 1 to n−1) 2^(i−1) + n = 2 Σ (i = 0 to n−2) 2^i + n = 2(2^(n−1) − 1) + n = (2^n − 2) + n ≥ 2^(n−1).
Base case: T(1) ≥ 1 = 2^0.
Memoization
A variation of dynamic programming that offers the efficiency of the usual dynamic-programming approach while maintaining a top-down strategy.

MEMOIZED-MATRIX-CHAIN(p)
1. n ← length[p] − 1
2. for i ← 1 to n
3.    do for j ← i to n
4.       do m[i, j] ← ∞
5. return LOOKUP-CHAIN(p, 1, n)

LOOKUP-CHAIN(p, i, j)
1.  if m[i, j] < ∞
2.     then return m[i, j]
3.  if i = j
4.     then m[i, j] ← 0
5.     else for k ← i to j − 1
6.          do q ← LOOKUP-CHAIN(p, i, k)
7.                 + LOOKUP-CHAIN(p, k + 1, j)
8.                 + pi−1 pk pj
9.             if q < m[i, j]
10.               then m[i, j] ← q
11. return m[i, j]

🞂 Compute m[i, j] only the first time LOOKUP-CHAIN(p, i, j) is called.
⦁ Time: O(n³).
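A minimal top-down sketch in Python: functools.lru_cache plays the role of the m table initialized to ∞, so each subproblem is still computed only once. The names are mine, not the slides'.

from functools import lru_cache

def memoized_matrix_chain(p):
    @lru_cache(maxsize=None)        # the cache replaces the explicit m table
    def lookup(i, j):               # min scalar multiplications for Ai..Aj
        if i == j:
            return 0
        return min(lookup(i, k) + lookup(k + 1, j) + p[i - 1] * p[k] * p[j]
                   for k in range(i, j))
    return lookup(1, len(p) - 1)

print(memoized_matrix_chain((30, 35, 15, 5, 10, 20, 25)))   # 15125, as before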
Longest‐common‐subsequence
A subsequence is a sequence that can be derived from
another sequence by deleting some elements.
For example:
〈K, C, B, A〉 is a subsequence of 〈K, G, C, E, B, B, A〉.
〈B, C, D, G〉 is a subsequence of 〈A, C, B, E, G, C, E, D, B, G〉.
Longest-common-subsequence problem
Input: 2 sequences, X = 〈x1, x2,…, xm〉 and Y = 〈y1, y2,…, yn〉.
Output: A maximum-length common subsequence of X and Y.
For example: X = 〈A, B, C, B, D, A, B〉 and Y = 〈B, D, C, A, B, A〉.
〈B, C, A〉 is a common subsequence of both X and Y.
〈B, C, B, A〉 is a longest common subsequence (LCS) of X and Y.
Step 1: Characterizing an LCS
Brute-force algorithm:
For every subsequence of X, check whether it is a subsequence of Y.
Time: Θ(n 2^m).
⦁ 2^m subsequences of X to check.
⦁ Each subsequence takes Θ(n) time to check: scan Y for the first letter, from there scan for the second, and so on.
Given a sequence X = 〈x1, x2,…, xm〉, we define the ith prefix of X as Xi = 〈x1, x2,…, xi〉.
For example:
X = 〈A, B, C, B, D, A, B〉.
X4 = 〈A, B, C, B〉 and X0 is the empty sequence.
Optimal substructure of an LCS
Theorem 15.1
Let X = 〈x1, x2,…, xm〉 and Y = 〈y1, y2,…, yn〉 be sequences, and let Z = 〈z1, z2,…, zk〉 be any LCS of X and Y.
1. If xm = yn, then zk = xm = yn and Zk−1 is an LCS of Xm−1 and Yn−1.
2. If xm ≠ yn, then zk ≠ xm implies that Z is an LCS of Xm−1 and Y.
3. If xm ≠ yn, then zk ≠ yn implies that Z is an LCS of X and Yn−1.
For example:
X = 〈A, B, C, B, D, A, B〉, Y = 〈B, D, C, A, B〉 and Z = 〈B, C, A, B〉 is an LCS of X and Y. Then, z4 = x7 = y5 and Z3 = 〈B, C, A〉 is an LCS of X6 and Y4.
X = 〈A, B, C, B, D, A, D〉, Y = 〈B, D, C, A, B〉 and Z = 〈B, C, A〉 is an LCS of X and Y. Then, z3 ≠ x7 implies that Z3 = 〈B, C, A〉 is an LCS of X6 and Y5.
X = 〈A, B, C, B, D, A〉, Y = 〈B, D, C, A, B〉 and Z = 〈B, C, A〉 is an LCS of X and Y. Then, z3 ≠ y5 implies that Z3 = 〈B, C, A〉 is an LCS of X6 and Y4.
Step 2: A recursive solution
⦁ Define c[i, j] = the length of an LCS of Xi and Yj. We want c[m, n].

c[i, j] = 0                              if i = 0 or j = 0,
c[i, j] = c[i−1, j−1] + 1                if i, j > 0 and xi = yj,
c[i, j] = max( c[i, j−1], c[i−1, j] )    if i, j > 0 and xi ≠ yj.

⦁ The recursion tree for the computation of c[4,3]:
[Recursion-tree residue: the root 4,3 branches into 3,3 and 4,2; subproblems such as 3,2, 2,2, and 3,1 appear repeatedly along different paths.]
Step 3: Computing the length of an LCS
Based on the recursive formula, we could easily write an
exponential‐time recursive algorithm to compute the length
of an LCS of two sequences.
There are only (mn) distinct subproblems.
We can use dynamic programming to compute the solutions
bottom up.
LCS‐LENGTH pseudocode
LCS-LENGTH(X, Y)
1.  m ← length[X]; n ← length[Y]
2.  for i ← 1 to m
3.     do c[i, 0] ← 0
4.  for j ← 0 to n
5.     do c[0, j] ← 0
6.  for i ← 1 to m
7.     do for j ← 1 to n
8.        do if xi = yj
9.           then c[i, j] ← c[i − 1, j − 1] + 1
10.               b[i, j] ← "↖"
11.          else if c[i − 1, j] ≥ c[i, j − 1]
12.               then c[i, j] ← c[i − 1, j]
13.                    b[i, j] ← "↑"
14.               else c[i, j] ← c[i, j − 1]
15.                    b[i, j] ← "←"
16. return c and b

⦁ Time: O(mn).

The c table for X = 〈A, B, C, B, D, A, B〉 and Y = 〈B, D, C, A, B, A〉 (each b arrow points to the entry the value came from):

         j=0  B   D   C   A   B   A
xi        0   0   0   0   0   0   0
A         0   0   0   0   1   1   1
B         0   1   1   1   1   2   2
C         0   1   1   2   2   2   2
B         0   1   1   2   2   3   3
D         0   1   2   2   2   3   3
A         0   1   2   2   3   3   4
B         0   1   2   2   3   4   4
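A minimal Python sketch of LCS-LENGTH plus the backtracking walk (names are mine; instead of storing the b arrows, it re-derives each arrow from the c table while walking back):

def lcs(X, Y):
    m, n = len(X), len(Y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    # Walk back from c[m][n], mimicking the arrows.
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if X[i - 1] == Y[j - 1]:            # "diagonal" arrow: part of the LCS
            out.append(X[i - 1]); i -= 1; j -= 1
        elif c[i - 1][j] >= c[i][j - 1]:    # "up" arrow
            i -= 1
        else:                               # "left" arrow
            j -= 1
    return "".join(reversed(out))

print(lcs("ABCBDAB", "BDCABA"))   # BCBA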
Step 4: Constructing an LCS
Whenever we encounter a "↖" in entry b[i, j], it implies that xi = yj is an element of the LCS.

PRINT-LCS(b, X, i, j)
1. if i = 0 or j = 0
2.    then return
3. if b[i, j] = "↖"
4.    then PRINT-LCS(b, X, i − 1, j − 1)
5.         print xi
6. elseif b[i, j] = "↑"
7.    then PRINT-LCS(b, X, i − 1, j)
8. else PRINT-LCS(b, X, i, j − 1)

The initial call is PRINT-LCS(b, X, m, n); for the table on the previous slide, this procedure prints "BCBA".
Optimal binary search trees
Input: A sequence K = 〈k1, k2,..., kn〉of n distinct keys in sorted
order. A sequence D = 〈d0, d1,..., dn〉of n + 1 dummy keys.
k1 < k2 < ∙∙∙ < kn.
d0 = all values < k1. dn = all values > kn.
di = all values between ki and ki+1.
For each key ki, a probability pi that a search is for ki.
For each dummy key di, a probability qi that a search is for di.
Output: A BST with minimum expected search cost.
Since every search is either successful (ending at some key ki) or unsuccessful (ending at some dummy key di),
Σ (i = 1 to n) pi + Σ (i = 0 to n) qi = 1.

E[search cost in T] = Σ (i = 1 to n) (depthT(ki) + 1) pi + Σ (i = 0 to n) (depthT(di) + 1) qi
                    = 1 + Σ (i = 1 to n) depthT(ki) pi + Σ (i = 0 to n) depthT(di) qi.
0-1 KNAPSACK PROBLEM
Statement of the problem:
Given n items, each with corresponding value p and weight w, find which items to place in the knapsack such that the sum of all p is maximum and the sum of all w does not exceed the maximum weight capacity c of the knapsack.
We can also express the problem as follows: choose x1, …, xn so that
Σ (i = 1 to n) pi xi is maximum and Σ (i = 1 to n) wi xi ≤ c,
where xi = 0 if the item is not taken, and xi = 1 if the item is taken.
Solution #1: Brute force
⦁ We take all possible item combinations.
⦁ For any n items, the total number of combinations is Σ (i = 0 to n) C(n, i) = 2^n.
⦁ We pick the combinations that satisfy the constraint, compute each combination's Σp, and take the maximum.
⦁ This approach has complexity O(2^n).
Solution #2: Dynamic Programming (Bottom-Up Computation)
Construct a value matrix V with rows i = 0, …, n and columns k = 0, …, c, computing a value in each cell for every row in the matrix.
The last cell V[n, c] will give the solution to the maximum total value.
Bottom-up computation pseudocode:
for k = 0 to c:
    V[0, k] ← 0
for i = 1 to n:
    for k = 0 to c:
        if k - wi ≥ 0:
            V[i, k] ← Max(V[i - 1, k], pi + V[i - 1, k - wi])
        else:
            V[i, k] ← V[i - 1, k]
Example:
i  1   2   3   4   5
p  30  20  40  70  60
w  4   1   2   5   3
n = 5, c = 10
Solution:
The value matrix is filled bottom-up, with the first row (i = 0) at the bottom, moving up through the succeeding rows to the top (i = n).
Row 0 of the value matrix starts with all 0s.
The column index k starts at 0 and ends at c (the constraint).

k:   0  1  2  3  4  5  6  7  8  9  10
i=0: 0  0  0  0  0  0  0  0  0  0  0
⦁ Value at V[i, k] = Max(V[i - 1, k], pi + V[i - 1, k - wi]), where
pi is the value of the item at row i, and
wi is the weight of the item at row i.
⦁ For each row, in each column where k - wi < 0 the item does not fit, so the maximum is V[i - 1, k] (the cell above the current cell).
⦁ If k - wi ≥ 0, compare V[i - 1, k] and pi + V[i - 1, k - wi]. The maximum of the two is the value of V[i, k].
k:   0  1  2  3  4   5   6   7   8   9   10
i=0: 0  0  0  0  0   0   0   0   0   0   0
i=1: 0  0  0  0  30  30  30  30  30  30  30

At i = 1, k = 4:
V[i - 1, k] = V[0, 4] = 0
p1 = 30, w1 = 4, k - w1 = 4 - 4 = 0 ≥ 0
pi + V[i - 1, k - wi] = p1 + V[0, 0] = 30 + 0 = 30
So V[1, 4] = Max(0, 30) = 30.
Completing the value matrix:

k:   0  1   2   3   4   5    6    7    8    9    10
i=0: 0  0   0   0   0   0    0    0    0    0    0
i=1: 0  0   0   0   30  30   30   30   30   30   30
i=2: 0  20  20  20  30  50   50   50   50   50   50
i=3: 0  20  40  60  60  60   70   90   90   90   90
i=4: 0  20  40  60  60  70   90   110  130  130  130
i=5: 0  20  40  60  80  100  120  120  130  150  170

The last cell V[n, c] is the solution to the maximum value.
⦁ The value matrix only shows the solution to the maximum value, not the individual items chosen.
⦁ Modify the last pseudocode to mark the cells where the maximum is pi + V[i - 1, k - wi], with k - wi ≥ 0 (i.e., the cells where item i is taken).
[The same value matrix as above, with those cells marked.]
Pseudocode to find the items selected:
k = c
for i = n down to 1:
    if V[i, k] is marked:
        output item i
        k ← k - wi
From the last example (using the same value matrix as above):

i   k                          Marked?
5   10                         Yes → output item 5, k ← 10 - w5 = 10 - 3 = 7
4   7                          Yes → output item 4, k ← 7 - w4 = 7 - 5 = 2
3   2                          Yes → output item 3, k ← 2 - w3 = 2 - 2 = 0
2   0                          No
1   0                          No

The items selected are 3, 4, 5.
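A minimal Python sketch of the bottom-up fill and the item reconstruction (names are mine; instead of marking cells, it takes item i whenever V[i][k] differs from V[i - 1][k], which identifies the same cells):

def knapsack(p, w, c):
    n = len(p)
    V = [[0] * (c + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for k in range(c + 1):
            V[i][k] = V[i - 1][k]                  # default: item i not taken
            if k - w[i - 1] >= 0:                  # item i fits
                V[i][k] = max(V[i][k], p[i - 1] + V[i - 1][k - w[i - 1]])
    items, k = [], c
    for i in range(n, 0, -1):                      # walk back over the rows
        if V[i][k] != V[i - 1][k]:                 # value changed: item i was taken
            items.append(i)
            k -= w[i - 1]
    return V[n][c], sorted(items)

print(knapsack(p=[30, 20, 40, 70, 60], w=[4, 1, 2, 5, 3], c=10))   # (170, [3, 4, 5])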
⦁ Bottom-up computation has complexity O(nc).
⦁ For large n, a vast improvement over O(2^n).
⦁ Any problem involving maximizing a total value while satisfying a constraint can use this method, as long as each item can only be either chosen or not, i.e., the item cannot be broken into smaller parts.
Sorting in Linear Time
Counting sort
Radix sort
Bucket sort
Counting Sort
The Algorithm
• Counting-Sort(A)
– Initialize the output array B of size n and the count array C of size k, setting all entries of C to 0
• Count the number of occurrences of every A[i]
– for i = 1..n
– do C[A[i]] ← C[A[i]] + 1
• Count the number of elements ≤ each value i
– for i = 2..k
– do C[i] ← C[i] + C[i – 1]
• Move every element to its final position
– for i = n..1
– do B[C[A[i]]] ← A[i]
–    C[A[i]] ← C[A[i]] – 1
Counting Sort Example
A = [2, 5, 3, 0, 2, 3, 0, 3]   (indices 1..8, values in 0..5)
After counting occurrences (C indexed 0..5):  C = [2, 0, 2, 3, 0, 1]
After prefix sums (number of elements ≤ i):   C = [2, 2, 4, 7, 7, 8]
Moving elements from right to left:
⦁ A[8] = 3 goes to B[C[3]] = B[7]; C[3] becomes 6.
⦁ A[7] = 0 goes to B[C[0]] = B[2]; C[0] becomes 1.
⦁ A[6] = 3 goes to B[C[3]] = B[6]; C[3] becomes 5.
⦁ Continuing this way yields B = [0, 0, 2, 2, 3, 3, 3, 5].
Counting Sort
1  CountingSort(A, B, k)
2    for i = 1 to k
3        C[i] = 0;               // takes time O(k)
4    for j = 1 to n
5        C[A[j]] += 1;           // takes time O(n)
6    for i = 2 to k
7        C[i] = C[i] + C[i-1];   // takes time O(k)
8    for j = n downto 1
9        B[C[A[j]]] = A[j];      // takes time O(n)
10       C[A[j]] -= 1;
What will be the running time?
Counting Sort
• Total time: O(n + k)
– Usually, k = O(n)
– Thus counting sort runs in O(n) time
• But comparison-based sorting is Ω(n lg n)!
– No contradiction--this is not a comparison sort (in
fact, there are no comparisons at all!)
– Notice that this algorithm is stable
• If numbers have the same value, they keep their
original order
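A minimal Python counting sort matching the pseudocode above, assuming integer keys in 0..k (0-based arrays rather than the slides' 1-based ones; names are mine):

def counting_sort(A, k):
    C = [0] * (k + 1)
    for a in A:                      # count occurrences: O(n)
        C[a] += 1
    for i in range(1, k + 1):        # prefix sums: # of elements <= i, O(k)
        C[i] += C[i - 1]
    B = [0] * len(A)
    for a in reversed(A):            # right-to-left placement keeps the sort stable
        C[a] -= 1
        B[C[a]] = a
    return B

print(counting_sort([2, 5, 3, 0, 2, 3, 0, 3], k=5))   # [0, 0, 2, 2, 3, 3, 3, 5]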
Stable Sorting Algorithms
• A sorting algorithm is stable if for any two indices i and j with i < j and ai = aj, element ai precedes element aj in the output sequence.
Observation: Counting Sort is stable.
[Figure: an input sequence with duplicate keys, each tagged with its original position, and the sorted output, in which records with equal keys keep their original relative order.]
Counting Sort
• Linear Sort! Cool! Why don’t we always
use counting sort?
• Because it depends on range k of
elements
• Could we use counting sort to sort 32-bit integers? Why or why not?
• Answer: no, k is too large (2^32 = 4,294,967,296)
Radix Sort
• Why it’s not a comparison sort:
– Assumption: input has d digits each ranging from
0 to k
– Example: Sort a bunch of 4-digit numbers, where
each digit is 0-9
• Basic idea:
– Sort elements by digit starting with least
significant
– Use a stable sort (like counting sort) for each stage
Radix Sort Overview
• Origin: Herman Hollerith's card-sorting machine for the 1890 U.S. Census
• Digit-by-digit sort
• Hollerith's original (bad) idea: sort on most-significant digit first.
• Good idea: sort on least-significant digit first with an auxiliary stable sort
[Image captions, translated from Portuguese: "The idea of Radix Sort is not new"; "For my college class it was very easy to learn Radix Sort". Photo: an IBM 083 punch-card sorter.]
Radix Sort — The Algorithm
• Radix Sort takes parameters: the array and the number of digits in each array element
• Radix-Sort(A, d)
• 1 for i = 1..d
• 2    do sort the numbers in array A by their i-th digit from the right, using a stable sorting algorithm
Radix Sort Example
Input:                     329  457  657  839  436  720  355
After pass 1 (1s digit):   720  355  436  457  657  329  839
After pass 2 (10s digit):  720  329  436  839  355  457  657
After pass 3 (100s digit): 329  355  436  457  657  720  839
Radix Sort
Correctness and Running Time
• What is the running time of radix sort?
• Each of the d passes takes time O(n+k), so the total time is O(dn+dk)
• When d is constant and k = O(n), radix sort takes O(n) time
• Stable, fast
• Doesn't sort in place (because counting sort is used)
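A minimal LSD radix sort sketch in Python for non-negative integers, running a stable counting sort on one decimal digit per pass (names are mine, not the slides'):

def radix_sort(A, d):
    for exp in (10 ** i for i in range(d)):       # 1s, 10s, 100s, ... digit
        C = [0] * 10
        for a in A:                               # count occurrences of each digit
            C[(a // exp) % 10] += 1
        for i in range(1, 10):                    # prefix sums
            C[i] += C[i - 1]
        B = [0] * len(A)
        for a in reversed(A):                     # stable right-to-left placement
            digit = (a // exp) % 10
            C[digit] -= 1
            B[C[digit]] = a
        A = B
    return A

print(radix_sort([329, 457, 657, 839, 436, 720, 355], d=3))
# [329, 355, 436, 457, 657, 720, 839]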
Bucket Sort
• Assumption: input - n real numbers from [0, 1)
• Basic idea:
– Create n linked lists (buckets) to divide interval [0,1) into
subintervals of size 1/n
– Add each input element to appropriate bucket and sort
buckets with insertion sort
• Uniform input distribution ⇒ expected O(1) bucket size
– Therefore the expected total time is O(n)
Bucket Sort
Bucket-Sort(A)
1. n ← length(A)
2. for i ← 1 to n
3.    do insert A[i] into list B[floor(n·A[i])]   // distribute elements over buckets
4. for i ← 0 to n – 1
5.    do Insertion-Sort(B[i])                      // sort each bucket
6. Concatenate lists B[0], B[1], …, B[n – 1] in order
Bucket Sort Example
Input: .78, .17, .39, .26, .72, .94, .21, .12, .23, .68   (n = 10)
After distribution (bucket i holds values in [i/10, (i+1)/10)):
B[1] = .17, .12     B[2] = .26, .21, .23     B[3] = .39
B[6] = .68          B[7] = .78, .72          B[9] = .94
After sorting each bucket and concatenating:
.12, .17, .21, .23, .26, .39, .68, .72, .78, .94
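A minimal Python sketch of Bucket-Sort for reals in [0, 1); Python's sorted() stands in for the per-bucket insertion sort (names are mine):

def bucket_sort(A):
    n = len(A)
    B = [[] for _ in range(n)]
    for x in A:
        B[int(n * x)].append(x)      # distribute into n equal subintervals of [0, 1)
    out = []
    for bucket in B:
        out.extend(sorted(bucket))   # sort each bucket, then concatenate in order
    return out

print(bucket_sort([.78, .17, .39, .26, .72, .94, .21, .12, .23, .68]))
# [0.12, 0.17, 0.21, 0.23, 0.26, 0.39, 0.68, 0.72, 0.78, 0.94]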
Bucket Sort – Running Time
• All lines except line 5 (Insertion-Sort) take O(n) in the worst
case.
• In the worst case, O(n) numbers will end up in the same bucket, so in the worst case, it will take O(n²) time.
• Lemma: Given that the input sequence is drawn uniformly at
random from [0,1), the expected size of a bucket is O(1).
• So, in the average case, only a constant number of elements
will fall in each bucket, so it will take O(n) (see proof in book).
• For input that is not uniformly distributed, use a different indexing scheme (hashing) to distribute the numbers uniformly over the buckets.
• Every comparison-based sorting algorithm has to take
Ω(n lg n) time.
• Merge Sort, Heap Sort, and Quick Sort are comparison-based
and take O(n lg n) time. Hence, they are optimal.
• Other sorting algorithms can be faster by exploiting
assumptions made about the input
• Counting Sort and Radix Sort take linear time for integers in a
bounded range.
• Bucket Sort takes linear average-case time for uniformly
distributed real numbers.
Summary
• Knapsack Problem
WHAT?
• GIVEN WEIGHTS AND VALUES OF N ITEMS, WE NEED TO
PUT THESE ITEMS IN A KNAPSACK OF CAPACITY W TO
GET THE MAXIMUM TOTAL VALUE IN THE KNAPSACK.
TYPES
• 0-1 KNAPSACK PROBLEM
In the 0-1 knapsack problem, we are not allowed to break items. We either take the whole item or don't take it.
• FRACTIONAL KNAPSACK
In fractional knapsack, we can break items to maximize the total value of the knapsack. This problem, in which we can break an item, is also called the fractional knapsack problem.
EXAMPLE
(Items: A has weight 10 and value 60, B has weight 20 and value 100, C has weight 30 and value 120; capacity 50.)
0-1 KNAPSACK
Take B and C. Total weight = 20 + 30 = 50. Total value = 100 + 120 = 220.
FRACTIONAL KNAPSACK
Take A, B and 2/3rd of C. Total weight = 10 + 20 + (30 × 2/3) = 50. Total value = 60 + 100 + (120 × 2/3) = 240.
GREEDY APPROACH
• The basic idea of the greedy approach is to calculate the ratio
value/weight for each item.
• Sort the item on basis of this ratio.
• Then take the item with the highest ratio and add them until we
can’t add the next item as a whole.
• At the end add the next item as much (fraction) as we can.
GREEDY APPROACH SOLUTION
1. Calculate value/weight for each item: ratio of A = 60/10 = 6, ratio of B = 100/20 = 5, ratio of C = 120/30 = 4.
2. Sort (descending) the items on the basis of this ratio: A, B, C.
3. Take the item with the highest ratio and add it to the knapsack, until we can't add the next item as a whole:
   Take A: capacity left = 50 − 10 = 40, value = 60.
   Take B: capacity left = 40 − 20 = 20, value = 160.
4. Take 2/3rd of C: weight = 2/3 × 30 = 20, value = 2/3 × 120 = 80. Capacity left = 0, value = 240.
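Here is a minimal Python sketch of this greedy procedure (names are mine), checked against the example above:

def fractional_knapsack(items, capacity):
    # items: list of (value, weight) pairs
    total = 0.0
    # steps 1-2: sort by value/weight ratio, descending
    for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if capacity == 0:
            break
        take = min(weight, capacity)     # step 3: whole item if it fits; step 4: a fraction
        total += value * take / weight
        capacity -= take
    return total

print(fractional_knapsack([(60, 10), (100, 20), (120, 30)], capacity=50))   # 240.0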
THE OPTIMAL KNAPSACK ALGORITHM
• INPUT: an integer n
• Positive values wi and vi such that 1 ≤ i ≤ n
• Positive value W (the capacity)
• OUTPUT:
• n values of xi such that 0 ≤ xi ≤ 1
• Total profit Σ xi vi is maximized, subject to Σ xi wi ≤ W
WHAT IS STRING MATCHING
• In computer science, string searching algorithms, sometimes called string matching algorithms, try to find a place where one or several strings (also called the pattern) occur within a larger string or text.
EXAMPLE: STRING MATCHING PROBLEM
TEXT:    A B C A B A A C A B
PATTERN: A B A A   (occurs at SHIFT = 3)

STRING MATCHING ALGORITHMS
There are many types of String Matching Algorithms, like:
1) The Naive string-matching algorithm
2) The Rabin-Karp algorithm
3) String matching with finite automata
4) The Knuth-Morris-Pratt algorithm
But we discuss 2 types of string matching algorithms:
1) The Naive string-matching algorithm
2) The Rabin-Karp algorithm
THE NAIVE ALGORITHM
The naive algorithm finds all valid shifts using a loop that checks the condition P[1…m] = T[s+1…s+m] for each of the n − m + 1 possible values of s. (P = pattern, T = text/string, s = shift)

NAIVE-STRING-MATCHER(T, P)
1) n = T.length
2) m = P.length
3) for s = 0 to n − m
4)    if P[1…m] == T[s+1…s+m]
5)       print "Pattern occurs with shift" s
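A minimal Python version of NAIVE-STRING-MATCHER (0-based shifts, as in the pseudocode above; names are mine):

def naive_string_matcher(T, P):
    n, m = len(T), len(P)
    shifts = []
    for s in range(n - m + 1):            # each of the n - m + 1 possible shifts
        if T[s:s + m] == P:               # compare the current window to the pattern
            shifts.append(s)
    return shifts

print(naive_string_matcher("1011101110", "111"))   # [2, 6]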
EXAMPLE
Suppose T = 1011101110 and P = 111. Find all valid shifts:
s = 0: T[1..3] = 101 ≠ 111
s = 1: T[2..4] = 011 ≠ 111
s = 2: T[3..5] = 111, so s = 2 is a valid shift
s = 3: T[4..6] = 110 ≠ 111
s = 4: T[5..7] = 101 ≠ 111
s = 5: T[6..8] = 011 ≠ 111
s = 6: T[7..9] = 111, so s = 6 is a valid shift
s = 7: T[8..10] = 110 ≠ 111
THE RABIN-KARP ALGORITHM
Rabin and Karp proposed a string matching algorithm that performs well in practice and that also generalizes to other algorithms for related problems, such as two-dimensional pattern matching.
ALGORITHM
RABIN-KARP-MATCHER(T, P, d, q)
1)  n = T.length                        // pre-processing
2)  m = P.length
3)  h = d^(m−1) mod q
4)  p = 0
5)  t0 = 0
6)  for i = 1 to m
7)     p = (d·p + P[i]) mod q
8)     t0 = (d·t0 + T[i]) mod q
9)  for s = 0 to n − m                  // matching
10)    if p == ts
11)       if P[1…m] == T[s+1…s+m]
12)          print "Pattern occurs with shift" s
13)    if s < n − m
14)       ts+1 = (d(ts − T[s+1]·h) + T[s+m+1]) mod q
EXAMPLE
Pattern P = 26: how many spurious hits does the Rabin-Karp matcher encounter in the text T = 3 1 4 1 5 9 2 6 5 3 5?
Here T.length = 11, so q = 11, and p = P mod q = 26 mod 11 = 4.
Now find every window of T whose value mod q equals 4:
s = 0: 31 mod 11 = 9, not equal to 4
s = 1: 14 mod 11 = 3, not equal to 4
s = 2: 41 mod 11 = 8, not equal to 4
s = 3: 15 mod 11 = 4, equal to 4: SPURIOUS HIT
s = 4: 59 mod 11 = 4, equal to 4: SPURIOUS HIT
s = 5: 92 mod 11 = 4, equal to 4: SPURIOUS HIT
s = 6: 26 mod 11 = 4, equal to 4: EXACT MATCH
s = 7: 65 mod 11 = 10, not equal to 4
s = 8: 53 mod 11 = 9, not equal to 4
s = 9: 35 mod 11 = 2, not equal to 4
Pattern occurs with shift 6.
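A minimal Python sketch of RABIN-KARP-MATCHER over digit strings, using d = 10 and the same q = 11 as in the example (names are mine):

def rabin_karp_matcher(T, P, d=10, q=11):
    n, m = len(T), len(P)
    h = pow(d, m - 1, q)                   # d^(m-1) mod q
    p = t = 0
    for i in range(m):                     # pre-processing: hash P and the first window
        p = (d * p + int(P[i])) % q
        t = (d * t + int(T[i])) % q
    matches = []
    for s in range(n - m + 1):
        if p == t and T[s:s + m] == P:     # hash hit, then verify character by character
            matches.append(s)
        if s < n - m:                      # roll the hash forward to the next window
            t = (d * (t - int(T[s]) * h) + int(T[s + m])) % q
    return matches

print(rabin_karp_matcher("31415926535", "26"))   # [6]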
COMPARISON
The Naive String Matching algorithm slides the pattern one position at a time. After each slide, it checks the characters one by one at the current shift, and if all characters match, it prints the match.
Like the Naive algorithm, the Rabin-Karp algorithm also slides the pattern one position at a time. But unlike the Naive algorithm, the Rabin-Karp algorithm first matches the hash value of the pattern with the hash value of the current substring of the text, and only if the hash values match does it start matching individual characters.
Minimum Spanning Trees
•Definition of MST
•Generic MST algorithm
•Kruskal's algorithm
•Prim's algorithm
Definition of MST
 Let G=(V,E) be a connected, undirected graph.
 For each edge (u,v) in E, we have a weight w(u,v)
specifying the cost (length of edge) to connect u and v.
 We wish to find a (acyclic) subset T of E that connects all
of the vertices in V and whose total weight is minimized.
 Since the total weight is minimized, the subset T must be
acyclic (no circuit).
 Thus, T is a tree. We call it a spanning tree.
 The problem of determining the tree T is called the
minimum-spanning-tree problem.
Application of MST: an example
• In the design of electronic circuitry, it is often
necessary to make a set of pins electrically
equivalent by wiring them together.
• Running cable TV to a set of houses. What’s
the least amount of cable needed to still
connect all the houses?
What makes a greedy algorithm?
• Feasible
– Has to satisfy the problem’s constraints
• Locally Optimal
– The greedy part
– Has to make the best local choice among all feasible choices available
on that step
• If this local choice results in a global optimum then the problem has optimal
substructure
• Irrevocable
– Once a choice is made it can’t be un-done on subsequent steps of the
algorithm
• Simple examples:
– Playing chess by making best move without lookahead
– Giving fewest number of coins as change
• Simple and appealing, but don’t always give the best solution
Spanning Tree
• Definition
– A spanning tree of a graph G is a tree
(acyclic) that connects all the vertices of G
once
• i.e. the tree “spans” every vertex in G
– A Minimum Spanning Tree (MST) is a
spanning tree on a weighted graph that has
the minimum total weight
w(T) = Σ over (u,v) ∈ T of w(u, v), such that w(T) is minimum
Where might this be useful? Can also be used to approximate some
NP-Complete problems
Here is an example of a connected graph and its minimum spanning tree:
[Figure: a graph on vertices a, b, c, d, e, f, g, h, i with edge weights 4, 8, 7, 9, 10, 14, 4, 2, 2, 6, 1, 7, 11, 8; the MST edges are highlighted.]
Notice that the tree is not unique:
replacing (b,c) with (a,h) yields another spanning tree
with the same minimum weight.
Growing a MST
• Set A is always a subset of some minimum spanning tree.
• An edge (u,v) is a safe edge for A if, after adding (u,v) to the subset A, A is still a subset of some minimum spanning tree.
GENERIC_MST(G,w)
1 A:={}
2 while A does not form a spanning tree do
3 find an edge (u,v) that is safe for A
4 A:=A∪{(u,v)}
5 return A
How to find a safe edge
We need some definitions and a theorem.
• A cut (S,V-S) of an undirected graph G=(V,E) is
a partition of V.
• An edge crosses the cut (S,V-S) if one of its
endpoints is in S and the other is in V-S.
• An edge is a light edge crossing a cut if its
weight is the minimum of any edge crossing
the cut.
[Figure: the same example graph with a cut (S, V−S); vertices in S are drawn on one side of the cut and vertices in V−S on the other.]
• This figure shows a cut (S,V-S) of the graph.
• The edge (d,c) is the unique light edge
crossing the cut.
The algorithms of Kruskal and Prim
• The two algorithms are elaborations of the
generic algorithm.
• They each use a specific rule to determine a
safe edge in the GENERIC_MST.
• In Kruskal's algorithm,
– The set A is a forest.
– The safe edge added to A is always a least-
weight edge in the graph that connects two
distinct components.
• In Prim's algorithm,
– The set A forms a single tree.
– The safe edge added to A is always a least-weight edge connecting the tree to a vertex not in the tree.
Kruskal's algorithm (simple)
(Sort the edges in an increasing order)
A := {}
while (E is not empty) do {
    take an edge (u, v) that is shortest in E and delete it from E
    if (u and v are in different components) then add (u, v) to A
}
Note: each time, a shortest edge in E is considered.
Kruskal's algorithm
1  function Kruskal(G = <N, A>: graph; length: A → R+): set of edges
2      Define an elementary cluster C(v) ← {v} for each vertex v.
3      Initialize a priority queue Q to contain all edges in G, using the weights as keys.
4      Define a forest T ← Ø            // T will ultimately contain the edges of the MST
5      // n is the total number of vertices
6      while T has fewer than n−1 edges do
7          // edge (u,v) is the minimum-weight edge remaining in Q
8          (u,v) ← Q.removeMin()
9          // prevent cycles in T: add (u,v) only if T does not already contain
10         // a path between u and v, i.e., if u and v lie in different clusters
11         Let C(v) be the cluster containing v, and let C(u) be the cluster containing u.
12         if C(v) ≠ C(u) then
13             Add edge (v,u) to T.
14             Merge C(v) and C(u) into one cluster, that is, union C(v) and C(u).
15     return tree T
Kruskal's algorithm, step by step (figures from http://Wikipedia/kruskals):
• This is our original graph. The numbers near the arcs indicate their weight. None of the arcs are highlighted.
• AD and CE are the shortest arcs, with length 5, and AD has been arbitrarily chosen, so it is highlighted.
• CE is now the shortest arc that does not form a cycle, with length 5, so it is highlighted as the second arc.
• The next arc, DF with length 6, is highlighted using much the same method.
• The next-shortest arcs are AB and BE, both with length 7. AB is chosen arbitrarily, and is highlighted. The arc BD has been highlighted in red, because there already exists a path (in green) between B and D, so it would form a cycle (ABD) if it were chosen.
• The process continues to highlight the next-smallest arc, BE with length 7. Many more arcs are highlighted in red at this stage: BC because it would form the loop BCE, DE because it would form the loop DEBA, and FE because it would form FEBAD.
• Finally, the process finishes with the arc EG of length 9, and the minimum spanning tree is found.
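A minimal Python sketch of Kruskal's algorithm (names are mine, not the slides'): a pre-sorted edge list stands in for the priority queue, and a union-find structure plays the role of the clusters C(v).

def kruskal(num_vertices, edges):
    # edges: list of (weight, u, v), vertices numbered 0..num_vertices-1
    parent = list(range(num_vertices))
    def find(v):                              # cluster representative, with path compression
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    mst = []
    for w, u, v in sorted(edges):             # consider a shortest remaining edge each time
        ru, rv = find(u), find(v)
        if ru != rv:                          # different components, so the edge is safe
            parent[ru] = rv                   # merge the two clusters
            mst.append((u, v, w))
    return mst

print(kruskal(4, [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]))
# [(0, 1, 1), (1, 2, 2), (2, 3, 4)]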
Prim's algorithm (simple)
MST_PRIM(G, w, r) {
    A = {}
    S := {r}    (r is an arbitrary node in V)
    Q = V − {r}
    while Q is not empty do {
        take an edge (u, v) such that u ∈ S and v ∈ Q (v ∉ S) and (u, v) is the shortest such edge
        add (u, v) to A
        add v to S and delete v from Q
    }
}
Prim's algorithm (initialization, adapted from Wikipedia)
inputs: A graph, a function returning edge weights weight-function, and an initial vertex
Initialization: place all vertices in the 'not yet seen' set, set the initial vertex to be added to the tree, and place all vertices in a min-heap to allow removal of the minimum-distance vertex.

for each vertex in graph
    set min_distance of vertex to ∞
    set parent of vertex to null
    set minimum_adjacency_list of vertex to empty list
    set is_in_Q of vertex to true
set min_distance of initial vertex to zero
add to minimum-heap Q all vertices in graph, keyed by min_distance
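A minimal Python sketch of Prim's algorithm (names and the dict-of-dicts adjacency format are mine): heapq serves as the min-heap, and for simplicity the heap holds candidate edges rather than vertices with decrease-key, which grows the same single tree.

import heapq

def prim(graph, root):
    # graph: {u: {v: weight, ...}, ...}, undirected
    in_tree, mst = {root}, []
    heap = [(w, root, v) for v, w in graph[root].items()]
    heapq.heapify(heap)
    while heap and len(in_tree) < len(graph):
        w, u, v = heapq.heappop(heap)         # lightest edge leaving the tree
        if v in in_tree:
            continue                          # both endpoints already in the tree
        in_tree.add(v)
        mst.append((u, v, w))
        for x, wx in graph[v].items():        # new candidate edges from v
            if x not in in_tree:
                heapq.heappush(heap, (wx, v, x))
    return mst

g = {0: {1: 4, 2: 1}, 1: {0: 4, 2: 2}, 2: {0: 1, 1: 2}}
print(prim(g, 0))   # [(0, 2, 1), (2, 1, 2)]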
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf
CS-323 DAA.pdf

More Related Content

Similar to CS-323 DAA.pdf

Design & Analysis of Algorithm course .pptx
Design & Analysis of Algorithm course .pptxDesign & Analysis of Algorithm course .pptx
Design & Analysis of Algorithm course .pptxJeevaMCSEKIOT
 
Analysis of Algorithms
Analysis of AlgorithmsAnalysis of Algorithms
Analysis of AlgorithmsAmna Saeed
 
Design & Analysis of Algorithms Lecture Notes
Design & Analysis of Algorithms Lecture NotesDesign & Analysis of Algorithms Lecture Notes
Design & Analysis of Algorithms Lecture NotesFellowBuddy.com
 
complexity analysis.pdf
complexity analysis.pdfcomplexity analysis.pdf
complexity analysis.pdfpasinduneshan
 
Chapter1.1 Introduction.ppt
Chapter1.1 Introduction.pptChapter1.1 Introduction.ppt
Chapter1.1 Introduction.pptTekle12
 
Chapter1.1 Introduction to design and analysis of algorithm.ppt
Chapter1.1 Introduction to design and analysis of algorithm.pptChapter1.1 Introduction to design and analysis of algorithm.ppt
Chapter1.1 Introduction to design and analysis of algorithm.pptTekle12
 
Unit 1, ADA.pptx
Unit 1, ADA.pptxUnit 1, ADA.pptx
Unit 1, ADA.pptxjinkhatima
 
DSJ_Unit I & II.pdf
DSJ_Unit I & II.pdfDSJ_Unit I & II.pdf
DSJ_Unit I & II.pdfArumugam90
 
Data structure and algorithm using java
Data structure and algorithm using javaData structure and algorithm using java
Data structure and algorithm using javaNarayan Sau
 
Algorithm in Computer, Sorting and Notations
Algorithm in Computer, Sorting  and NotationsAlgorithm in Computer, Sorting  and Notations
Algorithm in Computer, Sorting and NotationsAbid Kohistani
 
asymptotic analysis and insertion sort analysis
asymptotic analysis and insertion sort analysisasymptotic analysis and insertion sort analysis
asymptotic analysis and insertion sort analysisAnindita Kundu
 
Lec16-CS110 Computational Engineering
Lec16-CS110 Computational EngineeringLec16-CS110 Computational Engineering
Lec16-CS110 Computational EngineeringSri Harsha Pamu
 
Analysis of algorithn class 2
Analysis of algorithn class 2Analysis of algorithn class 2
Analysis of algorithn class 2Kumar
 

Similar to CS-323 DAA.pdf (20)

Design & Analysis of Algorithm course .pptx
Design & Analysis of Algorithm course .pptxDesign & Analysis of Algorithm course .pptx
Design & Analysis of Algorithm course .pptx
 
Analysis of Algorithms
Analysis of AlgorithmsAnalysis of Algorithms
Analysis of Algorithms
 
Daa
DaaDaa
Daa
 
Design & Analysis of Algorithms Lecture Notes
Design & Analysis of Algorithms Lecture NotesDesign & Analysis of Algorithms Lecture Notes
Design & Analysis of Algorithms Lecture Notes
 
complexity analysis.pdf
complexity analysis.pdfcomplexity analysis.pdf
complexity analysis.pdf
 
Algorithm Design and Analysis
Algorithm Design and AnalysisAlgorithm Design and Analysis
Algorithm Design and Analysis
 
Chapter1.1 Introduction.ppt
Chapter1.1 Introduction.pptChapter1.1 Introduction.ppt
Chapter1.1 Introduction.ppt
 
Chapter1.1 Introduction to design and analysis of algorithm.ppt
Chapter1.1 Introduction to design and analysis of algorithm.pptChapter1.1 Introduction to design and analysis of algorithm.ppt
Chapter1.1 Introduction to design and analysis of algorithm.ppt
 
Unit 1, ADA.pptx
Unit 1, ADA.pptxUnit 1, ADA.pptx
Unit 1, ADA.pptx
 
DSJ_Unit I & II.pdf
DSJ_Unit I & II.pdfDSJ_Unit I & II.pdf
DSJ_Unit I & II.pdf
 
Searching Algorithms
Searching AlgorithmsSearching Algorithms
Searching Algorithms
 
Data structure and algorithm using java
Data structure and algorithm using javaData structure and algorithm using java
Data structure and algorithm using java
 
01-algo.ppt
01-algo.ppt01-algo.ppt
01-algo.ppt
 
Unit 2 algorithm
Unit   2 algorithmUnit   2 algorithm
Unit 2 algorithm
 
Algorithm in Computer, Sorting and Notations
Algorithm in Computer, Sorting  and NotationsAlgorithm in Computer, Sorting  and Notations
Algorithm in Computer, Sorting and Notations
 
asymptotic analysis and insertion sort analysis
asymptotic analysis and insertion sort analysisasymptotic analysis and insertion sort analysis
asymptotic analysis and insertion sort analysis
 
1. introduction
1. introduction1. introduction
1. introduction
 
Lec16-CS110 Computational Engineering
Lec16-CS110 Computational EngineeringLec16-CS110 Computational Engineering
Lec16-CS110 Computational Engineering
 
chapter 1
chapter 1chapter 1
chapter 1
 
Analysis of algorithn class 2
Analysis of algorithn class 2Analysis of algorithn class 2
Analysis of algorithn class 2
 

Recently uploaded

Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...drjose256
 
Final DBMS Manual (2).pdf final lab manual
Final DBMS Manual (2).pdf final lab manualFinal DBMS Manual (2).pdf final lab manual
Final DBMS Manual (2).pdf final lab manualBalamuruganV28
 
Fuzzy logic method-based stress detector with blood pressure and body tempera...
Fuzzy logic method-based stress detector with blood pressure and body tempera...Fuzzy logic method-based stress detector with blood pressure and body tempera...
Fuzzy logic method-based stress detector with blood pressure and body tempera...IJECEIAES
 
Diploma Engineering Drawing Qp-2024 Ece .pdf
Diploma Engineering Drawing Qp-2024 Ece .pdfDiploma Engineering Drawing Qp-2024 Ece .pdf
Diploma Engineering Drawing Qp-2024 Ece .pdfJNTUA
 
Worksharing and 3D Modeling with Revit.pptx
Worksharing and 3D Modeling with Revit.pptxWorksharing and 3D Modeling with Revit.pptx
Worksharing and 3D Modeling with Revit.pptxMustafa Ahmed
 
Raashid final report on Embedded Systems
Raashid final report on Embedded SystemsRaashid final report on Embedded Systems
Raashid final report on Embedded SystemsRaashidFaiyazSheikh
 
Independent Solar-Powered Electric Vehicle Charging Station
Independent Solar-Powered Electric Vehicle Charging StationIndependent Solar-Powered Electric Vehicle Charging Station
Independent Solar-Powered Electric Vehicle Charging Stationsiddharthteach18
 
electrical installation and maintenance.
electrical installation and maintenance.electrical installation and maintenance.
electrical installation and maintenance.benjamincojr
 
Developing a smart system for infant incubators using the internet of things ...
Developing a smart system for infant incubators using the internet of things ...Developing a smart system for infant incubators using the internet of things ...
Developing a smart system for infant incubators using the internet of things ...IJECEIAES
 
8th International Conference on Soft Computing, Mathematics and Control (SMC ...
8th International Conference on Soft Computing, Mathematics and Control (SMC ...8th International Conference on Soft Computing, Mathematics and Control (SMC ...
8th International Conference on Soft Computing, Mathematics and Control (SMC ...josephjonse
 
Research Methodolgy & Intellectual Property Rights Series 1
Research Methodolgy & Intellectual Property Rights Series 1Research Methodolgy & Intellectual Property Rights Series 1
Research Methodolgy & Intellectual Property Rights Series 1T.D. Shashikala
 
UNIT 4 PTRP final Convergence in probability.pptx
UNIT 4 PTRP final Convergence in probability.pptxUNIT 4 PTRP final Convergence in probability.pptx
UNIT 4 PTRP final Convergence in probability.pptxkalpana413121
 
NEWLETTER FRANCE HELICES/ SDS SURFACE DRIVES - MAY 2024
NEWLETTER FRANCE HELICES/ SDS SURFACE DRIVES - MAY 2024NEWLETTER FRANCE HELICES/ SDS SURFACE DRIVES - MAY 2024
NEWLETTER FRANCE HELICES/ SDS SURFACE DRIVES - MAY 2024EMMANUELLEFRANCEHELI
 
Interfacing Analog to Digital Data Converters ee3404.pdf
Interfacing Analog to Digital Data Converters ee3404.pdfInterfacing Analog to Digital Data Converters ee3404.pdf
Interfacing Analog to Digital Data Converters ee3404.pdfragupathi90
 
15-Minute City: A Completely New Horizon
15-Minute City: A Completely New Horizon15-Minute City: A Completely New Horizon
15-Minute City: A Completely New HorizonMorshed Ahmed Rahath
 
Working Principle of Echo Sounder and Doppler Effect.pdf
Working Principle of Echo Sounder and Doppler Effect.pdfWorking Principle of Echo Sounder and Doppler Effect.pdf
Working Principle of Echo Sounder and Doppler Effect.pdfSkNahidulIslamShrabo
 
5G and 6G refer to generations of mobile network technology, each representin...
5G and 6G refer to generations of mobile network technology, each representin...5G and 6G refer to generations of mobile network technology, each representin...
5G and 6G refer to generations of mobile network technology, each representin...archanaece3
 
What is Coordinate Measuring Machine? CMM Types, Features, Functions
What is Coordinate Measuring Machine? CMM Types, Features, FunctionsWhat is Coordinate Measuring Machine? CMM Types, Features, Functions
What is Coordinate Measuring Machine? CMM Types, Features, FunctionsVIEW
 
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdf
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdfInvolute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdf
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdfJNTUA
 
Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)Ramkumar k
 

Recently uploaded (20)

Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
 
Final DBMS Manual (2).pdf final lab manual
Final DBMS Manual (2).pdf final lab manualFinal DBMS Manual (2).pdf final lab manual
Final DBMS Manual (2).pdf final lab manual
 
Fuzzy logic method-based stress detector with blood pressure and body tempera...
Fuzzy logic method-based stress detector with blood pressure and body tempera...Fuzzy logic method-based stress detector with blood pressure and body tempera...
Fuzzy logic method-based stress detector with blood pressure and body tempera...
 
Diploma Engineering Drawing Qp-2024 Ece .pdf
Diploma Engineering Drawing Qp-2024 Ece .pdfDiploma Engineering Drawing Qp-2024 Ece .pdf
Diploma Engineering Drawing Qp-2024 Ece .pdf
 
Worksharing and 3D Modeling with Revit.pptx
Worksharing and 3D Modeling with Revit.pptxWorksharing and 3D Modeling with Revit.pptx
Worksharing and 3D Modeling with Revit.pptx
 
Raashid final report on Embedded Systems
Raashid final report on Embedded SystemsRaashid final report on Embedded Systems
Raashid final report on Embedded Systems
 
Independent Solar-Powered Electric Vehicle Charging Station
Independent Solar-Powered Electric Vehicle Charging StationIndependent Solar-Powered Electric Vehicle Charging Station
Independent Solar-Powered Electric Vehicle Charging Station
 
electrical installation and maintenance.
electrical installation and maintenance.electrical installation and maintenance.
electrical installation and maintenance.
 
Developing a smart system for infant incubators using the internet of things ...
Developing a smart system for infant incubators using the internet of things ...Developing a smart system for infant incubators using the internet of things ...
Developing a smart system for infant incubators using the internet of things ...
 
8th International Conference on Soft Computing, Mathematics and Control (SMC ...
8th International Conference on Soft Computing, Mathematics and Control (SMC ...8th International Conference on Soft Computing, Mathematics and Control (SMC ...
8th International Conference on Soft Computing, Mathematics and Control (SMC ...
 
Research Methodolgy & Intellectual Property Rights Series 1
Research Methodolgy & Intellectual Property Rights Series 1Research Methodolgy & Intellectual Property Rights Series 1
Research Methodolgy & Intellectual Property Rights Series 1
 
UNIT 4 PTRP final Convergence in probability.pptx
UNIT 4 PTRP final Convergence in probability.pptxUNIT 4 PTRP final Convergence in probability.pptx
UNIT 4 PTRP final Convergence in probability.pptx
 
NEWLETTER FRANCE HELICES/ SDS SURFACE DRIVES - MAY 2024
NEWLETTER FRANCE HELICES/ SDS SURFACE DRIVES - MAY 2024NEWLETTER FRANCE HELICES/ SDS SURFACE DRIVES - MAY 2024
NEWLETTER FRANCE HELICES/ SDS SURFACE DRIVES - MAY 2024
 
Interfacing Analog to Digital Data Converters ee3404.pdf
Interfacing Analog to Digital Data Converters ee3404.pdfInterfacing Analog to Digital Data Converters ee3404.pdf
Interfacing Analog to Digital Data Converters ee3404.pdf
 
15-Minute City: A Completely New Horizon
15-Minute City: A Completely New Horizon15-Minute City: A Completely New Horizon
15-Minute City: A Completely New Horizon
 
Working Principle of Echo Sounder and Doppler Effect.pdf
Working Principle of Echo Sounder and Doppler Effect.pdfWorking Principle of Echo Sounder and Doppler Effect.pdf
Working Principle of Echo Sounder and Doppler Effect.pdf
 
5G and 6G refer to generations of mobile network technology, each representin...
5G and 6G refer to generations of mobile network technology, each representin...5G and 6G refer to generations of mobile network technology, each representin...
5G and 6G refer to generations of mobile network technology, each representin...
 
What is Coordinate Measuring Machine? CMM Types, Features, Functions
What is Coordinate Measuring Machine? CMM Types, Features, FunctionsWhat is Coordinate Measuring Machine? CMM Types, Features, Functions
What is Coordinate Measuring Machine? CMM Types, Features, Functions
 
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdf
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdfInvolute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdf
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdf
 
Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)
 

CS-323 DAA.pdf

  • 1. DAA notes by Pallavi Joshi Chapter 1: The Role of Algorithms in Computing
  • 2. DAA notes by Pallavi Joshi Algorithms Informally, an algorithm is … A well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output. algorithm input output A sequence of computational steps that transform the input into output.
  • 3. DAA notes by Pallavi Joshi Algorithms Empirically, an algorithm is … A tool for solving a well-specified computational problem. Problem specification includes what the input is, what the desired output should be. Algorithm describes a specific computational procedure for achieving the desired output for a given input.
  • 4. DAA notes by Pallavi Joshi Algorithms The Sorting Problem: Input: A sequence of n numbers [a1, a2, … , an]. Output: A permutation or reordering [a'1, a'2, … , a'n ] of the input sequence such that a'1  a'2  …  a'n . An instance of the Sorting Problem: Input: A sequence of 6 number [31, 41, 59, 26, 41, 58]. Expected output for given instance: Expected Output: The permutation of the input [26, 31, 41, 41, 58 , 59].
  • 5. DAA notes by Pallavi Joshi Algorithms Some definitions … An algorithm is said to be correct if, for every input instance, it halts with the correct output. A correct algorithm solves the given computational problem. Focus will be on correct algorithms; incorrect algorithms can sometimes be useful. Algorithm specification may be in English, as a computer program, even as a hardware design.
  • 6. DAA notes by Pallavi Joshi Gallery of Problems The Human Genome Project seeks to identify all the 100,000 genes in human DNA, determining the sequences of the 3 billion chemical base pairs comprising human DNA, storing this information in databases, and developing tools for data analysis. Algorithms are needed (most of which are novel) to solve the many problems listed here … The huge network that is the Internet and the huge amount of data that courses through it require algorithms to efficiently manage and manipulate this data.
  • 7. DAA notes by Pallavi Joshi Gallery of Problems E-commerce enables goods and services to be negotiated and exchanged electronically. Crucial is the maintenance of privacy and security for all transactions. Traditional manufacturing and commerce require allocation of scarce resources in the most beneficial way. Linear programming algorithms are used extensively in commercial optimization problems.
  • 8. DAA notes by Pallavi Joshi Some algorithms • Shortest path algorithm – Given a weighted graph and two distinguished vertices -- the source and the destination -- compute the most efficient way to get from one to the other • Matrix multiplication algorithm – Given a sequence of conformable matrices, compute the most efficient way of forming the product of the matrix sequence
  • 9. DAA notes by Pallavi Joshi Some algorithms • Convex hull algorithm – Given a set of points on the plane, compute the smallest convex body that contains the points • String matching algorithm – Given a sequence of characters, compute where (if at all) a second sequence of characters occurs in the first
  • 10. DAA notes by Pallavi Joshi Hard problems • Usual measure of efficiency is speed – How long does an algorithm take to produce its result? – Define formally measures of efficiency • Problems exist that, in all probability, will take a long time to solve – Exponential complexity – NP-complete problems • Problems exist that are unsolvable
  • 11. DAA notes by Pallavi Joshi Hard problems • NP-complete problems are interesting in and of themselves – Some of them arise in real applications – Some of them look very similar to problems for which efficient solutions do exist – Knowing the difference is crucial • Not known whether NP-complete problems really are as hard as they seem, or, perhaps, the machinery for solving them efficiently has not been developed just yet
  • 12. DAA notes by Pallavi Joshi Hard problems • P  NP conjecture – Fundamental open problem in the theory of computational complexity – Open now for 30+ years
  • 13. DAA notes by Pallavi Joshi Algorithms as a technology • Even if computers were infinitely fast and memory was plentiful and free – Study of algorithms still important – still need to establish algorithm correctness – Since time and space resources are infinite, any correct algorithm would do • Real-world computers are fast but not infinitely so • Memory is cheap but not unlimited
  • 14. DAA notes by Pallavi Joshi Efficiency • Time and space efficiency are the goal • Algorithms often differ dramatically in their efficiency – Example: Two sorting algorithms • INSERTION-SORT – time efficiency is c1n2 • MERGE-SORT – time efficiency is c1nlogn – For which problem instances would one algorithm be preferable to the other?
  • 15. DAA notes by Pallavi Joshi Efficiency – Answer depends on several factors: • Speed of machine performing the computation – Internal clock speed – Shared environment – I/O needed by algorithm • Quality of implementation (coding) – Compiler optimization – Implementation details (e.g., data structures) • Size of problem instance – Most stable parameter – used as independent variable
  • 16. DAA notes by Pallavi Joshi Efficiency • INSERTION-SORT – Implemented by an ace programmer and run on a machine A that performs 109 instructions per second such that time efficiency is given by: tA(n) = 2n2 instructions (i.e., c1=2) • MERGE-SORT – Implemented by a novice programmer and run on a machine B that performs 107 instructions per second such that time efficiency is given by: tB(n) = 50nlogn instructions (i.e., c1=50)
  • 17. DAA notes by Pallavi Joshi Efficiency n 2n2 /109 50nlogn/107 10,000 0.20 0.66 50,000 5.00 3.90 100,000 20.00 8.30 500,000 500.00 47.33 1,000,000 2,000.00 99.66 5,000,000 50,000.00 556.34 10,000,000 200,000.00 1,162.67 50,000,000 5,000,000.00 6,393.86 Problem Size Machine A Insertion- Sort Machine B Merge- Sort
  • 18. DAA notes by Pallavi Joshi Efficiency • Graphical comparison Time Efficiency Comparison 0.00 2.00 4.00 6.00 8.00 10.00 1 9 17 25 33 41 49 57 65 Size of Problem (in 1000s) Seconds Insertion Sort Merge Sort
  • 19. DAA notes by Pallavi Joshi The Sorting Problem Input: A sequence of n numbers [a1, a2, … , an]. Output: A permutation or reordering [a'1, a'2, … , a'n ] of the input sequence such that a'1  a'2  …  a'n . An instance of the Sorting Problem: Input: A sequence of 6 number [31, 41, 59, 26, 41, 58]. Expected output for given instance: Expected Output: The permutation of the input [26, 31, 41, 41, 58 , 59].
  • 20. DAA notes by Pallavi Joshi Insertion Sort The main idea …
  • 21. DAA notes by Pallavi Joshi Insertion Sort (cont.)
  • 22. DAA notes by Pallavi Joshi Insertion Sort (cont.)
  • 23. DAA notes by Pallavi Joshi Insertion Sort (cont.) The algorithm …
  • 24. DAA notes by Pallavi Joshi Loop Invariant • Property of A[1 .. j  1] At the start of each iteration of the for loop of lines 1 8, the subarray A[1 .. j  1] consists of the elements originally in A[1 .. j  1] but in sorted order. • Need to establish the following re invariant: – Initialization: true prior to first iteration – Maintenance: if true before iteration, remains true after iteration – Termination: at loop termination, invariant implies correctness of algorithm
  • 25. DAA notes by Pallavi Joshi Analyzing Algorithms • Has come to mean predicting the resources that the algorithm requires • Usually computational time is resource of primary importance • Aims to identify best choice among several alternate algorithms • Requires an agreed-upon “model” of computation • Shall use a generic, one-processor, random-access machine (RAM) model of computation
  • 26. DAA notes by Pallavi Joshi Random-Access Machine • Instructions are executed one after another (no concurrency) • Admits commonly found instructions in “real” computers, data movement operations, control mechanism • Uses common data types (integer and float) • Other properties discussed as needed • Care must be taken since model of computation has great implications on resulting analysis
  • 27. DAA notes by Pallavi Joshi Analysis of Insertion Sort • Time resource requirement depends on input size • Input size depends on problem being studied; frequently, this is the number of items in the input • Running time: number of primitive operations or “steps” executed for an input • Assume constant amount of time for each line of pseudocode
  • 28. DAA notes by Pallavi Joshi Analysis of Insertion Sort Time efficiency analysis …
  • 29. DAA notes by Pallavi Joshi Best Case Analysis • Least amount of (time) resource ever needed by algorithm • Achieved when incoming list is already sorted in increasing order • Inner loop is never iterated • Cost is given by: T(n) = c1n+c2 (n1)+c4 (n1)+c5(n1)+c8(n1) = (c1+c2+c4+c5+c8)n  (c2+c4+c5+c8) = an + b • Linear function of n
  • 30. DAA notes by Pallavi Joshi Worst Case Analysis • Greatest amount of (time) resource ever needed by algorithm • Achieved when incoming list is in reverse order • Inner loop is iterated the maximum number of times, i.e., tj = j • Therefore, the cost will be: T(n) = c1n + c2 (n1)+c4 (n1) + c5((n(n+1)/2) 1) + c6(n(n1)/2) + c7(n(n1)/2) + c8(n1) = ( c5 /2 + c6 /2 + c7/2 ) n2 + (c1+c2+c4+c5 /2  c6 /2  c7 /2 +c8 ) n  ( c2 + c4 + c5 + c8 ) = an2 + bn + c • Quadratic function of n
  • 31. DAA notes by Pallavi Joshi Future Analyses • For the most part, subsequent analyses will focus on: – Worst-case running time • Upper bound on running time for any input – Average-case analysis • Expected running time over all inputs • Often, worst-case and average-case have the same “order of growth”
• 32. DAA notes by Pallavi Joshi Order of Growth • Simplifying abstraction: interested in the rate of growth or order of growth of the running time of the algorithm • Allows us to compare algorithms without worrying about implementation performance • Usually only the highest-order term, without its constant coefficient, is taken • Uses “theta” notation – Best case of insertion sort is Θ(n) – Worst case of insertion sort is Θ(n²)
  • 33. DAA notes by Pallavi Joshi Designing Algorithms • Several techniques/patterns for designing algorithms exist • Incremental approach: builds the solution one component at a time • Divide-and-conquer approach: breaks original problem into several smaller instances of the same problem – Results in recursive algorithms – Easy to analyze complexity using proven techniques
• 34. DAA notes by Pallavi Joshi Divide-and-Conquer • Technique (or paradigm) involves: – “Divide” stage: Express problem in terms of several smaller subproblems – “Conquer” stage: Solve the smaller subproblems by applying the solution recursively – the smallest subproblems may be solved directly – “Combine” stage: Construct the solution to the original problem from the solutions of the smaller subproblems
• 35. DAA notes by Pallavi Joshi Merge Sort Strategy • Divide stage: Split the n-element sequence into two subsequences of n/2 elements each • Conquer stage: Recursively sort the two subsequences • Combine stage: Merge the two sorted subsequences into one sorted sequence (the solution) [Figure: n (unsorted) splits into two n/2 (unsorted) halves; MERGE SORT produces two n/2 (sorted) halves; MERGE combines them into n (sorted)]
  • 36. DAA notes by Pallavi Joshi Merging Sorted Sequences
• 37. DAA notes by Pallavi Joshi Merging Sorted Sequences • Combines the sorted subarrays A[p..q] and A[q+1..r] into one sorted array A[p..r] • Makes use of two working arrays L and R which initially hold copies of the two subarrays • Makes use of a sentinel value (∞) as the last element of each, to simplify the logic • Copying out costs Θ(n) and merging back costs Θ(n), with Θ(1) overhead, so the procedure runs in Θ(n) time
• 38. DAA notes by Pallavi Joshi Merge Sort Algorithm The divide step costs Θ(1), the merge costs Θ(n), and each of the two recursive calls costs T(n/2), giving: T(n) = 2T(n/2) + Θ(n)
  • 39. DAA notes by Pallavi Joshi Analysis of Merge Sort Analysis of recursive calls …
• 40. DAA notes by Pallavi Joshi Analysis of Merge Sort T(n) = cn(lg n + 1) = cn lg n + cn, so T(n) is Θ(n lg n)
  • 41. DAA notes by Pallavi Joshi Chapter 3: Growth of Functions
• 42. DAA notes by Pallavi Joshi Overview • Order of growth of functions provides a simple characterization of efficiency • Allows for comparison of relative performance between alternative algorithms • Concerned with the asymptotic efficiency of algorithms • The asymptotically most efficient algorithm is usually the best choice, except for small inputs • Several standard methods simplify the asymptotic analysis of algorithms
• 43. DAA notes by Pallavi Joshi Asymptotic Notation • Applies to functions whose domains are the set of natural numbers: N = {0, 1, 2, …} • If a time resource T(n) is being analyzed, the function’s range is usually the set of non-negative real numbers: T(n) ∈ R+ • If a space resource S(n) is being analyzed, the function’s range is usually also the set of natural numbers: S(n) ∈ N
  • 44. DAA notes by Pallavi Joshi Asymptotic Notation • Depending on the textbook, asymptotic categories may be expressed in terms of -- a. set membership (our textbook): functions belong to a family of functions that exhibit some property; or b. function property (other textbooks): functions exhibit the property • Caveat: we will formally use (a) and informally use (b)
• 45. DAA notes by Pallavi Joshi The Θ-Notation Θ(g(n)) = { f(n) : ∃c1, c2 > 0, n0 > 0 s.t. ∀n ≥ n0: c1 · g(n) ≤ f(n) ≤ c2 · g(n) } [Figure: f(n) sandwiched between c1·g(n) and c2·g(n) for all n ≥ n0]
• 46. DAA notes by Pallavi Joshi The O-Notation O(g(n)) = { f(n) : ∃c > 0, n0 > 0 s.t. ∀n ≥ n0: f(n) ≤ c · g(n) } [Figure: f(n) below c·g(n) for all n ≥ n0]
• 47. DAA notes by Pallavi Joshi The Ω-Notation Ω(g(n)) = { f(n) : ∃c > 0, n0 > 0 s.t. ∀n ≥ n0: f(n) ≥ c · g(n) } [Figure: f(n) above c·g(n) for all n ≥ n0]
• 48. DAA notes by Pallavi Joshi The o-Notation o(g(n)) = { f(n) : ∀c > 0 ∃n0 > 0 s.t. ∀n ≥ n0: f(n) < c · g(n) } – note the strict inequality, which must hold for every positive constant c [Figure: f(n) eventually falls below c1·g(n), c2·g(n), and c3·g(n), beyond n1, n2, and n3 respectively]
• 49. DAA notes by Pallavi Joshi The ω-Notation ω(g(n)) = { f(n) : ∀c > 0 ∃n0 > 0 s.t. ∀n ≥ n0: f(n) > c · g(n) } – again with strict inequality, for every positive constant c [Figure: f(n) eventually rises above c1·g(n), c2·g(n), and c3·g(n), beyond n1, n2, and n3 respectively]
• 50. DAA notes by Pallavi Joshi Comparison of Functions Transitivity: • f(n) = O(g(n)) and g(n) = O(h(n)) ⇒ f(n) = O(h(n)) • f(n) = Ω(g(n)) and g(n) = Ω(h(n)) ⇒ f(n) = Ω(h(n)) • f(n) = Θ(g(n)) and g(n) = Θ(h(n)) ⇒ f(n) = Θ(h(n)) Reflexivity: • f(n) = O(f(n)), f(n) = Ω(f(n)), f(n) = Θ(f(n))
• 51. DAA notes by Pallavi Joshi Comparison of Functions Symmetry: • f(n) = Θ(g(n)) ⇐⇒ g(n) = Θ(f(n)) Transpose symmetry: • f(n) = O(g(n)) ⇐⇒ g(n) = Ω(f(n)) • f(n) = o(g(n)) ⇐⇒ g(n) = ω(f(n)) Theorem 3.1: • f(n) = O(g(n)) and f(n) = Ω(g(n)) ⇒ f(n) = Θ(g(n))
  • 52. DAA notes by Pallavi Joshi Asymptotic Analysis and Limits
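The slide’s content is an image; assuming it presents the usual limit tests, the standard criteria (when the limit exists) are:

\lim_{n\to\infty} \frac{f(n)}{g(n)} = 0 \;\Rightarrow\; f(n) = o(g(n)); \qquad
\lim_{n\to\infty} \frac{f(n)}{g(n)} = c,\; 0 < c < \infty \;\Rightarrow\; f(n) = \Theta(g(n)); \qquad
\lim_{n\to\infty} \frac{f(n)}{g(n)} = \infty \;\Rightarrow\; f(n) = \omega(g(n)).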
  • 53. DAA notes by Pallavi Joshi Comparison of Functions • f1(n) = O(g1(n)) and f2(n) = O(g2(n)) ⇒ f1(n) + f2(n) = O(g1(n) + g2(n)) • f(n) = O(g(n)) ⇒ f(n) + g(n) = O(g(n))
• 54. DAA notes by Pallavi Joshi Standard Notation and Common Functions • Monotonicity A function f(n) is monotonically increasing if m ≤ n implies f(m) ≤ f(n). A function f(n) is monotonically decreasing if m ≤ n implies f(m) ≥ f(n). A function f(n) is strictly increasing if m < n implies f(m) < f(n). A function f(n) is strictly decreasing if m < n implies f(m) > f(n).
• 55. DAA notes by Pallavi Joshi Standard Notation and Common Functions • Floors and ceilings For any real number x, the greatest integer less than or equal to x is denoted by ⌊x⌋. For any real number x, the least integer greater than or equal to x is denoted by ⌈x⌉. For all real numbers x, x − 1 < ⌊x⌋ ≤ x ≤ ⌈x⌉ < x + 1. Both functions are monotonically increasing.
• 56. DAA notes by Pallavi Joshi Standard Notation and Common Functions • Exponentials For all n and a ≥ 1, the function a^n is the exponential function with base a and is monotonically increasing. • Logarithms Textbook adopts the following conventions: lg n = log₂n (binary logarithm), ln n = logₑn (natural logarithm), lg^k n = (lg n)^k (exponentiation), lg lg n = lg(lg n) (composition), lg n + k = (lg n) + k (precedence of lg).
• 57. DAA notes by Pallavi Joshi Standard Notation and Common Functions • Important relationships For all real constants a and b such that a > 1, n^b = o(a^n); that is, any exponential function with a base strictly greater than unity grows faster than any polynomial function. For all real constants a and b such that a > 0, lg^b n = o(n^a); that is, any positive polynomial function grows faster than any polylogarithmic function.
• 58. DAA notes by Pallavi Joshi Standard Notation and Common Functions • Factorials For all n, the function n! or “n factorial” is given by n! = n · (n−1) · (n−2) · … · 2 · 1. It can be established that n! = o(n^n), n! = ω(2^n), lg(n!) = Θ(n lg n).
• 59. DAA notes by Pallavi Joshi Standard Notation and Common Functions • Functional iteration The notation f^(i)(n) represents the function f(n) iteratively applied i times to an initial value of n, or, recursively: f^(i)(n) = n if i = 0; f^(i)(n) = f(f^(i−1)(n)) if i > 0. Example: If f(n) = 2n, then f^(2)(n) = f(2n) = 2(2n) = 2²n, f^(3)(n) = f(f^(2)(n)) = 2(2²n) = 2³n, and in general f^(i)(n) = 2^i n.
• 60. DAA notes by Pallavi Joshi Standard Notation and Common Functions • Iterated logarithmic function The notation lg* n, which reads “log star of n”, is defined as lg* n = min { i ≥ 0 : lg^(i) n ≤ 1 }. Example: lg* 2 = 1, lg* 4 = 2, lg* 16 = 3, lg* 65536 = 4, lg* 2^65536 = 5.
  • 61. DAA notes by Pallavi Joshi Asymptotic Running Time of Algorithms • We consider algorithm A better than algorithm B if TA(n) = o(TB(n)) • Why is it acceptable to ignore the behavior of algorithms for small inputs? • Why is it acceptable to ignore the constants? • What do we gain by using asymptotic notation?
  • 62. DAA notes by Pallavi Joshi Things to Remember • Asymptotic analysis studies how the values of functions compare as their arguments grow without bounds. • Ignores constants and the behavior of the function for small arguments. • Acceptable because all algorithms are fast for small inputs and growth of running time is more important than constant factors.
  • 63. DAA notes by Pallavi Joshi Things to Remember • Ignoring the usually unimportant details, we obtain a representation that succinctly describes the growth of a function as its argument grows and thus allows us to make comparisons between algorithms in terms of their efficiency.
• 64. DAA notes by Pallavi Joshi Chapter 4: Recurrences Overview Define what a recurrence is Discuss three methods of solving recurrences Substitution method Recursion-tree method Master method Examples of each method
• 65. DAA notes by Pallavi Joshi Definition A recurrence is an equation or inequality that describes a function in terms of its value on smaller inputs. Example from MERGE-SORT: T(n) = Θ(1) if n = 1; T(n) = 2T(n/2) + Θ(n) if n > 1. Technicalities Normally, independent variables only assume integral values. Example from MERGE-SORT revisited: T(n) = Θ(1) if n = 1; T(n) = T(⌈n/2⌉) + T(⌊n/2⌋) + Θ(n) if n > 1. For simplicity, ignore floors and ceilings – often insignificant.
• 66. DAA notes by Pallavi Joshi Technicalities Boundary conditions (small n) are also glossed over: T(n) = 2T(n/2) + Θ(n), with the value of T(n) assumed to be a small constant for small n. Substitution Method Involves two steps: 1. Guess the form of the solution. 2. Use mathematical induction to find the constants and show the solution works. Drawback: applied only in cases where it is easy to guess at the solution. Useful in estimating bounds on the true solution even if the latter is unidentified.
• 67. DAA notes by Pallavi Joshi Substitution Method Example: T(n) = 2T(⌊n/2⌋) + n. Guess: T(n) = O(n lg n). Prove by induction: T(n) ≤ cn lg n for suitable c > 0. Inductive Proof We’ll not worry about the basis case for the moment – we’ll choose this as needed – clearly we have: T(1) = Θ(1). Inductive hypothesis: For values of n < k the inequality holds, i.e., T(n) ≤ cn lg n. We need to show that this holds for n = k as well.
• 68. DAA notes by Pallavi Joshi Inductive Proof In particular, for n = ⌊k/2⌋, the inductive hypothesis should hold, i.e., T(⌊k/2⌋) ≤ c⌊k/2⌋ lg(⌊k/2⌋). The recurrence gives us: T(k) = 2T(⌊k/2⌋) + k. Substituting the inequality above yields: T(k) ≤ 2[c⌊k/2⌋ lg(⌊k/2⌋)] + k. Because of the non-decreasing nature of the functions involved, we can drop the floors and obtain: T(k) ≤ 2[c(k/2) lg(k/2)] + k, which simplifies to: T(k) ≤ ck(lg k − lg 2) + k. Or, since lg 2 = 1, we have: T(k) ≤ ck lg k − ck + k = ck lg k + (1 − c)k. So if c ≥ 1, T(k) ≤ ck lg k. Q.E.D.
• 69. DAA notes by Pallavi Joshi Recursion-Tree Method Straightforward technique for coming up with a good guess. Can help the Substitution Method. Recursion tree: a visual representation of the recursive call hierarchy where each node represents the cost of a single subproblem. Recursion-Tree Method T(n) = 3T(⌊n/4⌋) + Θ(n²)
• 70. DAA notes by Pallavi Joshi Recursion-Tree Method T(n) = 3T(⌊n/4⌋) + Θ(n²) [Figure: successive expansions of the recursion tree for this recurrence]
• 71. DAA notes by Pallavi Joshi Recursion-Tree Method T(n) = 3T(⌊n/4⌋) + Θ(n²) Gathering all the costs together: T(n) = Σ_{i=0}^{log₄n − 1} (3/16)^i cn² + Θ(n^{log₄3}) ≤ Σ_{i=0}^{∞} (3/16)^i cn² + o(n) = (1/(1 − 3/16))cn² + o(n) = (16/13)cn² + o(n) ⇒ T(n) = O(n²)
• 72. DAA notes by Pallavi Joshi Recursion-Tree Method T(n) = T(n/3) + T(2n/3) + O(n) An overestimate of the total cost: T(n) = Σ_{i=0}^{log_{3/2}n − 1} cn + Θ(n^{log_{3/2}2}) = O(n lg n) + ω(n lg n). Notwithstanding this counter-indication, use as “guess”: T(n) = O(n lg n)
• 73. DAA notes by Pallavi Joshi Substitution Method Recurrence: T(n) = T(n/3) + T(2n/3) + cn. Guess: T(n) = O(n lg n). Prove by induction: T(n) ≤ dn lg n for suitable d > 0 (we already use c). Inductive Proof Again, we’ll not worry about the basis case. Inductive hypothesis: For values of n < k the inequality holds, i.e., T(n) ≤ dn lg n. We need to show that this holds for n = k as well. In particular, for n = k/3 and n = 2k/3, the inductive hypothesis should hold…
• 74. DAA notes by Pallavi Joshi Inductive Proof That is: T(k/3) ≤ d(k/3) lg(k/3), T(2k/3) ≤ d(2k/3) lg(2k/3). The recurrence gives us: T(k) = T(k/3) + T(2k/3) + ck. Substituting the inequalities above yields: T(k) ≤ [d(k/3) lg(k/3)] + [d(2k/3) lg(2k/3)] + ck. Expanding, we get: T(k) ≤ [d(k/3) lg k − d(k/3) lg 3] + [d(2k/3) lg k − d(2k/3) lg(3/2)] + ck. Rearranging: T(k) ≤ dk lg k − d[(k/3) lg 3 + (2k/3) lg(3/2)] + ck = dk lg k − dk[lg 3 − 2/3] + ck. When d ≥ c/(lg 3 − (2/3)), we have the desired: T(k) ≤ dk lg k.
• 75. DAA notes by Pallavi Joshi Master Method Provides a “cookbook” method for solving recurrences. Recurrence must be of the form: T(n) = aT(n/b) + f(n), where a ≥ 1 and b > 1 are constants and f(n) is an asymptotically positive function. Master Method Theorem 4.1: Given the recurrence previously defined, we have: 1. If f(n) = O(n^{log_b a − ε}) for some constant ε > 0, then T(n) = Θ(n^{log_b a}). 2. If f(n) = Θ(n^{log_b a}), then T(n) = Θ(n^{log_b a} lg n).
• 76. DAA notes by Pallavi Joshi Master Method 3. If f(n) = Ω(n^{log_b a + ε}) for some constant ε > 0, and if af(n/b) ≤ cf(n) for some constant c < 1 and all sufficiently large n, then T(n) = Θ(f(n)). Example Estimate bounds on the following recurrence: [recurrence on slide image] Use the recursion tree method to arrive at a “guess”, then verify using induction. Point out which case of the Master Method this falls in.
• 77. DAA notes by Pallavi Joshi Recursion Tree The recurrence produces the following tree: [tree on slide image] Cost Summation Collecting the level-by-level costs: a geometric series with base less than one converges to a finite sum; hence, T(n) = Θ(n²).
• 78. DAA notes by Pallavi Joshi Exact Calculation If an exact solution is preferred: [derivation on slide image] Using the formula for a partial geometric series.
• 79. DAA notes by Pallavi Joshi Master Theorem (Simplified)
  • 80. DAA notes by Pallavi Joshi • Divide and Conquer
  • 81. DAA notes by Pallavi Joshi Divide and Conquer - Recursive in structure Divide the problem into sub-problems that are similar to the original but smaller in size Conquer the sub-problems by solving them recursively. If they are small enough, just solve them in a straightforward manner. Combine the solutions to create a solution to the original problem
• 82. DAA notes by Pallavi Joshi An Example: Merge Sort Sorting Problem: Sort a sequence of n elements into non-decreasing order. Divide: Divide the n-element sequence to be sorted into two subsequences of n/2 elements each. Conquer: Sort the two subsequences recursively using merge sort. Combine: Merge the two sorted subsequences to produce the sorted answer.
• 83. DAA notes by Pallavi Joshi Merge Sort – Example [Figure: the original sequence 18 26 32 6 43 15 9 1 is split recursively down to single elements, then merged pairwise back up into the sorted sequence 1 6 9 15 18 26 32 43]
• 84. DAA notes by Pallavi Joshi Merge-Sort(A, p, r) INPUT: a sequence of n numbers stored in array A. OUTPUT: an ordered sequence of n numbers.
MergeSort(A, p, r) // sort A[p..r] by divide & conquer
1 if p < r
2 then q ← ⌊(p+r)/2⌋
3 MergeSort(A, p, q)
4 MergeSort(A, q+1, r)
5 Merge(A, p, q, r) // merges A[p..q] with A[q+1..r]
Initial Call: MergeSort(A, 1, n)
• 85. DAA notes by Pallavi Joshi Procedure Merge Input: Array containing sorted subarrays A[p..q] and A[q+1..r]. Output: Merged sorted subarray in A[p..r].
Merge(A, p, q, r)
1 n1 ← q − p + 1
2 n2 ← r − q
3 for i ← 1 to n1
4 do L[i] ← A[p + i − 1]
5 for j ← 1 to n2
6 do R[j] ← A[q + j]
7 L[n1 + 1] ← ∞
8 R[n2 + 1] ← ∞
9 i ← 1
10 j ← 1
11 for k ← p to r
12 do if L[i] ≤ R[j]
13 then A[k] ← L[i]
14 i ← i + 1
15 else A[k] ← R[j]
16 j ← j + 1
Sentinels (∞) avoid having to check at each step whether either subarray is fully copied.
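As a sketch, the pseudocode above translates directly into Python (0-indexed, with float('inf') as the sentinel):

def merge(a, p, q, r):
    # Copy the two sorted halves and append sentinels.
    left = a[p:q + 1] + [float('inf')]
    right = a[q + 1:r + 1] + [float('inf')]
    i = j = 0
    for k in range(p, r + 1):          # refill a[p..r] in sorted order
        if left[i] <= right[j]:
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1

def merge_sort(a, p, r):
    if p < r:
        q = (p + r) // 2
        merge_sort(a, p, q)
        merge_sort(a, q + 1, r)
        merge(a, p, q, r)

nums = [18, 26, 32, 6, 43, 15, 9, 1]
merge_sort(nums, 0, len(nums) - 1)
print(nums)  # [1, 6, 9, 15, 18, 26, 32, 43]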
• 86. DAA notes by Pallavi Joshi Merge – Example [Figure: step-by-step merge of L = 〈6, 8, 26, 32, ∞〉 and R = 〈1, 9, 42, 43, ∞〉 into A = 〈1, 6, 8, 9, 26, 32, 42, 43〉, with the indices i, j, and k advancing]
• 87. DAA notes by Pallavi Joshi Merge Loop Invariant for the for loop At the start of each iteration of the for loop: Subarray A[p..k − 1] contains the k − p smallest elements of L and R, in sorted order; L[i] and R[j] are the smallest elements of L and R that have not been copied back into A. Initialization: Before the first iteration: • A[p..k − 1] is empty. • i = j = 1. • L[1] and R[1] are the smallest elements of L and R not copied to A.
• 88. DAA notes by Pallavi Joshi Merge Maintenance: Case 1: L[i] ≤ R[j]. • By the LI, A contains the k − p smallest elements of L and R in sorted order. • By the LI, L[i] and R[j] are the smallest elements of L and R not yet copied into A. • Line 13 results in A containing the k − p + 1 smallest elements (again in sorted order). Incrementing i and k reestablishes the LI for the next iteration. Similarly for L[i] > R[j]. Termination: • On termination, k = r + 1. • By the LI, A contains the r − p + 1 smallest elements of L and R, in sorted order. • L and R together contain r − p + 3 elements; all but the two sentinels have been copied back into A.
• 89. DAA notes by Pallavi Joshi Analysis of Merge Sort Running time T(n) of Merge Sort: Divide: computing the middle takes Θ(1). Conquer: solving 2 subproblems takes 2T(n/2). Combine: merging n elements takes Θ(n). Total: T(n) = Θ(1) if n = 1; T(n) = 2T(n/2) + Θ(n) if n > 1 ⇒ T(n) = Θ(n lg n) (CLRS, Chapter 4)
• 90. DAA notes by Pallavi Joshi Recurrences – I
• 91. DAA notes by Pallavi Joshi Recurrence Relations An equation or an inequality that characterizes a function by its values on smaller inputs. Solution Methods (Chapter 4): Substitution Method. Recursion-tree Method. Master Method. Recurrence relations arise when we analyze the running time of iterative or recursive algorithms. Ex: Divide and Conquer: T(n) = Θ(1) if n ≤ c; T(n) = aT(n/b) + D(n) + C(n) otherwise.
• 92. DAA notes by Pallavi Joshi Substitution Method Guess the form of the solution, then use mathematical induction to show it correct. Substitute the guessed answer for the function when the inductive hypothesis is applied to smaller values – hence the name. Works well when the solution is easy to guess. No general way to guess the correct solution.
• 93. DAA notes by Pallavi Joshi Example – Exact Function Recurrence: T(n) = 1 if n = 1; T(n) = 2T(n/2) + n if n > 1. Guess: T(n) = n lg n + n. Induction: • Basis: n = 1 ⇒ n lg n + n = 1 = T(n). • Hypothesis: T(k) = k lg k + k for all k < n. • Inductive Step: T(n) = 2T(n/2) + n = 2((n/2) lg(n/2) + (n/2)) + n = n lg(n/2) + 2n = n lg n − n + 2n = n lg n + n.
• 94. DAA notes by Pallavi Joshi Recursion-tree Method Making a good guess is sometimes difficult with the substitution method. Use recursion trees to devise good guesses. Recursion Trees Show successive expansions of recurrences using trees. Keep track of the time spent on the subproblems of a divide-and-conquer algorithm. Help organize the algebraic bookkeeping necessary to solve a recurrence.
• 95. DAA notes by Pallavi Joshi Recursion Tree – Example Running time of Merge Sort: T(n) = Θ(1) if n = 1; T(n) = 2T(n/2) + Θ(n) if n > 1. Rewrite the recurrence as T(n) = c if n = 1; T(n) = 2T(n/2) + cn if n > 1, where c > 0 is the running time for the base case and the time per array element for the divide and combine steps.
• 96. DAA notes by Pallavi Joshi Recursion Tree for Merge Sort For the original problem, we have a cost of cn, plus two subproblems, each of size n/2 and running time T(n/2). Each of the size-n/2 problems has a cost of cn/2 plus two subproblems, each costing T(n/4). [Tree: cn at the root over T(n/2), T(n/2); then cn over cn/2, cn/2, over four T(n/4) nodes. The divide-and-merge cost sits at each node; the sorting cost sits in the subtrees.]
• 97. DAA notes by Pallavi Joshi Recursion Tree for Merge Sort Continue expanding until the problem size reduces to 1. [Tree: levels cost cn; cn/2 + cn/2; four times cn/4; …; n leaves of cost c. There are lg n + 1 levels, each with total cost cn.] Total: cn lg n + cn
• 98. DAA notes by Pallavi Joshi Recursion Tree for Merge Sort Continue expanding until the problem size reduces to 1. • Each level has total cost cn. • Each time we go down one level, the number of subproblems doubles, but the cost per subproblem halves ⇒ the cost per level remains the same. • There are lg n + 1 levels; the height is lg n (assuming n is a power of 2; can be proved by induction). • Total cost = sum of costs at each level = (lg n + 1)cn = cn lg n + cn = Θ(n lg n).
• 99. DAA notes by Pallavi Joshi Other Examples Use the recursion-tree method to determine a guess for the recurrences: T(n) = 3T(n/4) + Θ(n²). T(n) = T(n/3) + T(2n/3) + O(n).
• 100. DAA notes by Pallavi Joshi Recursion Trees – Caution Note Recursion trees only generate guesses: verify guesses using the substitution method. A small amount of “sloppiness” can be tolerated. Why? If careful when drawing out a recursion tree and summing the costs, it can be used as a direct proof.
• 101. DAA notes by Pallavi Joshi The Master Method Based on the Master theorem. “Cookbook” approach for solving recurrences of the form T(n) = aT(n/b) + f(n) • a ≥ 1, b > 1 are constants. • f(n) is asymptotically positive. • n/b may not be an integer, but we ignore floors and ceilings. Why? Requires memorization of three cases.
• 102. DAA notes by Pallavi Joshi The Master Theorem Theorem 4.1 Let a ≥ 1 and b > 1 be constants, let f(n) be a function, and let T(n) be defined on the nonnegative integers by the recurrence T(n) = aT(n/b) + f(n), where we can replace n/b by ⌊n/b⌋ or ⌈n/b⌉. T(n) can be bounded asymptotically in three cases: 1. If f(n) = O(n^{log_b a − ε}) for some constant ε > 0, then T(n) = Θ(n^{log_b a}). 2. If f(n) = Θ(n^{log_b a}), then T(n) = Θ(n^{log_b a} lg n). 3. If f(n) = Ω(n^{log_b a + ε}) for some constant ε > 0, and if, for some constant c < 1 and all sufficiently large n, we have a·f(n/b) ≤ c·f(n), then T(n) = Θ(f(n)). We’ll return to recurrences as we need them…
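As a quick sanity check of case 2, apply the theorem to the merge sort recurrence derived earlier:

T(n) = 2T(n/2) + \Theta(n): \quad a = 2,\; b = 2,\; n^{\log_b a} = n^{\log_2 2} = n = \Theta(f(n)) \;\Rightarrow\; T(n) = \Theta(n \lg n).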
  • 103. DAA notes by Pallavi Joshi • Heap Sort Algorithm
• 104. DAA notes by Pallavi Joshi Special Types of Trees • Def: Full binary tree = a binary tree in which each node is either a leaf or has degree exactly 2. • Def: Complete binary tree = a binary tree in which all leaves are on the same level and all internal nodes have degree 2. [Figures: an example full binary tree and an example complete binary tree]
• 105. DAA notes by Pallavi Joshi Definitions • Height of a node = the number of edges on the longest simple path from the node down to a leaf • Level of a node = the length of the path from the root to the node • Height of tree = height of the root node [Figure: a tree with height of root = 3, height of node (2) = 1, level of node (10) = 2]
• 106. DAA notes by Pallavi Joshi Useful Properties • A complete binary tree has 2^l nodes at level l, and a complete binary tree of height d has 2^{d+1} − 1 nodes in total. • Consequently, a nearly complete binary tree with n nodes has height ⌊lg n⌋. (See Ex 6.1-2, page 129.)
• 107. DAA notes by Pallavi Joshi The Heap Data Structure • Def: A heap is a nearly complete binary tree with the following two properties: – Structural property: all levels are full, except possibly the last one, which is filled from left to right – Order (heap) property: for any node x, Parent(x) ≥ x [Figure: a small max-heap on the keys 2, 4, 5, 7, 8] From the heap property, it follows that: “The root is the maximum element of the heap!”
• 108. DAA notes by Pallavi Joshi Array Representation of Heaps • A heap can be stored as an array A. – Root of tree is A[1] – Left child of A[i] = A[2i] – Right child of A[i] = A[2i + 1] – Parent of A[i] = A[⌊i/2⌋] – heap-size[A] ≤ length[A] • The elements in the subarray A[(⌊n/2⌋+1) .. n] are leaves
• 109. DAA notes by Pallavi Joshi Heap Types • Max-heaps (largest element at root), have the max-heap property: – for all nodes i, excluding the root: A[PARENT(i)] ≥ A[i] • Min-heaps (smallest element at root), have the min-heap property: – for all nodes i, excluding the root: A[PARENT(i)] ≤ A[i]
• 110. DAA notes by Pallavi Joshi Adding/Deleting Nodes • New nodes are always inserted at the bottom level (left to right) • Nodes are removed from the bottom level (right to left)
• 111. DAA notes by Pallavi Joshi Operations on Heaps • Maintain/Restore the max-heap property – MAX-HEAPIFY • Create a max-heap from an unordered array – BUILD-MAX-HEAP • Sort an array in place – HEAPSORT • Priority queues
• 112. DAA notes by Pallavi Joshi Maintaining the Heap Property • Suppose a node is smaller than a child – Left and Right subtrees of i are max-heaps • To eliminate the violation: – Exchange with larger child – Move down the tree – Continue until node is not smaller than children
• 113. DAA notes by Pallavi Joshi Example MAX-HEAPIFY(A, 2, 10): A[2] violates the heap property ⇒ exchange A[2] ↔ A[4]; now A[4] violates the heap property ⇒ exchange A[4] ↔ A[9]; heap property restored.
• 114. DAA notes by Pallavi Joshi Maintaining the Heap Property • Assumptions: – Left and Right subtrees of i are max-heaps – A[i] may be smaller than its children
Alg: MAX-HEAPIFY(A, i, n)
1. l ← LEFT(i)
2. r ← RIGHT(i)
3. if l ≤ n and A[l] > A[i]
4. then largest ← l
5. else largest ← i
6. if r ≤ n and A[r] > A[largest]
7. then largest ← r
8. if largest ≠ i
9. then exchange A[i] ↔ A[largest]
10. MAX-HEAPIFY(A, largest, n)
• 115. DAA notes by Pallavi Joshi MAX-HEAPIFY Running Time • Intuitively, MAX-HEAPIFY traces one root-to-leaf path, doing constant work per level • Running time of MAX-HEAPIFY is O(lg n) • Can be written in terms of the height h of the heap, as being O(h) – since the height of the heap is ⌊lg n⌋
• 116. DAA notes by Pallavi Joshi Building a Heap
Alg: BUILD-MAX-HEAP(A)
1. n = length[A]
2. for i ← ⌊n/2⌋ downto 1
3. do MAX-HEAPIFY(A, i, n)
• Convert an array A[1 … n] into a max-heap (n = length[A]) • The elements in the subarray A[(⌊n/2⌋+1) .. n] are leaves • Apply MAX-HEAPIFY on elements between 1 and ⌊n/2⌋ [Figure: A = 〈4, 1, 3, 2, 16, 9, 10, 14, 8, 7〉 and the corresponding tree]
• 117. DAA notes by Pallavi Joshi Example: [Figure: BUILD-MAX-HEAP trace on A = 〈4, 1, 3, 2, 16, 9, 10, 14, 8, 7〉, applying MAX-HEAPIFY for i = 5, 4, 3, 2, 1 and ending in a max-heap with 16 at the root]
• 118. DAA notes by Pallavi Joshi Running Time of BUILD-MAX-HEAP Simple bound: the loop makes O(n) calls to MAX-HEAPIFY, each costing O(lg n) ⇒ running time O(n lg n) • This is not an asymptotically tight upper bound
• 119. DAA notes by Pallavi Joshi Running Time of BUILD-MAX-HEAP • HEAPIFY takes O(h) ⇒ the cost of HEAPIFY on a node i is proportional to the height of node i in the tree. For a heap of height h: ni = 2^i is the number of nodes at level i, and hi = h − i is the height of the subheap rooted at a level-i node, so T(n) = Σ_{i=0}^{h} ni·hi = Σ_{i=0}^{h} 2^i(h − i)
• 120. DAA notes by Pallavi Joshi Running Time of BUILD-MAX-HEAP T(n) = Σ_{i=0}^{h} 2^i(h − i). Change variables (k = h − i) and write 2^i as 2^h/2^k: T(n) = 2^h Σ_{k=0}^{h} k/2^k ≤ 2^h Σ_{k=0}^{∞} k/2^k = 2^h · 2. The infinite sum equals 2, and 2^h ≤ n (with h = ⌊lg n⌋), so the running time of BUILD-MAX-HEAP is T(n) = O(n).
• 121. DAA notes by Pallavi Joshi Heapsort • Goal: – Sort an array using heap representations • Idea: – Build a max-heap from the array – Swap the root (the maximum element) with the last element in the array – “Discard” this last node by decreasing the heap size – Call MAX-HEAPIFY on the new root – Repeat this process until only one node remains
• 122. DAA notes by Pallavi Joshi Example: A = [7, 4, 3, 1, 2] MAX-HEAPIFY(A, 1, 4) MAX-HEAPIFY(A, 1, 3) MAX-HEAPIFY(A, 1, 2) MAX-HEAPIFY(A, 1, 1)
• 123. DAA notes by Pallavi Joshi
Alg: HEAPSORT(A)
1. BUILD-MAX-HEAP(A) // O(n)
2. for i ← length[A] downto 2 // n − 1 times
3. do exchange A[1] ↔ A[i]
4. MAX-HEAPIFY(A, 1, i − 1) // O(lg n)
• Running time: O(n lg n); can be shown to be Θ(n lg n)
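Collecting MAX-HEAPIFY, BUILD-MAX-HEAP, and HEAPSORT into one runnable Python sketch (0-indexed, so the children of i are 2i+1 and 2i+2; the sift-down is written iteratively, which is equivalent to the recursive pseudocode):

def max_heapify(a, i, n):
    # Sift a[i] down within a[0:n]; both subtrees are assumed max-heaps.
    while True:
        l, r = 2 * i + 1, 2 * i + 2
        largest = i
        if l < n and a[l] > a[largest]:
            largest = l
        if r < n and a[r] > a[largest]:
            largest = r
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest

def build_max_heap(a):
    for i in range(len(a) // 2 - 1, -1, -1):   # internal nodes, bottom-up
        max_heapify(a, i, len(a))

def heapsort(a):
    build_max_heap(a)
    for end in range(len(a) - 1, 0, -1):
        a[0], a[end] = a[end], a[0]    # move the current max to its slot
        max_heapify(a, 0, end)         # restore the heap on the prefix

nums = [7, 4, 3, 1, 2]
heapsort(nums)
print(nums)  # [1, 2, 3, 4, 7]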
• 124. DAA notes by Pallavi Joshi Priority Queues
• 125. DAA notes by Pallavi Joshi Operations on Priority Queues • Max-priority queues support the following operations: – INSERT(S, x): inserts element x into set S – EXTRACT-MAX(S): removes and returns element of S with largest key – MAXIMUM(S): returns element of S with largest key – INCREASE-KEY(S, x, k): increases value of element x’s key to k (assume k ≥ x’s current key value)
• 126. DAA notes by Pallavi Joshi HEAP-MAXIMUM Goal: – Return the largest element of the heap Alg: HEAP-MAXIMUM(A) 1. return A[1] Running time: O(1) [Figure: Heap A; HEAP-MAXIMUM(A) returns 7]
• 127. DAA notes by Pallavi Joshi HEAP-EXTRACT-MAX Goal: – Extract the largest element of the heap (i.e., return the max value and also remove that element from the heap) Idea: – Exchange the root element with the last – Decrease the size of the heap by 1 element – Call MAX-HEAPIFY on the new root, on a heap of size n − 1 [Figure: Heap A; the root is the largest element]
• 128. DAA notes by Pallavi Joshi Example: HEAP-EXTRACT-MAX [Figure: max = 16 is returned from the root; the last element replaces the root; the heap size is decreased by 1; MAX-HEAPIFY(A, 1, n−1) restores the max-heap]
• 129. DAA notes by Pallavi Joshi HEAP-EXTRACT-MAX
Alg: HEAP-EXTRACT-MAX(A, n)
1. if n < 1
2. then error “heap underflow”
3. max ← A[1]
4. A[1] ← A[n]
5. MAX-HEAPIFY(A, 1, n − 1) // remakes heap
6. return max
Running time: O(lg n)
• 130. DAA notes by Pallavi Joshi HEAP-INCREASE-KEY • Goal: – Increase the key of an element i in the heap • Idea: – Increment the key of A[i] to its new value – If the max-heap property does not hold anymore: traverse a path toward the root to find the proper place for the newly increased key [Figure: node i with Key[i] ← 15]
• 131. DAA notes by Pallavi Joshi Example: HEAP-INCREASE-KEY [Figure: the key of node i is increased from 4 to 15; 15 is exchanged upward past its smaller ancestors until the max-heap property is restored]
• 132. DAA notes by Pallavi Joshi HEAP-INCREASE-KEY
Alg: HEAP-INCREASE-KEY(A, i, key)
1. if key < A[i]
2. then error “new key is smaller than current key”
3. A[i] ← key
4. while i > 1 and A[PARENT(i)] < A[i]
5. do exchange A[i] ↔ A[PARENT(i)]
6. i ← PARENT(i)
• Running time: O(lg n)
• 133. DAA notes by Pallavi Joshi MAX-HEAP-INSERT • Goal: – Insert a new element into a max-heap • Idea: – Expand the max-heap with a new leaf whose key is −∞ – Call HEAP-INCREASE-KEY to set the key of the new node to its correct value and maintain the max-heap property [Figure: heap before and after adding the −∞ leaf]
• 134. DAA notes by Pallavi Joshi Example: MAX-HEAP-INSERT [Figure: insert value 15: start by inserting a −∞ leaf; increase its key to 15 by calling HEAP-INCREASE-KEY on A[11]; 15 bubbles up, giving the restored heap containing the newly added element]
• 135. DAA notes by Pallavi Joshi MAX-HEAP-INSERT
Alg: MAX-HEAP-INSERT(A, key, n)
1. heap-size[A] ← n + 1
2. A[n + 1] ← −∞
3. HEAP-INCREASE-KEY(A, n + 1, key)
Running time: O(lg n)
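A matching Python sketch of the four priority-queue operations (0-indexed, with the parent of i at (i − 1) // 2; max_heapify is the function from the heapsort sketch above):

import math

def heap_maximum(a):
    return a[0]                            # O(1)

def heap_extract_max(a):
    if not a:
        raise IndexError("heap underflow")
    maximum = a[0]
    a[0] = a[-1]
    a.pop()                                # shrink the heap by one
    max_heapify(a, 0, len(a))              # fix the root, O(lg n)
    return maximum

def heap_increase_key(a, i, key):
    if key < a[i]:
        raise ValueError("new key is smaller than current key")
    a[i] = key
    while i > 0 and a[(i - 1) // 2] < a[i]:   # bubble up toward the root
        a[i], a[(i - 1) // 2] = a[(i - 1) // 2], a[i]
        i = (i - 1) // 2

def max_heap_insert(a, key):
    a.append(-math.inf)                    # new leaf with key -infinity
    heap_increase_key(a, len(a) - 1, key)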
• 136. DAA notes by Pallavi Joshi Summary • We can perform the following operations on heaps: – MAX-HEAPIFY O(lg n) – BUILD-MAX-HEAP O(n) – HEAPSORT O(n lg n) – MAX-HEAP-INSERT O(lg n) – HEAP-EXTRACT-MAX O(lg n) – HEAP-INCREASE-KEY O(lg n) – HEAP-MAXIMUM O(1)
  • 137. DAA notes by Pallavi Joshi Ch. 7 - QuickSort Quick but not Guaranteed
• 138. DAA notes by Pallavi Joshi Ch.7 - QuickSort Another Divide-and-Conquer sorting algorithm… As it turns out, MERGESORT and HEAPSORT, although O(n lg n) in their time complexity, have fairly large constants and tend to move data around more than desirable (e.g., equal-key items may not maintain their relative position from input to output). We introduce another algorithm with better constants, but a flaw: its worst case is O(n²). Fortunately, the worst case is “rare enough” that the speed advantages work an overwhelming amount of the time… and it is O(n lg n) on average.
• 139. DAA notes by Pallavi Joshi Ch.7 - QuickSort Like in MERGESORT, we use Divide-and-Conquer: 1. Divide: partition A[p..r] into two subarrays A[p..q−1] and A[q+1..r] such that each element of A[p..q−1] is ≤ A[q], and each element of A[q+1..r] is ≥ A[q]. Compute q as part of this partitioning. 2. Conquer: sort the subarrays A[p..q−1] and A[q+1..r] by recursive calls to QUICKSORT. 3. Combine: the partitioning and recursive sorting leave us with a sorted A[p..r] – no work needed here. An obvious difference is that we do most of the work in the divide stage, with no work at the combine one.
• 140. DAA notes by Pallavi Joshi Ch.7 - QuickSort The Pseudo-Code
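The pseudocode is on the slide image; a Python sketch of the textbook PARTITION (pivot x = A[r]) and QUICKSORT, matching the invariant discussed on the next slides:

def partition(a, p, r):
    # Partition a[p..r] around x = a[r]; return the pivot's final index.
    x = a[r]
    i = p - 1
    for j in range(p, r):                  # line 4: the comparison loop
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]        # grow the "<= x" region
    a[i + 1], a[r] = a[r], a[i + 1]        # put the pivot in the middle
    return i + 1

def quicksort(a, p, r):
    if p < r:
        q = partition(a, p, r)
        quicksort(a, p, q - 1)
        quicksort(a, q + 1, r)

nums = [2, 8, 7, 1, 3, 5, 6, 4]
quicksort(nums, 0, len(nums) - 1)
print(nums)  # [1, 2, 3, 4, 5, 6, 7, 8]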
• 141. DAA notes by Pallavi Joshi Ch.7 - QuickSort
• 142. DAA notes by Pallavi Joshi Ch.7 - QuickSort Proof of Correctness: PARTITION We look for a loop invariant, and we observe that at the beginning of each iteration of the loop (lines 3-6), for any array index k: 1. If p ≤ k ≤ i, then A[k] ≤ x; 2. If i+1 ≤ k ≤ j−1, then A[k] > x; 3. If k = r, then A[k] = x; 4. If j ≤ k ≤ r−1, then we don’t know anything about A[k].
• 143. DAA notes by Pallavi Joshi Ch.7 - QuickSort The Invariant • Initialization. Before the first iteration: i = p−1, j = p. No values between p and i; no values between i+1 and j−1. The first two conditions are trivially satisfied; the initial assignment satisfies 3. • Maintenance. Two cases – 1. A[j] > x. – 2. A[j] ≤ x.
• 144. DAA notes by Pallavi Joshi Ch.7 - QuickSort The Invariant • Termination. j = r. Every entry in the array is in one of the three sets described by the invariant. We have partitioned the values in the array into three sets: less than or equal to x, greater than x, and a singleton containing x. Running time of PARTITION on A[p..r] is Θ(n), where n = r − p + 1.
• 145. DAA notes by Pallavi Joshi Ch.7 - QuickSort QUICKSORT: Performance – a quick look. • We first look at (apparent) worst-case partitioning: T(n) = T(n−1) + T(0) + Θ(n) = T(n−1) + Θ(n). It is easy to show – using substitution – that T(n) = Θ(n²). • We next look at (apparent) best-case partitioning: T(n) = 2T(n/2) + Θ(n). It is also easy to show (case 2 of the Master Theorem) that T(n) = Θ(n lg n). • Since the disparity between the two is substantial, we need to look further…
• 146. DAA notes by Pallavi Joshi Ch.7 - QuickSort QUICKSORT: Performance – Balanced Partitioning
• 147. DAA notes by Pallavi Joshi Ch.7 - QuickSort QUICKSORT: Performance – the Average Case As long as the number of “good splits” is bounded below as a fixed percentage of all the splits, we maintain logarithmic depth and so O(n lg n) time complexity.
• 148. DAA notes by Pallavi Joshi Ch.7 - QuickSort QUICKSORT: Performance – Randomized QUICKSORT We would like to ensure that the choice of pivot does not critically impair the performance of the sorting algorithm – the discussion to this point would indicate that randomizing the choice of the pivot should provide us with good behavior (if at all possible with the data-set we are trying to sort). We introduce RANDOMIZED-PARTITION.
• 149. DAA notes by Pallavi Joshi Ch.7 - QuickSort QUICKSORT: Performance – Randomized QUICKSORT And the recursive procedure becomes RANDOMIZED-QUICKSORT. Every call to RANDOMIZED-PARTITION has introduced the (constant) extra overhead of a call to RANDOM.
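The two routines are on the slide images; a sketch consistent with the description (swap a uniformly random element into the pivot position, then reuse the deterministic partition from the earlier sketch) is:

import random

def randomized_partition(a, p, r):
    i = random.randint(p, r)     # the call to RANDOM: pick a pivot index
    a[i], a[r] = a[r], a[i]      # move it into the pivot position
    return partition(a, p, r)    # the deterministic partition does the rest

def randomized_quicksort(a, p, r):
    if p < r:
        q = randomized_partition(a, p, r)
        randomized_quicksort(a, p, q - 1)
        randomized_quicksort(a, q + 1, r)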
• 150. DAA notes by Pallavi Joshi Ch.7 - QuickSort QUICKSORT: Performance – Rigorous Worst Case Analysis Since we do not, a priori, have any idea of what the splits of the subarrays will be, we have to represent a possible “worst case” (we already have an O(n²) bound from the “bad split” example – so it could be worse… although we hope not). The worst case leads to the recurrence T(n) = max_{0≤q≤n−1}(T(q) + T(n − q − 1)) + Θ(n), where we remember that the pivot does not appear at the next level (down) of the recursion.
• 151. DAA notes by Pallavi Joshi Ch.7 - QuickSort QUICKSORT: Performance – Rigorous Worst Case Analysis We have to come up with a “guess”, and the basis for the guess is our likely “bad split case”: it tells us we cannot hope for any better than Θ(n²). So we just hope it is no worse… Guess T(n) ≤ cn² for some c > 0 and start doing algebra for the induction: T(n) ≤ max_{0≤q≤n−1}(T(q) + T(n − q − 1)) + Θ(n) ≤ max_{0≤q≤n−1}(cq² + c(n − q − 1)²) + Θ(n). Differentiate cq² + c(n − q − 1)² twice with respect to q, to obtain 4c > 0 for all values of q.
• 152. DAA notes by Pallavi Joshi Ch.7 - QuickSort QUICKSORT: Performance – Rigorous Worst Case Analysis Since the expression represents a quadratic curve, concave up, it reaches its maximum at one of the endpoints q = 0 and q = n − 1. As we evaluate, we find max_{0≤q≤n−1}(cq² + c(n − q − 1)²) + Θ(n) ≤ c·max_{0≤q≤n−1}(q² + (n − q − 1)²) + Θ(n) ≤ c(n − 1)² + Θ(n) = cn² − 2cn + c + Θ(n) ≤ cn², by choosing c large enough to overcome the positive constant hidden in Θ(n).
• 153. DAA notes by Pallavi Joshi Ch.7 - QuickSort QUICKSORT: Performance – Expected RunTime Understanding partitioning: 1. Each time PARTITION is called, it selects a pivot element, and this pivot element is never included in successive calls: the total number of calls to PARTITION is n. 2. Each call to PARTITION costs O(1) plus an amount of time proportional to the number of iterations of the for loop. 3. Each iteration of the for loop (in line 4) performs a comparison, comparing the pivot to another element in A. 4. We need to count the number of times line 4 is executed.
• 154. DAA notes by Pallavi Joshi Ch.7 - QuickSort QUICKSORT: Performance – Expected RunTime Lemma 7.1. Let X be the number of comparisons performed in line 4 of PARTITION over the entire execution of QUICKSORT on an n-element array. Then the running time of QUICKSORT is O(n + X). Proof: the observations on the previous slide. We need to find X, the total number of comparisons performed over all calls to PARTITION.
• 155. DAA notes by Pallavi Joshi Ch.7 - QuickSort QUICKSORT: Performance – Expected RunTime 1. Rename the elements of A as z1, z2, …, zn, so that zi is the ith smallest element of A. 2. Define the set Zij = {zi, zi+1,…, zj}. 3. Question: when does the algorithm compare zi and zj? 4. Answer: at most once – notice that all elements in every (sub)array are compared to the pivot once, and will never be compared to the pivot again (since the pivot is removed from the recursion). 5. Define Xij = I{zi is compared to zj}, the indicator variable of this event. Comparisons are over the full run of the algorithm.
• 156. DAA notes by Pallavi Joshi Ch.7 - QuickSort QUICKSORT: Performance – Expected RunTime 6. Since each pair is compared at most once, we can write X = Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} Xij. 7. Taking expectations of both sides: E[X] = Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} E[Xij] = Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} Pr{zi is compared to zj}. 8. We need to compute Pr{zi is compared to zj}. 9. We will assume all zi and zj are distinct. 10. For any pair zi, zj, once a pivot x is chosen so that zi < x < zj, zi and zj will never be compared again (why?).
• 157. DAA notes by Pallavi Joshi Ch.7 - QuickSort QUICKSORT: Performance – Expected RunTime 11. If zi is chosen as a pivot before any other item in Zij, then zi will be compared to every other item in Zij. 12. Same for zj. 13. zi and zj are compared if and only if the first element to be chosen as a pivot from Zij is either zi or zj. 14. What is that probability? Until a point of Zij is chosen as a pivot, the whole of Zij is in the same partition, so every element of Zij is equally likely to be the first one chosen as a pivot.
• 158. DAA notes by Pallavi Joshi Ch.7 - QuickSort QUICKSORT: Performance – Expected RunTime 15. Because Zij has j − i + 1 elements, and because pivots are chosen randomly and independently, the probability that any given element is the first one chosen as a pivot is 1/(j − i + 1). It follows that: 16. Pr{zi is compared to zj} = Pr{zi or zj is first pivot chosen from Zij} = Pr{zi is first pivot chosen from Zij} + Pr{zj is first pivot chosen from Zij} = 1/(j − i + 1) + 1/(j − i + 1) = 2/(j − i + 1).
• 159. DAA notes by Pallavi Joshi Ch.7 - QuickSort QUICKSORT: Performance – Expected RunTime 17. Replacing the right-hand side in 7, and grinding through some algebra: E[X] = Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} 2/(j − i + 1) = Σ_{i=1}^{n−1} Σ_{k=1}^{n−i} 2/(k + 1) < Σ_{i=1}^{n−1} Σ_{k=1}^{n} 2/k = Σ_{i=1}^{n−1} 2·Hn = Σ_{i=1}^{n−1} O(lg n) = O(n lg n). And the result follows.
  • 160. DAA notes by Pallavi Joshi Dynamic Programming
• 161. DAA notes by Pallavi Joshi Outline ⦁ Assembly‐line scheduling ⦁ Matrix‐chain multiplication ⦁ Elements of dynamic programming ⦁ Longest common subsequence ⦁ Optimal binary search trees
• 162. DAA notes by Pallavi Joshi Dynamic Programming 1/2 Not a specific algorithm, but a technique, like divide‐and‐conquer. Dynamic programming is applicable when the subproblems are not independent. A dynamic‐programming algorithm solves every subsubproblem just once and then saves its answer in a table. "Programming" in this context refers to a tabular method, not to writing computer code. Used for optimization problems: find a solution with the optimal value, minimization or maximization.
• 163. DAA notes by Pallavi Joshi Dynamic Programming 2/2 ⦁ Four‐step method 1. Characterize the structure of an optimal solution. 2. Recursively define the value of an optimal solution. 3. Compute the value of an optimal solution in a bottom‐up fashion. 4. Construct an optimal solution from computed information.
• 164. DAA notes by Pallavi Joshi Assembly‐line scheduling 1/2 ⦁ Automobile factory with two assembly lines. ⦁ Each line has n stations: S1,1,…, S1,n and S2,1,…, S2,n. ⦁ S1,j and S2,j perform the same function, with times a1,j and a2,j, respectively. ⦁ Entry times e1 and e2; exit times x1 and x2. ⦁ After going through a station, can either ⦁ stay on the same line: no cost, or ⦁ transfer to the other line: cost after Si,j is ti,j. [Figure: two lines of stations S1,1 … S1,n and S2,1 … S2,n, with entry times e1, e2, station times ai,j, transfer times ti,j, and exit times x1, x2]
• 165. DAA notes by Pallavi Joshi Assembly‐line scheduling 2/2 Problem: Given all these costs (time = cost), what stations should be chosen from line 1 and from line 2 for the fastest way through the factory? [Figure: the same two-line diagram as above]
• 166. DAA notes by Pallavi Joshi Structure of an optimal solution Step 1: Characterize the structure of an optimal solution. The fastest way through S1,j is either the fastest way through S1,j−1 then directly through S1,j, or the fastest way through S2,j−1, a transfer from line 2 to line 1, then through S1,j. Example: If fastest(S1,4) = (S2,1, S1,2, S2,3, S1,4), then fastest(S2,3) = (S2,1, S1,2, S2,3). An optimal solution to a problem contains within it an optimal solution to subproblems. This is optimal substructure.
• 167. DAA notes by Pallavi Joshi Recursive solution ⦁ Step 2: Recursively define the value of an optimal solution. ⦁ Let fi[j] = the fastest time through Si,j, where i = 1, 2 and j = 1,…, n. ⦁ Let f* = the fastest time through the factory. ⦁ Then we have the following recursive equations:
f1[j] = e1 + a1,1 if j = 1; f1[j] = min(f1[j−1] + a1,j, f2[j−1] + t2,j−1 + a1,j) if j ≥ 2
f2[j] = e2 + a2,1 if j = 1; f2[j] = min(f2[j−1] + a2,j, f1[j−1] + t1,j−1 + a2,j) if j ≥ 2
f* = min(f1[n] + x1, f2[n] + x2)
• 168. DAA notes by Pallavi Joshi An instance of assembly‐line scheduling li[j] = line # whose station j−1 is used in the fastest way through Si,j. l* = line # whose station n is used in the fastest way through the entire factory. Instance data: e = (2, 4), x = (3, 2); a1,· = 7, 9, 3, 4, 8, 4; a2,· = 8, 5, 6, 4, 5, 7; t1,· = 2, 3, 1, 3, 4; t2,· = 2, 1, 2, 2, 1. Computed values:
j:     1   2   3   4   5   6
f1[j]: 9   18  20  24  32  35
f2[j]: 12  16  22  25  30  37
f* = 38
j:     2   3   4   5   6
l1[j]: 1   2   1   1   2
l2[j]: 1   2   1   2   2
l* = 1
• 169. DAA notes by Pallavi Joshi Compute an optimal solution Step 3: Compute the value of an optimal solution in a bottom‐up fashion. ⦁ One could write a recursive algorithm based on the above recurrences. ⦁ Let ri(j) = # of references made to fi[j]: r1(n) = r2(n) = 1, and r1(j) = r2(j) = r1(j+1) + r2(j+1) for j = 1,…, n−1. One can show that ri(j) = 2^{n−j}, and the total number of references to all fi[j] is Θ(2^n). (Exercises 15.1‐2 and 15.1‐3) ⦁ Observation: ⦁ fi[j] depends only on f1[j−1] and f2[j−1] for j ≥ 2. ⦁ So compute in order of increasing j. [Recursion tree: f* references f1[n] and f2[n], each of which references both f1[n−1] and f2[n−1], and so on]
• 170. DAA notes by Pallavi Joshi FASTEST‐WAY procedure
FASTEST-WAY(a, t, e, x, n)
1. f1[1] ← e1 + a1,1
2. f2[1] ← e2 + a2,1
3. for j ← 2 to n
4. do if f1[j−1] + a1,j ≤ f2[j−1] + t2,j−1 + a1,j
5. then f1[j] ← f1[j−1] + a1,j
6. l1[j] ← 1
7. else f1[j] ← f2[j−1] + t2,j−1 + a1,j
8. l1[j] ← 2
9. if f2[j−1] + a2,j ≤ f1[j−1] + t1,j−1 + a2,j
10. then f2[j] ← f2[j−1] + a2,j
11. l2[j] ← 2
12. else f2[j] ← f1[j−1] + t1,j−1 + a2,j
13. l2[j] ← 1
14. if f1[n] + x1 ≤ f2[n] + x2
15. then f* = f1[n] + x1
16. l* = 1
17. else f* = f2[n] + x2
18. l* = 2
⦁ Time: O(n) – lines 1-2 cost Θ(1), and the loop body runs n − 1 times at Θ(1) per iteration.
• 171. DAA notes by Pallavi Joshi Construct the fastest way Step 4: Construct an optimal solution from computed information. The following procedure prints out the stations used, in decreasing order of station number.
PRINT-STATIONS(l, n)
1. i ← l*
2. print “line” i “, station” n
3. for j ← n downto 2
4. do i ← li[j]
5. print “line” i “, station” j−1
⦁ Time: O(n).
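A runnable Python sketch of FASTEST-WAY on the instance above (0-indexed lists; the l entries use 1-based line numbers to match the slides):

def fastest_way(a, t, e, x, n):
    # f[i][j]: fastest time through station j of line i; l[i][j]: line used before it.
    f = [[0] * n for _ in range(2)]
    l = [[0] * n for _ in range(2)]
    f[0][0], f[1][0] = e[0] + a[0][0], e[1] + a[1][0]
    for j in range(1, n):
        for i in range(2):
            other = 1 - i
            stay = f[i][j - 1] + a[i][j]
            switch = f[other][j - 1] + t[other][j - 1] + a[i][j]
            if stay <= switch:
                f[i][j], l[i][j] = stay, i + 1
            else:
                f[i][j], l[i][j] = switch, other + 1
    if f[0][n - 1] + x[0] <= f[1][n - 1] + x[1]:
        return f[0][n - 1] + x[0], 1, l
    return f[1][n - 1] + x[1], 2, l

a = [[7, 9, 3, 4, 8, 4], [8, 5, 6, 4, 5, 7]]   # station times, lines 1 and 2
t = [[2, 3, 1, 3, 4], [2, 1, 2, 2, 1]]         # transfer times
e, x = [2, 4], [3, 2]                          # entry and exit times
fstar, lstar, l = fastest_way(a, t, e, x, 6)
print(fstar, lstar)  # 38 1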
• 172. DAA notes by Pallavi Joshi Matrix‐chain multiplication When we multiply two matrices A and B, if A is a p × q matrix and B is a q × r matrix, the resulting matrix C is a p × r matrix, and the number of scalar multiplications is pqr. Matrix‐chain multiplication problem Input: A chain 〈A1, A2,..., An〉 of n matrices (matrix Ai has dimension pi−1 × pi). Output: A fully parenthesized product A1A2⋯An that minimizes the number of scalar multiplications. For example: The dimensions of the matrices A1, A2, and A3 are 10 × 100, 100 × 5, and 5 × 50, respectively. ((A1A2)A3) = 10 ∙ 100 ∙ 5 + 10 ∙ 5 ∙ 50 = 7500. (A1(A2A3)) = 100 ∙ 5 ∙ 50 + 10 ∙ 100 ∙ 50 = 75000.
• 173. DAA notes by Pallavi Joshi Counting the number of parenthesizations Denote the number of alternative parenthesizations of a sequence of n matrices by P(n). A fully parenthesized matrix product is the product of two fully parenthesized matrix subproducts, and the split between the two subproducts may occur between the kth and (k+1)st matrices. ⦁ Thus we have: P(n) = 1 if n = 1; P(n) = Σ_{k=1}^{n−1} P(k)·P(n−k) if n ≥ 2. Brute‐force algorithm: checking all possible parenthesizations. Time: Ω(2^n). (Exercise 15.2‐3)
• 174. DAA notes by Pallavi Joshi Step 1: The structure of an optimal solution An optimal solution to an instance contains optimal solutions to subproblem instances. For example: If ((A1A2)A3)(A4(A5A6)) is an optimal solution to A1, A2,..., A6, then ((A1A2)A3) is an optimal solution to A1, A2, A3 and (A4(A5A6)) is an optimal solution to A4, A5, A6.
• 175. DAA notes by Pallavi Joshi Step 2: A recursive solution Define m[i, j] = the minimum number of scalar multiplications needed to compute Ai Ai+1 ⋯ Aj. m[i, j] = 0 if i = j; m[i, j] = min_{i≤k<j}(m[i, k] + m[k+1, j] + pi−1·pk·pj) if i < j. ⦁ [Figure: the recursion tree for the computation of m[1,4], in which subproblems such as 1..1, 2..2, and 3..4 appear repeatedly]
• 176. DAA notes by Pallavi Joshi Step 3: Computing the optimal costs Based on the recursive formula, we could easily write an exponential‐time recursive algorithm to compute the minimum cost m[1, n] for multiplying A1A2⋯An. ⦁ There are only C(n, 2) + n = Θ(n²) distinct subproblems, one problem for each choice of i and j satisfying 1 ≤ i ≤ j ≤ n. We can use dynamic programming to compute the solutions bottom up.
• 177. DAA notes by Pallavi Joshi Dependencies between the subproblems matrix: A1 A2 A3 A4 A5 A6; dimension: 30 × 35, 35 × 15, 15 × 5, 5 × 10, 10 × 20, 20 × 25. For example: m[2,5] = min( m[2,2] + m[3,5] + p1p2p5 = 0 + 2500 + 35·15·20 = 13000, m[2,3] + m[4,5] + p1p3p5 = 2625 + 1000 + 35·5·20 = 7125, m[2,4] + m[5,5] + p1p4p5 = 4375 + 0 + 35·10·20 = 11375 ) = 7125. ⦁ s[i, j]: the index k that achieved the optimal cost in computing m[i, j]. [Tables: the m table, read along diagonals – 0, 0, 0, 0, 0, 0; 15,750, 2,625, 750, 1,000, 5,000; 7,875, 4,375, 2,500, 3,500; 9,375, 7,125, 5,375; 11,875, 10,500; 15,125 – and the corresponding s table]
• 178. DAA notes by Pallavi Joshi MATRIX‐CHAIN‐ORDER pseudocode
MATRIX-CHAIN-ORDER(p)
1. n ← length[p] − 1
2. for i ← 1 to n
3. do m[i, i] ← 0
4. for l ← 2 to n /* l is the chain length */
5. do for i ← 1 to n − l + 1
6. do j ← i + l − 1
7. m[i, j] ← ∞
8. for k ← i to j − 1
9. do q ← m[i, k] + m[k + 1, j] + pi−1·pk·pj
10. if q < m[i, j]
11. then m[i, j] ← q
12. s[i, j] ← k
13. return m and s
The loops are nested three deep, and each loop index (l, i, and k) takes on at most n − 1 values. Time: O(n³).
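A Python sketch of MATRIX-CHAIN-ORDER; m and s are dictionaries keyed by 1-based (i, j) so the code mirrors the pseudocode:

def matrix_chain_order(p):
    n = len(p) - 1
    m = {(i, i): 0 for i in range(1, n + 1)}
    s = {}
    for l in range(2, n + 1):                # l is the chain length
        for i in range(1, n - l + 2):
            j = i + l - 1
            m[i, j] = float('inf')
            for k in range(i, j):            # try every split point
                q = m[i, k] + m[k + 1, j] + p[i - 1] * p[k] * p[j]
                if q < m[i, j]:
                    m[i, j], s[i, j] = q, k
    return m, s

p = [30, 35, 15, 5, 10, 20, 25]              # the A1..A6 instance above
m, s = matrix_chain_order(p)
print(m[1, 6])  # 15125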
• 179. DAA notes by Pallavi Joshi Step 4: Constructing an optimal solution Each entry s[i, j] records the value of k such that the optimal parenthesization of Ai Ai+1 ⋯ Aj splits the product between Ak and Ak+1.
PRINT-OPTIMAL-PARENS(s, i, j)
1. if i = j
2. then print “Ai”
3. else print “(”
4. PRINT-OPTIMAL-PARENS(s, i, s[i, j])
5. PRINT-OPTIMAL-PARENS(s, s[i, j]+1, j)
6. print “)”
The call PRINT-OPTIMAL-PARENS(s, 1, n) prints the parenthesization ((A1(A2A3))((A4A5)A6)).
• 180. DAA notes by Pallavi Joshi Elements of dynamic programming 1/2 Optimal substructure An optimal solution to a problem contains an optimal solution to subproblems. If ((A1A2)A3)(A4(A5A6)) is an optimal solution to A1, A2,..., A6, then ((A1A2)A3) is an optimal solution to A1, A2, A3 and (A4(A5A6)) is an optimal solution to A4, A5, A6. Overlapping subproblems A recursive algorithm revisits the same problem over and over again. Typically, the total number of distinct subproblems is a polynomial in the input size. In contrast, a problem for which a divide‐and‐conquer approach is suitable usually generates brand‐new problems at each step of the recursion.
• 181. DAA notes by Pallavi Joshi Elements of dynamic programming 2/2 ⦁ Example: merge sort – the recursion on 1..8 splits into 1..4 and 5..8, then 1..2, 3..4, 5..6, 7..8, …: every subproblem is brand new. ⦁ Example: matrix‐chain – the recursion tree for 1..4 contains subproblems such as 1..1, 2..2, and 3..4 several times over: the subproblems overlap.
• 182. DAA notes by Pallavi Joshi RECURSIVE‐MATRIX‐CHAIN procedure
RECURSIVE-MATRIX-CHAIN(p, i, j)
1. if i = j
2. then return 0
3. m[i, j] ← ∞
4. for k ← i to j − 1
5. do q ← RECURSIVE-MATRIX-CHAIN(p, i, k) + RECURSIVE-MATRIX-CHAIN(p, k+1, j) + pi−1·pk·pj
6. if q < m[i, j]
7. then m[i, j] ← q
8. return m[i, j]
⦁ We shall prove that T(n) = Ω(2^n); specifically, T(n) ≥ 2^{n−1}, using the substitution method: T(1) ≥ 1 = 2^0, and T(n) ≥ 1 + Σ_{k=1}^{n−1}(T(k) + T(n−k) + 1) = n + 2Σ_{i=1}^{n−1} T(i) ≥ n + 2Σ_{i=1}^{n−1} 2^{i−1} = n + 2(2^{n−1} − 1) = n + 2^n − 2 ≥ 2^{n−1}.
• 183. DAA notes by Pallavi Joshi Memoization A variation of dynamic programming that offers the efficiency of the usual dynamic‐programming approach while maintaining a top‐down strategy.
MEMOIZED-MATRIX-CHAIN(p)
1. n ← length[p] − 1
2. for i ← 1 to n
3. do for j ← i to n
4. do m[i, j] ← ∞
5. return LOOKUP-CHAIN(p, 1, n)
LOOKUP-CHAIN(p, i, j)
1. if m[i, j] < ∞
2. then return m[i, j]
3. if i = j
4. then m[i, j] ← 0
5. else for k ← i to j − 1
6. do q ← LOOKUP-CHAIN(p, i, k) + LOOKUP-CHAIN(p, k+1, j) + pi−1·pk·pj
7. if q < m[i, j]
8. then m[i, j] ← q
9. return m[i, j]
🞂 Compute m[i, j] only the first time LOOKUP-CHAIN(p, i, j) is called. ⦁ Time: O(n³).
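In Python, the same top-down memoization can be sketched with functools.lru_cache handling the m-table bookkeeping:

from functools import lru_cache

def memoized_matrix_chain(p):
    n = len(p) - 1

    @lru_cache(maxsize=None)     # caches each (i, j) after its first computation
    def lookup(i, j):
        if i == j:
            return 0
        return min(lookup(i, k) + lookup(k + 1, j) + p[i - 1] * p[k] * p[j]
                   for k in range(i, j))

    return lookup(1, n)

print(memoized_matrix_chain([30, 35, 15, 5, 10, 20, 25]))  # 15125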
• 184. DAA notes by Pallavi Joshi Longest‐common‐subsequence A subsequence is a sequence that can be derived from another sequence by deleting some elements. For example: 〈K, C, B, A〉 is a subsequence of 〈K, G, C, E, B, B, A〉. 〈B, C, D, G〉 is a subsequence of 〈A, C, B, E, G, C, E, D, B, G〉. Longest‐common‐subsequence problem Input: 2 sequences, X = 〈x1, x2,…, xm〉 and Y = 〈y1, y2,…, yn〉. Output: A maximum‐length common subsequence of X and Y. For example: X = 〈A, B, C, B, D, A, B〉 and Y = 〈B, D, C, A, B, A〉. 〈B, C, A〉 is a common subsequence of both X and Y. 〈B, C, B, A〉 is a longest common subsequence (LCS) of X and Y.
• 185. DAA notes by Pallavi Joshi Step 1: Characterizing an LCS Brute‐force algorithm: For every subsequence of X, check whether it is a subsequence of Y. Time: Θ(n·2^m) – there are 2^m subsequences of X to check, and each subsequence takes Θ(n) time to check: scan Y for the first letter, from there scan for the second, and so on. Given a sequence X = 〈x1, x2,…, xm〉, we define the ith prefix of X as Xi = 〈x1, x2,…, xi〉. For example: X = 〈A, B, C, B, D, A, B〉, X4 = 〈A, B, C, B〉, and X0 is the empty sequence.
• 186. DAA notes by Pallavi Joshi Optimal substructure of an LCS Theorem 15.1 Let X = 〈x1, x2,…, xm〉 and Y = 〈y1, y2,…, yn〉 be sequences, and let Z = 〈z1, z2,…, zk〉 be any LCS of X and Y. 1. If xm = yn, then zk = xm = yn and Zk−1 is an LCS of Xm−1 and Yn−1. 2. If xm ≠ yn, then zk ≠ xm implies that Z is an LCS of Xm−1 and Y. 3. If xm ≠ yn, then zk ≠ yn implies that Z is an LCS of X and Yn−1. For example: X = 〈A, B, C, B, D, A, B〉, Y = 〈B, D, C, A, B〉 and Z = 〈B, C, A, B〉 is an LCS of X and Y. Then z4 = x7 = y5, and Z3 = 〈B, C, A〉 is an LCS of X6 and Y4. X = 〈A, B, C, B, D, A, D〉, Y = 〈B, D, C, A, B〉 and Z = 〈B, C, A〉 is an LCS of X and Y. Then z3 ≠ x7 implies that Z = 〈B, C, A〉 is an LCS of X6 and Y5. X = 〈A, B, C, B, D, A〉, Y = 〈B, D, C, A, B〉 and Z = 〈B, C, A〉 is an LCS of X and Y. Then z3 ≠ y5 implies that Z = 〈B, C, A〉 is an LCS of X6 and Y4.
  • 187. DAA notes by Pallavi Joshi Step 2: A recursive solution 187 ⦁ Define c[i, j] = length of LCS of Xi and Yj. We want c[m, n]. ⦁ The recursion tree for the computation of c[4,3].  max(c[i, j 1],c[i 1, j])  0 c[i, j]  c[i 1, j 1]1 ifi, j  0 and xi  yj. if i  0 or j  0, ifi, j  0 and xi  yj , 4,3 3,3 2,3 3,1 3,2 2,2 2,2 1,3 4,2 3,1 3,2 2,2 3,0 4,1 3,1 0,3 1,2 1,2 2,0 1,2 2,0 2,1 3,0 1,2 2,0 2,1 3,0 2,1 3,0
• 188. DAA notes by Pallavi Joshi Step 3: Computing the length of an LCS
Based on the recursive formula, we could easily write an exponential-time recursive algorithm to compute the length of an LCS of two sequences. But there are only Θ(mn) distinct subproblems, so we can use dynamic programming to compute the solutions bottom up.
• 189. DAA notes by Pallavi Joshi LCS-LENGTH pseudocode
LCS-LENGTH(X, Y)
1. m ← length[X]; n ← length[Y]
2. for i ← 1 to m
3.   do c[i, 0] ← 0
4. for j ← 0 to n
5.   do c[0, j] ← 0
6. for i ← 1 to m
7.   do for j ← 1 to n
8.        do if xi = yj
9.             then c[i, j] ← c[i − 1, j − 1] + 1
10.                 b[i, j] ← "↖"
11.          else if c[i − 1, j] ≥ c[i, j − 1]
12.            then c[i, j] ← c[i − 1, j]
13.                 b[i, j] ← "↑"
14.          else c[i, j] ← c[i, j − 1]
15.               b[i, j] ← "←"
16. return c and b
Time: O(mn).
(figure: the c and b tables for X = 〈A, B, C, B, D, A, B〉 and Y = 〈B, D, C, A, B, A〉; each entry holds c[i, j] together with an arrow ↖, ↑, or ←)
• 190. DAA notes by Pallavi Joshi Step 4: Constructing an LCS
Whenever we encounter a "↖" in entry b[i, j], it means that xi = yj is an element of the LCS.
PRINT-LCS(b, X, i, j)
1. if i = 0 or j = 0
2.   then return
3. if b[i, j] = "↖"
4.   then PRINT-LCS(b, X, i − 1, j − 1)
5.        print xi
6. elseif b[i, j] = "↑"
7.   then PRINT-LCS(b, X, i − 1, j)
8. else PRINT-LCS(b, X, i, j − 1)
The initial call PRINT-LCS(b, X, m, n) prints "BCBA" for the example tables above.
(figure: the same c and b tables, with the path of arrows followed from b[7, 6] highlighted)
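Tying Steps 3 and 4 together, here is a minimal Python sketch (an illustrative transcription, not the slides' exact procedure: strings are 0-indexed, and the b table is replaced by retracing the c table, which recovers the same LCS):

def lcs(X, Y):
    # Step 3: fill c bottom-up; c[i][j] = length of an LCS of X_i and Y_j
    m, n = len(X), len(Y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    # Step 4: retrace the choices from c[m][n] instead of storing b
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if X[i - 1] == Y[j - 1]:          # corresponds to a "↖" entry
            out.append(X[i - 1])
            i -= 1
            j -= 1
        elif c[i - 1][j] >= c[i][j - 1]:  # corresponds to an "↑" entry
            i -= 1
        else:                             # corresponds to a "←" entry
            j -= 1
    return "".join(reversed(out))

print(lcs("ABCBDAB", "BDCABA"))  # prints "BCBA", length 4, as in the table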
• 191. DAA notes by Pallavi Joshi Optimal binary search trees
Input: A sequence K = 〈k1, k2, …, kn〉 of n distinct keys in sorted order (k1 < k2 < ∙∙∙ < kn), and a sequence D = 〈d0, d1, …, dn〉 of n + 1 dummy keys: d0 represents all values < k1, dn represents all values > kn, and di represents all values between ki and ki+1. For each key ki there is a probability pi that a search is for ki; for each dummy key di there is a probability qi that a search ends at di.
Output: A BST T with minimum expected search cost, where
E[search cost in T] = Σ_{i=1}^{n} (depth_T(ki) + 1) · pi + Σ_{i=0}^{n} (depth_T(di) + 1) · qi
                    = 1 + Σ_{i=1}^{n} depth_T(ki) · pi + Σ_{i=0}^{n} depth_T(di) · qi,
since Σ_{i=1}^{n} pi + Σ_{i=0}^{n} qi = 1.
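As a quick sanity check of the cost formula, a small Python sketch (illustrative names and a made-up two-key example; depths are supplied directly rather than derived from a tree structure):

def expected_search_cost(key_depths, p, dummy_depths, q):
    # E[cost] = sum (depth(k_i) + 1) * p_i + sum (depth(d_i) + 1) * q_i
    return (sum((d + 1) * pi for d, pi in zip(key_depths, p)) +
            sum((d + 1) * qi for d, qi in zip(dummy_depths, q)))

# Two keys: k2 at the root (depth 0), k1 its left child (depth 1);
# dummy keys d0, d1 hang under k1 (depth 2) and d2 under k2 (depth 1).
print(expected_search_cost([1, 0], [0.3, 0.2], [2, 2, 1], [0.2, 0.1, 0.2]))
# 2.1, which matches 1 + sum(depth * probability) = 1 + 1.1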
  • 192. DAA notes by Pallavi Joshi 0-1 KNAPSACK PROBLEM
• 193. DAA notes by Pallavi Joshi Statement of the problem: given n items, each with a value pi and a weight wi, find which items to place in the knapsack such that the sum of the values of the chosen items is maximum, while the sum of their weights does not exceed the maximum weight capacity c of the knapsack.
• 194. DAA notes by Pallavi Joshi We can also express the problem as follows: maximize Σ_{i=1}^{n} pi · xi subject to Σ_{i=1}^{n} wi · xi ≤ c, where xi = 0 if item i is not taken and xi = 1 if item i is taken.
• 195. DAA notes by Pallavi Joshi Solution #1: Brute force
We try all possible item combinations. For n items, the total number of combinations is Σ_{i=0}^{n} C(n, i) = 2^n. We keep the combinations that satisfy the weight constraint, compute the total value Σ pi of each, and take the maximum. This approach has complexity O(2^n).
• 196. DAA notes by Pallavi Joshi Solution #2: Dynamic Programming (bottom-up computation)
Construct an (n + 1) × (c + 1) value matrix V, computing a value in each cell for every row in the matrix. The last cell V[n, c] gives the maximum total value.
• 197. DAA notes by Pallavi Joshi Bottom-up computation pseudocode:
for k = 0 to c:
  V[0, k] ← 0
for i = 1 to n:
  for k = 0 to c:
    if wi ≤ k:
      V[i, k] ← max(V[i − 1, k], pi + V[i − 1, k − wi])
    else:
      V[i, k] ← V[i − 1, k]
• 198. DAA notes by Pallavi Joshi Example: n = 5, c = 10
item i:    1   2   3   4   5
value p:   30  20  40  70  60
weight w:  4   1   2   5   3
• 199. DAA notes by Pallavi Joshi Solution: the value matrix is filled bottom-up, starting from the first row (i = 0) and moving up through the succeeding rows to the top (i = n). Row 0 of the value matrix is all zeros. The column index k starts at 0 and ends at c (the capacity constraint).
k:    0  1  2  3  4  5  6  7  8  9  10
i=0:  0  0  0  0  0  0  0  0  0  0  0
• 200. DAA notes by Pallavi Joshi Value at V[i, k] = max(V[i − 1, k], pi + V[i − 1, k − wi]), where
pi is the value of the item at row i and wi is its weight.
For each cell in a row where k − wi < 0, the item does not fit, so the maximum is simply V[i − 1, k] (the cell above the current cell).
If k − wi ≥ 0, compare V[i − 1, k] and pi + V[i − 1, k − wi]; the maximum of the two is the value of V[i, k].
• 201. DAA notes by Pallavi Joshi
k:    0  1  2  3  4   5   6   7   8   9   10
i=0:  0  0  0  0  0   0   0   0   0   0   0
i=1:  0  0  0  0  30  30  30  30  30  30  30
At i = 1, k = 4: V[i − 1, k] = V[0, 4] = 0. With p1 = 30 and w1 = 4, we have k − w1 = 4 − 4 = 0 ≥ 0, so pi + V[i − 1, k − wi] = p1 + V[0, 0] = 30 + 0 = 30, and V[1, 4] = max(0, 30) = 30.
• 202. DAA notes by Pallavi Joshi Completing the value matrix:
k:    0  1   2   3   4   5    6    7    8    9    10
i=0:  0  0   0   0   0   0    0    0    0    0    0
i=1:  0  0   0   0   30  30   30   30   30   30   30
i=2:  0  20  20  20  30  50   50   50   50   50   50
i=3:  0  20  40  60  60  60   70   90   90   90   90
i=4:  0  20  40  60  60  70   90   110  130  130  130
i=5:  0  20  40  60  80  100  120  120  130  150  170
The last cell V[n, c] = 170 is the solution to the maximum value.
• 203. DAA notes by Pallavi Joshi The value matrix only gives the maximum value, not the individual items chosen. Modify the last pseudocode to mark each cell where the maximum is pi + V[i − 1, k − wi] with k − wi ≥ 0, i.e., each cell where taking item i won the comparison.
(value matrix as above, with the cells where item i was taken marked)
• 204. DAA notes by Pallavi Joshi Pseudocode to find the items selected:
k = c
for i = n down to 1:
  if V[i, k] is marked:
    output item i
    k ← k − wi
• 205. DAA notes by Pallavi Joshi From the last example (value matrix as above):
i = 5, k = 10: marked → output item 5, k = 10 − w5 = 10 − 3 = 7
i = 4, k = 7:  marked → output item 4, k = 7 − w4 = 7 − 5 = 2
i = 3, k = 2:  marked → output item 3, k = 2 − w3 = 2 − 2 = 0
i = 2, k = 0:  not marked
i = 1, k = 0:  not marked
The items selected are 3, 4, and 5.
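The whole computation, fill plus traceback, fits in a short Python sketch (illustrative names; instead of marking cells it detects a "take" by comparing a cell with the one above it, which recovers an equivalent selection):

def knapsack_01(p, w, c):
    # Bottom-up fill: row i uses the first i items, column k is the capacity
    n = len(p)
    V = [[0] * (c + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for k in range(c + 1):
            V[i][k] = V[i - 1][k]                 # item i not taken
            if w[i - 1] <= k:                     # item i fits
                V[i][k] = max(V[i][k], p[i - 1] + V[i - 1][k - w[i - 1]])
    # Traceback from V[n][c]: item i was taken iff its row improved the value
    items, k = [], c
    for i in range(n, 0, -1):
        if V[i][k] != V[i - 1][k]:
            items.append(i)                       # 1-based item number, as in the slides
            k -= w[i - 1]
    return V[n][c], sorted(items)

# Example from the slides: prints (170, [3, 4, 5])
print(knapsack_01([30, 20, 40, 70, 60], [4, 1, 2, 5, 3], 10))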
• 206. DAA notes by Pallavi Joshi Bottom-up computation has complexity O(nc): for large n, a vast improvement over the O(2^n) brute force. Any problem involving maximizing a total value while satisfying a constraint can use this method, as long as each item can only be chosen whole or not at all, i.e., an item cannot be broken into smaller parts.
• 207. DAA notes by Pallavi Joshi Sorting in Linear Time
• Counting sort
• Radix sort
• Bucket sort
• 208. DAA notes by Pallavi Joshi Counting Sort The Algorithm
• Counting-Sort(A)
  – Initialize an output array B of size n and a count array C of size k + 1, and set all entries of C to 0
• Count the number of occurrences of every A[i]
  – for i = 1..n
  – do C[A[i]] ← C[A[i]] + 1
• Count the number of elements ≤ i for each value i
  – for i = 1..k
  – do C[i] ← C[i] + C[i − 1]
• Move every element to its final position, scanning A from right to left
  – for i = n..1
  – do B[C[A[i]]] ← A[i]
  –    C[A[i]] ← C[A[i]] − 1
• 209. DAA notes by Pallavi Joshi Counting Sort Example
A = [2, 3, 5, 0, 2, 3, 0, 3] (indices 1..8), values in 0..5.
After the counting loop:   C = [2, 0, 2, 3, 0, 1] (indices 0..5).
After the prefix-sum loop: C = [2, 2, 4, 7, 7, 8], so e.g. 7 elements are ≤ 3.
Scanning A right to left: A[8] = 3 goes to B[C[3]] = B[7]; C[3] drops to 6.
• 210. DAA notes by Pallavi Joshi Counting Sort Example (continued)
A[7] = 0 goes to B[C[0]] = B[2]; C[0] drops to 1.
• 211. DAA notes by Pallavi Joshi Counting Sort Example (continued)
A[6] = 3 goes to B[C[3]] = B[6]; C[3] drops to 5. Continuing this way fills B = [0, 0, 2, 2, 3, 3, 3, 5].
• 212. DAA notes by Pallavi Joshi Counting Sort
1  CountingSort(A, B, k)
2    for i = 0 to k
3      C[i] = 0;
4    for j = 1 to n          // count occurrences: takes time O(n)
5      C[A[j]] += 1;
6    for i = 1 to k          // prefix sums: takes time O(k)
7      C[i] = C[i] + C[i-1];
8    for j = n downto 1      // place elements: takes time O(n)
9      B[C[A[j]]] = A[j];
10     C[A[j]] -= 1;
What will be the running time?
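A direct Python transcription of this pseudocode, adjusted for 0-indexed arrays (the function name is an illustrative choice):

def counting_sort(A, k):
    # Stable counting sort for integers in the range 0..k
    n = len(A)
    C = [0] * (k + 1)
    for x in A:                   # count occurrences of each value
        C[x] += 1
    for i in range(1, k + 1):     # C[i] = number of elements <= i
        C[i] += C[i - 1]
    B = [0] * n
    for x in reversed(A):         # right-to-left scan keeps the sort stable
        C[x] -= 1                 # decrement first because B is 0-indexed
        B[C[x]] = x
    return B

print(counting_sort([2, 3, 5, 0, 2, 3, 0, 3], 5))  # [0, 0, 2, 2, 3, 3, 3, 5]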
• 213. DAA notes by Pallavi Joshi Counting Sort
• Total time: O(n + k)
  – Usually, k = O(n)
  – Thus counting sort runs in O(n) time
• But comparison sorting is Ω(n lg n)!
  – No contradiction: this is not a comparison sort (in fact, there are no comparisons at all!)
  – Notice that this algorithm is stable
• If numbers have the same value, they keep their original order
• 214. DAA notes by Pallavi Joshi Stable Sorting Algorithms
• A sorting algorithm is stable if for any two indices i and j with i < j and ai = aj, element ai precedes element aj in the output sequence.
Observation: Counting Sort is stable.
(figure: an input sequence with equal keys subscripted by order of occurrence; in the output, equal keys keep their original relative order)
• 215. DAA notes by Pallavi Joshi Counting Sort
• Linear sort! Cool! Why don't we always use counting sort?
• Because it depends on the range k of the elements
• Could we use counting sort to sort 32-bit integers? Why or why not?
• Answer: no, k is too large (2^32 = 4,294,967,296)
  • 216. DAA notes by Pallavi Joshi Radix Sort • Why it’s not a comparison sort: – Assumption: input has d digits each ranging from 0 to k – Example: Sort a bunch of 4-digit numbers, where each digit is 0-9 • Basic idea: – Sort elements by digit starting with least significant – Use a stable sort (like counting sort) for each stage
• 217. DAA notes by Pallavi Joshi Radix Sort Overview
• Origin: Herman Hollerith's card-sorting machine for the 1890 U.S. Census
• Digit-by-digit sort
• Hollerith's original (bad) idea: sort on the most-significant digit first
• Good idea: sort on the least-significant digit first with an auxiliary stable sort
• The idea behind radix sort is not new
• 218. DAA notes by Pallavi Joshi For my college class, radix sort was very easy to learn. (figure: an IBM 083 punch card sorter)
• 219. DAA notes by Pallavi Joshi Radix Sort The Algorithm
• Radix Sort takes two parameters: the array and the number of digits in each array element
Radix-Sort(A, d)
1 for i = 1..d
2   do sort the numbers in array A by their i-th digit from the right, using a stable sorting algorithm
• 220. DAA notes by Pallavi Joshi Radix Sort Example
Input:                               329 457 657 839 436 720 355
After sorting on the ones digit:     720 355 436 457 657 329 839
After sorting on the tens digit:     720 329 436 839 355 457 657
After sorting on the hundreds digit: 329 355 436 457 657 720 839
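A minimal Python sketch reproducing the passes above (an illustrative implementation: it uses per-digit bucket lists, which play the role of the stable counting sort the algorithm calls for):

def radix_sort(A, d):
    # LSD radix sort for non-negative integers with at most d decimal digits
    for pos in range(d):                   # least-significant digit first
        divisor = 10 ** pos
        buckets = [[] for _ in range(10)]  # one bucket per digit value 0..9
        for x in A:
            buckets[(x // divisor) % 10].append(x)
        # concatenating the buckets in order is a stable pass on this digit
        A = [x for b in buckets for x in b]
    return A

print(radix_sort([329, 457, 657, 839, 436, 720, 355], 3))
# [329, 355, 436, 457, 657, 720, 839]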
• 221. DAA notes by Pallavi Joshi Radix Sort Correctness and Running Time
• What is the running time of radix sort?
• Each of the d digit passes takes time O(n + k), so the total time is O(dn + dk)
• When d is constant and k = O(n), radix sort takes O(n) time
• Stable and fast
• Doesn't sort in place (because counting sort is used)
• 222. DAA notes by Pallavi Joshi Bucket Sort
• Assumption: input is n real numbers from [0, 1)
• Basic idea:
  – Create n linked lists (buckets) to divide the interval [0, 1) into subintervals of size 1/n
  – Add each input element to the appropriate bucket and sort the buckets with insertion sort
• Uniform input distribution → O(1) expected bucket size
  – Therefore the expected total time is O(n)
• 223. DAA notes by Pallavi Joshi Bucket Sort
Bucket-Sort(A)
1. n ← length(A)
2. for i ← 1 to n                  // distribute elements over buckets
3.   do insert A[i] into list B[floor(n · A[i])]
4. for i ← 0 to n − 1              // sort each bucket
5.   do Insertion-Sort(B[i])
6. Concatenate lists B[0], B[1], …, B[n − 1] in order
• 224. DAA notes by Pallavi Joshi Bucket Sort Example
Input: .78 .17 .39 .26 .72 .94 .21 .12 .23 .68 (n = 10)
Buckets (index = floor(10 · x)), each sorted by insertion sort:
B[1]: .17 .12 → .12 .17
B[2]: .26 .21 .23 → .21 .23 .26
B[3]: .39
B[6]: .68
B[7]: .78 .72 → .72 .78
B[9]: .94
Concatenated output: .12 .17 .21 .23 .26 .39 .68 .72 .78 .94
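A minimal Python sketch of the same procedure (illustrative; Python's built-in sorted() stands in for the per-bucket insertion sort):

def bucket_sort(A):
    # Bucket sort for reals drawn from [0, 1): n buckets of width 1/n
    n = len(A)
    B = [[] for _ in range(n)]
    for x in A:
        B[int(n * x)].append(x)     # bucket index floor(n * x)
    # sort each bucket individually, then concatenate in order
    return [x for bucket in B for x in sorted(bucket)]

print(bucket_sort([.78, .17, .39, .26, .72, .94, .21, .12, .23, .68]))
# [0.12, 0.17, 0.21, 0.23, 0.26, 0.39, 0.68, 0.72, 0.78, 0.94]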
  • 225. DAA notes by Pallavi Joshi Bucket Sort – Running Time • All lines except line 5 (Insertion-Sort) take O(n) in the worst case. • In the worst case, O(n) numbers will end up in the same bucket, so in the worst case, it will take O(n2) time. • Lemma: Given that the input sequence is drawn uniformly at random from [0,1), the expected size of a bucket is O(1). • So, in the average case, only a constant number of elements will fall in each bucket, so it will take O(n) (see proof in book). • Use a different indexing scheme (hashing) to distribute the numbers uniformly.
  • 226. DAA notes by Pallavi Joshi • Every comparison-based sorting algorithm has to take Ω(n lg n) time. • Merge Sort, Heap Sort, and Quick Sort are comparison-based and take O(n lg n) time. Hence, they are optimal. • Other sorting algorithms can be faster by exploiting assumptions made about the input • Counting Sort and Radix Sort take linear time for integers in a bounded range. • Bucket Sort takes linear average-case time for uniformly distributed real numbers. Summary
• 228. WHAT? • GIVEN WEIGHTS AND VALUES OF N ITEMS, WE NEED TO PUT THESE ITEMS IN A KNAPSACK OF CAPACITY W TO GET THE MAXIMUM TOTAL VALUE IN THE KNAPSACK.
• 229. TYPES
• 0-1 KNAPSACK PROBLEM: In the 0-1 knapsack problem, we are not allowed to break items. We either take the whole item or don't take it.
• FRACTIONAL KNAPSACK: In fractional knapsack, we can break items to maximize the total value of the knapsack. This problem, in which we can break an item, is also called the fractional knapsack problem.
• 230. EXAMPLE (items A, B, C with values 60, 100, 120 and weights 10, 20, 30; capacity W = 50)
0-1 KNAPSACK: take B and C. Total weight = 20 + 30 = 50; total value = 100 + 120 = 220.
FRACTIONAL KNAPSACK: take A, B and 2/3 of C. Total weight = 10 + 20 + (30 · 2/3) = 50; total value = 60 + 100 + (120 · 2/3) = 240.
• 231. GREEDY APPROACH
• The basic idea of the greedy approach is to calculate the ratio value/weight for each item.
• Sort the items by this ratio in descending order.
• Take the items with the highest ratios and add them until we can't add the next item as a whole.
• At the end, add as much of the next item as we can (a fraction).
• 232. GREEDY APPROACH SOLUTION
1. Calculate value/weight for each item: ratio(A) = 60/10 = 6, ratio(B) = 100/20 = 5, ratio(C) = 120/30 = 4.
2. Sort the items by ratio in descending order: A, B, C.
3. Take the item with the highest ratio and add it to the knapsack until we can't add the next item as a whole:
   Take A: capacity left = 40, value = 60.
   Take B: capacity left = 20, value = 160.
   Take 2/3 of C: weight = 2/3 · 30 = 20, value = 2/3 · 120 = 80; capacity left = 0, value = 240.
• 233. THE OPTIMAL KNAPSACK ALGORITHM
• INPUT: an integer n; positive values wi and vi for 1 ≤ i ≤ n; a positive value W.
• OUTPUT: n values xi with 0 ≤ xi ≤ 1, maximizing the total profit.
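A minimal Python sketch of this greedy scheme (illustrative names; it reproduces the value 240 from the example above):

def fractional_knapsack(values, weights, W):
    # Greedy: take items by decreasing value/weight ratio,
    # then a fraction of the first item that no longer fits whole
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    total, room = 0.0, W
    for i in order:
        take = min(weights[i], room)   # whole item if it fits, else a fraction
        total += values[i] * take / weights[i]
        room -= take
        if room == 0:
            break
    return total

# Items A, B, C from the example: values 60, 100, 120; weights 10, 20, 30; W = 50
print(fractional_knapsack([60, 100, 120], [10, 20, 30], 50))  # 240.0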
• 234. WHAT IS STRING MATCHING • In computer science, string searching algorithms, sometimes called string matching algorithms, are algorithms that try to find a place where one or several strings (also called the pattern) occur within a larger string or text.
• 235. EXAMPLE STRING MATCHING PROBLEM
TEXT:    A B C A B A A C A B
PATTERN: A B A A, found at SHIFT = 3.
• 236. STRING MATCHING ALGORITHMS There are many string matching algorithms, for example:
1) The naive string-matching algorithm
2) The Rabin-Karp algorithm
3) String matching with finite automata
4) The Knuth-Morris-Pratt algorithm
We discuss two of them:
1) The naive string-matching algorithm
2) The Rabin-Karp algorithm
• 237. THE NAIVE ALGORITHM The naive algorithm finds all valid shifts using a loop that checks the condition P[1…m] = T[s+1…s+m] for each of the n − m + 1 possible values of s (P = pattern, T = text/string, s = shift).
NAIVE-STRING-MATCHER(T, P)
1) n = T.length
2) m = P.length
3) for s = 0 to n − m
4)   if P[1…m] == T[s+1…s+m]
5)     print "Pattern occurs with shift" s
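In Python the same idea is a few lines (illustrative; 0-indexed slicing yields the same shift values as the 1-indexed pseudocode):

def naive_string_matcher(T, P):
    # Check P against T at every shift s = 0..n-m
    n, m = len(T), len(P)
    return [s for s in range(n - m + 1) if T[s:s + m] == P]

print(naive_string_matcher("1011101110", "111"))  # [2, 6]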
• 238. EXAMPLE Suppose T = 1011101110 and P = 111. Find all valid shifts.
s = 0: T[1..3] = 101 ≠ 111, not a valid shift.
• 239. s = 1: T[2..4] = 011 ≠ 111, not a valid shift.
• 240. s = 2: T[3..5] = 111 = P, so s = 2 is a valid shift.
• 241. s = 3: T[4..6] = 110 ≠ 111, not a valid shift.
• 242. s = 4: T[5..7] = 101 ≠ 111, not a valid shift.
• 243. s = 5: T[6..8] = 011 ≠ 111, not a valid shift.
• 244. s = 6: T[7..9] = 111 = P, so s = 6 is a valid shift.
• 245. s = 7: T[8..10] = 110 ≠ 111, not a valid shift.
• 246. THE RABIN-KARP ALGORITHM Rabin and Karp proposed a string matching algorithm that performs well in practice and that also generalizes to other algorithms for related problems, such as two-dimensional pattern matching.
• 247. ALGORITHM
RABIN-KARP-MATCHER(T, P, d, q)
1) n = T.length
2) m = P.length
3) h = d^(m−1) mod q
4) p = 0
5) t0 = 0
6) for i = 1 to m                  // pre-processing
7)   p = (d·p + P[i]) mod q
8)   t0 = (d·t0 + T[i]) mod q
9) for s = 0 to n − m              // matching
10)  if p == ts
11)    if P[1…m] == T[s+1…s+m]
12)      print "Pattern occurs with shift" s
13)  if s < n − m
14)    ts+1 = (d(ts − T[s+1]·h) + T[s+m+1]) mod q
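A minimal Python sketch of this matcher over digit strings (illustrative; it confirms every hash hit character by character to weed out spurious hits):

def rabin_karp_matcher(T, P, d=10, q=11):
    n, m = len(T), len(P)
    h = pow(d, m - 1, q)                 # d^(m-1) mod q
    p = t = 0
    for i in range(m):                   # pre-processing: initial hashes
        p = (d * p + int(P[i])) % q
        t = (d * t + int(T[i])) % q
    shifts = []
    for s in range(n - m + 1):
        if p == t and T[s:s + m] == P:   # hash hit, then exact check
            shifts.append(s)
        if s < n - m:                    # roll the window hash to shift s+1
            t = (d * (t - int(T[s]) * h) + int(T[s + m])) % q
            # Python's % already yields a non-negative result here
    return shifts

print(rabin_karp_matcher("31415926535", "26"))  # [6]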
• 248. EXAMPLE Given pattern P = 26, how many spurious hits does the Rabin-Karp matcher encounter in the text T = 3 1 4 1 5 9 2 6 5 3 5?
Take d = 10 and q = 11. Then P mod q = 26 mod 11 = 4. Now look for windows of T whose value mod 11 equals 4.
• 249. s = 0: 31 mod 11 = 9, not equal to 4.
s = 1: 14 mod 11 = 3, not equal to 4.
s = 2: 41 mod 11 = 8, not equal to 4.
• 250. s = 3: 15 mod 11 = 4, equal to 4 → SPURIOUS HIT.
s = 4: 59 mod 11 = 4, equal to 4 → SPURIOUS HIT.
s = 5: 92 mod 11 = 4, equal to 4 → SPURIOUS HIT.
• 251. s = 6: 26 mod 11 = 4 → EXACT MATCH.
s = 7: 65 mod 11 = 10, not equal to 4.
s = 8: 53 mod 11 = 9, not equal to 4.
• 252. s = 9: 35 mod 11 = 2, not equal to 4.
Pattern occurs with shift 6; there are three spurious hits.
• 253. COMPARISON The naive string matching algorithm slides the pattern along one position at a time; after each slide, it checks the characters at the current shift one by one and, if all characters match, reports the match. Like the naive algorithm, the Rabin-Karp algorithm also slides the pattern one position at a time. But unlike the naive algorithm, Rabin-Karp first matches the hash value of the pattern against the hash value of the current substring of the text, and only if the hash values match does it start comparing individual characters.
• 254. Minimum Spanning Trees
• Definition of MST
• Generic MST algorithm
• Kruskal's algorithm
• Prim's algorithm
• 255. Definition of MST
• Let G = (V, E) be a connected, undirected graph.
• For each edge (u, v) in E, we have a weight w(u, v) specifying the cost (length of edge) to connect u and v.
• We wish to find an acyclic subset T of E that connects all of the vertices in V and whose total weight is minimized.
• Since the total weight is minimized, the subset T must be acyclic (no circuit).
• Thus, T is a tree. We call it a spanning tree.
• The problem of determining the tree T is called the minimum-spanning-tree problem.
• 256. Application of MST: an example
• In the design of electronic circuitry, it is often necessary to make a set of pins electrically equivalent by wiring them together.
• Running cable TV to a set of houses: what's the least amount of cable needed to still connect all the houses?
  • 257. What makes a greedy algorithm? • Feasible – Has to satisfy the problem’s constraints • Locally Optimal – The greedy part – Has to make the best local choice among all feasible choices available on that step • If this local choice results in a global optimum then the problem has optimal substructure • Irrevocable – Once a choice is made it can’t be un-done on subsequent steps of the algorithm • Simple examples: – Playing chess by making best move without lookahead – Giving fewest number of coins as change • Simple and appealing, but don’t always give the best solution
• 258. Spanning Tree
• Definition
  – A spanning tree of a graph G is a tree (acyclic) that connects all the vertices of G once
    • i.e. the tree "spans" every vertex in G
  – A Minimum Spanning Tree (MST) is a spanning tree on a weighted graph that has the minimum total weight
    w(T) = Σ_{(u,v)∈T} w(u, v), such that w(T) is minimum
Where might this be useful? Can also be used to approximate some NP-Complete problems
• 259. (figure: a connected graph on vertices a–i with edge weights, and its minimum spanning tree highlighted)
Notice that the tree is not unique: replacing (b, c) with (a, h) yields another spanning tree with the same minimum weight.
• 260. Growing a MST
• Set A is always a subset of some minimum spanning tree.
• An edge (u, v) is a safe edge for A if A ∪ {(u, v)} is still a subset of some minimum spanning tree.
GENERIC_MST(G, w)
1 A := {}
2 while A does not form a spanning tree do
3   find an edge (u, v) that is safe for A
4   A := A ∪ {(u, v)}
5 return A
• 261. How to find a safe edge
We need some definitions and a theorem.
• A cut (S, V−S) of an undirected graph G = (V, E) is a partition of V.
• An edge crosses the cut (S, V−S) if one of its endpoints is in S and the other is in V−S.
• An edge is a light edge crossing a cut if its weight is the minimum of any edge crossing the cut.
• 262. (figure: a cut (S, V−S) of the example graph)
• This figure shows a cut (S, V−S) of the graph.
• The edge (d, c) is the unique light edge crossing the cut.
• 263. The algorithms of Kruskal and Prim
• The two algorithms are elaborations of the generic algorithm.
• They each use a specific rule to determine a safe edge in GENERIC_MST.
• In Kruskal's algorithm,
  – The set A is a forest.
  – The safe edge added to A is always a least-weight edge in the graph that connects two distinct components.
• In Prim's algorithm,
  – The set A forms a single tree.
  – The safe edge added to A is always a least-weight edge connecting the tree to a vertex not in the tree.
• 264. Kruskal's algorithm (simple)
(Sort the edges in increasing order)
A := {}
while E is not empty do {
  take an edge (u, v) that is shortest in E and delete it from E
  if u and v are in different components then add (u, v) to A
}
Note: each time, a shortest remaining edge in E is considered.
• 265. Kruskal's algorithm
1 function Kruskal(G = <N, A>: graph; length: A → R+): set of edges
2   Define an elementary cluster C(v) ← {v} for each vertex v.
3   Initialize a priority queue Q to contain all edges in G, using the weights as keys.
4   Define a forest T ← Ø  // T will ultimately contain the edges of the MST
5   // n is the total number of vertices
6   while T has fewer than n−1 edges do
7     // edge (u,v) is the minimum-weight edge remaining in Q
8     (u,v) ← Q.removeMin()
9     // prevent cycles in T: add (u,v) only if T does not already contain a path
10    // between u and v, i.e., only if u and v lie in different clusters
11    Let C(v) be the cluster containing v, and let C(u) be the cluster containing u.
13    if C(v) ≠ C(u) then
14      Add edge (v,u) to T.
15      Merge C(v) and C(u) into one cluster, that is, union C(v) and C(u).
16  return tree T
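A minimal Python sketch of the same scheme (illustrative; a simple union-find structure with path halving plays the role of the clusters, and vertices are numbered 0..n−1):

def kruskal(n, edges):
    # Kruskal's MST: scan edges by increasing weight and keep an edge
    # only if its endpoints lie in different clusters
    parent = list(range(n))

    def find(x):                            # find the cluster representative
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):           # edges given as (weight, u, v)
        ru, rv = find(u), find(v)
        if ru != rv:                        # no cycle: merge the two clusters
            parent[ru] = rv
            mst.append((u, v, w))
    return mst

# Tiny made-up example: a 4-cycle with one diagonal (vertices 0..3)
edges = [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 3, 0), (5, 0, 2)]
print(kruskal(4, edges))  # [(0, 1, 1), (1, 2, 2), (2, 3, 3)]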
• 266. Kruskal's algorithm walkthrough (example adapted from Wikipedia)
This is our original graph. The numbers near the arcs indicate their weights. None of the arcs is highlighted yet.
• 267. AD and CE are the shortest arcs, with length 5, and AD has been arbitrarily chosen, so it is highlighted.
• 268. CE is now the shortest arc that does not form a cycle, with length 5, so it is highlighted as the second arc.
• 269. The next arc, DF with length 6, is highlighted using much the same method.
• 270. The next-shortest arcs are AB and BE, both with length 7. AB is chosen arbitrarily, and is highlighted. The arc BD is highlighted in red, because there already exists a path (in green) between B and D, so it would form a cycle (ABD) if it were chosen.
• 271. The process continues to highlight the next-smallest arc, BE with length 7. Many more arcs are highlighted in red at this stage: BC because it would form the loop BCE, DE because it would form the loop DEBA, and FE because it would form FEBAD.
• 272. Finally, the process finishes with the arc EG of length 9, and the minimum spanning tree is found.
• 273. Prim's algorithm (simple)
MST_PRIM(G, w, r) {
  A := {}
  S := {r}  (r is an arbitrary node in V)
  Q := V − {r}
  while Q is not empty do {
    take an edge (u, v) such that u ∈ S and v ∈ Q and (u, v) is the shortest such edge
    add (u, v) to A; add v to S and delete v from Q
  }
}
• 274. Prim's algorithm (initialization, adapted from Wikipedia)
inputs: a graph, a function returning edge weights (weight-function), and an initial vertex
for each vertex in graph
  set min_distance of vertex to ∞
  set parent of vertex to null
  set minimum_adjacency_list of vertex to empty list
  set is_in_Q of vertex to true
set min_distance of initial vertex to zero
add to minimum-heap Q all vertices in graph, keyed by min_distance
Initially all vertices are placed in the "not yet seen" set, the initial vertex is set to be added to the tree first, and all vertices are placed in a min-heap to allow removal of the minimum-distance vertex.
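A minimal Python sketch of Prim's algorithm (illustrative; instead of decreasing keys in the heap, it pushes fresh entries and skips stale ones, which computes the same tree):

import heapq

def prim(adj, r=0):
    # Prim's MST from root r: the min-heap repeatedly yields the lightest
    # edge crossing the cut (S, V-S). adj[u] is a list of (weight, v) pairs.
    n = len(adj)
    in_tree = [False] * n
    mst, heap = [], [(0, r, -1)]       # entries are (weight, vertex, parent)
    while heap:
        w, u, parent = heapq.heappop(heap)
        if in_tree[u]:
            continue                   # stale entry: u is already in S
        in_tree[u] = True
        if parent != -1:
            mst.append((parent, u, w))
        for weight, v in adj[u]:
            if not in_tree[v]:
                heapq.heappush(heap, (weight, v, u))
    return mst

# Same made-up 4-vertex graph as in the Kruskal sketch
adj = [[(1, 1), (4, 3), (5, 2)], [(1, 0), (2, 2)],
       [(2, 1), (3, 3), (5, 0)], [(3, 2), (4, 0)]]
print(prim(adj))  # [(0, 1, 1), (1, 2, 2), (2, 3, 3)]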