CS-323 DAA.pdf

DAA notes by Pallavi Joshi
Chapter 1: The Role of Algorithms in
Computing

Algorithms
Informally, an algorithm is …
A well-defined computational procedure that takes
some value, or set of values, as input and produces
some value, or set of values, as output.
algorithm
input output
A sequence of computational steps that transform the
input into output.

Algorithms
Empirically, an algorithm is …
A tool for solving a well-specified computational
problem.
Problem specification includes what the input is, what
the desired output should be.
Algorithm describes a specific computational
procedure for achieving the desired output for a given
input.

Algorithms
The Sorting Problem:
Input: A sequence of n numbers [a1, a2, … , an].
Output: A permutation or reordering [a'1, a'2, … , a'n ] of the input
sequence such that a'1  a'2  …  a'n .
An instance of the Sorting Problem:
Input: A sequence of 6 number [31, 41, 59, 26, 41, 58].
Expected output for given instance:
Expected
Output: The permutation of the input [26, 31, 41, 41, 58 , 59].

Algorithms
Some definitions …
An algorithm is said to be correct if, for every input
instance, it halts with the correct output.
A correct algorithm solves the given computational
problem.
Focus will be on correct algorithms; incorrect
algorithms can sometimes be useful.
Algorithm specification may be in English, as a
computer program, even as a hardware design.

Gallery of Problems
The Human Genome Project seeks to identify
all the 100,000 genes in human DNA,
determining the sequences of the 3 billion
chemical base pairs comprising human DNA,
storing this information in databases, and
developing tools for data analysis.
Algorithms are needed (most of which are novel) to solve the
many problems listed here …
The huge network that is the Internet and the
huge amount of data that courses through it
require algorithms to efficiently manage and
manipulate this data.

Gallery of Problems
E-commerce enables goods and services to be
negotiated and exchanged electronically.
Crucial is the maintenance of privacy and
security for all transactions.
Traditional manufacturing and commerce
require allocation of scarce resources in the
most beneficial way. Linear programming
algorithms are used extensively in commercial
optimization problems.

Some algorithms
• Shortest path algorithm
– Given a weighted graph and two
distinguished vertices -- the source and
the destination
-- compute the most efficient way to get
from one to the other
• Matrix multiplication algorithm
– Given a sequence of conformable
matrices, compute the most efficient way
of forming the product of the matrix
sequence

Some algorithms
• Convex hull algorithm
– Given a set of points on the plane,
compute the smallest convex body that
contains the points
• String matching algorithm
– Given a sequence of characters, compute
where (if at all) a second sequence of
characters occurs in the first

Hard problems
• Usual measure of efficiency is speed
– How long does an algorithm take to produce its result?
– Define formally measures of efficiency
• Problems exist that, in all probability, will take a
long time to solve
– Exponential complexity
– NP-complete problems
• Problems exist that are unsolvable

Hard problems
• NP-complete problems are interesting in and of
themselves
– Some of them arise in real applications
– Some of them look very similar to problems for which
efficient solutions do exist
– Knowing the difference is crucial
• Not known whether NP-complete problems really
are as hard as they seem, or, perhaps, the
machinery for solving them efficiently has not been
developed just yet

Hard problems
• P  NP conjecture
– Fundamental open problem in
the theory of computational
complexity
– Open now for 30+ years

Algorithms as a technology
• Even if computers were infinitely fast and
memory was plentiful and free
– Study of algorithms still important – still need to
establish algorithm correctness
– Since time and space resources are infinite, any
correct algorithm would do
• Real-world computers are fast but not
infinitely so
• Memory is cheap but not unlimited

Efficiency
• Time and space efficiency are the goal
• Algorithms often differ dramatically in their
efficiency
– Example: Two sorting algorithms
• INSERTION-SORT – time efficiency is c1n2
• MERGE-SORT – time efficiency is c1nlogn
– For which problem instances would one algorithm
be preferable to the other?

Efficiency
– Answer depends on several factors:
• Speed of machine performing the computation
– Internal clock speed
– Shared environment
– I/O needed by algorithm
• Quality of implementation (coding)
– Compiler optimization
– Implementation details (e.g., data structures)
• Size of problem instance
– Most stable parameter – used as independent variable

Efficiency
• INSERTION-SORT
– Implemented by an ace programmer and run on a machine A that
performs 109
instructions per second such that time efficiency is given
by:
tA(n) = 2n2
instructions (i.e., c1=2)
• MERGE-SORT
– Implemented by a novice programmer and run on a machine B that
performs 107
instructions per second such that time efficiency is given
by:
tB(n) = 50nlogn instructions (i.e., c1=50)

Efficiency
n 2n2
/109
50nlogn/107
10,000 0.20 0.66
50,000 5.00 3.90
100,000 20.00 8.30
500,000 500.00 47.33
1,000,000 2,000.00 99.66
5,000,000 50,000.00 556.34
10,000,000 200,000.00 1,162.67
50,000,000 5,000,000.00 6,393.86
Problem
Size
Machine A
Insertion-
Sort
Machine B
Merge-
Sort

Efficiency
• Graphical comparison
Time Efficiency Comparison
0.00
2.00
4.00
6.00
8.00
10.00
1 9 17 25 33 41 49 57 65
Size of Problem (in 1000s)
Seconds
Insertion Sort
Merge Sort

The Sorting Problem
Input: A sequence of n numbers [a1, a2, … , an].
Output: A permutation or reordering [a'1, a'2, … , a'n ] of the input
sequence such that a'1  a'2  …  a'n .
An instance of the Sorting Problem:
Input: A sequence of 6 number [31, 41, 59, 26, 41, 58].
Expected output for given instance:
Expected
Output: The permutation of the input [26, 31, 41, 41, 58 , 59].

Insertion Sort
The main idea …

Insertion Sort (cont.)

Insertion Sort (cont.)
The algorithm …

Loop Invariant
• Property of A[1 .. j  1]
At the start of each iteration of the for loop of lines 1 8, the
subarray A[1 .. j  1] consists of the elements originally in
A[1 .. j  1] but in sorted order.
• Need to establish the following re invariant:
– Initialization: true prior to first iteration
– Maintenance: if true before iteration, remains true after
iteration
– Termination: at loop termination, invariant implies
correctness of algorithm

Analyzing Algorithms
• Has come to mean predicting the resources that the
algorithm requires
• Usually computational time is resource of primary
importance
• Aims to identify best choice among several alternate
algorithms
• Requires an agreed-upon “model” of computation
• Shall use a generic, one-processor, random-access
machine (RAM) model of computation

Random-Access Machine
• Instructions are executed one after another (no
concurrency)
• Admits commonly found instructions in “real”
computers, data movement operations, control
mechanism
• Uses common data types (integer and float)
• Other properties discussed as needed
• Care must be taken since model of computation has
great implications on resulting analysis

Analysis of Insertion Sort
• Time resource requirement depends on input size
• Input size depends on problem being studied;
frequently, this is the number of items in the input
• Running time: number of primitive operations or
“steps” executed for an input
• Assume constant amount of time for each line of
pseudocode

Analysis of Insertion Sort
Time efficiency analysis …

Best Case Analysis
• Least amount of (time) resource ever needed by algorithm
• Achieved when incoming list is already sorted in increasing order
• Inner loop is never iterated
• Cost is given by:
T(n) = c1n+c2 (n1)+c4 (n1)+c5(n1)+c8(n1)
= (c1+c2+c4+c5+c8)n  (c2+c4+c5+c8)
= an + b
• Linear function of n

Worst Case Analysis
• Greatest amount of (time) resource ever needed by algorithm
• Achieved when incoming list is in reverse order
• Inner loop is iterated the maximum number of times, i.e., tj = j
• Therefore, the cost will be:
T(n) = c1n + c2 (n1)+c4 (n1) + c5((n(n+1)/2) 1) + c6(n(n1)/2)
+ c7(n(n1)/2) + c8(n1)
= ( c5 /2 + c6 /2 + c7/2 ) n2 + (c1+c2+c4+c5 /2  c6 /2  c7 /2 +c8 ) n
 ( c2 + c4 + c5 + c8 )
= an2 + bn + c
• Quadratic function of n

Future Analyses
• For the most part, subsequent analyses will
focus on:
– Worst-case running time
• Upper bound on running time for any input
– Average-case analysis
• Expected running time over all inputs
• Often, worst-case and average-case have the
same “order of growth”

Order of Growth
• Simplifying abstraction: interested in rate of growth or
order of growth of the running time of the algorithm
• Allows us to compare algorithms without worrying
about implementation performance
• Usually only highest order term without constant
coefficient is taken
• Uses “theta” notation
– Best case of insertion sort is (n)
– Worst case of insertion sort is (n2)

Designing Algorithms
• Several techniques/patterns for designing algorithms
exist
• Incremental approach: builds the solution one
component at a time
• Divide-and-conquer approach: breaks original problem
into several smaller instances of the same problem
– Results in recursive algorithms
– Easy to analyze complexity using proven techniques

Divide-and-Conquer
• Technique (or paradigm) involves:
– “Divide” stage: Express problem in terms of
several smaller subproblems
– “Conquer” stage: Solve the smaller subproblems
by applying solution recursively – smallest
subproblems may be solved directly
– “Combine” stage: Construct the solution to
original problem from solutions of smaller
subproblem

n
(sorted)
MERGE
Merge Sort Strategy
• Divide stage: Split the n-element
sequence into two subsequences of
n/2 elements each
• Conquer stage: Recursively sort the
two subsequences
• Combine stage: Merge the two
sorted subsequences into one
sorted sequence (the solution)
n
(unsorted)
n/2
(unsorted)
n/2
(unsorted)
MERGE SORT MERGE SORT
n/2
(sorted)
n/2
(sorted)

Merging Sorted Sequences

Merging Sorted Sequences
•Combines the sorted
subarrays A[p..q] and
A[q+1..r] into one sorted
array A[p..r]
•Makes use of two working
arrays L and R which
initially hold copies of the
two subarrays
•Makes use of sentinel
value () as last element
to simplify logic
(1)
(n)
(1)
(n)

Merge Sort Algorithm
(1)
(n)
T(n/2)
T(n/2)
T(n) = 2T(n/2) + (n)

Analysis of Merge Sort
Analysis of recursive calls …

T(n) = cn(lg n + 1)
= cnlg n + cn
T(n) is (n lg n)

Chapter 3: Growth of Functions

Overview
• Order of growth of functions provides a simple
characterization of efficiency
• Allows for comparison of relative performance
between alternative algorithms
• Concerned with asymptotic efficiency of algorithms
• Best asymptotic efficiency usually is best choice except
for smaller inputs
• Several standard methods to simplify asymptotic
analysis of algorithms

Asymptotic Notation
• Applies to functions whose domains are the set of
natural numbers:
N = {0,1,2,…}
• If time resource T(n) is being analyzed, the function’s
range is usually the set of non-negative real numbers:
T(n)  R+
• If space resource S(n) is being analyzed, the function’s
range is usually also the set of natural numbers:
S(n)  N

Asymptotic Notation
• Depending on the textbook, asymptotic
categories may be expressed in terms of --
a. set membership (our textbook): functions
belong to a family of functions that exhibit
some property; or
b. function property (other textbooks): functions
exhibit the property
• Caveat: we will formally use (a) and
informally use (b)

The Θ-Notation
f
c1 ⋅ g
n0
c2 ⋅ g
Θ(g(n)) = { f(n) : ∃c1, c2 > 0, n0 > 0 s.t. ∀n ≥ n0:
c1 · g(n) ≤ f(n) ≤ c2 ⋅ g(n) }

The O-Notation
f
c ⋅ g
n0
O(g(n)) = { f(n) : ∃c > 0, n0 > 0 s.t. ∀n ≥ n0: f(n) ≤ c ⋅ g(n) }

The Ω-Notation
Ω(g(n)) = { f(n) : ∃c > 0, n0 > 0 s.t. ∀n ≥ n0: f(n) ≥ c ⋅ g(n) }
f
c ⋅ g
n0

The o-Notation
o(g(n)) = { f(n) : ∀c > 0 ∃n0 > 0 s.t. ∀n ≥ n0: f(n) ≤ c ⋅ g(n) }
f
c1 ⋅ g
n1
c2 ⋅ g
c3 ⋅ g
n2 n3

The ω-Notation
f
c1 ⋅ g
n1
c2 ⋅ g
c3 ⋅ g
n2
n3
ω(g(n)) = { f(n) : ∀c > 0 ∃n0 > 0 s.t. ∀n ≥ n0: f(n) ≥ c ⋅ g(n) }

Comparison of Functions
• f(n) = O(g(n)) and
g(n) = O(h(n)) ⇒ f(n) = O(h(n))
• f(n) = Ω(g(n)) and
g(n) = Ω(h(n)) ⇒ f(n) = Ω(h(n))
• f(n) = Θ(g(n)) and
g(n) = Θ(h(n)) ⇒ f(n) = Θ(h(n))
• f(n) = O(f(n))
f(n) = Ω(f(n))
f(n) = Θ(f(n))
Reflexivity
Transitivity

• f(n) = Θ(g(n)) ⇐⇒ g(n) = Θ(f(n))
• f(n) = O(g(n)) ⇐⇒ g(n) = Ω(f(n))
• f(n) = o(g(n)) ⇐⇒ g(n) = ω(f(n))
• f(n) = O(g(n)) and
f(n) = Ω(g(n)) ⇒ f(n) = Θ(g(n))
Transpose
Symmetry
Symmetry
Theorem 3.1

Asymptotic
Analysis and Limits

• f1(n) = O(g1(n)) and f2(n) = O(g2(n)) ⇒
f1(n) + f2(n) = O(g1(n) + g2(n))
• f(n) = O(g(n)) ⇒ f(n) + g(n) = O(g(n))

Standard Notation and
Common Functions
• Monotonicity
A function f(n) is monotonically increasing if m  n
implies f(m)  f(n) .
A function f(n) is monotonically decreasing if m 
n implies f(m)  f(n) .
A function f(n) is strictly increasing
if m < n implies f(m) < f(n) .
A function f(n) is strictly decreasing
if m < n implies f(m) > f(n) .

Common Functions
• Floors and ceilings
For any real number x, the greatest integer less
than or equal to x is denoted by x.
For any real number x, the least integer greater
than or equal to x is denoted by x.
For all real numbers x,
x1 < x  x  x < x+1.
Both functions are monotonically increasing.

Common Functions
• Exponentials
For all n and a1, the function an is the exponential function
with base a and is monotonically increasing.
• Logarithms
Textbook adopts the following convention
lg n = log2n (binary logarithm),
ln n = logen (natural logarithm),
lgk n = (lg n)k (exponentiation),
lg lg n = lg(lg n) (composition),
lg n + k = (lg n)+k (precedence of lg).
ai

Common Functions
• Important relationships
For all real constants a and b such that a>1,
nb = o(an)
that is, any exponential function with a base
strictly greater than unity grows faster than any
polynomial function.
For all real constants a and b such that a>0,
lgbn = o(na)
that is, any positive polynomial function grows
faster than any polylogarithmic function.

Common Functions
• Factorials
For all n the function n! or “n factorial” is given by
n! = n  (n1)  (n  2)  (n  3)  …  2  1
It can be established that
n! = o(nn)
n! = (2n)
lg(n!) = (nlgn)

Common Functions
• Functional iteration
The notation f (i)(n) represents the function f(n) iteratively
applied i times to an initial value of n, or, recursively
f (i)(n) = n if n=0
f (i)(n) = f(f (i1)(n)) if n>0
Example:
If f(n) = 2n
then f (2)(n) = f(2n) = 2(2n) = 22n
then f (3)(n) = f(f (2)(n)) = 2(22n) = 23n
then f (i)(n) = 2in

Common Functions
• Iterated logarithmic function
The notation lg* n which reads “log star of n” is defined as
lg* n = min {i0 : lg(i) n  1
Example:
lg* 2 = 1
lg* 4 = 2
lg* 16 = 3
lg* 65536 = 4
lg* 265536 = 5

Asymptotic Running Time
of Algorithms
• We consider algorithm A better than algorithm B if
TA(n) = o(TB(n))
• Why is it acceptable to ignore the behavior of
algorithms for small inputs?
• Why is it acceptable to ignore the constants?
• What do we gain by using asymptotic notation?

Things to Remember
• Asymptotic analysis studies how the values of
functions compare as their arguments grow
without bounds.
• Ignores constants and the behavior of the function
for small arguments.
• Acceptable because all algorithms are fast for small
inputs and growth of running time is more important
than constant factors.

Things to Remember
• Ignoring the usually unimportant details, we obtain a
representation that succinctly describes the growth of
a function as its argument grows and thus allows us to
make comparisons between algorithms in terms of
their efficiency.

Chapter 4: Recurrences
Overview
Define what a recurrence is
Discuss three methods of solving recurrences
Substitution method
Recursion-tree method
Master method
Examples of each method
64

Definition
A recurrence is an equation or inequality that describes a function in terms of its
value on smaller inputs.
Example from MERGE-SORT
T(n) =
(1) if n=1
2T(n/2) + (n) if n>1
Technicalities
Normally, independent variables only assume integral values
Example from MERGE-SORT revisited
(1) if n=1
T(n) =
T( n/2 ) + T( n/2 ) + (n) if n>1
For simplicity, ignore floors and ceilings – often insignificant
65

Technicalities
Boundary conditions (small n) are also glossed over
T(n) = 2T(n/2) + (n)
Value of T(n) assumed to be small constant for small n
Substitution Method
Involves two steps:
1.Guess the form of the solution.
2.Use mathematical induction to find the constants and show the solution works.
Drawback: applied only in cases where it is easy to guess at solution
Useful in estimating bounds on true solution even if latter is
unidentified
66

Substitution Method
Example:
T(n) = 2T(n/2 ) + n
Guess:
T(n) = O(n lg n)
Prove by induction:
T(n)  cn lg n
for suitable c>0.
Inductive Proof
We’ll not worry about the basis case for the moment – we’ll choose this as
needed – clearly we have:
T(1) = (1)  cn lg n
Inductive hypothesis:
For values of n < k the inequality holds, i.e., T(n)  cn lg n
We need to show that this holds for
n = k as well.
67

Inductive Proof
In particular, for n = k/2 , the inductive hypothesis should hold, i.e.,
T( k/2 )  c k/2 lg k/2
The recurrence gives us:
T(k) = 2T( k/2 ) + k Substituting the inequality above yields:
T(k)  2[c k/2 lg k/2 ] + k
Inductive Proof
Because of the non-decreasing nature of the functions involved, we can
drop the “floors” and obtain:
T(k)  2[c (k/2) lg (k/2)] + k
Which simplifies to:
T(k)  ck (lg k  lg 2) + k
Or, since lg 2 = 1, we have:
T(k)  ck lg k  ck + k = ck lg k + (1 c)k
So if c  1, T(k)  ck lg k Q.E.D.
68

Recursion-Tree Method
Straightforward technique of coming up with a good
guess
Can help the Substitution Method
Recursion tree: visual representation of recursive call
hierarchy where each node represents the cost of a single
subproblem
T(n) = 3T( n/4 ) + (n2)
69

T(n) = 3T( n/4 ) + (n2)
T(n) = 3T( n/4 ) + (n2)
70

T(n) = 3T( n/4 ) + (n2)
T(n) = (3/16)icn2 + (nlog43)
i=0
Gathering all the costs together:
log4n1
T(n)  (3/16)icn2 + o(n)
i=0

T(n)  (1/(13/16))cn2 + o(n)
T(n)  (16/13)cn2 + o(n)
T(n) = O(n2)
71

T(n) = T(n/3) + T(2n/3) + O(n)
An overestimate of the total cost:
log3/2n1
T(n) = cn + (nlog3/22)
i=0
Counter-indications:
T(n) = O(n lg n) + (n lg n)
Notwithstanding this, use as “guess”:
T(n) = O(n lg n)
72

Substitution Method
Recurrence:
T(n) = T(n/3) + T(2n/3) + cn
Guess:
T(n) = O(n lg n)
Prove by induction:
T(n)  dn lg n
for suitable d>0 (we already use c)
Inductive Proof
Again, we’ll not worry about the basis case
Inductive hypothesis:
For values of n < k the inequality holds, i.e., T(n)  dn lg n
We need to show that this holds for
n = k as well.
In particular, for n = k/3, and n = 2k/3, the inductive hypothesis
should hold…
73

Inductive Proof
That is
T(k/3)  d k/3 lg k/3 T(2k/3)  d 2k/3 lg 2k/3
The recurrence gives us:
T(k) = T(k/3) + T(2k/3) + ck
Substituting the inequalities above yields:
T(k)  [d (k/3) lg (k/3)] + [d (2k/3) lg (2k/3)] + ck
Inductive Proof
Expanding, we get
T(k)  [d (k/3) lg k  d (k/3) lg 3] +
[d (2k/3) lg k  d (2k/3) lg(3/2)] + ck
Rearranging, we get:
T(k)  dk lg k  d[(k/3) lg 3 + (2k/3) lg(3/2)] + ck T(k)  dk lg k  dk[lg 3  2/3] + ck
When dc/(lg3  (2/3)), we should have the desired:
T(k)  dk lg k
74

Master Method
Provides a “cookbook” method for solving recurrences
Recurrence must be of the form:
T(n) = aT(n/b) + f(n)
where a1 and b>1 are constants and f(n) is an
asymptotically positive function.
Master Method
Theorem 4.1:
Given the recurrence previously defined, we have:
1.If f(n) = O(n logba)
for some constant >0, then T(n) = (n logba)
2.If f(n) = (n logba),
then T(n) = (nlogba lg n)
75

Master Method
3. If f(n) = (n logba+)
for some constant >0, and if
af(n/b)  cf(n)
for some constant c<1
and all sufficiently large n, then T(n) = (f(n))
Example
Estimate bounds on the following recurrence:
Use the recursion tree method to arrive at a “guess” then verify using
induction
Point out which case in the Master Method this falls in
76

Recursion Tree
Recurrence produces the following tree:
Cost Summation
Collecting the level-by-level costs:
A geometric series with base less than one; converges to a finite sum,
hence, T(n) = (n2)
77

Exact Calculation
If an exact solution is preferred:
Using the formula for a partial geometric series:
78

Master Theorem (Simplified)
79

• Divide and Conquer

Divide and Conquer
-
Recursive in structure
Divide the problem into sub-problems that are
similar to the original but smaller in size
Conquer the sub-problems by solving them
recursively. If they are small enough, just
solve them in a straightforward manner.
Combine the solutions to create a solution to
the original problem

An Example: Merge Sort
Comp 122
-
Sorting Problem: Sort a sequence of n elements into
non-decreasing order.
Divide: Divide the n-element sequence to be
sorted into two subsequences of n/2 elements each
Conquer:Sort the two subsequences recursively
using merge sort.
Combine: Merge the two sorted
subsequences to produce the sorted answer.

Merge Sort – Example
Comp 122
18 26 32 6 43 15 9 1
18 26 32 6 43 15 9 1
18 26 32 6 43 15 9 1
26
18 6
32 15
43 1
9
18 26 32 6 43 15 9 1
18 26 32 6 43 15 9 1
18 26 32
6 15 43 1 9
6 18 26 32 1 9 15 43
1 6 9 15 18 26 32 43
18 26 32 6
18 26 32 6
43 15 9 1
43 15 9 1
18 26 32 6 43 15 9 1
6 43 15
18 26 32 6 43 15 9 1 18 26 32 9 1
18 26 32 6 43 15 9 1
1 9
18 26 6 32 15 43
- 5
6 18 26 32 1 9 15 43
Original Sequence Sorted Sequence

Merge-Sort (A, p, r)
Comp 122
- 6
INPUT: a sequence of n numbers stored in array A
OUTPUT: an ordered sequence of n numbers
MergeSort (A, p, r) // sort A[p..r] by divide & conquer
1 if p < r
2 then q  (p+r)/2
3 MergeSort (A, p, q)
4 MergeSort (A, q+1, r)
5 Merge (A, p, q, r) // merges A[p..q] with A[q+1..r]
Initial Call: MergeSort(A, 1, n)

Comp 122
- 7
Procedure
Merge
Merge(A, p, q, r)
1 n1  q – p + 1
2 n2  r – q
3 for i  1 to n1
4 do L[i]  A[p + i – 1]
5for j  1 to n2
6do R[j]  A[q + j] 7
L[n1+1]  
8 R[n2+1]  
9 i  1
10 j  1
11for k p to r
12do if L[i]  R[j]
13then A[k]  L[i]
14 i  i + 1
15 else A[k]  R[j]
16 j  j + 1
Sentinels, to avoid having to
check if either subarray is
fully copied at each step.
Input: Array containing
sorted subarrays A[p..q]
and A[q+1..r].
Output: Merged sorted
subarray in A[p..r].

Comp 122
- 8
j
Merge – Example
6 8 26 32 1 9 42 43
A
k
6 8 26 32 1 9 42 43
k k k k k k k
i i i i i j j j j
6 8 26 32  1 9 42 43 
… 1 6 8 9 26 32 42 43 …
k
L R

Comp 122
- 9
Merg
e
Merge(A, p, q, r)
1 n1  q – p + 1
2 n2  r – q
3 for i  1 to n1
4 do L[i]  A[p + i – 1]
5for j  1 to n2
6do R[j]  A[q + j] 7
L[n1+1]  
8 R[n2+1]  
9 i  1
10 j  1
11for k p to r
14 i  i + 1
16 j  j + 1
Loop Invariant for the for loop At the start
of each iteration of the for loop:
Subarray A[p..k – 1] contains the k – p
smallest elements of L and R in sorted order.
L[i] and R[j] are the smallest elements of
L and R that have not been copied back into
A.
Initialization:
Before the first iteration:
•A[p..k – 1] is empty.
•i = j = 1.
•L[1] and R[1] are the smallest
elements of L and R not copied to A.

Comp 122
- 10
Merg
e
Merge(A, p, q, r)
1 n1  q – p + 1
2 n2  r – q
2
3 for i  1 to n1
4 do L[i]  A[p + i – 1]
5for j  1 to n
6do R[j]  A[q + j] 7
L[n1+1]  
8 R[n2+1]  
9 i  1
10 j  1
11for k p to r
14 i  i + 1
16 j  j + 1
Maintenance:
Case 1: L[i]  R[j]
•By LI, A contains p – k smallest elements
of L and R in sorted order.
•By LI, L[i] and R[j] are the smallest
elements of L and R not yet copied into A.
•Line 13 results in A containing p – k + 1
smallest elements (again in sorted order).
Incrementing i and k reestablishes the LI
for the next iteration.
Similarly for L[i] > R[j].
Termination:
•On termination, k = r + 1.
•By LI, A contains r – p + 1 smallest
elements of L and R in sorted order.
•L and R together contain r – p + 3 elements.
All but the two sentinels have been copied
back into A.

Comp 122
- 11
Running time T(n) of Merge Sort:
Divide: computing the middle takes (1)
Conquer: solving 2 subproblems takes 2T(n/2)
Combine: merging n elements takes (n)
Total:
T(n) = (1)
T(n) = 2T(n/2) + (n)
if n = 1
if n > 1
 T(n) = (n lg n) (CLRS, Chapter 4)

Comp 122, Spring 2004
Recurrences – I

Recurrence Relations
Comp 122
- 13
Equation or an inequality that characterizes a
function by its values on smaller inputs.
Solution Methods (Chapter 4)
Substitution Method.
Recursion-tree Method.
Master Method.
Recurrence relations arise when we analyze the
running time of iterative or recursive algorithms.
Ex: Divide and Conquer.
T(n) = (1)
T(n) = a T(n/b) + D(n) + C(n)
if n  c
otherwise

Substitution
Method
Comp 122
- 14
Guess the form of the solution, then
use mathematical induction to show it correct.
Substitute guessed answer for the function when the
inductive hypothesis is applied to smaller values –
hence, the name.
Works well when the solution is easy to guess.
No general way to guess the correct solution.

Example – Exact Function
Comp 122
- 15
if n = 1
if n > 1
Recurrence: T(n) = 1
T(n) = 2T(n/2) + n
Guess: T(n) = n lg n + n.
Induction:
•Basis: n = 1  n lgn + n = 1 = T(n).
•Hypothesis: T(k) = k lg k + k for all k < n.
•Inductive Step: T(n) = 2 T(n/2) + n
= 2 ((n/2)lg(n/2) + (n/2)) + n
= n (lg(n/2)) + 2n
= n lg n – n + 2n
= n lg n + n

Recursion-tree
Method
Comp 122
- 16
Making a good guess is sometimes difficult with
the substitution method.
Use recursion trees to devise good guesses.
Recursion Trees
Show successive expansions of recurrences using
trees.
Keep track of the time spent on the subproblems of a
divide and conquer algorithm.
Help organize the algebraic bookkeeping necessary
to solve a recurrence.

Recursion Tree – Example
Comp 122
- 17
Running time of Merge Sort:
if n = 1
if n > 1
T(n) = (1)
T(n) = 2T(n/2) + (n)
Rewrite the recurrence as
T(n) = c
T(n) = 2T(n/2) + cn
if n = 1
if n > 1
c > 0: Running time for the base case and
time per array element for the divide and
combine steps.

Recursion Tree for Merge Sort
Comp 122
- 18
For the original problem,
we have a cost of cn,
plus two subproblems
each of size (n/2) and
running time T(n/2).
cn
T(n/2) T(n/2)
Each of the size n/2 problems
has a cost of cn/2 plus two
subproblems, each costing
T(n/4).
cn
cn/2 cn/2
T(n/4) T(n/4) T(n/4) T(n/4)
Cost of divide and
merge.
Cost of sorting
subproblems.

Comp 122
- 19
Continue expanding until the problem size reduces to 1.
cn
cn/2 cn/2
cn/4 cn/4 cn/4 cn/4
c c c c c c
lg n
cn
cn
cn
cn
Total : cnlgn+cn

Comp 122
- 20
Continue expanding until the problem size reduces to 1.
cn
cn/2 cn/2
cn/4 cn/4 cn/4 cn/4
c c c c c c
•Each level has total cost cn.
•Each time we go down one level,
the number of subproblems
doubles, but the cost per
subproblem halves
 cost per level remains the same.
•There are lg n + 1 levels, height is
lg n. (Assuming n is a power of 2.)
•Can be proved by induction.
•Total cost = sum of costs at each
level = (lg n + 1)cn = cnlgn + cn =
(n lgn).

Other Examples
Comp 122
- 21
Use the recursion-tree method to determine a
guess for the recurrences
T(n) = 3T(n/4) + (n2).
T(n) = T(n/3) + T(2n/3) + O(n).

Recursion Trees – Caution Note
Comp 122
- 22
Recursion trees only generate guesses.
Verify guesses using substitution method.
A small amount of “sloppiness” can be
tolerated. Why?
If careful when drawing out a recursion tree and
summing the costs, can be used as direct proof.

The Master
Method
Comp 122
- 23
Based on the Master theorem.
“Cookbook” approach for solving recurrences
of the form
T(n) = aT(n/b) + f(n)
• a  1, b > 1 are constants.
• f(n) is asymptotically positive.
• n/b may not be an integer, but we ignore floors and
ceilings. Why?
Requires memorization of three cases.

The Master
Theorem
• Theorem 4.1
• Let a  1 and b > 1 be constants, let f(n) be a
function, and
• Let T(n) be defined on nonnegative integers by
the recurrence T(n) = aT(n/b) + f(n), where we
can replace n/b by n/b or n/b . T(n) can be
bounded asymptotically in three cases:
• 1. If f(n) = O(nlogba– ) for some constant > 0, then
T(n) = (nlogba).
• 2. If f(n) = (nlogba), then T(n) = (nlogbalg n).
Comp 122
- 24
Theorem 4.1
Let a  1 and b > 1 be constants, let f(n) be a function, and
Let T(n) be defined on nonnegative integers by the recurrence T(n) =
aT(n/b) + f(n), where we can replace n/b by n/b or n/b. T(n) can
be bounded asymptotically in three cases:
1.If f(n) = O(nlogba– ) for some constant  > 0, then T(n) =
(nlogba).
2.If f(n) = (nlogba), then T(n) = (nlogbalg n).
3.If f(n) =  (nlogba+ ) for some constant  > 0,
and if, for some constant c < 1 and all sufficiently large n, we have
a·f(n/b)  c f(n), then T(n) = (f(n)).
We’ll return to recurrences as we need them…

• Heap Sort Algorithm

DAA notes by Pallavi Joshi 104
Special Types of Trees
• Def: Full binary tree = a
binary tree in which each node
is either a leaf or has degree
exactly 2.
• Def: Complete binary tree =
a binary tree in which all
leaves are on the same level
and all internal nodes have
degree 2.
Full binary tree
2
14 8
1
16
7
4
3
9 10
12
Complete binary tree
2
1
16
4
3
9 10

Definitions
• Height of a node = the number of edges on the longest simple
path from the node down to a leaf
• Level of a node = the length of a path from the root to the
node
• Height of tree = height of root node
2
14 8
1
16
4
3
9 10
Height of root = 3
Height of (2)= 1 Level of (10)= 2

Useful Properties
2
14 8
1
16
4
3
9 10
Height of root = 3
Height of (2)= 1 Level of (10)= 2
heigh
t
heigh
t
1
1
0
2 1
2 2 1
2 1
d
d
l d
l
n




   


(see Ex 6.1-2, page 129)

The Heap Data Structure
• Def: A heap is a nearly complete binary tree
with the following two properties:
– Structural property: all levels are full, except
possibly the last one, which is filled from left to
right
– Order (heap) property: for any node x
Parent(x) ≥ x
Heap
5
7
8
4
2
From the heap
property, it follows
that:
“The root is the
maximum
element of the heap!”
A heap is a binary tree that is filled in

Array Representation of Heaps
• A heap can be stored as an array
A.
– Root of tree is A[1]
– Left child of A[i] = A[2i]
– Right child of A[i] = A[2i + 1]
– Parent of A[i] = A[ i/2 ]
– Heapsize[A] ≤ length[A]
• The elements in the subarray
A[(n/2+1) .. n] are leaves

Heap Types
• Max-heaps (largest element at root), have the
max-heap property:
– for all nodes i, excluding the root:
A[PARENT(i)] ≥ A[i]
• Min-heaps (smallest element at root), have the
min-heap property:
– for all nodes i, excluding the root:
A[PARENT(i)] ≤ A[i]

Adding/Deleting Nodes
• New nodes are always inserted at the bottom
level (left to right)
• Nodes are removed from the bottom level
(right to left)

Operations on Heaps
• Maintain/Restore the max-heap property
– MAX-HEAPIFY
• Create a max-heap from an unordered array
– BUILD-MAX-HEAP
• Sort an array in place
– HEAPSORT
• Priority queues

Maintaining the Heap Property
• Suppose a node is smaller than a child
– Left and Right subtrees of i are max-heaps
• To eliminate the violation:
– Exchange with larger child
– Move down the tree
– Continue until node is not smaller than
children

Example
MAX-HEAPIFY(A, 2, 10)
A[2] violates the heap property
A[2]  A[4]
A[4] violates the heap property
A[4]  A[9]
Heap property restored

Maintaining the Heap Property
• Assumptions:
– Left and Right
subtrees of i
are max-heaps
– A[i] may be
smaller than
its children
Alg: MAX-HEAPIFY(A, i, n)
1. l ← LEFT(i)
2. r ← RIGHT(i)
3. if l ≤ n and A[l] > A[i]
4. then largest ←l
5. else largest ←i
6. if r ≤ n and A[r] > A[largest]
7. then largest ←r
8. if largest  i
9. then exchange A[i] ↔
A[largest]
10. MAX-HEAPIFY(A, largest,
n)

MAX-HEAPIFY Running Time
• Intuitively:
• Running time of MAX-HEAPIFY is O(lgn)
• Can be written in terms of the height of the heap,
as being O(h)
– Since the height of the heap is lgn
h
2h
O(h)
-
-
-
-

Building a Heap
Alg: BUILD-MAX-HEAP(A)
1. n = length[A]
2. for i ← n/2 downto 1
3. do MAX-HEAPIFY(A, i, n)
• Convert an array A[1 … n] into a max-heap (n = length[A])
• The elements in the subarray A[(n/2+1) .. n] are leaves
• Apply MAX-HEAPIFY on elements between 1 and n/2
2
14 8
1
16
7
4
3
9 10
1
2 3
4 5 6 7
8 9 10
4 1 3 2 16 9 10 14 8 7
A:

Example: A
4 1 3 2 16 9 10 14 8 7
2
14 8
1
16
7
4
3
9 10
1
2 3
4 5 6 7
8 9 10
14
2 8
1
16
7
4
10
9 3
1
2 3
4 5 6 7
8 9 10
2
14 8
1
16
7
4
3
9 10
1
2 3
4 5 6 7
8 9 10
14
2 8
1
16
7
4
3
9 10
1
2 3
4 5 6 7
8 9 10
14
2 8
16
7
1
4
10
9 3
1
2 3
4 5 6 7
8 9 10
8
2 4
14
7
1
16
10
9 3
1
2 3
4 5 6 7
8 9 10
i = 5 i = 4 i = 3
i = 2 i = 1

Running Time of BUILD MAX HEAP
 Running time: O(nlgn)
• This is not an asymptotically tight upper
bound
Alg: BUILD-MAX-HEAP(A)
1. n = length[A]
2. for i ← n/2 downto 1
3. do MAX-HEAPIFY(A, i, n) O(lgn)
O(n)

• HEAPIFY takes O(h)  the cost of HEAPIFY on a node i is
proportional to the height of the node i in the tree
Height Level
h0 = 3 (lgn)
h1 = 2
h2 = 1
h3 = 0
i = 0
i = 1
i = 2
i = 3 (lgn)
No. of nodes
20
21
22
23
hi = h – i height of the heap rooted at
level i
ni = 2i number of nodes at
i
h
i
ih
n
n
T 



0
)
(  
i
h
h
i
i

 
0
2 )
(n
O


i
h
i
ih
n
n
T 


0
)
( Cost of HEAPIFY at level i  number of nodes at that le
 
i
h
h
i
i

 
0
2 Replace the values of ni and hi computed before
h
h
i
i
h
i
h
2
2
0




 Multiply by 2h both at the nominator and denominator
write 2i as i

2
1



h
k
k
h k
0 2
2 Change variables: k = h - i




0 2
k
k
k
n The sum above is smaller than the sum of all elements
and h = lgn
)
(n
O
 The sum above is smaller than 2
Running time of BUILD-MAX-HEAP: T(n) = O(n)

Heapsort
• Goal:
– Sort an array using heap representations
• Idea:
– Build a max-heap from the array
– Swap the root (the maximum element) with the last
element in the array
– “Discard” this last node by decreasing the heap size
– Call MAX-HEAPIFY on the new root
– Repeat this process until only one node remains

Example: A=[7, 4, 3, 1, 2]
MAX-HEAPIFY(A, 1, 4) MAX-HEAPIFY(A, 1, 3) MAX-HEAPIFY(A, 1, 2)
MAX-HEAPIFY(A, 1, 1)

Alg: HEAPSORT(A)
1. BUILD-MAX-HEAP(A)
2. for i ← length[A] downto 2
3. do exchange A[1] ↔ A[i]
4. MAX-HEAPIFY(A, 1, i - 1)
• Running time: O(nlgn) --- Can
be shown to be Θ(nlgn)
O(n)
O(lgn)
n-1 times

Priority Queues
12 4

Operations
on Priority Queues
• Max-priority queues support the following
operations:
– INSERT(S, x): inserts element x into set S
– EXTRACT-MAX(S): removes and returns element of S
with largest key
– MAXIMUM(S): returns element of S with largest key
– INCREASE-KEY(S, x, k): increases value of element x’s
key to k (Assume k ≥ x’s current key value)

HEAP-MAXIMUM
Goal:
– Return the largest element of the heap
Alg: HEAP-MAXIMUM(A)
1. return A[1]
Running time:
O(1)
Heap A:
Heap-Maximum(A) returns 7

HEAP-EXTRACT-MAX
Goal:
– Extract the largest element of the heap (i.e., return the max value
and also remove that element from the heap
Idea:
– Exchange the root element with the last
– Decrease the size of the heap by 1 element
– Call MAX-HEAPIFY on the new root, on a heap of size n-1
Heap A: Root is the largest element

Example: HEAP-EXTRACT-MAX
8
2 4
14
7
1
16
10
9 3
max = 16
8
2 4
14
7
1
10
9 3
Heap size decreased with 1
4
2 1
8
7
14
10
9 3
Call MAX-HEAPIFY(A, 1, n-1)

HEAP-EXTRACT-MAX
Alg: HEAP-EXTRACT-MAX(A, n)
1. if n < 1
2. then error “heap underflow”
3. max ← A[1]
4. A[1] ← A[n]
5. MAX-HEAPIFY(A, 1, n-1) remakes heap
6. return max
Running time: O(lgn)

HEAP-INCREASE-KEY
• Goal:
– Increases the key of an element i in the heap
• Idea:
– Increment the key of A[i] to its new value
– If the max-heap property does not hold anymore:
traverse a path toward the root to find the proper
place for the newly increased key
8
2 4
14
7
1
16
10
9 3
i
Key [i] ← 15

Example: HEAP-INCREASE-KEY
14
2 8
15
7
1
16
10
9 3
i
8
2 4
14
7
1
16
10
9 3
i
Key [i ] ← 15
8
2 15
14
7
1
16
10
9 3
i
15
2 8
14
7
1
16
10
9 3
i

HEAP-INCREASE-KEY
Alg: HEAP-INCREASE-KEY(A, i, key)
1. if key < A[i]
2. then error “new key is smaller than current key”
3. A[i] ← key
4. while i > 1 and A[PARENT(i)] < A[i]
5. do exchange A[i] ↔ A[PARENT(i)]
6. i ← PARENT(i)
• Running time: O(lgn)
8
2 4
14
7
1
16
10
9 3
i
Key [i] ← 15

-
MAX-HEAP-INSERT
• Goal:
– Inserts a new element into a
max-heap
• Idea:
– Expand the max-heap with a
new element whose key is -
– Calls HEAP-INCREASE-KEY to set
the key of the new node to its
correct value and maintain the
max-heap property
8
2 4
14
7
1
16
10
9 3
15
8
2 4
14
7
1
16
10
9 3

Example: MAX-HEAP-INSERT
-
8
2 4
14
7
1
16
10
9 3
Insert value 15:
- Start by inserting -
15
8
2 4
14
7
1
16
10
9 3
Increase the key to 15
Call HEAP-INCREASE-KEY on A[11] = 15
7
8
2 4
14
15
1
16
10
9 3
7
8
2 4
15
14
1
16
10
9 3
The restored heap containing
the newly added element

MAX-HEAP-INSERT
Alg: MAX-HEAP-INSERT(A, key, n)
1. heap-size[A] ← n + 1
2. A[n + 1] ← -
3. HEAP-INCREASE-KEY(A, n + 1, key)
Running time: O(lgn)
-
8
2 4
14
7
1
16
10
9 3

Summary
• We can perform the following operations on heaps:
– MAX-HEAPIFY O(lgn)
– BUILD-MAX-HEAP O(n)
– HEAP-SORT O(nlgn)
– MAX-HEAP-INSERT O(lgn)
– HEAP-EXTRACT-MAX O(lgn)
– HEAP-INCREASE-KEY O(lgn)
– HEAP-MAXIMUM O(1)
Average
O(lgn)

Ch. 7 - QuickSort
Quick but not Guaranteed

Ch.7 - QuickSort
Another Divide-and-Conquer sorting algorithm…
As it turns out, MERGESORT and HEAPSORT, although O(n
lg n) in their time complexity, have fairly large constants
and tend to move data around more than desirable (e.g.,
equal-key items may not maintain their relative position
from input to output).
We introduce another algorithm with better constants, but a
flaw: its worst case in O(n2). Fortunately, the worst case
is “rare enough” so that the speed advantages work an
overwhelming amount of the time… and it is O(n lg n) on
average.
7/15/2021
138
91.404

Ch.7 - QuickSort
Like in MERGESORT, we use Divide-and-Conquer:
1. Divide: partition A[p..r] into two subarrays A[p..q-1] and
A[q+1..r] such that each element of A[p..q-1] is ≤ A[q],
and each element of A[q+1..r] is ≥ A[q]. Compute q as
part of this partitioning.
2. Conquer: sort the subarrays A[p..q-1] and A[q+1..r] by
recursive calls to QUICKSORT.
3. Combine: the partitioning and recursive sorting leave us
with a sorted A[p..r] – no work needed here.
An obvious difference is that we do most of the work in the
divide stage, with no work at the combine one.
7/15/2021
139
91.404

Ch.7 - QuickSort
The Pseudo-Code
7/15/2021
140
91.404

Ch.7 - QuickSort
7/15/2021
141
91.404

Ch.7 - QuickSort
Proof of Correctness: PARTITION
We look for a loop invariant and we observe that at the
beginning of each iteration of the loop (l.3-6) for any
array index k:
1. If p ≤ k ≤ i, then A[k] ≤ x;
2. If i+1 ≤ k ≤ j-1, then A[k] > x;
3. If k = r, then A[k] = x.
4. If j ≤ k ≤ r-1, then we don’t know anything about A[k].
7/15/2021
142
91.404

Ch.7 - QuickSort
The Invariant
• Initialization. Before the first iteration: i=p-1, j=p. No values between
p and i; no values between i+1 and j-1. The first two conditions are
trivially satisfied; the initial assignment satisfies 3.
• Maintenance. Two cases
– 1. A[j] > x.
– 2. A[j] ≥ x.
7/15/2021
143
91.404

Ch.7 - QuickSort
The Invariant
• Termination. j=r. Every entry in the array is in one of the three sets
described by the invariant. We have partitioned the values in the
array into three sets: less than or equal to x, greater than x, and a
singleton containing x.
Running time of PARTITION on A[p..r] is (n), where n = r – p + 1.
7/15/2021
144
91.404

Ch.7 - QuickSort
QUICKSORT: Performance – a quick look.
• We first look at (apparent) worst-case partitioning:
T(n) = T(n-1) + T(0) + (n) = T(n-1) + (n).
It is easy to show – using substitution - that T(n) = (n2).
• We next look at (apparent) best-case partitioning:
T(n) = 2T(n/2) + (n).
It is also easy to show (case 2 of the Master Theorem)
that T(n) = (n lg n).
• Since the disparity between the two is substantial, we
need to look further…
7/15/2021
145
91.404

Ch.7 - QuickSort
QUICKSORT: Performance – Balanced Partitioning
7/15/2021
146
91.404

Ch.7 - QuickSort
QUICKSORT: Performance – the Average Case
As long as the number of “good splits” is bounded below as
a fixed percentage of all the splits, we maintain
logarithmic depth and so O(n lg n) time complexity.
7/15/2021
147
91.404

Ch.7 - QuickSort
QUICKSORT: Performance – Randomized QUICKSORT
We would like to ensure that the choice of pivot does not
critically impair the performance of the sorting algorithm
– the discussion to this point would indicate that
randomizing the choice of the pivot should provide us
with good behavior (if at all possible with the data-set we
are trying to sort). We introduce
7/15/2021
148
91.404

Ch.7 - QuickSort
QUICKSORT: Performance – Randomized QUICKSORT
And the recursive procedure becomes:
Every call to RANDOMIZED-PARTITION has introduced
the (constant) extra overhead of a call to RANDOM.
7/15/2021
149
91.404

Ch.7 - QuickSort
QUICKSORT: Performance – Rigorous Worst Case
Analysis
Since we do not, a priori, have any idea of what the splits of
the subarrays will be, we have to represent a possible
“worst case” (we already have an O(n2) bound from the
“bad split” example – so it could be worse… although we
hope not). The worst case leads to the recurrence
T(n) = max0≤q≤n-1(T(q) + T(n – q - 1)) + (n),
where we remember that the pivot does not appear at the
next level (down) of the recursion.
7/15/2021
150
91.404

Ch.7 - QuickSort
Analysis
We have to come up with a “guess” and the basis for the
guess is our likely “bad split case”: it tells us we cannot
hope for any better than (n2). So we just hope it is no
worse… Guess T(n) ≤ cn2 for some c > 0 and start doing
algebra for the induction:
T(n) ≤ max0≤q≤n-1(T(q) + T(n – q - 1)) + (n)
≤ max0≤q≤n-1(cq2 + c(n – q - 1)2) + (n).
Differentiate cq2 + c(n – q - 1)2 twice with respect to q, to
obtain 4c > 0 for all values of q.
7/15/2021
151
91.404

Ch.7 - QuickSort
Analysis
Since the expression represents a quadratic curve,
concave up, it reaches it maximum at one of the
endpoints q = 0 and q = n – 1. As we evaluate, we find
max0≤q≤n-1(cq2 + c(n – q - 1)2) + (n) ≤
c max0≤q≤n-1(q2 + (n – q - 1)2) + (n) ≤
c (n – 1)2 + (n) = cn2 – 2cn + 1 + (n) ≤ cn2
by choosing c large enough to overcome the positive
constant in (n).
7/15/2021
152
91.404

Ch.7 - QuickSort
QUICKSORT: Performance – Expected RunTime
Understanding partitioning.
1. Each time PARTITION is called, it selects a pivot element
and this pivot element is never included in successive
calls: the total number of calls to PARTITION is n.
2. Each call to PARTITION costs O(1) plus an amount of
time proportional to the number of iterations of the for
loop.
3. Each iteration of the for loop (in line 4) performs a
comparison , comparing the pivot to another element in
A.
4. We need to count the number of times l. 4 is executed.
7/15/2021
153
91.404

Ch.7 - QuickSort
Lemma 7.1. Let X be the number of comparisons
performed in l. 4 of PARTITION over the entire execution
of QUICKSORT on an n-element array. Then the running
time of QUICKSORT is O(n + X).
Proof: the observations on the previous slide.
We need to find X, the total number of comparisons
performed over all calls to PARTITION.
7/15/2021
154
91.404

Ch.7 - QuickSort
1. Rename the elements of A as z1, z2, …, zn, so that zi is the
ith smallest element of A.
2. Define the set Zij = {zi, zi+1,…, zj}.
3. Question: when does the algorithm compare zi and zj?
4. Answer: at most once – notice that all elements in every
(sub)array are compared to the pivot once, and will
never be compared to the pivot again (since the pivot is
removed from the recursion).
5. Define Xij = I{zi is compared to zj}, the indicator variable of
this event. Comparisons are over the full run of the
algorithm.
7/15/2021
155
91.404

Ch.7 - QuickSort
6. Since each pair is compared at most once, we can write
7. Taking expectations of both sides:
8. We need to compute Pr{zi is compared to zj}.
9. We will assume all zi and zj are distinct.
10.For any pair zi, zj, once a pivot x is chosen so that zi < x <
zj, zi and zj will never be compared again (why?).
7/15/2021
156
91.404

X  Xij
ji1
n

i1
n1
 .

E X
  E Xij
ji1
n

i1
n1









 E Xij
 
ji1
n

i1
n1
  Pr zi is compared to zj
 
ji1
n

i1
n1
 .

Ch.7 - QuickSort
11.If zi is chosen as a pivot before any other item in Zij, then
zi will be compared to every other item in Zij.
12.Same for zj.
13. zi and zj are compared if and only if the first element to
be chosen as a pivot from Zij is either zi or zj.
14.What is that probability? Until a point of Zij is chosen as
a pivot, the whole of Zij is in the same partition, so every
element of Zij is equally likely to be the first one chosen
as a pivot.
7/15/2021
157
91.404

Ch.7 - QuickSort
15.Because Zij has j – i + 1 elements, and because pivots
are chosen randomly and independently, the probability
that any given element is the first one chosen as a pivot
is 1/(j-i+1). It follows that:
16. Pr{zi is compared to zj}
= Pr{zi or zj is first pivot chosen from Zij}
= Pr{zi is first pivot chosen from Zij}+
Pr{ zj is first pivot chosen from Zij}
= 1/(j-i+1) + 1/(j-i+1) = 2/(j-i+1).
7/15/2021
158
91.404

Ch.7 - QuickSort
17.Replacing the right-hand-side in 7, and grinding through
some algebra:
And the result follows.
7/15/2021
159
91.404

E X
 
2
j i 1
ji1
n

i1
n1
 
2
k 1
k1
ni

i1
n1
 
2
k
k1
n

i1
n1
  2Hn
i1
n1
  O lg n
  O(nlg n).
i1
n1


Dynamic Programming

Outline
161
⦁ Assembly‐line scheduling
⦁ Matrix‐chain multiplication
⦁ Elements of dynamic
programming
⦁ Longest common subsequence
⦁ Optimal binary search trees

Dynamic Programming1/2
162
Not a specific algorithm, but a technique, like divide‐and‐
conquer.
Dynamic programming is applicable when the subproblems
are not independent.
A dynamic‐programming algorithm solves every
subsubproblem just once and then saves its answer in a table.
"Programming" in this context refers to a tabular method, not
to writing computer code.
Used for optimization problems:
Find a solution with the optimal value.
Minimization or maximization.

Dynamic Programming2/2
163
⦁ Four‐step method
1. Characterize the structure of an optimal solution.
2. Recursively define the value of an optimal solution.
3. Compute the value of an optimal solution in a bottom‐up
fashion.
4. Construct an optimal solution from computed information.

⦁ Automobile factory with two assembly lines.
⦁ Each line has n stations: S1,1,…, S1,n and S2,1,…, S2,n.
⦁ S1,j and S2,j :perform the same function with times a1,j and a2,j,
respectively.
⦁ Entry times e1 and e2. Exit times x1 and x2.
⦁ After going through a station, can either
⦁ stay on same line; no cost, or
⦁ transfer to other line; cost after Si,j is ti,j.
Assembly‐line scheduling1/2
164
…
line 1
line 2
S1,2
a1,2
S1,3
a1,3
S1,4
a1,4
enters exits
S1,n‐1
a1,n‐1
S1,n
a1,n
a2,2
S2,2
a2,3
S2,3
a2,4
S2,4
a2,n‐1
S2,n‐1
a2,n
S2,n
t1,1
t2,1
t1,2
t2,2
t1,3
t2,3
t1,n‐1
t2,n‐1
S1,1
a1,1
e1
e2
a2,1
S2,1
x1
x2

Assembly‐line scheduling2/2
165
Problem: Given all these costs (time = cost), what stations
should be chosen from line 1 and from line 2 for fastest
way through factory?
…
line 1
line 2
S1,2
a1,2
S1,3
a1,3
S1,4
a1,4
enters exits
S1,n‐1
a1,n‐1
S1,n
a1,n
a2,2
S2,2
a2,3
S2,3
a2,4
S2,4
a2,n‐1
S2,n‐1
a2,n
S2,n
t1,1
t2,1
t1,2
t2,2
t1,3
t2,3
t1,n‐1
t2,n‐1
S1,1
a1,1
e1
e2
a2,1
S2,1
x1
x2

Structure of an optimal solution
166
Step 1: Characterize the structure of an optimal solution.
Fastest way through S1,j is either
fastest way through S1, j−1 then directly through S1, j, or
fastest way through S2, j−1, transfer from line 2 to line 1, then
through S1, j.
Example:
If fastest(S1,4) = (S2,1, S1,2, S2,3, S1,4), then fastest(S2,3) = (S2,1, S1,2,
S2,3)
An optimal solution to a problem contains within it an optimal
solution to subproblems.
This is optimal substructure.

Recursive solution
167
⦁ Step 2: Recursively define the value of an optimal solution.
⦁ Let fi[j] = the fastest time through Si,j, where i = 1, 2 and j = 1,…, n.
⦁ Let f* = the fastest time through the factory.
⦁ Then, we have the following two recursive equations.

It follows that


if j  1,
a )
if j 
2.
1,
j
1
, f [ j 1] t
min( f [ j 1] a
e1 
a1,1
2,
j1
1, j
2
1
f [ j] 
if j  1,
a )
if j  2.
2, j
2
, f [ j 1] t
 a2,1
e2
f2[ j]  min( f [ j 1] a 2,
j1
2, j
1
f*  min(f1[n] x1 , f2[n] x2
).

li[j] = line # whose station j −1 is used in fastest way through Si,j.
l*= line # whose station n is used in fastest way through the
entire factory.
An instance of assembly‐line scheduling
168
line 1
line 2
enters exits
2
4
3
2
S1,1 S1,
2
S1,
3
S1,
4
S1,n‐
1
S1,
n
7 9 3 4 8 4
2 3 1 3 4
2 1 2 2 1
8 5 6 4 5 7
S2,1 S2,
2
S2,
3
S2,
4
S2,n‐
1
S2,
n
j 1 2 3 4 5 6
f1[j] 9 18 20 24 32 35
f2[j] 12 16 22 25 30 37
f* =
38
l* =
1
l1[j
]
l2[j
]
2 3 4 5
6
j
1 2 1 1 2
1 2 1 2 2

Compute an optimal solution
169
Step 3: Compute the value of an optimal solution in
a bottom‐up fashion.
⦁ Write a recursive algorithm based on above recurrences.
⦁ Let ri(j) = # of references made to fi[j].
⦁ r1(n) = r2(n) = 1.
r1(j) = r2(j)= r1(j+1) + r2(j+1) for j = 1,… , n
−1.
One can show that ri(j)= 2n−j and the total
number of references to all fi[j] is (2n).
(Exercises 15.1‐2 and 15.1‐3)
⦁ Observation:
⦁ fi[j] depends only on f1[j−1] and f2[j−1] for j 
2.
⦁ So compute in order of increasing j .
f*
f1[n] f2[n]
f2[n‐1]
f1[n‐1]
f2[n‐1]
f1[n‐1]

FASTEST‐WAY procedure
170
⦁ Time:O(n).
FASTEST‐WAY(a, t, e, x,
n)
do if f1[j − 1] + a1,j  f2[j − 1] + t2,j−1
+ a1,j
then f1[j]  f1[j − 1] + a1,j
l1[j]  1
else f1[j]  f2[j − 1] + t2,j−1 + a1,j
l1[j]  2
if f2[j − 1] + a2,j  f1[j − 1] + t1,j−1
+ a2,j
then f2[j]  f2[j − 1] + a2, j
l2[j]  2
else f2[j]  f1[j − 1] + t1,j−1 + a2,j
1. f1[1]  e1 + a1,1
2. f2[1]  e2 + a2,1
3. for j  2 to n
4.
5.
6.
7.
8.
9.
10.
11.
12.
13. l2[j] 
1
14. if f1[n] + x1  f2[n] +
x2
15. then f* = f1[n] +
x1
16. l* = 1
17. else f* = f2[n] +
x2
18. l* = 2
(1
)
(n1) ∙
(1)
(1
)

Construct the fastest way
171
Step 4: Construct an optimal solution from computed
information.
The following procedure prints out the stations used, in decreasing
order of station number.
PRINT‐STATIONS(l, n)
⦁ Time:O(n).
1. i  l*
2. print “line” i “, station”
n
3. for j  n downto 2
4.
5.
do i  li[j]
print “line” i “, station” j
−1
(n1) ∙
(1)
(1
)

Matrix‐chain multiplication
172
When we multiply two matrices A and B, if A is a p × q matrix
and B is a q × r matrix, the resulting matrix C is a p × r matrix.
The number of scalar multiplications is pqr.
Matrix‐chain multiplication problem
Input: A chain 〈A1, A2,..., An〉 of n matrices. (matrix Ai has
dimension pi − 1 × pi )
Output: A fully parenthesized product A1, A2,..., An that minimizes
the number of scalar multiplications.
For example: The dimensions of the matrices A1, A2, and A3
are 10 × 100, 100 × 5, and 5 × 50, respectively.
((A1A2)A3) = 10 ∙ 100 ∙ 5 + 10 ∙ 5 ∙ 50 = 7500.
(A1(A2A3)) = 100 ∙ 5 ∙ 50 + 10 ∙ 100 ∙ 50 = 75000.

Counting the number of parenthesizations
173


P(k)P(n  k)
⦁ Thus, we have P(n)  
n1
k
1
1
Brute‐force algorithm:
Checking all possible parenthesizations
Time: (2n). (Exercise 15.2‐3)
Denote the number of alternative parenthesizations of a
sequence of n matrices by P(n).
A fully parenthesized matrix product is the product of two fully
parenthesized matrix subproducts.
The split between the two subproducts may occur between the
kth and (k + 1)st matrices.
if n  1,
if n  2.

Step 1: The structure of an optimal solution
174
An optimal solution to an instance contains optimal
solutions to subproblem instances.
For example:
If ((A1A2)A3)(A4(A5A6)) is an optimal solution to A1, A2,...,
A6.
Then, ((A1A2)A3) is an optimal solution to A1, A2, A3 and
(A4(A5A6)) is an optimal solution to A4, A5, A6.

Step 2: A recursive solution
175
Define m[i, j] = the minimum number of scalar
multiplications needed to compute Ai Ai+1… Aj.
if i  j ,
⦁ The recursion tree for the computation of
m[1,4].

ik j
0
if i  j.
i1 k
j
m[i, j]  min(m[i,k] m[k 1, j]  p p p
)
1..1 4..4
1..4
2..4 1..2 3..4 1..3
2..2 3..4 2..3 4..4 1..1 2..2 3..3 4..4 1..1
2..3 1..2 3..3
3..3 4..4 2..2 3..3 2..2 3..3 1..1 2..2

Step 3: Computing the optimal
costs
176
Based on the recursive formula, we could easily write an
exponential‐time recursive algorithm to compute the
compute the minimum cost m[1, n] for multiplying A1A2…An.
⦁ There are
only
2
= (n ) distinct subproblems,
one
problem for each choice of i and j satisfying 1  i  j  n.
We can use dynamic programming to compute the
solutions bottom up.
n
( )  n
2

3
Dependencies between the
subproblems
177
matrix A1 A2 A3 A4 A5 A6
dimension 30 × 35 35 × 15 15 × 5 5 × 10 10 × 20 20 × 25


1 4 5
m[2,4] m[5,5] p p p  4375  0  3510
20  11375.
1 3 5
⦁ s[i, j]: index k achieved the optimal cost in computing m[i,
j].
m[2,2] m[3,5] p1 p2 p5  0  2500  3515 20 
13000,
m[2,5]  minm[2,3] m[4,5] p p p  2625 1000  355
20  7125,
15,125
m
A1 A2 A3 A4 A5 A6
0
15,750 2,625 750 1,000
5,000
0 0 0 0 0
7,875 4,375 2,500
3,500
9,375 7,125
5,375
11,875 10,500
1
2
3
4
6
1
6
5
4
3
5
2
3
s
1 2 3 4 5
1 3 3 5
3 3
3 3
6
2
3
4
5
1
5
4
3
2
i
j
i
j

MATRIX‐CHAIN‐ORDER pseudocode
178
The loops are nested three deep, and each loop index (l, i, and
k) takes on at most n – 1 values.
Time:O(n3).
MATRIX‐CHAIN‐ORDER(
p)
do for i  1 to n – l
+ 1
do j  i + l –
1
1. n  length[p] – 1
2. for i  1 to n
3. do m[i, i]  0
4. for l  2 to n /* l is the chain length*/
5.
6.
7. m[i, j]  
8. for k  i to j – 1
9. do q  m[i, k] + m[k + 1, j] + pi
– 1pkpj
10. if q < m[i, j]
11. then m[i, j]  q
12. s[i, j]  k
13. return m and s

Step 4: Constructing an optimal solution
179
The call PRINT‐OPTIMAL‐PARENS(s, 1, n) prints the
parenthesization ((A1(A2A3)) ((A4A5)A6)).
PRINT‐OPTIMAL‐PARENS(s, i,
j)
1. if i = j
2. then print “Ai”
3. else print “(“
4. PRINT‐OPTIMAL‐PARENS(s, i, s[i, j])
5. PRINT‐OPTIMAL‐PARENS(s, s[i, j]+1,
j)
6. print ")"
3
Each entry s[i, j] records the value of k such that the
optimal parenthesization of Ai Ai+1∙∙∙Aj splits the product
between
Ak and Ak+1.
s
1 2 3 4 5
1 3 3 5
3
3
3
3
3
3
4
6 1
5
4
3
5
2
i
j

Elements of dynamic programming1/2
180
Optimal substructure
An optimal solution to a problem contains an optimal solution to
subproblems.
If ((A1A2)A3)(A4(A5A6)) is an optimal solution to A1, A2,..., A6, then
((A1A2)A3) is an optimal solution to A1, A2, A3 and (A4(A5A6)) is an
optimal solution to A4, A5, A6.
Overlapping subproblems
A recursive algorithm revisits the same problem over and over
again.
Typically, the total number of distinct subproblems is a polynomial
in the input size.
In contrast, a problem for which a divide‐and‐conquer approach is
suitable usually generates brand‐new problems at each step of the
recursion.

Elements of dynamic
programming2/2
181
⦁ Example: merge
sort
⦁ Example:
matrix‐chain
1..8
1..4
1..2
4..4
3..4
3..3
2..2
1..1
5..8
6..6
5..6
5..5 8..8
7..8
7..7
1..1 4..4
1..4
2..4 1..2 3..4 1..3
2..2 3..4 2..3 4..4 1..1 2..2 3..3 4..4 1..1
2..3 1..2 3..3
3..3 4..4 2..2 3..3 2..2 3..3 1..1 2..2

RECURSIVE‐MATRIX‐CHAIN procedure
182
⦁ We shall prove that T(n) = (2n). Specifically, T(n)  2n–
1.
RECURSIVE‐MATRIX‐CHAIN(p, i,
j)
+ p p
p
i–1 k j I
if q < m[i,
j] then m[i, j] 
q
1. if i = j
2. then return 0
3. m[i, j]  
4. for k  i to j – 1
5. do q  RECURSIVE‐MATRIX‐CHAIN(p, i, k)
6. + RECURSIVE‐MATRIX‐CHAIN(p, k+1,
j)
7.
8.
9.
10. return m[i, j]
(1
)
(1
)

n
1
( (T(k)T(nk)
1)
k1
n1 n1 n1
T(n) 1(T(k) T(nk)1)  2T(i) n 22i1

n
k1 i1 i1
n
2
 22i
n  2(2n1
1)n  (2n
2)n 
2n1
.
i0
T(1) 1 
20
.
Using the substitution
method.

Memoization
183
A variation of dynamic programming that offers the efficiency of the
usual dynamic‐programming approach while maintaining a
top‐down strategy.
MEMOIZED‐MATRIX‐CHAIN(p
)
1. n  length[p] – 1
2. for i  1 to n
3.
4.
do for j  i to n
do m[i, j] 

5. return LOOKUP‐CHAIN(p, 1,
n)
LOOKUP‐CHAIN(p, i,
j)
1. if m[i, j] < 
2. then return m[i,
j]
3. if i =
j
4. then m[i, j]  0
5. else for k  i to j – 1
6. do q  LOOKUP‐CHAIN(p, i,
k)
7.
8.
9.
10.
+ LOOKUP‐CHAIN (p, k+1,
j)
+ pi–1pkpj I
if q < m[i, j]
then m[i, j]  q
11. return m[i, j]
⦁ Time:O(n3).
🞂 Compute m[i, j] only in the first time to call LOOKUP‐CHAIN(p, i,
j).

Longest‐common‐subsequence
184
A subsequence is a sequence that can be derived from
another sequence by deleting some elements.
For example:
〈K, C, B, A〉 is a subsequence of 〈K, G, C, E, B, B, A〉.
〈B, C, D, G〉is a subsequence of 〈A, C, B, E, G, C, E, D, B, G〉.
Longest‐common‐subsequence problem
Input: 2 sequences, X = 〈x1, x2,…, xm〉 and Y = 〈y1, y2,…, yn〉.
Output: A maximum‐length common subsequence of X and Y.
For example: X = 〈A, B, C, B, D, A, B〉 and Y = 〈B, D, C, A, B, A〉.
〈B, C, A〉is a common subsequence of both X and Y.
〈B, C, B, A〉is an longest common subsequence (LCS) of X and Y.

Step 1: Characterizing an
LCS
185
Brute‐force algorithm:
For every subsequence of X, check whether it is a subsequence of
Y.
Time: (n2m).
2m subsequences of X to check.
Each subsequence takes (n) time to check: scan Y for first letter,
from there scan for second, and so on.
Given a sequence X = 〈x1, x2,…, xm〉, we define the ith prefix of
X, as X = 〈x1, x2,…, xi〉.
For example:
X = 〈A, B, C, B, D, A, B〉.
X4 = 〈A, B, C, B〉 and X0 is the empty sequence.

Optimal substructure of an LCS
186
Theorem 15.1
Let X = 〈x1, x2,…, xm〉 and Y = 〈y1, y2,…, yn〉 be sequences, and let
Z = 〈z1, z2,…, zk〉be any LCS of X and Y.
1. If xm = yn, then zk = xm = yn and Zk−1 is an LCS of Xm−1 and Yn−1.
2. If xm  yn, then zk  xm implies that Z is an LCS of Xm−1 and Y.
3. If xm  yn, then zk  yn implies that Z is an LCS of X and Yn−1.
For example:
X = 〈A, B, C, B, D, A, B〉, Y = 〈B, D, C, A, B〉 and Z = 〈B, C, A, B〉 is an LCS of
X and Y. Then, z4 = x7 = y5 and Z3 = 〈B, C, A〉 is an LCS of X6 and Y4.
X = 〈A, B, C, B, D, A, D〉, Y = 〈B, D, C, A, B〉 and Z = 〈B, C, A〉 is an LCS of X
and Y. Then, z3  x7 implies that Z3 = 〈B, C, A〉 is an LCS of X6 and Y5.
X = 〈A, B, C, B, D, A〉, Y = 〈B, D, C, A, B〉 and Z = 〈B, C, A〉 is an LCS of X
and Y. Then, z3  y5 implies that Z3 = 〈B, C, A〉 is an LCS of X6 and Y4.

Step 2: A recursive
solution
187
⦁ Define c[i, j] = length of LCS of Xi and Yj. We want c[m,
n].
⦁ The recursion tree for the computation of
c[4,3].

max(c[i, j 1],c[i 1, j])

0
c[i, j]  c[i 1, j 1]1
ifi, j  0 and xi  yj.
if i  0 or j  0,
ifi, j  0 and xi  yj ,
4,3
3,3
2,3
3,1
3,2
2,2
2,2
1,3
4,2
3,1
3,2
2,2 3,0
4,1
3,1
0,3 1,2 1,2 2,0 1,2 2,0 2,1 3,0 1,2 2,0 2,1 3,0 2,1
3,0

Step 3: Computing the length of an LCS
188
Based on the recursive formula, we could easily write an
exponential‐time recursive algorithm to compute the length
of an LCS of two sequences.
There are only (mn) distinct subproblems.
We can use dynamic programming to compute the solutions
bottom up.

LCS‐LENGTH pseudocode
189
⦁ Time:O(mn)
.
LCS‐LENGTH(X,
Y)
do if xi =
yj
1. m  length[X]; n  length[Y]
2. for i  1 to m
3. do c[i, 0]  0
4. for j  0 to n
5. do c[0, j]  0
6. for i  1 to m
7. do for j  1 to n
8.
9. then c[i, j]  c[i − 1, j − 1] + 1
10. b[i, j]  “↖”
11. else if c[i ‐ 1, j]  c[i, j ‐ 1]
12. then c[i, j]  c[i − 1, j]
13. b[i, j]  “”
14. else c[i, j]  c[i, j − 1]
15. b[i, j]  “”
16. return c and b
i
i
j 0
y
1
B
2
D
3
C
4
A
5
B
6
A
0 xi
1 A
2 B
3 C
4 B
5 D
6 A
7 B
0 0 0 0 0 0 0
0

0

0

0
↖
1  1
↖
1
0
↖
1  1  1

1
↖
2  2
0

1

1
↖
2  2

2

2
0
↖
1

1

2

2
↖
3  3
0

1
↖
2

2

2

3

3
0

1

2

2
↖
3

3
↖
4
0
↖
1

2

2

3
↖
4

4

Step 4: Constructing an LCS
190
Whenever we encounter a “↖” in entry b[i, j], it implies
that
xi = yj is an element of the LCS.
PRINT‐LCS(b, X, i, j)
1. if i = 0 or j = 0
2. then return
3. if b[i, j] = " ↖“
4. then PRINT‐LCS(b, X, i − 1, j − 1)
5. print xi
6. elseif b[i, j] = "“
7. then PRINT‐LCS(b, X, i − 1, j)
8. else PRINT‐LCS(b, X, i, j − 1)
This procedure prints "BCBA".
i
j 0
yi
1
B
2
D
3
C
4
A
5
B
6
A
0 xi
1 A
2 B
3 C
4 B
5 D
6 A
7 B
0 0 0 0 0 0 0
0

0

0

0
↖
1  1
↖
1
0
↖
1  1  1

1
↖
2  2
0

1

1
↖
2  2

2

2
0
↖
1

1

2

2
↖
3  3
0

1
↖
2

2

2

3

3
0

1

2

2
↖
3

3
↖
4
0
↖
1

2

2

3
↖
4

4

Optimal binary search trees
Input: A sequence K = 〈k1, k2,..., kn〉of n distinct keys in sorted
order. A sequence D = 〈d0, d1,..., dn〉of n + 1 dummy keys.
k1 < k2 < ∙∙∙ < kn.
d0 = all values < k1. dn = all values > kn.
di = all values between ki and ki+1.
For each key ki, a probability pi that a search is for ki.
For each key di, a probability qi that a search is for di.
Output: A BST with minimum expected search
cost.
191
i T i
i
n
n T i
i1 i
1
n n
 pi  qi
i1 i1
n
n
i1 i
1
⦁ E[searchcost inT]  (depthT (ki )1)pi  (depthT (di )1)
qi
 1  depth (k )p  depth (d )
q
n n
depthT (ki )pi  depthT (di)qi
i1 i1
i1 i
1
n n
pi  qi  1

0-1 KNAPSACK PROBLEM

Statement of
the problem:
Given n items, each with corresponding value p and weight w, find
which items to place in the knapsack such that sum of all p is
maximum, and the sum of all w does not exceed the maximum
weight capacity c of the knapsack.

We can also express the problem as follows:
n
i1
 pi xi
n
i1
is maximum and wi xi  c
i
where x 
0

item is not taken
1 item is taken

Solution #1 : Brute force
n
 We take all possible item combinations.
 For any n items, the total number of combinations is
C i, n = 2n
i0
 We pick the combinations that satisfy the constraint and sort
each  p and get the maximum.
 This approach has complexity O 2n
.

Solution #2:
Dynamic Programming (Bottom-Top Computation)
Construct an n  c value matrix V to compute a value in each cell
for every row in the matrix.
The last cell V[n, c] will give the solution to the maximum total
value.

Bottom-Top
computation
pseudocode:
for i = 0 to c:
V[0, i]  0
for i = 0 to n:
for k = 0 to c:
V[i, k]  Max(V[i - 1, k], pi + V[i - 1, k - wi])

Exa
mpl
e:
i 1 2 3 4 5
p 30 20 40 70 60
w 4 1 2 5 3
n = 5
c = 10

Solution:
The value matrix can be viewed as bottom-top, with the first row
(i = 0) the bottom, and moving up to the succeeding rows up to the
top (i = n).
Row 0 of the value matrix all start with 0.
The column k starts at 0 and ends at c (the constraint).
k
i
0 1 2 3 4 5 6 7 8 9 10
0 0 0 0 0 0 0 0 0 0 0 0

 Value at V[i, k] = Max(V[i - 1, k], pi + V[i- 1, k - wi]) where
pi - the value of the item at row i
wi - the weight of the item at row i
 For each row, we search each column where k  wi  0, i.e.,
the maximum is V[i - 1, k] (the cell above the current cell).
 If k  wi  0, compare V i 1, k  and pi V i 1, k  wi .
The
maximum of the two is the value of V i, k .

k
i
0 1 2 3 4 5 6 7 8 9 10
0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 30 30 30 30 30 30 30
at i = 1, k = 4:
V i 1,kV 0,4 0
p1  30,w1  4 , k  wi  4  w1  4  4  0
pi V i 1,k  wi  p1 V 0,0 30  0  30

Completing
the value
matrix:
k
i
0 1 2 3 4 5 6 7 8 9 10
0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 30 30 30 30 30 30 30
2 0 20 20 20 30 50 50 50 50 50 50
3 0 20 40 60 60 60 70 90 90 90 90
4 0 20 40 60 60 70 90 110 130 130 130
5 0 20 40 60 80 100 120 120 130 150 170
The last cell at V n,c is the solution to the maximum value.

 The value matrix only showed the solution to the maximum
value, but not the individual items chosen.
 Modify the last pseudocode to mark
the cells where the maximum is
pi V i 1, k  wi , where k  wi  0.
k
i
0 1 2 3 4 5 6 7 8 9 10
0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 30 30 30 30 30 30 30
2 0 20 20 20 30 50 50 50 50 50 50
3 0 20 40 60 60 60 70 90 90 90 90
4 0 20 40 60 60 70 90 110 130 130 130
5 0 20 40 60 80 100 120 120 130 150 170

Pseudocode to
find the items
selected:
k = c
for i = n down to 0:
if V i, k  is marked:
output item i
k  k  wi

From the
last
example:
k
i
0 1 2 3 4 5 6 7 8 9 10
0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 30 30 30 30 30 30 30
2 0 20 20 20 30 50 50 50 50 50 50
3 0 20 40 60 60 60 70 90 90 90 90
4 0 20 40 60 60 70 90 110 130 130 130
5 0 20 40 60 80 100 120 120 130 150 170
i k Marked
5 10 Yes
4 10 - w5 = 10 - 3 = 7 Yes
3 7 - w4 = 7 - 5 = 2 Yes
2 2  w3 = 2 - 2 = 0 No
1 0 - w2 = 0 - 1 = INVALID No
The items selected are 3, 4, 5.

 Bottom-top computation has complexity Onc.
 For large n, a vast improvement to O2n
.
 Any problem involving maximizing a total value while
satisfying a constraint can use this method, as long as the items
can only be either chosen or not, i.e., the item cannot be broken
into smaller parts.

Sorting in
Linear Time
Counting sort
Radix sort
Bucket sort

Counting Sort
The Algorithm
• Counting-Sort(A)
– Initialize two arrays B and C of size n and set all entries to 0
• Count the number of occurrences of every A[i]
– for i = 1..n
– do C[A[i]] ← C[A[i]] + 1
• Count the number of occurrences of elements <= A[i]
– for i = 2..n
– do C[i] ← C[i] + C[i – 1]
• Move every element to its final position
– for i = n..1
– do B[C[A[i]] ← A[i]
– C[A[i]] ← C[A[i]] – 1

Counting Sort Example
2 3
5 0 2 3 0 3
A =
1 2 3 4 5 6 7 8
2 4
2 7 8
C =
0 1 2 3 4 5
7
1 2 3 4 5 6 7 8
B = 3
6
C =
0 1 2 3 4 5
2 2
0 3 0 1
2 4
2 7 7 8
C =
0 1 2 3 4 5

0 3
B =
2 4
2 6 7 8
C =
0 1 2 3 4 5
1 2 3 4 5 6 7 8
2 3
5 0 2 3 0 3
A =
1 2 3 4 5 6 7 8
2 4
2 6 7 8
C =
0 1 2 3 4 5
1

0 3
B =
1 4
2 6 7 8
C =
0 1 2 3 4 5
1 2 3 4 5 6 7 8
2 3
5 0 2 3 0 3
A =
1 2 3 4 5 6 7 8
2 4
2 6 7 8
C =
0 1 2 3 4 5
3
5

Counting Sort
1 CountingSort(A, B, k)
2 for i=1 to k
3 C[i]= 0;
4 for j=1 to n
5 C[A[j]] += 1;
6 for i=2 to k
7 C[i] = C[i] + C[i-1];
8 for j=n downto 1
9 B[C[A[j]]] = A[j];
10 C[A[j]] -= 1;
What will be the running time?
Takes time O(k)
Takes time O(n)

Counting Sort
• Total time: O(n + k)
– Usually, k = O(n)
– Thus counting sort runs in O(n) time
• But sorting is (n lg n)!
– No contradiction--this is not a comparison sort (in
fact, there are no comparisons at all!)
– Notice that this algorithm is stable
• If numbers have the same value, they keep their
original order

• A sorting algorithms is stable if for any two
indices i and j with i < j and ai = aj, element ai
precedes element aj in the output sequence.
Observation: Counting Sort is stable.
Stable Sorting Algorithms
Output
2
1 5
1 6
1 7
1
2
2 2
3 4
1 4
2
Input
2
1 5
1 2
3 6
1
7
1 4
1 4
2 2
2

Counting Sort
• Linear Sort! Cool! Why don’t we always
use counting sort?
• Because it depends on range k of
elements
• Could we use counting sort to sort 32 bit
integers? Why or why not?
• Answer: no, k too large (232 =
4,294,967,296)

Radix Sort
• Why it’s not a comparison sort:
– Assumption: input has d digits each ranging from
0 to k
– Example: Sort a bunch of 4-digit numbers, where
each digit is 0-9
• Basic idea:
– Sort elements by digit starting with least
significant
– Use a stable sort (like counting sort) for each stage

Radix Sort Overview
• Origin : Herman Hollerith’s card-sorting machine for the
1890 U.S Census
• Digit-by-digit sort
• Hollerith’s original (bad) idea : sort on most-significant
digit first.
• Good idea : Sort on least-significant digit first with
auxiliary stable sort
A idéia de Radix Sort não é nova

Para minha turma da faculdade foi
muito fácil aprender Radix Sort
IBM 083
punch card
sorter

• Radix Sort takes parameters: the array
and the number of digits in each array
element
• Radix-Sort(A, d)
• 1 for i = 1..d
• 2 do sort the numbers in arrays A by
their i-th digit from the right, using a
stable sorting algorithm
Radix Sort
The Algorithm

Radix Sort Example
720
329
436
839
355
457
657
329
457
657
839
436
720
355
720
355
436
457
657
329
839
329
355
436
457
657
720
839

Radix Sort
Correctness and Running Time
•What is the running time of radix sort?
•Each pass over the d digits takes time
O(n+k), so total time O(dn+dk)
•When d is constant and k=O(n),
takes O(n) time
•Stable, Fast
•Doesn’t sort in place (because counting sort
is used)

Bucket Sort
• Assumption: input - n real numbers from [0, 1)
• Basic idea:
– Create n linked lists (buckets) to divide interval [0,1) into
subintervals of size 1/n
– Add each input element to appropriate bucket and sort
buckets with insertion sort
• Uniform input distribution  O(1) bucket size
– Therefore the expected total time is O(n)

Bucket Sort
Bucket-Sort(A)
1. n  length(A)
2. for i  0 to n
3. do insert A[i] into list B[floor(n*A[i])]
4. for i  0 to n –1
5. do Insertion-Sort(B[i])
6. Concatenate lists B[0], B[1], … B[n –1] in order
Distribute elements over buckets
Sort each bucket

Bucket Sort Example
.78
.17
.39
.26
.72
.94
.21
.12
.23
.68
7
6
8
9
5
4
3
2
1
0
.17
.12
.26
.23
.21
.39
.68
.78
.72
.94
.68
.72
.78
.94
.39
.26
.23
.21
.17
.12

Bucket Sort – Running Time
• All lines except line 5 (Insertion-Sort) take O(n) in the worst
case.
• In the worst case, O(n) numbers will end up in the same
bucket, so in the worst case, it will take O(n2) time.
• Lemma: Given that the input sequence is drawn uniformly at
random from [0,1), the expected size of a bucket is O(1).
• So, in the average case, only a constant number of elements
will fall in each bucket, so it will take O(n) (see proof in book).
• Use a different indexing scheme (hashing) to distribute the
numbers uniformly.

• Every comparison-based sorting algorithm has to take
Ω(n lg n) time.
• Merge Sort, Heap Sort, and Quick Sort are comparison-based
and take O(n lg n) time. Hence, they are optimal.
• Other sorting algorithms can be faster by exploiting
assumptions made about the input
• Counting Sort and Radix Sort take linear time for integers in a
bounded range.
• Bucket Sort takes linear average-case time for uniformly
distributed real numbers.
Summary

WHA
T ?
• GIVEN WEIGHTS AND VALUES OF N ITEMS, WE NEED TO
PUT THESE ITEMS IN A KNAPSACK OF CAPACITY W TO
GET THE MAXIMUM TOTAL VALUE IN THE KNAPSACK.

TYP
ES
• 0-1 KNAPSACK PROBLEM
In the 0-1 knapsack problem, we are not allowed to break items. We
either take the whole item or donT’tHEtaOkPeTiItM. AL
KNAPSACK ALGORITHM
• FRACTIONAL KNAPSACK
In fractional knapsack, we can break items for maximizing the total
value of knapsack. This problem in which we can break an item
is also called the fractional knapsack problem.

EXAMP
LE
0-1 KNAPSACK
Take B and C
Total weight
=20+30=50 Total
value
=100+120=220
FRACTIONAL KNAPSACK
Take A,B and 2/3rd of C
Total weight =10+20+(30*
2/3)=50 Total value
=60+100+(120* 2/3)=240

GREEDY
APPROACH
• The basic idea of the greedy approach is to calculate the ratio
value/weight for each item.
• Sort the item on basis of this ratio.
• Then take the item with the highest ratio and add them until we
can’t add the next item as a whole.
• At the end add the next item as much (fraction) as we can.

GREEDYAPPROACH
SOLUTION
Ratio
= 5
R
a
ti
o
=
6
1
.
c
a
l
value/weight.
2.Sort (descending) item
on basis
of ratio.
3.Take item with highest
ratio and add to
knapsack until we cant
add the next item as
Ratio= 4
Capacity left =40
Value = 60
Capacity left =20
Value = 160
Take 2/3rd of C Weight
= 2/3 *30 =20 Value =
2/3 *120 =80 Capacity
left =0
Value = 240

THE OPTIMAL KNAPSACK
ALGORITHM
• INPUT: AN INTEGER N
• Positive values wi and vi such
that 1 < = i < = n
• Positive value W.
• OUTPUT:
• N values of xi such that 0 < = xi
< = 1
• Total profit

WHAT IS STRING
MATCHING
• In computer science, string searching
algorithms, sometimes called string
matching algorithms, that try to find a
place where one or several string (also
called pattern) are found within a larger
string or text.

EXAMPL
E
STRING MATCHING PROBLEM
TEXT
N
A B C A B A A C A B
SHIFT=3
A B A A PATTER

STRING
MATCHING
ALGORITHMS
There are many types of String Matching
Algorithms like:-
1)The Naive string-matching algorithm
2)The Rabin-Krap algorithm
3)String matching with finite automata
4)The Knuth-Morris-Pratt algorithm
But we discuss about 2 types of string matching
algorithms.
1)The Naive string-matching algorithm
2)The Rabin-Krap algorithm

THE NAIVE
ALGORITHM
The naive algorithm finds all valid shifts using a loop
that checks
the condition P[1….m]=T[s+1…. s+m] for eachof the n-
m+1
possible values of s.(P=pattern , T=text/string , s=shift)
NAIVE-STRING-MATCHER(T,P)
1)n = T.length
2)m = P.length
3)for s=0 to n-m
4)
5)
if P[1…m]==T[s+1….s+m]
printf” Pattern occurs with
shift ” s

EXAMPL
E
SUPPOSE,
T=1011101110 P=111
FIND ALL VALID SHIFT……
1 0 1 1 1 0 1 1 1 0
1 1 1
P=Patter
n
T=Tex
t
S=
0

1 0 1 1 1 0 1 1 1 0
1 1 1
S=
1

1 0 1 1 1 0 1 1 1 0
S=2
1 1 1
So, S=2 is a valid shift…

1 0 1 1 1 0 1 1 1 0
S=6
1 1 1
So, S=6 is a valid shift…

THE RABIN-KARP
ALGORITHM
Rabin and Karp proposed a string
matching algorithm that performs well in
practice and that also generalizes to
other algorithms for related problems,
such as two-dimentional pattern
matching.

ALGORITH
M
RABIN-KARP-MATCHER(T,P,d,q)
//pre-processing
//matching
1) n = T.length
2) m = P.length
3) h = d^(m-1) mod q
4) p = 0
5) t = 0
6) for i =1 to m
7) p = (dp + P[i]) mod q
8) t = (d t + T[i]) mod q
9) for s = 0 to n – m
10) if p == t
if P[1…m] == T[s+1…. s+m]
printf “ Pattern occurs with shift ” s if
s< n-m
t+1 = (d(t- T[s+1]h)+ T[s+m+1]) mod q

EXAMPL
E
Pattern P=26, how many spurious hits does the
Rabin
Karp matcher in the text T=3 1 4 1 5 9 2 6 5 3
5…
• T = 3 1 4 1 5 9 2 6 5 3 5
P = 2 6
Here T.length=11so Q=11 and P mod Q =
26 mod 11
= 4
Now find the exact match of P mod Q…

3 1 4 1 5 9 2 6 5 3 5
3 1 4 1 5 9 2 6 5 3 5
3 1 mod 1 1 = 9 not equal to 4
3 1 4 1 5 9 2 6 5 3 5
S=0
S=1
S=2

3 1 4 1 5 9 2 6 5 3 5
3 1 4 1 5 9 2 6 5 3 5
3 1 4 1 5 9 2 6 5 3 5
1 5 mod 1 1 = 4 equal to 4 SPURIOUS HIT
S=3
S=4
S=5

3 1 4 1 5 9 2 6 5 3 5
3 1 4 1 5 9 2 6 5 3 5
3 1 4 1 5 9 2 6 5 3 5
2 6 mod 1 1 = 4
• S=7
• 6 5 mod 1 1 = 10 not equal to 4
• S=8
• 5 3 mod 1 1 = 9 not equal to 4
EXACT MATCH
S=6

3 1 4 1 5 9 2 6 5 3 5
S=9
Pattern occurs with shift 6

COMPARISSION
The Naive String Matching algorithm slides
the pattern one by one. After each slide, it one
by one checks characters at the current shift
and if all characters match then prints the
match.
Like the Naive Algorithm, Rabin-Karp algorithm
also slides the pattern one by one. But unlike the
Naive algorithm, Rabin Karp algorithm matches
the hash value of the pattern with the hash value
of current substring of text, and if the hash values
match then only it starts matching individual
characters.

Minimum Spanning Trees
•Definition of MST
•Generic MST algorithm
•Kruskal's algorithm
•Prim's algorithm
254

Definition of MST
 Let G=(V,E) be a connected, undirected graph.
 For each edge (u,v) in E, we have a weight w(u,v)
specifying the cost (length of edge) to connect u and v.
 We wish to find a (acyclic) subset T of E that connects all
of the vertices in V and whose total weight is minimized.
 Since the total weight is minimized, the subset T must be
acyclic (no circuit).
 Thus, T is a tree. We call it a spanning tree.
 The problem of determining the tree T is called the
minimum-spanning-tree problem.
255

Application of MST: an example
• In the design of electronic circuitry, it is often
necessary to make a set of pins electrically
equivalent by wiring them together.
• Running cable TV to a set of houses. What’s
the least amount of cable needed to still
connect all the houses?
256

What makes a greedy algorithm?
• Feasible
– Has to satisfy the problem’s constraints
• Locally Optimal
– The greedy part
– Has to make the best local choice among all feasible choices available
on that step
• If this local choice results in a global optimum then the problem has optimal
substructure
• Irrevocable
– Once a choice is made it can’t be un-done on subsequent steps of the
algorithm
• Simple examples:
– Playing chess by making best move without lookahead
– Giving fewest number of coins as change
• Simple and appealing, but don’t always give the best solution

Spanning Tree
• Definition
– A spanning tree of a graph G is a tree
(acyclic) that connects all the vertices of G
once
• i.e. the tree “spans” every vertex in G
– A Minimum Spanning Tree (MST) is a
spanning tree on a weighted graph that has
the minimum total weight
w T w u v
u v T
( ) ( , )
,


 such that w(T) is minimum
Where might this be useful? Can also be used to approximate some
NP-Complete problems

259
Here is an example of a connected graph
and its minimum spanning tree:
a
b
h
c d
e
f
g
i
4
8 7
9
10
14
4
2
2
6
1
7
11
8
Notice that the tree is not unique:
replacing (b,c) with (a,h) yields another spanning tree
with the same minimum weight.

Growing a MST
• Set A is always a subset of some minimum spanning tree..
• An edge (u,v) is a safe edge for A if by adding (u,v) to the
subset A, we still have a minimum spanning tree.
260
GENERIC_MST(G,w)
1 A:={}
2 while A does not form a spanning tree do
3 find an edge (u,v) that is safe for A
4 A:=A∪{(u,v)}
5 return A

How to find a safe edge
We need some definitions and a theorem.
• A cut (S,V-S) of an undirected graph G=(V,E) is
a partition of V.
• An edge crosses the cut (S,V-S) if one of its
endpoints is in S and the other is in V-S.
• An edge is a light edge crossing a cut if its
weight is the minimum of any edge crossing
the cut.
261

262
V-S↓
a
b
h
c d
e
f
g
i
4
8 7
9
10
14
4
2
2
6
1
7
11
8
S↑ ↑ S
↓ V-S
• This figure shows a cut (S,V-S) of the graph.
• The edge (d,c) is the unique light edge
crossing the cut.

The algorithms of Kruskal and Prim
• The two algorithms are elaborations of the
generic algorithm.
• They each use a specific rule to determine a
safe edge in the GENERIC_MST.
• In Kruskal's algorithm,
– The set A is a forest.
– The safe edge added to A is always a least-
weight edge in the graph that connects two
distinct components.
• In Prim's algorithm,
– The set A forms a single tree.
– The safe edge added to A is always a least- 263

Kruskal's algorithm (simple)
(Sort the edges in an increasing order)
A:={}
while (E is not empty do) {
take an edge (u, v) that is shortest in E
and delete it from E
If (u and v are in different components)then
add (u, v) to A
}
Note: each time a shortest edge in E is considered.
264

Kruskal's algorithm
265
1 function Kruskal(G = <N, A>: graph; length: A → R+): set of edges
2 Define an elementary cluster C(v) ← {v}.
3 Initialize a priority queue Q to contain all edges in G, using the weights as
keys.
4 Define a forest T ← Ø //T will ultimately contain the edges of the MST
5 // n is total number of vertices
6 while T has fewer than n-1 edges do
7 // edge u,v is the minimum weighted route from u to v
8 (u,v) ← Q.removeMin()
9 // prevent cycles in T. add u,v only if T does not already contain a path
// between u and v.
10 // the vertices has been added to the tree.
11 Let C(v) be the cluster containing v, and let C(u) be the cluster containing u.
13 if C(v) ≠ C(u) then
14 Add edge (v,u) to T.
15 Merge C(v) and C(u) into one cluster, that is, union C(v) and C(u).
16 return tree T

266
http://Wikipedia/kruskals
This is our original graph. The numbers near the arcs indicate their
weight. None of the arcs are highlighted.
Kruskal's algorithm

267
AD and CE are the shortest arcs, with length 5, and AD has been
arbitrarily chosen, so it is highlighted.
Kruskal's algorithm

268
CE is now the shortest arc that does not form a cycle, with length 5,
so it is highlighted as the second arc.
Kruskal's algorithm

269
The next arc, DF with length 6, is highlighted using much the same
method.
Kruskal's algorithm

270
The next-shortest arcs are AB and BE, both with length 7. AB is
chosen arbitrarily, and is highlighted. The arc BD has been
highlighted in red, because there already exists a path (in green)
between B and D, so it would form a cycle (ABD) if it were chosen.
Kruskal's algorithm

271
The process continues to highlight the next-smallest arc, BE with
length 7. Many more arcs are highlighted in red at this stage: BC
because it would form the loop BCE, DE because it would form the
loop DEBA, and FE because it would form FEBAD.
Kruskal's algorithm

272
Finally, the process finishes with the arc EG of length 9, and the
minimum spanning tree is found.
Kruskal's algorithm

Prim's algorithm (simple)
MST_PRIM(G,w,r){
A={}
S:={r} (r is an arbitrary node in V)
Q=V-{r};
while Q is not empty
do {
take an edge (u, v) such that
uS and vQ (vS ) and (u,v) is the shortest edge
add (u, v) to A,
add v to S and delete v from Q
}
}
273

Prim's algorithm
274
for each vertex in graph
set min_distance of vertex to ∞
set parent of vertex to null
set minimum_adjacency_list of vertex to empty list
set is_in_Q of vertex to true
set min_distance of initial vertex to zero
add to minimum-heap Q all vertices in graph, keyed by min_distance
Initialization
inputs: A graph, a function returning edge weights weight-function, and an
initial vertex
Initial placement of all vertices in the 'not yet seen' set, set initial vertex to
be added to the tree, and place all vertices in a min-heap to allow for
removal of the min distance from the minimum graph.
Wikipedia

CS-323 DAA.pdf

Recommended

Recommended

More Related Content

Similar to CS-323 DAA.pdf

Similar to CS-323 DAA.pdf (20)

Recently uploaded

Recently uploaded (20)

CS-323 DAA.pdf