2. CONTENTS
General method
Coin Change Problem
Knapsack Problem
Job sequencing with deadlines
Minimum cost spanning trees
Prim’s Algorithm
Kruskal’s Algorithm
Single source shortest paths
Dijkstra's Algorithm
Optimal Tree problem
Huffman Trees and Codes
Transform and Conquer Approach
Heaps and Heap Sort
9/24/2023 Dr. K. Balakrishnan, Dept. of CSE, SaIT, 2
3. INTRODUCTION
The General Method
We are given a problem along with a set of conditions that a
solution must satisfy.
If the problem has n inputs, we select a subset of those inputs
that satisfies the conditions.
Several subsets may satisfy the conditions.
Each subset satisfying the problem conditions is referred to as
a Feasible Solution.
Out of all the Feasible Solutions, the one satisfying the
conditions to the fullest is referred to as the Optimal Solution
and is the solution to the problem.
4. INTRODUCTION
Finding a solution to a problem in this manner is referred to as
the Subset Paradigm.
Algorithm Greedy(a, n)
// a[1:n] contains the n inputs
{
    solution := Ø;
    for i := 1 to n do {
        x := Select(a);
        if Feasible(solution, x) then
            solution := Union(solution, x);
    }
    return solution;
}
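The control structure above can be sketched in runnable form. The Select, Feasible, and Union functions below are illustrative assumptions (a toy instance that packs numbers under a capacity of 10), not part of the original algorithm:

```python
def greedy(a, select, feasible, union):
    """Generic greedy schema mirroring Algorithm Greedy(a, n)."""
    solution = []
    for _ in range(len(a)):
        x = select(a)                      # most promising remaining input
        if feasible(solution, x):
            solution = union(solution, x)  # absorb x into the solution
    return solution

# Toy instantiation (assumed for illustration): repeatedly take the largest
# remaining number whose running total stays within a capacity of 10.
items = [7, 5, 4, 2, 1]
remaining = sorted(items, reverse=True)
result = greedy(
    items,
    select=lambda a: remaining.pop(0),
    feasible=lambda sol, x: sum(sol) + x <= 10,
    union=lambda sol, x: sol + [x],
)
print(result)  # [7, 2, 1]
```

Each of Select, Feasible, and Union is a plug-in point; the problems in this chapter differ only in how these three are defined.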
5. INTRODUCTION
In simple words, the Greedy Method aims to get the best
possible solution to a problem with minimal effort.
It is like getting a high-quality product at minimal cost.
A classic scenario that represents greedy technique is the
Coin Change Problem.
Another simple example that represents greedy method is
Machine Scheduling, which is presented next.
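The coin change idea can be sketched directly: always hand out the largest coin that still fits. Note (an added caveat, not from the slides) that this greedy rule is optimal for canonical coin systems such as (25, 10, 5, 1) but can fail for arbitrary denominations, e.g. making 6 from {4, 3, 1}:

```python
def greedy_change(amount, denominations):
    """Make change by always taking the largest coin that still fits.
    Optimal for canonical systems such as (25, 10, 5, 1), but it can
    fail for arbitrary denominations (e.g. 6 out of {4, 3, 1})."""
    coins = []
    for d in sorted(denominations, reverse=True):
        while amount >= d:
            amount -= d
            coins.append(d)
    return coins

result = greedy_change(63, [1, 5, 10, 25])
print(result)  # [25, 25, 10, 1, 1, 1]
```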
6. INTRODUCTION
There are n tasks that must be completed. Each task has a start
and an end time. An unlimited number of machines is available to
execute the tasks. The condition is that two tasks whose
execution times overlap must not be assigned to the same
machine. The objective is to complete all the tasks using as few
machines as possible.
Task A B C D E F G
Start 0 3 4 9 7 1 6
End 2 7 7 11 10 5 8
Solution
One possible solution for this problem is assigning an individual
machine to each of the tasks.
There are 7 tasks. Hence we assign 7 machines, one for each
task. Although this is a feasible solution, it is definitely not
optimal.
7. INTRODUCTION
We now see how to get the optimal solution.
We start by arranging the tasks in the order of their start times,
i.e.,
A, F, B, C, G, E, D
(0,2),(1,5),(3,7),(4,7),(6,8),(7,10),(9,11)
Next we allocate machines to the tasks as follows,
[Chart: the tasks packed onto three machines along the 0 to 13
time axis. One valid assignment is m1: A, B, E; m2: F, G;
m3: C, D, so three machines suffice.]
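The allocation above can be sketched as a greedy sweep over the tasks in start-time order, reusing a machine whenever one has already finished (a sketch; the task data is taken from the table above):

```python
import heapq

def schedule(tasks):
    """Assign (name, start, end) tasks to machines greedily by start time.
    A new machine is opened only when every existing one is still busy."""
    free_at = []                 # min-heap of (end_time, machine_id)
    assignment = {}              # machine_id -> list of task names
    next_machine = 1
    for name, start, end in sorted(tasks, key=lambda t: t[1]):
        if free_at and free_at[0][0] <= start:
            _, m = heapq.heappop(free_at)    # reuse the earliest-free machine
        else:
            m = next_machine                 # all machines busy: open a new one
            next_machine += 1
        assignment.setdefault(m, []).append(name)
        heapq.heappush(free_at, (end, m))
    return assignment

tasks = [('A', 0, 2), ('B', 3, 7), ('C', 4, 7), ('D', 9, 11),
         ('E', 7, 10), ('F', 1, 5), ('G', 6, 8)]
result = schedule(tasks)
print(len(result))  # 3
```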
8. KNAPSACK PROBLEM
There are n objects given to us.
Each object has a weight and profit associated with it.
The weights and profits are represented as 𝒘𝒊 and 𝒑𝒊
respectively, for 1 ≤ i ≤ n.
We are also given a Knapsack/bag whose capacity is m.
The problem here is to fill the knapsack by using the objects
such that we get maximum profit and we don’t exceed the
knapsack capacity.
9. KNAPSACK PROBLEM
Let the objects be 𝒙𝟏, 𝒙𝟐, ……., 𝒙𝒏.
An object 𝒙𝒊 can either be chosen as a whole or a fraction of it
can be chosen.
This means, 0 ≤ 𝒙𝒊 ≤ 1.
𝒙𝒊 = 1, means the object 𝒙𝒊 has been chosen as a whole.
𝒙𝒊 = 1/2, means half of 𝒙𝒊 has been chosen
𝒙𝒊 = 0, means the object 𝒙𝒊 hasn't been chosen.
The profit of an object varies based on the fraction of the object
chosen.
10. KNAPSACK PROBLEM
Consider an object 𝒙𝒊, whose profit is 100. Then
𝒙𝒊 = 1, means the object 𝒙𝒊 has been chosen as a whole.
Hence profit of 𝒙𝒊 is 100.
𝒙𝒊 = 1/2, means half of 𝒙𝒊 has been chosen. Hence profit of
𝒙𝒊 is 50(100/2).
𝒙𝒊 = 0, means the object 𝒙𝒊 hasn't been chosen. Hence
profit of 𝒙𝒊 is 0.
In general, the profit of an object is obtained by 𝒙𝒊*𝒑𝒊.
Similarly, the weight of an object is 𝒘𝒊*𝒙𝒊.
11. KNAPSACK PROBLEM
Formal Definition of Knapsack Problem
“Given n objects with weights 𝒘𝒊 and profits 𝒑𝒊
such that 1 ≤ i ≤ n and given a knapsack with capacity
m, the Knapsack problem can be stated as
maximize Σ(1 ≤ i ≤ n) 𝒑𝒊𝒙𝒊
subject to Σ(1 ≤ i ≤ n) 𝒘𝒊𝒙𝒊 ≤ m
and 0 ≤ 𝒙𝒊 ≤ 1, 1 ≤ i ≤ n ”
12. KNAPSACK PROBLEM
We now discuss some observations made about the
Knapsack problem. These observations are referred to
as Lemmas.
Lemma 1
If the total weight, i.e., 𝒘𝟏 + 𝒘𝟐 + … + 𝒘𝒏 ≤ m, then 𝒙𝒊 = 1 for all i
such that 1 ≤ i ≤ n.
Lemma 2
If the total weight of all the objects exceeds m, then every
optimal solution fills the knapsack exactly.
13. KNAPSACK PROBLEM
Strategies to solve Knapsack problem
Consider there are 3 objects, each with weights (18, 15, 10) and
profits (25, 24, 15) respectively. The knapsack capacity is 20.
Find the optimal solution.
Solution
The data given in this problem is
n = 3
Objects are 𝒙𝟏, 𝒙𝟐, 𝒙𝟑
𝒘𝟏 = 18, 𝒘𝟐 = 15, 𝒘𝟑 = 10
𝒑𝟏 = 25, 𝒑𝟐 = 24, 𝒑𝟑 = 15
m = 20
14. KNAPSACK PROBLEM
Following are some solutions available for the given
problem.
Solution   𝒙𝟏     𝒙𝟐     𝒙𝟑     Total Weight   Total Profit
A          1/2    1/3    1/4    16.5           24.25
B          1      2/15   0      20             28.2
C          0      2/3    1      20             31
D          0      1      1/2    20             31.5
Here solution D is the optimal solution.
We now look at some greedy strategies to solve the
knapsack problem.
15. KNAPSACK PROBLEM
Strategy 1 – Choose highest profit/value next
In this approach, we choose objects in the decreasing order of
their profits.
In the example, we first select 𝒙𝟏 followed by 2/15th of 𝒙𝟐 as
selecting the whole of 𝒙𝟐 will exceed the knapsack capacity.
This strategy is represented in solution B.
Strategy 2 – Choose smallest weight next
In this approach, we choose objects in the increasing order of
their weights.
This strategy is represented in solution C.
16. KNAPSACK PROBLEM
Strategy 3 – Choose highest value to weight ratio
next
This is a combination of strategies 1 and 2.
In this approach, we first obtain the profit to weight ratio of all
objects. For our example,
𝒑𝟏/𝒘𝟏 = 25/18 ≈ 1.39
𝒑𝟐/𝒘𝟐 = 24/15 = 1.6
𝒑𝟑/𝒘𝟑 = 15/10 = 1.5
We then select the objects in the decreasing order of their
profit to weight ratio.
This approach is represented by solution D, which is the
optimal solution.
17. KNAPSACK PROBLEM
Theorem
If 𝒑𝟏/𝒘𝟏 ≥ 𝒑𝟐/𝒘𝟐 ≥ …… ≥ 𝒑𝒏/𝒘𝒏, then GreedyKnapsack
generates an optimal solution to the given instance of the
problem.
Proof
Phase 1
We need to prove that Greedy method gives the optimal
solution for the Knapsack problem.
From the discussion that we have had so far, to get the optimal
solution, we have used the value to weight ratio as the
selection criterion.
18. KNAPSACK PROBLEM
In this approach, we first find the value/weight ratio for each
object and arrange it in descending order, i.e.,
𝒑𝟏/𝒘𝟏 ≥ 𝒑𝟐/𝒘𝟐 ≥ …… ≥ 𝒑𝒏/𝒘𝒏
Let 𝒑𝟏/𝒘𝟏 correspond to object 𝒙𝟏, 𝒑𝟐/𝒘𝟐 correspond to object 𝒙𝟐,
and so on.
This means, the order in which we pick the objects is,
𝒙𝟏 , 𝒙𝟐 ,……., 𝒙𝒏
We first pick up 𝒙𝟏 and we pick it as a whole. We then pick up
𝒙𝟐 as a whole followed by 𝒙𝟑 and so on.
19. KNAPSACK PROBLEM
We keep picking objects as a whole as long as the
knapsack capacity is not exceeded.
This can be expressed as
𝒙𝟏 = 1, 𝒙𝟐 = 1, 𝒙𝟑 = 1……..
This continues until we reach a point, say position ‘j’, such
that picking 𝒙𝒋 as a whole will exceed knapsack capacity.
Hence, we have no other choice but to pick a fraction of
𝒙𝒋, i.e.,
0 < 𝒙𝒋 < 1
20. KNAPSACK PROBLEM
We take a fraction of 𝒙𝒋 such that it is just enough to fill the
knapsack capacity.
This means, after picking a fraction of 𝒙𝒋, we will not be
able to pick the objects 𝒙𝒋+𝟏, 𝒙𝒋+𝟐,……, 𝒙𝒏.
This is represented as,
𝒙𝟏, 𝒙𝟐, ……, 𝒙𝒋−𝟏     𝒙𝒋            𝒙𝒋+𝟏, 𝒙𝒋+𝟐, ……, 𝒙𝒏
1, 1, ……, 1          0 < 𝒙𝒋 < 1    0, 0, ……, 0
This is the greedy solution to the problem. And we need to
prove that this is the optimal solution.
21. KNAPSACK PROBLEM
Phase 2
Let the greedy solution for the Knapsack problem be
X = (𝒙𝟏, 𝒙𝟐,….., 𝒙𝒏), such that,
𝒙𝟏, 𝒙𝟐, ……, 𝒙𝒋−𝟏     𝒙𝒋            𝒙𝒋+𝟏, 𝒙𝒋+𝟐, ……, 𝒙𝒏
1, 1, ……, 1          0 < 𝒙𝒋 < 1    0, 0, ……, 0
Let us assume the optimal solution for the same problem
to be
Y = (𝒚𝟏, 𝒚𝟐,….., 𝒚𝒏)
We are not aware of what fractions of 𝒚𝒊’s have been
considered.
22. KNAPSACK PROBLEM
If Y = X, it means that the optimal solution is our
greedy solution, and there is nothing to prove.
For the sake of proof, we consider Y ≠ X.
Let ‘k’ be the first index position at which 𝒚𝒌 ≠ 𝒙𝒌. Then 𝒚𝒌
must be less than 𝒙𝒌 (𝒚𝒌 < 𝒙𝒌).
We now prove the above statement w.r.t. index ‘j’. We
consider three cases:
Case 1: if k < j
Case 2: if k = j
Case 3: if k > j
23. KNAPSACK PROBLEM
Case 1: if k < j
For all index positions less than j, the value of 𝒙𝒊’s are 1.
Hence, for this case, if 𝒚𝒌 ≠ 𝒙𝒌,i.e., if 𝒚𝒌 ≠ 1, then 𝒚𝒌 has to be
less than 𝒙𝒌.
This is because, 𝒚𝒌 cannot be greater than 1, as 1 is the
maximum value of an object.
Therefore, for Case 1, we have proved that if 𝒚𝒌 ≠ 𝒙𝒌, then,
𝒚𝒌 < 𝒙𝒌.
24. KNAPSACK PROBLEM
Case 2: if k = j
This means the 𝒚𝒊 values are the same as the 𝒙𝒊 values till index
j−1. At the jth index their values don’t match.
At position j, we take just a fraction of object 𝒙𝒋. This is because
taking 𝒙𝒋 as a whole would exceed the knapsack capacity.
This means at position j, the value 𝒚𝒋 also cannot be 1.
Also 𝒚𝒋 cannot be greater than 𝒙𝒋, as the fraction of 𝒙𝒋 considered
fills the knapsack to its capacity.
So, if 𝒚𝒋 exceeds this fraction, then it will definitely exceed the
knapsack’s capacity.
Hence, at index k = j, if 𝒚𝒌 ≠ 𝒙𝒌 then 𝒚𝒌 < 𝒙𝒌.
25. KNAPSACK PROBLEM
Case 3: if k > j
This means 𝒚𝒊 values are same as 𝒙𝒊 values till index j. After
jth index their values don’t match.
The values of 𝒙𝒊 till index j have filled the knapsack to its
capacity. This means even the 𝒚𝒊 values have filled the
knapsack.
After index j, all 𝒙𝒊’s are 0s.
At these positions 𝒚𝒊 cannot be greater than 𝒙𝒊 as the
knapsack capacity will be breached.
26. KNAPSACK PROBLEM
Hence we have proved the statement that, if
𝒚𝒌 ≠ 𝒙𝒌 then 𝒚𝒌 < 𝒙𝒌.
This suggests that the assumed optimal solution Y does no
better than our greedy solution X, which we establish next.
We now transform the assumed optimal solution Y
into the greedy solution X and prove the theorem.
By transformation, we mean, since 𝒚𝒌 < 𝒙𝒌, we bring 𝒚𝒌
up to the value of 𝒙𝒌.
27. KNAPSACK PROBLEM
Phase 3
We increase 𝒚𝒌 to 𝒙𝒌. We also reduce 𝒚𝒌+𝟏, 𝒚𝒌+𝟐, …….,
𝒚𝒏 accordingly, so that the balance holds.
Let this transformed solution be
Z = (𝒛𝟏, 𝒛𝟐,……., 𝒛𝒏)
We can make the following observations on Z,
For 1 ≤ i ≤ k, 𝒛𝒊 = 𝒙𝒊
𝒘𝒌(𝒙𝒌 − 𝒚𝒌) = Σ(i = k+1 to n) 𝒘𝒊(𝒚𝒊 − 𝒛𝒊)
28. KNAPSACK PROBLEM
We now show that Z is at least as profitable as Y.
If profit(Z) > profit(Y), then Y was not optimal, a contradiction.
If profit(Z) = profit(Y), then Z is also optimal and agrees with X
in one more position than Y did; repeating the transformation
eventually turns Y into X, which proves the theorem.
We have,
Σ(i=1 to n) 𝑝𝑖𝑧𝑖 = Σ(i=1 to n) 𝑝𝑖𝑦𝑖 + 𝑝𝑘(𝑥𝑘 − 𝑦𝑘) − Σ(i=k+1 to n) 𝑝𝑖(𝑦𝑖 − 𝑧𝑖)
Let us rewrite 𝑝𝑘 as 𝑤𝑘 ∗ (𝑝𝑘/𝑤𝑘) and 𝑝𝑖 as 𝑤𝑖 ∗ (𝑝𝑖/𝑤𝑖). We get,
Σ 𝑝𝑖𝑧𝑖 = Σ 𝑝𝑖𝑦𝑖 + (𝑝𝑘/𝑤𝑘) ∗ 𝑤𝑘(𝑥𝑘 − 𝑦𝑘) − Σ(i=k+1 to n) (𝑝𝑖/𝑤𝑖) ∗ 𝑤𝑖(𝑦𝑖 − 𝑧𝑖)
Since 𝑝𝑖/𝑤𝑖 ≤ 𝑝𝑘/𝑤𝑘 for every i > k,
Σ 𝑝𝑖𝑧𝑖 ≥ Σ 𝑝𝑖𝑦𝑖 + (𝑝𝑘/𝑤𝑘) [ 𝑤𝑘(𝑥𝑘 − 𝑦𝑘) − Σ(i=k+1 to n) 𝑤𝑖(𝑦𝑖 − 𝑧𝑖) ]
= Σ 𝑝𝑖𝑦𝑖, since the bracketed term is zero by the weight balance above.
30. KNAPSACK PROBLEM
Algorithm GreedyKnapsack(m, n)
// p[1:n] and w[1:n] contain the profits and weights of the n objects,
// ordered such that p[i]/w[i] ≥ p[i+1]/w[i+1]. x[1:n] is the solution vector.
{
    for i := 1 to n do
        x[i] := 0.0;
    U := m;
    for i := 1 to n do {
        if (w[i] > U) then break;
        x[i] := 1.0;
        U := U − w[i];
    }
    if (i ≤ n) then x[i] := U / w[i];
}
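The pseudocode translates directly to Python. As a convenience in this sketch, the sorting by profit/weight ratio is done inside the function, so the input need not be pre-ordered:

```python
def greedy_knapsack(m, profits, weights):
    """Fractional knapsack by decreasing profit/weight ratio, a Python
    rendering of the GreedyKnapsack pseudocode."""
    n = len(profits)
    order = sorted(range(n), key=lambda i: profits[i] / weights[i], reverse=True)
    x = [0.0] * n
    U = m                                  # remaining capacity
    for i in order:
        if weights[i] > U:
            x[i] = U / weights[i]          # take a fraction of the first
            break                          # object that does not fit whole
        x[i] = 1.0
        U -= weights[i]
    return x, sum(p * xi for p, xi in zip(profits, x))

x, profit = greedy_knapsack(20, [25, 24, 15], [18, 15, 10])
print(x, round(profit, 2))  # [0.0, 1.0, 0.5] 31.5
```

On the three-object instance above this reproduces solution D, the optimal one.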
31. KNAPSACK PROBLEM
Example
Find the optimal solution for the given instance of Knapsack
using greedy technique.
n = 7, m = 15,
profits = (10, 5, 15, 7, 6, 18, 3)
Weights = (2, 3, 5, 7, 1, 4, 1)
Solution
We already know that the optimal solution is obtained by
selecting the objects in the decreasing order of their value to
weight ratio.
We first find this ratio for all the objects.
32. KNAPSACK PROBLEM
i        1    2     3    4    5    6    7
𝒑𝒊       10   5     15   7    6    18   3
𝒘𝒊       2    3     5    7    1    4    1
𝒑𝒊/𝒘𝒊    5    1.67  3    1    6    4.5  3
Selecting objects in decreasing order of 𝒑𝒊/𝒘𝒊 (objects 5, 1, 6, 3, 7, 2):
remaining capacity m: 15 → 14 → 12 → 8 → 3 → 2 → 0
𝒙𝒊 chosen:            1    1    1    1    1   2/3
giving x = (1, 2/3, 1, 0, 1, 1, 1).
Total Profit = 1*10 + 2/3*5 + 1*15 + 1*6 + 1*18 + 1*3
Total Profit = 55.33
33. JOB SEQUENCING WITH DEADLINES
Consider there are n jobs/tasks.
Each job has a deadline 𝒅𝒊 such that 𝒅𝒊 ≥ 0.
Every job also has a profit/value 𝒑𝒊 such that 𝒑𝒊 > 0.
For a job, we get its profit only when the job is completed
before its deadline.
A machine is provided to execute the jobs.
Only one machine is provided.
34. JOB SEQUENCING WITH DEADLINES
There are some conditions defined for the problem:
Every job takes one time unit to complete execution.
A job can be taken for execution only when the system clock is within
the deadline of the considered job.
The objective of the problem is to execute as many jobs as
possible and get maximum profit.
Example
Solve the Job Sequencing problem for the following instance.
n=4
(𝒑𝟏, 𝒑𝟐, 𝒑𝟑, 𝒑𝟒)=(100, 10, 15, 27)
(𝒅𝟏, 𝒅𝟐, 𝒅𝟑, 𝒅𝟒) = ( 2, 1, 2, 1)
35. JOB SEQUENCING WITH DEADLINES
The data given is,
(𝒑𝟏, 𝒑𝟐, 𝒑𝟑, 𝒑𝟒)=(100, 10, 15, 27)
(𝒅𝟏, 𝒅𝟐, 𝒅𝟑, 𝒅𝟒) = ( 2, 1, 2, 1)
We find all possible solutions and then select the optimal one.
The possible schedules (each job occupies one unit time slot
within its deadline) and their profits are:
(J1) = 100
(J2) = 10
(J3) = 15
(J4) = 27
(J1, J2) = 110
(J1, J3) = 115
(J1, J4) = 127
(J2, J3) = 25
(J3, J4) = 42
36. JOB SEQUENCING WITH DEADLINES
This way of getting the optimal solution is time consuming.
An efficient approach is to arrange the jobs in the decreasing
order of their profits and then select the jobs from the order for
execution.(Greatest Profit Next strategy)
We have,
Jobs - J1 J4 J3 J2
Profits - 100 27 15 10
Deadline - 2 1 2 1
Hence the optimal solution is (J4, J1): J4 in slot [0,1] and J1 in
slot [1,2], with profit 127.
37. JOB SEQUENCING WITH DEADLINES
High-level description of the Job Sequencing algorithm:
Algorithm GreedyJob(d, J, n)
// J is a set of jobs that can be completed by their deadlines;
// jobs are assumed ordered such that p[1] ≥ p[2] ≥ … ≥ p[n]
{
    J := {1};
    for i := 2 to n do {
        if (all jobs in J ∪ {i} can be completed by their deadlines)
            then J := J ∪ {i};
    }
}
38. JOB SEQUENCING WITH DEADLINES
Algorithm JS(d, J, n)
// d[1:n] holds the deadlines of n jobs, ordered by decreasing profit.
// s[1:n] is an array of time slots; J[1:n] marks the selected jobs.
{
    k := 0; // number of jobs selected
    for i := 1 to n do
        J[i] := s[i] := 0;
    for i := 1 to n do {
        if (s[d[i]] = 0) then {
            s[d[i]] := i; J[i] := 1; k := k + 1;
        } else {
            for x := d[i] − 1 to 1 step −1 do {
                if (s[x] = 0) then {
                    s[x] := i; J[i] := 1; k := k + 1;
                    break;
                }
            }
        }
    }
    return k;
}
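The slot-filling logic of JS can be sketched in Python. The example data is the n = 4 instance solved earlier; sorting by profit is done inside the function:

```python
def job_sequencing(profits, deadlines):
    """Greedy job sequencing: consider jobs in decreasing profit order and
    place each in the latest free unit slot at or before its deadline."""
    n = len(profits)
    by_profit = sorted(range(n), key=lambda i: profits[i], reverse=True)
    slot = {}                                  # time slot -> job index
    for i in by_profit:
        for t in range(deadlines[i], 0, -1):   # latest free slot first
            if t not in slot:
                slot[t] = i
                break                          # job i scheduled
    selected = sorted(j + 1 for j in slot.values())   # 1-based job numbers
    return selected, sum(profits[j] for j in slot.values())

jobs, profit = job_sequencing([100, 10, 15, 27], [2, 1, 2, 1])
print(jobs, profit)  # [1, 4] 127
```

Filling the latest feasible slot keeps earlier slots open for jobs with tighter deadlines.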
39. JOB SEQUENCING WITH DEADLINES
Example
n=5, (𝒑𝟏, 𝒑𝟐, 𝒑𝟑, 𝒑𝟒, 𝒑𝟓) = (20, 15, 10, 5, 1)
(𝒅𝟏, 𝒅𝟐, 𝒅𝟑, 𝒅𝟒, 𝒅𝟓) = (2, 2, 1, 3, 3).
Solution
We solve this problem using feasibility representation.
J         Assigned slots        Job considered   Action                 Profit so far
Ø         none                  1                Assign to slot [1,2]   0
{1}       [1,2]                 2                Assign to slot [0,1]   20
{1,2}     [0,1], [1,2]          3                Cannot fit; reject     35
{1,2}     [0,1], [1,2]          4                Assign to slot [2,3]   35
{1,2,4}   [0,1], [1,2], [2,3]   5                Cannot fit; reject     40
Hence, the optimal solution is {1, 2, 4} with a profit of 40.
40. JOB SEQUENCING WITH DEADLINES
Exercise Problems
1. n =4, D=(4, 1, 1, 1), P=(20, 10, 40, 30)
2. n =5, D=(2, 1, 2, 1, 3), P=(100, 19, 27, 25, 15)
3. n =7, D=(1, 3, 4, 3, 2, 1, 2), P=(3, 5, 20, 18, 1, 6, 30)
41. MINIMUM COST SPANNING TREE
Let G = (V, E) be an undirected graph with vertex set V and
edge set E. Then,
“A Spanning Tree of an undirected connected graph
is its connected acyclic subgraph (i.e., a tree) that contains
all the vertices of the graph”
A spanning tree satisfies the property that for a given
graph G, its spanning tree is a minimal subgraph G′ such
that
V(G) = V(G′), and
G′ is connected.
42. MINIMUM COST SPANNING TREE
A Minimal sub graph is one which has the fewest
number of edges.
Sometimes, edges of a graph are assigned with
some numerical values.
These values are referred to as Cost of the edges.
The Cost of a Tree is the sum of cost of all the edges
in the tree.
43. MINIMUM COST SPANNING TREE
Hence,
“The Minimum Cost Spanning Tree(MST) for
a graph G is the spanning tree of the given graph
such that its cost is minimal.”
[Figure: a graph on vertices {a, b, c, d} with weighted edges,
and three of its spanning trees T1, T2, T3 with w(T1) = 6,
w(T2) = 9, w(T3) = 8.]
Graph and its spanning trees, with T1 being the minimum
spanning tree.
44. MINIMUM COST SPANNING TREE
Prim’s Algorithm
This algorithm constructs the MST edge by edge.
All the edges that have been chosen to be part of the MST are
stored in an edge set A.
Selection of an edge to be made part of a MST must satisfy
the following conditions:
This edge results in a minimal cost sub graph.
The inclusion of this edge in the sub graph ensures that the sub
graph remains a tree.
45. MINIMUM COST SPANNING TREE
“If A is a set of edges selected so far, then A
forms a tree. The next edge (u, v) to be included in A is
a minimum cost edge not in A with the property that A
U {(u, v)} is also a tree”
Prim’s Algorithm
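Prim's rule can be sketched with a priority queue of frontier edges. The edge list used here is reconstructed from the worked example on the following slides:

```python
import heapq

def prim(graph, start):
    """Prim's algorithm: grow the tree one minimum-cost frontier edge
    at a time. graph: dict vertex -> list of (neighbor, weight)."""
    visited = {start}
    frontier = [(w, start, v) for v, w in graph[start]]
    heapq.heapify(frontier)
    tree, cost = [], 0
    while frontier and len(visited) < len(graph):
        w, u, v = heapq.heappop(frontier)
        if v in visited:
            continue                        # edge would create a cycle
        visited.add(v)
        tree.append((u, v, w))
        cost += w
        for x, wx in graph[v]:
            if x not in visited:
                heapq.heappush(frontier, (wx, v, x))
    return tree, cost

edges = [('a', 'b', 3), ('b', 'c', 1), ('c', 'd', 6), ('d', 'e', 8),
         ('e', 'f', 2), ('a', 'f', 5), ('b', 'f', 4), ('c', 'f', 4),
         ('d', 'f', 5), ('a', 'e', 6)]
graph = {}
for u, v, w in edges:
    graph.setdefault(u, []).append((v, w))
    graph.setdefault(v, []).append((u, w))
tree, cost = prim(graph, 'a')
print(cost)  # 15
```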
47. MINIMUM COST SPANNING TREE
Example
Find the MST for the graph with vertices {a, b, c, d, e, f} and
edges ab = 3, bc = 1, cd = 6, de = 8, ef = 2, af = 5, bf = 4,
cf = 4, df = 5, ae = 6.
In the solution for this problem, we use the following
notation to represent a node:
node_name(predecessor tree vertex, edge_cost)
48. MINIMUM COST SPANNING TREE
Tree Vertices   Remaining Vertices
a(-, -)         b(a,3), c(-, ∞), d(-, ∞), e(a,6), f(a,5)
b(a,3)          c(b,1), d(-, ∞), e(a,6), f(b,4)
49. MINIMUM COST SPANNING TREE
Tree Vertices   Remaining Vertices
c(b,1)          d(c,6), e(a,6), f(b,4)
f(b,4)          d(f,5), e(f,2)
50. MINIMUM COST SPANNING TREE
Tree Vertices   Remaining Vertices
e(f,2)          d(f,5)
d(f,5)          (all vertices included; MST with cost = 15)
51. MINIMUM COST SPANNING TREE
Kruskal’s Algorithm
This is another approach to obtain a MST for a given graph.
The algorithm starts by arranging the edges of the given graph
in ascending order.
Edges are then selected one at a time from this order, to be
included in the MST.
An edge is included, if and only if it does not result in a cycle in
the MST.
52. MINIMUM COST SPANNING TREE
ALGORITHM Kruskal(G)
//Input: A weighted connected graph G = ⟨V, E⟩
//Output: 𝑬𝑻, the set of edges composing a minimum spanning tree of G
sort E in nondecreasing order of the edge weights w(𝒆𝒊𝟏) ≤ … ≤ w(𝒆𝒊|𝑬|)
𝑬𝑻 ← ∅; ecounter ← 0   //initialize the set of tree edges and its size
k ← 0                   //initialize the number of processed edges
while ecounter < |V| − 1 do
    k ← k + 1
    if 𝑬𝑻 ∪ {𝒆𝒊𝒌} is acyclic
        𝑬𝑻 ← 𝑬𝑻 ∪ {𝒆𝒊𝒌}; ecounter ← ecounter + 1
return 𝑬𝑻
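A compact sketch of this algorithm, using a simple union-find structure for the acyclicity test (the union-find idea is developed later in this section). The edge list matches the worked example:

```python
def kruskal(num_vertices, edges):
    """Kruskal's algorithm; edges are (weight, u, v) tuples.
    Two endpoints in the same component means the edge closes a cycle."""
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    mst, cost = [], 0
    for w, u, v in sorted(edges):           # nondecreasing weight order
        ru, rv = find(u), find(v)
        if ru != rv:                        # acyclic: different components
            parent[ru] = rv
            mst.append((u, v, w))
            cost += w
            if len(mst) == num_vertices - 1:
                break
    return mst, cost

edges = [(3, 'a', 'b'), (1, 'b', 'c'), (6, 'c', 'd'), (8, 'd', 'e'),
         (2, 'e', 'f'), (5, 'a', 'f'), (4, 'b', 'f'), (4, 'c', 'f'),
         (5, 'd', 'f'), (6, 'a', 'e')]
mst, cost = kruskal(6, edges)
print(cost)  # 15
```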
53. MINIMUM COST SPANNING TREE
Example
Find the MST for the graph using Kruskal’s algorithm.
[Figure: the same six-vertex weighted graph used in the Prim’s
example.]
We first arrange the edges of this graph in ascending order.
The resulting order is,
bc ef ab bf cf af df ae cd de
1 2 3 4 4 5 5 6 6 8
54. MINIMUM COST SPANNING TREE
Tree Edges   Remaining Sorted Edges
(none)       bc(1) ef(2) ab(3) bf(4) cf(4) af(5) df(5) ae(6) cd(6) de(8)
bc(1)        ef(2) ab(3) bf(4) cf(4) af(5) df(5) ae(6) cd(6) de(8)
55. MINIMUM COST SPANNING TREE
Tree Edges   Remaining Sorted Edges
ef(2)        ab(3) bf(4) cf(4) af(5) df(5) ae(6) cd(6) de(8)
ab(3)        bf(4) cf(4) af(5) df(5) ae(6) cd(6) de(8)
56. MINIMUM COST SPANNING TREE
Tree Edges   Remaining Sorted Edges
bf(4)        cf(4) af(5) df(5) ae(6) cd(6) de(8)
             (cf and af are then rejected, as each would create a cycle)
df(5)        MST complete, with cost = 15
57. MINIMUM COST SPANNING TREE
Practice Examples
Find the MST for the following graphs using Prim’s and Kruskal’s algorithms.
58. MINIMUM COST SPANNING TREE
Disjoint Subsets and Union-Find Algorithms
Union-Find algorithm is a strategy which is used to efficiently
implement the Kruskal’s algorithm.
This strategy is based on the concept of Disjoint Subsets.
The strategy divides the vertex set 𝑽 into disjoint
subsets, each initially containing a single vertex.
It then merges these subsets, as edges are added, until all
the vertices of the MST lie in one subset.
59. MINIMUM COST SPANNING TREE
The Union-Find algorithm performs the following operations:
makeset(x) - creates a one-element set {x}.
find(x) - returns a subset containing x.
union(x, y) - constructs the union of the disjoint subsets 𝑺𝒙 and 𝑺𝒚
containing x and y.
For example, let S = {1, 2, 3, 4, 5, 6}. Then makeset(i) creates
the set { i } for all elements in the set, i.e.,
{1}, {2}, {3}, {4}, {5}, {6}
Performing union(1, 4) and union(5, 2) yields
{1, 4}, {5, 2}, {3}, {6}
60. MINIMUM COST SPANNING TREE
Now if we want to combine the subsets {1,4} and {5,2},
how do we call the Union() function?
There is a problem here because, Union() function
accepts two values as parameters.
But here we have two subsets to be combined, each with
two elements.
We cannot pass the whole subset as parameter to the
Union() function as the function accepts single elements
only.
Hence, we need to have a single element that represents
a subset. We call this element, the Subset
Representative.
61. MINIMUM COST SPANNING TREE
There are several logics followed to assign a representative
for a subset.
The approach that we adopt is, the smallest element of a
subset is its representative.
With this, we can comfortably merge the subsets {1,4} and
{5,2} by making the call
Union(1,2)
Here element 1 represents the subset {1,4} and element 2
represents subset {5,2}.
62. MINIMUM COST SPANNING TREE
We now have a look at various data structures that
are used to implement the Union-Find algorithm.
We have 2 strategies for Union-Find based on the
type of data structures used.
The strategies are
Quick Find
Quick Union
Quick Find uses arrays and linked list while Quick
Union uses trees.
63. MINIMUM COST SPANNING TREE
Quick Find
This strategy maintains an array that contains
information about the representative of each subsets.
The array indices are the elements of the set S.
The values at the respective indices are the representatives
of the subset containing the element.
The strategy also maintains each subset as a linked list
with header node.
The header node contains pointers to the first and last
element of the list as well as information about the total
number of nodes in the list.
64. MINIMUM COST SPANNING TREE
Consider the set S={1, 2, 3, 4, 5, 6}
Initially, this set gets divided into the subsets
{1}, {2}, {3}, {4}, {5}, {6}
The linked lists for the subsets are as follows.
[Figure: six singly linked lists, List 1 … List 6. Each header node
stores First, Last, and Size; initially each list holds the single
node i, with Size = 1.]
65. MINIMUM COST SPANNING TREE
The array of subset representatives is,
Element (index):   1  2  3  4  5  6
Representative:    1  2  3  4  5  6
We will now see what happens to the array and the
linked lists after the calls for union(1,4) and
union(2,5) are made.
66. MINIMUM COST SPANNING TREE
[Figure: List 1 now holds 1 → 4 (Size 2) and List 2 holds 2 → 5
(Size 2); Lists 3 and 6 are unchanged.]
Subset Representatives
Element (index):   1  2  3  4  5  6
Representative:    1  2  3  1  2  6
67. MINIMUM COST SPANNING TREE
Next the call for union(1,2) and union(3,6) will result in
the following scenario,
[Figure: List 1 now holds 1 → 4 → 2 → 5 (Size 4) and List 3
holds 3 → 6 (Size 2).]
Subset Representatives
Element (index):   1  2  3  4  5  6
Representative:    1  1  3  1  1  3
68. MINIMUM COST SPANNING TREE
This way the process continues in Quick find
strategy.
To summarize, in Quick Find approach,
makeset(x) involves creating a representative array in
which the representative for each element is itself
initially.
This operation also involves creating a linked list of
one node for each subset.
find(x) involves retrieving x’s representative from the
array.
69. MINIMUM COST SPANNING TREE
Quick Union
This approach uses Trees to implement the operations.
makeset(x) involves creating a tree of one node for all
x in the set S.
The root of a tree is the representative of that subset.
Edges in the tree are directed from children to parents.
union(x,y) involves attaching the root of one tree to the
root of the other.
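The Quick Union operations can be sketched as parent-pointer trees. Note one deliberate deviation from the slides: instead of the smallest-element representative, this sketch uses the tree root as the representative and attaches the smaller tree under the larger (union by size), a common variant that keeps the trees shallow:

```python
class QuickUnion:
    """Union-Find via parent-pointer trees (the Quick Union scheme),
    with union by size so the trees stay shallow."""
    def __init__(self, elements):
        self.parent = {x: x for x in elements}   # makeset(x) for every x
        self.size = {x: 1 for x in elements}

    def find(self, x):
        while self.parent[x] != x:               # walk up to the root,
            x = self.parent[x]                   # the subset representative
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return                               # already in one subset
        if self.size[rx] < self.size[ry]:
            rx, ry = ry, rx                      # rx is the larger tree
        self.parent[ry] = rx                     # attach smaller root under it
        self.size[rx] += self.size[ry]

uf = QuickUnion([1, 2, 3, 4, 5, 6])
uf.union(1, 4); uf.union(2, 5)      # {1,4}, {2,5}, {3}, {6}
uf.union(1, 2); uf.union(3, 6)      # {1,4,2,5}, {3,6}
print(uf.find(5) == uf.find(4))     # True
```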
70. MINIMUM COST SPANNING TREE
For a given set S = {1, 2, 3, 4, 5, 6}, makeset(x) results in six
one-node trees: 1, 2, 3, 4, 5, 6.
union(1,4) and union(2,5) will result in a tree with root 1 and
child 4, and a tree with root 2 and child 5; 3 and 6 remain
single nodes.
union(1,2) and union(3,6) will result in a tree with root 1
(children 4 and 2, where 2 keeps its child 5) and a tree with
root 3 (child 6).
71. DIJKSTRA’S ALGORITHM
This algorithm is used to solve the Single Source
Shortest Path problem.
The problem is defined as,
“For a given vertex called the source in a
weighted connected graph, find shortest paths to all its
other vertices.”
Dijkstra’s algorithm works only on graphs whose
edges have nonnegative weights.
72. DIJKSTRA’S ALGORITHM
In this algorithm, we will have two categories of vertices:
Tree vertices
Fringe vertices
Tree vertices are those vertices of the given graph that
are part of the shortest path tree.
Fringe vertices are the remaining vertices that are yet to
be added to the shortest path tree.
We now get introduced to 2 notations that will be regularly
used in the algorithm:
𝒅𝒊 - shortest distance from the source vertex to vertex i.
w(a, b) – weight of the edge between vertices a and b.
73. DIJKSTRA’S ALGORITHM
The steps involved in finding the Single Source Shortest path for
a given graph are:
Step 1 – Identify the fringe vertex 𝒖∗ that is closest to the source.
Move 𝒖∗ from the fringe to the set of tree vertices.
Step 2 – For each remaining fringe vertex u that is connected to
𝒖∗ by an edge of weight w(𝒖∗, u) such that 𝒅𝒖∗ + w(𝒖∗, u) < 𝒅𝒖,
update the labels of u to 𝒖∗ and 𝒅𝒖∗ + w(𝒖∗, u) respectively.
[Small illustrative four-vertex graph omitted.]
74. DIJKSTRA’S ALGORITHM
The vertex representation used in this algorithm is,
node_name(predecessor, cost of shortest path from
source)
We now find the single source shortest path for the
following graph using Dijkstra’s strategy.
[Figure: graph on vertices {a, b, c, d, e} with source a; as
reconstructed from the trace below, the edges are a–b = 3,
a–d = 7, b–c = 4, b–d = 2, c–d = 5, c–e = 6, d–e = 4.]
75. DIJKSTRA’S ALGORITHM
Tree Vertices   Fringe Vertices
a(-, 0)         b(a, 3), c(-, ∞), d(a, 7), e(-, ∞)
b(a, 3)         c(b, 7), d(b, 5), e(-, ∞)
d(b, 5)         c(b, 7), e(d, 9)
76. DIJKSTRA’S ALGORITHM
Tree Vertices   Fringe Vertices
c(b, 7)         e(d, 9)
e(d, 9)         (none)
The Single Source Shortest Paths are:
a – b of length 3
a – b – c of length 7
a – b – d of length 5
a – b – d – e of length 9
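The two steps can be sketched with a priority queue. The edge list below is reconstructed from the trace above:

```python
import heapq

def dijkstra(graph, source):
    """Dijkstra's single-source shortest paths (nonnegative weights).
    graph: dict vertex -> list of (neighbor, weight)."""
    dist = {source: 0}
    pq = [(0, source)]
    tree = set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in tree:
            continue
        tree.add(u)                         # u moves from fringe to tree
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd                # shorter path to v found
                heapq.heappush(pq, (nd, v))
    return dist

edges = [('a', 'b', 3), ('a', 'd', 7), ('b', 'c', 4), ('b', 'd', 2),
         ('c', 'd', 5), ('c', 'e', 6), ('d', 'e', 4)]
graph = {}
for u, v, w in edges:
    graph.setdefault(u, []).append((v, w))
    graph.setdefault(v, []).append((u, w))
distances = dijkstra(graph, 'a')
print(sorted(distances.items()))
```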
77. DIJKSTRA’S ALGORITHM
Algorithm trace on the same graph (𝑽𝑻 = tree vertices, 𝒅𝒗 =
current shortest distance, 𝒑𝒗 = predecessor; vertices indexed
a, b, c, d, e):
i      𝒖∗   𝒅𝒗 (a, b, c, d, e)   𝒑𝒗 (a, b, c, d, e)
init   -    0, ∞, ∞, ∞, ∞        Ø, Ø, Ø, Ø, Ø
0      a    0, 3, ∞, 7, ∞        Ø, a, Ø, a, Ø
1      b    0, 3, 7, 5, ∞        Ø, a, b, b, Ø
2      d    0, 3, 7, 5, 9        Ø, a, b, b, d
78. DIJKSTRA’S ALGORITHM
Exercise Problems
Find the single source shortest paths, using Dijkstra’s
algorithm, for:
1. the five-vertex graph {a, b, c, d, e} used in the example
above, with a as the source;
2. [Figure: a twelve-vertex graph on {a, b, …, l} with the
indicated edge weights, with a as the source.]
79. HUFFMAN TREES
Huffman trees are used for Encoding.
Encoding is the process of converting a given
text/message into some other form.
This is accomplished by converting each character of the
text into a sequence of bits.
The resulting string is called Codeword.
There are 2 encoding strategies:
Fixed length encoding
Variable length encoding
80. HUFFMAN TREES
Fixed Length Encoding
As the name says, each character in the text is replaced by a bit
sequence of the same length.
ASCII code is an example (A = 65, B = 66, …, Z = 90).
Variable Length Encoding
Different characters are assigned with bit sequences of different
length.
Frequently occurring characters are assigned with smaller length bit
string while rarely occurring characters are assigned with a longer bit
string.
81. HUFFMAN TREES
Problem with Variable Length Encoding
Consider the encoding scheme where a=01, e =00, h=010,
l=011, o=10, i=11.
With this, we encode the text hello as 0100001101110.
The problem lies in decoding this bit string.
Since it is a variable length encoding, every character will have
bit strings of different length.
Hence finding the beginning or end of bit string for a character
becomes difficult.
The above bit string could also be decoded as aeeilo.
82. HUFFMAN TREES
Solution
We now need a solution such that, even in variable length
encoding, we can clearly identify the beginning and end of a
particular character.
For this, we adopt a strategy called Prefix Code or Prefix-free
code.
As the name says, no character’s codeword is a prefix of any
other character’s codeword.
This means that while scanning the encoded bit string, the
moment a codeword is matched, it can belong to only one
character.
Now the question is, how can such a strategy be implemented.
83. HUFFMAN TREES
This is accomplished using Trees, specifically Binary
Trees.
We construct a binary tree such that,
Leaf nodes represent characters of the text.
The edges from the root to a leaf node represent the bit
code for the character in that leaf node.
All the left edges are labeled 0 and all the right edges are
labeled 1.
This strategy is presented next.
84. HUFFMAN TREES
With this scheme, decoding the bit string
011010101011
will only result in HELLO.
[Figure: binary code tree with leaves a, e, h, l, o; left edges are
labeled 0 and right edges 1.]
The codes for the characters are
A = 00
E = 010
H = 011
L = 10
O = 11
85. HUFFMAN TREES
A formal approach of constructing such a binary tree was
proposed by Huffman and is presented in the following
algorithm.
Huffman’s Algorithm
Step 1:
Initialize n one-node trees and label them with the symbols of the
alphabet given.
Record the frequency of each symbol in its tree’s root to indicate
the tree’s weight. (More generally, the weight of a tree will be
equal to the sum of the frequencies in the tree’s leaves.)
Step 2: Repeat the following steps until a single tree is obtained.
Find the two trees with the smallest weights.
Make them the left and right subtrees of a new tree and record
the sum of their weights in the root of the new tree as its weight.
86. HUFFMAN TREES
A tree constructed in this way is known as a Huffman
tree.
The code generated by such a tree is known as Huffman
Code.
Example
Consider the five-symbol alphabet {A, B, C, D, _} with the
following occurrence frequencies in a text made up of these
symbols:
symbol A B C D _
frequency 0.35 0.1 0.2 0.2 0.15
87. HUFFMAN TREES
Step 1
We start by creating n one node trees.
Since there are 5 characters, we will have 5 trees,
A B C D _
We now assign the frequencies of these characters as their
node weights.
A B C D _
0.35 0.1 0.2 0.2 0.15
Step 2
We need to select 2 trees with least weights. For this we first
arrange these trees in the increasing order of weights.
88. HUFFMAN TREES
We now select nodes B and _ to be merged as one tree.
The total weight of the resulting tree is recorded in the
root.
B     _     C     D     A
0.1   0.15  0.2   0.2   0.35
After the merge:
[Figure: a new tree with root weight 0.25 and children B (0.1)
and _ (0.15); the remaining trees are C (0.2), D (0.2), A (0.35).]
91. HUFFMAN TREES
The expected number of bits per character using
Huffman approach is given by,
Σ(i = 1 to n) (no. of bits for the ith character) × (frequency of the ith character)
For our example,
Expected no. of bits = 2*0.35 + 3*0.1 + 2*0.2 + 2*0.2 + 3*0.15
= 2.25
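Huffman's algorithm can be sketched with a min-heap of partial trees. The counter is only a tie-breaker so that heap entries never compare the tree payloads; symbol assignments within a level may differ from the slides' figure, but the code lengths and expected bits per character are the same:

```python
import heapq
from itertools import count

def huffman_code(freqs):
    """Build a Huffman tree by repeatedly merging the two lightest trees,
    then read codes off the tree (left edge = 0, right edge = 1)."""
    tie = count()                 # tie-breaker: avoids comparing tree payloads
    heap = [(f, next(tie), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), (t1, t2)))
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node: recurse
            walk(node[0], prefix + '0')
            walk(node[1], prefix + '1')
        else:                                # leaf: record the codeword
            codes[node] = prefix or '0'
    walk(heap[0][2], '')
    return codes

freqs = {'A': 0.35, 'B': 0.1, 'C': 0.2, 'D': 0.2, '_': 0.15}
codes = huffman_code(freqs)
avg = sum(freqs[s] * len(codes[s]) for s in freqs)
print(round(avg, 2))  # 2.25
```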
In the above algorithm,
Select() – this function selects an input from a[1:n], removes it, and assigns its value to x.
Feasible() – this function determines whether the subset x satisfies the problem conditions. In other words, it checks if x is a feasible solution. If so it returns true, otherwise returns false.
Union() – this function combines x with the existing set of solutions.
Lemma 1
This observation is self-explanatory. It says that the weights of all the objects put together do not exceed the capacity of the knapsack.
If this is the case, then we are free to select all the objects as a whole without the fear of exceeding the knapsack capacity. Hence 𝒙𝒊 = 1 for all i such that 1 ≤ i ≤ n is definitely true for this Lemma.
Lemma 2
This lemma means that, an optimal solution to the knapsack problem is that solution which fills the knapsack exactly to its capacity.
NOTE: k is the index position which is the point of difference between X and Y. This means both X and Y objects have the same values until index position k-1. They differ only at the kth position.