Greedy Algorithms: Faster Algorithms for Well-Behaved Optimization Problems
Dynamic Programming is Blind

The algorithms we have studied are relatively inefficient:
- Matrix chain multiplication: O(n³)
- Longest common subsequence: O(mn)
- Optimal binary search trees: O(n³)

Why? We have many choices in computing an optimal solution, and we exhaustively (blindly) check all of them. We would much rather have a way to decide which choice is best, or at least to restrict the choices we have to try. Greedy algorithms work for problems where we can decide which choice is best.
Making a Greedy Choice

Greedy choice: we make the choice that looks best at the moment. (Every time we make a choice, we greedily try to maximize our profit.)

An example where this does not work: finding a longest monotonically increasing subsequence. Given the sequence ‹3 4 5 17 7 8 9›, a longest monotonically increasing subsequence is ‹3 4 5 7 8 9›. The greedy choice after choosing ‹3 4 5› is to choose 17, which precludes the addition of any further element and thus results in the suboptimal sequence ‹3 4 5 17›.
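To see the failure programmatically, here is a small Python sketch (our addition, not from the slides; the function names are invented) comparing the greedy strategy with a correct O(n²) dynamic-programming solution:

    def greedy_lis(seq):
        # Greedy: take every element that extends the current increasing run.
        picked = []
        for x in seq:
            if not picked or x > picked[-1]:
                picked.append(x)
        return picked

    def dp_lis(seq):
        # Standard O(n^2) DP: best[i] = length of the longest increasing
        # subsequence ending at position i; prev[] allows reconstruction.
        n = len(seq)
        best, prev = [1] * n, [-1] * n
        for i in range(n):
            for j in range(i):
                if seq[j] < seq[i] and best[j] + 1 > best[i]:
                    best[i], prev[i] = best[j] + 1, j
        i = max(range(n), key=lambda k: best[k])
        out = []
        while i != -1:
            out.append(seq[i])
            i = prev[i]
        return out[::-1]

    seq = [3, 4, 5, 17, 7, 8, 9]
    print(greedy_lis(seq))   # [3, 4, 5, 17]       -- greedy is suboptimal
    print(dp_lis(seq))       # [3, 4, 5, 7, 8, 9]  -- optimal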
A Scheduling Problem

     i  |  1   2   3   4   5   6   7   8   9  10  11
    s_i |  1   3   0   5   3   5   6   8   8   2  12
    f_i |  4   5   6   7   8   9  10  11  12  13  14

Subsets of mutually compatible activities: {a_3, a_9, a_11}, {a_1, a_4, a_8, a_11}, {a_2, a_4, a_9, a_11}.

[Figure: the activities drawn as intervals on a timeline.]
A Scheduling Problem

A classroom can be used for one class at a time, and there are n classes that want to use it. Every class has a corresponding time interval I_j = [s_j, f_j) during which the room would be needed for this class. Our goal is to choose a maximum number of classes that can be scheduled without two classes ever using the classroom at the same time. Assume that the classes are sorted by increasing finish time, that is, f_1 < f_2 < … < f_n.
The Structure of an Optimal Schedule

Let S_{i,j} be the set of classes that begin after time f_i and end before time s_j; that is, these classes can be scheduled between classes C_i and C_j. We can add two fictitious classes C_0 and C_{n+1} with f_0 = –∞ and s_{n+1} = +∞; then S_{0,n+1} is the set of all classes.

Assume that class C_k is part of an optimal schedule of the classes in S_{i,j}. Then i < k < j, and the optimal schedule consists of a maximal subset of S_{i,k}, {C_k}, and a maximal subset of S_{k,j}.

[Figure: classes C_i, C_k, C_j on a timeline, with S_{i,k} between C_i and C_k and S_{k,j} between C_k and C_j.]
The Structure of an Optimal Schedule

Hence, if Q(i,j) is the size of an optimal schedule for set S_{i,j}, we have

    Q(i,j) = 0                                             if S_{i,j} = ∅
    Q(i,j) = max{ Q(i,k) + Q(k,j) + 1 : C_k ∈ S_{i,j} }    otherwise.
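A blind, memoized implementation of this recurrence is easy to write. The sketch below (our addition, using the activity table from the earlier slide and invented names) tries every possible choice of C_k, which is exactly the exhaustive behavior the greedy choice will let us avoid:

    from functools import lru_cache

    # (start, finish) pairs from the table, sorted by finish time.
    classes = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 8), (5, 9),
               (6, 10), (8, 11), (8, 12), (2, 13), (12, 14)]
    n = len(classes)
    s = [float("-inf")] + [c[0] for c in classes] + [float("inf")]  # s_{n+1} = +inf
    f = [float("-inf")] + [c[1] for c in classes] + [float("inf")]  # f_0 = -inf

    @lru_cache(maxsize=None)
    def Q(i, j):
        # Size of an optimal schedule for S_{i,j}: try every C_k in S_{i,j}.
        best = 0
        for k in range(i + 1, j):
            if f[i] <= s[k] and f[k] <= s[j]:   # C_k fits between C_i and C_j
                best = max(best, Q(i, k) + Q(k, j) + 1)
        return best

    print(Q(0, n + 1))   # 4, e.g. {a_1, a_4, a_8, a_11}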
Making a Greedy Choice

Lemma: There exists an optimal schedule for the set S_{i,j} that contains the class C_k in S_{i,j} that finishes first, that is, the class C_k in S_{i,j} with minimal index k.

Lemma: If we choose class C_k as in the previous lemma, then the set S_{i,k} is empty.
A Recursive Greedy Algorithm

Recursive-Activity-Selector(s, f, i, j)
    m ← i + 1
    while m < j and s_m < f_i
        do m ← m + 1
    if m < j
        then return {a_m} ∪ Recursive-Activity-Selector(s, f, m, j)
        else return ∅
A Recursive Greedy Algorithm

Recursive-Schedule(S)
    if |S| = 0
        then return ∅
    let C_k be the class with minimal finish time in S
    remove C_k and all classes that begin before C_k ends from S; let S' be the resulting set
    O ← Recursive-Schedule(S')
    return O ∪ {C_k}

Depending on the data structure we use to store S, this algorithm has running time O(n²) or O(n log n). In addition, the algorithm has a certain overhead for maintaining the recursion stack, because it is recursive.
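In Python, the recursive greedy algorithm might be sketched as follows (our rendering; the dummy entries at index 0 and the inclusive bound n are our conventions):

    def recursive_activity_selector(s, f, i, n):
        # Pick the first class (smallest finish time) that starts no earlier
        # than f[i], then recurse on the classes after it. f must be sorted.
        m = i + 1
        while m <= n and s[m] < f[i]:   # skip classes that start before C_i ends
            m += 1
        if m <= n:
            return [m] + recursive_activity_selector(s, f, m, n)
        return []

    s = [0, 1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]   # index 0 is a dummy
    f = [0, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
    print(recursive_activity_selector(s, f, 0, 11))   # [1, 4, 8, 11]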
An Iterative Linear-Time Greedy Algorithm

Iterative-Schedule(S)
    n ← |S|
    m ← –∞
    O ← ∅
    for i = 1..n
        do if s_i ≥ m
               then O ← O ∪ {C_i}
                    m ← f_i
    return O
The running time of this algorithm is obviously linear. Its correctness follows from the following lemma:

Lemma: Let O be the current set of selected classes, and let C_k be the last class added to O. Then any class C_l, l > k, that conflicts with a class in O conflicts with C_k.
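A Python sketch of the iterative algorithm (our addition; S is represented as a list of (start, finish) pairs already sorted by finish time):

    def iterative_schedule(classes):
        # One linear scan: keep a class iff it starts at or after the finish
        # time of the last class kept.
        selected = []
        last_finish = float("-inf")
        for start, finish in classes:
            if start >= last_finish:
                selected.append((start, finish))
                last_finish = finish
        return selected

    classes = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 8), (5, 9),
               (6, 10), (8, 11), (8, 12), (2, 13), (12, 14)]
    print(iterative_schedule(classes))   # [(1, 4), (5, 7), (8, 11), (12, 14)]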
A Variation of the Problem

Instead of maximizing the number of classes we schedule, we now want to maximize the total time the classroom is in use. Our dynamic-programming approach remains unchanged, but none of the obvious greedy choices works:
- Choose the class that starts earliest/latest
- Choose the class that finishes earliest/latest
- Choose the longest class

For example, choosing the longest class fails on [0, 10), [0, 6), [6, 13): the longest class uses the room for 10 time units, while the other two together use it for 13.
When Do Greedy Algorithms Produce an Optimal Solution?

The problem must have two properties:
- Greedy choice property: an optimal solution can be obtained by making choices that seem best at the time, without considering their implications for solutions to subproblems.
- Optimal substructure: an optimal solution can be obtained by augmenting the partial solution constructed so far with an optimal solution of the remaining subproblem.
Text Encoding

Our next goal is to develop a code that represents a given text as compactly as possible. A standard encoding is ASCII, which represents every character using 7 bits:

"An English sentence"

1000001 (A) 1101110 (n) 0100000 ( ) 1000101 (E) 1101110 (n) 1100111 (g)
1101100 (l) 1101001 (i) 1110011 (s) 1101000 (h) 0100000 ( ) 1110011 (s)
1100101 (e) 1101110 (n) 1110100 (t) 1100101 (e) 1101110 (n) 1100011 (c)
1100101 (e)

= 133 bits ≈ 17 bytes
Of course, this is wasteful because the text uses only 12 distinct characters, which we can encode using 4 bits each:

‹space› = 0000  A = 0001  E = 0010  c = 0011  e = 0100  g = 0101  h = 0110
i = 0111  l = 1000  n = 1001  s = 1010  t = 1011

Then we encode the phrase as

0001 (A) 1001 (n) 0000 ( ) 0010 (E) 1001 (n) 0101 (g) 1000 (l) 0111 (i)
1010 (s) 0110 (h) 0000 ( ) 1010 (s) 0100 (e) 1001 (n) 1011 (t) 0100 (e)
1001 (n) 0011 (c) 0100 (e)

This requires 76 bits ≈ 10 bytes.
An even better code is given by the following encoding:

‹space› = 000  A = 0010  E = 0011  s = 010  c = 0110  g = 0111  h = 1000
i = 1001  l = 1010  t = 1011  e = 110  n = 111

Then we encode the phrase as

0010 (A) 111 (n) 000 ( ) 0011 (E) 111 (n) 0111 (g) 1010 (l) 1001 (i)
010 (s) 1000 (h) 000 ( ) 010 (s) 110 (e) 111 (n) 1011 (t) 110 (e) 111 (n)
0110 (c) 110 (e)

This requires 65 bits ≈ 9 bytes.
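As a sanity check, the following Python snippet (our addition) recomputes all three totals; the 4-bit dictionary it builds is exactly the fixed-length code from the previous slide:

    text = "An English sentence"

    fixed4 = {c: format(i, "04b") for i, c in enumerate(" AEceghilnst")}
    variable = {" ": "000", "A": "0010", "E": "0011", "s": "010", "c": "0110",
                "g": "0111", "h": "1000", "i": "1001", "l": "1010", "t": "1011",
                "e": "110", "n": "111"}

    print(7 * len(text))                          # 133 (7-bit ASCII)
    print(sum(len(fixed4[c]) for c in text))      # 76
    print(sum(len(variable[c]) for c in text))    # 65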
Codes That Can Be Decoded

Fixed-length codes: every character is encoded using the same number of bits. To determine the boundaries between characters, we form groups of w bits, where w is the length of a codeword. Examples: ASCII, our first improved code.

Prefix codes: no codeword is a prefix of another codeword. Examples: fixed-length codes, Huffman codes.
Why Prefix Codes?

Consider a code that is not a prefix code:

a = 01  m = 10  n = 111  o = 0  r = 11  s = 1  t = 0011

Now you send a fan letter to your favorite movie star. One of the sentences is "You are a star." You encode "star" as "1 0011 01 11". Your idol receives the letter and decodes the text using your coding table:

100110111 = 10 0 11 0 111 = "moron"

Oops, you have just insulted your idol. Codes that are not prefix codes can be ambiguous.
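The ambiguity is easy to verify mechanically. This Python sketch (our addition) enumerates every way to split the encoded string into codewords and confirms that both readings are valid:

    code = {"a": "01", "m": "10", "n": "111", "o": "0",
            "r": "11", "s": "1", "t": "0011"}
    encoded = "".join(code[c] for c in "star")    # "100110111"

    def all_decodings(bits, code, prefix=""):
        # Yield every decoding of `bits` as a concatenation of codewords.
        if not bits:
            yield prefix
        for ch, word in code.items():
            if bits.startswith(word):
                yield from all_decodings(bits[len(word):], code, prefix + ch)

    parses = list(all_decodings(encoded, code))
    print("star" in parses, "moron" in parses)    # True True (among many parses)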
Why Are Prefix Codes Unambiguous?

It suffices to show that the first character can be decoded unambiguously. We then remove this character and are left with the problem of decoding the first character of the remaining text, and so on until the whole text has been decoded.

Assume that there are two characters c and c' that could potentially be the first character in the text, with encodings x_0x_1…x_k and y_0y_1…y_l, and assume further that k ≤ l. Since both c and c' can occur at the beginning of the text, we have x_i = y_i for 0 ≤ i ≤ k; that is, x_0x_1…x_k is a prefix of y_0y_1…y_l, contradicting the prefix property.
Representing a Prefix-Code Dictionary

Our example:

‹space› = 000  A = 0010  E = 0011  s = 010  c = 0110  g = 0111  h = 1000
i = 1001  l = 1010  t = 1011  e = 110  n = 111

[Figure: the dictionary as a binary tree; left edges are labeled 0, right edges 1, and the leaves from left to right are ‹space›, A, E, s, c, g, h, i, l, t, e, n.]
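Because of the prefix property, decoding needs no lookahead: accumulate bits until they match a codeword, emit the character, and start over. A minimal Python sketch (our addition):

    code = {" ": "000", "A": "0010", "E": "0011", "s": "010", "c": "0110",
            "g": "0111", "h": "1000", "i": "1001", "l": "1010", "t": "1011",
            "e": "110", "n": "111"}
    by_word = {v: k for k, v in code.items()}

    def decode(bits):
        out, word = [], ""
        for b in bits:
            word += b
            if word in by_word:   # prefix property: the first match is the only one
                out.append(by_word[word])
                word = ""
        return "".join(out)

    encoded = "".join(code[c] for c in "An English sentence")
    print(decode(encoded))        # An English sentence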
The Cost of Prefix Codes

Let C be the set of characters in the text to be encoded, and let f(c) be the frequency of character c. Let d_T(c) be the depth of the node (character) c in the tree T representing the code. Then

    B(T) = Σ_{c ∈ C} f(c)·d_T(c)

is the number of bits required to encode the text using the code represented by tree T. We call B(T) the cost of tree T.

Observation: In a tree T representing an optimal prefix code, every internal node has two children.
Huffman's Algorithm

Huffman(C)
    n ← |C|
    Q ← C
    for i = 1..n – 1
        do allocate a new node z
           left[z] ← x ← Delete-Min(Q)
           right[z] ← y ← Delete-Min(Q)
           f[z] ← f[x] + f[y]
           Insert(Q, z)
    return Delete-Min(Q)

Example character set with frequencies: a:45, b:13, c:12, d:16, e:9, f:5.
On this example, the algorithm merges the two least frequent nodes in each iteration: f:5 and e:9 into a node of frequency 14; c:12 and b:13 into 25; the 14-node and d:16 into 30; 25 and 30 into 55; and finally a:45 and 55 into the root of frequency 100. Labeling every left edge 0 and every right edge 1 yields the codes a = 0, c = 100, b = 101, f = 1100, e = 1101, and d = 111.
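A compact Python implementation (our addition; representing internal tree nodes as nested (left, right) tuples is our choice) uses a binary heap as the priority queue, giving O(n log n) overall:

    import heapq
    from itertools import count

    def huffman(freqs):
        # Build the Huffman tree: repeatedly merge the two least frequent nodes.
        tiebreak = count()            # keeps heap entries comparable on ties
        heap = [(f, next(tiebreak), ch) for ch, f in freqs.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            fx, _, x = heapq.heappop(heap)
            fy, _, y = heapq.heappop(heap)
            heapq.heappush(heap, (fx + fy, next(tiebreak), (x, y)))
        _, _, root = heap[0]

        codes = {}
        def walk(node, word):
            if isinstance(node, tuple):    # internal node: 0 left, 1 right
                walk(node[0], word + "0")
                walk(node[1], word + "1")
            else:
                codes[node] = word
        walk(root, "")
        return codes

    print(huffman({"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}))
    # {'a': '0', 'c': '100', 'b': '101', 'f': '1100', 'e': '1101', 'd': '111'}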
Greedy Choice

Why is merging the two nodes with smallest frequency into a subtree a greedy choice? We can alternatively define B(T) as

    B(T) = Σ_{v ∈ T} B(v),

where B(v) = 0 if v is a leaf and B(v) = f(left[v]) + f(right[v]) if v is an internal node. By merging the two nodes with lowest frequency, we greedily try to minimize the cost the new node contributes to B(T).
Greedy Choice

Lemma: There exists an optimal prefix code such that the two characters with smallest frequency are siblings and have maximal depth in T.

Let x and y be two such characters, and let T be a tree representing an optimal prefix code. Let a and b be two sibling leaves of maximal depth in T. Assume w.l.o.g. that f(x) ≤ f(y) and f(a) ≤ f(b). This implies that f(x) ≤ f(a) and f(y) ≤ f(b). Let T' be the tree obtained by exchanging a with x and b with y.

[Figure: tree T with x and y at smaller depths and siblings a, b at maximal depth; tree T' with the pairs exchanged.]
The cost difference between trees T and T' is

    B(T) – B(T') = f(x)·d_T(x) + f(y)·d_T(y) + f(a)·d_T(a) + f(b)·d_T(b)
                 – f(x)·d_T(a) – f(y)·d_T(b) – f(a)·d_T(x) – f(b)·d_T(y)
                 = (f(a) – f(x))·(d_T(a) – d_T(x)) + (f(b) – f(y))·(d_T(b) – d_T(y))
                 ≥ 0,

since the exchange only swaps the depths of the exchanged leaves (d_{T'}(x) = d_T(a), and so on), f(a) ≥ f(x), f(b) ≥ f(y), and a and b have maximal depth. Hence B(T') ≤ B(T), and T' is also optimal.
Optimal Substructure

After joining two nodes x and y by making them children of a new node z, the algorithm treats z as a leaf with frequency f(z) = f(x) + f(y). Let C' be the character set in which x and y are replaced by the single character z with frequency f(z) = f(x) + f(y), and let T' be an optimal tree for C'. Let T be the tree obtained from T' by making x and y children of z. We observe the following relationship between B(T) and B(T'):

    B(T) = B(T') + f(x) + f(y)

Indeed, x and y sit one level deeper in T than z does in T', so f(x)·d_T(x) + f(y)·d_T(y) = (f(x) + f(y))·(d_{T'}(z) + 1) = f(z)·d_{T'}(z) + f(x) + f(y).
Lemma: If T' is optimal for C', then T is optimal for C.

Assume the contrary. Then there exists a better tree T'' for C. Also, there exists a tree T''' at least as good as T'' for C in which x and y are sibling leaves of maximal depth. Removing x and y from T''' turns their parent into a leaf, which we can associate with z. The cost of the resulting tree is B(T''') – f(x) – f(y) < B(T) – f(x) – f(y) = B(T'). This contradicts the optimality of T'. Hence, T must be optimal for C.
Summary

Greedy algorithms are efficient algorithms for optimization problems that exhibit two properties:
- Greedy choice property: an optimal solution can be obtained by making locally optimal choices.
- Optimal substructure: an optimal solution contains within it optimal solutions to smaller subproblems.

If only optimal substructure is present, dynamic programming may be a viable approach; the greedy choice property is what allows us to obtain faster algorithms than dynamic programming can provide.
