Divide-and-Conquer
- Matrix multiplication and Strassen's algorithm
- Median problem: in general, finding the kth largest element of an unsorted list of numbers using comparisons
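Matrix Multiplication
- How many operations are needed to multiply two 2 by 2 matrices?
$$\begin{bmatrix} r & s \\ t & u \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} e & f \\ g & h \end{bmatrix}$$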
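Traditional Approach
$$\begin{bmatrix} r & s \\ t & u \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} e & f \\ g & h \end{bmatrix}$$
- $r = ae + bg$
- $s = af + bh$
- $t = ce + dg$
- $u = cf + dh$
- 8 multiplications and 4 additions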
Extending to n by n matrices
$$\begin{bmatrix} R & S \\ T & U \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} E & F \\ G & H \end{bmatrix}$$
- Each letter represents an n/2 by n/2 matrix
- We can use the breakdown to form a divide-and-conquer algorithm
- $R = AE + BG$, $S = AF + BH$, $T = CE + DG$, $U = CF + DH$
- 8 multiplications of n/2 by n/2 matrices
- $T(n) = 8\,T(n/2) + \Theta(n^2)$
- $T(n) = \Theta(n^3)$
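Strassen's Approach
$$\begin{bmatrix} r & s \\ t & u \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} e & f \\ g & h \end{bmatrix}$$
- $p_1 = a(f - h)$
- $p_2 = (a + b)h$
- $p_3 = (c + d)e$
- $p_4 = d(g - e)$
- $p_5 = (a + d)(e + h)$
- $p_6 = (b - d)(g + h)$
- $p_7 = (a - c)(e + f)$
- $r = p_5 + p_4 - p_2 + p_6$
- $s = p_1 + p_2$
- $t = p_3 + p_4$
- $u = p_5 + p_1 - p_3 - p_7$
- 7 multiplications and 18 additions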
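The seven products above can be checked against the traditional formulas with a few lines of code. This is a minimal sketch for scalar entries; the function names `strassen_2x2` and `direct_2x2` are hypothetical and not part of the slides.

```python
def strassen_2x2(a, b, c, d, e, f, g, h):
    """Multiply [[a, b], [c, d]] by [[e, f], [g, h]] using 7 multiplications."""
    p1 = a * (f - h)
    p2 = (a + b) * h
    p3 = (c + d) * e
    p4 = d * (g - e)
    p5 = (a + d) * (e + h)
    p6 = (b - d) * (g + h)
    p7 = (a - c) * (e + f)
    r = p5 + p4 - p2 + p6
    s = p1 + p2
    t = p3 + p4
    u = p5 + p1 - p3 - p7
    return r, s, t, u


def direct_2x2(a, b, c, d, e, f, g, h):
    """Traditional product: 8 multiplications and 4 additions."""
    return a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h
```

Both functions return the entries in the order (r, s, t, u), so their results can be compared directly for any choice of inputs.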
Extending to n by n matrices (Strassen)
$$\begin{bmatrix} R & S \\ T & U \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} E & F \\ G & H \end{bmatrix}$$
- Each letter represents an n/2 by n/2 matrix
- We can use the breakdown to form a divide-and-conquer algorithm
- 7 multiplications of n/2 by n/2 matrices
- 18 additions of n/2 by n/2 matrices
- $T(n) = 7\,T(n/2) + \Theta(n^2)$
- $T(n) = \Theta(n^{\lg 7})$
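The recursive scheme can be sketched as follows, assuming square matrices stored as lists of lists whose size n is a power of 2 (padding for other sizes is not handled). The helper names `mat_add`, `mat_sub`, `split`, and `strassen` are illustrative choices, not names from the slides, and the recursion runs all the way down to 1 by 1 blocks rather than switching to the direct method at a crossover size.

```python
def mat_add(X, Y):
    """Entrywise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def mat_sub(X, Y):
    """Entrywise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def split(M):
    """Split M into four n/2 by n/2 quadrants: top-left, top-right, bottom-left, bottom-right."""
    h = len(M) // 2
    return ([row[:h] for row in M[:h]], [row[h:] for row in M[:h]],
            [row[:h] for row in M[h:]], [row[h:] for row in M[h:]])


def strassen(X, Y):
    """Multiply n by n matrices (n a power of 2) with 7 recursive products."""
    n = len(X)
    if n == 1:
        return [[X[0][0] * Y[0][0]]]
    A, B, C, D = split(X)
    E, F, G, H = split(Y)
    # Seven half-size products; the left factor always comes from X.
    P1 = strassen(A, mat_sub(F, H))
    P2 = strassen(mat_add(A, B), H)
    P3 = strassen(mat_add(C, D), E)
    P4 = strassen(D, mat_sub(G, E))
    P5 = strassen(mat_add(A, D), mat_add(E, H))
    P6 = strassen(mat_sub(B, D), mat_add(G, H))
    P7 = strassen(mat_sub(A, C), mat_add(E, F))
    # Combine into the four quadrants of the result.
    R = mat_add(mat_sub(mat_add(P5, P4), P2), P6)
    S = mat_add(P1, P2)
    T = mat_add(P3, P4)
    U = mat_sub(mat_sub(mat_add(P5, P1), P3), P7)
    top = [r + s for r, s in zip(R, S)]
    bottom = [t + u for t, u in zip(T, U)]
    return top + bottom
```

Because matrix multiplication is not commutative, the order of the factors in each product matters here, unlike in the scalar 2 by 2 case.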
Observations
- Comparison for n = 70:
  - Direct multiplication: $70^3 = 343{,}000$
  - Strassen: $70^{\lg 7}$ is approximately 150,000
- Crossover point is typically around n = 20
- Hopcroft and Kerr have shown that 7 multiplications are necessary to multiply 2 by 2 matrices
- But we can do better with larger matrices
- The bound cited here is $O(n^{2.376})$ by Coppersmith and Winograd, which is not practical; later results have lowered the exponent slightly
- Best lower bound is $\Omega(n^2)$ (since there are $n^2$ entries)
- Matrix multiplication can be used in some graph algorithms as a fundamental step, so theoretical improvements in the efficiency of this operation can lead to theoretical improvements in those algorithms
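The n = 70 comparison can be reproduced with a short calculation. This sketch only evaluates the leading terms $n^3$ and $n^{\lg 7}$; it ignores constant factors and the cost of the additions, so it is a rough comparison rather than an operation count. The function names are hypothetical.

```python
import math


def direct_cost(n):
    """Leading term for the traditional method: n^3."""
    return n ** 3


def strassen_cost(n):
    """Leading term for Strassen's method: n^(lg 7), with lg = log base 2."""
    return n ** math.log2(7)


n = 70
print(direct_cost(n))           # exact integer n^3
print(round(strassen_cost(n)))  # rounded value of n^(lg 7)
```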
Median Problem
- How quickly can we find the median (or in general the kth largest element) of an unsorted list of numbers?
- Two approaches:
  - Quicksort partition algorithm: expected $\Theta(n)$ time but $\Theta(n^2)$ time in the worst case
  - Deterministic: $\Theta(n)$ time in the worst case
Quicksort Approach
- int Select(int A[], k, low, high)
  - Choose a pivot item
  - Compare all items to this pivot element
  - If pivot is kth item, return pivot
  - Else update low, high, k and recurse on partition that contains correct item
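The Select outline above can be sketched in a few lines. This version is an assumption-laden illustration rather than the slides' in-place routine: it picks the pivot uniformly at random, builds new lists instead of updating low and high, and uses the hypothetical name `quickselect`. Here k = 1 means the largest element.

```python
import random


def quickselect(items, k):
    """Return the kth largest element of items (k = 1 is the maximum)."""
    if not 1 <= k <= len(items):
        raise ValueError("k out of range")
    items = list(items)
    while True:
        pivot = random.choice(items)
        # Compare every item to the pivot and partition around it.
        larger = [x for x in items if x > pivot]
        equal = [x for x in items if x == pivot]
        smaller = [x for x in items if x < pivot]
        if k <= len(larger):
            items = larger                    # answer is among the larger items
        elif k <= len(larger) + len(equal):
            return pivot                      # pivot is the kth largest
        else:
            k -= len(larger) + len(equal)     # discard larger items and pivot copies
            items = smaller
```

Keeping a separate `equal` list means duplicates of the pivot are handled without extra recursion, at the cost of using extra memory compared with an in-place partition.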
Probabilistic Analysis
- Assume each of the n! permutations is equally likely
- What is the probability that the ith largest item is compared to the jth largest item (with i < j)?
  - If k lies strictly between i and j, then $2/(j - i + 1)$
  - If $k \le i$, then $2/(j - k + 1)$
  - If $k \ge j$, then $2/(k - i + 1)$
Cases where (i..j) does not contain k
- Case $k \ge j$:
$$\sum_{i=1}^{k-1} \sum_{j=i+1}^{k} \frac{2}{k-i+1} = \sum_{i=1}^{k-1} \frac{2(k-i)}{k-i+1} = \sum_{i=1}^{k-1} \frac{2i}{i+1} = 2\sum_{i=1}^{k-1} \frac{i}{i+1} \le 2(k-1)$$
  (the third expression replaces k-i with i)
- Case $k \le i$:
$$\sum_{j=k+1}^{n} \sum_{i=k}^{j-1} \frac{2}{j-k+1} = \sum_{j=k+1}^{n} \frac{2(j-k)}{j-k+1} = \sum_{j=1}^{n-k} \frac{2j}{j+1} = 2\sum_{j=1}^{n-k} \frac{j}{j+1} \le 2(n-k)$$
  (the third expression replaces j-k with j and changes the bounds)
- Total for both cases is at most $2(k-1) + 2(n-k) = 2n - 2$
Case where (i..j) contains k
- At most 1 interval of size 3 contains k: i = k-1, j = k+1
- At most 2 intervals of size 4 contain k: i = k-1, j = k+2 and i = k-2, j = k+1
- In general, at most q-2 intervals of size q contain k, and each such pair is compared with probability $2/(j-i+1) = 2/q$
- Thus we get
$$\sum_{q=3}^{n} (q-2)\,\frac{2}{q} \le 2(n-2)$$
- Summing all cases, the expected number of comparisons is at most $(2n - 2) + 2(n - 2) = 4n - 6$, which is less than 4n
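The expected number of comparisons can also be evaluated numerically by summing the pairwise comparison probabilities from the analysis over all pairs i < j. The sketch below does this directly; `compare_probability` and `expected_comparisons` are hypothetical names, and the three branches are the three cases of the probabilistic analysis.

```python
def compare_probability(i, j, k):
    """Probability that the ith and jth largest items are compared (i < j)
    while selecting the kth largest item."""
    if i < k < j:
        return 2 / (j - i + 1)
    if k <= i:
        return 2 / (j - k + 1)
    return 2 / (k - i + 1)  # remaining case: k >= j


def expected_comparisons(n, k):
    """Sum the comparison probability over all pairs 1 <= i < j <= n."""
    return sum(compare_probability(i, j, k)
               for i in range(1, n + 1)
               for j in range(i + 1, n + 1))
```

Evaluating `expected_comparisons(n, k)` for a few values of n and k is a quick way to sanity-check a linear bound on the expected cost.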
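Best case, Worst-case
- Best case running time? The first pivot is the kth item, so one pass of n-1 comparisons suffices
- What happens in the worst case?
  - Pivot element chosen is always what? The largest or smallest remaining item
  - This leads to comparing all possible pairs
  - This leads to $\Theta(n^2)$ comparisons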
Deterministic O(n) approach
- Need to guarantee a good pivot element while doing O(n) work to find the pivot element
- int Select(int A[], k, low, high)
  - Choosing pivot element
    - Divide into groups of 5
    - For each group of 5, find that group's median
    - Find the median of the medians
  - Compare remaining items directly to median to identify its current exact placement
  - Update low, high, k and recurse on partition that contains correct item
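A compact sketch of the deterministic approach follows. It makes several simplifying assumptions that are not in the slides: it builds new lists rather than updating low and high in place, finds each group's median by sorting the group (constant work for 5 items), and uses the hypothetical names `median_of_five` and `select`. As before, k = 1 means the largest element.

```python
def median_of_five(group):
    """Median of a group of at most 5 items, found by sorting the group."""
    return sorted(group)[len(group) // 2]


def select(items, k):
    """Return the kth largest element of items (k = 1 is the maximum)."""
    if not 1 <= k <= len(items):
        raise ValueError("k out of range")
    items = list(items)
    if len(items) <= 5:
        return sorted(items, reverse=True)[k - 1]
    # Choosing the pivot: medians of groups of 5, then the median of those medians.
    groups = [items[i:i + 5] for i in range(0, len(items), 5)]
    medians = [median_of_five(g) for g in groups]
    pivot = select(medians, (len(medians) + 1) // 2)
    # Compare all items to the pivot to find its exact placement.
    larger = [x for x in items if x > pivot]
    equal = [x for x in items if x == pivot]
    smaller = [x for x in items if x < pivot]
    # Recurse on the partition that contains the kth largest item.
    if k <= len(larger):
        return select(larger, k)
    if k <= len(larger) + len(equal):
        return pivot
    return select(smaller, k - len(larger) - len(equal))
```

The two recursive calls correspond to the two recursive terms in the running-time analysis: one on the list of group medians and one on the surviving partition.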
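Guarantees on the pivot element
- Median of medians is guaranteed to be no larger than all the red colored items. Why? In each group whose median is at least the median of medians, the group's median and its two larger items are all at least the median of medians
- How many red items are there? Half of the n/5 groups contribute 3 items each: ½ (n/5) * 3 = 3n/10
- Likewise, median of medians is guaranteed to be no smaller than the 3n/10 blue colored items (the median and the two smaller items in each group whose median is at most the median of medians)
- Thus the rank of the median of medians is in the range 3n/10 to 7n/10, so the partition we recurse on has at most 7n/10 items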
Analysis of running time
- int Select(int A[], k, low, high)
  - Choosing pivot element
    - For each group of 5, find that group's median
    - Find the median of the medians
  - Compare remaining items directly to median
  - Recurse on correct partition
- Analysis
  - Choosing pivot element
    - Group medians: $c_1 \cdot n/5$, where $c_1$ is the cost of finding the median of 5 items
    - Median of the medians: recurse on a problem of size n/5
  - Comparing remaining items to the median: n comparisons
  - Recursing on the correct partition: a problem of size at most 7n/10
- $T(n) = T(7n/10) + T(n/5) + O(n)$
Solving recurrence relation
- $T(n) = T(7n/10) + T(n/5) + O(n)$
- Key observation: 7/10 + 1/5 = 9/10 < 1
- Prove $T(n) \le cn$ for some constant c by induction on n, writing the O(n) term as at most dn
- Inductive step: $T(n) \le 7cn/10 + cn/5 + dn = 9cn/10 + dn$
- Need $9cn/10 + dn \le cn$
- Thus $c/10 \ge d$, that is, $c \ge 10d$

lecture 15
