Algorithm Design and Complexity - Course 4 - Heaps and Dynamic Programming


Course 4 for the Algorithm Design and Complexity course at the Faculty of Engineering in Foreign Languages - Politehnica University of Bucharest, Romania

Published in: Education, Technology, Business

  1. 1. Algorithm Design and Complexity Course 4
  2. 2. Overview: Data Structures: Heaps; Heap Sort; Dynamic Programming; Longest Common Subsequence; Chain Matrix Multiplication; How to recognize problems that can be solved using DP?
  3. 3. Data Structures: Heaps. A heap is a data structure that: is stored as an array in memory; can be represented/viewed as an almost complete binary tree; respects the heap property. Almost complete binary tree: every level is full except possibly the last one, which is filled from left to right.
  4. 4. Data Structures: Heaps (2). An array A[1..n] can be represented as a binary tree: A[1] is the root of the tree; the parent of A[i] is A[parent(i)], with parent(i) = floor(i/2); the left child of A[i] is A[left(i)], with left(i) = 2*i; the right child of A[i] is A[right(i)], with right(i) = 2*i + 1. Height of the heap: Θ(log n), where the height is the number of edges from the root to the farthest leaf.
  5. 5. Heap Property. Max heaps: for each node (except the root), A[i] <= A[parent(i)]; conclusion: the root (A[1]) contains the largest element. Min heaps: for each node (except the root), A[i] >= A[parent(i)]; conclusion: the root (A[1]) contains the smallest element. Heap Sort uses max heaps.
  6. 6. Operations for Heaps. Discussion for max heaps (similar for min heaps). Deleting the root element: simply remove A[1]. However, the heap is broken as it has no root element, so we need to find a new root: move the last element in the heap to the root. Now the heap property is (almost certainly) lost, so we need to heapify-down (also called bubble-down, percolate-down, sift-down, or down-heap). Heapify-down an element: compare it to its children; if it is at least as large as both children, stop; else, swap it with the larger child and continue.
  7. 7. Operations for Heaps (2). Number of swaps when deleting the root: at most the height of the heap, so O(log n).
  8. 8. Operations for Heaps (3). Inserting a new element: add it as the last element in the heap. However, the heap property may be broken if the added element is larger than its parent, so we need to find its correct position in the heap => heapify-up (also called bubble-up, percolate-up, sift-up, or up-heap). Heapify-up an element: compare it with its parent; if it is not larger than its parent, stop; else, swap it with the parent and continue.
  9. 9. Operations for Heaps (4). Number of swaps when inserting: at most the height of the heap, so O(log n).
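The insertion procedure described above can be sketched in Python. This is a minimal illustration, not code from the course: the name `heap_insert` is hypothetical, and the heap is stored in a 0-based list, so the parent of index i is (i - 1) // 2 instead of floor(i/2).

```python
def heap_insert(heap, value):
    """Insert value into a max heap stored in a 0-based Python list."""
    heap.append(value)                # add it as the last element
    i = len(heap) - 1
    while i > 0:
        parent = (i - 1) // 2         # 0-based counterpart of floor(i/2)
        if heap[i] <= heap[parent]:   # not larger than its parent: stop
            break
        heap[i], heap[parent] = heap[parent], heap[i]
        i = parent                    # continue from the parent position


heap = []
for v in [3, 9, 2, 7, 10]:
    heap_insert(heap, v)
```

Each pass through the loop moves the new element one level up, which is why the number of swaps is bounded by the height of the heap.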
  10. 10. Example: Heapify-down
        MAX-HEAPIFY-DOWN(A, i, n)
          l = LEFT(i)
          r = RIGHT(i)
          if (l ≤ n AND A[l] > A[i])
            then largest = l
            else largest = i
          if (r ≤ n AND A[r] > A[largest])
            then largest = r
          if (largest != i)
            then SWAP(A[i], A[largest])
                 MAX-HEAPIFY-DOWN(A, largest, n)
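As a hedged illustration, here is an iterative Python variant of the heapify-down pseudocode, together with the root deletion it supports. It assumes 0-based indexing (children of i are 2*i + 1 and 2*i + 2), and `extract_max` is a hypothetical helper name, not part of the slides.

```python
def max_heapify_down(a, i, n):
    """Sift a[i] down inside the max heap a[0..n-1] (0-based indices)."""
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        largest = i
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:              # at least as large as both children
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest                   # continue from the child position


def extract_max(a):
    """Remove and return the root of a non-empty max heap."""
    root = a[0]
    last = a.pop()                    # take the last element out of the heap
    if a:
        a[0] = last                   # it becomes the new root
        max_heapify_down(a, 0, len(a))
    return root
```

The loop replaces the tail recursion of the pseudocode; the comparisons and swaps are the same.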
  11. 11. Building a Heap. Given an array A[1..n] with random elements, how do we transform A into a heap? We need to satisfy the heap property. A simple solution would be: start with a simple heap with only one element; insert the second element into this heap => a heap with two elements; insert the third element; and so on, for a total of n - 1 insertions of a new element into a heap. Complexity? Appears to be O(n log n).
  12. 12. Building a Heap (2). A better solution exists:
        BUILD-MAX-HEAP(A, n)
          FOR (i = floor(n/2) .. 1)       // i goes down from floor(n/2) to 1
            MAX-HEAPIFY-DOWN(A, i, n)
      We only need to heapify-down the first floor(n/2) elements. This solution relies on the fact that the last n - floor(n/2) elements in the array are leaves, so they are already correct heaps of size 1.
  13. 13. Loop invariants. Quick info: loop invariants are useful to prove the correctness of an algorithm; they are properties that hold throughout the execution of a loop. Three steps: Initialization: the invariant is true prior to the first iteration of the loop. Maintenance: if it is true before an iteration of the loop, it remains true before the next iteration. Termination: when the loop terminates, the invariant gives us a useful property that helps show that the algorithm is correct. Loop invariants are similar to mathematical induction: Initialization = base case; Maintenance = inductive step.
  14. 14. Remember: Insertion Sort
        InsertionSort(A[1..n])
          FOR (j = 2 .. n)
            x = A[j]                       // element to be inserted
            i = j - 1                      // position on the right side of the sorted sub-array
            WHILE (i > 0 AND x < A[i])     // while not in position
              A[i + 1] = A[i]              // move to right
              i--                          // continue
            A[i + 1] = x
          RETURN A
  15. 15. Insertion Sort – Loop Invariants. Inner loop (index i): all the elements in A[i+2..j] are greater than x (position i+1 is the free slot; before the first shift it still holds x itself). Termination: before inserting x, all the elements to its right in A[1..j] are greater than it. Outer loop (index j): (1) A[1..j-1] is sorted. Termination (j = n+1): A[1..n+1-1] = A[1..n] is sorted. (2) A[1..j-1] contains the same elements that were in the first j-1 positions at the beginning of the sort. Termination: A[1..n+1-1] = A[1..n] contains the same elements that were in A at the beginning.
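One way to make these invariants concrete is to check them with assertions while the sort runs. The sketch below is only an illustration under a few assumptions: it uses 0-based Python lists, the function name `insertion_sort` is hypothetical, and the inner invariant is checked on the slice a[i+2..j], which holds at every point of the loop.

```python
def insertion_sort(a):
    """Sort the list a in place, asserting the loop invariants (0-based)."""
    original = list(a)
    for j in range(1, len(a)):
        # Outer invariant: a[0..j-1] is sorted and holds the original first j elements.
        assert a[:j] == sorted(original[:j])
        x = a[j]                      # element to be inserted
        i = j - 1
        while i >= 0 and x < a[i]:
            a[i + 1] = a[i]           # move to the right
            i -= 1
            # Inner invariant: every element in a[i+2..j] is greater than x.
            assert all(v > x for v in a[i + 2 : j + 1])
        a[i + 1] = x
    return a
```

The assertions make each call slower (the prefix check alone is O(j log j)), so they belong in a teaching or debugging version only.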
  16. 16. Building a Heap – Loop Invariant
        BUILD-MAX-HEAP(A, n)
          FOR (i = floor(n/2) .. 1)
            MAX-HEAPIFY-DOWN(A, i, n)
      Invariant: at the start of every iteration of the FOR loop, each node i+1, i+2, ..., n is the root of a max heap. Initialization: the elements at positions floor(n/2)+1 .. n are leaves of the tree => they are roots of trivial heaps of size 1. Maintenance: the children of node i have indices larger than i, so they are roots of max heaps (the invariant holds at step i); heapifying down the element at position i therefore makes i the root of a max heap. Termination (i = 0): node 1 is the root of a max heap that contains all the elements in the array A.
  17. 17. Building a Heap – Complexity. This method does not seem to be better than the previous one; the complexity appears to be n/2 * O(log n) = O(n log n). However, if we take a closer look (still in the worst case): many elements are roots of heaps with only 1 element => height 0; then, we heapify-down in heaps that contain 2-3 elements => height 1; and so on. Therefore the cost of a heapify-down is often well below the upper limit O(log n). The revised complexity is: O(2*n) = O(n).
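The O(n) bound can be made explicit with a short derivation. It assumes the standard fact (CLRS, Chapter 6) that a heap with n elements has at most ceil(n / 2^(h+1)) nodes of height h, and that a heapify-down started at height h costs O(h):

```latex
\sum_{h=0}^{\lfloor \log_2 n \rfloor} \left\lceil \frac{n}{2^{h+1}} \right\rceil \cdot O(h)
  \;=\; O\!\left( n \sum_{h=0}^{\infty} \frac{h}{2^{h}} \right)
  \;=\; O(2n) \;=\; O(n),
\qquad \text{since } \sum_{h=0}^{\infty} \frac{h}{2^{h}} = 2 .
```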
  18. 18. Heap Sort Algorithm. Uses a max heap. Idea: build a max heap with all the elements in the array. Place the maximum element at the end of the array, simply by removing the root: O(log n). Remove the last element from the heap: decrease the size of the heap by 1; we do not need to remove it from the actual array. Repeat until reaching a heap of size 1; this should contain the smallest element in the array.
  19. 19. Heap Sort Algorithm (2)
        HEAPSORT(A, n)
          BUILD-MAX-HEAP(A, n)
          FOR (i = n .. 2)                 // i goes down from n to 2
            SWAP(A[1], A[i])
            MAX-HEAPIFY-DOWN(A, 1, i - 1)
      Loop invariants: A[i+1 .. n] is sorted; A[i+1 .. n] contains the n - i largest elements in the array. Complexity: O(n) + (n-1)*O(log n) = O(n log n).
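Putting the pieces together, a self-contained Python sketch of heap sort could look as follows. It assumes 0-based lists, so the build loop runs from n // 2 - 1 down to 0; the function names are chosen for this illustration only.

```python
def sift_down(a, i, n):
    """Heapify-down a[i] inside the max heap a[0..n-1]."""
    while True:
        largest = i
        for child in (2 * i + 1, 2 * i + 2):
            if child < n and a[child] > a[largest]:
                largest = child
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def build_max_heap(a):
    """Turn a into a max heap in O(n) by sifting down the non-leaf nodes."""
    for i in range(len(a) // 2 - 1, -1, -1):
        sift_down(a, i, len(a))


def heapsort(a):
    """Sort a in place in O(n log n) and return it."""
    build_max_heap(a)
    for end in range(len(a) - 1, 0, -1):
        a[0], a[end] = a[end], a[0]   # move the current maximum to its final place
        sift_down(a, 0, end)          # the heap now has `end` elements
    return a
```

The sorted suffix a[end+1..] grows by one element per iteration, mirroring the loop invariants stated on the slide.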
  20. 20. Heap Sort – Conclusions. Heap Sort has good complexity (like merge sort) and is an in-place algorithm (like insertion sort). It is similar in functionality to selection sort, but the selection of the maximum element in the array works better: O(log n) instead of O(n). It is important to use good data structures for each algorithm. Heaps are useful as priority queues!
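As a small illustration of the priority-queue use case, Python's standard `heapq` module maintains a binary heap inside a plain list. Note that `heapq` implements a min heap, unlike the max heaps used in this course, and the task names below are made up for the example.

```python
import heapq

tasks = []                                    # the heap lives in a plain list
heapq.heappush(tasks, (2, "write report"))    # (priority, task): O(log n) insert
heapq.heappush(tasks, (1, "fix bug"))
heapq.heappush(tasks, (3, "refactor"))

# heappop removes the smallest priority first, also in O(log n).
order = [heapq.heappop(tasks)[1] for _ in range(3)]
```

A max-priority queue can be emulated with the same module by pushing negated priorities.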
  21. 21. Dynamic Programming. Another algorithm design technique, as are divide and conquer (divide et impera), greedy, backtracking, … Usually used to efficiently solve optimization problems: find the minimum value from a set of possible solutions, find the maximum value, … Also used for some combinatorial problems: in how many ways can you do something? We want to find one solution with the optimal value, not all of them; if we want all of them, DP is not a solution.
  22. 22. Dynamic Programming (2). General DP scheme: divide a problem into smaller similar sub-problems; solve the sub-problems; combine them in order to find the optimal solution for the main problem. Important steps: identify sub-problems and optimal solutions for them; determine a recurrence relation that defines the value of the optimal solution; compute the value of the optimal solutions to the sub-problems in a bottom-up fashion (from smaller to bigger problems); save how the optimal solution has been constructed at each step.
  23. 23. Dynamic Programming (3). Greatly reduces the complexity needed to solve some optimization problems. It requires a lot of training/experience in order to be able to find the optimal substructure and the recurrence relation. Some problems admit more than one DP solution, depending on how you decompose them into sub-problems and how you assemble the solutions: choose the most efficient method.
  24. 24. Examples of DP Algorithms: Floyd-Warshall; Optimal Binary Search Trees; Chain Matrix Multiplication; Edit Distance; Viterbi Algorithm; Earley Algorithm; Catalan Numbers
  25. 25. Characteristics of DP. Optimal substructure: an optimal solution for the problem contains only optimal solutions for sub-problems. The problem can be recursively decomposed into similar, but smaller sub-problems. Overlapping sub-problems: the solution to a sub-problem is generally used for solving more than one problem of larger size; in order not to compute the solution to the same sub-problem more than once, save the solution for each sub-problem in an array when it is first computed (memoization). Usually uses a bottom-up approach.
  26. 26. Bottom-up vs Top-down. When a problem is solved by decomposing it into sub-problems, we use these terms to refer to the order in which we solve these sub-problems: if we start by solving the smallest sub-problems first and then move on to larger problems => bottom-up; if we start from larger problems and break them down in order to solve them => top-down.
  27. 27. Longest Common Subsequence (LCS). Consider 2 sequences (arrays): X[1..m] and Y[1..n]. Find the subsequence common to both sequences that has the longest length. A subsequence has to be in order, but need not be consecutive. Example: X = a l g o r i t h m, Y = g e o m e t r y.
  28. 28. Simple Solution. We can use brute force: for each subsequence of X, check if it is also a subsequence of Y. There are Θ(2^m) subsequences of X; time to check if a subsequence appears in Y: O(n). Total: O(n * 2^m). Exponential complexity! Need something better.
  29. 29. DP: Optimal Substructure and Choice. Find the sub-problems: X_i = prefix(X, i) = <x_1, …, x_i> = X[1..i]; Y_j = prefix(Y, j) = <y_1, …, y_j> = Y[1..j]. Theorem: let Z = <z_1, …, z_k> be any LCS of X and Y. (1) If x_m = y_n, then z_k = x_m = y_n and Z_{k-1} is an LCS of X_{m-1} and Y_{n-1}. (2) If x_m != y_n and z_k != x_m, then Z is an LCS of X_{m-1} and Y_n. (3) If x_m != y_n and z_k != y_n, then Z is an LCS of X_m and Y_{n-1}. Proof by contradiction! Conclusion: an LCS of two sequences contains as a prefix an LCS of prefixes of the sequences.
  30. 30. DP: Recursive Formulation. Sub-problems: LCS(i, j) = longest common subsequence of X_i and Y_j; LCS(m, n) = main problem. Length of LCS(i, j): c[i, j]; we need c[m, n]. Using the prior theorem, we can find a recursive formulation for c[i, j].
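The formula itself did not survive in the transcript. Written out from the theorem on the previous slide (and matching the three cases of the bottom-up pseudocode later in the course), the recurrence for c[i, j] should read:

```latex
c[i, j] =
\begin{cases}
0 & \text{if } i = 0 \text{ or } j = 0, \\
c[i-1, j-1] + 1 & \text{if } i, j > 0 \text{ and } x_i = y_j, \\
\max\bigl(c[i-1, j],\; c[i, j-1]\bigr) & \text{if } i, j > 0 \text{ and } x_i \neq y_j .
\end{cases}
```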
  31. 31. Recursive Formulation – Example. Taken from CLRS: X = <b o z o>, Y = <b a t>. [Figure: tree of recursive calls, with nodes labeled by sub-problem indices such as 4,2.] Lots of repeated sub-problems: do not recompute them; store them in a table when computed for the first time and use them later.
  32. 32. Implementation. In order to compute c[m, n]: we can use a recursive algorithm that follows the recurrence relation determined above; this is called top-down DP. We can also implement the solution as an iterative algorithm; this is more difficult because we need to find the right order in which to fill the table; this is called bottom-up DP; it is more efficient as we remove procedure calls. Let's first implement the iterative solution.
  33. 33. Bottom-up Implementation
        LCS(X, Y, m, n)
          FOR (i = 1 .. m) c[i, 0] = 0          // base cases of the recurrence
          FOR (j = 0 .. n) c[0, j] = 0          // base cases of the recurrence
          FOR (i = 1 .. m)                      // start from the first row
            FOR (j = 1 .. n)                    // and the first column
              IF (x[i] = y[j])                  // build more complex solutions
                c[i, j] = c[i - 1, j - 1] + 1
                b[i, j] = 'diag'
              ELSE IF (c[i - 1, j] ≥ c[i, j - 1])
                c[i, j] = c[i - 1, j]
                b[i, j] = 'up'
              ELSE
                c[i, j] = c[i, j - 1]
                b[i, j] = 'left'
          RETURN c and b
      Complexity: Θ(m*n), both time and space.
  34. 34. Remarks. c[i, j] = table for saving the length of LCS(i, j). b[i, j] = table for saving the direction used to compute the current solution of LCS(i, j); it is used in order to be able to determine the actual LCS, because c[i, j] only offers the length of the LCS. PRINT_LCS(X, b, i, j): recursive, uses the direction stored in b. On whiteboard!
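Since PRINT_LCS was only done on the whiteboard, here is one possible way to fill the c and b tables and then recover an actual LCS, sketched in Python. The function name `lcs` is hypothetical, the inputs are assumed to be strings, and the walk back through b is written as a loop rather than the recursive procedure mentioned on the slide.

```python
def lcs(x, y):
    """Return (length, one longest common subsequence) of strings x and y."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]      # row 0 and column 0 are the base cases
    b = [[None] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:               # Python strings are 0-based
                c[i][j] = c[i - 1][j - 1] + 1
                b[i][j] = "diag"
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j] = c[i - 1][j]
                b[i][j] = "up"
            else:
                c[i][j] = c[i][j - 1]
                b[i][j] = "left"
    # Follow the stored directions back from (m, n) to rebuild one LCS.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if b[i][j] == "diag":
            out.append(x[i - 1])
            i, j = i - 1, j - 1
        elif b[i][j] == "up":
            i -= 1
        else:
            j -= 1
    return c[m][n], "".join(reversed(out))
```

For the earlier example, `lcs("algorithm", "geometry")` returns a length of 3; which of the equally long subsequences is returned depends on the tie-breaking rule (here "up" is preferred over "left").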
  35. 35. LCS – Example  From CLRS
  36. 36. Top-down Implementation
        // Initially, c[i, j] = -1 for all i = 0..m, j = 0..n
        // Means that LCS(i, j) was not solved yet
        LCS_recursive(X, Y, i, j)                 // solve LCS(i, j)
          IF (i == 0 OR j == 0)                   // stop condition for the recurrence relation
            RETURN 0
          IF (c[i, j] == -1)                      // if the sub-problem was not previously solved
            IF (x[i] = y[j])                      // must solve sub-problem recursively
              c[i, j] = LCS_recursive(X, Y, i-1, j-1) + 1
              b[i, j] = 'diag'
            ELSE
              sol_up = LCS_recursive(X, Y, i-1, j)
              sol_left = LCS_recursive(X, Y, i, j-1)
              IF (sol_up ≥ sol_left)
                c[i, j] = sol_up
                b[i, j] = 'up'
              ELSE
                c[i, j] = sol_left
                b[i, j] = 'left'
          RETURN c[i, j]                          // c[i, j] is never -1 now
  37. 37. Top-down Implementation (2). Initial call: LCS_recursive(X, Y, m, n). Same complexity as the iterative solution. The bottom-up approach is usually preferred because it avoids the recursive calls, whose overhead increases the running time of the top-down version (and for several other reasons).
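In Python, the memoization table of the top-down version can be delegated to `functools.lru_cache` from the standard library. The sketch below is a swapped-in variant, not the slide's algorithm verbatim: it computes only the length (no b table), and the name `lcs_length_top_down` is hypothetical.

```python
from functools import lru_cache


def lcs_length_top_down(x, y):
    """Length of an LCS of x and y, computed top-down with memoization."""

    @lru_cache(maxsize=None)          # plays the role of the c table initialised to -1
    def solve(i, j):
        if i == 0 or j == 0:          # stop condition of the recurrence
            return 0
        if x[i - 1] == y[j - 1]:
            return solve(i - 1, j - 1) + 1
        return max(solve(i - 1, j), solve(i, j - 1))

    return solve(len(x), len(y))
```

Each pair (i, j) is solved once, so the running time stays Θ(m*n), but long inputs can hit Python's recursion limit, which is one practical argument for the bottom-up version.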
  38. 38. Chain Matrix Multiplication. Given a sequence of matrices: A_1, A_2, ..., A_n. What is the minimum number of scalar multiplications needed to perform the multiplication of the n matrices: A_1 x A_2 x ... x A_n? We need to determine one of the possible parenthesizations that minimizes the total number of scalar multiplications.
  39. 39. Matrix Multiplication. A(p, q) x B(q, r) => pqr scalar multiplications are needed. However, the multiplication of matrices is associative (but it is not commutative). For A(p, q) x B(q, r) x C(r, s): (AB)C => pqr + prs scalar multiplications; A(BC) => qrs + pqs scalar multiplications. E.g.: p = 5, q = 4, r = 6, s = 2: (AB)C => 180 scalar multiplications; A(BC) => 88 scalar multiplications. Conclusion: the parenthesization is very important!
  40. 40. Brute Force. Matrices: A_1, A_2, ..., A_n. It is better to use a single array to store the dimensions of the n matrices (n+1 elements): p_0, p_1, p_2, ..., p_n, so that A_i has dimensions (p_{i-1}, p_i): A_1(p_0, p_1), A_2(p_1, p_2), …, A_n(p_{n-1}, p_n). We may use brute force (exhaustive search) to build all the possible parenthesizations in order to determine the best one: Ω(4^n / n^(3/2)), exponential complexity. We want to find a polynomial solution using DP.
  41. 41. DP: Finding the sub-problems. At first, we try to identify how we can express sub-problems similar to the original problem, but of smaller size. For 1 ≤ i ≤ j ≤ n, we use the notation A_{i,j} = A_i x … x A_j. A_{i,j} has p_{i-1} rows and p_j columns: A_{i,j}(p_{i-1}, p_j). m[i, j] = the optimal (minimum) number of scalar multiplications needed to solve the sub-problem A_{i,j}. s[i, j] = the position of the outermost split used to optimally solve A_{i,j}: if s[i, j] = k => (A_i x … x A_k) x (A_{k+1} x … x A_j). Sub-problem: which is the optimal parenthesization for A_{i,j}? Initial problem: A_{1,n}.
  42. 42. DP: Optimal Substructure. In order to solve A_{i,j}, we need to find the position i ≤ k < j that ensures the best parenthesization: A_{i,j} = (A_i x … x A_k) x (A_{k+1} x … x A_j) = A_{i,k} x A_{k+1,j}. We divide any problem into two smaller sub-problems. Which k to choose?
  43. 43. DP: Optimal Choice. We must look for the optimal value among all the possible values of k (i ≤ k < j): k = i => A_{i,j} = A_{i,i} x A_{i+1,j} = (A_i) x (A_{i+1} x … x A_j); …; k = j - 1 => A_{i,j} = A_{i,j-1} x A_{j,j} = (A_i x … x A_{j-1}) x (A_j). In order to do this, the solutions for the sub-problems (A_{i,k} and A_{k+1,j}) must also be optimal.
  44. 44. DP: Optimal Substructure and Choice. If we know that the optimal solution for A_{i,j} uses the solutions for the sub-problems A_{i,k} and A_{k+1,j}, then the solutions for A_{i,k} and A_{k+1,j} must also be optimal! (Optimal substructure.) Demonstration: by using cut-and-paste and proof by contradiction (the standard method for showing that problems have the optimal substructure property needed by DP and greedy). Remark: not all optimization problems have this property! E.g.: the longest simple path in a directed graph.
  45. 45. DP: Recursive Formulation. When we know the sub-problems, the optimal substructure and the choice, the next step is to find the recursive formulation for the solution to the problem. We need to find a recurrence relation for m[i, j] and s[i, j].
  46. 46. DP: Recursive Formulation (2). The stop conditions are m[i, i] = 0 (sub-problem A_{i,i}, a single matrix). We want to compute m[1, n] (sub-problem A_{1,n}). How do we choose s[i, j]? In a bottom-up fashion, from the smallest sub-problems to the largest one.
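The recurrence itself is missing from the transcript. Reconstructed from the definitions of m, s and p above (and consistent with the line q = m[i, k] + m[k+1, j] + p[i-1] * p[k] * p[j] in the course's bottom-up pseudocode), it should read:

```latex
m[i, j] =
\begin{cases}
0 & \text{if } i = j, \\
\displaystyle \min_{i \le k < j} \bigl( m[i, k] + m[k+1, j] + p_{i-1}\, p_k\, p_j \bigr) & \text{if } i < j,
\end{cases}
\qquad
s[i, j] = \operatorname*{arg\,min}_{i \le k < j} \bigl( m[i, k] + m[k+1, j] + p_{i-1}\, p_k\, p_j \bigr).
```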
  47. 47. Bottom-up Approach. [Figure: sub-problems solved in order from small to large.]
  48. 48. Example- Initialization
  49. 49. Solution – First step
  50. 50. Solution – Final
  51. 51. Bottom-up Implementation
        Matrix-chain(p[0..n])
          FOR (i = 1..n)
            m[i, i] = 0                           // stop condition
          FOR (l = 2..n)                          // l = size of the sub-problem
            FOR (i = 1..n - l + 1)                // start index
              j = i + l - 1                       // end index
              m[i, j] = INF
              FOR (k = i..j - 1)                  // find optimal choice
                q = m[i, k] + m[k+1, j] + p[i-1] * p[k] * p[j]
                IF (q < m[i, j])
                  m[i, j] = q
                  s[i, j] = k
          RETURN m AND s
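A direct Python transcription of this pseudocode is sketched below, with an extra helper that rebuilds the parenthesization from s. The helper `parenthesize` is not part of the slides; it is added here only to show how the s table is used. Tables are allocated with an unused row and column 0 so that the 1-based indices of the pseudocode carry over unchanged.

```python
import math


def matrix_chain(p):
    """p[0..n] holds the dimensions: matrix A_i is p[i-1] x p[i]."""
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]     # m[i][i] = 0: stop condition
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):                # size of the sub-problem
        for i in range(1, n - length + 2):        # start index
            j = i + length - 1                    # end index
            m[i][j] = math.inf
            for k in range(i, j):                 # find the optimal split
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j] = q
                    s[i][j] = k
    return m, s


def parenthesize(s, i, j):
    """Rebuild the optimal parenthesization of A_i x ... x A_j from s."""
    if i == j:
        return f"A{i}"
    k = s[i][j]
    return f"({parenthesize(s, i, k)} x {parenthesize(s, k + 1, j)})"
```

With the dimensions from the earlier three-matrix example, p = [5, 4, 6, 2], the call `matrix_chain(p)` gives m[1][3] = 88, and `parenthesize(s, 1, 3)` yields the A(BC) order.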
  52. 52. Complexity. Space: Θ(n^2), one stored solution for each sub-problem. Time: O(n^3): Ns, the total number of sub-problems, is O(n^2); Nc, the number of choices at each step, is O(n). For DP, the complexity is usually Ns x Nc.
  53. 53. Dynamic Programming Overview. It's a technique mastered by exercise. Very useful for finding solutions of low complexity to difficult optimization problems, when we only need to compute 1 optimal solution. How do we recognize DP? Optimal substructure; recursive formulation for solving the problem; overlapping sub-problems.
  54. 54. DP Overview: Optimal Substructure. In order to solve the problem, you need to make a choice that uses one or more similar sub-problems. Knowing this choice, we need to show that the optimal solution to the problem uses optimal solutions for the sub-problems. E.g. an LCS uses optimal solutions for smaller prefixes. E.g. the optimal parenthesization uses optimal parenthesizations for smaller sequences of matrices. E.g. the shortest path in a graph uses smaller shortest paths. Use cut-and-paste for the demonstration.
  55. 55. DP Overview: Optimal Choice. We need to consider a wide enough range of choices and sub-problems to be sure that the best solution is not missed! It is important how we define the space of sub-problems, so that it is: not too wide, which increases complexity; not too small, in which case optimal solutions may be missed. Example: can we use a smaller set of sub-problems for CMM? If we use only A_{1,i} (i = 1..n), then we may miss the optimal solution.
  56. 56. Space of Sub-problems. How many sub-problems are used by the optimal solution at each step? LCS: 1 sub-problem; CMM: 2 sub-problems. How many choices are there to consider at each step? LCS: 1 or 2 choices; CMM: j - i choices (one for each k with i ≤ k < j). Usually, the complexity of the resulting algorithm is O(number of total sub-problems) x O(number of choices at each step).
  57. 57. Optimal Substructure! Not all optimization problems have the optimal substructure property. E.g. the longest simple path in a directed graph. [Example figure.] Optimal substructure does not hold when the solutions to the sub-problems are not independent!
  58. 58. Overlapping Sub-problems. Several sub-problems may need to be solved more than once (on different recursive paths). "Store, don't recompute". Memoization: use a table to store solutions for the sub-problems that have already been solved; useful for building top-down DP solutions. We can also find an iterative solution that fills the table intelligently, from smaller sub-problems to larger ones.
  59. 59. References  CLRS – Chapters 6, 15  MIT OCW – Introduction to Algorithms – video lecture 15
  60. 60. Exercise 1. Suppose you have a jar of one or more marbles, each of which is either RED or BLUE in color. You also have an unlimited supply of RED marbles off to the side. You then execute the following "procedure":
        while (# of marbles in the jar > 1) {
          choose (any) two marbles from the jar;
          if (the two marbles are of the same color) {
            toss them aside;
            place a RED marble into the jar;
          } else {   // one marble of each color was chosen
            toss the chosen RED marble aside;
            place the chosen BLUE marble back into the jar;
          }
        }
      What is the color of the last marble in the jar? (Hint: use loop invariants)
  61. 61. Exercise 2. Given the following sequence (triangle) of numbers:
              n0
            n1  n2
          n3  n4  n5
        n6  …
      Starting from the top, at each step you can move to either of the two elements situated on the level below, just to the left or just to the right of the current element. You want to maximize the profit you get by summing up the elements you walk through from the top of the triangle to the bottom level.