Lecture Notes On
                          Algo-Design
                    Lecturer: Ulf-Peter Schroeder
                           February 16, 2006


                                written by:
                              Braun, Rudolf
                              Brune, Philipp
                             Piepmeyer, Meik




Please send corrections - including [AlgoDesign-Script] in the subject line -
           to: meikp <AT> upb <DOT> de






Contents
1 Greedy Algorithms                                           3
  1.1 Interval Scheduling                                     3
  1.2 Scheduling to Minimize Lateness                         4
  1.3 ( Huffman Codes and Data Compression )                  5
  1.4 Theoretical foundations for the greedy method           5

2 Divide and Conquer                                          8
  2.1 Finding Closest Pair of Points                          8
  2.2 Convolutions and the Fast Fourier Transformation       10

3 Dynamic Programming                                        14
  3.1 Weighted Interval Scheduling                           14
  3.2 Segmented Least Squares                                16
  3.3 Subset Sum / Knapsack                                  18
  3.4 Sequence Alignment                                     19

4 Approximation Algorithm                                    22
  4.1 Load-Balancing Problem                                 22
  4.2 Set Cover                                              24
  4.3 Vertex Cover (Pricing Method)                          26
  4.4 Linear Programming and Rounding                        28
  4.5 A more advanced LP-Application: Load Balancing         29
  4.6 Arbitrarily Good Approximations - Knapsack Problem     31

5 Local Search                                               34
  5.1 Metropolis Algorithm and Simulated Annealing           35
  5.2 Maximum Cut via Local Search                           36
  5.3 Local Search Algorithms for Graph-Partitioning         37
  5.4 Best Response Dynamics and Nash-Equilibria             38


                                                                                              18.10.2005


  1     Greedy Algorithms
      • builds up a solution in ”small” steps
      • makes an irreversible decision at each step, so as to optimize some underlying criterion

      Questions
      1. When does a greedy algorithm succeed in solving a given problem optimally?
      2. How can we prove that a greedy algorithm produces an optimal solution to a problem?

  1.1     Interval Scheduling
  Def.: Set of requests {1, 2, .., n}
        The ith request corresponds to an interval of time starting at s(i) and finishing at f (i). We’ll
        say that a subset of the requests is compatible if no two of them overlap in time.
        Our goal is to accept as large a compatible subset as possible. A compatible set of maximum
        size will be called optimal.
  Idea: The basic idea is to use a simple rule to select the first request i1 . We reject all requests
        that are not compatible with i1 . Repeat this procedure until we run out of requests.
Rule 1: ”Select the available request that starts earliest.”
Rule 2: ”Select the request that requires the smallest interval of time.”
Rule 3: ”Select the request that has the fewest number of non-compatible requests.”
Rule 4: ”Select first the request that finishes first, that is the request i for which f (i) is as small as
        possible.”
 Algo.: Initially let R be a set of requests and let A be empty.

        while R is not yet empty
              choose a request i ∈ R that has the smallest finishing time.
              add request i to A.
              delete all requests from R that are not compatible with request i.
        end while
        return A.
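
        A small Python sketch of this greedy rule (the function name and the representation of
        requests as (s(i), f (i)) pairs are our own, not from the lecture); sorting by finishing time
        once replaces the repeated search for the smallest finishing time in R:

            def interval_schedule(requests):
                """Greedy interval scheduling: repeatedly accept the compatible
                request with the earliest finishing time (Rule 4).

                requests: list of (s, f) pairs with s < f.
                Returns the accepted requests in the order they were added to A.
                """
                A = []
                last_finish = float("-inf")
                for s, f in sorted(requests, key=lambda r: r[1]):   # nondecreasing f(i)
                    if s >= last_finish:       # compatible with everything accepted so far
                        A.append((s, f))
                        last_finish = f
                return A

            # Example: three requests, two of which overlap; the optimal solution has size 2.
            print(interval_schedule([(0, 3), (2, 5), (4, 7)]))      # [(0, 3), (4, 7)]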

  Analyzing the Algorithm
Part 1: A is a compatible set of requests. ✓ (This follows directly from the algorithm, since every
        request that is not compatible with an accepted request is deleted from R.)
        Is the solution A optimal?
        Let O be an optimal set of intervals.
        We have to prove that | A | = | O |.
        Let i1 , .., ik be the requests in A in the order they were added to A; | A | = k.
        Let j1 , .., jm be the requests in O.
        Our goal is to prove that k = m.
Part 2: For all indices r ≤ k we have f (ir ) ≤ f (jr ).


          Proof:       r = 1: Our greedy rule guarantees that f (i1 ) ≤ f (j1 ).
                       I.H.: The statement is true for r − 1.
                       I.S.: We know that f (jr−1 ) ≤ s(jr ), since the intervals in O are compatible. Combining
                   this with the I.H. f (ir−1 ) ≤ f (jr−1 ) we get f (ir−1 ) ≤ s(jr ). Since interval jr is one of the
                   available intervals at the time when the greedy algorithm selects ir , we have f (ir ) ≤ f (jr ).
         Part 3: The greedy algorithm returns an optimal set A.
          Proof: ( by contradiction )
                 If A is not optimal, then an optimal set O must have more requests, that is, we must have
                 m > k. Applying Part 2 with r = k gives f (ik ) ≤ f (jk ). Since m > k, there is a request
                 jk+1 in O.
                 This request starts after request jk ends, and hence after ik ends, so it is compatible with
                 all requests in A and would still be in R. But the greedy algorithm stops with request ik ,
                 and it is only supposed to stop when R is empty — a contradiction.

                                                                                                         25.10.2005

           1.2     Scheduling to Minimize Lateness
           Definition of the Problem
           A single resource, a set of n requests to use the resource for an interval of time; the resource is
           available starting at time S, request i has a deadline di , and it requires a contiguous time
           interval of length ti . Each request must be assigned a non-overlapping interval.

           Objective function
           We will assign each request i an interval of time of length ti ; let us denote this interval [s(i), f (i)]
           with f (i) = s(i) + ti .
           We say that a request i is late if it misses the deadline, that is, if di < f (i). Its lateness is
           li = f (i) − di ; we set li = 0 if request i is not late.

               maximum lateness L = max_{i∈{1,..,n}} li

        Idea 1: ”Schedule the jobs in order of increasing length ti .”
        Idea 2: ”Schedule the jobs in order of increasing slacktime di − ti ”.
        Idea 3: ”Earliest Deadline First”
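
           The following minimal Python sketch illustrates Idea 3, whose optimality is analyzed next;
           the representation of jobs as (ti , di ) pairs and the start time S = 0 are our own illustrative
           choices:

               def edf_schedule(jobs, S=0):
                   """Schedule all jobs on one resource in order of nondecreasing deadline
                   ("Earliest Deadline First") and return (schedule, maximum lateness).

                   jobs: list of (t_i, d_i) pairs (processing time, deadline).
                   """
                   schedule, L = [], 0
                   finish = S           # no idle time: each job starts when the previous one ends
                   for t, d in sorted(jobs, key=lambda job: job[1]):
                       start, finish = finish, finish + t
                       schedule.append((start, finish))
                       L = max(L, finish - d)   # lateness counts as 0 if the deadline is met
                   return schedule, L

               # Example: the job with deadline 2 is scheduled first; no job is late.
               print(edf_schedule([(3, 6), (2, 2)]))   # ([(0, 2), (2, 5)], 0)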

            Analyzing the ”Earliest Deadline First” Greedy Algorithm
            We start with an optimal schedule O.
            Our plan is to gradually modify O, preserving its optimality at each step, but transforming
            it into a schedule that is identical to the schedule A produced by our algorithm.

        Fact 1: There is an optimal schedule with no idle time.
           Def.: A schedule A′ has an inversion if a job i with deadline di is scheduled before another job j
                 with earlier deadline dj < di .
        Fact 2: All schedules with no inversions and no idle time have the same maximum lateness.
        Fact 3: There is an optimal schedule that has no inversions and no idle time.
Proof of Fact 3: By Fact 1 there is an optimal schedule with no idle time.


                   1. If O has an inversion, then there is a pair of jobs i and j such that j is scheduled
                      immediately after i and has dj < di .
                   2. After swapping i and j, we get a schedule with one less inversion.
                   3. The new swapped schedule has a maximum lateness no larger than that of O.
Proof of (3.): All jobs other than jobs i and j finish at the same time in the two schedules. Job j finishes
               earlier in the swapped schedule, so its lateness can only decrease. Job i now finishes at the
               time where j used to finish, so its new lateness is

                                               l̃i = f̃(i) − di = f (j) − di < f (j) − dj = l′j ,

                                                                        i.e.   l̃i < l′j ,

               and the maximum lateness does not increase.




                Notation
                Optimal schedule O: each request r is scheduled in [s(r), f (r)] and has lateness l′r ;
                the maximum lateness is L′ = max_r l′r .
                Swapped schedule Õ: the corresponding quantities are s̃(r), f̃(r), l̃r and L̃.

         1.3      ( Huffman Codes and Data Compression )
        1.4      Theoretical foundations for the greedy method
         Def.: A matroid is a pair M = (S, I) satisfying the following conditions:
                   1. S is a finite non-empty set.
                   2. I is a non-empty family of subsets of S such that: B ∈ I and A ⊆ B implies A ∈ I
                      (hereditary property).
                   3. If A ∈ I, B ∈ I and |A| < |B|, then there exists some element x ∈ B \ A such that
                      A ∪ {x} ∈ I (exchange property).
  Examples:       1. Matrix Matroid

                     Def.: S = set of n-vectors;
                           I consists of all subsets of linearly independent vectors from S.

                                 ( 1  1  1  0 )
                           A =   ( 0  0  1  1 )
                                 ( 1  0  1  1 )

                           S = {e1 , e2 , e3 , e4 }, the columns of A:
                           e1 = (1, 0, 1), e2 = (1, 0, 0), e3 = (1, 1, 1), e4 = (0, 1, 1)

                           I = {∅, {e1 }, {e2 }, {e3 }, {e4 }, {e1 , e2 }, {e1 , e3 }, {e1 , e4 }, {e2 , e3 }, {e2 , e4 },
                           {e3 , e4 }, {e1 , e2 , e3 }, {e1 , e2 , e4 }, {e1 , e3 , e4 }}

                  2. Graphic Matroid

                     Def.: Let G = (V, E) be a connected, undirected graph.
                           MG = (SG , IG ) is defined by SG = E. IG consists of all subsets A ⊆ E such that
                           (V, A) is acyclic.


             1. and 2. are trivial.
                    3. Exchange property:
                       Let A and B belong to I with |A| < |B|, i.e. A and B are forests. Let V (A), V (B)
                       be the sets of vertices incident to edges from A and B, respectively.
                       (a) If there is a vertex b ∈ V (B) \ V (A), then there exists some c ∈ V (B) such that
                           (b, c) ∈ B, and A ∪ {(b, c)} is acyclic.
                       (b) Now we can assume that V (B) ⊆ V (A).
                           ( Theorem: A forest with k edges on vertex set V contains exactly |V | − k trees. )
                           A consists of τ1 = |V (A)| − |A| trees and
                           B consists of τ2 = |V (B)| − |B| trees.
                           |V (B)| ≤ |V (A)| and |A| < |B| imply τ2 < τ1 .
                           =⇒ ∃ some edge e ∈ B connecting 2 trees from A
                           =⇒ A ∪ {e} is acyclic.
                  3. ”Counterexample”: Interval Scheduling
                     Let S = {1, .., n} be the set of requests.
                     U ⊆ S belongs to I if its requests are mutually compatible.
                     This system is hereditary, but the exchange property fails in general, so it is not a matroid.
    Def.: Let M = (S, I) be a matroid.
          A ∈ I is called maximal if there exists no x ∈ S \ A such that A ∪ {x} ∈ I.
 Lemma: All maximal independent subsets in a matroid have the same size.
  Proof: Suppose the contrary:
         There exist two maximal independent subsets A, B with |A| < |B|.
         By the exchange property there is some x ∈ B \ A with A ∪ {x} ∈ I, contradicting the maximality of A.
Example:         1. Matrix matroid: A is maximal ⇔ |A| = rank(S) (the rank of the matrix).
                 2. Graphic matroid: A is maximal ⇔ A is a spanning tree of G ⇔ |A| = |V | − 1.
    Def.: A matroid M = (S, I) is called weighted if there is a weight w(x) > 0 assigned to each x ∈ S.
          For A ⊆ S we set w(A) = Σ_{x∈A} w(x).
              The maximum weight independent subset problem: Find a maximum weight independent
              subset in a weighted matroid.
              Greedy(M,w)
                   A=∅
                   sort S[M ] into nonincreasing order by weight w
                    for each x ∈ S[M ], taken in nonincreasing order by weight w, do
                       if A ∪ {x} ∈ I(M ) then
                         A := A ∪ {x}
                   return A
   MST: Define the matroid MG = (SG , IG ) with SG = E; IG consists of all subsets A ⊆ E such that
        (V, A) is acyclic.
        We define w′ (e) = (wmax + 1) − w(e) with wmax = max_{e∈E} w(e).
             It holds that w′ (e) > 0 for all e ∈ E, and Greedy(MG , w′ ) computes an optimal subset (that
             is, a maximum weight independent subset), which is an MST in the original graph.
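
             As an illustration, here is a small Python sketch of Greedy(M, w) specialized to the graphic
             matroid. Maximizing w′ (e) = (wmax + 1) − w(e) is the same as taking edges in nondecreasing
             order of w(e), so this is in effect Kruskal's algorithm; the union-find helper and the edge
             format are our own choices:

                 def greedy_graphic_matroid(n, edges):
                     """Greedy on the graphic matroid M_G: scan edges by nondecreasing
                     weight and keep an edge iff adding it keeps the picked set acyclic.

                     n: number of vertices 0 .. n-1; edges: list of (w, u, v).
                     Returns the picked edge set A (an MST if G is connected).
                     """
                     parent = list(range(n))

                     def find(u):                    # union-find: test whether A ∪ {e} stays acyclic
                         while parent[u] != u:
                             parent[u] = parent[parent[u]]
                             u = parent[u]
                         return u

                     A = []
                     for w, u, v in sorted(edges):   # nondecreasing original weight w(e)
                         ru, rv = find(u), find(v)
                         if ru != rv:                # e connects two different trees of (V, A)
                             parent[ru] = rv
                             A.append((u, v, w))
                     return A

                 # Example: a 4-cycle; the greedy set drops the heaviest (weight-5) edge.
                 print(greedy_graphic_matroid(4, [(1, 0, 1), (2, 1, 2), (3, 2, 3), (5, 3, 0)]))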


               Lemma: Suppose that M = (S, I) is a weighted matroid with weight function w and that S is
                      sorted into nonincreasing order by weight. Let x be the first element of S such that {x} is
                      independent. Then there exists an optimal subset A of S that contains x.

                Proof: Let B ∈ I be an optimal subset.
                       If x ∈ B, then the proof is done.
                       So now assume x ∉ B.
                       Construct the set A as follows:
                            Begin with A = {x}.
                            By the choice of x, A is independent and w(x) ≥ w(y) for every y ∈ B.
                            Using the exchange property, find an element x1 ∈ B \ A such that A ∪ {x1 } is
                            independent, and add it to A.
                            Repeat this procedure until |A| = |B|.
                        Then A = (B \ {y}) ∪ {x} for some y ∈ B. Since B is independent, {y} is independent
                        (hereditary property), and w(x) ≥ w(y) by the choice of x ⇒ w(A) ≥ w(B).
                        Since B is optimal, A must also be optimal, and since x ∈ A, the Lemma is proved.
             Theorem: If M = (S, I) is a weighted matroid with weight function w, then the call Greedy(M,w)
                      returns an optimal independent subset.
                Proof: We show the so-called ”optimal-substructure property”.
                       Let x be the first element chosen by Greedy.
                       Once x is chosen as an element of the solution, the remaining task is the problem of finding
                       a maximum weight independent subset in the matroid

                                        M ′ = (S ′ , I ′ ) with   S ′ = {y ∈ S : {x, y} ∈ I}
                                                                  I ′ = {B ⊆ S \ {x} : B ∪ {x} ∈ I}

Proof of this property: If A is any maximum-weight independent subset containing x, then A′ = A \ {x} is an
                        optimal subset for M ′ .
                        Conversely, any optimal subset A′ of M ′ yields a subset A = A′ ∪ {x} which has optimal
                        weight among all subsets from I containing x.
                        From the previous Lemma we know that there exists an optimal solution containing x. This
                        shows the optimality of Greedy by induction on |S|.


                       Conclusion
                       Here we have learned three techniques to prove the correctness of a Greedy Al-
                       gorithm.


                                                                                            08.11.2005


2      Divide and Conquer
Principle:
     • break the input into several parts
     • solve the problem in each part recursively
     • combine the solutions of the subproblems into an overall solution

Running time: T (n) ≤ a ∗ T (n/b) + f (n)

     • a is the number of subproblems
     • n/b is the size of each subproblem

Example: Sorting with Quick-Sort
The function ”Partition” places a pivot element x into its final position: the elements to its left
are ≤ x, the elements to its right are ≥ x.

     • Best case, Partition always finds the element in the middle:
       T (n) ≤ 2T (n/2) + cn =⇒ O(n ∗ log(n))

     • Worst case, Partition always finds a marginal element:
       T (n) ≤ T (n − 1) + cn =⇒ O(n²)
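
A short Python sketch of Quick-Sort with such a partition step; taking the last element as the
pivot is our own illustrative choice (it triggers the worst case on already sorted input):

    def quicksort(a, lo=0, hi=None):
        """In-place quicksort.  The partition loop places the pivot x so that
        the elements to its left are <= x and those to its right are >= x."""
        if hi is None:
            hi = len(a) - 1
        if lo >= hi:
            return a
        x, i = a[hi], lo                     # pivot: last element of the range
        for j in range(lo, hi):
            if a[j] <= x:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]            # pivot lands in its final position i
        quicksort(a, lo, i - 1)              # balanced splits give T(n) <= 2T(n/2) + cn
        quicksort(a, i + 1, hi)
        return a

    print(quicksort([3, 6, 1, 8, 2, 9, 4]))  # [1, 2, 3, 4, 6, 8, 9]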

2.1     Finding Closest Pair of Points
Definition of the Problem
Given n points in the plane, find the pair that is closest together. Let P = {p1 , ..., pn } be the set
of points where pi has coordinates (xi , yi ). For two points pi , pj ∈ P , we use d(pi , pj ) to denote
the Euclidean distance between them. Our goal is to find a pair of points pi , pj that minimizes
d(pi , pj ).

Idea

                       [Figure: the point set P , split by a vertical line into a left half Q and a right half R.]

    A: Setting up the recursion:

        (1) We sort all the points in P by x-coordinate and again by y-coordinate producing lists
            Px and Py .


          (2) We define Q to be the set of points in the first ⌈n/2⌉ positions of the list Px and R to be the
              set of points in the final ⌊n/2⌋ positions of the list Px .
          (3) By a single pass through each of Px and Py , we can create the following four lists:
              Qx : consisting of the points in Q sorted by increasing x-coordinate
              Qy : consisting of the points in Q sorted by increasing y-coordinate
              Rx : consisting of the points in R sorted by increasing x-coordinate
              Ry : consisting of the points in R sorted by increasing y-coordinate
                                                                                        x        x
          (4) We now recursively determine a closest pair of points in Q. Suppose q0 and q1 are
              returned as a pair of points in Q. Similarly we determine a closest pair of points in R,
                         x       x
              obtaining r0 and r1 .
      B: Combining the solutions:
         Let δ be the minimum of d(q0 , q1 ) and d(r0 , r1 ). Let x∗ denote the x-coordinate of the
                                       x x           x x

         rightmost point in Q, and let L denote the vertical line described by the equation x = x∗ .
Fact 1: If there exists q ∈ Q and r ∈ R for which d(q, r) < δ, then each of q and r lies within a
        distance δ of L.

                       [Figure: the dividing line L; only points within distance δ of L, on either side,
                        are relevant for the combine step.]
Proof: Suppose such q and r exist. We write q = (qx , qy ) and r = (rx , ry ). We know
       qx ≤ x∗ ≤ rx =⇒
       x∗ − qx ≤ rx − qx ≤ d(q, r) < δ and
       rx − x∗ ≤ rx − qx ≤ d(q, r) < δ

         We know that we can restrict our search to the narrow band consisting of only points in P
         within δ of L.
         Let S ⊆ P denote this set and let Sy denote the list consisting of the points in S sorted by
         increasing y-coordinate.
   =⇒ There exist q ∈ Q and r ∈ R for which d(q, r) < δ if and only if there exist s, s′ ∈ S for
      which d(s, s′ ) < δ.
Fact 2: If s, s′ ∈ S have the property that d(s, s′ ) < δ, then s and s′ are within 15 positions of each
        other in the sorted list Sy .


Proof: Consider the subset Z of the plane consisting of all points within distance δ of L. We partition
       Z into boxes: squares with horizontal and vertical side length δ/2.

               [Figure: the strip Z of width 2δ around L, partitioned into boxes of side δ/2, numbered
                row by row; the points s and s′ lie several rows apart.]

       It holds: each box contains at most one point of S, since two points in the same box lie on the
       same side of L (i.e. both in Q or both in R) and are at distance at most (δ/2) ∗ √2 < δ from
       each other.

       Now suppose that s, s′ ∈ S have the property that d(s, s′ ) < δ and that they are at least
       16 positions apart in Sy . Assume w.l.o.g. that s has the smaller y-coordinate. Then, since
       there can be at most one point per box, there are at least three ”rows” of Z lying between
       s and s′ . But any two points in Z separated by at least three ”rows” must be a distance of
       at least 3 ∗ δ/2 apart — a contradiction.

        We can conclude the algorithm as follows:
        We make one pass through Sy and for each s ∈ Sy , we compute its distance to each of the
        next 15 points in Sy .

        Running time: T (n) ≤ 2 ∗ T (n/2) + O(n) = O(n ∗ log(n))
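
        A compact Python sketch of the whole algorithm; the names are our own, and for brevity
        the strip around L is re-sorted by y-coordinate in every call, which gives O(n log² n) instead
        of O(n log n) but leaves the structure of the algorithm unchanged:

            from math import dist

            def closest_pair(points):
                """Divide-and-conquer closest pair of points.
                Returns (distance, p, q) for a closest pair p, q."""
                px = sorted(points)                      # sort by x-coordinate once

                def rec(px):
                    n = len(px)
                    if n <= 3:                           # base case: brute force
                        return min((dist(p, q), p, q)
                                   for i, p in enumerate(px) for q in px[i + 1:])
                    mid = n // 2
                    x_star = px[mid - 1][0]              # x-coordinate of the rightmost point in Q
                    best = min(rec(px[:mid]), rec(px[mid:]))   # closest pairs in Q and in R
                    delta = best[0]
                    # S: points within delta of the line L: x = x_star, sorted by y
                    sy = sorted((p for p in px if abs(p[0] - x_star) < delta),
                                key=lambda p: p[1])
                    for i, s in enumerate(sy):           # compare with the next 15 points only
                        for t in sy[i + 1:i + 16]:
                            if dist(s, t) < best[0]:
                                best = (dist(s, t), s, t)
                    return best

                return rec(px)

            print(closest_pair([(0, 0), (5, 4), (3, 1), (7, 7), (2, 2)]))   # (1.414..., (3, 1), (2, 2))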


 2.2      Convolutions and the Fast Fourier Transformation
 Definition of the problem:
 Given two vectors

 a = (a0 , . . . , an−1 )
 b = (b0 , . . . , bn−1 ).

 The convolution a ∗ b of the two vectors of length n is a vector with 2n − 1 coordinates, where
 coordinate k is equal to

                     (a ∗ b)k = Σ_{(i,j): i+j=k ∧ i,j<n} ai ∗ bj ,   i.e.

 a ∗ b = (a0 ∗ b0 , a1 ∗ b0 + a0 ∗ b1 , a0 ∗ b2 + a1 ∗ b1 + a2 ∗ b0 , . . .)


                 [Table: the n × n grid of all products ai bj for i, j = 0, . . . , n − 1; the entries on the
                  k-th anti-diagonal (i + j = k) sum up to coordinate k of a ∗ b.]
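
 A direct Python sketch of this definition (quadratic time); it is useful later as a reference
 against which the O(n log n) FFT-based method at the end of this section can be checked:

     def convolve_naive(a, b):
         """Convolution by the definition: coordinate k of a*b is the sum of
         a[i]*b[j] over all pairs with i + j = k, i.e. the k-th anti-diagonal
         of the table of products.  O(n^2) for two length-n vectors."""
         c = [0] * (len(a) + len(b) - 1)
         for i, ai in enumerate(a):
             for j, bj in enumerate(b):
                 c[i + j] += ai * bj
         return c

     # (1 + 2x)(3 + 4x) = 3 + 10x + 8x^2
     print(convolve_naive([1, 2], [3, 4]))   # [3, 10, 8]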


         Motivation
   Example 1: ”Polynomial Multiplication”
                                                                             Representation
                A(x) = a0 + a1 x + a2 x^2 + . . . + am−1 x^{m−1}   −→ (a0 , a1 , a2 , . . . , am−1 )
                B(x) = b0 + b1 x + b2 x^2 + . . . + bn−1 x^{n−1}   −→ (b0 , b1 , b2 , . . . , bn−1 )

                C(x) = A(x) ∗ B(x) −→ (c0 , c1 , c2 , . . . , cm+n−2 )

                ck = Σ_{(i,j): i+j=k} ai bj




  Example 2: ”Signal Processing”

                Suppose we have a vector a = (a0 , a1 , . . . , am−1 ) representing a sequence of measurements,
                sampled at m consecutive points in time.
                A common operation is to ”smooth” the measurements by averaging each ai with a weighted
                sum of its neighbors within k steps to the left and right in the sequence. We define a ”mask”
                w = (w−k , w−(k−1) , . . . , w−1 , w0 , w1 , . . . , wk−1 , wk ) consisting of the weights we want to use
                for averaging each point with its neighbors.

                We replace ai with a′i = Σ_{s=−k}^{k} ws ai+s .

                Let’s define b = (b0 , b1 , . . . , b2k ) by setting bl = wk−l ; then

                a′i = Σ_{(j,l): j+l=i+k} bl aj ,

                so the smoothed sequence can be read off from the convolution a ∗ b.




  Example 3: ”Combining Histograms”



                                                                                                           15.11.2005

          Aim: Running time of O(n log n)
 Explanation : Complex roots of unity. A complex number can be written as r ∗ e^{ω∗i} ,
               where e^{π∗i} = −1 and e^{2∗π∗i} = 1.
               The polynomial equation x^k = 1 has k distinct complex roots
               ωj,k = e^{2πji/k} for j = 0, 1, ..., k − 1, called the k th roots of unity.

                       [Figure: k = 8; the 8th roots of unity ω0,8 , . . . , ω7,8 on the unit circle in the complex
                        plane, with ω0,8 = 1, ω2,8 = i, ω4,8 = −1 and ω6,8 = −i.]


Idea : We are given the vectors a = (a0 , a1 , ..., an−1 ) and b = (b0 , b1 , ..., bn−1 ).

       We will view them as the polynomials A(x) = a0 + a1 x + a2 x^2 + ... + an−1 x^{n−1} and
       B(x) = b0 + b1 x + b2 x^2 + ... + bn−1 x^{n−1} . We will seek to compute their product C(x) =
       A(x) ∗ B(x) in O(n log(n)) time. The coefficient vector c = (c0 , c1 , ..., c2n−2 ) is exactly the
       convolution a ∗ b. Now, rather than multiplying A and B symbolically, we can treat them as
       functions of the variable x and multiply them as follows :
         (i) Polynomial Evaluation : We choose 2n values x1 , ..., x2n and evaluate A(xj ), B(xj ) for
             each j = 1, 2, .., 2n.
        (ii) Compute C(xj ) = A(xj ) ∗ B(xj ) for each j = 1, ..., 2n.
        (iii) Polynomial Interpolation : Recover C from its values on x1 , ..., x2n .
       For our numbers x1 , ..., x2n on which to evaluate A and B we will choose the (2n)th roots of
       unity. The representation of a degree-d polynomial P by its values on the (d + 1)th roots of
       unity is referred to as the ”Discrete Fourier transform of P ”.

        (A) A(x) = Aeven (x^2 ) + x ∗ Aodd (x^2 ) with Aeven (x) = a0 + a2 x + a4 x^2 + ... + an−2 x^{(n−2)/2}
            and Aodd (x) = a1 + a3 x + a5 x^2 + ... + an−1 x^{(n−2)/2} . Suppose that we evaluate each of Aeven
            and Aodd on the nth roots of unity. This is exactly a version of the problem we face
            with A and the (2n)th roots of unity, except that the input is half as large. It remains to
            produce the evaluation of A on the (2n)th roots of unity using O(n) additional
            operations. Consider one of these roots of unity ωj,2n = e^{2πji/2n} :

             (ωj,2n )^2 = (e^{2πji/2n} )^2 = e^{2πji/n} , and hence (ωj,2n )^2 is an nth root of unity
             ⇒ T (n) ≤ 2T (n/2) + O(n)
        (B) The reconstruction of C can be achieved by defining an appropriate polynomial D and
            evaluating it at the (2n)th roots of unity.

             Consider a polynomial C(x) = Σ_{s=0}^{2n−1} cs x^s that we want to reconstruct from its values
             C(ωs,2n ) at the (2n)th roots of unity.

             Define a new polynomial D(x) = Σ_{s=0}^{2n−1} ds x^s where ds = C(ωs,2n ). Then

             D(ωj,2n ) = Σ_{s=0}^{2n−1} C(ωs,2n ) ∗ (ωj,2n )^s


         = Σ_{s=0}^{2n−1} ( Σ_{t=0}^{2n−1} ct ∗ (ωs,2n )^t ) ∗ (ωj,2n )^s

         = Σ_{t=0}^{2n−1} ct ∗ ( Σ_{s=0}^{2n−1} (ωs,2n )^t ∗ (ωj,2n )^s )

         = Σ_{t=0}^{2n−1} ct ∗ ( Σ_{s=0}^{2n−1} ((e^{2πi/2n})^s )^t ∗ ((e^{2πi/2n})^j )^s )

         = Σ_{t=0}^{2n−1} ct ∗ ( Σ_{s=0}^{2n−1} e^{2πi(st+js)/2n} )

         = Σ_{t=0}^{2n−1} ct ∗ ( Σ_{s=0}^{2n−1} (e^{2πi(t+j)/2n})^s )

         = Σ_{t=0}^{2n−1} ct ∗ ( Σ_{s=0}^{2n−1} (ωt+j,2n )^s )

         The only term of the last line's outer sum that is not equal to 0 is the one with ct such that
         ωt+j,2n = 1.

         Explanation :

         For any (2n)th root of unity ω ≠ 1, we have Σ_{s=0}^{2n−1} ω^s = 0, because x^{2n} = 1 ⇔ x^{2n} − 1 = 0
         and x^{2n} − 1 = (x − 1) ∗ ( Σ_{t=0}^{2n−1} x^t ).

         ωt+j,2n = 1 happens if t + j is a multiple of 2n, that is, if t = 2n − j. For this value,
         Σ_{s=0}^{2n−1} (ωt+j,2n )^s = 2n, so we get that D(ωj,2n ) = 2n ∗ c2n−j .

         Fact: For any polynomial C(x) = Σ_{s=0}^{2n−1} cs x^s and corresponding polynomial D(x) =
         Σ_{s=0}^{2n−1} C(ωs,2n ) x^s we have that cs = (1/2n) ∗ D(ω2n−s,2n ).
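
         Putting the pieces together, here is a minimal Python sketch of the whole scheme: evaluate
         A and B at the roots of unity via the even/odd split, multiply pointwise, and interpolate with
         the polynomial D as above. Padding to a power of two and the use of complex floating-point
         arithmetic (with rounding at the end) are our own simplifications:

             import cmath

             def fft(coeffs):
                 """Evaluate the polynomial with these coefficients at the k-th roots of
                 unity, k = len(coeffs) (assumed a power of two), via the split
                 A(x) = A_even(x^2) + x * A_odd(x^2)."""
                 k = len(coeffs)
                 if k == 1:
                     return coeffs
                 even, odd = fft(coeffs[0::2]), fft(coeffs[1::2])
                 out = [0] * k
                 for j in range(k // 2):
                     w = cmath.exp(2 * cmath.pi * 1j * j / k)     # omega_{j,k}
                     out[j] = even[j] + w * odd[j]
                     out[j + k // 2] = even[j] - w * odd[j]       # omega_{j+k/2,k} = -omega_{j,k}
                 return out

             def convolve_fft(a, b):
                 """Convolution / polynomial multiplication in O(n log n):
                 evaluate, multiply pointwise, then use c_s = (1/2n) * D(omega_{2n-s,2n})."""
                 size = 1
                 while size < len(a) + len(b) - 1:                # pad to a power of two
                     size *= 2
                 A = fft(list(a) + [0] * (size - len(a)))
                 B = fft(list(b) + [0] * (size - len(b)))
                 D = fft([A[s] * B[s] for s in range(size)])      # evaluates D at the roots of unity
                 c = [D[(size - s) % size].real / size for s in range(size)]
                 return [round(x) for x in c[:len(a) + len(b) - 1]]

             print(convolve_fft([1, 2], [3, 4]))                  # [3, 10, 8]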


     3     Dynamic Programming
      Basic Idea
      One implicitly explores the space of all possible solutions by carefully decomposing the problem
      into a series of subproblems, and then building up correct solutions to larger and larger subproblems.

     3.1    Weighted Interval Scheduling
      Definition of the Problem
      We have n requests labeled 1, . . . , n, with each request i specifying a start time si and a finishing
      time fi . Each interval i has a weight vi . Two intervals are compatible if they do not overlap. The
      goal is to select a subset S ⊆ {1, . . . , n} of mutually compatible intervals, so as to maximize the
      sum of the values of the selected intervals, Σ_{i∈S} vi .
      Let’s suppose that the requests are sorted in order of nondecreasing finishing time:
      f1 ≤ f2 ≤ . . . ≤ fn .
      We’ll say a request i comes before request j if i < j.

Example:    [Figure: six intervals drawn on a time line, with values vj and pointers p(j):]

                              interval 1: v1 = 2, p(1) = 0
                              interval 2: v2 = 4, p(2) = 0
                              interval 3: v3 = 4, p(3) = 1
                              interval 4: v4 = 7, p(4) = 0
                              interval 5: v5 = 2, p(5) = 3
                              interval 6: v6 = 1, p(6) = 3


            We define p(j) for an interval j to be the largest index i < j such that intervals i and j
            are disjoint. We define p(j) = 0 if no request i < j is disjoint from j. For any j between
            1 and n let Oj denote the optimal solution of the problem consisting of requests {1, . . . , j},
            and let OP T (j) denote the value of this solution. For the optimal solution Oj it holds
            that either j ∈ Oj , in which case OP T (j) = vj + OP T (p(j)), or j ∉ Oj , in which case
            OP T (j) = OP T (j − 1).
   Fact 1: OP T (j) = max{vj + OP T (p(j)), OP T (j − 1)}
   Fact 2: Request j belongs to an optimal solution on the set {1, . . . , j} if and only if
           vj + OP T (p(j)) ≥ OP T (j − 1)
Remark 1: These facts form the first crucial component on which a dynamic programming solution is
          based: a recurrence equation that expresses the optimal solution in terms of the optimal
          solutions to smaller subproblems.

 Example:  [Figure: the recursion tree of OP T (6) for the example above; subproblems such as
            OP T (3), OP T (2) and OP T (1) appear several times in the tree.]
Remark 2: A fundamental observation, which forms the second crucial component of a dynamic pro-
          gramming solution, is that our recursive algorithm is really only solving n + 1 different sub-
          problems.

            How could we eliminate the redundancy?
            =⇒ ”Memoization”

                                                                                              22.11.2005

      ”Memoization”
   M [0..n]: M [j] will start with the value ”empty” but will hold the value of OP T (j) as soon as it is
             first determined.

            M-OPT(j)
            if j=0 then
                return 0
            else
                if M[j] is not empty then
                   return M[j]
                else
                   M[j] = max(vj + M-OPT(p(j)), M-OPT(j − 1))
            return M[j]
             Iterative-M-OPT(n)


         M[0] = 0
         for j=1,..,n
             M[j] = max(vj + M [p(j)], M [j − 1])
         return M[n]
   So far we have simply computed the value of an optimal solution. What we want is the full
optimal set of intervals as well. We know from Fact 2 that j belongs to an optimal solution for
the set of intervals {1, .., j} iff (if and only if) vj + OP T (p(j)) ≥ OP T (j − 1).

      Find Solution(j)
        if j=0 then
               Output nothing
        else
               if vj + M [p(j)] ≥ M [j − 1] then
                  Output j together with the result of Find Solution(p(j))
               else
                  Output the result of Find Solution(j-1)
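
 A compact Python sketch combining the pieces above — computing p(j) by binary search, the
 table from Iterative-M-OPT, and the traceback from Find Solution; the representation of the
 requests as (sj , fj , vj ) triples is our own:

     from bisect import bisect_right

     def weighted_interval_scheduling(requests):
         """requests: list of (s_j, f_j, v_j) triples.
         Returns (OPT value, chosen requests), using
         OPT(j) = max(v_j + OPT(p(j)), OPT(j-1))."""
         req = sorted(requests, key=lambda r: r[1])       # nondecreasing finishing time
         finishes = [f for _, f, _ in req]
         # p[j]: largest index i < j (1-based) whose interval is disjoint from interval j
         p = [0] + [bisect_right(finishes, req[j][0]) for j in range(len(req))]

         n = len(req)
         M = [0] * (n + 1)                                # Iterative-M-OPT
         for j in range(1, n + 1):
             M[j] = max(req[j - 1][2] + M[p[j]], M[j - 1])

         chosen, j = [], n                                # Find Solution
         while j > 0:
             if req[j - 1][2] + M[p[j]] >= M[j - 1]:
                 chosen.append(req[j - 1])
                 j = p[j]
             else:
                 j -= 1
         return M[n], chosen[::-1]

     # The two compatible intervals of value 4 beat the overlapping interval of value 2.
     print(weighted_interval_scheduling([(0, 3, 2), (2, 5, 4), (5, 7, 4)]))   # (8, [(2, 5, 4), (5, 7, 4)])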

”Informal Guidelines”
    1. There are only a polynomial number of subproblems
    2. The solution to the original problem can be easily computed from the solutions to the
       subproblems.
    3. There is a natural ordering on subproblems from ”smallest” to ”largest” together with an
       easy to compute recurrence that allows one to determine the solution to a subproblem from
       the solutions to some number of smaller subproblems

3.2      Segmented Least Squares
                       [Figure: example point set in the (x, y)-plane for the Segmented Least Squares problem.]


Problem description
Suppose our data consists of a set P of n points in the plane, denoted (x1 , y1 ), (x2 , y2 ), .., (xn , yn ).
Suppose x1 < x2 < .. < xn .
Given a line L defined by the equation y = ax + b, we say that the error of L with respect to P is
the sum of its squared ”distances” to the points in P :

                           Error(L, P ) = Σ_{i=1}^{n} (yi − a xi − b)^2

    The line of minimal error is y = ax + b, where

                           a = ( n ∗ Σ_i xi yi − (Σ_i xi )(Σ_i yi ) ) / ( n ∗ Σ_i xi^2 − (Σ_i xi )^2 )

                           b = ( Σ_i yi − a ∗ Σ_i xi ) / n
                                                                n
                       [Figure: a point set that is poorly approximated by any single line.]

Formulating the Problem
We are given a set of points P = {(x1 , y1 ), (x2 , y2 ), .., (xn , yn )} with x1 < x2 < .. < xn . We will
use pi to denote the point (xi , yi ). We must first partition P into some number of segments. Each
segment is a subset of P that represents a contiguous set of x-coordinates, that is, it is a subset
of the form {pi , pi+1 , .., pj−1 , pj } for some indices i ≤ j.
Then, for each segment S in our partition of P , we compute the line minimizing the error with
respect to the points in S. The penalty of a partition is defined to be a sum of the following terms:
    1. The number of segments into which we partition P , times a fixed given multiplier C > 0.
    2. For each segment, the error value of the optimal line through that segment.
Our goal in the ”Segmented Least Squares” problem is to find a partition of minimum penalty.
                       [Figure: a partition of the points into segments; the last segment starts at some point pi
                        and ends at pn , and the remaining points p1 , .., pi−1 form the subproblem.]


Observation: The last point pn belongs to exactly one segment in the optimal partition. If we knew the
             identity of this last segment pi , .., pn , then we could remove these points from consideration
             and recursively solve the problem on the remaining points p1 , .., pi−1 .
             Suppose we let OP T (i) denote the value of the optimal solution for the points p1 , .., pi , and
             we let ei,j denote the minimum error of any line with respect to pi , .., pj .
     Fact 1: If the last segment of the optimal partition is pi , .., pn , then the value of the optimal solution
             is
                                           OP T (n) = ei,n + C + OP T (i − 1)

     Fact 2: For the subproblem on the points p1 , .., pj ,

                                           OP T (j) = min_{1≤i≤j} { ei,j + C + OP T (i − 1) }

              and the segment pi , .., pj is used in an optimal solution for the subproblem iff the minimum
              is obtained using index i.
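
              A direct Python sketch of the recurrence in Fact 2; the helper seg_error, which recomputes
              each ei,j from the least-squares formulas given earlier, is our own choice (precomputing all
              ei,j first makes the dynamic program itself run in O(n²) table updates):

                  def segmented_least_squares(points, C):
                      """points: list of (x, y) sorted by x; C: penalty per segment.
                      Returns OPT(n) = minimum total penalty."""

                      def seg_error(i, j):
                          """Least-squares error e(i, j) of the best line through points i..j (1-based)."""
                          pts = points[i - 1:j]
                          m = len(pts)
                          sx = sum(x for x, _ in pts)
                          sy = sum(y for _, y in pts)
                          sxx = sum(x * x for x, _ in pts)
                          sxy = sum(x * y for x, y in pts)
                          denom = m * sxx - sx * sx
                          a = (m * sxy - sx * sy) / denom if denom else 0.0
                          b = (sy - a * sx) / m
                          return sum((y - a * x - b) ** 2 for x, y in pts)

                      n = len(points)
                      OPT = [0.0] * (n + 1)
                      for j in range(1, n + 1):
                          OPT[j] = min(seg_error(i, j) + C + OPT[i - 1] for i in range(1, j + 1))
                      return OPT[n]

                  # Points lying exactly on two lines: with C = 0.5 the optimum uses two segments.
                  pts = [(0, 0), (1, 1), (2, 2), (3, 1), (4, 0)]
                  print(segmented_least_squares(pts, C=0.5))   # 1.0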

        3.3     Subset Sum / Knapsack
                                                                                                   16.12.2005

         Subset Sum Problem
         We are given n items {1, .., n} and each has a given nonnegative weight wi (for i = 1, .., n). We
         are also given a bound W .
         We would like to select a subset S of the items so that Σ_{i∈S} wi ≤ W and, subject to this
         restriction, Σ_{i∈S} wi is as large as possible.


                       [Figure: the capacity interval from 0 to W , partially filled with item weights w1 , w2 , . . .]

               Knapsack: each item has both a value vi and a weight wi ; we require Σ_{i∈S} wi ≤ W and want
               Σ_{i∈S} vi to be as large as possible.




               OPT(n)
               n ∉ O:     OPT(n) = OPT(n-1)
               n ∈ O:     OPT(n) = ?
            A recurrence with the single parameter n cannot capture the problem. Suppose we take more
         than one parameter:

               OPT(n,W)
               n ∉ O:     OPT(n,W) = OPT(n-1,W)
               n ∈ O:     OPT(n,W) = wn + OP T (n − 1, W − wn )


      OPT(i,W) = max{OP T (i − 1, W ), wi + OP T (i − 1, W − wi )}

           SubsetSum(n, W)

             Array M[0..n,0..W]
             Initialize M[0,w] = 0 ∀w ∈ {0, .., W}
             For i=1 to n
                 For w=0 to W
                     If wi > w then
                         M[i,w] = M[i-1,w]
                     Else
                         M[i,w] = max{M [i − 1, w], wi + M [i − 1, w − wi ]}
             return M[n,W]
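
       A Python sketch of SubsetSum(n, W) exactly as above (a list-of-lists table M, items indexed
       1-based via the weights list):

           def subset_sum(weights, W):
               """M[i][w] = largest total weight <= w achievable from the first i items.
               Runs in O(n * W) time ("pseudo-polynomial")."""
               n = len(weights)
               M = [[0] * (W + 1) for _ in range(n + 1)]
               for i in range(1, n + 1):
                   wi = weights[i - 1]
                   for w in range(W + 1):
                       if wi > w:                       # item i does not fit into capacity w
                           M[i][w] = M[i - 1][w]
                       else:
                           M[i][w] = max(M[i - 1][w], wi + M[i - 1][w - wi])
               return M[n][W]

           print(subset_sum([2, 3, 7], 11))   # 10  (take the items of weight 3 and 7)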

       [Figure: the (n + 1) × (W + 1) table M ; the entry M [i, j] in row i is computed from entries in row i − 1.]

Fact 1: The SubsetSum(n, W) algorithm correctly computes the optimal value of the problem and
        runs in O(n ∗ W ) time.
Note 1: The running time is a polynomial function of n and W , the largest integer involved in defining
        the problem. We call such algorithms ”pseudo-polynomial”.



             Extension to the Knapsack problem


                     n ∉ O:          OPT(n,W) = OPT(n-1,W)
                     n ∈ O:          OPT(n,W) = vn + OP T (n − 1, W − wn )



Fact 2: If wi > W , then OP T (i, W ) = OP T (i − 1, W );
                     otherwise OP T (i, W ) = max{OP T (i − 1, W ), vi + OP T (i − 1, W − wi )}
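
       A Python sketch of the Knapsack extension following Fact 2; it differs from the Subset Sum
       code above only in that the table now stores values vi instead of weights:

           def knapsack(items, W):
               """items: list of (v_i, w_i) pairs.  M[i][w] = best value using the
               first i items with total weight at most w."""
               n = len(items)
               M = [[0] * (W + 1) for _ in range(n + 1)]
               for i in range(1, n + 1):
                   vi, wi = items[i - 1]
                   for w in range(W + 1):
                       if wi > w:
                           M[i][w] = M[i - 1][w]
                       else:
                           M[i][w] = max(M[i - 1][w], vi + M[i - 1][w - wi])
               return M[n][W]

           # Capacity 5 fits the items of weight 2 and 3 with total value 3 + 4 = 7.
           print(knapsack([(3, 2), (4, 3), (5, 4)], 5))   # 7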

  3.4        Sequence Alignment
  Motivation
      1. ”Online dictionaries”
         Input: o c c u r r a n c e
         Output: Do you mean o c c u r r e n c e ?


                           o-currance              o-curr-ance
                           occurrence              occurre-nce

                                          costmismatch + costgap < 3 ∗ costgap ?


           Goal: We want a model in which similarity is determined roughly by the number of gaps
           and mismatches we have when we line up the two words.
         2. ”Computational biology”
            An organism’s genome is divided into giant linear DNA molecules known as chromosomes.
            We can think of a chromosome as a linear tape containing a string over the alphabet {A, C, G, T}.

    Definition
    Suppose we are given two strings x and y, where x consists of the sequence of symbols x1 x2 x3 ...xm
     and y consists of the sequence of symbols y1 y2 y3 ...yn . Consider the sets {1, ..., m} and {1, ..., n}
     as representing the different positions in the strings x and y, and consider a matching of these sets,
     that is, a set of ordered pairs with the property that each item occurs in at most one pair. We say
     that a matching M of these two sets is an alignment if there are no ”crossing” pairs:

                                    if (i, j), (i′ , j ′ ) ∈ M and i < i′ , then j < j ′
Example: s t o p
           tops
         Corresponding alignments: (2, 1), (3, 2), (4, 3)
Problem: Suppose M is a given alignment between x and y:
           (1) There is a parameter δ > 0 that defines a gap penalty.
           (2) For each pair of letters p, q in our alphabet, there is a mismatch cost of αpq for lining
               up p with q.
            (3) The cost of M is the sum of its gap and mismatch costs, and we seek an alignment
                of minimum cost.
   Fact 1: Let M be any alignment of x and y. If (m, n) ∉ M , then either the mth position of x or the
           nth position of y is not matched in M .
    Proof: Suppose by way of contradiction that (m, n) ∉ M and there are numbers i < m and j < n
           so that (m, j) ∈ M and (i, n) ∈ M . This contradicts our definition of alignments: we have
           (i, n), (m, j) ∈ M with i < m, but n > j, so that the pairs (i, n) and (m, j) cross — a contradiction.
  Fact 2: In an optimal alignment M , at least one of the following is true:
             (i) (m, n) ∈ M
            (ii) the mth position of x is not matched or
           (iii) the nth position of y is not matched
          Let OP T (i, j) denote the minimum cost of an alignment between x1 x2 x3 ...xi and y1 y2 y3 ...yj .

          In case (i), we pay αxm yn and then align x1 x2 ...xm−1 as well as possible with y1 y2 ...yn−1 .
                                    OP T (m, n) = αxm yn + OP T (m − 1, n − 1)
          In case (ii), we pay gap costs of δ since the mth position of x is not matched, and then we
          align x1 ...xm−1 as well as possible with y1 ...yn .
                                         OP T (m, n) = δ + OP T (m − 1, n)
          In case (iii), we pay gap costs of δ since the nth position of y is not matched, and then we
          align x1 ...xm as well as possible with y1 ...yn−1 .
                                         OP T (m, n) = δ + OP T (m, n − 1)

  Fact 3: The minimum alignment costs satisfy the following recurrence for i ≥ 1, j ≥ 1:
               OP T (i, j) = min{αxi ,yj + OP T (i − 1, j − 1), δ + OP T (i − 1, j), δ + OP T (i, j − 1)}
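
          A small Python sketch of this recurrence; passing the gap penalty δ as a number and the
          mismatch costs α as a function is our own choice of interface:

              def alignment_cost(x, y, delta, alpha):
                  """Minimum alignment cost of strings x and y (Fact 3).
                  delta: gap penalty; alpha(p, q): mismatch cost of lining up p with q."""
                  m, n = len(x), len(y)
                  OPT = [[0] * (n + 1) for _ in range(m + 1)]
                  for i in range(1, m + 1):            # aligning x[1..i] with the empty string
                      OPT[i][0] = i * delta
                  for j in range(1, n + 1):
                      OPT[0][j] = j * delta
                  for i in range(1, m + 1):
                      for j in range(1, n + 1):
                          OPT[i][j] = min(alpha(x[i - 1], y[j - 1]) + OPT[i - 1][j - 1],   # case (i)
                                          delta + OPT[i - 1][j],                          # case (ii)
                                          delta + OPT[i][j - 1])                          # case (iii)
                  return OPT[m][n]

              # Unit gap penalty, mismatch cost 1 (0 for equal letters):
              # one gap plus one mismatch align "ocurrance" with "occurrence".
              print(alignment_cost("ocurrance", "occurrence", 1,
                                   lambda p, q: 0 if p == q else 1))   # 2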


  Formulation of the sequence alignment algorithm as graph theoretical problem
  Suppose we build a two-dimensional grid graph Gx,y , with the rows labeled by symbols in the
  string x and the columns labeled by the symbols in y. We number the rows from 0 to m and the
  columns from 0 to n.
  We put costs on edges of Gx,y :
       • each horizontal and vertical edge gets cost δ.
       • the diagonal edge from (i − 1, j − 1) to (i, j) gets cost αxi ,yj .


                       [Figure: the grid graph Gx,y with rows labeled 0, x1 , x2 , x3 and columns labeled
                        0, y1 , y2 , y3 , y4 ; edges point right (horizontal), up (vertical) and diagonally up-right.]

Fact 4: Let f (i, j) denote the minimum cost of a path from (0, 0) to (i, j) in Gx,y . Then for all i, j,
        we have f (i, j) = OP T (i, j).


                                                                                             20.12.2005


     4      Approximation Algorithm
     Informal Definition
     ”Algorithms which run in polynomial time and find solutions that are guaranteed to be close to
     optimal.”

     Techniques
         (1) Greedy Algorithm
         (2) Pricing Method (primal-dual technique)
         (3) Linear Programming and Rounding
         (4) Polynomial-time Approximation Scheme

     4.1      Load-Balancing Problem
     Definition
     We are given a set of m machines M1 , ..., Mm and a set of n jobs; each job j has a processing time
     tj . Let A(i) be the set of jobs assigned to machine Mi . Under this assignment, machine Mi needs
     to work for a total time of
                                          Ti = Σ_{j∈A(i)} tj

     and we declare this to be the load on the machine Mi .
     We seek to minimize a quantity known as the makespan; it is simply the maximum load on any
     machine, T = max{T1 , ..., Tm }.

     Greedy Balance
            Start with no jobs assigned.
            Set Ti = 0 and A(i) = ∅ for all machines Mi .
            For j = 1 to n
                 Let Mi be a machine that achieves the minimum min{T1, ..., Tm }
                 Assign job j to machine Mi .
                 Set A(i) ← A(i) ∪ {j}
                  Set Ti ← Ti + tj


Example: m = 3; n = 6; tj = {2, 3, 4, 6, 2, 2}

         (Figure: bar chart of the resulting loads. Greedy-Balance assigns the jobs with times
         2 and 6 to M1 , the jobs with times 3 and 2 to M2 , and the jobs with times 4 and 2
         to M3 ; the makespan is 8.)
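
 A small Python sketch of Greedy-Balance (not part of the original notes; the heap-based machine
 selection is our own implementation choice):

     import heapq

     # Sketch: Greedy-Balance with a min-heap over the machine loads.
     # Sorting the jobs in nonincreasing order first gives Sorted Greedy Balance (see below).
     def greedy_balance(times, m, sort_first=False):
         jobs = sorted(times, reverse=True) if sort_first else list(times)
         loads = [(0, i) for i in range(m)]       # (current load T_i, machine index i)
         heapq.heapify(loads)
         assignment = [[] for _ in range(m)]
         for tj in jobs:
             Ti, i = heapq.heappop(loads)         # machine with the minimum load
             assignment[i].append(tj)
             heapq.heappush(loads, (Ti + tj, i))
         makespan = max(T for T, _ in loads)
         return makespan, assignment

     # The example above: greedy_balance([2, 3, 4, 6, 2, 2], 3) yields makespan 8.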


             Fact 1: The optimal makespan T ∗ is at least

                                      T ∗ ≥ (1/m) ∗ Σ_{j=1}^{n} tj

             Fact 2: The optimal makespan T ∗ is at least

                                                             T ∗ ≥ max{t1 , ..., tn }

             Fact 3: Algorithm Greedy-Balance produces an assignment of jobs to machines with makespan

                                                    T ≤ 2 ∗ T∗

              Proof: We consider a machine Mi that attains the maximum load T in our assignment and we ask:
                     What was the last job j to be placed on Mi ?
                     When we assigned job j to Mi , the machine Mi had the smallest load of any machine. Its
                     load before this assignment was Ti − tj .
                     It follows that every machine had load at least Ti − tj . We have

                                      Σ_{k=1}^{m} Tk ≥ m ∗ (Ti − tj )

                                      ⇔ (Ti − tj ) ≤ (1/m) ∗ Σ_{j=1}^{n} tj ≤ T ∗      (by Fact 1)

                     T := Ti = (Ti − tj ) + tj ≤ T ∗ + T ∗ = 2 ∗ T ∗

Approximation ratio: T / T ∗ ≤ 2
Worst case example: We have m machines and we have n = m ∗ (m − 1) + 1 jobs. The first m ∗ (m − 1) = n − 1
                    jobs require time tj = 1. The last job requires time tn = m.

                       Greedy makespan: 2m − 1
                       Optimal makespan: m

                        =⇒ (2m − 1)/m = 2 − 1/m −→ 2

               An improved Greedy Algorithm
                   Sorted Greedy Balance
                        Sort the list of jobs in nonincreasing order of processing time
                       Start with no jobs assigned.
                       Set Ti = 0 and A(i) = ∅ for all machines Mi .
                       For j = 1 to n
                            Let Mi be a machine that achieves the minimum min{T1, ..., Tm }
                            Assign job j to machine Mi .
                            Set A(i) ← A(i) ∪ {j}
                             Set Ti ← Ti + tj


  Fact 4: If there are more than m jobs, then

                                                              T ∗ ≥ 2 ∗ tm+1

  Proof: Consider only the first m + 1 jobs in the sorted order. They each take at least tm+1 time.
         There are m + 1 jobs and only m machines, so there must be a machine that gets assigned
         two of these jobs. This machine will have processing time at least 2 ∗ tm+1 .
  Fact 5: Algorithm Sorted Greedy Balance produces an assignment of jobs to machines with makespan

                                              T ≤ (3/2) ∗ T ∗

   Proof: We consider a machine Mi that has the maximum load. So let’s assume that machine Mi
          has at least two jobs, and let tj be the last job assigned to the machine.
          It holds:
                           j ≥ m + 1     and     tj ≤ tm+1 ≤ (1/2) ∗ T ∗

          Ti = (Ti − tj ) + tj ≤ T ∗ + (1/2) ∗ T ∗ = (3/2) ∗ T ∗


    4.2      Set Cover
    Definition
     A set X of n elements.
     A family F of subsets of X, with ∪_{s∈F} s = X.
     A set cover is a collection of sets from F whose union is equal to X; we seek a cover of minimal
     cardinality.

    Greedy Set Cover (X, F )
            U := X
            C := ∅
             while U ≠ ∅ do
                  choose s ∈ F with |s ∩ U | → max
                  U := U \ s
                 C := C ∪ {s}
            return C


Example: (Figure: a universe of points covered by the sets S1 , ..., S6 .)
         The greedy choices, in order, are:
         (1) |S1 ∩ U | = 6
         (2) |S4 ∩ U | = 3
         (3) |S5 ∩ U | = 2
         (4) |S6 ∩ U | = 1      (|S3 ∩ U | = 1)


     Further Definition
     Set U of n elements. A list of m subsets S1 , ..., Sm of U . Each set Si has an associated weight
     wi ≥ 0. The goal is to select sets covering U while minimizing the total weight Σ_{Si selected} wi .
     The greedy selection criterion changes accordingly:
                                   wi / |Si |   −→   wi / |Si ∩ R|
           Greedy Set Cover
           Start with R = U
            while R ≠ ∅
                   Select a set Si that minimizes wi / |Si ∩ R|
                   Delete the elements of Si from R
           return the selected sets


Example: (Figure: a universe of points covered by the sets S1 , ..., S6 .)

           given weights:
           w1 = 1, w2 = 1
           w3 = 1 + ǫ, w4 = 1 + ǫ
           w5 = 1, w6 = 1

           choosing order:
           S1 , S2 , S5 , S6
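
  A Python sketch of this weighted greedy rule (not part of the original notes; with all weights equal
  to 1 it reduces to the unweighted rule |s ∩ U | → max from above):

      # Sketch: weighted Greedy Set Cover, selecting the set of minimum average cost w_i / |S_i ∩ R|.
      def greedy_set_cover(universe, sets, weights):
          R = set(universe)                        # elements not yet covered
          chosen = []
          while R:
              i = min((k for k in range(len(sets)) if sets[k] & R),
                      key=lambda k: weights[k] / len(sets[k] & R))
              chosen.append(i)
              R -= sets[i]                         # remove the newly covered elements from R
          return chosen

      # e.g. greedy_set_cover({1, 2, 3, 4}, [{1, 2}, {3}, {3, 4}], [1.0, 1.0, 1.5]) returns [0, 2]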

        Let the cost paid for an element s be described by the quantity cs :
                               cs = wi / |Si ∩ R|   for all s ∈ Si ∩ R
        (where Si is the set selected in the iteration in which s is first covered).

  Fact 1: If C is the set cover obtained by Greedy Set Cover, then Σ_{Si ∈C} wi = Σ_{s∈U} cs .

  Fact 2: For every Sk , the sum Σ_{s∈Sk} cs is at most H(|Sk |) ∗ wk , where H(n) := Σ_{i=1}^{n} 1/i = Θ(ln n).

  Proof: We assume that the elements of Sk are the first d = |Sk | elements of the set U , that is,
         Sk = {s1 , ..., sd }. Further, let us assume that these elements are labelled in the order in which
         they are assigned a cost csj by the Greedy algorithm. Now consider the iteration in which
         element sj is covered by the greedy algorithm, for some j ≤ d. At the start of this iteration,
         sj , sj+1 , ..., sd ∈ R by our labeling of the elements. This implies that |Sk ∩ R| is at least
         d − j + 1, and so the average cost of the set Sk is at most wk / |Sk ∩ R| ≤ wk / (d − j + 1). In this
         iteration, the greedy algorithm selected a set Si of minimum average cost, so this set Si has
         average cost at most that of Sk . Thus

                      csj = wi / |Si ∩ R| ≤ wk / |Sk ∩ R| ≤ wk / (d − j + 1).

         Σ_{s∈Sk} cs = Σ_{j=1}^{d} csj ≤ Σ_{j=1}^{d} wk / (d − j + 1) = wk /d + wk /(d − 1) + ... + wk /1
                     = wk ∗ (1/d + 1/(d − 1) + ... + 1) = wk ∗ H(d)

  Fact 3: The set cover C selected by Greedy-Set-Cover has weight at most H(d∗ ) times the optimal
          weight w∗ , where d∗ = maxi |Si |.


     Proof: Let C ∗ denote the optimum set cover, so that w∗ = Σ_{Si ∈C ∗} wi . For each of the sets in C ∗ ,
            Fact 2 implies:
                                 wi ≥ (1/H(d∗ )) ∗ Σ_{s∈Si} cs                 (∗)

            Since C ∗ is a set cover, every element of U lies in at least one of its sets, so

                                 Σ_{Si ∈C ∗} Σ_{s∈Si} cs ≥ Σ_{s∈U} cs          (∗∗)

            w∗ =(Def.) Σ_{Si ∈C ∗} wi ≥(∗) Σ_{Si ∈C ∗} (1/H(d∗ )) Σ_{s∈Si} cs ≥(∗∗) (1/H(d∗ )) Σ_{s∈U} cs =(Fact 1) (1/H(d∗ )) Σ_{Si ∈C} wi


     4.3       Vertex Cover (Pricing Method)
     Definition
     A vertex cover in a graph G = (V, E) is a set S ⊆ V so that each edge has at least one end in S.
     We consider the weighted case: each vertex i ∈ V has a weight wi ≥ 0. We would like to find a
     vertex cover S for which w(S) = Σ_{i∈S} wi is minimum.
     It holds: Vertex Cover ≤p Set Cover and Independent Set ≤p Vertex Cover.

     The ”Pricing Method”
     The pricing method (also known as primal-dual method) is motivated by an economic perspective.
     For the vertex cover problem, we will think of the weights on the nodes as costs, and we will think
     of each edge as having to pay for its “share” of the cost of the vertex cover we find.
         More precisely: We will think of the weight wi of the vertex i as the cost for using i in the
     cover. We will think of each edge e as an “agent” who is willing to pay something to the node
     that covers it.
         The algorithm will not only find a vertex cover S, but also determine prices pe ≥ 0 for each
     edge e ∈ E, so that if each edge e ∈ E pays the price pe , this will in total approximately cover the
     cost of S. Selecting vertex i covers all edges incident to i, so it would be “unfair” to charge these
     incident edges in total more than the cost of vertex i. We call prices pe fair if, for each vertex i,
     the edges incident to i do not have to pay more than the cost of the vertex:

                                        Σ_{e=(i,j)} pe ≤ wi

   Fact 1: For any vertex cover S ∗ and any nonnegative and fair prices pe , we have Σ_{e∈E} pe ≤ w(S ∗ ).

    Proof: By the definition of fairness, we have Σ_{e=(i,j)} pe ≤ wi for all nodes i ∈ S ∗ . Since S ∗ is a
           vertex cover, every edge has at least one end in S ∗ , so
           Σ_{e∈E} pe ≤ Σ_{i∈S ∗} Σ_{e=(i,j)} pe ≤ Σ_{i∈S ∗} wi = w(S ∗ ).

Algorithm:


             Def.: We say that a node i is tight (or ”paid for”) if Σ_{e=(i,j)} pe = wi

               VertexCoverApprox(G, w)
                   Set pe = 0 for all e ∈ E
                   while ∃e = (i, j) such that neither i nor j is tight

                       select such an edge e
                       increase pe without violating fairness
             end while
             Let S be the set of all tight nodes.
             return S

                                            Example:

         (Figure: a graph on the nodes a, b, c, d with weights wa = 4, wb = 3, wc = 5, wd = 3;
         all edge prices start at pe = 0.)

         SELECT(a, b):  the edge may pay at most 3; set p(a,b) = 3   ⇒ b is tight
         SELECT(a, d):  the edge may pay at most 1; set p(a,d) = 1   ⇒ a is tight
         SELECT(c, d):  the edge may pay at most 2; set p(c,d) = 2   ⇒ d is tight

         ⇒ S = {a, b, d}, w(S) = 10
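
 A Python sketch of the pricing algorithm (not part of the original notes; realizing the while-loop as
 a single pass over the edges is our own choice, since a processed edge always ends up with at least
 one tight endpoint):

     # Sketch: VertexCoverApprox. w maps nodes to weights, edges is a list of pairs (i, j).
     def vertex_cover_pricing(w, edges):
         paid = {i: 0.0 for i in w}               # total price charged to edges incident to i
         prices = {}
         def tight(i):
             return paid[i] >= w[i]
         for (i, j) in edges:                     # skip edges that already have a tight endpoint
             if tight(i) or tight(j):
                 continue
             delta = min(w[i] - paid[i], w[j] - paid[j])   # raise p_e until i or j becomes tight
             prices[(i, j)] = delta
             paid[i] += delta
             paid[j] += delta
         S = {i for i in w if tight(i)}           # the vertex cover: all tight nodes
         return S, prices

     # With w = {'a': 4, 'b': 3, 'c': 5, 'd': 3} and the edges ('a','b'), ('a','d'), ('c','d')
     # processed in that order, this reproduces the trace above: S = {'a', 'b', 'd'}, w(S) = 10.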
Fact 2: The set S and the prices p returned by the algorithm satisfy the inequality
        w(S) ≤ 2 ∗ Σ_{e∈E} pe .

Proof : All nodes in S are tight, so we have Σ_{e=(i,j)} pe = wi for all i ∈ S. Hence
        w(S) = Σ_{i∈S} wi = Σ_{i∈S} Σ_{e=(i,j)} pe ≤ 2 ∗ Σ_{e∈E} pe ,
        since each edge is counted at most twice (once for each endpoint).

Fact 3: The set S returned by the algorithm is a vertex cover, and its cost is at most twice the
        minimum cost of any vertex cover.
Proof : Suppose, by way of contradiction, that S does not cover edge e = (i, j). This implies that
        neither i nor j is tight, and this contradicts that the while-loop of the algorithm terminated.
        Let p be the prices set by the algorithm, and let S ∗ be an optimal vertex cover. By Fact 2
        we have 2 ∗ Σ_{e∈E} pe ≥ w(S), and by Fact 1 we have Σ_{e∈E} pe ≤ w(S ∗ ) =⇒ w(S) ≤
        2 ∗ Σ_{e∈E} pe ≤ 2 ∗ w(S ∗ ).


     4.4    Linear Programming and Rounding
     The basic linear programming problem can be viewed as a complex version of the problem of
     simultaneous linear equations, with inequalities in place of equations. Specifically, consider the
     problem of determining a vector x that satisfies Ax ≥ b.
 Example: x1 ≥ 0, x2 ≥ 0
          x1 + 2 ∗ x2 ≥ 6
          2 ∗ x1 + x2 ≥ 6
          cT ∗ x = (1.5, 1) ∗ (x1 , x2 )T → M in

          (Figure: the feasible region of these constraints in the (x1 , x2 )-plane, both axes running
          from 0 to 6.)


Definition: Given an m × n matrix A, a vector b ∈ Rm and a vector c ∈ Rn , find a vector x ∈ Rn to
           solve the following optimization problem :

           min( cT ∗ x , such that x ≥ 0, Ax ≥ b)
           (objective function: cT ∗ x;   constraints: x ≥ 0, Ax ≥ b)

           Vertex cover as an Integer Program
           Choose a decision variable xi for each node i ∈ V
           xi = 1 will indicate that node i is in the vertex cover
           xi = 0 will indicate that node i is not in the vertex cover
           For each edge (i, j) ∈ E, we write the inequality xi + xj ≥ 1
           Objective function : wT ∗ x → min

   VC-IP: Min Σ_{i∈V} wi ∗ xi
          xi + xj ≥ 1   ∀(i, j) ∈ E
          xi ∈ {0, 1}   ∀i ∈ V
   Fact 1: S is a vertex cover in G if and only if the vector x defined as xi = 1 for i ∈ S, and xi = 0
           for i ∉ S, satisfies the constraints in VC-IP. Further we have w(S) = wT ∗ x.


   Fact 2: Vertex Cover ≤p Integer Programming.

           Using linear programming for vertex cover
           Modify VC-IP by dropping the requirement that each xi ∈ {0, 1} and relaxing it to the
           constraint that each xi is an arbitrary real number between 0 and 1.
           ֒→ VC-LP.


Fact 3: Let S ∗ denote a vertex cover of minimum weight. Then WLP ≤ w(S ∗ ).

          Example (the inequality can be strict): a triangle on three nodes, each of weight 1.
          Then w(S ∗ ) = |V C| = 2, while x1 = x2 = x3 = 1/2 is feasible for VC-LP, so WLP = 3/2.

          (Figure: a triangle on three nodes of weight 1 each.)

          Given a fractional solution (x∗ ), we define S = {i ∈ V : x∗_i ≥ 1/2}.

 Fact 4: The set S defined in this way is a vertex cover, and (1/2) ∗ w(S) ≤ WLP .

Proof : Consider an edge e = (i, j). Recall the inequality xi + xj ≥ 1 ⇒ in any solution x∗
        that satisfies this inequality, either x∗_i ≥ 1/2 or x∗_j ≥ 1/2. Thus at least one of these two
        will be rounded up, and i or j will be placed in S. Therefore S is a vertex cover.
        WLP = wT ∗ x∗ = Σ_i wi ∗ x∗_i ≥ Σ_{i∈S} wi ∗ x∗_i ≥ (1/2) ∗ Σ_{i∈S} wi = (1/2) ∗ w(S).
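
 A Python sketch of the relaxation-and-rounding step (not part of the original notes; it assumes
 scipy is available and uses scipy.optimize.linprog as the LP solver):

     import numpy as np
     from scipy.optimize import linprog

     # Sketch: solve VC-LP and round every x_i* >= 1/2 up to 1 (Fact 4).
     def vertex_cover_lp_rounding(weights, edges):
         n = len(weights)
         # each constraint x_i + x_j >= 1 becomes -(x_i + x_j) <= -1 in linprog's A_ub x <= b_ub form
         A_ub = np.zeros((len(edges), n))
         for k, (i, j) in enumerate(edges):
             A_ub[k, i] = A_ub[k, j] = -1.0
         b_ub = -np.ones(len(edges))
         res = linprog(c=weights, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * n)
         x_star = res.x
         S = {i for i in range(n) if x_star[i] >= 0.5 - 1e-9}   # rounded cover (small tolerance)
         return S, res.fun                                      # res.fun is W_LP <= w(S*)

     # The triangle example above: vertex_cover_lp_rounding([1, 1, 1], [(0, 1), (1, 2), (0, 2)])
     # returns S = {0, 1, 2} and W_LP = 1.5.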

Fact 5: The algorithm produces a vertex cover S of at most twice the minimum possible weight.

                                                                                           27.01.2006

  4.5     A more advanced LP-Application: Load Balancing
  Definition of the problem
   Each job j has a fixed given size tj ≥ 0 and a set of machines Mj ⊆ M that it may be assigned to.
   We call an assignment of jobs to machines feasible if each job j is assigned to a machine i ∈ Mj .
   The goal is still to minimize the maximum load on any machine: Using Ji ⊆ J to denote the jobs
   assigned to a machine i ∈ M in a feasible assignment, and using Li = Σ_{j∈Ji} tj to denote the
   resulting load, we seek to minimize maxi Li .


  ”Generalized Load Balancing Problem”
GL-IP: We introduce a variable xij for each pair (i, j) of a machine i ∈ M and a job j ∈ J:
       Setting xij = 0 will indicate that job j is not assigned to machine i,
       setting xij = tj will indicate that job j is assigned to machine i.
       For each job we require Σ_i xij = tj . We require xij = 0 whenever i ∉ Mj . The load of a
       machine i can be expressed as Li = Σ_j xij . We use one more variable L and the inequalities
       Σ_j xij ≤ L ∀i ∈ M .


                                           min L
                              Σ_i xij = tj          ∀j ∈ J
GL-IP:                        Σ_j xij ≤ L           ∀i ∈ M
                    (∗)       xij ∈ {0, tj }        ∀j ∈ J, i ∈ Mj
                              xij = 0               ∀j ∈ J, i ∉ Mj
4   APPROXIMATION ALGORITHM                                                                         30


Fact 1: An assignment of jobs to machines has load at most L if and only if the vector x satisfies
        constraints in GL-IP, with L set to the maximum load of the assignment.


GL-LP:                        xij ≥ 0               ∀j ∈ J, i ∈ Mj
                    instead of (∗)


Fact 2: If the optimum value of GL-LP is L, then the optimum load is at least L∗ ≥ L.

Fact 3: The optimum load is at least L∗ ≥ maxj tj .

  We’ll consider the following bipartite graph G(x) = (V (x), E(x))
         V (x) = M ∪ J
          (i, j) ∈ E(x) if and only if xij > 0
Fact 4: Given a solution (X, L) of GL-LP such that the graph G(X) has no cycles, we can use the
        solution X to obtain a feasible assignment of jobs to machines with load at most L + L∗ .
Proof: Since the graph G(X) has no cycles, each of its connected components is a tree. First, root
       the tree at an arbitrary node.




          (Figure: a tree component of G(X), rooted at an arbitrary node.)




         Consider a job j:
           1. If the node corresponding to job j is a leaf of the tree, let machine node i be its parent.
              Since j has degree 1 in G(X) machine i is the only machine that has been assigned any
              part of job j and hence xij = tj .
           2. For a job j whose corresponding node is not a leaf in G(X) we assign j to an arbitrary
              child of the corresponding node in the rooted tree.
          Let i be any machine, and let Ji be the set of jobs assigned to machine i. The set Ji contains
          those children of node i that are leaves, plus possibly the parent p(i) of node i.
          For all jobs j ≠ p(i) assigned to i, we have xij = tj .

                              Σ_{j∈Ji , j≠p(i)} tj = Σ_{j∈Ji , j≠p(i)} xij ≤ Σ_{j∈Ji} xij ≤ L

          For the parent j = p(i) of node i, we use Fact 3: tj ≤ L∗ .

                              =⇒ Σ_{j∈Ji} tj ≤ L + L∗
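
          A Python sketch of this rounding step (not part of the original notes): it assumes we are
          handed a fractional solution x whose support graph G(x) is already a forest, as Fact 6 below
          guarantees can be arranged; the data layout is our own choice.

              from collections import defaultdict, deque

              # Sketch: rounding an acyclic GL-LP solution as in the proof of Fact 4.
              # x maps pairs (machine i, job j) with x_ij > 0 to their fractional value.
              def round_acyclic_solution(x):
                  adj = defaultdict(list)
                  for (i, j) in x:                  # bipartite edges between machine and job nodes
                      adj[('M', i)].append(('J', j))
                      adj[('J', j)].append(('M', i))
                  assign, visited = {}, set()
                  for root in list(adj):            # root each tree of the forest arbitrarily
                      if root in visited:
                          continue
                      parent, queue = {root: None}, deque([root])
                      visited.add(root)
                      while queue:                  # BFS to record parent pointers
                          u = queue.popleft()
                          for v in adj[u]:
                              if v not in visited:
                                  visited.add(v)
                                  parent[v] = u
                                  queue.append(v)
                      for node, par in parent.items():
                          kind, name = node
                          if kind != 'J':
                              continue
                          children = [v for v in adj[node] if v != par]
                          if children:              # internal job node: assign to an arbitrary child machine
                              assign[name] = children[0][1]
                          else:                     # leaf job node: assign to its parent machine (x_ij = t_j)
                              assign[name] = par[1]
                  return assign                     # each machine gets its leaf children plus at most one parent job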




           (Figure: a flow network with a node for each job j and each machine i, an edge (j, i) of
           capacity ∞ for every allowed pair i ∈ Mj , and an edge (i, v) of capacity L from each
           machine to a common sink v; job node j supplies tj units of flow.)
     Fact 5: Solutions of this flow problem with capacity L are in one-to-one correspondence with
             solutions of GL-LP of value L: xij is the flow value along edge (j, i), and the flow value
             on edge (i, v) is the load Σ_j xij on machine i.

     Fact 6: Let (X, L) be any solution to GL-LP and C be a cycle in G(X). We can modify the solution
             X to eliminate at least one edge from G(X) without increasing the load or introducing any
             new edges.

             (Figure: a cycle in G(X); the flow is decreased on alternate edges and increased on the others.)
Proof idea: We modify the solution by augmenting the flow along the cycle C. Assume that the nodes
            along the cycle are i1 j1 i2 j2 ...ik jk , where il is a machine node and jl is a job node. We
            decrease the flow along all edges (il , jl ) and increase the flow on the edges (jl , il+1 ) for all
            l = 1, ..., k (where k + 1 is used to denote 1), by the same amount δ with

                                           δ = min_{l=1,...,k} x_{il jl}


                                                                                                   31.01.2006

      4.6     Arbitrarily Good Approximations - Knapsack Problem
      Goal: “Produce a solution within a small percentage of the optimal solution.”
Definition: Suppose we have n items. Each item i = 1, ..., n has two integer parameters: weight wi ,
           value vi . Given a knapsack capacity W , the goal of the Knapsack problem is to find a subset
           S of items of maximum value subject to the restriction that the total weight of the set should
           not exceed W : maximize Σ_{i∈S} vi under the condition Σ_{i∈S} wi ≤ W .
           Our algorithm will take as input the weights and values defining the problem and will also
           take an extra parameter ǫ, the desired precision. It will find a subset S whose total weight
           does not exceed W , with value Σ_{i∈S} vi at most a (1 + ǫ) factor below the maximum possible
           value.


         The algorithm will run in polynomial time for any fixed choice of ǫ > 0; however, the dependence
         on ǫ will not be polynomial. We call such an algorithm a “Polynomial-time approximation
         scheme” (PTAS).
         We already know a dynamic programming algorithm that runs in time O(n · W ).
 “Old”: OPT(i, W )
 “New”: OPT(i, V ) is the smallest knapsack weight W so that one can obtain a solution using a
        subset of items 1, ..., i with value at least V .

        We will have a subproblem for all i = 0, ..., n and values V = 0, ..., Σ_{j=1}^{i} vj .

        Σ_{j=1}^{n} vj ≤ n · (max_i vi ) = n · v ∗  ⇒  O(n² · v ∗ ) subproblems.
   Recurrence for solving these subproblems:

         if V > Σ_{i=1}^{n−1} vi then OPT(n, V ) = wn + OPT(n − 1, V − vn )

         else OPT(n, V ) = min{OPT(n − 1, V ), wn + OPT(n − 1, max(0, V − vn ))}

       Knapsack (n)
         Array M [0...n, 0...V ∗ ]        (with V ∗ = Σ_{j=1}^{n} vj )
         For i = 0, ..., n do M [i, 0] = 0
         For i = 1, ..., n do
                For v = 1, ..., Σ_{j=1}^{i} vj do
                    if v > Σ_{j=1}^{i−1} vj then
                          M [i, v] = wi + M [i − 1, max(0, v − vi )]
                    else
                          M [i, v] = min{M [i − 1, v], wi + M [i − 1, max(0, v − vi )]}
         return the maximum value V such that M [n, V ] ≤ W
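
   A Python sketch of this value-indexed dynamic program (not part of the original notes; names are
   our own, and M[i][v] is the minimum weight needed to reach value at least v with items 1, ..., i):

       # Sketch: value-indexed Knapsack DP; runs in O(n^2 · v*) time.
       def knapsack_min_weight(values, weights, W):
           n = len(values)
           V_star = sum(values)
           INF = float('inf')
           M = [[0] * (V_star + 1) for _ in range(n + 1)]
           for v in range(1, V_star + 1):
               M[0][v] = INF                       # with no items, positive value is unreachable
           for i in range(1, n + 1):
               prefix = sum(values[:i - 1])        # v_1 + ... + v_{i-1}
               for v in range(1, sum(values[:i]) + 1):
                   take = weights[i - 1] + M[i - 1][max(0, v - values[i - 1])]
                   if v > prefix:                  # items 1, ..., i-1 alone cannot reach value v
                       M[i][v] = take
                   else:
                       M[i][v] = min(M[i - 1][v], take)
           # largest value V whose minimum required weight fits into the knapsack
           return max(v for v in range(V_star + 1) if M[n][v] <= W)

       # e.g. knapsack_min_weight(values=[3, 4, 5], weights=[2, 3, 4], W=5) returns 7 (items 1 and 2).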

  Idea for a PTAS:
  If the values are small integers, then v ∗ is small and the problem can be solved in polynomial
  time already. On the other hand, we will use a rounding parameter b and will consider the values
   rounded to an integer multiple of b. More precisely, for each item i, let its rounded value be
   ṽi = ⌈vi / b⌉ · b.
      We will use our dynamic programming algorithm to solve the problem with the rounded values.
 Fact 1: For each item i we have vi ≤ ṽi ≤ vi + b. The rounded values are all integer multiples of a
         common value b. Instead of solving the problem with the rounded values ṽi , we can change
         the units; we can divide all values by b and get an equivalent problem with values
         v̂i = ṽi / b = ⌈vi / b⌉ for all i = 1, ..., n.


 Fact 2: The Knapsack problem with values ṽi and the scaled problem with values v̂i have the same
         set of optimum solutions, the optimum values differ exactly by a factor of b, and the scaled
         values are integers.

   Knapsack-Approx(ǫ)
         set b = (ǫ/n) ∗ (max_i vi )
         solve the Knapsack Problem with values v̂i (with the help of our dyn. Prog. Algo.)
         return the set S of items found by this algorithm
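
 A Python sketch of the rounding wrapper (not part of the original notes; it reuses the
 knapsack_min_weight sketch from above as the dynamic programming subroutine and, for brevity,
 returns the optimum value of the scaled problem rather than the item set S itself):

     import math

     # Sketch: Knapsack-Approx, running the DP on the scaled values v^_i = ceil(v_i / b).
     def knapsack_approx(values, weights, W, eps):
         b = (eps / len(values)) * max(values)
         scaled = [math.ceil(v / b) for v in values]
         return knapsack_min_weight(scaled, weights, W)

     # Recovering the set S itself requires the usual traceback through the DP table; by Fact 5 its
     # value is within a (1 + eps) factor of the optimum, and by Fact 4 the running time is O(n^3 / eps).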

 Fact 3: The set of items S returned by the algorithm has total weight at most W , that is Σ_{i∈S} wi ≤ W .

 Fact 4: The algorithm Knapsack-Approx runs in polynomial time for any fixed ǫ > 0.
 Proof: The Dyn. Prog. Algo. runs in time O(n² · v̂ ∗ ) with v̂ ∗ = max_i v̂i .
        Determine max_i v̂i : The item j with maximum value vj = max_i vi also has maximum value
        in the rounded problem. So, max_i v̂i = v̂j = ⌈vj / b⌉ = ⌈n / ǫ⌉. ⇒ The overall running time:
        O(n³ / ǫ).
 Fact 5: If S is the solution found by the Knapsack-Approx algorithm, and S ∗ is any other solution
         satisfying Σ_{i∈S ∗} wi ≤ W , then we have (1 + ǫ) · Σ_{i∈S} vi ≥ Σ_{i∈S ∗} vi .

 Proof: Let S ∗ be any set satisfying Σ_{i∈S ∗} wi ≤ W . Our algorithm finds the optimal solution with
        values ṽi :
                              Σ_{i∈S} ṽi ≥ Σ_{i∈S ∗} ṽi          (∗)

        Σ_{i∈S ∗} vi ≤(Fact 1) Σ_{i∈S ∗} ṽi ≤(∗) Σ_{i∈S} ṽi ≤(Fact 1) Σ_{i∈S} (vi + b) ≤ n · b + Σ_{i∈S} vi

        We get:
                              Σ_{i∈S} ṽi ≥ ṽj := max_i ṽi

                              b = (ǫ/n) · (max_i vi ) ⇒ n · b = ǫ · (max_i vi ) ≤ ǫ · Σ_{i∈S} vi

        Thus, we get:
                              Σ_{i∈S ∗} vi ≤ (ǫ · Σ_{i∈S} vi ) + (Σ_{i∈S} vi ) = (ǫ + 1) · Σ_{i∈S} vi
More Related Content

What's hot

Quantum Cryptography and Possible Attacks
Quantum Cryptography and Possible AttacksQuantum Cryptography and Possible Attacks
Quantum Cryptography and Possible AttacksArinto Murdopo
 
Crypto notes
Crypto notesCrypto notes
Crypto notesvedshri
 
Modelling Time in Computation (Dynamic Systems)
Modelling Time in Computation (Dynamic Systems)Modelling Time in Computation (Dynamic Systems)
Modelling Time in Computation (Dynamic Systems)M Reza Rahmati
 
All Minimal and Maximal Open Single Machine Scheduling Problems Are Polynomia...
All Minimal and Maximal Open Single Machine Scheduling Problems Are Polynomia...All Minimal and Maximal Open Single Machine Scheduling Problems Are Polynomia...
All Minimal and Maximal Open Single Machine Scheduling Problems Are Polynomia...SSA KPI
 
Quantum Variables in Finance and Neuroscience Lecture Slides
Quantum Variables in Finance and Neuroscience Lecture SlidesQuantum Variables in Finance and Neuroscience Lecture Slides
Quantum Variables in Finance and Neuroscience Lecture SlidesLester Ingber
 
Approximate Algorithms for the Network Pricing Problem with Congestion - MS t...
Approximate Algorithms for the Network Pricing Problem with Congestion - MS t...Approximate Algorithms for the Network Pricing Problem with Congestion - MS t...
Approximate Algorithms for the Network Pricing Problem with Congestion - MS t...Desirée Rigonat
 
7 introplasma
7 introplasma7 introplasma
7 introplasmaYu Chow
 

What's hot (16)

Quantum Cryptography and Possible Attacks
Quantum Cryptography and Possible AttacksQuantum Cryptography and Possible Attacks
Quantum Cryptography and Possible Attacks
 
t
tt
t
 
Queueing
QueueingQueueing
Queueing
 
Crypto notes
Crypto notesCrypto notes
Crypto notes
 
Modelling Time in Computation (Dynamic Systems)
Modelling Time in Computation (Dynamic Systems)Modelling Time in Computation (Dynamic Systems)
Modelling Time in Computation (Dynamic Systems)
 
All Minimal and Maximal Open Single Machine Scheduling Problems Are Polynomia...
All Minimal and Maximal Open Single Machine Scheduling Problems Are Polynomia...All Minimal and Maximal Open Single Machine Scheduling Problems Are Polynomia...
All Minimal and Maximal Open Single Machine Scheduling Problems Are Polynomia...
 
pickingUpPerl
pickingUpPerlpickingUpPerl
pickingUpPerl
 
Calculus3
Calculus3Calculus3
Calculus3
 
PerlIntro
PerlIntroPerlIntro
PerlIntro
 
Quantum Variables in Finance and Neuroscience Lecture Slides
Quantum Variables in Finance and Neuroscience Lecture SlidesQuantum Variables in Finance and Neuroscience Lecture Slides
Quantum Variables in Finance and Neuroscience Lecture Slides
 
Ns doc
Ns docNs doc
Ns doc
 
MSC-2013-12
MSC-2013-12MSC-2013-12
MSC-2013-12
 
Perl-crash-course
Perl-crash-coursePerl-crash-course
Perl-crash-course
 
thesis
thesisthesis
thesis
 
Approximate Algorithms for the Network Pricing Problem with Congestion - MS t...
Approximate Algorithms for the Network Pricing Problem with Congestion - MS t...Approximate Algorithms for the Network Pricing Problem with Congestion - MS t...
Approximate Algorithms for the Network Pricing Problem with Congestion - MS t...
 
7 introplasma
7 introplasma7 introplasma
7 introplasma
 

Similar to Script md a[1]

Avances Base Radial
Avances Base RadialAvances Base Radial
Avances Base RadialESCOM
 
Climb - A Generic and Dynamic Approach to Image Processing
Climb - A Generic and Dynamic Approach to Image ProcessingClimb - A Generic and Dynamic Approach to Image Processing
Climb - A Generic and Dynamic Approach to Image ProcessingChristopher Chedeau
 
Coding interview preparation
Coding interview preparationCoding interview preparation
Coding interview preparationSrinevethaAR
 
matconvnet-manual.pdf
matconvnet-manual.pdfmatconvnet-manual.pdf
matconvnet-manual.pdfKhamis37
 
Location In Wsn
Location In WsnLocation In Wsn
Location In Wsnnetfet
 
FDTD-FEM Hybrid
FDTD-FEM HybridFDTD-FEM Hybrid
FDTD-FEM Hybridkickaplan
 
Lecture Notes in Machine Learning
Lecture Notes in Machine LearningLecture Notes in Machine Learning
Lecture Notes in Machine Learningnep_test_account
 
A buffer overflow study attacks and defenses (2002)
A buffer overflow study   attacks and defenses (2002)A buffer overflow study   attacks and defenses (2002)
A buffer overflow study attacks and defenses (2002)Aiim Charinthip
 
Math for programmers
Math for programmersMath for programmers
Math for programmersmustafa sarac
 
Reading Materials for Operational Research
Reading Materials for Operational Research Reading Materials for Operational Research
Reading Materials for Operational Research Derbew Tesfa
 
Fundamentals of computational_fluid_dynamics_-_h._lomax__t._pulliam__d._zingg
Fundamentals of computational_fluid_dynamics_-_h._lomax__t._pulliam__d._zinggFundamentals of computational_fluid_dynamics_-_h._lomax__t._pulliam__d._zingg
Fundamentals of computational_fluid_dynamics_-_h._lomax__t._pulliam__d._zinggRohit Bapat
 
Stochastic Programming
Stochastic ProgrammingStochastic Programming
Stochastic ProgrammingSSA KPI
 
Triangulation methods Mihaylova
Triangulation methods MihaylovaTriangulation methods Mihaylova
Triangulation methods MihaylovaZlatka Mihaylova
 

Similar to Script md a[1] (20)

Avances Base Radial
Avances Base RadialAvances Base Radial
Avances Base Radial
 
Climb - A Generic and Dynamic Approach to Image Processing
Climb - A Generic and Dynamic Approach to Image ProcessingClimb - A Generic and Dynamic Approach to Image Processing
Climb - A Generic and Dynamic Approach to Image Processing
 
Grafx
GrafxGrafx
Grafx
 
Habilitation draft
Habilitation draftHabilitation draft
Habilitation draft
 
Di11 1
Di11 1Di11 1
Di11 1
 
Coding interview preparation
Coding interview preparationCoding interview preparation
Coding interview preparation
 
matconvnet-manual.pdf
matconvnet-manual.pdfmatconvnet-manual.pdf
matconvnet-manual.pdf
 
Location In Wsn
Location In WsnLocation In Wsn
Location In Wsn
 
Queueing 3
Queueing 3Queueing 3
Queueing 3
 
Queueing 2
Queueing 2Queueing 2
Queueing 2
 
FDTD-FEM Hybrid
FDTD-FEM HybridFDTD-FEM Hybrid
FDTD-FEM Hybrid
 
Lecture Notes in Machine Learning
Lecture Notes in Machine LearningLecture Notes in Machine Learning
Lecture Notes in Machine Learning
 
A buffer overflow study attacks and defenses (2002)
A buffer overflow study   attacks and defenses (2002)A buffer overflow study   attacks and defenses (2002)
A buffer overflow study attacks and defenses (2002)
 
Cs665 writeup
Cs665 writeupCs665 writeup
Cs665 writeup
 
Algorithms
AlgorithmsAlgorithms
Algorithms
 
Math for programmers
Math for programmersMath for programmers
Math for programmers
 
Reading Materials for Operational Research
Reading Materials for Operational Research Reading Materials for Operational Research
Reading Materials for Operational Research
 
Fundamentals of computational_fluid_dynamics_-_h._lomax__t._pulliam__d._zingg
Fundamentals of computational_fluid_dynamics_-_h._lomax__t._pulliam__d._zinggFundamentals of computational_fluid_dynamics_-_h._lomax__t._pulliam__d._zingg
Fundamentals of computational_fluid_dynamics_-_h._lomax__t._pulliam__d._zingg
 
Stochastic Programming
Stochastic ProgrammingStochastic Programming
Stochastic Programming
 
Triangulation methods Mihaylova
Triangulation methods MihaylovaTriangulation methods Mihaylova
Triangulation methods Mihaylova
 

Script md a[1]

  • 1. Lecture Notes On Algo-Design Lecturer: Ulf-Peter Schroeder February 16, 2006 written by: Braun, Rudolf Brune, Philipp Piepmeyer, Meik Please send corrections - including [AlgoDesign-Script] in the subject line - to: meikp <KLAMMERAFFE> upb <PUNKT> de 1
  • 2. CONTENTS 2 Contents 1 Greedy Algorithms 3 1.1 Interval Scheduling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.2 Scheduling to Minimize Lateness . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.3 ( Huffman Codes and Data Compressions ) . . . . . . . . . . . . . . . . . . . . . . 5 1.4 Theoretical foundations for the greedy method . . . . . . . . . . . . . . . . . . . . 5 2 Divide and Conquer 8 2.1 Finding Closest Pair of Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 2.2 Convolutions at the Fast Fourier Transformation . . . . . . . . . . . . . . . . . . . 10 3 Dynamic Programming 14 3.1 Weighted Interval Scheduling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 3.2 Segmented Least Squares . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 3.3 Subset Sum / Knapsack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 3.4 Sequence Alignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 4 Approximation Algorithm 22 4.1 Load-Balancing Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 4.2 Set Cover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 4.3 Vertex Cover (Pricing Method) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 4.4 Linear Programming and Rounding . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 4.5 A more advanced LP-Application: Load Balancing . . . . . . . . . . . . . . . . . . 29 4.6 Arbitrarily Good Approximations - Knapsack Problem . . . . . . . . . . . . . . . . 31 5 Local Search 34 5.1 Metropolis Algorithm and Simulated Annealing . . . . . . . . . . . . . . . . . . . . 35 5.2 Maximum Cut via Local Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 5.3 Local search Algorithms for Graph-Partitioning . . . . . . . . . . . . . . . . . . . . 37 5.4 Best Response Dynamics and Nash-Equilibria . . . . . . . . . . . . . . . . . . . . . 38
  • 3. 1 GREEDY ALGORITHMS 3 18.10.2005 1 Greedy Algorithms • builds of a solution in ”small” steps • choosing a irreversible decision at each step ??? to optimize some underlying criterion Questions 1. When did a greedy algorithm succeed in solving a mentioned problem optimally? 2. How to proof that a greedy algorithm produces an optimal solution to a problem? 1.1 Interval Scheduling Def.: Set of requests {1, 2, .., m} The ith request corresponds to an interval of time starting at s(i) and finishing at f (i). We’ll say that a subset of the request is compatible if no two of them overlap in time. Our goal is to accept as long a compatible subset as possible. Compatible set of maximum size will be called optimal. Idea: The basic idea is to use a simple route to select the first request i1 . We reject all requests that are not compatible with i1 . Repeat this procedure until we run out of requests. Rule 1: ”Select the available request that starts earliest.” Rule 2: ”Select the request that requires the smallest interval of time.” Rule 3: ”Select the request that has the fewest number of non-compatible requests.” Rule 4: ”Select first the request that finishes first, that is the request i for which f (i) is as small as possible.” Algo.: Initially let R be a set of requests and let A be empty. while R is not yet empty choose a request i ∈ R that has the smallest finishing time. add request i to A. delete all requests from R that are not compatible with request i. end while return A. Analyzing the Algorithm √ Part 1: A is a compatible set of requests. Is the solution A optimal? Let O be an optimal set of intervals. | A |=| O | is to prove! Let i1,..,k be the set of requests in A in the order they were added to A. | A |= k Let j1,..,m be the requests in O. Our goal is to prove that k = m. Part 2: For all indices r ≤ k we have f (ir ) ≤ f (jr ).
  • 4. 1 GREEDY ALGORITHMS 4 Proof: r=1 Our Greedy Rule guaranties that f (i1 ) ≤ f (j1 ). I.H.: The statement is true for r − 1. I.S.: We know that f (jr−1 ) ≤ s(jr ). Combining this with the I.H: f (ir−1 ) ≤ f (jr−1 ) we get f (ir−1 ) ≤ s(jr ). Since interval jr is one of available interests at the time when the greedy algorithm selects ir , we have f (ir ) ≤ f (jr ). Part 3: The greedy algorithm returns an optimal set A. Proof: ( given by contradiction ) If A is not optimal then an optimal set 0 must have more requests, that is, we must have m > k. Applying Part 2 with r = k, then is f (ik ) ≤ f (jk ). Since m > k, there is a request jk+1 in O. This request starts after request jk ends, and ends after ik ends. But the greedy algorithm stops with request ik , and it is only supposed to stop when R is empty. ` 25.10.2005 1.2 Scheduling to Minimize Lateness Definition of the Problem A single ressource, a set of n requests to use the ressource for an interval of time, ressource is available starting at time S, the request i has a deadline di , and it requires a continuous time interval of length ti . Each request must be assigned non overlapping intervals. Objective function We will assign each request i an interval of time of length ti , let us denote this interval [s(i), f (i)] with f (i) = s(i) + ti . We say that a request i is late if it misses the deadline, that is if di < f (i). li = f (i) − di . We say that li = 0 if request i is not late. maximum lateness L = max li i∈{1,..,n} Idea 1: ”Schedule the jobs in order of increasing length ti .” Idea 2: ”Schedule the jobs in order of increasing slacktime di − ti ”. Idea 3: ”Earliest Deadline First” Analyzing the ”Earliest Deadline First” - Greedy Algorithm We start with an optimized schedule O. Our plan we have is to gradually modify O, perserving its optimality at each step, but transforming it into a schedule that is indicated to the schedule A formed by our algorithm. Fact 1: There is an optimal schedule with no idle time. Def.: A schedule A′ has an inversion if a job i with deadline di is scheduled before another job j with earlier deadline dj < di . Fact 2: All schedules with no inversions and no idle time have the same maximum lateness. Fact 3: There is an optimal schedule that has no inversions and no idle time. Proof of Fact 3: By Fact 1 there is an optimal schedule with no idle time.
  • 5. 1 GREEDY ALGORITHMS 5 1. If O has an inversion, then there is a pair of jobs i and j and that j is immediately after i and has dj < di . 2. After swapping i and j, we get a schedule with one less inversion. √ 3. The new swapped schedule has a maximum lateness no longer then that of O. Proof of (3.): All jobs other then jobs i and j finish at the same time in the two schedules. ˜i = f (i) − di = f (j) − di < f (j) − dj = l′ l ˜ j ⇐⇒ ˜i < lj l ′ Notation Optimal schedule O: each request r is scheduled [s(r), f (r)] ′ and has lateness lr L′ = max lr ′ r swapped schedule O : s(r), f (r), ˜r , L ˜ ˜ ˜ l ˜ 1.3 ( Huffman Codes and Data Compressions ) 1.4 Theoretical foundations for the greedy method Def.: A matroid is a pair M = (S, I) satisfying the following conditions: 1. S is a finite non-empty set 1 2. I is a non-empty family of subsets of S such that: B ∈ I and A ⊂ B implies A ∈ I. 3. If A ∈ I, B ∈ I and |A| < |B|, then there exists some element x ∈ B A such that A ∪ [x] ∈ I. 2 Examples: 1. Matrix Matroid Def.: S = set of n-vectors I consists of all subsets of linear independent vectors from S.   1 1 1 0 A = 0 0 1 1 1 0 1 1            1  1 1 0  S = 0, 0, 1, 1   1   0 1 1  e1 e2 e3 e4 I = {∅, {e1 }, {e2}, {e3 }, {e4 }, {e1 , e2 }, {e1, e3 }, {e1 , e4 }, {e2 , e3 }, {e2, e4 }, {e3 , e4 }, {e1, e2 , e3 }, {e1 , e2 , e4 }, {e1, e3 , e4 }} 2. Graphic Matroid Def.: Let G = (V, E) be a continued, undirected graph. MG = (SG , IG ) is defined by SG = E. I consists of all subset A ⊂ E such that (V, A) is acyclic. 1 hereditary 2 exchange property
Conditions 1. and 2. are trivial. 3. (exchange property): Let A and B belong to I with |A| < |B|, i.e. A and B are forests. Let V(A), V(B) be the sets of vertices incident to edges from A and B, respectively.
(a) If there is a vertex b ∈ V(B) \ V(A), then there exists some edge e = (b, c) ∈ B incident to b, and A ∪ {e} is acyclic because b is not touched by any edge of A.
(b) Otherwise we can assume that V(B) ⊆ V(A). (Theorem: a forest with k edges on a vertex set V' contains exactly |V'| − k trees.) A consists of τ1 = |V(A)| − |A| trees and B consists of τ2 = |V(B)| − |B| trees. |V(B)| ≤ |V(A)| and |A| < |B| imply τ2 < τ1, so there exists some edge e ∈ B connecting two different trees of A, and A ∪ {e} is acyclic.

3. "Counterexample": Interval Scheduling. Let S = {1, .., n} be the set of requests; U ⊆ S belongs to I if its requests are mutually compatible. This system is hereditary but does not satisfy the exchange property, so it is not a matroid.

Def.: Let M = (S, I) be a matroid. A ∈ I is called maximal if there exists no x ∈ S \ A such that A ∪ {x} ∈ I.

Lemma: All maximal independent subsets in a matroid have the same size.
Proof: Suppose the contrary: there exist two maximal independent subsets A, B with |A| < |B|. By the exchange property some x ∈ B \ A could be added to A, contradicting the maximality of A.

Examples:
1. Matrix matroid: A is maximal ⇔ |A| = rank(S).
2. Graphic matroid: A is maximal ⇔ A is a spanning tree of G ⇔ |A| = |V| − 1.

Def.: A matroid M = (S, I) is called weighted if there is a weight w(x) > 0 for each x ∈ S. For A ⊆ S we set w(A) = Σ_{x ∈ A} w(x). The maximum weight independent subset problem: find a maximum weight independent subset in a weighted matroid.

Greedy(M, w)
  A = ∅
  sort S[M] into nonincreasing order by weight w
  for each x ∈ S[M], taken in nonincreasing order by weight w do
    if A ∪ {x} ∈ I(M) then A := A ∪ {x}
  return A

MST: Define the graphic matroid M_G = (S_G, I_G) with S_G = E; I_G consists of all subsets A ⊆ E such that (V, A) is acyclic. We define w'(e) = (w_max + 1) − w(e) with w_max = max_{e ∈ E} w(e). It holds that w'(e) > 0 for all e ∈ E, and Greedy(M_G, w') computes an optimal subset (that is, a maximum weight independent subset), which is a minimum spanning tree of the original graph.
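For the graphic matroid, Greedy(M, w) with the acyclicity test is exactly a maximum-weight forest computation; below is a minimal sketch in Python, using a union-find structure for the independence check (the edge-list representation and helper names are our own):

```python
# A minimal sketch of Greedy(M, w) instantiated for the graphic matroid.
# Edges are (u, v, weight) triples over vertices 0..n_vertices-1.
def greedy_graphic_matroid(n_vertices, edges):
    parent = list(range(n_vertices))

    def find(x):                      # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    A = []
    # sort S[M] into nonincreasing order by weight w
    for u, v, w in sorted(edges, key=lambda e: e[2], reverse=True):
        ru, rv = find(u), find(v)
        if ru != rv:                  # A ∪ {x} is still acyclic, i.e. independent
            parent[ru] = rv
            A.append((u, v, w))
    return A

# MST via the reduction above: run the same greedy on w'(e) = (w_max + 1) - w(e).
def mst(n_vertices, edges):
    w_max = max(w for _, _, w in edges)
    primed = [(u, v, (w_max + 1) - w) for u, v, w in edges]
    return [(u, v) for u, v, _ in greedy_graphic_matroid(n_vertices, primed)]
```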
Lemma: Suppose that M = (S, I) is a weighted matroid with weight function w and that S is sorted into nonincreasing order by weight. Let x be the first element of S such that {x} is independent. Then there exists an optimal subset A of S that contains x.
Proof: Let B ∈ I be an optimal subset. If x ∈ B, then the proof is done. So now let x ∉ B. Construct the set A as follows: begin with A = {x}. By the choice of x, A is independent and w(x) ≥ w(y) for any y ∈ B. Using the exchange property, find an element x1 ∈ B such that A = {x, x1} is independent. Repeat this procedure until |A| = |B|. Then A = (B \ {y}) ∪ {x} for some y ∈ B. Since w(x) ≥ w(y) by the choice of x, we get w(A) ≥ w(B). Since B is optimal, A must also be optimal, and since x ∈ A, the lemma is proved.

Theorem: If M = (S, I) is a weighted matroid with weight function w, then the call Greedy(M, w) returns an optimal independent subset.
Proof: We show the so-called "optimal-substructure property". Let x be the first element chosen by Greedy. If x is chosen as an element of the solution, then the remaining problem is to find a maximum weight independent subset in the matroid M' = (S', I') with
S' = {y ∈ S : {x, y} ∈ I},  I' = {B ⊆ S \ {x} : B ∪ {x} ∈ I}.
Proof of this property: If A is any maximum-weight independent subset containing x, then A' = A \ {x} is an optimal subset for M'. Conversely, any optimal subset A' of M' yields a subset A = A' ∪ {x} which has optimal weight among all subsets from I containing x. From the previous lemma we know that there exists an optimal solution containing x. This shows the optimality of Greedy by induction on |S|.

Conclusion: Here we have learned three techniques to prove the correctness of a greedy algorithm.
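For reference, the Earliest Deadline First rule of Section 1.2 is equally short in code; a minimal sketch, assuming the jobs are given as (t_i, d_i) pairs and the resource is free from time s on:

```python
# A minimal sketch of "Earliest Deadline First": sort by deadline, schedule
# with no idle time, and report the maximum lateness L = max_i l_i.
def edf_schedule(jobs, s=0):
    order = sorted(range(len(jobs)), key=lambda i: jobs[i][1])  # by deadline d_i
    finish_time = s
    schedule = {}
    max_lateness = 0
    for i in order:
        t_i, d_i = jobs[i]
        start = finish_time                 # no idle time between jobs
        finish_time = start + t_i
        schedule[i] = (start, finish_time)  # [s(i), f(i)]
        max_lateness = max(max_lateness, finish_time - d_i)  # l_i = f(i) - d_i, 0 if not late
    return schedule, max_lateness

# Example with three jobs (length, deadline):
# print(edf_schedule([(1, 2), (2, 4), (3, 6)]))
```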
  • 8. 2 DIVIDE AND CONQUER 8 08.11.2005 2 Divide and Conquer Principle: • break the input into several parts • solves the problem in each part recursively • combines the solutions of the subproblems into an overall solution Running time: T (n) ≤ aT ( n ) + f (n) b • a is the number of subsolutions • b specifies the size of the subsolutions Example: Sorting with Quick-Sort The function ”Partition” establishes an element x: ... x ... ≥x x≥ • Best case, Partition always finds the element in the middle: T (n) ≤ 2T ( n ) + cn =⇒ O(n ∗ log(n)) 2 • Worst case, Partition always finds a marginal element: T (n) ≤ T (n − 1) + cn =⇒ O(n2 ) 2.1 Finding Closest Pair of Points Definition of the Problem Given n points in the plane. Find the pair that is closest together. Let P = {p1 , ..., pn } be the set of points where pi has coordinates (xi , yi ). For two points pi , pj ∈ P , we use d(pi , pj ) to denote the Euclidian distance between them. Our goal is to find a pair of points pi , pj that minimizes d(pi , pj ). Idea q Q R q q q q q q q q q q q A: Setting up the recursion: (1) We sort all the points in P by x-coordinate and again by y-coordinate producing lists Px and Py .
  • 9. 2 DIVIDE AND CONQUER 9 (2) We define Q to be the set of points in the n positions of the list Px and R to be the 2 set of points in the final n positions of the list Px . 2 (3) By a single pass through each of Px and Py , we can create the following four lists: Qx : consisting of the points in Q sorted by increasing x-coordinate Qy : consisting of the points in Q sorted by increasing y-coordinate Rx : consisting of the points in R sorted by increasing x-coordinate Ry : consisting of the points in R sorted by increasing y-coordinate x x (4) We now recursively determine a closest pair of points in Q. Suppose q0 and q1 are returned as a pair of points in Q. Similarly we determine a closest pair of points in R, x x obtaining r0 and r1 . B: Combining the solutions: Let δ be the minimum of d(q0 , q1 ) and d(r0 , r1 ). Let x∗ denote the x-coordinate of the x x x x rightmost point in Q, and let L denote the vertical line described by the equation x = x∗ . Fact 1: If there exists q ∈ Q and r ∈ R for which d(q, r) < δ, then each of q and r lies within a distance δ of L. q Q R q ' E' E q δ δ q q q q q q q q q L Proof: Suppose such q and r exist. We write q = (qx , qy ) and r = (rx , ry ). We know qx ≤ x∗ ≤ rx =⇒ x∗ − qx ≤ rx − qx ≤ d(q, r) < δ and √ rx − x∗ ≤ rx − qx ≤ d(q, r) < δ We know that we can restrict our search to the narrow band consisting of only points in P within δ of L. Let S ⊆ P denote this set and let Sy denote the list consisting of the points in S sorted by increasing y-coordinate. =⇒ There exist q ∈ Q and r ∈ R for which d(q, r) < δ if and only if there exist s, s′ ∈ S for which d(s, s′ ) < δ. Fact 2: If s, s′ ∈ S have the property that d(s, s′ ) < δ, then s and s′ are within 15 positions of each other in the sorted list Sy .
  • 10. 2 DIVIDE AND CONQUER 10 Proof: Consider the subset Z of the plane consists of all points within distance δ of L. We partition Z into boxes, Z ' δ E' δ E q Q R q q ′  s  q12 13 q 14 15 q   δ 3∗ 2 q  8 9 10 11 q  q  4 5 6 7 q s 1 2 3 q q L δ squares with horizontal and vertical size of length √ 2. It holds: each box contains at most one point of S. Now suppose that s, s′ ∈ S have the property that d(s, s′ ) < δ and that they are at least 16 positions apart in Sy . Assume w.l.o.g. that s has the smaller y-coordinate. Then, since there can be at most one point per box, there are at least three ”rows” of Z lying between s and s′ . But any two points in Z separated by at least three ”rows” must be a distance of δ at least 3 ∗ 2 apart. ` We can conclude the algorithm as following: We make one pass through Sy and for each s ∈ Sy , we compute its distance to each of the next 15 points in Sy . Running time: T (n) ≤ 2 ∗ T ( n ) + O(n) = O(n ∗ log(n)) 2 2.2 Convolutions at the Fast Fourier Transformation Definition of the problem: Given two vectors a = (a0 , . . . , an−1 ) b = (b0 , . . . , bn−1 ). The convolution of the two vectors of length n is a vector with 2n − 1 coordinates, where co- ordinate k is equal to ai ∗ bj =⇒ (i,j):i+j=k∧i,j<n a ∗ b = (a0 ∗ b0 , a1 ∗ b0 + a0 ∗ b1 , a0 ∗ b2 + a1 ∗ b1 + a2 ∗ b0 , . . .) b0 b1 b2 b3 ... bn−1 ¨ ¨ ¨ ¨¨ ¨¨ ¨¨ ¨ a0 a¨0 a0 b1 a0 b2 a0 b3 ... a0 bn−1 ¨ 0 b¨¨ ¨¨ ¨¨ a1 a¨0 a¨1 a¨2 a1 b3 ... a1 bn−1 ¨ 1b ¨ 1b ¨ 1b ¨ ¨ a2 a¨0 a¨1 a2 b2 a2 b3 ... ¨ 2 b¨¨ 2 b a2 bn−1 a3 a¨0 a3 b1 a3 b2 a3 b3 ... a3 bn−1 . ¨ 3.b . . . . . . . . . . . . . . . . an−1an−1 b0 an−1 b1 an−1 b2 an−1 b3 . . . an−1 bn−1
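Computed directly from this definition, the convolution takes O(n^2) time; a minimal sketch follows (the goal of the rest of this section is to do better with the Fast Fourier Transformation):

```python
# A minimal O(n^2) sketch of the convolution as defined above:
# coordinate k of a * b is the sum of a_i * b_j over all pairs with i + j = k.
def convolve_naive(a, b):
    n = len(a)
    c = [0] * (2 * n - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

# print(convolve_naive([1, 2, 3], [4, 5, 6]))  # [4, 13, 28, 27, 18]
```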
  • 11. 2 DIVIDE AND CONQUER 11 Motivation Example 1: ”Polynomial Multiplication” Representation 2 m−1 A(x) = a0 + a1 x + a2 x + . . . + am−1 x −→ (a0 , a1 , a2 , . . . , am−1 ) B(x) = b0 + b1 x + b2 x2 + . . . + bn−1 xn−1 −→ (b0 , b1 , b2 , . . . , bn−1 ) C(x) = A(x) ∗ B(x) −→ (c0 , c1 , c2 , . . . , cm+n−2 ) ck = ai b j (i,j):i+j=k Example 2: ”Signal Processing” Suppose we have a vector a = (a0 , a1 , . . . , am−1 ) representing a sequence of measurements, sampled at m consecutive points in time. A common operation is to ”smooth” the measurements by averaging each ai with a weighted sum of its neighbors within k steps to the left and right in the sequence. We define a ”mask” w = (w−k , w−(k−1) , . . . , w−1 , w0 , w1 , . . . , wk−1 , wk ) consisting of the weights we want to use for averaging each point with its neighbor. k We replace ai with a′ = i ws ai+s s=−k Let’s define b = (b0 , b1 , . . . , b2k ) by setting bl = wk−l : a′ = i b l aj (j,l):j+l=i+k Example 3: ”Combining Histograms” 15.11.2005 Aim: Running time of O(n log n) Explanation : Complex roots of Unity Complex number reω∗i where eΠ∗i = −1 and e2∗Π∗i = 1 The polynomial equitation xk = 1 has k distinct complex roots 2∗Π∗j∗i wjk = e k for j = 0, 1, ..., k − 1 called k th roots of unity.
  • 12. 2 DIVIDE AND CONQUER 12 k=8 Imaginary Axis T ω2,8 i r ω3,8 r r ω1,8 -1 ω4,8 ω0,8 1 r r E Real Axis r r ω5,8 ω6,8 ω7,8 r -i Idea : We are given the vectors a = (a1 , a2 , ..., an−1 ) and b = (b1 , b2 , ..., bn−1 ). We will view them as the polynomial A(x) = a0 + a1 x + a2 x2 + ... + an−1 xn−1 B(x) = b0 + b1 x + b2 x2 + ... + bn−1 xn−1 . We will seek to compute their product C(x) = A(x)∗ B(x) in (O(nlog(n)) time. The vector C = (c0 , c1 , ..., c2n−2 ) is exactly the convolution a ∗ b. Now, rather than multiplying A and B symbolically, we can treat them as functions of the variable x and multiply them as follows : (i) Polynomial Evaluation : We chose 2n values x1 , ..., x2n and evaluate A(xj ),B(xj ) for each j = 1, 2, .., 2n. (ii) Compute C(xj ) for each j = 1, ..., 2n (iii) Polynomial Interpolation : Recover C from its values on x1 , ..., x2n For our numbers x1 , ..., x2n on which to evaluate A and B we will choose the (2n)th roots of unity. The representation of a degree-d polynomial P by its values on the (d + 1)th roots of unity is referred to as the ”‘Discrete Fourier transform of P ”’. n−2 (A) A(x) = Aeven (x2 ) + x ∗ Aodd (x2 ) with Aeven (x) = a0 + a2 x + a4 x2 + ... + an−2 x 2 n−2 Aodd (x) = a1 + a3 x + a5 x2 + ... + an−1 x 2 . Suppose that we evaluate each of the Aeven and Aodd on the (n)th roots of unity. This is exactly a version of the problem we face with A and the (2n)th roots of unity, except that the input is half as large. We have just to produce the evaluation of A on the (2n)th roots of unity using O(n) additional 2Πji operations. Consider one of these roots of unity ωj2n = ǫ 2n 2Πji 2Πij (ωj2n )2 = (ǫ 2n )2 = ǫ n and hence (ωj2n )2 is a (n)th root of unity ⇒ T (n) ≤ 2T ( n ) + O(n) 2 (B) The construction of C can be achieved by defining an appropriate polynomial (P ) and evaluating it at the (2n)th roots of unity. 2n−1 Consider a polynomial C(x) = cs ∗ x2 that we want to reconstruct from its values s=0 C(ωs2n ) at the (2n)th roots of unity. 2n−1 Define a new polynomial D(x) = ds ∗ xs where ds = C(ωs2n ). s=0 2n−1 D(ωj2n ) = C(ωs2n ) ∗ ωj2n s s=0
  • 13. 2 DIVIDE AND CONQUER 13 2n−1 2n−1 = ( ct ∗ ωs2n t ) ∗ ωj2n s s=0 t=0 2n−1 2n−1 = ct ∗ ( ωs2n t ∗ ωj2n s ) t=0 s=0 2n−1 2n−1 2Πi 2Πi = ct ∗ ( ((e 2n )s )t ∗ ((e 2n )j )s ) t=0 s=0 2n−1 2n−1 (2Πi)∗(st+js) = ct ∗ ( e 2n ) t=0 s=0 2n−1 2n−1 (2Πi)∗(t+j) = ct ∗ ( (e 2n )s ) t=0 s=0 2n−1 2n−1 = ct ∗ ( ωt+j2n s ) t=0 s=0 The only form of the last lines outer sum that is not equal to 0 is for ct such that ωt+j2n = 1. Explanation : 2n−1 For any (2n)th root of unity ω = 1, we have ω s = 0 x2n = 1 ⇔ x2n − 1 = 0 ⇔ s=0 2n−1 x2n − 1 = (x − 1) ∗ ( xt ) → This happens if t + j is a multiple of 2n, that is, if t=0 2n−1 2n−1 s t = 2n − j For this value, ωt+j2n = 2n So we get that D(ωj,2n ) = 2n ∗ c2n−1 s=0 s=0 2n−1 Fact: For any polynomial C(x) = cs ∗ xs and corresponding polynomial D(X) = s=0 2n−1 1 C(ωs2n ) ∗ xs we have that cs = 2n ∗ D(ω2n−s,2n ) s=0
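Putting the two halves together (evaluation at the roots of unity, pointwise multiplication, and interpolation via the Fact above), here is a minimal recursive sketch, assuming the padded length is a power of two; the inverse transform is implemented via complex conjugation, which is one way of realizing the formula c_s = (1/2n) D(ω_{2n−s,2n}):

```python
import cmath

# Evaluate the coefficient vector a at the n-th roots of unity w_{k,n} = e^{2*pi*k*i/n},
# following A(x) = A_even(x^2) + x * A_odd(x^2)  (n must be a power of two).
def fft(a):
    n = len(a)
    if n == 1:
        return a
    even, odd = fft(a[0::2]), fft(a[1::2])
    out = [0] * n
    for k in range(n // 2):
        w = cmath.exp(2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

def inverse_fft(values):
    # interpolation: an FFT with conjugated roots of unity, divided by n
    n = len(values)
    conj = fft([v.conjugate() for v in values])
    return [v.conjugate() / n for v in conj]

def convolution(a, b):
    """Compute the convolution a * b (length 2n - 1) in O(n log n) time."""
    size = 1
    while size < len(a) + len(b) - 1:
        size *= 2
    fa = fft(a + [0] * (size - len(a)))
    fb = fft(b + [0] * (size - len(b)))
    c = inverse_fft([x * y for x, y in zip(fa, fb)])
    return [round(v.real, 10) for v in c[:len(a) + len(b) - 1]]

# print(convolution([1, 2, 3], [4, 5, 6]))  # -> [4.0, 13.0, 28.0, 27.0, 18.0]
```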
3 Dynamic Programming

Basic Idea: One implicitly explores the space of all possible solutions by carefully decomposing the problem into a series of subproblems, and then building up correct solutions to larger and larger subproblems.

3.1 Weighted Interval Scheduling

Definition of the Problem: We have n requests labeled 1, . . . , n, with each request i specifying a start time s_i and a finishing time f_i. Each interval i has a weight v_i. Two intervals are compatible if they do not overlap. The goal is to select a set S ⊆ {1, . . . , n} of mutually compatible intervals so as to maximize the sum of the values of the selected intervals, Σ_{i ∈ S} v_i.
Let us suppose that the requests are sorted in order of nondecreasing finishing time: f1 ≤ f2 ≤ . . . ≤ fn. We'll say a request i comes before request j if i < j.

Example (six intervals): v1 = 2, p(1) = 0; v2 = 4, p(2) = 0; v3 = 4, p(3) = 1; v4 = 7, p(4) = 0; v5 = 2, p(5) = 3; v6 = 1, p(6) = 3.

We define p(j) for an interval j to be the largest index i < j such that intervals i and j are disjoint, and we set p(j) = 0 if no request i < j is disjoint from j. For any j between 1 and n, let O_j denote the optimal solution of the problem consisting of requests {1, . . . , j}, and let OPT(j) denote the value of this solution. For the optimal solution O_j it holds that either j ∈ O_j, in which case OPT(j) = v_j + OPT(p(j)), or j ∉ O_j, in which case OPT(j) = OPT(j − 1).

Fact 1: OPT(j) = max{v_j + OPT(p(j)), OPT(j − 1)}
Fact 2: Request j belongs to an optimal solution on the set {1, . . . , j} if and only if v_j + OPT(p(j)) ≥ OPT(j − 1).

Remark 1: These facts form the first crucial component on which a dynamic programming solution is based: a recurrence equation that expresses the optimal solution in terms of the optimal solutions to smaller subproblems.
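Fact 1 translates directly into a short program once the values OPT(j) are cached (the "memoization" idea developed below); a minimal sketch, assuming the intervals are given as (s_i, f_i, v_i) triples and that an interval may start exactly when another one finishes:

```python
import bisect
from functools import lru_cache

# A minimal sketch of the recurrence in Fact 1.  Intervals are (s, f, v) triples,
# sorted by finishing time; p(j) is found by binary search over the finish times.
def weighted_interval_scheduling(intervals):
    ivs = sorted(intervals, key=lambda t: t[1])          # nondecreasing f_i
    finish = [f for _, f, _ in ivs]

    def p(j):                                            # largest i < j with f_i <= s_j (1-indexed)
        return bisect.bisect_right(finish, ivs[j - 1][0])

    @lru_cache(maxsize=None)
    def opt(j):
        if j == 0:
            return 0
        s, f, v = ivs[j - 1]
        return max(v + opt(p(j)), opt(j - 1))            # Fact 1

    return opt(len(ivs))

# Small check: print(weighted_interval_scheduling([(0, 3, 2), (1, 5, 4), (4, 6, 4)]))  # -> 6
```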
Example: [Figure: recursion tree of the naive recursive algorithm on the six-interval example. OPT(6) calls OPT(5) and OPT(3); subproblems such as OPT(3), OPT(2), OPT(1) and OPT(0) are recomputed many times.]

Remark 2: A fundamental observation, which forms the second crucial component of a dynamic programming solution, is that our recursive algorithm is really only solving n + 1 different subproblems. How could we eliminate the redundancy? =⇒ "Memoization"

22.11.2005

"Memoization": We use an array M[0..n]; M[j] will start with the value "empty" but will hold the value of OPT(j) as soon as it is first determined.

M-OPT(j)
  if j = 0 then
    return 0
  else if M[j] is not empty then
    return M[j]
  else
    M[j] = max(v_j + M-OPT(p(j)), M-OPT(j − 1))
    return M[j]

Iterative-M-OPT(n)
  M[0] = 0
  for j = 1, .., n
    M[j] = max(v_j + M[p(j)], M[j − 1])
  return M[n]

So far we have simply computed the value of an optimal solution. What we want is the full optimal set of intervals as well. We know from Fact 2 that j belongs to an optimal solution for the set of intervals {1, .., j} iff ("iff" means "if and only if") v_j + OPT(p(j)) ≥ OPT(j − 1).

Find-Solution(j)
  if j = 0 then
    output nothing
  else if v_j + M[p(j)] ≥ M[j − 1] then
    output j together with the result of Find-Solution(p(j))
  else
    output the result of Find-Solution(j − 1)

"Informal Guidelines"
1. There are only a polynomial number of subproblems.
2. The solution to the original problem can be easily computed from the solutions to the subproblems.
3. There is a natural ordering on subproblems from "smallest" to "largest", together with an easy-to-compute recurrence that allows one to determine the solution to a subproblem from the solutions to some number of smaller subproblems.

3.2 Segmented Least Squares

[Figure: a set of points in the plane that is poorly approximated by a single line but well approximated by a sequence of several line segments.]
  • 17. 3 DYNAMIC PROGRAMMING 17 Problem description Suppose our data consists of a set P of n points in the plane, denote (x1 , y1 ), (x2 , y2 ), .., (xn , yn ). Suppose x1 < x2 < .. < xn . Given a line L defined by the equation y = ax + b, we say that the error of L with respect to P is the sum of its squared ”distances” to the points in P . n Error(L, P ) = (yi − axi − b)2 i−1 The line of minimal error is y = ax + b, where n∗ i xi yi − ( i xi )( i yi ) a= n 2 2 n∗ i xi − ( i xi ) i xi − a ∗ i xi b= n y T r @ r r @@ @@@    r r r   r   r E x Formulating the Problem We are given a set of points P = {(x1 , y1 ), (x2 , y2 ), .., (xn , yn )} with x1 < x2 < .. < xn . We will use pi to donate the point (xi , yi ). We must first partition P into some number of segments. Each segment is a subset of P that represents a continuous set of x-coordinates, that is, it is a subset of the form {pi , pi+1 , .., pj−1 , pj } for some indices i ≤ j. Then, for each segment S in our partition of P , we compute the line minimizing the error with respect to the points in S. The penalty of a partition is defined to be a sum of the following terms: 1. The number of segments into which we partition P , times a fixed given multiple C > 0. 2. For each segment, the error value of the optimal line though that segment. Our goal in the ”Segmented Least Squares” is to find a partition of minimum penalty. y T r n r @ i @@r r @ @@ r r r r r E 1 i-1 x
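Computing the line of minimal error for one segment is a direct application of the formulas for a and b above; a minimal sketch, assuming the points have at least two distinct x-coordinates so the denominator is nonzero:

```python
# A minimal sketch of the closed-form least-squares line for a set of points
# (x, y), together with its Error(L, P) value.
def best_fit_line(points):
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)   # slope
    b = (sy - a * sx) / n                           # intercept
    error = sum((y - a * x - b) ** 2 for x, y in points)
    return a, b, error

# print(best_fit_line([(0, 0), (1, 1), (2, 2.1)]))
```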
  • 18. 3 DYNAMIC PROGRAMMING 18 Observation: The last point pn belongs to a single segment in the optimal partition. If we knew the identity of the last segment pi , .., pn , then we could remove these points from consideration and recursively solve the problem on the remaining points p1 , .., pi . Suppose we let OP T (i) denote the optimal solution for the points p1 , .., pi , and we let eij denote the minimum error of any line with respect to pi , .., pj . Fact 1: If the last segment of the optimal partition is pi , .., pn then the value of the optimal solution is OP T (n) = ein + C + OP T (i − 1) Fact 2: For the subproblems on the points p1 , .., pj OP T (j) = min {eij + C + OP T (i − 1)} 1≤i≤j and the segment pi , .., pj is used in an optimal solution for the subproblem iff the minimum is obtained using index i. 3.3 Subset Sum / Knapsack 16.12.2005 Subset Sum Problem We are given n items {1, .., n} and each has a given nonnegative weight wi (for i = 1, .., n). We are also given a bound W . We would like to select a subset S of the items so that i∈S wi ≤ W and, subject to this restriction i∈S wi is as large as possible. 0 W W1 W2 Knapsack: each item has both a value vi and a weight wi , i∈S wi ≤ W and i∈S vi is as large as possible. OPT(n) n ∈ O: / OPT(n) = OPT(n-1) n ∈ O: OPT(n) = ? OPT(n) can not solve the problem as it has only one parameter, n. Suppose we take more than one parameter: OPT(n,W) n ∈ O: / OPT(n,W) = OPT(n-1,W) n ∈ O: OPT(n,W) = wn + OP T (n − 1, W − wn )
OPT(i, W) = max{OPT(i − 1, W), w_i + OPT(i − 1, W − w_i)}

SubsetSum(n, W)
  Array M[0..n, 0..W]
  Initialize M[0, w] = 0 for all w ∈ {0, .., W}
  For i = 1 to n
    For j = 0 to W
      compute M[i, j] = max{M[i − 1, j], w_i + M[i − 1, j − w_i]}   (the second term only if j ≥ w_i)
  return M[n, W]

[Figure: the table M[0..n, 0..W] of subproblems; entry M[i, j] depends only on the two entries M[i − 1, j] and M[i − 1, j − w_i] in the row below.]

Fact 1: The SubsetSum(n, W) algorithm correctly computes the optimal value of the problem and runs in O(n · W) time.
Note 1: The running time is a polynomial function of n and W, the largest integer involved in defining the problem. We call such algorithms "pseudo-polynomial".

Extension to the Knapsack problem
n ∉ O: OPT(n, W) = OPT(n − 1, W)
n ∈ O: OPT(n, W) = v_n + OPT(n − 1, W − w_n)
Fact 2: If W < w_i, then OPT(i, W) = OPT(i − 1, W); otherwise OPT(i, W) = max{OPT(i − 1, W), v_i + OPT(i − 1, W − w_i)}.

3.4 Sequence Alignment

Motivation
1. "Online dictionaries"
Input: o c u r r a n c e — Output: Do you mean o c c u r r e n c e?
Two possible alignments:
  o-currance        o-curr-ance
  occurrence        occurre-nce
The first uses one gap and one mismatch, the second uses three gaps and no mismatch; which is better depends on whether cost_mismatch + cost_gap < 3 · cost_gap.
Goal: We want a model in which similarity is determined roughly by the number of gaps and mismatches we have when we line up the two words.

2. "Computational biology"
An organism's genome is divided into giant linear DNA molecules known as chromosomes. We can think of each as a linear tape containing a string over the alphabet {A, C, G, T}.

Definition: Suppose we are given two strings x and y, where x consists of the sequence of symbols x1 x2 x3 ... xm and y consists of the sequence of symbols y1 y2 y3 ... yn. Consider the sets {1, ..., m} and {1, ..., n} as representing the different positions in the strings x and y, and consider a matching of these sets, that is, a set of ordered pairs with the property that each item occurs in at most one pair. We say that a matching M of these two sets is an alignment if there are no "crossing" pairs: if (i, j), (i', j') ∈ M and i < i', then j < j'.

Example: x = s t o p, y = t o p s. A corresponding alignment is {(2, 1), (3, 2), (4, 3)}.

Problem: Suppose M is a given alignment between x and y:
(1) There is a parameter δ > 0 that defines a gap penalty; each position of x or y that is not matched in M is a gap and is charged δ.
(2) For each pair of letters p, q in our alphabet, there is a mismatch cost of α_{pq} for lining up p with q.
(3) The cost of M is the sum of its gap and mismatch costs, and we seek an alignment of minimum cost.

Fact 1: Let M be any alignment of x and y. If (m, n) ∉ M, then either the mth position of x or the nth position of y is not matched in M.
Proof: Suppose, by way of contradiction, that (m, n) ∉ M but both end positions are matched: there are numbers i < m and j < n so that (m, j) ∈ M and (i, n) ∈ M. This contradicts our definition of alignment: we have (i, n), (m, j) ∈ M with i < m but n > j, so the pairs (i, n) and (m, j) cross. Contradiction.

Fact 2: In an optimal alignment M, at least one of the following is true:
(i) (m, n) ∈ M, or
(ii) the mth position of x is not matched, or
(iii) the nth position of y is not matched.

Let OPT(i, j) denote the minimum cost of an alignment between x1 x2 ... xi and y1 y2 ... yj.
In case (i), we pay α_{xm yn} and then align x1 x2 ... x(m−1) as well as possible with y1 y2 ... y(n−1): OPT(m, n) = α_{xm yn} + OPT(m − 1, n − 1).
In case (ii), we pay a gap cost of δ since the mth position of x is not matched, and then we align x1 ... x(m−1) as well as possible with y1 ... yn: OPT(m, n) = δ + OPT(m − 1, n).
In case (iii), we pay a gap cost of δ since the nth position of y is not matched, and then we align x1 ... xm as well as possible with y1 ... y(n−1): OPT(m, n) = δ + OPT(m, n − 1).

Fact 3: The minimum alignment costs satisfy the following recurrence for i ≥ 1, j ≥ 1:
OPT(i, j) = min{α_{xi yj} + OPT(i − 1, j − 1), δ + OPT(i − 1, j), δ + OPT(i, j − 1)}
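The recurrence of Fact 3 gives an O(mn) algorithm; a minimal sketch, assuming a gap penalty delta and a mismatch-cost function alpha with alpha(p, p) = 0:

```python
# A minimal sketch of the recurrence in Fact 3.
# OPT[i][j] is the minimum cost of aligning x[:i] with y[:j].
def alignment_cost(x, y, delta, alpha):
    m, n = len(x), len(y)
    OPT = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        OPT[i][0] = i * delta            # x[:i] aligned against the empty string
    for j in range(1, n + 1):
        OPT[0][j] = j * delta
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            OPT[i][j] = min(alpha(x[i - 1], y[j - 1]) + OPT[i - 1][j - 1],  # case (i)
                            delta + OPT[i - 1][j],                          # case (ii)
                            delta + OPT[i][j - 1])                          # case (iii)
    return OPT[m][n]

# Example: unit gap cost, mismatch cost 2
# print(alignment_cost("stop", "tops", 1, lambda p, q: 0 if p == q else 2))
```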
  • 21. 3 DYNAMIC PROGRAMMING 21 Formulation of the sequence alignment algorithm as graph theoretical problem Suppose we build a two-dimensional grid graph Gx,y , with the rows labeled by symbols in the string x and the columns labeled by the symbols in y. We number the rows from 0 to m and the columns from 0 to n. We put costs on edges of Gx,y : • each horizontal and vertical edge get cost δ. • the diagonal edge from (i − 1, j − 1) to (i, j) get cost αxi ,yj . x3 iE iE iE iE i T   T   T   T   T x2 i  E i  E i  E i  E i T  T   T   T   T x1 i  E i  E i  E i  E i T   T   T   T   T 0 i  E i  E i  E i  E i 0 y1 y2 y3 y4 Fact 4: Let f (i, j) denote the minimum cost of a path from (0, 0) to (i, j) in Gx,y . Then for all i, j, we have f (i, j) = OP T (i, j).
  • 22. 4 APPROXIMATION ALGORITHM 22 20.12.2005 4 Approximation Algorithm Informal Definition ”Algorithm, which run in polynomial time and find solutions that are guaranteed to be close to optimal.” Techniques (1) Greedy Algorithm (2) Pricing Method (primal-dual technique) (3) Linear Programming and Rounding (4) Polynomial-time Approximation Scheme 4.1 Load-Balancing Problem Definition We are given a set of m machines M1 , ..., Mm and a set of n jobs; each job j has a processing time tj . Let A(i) be the set of jobs assigned to machine Mi . Under this assignment, machine Mi needs to work for a total time of Ti = tj j∈A(i) and we declare this to be load on the machine Mi . We seek to minimize a quantity known as the makespan; it is simply the maximum load on any machine, T = max{T1 , ..., Tm } Greedy Balance Start with no jobs assigned. Set Ti = 0 and A(i) = ∅ for all machines Mi . For j = 1 to n Let Mi be a machine that achieves the minimum min{T1, ..., Tm } Assign job j to machine Mi . Set A(i) ← A(i) ∪ {j} Set T (i) ← Ti + tj Example: m = 3; n = 6; ti = {2, 3, 4, 6, 2, 2} time 9 T 8    7   6     5 6 2  4 2 makespan = 8 3     2 4   3   1 2  0 M1 M2 M3
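A minimal sketch of Greedy-Balance, assuming the jobs are given simply as a list of processing times; a heap is used to find a machine with the currently smallest load:

```python
import heapq

# A minimal sketch of Greedy-Balance: place each job on a machine with the
# currently smallest load.  Returns the loads T_i, the assignment A(i), and
# the makespan max_i T_i.
def greedy_balance(m, times):
    heap = [(0, i) for i in range(m)]        # (load T_i, machine index i)
    heapq.heapify(heap)
    assignment = {i: [] for i in range(m)}
    for j, t in enumerate(times):
        load, i = heapq.heappop(heap)        # machine achieving min{T_1, ..., T_m}
        assignment[i].append(j)
        heapq.heappush(heap, (load + t, i))
    loads = {i: l for l, i in heap}
    return loads, assignment, max(loads.values())

# Example from the text: m = 3, t = (2, 3, 4, 6, 2, 2) gives makespan 8.
# print(greedy_balance(3, [2, 3, 4, 6, 2, 2])[2])
```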
Fact 1: The optimal makespan T* is at least
T* ≥ (1/m) · Σ_{j=1}^{n} t_j.
Fact 2: The optimal makespan T* is at least T* ≥ max{t1, ..., tn}.
Fact 3: Algorithm Greedy-Balance produces an assignment of jobs to machines with makespan
T ≤ 2 · T*.
Proof: We consider a machine Mi that attains the maximum load T in our assignment and we ask: what was the last job j to be placed on Mi? When we assigned job j to Mi, the machine Mi had the smallest load of any machine; its load before this assignment was Ti − tj. It follows that every machine had load at least Ti − tj. Hence
Σ_{k=1}^{m} Tk ≥ m · (Ti − tj)  ⇔  Ti − tj ≤ (1/m) · Σ_{k=1}^{m} Tk = (1/m) · Σ_{j=1}^{n} tj ≤ T*  (by Fact 1).
Therefore T = Ti = (Ti − tj) + tj ≤ T* + T* = 2 · T*, where tj ≤ T* holds by Fact 2.
Approximation ratio: T / T* ≤ 2.

Worst case example: We have m machines and n = m · (m − 1) + 1 jobs. The first m · (m − 1) = n − 1 jobs require time tj = 1; the last job requires time tn = m.
Greedy makespan: 2m − 1. Optimal makespan: m.
⇒ (2m − 1) / m = 2 − 1/m → 2.

An improved Greedy Algorithm

Sorted Greedy-Balance
  Sort the list of jobs into nonincreasing order of processing time
  Start with no jobs assigned. Set Ti = 0 and A(i) = ∅ for all machines Mi.
  For j = 1 to n
    Let Mi be a machine that achieves the minimum min{T1, ..., Tm}
    Assign job j to machine Mi
    Set A(i) ← A(i) ∪ {j}
    Set Ti ← Ti + tj
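Sorted Greedy-Balance differs from Greedy-Balance only in the preprocessing step; reusing greedy_balance from the sketch above:

```python
# Sorted Greedy-Balance: the same greedy rule, applied after sorting the jobs
# into nonincreasing order of processing time.
def sorted_greedy_balance(m, times):
    return greedy_balance(m, sorted(times, reverse=True))

# print(sorted_greedy_balance(3, [2, 3, 4, 6, 2, 2])[2])  # makespan 7 on the earlier example
```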
  • 24. 4 APPROXIMATION ALGORITHM 24 Fact 4: If these are more than m jobs, then T ∗ ≥ 2 ∗ tm+1 Proof: Consider only the first m + 1 jobs in the sorted order. They each take at least tm+1 time. There are m + 1 jobs and only m machines, so there must be a machine that gets assigned two of these jobs. This machine will have processing time at least 2 ∗ tm+1 . Fact 5: Algorithm Sorted Greedy Balance produces an assignment of jobs to machines with makespan 3 T ≤ ∗ T∗ 2 Proof: We consider a machine Mi that has the maximum load. So let’s assume that machine Mi has at least two jobs, and let tj be the last job assigned to the machine. It holds: j ≥m+1 1 tj ≤ tm+1 ≤ ∗ T∗ 2 3 Ti = (Ti − tj ) + tj ≤ 2 ∗ T∗ ≤T ∗ 1 ≤2T∗ 4.2 Set Cover Definition A set X of n elements. A set F of subsets of X, with s = X. s∈F A set cover is a collection of those sets whose union is equal to X and has minimal cardinality. Greedy Set Cover (X, F ) U := X C := ∅ while U = ∅ do choose s ∈ F with |s ∩ U | → max U := U s C := C ∪ {s} return C Example: ¨ ¨ ¨ '™ ™ ™ $ (1) |S1 ∩ U | = 6 ™ ' ™ ™ $ S1 % (2) |S4 ∩ U | = 3 S4 ™ ™ ™ S2 % (3) |S5 ∩ U | = 2 ¨ S6 ™ ™ ™ © © © © (4) |S6 ∩ U | = 1 S3 S5 (|S3 ∩ U | = 1)
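A minimal sketch of the weighted greedy rule above, assuming the sets are given as Python sets whose union is the whole universe (the helper names and the commented example are our own):

```python
# A minimal sketch of weighted Greedy-Set-Cover: repeatedly pick the set S_i
# minimizing w_i / |S_i ∩ R| among the sets that still cover something.
def greedy_set_cover(universe, sets, weights):
    R = set(universe)                      # elements not yet covered
    chosen = []
    while R:
        best = min((i for i in range(len(sets)) if sets[i] & R),
                   key=lambda i: weights[i] / len(sets[i] & R))
        chosen.append(best)
        R -= sets[best]
    return chosen

# Hypothetical example: U = {1,...,6},
# sets = [{1,2,3}, {4,5,6}, {1,4}, {2,5}, {3,6}], unit weights
# print(greedy_set_cover(range(1, 7), [{1,2,3}, {4,5,6}, {1,4}, {2,5}, {3,6}],
#                        [1, 1, 1, 1, 1]))  # e.g. [0, 1]
```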
  • 25. 4 APPROXIMATION ALGORITHM 25 Further Definition Set U of n elements. A list of S subsets S1 , ..., Sm of U. Each set Si has an associated weight wi ≥ 0. Goal is to minimize si ∈e wi . It holds: wi wi −→ |Si | |Si ∩ R| Greedy Set Cover Start with R = U while R = ∅ wi Select set Si that minimizes |Si ∩R| Delete set Si from R return the selected sets Example: n s n S6 s S5 given weights: s s S2 ' s $ w1 = 1, w2 = 1 w3 = 1 + ǫ, w4 = 1 + ǫ s w5 = 1, w6 = 1 s s % S1 choosing order: S4 S3 S1 , S2 , S5 , S6 Let the cost paid for an element s be described by the quantity cs : wi cs = |Si ∩R| for all s ∈ Si ∩ R. Fact 1: If C is the set cover obtained by Greedy Set Cover, then si ∈C wi = s∈U Cs . n 1 Fact 2: For every Sk , the sum s∈Sk Cs is at most H(|Sk |) ∗ Wk whereas H(n) := i = Θ(ln n). i=1 Proof: We assume that the element of Sk are the first d = |Sk | of the set U . That is Sk = s1 , ...sd . Further, let us assume that these element are labelled in the order in which they are assigned a cost csj by the Greedy algorithm. Now consider the iteration in which element sj is covered by the greedy algo for some j ≤ d. At the start of this iteration , sj , sj+1 , ..., sd ∈ R by our labeling of the elements. This implies that |Sk ∩ R| is at least d − j + 1, and so the average wk Wk cost of the set Sk is at most |Sk ∩R| ≤ d−j+1 . In this iteration, the greedy algorithm selected a set Si of minimum average cost, so this set Si has average cost at most that of Sk . Thus wi wk wk csj = ≤ ≤ . |Si ∩ R| |Sk ∩ R| d−j+1 d wk wk wk wk 1 1 cs = j = 1d csj ≤ = + +...+ = wk ( + +...+1) = wk H(d) j=1 d−j−1 d d−1 1 d d−1 s∈Sk Fact 3: The set cover C selected by Greedy-Set-Cover has weight at most H(d∗ ) times the optimal weight w∗ whereas d∗ = maxi |Si |.
  • 26. 4 APPROXIMATION ALGORITHM 26 Proof: Let C ∗ denote the optimum set cover, so that w∗ = wi . For each of the sets in C ∗ , Si ∈C ∗ Fact 2 implies: 1 wi ≥ ∗ Cs (∗) H(d∗ ) s∈Si Cs ≥ Cs (∗∗) si ∈C ∗ s∈Si s∈U 1 1 1 w∗ =(Def.) wi ≥(∗) Cs ≥(∗∗) Cs =F act1 wi H(d∗ ) H(d∗ ) H(d∗ ) Si ∈C ∗ Si ∈C ∗ s∈Si s∈U Si ∈C 4.3 Vertex Cover (Pricing Method) Definition A vertex cover in a graph G = (V, E) is a set S ⊆ V so that each edge has at least one end in S. We consider have, each vertex i ∈ V has a weight wi ≥ 0. We would like to find a vertex cover S for which w(S) is minimum. It holds: Vertex Cover ≤p Set Cover and Independent Set ≤p Vertex Cover. The ”Pricing Method” The pricing method (also known as primal-dual method) is motivated by an economic perspective. For the vertex cover problem, we will think of the weights on the nodes as costs, and we will think of each edge as having to pay for its “share” of the cost of the vertex cover we find. More precisely: We will think of the weight wi of the vertex i as the cost for wing i in the cover. We will think of each edge e as an “agent” who is willing to pay something to the node that covers it. The algorithm will not only find a vertex cover S, but also determine prices pe ≥ 0 for each edge e ∈ E, so that if each edge e ∈ E pays the price pe , this will in total approximately cover the cost of S. Selecting vertex i covers all edges incident to i, so it would be “unfair” to change these incident edges in total more then the cost of vertex i. We call prices pe fair, if for each vertex i, the edges adjacent to i do not have to pay more than the cost of the vertex: pe ≤ wi e=(i,j) Fact 1: For any vertex cover S ∗ and any nonnegative and fair prices pe , we have pe ≤ w(S∗). e∈E Proof: By the definition of fairness, we have e=(i,j) pe ≤ wi for all nodes i ∈ S ∗ . e∈E pe ≤ ∗ i∈S ∗ pe ≤ i∈S ∗ wi = w(S ). Algorithm: Def.: We say that a node i is tight (or ”‘paid for”’) if pe = wi e=(i,j) VertexCoverApprox(G, w) Set pe = 0 for all e ∈ E while ∃e = (i, j) such that neither i nor j is tight select such an edge e increase pe without violating fairness
  • 27. 4 APPROXIMATION ALGORITHM 27 end while Let S be the set of all tight nodes. return S Example: a 4 SELECT(a,b) 4 a payment ≤ 3 t t t t pay = 3 t t t t ⇒ b is tight t t p=0 t p=0 3 t 0 t t p=0 t 0 t t t t t t t t t 3 5 3 3 5 3 p = 0 p = 0 0 0 b c d b c d SELECT(a,d) a 4 payment ≤ 1 SELECT(c,d) a 4 payment ≤ 2 t t t pay = 1 t pay = 2 t t t ⇒ a is tight t ⇒ d is tight t t 3 t 1 3 t 1 t t 0 t 0 t t t t t t t t t 3 5 3 3 5 3 0 0 0 2 b c d b c d ⇒ S = {a, b, d}, w(S) = 10 Fact 2: The Set S and the prices p returned by the algorithm satisfy the inequality w(S) ≤ 2 ∗ e∈E pe . Proof : All Nodes in S are tight, so we have e=(i,j) pe = wi for all i ∈ Si w(S) = i∈S wi = i∈S e=(i,j) pe ≤ 2 ∗ e∈E pe Fact 3: The Set S returned by the algorithm is a vertex cover, and its cost is at most twice the minimum cost of any vertex cover. Proof : Suppose by contradiction, that S does not cover edge e = (i, j). This implies that neither i nor j is tight, and this contradicts that the while-loop of the algorithm terminated. Let p be the prices set by the algorithm, and let S ∗ be an optimal vertex cover. By Fact 2 we have 2 ∗ e∈E pe ≥ w(S) and we have by Fact 1 that e∈Epe ≤ w(S ∗ ) =⇒ w(S) ≤ 2 ∗ e∈E pe ≤ 2 ∗ w(S ∗ )
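A minimal sketch of VertexCoverApprox: for each edge whose endpoints are both non-tight we raise its price as far as fairness allows, which makes at least one endpoint tight, so a single pass over the edges already exhausts the while-loop of the pseudocode:

```python
# A minimal sketch of the pricing (primal-dual) method for vertex cover.
# edges are (i, j) pairs, w maps each vertex to its weight w_i.
def vertex_cover_pricing(vertices, edges, w):
    price = {e: 0 for e in edges}
    paid = {v: 0 for v in vertices}        # sum of prices of edges incident to v

    def tight(v):
        return paid[v] >= w[v]

    for e in edges:                        # after this pass every edge has a tight endpoint
        i, j = e
        if tight(i) or tight(j):
            continue
        delta = min(w[i] - paid[i], w[j] - paid[j])  # largest fair increase of p_e
        price[e] += delta
        paid[i] += delta
        paid[j] += delta                   # now at least one of i, j is tight
    S = [v for v in vertices if tight(v)]
    return S, price

# Hypothetical example: a path u - v - x with weights 2, 3, 2
# print(vertex_cover_pricing(['u', 'v', 'x'], [('u', 'v'), ('v', 'x')],
#                            {'u': 2, 'v': 3, 'x': 2}))
```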
  • 28. 4 APPROXIMATION ALGORITHM 28 4.4 Linear Programming and Rounding The basic linear programming problem can be viewed as a complex version of the problem of si- multaneous linear equations with inequalities in place of equations. Specially consider the problem of determining a vector x that satisfies Ax ≥ b. x1 Example: x1 ≥ 0, x2 ≥ 0 (1.5, 1) ∗ (x2 ) x1 + 2 ∗ x2 ≥ 6 2 ∗ x1 + x2 ≥ 6 → cT ∗ x → M in 6 5 4 3 2 1 0 0 1 2 3 4 5 6 Definition: Given an m × n Matrix A, and vector b ∈ Rm and vector e ∈ Rn , find a vector x ∈ Rn to solve the following optimization problem : min( cT ∗ x , such that x ≥ 0, Ax ≥ b) ObjectiveF unction Constraints Vertex cover as an Integer Program Choose a decision variable xi for each node i ∈ V xi = 1 will indicate that node i is in the vertex cover xi = 0 will indicate that node i is not in the vertex cover For each edge (i, j) ∈ E, we write the inequality xi + xj ≥ 1 → Objective function : wT ∗ x → min VC-IP M in i∈V wi ∗ xi xi + xj ≥ 1∀(i, j) ∈ E xi ∈ {0, 1}∀i ∈ V Fact 1: S is a vertex cover in G if and only if the vector x defined as xi = 1 for i ∈ S, and xi = 0 for i ∈ S satisfies the constraints in VC-IP. Further we have w(S) = wT ∗ x / Fact 2: Vertex Cover ≤p Integer Programming. Using linear programming for vertex cover Modify the VC-IP by dropping the requirement that each xi ∈ {0, 1} and reverting to the constraint that each xi is an arbitrary real number between 0 and 1. ֒→ VC-LP.
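A minimal sketch of solving VC-LP with an off-the-shelf LP solver and then rounding, assuming SciPy's linprog is available; the rounding rule used here (keep every i with x*_i ≥ 1/2) is the one justified by the analysis that follows:

```python
import numpy as np
from scipy.optimize import linprog

# A minimal sketch of VC-LP plus rounding.  vertices: list of vertex ids,
# edges: list of (i, j) pairs, w: weight per vertex.
def vertex_cover_lp_rounding(vertices, edges, w):
    idx = {v: k for k, v in enumerate(vertices)}
    c = np.array([w[v] for v in vertices], dtype=float)
    # x_i + x_j >= 1 for every edge, written as -x_i - x_j <= -1 for linprog
    A_ub = np.zeros((len(edges), len(vertices)))
    for row, (i, j) in enumerate(edges):
        A_ub[row, idx[i]] = -1.0
        A_ub[row, idx[j]] = -1.0
    b_ub = -np.ones(len(edges))
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * len(vertices))
    # rounding: keep every vertex whose fractional value is at least 1/2
    return [v for v in vertices if res.x[idx[v]] >= 0.5 - 1e-9]

# print(vertex_cover_lp_rounding(['a', 'b', 'c'],
#                                [('a', 'b'), ('b', 'c'), ('a', 'c')],
#                                {'a': 1, 'b': 1, 'c': 1}))  # all three on a triangle
```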
Fact 3: Let S* denote a vertex cover of minimum weight. Then W_LP ≤ w(S*).

Example (integrality gap): for a triangle with three vertices of weight 1, the minimum vertex cover has weight w(S*) = |VC| = 2, while the LP can set x1 = x2 = x3 = 1/2 and achieves W_LP = 3/2.

Given a fractional solution (x*), we define S = {i ∈ V : x*_i ≥ 1/2}.

Fact 4: The set S defined in this way is a vertex cover, and (1/2) · w(S) ≤ W_LP.
Proof: Consider an edge e = (i, j). Recall the inequality x_i + x_j ≥ 1; in any solution x* that satisfies this inequality, either x*_i ≥ 1/2 or x*_j ≥ 1/2. Thus at least one of these two will be rounded up, and i or j will be placed in S. Therefore S is a vertex cover. Further,
W_LP = w^T · x* = Σ_i w_i · x*_i ≥ Σ_{i ∈ S} w_i · x*_i ≥ (1/2) · Σ_{i ∈ S} w_i = (1/2) · w(S).

Fact 5: The algorithm produces a vertex cover S of at most twice the minimum possible weight.

27.01.2006

4.5 A more advanced LP-Application: Load Balancing

Definition of the problem: Each job j has a fixed given size t_j ≥ 0 and a set of machines M_j ⊆ M that it may be assigned to. We call an assignment of jobs to machines feasible if each job j is assigned to a machine i ∈ M_j. The goal is still to minimize the maximum load on any machine: using J_i ⊆ J to denote the jobs assigned to a machine i ∈ M in a feasible assignment, and using L_i = Σ_{j ∈ J_i} t_j to denote the resulting load, we seek to minimize max_i L_i.

"Generalized Load Balancing Problem" GL-IP: We introduce a variable x_ij for each pair (i, j) of a machine i ∈ M and a job j ∈ J. Setting x_ij = 0 will indicate that job j is not assigned to machine i; setting x_ij = t_j will indicate that job j is assigned to machine i. For each job we require Σ_i x_ij = t_j, and we require x_ij = 0 whenever i ∉ M_j. The load of a machine i can be expressed as L_i = Σ_j x_ij. We use one more variable L and the inequalities Σ_j x_ij ≤ L for all i ∈ M.

GL-IP:
  min L
  Σ_i x_ij = t_j    ∀ j ∈ J
  Σ_j x_ij ≤ L     ∀ i ∈ M
  x_ij ∈ {0, t_j}   ∀ j ∈ J, i ∈ M_j   (*)
  x_ij = 0          ∀ j ∈ J, i ∉ M_j
  • 30. 4 APPROXIMATION ALGORITHM 30 Fact 1: An assignment of jobs to machines has load at most L if and only if the vector x satisfies constraints in GL-IP, with L set to the maximum load of the assignment. xij ≥ 0 ∀j ∈ J, i ∈ Mj GL-IL: instead of (*) Fact 2: If the optimum value of GL-LP is L, then the optimum load is at least L∗ ≥ L. Fact 3: The optimum load is at least L∗ ≥ maxj tj . We’ll consider the following bipartite graph G(x) = (V (x), E(x)) V (x) = M ∪ J (i, j) ∈ E(x) if and only if xij 0 Fact 4: Given a solution (X, L) of GL-LP such that the graph G(X) has no cycles, we can use the solution X to obtain a feasible assignment of jobs to machines with load at most L + L∗ . Proof: Since the graph G(X) has no cycles, each of its connected components is a tree. First, root the tree at an arbitrary node. . . . Consider a job j: 1. If the node corresponding to job j is a leaf of the tree, let machine node i be its parent. Since j has degree 1 in G(X) machine i is the only machine that has been assigned any part of job j and hence xij = tj . 2. For a job j whose corresponding node is not a leaf in G(X) we assign j to an arbitrary child of the corresponding node in the rooted tree. Let i be any machine, and let Ji be the set of jobs assigned to machine i. The set Ji contains those children of node i that are leaves, plus possibly the parent p(i) of node i. For all jobs j = p(i) assigned to i, we have xij = tj . tj = xij ≤ xij ≤ L j∈Ji ,j=p(i) j∈Ji ,j=p(i) j∈Ji For the parent j = p(i) of node i, we use Fact 3 tj ≤ L∗ =⇒ ≤ L + L∗ j∈Ji
  • 31. 4 APPROXIMATION ALGORITHM 31 Jobs Machines ∞ ∞ L ∞ L tj j i V L ∞ ∞ Fact 5: Solution of this flow problem with capacity L are in one-to-one correspondence with the solution of GL-LP with value along edge (j, i) and the flow value on edge (i, v) is the load j xij on machine i. Fact 6: Let X, L) be any solution to GL-LP and C be a cycle in G(X). We can modify the solution X to eliminate at lest one edge from G(X) without increasing the load or introducing any → ← V ← → new edges. Proof idea: We modify the solution by augmenting the flow along the cycle C. Assume that the nodes along the cycle are i1 j1 i2 j2 ...ik jk where il is a machine node and jl is a job node. We’ll flow along all edges and increasing the flow on the edge (jl , il+1 ) for all l = 1, ..., k (where k + 1 is used to denote 1), by the same amount δ with δ = min xil jl l=1,...,k 31.01.2006 4.6 Arbitrarily Good Approximations - Knapsack Problem Goal: “Produce a solution within a small percentage of the optimal solution.” Definition: Suppose we have n items. Each item i = 1, ..., n has two integer parameters: weight wi , value vi . Given a knapsack capacity W, the goal of the Knapsack problem is to find a subset S of items of maximum value subject to the restriction that the total weight of the set should not exceed W . vi →max under the condition wi ≤ W . i∈S i∈S Our algorithm will take as input the weights and values defining the problem and will also take an extra parameter ǫ, the desired precision. It will find a subset S whose total weight does not exceed W , with value vi at most a (1 + ǫ) factor below the maximum possible i∈S solution.
  • 32. 4 APPROXIMATION ALGORITHM 32 The algorithm will run in polynomial time for any fixed choice ǫ 0 however, the dependence on ǫ will not be polynomial. We call such an algorithm a “Polynomial-time approximation scheme” (PTAS). We already know an dynamic programming algorithm that run in O(n · W ). “Old”: OP T (i, W ) “New”: OP T (i, V ) is the smallest knapsack weight W so that one can obtain a solution using a subset of items 1, ..., i with values at least V . i We will have a subproblem for all i = 0, ..., n and values V = 0, ..., vj . j=1 i vj ≤ n · (max vi ) = n · v ∗ ⇒ O(n2 · v ∗ ) subproblems. j=1 i v∗ Recurrence for solving these subproblems: n−1 if V vi then OP T (n, V ) = wn + OP T (n − 1, V − vn ) i=1 else OP T (n, V ) = min{OP T (n − 1, v), wn + OP T (n − 1, max(0, V − vn ))} Knapsack (n) Array M [0...n, 0...v] For i = 0...n do M [i, 0] = 0 For i = 1, ..., n do i For v = 1, ..., vj do j=1 i−1 if v vj then j=1 M [i, v] = wi + M [i − 1, v] else M [i, v] = minM [i − 1, v], wi + M [i − 1, max0, v − vi ] return the maximum value V such that M [n, V ] ≤ W Idea for a PTAS: If the values are small integers, then v ∗ is small and the problem can be solved in polynomial time already. On the other hand, we will use a rounding parameter b and will consider the values rounded to an integer multiple of b. More precisely, for each item i, let its rounded value be vi = vi · b. ˜ b We will use our dynamic programming algorithm to solve the problem with the rounded values. Fact 1: For each item i we have vi ≤ vi ≤ vi + b. The rounded values are all integers multiples of a ˜ common value b. Instead of solving the problem with the rounded values vi , we can change ˜ ˜ the units; we can divide all values by b and get an equivalent problem. vi = vi = vi for ˆ b b all i = 1, ..., n .
  • 33. 4 APPROXIMATION ALGORITHM 33 Fact 2: The Knapsack problem with values vi and the scaled problem with values vi have the same ˜ ˆ set of optimum solutions, the optimum values differ exactly by a factor of b, and the scaled values are integers. Knapsack-Approx(ǫ) ǫ set b = n ∗ (max vi ) i solve the Knapsack Problem with values vi (with the help of our dyn. ˆ Prog. Algo.) return the set S of items found by this algorithm Fact 3: The set of items S returned by the algorithm has total weight at most W , that is wi ≤ W . i∈S Fact 4: The algorithm Knapsack-Approx runs in polynomial time for any fixed ǫ 0. Proof: The Dyn. Prog. Algo. runs in time O(n2 · v ∗ ) with v ∗ = max vi . i Determine max vi : The item j with maximum value vj = max vi has also maximum value ˆ i i vj in the rounded problem. So, max vi = vi = ˆ ˆ b = n · ǫ−1 . ⇒ The overall running time: i O(n3 · ǫ−1 ). Fact 5: If S is the solution found by the Knapsack-Approx algorithm, and S ∗ is any other solution satisfying wi ≤ W , then we have (1 + ǫ) · vi ≥ vi . i∈S ∗ i∈S i∈S ∗ Proof: Let S ∗ be any set satisfying wi ≤ W . Our algorithm finds the optimal solution with i∈S ∗ values vi : ˜ vi ≥ ˜ vi ˜ (∗) i∈S i∈S ∗ vi ≤ vi ≤ ˜ vi ≤ ˜ (vi + b) ≤ n · b + vi F act1 (∗) i∈S ∗ i∈S ∗ i∈S i∈S i∈S We get: vi ≥ vj := max vi ˜ ˜ i i∈S ǫ b= · (max vi ) ⇒ n · b = ǫ(max vi ) ≤ ǫ · vi n i i i∈S Thus, we get: √ vi ≤ (ǫ · vi ) + ( vi ) = (ǫ + 1) · vi i∈S ∗ i∈S i∈S i∈S
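Putting the value-indexed dynamic program and the rounding step together, here is a minimal sketch of Knapsack-Approx; the helper names are our own, and the traceback of the chosen items is an added convenience:

```python
import math

# A minimal sketch of the value-indexed dynamic program and Knapsack-Approx.
# Items are (value, weight) pairs; the result is a list of item indices.
def knapsack_by_value(items, W):
    n = len(items)
    V = sum(v for v, _ in items)
    INF = float("inf")
    # M[i][val] = smallest total weight of a subset of items 1..i with value >= val
    M = [[0] + [INF] * V for _ in range(n + 1)]
    for i in range(1, n + 1):
        v_i, w_i = items[i - 1]
        for val in range(1, V + 1):
            skip = M[i - 1][val]
            take = w_i + M[i - 1][max(0, val - v_i)]
            M[i][val] = min(skip, take)
    best = max(val for val in range(V + 1) if M[n][val] <= W)
    S, val = [], best                      # trace back which items were taken
    for i in range(n, 0, -1):
        v_i, w_i = items[i - 1]
        if M[i][val] != M[i - 1][val]:
            S.append(i - 1)
            val = max(0, val - v_i)
    return S

def knapsack_approx(values, weights, W, eps):
    b = (eps / len(values)) * max(values)               # b = (eps/n) * max_i v_i
    scaled = [(math.ceil(v / b), w) for v, w in zip(values, weights)]
    return knapsack_by_value(scaled, W)                  # solve with rounded, scaled values

# print(knapsack_approx([60, 100, 120], [10, 20, 30], 50, 0.1))  # e.g. items 1 and 2
```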