Approximation Algorithms for Problems
    on Networks and Streams of Data
    Luca Foschini - Ph.D. Defense

    Committee: Subhash Suri (chair), John Gilbert, Teofilo Gonzalez


Friday, September 7, 12
Why Approximation Algorithms?

              Exact algorithms require many resources

              [Figure labels: Hardware, Apps, Data, "Problems solvable exactly"]

A Long History,
    and Work in Progress

      ✤    Early ’70s - many combinatorial
           problems found to be NP-hard

      ✤    Recently - more restrictive
           computation models proposed, e.g.,
           data streams

                          Heuristics are not sufficient; provable guarantees are needed

Content of the Dissertation

       Area            Topic              Publication
       Networks        Partitioning       STACS12 + Algorithmica
       Networks        Shortest Paths     SODA11 + Algorithmica
       Data Streams    Time Series        ICDE10
       Data Streams    Burst Detection    NSDI11

       Other venues listed: ICISS08, ICIP11, ALENEX10, ESA11, WOOT11, WAW09

Roadmap

       Area            Topic              Publication
       Networks        Partitioning       STACS12 + Algorithmica
       Networks        Shortest Paths     SODA11 + Algorithmica
       Data Streams    Time Series        ICDE10
       Data Streams    Burst Detection    NSDI11

k-Balanced Partitioning Problem
            Given: an unweighted graph G on n
            vertices; an integer k

            Find: a partition of the vertices of G
            into k sets $V_i$ s.t.

                   ✤      $|V_i| \le \lceil n/k \rceil$
                   ✤      Cut size (number of edges
                          connecting vertices in
                          different $V_i$) is minimized


                                    joint work with Andi Feldmann (ETHz)
                               (appeared in STACS12, submitted to Algorithmica)

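The two quantities in the problem statement can be made concrete with a small sketch. The function names, the edge-list representation, and the 6-vertex path example are assumptions made for illustration and do not come from the talk.

```python
from math import ceil

def cut_size(edges, part):
    """Count edges whose endpoints are assigned to different parts."""
    return sum(1 for u, v in edges if part[u] != part[v])

def is_balanced(part, n, k):
    """Check the balance constraint |V_i| <= ceil(n / k) for every part."""
    sizes = [0] * k
    for v in range(n):
        sizes[part[v]] += 1
    return max(sizes) <= ceil(n / k)

# Hypothetical instance: a path 0-1-2-3-4-5 split into k = 3 parts.
edges = [(i, i + 1) for i in range(5)]
part = [0, 0, 1, 1, 2, 2]  # part[v] is the index of the set containing v
print(cut_size(edges, part), is_balanced(part, n=6, k=3))  # prints "2 True"
```

Here the edges (1, 2) and (3, 4) are cut, and every part has exactly ceil(6/3) = 2 vertices.
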
Motivation & Complexity

    ✤    Divide-and-conquer algorithms

    ✤    VLSI design

    ✤    Parallel computing

    ✤    NP-hard to approximate the cut size within any finite factor $\alpha$
         [Andreev and Räcke 2006]

Related Work




General Graphs & Trees

    ✤     An algorithm is an $\alpha$-approximation if it
          finds a cut of size at most $\alpha$ times the optimum

    ✤     NP-hard to approximate cut size
          within any finite $\alpha$ [Andreev and
          Räcke 2006]

                   Trees - simple instances?

          [Figure: examples with n=31, k=8, cut size = 10 and n=31, k=9, cut size = 8]

Trees Are Hard

       ✤     NP-hard to approximate the cut size within $\alpha = n^c$
             (for any c<1), even for constant diameter

       ✤     APX-hard to approximate the cut size, even for
             constant degree

                          Most NP-hard problems become trivial on trees

Relax!

         Balance constraint relaxed:
              $|V_i| \le (1 + \varepsilon) \lceil n/k \rceil$

         Bicriteria approximation: cut size approximation $\alpha$ is
         measured w.r.t. the perfectly balanced optimum

         [Figure: perfect balance with optimal cut size vs. relaxed balance with cut size approximated within $\alpha$]

$0 < \varepsilon < 1$ on General Graphs

    ✤    $\varepsilon > 1$: $\alpha$ in .... (spreading metric techniques)

    ✤    $0 < \varepsilon < 1$: not much improvement, $\alpha = O(\log^{1.5} n / \varepsilon^2)$

    ✤    What about trees?

Summary of PTAS for Trees

    ✤    Compute the optimal cut size for each coarse signature using DP

    ✤    Pack each coarse signature into bins of size $(1 + \varepsilon) \lceil n/k \rceil$

    ✤    Pick the solution with the smallest cut size among those fitting into k bins

    ✤    Total time complexity $O\left(n^4 (k/\varepsilon)^{1 + 3\lceil \frac{1}{\varepsilon} \log(\frac{1}{\varepsilon}) \rceil}\right)$

                                    Show that $\alpha = 1$

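The packing step can be illustrated with a minimal sketch that checks whether a list of component sizes fits into k bins of capacity (1 + ε)⌈n/k⌉. It uses first-fit decreasing, a standard bin-packing heuristic swapped in here for simplicity; it is not the packing procedure of the PTAS and may report failure even when a valid packing exists. The function name and the example sizes are hypothetical.

```python
from math import ceil

def fits_in_k_bins(sizes, n, k, eps):
    """First-fit decreasing check against k bins of capacity (1+eps)*ceil(n/k).

    Returns True only if this heuristic finds a packing into at most k bins.
    """
    capacity = (1 + eps) * ceil(n / k)
    bins = []  # current load of each open bin
    for s in sorted(sizes, reverse=True):
        if s > capacity:
            return False  # a single component already exceeds the capacity
        for i, load in enumerate(bins):
            if load + s <= capacity:
                bins[i] += s
                break
        else:
            bins.append(s)  # no open bin has room, so open a new one
    return len(bins) <= k

# Hypothetical component sizes for n = 12 vertices and k = 3 parts.
print(fits_in_k_bins([5, 4, 3], n=12, k=3, eps=0.25))     # capacity 5
print(fits_in_k_bins([3, 3, 3, 3], n=12, k=3, eps=0.25))  # capacity 5
print(fits_in_k_bins([3, 3, 3, 3], n=12, k=3, eps=0.5))   # capacity 6
```

With capacity 5, four components of size 3 need four bins, so the second call fails; raising ε to 0.5 lets two components share a bin.
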
Extension to General Graphs

    ✤    Decompose the graph into a collection of trees [Räcke, Madry]; the cut
         size worsens by a factor of at most O(log n) for at least one tree

    ✤    Apply the PTAS for trees to each instance

    ✤    Return the partition for the tree with minimum cut

    ✤    $\alpha = O(\log n)$, improving on the previous best $O(\log^{1.5} n / \varepsilon^2)$

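The outer loop of this reduction can be sketched as follows. The tree decomposition and the tree PTAS are not implemented: `trees` and `solve_tree` are hypothetical placeholders supplied by the caller, and `solve_tree` is assumed to return a part index for every graph vertex. Scoring each candidate by its cut in the original graph is also an assumption of this sketch; the slide only says to return the partition for the tree with minimum cut.

```python
def cut_size(edges, part):
    """Count edges whose endpoints are assigned to different parts."""
    return sum(1 for u, v in edges if part[u] != part[v])

def best_partition_over_trees(graph_edges, trees, solve_tree):
    """Solve every tree instance and keep the partition with the smallest cut.

    solve_tree is a caller-supplied stand-in for the tree algorithm.
    """
    best = None
    for tree in trees:
        part = solve_tree(tree)
        cost = cut_size(graph_edges, part)
        if best is None or cost < best[0]:
            best = (cost, part)
    return best

# Toy stand-in: two "trees" identified by name, with precomputed partitions.
graph_edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
candidates = {"T1": [0, 0, 1, 1], "T2": [0, 1, 0, 1]}
cost, part = best_partition_over_trees(graph_edges, ["T1", "T2"], candidates.get)
print(cost, part)  # prints "3 [0, 0, 1, 1]"
```
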
Tree Decomposition




Analysis of Embedding




Extensions & Open Problems
       ✤     Tree embedding techniques allow the $\alpha = 1$ tree PTAS to translate to an
             $\alpha = O(\log n)$ approximation for general weighted graphs

       ✤     Improves on the previous best $\alpha = O(\log^{1.5} n / \varepsilon^2)$

             [Figure labels: Graphs, Trees]

Approximating Time Series

    ✤    Represent a time series with B
         linear segments

    ✤    When a new value arrives in the time
         series, segments need to be reallocated

Old Algorithms, New Proofs

     ✤    We prove that a popular greedy merge
          scheme gives a constant-factor (bicriteria)
          approximation for many $L_p$ norms (ICDE10;
          joint with Gandhi, Suri)

     ✤    Results implemented in the Linux kernel
          and used to detect traffic bursts in
          networks (NSDI11; joint with Uyeda,
          Suri, Varghese, Baker)

                          Next steps: extend the ICDE10 results to other norms

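A greedy merge scheme of this kind can be sketched as below. This is an illustration under stated assumptions, not the ICDE10 algorithm: it measures error as the sum of squared residuals of a least-squares line (an L_2 choice), and it merges the adjacent pair whose union adds the least error. The function names and the sample stream are hypothetical.

```python
def stats(x, y):
    """Sufficient statistics of one point: (n, sum x, sum y, sum x^2, sum xy, sum y^2)."""
    return (1, x, y, x * x, x * y, y * y)

def merge(a, b):
    """Statistics of the union of two adjacent segments."""
    return tuple(p + q for p, q in zip(a, b))

def sse(s):
    """Sum of squared residuals of the least-squares line through a segment."""
    n, sx, sy, sxx, sxy, syy = s
    var_x = sxx - sx * sx / n
    var_y = syy - sy * sy / n
    cov = sxy - sx * sy / n
    if var_x == 0:
        return max(var_y, 0.0)
    return max(var_y - cov * cov / var_x, 0.0)

def greedy_merge(stream, B):
    """Keep at most B linear segments while values arrive one at a time."""
    segments = []
    for t, y in enumerate(stream):
        segments.append(stats(float(t), float(y)))  # new value starts its own segment
        if len(segments) > B:
            # Merge the adjacent pair whose union increases the error the least.
            i = min(
                range(len(segments) - 1),
                key=lambda j: sse(merge(segments[j], segments[j + 1]))
                - sse(segments[j])
                - sse(segments[j + 1]),
            )
            segments[i:i + 2] = [merge(segments[i], segments[i + 1])]
    return segments

# Hypothetical stream: a rising line followed by a falling line.
segs = greedy_merge([0, 1, 2, 3, 4, 10, 8, 6, 4, 2], B=2)
print([s[0] for s in segs])  # number of points covered by each segment
```

Storing only the six running sums per segment keeps memory proportional to B, since merging two segments is a component-wise addition.
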
Conclusion


    ✤    Approximation is necessary to reduce resource utilization

    ✤    Presented approximation algorithms for problems from different
         domains that we cannot afford to solve exactly

    ✤    Presented basic building blocks that can be used across the board to
         design approximation algorithms



