Sketching, Sampling and other Sublinear
Algorithms:
Algorithms for parallel models
Alex Andoni
(MSR SVC)
Parallel Models
 Data cannot be seen by one machine
 Distributed across many machines
 MapReduce, Hadoop, Dryad,…
 Alg...
Types of problems
 0. Statistics: 2nd moment of the frequency
 1. Sort n numbers
 2. s-t connectivity in a graph
 3. M...
Computational Model

Model Constraints

PRAMs

Problem 0: Statistics

IP
2 1
5 3
7 2
1
9
4
Problem 1: sorting

Problem 2: graph connectivity

VS
Problems 3: geometric graphs

Problem: Geometric MST

[A-Nikolov-Onak-Yaroslavtsev’??]
General Approach
 Partition the space hierarchically in a “nice way”
 In each part
 Compute a pseudo-solution to the pr...
MST algorithm: attempt 1
 Partition the space hierarchically in a “nice way”
 In each part
 Compute a pseudo-solution t...
Troubles
 Quad tree can cut MST edges
 forcing irrevocable decisions
 Choose a wrong representative
MST algorithm: final

MST algorithm: Glimpse of analysis

Finale
 Gotta love your models:
 Streaming:
 sub-linear space
 see all data sequentially
 Parallel computing:
 sub-l...
Upcoming SlideShare
Loading in …5
×

Sketching, Sampling, and other Sublinear Algorithms 4 (Lecture by Alex Andoni)

845 views

Published on

Parallel framework: we look at problems where neither the data or the output fits on a machine. For example, given a set of 2D points, how can we compute the minimum spanning tree over a cluster of machines.

Published in: Education, Technology
  • Be the first to comment

Sketching, Sampling, and other Sublinear Algorithms 4 (Lecture by Alex Andoni)

  1. 1. Sketching, Sampling and other Sublinear Algorithms: Algorithms for parallel models Alex Andoni (MSR SVC)
  2. 2. Parallel Models  Data cannot be seen by one machine  Distributed across many machines  MapReduce, Hadoop, Dryad,…  Algorithmic tools for the models?  very incipient!
  3. 3. Types of problems  0. Statistics: 2nd moment of the frequency  1. Sort n numbers  2. s-t connectivity in a graph  3. Minimum Spanning Tree on a graph  … many more!
  4. 4. Computational Model 
  5. 5. Model Constraints 
  6. 6. PRAMs 
  7. 7. Problem 0: Statistics  IP 2 1 5 3 7 2 1 9 4
  8. 8. Problem 1: sorting 
  9. 9. Problem 2: graph connectivity  VS
  10. 10. Problems 3: geometric graphs 
  11. 11. Problem: Geometric MST  [A-Nikolov-Onak-Yaroslavtsev’??]
  12. 12. General Approach  Partition the space hierarchically in a “nice way”  In each part  Compute a pseudo-solution to the problem  Sketch the pseudo-solution with small space  Send the sketch to be used in the next level/round
  13. 13. MST algorithm: attempt 1  Partition the space hierarchically in a “nice way”  In each part  Compute a pseudo-solution to the problem  Sketch the pseudo-solution with small space  Send the sketch to be used in the next level/round quad trees! compute MST send any point as a representative
  14. 14. Troubles  Quad tree can cut MST edges  forcing irrevocable decisions  Choose a wrong representative
  15. 15. MST algorithm: final 
  16. 16. MST algorithm: Glimpse of analysis 
  17. 17. Finale  Gotta love your models:  Streaming:  sub-linear space  see all data sequentially  Parallel computing:  sub-linear space per machine  data distributed over many machines  communication (rounds) expensive  Algorithmic tools in development!

×