I/O Efficient Algorithms and Data Structures 1+2 (Lecture by Lars Arge)

1,557 views
1,370 views

Published on

In many modern applications that deal with massive data sets, communication between internal and external memory, and not actual computation time, is the bottleneck in the computation. This is due to the huge difference in access time of fast internal memory and slower external memory such as disks. In order to amortize this time over a large amount of data, disks typically read or write large blocks of contiguous data at once. This means that it is important to design algorithms with a high degree of locality in their disk access pattern, that is, algorithms where data accessed close in time is also stored close on disk. Such algorithms take advantage of block transfers by amortizing the large access time over a large number of accesses. In the area of I/O-efficient algorithms the main goal is to develop algorithms that minimize the number of block transfers (I/Os) used to solve a given problem. In these four lectures, we will cover fundamental I/O-efficient algorithms and data structures, such as sorting algorithms, search trees and priority queues. We will also discuss geometric problems, as well as geometric data structures for various range searching problems.

Published in: Education, Technology
0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
1,557
On SlideShare
0
From Embeds
0
Number of Embeds
493
Actions
Shares
0
Downloads
37
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide

I/O Efficient Algorithms and Data Structures 1+2 (Lecture by Lars Arge)

  1. 1. I/O-efficient Algorithms and Data Structures Lars Arge Professor and center director Aarhus University ALMADA summer school August 1, 2013
  2. 2. • Pervasive use of computers and sensors • Increased ability to acquire/store/process data → Massive data collected everywhere • Society increasingly “data driven” → Access/process data anywhere any time Nature/Science special issues • 2/06, 9/08, 2/11 • Scientific data size growing exponentially, while quality and availability improving • Paradigm shift: Science will be about mining data Massive Data Obviously not only in sciences: • Economist 02/10: • From 150 Billion Gigabytes five years ago to 1200 Billion today • Managing data deluge difficult; doing so will transform business/public life I/O-efficient Algorithms and Data Structures Lars Arge 2
  3. 3. Lars Arge 3 Random Access Machine Model • Standard theoretical model of computation: – Infinite memory – Uniform access cost • Simple model crucial for success of computer industry R A M I/O-efficient Algorithms and Data Structures
  4. 4. Lars Arge 4 Hierarchical Memory • Modern machines have complicated memory hierarchy – Levels get larger and slower further away from CPU – Data moved between levels using large blocks L 1 L 2 R A M I/O-efficient Algorithms and Data Structures
  5. 5. Lars Arge 5 Slow I/O – Disk systems try to amortize large access time transferring large contiguous blocks of data • Important to store/access data to take advantage of blocks (locality) • Disk access is 106 times slower than main memory access track magnetic surface read/write arm read/write head “The difference in speed between modern CPU and disk technologies is analogous to the difference in speed in sharpening a pencil using a sharpener on one’s desk or by taking an airplane to the other side of the world and using a sharpener on someone else’s desk.” (D. Comer) I/O-efficient Algorithms and Data Structures
  6. 6. Lars Arge 6 Scalability Problems • Most programs developed in RAM-model – Run on large datasets because OS moves blocks as needed • Moderns OS utilizes sophisticated paging and prefetching strategies – But if program makes scattered accesses even good OS cannot take advantage of block access Scalability problems! data size runningtime I/O-efficient Algorithms and Data Structures
  7. 7. Lars Arge 7 N= # of items in the problem instance B = # of items per disk block M = # of items that fit in main memory T = # of items in output I/O: Move block between memory and disk We assume (for convenience) that M >B2 D P M Block I/O External Memory Model I/O-efficient Algorithms and Data Structures
  8. 8. Lars Arge 8 Fundamental Bounds Internal External • Scanning: N • Sorting: N log N • Permuting • Searching: • Note: – Linear I/O: O(N/B) – Permuting not linear – Permuting and sorting bounds are equal in all practical cases – B factor VERY important: – Cannot sort optimally with search tree NBlog B N B N B Mlog B N NB N B N B N B Mlog }log,min{ B N B N B MNN N2log I/O-efficient Algorithms and Data Structures
  9. 9. Scalability Problems: Block Access Matters • Example: Traversing linked list (List ranking) – Array size N = 10 elements – Disk block size B = 2 elements – Main memory size M = 4 elements (2 blocks) • Large difference between N and N/B large since block size is large – Example: N = 256 x 106, B = 8000 , 1ms disk access time N I/Os take 256 x 103 sec = 4266 min = 71 hr N/B I/Os take 256/8 sec = 32 sec Algorithm 2: N/B=5 I/OsAlgorithm 1: N=10 I/Os 1 5 2 6 73 4 108 9 1 2 10 9 85 4 76 3 9Lars Arge I/O-efficient Algorithms and Data Structures
  10. 10. Outline • Today: – Sorting upper bound (merge- and distribution-sort) – Sorting (permuting) lower bound – Cache-oblivious sorting (if time permits) – Batched geometric problems: Distribution sweeping • Monday: – Searching (B-trees) – Batched searching (Buffer-trees) – I/O-efficient priority queues – I/O-efficient list ranking Lars Arge 10 I/O-efficient Algorithms and Data Structures
  11. 11. Lars Arge 11 Queues and Stacks • Queue: – Maintain push and pop blocks in main memory O(1/B) Push/Pop operations amortized • Stack: – Maintain push/pop blocks in main memory O(1/B) Push/Pop operations amortized Push Pop I/O-efficient Algorithms and Data Structures
  12. 12. Lars Arge 12 Sorting • <M/B sorted lists (queues) can be merged in O(N/B) I/Os M/B blocks in main memory • Unsorted list (queue) can be distributed using <M/B split elements in O(N/B) I/Os I/O-efficient Algorithms and Data Structures
  13. 13. Lars Arge 13 Sorting • Merge sort: – Create N/M memory sized sorted lists – Repeatedly merge lists together Θ(M/B) at a time phases using I/Os each I/Os)( B NO)(log M N B MO )log( B N B N B MO )( M N )/( B M M N ))/(( 2 B M M N 1 I/O-efficient Algorithms and Data Structures
  14. 14. Lars Arge 14 Sorting • Distribution sort (multiway quicksort): – Compute Θ(M/B) split elements – Distribute unsorted list into Θ(M/B) unsorted lists of equal size – Recursively split lists until fit in memory I/Os if split elements in O(N/B) I/Os Can compute split elements in O(N/B) I/Os phases I/Os )log( B N B N B MO B M )(log)(log M N M N B M B M OO )log( B N B N B MO I/O-efficient Algorithms and Data Structures
  15. 15. Lars Arge 15 Permuting Lower Bound Permuting N elements according to a given permutation takes I/Os in “indivisibility” model • Indivisibility model: Move of elements only allowed operation • Note: – We can allow copies (and destruction of elements) – Bound also a lower bound on sorting • Proof: – View memory and disk as array of N tracks of B elements – Assume all I/Os track aligned (assumption can be removed) })log,(min{ B N B N B MN I/O-efficient Algorithms and Data Structures M D
  16. 16. Lars Arge 16 Permuting Lower Bound – Array contains permutation of N elements at all times – We will count how many permutations can be reached (produced) with t I/Os – Input: * Choose track: N possibilities * Rearrange ≤ B element in track and place among ≤ M-B elements in memory: – possibilities if “fresh” track – otherwise at most permutations after t inputs – Output: Choose track: N possibilities B N B M BN t )!())(( )(! B M B )(B M I/O-efficient Algorithms and Data Structures M D
  17. 17. Lars Arge 17 Permuting Lower Bound – Permutation algorithm needs to be able to produce N! permutations (using Stirlings formula and ) – If we have – If we have and thus !)!())(( NBN B N B M t )!log())log((log)!log( NNtB B M B N NNBNtBN B M log)log(loglog B M B N BN N t loglog log xxx log!log B M BB M log)log( B M BN loglog B N BMB N B N B M B N t /log2 log log B M BN loglog NB NNNNNt N B N N B N 2 1 2 1 log log 2 1 log2 log })log,(min{ B N B N B MNt I/O-efficient Algorithms and Data Structures
  18. 18. Lars Arge 18 Sorting lower bound • Similar argument but assuming comparison model in internal memory – Initially N! possible orderings – Count how may possible after t I/Os Sorting N elements takes I/Os in comparison model !)!())(( NBN B N B M t )!log())log((log)!log( NNtB B M B N NNBNtBN B M log)log(loglog B M B N BN N t loglog log )log( B N B N B M I/O-efficient Algorithms and Data Structures
  19. 19. Lars Arge 19 Summary/Conclusion: Sorting • External merge or distribution sort takes I/Os – Merge-sort based on M/B-way merging – Distribution sort based on -way distribution and (complicated) split elements finding – Key is linear I/O M/B-way merging/distribution • Optimal in comparison model • lower bound in stronger indivisibility model – Holds even for permuting )log( B N B N B MO })log,(min{ B N B N B MN B M I/O-efficient Algorithms and Data Structures
  20. 20. Outline • Today: – Sorting upper bound (merge- and distribution-sort) – Sorting (permuting) lower bound – Cache-oblivious sorting (if time permits) – Batched geometric problems: Distribution sweeping • Monday: – Searching (B-trees) – Batched searching (Buffer-trees) – I/O-efficient priority queues – I/O-efficient list ranking Lars Arge 20 I/O-efficient Algorithms and Data Structures
  21. 21. Lars Arge 21 Cache-Oblivious Model • N, B, and M as in I/O-model – Assumed that M>B2 (tall cache assumption) • M and B not used in algorithm description • Block transfers by optimal paging strategy Analyze in two-level model Efficient on one level, efficient on all levels (of fully associative hierarchy using LRU) I/O-efficient Algorithms and Data Structures
  22. 22. Lars Arge 22 • levels • Distribution in I/Os algorithms I/O-model Distribution Sort B M 1 )( B N O )log( B N B N B MO I/O-efficient Algorithms and Data Structures B M )(log)(log M N M N B M B M OO
  23. 23. Lars Arge 23 Cache-Oblivious Distribution Sort • General idea: way distributionN N N N N N N • Distribution performed in accesses after – breaking the N elements into subarrays of size – sorting each subarray recursively )( B N O N N Recurrence for number of accesses used to sort )log()( B N B N B MONT MNO MNONTN NT B N B N if)( if)()(2 )( I/O-efficient Algorithms and Data Structures
  24. 24. Lars Arge 24 Cache-Oblivious Distribution Sort • Distribution in accesses (assuming buckets known):)( B N O NN – Divide subarrays into two equal sized groups A1 and A2 – Divide buckets into two equal-sized groups B1 and B2 – Recursively: 1. Distribute (relevant) elements from A1 into B1 2. Distribute (relevant) elements from A2 into B1 3. Distribute elements from A1 into B2 4. Distribute elements from A2 into B2 I/O-efficient Algorithms and Data Structures
  25. 25. Lars Arge 25 Cache-Oblivious Distribution Sort Analysis of distribution: – Consider recursive subproblems on B subarrays * such problems * Each subproblem solved in accesses accesses 2 log 4 B NB N B B movedelements )()( 2 B N B N B N OBO N 2 N 2 2 N B B N log NB I/O-efficient Algorithms and Data Structures
  26. 26. Lars Arge 26 Batched Geometric Problems • Example: Orthogonal line segment intersection – Given set of axis-parallel line segments, report all intersections • In internal memory many problems is solved using sweeping I/O-efficient Algorithms and Data Structures
  27. 27. Lars Arge 27 Plane Sweeping • Sweep plane top-down while maintaining search tree T on vertical segments crossing sweep line (by x-coordinates) – Top endpoint of vertical segment: Insert in T – Bottom endpoint of vertical segment: Delete from T – Horizontal segment: Perform range query with x-interval on T I/O-efficient Algorithms and Data Structures
  28. 28. Lars Arge 28 Plane Sweeping • In internal memory algorithm runs in optimal O(Nlog N+T) time • In external memory algorithm performs badly (>N I/Os) if |T|>M • Solution: Distribution sweeping – Combination of distribution and plane sweeping I/O-efficient Algorithms and Data Structures
  29. 29. Lars Arge 29 Distribution Sweeping • Divide plane into M/B-1 slabs with O(N/(M/B)) endpoints each • Sweep plane top-down while reporting intersections between – part of horizontal segment spanning slab(s) and vertical segments • Distribute data to M/B-1 slabs – vertical segments and non-spanning parts of horizontal segments • Recourse in each slab I/O-efficient Algorithms and Data Structures
  30. 30. Lars Arge 30 Distribution Sweeping • Sweep performed in O(N/B+T’/B) I/Os I/Os • Maintain active list of vertical segments for each slab (<B in memory) – Top endpoint of vertical segment: Insert in active list – Horizontal segment: Scan through all relevant active lists * Removing “expired” vertical segments * Reporting intersections with “non-expired” vertical segments )log( B T B N B N B MO I/O-efficient Algorithms and Data Structures
  31. 31. Lars Arge 31 Distribution Sweeping • Other example: Rectangle intersection – Given set of axis-parallel rectangles, report all intersections. I/O-efficient Algorithms and Data Structures
  32. 32. Lars Arge 32 Distribution Sweeping • Divide plane into M/B-1 slabs with O(N/(M/B)) endpoints each • Sweep plane top-down while reporting intersections between – part of rectangles spanning slab(s) and other rectangles • Distribute data to M/B-1 slabs – Non-spanning parts of rectangles • Recurse in each slab I/O-efficient Algorithms and Data Structures
  33. 33. Lars Arge 33 Distribution Sweeping • Seems hard to perform sweep in O(N/B+T’/B) I/Os • Solution: Multislabs (contigious set of slabs) – Reduce fanout of distribution to – Recursion height still – Room for block from each multislab (activlist) in memory )( B M )(log B N B MO I/O-efficient Algorithms and Data Structures
  34. 34. Lars Arge 34 Distribution Sweeping • Sweep while maintaining rectangle active list for each multisslab – Top side of spanning rectangle: Insert in active multislab list – Each rectangle: Scan through all relevant multislab lists * Removing “expired” rectangles * Reporting intersections with “non-expired” rectangles I/Os)log( B T B N B N B MO I/O-efficient Algorithms and Data Structures
  35. 35. Lars Arge 35 Homework Design an I/O algorithm that given N rectangles in the plane computes the measure (area) of the their union. Make sure to argue for correctness and I/O-complexity of your algorithm. Hint: First find for each input y-coordinate y the length of the intersection between the rectangles and a horizontal line through the y. Use distribution sweeping with a combine step to do so. )log( B N B N B MO I/O-efficient Algorithms and Data Structures
  36. 36. Outline • Today: – Sorting upper bound (merge- and distribution-sort) – Sorting (permuting) lower bound – Cache-oblivious sorting (if time permits) – Batched geometric problems: Distribution sweeping • Monday: – Searching (B-trees) – Batched searching (Buffer-trees) – I/O-efficient priority queues – I/O-efficient list ranking Lars Arge 36 I/O-efficient Algorithms and Data Structures
  37. 37. Lars Arge 37 References • Input/Output Complexity of Sorting and Related Problems A. Aggarwal and J.S. Vitter. CACM 31(9), 1988 • External-Memory Computational Geometry. M.T. Goodrich, J-J. Tsay, D.E. Vengroff, and J.S. Vitter. Proc. FOCS'93 • Cache-Oblivious Algorithms. M. Frigo, C. Leiserson, H. Prokop, and S. Ramachandran. Proc. FOCS '99. I/O-efficient Algorithms and Data Structures
  38. 38. • High level objectives: – Advance algorithmic knowledge in “massive data” processing area – Train researchers in world-leading international environment – Be catalyst for multidisciplinary/industry collaboration • Building on: – Strong international team – Vibrant international environment – Focus areas (among others): * I/O-efficient algorithms * Streaming algorithms * Cache-oblivious algorithms * Algorithm engineering Center for Massive Data Algorithmics We are hiring!! Lars Arge 38 I/O-efficient Algorithms and Data Structures
  39. 39. Thanks large@cs.au.dk www.madalgo.au.dk

×