
The skyline operator lee, woonghee

For my first paper review at my lab seminar, I presented the paper by Borzsonyi, S., Kossmann, D., and Stocker, K. I made this PPT file to present to my fellow students and my professor. The paper is good for studying not only algorithms (divide and conquer) but also the mathematical aspects of databases (linear algebraic methods).

Published in: Data & Analytics

  1. The Skyline Operator. Borzsonyi, S., Kossmann, D., and Stocker, K., The Skyline Operator, in Proc. IEEE Conf. on Data Engineering, pp. 421-430, Heidelberg, Germany, Apr. 2001. Presenter: Lee, Woonghee, M.S. student at the Big Data Mining Lab, Department of Computer Science and Engineering, Hanyang University. July 24th - August 7th, 2015
  2. Contents
     ● Introduction: What is the Skyline?
     ● SQL’s Extensions
     ● Implementation
       • Two-dimensional sorting
       • Block-nested-loops algorithm
       • Divide-and-conquer algorithm
     ● Experiments and results
  3. Introduction
     • Holiday to Nassau (Bahamas)
     • Looking for a hotel that is cheap and close to the beach
     • The two goals (distance, price) trade off against each other, as hotels near the beach tend to be more expensive
  4. Introduction: Definition of the Skyline
     • Interesting hotels are those that are not worse than any other hotel, i.e., no other hotel is at least as good in every dimension and better in at least one.
     • The set of interesting hotels is the Skyline.
  5. Introduction: Definition of the Skyline (example)
     • Figure: hotel A and the set of hotels dominated by A, which includes hotel B.
     • Ex) Hotel A <$80, 0.7 miles> dominates Hotel B <$120, 1.0 miles>.
  6. Introduction: Definition of the Skyline
     • Let vectors $h_i, h_j \in \mathbb{R}^n$ be $h_i = \langle x_{i1}, \dots, x_{in} \rangle$ and $h_j = \langle x_{j1}, \dots, x_{jn} \rangle$. If every $x_{ik}$ is better than or equal to $x_{jk}$, and at least one $x_{il}$ is better than $x_{jl}$, then $h_i \succ h_j$ ($h_i$ dominates $h_j$).
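The dominance test above can be sketched in a few lines of Python. This is a minimal illustration, assuming that "better" means "smaller" in every dimension (as with price and distance); the function name `dominates` is a hypothetical choice, not something from the paper.

```python
from typing import Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a dominates b, assuming smaller values are better."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    # a must be at least as good as b in every dimension ...
    no_worse = all(x <= y for x, y in zip(a, b))
    # ... and strictly better in at least one dimension.
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


hotel_a = (80, 0.7)    # <price in $, distance in miles>, from the slide example
hotel_b = (120, 1.0)
print(dominates(hotel_a, hotel_b))  # True
print(dominates(hotel_b, hotel_a))  # False
```

Note that a tuple never dominates an identical tuple, because no dimension is strictly better.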
  7. Introduction: Applications of the Skyline
     • Proposing interesting hotels
     • Finding good salespersons who have a low salary
     • Deriving database visualizations
  8. SQL’s Extensions
     • The paper proposes extending SQL’s SELECT statement with an optional SKYLINE OF clause.
  9. SQL’s Extensions
     • Each entry of the SKYLINE OF clause names a dimension of the Skyline.
     • Each dimension is annotated to specify whether its value should be minimized, maximized, or be different.
     • Ex) The price of a hotel should be minimized; the rating should be maximized.
  10. SQL’s Extensions: SKYLINE OF clause
     • Tuple $p = (p_1, \dots, p_k, p_{k+1}, \dots, p_l, p_{l+1}, \dots, p_m, p_{m+1}, \dots, p_n)$ dominates tuple $q = (q_1, \dots, q_k, q_{k+1}, \dots, q_l, q_{l+1}, \dots, q_m, q_{m+1}, \dots, q_n)$ for a Skyline query SKYLINE OF $d_1$ MIN, …, $d_k$ MIN, $d_{k+1}$ MAX, …, $d_l$ MAX, $d_{l+1}$ DIFF, …, $d_m$ DIFF if the following three conditions hold:
       $p_i \le q_i$ for all $i = 1, \dots, k$
       $p_i \ge q_i$ for all $i = k+1, \dots, l$
       $p_i = q_i$ for all $i = l+1, \dots, m$
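A small Python sketch may make the three conditions concrete. The encoding of the clause as a list of `(attribute, annotation)` pairs and the attribute names are assumptions made for illustration. The sketch also adds one requirement that is not among the three conditions on the slide: at least one MIN or MAX dimension must be strictly better, in line with the earlier definition of dominance, so that identical tuples do not eliminate each other.

```python
from typing import Any, Mapping, Sequence, Tuple


def dominates(p: Mapping[str, Any], q: Mapping[str, Any],
              spec: Sequence[Tuple[str, str]]) -> bool:
    """Check whether tuple p dominates tuple q under a SKYLINE OF spec.

    spec is a hypothetical encoding of the clause: a list of
    (attribute, annotation) pairs with annotation "MIN", "MAX" or "DIFF".
    """
    strictly_better = False
    for attr, kind in spec:
        if kind == "MIN":
            if p[attr] > q[attr]:        # violates p_i <= q_i
                return False
            strictly_better = strictly_better or p[attr] < q[attr]
        elif kind == "MAX":
            if p[attr] < q[attr]:        # violates p_i >= q_i
                return False
            strictly_better = strictly_better or p[attr] > q[attr]
        elif kind == "DIFF":
            if p[attr] != q[attr]:       # violates p_i = q_i
                return False
        else:
            raise ValueError(f"unknown annotation: {kind}")
    # Added assumption: identical tuples do not dominate each other.
    return strictly_better


spec = [("price", "MIN"), ("rating", "MAX"), ("city", "DIFF")]
a = {"price": 80, "rating": 4, "city": "Nassau"}
b = {"price": 120, "rating": 3, "city": "Nassau"}
c = {"price": 120, "rating": 3, "city": "Freeport"}
print(dominates(a, b, spec))  # True
print(dominates(a, c, spec))  # False: the DIFF attribute differs
```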
  11. SQL’s Extensions: SKYLINE OF clause
     • If two tuples have the same values for all attributes and are not dominated, both are part of the result (unless DISTINCT is specified).
     • A one-dimensional Skyline is equivalent to a min, max, or distinct SQL query.
     • Dominance is a transitive relation: if $p$ dominates $q$ and $q$ dominates $r$, then $p$ dominates $r$.
  12. SQL’s Extensions: Example Skyline Queries
  13. Implementation of the Skyline Operation
     • Nested SQL query
     • Two-dimensional sorting
     • Three variants based on a block-nested-loops algorithm
     • Three variants based on divide-and-conquer
  14. Implementation: Nested SQL query
     • A nested SQL query gives the same result as the SKYLINE OF clause, but with poor performance.
  15. Implementation: Nested SQL query. Reasons why this approach shows very poor performance:
     • The query cannot be unnested.
     • If the query involves a join or group by, the join or group by might be executed both in the outer query and in the subquery.
     • Combining it with other operations (e.g., join or Top N) might add further cost to computing the Skyline.
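To see why the nested formulation is expensive, it helps to look at what it computes: for every tuple, a subquery checks whether any other tuple dominates it. The Python sketch below mirrors that logic directly (it is not the SQL from the slide, and the function names are hypothetical). It assumes smaller is better in every dimension and performs a quadratic number of dominance tests.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """True if a dominates b, assuming smaller is better in every dimension."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline_naive(points: Sequence[Point]) -> List[Point]:
    """Keep each point for which no dominating point exists (nested loops)."""
    return [p for p in points
            if not any(dominates(q, p) for q in points)]


# <price in $, distance in miles> for h1..h6 from the BNL demonstration slides
hotels = [(50, 0.9), (50, 1.0), (55, 1.0), (52, 0.7), (49, 1.0), (51, 0.7)]
print(skyline_naive(hotels))  # [(50, 0.9), (49, 1.0), (51, 0.7)]
```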
  16. Implementation: Two Dimensional
     • A one-dimensional Skyline is equivalent to computing the min, max, or distinct.
     • Computing a two-dimensional Skyline is also easy after sorting the data: each tuple only needs to be compared with its predecessor, the last tuple kept so far.
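The sorting idea can be sketched as follows, assuming two dimensions where smaller is better in both; `skyline_2d` is a hypothetical name. After sorting by the first dimension (ties broken by the second), a tuple belongs to the Skyline exactly when its second value is strictly smaller than that of the last tuple kept. As a side effect, this sketch also drops exact duplicates.

```python
from typing import List, Sequence, Tuple


def skyline_2d(points: Sequence[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Two-dimensional Skyline via sorting; smaller is better in both dimensions."""
    result: List[Tuple[float, float]] = []
    for p in sorted(points):              # sort by x, ties broken by y
        # The predecessor already has x <= p.x, so p survives only if
        # its y is strictly better than the predecessor's y.
        if not result or p[1] < result[-1][1]:
            result.append(p)
    return result


# <price in $, distance in miles> for h1..h6 from the BNL demonstration slides
hotels = [(50, 0.9), (50, 1.0), (55, 1.0), (52, 0.7), (49, 1.0), (51, 0.7)]
print(skyline_2d(hotels))  # [(49, 1.0), (50, 0.9), (51, 0.7)]
```

The cost is dominated by the sort; the scan itself makes one comparison per tuple.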
  18. Implementation: Two Dimensional
     • Definition of the Skyline in two dimensions: let $A, B$ be vectors in $\mathbb{R}^2$, with $A_i = \langle x_i, y_i \rangle$ and $B_j = \langle x_j, y_j \rangle$ for all $i, j$. Then $A \propto B$ when ($x_i < x_j$ and $y_i \le y_j$) or ($x_i \le x_j$ and $y_i < y_j$) or ($x_i < x_j$ and $y_i < y_j$).
  19. Implementation: Two Dimensional
     • Proving the 2-dimensional Skyline algorithm (figure: tuples $i$, $j$, $k$ with the labels "dominate" and "no dominate").
     • Premise 1: $i < k$, $h_i \propto h_k$.
     • Premise 2: $h_i \succ h_j$.
     • Premise 3: for all $h_a, h_b$ with $a < b$, $x_a < x_b$.
     • Proposition: $\forall k < j$, $h_k \succ h_j$.
  20. Implementation: Two Dimensional
     • Proving the 2-dimensional Skyline algorithm: let each vector consist of elements $\langle x, y \rangle$. Similarly to Premise 2, for all $h_i, h_j$: $y_i > y_j$ ($\because$ the 2-D Skyline definition and Premise 3). Let us call this Premise 4.
  21. Implementation: Two Dimensional
     • Proving the 2-dimensional Skyline algorithm: assume $h_k \propto h_j$, i.e., $y_k \le y_j$. Because $x_k < x_j$ ($\because$
  22. Implementation: Two Dimensional
     • However, sorting the data does not work for computing the Skyline in more than two dimensions, because a tuple may have to be compared with tuples other than its direct predecessor. In the example figure (one axis: rating in stars), $h_1$ is not $h_3$’s direct predecessor.
  23. Implementation: BNL Algorithms (variants: 1. BNL-basic 2. BNL-sol 3. BNL-solrep). Basic Block-nested-loops Algorithm. Idea:
     • It repeatedly reads the set of tuples, like the naive nested-loops algorithm.
     • It keeps a window of incomparable tuples in main memory.
  24. Implementation: BNL Algorithms. Basic Block-nested-loops Algorithm. Steps:
     1. A tuple $p$ is read from the input.
     2. $p$ is compared to all tuples in the window.
     3. Based on step 2, $p$ is either eliminated, placed into the window, or written to a temporary file.
  25. Implementation: BNL Algorithms. Basic Block-nested-loops Algorithm. When $p$ is compared with the window, three cases can occur:
     1. $p$ is dominated by a tuple within the window: $p$ is eliminated.
     2. $p$ dominates one or more tuples in the window: those tuples are removed and $p$ takes their place in the window.
     3. $p$ is incomparable with all tuples in the window: $p$ is placed into the window if there is room, otherwise it is written to the temporary file.
     Complexity: the best case is $O(n)$; the worst case is $O(n^2)$.
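The steps and cases above can be sketched in Python as an in-memory simulation. This is an illustration under stated assumptions, not the paper's implementation: the temporary file is a plain list, `window_size` counts tuples rather than memory pages, smaller is better in every dimension, and all names are hypothetical. Timestamps are used to decide when a window tuple has been compared with every tuple that is still pending, at which point it can be output.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """True if a dominates b, assuming smaller is better in every dimension."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def bnl_skyline(points: Sequence[Point], window_size: int) -> List[Point]:
    """Block-nested-loops style Skyline with a bounded window (sketch)."""
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    skyline: List[Point] = []
    window: List[Tuple[int, Point]] = []      # (timestamp, tuple)
    pending = [(0, p) for p in points]        # original input has timestamp 0
    clock = 0
    while pending:
        temp: List[Tuple[int, Point]] = []    # stands in for the temporary file
        for written_at, p in pending:
            # Window tuples older than p's timestamp have already met p and
            # every tuple after it, so they are final and can be output.
            skyline.extend(w for ts, w in window if ts < written_at)
            window = [(ts, w) for ts, w in window if ts >= written_at]
            if any(dominates(w, p) for _, w in window):
                continue                      # case 1: p is eliminated
            # case 2: drop the window tuples that p dominates
            window = [(ts, w) for ts, w in window if not dominates(p, w)]
            clock += 1
            if len(window) < window_size:
                window.append((clock, p))     # room left: p enters the window
            else:
                temp.append((clock, p))       # case 3 with a full window
        # End of iteration: output window tuples inserted before the first spill.
        first_spill = temp[0][0] if temp else clock + 1
        skyline.extend(w for ts, w in window if ts < first_spill)
        window = [(ts, w) for ts, w in window if ts >= first_spill]
        pending = temp
    return skyline


# <price in $, distance in miles> for h1..h6 from the demonstration slides
hotels = [(50, 0.9), (50, 1.0), (55, 1.0), (52, 0.7), (49, 1.0), (51, 0.7)]
print(bnl_skyline(hotels, window_size=2))  # [(50, 0.9), (51, 0.7), (49, 1.0)]
```

With a window of two tuples, the first pass assigns the same timestamps as the demonstration slides (h1: 1, h4: 2, h5: 3, h6: 4). Unlike the slides, the sketch keeps h6 in the window until the following pass, because h6 entered the window after h5 was written out and the two have not yet been compared.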
  26. Implementation: BNL Algorithms. Demonstration of BNL-basic
     • Data set: <h1, $50, 0.9 miles>, <h2, $50, 1.0 miles>, <h3, $55, 1.0 miles>, <h4, $52, 0.7 miles>, <h5, $49, 1.0 miles>, <h6, $51, 0.7 miles>. The figure shows the data set, the window, and the temporary file.
     • h1 is read and placed into the window with timestamp t = 1. The timestamps ensure that two tuples are never compared twice.
     • h2 is compared with the window: it is dominated by h1 and eliminated.
     • h3 is dominated by h1 and eliminated.
     • h4 is incomparable with h1 and is placed into the window with t = 2.
     • h5 is incomparable with h1 and h4; the window has no room left, so h5 is written to the temporary file with t = 3.
     • h6 is compared with the window: it dominates h4, so h4 is removed and h6 replaces it in the window with t = 4.
     • The window is output at the end of the iteration; the algorithm iterates until EOF.
     • Next iteration: the temporary file (<h5, $49, 1.0 miles>, t = 3) is the input.
  40. Implementation: BNL Algorithms. Basic Block-nested-loops Algorithm
     • It works well if the Skyline fits into the window.
     • The best-case complexity is $O(n)$.
     • The worst-case complexity is $O(n^2)$.
     • It has better I/O behavior than the naive nested-loops algorithm (Haas, Carey, Livny and Shukla 1997).
  41. Implementation: BNL Algorithms. Maintaining the Window as a Self-organizing List
     • Goal: speed up the comparison of an input tuple with the tuples in the window.
     • A window tuple that dominates a tuple from the input is moved up to the first line of the window.
     • Example: the window holds <h2, $50, 0.8 miles> (t = 2) and <h5, $49, 1.0 miles> (t = 3). The input tuple <h7, $49, 1.2 miles> is dominated by h5, so h5 moves up to the first line of the window.
  42. Implementation: BNL Algorithms. Maintaining the Window as a Self-organizing List
     • Attractive if the data is skewed, i.e., performance improves if there are a couple of killer tuples that dominate many other tuples and many neutral tuples in the Skyline.
     • Figure labels: killer tuple; tuples dominated by the killer tuple; neutral tuples.
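The move-to-front behavior described on these slides can be sketched as a small helper. It assumes the window is a Python list with the "first line" at index 0 and that smaller is better in every dimension; the name `dominated_with_move_to_front` is hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """True if a dominates b, assuming smaller is better in every dimension."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def dominated_with_move_to_front(window: List[Point], p: Point) -> bool:
    """Return True if a window tuple dominates p, moving that tuple to the front.

    The window is scanned from the front, so tuples that have recently
    eliminated input tuples are tried first on the next lookup.
    """
    for i, w in enumerate(window):
        if dominates(w, p):
            window.insert(0, window.pop(i))   # move the dominating tuple up
            return True
    return False


window = [(50, 0.8), (49, 1.0)]               # h2 and h5 from the slide
h7 = (49, 1.2)
print(dominated_with_move_to_front(window, h7))  # True
print(window)                                     # [(49, 1.0), (50, 0.8)]
```

If a few killer tuples eliminate most of the input, they end up near the front and most input tuples are discarded after one or two comparisons.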
  43. Implementation: BNL Algorithms. Replacing Tuples in the Window
     • The window holds <h1, $50, 1.0 miles> (t = 1) and <h2, $59, 0.9 miles> (t = 2). The input tuple <h3, $60, 0.1 miles> is incomparable with both, and there is no memory left in the window.
     • h3 replaces h2 in the window and receives t = 3, so the window then holds h1 (t = 1) and h3 (t = 3).
     • Figure: scatter plot of h1, h2, and h3 by price and distance.
  46. Implementation: BNL Algorithms. Replacing Tuples in the Window
     • A window tuple is replaced when the new tuple can dominate more tuples.
     • Many replacement policies are possible, e.g., one based on price * distance.
     • Additional CPU cost is needed.
     • Two tuples in the temporary file might be compared twice.
  47. Implementation: D&C Algorithm (variants: 1. D&C-basic 2. D&C-mpt 3. D&C-mptesk)
     • Both the worst-case and the best-case complexity are $O(n \cdot (\log n)^{d-2}) + O(n \cdot \log n)$, where $n$ is the number of input tuples and $d$ is the number of dimensions (Kung, Luccio and Preparata 1975).
     • This is unlike BNL, whose best case is $O(n)$ and whose worst case is $O(n^2)$.
  48. Implementation: D&C Algorithm. Basic Divide and Conquer Algorithm. Steps:
     1. Compute the approximate median $m_p$ of the input for some dimension $d_p$. Divide the input into two partitions, $P_1$ and $P_2$, by $m_p$.
     2. Compute the Skyline $S_1$ of $P_1$ and $S_2$ of $P_2$. $P_1$ and $P_2$ are recursively partitioned until a partition has one or a few tuples, at which point computing the Skyline is trivial.
     3. Compute the overall Skyline by merging $S_1$ and $S_2$.
  49. Implementation: D&C Algorithm. Basic Divide and Conquer Algorithm. Steps:
     • In step 3, partition both $S_1$ and $S_2$ by an approximate median $m_g$ for a dimension $d_g$ ($\ne d_p$). This yields the partitions $S_{1,1}$, $S_{1,2}$, $S_{2,1}$, and $S_{2,2}$.
  50. Implementation: D&C Algorithm. Pseudo-code
     M: input of the Skyline operation; a set of d-dimensional points
     R: output of the Skyline operation; a set of d-dimensional points
     p ≺ q: point p is dominated by point q
     function SkylineBasic(M, Dimension)
     begin
       if |M| = 1 then return M
       Pivot := Median{m_Dimension | m ∈ M}
       (P1, P2) := Partition(M, Dimension, Pivot)
       S1 := SkylineBasic(P1, Dimension)
       S2 := SkylineBasic(P2, Dimension)
       return S1 ∪ MergeBasic(S1, S2, Dimension)
     end
     function MergeBasic(S1, S2, Dimension)
     begin
       if S1 = {p} then
         R := {q ∈ S2 | q ⊀ p}
       else if S2 = {q} then
       begin
         R := {q}
         for each p ∈ S1 do
           if q ≺ p then R := ∅
       end
       else if Dimension = 2 then
       begin
         Min := Minimum{p_1 | p ∈ S1}
         R := {q ∈ S2 | q_1 < Min}
       end
       else
       begin
         Pivot := Median{p_(Dimension−1) | p ∈ S1}
         (S1,1, S1,2) := Partition(S1, Dimension − 1, Pivot)
         (S2,1, S2,2) := Partition(S2, Dimension − 1, Pivot)
         R1 := MergeBasic(S1,1, S2,1, Dimension)
         R2 := MergeBasic(S1,2, S2,2, Dimension)
         R3 := MergeBasic(S1,1, R2, Dimension − 1)
         R := R1 ∪ R3
       end
       return R
     end
     See also Preparata et al. (1993), Computational Geometry, pp. 161-164, Springer.
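The pseudo-code above can be turned into a runnable Python sketch. It rests on assumptions that are not part of the slide: smaller is better in every dimension, and no two points share a value in any dimension (the sketch raises an error otherwise, because the partitioning relies on strict splits). It also adds guards for empty sets and for one-dimensional input, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """True if a dominates b, assuming smaller is better in every dimension."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline_basic(points: Sequence[Point]) -> List[Point]:
    """Divide-and-conquer Skyline; assumes distinct values in every dimension."""
    pts = list(points)
    if not pts:
        return []
    d = len(pts[0])
    for k in range(d):
        if len({p[k] for p in pts}) != len(pts):
            raise ValueError("sketch assumes distinct values in every dimension")
    return _skyline(pts, d)


def _skyline(m: List[Point], dim: int) -> List[Point]:
    if len(m) <= 1:
        return m
    m = sorted(m, key=lambda p: p[dim - 1])   # split at the median of dimension dim
    half = len(m) // 2
    s1 = _skyline(m[:half], dim)              # better half in dimension dim
    s2 = _skyline(m[half:], dim)
    return s1 + _merge(s1, s2, dim)


def _merge(s1: List[Point], s2: List[Point], dim: int) -> List[Point]:
    """Return the points of s2 that no point of s1 dominates.

    Precondition: every point of s1 is better than every point of s2 in
    dimensions dim..d (1-indexed), so only dimensions 1..dim-1 matter.
    """
    if not s1 or not s2:
        return list(s2)
    if dim == 1:
        return []                             # s1 is better in all dimensions
    if dim == 2:
        best = min(p[0] for p in s1)
        return [q for q in s2 if q[0] < best]
    if len(s1) == 1:
        return [q for q in s2 if not dominates(s1[0], q)]
    if len(s2) == 1:
        return [] if any(dominates(p, s2[0]) for p in s1) else list(s2)
    k = dim - 2                               # 0-based index of dimension dim-1
    pivot = sorted(p[k] for p in s1)[(len(s1) - 1) // 2]
    s11 = [p for p in s1 if p[k] <= pivot]
    s12 = [p for p in s1 if p[k] > pivot]
    s21 = [q for q in s2 if q[k] <= pivot]
    s22 = [q for q in s2 if q[k] > pivot]
    r1 = _merge(s11, s21, dim)
    r2 = _merge(s12, s22, dim)
    r3 = _merge(s11, r2, dim - 1)             # s11 also beats s22 in dimension dim-1
    return r1 + r3


# Hypothetical three-dimensional data with distinct values per dimension
data = [(50, 0.9, 3), (49, 1.0, 4), (51, 0.7, 5), (52, 0.8, 6), (55, 1.1, 7)]
print(skyline_basic(data))
```

The partition `s12` against `s21` is never merged: every point of `s12` is worse than every point of `s21` in dimension `dim - 1`, so it cannot dominate any of them.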
  51. Implementation: D&C Algorithm. In the MergeBasic function (figure): the four partitions $S_{1,1}$, $S_{1,2}$, $S_{2,1}$, and $S_{2,2}$, separated by $median_p$ and $median_g$.
  55. 55. Borzsony, S., The Skyline Operator, In proc, IEEE Conf. on Data Engineering, page 421-430, Heidelberg, Germany, Apr. 2001. Implementation: D&C Algorithm Pseudo-code1. D&C-basic 2. D&C-mpt 3. D&C-mptesk 𝑴: input of the Skyline operation; a set of 𝑑-dimensional points 𝑹: output of the Skyline operation; a set of 𝑑-dimensional points 𝒑 ≺ 𝒒: point 𝒑 is dominated by point 𝒒 function SkylineBasic M, Dimension begin if 𝑀 = 1 then return 𝑀 𝑃𝑖𝑣𝑜𝑡 ≔ Median 𝑚 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 𝑚 ∈ 𝑀 𝑃1, 𝑃2 ≔ Partition 𝑀, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛, 𝑃𝑖𝑣𝑜𝑡 𝑆1 ≔ SkylineBasic(𝑃1, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛) 𝑆2 ≔ SkylineBasic 𝑃2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 return 𝑆1 ⋃ MergeBasic(𝑆1, 𝑆2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛) end function MergeBasic(𝑆1, 𝑆2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛) begin if 𝑆1 = {𝑝} then 𝑅 ≔ 𝑞 ∈ 𝑆2 𝑝 ≻ 𝑞 else if 𝑆2 = 𝑞 then begin for each 𝑝 ∈ 𝑆1 do if 𝑝 ≺ 𝑞 then 𝑅 ≔ 𝜙 end else if 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 = 2 then begin 𝑀𝑖𝑛 ≔ Minimum{𝑝1|𝑝 ∈ 𝑆1} 𝑅 ≔ 𝑞 ∈ 𝑆2 𝑞1 < 𝑀𝑖𝑛 end else begin 𝑃𝑖𝑣𝑜𝑡 ≔ Median 𝑝 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛−1 𝑝 ∈ 𝑆1 𝑆1,1, 𝑆1,2 ≔ Partition 𝑆1, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 − 1, 𝑃𝑖𝑣𝑜𝑡 𝑆2,1, 𝑆2,2 ≔ Partition 𝑆2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 − 1, 𝑃𝑖𝑣𝑜𝑡 𝑅1 ≔ MergeBasic 𝑆1,1, 𝑆2,1, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 𝑅2 ≔ MergeBasic 𝑆1,2, 𝑆2,2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 𝑅3 ≔ MergeBasic 𝑆1,1, 𝑅2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 − 1 𝑅 ≔ 𝑅1 ∪ 𝑅3 end return R end See also Preparata et al. (1993), Computational Geometry, pp. 161-164, Springer.
  56. 56. Borzsony, S., The Skyline Operator, In proc, IEEE Conf. on Data Engineering, page 421-430, Heidelberg, Germany, Apr. 2001. Implementation: D&C Algorithm Pseudo-code1. D&C-basic 2. D&C-mpt 3. D&C-mptesk 𝑴: input of the Skyline operation; a set of 𝑑-dimensional points 𝑹: output of the Skyline operation; a set of 𝑑-dimensional points 𝒑 ≺ 𝒒: point 𝒑 is dominated by point 𝒒 function SkylineBasic M, Dimension begin if 𝑀 = 1 then return 𝑀 𝑃𝑖𝑣𝑜𝑡 ≔ Median 𝑚 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 𝑚 ∈ 𝑀 𝑃1, 𝑃2 ≔ Partition 𝑀, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛, 𝑃𝑖𝑣𝑜𝑡 𝑆1 ≔ SkylineBasic(𝑃1, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛) 𝑆2 ≔ SkylineBasic 𝑃2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 return 𝑆1 ⋃ MergeBasic(𝑆1, 𝑆2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛) end
  57. 57. Borzsony, S., The Skyline Operator, In proc, IEEE Conf. on Data Engineering, page 421-430, Heidelberg, Germany, Apr. 2001. Implementation: D&C Algorithm Pseudo-code1. D&C-basic 2. D&C-mpt 3. D&C-mptesk 𝑴: input of the Skyline operation; a set of 𝑑-dimensional points 𝑹: output of the Skyline operation; a set of 𝑑-dimensional points 𝒑 ≺ 𝒒: point 𝒑 is dominated by point 𝒒 function SkylineBasic M, Dimension begin if 𝑀 = 1 then return 𝑀 𝑃𝑖𝑣𝑜𝑡 ≔ Median 𝑚 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 𝑚 ∈ 𝑀 𝑃1, 𝑃2 ≔ Partition 𝑀, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛, 𝑃𝑖𝑣𝑜𝑡 𝑆1 ≔ SkylineBasic(𝑃1, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛) 𝑆2 ≔ SkylineBasic 𝑃2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 return 𝑆1 ⋃ MergeBasic(𝑆1, 𝑆2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛) end
  58. 58. Borzsony, S., The Skyline Operator, In proc, IEEE Conf. on Data Engineering, page 421-430, Heidelberg, Germany, Apr. 2001. Implementation: D&C Algorithm Pseudo-code1. D&C-basic 2. D&C-mpt 3. D&C-mptesk 𝑴: input of the Skyline operation; a set of 𝑑-dimensional points 𝑹: output of the Skyline operation; a set of 𝑑-dimensional points 𝒑 ≺ 𝒒: point 𝒑 is dominated by point 𝒒 function SkylineBasic M, Dimension begin if 𝑀 = 1 then return 𝑀 𝑃𝑖𝑣𝑜𝑡 ≔ Median 𝑚 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 𝑚 ∈ 𝑀 𝑃1, 𝑃2 ≔ Partition 𝑀, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛, 𝑃𝑖𝑣𝑜𝑡 𝑆1 ≔ SkylineBasic(𝑃1, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛) 𝑆2 ≔ SkylineBasic 𝑃2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 return 𝑆1 ⋃ MergeBasic(𝑆1, 𝑆2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛) end
  59. 59. Borzsony, S., The Skyline Operator, In proc, IEEE Conf. on Data Engineering, page 421-430, Heidelberg, Germany, Apr. 2001. Implementation: D&C Algorithm Pseudo-code1. D&C-basic 2. D&C-mpt 3. D&C-mptesk 𝑴: input of the Skyline operation; a set of 𝑑-dimensional points 𝑹: output of the Skyline operation; a set of 𝑑-dimensional points 𝒑 ≺ 𝒒: point 𝒑 is dominated by point 𝒒 function SkylineBasic M, Dimension begin if 𝑀 = 1 then return 𝑀 𝑃𝑖𝑣𝑜𝑡 ≔ Median 𝑚 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 𝑚 ∈ 𝑀 𝑃1, 𝑃2 ≔ Partition 𝑀, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛, 𝑃𝑖𝑣𝑜𝑡 𝑆1 ≔ SkylineBasic(𝑃1, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛) 𝑆2 ≔ SkylineBasic 𝑃2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 return 𝑆1 ⋃ MergeBasic(𝑆1, 𝑆2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛) end
  60. 60. Borzsony, S., The Skyline Operator, In proc, IEEE Conf. on Data Engineering, page 421-430, Heidelberg, Germany, Apr. 2001. Implementation: D&C Algorithm Pseudo-code1. D&C-basic 2. D&C-mpt 3. D&C-mptesk 𝑴: input of the Skyline operation; a set of 𝑑-dimensional points 𝑹: output of the Skyline operation; a set of 𝑑-dimensional points 𝒑 ≺ 𝒒: point 𝒑 is dominated by point 𝒒 function SkylineBasic M, Dimension begin if 𝑀 = 1 then return 𝑀 𝑃𝑖𝑣𝑜𝑡 ≔ Median 𝑚 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 𝑚 ∈ 𝑀 𝑃1, 𝑃2 ≔ Partition 𝑀, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛, 𝑃𝑖𝑣𝑜𝑡 𝑆1 ≔ SkylineBasic(𝑃1, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛) 𝑆2 ≔ SkylineBasic 𝑃2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 return 𝑆1 ⋃ MergeBasic(𝑆1, 𝑆2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛) end
  61. 61. Borzsony, S., The Skyline Operator, In proc, IEEE Conf. on Data Engineering, page 421-430, Heidelberg, Germany, Apr. 2001. Implementation: D&C Algorithm Pseudo-code1. D&C-basic 2. D&C-mpt 3. D&C-mptesk 𝑴: input of the Skyline operation; a set of 𝑑-dimensional points 𝑹: output of the Skyline operation; a set of 𝑑-dimensional points 𝒑 ≺ 𝒒: point 𝒑 is dominated by point 𝒒 function SkylineBasic M, Dimension begin if 𝑀 = 1 then return 𝑀 𝑃𝑖𝑣𝑜𝑡 ≔ Median 𝑚 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 𝑚 ∈ 𝑀 𝑃1, 𝑃2 ≔ Partition 𝑀, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛, 𝑃𝑖𝑣𝑜𝑡 𝑆1 ≔ SkylineBasic(𝑃1, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛) 𝑆2 ≔ SkylineBasic 𝑃2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 return 𝑆1 ⋃ MergeBasic(𝑆1, 𝑆2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛) end
  62. 62. Borzsony, S., The Skyline Operator, In proc, IEEE Conf. on Data Engineering, page 421-430, Heidelberg, Germany, Apr. 2001. Implementation: D&C Algorithm Pseudo-code1. D&C-basic 2. D&C-mpt 3. D&C-mptesk 𝑴: input of the Skyline operation; a set of 𝑑-dimensional points 𝑹: output of the Skyline operation; a set of 𝑑-dimensional points 𝒑 ≺ 𝒒: point 𝒑 is dominated by point 𝒒 function SkylineBasic M, Dimension begin if 𝑀 = 1 then return 𝑀 𝑃𝑖𝑣𝑜𝑡 ≔ Median 𝑚 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 𝑚 ∈ 𝑀 𝑃1, 𝑃2 ≔ Partition 𝑀, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛, 𝑃𝑖𝑣𝑜𝑡 𝑆1 ≔ SkylineBasic(𝑃1, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛) 𝑆2 ≔ SkylineBasic 𝑃2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 return 𝑆1 ⋃ MergeBasic(𝑆1, 𝑆2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛) end
  63. 63. Borzsony, S., The Skyline Operator, In proc, IEEE Conf. on Data Engineering, page 421-430, Heidelberg, Germany, Apr. 2001. Implementation: D&C Algorithm Pseudo-code1. D&C-basic 2. D&C-mpt 3. D&C-mptesk 𝑴: input of the Skyline operation; a set of 𝑑-dimensional points 𝑹: output of the Skyline operation; a set of 𝑑-dimensional points 𝒑 ≺ 𝒒: point 𝒑 is dominated by point 𝒒 function SkylineBasic M, Dimension begin if 𝑀 = 1 then return 𝑀 𝑃𝑖𝑣𝑜𝑡 ≔ Median 𝑚 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 𝑚 ∈ 𝑀 𝑃1, 𝑃2 ≔ Partition 𝑀, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛, 𝑃𝑖𝑣𝑜𝑡 𝑆1 ≔ SkylineBasic(𝑃1, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛) 𝑆2 ≔ SkylineBasic 𝑃2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 return 𝑆1 ⋃ MergeBasic(𝑆1, 𝑆2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛) end
  64. 64. Borzsony, S., The Skyline Operator, In proc, IEEE Conf. on Data Engineering, page 421-430, Heidelberg, Germany, Apr. 2001. Implementation: D&C Algorithm 1. D&C-basic 2. D&C-mpt 3. D&C-mptesk function MergeBasic(𝑆1, 𝑆2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛) begin if 𝑆1 = {𝑝} then 𝑅 ≔ 𝑞 ∈ 𝑆2 𝑝 ≻ 𝑞 else if 𝑆2 = 𝑞 then begin for each 𝑝 ∈ 𝑆1 do if 𝑝 ≺ 𝑞 then 𝑅 ≔ 𝜙 end else if 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 = 2 then begin 𝑀𝑖𝑛 ≔ Minimum{𝑝1|𝑝 ∈ 𝑆1} 𝑅 ≔ 𝑞 ∈ 𝑆2 𝑞1 < 𝑀𝑖𝑛 end else begin 𝑃𝑖𝑣𝑜𝑡 ≔ Median 𝑝 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛−1 𝑝 ∈ 𝑆1 𝑆1,1, 𝑆1,2 ≔ Partition 𝑆1, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 − 1, 𝑃𝑖𝑣𝑜𝑡 𝑆2,1, 𝑆2,2 ≔ Partition 𝑆2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 − 1, 𝑃𝑖𝑣𝑜𝑡 𝑅1 ≔ MergeBasic 𝑆1,1, 𝑆2,1, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 𝑅2 ≔ MergeBasic 𝑆1,2, 𝑆2,2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 𝑅3 ≔ MergeBasic 𝑆1,1, 𝑅2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 − 1 𝑅 ≔ 𝑅1 ∪ 𝑅3 end return R end See also Preparata et al. (1993), Computational Geometry, pp. 161-164, Springer.
  65. 65. Borzsony, S., The Skyline Operator, In proc, IEEE Conf. on Data Engineering, page 421-430, Heidelberg, Germany, Apr. 2001. Implementation: D&C Algorithm 1. D&C-basic 2. D&C-mpt 3. D&C-mptesk function MergeBasic(𝑆1, 𝑆2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛) begin if 𝑆1 = {𝑝} then 𝑅 ≔ 𝑞 ∈ 𝑆2 𝑝 ≻ 𝑞 else if 𝑆2 = 𝑞 then begin for each 𝑝 ∈ 𝑆1 do if 𝑝 ≺ 𝑞 then 𝑅 ≔ 𝜙 end else if 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 = 2 then begin 𝑀𝑖𝑛 ≔ Minimum{𝑝1|𝑝 ∈ 𝑆1} 𝑅 ≔ 𝑞 ∈ 𝑆2 𝑞1 < 𝑀𝑖𝑛 end else begin 𝑃𝑖𝑣𝑜𝑡 ≔ Median 𝑝 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛−1 𝑝 ∈ 𝑆1 𝑆1,1, 𝑆1,2 ≔ Partition 𝑆1, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 − 1, 𝑃𝑖𝑣𝑜𝑡 𝑆2,1, 𝑆2,2 ≔ Partition 𝑆2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 − 1, 𝑃𝑖𝑣𝑜𝑡 𝑅1 ≔ MergeBasic 𝑆1,1, 𝑆2,1, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 𝑅2 ≔ MergeBasic 𝑆1,2, 𝑆2,2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 𝑅3 ≔ MergeBasic 𝑆1,1, 𝑅2, 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛 − 1 𝑅 ≔ 𝑅1 ∪ 𝑅3 end return R end See also Preparata et al. (1993), Computational Geometry, pp. 161-164, Springer.
  81. Implementation: D&C Algorithm
      Notation:
        M: input of the Skyline operation; a set of d-dimensional points
        R: output of the Skyline operation; a set of d-dimensional points
        p ≺ q: point p is dominated by point q
      Pseudo-code:
        function SkylineBasic(M, Dimension)
        begin
          if |M| = 1 then return M
          Pivot := Median{m_{Dimension} | m ∈ M}
          (P1, P2) := Partition(M, Dimension, Pivot)
          S1 := SkylineBasic(P1, Dimension)
          S2 := SkylineBasic(P2, Dimension)
          return S1 ∪ MergeBasic(S1, S2, Dimension)
        end
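The recursion above can be sketched in Python as follows. This is a simplified sketch, not the deck's implementation: smaller values are assumed to be better, the split is made at the median position of the lexicographically sorted points (so it partitions on the first coordinate rather than on `Dimension`), and the dimension-reducing MergeBasic is replaced by a plain pairwise dominance filter (`merge_naive`, a hypothetical helper). Because a dominating point always sorts before the point it dominates, only the second half has to be filtered, as in `S1 ∪ MergeBasic(S1, S2, Dimension)`.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """p dominates q: no worse in every dimension, strictly better in one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def merge_naive(s1: Sequence[Point], s2: Sequence[Point]) -> List[Point]:
    """Keep the points of s2 that no point of s1 dominates (quadratic stand-in)."""
    return [q for q in s2 if not any(dominates(p, q) for p in s1)]


def _skyline_sorted(pts: Sequence[Point]) -> List[Point]:
    # Base case: zero or one point is its own skyline.
    if len(pts) <= 1:
        return list(pts)
    mid = len(pts) // 2                      # median position
    s1 = _skyline_sorted(pts[:mid])          # skyline of the "better" half
    s2 = _skyline_sorted(pts[mid:])          # skyline of the other half
    return s1 + merge_naive(s1, s2)          # S1 ∪ Merge(S1, S2)


def skyline_basic(points: Sequence[Point]) -> List[Point]:
    """Divide-and-conquer skyline; the result is in lexicographic order."""
    return _skyline_sorted(sorted(points))
```

For example, `skyline_basic([(80, 0.7), (120, 1.0)])` keeps only the first hotel, since it is both cheaper and closer.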
  82. Implementation: D&C Algorithm
      M-way Partitioning
      • Aims at better I/O behavior than the basic algorithm.
      • Divides the input into m partitions so that each partition fits into memory.
      • Is used in the first step (partitioning) and in the third step (merging) of the basic algorithm.
      • Partitions by quantiles rather than by the median.
  83. Implementation: D&C Algorithm
      M-way Partitioning
      • In the first step, m-way partitioning produces m partitions P_1, …, P_m such that each P_i fits into memory; the basic algorithm is then applied to each P_i.
      • In the third step (merging) of the basic algorithm, m-way partitioning is applied again.
      • There, every sub-partition should occupy at most half of the memory, so that two sub-partitions can be merged in memory.
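A minimal sketch of partitioning by quantiles instead of by the median, assuming the memory budget is given as a number of points (`memory_capacity`, a hypothetical parameter). Unlike the SkylineMway pseudo-code, this sketch does not round the partition count up to a power of two, and it sorts the input in memory for brevity, which a real external implementation would avoid.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def partition_by_quantiles(
    points: Sequence[Point], dim: int, memory_capacity: int
) -> List[List[Point]]:
    """Split points along dimension `dim` into equal-sized quantile ranges.

    Every returned partition holds at most `memory_capacity` points, and
    partition k contains no value in `dim` larger than partition k + 1.
    """
    if memory_capacity <= 0:
        raise ValueError("memory_capacity must be positive")
    if not points:
        return []
    m = math.ceil(len(points) / memory_capacity)   # number of partitions
    ordered = sorted(points, key=lambda p: p[dim])  # order along the dimension
    size = math.ceil(len(ordered) / m)              # points per partition
    return [ordered[k:k + size] for k in range(0, len(ordered), size)]
```

With the median, one split always yields two halves; with quantiles, a single pass produces as many partitions as are needed for each one to fit.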
  84. Implementation: D&C Algorithm Pseudo-code
      Notation:
        M: input of the Skyline operation; a set of d-dimensional points
        R: output of the Skyline operation; a set of d-dimensional points
        𝕊: main memory; a set of d-dimensional points
        p ≺ q: point p is dominated by point q
      Merge step:
        function MergeMway(S1, S2, Dimension)
        begin
          if S1 = {p} then
            R := {q ∈ S2 | q ⊀ p}
          else if S2 = {q} then begin
            R := {q}
            foreach p ∈ S1 do
              if q ≺ p then R := ∅
          end
          else if |S1| + |S2| < |𝕊| then
            R := MergeBasic(S1, S2, Dimension)
          else if Dimension = 2 then begin
            Min := Minimum{p_1 | p ∈ S1}
            R := {q ∈ S2 | q_1 < Min}
          end
          else begin
            Partitions := Maximum{|S1| / (|𝕊|/2), |S2| / (|𝕊|/2)}
            Quantiles := α-Quantiles(S1, Dimension − 1, Partitions)
            S_{1,1}, …, S_{1,Partitions} := Partition(S1, Dimension − 1, Quantiles)
            S_{2,1}, …, S_{2,Partitions} := Partition(S2, Dimension − 1, Quantiles)
            R := ∅
            for j := 1 to Partitions do begin
              for i := 1 to j do
                if i < j then
                  S_{2,j} := MergeMway(S_{1,i}, S_{2,j}, Dimension − 1)
                else
                  S_{2,j} := MergeMway(S_{1,i}, S_{2,j}, Dimension)
              append(R, S_{2,j})
            end
          end
          return R
        end
      Skyline computation:
        function SkylineMway(M, Dimension)
        begin
          Partitions := Minimum{2^n | n ∈ ℕ ∧ |𝕊| · 2^n > |M|}
          Quantiles := α-Quantiles(M, Dimension, Partitions)
          P_1, …, P_{Partitions} := Partition(M, Dimension, Quantiles)
          for i := 1 to Partitions do
            if |P_i| < |𝕊| then
              S_i := SkylineBasic(P_i, Dimension)
            else
              S_i := SkylineMway(P_i, Dimension)
          while Partitions > 1 do begin
            for i := 1 to Partitions/2 do
              S_i := S_{2i−1} ∪ MergeMway(S_{2i−1}, S_{2i}, Dimension)
            Partitions := Partitions/2
          end
          return S_1
        end
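The first line of SkylineMway picks the smallest power of two 2^n for which |𝕊| · 2^n > |M|. A small Python sketch of that choice, assuming n may be 0 and that both sizes are counted in points (`partition_count` and its parameter names are hypothetical):

```python
def partition_count(num_points: int, memory_capacity: int) -> int:
    """Smallest power of two `c` with memory_capacity * c > num_points."""
    if memory_capacity <= 0:
        raise ValueError("memory_capacity must be positive")
    count = 1  # 2**0
    # Double the number of partitions until they can hold the whole input.
    while memory_capacity * count <= num_points:
        count *= 2
    return count
```

Using a power of two keeps the pairwise merge loop at the end of SkylineMway simple, because the number of partial Skylines halves exactly in every round.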
  85. Implementation: D&C Algorithm
      Figure: S1 and S2 are partitioned at the quantiles α1, α2, α3 into S_{1,1}, …, S_{1,4} and S_{2,1}, …, S_{2,4}, so that the partitions fit into memory.
      First step of the merge loop:
        j = 1, i = 1: MergeMway(S_{1,1}, S_{2,1})
  94. Implementation: D&C Algorithm
      Figure: S1 and S2 partitioned at the quantiles α1, α2, α3 into S_{1,1}, …, S_{1,4} and S_{2,1}, …, S_{2,4}.
      Merge steps for four partitions:
        j = 1, i = 1: MergeMway(S_{1,1}, S_{2,1})
        j = 2, i = 1: MergeMway(S_{1,1}, S_{2,2})
               i = 2: MergeMway(S_{1,2}, S_{2,2})
        j = 3, i = 1: MergeMway(S_{1,1}, S_{2,3})
               i = 2: MergeMway(S_{1,2}, S_{2,3})
               i = 3: MergeMway(S_{1,3}, S_{2,3})
        j = 4, i = 1: MergeMway(S_{1,1}, S_{2,4})
               i = 2: MergeMway(S_{1,2}, S_{2,4})
               i = 3: MergeMway(S_{1,3}, S_{2,4})
               i = 4: MergeMway(S_{1,4}, S_{2,4})
      Loop of MergeMway that produces these steps:
        for j := 1 to Partitions do begin
          for i := 1 to j do
            if i < j then
              S_{2,j} := MergeMway(S_{1,i}, S_{2,j}, Dimension − 1)
            else
              S_{2,j} := MergeMway(S_{1,i}, S_{2,j}, Dimension)
          append(R, S_{2,j})
        end
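The order of the merge calls in this loop can be reproduced with a short Python sketch. It only enumerates the index pairs and does not merge any data. The function name `merge_schedule` and the tuple layout `(i, j, dimension_offset)` are assumptions made for illustration; the offset is −1 where the pseudo-code passes `Dimension − 1` and 0 where it passes `Dimension`.

```python
from typing import List, Tuple


def merge_schedule(partitions: int) -> List[Tuple[int, int, int]]:
    """List the MergeMway calls of the nested loop as (i, j, dimension_offset)."""
    steps = []
    for j in range(1, partitions + 1):
        for i in range(1, j + 1):
            # For i < j the pseudo-code merges with Dimension - 1,
            # for i == j it merges with the full Dimension.
            offset = -1 if i < j else 0
            steps.append((i, j, offset))
    return steps
```

For m partitions the loop issues m(m + 1)/2 calls, which gives the ten steps listed for four partitions.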
  95. Implementation: D&C Algorithm
      Bushy Merge Tree
      • Proposed to reduce the number of merge steps that each tuple takes part in.
      • In the figure, the tuples of S1 are involved in only log m merge steps (where m is the number of partitions).
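A bushy merge can be sketched as rounds of pairwise merges, so that m partial Skylines need about log2 m rounds. The sketch below is an illustration under assumptions: smaller values are better, the input lists are already Skylines of their partitions, and the partitions are ordered so that no point of a later partition dominates a point of an earlier one (for example, consecutive ranges of lexicographically sorted points). The simple pairwise filter `merge` stands in for MergeMway; all names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """p dominates q: no worse in every dimension, strictly better in one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def merge(s1: Sequence[Point], s2: Sequence[Point]) -> List[Point]:
    """Keep the points of s2 that no point of s1 dominates."""
    return [q for q in s2 if not any(dominates(p, q) for p in s1)]


def bushy_merge(skylines: Sequence[Sequence[Point]]) -> Tuple[List[Point], int]:
    """Merge neighbouring Skylines pairwise; return (skyline, number of rounds)."""
    current = [list(s) for s in skylines]
    rounds = 0
    while len(current) > 1:
        merged = []
        for k in range(0, len(current) - 1, 2):
            left, right = current[k], current[k + 1]
            merged.append(left + merge(left, right))   # S_left ∪ Merge(left, right)
        if len(current) % 2 == 1:
            merged.append(current[-1])                  # odd one out waits a round
        current = merged
        rounds += 1
    return (current[0] if current else []), rounds
```

Each tuple takes part in at most one merge per round, which is where the logarithmic number of merge steps comes from.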
  96. Implementation: D&C Algorithm
      Early Skyline
      In the first step of m-way partitioning:
      1. Load as many tuples as fit into the available main-memory buffers.
      2. Apply the basic divide-and-conquer algorithm to them in order to immediately eliminate the tuples that are dominated by others.
      3. Partition the remaining tuples into m partitions.
  97. Implementation: D&C Algorithm
      Early Skyline
      • Incurs additional CPU cost.
      • Saves I/O, because fewer tuples need to be written and reread in the partitioning steps.
      • Is attractive if the Skyline is selective (i.e., if the Skyline is small).
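The "Early Skyline" idea can be sketched as a pre-filter over the input stream. This is a minimal sketch under assumptions: smaller values are better, the buffer size is given in tuples (`buffer_size`, hypothetical), and a naive quadratic filter (`local_skyline`) is swapped in for the basic divide-and-conquer algorithm that the slides apply to each buffer.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """p dominates q: no worse in every dimension, strictly better in one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def local_skyline(buffer: List[Point]) -> List[Point]:
    """Naive skyline of one buffer (stand-in for the basic D&C algorithm)."""
    return [q for q in buffer if not any(dominates(p, q) for p in buffer)]


def early_skyline(stream: Iterable[Point], buffer_size: int) -> List[Point]:
    """Drop tuples dominated inside their buffer before they are partitioned."""
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")
    survivors: List[Point] = []
    buffer: List[Point] = []
    for tup in stream:
        buffer.append(tup)
        if len(buffer) == buffer_size:        # buffer full: filter and flush
            survivors.extend(local_skyline(buffer))
            buffer = []
    survivors.extend(local_skyline(buffer))   # last, partially filled buffer
    return survivors
```

The survivors are only a superset of the final Skyline: a tuple can still be dominated by a tuple from another buffer, so the partitioning and merge steps remain necessary.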
  98. Experiments and Results
      Experimental Environment
      • Processor: 333 MHz
      • Main memory: 128 MB
      • Operating system: Solaris 7
      • Disk drive: 9 GB Seagate with 7200 rpm and 512 KB disk cache
  99. Experiments and Results
      Algorithms Implemented in C++
      • Sort: only for two-dimensional Skylines
      • BNL-basic: the basic block-nested-loops algorithm
      • BNL-sol: BNL with the window organized as a self-organizing list
      • BNL-solrep: BNL-sol with replacement of tuples in the window
      • D&C-basic: the basic divide-and-conquer algorithm
      • D&C-mpt: D&C-basic with m-way partitioning
      • D&C-mptesk: D&C-mpt with “Early Skyline”
  100. Experiments and Results
       Generated Databases
       • Each benchmark database contains 100,000 tuples (10 MB).
       • The double-valued attributes of each tuple are generated randomly in the range [0, 1).
  101. Experiments and Results
       Three Generated Databases
       • indep: all attribute values are generated independently.
       • corr: the attribute values are correlated across dimensions (a point that is good in one dimension tends to be good in the others).
       • anti: the attribute values are anti-correlated across dimensions (a point that is good in one dimension tends to be bad in the others).
       Figure panels: indep, corr, anti
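To make the three distributions concrete, the following Python sketch generates small test sets with values in [0, 1). It is not the generator used in the paper: the construction of `corr` (a common base value plus small noise), the construction of `anti` (points placed on the plane where the coordinates sum to d/2), and the `spread` parameter are all assumptions chosen for illustration.

```python
import random
from typing import List, Tuple

Point = Tuple[float, ...]


def generate(kind: str, n: int, dims: int, seed: int = 0, spread: float = 0.1) -> List[Point]:
    """Generate n points of the given kind ('indep', 'corr' or 'anti') in [0, 1)^dims."""
    rng = random.Random(seed)
    points: List[Point] = []
    while len(points) < n:
        if kind == "indep":
            # Every attribute is drawn on its own.
            p = [rng.random() for _ in range(dims)]
        elif kind == "corr":
            # All attributes stay close to one shared base value.
            base = rng.random()
            p = [base + rng.uniform(-spread, spread) for _ in range(dims)]
        elif kind == "anti":
            # Shift the point so that its coordinates sum to dims / 2:
            # being good in one attribute forces being worse in another.
            u = [rng.random() for _ in range(dims)]
            mean = sum(u) / dims
            p = [x - mean + 0.5 for x in u]
        else:
            raise ValueError(f"unknown kind: {kind!r}")
        # Rejection step: keep only points that lie inside [0, 1) everywhere.
        if all(0.0 <= x < 1.0 for x in p):
            points.append(tuple(p))
    return points
```

With such data, `corr` tends to give a very small Skyline and `anti` a very large one, which is the contrast the experiments are designed to expose.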
  102. Experiments and Results
       Figure panels: indep, corr, anti
  103. Experiments and Results
       Skyline Sizes
  104. Experiments and Results
       2-d Skylines: running times (in seconds) and amount of disk I/O (in MB)
       • BNL is the winner: because the memory is large enough, BNL terminates after one iteration.
       • “Early Skyline” is the winner among the D&C variants: after it is applied, the partitions are very small and the rest of the algorithm completes very quickly.
  105. Experiments and Results
       Multi-dimensional Skylines
       • (Figure annotation) winner on the corr database
       • BNL is good up to 5 dimensions.
       • D&C-mpt and D&C-mptesk outperform BNL beyond 5 dimensions.
  106. Experiments and Results
       Multi-dimensional Skylines
       • On the anti database, the break-even point (BEP) is reached earlier than on the corr database.
       • Beyond it, D&C outperforms BNL.
  107. Experiments and Results
       Conclusion of the Experiments: BNL
       • The BNL variants are good if the size of the Skyline is small.
       • BNL's performance depends on the number of dimensions and on the correlation of the data.
       • BNL-sol is the winner among the BNL variants, but the difference is not great.
       • Replacement is bad if the Skyline is very large: it incurs additional overhead without benefits.
  108. Experiments and Results
       Conclusion of the Experiments: D&C
       • The performance of the D&C variants depends less on the number of dimensions and on correlation than that of BNL.
       • D&C-mptesk is the winner among the D&C variants.
  109. Thank you for listening
