# CS 542 -- Query Optimization



#### Slide 1: CS 542 Database Management Systems -- Query Optimization

J Singh, March 28, 2011
#### Slide 2: Outline

- Convert SQL query to a parse tree
  - Semantic checking: attributes, relation names, types
- Convert to a logical query plan (relational algebra expression)
  - Deal with subqueries
- Improve the logical query plan
  - Use algebraic transformations
  - Group together certain operators
  - Evaluate the logical plan based on estimated sizes of relations
- Convert to a physical query plan
  - Search the space of physical plans
  - Choose the order of operations
  - Complete the physical query plan
#### Slide 3: Desired Endpoint

Example physical query plans (shown on the slide as plan trees):

- For σ_{x=1 AND y=2 AND z<5}(R): Filter(x=1 AND z<5) over IndexScan(R, y=2)
- For R ⋈ S ⋈ U: a two-pass hash-join (101 buffers) of TableScan(U) with the materialized result of a two-pass hash-join (101 buffers) of TableScan(R) and TableScan(S)
#### Slide 4: Physical Plan Selection

Cost is governed by disk I/O, which in turn is governed by:

- The particular operation being performed
- The size of intermediate results, as derived last week (sec. 16.4 of the book)
- The physical operator implementation used, e.g., one-pass or two-pass
- Operation ordering, especially join ordering
- Operation output: materialized or pipelined
#### Slide 5: Index-Based Physical Plans (p1)

Selection example: what is the cost of σ_{a=v}(R), assuming

- B(R) = 2,000
- T(R) = 100,000
- V(R, a) = 20

Table scan (assuming R is clustered): B(R) = 2,000 I/Os.

Index-based selection:

- If the index is clustering: B(R) / V(R, a) = 100 I/Os
- If the index is unclustered: T(R) / V(R, a) = 5,000 I/Os

For small V(R, a), a table scan can be faster than an unclustered index. Heuristics that always pick an index over a scan can lead you astray; determine the cost of both methods and let the algorithm decide.
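The comparison above can be sketched as a small cost calculator; a minimal sketch using the lecture's three formulas, with the slide's statistics plugged in (function and key names are illustrative):

```python
# Compare access-path costs for a selection sigma_{a=v}(R).
# table scan = B(R); clustering index = B(R)/V(R,a);
# unclustered index = T(R)/V(R,a).

def selection_costs(B, T, V):
    """Return the estimated I/O cost of each access path."""
    return {
        "table_scan": B,              # read every block of R
        "clustering_index": B // V,   # only the blocks holding a = v
        "unclustered_index": T // V,  # one I/O per matching tuple
    }

costs = selection_costs(B=2000, T=100_000, V=20)
best = min(costs, key=costs.get)
print(costs)  # {'table_scan': 2000, 'clustering_index': 100, 'unclustered_index': 5000}
print(best)   # clustering_index
```

With V(R, a) this small, the unclustered index loses to the plain table scan, which is exactly the trap the slide warns about.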
#### Slide 6: Index-Based Physical Plans (p2)

Example: join R ⋈ S when S has an index on the join attribute. For each tuple in R, fetch the corresponding tuple(s) from S.

Assume R is clustered. Cost:

- If the index on S is clustering: B(R) + T(R) · B(S) / V(S, a)
- If the index on S is unclustered: B(R) + T(R) · T(S) / V(S, a)

Another case: R is the output of another iterator, so B(R) is already accounted for in that iterator. Cost:

- If the index on S is clustering: T(R) · B(S) / V(S, a)
- If the index on S is unclustered: T(R) · T(S) / V(S, a)
- If S is not indexed but fits in memory: B(S)

There are a number of other cases.
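The first pair of formulas can be checked numerically; a sketch with illustrative statistics (the relation sizes below are assumptions, not from the slide):

```python
# Index-join cost for a clustered R probing S via an index on the
# join attribute a, following the slide's two formulas.

def index_join_cost(B_R, T_R, B_S, T_S, V_S_a, s_index_clustering):
    """I/O cost: read all of R, then one index probe into S per R-tuple."""
    if s_index_clustering:
        return B_R + T_R * B_S // V_S_a   # matching S-tuples share blocks
    return B_R + T_R * T_S // V_S_a       # one I/O per matching S-tuple

print(index_join_cost(1000, 10_000, 500, 5_000, 100, True))   # 51000
print(index_join_cost(1000, 10_000, 500, 5_000, 100, False))  # 501000
```

The order-of-magnitude gap between the clustering and unclustered cases is why the optimizer must know which kind of index it has.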
#### Slide 7: Index-Based Physical Plans (p3)

If both R and S have a sorted index (B+ tree) on the join attribute, we can perform a merge join directly on the indexes, called a zig-zag join. Cost: B(R) + B(S).
#### Slide 8: Grand Summary of Physical Plans (p1)

[Table on slide: costs of scans and selects. Index legend: N = none, C = clustering, NC = non-clustered.]
#### Slide 9: Grand Summary of Physical Plans (p2)

[Table on slide: costs of joins. Index legend: N = none, C = clustering, NC = non-clustered. Relation fits in memory: F = yes, NF = no.]
#### Slide 10: Physical Plans at Non-Leaf Operators (p1)

What if the input of an operator comes from another operator?

For Select, cost = 0:

- The cost of pipelining is assumed to be zero
- The number of tuples emitted is reduced

For Join, when R comes from an operator and S from a table, B(R) is already accounted for in the iterator:

- If the index on S is clustering: T(R) · B(S) / V(S, a)
- If the index on S is unclustered: T(R) · T(S) / V(S, a)
- If S is not indexed but fits in memory: B(S)
- If S is not indexed and doesn't fit: k · B(S) for k chunks
- If S is not indexed and doesn't fit: 3 · B(S) for a sort- or hash-join
#### Slide 11: Physical Plans at Non-Leaf Operators (p2)

For Join, when R and S both come from operators, the cost depends on whether the results are sorted by the join attribute(s):

- If yes, we use the zig-zag algorithm and the added cost is zero. Why?
- If either relation fits in memory, the added cost is zero. Why?
- At most, the cost is 2 · (B(R) + B(S)). Why?
#### Slide 12: Example (787)

Product(pname, maker), Company(cname, city)

    SELECT Product.pname
    FROM Product, Company
    WHERE Product.maker = Company.cname
      AND Company.city = 'Seattle'

How do we execute this query?
#### Slide 13: Example (787), Logical Plan

Same schema and query as the previous slide. Logical plan: project pname from σ_{city='Seattle'}(Company) ⋈_{maker=cname} Product(pname, maker).

Available indices:

- Clustering: Product.pname, Company.cname
- Unclustered: Product.maker, Company.city
#### Slide 14: Example (787) Physical Plans

- Physical Plan 1: index-based selection σ_{city='Seattle'} on Company (index-scan on Company.city), then an index-based join with Product on maker = cname.
- Physical Plans 2a and 2b: merge-join of Product and Company on cname = maker, then filter σ_{city='Seattle'}. Company is read via index-scan; Product is scanned and sorted (2a) or read via index-scan (2b).
#### Slide 15: Evaluate (787) Physical Plans

Physical Plan 1:

- Tuples selected: T(σ_{city='Seattle'}(Company)) = T(Company) / V(Company, city)
- Cost: T(σ_{city='Seattle'}(Company)) · T(Product) / V(Product, maker), or, simplifying, T(Company) / V(Company, city) · T(Product) / V(Product, maker)

Physical Plans 2a and 2b (merge-join, then filter), total cost:

- 2a: 3 · B(Product) + B(Company)
- 2b: T(Product) + B(Company)
#### Slide 16: Final Evaluation

Plan costs:

- Plan 1: T(Company) / V(Company, city) · T(Product) / V(Product, maker)
- Plan 2a: B(Company) + 3 · B(Product)
- Plan 2b: B(Company) + T(Product)

Which is better? It depends on the data.
#### Slide 17: Example (787) Evaluation Results

Common assumptions:

- T(Company) = 5,000; B(Company) = 500; M = 100
- T(Product) = 100,000; B(Product) = 1,000
- Assume V(Product, maker) ≈ T(Company)

Case 1: V(Company, city) ≈ T(Company), i.e., V(Company, city) = 5,000:

- Plan 1: 1 · 20 = 20
- Plan 2a: 3,500
- Plan 2b: 100,500

Case 2: V(Company, city) << T(Company), e.g., V(Company, city) = 20:

- Plan 1: 250 · 20 = 5,000
- Plan 2a: 3,500
- Plan 2b: 100,500

Reference from the previous slide:

- Plan 1: T(Company) / V(Company, city) · T(Product) / V(Product, maker)
- Plan 2a: B(Company) + 3 · B(Product)
- Plan 2b: B(Company) + T(Product)

#### Lessons

- Need to consider several physical plans, even for one simple logical plan
- There is no magic "best" plan: it depends on the data
- To make the right choice, we need statistics over the data: the B's, the T's, and the V's
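The two cases above can be reproduced directly from the cost formulas; a sketch using the slide's statistics:

```python
# Evaluate the three plan cost formulas for the (787) example.
# Statistics match the slide; V(Product, maker) ~= T(Company).

T_company, B_company = 5_000, 500
T_product, B_product = 100_000, 1_000
V_product_maker = T_company

def plan_costs(V_company_city):
    """Return (plan1, plan2a, plan2b) I/O cost estimates."""
    plan1 = (T_company // V_company_city) * (T_product // V_product_maker)
    plan2a = B_company + 3 * B_product
    plan2b = B_company + T_product
    return plan1, plan2a, plan2b

print(plan_costs(5_000))  # Case 1: (20, 3500, 100500) -> Plan 1 wins
print(plan_costs(20))     # Case 2: (5000, 3500, 100500) -> Plan 2a wins
```

Only the first formula depends on V(Company, city), which is why the winner flips between the two cases.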
#### Slide 20: Query Optimization

- Have a SQL query Q
- Create a plan P
- Find equivalent plans P = P' = P'' = ...
- Choose the "cheapest"
- HOW?
#### Slide 21: Logical Query Plan

    SELECT P.buyer
    FROM Purchase P, Person Q
    WHERE P.buyer = Q.name
      AND Q.city = 'seattle'
      AND Q.phone > '5430000'

Plan: π_{buyer}(Purchase ⋈_{buyer=name} σ_{city='seattle' AND phone>'5430000'}(Person))

In class: find a "better" plan P'.
#### Slide 22: CS 542 Database Management Systems -- Query Optimization: Choosing the Order of Operations

J Singh, March 28, 2011
#### Slide 23: Outline

(Repeats the outline from Slide 2.)
#### Slide 24: Join Trees

Recall that the following are equivalent:

- R ⋈ S ⋈ U
- R ⋈ (S ⋈ U)
- (R ⋈ S) ⋈ U
- S ⋈ (R ⋈ U)

But they are not equivalent from an execution viewpoint. Considerable research has gone into picking the best order for joins.
#### Slide 29: Join Trees

R1 ⋈ R2 ⋈ ... ⋈ Rn can be drawn as a join tree (the slide shows one over R1, R2, R3, R4).

Definitions:

- A plan = a join tree
- A partial plan = a subtree of a join tree
#### Slide 30: Left and Right Join Arguments

The argument relations in a join determine the cost of the join. In physical query plans, the left argument of the join is:

- Called the build relation
- Assumed to be smaller
- Stored in main memory
#### Slide 31: Left and Right Join Arguments, continued

The right argument of the join is:

- Called the probe relation
- Read a block at a time
- Its tuples are matched with those of the build relation

The join algorithms that distinguish between the arguments are:

- One-pass join
- Nested-loop join
- Index join
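The build/probe roles above can be illustrated with a one-pass join; a minimal sketch in which relations are modeled as lists of dicts (the relation contents are illustrative assumptions):

```python
# One-pass hash join: the smaller (left) relation is built into an
# in-memory hash table; the larger (right) relation is streamed past it.

from collections import defaultdict

def one_pass_join(build, probe, attr):
    """Join two relations on `attr`, hashing the smaller one."""
    table = defaultdict(list)
    for t in build:                       # build phase: fits in memory
        table[t[attr]].append(t)
    out = []
    for s in probe:                       # probe phase: one scan
        for t in table.get(s[attr], []):
            out.append({**t, **s})
    return out

R = [{"b": 1, "a": "x"}, {"b": 2, "a": "y"}]
S = [{"b": 1, "c": 10}, {"b": 3, "c": 30}]
print(one_pass_join(R, S, "b"))  # [{'b': 1, 'a': 'x', 'c': 10}]
```

Swapping build and probe changes memory usage but not the result, which is why the optimizer puts the smaller relation on the left.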
#### Slide 32: Types of Join Trees

The slide shows three shapes over R1 ... R5:

- Left-deep: the right argument of every join is a base relation
- Right-deep: the left argument of every join is a base relation
- Bushy: arbitrary shapes

Many different orders are possible; it is very important to pick the right one.
#### Slide 33: Optimization Algorithms

- Heuristic based
- Cost based
  - Dynamic programming: System R
  - Rule-based optimizations: DB2, SQL Server
#### Slide 34: Dynamic Programming

Given a query R1 ⋈ R2 ⋈ ... ⋈ Rn, and a function cost() that gives us the cost of a join tree, find the best join tree for the query.
#### Slide 35: Dynamic Programming, continued

Problem statement: given a query R1 ⋈ R2 ⋈ ... ⋈ Rn and a function cost() over join trees, find the best join tree for the query.

Idea: for each subset of {R1, ..., Rn}, compute the best plan for that subset.

Algorithm: in increasing order of set cardinality, compute the cost for

- Step 1: {R1}, {R2}, ..., {Rn}
- Step 2: {R1, R2}, {R1, R3}, ..., {Rn-1, Rn}
- ...
- Step n: {R1, ..., Rn}

It is a bottom-up strategy. (Further details skipped; read the book if interested. Will not be on the exam.)
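The bottom-up strategy above can be sketched concretely. This is a toy model, not System R itself: the cost of a join is assumed to be the product of its inputs' cardinalities, and a join's output cardinality is taken as that same product (a cross-product upper bound); relation names and sizes are illustrative.

```python
# Bottom-up dynamic programming over subsets of relations.

from itertools import combinations

def best_join_tree(sizes):
    """sizes: {relation: cardinality}. Returns (cost, tree) for all relations."""
    rels = sorted(sizes)
    best = {frozenset([r]): (0, r) for r in rels}   # step 1: singletons are free
    card = {frozenset([r]): sizes[r] for r in rels}
    for k in range(2, len(rels) + 1):               # step k: subsets of size k
        for combo in combinations(rels, k):
            s = frozenset(combo)
            c = 1
            for r in combo:
                c *= sizes[r]
            card[s] = c                             # toy output-size estimate
            options = []
            for j in range(1, k):                   # try every split into L join R
                for left in combinations(combo, j):
                    L = frozenset(left)
                    R = s - L
                    cost = best[L][0] + best[R][0] + card[L] * card[R]
                    options.append((cost, (best[L][1], best[R][1])))
            options.sort(key=lambda o: o[0])        # compare by cost only
            best[s] = options[0]
    return best[frozenset(rels)]

cost, tree = best_join_tree({"R": 10, "S": 1000, "U": 100})
print(cost, tree)  # 1001000 ('S', ('R', 'U'))
```

Even in this toy model, the algorithm discovers that joining the two small relations first is cheapest, which is the whole point of searching over subsets.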
#### Slide 36: Dynamic Programming Algorithm

When computing R1 ⋈ R2 ⋈ ... ⋈ Rn, the best plan is the minimum-cost plan among:

- BestPlan(R2 ⋈ R3 ⋈ ... ⋈ Rn) ⋈ R1
- BestPlan(R1 ⋈ R3 ⋈ ... ⋈ Rn) ⋈ R2
- ...
- BestPlan(R1 ⋈ R2 ⋈ ... ⋈ Rn-1) ⋈ Rn

#### Slide 39: Reducing the Search Space

Left-deep trees vs. bushy trees:

- Combinatoric explosion of the number of possible trees; computing the cost of all possible trees is not feasible
- For a 6-way join, there are more than 30,000 bushy trees but only 6!, or 720, left-deep trees
- Left-deep trees leave their result in memory, making it possible to pipeline efficiently

Trees without Cartesian products:

- Example: R(A, B) ⋈ S(B, C) ⋈ T(C, D)
- The plan (R(A, B) ⋈ T(C, D)) ⋈ S(B, C) has a Cartesian product; most query optimizers will not consider it
#### Slide 40: Outline

(Repeats the outline from Slide 2.) Three topics remain:

- Choosing the physical implementations (e.g., select and join methods)
- Decisions regarding materialized vs. pipelined results
- Notation for physical query plans
#### Slide 41: Choosing a Selection Method

Algorithm, for each selection operator:

1. Can we use an existing index on an attribute? If yes, index-scan; otherwise, table-scan.
2. After retrieving all tuples satisfying the indexed condition in (1), filter them with the remaining selection conditions.

In other words, when computing σ_{c1 AND c2 AND ... AND cn}(R), we index-scan on some ci, then filter the result on every other cj, j ≠ i. The next two slides show an example where we examine several options and pick the best one.
#### Slide 42: Selection Method Example (p1)

Selection: σ_{x=1 AND y=2 AND z<5}(R), where the parameters of R are:

- T(R) = 5,000; B(R) = 200
- V(R, x) = 100; V(R, y) = 500
- Relation R is clustered
- x and y have non-clustering indices; z has a clustering index
#### Slide 43: Selection Method Example (p2)

Selection options:

1. Table-scan, then filter on x, y, z. Cost is B(R) = 200, since R is clustered.
2. Use the index on x = 1, then filter on y, z. Cost is 50, since T(R) / V(R, x) = 5,000 / 100 = 50 tuples and the index on x is non-clustering.
3. Use the index on y = 2, then filter on x, z. Cost is 10, since T(R) / V(R, y) = 5,000 / 500 = 10 tuples and the index on y is non-clustering.
4. Index-scan on the clustering index with z < 5, then filter on x, y. Cost is about B(R) / 3 = 67.

Therefore: first retrieve all tuples with y = 2 (option 3), then filter for x and z.
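The four options above reduce to a one-line minimum; a sketch using the slide's statistics (the 1/3 selectivity for z < 5 is taken from the slide's "about B(R)/3" estimate):

```python
# Evaluate the four selection options and pick the cheapest.

T, B = 5_000, 200
V_x, V_y = 100, 500

options = {
    "table_scan": B,             # R is clustered
    "index_x": T // V_x,         # non-clustering index: one I/O per tuple
    "index_y": T // V_y,         # non-clustering index on y
    "index_z": round(B / 3),     # clustering index, ~1/3 of R's blocks
}
best = min(options, key=options.get)
print(options, best)  # index_y wins with cost 10
```

This is the "cost both and let the algorithm decide" discipline from Slide 5 applied to a conjunctive selection.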
#### Slide 44: Outline

(Repeats the outline and the three remaining topics from Slide 40.)
#### Slide 45: Pipelining Versus Materialization

Materialization:

- Store the (intermediate) result of each operation on disk

Pipelining:

- Interleave the execution of several operations; the tuples produced by one operation are passed directly to the operation that uses them
- Store the (intermediate) result of each operation in a buffer in main memory

Prefer pipelining where possible. Sometimes it is not possible, as the fully worked-out example on the next few slides shows.
#### Slide 46: R ⋈ S ⋈ U Example (p1)

Consider a physical query plan for the expression (R(w, x) ⋈ S(x, y)) ⋈ U(y, z).

Assumptions:

- R occupies 5,000 blocks; S and U each occupy 10,000 blocks
- The intermediate result R ⋈ S occupies k blocks, for some k
- Both joins will be implemented as hash-joins, either one-pass or two-pass depending on k
- There are 101 buffers available
#### Slide 47: R ⋈ S ⋈ U Example (p2)

When joining R ⋈ S, neither relation fits in the buffers, so we need a two-pass hash-join to partition R. How many hash buckets for R? 100 at most. The second-pass hash-join uses 51 buffers, leaving 50 buffers for joining the result of R ⋈ S with U. Why 51?
#### Slide 48: R ⋈ S ⋈ U Example (p3)

Case 1: k ≤ 49, i.e., the result of R ⋈ S occupies at most 49 blocks. Steps:

- Pipeline R ⋈ S into 49 buffers
- Organize them for lookup as a hash table
- Use the one remaining buffer to read each block of U in turn
- Execute the second join as a one-pass join

The total number of I/Os is 55,000:

- 45,000 for the two-pass hash-join of R and S
- 10,000 to read U for the one-pass hash-join of (R ⋈ S) ⋈ U
#### Slide 49: R ⋈ S ⋈ U Example (p4)

Case 2: 49 < k < 5,000. We can still pipeline, but we need another strategy in which the intermediate result joins with U in a 50-bucket, two-pass hash-join. Steps:

- Before starting on R ⋈ S, hash U into 50 buckets of 200 blocks each
- Perform the two-pass hash-join of R and S using 51 buffers, as in Case 1, placing the results in the 50 remaining buffers to form 50 buckets for the join of R ⋈ S with U
- Finally, join R ⋈ S with U bucket by bucket

The number of disk I/Os is:

- 20,000 to read U and write its tuples into buckets
- 45,000 for the two-pass hash-join of R and S
- k to write out the buckets of R ⋈ S
- k + 10,000 to read the buckets of R ⋈ S and U in the final join

The total cost is 75,000 + 2k.
#### Slide 50: R ⋈ S ⋈ U Example (p5)

Case 3: k ≥ 5,000. We cannot perform a two-pass join in the 50 available buffers if the result of R ⋈ S is pipelined, so we are forced to materialize R ⋈ S. The number of disk I/Os is:

- 45,000 for the two-pass hash-join of R and S
- k to store R ⋈ S on disk
- 30,000 + 3k for the two-pass join of R ⋈ S with U

The total cost is 75,000 + 4k.
#### Slide 51: R ⋈ S ⋈ U Example (p6)

In summary, the cost of the physical plan as a function of the size k of R ⋈ S:

- k ≤ 49: 55,000
- 49 < k < 5,000: 75,000 + 2k
- k ≥ 5,000: 75,000 + 4k

Pause and reflect: it's all about the expected size of the intermediate result R ⋈ S. What would have happened if we guessed 45 but had 55? Guessed 55 but only had 45? Guessed 4,500 but had 5,500? Guessed 5,500 but only had 4,500?
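The three cases combine into a single piecewise cost function, which makes the "pause and reflect" questions easy to explore numerically:

```python
# Total I/O cost of the (R join S) join U plan as a function of the
# intermediate result size k, combining the three cases above.

def plan_cost(k):
    if k <= 49:                 # one-pass second join, fully pipelined
        return 55_000
    if k < 5_000:               # 50-bucket two-pass second join, pipelined
        return 75_000 + 2 * k
    return 75_000 + 4 * k       # must materialize R join S

for k in (45, 55, 4_500, 5_500):
    print(k, plan_cost(k))
# 45 55000 / 55 75110 / 4500 84000 / 5500 97000
```

The jumps at k = 49 and k = 5,000 show why a bad size estimate for R ⋈ S (e.g., guessing 45 when the real k is 55) commits the executor to the wrong strategy.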
#### Slide 52: Outline

(Repeats the outline and the three remaining topics from Slide 40.)
#### Slide 53: Notation for Physical Query Plans

Several types of operators:

- Operators for leaves
- (Physical) operators for selection
- (Physical) sort operators
- Other relational-algebra operations

In practice, each DBMS uses its own internal notation for physical query plans.
#### Slide 54: PQP Notation

Leaves: replace a leaf in an LQP by one of

- TableScan(R): read all blocks
- SortScan(R, L): read in order according to L
- IndexScan(R, C): scan R using an index on attribute A, with a condition C on A
- IndexScan(R, A): scan R using an index on attribute A

Selects: replace a select in an LQP by one of the leaf operators plus

- Filter(D), for condition D

Sorts: replace a leaf-level sort as shown above; for other operations,

- Sort(L): sort a relation that is not stored

Other operators are operation- and algorithm-specific (e.g., Hash-Join); we also need to specify the number of passes, buffer sizes, etc.
#### Slide 55: We Have Arrived at the Desired Endpoint

The same example physical query plans as on Slide 3:

- For σ_{x=1 AND y=2 AND z<5}(R): Filter(x=1 AND z<5) over IndexScan(R, y=2)
- For R ⋈ S ⋈ U: a two-pass hash-join (101 buffers) of TableScan(U) with the materialized result of a two-pass hash-join (101 buffers) of TableScan(R) and TableScan(S)
#### Slide 56: Outline

(Repeats the outline from Slide 2.)
#### Slide 57: Optimization Issues and Proposals

- The "fuzz" in the estimation of sizes
  - Parametric query optimization: specify alternatives to the execution engine so it may respond to conditions at runtime
  - Multiple-query optimization: take the concurrent execution of several queries into account
- Combinatoric explosion of options when doing an n-way join
  - Becomes really expensive around n > 15
  - Alternative optimizations have been proposed for special situations, but there is no general framework
    - Rule-based optimizers
    - Randomized plan generation
#### Slide 58: CS 542 Database Management Systems -- Distributed Query Execution

Source: Carsten Binnig, Univ. of Zurich, 2006. J Singh, March 28, 2011.
#### Slide 59: Motivation

Algorithms based on semi-joins have been proposed as techniques for query optimization. They shine in distributed and parallel databases, which makes this a good opportunity to explore them in that context.

Semi-join, by example and formal definition: R ⋉ S = π_{attrs(R)}(R ⋈ S), i.e., the tuples of R that join with at least one tuple of S.
#### Slide 60: Distributed / Parallel Join Processing

Scenario: how do we compute A ⋈ B when table A resides on node 1 and table B resides on node 2?
#### Slide 61: Naïve Approach (1)

Idea: use a standard join and fetch the table page-wise from the remote node when necessary (send and receive operators).

Example: the join is executed on node 2 using a nested-loop join.

- Outer loop: request a page of table A from node 1 (remote)
- Inner loop: for each page, iterate over table B and produce output

This means random access of pages on node 1, made worse by network delay.
#### Slide 62: Naïve Approach (2)

Idea: ship one table completely to the other node.

Example:

- Ship the complete table A from node 1 to node 2
- Join tables A and B locally on node 2

This avoids random page access on node 1.
#### Slide 63: Naïve Approach: Implications

Problems:

- High cost for shipping data: network cost is roughly the same as I/O cost for a hard disk, or even worse because of the unpredictability of network delay
- Shipping A is roughly equivalent to a full table scan

(Trivial) optimizations:

- Always ship the smaller table to the other side
- If the query contains a selection, apply the selection before sending A; note that the bigger table may become the smaller table after selection
#### Slide 64: Semi-Join Approach (p1)

Idea: before shipping a table, reduce the data that is shipped to only those tuples that are relevant for the join.

Example: join on A.id = B.id, where table A is to be shipped to node 2.
#### Slide 65: Semi-Join Approach (p2)

1. Compute the projection B.id of table B on node 2
2. Ship column B.id to node 1
#### Slide 66: Semi-Join Approach (p3)

3. Execute the semi-join of B.id and table A on A.id = B.id, selecting only the relevant tuples of table A (giving table A')
4. Send the result of the semi-join (table A') to node 2
#### Slide 67: Semi-Join Approach (p4)

5. Join the shipped table A' locally on node 2 with table B

Optimization of this approach: if node 1 holds a join index (e.g., type 1, mapping A.id -> {B.RID}), we can start with step (3).
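The five steps above can be sketched end-to-end; nodes are modeled as plain in-memory tables, and the relation contents are illustrative assumptions:

```python
# Semi-join reduction: ship only B's join keys to node 1, reduce A
# there, then ship the reduced A' back for the final join on node 2.

node2_B = [{"id": 1, "y": "p"}, {"id": 3, "y": "q"}]
node1_A = [{"id": 1, "x": 10}, {"id": 2, "x": 20}, {"id": 3, "x": 30}]

# Steps 1-2: project B.id on node 2 and "ship" the key set to node 1.
b_ids = {t["id"] for t in node2_B}

# Step 3: semi-join A with B.id on node 1; only matching tuples survive.
a_reduced = [t for t in node1_A if t["id"] in b_ids]

# Step 4: ship a_reduced (2 tuples instead of 3) to node 2.
# Step 5: join locally on node 2.
result = [{**a, **b} for a in a_reduced for b in node2_B if a["id"] == b["id"]]
print(result)  # [{'id': 1, 'x': 10, 'y': 'p'}, {'id': 3, 'x': 30, 'y': 'q'}]
```

Here the semi-join cuts the shipped data by a third; with a less selective join it would save nothing, which is the scenario the next slide discusses.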
#### Slide 68: Semi-Join Approach Discussion

This strategy works well if the semi-join reduces the size of the table that needs to be shipped. But suppose all rows of table A are needed anyway, so that none of them can be discarded: then this approach is more costly than shipping the entire table A in the first place.

Consequence: we need to decide whether this method makes sense based on the semi-join selectivity, and cost-based optimization must make that decision.
#### Slide 69: Bloom-Join Approach (p1)

The algorithm is the same as the semi-join approach, but we ship a Bloom filter instead of the (foreign) key column, using the Bloom-filter technique to compress the data.

Goal: send only a small bit list (to reduce network I/O) instead of all keys of the column.

Problems:

- A superset of the tuples that might join will be sent back (the same problem as with Bloom filters for bitmap indexes)
- More tuples must be sent over the network, so the net gain depends on a good hash function
#### Slide 70: Bloom-Join Approach (p2)

1. Compute a Bloom filter BL of size n for column B.id of table B on node 2, with n << |B.id| (e.g., by B.id % n)
2. Ship the Bloom filter to node 1
#### Slide 71: Bloom-Join Approach (p3)

3. Probe the Bloom filter with tuples from table A to get a superset of possible join candidates (table A')
4. Send the result (table A') to node 2; table A' might contain join candidates that have no partner in table B
5. Join the shipped table A' locally on node 2 with table B
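A minimal sketch of the steps above, using the single hash B.id % n from the slide (real Bloom filters use several hash functions; the table contents and filter size n are illustrative assumptions):

```python
# Bloom-join: ship an n-bit filter instead of B's key column, accept
# false positives, and let the final join on node 2 discard them.

n = 8
node2_B = [{"id": 1, "y": "p"}, {"id": 3, "y": "q"}]
node1_A = [{"id": 1, "x": 10}, {"id": 9, "x": 90}, {"id": 4, "x": 40}]

# Step 1: build the filter on node 2. Step 2: ship `bits` (n bits only).
bits = [False] * n
for t in node2_B:
    bits[t["id"] % n] = True

# Step 3: probe on node 1. id=9 is a false positive (9 % 8 == 1 % 8).
a_candidates = [t for t in node1_A if bits[t["id"] % n]]

# Steps 4-5: ship the candidate superset, join locally on node 2.
result = [{**a, **b} for a in a_candidates for b in node2_B if a["id"] == b["id"]]
print(result)  # only id=1 survives the real join
```

The filter is 8 bits instead of a full key column, at the price of shipping one extra non-matching tuple; the final join removes the false positive.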
#### Slide 72: Bloom-Join Approach Discussion

- Communication cost is much reduced
- But we have to deal with false positives
- Widely used in NoSQL databases
#### Slide 73: Project Rubric