# CS 542 -- Query Optimization


### Transcript

• 1. CS 542 Database Management Systems
Query Optimization
J Singh
March 28, 2011
• 2. Outline
Convert SQL query to a parse tree
Semantic checking: attributes, relation names, types
Convert to a logical query plan (relational algebra expression)
deal with subqueries
Improve the logical query plan
use algebraic transformations
group together certain operators
evaluate logical plan based on estimated size of relations
Convert to a physical query plan
search the space of physical plans
choose order of operations
complete the physical query plan
• 3. Desired Endpoint
Example Physical Query Plans
Plan for σ x=1 AND y=2 AND z<5 (R):
Filter(x=1 AND z<5)
  IndexScan(R, y=2)
Plan for R ⋈ S ⋈ U:
two-pass hash-join, 101 buffers
  materialize
    two-pass hash-join, 101 buffers
      TableScan(R)
      TableScan(S)
  TableScan(U)
• 4. Physical Plan Selection
Governed by disk I/O, which in turn is governed by:
The particular operation being performed
Size of intermediate results, as derived last week (sec 16.4 of book)
Physical operator implementation used, e.g., one- or two-pass
Operation ordering, esp. join ordering
Operation output: materialized or pipelined
• 5. Index-based physical plans (p1)
Selection example. What is the cost of σ a=v (R), assuming
B(R) = 2000
T(R) = 100,000
V(R, a) = 20
Table scan (assuming R is clustered):
B(R) = 2,000 I/Os
Index based selection:
If index is clustering: B(R) / V(R,a) = 100 I/Os
If index is unclustered: T(R) / V(R,a) = 5,000 I/Os
For small V(R, a), table scan can be faster than an unclustered index
Heuristics that pick indexed over not-indexed can lead you astray
Determine the cost of both methods and let the algorithm decide
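The comparison above can be sketched in a few lines of Python. This is only an illustration of the slide's formulas; the function name `selection_cost` and the `index` labels are made up for this example, and R is assumed to be stored clustered, as on the slide.

```python
def selection_cost(B, T, V, index=None):
    """Estimated disk I/Os for sigma_{a=v}(R), assuming R is stored clustered.

    B = B(R), T = T(R), V = V(R, a).
    index is None (table scan), "clustering", or "unclustered".
    """
    if index is None:
        return B        # read every block of R
    if index == "clustering":
        return B / V    # matching tuples are packed into about B/V blocks
    if index == "unclustered":
        return T / V    # roughly one I/O per matching tuple
    raise ValueError(f"unknown index kind: {index!r}")


# Numbers from the slide: B(R) = 2000, T(R) = 100,000, V(R, a) = 20
costs = {kind: selection_cost(2000, 100_000, 20, kind)
         for kind in (None, "clustering", "unclustered")}
best = min(costs, key=costs.get)  # let the numbers decide, not a heuristic
```

With these statistics the unclustered index is the most expensive choice, which is the point of the slide: compute each cost and pick the minimum.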
• 6. Index-based physical plans (p2)
Example: Join if S has an index on the join attribute
For each tuple in R, fetch the corresponding tuple(s) from S
Assume R is clustered. Cost:
If index on S is clustering: B(R) + T(R) × B(S) / V(S,a)
If index on S is unclustered: B(R) + T(R) × T(S) / V(S,a)
Another case: when R is the output of another iterator. Cost:
B(R) is accounted for in the iterator
If index on S is clustering: T(R) × B(S) / V(S,a)
If index on S is unclustered: T(R) × T(S) / V(S,a)
If S is not indexed but fits in memory: B(S)
A number of other cases
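A minimal sketch of the index-join cost cases listed above, in Python. The function name, the parameter names, and the sample statistics are assumptions made for illustration; they do not come from the slides.

```python
def index_join_cost(B_R, T_R, B_S, T_S, V_S_a, s_index, r_from_iterator=False):
    """Estimated disk I/Os for an index join of R with S on attribute a.

    s_index is "clustering" or "unclustered" (the index on S.a).
    If R is produced by another iterator, B(R) is charged to that iterator.
    """
    outer = 0 if r_from_iterator else B_R
    if s_index == "clustering":
        per_tuple = B_S / V_S_a   # blocks holding the matches for one R tuple
    elif s_index == "unclustered":
        per_tuple = T_S / V_S_a   # about one I/O per matching S tuple
    else:
        raise ValueError(f"unknown index kind: {s_index!r}")
    return outer + T_R * per_tuple


# Hypothetical statistics, chosen only to exercise the formulas
stats = dict(B_R=100, T_R=1000, B_S=500, T_S=5000, V_S_a=50)
clustered = index_join_cost(**stats, s_index="clustering")
unclustered = index_join_cost(**stats, s_index="unclustered")
pipelined = index_join_cost(**stats, s_index="clustering", r_from_iterator=True)
```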
• 7. Index-based physical plans (p3)
Index-based join if both R and S have a sorted index (B+ tree) on the join attribute
Then perform a merge join
called zig-zag join
Cost: B(R) + B(S)
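To see why the cost is a single pass over each input, here is a plain merge join over two inputs that are already sorted on the join key. It is a simplified sketch: a real zig-zag join would also use the B+ trees to skip ahead, which this version does not do. The function name and the `(key, value)` tuple layout are assumptions for this example.

```python
def merge_join(r, s):
    """Join two lists of (key, value) pairs, each sorted by key."""
    out = []
    i = j = 0
    while i < len(r) and j < len(s):
        if r[i][0] < s[j][0]:
            i += 1                      # advance the side with the smaller key
        elif r[i][0] > s[j][0]:
            j += 1
        else:
            key = r[i][0]
            i_end = i
            while i_end < len(r) and r[i_end][0] == key:
                i_end += 1              # end of the run of equal keys in r
            j_end = j
            while j_end < len(s) and s[j_end][0] == key:
                j_end += 1              # end of the run of equal keys in s
            for _, a in r[i:i_end]:
                for _, b in s[j:j_end]:
                    out.append((key, a, b))
            i, j = i_end, j_end
    return out


result = merge_join([(1, "a"), (2, "b"), (2, "c"), (4, "d")],
                    [(2, "x"), (3, "y"), (4, "z")])
```

Each input position is visited once, which mirrors the B(R) + B(S) cost when both relations are read in sorted order.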
• 8. Grand Summary of Physical Plans (p1)
Scans and Selects
Index: N = None, C = Clustering, NC = Non-clustered
• 9. Grand Summary of Physical Plans (p2)
Joins
Index: N = None, C = Clustering, NC = Non-clustered
Relation fits in memory: F = Yes, NF = No
• 10. Physical plans at non-leaf Operators (p1)
What if the input of the operator is from another operator?
For Select, cost= 0.
Cost of pipelining is assumed to be zero
The number of tuples emitted is reduced
For Join, when R is from an operator and S from a table:
B(R) is accounted for in the iterator
If index on S is clustering: T(R) × B(S) / V(S,a)
If index on S is unclustered: T(R) × T(S) / V(S,a)
If S is not indexed but fits in memory: B(S)
If S is not indexed and doesn’t fit: k*B(S) for k chunks
If S is not indexed and doesn’t fit: 3*B(S) for sort- or hash-join
• 11. Physical plans at non-leaf Operators (p2)
For Join, when R and S are both from operators, the cost depends on whether the results are sorted by the join attribute(s)
If yes, we use the zig-zag algorithm and the cost is zero. Why?
If either relation will fit in memory, the cost is zero. Why?
At most, the cost is 2*(B(R) + B(S)). Why?
• 12. Example (787)
Product(pname, maker), Company(cname, city)
Select Product.pname
From Product, Company
Where Product.maker=Company.cname
and Company.city = “Seattle”
How do we execute this query ?
• 13. Example (787)
Product(pname, maker), Company(cname, city)
Select Product.pname
From Product, Company
Where Product.maker=Company.cname
and Company.city = “Seattle”
Logical Plan:
⋈ maker=cname
  Product(pname, maker)
  σ city=“Seattle”
    Company(cname, city)
Clustering Indices:
Product.pname
Company.cname
Unclustered Indices:
Product.maker
Company.city
• 14. Example (787) Physical Plans
Physical Plan 1: index-based selection σ city=“Seattle” on Company(cname, city), then index-based join on maker=cname with Product(pname, maker)
Physical Plans 2a and 2b: merge-join on cname=maker with σ city=“Seattle”; Company(cname, city) is read by index-scan; Product(pname, maker) is read by scan and sort (2a) or by index scan (2b)
• 15. Evaluate (787) Physical Plans
Physical Plan 1
Tuples:
T(σ city='Seattle'(Company)) = T(Company) / V(Company, city)
Cost:
T(σ city='Seattle'(Company)) × T(Product) / V(Product, maker)
or, simplifying,
T(Company) / V(Company, city) × T(Product) / V(Product, maker)
Physical Plans 2a and 2b (merge-join)
Total Cost:
2a: 3B(Product) + B(Company)
2b: T(Product) + B(Company)
• 16. Final Evaluation
Plan Costs:
Plan 1: T(Company) / V(Company, city) × T(Product) / V(Product, maker)
Plan 2a: B(Company) + 3B(Product)
Plan 2b: B(Company) + T(Product)
Which is better?
It depends on the data
• 17. Example (787) Evaluation Results
Common assumptions:
T(Company) = 5,000, B(Company) = 500, M = 100
T(Product) = 100,000, B(Product) = 1,000
Assume V(Product, maker) ≈ T(Company)
Case 1:
V(Company, city) ≈ T(Company)
V(Company, city) = 5,000
Plan 1: 1 × 20 = 20
Plan 2a: 3,500
Plan 2b: 100,500
Case 2:
V(Company, city) << T(Company)
V(Company, city) = 20
Plan 1: 250 × 20 = 5,000
Plan 2a: 3,500
Plan 2b: 100,500
Reference from previous page:
Plan 1: T(Company) / V(Company, city) × T(Product) / V(Product, maker)
Plan 2a: B(Company) + 3B(Product)
Plan 2b: B(Company) + T(Product)
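The arithmetic for the two cases can be checked with a short Python sketch. The function name `plan_costs` and the dictionary keys are invented for this example; the formulas and statistics are the ones given on the slides, with V(Product, maker) taken as exactly 5,000.

```python
def plan_costs(T_company, B_company, T_product, B_product, V_city, V_maker):
    """Estimated I/O cost of each physical plan for the Product/Company query."""
    return {
        "1": (T_company / V_city) * (T_product / V_maker),
        "2a": B_company + 3 * B_product,
        "2b": B_company + T_product,
    }


common = dict(T_company=5_000, B_company=500,
              T_product=100_000, B_product=1_000, V_maker=5_000)

case1 = plan_costs(**common, V_city=5_000)  # city is nearly unique per company
case2 = plan_costs(**common, V_city=20)     # only 20 distinct cities

best_case1 = min(case1, key=case1.get)
best_case2 = min(case2, key=case2.get)
```

Under these assumptions Plan 1 wins when city is highly selective and Plan 2a wins when it is not, so the choice depends on V(Company, city).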
• Lessons
Need to consider several physical plans
even for one, simple logical plan
No magic “best” plan: depends on the data
In order to make the right choice
need to have statistics over the data
the B’s, the T’s, the V’s
• 20. Query Optimization
Have a SQL query Q
Create a plan P
Find equivalent plans P = P’ = P’’ = …
Choose the “cheapest”.
HOW ??
• 21. Logical Query Plan
FROM Purchase P, Person Q
Q.city=‘seattle’ AND
Q.phone > ‘5430000’
Plan (figure): selections City=‘seattle’ and phone>’5430000’ over Person and Purchase
In class: find a “better” plan P’
• 22. CS 542 Database Management Systems
Query Optimization – Choosing the Order of Operations
J Singh
March 28, 2011
• 23. Outline
Convert SQL query to a parse tree
Semantic checking: attributes, relation names, types
Convert to a logical query plan (relational algebra expression)
deal with subqueries
Improve the logical query plan
use algebraic transformations
group together certain operators
evaluate logical plan based on estimated size of relations
Convert to a physical query plan
search the space of physical plans
choose order of operations
complete the physical query plan
• 24. Join Trees
Recall that the following are equivalent:
R ⋈ S ⋈ U
R ⋈ (S ⋈ U)
(R ⋈ S) ⋈ U
S ⋈ (R ⋈ U)
But they are not equivalent from an execution viewpoint.
Considerable research has gone into picking the best order for Joins
• 29. Join Trees
R1 ⋈ R2 ⋈ … ⋈ Rn
Join tree (figure): a tree with leaves R3, R1, R2, R4
Definitions
A plan = a join tree
A partial plan = a subtree of a join tree
• 30. Left & Right Join Arguments
The argument relations in joins determine the cost of the join
In Physical Query Plans, the left argument of the join is
Called the build relation
Assumed to be smaller
Stored in main-memory
• 31. Left & Right Join Arguments
The right argument of the join is
Called the probe relation
Read a block at a time
Its tuples are matched with those of build relation
The join algorithms which distinguish between the arguments are:
One-pass join
Nested-loop join
Index join
• 32. Types of Join Trees
Left deep
Right deep
Bushy
(Figure: example left-deep, right-deep, and bushy join trees over R1 … R5)
Many different orders; it is very important to pick the right one
• 33. Optimization Algorithms
Heuristic based
Cost based
Dynamic programming: System R
Rule-based optimizations: DB2, SQL-Server
• 35. Dynamic Programming
Problem Statement
Given: a query R1 ⋈ R2 ⋈ … ⋈ Rn
Assume we have a function cost() that gives us the cost of a join tree
Find the best join tree for the query
Idea: for each subset of {R1, …, Rn}, compute the best plan for that subset
Algorithm: in increasing order of set cardinality, compute the cost for
Step 1: for {R1}, {R2}, …, {Rn}
Step 2: for {R1,R2}, {R1,R3}, …, {Rn-1,Rn}
…
Step n: for {R1, …, Rn}
It is a bottom-up strategy
Skipping further details of the algorithm
Will not be on the exam
• 36. Dynamic Programming Algorithm
When computing R1 ⋈ R2 ⋈ … ⋈ Rn,
Best Plan (R1 ⋈ R2 ⋈ … ⋈ Rn) = min cost plan of
Best Plan (R2 ⋈ R3 ⋈ … ⋈ Rn) ⋈ R1
Best Plan (R1 ⋈ R3 ⋈ … ⋈ Rn) ⋈ R2
…
Best Plan (R1 ⋈ R2 ⋈ … ⋈ Rn-1) ⋈ Rn
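The recurrence above can be sketched as a small bottom-up dynamic program in Python. Everything specific here is an assumption for illustration: the relation sizes in `SIZES`, the toy cost model (sum of intermediate result sizes, with every join keeping 1/100 of the cross product), and the names `leaves`, `toy_cost`, and `best_join_order`. The slides only assume that some cost() function exists.

```python
from itertools import combinations
from math import prod

# Hypothetical relation sizes in tuples
SIZES = {"R1": 1000, "R2": 10, "R3": 100}


def leaves(plan):
    """Relation names in a plan; a plan is a name or a (left, right) pair."""
    return [plan] if isinstance(plan, str) else leaves(plan[0]) + leaves(plan[1])


def toy_cost(plan):
    """Toy cost: total size of all results produced while evaluating the plan."""
    if isinstance(plan, str):
        return 0
    rels = leaves(plan)
    size = prod(SIZES[r] for r in rels) // 100 ** (len(rels) - 1)
    return toy_cost(plan[0]) + toy_cost(plan[1]) + size


def best_join_order(relations, cost):
    """Best left-deep plan per subset, built in increasing subset size."""
    best = {frozenset([r]): r for r in relations}
    for size in range(2, len(relations) + 1):
        for subset in combinations(relations, size):
            s = frozenset(subset)
            # Best Plan(S) = min over r in S of Best Plan(S - {r}) joined with r
            best[s] = min(((best[s - {r}], r) for r in subset), key=cost)
    return best[frozenset(relations)]


plan = best_join_order(["R1", "R2", "R3"], toy_cost)
```

With these sizes the sketch joins the two small relations first and brings in R1 last. Because each subset's best plan is stored once and reused, the work grows with the number of subsets rather than with the number of join orders.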
• Reducing the Search Space
Left-deep trees vs. bushy trees
Combinatorial explosion of the number of possible trees
Computing the cost of all possible trees is not feasible
For a 6-way Join, we can have
More than 30,000 bushy trees
6!, or 720 left-deep trees
Left-deep trees leave their result in memory, making it possible to pipeline efficiently
Trees without Cartesian product
Example: R(A,B) ⋈ S(B,C) ⋈ T(C,D)
Plan: (R(A,B) ⋈ T(C,D)) ⋈ S(B,C) has a Cartesian product
Most query optimizers will not consider it
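The tree counts quoted above can be reproduced with a short Python calculation. It assumes that the "more than 30,000" figure counts every binary tree shape (a Catalan number) combined with every ordering of the leaves; the function names are made up for this example.

```python
from math import comb, factorial


def left_deep_trees(n):
    """One tree shape, n! ways to order the relations."""
    return factorial(n)


def all_join_trees(n):
    """Every binary tree shape with n leaves, times n! leaf orderings."""
    shapes = comb(2 * (n - 1), n - 1) // n   # Catalan number C(n-1)
    return shapes * factorial(n)


six_way_left_deep = left_deep_trees(6)
six_way_all = all_join_trees(6)
```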
• 40. Outline
Convert SQL query to a parse tree
Semantic checking: attributes, relation names, types
Convert to a logical query plan (relational algebra expression)
deal with subqueries
Improve the logical query plan
use algebraic transformations
group together certain operators
evaluate logical plan based on estimated size of relations
Convert to a physical query plan
search the space of physical plans
choose order of operations
complete the physical query plan
Three topics
Choosing the physical implementations (e.g., select and join methods)
Decisions regarding materialized vs pipelined
Notation for physical query plans
• 41. Choosing a Selection Method
Algorithm for each selection operator
1. Can we use an existing index on an attribute?
If yes, index-scan. (Otherwise table-scan)
2. After retrieving all condition-satisfied tuples in (1), filter them with the remaining selection conditions
In other words,
When computing σ c1 ∧ c2 ∧ … ∧ cn (R), we index-scan on ci, then filter the result on all other cj, where j ≠ i.
The next 2 pages show an example where we examine several options and pick the best one
• 42. Selection Method Example (p1)
Selection: σ x=1 ∧ y=2 ∧ z<5 (R)
Where parameters of R are:
T(R) = 5,000 B(R) = 200
V(R, x) = 100 V(R, y) = 500
Relation R is clustered
x and y have non-clustering indices
z has a clustering index
• 43. Selection Method Example (p2)
Selection options:
Table-scan → filter x, y, z.
Cost is B(R) = 200 since R is clustered.
Use index on x = 1 → filter on y, z.
Cost is 50 since T(R) / V(R, x) is (5000/100) = 50 tuples, and the index on x is not clustering.
Use index on y = 2 → filter on x, z.
Cost is 10 since T(R) / V(R, y) is (5000/500) = 10 tuples, and the index on y is not clustering.
Index-scan on clustering index w/ z < 5 → filter x, y.
Cost is about B(R)/3 = 67
Therefore:
First retrieve all tuples with y = 2 (option 3)
Then filter for x and z
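The four options can be compared mechanically. The Python sketch below uses the statistics from the example; the option labels are made up for illustration, and the B(R)/3 estimate for the range condition z < 5 is taken directly from the slide.

```python
T, B = 5_000, 200            # T(R), B(R)
V = {"x": 100, "y": 500}     # V(R, x), V(R, y)

options = {
    "table-scan": B,              # R is clustered: read all blocks
    "index-scan x=1": T / V["x"], # non-clustering index: one I/O per tuple
    "index-scan y=2": T / V["y"], # non-clustering index: one I/O per tuple
    "index-scan z<5": B / 3,      # clustering index, about a third of R
}

best = min(options, key=options.get)
```

Whatever option wins, the remaining conditions are applied as a filter on the retrieved tuples at no extra I/O cost.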
• 44. Outline
Convert SQL query to a parse tree
Semantic checking: attributes, relation names, types
Convert to a logical query plan (relational algebra expression)
deal with subqueries
Improve the logical query plan
use algebraic transformations
group together certain operators
evaluate logical plan based on estimated size of relations
Convert to a physical query plan
search the space of physical plans
choose order of operations
complete the physical query plan
Three topics
Choosing the physical implementations (e.g., select and join methods)
Decisions regarding materialized vs pipelined
Notation for physical query plans
• 45. Pipelining Versus Materialization
Materialization
store the (intermediate) result of each operation on disk
Pipelining
Interleave the execution of several operations; the tuples produced by one operation are passed directly to the operation that uses them
store the (intermediate) result of each operation in a buffer, which is implemented in main memory
Prefer Pipelining where possible
Sometimes not possible, as the following example shows
Next few pages, a fully worked-out example
• 46. R⋈S⋈U Example (p1)
Consider physical query plan for the expression
(R(w, x) ⋈ S(x, y)) ⋈ U(y, z)
Assumption
R occupies 5,000 blocks, S and U each 10,000 blocks.
The intermediate result R ⋈ S occupies k blocks for some k.
Both joins will be implemented as hash-joins, either one-pass or two-pass depending on k
There are 101 buffers available.
• 47. R⋈S⋈U Example (p2)
When joining R ⋈ S, neither relation fits in buffers
Need two-pass hash-join to partition R
How many hash buckets for R?
100 at most
The 2nd pass hash-join uses 51 buffers, leaving 50 buffers for joining result of R ⋈ S with U.
Why 51?
• 48. R⋈S⋈U Example (p3)
Case 1: Suppose k ≤ 49, i.e., the result of R ⋈ S occupies at most 49 blocks.
Steps
Pipeline in R ⋈ S into 49 buffers
Organize them for lookup as a hash table
Use one buffer left to read each block of U in turn
Execute the second join as one-pass join.
The total number of I/O’s is 55,000
45,000 for two-pass hash join of R and S
10,000 to read U for one-pass hash join of (R⋈ S) ⋈U.
• 49. R⋈S⋈U Example (p4)
Case 2: suppose k > 49 but < 5,000. We can still pipeline, but need another strategy in which the intermediate results join with U in a 50-bucket, two-pass hash-join. Steps are:
Before starting on R ⋈ S, we hash U into 50 buckets of 200 blocks each.
Perform a two-pass hash join of R and S using 51 buffers as in case 1, placing the results in the 50 remaining buffers to form 50 buckets for the join of R ⋈ S with U.
Finally, join R ⋈ S with U bucket by bucket.
The number of disk I/O’s is:
20,000 to read U and write its tuples into buckets
45,000 for two-pass hash-join R ⋈ S
k to write out the buckets of R ⋈ S
k+10,000 to read the buckets of R ⋈ S and U in the final join
The total cost is 75,000+2k.
• 50. R⋈S⋈U Example (p5)
Case 3: k > 5,000. We cannot perform a two-pass join in the 50 buffers available if the result of R ⋈ S is pipelined, so we are forced to materialize the relation R ⋈ S.
The number of disk I/O’s is:
45,000 for two-pass hash-join of R and S
k to store R ⋈ S on disk
30,000 + 3k for two-pass join of U with R ⋈ S
The total cost is 75,000+4k.
• 51. R⋈S⋈U Example (p6)
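In summary, costs of the physical plan as a function of the size of R ⋈ S:
k ≤ 49: pipeline, one-pass final join, 55,000 I/Os
49 < k < 5,000: pipeline, 50-bucket two-pass final join, 75,000 + 2k I/Os
k > 5,000: materialize, two-pass final join, 75,000 + 4k I/Os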
Pause and Reflect
It’s all about the expected size of the intermediate result R ⋈ S
What would have happened if
We guessed 45 but had 55? Guessed 55 but only had 45?
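One way to explore the question is to write the three cases as a function of k. This Python sketch only encodes the totals derived on the previous slides; the function name and strategy labels are invented here, and assigning the boundary value k = 5,000 to the pipelined case is an assumption, since the slides leave it unspecified.

```python
def three_way_join_cost(k):
    """Strategy and total disk I/Os for (R join S) join U, given k = B(R join S)."""
    if k <= 49:
        return "pipeline, one-pass final join", 55_000
    if k <= 5_000:   # boundary handling is an assumption of this sketch
        return "pipeline, two-pass final join", 75_000 + 2 * k
    return "materialize, two-pass final join", 75_000 + 4 * k


guess_45 = three_way_join_cost(45)
actual_55 = three_way_join_cost(55)
```

The cost jumps by roughly 20,000 I/Os when k crosses 49, so a small error in the size estimate near that threshold changes the outcome considerably.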
• 52. Outline
Convert SQL query to a parse tree
Semantic checking: attributes, relation names, types
Convert to a logical query plan (relational algebra expression)
deal with subqueries
Improve the logical query plan
use algebraic transformations
group together certain operators
evaluate logical plan based on estimated size of relations
Convert to a physical query plan
search the space of physical plans
choose order of operations
complete the physical query plan
Three topics
Choosing the physical implementations (e.g., select and join methods)
Decisions regarding materialized vs pipelined
Notation for physical query plans
• 53. Notation for Physical Query Plans
Several types of operators:
Operators for leaves
(Physical) operators for Selection
(Physical) Sorts Operators
Other Relational-Algebra Operations
In practice, each DBMS uses its own internal notation for physical query plans
• 54. PQP Notation
Leaves: Replace a leaf in an LQP by
SortScan(R, L): Read R in order according to L
IndexScan(R, C): Scan R using an index on attribute A, where C is a condition of the form A θ c
IndexScan(R, A): Scan R using the index on attribute A
Selects: Replace a Select in an LQP by one of the leaf operators plus:
Filter(D) for condition D
Sorts: Replace a leaf-level sort as shown above. For other operations,
Sort(L): Sort a relation that is not stored
Other Operators: Operation- and algorithm-specific (e.g., Hash-Join)
Also need to specify # passes, buffer sizes, etc.
• 55. We have Arrived at the Desired Endpoint
Example Physical Query Plans
Plan for σ x=1 AND y=2 AND z<5 (R):
Filter(x=1 AND z<5)
  IndexScan(R, y=2)
Plan for R ⋈ S ⋈ U:
two-pass hash-join, 101 buffers
  materialize
    two-pass hash-join, 101 buffers
      TableScan(R)
      TableScan(S)
  TableScan(U)
• 56. Outline
Convert SQL query to a parse tree
Semantic checking: attributes, relation names, types
Convert to a logical query plan (relational algebra expression)
deal with subqueries
Improve the logical query plan
use algebraic transformations
group together certain operators
evaluate logical plan based on estimated size of relations
Convert to a physical query plan
search the space of physical plans
choose order of operations
complete the physical query plan
• 57. Optimization Issues and Proposals
The “fuzz” in estimation of sizes
Parametric Query Optimization
Specify alternatives to the execution engine so it may respond to conditions at runtime
Multiple-query optimization
Take concurrent execution of several queries into account
Combinatorial explosion of options when doing an n-way Join
Becomes really expensive around n > 15
Alternative optimizations have been proposed for special situations, but there is no general framework
Rule-based optimizers
Randomized plan generation
• 58. CS 542 Database Management Systems
Distributed Query Execution
Source: Carsten Binnig, Univ of Zurich, 2006
J Singh
March 28, 2011
• 59. Motivation
Algorithms based on Semi-Joins have been proposed as techniques for query optimization
They shine in Distributed and Parallel Databases
Good opportunity to explore them in that context
Semi-join by example:
Semi-join formal definition:
• 60. Distributed / Parallel Join Processing
Scenario:
How to compute A ⋈ B?
Table A resides on Node 1
Table B resides on Node 2
• 61. Naïve approach (1)
Idea: Use standard join and fetch table page-wise from remote node if necessary (send- and receive-operators)
Example:
Join is executed on node 2 using a Nested-Loop-Join
Outer loop: Request page of table A from node 1 (remote)
Inner loop: For each page iterate over table B and produce output
=> Random access of pages on node 1 (due to network delay)
(Figure: node 2 requests page A1 of table A from node 1, and node 1 sends it)
• 62. Naïve approach (2)
Idea: Ship one table completely to the other node
Example:
Ship complete table A from node 1 to node 2
Join table A and B locally on node 2
=> Avoids random page access on node 1
(Figure: node 1 ships table A to node 2, which holds table B)
• 63. Naïve Approach: Implications
Problems:
High cost for shipping data
Network cost roughly the same as I/O cost for a hard disk (or even worse because of unpredictability of network delay)
Shipping A roughly equivalent to a full table scan
(Trivial) Optimizations:
Always ship the smaller table to the other side
If query contains a selection, apply selection before sending A
Note: bigger table may become the smaller table (after selection)
• 64. Semi-join Approach (p1)
Idea: Before shipping a table, reduce the data that is shipped to only those tuples that are relevant for the join
Example: Join on A.id=B.id, where table A should be shipped to node 2
• 65. Semi-join Approach (p2)
(1) Compute projection B.id of table B on node 2
(2) Ship column B.id to node 1
(Figure: node 2 ships column B.id to node 1)
• 66. Semi-join Approach (p3)
(3) Execute semi-join of B.id and table A on A.id=B.id (to select only relevant tuples of table A => table A’)
(4) Send result of semi-join (table A’) to node 2
(Figure: node 1 ships table A’ to node 2)
• 67. Semi-join Approach (p4)
(5) Join the shipped table A’ locally on node 2 with table B
=> Optimization of this approach: If node 1 holds a join index (e.g., type 1 with A.id -> {B.RID}) we can start with step (3)
• 68. Semi-join Approach Discussion
This strategy works well if semi-join reduces size of the table that needs to be shipped
Assume all rows of Table A are needed anyway => none of the rows of table A can be discarded
Then this approach is more costly than shipping the entire table A in the first place!
Consequence:
Need to decide whether this method makes sense based on semi-join selectivity
=> Cost-based optimization must decide this
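The five steps of the semi-join approach can be simulated in one process with plain Python lists standing in for the two nodes. The function names, the row layout (dicts with an `id` key), and the sample rows are all assumptions made for this sketch; no real network shipping takes place.

```python
def semi_join_reduce(table_a, b_ids, key="id"):
    """Step 3, on node 1: keep only the rows of A that have a partner in B."""
    return [row for row in table_a if row[key] in b_ids]


def local_join(table_a_reduced, table_b, key="id"):
    """Step 5, on node 2: join the shipped rows A' with B."""
    b_by_key = {}
    for row in table_b:
        b_by_key.setdefault(row[key], []).append(row)
    return [{**a, **b}
            for a in table_a_reduced
            for b in b_by_key.get(a[key], [])]


table_a = [{"id": 1, "a": "x"}, {"id": 2, "a": "y"}, {"id": 3, "a": "z"}]  # node 1
table_b = [{"id": 2, "b": "p"}, {"id": 3, "b": "q"}, {"id": 4, "b": "r"}]  # node 2

b_ids = {row["id"] for row in table_b}          # steps 1-2: project B.id, ship it
a_reduced = semi_join_reduce(table_a, b_ids)    # step 3: semi-join on node 1
joined = local_join(a_reduced, table_b)         # steps 4-5: ship A', join on node 2

# Fraction of A that still has to be shipped after the reduction
shipped_fraction = len(a_reduced) / len(table_a)
```

A ratio like `shipped_fraction` is the kind of selectivity figure a cost-based optimizer would weigh against the extra cost of shipping B.id first.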
• 69. Bloom-join Approach (p1)
Algorithm same as semi-join approach
Ship a bloom-filter instead of (foreign) key column
Use bloom-filter technique to compress data
Goal: send only a small bit list (a bit-vector) instead of all keys of the column, to reduce network I/O
Problems:
A superset of tuples that might join will be sent back (same problem as in bloom-filters for bitmap-indexes)
=> More tuples must be sent over network and thus net gain depends on good hash function
• 70. Bloom-join Approach (p2)
(1) Compute bloom filter BL of size n for column B.id of table B on node 2 with n << |B.id| (e.g., by B.id % n)
(2) Ship bloom filter B.id’ to node 1
(Figure: node 2 ships the bloom filter to node 1)
• 71. Bloom-join Approach (p3)
(3) Probe bloom filter B.id’ with tuples from table A to get a superset of possible join candidates (=> table A’)
(4) Send result (table A’) to node 2 (table A’ might contain join candidates that do not have a partner in table B)
(5) Join the shipped table A’ locally on node 2 with table B
(Figure: node 1 probes the bloom filter with table A and ships table A’ to node 2)
• 72. Bloom-join Approach Discussion
Communication cost much reduced
But have to deal with false positives
Widely used in NoSQL databases
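A minimal Python sketch of the bloom-join steps, using the single hash function `id % n` suggested on the earlier slide. The function names, the filter size n = 8, and the sample keys are assumptions for illustration; practical Bloom filters normally use several hash functions and a much larger bit array.

```python
def build_bloom_filter(keys, n):
    """Step 1, on node 2: set one bit per key, using key % n as the hash."""
    bits = [False] * n
    for k in keys:
        bits[k % n] = True
    return bits


def probe(table_a_ids, bits):
    """Step 3, on node 1: keep the ids whose bit is set (may include false positives)."""
    n = len(bits)
    return [k for k in table_a_ids if bits[k % n]]


b_ids = [3, 12]                  # column B.id on node 2
a_ids = [3, 5, 11, 12]           # column A.id on node 1

bits = build_bloom_filter(b_ids, n=8)       # steps 1-2: build and ship 8 bits
candidates = probe(a_ids, bits)             # steps 3-4: superset A' is shipped back
matches = [k for k in candidates if k in set(b_ids)]   # step 5: exact join on node 2
```

Here id 11 hashes to the same bit as id 3, so it is shipped as a false positive and only discarded by the exact join on node 2.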
• 73. Project Rubric