QUERY OPTIMIZATION
PRESENTED BY ABDUL WAHAB
INTRODUCTION
• What is Query Optimization?
• Suppose you were given a chance to visit 15 pre-selected
cities in Pakistan. The only constraint is ‘Time’.
-> Would you plan the trip, or just visit the cities in any order?
• Plan:
-> Place the 15 cities in groups based on their
proximity to each other.
-> Start with one group, then move on to the
next group.
The important point here is that you would
visit the cities in a more organized manner,
and the ‘Time’ constraint mentioned earlier
would be handled efficiently.
• Query Optimization works in a similar way:
There can be many different ways to compute the
answer to a given query. The result is the same in
every case.
A DBMS strives to process the query in the most
efficient way (in terms of ‘Time’) to produce the
answer.
Cost = Time needed to get all answers
STEPS IN QUERY OPTIMIZATION
1. Parsing
2. Transformation
3. Implementation
QUERY FLOW
SQL -> Parser -> Optimizer -> Code Generator/Interpreter -> Processor
• Query Parser – Verifies the validity of the SQL statement
and translates the query into an internal structure using
relational calculus.
• Query Optimizer – Finds the best expression among the
various equivalent algebraic expressions. The criterion
used is ‘Cheapness’ (lowest estimated cost).
• Code Generator/Interpreter – Turns the optimizer's
chosen plan into calls to the Query Processor.
• Query Processor – Executes the calls obtained from
the code generator.
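The optimizer's ‘Cheapness’ criterion can be sketched as picking the lowest-cost candidate from a set of equivalent plans. This is only an illustration: the `Plan` class, the plan descriptions, and the cost numbers are hypothetical and are not taken from any real DBMS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    description: str     # hypothetical label for one equivalent algebraic expression
    estimated_cost: int  # hypothetical cost estimate (for example, blocks read)


def choose_cheapest(plans):
    """Return the plan with the lowest estimated cost."""
    return min(plans, key=lambda p: p.estimated_cost)


# Made-up candidates: all are assumed to produce the same answer.
candidates = [
    Plan("join first, then filter", 5000),
    Plan("filter first, then join", 1200),
    Plan("use index, then join", 900),
]

best = choose_cheapest(candidates)
print(best.description, best.estimated_cost)
```

A real optimizer estimates these costs from statistics about the data; here they are simply given.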
PROJECTION EXAMPLE:
• A projection produces one result tuple for every argument
tuple.
• So what changes?
• The number of tuples stays the same; the output size
changes only because each tuple becomes shorter.
Let’s take a relation ‘R’
Relation (20,000 tuples): R(a, b, c)
Each Tuple (190 bytes): header = 24 bytes, a = 8 bytes,
b = 8 bytes, c = 150 bytes
Each Block (1024 bytes): header = 24 bytes, leaving 1,000
bytes for tuples
We can fit 5 tuples into 1 block
- 5 tuples * 190 bytes/tuple = 950 bytes, which fits
into 1 block (a 6th tuple would not)
- For 20,000 tuples, we would require 4,000
blocks (20,000 tuples / 5 tuples per block = 4,000 blocks)
With a projection that eliminates column c
(150 bytes), we can estimate that each tuple
shrinks to 40 bytes (190 – 150 bytes)
Now, the new estimate is 25 tuples in 1 block.
- 25 tuples * 40 bytes/tuple = 1,000 bytes, which
fits into 1 block
- With 20,000 tuples, the new estimate is 800
blocks (20,000 tuples / 25 tuples per block =
800 blocks)
The result is a reduction by a factor of 5 (4,000 blocks -> 800 blocks)
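The arithmetic above can be checked with a short script. This is a minimal sketch that uses only the sizes given in the example; the function name `blocks_needed` and the constant names are chosen here for illustration.

```python
BLOCK_SIZE = 1024     # bytes per block
BLOCK_HEADER = 24     # bytes of block header
TUPLE_HEADER = 24     # bytes of tuple header
NUM_TUPLES = 20_000   # tuples in relation R


def blocks_needed(column_sizes, num_tuples=NUM_TUPLES):
    """Return (tuple size, tuples per block, blocks) for the given column sizes."""
    tuple_size = TUPLE_HEADER + sum(column_sizes)
    usable = BLOCK_SIZE - BLOCK_HEADER           # 1,000 bytes left for tuples
    tuples_per_block = usable // tuple_size      # whole tuples only
    blocks = -(-num_tuples // tuples_per_block)  # ceiling division
    return tuple_size, tuples_per_block, blocks


before = blocks_needed([8, 8, 150])  # R(a, b, c)
after = blocks_needed([8, 8])        # after projecting away column c

print(before)                 # (190, 5, 4000)
print(after)                  # (40, 25, 800)
print(before[2] / after[2])   # 5.0
```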
QUERY OPTIMIZATION: ALGEBRAIC EXPRESSIONS
Suppose we wrote the following query:
SELECT count(id) FROM emp;
instead of:
SELECT count(*) FROM emp;
(Both return the same count as long as id is never NULL.)
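A quick way to compare the two queries is to run both against a small table. The sketch below uses Python's built-in `sqlite3` module; the `emp` table layout and its rows are made up for illustration. It shows that the two forms agree when `id` has no NULLs and differ once a NULL `id` is present, since `count(id)` skips NULL values.

```python
import sqlite3

# In-memory database with a hypothetical emp table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO emp (id, name) VALUES (?, ?)",
    [(1, "A"), (2, "B"), (3, "C")],
)

# With no NULL ids, both queries give the same answer.
count_id = conn.execute("SELECT count(id) FROM emp").fetchone()[0]
count_star = conn.execute("SELECT count(*) FROM emp").fetchone()[0]
print(count_id, count_star)  # 3 3

# Add a row whose id is NULL: count(id) ignores it, count(*) does not.
conn.execute("INSERT INTO emp (id, name) VALUES (NULL, 'D')")
count_id_null = conn.execute("SELECT count(id) FROM emp").fetchone()[0]
count_star_null = conn.execute("SELECT count(*) FROM emp").fetchone()[0]
print(count_id_null, count_star_null)  # 3 4

conn.close()
```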
Suppose that in DLD (Digital Logic Design) we have this expression:
X = A + ABC + AB + AC + AA’    (1)
X = A(1 + BC + B + C + A’)
X = A(1)
X = A    (2)
So, which one is better, (1) or (2)? Both give the
same result, but (2) needs far fewer operations.
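That expressions (1) and (2) really give the same result can be confirmed by checking every combination of inputs. This is a small sketch; the function names `expr1` and `expr2` are chosen here for illustration.

```python
from itertools import product


def expr1(a, b, c):
    # (1): X = A + ABC + AB + AC + AA'
    return a or (a and b and c) or (a and b) or (a and c) or (a and not a)


def expr2(a, b, c):
    # (2): X = A
    return a


# Compare the two expressions over all 8 rows of the truth table.
equivalent = all(
    expr1(a, b, c) == expr2(a, b, c)
    for a, b, c in product([False, True], repeat=3)
)
print(equivalent)  # True
```

Both forms agree on every input, but (2) needs no AND/OR operations at all. A query optimizer looks for the same kind of cheaper, equivalent form.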
THANK
YOU
