CS263
Query Optimisation
LECTURE PLAN
⇒ Motivation for Query Optimisation
⇒ Phases of Query Processing
⇒ Query Trees
⇒ RA Transformation Rules
⇒ Heuristic Processing Strategies
⇒ Cost Estimation for RA Operations
Motivation for Query Optimisation
List all the managers that work in the sales department.
SELECT *
FROM emp, dept
WHERE emp.deptno = dept.deptno
AND emp.job = 'Manager'
AND dept.name = 'Sales';
There are at least three alternative ways of representing this query as a Relational Algebra expression:
σ(job = ‘Manager’) ∧ (name=‘Sales’) ∧ (emp.deptno = dept.deptno) (EMP X DEPT)
σ(job = ‘Manager’) ∧ (name=‘Sales’) (EMP ⋈ emp.deptno = dept.deptno DEPT)
(σ(job = ‘Manager’) (EMP)) ⋈ emp.deptno = dept.deptno (σ(name=‘Sales’) (DEPT))
Metrics:
1000 tuples in the EMP relation
50 tuples in the DEPT relation
50 employees are Managers (one per department)
5 separate Sales departments (across the country)
Cost of processing the following query alternative:
σ(job = ‘Manager’) ∧ (name=‘Sales’) ∧ (emp.deptno = dept.deptno) (EMP X DEPT)
Cartesian product of EMP and DEPT:
(1000 + 50) record I/Os to read the relations
+ (1000 * 50) record I/Os to create an intermediate relation to store the result
Selection on result of Cartesian product:
(1000 * 50) record I/Os to read tuples and compare against predicate
Total cost of the query:
(1000 + 50) + 2*(1000 * 50) = 101,050 record I/Os.
Cost of processing the following query alternative, using the same metrics:
σ(job = ‘Manager’) ∧ (name=‘Sales’) (EMP ⋈ emp.deptno = dept.deptno DEPT)
Join of EMP and DEPT over deptno:
(1000 + 50) record I/Os to read the relations
+ (1000) record I/Os to create an intermediate relation to store the join result
Selection on result of Join:
(1000) record I/Os to read each tuple and compare against predicate
Total cost of the query:
(1000 + 50) + 2*(1000) = 3,050 record I/Os.
Cost of processing the following query alternative:
(σ(job = ‘Manager’) (EMP)) ⋈ emp.deptno = dept.deptno (σ(name=‘Sales’) (DEPT))
Select ‘Managers’ in EMP:
(1000) record I/Os to read the relation
+ (50) record I/Os to create an intermediate relation to store the select result
Select ‘Sales’ in DEPT:
(50) record I/Os to read the relation
+ (5) record I/Os to create an intermediate relation to store the select result
Join of previous two selections over deptno:
(50 + 5) record I/Os to read the relations
Total cost of the query:
(1000 + 2*(50) + 5 + (50 + 5)) = 1,160 record I/Os.
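The arithmetic for the three alternatives can be checked with a short script. This is only a sketch of the slides' record-I/O counting model; the function and constant names are illustrative and do not come from the lecture.

```python
# Metrics taken from the slides (record counts, not disk blocks).
EMP_TUPLES = 1000
DEPT_TUPLES = 50
MANAGERS = 50      # employees whose job is 'Manager'
SALES_DEPTS = 5    # departments named 'Sales'


def cost_product_then_select() -> int:
    """Cartesian product of EMP and DEPT, then one selection over the result."""
    read_inputs = EMP_TUPLES + DEPT_TUPLES
    write_product = EMP_TUPLES * DEPT_TUPLES
    read_product = EMP_TUPLES * DEPT_TUPLES
    return read_inputs + write_product + read_product


def cost_join_then_select() -> int:
    """Join over deptno, then selection; each employee joins to one department."""
    read_inputs = EMP_TUPLES + DEPT_TUPLES
    write_join = EMP_TUPLES
    read_join = EMP_TUPLES
    return read_inputs + write_join + read_join


def cost_select_then_join() -> int:
    """Select on each relation first, then join the two small results."""
    select_emp = EMP_TUPLES + MANAGERS        # read EMP, write managers
    select_dept = DEPT_TUPLES + SALES_DEPTS   # read DEPT, write sales departments
    join_inputs = MANAGERS + SALES_DEPTS      # read both intermediate relations
    return select_emp + select_dept + join_inputs


costs = {
    "product then select": cost_product_then_select(),
    "join then select": cost_join_then_select(),
    "select then join": cost_select_then_join(),
}
for name, cost in costs.items():
    print(f"{name}: {cost} record I/Os")
```

The gap between the first and third alternatives comes almost entirely from the size of the intermediate relation that has to be written and read back.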
Phases of Query Processing
Query Processing Stage - 1
• Cast the query into internal form
  - This involves the conversion of the original (SQL) query into some internal representation more suitable for machine manipulation.
  - The internal representation typically chosen is either some kind of ‘abstract syntax tree’, or a relational algebra ‘query tree’.
Relational Algebra Query Trees
A Relational Algebra query can be represented as a ‘query tree’. For
example the query to list all the managers that work in the sales
department could be described as one of the following:
σ(job = ‘Manager’) ∧ (name=‘Sales’) ∧ (emp.deptno = dept.deptno) (EMP X DEPT)
Query tree (each level of indentation is one level further from the root):
σ(job = ‘Manager’) ∧ (name=‘Sales’) ∧ (emp.deptno = dept.deptno)   [Root]
  X   [Intermediate operation]
    EMP   [Leaf]
    DEPT   [Leaf]
Alternative ‘query tree’ for the query to list all the managers that work in the sales department:
σ(job = ‘Manager’) ∧ (name=‘Sales’) (EMP ⋈ emp.deptno = dept.deptno DEPT)
σ(job = ‘Manager’) ∧ (name=‘Sales’)
  ⋈ emp.deptno = dept.deptno
    EMP
    DEPT
Alternative ‘query tree’ for the query to list all the managers that work in the sales department:
(σ(job = ‘Manager’) (EMP)) ⋈ emp.deptno = dept.deptno (σ(name=‘Sales’) (DEPT))
⋈ emp.deptno = dept.deptno
  σ(job = ‘Manager’)
    EMP
  σ(name=‘Sales’)
    DEPT
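One way to see that the three query trees are equivalent is to build them as small data structures and evaluate them over toy data. The sketch below is an illustration only: the node classes, the `evaluate` function and the sample rows are assumptions made for this example, not part of the lecture or of any DBMS.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class Relation:          # leaf of the query tree
    name: str
    rows: List[Row]


@dataclass
class Select:            # σ
    predicate: Callable[[Row], bool]
    child: object


@dataclass
class Product:           # Cartesian product (X)
    left: object
    right: object


@dataclass
class Join:              # theta-join
    condition: Callable[[Row], bool]
    left: object
    right: object


def evaluate(node) -> List[Row]:
    """Evaluate a query tree bottom-up, from the leaves to the root."""
    if isinstance(node, Relation):
        # Prefix attribute names so emp.deptno and dept.deptno stay distinct.
        return [{f"{node.name}.{k}": v for k, v in row.items()} for row in node.rows]
    if isinstance(node, Select):
        return [row for row in evaluate(node.child) if node.predicate(row)]
    if isinstance(node, (Product, Join)):
        left_rows, right_rows = evaluate(node.left), evaluate(node.right)
        pairs = [{**l, **r} for l in left_rows for r in right_rows]
        if isinstance(node, Join):
            return [row for row in pairs if node.condition(row)]
        return pairs
    raise TypeError(f"unknown node type: {type(node).__name__}")


EMP = Relation("emp", [
    {"ename": "Ann", "job": "Manager", "deptno": 10},
    {"ename": "Bob", "job": "Clerk", "deptno": 10},
    {"ename": "Cy", "job": "Manager", "deptno": 20},
])
DEPT = Relation("dept", [
    {"deptno": 10, "name": "Sales"},
    {"deptno": 20, "name": "Research"},
])


def is_manager(r: Row) -> bool:
    return r["emp.job"] == "Manager"


def is_sales(r: Row) -> bool:
    return r["dept.name"] == "Sales"


def same_dept(r: Row) -> bool:
    return r["emp.deptno"] == r["dept.deptno"]


tree1 = Select(lambda r: is_manager(r) and is_sales(r) and same_dept(r), Product(EMP, DEPT))
tree2 = Select(lambda r: is_manager(r) and is_sales(r), Join(same_dept, EMP, DEPT))
tree3 = Join(same_dept, Select(is_manager, EMP), Select(is_sales, DEPT))

results = [evaluate(tree) for tree in (tree1, tree2, tree3)]
print(results[0] == results[1] == results[2])
```

All three trees return the same rows; they differ only in how much intermediate data is produced on the way to the root.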
Query Processing Stage - 2
• Convert to canonical form
  - Find a more ‘efficient’ representation of the query by converting the internal representation into some equivalent (canonical) form through the application of a set of well-defined ‘transformation rules’.
  - The set of transformation rules to apply will generally be the result of the application of specific heuristic processing strategies associated with particular DBMSs.
Transformation Rules for RA Operations
1. Conjunctive selection operations can cascade into individual selection operations (and vice versa). Sometimes referred to as cascade of selection.
σp∧q∧r(R) = σp(σq(σr(R)))
Example:
σdeptno=10 ∧ sal>1000(Emp) = σdeptno=10(σsal>1000(Emp))
2. Commutativity of selection
σp(σq(R)) = σq(σp(R))
Example:
σsal>1000(σdeptno=10(Emp)) = σdeptno=10(σsal>1000(Emp))
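Rules 1 and 2 can be checked on a small in-memory relation. The following is a minimal sketch; the `select` helper and the sample Emp rows are made up for illustration.

```python
emp = [
    {"ename": "Ann", "deptno": 10, "sal": 1500},
    {"ename": "Bob", "deptno": 10, "sal": 900},
    {"ename": "Cy", "deptno": 20, "sal": 2000},
]


def select(predicate, rows):
    """σ: keep only the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def in_dept_10(row):
    return row["deptno"] == 10


def high_sal(row):
    return row["sal"] > 1000


# Rule 1: one conjunctive selection equals a cascade of single selections.
conjunctive = select(lambda row: in_dept_10(row) and high_sal(row), emp)
cascade = select(in_dept_10, select(high_sal, emp))

# Rule 2: the order of the cascaded selections does not matter.
cascade_swapped = select(high_sal, select(in_dept_10, emp))

print(conjunctive == cascade == cascade_swapped)
```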
3. In a sequence of projection operations, only the last in the sequence is required.
ΠLΠM … ΠN(R) = ΠL(R)
Example:
Πdeptno(Πdeptno, name(Dept)) = Πdeptno(Dept)
4. Commutativity of selection and projection.
ΠA1,…,Am(σp(R)) = σp(ΠA1,…,Am(R))
where p ∈ {A1, A2, …, Am}
Selection predicate (p) is only made up of projected attributes.
Example:
Πname, job(σname=‘Smith’(Emp)) = σname=‘Smith’(Πname, job(Emp))
5. Commutativity of theta-join (and Cartesian product).
R ⋈p S = S ⋈p R
R X S = S X R
Example:
EMP ⋈ emp.deptno = dept.deptno DEPT
= DEPT ⋈ emp.deptno = dept.deptno EMP
NOTE: Theta-join is a generalisation of both the equi-join and natural-join.
6. Commutativity of selection and theta-join (or Cartesian product).
(σp(R)) ⋈r S = σp(R ⋈r S)
where p ∈ {A1, A2, …, Am}, the attributes of R
Selection predicate (p) is only made up of attributes of R, one of the relations being joined.
Example:
(σemp.deptno=10 (EMP)) ⋈ emp.deptno = dept.deptno DEPT
= σemp.deptno=10 (EMP ⋈ emp.deptno = dept.deptno DEPT)
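Rule 6 is what allows a selection to be pushed below a join. As a rough illustration, the sketch below applies the selection before and after a theta-join and compares the results; `select`, `theta_join` and the sample rows are hypothetical helpers written for this example.

```python
emp = [
    {"emp.ename": "Ann", "emp.deptno": 10},
    {"emp.ename": "Bob", "emp.deptno": 20},
    {"emp.ename": "Cy", "emp.deptno": 10},
]
dept = [
    {"dept.deptno": 10, "dept.name": "Sales"},
    {"dept.deptno": 20, "dept.name": "Research"},
]


def select(predicate, rows):
    """σ: keep only the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def theta_join(left, right, condition):
    """Nested-loop theta-join: pair every left row with every matching right row."""
    return [{**l, **r} for l in left for r in right if condition(l, r)]


def same_dept(l, r):
    return l["emp.deptno"] == r["dept.deptno"]


def in_dept_10(row):
    # The predicate refers only to an attribute of EMP, so it can move below the join.
    return row["emp.deptno"] == 10


select_first = theta_join(select(in_dept_10, emp), dept, same_dept)
join_first = select(in_dept_10, theta_join(emp, dept, same_dept))

print(select_first == join_first)
print(len(select(in_dept_10, emp)), "rows enter the join instead of", len(emp))
```

Both orders give the same answer, but selecting first means fewer rows take part in the join.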
7. Commutativity of projection and theta-join (or Cartesian product).
ΠL(R ⋈r S) = (ΠL1(R)) ⋈r (ΠL2(S))
Project attributes L = L1 ∪ L2, where L1 are attributes of R, and L2 are attributes of S. L must also contain the join attributes.
Example:
Π job, location, deptno (EMP ⋈ emp.deptno = dept.deptno DEPT)
= (Π job, deptno (EMP)) ⋈ emp.deptno = dept.deptno (Π location, deptno (DEPT))
8. Commutativity of union and intersection
(but not set difference).
R ∪ S = S ∪ R
R ∩ S = S ∩ R
9. Commutativity of selection and set operations (union, intersection, and set difference).
Union
σp(R ∪ S) = σp(R) ∪ σp(S)
Intersection
σp(R ∩ S) = σp(R) ∩ σp(S)
Set Difference
σp(R - S) = σp(R) - σp(S)
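Python sets give a quick check of how selection distributes over the three set operations. In this sketch the relations are plain sets of integers and the predicate is arbitrary; both are assumptions chosen for brevity. Note that for set difference the operand order has to be preserved, because set difference is not commutative.

```python
R = {1, 2, 3, 4, 5, 6}
S = {4, 5, 6, 7, 8}


def select(predicate, relation):
    """σ over a relation modelled as a Python set."""
    return {t for t in relation if predicate(t)}


def is_even(t):
    return t % 2 == 0


union_ok = select(is_even, R | S) == select(is_even, R) | select(is_even, S)
intersection_ok = select(is_even, R & S) == select(is_even, R) & select(is_even, S)
difference_ok = select(is_even, R - S) == select(is_even, R) - select(is_even, S)

# Swapping the operands of the difference yields a different relation.
swapped_difference = select(is_even, S) - select(is_even, R)

print(union_ok, intersection_ok, difference_ok)
print(select(is_even, R - S), swapped_difference)
```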
10. Commutativity of projection and union
ΠL(R ∪ S) = ΠL(R) ∪ ΠL(S)
11. Associativity of natural join (and Cartesian product)
Natural Join
(R ⋈ S) ⋈ T = R ⋈ (S ⋈ T)
Cartesian Product
(R X S) X T = R X (S X T)
12. Associativity of union and intersection (but not set difference)
Union
(R ∪ S) ∪ T = R ∪ (S ∪ T)
Intersection
(R ∩ S) ∩ T = R ∩ (S ∩ T)
Heuristic Processing Strategies
• Perform selection operations as early as possible.
• Translate a Cartesian product and subsequent selection (whose predicate represents a join condition) into a join operation.
• Use associativity of binary operations to ensure that the most restrictive selection operations are executed first.
• Perform projections as early as possible.
• Compute common expressions once.
Heuristic Processing - Example
Canonical Query:
σ(job = ‘Manager’) ∧ (name=‘Sales’) ∧ (emp.deptno = dept.deptno)
  X
    EMP
    DEPT
Cartesian product and join-condition selection translated into a join:
σ(job = ‘Manager’) ∧ (name=‘Sales’)
  ⋈ emp.deptno = dept.deptno
    EMP
    DEPT
Optimised (selections performed as early as possible):
⋈ emp.deptno = dept.deptno
  σ(job = ‘Manager’)
    EMP
  σ(name=‘Sales’)
    DEPT
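The first two heuristics used in this example can be sketched as a simple classification of the conjuncts in the WHERE clause: a conjunct that mentions a single relation becomes an early selection on that relation, and a conjunct that mentions two relations becomes a join condition. The `push_down` function and the way each conjunct is tagged with its relations are assumptions for illustration; a real optimiser would derive this from the parsed query.

```python
from typing import Dict, FrozenSet, List, Tuple

# Each conjunct of the WHERE clause, tagged with the relations it mentions.
Conjunct = Tuple[str, FrozenSet[str]]


def push_down(conjuncts: List[Conjunct]) -> Tuple[Dict[str, List[str]], List[str]]:
    """Split conjuncts into per-relation selections and join conditions."""
    selections: Dict[str, List[str]] = {}
    join_conditions: List[str] = []
    for text, relations in conjuncts:
        if len(relations) == 1:
            # Mentions one relation only: perform it as early as possible.
            (relation,) = relations
            selections.setdefault(relation, []).append(text)
        else:
            # Mentions several relations: turn product + selection into a join.
            join_conditions.append(text)
    return selections, join_conditions


where_clause = [
    ("emp.deptno = dept.deptno", frozenset({"emp", "dept"})),
    ("emp.job = 'Manager'", frozenset({"emp"})),
    ("dept.name = 'Sales'", frozenset({"dept"})),
]

selections, join_conditions = push_down(where_clause)
print(selections)
print(join_conditions)
```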
Query Processing Stage - 3
• Choose candidate low-level procedures
  - Consider the (optimised canonical) query as a series of low-level operations (join, restrict, etc.).
  - For each of these operations generate alternative execution strategies and calculate the cost of such strategies on the basis of statistical information held about the database tables (files).
Query Processing Stage - 4
• Generate query plans and choose the cheapest
  - Construct a set of ‘candidate’ Query Execution Plans (QEPs).
  - Each QEP is constructed by selecting a candidate implementation procedure for each operation in the canonical query and then combining them to form a string of associated operations.
  - Each QEP will have an (estimated) cost associated with it: the sum of the cost of each of its operations.
  - Choose the QEP with the least cost.
Cost Based Optimisation
• Cost Based Optimisation (stages 3 & 4)
  - A good declarative query optimiser does not rely solely on heuristic processing strategies.
  - It chooses the QEP with the lowest estimated cost.
  - After heuristic rules are applied to a query, there still remain a number of alternative ways to execute it.
  - The Query Optimiser estimates the cost of executing each one (or at least a number) of these alternatives, and selects the cheapest one.
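Choosing the cheapest plan amounts to summing the estimated cost of each operation in a candidate QEP and taking the minimum. The sketch below reuses the per-operation record-I/O figures from the motivation example as stand-in estimates; the plan names and the dictionary layout are illustrative assumptions.

```python
from typing import Dict

# Estimated cost of each operation in three candidate plans (record I/Os).
candidate_plans: Dict[str, Dict[str, int]] = {
    "product then select": {"cartesian product": 51050, "selection": 50000},
    "join then select": {"join": 2050, "selection": 1000},
    "select then join": {"select EMP": 1050, "select DEPT": 55, "join": 55},
}


def plan_cost(operation_costs: Dict[str, int]) -> int:
    """The estimated cost of a plan is the sum of the costs of its operations."""
    return sum(operation_costs.values())


cheapest = min(candidate_plans, key=lambda name: plan_cost(candidate_plans[name]))
print(cheapest, plan_cost(candidate_plans[cheapest]))
```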
Costs associated with query execution
• Secondary storage access costs
  - Searching for data blocks on disk
  - Reading data blocks from disk
  - Writing data blocks to disk
• Storage costs
  - Cost of storing intermediate (temp) files
• Computation costs
  - Cost of CPU usage
• Main memory usage costs
  - Cost of buffering data
• Communication costs
  - Cost of moving data across
Database statistics used in cost estimation
Information held on each relation:
• number of tuples
• number of blocks
• blocking factor
• primary access method
• primary access attributes
• secondary indexes
• secondary indexing attributes
• number of levels for each index
• number of distinct values of each attribute
Physical Data Structures – File Types
• Heap (Sequential, Unordered)
  - no key columns
  - queries, other than appends, scan every page
  - rows are appended at the end
  - duplicate rows are allowed
• Ordered
  - physically sorted data file with no index
• Hash (Random, Direct)
  - data is located based on the (calculated) value of a hash field (key)
• Indexed Sequential (ISAM)
  - sorted data file with a primary index
• B+ Tree
  - dynamic multilevel index
  - reuses deleted space on associated data pages
Strategies for implementing the RESTRICT operation
Different access strategies are used depending upon the structure of the file in which the relation is stored, and whether the predicate attribute(s) have been indexed/hashed. Each uses a different cost algorithm (which refers to specific database statistics).
• Linear Search (Heap)
• Binary Search (Ordered)
• Equality on Hash Key
• Equality condition on primary key
• Inequality condition on primary key
• Equality condition on secondary index
• Inequality condition on secondary B+ Tree index
If the selection predicate is a composite (AND & OR) then there are additional cost considerations!
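The slides do not give the cost formulas themselves. As a hedged sketch, the block below uses commonly quoted textbook estimates, in block accesses, for the first three strategies; the blocking factor of 20 and all function names are assumptions made for this example.

```python
import math


def n_blocks(n_tuples: int, blocking_factor: int) -> int:
    """Blocks needed to store a relation, given tuples per block."""
    return math.ceil(n_tuples / blocking_factor)


def linear_search_cost(blocks: int, equality_on_key: bool = False) -> int:
    """Heap file: scan every block, or about half on average for a key match."""
    return math.ceil(blocks / 2) if equality_on_key else blocks


def binary_search_cost(blocks: int) -> int:
    """Ordered file, equality on the ordering key: about log2(blocks) accesses."""
    return max(1, math.ceil(math.log2(blocks)))


def hash_key_cost() -> int:
    """Equality on the hash key, assuming no overflow: one block access."""
    return 1


emp_blocks = n_blocks(n_tuples=1000, blocking_factor=20)  # assumed blocking factor
print("blocks in EMP:", emp_blocks)
print("linear search, non-key predicate:", linear_search_cost(emp_blocks))
print("linear search, key equality:", linear_search_cost(emp_blocks, equality_on_key=True))
print("binary search:", binary_search_cost(emp_blocks))
print("hash key:", hash_key_cost())
```

The inputs to these estimates (number of tuples, number of blocks, blocking factor) are the kind of statistics the optimiser keeps for each relation.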
Strategies for implementing the JOIN operation
Different access strategies are used depending upon the structure of the files in which the relations to be joined are stored, and whether the join attributes have been indexed/hashed. Each uses its own cost algorithm (which refers to specific database statistics).
• Block nested loop join
• Indexed nested loop join
• Sort-merge join
• Hash join
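As an illustration of how such a cost algorithm depends on database statistics, the sketch below uses a commonly quoted textbook estimate for the block nested loop join when only one buffer block is available per relation: read the outer relation once, and read the whole inner relation once for every outer block. The block counts are assumed values (EMP in 50 blocks, DEPT in 3), not figures from the slides.

```python
def block_nested_loop_cost(outer_blocks: int, inner_blocks: int) -> int:
    """Block reads with one buffer block per relation (textbook estimate)."""
    return outer_blocks + outer_blocks * inner_blocks


EMP_BLOCKS = 50   # assumed: 1000 tuples at 20 tuples per block
DEPT_BLOCKS = 3   # assumed: 50 tuples at 20 tuples per block

emp_outer = block_nested_loop_cost(EMP_BLOCKS, DEPT_BLOCKS)
dept_outer = block_nested_loop_cost(DEPT_BLOCKS, EMP_BLOCKS)

print("EMP as outer relation:", emp_outer)
print("DEPT as outer relation:", dept_outer)
```

Under this estimate the smaller relation is the better choice for the outer loop.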
Query Optimisation Summary
• The aims of query processing are to transform a query written in a high-level language (SQL) into a correct and efficient execution strategy expressed in a low-level language (Relational Algebra), and to execute the strategy to retrieve the required data.
• There are many equivalent transformations of the same high-level query; the DBMS has to choose the one that minimises resource usage.
• There are two main techniques for query optimisation. The first uses heuristic rules that order the operations in a query. The second compares different execution strategies for those operations, based on their relative costs, and selects the least resource intensive (cheapest) ones.

More Related Content

What's hot

Overview of query evaluation
Overview of query evaluationOverview of query evaluation
Overview of query evaluationavniS
 
Distributed Query Processing
Distributed Query ProcessingDistributed Query Processing
Distributed Query Processing
Mythili Kannan
 
Query-porcessing-& Query optimization
Query-porcessing-& Query optimizationQuery-porcessing-& Query optimization
Query-porcessing-& Query optimization
Saranya Natarajan
 
Query evaluation and optimization
Query evaluation and optimizationQuery evaluation and optimization
Query evaluation and optimization
lavanya marichamy
 
Database ,7 query localization
Database ,7 query localizationDatabase ,7 query localization
Database ,7 query localizationAli Usman
 
Query Execution Time and Query Optimization.
Query Execution Time and Query Optimization.Query Execution Time and Query Optimization.
Query Execution Time and Query Optimization.
Radhe Krishna Rajan
 
8 query processing and optimization
8 query processing and optimization8 query processing and optimization
8 query processing and optimizationKumar
 
Recursion and Sorting Algorithms
Recursion and Sorting AlgorithmsRecursion and Sorting Algorithms
Recursion and Sorting Algorithms
Afaq Mansoor Khan
 
R basics
R basicsR basics
R basics
FAO
 
R language introduction
R language introductionR language introduction
R language introduction
Shashwat Shriparv
 
Data structure and algorithm
Data structure and algorithmData structure and algorithm
Data structure and algorithm
Trupti Agrawal
 
Hash join
Hash joinHash join
Database ,2 Background
 Database ,2 Background Database ,2 Background
Database ,2 BackgroundAli Usman
 
02 c++ Array Pointer
02 c++ Array Pointer02 c++ Array Pointer
02 c++ Array PointerTareq Hasan
 
Array Of Pointers
Array Of PointersArray Of Pointers
Array Of Pointers
Sharad Dubey
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)
ijceronline
 
Building data fusion surrogate models for spacecraft aerodynamic problems wit...
Building data fusion surrogate models for spacecraft aerodynamic problems wit...Building data fusion surrogate models for spacecraft aerodynamic problems wit...
Building data fusion surrogate models for spacecraft aerodynamic problems wit...
Shinwoo Jang
 

What's hot (20)

Overview of query evaluation
Overview of query evaluationOverview of query evaluation
Overview of query evaluation
 
Distributed Query Processing
Distributed Query ProcessingDistributed Query Processing
Distributed Query Processing
 
Query processing
Query processingQuery processing
Query processing
 
Query-porcessing-& Query optimization
Query-porcessing-& Query optimizationQuery-porcessing-& Query optimization
Query-porcessing-& Query optimization
 
Query evaluation and optimization
Query evaluation and optimizationQuery evaluation and optimization
Query evaluation and optimization
 
Query processing System
Query processing SystemQuery processing System
Query processing System
 
Database ,7 query localization
Database ,7 query localizationDatabase ,7 query localization
Database ,7 query localization
 
Query Execution Time and Query Optimization.
Query Execution Time and Query Optimization.Query Execution Time and Query Optimization.
Query Execution Time and Query Optimization.
 
8 query processing and optimization
8 query processing and optimization8 query processing and optimization
8 query processing and optimization
 
Query compiler
Query compilerQuery compiler
Query compiler
 
Recursion and Sorting Algorithms
Recursion and Sorting AlgorithmsRecursion and Sorting Algorithms
Recursion and Sorting Algorithms
 
R basics
R basicsR basics
R basics
 
R language introduction
R language introductionR language introduction
R language introduction
 
Data structure and algorithm
Data structure and algorithmData structure and algorithm
Data structure and algorithm
 
Hash join
Hash joinHash join
Hash join
 
Database ,2 Background
 Database ,2 Background Database ,2 Background
Database ,2 Background
 
02 c++ Array Pointer
02 c++ Array Pointer02 c++ Array Pointer
02 c++ Array Pointer
 
Array Of Pointers
Array Of PointersArray Of Pointers
Array Of Pointers
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)
 
Building data fusion surrogate models for spacecraft aerodynamic problems wit...
Building data fusion surrogate models for spacecraft aerodynamic problems wit...Building data fusion surrogate models for spacecraft aerodynamic problems wit...
Building data fusion surrogate models for spacecraft aerodynamic problems wit...
 

Viewers also liked

2 shadowing
2 shadowing2 shadowing
2 shadowing
ashish61_scs
 
Query Optimisation
Query OptimisationQuery Optimisation
Query Optimisation
dchq
 
OLAP Cubes: Basic operations
OLAP Cubes: Basic operationsOLAP Cubes: Basic operations
OLAP Cubes: Basic operations
Sthefan Berwanger
 
Centralized vs distrbution system
Centralized vs distrbution systemCentralized vs distrbution system
Centralized vs distrbution systemzirram
 
Centralised and distributed databases
Centralised and distributed databasesCentralised and distributed databases
Centralised and distributed databases
Forrester High School
 
Etl process in data warehouse
Etl process in data warehouseEtl process in data warehouse
Etl process in data warehouse
Komal Choudhary
 
OLAP
OLAPOLAP
OLAP
Ashir Ali
 
Query Processing and Optimisation - Lecture 10 - Introduction to Databases (1...
Query Processing and Optimisation - Lecture 10 - Introduction to Databases (1...Query Processing and Optimisation - Lecture 10 - Introduction to Databases (1...
Query Processing and Optimisation - Lecture 10 - Introduction to Databases (1...
Beat Signer
 
Memory management
Memory managementMemory management
Memory management
Muhammad Fayyaz
 
Database Normalization 1NF, 2NF, 3NF, BCNF, 4NF, 5NF
Database Normalization 1NF, 2NF, 3NF, BCNF, 4NF, 5NFDatabase Normalization 1NF, 2NF, 3NF, BCNF, 4NF, 5NF
Database Normalization 1NF, 2NF, 3NF, BCNF, 4NF, 5NF
Oum Saokosal
 
Introduction to ETL and Data Integration
Introduction to ETL and Data IntegrationIntroduction to ETL and Data Integration
Introduction to ETL and Data Integration
CloverDX (formerly known as CloverETL)
 
OLAP & DATA WAREHOUSE
OLAP & DATA WAREHOUSEOLAP & DATA WAREHOUSE
OLAP & DATA WAREHOUSE
Zalpa Rathod
 
Database design & Normalization (1NF, 2NF, 3NF)
Database design & Normalization (1NF, 2NF, 3NF)Database design & Normalization (1NF, 2NF, 3NF)
Database design & Normalization (1NF, 2NF, 3NF)
Jargalsaikhan Alyeksandr
 
How to Make Awesome SlideShares: Tips & Tricks
How to Make Awesome SlideShares: Tips & TricksHow to Make Awesome SlideShares: Tips & Tricks
How to Make Awesome SlideShares: Tips & Tricks
SlideShare
 
Getting Started With SlideShare
Getting Started With SlideShareGetting Started With SlideShare
Getting Started With SlideShare
SlideShare
 

Viewers also liked (17)

2 shadowing
2 shadowing2 shadowing
2 shadowing
 
Query Optimisation
Query OptimisationQuery Optimisation
Query Optimisation
 
OLAP Cubes: Basic operations
OLAP Cubes: Basic operationsOLAP Cubes: Basic operations
OLAP Cubes: Basic operations
 
Centralized vs distrbution system
Centralized vs distrbution systemCentralized vs distrbution system
Centralized vs distrbution system
 
Centralised and distributed databases
Centralised and distributed databasesCentralised and distributed databases
Centralised and distributed databases
 
Etl process in data warehouse
Etl process in data warehouseEtl process in data warehouse
Etl process in data warehouse
 
OLAP
OLAPOLAP
OLAP
 
Query Processing and Optimisation - Lecture 10 - Introduction to Databases (1...
Query Processing and Optimisation - Lecture 10 - Introduction to Databases (1...Query Processing and Optimisation - Lecture 10 - Introduction to Databases (1...
Query Processing and Optimisation - Lecture 10 - Introduction to Databases (1...
 
OLAP
OLAPOLAP
OLAP
 
Memory management
Memory managementMemory management
Memory management
 
Database Normalization 1NF, 2NF, 3NF, BCNF, 4NF, 5NF
Database Normalization 1NF, 2NF, 3NF, BCNF, 4NF, 5NFDatabase Normalization 1NF, 2NF, 3NF, BCNF, 4NF, 5NF
Database Normalization 1NF, 2NF, 3NF, BCNF, 4NF, 5NF
 
Introduction to ETL and Data Integration
Introduction to ETL and Data IntegrationIntroduction to ETL and Data Integration
Introduction to ETL and Data Integration
 
Memory management
Memory managementMemory management
Memory management
 
OLAP & DATA WAREHOUSE
OLAP & DATA WAREHOUSEOLAP & DATA WAREHOUSE
OLAP & DATA WAREHOUSE
 
Database design & Normalization (1NF, 2NF, 3NF)
Database design & Normalization (1NF, 2NF, 3NF)Database design & Normalization (1NF, 2NF, 3NF)
Database design & Normalization (1NF, 2NF, 3NF)
 
How to Make Awesome SlideShares: Tips & Tricks
How to Make Awesome SlideShares: Tips & TricksHow to Make Awesome SlideShares: Tips & Tricks
How to Make Awesome SlideShares: Tips & Tricks
 
Getting Started With SlideShare
Getting Started With SlideShareGetting Started With SlideShare
Getting Started With SlideShare
 

Similar to Query optimisation

Scala as a Declarative Language
Scala as a Declarative LanguageScala as a Declarative Language
Scala as a Declarative Language
vsssuresh
 
Module 2-2.ppt
Module 2-2.pptModule 2-2.ppt
Module 2-2.ppt
Shylaja40
 
R Programming: Mathematical Functions In R
R Programming: Mathematical Functions In RR Programming: Mathematical Functions In R
R Programming: Mathematical Functions In R
Rsquared Academy
 
R console
R consoleR console
R console
Ananth Raj
 
3._Relational_Algebra.pptx:Basics of relation algebra
3._Relational_Algebra.pptx:Basics of relation algebra3._Relational_Algebra.pptx:Basics of relation algebra
3._Relational_Algebra.pptx:Basics of relation algebra
ZakriyaMalik2
 
R/Finance 2009 Chicago
R/Finance 2009 ChicagoR/Finance 2009 Chicago
R/Finance 2009 Chicago
gyollin
 
SQL for pattern matching (Oracle 12c)
SQL for pattern matching (Oracle 12c)SQL for pattern matching (Oracle 12c)
SQL for pattern matching (Oracle 12c)
Logan Palanisamy
 
R code for data manipulation
R code for data manipulationR code for data manipulation
R code for data manipulation
Avjinder (Avi) Kaler
 
R code for data manipulation
R code for data manipulationR code for data manipulation
R code for data manipulation
Avjinder (Avi) Kaler
 
Intro to relational model
Intro to relational modelIntro to relational model
Intro to relational model
ATS SBGI MIRAJ
 
Dbms ii mca-ch5-ch6-relational algebra-2013
Dbms ii mca-ch5-ch6-relational algebra-2013Dbms ii mca-ch5-ch6-relational algebra-2013
Dbms ii mca-ch5-ch6-relational algebra-2013
Prosanta Ghosh
 
Basic concept of MATLAB.ppt
Basic concept of MATLAB.pptBasic concept of MATLAB.ppt
Basic concept of MATLAB.ppt
aliraza2732
 
5 the relational algebra and calculus
5 the relational algebra and calculus5 the relational algebra and calculus
5 the relational algebra and calculusKumar
 
E212d9a797dbms chapter3 b.sc2 (2)
E212d9a797dbms chapter3 b.sc2 (2)E212d9a797dbms chapter3 b.sc2 (2)
E212d9a797dbms chapter3 b.sc2 (2)Mukund Trivedi
 
E212d9a797dbms chapter3 b.sc2 (1)
E212d9a797dbms chapter3 b.sc2 (1)E212d9a797dbms chapter3 b.sc2 (1)
E212d9a797dbms chapter3 b.sc2 (1)Mukund Trivedi
 
E212d9a797dbms chapter3 b.sc2
E212d9a797dbms chapter3 b.sc2E212d9a797dbms chapter3 b.sc2
E212d9a797dbms chapter3 b.sc2Mukund Trivedi
 
Relational Algebra Ch6 (Navathe 4th edition)/ Ch7 (Navathe 3rd edition)
Relational Algebra Ch6 (Navathe 4th edition)/ Ch7 (Navathe 3rd edition)Relational Algebra Ch6 (Navathe 4th edition)/ Ch7 (Navathe 3rd edition)
Relational Algebra Ch6 (Navathe 4th edition)/ Ch7 (Navathe 3rd edition)
Raj vardhan
 
R for Statistical Computing
R for Statistical ComputingR for Statistical Computing
R for Statistical Computing
Mohammed El Rafie Tarabay
 

Similar to Query optimisation (20)

Scala as a Declarative Language
Scala as a Declarative LanguageScala as a Declarative Language
Scala as a Declarative Language
 
Module 2-2.ppt
Module 2-2.pptModule 2-2.ppt
Module 2-2.ppt
 
Learn Matlab
Learn MatlabLearn Matlab
Learn Matlab
 
R Programming: Mathematical Functions In R
R Programming: Mathematical Functions In RR Programming: Mathematical Functions In R
R Programming: Mathematical Functions In R
 
R console
R consoleR console
R console
 
3._Relational_Algebra.pptx:Basics of relation algebra
3._Relational_Algebra.pptx:Basics of relation algebra3._Relational_Algebra.pptx:Basics of relation algebra
3._Relational_Algebra.pptx:Basics of relation algebra
 
R/Finance 2009 Chicago
R/Finance 2009 ChicagoR/Finance 2009 Chicago
R/Finance 2009 Chicago
 
SQL for pattern matching (Oracle 12c)
SQL for pattern matching (Oracle 12c)SQL for pattern matching (Oracle 12c)
SQL for pattern matching (Oracle 12c)
 
R code for data manipulation
R code for data manipulationR code for data manipulation
R code for data manipulation
 
R code for data manipulation
R code for data manipulationR code for data manipulation
R code for data manipulation
 
Intro to relational model
Intro to relational modelIntro to relational model
Intro to relational model
 
Dbms ii mca-ch5-ch6-relational algebra-2013
Dbms ii mca-ch5-ch6-relational algebra-2013Dbms ii mca-ch5-ch6-relational algebra-2013
Dbms ii mca-ch5-ch6-relational algebra-2013
 
TreSQL
TreSQL TreSQL
TreSQL
 
Basic concept of MATLAB.ppt
Basic concept of MATLAB.pptBasic concept of MATLAB.ppt
Basic concept of MATLAB.ppt
 
5 the relational algebra and calculus
5 the relational algebra and calculus5 the relational algebra and calculus
5 the relational algebra and calculus
 
E212d9a797dbms chapter3 b.sc2 (2)
E212d9a797dbms chapter3 b.sc2 (2)E212d9a797dbms chapter3 b.sc2 (2)
E212d9a797dbms chapter3 b.sc2 (2)
 
E212d9a797dbms chapter3 b.sc2 (1)
E212d9a797dbms chapter3 b.sc2 (1)E212d9a797dbms chapter3 b.sc2 (1)
E212d9a797dbms chapter3 b.sc2 (1)
 
E212d9a797dbms chapter3 b.sc2
E212d9a797dbms chapter3 b.sc2E212d9a797dbms chapter3 b.sc2
E212d9a797dbms chapter3 b.sc2
 
Relational Algebra Ch6 (Navathe 4th edition)/ Ch7 (Navathe 3rd edition)
Relational Algebra Ch6 (Navathe 4th edition)/ Ch7 (Navathe 3rd edition)Relational Algebra Ch6 (Navathe 4th edition)/ Ch7 (Navathe 3rd edition)
Relational Algebra Ch6 (Navathe 4th edition)/ Ch7 (Navathe 3rd edition)
 
R for Statistical Computing
R for Statistical ComputingR for Statistical Computing
R for Statistical Computing
 

More from WBUTTUTORIALS

Software testing-and-analysis
Software testing-and-analysisSoftware testing-and-analysis
Software testing-and-analysis
WBUTTUTORIALS
 
Fuzzy logic-introduction
Fuzzy logic-introductionFuzzy logic-introduction
Fuzzy logic-introduction
WBUTTUTORIALS
 
Failure mode-and-effects-analysis
Failure mode-and-effects-analysisFailure mode-and-effects-analysis
Failure mode-and-effects-analysis
WBUTTUTORIALS
 
Direct memory access
Direct memory accessDirect memory access
Direct memory access
WBUTTUTORIALS
 
Cost volume-profit-relationships
Cost volume-profit-relationshipsCost volume-profit-relationships
Cost volume-profit-relationships
WBUTTUTORIALS
 
Control unit-implementation
Control unit-implementationControl unit-implementation
Control unit-implementation
WBUTTUTORIALS
 
Relational model
Relational modelRelational model
Relational model
WBUTTUTORIALS
 
Query processing-and-optimization
Query processing-and-optimizationQuery processing-and-optimization
Query processing-and-optimization
WBUTTUTORIALS
 
Data communications-concepts
Data communications-conceptsData communications-concepts
Data communications-concepts
WBUTTUTORIALS
 
Ajax workshop
Ajax workshopAjax workshop
Ajax workshop
WBUTTUTORIALS
 
Ajax toolkit-framework
Ajax toolkit-frameworkAjax toolkit-framework
Ajax toolkit-framework
WBUTTUTORIALS
 

More from WBUTTUTORIALS (12)

Software testing-and-analysis
Software testing-and-analysisSoftware testing-and-analysis
Software testing-and-analysis
 
Fuzzy logic-introduction
Fuzzy logic-introductionFuzzy logic-introduction
Fuzzy logic-introduction
 
Failure mode-and-effects-analysis
Failure mode-and-effects-analysisFailure mode-and-effects-analysis
Failure mode-and-effects-analysis
 
Direct memory access
Direct memory accessDirect memory access
Direct memory access
 
Cost volume-profit-relationships
Cost volume-profit-relationshipsCost volume-profit-relationships
Cost volume-profit-relationships
 
Control unit-implementation
Control unit-implementationControl unit-implementation
Control unit-implementation
 
Relational model
Relational modelRelational model
Relational model
 
Query processing-and-optimization
Query processing-and-optimizationQuery processing-and-optimization
Query processing-and-optimization
 
Data communications-concepts
Data communications-conceptsData communications-concepts
Data communications-concepts
 
Ajax workshop
Ajax workshopAjax workshop
Ajax workshop
 
Ajax toolkit-framework
Ajax toolkit-frameworkAjax toolkit-framework
Ajax toolkit-framework
 
Ajax
AjaxAjax
Ajax
 

Query optimisation

  • 2. Lecture Plan
      ⇒ Motivation for Query Optimisation
      ⇒ Phases of Query Processing
      ⇒ Query Trees
      ⇒ RA Transformation Rules
      ⇒ Heuristic Processing Strategies
      ⇒ Cost Estimation for RA Operations
  • 3. Motivation for Query Optimisation
      List all the managers that work in the sales department.
        SELECT *
        FROM emp, dept
        WHERE emp.deptno = dept.deptno
        AND emp.job = 'Manager'
        AND dept.name = 'Sales';
      There are at least three alternative ways of representing this query as a Relational Algebra expression:
        σ_{job = 'Manager' ∧ name = 'Sales' ∧ emp.deptno = dept.deptno}(EMP X DEPT)
        σ_{job = 'Manager' ∧ name = 'Sales'}(EMP ⋈_{emp.deptno = dept.deptno} DEPT)
        (σ_{job = 'Manager'}(EMP)) ⋈_{emp.deptno = dept.deptno} (σ_{name = 'Sales'}(DEPT))
  • 4. Motivation for Query Optimisation
      Metrics:
        - 1000 tuples in the EMP relation
        - 50 tuples in the DEPT relation
        - 50 employees are Managers (one per department)
        - 5 separate Sales departments (across the country)
      Cost of processing the following query alternative:
        σ_{job = 'Manager' ∧ name = 'Sales' ∧ emp.deptno = dept.deptno}(EMP X DEPT)
      Cartesian product of EMP and DEPT:
        (1000 + 50) record I/Os to read the relations
        + (1000 * 50) record I/Os to create an intermediate relation to store the result
      Selection on result of Cartesian product:
        (1000 * 50) record I/Os to read tuples and compare against the predicate
      Total cost of the query:
        (1000 + 50) + 2*(1000 * 50) = 101,050 record I/Os.
  • 5. Motivation for Query Optimisation
      Metrics:
        - 1000 tuples in the EMP relation
        - 50 tuples in the DEPT relation
        - 50 employees are Managers (one per department)
        - 5 separate Sales departments (across the country)
      Cost of processing the following query alternative:
        σ_{job = 'Manager' ∧ name = 'Sales'}(EMP ⋈_{emp.deptno = dept.deptno} DEPT)
      Join of EMP and DEPT over deptno:
        (1000 + 50) record I/Os to read the relations
        + (1000) record I/Os to create an intermediate relation to store the join result
      Selection on result of Join:
        (1000) record I/Os to read each tuple and compare against the predicate
      Total cost of the query:
        (1000 + 50) + 2*(1000) = 3,050 record I/Os.
  • 6. Motivation for Query Optimisation
      Cost of processing the following query:
        (σ_{job = 'Manager'}(EMP)) ⋈_{emp.deptno = dept.deptno} (σ_{name = 'Sales'}(DEPT))
      Select 'Managers' in EMP:
        (1000) record I/Os to read the relation
        + (50) record I/Os to create an intermediate relation to store the select result
      Select 'Sales' in DEPT:
        (50) record I/Os to read the relation
        + (5) record I/Os to create an intermediate relation to store the select result
      Join of previous two selections over deptno:
        (50 + 5) record I/Os to read the relations
      Total cost of the query:
        (1000 + 2*(50) + 5 + (50 + 5)) = 1,160 record I/Os.
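The three totals can be re-derived mechanically. The following is a minimal sketch using only the metrics given on the slides; the function and constant names are illustrative, and the comment about each employee matching one department is an assumption implied by the 1000-tuple join result.

```python
# Record-I/O cost of the three equivalent strategies, using the slide metrics.
EMP_TUPLES = 1000   # tuples in EMP
DEPT_TUPLES = 50    # tuples in DEPT
MANAGERS = 50       # EMP tuples with job = 'Manager'
SALES_DEPTS = 5     # DEPT tuples with name = 'Sales'


def cost_product_then_select() -> int:
    """Cartesian product first, then one selection over the whole product."""
    read_inputs = EMP_TUPLES + DEPT_TUPLES
    write_product = EMP_TUPLES * DEPT_TUPLES
    read_product = EMP_TUPLES * DEPT_TUPLES
    return read_inputs + write_product + read_product


def cost_join_then_select() -> int:
    """Join on deptno first, then one selection over the join result."""
    read_inputs = EMP_TUPLES + DEPT_TUPLES
    # Assumption: each employee matches exactly one department,
    # so the join result holds 1000 tuples.
    write_join = EMP_TUPLES
    read_join = EMP_TUPLES
    return read_inputs + write_join + read_join


def cost_select_then_join() -> int:
    """Both selections first, then a join over the two small results."""
    select_emp = EMP_TUPLES + MANAGERS        # read EMP, write 50 managers
    select_dept = DEPT_TUPLES + SALES_DEPTS   # read DEPT, write 5 departments
    read_for_join = MANAGERS + SALES_DEPTS
    return select_emp + select_dept + read_for_join


costs = {
    "product then select": cost_product_then_select(),
    "join then select": cost_join_then_select(),
    "select then join": cost_select_then_join(),
}

for strategy, cost in costs.items():
    print(f"{strategy}: {cost} record I/Os")
```

The same query therefore costs roughly 87 times more under the first strategy than under the third, which is the motivation for optimising before executing.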
  • 7. Phases of Query Processing
  • 8. Query Processing Stage - 1
      - Cast the query into internal form
      - This involves the conversion of the original (SQL) query into some internal representation more suitable for machine manipulation.
      - The internal representation typically chosen is either some kind of 'abstract syntax tree', or a relational algebra 'query tree'.
  • 9. Relational Algebra Query Trees
      A Relational Algebra query can be represented as a 'query tree'. For example, the query to list all the managers that work in the sales department could be described as one of the following:
        σ_{job = 'Manager' ∧ name = 'Sales' ∧ emp.deptno = dept.deptno}(EMP X DEPT)
      Query tree:
        Root: σ_{job = 'Manager' ∧ name = 'Sales' ∧ emp.deptno = dept.deptno}
        Intermediate operations: X
        Leaves: EMP, DEPT
  • 11. Relational Algebra Query Trees
      Alternative 'query tree' for the query to list all the managers that work in the sales department:
        σ_{job = 'Manager' ∧ name = 'Sales'}(EMP ⋈_{emp.deptno = dept.deptno} DEPT)
      Query tree:
        Root: σ_{job = 'Manager' ∧ name = 'Sales'}
        Intermediate operations: ⋈_{emp.deptno = dept.deptno}
        Leaves: EMP, DEPT
  • 12. Relational Algebra Query Trees
      Alternative 'query tree' for the query to list all the managers that work in the sales department:
        (σ_{job = 'Manager'}(EMP)) ⋈_{emp.deptno = dept.deptno} (σ_{name = 'Sales'}(DEPT))
      Query tree:
        Root: ⋈_{emp.deptno = dept.deptno}
        Intermediate operations: σ_{job = 'Manager'} (above EMP), σ_{name = 'Sales'} (above DEPT)
        Leaves: EMP, DEPT
  • 13. Query Processing Stage - 2
      - Convert to canonical form
      - Find a more 'efficient' representation of the query by converting the internal representation into some equivalent (canonical) form through the application of a set of well-defined 'transformation rules'.
      - The set of transformation rules to apply will generally be the result of the application of specific heuristic processing strategies associated with particular DBMSs.
  • 14. Transformation Rules for RA Operations
      1. Conjunctive selection operations can cascade into individual selection operations (and vice versa). Sometimes referred to as cascade of selection.
        σ_{p ∧ q ∧ r}(R) = σ_p(σ_q(σ_r(R)))
      Example:
        σ_{deptno=10 ∧ sal>1000}(Emp) = σ_{deptno=10}(σ_{sal>1000}(Emp))
  • 15. Transformation Rules for RA Operations
      2. Commutativity of selection
        σ_p(σ_q(R)) = σ_q(σ_p(R))
      Example:
        σ_{sal>1000}(σ_{deptno=10}(Emp)) = σ_{deptno=10}(σ_{sal>1000}(Emp))
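Rules 1 and 2 can be checked on a small in-memory relation. This is a minimal sketch: the `select` helper and the four sample rows are hypothetical, chosen only so that the predicates `deptno = 10` and `sal > 1000` from the examples each filter something out.

```python
# A relation is modelled as a list of dict rows; select() plays the role of σ.
def select(pred, relation):
    """Return the rows of `relation` that satisfy `pred`."""
    return [row for row in relation if pred(row)]


# Hypothetical sample data for the Emp relation.
EMP = [
    {"name": "Smith", "deptno": 10, "sal": 800},
    {"name": "Jones", "deptno": 10, "sal": 2975},
    {"name": "Blake", "deptno": 30, "sal": 2850},
    {"name": "Allen", "deptno": 30, "sal": 900},
]


def in_dept_10(row):
    return row["deptno"] == 10


def high_salary(row):
    return row["sal"] > 1000


# Rule 1: a conjunctive selection equals a cascade of single selections.
conjunctive = select(lambda r: in_dept_10(r) and high_salary(r), EMP)
cascaded = select(in_dept_10, select(high_salary, EMP))

# Rule 2: the order of the cascaded selections does not matter.
swapped = select(high_salary, select(in_dept_10, EMP))

print(conjunctive == cascaded == swapped)
print([row["name"] for row in conjunctive])
```

The results are equal, but the amount of work is not: applying the more restrictive predicate first leaves fewer rows for the second selection to examine.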
  • 16. Transformation Rules for RA Operations
      3. In a sequence of projection operations, only the last (outermost) projection in the sequence is required.
        Π_L(Π_M(… Π_N(R))) = Π_L(R), where L ⊆ M ⊆ … ⊆ N
      Example:
        Π_{deptno}(Π_{deptno, name}(Dept)) = Π_{deptno}(Dept)
  • 17. Transformation Rules for RA Operations
      4. Commutativity of selection and projection.
        Π_{A1, …, Am}(σ_p(R)) = σ_p(Π_{A1, …, Am}(R)) where p ∈ {A1, A2, …, Am}
      Selection predicate (p) is only made up of projected attributes
      Example:
        Π_{name, job}(σ_{name='Smith'}(Emp)) = σ_{name='Smith'}(Π_{name, job}(Emp))
  • 18. Transformation Rules for RA Operations
      5. Commutativity of theta-join (and Cartesian product).
        R ⋈_p S = S ⋈_p R
        R X S = S X R
      Example:
        EMP ⋈_{emp.deptno = dept.deptno} DEPT = DEPT ⋈_{emp.deptno = dept.deptno} EMP
      NOTE: Theta-join is a generalisation of both the equi-join and natural-join
  • 19. Transformation Rules for RA Operations
      6. Commutativity of selection and theta-join (or Cartesian product).
        (σ_p(R)) ⋈_r S = σ_p(R ⋈_r S) where p ∈ {A1, A2, …, Am}
      Selection predicate (p) is only made up of attributes of R (in the example, the join attribute emp.deptno)
      Example:
        (σ_{emp.deptno=10}(EMP)) ⋈_{emp.deptno = dept.deptno} DEPT = σ_{emp.deptno=10}(EMP ⋈_{emp.deptno = dept.deptno} DEPT)
  • 20. Transformation Rules for RA Operations
      7. Commutativity of projection and theta-join (or Cartesian product).
        Π_L(R ⋈_r S) = (Π_{L1}(R)) ⋈_r (Π_{L2}(S))
      Project attributes L = L1 ∪ L2, where L1 are attributes of R, and L2 are attributes of S. L will also contain the join attributes
      Example:
        Π_{job, location, deptno}(EMP ⋈_{emp.deptno = dept.deptno} DEPT) = (Π_{job, deptno}(EMP)) ⋈_{emp.deptno = dept.deptno} (Π_{location, deptno}(DEPT))
  • 21. Transformation Rules for RA Operations
      8. Commutativity of union and intersection (but not set difference).
        R ∪ S = S ∪ R
        R ∩ S = S ∩ R
  • 22. Transformation Rules for RA Operations
      9. Commutativity of selection and set operations (union, intersection, and set difference).
        Union: σ_p(R ∪ S) = σ_p(R) ∪ σ_p(S)
        Intersection: σ_p(R ∩ S) = σ_p(R) ∩ σ_p(S)
        Set Difference: σ_p(R - S) = σ_p(R) - σ_p(S)
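Rule 9 can be checked with Python sets standing in for union-compatible relations. A minimal sketch with hypothetical tuples and a hypothetical predicate; it also shows that for set difference the operand order has to be preserved, since set difference is not commutative.

```python
# Relations as sets of (id, value) tuples; select() plays the role of σ_p.
def select(pred, relation):
    """Return the tuples of `relation` that satisfy `pred`."""
    return {t for t in relation if pred(t)}


# Hypothetical union-compatible relations.
R = {(1, 10), (2, 20), (3, 30), (4, 40)}
S = {(3, 30), (4, 40), (5, 50)}


def p(t):
    """Hypothetical predicate: value >= 20."""
    return t[1] >= 20


union_ok = select(p, R | S) == select(p, R) | select(p, S)
intersection_ok = select(p, R & S) == select(p, R) & select(p, S)
difference_ok = select(p, R - S) == select(p, R) - select(p, S)

# Swapping the operands of the difference gives a different relation.
swapped_difference = select(p, S) - select(p, R)

print(union_ok, intersection_ok, difference_ok)
print(select(p, R - S), swapped_difference)
```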
  • 23. Transformation Rules for RA Operations
      10. Commutativity of projection and union
        Π_L(R ∪ S) = Π_L(R) ∪ Π_L(S)
  • 24. Transformation Rules for RA Operations
      11. Associativity of natural join (and Cartesian product)
        Natural Join: (R ⋈ S) ⋈ T = R ⋈ (S ⋈ T)
        Cartesian Product: (R X S) X T = R X (S X T)
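Associativity of the natural join can be illustrated with a naive tuple-at-a-time join. This is only a sketch: the three relations (including the LOC relation and its `locid` and `city` attributes) are hypothetical, and `natural_join` is a teaching implementation, not how a DBMS evaluates joins.

```python
def natural_join(r, s):
    """Naive natural join: match rows on every attribute name they share."""
    result = []
    for a in r:
        for b in s:
            common = a.keys() & b.keys()
            if all(a[k] == b[k] for k in common):
                result.append({**a, **b})
    return result


def as_set(relation):
    """Order-insensitive view of a relation, for comparing join results."""
    return {frozenset(row.items()) for row in relation}


# Hypothetical sample relations.
EMP = [{"ename": "Smith", "deptno": 10}, {"ename": "Jones", "deptno": 20}]
DEPT = [
    {"deptno": 10, "locid": 1},
    {"deptno": 20, "locid": 2},
    {"deptno": 30, "locid": 1},
]
LOC = [{"locid": 1, "city": "Leeds"}, {"locid": 2, "city": "York"}]

left_first = natural_join(natural_join(EMP, DEPT), LOC)
right_first = natural_join(EMP, natural_join(DEPT, LOC))

print(as_set(left_first) == as_set(right_first))
```

Both groupings return the same tuples, but the intermediate results differ in size (two rows for EMP ⋈ DEPT, three for DEPT ⋈ LOC), which is what the optimiser exploits when it picks a join order.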
  • 25. Transformation Rules for RA Operations
      12. Associativity of union and intersection (but not set difference)
        Union: (R ∪ S) ∪ T = R ∪ (S ∪ T)
        Intersection: (R ∩ S) ∩ T = R ∩ (S ∩ T)
  • 26. Heuristic Processing Strategies
      - Perform selection operations as early as possible
      - Translate a Cartesian product and subsequent selection (whose predicate represents a join condition) into a join operation.
      - Use associativity of binary operations to ensure that the most restrictive selection operations are executed first
      - Perform projections as early as possible.
      - Compute common expressions once
  • 27. Heuristic Processing - Example
      [Figure: sequence of query trees. The initial tree, σ_{job = 'Manager' ∧ name = 'Sales' ∧ emp.deptno = dept.deptno} over EMP X DEPT, is first rewritten as σ_{job = 'Manager' ∧ name = 'Sales'} over EMP ⋈_{emp.deptno = dept.deptno} DEPT. The selections are then pushed below the join, giving the Optimised Canonical Query: (σ_{job = 'Manager'}(EMP)) ⋈_{emp.deptno = dept.deptno} (σ_{name = 'Sales'}(DEPT)).]
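The first two heuristics (select early, turn product plus join predicate into a join) can be sketched as a single rewrite over a tiny query tree. Everything here is an illustrative assumption: the node classes, the `optimise` function and the qualified attribute names are hypothetical, and a real optimiser applies many more rules recursively.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Pred:
    text: str          # human-readable predicate
    attrs: frozenset   # attributes the predicate refers to


@dataclass(frozen=True)
class Relation:
    name: str
    attrs: frozenset


@dataclass(frozen=True)
class Select:
    preds: tuple       # conjunction of Pred
    child: "Node"


@dataclass(frozen=True)
class Product:
    left: "Node"
    right: "Node"


@dataclass(frozen=True)
class Join:
    preds: tuple
    left: "Node"
    right: "Node"


Node = Union[Relation, Select, Product, Join]


def attrs_of(node):
    """Attributes available in the output of a node."""
    if isinstance(node, Relation):
        return node.attrs
    if isinstance(node, Select):
        return attrs_of(node.child)
    return attrs_of(node.left) | attrs_of(node.right)


def optimise(node):
    """Rewrite a Select over a Product into selections below a Join."""
    if not (isinstance(node, Select) and isinstance(node.child, Product)):
        return node
    left, right = node.child.left, node.child.right
    # Predicates that mention only one side can be pushed down to that side.
    left_only = tuple(p for p in node.preds if p.attrs <= attrs_of(left))
    right_only = tuple(
        p for p in node.preds
        if p not in left_only and p.attrs <= attrs_of(right)
    )
    # Whatever remains spans both sides and becomes the join condition.
    both = tuple(
        p for p in node.preds if p not in left_only and p not in right_only
    )
    new_left = Select(left_only, left) if left_only else left
    new_right = Select(right_only, right) if right_only else right
    if both:
        return Join(both, new_left, new_right)
    return Product(new_left, new_right)


EMP = Relation("EMP", frozenset({"emp.job", "emp.deptno"}))
DEPT = Relation("DEPT", frozenset({"dept.name", "dept.deptno"}))
job = Pred("emp.job = 'Manager'", frozenset({"emp.job"}))
name = Pred("dept.name = 'Sales'", frozenset({"dept.name"}))
on_deptno = Pred(
    "emp.deptno = dept.deptno", frozenset({"emp.deptno", "dept.deptno"})
)

initial = Select((job, name, on_deptno), Product(EMP, DEPT))
optimised = optimise(initial)
print(optimised)
```

Applied to the initial tree for the managers-in-sales query, the rewrite yields a join on deptno whose inputs are the two single-relation selections, matching the optimised canonical query in the example.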
  • 28. Query Processing Stage - 3
      - Choose candidate low-level procedures
      - Consider the (optimised canonical) query as a series of low-level operations (join, restrict, etc.).
      - For each of these operations generate alternative execution strategies and calculate the cost of such strategies on the basis of statistical information held about the database tables (files).
  • 29. Query Processing Stage - 4
      - Generate query plans and choose the cheapest
      - Construct a set of 'candidate' Query Execution Plans (QEPs).
      - Each QEP is constructed by selecting a candidate implementation procedure for each operation in the canonical query and then combining them to form a string of associated operations.
      - Each QEP will have an (estimated) cost associated with it: the sum of the cost of each of its operations.
      - Choose the QEP with the least cost.
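Stages 3 and 4 can be sketched as enumerating one procedure per operation and summing the estimates. The operation names, procedure names and cost figures below are hypothetical placeholders, and the sketch assumes the cost of one operation does not depend on the procedure chosen for another, which is a simplification.

```python
from itertools import product

# Hypothetical cost estimates (record I/Os) per candidate procedure.
CANDIDATES = {
    "select EMP": {"linear search": 1000, "secondary index": 60},
    "select DEPT": {"linear search": 50, "hash key": 5},
    "join": {"block nested loop": 300, "hash join": 55},
}


def enumerate_plans(candidates):
    """Yield (plan, cost) for every combination of one procedure per operation."""
    operations = list(candidates)
    options = (candidates[op].items() for op in operations)
    for choice in product(*options):
        plan = {op: procedure for op, (procedure, _) in zip(operations, choice)}
        cost = sum(op_cost for _, op_cost in choice)
        yield plan, cost


best_plan, best_cost = min(enumerate_plans(CANDIDATES), key=lambda pc: pc[1])

print(best_plan)
print(best_cost)
```

With three operations and two procedures each there are only eight plans; the number grows multiplicatively with each extra operation, which is why an optimiser may cost only some of the alternatives.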
  • 30. Cost Based Optimisation
      - Cost Based Optimisation (stages 3 & 4)
      - A good declarative query optimiser does not rely solely on heuristic processing strategies.
      - It chooses the QEP with the lowest estimated cost.
      - After heuristic rules are applied to a query, there still remain a number of alternative ways to execute it.
      - The Query Optimiser estimates the cost of executing each one (or at least a number) of these alternatives, and selects the cheapest one.
  • 31. Costs associated with query execution
      - Secondary storage access costs:
          - Searching for data blocks on disk
          - Reading data blocks from disk
          - Writing data blocks to disk
      - Storage costs
          - Cost of storing intermediate (temp) files
      - Computation costs
          - Cost of CPU usage
      - Main memory usage costs
          - Cost of buffering data
      - Communication costs
          - Cost of moving data across
  • 32. Database statistics used in cost estimation
      Information held on each relation:
      - number of tuples
      - number of blocks
      - blocking factor
      - primary access method
      - primary access attributes
      - secondary indexes
      - secondary indexing attributes
      - number of levels for each index
      - number of distinct values of each attribute
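Several of these statistics are derived from one another. The sketch below assumes the usual textbook relationships, which the slides do not state: number of blocks = ceiling of tuples over blocking factor, and, under a uniform-distribution assumption, the expected number of tuples matching an equality condition = tuples over distinct values. The class name and the blocking factor of 20 are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class RelationStats:
    n_tuples: int            # number of tuples in the relation
    blocking_factor: int     # tuples that fit in one block
    n_distinct: dict = field(default_factory=dict)  # attribute -> distinct values

    @property
    def n_blocks(self) -> int:
        """Blocks needed to store the relation."""
        return math.ceil(self.n_tuples / self.blocking_factor)

    def selection_cardinality(self, attribute: str) -> float:
        """Expected tuples matching an equality condition (uniform assumption)."""
        return self.n_tuples / self.n_distinct[attribute]


# EMP has 1000 tuples spread over 50 departments; 20 tuples per block is assumed.
emp = RelationStats(n_tuples=1000, blocking_factor=20, n_distinct={"deptno": 50})

print(emp.n_blocks)
print(emp.selection_cardinality("deptno"))
```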
  • 33. Physical Data Structures - File Types
      - Heap (Sequential, Unordered)
          - no key columns
          - queries, other than appends, scan every page
          - rows are appended at the end
          - duplicate rows are allowed
      - Ordered
          - physically sorted data file with no index
      - Hash (Random, Direct)
          - data is located based on the (calculated) value of a hash field (key)
      - Indexed Sequential (ISAM)
          - sorted data file with a primary index
      - B+ Tree
          - dynamic multilevel index
          - reuses deleted space on associated data pages
  • 34. Strategies for implementing the RESTRICT operation
      Different access strategies are used, dependent upon the structure of the file in which the relation is stored, and whether the predicate attribute(s) have been indexed/hashed. Each uses a different cost algorithm (which refers to specific database statistics).
      - Linear Search (Heap)
      - Binary Search (Ordered)
      - Equality on Hash Key
      - Equality condition on primary key
      - Inequality condition on primary key
      - Equality condition on secondary index
      - Inequality condition on secondary B+ Tree index
      If the selection predicate is a composite (AND & OR) then there are additional cost considerations!
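The slides do not give the cost algorithms themselves. As a hedged sketch, the functions below use common textbook estimates in block accesses for three of the listed strategies: a linear search reads every block (about half of them on average for an equality match on a key), a binary search on an ordered file needs about log2 of the block count, and a hash-key lookup needs one access if there is no overflow. The function names and the 50-block file are assumptions.

```python
import math


def linear_search_cost(n_blocks: int, key_equality: bool = False) -> int:
    """Heap file: scan all blocks, or about half on average for a key match."""
    return math.ceil(n_blocks / 2) if key_equality else n_blocks


def binary_search_cost(n_blocks: int) -> int:
    """Ordered file, equality on the ordering key: about log2(blocks) accesses."""
    return max(1, math.ceil(math.log2(n_blocks)))


def hash_key_cost() -> int:
    """Hash file, equality on the hash key, assuming no overflow chains."""
    return 1


N_BLOCKS = 50  # hypothetical file size in blocks

estimates = {
    "linear search": linear_search_cost(N_BLOCKS),
    "linear search (key equality)": linear_search_cost(N_BLOCKS, key_equality=True),
    "binary search": binary_search_cost(N_BLOCKS),
    "equality on hash key": hash_key_cost(),
}

for strategy, blocks in estimates.items():
    print(f"{strategy}: {blocks} block accesses")
```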
  • 35. Strategies for implementing the JOIN operation
      Different access strategies are used, dependent upon the structure of the files in which the relations to be joined are stored, and whether the join attributes have been indexed/hashed. Each uses its own cost algorithm (which refers to specific database statistics).
      - Block nested loop join
      - Indexed nested loop join
      - Sort-merge join
      - Hash join
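Two of these strategies can be contrasted with a small in-memory sketch. Note the simplifications: the nested loop here works tuple by tuple rather than block by block, both functions ignore disk I/O entirely, and the sample rows and function names are hypothetical. The point is only that different procedures produce the same relation with different amounts of work.

```python
from collections import defaultdict


def nested_loop_join(r, s, r_key, s_key):
    """Compare every row of r with every row of s: |r| * |s| comparisons."""
    return [{**a, **b} for a in r for b in s if a[r_key] == b[s_key]]


def hash_join(r, s, r_key, s_key):
    """Build a hash table on s, then probe it once per row of r."""
    buckets = defaultdict(list)
    for b in s:
        buckets[b[s_key]].append(b)
    result = []
    for a in r:
        for b in buckets.get(a[r_key], []):
            result.append({**a, **b})
    return result


# Hypothetical sample data.
EMP = [
    {"ename": "Smith", "deptno": 10},
    {"ename": "Jones", "deptno": 10},
    {"ename": "Blake", "deptno": 30},
    {"ename": "Allen", "deptno": 40},
]
DEPT = [
    {"deptno": 10, "dname": "Sales"},
    {"deptno": 30, "dname": "Research"},
    {"deptno": 20, "dname": "Accounts"},
]

by_nested_loop = nested_loop_join(EMP, DEPT, "deptno", "deptno")
by_hash = hash_join(EMP, DEPT, "deptno", "deptno")

print(by_nested_loop == by_hash)
print(len(by_hash))
```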
  • 36. Query Optimisation Summary
      - The aims of query processing are to transform a query written in a high-level language (SQL) into a correct and efficient execution strategy expressed in a low-level language (Relational Algebra), and to execute the strategy to retrieve the required data.
      - There are many equivalent transformations of the same high-level query; the DBMS has to choose the one that minimises resource usage.
      - There are two main techniques for query optimisation. The first uses heuristic rules that order the operations in a query. The second compares different execution strategies for those operations, based on their relative costs, and selects the least resource-intensive (cheapest) ones.