SlideShare a Scribd company logo
Lecture 3/2 Query Processing
• T. Connolly, and C. Begg, “Database Systems: A Practical Approach to Design, Implementation, and Management”, 5th edition,
Addison-Wesley, 2009. ISBN: 0-321-60110-6, ISBN-13: 978-0-321-60110-0 (International Edition).
• T. Connolly, and C. Begg, “Database Systems: A Practical Approach to Design, Implementation, and Management”, 4th edition,
Addison-Wesley, 2004. ISBN: 0-321-21025-5.
• R. Elmasri and S. B. Navathe, “Fundamentals of Database Systems”, 5th ed., Pearson, 2007, ISBN: 0-321-41506-X.
ITM661 – Database Systems
Query Processing 2
Objectives
 Objectives of query processing and optimization.
 Static versus dynamic query optimization.
 How a query is decomposed and semantically analyzed.
 How to create a R.A.T. to represent a query.
 Rules of equivalence for RA operations.
 How to apply heuristic transformation rules to improve
efficiency of a query.
 How pipelining can be used to improve efficiency of
queries.
 Difference between materialization and pipelining.
 Advantages of left-deep trees.
(R.A.T = Relational Algebra Tree)
Query Processing 3
An Example (Branch and Staff Relations)
• A relation is a table with
columns and rows.
Only applies to logical structure of
the database, not the physical
structure.
• An attribute is a named
column of a relation.
• A domain is a set of
allowable values for one or
more attributes.
• A tuple is a row of a
relation.
• A degree is a number of
attributes in a relation.
• A cardinality is a number of
tuples in a relation.
• A relational database is a
collection of normalized
relations.
Query Processing 4
Relational Algebra and Calculus
 Relational algebra and relational calculus are formal
languages associated with the relational model.
Informally,
 Relational algebra is a (high-level) procedural
language and
 Relational calculus a non-procedural language.
 However, formally both are equivalent to one another.
 A language that produces a relation that can be derived
using relational calculus is relationally complete.
Query Processing 5
Relational Algebra
 There are 5 basic operations, in relational algebra, that performs
most of the data retrieval operations needed.
 Selection
 Projection
 Cartesian Product
 Union
 Set Difference
 Also operations that can be expressed by 5 basic operations.
 Join
 Intersection
 Division
Query Processing 6
An Example (Home Rental Database)
Query Processing 7
An Example (Home Rental Database)
Query Processing 8
An Example (Home Rental Database)
Query Processing 9
Query Processing
 Activities involved in retrieving data from the
database.
 Aims of QP:
 transform query written in high-level language (e.g.
SQL), into correct and efficient execution strategy
expressed in low-level language (implementing
Relational Algebra);
 execute the strategy to retrieve required data.
Query Processing 10
Query Optimization
 Activity of choosing an efficient execution
strategy for processing query.
 As there are many equivalent transformations of
same high-level query, aim of QO is to choose
one that minimizes resource usage.
 Generally, reduce total execution time of query.
 May also reduce response time of query.
 Problem computationally intractable with large
number of relations, so strategy adopted is
reduced to finding near optimum solution.
Query Processing 11
Different Strategies
 Find all Managers that work at a London branch.
SELECT *
FROM Staff s, Branch b
WHERE s.branchNo = b.branchNo AND
(s.position = ‘Manager’ AND b.city = ‘London’);
 Three equivalent RA queries are:
(1) s(position='Manager’) (city='London')  (staff.bno=branch.bno) (Staff  Branch)
(2) s(position='Manager')  (city='London') ( Staff |><| staff.bno=branch.bno Branch)
(3) (sposition='Manager’(Staff)) |><| staff.bno=branch.bno (s city='London' (Branch))
(1) s(position='Manager’) (city='London')  (staff.bno=branch.bno) (Staff  Branch)
(2) s(position='Manager')  (city='London') ( Staff staff.bno=branch.bno Branch)
(3) (sposition='Manager’(Staff)) staff.bno=branch.bno (s city='London' (Branch))
Staff Branch

s s(position='Manager’) (city='London')  (staff.bno=branch.bno)
1000 50
50000
epsilon
12
Query Processing
(1) s(position='Manager’) (city='London')  (staff.bno=branch.bno) (Staff  Branch)
(2) s(position='Manager')  (city='London') ( Staff staff.bno=branch.bno Branch)
(3) (sposition='Manager’(Staff)) staff.bno=branch.bno (s city='London' (Branch))
Staff Branch
s s(position='Manager’) (city='London')
1000 50
epsilon
1000
13
Query Processing
(1) s(position='Manager’) (city='London')  (staff.bno=branch.bno) (Staff  Branch)
(2) s(position='Manager')  (city='London') ( Staff staff.bno=branch.bno Branch)
(3) (sposition='Manager’(Staff)) staff.bno=branch.bno (s city='London' (Branch))
Staff Branch
s
1000 50
epsilon
1000
s
50
position='Manager’
city='London'
50 5
14
Query Processing
Query Processing 15
Cost Comparison in Different Strategies
 Assume:
 1000 tuples in Staff; 50 tuples in Branch;
 50 Managers; 5 London branches;
 No indexes or sort keys;
 Results of any intermediate operations stored on disk;
 Cost of the final write is ignored;
 Tuples are accessed one at a time.
(1) s(position='Manager’) (city='London')  (staff.bno=branch.bno) (Staff  Branch)
(2) s(position='Manager')  (city='London') ( Staff |><| staff.bno=branch.bno Branch)
(3) (sposition='Manager’(Staff)) |><| staff.bno=branch.bno (s city='London' (Branch))
(1) (1000 + 50) + 2*(1000 * 50) = 101,050
(2) 2*1000 + (1000 + 50) = 3,050
(3) 1000 + 2*50 + 50 + 2*(5) = 1,160
Cartesian product and
join operations are
much more expensive
than selection
Query Processing
16
4 Phases of Query Processing
(1) decomposition
(parsing and
validation)
(2) optimization
(3) code generation
(4) execution.
Query Processing 17
Dynamic versus Static Optimization
 Two choices when first three phases of QP can be
carried out:
1. dynamically every time query is run.
 Advantages if dynamic QO arise from fact that
information is up-to-date.
 Disadvantages are that performance of query is
affected, time may limit finding optimum strategy.
2. Statically when query is first submitted.
 Advantages of static QO are removal of runtime
overhead  more time to find optimum strategy.
 Disadvantages arise from fact that chosen execution
strategy may no longer be optimal when query is
run.
 Could use a hybrid approach to overcome this.
Query Processing 18
Query Decomposition
 Aims are to transform high-level query into
RA query and check that query is syntactically
and semantically correct.
 Typical stages are:
 analysis
 normalization
 semantic analysis
 simplification
 query restructuring
Query Processing 19
Analysis (I)
 Analyze query lexically and syntactically using compiler
techniques.
 Verify relations and attributes exist.
 Verify operations are appropriate for object type.
 EX: SELECT staff_no
FROM Staff
WHERE position > 10;
 This query would be rejected on two grounds:
 Staff_no is not defined for Staff relation (should be
StaffNo).
 Comparison ‘>10’ is incompatible with type ‘position’,
which is variable character string.
Query Processing 20
Analysis (II)
 Finally, query transformed into some internal
representation more suitable for processing.
 Some kind of query tree is typically chosen,
constructed as follows:
 Leaf node created for each base relation.
 Non-leaf node created for each intermediate
relation produced by RA operation.
 Root of tree represents query result.
 Sequence is directed from leaves to root.
Query Processing 21
Relational Algebra Tree
Query Processing 22
Normalization
 Converts query into a normalized form for easier
manipulation.
 Predicate can be converted into one of two forms:
 Conjunctive normal form:
(position = 'Manager'  salary > 20000)  (bno = 'B3')
 Disjunctive normal form:
(position = 'Manager'  bno = 'B3' ) 
(salary > 20000  bno = 'B3')
Query Processing 23
Semantic Analysis
 Rejects normalized queries that are incorrectly
formulated or contradictory.
 Query is incorrectly formulated if components do not
contribute to generation of result.
 Query is contradictory if its predicate cannot be
satisfied by any tuple.
 Algorithms to determine correctness exist only for
queries that do not contain disjunction and negation.
Query Processing 24
24
Semantic Analysis
 For these queries, could construct:
 A relation connection graph.
 Normalized attribute connection graph.
 Relation connection graph
 Create node for each relation and node for result.
Create edges between two nodes that represent a
join, and edges between nodes that represent
projection.
 If not connected, query is incorrectly formulated.
Query Processing 25
25
Normalized Attribute Connection Graph
 Create node for each reference to an attribute, or
constant 0.
 Create directed edge between nodes that represent a
join, and directed edge between attribute node and 0
node that represents selection.
 Weight edges a  b with value c, if it represents
inequality condition (a  b + c); weight edges 0  a
with -c, if it represents inequality condition (a  c).
 If graph has cycle for which valuation sum is
negative, query is contradictory.
Query Processing 26
Checking Semantic Correctness
SELECT p.propertyNo, p.street
FROM Client c, Viewing v, PropertyForRent p
WHERE c.clientNo = v.clientNo AND
c.maxRent >= 500 AND
c.prefType = 'Flat' AND p.ownerNo = 'CO93';
 Relation connection graph not fully connected, so query is not
correctly formulated. Omiting the join (v.propertyNo =
p.propertyNo).
SELECT p.propertyNo, p.street
FROM Client c, Viewing v, PropertyForRent p
WHERE c.maxRent > 500 AND
c.clientNo = v.clientNo AND
v.propertyNo = p.propertyNo AND
c.prefType = 'Flat' AND c.maxRent < 200;
 Normalized attribute connection graph has cycle between
nodes R.maxRent and 0 with negative valuation sum, so query
is contradictory
Query Processing 27
Checking Semantic Correctness
Relation
Connection graph
Normalized
attribute
connection graph
Query Processing 28
Simplification
 Simplification strategy:
 Detects redundant qualifications,
 Eliminates common sub-expressions,
 Transforms query to semantically equivalent but more
easily and efficiently computed form.
 Typically, access restrictions, view definitions, and integrity
constraints are considered.
 Assuming user has appropriate access privileges, first apply
well-known idempotency rules of Boolean algebra.
 Ex.: p  p equivalent to p
p  true equivalent to true
 Next step is Query restructuring
Query Processing 29
Transformation Rules for RA Operation (I)
 Conjunctive selection operations can cascade into individual
selection operations (vice versa).
sp  q  r(R) = sp(sq(sr(R)))
 Sometimes referred to as cascade of selection.
sbranchNo='B003‘  salary>15000(Staff) = sbranchNo='B003'(ssalary>15000(Staff))
 Commutativity of selection.
sp(sq(R)) = sq(sp(R))
 For example:
sbranchNo='B003'(ssalary>15000(Staff)) = ssalary>15000(sbranchNo='B003'(Staff))
Query Processing 30
Transformation Rules for RA Operation (II)
 In a sequence of projection operations, only the last in the
sequence is required.
PLPM … PN(R) = PL (R)
 For example:
PlNamePlName, branchNo(Staff) = PlName (Staff)
 Commutativity of selection and projection.
PAi, …, Am(sp(R)) = sp(PAi, …, Am(R)) where p {A1, A2, …, Am}
 For example:
PfName,lName(slName='Beech'(Staff)) = slName='Beech'(PfName,lName(Staff))
Query Processing 31
Transformation Rules for RA Operation (III)
 Commutativity of q-join (and Cartesian product).
R p S = S p R
R  S = S  R
 Rule also applies to equi-join and natural join. For example:
Staff staff.branchNo=branch.branchNo Branch =
Branch staff.branchNo=branch.branchNo Staff
Query Processing 32
Transformation Rules for RA Operation (IV)
 Commutativity of selection and theta-join (or Cartesian
product).
 If selection predicate involves only attributes of one of join
relations, selection and join (or Cartesian product) operations
commute:
sp(R r S) = (sp(R)) r S
sp(R  S) = (sp(R))  S
where p is a constraint on attributes of R
Query Processing 33
Transformation Rules for RA Operation (V)
 If selection predicate is conjunctive predicate having form (p
ู q), where p only involves attributes of R, and q only
attributes of S, selection and theta-join operations commute as:
sp  q(R r S) = (sp(R)) r (sq(S))
sp  q(R  S) = (sp(R))  (sq(S))
 For example:
sposition='Manager’  city='London'(Staff staff.branchNo=branch.branchNo Branch) =
(sposition='Manager'(Staff)) staff.branchNo=branch.branchNo (scity='London' (Branch))
Query Processing 34
Transformation Rules for RA Operation (VI)
 Commutativity of projection and theta-join
(or Cartesian product).
 If projection list is of form L = L1  L2, where L1 only
involves attributes of R, and L2 only involves attributes of S,
provided join condition only contains attributes of L,
projection and theta-join operations commute as:
PL1 L2(R r S) = (PL1(R)) r (PL2(S))
Query Processing 35
Transformation Rules for RA Operation (VII)
 If join condition contains additional attributes not in L, say
attributes M = M1  M2 where M1 only involves attributes of
R, and M2 only involves attributes of S, a final projection op.
is required:
PL1  L2(R r S) = PL1  L2( (PL1  M1(R)) r (PL2  M2(S)))
 For example:
Pposition, city, branchNo(Staff staff.branchNo=branch.branchNo Branch) =
(Pposition, branchNo(Staff)) staff.branchNo=branch.branchNo (Pcity, branchNo (Branch))
 and using the latter rule:
Pposition, city(Staff staff.branchNo=branch.branchNo Branch) =
Pposition, city ((Pposition, branchNo(Staff)) staff.branchNo=branch.branchNo ( Pcity, branchNo (Branch)))
Query Processing 36
Transformation Rules for RA Operation (VIII)
 Commutativity of union and intersection (but not set
difference).
R  S = S  R
R  S = S  R
 Commutativity of selection and set operations (union,
intersection, and set difference).
sp(R  S) = sp(S)  sp(R)
sp(R  S) = sp(S)  sp(R)
sp(R - S) = sp(S) - sp(R)
 Commutativity of projection and union.
PL(R  S) = PL(S)  PL(R)
 Associativity of union and intersection (but not set difference).
(R  S)  T = S  (R  T)
(R  S)  T = S  (R  T)
Query Processing 37
Transformation Rules for RA Operation (IX)
 Associativity of q-join (and Cartesian product).
 Cartesian product and natural join are always associative:
(R S) T = R (S T)
(R  S)  T = R  (S  T)
 If join condition q involves attributes only from S and T, then
theta-join is associative as follows:
(R p S) q  r T = R p  r (S q T)
 For example:
((Staff Staff.staffNo=PropertyForRent.staffNo PropertyForRent)
PropertyForRent.ownerNo=owner.ownerNo  staff.lName=owner.lName Owner)
=
((Staff Staff.staffNo=PropertyForRent.staffNo  Staff.lName=Owner.lName (PropertyForRent
PropertyForRent.ownerNo = Owner.ownerNo Owner)
Query Processing 38
Use of Transformation Rules (I)
 For prospective renters of flats, find properties that match
requirements and owned by CO93.
SELECT p.propertyNo, p.street
FROM Client c, Viewing v, PropertyForRent p
WHERE c.prefType = ‘Flat’ AND
c.clientNo = v.clientNo AND
v.propertyNo = p.propertyNo AND
c.maxRent >= p.rent AND
c.prefType = p.type AND
p.ownerNo = ‘CO93’;
Query Processing 39
Use of Transformation Rules (II)
SELECT p.propertyNo, p.street
FROM Client c, Viewing v, PropertyForRent p
WHERE c.prefType = ‘Flat’ AND
c.clientNo = v.clientNo AND
v.propertyNo = p.propertyNo AND
c.maxRent >= p.rent AND
c.prefType = p.type AND p.ownerNo = ‘CO93’;
Query Processing 40
Use of Transformation Rules (III)
SELECT p.propertyNo, p.street
FROM Client c, Viewing v, PropertyForRent p
WHERE c.prefType = ‘Flat’ AND
c.clientNo = v.clientNo AND
v.propertyNo = p.propertyNo AND
c.maxRent >= p.rent AND
c.prefType = p.type AND p.ownerNo = ‘CO93’;
Query Processing 41
Use of Transformation Rules (IV)
SELECT p.propertyNo, p.street
FROM Client c, Viewing v, PropertyForRent p
WHERE c.prefType = ‘Flat’ AND
c.clientNo = v.clientNo AND
v.propertyNo = p.propertyNo AND
c.maxRent >= p.rent AND
c.prefType = p.type AND p.ownerNo = ‘CO93’;
Query Processing 42
Heuristical Processing Strategies
 Perform selection operations as early as possible.
 Keep predicates on same relation together.
 Combine Cartesian product with subsequent selection whose
predicate represents join condition into a join operation.
 Use associativity of binary operations to rearrange leaf nodes
so leaf nodes with most restrictive selection operations
executed first.
 Perform projection as early as possible.
 Keep projection attributes on same relation together.
 Compute common expressions once.
 If common expression appears more than once, and result
not too large, store result and reuse it when required.
 It is useful when querying views, as same expression is used
to construct view each time.
Query Processing 43
Pipelining
 Materialization - output of one operation is stored in
temporary relation for processing by next.
 Could also pipeline results of one operation to
another without creating temporary relation.
 Known as pipelining or on-the-fly processing.
 Pipelining can save on cost of creating temporary
relations and reading results back in again.
 Generally, pipeline is implemented as separate
process or thread.
Query Processing 44
Types of Trees
Query Processing 45
Pipelining
 With linear trees, relation on one side of each
operator is always a base relation.
 However, as need to examine entire inner
relation for each tuple of outer relation, inner
relations must always be materialized.
 This makes left-deep trees appealing as inner
relations are always base relations.
 Reduces search space for optimum strategy,
and allows QO to use dynamic processing.
 Not all execution strategies are considered.
Query Processing 46
Exercise I
 Assume that there are 1000 tuples in Student, 40 tuples in
Course, 5000 tuples in TakeCourse, 40 tuples in Lecturer,
25 internal lecturers, 30 three-credit courses, 8 IT lecturers
and 200 IT students.
 Assume that the following SQL statement is executed
SELECT *
FROM Lecturer L, Course C
WHERE C.LecturerID = L.LecturerID AND
L.InOrEx = ‘I’ AND C.Credit = 3
Compare the cost of disk accesses of three following equivalent relational
algebra queries corresponding to the above SQL statement.
(a) s(C.LecturerID = L.LecturerID)  (C.Credit = 3)  ( L.InOrEx = ‘I’ ) (Course X Lecturer)
(b) s(C.Credit = 3)  ( L.InOrEx = ‘I’ ) ( Course |X|C.LecturerID = L.LecturerID Lecturer)
(c) (s(C.Credit = 3) (Course)) |X|C.LecturerID = L.LecturerID (s( L.InOrEx = ‘I’ ) (Lecturer))
Query Processing 47
Exercise II
Given the following SQL query, illustrate relational algebra trees based
on heuristical processing strategies.
SELECT S.SFName, S.SLName, L.LFName
FROM Student S, Course C, Lecturer L, TakeCourse T
WHERE C.LecturerID = L.LecturerID AND
C.CourseID = T.CourseID AND
S.StudentID = T.StudentID AND
C.Credit = 3 AND S.DeptID = ‘IT’ AND
L.LDeptID = S.DeptID;
Hint: The heuristics are
(1) Perform selection operations as early as possible
(2) Combine the Cartesian product with a subsequent selection operation whose predicate represents
a join condition into a join operation.
(3) Use associativity of binary operations to rearrange leaf nodes so that the leaf nodes with the most
restrictive selection operations are executed first.
(4) Perform projection as early as possible
(5) Compute common expressions once.

More Related Content

Similar to itm661-lecture0VBBBBBBBBBBBBBBM3-part2-2015.pdf

700442110-advanced database Ch-2-Query-Process.pptx
700442110-advanced database Ch-2-Query-Process.pptx700442110-advanced database Ch-2-Query-Process.pptx
700442110-advanced database Ch-2-Query-Process.pptxtasheebedane
 
A Case Elaboration Methodology for a Semantic Web Service Discovery System Ba...
A Case Elaboration Methodology for a Semantic Web Service Discovery System Ba...A Case Elaboration Methodology for a Semantic Web Service Discovery System Ba...
A Case Elaboration Methodology for a Semantic Web Service Discovery System Ba...IJERA Editor
 
Query optimization and processing for advanced database systems
Query optimization and processing for advanced database systemsQuery optimization and processing for advanced database systems
Query optimization and processing for advanced database systemsmeharikiros2
 
RBHF_SDM_2011_Jie
RBHF_SDM_2011_JieRBHF_SDM_2011_Jie
RBHF_SDM_2011_JieMDO_Lab
 
IRJET- Comparative Study of PCA, KPCA, KFA and LDA Algorithms for Face Re...
IRJET-  	  Comparative Study of PCA, KPCA, KFA and LDA Algorithms for Face Re...IRJET-  	  Comparative Study of PCA, KPCA, KFA and LDA Algorithms for Face Re...
IRJET- Comparative Study of PCA, KPCA, KFA and LDA Algorithms for Face Re...IRJET Journal
 
Lecture 5.pptx
Lecture 5.pptxLecture 5.pptx
Lecture 5.pptxShafii8
 
Query decomposition in data base
Query decomposition in data baseQuery decomposition in data base
Query decomposition in data baseSalman Memon
 
Data_Structure_and_Algorithms_Using_C++ _ Nho Vĩnh Share.pdf
Data_Structure_and_Algorithms_Using_C++ _ Nho Vĩnh Share.pdfData_Structure_and_Algorithms_Using_C++ _ Nho Vĩnh Share.pdf
Data_Structure_and_Algorithms_Using_C++ _ Nho Vĩnh Share.pdfNho Vĩnh
 
Alexandria ACM Student Chapter | Specification & Verification of Data-Centric...
Alexandria ACM Student Chapter | Specification & Verification of Data-Centric...Alexandria ACM Student Chapter | Specification & Verification of Data-Centric...
Alexandria ACM Student Chapter | Specification & Verification of Data-Centric...AlexACMSC
 
EFFICIENTLY PROCESSING OF TOP-K TYPICALITY QUERY FOR STRUCTURED DATA
EFFICIENTLY PROCESSING OF TOP-K TYPICALITY QUERY FOR STRUCTURED DATAEFFICIENTLY PROCESSING OF TOP-K TYPICALITY QUERY FOR STRUCTURED DATA
EFFICIENTLY PROCESSING OF TOP-K TYPICALITY QUERY FOR STRUCTURED DATAcsandit
 
Fractal analysis of good programming style
Fractal analysis of good programming styleFractal analysis of good programming style
Fractal analysis of good programming stylecsandit
 
FRACTAL ANALYSIS OF GOOD PROGRAMMING STYLE
FRACTAL ANALYSIS OF GOOD PROGRAMMING STYLEFRACTAL ANALYSIS OF GOOD PROGRAMMING STYLE
FRACTAL ANALYSIS OF GOOD PROGRAMMING STYLEcscpconf
 
Modeling Search Computing Applications
Modeling Search Computing ApplicationsModeling Search Computing Applications
Modeling Search Computing ApplicationsMarco Brambilla
 
[RecSys 2014] Deviation-Based and Similarity-Based Contextual SLIM Recommenda...
[RecSys 2014] Deviation-Based and Similarity-Based Contextual SLIM Recommenda...[RecSys 2014] Deviation-Based and Similarity-Based Contextual SLIM Recommenda...
[RecSys 2014] Deviation-Based and Similarity-Based Contextual SLIM Recommenda...YONG ZHENG
 
1440 track 2 boire_using our laptop
1440 track 2 boire_using our laptop1440 track 2 boire_using our laptop
1440 track 2 boire_using our laptopRising Media, Inc.
 
Design dbms
Design dbmsDesign dbms
Design dbmsIIITA
 
Predicting query performance and explaining results to assist Linked Data con...
Predicting query performance and explaining results to assist Linked Data con...Predicting query performance and explaining results to assist Linked Data con...
Predicting query performance and explaining results to assist Linked Data con...Rakebul Hasan
 

Similar to itm661-lecture0VBBBBBBBBBBBBBBM3-part2-2015.pdf (20)

700442110-advanced database Ch-2-Query-Process.pptx
700442110-advanced database Ch-2-Query-Process.pptx700442110-advanced database Ch-2-Query-Process.pptx
700442110-advanced database Ch-2-Query-Process.pptx
 
A Case Elaboration Methodology for a Semantic Web Service Discovery System Ba...
A Case Elaboration Methodology for a Semantic Web Service Discovery System Ba...A Case Elaboration Methodology for a Semantic Web Service Discovery System Ba...
A Case Elaboration Methodology for a Semantic Web Service Discovery System Ba...
 
Query optimization and processing for advanced database systems
Query optimization and processing for advanced database systemsQuery optimization and processing for advanced database systems
Query optimization and processing for advanced database systems
 
RBHF_SDM_2011_Jie
RBHF_SDM_2011_JieRBHF_SDM_2011_Jie
RBHF_SDM_2011_Jie
 
Query trees
Query treesQuery trees
Query trees
 
IRJET- Comparative Study of PCA, KPCA, KFA and LDA Algorithms for Face Re...
IRJET-  	  Comparative Study of PCA, KPCA, KFA and LDA Algorithms for Face Re...IRJET-  	  Comparative Study of PCA, KPCA, KFA and LDA Algorithms for Face Re...
IRJET- Comparative Study of PCA, KPCA, KFA and LDA Algorithms for Face Re...
 
Lecture 5.pptx
Lecture 5.pptxLecture 5.pptx
Lecture 5.pptx
 
Query decomposition in data base
Query decomposition in data baseQuery decomposition in data base
Query decomposition in data base
 
Data_Structure_and_Algorithms_Using_C++ _ Nho Vĩnh Share.pdf
Data_Structure_and_Algorithms_Using_C++ _ Nho Vĩnh Share.pdfData_Structure_and_Algorithms_Using_C++ _ Nho Vĩnh Share.pdf
Data_Structure_and_Algorithms_Using_C++ _ Nho Vĩnh Share.pdf
 
Alexandria ACM Student Chapter | Specification & Verification of Data-Centric...
Alexandria ACM Student Chapter | Specification & Verification of Data-Centric...Alexandria ACM Student Chapter | Specification & Verification of Data-Centric...
Alexandria ACM Student Chapter | Specification & Verification of Data-Centric...
 
EFFICIENTLY PROCESSING OF TOP-K TYPICALITY QUERY FOR STRUCTURED DATA
EFFICIENTLY PROCESSING OF TOP-K TYPICALITY QUERY FOR STRUCTURED DATAEFFICIENTLY PROCESSING OF TOP-K TYPICALITY QUERY FOR STRUCTURED DATA
EFFICIENTLY PROCESSING OF TOP-K TYPICALITY QUERY FOR STRUCTURED DATA
 
Predictive modeling
Predictive modelingPredictive modeling
Predictive modeling
 
Fractal analysis of good programming style
Fractal analysis of good programming styleFractal analysis of good programming style
Fractal analysis of good programming style
 
FRACTAL ANALYSIS OF GOOD PROGRAMMING STYLE
FRACTAL ANALYSIS OF GOOD PROGRAMMING STYLEFRACTAL ANALYSIS OF GOOD PROGRAMMING STYLE
FRACTAL ANALYSIS OF GOOD PROGRAMMING STYLE
 
Modeling Search Computing Applications
Modeling Search Computing ApplicationsModeling Search Computing Applications
Modeling Search Computing Applications
 
[RecSys 2014] Deviation-Based and Similarity-Based Contextual SLIM Recommenda...
[RecSys 2014] Deviation-Based and Similarity-Based Contextual SLIM Recommenda...[RecSys 2014] Deviation-Based and Similarity-Based Contextual SLIM Recommenda...
[RecSys 2014] Deviation-Based and Similarity-Based Contextual SLIM Recommenda...
 
E05312426
E05312426E05312426
E05312426
 
1440 track 2 boire_using our laptop
1440 track 2 boire_using our laptop1440 track 2 boire_using our laptop
1440 track 2 boire_using our laptop
 
Design dbms
Design dbmsDesign dbms
Design dbms
 
Predicting query performance and explaining results to assist Linked Data con...
Predicting query performance and explaining results to assist Linked Data con...Predicting query performance and explaining results to assist Linked Data con...
Predicting query performance and explaining results to assist Linked Data con...
 

More from beshahashenafe20

Transaction Management, Concurrency Control and Deadlocks.pdf
Transaction Management, Concurrency Control and Deadlocks.pdfTransaction Management, Concurrency Control and Deadlocks.pdf
Transaction Management, Concurrency Control and Deadlocks.pdfbeshahashenafe20
 
Architecture Database_Unit5_Transaction Models and Concurrency Control.pdf
Architecture Database_Unit5_Transaction Models and Concurrency Control.pdfArchitecture Database_Unit5_Transaction Models and Concurrency Control.pdf
Architecture Database_Unit5_Transaction Models and Concurrency Control.pdfbeshahashenafe20
 
Coronel_PPT_hdgshaahaakjhfakjdhfajkdCh10.pdf
Coronel_PPT_hdgshaahaakjhfakjdhfajkdCh10.pdfCoronel_PPT_hdgshaahaakjhfakjdhfajkdCh10.pdf
Coronel_PPT_hdgshaahaakjhfakjdhfajkdCh10.pdfbeshahashenafe20
 
DSA chapter 4.pptxhdjaaaaaadjhsssssssssssssssssssssssssss
DSA chapter 4.pptxhdjaaaaaadjhsssssssssssssssssssssssssssDSA chapter 4.pptxhdjaaaaaadjhsssssssssssssssssssssssssss
DSA chapter 4.pptxhdjaaaaaadjhsssssssssssssssssssssssssssbeshahashenafe20
 
Chapter Seven - .pptbhhhdfhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhd
Chapter Seven - .pptbhhhdfhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhdChapter Seven - .pptbhhhdfhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhd
Chapter Seven - .pptbhhhdfhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhdbeshahashenafe20
 
Chapter Five.ppthhjhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
Chapter Five.ppthhjhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhChapter Five.ppthhjhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
Chapter Five.ppthhjhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhbeshahashenafe20
 
02 Text Operatiohhfdhjghdfshjgkhjdfjhglkdfjhgiuyihjufidhcun.pdf
02 Text Operatiohhfdhjghdfshjgkhjdfjhglkdfjhgiuyihjufidhcun.pdf02 Text Operatiohhfdhjghdfshjgkhjdfjhglkdfjhgiuyihjufidhcun.pdf
02 Text Operatiohhfdhjghdfshjgkhjdfjhglkdfjhgiuyihjufidhcun.pdfbeshahashenafe20
 

More from beshahashenafe20 (7)

Transaction Management, Concurrency Control and Deadlocks.pdf
Transaction Management, Concurrency Control and Deadlocks.pdfTransaction Management, Concurrency Control and Deadlocks.pdf
Transaction Management, Concurrency Control and Deadlocks.pdf
 
Architecture Database_Unit5_Transaction Models and Concurrency Control.pdf
Architecture Database_Unit5_Transaction Models and Concurrency Control.pdfArchitecture Database_Unit5_Transaction Models and Concurrency Control.pdf
Architecture Database_Unit5_Transaction Models and Concurrency Control.pdf
 
Coronel_PPT_hdgshaahaakjhfakjdhfajkdCh10.pdf
Coronel_PPT_hdgshaahaakjhfakjdhfajkdCh10.pdfCoronel_PPT_hdgshaahaakjhfakjdhfajkdCh10.pdf
Coronel_PPT_hdgshaahaakjhfakjdhfajkdCh10.pdf
 
DSA chapter 4.pptxhdjaaaaaadjhsssssssssssssssssssssssssss
DSA chapter 4.pptxhdjaaaaaadjhsssssssssssssssssssssssssssDSA chapter 4.pptxhdjaaaaaadjhsssssssssssssssssssssssssss
DSA chapter 4.pptxhdjaaaaaadjhsssssssssssssssssssssssssss
 
Chapter Seven - .pptbhhhdfhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhd
Chapter Seven - .pptbhhhdfhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhdChapter Seven - .pptbhhhdfhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhd
Chapter Seven - .pptbhhhdfhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhd
 
Chapter Five.ppthhjhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
Chapter Five.ppthhjhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhChapter Five.ppthhjhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
Chapter Five.ppthhjhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
 
02 Text Operatiohhfdhjghdfshjgkhjdfjhglkdfjhgiuyihjufidhcun.pdf
02 Text Operatiohhfdhjghdfshjgkhjdfjhglkdfjhgiuyihjufidhcun.pdf02 Text Operatiohhfdhjghdfshjgkhjdfjhglkdfjhgiuyihjufidhcun.pdf
02 Text Operatiohhfdhjghdfshjgkhjdfjhglkdfjhgiuyihjufidhcun.pdf
 

Recently uploaded

Instructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptxInstructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptxJheel Barad
 
INU_CAPSTONEDESIGN_비밀번호486_업로드용 발표자료.pdf
INU_CAPSTONEDESIGN_비밀번호486_업로드용 발표자료.pdfINU_CAPSTONEDESIGN_비밀번호486_업로드용 발표자료.pdf
INU_CAPSTONEDESIGN_비밀번호486_업로드용 발표자료.pdfbu07226
 
The basics of sentences session 4pptx.pptx
The basics of sentences session 4pptx.pptxThe basics of sentences session 4pptx.pptx
The basics of sentences session 4pptx.pptxheathfieldcps1
 
2024_Student Session 2_ Set Plan Preparation.pptx
2024_Student Session 2_ Set Plan Preparation.pptx2024_Student Session 2_ Set Plan Preparation.pptx
2024_Student Session 2_ Set Plan Preparation.pptxmansk2
 
How to Split Bills in the Odoo 17 POS Module
How to Split Bills in the Odoo 17 POS ModuleHow to Split Bills in the Odoo 17 POS Module
How to Split Bills in the Odoo 17 POS ModuleCeline George
 
Industrial Training Report- AKTU Industrial Training Report
Industrial Training Report- AKTU Industrial Training ReportIndustrial Training Report- AKTU Industrial Training Report
Industrial Training Report- AKTU Industrial Training ReportAvinash Rai
 
How to Manage Notification Preferences in the Odoo 17
How to Manage Notification Preferences in the Odoo 17How to Manage Notification Preferences in the Odoo 17
How to Manage Notification Preferences in the Odoo 17Celine George
 
Application of Matrices in real life. Presentation on application of matrices
Application of Matrices in real life. Presentation on application of matricesApplication of Matrices in real life. Presentation on application of matrices
Application of Matrices in real life. Presentation on application of matricesRased Khan
 
size separation d pharm 1st year pharmaceutics
size separation d pharm 1st year pharmaceuticssize separation d pharm 1st year pharmaceutics
size separation d pharm 1st year pharmaceuticspragatimahajan3
 
ppt your views.ppt your views of your college in your eyes
ppt your views.ppt your views of your college in your eyesppt your views.ppt your views of your college in your eyes
ppt your views.ppt your views of your college in your eyesashishpaul799
 
Telling Your Story_ Simple Steps to Build Your Nonprofit's Brand Webinar.pdf
Telling Your Story_ Simple Steps to Build Your Nonprofit's Brand Webinar.pdfTelling Your Story_ Simple Steps to Build Your Nonprofit's Brand Webinar.pdf
Telling Your Story_ Simple Steps to Build Your Nonprofit's Brand Webinar.pdfTechSoup
 
Basic Civil Engg Notes_Chapter-6_Environment Pollution & Engineering
Basic Civil Engg Notes_Chapter-6_Environment Pollution & EngineeringBasic Civil Engg Notes_Chapter-6_Environment Pollution & Engineering
Basic Civil Engg Notes_Chapter-6_Environment Pollution & EngineeringDenish Jangid
 
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptxMARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptxbennyroshan06
 
An Overview of the Odoo 17 Discuss App.pptx
An Overview of the Odoo 17 Discuss App.pptxAn Overview of the Odoo 17 Discuss App.pptx
An Overview of the Odoo 17 Discuss App.pptxCeline George
 
slides CapTechTalks Webinar May 2024 Alexander Perry.pptx
slides CapTechTalks Webinar May 2024 Alexander Perry.pptxslides CapTechTalks Webinar May 2024 Alexander Perry.pptx
slides CapTechTalks Webinar May 2024 Alexander Perry.pptxCapitolTechU
 
Open Educational Resources Primer PowerPoint
Open Educational Resources Primer PowerPointOpen Educational Resources Primer PowerPoint
Open Educational Resources Primer PowerPointELaRue0
 
The impact of social media on mental health and well-being has been a topic o...
The impact of social media on mental health and well-being has been a topic o...The impact of social media on mental health and well-being has been a topic o...
The impact of social media on mental health and well-being has been a topic o...sanghavirahi2
 
Advances in production technology of Grapes.pdf
Advances in production technology of Grapes.pdfAdvances in production technology of Grapes.pdf
Advances in production technology of Grapes.pdfDr. M. Kumaresan Hort.
 

Recently uploaded (20)

Instructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptxInstructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptx
 
Operations Management - Book1.p - Dr. Abdulfatah A. Salem
Operations Management - Book1.p  - Dr. Abdulfatah A. SalemOperations Management - Book1.p  - Dr. Abdulfatah A. Salem
Operations Management - Book1.p - Dr. Abdulfatah A. Salem
 
INU_CAPSTONEDESIGN_비밀번호486_업로드용 발표자료.pdf
INU_CAPSTONEDESIGN_비밀번호486_업로드용 발표자료.pdfINU_CAPSTONEDESIGN_비밀번호486_업로드용 발표자료.pdf
INU_CAPSTONEDESIGN_비밀번호486_업로드용 발표자료.pdf
 
The basics of sentences session 4pptx.pptx
The basics of sentences session 4pptx.pptxThe basics of sentences session 4pptx.pptx
The basics of sentences session 4pptx.pptx
 
2024_Student Session 2_ Set Plan Preparation.pptx
2024_Student Session 2_ Set Plan Preparation.pptx2024_Student Session 2_ Set Plan Preparation.pptx
2024_Student Session 2_ Set Plan Preparation.pptx
 
How to Split Bills in the Odoo 17 POS Module
How to Split Bills in the Odoo 17 POS ModuleHow to Split Bills in the Odoo 17 POS Module
How to Split Bills in the Odoo 17 POS Module
 
Industrial Training Report- AKTU Industrial Training Report
Industrial Training Report- AKTU Industrial Training ReportIndustrial Training Report- AKTU Industrial Training Report
Industrial Training Report- AKTU Industrial Training Report
 
How to Manage Notification Preferences in the Odoo 17
How to Manage Notification Preferences in the Odoo 17How to Manage Notification Preferences in the Odoo 17
How to Manage Notification Preferences in the Odoo 17
 
Application of Matrices in real life. Presentation on application of matrices
Application of Matrices in real life. Presentation on application of matricesApplication of Matrices in real life. Presentation on application of matrices
Application of Matrices in real life. Presentation on application of matrices
 
size separation d pharm 1st year pharmaceutics
size separation d pharm 1st year pharmaceuticssize separation d pharm 1st year pharmaceutics
size separation d pharm 1st year pharmaceutics
 
ppt your views.ppt your views of your college in your eyes
ppt your views.ppt your views of your college in your eyesppt your views.ppt your views of your college in your eyes
ppt your views.ppt your views of your college in your eyes
 
Telling Your Story_ Simple Steps to Build Your Nonprofit's Brand Webinar.pdf
Telling Your Story_ Simple Steps to Build Your Nonprofit's Brand Webinar.pdfTelling Your Story_ Simple Steps to Build Your Nonprofit's Brand Webinar.pdf
Telling Your Story_ Simple Steps to Build Your Nonprofit's Brand Webinar.pdf
 
Basic Civil Engg Notes_Chapter-6_Environment Pollution & Engineering
Basic Civil Engg Notes_Chapter-6_Environment Pollution & EngineeringBasic Civil Engg Notes_Chapter-6_Environment Pollution & Engineering
Basic Civil Engg Notes_Chapter-6_Environment Pollution & Engineering
 
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptxMARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
 
An Overview of the Odoo 17 Discuss App.pptx
An Overview of the Odoo 17 Discuss App.pptxAn Overview of the Odoo 17 Discuss App.pptx
An Overview of the Odoo 17 Discuss App.pptx
 
slides CapTechTalks Webinar May 2024 Alexander Perry.pptx
slides CapTechTalks Webinar May 2024 Alexander Perry.pptxslides CapTechTalks Webinar May 2024 Alexander Perry.pptx
slides CapTechTalks Webinar May 2024 Alexander Perry.pptx
 
Open Educational Resources Primer PowerPoint
Open Educational Resources Primer PowerPointOpen Educational Resources Primer PowerPoint
Open Educational Resources Primer PowerPoint
 
The impact of social media on mental health and well-being has been a topic o...
The impact of social media on mental health and well-being has been a topic o...The impact of social media on mental health and well-being has been a topic o...
The impact of social media on mental health and well-being has been a topic o...
 
Advances in production technology of Grapes.pdf
Advances in production technology of Grapes.pdfAdvances in production technology of Grapes.pdf
Advances in production technology of Grapes.pdf
 
B.ed spl. HI pdusu exam paper-2023-24.pdf
B.ed spl. HI pdusu exam paper-2023-24.pdfB.ed spl. HI pdusu exam paper-2023-24.pdf
B.ed spl. HI pdusu exam paper-2023-24.pdf
 

itm661-lecture0VBBBBBBBBBBBBBBM3-part2-2015.pdf

  • 1. Lecture 3/2 Query Processing • T. Connolly, and C. Begg, “Database Systems: A Practical Approach to Design, Implementation, and Management”, 5th edition, Addison-Wesley, 2009. ISBN: 0-321-60110-6, ISBN-13: 978-0-321-60110-0 (International Edition). • T. Connolly, and C. Begg, “Database Systems: A Practical Approach to Design, Implementation, and Management”, 4th edition, Addison-Wesley, 2004. ISBN: 0-321-21025-5. • R. Elmasri and S. B. Navathe, “Fundamentals of Database Systems”, 5th ed., Pearson, 2007, ISBN: 0-321-41506-X. ITM661 – Database Systems
  • 2. Query Processing 2 Objectives  Objectives of query processing and optimization.  Static versus dynamic query optimization.  How a query is decomposed and semantically analyzed.  How to create a R.A.T. to represent a query.  Rules of equivalence for RA operations.  How to apply heuristic transformation rules to improve efficiency of a query.  How pipelining can be used to improve efficiency of queries.  Difference between materialization and pipelining.  Advantages of left-deep trees. (R.A.T = Relational Algebra Tree)
  • 3. Query Processing 3 An Example (Branch and Staff Relations) • A relation is a table with columns and rows. Only applies to logical structure of the database, not the physical structure. • An attribute is a named column of a relation. • A domain is a set of allowable values for one or more attributes. • A tuple is a row of a relation. • A degree is a number of attributes in a relation. • A cardinality is a number of tuples in a relation. • A relational database is a collection of normalized relations.
  • 4. Query Processing 4 Relational Algebra and Calculus  Relational algebra and relational calculus are formal languages associated with the relational model. Informally,  Relational algebra is a (high-level) procedural language and  Relational calculus a non-procedural language.  However, formally both are equivalent to one another.  A language that produces a relation that can be derived using relational calculus is relationally complete.
  • 5. Query Processing 5 Relational Algebra  There are 5 basic operations, in relational algebra, that performs most of the data retrieval operations needed.  Selection  Projection  Cartesian Product  Union  Set Difference  Also operations that can be expressed by 5 basic operations.  Join  Intersection  Division
  • 6. Query Processing 6 An Example (Home Rental Database)
  • 7. Query Processing 7 An Example (Home Rental Database)
  • 8. Query Processing 8 An Example (Home Rental Database)
  • 9. Query Processing 9 Query Processing  Activities involved in retrieving data from the database.  Aims of QP:  transform query written in high-level language (e.g. SQL), into correct and efficient execution strategy expressed in low-level language (implementing Relational Algebra);  execute the strategy to retrieve required data.
  • 10. Query Processing 10 Query Optimization  Activity of choosing an efficient execution strategy for processing query.  As there are many equivalent transformations of same high-level query, aim of QO is to choose one that minimizes resource usage.  Generally, reduce total execution time of query.  May also reduce response time of query.  Problem computationally intractable with large number of relations, so strategy adopted is reduced to finding near optimum solution.
  • 11. Query Processing 11 Different Strategies  Find all Managers that work at a London branch. SELECT * FROM Staff s, Branch b WHERE s.branchNo = b.branchNo AND (s.position = ‘Manager’ AND b.city = ‘London’);  Three equivalent RA queries are: (1) s(position='Manager’) (city='London')  (staff.bno=branch.bno) (Staff  Branch) (2) s(position='Manager')  (city='London') ( Staff |><| staff.bno=branch.bno Branch) (3) (sposition='Manager’(Staff)) |><| staff.bno=branch.bno (s city='London' (Branch))
  • 12. (1) s(position='Manager’) (city='London')  (staff.bno=branch.bno) (Staff  Branch) (2) s(position='Manager')  (city='London') ( Staff staff.bno=branch.bno Branch) (3) (sposition='Manager’(Staff)) staff.bno=branch.bno (s city='London' (Branch)) Staff Branch  s s(position='Manager’) (city='London')  (staff.bno=branch.bno) 1000 50 50000 epsilon 12 Query Processing
  • 13. (1) s(position='Manager’) (city='London')  (staff.bno=branch.bno) (Staff  Branch) (2) s(position='Manager')  (city='London') ( Staff staff.bno=branch.bno Branch) (3) (sposition='Manager’(Staff)) staff.bno=branch.bno (s city='London' (Branch)) Staff Branch s s(position='Manager’) (city='London') 1000 50 epsilon 1000 13 Query Processing
  • 14. (1) s(position='Manager’) (city='London')  (staff.bno=branch.bno) (Staff  Branch) (2) s(position='Manager')  (city='London') ( Staff staff.bno=branch.bno Branch) (3) (sposition='Manager’(Staff)) staff.bno=branch.bno (s city='London' (Branch)) Staff Branch s 1000 50 epsilon 1000 s 50 position='Manager’ city='London' 50 5 14 Query Processing
  • 15. Query Processing 15 Cost Comparison in Different Strategies  Assume:  1000 tuples in Staff; 50 tuples in Branch;  50 Managers; 5 London branches;  No indexes or sort keys;  Results of any intermediate operations stored on disk;  Cost of the final write is ignored;  Tuples are accessed one at a time. (1) s(position='Manager’) (city='London')  (staff.bno=branch.bno) (Staff  Branch) (2) s(position='Manager')  (city='London') ( Staff |><| staff.bno=branch.bno Branch) (3) (sposition='Manager’(Staff)) |><| staff.bno=branch.bno (s city='London' (Branch)) (1) (1000 + 50) + 2*(1000 * 50) = 101,050 (2) 2*1000 + (1000 + 50) = 3,050 (3) 1000 + 2*50 + 50 + 2*(5) = 1,160 Cartesian product and join operations are much more expensive than selection
  • 16. Query Processing 16 4 Phases of Query Processing (1) decomposition (parsing and validation) (2) optimization (3) code generation (4) execution.
  • 17. Query Processing 17 Dynamic versus Static Optimization  Two choices when first three phases of QP can be carried out: 1. dynamically every time query is run.  Advantages if dynamic QO arise from fact that information is up-to-date.  Disadvantages are that performance of query is affected, time may limit finding optimum strategy. 2. Statically when query is first submitted.  Advantages of static QO are removal of runtime overhead  more time to find optimum strategy.  Disadvantages arise from fact that chosen execution strategy may no longer be optimal when query is run.  Could use a hybrid approach to overcome this.
  • 18. Query Processing 18 Query Decomposition  Aims are to transform high-level query into RA query and check that query is syntactically and semantically correct.  Typical stages are:  analysis  normalization  semantic analysis  simplification  query restructuring
  • 19. Query Processing 19 Analysis (I)  Analyze query lexically and syntactically using compiler techniques.  Verify relations and attributes exist.  Verify operations are appropriate for object type.  EX: SELECT staff_no FROM Staff WHERE position > 10;  This query would be rejected on two grounds:  Staff_no is not defined for Staff relation (should be StaffNo).  Comparison ‘>10’ is incompatible with type ‘position’, which is variable character string.
  • 20. Query Processing 20 Analysis (II)  Finally, query transformed into some internal representation more suitable for processing.  Some kind of query tree is typically chosen, constructed as follows:  Leaf node created for each base relation.  Non-leaf node created for each intermediate relation produced by RA operation.  Root of tree represents query result.  Sequence is directed from leaves to root.
  • 22. Query Processing 22 Normalization  Converts query into a normalized form for easier manipulation.  Predicate can be converted into one of two forms:  Conjunctive normal form: (position = 'Manager'  salary > 20000)  (bno = 'B3')  Disjunctive normal form: (position = 'Manager'  bno = 'B3' )  (salary > 20000  bno = 'B3')
  • 23. Query Processing 23 Semantic Analysis  Rejects normalized queries that are incorrectly formulated or contradictory.  Query is incorrectly formulated if components do not contribute to generation of result.  Query is contradictory if its predicate cannot be satisfied by any tuple.  Algorithms to determine correctness exist only for queries that do not contain disjunction and negation.
  • 24. Query Processing 24 24 Semantic Analysis  For these queries, could construct:  A relation connection graph.  Normalized attribute connection graph.  Relation connection graph  Create node for each relation and node for result. Create edges between two nodes that represent a join, and edges between nodes that represent projection.  If not connected, query is incorrectly formulated.
  • 25. Query Processing 25 25 Normalized Attribute Connection Graph  Create node for each reference to an attribute, or constant 0.  Create directed edge between nodes that represent a join, and directed edge between attribute node and 0 node that represents selection.  Weight edges a  b with value c, if it represents inequality condition (a  b + c); weight edges 0  a with -c, if it represents inequality condition (a  c).  If graph has cycle for which valuation sum is negative, query is contradictory.
  • 26. Query Processing 26 Checking Semantic Correctness SELECT p.propertyNo, p.street FROM Client c, Viewing v, PropertyForRent p WHERE c.clientNo = v.clientNo AND c.maxRent >= 500 AND c.prefType = 'Flat' AND p.ownerNo = 'CO93';  Relation connection graph not fully connected, so query is not correctly formulated. Omiting the join (v.propertyNo = p.propertyNo). SELECT p.propertyNo, p.street FROM Client c, Viewing v, PropertyForRent p WHERE c.maxRent > 500 AND c.clientNo = v.clientNo AND v.propertyNo = p.propertyNo AND c.prefType = 'Flat' AND c.maxRent < 200;  Normalized attribute connection graph has cycle between nodes R.maxRent and 0 with negative valuation sum, so query is contradictory
  • 27. Query Processing 27 Checking Semantic Correctness Relation Connection graph Normalized attribute connection graph
  • 28. Query Processing 28 Simplification  Simplification strategy:  Detects redundant qualifications,  Eliminates common sub-expressions,  Transforms query to semantically equivalent but more easily and efficiently computed form.  Typically, access restrictions, view definitions, and integrity constraints are considered.  Assuming user has appropriate access privileges, first apply well-known idempotency rules of Boolean algebra.  Ex.: p  p equivalent to p p  true equivalent to true  Next step is Query restructuring
  • 29. Query Processing 29 Transformation Rules for RA Operation (I)  Conjunctive selection operations can cascade into individual selection operations (vice versa). sp  q  r(R) = sp(sq(sr(R)))  Sometimes referred to as cascade of selection. sbranchNo='B003‘  salary>15000(Staff) = sbranchNo='B003'(ssalary>15000(Staff))  Commutativity of selection. sp(sq(R)) = sq(sp(R))  For example: sbranchNo='B003'(ssalary>15000(Staff)) = ssalary>15000(sbranchNo='B003'(Staff))
  • 30. Query Processing 30 Transformation Rules for RA Operation (II)  In a sequence of projection operations, only the last in the sequence is required. PLPM … PN(R) = PL (R)  For example: PlNamePlName, branchNo(Staff) = PlName (Staff)  Commutativity of selection and projection. PAi, …, Am(sp(R)) = sp(PAi, …, Am(R)) where p {A1, A2, …, Am}  For example: PfName,lName(slName='Beech'(Staff)) = slName='Beech'(PfName,lName(Staff))
  • 31. Query Processing 31 Transformation Rules for RA Operation (III)  Commutativity of q-join (and Cartesian product). R p S = S p R R  S = S  R  Rule also applies to equi-join and natural join. For example: Staff staff.branchNo=branch.branchNo Branch = Branch staff.branchNo=branch.branchNo Staff
  • 32. Query Processing 32 Transformation Rules for RA Operation (IV)  Commutativity of selection and theta-join (or Cartesian product).  If selection predicate involves only attributes of one of join relations, selection and join (or Cartesian product) operations commute: sp(R r S) = (sp(R)) r S sp(R  S) = (sp(R))  S where p is a constraint on attributes of R
  • 33. Query Processing 33 Transformation Rules for RA Operation (V)  If selection predicate is conjunctive predicate having form (p ู q), where p only involves attributes of R, and q only attributes of S, selection and theta-join operations commute as: sp  q(R r S) = (sp(R)) r (sq(S)) sp  q(R  S) = (sp(R))  (sq(S))  For example: sposition='Manager’  city='London'(Staff staff.branchNo=branch.branchNo Branch) = (sposition='Manager'(Staff)) staff.branchNo=branch.branchNo (scity='London' (Branch))
  • 34. Query Processing 34 Transformation Rules for RA Operation (VI)  Commutativity of projection and theta-join (or Cartesian product).  If projection list is of form L = L1  L2, where L1 only involves attributes of R, and L2 only involves attributes of S, provided join condition only contains attributes of L, projection and theta-join operations commute as: PL1 L2(R r S) = (PL1(R)) r (PL2(S))
  • 35. Query Processing 35 Transformation Rules for RA Operation (VII)  If join condition contains additional attributes not in L, say attributes M = M1  M2 where M1 only involves attributes of R, and M2 only involves attributes of S, a final projection op. is required: PL1  L2(R r S) = PL1  L2( (PL1  M1(R)) r (PL2  M2(S)))  For example: Pposition, city, branchNo(Staff staff.branchNo=branch.branchNo Branch) = (Pposition, branchNo(Staff)) staff.branchNo=branch.branchNo (Pcity, branchNo (Branch))  and using the latter rule: Pposition, city(Staff staff.branchNo=branch.branchNo Branch) = Pposition, city ((Pposition, branchNo(Staff)) staff.branchNo=branch.branchNo ( Pcity, branchNo (Branch)))
  • 36. Query Processing 36 Transformation Rules for RA Operation (VIII)  Commutativity of union and intersection (but not set difference). R  S = S  R R  S = S  R  Commutativity of selection and set operations (union, intersection, and set difference). sp(R  S) = sp(S)  sp(R) sp(R  S) = sp(S)  sp(R) sp(R - S) = sp(S) - sp(R)  Commutativity of projection and union. PL(R  S) = PL(S)  PL(R)  Associativity of union and intersection (but not set difference). (R  S)  T = S  (R  T) (R  S)  T = S  (R  T)
  • 37. Query Processing 37 Transformation Rules for RA Operation (IX)  Associativity of q-join (and Cartesian product).  Cartesian product and natural join are always associative: (R S) T = R (S T) (R  S)  T = R  (S  T)  If join condition q involves attributes only from S and T, then theta-join is associative as follows: (R p S) q  r T = R p  r (S q T)  For example: ((Staff Staff.staffNo=PropertyForRent.staffNo PropertyForRent) PropertyForRent.ownerNo=owner.ownerNo  staff.lName=owner.lName Owner) = ((Staff Staff.staffNo=PropertyForRent.staffNo  Staff.lName=Owner.lName (PropertyForRent PropertyForRent.ownerNo = Owner.ownerNo Owner)
  • 38. Query Processing 38 Use of Transformation Rules (I)  For prospective renters of flats, find properties that match requirements and owned by CO93. SELECT p.propertyNo, p.street FROM Client c, Viewing v, PropertyForRent p WHERE c.prefType = ‘Flat’ AND c.clientNo = v.clientNo AND v.propertyNo = p.propertyNo AND c.maxRent >= p.rent AND c.prefType = p.type AND p.ownerNo = ‘CO93’;
  • 39. Query Processing 39 Use of Transformation Rules (II) SELECT p.propertyNo, p.street FROM Client c, Viewing v, PropertyForRent p WHERE c.prefType = ‘Flat’ AND c.clientNo = v.clientNo AND v.propertyNo = p.propertyNo AND c.maxRent >= p.rent AND c.prefType = p.type AND p.ownerNo = ‘CO93’;
  • 40. Query Processing 40 Use of Transformation Rules (III) SELECT p.propertyNo, p.street FROM Client c, Viewing v, PropertyForRent p WHERE c.prefType = ‘Flat’ AND c.clientNo = v.clientNo AND v.propertyNo = p.propertyNo AND c.maxRent >= p.rent AND c.prefType = p.type AND p.ownerNo = ‘CO93’;
  • 41. Query Processing 41 Use of Transformation Rules (IV) SELECT p.propertyNo, p.street FROM Client c, Viewing v, PropertyForRent p WHERE c.prefType = ‘Flat’ AND c.clientNo = v.clientNo AND v.propertyNo = p.propertyNo AND c.maxRent >= p.rent AND c.prefType = p.type AND p.ownerNo = ‘CO93’;
  • 42. Query Processing 42 Heuristical Processing Strategies  Perform selection operations as early as possible.  Keep predicates on same relation together.  Combine Cartesian product with subsequent selection whose predicate represents join condition into a join operation.  Use associativity of binary operations to rearrange leaf nodes so leaf nodes with most restrictive selection operations executed first.  Perform projection as early as possible.  Keep projection attributes on same relation together.  Compute common expressions once.  If common expression appears more than once, and result not too large, store result and reuse it when required.  It is useful when querying views, as same expression is used to construct view each time.
  • 43. Query Processing 43 Pipelining  Materialization - output of one operation is stored in temporary relation for processing by next.  Could also pipeline results of one operation to another without creating temporary relation.  Known as pipelining or on-the-fly processing.  Pipelining can save on cost of creating temporary relations and reading results back in again.  Generally, pipeline is implemented as separate process or thread.
  • 45. Query Processing 45 Pipelining  With linear trees, relation on one side of each operator is always a base relation.  However, as need to examine entire inner relation for each tuple of outer relation, inner relations must always be materialized.  This makes left-deep trees appealing as inner relations are always base relations.  Reduces search space for optimum strategy, and allows QO to use dynamic processing.  Not all execution strategies are considered.
  • 46. Query Processing 46 Exercise I  Assume that there are 1000 tuples in Student, 40 tuples in Course, 5000 tuples in TakeCourse, 40 tuples in Lecturer, 25 internal lecturers, 30 three-credit courses, 8 IT lecturers and 200 IT students.  Assume that the following SQL statement is executed SELECT * FROM Lecturer L, Course C WHERE C.LecturerID = L.LecturerID AND L.InOrEx = ‘I’ AND C.Credit = 3 Compare the cost of disk accesses of three following equivalent relational algebra queries corresponding to the above SQL statement. (a) s(C.LecturerID = L.LecturerID)  (C.Credit = 3)  ( L.InOrEx = ‘I’ ) (Course X Lecturer) (b) s(C.Credit = 3)  ( L.InOrEx = ‘I’ ) ( Course |X|C.LecturerID = L.LecturerID Lecturer) (c) (s(C.Credit = 3) (Course)) |X|C.LecturerID = L.LecturerID (s( L.InOrEx = ‘I’ ) (Lecturer))
  • 47. Query Processing 47 Exercise II Given the following SQL query, illustrate relational algebra trees based on heuristical processing strategies. SELECT S.SFName, S.SLName, L.LFName FROM Student S, Course C, Lecturer L, TakeCourse T WHERE C.LecturerID = L.LecturerID AND C.CourseID = T.CourseID AND S.StudentID = T.StudentID AND C.Credit = 3 AND S.DeptID = ‘IT’ AND L.LDeptID = S.DeptID; Hint: The heuristics are (1) Perform selection operations as early as possible (2) Combine the Cartesian product with a subsequent selection operation whose predicate represents a join condition into a join operation. (3) Use associativity of binary operations to rearrange leaf nodes so that the leaf nodes with the most restrictive selection operations are executed first. (4) Perform projection as early as possible (5) Compute common expressions once.