Distributed Database Systems
Autumn, 2008
Chapter 7
Overview of Query
Processing
SQL: Non-Procedural Language of RDB
 Tuple calculus
◦ { t | F(t) } where:
 t : tuple variable
 F(t) : well formed formula
 Example
◦ Get the No. and name of all managers
$\{\, t[\mathrm{ENO}, \mathrm{ENAME}] \mid t \in \mathrm{EMP} \wedge t[\mathrm{TITLE}] = \text{"MANAGER"} \,\}$
SQL: Non-Procedural Language of RDB
 Domain calculus
◦ $\{\, x_1, x_2, \ldots, x_n \mid F(x_1, x_2, \ldots, x_n) \,\}$ where:
 $x_i$ : domain variables
 $F(x_1, x_2, \ldots, x_n)$ : well formed formula
 Example
◦ { x, y | E(x, y, "manager") }
Variables are position sensitive!
SQL: Non-Procedural Language of RDB
 SQL is a tuple calculus language
SELECT ENO,ENAME
FROM EMP
WHERE TITLE = 'manager'
End users express queries in non-procedural languages.
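As a rough illustration of what "non-procedural" means here, the manager query can be written as a Python set comprehension that mirrors the tuple calculus form, next to the kind of explicit loop a query processor would have to derive. The `Emp` type and the sample rows are made up for this sketch and are not part of the course material.

```python
from typing import NamedTuple

class Emp(NamedTuple):
    eno: str
    ename: str
    title: str

# Hypothetical sample relation (illustrative values only).
EMP = [
    Emp("E1", "Ann", "manager"),
    Emp("E2", "Bob", "engineer"),
    Emp("E3", "Cho", "manager"),
]

def managers_declarative(emp):
    # Mirrors { t[ENO, ENAME] | t in EMP and t[TITLE] = "manager" }:
    # it states what is wanted, not how to scan the relation.
    return {(t.eno, t.ename) for t in emp if t.title == "manager"}

def managers_procedural(emp):
    # An explicit access procedure that produces the same answer.
    result = set()
    for t in emp:
        if t.title == "manager":
            result.add((t.eno, t.ename))
    return result
```

Both functions return the same set; only the second spells out the access steps.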
Query Processor
 Query processor transforms queries into
procedural operations to access data
Query Processor
 Distributed query processor has to deal
with
◦query decomposition, and
◦data localization
7.1 Query Processing Problems
 Centralized query processor must
◦transform calculus query into
algebra operation, and
◦choose the best execution plan
 Example:
SELECT ENAME
FROM E,G
WHERE E.ENO = G.ENO
AND RESP = 'manager'
7.1 Query Processing Problems
 Relational Algebra 1
$\Pi_{ENAME}\left(\sigma_{RESP = \text{"Manager"} \,\wedge\, E.ENO = G.ENO}(E \times G)\right)$
 Relational Algebra 2
$\Pi_{ENAME}\left(E \bowtie_{ENO} \sigma_{RESP = \text{"Manager"}}(G)\right)$
Execution plan 2 is better because it consumes fewer resources: it avoids the Cartesian product of E and G.
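The difference between the two plans can be sketched by counting the size of the first intermediate result each one builds. The toy relations and the function names `plan1` and `plan2` are assumptions made for this illustration only.

```python
from itertools import product

# Hypothetical toy relations: E(ENO, ENAME) and G(ENO, RESP).
E = [(i, f"emp{i}") for i in range(1, 6)]
G = [(1, "Manager"), (2, "Analyst"), (3, "Manager"), (4, "Engineer")]

def plan1(E, G):
    # Cartesian product first, then selection, then projection.
    pairs = list(product(E, G))
    selected = [(e, g) for e, g in pairs
                if g[1] == "Manager" and e[0] == g[0]]
    names = sorted({e[1] for e, _ in selected})
    return names, len(pairs)      # second value: first intermediate size

def plan2(E, G):
    # Selection on G first, then join on ENO, then projection.
    managers = [g for g in G if g[1] == "Manager"]
    joined = [(e, g) for e in E for g in managers if e[0] == g[0]]
    names = sorted({e[1] for e, _ in joined})
    return names, len(managers)   # second value: first intermediate size
```

On this data both plans return the same names, but plan 1 materializes 20 pairs before filtering while plan 2 starts from 2 selected tuples.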
7.1 Query Processing Problems
 In DDB, the query processor must
consider the communication cost and
select the best site!
 Same query as last example, but G and E
are distributed.
 Simple plan:
◦ Transport all fragments to the query site and execute the query there. This causes too much network traffic and is very costly.
7.1 Query Processing Problems
 Distributed Query Example
◦ Distribution of E and G
7.1 Query Processing Problems
 Distributed Query Example
◦ Query
$\Pi_{ENAME}\left(E \bowtie_{ENO} \sigma_{RESP = \text{"Manager"}}(G)\right)$
7.1 Query Processing Problems
 Distributed Query Example
◦ Optimized Processing
7.2 Objectives of Query Processing
 Two-fold objectives:
◦Transformation, and
◦Optimization
7.2 Objectives of Query Processing
 Cost to be considered for optimization:
◦CPU time
◦I/O time, and
◦Communication time
WAN: communication time is dominant
LAN: all three costs are comparable
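One common way to combine the three components is a weighted sum. The sketch below assumes such a form; the weight values for `WAN` and `LAN` are arbitrary numbers chosen only to show the two regimes, not measurements.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CostWeights:
    cpu: float    # assumed cost per unit of CPU work
    io: float     # assumed cost per disk I/O
    comm: float   # assumed cost per unit of data transferred

def total_cost(cpu_units, io_units, comm_units, w):
    """Weighted sum of the CPU, I/O and communication components."""
    return w.cpu * cpu_units + w.io * io_units + w.comm * comm_units

# Assumed weights, for illustration only.
WAN = CostWeights(cpu=1.0, io=1.0, comm=100.0)  # communication dominates
LAN = CostWeights(cpu=1.0, io=1.0, comm=1.0)    # all three comparable
```

With 10 units of each kind of work, the WAN weights put almost all of the total on communication, while the LAN weights spread it evenly.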
7.3 Complexity of Relational Algebra Operations
 Measured in terms of n (the relation cardinality), assuming tuples are sorted on the comparison attributes
◦ σ, Π (without duplicate elimination): O(n)
◦ Π (with duplicate elimination), GROUP: O(n log n)
◦ Join, semi-join, division, set operators: O(n log n)
◦ Cartesian product: O(n²)
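To see where an O(n log n) bound for a join can come from, here is a minimal sort-merge join sketch: two sorts followed by a single merge pass. It is one possible join method, not necessarily the one assumed in the course, and the function name and tuple layout are hypothetical.

```python
def sort_merge_join(r, s, key_r=0, key_s=0):
    """Equi-join two lists of tuples on the given key positions."""
    r = sorted(r, key=lambda t: t[key_r])   # O(n log n)
    s = sorted(s, key=lambda t: t[key_s])   # O(n log n)
    out, i, j = [], 0, 0
    while i < len(r) and j < len(s):
        kr, ks = r[i][key_r], s[j][key_s]
        if kr < ks:
            i += 1
        elif kr > ks:
            j += 1
        else:
            # Find the run of s tuples sharing this key value.
            j_end = j
            while j_end < len(s) and s[j_end][key_s] == kr:
                j_end += 1
            # Pair every r tuple with that key against the run.
            while i < len(r) and r[i][key_r] == kr:
                out.extend(r[i] + s[k] for k in range(j, j_end))
                i += 1
            j = j_end
    return out
```

The merge pass is linear when key values rarely repeat; with many duplicate keys the output itself can grow towards n², as in a Cartesian product.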
7.4 Characterization of Query Processor
7.4.1 Languages
 For users:
◦ calculus or algebra based languages.
 For query processor:
◦ maps the input into an internal algebraic form augmented with communication primitives.
7.4.2 Types of Optimization
 Exhaustive search
◦ Workable for small solution space
 Heuristics
◦ Perform σ and Π first, use semi-joins, etc.; suited to a large solution space
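A minimal sketch of exhaustive search over join orders is given below. The cardinalities, the single fixed selectivity and the cost measure (sum of intermediate result sizes) are all assumptions made for illustration; the relation name J is hypothetical.

```python
from itertools import permutations

# Hypothetical cardinalities and a crude, assumed join selectivity.
CARD = {"E": 400, "G": 1000, "J": 50}
SELECTIVITY = 0.01

def plan_cost(order, card=CARD, sel=SELECTIVITY):
    """Sum of estimated intermediate result sizes for a left-deep order."""
    size = card[order[0]]
    cost = 0.0
    for rel in order[1:]:
        size = size * card[rel] * sel   # estimated join result size
        cost += size
    return cost

def exhaustive_best(relations):
    # Enumerates all n! orders, so it is only workable for small n.
    return min(permutations(relations), key=plan_cost)
```

With three relations there are only 6 orders to cost; with ten there are already 3,628,800, which is why heuristics take over for large solution spaces.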
7.4.3 Optimization Timing
 Static
◦ Done at compile time using statistics; appropriate for exhaustive search, because the query is optimized once but executed many times.
 Dynamic
◦ Done at execution time; accurate, but repeated for every execution and therefore expensive.
7.4.4 Statistics
 Facts about the database, such as
◦ Cardinalities
◦ Attribute value distribution
◦ Size of relation, etc.
 Provided to query optimizer and
periodically updated.
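As a sketch of how an optimizer might use such statistics, the estimates below rely on the usual uniform-distribution assumption. The `RelationStats` type and the numbers in the usage note are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class RelationStats:
    cardinality: int   # number of tuples
    distinct: dict     # attribute name -> number of distinct values
    tuple_size: int    # bytes per tuple

def estimate_equality_selection(stats, attr):
    """Estimated tuples matching attr = constant, assuming uniform values."""
    return stats.cardinality / stats.distinct[attr]

def estimate_join(r, s, attr):
    """Common textbook estimate of the size of an equi-join on attr."""
    return r.cardinality * s.cardinality / max(r.distinct[attr], s.distinct[attr])

def relation_size(stats):
    """Size of the relation in bytes."""
    return stats.cardinality * stats.tuple_size
```

For example, with 1000 tuples in G and 5 distinct RESP values, a selection on one RESP value is estimated at 200 tuples. Stale statistics make these estimates wrong, hence the periodic updates.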
7.4.5 Decision Site
 Query optimization may be performed by
◦ Single site – centralized approach, or
◦ All the sites involved – distributed, or
◦ Hybrid – one site makes the major decisions, in cooperation with other sites that make local decisions
7.4.6 Exploitation of the Network Topology
 WAN
◦ communication cost is dominant
 LAN
◦ communication cost is comparable to I/O
cost. Broadcasting capability, star network,
satellite network should be considered.
7.4.7 Exploitation of Replicated Fragments
Use replicated fragments to minimize
communication costs.
7.4.8 Use of Semi-joins
Reduce the size of operand
relations to cut down
communication costs when
overhead is not significant.
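A semi-join can be sketched as follows: one site ships only its join-attribute values, the other site reduces its relation with them, and only the reduced relation travels. The two-site layout and the sample data are assumptions made for this illustration.

```python
def semi_join(r, s_keys, key=0):
    """Tuples of r whose join attribute appears in s_keys."""
    return [t for t in r if t[key] in s_keys]

# Hypothetical data: E(ENO, ENAME) at site 1, G(ENO, RESP) at site 2.
E = [(i, f"emp{i}") for i in range(1, 101)]
G = [(3, "Manager"), (7, "Manager"), (7, "Analyst")]

# Step 1: site 2 ships only the distinct join-attribute values of G.
g_keys = {g[0] for g in G}
# Step 2: site 1 reduces E before shipping it to site 2.
e_reduced = semi_join(E, g_keys)
# Step 3: site 2 completes the join on the reduced operand.
result = [(e[1], g[1]) for e in e_reduced for g in G if e[0] == g[0]]
```

Here 2 key values and 2 tuples cross the network instead of all 100 tuples of E. The overhead is the extra message carrying the keys, which is why the technique pays off only when the reduction is large.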
7.5 Layers of Query Processing
Generic Layering Scheme for Distributed Query Processing
7.5.1 Query Decomposition
 Decompose calculus query into algebra
query using global conceptual schema
information.
Step 1 – calculus normalization
Step 2 – semantic analysis to reject
incorrect queries
Step 3 – simplification to eliminate
redundant components
Step 4 – translation of calculus query
into optimized algebra query.
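Step 2 can be sketched as a check of the query against the global conceptual schema. The schema contents (including the attribute JNO), the function name and the error messages below are hypothetical; a real semantic analyzer would also check types and predicate connectivity.

```python
# Hypothetical global conceptual schema: relation -> attribute names.
GLOBAL_SCHEMA = {
    "E": {"ENO", "ENAME", "TITLE"},
    "G": {"ENO", "JNO", "RESP"},
}

def check_query(relations, attributes, schema=GLOBAL_SCHEMA):
    """Return a list of errors; an empty list means the query is accepted.

    `attributes` is a list of (relation, attribute) pairs used by the query.
    """
    errors = []
    for rel in relations:
        if rel not in schema:
            errors.append(f"unknown relation {rel}")
    for rel, attr in attributes:
        if rel not in relations:
            errors.append(f"{rel}.{attr} refers to a relation not in FROM")
        elif rel in schema and attr not in schema[rel]:
            errors.append(f"unknown attribute {rel}.{attr}")
    return errors
```

A query that only uses declared relations and attributes passes with no errors; one that mentions, say, E.SALARY is rejected before any optimization effort is spent on it.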
7.5.2 Data Localization
The distributed query is mapped into
a fragment query, which is then simplified
to produce a good fragment query.
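A minimal sketch of localization for a horizontally fragmented relation: the relation is first replaced by the union of its fragments, then fragments that cannot contribute to the query are dropped. The fragmentation of E by ENO ranges is an assumption made for this example.

```python
# Hypothetical horizontal fragmentation of E by ENO range:
# fragment name -> (low, high), meaning low <= ENO < high.
E_FRAGMENTS = {"E1": (0, 100), "E2": (100, 200), "E3": (200, 300)}

def localize(fragments):
    """Generic localization: the relation becomes the union of its fragments."""
    return sorted(fragments)

def prune_for_range(fragments, lo, hi):
    """Keep only fragments that can hold tuples with lo <= ENO < hi."""
    return sorted(
        name for name, (f_lo, f_hi) in fragments.items()
        if f_lo < hi and lo < f_hi   # the two half-open ranges overlap
    )
```

A query restricted to 120 <= ENO < 150 then needs only fragment E2, so the sites holding E1 and E3 are not contacted at all.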
7.5.3 Global Query Optimization
 Find an execution strategy close to
optimal.
 Find the best ordering of operations in
the fragment query, including
communication operations.
 A cost function, defined in terms of time, is required.
7.5.4 Local Query Optimization
Centralized system algorithms
(to be discussed in chapter 9)
7.6 Conclusions
 The query processor must be able to find a good execution plan for a calculus query, such that CPU time, I/O time and communication time are minimized.
 Method: layering of
◦ decomposition
◦ localization
◦ global query optimization
◦ local query optimization
