1. PHASES OF DISTRIBUTED
QUERY PROCESSING -I
PRESENTATION MEMBERS
-ANU ISSAC
-CRYSTAL CUTHINHO
-LEON D’SOUZA
-NEVIL D’SOUZA
-ANDREA FURTADO
2. • SQL , Parsing , The SQL query determines what data is to be found, but does not define the method by
which the data manager searches the database.
• Indexing, hashing, Views , ...
3.
4. OBJECTIVES OF QUERY PROCESSING
• First and main objective of query Processing, "To convert high level language(SQL) to low level
language(Relational Algebra)".
• In distributed systems, the query written on single machine but are actually executed on different local
database. as the local database provides the data to execute the query.
• A query can be processed using various techniques, such that the technique should be efficient to
execute the query in distributed environment.
• To minimize the overall cost of execution.(I/O cost + CPU cost +Communication cost).
• to minimize the time required to execute the query.(I/O time + CPU time +Communication time).
• This phase brings query into normalized form in order of easier processing of query.
• its specifically focus on the where clause of query.
5. 2. ANALYSIS
• Lexical & Syntactical analysis
• Verification of relations & attributes
• Conflict between operations
• Checks if result is possible
8. 3. SIMPLIFICATION
• Detects redundant predicates
• Transforms queries – makes them simple & efficient
• Not at the cost of semantic correctness
• Checks for factors responsible for redundancy
9. QUERY RESTRUCTURING
1. Rewrites Query into equivalent Relational Algebra
2. Makes use of a Query Tree or Operator Tree
• Leaf node for every relation in the query
• Non-leaf node for intermediate relation we can generate
• Root node for the result of a query
3.Sequence of operation is from Leaf towards the Root
4.Transformation Rules are applied .
10. EXAMPLE
• Q) Find the names of employees other than Raj who worked on CAD/CAM project for either one or two
years.
• Query:
SELECT Ename
FROM PROJ, ASG, EMP
WHERE ASG.ENO=EMP.ENO
AND ASG.PNO=PROJ.PNO
AND ENAME ≠ “Raj”
AND PROJ.PNAME=“CAD/CAM”
AND (DUR=12 OR DUR=24);
13. FRAGMENTATION:
• Forming relational algebraic queries to be used on fragmented relations.
• Generation of fragmented query is done by replacing the global relations with
fragmented relation in the query tree of distributed query.
• The generic tree still has some scope for reconstruction and simplification.
• Generic tree is used to generate a simpler and optimized query by using
reduction technique.
• Type of fragmentation determines the reduction technique to be used.
14. REDUCTION FOR HORIZONTAL FRAGMENTATION:
• Reduction of generic tree is done using either selection operation
or join operation.
• Selection operation: Produces an intermediate relation that
remains empty if there is contradiction between selection
predicate and definition of the fragment.
• Join operation: Detects useless join operation by commuting joins
with union operation.
15. EXAMPLE
• Consider the schemas:
EMP(ENO, ENAME, TITLE) , ASG(ENO, PNO, RESP, DUR)
• Consider the following query and fragmentation: Query: SELECT *
FROM EMP, ASG WHERE EMP.ENO=ASG.ENO
• Horizontal fragmentation:
∗ EMP1 = σENO ≤ “ E3”(EMP) ASG1 = σENO ≤ ” E3”(ASG)
EMP2 = σ “ E3””<ENO<” E6”(EMP) ASG2 = σENO> ” E3”(ASG) –
EMP3 = σENO> ” E6”(EMP)
16. USING SELECTION OPERATION
• Consider the query: SELECT * FROM EMP WHERE ENO=”E5”
• Here the leaf node that corresponds to Employee relation in the generic tree
can be replaced by reconstruction rule.
• The selection predicate has a contradiction with the definition of EMP1 and
EMP3 fragment and produces empty result.
17.
18. Department(deptno,dname,location)
• DEPT1= σ deptno<=10(Department)
• DEPT2= σ deptno>10(Department)
• Assume that the fragmention of employee relation is derived from department
• EMPi=Employee deptno DEPTi i=1,2
Select * from Employee,Department where depno>10 and Employee.deptno=Department.deptno.