A comprehensive lecture on join ordering in fragment queries, a topic in distributed database management systems (DDBMS). The content is drawn from multiple sources, including Google, a textbook, and class lectures.
Prepared by IFZAL HUSSAIN, a CS student at SHAHEED BENAZIR BHUTTO UNIVERSITY, SHERINGAL, DIR UPPER, KPK, PAKISTAN.
2. Join Ordering in Fragment Queries
• Join ordering is important in a centralized DB and even more important in a distributed DB, where a poor order also increases communication cost.
3. Join Ordering in Fragment Queries (cont.)
• R → site j: “relation R is transferred to site j”
• 1. EMP → site 2; site 2 computes EMP’; EMP’ → site 3; site 3 computes the result.
• 2. ASG → site 1; site 1 computes EMP’; EMP’ → site 3; site 3 computes the result.
• 3. ASG → site 3; site 3 computes ASG’; ASG’ → site 1; site 1 computes the result.
• 4. PROJ → site 2; site 2 computes PROJ’; PROJ’ → site 1; site 1 computes the result.
• 5. EMP → site 2; PROJ → site 2; site 2 computes the join.
4. Join Ordering in Fragment Queries (cont.)
• Join ordering
• Distributed INGRES
• System R*
• Semijoin ordering
• SDD-1
5. Join Ordering
• Consider two relations only
• R ⋈ S
• Transfer the smaller relation to the site of the larger one
• Multiple relations: more difficult because there are too many alternatives
• Compute the cost of all alternatives and select the best one
• This requires estimating the size of intermediate relations, which is difficult
• Use heuristics
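The two-relation rule and the growth in alternatives can be illustrated with a minimal Python sketch. The function names are hypothetical, sizes are taken to be any comparable measure (tuples or bytes), and only linear join orders (permutations of the relations) are counted, which understates the full search space.

```python
from math import factorial


def relation_to_ship(size_r, size_s):
    """For R ⋈ S stored at two different sites, name the operand to transfer."""
    # Shipping the smaller operand minimises the data sent over the network.
    return "R" if size_r <= size_s else "S"


def linear_join_orders(n):
    """Number of linear join orders (permutations) for n relations."""
    return factorial(n)


print(relation_to_ship(100, 1000))
print(linear_join_orders(3), linear_join_orders(10))
```

Even with only linear orders, ten relations already give millions of candidates, which is why exhaustive costing gives way to heuristics.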
7. Join Ordering – Example (cont.)
• Execution alternatives:
• 1. EMP → Site 2
• Site 2 computes EMP’ = EMP ⋈ ASG
• EMP’ → Site 3
• Site 3 computes EMP’ ⋈ PROJ
• 2. ASG → Site 1
• Site 1 computes EMP’ = EMP ⋈ ASG
• EMP’ → Site 3
• Site 3 computes EMP’ ⋈ PROJ
8. Join Ordering – Example (cont.)
• 3. ASG → Site 3
• Site 3 computes ASG’ = ASG ⋈ PROJ
• ASG’ → Site 1
• Site 1 computes ASG’ ⋈ EMP
• 4. PROJ → Site 2
• Site 2 computes PROJ’ = PROJ ⋈ ASG
• PROJ’ → Site 1
• Site 1 computes PROJ’ ⋈ EMP
9. Join Ordering – Example (cont.)
• 5. EMP → Site 2
• PROJ → Site 2
• Site 2 computes EMP ⋈ PROJ ⋈ ASG
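To compare the five alternatives, one can total the data each one ships between sites. The sketch below assumes that cost equals the sum of the sizes of the transferred relations, that EMP, ASG and PROJ are stored at sites 1, 2 and 3 respectively (as the alternatives imply), and that the relation and intermediate-result sizes are known. All numbers are hypothetical and chosen only for illustration.

```python
def transfer_costs(size, emp_asg, asg_proj):
    """Total data shipped by each alternative (hypothetical cost model).

    size     : dict with the sizes of EMP, ASG and PROJ
    emp_asg  : assumed size of the intermediate result EMP ⋈ ASG
    asg_proj : assumed size of the intermediate result ASG ⋈ PROJ
    """
    return {
        1: size["EMP"] + emp_asg,        # EMP → site 2, then EMP' → site 3
        2: size["ASG"] + emp_asg,        # ASG → site 1, then EMP' → site 3
        3: size["ASG"] + asg_proj,       # ASG → site 3, then ASG' → site 1
        4: size["PROJ"] + asg_proj,      # PROJ → site 2, then PROJ' → site 1
        5: size["EMP"] + size["PROJ"],   # EMP → site 2 and PROJ → site 2
    }


# Hypothetical sizes, for illustration only.
sizes = {"EMP": 400, "ASG": 1000, "PROJ": 200}
costs = transfer_costs(sizes, emp_asg=1000, asg_proj=1000)
best = min(costs, key=costs.get)
print(costs, "cheapest alternative:", best)
```

With these particular numbers alternative 5 ships the least data, but the ranking depends entirely on the intermediate sizes, which is the quantity that is hard to estimate in practice.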
10. Semijoin Algorithms
• Shortcoming of the join method
• The entire relation is transferred, including tuples that do not participate in the join
• A semijoin reduces the size of the operand relation to be transferred
• A semijoin is beneficial if the cost of producing it and sending it to the other site is less than the cost of sending the whole operand relation.
11. Semijoin Algorithms (cont.)
• Consider the join of two relations
• R[A] (located at site 1)
• S[A] (located at site 2)
• Alternatives
• 1. Do the join R ⋈A S
• 2. Perform one of the semijoin equivalents
• R ⋈A S ⇔ (R ⋉A S) ⋈A S
• R ⋈A S ⇔ R ⋈A (S ⋉A R)
• R ⋈A S ⇔ (R ⋉A S) ⋈A (S ⋉A R)
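The three semijoin-based rewritings of R ⋈A S (reduce R by S, reduce S by R, or reduce both) can be checked on a small example. This is a minimal sketch, assuming relations are lists of dictionaries that share only the join attribute A; the helper names and sample tuples are hypothetical.

```python
def join(r, s, a):
    """Equijoin of r and s on attribute a (other attribute names assumed distinct)."""
    return [{**x, **y} for x in r for y in s if x[a] == y[a]]


def semijoin(r, s, a):
    """r ⋉ s: the tuples of r that have at least one match in s on attribute a."""
    keys = {y[a] for y in s}
    return [x for x in r if x[a] in keys]


def as_set(rel):
    """Order-independent representation, used only to compare results."""
    return {tuple(sorted(t.items())) for t in rel}


# Hypothetical sample data.
R = [{"A": 1, "B": "x"}, {"A": 2, "B": "y"}, {"A": 3, "B": "z"}]
S = [{"A": 2, "C": "p"}, {"A": 3, "C": "q"}, {"A": 4, "C": "r"}]

full = as_set(join(R, S, "A"))
alt1 = as_set(join(semijoin(R, S, "A"), S, "A"))
alt2 = as_set(join(R, semijoin(S, R, "A"), "A"))
alt3 = as_set(join(semijoin(R, S, "A"), semijoin(S, R, "A"), "A"))
print(full == alt1 == alt2 == alt3)
```

All four expressions yield the same tuples; they differ only in how much data has to cross the network to produce them.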
12. Semijoin Algorithms (cont.)
• Perform the join
• Send R to site 2
• Site 2 computes R ⋈A S
• Consider the semijoin (R ⋉A S) ⋈A S
• S’ = ΠA(S)
• S’ → Site 1
• Site 1 computes R’ = R ⋉A S’
• R’ → Site 2
• Site 2 computes R’ ⋈A S
• Semijoin is better if size(ΠA(S)) + size(R ⋉A S) < size(R)
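The condition for preferring the semijoin, size(ΠA(S)) + size(R ⋉A S) < size(R), can be evaluated directly once the relations are known. The sketch below assumes size is measured in tuples, which is a simplification because projected tuples are narrower than full ones. The function name and sample data are hypothetical.

```python
def semijoin_beats_join(r, s, a):
    """Compare shipping Π_A(S) plus R ⋉ S against shipping all of R.

    Returns (semijoin_is_cheaper, semijoin_cost, join_cost), counted in tuples.
    """
    s_proj = {t[a] for t in s}                     # S' = Π_A(S), built at site 2
    r_reduced = [t for t in r if t[a] in s_proj]   # R' = R ⋉ S', built at site 1
    semijoin_cost = len(s_proj) + len(r_reduced)   # S' → site 1, then R' → site 2
    join_cost = len(r)                             # R → site 2
    return semijoin_cost < join_cost, semijoin_cost, join_cost


# Hypothetical data: R has six tuples with A = 1..6.
R = [{"A": i, "B": f"b{i}"} for i in range(1, 7)]
S_selective = [{"A": 1, "C": "p"}, {"A": 1, "C": "q"}, {"A": 2, "C": "r"}]
S_broad = [{"A": i, "C": "c"} for i in range(1, 7)]

print(semijoin_beats_join(R, S_selective, "A"))
print(semijoin_beats_join(R, S_broad, "A"))
```

When S matches only a few tuples of R the semijoin ships less data; when every tuple of R matches, the extra transfer of the projection makes it strictly worse than sending R.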