SlideShare a Scribd company logo
1 of 28
Relational Algebra
Relational Query Languages

    Query languages: Allow manipulation and retrieval of
    data from a database.

    Relational model supports simple, powerful QLs:
    −   Strong formal foundation based on logic.
    −   Allows for much optimization.

    Query Languages != programming languages!
    −   QLs not expected to be “Turing complete”.
    −   QLs not intended to be used for complex calculations.
    −   QLs support easy, efficient access to large data sets.
Formal Relational Query Languages

Two mathematical Query Languages form the
  basis for “real” languages (e.g. SQL), and for
  implementation:
  Relational Algebra: More operational, very
  useful for representing execution plans.
  Relational Calculus: Lets users describe what
  they want, rather than how to compute it. (Non-
  operational, declarative.)
   Understanding Algebra is key to understanding SQL,
   and query processing!
Algebra Preliminaries

    A query is applied to relation instances, and the
    result of a query is also a relation instance.
     −   Schemas of input relations for a query are fixed (but
         query will run regardless of instance!)
     −   The schema for the result of a given query is also fixed!
         Determined by definition of query language constructs.
Example Instances


    “Sailors” and “Reserves”
                                  S1 sid   sname rating age
    relations for our examples.
                                     22    dustin  7    45.0
    R1
                                     31    lubber  8    55.5
    sid bid  day                     58    rusty   10 35.0
    22 101 10/10/96
    58 103 11/12/96               S2 sid   sname rating age
                                     28    yuppy   9    35.0
                                     31    lubber  8    55.5
                                     44    guppy   5    35.0
                                     58    rusty   10 35.0
Algebra Operations


    Look what we want to get from the following
    table:

          sid    sname rating age
          28     yuppy   9    35.0
          31     lubber  8    55.5
          44     guppy   5    35.0
          58     rusty   10 35.0
sname    rating
                             Projection yuppy           9
                                               lubber   8

    Deletes attributes that are not in         guppy    5
    projection list.                           rusty    10
    Schema of result contains exactly
                                              π sname,rating(S2)



    the fields in the projection list, with
    the same names that they had in the
    (only) input relation.

    Projection operator has to eliminate
    duplicates! (Why??)
                                                    age
     − Note: real systems typically                 35.0
        don’t do duplicate elimination              55.5
        unless the user explicitly asks
        for it. (Why not?)                        π age(S2)
S2
                                 Selection rating age
                                  sid sname
                                  28     yuppy     9        35.0

    Selects rows that satisfy     31     lubber    8        55.5
    selection condition.          44     guppy     5        35.0

    No duplicates in result!      58     rusty     10       35.0
    (Why?)
    Schema of result identical
                                       σ rating >8(S 2) =



    to schema of (only) input
    relation.
                                  sid sname rating age
                                  28 yuppy 9       35.0
                                  58 rusty  10     35.0
Composition of Operations
 
     Result relation can be the input for another relational
     algebra operation! (Operator composition.)

S2 sid      sname rating age
     28     yuppy   9    35.0
     31     lubber  8    55.5
     44     guppy   5    35.0
     58     rusty   10 35.0         sname rating
π sname,rating (σ rating >8(S 2)) = yuppy 9
                                    rusty 10
What do we want to get from two
                relations?
                              S1
 R1                           sid    sname rating age
sid bid  day
                              22     dustin  7    45.0
22 101 10/10/96
                              31     lubber  8    55.5
58 103 11/12/96
                              58     rusty   10 35.0

What about: Who reserved boat 101?

Or: Find the name of the sailor who reserved boat 101.
Cross-Product

    Each row of S1 is paired with each row of R1.

    Result schema has one field per field of S1 and R1,
    with field names inherited.
    sid1   sname    rating age    sid2   bid   day
     22    dustin     7    45.0    22    101   10/10/96
     22    dustin     7    45.0    58    103   11/12/96
     31    lubber     8    55.5    22    101   10/10/96
     31    lubber     8    55.5    58    103   11/12/96
     58    rusty      10   35.0    22    101   10/10/96
     58    rusty      10   35.0    58    103   11/12/96
Renaming operator (because naming conflict):
       ρ(sid → sid1, S1) × ρ (sid → sid 2, R1)
Why does this cross product help

 Query: Find the name of the sailor who reserved boat 101.


CP = ρ (sid → sid1, S1) × ρ (sid → sid 2, R1)
Result =π           (σ                    (CP))
            Sname sid1= sid 2 and bid =101

 * Note my use of “temporary” relation CP.
Another example

    Find the name of the sailor having the highest
    rating.

AllR =π             (rating − > ratingA, S 2)
          ratingA
Result? =π              (σ         (S 2×AllR))
             Sname rating < ratingA
What’s in “Result?” ?
Does it answer our query?
Union, Intersection, Set-Difference
                                    sid sname rating age

    All of these operations take    22   dustin   7    45.0
    two input relations, which      31   lubber   8    55.5
    must be union-compatible:       58   rusty    10   35.0
     − Same number of fields.       44   guppy    5    35.0
     − `Corresponding’ fields       28   yuppy    9    35.0
        have the same type.                  S1∪ S2

    What is the schema of result?
                                    sid sname rating age
sid sname        rating age         31 lubber 8      55.5
22 dustin        7      45.0        58 rusty  10     35.0
            S1− S2                           S1∩ S2
Back to our query

    Find the name of the sailor having the highest
    rating.
AllR =π             (rating − > ratingA, S 2)
          ratingA
Tmp =π                 (σ            (S 2×AllR))
          Sid ,Sname rating < ratingA
Result =π            (π             (S 2 )− Tmp)
            Sname         Sid ,Sname
* Why not project on Sid only for Tmp?
Relational Algebra (Summary)

    Basic operations:
     −   Selection ( σ ) Selects a subset of rows from relation.
     −   Projection ( π ) Deletes unwanted columns from relation.
     −   Cross-product ( × ) Allows us to combine two relations.
     −   Set-difference ( - ) Tuples in reln. 1, but not in reln. 2.
     −   Union ( ∪ ) Tuples in reln. 1 and in reln. 2.
     −   Rename ( ρ ) Changes names of the attributes

    Since each operation returns a relation, operations can be
    composed! (Algebra is “closed”.)

    Use of temporary relations recommended.
Natural Join

    Natural-Join:

    S1  R1=
       sid     sname      rating age       bid    day
       22      dustin     7      45.0      101    10/10/96
       58      rusty      10     35.0      103    11/12/96


    Result schema similar to cross-product, but only one copy of
    each field which appears in both relations.

    Natural Join = Cross Product + Selection on common fields +
    drop duplicate fields.
Query revisited using natural join

Query: Find the name of the sailor who reserved boat 101.


    Result =π        (σ       (S1  R1))
                Sname bid =101
    Or
    Result =π        (S1  σ         (R1))
                Sname         bid =101
Consider yet another query
 
      Find the sailor(s) who reserved all the red
      boats.
R1                           B
sid     bid   day                bid   color
22      101 10/10/96             101 Red
22      103 10/11/96             102 Green
58      102 11/12/96             103 Red
Start an attempt


    who reserved what boat:              sid    bid
                                         22     101
          S1=π                 (R1) =    22     103
                    sid ,bid
                                         58     102

    All the red boats:
                                                bid
     S 2 =π         (σ                 (B)) =   101
              bid        color = red            103
Division

    Not supported as a primitive operator, but useful for
    expressing queries like:
                                        Find sailors who
    have reserved all red boats.

    Let S1 have 2 fields, x and y; S2 have only field y:
     − S1/S2 =  x | forall y in R2 (there exists x, y in R1)
               
                                                                          
                                                                           
                                                                           
               
                                                                          
                                                                           

     −   i.e., S1/S2 contains all x tuples (sailors) such that for every
         y tuple (redboat) in S2, there is an xy tuple in S1 (i.e, x
         reserved y).

    In general, x and y can be any lists of fields; y is the list
    of fields in S2, and x∪y is the list of fields of S1.
Examples of Division A/B

sno   pno        pno        pno        pno
s1    p1         p2         p2         p1
s1    p2                    p4         p2
s1    p3
                  B1                   p4
s1    p4
                             B2
s2    p1         sno                   B3
s2    p2         s1
s3    p2         s2         sno
s4    p2         s3         s1         sno
s4    p4         s4         s4         s1

      A          A/B1       A/B2       A/B3
Find names of sailors who’ve reserved boat
                        #103

    Solution 1: π
                  sname((σ bid =103(Re serves)) Sailors)
                                                





    Solution 2:
                   Temp1=σ           (Re serves)
                             bid =103
                   Temp2=Temp1 Sailors
                               
                  Re sult =π sname (Temp2)

    Solution 3:    π sname (σ              (Re serves  Sailors))
                                                       
                                bid =103
Find names of sailors who’ve reserved a red
                    boat

 
     Information about boat color only available in
     Boats; so need an extra join:
 π sname ((σ                       Boats)  Re serves  Sailors)
                                                      
               color =' red '

     A more efficient solution:
 π sname (π         ((π      σ               Boats)  Re s)  Sailors)
                                                            
              sid         bid color =' red '


     A query optimizer can find this given the first solution!
Find sailors who’ve reserved a red or a green
                         boat

    Can identify all red or green boats, then find
    sailors who’ve reserved one of these boats:
      Tempboats =σ                                    (Boats)
                      color ='red '∨color =' green'
Re sult =π sname(Tempboats  Re serves Sailors)
                                      


    Can also define Tempboats using union! (How?)
    What happens if ∨ is replaced by ∧ in this query?
Find sailors who’ve reserved a red and a
                    green boat
    
        Previous approach won’t work! Must identify
        sailors who’ve reserved red boats, sailors who’ve
        reserved green boats, then find the intersection
        (note that sid is a key for Sailors):
  Tempred =π          ((σ                (Boats)) Re serves)
                                                  
                sid         color ='red '
Tempgreen =π          ((σ                  (Boats)) Re serves)
                                                    
                sid         color =' green'
Re sult =π sname((Tempred ∩Tempgreen) Sailors)
                                      
Find the names of sailors who’ve reserved all
                       boats


    Uses division; schemas of the input relations
    to / must be carefully chosen:

          Tempsids =(π                (Re serves))/ (π     (Boats))
                    sid ,bid            bid
    Result =π sname(Tempsids  Sailors)
                              

To find sailors who’ve reserved all ‘Interlake’ boats:
      .....   /π         (σ                     (Boats))
                   bid        bname ='Interlake'
Summary


    The relational model has rigorously defined
    query languages that are simple and powerful.

    Relational algebra is more operational; useful as
    internal representation for query evaluation
    plans.

    Several ways of expressing a given query; a
    query optimizer should choose the most
    efficient version.

More Related Content

Viewers also liked

Algebraic expressions
Algebraic expressionsAlgebraic expressions
Algebraic expressions
Andri Rahadi
 
Cleaning the beach 2 power point
Cleaning the beach 2 power pointCleaning the beach 2 power point
Cleaning the beach 2 power point
ErinCundiff
 

Viewers also liked (9)

Digital Advertising & SEM - Bogazici University Marketing Department Lecture ...
Digital Advertising & SEM - Bogazici University Marketing Department Lecture ...Digital Advertising & SEM - Bogazici University Marketing Department Lecture ...
Digital Advertising & SEM - Bogazici University Marketing Department Lecture ...
 
Algebraic expressions
Algebraic expressionsAlgebraic expressions
Algebraic expressions
 
Know and Delight Your Users: UX Analytics
Know and Delight Your Users: UX AnalyticsKnow and Delight Your Users: UX Analytics
Know and Delight Your Users: UX Analytics
 
Cleaning the beach 2 power point
Cleaning the beach 2 power pointCleaning the beach 2 power point
Cleaning the beach 2 power point
 
Digital Talks Spring'15 - User Experience
Digital Talks Spring'15 - User ExperienceDigital Talks Spring'15 - User Experience
Digital Talks Spring'15 - User Experience
 
Develop and Grow your Audience with Google Analytics
Develop and Grow your Audience with Google AnalyticsDevelop and Grow your Audience with Google Analytics
Develop and Grow your Audience with Google Analytics
 
Introduction to Google Analytics
Introduction to Google AnalyticsIntroduction to Google Analytics
Introduction to Google Analytics
 
Website Architecture Presentation from Web Strategy Workshops
Website Architecture Presentation from Web Strategy WorkshopsWebsite Architecture Presentation from Web Strategy Workshops
Website Architecture Presentation from Web Strategy Workshops
 
Coke ppt
Coke pptCoke ppt
Coke ppt
 

Similar to Relal

Relational operation final
Relational operation finalRelational operation final
Relational operation final
Student
 

Similar to Relal (19)

Database managment System Relational Algebra
Database managment System  Relational AlgebraDatabase managment System  Relational Algebra
Database managment System Relational Algebra
 
Relational_Algebra .ppt
Relational_Algebra .pptRelational_Algebra .ppt
Relational_Algebra .ppt
 
Relational algebra
Relational algebraRelational algebra
Relational algebra
 
Relational Model
Relational ModelRelational Model
Relational Model
 
Ch7
Ch7Ch7
Ch7
 
Algebra
AlgebraAlgebra
Algebra
 
Relational Algebra
Relational AlgebraRelational Algebra
Relational Algebra
 
MOD2-DBMS.pdf
MOD2-DBMS.pdfMOD2-DBMS.pdf
MOD2-DBMS.pdf
 
Relational Algebra and Calculus.ppt
Relational Algebra and Calculus.pptRelational Algebra and Calculus.ppt
Relational Algebra and Calculus.ppt
 
Unit 05 dbms
Unit 05 dbmsUnit 05 dbms
Unit 05 dbms
 
Relational operation final
Relational operation finalRelational operation final
Relational operation final
 
State Space C-Reductions @ ETAPS Workshop GRAPHITE 2013
State Space C-Reductions @ ETAPS Workshop GRAPHITE 2013State Space C-Reductions @ ETAPS Workshop GRAPHITE 2013
State Space C-Reductions @ ETAPS Workshop GRAPHITE 2013
 
Note 11
Note 11Note 11
Note 11
 
Algebra and Trigonometry 9th Edition Larson Solutions Manual
Algebra and Trigonometry 9th Edition Larson Solutions ManualAlgebra and Trigonometry 9th Edition Larson Solutions Manual
Algebra and Trigonometry 9th Edition Larson Solutions Manual
 
Graph functions
Graph functionsGraph functions
Graph functions
 
Control and Instrumentation.pdf.pdf
Control and Instrumentation.pdf.pdfControl and Instrumentation.pdf.pdf
Control and Instrumentation.pdf.pdf
 
BEC 26 control-systems_unit-IV
BEC 26 control-systems_unit-IVBEC 26 control-systems_unit-IV
BEC 26 control-systems_unit-IV
 
Control Systems - Stability Analysis.
Control Systems - Stability Analysis.Control Systems - Stability Analysis.
Control Systems - Stability Analysis.
 
Ece4510 notes05
Ece4510 notes05Ece4510 notes05
Ece4510 notes05
 

Relal

  • 2. Relational Query Languages  Query languages: Allow manipulation and retrieval of data from a database.  Relational model supports simple, powerful QLs: − Strong formal foundation based on logic. − Allows for much optimization.  Query Languages != programming languages! − QLs not expected to be “Turing complete”. − QLs not intended to be used for complex calculations. − QLs support easy, efficient access to large data sets.
  • 3. Formal Relational Query Languages Two mathematical Query Languages form the basis for “real” languages (e.g. SQL), and for implementation: Relational Algebra: More operational, very useful for representing execution plans. Relational Calculus: Lets users describe what they want, rather than how to compute it. (Non- operational, declarative.) Understanding Algebra is key to understanding SQL, and query processing!
  • 4. Algebra Preliminaries  A query is applied to relation instances, and the result of a query is also a relation instance. − Schemas of input relations for a query are fixed (but query will run regardless of instance!) − The schema for the result of a given query is also fixed! Determined by definition of query language constructs.
  • 5. Example Instances  “Sailors” and “Reserves” S1 sid sname rating age relations for our examples. 22 dustin 7 45.0 R1 31 lubber 8 55.5 sid bid day 58 rusty 10 35.0 22 101 10/10/96 58 103 11/12/96 S2 sid sname rating age 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 58 rusty 10 35.0
  • 6. Algebra Operations  Look what we want to get from the following table: sid sname rating age 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 58 rusty 10 35.0
  • 7. sname rating Projection yuppy 9 lubber 8  Deletes attributes that are not in guppy 5 projection list. rusty 10 Schema of result contains exactly π sname,rating(S2)  the fields in the projection list, with the same names that they had in the (only) input relation.  Projection operator has to eliminate duplicates! (Why??) age − Note: real systems typically 35.0 don’t do duplicate elimination 55.5 unless the user explicitly asks for it. (Why not?) π age(S2)
  • 8. S2 Selection rating age sid sname 28 yuppy 9 35.0  Selects rows that satisfy 31 lubber 8 55.5 selection condition. 44 guppy 5 35.0  No duplicates in result! 58 rusty 10 35.0 (Why?) Schema of result identical σ rating >8(S 2) =  to schema of (only) input relation. sid sname rating age 28 yuppy 9 35.0 58 rusty 10 35.0
  • 9. Composition of Operations  Result relation can be the input for another relational algebra operation! (Operator composition.) S2 sid sname rating age 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 58 rusty 10 35.0 sname rating π sname,rating (σ rating >8(S 2)) = yuppy 9 rusty 10
  • 10. What do we want to get from two relations? S1 R1 sid sname rating age sid bid day 22 dustin 7 45.0 22 101 10/10/96 31 lubber 8 55.5 58 103 11/12/96 58 rusty 10 35.0 What about: Who reserved boat 101? Or: Find the name of the sailor who reserved boat 101.
  • 11. Cross-Product  Each row of S1 is paired with each row of R1.  Result schema has one field per field of S1 and R1, with field names inherited. sid1 sname rating age sid2 bid day 22 dustin 7 45.0 22 101 10/10/96 22 dustin 7 45.0 58 103 11/12/96 31 lubber 8 55.5 22 101 10/10/96 31 lubber 8 55.5 58 103 11/12/96 58 rusty 10 35.0 22 101 10/10/96 58 rusty 10 35.0 58 103 11/12/96 Renaming operator (because naming conflict): ρ(sid → sid1, S1) × ρ (sid → sid 2, R1)
  • 12. Why does this cross product help Query: Find the name of the sailor who reserved boat 101. CP = ρ (sid → sid1, S1) × ρ (sid → sid 2, R1) Result =π (σ (CP)) Sname sid1= sid 2 and bid =101 * Note my use of “temporary” relation CP.
  • 13. Another example  Find the name of the sailor having the highest rating. AllR =π (rating − > ratingA, S 2) ratingA Result? =π (σ (S 2×AllR)) Sname rating < ratingA What’s in “Result?” ? Does it answer our query?
  • 14. Union, Intersection, Set-Difference sid sname rating age  All of these operations take 22 dustin 7 45.0 two input relations, which 31 lubber 8 55.5 must be union-compatible: 58 rusty 10 35.0 − Same number of fields. 44 guppy 5 35.0 − `Corresponding’ fields 28 yuppy 9 35.0 have the same type. S1∪ S2  What is the schema of result? sid sname rating age sid sname rating age 31 lubber 8 55.5 22 dustin 7 45.0 58 rusty 10 35.0 S1− S2 S1∩ S2
  • 15. Back to our query  Find the name of the sailor having the highest rating. AllR =π (rating − > ratingA, S 2) ratingA Tmp =π (σ (S 2×AllR)) Sid ,Sname rating < ratingA Result =π (π (S 2 )− Tmp) Sname Sid ,Sname * Why not project on Sid only for Tmp?
  • 16. Relational Algebra (Summary)  Basic operations: − Selection ( σ ) Selects a subset of rows from relation. − Projection ( π ) Deletes unwanted columns from relation. − Cross-product ( × ) Allows us to combine two relations. − Set-difference ( - ) Tuples in reln. 1, but not in reln. 2. − Union ( ∪ ) Tuples in reln. 1 and in reln. 2. − Rename ( ρ ) Changes names of the attributes  Since each operation returns a relation, operations can be composed! (Algebra is “closed”.)  Use of temporary relations recommended.
  • 17. Natural Join  Natural-Join: S1  R1= sid sname rating age bid day 22 dustin 7 45.0 101 10/10/96 58 rusty 10 35.0 103 11/12/96  Result schema similar to cross-product, but only one copy of each field which appears in both relations.  Natural Join = Cross Product + Selection on common fields + drop duplicate fields.
  • 18. Query revisited using natural join Query: Find the name of the sailor who reserved boat 101. Result =π (σ (S1  R1)) Sname bid =101 Or Result =π (S1  σ (R1)) Sname bid =101
  • 19. Consider yet another query  Find the sailor(s) who reserved all the red boats. R1 B sid bid day bid color 22 101 10/10/96 101 Red 22 103 10/11/96 102 Green 58 102 11/12/96 103 Red
  • 20. Start an attempt  who reserved what boat: sid bid 22 101 S1=π (R1) = 22 103 sid ,bid 58 102 All the red boats: bid S 2 =π (σ (B)) = 101 bid color = red 103
  • 21. Division  Not supported as a primitive operator, but useful for expressing queries like: Find sailors who have reserved all red boats.  Let S1 have 2 fields, x and y; S2 have only field y: − S1/S2 =  x | forall y in R2 (there exists x, y in R1)          − i.e., S1/S2 contains all x tuples (sailors) such that for every y tuple (redboat) in S2, there is an xy tuple in S1 (i.e, x reserved y).  In general, x and y can be any lists of fields; y is the list of fields in S2, and x∪y is the list of fields of S1.
  • 22. Examples of Division A/B sno pno pno pno pno s1 p1 p2 p2 p1 s1 p2 p4 p2 s1 p3 B1 p4 s1 p4 B2 s2 p1 sno B3 s2 p2 s1 s3 p2 s2 sno s4 p2 s3 s1 sno s4 p4 s4 s4 s1 A A/B1 A/B2 A/B3
  • 23. Find names of sailors who’ve reserved boat #103 Solution 1: π sname((σ bid =103(Re serves)) Sailors)   Solution 2: Temp1=σ (Re serves) bid =103 Temp2=Temp1 Sailors  Re sult =π sname (Temp2) Solution 3: π sname (σ (Re serves  Sailors))  bid =103
  • 24. Find names of sailors who’ve reserved a red boat  Information about boat color only available in Boats; so need an extra join: π sname ((σ Boats)  Re serves  Sailors)   color =' red ' A more efficient solution: π sname (π ((π σ Boats)  Re s)  Sailors)   sid bid color =' red ' A query optimizer can find this given the first solution!
  • 25. Find sailors who’ve reserved a red or a green boat  Can identify all red or green boats, then find sailors who’ve reserved one of these boats: Tempboats =σ (Boats) color ='red '∨color =' green' Re sult =π sname(Tempboats  Re serves Sailors)   Can also define Tempboats using union! (How?) What happens if ∨ is replaced by ∧ in this query?
  • 26. Find sailors who’ve reserved a red and a green boat  Previous approach won’t work! Must identify sailors who’ve reserved red boats, sailors who’ve reserved green boats, then find the intersection (note that sid is a key for Sailors): Tempred =π ((σ (Boats)) Re serves)  sid color ='red ' Tempgreen =π ((σ (Boats)) Re serves)  sid color =' green' Re sult =π sname((Tempred ∩Tempgreen) Sailors) 
  • 27. Find the names of sailors who’ve reserved all boats  Uses division; schemas of the input relations to / must be carefully chosen: Tempsids =(π (Re serves))/ (π (Boats)) sid ,bid bid Result =π sname(Tempsids  Sailors)  To find sailors who’ve reserved all ‘Interlake’ boats: ..... /π (σ (Boats)) bid bname ='Interlake'
  • 28. Summary  The relational model has rigorously defined query languages that are simple and powerful.  Relational algebra is more operational; useful as internal representation for query evaluation plans.  Several ways of expressing a given query; a query optimizer should choose the most efficient version.

Editor's Notes

  1. 1
  2. 2
  3. 4
  4. 5
  5. 7
  6. 8
  7. 10
  8. 9
  9. 6
  10. 12
  11. 13
  12. 14
  13. 16
  14. 17
  15. 18
  16. 19
  17. 20