The document discusses the topics covered in a database technologies course, including relational algebra operations. It provides examples and explanations of relational algebra concepts like selection, projection, join, union, difference, and cartesian product. It also discusses limitations of relational algebra in expressing complex queries involving transitive closure. The document contains practice questions related to relational algebra operations at the end.
Relational algebra
Department of Information Technology, Data base Technologies (ITB4201)
Introduction to Relational Algebra
Dr. C.V. Suresh Babu
Professor
Department of IT
Hindustan Institute of Science & Technology
Discussion Topics
• DBMS Architecture
• Relational Algebra
• Union and Difference
• Selection
• Projection
• Cartesian Product
• Renaming
• Join
• Limitations of Relational Algebra
• Quiz
DBMS Architecture
How does a SQL engine work?
• SQL query → relational algebra plan
• Relational algebra plan → optimized plan
• Execute each operator of the plan
[Figure: DBMS layers, top to bottom: Query Optimization and Execution; Relational Operators; Files and Access Methods; Buffer Management; Disk Space Management; DB]
[Figure labels: Theory and Practice; Relational Model, Relational Algebra, Relational Calculus]
Review – Why do we need Query Languages anyway?
• Two key advantages
– Less work for user asking query
– More opportunities for optimization
• Relational Algebra
– Theoretical foundation for SQL
– Higher level than programming language
• but still must specify steps to get desired result
• Relational Calculus
– Formal foundation for Query-by-Example
– A first-order logic description of desired result
– Only specify desired result, not how to get it
Relational Algebra
• Formalism for creating new relations from existing ones
• Its place in the big picture:
[Figure: Declarative query language (SQL, relational calculus) → Algebra (relational algebra, relational bag algebra) → Implementation]
Relational Algebra
• Five operators:
– Union: ∪
– Difference: -
– Selection: σ
– Projection: Π
– Cartesian Product: ×
• Derived or auxiliary operators:
– Intersection, complement
– Joins (natural, equi-join, theta join, semi-join)
– Renaming: ρ
1. Union and 2. Difference
• R1 ∪ R2
• Example:
– ActiveEmployees ∪ RetiredEmployees
• R1 – R2
• Example:
– AllEmployees – RetiredEmployees
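Union and difference map directly onto set operations, provided both relations have the same schema. A minimal sketch, assuming relations are modelled as Python sets of tuples; the employee rows are made-up sample data, not taken from the slides:

```python
# Relations as sets of (SSN, Name) tuples: set semantics, so no duplicates.
# The rows below are hypothetical sample data.
active_employees = {("1234545", "John"), ("5423341", "Smith")}
retired_employees = {("4352342", "Fred")}

# Union: every tuple that appears in either relation.
all_employees = active_employees | retired_employees

# Difference: tuples of the first relation that are absent from the second.
not_retired = all_employees - retired_employees

print(sorted(all_employees))
print(sorted(not_retired))
```

Both operands must be union-compatible (same number and type of attributes); the sketch relies on every tuple having the same two fields.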
What about Intersection?
• It is a derived operator
• R1 ∩ R2 = R1 – (R1 – R2)
• Also expressed as a join
• Example
– UnionizedEmployees ∩ RetiredEmployees
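The identity for intersection can be checked on a small instance. A short sketch, again assuming relations are Python sets; the names in the two sets are illustrative only:

```python
# Hypothetical single-attribute relations (employee names).
unionized_employees = {"John", "Smith", "Tony"}
retired_employees = {"Smith", "Fred"}

# Intersection built only from difference: R1 - (R1 - R2).
derived = unionized_employees - (unionized_employees - retired_employees)

# Python's built-in set intersection, for comparison.
builtin = unionized_employees & retired_employees

print(derived == builtin)
```

The inner difference removes everything R1 shares with R2; subtracting that from R1 leaves exactly the shared tuples.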
3. Selection
• Returns all tuples which satisfy a condition
• Notation: σ_c(R)
• Examples
– σ_{Salary > 40000}(Employee)
– σ_{name = “Smith”}(Employee)
• The condition c can use =, <, ≤, >, ≥, <>
σ_{Salary > 40000}(Employee)
Employee:
SSN      Name   Salary
1234545  John   20000
5423341  Smith  60000
4352342  Fred   50000
Result:
SSN      Name   Salary
5423341  Smith  60000
4352342  Fred   50000
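Selection can be sketched as a filter over the tuples of a relation. The `select` helper and the `Employee` tuple type below are illustrative names, and the salaries are sample values chosen so that one row falls below the threshold:

```python
from typing import Callable, NamedTuple, Set


class Employee(NamedTuple):
    ssn: str
    name: str
    salary: int


def select(condition: Callable[[Employee], bool], relation: Set[Employee]) -> Set[Employee]:
    """σ_c(R): keep exactly the tuples for which the condition holds."""
    return {row for row in relation if condition(row)}


# Sample instance (assumed values).
employee = {
    Employee("1234545", "John", 20000),
    Employee("5423341", "Smith", 60000),
    Employee("4352342", "Fred", 50000),
}

high_paid = select(lambda e: e.salary > 40000, employee)
print(sorted(e.name for e in high_paid))
```

The schema of the result is the same as the schema of the input; only the number of tuples can shrink.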
4. Projection
• Eliminates columns, then removes duplicates
• Notation: Π_{A1,…,An}(R)
• Example: project social-security number and names:
– Π_{SSN, Name}(Employee)
– Output schema: Answer(SSN, Name)
Π_{Name, Salary}(Employee)
Employee:
SSN      Name  Salary
1234545  John  20000
5423341  John  60000
4352342  John  20000
Result:
Name  Salary
John  20000
John  60000
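The duplicate elimination in projection is easy to see in code. A minimal sketch, assuming each row is a dict keyed by attribute name; `project` is a hypothetical helper and the salaries are sample values:

```python
def project(attributes, relation):
    """Π_{A1,...,An}(R): keep the named columns; the set drops duplicate rows."""
    return {tuple(row[a] for a in attributes) for row in relation}


# Three employees who share a name; two also share a salary (assumed values).
employee = [
    {"SSN": "1234545", "Name": "John", "Salary": 20000},
    {"SSN": "5423341", "Name": "John", "Salary": 60000},
    {"SSN": "4352342", "Name": "John", "Salary": 20000},
]

result = project(("Name", "Salary"), employee)
print(sorted(result))
```

Three input rows collapse to two output rows because, once SSN is dropped, two of them become identical.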
5. Cartesian Product
• Each tuple in R1 with each tuple in R2
• Notation: R1 × R2
• Example:
– Employee × Dependents
• Very rare in practice; mainly used to express joins
Cartesian Product Example
Employee:
Name  SSN
John  999999999
Tony  777777777
Dependents:
EmployeeSSN  Dname
999999999    Emily
777777777    Joe
Employee × Dependents:
Name  SSN        EmployeeSSN  Dname
John  999999999  999999999    Emily
John  999999999  777777777    Joe
Tony  777777777  999999999    Emily
Tony  777777777  777777777    Joe
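The product table above can be reproduced with `itertools.product`. A small sketch, assuming rows are plain tuples in the column order shown in the tables:

```python
from itertools import product

# Employee(Name, SSN) and Dependents(EmployeeSSN, Dname) from the example.
employee = [("John", "999999999"), ("Tony", "777777777")]
dependents = [("999999999", "Emily"), ("777777777", "Joe")]

# Pair every Employee tuple with every Dependents tuple and concatenate them.
cross = [e + d for e, d in product(employee, dependents)]

for row in cross:
    print(row)
```

The result has |Employee| × |Dependents| rows, which is why a bare product is rarely wanted on its own and is usually followed by a selection.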
Renaming
• Changes the schema, not the instance
• Notation: ρ_{B1,…,Bn}(R)
• Example:
– ρ_{LastName, SocSocNo}(Employee)
– Output schema: Answer(LastName, SocSocNo)
Renaming Example
Employee:
Name  SSN
John  999999999
Tony  777777777
ρ_{LastName, SocSocNo}(Employee):
LastName  SocSocNo
John      999999999
Tony      777777777
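Renaming touches only the schema. A minimal sketch, assuming a relation is stored as a `(schema, rows)` pair; this representation and the `rename` helper are illustrative choices, not part of the slides:

```python
def rename(new_names, relation):
    """ρ_{B1,...,Bn}(R): replace the attribute names; the rows are returned unchanged."""
    schema, rows = relation
    if len(new_names) != len(schema):
        raise ValueError("rename needs exactly one new name per attribute")
    return tuple(new_names), rows


employee = (("Name", "SSN"), [("John", "999999999"), ("Tony", "777777777")])
answer = rename(("LastName", "SocSocNo"), employee)

print(answer[0])                  # the new schema
print(answer[1] == employee[1])   # the instance is the same
```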
Natural Join
• Notation: R1 ⋈ R2
• Meaning: R1 ⋈ R2 = Π_A(σ_C(R1 × R2))
• Where:
– The selection σ_C checks equality of all common attributes
– The projection eliminates the duplicate common attributes
Natural Join Example
Employee:
Name  SSN
John  999999999
Tony  777777777
Dependents:
SSN        Dname
999999999  Emily
777777777  Joe
Employee ⋈ Dependents:
Name  SSN        Dname
John  999999999  Emily
Tony  777777777  Joe
Employee ⋈ Dependents =
Π_{Name, SSN, Dname}(σ_{SSN=SSN2}(Employee × ρ_{SSN2, Dname}(Dependents)))
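The expansion of the natural join into rename, product, selection, and projection can be followed step by step. A sketch assuming rows are dicts keyed by attribute name; the intermediate variable names are illustrative:

```python
from itertools import product

employee = [
    {"Name": "John", "SSN": "999999999"},
    {"Name": "Tony", "SSN": "777777777"},
]
dependents = [
    {"SSN": "999999999", "Dname": "Emily"},
    {"SSN": "777777777", "Dname": "Joe"},
]

# Step 1, rename: SSN becomes SSN2 so the product has no attribute-name clash.
renamed = [{"SSN2": d["SSN"], "Dname": d["Dname"]} for d in dependents]

# Step 2, Cartesian product: merge every Employee row with every renamed row.
crossed = [{**e, **d} for e, d in product(employee, renamed)]

# Step 3, selection: keep rows where the common attribute agrees.
selected = [row for row in crossed if row["SSN"] == row["SSN2"]]

# Step 4, projection: drop the duplicate column SSN2 (the set removes duplicates).
joined = {(row["Name"], row["SSN"], row["Dname"]) for row in selected}

print(sorted(joined))
```

A real engine would not materialize the full product; the sketch follows the algebraic definition rather than an efficient join algorithm.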
Natural Join
• R =
A  B
X  Y
X  Z
Y  Z
Z  V
• S =
B  C
Z  U
V  W
Z  V
• R ⋈ S =
A  B  C
X  Z  U
X  Z  V
Y  Z  U
Y  Z  V
Z  V  W
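A general natural join has to discover the common attributes from the two schemas. The following is a nested-loop sketch, assuming a relation is given as a schema tuple plus a set of row tuples; `natural_join` is a hypothetical helper written for this example:

```python
def natural_join(schema_r, rows_r, schema_s, rows_s):
    """Join on all attributes the schemas share; shared columns appear once."""
    common = [a for a in schema_r if a in schema_s]
    extra = [a for a in schema_s if a not in common]
    out_schema = tuple(schema_r) + tuple(extra)
    out_rows = set()
    for r in rows_r:
        r_map = dict(zip(schema_r, r))
        for s in rows_s:
            s_map = dict(zip(schema_s, s))
            # With no common attributes, all([]) is True and every pair matches.
            if all(r_map[a] == s_map[a] for a in common):
                out_rows.add(tuple(r) + tuple(s_map[a] for a in extra))
    return out_schema, out_rows


R = {("X", "Y"), ("X", "Z"), ("Y", "Z"), ("Z", "V")}
S = {("Z", "U"), ("V", "W"), ("Z", "V")}

schema, rows = natural_join(("A", "B"), R, ("B", "C"), S)
print(schema)
print(sorted(rows))
```

Two edge cases fall out of the same code: with disjoint schemas the join degenerates to a Cartesian product, and with identical schemas it degenerates to an intersection.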
Natural Join
• Given the schemas R(A, B, C, D), S(A, C, E), what is the schema of R ⋈ S?
• Given R(A, B, C), S(D, E), what is R ⋈ S?
• Given R(A, B), S(A, B), what is R ⋈ S?
Theta Join
• A join that involves a predicate
• R1 ⋈_θ R2 = σ_θ(R1 × R2)
• Here θ can be any condition
Equi-join
• A theta join where θ is an equality
• R1 ⋈_{A=B} R2 = σ_{A=B}(R1 × R2)
• Example:
– Employee ⋈_{SSN=SSN} Dependents
• Most useful join in practice
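Theta join and equi-join differ only in the predicate passed to the selection. A sketch assuming rows are tuples and the predicate is an ordinary Python function of two rows; `theta_join` is an illustrative name:

```python
from itertools import product


def theta_join(rows1, rows2, theta):
    """Selection with predicate theta applied to the Cartesian product."""
    return [t1 + t2 for t1, t2 in product(rows1, rows2) if theta(t1, t2)]


# Employee(Name, SSN) and Dependents(EmployeeSSN, Dname).
employee = [("John", "999999999"), ("Tony", "777777777")]
dependents = [("999999999", "Emily"), ("777777777", "Joe")]

# Equi-join: the predicate is a single equality between two attributes.
equi = theta_join(employee, dependents, lambda e, d: e[1] == d[0])

# Theta join with an inequality, on made-up single-column relations.
less_than = theta_join([(1,), (5,)], [(3,)], lambda a, b: a[0] < b[0])

print(equi)
print(less_than)
```

Unlike the natural join, the equi-join result keeps both copies of the compared attribute.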
Semijoin
• R ⋉ S = Π_{A1,…,An}(R ⋈ S)
• Where A1, …, An are the attributes in R
• Example:
– Employee ⋉ Dependents
Semijoins in Distributed Databases
• Semijoins are used in distributed databases
[Figure: Employee(SSN, Name) and Dependents(SSN, Dname, Age) stored at different sites, connected by a network]
Employee ⋉_{ssn=ssn} (σ_{age>71}(Dependents))
T = Π_{SSN}(σ_{age>71}(Dependents))
R = Employee ⋉ T
Answer = R ⋈ Dependents
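The point of the semijoin plan is that only the small relation T crosses the network. A sketch of the three steps, assuming the two relations live in separate Python sets standing in for two sites; all rows are made-up sample data:

```python
# Site 1: Employee(SSN, Name). Site 2: Dependents(SSN, Dname, Age).
# Hypothetical rows, chosen so that two dependents are older than 71.
employee = {("111", "Ann"), ("222", "Bob"), ("333", "Cy")}
dependents = {("111", "Gran", 80), ("222", "Kid", 9), ("333", "Pop", 75)}

# At the Dependents site: T is the projection on SSN of the age > 71 rows.
# Only these SSNs need to be shipped.
t = {ssn for (ssn, _dname, age) in dependents if age > 71}

# At the Employee site: R is the semijoin of Employee with T,
# i.e. the Employee rows that have a match in T.
r = {emp for emp in employee if emp[0] in t}

# Final step: join the reduced Employee relation with Dependents on SSN.
answer = {
    (ssn, name, dname, age)
    for (ssn, name) in r
    for (dep_ssn, dname, age) in dependents
    if ssn == dep_ssn
}

print(sorted(t))
print(sorted(r))
print(sorted(answer))
```

In this sample every employee has a single dependent, so joining R with the full Dependents relation returns only the over-71 rows; with several dependents per employee the last step would also need the age filter.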
Complex RA Expressions
[Figure: query tree over Person, Purchase, Person, and Product, with selections σ_{name=fred} and σ_{name=gizmo}, projections Π_{pid} and Π_{ssn}, joins on seller-ssn=ssn, pid=pid, and buyer-ssn=ssn, and a final projection Π_{name}]
Operations on Bags
A bag = a set with repeated elements
All operations need to be defined carefully on bags
• {a,b,b,c} ∪ {a,b,b,b,e,f,f} = {a,a,b,b,b,b,b,c,e,f,f}
• {a,b,b,b,c,c} – {b,c,c,c,d} = {a,b,b}
• σ_C(R): preserve the number of occurrences
• Π_A(R): no duplicate elimination
• Cartesian product, join: no duplicate elimination
Important! Relational engines work on bags, not sets!
Reading assignment: 5.3 – 5.4
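Bag union and bag difference can be checked with `collections.Counter`, which stores an occurrence count per element. A short sketch using the two examples above; treating each letter as one element is a convenience of this illustration:

```python
from collections import Counter

# Bag union adds the occurrence counts of each element.
bag_union = Counter("abbc") + Counter("abbbeff")

# Bag difference subtracts counts and drops anything at zero or below,
# so an element missing from the first bag can never appear in the result.
bag_difference = Counter("abbbcc") - Counter("bcccd")

# Bag projection: keep the column, keep the duplicates (hypothetical rows).
rows = [("John", 20000), ("John", 60000), ("John", 20000)]
names = [name for (name, _salary) in rows]

print(sorted(bag_union.elements()))
print(sorted(bag_difference.elements()))
print(names)
```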
Finally: RA has Limitations!
• Cannot compute “transitive closure”
• Find all direct and indirect relatives of Fred
• Cannot be expressed in RA! Need to write a program
Name1  Name2  Relationship
Fred   Mary   Father
Mary   Joe    Cousin
Mary   Bill   Spouse
Nancy  Lou    Sister
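Each round of the computation below is an ordinary join followed by a union, but the number of rounds depends on the data, which is what a fixed RA expression cannot capture. A sketch of such a program, assuming the table is read as directed pairs (Name1, Name2) and the Relationship column is ignored:

```python
def transitive_closure(pairs):
    """Repeat a self-join and a union until no new pair appears (a fixpoint)."""
    closure = set(pairs)
    while True:
        # One round: join closure with itself on the middle name.
        new_pairs = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new_pairs <= closure:
            return closure
        closure |= new_pairs


relatives = {("Fred", "Mary"), ("Mary", "Joe"), ("Mary", "Bill"), ("Nancy", "Lou")}
closure = transitive_closure(relatives)

# Everyone reachable from Fred, directly or indirectly.
freds_relatives = sorted(b for (a, b) in closure if a == "Fred")
print(freds_relatives)
```

Treating the relation as symmetric (a relative of Mary is also related back to her) would require adding the reversed pairs before calling the function; the sketch keeps the direction given in the table.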
Test Yourself
1. Which of the following is not a valid binary operation in the Relational Algebra?
a) Project
b) Union
c) Set difference
d) Cartesian product
2. An operation on a relation x produces y, such that y contains only selected attributes of x. Such an operation is:
a) Projection
b) Intersection
c) Union
d) Difference
3. Which of the following is not a valid unary operation in the relational algebra?
a) select
b) min
c) project
d) rename
4. Which clause from the following corresponds to the project operation of the relational algebra?
a) from
b) select
c) where
d) none of these
5. The intersect operation:
a) automatically eliminates duplicates
b) automatically eliminates duplicates if we provide the all clause with intersect
c) never eliminates duplicates
d) none of these
Answers
1. a) Project (it is a unary operation, not a binary one)
2. a) Projection
3. b) min (an aggregate function, not a relational algebra operator)
4. b) select
5. a) automatically eliminates duplicates