Formal Languages - I
(Relational Algebra)
10/09/2024
Christalin Nelson | Systems | SoCS
At a Glance
• Formal Languages
• Unary Relational Operations
• Operations from Set Theory
• Binary Relational Operations
• Additional Relational Operations
Formal languages (1/2)
• The following formal languages are associated with the relational model
– Relational algebra
• Used in the internals of many DB implementations for query processing and optimization
– Relational calculus
• Has a firm basis in mathematical logic
• SQL for RDBMSs has some of its foundations in tuple relational calculus
Formal languages (2/2)
• The “expressive power” of the two languages is identical
– Any retrieval that is specified in basic relational algebra can also be specified in
relational calculus, and vice versa.
• “Relationally complete” language
– A relational query language (L) is relationally complete if any query that is expressed in
relational calculus can be expressed in L
– Important basis for comparing the expressive power of high-level query languages
– Note
• Most relational query languages are relationally complete but have more expressive power
than relational algebra or relational calculus because of additional operations (like
aggregate functions, grouping, and ordering)
Relational Algebra (1/2)
• Procedural in nature
• Focuses on describing a sequence of operations to retrieve or manipulate data
– It is also possible to nest algebra operations to form a single expression
• Provides a set of operations that act on relations to produce new relations as results
– E.g. SELECT, PROJECT, JOIN, UNION, INTERSECTION, and DIFFERENCE
– Note:
• Relational algebra operations can be thought of as similar to SQL query clauses
– Relational algebra operations are expressed in a more formal and procedural manner
– Queries can be expressed as sequences of these operations, which are applied to relations to
obtain the desired result
Relational Algebra (2/2)
• Types of Relational algebra operations
– Set operations from mathematical set theory
• Example: UNION, INTERSECTION, SET DIFFERENCE, CARTESIAN PRODUCT (CROSS
PRODUCT)
– Operations developed specifically for relational databases
• Example: SELECT, PROJECT, JOIN, DIVISION
– Additional Operations
• Aggregate functions (can summarize data from the tables)
• Additional types of JOIN and UNION operations (known as OUTER JOINs and OUTER
UNIONs)
Unary Relational Operations: SELECT (1/5)
• Unary Operations
– Applied to a single relation (Example: SELECT, PROJECT)
• SELECT Operation
– Selects/Filters the subset of tuples from a relation that satisfy a selection condition
• i.e. Horizontal partition of relation
– Denoted by σ (sigma): σ<selection condition>(R)
• <selection condition>
– Condition is applied independently to each tuple in relation (R). If TRUE => tuple is selected
– Contains a Boolean expression made up of clauses of the form
» <attribute name> <comparison operator> <constant value or attribute name>
Unary Relational Operations: SELECT (3/5)
• Boolean operators {AND, OR, NOT}
• Comparison operators {=, <, ≤, >, ≥, ≠}
– Note: If the domain of the attribute is
• Set of ordered values => All comparison operators can be used
• Set of unordered values => Only the comparison operators in the set {=, ≠} can be used
– Example: domain Color = { ‘red’, ‘blue’, ‘green’, ‘white’, ‘yellow’, ...}, where no order is specified
among the various colors
• Degree of resulting relation
– Same as the degree of R, i.e. SELECT does not change the No. of attributes
• For any condition, No. of tuples in resulting relation ≤ No. of tuples in R
– i.e. |σc (R)| ≤ |R|
Unary Relational Operations: SELECT (4/5)
• Selectivity
– Fraction of tuples selected by a selection condition
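• SELECT operation is commutative
– σ<cond1>(σ<cond2>(R)) = σ<cond2>(σ<cond1>(R))
– Hence, a sequence of SELECTs can be applied in any order
• Cascade/Combine
– A cascade of SELECT operations can be combined into a single SELECT with a conjunctive (AND) condition
– σ<cond1>(σ<cond2>(...(σ<condn>(R))...)) = σ<cond1> AND <cond2> AND ... AND <condn>(R)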
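The SELECT properties above can be checked on a small example. This is a minimal sketch, assuming a relation is modelled as a Python list of dicts; the helper `select`, the sample EMPLOYEE rows, and the two predicates are illustrative and not taken from the slides.

```python
def select(relation, predicate):
    """sigma: keep the tuples for which predicate(t) is true."""
    return [t for t in relation if predicate(t)]

# Hypothetical sample data (not from the slides)
EMPLOYEE = [
    {"Fname": "John", "Dno": 5, "Salary": 30000},
    {"Fname": "Franklin", "Dno": 5, "Salary": 40000},
    {"Fname": "Alicia", "Dno": 4, "Salary": 25000},
    {"Fname": "James", "Dno": 1, "Salary": 55000},
]

def in_dept5(t):
    return t["Dno"] == 5

def high_paid(t):
    return t["Salary"] > 35000

# Commutativity: the two orders give the same relation
a = select(select(EMPLOYEE, in_dept5), high_paid)
b = select(select(EMPLOYEE, high_paid), in_dept5)

# Cascade: one SELECT with an AND condition
c = select(EMPLOYEE, lambda t: in_dept5(t) and high_paid(t))

# Selectivity: fraction of tuples selected by the condition
selectivity = len(c) / len(EMPLOYEE)
```

All three results contain the same single tuple, and the result never has more tuples than EMPLOYEE.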
Unary Relational Operations: SELECT (5/5)
• In SQL, the SELECT condition is typically specified in the WHERE clause of a query
• Note:
– The SELECT operation is different from the SELECT clause of SQL
– The SELECT operation chooses tuples from a table and is sometimes called a RESTRICT
or FILTER operation
Unary Relational Operations: PROJECT (1/3)
• Selects columns from the table and discards the other columns
– i.e. Vertical partition of relation (Recollect: SELECT did horizontal partition of relation)
• Denoted by π (pi): π<attribute list>(R)
• Degree of resulting relation
– The result of PROJECT operation has only the attributes specified in <attribute list> in the same order as they appear in the list
• Hence, Degree = Number of attributes in <attribute list>
• Duplicate elimination
– If the <attribute list> includes only non-key attributes of R, duplicate tuples are likely to occur
– PROJECT removes any duplicate tuples, so the result is a set of distinct tuples => A valid relation
Unary Relational Operations: PROJECT (3/3)
• Number of Tuples in the resulting relation
– If (projection list includes some key of R)
• No. of tuples in the resulting relation = No. of tuples in R
– Else
• No. of tuples in a resulting relation ≤ No. of tuples in R
• π<list1>(π<list2>(R)) = π<list1>(R) is valid as long as <list2> contains the attributes in <list1>
• Commutativity does not hold on PROJECT
• In SQL, the PROJECT attribute list is specified in the SELECT clause of a query (DISTINCT is needed to eliminate duplicate rows)
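The tuple counts and the nesting rule for PROJECT can be illustrated as follows. This is a sketch under the assumption that a relation is a list of dicts; `project` and the sample rows are hypothetical, and Ssn is assumed to be a key of EMPLOYEE.

```python
def project(relation, attrs):
    """pi: keep only attrs, in the listed order, and drop duplicate tuples."""
    seen, result = set(), []
    for t in relation:
        row = tuple(t[a] for a in attrs)
        if row not in seen:
            seen.add(row)
            result.append(dict(zip(attrs, row)))
    return result

# Hypothetical sample data; Ssn is assumed to be a key
EMPLOYEE = [
    {"Ssn": "111", "Sex": "M", "Salary": 30000},
    {"Ssn": "222", "Sex": "M", "Salary": 30000},
    {"Ssn": "333", "Sex": "F", "Salary": 25000},
]

non_key = project(EMPLOYEE, ["Sex", "Salary"])   # duplicates collapse
with_key = project(EMPLOYEE, ["Ssn", "Salary"])  # key keeps all tuples

# Nested projections reduce to the outer one when list2 contains list1
nested = project(project(EMPLOYEE, ["Sex", "Salary"]), ["Sex"])
direct = project(EMPLOYEE, ["Sex"])
```

Projecting on non-key attributes yields two tuples here, while including the key keeps all three.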
Sequences of Operations and RENAME Operation (1/5)
• Dealing with the Sequence of Operations
– Operations can be nested into a single relational algebra expression (Inline expression)
(or)
– One operation can be applied at a time with intermediate resulting relations
• i.e. Give names to the relations that hold the intermediate results
• Example: Retrieve first name, last name, salary of all employees who work in
department no. 5
– This needs PROJECT and SELECT operations to be sequenced
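The two styles for this example can be written out as below. The attribute names Fname, Lname, and Salary are assumed here for illustration; Dno and DEP5_EMPS follow the names used later in the deck.

```latex
% Inline (nested) expression
\pi_{\text{Fname},\,\text{Lname},\,\text{Salary}}\bigl(\sigma_{\text{Dno}=5}(\text{EMPLOYEE})\bigr)

% One operation at a time, naming the intermediate relation
\text{DEP5\_EMPS} \leftarrow \sigma_{\text{Dno}=5}(\text{EMPLOYEE})
\text{RESULT} \leftarrow \pi_{\text{Fname},\,\text{Lname},\,\text{Salary}}(\text{DEP5\_EMPS})
```

Both forms describe the same query; the second only gives a name to the relation produced by the SELECT step.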
Using intermediate relations and renaming of attributes
Sequences of Operations and RENAME Operation (4/5)
• Rename attributes in intermediate results
• RENAME operation
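– Denoted by ρ (rho)
– Types (S is the new relation name; B1, B2, ..., Bn are the new attribute names)
• Rename both relation & attributes => ρS(B1, B2, ..., Bn)(R)
• Rename the relation only => ρS(R)
• Rename the attributes only => ρ(B1, B2, ..., Bn)(R)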
Sequences of Operations and RENAME Operation (5/5)
• In SQL, a single query typically represents a complex relational algebra expression.
Renaming in SQL is accomplished by aliasing using AS
– Example: Renaming both relation and attributes
Relational Algebra Operations from Set Theory (1/13)
• Binary operations are used to work with the elements of two sets in various ways
• Classifications
– Set operations applicable for two relations that are union-compatible
• UNION
• INTERSECTION
• SET DIFFERENCE or MINUS
– Set operations applicable to two relations that need not be union-compatible
• CARTESIAN PRODUCT
Relational Algebra Operations from Set Theory (2/13)
• Union Compatible (or Type Compatible relations)
– Two relations have the same no. of attributes & each corresponding pair of attributes
has the same domain
– i.e. Two relations R(A1, A2, ..., An) and S(B1, B2, ..., Bn) are said to be union compatible if
• (1) Both have same degree n
• (2) dom(Ai) = dom(Bi) for 1 ≤ i ≤ n
Two union-compatible relations
Relational Algebra Operations from Set Theory (3/13)
• UNION (denoted as R ∪ S)
– Result: A relation that includes all tuples that are either in R or S or both R and S
• INTERSECTION (denoted as R ∩ S)
– Result: A relation that includes all tuples that are in both R and S
• SET DIFFERENCE or MINUS (denoted by R – S)
– Result: A relation that includes all tuples that are in R but not in S
• Note:
– The resulting relation has the same attribute names as the first relation R
– Duplicate tuples are eliminated
– The attributes in the result can be renamed using the RENAME operator
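The three set operations can be sketched with Python sets of tuples, which eliminate duplicates automatically. The relation names STUDENT and INSTRUCTOR follow the deck's example, but the schemas, the rows, and the helper `union_compatible` are assumptions made for illustration (domains are modelled as Python types).

```python
def union_compatible(schema_r, schema_s):
    """Same degree and the same domain (modelled as a Python type) per position."""
    return len(schema_r) == len(schema_s) and all(
        dom_r == dom_s for (_, dom_r), (_, dom_s) in zip(schema_r, schema_s)
    )

# Hypothetical schemas: attribute names differ, domains match
STUDENT_SCHEMA = [("Fn", str), ("Ln", str)]
INSTRUCTOR_SCHEMA = [("Fname", str), ("Lname", str)]

# Hypothetical sample tuples
STUDENT = {("Susan", "Yao"), ("Ramesh", "Shah"), ("Amy", "Ford")}
INSTRUCTOR = {("John", "Smith"), ("Susan", "Yao"), ("Ramesh", "Shah")}

compatible = union_compatible(STUDENT_SCHEMA, INSTRUCTOR_SCHEMA)

union = STUDENT | INSTRUCTOR             # in R or S or both
intersection = STUDENT & INSTRUCTOR      # in both R and S
only_students = STUDENT - INSTRUCTOR     # in R but not in S
only_instructors = INSTRUCTOR - STUDENT  # in S but not in R
```

The two differences give different relations, which shows that MINUS depends on operand order.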
Figure: Two union-compatible relations STUDENT and INSTRUCTOR, with the results of STUDENT ∪ INSTRUCTOR, STUDENT ∩ INSTRUCTOR, STUDENT – INSTRUCTOR, and INSTRUCTOR – STUDENT
Using UNION operation
(To find the distinct SSN from EMPLOYEE with Dno = 5)
Figure labels: DEP5_EMPS, RESULT1 ∪ RESULT2
Relational Algebra Operations from Set Theory (6/13)
• Note:
– UNION and INTERSECTION are commutative; MINUS is not (in general, R – S ≠ S – R)
– Both UNION and INTERSECTION can be treated as n-ary operations (applicable to any
number of relations) because both are also associative operations
– INTERSECTION can be expressed in terms of UNION and DIFFERENCE
– In SQL,
• Set Operations (Eliminate duplicates): UNION, INTERSECT, and EXCEPT
• Multiset operations (Do not eliminate duplicates): UNION ALL, INTERSECT ALL, EXCEPT ALL
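The algebraic properties noted above can be written as identities. The relation name T is introduced here only to state associativity.

```latex
% Commutativity
R \cup S = S \cup R, \qquad R \cap S = S \cap R

% Associativity (which is why both can be treated as n-ary operations)
(R \cup S) \cup T = R \cup (S \cup T), \qquad (R \cap S) \cap T = R \cap (S \cap T)

% INTERSECTION in terms of UNION and DIFFERENCE
R \cap S = \bigl((R \cup S) - (R - S)\bigr) - (S - R)
```

In the last identity, removing the tuples that are only in R from R ∪ S leaves S, and then removing the tuples that are only in S leaves exactly the tuples common to both.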
Relational Algebra Operations from Set Theory (7/13)
• CARTESIAN PRODUCT (1/7)
– Also called CROSS PRODUCT or CROSS JOIN
– Denoted by ×
• R(A1, A2, ..., An) × S(B1, B2, ..., Bm)
– If Q is the resulting relation
• Degree of Q: n + m
• Total tuples in Q
– If R has nR tuples (denoted as |R| = nR), and S has nS tuples, then Q will have nR * nS tuples
» i.e. Product of the number of rows in the two tables being joined
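The degree and tuple count of a CARTESIAN PRODUCT can be verified with a short sketch. It assumes relations are lists of dicts with disjoint attribute names; the relation names and rows below are hypothetical.

```python
from itertools import product

def cartesian_product(r, s):
    """R x S: concatenate every tuple of R with every tuple of S.
    Assumes the attribute names of R and S are disjoint."""
    return [{**t_r, **t_s} for t_r, t_s in product(r, s)]

# Hypothetical sample data
FEMALE_EMPS = [
    {"Fname": "Alicia", "Ssn": "999"},
    {"Fname": "Jennifer", "Ssn": "987"},
]
DEPENDENT = [
    {"Essn": "987", "Dependent_name": "Abner"},
    {"Essn": "333", "Dependent_name": "Alice"},
    {"Essn": "123", "Dependent_name": "Michael"},
]

Q = cartesian_product(FEMALE_EMPS, DEPENDENT)
degree = len(Q[0])     # n + m attributes
cardinality = len(Q)   # nR * nS tuples
```

Here Q has 2 + 2 attributes and 2 * 3 tuples, most of which pair an employee with someone else's dependent.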
Relational Algebra Operations from Set Theory (8/13)
• CARTESIAN PRODUCT (2/7)
– Example: Retrieve a list of names of each female employee’s dependents
Figure: Using CARTESIAN PRODUCT operation (Retrieve a list of names of each female employee’s dependents). Annotations on the intermediate result: order of attributes; No. of tuples = 3 × 7 = 21; Degree = 3 + 5 = 8
Relational Algebra Operations from Set Theory (12/13)
• CARTESIAN PRODUCT (6/7)
– Useful when followed by a selection that matches values of attributes
• This sequence is so common that the JOIN operation was introduced to specify it as a single operation
– In SQL, CARTESIAN PRODUCT can be realized by
• Using the CROSS JOIN option in joined tables
• Alternatively, if more than one relation is specified in the FROM clause and there is no WHERE clause, the CROSS PRODUCT of these relations is returned (without duplicate elimination)
SELECT *
FROM Tab1
CROSS JOIN Tab2;
Relational Algebra Operations from Set Theory (13/13)
• CARTESIAN PRODUCT (7/7)
– Every row from the first table is combined with every row from the second table,
resulting in a Cartesian product
– It does not require any join condition to be specified
– Can be computationally expensive and should be used with caution, especially when
joining large tables
Binary Relational Operations: JOIN (1/14)
• Denoted by ⋈
• Process relationships among relations
– General form of JOIN operation on two relations R(A1, A2, ..., An) & S(B1, B2, ..., Bm): R ⋈<join condition> S
• <join condition> is of the form <condition> AND <condition> AND...AND <condition>
• If Q is the resulting relation
– Degree of Q = Degree of R + Degree of S = n + m
– Vs. Cartesian Product
• In JOIN, only combinations of tuples satisfying the join condition appear in the result
– i.e. One tuple in Q = Tuple of Table1 + Tuple of Table2 ONLY when the join condition is TRUE
– Tuples whose join attributes are NULL (or) for which join condition is FALSE do not appear in Q
• In CARTESIAN PRODUCT all combinations of tuples are included in the result
Binary Relational Operations: JOIN (2/14)
• Example: Retrieve the name of the manager of each department
– (1) Combine each DEPARTMENT tuple with EMPLOYEE tuple if join condition is TRUE
– (2) PROJECT the result with suitable attributes
Figure: Retrieve the name of the manager of each department (JOIN result, then PROJECT the result with suitable attributes)
Binary Relational Operations: JOIN (4/14)
• JOIN can be specified as a CARTESIAN PRODUCT operation followed by a SELECT operation (as discussed under CARTESIAN PRODUCT (6/7))
– i.e. R ⋈<join condition> S = σ<join condition>(R × S)
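The equivalence between JOIN and a CARTESIAN PRODUCT followed by a SELECT can be sketched as below. Relations are assumed to be lists of dicts; `theta_join`, the attribute names, and the rows are illustrative only.

```python
from itertools import product

def theta_join(r, s, condition):
    """Keep only the combined tuples for which the join condition holds."""
    return [{**t_r, **t_s} for t_r, t_s in product(r, s) if condition(t_r, t_s)]

# Hypothetical sample data
DEPARTMENT = [
    {"Dname": "Research", "Mgr_ssn": "333"},
    {"Dname": "Administration", "Mgr_ssn": "987"},
]
EMPLOYEE = [
    {"Ssn": "123", "Lname": "Smith"},
    {"Ssn": "333", "Lname": "Wong"},
    {"Ssn": "987", "Lname": "Wallace"},
]

# Way 1: CARTESIAN PRODUCT, then SELECT on the combined tuples
all_pairs = [{**d, **e} for d, e in product(DEPARTMENT, EMPLOYEE)]
product_then_select = [t for t in all_pairs if t["Mgr_ssn"] == t["Ssn"]]

# Way 2: JOIN as a single operation
joined = theta_join(DEPARTMENT, EMPLOYEE, lambda d, e: d["Mgr_ssn"] == e["Ssn"])

# PROJECT the result with suitable attributes
managers = [(t["Dname"], t["Lname"]) for t in joined]
```

The product builds six combined tuples, but only the two that satisfy the condition survive in either formulation.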
Binary Relational Operations: JOIN (5/14)
• NATURAL JOIN (1/4)
– Denoted by *
– <join condition>
• Automatically determined based on the columns with the same name in both tables
– If the names of join attributes are different, a RENAME operation is applied first
• It combines rows from both tables where the values in the matching columns are equal
– Removes duplicate columns from the result
– In SQL
SELECT *
FROM Tab1
NATURAL JOIN Tab2;
Binary Relational Operations: JOIN (7/14)
• NATURAL JOIN (3/4)
– Example: Combine PROJECT with DEPARTMENT
• (1) The join attribute is the department number. As NATURAL JOIN requires the join attribute to have the same name in both tables, the “Dnumber” attribute of DEPARTMENT is renamed to “Dnum”
• (2) Apply NATURAL JOIN
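The two steps above can be sketched as follows. This assumes relations are non-empty lists of dicts; the helpers `rename` and `natural_join`, the attributes Pname and Dname, and the sample rows are hypothetical.

```python
def rename(relation, mapping):
    """rho: rename attributes according to mapping (old name -> new name)."""
    return [{mapping.get(k, k): v for k, v in t.items()} for t in relation]

def natural_join(r, s):
    """Join on all attributes with the same name; each appears once in the result.
    Assumes both relations are non-empty."""
    common = [a for a in r[0] if a in s[0]]
    return [
        {**t_r, **t_s}
        for t_r in r
        for t_s in s
        if all(t_r[a] == t_s[a] for a in common)
    ]

# Hypothetical sample data
PROJECT = [
    {"Pname": "ProductX", "Dnum": 5},
    {"Pname": "Computerization", "Dnum": 4},
]
DEPARTMENT = [
    {"Dname": "Research", "Dnumber": 5},
    {"Dname": "Administration", "Dnumber": 4},
]

# (1) Rename Dnumber to Dnum, (2) apply NATURAL JOIN
DEPT = rename(DEPARTMENT, {"Dnumber": "Dnum"})
PROJ_DEPT = natural_join(PROJECT, DEPT)
```

Each result tuple carries Dnum only once, since merging the two dicts keeps a single entry per attribute name.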
Binary Relational Operations: JOIN (9/14)
• EQUI JOIN (1/2)
– Denoted by ⋈ with a <join condition> that uses only equality comparisons
– An EQUI JOIN is a specific instance of THETA JOIN in which the <join condition> is explicitly specified using the equality operator (=) on join attributes
– EQUI JOIN provides more control over <join condition> compared to NATURAL JOIN
– Tables are joined based on the equality of values in specified join attributes/columns
• The result always has one or more pairs of attributes that have identical values in every tuple
– Most common type of join and is often used to match related rows between tables
– In SQL,
SELECT *
FROM Tab1 A
JOIN Tab2 B ON A.id = B.id;
Binary Relational Operations: JOIN (10/14)
• EQUI JOIN (2/2)
– Example:
EMPLOYEE
Employee_ID   Name           Department_ID
1             John Doe       1
2             Jane Smith     1
3             Mike Johnson   2
4             Emily Brown    3
DEPARTMENT
department_id   department_name
1               Engineering
2               Marketing
3               Human Resources
SELECT E.name AS employee_name, D.department_name
FROM EMPLOYEE E
JOIN DEPARTMENT D ON E.department_id = D.department_id;
RESULTING RELATION
EMPLOYEE_NAME   DEPARTMENT_NAME
John Doe        Engineering
Jane Smith      Engineering
Mike Johnson    Marketing
Emily Brown     Human Resources
Binary Relational Operations: JOIN (11/14)
• THETA JOIN (1/2)
– Denoted by R ⋈θ S, where θ (theta) is the join condition
– <join condition> is based on a general comparison condition (theta condition), which can include any comparison operator such as '=', '>', '<', '>=', '<=', or '<>'
– THETA JOIN provides more flexibility than EQUI JOIN as it allows a <join condition> other than equality
• i.e. Can be used to perform joins based on custom conditions that are not limited to column equality
– Tuples whose join attributes are NULL (or) for which the join condition is FALSE do not appear in the result
– In SQL
SELECT *
FROM Tab1 A
JOIN Tab2 B ON A.col1 > B.col2;
Binary Relational Operations: JOIN (12/14)
• THETA JOIN (2/2)
– Example
EMPLOYEES
employee_id   name           department
1             John Doe       Engineering
2             Jane Smith     Engineering
3             Mike Johnson   Marketing
4             Emily Brown    Human Resources
SALARIES
employee_id   salary
1             60000.00
2             65000.00
3             55000.00
4             60000.00
SELECT Employees.name
FROM Employees
JOIN Salaries ON Employees.employee_id = Salaries.employee_id
WHERE Salaries.salary > 59000;
RESULTING RELATION
NAME
John Doe
Jane Smith
Emily Brown
– Note: Here the tables are matched on employee_id (an equality) and the comparison salary > 59000 is applied in the WHERE clause; placing both conditions in the ON clause gives the same result
Binary Relational Operations: JOIN (13/14)
• Variations of JOIN
– n-way JOIN
• NATURAL JOIN or EQUIJOIN operation specified among multiple tables
– INNER JOIN
• Defined formally as a combination of CARTESIAN PRODUCT and SELECTION
• It is a broader concept that encompasses various types of join operations where only
matching rows from both tables are included in the result set
• In an INNER JOIN, rows from two tables are combined based on a specified condition,
which could be an equality condition (EQUI JOIN) or any other condition
• In SQL
SELECT *
FROM Tab1 A
INNER JOIN Tab2 B ON A.id = B.id; -- Here, it is like EQUI JOIN
Binary Relational Operations: JOIN (14/14)
• In SQL, JOIN can be realized in several different ways
– (1) Specify <join conditions> in WHERE clause, along with other selection conditions
– (2) Use a nested relation
– (3) Use the concept of joined tables
• The construct of joined tables allows user to specify explicitly all the various types of join
• It also allows user to distinguish join conditions from selection conditions in WHERE clause
• Note:
– Join selectivity = Expected size of join result / Maximum size (nR * nS)
Binary Relational Operations: DIVISION (1/5)
• Denoted by ÷
• Example-1:
– R ÷ S
• For a tuple (t) to appear in the resulting relation (T), the values in t must appear in R in combination with every tuple in S
• Example-2:
– Retrieve names of EMPLOYEES who work on all the projects that ‘John Smith’ works on
Figure: Retrieve names of EMPLOYEES who work on all the projects that ‘John Smith’ works on (relation SMITH, intermediate steps, and Result)
Binary Relational Operations: DIVISION (5/5)
• DIVISION operation can be expressed as a sequence of π, ×, and – operations
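One common way to write this sequence is T1 ← πY(R), T2 ← πY((S × T1) – R), T ← T1 – T2, where Y denotes the attributes of R that are not in S. The sketch below follows those steps, assuming R is a set of (y, x) pairs and S is a set of x values; the function `divide` and the sample data are illustrative, not taken from the slides.

```python
def divide(r, s):
    """R(Y, X) / S(X), with R as a set of (y, x) pairs and S as a set of x values."""
    candidates = {y for y, _ in r}                          # T1 = pi_Y(R)
    missing = {(y, x) for y in candidates for x in s} - r   # (S x T1) - R
    disqualified = {y for y, _ in missing}                  # T2 = pi_Y(...)
    return candidates - disqualified                        # T = T1 - T2

# Hypothetical sample data: (employee, project number) pairs
WORKS_ON = {
    ("Smith", 1), ("Smith", 2),
    ("Wong", 2), ("Wong", 3),
    ("English", 1), ("English", 2), ("English", 3),
}
SMITH_PNOS = {1, 2}  # projects that Smith works on

result = divide(WORKS_ON, SMITH_PNOS)

# Direct reading of the definition, for comparison
by_definition = {
    y for y, _ in WORKS_ON if all((y, x) in WORKS_ON for x in SMITH_PNOS)
}
```

Wong is excluded because the pair (Wong, 1) is missing from WORKS_ON, so Wong does not appear in combination with every tuple of the divisor.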
Operations of Relational Algebra (1/2)
Operations of Relational Algebra (2/2)
Query Tree (1/2)
• Data structure for the internal representation of a query in an RDBMS
• Also called Query evaluation tree (or) Query execution tree
– Internal Representation
• Leaf Nodes => Input relations of query
• Internal Nodes => Relational algebra operations
• Query Execution
– Execution of an internal node operation starts when its operands (child nodes) are
available and it gets replaced by the relation that results from executing the operation
– Execution terminates when the root node is executed and produces the final result
relation for the query
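The execution rule described above can be sketched as a small recursive evaluator. The classes `Leaf` and `Op`, the function `execute`, and the sample relation are hypothetical names chosen for illustration; a real RDBMS uses far richer structures.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Leaf:
    relation: list        # an input relation of the query

@dataclass
class Op:
    apply: Callable       # a relational algebra operation
    children: List        # operand sub-trees

def execute(node):
    """Evaluate children first; an internal node runs once its operands are available."""
    if isinstance(node, Leaf):
        return node.relation
    operands = [execute(child) for child in node.children]
    return node.apply(*operands)

# Hypothetical sample data
EMPLOYEE = [{"Lname": "Smith", "Dno": 5}, {"Lname": "Wallace", "Dno": 4}]

# Tree for: PROJECT Lname over (SELECT Dno = 5 over EMPLOYEE)
select_dno5 = Op(lambda r: [t for t in r if t["Dno"] == 5], [Leaf(EMPLOYEE)])
root = Op(lambda r: [{"Lname": t["Lname"]} for t in r], [select_dno5])

result = execute(root)
```

Evaluation ends when the root node returns, which matches the termination rule stated above.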
Additional Relational Operations (1/10)
• Some operations cannot be specified in the basic original relational algebra
• Generalized projection
– Allows functions of attributes to be included in the projection list
– Denoted by πF1, F2, ..., Fn(R), where F1, F2, ..., Fn are functions over the attributes of R
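A generalized projection with renaming might look like the sketch below. The function `generalized_project`, the output names, the formulas, and the EMPLOYEE row are all assumptions made for illustration; the deck's own report example is not reproduced in the extracted text.

```python
def generalized_project(relation, columns):
    """columns maps each output attribute name to a function of the input tuple."""
    return [{name: f(t) for name, f in columns.items()} for t in relation]

# Hypothetical sample data
EMPLOYEE = [
    {"Ssn": "123", "Salary": 30000, "Deduction": 2000, "Years_service": 4},
]

# Hypothetical report: each output attribute is computed from input attributes
REPORT = generalized_project(
    EMPLOYEE,
    {
        "Ssn": lambda t: t["Ssn"],
        "Net_salary": lambda t: t["Salary"] - t["Deduction"],
        "Bonus": lambda t: 2000 * t["Years_service"],
        "Tax": lambda t: 0.25 * t["Salary"],
    },
)
```

The dictionary keys play the role of the RENAME step, since computed columns have no attribute name of their own.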
Generalized Projection (with renaming) to obtain a report from the relation EMPLOYEE
Additional Relational Operations (3/10)
• Aggregate functions and grouping
– Types
• (1) Mathematical Functions applied to collections of numeric values from database
– Example: SUM, AVERAGE, MAXIMUM, MINIMUM
• (2) Function applied to count tuples/values
– Example: COUNT
• (3) Group tuples by the value of some of their attributes and apply aggregate function
independently to each group
– Denoted by ℱ (pronounced script F): <grouping attributes> ℱ <function list> (R)
– <grouping attributes>: list of attributes of the relation R
– <function list>: list of (<function> <attribute>) pairs
» In each such pair, <function> is one of the allowed functions (such as SUM, AVERAGE, MAXIMUM, MINIMUM, COUNT) and <attribute> is an attribute of relation R
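Grouping followed by aggregation can be sketched as below, assuming relations are lists of dicts. The function `aggregate`, the output attribute names, and the sample rows are hypothetical.

```python
from collections import defaultdict

def average(values):
    return sum(values) / len(values)

def aggregate(relation, grouping, functions):
    """Group tuples by the grouping attributes, then apply each
    (function, attribute) pair independently to every group."""
    groups = defaultdict(list)
    for t in relation:
        groups[tuple(t[a] for a in grouping)].append(t)
    result = []
    for key, members in groups.items():
        row = dict(zip(grouping, key))
        for out_name, (func, attr) in functions.items():
            row[out_name] = func([m[attr] for m in members])
        result.append(row)
    return result

# Hypothetical sample data
EMPLOYEE = [
    {"Ssn": "111", "Dno": 5, "Salary": 30000},
    {"Ssn": "222", "Dno": 5, "Salary": 40000},
    {"Ssn": "333", "Dno": 4, "Salary": 25000},
]

# Group by Dno; COUNT Ssn and AVERAGE Salary per department
per_dept = aggregate(
    EMPLOYEE,
    ["Dno"],
    {"Count_ssn": (len, "Ssn"), "Average_salary": (average, "Salary")},
)
```

With an empty grouping list, all tuples of a non-empty relation fall into one group and the functions summarize the whole relation.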
Find the missing operation?
Additional Relational Operations (6/10)
• Recursive Closure operations
– Operation applied to a recursive relationship between tuples of same type
– SQL3 standard includes syntax for recursive closure
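A typical recursive relationship is supervision between employees. The sketch below computes everyone supervised by a given employee at any level, assuming the relationship is stored as (employee, supervisor) pairs; the function `supervised_by` and the sample pairs are illustrative only.

```python
def supervised_by(supervision, boss):
    """All employees under `boss` at any level.
    `supervision` is a set of (employee, supervisor) pairs."""
    found, frontier = set(), {boss}
    while frontier:
        # One more level: employees whose supervisor is in the current frontier
        frontier = {e for e, s in supervision if s in frontier} - found
        found |= frontier
    return found

# Hypothetical sample data
SUPERVISION = {
    ("Wong", "Borg"), ("Wallace", "Borg"),
    ("Smith", "Wong"), ("Narayan", "Wong"),
    ("Zelaya", "Wallace"),
}

under_borg = supervised_by(SUPERVISION, "Borg")
under_wong = supervised_by(SUPERVISION, "Wong")
```

The loop repeats a join-like step until no new employees appear, which is the part that a fixed-length expression in basic relational algebra cannot express for an unknown number of levels.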
Additional Relational Operations (7/10)
• Outer Joins
– Keep all tuples in R, or all those in S, or all those in both relations regardless of
whether or not they have matching tuples in the other relation
– Types (part of the SQL2 standard)
• LEFT OUTER JOIN
• RIGHT OUTER JOIN
• FULL OUTER JOIN
– LEFT OUTER JOIN (denoted by ⟕)
• Keep every tuple in the first (or left) relation R in R ⟕ S
• If NO matching tuple is found in S, then the attributes of S in the join result are filled or padded with NULL values
Additional Relational Operations (9/10)
• Outer Joins
– RIGHT OUTER JOIN (denoted by ⟖)
• Keep every tuple in the second (or right) relation S in R ⟖ S
• If NO matching tuple is found in R, then the attributes of R in the join result are filled or padded with NULL values
– FULL OUTER JOIN (denoted by ⟗)
• Keep every tuple in both the left and right relations R and S
• If a tuple has NO matching tuple in the other relation, then the attributes of the other relation in the join result are filled or padded with NULL values
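The padding behaviour of outer joins can be sketched as follows, with Python's None standing in for NULL. The function `left_outer_join`, the attribute names, and the rows are hypothetical; a RIGHT OUTER JOIN is obtained here simply by swapping the operands.

```python
def left_outer_join(r, s, condition, s_attrs):
    """Keep every tuple of r; pad the attributes of s with None when nothing matches."""
    result = []
    for t_r in r:
        matches = [t_s for t_s in s if condition(t_r, t_s)]
        if matches:
            result.extend({**t_r, **t_s} for t_s in matches)
        else:
            result.append({**t_r, **{a: None for a in s_attrs}})
    return result

# Hypothetical sample data
EMPLOYEE = [{"Ssn": "123", "Lname": "Smith"}, {"Ssn": "333", "Lname": "Wong"}]
DEPARTMENT = [{"Dname": "Research", "Mgr_ssn": "333"}]

# EMPLOYEE left outer join DEPARTMENT on Ssn = Mgr_ssn
left = left_outer_join(
    EMPLOYEE, DEPARTMENT, lambda e, d: e["Ssn"] == d["Mgr_ssn"], ["Dname", "Mgr_ssn"]
)

# Same pair of relations, keeping every DEPARTMENT tuple instead
right = left_outer_join(
    DEPARTMENT, EMPLOYEE, lambda d, e: e["Ssn"] == d["Mgr_ssn"], ["Ssn", "Lname"]
)
```

Smith manages no department, so Smith's tuple survives only in the first result, padded with None.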
Additional Relational Operations (10/10)
• OUTER UNION Operation
– Union of tuples from two partially union (type) compatible relations R(X, Y) and S(X, Z)
that have some common attributes (X)
– Resulting Relation representation: T (X, Y, Z)
• Union compatible (X) attributes – represented ONLY once in the result
– Note: Same as FULL OUTER JOIN on the common attributes
• Non-union-compatible (Y, Z) attributes – Also kept in the result; a tuple with no value for them is padded with NULL
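An OUTER UNION of R(X, Y) and S(X, Z) can be sketched as below. The function `outer_union` and the sample relations are illustrative, and the sketch assumes that the common attributes X identify at most one tuple in each relation.

```python
def outer_union(r, s, common, r_only, s_only):
    """Combine tuples that agree on the common attributes; pad the rest with None.
    Assumes the common attributes identify at most one tuple per relation."""
    merged = {}
    for t in r:
        key = tuple(t[a] for a in common)
        merged[key] = {**t, **{a: None for a in s_only}}
    for t in s:
        key = tuple(t[a] for a in common)
        row = merged.setdefault(
            key, {**{a: t[a] for a in common}, **{a: None for a in r_only}}
        )
        row.update({a: t[a] for a in s_only})
    return list(merged.values())

# Hypothetical sample data: X = (Name, Ssn), Y = (Advisor), Z = (Rank)
STUDENT = [
    {"Name": "Amy", "Ssn": "1", "Advisor": "Wong"},
    {"Name": "Bob", "Ssn": "2", "Advisor": "Borg"},
]
INSTRUCTOR = [
    {"Name": "Bob", "Ssn": "2", "Rank": "Lecturer"},
    {"Name": "Cara", "Ssn": "3", "Rank": "Professor"},
]

T = outer_union(STUDENT, INSTRUCTOR, ["Name", "Ssn"], ["Advisor"], ["Rank"])
```

Bob appears in both relations and is merged into one tuple, while Amy and Cara are padded on the attributes they lack.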
Thank You
10/09/2024
Christalin Nelson | Systems | SoCS
66 of 67

Relational Algebra in Database Management System

  • 1.
    Formal Languages -I (Relational Algebra) 10/09/2024 Christalin Nelson | Systems | SoCS 1 of 67
  • 2.
    At a Glance •Formal Languages • Unary Relational Operations • Operations from Set Theory • Binary Relational Operations • Additional Relational Operations 10/09/2024 Christalin Nelson | Systems | SoCS 2 of 66
  • 3.
    Formal languages (1/2) •The following formal languages are associated with Relational model – Relational algebra • Used in internals of many DB implementations for query processing and optimization – Relational calculus • Has firm basis on mathematical logic • SQL for RDBMSs has some of its foundations in tuple relational calculus 10/09/2024 Christalin Nelson | Systems | SoCS 3 of 66
  • 4.
    Formal languages (2/2) •“Expressive power” of both languages are identical – Any retrieval that is specified in basic relational algebra can also be specified in relational calculus, and vice versa. • “Relationally complete” language – A relational query language (L) is relationally complete if any query that is expressed in relational calculus can be expressed in L – Important basis for comparing the expressive power of high-level query languages – Note • Most relational query languages are relationally complete but have more expressive power than relational algebra or relational calculus because of additional operations (like aggregate functions, grouping, and ordering) 10/09/2024 Christalin Nelson | Systems | SoCS 4 of 66
  • 5.
    Relational Algebra (1/2) •Procedural in nature • Focuses on describing a sequence of operations to retrieve or manipulate data – It is also possible to nest algebra operations to form a single expression • Provides Set of operations that act on relations to produce new relations as results – E.g. SELECT, PROJECT, JOIN, UNION, INTERSECTION, and DIFFERENCE – Note: • Relational algebra operations can be thought of as similar to SQL query clauses – Relational algebra operations are expressed in a more formal and procedural manner – Queries can be expressed as sequences of these operations, which are applied to relations to obtain the desired result 10/09/2024 Christalin Nelson | Systems | SoCS 5 of 66
  • 6.
    Relational Algebra (2/2) •Types of Relational algebra operations – Set operations from mathematical set theory • Example: UNION, INTERSECTION, SET DIFFERENCE, CARTESIAN PRODUCT (CROSS PRODUCT) – Operations developed specifically for relational databases • Example: SELECT, PROJECT, JOIN, DIVISION – Additional Operations • Aggregate functions (can summarize data from the tables) • Additional types of JOIN and UNION operations (known as OUTER JOINs and OUTER UNIONs) 10/09/2024 Christalin Nelson | Systems | SoCS 6 of 66
  • 7.
    Unary Relational Operations:SELECT (1/5) • Unary Operations – Applied to a single relation (Example: SELECT, PROJECT) • SELECT Operation – Selects/Filters a subset of tuples from a relation that satisfies a selection condition • i.e. Horizontal partition of relation – Denoted by sigma • <selection condition> – Condition is applied independently to each tuple in relation (R). If TRUE => tuple is selected – Contains Boolean expression which has clauses of the form » <attribute name> <comparison operator> <constant value or attribute name> 10/09/2024 Christalin Nelson | Systems | SoCS 7 of 66
  • 8.
    10/09/2024 Christalin Nelson |Systems | SoCS 8 of 67
  • 9.
    Unary Relational Operations:SELECT (3/5) • Boolean operators {AND, OR, NOT} • Comparison operators {=, <, ≤, >, ≥, ≠} – Note: If the domain of the attribute is • Set of ordered values => All comparison operators can be used • Set of unordered values => Only the comparison operators in the set {=, ≠} can be used – Example: domain Color = { ‘red’, ‘blue’, ‘green’, ‘white’, ‘yellow’, ...}, where no order is specified among the various colors • Degree of resulting relation – No. of attributes resulting from a SELECT operation • For any condition, No. of tuples in resulting relation ≤ No. of tuples in R – i.e. |σc (R)| ≤ |R| 10/09/2024 Christalin Nelson | Systems | SoCS 9 of 66
  • 10.
    Unary Relational Operations:SELECT (4/5) • Selectivity – Fraction of tuples selected by a selection condition • SELECT operation is commutative – A sequence of SELECTs can be applied in any order • Cascade/Combine – SELECT operations can be cascaded into a single operation with an AND condition 10/09/2024 Christalin Nelson | Systems | SoCS 10 of 66
  • 11.
    Unary Relational Operations:SELECT (5/5) • In SQL, the SELECT condition is typically specified in the WHERE clause of a query • Note: – The SELECT operation is different from the SELECT clause of SQL – The SELECT operation chooses tuples from a table and is sometimes called a RESTRICT or FILTER operation 10/09/2024 Christalin Nelson | Systems | SoCS 11 of 66
  • 12.
    Unary Relational Operations:PROJECT (1/3) • Selects columns from the table and discards the other columns – i.e. Vertical partition of relation (Recollect: SELECT did horizontal partition of relation) • Denoted by pi • Degree of resulting relation – The result of PROJECT operation has only the attributes specified in <attribute list> in the same order as they appear in the list • Hence, Degree = Number of attributes in <attribute list> • Duplicate elimination – If the <attribute list> includes only non-key attributes of R, the result of PROJECT operation is a set of distinct tuples => A valid relation 10/09/2024 Christalin Nelson | Systems | SoCS 12 of 66
  • 13.
    10/09/2024 Christalin Nelson |Systems | SoCS 13 of 67
  • 14.
    Unary Relational Operations:PROJECT (3/3) • Number of Tuples in the resulting relation – If (projection list includes some key of R) • No. of tuples in the resulting relation = No. of tuples in R – Else • No. of tuples in a resulting relation ≤ No. of tuples in R • The following expression is valid as long as <list2> contains the attributes in <list1> • Commutativity does not hold on PROJECT • In SQL, the PROJECT condition is typically specified in the SELECT clause of a query 10/09/2024 Christalin Nelson | Systems | SoCS 14 of 66
  • 15.
    Sequences of Operationsand RENAME Operation (1/5) • Dealing with the Sequence of Operations – Operations can be nested into a single relational algebra expression (Inline expression) (or) – One operation can be applied at a time with intermediate resulting relations • i.e. Give names to the relations that hold the intermediate results • Example: Retrieve first name, last name, salary of all employees who work in department no. 5 – This needs PROJECT and SELECT operations to be sequenced 10/09/2024 Christalin Nelson | Systems | SoCS 15 of 66
  • 16.
    10/09/2024 Christalin Nelson |Systems | SoCS 16 of 67
  • 17.
    Using intermediate relationsand renaming of attributes 10/09/2024 Christalin Nelson | Systems | SoCS 17 of 67
  • 18.
    Sequences of Operationsand RENAME Operation (4/5) • Rename attributes in intermediate results • RENAME operation – Denoted as rho – Types • Rename both relation & attributes=> • Rename the relation only => • Rename the attributes only => 10/09/2024 Christalin Nelson | Systems | SoCS 18 of 66
  • 19.
    Sequences of Operationsand RENAME Operation (5/5) • In SQL, a single query typically represents a complex relational algebra expression. Renaming in SQL is accomplished by aliasing using AS – Example: Renaming both relation and attributes 10/09/2024 Christalin Nelson | Systems | SoCS 19 of 66
  • 20.
    Relational Algebra Operationsfrom Set Theory (1/13) • Binary operations are used to work with the elements of two sets in various ways • Classifications – Set operations applicable for two relations that are union-compatible • UNION • INTERSECTION • SET DIFFERENCE or MINUS – Set operations applicable to two relations that are not union-compatible • CARTESIAN PRODUCT 10/09/2024 Christalin Nelson | Systems | SoCS 20 of 66
  • 21.
    Relational Algebra Operationsfrom Set Theory (2/13) • Union Compatible (or Type Compatible relations) – Two relations have the same no. of attributes & each corresponding pair of attributes has the same domain – i.e. Two relations R(A1, A2, ..., An) and S(B1, B2, ..., Bn) are said to be union compatible if • (1) Both have same degree n • (2) dom(Ai) = dom(Bi) for 1 ≤ i ≤ n Two union-compatible relations 10/09/2024 Christalin Nelson | Systems | SoCS 21 of 66
  • 22.
    Relational Algebra Operationsfrom Set Theory (3/13) • UNION (denoted as R S) ∪ – Result: A relation that includes all tuples that are either in R or S or both R and S • INTERSECTION (denoted as R ∩ S) – Result: A relation that includes all tuples that are in both R and S • SET DIFFERENCE or MINUS (denoted by R – S) – Result: A relation that includes all tuples that are in R but not in S • Note: – The resulting relation has the same attribute names as the first relation R – Duplicate tuples are eliminated – The attributes in the result can be renamed using the RENAME operator 10/09/2024 Christalin Nelson | Systems | SoCS 22 of 66
  • 23.
    Two union-compatible relations
    [Figure: relations STUDENT and INSTRUCTOR, with the results of STUDENT ∪ INSTRUCTOR, STUDENT ∩ INSTRUCTOR, STUDENT – INSTRUCTOR, and INSTRUCTOR – STUDENT]
  • 24.
    Using UNION operation
    (To find the distinct SSN from EMPLOYEE with Dno = 5)
    [Figure: relations DEP5_EMPS, RESULT1, RESULT2, and RESULT1 ∪ RESULT2]
  • 25.
    Relational Algebra Operations from Set Theory (6/13)
    • Note:
      – UNION and INTERSECTION are commutative; MINUS is not
      – Both UNION and INTERSECTION can be treated as n-ary operations (applicable to any number of relations) because both are also associative
      – INTERSECTION can be expressed in terms of UNION and DIFFERENCE
      – In SQL,
        • Set operations (eliminate duplicates): UNION, INTERSECT, and EXCEPT
        • Multiset operations (do not eliminate duplicates): UNION ALL, INTERSECT ALL, EXCEPT ALL
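Two of the notes above can be checked with a short sketch. The first part verifies one common way of writing INTERSECTION using only UNION and DIFFERENCE, R ∩ S = ((R ∪ S) – (R – S)) – (S – R). The second part contrasts UNION with UNION ALL using Python's sqlite3 module; SQLite is an assumption here, and only these two SQL operators are exercised. Tables R and S and their values are hypothetical.

```python
import sqlite3

r = {1, 2, 3}
s = {2, 3, 4}
# INTERSECTION rebuilt from UNION and DIFFERENCE only.
intersection_via_union = ((r | s) - (r - s)) - (s - r)

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R (a INTEGER);
    CREATE TABLE S (a INTEGER);
    INSERT INTO R VALUES (1), (2), (3);
    INSERT INTO S VALUES (2), (3), (4);
""")
# UNION removes duplicates (set semantics); UNION ALL keeps them (multiset semantics).
set_union = conn.execute(
    "SELECT a FROM R UNION SELECT a FROM S ORDER BY a"
).fetchall()
multiset_union = conn.execute(
    "SELECT a FROM R UNION ALL SELECT a FROM S ORDER BY a"
).fetchall()
print(intersection_via_union, set_union, multiset_union)
```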
  • 26.
    Relational Algebra Operations from Set Theory (7/13)
    • CARTESIAN PRODUCT (1/7)
      – Also called CROSS PRODUCT or CROSS JOIN
      – Denoted by ×
        • R(A1, A2, ..., An) × S(B1, B2, ..., Bm)
      – If Q is the resulting relation
        • Degree of Q: n + m
        • Total tuples in Q
          – If R has nR tuples (denoted as |R| = nR), and S has nS tuples, then Q will have nR * nS tuples
            » i.e. Product of the number of rows in the two tables being joined
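The degree and cardinality rules above can be confirmed with a small sketch using itertools.product. The relations r and s and their values are placeholders, not data from the slides.

```python
from itertools import product

# Hypothetical relations: r has degree 2 and 3 tuples, s has degree 3 and 2 tuples.
r = [("a1", "b1"), ("a2", "b2"), ("a3", "b3")]
s = [("c1", "d1", "e1"), ("c2", "d2", "e2")]

# Each result tuple is a tuple of r concatenated with a tuple of s.
q = [t_r + t_s for t_r, t_s in product(r, s)]
degree_q = len(q[0])

print(len(q), degree_q)  # nR * nS tuples, degree n + m
```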
  • 27.
    Relational Algebra Operations from Set Theory (8/13)
    • CARTESIAN PRODUCT (2/7)
      – Example: Retrieve a list of names of each female employee’s dependents
  • 28.
    Using CARTESIAN PRODUCT operation
    (Retrieve a list of names of each female employee’s dependents)
  • 29.
    Using CARTESIAN PRODUCT operation
    (Retrieve a list of names of each female employee’s dependents)
    • Order of Attributes
    • No. of Tuples = 3 × 7 = 21
    • Degree = 3 + 5 = 8
  • 30.
    Retrieve a list of names of each female employee’s dependents
  • 31.
    Relational Algebra Operations from Set Theory (12/13)
    • CARTESIAN PRODUCT (6/7)
      – Useful when followed by a selection that matches values of attributes
        • This motivates the JOIN operation, which specifies this sequence as a single operation
      – In SQL, CARTESIAN PRODUCT can be realized by
        • Using the CROSS JOIN option in joined tables
          SELECT * FROM Tab1 CROSS JOIN Tab2;
        • Alternatively, if more than one relation is specified in the FROM clause and there is no WHERE clause, then the CROSS PRODUCT of these relations is selected (SQL does not eliminate duplicates)
  • 32.
    Relational Algebra Operations from Set Theory (13/13)
    • CARTESIAN PRODUCT (7/7)
      – Every row from the first table is combined with every row from the second table, resulting in a Cartesian product
      – It does not require any join condition to be specified
      – Can be computationally expensive and should be used with caution, especially when joining large tables
        SELECT * FROM Tab1 CROSS JOIN Tab2;
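Both SQL forms of the Cartesian product can be tried with Python's sqlite3 module. This is a sketch under the assumption of SQLite as the engine; the columns x and y and the rows of Tab1 and Tab2 are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tab1 (x INTEGER);
    CREATE TABLE Tab2 (y TEXT);
    INSERT INTO Tab1 VALUES (1), (2);
    INSERT INTO Tab2 VALUES ('a'), ('b'), ('c');
""")
# Explicit CROSS JOIN in a joined table.
explicit = conn.execute("SELECT * FROM Tab1 CROSS JOIN Tab2").fetchall()
# Two relations in FROM with no WHERE clause give the same product.
implicit = conn.execute("SELECT * FROM Tab1, Tab2").fetchall()

print(len(explicit), len(implicit))  # 2 * 3 rows each
```

With 2 and 3 rows the result is small, but the row count grows multiplicatively, which is the cost concern raised above.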
  • 33.
    Binary Relational Operations: JOIN (1/14)
    • Denoted by ⋈
    • Process relationships among relations
      – General form of JOIN operation on two relations R(A1, A2, ..., An) & S(B1, B2, ..., Bm): R ⋈<join condition> S
        • <join condition> is of the form <condition> AND <condition> AND ... AND <condition>
    • If Q is the resulting relation
      – Degree of Q = n + m (degree of R + degree of S)
      – Vs. Cartesian Product
        • In JOIN, only combinations of tuples satisfying the join condition appear in the result
          – i.e. One tuple in Q = tuple of R + tuple of S ONLY when the join condition is TRUE
          – Tuples whose join attributes are NULL (or) for which the join condition is FALSE do not appear in Q
        • In CARTESIAN PRODUCT, all combinations of tuples are included in the result
  • 34.
    Binary Relational Operations: JOIN (2/14)
    • Example: Retrieve the name of the manager of each department
      – (1) Combine each DEPARTMENT tuple with the EMPLOYEE tuple for which the join condition is TRUE
      – (2) PROJECT the result with suitable attributes
  • 35.
    Retrieve the name of the manager of each department
    [Figure: join result, followed by PROJECT of the result with suitable attributes]
  • 36.
    Binary Relational Operations: JOIN (4/14)
    • JOIN can be specified as a CARTESIAN PRODUCT operation followed by a SELECT operation (as discussed in Slide-32)
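The equivalence above (JOIN = CARTESIAN PRODUCT followed by SELECT) can be sketched directly. The helper theta_join, the predicate mgr_matches, and the sample DEPARTMENT and EMPLOYEE tuples are all hypothetical; tuples are modeled as dictionaries with distinct attribute names, and None stands in for NULL.

```python
from itertools import product


def theta_join(r, s, condition):
    """Form every pair from r x s, then keep only pairs satisfying the condition."""
    return [{**t_r, **t_s} for t_r, t_s in product(r, s) if condition(t_r, t_s)]


# Hypothetical sample data.
department = [
    {"Dname": "Research", "Dnumber": 5, "Mgr_ssn": "333445555"},
    {"Dname": "Planning", "Dnumber": 7, "Mgr_ssn": None},  # no manager recorded
]
employee = [
    {"Ssn": "333445555", "Lname": "Wong"},
    {"Ssn": "123456789", "Lname": "Smith"},
]


def mgr_matches(d, e):
    # A NULL join attribute never satisfies the join condition.
    return d["Mgr_ssn"] is not None and d["Mgr_ssn"] == e["Ssn"]


dept_mgr = theta_join(department, employee, mgr_matches)      # step (1): join
result = [(t["Dname"], t["Lname"]) for t in dept_mgr]         # step (2): project
print(result)
```

The Planning department drops out of the result because its join attribute is NULL, matching the rule that such tuples do not appear in Q.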
  • 37.
    Binary Relational Operations: JOIN (5/14)
    • NATURAL JOIN (1/4)
      – Denoted by *
      – <join condition>
        • Automatically determined based on the columns with the same name in both tables
          – If the names of join attributes are different, a RENAME operation is applied first
        • It combines rows from both tables where the values in the matching columns are equal
      – Removes duplicate columns from the result
      – In SQL
        SELECT * FROM Tab1 NATURAL JOIN Tab2;
  • 39.
    Binary Relational Operations: JOIN (7/14)
    • NATURAL JOIN (3/4)
      – Example: Combine PROJECT with DEPARTMENT
        • (1) The join attribute is the department number. As both tables must use the same name for the join attribute, the “Dnumber” attribute of DEPARTMENT should be renamed to “Dnum”
        • (2) Apply NATURAL JOIN
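The two steps above can be sketched in SQL through Python's sqlite3 module. SQLite is an assumption, and the reduced PROJECT and DEPARTMENT schemas and their rows are invented for the example; only the attribute names Dnumber and Dnum come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PROJECT (Pname TEXT, Pnumber INTEGER, Dnum INTEGER);
    CREATE TABLE DEPARTMENT (Dname TEXT, Dnumber INTEGER);
    INSERT INTO PROJECT VALUES ('ProductX', 1, 5), ('Computerization', 10, 4);
    INSERT INTO DEPARTMENT VALUES ('Research', 5), ('Administration', 4);
""")
# Step (1): rename Dnumber to Dnum in a subquery.
# Step (2): NATURAL JOIN then matches on the shared column name Dnum.
cursor = conn.execute(
    "SELECT * FROM PROJECT NATURAL JOIN "
    "(SELECT Dname, Dnumber AS Dnum FROM DEPARTMENT) AS DEPT "
    "ORDER BY Pnumber"
)
columns = [col[0] for col in cursor.description]
rows = cursor.fetchall()
print(columns, rows)
```

The join attribute Dnum appears only once in the result, which is the duplicate-column removal that distinguishes NATURAL JOIN from a plain equality join.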
  • 41.
    Binary Relational Operations: JOIN (9/14)
    • EQUI JOIN (1/2)
      – Denoted by ⋈ with a <join condition> that uses only equality comparisons
      – An EQUI JOIN is a specific instance of THETA JOIN where <join condition> is explicitly specified using the equality operator (=) on join attributes
      – EQUI JOIN provides more control over <join condition> compared to NATURAL JOIN
      – Tables are joined based on the equality of values in specified join attributes/columns
        • The result always has one or more pairs of attributes that have identical values in every tuple
      – Most common type of join, often used to match related rows between tables
      – In SQL,
        SELECT * FROM Tab1 A JOIN Tab2 B ON A.id = B.id;
  • 42.
    Binary Relational Operations: JOIN (10/14)
    • EQUI JOIN (2/2)
      – Example:

        EMPLOYEE
        Employee_ID   Name           Department_ID
        1             John Doe       1
        2             Jane Smith     1
        3             Mike Johnson   2
        4             Emily Brown    3

        DEPARTMENT
        department_id   department_name
        1               Engineering
        2               Marketing
        3               Human Resources

        SELECT E.name AS employee_name, D.department_name
        FROM Employees E JOIN Departments D
        ON E.department_id = D.department_id;

        RESULTING RELATION
        EMPLOYEE_NAME   DEPARTMENT_NAME
        John Doe        Engineering
        Jane Smith      Engineering
        Mike Johnson    Marketing
        Emily Brown     Human Resources
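The example above can be reproduced end to end with Python's sqlite3 module. The table names Employees and Departments follow the slide's query; the column types and the ORDER BY clause (added only to make the output order deterministic) are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (employee_id INTEGER, name TEXT, department_id INTEGER);
    CREATE TABLE Departments (department_id INTEGER, department_name TEXT);
    INSERT INTO Employees VALUES
        (1, 'John Doe', 1), (2, 'Jane Smith', 1),
        (3, 'Mike Johnson', 2), (4, 'Emily Brown', 3);
    INSERT INTO Departments VALUES
        (1, 'Engineering'), (2, 'Marketing'), (3, 'Human Resources');
""")
# Equality on department_id is the only join condition (an EQUI JOIN).
equi_rows = conn.execute(
    "SELECT E.name AS employee_name, D.department_name "
    "FROM Employees E JOIN Departments D ON E.department_id = D.department_id "
    "ORDER BY E.employee_id"
).fetchall()
for row in equi_rows:
    print(row)
```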
  • 43.
    Binary Relational Operations: JOIN (11/14)
    • THETA JOIN (1/2)
      – Denoted by θ
      – <join condition> is based on a general comparison condition (theta condition), which can include any comparison operator such as '=', '>', '<', '>=', '<=', or '<>'
      – THETA JOIN provides more flexibility than EQUI JOIN as it allows a <join condition> other than equality
        • i.e. Can be used to perform joins based on custom conditions that are not limited to column equality
      – Tuples whose join attributes are NULL (or) for which the join condition is FALSE do not appear in the result
      – In SQL
        SELECT * FROM Tab1 A JOIN Tab2 B ON A.col1 > B.col2;
  • 44.
    Binary Relational Operations: JOIN (12/14)
    • THETA JOIN (2/2)
      – Example

        EMPLOYEES
        employee_id   name           department
        1             John Doe       Engineering
        2             Jane Smith     Engineering
        3             Mike Johnson   Marketing
        4             Emily Brown    Human Resources

        SALARIES
        employee_id   salary
        1             60000.00
        2             65000.00
        3             55000.00
        4             60000.00

        SELECT Employees.name
        FROM Employees JOIN Salaries
        ON Employees.employee_id = Salaries.employee_id
        WHERE Salaries.salary > 59000;

        RESULT
        NAME
        John Doe
        Jane Smith
        Emily Brown
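The example above can be run with Python's sqlite3 module. The sketch also shows a variant in which the salary comparison is moved from WHERE into the ON clause, so the whole theta condition sits in the join itself; for an inner join the two forms return the same rows. The column types and the ORDER BY clause are assumptions added for a deterministic result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (employee_id INTEGER, name TEXT, department TEXT);
    CREATE TABLE Salaries (employee_id INTEGER, salary REAL);
    INSERT INTO Employees VALUES
        (1, 'John Doe', 'Engineering'), (2, 'Jane Smith', 'Engineering'),
        (3, 'Mike Johnson', 'Marketing'), (4, 'Emily Brown', 'Human Resources');
    INSERT INTO Salaries VALUES (1, 60000.00), (2, 65000.00), (3, 55000.00), (4, 60000.00);
""")
# The slide's query: equality in ON, salary comparison in WHERE.
rows_where = conn.execute(
    "SELECT Employees.name FROM Employees JOIN Salaries "
    "ON Employees.employee_id = Salaries.employee_id "
    "WHERE Salaries.salary > 59000 ORDER BY Employees.employee_id"
).fetchall()
# Variant: both comparisons combined with AND inside the join condition.
rows_on = conn.execute(
    "SELECT Employees.name FROM Employees JOIN Salaries "
    "ON Employees.employee_id = Salaries.employee_id AND Salaries.salary > 59000 "
    "ORDER BY Employees.employee_id"
).fetchall()
print(rows_where, rows_on)
```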
  • 45.
    Binary Relational Operations: JOIN (13/14)
    • Variations of JOIN
      – n-way JOIN
        • NATURAL JOIN or EQUIJOIN operation specified among multiple tables
      – INNER JOIN
        • Defined formally as a combination of CARTESIAN PRODUCT and SELECTION
        • It is a broader concept that encompasses various types of join operations where only matching rows from both tables are included in the result set
        • In an INNER JOIN, rows from two tables are combined based on a specified condition, which could be an equality condition (EQUI JOIN) or any other condition
        • In SQL
          SELECT * FROM Tab1 A INNER JOIN Tab2 B ON A.id = B.id; -- Here, it is like EQUI JOIN
  • 46.
    Binary Relational Operations: JOIN (14/14)
    • In SQL, JOIN can be realized in several different ways
      – (1) Specify <join conditions> in the WHERE clause, along with other selection conditions
      – (2) Use a nested relation
      – (3) Use the concept of joined tables
        • The construct of joined tables allows the user to specify explicitly all the various types of join
        • It also allows the user to distinguish join conditions from selection conditions in the WHERE clause
    • Note:
      – Join selectivity = Expected size of join result / maximum size (nR * nS)
  • 47.
    Binary Relational Operations: DIVISION (1/5)
    • Denoted by ÷
    • Example-1:
      – R ÷ S
        • For a tuple (t) to appear in the resulting relation (T), values in t must appear in R in combination with every tuple in S
    • Example-2:
      – Retrieve names of EMPLOYEES who work on all the projects that ‘John Smith’ WORKS ON
  • 49.
    Retrieve names of EMPLOYEES who work on all the projects that ‘John Smith’ WORKS ON
  • 50.
    Retrieve names of EMPLOYEES who work on all the projects that ‘John Smith’ WORKS ON
    [Figure: Result]
  • 51.
    Binary Relational Operations: DIVISION (5/5)
    • DIVISION operation can be expressed as a sequence of π, ×, and – operations
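One standard way to write DIVISION with π, ×, and – is T1 ← π_Y(R), T2 ← π_Y((T1 × S) – R), T ← T1 – T2, where R has attributes (Y, X) and S has attributes (X). The sketch below follows that sequence under simplifying assumptions: R is a set of (y, x) pairs and S is a set of x values. The function name divide and the sample WORKS_ON data are hypothetical.

```python
from itertools import product


def divide(r, s):
    """Return the y values that appear in r together with every x in s.

    r: set of (y, x) pairs; s: set of x values.
    """
    t1 = {y for (y, _) in r}                 # T1 <- pi_Y(R)
    missing = set(product(t1, s)) - r        # (T1 x S) - R: required pairs absent from R
    t2 = {y for (y, _) in missing}           # T2 <- pi_Y(missing): y values that fail
    return t1 - t2                           # T  <- T1 - T2


# Hypothetical WORKS_ON(Essn, Pno) tuples and the project numbers to cover.
works_on = {
    ("123456789", 1), ("123456789", 2),
    ("453453453", 1), ("453453453", 2),
    ("333445555", 2), ("333445555", 3),
}
smith_pnos = {1, 2}

print(sorted(divide(works_on, smith_pnos)))
```

Employee 333445555 is excluded because the pair with project 1 is missing from WORKS_ON, so that employee does not work on every project in the divisor.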
  • 52.
    Operations of Relational Algebra (1/2)
  • 53.
    Operations of Relational Algebra (2/2)
  • 54.
    Query Tree (1/2)
    • Data structure for internal representation of the query in an RDBMS
    • Also called Query evaluation tree (or) Query execution tree
      – Internal Representation
        • Leaf Nodes => Input relations of query
        • Internal Nodes => Relational algebra operations
    • Query Execution
      – Execution of an internal node operation starts when its operands (child nodes) are available, and the node gets replaced by the relation that results from executing the operation
      – Execution terminates when the root node is executed and produces the final result relation for the query
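The execution rule above amounts to a post-order traversal: evaluate the children, then apply the node's operation. The sketch below is a toy model, not how any particular RDBMS stores its trees. It assumes leaves are Python lists of dictionary tuples and internal nodes are tuples of the form (operation, child, ...); the helpers select, project, and join and the sample relations are hypothetical.

```python
def select(predicate):
    return lambda rel: [t for t in rel if predicate(t)]


def project(*attrs):
    return lambda rel: [{a: t[a] for a in attrs} for t in rel]


def join(condition):
    return lambda r, s: [{**a, **b} for a in r for b in s if condition(a, b)]


def evaluate(node):
    """Evaluate a query tree bottom-up and return the root's result relation."""
    if isinstance(node, list):      # leaf node: an input relation
        return node
    operation, *children = node     # internal node: a relational algebra operation
    operands = [evaluate(child) for child in children]  # operands must be ready first
    return operation(*operands)     # the node is replaced by its result


# Hypothetical input relations.
employee = [{"Lname": "Smith", "Dno": 5}, {"Lname": "Wallace", "Dno": 4}]
department = [{"Dnumber": 5, "Dname": "Research"}, {"Dnumber": 4, "Dname": "Administration"}]

# pi_Lname( sigma_Dname='Research'( EMPLOYEE join_{Dno=Dnumber} DEPARTMENT ) )
tree = (
    project("Lname"),
    (
        select(lambda t: t["Dname"] == "Research"),
        (join(lambda e, d: e["Dno"] == d["Dnumber"]), employee, department),
    ),
)
result = evaluate(tree)
print(result)
```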
  • 56.
    Additional Relational Operations (1/10)
    • Some operations cannot be specified in the basic original relational algebra
    • Generalized projection
      – Allows functions of attributes to be included in the projection list
  • 57.
    Generalized Projection (with renaming) to obtain a report from the relation EMPLOYEE
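A generalized projection with renaming can be sketched as a list comprehension that computes new attributes from existing ones. The slide's own figure is not recoverable, so everything here is an assumption made for illustration: the EMPLOYEE attributes, the sample row, the output names (Net_salary, Bonus, Tax), and the formulas.

```python
# Hypothetical EMPLOYEE relation.
employee = [
    {"Ssn": "123456789", "Salary": 30000, "Deduction": 2000, "Years_service": 5},
]

# Each output attribute is a function of input attributes, and the dictionary
# keys act as the renaming step.
report = [
    {
        "Ssn": t["Ssn"],
        "Net_salary": t["Salary"] - t["Deduction"],
        "Bonus": 2000 * t["Years_service"],
        "Tax": 0.25 * t["Salary"],
    }
    for t in employee
]
print(report)
```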
  • 58.
    Additional Relational Operations (3/10)
    • Aggregate functions and grouping
      – Types
        • (1) Mathematical functions applied to collections of numeric values from the database
          – Example: SUM, AVERAGE, MAXIMUM, MINIMUM
        • (2) Function applied to count tuples/values
          – Example: COUNT
        • (3) Group tuples by the value of some of their attributes and apply an aggregate function independently to each group
          – Denoted by ℑ (pronounced script F): <grouping attributes> ℑ <function list> (R)
          – <grouping attributes> is a list of attributes of the relation specified in R
          – <function list> is a list of (<function> <attribute>) pairs
            » In each such pair, <function> is one of the allowed functions (such as SUM, AVERAGE, MAXIMUM, MINIMUM, COUNT) and <attribute> is an attribute of relation R
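Grouping followed by per-group aggregation can be sketched as follows. The function group_aggregate, the naming of output attributes as FUNCTION_attribute, and the EMPLOYEE sample data are assumptions for illustration; the function list is modeled as (label, Python function, attribute) triples.

```python
from collections import defaultdict


def average(values):
    return sum(values) / len(values)


def group_aggregate(relation, grouping_attrs, function_list):
    """Group tuples by grouping_attrs and apply each aggregate to every group."""
    groups = defaultdict(list)
    for t in relation:
        groups[tuple(t[a] for a in grouping_attrs)].append(t)

    result = []
    for key, members in groups.items():
        row = dict(zip(grouping_attrs, key))
        for label, function, attr in function_list:
            # The aggregate sees only the values of this group.
            row[f"{label}_{attr}"] = function([m[attr] for m in members])
        result.append(row)
    return result


# Hypothetical EMPLOYEE relation.
employee = [
    {"Ssn": "1", "Dno": 5, "Salary": 30000},
    {"Ssn": "2", "Dno": 5, "Salary": 40000},
    {"Ssn": "3", "Dno": 4, "Salary": 25000},
]

# Roughly: Dno ℑ COUNT Ssn, AVERAGE Salary (EMPLOYEE)
summary = group_aggregate(
    employee, ["Dno"], [("COUNT", len, "Ssn"), ("AVERAGE", average, "Salary")]
)
print(summary)
```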
  • 60.
    Find the missing operation?
  • 61.
    Additional Relational Operations (6/10)
    • Recursive Closure operations
      – Operation applied to a recursive relationship between tuples of the same type
      – SQL3 standard includes syntax for recursive closure
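A recursive closure can be sketched with a recursive common table expression (WITH RECURSIVE), run here through Python's sqlite3 module. SQLite is an assumption, as are the reduced EMPLOYEE(Ssn, Super_ssn) schema and the single-letter identifiers. The query collects every employee supervised by 'A' at any level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Ssn TEXT, Super_ssn TEXT);
    INSERT INTO EMPLOYEE VALUES
        ('A', NULL), ('B', 'A'), ('C', 'B'), ('D', 'B'), ('E', 'C');
""")
subordinates = conn.execute("""
    WITH RECURSIVE SUB(Ssn) AS (
        -- Base case: direct reports of A.
        SELECT Ssn FROM EMPLOYEE WHERE Super_ssn = 'A'
        UNION
        -- Recursive step: reports of anyone already found.
        SELECT E.Ssn FROM EMPLOYEE E JOIN SUB ON E.Super_ssn = SUB.Ssn
    )
    SELECT Ssn FROM SUB ORDER BY Ssn
""").fetchall()
print(subordinates)
```

The depth of the supervision chain is not fixed in advance, which is why this cannot be written as a fixed sequence of basic relational algebra joins.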
  • 62.
    Additional Relational Operations (7/10)
    • Outer Joins
      – Keep all tuples in R, or all those in S, or all those in both relations regardless of whether or not they have matching tuples in the other relation
      – Types (part of the SQL2 standard)
        • LEFT OUTER JOIN
        • RIGHT OUTER JOIN
        • FULL OUTER JOIN
      – LEFT OUTER JOIN (denoted by ⟕)
        • Keep every tuple in the first (or left) relation R in R ⟕ S
        • If NO matching tuple is found in S, then the attributes of S in the join result are filled or padded with NULL values
  • 64.
    Additional Relational Operations (9/10)
    • Outer Joins
      – RIGHT OUTER JOIN (denoted by ⟖)
        • Keep every tuple in the second (or right) relation S in R ⟖ S
        • If NO matching tuple is found in R, then the attributes of R in the join result are filled or padded with NULL values
      – FULL OUTER JOIN (denoted by ⟗)
        • Keep every tuple in both the left and right relations R and S
        • If NO matching tuple is found in the other relation, then that relation's attributes in the join result are filled or padded with NULL values
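The padding behavior of the outer joins can be sketched in plain Python, with None standing in for NULL. The function names, the explicit attribute-list parameters, and the sample EMPLOYEE and DEPARTMENT tuples are assumptions for illustration. A RIGHT OUTER JOIN is the same as a LEFT OUTER JOIN with the operands swapped, so only left and full are written out.

```python
def left_outer_join(r, s, condition, s_attrs):
    """Keep every tuple of r; pad the attributes of s with None when nothing matches."""
    result = []
    for t_r in r:
        matches = [t_s for t_s in s if condition(t_r, t_s)]
        if matches:
            result.extend({**t_r, **t_s} for t_s in matches)
        else:
            result.append({**t_r, **{a: None for a in s_attrs}})
    return result


def full_outer_join(r, s, condition, r_attrs, s_attrs):
    """Left outer join plus the unmatched tuples of s, padded on the r side."""
    result = left_outer_join(r, s, condition, s_attrs)
    for t_s in s:
        if not any(condition(t_r, t_s) for t_r in r):
            result.append({**{a: None for a in r_attrs}, **t_s})
    return result


# Hypothetical sample data.
employee = [{"Lname": "Smith", "Ssn": "1"}, {"Lname": "Wong", "Ssn": "2"}]
department = [{"Dname": "Research", "Mgr_ssn": "2"}, {"Dname": "Planning", "Mgr_ssn": "9"}]


def manages(e, d):
    return e["Ssn"] == d["Mgr_ssn"]


left = left_outer_join(employee, department, manages, ["Dname", "Mgr_ssn"])
full = full_outer_join(employee, department, manages, ["Lname", "Ssn"], ["Dname", "Mgr_ssn"])
print(left)
print(full)
```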
  • 65.
    Additional Relational Operations (10/10)
    • OUTER UNION Operation
      – Union of tuples from two partially union (type) compatible relations R(X, Y) and S(X, Z) that have some common attributes (X)
      – Resulting relation representation: T(X, Y, Z)
        • Union-compatible (X) attributes
          – Represented ONLY once in the result
          – Note: Same as FULL OUTER JOIN on the common attributes
        • Non-union-compatible (Y, Z) attributes
          – Kept in the result, with a NULL value wherever a tuple has no value for them
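The OUTER UNION of R(X, Y) and S(X, Z) can be sketched as a full outer join on the common attributes X in which X is emitted only once. The function outer_union, its parameters, and the STUDENT and INSTRUCTOR sample tuples are hypothetical; None stands in for NULL.

```python
def outer_union(r, s, common, r_only, s_only):
    """Combine tuples that agree on the common attributes; pad the rest with None."""
    result = []
    matched_s = set()
    for t_r in r:
        key = tuple(t_r[a] for a in common)
        partners = [i for i, t_s in enumerate(s) if tuple(t_s[a] for a in common) == key]
        if partners:
            for i in partners:
                matched_s.add(i)
                # Matching tuples merge into one; X appears only once.
                result.append({**t_r, **{a: s[i][a] for a in s_only}})
        else:
            result.append({**t_r, **{a: None for a in s_only}})
    for i, t_s in enumerate(s):
        if i not in matched_s:
            result.append({
                **{a: t_s[a] for a in common},
                **{a: None for a in r_only},
                **{a: t_s[a] for a in s_only},
            })
    return result


# Hypothetical relations: X = (Name), Y = (Advisor), Z = (Rank).
student = [{"Name": "Ann", "Advisor": "Lee"}, {"Name": "Bob", "Advisor": "Kim"}]
instructor = [{"Name": "Ann", "Rank": "Assistant"}, {"Name": "Cy", "Rank": "Full"}]

combined = outer_union(student, instructor, ["Name"], ["Advisor"], ["Rank"])
print(combined)
```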
  • 66.
    Thank You