3. 2.1 Relational Data Model Structure
Topics:
Why relational data model
Brief history of relational data model
Basic concepts
Terminology
Schemas
Properties of relations
Relation keys
4. 2.1 Relational Data Model Structure
Why Relational Data Model?
We will devote more time to the relational data model than to
any other data model, for the following reasons:
The model is easy to understand.
It has simple concepts: tables, columns, rows and
constraints.
It has a mathematical foundation.
Many database management system products are based
on the relational data model.
5. 2.1 Relational Data Model Structure
Brief History of Relational Data Model
Introduced by E.F. Codd of IBM in 1970.
A prototype RDBMS called System R was developed by IBM in the
late 1970s.
SQL was developed by IBM as a language for RDBMSs.
The commercial RDBMSs DB2 and SQL/DS were developed
by IBM Corporation, and Oracle by Oracle Corporation.
INGRES was developed at the University of California, Berkeley,
and later made available as a commercial RDBMS. It used the
language QUEL.
Many more commercial RDBMSs were developed in the 1980s.
6. 2.1 Relational Data Model Structure
Concepts
Basic concepts of the relational data model:
A database consists of one or more relations.
A relation is described by a name and the names of one or
more attributes, and consists of zero or more tuples.
A tuple consists of one value for each attribute of the
relation.
An attribute takes its values from a domain, and there
is exactly one value for each attribute in a tuple.
A domain consists of all allowed atomic values for
one or more attributes.
7. 2.1 Relational Data Model Structure
Some more Concepts
• Degree of a relation is the number of attributes of the
relation.
• Cardinality of a relation is the number of tuples of the
relation.
8. 2.1 Relational Data Model Structure
Relation
A relation is analogous to a table of rows and columns, as
shown below:
Relation: Employees
Employee Name   Designation         Department Name
Niranjan        Software Engineer   Development
Praveen         Software Engineer   Development
Dinesh          Director            Marketing
Harish          Manager             Administration
Attributes: the 3 columns.
Tuples: the 4 rows.
The relation holds 12 values in 4 tuples and 3 attributes.
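The basic concepts, together with degree and cardinality, can be sketched in a few lines of Python. This is only an illustration: modelling a relation as a tuple of attribute names plus a set of value tuples is an assumption made here, not something the slides prescribe.

```python
# A relation modelled as attribute names plus a set of tuples (illustrative only).
attributes = ("Employee Name", "Designation", "Department Name")
tuples = {
    ("Niranjan", "Software Engineer", "Development"),
    ("Praveen", "Software Engineer", "Development"),
    ("Dinesh", "Director", "Marketing"),
    ("Harish", "Manager", "Administration"),
}

degree = len(attributes)        # number of attributes
cardinality = len(tuples)       # number of tuples
value_count = degree * cardinality

# Every tuple supplies exactly one value per attribute.
assert all(len(t) == degree for t in tuples)
print(degree, cardinality, value_count)
```

Using a Python set for the tuples mirrors two properties discussed later: duplicates cannot occur and the order of tuples carries no meaning.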
9. 2.1 Relational Data Model Structure
Alternative Terminology
The terms relation, attribute and tuple come from mathematics.
Developers and many RDBMSs use the terms table, column and row
instead:
Table: EMPLOYEES
Employee Name   Designation         Department Name
Niranjan        Software Engineer   Development
Praveen         Software Engineer   Development
Dinesh          Director            Marketing
Harish          Manager             Administration
Column names: the header line.
Rows: the four data lines.
Columns: the three vertical lists of values.
10. 2.1 Relational Data Model Structure
One More Alternative Terminology
The term Record is used instead of tuple/row, and the term Field is
used instead of attribute/column.
Table: EMPLOYEES
Employee Name   Designation         Department Name
Niranjan        Software Engineer   Development
Praveen         Software Engineer   Development
Dinesh          Director            Marketing
Harish          Manager             Administration
Field names: the header line.
Records: the four data lines.
Fields: the three vertical lists of values.
11. 2.1 Relational Data Model Structure
Schema
A schema is a description of data.
A relation schema is a description of a relation in terms of its
attributes and their domains.
A database schema is the collection of relation schemas
and the schemas of relationships between the relations.
12. 2.1 Relational Data Model Structure
Properties of Relations
Every relation has the following properties:
The relation has a distinct name.
Each attribute has a distinct name.
Each attribute holds one atomic value in each tuple.
All values for an attribute are from the same domain.
Each tuple is distinct (no duplicate tuples).
There is no significance to the order of attributes.
There is no significance to the order of tuples.
13. 2.1 Relational Data Model Structure
Distinct Relation Names
Each relation or table in a database must have a distinct
name.
For example, the following three relation names in a
database are valid:
EMPLOYEES
DEPARTMENTS
PROJECTS
The following three relation names in a database are
invalid:
EMPLOYEES
DEPARTMENTS
DEPARTMENTS
The name DEPARTMENTS is duplicated.
14. 2.1 Relational Data Model Structure
Distinct Attribute Names
Each relation or table in a database must have distinct
names for its attributes or columns.
For example, relation EMPLOYEES with the following attribute
names is valid:
Employee Name, Designation, Department Name
Duplicate attribute names in a relation are invalid, as
shown below:
Employee Name, Designation, Designation,
Department Name
The attribute name Designation is duplicated.
15. 2.1 Relational Data Model Structure
Atomic Values
Each value of each attribute in a relation is atomic.
This means the value cannot be divided.
A tuple cannot have multiple values for an attribute.
The following table attempts to store two designations,
Manager and Director, for employee Jimson K John.
This is not allowed:
Employee Name   Designation         Department Name
Niranjan        Software Engineer   Development
Praveen         Software Engineer   Development
Dinesh          Director            Marketing
Jimson K John   Manager, Director   Development
16. 2.1 Relational Data Model Structure
Values From The Same Domain
Each value of an attribute must come from a single domain.
All values of a domain are expected to be of the same data
type.
For example, if an attribute stores the age of employees
as an integer such as 25, you can use neither the word
form “twenty five” nor a non-integer number such as 25.5,
because neither value is an integer.
17. 2.1 Relational Data Model Structure
No Duplicate Tuples
Each relation in a database is expected to have unique tuples.
No two tuples are identical in values for all attributes.
Relation violating the property:
Employee Name   Designation         Department Name
Niranjan        Software Engineer   Development
Praveen         Software Engineer   Development
Dinesh          Director            Marketing
Praveen         Software Engineer   Development
There are two identical tuples for employee “Praveen”.
18. 2.1 Relational Data Model Structure
No Significance to Order of Attributes
There is no significance to the order of attributes in a relation.
The database language does not depend on the order of attributes.
The following three alternatives describe the same relation:
Employees(Employee Name, Designation, Department Name)
Employees(Designation, Employee Name, Department Name)
Employees(Department Name, Employee Name, Designation)
19. 2.1 Relational Data Model Structure
No Significance to Order of Tuples
The order of tuples does not affect the results of operations on the
tables. For example, the following two tables are equivalent:
Employee Name   Designation         Department Name
Niranjan        Software Engineer   Development
Praveen         Software Engineer   Development
Dinesh          Director            Marketing
Harish          Director            Administration

Employee Name   Designation         Department Name
Harish          Director            Administration
Niranjan        Software Engineer   Development
Dinesh          Director            Marketing
Praveen         Software Engineer   Development
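The last three properties (no duplicate tuples, no significance to attribute order, no significance to tuple order) can be checked with a small Python sketch. The helper name `relation` and the choice to store each tuple as a frozenset of (attribute, value) pairs are assumptions made for illustration.

```python
def relation(attributes, rows):
    """Build a relation as a set of tuples; each tuple is a frozenset
    of (attribute, value) pairs, so neither order matters."""
    return {frozenset(zip(attributes, row)) for row in rows}

first = relation(
    ("Employee Name", "Designation", "Department Name"),
    [
        ("Niranjan", "Software Engineer", "Development"),
        ("Praveen", "Software Engineer", "Development"),
        ("Dinesh", "Director", "Marketing"),
        ("Harish", "Director", "Administration"),
    ],
)

# Same data with columns permuted, rows reordered and one row repeated.
second = relation(
    ("Designation", "Employee Name", "Department Name"),
    [
        ("Director", "Harish", "Administration"),
        ("Software Engineer", "Niranjan", "Development"),
        ("Director", "Dinesh", "Marketing"),
        ("Software Engineer", "Praveen", "Development"),
        ("Software Engineer", "Praveen", "Development"),  # duplicate collapses
    ],
)

same_relation = first == second
print(same_relation, len(second))
```

Because both the relation and its tuples are sets, the comparison ignores ordering and the repeated row is stored only once.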
20. 2.1 Relational Data Model Structure
Relation Keys
A property of a relation is that it has distinct tuples,
i.e., no two tuples have the same values for every
attribute.
For tuples to be distinct, the values of one or more
attributes taken together must differ in each
tuple.
A set of attributes that uniquely identifies a tuple is
called a key.
There are various kinds of keys.
21. 2.1 Relational Data Model Structure
Relation Keys - Super Key
A Super Key is a set of one or more attributes that
uniquely identifies a tuple within a relation.
Super Keys
Eno    Ename       Designation         PAN          Dno
1001   Niranjan    Software Engineer   NIR01ABC01   10
1002   Praveen     Trainee             PRA02ABC02   10
1003   Prashanth   Admin               PRA03ABC03   20
1004   Srilatha    Software Engineer   SRI04ABC04   10
1005   Sagar       Manager             SAG05ABC05   10
There can be multiple super keys for a relation.
A key that has more than one attribute is called a composite key.
22. 2.1 Relational Data Model Structure
Relation Keys - Candidate Key
A Candidate Key is a super key such that no
proper subset is a super key.
Eno    Ename       Designation         PAN          Dno
1001   Niranjan    Software Engineer   NIR01ABC01   10
1002   Praveen     Trainee             PRA02ABC02   10
1003   Prashanth   Admin               PRA03ABC03   20
1004   Srilatha    Software Engineer   SRI04ABC04   10
1005   Sagar       Manager             SAG05ABC05   10
Candidate Keys
A candidate key has a minimal set of attributes required to
uniquely identify tuples: removing any attribute from it
leaves a set that is no longer a super key.
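The difference between super keys and candidate keys can be explored with a short Python sketch over the sample data. One caveat is an assumption worth stating loudly: a key is a constraint on every valid state of the relation, so testing one sample instance only shows which attribute sets happen to be unique in this data. The helper name `is_unique` is hypothetical.

```python
from itertools import combinations

attributes = ("Eno", "Ename", "Designation", "PAN", "Dno")
rows = [
    (1001, "Niranjan", "Software Engineer", "NIR01ABC01", 10),
    (1002, "Praveen", "Trainee", "PRA02ABC02", 10),
    (1003, "Prashanth", "Admin", "PRA03ABC03", 20),
    (1004, "Srilatha", "Software Engineer", "SRI04ABC04", 10),
    (1005, "Sagar", "Manager", "SAG05ABC05", 10),
]

def is_unique(attrs):
    """True if the values of attrs differ in every row of this sample."""
    idx = [attributes.index(a) for a in attrs]
    projected = {tuple(row[i] for i in idx) for row in rows}
    return len(projected) == len(rows)

# Every attribute set that is unique in this instance (super keys of the sample).
super_keys = [
    set(combo)
    for size in range(1, len(attributes) + 1)
    for combo in combinations(attributes, size)
    if is_unique(combo)
]

# Keep only the minimal ones: no proper subset is itself a super key.
candidate_keys = [
    key for key in super_keys
    if not any(other < key for other in super_keys)
]
print(candidate_keys)
```

In this sample, Ename also comes out as unique, although two employees could share a name in practice. That is why the designer, not the data, decides which attribute sets are declared as keys.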
23. 2.1 Relational Data Model Structure
Relation Keys - Primary Key
A Primary Key is a candidate key that is chosen by the
database designer as the principal means of identifying
tuples uniquely within a relation.
Eno    Ename       Designation         PAN          Dno
1001   Niranjan    Software Engineer   NIR01ABC01   10
1002   Praveen     Trainee             PRA02ABC02   10
1003   Prashanth   Admin               PRA03ABC03   20
1004   Srilatha    Software Engineer   SRI04ABC04   10
1005   Sagar       Manager             SAG05ABC05   10
Primary Key (Eno)
A relation, by definition, has only one primary key.
Candidate keys other than the primary key are called alternate keys.
24. 2.1 Relational Data Model Structure
Relation Keys - Foreign Key
A Foreign Key is a set of one or more attributes within one
relation that matches a candidate key of another relation or
possibly of the same relation.
A relation that has a foreign key is called the
referencing relation, whereas the relation that
contains the candidate key is called the referenced relation.
SUBJECTS (subj_id, subj_title)
BOOKS    (book_id, book_title, subj_id)
CHAPTERS (book_id, ch_no, ch_title)
Primary keys: subj_id in SUBJECTS, book_id in BOOKS, (book_id, ch_no) in CHAPTERS.
Foreign keys: subj_id in BOOKS references SUBJECTS; book_id in CHAPTERS references BOOKS.
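What "matches a candidate key of another relation" means in practice can be sketched as a check that every foreign key value appears in the referenced relation. The rows below are made-up sample data and the function name `fk_violations` is hypothetical. Only the attribute names subj_id, book_id and so on come from the slide.

```python
# Hypothetical sample rows for the SUBJECTS and BOOKS relations.
subjects = [
    {"subj_id": 1, "subj_title": "Databases"},
    {"subj_id": 2, "subj_title": "Networks"},
]
books = [
    {"book_id": 10, "book_title": "Relational Basics", "subj_id": 1},
    {"book_id": 11, "book_title": "Routing Primer", "subj_id": 2},
    {"book_id": 12, "book_title": "Orphan Title", "subj_id": 9},  # no such subject
]

def fk_violations(referencing, fk, referenced, key):
    """Return rows of the referencing relation whose foreign key value
    does not match any key value in the referenced relation."""
    valid = {row[key] for row in referenced}
    return [row for row in referencing if row[fk] not in valid]

bad_rows = fk_violations(books, "subj_id", subjects, "subj_id")
print(bad_rows)
```

An RDBMS that enforces foreign keys would typically reject the third book row at insert time rather than report it afterwards.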
26. 2.2 Relational Data Model Operations
Description Language
We will use Relational Algebra to describe
operations on the relational data model.
Relational algebra was introduced by E.F. Codd in
1971.
Relational Calculus (domain relational calculus and
tuple relational calculus) can also be used to specify
the operations, but we will not cover it.
27. 2.2 Relational Data Model Operations
Relational Algebra
The Relational Algebra is a set language. All tuples from one or
more relations are operated on using one statement of the
language, without any looping constructs.
Five fundamental operations: Selection, Projection, Cartesian
Product, Union and Set Difference.
Other operations: Join (and a few variations of joins), Intersection
and Division.
Selection and Projection are unary operations, i.e., they operate on one
relation.
The others are binary operations, i.e., they operate on two relations.
28. 2.2 Relational Data Model Operations
Relational Algebra Operations
The Relational Algebra operations are as follows:
SNO   Operation           Notation
1     Selection           σ_{predicate}(R)
2     Projection          Π_{a1,...,an}(R)
3     Union               R ∪ S
4     Set difference      R - S
5     Intersection        R ∩ S
6     Division            R ÷ S
7     Aggregate           G_{AL}(R)
8     Grouping            GA G_{AL}(R)
9     Cartesian Product   R X S
10    Theta join          R ⋈_F S
11    Equijoin            R ⋈_F S (F contains only =)
12    Natural join        R ⋈ S
13    Outer join          R ⟕_P S (left outer join)
29. 2.2 Relational Data Model Operations
Relational Algebra – Selection
The Selection operation produces a relation that contains only those tuples
that satisfy the specified predicate.
A predicate is a condition over columns of the relation and constants
that returns a boolean value (true or false).
The selection operation σ_{salary > 5000}(Employees) gives only those tuples
that satisfy the predicate salary > 5000.
Table: Employees
Eno    Ename       Designation         Salary   Dno
1001   Niranjan    Software Engineer   10000    10
1002   Praveen     Trainee             5000     10
1003   Prashanth   Admin               6000     20
1004   Sugumar     Software Engineer   8000     10
1005   Manjunath   Software Engineer   5000     10
Result of selection
Eno    Ename       Designation         Salary   Dno
1001   Niranjan    Software Engineer   10000    10
1003   Prashanth   Admin               6000     20
1004   Sugumar     Software Engineer   8000     10
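Selection can be sketched in Python as a filter over the tuples of a relation. Representing tuples as dictionaries and the function name `select` are illustrative assumptions. The predicate is passed in as an ordinary function.

```python
employees = [
    {"Eno": 1001, "Ename": "Niranjan", "Designation": "Software Engineer", "Salary": 10000, "Dno": 10},
    {"Eno": 1002, "Ename": "Praveen", "Designation": "Trainee", "Salary": 5000, "Dno": 10},
    {"Eno": 1003, "Ename": "Prashanth", "Designation": "Admin", "Salary": 6000, "Dno": 20},
    {"Eno": 1004, "Ename": "Sugumar", "Designation": "Software Engineer", "Salary": 8000, "Dno": 10},
    {"Eno": 1005, "Ename": "Manjunath", "Designation": "Software Engineer", "Salary": 5000, "Dno": 10},
]

def select(relation, predicate):
    """Selection: keep the tuples for which the predicate is true.
    All attributes of the kept tuples are retained."""
    return [t for t in relation if predicate(t)]

# Selection with predicate salary > 5000 over Employees.
high_paid = select(employees, lambda t: t["Salary"] > 5000)
print([t["Eno"] for t in high_paid])
```

Tuples with a salary of exactly 5000 are excluded because the predicate uses a strict comparison.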
30. 2.2 Relational Data Model Operations
Relational Algebra – Projection
The Projection operation produces a relation that contains a vertical subset of
the given relation, consisting of the specified attributes.
Given the following relation Employees, the
operation Π_{Eno, Ename, Salary}(Employees) gives the relation shown below:
Employees
Eno    Ename       Designation         Salary   Dno
1001   Niranjan    Software Engineer   10000    10
1002   Praveen     Trainee             5000     10
1003   Prashanth   Admin               6000     20
1004   Sugumar     Software Engineer   8000     10
1005   Manjunath   Trainee             5000     10
Result relation
Eno    Ename       Salary
1001   Niranjan    10000
1002   Praveen     5000
1003   Prashanth   6000
1004   Sugumar     8000
1005   Manjunath   5000
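Projection can be sketched in the same style. Because a relation has no duplicate tuples, the sketch also removes duplicates that appear once columns are dropped. The function name `project` and the dictionary representation are assumptions for illustration.

```python
employees = [
    {"Eno": 1001, "Ename": "Niranjan", "Designation": "Software Engineer", "Salary": 10000, "Dno": 10},
    {"Eno": 1002, "Ename": "Praveen", "Designation": "Trainee", "Salary": 5000, "Dno": 10},
    {"Eno": 1003, "Ename": "Prashanth", "Designation": "Admin", "Salary": 6000, "Dno": 20},
    {"Eno": 1004, "Ename": "Sugumar", "Designation": "Software Engineer", "Salary": 8000, "Dno": 10},
    {"Eno": 1005, "Ename": "Manjunath", "Designation": "Trainee", "Salary": 5000, "Dno": 10},
]

def project(relation, attrs):
    """Projection: keep only the named attributes and drop duplicate tuples."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attrs)
        if row not in seen:          # a relation holds each tuple only once
            seen.add(row)
            result.append(dict(zip(attrs, row)))
    return result

pay_list = project(employees, ["Eno", "Ename", "Salary"])
departments_used = project(employees, ["Dno"])
print(len(pay_list), departments_used)
```

Projecting on Eno, Ename and Salary keeps all five tuples because Eno is unique, while projecting on Dno alone leaves only two tuples.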
31. 2.2 Relational Data Model Operations
Relational Algebra – Union
Given two relations, the Union operation produces a relation that contains
all tuples of the two relations.
Given the following relations Employees and Contract_Employees, the
operation Employees ∪ Contract_Employees gives the relation shown
below:
Employees
Eno    Ename       Dno
1001   Niranjan    10
1002   Praveen     10
1003   Prashanth   20
1004   Sugumar     10
1005   Manjunath   10
Contract_Employees
Eno    Ename       Dno
5001   Ravi        10
5002   Akshay      10
Result relation
Eno    Ename       Dno
1001   Niranjan    10
1002   Praveen     10
1003   Prashanth   20
1004   Sugumar     10
1005   Manjunath   10
5001   Ravi        10
5002   Akshay      10
Notes: (1) The two relations must be union-compatible.
(2) The result will not contain any duplicate tuples.
32. 2.2 Relational Data Model Operations
Relational Algebra – Set Difference
Given two relations, the Set Difference operation produces a relation that
contains all tuples of the first relation that are not in the second relation.
Given the following relations Employees and Trainee_Employees, the
operation Employees - Trainee_Employees gives the relation shown
below:
Employees
Eno    Ename       Dno
1001   Niranjan    10
1002   Praveen     10
1003   Prashanth   20
1004   Sugumar     10
1005   Manjunath   10
Trainee_Employees
Eno    Ename       Dno
5001   Ravi        10
5002   Akshay      10
1002   Praveen     10
1005   Manjunath   10
Result relation
Eno    Ename       Dno
1001   Niranjan    10
1003   Prashanth   20
1004   Sugumar     10
Note that the two relations must be union-compatible.
33. 2.2 Relational Data Model Operations
Relational Algebra – Intersection
Given two relations, the Intersection operation produces a relation that
contains the tuples that are common to both relations.
Given the following relations Employees and Trainee_Employees, the
operation Employees ∩ Trainee_Employees gives the relation shown
below:
Employees
Eno    Ename       Dno
1001   Niranjan    10
1002   Praveen     10
1003   Prashanth   20
1004   Sugumar     10
1005   Manjunath   10
Trainee_Employees
Eno    Ename       Dno
5001   Ravi        10
5002   Akshay      10
1002   Praveen     10
1005   Manjunath   10
Result relation
Eno    Ename       Dno
1002   Praveen     10
1005   Manjunath   10
Note that the two relations must be union-compatible.
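Union, Set Difference and Intersection map directly onto Python's built-in set operators when each relation is held as a set of tuples. This is a minimal sketch: union-compatibility is simply assumed here, since all three relations share the attributes (Eno, Ename, Dno).

```python
# Each relation is a set of (Eno, Ename, Dno) tuples.
employees = {
    (1001, "Niranjan", 10),
    (1002, "Praveen", 10),
    (1003, "Prashanth", 20),
    (1004, "Sugumar", 10),
    (1005, "Manjunath", 10),
}
contract_employees = {
    (5001, "Ravi", 10),
    (5002, "Akshay", 10),
}
trainee_employees = {
    (5001, "Ravi", 10),
    (5002, "Akshay", 10),
    (1002, "Praveen", 10),
    (1005, "Manjunath", 10),
}

union = employees | contract_employees         # Employees ∪ Contract_Employees
difference = employees - trainee_employees     # Employees - Trainee_Employees
intersection = employees & trainee_employees   # Employees ∩ Trainee_Employees

print(len(union), sorted(difference), sorted(intersection))
```

Set semantics give the "no duplicate tuples" note for free: a tuple present in both operands appears once in the union.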
34. 2.2 Relational Data Model Operations
Relational Algebra – Cartesian Product
Given two relations, the Cartesian Product operation produces a relation
that is a concatenation of every tuple of the first relation with every tuple of
the second relation.
The Cartesian Product of relations Employees and Departments, written as
Employees X Departments, is shown below:
Employees (2 tuples x 3 attributes)
Eno    Ename       Dno
1001   Niranjan    10
1003   Prashanth   20
Departments (4 tuples x 2 attributes)
Dno   Dname
10    Development
20    Admin
30    Marketing
40    Research
Result (8 tuples x 5 attributes)
Eno    Ename       Dno   Dno   Dname
1001   Niranjan    10    10    Development
1001   Niranjan    10    20    Admin
1001   Niranjan    10    30    Marketing
1001   Niranjan    10    40    Research
1003   Prashanth   20    10    Development
1003   Prashanth   20    20    Admin
1003   Prashanth   20    30    Marketing
1003   Prashanth   20    40    Research
35. 2.2 Relational Data Model Operations
Relational Algebra – Theta Join
The Theta Join (θ join) operation produces a relation that contains the tuples
of the Cartesian product of the two given relations that satisfy a predicate F.
The theta join of relations Employees and Departments, written as
Employees ⋈_F Departments, is shown below:
Employees
Eno    Ename       Dno
1001   Niranjan    10
1003   Prashanth   20
Departments
Dno   Dname
10    Development
20    Admin
30    Marketing
Result relation
Eno    Ename       Dno   Dno   Dname
1001   Niranjan    10    20    Admin
1001   Niranjan    10    30    Marketing
1003   Prashanth   20    30    Marketing
F is (Employees.Dno < Departments.Dno)
R ⋈_F S = σ_F(R X S)
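The identity that a theta join is a selection over a Cartesian product can be sketched directly in Python. Qualifying attribute names with the relation name (for example "Employees.Dno") is an illustrative choice made here so that the two Dno columns stay distinct. The function names are hypothetical.

```python
from itertools import product

employees = [
    {"Eno": 1001, "Ename": "Niranjan", "Dno": 10},
    {"Eno": 1003, "Ename": "Prashanth", "Dno": 20},
]
departments = [
    {"Dno": 10, "Dname": "Development"},
    {"Dno": 20, "Dname": "Admin"},
    {"Dno": 30, "Dname": "Marketing"},
]

def cartesian(r, r_name, s, s_name):
    """Cartesian product: pair every tuple of r with every tuple of s,
    prefixing attribute names with the relation name."""
    return [
        {**{f"{r_name}.{k}": v for k, v in a.items()},
         **{f"{s_name}.{k}": v for k, v in b.items()}}
        for a, b in product(r, s)
    ]

def theta_join(r, r_name, s, s_name, predicate):
    """Theta join: selection with predicate F applied to the Cartesian product."""
    return [t for t in cartesian(r, r_name, s, s_name) if predicate(t)]

all_pairs = cartesian(employees, "Employees", departments, "Departments")
joined = theta_join(
    employees, "Employees", departments, "Departments",
    lambda t: t["Employees.Dno"] < t["Departments.Dno"],
)
print(len(all_pairs), [(t["Employees.Eno"], t["Departments.Dno"]) for t in joined])
```

Real systems avoid materialising the full product, but the result is defined as if they did. Swapping the `<` for `==` in the predicate turns this into an equijoin.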
36. 2.2 Relational Data Model Operations
Relational Algebra – Equijoin
The Equijoin operation produces a relation that contains the tuples of the
Cartesian product of the two given relations that satisfy a predicate F
containing only equality operators (=).
The equijoin of relations Employees and Departments, written as
Employees ⋈_F Departments, is shown below:
Employees
Eno    Ename       Dno
1001   Niranjan    10
1003   Prashanth   20
Departments
Dno   Dname
10    Development
20    Admin
30    Marketing
Result relation
Eno    Ename       Dno   Dno   Dname
1001   Niranjan    10    10    Development
1003   Prashanth   20    20    Admin
F is (Employees.Dno = Departments.Dno)
37. 2.2 Relational Data Model Operations
Relational Algebra – Natural Join
The Natural Join operation is an Equijoin of the two given relations over all
common attributes, with the result containing only one occurrence of each
common attribute.
The natural join of relations Employees and Departments, written as
Employees ⋈ Departments, is shown below:
Employees
Eno    Ename       Dno
1001   Niranjan    10
1003   Prashanth   20
Departments
Dno   Dname
10    Development
20    Admin
30    Marketing
Result relation
Eno    Ename       Dno   Dname
1001   Niranjan    10    Development
1003   Prashanth   20    Admin
Common attribute: Dno
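A natural join can be sketched as a nested loop that matches tuples on every attribute name the two relations share and merges each matching pair into one tuple. The function name `natural_join` and the dictionary representation are assumptions for illustration.

```python
employees = [
    {"Eno": 1001, "Ename": "Niranjan", "Dno": 10},
    {"Eno": 1003, "Ename": "Prashanth", "Dno": 20},
]
departments = [
    {"Dno": 10, "Dname": "Development"},
    {"Dno": 20, "Dname": "Admin"},
    {"Dno": 30, "Dname": "Marketing"},
]

def natural_join(r, s):
    """Natural join: equijoin over all common attributes, keeping a single
    copy of each common attribute in the result."""
    result = []
    for a in r:
        for b in s:
            common = a.keys() & b.keys()
            if all(a[k] == b[k] for k in common):
                result.append({**a, **b})   # common attributes merge into one
    return result

joined = natural_join(employees, departments)
print(joined)
```

Department 30 has no employee in the sample, so it does not appear in the result. If the two relations had no attribute in common, this sketch would return their full Cartesian product.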
38. 2.2 Relational Data Model Operations
Relational Algebra – Left Outer Join
The Left Outer Join operation of two given relations is a join in which
tuples from the left table that do not have matching tuples in the
right table for the specified predicate are also included in the result relation,
with null values for the attributes of the right table.
The Left Outer Join of relations Courses and Students, written as
Courses ⟕_P Students, is shown below:
Courses
Cno   Cname
C01   Java
C02   C++
C03   DBMS
C04   Android
C05   jQuery
Students
Sno   Sname      Cno
101   Akhil      C01
102   Akhil      C03
103   Prithivi   C03
104   Chetan     C01
Result relation
Cno   Cname     Sno    Sname      Cno
C01   Java      101    Akhil      C01
C01   Java      104    Chetan     C01
C02   C++       NULL   NULL       NULL
C03   DBMS      102    Akhil      C03
C03   DBMS      103    Prithivi   C03
C04   Android   NULL   NULL       NULL
C05   jQuery    NULL   NULL       NULL
Predicate P: (Courses.Cno = Students.Cno)
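A left outer join can be sketched as follows, with Python's None standing in for NULL. The function name `left_outer_join` and its `right_width` parameter (the number of attributes in the right relation) are hypothetical names chosen for this illustration.

```python
courses = [
    ("C01", "Java"),
    ("C02", "C++"),
    ("C03", "DBMS"),
    ("C04", "Android"),
    ("C05", "jQuery"),
]
students = [
    (101, "Akhil", "C01"),
    (102, "Akhil", "C03"),
    (103, "Prithivi", "C03"),
    (104, "Chetan", "C01"),
]

def left_outer_join(left, right, match, right_width):
    """Keep every left tuple; pad with None where no right tuple matches."""
    result = []
    for l in left:
        matches = [r for r in right if match(l, r)]
        if matches:
            result.extend(l + r for r in matches)
        else:
            result.append(l + (None,) * right_width)
    return result

# Predicate: Courses.Cno = Students.Cno
joined = left_outer_join(courses, students, lambda c, s: c[0] == s[2], 3)
for row in joined:
    print(row)
```

Every course appears at least once in the output, which is what distinguishes this from an ordinary join.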
39. 2.2 Relational Data Model Operations
Relational Algebra – Semijoin
The (Left) Semijoin operation of two given relations is a join in which each
tuple from the left table that has at least one matching tuple in the right
table is included in the result relation.
The Semijoin of relations Courses and Students, written as
Courses ⋉_P Students, is shown below:
Courses
Cno   Cname
C01   Java
C02   C++
C03   DBMS
C04   Android
C05   jQuery
Students
Sno   Sname      Cno
101   Akhil      C01
102   Akhil      C03
103   Prithivi   C03
104   Chetan     C01
Result relation
Cno   Cname
C01   Java
C03   DBMS
Predicate P: (Courses.Cno = Students.Cno)
Note: Columns from the right table will not be in the result.
40. 2.2 Relational Data Model Operations
Relational Algebra – Anti-Semijoin
The (Left) Anti-Semijoin operation of two given relations is a join in which
each tuple from the left table that does not have a matching tuple in the
right table is included in the result relation.
The Anti-Semijoin of relations Courses and Students is shown below:
Courses
Cno   Cname
C01   Java
C02   C++
C03   DBMS
C04   Android
C05   jQuery
Students
Sno   Sname      Cno
101   Akhil      C01
102   Akhil      C03
103   Prithivi   C03
104   Chetan     C01
Result relation
Cno   Cname
C02   C++
C04   Android
C05   jQuery
Predicate P: (Courses.Cno = Students.Cno)
Note: Columns from the right table will not be in the result.
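Semijoin and anti-semijoin differ only in whether a left tuple is kept when a match exists or when none exists. A minimal Python sketch, with hypothetical function names `semijoin` and `anti_semijoin`:

```python
courses = [
    ("C01", "Java"),
    ("C02", "C++"),
    ("C03", "DBMS"),
    ("C04", "Android"),
    ("C05", "jQuery"),
]
students = [
    (101, "Akhil", "C01"),
    (102, "Akhil", "C03"),
    (103, "Prithivi", "C03"),
    (104, "Chetan", "C01"),
]

def semijoin(left, right, match):
    """Left tuples that have at least one matching right tuple."""
    return [l for l in left if any(match(l, r) for r in right)]

def anti_semijoin(left, right, match):
    """Left tuples that have no matching right tuple."""
    return [l for l in left if not any(match(l, r) for r in right)]

def same_course(course, student):
    # Predicate: Courses.Cno = Students.Cno
    return course[0] == student[2]

taken = semijoin(courses, students, same_course)
not_taken = anti_semijoin(courses, students, same_course)
print(taken, not_taken)
```

Each left tuple appears at most once, however many students match it, and no Students columns reach the result. Together the two results partition the Courses relation.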
41. 2.2 Relational Data Model Operations
Relational Algebra – Division
The Division operation R ÷ S produces a relation over the attributes of R
that are not in S. It contains each combination of values for those attributes
that appears in R together with every tuple of S.
The division of relations Students and Special-Courses, written as
Students ÷ Special-Courses, is shown below:
Students
Sno   Sname      Cno
101   Akhil      C01
101   Akhil      C03
102   Prithivi   C04
103   Chetan     C01
104   Pranoy     C01
104   Pranoy     C03
102   Prithivi   C02
Special-Courses
Cno
C01
C03
Result relation
Sno   Sname
101   Akhil
104   Pranoy
This answers the typical question: who took both courses C01 and C03?
42. 2.2 Relational Data Model Operations
Relational Algebra – Division
Another example of the division operation. Suppose you have a table of
prospective candidates whom you want to recruit, provided they have all the
required skills listed in another table. You can use the Division operator,
Candidates ÷ Skills, to get the result:
Candidates
Candidate Name   Skill
Chetan           Objective-C
Chetan           C++
Chetan           Java
Sparsh           C++
Sparsh           JSP
Sparsh           Java
Sparsh           HTML
Pranoy           Java
Pranoy           C++
Akhil            PHP
Akhil            C++
Akhil            Objective-C
Akhil            Java
Akhil            JSP
Skills
Skill
Java
C++
JSP
Result relation
Candidate Name
Akhil
Sparsh
Note that the table of Skills could be
derived as the result of another operator
on some other tables.
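Both division examples can be reproduced with a short Python sketch. The function name `divide` is hypothetical, and the dividend is assumed to be a set of (x, y) pairs where y ranges over the attribute shared with the divisor.

```python
def divide(r, s):
    """Division: return every x that is paired in r with every y in s.
    r is a set of (x, y) pairs; s is a set of y values."""
    xs = {x for x, _ in r}
    return {x for x in xs if all((x, y) in r for y in s)}

candidates = {
    ("Chetan", "Objective-C"), ("Chetan", "C++"), ("Chetan", "Java"),
    ("Sparsh", "C++"), ("Sparsh", "JSP"), ("Sparsh", "Java"), ("Sparsh", "HTML"),
    ("Pranoy", "Java"), ("Pranoy", "C++"),
    ("Akhil", "PHP"), ("Akhil", "C++"), ("Akhil", "Objective-C"),
    ("Akhil", "Java"), ("Akhil", "JSP"),
}
skills = {"Java", "C++", "JSP"}
qualified = divide(candidates, skills)

# First example: x is the (Sno, Sname) pair, y is the course number.
students = {
    ((101, "Akhil"), "C01"), ((101, "Akhil"), "C03"),
    ((102, "Prithivi"), "C04"), ((103, "Chetan"), "C01"),
    ((104, "Pranoy"), "C01"), ((104, "Pranoy"), "C03"),
    ((102, "Prithivi"), "C02"),
}
special_courses = {"C01", "C03"}
took_both = divide(students, special_courses)

print(sorted(qualified), sorted(took_both))
```

Extra skills such as HTML or PHP do not disqualify a candidate. Division only requires that every value in the divisor is covered.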
43. 2.2 Relational Data Model Operations
Relational Algebra – Aggregate
The Aggregate operator, G_{AL}(R), produces a relation by applying an aggregate
function list to a given relation.
An aggregate function returns one value that is computed from a collection of values.
An aggregate function list, AL, consists of pairs of an aggregate function and an
attribute.
Typical aggregate functions are COUNT, SUM, AVG, MAX and MIN.
The aggregate operation G_{SUM(Salary), COUNT(Eno)}(Employees) is shown below:
Employees
Eno    Ename       Salary   Dno
1001   Niranjan    10000    10
1002   Praveen     5000     10
1003   Prashanth   6000     20
1004   Sugumar     8000     10
1005   Manjunath   5000     10
1006   Dinesh      25000    30
1007   Harish      15000    20
Result relation
SUM(Salary)   COUNT(Eno)
74000         7
44. 2.2 Relational Data Model Operations
Relational Algebra – Grouping
The Grouping operator, GA G_{AL}(R), produces a relation that contains the grouping
attributes, GA, and the result of each aggregate function in the aggregate
function list, AL. It groups the tuples of the given relation by the grouping
attributes and applies the aggregate functions to each group.
Typical aggregate functions are COUNT, SUM, AVG, MAX and MIN.
The grouping operation Dno G_{SUM(Salary), COUNT(Eno)}(Employees) is shown below:
Employees
Eno    Ename       Salary   Dno
1001   Niranjan    10000    10
1002   Praveen     5000     10
1003   Prashanth   6000     20
1004   Sugumar     8000     10
1005   Manjunath   5000     10
1006   Dinesh      25000    30
1007   Harish      15000    20
Result relation
Dno   SUM(Salary)   COUNT(Eno)
10    28000         4
20    21000         2
30    25000         1
Dno is the grouping column; SUM(Salary) and COUNT(Eno) are the aggregate
columns, which hold the aggregate values.
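The aggregate and grouping results can be reproduced with a short Python sketch. The variable names and the use of `defaultdict` are illustrative choices. Only SUM and COUNT are shown.

```python
from collections import defaultdict

# (Eno, Ename, Salary, Dno)
employees = [
    (1001, "Niranjan", 10000, 10),
    (1002, "Praveen", 5000, 10),
    (1003, "Prashanth", 6000, 20),
    (1004, "Sugumar", 8000, 10),
    (1005, "Manjunath", 5000, 10),
    (1006, "Dinesh", 25000, 30),
    (1007, "Harish", 15000, 20),
]

# Aggregate over the whole relation: SUM(Salary), COUNT(Eno).
total_salary = sum(salary for _, _, salary, _ in employees)
employee_count = len({eno for eno, _, _, _ in employees})

# Grouping by Dno, then the same aggregates per group.
groups = defaultdict(list)
for eno, _, salary, dno in employees:
    groups[dno].append((eno, salary))

by_department = {
    dno: (sum(salary for _, salary in rows), len(rows))
    for dno, rows in groups.items()
}

print(total_salary, employee_count)
print(by_department)
```

The ungrouped aggregate always yields a single tuple, while grouping yields one tuple per distinct value of the grouping attribute.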