3. Introduction
Prof. K. Adisesha
Anomalies in Relational Model:
In a Database Management System (DBMS), an anomaly is an inconsistency that arises in a
relational table when insert, update, or delete operations are performed on it.
➢ A database anomaly is a fault that usually results from poor planning, such as storing
everything in a single flat table.
➢ In most cases, anomalies are removed through normalization, which involves splitting
tables and joining them back when needed.
➢ The three kinds of anomaly to watch for are:
❖ Update
❖ Insert
❖ Delete
4. Introduction
Codd’s rules :
Storing data in relational form is not enough for a system to be called a Relational DB
Management System (RDBMS). A set of rules determines whether a DBMS is truly relational.
➢ Dr Edgar F. Codd, who created the relational model, proposed these rules in 1985.
➢ Codd presented 13 rules, numbered 0 to 12 and commonly called Codd’s 12 rules, for
evaluating a DB Management System (DBMS) against the relational model.
➢ A system that follows these rules is referred to as a true relational DB management
system (RDBMS).
➢ Codd’s rules are widely used as a benchmark for relational databases.
5. Introduction
Codd’s rules : Some rules determine if a DB is the correct RDBMS.
❖ Rule 0: The Foundation Rule
❖ Rule 1: The Information Rule
❖ Rule 2: The Guaranteed Access Rule
❖ Rule 3: The Systematic Treatment of Null Values
❖ Rule 4: The Dynamic/Active Online Catalog on the basis of the Relational Model
❖ Rule 5: The Comprehensive Data Sublanguage Rule
❖ Rule 6: The View Updating Rule
❖ Rule 7: The Relational Level Operation (or High-Level Insert, Delete, and Update) Rule
❖ Rule 8: The Physical Data Independence Rule
❖ Rule 9: The Logical Data Independence Rule
❖ Rule 10: The Integrity Independence Rule
❖ Rule 11: The Distribution Independence Rule
❖ Rule 12: The Non-Subversion Rule
6. Codd’s rules
➢ Rule 0: The Foundation Rule
❖ A system that claims to be relational must manage the DB entirely through its relational capabilities.
➢ Rule 1: The Information Rule
❖ All information in a DB, including metadata, must be represented in exactly one way: as values in the
cells of tables made up of rows and columns.
➢ Rule 2: The Guaranteed Access Rule
❖ Every single (atomic) data value must be logically accessible through a combination of table name,
primary key value, and column name.
➢ Rule 3: The Systematic Treatment of Null Values
❖ This rule defines how Null values are treated. The DBMS must support Nulls in a systematic way,
independent of data type, to represent missing, unknown, or inapplicable information. A primary key
must never be null.
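Rules 2 and 3 can be illustrated with a small sketch using Python's built-in sqlite3 module. The student table and its values are hypothetical, chosen only for illustration, and SQLite is just one convenient engine to try this on.

```python
import sqlite3

# Hypothetical table and data, used only to illustrate Rules 2 and 3.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(42, "Adi", 17), (43, "Sunny", None)],  # Sunny's age is unknown
)

# Rule 2: table name + column name + primary key value reach one atomic value.
name = conn.execute("SELECT name FROM student WHERE roll_no = ?", (42,)).fetchone()[0]

# Rule 3: NULL marks missing data and is tested with IS NULL, never with '='.
missing_age = conn.execute("SELECT roll_no FROM student WHERE age IS NULL").fetchall()
equals_null = conn.execute("SELECT roll_no FROM student WHERE age = NULL").fetchall()

print(name, missing_age, equals_null)
```

The comparison `age = NULL` matches no rows because a comparison with NULL is neither true nor false, which is the systematic treatment the rule asks for.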
7. Codd’s rules
➢ Rule 4: The Dynamic/Active Online Catalog on the basis of the Relational Model
❖ The DB dictionary (catalog) describes the whole logical structure of the DB and must be stored online
in the DB itself, as ordinary tables. Authorized users can query it with the same query language they
use for the data.
➢ Rule 5: The Comprehensive Data Sublanguage Rule
❖ A relational DB may support several languages, but at least one must have a linear, well-defined
syntax that can be expressed as character strings. That language must support data definition, view
definition, data manipulation, integrity constraints, authorization, and transaction boundaries
(begin, commit, and rollback). It is a violation if the DB permits access to data without going
through such a language.
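As a rough illustration of Rule 4, SQLite keeps its catalog in a table named sqlite_master that can be queried with ordinary SQL. The table and view names below are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema objects, created only so the catalog has something to show.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE VIEW student_names AS SELECT name FROM student")

# The catalog is itself a table, read with the same SELECT used for user data.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

print(catalog)
```

Other systems expose the catalog under different names, so the table name here is specific to SQLite.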
8. Codd’s rules
➢ Rule 6: The View Updating Rule
❖ Every view that is theoretically updatable must also be updatable by the DB system in practice.
➢ Rule 7: The Relational Level Operation (or High-Level Insert, Delete, and Update) Rule
❖ The DB system must support insert, update, and delete as high-level operations on whole sets
of rows, not only one row at a time. It also supports set operations such as union,
intersection, and minus.
➢ Rule 8: The Physical Data Independence Rule
❖ Applications must be independent of how data is physically stored. If the storage structure
or access method of the DB changes, external applications that access the data should not
be affected.
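The set-at-a-time behaviour described in Rule 7 can be sketched with sqlite3. The data is hypothetical, and note that SQLite spells the minus operation as EXCEPT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(42, "Adi", 17), (43, "Sunny", 18), (44, "Prajwal", 18)],
)

# One UPDATE statement changes every matching row; no row-by-row loop is needed.
updated_rows = conn.execute("UPDATE student SET age = age + 1 WHERE age = 18").rowcount

# Set operations combine the results of two queries.
union = conn.execute(
    "SELECT name FROM student WHERE age = 19 "
    "UNION SELECT name FROM student WHERE name = 'Adi' ORDER BY name"
).fetchall()
intersection = conn.execute(
    "SELECT roll_no FROM student WHERE age = 19 "
    "INTERSECT SELECT roll_no FROM student WHERE name = 'Sunny'"
).fetchall()
minus = conn.execute(
    "SELECT roll_no FROM student EXCEPT SELECT roll_no FROM student WHERE age = 19"
).fetchall()

print(updated_rows, union, intersection, minus)
```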
9. Codd’s rules
➢ Rule 9: The Logical Data Independence Rule
❖ This is the logical counterpart of physical data independence. Changes at the logical level
(the table structures) should not affect the user’s view (the application). For example, if a
table is split into two tables, or two tables are merged into one, the application at the
user view should not be affected.
➢ Rule 10: The Integrity Independence Rule
❖ Integrity constraints must be defined in the DB itself, using the data sublanguage, and
stored in the catalog, not in application programs. The integrity of the data should not
rely on any external component or application, which also keeps each front-end app
independent of how the rules are enforced.
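A minimal sketch of Rule 10, assuming a hypothetical student table: the constraints are declared once in the table definition, and the DB rejects bad rows no matter which application sends them. The helper try_insert is an illustrative name, not part of any library.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Constraints live in the schema, not in application code.
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " age INTEGER CHECK (age >= 0))"
)

def try_insert(row):
    """Return True if the DB accepts the row, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert((42, "Adi", 17)),    # valid row
    try_insert((43, None, 18)),     # violates NOT NULL on name
    try_insert((44, "Sunny", -5)),  # violates the CHECK on age
    try_insert((42, "Other", 20)),  # duplicate primary key
]
print(results)
```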
10. Codd’s rules
➢ Rule 11: The Distribution Independence Rule
❖ A DB must work correctly even if its data is stored at multiple locations. An end-user or
application should not need to know that the data is distributed: queries must behave as if
all the data were available at a single site.
➢ Rule 12: The Non-Subversion Rule
❖ If a system offers a low-level (record-at-a-time) interface in addition to its relational
language such as SQL, that interface must not be usable to bypass or subvert the integrity
rules and constraints.
11. Decomposition
Decomposition in DBMS:
Decomposition removes redundancy, inconsistencies, and anomalies from a database by
dividing a table into several smaller tables.
➢ The term decomposition refers to breaking a table in a database down into parts.
➢ More formally, decomposition divides a relation X into {X1, X2, ..., Xn}. A good
decomposition is lossless and, ideally, dependency preserving.
➢ Decomposition is of two major types in DBMS:
❖ Lossless: A decomposition is lossless when the original relation R can be reconstructed
exactly by joining the decomposed tables.
❖ Lossy: A decomposition is lossy when joining the decomposed tables does not give back
the original relation exactly. The join typically produces extra (spurious) tuples, so
information about which tuples were originally present is lost.
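The difference between the two types can be checked with a small sketch in plain Python. The enrolment data and the helper names relation, project, and natural_join are assumptions made for illustration.

```python
def relation(rows):
    """Store a relation as a set of rows; each row is a frozenset of (attr, value)."""
    return {frozenset(r.items()) for r in rows}

def project(rel, attrs):
    """Projection: keep only attrs and drop duplicate rows."""
    return {frozenset((a, dict(r)[a]) for a in attrs) for r in rel}

def natural_join(left, right):
    """Natural join: combine rows that agree on all shared attributes."""
    out = set()
    for l in left:
        for r in right:
            dl, dr = dict(l), dict(r)
            if all(dl[a] == dr[a] for a in dl.keys() & dr.keys()):
                out.add(frozenset({**dl, **dr}.items()))
    return out

# Hypothetical enrolment table.
R = relation([
    {"enrol_no": 42, "name": "Adi", "dept": "CO", "building_no": 4},
    {"enrol_no": 43, "name": "Sunny", "dept": "EC", "building_no": 2},
    {"enrol_no": 44, "name": "Prajwal", "dept": "IT", "building_no": 1},
    {"enrol_no": 45, "name": "Adi", "dept": "EC", "building_no": 2},
])

# Lossless: the shared attribute dept determines building_no.
lossless = natural_join(project(R, ["enrol_no", "name", "dept"]),
                        project(R, ["dept", "building_no"]))

# Lossy: the shared attribute name is not unique (two students are called Adi).
lossy = natural_join(project(R, ["enrol_no", "name"]),
                     project(R, ["name", "dept", "building_no"]))

print(lossless == R, len(lossy))
```

In the lossy case the join returns six rows instead of four: both students named Adi are paired with both departments, and the original table can no longer be told apart from the spurious rows.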
12. Functional Dependency
Functional Dependency in DBMS:
A functional dependency (FD) is a relationship between two sets of attributes in which
the value of one determines the value of the other. It typically exists between the
primary key and a non-key attribute within a table.
X → Y
➢ The left side of the FD is known as the determinant, and the right side is known as the
dependent.
➢ Types of functional dependency:
❖ Trivial functional dependency
❖ Non-Trivial functional dependency
❖ Multivalued functional dependency
❖ Transitive functional dependency
13. Functional Dependency
Functional Dependency in DBMS:
➢ Trivial functional dependency: In a trivial functional dependency, the dependent is
always a subset of the determinant.
➢ i.e. If X → Y and Y is a subset of X, then it is called a trivial functional dependency.
➢ For example,
roll_no   name      age
42        Adi       17
43        Sunny     18
44        Prajwal   18
➢ Here, {roll_no, name} → name is a trivial functional
dependency, since the dependent name is a subset of the
determinant set {roll_no, name}.
➢ Similarly, roll_no → roll_no is also an example of a
trivial functional dependency.
14. Functional Dependency
Functional Dependency in DBMS:
➢ Non-trivial functional dependency: In a non-trivial functional dependency, the
dependent is not a subset of the determinant.
➢ i.e. If X → Y and Y is not a subset of X, then it is called a non-trivial functional
dependency.
➢ For example,
roll_no   name      age
42        Adi       17
43        Sunny     18
44        Prajwal   18
➢ Here, roll_no → name is a non-trivial functional dependency,
since the dependent name is not a subset of the determinant roll_no.
➢ Similarly, {roll_no, name} → age is also a non-trivial functional
dependency, since age is not a subset of {roll_no, name}.
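The two definitions can be tested against the example table with a short sketch. The function names fd_holds and is_trivial are illustrative. A table instance can only show that an FD is violated; agreement with the sample rows does not prove that the FD holds for all future data.

```python
def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on lhs but differ on rhs (in this instance only)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

def is_trivial(lhs, rhs):
    """An FD X -> Y is trivial when Y is a subset of X."""
    return set(rhs) <= set(lhs)

students = [
    {"roll_no": 42, "name": "Adi", "age": 17},
    {"roll_no": 43, "name": "Sunny", "age": 18},
    {"roll_no": 44, "name": "Prajwal", "age": 18},
]

print(is_trivial(["roll_no", "name"], ["name"]))    # trivial
print(is_trivial(["roll_no"], ["name"]))            # non-trivial
print(fd_holds(students, ["roll_no"], ["name"]))    # consistent with the rows
print(fd_holds(students, ["age"], ["name"]))        # violated: two names share age 18
```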
15. Functional Dependency
Functional Dependency in DBMS:
➢ Multivalued Functional Dependency: In a multivalued functional dependency, the
attributes in the dependent set do not depend on each other.
➢ i.e. If a → {b, c} and there exists no functional dependency between b and c, then it is
called a multivalued functional dependency.
➢ For example,
roll_no   name      age
42        Adi       17
43        Sunny     18
44        Prajwal   18
45        Adi       19
➢ Here, roll_no → {name, age} is a multivalued
functional dependency, since the dependents name and
age do not depend on each other (i.e. neither name → age
nor age → name holds).
16. Functional Dependency
Functional Dependency in DBMS:
➢ Transitive Functional Dependency: In a transitive functional dependency, the dependent
is indirectly dependent on the determinant.
➢ i.e. If a → b and b → c, then according to the axiom of transitivity, a → c. This is a
transitive functional dependency.
➢ For example,
enrol_no   name      dept   building_no
42         Adi       CO     4
43         Sunny     EC     2
44         Prajwal   IT     1
45         Adi       EC     2
➢ Here, enrol_no → dept and dept → building_no.
➢ Hence, according to the axiom of transitivity,
enrol_no → building_no is a valid functional
dependency. This is an indirect functional dependency,
hence it is called a transitive functional dependency.
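Transitivity can be applied mechanically by computing an attribute closure, that is, everything a set of attributes determines under a given list of FDs. The sketch below uses only the two FDs stated for the example; the function name closure is illustrative.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the determinant, we also gain the dependent.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [
    ({"enrol_no"}, {"dept"}),
    ({"dept"}, {"building_no"}),
]

enrol_closure = closure({"enrol_no"}, fds)
print("building_no" in enrol_closure)  # enrol_no -> building_no follows by transitivity
```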
17. Normalization
Normalization:
A large database defined as a single relation may result in data duplication. This
repetition of data may result in:
❖ Very large relations.
❖ Data that is hard to maintain and update, because each change involves searching many
records in the relation.
❖ Wastage and poor utilization of disk space and resources.
❖ A higher likelihood of errors and inconsistencies.
➢ To handle these problems, we should analyze and decompose relations with
redundant data into smaller, simpler, and well-structured relations that satisfy
desirable properties.
18. Normalization
Normalization:
Normalization is a process of decomposing the relations into relations with fewer
attributes.
➢ Normalization is the process of organizing the data in the database.
➢ Normalization is used to minimize the redundancy from a relation or set of relations. It
is also used to eliminate undesirable characteristics like Insertion, Update, and Deletion
Anomalies.
➢ Normalization divides a larger table into smaller tables and links them using relationships.
➢ Normal forms are used to reduce redundancy in database tables.
19. Normalization
Normalization:
The main reason for normalizing relations is to remove data modification anomalies.
Normalization consists of a series of guidelines that help you create a
good database structure.
➢ Data modification anomalies can be categorized into three types:
❖ Insertion Anomaly: occurs when a new tuple cannot be inserted into a relation because
some other, unrelated data is not yet available.
❖ Deletion Anomaly: occurs when deleting data results in the unintended loss of some
other important data.
❖ Update Anomaly: occurs when changing a single data value requires multiple rows to be
updated, so that a partial update leaves the data inconsistent.
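The three anomalies can be reproduced on a small flat table in plain Python. This is only a sketch: the rows mirror the enrolment example used for transitive dependencies, and the department ME and the insert helper are assumptions added for illustration.

```python
# Hypothetical flat table: student facts and department facts stored together.
flat = [
    {"enrol_no": 42, "name": "Adi", "dept": "CO", "building_no": 4},
    {"enrol_no": 43, "name": "Sunny", "dept": "EC", "building_no": 2},
    {"enrol_no": 44, "name": "Prajwal", "dept": "IT", "building_no": 1},
    {"enrol_no": 45, "name": "Adi", "dept": "EC", "building_no": 2},
]

def insert(table, row):
    """Reject rows without the key enrol_no, as a DBMS would."""
    if row.get("enrol_no") is None:
        raise ValueError("enrol_no is the key and cannot be missing")
    table.append(row)

# Insertion anomaly: a new department cannot be recorded until it has a student.
try:
    insert(flat, {"enrol_no": None, "name": None, "dept": "ME", "building_no": 5})
    insert_ok = True
except ValueError:
    insert_ok = False

# Update anomaly: EC moves to building 7, but only one of its two rows is changed.
flat[1]["building_no"] = 7
ec_buildings = {r["building_no"] for r in flat if r["dept"] == "EC"}

# Deletion anomaly: removing the only IT student also loses IT's building.
flat = [r for r in flat if r["enrol_no"] != 44]
it_still_known = any(r["dept"] == "IT" for r in flat)

print(insert_ok, ec_buildings, it_still_known)
```

Splitting the table into a student table and a department table would remove all three problems, since each department fact would then be stored exactly once.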
20. Normalization
Normalization:
Normalization is a step-by-step process that removes different kinds of redundancy
and anomalies from the database, one normal form at a time.
➢ E. F. Codd introduced normalization for the relational data model in 1970.
➢ Normalization rules are divided into the following normal forms: 1NF, 2NF, 3NF, BCNF, 4NF, and 5NF.
22. Normalization
1st Normal Form (1NF):
By applying the First Normal Form, you achieve atomicity, and every column has a
unique name.
➢ A table is in First Normal Form if every cell holds an atomic value.
➢ Here, atomicity means that a single cell cannot hold multiple values. It must hold only a
single-valued attribute.
➢ The First Normal Form disallows multi-valued attributes, composite attributes, and
their combinations.
First normal form
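A minimal sketch of moving a table to 1NF, assuming a hypothetical phone column that stores several comma-separated values in one cell. The function to_1nf is an illustrative name.

```python
def to_1nf(rows, column, sep=","):
    """Emit one row per value so that every cell in column is atomic."""
    out = []
    for row in rows:
        for value in row[column].split(sep):
            out.append({**row, column: value.strip()})
    return out

# Hypothetical unnormalized data: the phone cell of roll_no 42 holds two values.
students = [
    {"roll_no": 42, "name": "Adi", "phone": "111, 222"},
    {"roll_no": 43, "name": "Sunny", "phone": "333"},
]

atomic = to_1nf(students, "phone")
for row in atomic:
    print(row)
```

After the split, roll_no alone no longer identifies a row; the key becomes the pair (roll_no, phone), which is the kind of situation the later normal forms address.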
23. Normalization
Second Normal Form (2NF):
The first condition for a table to be in Second Normal Form is that it is in First
Normal Form. The second is that the table has no partial dependency.
➢ A partial dependency exists when a non-prime attribute is determined by a proper
subset of a candidate key rather than by the whole key.
➢ To bring the table to Second Normal Form, you need to split table-1 into two parts.
This gives the tables below:
Second normal form
24. Normalization
Third Normal Form (3NF):
The Third Normal Form reduces data duplication. It is also used to
achieve data integrity.
➢ The first condition for the table to be in Third Normal Form is that the table should be
in the Second Normal Form.
➢ The second condition is that no non-prime attribute is transitively dependent on a
candidate key, which means that non-prime attributes (those not part of any
candidate key) should not depend on other non-prime attributes in the table. A
transitive dependency is a functional dependency in which A → C (A determines C)
holds indirectly, because of A → B and B → C (where it is not the case that B → A).
25. Normalization
Third Normal Form (3NF):
➢ Below is a student table that has student id, student name, subject id, subject name, and
address of the student as its columns.
➢ Now to change the table to the third normal form, you need to divide the table as shown
below:
Third normal form
26. Normalization
Boyce Codd Normal Form (BCNF):
Boyce Codd Normal Form is also known as 3.5 NF. It is a stricter version of 3NF,
developed by Raymond F. Boyce and Edgar F. Codd to tackle certain types of
anomalies that 3NF does not resolve.
➢ The first condition for the table to be in Boyce Codd Normal Form is that the table
should be in the third normal form.
➢ Secondly, for every non-trivial functional dependency, the Left-Hand Side (LHS) must
be a super key of that particular table.
➢ For example: given a functional dependency X → Y,
X has to be a super key of the provided table.
27. Normalization
Boyce Codd Normal Form (BCNF):
The subject table follows these conditions:
❖ Each student can enroll in multiple subjects.
❖ Multiple professors can teach a particular subject.
❖ For each subject, one professor is assigned to the student.
➢ In this table, student_id and subject together form the primary key, because together
they determine all the table columns.
➢ However, there is another dependency: professor → subject. Since professor is not a
super key, the table is not in BCNF.
BCNF normal form
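The BCNF condition can be checked mechanically: an FD violates BCNF if it is non-trivial and the closure of its left side does not cover every attribute. The sketch below encodes the two dependencies described for the subject table; the function names closure and bcnf_violations are illustrative.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """List the non-trivial FDs whose determinant is not a super key."""
    bad = []
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial FDs never violate BCNF
        if closure(lhs, fds) != set(attributes):
            bad.append((lhs, rhs))
    return bad

attributes = {"student_id", "subject", "professor"}
fds = [
    ({"student_id", "subject"}, {"professor"}),
    ({"professor"}, {"subject"}),
]

violations = bcnf_violations(attributes, fds)
print(violations)
```

The only violation reported is professor → subject, which is why the usual fix is to move that dependency into its own table keyed by professor.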
28. Normalization
Fourth normal form (4NF):
A table is in 4th Normal Form if no table instance contains two or more independent,
multi-valued facts about the same entity.
➢ A relation is in 4NF if it is in Boyce Codd normal form and has no multi-valued
dependency.
➢ A multi-valued dependency A →→ B exists when a single value of A is associated with
a set of values of B, independently of the other attributes in the relation.
STU_ID   COURSE      HOBBY
21       Computer    Dancing
21       Math        Singing
34       Chemistry   Dancing
74       Biology     Cricket
59       Physics     Hockey
➢ The given STUDENT table is in 3NF, but
COURSE and HOBBY are two independent attributes of a student.
Hence, there is no relationship between COURSE
and HOBBY.
29. Normalization
Fourth normal form (4NF):
➢ In the STUDENT relation, the student with STU_ID 21 has two courses, Computer
and Math, and two hobbies, Dancing and Singing. So there are multi-valued
dependencies on STU_ID, which lead to unnecessary repetition of data.
➢ To bring the table into 4NF, we can decompose it into two tables, one for courses
and one for hobbies:
4NF normal form
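The decomposition can be sketched as two projections of the STUDENT rows. The names student_course and student_hobby are assumptions for illustration; the slide's own table names are not given in the text.

```python
# STUDENT rows as (STU_ID, COURSE, HOBBY).
student = [
    (21, "Computer", "Dancing"),
    (21, "Math", "Singing"),
    (34, "Chemistry", "Dancing"),
    (74, "Biology", "Cricket"),
    (59, "Physics", "Hockey"),
]

# Each independent multi-valued fact gets its own table.
student_course = sorted({(stu, course) for stu, course, _ in student})
student_hobby = sorted({(stu, hobby) for stu, _, hobby in student})

# Joining the two tables on STU_ID pairs every course with every hobby.
rejoined = {
    (stu, course, hobby)
    for stu, course in student_course
    for stu2, hobby in student_hobby
    if stu == stu2
}

print(student_course)
print(student_hobby)
print(len(rejoined))
```

The join yields seven rows rather than five because student 21 gets all four course and hobby pairings. That is what independence of the two facts means: a single flat table would need those extra rows to represent it faithfully, which is the repetition the decomposition avoids.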
30. Normalization
Fifth normal form (5NF):
A table is in 5th Normal Form only if it is in 4NF and cannot be decomposed further
into smaller tables without loss of data.
➢ A relation is in 5NF if it is in 4NF and contains no join dependency other than those
implied by its candidate keys, so that any further decomposition would not be lossless.
➢ 5NF is reached when tables have been broken into as many tables as possible, without
losing information, in order to avoid redundancy.
➢ 5NF is also known as Project-join normal form (PJ/NF).
➢ In the table, John takes both the Computer and Math classes in
Semester 1 but does not take the Math class in Semester 2. In
this case, the combination of all these fields is required to identify
valid data.
31. Normalization
Fifth normal form (5NF):
➢ Suppose we add a new semester, Semester 3, but do not yet know the subject or who
will teach it, so we would have to leave Lecturer and Subject as NULL. But all three
columns together act as the primary key, so the other two columns cannot be left blank.
➢ So to bring the table into 5NF, we can decompose it into three relations P1, P2 &
P3:
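The three-way decomposition can be tried out with a short sketch. The slide's table is not reproduced in the text, so the rows below are hypothetical, chosen only to be consistent with the description (John teaches Computer and Math in Semester 1 but not Math in Semester 2). The column choice for P1, P2, and P3 is likewise an assumption.

```python
# Hypothetical rows as (subject, lecturer, semester).
rows = {
    ("Computer", "Anshika", "Semester 1"),
    ("Computer", "John", "Semester 1"),
    ("Math", "John", "Semester 1"),
    ("Math", "Akash", "Semester 2"),
    ("Chemistry", "Praveen", "Semester 1"),
}

# Assumed projections: P1(semester, subject), P2(subject, lecturer), P3(semester, lecturer).
p1 = {(sem, sub) for sub, lec, sem in rows}
p2 = {(sub, lec) for sub, lec, sem in rows}
p3 = {(sem, lec) for sub, lec, sem in rows}

# Joining only P1 and P2 on subject produces spurious rows.
two_way = {
    (sub, lec, sem)
    for sem, sub in p1
    for sub2, lec in p2
    if sub == sub2
}

# Filtering by P3 completes the three-way join and removes them again.
three_way = {(sub, lec, sem) for sub, lec, sem in two_way if (sem, lec) in p3}

print(len(two_way), three_way == rows)
```

With this data, two of the projections are not enough to rebuild the table (the two-way join has seven rows), while all three together recover it exactly. This is the join dependency that 5NF is concerned with.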