Database Normalization Explained

• Normalization is the process of converting a complex, large and unstable
relation into a set of simple, small and stable relations.
• It is a way of organizing data in a database efficiently.
• Normalization results in a well-structured relation: a relation
that contains minimal redundancy and allows inserts, updates and
deletes without errors or inconsistencies.
• Errors or inconsistencies caused by redundant data are
called anomalies.
• There are three types of anomalies:
a) Insert Anomaly.
b) Delete Anomaly.
c) Update Anomaly.

a) Insert Anomaly.
• It occurs when extra data, beyond the data you actually want to
store, must be added to the database.
b) Update Anomaly.
• It occurs when multiple rows must be changed
to modify a single fact.
c) Delete Anomaly.
• It occurs when deleting a row also removes other data that should
have been kept.
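
The three anomalies can be illustrated with a small sketch. The table, column names and values below are hypothetical and chosen only for illustration; they do not come from the slides.

```python
# Hypothetical unnormalized table: every enrolment row repeats the course title.
rows = [
    {"reg_no": 1, "name": "Ali", "c_code": "CS101", "c_title": "Databases"},
    {"reg_no": 2, "name": "Sara", "c_code": "CS101", "c_title": "Databases"},
    {"reg_no": 2, "name": "Sara", "c_code": "CS102", "c_title": "Networks"},
]


def rename_course(table, c_code, new_title):
    """Update anomaly: one fact (a course title) is stored in many rows."""
    changed = 0
    for row in table:
        if row["c_code"] == c_code:
            row["c_title"] = new_title
            changed += 1
    return changed


def delete_student(table, reg_no):
    """Delete anomaly: removing a student can remove the last trace of a course."""
    before = {r["c_code"] for r in table}
    remaining = [r for r in table if r["reg_no"] != reg_no]
    after = {r["c_code"] for r in remaining}
    return remaining, before - after


# Changing one course title touches every row that mentions the course.
rows_touched = rename_course(rows, "CS101", "Database Systems")

# Deleting student 2 also loses all knowledge of course CS102.
remaining, lost_courses = delete_student(rows, 2)

# Insert anomaly: a course with no students can only be stored by
# inventing placeholder student data.
new_row = {"reg_no": None, "name": None, "c_code": "CS103", "c_title": "Compilers"}
```

Here `rows_touched` counts the rows changed for a single fact, and `lost_courses` holds the courses that disappeared together with the deleted student.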

• Normalization is based on the analysis of Functional Dependency (FD).
• A FD is a relationship between two attributes A and B of a relation R:
attribute B is functionally dependent on attribute A if each value of A
determines exactly one value of B (written A -> B).
• The attribute on the left-hand side of the arrow in a functional
dependency is called the determinant; the attribute on the right-hand
side is called the dependent.
• An attribute may be functionally dependent on a combination of two or
more attributes.

Example:
EMPLOYEE1 (EMP_ID, Name, Dept_Name, Salary)
EMPLOYEE2 (EMP_ID, Course_Title, Name, Dept_Name, Salary, Date_Completed)
In EMPLOYEE2, the date a course is completed is completely determined
by EMP_ID and Course_Title together:
EMP_ID, Course_Title -> Date_Completed
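
A functional dependency can be tested against sample data. The sketch below is a minimal illustration: `fd_holds` is a hypothetical helper and the rows are made up. Note that sample rows can only refute a dependency; they cannot prove that it holds for all possible data.

```python
def fd_holds(rows, determinant, dependents):
    """Return True if no two rows agree on determinant but differ on dependents."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependents)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Made-up sample rows for EMPLOYEE2 (only some columns shown).
employee2 = [
    {"EMP_ID": 100, "Course_Title": "SPSS", "Name": "Simpson",
     "Date_Completed": "2020-06-19"},
    {"EMP_ID": 100, "Course_Title": "Surveys", "Name": "Simpson",
     "Date_Completed": "2020-10-07"},
    {"EMP_ID": 140, "Course_Title": "SPSS", "Name": "Beeton",
     "Date_Completed": "2020-12-08"},
]

# EMP_ID alone does not determine the completion date ...
emp_determines_date = fd_holds(employee2, ["EMP_ID"], ["Date_Completed"])
# ... but EMP_ID and Course_Title together do.
pair_determines_date = fd_holds(
    employee2, ["EMP_ID", "Course_Title"], ["Date_Completed"]
)
# EMP_ID alone is enough to determine the employee name.
emp_determines_name = fd_holds(employee2, ["EMP_ID"], ["Name"])
```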

• Functional Dependencies can be thought of as integrity
constraints that encode data semantics.
• Functional Dependencies help in identifying the keys of a
given relation and in replacing a relation with a collection of
smaller relations.
• An attribute or set of attributes is a key if it functionally
determines all the other attributes of the relation.
• Example: Consider a relation R (A, B, C, D, E) with the following
FDs:
• A -> D
• D -> B
• B -> C
• E -> B
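
The key of R in this example can be found by computing attribute closures. The sketch below is one way to do it, assuming the four FDs listed above are the only ones; `closure` and `candidate_keys` are hypothetical helper names.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return the minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(attributes):
                keys.append(combo)
    return keys


R = {"A", "B", "C", "D", "E"}
FDS = [({"A"}, {"D"}), ({"D"}, {"B"}), ({"B"}, {"C"}), ({"E"}, {"B"})]

closure_a = closure({"A"}, FDS)   # A reaches D, then B, then C, but never E
closure_e = closure({"E"}, FDS)   # E reaches B, then C
keys = candidate_keys(R, FDS)     # only A and E together determine everything
```

Neither A nor E appears on the right-hand side of any FD, so both must be part of every key; together they determine all of R, which makes (A, E) the only candidate key.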

• The normalization process is built around the concept of Normal
Forms.
• A Normal Form is a state of a relation that can be determined
by applying simple rules regarding functional dependencies:
◦ First Normal Form
◦ Second Normal Form
◦ Third Normal Form
◦ Boyce Codd Normal Form
◦ Fourth Normal Form
◦ Fifth Normal Form

Starting from a table with multivalued attributes:
◦ Remove multivalued attributes to reach First Normal Form
◦ Remove partial dependencies to reach Second Normal Form
◦ Remove transitive dependencies to reach Third Normal Form
◦ Remove remaining anomalies to reach Boyce-Codd Normal Form
◦ Remove multivalued dependencies to reach Fourth Normal Form
◦ Remove remaining anomalies to reach Fifth Normal Form

• A relation is in 1NF if it contains no Repeating Group (RG).
• A RG is a collection of multivalued attributes; a RG also exists when more
than one field in a single table stores the same kind of information.
• To eliminate a RG, the value at the intersection of each row and column must
be atomic (a single value).
• If you developed the logical design by transforming an ER diagram into
relations, no multivalued attributes should remain.
• Consider the following relation:
Student (RegNo, Name, Program, C-Code, C-Title, C-Grade)
• This relation has a repeating group consisting of C-Code, C-Title and C-Grade,
and therefore it has insert, delete and update anomalies.
• Multiple values also create problems in operations such as select or
join.

• The relation Student can be converted into 1NF using either of
the following methods:
a) Change the PK of the relation to a composite key of
RegNo and C-Code, and fill the blanks by duplicating the
non-repeating data. This approach is commonly referred to as
flattening the table.
b) Split the relation into two relations by placing the repeating data,
along with a copy of the original key attribute(s), in a separate
relation. The new relation will always have a concatenated (composite) key.
Student (RegNo, Name, Program)
Course (RegNo, C-Code, C-Title, C-Grade)
Example 2: STD (stId, stName, stAdr, prName, bkId)
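
Both methods for the Student relation can be sketched in a few lines. The sample rows and the `Courses` field, which stands in for the repeating group, are assumptions made for illustration only.

```python
# Hypothetical unnormalized Student rows: "Courses" holds the repeating group
# as a list of (C-Code, C-Title, C-Grade) tuples.
students = [
    {"RegNo": 1, "Name": "Ali", "Program": "BSCS",
     "Courses": [("CS101", "Databases", "A"), ("CS102", "Networks", "B")]},
    {"RegNo": 2, "Name": "Sara", "Program": "BSIT",
     "Courses": [("CS101", "Databases", "B")]},
]


def flatten(rows):
    """Method (a): one row per (RegNo, C-Code), duplicating the student data."""
    return [
        {"RegNo": s["RegNo"], "Name": s["Name"], "Program": s["Program"],
         "C-Code": code, "C-Title": title, "C-Grade": grade}
        for s in rows
        for code, title, grade in s["Courses"]
    ]


def split(rows):
    """Method (b): keep non-repeating data in Student, move the group to Course."""
    student = [
        {"RegNo": s["RegNo"], "Name": s["Name"], "Program": s["Program"]}
        for s in rows
    ]
    course = [
        {"RegNo": s["RegNo"], "C-Code": code, "C-Title": title, "C-Grade": grade}
        for s in rows
        for code, title, grade in s["Courses"]
    ]
    return student, course


flat = flatten(students)
student, course = split(students)
```

In both results every cell holds a single atomic value, and (RegNo, C-Code) identifies each row of `flat` and of `course`.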

• A relation is in 2NF if:
◦ It is in 1NF
◦ Every nonkey attribute is fully functionally dependent on the
primary key
• A Partial Functional Dependency arises when the PK
of a relation is composite and a nonkey attribute is
functionally dependent on part (but not all) of the PK.
• Referring to the Course relation:
Course (RegNo, C-Code, C-Title, C-Grade)
• The functional dependencies are:
C-Code -> C-Title (Partial FD)
RegNo, C-Code -> C-Grade (Full FD)

• Since not all nonkey attributes are fully functionally
dependent on the PK (C-Title depends on C-Code alone),
the relation is not in 2NF.
• The anomalies associated with the Course relation are:
a) Insert Anomaly.
• A course cannot be inserted without a student (RegNo).
b) Delete Anomaly.
• Deleting a student can also delete course data that should be kept.
c) Update Anomaly.
• A course cannot be updated independently of the students enrolled in it.
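
The partial dependency in Course can also be found mechanically. This is a minimal sketch, assuming the two FDs listed for Course are given explicitly; `partial_dependencies` is a hypothetical helper that inspects only the FDs passed to it and does not derive implied ones.

```python
def partial_dependencies(primary_key, fds):
    """Return FDs whose determinant is a proper subset of the composite key."""
    key = set(primary_key)
    partial = []
    for lhs, rhs in fds:
        nonkey_rhs = set(rhs) - key
        # "<" on sets means proper subset: part of the key, but not all of it.
        if set(lhs) < key and nonkey_rhs:
            partial.append((tuple(sorted(lhs)), tuple(sorted(nonkey_rhs))))
    return partial


pk = {"RegNo", "C-Code"}
fds = [
    ({"C-Code"}, {"C-Title"}),           # depends on part of the key
    ({"RegNo", "C-Code"}, {"C-Grade"}),  # depends on the whole key
]

found = partial_dependencies(pk, fds)
in_2nf = not found
```

`found` contains only C-Code -> C-Title, so `in_2nf` is false for the Course relation.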

• The process for transforming a 1NF table to 2NF is:
◦ Identify any determinants other than the composite key, and the
columns they determine.
◦ Create and name a new table for each determinant and the
unique columns it determines.
◦ Move the determined columns from the original table to the
new table. The determinant becomes the primary key of the
new table.
◦ Delete the columns you just moved from the original table,
except for the determinant, which will serve as a foreign key.
◦ The original table may be renamed to maintain semantic
meaning.

• The relation Course can be converted into 2NF by
decomposing it into the following relations:
Course (C-Code, C-Title)
Result (RegNo, C-Code, C-Grade)
• A relation in 1NF will be in 2NF if any one of the following holds:
◦ The PK consists of only one attribute OR
◦ No nonkey attributes exist in the relation OR
◦ Every nonkey attribute is functionally dependent on
the full set of primary key attributes
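
The decomposition of Course into Course and Result can be sketched as two projections. The sample rows are made up, and `project`, `join_on` and `as_set` are hypothetical helpers; joining the two new relations back on C-Code shows that no information was lost.

```python
def project(rows, columns):
    """Project rows onto the given columns, dropping duplicate tuples."""
    seen, out = set(), []
    for row in rows:
        values = tuple(row[c] for c in columns)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(columns, values)))
    return out


def join_on(left, right, column):
    """Join two lists of rows on a single shared column."""
    return [{**l, **r} for l in left for r in right if l[column] == r[column]]


def as_set(rows):
    """Order-independent view of a list of rows, for comparison."""
    return {frozenset(r.items()) for r in rows}


# Made-up rows for the 1NF relation Course (RegNo, C-Code, C-Title, C-Grade).
course_1nf = [
    {"RegNo": 1, "C-Code": "CS101", "C-Title": "Databases", "C-Grade": "A"},
    {"RegNo": 1, "C-Code": "CS102", "C-Title": "Networks", "C-Grade": "B"},
    {"RegNo": 2, "C-Code": "CS101", "C-Title": "Databases", "C-Grade": "B"},
]

# The determinant C-Code and the column it determines move to a new table.
course = project(course_1nf, ["C-Code", "C-Title"])
# The original table keeps C-Code as a foreign key and is renamed Result.
result = project(course_1nf, ["RegNo", "C-Code", "C-Grade"])

lossless = as_set(join_on(result, course, "C-Code")) == as_set(course_1nf)
```

Each course title is now stored once in `course`, however many students take the course.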

• A relation is in 3NF if it is in 2NF and has no
Transitive Dependency.
• A Transitive Dependency is a functional dependency between
two or more nonkey attributes of a relation.
• Consider the following relation:
Emp (EmpNo, EName, Job, Sal, Proj-No, Proj-Details)
• In the above relation, there is the following transitive
dependency:
Proj-No -> Proj-Details
• Because of this, project information cannot be maintained
independently of an employee record, and hence there are
anomalies in the relation.

• You can remove a transitive dependency from a relation in the
following way:
• Create a new relation for the transitively dependent
attributes and their determinant, and leave the PK of the new
relation in the old relation to serve as a FK.
Emp (EmpNo, EName, Job, Sal, Proj-No)
Project (Proj-No, Proj-Details)
Another example:
Ex2: STD (stId, stName, stAdr, prName, prCrdts)
◦ stId -> stName, stAdr, prName, prCrdts
◦ prName -> prCrdts
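
The second example can be worked through in code. This is a sketch under the slide's definition of a transitive dependency (an FD among nonkey attributes), assuming the two FDs above are the only ones; the function names are hypothetical, and the slides do not name the new relation.

```python
def transitive_dependencies(primary_key, fds):
    """Return FDs in which both sides consist of nonkey attributes only."""
    key = set(primary_key)
    return [
        (lhs, rhs) for lhs, rhs in fds
        if not (set(lhs) & key) and not (set(rhs) & key)
    ]


def decompose_3nf(relation, primary_key, fds):
    """Move each transitively dependent attribute group into its own relation."""
    remaining = list(relation)
    new_relations = []
    for lhs, rhs in transitive_dependencies(primary_key, fds):
        # The determinant becomes the PK of the new relation ...
        new_relations.append(list(lhs) + list(rhs))
        # ... and stays behind in the old relation as a foreign key.
        remaining = [a for a in remaining if a not in rhs]
    return remaining, new_relations


std = ["stId", "stName", "stAdr", "prName", "prCrdts"]
fds = [
    (("stId",), ("stName", "stAdr", "prName", "prCrdts")),
    (("prName",), ("prCrdts",)),
]

transitive = transitive_dependencies(["stId"], fds)
std_3nf, extracted = decompose_3nf(std, ["stId"], fds)
```

The result is STD (stId, stName, stAdr, prName) plus a new relation (prName, prCrdts), with prName serving as the foreign key in STD.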

• The process of transforming a table into 3NF is:
◦ Identify any determinants, other than the primary key, and the
columns they determine.
◦ Create and name a new table for each determinant and the
unique columns it determines.
◦ Move the determined columns from the original table to the
new table. The determinant becomes the primary key of the
new table.
◦ Delete the columns you just moved from the original table,
except for the determinant, which will serve as a foreign key.
◦ The original table may be renamed to maintain semantic
meaning.

• Student Relation:
RegNo, Name, Address, Program, C-Code, C-Title, C-Grade
• Patient Relation:
PatNo, PatName, PatAge, VisitNo, VisitDate, DNo, DName, DSpeciality,
Diagnosis
• Project Relation:
PNo, PName, PBudget, EmpNo, EName, Job, ChgHour, Hours
