This presentation discusses the following topics:
Purpose of normalization.
Problems associated with redundant data.
Identification of various types of update anomalies such as insertion, deletion, and modification anomalies.
How to recognize the appropriateness or quality of the design of relations.
How functional dependencies can be used to group attributes into relations that are in a known normal form.
How to undertake the process of normalization.
How to identify the most commonly used normal forms, namely 1NF, 2NF, and 3NF.
Normalization
Department of Information Technology, Database Technologies (ITB4201)
Normalization: An Introduction
Dr. C.V. Suresh Babu
Professor
Department of IT
Hindustan Institute of Science & Technology
Action Plan
• Purpose of normalization.
• Problems associated with redundant data.
• Identification of various types of update anomalies such as insertion, deletion,
and modification anomalies.
• How to recognize the appropriateness or quality of the design of relations.
• How functional dependencies can be used to group attributes into relations that
are in a known normal form.
• How to undertake the process of normalization.
• How to identify the most commonly used normal forms, namely 1NF, 2NF, and 3NF.
• Quiz
Unnormalized table
Project Code  Project Name  Project Manager  Project Budget  Emp. No.  Employee Name  Department No.  Department Name  Hourly Rate
PC010         Banking       Mrs. Narayani    100000          101       Viji           D03             Database         21.00
                                                             102       Ambika         D02             Testing          16.50
                                                             103       Vicky          D01             IT               22.00
PC011         ERP           Mrs. Sasikala    200000          104       Shakthi        D03             Database         18.50
                                                             105       Shalini        D02             Testing          17.00
                                                             106       Pandari        D01             IT               23.50
PC012         E-commerce    Mrs. Kamala      300000          107       Vidya          D03             Database         21.50
                                                             108       Fathima        D02             Testing          15.50
                                                             109       Padma          D01             IT               20.50
Issues in the unnormalized table:
• A single record cannot be retrieved uniquely; for example, the row for
Employee 102 carries no project code of its own.
• To change Department No. D03 to D04, every record containing it must be
updated; otherwise the data becomes inconsistent.
• The number of records in the database cannot be counted reliably.
• Duplicated values such as Department Name and Department No. occupy extra
space on the storage devices.
Advantages of Normalized forms
• Ease of use
• Flexibility
• Precision
• Security
• Reliability
• Ease of Implementation.
• Clarity
Introduction to Normalization
• Normalization: the process of decomposing unsatisfactory ("bad") relations
by breaking up their attributes into smaller relations.
• Normal form: a condition, stated in terms of the keys and functional
dependencies (FDs) of a relation, that certifies whether a relation schema is in
a particular normal form.
– 2NF, 3NF, and BCNF are based on the keys and FDs of a relation schema.
– 4NF is based on keys and multi-valued dependencies.
Data Redundancy
• A major aim of relational database design is to group attributes into relations so as to minimize data
redundancy and reduce the file storage space required by the base relations.
• The problems associated with data redundancy are illustrated by comparing the Staff
and Branch relations with the StaffBranch relation.
Data Redundancy
• The StaffBranch relation has redundant data: the details of a branch are repeated for every member of
staff.
• In contrast, branch information appears only once for each branch in the Branch relation, and only
branchNo is repeated in the Staff relation, to represent where each member of staff works.
Update Anomalies
• Relations that contain redundant information may suffer from update
anomalies.
• Insertion anomaly: how can a new major be added before any student has enrolled in it?
• Modification anomaly: what happens if we change the office of Smith in the first tuple only?
• Deletion anomaly: what happens if Scott is deleted?
SID Name Major GPA Advisor Office
2011 John CS 3.4 Smith 3345
1235 Carl CS 3.2 Smith 3345
1003 Ken Math 3.5 Johnson 1120
1034 Bill Math 2.5 Johnson 1120
2005 Mary CS 2.9 Smith 3345
2078 Frank Math 4.0 Johnson 1120
1922 Scott Chem 3.45 Ford 2525
Students
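The three anomalies can be reproduced on the Students table with a short Python sketch. The tuple layout, the new office number 4000, and the Physics major are illustrative assumptions, as is the premise that SID is the primary key.

```python
# The Students relation from the slide; the tuple layout is an illustrative assumption.
COLUMNS = ("SID", "Name", "Major", "GPA", "Advisor", "Office")
students = [
    (2011, "John", "CS", 3.4, "Smith", 3345),
    (1235, "Carl", "CS", 3.2, "Smith", 3345),
    (1003, "Ken", "Math", 3.5, "Johnson", 1120),
    (1034, "Bill", "Math", 2.5, "Johnson", 1120),
    (2005, "Mary", "CS", 2.9, "Smith", 3345),
    (2078, "Frank", "Math", 4.0, "Johnson", 1120),
    (1922, "Scott", "Chem", 3.45, "Ford", 2525),
]
NAME = COLUMNS.index("Name")
MAJOR = COLUMNS.index("Major")
ADVISOR = COLUMNS.index("Advisor")
OFFICE = COLUMNS.index("Office")


def offices_by_advisor(rows):
    """Collect every office recorded for each advisor."""
    offices = {}
    for row in rows:
        offices.setdefault(row[ADVISOR], set()).add(row[OFFICE])
    return offices


# Modification anomaly: update Smith's office in the first tuple only.
modified = list(students)
modified[0] = modified[0][:OFFICE] + (4000,)  # 4000 is a hypothetical new office
conflicting = {a for a, rooms in offices_by_advisor(modified).items() if len(rooms) > 1}

# Deletion anomaly: removing Scott also removes the only record of the Chem major.
after_delete = [row for row in students if row[NAME] != "Scott"]
majors_left = {row[MAJOR] for row in after_delete}

# Insertion anomaly: a major with no students has no SID, so (assuming SID is
# the primary key) no valid tuple can be formed for it.
new_major = (None, None, "Physics", None, "Lee", 1500)  # hypothetical values
insertable = new_major[0] is not None
```

Here `conflicting` ends up naming Smith, `majors_left` no longer contains Chem, and `insertable` is false.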
A Better Design
• Decompose Students into two relations
Major Advisor Office
CS Smith 3345
Math Johnson 1120
Chem Ford 2525
Major_Advisor
SID Name Major GPA
2011 John CS 3.4
1235 Carl CS 3.2
1003 Ken Math 3.5
1034 Bill Math 2.5
2005 Mary CS 2.9
2078 Frank Math 4.0
1922 Scott Chem 3.45
Students
Decomposition can remove redundancy.
It may also cause problems if not done carefully.
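Both points can be checked with a small sketch: projecting Students onto the two relations and joining them back should reproduce the original tuples, while a careless split with no shared attribute produces spurious tuples. The helper names `project`, `natural_join`, and `as_set` are assumptions made for this example.

```python
COLUMNS = ("SID", "Name", "Major", "GPA", "Advisor", "Office")
rows = [
    (2011, "John", "CS", 3.4, "Smith", 3345),
    (1235, "Carl", "CS", 3.2, "Smith", 3345),
    (1003, "Ken", "Math", 3.5, "Johnson", 1120),
    (1034, "Bill", "Math", 2.5, "Johnson", 1120),
    (2005, "Mary", "CS", 2.9, "Smith", 3345),
    (2078, "Frank", "Math", 4.0, "Johnson", 1120),
    (1922, "Scott", "Chem", 3.45, "Ford", 2525),
]
students = [dict(zip(COLUMNS, r)) for r in rows]


def project(relation, attrs):
    """Relational projection: keep attrs and drop duplicate tuples."""
    result = []
    for row in relation:
        reduced = {a: row[a] for a in attrs}
        if reduced not in result:
            result.append(reduced)
    return result


def natural_join(left, right):
    """Join two relations on every attribute name they share."""
    joined = []
    for l in left:
        for r in right:
            if all(l[a] == r[a] for a in set(l) & set(r)):
                joined.append({**l, **r})
    return joined


def as_set(relation):
    """Order-independent view of a relation, for comparison."""
    return {tuple(sorted(row.items())) for row in relation}


# The decomposition from the slide: Major is the shared attribute.
major_advisor = project(students, ("Major", "Advisor", "Office"))
students_only = project(students, ("SID", "Name", "Major", "GPA"))
lossless = as_set(natural_join(students_only, major_advisor)) == as_set(students)

# A careless split that drops Major from the student side: with no shared
# attribute the join degenerates into a cross product.
careless_left = project(students, ("SID", "Name", "GPA"))
careless_join = natural_join(careless_left, major_advisor)
```

In this sketch `major_advisor` holds three tuples instead of seven, and the careless join yields 21 tuples, most of them spurious.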
Functional Dependency
•Main concept associated with normalization.
•Functional Dependency
• Describes relationship between attributes in a relation.
• If A and B are attributes of relation R, B is functionally dependent on A (denoted
A → B) if each value of A in R is associated with exactly one value of B in R.
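As a minimal sketch of this definition, the function below tests whether A → B holds in one sample of a relation. The name `fd_holds` is an assumption. A sample can only refute a dependency: a real functional dependency is a statement about every possible state of the relation.

```python
COLUMNS = ("SID", "Name", "Major", "GPA", "Advisor", "Office")
rows = [
    (2011, "John", "CS", 3.4, "Smith", 3345),
    (1235, "Carl", "CS", 3.2, "Smith", 3345),
    (1003, "Ken", "Math", 3.5, "Johnson", 1120),
    (1034, "Bill", "Math", 2.5, "Johnson", 1120),
    (2005, "Mary", "CS", 2.9, "Smith", 3345),
    (2078, "Frank", "Math", 4.0, "Johnson", 1120),
    (1922, "Scott", "Chem", 3.45, "Ford", 2525),
]
students = [dict(zip(COLUMNS, r)) for r in rows]


def fd_holds(relation, lhs, rhs):
    """Return True if every lhs value maps to exactly one rhs value in this sample."""
    seen = {}
    for row in relation:
        a = tuple(row[x] for x in lhs)
        b = tuple(row[x] for x in rhs)
        # setdefault stores b the first time a is seen, then returns the stored value.
        if seen.setdefault(a, b) != b:
            return False
    return True


major_determines_advisor = fd_holds(students, ("Major",), ("Advisor", "Office"))
advisor_determines_sid = fd_holds(students, ("Advisor",), ("SID",))
```

On the Students sample, Major → Advisor, Office holds, while Advisor → SID does not, because Smith advises several students.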
Functional Dependency
•Property of the meaning (or semantics) of
the attributes in a relation.
•Diagrammatic representation:
The determinant of a functional dependency is the
attribute or group of attributes on the left-hand
side of the arrow.
Example - Functional Dependency
Functional Dependency
•Main characteristics of functional dependencies used in
normalization:
•have a 1:1 relationship between attribute(s) on left and right-hand
side of a dependency;
•hold for all time;
•are nontrivial.
Functional Dependency
•Complete set of functional dependencies for a given relation can be very
large.
•Important to find an approach that can reduce set to a manageable size.
•Need to identify set of functional dependencies (X) for a relation that is
smaller than complete set of functional dependencies (Y) for that
relation and has property that every functional dependency in Y is
implied by functional dependencies in X.
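One way to shrink a set of dependencies is to drop every dependency that the remaining ones already imply. The sketch below tests implication with an attribute closure. The function names and the three sample dependencies are assumptions, and this is not a full minimal-cover algorithm (it does not trim determinants).

```python
def closure(attrs, fds):
    """All attributes determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def drop_redundant(fds):
    """Remove each dependency that is implied by the ones still kept."""
    kept = list(fds)
    for fd in fds:
        others = [f for f in kept if f != fd]
        lhs, rhs = fd
        if set(rhs) <= closure(lhs, others):
            kept = others
    return kept


# Y: a larger set in which SID -> Advisor follows from the other two (sample data).
Y = [
    (("SID",), ("Major",)),
    (("Major",), ("Advisor",)),
    (("SID",), ("Advisor",)),
]
X = drop_redundant(Y)
```

Here `X` keeps SID → Major and Major → Advisor, and every dependency in `Y` is still implied by `X`.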
Functional Dependency
•Let A, B, and C be subsets of the attributes of relation R.
Armstrong’s axioms are as follows:
1. Reflexivity
If B is a subset of A, then A → B
2. Augmentation
If A → B, then A,C → B,C
3. Transitivity
If A → B and B → C, then A → C
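The three axioms can be written as small functions that derive a new dependency from existing ones. This is a sketch only: the `FD` type, the function names, and the sample attributes are assumptions introduced for the example.

```python
from typing import FrozenSet, Optional, Tuple

# A dependency is a pair (determinant, dependent) of attribute sets.
FD = Tuple[FrozenSet[str], FrozenSet[str]]


def fd(lhs, rhs) -> FD:
    """Build a dependency lhs -> rhs from any two iterables of attribute names."""
    return (frozenset(lhs), frozenset(rhs))


def reflexivity(a, b) -> Optional[FD]:
    """If B is a subset of A, then A -> B; otherwise nothing is derived."""
    a, b = frozenset(a), frozenset(b)
    return (a, b) if b <= a else None


def augmentation(f: FD, c) -> FD:
    """If A -> B, then A,C -> B,C."""
    lhs, rhs = f
    c = frozenset(c)
    return (lhs | c, rhs | c)


def transitivity(f: FD, g: FD) -> Optional[FD]:
    """If A -> B and B -> C, then A -> C; otherwise nothing is derived."""
    (a, b), (b2, c) = f, g
    return (a, c) if b == b2 else None


sid_major = fd({"SID"}, {"Major"})
major_advisor = fd({"Major"}, {"Advisor"})

derived_trivial = reflexivity({"SID", "Name"}, {"Name"})
derived_augmented = augmentation(sid_major, {"Name"})
derived_transitive = transitivity(sid_major, major_advisor)
```

For example, `derived_transitive` is SID → Advisor, obtained from SID → Major and Major → Advisor.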
Relationship Between Normal Forms
Unnormalized Form (UNF)
•A table that contains one or more repeating groups.
•To create an unnormalized table:
• transform data from information source (e.g. form) into table format with columns
and rows.
First Normal Form (1NF)
•A relation in which intersection of each row and column
contains one and only one value.
UNF to 1NF
•Nominate an attribute or group of attributes to act as the key for the
unnormalized table.
•Identify repeating group(s) in unnormalized table which repeats for
the key attribute(s).
UNF to 1NF
•Remove repeating group by:
• entering appropriate data into the empty columns of rows containing repeating
data (‘flattening’ the table).
Or by
• placing repeating data along with copy of the original key attribute(s) into a
separate relation.
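Both ways of removing a repeating group can be sketched on the first rows of the unnormalized project table. The tuple layout, the use of `None` for empty cells, and the function names are assumptions made for illustration.

```python
# First rows of the unnormalized project table; None marks the empty cells
# of the repeating group.
unf = [
    ("PC010", "Banking", "Mrs. Narayani", 100000, 101, "Viji", "D03", "Database", 21.00),
    (None, None, None, None, 102, "Ambika", "D02", "Testing", 16.50),
    (None, None, None, None, 103, "Vicky", "D01", "IT", 22.00),
    ("PC011", "ERP", "Mrs. Sasikala", 200000, 104, "Shakthi", "D03", "Database", 18.50),
    (None, None, None, None, 105, "Shalini", "D02", "Testing", 17.00),
]
PROJECT_COLS = 4  # the first four columns describe the project


def flatten(rows, n_project_cols):
    """Option 1: fill the empty columns from the most recent complete row."""
    flat, current = [], None
    for row in rows:
        if row[0] is not None:
            current = row[:n_project_cols]
        flat.append(current + row[n_project_cols:])
    return flat


def split(rows, n_project_cols):
    """Option 2: move the repeating data to its own relation with a copy of the key."""
    flat = flatten(rows, n_project_cols)
    projects = sorted({r[:n_project_cols] for r in flat})
    assignments = [(r[0],) + r[n_project_cols:] for r in flat]
    return projects, assignments


flat = flatten(unf, PROJECT_COLS)
projects, assignments = split(unf, PROJECT_COLS)
```

After flattening, every row carries its project code, so a row such as Employee 102 can be addressed by the pair (Project Code, Emp. No.).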
Second Normal Form (2NF)
•Based on concept of full functional dependency:
• If A and B are attributes of a relation, B is fully functionally dependent on A
if B is functionally dependent on A but not on any proper subset of A.
•2NF - A relation that is in 1NF and every non-primary-key attribute is
fully functionally dependent on the primary key.
1NF to 2NF
• Identify primary key for the 1NF relation.
• Identify functional dependencies in the relation.
• If partial dependencies exist on the primary key, remove them by
placing them in a new relation along with a copy of their determinant.
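The steps above can be sketched for the flattened project table. The attribute names, the composite key (ProjectCode, EmpNo), and the listed dependencies are assumptions read off the sample data, not facts stated in the slides.

```python
def partial_dependencies(primary_key, fds):
    """Dependencies whose determinant is a proper subset of the primary key."""
    pk = frozenset(primary_key)
    return [(lhs, rhs) for lhs, rhs in fds if frozenset(lhs) < pk]


def to_2nf(attributes, primary_key, fds):
    """Move each partially dependent group into a new relation with its determinant."""
    remaining = list(attributes)
    new_relations = []
    for lhs, rhs in partial_dependencies(primary_key, fds):
        new_relations.append(tuple(lhs) + tuple(rhs))
        remaining = [a for a in remaining if a not in rhs]
    return [tuple(remaining)] + new_relations


# Assumed schema and dependencies for the 1NF project table.
attributes = (
    "ProjectCode", "ProjectName", "ProjectManager", "ProjectBudget",
    "EmpNo", "EmployeeName", "DeptNo", "DeptName", "HourlyRate",
)
primary_key = ("ProjectCode", "EmpNo")
fds = [
    (("ProjectCode",), ("ProjectName", "ProjectManager", "ProjectBudget")),
    (("EmpNo",), ("EmployeeName", "DeptNo", "DeptName", "HourlyRate")),
    (("DeptNo",), ("DeptName",)),  # not partial: DeptNo is not part of the key
]

relations_2nf = to_2nf(attributes, primary_key, fds)
```

Under these assumptions the result is three relations: the key pair (ProjectCode, EmpNo), a project relation, and an employee relation. The dependency DeptNo → DeptName is left in place, since it is not a partial dependency.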
Third Normal Form (3NF)
•Based on concept of transitive dependency:
• If A, B and C are attributes of a relation such that A → B and B → C,
then C is transitively dependent on A via B (provided that A is not
functionally dependent on B or C).
•3NF - A relation that is in 1NF and 2NF and in which no non-primary-key
attribute is transitively dependent on the primary key.
2NF to 3NF
•Identify the primary key in the 2NF relation.
•Identify functional dependencies in the relation.
•If transitive dependencies exist on the primary key, remove them
by placing them in a new relation along with a copy of their
determinant.
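The same steps can be sketched for the original Students relation, taking SID as the primary key and Major → Advisor, Office as the transitive dependency. The function names are assumptions, and the check is a simplification that presumes a single candidate key and a relation already in 2NF.

```python
def transitive_dependencies(primary_key, fds):
    """Dependencies whose determinant does not contain the primary key.

    Simplification: assumes one candidate key and a relation already in 2NF,
    so any such determinant is made of non-key attributes.
    """
    pk = set(primary_key)
    return [(lhs, rhs) for lhs, rhs in fds if not pk <= set(lhs)]


def to_3nf(attributes, primary_key, fds):
    """Move each transitively dependent group into a new relation with its determinant."""
    remaining = list(attributes)
    new_relations = []
    for lhs, rhs in transitive_dependencies(primary_key, fds):
        new_relations.append(tuple(lhs) + tuple(rhs))
        remaining = [a for a in remaining if a not in rhs]
    return [tuple(remaining)] + new_relations


attributes = ("SID", "Name", "Major", "GPA", "Advisor", "Office")
primary_key = ("SID",)
fds = [
    (("SID",), ("Name", "Major", "GPA")),
    (("Major",), ("Advisor", "Office")),  # SID -> Major -> Advisor, Office
]

relations_3nf = to_3nf(attributes, primary_key, fds)
```

The result matches the decomposition shown earlier: Students(SID, Name, Major, GPA) and Major_Advisor(Major, Advisor, Office).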
General Definitions of 2NF and 3NF
•Second normal form (2NF)
• A relation that is in 1NF and every non-primary-key attribute is fully
functionally dependent on any candidate key.
•Third normal form (3NF)
• A relation that is in 1NF and 2NF and in which no non-primary-key attribute
is transitively dependent on any candidate key.
Review of Normalization (UNF to BCNF)
Example
https://beginnersbook.com/2015/05/normalization-in-dbms/
https://www.studytonight.com/dbms/database-normalization.php
Test Yourself
1. Functional Dependencies are the types of constraints that are based on______
a) Key
b) Key revisited
c) Superset key
d) None of the mentioned
2. Which is a bottom-up approach to database design that works by examining the relationships between attributes:
a) Functional dependency
b) Database modeling
c) Normalization
d) Decomposition
3. In the __________ normal form, a composite attribute is converted to individual attributes.
a) First
b) Second
c) Third
d) Fourth
4. A table on the many side of a one-to-many or many-to-many relationship must:
a) Be in Second Normal Form (2NF)
b) Be in Third Normal Form (3NF)
c) Have a single attribute key
d) Have a composite key
5. Tables in second normal form (2NF):
a) Eliminate all hidden dependencies
b) Eliminate the possibility of insertion anomalies
c) Have a composite key
d) Have all non-key fields depend on the whole primary key
Answers
1. Functional Dependencies are the types of constraints that are based on______
a) Key
b) Key revisited
c) Superset key
d) None of the mentioned
2. Which is a bottom-up approach to database design that works by examining the relationships between attributes:
a) Functional dependency
b) Database modeling
c) Normalization
d) Decomposition
3. In the __________ normal form, a composite attribute is converted to individual attributes.
a) First
b) Second
c) Third
d) Fourth
4. A table on the many side of a one-to-many or many-to-many relationship must:
a) Be in Second Normal Form (2NF)
b) Be in Third Normal Form (3NF)
c) Have a single attribute key
d) Have a composite key
5. Tables in second normal form (2NF):
a) Eliminate all hidden dependencies
b) Eliminate the possibility of insertion anomalies
c) Have a composite key
d) Have all non-key fields depend on the whole primary key