1. Normalization
Normalization is the process of organizing the data in a database to avoid
data redundancy and insertion, update, and deletion anomalies.
2. Let us discuss the anomalies in badly formed tables
• Consider the following badly formed table.
Studid | Sname  | SubAbbre      | Subname                                            | Hrs        | Grade   | Class
s101   | Sachin | DTE, RDM, AMS | Digital Techniques, Relational DBMS, Applied Maths | 48, 64, 48 | C, A, A | CM3G
s102   | Ajit   | DTE, RDM, AMS | Digital Techniques, Relational DBMS, Applied Maths | 48, 64, 48 | A, A, A | CM3G
s103   | Sunita | DTE, RDM, AMS | Digital Techniques, Relational DBMS, Applied Maths | 48, 64, 48 | A, A, C | CM3G
3. The above table suffers from the following anomalies
• 1) Insertion anomaly: Suppose we want to add a new subject (e.g. ETE,
Electrical Technology). It must be added to every student's row.
• 2) Data redundancy: The subject names and class are stored repeatedly
in the table, which wastes storage.
• 3) Update anomaly: If we want to change the subject name Relational
DBMS to DBMS, it must be updated everywhere it appears.
• 4) Deletion anomaly: It is difficult to delete one particular value from
the table without touching the other values stored in the same row.
4. Need for Normalization
• Normalization is performed to avoid or
reduce such anomalies in badly formed tables.
• 1) Reduces redundancy
• 2) Insertion becomes easy
• 3) Deletion becomes easy
• 4) Updating becomes easy
5. Normal Forms
• First normal form(1NF)
• Second normal form(2NF)
• Third normal form(3NF)
• Boyce-Codd normal form (BCNF)
The higher the normal form, the more consistent the data
in the database, i.e. less redundancy and fewer anomalies.
6. 1NF
A database is said to be in 1NF iff:
1) Domains are atomic.
That means a column of a table cannot hold multiple values in one row.
2) Each row must have a unique combination of values.
So the badly formed table above is not in 1NF.
After applying the rules of 1NF:
7. The table below is now in 1NF
Studid | Sname  | SubAbbre | Subname            | Hrs | Grade | Class
s101   | Sachin | DTE      | Digital Techniques | 48  | C     | CM3G
s101   | Sachin | RDM      | Relational DBMS    | 64  | A     | CM3G
s101   | Sachin | AMS      | Applied Maths      | 48  | A     | CM3G
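The conversion to 1NF can be sketched in a few lines of Python. This is only an illustration: the list-of-dicts representation and the names `unnormalized` and `to_1nf` are assumptions made for the example, not part of the original slides.

```python
# Hypothetical in-memory copy of one row of the badly formed table:
# several columns hold multiple values for a single student.
unnormalized = [
    {"studid": "s101", "sname": "Sachin",
     "subabbrev": ["DTE", "RDM", "AMS"],
     "subname": ["Digital Techniques", "Relational DBMS", "Applied Maths"],
     "hrs": [48, 64, 48],
     "grade": ["C", "A", "A"],
     "class": "CM3G"},
]

def to_1nf(rows):
    """Emit one row per (student, subject) so that every value is atomic."""
    flat = []
    for row in rows:
        subjects = zip(row["subabbrev"], row["subname"], row["hrs"], row["grade"])
        for abbrev, name, hrs, grade in subjects:
            flat.append({
                "studid": row["studid"], "sname": row["sname"],
                "subabbrev": abbrev, "subname": name,
                "hrs": hrs, "grade": grade, "class": row["class"],
            })
    return flat

rows_1nf = to_1nf(unnormalized)
for r in rows_1nf:
    print(r)
```

Each multi-valued row becomes three atomic rows, at the cost of repeating the student's name and class in every one of them.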
8. Before starting 2NF we need to know FD (Functional Dependency)
• A functional dependency is a relationship that exists when one
attribute uniquely determines another attribute.
• If R is a relation with attributes X and Y, a functional dependency
between the attributes is written X -> Y, which specifies that Y is
functionally dependent on X.
• Here X is the determinant set and Y is the dependent attribute.
• Each value of X is associated with precisely one value of Y.
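As a rough sketch, the definition can be tested against sample rows in Python. The helper name `fd_holds` and the sample data are assumptions made for illustration. Note that sample rows can only refute a functional dependency; whether it really holds is a rule about the data, not about one snapshot.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if every combination of lhs values is associated with
    exactly one combination of rhs values in the given rows."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen, then returns it.
        if seen.setdefault(x, y) != y:
            return False
    return True

rows = [
    {"studid": "s101", "sname": "Sachin", "subabbrev": "DTE", "grade": "C"},
    {"studid": "s101", "sname": "Sachin", "subabbrev": "RDM", "grade": "A"},
    {"studid": "s101", "sname": "Sachin", "subabbrev": "AMS", "grade": "A"},
]

print(fd_holds(rows, ["studid"], ["sname"]))               # True
print(fd_holds(rows, ["studid"], ["grade"]))               # False
print(fd_holds(rows, ["studid", "subabbrev"], ["grade"]))  # True
```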
9. 2NF
• A database is said to be in 2NF iff
1) It is in 1NF
2) There is no partial dependency
Partial dependency means that part of a composite primary key determines a non-key attribute.
Consider the above table structure
Student_course(studid, sname, subabbrev, subname, hrs, grade, class)
The primary key is composite:
Primary key = {studid, subabbrev}
10. • Partial dependencies are present here,
as
subabbrev -> subname
and also
studid -> sname
So remove these partial dependencies, which also reduces redundancy.
Simply store the related columns of the table in separate tables.
11. After applying the rules of 2NF
• Remove the partial dependencies and store related columns in separate
tables.
1) studentinfo(studid, sname)
2) subject(subabbrev, subname, hrs)
3) studentgrade(studid, subabbrev, grade)
So the above database is in 2NF.
A database in 2NF can still have some anomalies, so normalize it into 3NF.
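The 2NF decomposition above amounts to projecting the 1NF rows onto each group of related columns and dropping duplicates. A minimal Python sketch, assuming the rows are held as dicts (the helper `project` is a hypothetical name):

```python
# Rows of Student_course in 1NF, using values from the example table.
student_course = [
    {"studid": "s101", "sname": "Sachin", "subabbrev": "DTE",
     "subname": "Digital Techniques", "hrs": 48, "grade": "C"},
    {"studid": "s101", "sname": "Sachin", "subabbrev": "RDM",
     "subname": "Relational DBMS", "hrs": 64, "grade": "A"},
    {"studid": "s101", "sname": "Sachin", "subabbrev": "AMS",
     "subname": "Applied Maths", "hrs": 48, "grade": "A"},
    {"studid": "s102", "sname": "Ajit", "subabbrev": "DTE",
     "subname": "Digital Techniques", "hrs": 48, "grade": "A"},
    {"studid": "s102", "sname": "Ajit", "subabbrev": "RDM",
     "subname": "Relational DBMS", "hrs": 64, "grade": "A"},
    {"studid": "s102", "sname": "Ajit", "subabbrev": "AMS",
     "subname": "Applied Maths", "hrs": 48, "grade": "A"},
]

def project(rows, columns):
    """Project rows onto the given columns, keeping each result row once."""
    result = []
    for row in rows:
        projected = tuple(row[c] for c in columns)
        if projected not in result:
            result.append(projected)
    return result

studentinfo = project(student_course, ["studid", "sname"])
subject = project(student_course, ["subabbrev", "subname", "hrs"])
studentgrade = project(student_course, ["studid", "subabbrev", "grade"])

print(len(studentinfo), len(subject), len(studentgrade))  # 2 3 6
```

Each student name and each subject name is now stored once, instead of once per student and subject pair.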
12. 3NF
• A database is said to be in 3NF iff
1) It is in 2NF
2) There is no transitive dependency
Transitive dependency: a non-prime attribute determines another non-prime
attribute.
So recall the tables in 2NF:
1)studentinfo(studid,sname)
2)subject(subabbrev,subname,hrs)
3)studentgrade(studid,subabbrev,grade)
14. Boyce-Codd Normal Form
A database is said to be in BCNF if and only if
1) It is already in 3NF and
2) Every determinant is a candidate key
BCNF is a stricter form of normalization than 3NF.
Dependency preservation is more difficult in BCNF.
A BCNF decomposition can always be made lossless, but some functional dependencies may be lost.
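One way to check the BCNF condition mechanically is to compute attribute closures. The Python sketch below is an illustration under assumptions: `closure` and `bcnf_violations` are hypothetical helper names, and the FD lists are the ones implied by the student example. It tests whether each determinant is a superkey, which is the usual formal statement of "every determinant is a candidate key".

```python
def closure(attrs, fds):
    """Compute the set of attributes determined by attrs under the FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Return the non-trivial FDs whose left side is not a superkey."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(rhs) <= set(lhs)
            and closure(lhs, fds) != set(attributes)]

# The single Student_course table, before decomposition.
sc_attrs = ["studid", "sname", "subabbrev", "subname", "hrs", "grade"]
sc_fds = [
    (["studid"], ["sname"]),
    (["subabbrev"], ["subname", "hrs"]),
    (["studid", "subabbrev"], ["grade"]),
]
print(bcnf_violations(sc_attrs, sc_fds))

# The decomposed subject table: its only determinant is its key.
subject_attrs = ["subabbrev", "subname", "hrs"]
subject_fds = [(["subabbrev"], ["subname", "hrs"])]
print(bcnf_violations(subject_attrs, subject_fds))  # []
```

The first call reports the two partial dependencies as violations; the second finds none.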
15. Difference Between 3NF and BCNF
3NF | BCNF
1) A database is in 3NF if and only if it is in 2NF and there is no transitive dependency | A database is in BCNF if and only if it is already in 3NF and every determinant is a candidate key
2) It is easy to achieve | It is difficult to achieve
3) Less strict form of normalization | Stricter form of normalization
4) Maintaining functional dependencies is easy | Maintaining functional dependencies is difficult
16. Multivalued Dependency (MVD)
• It is a tuple-generating dependency.
• A ->> B, i.e. A multidetermines B, can hold in a relation R if and only if
R consists of at least three columns, say A, B and C,
and for a single value of A there are multiple values of B and of C,
and B and C are independent of each other.
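The independence condition can be made concrete: for each value of A, the rows must contain every combination of that A's B values with its C values. A small Python sketch, assuming a hypothetical relation with columns student, subject and hobby (the data and the name `mvd_holds` are invented for illustration):

```python
from itertools import product

def mvd_holds(rows, a, b, c):
    """Check A ->> B in a relation with columns a, b, c: for each value
    of a, the (b, c) pairs must be the full cross product of its b values
    and its c values."""
    groups = {}
    for row in rows:
        groups.setdefault(row[a], set()).add((row[b], row[c]))
    for pairs in groups.values():
        b_values = {p[0] for p in pairs}
        c_values = {p[1] for p in pairs}
        if pairs != set(product(b_values, c_values)):
            return False
    return True

# Hypothetical data: subjects and hobbies of a student are independent.
full = [
    {"student": "s101", "subject": "DTE", "hobby": "Cricket"},
    {"student": "s101", "subject": "DTE", "hobby": "Chess"},
    {"student": "s101", "subject": "RDM", "hobby": "Cricket"},
    {"student": "s101", "subject": "RDM", "hobby": "Chess"},
]

print(mvd_holds(full, "student", "subject", "hobby"))       # True
print(mvd_holds(full[:-1], "student", "subject", "hobby"))  # False
```

Dropping any one of the four rows breaks the dependency, which is why an MVD is called tuple generating: the presence of some rows forces the presence of others.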
17. Example of MVD
• Example
Consider the database schema
Movies(title, actor, yearofrelease, length)
Title  | Actor                            | Yearofrelease | Length
Golmal | Amol Palekar                     | 1990          | 2.78
Golmal | Abhishek Bacchan, Harshad Warsi  | 2002          | 2.60