Normalization: 1st NF, 2nd NF, 3rd NF, BCNF, 4th NF, 5th NF
Prepared by:
• Krunal Patel
• Vishal Panchal
• Harsh Parmar
• Dhrumil Patel
Guided by:
Prof. Paresh Patel
DEFINITION
■ Database normalization, or simply normalization, is the process of organizing the attributes and relations of a relational database to reduce data redundancy and improve data integrity.
■ Normalization is also the process of simplifying the design of a database so that it achieves the optimal structure composed of atomic elements.
■ The normal forms covered here, in increasing order of strictness, are:
■ 1st NF, 2nd NF, 3rd NF, BCNF, 4th NF, 5th NF.
1st NF
■ As per First Normal Form, no row may contain a repeating group of information: each column of each row must hold a single (atomic) value, never a list of values. Each table should be organized into rows, and each row should have a primary key that distinguishes it as unique.
■ The primary key is usually a single column, but sometimes more than one column can be combined to create a single primary key. For example, consider a table which is not in First Normal Form because the Subject column holds two values for Adam.
■ E.g. StudentTable:
Student  Age  Subject
Adam     15   Biology, Maths
Alex     14   Maths
Stuart   17   Maths
■ After 1st NF:
Student  Age  Subject
Adam     15   Biology
Adam     15   Maths
Alex     14   Maths
Stuart   17   Maths
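The conversion to First Normal Form shown above can be sketched in a few lines of Python. This is only an illustration: the function name `to_first_normal_form` and the tuple layout are assumptions made for this sketch, not part of the slides.

```python
# Rows as they appear in the unnormalized StudentTable.
unnormalized = [
    ("Adam", 15, "Biology, Maths"),
    ("Alex", 14, "Maths"),
    ("Stuart", 17, "Maths"),
]


def to_first_normal_form(rows):
    """Emit one row per (student, subject) so that every field is atomic."""
    result = []
    for student, age, subjects in rows:
        # Split the comma-separated subject list into single values.
        for subject in subjects.split(","):
            result.append((student, age, subject.strip()))
    return result


first_nf = to_first_normal_form(unnormalized)
for row in first_nf:
    print(row)
```

Each output row now holds exactly one subject, which is why Adam appears twice.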
2nd NF
■ As per the Second Normal Form, there must not be any partial dependency of any column on the primary key. For a table that has a concatenated (composite) primary key, each column that is not part of the primary key must depend upon the entire concatenated key. If any column depends only on one part of the concatenated key, then the table fails Second Normal Form. The table must also be in First Normal Form.
■ In the example of First Normal Form there are two rows for Adam, to include the multiple subjects that he has opted for. While this is searchable and follows First Normal Form, it is an inefficient use of space, because Adam's age is stored twice. Also, in the above table the candidate key is {Student, Subject}, but Age depends only on the Student column, which is a partial dependency and violates Second Normal Form. To achieve Second Normal Form, split the subjects out into an independent table and match the two tables up using the student names as foreign keys.
■ After 2nd NF
■ New StudentTable following 2NF will be:
Student  Age
Adam     15
Alex     14
Stuart   17
■ New SubjectTable following 2NF will be:
Student  Subject
Adam     Biology
Adam     Maths
Alex     Maths
Stuart   Maths
■ In StudentTable the candidate key is the Student column; in SubjectTable the candidate key is {Student, Subject}. Both tables now qualify for Second Normal Form and no longer suffer from the update anomalies caused by partial dependencies. A table in Second Normal Form can still suffer update anomalies in a few more complex cases, and Third Normal Form exists to handle those scenarios.
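As a rough check of the 2NF split, the sketch below projects the 1NF rows into the two tables and joins them back on Student. The variable names are assumptions chosen for this illustration.

```python
# The StudentTable after 1st NF.
first_nf = [
    ("Adam", 15, "Biology"),
    ("Adam", 15, "Maths"),
    ("Alex", 14, "Maths"),
    ("Stuart", 17, "Maths"),
]

# Age depends only on Student, so it moves to a table keyed by Student.
student_table = sorted({(student, age) for student, age, _ in first_nf})

# {Student, Subject} stays the key of the second table.
subject_table = sorted({(student, subject) for student, _, subject in first_nf})

# Joining the two tables on Student should reproduce the 1NF rows exactly.
ages = dict(student_table)
rejoined = sorted(
    (student, ages[student], subject) for student, subject in subject_table
)

print(student_table)
print(subject_table)
print(rejoined == sorted(first_nf))
```

Adam's age is now stored once, and the join shows that no information was lost by the split.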
3rd NF
■ Third Normal Form requires that every non-prime attribute of a table depend directly on the primary key; that is, no non-prime attribute may be determined by another non-prime attribute. Such a transitive functional dependency should be removed from the table. The table must also be in Second Normal Form. For example, consider a table with the following fields.
■ Student_DetailTable:
Student_id  Student_name  DOB  Street  City  State  Zip
■ In this table Student_id is the primary key, but Street, City and State depend upon Zip. Because Zip in turn depends on Student_id, the dependency of Street, City and State on Student_id through Zip is called a transitive dependency. Hence, to apply 3NF, we need to move Street, City and State to a new table, with Zip as primary key.
■ New Student_DetailTable:
Student_id  Student_name  DOB  Zip
■ AddressTable:
Zip  Street  City  State
■ The advantages of removing the transitive dependency are:
■ The amount of data duplication is reduced.
■ Data integrity is achieved.
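A minimal sketch of the 3NF layout, using Python's built-in `sqlite3` module with an in-memory database. The column types and all sample rows (names, dates, zip code, street, city, state) are hypothetical values invented for this illustration; only the column names come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Address (
    Zip    TEXT PRIMARY KEY,
    Street TEXT,
    City   TEXT,
    State  TEXT
);
CREATE TABLE Student_Detail (
    Student_id   INTEGER PRIMARY KEY,
    Student_name TEXT,
    DOB          TEXT,
    Zip          TEXT REFERENCES Address(Zip)
);
""")

# Hypothetical sample rows, invented only for this illustration.
conn.execute(
    "INSERT INTO Address VALUES ('11111', 'Main St', 'Springfield', 'XX')"
)
conn.executemany(
    "INSERT INTO Student_Detail VALUES (?, ?, ?, ?)",
    [(1, "Adam", "2001-01-01", "11111"), (2, "Alex", "2002-02-02", "11111")],
)

# The address is stored once, so one UPDATE corrects it for every student.
conn.execute("UPDATE Address SET City = 'Shelbyville' WHERE Zip = '11111'")

rows = conn.execute("""
    SELECT s.Student_name, a.City
    FROM Student_Detail AS s JOIN Address AS a ON a.Zip = s.Zip
    ORDER BY s.Student_id
""").fetchall()
print(rows)
```

Because the city lives in a single Address row, both students see the corrected value after one update, which is the data-integrity benefit listed above.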
Boyce and Codd Normal Form (BCNF)
■ Boyce and Codd Normal Form is a stricter version of the Third Normal Form. It deals with a certain type of anomaly that is not handled by 3NF. A 3NF table which does not have multiple overlapping candidate keys is guaranteed to be in BCNF. For a table R to be in BCNF, the following conditions must be satisfied:
■ R must be in 3rd Normal Form
■ and, for each non-trivial functional dependency (X -> Y), X should be a super key.
Consider the following relation: R(A,B,C,D)
with the following dependencies:
A -> BCD
BC -> AD
D -> B
The above relation is already in 3rd NF. The candidate keys are A, BC and CD (CD is a key because D -> B, so CD determines BC and hence every attribute), so every attribute is part of some key.
In the functional dependency A -> BCD, A is a super key. In the second dependency, BC -> AD, BC is also a key. But in D -> B, D is not a key.
Hence we break the relation R into two relations R1 and R2:
R(A,B,C,D)
R1(A,D,C) R2(D,B)
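The BCNF test on this example can be sketched with the standard attribute-closure computation. The helper names `closure` and `is_superkey` are assumptions made for this illustration.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under the functional dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply X -> Y whenever X is already covered and Y adds something.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


relation = set("ABCD")
fds = [("A", "BCD"), ("BC", "AD"), ("D", "B")]


def is_superkey(attrs):
    """A set of attributes is a super key if its closure is the whole relation."""
    return closure(attrs, fds) == relation


# A dependency violates BCNF when its left-hand side is not a super key.
violations = [(lhs, rhs) for lhs, rhs in fds if not is_superkey(lhs)]
print(violations)
```

Only D -> B is reported, which is the dependency used to split R into R1(A,D,C) and R2(D,B).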
4th NF
■ Fourth normal form (4NF) is a level of database normalization where, for every non-trivial multivalued dependency X ->> Y, X is a super key.
■ It builds on the first three normal forms (1NF, 2NF and 3NF) and the Boyce-Codd Normal Form (BCNF). It states that, in addition to meeting the requirements of BCNF, a table must not contain two or more independent multivalued facts about the same key.
■ Multivalued dependency is best illustrated using an example. Take a table containing a list of three things: college courses, the lecturers in charge of each course and the recommended books for each course. For a given course, the lecturers and the books are independent of one another: changing the course's recommended book, for instance, has no effect on who teaches the course. This is an example of multivalued dependency: one value determines a set of values of another column, independently of the remaining columns. In this example, the course determines a set of lecturers and, separately, a set of books.
■ Thus, 4NF states that a table should not combine more than one of these independent dependencies. 4NF is rarely used outside of academic circles.
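■ E.g. table before 4th NF:
Subject    Lecturer  Books
math       raj       b1
math       jay       b2
physics    kanu      p1
chemistry  vinod     c1
■ After 4th NF:
Subject    Lecturer
math       raj
math       jay
physics    kanu
chemistry  vinod
Subject    Books
math       b1
math       b2
physics    p1
chemistry  c1
■ Because lecturers and books are independent, joining the two tables pairs every lecturer of a subject with every book of that subject. For math this also yields (raj, b2) and (jay, b1), which the table before 4th NF would have to store explicitly.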
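To see what the two 4NF tables represent, the sketch below joins them on Subject. The `natural_join` helper is an assumption written for this illustration.

```python
subject_lecturer = [
    ("math", "raj"),
    ("math", "jay"),
    ("physics", "kanu"),
    ("chemistry", "vinod"),
]
subject_books = [
    ("math", "b1"),
    ("math", "b2"),
    ("physics", "p1"),
    ("chemistry", "c1"),
]


def natural_join(left, right):
    """Join two lists of (subject, value) pairs on their shared subject."""
    return sorted(
        (subject, lecturer, book)
        for subject, lecturer in left
        for other_subject, book in right
        if subject == other_subject
    )


joined = natural_join(subject_lecturer, subject_books)
math_rows = [row for row in joined if row[0] == "math"]
for row in joined:
    print(row)
```

The join lists every lecturer of a subject with every book of that subject, so math contributes four rows. That is the meaning of the two independent multivalued dependencies: a single three-column table would need all of those combinations, while the two smaller tables store each fact once.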
5th NF
■ A database is said to be in 5NF if and only if:
■ It is in 4NF.
■ Whenever a table is decomposed further to eliminate redundancy and anomalies, re-joining the decomposed tables by means of their keys neither loses original data nor produces new records. In simple words, joining two or more decomposed tables should not lose records nor create new records.
■ E.g. consider the following table:
Subject    Lecturer  Semester
math       raj       1
math       jay       2
physics    kanu      1
chemistry  vinod     2
chemistry  divy      1
■ In the above table, math is taught by raj in Semester 1 and by jay in Semester 2, and chemistry is taught by divy in Semester 1 and by vinod in Semester 2. In this case, the combination of all three fields is required to identify a valid row. Imagine we want to add a new Semester 3 but do not yet know which subject will be offered or who will be teaching it. We would simply be inserting a new entry with Semester as 3 and leaving Lecturer and Subject as NULL. Such entries are undesirable. Moreover, since all three columns together act as the primary key, we cannot leave the other two columns blank.
■ Hence we have to decompose the table in such a way that it satisfies all the rules up to 4NF and, when the parts are joined using their keys, yields the correct records. Here, we can represent each lecturer's subject area and their semesters in a better way. We can divide the above table into three: (SUBJECT, LECTURER), (LECTURER, SEMESTER), (SUBJECT, SEMESTER).
■ After 5th NF:
Table 1:
Subject    Lecturer
math       raj
math       jay
physics    kanu
chemistry  vinod
chemistry  divy
Table 2:
Lecturer  Semester
raj       1
jay       2
kanu      1
vinod     2
divy      1
Table 3:
Subject    Semester
math       1
math       2
physics    1
chemistry  2
chemistry  1
■ Now the three pairwise combinations are held in three different tables. If we need to identify who is teaching which subject in which semester, we need to join the three tables on their key columns.
■ For example, to find who teaches physics in Semester 1, we select physics and Semester 1 from table 3 above, join with table 1 using Subject to list the candidate lecturers, then join with table 2 using Lecturer to keep only those who teach in Semester 1. That is, we joined the key columns of each table to get the correct data. Hence no data is lost and no new data is created, satisfying the 5NF condition.
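The lossless three-way join can be checked with a short Python sketch. The set-based representation and variable names are assumptions made for this illustration; the rows are those of the Subject, Lecturer, Semester table.

```python
original = {
    ("math", "raj", 1),
    ("math", "jay", 2),
    ("physics", "kanu", 1),
    ("chemistry", "vinod", 2),
    ("chemistry", "divy", 1),
}

# Project the original table onto the three pairs of columns.
subject_lecturer = {(s, l) for s, l, _ in original}
lecturer_semester = {(l, sem) for _, l, sem in original}
subject_semester = {(s, sem) for s, _, sem in original}

# Join all three projections: a row survives only if every pair is present.
rejoined = {
    (s, l, sem)
    for s, l in subject_lecturer
    for l2, sem in lecturer_semester
    if l == l2 and (s, sem) in subject_semester
}

# Who teaches physics in Semester 1?
physics_sem1 = sorted(l for s, l, sem in rejoined if s == "physics" and sem == 1)

print(rejoined == original)
print(physics_sem1)
```

For this data the join returns exactly the original five rows, so nothing is lost and nothing spurious appears.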