Learn more about database normalization concepts

What is Normalization?
Normalization is the process of eliminating relation design issues
(mainly data redundancy) and the related anomalies (insertion,
deletion and update).

First Normal Form (1NF)
The first normal form (1NF) imposes a fundamental requirement on relations.
• We say that a relation schema R is in first normal form (1NF) if the domains of all
attributes of R are atomic.
• A domain of an attribute is atomic if elements of the domain are considered to be
indivisible units.
• This means that multivalued attributes, composite attributes, and their combinations
are not allowed in a Relation that is in first normal form.

• Multivalued attribute: A multivalued attribute may have one or more values for a particular entity.
Example – Phone Number. In our SMS case study, the phone number attribute in the STUDENT entity
type is a multivalued attribute: a student can have multiple phone numbers. Recall that the
restriction on multivalued attributes also comes from the implicit constraints applied to relational databases.
• Composite attribute: Composite attributes are not atomic because they are assembled from
other atomic attributes. A typical example of a composite attribute is a person's address, composed of
atomic attributes such as House No., Street, City, State and Pincode.
• A composite attribute can still be stored in the database without violating any database
constraint; however, it is not a good database design. Storing a composite attribute as a single value makes
querying and analysing its constituent atomic attributes very complex. It can also result in
redundancy of data.

To handle a composite attribute, we create a separate column for each part of the
composite attribute, as the number of parts in a composite attribute is fixed in most cases.

To handle a multivalued attribute, we have three options:
Option 1:
Expand the Key of this Relation to include phone_no with roll_no. The Relation will now
have a composite primary key consisting of roll_no & phone_no. This arrangement
achieves the first normal form (1NF); however, it is not a good design as it introduces data
redundancy into the Relation. For each additional phone number of a student, the data in
other columns is repeated.

Option 2:
If the maximum number of values is known for phone_no, that many columns can be
added to the existing Relation.
Let's assume a student can have a maximum of two phone numbers. We can then create a
relation design with two separate columns to store the two possible student phone numbers and
achieve the first normal form (1NF). This is not a good design, as it limits the number of phone
numbers a student can have. If we want to allow more phone numbers, the relation design
has to change, which is not a good design practice.

Option 3:
Decompose this Relation into two relations – STUDENT & STUDENT_PHONE_NO. They
are linked to each other with the Primary Key (PK) - Foreign Key (FK) relationship. This is a
good design as it takes care of data redundancy and does not limit the number of phone
numbers a student can have.
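
Option 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample records, the first_name column and the helper name to_1nf are hypothetical, and plain dictionaries stand in for database rows.

```python
# Hypothetical un-normalized records: phone_no holds a list (a multivalued attribute).
students_raw = [
    {"roll_no": 1, "first_name": "Asha", "phone_no": ["9000000001", "9000000002"]},
    {"roll_no": 2, "first_name": "Ravi", "phone_no": ["9000000003"]},
]

def to_1nf(records):
    """Split records into STUDENT rows and STUDENT_PHONE_NO rows (Option 3)."""
    student = []
    student_phone_no = []
    for rec in records:
        # STUDENT keeps every single-valued attribute exactly once.
        student.append({k: v for k, v in rec.items() if k != "phone_no"})
        # STUDENT_PHONE_NO gets one row per phone number; roll_no acts as the FK.
        for phone in rec["phone_no"]:
            student_phone_no.append({"roll_no": rec["roll_no"], "phone_no": phone})
    return student, student_phone_no

student, student_phone_no = to_1nf(students_raw)
print(student)
print(student_phone_no)
```

Each student's details now appear once, and a student can have any number of rows in STUDENT_PHONE_NO.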

Second Normal Form (2NF)
• The Second Normal Form (2NF) is based on the concept of full functional
dependency.
• The Second Normal Form applies to relations with composite keys, that
is, relations with a key composed of two or more attributes.
• A Relation in 1NF whose candidate keys are all single attributes is automatically
in at least 2NF. A Relation not in 2NF may suffer from inconsistency problems
arising during insert, delete and update operations.

Definition:
For a Relation to be in 2NF, it should fulfill the below two conditions:
• The Relation should be in 1NF.
• The Relation should have no partial dependency, i.e., no non-prime attribute (an attribute that is not part
of any candidate key) is dependent on a proper subset of any candidate key of the Relation.
How to check:
• 2NF applies to relations with composite candidate keys. A Relation whose candidate keys are all single
attributes is automatically in at least 2NF.
• No FD of the form (proper subset of a candidate key) → (non-prime attribute) should hold.
How to convert 1NF to 2NF:
The normalization of 1NF relations to 2NF involves the removal of partial functional dependencies. If a
partial dependency exists, we remove the partially dependent attribute(s) (along with their dependents, if any) from
the Relation by placing them in a new Relation along with a copy of their determinant. The remaining attributes
of the Relation, along with the determinant, remain part of the base Relation.

Example 1: Let's assume a school stores the data of teachers and the subjects they teach. In a
school, a teacher can teach more than one subject.
The FD in the Relation, teacher_id → teacher_age, can be depicted as: Relation (ABC) with FD = A→C.
Let's find the candidate key of the above Relation.
Candidate Key is (AB). Prime Attributes – A, B. Non-prime Attributes – C
We have a composite candidate key (AB), and its proper subset (A) can determine a non-prime
attribute (C), FD (A→C). So this is a case of partial dependency. Therefore the Relation is not in 2NF.
To convert this Relation into 2NF, we need to remove the partially dependent attribute(s) from the
Relation by placing them in a new Relation along with a copy of their determinant.

Example 2: In the previous section, we converted the STUDENT Relation into 1NF using option 1, with
(roll_no, phone_no) as the composite primary key.
Now let's analyze this Relation from a functional dependency point of view and find out if it is in 2NF
or not. We can re-write the above as Relation R(ABCDEFGHIJKL) with FDs = A→BCDEFGHIJK, I→J
(here A stands for roll_no, L for phone_no, I for city and J for state).
Candidate Key is (AL). Prime Attributes – A, L. Non-prime Attributes – B, C, D, E, F, G, H, I, J, K
We have a composite candidate key (AL), and its proper subset (A) can determine non-prime attributes
(B, C, D, E, F, G, H, I, J, K), FD (A→BCDEFGHIJK). So this is a case of partial dependency. Therefore
the Relation is not in 2NF.

Example 2 (contd.): To convert this Relation into 2NF, we need to remove the partially dependent
attribute(s) from the Relation by placing them in a new relation along with a copy of their determinant.

Example 3:
Let's take Relation R(A,B,C,D,E) with FD set = (A→B, B→C, C→D, D→E). Let's find if this Relation
is in 2NF or not.
The candidate key of the above Relation is (A). As the candidate key is not composite, the case of
partial dependency does not arise. Therefore the Relation is in 2NF.
Example 4:
Let’s take Relation R(A,B,C,D) with FD set = (AB→CD, C→A, D→B). Let's find if this Relation is in
2NF or not.
The candidate keys of the above Relation are (AB), (BC), (CD), (AD).
Prime Attributes – A, B, C, D. Non-prime Attributes – NIL
In this case, although we have composite candidate keys, there is no non-prime attribute. So the case of
partial dependency does not arise. Therefore the Relation is in 2NF.

Example 5:
Let’s take Relation R(A,B,C,D) with FD set = (A→B, B→D). Let's find if this Relation is in 2NF or not.
The candidate key of the above Relation is (AC).
Prime Attributes – A, C
Non-prime Attributes – B, D
In this case, we have a composite candidate key (AC), and its proper subset (A) can determine a non-
prime attribute (B), FD (A→B). So this is a case of partial dependency. Therefore the Relation is not in
2NF.
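
The 2NF check used in these examples can be automated. The sketch below is a minimal illustration, not a definitive implementation: the function names closure, candidate_keys and partial_dependencies are assumptions made here, attributes are single letters, and the candidate-key search is brute force, so it only suits small relations.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(relation, fds):
    """Brute-force search for minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # combo contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(relation):
                keys.append("".join(combo))
    return keys

def partial_dependencies(relation, fds):
    """List (proper subset of a candidate key, non-prime attributes it determines)."""
    keys = candidate_keys(relation, fds)
    prime = set("".join(keys))
    found = []
    for key in keys:
        for size in range(1, len(key)):
            for subset in combinations(key, size):
                non_prime = sorted(closure(subset, fds) - prime)
                if non_prime:
                    found.append(("".join(subset), "".join(non_prime)))
    return found

example5 = [("A", "B"), ("B", "D")]
print(candidate_keys("ABCD", example5))        # ['AC']
print(partial_dependencies("ABCD", example5))  # [('A', 'BD')]

example4 = [("AB", "CD"), ("C", "A"), ("D", "B")]
print(partial_dependencies("ABCD", example4))  # []
```

For Example 5 the sketch reports that (A) determines the non-prime attributes B and D, so the Relation is not in 2NF; for Example 4 it finds no partial dependency because every attribute is prime.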

Third Normal Form (3NF)
• Although Second Normal Form (2NF) relations have less redundancy than
those in 1NF, they may still suffer from inconsistency problems arising during
insert, delete and update operations.
• These inconsistency problems are caused by transitive dependencies, which
introduce redundancy into the Relation. We need to remove such
dependencies by progressing to the Third Normal Form (3NF).

Definition:
For a Relation to be in 3NF, it should be in 2NF and fulfill the condition below, which can be
stated in either of two ways:
• There should be no non-prime attribute that is transitively dependent on the primary key
or any candidate key
or
• A non-prime attribute should not functionally depend on another non-prime attribute.
For example, take a Relation R(A,B,C,D) with FDs = A→BD, B→C. In this Relation, (A) is
the candidate key and we have a transitive dependency, A→B, B→C.
The non-prime attribute (C) is transitively dependent on the candidate key (A); therefore
this Relation is not in 3NF. Equivalently, the non-prime attribute (C) is dependent
on another non-prime attribute (B); hence the Relation violates the 3NF condition.

How to check:
A Relation is in 3NF if at least one of the following conditions holds for every non-trivial
functional dependency X→Y:
• X is a super key
• Y is a prime attribute
How to convert 2NF to 3NF:
The normalization of 2NF relations to 3NF involves the removal of transitive dependencies.
If a transitive dependency exists, we remove the transitively dependent attribute(s) from the
Relation by placing the attribute(s) in a new Relation along with a copy of the determinant.
The remaining attributes of the Relation, along with the determinant, remain part of
the base Relation.

Example 1: In the previous section, in example 2, we converted the STUDENT Relation from 1NF to
2NF by decomposing it into two separate relations STUDENT_DETAIL and STUDENT_PHONE_NO.
Now let's analyze the STUDENT_DETAIL Relation, which is already in 2NF.
FDs in the above Relation are:
roll_no → first_name, middle_name, last_name, dob, gender, house_no, street_name, city, state, pincode
city → state
The candidate key of the Relation is roll_no. In this Relation, we have a transitive dependency roll_no
→ city, city → state. This transitive dependency is causing data redundancy in the Relation. Therefore
this Relation is not in 3NF.

Example 1 (contd.): The normalization of this Relation to 3NF will involve the removal of transitive
dependencies. We need to remove the transitively dependent attribute(s) from the Relation by placing
the attribute(s) in a new Relation (CITY_STATE_MASTER) along with a copy of the determinant.

Example 2:
Let's take Relation R(A,B,C,D) with FD set = (A→B, B→C, C→D). Let's find if this Relation is in 3NF
or not.
The candidate key of the above Relation is (A).
Prime attributes – A. Non-prime attributes – B, C, D
Now let's analyze each FD for the 3NF condition:
A→B, A is a super key (we know all candidate keys are super keys) – 3NF condition met
B→C, B is not a super key, and C is not a prime attribute – 3NF condition failed
Therefore we can conclude that the above Relation is not in 3NF.

Example 3:
Let’s take Relation R(A,B,C,D,E,F) with FD set = (AB→CDEF, BD→F). Let's find if this Relation is in
3NF or not.
The candidate key of the above Relation is (AB).
Prime attributes – A, B. Non-prime attributes – C, D, E, F
AB→CDEF, AB is a super key (we know all candidate keys are super keys) – 3NF condition met
BD→F, BD is not a super key, and F is not a prime attribute – 3NF condition failed
Therefore we can conclude that the above Relation is not in 3NF.

Example 4:
Let's take Relation R(A,B,C,D,E) with FD set = (A→B, B→C, C→D, D→A). Let's find if this Relation is
in 3NF or not.
The candidate keys of the above Relation are (AE), (DE), (CE), (BE).
Prime attributes – A, B, C, D, E. Non-prime attributes – NIL
A→B, A is not a super key, but B is a prime attribute – 3NF condition met.
B→C, B is not a super key, but C is a prime attribute – 3NF condition met.
C→D, C is not a super key, but D is a prime attribute – 3NF condition met.
D→A, D is not a super key, but A is a prime attribute – 3NF condition met.
Therefore we can conclude that the above Relation is in 3NF.
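
The per-FD 3NF test applied in Examples 2 and 4 can be sketched as follows. This is an illustrative sketch: check_3nf and closure are names introduced here, attributes are single letters, and the candidate keys are passed in as already worked out in the examples.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def check_3nf(relation, fds, keys):
    """Return (FD, condition met?) for every non-trivial single-attribute FD."""
    prime = set("".join(keys))
    report = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) == set(relation)
        for attr in rhs:
            if attr in lhs:
                continue  # trivial dependency, nothing to check
            # 3NF condition: X is a super key OR Y is a prime attribute
            report.append((lhs + "→" + attr, is_superkey or attr in prime))
    return report

example2 = check_3nf("ABCD", [("A", "B"), ("B", "C"), ("C", "D")], ["A"])
example4 = check_3nf("ABCDE", [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")],
                     ["AE", "DE", "CE", "BE"])
print(example2)  # [('A→B', True), ('B→C', False), ('C→D', False)]
print(example4)  # every FD reports True
```

A Relation is in 3NF only if every entry in the report is True, which holds for Example 4 but not for Example 2.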

Boyce Codd Normal Form (BCNF)
Boyce-Codd Normal Form or BCNF is an extension to the 3NF and is also known as the 3.5
Normal Form. Some redundancies might still remain even after a Relation is in 3NF.
Definition:
For a Relation to be in BCNF, it should fulfill one of the below two conditions:
• For each non-trivial functional dependency X→Y, X should be a Super Key
or
• The Relation has no non-trivial functional dependency, i.e., the Relation is an all-key Relation
(all attributes together make the only candidate key)
How to convert 3NF to BCNF:
The normalization of 3NF relations to BCNF involves creating a new Relation for every
dependency that violates the BCNF condition. The remaining attributes of the Relation, along
with the determinant (of the FD violating the BCNF condition), remain part of the base
Relation.

Example 1:
Relation R(A,B,C) with FD set = (A→B, B→C, C→A).
The candidate keys of the above Relation are (A), (B), (C).
Prime attributes – A, B, C
Non-prime attributes – NIL
This Relation is in 3NF (use the concepts learned in the previous section). Now let's analyze
each FD for BCNF condition:
A→B, A is a super key – BCNF condition met.
B→C, B is a super key – BCNF condition met.
C→A, C is a super key – BCNF condition met.
All FDs are meeting the BCNF condition; therefore, we can conclude that the above Relation is
in BCNF.

Example 2:
Relation R(A,B,C) with FD set = (AB→C, C→B).
The candidate keys of the above Relation are (AB), (AC).
Prime attributes – A, B, C
This Relation is in 3NF (use the concepts learned in the previous section). Now let's analyze
each FD for BCNF condition:
AB→C, AB is a super key – BCNF condition met.
C→B, C is not a super key – BCNF condition not met.
Not all FDs meet the BCNF condition; therefore, we can conclude that the above
Relation is not in BCNF.
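
The BCNF test from Examples 1 and 2 reduces to a super key check on the left-hand side of every non-trivial FD. A minimal sketch, assuming single-letter attributes; bcnf_violations and closure are illustrative names, not part of any library.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(relation, fds):
    """Return the non-trivial FDs whose left-hand side is not a super key."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(relation)
    ]

print(bcnf_violations("ABC", [("A", "B"), ("B", "C"), ("C", "A")]))  # []
print(bcnf_violations("ABC", [("AB", "C"), ("C", "B")]))             # [('C', 'B')]
```

An empty list means the Relation is in BCNF; for Example 2 the sketch singles out C→B as the violating dependency.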

Example 3: Below we have a STUDENT_SUBJECT_PROFESSOR Relation with columns student_id,
subject, and professor.
In the above Relation:
• One student can enroll in multiple subjects. For example, a student with student_id 101 has opted
for subjects - Java & C++.
• For each subject, a professor is assigned to the student.
• There can be multiple professors teaching one subject, as we have for Java.
• One professor teaches only one subject.

Example 3 (contd.):
FDs for this Relation:
student_id, subject → professor
professor → subject
Candidate keys for the Relation – (student_id, subject) and (student_id, professor)
This Relation satisfies the 1st Normal Form because all the values are atomic, column names
are unique, and all the values stored in a particular column are of the same domain.
This Relation also satisfies the 2nd Normal Form, as no non-prime attribute is partially
dependent on a candidate key (every attribute is prime).
And, there is no Transitive Dependency; hence the Relation also satisfies the 3rd Normal
Form.
But this Relation is not in Boyce-Codd Normal Form, as the FD professor → subject does not
meet the BCNF condition. Here the LHS (professor) is not a super key.

Example 3 (contd.):
To make this Relation satisfy BCNF, we will decompose this Relation into two relations
STUDENT_PROFESSOR and PROFESSOR_SUBJECT.

Question
• Prove that any relation of two attributes is always in BCNF.

Finding the highest normal form of a relation
Steps to find the highest normal form of a Relation:
• Find all possible candidate keys of the Relation.
• Divide all attributes into two categories: prime attributes and non-prime attributes.
• Check for BCNF first, then 3NF, and so on. By definition (implicit constraints) a
Relation will always be in 1NF.
Summary of the definitions of the normal forms:
2NF: No non-prime attribute should be partially dependent on a Candidate Key (CK),
i.e. no FD of the form (proper subset of a CK) → (non-prime attribute) should hold.
3NF: First, it should be in 2NF, and at least one of the following conditions holds for every
non-trivial functional dependency X→Y:
• X is a super key
• Y is a prime attribute
BCNF: First, it should be in 3NF, and for every non-trivial dependency X→Y between two sets of
attributes X and Y, X is a Super Key

The below Venn diagram shows the relationship between various normal forms. If a
Relation is in BCNF, it is already in 3NF, 2NF & 1NF. That's why we start checking a
Relation for BCNF and then move to 3NF and so on.

Example 1:
Relation R(ABCDEFGH) with FDs = {ABC→DE, E→GH, H→G, G→H, ABCD→EF}
Step 1:
Candidate key of this Relation is (ABC)
Step 2:
Prime attributes: A, B, C
Non-prime attributes: D, E, F, G, H

Example 1 (contd.):
Step 3:
Check for BCNF
ABC→DE, ABC is a super key – BCNF condition met.
E→GH, E is not a super key – BCNF condition not met.
H→G, H is not a super key – BCNF condition not met.
G→H, G is not a super key – BCNF condition not met.
ABCD→EF, ABCD is a super key – BCNF condition met.
As not all FDs meet the BCNF condition, this Relation is not in BCNF.

Example 1 (contd.):
Check for 3NF
ABC→DE, ABC is a super key – 3NF condition met.
E→GH, E is not a super key, and G & H are non-prime attributes – 3NF condition not met.
H→G, H is not a super key, and G is a non-prime attribute – 3NF condition not met.
G→H, G is not a super key, and H is a non-prime attribute – 3NF condition not met.
ABCD→EF, ABCD is a super key – 3NF condition met.
As not all FDs meet the 3NF condition, this Relation is not in 3NF.

Example 1 (contd.):
Check for 2NF
ABC→DE, LHS not a proper subset of candidate key (ABC) – 2NF condition met.
E→GH, LHS not a proper subset of candidate key (ABC) – 2NF condition met.
H→G, LHS not a proper subset of candidate key (ABC) – 2NF condition met.
G→H, LHS not a proper subset of candidate key (ABC) – 2NF condition met.
ABCD→EF, LHS not a proper subset of candidate key (ABC) – 2NF condition met.
As all FDs are meeting 2NF conditions, this Relation is in 2NF.

Example 2:
Relation R(A,B,C,D) with FDs = {A→BCD, BC→AD, D→B}
Step 1:
Candidate keys of this Relation are (A), (BC), (CD).
Step 2:
Prime attributes: A, B, C, D
Non-prime attributes: NIL
Step 3:
Check for BCNF
A→BCD, A is a super key – BCNF condition met.
BC→AD, BC is a super key – BCNF condition met.
D→B, D is not a super key – BCNF condition not met.

Example 2 (contd.):
Check for 3NF
A→BCD, A is a super key – 3NF condition met.
BC→AD, BC is a super key – 3NF condition met.
D→B, D is not a super key, but B is a prime attribute – 3NF condition met.
All FDs meet the 3NF condition, so this Relation is in 3NF. There is no need to check for 2NF, as all 3NF relations are also in 2NF.

Example 3:
Relation R(A,B,C,D) with FDs = {AB→C, ABD→C, ABC→D, AC→D}
Step 1:
Candidate key of this Relation is (AB)
Step 2:
Prime attributes: A, B. Non-prime attributes: C, D
Step 3:
Check for BCNF
AB→C, AB is a super key – BCNF condition met.
ABD→C, ABD is a super key – BCNF condition met.
ABC→D, ABC is a super key – BCNF condition met.
AC→D, AC is not a super key – BCNF condition not met.

Example 3 (contd.):
Check for 3NF
AB→C, AB is a super key – 3NF condition met.
ABD→C, ABD is a super key – 3NF condition met.
ABC→D, ABC is a super key – 3NF condition met.
AC→D, AC is not a super key, and D is not a prime attribute – 3NF condition not met.
As not all FDs meet the 3NF condition, this Relation is not in 3NF.
Check for 2NF
AB→C, LHS not a proper subset of candidate key (AB) – 2NF condition met.
ABD→C, LHS not a proper subset of candidate key (AB) – 2NF condition met.
ABC→D, LHS not a proper subset of candidate key (AB) – 2NF condition met.
AC→D, LHS not a proper subset of candidate key (AB) – 2NF condition met.
As all FDs meet the 2NF condition, this Relation is in 2NF.

Example 4:
Relation R(A,B,C,D,E) with FDs = {AB→CDE, D→BE}
Step 1:
Candidate keys of this Relation are (AB), (AD)
Step 2:
Prime attributes: A, B, D
Non-prime attributes: C, E
Step 3:
Check for BCNF
AB→CDE, AB is a super key – BCNF condition met.
D→BE, D is not a super key – BCNF condition not met.

Example 4 (contd.):
Check for 3NF
AB→CDE, AB is a super key – 3NF condition met.
D→BE can be written as D→B, D→E
D→B, D is not a super key, but B is a prime attribute – 3NF condition met.
D→E, D is not a super key, and E is not a prime attribute – 3NF condition not met.
As not all FDs meet the 3NF condition, this Relation is not in 3NF.
Check for 2NF
AB→CDE, LHS not a proper subset of candidate key (AB) – 2NF condition met.
D→B, LHS is a proper subset of candidate key (AD), but B is not a non-prime attribute –
2NF condition met.
D→E, LHS is a proper subset of candidate key (AD), and E is a non-prime attribute – 2NF
condition not met.
As not all FDs meet the 2NF condition, this Relation is not in 2NF.
So this Relation is in 1NF.
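
The three steps (find candidate keys, split attributes into prime and non-prime, then test BCNF, 3NF and 2NF in that order) can be combined into one routine. The sketch below is an assumption-laden illustration: the function names are invented here, attributes are single letters, 1NF is taken for granted, and the key search is brute force.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(relation, fds):
    """Step 1: brute-force search for minimal keys (fine for small relations)."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # combo contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(relation):
                keys.append("".join(combo))
    return keys

def highest_normal_form(relation, fds):
    keys = candidate_keys(relation, fds)
    prime = set("".join(keys))  # Step 2: prime attributes
    # Split each FD into non-trivial FDs with a single attribute on the right.
    simple = [(lhs, a) for lhs, rhs in fds for a in rhs if a not in lhs]

    def is_superkey(attrs):
        return closure(attrs, fds) == set(relation)

    # Step 3: check BCNF first, then 3NF, then 2NF.
    if all(is_superkey(lhs) for lhs, _ in simple):
        return "BCNF"
    if all(is_superkey(lhs) or a in prime for lhs, a in simple):
        return "3NF"
    for key in keys:
        for size in range(1, len(key)):
            for subset in combinations(key, size):
                if closure(subset, fds) - prime:
                    return "1NF"  # a proper subset of a key determines a non-prime attribute
    return "2NF"

print(highest_normal_form("ABCDEFGH", [("ABC", "DE"), ("E", "GH"), ("H", "G"),
                                       ("G", "H"), ("ABCD", "EF")]))           # 2NF
print(highest_normal_form("ABCD", [("A", "BCD"), ("BC", "AD"), ("D", "B")]))    # 3NF
print(highest_normal_form("ABCD", [("AB", "C"), ("ABD", "C"), ("ABC", "D"),
                                   ("AC", "D")]))                              # 2NF
print(highest_normal_form("ABCDE", [("AB", "CDE"), ("D", "BE")]))               # 1NF
```

The four calls correspond to Examples 1 to 4 above and reproduce the normal forms derived by hand.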

Decomposition of relations to convert them into higher normal form
So far we have covered:
• The concepts of 1NF, 2NF, 3NF & BCNF.
• How to find the highest normal form of a given Relation.
Let's use this knowledge to convert a given Relation into a higher normal form.
We will do this with a set of examples to bring more clarity.

Example 1:
Given Relation R(A,B,C,D,E) with FDs = {A→B, B→E, C→D}
Step 1 – Find the current normal form of the Relation
Candidate Key – (AC)
Prime attributes – A, C
Non-prime attributes – B, D, E
Using the process learned in the section above, we can find that R is in 1NF.
Step 2 – Find the FDs that are creating a problem
A→B (this is a partial dependency, as (A), a proper subset of the candidate key (AC),
determines a non-prime attribute (B)) – thus violating 2NF
C→D (this is a partial dependency, as (C), a proper subset of the candidate key (AC),
determines a non-prime attribute (D)) – thus violating 2NF

Example 1 (contd.):
Step 3 – Decompose the Relation to remove the anomalies identified above
We have identified two partial dependencies in the above Relation that violate
2NF. From previous sections, we know:
The normalization of 1NF relations to 2NF involves the removal of partial functional dependencies. If a
partial dependency exists, we remove the partially dependent attribute(s) (along with their dependents, if
any) from the relation by placing them in a new relation along with a copy of their determinant. The
remaining attributes of the relation, along with the determinant, remain part of the base relation.
We will create two separate relations to handle the two partial dependencies A→B
(including B→E, as E is dependent on B) & C→D,
i.e. R1(A,B,E), R2(C,D). After removing the partially dependent attributes (and their dependents),
the base Relation will be reduced to R3(A,C).

Example 1 (contd.):
Step 4 – Check again if the above-decomposed relations have achieved the highest normal form.
Relation R1(A,B,E)
Candidate Key – (A). Prime attributes – A. Non-prime attributes – B, E
We see there is a transitive dependency here (A→B, B→E); therefore, this Relation is not in 3NF.
R2(C,D) & R3(A,C) are both in BCNF (you can check by concepts learned in the earlier sections).
Step 5 – Decompose the Relation R1(A,B,E) to remove the anomalies identified above.
We have identified a transitive dependency in the above Relation that violates 3NF. We know:
The normalization of 2NF relations to 3NF involves the removal of transitive dependencies. If a transitive dependency
exists, we remove the transitively dependent attribute(s) from the relation by placing the attribute(s) in a new relation
along with a copy of the determinant. The remaining attributes of the relation, along with the determinant, remain
part of the base relation.
We will create a separate Relation to handle the transitive dependency B→E,
i.e., R12(B,E). After removing the transitively dependent attribute, the base Relation will be reduced to
R11(A,B).

Example 1 (contd.):
Step 6 – Check again if the above-decomposed relations have achieved the highest normal form.
R11(A,B) & R12(B,E) are both in BCNF (you can check by concepts learned in the earlier
sections).
Step 7 – After carrying out the decomposition, we need to make sure that one of the decomposed relations
contains the candidate key of the Relation R(A,B,C,D,E), i.e. (AC). Here R3(A,C) meets the
condition.
Conclusion: Relation R(A,B,C,D,E) with FDs = {A→B, B→E, C→D} is in 1NF. It can be decomposed into
4 separate relations - R11(A,B), R12(B,E), R2(C,D) & R3(A,C) to achieve the highest normal form of
BCNF.

Example 2:
Given Relation R(A,B,C,D) with FDs = {A→B, B→C, C→D}
Step 1 – Find the current normal form of the Relation
Candidate Key – (A)
Prime attributes – A
Non-prime attributes – B, C, D
Using the process learned in the section above, we can find that Relation R is in 2NF.
Step 2 – Find the FDs that are creating a problem
B→C, transitive dependency – thus violating 3NF
C→D, transitive dependency – thus violating 3NF

Example 2 (contd.):
Step 3 – Decompose the Relation to remove the anomalies identified above
We have identified two transitive dependencies in the above Relation that violate 3NF.
From previous sections, we know:
The normalization of 2NF relations to 3NF involves the removal of transitive dependencies. If a transitive
dependency exists, we remove the transitively dependent attribute(s) (along with their dependents, if any)
from the relation by placing the attribute(s) in a new relation along with a copy of the determinant. The
remaining attributes of the relation, along with the determinant, remain part of the base relation.
We will create two separate relations to handle the two transitive dependencies B→C, C→D,
i.e. R1(BC) & R2(CD). After removing the transitively dependent attributes, the base Relation will be
reduced to R3(AB).

Example 2 (contd.):
Step 4 – Check again if the above-decomposed relations have achieved the highest normal
form.
R1(BC), R2(CD) & R3(AB) are all in BCNF (you can check by concepts learned in the
earlier sections).
Step 5 – After carrying out the decomposition, we need to make sure that one of the
decomposed relations contains the candidate key of the Relation R(A,B,C,D), i.e. (A). Here
R3(AB) meets the condition.
Conclusion: Relation R(A,B,C,D) with FDs = {A→B, B→C, C→D} is in 2NF. It can be
decomposed into 3 separate relations - R1(BC), R2(CD) & R3(AB) to achieve the highest
normal form of BCNF.

Example 3:
Relation R(A,B,C,D) with FDs = {A→BCD, BC→AD, D→B}
Step 1 – Find the current normal form of the Relation
Candidate Keys – (A), (BC), (CD)
Prime attributes – A, B, C, D
Using the process learned in the section above, we can find that Relation R is in 3NF
and not in BCNF.
Step 2 – Find the FDs that are creating a problem
D→B, D is not a super key – thus violating BCNF

Example 3 (contd.):
Step 3 – Decompose the Relation to remove the anomalies identified above
We have identified one dependency violating the BCNF condition. From previous
sections, we know how to convert 3NF to BCNF:
The normalization of 3NF relations to BCNF involves creating a new relation for every
dependency that violates the BCNF condition. The remaining attributes of the relation, along
with the determinant (of the FD violating the BCNF condition), remain part of the base
relation.
We will create one separate Relation to handle the dependency D→B,
i.e., R1(D,B). After removing the dependent attribute of the above dependency from
the base Relation, it will be reduced to R2(A,C,D).

Example 3 (contd.):
Step 4 – Check again if the above-decomposed relations have achieved the highest normal
form.
R1(D,B) & R2(A,C,D) are both in BCNF (you can check by concepts learned in the
earlier sections).
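Step 5 – BCNF decompositions are not always dependency preserving; therefore, we do not
require every candidate key of the base Relation to appear in the decomposed relations.
Here R2(A,C,D) still contains the candidate keys (A) and (CD), while (BC) is no longer
contained in any single relation.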
Conclusion: Relation R(A,B,C,D) with FDs = {A→BCD, BC→AD, D→B} is in 3NF. It can be
decomposed into two separate relations - R1(D,B) & R2(A,C,D) to achieve the highest normal
form of BCNF.
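
The decomposition step used in all three examples follows one pattern: move the dependent attributes into a new relation together with a copy of the determinant, and drop them from the base relation. A minimal sketch, assuming single-letter attributes; split_on_fd is a hypothetical helper name, and choosing which FD to split on is still done by hand as in the examples.

```python
def split_on_fd(relation, lhs, rhs):
    """Split relation on the FD lhs→rhs.

    Returns (new relation = determinant + dependents, reduced base relation).
    """
    moved = [a for a in rhs if a not in lhs]
    new_relation = lhs + "".join(moved)
    base = "".join(a for a in relation if a not in moved)
    return new_relation, base

# Example 1: A→B (with its dependent E), then C→D, then B→E inside R1.
r1, rest = split_on_fd("ABCDE", "A", "BE")  # ('ABE', 'ACD')
r2, r3 = split_on_fd(rest, "C", "D")        # ('CD', 'AC')
r12, r11 = split_on_fd(r1, "B", "E")        # ('BE', 'AB')
print(r11, r12, r2, r3)

# Example 2: C→D first, then B→C.
print(split_on_fd("ABCD", "C", "D"))        # ('CD', 'ABC')
print(split_on_fd("ABC", "B", "C"))         # ('BC', 'AB')

# Example 3: the BCNF violation D→B.
print(split_on_fd("ABCD", "D", "B"))        # ('DB', 'ACD')
```

Because the determinant stays in the base relation, the two parts always share it as a common attribute, which is what the lossless join condition discussed later relies on.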

Fourth Normal Form (4NF)
The fourth Normal Form comes into the picture when non-trivial Multivalued Dependency
(MVD) occurs in any Relation. These relations need to be identified and decomposed further
into a 4NF decomposition to improve database design.
Definition:
For a Relation to be in 4NF, it should fulfill the below two conditions:
• The Relation should be in BCNF.
• The Relation should not have any non-trivial Multivalued Dependency (MVD).
Multivalued Dependency (MVD):
• Multivalued dependencies are a consequence of 1NF, which disallows multivalued
attributes in a tuple, and of the accompanying process of converting an un-normalized
Relation into 1NF.
• If we have two or more independent multivalued attributes in the same Relation,
we have to repeat every value of one attribute with every value of
the other attribute to keep the relation state consistent and maintain the independence
among the attributes involved.
• A non-trivial multivalued dependency specifies this constraint.
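
The "repeat every value with every value" constraint can be checked directly on a relation state. The sketch below tests whether an MVD X →→ Y holds; the attribute names (roll_no, phone_no, hobby), the sample rows and the function name satisfies_mvd are all hypothetical and only serve the illustration.

```python
def satisfies_mvd(rows, x, y):
    """Check whether the MVD x →→ y holds in rows (a list of dicts)."""
    attrs = list(rows[0])
    present = {tuple(r[a] for a in attrs) for r in rows}
    for t1 in rows:
        for t2 in rows:
            if all(t1[a] == t2[a] for a in x):
                # Required tuple: X and Y values from t1, remaining values from t2.
                combined = {**t2, **{a: t1[a] for a in y}}
                if tuple(combined[a] for a in attrs) not in present:
                    return False
    return True

# Hypothetical relation: phone numbers and hobbies of a student are independent.
full = [
    {"roll_no": 1, "phone_no": p, "hobby": h}
    for p in ("P1", "P2")
    for h in ("chess", "music")
]
print(satisfies_mvd(full, ["roll_no"], ["phone_no"]))       # True
print(satisfies_mvd(full[:-1], ["roll_no"], ["phone_no"]))  # False: (P2, music) is missing
```

Dropping a single combination breaks the MVD, which is exactly the consistency burden that a 4NF decomposition removes.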

[Figure: 4NF example]

[Figure: 4NF normalization process]

Fifth Normal Form (5NF)
The Fifth Normal Form is generally not applied in real-life database design;
however, we should know what it is. It is also known as Project Join Normal Form (PJNF).
Definition:
A Relation R is in 5NF if and only if it satisfies the following conditions:
• R should already be in 4NF.
• It should not have any non-trivial join dependency (other than those implied by its candidate keys).
In other words, a Relation R is in 5NF if it cannot be decomposed further into smaller relations with the
lossless join property, that is, without spurious or extra tuples being generated when the relations are
reunited through a natural join.
Join dependency – If the join of R1 and R2 over C is equal to Relation R, then
we can say that a join dependency (JD) exists, where R1(A, B, C) and R2(C, D) are the
decomposition of a given Relation R(A, B, C, D).
Equivalently, R1 and R2 are a lossless decomposition of R.
A JD ⋈ {R1, R2, …, Rn} is said to hold over a Relation R if R1, R2, …, Rn is
a lossless-join decomposition of R.

Example:
In the above 4NF Relation:
• One student can enroll in multiple subjects. For example, the
student with student_id 101 has opted for subjects – Java, C++ & C#.
• Multiple professors can teach each subject. For example, Java is
taught by Amit, Mohit & Payal.
• Each professor can teach multiple subjects. For example, Amit can
teach Java & C++.
From the ER modelling perspective, the above Relation is the outcome of
a ternary relationship type between student, subject, and professor.

Example (contd.):
If we decompose the above Relation into three separate binary relations, we can see
that there is a loss of information:
Student 101 is studying subjects – Java, C++ & C#.
Student 101 is being taught by two professors – Amit & Rajan.
Amit can teach – Java & C++, and Rajan can teach – C# & C++.
From the above information, it is impossible to determine who is teaching C++ to student 101.
Hence there is a loss of information; therefore, this decomposition is not lossless. No join
dependency holds between the base Relation and the decomposed relations.
Hence we can conclude that the base Relation student_subject_professor is in 5NF, as it
cannot be decomposed further without loss.
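
The loss of information can be reproduced by projecting a relation onto the three binary relations and joining them back. The rows below are hypothetical: a reduced stand-in for the student_subject_professor data that is merely consistent with the statements above, not the original table.

```python
# Hypothetical rows (student_id, subject, professor).
base = {
    ("101", "Java", "Amit"),
    ("101", "C++", "Rajan"),
    ("101", "C#", "Rajan"),
    ("102", "C++", "Amit"),
}

# The three binary projections.
student_subject = {(s, sub) for s, sub, p in base}
student_professor = {(s, p) for s, sub, p in base}
subject_professor = {(sub, p) for s, sub, p in base}

# Natural join of the three projections.
rejoined = {
    (s, sub, p)
    for s, sub in student_subject
    for s2, p in student_professor
    if s == s2 and (sub, p) in subject_professor
}

spurious = rejoined - base
print(spurious)  # {('101', 'C++', 'Amit')}
```

The join claims that Amit teaches C++ to student 101, a tuple that was never in the base relation, so the three-way decomposition is not lossless for this state.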

Conditions for relation decomposition
One thing common across the normalization process is the
decomposition of base relations into two or more relations to achieve a
higher normal form.
When we decompose a Relation in this way, we need to make sure that the decomposition is:
• A lossless (non-additive) join decomposition
• A dependency-preserving decomposition (optional in the case of BCNF decomposition)

Lossless (Non-additive) join decomposition
Lossless (non-additive) join decomposition ensures that:
• No spurious tuples are generated when a natural join operation is applied to the relations
resulting from the decomposition.
• The condition of no spurious tuples should hold on every legal relation state. The
lossless join property is always defined for a specific set F of functional dependencies.
• The word loss in lossless refers to loss of information, not to the loss of tuples.
If we decompose a Relation r(R) into r1(R1) and r2(R2) such that R1 ∪ R2 = R (attribute
preservation condition), then the decomposition is said to be lossless if it satisfies r1 ⋈ r2 = r
with no new tuples added and no tuples eliminated.
If we decompose a Relation r(R) into r1(R1), r2(R2), …, rk(Rk) such that R1 ∪ R2 ∪ … ∪ Rk = R
(attribute preservation condition), then the decomposition is said to be lossless if it satisfies
r1 ⋈ r2 ⋈ … ⋈ rk = r with no new tuples added and no tuples eliminated.

Lossless (Non-additive) join decomposition…
Example 1:
Case 1:
In case 1, we can see that R1 U R2 = R and r1 ⋈ r2 = r. It is a lossless join decomposition.
r(R):        r1(R1):    r2(R2):    r1(R1) ⋈ r2(R2):
A  B  C      A  B       A  C       A  B  C
a1 b1 c1     a1 b1      a1 c1      a1 b1 c1
a2 b2 c1     a2 b2      a2 c1      a1 b1 c2
a1 b1 c2     a3 b2      a1 c2      a1 b1 c3
a3 b2 c3                a3 c3      a2 b2 c1
a1 b1 c3                a1 c3      a2 b2 c4
a2 b2 c4                a2 c4      a3 b2 c3

Example 1 (contd.):
Case 2:
In case 2, we can see that R1 U R2 = R and r1 ⋈ r2 ≠ r. It is not a lossless join decomposition.
r(R):        r1(R1):    r2(R2):    r1(R1) ⋈ r2(R2):
A  B  C      A  B       A  C       A  B  C
a1 b1 c1     a1 b1      a1 c1      a1 b1 c1  √
a2 b2 c1     a2 b2      a2 c1      a1 b1 c2  X
a1 b2 c2     a1 b2      a1 c2      a1 b1 c3  √
a3 b2 c3     a3 b2      a3 c3      a2 b2 c1  √
a1 b1 c3     a2 b1      a1 c3      a2 b2 c4  X
a2 b1 c4                a2 c4      a1 b2 c1  X
                                   a1 b2 c2  √
                                   a1 b2 c3  X
                                   a3 b2 c3  √
                                   a2 b1 c1  X
                                   a2 b1 c4  √
√ = correct tuple, X = spurious tuple
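The two cases of Example 1 can also be checked mechanically. The following is a minimal Python sketch, not part of the original slides: the helper names `project`, `natural_join` and `spurious_tuples` are illustrative assumptions, and each relation state is modelled as a set of tuples over the attributes (A, B, C). It projects r onto R1(A, B) and R2(A, C), joins the projections, and reports any tuple of the join that is not in r.

```python
ATTRS = ("A", "B", "C")

def project(rows, attrs, keep):
    """Project a set of tuples over `attrs` onto the attributes in `keep`."""
    idx = [attrs.index(a) for a in keep]
    return {tuple(row[i] for i in idx) for row in rows}

def natural_join(r1, attrs1, r2, attrs2):
    """Natural join of two relation states on their common attributes."""
    common = [a for a in attrs1 if a in attrs2]
    out_attrs = tuple(attrs1) + tuple(a for a in attrs2 if a not in attrs1)
    joined = set()
    for t1 in r1:
        row1 = dict(zip(attrs1, t1))
        for t2 in r2:
            row2 = dict(zip(attrs2, t2))
            if all(row1[a] == row2[a] for a in common):
                merged = {**row1, **row2}
                joined.add(tuple(merged[a] for a in out_attrs))
    return out_attrs, joined

def spurious_tuples(r):
    """Tuples produced by joining the AB and AC projections that are not in r."""
    r1 = project(r, ATTRS, ("A", "B"))
    r2 = project(r, ATTRS, ("A", "C"))
    _, joined = natural_join(r1, ("A", "B"), r2, ("A", "C"))
    return joined - r

# Relation states copied from Case 1 and Case 2 of Example 1.
case1 = {("a1", "b1", "c1"), ("a2", "b2", "c1"), ("a1", "b1", "c2"),
         ("a3", "b2", "c3"), ("a1", "b1", "c3"), ("a2", "b2", "c4")}
case2 = {("a1", "b1", "c1"), ("a2", "b2", "c1"), ("a1", "b2", "c2"),
         ("a3", "b2", "c3"), ("a1", "b1", "c3"), ("a2", "b1", "c4")}

print(len(spurious_tuples(case1)))  # 0 spurious tuples: the join gives back r
print(len(spurious_tuples(case2)))  # 5 spurious tuples: the join adds tuples
```

A check on a single relation state can only refute losslessness; the property itself must hold on every legal state, which is why the tests that follow work on the FD set instead.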

For a decomposition of R into R1 and R2 to be a lossless join decomposition with respect to an FD set, the
following conditions must hold:
 The union of the attributes of R1 and R2 must be equal to the attributes of R. Each attribute of R must
be either in R1 or in R2.
Att(R1) U Att(R2) = Att(R)
 The intersection of the attributes of R1 and R2 must not be empty.
Att(R1) ∩ Att(R2) ≠ Φ
 The common attributes must form a key for at least one Relation (R1 or R2).
Att(R1) ∩ Att(R2) → Att(R1) or Att(R1) ∩ Att(R2) → Att(R2)
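The three conditions above translate directly into a closure computation. Below is a minimal Python sketch, assuming single-letter attribute names and FDs written as (lhs, rhs) string pairs; the function names `closure` and `binary_lossless` are illustrative and do not come from the slides.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under the FDs given as (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def binary_lossless(r, r1, r2, fds):
    """Apply the three conditions to a decomposition of r into r1 and r2."""
    r, r1, r2 = set(r), set(r1), set(r2)
    if r1 | r2 != r:          # condition 1: attribute preservation
        return False
    common = r1 & r2
    if not common:            # condition 2: non-empty intersection
        return False
    reach = closure(common, fds)
    return r1 <= reach or r2 <= reach   # condition 3: common attributes are a key

# FD set of Example 2 in this section.
fds = [("AB", "C"), ("C", "D"), ("D", "EF"), ("F", "A"), ("D", "B")]
print(binary_lossless("ABCDE", "ABC", "CDE", fds))   # True
print(binary_lossless("ABCDEF", "ABCDE", "EF", fds))  # False
```

The two calls correspond to the two pairwise joins examined in Example 2: joining R1 with R2 first, then the result with R3.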

Example 2:
A Relation R(A,B,C,D,E,F) with FD set {AB→C, C→D, D→EF, F→A, D→B} is decomposed into R1(ABC),
R2(CDE), R3(EF)
Condition 1:- Att(R1) U Att(R2) U Att(R3) = (A,B,C,D,E,F) = R(A,B,C,D,E,F) – condition met
As join (⋈) is a binary operation, we will take two relations at a time.

Example 2 (contd.):
Att(R1) ∩ Att(R2) = (C) ≠ Φ – condition met
Let's check if (C) is a key in either R1 or R2.
C+ = {C,D,E,F,A,B}, so (C) can determine all attributes of both R1 and R2; hence it is a key in
both R1 and R2 – condition met
So R1(A,B,C) ⋈ R2(C,D,E) = R12(A,B,C,D,E) is a lossless join.
Att(R12) ∩ Att(R3) = (E) ≠ Φ – condition met
Let's check if (E) is a key in either R12 or R3.
E+ = {E}, so (E) cannot determine all attributes of either R12 or R3 – condition not met
So R12(A,B,C,D,E) ⋈ R3(E,F) = R(A,B,C,D,E,F) is not a lossless join.
Therefore we can conclude that the whole decomposition R1(ABC), R2(CDE) and R3(EF) is not a
lossless join.

Algorithm to test for lossless (Non-additive) Join Property
Input: A universal Relation R, a decomposition D = {R1, R2, …, Rm} of R, and a set F of functional dependencies.
Output: A decision whether decomposition is lossless or not.
1. Create an initial matrix S with one row i for each Relation Ri in D, and one column j for each attribute Aj in R.
2. For each row i representing Relation schema Ri
   { For each column j representing attribute Aj
     { If Relation Ri includes attribute Aj:
         put the symbol aj, i.e. S(i, j) := aj
       Otherwise:
         put the symbol bij, i.e. S(i, j) := bij
     }
   }

Algorithm to test for lossless (Non-additive) Join Property (contd.)
3. Repeat the following loop until a complete loop execution results in no changes to S
   { For each functional dependency X→Y in F
     { For each set of rows in S that have the same symbols in the columns corresponding to attributes in X
       { Make the symbols in each column that corresponds to an attribute in Y the same in all
         these rows, as follows:
         If any of the rows has an 'a' symbol for the column, set the other rows to that same 'a'
         symbol in the column.
         If no 'a' symbol exists for the attribute in any of the rows, choose one of the 'b' symbols that
         appears in one of the rows for the attribute and set the other rows to that same 'b' symbol in
         the column.
       }
     }
   }
4. If a row is made up entirely of 'a' symbols, then the decomposition has the non-additive join
property; otherwise, it does not.
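The four steps above can be sketched in Python as follows. This is an illustrative sketch rather than a reference implementation: the function name `chase_is_lossless` and the tuple encoding of the symbols are assumptions, with ("a", attribute) standing for aj and ("b", i, attribute) standing for bij. Attribute names are assumed to be single letters, and FDs are (lhs, rhs) string pairs.

```python
def chase_is_lossless(attributes, decomposition, fds):
    """Matrix test for the lossless (non-additive) join property."""
    # Steps 1-2: one row per relation schema, one column per attribute.
    rows = [
        {a: ("a", a) if a in schema else ("b", i, a) for a in attributes}
        for i, schema in enumerate(decomposition, start=1)
    ]
    # Step 3: apply the FDs until a full pass changes nothing.
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Group the rows that agree on every attribute of the left-hand side.
            groups = {}
            for row in rows:
                groups.setdefault(tuple(row[a] for a in lhs), []).append(row)
            for group in groups.values():
                for attr in rhs:
                    symbols = {row[attr] for row in group}
                    if len(symbols) > 1:
                        # min() prefers the 'a' symbol, else the lowest-numbered 'b'.
                        target = min(symbols)
                        for row in group:
                            row[attr] = target
                        changed = True
    # Step 4: lossless if and only if some row consists only of 'a' symbols.
    return any(all(sym[0] == "a" for sym in row.values()) for row in rows)

# Example 2 of this section: the pairwise test found this decomposition lossy.
fds2 = [("AB", "C"), ("C", "D"), ("D", "EF"), ("F", "A"), ("D", "B")]
print(chase_is_lossless("ABCDEF", ["ABC", "CDE", "EF"], fds2))  # False
```

Unlike the pairwise test, the matrix test handles a decomposition into any number of relations in one go, so it does not depend on the order in which the relations are joined.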

Example 3:
R(A,B,C,D,E)
Decomposition is: R1(AD) ; R2(AB) ; R3(BE) ; R4(CDE) ; R5(AE)
Set of functional dependencies FD = {A→C, B→C, C→D, DE→C, CE→A}. Verify whether this
decomposition is lossless or lossy.
Solution: Initialization of matrix:
         A    B    C    D    E
R1(AD)   a1   b12  b13  a4   b15
R2(AB)   a1   a2   b23  b24  b25
R3(BE)   b31  a2   b33  b34  a5
R4(CDE)  b41  b42  a3   a4   a5
R5(AE)   a1   b52  b53  b54  a5
Now apply the functional dependencies F = {A→C, B→C, C→D, DE→C, CE→A} to the matrix, one at a time.

Example 3 (contd.):
The FDs are applied in the order A→C, B→C, C→D, DE→C, CE→A, and then A→C again.
1. A → C: rows AD, AB and AE agree on A (a1), so their C symbols are all set to b13.
      A    B    C    D    E
AD    a1   b12  b13  a4   b15
AB    a1   a2   b13  b24  b25
BE    b31  a2   b33  b34  a5
CDE   b41  b42  a3   a4   a5
AE    a1   b52  b13  b54  a5
2. B → C: rows AB and BE agree on B (a2), so the C symbol of BE is set to b13.
      A    B    C    D    E
AD    a1   b12  b13  a4   b15
AB    a1   a2   b13  b24  b25
BE    b31  a2   b13  b34  a5
CDE   b41  b42  a3   a4   a5
AE    a1   b52  b13  b54  a5
3. C → D: rows AD, AB, BE and AE agree on C (b13), and AD has a4 for D, so all of them get a4.
      A    B    C    D    E
AD    a1   b12  b13  a4   b15
AB    a1   a2   b13  a4   b25
BE    b31  a2   b13  a4   a5
CDE   b41  b42  a3   a4   a5
AE    a1   b52  b13  a4   a5
4. DE → C: rows BE, CDE and AE agree on D and E (a4, a5), and CDE has a3 for C, so all of them get a3.
      A    B    C    D    E
AD    a1   b12  b13  a4   b15
AB    a1   a2   b13  a4   b25
BE    b31  a2   a3   a4   a5
CDE   b41  b42  a3   a4   a5
AE    a1   b52  a3   a4   a5
5. CE → A: rows BE, CDE and AE agree on C and E (a3, a5), and AE has a1 for A, so all of them get a1.
      A    B    C    D    E
AD    a1   b12  b13  a4   b15
AB    a1   a2   b13  a4   b25
BE    a1   a2   a3   a4   a5
CDE   a1   b42  a3   a4   a5
AE    a1   b52  a3   a4   a5
6. A → C: all rows now agree on A (a1), so every C symbol is set to a3.
      A    B    C    D    E
AD    a1   b12  a3   a4   b15
AB    a1   a2   a3   a4   b25
BE    a1   a2   a3   a4   a5    All 'a' symbols are in this row
CDE   a1   b42  a3   a4   a5
AE    a1   b52  a3   a4   a5
Row BE is made up entirely of 'a' symbols. Thus, the decomposition of R(A,B,C,D,E) into R1(AD); R2(AB); R3(BE);
R4(CDE); R5(AE) is a lossless decomposition.

Dependency preserving decomposition
Dependency preserving or preserving functional dependencies
 For a Relation R to be recoverable, its decomposition must be lossless, as explained in the
earlier section. In addition, the decomposition should satisfy another property known as
dependency preservation.
 It states that if a Relation R is decomposed into relations R1 and R2, then every functional
dependency of R must either be a part of R1 or R2, or be derivable from the
combination of the FDs of R1 and R2.
Need of dependency preservation:
 The set of FDs on the original Relation defines the integrity constraints that the Relation needs to
meet. A decomposition that does not preserve the dependencies of the original Relation imposes
an unnecessary burden on the RDBMS, which must join the decomposed relations to check
that the constraints are not violated whenever any of the decomposed relations is
updated. Dependency preservation is optional for BCNF decomposition.

Dependency preserving decomposition…
Definition:
A Decomposition D = {R1, R2, R3, …, Rn} of R is dependency preserving w.r.t. a set F of
functional dependencies if (F1 U F2 U … U Fn)+ = F+, where Fi is the set of FDs in F+ that
involve only attributes of Ri.
How to check:
Consider a Relation R with a functional dependency set F. R is decomposed or divided
into R1 with FDs {F1} and R2 with FDs {F2}. Comparing closures, there can be three cases:
 {F1 U F2}+ = F+ -----> Decomposition is dependency preserving.
 {F1 U F2}+ is a proper subset of F+ -----> Decomposition is not dependency preserving.
 {F1 U F2}+ is a proper superset of F+ -----> This case is not possible.

Example 1:
Let a Relation R (ABCD) and functional dependency set F= {AB→C, C→D, D→A}. Relation R is
decomposed into R1(ABC) and R2(CD). Check whether decomposition is dependency
preserving or not.
Solution:
Step 1: For the decomposed relations R1(A, B, C) and R2(C, D), let's find the functional
dependencies of each sub Relation as F1 and F2 using the closure property.
To find the FDs for Relation R1, i.e. F1, we will consider all combinations of attributes that belong to
Relation R1(ABC), i.e., find the closure of A, B, C, AB, BC, and AC using the original FD set F (Note:
ABC is not considered, as within R1 it always gives ABC, which is trivial), and then eliminate every FD in
which an attribute appears that is not part of Relation R1. There is no need to add trivial functional
dependencies.
(A)+ = {A} // Trivial, hence ignore
(B)+ = {B} // Trivial, hence ignore

Example 1 (contd.):
(C)+ = {C,A,D}, but D can't be kept because D is not present in R1, which leaves {C,A}.
We would write the FD as C→CA, but C on the RHS is a trivial attribute, so remove it from the RHS.
Finally, the FD using (C)+ is C→A ………………………………………….(1)
(AB)+ = {A,B,C,D}, but D can't be kept as D is not present in R1, which leaves {A,B,C}.
Therefore the FD will be AB→C // Removing trivial attributes (AB) from RHS ………(2)
(BC)+ = {B,C,D,A}, but D can't be kept as D is not present in R1, which leaves {A,B,C}.
Therefore the FD will be BC→A // Removing trivial attributes (BC) from RHS ………(3)
(AC)+ = {A,C,D}, but D can't be kept as D is not present in R1, which leaves {A,C}.
Ignoring AC (trivial), no new FD is derived using AC.
Therefore F1 = {C→A, AB→C, BC→A} using (1), (2) & (3)

Example 1 (contd.):
To find the FDs for Relation R2, i.e., F2, we will consider all combinations of attributes that
belong to Relation R2(CD), i.e., C and D (Note: CD is not considered, as within R2 it always gives CD,
which is trivial).
Similarly, we can derive F2 = {C→D}
Step 2: Test whether each functional dependency of the original Relation {AB→C, C→D, D→A} exists in {F1
U F2} or can be derived from it.
{F1 U F2} = {C→A, AB→C, BC→A, C→D}
AB→C is present in {F1 U F2}.
C→D is present in {F1 U F2}.
D→A is not present in F1 or F2, nor in {F1 U F2}+. Hence this dependency is not
preserved, or we can say {F1 U F2}+ is a proper subset of F+.
So the given decomposition is not dependency preserving.

Example 2:
Let a Relation R(A,B,C,D,E) and functional dependency set F = {A→B, B→C, C→D, D→A}. Relation R is
decomposed into R1(ABC) and R2(CDE). Check whether decomposition is dependency preserving or not.
Solution:
Step 1: For the decomposed relations R1(ABC) and R2(CDE), let's find the functional dependencies of each
sub Relation as F1 and F2 using the closure property.
To find the FDs for Relation R1, i.e., F1, we will consider all combinations of attributes that belong to
Relation R1(ABC), i.e., find the closure of A, B, C, AB, BC and AC using the original FD set F.
(A)+ = {A,B,C,D}. Ignoring A (trivial) & D (D is not part of R1) leaves {B,C}.
We can write the functional dependency derived from A as A→BC …………..……………… (1)
(B)+ = {B,C,D,A}. Ignoring B (trivial) & D (D is not part of R1) leaves {C,A}.
We can write the functional dependency derived from B as B→CA ……………..…………… (2)

Example 2 (contd.):
(C)+ = {C,D,A,B}. Ignoring C (trivial) & D (D is not part of R1) leaves {B,A}.
We can write the functional dependency derived from C as C→BA ..………………………… (3)
(AB)+ = {A,B,C,D}. Ignoring AB (trivial) & D (D is not part of R1) leaves {C}.
We can write the functional dependency derived from AB as AB→C. But please note that this
is a redundant FD, because attribute A alone can derive C in equation (1) above. In other words, we need not
check any combination of attributes that includes an attribute which is itself capable of determining
every attribute of R1. Hence we will ignore this FD as part of the F1 set.
Similarly, (A)+, (B)+ and (C)+ each derive all attributes of R1; hence testing combinations like AC & BC will
not add any new functional dependency to the set F1.
Therefore final F1 = {A→BC, B→CA, C→BA}

Example 2 (contd.):
To find the FDs for Relation R2, i.e. F2, we will consider all combinations of attributes of R2(CDE), i.e. C, D, E, CD,
CE, DE, using the original functional dependency set F = {A→B, B→C, C→D, D→A}.
(C)+ = {C,D,A,B}. Ignoring C (trivial) & AB (AB is not part of R2) leaves {D}.
We can write the functional dependency derived from C as C→D ……………………………… (1)
(D)+ = {D,A,B,C}. Ignoring D (trivial) & AB (AB is not part of R2) leaves {C}.
We can write the functional dependency derived from D as D→C ……………………………… (2)
(E)+ = {E}. Ignoring the trivial attribute E, there is no FD using E.
(CD)+ = {C,D,A,B}. Ignoring CD (trivial) & AB (AB is not part of R2), no new FD is derived using CD.
(DE)+ = {D,E,A,B,C}. Ignoring DE (trivial) & AB (AB is not part of R2) leaves {C}.
We can write the functional dependency derived from DE as DE→C. But please note this is a redundant FD because D
alone can derive C in equation (2). Hence we will ignore this FD.

Example 2 (contd.):
(CE)+ = {C,E,D,A,B}. Ignoring CE (trivial) & AB (AB is not part of R2) leaves {D}.
We can write the functional dependency derived from CE as CE→D. But please note this is a redundant FD
because C alone can derive D in equation (1). Hence we will ignore this FD.
Therefore final F2 = {C→D, D→C}
Step 2: Test whether each functional dependency of the original Relation, F = {A→B, B→C, C→D, D→A}, exists in
{F1 U F2} or can be derived from it.
F1 = {A→BC, B→CA, C→BA}
F2 = {C→D, D→C}
{F1 U F2} = {A→BC, B→CA, C→BA, C→D, D→C}
A→B is present in {F1 U F2} (applying the decomposition rule on A→BC)
B→C is present in {F1 U F2} (applying the decomposition rule on B→CA)

Example 2 (contd.):
C→D is present in {F1 U F2}.
D→A can be derived using the axioms on {F1 U F2}: from D→C & C→BA we can derive D→BA (using the transitivity
rule), and then, applying the decomposition rule, we can infer D→B & D→A. Hence D→A is in {F1 U F2}+. This
means F+ = {F1 U F2}+.
So the given decomposition of Relation R is dependency preserving.
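The procedure followed in Examples 1 and 2 (project F onto each sub Relation using closures, then test every original FD against the union) can be sketched in Python as below. This is an illustrative sketch, not part of the slides: the function names are assumptions, attributes are single letters, and FDs are (lhs, rhs) pairs. The projection step tries every proper subset of a schema, so it is only practical for small relations such as the ones in these examples, and it does not prune redundant FDs such as AB→C the way the worked solution does.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of `attrs` under the FDs given as (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def project_fds(schema, fds):
    """FDs that hold inside `schema`, found by closing every proper subset."""
    schema = set(schema)
    projected = []
    for size in range(1, len(schema)):
        for combo in combinations(sorted(schema), size):
            lhs = set(combo)
            # Keep only attributes of this schema and drop the trivial part.
            rhs = (closure(lhs, fds) & schema) - lhs
            if rhs:
                projected.append((lhs, rhs))
    return projected

def preserves_dependencies(decomposition, fds):
    """True if every FD in `fds` follows from the union of the projected FDs."""
    union = [fd for schema in decomposition for fd in project_fds(schema, fds)]
    return all(set(rhs) <= closure(lhs, union) for lhs, rhs in fds)

# Example 1: R(ABCD) into R1(ABC), R2(CD); D→A is lost.
f1 = [("AB", "C"), ("C", "D"), ("D", "A")]
print(preserves_dependencies(["ABC", "CD"], f1))   # False

# Example 2: R(ABCDE) into R1(ABC), R2(CDE); every FD can be derived.
f2 = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
print(preserves_dependencies(["ABC", "CDE"], f2))  # True
```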

Domain Key Normal Form (DKNF)
 A relation schema is said to be in DKNF only if all the constraints and
dependencies that should hold on a valid relation state can be
enforced simply by enforcing the domain constraints (constraints on the
valid set of values) and the key constraints on the relation.
 Verification Procedure
 Check that each attribute value in a tuple is of the appropriate domain and that every key constraint is
enforced.
 Why DKNF?
 To avoid general constraints in the database that are not clear key constraints.
 Most databases can easily test or check key constraints on attributes.

Example
 Consider the relations CAR (MAKE, vin#) and MANUFACTURE (vin#, country).
 If the MAKE is either ‘HONDA’ or ‘MARUTI’, then the first character of the
vin# is a ‘B’ if the country of manufacture is ‘INDIA’.
 If the MAKE is ‘FORD’ or ‘ACCURA’, then the second character of the vin# is a
‘B’ if the country of manufacture is ‘INDIA’.
CAR:
Make     vin#
Honda    B001
Maruti   B002
Ford     CB01
Accura   CB01
MANUFACTURE:
vin#     country
B001     India
B001     India
CB01     India
CB01     India
Conclusion:
• It is difficult to enforce such a constraint using domain and key constraints alone.

Q. Relation R with an associated set of functional dependencies, F is
decomposed into BCNF. The redundancy (arising out of functional
dependencies) in the resulting set relations is. (GATE 2002)
A. Zero
B. More than zero but less than that of an equivalent 3NF decomposition
C. Proportional to the size of F+
D. Indeterminate
Question

Q. Which normal form is based on the concept of ‘full functional dependency’?
(ISRO 2011)
A. First Normal Form
B. Second Normal Form
C. Third Normal Form
D. Fourth Normal Form
Question
Second Normal Form (2NF) is the normal form based on the
concept of full functional dependency.
It means that the schema should meet the
requirements of First Normal Form (1NF), and all
non-key attributes must be fully functionally dependent
on the primary key; no partial dependency on a
candidate key should exist.
So, Option (B) is correct.

Q. If every non-key attribute is functionally dependent on the primary key, then the relation
is in __________ . (UGC NET 2017)
A. First normal form
B. Second normal form
C. Third normal form
D. Fourth normal form
Question
Conditions for various normal forms:
1 NF – A relation R is in first normal form (1NF) if and only if all
underlying domains contain atomic values only.
2 NF – A relation R is in second normal form (2NF) if and only if it is in
1NF and every non-key attribute is fully dependent on the primary key.
3 NF – A relation R is in third normal form (3NF) if and only if it is in
2NF and every non-key attribute is non-transitively dependent on the primary
key.
BCNF – A relation R is in Boyce-Codd normal form (BCNF) if and only if every
determinant is a candidate key.
Example:
Relation R(XYZ) with functional dependencies {X -> Y, Y -> Z, X -> Z}.
Here {X} is the candidate key. Notice the FD Y -> Z: the question does not
say that non-prime attributes depend only on the primary key, so this FD
is perfectly valid. This relation is in 2NF but not in 3NF, because the
non-key attribute Z is transitively dependent on the primary key X
through Y.
So, option (B) is correct.
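The conditions listed above can be turned into a small checker, which is handy for verifying the answers to questions of this kind. The Python sketch below is illustrative only: the function names are assumptions, attributes are single letters, FDs are (lhs, rhs) pairs, and the relation is assumed to be in 1NF already. Candidate keys are found by brute force, so it suits only small exam-sized schemas. 2NF is tested as "no non-prime attribute depends on a proper subset of a candidate key", 3NF as "for every non-trivial X→A, X is a superkey or A is prime", and BCNF as "every non-trivial determinant is a superkey".

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of `attrs` under the FDs given as (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attrs, fds):
    """All minimal attribute sets whose closure is the whole relation."""
    attrs = set(attrs)
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) >= attrs:
                keys.append(candidate)
    return keys

def highest_normal_form(attrs, fds):
    """Return '1NF', '2NF', '3NF' or 'BCNF' for a relation assumed to be in 1NF."""
    attrs = set(attrs)
    keys = candidate_keys(attrs, fds)
    prime = set().union(*keys)
    nonprime = attrs - prime
    # Split right-hand sides into single attributes and drop trivial FDs.
    single = [(set(lhs), a) for lhs, rhs in fds for a in rhs if a not in lhs]

    def is_superkey(x):
        return closure(x, fds) >= attrs

    if all(is_superkey(lhs) for lhs, _ in single):
        return "BCNF"
    if all(is_superkey(lhs) or a in prime for lhs, a in single):
        return "3NF"
    partial = any(
        closure(part, fds) & nonprime
        for key in keys
        for size in range(1, len(key))
        for part in combinations(sorted(key), size)
    )
    return "1NF" if partial else "2NF"

# The example above: R(XYZ) with X -> Y, Y -> Z, X -> Z.
print(highest_normal_form("XYZ", [("X", "Y"), ("Y", "Z"), ("X", "Z")]))  # 2NF
```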

Q. Consider the following dependencies and the BOOK table in a relational database
design. Determine the normal form of the given relation. (ISRO 2013)
ISBN → Title
ISBN → Publisher
Publisher → Address
A. First Normal Form
B. Second Normal Form
C. Third Normal Form
D. BCNF
Question
ISBN is the candidate key.
BCNF is ruled out as Publisher is not a key.
3NF is ruled out as there is a transitive
dependency Publisher -> Address: Publisher is
not a key and Address is not a prime
attribute.
The relation is in 2NF as there is no partial
dependency. So, option (B) is correct.

Q. For a database relation R(a,b,c,d), where the domains a, b, c, d include only atomic
values, only the following functional dependencies and those that can be inferred from
them hold: (GATE 1997 & UGC NET 2017)
{a → c, b → d}
This relation is:-
A. In first normal form but not in second normal form
B. In second normal form but not in first normal form
C. In third normal form
D. None of the above
The candidate key of the above relation is ab.
a and b are each only a part of the candidate key, which is why the
given FDs a → c and b → d are partial dependencies.
2NF does not allow partial dependencies, and since all domains are
atomic the relation is already in 1NF. Hence, this relation is in
first normal form but not in second normal form.
Option (A) is correct.
Question

Q. Consider the following database relation containing the attributes: (GATE 1998)
Book_id
Subject_Category_of_book
Name_of_Author
Nationality_of_Author
with Book_id as the Primary Key.
(a) What is the highest normal form satisfied by this relation?
(b) Suppose the attributes Book_title and Author_address are added to the relation, and the
primary key is changed to (Name_of_Author, Book_Title). What will be the highest normal form
satisfied by the relation?
Question
Answer: (a) BCNF (b) 1NF
(a) R (Book_id, Subject_Category_of_book, Name_of_Author,
Nationality_of_Author) = R (A, B, C, D)
Given that Book_id is the Primary Key, {A → B, C, D} holds.
Hence the given relation is in BCNF.
(b) Two attributes, Book_title and Author_address, are added to the relation.
Then R (Book_id, Subject_Category_of_book, Name_of_Author, Nationality_of_Author,
Book_title, Author_address) = R (A, B, C, D, E, F). Given that (Name_of_Author, Book_Title)
is now the primary key, {C, E → A, B, D, F} & {A → B, C, D} hold.
A candidate key of this relation is (A, E), and {A → B, C, D} is a partial dependency on it,
so the relation is only in 1NF.

The relation scheme Student Performance (name, courseNo, rollNo, grade) has the
following functional dependencies: (GATE 2004)
name, courseNo → grade
rollNo, courseNo → grade
name → rollNo
rollNo → name
The highest normal form of this relation scheme is:-
1. 2NF
2. 3NF
3. BCNF
4. 4NF
Question
For easy understanding, let the attributes (name, courseNo, rollNo,
grade) be (A,B,C,D). Then the given FDs are as follows:
AB->D, CB->D, A->C, C->A
Here there are two candidate keys, AB and CB. Now AB->D and CB->D satisfy
BCNF, as the LHS is a superkey in both. But A->C and C->A do not satisfy BCNF.
Hence we check for 3NF for these 2

Q. A table has fields, F1,F2,F3,F4,F5 with the following functional dependencies: (GATE 2005)
F1→F3
F2→F4
(F1.F2)→F5
In terms of Normalization, this table is in:-
A. 1NF
B. 2NF
C. 3NF
D. None of these
Question
First Normal Form - A relation is in first
normal form if every attribute in that
relation is a single-valued attribute. Second
Normal Form - A relation is in 2NF if it
has no partial dependency, i.e., no non-prime
attribute (an attribute that is not part of
any candidate key) is dependent on any proper
subset of any candidate key of the table. This
table has the partial dependencies F1→F3 and
F2→F4, given that (F1, F2) is the key. So the answer is A.

Q. Let R (A, B, C, D, E, P, G) be a relational schema in which the following functional
dependencies are known to hold: AB → CD, DE → P, C → E, P → C and B → G. The
relational schema R is (GATE 2008)
A. In BCNF
B. In 3NF, but not in BCNF
C. In 2NF, but not in 3NF
D. Not in 2NF
Question
Candidate key = AB
B->G is a partial dependency.
So, R is not in 2NF (option D).

Q. Consider the following relational schemes for a library database: (GATE 2008)
Book (Title, Author, Catalog_no, Publisher, Year, Price)
Collection (Title, Author, Catalog_no)
with the following functional dependencies:
Title, Author --> Catalog_no
Catalog_no --> Title, Author, Publisher, Year
Publisher, Title, Year --> Price
Assume {Author, Title} is the key for both schemes. Which of the following statements is true?
A. Both Book and Collection are in BCNF
B. Both Book and Collection are in 3NF only
C. Book is in 2NF and Collection is in 3NF
D. Both Book and Collection are in 2NF only
Question

Q. Relation R has eight attributes ABCDEFGH. Fields of R contain only
atomic values. F = {CH -> G, A -> BC, B -> CFH, E -> A, F -> EG} is a set of
functional dependencies (FDs) so that F+ is exactly the set of FDs that hold
for R.
The Relation R is:- (GATE 2013)
A. In 1NF, but not in 2NF
B. In 2NF, but not in 3NF
C. In 3NF, but not in BCNF
D. In BCNF
Question
The table is not in second normal form, as
non-prime attributes are dependent on
proper subsets of candidate keys. The candidate
keys are AD, BD, ED and FD. In each of the
following FDs, non-prime attributes are
dependent on part of a candidate key:
A -> BC, B -> CFH, F -> EG

Q. The best normal form of relation scheme R(A, B, C, D) along with the set of functional
dependencies F = {AB → C, AB → D, C → A, D → B} is (UGC NET 2014)
A. Boyce-Codd Normal form
B. Third Normal form
C. Second Normal form
D. First Normal form
Question
AB is a candidate key (AD, BC and CD are the others), so every
attribute is prime. {C -> A} & {D -> B} violate BCNF, as (C) & (D)
are not keys. The relation is in 3NF, as (AB) is a key in
{AB → C, AB → D} and (A) & (B) are prime attributes in
{C → A, D → B}. So, option (B) is correct.

Q. Consider the following four relational schemas. For each schema, all non-trivial functional dependencies are listed.
The underlined attributes are the respective primary keys.
 Schema I: Registration (rollno, courses) Field ‘courses’ is a set-valued attribute containing the set of courses a
student has registered for. Non-trivial functional dependency {rollno → courses}
 Schema II: Registration (rollno, courseid, email) Non-trivial functional dependencies: {rollno, courseid →
email}, {email → rollno}
 Schema III: Registration (rollno, courseid, marks, grade) Non-trivial functional dependencies: {rollno, courseid, →
marks, grade}, {marks → grade}
 Schema IV: Registration (rollno, courseid, credit) Non-trivial functional dependencies: {rollno, courseid →
credit}, {courseid → credit}
Which one of the relational schemas above is in 3NF but not in BCNF? (GATE 2018)
A. Schema I
B. Schema II
C. Schema III
D. Schema IV
Question

A database of research articles in a journal uses the following schema.
(VOLUME, NUMBER, STARTPAGE, ENDPAGE, TITLE, YEAR, PRICE)
The primary key is (VOLUME, NUMBER, STARTPAGE, ENDPAGE)
and the following functional dependencies exist in the schema.
(VOLUME, NUMBER, STARTPAGE, ENDPAGE) → TITLE
(VOLUME, NUMBER) → YEAR
(VOLUME, NUMBER, STARTPAGE, ENDPAGE) → PRICE
The database is redesigned to use the following schemas.
(VOLUME, NUMBER, STARTPAGE, ENDPAGE, TITLE, PRICE)
(VOLUME, NUMBER, YEAR)
Which is the weakest normal form that the new database satisfies, but the old one does not?
A. 1NF
B. 2NF
C. 3NF
D. BCNF
Question (GATE 2016)

Consider the following table: Faculty (facName, dept, office, rank, dateHired) (ISRO 2017)
(Assume that no two faculty members within a single department have the same name. Each faculty
member has only one office, identified in office.) 3NF refers to third normal form and BCNF
refers to Boyce-Codd Normal Form.
Then Faculty is:-
A. Not in 3NF, in BCNF
B. In 3NF, not in BCNF
C. In 3NF, in BCNF
D. Not in 3NF, not in BCNF
Question
FACNAME    DEPT  OFFICE  RANK        DATEHIRED
Ravi       Art   A101    Professor   1975
Murali     Math  M201    Assistant   2000
Narayanan  Art   A101    Associate   1992
Lakshmi    Math  M201    Professor   1982
Mohan      CSC   C101    Professor   1980
Sreeni     Math  M203    Associate   1990
Tanuja     CSC   C101    Instructor  2001
FDs:-
1. facName, dept -> office, rank, dateHired
2. office -> dept
The FD facName, dept → office, rank, dateHired satisfies 3NF as well as BCNF,
because (facName, dept) is the primary key. But the FD office → dept violates BCNF because
office is not a superkey; it satisfies 3NF, as dept is a prime attribute.
So, overall, relation Faculty is in 3NF but not in BCNF.

Consider the following functional dependencies in a database: (GATE 2003)
Date_of_Birth → Age
Age → Eligibility
Name → Roll_number
Roll_number → Name
Course_number → Course_name
Course_number → Instructor
(Roll_number, Course_number) → Grade
The relation (Roll_number, Name, Date_of_birth, Age) is:
A. In 2NF but not in 3NF
B. In 3NF but not in BCNF
C. In BCNF
D. None of the above
Question

Which of the following statements is TRUE? (UGC NET 2016)
 D1 : The decomposition of the schema R(A, B, C) into R1(A, B) and R2 (A, C)
is always lossless.
 D2 : The decomposition of the schema R(A, B, C, D, E) having AD → B, C →
DE, B → AE and AE → C, into R1 (A, B, D) and R2 (A, C, D, E) is lossless.
A. Both D1 and D2
B. Neither D1 nor D2
C. Only D1
D. Only D2
Question
Only D2 is true, because AD is a key and is present in both
the tables.
D1 is not always true because the FDs are not given; if we
take B->A and C->A, then the decomposition is lossy.

Consider a schema R(A, B, C, D) and the following functional dependencies:
A → B
B → C
C → D
D → B
Then the decomposition of R into R1 (A, B), R2(B, C) and R3(B, D) is __________ .
A. Dependency preserving and lossless join.
B. Lossless join but not dependency preserving.
C. Dependency preserving but not lossless join.
D. Not dependency preserving and not lossless join.
Schema R(A, B, C, D) is decomposed into three relations:
R1 (A, B), R2(B, C) and R3(B, D)
The dependencies derived for R1 (A, B) are: A → B
The dependencies derived for R2 (B, C) are: B → C, C → B
The dependencies derived for R3 (B, D) are: D → B, B → D
C → D follows from C → B and B → D, so all the dependencies are preserved,
and it is a lossless decomposition. So, option (A) is correct.

Consider a schema R(MNPQ) and functional dependencies M → N, P → Q.
Then the decomposition of R into R1 (MN) and R2(PQ) is________.
A. Dependency preserving but not lossless join
B. Dependency preserving and lossless join
C. Lossless join but not dependency preserving
D. Neither dependency preserving nor lossless join.

Consider the schema R = (S, T, U, V) and the dependencies S→T, T→U, U→V
and V→S. Let R = (R1 and R2) be a decomposition such that R1∩R2 ≠ Ø. The
decomposition is: (GATE 1999)
A. Not in 2NF
B. In 2NF but not in 3NF
C. In 3NF but not in 2NF
D. In both 2NF and 3NF
Question
R1∩R2 ≠ Ø means that R1 and R2 have a common
attribute. Suppose we choose a decomposition
such as R1(S, T, U) and R2(U, V). This
decomposition is lossless because the
common attribute U is a key, and the LHS of every
FD is a candidate key; therefore it is
in 2NF as well as 3NF. Option (D) is
correct.

Consider a schema R(A,B,C,D) and functional dependencies A->B and C->D.
Then the decomposition of R into R1(AB) and R2(CD) is:- (GATE 2001 & ISRO 2014)
A. Dependency preserving and lossless join
B. Lossless join but not dependency preserving
C. Dependency preserving but not lossless join
D. Not dependency preserving and not lossless join
Question
Dependency Preserving Decomposition:
Decomposition of R into R1 and R2 is a dependency preserving decomposition if the closure of the
functional dependencies after decomposition is the same as the closure of the FDs before decomposition.
A simple way is to just check whether we can derive all the original FDs from the FDs present
after decomposition. In the above question R(A, B, C, D) is decomposed into R1 (A, B) and R2(C,
D) and there are only two FDs, A -> B and C -> D. So, the decomposition is dependency
preserving.
Lossless-Join Decomposition:
In the above question R(A, B, C, D) is decomposed into R1 (A, B) and R2(C, D), and R1 ∩ R2 is
R(A,B,C,D) is a relation. Which of the following does not have a lossless
join, dependency preserving BCNF decomposition? (GATE 2001)
A. A->B, B->CD
B. A->B, B->C, C->D
C. AB->C, C->AD
D. A ->BCD
Question
We know that for a lossless decomposition the common attribute
should be a candidate key in one of the relations.
A) A->B, B->CD
R1(AB) and R2(BCD)
B is the key of the second, and hence the decomposition is lossless.
B) A->B, B->C, C->D
R1(AB), R2(BC), R3(CD)
B is the key of the second and C is the key of the third, hence
lossless.
C) AB->C, C->AD
R1(ABC), R2(CD)
C is the key of the second, but C->A violates the BCNF condition in ABC, as
C is not a key. We cannot decompose ABC further, as the AB->C
dependency would be lost.
D) A ->BCD
Already in BCNF.
So, option (C) is the answer.

Q. Relation R is decomposed using a set of functional dependencies, F and relation S is
decomposed using another set of functional dependencies G. One decomposition is
definitely BCNF, the other is definitely 3NF, but it is not known which is which. To make a
guaranteed identification, which one of the following tests should be used on the
decompositions? (Assume that the closures of F and G are available). (GATE 2002)
A. Dependency-preservation
B. Lossless-join
C. BCNF definition
D. 3NF definition
Question
Answer is (C) since to identify BCNF we need BCNF
definition. One relation which satisfies will be in BCNF
and other will be in 3NF.
1st is wrong because dependency may be preserved by both
3NF and BCNF.
2nd is wrong Because both 3NF and BCNF decomposition can
be lossless.
4th is wrong because 3NF and BCNF both are in 3NF also.

Q. Let the set of functional dependencies
F = {QR → S, R → P, S → Q}
hold on a relation schema X = (PQRS). X is not in BCNF. Suppose X is decomposed into two schemas Y and Z, where
Y = (PR) and Z = (QRS).
Consider the two statements given below.
I. Both Y and Z are in BCNF
II. Decomposition of X into Y and Z is dependency preserving and lossless
Which of the above statements is/are correct? (GATE 2019)
 I only
 Neither I nor II
 II only
 Both I and II
Question
