2. UNIT III
• SCHEMA REFINEMENT AND NORMAL FORMS:
• Introduction to schema refinement & Normalization,
• Decomposition and properties of decompositions,
• functional dependencies,
• Closure of Attributes set.
• Normal forms: 1NF, 2NF, 3NF, BCNF, 4NF,5NF.
• problems on normalization ,
• Schema refinement in database design.
3. Schema Refinement and Normal forms
➢ Schema: a complete description of the database.
✓ The specification of the database schema is provided during
database design, and the schema does not change frequently.
➢ Schema Refinement: the process of refining the schema to
solve the problems caused by storing information redundantly.
✓ Redundancy: duplicate copies of the same data stored in
different locations.
✓ Redundancy is at the root of several problems associated with
relational schemas. Some of them are
1. Redundant storage
2. Insert/delete/update anomalies.
✓ Anomalies: problems that arise in poorly planned,
un-normalized databases where all the data is stored in one
table, which is sometimes called a flat-file database.
5. Schema Refinement/ Normalization
➢ Schema refinement means improving the schema by applying
some technique. The main technique of schema refinement is
decomposition.
➢ Normalization (schema refinement) is a technique for
organizing the data in the database.
➢ It is a systematic approach of decomposing tables to
eliminate data redundancy and undesirable characteristics
such as insertion, update and deletion anomalies.
✓ Redundancy refers to duplicate copies of the same data
stored in different locations.
6. Problems Caused By Redundancy:
1. Data Inconsistency
2. Memory Fragmentation
3. Anomalies:
There are three types of anomalies:
i. Update
ii. Deletion and
iii. Insertion Anomalies
Let us consider an example: each employee in a company has a
department associated with them, as well as the student group
they participate in.
8. Update Anomaly:
➢ An update anomaly is a data inconsistency that results from
data redundancy and a partial update.
➢ For example, if Anand's department is recorded incorrectly, it
must be corrected in at least 2 rows, or there will be
inconsistent data in the database.
➢ If the user performing the update does not realize the data is
stored redundantly, the update will not be done properly.
9. Schema Refinement
• Anomalies (problems) faced without normalization, i.e. problems due to
redundancy:
SID SNAME CID CNAME FID FNAME SALARY
1 A 101 DBMS 1001 AAA 50000
2 B 101 DBMS 1001 AAA 50000
3 C 101 DBMS 1001 AAA 50000
4 D 102 OS 1002 BBB 60000
5 E 102 OS 1002 BBB 60000
10. Schema Refinement
• Due to redundancy of data we may get the following
problems, those are-
1. Insertion anomalies : It may not be possible to store some
information unless some other information is stored as
well.
2. Update anomalies: if one copy of redundant data is
updated, then inconsistency is created unless all
redundant copies of data are updated.
3. Deletion anomalies: it may not be possible to delete
some information without losing some other information
as well.
11. Schema Refinement
➢ Insert Anomaly: the inability to add data to the database
due to the absence of other data.
➢ For example, if a new employee is hired but not immediately
assigned to a group, then this employee cannot be entered
into the database. This results in database inconsistencies due
to omission.
➢ In the table below, course 103 and its faculty cannot be stored
without a student, so SID and SNAME are left NULL.
SID SNAME CID CNAME FID FNAME SALARY
1 A 101 DBMS 1001 AAA 50000
2 B 101 DBMS 1001 AAA 50000
3 C 101 DBMS 1001 AAA 50000
4 D 102 OS 1002 BBB 60000
5 E 102 OS 1002 BBB 60000
NULL NULL 103 XXX 1003 CCC 70000
12. Schema Refinement
Update Anomaly : An update anomaly is a data inconsistency
that results from data redundancy and a partial update.
SID SNAME CID CNAME FID FNAME SALARY
1 A 101 DBMS 1001 AAA 50000
2 B 101 DBMS 1001 AAA 50000
3 C 101 DBMS 1001 AAA 50000
4 D 102 OS 1002 BBB 60000
5 E 102 OS 1002 BBB 60000
13. Schema Refinement
Delete Anomaly: It is the unintended loss of data due to deletion of
other data.
This results in database inconsistencies and is an example of how combining
information that does not really belong together into one table can cause
problems
SID SNAME CID CNAME FID FNAME SALARY
1 A 101 DBMS 1001 AAA 50000
2 B 101 DBMS 1001 AAA 50000
3 C 104 IS 1004 CCCC 70000
4 D 102 OS 1002 BBB 60000
5 E 102 OS 1002 BBB 60000
14. Anomalies
• Update, deletion, and insertion anomalies are very undesirable
in any database.
• Anomalies are avoided by the process of normalization.
• Ways to avoid Data Anomalies: There are two ways to avoid
data anomalies. They are :
1. Normalization
2. Decomposition: Process of decomposing a larger relation
into smaller relations.
15. Functional Dependency
A functional dependency is a relationship between
attributes of a single relation.
A functional dependency (FD) has the form X -> Y (read as X
functionally determines Y),
where X and Y are sets of attributes in a relation R.
Here X is used to determine the value of Y, so it is said
that Y is functionally dependent on X.
Example: a student can have only one birth year: S → B
16. Functional Dependencies in entity sets
➢ Functional Dependencies in entity sets : An entity set can have
the following functional dependencies :
1. Fully Functional Dependency
2. Partial Functional Dependency
3. Transitive Functional Dependency
➢ If x, y, z are attributes of an entity set in a table such that:
y is functionally dependent on x (x -> y) and
z is functionally dependent on y (y -> z), then
z is transitively dependent on x through y (x -> z).
Eg: Students -> Teachers
Teachers -> Management
together imply Students -> Management
17. 4.5 Reasoning about FD’s:
Armstrong's Axioms: William W. Armstrong established a set of rules
which can be used to infer the FDs in a relational database (source:
UMBC database design notes):
➢ Reflexivity rule: If A is a set of attributes, and B is a set of
attributes completely contained in A, then A implies B.
➢ Augmentation rule: If A implies B, and C is a set of
attributes, then AC implies BC.
➢ Transitivity rule: If A implies B and B implies C, then A
implies C.
Reasoning can be simplified if we also use these derived rules:
1. Union rule: If A implies B and A implies C, then A implies BC.
2. Decomposition rule: If A implies BC, then A implies B and A
implies C.
3. Pseudo-transitivity rule: If A implies B and CB implies D, then AC
implies D.
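The rules above are usually applied mechanically by computing the closure of an attribute set (the topic "Closure of Attributes set" in the unit outline). The following is a minimal sketch, not taken from the slides: the function name `closure` and the representation of an FD as a pair of Python sets are assumptions made for illustration. It reproduces the pseudo-transitivity rule: from A -> B and CB -> D it finds that AC determines D.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by `attrs`.

    `fds` is a list of (lhs, rhs) pairs, each side a set of
    attribute names (an illustrative representation).
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# A -> B and CB -> D, as in the pseudo-transitivity rule.
fds = [({"A"}, {"B"}), ({"C", "B"}, {"D"})]
ac_closure = closure({"A", "C"}, fds)
print(sorted(ac_closure))
```

X -> Y follows from the given FDs exactly when Y is a subset of the closure of X, so a single closure computation replaces a hand derivation with the axioms.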
18. Schema Refinement
• To avoid REDUNDANCY and PROBLEMS due to redundancy,
we use refinement technique called DECOMPOSITION.
• Decomposition:- Process of decomposing a larger relation into
smaller relations.
• Each of smaller relations contain subset of attributes of original
relation.
SID SNAME CID CNAME FID FNAME SALARY
19. Functional Dependency(FD)
• A functional dependency is a relationship that exists when one
attribute uniquely determines another attribute.
• An FD is a form of integrity constraint (IC) that can identify
schemas with redundant storage problems and suggest
refinements.
• A functional dependency X → Y holds in a relation
if any two tuples that have the same value of attribute X also
have the same value of attribute Y.
If t1.X = t2.X then t1.Y = t2.Y
where t1, t2 are tuples and X, Y are attributes.
23. Functional dependency
RNO NAME MARKS DEPT COURSE
1 A 70 CSE C1
2 B 60 EEE C1
3 A 70 CSE C2
4 B 60 EEE C3
5 C 80 IT C3
S.NO FD
1 RNO→NAME
2 NAME→RNO
3 RNO→MARKS
4 DEPT→COURSE
5 COURSE→DEPT
6 MARKS→DEPT
7 RNO,NAME→MARKS
8 NAME→MARKS
9 NAME,MARKS→DEPT
10 NAME,MARKS→DEPT,COURSE
FD: a relationship that exists when one
attribute uniquely determines another
attribute.
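Whether each of the ten candidate FDs listed above is satisfied by the sample table can be checked directly from the tuple definition (t1.X = t2.X implies t1.Y = t2.Y). The sketch below is illustrative only; the helper name `holds` and the list layout are assumptions. Note that a table instance can only refute an FD: an FD that holds on these five rows is consistent with the data, but it is a real constraint only if it holds for every legal instance.

```python
COLUMNS = ["RNO", "NAME", "MARKS", "DEPT", "COURSE"]
ROWS = [
    (1, "A", 70, "CSE", "C1"),
    (2, "B", 60, "EEE", "C1"),
    (3, "A", 70, "CSE", "C2"),
    (4, "B", 60, "EEE", "C3"),
    (5, "C", 80, "IT", "C3"),
]


def holds(lhs, rhs, rows=ROWS, columns=COLUMNS):
    """True if all rows that agree on `lhs` also agree on `rhs`."""
    seen = {}
    for row in rows:
        rec = dict(zip(columns, row))
        key = tuple(rec[a] for a in lhs)
        val = tuple(rec[a] for a in rhs)
        # Remember the first rhs value seen for this lhs value.
        if seen.setdefault(key, val) != val:
            return False
    return True


# The ten candidate FDs from the slide, in the same order.
CANDIDATES = [
    (["RNO"], ["NAME"]),
    (["NAME"], ["RNO"]),
    (["RNO"], ["MARKS"]),
    (["DEPT"], ["COURSE"]),
    (["COURSE"], ["DEPT"]),
    (["MARKS"], ["DEPT"]),
    (["RNO", "NAME"], ["MARKS"]),
    (["NAME"], ["MARKS"]),
    (["NAME", "MARKS"], ["DEPT"]),
    (["NAME", "MARKS"], ["DEPT", "COURSE"]),
]

results = [holds(lhs, rhs) for lhs, rhs in CANDIDATES]
for number, ((lhs, rhs), ok) in enumerate(zip(CANDIDATES, results), start=1):
    print(number, ",".join(lhs), "->", ",".join(rhs), "holds" if ok else "violated")
```

On this instance FDs 1, 3, 6, 7, 8 and 9 are satisfied, while 2, 4, 5 and 10 are violated (for example, NAME A appears with both RNO 1 and RNO 3).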
25. Functional dependency(FD)
• Reasoning about functional dependencies:
• Armstrong's Axioms (inference rules): Armstrong's
axioms define a set of rules for reasoning about FDs and
for inferring all the FDs that hold on a relational database.
27. Fully Functional Dependency
• If x and y are attributes of an entity set in a table such that
y is functionally dependent on x, but not on any proper
subset of x, then this type of dependency is called a Fully
Functional Dependency.
Eg: RollNo, SubName -> Marks.
28. Partial Functional Dependency
➢ If x and y are attributes of an entity set in a table such
that:
✓ y is functionally dependent on x, and
removing some attributes from x does not
affect the dependency, then this type of
dependency is called a Partial Functional
Dependency.
• Eg: emp_id, emp_name -> salary (salary already depends on emp_id alone).
29. Transitive Functional Dependency
If x, y, z are attributes of an entity set in a table such that
y is functionally dependent on x (x -> y) and z is functionally
dependent on y (y -> z), then z is transitively dependent
on x through y (x -> z).
Eg: Students -> Teachers
Teachers -> Management
together imply Students -> Management
30. Functional dependency
➢ Trivial Functional Dependency
• In a trivial functional dependency, the dependent is always a subset of the
determinant.
i.e. If X → Y and Y is a subset of X, then it is called a trivial functional
dependency.
• Here, {roll_no, name} → name is a trivial functional dependency.
Similarly, roll_no → roll_no is also an example of a trivial FD.
31. Functional dependency
• Non-trivial Functional Dependency
• In Non-trivial functional dependency, the dependent is
strictly not a subset of the determinant.
• i.e. If X → Y and Y is not a subset of X, then it is called
Non-trivial functional dependency.
• Here, roll_no → name is a non-trivial functional dependency,
• Similarly, {roll_no, name} → age is also a non-trivial
functional dependency
40. Properties of Decomposition
1. Lossless decomposition
• Lossless decomposition ensures-
• No information is lost from the original relation during
decomposition.
• When the sub relations are joined back, the same relation is
obtained that was decomposed.
• Every decomposition must always be lossless.
41. Properties of Decomposition
2. Dependency Preservation
• Dependency preservation ensures-
• None of the functional dependencies that hold on the original
relation are lost.
• The sub relations still hold or satisfy the functional
dependencies of the original relation.
42. Types of Decomposition
• Decomposition of a relation can be carried out in the following
two ways-
1. Lossless join decomposition
2. Lossy join decomposition
43. Types of Decomposition
1.Lossless Join Decomposition
• Consider there is a relation R which is decomposed into sub
relations R1 , R2 , …. , Rn.
• This decomposition is called lossless join decomposition when
the join of the sub relations results in the same relation R that
was decomposed.
• For lossless join decomposition, we always have-
• R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn = R, where ⋈ is the natural join
operator
44. Types of Decomposition
• Example
Consider the following relation R( A , B , C ) decomposed into two sub
relations R1( A , B ) and R2( B , C ).
R( A , B , C )
A B C
1 2 1
2 5 3
3 3 3
R1( A , B )
A B
1 2
2 5
3 3
R2( B , C )
B C
2 1
5 3
3 3
• Now, if we perform the natural join ( ⋈ ) of the sub relations R1 and
R2, we get a relation that is the same as the original relation R.
• Thus, we conclude that the above decomposition is a lossless join
decomposition.
45. Types of Decomposition
2.Lossy Join Decomposition
• Consider there is a relation R which is decomposed into sub
relations R1 , R2 , …. , Rn.
• This decomposition is called lossy join decomposition when
the join of the sub relations does not result in the same
relation R that was decomposed.
• The natural join of the sub relations is always found to have
some extraneous tuples.
• For lossy join decomposition, we always have
• R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn ⊃ R, where ⋈ is the natural join
operator
46. Types of Decomposition
• Example
Consider the following relation R( A , B , C ) decomposed into two sub
relations R1( A , C ) and R2( B , C ).
R( A , B , C )
A B C
1 2 1
2 5 3
3 3 3
R1( A , C )
A C
1 1
2 3
3 3
R2( B , C )
B C
2 1
5 3
3 3
• For lossy decomposition, we must have-
• R1 ⋈ R2 ⊃ R
• Now, we perform the natural join ( ⋈ ) of the sub relations
R1 and R2.
47. Types of Decomposition
Original relation R:
A B C
1 2 1
2 5 3
3 3 3
R1 ⋈ R2:
A B C
1 2 1
2 5 3
2 3 3
3 5 3
3 3 3
• The joined relation is not the same as the original relation R and
contains the extraneous tuples (2, 3, 3) and (3, 5, 3).
• Clearly, R1 ⋈ R2 ⊃ R.
• Thus, we conclude that the above decomposition is a lossy join
decomposition.
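Both decompositions of R(A, B, C) can be checked by projecting R and joining the projections back. The sketch below is one possible way to do this, assuming each tuple is stored as a frozenset of (column, value) pairs; the helper names `relation`, `project` and `natural_join` are illustrative, not part of the slides.

```python
def relation(columns, rows):
    """Build a relation as a set of tuples; each tuple is a frozenset
    of (column, value) pairs so that column order does not matter."""
    return {frozenset(zip(columns, row)) for row in rows}


def project(rel, keep):
    """Projection: keep only the named columns and drop duplicate rows."""
    return {frozenset((c, v) for c, v in t if c in keep) for t in rel}


def natural_join(left, right):
    """Natural join: combine tuples that agree on all shared columns."""
    out = set()
    for l in left:
        lvals = dict(l)
        for r in right:
            if all(lvals.get(c, v) == v for c, v in r):
                out.add(l | r)
    return out


R = relation(["A", "B", "C"], [(1, 2, 1), (2, 5, 3), (3, 3, 3)])

# Decomposition into R1(A, B) and R2(B, C).
lossless = natural_join(project(R, {"A", "B"}), project(R, {"B", "C"}))

# Decomposition into R1(A, C) and R2(B, C).
lossy = natural_join(project(R, {"A", "C"}), project(R, {"B", "C"}))

print(lossless == R)   # the first decomposition gives R back
print(len(lossy - R))  # number of extraneous tuples in the second
```

The second join contains every tuple of R plus two extraneous ones, which is exactly the R1 ⋈ R2 ⊃ R situation described for lossy join decomposition.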
48. NORMALIZATION
Normalization is the process of organizing data in a
database so that it meets two basic requirements:
1. There is no redundancy of data (all data is stored
in only one place).
2. Anomalies are removed.
49. Introduction
➢ The normalization process was proposed by Codd in 1972.
➢ It takes a relation schema through a series
of tests to "certify" whether it satisfies a certain normal form.
➢ The process proceeds in a top-down
fashion by evaluating each relation against the criteria for
normal forms and decomposing as necessary, i.e. "Relational
Design by Analysis".
➢ Normalization of data ("Relational Design by Analysis"):
the process of analysing the given relation schemas
based on their FDs and primary keys to achieve the desirable properties.
50. Normalization:
• Normalization is a RDBMS design concept which is a
process of designing the database structure such that it
minimizes the data redundancy and also data anomalies.
• The process of normalization includes a series of stages
known as Normal Forms.
• Normalization rules are divided into following normal form :
1. First Normal Form (1 NF)
2. Second Normal Form (2 NF)
3. Third Normal Form (3 NF)
4. Boyce-Codd Normal Form (BCNF)
51. First Normal Form (1NF)
• A relation is in 1NF if every attribute is a single-valued
attribute or it does not contain any multi-valued or composite
attribute, i.e., every attribute is an atomic attribute.
• If there is a composite or multi-valued attribute, it violates the
1NF.
• To solve this, we can create a new row for each of the values
of the multi-valued attribute to convert the table into the 1NF.
OR
• First Normal Form (1 NF): A relation is in First Normal
Form if and only if all underlying domains contain atomic
values only, i.e. the relation does not have
multivalued attributes.
• For ex: Consider a STUDENT (Sid, Sname, Cname) relation.
54. 1NF
Rollno Name Course
1 A C/JAVA
2 B DBMS
3 C OS
4 D JAVA/C++
5 E C
Rollno Name Course
1 A C
1 A JAVA
2 B DBMS
3 C OS
4 D JAVA
4 D C++
5 E C
55. 1NF
Rollno Name Course1 Course2
1 A C JAVA
2 B DBMS NULL
3 C OS NULL
4 D JAVA C++
5 E C NULL
Rollno Name Course
1 A C/JAVA
2 B DBMS
3 C OS
4 D JAVA/C++
5 E C
56. 1NF
Rollno Name Course
1 A C/JAVA
2 B DBMS
3 C OS
4 D JAVA/C++
5 E C
Rollno Name
1 A
2 B
3 C
4 D
5 E
Rollno Course
1 C
1 JAVA
2 DBMS
3 OS
4 JAVA
4 C++
5 C
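The row-per-value conversion and the two-table split shown in the 1NF slides can be sketched in a few lines. This is an illustration only: it assumes the multi-valued Course column is stored as a string with "/" as separator, as in the sample table, and the names `to_1nf`, `student` and `enrolment` are made up for the example.

```python
students = [
    (1, "A", "C/JAVA"),
    (2, "B", "DBMS"),
    (3, "C", "OS"),
    (4, "D", "JAVA/C++"),
    (5, "E", "C"),
]


def to_1nf(rows, sep="/"):
    """Emit one row per course so that every value is atomic."""
    return [
        (rollno, name, course)
        for rollno, name, courses in rows
        for course in courses.split(sep)
    ]


flat = to_1nf(students)

# Alternative shown on the last 1NF slide: split into two tables.
student = sorted({(rollno, name) for rollno, name, _ in flat})
enrolment = [(rollno, course) for rollno, _, course in flat]

for row in flat:
    print(row)
```

The flat version repeats Name for every course, which is why the slides also show the split into (Rollno, Name) and (Rollno, Course).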
57. Second normal form (2NF)
• Second normal form (2NF) is the second step in normalizing a
database. 2NF builds on the first normal form (1NF).
• Definition. A relation schema R is in second normal form
(2NF) if every nonprime attribute A in R is not partially
dependent on any key of R.
• The test for 2NF involves testing for functional dependencies
whose left-hand side attributes are part of the primary key. If
the primary key contains a single attribute, the test need not
be applied at all.
58. Second Normal Form (2NF)
➢ The normalization of 1NF relations to 2NF involves the elimination of
partial dependencies.
➢ A partial dependency exists when a non-prime attribute (i.e., an attribute
that is not part of any candidate key) is not fully functionally dependent on one of the
candidate keys.
➢ For a relational table to be in second normal form, it must satisfy the following
rules:
1. The table must be in first normal form.
2. It must not contain any partial dependency, i.e., all non-prime attributes
are fully functionally dependent on the primary key.
➢ Prime attribute: an attribute of R that is a member of some candidate key of
R.
➢ Non-prime attribute: an attribute that is not a member of any
candidate key.
➢ Candidate key: if a relation has more than one key, each is called a candidate key.
➢ Primary key: one of the candidate keys is arbitrarily designated as the primary
key; the others are called secondary keys.
60. Second Normal Form (2NF)
Relation R is in Second Normal Form (2NF) if and only if:
1. R is in 1NF and
2. R does not contain any partial dependency.
Partial Dependency:
Let R be a relational schema and X, Y, A be attribute sets over R.
X: any candidate key
Y: proper subset of a candidate key
A: non-prime attribute
If Y → A holds in R, then Y → A is a partial dependency and R is not in 2NF.
61. Removal of Partial Dependency
• If there is any partial dependency, remove partially dependent
attributes from original table, place them in a separate table along
with a copy of its determinant.
• Example 1 :Consider the relation
Student(SID, Sname, Cname) :which is in 1 NF (No Multi-Valued-
Attributes) :
63. Removal of Partial Dependency
The above two relations R1 and R2 are lossless-join and dependency
preserving. They are in 2NF. There is less redundancy in 2NF
than in 1NF, but 2NF is not free from redundancy.
64. 2NF
Employee Code   Project ID   Employee Name   Project Name
101             P03          John            Project103
101             P01          John            Project101
102             P04          Ryan            Project104
103             P02          Stephanie       Project102
Employee Code   Employee Name
101             John
102             Ryan
103             Stephanie
Employee Code   Project ID
101             P03
101             P01
102             P04
103             P02
Project ID   Project Name
P03          Project103
P01          Project101
P04          Project104
P02          Project102
65. 2NF
• A relation is in the 2nd Normal Form when it is in
the First Normal Form and no non-prime attribute is
functionally dependent on a proper subset of any candidate key.
A non-prime attribute is an attribute that is not part of
any candidate key of the relation.
• Check whether the given relation is in 2NF or not
R(ABCDEF)
FDs: {C→F, E→A, EC→D, A→B}
EC+ = ECFABD
Prime Attributes = {C, E}
Non-Prime Attributes = {A, B, D, F}
It is not in 2NF because C→F is a partial dependency
(part of a candidate key determines a non-prime attribute); E→A is partial for the same reason.
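The 2NF check above can be reproduced programmatically. The following is a rough sketch, assuming FDs are given as pairs of attribute sets and candidate keys are found by brute force over attribute subsets (adequate for small classroom examples only); the function names are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            # Skip supersets of keys already found (they are not minimal).
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys


def partial_dependencies(attributes, fds):
    """Pairs (part of a key, non-prime attributes it determines)."""
    keys = candidate_keys(attributes, fds)
    prime = set().union(*keys)
    found = []
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(sorted(key), size):
                dependents = closure(set(part), fds) - set(part) - prime
                if dependents:
                    found.append((set(part), dependents))
    return found


R = set("ABCDEF")
FDS = [(set("C"), set("F")), (set("E"), set("A")),
       (set("EC"), set("D")), (set("A"), set("B"))]

keys = candidate_keys(R, FDS)
violations = partial_dependencies(R, FDS)
print(keys, violations)
```

For this relation the only candidate key is {C, E}; C alone determines F and E alone determines A and B, so the relation is not in 2NF.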
66. Example: 2NF
• Let's assume a school stores the data of teachers and the subjects
they teach.
• In a school, a teacher can teach more than one subject.
• In the given table, non-prime attribute TEACHER_AGE is dependent on
TEACHER_ID which is a proper subset of a candidate key. That's why it
violates the rule for 2NF.
• To convert the given table into 2NF, we decompose it into two tables:
68. Third Normal Form (3NF)
➢ A relation is in 3NF if it is in 2NF and does not
contain any transitive dependency of a non-prime attribute on a key.
➢ If a 2NF relation has no transitive dependency for non-prime
attributes, then the relation is in 3NF.
➢ 3NF is used to reduce data duplication.
➢ It is also used to achieve data integrity.
➢ A relation is in 3NF if it holds at least one of the
following conditions for every non-trivial functional
dependency X → Y:
1. X is a super key.
2. Y is a prime attribute, i.e., each element of Y is part
of some candidate key.
1. A super key is a combination of columns that uniquely identifies
any row within an (RDBMS) table.
2. A candidate key is a closely related concept where the super key
is reduced to the minimum number of columns required to uniquely
identify each row.
A non-trivial functional dependency occurs when
A → B holds and B is not a subset of A.
69. Example: EMPLOYEE_DETAIL table
Candidate Key: {EMP_ID}
Super keys in the table: {EMP_ID}, {EMP_ID, EMP_NAME},
{EMP_ID, EMP_NAME, EMP_ZIP}, and so on.
Non-prime Attributes: in the given table, all
attributes except EMP_ID are non-prime. Here,
1. EMP_STATE and EMP_CITY are dependent on EMP_ZIP.
2. EMP_ZIP is dependent on EMP_ID.
➢ The non-prime attributes (EMP_STATE and EMP_CITY) are
transitively dependent on the super key (EMP_ID).
➢ This violates the rule of third normal form.
➢ That is why we need to move EMP_CITY and EMP_STATE
to a new <EMPLOYEE_ZIP> table, with EMP_ZIP
as its primary key.
72. Third Normal Form
• Third Normal Form: Let R be the relational schema, R is in
3NF only if :
1. R should be in 2NF.
2. R should not contain transitive dependencies.
73. Removal of Transitive Dependency
• If there is any transitive dependency in the relation, then
1. Create a separate relation and copy the dependent
attributes into it along with a copy of their determinant, and
remove the dependent attributes from the original table.
2. Mark the determinant as a foreign key in the original
relation and as the primary key in
the separate relation.
76. 3NF
Student ID   Student Name   ZIP      City
1            A              500064   Hyderabad
2            B              200065   Secunderabad
3            C              500064   Hyderabad
Student ID   Student Name   ZIP
1            A              500064
2            B              200065
3            C              500064
ZIP      City
500064   Hyderabad
200065   Secunderabad
78. BCNF
• BCNF is the advanced version of 3NF. It is stricter than 3NF.
• A table is in BCNF if, for every non-trivial functional dependency X → Y, X
is a super key of the table.
• For BCNF, the table should be in 3NF, and for every FD, the LHS
is a super key.
• Example: Let's assume there is a company
where employees work in more than one department.
79. In the above table, the functional dependencies are as follows:
Candidate key: {EMP_ID, EMP_DEPT}
FD:
EMP_ID → EMP_COUNTRY
EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
The table is not in BCNF because neither EMP_DEPT nor
EMP_ID alone is a key.
To convert the given table into BCNF, we decompose it
into three tables:
80. • EMP_COUNTRY table; EMP_DEPT_MAPPING table:
Candidate keys:
For the first table: EMP_ID
For the second table: EMP_DEPT
For the third table: {EMP_ID, EMP_DEPT}
Now, this is in BCNF because the left-hand side
of both functional dependencies
is a key.
82. BCNF – Boyce Codd Normal Form:
• Let R be the relational schema. R is in BCNF only if:
1. R is in 3NF.
2. Every non-trivial functional dependency has a superkey on the LHS,
i.e. all determinants are superkeys.
• Example : Consider the following relationship R(ABCD) having
following functional dependencies:
83. BCNF
• Boyce–Codd Normal Form (BCNF) is based on functional
dependencies that take into account all candidate keys in a
relation; however, BCNF also has additional constraints
compared with the general definition of 3NF.
• A relation is in BCNF if and only if every determinant is a
candidate key.
• The LHS must be a candidate key or super key.
84. BCNF
• Rollno→name
• Rollno→voterid
• Voterid→age
• Voterid→rollno
Rollno Name Voterid Age
1 A 12MBVH 20
2 B 15YWSFI 21
3 C 18IYDBJP 19
4 D 76GFFSO 20
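For the four FDs listed on this slide, the BCNF condition (every determinant is a super key) can be tested with attribute closures. This is a minimal sketch; the function names and the set-based encoding of the FDs are assumptions for illustration.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Non-trivial FDs whose left-hand side is not a super key."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != attributes
    ]


ATTRS = {"rollno", "name", "voterid", "age"}
FDS = [
    ({"rollno"}, {"name"}),
    ({"rollno"}, {"voterid"}),
    ({"voterid"}, {"age"}),
    ({"voterid"}, {"rollno"}),
]

violations = bcnf_violations(ATTRS, FDS)
print(violations)  # an empty list means no FD breaks the BCNF condition
```

Both rollno and voterid determine every attribute, so both are candidate keys and no FD violates BCNF.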
87. BCNF
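• A 3NF decomposition can always be made dependency preserving,
whereas a BCNF decomposition cannot always.
• Both 3NF and BCNF decompositions can be made lossless.
• Example: R(ABCD) with FDs
• AB→CD,
• D→A
is decomposed into R1(AD) and R2(BCD). The decomposition is
lossless, but AB→CD is not preserved.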
88. Examples
• Find the highest normal form of a relation R(A, B, C, D, E)
with FD set as:
1. { BC->D, AC->BE, B->E }
2. {AB ->C,C ->B,AB ->B}
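One way to work through these exercises is to test the normal forms from the strictest downward. The sketch below assumes both FD sets are defined over the same relation R(A, B, C, D, E), as the exercise states, and that the relation is already in 1NF; candidate keys are found by brute force, which is only practical for small examples. All function names are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys


def highest_normal_form(attributes, fds):
    """Return 'BCNF', '3NF', '2NF' or '1NF' (1NF is assumed to hold)."""
    keys = candidate_keys(attributes, fds)
    prime = set().union(*keys)
    # Drop the trivial part of each right-hand side.
    nontrivial = [(lhs, rhs - lhs) for lhs, rhs in fds if rhs - lhs]

    def is_superkey(x):
        return closure(x, fds) == attributes

    if all(is_superkey(lhs) for lhs, _ in nontrivial):
        return "BCNF"
    if all(is_superkey(lhs) or rhs <= prime for lhs, rhs in nontrivial):
        return "3NF"
    # 2NF fails if part of a candidate key determines a non-prime attribute.
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(sorted(key), size):
                if closure(set(part), fds) - set(part) - prime:
                    return "1NF"
    return "2NF"


R = set("ABCDE")
FDS_1 = [(set("BC"), set("D")), (set("AC"), set("BE")), (set("B"), set("E"))]
FDS_2 = [(set("AB"), set("C")), (set("C"), set("B")), (set("AB"), set("B"))]

print(highest_normal_form(R, FDS_1))
print(highest_normal_form(R, FDS_2))
```

Under these assumptions the first set has the single candidate key AC and reaches 2NF (BC→D breaks 3NF). For the second set, D and E appear in no FD, so every key contains them; the keys are ABDE and ACDE, every attribute is prime, and the result is 3NF (AB→C breaks BCNF because AB is not a super key).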
90. Multivalued Dependency (prerequisites: functional dependency,
database normalization)
• Multivalued dependency: arises when two or more independent
relationships are kept in a single relation, OR
• Multivalued dependency occurs when the presence of one or
more rows in a table implies the presence of one or more
other rows in that same table.
✓ Put another way, two attributes (or columns) in a table
are independent of one another, but both depend on a
third attribute.
• A Multivalued Dependency always requires at least three
attributes because it consists of at least two attributes that
are dependent on a third.
91. • For a dependency A ->> B, if for a single value of A multiple
values of B exist, then the table may have a multi-valued
dependency.
• The table should have at least 3 attributes, and B and C should
be independent of each other, for A ->> B to be a multivalued dependency.
• Example:
92. 4NF
➢ For a table to be in Fourth Normal Form, it should satisfy the following two conditions:
➢ It should be in the Boyce-Codd Normal Form.
➢ And, the table should not have any multi-valued dependency.
➢ A table is said to have a multi-valued dependency (MVD) if the
following conditions are true:
1. For a dependency A ->> B, if for a single value of A multiple
values of B exist, then the table may have a multi-valued
dependency.
2. Also, a table should have at least 3 columns for it to have a
multi-valued dependency.
3. And, for a relation R(A, B, C), if there is a multi-valued
dependency between A and B, then B and C should be
independent of each other.
✓ If all these conditions are true for a relation (table), it is said to
have a multi-valued dependency.
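The three conditions above amount to this: for each value of A, every B value must appear with every C value. A minimal sketch of that test on a table instance follows. The table (student, course, hobby) and its rows are hypothetical data invented for this illustration, and `mvd_holds` is an assumed helper name.

```python
from itertools import product


def mvd_holds(rows, columns, lhs, rhs):
    """Check lhs ->> rhs on a table instance.

    For each lhs value, the (rhs, rest) combinations present must be
    the full cross product of the rhs values and the remaining values.
    """
    rest = [c for c in columns if c not in lhs and c not in rhs]
    groups = {}
    for row in rows:
        rec = dict(zip(columns, row))
        x = tuple(rec[c] for c in lhs)
        y = tuple(rec[c] for c in rhs)
        z = tuple(rec[c] for c in rest)
        groups.setdefault(x, set()).add((y, z))
    for pairs in groups.values():
        ys = {y for y, _ in pairs}
        zs = {z for _, z in pairs}
        if pairs != set(product(ys, zs)):
            return False
    return True


# Hypothetical data: courses and hobbies of a student are independent.
COLUMNS = ["student", "course", "hobby"]
ROWS = [
    ("s1", "DBMS", "Chess"),
    ("s1", "DBMS", "Cricket"),
    ("s1", "OS", "Chess"),
    ("s1", "OS", "Cricket"),
    ("s2", "DBMS", "Chess"),
]

full = mvd_holds(ROWS, COLUMNS, ["student"], ["course"])
broken = mvd_holds(ROWS[:-2] + ROWS[-1:], COLUMNS, ["student"], ["course"])
print(full, broken)
```

With all four s1 rows present, student ->> course holds; dropping the row (s1, OS, Cricket) breaks the cross product, so the check fails. A table like this stores each course once per hobby, which is the redundancy 4NF removes by splitting it into (student, course) and (student, hobby).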
93. Fourth Normal Form (4NF)
• The Fourth Normal Form (4NF) is a level of database
normalization where there are no non-trivial multivalued
dependencies other than those whose determinant is a candidate key.
• It builds on the first three normal forms (1NF, 2NF, and 3NF)
and the Boyce-Codd Normal Form (BCNF).
• It states that, in addition to meeting the
requirements of BCNF, a table must not contain any non-trivial
multivalued dependency whose determinant is not a key.
• Properties
• A relation R is in 4NF if and only if the following conditions are
satisfied:
1. It should be in the Boyce-Codd Normal Form (BCNF).
2. The table should not have any multi-valued dependency.
94. Fourth Normal Form (4NF)
• A table with a multivalued dependency violates the
normalization standard of the Fourth Normal Form (4NF)
because it creates unnecessary redundancies and can contribute
to inconsistent data.
• To bring this up to 4NF, it is necessary to break this
information into two tables.
• Example: Consider the database table of a class that has two
relations R1 contains student ID(SID) and student name
(SNAME) and R2 contains course id(CID) and course name
(CNAME).
99. 5NF
• A relation is in 5NF if:
1. It is in 4NF, does not contain any join dependency that is not
implied by its candidate keys, and joining is lossless.
✓ 5NF is satisfied when all the tables are broken into as many
tables as possible in order to avoid redundancy.
• 5NF is also known as Project-join normal form (PJ/NF).
✓ If the join of R1 and R2 is equal to relation R, then we can
say that a join dependency (JD) exists.
✓ Alternatively, R1 and R2 are a lossless decomposition of R.
• A JD ⋈ {R1, R2,..., Rn} is said to hold over a relation R if R1,
R2,....., Rn is a lossless-join decomposition.
100. Join dependency or JD
• Join dependency (JD) is a constraint that is similar to FD (functional
dependency) or MVD (multivalued dependency).
• A JD is satisfied only when the concerned relation is the join of a
specific number of its projections.
• Projection is defined as taking a vertical subset of the columns of a single
table, retaining only the unique rows.
• This kind of SELECT statement returns some of the columns and all the rows in a
table.
• Whenever we can recreate a table by simply joining various
tables where each of these tables consists of a subset of the
table's attributes, the table is said to satisfy a join dependency.
• Thus, it is a generalization of MVD, and we can relate the JD to
5NF.
• A relation can be in 5NF only when it is already in 4NF and
cannot be further decomposed without loss.
101. Example 1: Join Dependency
Student
✓ We can decompose the table given above into these three
tables given below.
✓ And thus, it is not in the Fifth Normal Form.
102. Our Join Dependency would be:
{(Stu_Name, Stu_Skills), (Stu_Name, Stu_Job), (Stu_Skills, Stu_Job)}
The relations given above have a join dependency. Thus, the original
relation is not in 5NF.
It means that the join of the three relations given above is
equal to the original relation <Student>.
103. Example 2 FD
• Let us consider some special classes of join dependencies that
help us in capturing data dependencies that are present in a
data structure that is hierarchical in nature.
• This hierarchical organisation informs the reader about the
rooms, and the students currently living in the room depend
only on the hostel but not the utilities present in that hostel.
• Since hostels have multiple rooms, FDs are NOT adequate
when we want to describe the data dependency among hostels
and rooms or utilities.
104. Example 2 FD
• In such a case, the multivalued dependencies,
Hostel ->-> room or
Hostel ->-> utilities hold
• Thus, using the first-order hierarchical decomposition, one
would be enabled to represent data dependencies that are
present in a hierarchical data structure in a natural way.
• Thus, one can store the hostel database as a lossless join of
the following:
Hostel_utility (hostel, utilities),
Hostel_room (hostel, room, student, syllabus, classes, teacher)
105. Example 3-FD
• The relation X satisfies a join dependency whenever X is equal to the
join of X1, X2, ….. Xn,
where each Xi is a subset of the set of attributes of X.
RELATION
Here,
sec ->-> name
sec ->-> language
The relation given is in 4NF.
Anomalies can still occur in relations that are in 4NF if the
primary key contains three or more fields.
Here, the primary key is (sec, language, name). Sometimes, when
we decompose a relation into two smaller relations, the redundancy
is not removed.
In such cases, the relation can be decomposed into three or more
relations using the Fifth Normal Form.
106. • Thus, the relation given above says that sec offers many
elective languages that are taken by a combination of their
students. These students have their individual opinion to
choose their languages.
Thus, all three fields are required to represent this data
and information.
• This relation does not display non-trivial MVDs. It is because
the attributes, language and name, are dependent.
Thus, these are related to one another (A Functional
Dependency subject -> the existing name). This relation
cannot be decomposed into two relations (sec, language)
and (sec, name).
107. • Thus, we can decompose this relation into the following
three relations:
• X1(sec, language)
• X2(sec, name) and
• X3(language, name), and we can show that this
decomposition is lossless in nature.
109. Characteristics of Join Dependency in DBMS
• The join decomposition is like a further generalization of the
Multivalued dependencies.
• In case the join of X1 and X2 over C is equal to relation X, then
one can say that there exists a join dependency (JD).
✓ Where X1 and X2 are the decompositions X1(A, B, C) and
X2(C, D) of a given relation X (A, B, C, D).
✓ Alternatively, X1 and X2 are lossless forms of decomposition
of X.
Join Dependency: A JD ⋈ {X1, X2,…, Xn} holds over a relation X if
X1, X2,….., Xn is a lossless-join decomposition of X.
• *((A, B, C), (C, D)) is a join dependency of X if the
join of these projections over the join attribute C is equal to the relation X.
• Here, we use *(X1, X2, X3) to indicate that the relations X1, X2, X3
and so on are a join decomposition of X.
110. • Let R be a relation schema and R1, R2, R3, …, Rn be a
decomposition of R. r(R) is said to satisfy the join dependency if
and only if
π_R1(r) ⋈ π_R2(r) ⋈ … ⋈ π_Rn(r) = r
112. Fifth Normal Form / Project-Join Normal Form (5NF)
• A relation R is in Fifth Normal Form if and only if every join
dependency in R is implied by the candidate keys of R.
• A relation decomposed into two relations must have the lossless join
property, which ensures that no spurious or extra tuples are
generated when relations are reunited through a natural join.
Properties
• A relation R is in 5NF if and only if it satisfies the following
conditions:
1. R should already be in 4NF.
2. It cannot be further non-loss decomposed (join dependency).
114. 5NF
Agent Company Product_Name
Suneet ABC Pendrive
Suneet ABC MIC
Suneet CDE Speaker
Raj ABC Speaker
Agent Company
Suneet ABC
Suneet CDE
Raj ABC
Agent Product_Name
Suneet Pendrive
Suneet MIC
Suneet Speaker
Raj Speaker
Company Product_Name
ABC Pendrive
ABC MIC
ABC Speaker
CDE Speaker
115. 5NF
(Agent, Company) ⋈ (Agent, Product_Name):
Agent Company Product_Name
Suneet ABC Pendrive
Suneet ABC MIC
Suneet ABC Speaker
Suneet CDE Pendrive
Suneet CDE MIC
Suneet CDE Speaker
Raj ABC Speaker
Joining this result with (Company, Product_Name):
Agent Company Product_Name
Suneet ABC Pendrive
Suneet ABC MIC
Suneet ABC Speaker
Suneet CDE Speaker
Raj ABC Speaker
116. 5NF
SUBJECT LECTURER SEMESTER
Computer Anshika Semester 1
Computer John Semester 1
Math John Semester 1
Math Akash Semester 2
Chemistry Praveen Semester 1
SUBJECT LECTURER
Computer Anshika
Computer John
Math John
Math Akash
Chemistry Praveen
SEMESTER SUBJECT
Semester 1 Computer
Semester 1 Math
Semester 1 Chemistry
Semester 2 Math
SEMESTER LECTURER
Semester 1 Anshika
Semester 1 John
Semester 2 Akash
Semester 1 Praveen
117. 5NF
(SUBJECT, LECTURER) ⋈ (SEMESTER, SUBJECT):
SUBJECT LECTURER SEMESTER
Computer Anshika Semester 1
Computer John Semester 1
Math John Semester 1
Math John Semester 2
Math Akash Semester 1
Math Akash Semester 2
Chemistry Praveen Semester 1
Joining this result with (SEMESTER, LECTURER) gives back the original relation:
SUBJECT LECTURER SEMESTER
Computer Anshika Semester 1
Computer John Semester 1
Math John Semester 1
Math Akash Semester 2
Chemistry Praveen Semester 1