UNIT 4: Data Normalization: Anomalies in relational database design.
Decomposition. Functional dependencies. Normalization. First normal
form,
Second normal form, Third normal form. Boyce-Codd normal form.
Dr Hamela K
Dept of Computer Science
GFGC, Malur
Prepared by Dr Hamela, GFGC, Malur
Types of Anomalies in DBMS
Insert Anomaly: The term "insertion anomaly" is used to describe when a
new row is added to a table and it causes an inconsistency.
Update Anomaly: If there are some changes in the database, we have to
apply that change in all the rows. And if we miss any row, we will have one
more field, creating an update anomaly in the database.
Delete Anomaly: The term "deletion anomaly in the database" is used
when we delete some rows from a table and any necessary additional
information or data is also lost from the database.
Prepared by Dr Hamela, GFGC, Malur
Assume a manufacturing company stores employee details in a table called Employee
having four attributes:
Prepared by Dr Hamela, GFGC, Malur
Insert anomaly
If there is a new row inserted in the table and it creates the inconsistency in the table
then it is called the insertion anomaly.
Example
Assume that a new employee is joining the company under training and not assigned to
any department. Then, we would not insert the data into the table if the emp_dept field
doesn't allow nulls.
Prepared by Dr Hamela, GFGC, Malur
Update anomaly
When we update some rows in the table, and if it leads to the
inconsistency of the table then this anomaly occurs. This type
of anomaly is known as an updation anomaly.
Example
In the given table, we have two rows for an employee named
Rick, and he belongs to two different departments of the
company. If we need to update Rick's address, we must update
the same address in two rows. Otherwise, the data will become
inconsistent.
If, in some way, we can update the correct address in one
department but not the other, then according to the database,
Rick will have two different addresses, which is not correct and
would lead to inconsistent data.
Prepared by Dr Hamela, GFGC, Malur
Delete anomaly
If we delete some rows from the table and if
any other information or data which is required
is also deleted from the database, this is called
the deletion anomaly in the database.
Example
Assume that if the company closes the
department D890, then deleting the rows that
have emp_dept as D890 would also delete the
information of employee Maggie since she is
assigned only to this department.
Prepared by Dr Hamela, GFGC, Malur
DECOMPOSITION
• Decomposition can be defined as a database management
system process for dividing a single relation into multiple sub-
relations.
• Its main purpose is to break down the functions of a company
into fine levels of detail.
• It eliminates the anomalies and redundancy from the database
by breaking it up into many different tables.
Prepared by Dr Hamela, GFGC, Malur
Types of decomposition
Prepared by Dr Hamela, GFGC, Malur
In a Lossy Decomposition, the relation needs to be decomposed into two or
more relational schemas. There is no way that loss of information can be
avoided during the retrieval of the original relation.
Now, we won’t be
able to join the above
tables, since Emp_ID
isn’t part of the
DeptDetails relation.
Therefore, the above
relation has lossy
decomposition.
Prepared by Dr Hamela, GFGC, Malur
Lossless Decomposition
Decomposition is lossless if it is feasible to reconstruct relation R from decomposed tables
using Joins. This is the preferred choice. The information will not lose from the relation
when decomposed. The join would result in the same original relation.
Prepared by Dr Hamela, GFGC, Malur
Prepared by Dr Hamela, GFGC, Malur
Advantages of decomposition in DBMS?
Easy use of Codes
The availability of decomposition makes it easier for programs to copy and reuse important codes for
other works in DBMS. It only not helps in saving lots of time but also makes things convenient for the
users.
Finding Mistakes
Another reason the programmers opt for decomposition is to allow them conveniently complete complex
programs. The mistakes are quite easier to find with this sort of programming.
Problem-Solving Approach
It is considered a perfect problem-solving strategy using which complex computer programs can be
written easily. The users can precisely join tons of code together for adequate results.
Eliminating Errors
The biggest advantage of having decomposition in DBMS is eliminating the inconsistencies and
duplication to a greater extent. The data can be easily identified in cases when decomposition happens
in DBMS.
Prepared by Dr Hamela, GFGC, Malur
Properties of decomposition
The programmers must be aware of the main properties of decomposition in DBMS.
We have mentioned the major ones in detail below:
Attribute Preservation
The functional dependencies decompose the universal relation that attributes
preservation of decomposition.
Dependency Preservation
Dependency preservation can be defined as the functionality that features directly
in the relation schemas. There is a chance of dependency loss if the decomposition
is not preserved.
•No Redundancy
It is used for removing a few of the issues related to improper design, such as
redundancy, anomalies, and inconsistencies.
Prepared by Dr Hamela, GFGC, Malur
Issues of decomposition in DBMS?
There are many problems regarding the decomposition in DBMS mentioned below:
Redundant Storage
Many instances where the same information gets stored in a single place can confuse
the programmers. It will take lots of space in the system.
Insertion Anomalies
It isn’t essential for storing important details unless some kind of information is stored
in a consistent manner.
Deletion Anomalies
It isn’t possible to delete some details without eliminating any sort of information.
Prepared by Dr Hamela, GFGC, Malur
Functional Dependency
The functional dependency is a relationship that exists between
two attributes. It typically exists between the primary key and
non-key attribute within a table.
X → Y
The left side of FD is known as a determinant, the right side of
the production is known as a dependent.
For example:
Assume we have an employee table with attributes: Emp_Id,
Emp_Name, Emp_Address.
Here Emp_Id attribute can uniquely identify the Emp_Name
attribute of employee table because if we know the Emp_Id, we
can tell that employee name associated with it.
Functional dependency can be written as:
Emp_Id → Emp_Name
We can say that Emp_Name is functionally dependent on
Emp_Id.
Types of Functional dependency
Types of Functional dependency
1. Trivial functional dependency
A → B has trivial functional dependency if B is a subset of A.
The following dependencies are also trivial like: A → A, B → B
Example:
Consider a table with two columns Employee_Id and
Employee_Name.
{Employee_id, Employee_Name} → Employee_Id is a trivial
functional dependency as
Employee_Id is a subset of {Employee_Id, Employee_Name}.
Also, Employee_Id → Employee_Id and Employee_Name →
Employee_Name are trivial dependencies too.
2. Non-trivial functional dependency
A → B has a non-trivial functional dependency if B is not a
subset of A.
When A intersection B is NULL, then A → B is called as
complete non-trivial.
Example:
ID → Name,
Name → DOB
Armstrong's Axioms
If F is a set of functional dependencies then the closure
of F, denoted as F+, is the set of all functional
dependencies logically implied by F. Armstrong's
Axioms are a set of rules, that when applied
repeatedly, generates a closure of functional
dependencies.
Reflexive rule − If alpha is a set of attributes and beta
is_subset_of alpha, then alpha holds beta.
Augmentation rule − If a → b holds and y is attribute
set, then ay → by also holds. That is adding attributes
in dependencies, does not change the basic
dependencies.
Transitivity rule − Same as transitive rule in algebra, if
a → b holds and b → c holds, then a → c also holds. a
→ b is called as a functionally that determines b.
• Trivial Functional Dependency
• Trivial − If a functional dependency (FD) X → Y holds,
where Y is a subset of X, then it is called a trivial FD.
Trivial FDs always hold.
• Non-trivial − If an FD X → Y holds, where Y is not a
subset of X, then it is called a non-trivial FD.
• Completely non-trivial − If an FD X → Y holds, where
x intersect Y = Φ, it is said to be a completely non-
trivial FD.
The Purpose of Normalization
Normalization is a technique for producing a set of
relations with desirable properties, given the data
requirements of an enterprise.
The process of normalization is a formal method that
identifies relations based on their primary or candidate
keys and the functional dependencies among their
attributes.
Problem Without Normalization
Without Normalization, it becomes difficult to handle and update the database,
without facing data loss. Insertion, Updation and Deletion Anamolies are very
frequent if Database is not Normalized.
Purpose
• Normalization is the process of organizing the data in
the database.
• Normalization is used to minimize the redundancy
from a relation or set of relations. It is also used to
eliminate undesirable characteristics like Insertion,
Update, and Deletion Anomalies.
• Normalization divides the larger table into smaller
and links them using relationships.
• The normal form is used to reduce redundancy from
the database table.
Advantages of Normalization
• Normalization helps to minimize data
redundancy.
• Greater overall database organization.
• Data consistency within the database.
• Much more flexible database design.
• Enforces the concept of relational integrity.
Disadvantages of Normalization
We cannot start building the database before knowing
what the user needs.
The performance degrades when normalizing the
relations to higher normal forms, i.e., 4NF, 5NF.
It is very time-consuming and difficult to normalize
relations of a higher degree.
Careless decomposition may lead to a bad database
design and serious problems.
FIRST NORMAL FORM
• First Normal Form is a relation in which the
intersection of each row and column contains one
and only one value.
As per First Normal Form, no two Rows of data must contain repeating group
of information i.e each set of column must have a unique value, such that
multiple columns cannot be used to fetch the same row. Each table should be
organized into rows, and each row should have a primary key that
distinguishes it as unique.
The Primary key is usually a single column, but sometimes more than one
column can be combined to create a single primary key. For example consider
a table which is not in First normal form
Student Table :
Student
Age Subject
Adam 15 Biology, Maths
Alex 14 Maths
Stuart 17 Maths
In First Normal Form, any row must not have a column in which more
than one value is saved, like separated with commas. Rather than that,
we must separate such data into multiple rows.
Student Table following 1NF will be :
Student
Age Subject
Adam 15 Biology
Adam 15 Maths
Alex 14 Maths
Stuart 17 Maths
Using the First Normal Form, data redundancy increases, as there will be many
columns with same data in multiple rows but each row as a whole will be unique.
Full functional dependency
Full functional dependency indicates that if A and B are
attributes of a relation, B is fully functionally dependent on
A if B is functionally dependent on A, but not on any proper
subset of A.
A functional dependency A→B is partially dependent if there
is some attributes that can be removed from A and the
dependency still holds.
Second Normal Form (2NF)
Second normal form (2NF) is a relation that is in first
normal form and every non-primary-key attribute is fully
functionally dependent on the primary key.
The normalization of 1NF relations to 2NF involves the
removal of partial dependencies. If a partial dependency
exists, we remove the function dependent attributes from
the relation by placing them in a new relation along with
a copy of their determinant.
• In example of First Normal Form there are two rows for Adam, to include
multiple subjects that he has opted for. While this is searchable, and follows
First normal form, it is an inefficient use of space. Also in the above Table in
First Normal Form, while the candidate key is {Student, Subject}, Age of
Student only depends on Student column, which is incorrect as per Second
Normal Form. To achieve second normal form, it would be helpful to split
out the subjects into an independent table, and match them up using the
student names as foreign keys.
New Student Table following
2NF will be :
Student
Age
Adam 15
Alex 14
Stuart 17
In Student Table the candidate key will
be Student column, because all other
column i.e Age is dependent on it.
New Subject Table introduced for 2NF
will be :
Student
Subject
Adam Biology
Adam Maths
Alex Maths
Stuart Maths
• In Subject Table the candidate key will be {Student, Subject} column. Now,
both the above tables qualifies for Second Normal Form and will never
suffer from Update Anomalies. Although there are a few complex cases in
which table in Second Normal Form suffers Update Anomalies, and to
handle those scenarios Third Normal Form is there.
Third Normal Form (3NF)
Transitive dependency
A condition where A, B, and C are attributes of a relation such that
if A → B and B → C, then C is transitively dependent on A via B
(provided that A is not functionally dependent on B or C).
Third normal form (3NF)
A relation that is in first and second normal form, and in which
no non-primary-key attribute is transitively dependent on the
primary key.
The normalization of 2NF relations to 3NF involves the removal
of transitive dependencies by placing the attribute(s) in a new
relation along with a copy of the determinant.
• Student_Detail Table :
In this table Student_id is Primary key, but street, city and state depends upon Zip. The
dependency between zip and other fields is called transitive dependency. Hence to
apply 3NF, we need to move the street, city and state to new table, with Zip as primary key.
Third Normal form applies that every non-prime attribute of table must be
dependent on primary key, or we can say that, there should not be the case
that a non-prime attribute is determined by another non-prime attribute. So
this transitive functional dependency should be removed from the table and
also the table must be in Second Normal form. For example, consider a
table with following fields.
New Student_Detail Table :
Student_id Student_name DOB Zip
Address Table :
Zip Street city state
The advantage of removing transtive dependency is,
Amount of data duplication is reduced.
Data integrity achieved.
Boyce Codd normal form (BCNF)
• Boyce Codd normal form (BCNF)
• BCNF is the advance version of 3NF. It is stricter than
3NF.
• A table is in BCNF if every functional dependency X →
Y, X is the super key of the table.
• For BCNF, the table should be in 3NF, and for every
FD, LHS is super key.
EMP_ID EMP_COUNT
RY
EMP_DEPT DEPT_TYPE EMP_DEPT_N
O
264 India Designing D394 283
264 India Testing D394 300
364 UK Stores D283 232
364 UK Developing D283 549
Example: Let's assume there is a company where employees work in more than
one department.
EMPLOYEE table:
In the above table Functional dependencies are as follows:
1.EMP_ID → EMP_COUNTRY
2.EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
Candidate key: {EMP-ID, EMP-DEPT}
The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone are keys.
To convert the given table into BCNF, we decompose it into three tables:
EMP_ID EMP_COUN
TRY
264 India
264 India
EMP_COUNTRY table:
EMP_DEPT DEPT_TYP
E
EMP_DEPT
_NO
Designing D394 283
Testing D394 300
Stores D283 232
Developing D283 549
EMP_DEPT table:
EMP_ID EMP_DEPT
D394 283
D394 300
D283 232
D283 549
EMP_DEPT_MAPPING table:
Candidate keys:
For the first table: EMP_ID
For the second table: EMP_DEPT
For the third table: {EMP_ID, EMP_DEPT}
Now, this is in BCNF because the left side part of both the
functional dependencies is key.

transection mangemanrtin data bse mangement system .pdf

  • 1.
    UNIT 4: DataNormalization: Anomalies in relational database design. Decomposition. Functional dependencies. Normalization. First normal form, Second normal form, Third normal form. Boyce-Codd normal form. Dr Hamela K Dept of Computer Science GFGC, Malur Prepared by Dr Hamela, GFGC, Malur
  • 2.
    Types of Anomaliesin DBMS Insert Anomaly: The term "insertion anomaly" is used to describe when a new row is added to a table and it causes an inconsistency. Update Anomaly: If there are some changes in the database, we have to apply that change in all the rows. And if we miss any row, we will have one more field, creating an update anomaly in the database. Delete Anomaly: The term "deletion anomaly in the database" is used when we delete some rows from a table and any necessary additional information or data is also lost from the database. Prepared by Dr Hamela, GFGC, Malur
  • 3.
    Assume a manufacturingcompany stores employee details in a table called Employee having four attributes: Prepared by Dr Hamela, GFGC, Malur
  • 4.
    Insert anomaly If thereis a new row inserted in the table and it creates the inconsistency in the table then it is called the insertion anomaly. Example Assume that a new employee is joining the company under training and not assigned to any department. Then, we would not insert the data into the table if the emp_dept field doesn't allow nulls. Prepared by Dr Hamela, GFGC, Malur
  • 5.
    Update anomaly When weupdate some rows in the table, and if it leads to the inconsistency of the table then this anomaly occurs. This type of anomaly is known as an updation anomaly. Example In the given table, we have two rows for an employee named Rick, and he belongs to two different departments of the company. If we need to update Rick's address, we must update the same address in two rows. Otherwise, the data will become inconsistent. If, in some way, we can update the correct address in one department but not the other, then according to the database, Rick will have two different addresses, which is not correct and would lead to inconsistent data. Prepared by Dr Hamela, GFGC, Malur
  • 6.
    Delete anomaly If wedelete some rows from the table and if any other information or data which is required is also deleted from the database, this is called the deletion anomaly in the database. Example Assume that if the company closes the department D890, then deleting the rows that have emp_dept as D890 would also delete the information of employee Maggie since she is assigned only to this department. Prepared by Dr Hamela, GFGC, Malur
  • 7.
    DECOMPOSITION • Decomposition canbe defined as a database management system process for dividing a single relation into multiple sub- relations. • Its main purpose is to break down the functions of a company into fine levels of detail. • It eliminates the anomalies and redundancy from the database by breaking it up into many different tables. Prepared by Dr Hamela, GFGC, Malur
  • 8.
    Types of decomposition Preparedby Dr Hamela, GFGC, Malur
  • 9.
    In a LossyDecomposition, the relation needs to be decomposed into two or more relational schemas. There is no way that loss of information can be avoided during the retrieval of the original relation. Now, we won’t be able to join the above tables, since Emp_ID isn’t part of the DeptDetails relation. Therefore, the above relation has lossy decomposition. Prepared by Dr Hamela, GFGC, Malur
  • 10.
    Lossless Decomposition Decomposition islossless if it is feasible to reconstruct relation R from decomposed tables using Joins. This is the preferred choice. The information will not lose from the relation when decomposed. The join would result in the same original relation. Prepared by Dr Hamela, GFGC, Malur
  • 11.
    Prepared by DrHamela, GFGC, Malur
  • 12.
    Advantages of decompositionin DBMS? Easy use of Codes The availability of decomposition makes it easier for programs to copy and reuse important codes for other works in DBMS. It only not helps in saving lots of time but also makes things convenient for the users. Finding Mistakes Another reason the programmers opt for decomposition is to allow them conveniently complete complex programs. The mistakes are quite easier to find with this sort of programming. Problem-Solving Approach It is considered a perfect problem-solving strategy using which complex computer programs can be written easily. The users can precisely join tons of code together for adequate results. Eliminating Errors The biggest advantage of having decomposition in DBMS is eliminating the inconsistencies and duplication to a greater extent. The data can be easily identified in cases when decomposition happens in DBMS. Prepared by Dr Hamela, GFGC, Malur
  • 13.
    Properties of decomposition Theprogrammers must be aware of the main properties of decomposition in DBMS. We have mentioned the major ones in detail below: Attribute Preservation The functional dependencies decompose the universal relation that attributes preservation of decomposition. Dependency Preservation Dependency preservation can be defined as the functionality that features directly in the relation schemas. There is a chance of dependency loss if the decomposition is not preserved. •No Redundancy It is used for removing a few of the issues related to improper design, such as redundancy, anomalies, and inconsistencies. Prepared by Dr Hamela, GFGC, Malur
  • 14.
    Issues of decompositionin DBMS? There are many problems regarding the decomposition in DBMS mentioned below: Redundant Storage Many instances where the same information gets stored in a single place can confuse the programmers. It will take lots of space in the system. Insertion Anomalies It isn’t essential for storing important details unless some kind of information is stored in a consistent manner. Deletion Anomalies It isn’t possible to delete some details without eliminating any sort of information. Prepared by Dr Hamela, GFGC, Malur
  • 15.
    Functional Dependency The functionaldependency is a relationship that exists between two attributes. It typically exists between the primary key and non-key attribute within a table. X → Y The left side of FD is known as a determinant, the right side of the production is known as a dependent. For example: Assume we have an employee table with attributes: Emp_Id, Emp_Name, Emp_Address. Here Emp_Id attribute can uniquely identify the Emp_Name attribute of employee table because if we know the Emp_Id, we can tell that employee name associated with it. Functional dependency can be written as: Emp_Id → Emp_Name We can say that Emp_Name is functionally dependent on Emp_Id.
  • 16.
    Types of Functionaldependency Types of Functional dependency
  • 17.
    1. Trivial functionaldependency A → B has trivial functional dependency if B is a subset of A. The following dependencies are also trivial like: A → A, B → B Example: Consider a table with two columns Employee_Id and Employee_Name. {Employee_id, Employee_Name} → Employee_Id is a trivial functional dependency as Employee_Id is a subset of {Employee_Id, Employee_Name}. Also, Employee_Id → Employee_Id and Employee_Name → Employee_Name are trivial dependencies too.
  • 18.
    2. Non-trivial functionaldependency A → B has a non-trivial functional dependency if B is not a subset of A. When A intersection B is NULL, then A → B is called as complete non-trivial. Example: ID → Name, Name → DOB
  • 19.
    Armstrong's Axioms If Fis a set of functional dependencies then the closure of F, denoted as F+, is the set of all functional dependencies logically implied by F. Armstrong's Axioms are a set of rules, that when applied repeatedly, generates a closure of functional dependencies. Reflexive rule − If alpha is a set of attributes and beta is_subset_of alpha, then alpha holds beta. Augmentation rule − If a → b holds and y is attribute set, then ay → by also holds. That is adding attributes in dependencies, does not change the basic dependencies. Transitivity rule − Same as transitive rule in algebra, if a → b holds and b → c holds, then a → c also holds. a → b is called as a functionally that determines b.
  • 20.
    • Trivial FunctionalDependency • Trivial − If a functional dependency (FD) X → Y holds, where Y is a subset of X, then it is called a trivial FD. Trivial FDs always hold. • Non-trivial − If an FD X → Y holds, where Y is not a subset of X, then it is called a non-trivial FD. • Completely non-trivial − If an FD X → Y holds, where x intersect Y = Φ, it is said to be a completely non- trivial FD.
  • 21.
    The Purpose ofNormalization Normalization is a technique for producing a set of relations with desirable properties, given the data requirements of an enterprise. The process of normalization is a formal method that identifies relations based on their primary or candidate keys and the functional dependencies among their attributes.
  • 22.
    Problem Without Normalization WithoutNormalization, it becomes difficult to handle and update the database, without facing data loss. Insertion, Updation and Deletion Anamolies are very frequent if Database is not Normalized.
  • 24.
    Purpose • Normalization isthe process of organizing the data in the database. • Normalization is used to minimize the redundancy from a relation or set of relations. It is also used to eliminate undesirable characteristics like Insertion, Update, and Deletion Anomalies. • Normalization divides the larger table into smaller and links them using relationships. • The normal form is used to reduce redundancy from the database table.
  • 25.
    Advantages of Normalization •Normalization helps to minimize data redundancy. • Greater overall database organization. • Data consistency within the database. • Much more flexible database design. • Enforces the concept of relational integrity.
  • 26.
    Disadvantages of Normalization Wecannot start building the database before knowing what the user needs. The performance degrades when normalizing the relations to higher normal forms, i.e., 4NF, 5NF. It is very time-consuming and difficult to normalize relations of a higher degree. Careless decomposition may lead to a bad database design and serious problems.
  • 29.
    FIRST NORMAL FORM •First Normal Form is a relation in which the intersection of each row and column contains one and only one value.
  • 30.
    As per FirstNormal Form, no two Rows of data must contain repeating group of information i.e each set of column must have a unique value, such that multiple columns cannot be used to fetch the same row. Each table should be organized into rows, and each row should have a primary key that distinguishes it as unique. The Primary key is usually a single column, but sometimes more than one column can be combined to create a single primary key. For example consider a table which is not in First normal form Student Table : Student Age Subject Adam 15 Biology, Maths Alex 14 Maths Stuart 17 Maths
  • 31.
    In First NormalForm, any row must not have a column in which more than one value is saved, like separated with commas. Rather than that, we must separate such data into multiple rows. Student Table following 1NF will be : Student Age Subject Adam 15 Biology Adam 15 Maths Alex 14 Maths Stuart 17 Maths Using the First Normal Form, data redundancy increases, as there will be many columns with same data in multiple rows but each row as a whole will be unique.
  • 32.
    Full functional dependency Fullfunctional dependency indicates that if A and B are attributes of a relation, B is fully functionally dependent on A if B is functionally dependent on A, but not on any proper subset of A. A functional dependency A→B is partially dependent if there is some attributes that can be removed from A and the dependency still holds.
  • 33.
    Second Normal Form(2NF) Second normal form (2NF) is a relation that is in first normal form and every non-primary-key attribute is fully functionally dependent on the primary key. The normalization of 1NF relations to 2NF involves the removal of partial dependencies. If a partial dependency exists, we remove the function dependent attributes from the relation by placing them in a new relation along with a copy of their determinant.
  • 34.
    • In exampleof First Normal Form there are two rows for Adam, to include multiple subjects that he has opted for. While this is searchable, and follows First normal form, it is an inefficient use of space. Also in the above Table in First Normal Form, while the candidate key is {Student, Subject}, Age of Student only depends on Student column, which is incorrect as per Second Normal Form. To achieve second normal form, it would be helpful to split out the subjects into an independent table, and match them up using the student names as foreign keys. New Student Table following 2NF will be : Student Age Adam 15 Alex 14 Stuart 17 In Student Table the candidate key will be Student column, because all other column i.e Age is dependent on it. New Subject Table introduced for 2NF will be : Student Subject Adam Biology Adam Maths Alex Maths Stuart Maths
  • 35.
    • In SubjectTable the candidate key will be {Student, Subject} column. Now, both the above tables qualifies for Second Normal Form and will never suffer from Update Anomalies. Although there are a few complex cases in which table in Second Normal Form suffers Update Anomalies, and to handle those scenarios Third Normal Form is there.
  • 36.
    Third Normal Form(3NF) Transitive dependency A condition where A, B, and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A via B (provided that A is not functionally dependent on B or C). Third normal form (3NF) A relation that is in first and second normal form, and in which no non-primary-key attribute is transitively dependent on the primary key. The normalization of 2NF relations to 3NF involves the removal of transitive dependencies by placing the attribute(s) in a new relation along with a copy of the determinant.
  • 37.
    • Student_Detail Table: In this table Student_id is Primary key, but street, city and state depends upon Zip. The dependency between zip and other fields is called transitive dependency. Hence to apply 3NF, we need to move the street, city and state to new table, with Zip as primary key. Third Normal form applies that every non-prime attribute of table must be dependent on primary key, or we can say that, there should not be the case that a non-prime attribute is determined by another non-prime attribute. So this transitive functional dependency should be removed from the table and also the table must be in Second Normal form. For example, consider a table with following fields.
  • 38.
    New Student_Detail Table: Student_id Student_name DOB Zip Address Table : Zip Street city state The advantage of removing transtive dependency is, Amount of data duplication is reduced. Data integrity achieved.
  • 40.
    Boyce Codd normalform (BCNF) • Boyce Codd normal form (BCNF) • BCNF is the advance version of 3NF. It is stricter than 3NF. • A table is in BCNF if every functional dependency X → Y, X is the super key of the table. • For BCNF, the table should be in 3NF, and for every FD, LHS is super key.
  • 41.
    EMP_ID EMP_COUNT RY EMP_DEPT DEPT_TYPEEMP_DEPT_N O 264 India Designing D394 283 264 India Testing D394 300 364 UK Stores D283 232 364 UK Developing D283 549 Example: Let's assume there is a company where employees work in more than one department. EMPLOYEE table: In the above table Functional dependencies are as follows: 1.EMP_ID → EMP_COUNTRY 2.EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
  • 42.
    Candidate key: {EMP-ID,EMP-DEPT} The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone are keys. To convert the given table into BCNF, we decompose it into three tables: EMP_ID EMP_COUN TRY 264 India 264 India EMP_COUNTRY table: EMP_DEPT DEPT_TYP E EMP_DEPT _NO Designing D394 283 Testing D394 300 Stores D283 232 Developing D283 549 EMP_DEPT table: EMP_ID EMP_DEPT D394 283 D394 300 D283 232 D283 549 EMP_DEPT_MAPPING table:
  • 43.
    Candidate keys: For thefirst table: EMP_ID For the second table: EMP_DEPT For the third table: {EMP_ID, EMP_DEPT} Now, this is in BCNF because the left side part of both the functional dependencies is key.