NORMALIZATION
• The normalization process, as first proposed by Codd
(1972a), takes a relation schema through a series of
tests to certify whether it satisfies a certain normal
form.
• The process, which proceeds in a top-down fashion
by evaluating each relation against the criteria for
normal forms and decomposing relations as
necessary, can thus be considered as relational
design by analysis.
• Initially, Codd proposed three normal forms,
which he called first, second, and third normal
form.
• A stronger definition of 3NF—called Boyce-
Codd normal form (BCNF)—was proposed
later by Boyce and Codd.
• All these normal forms are based on a single
analytical tool: the functional dependencies
among the attributes of a relation.
• Later, a fourth normal form (4NF) and a fifth
normal form (5NF) were proposed, based on
the concepts of multivalued dependencies
and join dependencies, respectively.
• Normalization of data can be considered a
process of analyzing the given relation schemas
based on their FDs and primary keys to achieve
the desirable properties of
– minimizing redundancy and
– minimizing the insertion, deletion, and update
anomalies
• It can be considered as a “filtering” or
“purification” process to make the design have
successively better quality.
• Unsatisfactory relation schemas that do not
meet certain conditions—the normal form
tests—are decomposed into smaller relation
schemas that meet the tests and hence
possess the desirable properties.
• Thus, the normalization procedure provides
database designers with the following:
– A formal framework for analyzing relation
schemas based on their keys and on the
functional dependencies among their attributes
– A series of normal form tests that can be carried
out on individual relation schemas so that the
relational database can be normalized to any
desired degree
Definition.
• The normal form of a relation refers to the
highest normal form condition that it meets,
and hence indicates the degree to which it has
been normalized.
• Normal forms, when considered in isolation from
other factors, do not guarantee a good database
design.
• It is generally not sufficient to check separately
that each relation schema in the database is, say,
in BCNF or 3NF.
• Rather, the process of normalization through
decomposition must also confirm the existence
of additional properties that the relational
schemas, taken together, should possess.
• These would include two properties:
– The nonadditive join or lossless join property,
which guarantees that the spurious tuple
generation problem does not occur with respect
to the relation schemas created after
decomposition.
– The dependency preservation property, which
ensures that each functional dependency is
represented in some individual relation resulting
after decomposition.
• The nonadditive join property is extremely
critical and must be achieved at any cost,
whereas the dependency preservation
property, although desirable, is sometimes
sacrificed.
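• A minimal Python sketch of how the nonadditive join property can be tested for a decomposition into two relations. It uses the standard binary test: the split of R into R1 and R2 is lossless if the common attributes functionally determine either (R1 - R2) or (R2 - R1). The relation R(A, B, C), the FD A -> B and the function names are assumptions chosen only for illustration.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless_binary(r1, r2, fds):
    """True if splitting into r1 and r2 has the nonadditive join property."""
    reach = closure(r1 & r2, fds)  # everything the common attributes determine
    return (r1 - r2) <= reach or (r2 - r1) <= reach


# Hypothetical example: R(A, B, C) with the single FD A -> B.
fds = [({"A"}, {"B"})]
print(is_lossless_binary({"A", "B"}, {"A", "C"}, fds))  # common attribute A determines B
print(is_lossless_binary({"A", "B"}, {"B", "C"}, fds))  # common attribute B determines nothing else
```

• The test covers only binary decompositions; a decomposition into more than two relations needs a more general procedure.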
• Database designers need not normalize to the
highest possible normal form.
• Relations may be left in a lower normalization
status, such as 2NF, for performance reasons.
Doing so incurs the corresponding penalties of
dealing with the anomalies.
FIRST NORMAL FORM
• First normal form (1NF) is now considered to be part of
the formal definition of a relation in the relational
model;
• historically, it was defined to disallow multivalued
attributes, composite attributes, and their
combinations.
• It states that the domain of an attribute must include
only atomic (simple, indivisible) values and that the
value of any attribute in a tuple must be a single value
from the domain of that attribute.
• Hence, 1NF disallows having
– a set of values (multivalued attribute)
– a tuple of values (composite attribute)
– or a combination of both
as an attribute value for a single tuple.
• In the case of composite attributes, 1NF can be
achieved by breaking the composite attributes
into atomic attributes
• Employee(eno, Name)
• Address(eno, Hno, Street, City)
• Consider the multivalued attribute Dlocation
• To normalise this relation into 1NF, we place
the multivalued attribute in a separate
relation together with the primary key of the
original relation
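• The step above can be sketched in Python as follows. The department rows, the attribute names Dnumber and Dname, and the helper split_multivalued are assumptions made for illustration; only the multivalued attribute Dlocation comes from the slide.

```python
def split_multivalued(rows, key, attr):
    """Move the multivalued attribute `attr` into its own relation keyed by `key`."""
    # Original relation without the multivalued attribute.
    base = [{k: v for k, v in row.items() if k != attr} for row in rows]
    # New relation: one tuple per (key value, single attribute value).
    detail = [{key: row[key], attr: value} for row in rows for value in row[attr]]
    return base, detail


# Hypothetical rows; Dlocation holds a list of values, which 1NF disallows.
department = [
    {"Dnumber": 5, "Dname": "Research", "Dlocation": ["Bellaire", "Houston"]},
    {"Dnumber": 4, "Dname": "Administration", "Dlocation": ["Stafford"]},
]
dept, dept_locations = split_multivalued(department, "Dnumber", "Dlocation")
print(dept)
print(dept_locations)
```

• In the new relation the primary key is the combination {Dnumber, Dlocation}, since one department can appear with several locations.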
SECOND NORMAL FORM (2NF)
• Based on the concept of Full Functional
Dependency
• A FD X -> Y is said to be a full FD if removing
any attribute A from X means that the
dependency no longer holds
• i.e. for every attribute A in X, (X - {A}) does not
functionally determine Y
• A FD X -> Y is said to be partial if some
attribute A can be removed from X and the
dependency (X - {A}) -> Y still holds
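• A minimal Python sketch of the full/partial distinction, using attribute closure. The FDs Ssn -> ename and Pno -> pname, plocation are assumed here for illustration, as is the function name is_full_fd.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_full_fd(lhs, rhs, fds):
    """True if lhs -> rhs holds and no attribute can be dropped from lhs."""
    if not rhs <= closure(lhs, fds):
        return False  # the FD does not hold at all
    # Full FD: removing any single attribute A from lhs breaks the dependency.
    return all(not rhs <= closure(lhs - {a}, fds) for a in lhs)


# Assumed FDs for illustration.
fds = [
    ({"Ssn", "Pno"}, {"Hours"}),
    ({"Ssn"}, {"ename"}),
    ({"Pno"}, {"pname", "plocation"}),
]
print(is_full_fd({"Ssn", "Pno"}, {"Hours"}, fds))  # neither Ssn nor Pno alone gives Hours
print(is_full_fd({"Ssn", "Pno"}, {"ename"}, fds))  # Ssn alone already gives ename
```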
2NF states that
A relation schema R is in 2NF if it satisfies the
following conditions:
• It is in 1NF
• Every non-prime attribute A in R is fully
functionally dependent on every key of R (no
non-prime attribute is dependent on only part
of a key)
• Alternate Definition of 2NF
A relation schema R is said to be in 2NF if it is
in 1NF and
every non-prime attribute A in R is not partially
dependent on any key of R
• {Ssn, Pno} is the primary key
• Hours, pname, ename and plocation are the
non-prime attributes
• The relation is not in 2NF because the
following FDs are partial
– {Ssn, Pno} -> ename (ename depends on Ssn alone)
– {Ssn, Pno} -> pname (pname depends on Pno alone)
– {Ssn, Pno} -> plocation (plocation depends on Pno alone)
• To normalize a non-2NF relation into 2NF
relations
– Decompose the non-2NF relation into 2NF
relations in which the non-prime attributes are
associated only with that part of the primary key
on which they are fully functionally dependent
Algorithm for 2NF Decomposition
• Let R be a relational schema not in 2NF
• Let F be the set of FDs holding on R
• Determine the canonical cover Fc of F
• Determine the set of candidate keys {K1, K2, …, Kn}
for R
• Determine the non-prime attributes K’ = R - (K1 U
K2 U … U Kn)
• While there exists a non-2NF schema Ri
• Do
– For (each non-trivial, left-irreducible FD X->Y in Ri)
– Do
• If Y is partially functionally dependent on a candidate key (X is a proper subset of that key and Y is non-prime), then
• Replace Ri by the two schemas (Ri – Y) and XY, i.e. R = (R – Y) U XY
Steps to decompose a non-2NF relation into a
2NF relation
• Step 1:
– Create a separate relation for each partial
dependency
• Step 2:
– Remove the right-hand-side attributes of the
partial dependency from the relation that is
being decomposed,
i.e. R = (R - Y) U (XY) if X->Y is a partial FD
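• The two steps can be sketched in Python as below. This is a simplified illustration, not the full algorithm: it assumes a single candidate key, skips the canonical-cover step, and uses assumed FDs (Ssn -> ename; Pno -> pname, plocation). The function name decompose_2nf is hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def decompose_2nf(relation, key, fds):
    """Split off every partial dependency on `key` (single candidate key assumed)."""
    nonprime = relation - key
    remaining = set(relation)
    new_relations = []
    # Try proper subsets X of the key, smallest first.
    for size in range(1, len(key)):
        for part in combinations(sorted(key), size):
            x = set(part)
            y = (closure(x, fds) - x) & nonprime & remaining
            if y:  # X -> Y is a partial dependency
                new_relations.append(x | y)  # Step 1: separate relation XY
                remaining -= y               # Step 2: R becomes R - Y
    return [remaining] + new_relations


relation = {"Ssn", "Pno", "Hours", "ename", "pname", "plocation"}
fds = [
    ({"Ssn", "Pno"}, {"Hours"}),
    ({"Ssn"}, {"ename"}),
    ({"Pno"}, {"pname", "plocation"}),
]
for schema in decompose_2nf(relation, {"Ssn", "Pno"}, fds):
    print(sorted(schema))
```

• With these assumed FDs the sketch yields three schemas: (Ssn, Pno, Hours), (Pno, pname, plocation) and (Ssn, ename).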
THIRD NORMAL FORM (3NF)
• Based on the concept of Transitive Functional
Dependency
• A FD X-> Y in a relation schema R is said to be
transitive if there is a set of attributes Z that is
neither a superkey nor a subset of any key of R
and both X-> Z and Z->Y hold
• From Ssno -> dnumber and dnumber -> dmgrno,
we get
• Ssno -> dmgrno (by transitivity)
• Here X = Ssno, Y = dmgrno and Z = dnumber
• dnumber is neither a key nor a subset of any key,
therefore Ssno -> dmgrno is a transitive FD
• In simpler words, a transitive FD is one that
is obtained by applying the transitivity rule to
other FDs
• 3NF states that
– A relation schema R is in 3NF if
• It satisfies 2NF
• No non-prime attribute is transitively dependent on the
primary key
• Emp_Deptt is in 2NF because all non-prime
attributes are fully functionally dependent on
the primary key
• But non prime attribute dmgrno is transitively
dependent on the primary key Ssno
• Hence relation is not in 3NF
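• A minimal Python sketch that finds such transitive dependencies by following the definition: look for an FD Z -> Y where the key determines Z, and Z is neither a superkey nor a subset of the key. It assumes a single primary key; the function name transitive_dependencies is hypothetical, and the Emp_Deptt attributes are taken from the FDs stated in the slides.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def transitive_dependencies(relation, key, fds):
    """Return (Z, A) pairs where non-prime A depends on `key` through Z."""
    reachable = closure(key, fds)
    found = []
    for z, rhs in fds:
        z_is_superkey = closure(z, fds) >= relation
        # Z must be determined by the key, and be neither a superkey nor part of the key.
        if z_is_superkey or z <= key or not z <= reachable:
            continue
        for a in sorted(rhs - key - z):
            found.append((tuple(sorted(z)), a))
    return found


emp_deptt = {"Ssno", "ename", "Address", "dnumber", "dname", "dmgrno"}
fds = [
    ({"Ssno"}, {"ename", "Address", "dnumber"}),
    ({"dnumber"}, {"dname", "dmgrno"}),
]
print(transitive_dependencies(emp_deptt, {"Ssno"}, fds))
```

• For Emp_Deptt the sketch reports dmgrno and dname as transitively dependent on Ssno through dnumber.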
• To decompose a non-3NF relation into 3NF
relations
– Decompose and set up a separate relation that
includes the non-key attributes that functionally
determine other non-key attributes
Check for 3NF
• A relation schema R is said to be in 3NF, if
every FD X->Y holding on R satisfies one of the
following conditions
– It is a trivial FD
– X is a superkey of R
– Y is a prime attribute of R
• Emp_deptt is not in 3NF because the FD
dnumber -> dname, dmgrno fails all three conditions
– The FD is non-trivial
– Its LHS, dnumber, is not a superkey
– Its RHS attributes are non-prime
• The decomposed relations are in 3NF because
– In Emp, Ssno is the primary key (Ssno -> ename, Address, dnumber)
– In Deptt, dnumber is the primary key (dnumber -> dname, dmgrno)
• Given the relation schema R(A, B, C, D, E) and the
set of functional dependencies
A->BC, CD->E, E->A, B->D
check whether R is in 3NF
Solution in notes
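• One way to check this exercise mechanically is sketched below in Python. It enumerates candidate keys by brute force and then applies the three 3NF conditions to each given FD. The function names are hypothetical, and this is offered as an independent check rather than as the solution from the notes.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(relation, fds):
    """Brute-force search for minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            x = set(combo)
            if any(k <= x for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(x, fds) >= relation:
                keys.append(x)
    return keys


def violations_3nf(relation, fds):
    """Return the candidate keys and the FDs that fail all three 3NF conditions."""
    keys = candidate_keys(relation, fds)
    prime = set().union(*keys)
    bad = []
    for lhs, rhs in fds:
        trivial = rhs <= lhs
        superkey = closure(lhs, fds) >= relation
        rhs_prime = (rhs - lhs) <= prime
        if not (trivial or superkey or rhs_prime):
            bad.append((lhs, rhs))
    return keys, bad


R = set("ABCDE")
F = [(set("A"), set("BC")), (set("CD"), set("E")), (set("E"), set("A")), (set("B"), set("D"))]
keys, bad = violations_3nf(R, F)
print(sorted("".join(sorted(k)) for k in keys))  # candidate keys
print(bad)                                       # FDs violating 3NF, if any
```

• Under these FDs the candidate keys come out as A, E, BC and CD, so every attribute is prime and no FD violates 3NF. Note that B->D still has a LHS that is not a superkey.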
Algorithm for 3NF Decomposition
• Let R be a relational schema not in 3NF
• Let F be the set of FDs holding on R
• Determine the canonical cover Fc of F
• Determine the set of candidate keys {K1, K2, …, Kn} for R
• Determine the non-prime attributes K’ = R - (K1 U K2 U … U Kn)
• While there exists a non-3NF schema Ri
• Do
– For (each non-trivial, left-irreducible FD X->Y in Ri)
– Do
• If (X is not a superkey of Ri) and (Y has only non-prime attributes), then replace Ri by (Ri – Y) and XY, i.e.
• R = (R – Y) U XY
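• A simplified Python sketch of the decomposition rule R = (R – Y) U XY, applied to Emp_Deptt. It makes one pass over the given FDs, tests superkeys against the original relation, and takes the set of prime attributes as an input; it does not compute a canonical cover or project FDs onto the sub-schemas. The function name decompose_3nf is hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def decompose_3nf(relation, prime, fds):
    """Apply R = (R - Y) U XY for each FD X -> Y that violates 3NF."""
    remaining = set(relation)
    parts = []
    for x, y in fds:
        x_is_superkey = closure(x, fds) >= relation
        # Violation: X is not a superkey and Y has only non-prime attributes.
        if not x_is_superkey and y.isdisjoint(prime) and (x | y) <= remaining:
            parts.append(x | y)  # new schema XY
            remaining -= y       # original schema shrinks to R - Y
    return [remaining] + parts


emp_deptt = {"Ssno", "ename", "Address", "dnumber", "dname", "dmgrno"}
fds = [
    ({"Ssno"}, {"ename", "Address", "dnumber"}),
    ({"dnumber"}, {"dname", "dmgrno"}),
]
for schema in decompose_3nf(emp_deptt, {"Ssno"}, fds):
    print(sorted(schema))
```

• For Emp_Deptt this gives (Ssno, ename, Address, dnumber) and (dnumber, dname, dmgrno).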
• For the FD D->E, we decompose the relation
as follows
• R = (R-E) U (D,E)
= (ABCD) U (DE)
– In (DE), D->E holds and D is the primary key, hence (DE) is in 3NF
– In (ABCD), A->BCD and BC->AD hold; A and BC are superkeys, hence (ABCD) is in 3NF
BOYCE-CODD NORMAL FORM (BCNF)
• A stricter form of 3NF
• A relation schema in 3NF may still have some
anomalies, especially when the schema has
multiple candidate keys which may be composite
or overlapping
• In such cases update anomalies may exist
• The FDs are
{S#, P#} -> Qty
{Sname, P#} -> Qty
S#-> Sname
Sname -> S#
• The keys are
{S#, P#}
{Sname, P#}
• The schema is in 2NF and 3NF
• However, in spite of being in 2NF and 3NF, the
relation under this schema will have
redundancies
• Thus there is a need for a normal form
stronger than 3NF
• The solution is provided by Boyce Codd
Normal Form (BCNF)
• A relation schema R is said to be in BCNF if
each FD X-> Y holding on R, satisfies one of the
following conditions :
– It is a trivial FD
– X is a superkey of R
• These conditions for BCNF are the same as the
first two conditions for 3NF
• However, the third condition (Y is a prime
attribute) is missing, thus BCNF is stricter than 3NF
• A schema may be in 3NF but not in BCNF
• Now consider the schema SP(S#, Sname, P#, Qty)
And the FDs
{S#, P#} -> Qty
{Sname, P#} -> Qty
S#-> Sname
Sname -> S#
• The keys are
{S#, P#} and {Sname, P#}
• It is in 3NF but not in BCNF because the FDs S#-> Sname
and Sname -> S# are non-trivial and their LHS is not a superkey
(their RHS attributes are prime, which is why 3NF still holds)
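• The BCNF test can be sketched in Python for the SP schema as follows: an FD violates BCNF when it is non-trivial and its LHS is not a superkey. The function name bcnf_violations is hypothetical; the schema and FDs are those of the slide.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(relation, fds):
    """FDs that are non-trivial and whose LHS is not a superkey of relation."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not closure(lhs, fds) >= relation
    ]


SP = {"S#", "Sname", "P#", "Qty"}
F = [
    ({"S#", "P#"}, {"Qty"}),
    ({"Sname", "P#"}, {"Qty"}),
    ({"S#"}, {"Sname"}),
    ({"Sname"}, {"S#"}),
]
for lhs, rhs in bcnf_violations(SP, F):
    print(sorted(lhs), "->", sorted(rhs))
```

• The two FDs reported are S# -> Sname and Sname -> S#; the FDs with LHS {S#, P#} and {Sname, P#} pass because both are keys.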
• Such a relation can be decomposed into BCNF
relations
• The BCNF decomposition of SP based on the FDs
violating BCNF is
– S(S#, Sname) with FDs S#-> Sname and Sname-> S#
and key S# or Sname
– SP1(S#, P#, Qty) with FD {S#, P#} -> Qty and key {S#, P#}
OR
– S(S#, Sname) with FDs S#-> Sname and Sname-> S#
and key S# or Sname
– SP1(Sname, P#, Qty) with FD {Sname, P#} -> Qty and key
{Sname, P#}
Algorithm to decompose a non-BCNF schema into a set
of BCNF schemas
• Let R be a relational schema not in BCNF
• Let F be the set of FDs holding on R
• Determine the canonical cover Fc of F
• Determine the set of candidate keys {K1, K2, …, Kn} for R
• Determine the non-prime attributes K’ = R - (K1 U K2 U … U Kn)
• While there exists a non-BCNF schema Ri
• Do
– For (each non-trivial, left-irreducible FD X->Y in Ri)
– Do
• If (X is not a superkey of Ri), then replace Ri by (Ri – Y) and XY, i.e.
• R = (R – Y) U XY
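• A Python sketch of BCNF decomposition in the spirit of the algorithm above. Instead of a canonical cover it uses the common closure-based variant: within each schema it searches for an attribute set X whose closure, restricted to that schema, adds attributes without covering the whole schema. The function names are hypothetical, and the subset search is exponential, so this is only suitable for small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def find_violation(schema, fds):
    """Return (X, Y) with X -> Y non-trivial on schema and X not a superkey of it, else None."""
    attrs = sorted(schema)
    for size in range(1, len(attrs)):
        for combo in combinations(attrs, size):
            x = set(combo)
            y = (closure(x, fds) & schema) - x
            if y and (x | y) != schema:
                return x, y
    return None


def decompose_bcnf(relation, fds):
    """Repeatedly split a violating schema Ri into (Ri - Y) and XY."""
    done, todo = [], [set(relation)]
    while todo:
        schema = todo.pop()
        violation = find_violation(schema, fds)
        if violation is None:
            done.append(schema)
        else:
            x, y = violation
            todo.append(schema - y)  # Ri - Y
            todo.append(x | y)       # XY
    return done


SP = {"S#", "Sname", "P#", "Qty"}
F = [
    ({"S#", "P#"}, {"Qty"}),
    ({"Sname", "P#"}, {"Qty"}),
    ({"S#"}, {"Sname"}),
    ({"Sname"}, {"S#"}),
]
for schema in decompose_bcnf(SP, F):
    print(sorted(schema))
```

• For SP the sketch produces (S#, Sname) and (S#, P#, Qty). Which of the two valid decompositions is returned depends on which violating FD is found first.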
COMPARISON OF BCNF and 3NF
• The goal of database design is to reduce the
redundancy in relations and have consistency
of data
• This is achieved through decomposition into
normal forms so as to obtain schemas that
– Are in the highest possible normal form (up to BCNF)
– Result from a decomposition that is
• Attribute preserving
• Dependency preserving
• Lossless
• 3NF decomposition results in decompositions that
are both dependency preserving and lossless
• However, the limitations of 3NF decompositions
are
– Possibility of NULL values
– Some redundancy
• So a higher normal form, BCNF, is used
– However, it may not always be possible to obtain a
BCNF design without sacrificing some FDs
• Thus we may opt either for a 3NF design, with some
redundancy and NULL values but better
integrity of the data, or for a BCNF design with
loss of some FDs
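• The trade-off can be illustrated with a Python sketch of the standard dependency preservation test: an FD X -> Y is preserved if Y can be reached from X by repeatedly taking closures inside the individual decomposed schemas. The relation (Street, City, Zip), its FDs and the function name is_preserved are assumptions chosen for illustration; they do not come from the slides.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_preserved(lhs, rhs, decomposition, fds):
    """True if lhs -> rhs can be checked using only the decomposed schemas."""
    z = set(lhs)
    changed = True
    while changed:
        changed = False
        for schema in decomposition:
            # Attributes derivable from z without leaving this schema.
            gained = closure(z & schema, fds) & schema
            if not gained <= z:
                z |= gained
                changed = True
    return rhs <= z


# Hypothetical relation (Street, City, Zip): in 3NF, but Zip -> City violates BCNF.
F = [({"Street", "City"}, {"Zip"}), ({"Zip"}, {"City"})]
bcnf_design = [{"Zip", "City"}, {"Street", "Zip"}]
print(is_preserved({"Street", "City"}, {"Zip"}, bcnf_design, F))  # lost in the BCNF design
print(is_preserved({"Zip"}, {"City"}, bcnf_design, F))            # kept in (Zip, City)
```

• In this assumed example the BCNF design loses {Street, City} -> Zip, whereas leaving the relation undecomposed in 3NF keeps both FDs at the cost of some redundancy.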