Ms. Neethu Tressa
Assistant Professor
NORMALIZATION
CONTENTS
 INTRODUCTION
 DATABASE ANOMALIES
 FUNCTIONAL DEPENDENCY
 PARTIAL DEPENDENCY
 TRIVIAL DEPENDENCY
 TRANSITIVE DEPENDENCY
 NORMAL FORMS
 FIRST NORMAL FORM
 SECOND NORMAL FORM
 THIRD NORMAL FORM
 BOYCE-CODD NORMAL FORM
INTRODUCTION
 Normalization is the process of organizing the data in a
database to avoid data redundancy and the insertion,
update, and deletion anomalies that redundancy causes.
DATABASE ANOMALIES
 Update Anomaly – one copy of a repeated item is updated, but
other copies of the same item are not, leaving the data inconsistent
 Insertion Anomaly – the user is unable to insert a new
record when it should be possible to do so
 Deletion Anomaly – when a record is deleted, other
information that is stored only in that record is lost as well
 E.g.: Consider the Employee table
Emp-id  Emp-name  Emp-addr  Emp-dept
101     Rick      Delhi     D001
101     Rick      Delhi     D002
123     Maggie    Agra      D890
166     Glenn     Chennai   D900
166     Glenn     Chennai   D004
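The update and deletion anomalies can be reproduced on the Employee table above. The following is a minimal sketch using Python's built-in sqlite3 module. The table name `employee`, the column names, and the new address 'Mumbai' are assumptions made for illustration and do not come from the slides.

```python
import sqlite3

# In-memory database holding the unnormalized Employee table from the slide.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER, emp_name TEXT, emp_addr TEXT, emp_dept TEXT)"
)
rows = [
    (101, "Rick", "Delhi", "D001"),
    (101, "Rick", "Delhi", "D002"),
    (123, "Maggie", "Agra", "D890"),
    (166, "Glenn", "Chennai", "D900"),
    (166, "Glenn", "Chennai", "D004"),
]
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", rows)

# Update anomaly: only one of Rick's two rows is changed ('Mumbai' is a made-up value).
conn.execute(
    "UPDATE employee SET emp_addr = 'Mumbai' WHERE emp_id = 101 AND emp_dept = 'D001'"
)
rick_addresses = sorted(
    r[0] for r in conn.execute("SELECT DISTINCT emp_addr FROM employee WHERE emp_id = 101")
)

# Deletion anomaly: deleting Maggie's only department row removes Maggie entirely.
conn.execute("DELETE FROM employee WHERE emp_dept = 'D890'")
maggie_rows = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE emp_id = 123"
).fetchone()[0]

print(rick_addresses)  # the same employee now has two different addresses
print(maggie_rows)     # no row is left for employee 123
```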
FUNCTIONAL DEPENDENCY
 A functional dependency, denoted by X → Y, holds if,
whenever two tuples have the same value for X, they
also have the same value for Y.
 The attribute (or set of attributes) on the left-hand side is
known as the determinant
 E.g.: Emp-name → E-mail
 Emp-name is the determinant of E-mail
Emp-name  Project            E-mail
Rick      Smart System       rick@yahoo.com
Glenn     Management System  glenn@gmail.com
Rick      Re-design System   rick@yahoo.com
Maggie    Smart System       maggie@yahoo.com
Glenn     Database System    glenn@gmail.com
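The definition can be checked mechanically against the sample rows above. The sketch below is illustrative only; the function name `fd_holds` and the dictionary keys are assumptions introduced here.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first Y seen for each X; a different Y later breaks the FD.
        if seen.setdefault(x, y) != y:
            return False
    return True


rows = [
    {"emp_name": "Rick", "project": "Smart System", "email": "rick@yahoo.com"},
    {"emp_name": "Glenn", "project": "Management System", "email": "glenn@gmail.com"},
    {"emp_name": "Rick", "project": "Re-design System", "email": "rick@yahoo.com"},
    {"emp_name": "Maggie", "project": "Smart System", "email": "maggie@yahoo.com"},
    {"emp_name": "Glenn", "project": "Database System", "email": "glenn@gmail.com"},
]

name_to_email = fd_holds(rows, ["emp_name"], ["email"])
name_to_project = fd_holds(rows, ["emp_name"], ["project"])
print(name_to_email, name_to_project)
```

A check on sample data can only refute a dependency. Whether Emp-name → E-mail holds in general is a design decision about the data, not something a handful of rows can prove.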
PARTIAL DEPENDENCY
 Prime attribute − an attribute that is part of a
candidate key.
 Non-prime attribute − an attribute that is not part
of any candidate key.
 A partial dependency is a situation where a non-prime
attribute is functionally dependent on only a part of a
candidate key
 E.g.: Student-Project table with key (Stu-Id, Proj-Id), where
Stu-Name depends only on Stu-Id and Proj-Name only on Proj-Id
Stu-Id  Proj-Id  Stu-Name  Proj-Name
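Partial dependencies in the Student-Project table can be listed with a small helper. This is a sketch under two assumptions that the slides do not state explicitly: the only candidate key is (Stu-Id, Proj-Id), and the dependencies are Stu-Id → Stu-Name and Proj-Id → Proj-Name. The function name `partial_dependencies` is hypothetical.

```python
def partial_dependencies(key, fds):
    """List FDs whose left side is a proper subset of the key
    and whose right side contains at least one non-prime attribute."""
    key = frozenset(key)
    found = []
    for lhs, rhs in fds:
        lhs, rhs = frozenset(lhs), frozenset(rhs)
        non_prime = rhs - key
        if lhs < key and non_prime:  # '<' is the proper-subset test
            found.append((sorted(lhs), sorted(non_prime)))
    return found


# Assumed key and dependencies for the Student-Project table.
key = {"stu_id", "proj_id"}
fds = [
    ({"stu_id"}, {"stu_name"}),
    ({"proj_id"}, {"proj_name"}),
]
partial = partial_dependencies(key, fds)
print(partial)
```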
TRIVIAL DEPENDENCY
 A functional dependency X → Y in which Y
is a subset of X is called a trivial FD. Trivial
FDs always hold.
 If A and B are attributes of R, then
 {A} → {A}
 {A,B} → {A}
 {A,B} → {B}
 {A,B} → {A,B}
are all trivial FDs
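The subset condition translates directly into code. A minimal sketch, with `is_trivial` as a hypothetical helper name:

```python
def is_trivial(lhs, rhs):
    """An FD lhs -> rhs is trivial exactly when rhs is a subset of lhs."""
    return set(rhs) <= set(lhs)


examples = [
    ({"A"}, {"A"}),
    ({"A", "B"}, {"A"}),
    ({"A", "B"}, {"B"}),
    ({"A", "B"}, {"A", "B"}),
]
all_trivial = all(is_trivial(lhs, rhs) for lhs, rhs in examples)
non_trivial = is_trivial({"A"}, {"B"})
print(all_trivial, non_trivial)
```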
TRANSITIVE DEPENDENCY
 If A, B, and C are attributes of relation R such that
A → B and B → C (and B does not determine A), then C is
transitively dependent on A
 E.g.: Student (stuId, Name, subject, credits, status)
with FDs:
 stuId → credits
 credits → status
 By transitivity: stuId → credits and credits → status imply
stuId → status
 Transitive dependencies can cause update, insertion, and
deletion anomalies
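The chaining step in the Student example can be sketched as follows. The helper `transitive_fds` is hypothetical and, to stay short, handles only dependencies between single attributes.

```python
def transitive_fds(fds):
    """Given single-attribute FDs as (a, b) pairs, return the extra
    pairs obtained by chaining a -> b and b -> c into a -> c."""
    closure = set(fds)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # Chain a -> b with b -> d, skipping self-dependencies.
                if b == c and a != d and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure - set(fds)


fds = {("stuId", "credits"), ("credits", "status")}
derived = transitive_fds(fds)
print(derived)
```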
NORMAL FORMS
 Most commonly used normal forms:
 First Normal Form (1NF)
 Second Normal Form (2NF)
 Third Normal Form (3NF)
 Boyce-Codd Normal Form (BCNF)
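FIRST NORMAL FORM (1NF)
 An attribute (column) of a table cannot hold multiple values.
It should hold only atomic values
 A multi-valued column, e.g. a comma-separated language list, forces
pattern matching on reads and whole-string rewrites on updates:
 SELECT * FROM employee WHERE language LIKE '%English%';
 UPDATE employee SET language = 'French, English' WHERE
name = 'George';
 Remove duplicate fields (repeating groups)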
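One way to bring the language column into 1NF is to store one language per row in a separate table. The sketch below uses Python's sqlite3 module. The table `employee_language`, its columns, and the added language 'German' are assumptions for illustration, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT PRIMARY KEY)")
# One row per (employee, language) pair keeps every value atomic.
conn.execute(
    """CREATE TABLE employee_language (
           name TEXT REFERENCES employee(name),
           language TEXT,
           PRIMARY KEY (name, language))"""
)
conn.execute("INSERT INTO employee VALUES ('George')")
conn.executemany(
    "INSERT INTO employee_language VALUES (?, ?)",
    [("George", "French"), ("George", "English")],
)

# An equality test replaces LIKE '%English%'.
english_speakers = [
    r[0]
    for r in conn.execute("SELECT name FROM employee_language WHERE language = 'English'")
]

# Adding a language is an INSERT instead of rewriting a comma-separated string.
conn.execute("INSERT INTO employee_language VALUES ('George', 'German')")
george_languages = sorted(
    r[0]
    for r in conn.execute("SELECT language FROM employee_language WHERE name = 'George'")
)
print(english_speakers, george_languages)
```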
SECOND NORMAL FORM (2NF)
 A relation is in second normal form (2NF) when it is in
1NF and every non-key attribute is fully functionally
dependent on the whole primary key
 Remove partial dependencies
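Removing the partial dependencies from the Student-Project table amounts to projecting it onto three smaller tables. The sketch below assumes the key (Stu-Id, Proj-Id). The sample rows and the helper `project` are made up for illustration, since the slides give no data for this table.

```python
def project(rows, columns):
    """Project rows onto the given columns, dropping duplicate tuples
    while keeping first-seen order."""
    seen = []
    for row in rows:
        item = tuple(row[c] for c in columns)
        if item not in seen:
            seen.append(item)
    return seen


# Hypothetical sample data for the Student-Project table.
student_project = [
    {"stu_id": 1, "proj_id": "P1", "stu_name": "Asha", "proj_name": "Library"},
    {"stu_id": 1, "proj_id": "P2", "stu_name": "Asha", "proj_name": "Payroll"},
    {"stu_id": 2, "proj_id": "P1", "stu_name": "Ravi", "proj_name": "Library"},
]

students = project(student_project, ["stu_id", "stu_name"])      # Stu-Id -> Stu-Name
projects = project(student_project, ["proj_id", "proj_name"])    # Proj-Id -> Proj-Name
assignments = project(student_project, ["stu_id", "proj_id"])    # key-only table
print(students, projects, assignments)
```

Each student name and each project name is now stored once, and the assignments table records only which student works on which project.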
THIRD NORMAL FORM (3NF)
 A relation is in third normal form (3NF) when it is in
2NF and no non-key attribute is transitively dependent
on the primary key
 Remove transitive dependencies
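For the Student relation, removing the transitive dependency stuId → credits → status means moving status into its own table keyed by credits. The following sqlite3 sketch assumes that design. The table names, sample rows, and status labels are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# credits -> status now lives in its own table.
conn.execute("CREATE TABLE credit_status (credits INTEGER PRIMARY KEY, status TEXT)")
conn.execute(
    """CREATE TABLE student (
           stu_id INTEGER PRIMARY KEY,
           name TEXT,
           subject TEXT,
           credits INTEGER REFERENCES credit_status(credits))"""
)
conn.executemany(
    "INSERT INTO credit_status VALUES (?, ?)",
    [(15, "Freshman"), (45, "Sophomore")],
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Math", 15), (2, "Ravi", "Physics", 15), (3, "Meena", "History", 45)],
)

# Renaming a status touches one row, however many students share it.
conn.execute("UPDATE credit_status SET status = 'First year' WHERE credits = 15")
statuses = conn.execute(
    """SELECT s.stu_id, c.status
       FROM student s JOIN credit_status c ON c.credits = s.credits
       ORDER BY s.stu_id"""
).fetchall()
print(statuses)
```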
BOYCE-CODD NORMAL FORM (BCNF)
 A relation is in BCNF if it is in 3NF and, for every one of its
dependencies X → Y, at least one of the following conditions holds:
 X → Y is a trivial functional dependency (i.e., Y is a subset of
X)
 X is a super key
 Remove determinants that are not super keys
Author               Nationality  Book title                          Genre     Number of pages
William Shakespeare  English      The Comedy of Errors                Comedy    100
Markus Winand        Austrian     SQL Performance Explained           Textbook  200
Jeffrey Ullman       American     A First Course in Database Systems  Textbook  500
Jennifer Widom       American     A First Course in Database Systems  Textbook  500

Author               Nationality
William Shakespeare  English
Markus Winand        Austrian
Jeffrey Ullman       American
Jennifer Widom       American

Book title                          Genre     Number of pages
The Comedy of Errors                Comedy    100
SQL Performance Explained           Textbook  200
A First Course in Database Systems  Textbook  500

Author               Book title
William Shakespeare  The Comedy of Errors
Markus Winand        SQL Performance Explained
Jeffrey Ullman       A First Course in Database Systems
Jennifer Widom       A First Course in Database Systems

 Dependencies violating BCNF rules:
 author → nationality
 book title → genre, number of pages
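The two BCNF conditions can be tested with an attribute-closure computation. The sketch below checks only the dependencies it is given, not every dependency they imply, so it is a simplification. The function names and the attribute spellings (`book_title`, `pages`) are assumptions.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(relation, fds):
    """List the given FDs that are neither trivial nor have a super key on the left."""
    relation = set(relation)
    bad = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        superkey = closure(lhs, fds) >= relation
        if not trivial and not superkey:
            bad.append((sorted(lhs), sorted(rhs)))
    return bad


book = {"author", "nationality", "book_title", "genre", "pages"}
fds = [
    ({"author"}, {"nationality"}),
    ({"book_title"}, {"genre", "pages"}),
]
violations = bcnf_violations(book, fds)

# After decomposition, author is a key of the Author table, so nothing is flagged.
author_violations = bcnf_violations(
    {"author", "nationality"}, [({"author"}, {"nationality"})]
)
print(violations, author_violations)
```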