Normalization
Lecture 2
Assistant Lecturer Huda A. Alameen
hudaa.alameen@uokufa.edu.iq
Redundant information in tuples and update
anomalies
 Information is stored redundantly
 This wastes storage
 It also causes update anomalies:
•Insertion anomalies
•Deletion anomalies
•Update anomalies
Two relation schemas suffering from update
anomalies
Ename  Ssn  Bdate      Address  Dnumber  Dname  Dmgr_ssn
John   111  1/1/1990   A        5        R      33333
Smith  222  2/10/1988  B        5        R      33333
Zaid   112  5/8/1990   C        7        F      3335
NOOR   321  5/8/1990   Y        8        W      21333
Ali    342  2/11/1993  E        7        F      3335
Two relation schemas suffering from
update anomalies
• Insertion anomalies occur in the PROJECT-EMPLOYEE relation because we cannot insert
information about a new employee unless that employee is already assigned to a project:
no attribute of the composite key (Proj-ID, Emp-ID) can be NULL.
• Deletion anomalies occur when we delete the last tuple of a particular employee. In
this case, we delete not only the project information that connects that employee to a
particular project but also the information about the department for which this
employee works.
Two relation schemas suffering from
update anomalies
• Update anomalies occur because the department for which an employee works may
appear many times in the table. This redundancy causes the anomaly: if an employee
moves to another department, we must either search the entire table for every tuple
of that employee and update its Emp-Dpt value, or risk missing one or more of those
tuples and ending up with an inconsistent state.
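The anomalies described above can be reproduced on the employee/department table shown earlier. The following is a minimal Python sketch, not part of the original slides: it uses that table rather than PROJECT-EMPLOYEE, omits Bdate and Address for brevity, and the helper names and the manager number 99999 are hypothetical.

```python
# Rows taken from the employee/department table shown earlier
# (Bdate and Address omitted for brevity).
rows = [
    {"Ename": "John",  "Ssn": 111, "Dnumber": 5, "Dname": "R", "Dmgr_ssn": 33333},
    {"Ename": "Smith", "Ssn": 222, "Dnumber": 5, "Dname": "R", "Dmgr_ssn": 33333},
    {"Ename": "Zaid",  "Ssn": 112, "Dnumber": 7, "Dname": "F", "Dmgr_ssn": 3335},
    {"Ename": "NOOR",  "Ssn": 321, "Dnumber": 8, "Dname": "W", "Dmgr_ssn": 21333},
    {"Ename": "Ali",   "Ssn": 342, "Dnumber": 7, "Dname": "F", "Dmgr_ssn": 3335},
]


def managers_per_department(table):
    """Map each Dnumber to the set of Dmgr_ssn values stored for it."""
    result = {}
    for row in table:
        result.setdefault(row["Dnumber"], set()).add(row["Dmgr_ssn"])
    return result


def inconsistent_departments(table):
    """Departments that are recorded with more than one manager."""
    per_dept = managers_per_department(table)
    return sorted(d for d, mgrs in per_dept.items() if len(mgrs) > 1)


# Update anomaly: department 5 gets a new manager (99999 is a made-up value),
# but only one of its two tuples is updated.
rows[0]["Dmgr_ssn"] = 99999
print(inconsistent_departments(rows))

# Deletion anomaly: NOOR is the only employee of department 8, so deleting
# her tuple also removes every fact stored about department 8.
remaining = [r for r in rows if r["Ssn"] != 321]
print(8 in managers_per_department(remaining))
```

The first check reports department 5 as inconsistent because its manager is now stored with two different values; the second shows that department 8 no longer appears anywhere once its last employee is deleted.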
Normalization
 Normalization:
The process of decomposing unsatisfactory ("bad") relations by breaking up their
attributes into smaller relations.
 Normal form:
A condition, stated in terms of the keys and functional dependencies (FDs) of a
relation, that certifies whether the relation schema is in a particular normal form.
Functional Dependency
Example of FD constraints
Normal Forms
 First normal form
 Second normal form
 Third normal form
 Boyce-Codd normal form
 Fourth normal form
 Fifth normal form
1NF : First Normal Form
 Disallows
-Multivalued attributes
-Composite attributes
Normalization of nested relations into
1NF
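As an illustration of removing a nested relation, here is a minimal Python sketch. The nested EMP_PROJ layout, the sample values, and the `unnest` helper are all hypothetical and chosen only for this example: each employee row carries a nested list of projects, which is moved into its own relation with the employee key propagated into it.

```python
# Hypothetical nested relation: each employee row holds a nested
# relation "Projs" of (Pnumber, Hours) tuples, which 1NF disallows.
nested = [
    {"Ssn": 111, "Ename": "John",
     "Projs": [{"Pnumber": 1, "Hours": 32.5}, {"Pnumber": 2, "Hours": 7.5}]},
    {"Ssn": 222, "Ename": "Smith",
     "Projs": [{"Pnumber": 3, "Hours": 40.0}]},
]


def unnest(relation, nested_attr, key):
    """Split a nested relation into two flat (1NF) relations.

    The parent keeps every attribute except the nested one; the child gets
    one row per nested tuple, with the parent's key copied into it.
    """
    parent, child = [], []
    for row in relation:
        parent.append({a: v for a, v in row.items() if a != nested_attr})
        for inner in row[nested_attr]:
            child.append({key: row[key], **inner})
    return parent, child


emp, emp_proj = unnest(nested, "Projs", "Ssn")
for row in emp:
    print(row)
for row in emp_proj:
    print(row)
```

In the resulting child relation, the primary key is the combination of the propagated parent key and the partial key of the nested relation, here (Ssn, Pnumber).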
2NF : Second Normal Form
 Uses the concepts of FDs, primary key
 Definitions
Prime attribute: an attribute that is a member of the primary key K
Full functional dependency: an FD Y -> Z where removing any attribute from Y
means the FD no longer holds
 Examples:
{SSN, PNUMBER} -> HOURS is a full FD, since neither SSN -> HOURS nor PNUMBER
-> HOURS holds
{SSN, PNUMBER} -> ENAME is not a full FD (it is called a partial dependency),
since SSN -> ENAME also holds
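The two examples above can be checked mechanically on sample data. The sketch below is illustrative only: the rows and the helper names `fd_holds` and `is_full_fd` are hypothetical. Note also that an FD is a constraint on the schema, so a sample instance can refute an FD but can never prove that it holds in general.

```python
def fd_holds(rows, lhs, rhs):
    """True if lhs -> rhs is not violated by this relation instance."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen, then returns the stored value
        if seen.setdefault(x, y) != y:
            return False
    return True


def is_full_fd(rows, lhs, rhs):
    """True if lhs -> rhs holds and fails after removing any one attribute of lhs."""
    if not fd_holds(rows, lhs, rhs):
        return False
    for attr in lhs:
        reduced = [a for a in lhs if a != attr]
        if reduced and fd_holds(rows, reduced, rhs):
            return False  # a proper subset already determines rhs: partial dependency
    return True


# Hypothetical EMP_PROJ instance, invented for this example.
emp_proj = [
    {"SSN": 111, "PNUMBER": 1, "HOURS": 32.5, "ENAME": "John"},
    {"SSN": 111, "PNUMBER": 2, "HOURS": 7.5,  "ENAME": "John"},
    {"SSN": 222, "PNUMBER": 1, "HOURS": 20.0, "ENAME": "Smith"},
    {"SSN": 222, "PNUMBER": 3, "HOURS": 32.5, "ENAME": "Smith"},
]

print(is_full_fd(emp_proj, ["SSN", "PNUMBER"], ["HOURS"]))   # full FD
print(is_full_fd(emp_proj, ["SSN", "PNUMBER"], ["ENAME"]))   # partial dependency
print(fd_holds(emp_proj, ["SSN"], ["ENAME"]))                # the FD that makes it partial
```

Testing only the subsets obtained by dropping a single attribute is enough: if some smaller subset of Y determined Z, every superset of it would as well.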
2NF : Second Normal Form
 A relation schema R is in second normal form (2NF) if every non-
prime attribute A in R is fully functionally dependent on the primary
key
 R can be decomposed into 2NF relations via the process of 2NF
normalization
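One way to picture 2NF normalization is as a pair of projections. The sketch below is a hedged illustration with invented data and hypothetical helper names (`project`, `natural_join`): it assumes an EMP_PROJ relation with primary key {SSN, PNUMBER} in which ENAME depends only on SSN, moves ENAME into its own relation, and checks that joining the pieces gives back the original tuples.

```python
def project(rows, attrs):
    """Relational projection: keep attrs, drop duplicate tuples, preserve order."""
    seen, out = set(), []
    for row in rows:
        t = tuple(row[a] for a in attrs)
        if t not in seen:
            seen.add(t)
            out.append(dict(zip(attrs, t)))
    return out


def natural_join(r, s):
    """Join two non-empty relations on the attributes they share."""
    common = [a for a in r[0] if a in s[0]]
    return [{**x, **y} for x in r for y in s
            if all(x[a] == y[a] for a in common)]


# Hypothetical EMP_PROJ instance; ENAME is repeated for every project of an employee.
emp_proj = [
    {"SSN": 111, "PNUMBER": 1, "HOURS": 32.5, "ENAME": "John"},
    {"SSN": 111, "PNUMBER": 2, "HOURS": 7.5,  "ENAME": "John"},
    {"SSN": 222, "PNUMBER": 1, "HOURS": 20.0, "ENAME": "Smith"},
]

# Fully dependent attributes stay with the whole key;
# partially dependent ones move out with the part of the key they depend on.
works_on = project(emp_proj, ["SSN", "PNUMBER", "HOURS"])
employee = project(emp_proj, ["SSN", "ENAME"])

print(employee)  # each employee name is now stored once
rejoined = natural_join(works_on, employee)
print(len(rejoined) == len(emp_proj))
```

After the split, every non-prime attribute in each relation depends on that relation's whole key, and the join on SSN recovers the original rows without adding spurious ones.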
2NF : Second Normal Form
3NF : Third Normal Form
 Disallows Transitive Dependency
 Definition:
Transitive functional dependency: an FD X -> Z that can be
derived from two FDs X -> Y and Y -> Z
 Examples:
SSN -> DMGRSSN is a transitive FD,
since SSN -> DNUMBER and DNUMBER -> DMGRSSN hold
SSN -> ENAME is non-transitive,
since there is no set of attributes X where SSN -> X
and X -> ENAME
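The two examples above can be verified with an attribute-closure computation. This is a minimal sketch under stated assumptions: the FD list combines the three FDs named above with DNUMBER -> DNAME, which is assumed from the column names of the earlier table, and the function names are hypothetical. The check also requires that the intermediate set Y does not determine X, so that Y is not itself a key.

```python
def closure(attrs, fds):
    """All attributes functionally determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def transitive_via(x, z, fds):
    """Return an intermediate set Y with X -> Y and Y -> Z (Y not a key), or None."""
    x = frozenset(x)
    for lhs, _ in fds:
        y = frozenset(lhs)
        if y == x or z in y:
            continue
        x_determines_y = y <= closure(x, fds)
        y_determines_z = z in closure(y, fds)
        y_determines_x = x <= closure(y, fds)
        if x_determines_y and y_determines_z and not y_determines_x:
            return set(y)
    return None


# FDs named in the examples, plus DNUMBER -> DNAME (an assumption).
fds = [
    ({"SSN"}, {"ENAME"}),
    ({"SSN"}, {"DNUMBER"}),
    ({"DNUMBER"}, {"DNAME"}),
    ({"DNUMBER"}, {"DMGRSSN"}),
]

print(transitive_via({"SSN"}, "DMGRSSN", fds))  # the intermediate set is DNUMBER
print(transitive_via({"SSN"}, "ENAME", fds))    # no intermediate set exists
```

A common way to remove the transitive dependency found here is to move DNUMBER together with the attributes it determines into a separate department relation, keeping DNUMBER in the employee relation as a reference.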
3NF : Third Normal Form
