Chapter 5 Normalization An Normalization example
Learning Objectives What normalization is and what role it plays in the database design process About the normal forms 1NF, 2NF, 3NF, BCNF,  and 4NF How normal forms can be transformed from lower normal forms to higher normal forms That normalization and ER modeling are used concurrently to produce a good database design That some situations require denormalization to generate information efficiently
Normalization A process for evaluating and correcting table structure Minimize data redundancy Eliminate Anomalies
Is Normalization Necessary?  NO But it is helpful to maintain data integrity and consistency
Anomalies Update—requires update in multiple locations Deletion—A deletion may lose important information Insertion—Requires complete definitions, ie does see page 187 (an employee can not be entered unless he is assigned a project
Normalization Process 1 st  NF 2 nd  NF 3 rd  NF Almost for 90-98% application 3 rd  NF is sufficient
Dependency When an attribute value depends on attribute B then B is dependent on A A---  B or values of B can be determined by value of A, reverse may or may not be true Ex: ssn--  Name SSN, CID--  Grade
Un-normalized relation   Remove REPEATING groups 1st NF Remove PARTIAL dependency 2nd NF Remove TRANSIENT dependency 3rd NF Every determinant is a candidate key Boyce-CODD NF If we can convert a relation into 3NF almost 90-98% of anomalies are removed
The Need for Normalization ( continued ) Structure of data set in Figure 5.1 does not handle data very well The table structure  appears  to work; report is generated with ease Unfortunately, the report may yield different results, depending on what data anomaly has occurred
1 st  NF Remove repeating groups ASSIGNMENT (Proj_num, proj_name(Emp_num,E_name,job_class,chg_hours,Hour))
A Dependency Diagram:  First Normal Form (1NF)
Second Normal Form (2NF)  Conversion Results
Third Normal Form (3NF)  Conversion Results
Un-normalized form A relation is in un-normalized form, if it contains repeating group Typically shown in parentheses Ex:  PART NO   DESC.     VENDOR-NAME   ADDRESS  UNIT-COST  1234            LOGIC  INTEL                   SAN JOSE    150.00                       chip           LSI LOGIC            SAN JOSE    120.00  5678      MEMORY   INTEL                  SAN JOSE     50.00   chip
SUPPLIER  (Part_no, Part_DESC,  (Vendor_name, Vendor_address, Unit_cost ))
Another Way (Part_NO, V_NAME)-> Unit_cost Part_NO->P_Desc (Partial dependency) V-Name->V_DESC (Partial Dependency) Part_NO P-DESC V_NAME V_ADDRESS UNIT_COST
1 st  NF A relation is in 1 st  NF if it does NOT contain any repeating groups (Part_no, Part_DESC,  (Vendor_name, Vendor_address, Unit_cost )) 1 st  NF..remove repeating groups Break it into TWO relations One without repeating group and  ONE with repeating group AND PK of other relation S1 (Part_no, Part_DESC) S2 ( Vendor_name,  Part_no , Vendor_address, Unit_cost
2 nd  NF A relation is in 2nf NF if it is in 1stNF and it does not contain any  partial  dependency Partial dependency: A partial dependency exists if an attribute is dependent ONLY on PART of the PK and the WHOLE PK We must examine each relation for partial dependency NOTE: A partial dependency can only exist if there are more than ONE attribute as PK
S1 ( Part_no , Part_DESC) S2 ( Vendor_name, Part_no,  Vendor_address, Unit_cost Note S1 is already in 2 nd  NF since there is only attribute as PK In S2: Question is Vendor_address dependent on BOTH vendor_name AND Part_NO? Question is Unit_price dependent on BOTH vendor_name AND Part_NO?
Question is Vendor_address dependent on BOTH vendor_name AND Part_NO? Answer: NO Give me vendor_no and I can find vendor_address, we do NOT need Part_No to know vendor_address, ie Vendor_address depends ONLY Vendor_name, hence the partial dependency
Question is Unit_price dependent on BOTH vendor_name AND Part_NO? YES if you examine the table, price changes with vendor and part_no, ie price depnds on both Part_no AND which vendor supplies it
Remove Partial Dependency VENDOR  _ADDRESS VENDOR_name   UNIT_PRICE PART# Create TWO tables: One with Partial dependency and other without it S21 ( Vendor_name , vendor_address ) S22 (Vendor_name, Part_no ,  Unit_price )
3rd NF A relation is in 3 rd  NF if it is in 2 nd  NF and it does not contain any  transitive  dependency Transitive dependency: A transitive dependency exists when some of the non-key attributes are dependent on other non-key attributes
So far we have three relations that are in at least 2 nd  NF S1  ( Part_no , Part_DESC) S21 ( Vendor_name , vendor_address ) S22 (Vendor_name, Part_no ,  Unit_price ) S1, S21 & S22 are also in 3 rd  NF since there is ONLY ONE non_key attribute and transitive dependency can NOT exist
ERD PART PART-SUPPLIED VENDOR
Q6/p 184 A Using notation from the book (C1, C3)-   C2,C4,C5 (i.e., C2, C4, C5) are functionally dependent on C1 and C3 Above relation is in at least 1stNF, since there are No repeating groups C1  C2 there is PARTIAL dependency since C2 depends on PART of the PK and the whole PK C4--  C5 (transitive dependency since C5 ( a non-PK attribute) depends on another non-PK attribute (C4)
Part b
Part c
Q8/P187 Table P5.8 Sample ITEM Records May B. Next I. B. Rightonit I. B. Rightonit BLDG_MANAGER Canseefar Nottoclear Nottooclear  BLDG_NAME CSF NTC NTC BLDG_CODE 123 325 325 ROOM_NUMBER DT Scanner HP Toner HP DeskJet 895Cse ITEM_LABEL 254668-449 342245-225 231134-678 ITEM_ID Sample Value Sample Value Sample Value Attribute Name
ITEM_DESCRIPTION BLDG_CODE BLDG_NAME BLDG_MANAGER ITEM_ID BLDG_ROOM Problem 8 Solution
ITEM_DESCRIPTION BLDG_CODE BLDG_NAME EMP_FNAME EMP_INITIAL BLDG_NAME BLDG_MANAGER BLDG_CODE EMP_CODE EMP_LNAME Problem 9 Solution: All tables in 3NF ITEM_ID BLDG_ROOM ITEM_DESCRIPTION ITEM_ID BLDG_ROOM BLDG_CODE EMP_CODE
 
Denormalization Reversing normalization i.e from 3 rd  NF to 2 nd  NF Or 2 nd  to 1 st  NF
Convert into 3NF INVOICE (Inv_num, cust_num,lastname,Firstname,street,city,state,zip,date,(partnum, description,price,numshipped))

Lecture8 Normalization Aggarwal

  • 1.
    Chapter 5 NormalizationAn Normalization example
  • 2.
    Learning Objectives Whatnormalization is and what role it plays in the database design process About the normal forms 1NF, 2NF, 3NF, BCNF, and 4NF How normal forms can be transformed from lower normal forms to higher normal forms That normalization and ER modeling are used concurrently to produce a good database design That some situations require denormalization to generate information efficiently
  • 3.
    Normalization A processfor evaluating and correcting table structure Minimize data redundancy Eliminate Anomalies
  • 4.
    Is Normalization Necessary? NO But it is helpful to maintain data integrity and consistency
  • 5.
    Anomalies Update—requires updatein multiple locations Deletion—A deletion may lose important information Insertion—Requires complete definitions, ie does see page 187 (an employee can not be entered unless he is assigned a project
  • 6.
    Normalization Process 1st NF 2 nd NF 3 rd NF Almost for 90-98% application 3 rd NF is sufficient
  • 7.
    Dependency When anattribute value depends on attribute B then B is dependent on A A---  B or values of B can be determined by value of A, reverse may or may not be true Ex: ssn--  Name SSN, CID--  Grade
  • 8.
    Un-normalized relation Remove REPEATING groups 1st NF Remove PARTIAL dependency 2nd NF Remove TRANSIENT dependency 3rd NF Every determinant is a candidate key Boyce-CODD NF If we can convert a relation into 3NF almost 90-98% of anomalies are removed
  • 9.
    The Need forNormalization ( continued ) Structure of data set in Figure 5.1 does not handle data very well The table structure appears to work; report is generated with ease Unfortunately, the report may yield different results, depending on what data anomaly has occurred
  • 10.
    1 st NF Remove repeating groups ASSIGNMENT (Proj_num, proj_name(Emp_num,E_name,job_class,chg_hours,Hour))
  • 11.
    A Dependency Diagram: First Normal Form (1NF)
  • 12.
    Second Normal Form(2NF) Conversion Results
  • 13.
    Third Normal Form(3NF) Conversion Results
  • 14.
    Un-normalized form Arelation is in un-normalized form, if it contains repeating group Typically shown in parentheses Ex: PART NO   DESC.     VENDOR-NAME   ADDRESS  UNIT-COST  1234            LOGIC INTEL                   SAN JOSE    150.00                       chip           LSI LOGIC            SAN JOSE    120.00  5678     MEMORY   INTEL                  SAN JOSE     50.00 chip
  • 15.
    SUPPLIER (Part_no,Part_DESC, (Vendor_name, Vendor_address, Unit_cost ))
  • 16.
    Another Way (Part_NO,V_NAME)-> Unit_cost Part_NO->P_Desc (Partial dependency) V-Name->V_DESC (Partial Dependency) Part_NO P-DESC V_NAME V_ADDRESS UNIT_COST
  • 17.
    1 st NF A relation is in 1 st NF if it does NOT contain any repeating groups (Part_no, Part_DESC, (Vendor_name, Vendor_address, Unit_cost )) 1 st NF..remove repeating groups Break it into TWO relations One without repeating group and ONE with repeating group AND PK of other relation S1 (Part_no, Part_DESC) S2 ( Vendor_name, Part_no , Vendor_address, Unit_cost
  • 18.
    2 nd NF A relation is in 2nf NF if it is in 1stNF and it does not contain any partial dependency Partial dependency: A partial dependency exists if an attribute is dependent ONLY on PART of the PK and the WHOLE PK We must examine each relation for partial dependency NOTE: A partial dependency can only exist if there are more than ONE attribute as PK
  • 19.
    S1 ( Part_no, Part_DESC) S2 ( Vendor_name, Part_no, Vendor_address, Unit_cost Note S1 is already in 2 nd NF since there is only attribute as PK In S2: Question is Vendor_address dependent on BOTH vendor_name AND Part_NO? Question is Unit_price dependent on BOTH vendor_name AND Part_NO?
  • 20.
    Question is Vendor_addressdependent on BOTH vendor_name AND Part_NO? Answer: NO Give me vendor_no and I can find vendor_address, we do NOT need Part_No to know vendor_address, ie Vendor_address depends ONLY Vendor_name, hence the partial dependency
  • 21.
    Question is Unit_pricedependent on BOTH vendor_name AND Part_NO? YES if you examine the table, price changes with vendor and part_no, ie price depnds on both Part_no AND which vendor supplies it
  • 22.
    Remove Partial DependencyVENDOR _ADDRESS VENDOR_name UNIT_PRICE PART# Create TWO tables: One with Partial dependency and other without it S21 ( Vendor_name , vendor_address ) S22 (Vendor_name, Part_no , Unit_price )
  • 23.
    3rd NF Arelation is in 3 rd NF if it is in 2 nd NF and it does not contain any transitive dependency Transitive dependency: A transitive dependency exists when some of the non-key attributes are dependent on other non-key attributes
  • 24.
    So far wehave three relations that are in at least 2 nd NF S1 ( Part_no , Part_DESC) S21 ( Vendor_name , vendor_address ) S22 (Vendor_name, Part_no , Unit_price ) S1, S21 & S22 are also in 3 rd NF since there is ONLY ONE non_key attribute and transitive dependency can NOT exist
  • 25.
  • 26.
    Q6/p 184 AUsing notation from the book (C1, C3)-  C2,C4,C5 (i.e., C2, C4, C5) are functionally dependent on C1 and C3 Above relation is in at least 1stNF, since there are No repeating groups C1  C2 there is PARTIAL dependency since C2 depends on PART of the PK and the whole PK C4--  C5 (transitive dependency since C5 ( a non-PK attribute) depends on another non-PK attribute (C4)
  • 27.
  • 28.
  • 29.
    Q8/P187 Table P5.8Sample ITEM Records May B. Next I. B. Rightonit I. B. Rightonit BLDG_MANAGER Canseefar Nottoclear Nottooclear BLDG_NAME CSF NTC NTC BLDG_CODE 123 325 325 ROOM_NUMBER DT Scanner HP Toner HP DeskJet 895Cse ITEM_LABEL 254668-449 342245-225 231134-678 ITEM_ID Sample Value Sample Value Sample Value Attribute Name
  • 30.
    ITEM_DESCRIPTION BLDG_CODE BLDG_NAMEBLDG_MANAGER ITEM_ID BLDG_ROOM Problem 8 Solution
  • 31.
    ITEM_DESCRIPTION BLDG_CODE BLDG_NAMEEMP_FNAME EMP_INITIAL BLDG_NAME BLDG_MANAGER BLDG_CODE EMP_CODE EMP_LNAME Problem 9 Solution: All tables in 3NF ITEM_ID BLDG_ROOM ITEM_DESCRIPTION ITEM_ID BLDG_ROOM BLDG_CODE EMP_CODE
  • 32.
  • 33.
    Denormalization Reversing normalizationi.e from 3 rd NF to 2 nd NF Or 2 nd to 1 st NF
  • 34.
    Convert into 3NFINVOICE (Inv_num, cust_num,lastname,Firstname,street,city,state,zip,date,(partnum, description,price,numshipped))