Database Normalization

Brief introductory exercise on database normalization to 3rd normal form.

Published in: Data & Analytics

  1. DATABASE NORMALIZATION
     Dan D’Urso
     Orange Coast Database Associates
     Laguna Niguel, Orange County, California
     www.ocdatabases.com
     slides.1@ocdatabases.com
     Twitter: @ocdatabases
  2. DATABASE NORMALIZATION ACTIVITY
     • Activity extracted from a college textbook
     • Shows how to put a database into 3rd normal form
     • Explains the three normal forms and the normalization process
     • Additional exercises with answers at the end
  3. DATABASE NORMALIZATION ACTIVITY
     • Tracks large industrial equipment placed in large chemical plants in the South
     • Each piece of equipment has a unique identifier based on the plant and equipment type
     • The starting point is a flat Excel spreadsheet
  4. NORMALIZATION
     • 1st normal form
     • 2nd normal form (implies 1st)
     • 3rd normal form (implies 2nd)
     3rd normal form is the usual goal. Normal forms go all the way to 5th, but we will stick to 3rd for this brief introductory exercise.
  5. TABLE DESCRIPTIONS
     • This is a description of the original table using a common text-only notation (table name followed in parentheses by the columns, with primary keys underlined, required fields bolded, foreign keys italicized):
       Equipment (plant_name, eqpt_name, plant_mgr, eqpt_mfgr, mfgr_addr)
     • This is our 3rd normal form goal:
       Equipment (plant_name, eqpt_name, eqpt_mfgr)
       Plants (plant_name, plant_mgr)
       Manufacturers (eqpt_mfgr, mfgr_addr)
  6. 1ST NORMAL FORM
     • All rows unique
     • All values in the same column have the same data type
     • All columns have a unique name
     • All cells atomic
     • No repeating groups or multi-valued attributes
     • Order of rows and columns does not matter
  7. REPEATING GROUPS
     • Although called groups, the concept applies to single columns as well, for example phone numbers: work, home, cell, etc.
     • Here is a simplified example with student grades:

       StudentID | Subj1 | Grade1 | Subj2  | Grade2
       120       | Math  | B      | French | A-

     • Anomaly: what happens if there is a 3rd subject?
     • You have to add two more columns, forcing a redesign of forms, queries and reports.
     • Subjects and grades should be moved to a separate table with a foreign key referencing StudentID.
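The move from repeating column pairs to a separate table can be sketched in Python. This is only an illustration: the function name `unpivot_grades` and the resulting `(StudentID, Subject, Grade)` row layout are assumptions, and the input row is the one from the slide.

```python
# Wide row copied from the slide's example table.
wide_rows = [
    {"StudentID": 120, "Subj1": "Math", "Grade1": "B", "Subj2": "French", "Grade2": "A-"},
]


def unpivot_grades(rows):
    """Turn each SubjN/GradeN column pair into its own (StudentID, Subject, Grade) row."""
    result = []
    for row in rows:
        n = 1
        # Walk Subj1, Subj2, ... until the numbered column no longer exists.
        while f"Subj{n}" in row:
            subject = row[f"Subj{n}"]
            if subject is not None:  # skip unused slots in the wide layout
                result.append((row["StudentID"], subject, row[f"Grade{n}"]))
            n += 1
    return result


student_grades = unpivot_grades(wide_rows)
```

With this layout a 3rd subject is just one more row, so no column has to be added.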
  8. MULTI-PART FIELDS
     • NOT part of normalization, per se
     • But you may want to consider them while looking at 1st normal form
     • Example: customer_name: Bob Smith
     • Split into customer_first_name: Bob, customer_last_name: Smith
     • This makes searching and sorting by last name (or first name) possible
     • Generally you would want to split the parts in a database and recombine them via a query for reports. But this depends on your business rules. You may want to preserve the field as is.
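A minimal sketch of the split-and-recombine idea, assuming the simplest possible rule (everything before the last space is the first name). The helper `split_name` and the second sample name are hypothetical, and real names often need business-specific handling.

```python
def split_name(customer_name):
    """Naively split 'First Last' at the last space; returns (first, last)."""
    first, _, last = customer_name.strip().rpartition(" ")
    return first, last


# "Bob Smith" is the slide's example; "Ann Jones" is a hypothetical extra row.
customers = [split_name(name) for name in ["Bob Smith", "Ann Jones"]]

# Sorting by last name is now a plain sort on the second part.
by_last_name = sorted(customers, key=lambda parts: parts[1])

# Recombining for a report is a simple join of the parts.
report_names = [f"{first} {last}" for first, last in by_last_name]
```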
  9. DATABASE NORMALIZATION

       Plant Name | Eqpt Name                 | Plant Mgr | Eqpt Mfgr     | Mfgr Addr
       ethylene   | Final cooler, feed heater | Jim Smith | ABC Exchanger | 1247 Locust
       styrene    | Final cooler              | Bill Gunn | Delta Supply  | 88 Canal
       styrene    | Feed heater               | Bill Gunn | ABC Exchanger | 1247 Locust

     Primary keys underlined on the slide (Plant Name, Eqpt Name).
     Violates 1st normal form. Why?
  10. 1ST NORMAL FORM PROBLEM
     • Violates 1st normal form
     • One cell contains both final cooler and feed heater, which violates the rule that all cells must be atomic

       Plant Name | Eqpt Name
       ethylene   | Final cooler, feed heater
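As a rough sketch, the non-atomic cell can be split into one row per value. Only the two columns shown on this slide are handled, because the remaining columns (manager, manufacturer, address) cannot be derived mechanically for the new rows. The function name `split_multivalued` and the comma-as-separator rule are assumptions.

```python
# (Plant Name, Eqpt Name) pairs from the original spreadsheet.
rows = [
    ("ethylene", "Final cooler, feed heater"),
    ("styrene", "Final cooler"),
    ("styrene", "Feed heater"),
]


def split_multivalued(rows):
    """Emit one (plant, equipment) row per comma-separated equipment name."""
    atomic = []
    for plant, eqpt_cell in rows:
        for name in eqpt_cell.split(","):
            # Trim whitespace and normalize capitalization ("feed heater" -> "Feed heater").
            atomic.append((plant, name.strip().capitalize()))
    return atomic


atomic_rows = split_multivalued(rows)
```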
  11. DATABASE NORMALIZATION

       Plant Name | Eqpt Name    | Plant Mgr | Eqpt Mfgr     | Mfgr Addr
       ethylene   | Final cooler | Jim Smith | ABC Exchanger | 1247 Locust
       ethylene   | Feed heater  | Jim Smith | XYZ Pumps     | 432 Broadway
       styrene    | Final cooler | Bill Gunn | Delta Supply  | 88 Canal
       styrene    | Feed heater  | Bill Gunn | XYZ Pumps     | 432 Broadway

     1st normal form satisfied.
  12. 2ND NORMAL FORM
     • All non-key fields must depend on the entire key, not just part of the key
     • A field that depends on only part of the key is called a partial key dependency
     • Partial key dependency example (composite PK is SKU, Loc) in an inventory database which tracks quantity on hand by warehouse:

       SKU | Loc | QOH | Descr | Price
       12a | TU  | 15  | Bolts | 5.95

     • In this example the description and price fields depend only on the SKU. The QOH field can remain, as it represents the quantity on hand in a particular warehouse.
     • The Descr and Price fields should be moved into their own table with a primary key of SKU.
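The inventory split might look like the following sketch. The first row is from the slide; the second row (warehouse "LA", quantity 40) is hypothetical and added only to make the redundancy visible. The table names `items` and `stock` are assumptions.

```python
inventory = [
    {"SKU": "12a", "Loc": "TU", "QOH": 15, "Descr": "Bolts", "Price": 5.95},
    # Hypothetical second warehouse row: Descr and Price repeat for the same SKU.
    {"SKU": "12a", "Loc": "LA", "QOH": 40, "Descr": "Bolts", "Price": 5.95},
]

# Items table: primary key SKU, holds the fields that depend on SKU alone.
items = {}
for row in inventory:
    items[row["SKU"]] = {"Descr": row["Descr"], "Price": row["Price"]}

# Stock table: composite key (SKU, Loc), keeps only the field that needs the whole key.
stock = {(row["SKU"], row["Loc"]): row["QOH"] for row in inventory}
```

After the split, the description and price are stored once per SKU no matter how many warehouses carry it.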
  13. DATABASE NORMALIZATION

       Plant Name | Eqpt Name    | Plant Mgr | Eqpt Mfgr     | Mfgr Addr
       ethylene   | Final cooler | Jim Smith | ABC Exchanger | 1247 Locust
       ethylene   | Feed heater  | Jim Smith | XYZ Pumps     | 432 Broadway
       styrene    | Final cooler | Bill Gunn | Delta Supply  | 88 Canal
       styrene    | Feed heater  | Bill Gunn | XYZ Pumps     | 432 Broadway

     1st normal form satisfied.
     Still violates 2nd normal form. Why?
  14. 2ND NORMAL FORM PROBLEM
     • Violates the 2nd normal form rule: contains a partial key dependency
     • Plant manager depends only on plant name
     • Plant manager should be moved to a separate table with plant name as PK

       Plant Name | Eqpt Name    | Plant Mgr
       ethylene   | Final cooler | Jim Smith
       ethylene   | Feed heater  | Jim Smith
       styrene    | Final cooler | Bill Gunn
       styrene    | Feed heater  | Bill Gunn
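One way to look for partial key dependencies in sample data is to test whether a single key column already determines a non-key column. The sketch below uses the 1st normal form table from the slides; the helper `determines` and the snake_case column names are assumptions. Sample rows can only refute a dependency or make it plausible, so the business rules still have the final word.

```python
def determines(rows, lhs, rhs):
    """True if equal values in the lhs columns always go with equal rhs values in these rows."""
    seen = {}
    for row in rows:
        lhs_values = tuple(row[col] for col in lhs)
        # setdefault stores the first rhs value seen for this lhs combination.
        if seen.setdefault(lhs_values, row[rhs]) != row[rhs]:
            return False
    return True


columns = ["plant_name", "eqpt_name", "plant_mgr", "eqpt_mfgr", "mfgr_addr"]
equipment_1nf = [
    dict(zip(columns, values))
    for values in [
        ("ethylene", "Final cooler", "Jim Smith", "ABC Exchanger", "1247 Locust"),
        ("ethylene", "Feed heater", "Jim Smith", "XYZ Pumps", "432 Broadway"),
        ("styrene", "Final cooler", "Bill Gunn", "Delta Supply", "88 Canal"),
        ("styrene", "Feed heater", "Bill Gunn", "XYZ Pumps", "432 Broadway"),
    ]
]

primary_key = ["plant_name", "eqpt_name"]
non_key = ["plant_mgr", "eqpt_mfgr", "mfgr_addr"]

# Collect (key part, non-key column) pairs where one key column alone is enough.
partial_dependencies = [
    (part, col)
    for col in non_key
    for part in primary_key
    if determines(equipment_1nf, [part], col)
]
```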
  15. DATABASE NORMALIZATION

       Plant Name | Eqpt Name    | Eqpt Mfgr     | Mfgr Addr
       ethylene   | Final cooler | ABC Exchanger | 1247 Locust
       ethylene   | Feed heater  | XYZ Pumps     | 432 Broadway
       styrene    | Final cooler | Delta Supply  | 88 Canal
       styrene    | Feed heater  | XYZ Pumps     | 432 Broadway

       Plant Name | Plant Mgr
       ethylene   | Jim Smith
       styrene    | Bill Gunn

     2nd normal form OK: no partial key dependencies.
  16. 3RD NORMAL FORM
     • Non-key columns must not depend on another non-key column
     • Such a dependency is called a transitive dependency
     • Transitive dependency example (OrderID is PK):

       OrderID | Order Date | Order Amount | CustomerID | Customer Name
       1021    | 12/12/2012 | 150.00       | CA88       | Bob Smith

     • In this example the customer name depends on the CustomerID, which is outside the key.
     • It should be moved to a separate table with CustomerID as the PK column (along with other columns such as address and city in a more complete example).
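The order example can be sketched as follows. The first order is from the slide; the second order (1022) and the corrected name "Robert Smith" are hypothetical, added to show why the split helps. The variable names are assumptions.

```python
orders_flat = [
    {"OrderID": 1021, "OrderDate": "12/12/2012", "OrderAmount": 150.00,
     "CustomerID": "CA88", "CustomerName": "Bob Smith"},
    # Hypothetical second order by the same customer: the name is stored twice.
    {"OrderID": 1022, "OrderDate": "12/13/2012", "OrderAmount": 75.00,
     "CustomerID": "CA88", "CustomerName": "Bob Smith"},
]

# Customers table: CustomerID is the primary key.
customers = {row["CustomerID"]: row["CustomerName"] for row in orders_flat}

# Orders table: everything except the transitively dependent column.
orders = [
    {col: value for col, value in row.items() if col != "CustomerName"}
    for row in orders_flat
]

# A name correction is now one update instead of one per order.
customers["CA88"] = "Robert Smith"
names_on_orders = [customers[order["CustomerID"]] for order in orders]
```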
  17. DATABASE NORMALIZATION

       Plant Name | Eqpt Name    | Eqpt Mfgr     | Mfgr Addr
       ethylene   | Final cooler | ABC Exchanger | 1247 Locust
       ethylene   | Feed heater  | XYZ Pumps     | 432 Broadway
       styrene    | Final cooler | Delta Supply  | 88 Canal
       styrene    | Feed heater  | XYZ Pumps     | 432 Broadway

       Plant Name | Plant Mgr
       ethylene   | Jim Smith
       styrene    | Bill Gunn

     2nd normal form OK: no partial key dependencies.
     Why does it violate 3rd normal form?
  18. 3RD NORMAL FORM PROBLEM
     • Violates 3rd normal form
     • Contains a transitive dependency
     • If you know the equipment mfgr, you know the address
     • Mfgr address should be moved to a separate table with eqpt mfgr as PK

       Eqpt Mfgr     | Mfgr Addr
       ABC Exchanger | 1247 Locust
       XYZ Pumps     | 432 Broadway
       Delta Supply  | 88 Canal
       XYZ Pumps     | 432 Broadway
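Extracting the manufacturers table could be sketched like this, using the 2nd normal form equipment rows from the slides. The variable names and the consistency check (raising an error if one manufacturer shows two addresses) are assumptions added for illustration.

```python
# (Plant Name, Eqpt Name, Eqpt Mfgr, Mfgr Addr) rows from the 2nd normal form table.
equipment_2nf = [
    ("ethylene", "Final cooler", "ABC Exchanger", "1247 Locust"),
    ("ethylene", "Feed heater", "XYZ Pumps", "432 Broadway"),
    ("styrene", "Final cooler", "Delta Supply", "88 Canal"),
    ("styrene", "Feed heater", "XYZ Pumps", "432 Broadway"),
]

# Manufacturers table: eqpt mfgr is the primary key, so each address is stored once.
manufacturers = {}
for _plant, _eqpt, mfgr, addr in equipment_2nf:
    # setdefault keeps the first address seen; a different one means inconsistent data.
    if manufacturers.setdefault(mfgr, addr) != addr:
        raise ValueError(f"conflicting addresses for {mfgr}")

# Equipment table keeps the manufacturer name as a foreign key and drops the address.
equipment_3nf = [(plant, eqpt, mfgr) for plant, eqpt, mfgr, _addr in equipment_2nf]
```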
  19. DATABASE NORMALIZATION

       Plant Name | Eqpt Name    | Eqpt Mfgr
       ethylene   | Final cooler | ABC Exchanger
       ethylene   | Feed heater  | XYZ Pumps
       styrene    | Final cooler | Delta Supply
       styrene    | Feed heater  | XYZ Pumps

       Plant Name | Plant Mgr
       ethylene   | Jim Smith
       styrene    | Bill Gunn

       Eqpt Mfgr     | Mfgr Addr
       ABC Exchanger | 1247 Locust
       Delta Supply  | 88 Canal
       XYZ Pumps     | 432 Broadway

     Satisfies 3rd normal form: no transitive dependencies.
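As a sketch, the three 3rd normal form tables can be created in an in-memory SQLite database with Python's standard `sqlite3` module, and a join shows that the original flat table can still be reproduced. The column types and NOT NULL constraints are assumptions; the table and column names follow the text notation used earlier in the deck.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE plants (
    plant_name TEXT PRIMARY KEY,
    plant_mgr  TEXT NOT NULL
);
CREATE TABLE manufacturers (
    eqpt_mfgr TEXT PRIMARY KEY,
    mfgr_addr TEXT NOT NULL
);
CREATE TABLE equipment (
    plant_name TEXT NOT NULL REFERENCES plants(plant_name),
    eqpt_name  TEXT NOT NULL,
    eqpt_mfgr  TEXT NOT NULL REFERENCES manufacturers(eqpt_mfgr),
    PRIMARY KEY (plant_name, eqpt_name)
);
""")

# Parent tables first, so the foreign keys in equipment can be satisfied.
conn.executemany("INSERT INTO plants VALUES (?, ?)",
                 [("ethylene", "Jim Smith"), ("styrene", "Bill Gunn")])
conn.executemany("INSERT INTO manufacturers VALUES (?, ?)",
                 [("ABC Exchanger", "1247 Locust"),
                  ("Delta Supply", "88 Canal"),
                  ("XYZ Pumps", "432 Broadway")])
conn.executemany("INSERT INTO equipment VALUES (?, ?, ?)",
                 [("ethylene", "Final cooler", "ABC Exchanger"),
                  ("ethylene", "Feed heater", "XYZ Pumps"),
                  ("styrene", "Final cooler", "Delta Supply"),
                  ("styrene", "Feed heater", "XYZ Pumps")])

# Joining the three tables rebuilds the 1st normal form view for reports.
flat_rows = conn.execute("""
    SELECT e.plant_name, e.eqpt_name, p.plant_mgr, e.eqpt_mfgr, m.mfgr_addr
    FROM equipment AS e
    JOIN plants AS p ON p.plant_name = e.plant_name
    JOIN manufacturers AS m ON m.eqpt_mfgr = e.eqpt_mfgr
    ORDER BY e.plant_name, e.eqpt_name
""").fetchall()
```

Each fact (a plant's manager, a manufacturer's address) is stored in exactly one row, while the join still returns the four rows of the flat table.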
  20. NORMALIZATION EXERCISES
     • Normalize the following…
     • Scenario: recording labor time spent on factory work orders
       Labor(badgeno, workorderno, full name, salary, workorderno, workorder_description, workorder_Budget)
       [hint: is there a partial key dependency? Is there another action you might want to take?]
     • Scenario: admitting patients to a single hospital
       Patient(patientno, firstname, lastname, admission_date, assigned_doctorno, doctor’s specialty(ies), symptom_code, symptom_description)
       [hint: look for repeating groups]
     • Scenario: tracking parts inventory by part # and warehouse
       Inventory(partno, warehouseid, quantity on hand, reorder qty, part description, warehouse location)
     • Do it in steps: 1st normal form, 2nd normal form, 3rd normal form
  21. CONCLUSION
     • 3rd normal form is a common goal of normalization
     • Three stages:
       • 1st normal form: eliminate non-atomic cells and repeating groups
       • 2nd normal form: remove partial key dependencies
       • 3rd normal form: eliminate transitive dependencies