Database Management Systems 4 - Normalization


Published in: Technology

  1. Database Management Systems: Data Modelling Part 2 - Normalization. By Nickkisha Farrell, BSc IT, Dip Ed. February 2014
  2. IN THIS PRESENTATION
     • Entity and Referential Integrity
     • Physical Database Design: tables, primary keys, foreign keys
     • Normalization: 1st, 2nd, 3rd Normal Forms
     • Top-down versus Bottom-up Design
  3. TABLES / RELATIONS
     When creating a table (also called a relation):
     • Each attribute value must be a single (atomic) value.
     • All values for a given attribute must be of the same data type.
     • Each attribute (column) name must be unique.
     • The order of attributes (columns) is insignificant.
     • No two tuples (rows) in a relation should be identical.
     • The order of the tuples (rows) is insignificant.
  4. ENTITY AND REFERENTIAL INTEGRITY
     • An entity typically corresponds to a relation.
     • Thus an entity's attributes become attributes of the relation.
     • These attributes are represented by columns in the relation.
  5. KEYS
     • Keys play a very important role in relational databases. They are used to establish and identify relationships between tables. They also ensure that each record can be uniquely identified by a combination of one or more fields in a table.
  6. PRIMARY & FOREIGN KEYS
     • Foreign key: a field in a table that matches the primary key column of another table. The purpose of the foreign key is to ensure referential integrity of the data: every foreign key value must match an existing primary key value in the referenced table, so only values that are supposed to appear in the database are permitted.
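
As a rough illustration of referential integrity, the sketch below uses Python's built-in sqlite3 module. The customer and customer_order tables and their columns are hypothetical names chosen for this example; they are not from the slides. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON, so the sketch switches it on first.

```python
import sqlite3


def demo_referential_integrity():
    """Return True if SQLite rejects an order that points at a missing customer."""
    conn = sqlite3.connect(":memory:")
    # SQLite leaves foreign key enforcement off unless it is enabled per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    # Hypothetical parent table: customer_id is the primary key.
    conn.execute(
        "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    # Hypothetical child table: customer_id is a foreign key to customer.
    conn.execute(
        "CREATE TABLE customer_order ("
        " order_id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customer(customer_id))"
    )
    conn.execute("INSERT INTO customer VALUES (1, 'Ada')")
    conn.execute("INSERT INTO customer_order VALUES (100, 1)")  # customer 1 exists
    try:
        # Customer 99 does not exist, so this row would break referential integrity.
        conn.execute("INSERT INTO customer_order VALUES (101, 99)")
    except sqlite3.IntegrityError:
        rejected = True
    else:
        rejected = False
    finally:
        conn.close()
    return rejected
```

Calling demo_referential_integrity() should report that the second insert was rejected, because no customer row has the key 99.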
  7. FUNCTIONAL DEPENDENCIES
     • A functional dependency describes a relationship between attributes within a single table.
     • An attribute is functionally dependent on another if we can use the value of one attribute to determine the value of the other.
     • Example: Employee_Name is functionally dependent on Social_Security_Number because Social_Security_Number can be used to uniquely determine the value of Employee_Name.
     • The arrow symbol → is used to indicate a functional dependency. X → Y is read "X functionally determines Y".
  8. FUNCTIONAL DEPENDENCIES
     Here are a few more examples:
     - Student_ID → Student_Major
     - Student_ID, Course_Number, Semester → Grade
     - Car_Model, Options, TaxRate → Car_Price
     • The attributes listed on the left-hand side of the → are called determinants.
     • One can read A → B as:
       • A determines B.
       • Given a value for A, we can determine exactly one value for B.
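
The reading "given a value for A, we can determine one value for B" can be checked mechanically against sample rows. The helper below is a minimal sketch: the function name holds and the sample students data are made up for illustration. Sample data can only show that a dependency is violated; it can never prove that the dependency holds in general.

```python
def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent is not violated by rows.

    rows is a list of dicts; determinant and dependent are lists of column names.
    """
    seen = {}
    for row in rows:
        lhs = tuple(row[col] for col in determinant)
        rhs = tuple(row[col] for col in dependent)
        # The first time we see lhs we remember its rhs; any later mismatch
        # means one determinant value maps to two different dependent values.
        if seen.setdefault(lhs, rhs) != rhs:
            return False
    return True


# Hypothetical sample data for illustration only.
students = [
    {"Student_ID": 1, "Student_Major": "IT", "Course_Number": "DB101", "Grade": "A"},
    {"Student_ID": 1, "Student_Major": "IT", "Course_Number": "CS102", "Grade": "B"},
    {"Student_ID": 2, "Student_Major": "Math", "Course_Number": "DB101", "Grade": "B"},
]

major_ok = holds(students, ["Student_ID"], ["Student_Major"])            # True
grade_by_id = holds(students, ["Student_ID"], ["Grade"])                 # False
grade_by_both = holds(students, ["Student_ID", "Course_Number"], ["Grade"])  # True
```

Student 1 has two different grades, so Student_ID alone does not determine Grade in this sample, while the combination of Student_ID and Course_Number does.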
  9. NORMALIZATION
     • Normalization is a process in which we systematically examine relations for anomalies and, when anomalies are detected, remove them by splitting the relation into two or more new, related relations.
     • In a nutshell, normalization is the process of efficiently organizing data in a database.
  10. NORMALIZATION
     • Normalization is a relational database concept. If you have created a correct entity model, the tables created during design will generally conform to the rules of normalization.
     • Normalization can also be thought of as a trade-off between data redundancy and performance. Normalizing a relation reduces data redundancy but introduces the need for joins whenever an application, such as a report query, requires all of the data.
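
To make the trade-off concrete, here is a small sketch using Python's sqlite3 module. The department and employee tables are hypothetical. Each department name is stored once, which removes redundancy, but a report that lists employees with their department names has to join the two tables.

```python
import sqlite3


def employee_report():
    """Build a tiny normalized schema and return a report that needs a join."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Hypothetical schema: each department name is stored exactly once.
        CREATE TABLE department (
            dept_id   INTEGER PRIMARY KEY,
            dept_name TEXT NOT NULL
        );
        CREATE TABLE employee (
            emp_id   INTEGER PRIMARY KEY,
            emp_name TEXT NOT NULL,
            dept_id  INTEGER NOT NULL REFERENCES department(dept_id)
        );
        INSERT INTO department VALUES (10, 'Sales'), (20, 'IT');
        INSERT INTO employee VALUES (1, 'Ada', 20), (2, 'Ben', 10), (3, 'Cy', 20);
        """
    )
    # The report needs data from both tables, so it pays for a join.
    rows = conn.execute(
        "SELECT e.emp_name, d.dept_name"
        " FROM employee AS e"
        " JOIN department AS d ON d.dept_id = e.dept_id"
        " ORDER BY e.emp_id"
    ).fetchall()
    conn.close()
    return rows
```

Renaming a department now touches a single row in department, at the cost of the join in every query that needs the name.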
  11. NORMAL FORMS
     • There is a series of guidelines for ensuring that databases are normalized. These are divided into:
       • 1NF – First Normal Form
       • 2NF – Second Normal Form
       • 3NF – Third Normal Form
       • BCNF – Boyce-Codd Normal Form
       • 4NF – Fourth Normal Form
       • 5NF – Fifth Normal Form
     • "Third normal form is the generally accepted goal for a database design that eliminates redundancy."
     • 4NF and 5NF are rarely seen and won't be discussed in this chapter.
  12. NORMALIZATION RULES
     Normal form and rule description:
     • First Normal Form (1NF): The table contains no repeating groups, i.e. no columns are repeated.
     • Second Normal Form (2NF): The table must be in 1NF. Every attribute must be dependent upon the entity's entire unique identifier (UID).
     • Third Normal Form (3NF): The table must be in 2NF. No non-UID attribute can be dependent on another non-UID attribute.
     "Each non-primary key value MUST be dependent on the key, the whole key, and nothing but the key."
  13. FIRST NORMAL FORM – 1NF
     The table must be a two-dimensional structure whose rows and columns are unordered. A table is considered to be in first normal form if it contains no repeating groups.
     • Steps to Remove Repeating Groups
       1. Remove the repeating columns from the original table.
       2. Create separate tables for each group of related data.
       3. Identify each row with a unique column or set of columns (the primary key).
       4. Create a foreign key in the new table to link back to the original table.
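
The steps for removing repeating groups can be sketched in plain Python. Everything here is illustrative: the contacts data, the phone1/phone2 columns and the to_first_normal_form helper are assumptions, not part of the slides.

```python
def to_first_normal_form(rows, key, repeating, value_name):
    """Move a repeating group of columns into a separate child table.

    rows: list of dicts (the unnormalized table)
    key: primary key column of the original table
    repeating: the repeated columns, e.g. ["phone1", "phone2"]
    value_name: column name for the values in the new table
    """
    base, child = [], []
    for row in rows:
        # Step 1: the base table keeps everything except the repeating columns.
        base.append({c: v for c, v in row.items() if c not in repeating})
        # Steps 2-4: one child row per value; (key, value_name) identifies the
        # row, and key doubles as the foreign key back to the base table.
        for col in repeating:
            if row[col] is not None:
                child.append({key: row[key], value_name: row[col]})
    return base, child


# Hypothetical unnormalized data with a repeating phone group.
contacts = [
    {"contact_id": 1, "name": "Ada", "phone1": "555-0100", "phone2": "555-0101"},
    {"contact_id": 2, "name": "Ben", "phone1": "555-0200", "phone2": None},
]

contact, contact_phone = to_first_normal_form(
    contacts, "contact_id", ["phone1", "phone2"], "phone"
)
```

After the split, a contact can have any number of phone numbers without adding columns, and empty phone slots no longer need to be stored.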
  14. 2ND NORMAL FORM
     A relation is in second normal form (2NF) if it is in 1NF and all of its non-key attributes are dependent on the whole key.
     • Another way to say this: a relation is in second normal form if it is free from partial-key dependencies.
     • Relations that have a single attribute for a key are automatically in 2NF.
  15. 2ND NORMAL FORM
     • Steps to Remove Partial Dependencies
       1. Determine which non-key columns are only partially dependent upon the table's primary key.
       2. Remove those columns from the base table.
       3. Create a second table with those non-key columns and assign an appropriate primary key.
       4. Create a foreign key from the original base table to the new table, linking to the new primary key.
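
A small Python sketch of these steps follows, under the assumption of a hypothetical enrollment table whose key is (student_id, course_id). In that table course_title depends only on course_id, which is a partial dependency. The helper name remove_partial_dependency is also made up for this example.

```python
def remove_partial_dependency(rows, key_part, dependents):
    """Split out columns that depend on only part of a composite key.

    key_part: the part of the primary key that determines the dependents
    dependents: the non-key columns to move to the new table
    Returns (base_rows, new_table_rows).
    """
    new_table = {}
    base = []
    for row in rows:
        part = tuple(row[c] for c in key_part)
        # Step 3: the new table is keyed by key_part, so each value is stored once.
        new_table[part] = {c: row[c] for c in (*key_part, *dependents)}
        # Step 2: drop the partially dependent columns from the base table.
        # Step 4: key_part stays behind in the base table as the foreign key.
        base.append({c: v for c, v in row.items() if c not in dependents})
    return base, list(new_table.values())


# Hypothetical data; the primary key is (student_id, course_id).
enrollment = [
    {"student_id": 1, "course_id": "DB101", "course_title": "Databases", "grade": "A"},
    {"student_id": 2, "course_id": "DB101", "course_title": "Databases", "grade": "B"},
    {"student_id": 1, "course_id": "CS102", "course_title": "Programming", "grade": "B"},
]

enrollment_2nf, course = remove_partial_dependency(
    enrollment, ["course_id"], ["course_title"]
)
```

The title "Databases" appeared twice in the original rows and appears once in course, so a title change can no longer leave the two copies inconsistent.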
  16. 3RD NORMAL FORM
     A relation is in third normal form (3NF) if it is in second normal form and it contains no transitive dependencies.
     • Steps to Remove Transitive Dependencies
       1. Determine which columns are dependent on another non-key column.
       2. Remove those columns from the base table.
       3. Create a second table with those columns and the non-key columns that they are dependent upon.
       4. Create a foreign key in the original table linking to the primary key of the new table.
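
The sketch below walks through these steps with Python's sqlite3 module. The employee_flat table is hypothetical, and it assumes for the sake of the example that zip_code determines city, so that emp_id → zip_code → city is a transitive dependency. The final query joins the two new tables back together to check that no information was lost.

```python
import sqlite3


def decompose_to_3nf():
    """Return (rows of the new zip table, whether the rejoin matches the original)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Hypothetical 2NF table: city depends on zip_code, a non-key column.
        CREATE TABLE employee_flat (
            emp_id   INTEGER PRIMARY KEY,
            emp_name TEXT,
            zip_code TEXT,
            city     TEXT
        );
        INSERT INTO employee_flat VALUES
            (1, 'Ada', '10001', 'New York'),
            (2, 'Ben', '10001', 'New York'),
            (3, 'Cy',  '60601', 'Chicago');

        -- Step 3: new table holding the dependent column and its determinant.
        CREATE TABLE zip (zip_code TEXT PRIMARY KEY, city TEXT NOT NULL);
        INSERT INTO zip SELECT DISTINCT zip_code, city FROM employee_flat;

        -- Steps 2 and 4: base table without city, with a foreign key to zip.
        CREATE TABLE employee (
            emp_id   INTEGER PRIMARY KEY,
            emp_name TEXT,
            zip_code TEXT REFERENCES zip(zip_code)
        );
        INSERT INTO employee SELECT emp_id, emp_name, zip_code FROM employee_flat;
        """
    )
    zips = conn.execute("SELECT zip_code, city FROM zip ORDER BY zip_code").fetchall()
    rejoined = conn.execute(
        "SELECT e.emp_id, e.emp_name, e.zip_code, z.city"
        " FROM employee AS e JOIN zip AS z ON z.zip_code = e.zip_code"
        " ORDER BY e.emp_id"
    ).fetchall()
    original = conn.execute(
        "SELECT emp_id, emp_name, zip_code, city FROM employee_flat ORDER BY emp_id"
    ).fetchall()
    conn.close()
    return zips, rejoined == original
```

The city for zip code 10001 is stored once in zip instead of once per employee, and joining employee with zip reproduces the original rows.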
  17. TOP-DOWN VS BOTTOM-UP DATABASE SCHEMA DESIGN
     • TOP DOWN
       • Identifies the data sets and then defines the data elements for each of those sets. That is, entity types are defined first, followed by each entity's attributes, often represented by ER modelling.
     • BOTTOM UP
       • First identifies the data elements and then groups them together into data sets, i.e. it first defines attributes and then groups them to form entities.
  18. TOP-DOWN VS BOTTOM-UP DESIGN
     [Diagram: a conceptual model made up of entities and their attributes, with "Top Down" running from the entities to the attributes and "Bottom Up" running from the attributes to the entities.]
  19. SUMMARY
     • 1NF - The table must be a two-dimensional structure with unordered rows and columns. The table cannot contain repeating groups.
     • 2NF - The table must be in 1NF. Every non-key column must be dependent on all parts of the primary key.
     • 3NF - The table must be in 2NF. No non-key column may be functionally dependent on another non-key column.
     • A well-formed entity relationship model transforms into a normalized data design.
  20. REFERENCES
     • Gillenson, Mark L. (2012). Fundamentals of Database Management Systems, 2nd ed. John Wiley & Sons, Inc.
     • zation.htm