Database Normalization


Published in: Education, Technology

  1. CHAPTER 4: NORMALIZATION

     Chapter Objectives
     At the end of the chapter, you should be able to:
     • understand the purpose of normalization;
     • normalize relations to first, second and third normal form;
     • merge relations (view integration);
     • transform E-R diagrams to relations.

     Essential Reading
     Modern Database Management (4th Edition), Fred R. McFadden & Jeffrey A. Hoffer (1994), Benjamin/Cummings. [Chapter 6, pages 199-237]

     Useful Websites to learn Database and Programming:

     Erwin M. Globio, MSIT
  2. DB212 CHAPTER 4: NORMALISATION

     4.1 Basic Concepts

     Normalisation is a process for converting complex data structures into simple, stable data structures.

     Why is normalisation necessary?
     • The database design must be efficient (performance-wise).
     • The amount of data should be reduced if possible.
     • The design should be free of update, insertion and deletion anomalies.
     • The design must comply with the rules for relational databases.
     • The design has to show the pertinent relationships between entities.
     • The design should permit simple retrieval, simplify data maintenance and reduce the need to restructure data.

     Figure 4-1: Steps in normalisation
       Table with repeating groups
         --> remove repeating groups
       First normal form
         --> remove partial dependencies
       Second normal form
         --> remove transitive dependencies
       Third normal form
         --> remove remaining anomalies resulting from functional dependencies
       Boyce-Codd normal form
         --> remove multivalued dependencies
       Fourth normal form
         --> remove remaining anomalies
       Fifth normal form
  3. 4.1.1 Functional Dependency

     Normalisation is based on the analysis of functional dependence. A functional dependency is a particular relationship between two attributes. For any relation R, attribute B is functionally dependent on attribute A if, for every valid instance of A, that value of A uniquely determines the value of B. This is usually represented by an arrow, as follows:

       A --> B

     An attribute may be functionally dependent on two (or more) attributes, rather than on a single attribute. For example, in the following relation:

       ORDER (ORDER-NO, PART-NO, NO-ORDERED, PART-DESC, QUOTED-PRICE)
       ORDER-NO, PART-NO --> NO-ORDERED, PART-DESC, QUOTED-PRICE

     The attribute (or set of attributes) on the left-hand side of the arrow is called a determinant. For example:

       CUST-NO --> CUST-NAME, ADDRESS, COMPANY
       INVOICE-NO --> INVOICE-DATE, CUST-NO, ORDER-NO

     Here CUST-NO and INVOICE-NO are determinants.

     4.1.2 Keys

     An attribute (or field), K, is the primary key of a table if:
     • All columns (all the fields in the table) are functionally dependent on K.
     • Each value of K is unique.
     • If K is a composite (concatenated) key, it must also comply with the following conditions:
       • No portion of the key is itself a primary key.
       • None of the attributes that make up the key is null.

     4.2 Steps in Normalisation

     • First normal form (1NF). Any repeating groups have been removed, so that there is a single value at the intersection of each row and column of the table.
     • Second normal form (2NF). Any partial functional dependencies have been removed.
     • Third normal form (3NF). Any transitive dependencies have been removed.

     Note: If a relation meets the criteria for 3NF, it also meets the criteria for 2NF and 1NF. Most design problems can be avoided if the relations are in 3NF.
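The definition of functional dependency in 4.1.1 can be tested against sample data. The following is a minimal sketch, not part of the original chapter: the function name `fd_holds` and the sample rows are hypothetical. Note that a sample can only refute a dependency; whether it holds for every valid instance is a statement about the meaning of the data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        determinant = tuple(row[a] for a in lhs)
        dependent = tuple(row[a] for a in rhs)
        # setdefault stores the first dependent value seen for a determinant
        if seen.setdefault(determinant, dependent) != dependent:
            return False
    return True


# Hypothetical sample rows for the ORDER relation used in the text
ORDER_ROWS = [
    {"ORDER-NO": 1, "PART-NO": "P1", "NO-ORDERED": 5,
     "PART-DESC": "Bolt", "QUOTED-PRICE": 0.10},
    {"ORDER-NO": 1, "PART-NO": "P2", "NO-ORDERED": 2,
     "PART-DESC": "Nut", "QUOTED-PRICE": 0.05},
    {"ORDER-NO": 2, "PART-NO": "P1", "NO-ORDERED": 9,
     "PART-DESC": "Bolt", "QUOTED-PRICE": 0.12},
]

# The composite determinant from the text is consistent with the sample
composite_ok = fd_holds(
    ORDER_ROWS,
    ["ORDER-NO", "PART-NO"],
    ["NO-ORDERED", "PART-DESC", "QUOTED-PRICE"],
)
# ORDER-NO alone does not determine PART-NO (order 1 has two parts)
order_alone_ok = fd_holds(ORDER_ROWS, ["ORDER-NO"], ["PART-NO"])
```

With these rows, `composite_ok` is true and `order_alone_ok` is false, which matches the reason the relation needs a composite key.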
  4. 4.2.1 First Normal Form

     Example:

       UNF
         (Order-no, Date, Part-no, Qty-ordered, Part-description, Quote-price, Cust-no, Cust-name, Cust-address)

       1NF
         ORDER (Order-no, Date, Cust-no, Cust-name, Cust-address)
         ITEM (Order-no, Part-no, Qty-ordered, Part-description, Quote-price)

     4.2.2 Second Normal Form

     A relation is in 2NF if:
     • it is in 1NF, and
     • all non-key attributes are fully functionally dependent on the primary key and not on only a portion of the primary key.

     Steps to transform into 2NF:
     • Identify all functional dependencies in 1NF.
     • Make each determinant the primary key of a new relation.
     • Place all attributes that depend on a given determinant in the relation with that determinant, as non-key attributes.

     All the functional dependencies in this case are:

       ORDER-NO --> DATE, CUST-NO, CUST-NAME, CUST-ADDRESS
       PART-NO --> PART-DESC
       (ORDER-NO, PART-NO) --> QTY-ORDERED, QUOTE-PRICE

     Note: In this case, we say that PART-DESC is only partially functionally dependent on the key (ORDER-NO, PART-NO), because it depends on PART-NO alone.
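The first step towards 2NF, spotting dependencies on only a portion of the key, can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the helper name `partial_dependencies` is hypothetical, and the dependencies listed above are checked against the composite key (ORDER-NO, PART-NO).

```python
def partial_dependencies(primary_key, fds):
    """Return the dependencies whose determinant is a proper subset of the key."""
    key = frozenset(primary_key)
    found = []
    for lhs, rhs in fds:
        non_key = tuple(a for a in rhs if a not in key)
        # "<" on frozensets tests for a proper subset
        if frozenset(lhs) < key and non_key:
            found.append((tuple(lhs), non_key))
    return found


KEY = ("ORDER-NO", "PART-NO")
FDS = [
    (("ORDER-NO",), ("DATE", "CUST-NO", "CUST-NAME", "CUST-ADDRESS")),
    (("PART-NO",), ("PART-DESC",)),
    (("ORDER-NO", "PART-NO"), ("QTY-ORDERED", "QUOTE-PRICE")),
]

partial = partial_dependencies(KEY, FDS)
```

Each determinant returned in `partial` becomes the primary key of its own relation in 2NF; the dependency on the full key stays where it is.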
  5. The partial functional dependency in ITEM (ORDER-NO, PART-NO, QTY-ORDERED, PART-DESC, QUOTE-PRICE) creates redundancy in that relation, which results in anomalies when the table is updated.
     • Insertion anomaly. To insert a row into the ITEM table, we must provide the part description too.
     • Deletion anomaly. If we delete a row from the ITEM table, we may lose some PART information.
     • Modification anomaly. If a part's description changes, we must record the change in multiple rows of the ITEM table.

     Example:

       1NF
         ORDER (Order-no, Date, Cust-no, Cust-name, Cust-address)
         ITEM (Order-no, Part-no, Qty-ordered, Part-description, Quoted-price)

       2NF
         ORDER (Order-no, Date, Cust-no, Cust-name, Cust-address)
         ITEM (Order-no, Part-no, Qty-ordered, Quoted-price)
         PART (Part-no, Part-description)

     Note: A relation that is in first normal form will be in second normal form if any one of the following conditions applies:
     • The primary key consists of only one attribute (such as the attribute ORDER-NO in ORDER).
     • No non-key attributes exist in the relation.
     • Every non-key attribute is functionally dependent on the full set of primary key attributes.

     4.2.3 Third Normal Form

     A relation is in 3NF if:
     • it is in 2NF, and
     • it contains no transitive dependencies.

     A transitive dependency exists when A --> B --> C, so that C depends on A only through B. It can be split into A --> B and B --> C.
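The move from 1NF to 2NF in the example above amounts to projecting the ITEM relation onto two attribute sets. The sketch below illustrates this with hypothetical sample rows; the helper name `project` is an assumption, not something defined in the chapter.

```python
def project(rows, attributes):
    """Project rows onto the given attributes, dropping duplicate tuples."""
    seen = set()
    result = []
    for row in rows:
        values = tuple(row[a] for a in attributes)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attributes, values)))
    return result


# Hypothetical 1NF ITEM rows: the description "Bolt" is stored twice
ITEM_1NF = [
    {"ORDER-NO": 1, "PART-NO": "P1", "QTY-ORDERED": 5,
     "PART-DESC": "Bolt", "QUOTE-PRICE": 0.10},
    {"ORDER-NO": 1, "PART-NO": "P2", "QTY-ORDERED": 2,
     "PART-DESC": "Nut", "QUOTE-PRICE": 0.05},
    {"ORDER-NO": 2, "PART-NO": "P1", "QTY-ORDERED": 9,
     "PART-DESC": "Bolt", "QUOTE-PRICE": 0.12},
]

# 2NF: one relation per determinant
ITEM_2NF = project(ITEM_1NF, ["ORDER-NO", "PART-NO", "QTY-ORDERED", "QUOTE-PRICE"])
PART_2NF = project(ITEM_1NF, ["PART-NO", "PART-DESC"])
```

After the projection each part description is stored once in `PART_2NF`, so a change of description touches a single row, which is the modification anomaly removed.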
  6. Steps to transform into 3NF:
     • Create one relation for each determinant in the transitive dependency.
     • Make the determinants the primary keys in their respective relations.
     • Include as non-key attributes those attributes that depend on the determinant.

     In the relation

       ORDER (ORDER-NO, DATE, CUST-NO, CUST-NAME, CUST-ADDRESS)

     there is a transitive dependency. That is, one of the non-key attributes can be used to determine other attributes:

       CUST-NO --> CUST-NAME, CUST-ADDRESS

     Therefore, there are update anomalies in this table.
     • Insertion anomaly. A new customer cannot be entered until that customer has placed an order.
     • Deletion anomaly. If an order is deleted from the ORDER table, we may lose some CUSTOMER information.
     • Modification anomaly. If the address of a customer changes, we have to update all the associated past order records.

     To remove such anomalies, we can decompose the ORDER relation into two relations.

     Example:

       2NF
         ORDER (Order-no, Date, Cust-no, Cust-name, Cust-address)
         ITEM (Order-no, Part-no, Qty-ordered, Quoted-price)
         PART (Part-no, Part-description)

       3NF
         ORDER (Order-no, Date, Cust-no)
         ITEM (Order-no, Part-no, Qty-ordered, Quoted-price)
         CUSTOMER (Cust-no, Cust-name, Cust-address)
         PART (Part-no, Part-description)

     Notice that CUST-NO is the primary key of a new relation and is a foreign key in the ORDER relation. A foreign key is an attribute that appears as a non-key attribute in one relation and as a primary key attribute in another relation.

     Therefore the final result is:

       ORDER (ORDER-NO, DATE, CUST-NO)
       ITEM (ORDER-NO, PART-NO, QTY-ORDERED, QUOTED-PRICE)
       CUSTOMER (CUST-NO, CUST-NAME, CUST-ADDRESS)
       PART (PART-NO, PART-DESC)
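One possible way to realise the final 3NF relations is shown below, using Python's standard sqlite3 module. This is a sketch under assumptions: the table and column names (for instance `orders` instead of ORDER, which is a reserved word in SQL), the column types and the sample data are all hypothetical choices, not taken from the chapter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE customer (
    cust_no      INTEGER PRIMARY KEY,
    cust_name    TEXT NOT NULL,
    cust_address TEXT NOT NULL
);
CREATE TABLE orders (
    order_no   INTEGER PRIMARY KEY,
    order_date TEXT NOT NULL,
    cust_no    INTEGER NOT NULL REFERENCES customer(cust_no)
);
CREATE TABLE part (
    part_no   TEXT PRIMARY KEY,
    part_desc TEXT NOT NULL
);
CREATE TABLE item (
    order_no     INTEGER NOT NULL REFERENCES orders(order_no),
    part_no      TEXT NOT NULL REFERENCES part(part_no),
    qty_ordered  INTEGER NOT NULL,
    quoted_price REAL NOT NULL,
    PRIMARY KEY (order_no, part_no)
);
""")

conn.execute("INSERT INTO customer VALUES (1, 'Acme', '1 Old Road')")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(100, "2024-01-05", 1), (101, "2024-02-10", 1)],
)
# No insertion anomaly: a customer can exist before placing any order
conn.execute("INSERT INTO customer VALUES (2, 'Bolt Co', '9 High Street')")

# No modification anomaly: a change of address updates exactly one row
rows_changed = conn.execute(
    "UPDATE customer SET cust_address = '2 New Road' WHERE cust_no = 1"
).rowcount

# Every past order sees the new address through the foreign key
addresses = conn.execute(
    "SELECT o.order_no, c.cust_address "
    "FROM orders o JOIN customer c ON c.cust_no = o.cust_no "
    "ORDER BY o.order_no"
).fetchall()
```

The join on `cust_no` reassembles the information that the 2NF ORDER relation stored redundantly in every order row.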
  7. 4.3 Transforming E-R Diagram to Relations

     4.3.1 Represent Entities

     Each entity type in an E-R diagram is transformed into a relation. The primary key of the entity type becomes the primary key of the corresponding relation. Take the following E-R diagram as an example:

     [E-R diagram: CUSTOMER (Cust-no, Cust-name, Address) PLACES ORDER (Order-no, Date, Cust-no); ORDER CONSISTS of PART (Part-no, Part-description), with the relationship attributes Qty-ordered and Quoted-price.]

     The ORDER entity is transformed into the following relation:

       ORDER (ORDER-NO, DATE, CUST-NO)

     4.3.2 Represent Relationships

     • Binary 1:N Relationship

       A binary one-to-many (1:N) relationship in an E-R diagram is represented by adding the primary key attribute of the entity on the one-side of the relationship as a foreign key in the relation on the many-side. Thus the CUSTOMER and ORDER entities in the E-R diagram are transformed into:

         ORDER (ORDER-NO, DATE, CUST-NO)
         CUSTOMER (CUST-NO, CUST-NAME, CUST-ADDRESS)

       CUST-NO is a foreign key in the ORDER relation but a primary key in the CUSTOMER relation.
  8. • Binary M:N Relationship

       For a binary many-to-many relationship between two entity types A and B, create a separate relation C. The primary key of C is the composite key consisting of the primary keys of entities A and B. Thus, for the entity types PART and ORDER, a relation called ORDER-LINE is created, which consists of the primary keys of PART and ORDER as well as the attributes QTY-ORDERED and QUOTED-PRICE. That is:

         ORDER-LINE (ORDER-NO, PART-NO, QTY-ORDERED, QUOTED-PRICE)

     • Unary Relationships

       In a unary relationship (recursive relationship), the primary key of the relation is the same as for the entity type. A foreign key is added to the relation that references the primary key values of the same relation. This is known as the recursive foreign key. Example:

         EMPLOYEE (EMP-ID, NAME, BIRTHDATE, MANAGER-ID)

     4.4 Merging Relations

     As part of the logical design process, normalised relations may have been created from a number of separate E-R diagrams and other user views. Some of these relations may be redundant and can be integrated with other relations (view integration).

     Example: Suppose that modelling a user view results in the following 3NF relation:

       STUDENT1 (STUDENTID, NAME, ADDRESS, PHONE, GUARDIAN)

     Modelling a second user view might result in the following relation:

       STUDENT2 (STUDENTID, NAME, ADDRESS, DEPT)

     Since these two relations have the same primary key (STUDENTID), they describe the same entity and may be merged into one relation. The result of the merging is:

       STUDENT (STUDENTID, NAME, ADDRESS, PHONE, GUARDIAN, DEPT)

     This removes the duplication of NAME and ADDRESS.
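The merge of STUDENT1 and STUDENT2 can be illustrated on sample data. The sketch below is an assumption-laden illustration: the function name `merge_relations`, the conflict rule (shared attributes must agree) and the sample rows are hypothetical and not prescribed by the chapter.

```python
def merge_relations(rel_a, rel_b, key):
    """Merge two relations that describe the same entity on a shared primary key."""
    merged = {}
    for row in list(rel_a) + list(rel_b):
        target = merged.setdefault(row[key], {})
        for attr, value in row.items():
            # A shared attribute must agree in both views; a mismatch
            # signals a conflict that needs manual resolution.
            if attr in target and target[attr] != value:
                raise ValueError(f"conflict on {attr!r} for {key}={row[key]!r}")
            target[attr] = value
    return list(merged.values())


# Hypothetical sample rows for the two user views
STUDENT1 = [
    {"STUDENTID": 7, "NAME": "Ana Cruz", "ADDRESS": "12 Hill St",
     "PHONE": "555-0101", "GUARDIAN": "R. Cruz"},
]
STUDENT2 = [
    {"STUDENTID": 7, "NAME": "Ana Cruz", "ADDRESS": "12 Hill St",
     "DEPT": "Computing"},
]

STUDENT = merge_relations(STUDENT1, STUDENT2, "STUDENTID")
```

In this sketch a student who appears in only one view keeps just the attributes of that view, and disagreeing values for a shared attribute raise an error rather than being silently overwritten.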
  9. 4.5 Review Questions

     1. For each of the following relations, indicate the normal form for that relation. If the relation is not in 3NF, normalise it. (Note: Functional dependencies are shown where appropriate.)
        a. CLASS (COURSE NO, SECTION NO)
        b. CLASS (COURSE NO, SECTION NO, ROOM)
        c. CLASS (COURSE NO, SECTION NO, ROOM, CAPACITY)
           ROOM --> CAPACITY
        d. CLASS (COURSE NO, SECTION NO, COURSE NAME, ROOM, CAPACITY)
           ROOM --> CAPACITY
           COURSE NO --> COURSE NAME

     2. The table below contains sample data for parts and for vendors.

        Part No.  Description  Vendor Name    Address    Unit Cost
        123       Logic Chip   Fast Chips     Cupertino  10.00
                               Smart Chips    Phoenix     8.00
        5678      Memory chip  Fast Chips     Cupertino   3.00
                               Quality Chips  Austin      2.00
                               Smart Chips    Phoenix     5.00

        a. Convert this table to a relation (named PART SUPPLIER) in first normal form.
        b. List the functional dependencies in PART SUPPLIER and identify a candidate key.
        c. Identify each of the following in the above 1NF relation: an insert anomaly, a delete anomaly and a modification anomaly.
        d. Convert the relation to 3NF.

     3. When integrating relations, the database analyst must understand the meaning of the data and try to resolve problems arising from synonyms and homonyms. Illustrate with examples (quoting from your project) how such problems can be resolved.
  10. [Figure: E-R diagram. Uppercase labels: LOCATION, ROOM, PATIENT, ITEM, PHYSICIAN. Relationship labels: "May be assigned to", "Is billed for", and one truncated label ("Attenda"). Attribute labels: Accom, Location, Extension, Patient no, Patient name, Patient address, (Other patient attributes), Item code, Description, Charge, Procedure, Physician ID, Physician phone.]