
Database Concept - Normalization (1NF, 2NF, 3NF)

by Oum Saokosal


  1. Normalization
     Re-edited by: Oum Saokosal, Master of Engineering in Information Systems, Jeonju University, South Korea
     012-252-752, oum_saokosal@yahoo.com
  2. 2. Normalization Normalization: the process of converting complex data structures into simple, stable data structures. The main idea is to avoid duplication of large data. Why normalization?  The relation derived from the user view or data store will most likely be unnormalized.  The problem usually happens when an existing system uses unstructured file, e.g. in MS Excel.
  3. The Three Steps of Normalization
     Standard normalization theory defines more than three normal forms:
     - First Normal Form (1NF)
     - Second Normal Form (2NF)
     - Third Normal Form (3NF)
     - Boyce-Codd Normal Form (BCNF)
     - Fourth Normal Form (4NF)
     - Fifth Normal Form (5NF)
     - Domain/Key Normal Form (DKNF)
     However, the first three steps (1NF, 2NF, 3NF) are sufficient for most practical designs.
  4. I. First Normal Form (1NF)
     The official qualifications for 1NF are:
     1. Each attribute must have a unique name.
     2. Each attribute must hold a single value.
     3. Rows cannot be duplicated.
     4. There are no repeating groups.
     Additional:
     1. Choose a primary key. The primary key can be a single attribute or a combination of attributes.
  5. Example: an unnormalized table

     | Name | DOB | Course | Payment |
     | --- | --- | --- | --- |
     | Sok | 11/5/1990 | IT | 450 Dollars |
     | Sao | 4/4/1989 | Mgt | 400 Dollars |
     | Chan | 7/7/1991 | IT, Mgt | IT: 450 Dollars, Mgt: 400 Dollars |
     | Sok | 11/5/1990 | Mgt | 400 Dollars |
     | Sao | 4/4/1989 | Tour | 1) 200 Dollars 2) 200 Dollars |

     1. Each attribute has a unique name -> Good
     2. Payment mixes data types (currency and string) -> Bad
     3. No rows are duplicated -> Good
     4. Course and Payment contain repeating groups -> Bad
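The repair that slide 5 calls for (one value per attribute, no repeating groups) can be sketched in Python. The nested-list encoding of the repeating groups and the names `raw` and `flat` are assumptions made for illustration; they do not come from the slides.

```python
# Unnormalized rows from slide 5. The third field is a repeating group of
# (course, payment in dollars) pairs. This encoding is an assumption.
raw = [
    ("Sok", "11/5/1990", [("IT", 450)]),
    ("Sao", "4/4/1989", [("Mgt", 400)]),
    ("Chan", "7/7/1991", [("IT", 450), ("Mgt", 400)]),
    ("Sok", "11/5/1990", [("Mgt", 400)]),
    ("Sao", "4/4/1989", [("Tour", 200), ("Tour", 200)]),
]

# Emit one row per (course, payment) pair so that every cell holds a
# single value and Payment is a plain number.
flat = [
    (name, dob, course, amount)
    for name, dob, items in raw
    for course, amount in items
]

for row in flat:
    print(row)
```

The result has seven rows, and the last two are identical, which is why a primary key still has to be chosen afterwards.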
  6. Example: repeating groups removed

     | Name | DOB | Course | Payment ($) |
     | --- | --- | --- | --- |
     | Sok | 11/5/1990 | IT | 450 |
     | Sao | 4/4/1989 | Mgt | 400 |
     | Chan | 7/7/1991 | IT | 450 |
     | Chan | 7/7/1991 | Mgt | 400 |
     | Sok | 11/5/1990 | Mgt | 400 |
     | Sao | 4/4/1989 | Tour | 200 |
     | Sao | 4/4/1989 | Tour | 200 |

     All correct? Not yet. Choose a primary key.
     - Name? No. Name has duplicated values.
     - DOB, Course, or Payment? No. Each one has duplicated values.
     - Name and DOB? No. They still have duplicated values.
     - Name, DOB, and Course? No. Still duplicated.
     - All attributes combined? Still no. The last two rows are duplicates.
     So what else can we do? Add a new attribute to serve as the primary key. Let's call it ID.
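The search for a primary key on slide 6 can be automated as a brute-force check over every combination of columns. This is a minimal sketch; the function names `is_unique` and `unique_column_sets` are hypothetical, and the check only tells us what is unique in this sample, not what is guaranteed to stay unique.

```python
from itertools import combinations

COLUMNS = ("Name", "DOB", "Course", "Payment")
ROWS = [
    ("Sok", "11/5/1990", "IT", 450),
    ("Sao", "4/4/1989", "Mgt", 400),
    ("Chan", "7/7/1991", "IT", 450),
    ("Chan", "7/7/1991", "Mgt", 400),
    ("Sok", "11/5/1990", "Mgt", 400),
    ("Sao", "4/4/1989", "Tour", 200),
    ("Sao", "4/4/1989", "Tour", 200),
]


def is_unique(rows, indexes):
    """Return True if the chosen columns have no repeated value combination."""
    projected = [tuple(row[i] for i in indexes) for row in rows]
    return len(set(projected)) == len(projected)


def unique_column_sets(rows, columns):
    """List every combination of columns that identifies each row uniquely."""
    found = []
    for size in range(1, len(columns) + 1):
        for indexes in combinations(range(len(columns)), size):
            if is_unique(rows, indexes):
                found.append(tuple(columns[i] for i in indexes))
    return found


# No combination works, because the last two rows are identical.
print(unique_column_sets(ROWS, COLUMNS))

# Adding a surrogate ID column (position 0) gives a usable primary key.
with_id = [(i, *row) for i, row in enumerate(ROWS, start=1)]
print(is_unique(with_id, (0,)))
```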
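  7. Example: primary key added

     | ID | Name | DOB | Course | Payment |
     | --- | --- | --- | --- | --- |
     | 1 | Sok | 11/5/1990 | IT | 450 |
     | 2 | Sao | 4/4/1989 | Mgt | 400 |
     | 3 | Chan | 7/7/1991 | IT | 450 |
     | 4 | Chan | 7/7/1991 | Mgt | 400 |
     | 5 | Sok | 11/5/1990 | Mgt | 300 |
     | 6 | Sao | 4/4/1989 | Tour | 200 |
     | 7 | Sao | 4/4/1989 | Tour | 200 |

     Now the table is completely in 1NF. Next, check whether it is in 2NF.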
  8. II. Second Normal Form (2NF)
     The official qualifications for 2NF are:
     1. The table is already in 1NF.
     2. All nonkey attributes are fully dependent on the primary key.
     All partial dependencies are removed and placed in another table.
  9. Example: partial dependency
     Assume the table below has a composite primary key (CourseID + Semester):

     | CourseID | Semester | Num Student | Course Name |
     | --- | --- | --- | --- |
     | IT101 | 2013-1 | 25 | Database |
     | IT101 | 2013-2 | 25 | Database |
     | IT102 | 2013-1 | 30 | Web Prog |
     | IT102 | 2013-2 | 35 | Web Prog |
     | IT103 | 2014-1 | 20 | Networking |

     Course Name depends only on CourseID, which is one part of the primary key, not on the whole key (CourseID + Semester). This is called a partial dependency.
     Solution: move CourseID and Course Name together into a new table.
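  10. Example: partial dependency removed
      Copying CourseID and Course Name out row by row gives:

      | CourseID | Course Name |
      | --- | --- |
      | IT101 | Database |
      | IT101 | Database |
      | IT102 | Web Prog |
      | IT102 | Web Prog |
      | IT103 | Networking |

      Done? Not yet. This table is not even in 1NF, because it contains duplicate rows. Remove the repeated rows too:

      | CourseID | Course Name |
      | --- | --- |
      | IT101 | Database |
      | IT102 | Web Prog |
      | IT103 | Networking |

      | CourseID | Semester | Num Student |
      | --- | --- | --- |
      | IT101 | 2013-1 | 25 |
      | IT101 | 2013-2 | 25 |
      | IT102 | 2013-1 | 30 |
      | IT102 | 2013-2 | 35 |
      | IT103 | 2014-1 | 20 |

      The relationship between the two tables is one-to-many (1:M): one course has many semester rows.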
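The 2NF split of slides 9 and 10 can be sketched as two projections of the original rows. The variable names (`course`, `offering`, `rebuilt`) are hypothetical; the final join only confirms that no information was lost by the split.

```python
# Rows of the original table: (CourseID, Semester, Num Student, Course Name).
rows = [
    ("IT101", "2013-1", 25, "Database"),
    ("IT101", "2013-2", 25, "Database"),
    ("IT102", "2013-1", 30, "Web Prog"),
    ("IT102", "2013-2", 35, "Web Prog"),
    ("IT103", "2014-1", 20, "Networking"),
]

# Course table: CourseID -> Course Name. The set removes duplicate rows.
course = sorted({(course_id, name) for course_id, _, _, name in rows})

# Remaining table, still keyed by (CourseID, Semester).
offering = [(course_id, semester, num) for course_id, semester, num, _ in rows]

# Joining the two tables on CourseID rebuilds the original rows.
name_of = dict(course)
rebuilt = [(cid, sem, num, name_of[cid]) for cid, sem, num in offering]

print(course)
print(rebuilt == rows)
```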
  11. III. Third Normal Form (3NF)
      The official qualifications for 3NF are:
      1. The table is already in 2NF.
      2. Nonkey attributes do not depend on other nonkey attributes (i.e. there are no transitive dependencies).
      All transitive dependencies are removed and placed in another table.
  12. Example: transitive dependency
      Assume the table below has the primary key StudyID:

      | StudyID | Course Name | Teacher Name | Teacher Tel |
      | --- | --- | --- | --- |
      | 1 | Database | Sok Piseth | 012 123 456 |
      | 2 | Database | Sao Kanha | 0977 322 111 |
      | 3 | Web Prog | Chan Veasna | 012 412 333 |
      | 4 | Web Prog | Chan Veasna | 012 412 333 |
      | 5 | Networking | Pou Sambath | 077 545 221 |

      Teacher Tel is a nonkey attribute, and Teacher Name is also a nonkey attribute. But Teacher Tel depends on Teacher Name. This is called a transitive dependency.
      Solution: move Teacher Name and Teacher Tel together into a new table.
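  13. Example: transitive dependency removed
      Copying Teacher Name and Teacher Tel out row by row gives:

      | Teacher Name | Teacher Tel |
      | --- | --- |
      | Sok Piseth | 012 123 456 |
      | Sao Kanha | 0977 322 111 |
      | Chan Veasna | 012 412 333 |
      | Chan Veasna | 012 412 333 |
      | Pou Sambath | 077 545 221 |

      Done? Not yet. This table is not even in 1NF, because it contains duplicate rows. Remove the repeated rows and add a primary key:

      | Teacher Name | Teacher Tel |
      | --- | --- |
      | Sok Piseth | 012 123 456 |
      | Sao Kanha | 0977 322 111 |
      | Chan Veasna | 012 412 333 |
      | Pou Sambath | 077 545 221 |

      Note about the primary key:
      - In theory, you can choose Teacher Name as the primary key.
      - But in practice, you should add Teacher ID as the primary key.

      | T.ID | Teacher Name | Teacher Tel |
      | --- | --- | --- |
      | T1 | Sok Piseth | 012 123 456 |
      | T2 | Sao Kanha | 0977 322 111 |
      | T3 | Chan Veasna | 012 412 333 |
      | T4 | Pou Sambath | 077 545 221 |

      | StudyID | Course Name | T.ID |
      | --- | --- | --- |
      | 1 | Database | T1 |
      | 2 | Database | T2 |
      | 3 | Web Prog | T3 |
      | 4 | Web Prog | T3 |
      | 5 | Networking | T4 |

      The relationship is many-to-one (M:1): many study rows refer to one teacher.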
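The 3NF split of slides 12 and 13 can be sketched as follows. The names `teacher_ids`, `teachers`, and `study_3nf` are hypothetical, and the sketch assumes that a teacher's name identifies the teacher within this sample, which is the very assumption a real Teacher ID is meant to replace.

```python
# Rows of the original table: (StudyID, Course Name, Teacher Name, Teacher Tel).
study = [
    (1, "Database", "Sok Piseth", "012 123 456"),
    (2, "Database", "Sao Kanha", "0977 322 111"),
    (3, "Web Prog", "Chan Veasna", "012 412 333"),
    (4, "Web Prog", "Chan Veasna", "012 412 333"),
    (5, "Networking", "Pou Sambath", "077 545 221"),
]

teacher_ids = {}  # Teacher Name -> generated T.ID
teachers = []     # new table: (T.ID, Teacher Name, Teacher Tel)
study_3nf = []    # remaining table: (StudyID, Course Name, T.ID)

for study_id, course_name, teacher_name, teacher_tel in study:
    if teacher_name not in teacher_ids:
        # First time this teacher appears: assign the next ID and store the row once.
        teacher_ids[teacher_name] = f"T{len(teacher_ids) + 1}"
        teachers.append((teacher_ids[teacher_name], teacher_name, teacher_tel))
    study_3nf.append((study_id, course_name, teacher_ids[teacher_name]))

for row in teachers:
    print(row)
for row in study_3nf:
    print(row)
```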
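  14. What about this table?

      | ID | Name | DOB | Course | Payment |
      | --- | --- | --- | --- | --- |
      | 1 | Sok | 11/5/1990 | IT | 450 |
      | 2 | Sao | 4/4/1989 | Mgt | 400 |
      | 3 | Chan | 7/7/1991 | IT | 450 |
      | 4 | Chan | 7/7/1991 | Mgt | 400 |
      | 5 | Sok | 11/5/1990 | Mgt | 300 |
      | 6 | Sao | 4/4/1989 | Tour | 200 |
      | 7 | Sao | 4/4/1989 | Tour | 200 |

      For this table there is nothing to do for 2NF, because the primary key is a single attribute, not a combination of attributes, so no partial dependency can exist. Therefore, you can skip 2NF and move to 3NF.
      In 3NF, you must remove transitive dependencies.
      - Name and DOB describe the student, not the row identified by ID. So move them to their own table.
      - Course describes the course, not the row identified by ID. So move it to its own table as well, and let each payment row refer to a student and a course.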
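  15. Result: Student, Course, and Payment tables

      Student:

      | ID | Name | DOB |
      | --- | --- | --- |
      | S1 | Sok | 11/5/1990 |
      | S2 | Chan | 7/7/1991 |
      | S3 | Sao | 4/4/1989 |

      Course:

      | CourseID | Course |
      | --- | --- |
      | C1 | IT |
      | C2 | Mgt |
      | C3 | Tour |

      Payment:

      | PID | SID | Course | Payment |
      | --- | --- | --- | --- |
      | 1 | S1 | C1 | $450 |
      | 2 | S3 | C2 | $400 |
      | 3 | S2 | C1 | $450 |
      | 4 | S2 | C2 | $400 |
      | 5 | S1 | C2 | $300 |
      | 6 | S3 | C3 | $200 |
      | 7 | S3 | C3 | $200 |

      Student to Payment is one-to-many (1:M), and Course to Payment is one-to-many (1:M).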
  16. The Payment relationship
      The Payment table is not finished yet. It is not an independent entity: it is a many-to-many (M:N) relationship between Student and Course, carrying its own attributes PaymentID and Payment.
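One way to realize the Student, Course, and Payment design in an actual database is sketched below with Python's built-in `sqlite3` module. The table and column names (`student`, `course`, `payment`, `sid`, `cid`, `pid`, `amount`) are assumptions chosen for this sketch, and the rows are derived from the ID/Name/DOB/Course/Payment table of slide 14.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked

conn.executescript("""
CREATE TABLE student (
    sid  TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    dob  TEXT NOT NULL
);
CREATE TABLE course (
    cid   TEXT PRIMARY KEY,
    title TEXT NOT NULL
);
-- payment is the M:N relationship between student and course
CREATE TABLE payment (
    pid    INTEGER PRIMARY KEY,
    sid    TEXT NOT NULL REFERENCES student(sid),
    cid    TEXT NOT NULL REFERENCES course(cid),
    amount INTEGER NOT NULL
);
""")

conn.executemany("INSERT INTO student VALUES (?, ?, ?)", [
    ("S1", "Sok", "11/5/1990"),
    ("S2", "Chan", "7/7/1991"),
    ("S3", "Sao", "4/4/1989"),
])
conn.executemany("INSERT INTO course VALUES (?, ?)", [
    ("C1", "IT"), ("C2", "Mgt"), ("C3", "Tour"),
])
conn.executemany("INSERT INTO payment VALUES (?, ?, ?, ?)", [
    (1, "S1", "C1", 450), (2, "S3", "C2", 400), (3, "S2", "C1", 450),
    (4, "S2", "C2", 400), (5, "S1", "C2", 300), (6, "S3", "C3", 200),
    (7, "S3", "C3", 200),
])

# Two JOINs rebuild the readable view of slide 14 (without DOB).
rows = conn.execute("""
    SELECT p.pid, s.name, c.title, p.amount
    FROM payment AS p
    JOIN student AS s ON s.sid = p.sid
    JOIN course  AS c ON c.cid = p.cid
    ORDER BY p.pid
""").fetchall()

for row in rows:
    print(row)
```

The query also shows the cost of the design: reading the original view back now takes two JOINs.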
  17. Stop at 3NF
      The most commonly used normal forms:
      - First Normal Form (1NF)
      - Second Normal Form (2NF)
      - Third Normal Form (3NF)
      The highest level of normalization is not always desirable:
      - More JOINs are required.
      - Data retrieval performance suffers (higher response time).
      - For most business database design purposes, 3NF is as high as we need to go in the normalization process.
  18. Normalization in the Real World
      When you create a new table in a database tool, e.g. MS Access, SQL Server, MySQL, or Oracle, you will not need all the steps.
      - These tools already take care of most of 1NF for you.
      - 2NF only matters when the primary key is a combination of attributes, e.g. StudentName + DOB. But such keys are impractical.
      - Mostly, you only apply 3NF, because it removes all transitive dependencies.
  19. Functional Dependency: A Bit More About Theory
  20. Functional Dependencies
      An important concept associated with normalization is functional dependency, which describes the relationship between attributes.
  21. Functional Dependencies
      Functional dependency can be divided into two types:
      - Full functional dependency / partial dependency (PD): used to transform 1NF → 2NF.
      - Transitive dependency (TD): used to transform 2NF → 3NF.
  22. Functional Dependencies
      Multivalued attributes (or repeating groups): non-key attributes, or groups of non-key attributes, whose values are not uniquely identified, directly or indirectly, by the value of the primary key or a part of it. In other words, they are not functionally dependent on the primary key.
      Relational schema: STUDENT(Stud_ID, Name, (Course_ID, Units))
  23. Functional Dependencies
      Partial dependency: a non-key attribute is determined by a part, but not the whole, of a composite primary key. The primary key must be a composite key.
      Example: Cust_ID → Name
  24. Functional Dependencies
      Transitive dependency: a non-key attribute determines another non-key attribute.
      Example: Dept_ID → Dept_Name
  25. Functional Dependencies
      - Consider a relation R with attributes A and B, where B is functionally dependent on A. Let us say A is the primary key of R.
      - To describe this relationship between A and B, we say that "A functionally determines B".
      R(A, B), A → B: B is functionally dependent on A.
  26. Functional Dependencies
      Determinant: when a functional dependency exists, the attribute or group of attributes on the left-hand side of the arrow is called the determinant.
      A → B: A functionally determines B, so A is the determinant.
  27. Functional Dependencies

      staff:

      | staffNO | sName | position | salary | branchNO |
      | --- | --- | --- | --- | --- |
      | S21 | Johan | Manager | 3000 | B005 |
      | S37 | Ana | Assistant | 1200 | B003 |
      | S14 | Daud | Supervisor | 1800 | B003 |
      | S9 | Mary | Assistant | 900 | B007 |
      | S5 | Siti | Manager | 2400 | B003 |
      | S41 | Jani | Assistant | 900 | B005 |

      branch:

      | branchNO | bAddress |
      | --- | --- |
      | B005 | 123, Kepong |
      | B007 | 456, Nilai |
      | B003 | 789, PTP |
  28. Functional Dependencies
      Consider the attributes staffNO and position of the staff relation. For a specific staffNO (S21), we can determine the position of that member of staff as Manager. So staffNO functionally determines position.
      staffNO → position (S21 → Manager): position is functionally dependent on staffNO.
  29. Functional Dependencies
      However, the opposite is not true: position does not functionally determine staffNO. A member of staff holds one position, but there may be several members of staff with the same position.
      For example, the position Manager maps to both S21 and S5, so staffNO is not functionally dependent on position.
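The two claims about staffNO and position can be checked against the sample staff relation with a small helper. The function name `determines` is hypothetical. Note that sample data can only refute a functional dependency; a dependency that holds here is consistent with the data, not proven for all future rows.

```python
FIELDS = ("staffNO", "sName", "position", "salary", "branchNO")
STAFF = [dict(zip(FIELDS, values)) for values in [
    ("S21", "Johan", "Manager", 3000, "B005"),
    ("S37", "Ana", "Assistant", 1200, "B003"),
    ("S14", "Daud", "Supervisor", 1800, "B003"),
    ("S9", "Mary", "Assistant", 900, "B007"),
    ("S5", "Siti", "Manager", 2400, "B003"),
    ("S41", "Jani", "Assistant", 900, "B005"),
]]


def determines(rows, lhs, rhs):
    """Return True if equal lhs values always come with equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        value = tuple(row[attr] for attr in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


print(determines(STAFF, ["staffNO"], ["position"]))  # staffNO -> position
print(determines(STAFF, ["position"], ["staffNO"]))  # position -> staffNO
```

The first check passes and the second fails, because Manager appears with both S21 and S5.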
  30. Functional Dependencies
      Partial dependencies: full functional dependency means that, if A and B are attributes of a relation, B is fully functionally dependent on A when B is functionally dependent on A but not on any proper subset of A.
      staff(staffNO, sName, position, salary, branchNO)
      staffNO, sName → branchNO
      This dependency holds: each value of (staffNO, sName) is associated with a single value of branchNO. However, branchNO is also functionally dependent on staffNO alone, so the dependency on (staffNO, sName) is partial, not full.
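The definition of full functional dependency translates directly into a test over the proper subsets of the determinant. This is a sketch on the sample staff rows only; the names `determines` and `is_fully_dependent` are hypothetical.

```python
from itertools import combinations

FIELDS = ("staffNO", "sName", "position", "salary", "branchNO")
STAFF = [dict(zip(FIELDS, values)) for values in [
    ("S21", "Johan", "Manager", 3000, "B005"),
    ("S37", "Ana", "Assistant", 1200, "B003"),
    ("S14", "Daud", "Supervisor", 1800, "B003"),
    ("S9", "Mary", "Assistant", 900, "B007"),
    ("S5", "Siti", "Manager", 2400, "B003"),
    ("S41", "Jani", "Assistant", 900, "B005"),
]]


def determines(rows, lhs, rhs):
    """Return True if equal lhs values always come with equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        value = tuple(row[attr] for attr in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_fully_dependent(rows, lhs, rhs):
    """True if lhs -> rhs holds and no non-empty proper subset of lhs determines rhs."""
    if not determines(rows, lhs, rhs):
        return False
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if determines(rows, subset, rhs):
                return False  # a part of lhs is enough: partial dependency
    return True


print(is_fully_dependent(STAFF, ("staffNO", "sName"), ("branchNO",)))
print(is_fully_dependent(STAFF, ("staffNO",), ("branchNO",)))
```

The first call reports a partial dependency, because staffNO alone already determines branchNO; the second reports a full one.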
  31. Functional Dependencies
      Transitive dependencies:
      staff(staffNO, sName, position, salary, *branchNO)
      branch(branchNO, bAddress)
      staffNO → sName, position, salary, branchNO, bAddress
      branchNO → bAddress
      This is a transitive dependency: bAddress depends on staffNO only through branchNO (staffNO → branchNO → bAddress).
  32. Normalization Process
      - A formal technique for analyzing relations based on their primary key (or candidate keys) and functional dependencies.
      - The technique is executed as a series of steps (stages). Each step corresponds to a specific normal form with specific characteristics.
      - As normalization proceeds, the relations become progressively more restricted (stronger) in format and also less vulnerable to anomalies.
      Data redundancy decreases at each stage: 0NF/UNF → 1NF → 2NF → 3NF.
  33. Normalization Process

      | Stage | Relation/Table Format |
      | --- | --- |
      | UNF | Has repeating groups; PK not defined |
      | 1NF | No repeating groups; PK defined; test for partial dependency |
      | 2NF | No repeating groups; PK defined; no partial dependency; test for transitive dependency |
      | 3NF | No repeating groups; PK defined; no partial dependency; no transitive dependency |

      Steps:
      - UNF → 1NF: remove the repeating groups and define the PK.
      - 1NF → 2NF: if the PK is composite, test for partial dependency; if one exists, move it to its own table.
      - 2NF → 3NF: test for transitive dependency; if one exists, move it to its own table.
      Example dependencies: (a, b → x, y), (a → c, d), (b → z), (c → d).
      The slide annotates the stages with growing table counts: 1 table, 1 or 2 tables, 2 or 3 tables, 3 or 4 tables.
  34. End of Chapter
