INTRODUCTIONOne of the most important steps in designing a database isensuring that the data is properly distributed among its tables.With proper data structures, illogically or inconsistently storeddata can cause a number of problems. In a relational database, alogical and efficient design is just as critical. A poorly designeddatabase may provide erroneous information, may be difficult touse, or may even fail to work properly. A poorly design databasecan cripple an application, producing problems with redundancy,in accuracy, consistency and concurrency of data. Normalizationis a process that serves to reduce, if not eliminate, theseproblems with data.
DESCRIPTION OF NORMALIZATION Normalization is the process of organizing data to minimize redundancy. The goal of database normalization is to decompose relations with anomalies in order to produce smaller, well- structured relations. Normalization usually involves dividing large tables into smaller (and less redundant) tables and defining relationships between them. The objective is to isolate data so that additions, deletions, and modifications of a field can be made in just one table and then propagated through the rest of the database via the defined relationships.
Normalization LevelsThere are a few rules for database normalization. Each rule is called a "normal form." If the first rule is observed, the database is said to be in "first normal form." If the first three rules are observed, the database is considered to be in "third normal form." Although other levels of normalization are possible, third normal form is considered the highest level necessary for most applications. There is a number of normalization levels from 1. normal form through 5. normal form. Each normal form describes how to get rid of some specific problem, usually related to redundancy. Below are description with examples of level one to three normal forms.
First Normal Form Eliminate repeating groups in individual tables. Create a separate table for each set of related data. Identify each set of related data with a primary key.
Second Normal Form Create separate tables for sets of values that apply to multiple records. Relate these tables with a foreign key. Records should not depend on anything other than a tables primary key (a compound key, if necessary).
Third Normal FormEliminate fields that do not depend on thekey. Third normal form prohibitstransitive dependencies. A transitivedependency exist when any attribute in atable is dependent of any other non-keyattribute in that table.
Other Normalization FormsFourth normal form, also called Boyce CoddNormal Form (BCNF), and fifth normal formdo exist, but are rarely considered in practicaldesign. Disregarding these rules may result inless than perfect database design, but shouldnot affect functionality.
Examples Of Normalization Table 1. Unnormalized table:Student advisor Adv- Class 1 Class 2 Class 3 room1022 Jones 412 101-07 143-01 159-024123 Smith 216 201-01 211-02 214-01
First Normal Form: No RepeatingGroups Studet Advisor Adv-room class 1022 jones 412 101-07 1022 jones 412 143-01 1022 jones 412 159-02 4123 smith 216 201-01 4123 Smith 216 211-02 4123 smith 216 214-01
Second Normal Form: Eliminate Redundant Data registrationstudent student class Student Advisor Adv- room 1022 101-07 1022 Jone 412 1022 143-01 s 4123 Smitt 216 1022 159-01 4123 201-01 4123 211-02 4123 214-01
Third Normal Form: Eliminate Data Not Dependent On Key faculty studentStudent advisor Name Room Depart1022 Jones Jones 412 424123 smith Smith 216 42
Normalising A Database Should Be Able To Achieve The Following Four Goals: Arranging data into logical groups such that each group describes a small part of the whole Minimizing the amount of duplicated data stored in a database Building a database in which you can access and manipulate the data quickly and efficiently without compromising the integrity of the data storage Organising the data such that, when you modify it, you make the changes in only one place
By Normalization The Following Problem In Relational Database Are Solve Use storage space efficiently Eliminate redundant data Reduce or eliminate inconsistent data Ease the database maintenance burden Enforce referential integrity. Data integrity Minimize modification anomalies
Unnormalized DATABASE SCHEMA FOR A GENERIC SOCIAL NETWORKING SITE (FACEBOOK).
A NORMALIZED DATABASE SCHEMA FOR A GENERIC SOCIAL NETWORKING SITE (FACEBOOK).
Un Normalized Database Problem Repetition of information Inability to represent certain information Loss of information Difficulty to maintain information
SQL PROGRAM Create table user (user_id Int (8), first_name Char (15), last_name Char (15), sex char (7), home town Char (16), relationship_status (8), interested_in Char (7), religious _views char (15), political_view Char (15)); Create table user affiliation (user_id Int (8), Affiliation_id int (4), Foreign key (user_id, Affiliation_id));
CONT. OF SQL PROG. Create table Affiliation (Affiliation_id int (4), Description char (15), Member_count int (10)); Create table user_phone_numbers (user_id Int (8), Phone_number int (15), Phone_type char (10) user_id)); Create table user_screen_name (user_id Int (8), Screen_name varchar (30), Im_service char (10), Foreign key ((user_id Int ));
CONT. OF SQL Create table user_work_history (user_id Int (8), Company_affiliation_id int (4), Company_name char (20), Job_title char (20) Foreign key (Company_affiliation_id, user_id));
PROBLEM TO FACE IF DATABASE DESIGN IS NOT NORMALIZED Normalisation is part of successful database design. Without normalisation, there will be Problem of high data redundancy – if data in the database can be found in two different locations (direct redundancy) or if data can be calculated from other data items (indirect redundancy) then the data is said to contain redundancy. Data should only be stored once and avoid storing data that can be calculated from other data already held in the database. Not normalizing database systems can be inaccurate, slow, and inefficient and they might not produce the data one may expect. It is adviceable to always use the normalization process to design efficient and functional databases. By normalizing, data will be store where it logically and uniquely belongs.
DISADVANTAGE OF NORMALIZATIONThe only real drawback to having a highly normalization database structure is that one may need alarge number of joins to pull back the records the application needs to function.For example in the social network site schema above one will requires a whopping six joins toretrieve a single users information.select * from Users uinner join UserPhoneNumbers upnon u.user_id = upn.user_id the fact here is that oneinner join UserScreenNames usn need six joins -- or sixon u.user_id = usn.user_id individual queries, toinner join UserAffiliations ua retrieve a single useron u.user_id = ua.user_id information. This is timeinner join Affiliations a consuming and slowon a.affiliation_id = ua.affiliation_id dawn your systeminner join UserWorkHistory uwh performanceon u.user_id = uwh.user_idinner join Affiliations waon uwh.affiliation_id = wa.affiliation_id
CONCLUTIONThere are advantages and disadvantages to normalizing database schemas butthe big question now is when is it appropriate to normalize and when notto. It is true that Database normalization is a formal process of designing onedatabase to eliminate redundant data, utilize space efficiently and reduceupdate errors. Anyone who has ever taken a database class has it drummedinto their heads that a normalized database is the only way to go. This is true forthe most part. However there are certain scenarios where the benefits ofdatabase normalization are outweighed by its costs. So I think people should notnormalize for the sake of normalization or because one school professor toldthem to but because the data tells them to for the following reasons.Normalization makes sense to their team, Normalization provides betterperformance. (they’re automatically measuring all the queries that flow throughtheir software, right?) Normalization prevents an onerous amount of duplicationor avoids risk of synchronization problems that their problem domain or usersare particularly sensitive to and finally that Normalization allows them to writesimpler queries and code.
REFERENCES R. Ramakrishnan, J. Gehrke “Database Management System”, McGraw-Hill, New York 2003 Dare O. “When Not To Normalize Your SQL Database” http://www.25hoursaday.com/weblog/commentview.aspx? Microsoft Support, “Description Of The Database Normalization Basics” http://support.microsoft.com/kb/283878#top