NORMALIZATION
Normalization
• Normalization is the process of efficiently
  organizing data in a database.
• It applies a series of rules (the normal forms)
  so that the database has a well-organized structure
  with little redundancy.
Why do we normalize?
• There are two goals of the normalization process:
1- Eliminating redundant data.
   For example: avoiding the same data being stored in
   more than one table.
2- Ensuring data dependencies make sense.
   For example: storing only related data in a table.
Both goals are worthwhile: they reduce the amount of
space a database consumes and ensure that data is logically
stored.
Database Anomalies:
• Repetition anomaly
• Insertion anomaly
• Deletion anomaly
• Update anomaly
First Normal Form (1NF)
• Each attribute must be atomic.
  • No repeating groups of columns within a row.
  • No multi-valued columns.
• 1NF simplifies attributes.
  • Queries become easier.
1NF

Employee (unnormalized)
emp_no  emp_name  project_id  project_name       grade  salary
142     John      113, 124    blue star, magnum  A      20,000
168     James     113         blue star          B      15,000
263     Andrew    113         blue star          C      10,000

Employee (1NF)
emp_no  emp_name  project_id  project_name  grade  salary
142     John      113         blue star     A      20,000
142     John      124         magnum        A      20,000
168     James     113         blue star     B      15,000
263     Andrew    113         blue star     C      10,000
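
The step from the unnormalized table to 1NF can be sketched in a few lines of Python. This is only an illustration: the helper name `to_1nf` is hypothetical, and it assumes the comma-separated project_id and project_name lists are positionally aligned, as they are in the table above.

```python
# Unnormalized rows: project_id and project_name hold several values at once.
unnormalized = [
    # (emp_no, emp_name, project_id, project_name, grade, salary)
    (142, "John", "113, 124", "blue star, magnum", "A", 20000),
    (168, "James", "113", "blue star", "B", 15000),
    (263, "Andrew", "113", "blue star", "C", 10000),
]


def to_1nf(rows):
    """Emit one row per (employee, project) pair so every value is atomic."""
    result = []
    for emp_no, emp_name, project_ids, project_names, grade, salary in rows:
        ids = [p.strip() for p in project_ids.split(",")]
        names = [p.strip() for p in project_names.split(",")]
        # Assumption: the i-th id belongs to the i-th name.
        for project_id, project_name in zip(ids, names):
            result.append(
                (emp_no, emp_name, int(project_id), project_name, grade, salary)
            )
    return result


employee_1nf = to_1nf(unnormalized)
for row in employee_1nf:
    print(row)
```

John's single multi-valued row becomes two atomic rows, which is why the 1NF table has four rows instead of three.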
Second Normal Form (2NF)
• The table is in 1NF, and each non-key attribute is
  functionally dependent on the whole primary key.
  • Partial functional dependency: a non-key attribute
    depends on only part of a composite primary key.
    2NF does not allow this.
  • Partially dependent attributes are moved into a
    smaller (subset) table.
• 2NF improves data integrity.
  • Reduces update, insert, and delete anomalies.
Functional Dependency
emp_name, grade and salary are functionally
dependent only on emp_no (emp_no -> emp_name,
grade, salary).
project_name is functionally dependent only on
project_id (project_id -> project_name).
After the split, the tables must still be related
through foreign keys.

Employee (1NF)
emp_no  emp_name  grade  salary
142     John      A      20,000
168     James     B      15,000
263     Andrew    C      10,000
109     Bob       C      10,000

Project (1NF)
project_id  project_name
113         blue star
124         magnum
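
A functional dependency can be tested against sample data with a small check like the one below. The function name `holds` is hypothetical, and the rows are the four 1NF Employee rows from the earlier example. Note that sample data can only refute a dependency; whether it holds in general is a design decision.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are tuples of column names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        # setdefault stores the first value seen for a key and returns it,
        # so a later, different value for the same key breaks the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


columns = ("emp_no", "emp_name", "project_id", "project_name", "grade", "salary")
employee_1nf = [
    dict(zip(columns, values))
    for values in [
        (142, "John", 113, "blue star", "A", 20000),
        (142, "John", 124, "magnum", "A", 20000),
        (168, "James", 113, "blue star", "B", 15000),
        (263, "Andrew", 113, "blue star", "C", 10000),
    ]
]

print(holds(employee_1nf, ("emp_no",), ("emp_name", "grade", "salary")))  # True
print(holds(employee_1nf, ("project_id",), ("project_name",)))            # True
print(holds(employee_1nf, ("emp_no",), ("project_id",)))                  # False
```

The last check fails because emp_no 142 maps to two different project_id values, so emp_no alone cannot be the key of the combined table.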
2NF

Employee (2NF)
emp_no  emp_name  grade  salary
142     John      A      20,000
168     James     B      15,000
263     Andrew    C      10,000
109     Bob       C      10,000

Project (2NF)
project_id  project_name
113         blue star
124         magnum
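
The 2NF split is a set of relational projections: keep some columns and drop duplicate rows. Here is a minimal sketch, assuming the four 1NF Employee rows from the earlier example as input; the helper name `projection` is hypothetical. Bob (emp_no 109) is not in that 1NF table, so he does not appear in the output.

```python
employee_1nf = [
    # (emp_no, emp_name, project_id, project_name, grade, salary)
    (142, "John", 113, "blue star", "A", 20000),
    (142, "John", 124, "magnum", "A", 20000),
    (168, "James", 113, "blue star", "B", 15000),
    (263, "Andrew", 113, "blue star", "C", 10000),
]


def projection(rows, indexes):
    """Keep only the chosen column positions and drop duplicate rows."""
    result = []
    for row in rows:
        picked = tuple(row[i] for i in indexes)
        if picked not in result:
            result.append(picked)
    return result


# Attributes that depend only on emp_no.
employee = projection(employee_1nf, (0, 1, 4, 5))
# Attributes that depend only on project_id.
project = projection(employee_1nf, (2, 3))
# The link between the two, which keeps the assignments.
employee_project = projection(employee_1nf, (0, 2))

print(employee)
print(project)
print(employee_project)
```

John now appears once in `employee` instead of twice, and each project name is stored once in `project`.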
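Data Integrity
• Insert anomaly: a fact cannot be added without supplying
  unrelated or null values. E.g., in a single combined table a
  new project cannot be stored until an employee is assigned
  to it; with a separate Project table, inserting a new project
  does not require an emp_no.
• Update anomaly: a single name change requires multiple
  updates, which degrades performance and risks inconsistent
  data. E.g., changing the magnum project_name to meganum
  must touch every row that mentions magnum.
• Delete anomaly: deleting one fact also deletes wanted
  information. E.g., in a single combined table, deleting a
  project also removes the personal info of every employee
  assigned only to that project.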
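
The three anomalies can be reproduced with Python's built-in `sqlite3` module. This is a sketch under assumptions: the table names `employee_1nf` and `project`, the new project 130, and the new name 'blue moon' are made up for illustration. Project 113 is renamed rather than magnum because it is the one with several rows in the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Single combined table, as in the 1NF example.
    CREATE TABLE employee_1nf (
        emp_no       INTEGER NOT NULL,
        emp_name     TEXT,
        project_id   INTEGER NOT NULL,
        project_name TEXT,
        grade        TEXT,
        salary       INTEGER,
        PRIMARY KEY (emp_no, project_id)
    );
    -- Separate project table, as after decomposition.
    CREATE TABLE project (
        project_id   INTEGER PRIMARY KEY,
        project_name TEXT
    );
""")
conn.executemany(
    "INSERT INTO employee_1nf VALUES (?, ?, ?, ?, ?, ?)",
    [
        (142, "John", 113, "blue star", "A", 20000),
        (142, "John", 124, "magnum", "A", 20000),
        (168, "James", 113, "blue star", "B", 15000),
        (263, "Andrew", 113, "blue star", "C", 10000),
    ],
)
conn.executemany(
    "INSERT INTO project VALUES (?, ?)", [(113, "blue star"), (124, "magnum")]
)

# Insert anomaly: the combined table rejects a project that has no employee.
try:
    conn.execute(
        "INSERT INTO employee_1nf (project_id, project_name) "
        "VALUES (130, 'new project')"
    )
    insert_blocked = False
except sqlite3.IntegrityError:
    insert_blocked = True
# The separate project table accepts the same project without an emp_no.
conn.execute("INSERT INTO project VALUES (130, 'new project')")

# Update anomaly: one rename touches one row per assigned employee.
wide = conn.execute(
    "UPDATE employee_1nf SET project_name = 'blue moon' WHERE project_id = 113"
).rowcount
narrow = conn.execute(
    "UPDATE project SET project_name = 'blue moon' WHERE project_id = 113"
).rowcount

# Delete anomaly: removing project 113 also removes James and Andrew.
conn.execute("DELETE FROM employee_1nf WHERE project_id = 113")
remaining = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT emp_no FROM employee_1nf ORDER BY emp_no"
    )
]

print(insert_blocked, wide, narrow, remaining)
```

In the combined table the rename updates three rows and the delete leaves only emp_no 142; in the separate project table the rename updates one row and no employee data is involved.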
Third Normal Form (3NF)
• The table is in 2NF, and transitive dependencies are removed.
  • Transitive dependency: a non-key attribute is
    determined by another non-key attribute rather than
    directly by the key.
  • Attributes involved in a transitive dependency are
    moved into a smaller (subset) table.
• 3NF further improves data integrity.
  • Reduces update, insert, and delete anomalies.
Transitive Dependency
salary is determined by grade, and only indirectly by
the key attribute emp_no (emp_no -> grade,
grade -> salary).
This transitive dependency needs to be removed, so
grade and salary are moved to another table.

Employee (2NF, before the transitive dependency is removed)
emp_no  emp_name  grade  salary
142     John      A      20,000
168     James     B      15,000
263     Andrew    C      10,000
109     Bob       C      10,000
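
Removing the transitive dependency can be sketched as follows, using the Employee rows that still carry salary. The variable names are hypothetical. The loop first confirms that grade -> salary holds in the sample data, then splits salary out into its own lookup.

```python
employee_with_salary = [
    # (emp_no, emp_name, grade, salary)
    (142, "John", "A", 20000),
    (168, "James", "B", 15000),
    (263, "Andrew", "C", 10000),
    (109, "Bob", "C", 10000),
]

# Build the grade -> salary lookup, refusing to continue if the
# dependency does not hold (one grade with two different salaries).
grade_salary = {}
for _emp_no, _emp_name, grade, salary in employee_with_salary:
    if grade_salary.setdefault(grade, salary) != salary:
        raise ValueError(f"grade {grade!r} maps to more than one salary")

# The employee table keeps grade as a reference and drops salary.
employee_3nf = [
    (emp_no, emp_name, grade)
    for emp_no, emp_name, grade, _salary in employee_with_salary
]

print(grade_salary)
print(employee_3nf)
```

Andrew and Bob share grade C, so the salary 10,000 is now stored once instead of twice.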
3NF

Employee (3NF)
emp_no  emp_name  grade
142     John      A
168     James     B
263     Andrew    C
109     Bob       C

Project (3NF)
project_id  project_name
113         blue star
124         magnum

Employee Project (3NF)
emp_no  project_id
142     113
142     124
168     113
263     113
109     124

Grade Salary (3NF)
grade  salary
A      20,000
B      15,000
C      10,000
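
The four 3NF tables can be expressed as a schema with foreign keys. The sketch below uses Python's `sqlite3` module; the snake_case table names are assumptions, and emp_no 999 is a made-up value used only to show a foreign key rejecting a bad row. SQLite enforces foreign keys only when the `foreign_keys` pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
    CREATE TABLE grade_salary (
        grade  TEXT PRIMARY KEY,
        salary INTEGER NOT NULL
    );
    CREATE TABLE employee (
        emp_no   INTEGER PRIMARY KEY,
        emp_name TEXT NOT NULL,
        grade    TEXT NOT NULL REFERENCES grade_salary(grade)
    );
    CREATE TABLE project (
        project_id   INTEGER PRIMARY KEY,
        project_name TEXT NOT NULL
    );
    CREATE TABLE employee_project (
        emp_no     INTEGER NOT NULL REFERENCES employee(emp_no),
        project_id INTEGER NOT NULL REFERENCES project(project_id),
        PRIMARY KEY (emp_no, project_id)
    );
""")
conn.executemany("INSERT INTO grade_salary VALUES (?, ?)",
                 [("A", 20000), ("B", 15000), ("C", 10000)])
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(142, "John", "A"), (168, "James", "B"),
                  (263, "Andrew", "C"), (109, "Bob", "C")])
conn.executemany("INSERT INTO project VALUES (?, ?)",
                 [(113, "blue star"), (124, "magnum")])
conn.executemany("INSERT INTO employee_project VALUES (?, ?)",
                 [(142, 113), (142, 124), (168, 113), (263, 113), (109, 124)])

# Joining the four tables rebuilds the wide view without storing it.
wide_rows = conn.execute("""
    SELECT e.emp_no, e.emp_name, p.project_id, p.project_name,
           e.grade, g.salary
    FROM employee AS e
    JOIN employee_project AS ep ON ep.emp_no = e.emp_no
    JOIN project AS p ON p.project_id = ep.project_id
    JOIN grade_salary AS g ON g.grade = e.grade
    ORDER BY e.emp_no, p.project_id
""").fetchall()
for row in wide_rows:
    print(row)

# The foreign key rejects an assignment for an unknown employee.
try:
    conn.execute("INSERT INTO employee_project VALUES (999, 113)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(orphan_rejected)
```

Each fact is stored once, and the join recovers the combined rows on demand.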
Denormalization
• Denormalization is the technique of deliberately moving
  from a higher to a lower normal form in order to speed up
  database access, usually reads, at the cost of
  reintroducing some redundancy.
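
As a rough illustration of that trade-off, the sketch below builds a denormalized copy of the employee data with `sqlite3`. The table name `employee_with_salary` is hypothetical. Reads no longer need a join, but the same salary is stored once per employee again.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE grade_salary (grade TEXT PRIMARY KEY, salary INTEGER);
    CREATE TABLE employee (
        emp_no INTEGER PRIMARY KEY, emp_name TEXT, grade TEXT
    );
    INSERT INTO grade_salary VALUES ('A', 20000), ('B', 15000), ('C', 10000);
    INSERT INTO employee VALUES
        (142, 'John', 'A'), (168, 'James', 'B'),
        (263, 'Andrew', 'C'), (109, 'Bob', 'C');

    -- Denormalized copy: salary is pulled back next to each employee.
    CREATE TABLE employee_with_salary AS
        SELECT e.emp_no, e.emp_name, e.grade, g.salary
        FROM employee AS e
        JOIN grade_salary AS g ON g.grade = e.grade;
""")

# The read is a single-table lookup with no join.
bob = conn.execute(
    "SELECT emp_name, salary FROM employee_with_salary WHERE emp_no = 109"
).fetchone()

# The cost: the grade C salary is stored in more than one row again.
copies_of_c = conn.execute(
    "SELECT COUNT(*) FROM employee_with_salary WHERE grade = 'C'"
).fetchone()[0]

print(bob, copies_of_c)
```

A change to the grade C salary would now have to be applied to both the lookup table and every copied row, which is the update anomaly that 3NF removed.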

SQL Server, session 3: normalization


Editor's Notes

  • #13 Note: dept_name is functionally dependent on dept_no, and dept_no is functionally dependent on emp_no. Through the intermediate step of dept_no, dept_name is therefore transitively dependent on emp_no (emp_no -> dept_no, dept_no -> dept_name, thus emp_no -> dept_name).