Normalization
1. Definition
▪ Relational database theory includes guidance on what is and what is not
good database design. This guidance is expressed as sets of restrictions,
each of which excludes certain undesirable properties from a database
design. These sets of restrictions are called normal forms, and
normalization is the process of creating a database design that does not
violate them.
▪ Normalization is a logical design method that minimizes data redundancy and
reduces design flaws.
• It consists of applying the various normal forms to the database design.
• Applying the normal forms breaks large tables down into smaller tables.
Normalization
2. Data Redundancy or Duplication
▪ Department name and manager are duplicated for all employees in the
department.
If redundancy exists, it can cause problems during normal
database operations:
• When data is inserted into the database, it must be duplicated wherever
redundant versions of that data exist.
• When data is updated, all redundant copies must be updated together to
reflect the change.

Consequences of redundancy
• Wasted space
• Potential performance cost
• Potential inconsistency
• Inability to represent data
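
The inconsistency risk can be illustrated with a minimal sketch. The flat employee table, its column names, and its values are hypothetical; they only mirror the department name and manager duplication described on this slide. The helper reports departments whose redundant copies no longer agree after an update that reaches only one row.

```python
# Hypothetical flat employee table: department name and manager are
# repeated on every employee row (all names and values are illustrative).
employees = [
    {"emp": "Adams", "deptid": 1, "deptname": "Sales", "manager": "Jones"},
    {"emp": "Baker", "deptid": 1, "deptname": "Sales", "manager": "Jones"},
    {"emp": "Clark", "deptid": 2, "deptname": "IT", "manager": "Lee"},
]


def inconsistent_departments(rows):
    """Return the sorted deptids whose redundant copies disagree."""
    seen = {}
    for row in rows:
        copies = seen.setdefault(row["deptid"], set())
        copies.add((row["deptname"], row["manager"]))
    return sorted(d for d, copies in seen.items() if len(copies) > 1)


# An update that reaches only one of the two redundant copies.
employees[0]["manager"] = "Smith"
print(inconsistent_departments(employees))  # department 1 now has two managers
```

If department facts were stored once, in a table of their own, this kind of disagreement could not arise.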
Normalization
3. Anomalies
Insertion anomalies
If a new employee is added to department 2 (deptid 2), all the information
about the department (name and manager) has to be entered again.
A new department, say IT, can be added only when at least one employee
exists for it.

Deletion anomalies
If the last employee of a department is deleted, the details of that
department disappear from the database.
Update anomalies
If Jones, the manager of the sales department, is replaced by Smith, the
records of all employees in the sales department must be modified.
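
A minimal sketch of the deletion anomaly, using Python's built-in sqlite3 module. The table names (emp_flat, dept, emp), column names, and sample values are assumptions made for illustration and do not come from the slides. The same delete is run against a flat design and against a design with a separate department table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Flat design: department facts live only on employee rows.
conn.execute(
    "CREATE TABLE emp_flat (emp TEXT PRIMARY KEY, deptid INTEGER, "
    "deptname TEXT, manager TEXT)"
)
conn.execute("INSERT INTO emp_flat VALUES ('Clark', 2, 'IT', 'Lee')")

# Normalized design: department facts are stored once, in their own table.
conn.execute(
    "CREATE TABLE dept (deptid INTEGER PRIMARY KEY, deptname TEXT, manager TEXT)"
)
conn.execute(
    "CREATE TABLE emp (emp TEXT PRIMARY KEY, deptid INTEGER REFERENCES dept(deptid))"
)
conn.execute("INSERT INTO dept VALUES (2, 'IT', 'Lee')")
conn.execute("INSERT INTO emp VALUES ('Clark', 2)")

# Delete the last (only) employee of department 2 in both designs.
conn.execute("DELETE FROM emp_flat WHERE emp = 'Clark'")
conn.execute("DELETE FROM emp WHERE emp = 'Clark'")

flat_depts = conn.execute("SELECT DISTINCT deptid, deptname FROM emp_flat").fetchall()
norm_depts = conn.execute("SELECT deptid, deptname FROM dept").fetchall()
print(flat_depts)  # the flat design has lost department 2
print(norm_depts)  # the separate dept table still records it
```

The separate dept table also avoids the insertion anomaly, because a department row can be inserted before any employee exists for it.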
Normalization
4. Benefits of Normalization
• Less storage space
• Quicker updates
• Less data inconsistency
• Clearer data relationships
• Easier to add data
• Flexible structure
Normal Form
1. First Normal Form - 1NF
• A relation is said to be in first normal form (1NF) if all attribute values are atomic:
there are no repeating groups and no composite attributes, and a primary key is identified.
Patient # | Surgeon # | Surgery date | Patient name | Patient addr | Surgeon name | Surgery | Drug admin | Side effects
1111 | 145 | 01-jan-95 | John White | 15 New St. New York, NY | Beth Little | Gallstones removal | Penicillin | Rash
1111 | 311 | 12-jun-95 | John White | 15 New St. New York, NY | Michael Diamond | Kidney stones removal | None | None
1234 | 243 | 05-apr-94 | Mary Jones | 10 Main St. Rye, NY | Charles Field | Eye cataract removal | Tetracycline | Fever
1234 | 467 | 10-may-95 | Mary Jones | 10 Main St. Rye, NY | Patricia Gold | Thrombosis removal | None | None
2345 | 189 | 08-jan-96 | Charles Brown | Dogwood Lane Harrison, NY | David Rosen | Open heart surgery | Cephalosporin | None
4876 | 145 | 05-nov-95 | Hal Kane | 55 Boston Post Road, Chester, CN | Beth Little | Cholecystectomy | Demicillin | None
5123 | 145 | 10-may-95 | Paul Kosher | Blind Brook Mamaroneck, NY | Beth Little | Gallstones removal | None | None
6845 | 243 | 05-apr-94 | Ann Hood | Hilton Road Larchmont, NY | Charles Field | Eye cornea replacement | Tetracycline | Fever
6845 | 243 | 15-dec-84 | Ann Hood | Hilton Road Larchmont, NY | Charles Field | Eye cataract removal | None | None

• All attribute values are atomic because there are no repeating groups and no
composite attributes.
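
As a rough illustration of removing a repeating group, the sketch below flattens a record whose surgeries attribute holds a list into one row per surgery. The record layout, the field names, and the helper to_1nf are hypothetical; only the patient number, name, and surgery names are taken from the table above.

```python
# Hypothetical un-normalized record: 'surgeries' is a repeating group.
patient = {
    "patient_no": 1234,
    "name": "Mary Jones",
    "surgeries": ["Eye cataract removal", "Thrombosis removal"],
}


def to_1nf(record, group_field, item_field):
    """Emit one row per item of the repeating group, so every value is atomic."""
    base = {k: v for k, v in record.items() if k != group_field}
    return [{**base, item_field: item} for item in record[group_field]]


rows = to_1nf(patient, "surgeries", "surgery")
for row in rows:
    print(row)
```

Each resulting row carries a single surgery value, at the price of repeating the patient's name. That repetition is the redundancy the later normal forms address.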
Normal Form
2. Second Normal Form - 2NF
Second normal form (2NF) goes further in removing duplicated data:
• A relation r is in 2NF if
(a) r is in 1NF, and
(b) every non-prime attribute is fully functionally dependent on every candidate key.
• A prime attribute is one that appears in some candidate key.
• Partial dependencies must be eliminated: in 2NF, no non-prime attribute depends on only
part of a composite key.
• To remove a partial dependency, move the attributes that depend on only part of the
composite key into a separate table keyed by that part, and relate the new tables to
the original table through foreign keys.
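
A minimal sketch of this decomposition, using three rows from the surgery table. It assumes the composite key is (patient #, surgeon #, surgery date), which the slides do not state explicitly. The variable names are illustrative. Patient name depends only on patient # and surgeon name only on surgeon #, so each moves to its own table.

```python
# Rows from the surgery table, assumed to be keyed by
# (patient_no, surgeon_no, surgery_date).
surgery_flat = [
    (1234, 243, "05-apr-94", "Mary Jones", "Charles Field", "Eye cataract removal"),
    (1234, 467, "10-may-95", "Mary Jones", "Patricia Gold", "Thrombosis removal"),
    (6845, 243, "05-apr-94", "Ann Hood", "Charles Field", "Eye cornea replacement"),
]

patients = {}   # patient_no -> patient name (depends on patient_no only)
surgeons = {}   # surgeon_no -> surgeon name (depends on surgeon_no only)
surgeries = []  # attributes that depend on the whole composite key

for patient_no, surgeon_no, date, patient_name, surgeon_name, surgery in surgery_flat:
    patients[patient_no] = patient_name
    surgeons[surgeon_no] = surgeon_name
    # patient_no and surgeon_no remain here as foreign keys.
    surgeries.append((patient_no, surgeon_no, date, surgery))

print(patients)
print(surgeons)
print(surgeries)
```

After the split, each patient name and each surgeon name is stored once, and the surgeries list refers to them by number.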
Normal Form
3. Third Normal Form - 3NF
A relation is said to be in third normal form (3NF) if it is in 2NF and there is no
transitive functional dependency between non-key attributes.
• When one non-key attribute can be determined by one or more other non-key attributes,
there is said to be a transitive functional dependency.
• In the surgery table, the side effect column is determined by the drug administered.
• Side effect therefore depends on the key only transitively, through drug, so the surgery table is not in 3NF.
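
The dependency of side effect on drug can be checked against sample rows with a small helper. This is only a sketch: holds_fd and the field names are hypothetical, and the drug and side effect pairs are copied from the surgery table. A check on sample data can refute a functional dependency but cannot prove that it holds in general.

```python
def holds_fd(rows, lhs, rhs):
    """Return True if every lhs value maps to a single rhs value in rows."""
    mapping = {}
    for row in rows:
        # setdefault stores the first rhs seen for this lhs and returns it.
        if mapping.setdefault(row[lhs], row[rhs]) != row[rhs]:
            return False
    return True


# (drug, side effect) pairs taken from the surgery table.
rows = [
    {"drug": "Penicillin", "side_effect": "Rash"},
    {"drug": "Tetracycline", "side_effect": "Fever"},
    {"drug": "Tetracycline", "side_effect": "Fever"},
    {"drug": "Cephalosporin", "side_effect": "None"},
    {"drug": "Demicillin", "side_effect": "None"},
]

print(holds_fd(rows, "drug", "side_effect"))  # drug determines side effect
print(holds_fd(rows, "side_effect", "drug"))  # the reverse does not hold
```

To reach 3NF, the drug and side effect pairs would move to a table of their own keyed by drug, with the surgery table keeping only the drug as a foreign key.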
Normal Form
4. Denormalization
Although normalization is performed to reduce or eliminate
insertion, deletion, and update anomalies, a completely normalized database
may not be the most efficient or effective implementation.
Denormalization is sometimes used to improve efficiency, and is
usually driven by the need to improve query speed.
Query speed is improved at the expense of more complex or error-prone
DML (data manipulation language) for updates, deletions, and insertions.
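
The trade-off can be sketched with Python's built-in sqlite3 module. The table names, column names, and the new surgeon name are assumptions made for illustration; the two surgery rows for surgeon 145 come from the surgery table. The denormalized table answers the read without a join, but a change to the surgeon's name has to touch every copy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE surgeon (surgeon_no INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE surgery (patient_no INTEGER, surgeon_no INTEGER, surgery TEXT);
CREATE TABLE surgery_denorm (patient_no INTEGER, surgeon_no INTEGER,
                             surgeon_name TEXT, surgery TEXT);
INSERT INTO surgeon VALUES (145, 'Beth Little');
INSERT INTO surgery VALUES (4876, 145, 'Cholecystectomy');
INSERT INTO surgery VALUES (5123, 145, 'Gallstones removal');
INSERT INTO surgery_denorm VALUES (4876, 145, 'Beth Little', 'Cholecystectomy');
INSERT INTO surgery_denorm VALUES (5123, 145, 'Beth Little', 'Gallstones removal');
""")

# Normalized read: a join is needed to get the surgeon's name.
joined = conn.execute(
    "SELECT s.patient_no, g.name FROM surgery s "
    "JOIN surgeon g ON g.surgeon_no = s.surgeon_no ORDER BY s.patient_no"
).fetchall()

# Denormalized read: a single table, no join.
flat = conn.execute(
    "SELECT patient_no, surgeon_name FROM surgery_denorm ORDER BY patient_no"
).fetchall()

# The cost: a name change touches one row in the normalized design
# and every redundant copy in the denormalized one.
one = conn.execute(
    "UPDATE surgeon SET name = 'B. Little' WHERE surgeon_no = 145"
).rowcount
many = conn.execute(
    "UPDATE surgery_denorm SET surgeon_name = 'B. Little' WHERE surgeon_no = 145"
).rowcount
print(joined == flat, one, many)
```

Whether avoiding the join actually speeds up a given query depends on the database engine and the data volume, so the benefit should be measured before denormalizing.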
