2. 16 November 2016
1. Introduction.
2. History.
3. Purpose of normalization.
4. Types.
5. Advantages and Disadvantages.
6. Conclusion.
3. • Normalization is the process of removing redundant data
from tables to improve storage efficiency, data
integrity, and scalability.
• Normalization generally involves splitting existing tables into
multiple smaller ones, which must be re-joined or linked to regenerate
the original table.
• In the relational model, the degree of normalization of a table is
classified by a series of levels. These classifications are called normal
forms (NF).
4. • Edgar F. Codd, the inventor of the relational
model, introduced the concept of normalization
in 1970 with the first normal form (1NF).
• Edgar F. Codd defined the second normal form
(2NF) and third normal form (3NF) in 1971.
• Edgar F. Codd and Raymond F. Boyce defined the
Boyce-Codd Normal Form (BCNF) in 1974.
5. • First Normal Form (1NF)
• Second Normal Form (2NF)
• Third Normal Form (3NF)
• Boyce-Codd Normal Form (BCNF)
• Fourth Normal Form (4NF)
• Fifth Normal Form (5NF)
In practice, 1NF, 2NF, and 3NF are sufficient for most databases.
6. • A relation is in 1NF if every attribute is atomic:
1. Each attribute name must be unique.
2. Each attribute must hold a single value.
3. Each row must be unique.
4. There are no repeating groups.
7. Not in 1NF:

First Name  Last Name  Knowledge
Sohan       Verma      C, C++, Java
Mohan       Sah        PHP, Java
Aditya      Patel      C, PHP

In 1NF:

First Name  Last Name  Knowledge
Sohan       Verma      C
Sohan       Verma      C++
Sohan       Verma      Java
Mohan       Sah        PHP
Mohan       Sah        Java
Aditya      Patel      C
Aditya      Patel      PHP
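
The conversion between the two tables above can be sketched in a few lines of Python. This is only an illustration: the function name `to_first_normal_form` and the tuple layout are assumptions made for the example, not part of the original slides.

```python
# Unnormalized rows: the Knowledge column holds a comma-separated list,
# so its values are not atomic.
unnormalized = [
    ("Sohan", "Verma", "C, C++, Java"),
    ("Mohan", "Sah", "PHP, Java"),
    ("Aditya", "Patel", "C, PHP"),
]


def to_first_normal_form(rows):
    """Emit one row per (first name, last name, single skill)."""
    result = []
    for first_name, last_name, knowledge in rows:
        for skill in knowledge.split(","):
            result.append((first_name, last_name, skill.strip()))
    return result


rows_1nf = to_first_normal_form(unnormalized)
for row in rows_1nf:
    print(row)
```

Each output row now holds exactly one skill, which gives the seven rows of the second table.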
8. • A relation is in second normal form if it satisfies the following
conditions:
• It is in first normal form.
• All non-key attributes are fully functionally dependent on the primary
key, that is, on the whole key and not on just part of a composite key.
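
One way to look for 2NF problems is to test whether part of a composite key already determines a non-key column. The sketch below does this against sample data. The `enrollment` table and the helper names `fd_holds` and `partial_dependencies` are hypothetical and were invented for this illustration. A dependency that holds in sample rows is only a hint, since the real dependencies come from the meaning of the data.

```python
from itertools import combinations


def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are tuples of column names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        # The first value seen for a key is stored; a different later
        # value means the dependency is violated.
        if seen.setdefault(key, value) != value:
            return False
    return True


def partial_dependencies(rows, primary_key, non_key_columns):
    """List (key subset, column) pairs where a proper subset of the
    primary key already determines a non-key column."""
    found = []
    for size in range(1, len(primary_key)):
        for subset in combinations(primary_key, size):
            for column in non_key_columns:
                if fd_holds(rows, subset, (column,)):
                    found.append((subset, column))
    return found


# Hypothetical table with the composite key (student_id, course_id).
enrollment = [
    {"student_id": 1, "course_id": "DB", "student_name": "Asha", "grade": "A"},
    {"student_id": 1, "course_id": "OS", "student_name": "Asha", "grade": "B"},
    {"student_id": 2, "course_id": "DB", "student_name": "Ravi", "grade": "B"},
    {"student_id": 2, "course_id": "OS", "student_name": "Ravi", "grade": "A"},
]

violations = partial_dependencies(
    enrollment, ("student_id", "course_id"), ("student_name", "grade")
)
print(violations)  # [(('student_id',), 'student_name')]
```

Here `student_name` depends on `student_id` alone, so the table is not in 2NF. Moving `student_id` and `student_name` into their own table would remove the partial dependency.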
10. • A table design is said to be in 3NF if both of the following conditions
hold:
• The table is in 2NF.
• No non-prime attribute is transitively dependent on a candidate key:
every non-key attribute depends directly on the key, not on another
non-key attribute.
11. employee (original table):

emp_id  emp_name  emp_zip  emp_state  emp_city  emp_district
1001    John      282005   UP         Agra      Dayal Bagh
1002    Ajeet     222008   TN         Chennai   M-City
1006    Lora      282007   TN         Chennai   Urrapakkam
1101    Lilly     292008   UK         Pauri     Bhagwan
1201    Steve     222999   MP         Gwalior   Ratan

employee table:

emp_id  emp_name  emp_zip
1001    John      282005
1002    Ajeet     222008
1006    Lora      282007
1101    Lilly     292008
1201    Steve     222999

employee_zip table:

emp_zip  emp_state  emp_city  emp_district
282005   UP         Agra      Dayal Bagh
222008   TN         Chennai   M-City
282007   TN         Chennai   Urrapakkam
292008   UK         Pauri     Bhagwan
222999   MP         Gwalior   Ratan
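
The decomposition above can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch: the column types and constraints are assumptions chosen for the example, while the table names, column names, and rows come from the slide. Joining the two tables on `emp_zip` regenerates the original rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee_zip (
    emp_zip      TEXT PRIMARY KEY,
    emp_state    TEXT NOT NULL,
    emp_city     TEXT NOT NULL,
    emp_district TEXT NOT NULL
);
CREATE TABLE employee (
    emp_id   INTEGER PRIMARY KEY,
    emp_name TEXT NOT NULL,
    emp_zip  TEXT NOT NULL REFERENCES employee_zip(emp_zip)
);
""")

conn.executemany(
    "INSERT INTO employee_zip VALUES (?, ?, ?, ?)",
    [
        ("282005", "UP", "Agra", "Dayal Bagh"),
        ("222008", "TN", "Chennai", "M-City"),
        ("282007", "TN", "Chennai", "Urrapakkam"),
        ("292008", "UK", "Pauri", "Bhagwan"),
        ("222999", "MP", "Gwalior", "Ratan"),
    ],
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        (1001, "John", "282005"),
        (1002, "Ajeet", "222008"),
        (1006, "Lora", "282007"),
        (1101, "Lilly", "292008"),
        (1201, "Steve", "222999"),
    ],
)

# Re-join the two tables to regenerate the original wide table.
original = conn.execute("""
    SELECT e.emp_id, e.emp_name, e.emp_zip,
           z.emp_state, z.emp_city, z.emp_district
    FROM employee AS e
    JOIN employee_zip AS z ON z.emp_zip = e.emp_zip
    ORDER BY e.emp_id
""").fetchall()

for row in original:
    print(row)
```

State, city, and district are now stored once per zip code, so a change to a zip code's details touches a single row in `employee_zip`.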
12. • BCNF is an advanced version of 3NF, which is why it is also referred to as 3.5NF.
BCNF is stricter than 3NF. A table complies with BCNF if it is in 3NF
and, for every non-trivial functional dependency X -> Y, X is a superkey
of the table.
1. The table is already in 3NF.
2. All determinants must be superkeys.
13.
emp_id  emp_nationality  emp_dept                      dept_type  dept_no_of_emp
1001    Austrian         Production and planning       D001       200
1001    Austrian         stores                        D001       250
1002    American         design and technical support  D134       100
1002    American         Purchasing department         D134       600

emp_id  emp_nationality
1001    Austrian
1002    American

emp_dept                      dept_type  dept_no_of_emp
Production and planning       D001       200
stores                        D001       250
design and technical support  D134       100
Purchasing department         D134       600
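
The BCNF condition can be checked mechanically by computing attribute closures. The sketch below assumes two functional dependencies that the example data suggests but the slide does not state: `emp_id -> emp_nationality` and `emp_dept -> dept_type, dept_no_of_emp`. The helper names `closure` and `bcnf_violations` are made up for this illustration.

```python
def closure(attributes, fds):
    """Return the set of attributes determined by `attributes` under `fds`.

    fds is a list of (lhs, rhs) pairs, each a tuple of attribute names.
    """
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attributes, fds):
    """Return the non-trivial FDs whose determinant is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial dependency, always allowed
        if closure(lhs, fds) != set(all_attributes):
            violations.append((lhs, rhs))
    return violations


attrs = ("emp_id", "emp_nationality", "emp_dept", "dept_type", "dept_no_of_emp")

# Assumed dependencies, inferred from the example rows.
fds = [
    (("emp_id",), ("emp_nationality",)),
    (("emp_dept",), ("dept_type", "dept_no_of_emp")),
]

# In the original wide table neither determinant is a superkey.
print(bcnf_violations(attrs, fds))

# In the decomposed (emp_id, emp_nationality) table, emp_id is the key.
print(bcnf_violations(("emp_id", "emp_nationality"), fds[:1]))  # []
```

Under these assumptions only the pair (`emp_id`, `emp_dept`) determines every attribute of the wide table, so both dependencies violate BCNF until the table is split.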
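14. • A table is in fourth normal form (4NF) if:
1. The table is already in BCNF.
2. The table contains no non-trivial multi-valued dependencies, other than
those whose determinant is a superkey.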
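
A multi-valued dependency X ->> Y means that, for each value of X, the set of Y values is independent of the remaining columns. The sketch below tests this on sample rows. The `emp`, `skill`, and `language` columns and the helper name `mvd_holds` are hypothetical, chosen only for illustration, and a check on sample data is a hint rather than a proof.

```python
from collections import defaultdict


def mvd_holds(rows, x, y):
    """Return True if the multi-valued dependency x ->> y holds in rows.

    rows is a non-empty list of dicts with identical keys; x and y are
    tuples of column names. z is every remaining column.
    """
    z = [c for c in rows[0] if c not in x and c not in y]
    groups = defaultdict(set)
    for row in rows:
        x_value = tuple(row[c] for c in x)
        y_value = tuple(row[c] for c in y)
        z_value = tuple(row[c] for c in z)
        groups[x_value].add((y_value, z_value))
    for pairs in groups.values():
        y_values = {pair[0] for pair in pairs}
        z_values = {pair[1] for pair in pairs}
        # The dependency holds only if every y value is combined with
        # every z value within the same x group.
        if len(pairs) != len(y_values) * len(z_values):
            return False
    return True


# Hypothetical table: skills and spoken languages are independent facts.
rows = [
    {"emp": "Sohan", "skill": "C", "language": "Hindi"},
    {"emp": "Sohan", "skill": "C", "language": "English"},
    {"emp": "Sohan", "skill": "Java", "language": "Hindi"},
    {"emp": "Sohan", "skill": "Java", "language": "English"},
]

print(mvd_holds(rows, ("emp",), ("skill",)))      # True
print(mvd_holds(rows[:3], ("emp",), ("skill",)))  # False
```

Because `emp` is not a superkey here, the dependency `emp ->> skill` puts the table outside 4NF. Splitting it into (`emp`, `skill`) and (`emp`, `language`) stores each fact once.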
15.
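16. • A table is in fifth normal form (5NF) if:
1. The table is already in 4NF.
2. Every non-trivial join dependency in the table is implied by its
candidate keys, so it cannot be split further without losing information.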
17.
18. • Advantages of normalization:
1. Smaller database: Eliminating duplicate data reduces
the overall size of the database.
2. Better performance:
a. Narrow tables: More finely divided tables have fewer
columns, so more records fit on each data page.
b. Fewer indexes per table mean faster maintenance tasks such as index
rebuilds.
c. Queries join only the tables they need.
• Disadvantages of normalization:
1. More tables to join: Spreading data across more tables increases the need to join
tables.
2. Tables contain codes instead of real data: Repeated data is stored as codes rather than
meaningful values, so a query must go to the lookup table to get the value.
3. Data model is difficult to query against: The data model is optimized for applications, not for ad
hoc querying.