DBMS Normalization
• Normalization is a process in Database
Management Systems (DBMS) that
organizes the attributes and relations in a
database to reduce redundancy and
improve data integrity.
• Normalization aims to eliminate unwanted
characteristics like insertion, update, and
deletion anomalies.
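A minimal sketch of an update anomaly, assuming a hypothetical unnormalized enrollment table in which an advisor's office is repeated on every row. The table, column names, and values are illustrative only.

```python
# Hypothetical unnormalized table: the advisor's office is repeated per row.
enrollments = [
    {"student": "Ana", "course": "DB", "advisor": "Dr. Lee", "office": "B12"},
    {"student": "Ana", "course": "OS", "advisor": "Dr. Lee", "office": "B12"},
    {"student": "Raj", "course": "DB", "advisor": "Dr. Lee", "office": "B12"},
]

def offices_for(rows, advisor):
    """Return the set of offices recorded for one advisor."""
    return {r["office"] for r in rows if r["advisor"] == advisor}

# Update anomaly: changing only one of the repeated copies leaves
# the table contradicting itself.
enrollments[0]["office"] = "C30"
inconsistent = offices_for(enrollments, "Dr. Lee")  # two different offices

# Normalized alternative: the office is stored once, keyed by advisor.
advisors = {"Dr. Lee": {"office": "B12"}}
advisors["Dr. Lee"]["office"] = "C30"  # one update, nothing left to disagree
```

Insertion and deletion anomalies follow the same pattern: in the unnormalized table an advisor cannot be recorded without a student, and deleting the last enrollment also deletes the advisor's office.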
2.
What is Normalization?
• Normalization in DBMS is the process of
organizing the fields and tables of a
relational database to minimize
redundancy and undesirable dependencies.
• The goal is to ensure that the data is
stored in a way that reduces duplication
and maintains data integrity.
3.
Why Normalize Data?
• 1. Reduces redundancy and avoids data
duplication.
• 2. Minimizes the chances of anomalies like
update, insertion, and deletion anomalies.
• 3. Ensures data consistency and integrity.
• 4. Can make queries and updates more
efficient, since less duplicate data is
processed.
4.
Normal Forms (1NF, 2NF, 3NF, BCNF)
• 1. 1NF (First Normal Form): Ensures that
all columns contain atomic values and no
repeating groups.
• 2. 2NF (Second Normal Form): Ensures
that all non-key attributes are fully
functionally dependent on the primary key.
• 3. 3NF (Third Normal Form): Ensures that
all non-key attributes are not transitively
dependent on the primary key.
• 4. BCNF (Boyce-Codd Normal Form): A
stronger version of 3NF where every
determinant is a candidate key.
5.
1st Normal Form (1NF)
• A relation is in 1NF if:
• 1. All columns contain atomic (indivisible)
values.
• 2. Each column contains only one value
per row.
• 3. There is a unique identifier (Primary
Key) for each record in the table.
6.
2nd Normal Form (2NF)
• A relation is in 2NF if:
• 1. It is in 1NF.
• 2. Every non-prime attribute (attribute not
part of any candidate key) is fully
functionally dependent on every
candidate key (no partial dependencies).
7.
3rd Normal Form (3NF)
• A relation is in 3NF if:
• 1. It is in 2NF.
• 2. There are no transitive dependencies,
i.e., non-prime attributes do not depend on
other non-prime attributes.
8.
Boyce-Codd Normal Form (BCNF)
• A relation is in BCNF if:
• 1. It is in 3NF.
• 2. For every nontrivial functional
dependency (X -> Y), X is a superkey.
9.
Normalization Process
• The normalization process involves:
• 1. Identifying candidate keys for the
relation.
• 2. Breaking down relations based on the
rules for 1NF, 2NF, and 3NF.
• 3. Removing partial and transitive
dependencies.
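The first step, identifying candidate keys, can be sketched with an attribute-closure computation. This is a brute-force illustration for small relations, not a production algorithm. The relation, its attributes, and its functional dependencies are assumptions made up for the example.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under `fds`.

    `fds` is a list of (lhs, rhs) pairs of frozensets.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Minimal attribute sets whose closure covers the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = frozenset(combo)
            # Skip supersets of keys already found: they are not minimal.
            if any(k <= candidate for k in keys):
                continue
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys

# Hypothetical relation: enrollment(student_id, course_id, student_name, grade)
attributes = {"student_id", "course_id", "student_name", "grade"}
fds = [
    (frozenset({"student_id"}), frozenset({"student_name"})),
    (frozenset({"student_id", "course_id"}), frozenset({"grade"})),
]
keys = candidate_keys(attributes, fds)
```

Here the only candidate key is the pair ('student_id', 'course_id'), so 'student_name' depending on 'student_id' alone is a partial dependency to remove in step 3.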
10.
Examples of Normalization
• Example 1: If a table has repeating groups
of data, it can be split into multiple tables
to adhere to 1NF.
• Example 2: If a table has partial
dependencies, split it further into tables to
achieve 2NF.
11.
Advantages of Normalization
• 1. Reduces redundancy and ensures data
consistency.
• 2. Simplifies database design.
• 3. Helps in easier data management and
maintenance.
• 4. Prevents data anomalies during
updates.
12.
Disadvantages of Normalization
• 1. Complex queries due to multiple tables.
• 2. Possible performance overhead in
joining tables.
• 3. Increased complexity in database
design.
• 4. More storage required in some cases
(for example, for repeated key columns).
13.
When to Use Normalization
• Normalization is beneficial when:
• 1. The focus is on reducing redundancy.
• 2. Data integrity and consistency are
priorities.
• 3. The database is expected to have
complex relationships.
• Denormalization is used when
performance is a priority.
14.
Denormalization and its Uses
• Denormalization is the process of
combining tables to reduce the number of
joins in complex queries.
• Denormalization is typically used to
optimize query performance at the cost of
increased redundancy.
15.
DBMS Normalization
• Normalization is a systematic approach to ensure that a database is free from redundancy and
inconsistent data. By applying specific rules, normalization transforms a database into a structure
that improves both the integrity and efficiency of data.
• The primary aim of normalization is to avoid various anomalies and make it easier to manage
large databases with many interrelated tables.
16.
What is Normalization?
• Normalization in DBMS refers to the process of structuring a relational database in such a way
that:
• 1. Data redundancy is minimized.
• 2. Data integrity is ensured by eliminating anomalies.
• 3. Tables and relations are divided logically to ensure efficient storage.
• This is done by breaking down larger tables into smaller, more manageable ones and
establishing clear relationships between them.
17.
Why Normalize Data?
• 1. **Reduce Redundancy:** Minimizes duplicate data entries, reducing storage requirements.
• 2. **Eliminate Anomalies:** Helps avoid insertion, update, and deletion anomalies by organizing
data efficiently.
• 3. **Data Integrity:** Ensures accuracy and consistency of the data by maintaining clear
dependencies between attributes.
• 4. **Efficiency in Querying:** Can reduce the amount of data processed and makes it easier to
modify data without complications.
18.
Normal Forms (1NF, 2NF, 3NF, BCNF)
• 1. **1NF (First Normal Form):** Ensures that the table contains only atomic values. This removes
repeating groups or arrays.
• 2. **2NF (Second Normal Form):** Achieved when a relation is in 1NF and all non-key attributes
are fully functionally dependent on the primary key.
• 3. **3NF (Third Normal Form):** A relation is in 3NF if it is in 2NF and no non-key attribute
depends on another non-key attribute (no transitive dependencies).
• 4. **BCNF (Boyce-Codd Normal Form):** A stricter version of 3NF where every determinant is a
candidate key.
19.
1st Normal Form (1NF)
• A relation is in 1NF if:
• 1. It contains only atomic (indivisible) values in each column.
• 2. Each row has a unique identifier (Primary Key) that ensures there are no duplicate records.
• **Example:**
• In a table containing students and subjects, where each row has multiple subjects listed, breaking
that into separate rows with atomic values for each student-subject pair ensures 1NF.
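The student-subject example can be sketched as follows, assuming a hypothetical input in which the subjects arrive as one comma-separated string per student. The column names and data are illustrative.

```python
# Hypothetical unnormalized rows: 'subjects' holds several values per row.
raw = [
    {"student_id": 1, "name": "Ana", "subjects": "Math, Physics"},
    {"student_id": 2, "name": "Raj", "subjects": "Chemistry"},
]

def to_1nf(rows):
    """Emit one row per (student, subject) pair so every value is atomic."""
    out = []
    for row in rows:
        for subject in row["subjects"].split(","):
            out.append({
                "student_id": row["student_id"],
                "name": row["name"],
                "subject": subject.strip(),
            })
    return out

flat = to_1nf(raw)
# The key of the 1NF table is the pair (student_id, subject).
```

The result has one atomic value per column, but 'name' is now repeated for each subject, which is the redundancy that 2NF addresses.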
20.
2nd Normal Form (2NF)
• A relation is in 2NF if:
• 1. It is in 1NF.
• 2. Every non-prime attribute is fully functionally dependent on the primary key (i.e., no partial
dependencies).
• **Example:**
• If a table has the composite primary key ('student_id', 'course_id') but 'student_name' depends
only on 'student_id', that is a partial dependency. Moving 'student_name' into a separate table
keyed by 'student_id' ensures that each non-key attribute is fully dependent on the whole key.
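A minimal sketch of a 2NF decomposition using Python's built-in `sqlite3` module. The schema (a 'grade' column, the table names, and the sample rows) is assumed for illustration and is not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- 1NF table with composite key (student_id, course_id);
    -- student_name depends on student_id alone (a partial dependency).
    CREATE TABLE enrollment_1nf (
        student_id INTEGER, course_id TEXT, student_name TEXT, grade TEXT,
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO enrollment_1nf VALUES
        (1, 'DB', 'Ana', 'A'), (1, 'OS', 'Ana', 'B'), (2, 'DB', 'Raj', 'A');

    -- 2NF: move the partially dependent attribute to its own table.
    CREATE TABLE students (
        student_id INTEGER PRIMARY KEY, student_name TEXT
    );
    CREATE TABLE enrollments (
        student_id INTEGER REFERENCES students(student_id),
        course_id TEXT, grade TEXT,
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO students
        SELECT DISTINCT student_id, student_name FROM enrollment_1nf;
    INSERT INTO enrollments
        SELECT student_id, course_id, grade FROM enrollment_1nf;
""")

students = conn.execute(
    "SELECT student_id, student_name FROM students ORDER BY student_id"
).fetchall()
enrollment_count = conn.execute("SELECT COUNT(*) FROM enrollments").fetchone()[0]
```

After the split, each student name is stored once, while 'grade' stays with the full key it depends on.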
21.
3rd Normal Form (3NF)
• A relation is in 3NF if:
• 1. It is in 2NF.
• 2. It does not have any transitive dependencies (i.e., non-key attributes do not depend on other
non-key attributes).
• **Example:**
• If a table contains 'student_id', 'student_name', and 'advisor_name', where 'advisor_name' is
determined by 'student_name' and so depends on 'student_id' only indirectly, 3NF requires removing
this transitive dependency by moving 'advisor_name' into a separate table.
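A sketch of spotting and removing a transitive dependency. It adds a hypothetical 'advisor_id' column to make the chain realistic, which is an assumption beyond the slide's example. The check only tests whether a dependency is consistent with the sample rows; it cannot prove that the dependency holds for the schema in general.

```python
def fd_holds(rows, lhs, rhs):
    """True if equal `lhs` values always come with equal `rhs` values."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

def project(rows, columns):
    """Distinct rows restricted to `columns`, in first-seen order."""
    out = []
    for row in rows:
        item = {c: row[c] for c in columns}
        if item not in out:
            out.append(item)
    return out

# Hypothetical 2NF table keyed on student_id; advisor_id is an assumed column.
students = [
    {"student_id": 1, "student_name": "Ana", "advisor_id": 10, "advisor_name": "Dr. Lee"},
    {"student_id": 2, "student_name": "Raj", "advisor_id": 10, "advisor_name": "Dr. Lee"},
    {"student_id": 3, "student_name": "Mia", "advisor_id": 11, "advisor_name": "Dr. Kim"},
]

# student_id -> advisor_id -> advisor_name is a transitive dependency.
transitive = (fd_holds(students, ["student_id"], ["advisor_id"])
              and fd_holds(students, ["advisor_id"], ["advisor_name"])
              and not fd_holds(students, ["advisor_id"], ["student_id"]))

# 3NF decomposition: advisor_name moves to a table keyed by advisor_id.
students_3nf = project(students, ["student_id", "student_name", "advisor_id"])
advisors = project(students, ["advisor_id", "advisor_name"])
```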
22.
Boyce-Codd Normal Form (BCNF)
• BCNF is a stricter form of 3NF. A table is in BCNF if:
• 1. It is in 3NF.
• 2. For every nontrivial functional dependency (X -> Y), X is a superkey.
• **Example:**
• In a table with 'employee_id' and 'project_id' columns, if a project determines the employee
working on it (not the other way around), 'project_id' is a determinant, so BCNF requires
'project_id' to be a candidate key of that table.
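The superkey condition can be checked mechanically with an attribute closure. The sketch below uses a hypothetical 'assignment' relation with an assumed 'manager_id' attribute, where each manager runs exactly one project. Both the attribute and the dependencies are illustrative assumptions.

```python
def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Nontrivial dependencies whose left-hand side is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue  # trivial dependency, always allowed
        if closure(lhs, fds) != set(attributes):
            violations.append((lhs, rhs))
    return violations

# Hypothetical relation: assignment(employee_id, project_id, manager_id)
attributes = {"employee_id", "project_id", "manager_id"}
fds = [
    (frozenset({"employee_id", "project_id"}), frozenset({"manager_id"})),
    (frozenset({"manager_id"}), frozenset({"project_id"})),
]
violations = bcnf_violations(attributes, fds)
```

Every attribute here belongs to some candidate key, so the relation satisfies 3NF, yet 'manager_id' determines 'project_id' without being a superkey, which is exactly the case BCNF rules out.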
23.
Normalization Process
• The normalization process follows these general steps:
• 1. Identify functional dependencies between attributes.
• 2. Eliminate partial and transitive dependencies.
• 3. Decompose the tables based on these dependencies.
• 4. Apply the rules for 1NF, 2NF, 3NF, and BCNF iteratively until the relation is in the highest
applicable normal form.
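When decomposing tables (step 3), the pieces should join back to exactly the original rows. A small sketch of that lossless-join check on sample data, using a hypothetical 'orders' table in which 'customer_id' determines 'city':

```python
def as_relation(rows):
    """Represent dict rows as a set of sorted (column, value) tuples."""
    return {tuple(sorted(row.items())) for row in rows}

def project(rows, columns):
    """Distinct rows restricted to `columns`."""
    return as_relation([{c: row[c] for c in columns} for row in rows])

def natural_join(left, right):
    """Join two relations on the columns they share."""
    joined = set()
    for lrow in left:
        for rrow in right:
            ld, rd = dict(lrow), dict(rrow)
            if all(ld[c] == rd[c] for c in ld.keys() & rd.keys()):
                joined.add(tuple(sorted({**ld, **rd}.items())))
    return joined

# Hypothetical data: customer_id -> city holds in these rows.
orders = [
    {"order_id": 1, "customer_id": "c1", "city": "Oslo"},
    {"order_id": 2, "customer_id": "c2", "city": "Oslo"},
    {"order_id": 3, "customer_id": "c1", "city": "Oslo"},
]
original = as_relation(orders)

# Split on the determinant customer_id: the join restores the original.
good = natural_join(project(orders, ["order_id", "customer_id"]),
                    project(orders, ["customer_id", "city"]))
# Split on city, which determines nothing: the join invents extra rows.
bad = natural_join(project(orders, ["order_id", "city"]),
                   project(orders, ["customer_id", "city"]))

lossless_good = good == original
lossless_bad = bad == original
```

The second split produces spurious order-customer pairs, which is why decompositions are driven by the functional dependencies rather than chosen arbitrarily.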
24.
Examples of Normalization
• 1. **1NF Example:** A table that stores customer orders with a column that lists multiple products
can be broken down into separate rows for each product.
• 2. **2NF Example:** In a student-course table keyed on the student and course together, moving
student details into a 'students' table and course details into a 'courses' table eliminates
partial dependencies.
• 3. **3NF Example:** A 'student-course' table might be further split into separate tables for
'student', 'course', and 'advisor' to eliminate transitive dependencies.
25.
Advantages of Normalization
• 1. **Eliminates Data Redundancy:** Normalization ensures that each fact is stored once, reducing
storage space and inconsistencies.
• 2. **Prevents Anomalies:** Minimizing redundancy prevents insertion, update, and deletion
anomalies.
• 3. **Improved Data Integrity:** The structure ensures that data changes do not result in
inconsistencies.
• 4. **Simplifies Data Maintenance:** With fewer dependencies, managing the data is easier and
more consistent.
26.
Disadvantages of Normalization
• 1. **Complex Queries:** Because data is split into many smaller tables, queries may require
multiple joins, which can be inefficient.
• 2. **Increased Complexity:** Managing and designing normalized databases can be more
complex than denormalized ones.
• 3. **Performance Overhead:** For very large databases, normalization can result in slower query
performance due to the need to join many tables.
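To illustrate the join cost, here is a sketch with an assumed three-table schema in `sqlite3`. Producing one human-readable report row needs two joins, whereas a single wide table would need none. Table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, student_name TEXT);
    CREATE TABLE courses (course_id TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE enrollments (
        student_id INTEGER REFERENCES students(student_id),
        course_id TEXT REFERENCES courses(course_id),
        grade TEXT,
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO students VALUES (1, 'Ana'), (2, 'Raj');
    INSERT INTO courses VALUES ('DB', 'Databases'), ('OS', 'Operating Systems');
    INSERT INTO enrollments VALUES (1, 'DB', 'A'), (1, 'OS', 'B'), (2, 'DB', 'A');
""")

# Names and titles live in other tables, so the report joins all three.
report = conn.execute("""
    SELECT s.student_name, c.title, e.grade
    FROM enrollments AS e
    JOIN students AS s ON s.student_id = e.student_id
    JOIN courses AS c ON c.course_id = e.course_id
    ORDER BY s.student_name, c.title
""").fetchall()
```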
27.
When to Use Normalization
• Normalization should be applied when:
• 1. The focus is on minimizing redundancy and ensuring data consistency.
• 2. The database is expected to evolve over time with frequent updates.
• 3. Data integrity and accuracy are crucial.
• **When Not to Use Normalization:**
• When read performance is more important than avoiding redundancy, denormalization may be used
to reduce the number of joins and improve query speed.
28.
Denormalization and its Uses
• Denormalization is the process of introducing redundancy into a database by merging tables. It is
used when:
• 1. **Performance is a priority:** Fewer joins mean faster query performance.
• 2. **Simplifying queries:** Denormalized databases may be easier to query for certain
applications.
• 3. **Data retrieval needs:** When the workload is dominated by reads rather than writes.
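A sketch of the trade-off, assuming a hypothetical customers/orders schema in `sqlite3`. The merged 'orders_wide' table answers a read query without a join, but a single change to a customer's city now has to touch several rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        total REAL
    );
    INSERT INTO customers VALUES (1, 'Oslo'), (2, 'Lima');
    INSERT INTO orders VALUES (100, 1, 20.0), (101, 1, 5.0), (102, 2, 12.5);

    -- Denormalized copy: city is repeated on every order row.
    CREATE TABLE orders_wide AS
        SELECT o.order_id, o.customer_id, c.city, o.total
        FROM orders AS o
        JOIN customers AS c ON c.customer_id = o.customer_id;
""")

# Read path: totals per city without any join.
totals_by_city = conn.execute(
    "SELECT city, SUM(total) FROM orders_wide GROUP BY city ORDER BY city"
).fetchall()

# Write path: moving customer 1 updates every one of their order rows.
cur = conn.execute("UPDATE orders_wide SET city = 'Bergen' WHERE customer_id = 1")
rows_touched = cur.rowcount
```

In the normalized schema the same change is a single-row update on 'customers', which is the redundancy cost that denormalization accepts in exchange for simpler reads.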