Normalization
•Normalization is the process of efficiently organizing
data in a database.
•Two goals of the normalization process:
1. Eliminating redundant data (for example, storing the
same data in more than one table).
2. Ensuring data dependencies make sense (only
storing related data in a table).
• Both goals reduce the amount of space a database
consumes and ensure that data is logically stored.
The Normal Forms:
6 types, i.e.,
• 1. First normal form or 1NF
• 2. Second normal form or 2NF
• 3. Third normal form or 3NF
• 4. Boyce-Codd normal form (BCNF)
• 5. Fourth normal form or 4NF
• 6. Fifth normal form or 5NF.
First normal form or 1NF
• First Normal Form (1NF) (Multivalued attributes should be removed)
• First normal form (1NF) sets the very basic rules for an organized
database.
• It is a relation in which the intersection of each row and column
contains one and only one value.
• Rules:
• Eliminate duplicative columns from the same table.
• Create separate tables for each group of related data and identify
each row with a unique column or set of columns (the primary key).
• All attributes are atomic.
Example:
• Before normalization
Department   Faculty   Class    Subject
Computer     A         B.E      {DS, OS}
Computer     B         M.B.A    {SQM, ST}
• After normalization
Department   Faculty   Class    Subject
Computer     A         B.E      DS
Computer     A         B.E      OS
Computer     B         M.B.A    SQM
Computer     B         M.B.A    ST
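The 1NF step above can be sketched in a few lines of Python. This is only an illustration: the list-of-dicts layout and the function name `to_first_normal_form` are assumptions, not part of the original slides.

```python
# Hypothetical in-memory version of the "before" table: the subject column
# holds several values per row, which violates 1NF.
unnormalized = [
    {"department": "Computer", "faculty": "A", "class": "B.E", "subject": ["DS", "OS"]},
    {"department": "Computer", "faculty": "B", "class": "M.B.A", "subject": ["SQM", "ST"]},
]


def to_first_normal_form(rows, multivalued_column):
    """Emit one row per value of the multivalued column so every cell is atomic."""
    flat = []
    for row in rows:
        for value in row[multivalued_column]:
            atomic_row = dict(row)  # copy the single-valued columns
            atomic_row[multivalued_column] = value
            flat.append(atomic_row)
    return flat


normalized = to_first_normal_form(unnormalized, "subject")
for row in normalized:
    print(row["department"], row["faculty"], row["class"], row["subject"])
```

Each input row is repeated once per subject, which gives the four atomic rows of the "after" table.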
Second Normal Form (2NF)
• Second Normal Form (2NF) (Partial dependencies should
be removed)
• Second normal form (2NF) further addresses the
concept of removing duplicative data. A table in 2NF meets the
following conditions:
• It meets all the requirements of the first normal form.
• Subsets of data that apply to multiple rows of
a table are removed and placed in separate tables.
• It has no partial dependencies; that is, every non-key
attribute is fully functionally dependent on the whole primary key.
2nd Normal Form Example
Consider a table with the columns [Customer ID], [Store ID], and [Purchase Location]:
• This table has a composite primary key
[Customer ID, Store ID]. The non-key attribute
is [Purchase Location]. In this case, [Purchase
Location] only depends on [Store ID], which is
only part of the primary key. Therefore, this
table does not satisfy second normal form.
To bring this table to second normal form, we break the
table into two tables: one holding [Customer ID, Store ID], and
[TABLE_STORE] holding [Store ID, Purchase Location].
• What we have done is to remove the partial
functional dependency that we initially had.
Now, in the table [TABLE_STORE], the column
[Purchase Location] is fully dependent on the
primary key of that table, which is [Store ID].
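A minimal sketch of the 2NF split, assuming made-up rows: the data values, the helper names `holds` and `project`, and the variable name `table_purchase` are all hypothetical. Note that `holds` only checks a dependency against the sample rows; a real functional dependency is a rule about every valid state of the table, not just one instance.

```python
def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent holds in the given rows."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in determinant)
        value = tuple(row[c] for c in dependent)
        # setdefault stores the first value seen for this key and returns it later
        if seen.setdefault(key, value) != value:
            return False
    return True


def project(rows, columns):
    """Project onto columns, dropping duplicate rows but keeping first-seen order."""
    result = []
    for row in rows:
        projected = {c: row[c] for c in columns}
        if projected not in result:
            result.append(projected)
    return result


# Hypothetical rows; the key is (customer_id, store_id).
purchases = [
    {"customer_id": 1, "store_id": 1, "purchase_location": "Los Angeles"},
    {"customer_id": 1, "store_id": 3, "purchase_location": "San Francisco"},
    {"customer_id": 2, "store_id": 1, "purchase_location": "Los Angeles"},
    {"customer_id": 3, "store_id": 2, "purchase_location": "New York"},
    {"customer_id": 4, "store_id": 3, "purchase_location": "San Francisco"},
]

# purchase_location depends on store_id alone: a partial dependency.
partial = holds(purchases, ["store_id"], ["purchase_location"])

# Split so that each non-key column depends on the whole key of its table.
table_purchase = project(purchases, ["customer_id", "store_id"])
table_store = project(purchases, ["store_id", "purchase_location"])
print(partial, len(table_purchase), len(table_store))
```

After the split, each location is stored once per store instead of once per purchase.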
3rd Normal Form
• A database is in third normal form if it satisfies
the following conditions:
• It is in second normal form
• There is no transitive functional dependency
• By transitive functional dependency, we mean
we have the following relationships in the table:
B is functionally dependent on A, and C is
functionally dependent on B. In this case, C is
transitively dependent on A via B.
Consider a book table with the columns [Book ID], [Genre ID], [Genre Type], and [Price]:
• In this table, [Book ID] determines [Genre
ID], and [Genre ID] determines [Genre Type].
Therefore, [Book ID] determines [Genre Type]
via [Genre ID]. This is a transitive
functional dependency, so the structure
does not satisfy third normal form.
• To bring this table to third normal form, we
split the table into two, [TABLE_BOOK] and [TABLE_GENRE].
• Now all non-key attributes are fully functionally
dependent only on the primary key. In
[TABLE_BOOK], both [Genre ID] and [Price] are
only dependent on [Book ID]. In
[TABLE_GENRE], [Genre Type] is only
dependent on [Genre ID].
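The 3NF split can be illustrated with a small sketch. The rows below are invented for illustration (the slides do not give the data), and the variable names simply mirror [TABLE_BOOK] and [TABLE_GENRE].

```python
# Hypothetical book table that is not in 3NF: genre_type depends on genre_id,
# which in turn depends on book_id.
books = [
    {"book_id": 1, "genre_id": 1, "genre_type": "Gardening", "price": 25},
    {"book_id": 2, "genre_id": 2, "genre_type": "Sports", "price": 15},
    {"book_id": 3, "genre_id": 1, "genre_type": "Gardening", "price": 10},
    {"book_id": 4, "genre_id": 3, "genre_type": "Travel", "price": 13},
    {"book_id": 5, "genre_id": 2, "genre_type": "Sports", "price": 18},
]

# TABLE_BOOK keeps only the columns that depend directly on book_id.
table_book = [
    {"book_id": b["book_id"], "genre_id": b["genre_id"], "price": b["price"]}
    for b in books
]

# TABLE_GENRE stores each genre_type exactly once, keyed by genre_id.
table_genre = {b["genre_id"]: b["genre_type"] for b in books}

# Joining the two tables on genre_id gives back the original rows.
rebuilt = [dict(b, genre_type=table_genre[b["genre_id"]]) for b in table_book]
print(rebuilt == books)
```

Renaming a genre now means changing one entry in `table_genre` rather than every matching book row.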
BOYCE-CODD NORMAL FORM (BCNF)
• A relation schema R is in Boyce-Codd Normal Form
(BCNF) if, whenever a nontrivial FD X -> A
holds in R, X is a superkey of R
• Each normal form is strictly stronger than the previous
one.
• Every 2NF relation is in 1NF
• Every 3NF relation is in 2NF
• Every BCNF relation is in 3NF
• There exist relations that are in 3NF but not in BCNF
• The goal is to have each relation in BCNF (or 3NF)
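The BCNF condition can be tested mechanically with attribute closures. The sketch below is one possible way to do it; the function names and the student/course/instructor schema are assumptions chosen as a common example of a relation that is in 3NF but not in BCNF. Checking only the listed FDs is enough for the relation as a whole, but not for relations produced by a later decomposition.

```python
def closure(attributes, fds):
    """Attribute closure of `attributes` under `fds`, a list of (lhs, rhs) sets."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_superkey(attributes, all_attributes, fds):
    """X is a superkey if its closure contains every attribute of the relation."""
    return closure(attributes, fds) >= set(all_attributes)


def bcnf_violations(all_attributes, fds):
    """Return the nontrivial listed FDs whose left side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not is_superkey(lhs, all_attributes, fds)
    ]


# Hypothetical schema: each instructor teaches exactly one course.
attributes = {"student", "course", "instructor"}
fds = [
    ({"student", "course"}, {"instructor"}),
    ({"instructor"}, {"course"}),
]

violations = bcnf_violations(attributes, fds)
print(violations)
```

Here `instructor -> course` is reported because `instructor` alone does not determine `student`, so it is not a superkey.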
Multivalued dependency
• A multivalued dependency (MVD) X ->> Y specified on relation
schema R, where X and Y are both subsets of R, specifies the
following constraint on any relation state r of R: If two tuples t1 and t2
exist in r such that t1[X] = t2[X], then two tuples t3 and t4 should also
exist in r with the following properties, where we use Z to denote
(R - (X ∪ Y)):
• t3[X] = t4[X] = t1[X] = t2[X].
• t3[Y] = t1[Y] and t4[Y] = t2[Y].
• t3[Z] = t2[Z] and t4[Z] = t1[Z].
• An MVD X ->> Y in R is called a trivial MVD if (a) Y is a subset of
X, or (b) X ∪ Y = R.
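The MVD definition can be checked directly on a relation instance. The following is a sketch, not a standard library routine: `satisfies_mvd` is a hypothetical helper, and the sample rows are the four branch/staff/owner tuples used in the join-dependency example of this document.

```python
def satisfies_mvd(rows, x, y):
    """Check X ->> Y on a relation instance given as a list of dicts."""
    attributes = list(rows[0].keys()) if rows else []
    z = [a for a in attributes if a not in x and a not in y]  # Z = R - (X ∪ Y)

    def pick(row, columns):
        return tuple(row[c] for c in columns)

    existing = {(pick(r, x), pick(r, y), pick(r, z)) for r in rows}
    for t1 in rows:
        for t2 in rows:
            if pick(t1, x) == pick(t2, x):
                # t3 takes Y from t1 and Z from t2; t4 is covered when the
                # loop later visits the same pair with the roles swapped.
                if (pick(t1, x), pick(t1, y), pick(t2, z)) not in existing:
                    return False
    return True


branch_staff_owner = [
    {"branchNo": "B003", "Sname": "Ann Beech", "Oname": "Carl Farrel"},
    {"branchNo": "B003", "Sname": "David Ford", "Oname": "Carl Farrel"},
    {"branchNo": "B003", "Sname": "Ann Beech", "Oname": "Tina Murphy"},
    {"branchNo": "B003", "Sname": "David Ford", "Oname": "Tina Murphy"},
]

print(satisfies_mvd(branch_staff_owner, ["branchNo"], ["Sname"]))
# Dropping one combination breaks the dependency.
print(satisfies_mvd(branch_staff_owner[:3], ["branchNo"], ["Sname"]))
```

With all four staff/owner combinations present the MVD holds; with one missing, the required tuple t3 cannot be found.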
Fourth Normal Form (4NF)
• Fourth normal form eliminates independent
many-to-one relationships between columns.
• To be in Fourth Normal Form,
– a relation must first be in Boyce-Codd Normal Form.
– a given relation may not contain more than one
multi-valued attribute.
• Defined as a relation that is in Boyce-Codd Normal
Form and contains no nontrivial multi-valued
dependencies other than those whose determinant is a superkey.
Example
JOIN DEPENDENCIES
• Whenever we decompose a relation into two
relations, the resulting relations should have the
lossless-join property. This property means
that we can rejoin the resulting relations
to produce exactly the original relation.
• The lossless-join property is a property of a
decomposition; it ensures that no
spurious tuples are generated when the relations
are reunited through a join operation.
Example:
The decomposition of the BranchStaffOwner relation
branchNo   Sname        Oname
B003       Ann Beech    Carl Farrel
B003       David Ford   Carl Farrel
B003       Ann Beech    Tina Murphy
B003       David Ford   Tina Murphy
into the BranchStaff relation
branchNo   Sname
B003       Ann Beech
B003       David Ford
and the BranchOwner relation
branchNo   Oname
B003       Carl Farrel
B003       Tina Murphy
• This decomposition has the lossless-join property, i.e., the
original BranchStaffOwner relation can be
reconstructed by performing a join operation
on the BranchStaff and BranchOwner relations.
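The reconstruction can be verified with a small natural-join sketch. The helper names `natural_join` and `as_set` are assumptions made for this illustration; the rows are the ones from the example.

```python
def natural_join(left, right):
    """Natural join of two relations given as lists of dicts."""
    joined = []
    for l_row in left:
        for r_row in right:
            common = set(l_row) & set(r_row)
            # Keep the pair only when all shared columns agree.
            if all(l_row[c] == r_row[c] for c in common):
                joined.append({**l_row, **r_row})
    return joined


def as_set(rows):
    """Order-independent view of a relation, for comparing two relations."""
    return {tuple(sorted(row.items())) for row in rows}


branch_staff_owner = [
    {"branchNo": "B003", "Sname": "Ann Beech", "Oname": "Carl Farrel"},
    {"branchNo": "B003", "Sname": "David Ford", "Oname": "Carl Farrel"},
    {"branchNo": "B003", "Sname": "Ann Beech", "Oname": "Tina Murphy"},
    {"branchNo": "B003", "Sname": "David Ford", "Oname": "Tina Murphy"},
]
branch_staff = [
    {"branchNo": "B003", "Sname": "Ann Beech"},
    {"branchNo": "B003", "Sname": "David Ford"},
]
branch_owner = [
    {"branchNo": "B003", "Oname": "Carl Farrel"},
    {"branchNo": "B003", "Oname": "Tina Murphy"},
]

rejoined = natural_join(branch_staff, branch_owner)
print(as_set(rejoined) == as_set(branch_staff_owner))
```

The join on branchNo produces exactly the four original tuples, with none missing and none spurious.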
Fifth Normal Form (5NF)
• Fifth Normal Form (5NF)
• A relation decomposed into two relations must have
the lossless-join property,
which ensures that no spurious tuples are generated
when the relations are reunited through a natural join
operation.
• However, some relations can only be decomposed losslessly
into more than two relations. Although rare,
these cases are managed by join dependency and fifth
normal form (5NF).
• Fifth Normal Form:
• A relation that has no join dependency is in
fifth normal form.
• Example: Consider the property item supplier
relation.
Property No   itemDescription   SupplierNo
PG4           Bed               S1
PG4           Chair             S2
PG16          Bed               S3
• As this relation contains a join dependency, it
is therefore not in fifth normal form.
• To remove the join dependency, decompose
the relation into three relations as follows:
• Property item
Property No   itemDescription
PG4           Bed
PG4           Chair
PG16          Bed
• Item supplier
itemDescription   SupplierNo
Bed               S1
Chair             S2
Bed               S3
• Property supplier
Property No   SupplierNo
PG4           S1
PG4           S2
PG16          S3
• The property item supplier relation with the form
(A, B, C) satisfies the join dependency JD(R1(A, B),
R2(B, C), R3(A, C)),
• i.e., performing the join on all three will
recreate the original property item supplier
relation.
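The join dependency can be checked on the three example tuples. In this sketch the projections are computed from the original rows rather than typed in, and all variable names are assumptions.

```python
# The three tuples of the property item supplier relation: (A, B, C).
original = {
    ("PG4", "Bed", "S1"),
    ("PG4", "Chair", "S2"),
    ("PG16", "Bed", "S3"),
}

# Projections R1(A, B), R2(B, C), R3(A, C).
property_item = {(p, i) for p, i, s in original}
item_supplier = {(i, s) for p, i, s in original}
property_supplier = {(p, s) for p, i, s in original}

# Joining only R1 and R2 on itemDescription produces extra rows.
two_way = {
    (p, i, s)
    for p, i in property_item
    for i2, s in item_supplier
    if i == i2
}
spurious = two_way - original

# Joining the result with R3 removes them again.
three_way = {(p, i, s) for p, i, s in two_way if (p, s) in property_supplier}

print(sorted(spurious))
print(three_way == original)
```

On this data the two-way join of property item and item supplier adds two spurious tuples, and only the join of all three projections gives back the original relation.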
DOMAIN KEY NORMAL FORM (DKNF)
• The idea behind DKNF is to specify the “ultimate normal
form” that takes into account all possible types of
dependencies and constraints.
• A relation is said to be in DKNF if all constraints and
dependencies that should hold on the relation can be
enforced simply by enforcing the domain constraint and
key constraint on the relation.
• For a relation in DKNF, it becomes very straightforward to
enforce all database constraints by simply checking that
each attribute value in a tuple is of the appropriate domain
and that every key constraint is enforced.
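The two kinds of check that DKNF relies on can be sketched as follows. This is only an illustration: `find_violations`, the store rows, and the domain rules are all hypothetical.

```python
def find_violations(rows, domains, key):
    """Report domain and key constraint violations as (row index, reason) pairs."""
    problems = []
    seen_keys = set()
    for index, row in enumerate(rows):
        # Domain constraint: every value must belong to its column's domain.
        for column, is_valid in domains.items():
            if not is_valid(row[column]):
                problems.append((index, f"domain:{column}"))
        # Key constraint: no two rows may share the same key value.
        key_value = tuple(row[c] for c in key)
        if key_value in seen_keys:
            problems.append((index, "key"))
        seen_keys.add(key_value)
    return problems


# Hypothetical store table with one bad row.
stores = [
    {"store_id": 1, "purchase_location": "Los Angeles"},
    {"store_id": 2, "purchase_location": "New York"},
    {"store_id": 2, "purchase_location": ""},  # duplicate key, empty location
]
domains = {
    "store_id": lambda v: isinstance(v, int) and v > 0,
    "purchase_location": lambda v: isinstance(v, str) and v != "",
}

problems = find_violations(stores, domains, key=["store_id"])
print(problems)
```

In a DKNF relation, passing these two checks would be sufficient for every other constraint on the relation to hold as well.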
Denormalization
- Introduces redundancy in exchange for faster query
performance; referential integrity is no longer guaranteed by the schema
- Denormalize when
• specific queries occur frequently,
• strict performance requirements must be met, and
• the data is not heavily updated

Normalization in data base management system.pptx
